
Serenity through markup

XML comes primarily from the world of markup, where authors are allowed rich expression within a formal framework that binds a variety of works. The technology is mostly about elements, attributes and text. The text is the heart of the document. Elements provide structure according to the formal framework, traditionally the DTD, but now any of a variety of schema languages. Attributes provide metadata for the framework. They tell processing systems how to deal with elements and text. There are no hard and fast rules as to what belongs in element content and what belongs in attributes. There is, however, a sense that attributes are meant to be abstracted from the expression of the main data. Attributes are best used for machine-readable data. If a piece of data is directly relevant to the likely end consumer of the XML, this is a strong hint that it belongs in element content. If not, it may be a good fit for an attribute.
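As a rough illustration of that rule of thumb, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The memo vocabulary (memo, title, para, and the id, lang and revision attributes) is invented for this example and is not drawn from any real format. Text meant for the reader sits in element content, while the attributes carry hints meant for software.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo document; every name here is illustrative only.
DOC = """<memo id="m-1042" lang="en" revision="3">
  <title>Quarterly review</title>
  <para>Sales rose in every region.</para>
</memo>"""

root = ET.fromstring(DOC)

# Element content: the text a human reader is meant to see.
reader_text = [child.text for child in root]

# Attributes: metadata that tells software how to handle the element.
processing_hints = dict(root.attrib)

print(reader_text)
print(processing_hints)
```

Stripping the attributes from this memo would leave the reader's experience intact, which is a reasonable test of whether a piece of data belongs in an attribute at all.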

Since the promise of interoperability is a key draw of XML, developers fuss to build all-encompassing agreements into which every conforming XML document must fit. This can lead to quite byzantine formats, such as XML Metadata Interchange (XMI). But just as the swami tells the acolyte that to bend the spoon with his mind he must let go of the idea of a physical spoon, the markup state of mind suggests a calmer approach: one achieves interoperability by not striving too consciously for it.

If locally convenient global rules are available, use them, but beware of designing a document framework with all users and uses in mind. As documents move between systems, trust the remote system's ability to interpret the document to meet its own local needs. This is known as the principle of loose coupling, and is reminiscent of the idea of late binding in programming theory. In the technology sphere, the idea is that an XML document used in a transaction between multiple systems need not always build in all possible aspects of the combined systems. The source system designs documents to meet its needs, while the target system interprets them according to its own.
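One way to picture loose coupling is a pair of consumers that each read the same document and take only what they need. The sketch below uses Python's standard xml.etree.ElementTree. The order vocabulary and the function names billing_view and shipping_view are assumptions made up for this illustration, not part of any real system.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document, designed by the source system for its own needs.
ORDER = """<order number="A-17">
  <customer>Okoro Hardware</customer>
  <item sku="W-9" qty="4">Wrench</item>
  <item sku="H-2" qty="1">Hammer</item>
  <handling>Fragile pallet</handling>
</order>"""


def billing_view(xml_text):
    """A billing system reads the customer and unit count, ignoring the rest."""
    root = ET.fromstring(xml_text)
    return {
        "customer": root.findtext("customer"),
        "units": sum(int(item.get("qty", "0")) for item in root.iter("item")),
    }


def shipping_view(xml_text):
    """A shipping system reads SKUs and handling notes, ignoring the rest."""
    root = ET.fromstring(xml_text)
    return {
        "skus": [item.get("sku") for item in root.iter("item")],
        "handling": root.findtext("handling", default=""),
    }


print(billing_view(ORDER))
print(shipping_view(ORDER))
```

Neither consumer fails when the source adds an element it does not recognize, so the source can evolve its format without renegotiating a shared contract.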

Web services technology borrowed from distributed programming the idea of an interface definition language (IDL) that constrains communications to a single view shared by all participating systems. The problem is that such an approach is no different from that of older technologies like CORBA and DCOM. There is a growing understanding that Web services should support exchanges with looser coupling between the participating systems. This need is more readily apparent in the exchange of human-readable documents than in the exchange of highly structured data sets, and the approach is often called the 'document style' of Web services.

But is such looseness in the generation and interpretation of data acceptable? Ideally, one would design data to meet local needs, and have a means of adapting the design to shared/global conventions. Traditional markup systems achieve this with 'architectures,' document frameworks that come with specialized tools for drafting a data format to a global spec, but which provide built-in rules for generation and interpretation of the document under local conventions.

An alternative is pipelines of data, where a document starts out in one form and ends up in one suited for a different environment. This differs from the common approach to XML interchange, where data conversion is a hefty process that binds one node to another, so that the machinery needed grows as the number of nodes increases. Processing pipelines minimize network complexities and maximize the usefulness of intermediate data forms.
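The pipeline idea can be sketched as a chain of small stages, each taking a document and handing a slightly changed one to the next. This is only an illustration under stated assumptions: the story vocabulary, the internal="yes" marker, the tag mapping and the stage names are all hypothetical, and a production pipeline would more likely use XSLT or a dedicated pipeline tool than hand-written Python.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def strip_internal(root):
    """Stage 1: drop elements marked internal="yes" (a hypothetical marker)."""
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("internal") == "yes":
                parent.remove(child)
    return root


def rename_tags(root):
    """Stage 2: map local tag names onto the target's names (assumed mapping)."""
    mapping = {"headline": "title", "body": "para"}
    for element in root.iter():
        element.tag = mapping.get(element.tag, element.tag)
    return root


def to_plain_text(root):
    """Stage 3: flatten to text for a consumer that wants no markup."""
    texts = (element.text for element in root.iter())
    return "\n".join(text.strip() for text in texts if text and text.strip())


def run_pipeline(xml_text, stages):
    """Parse once, then feed the result of each stage into the next."""
    return reduce(lambda doc, stage: stage(doc), stages, ET.fromstring(xml_text))


SOURCE = (
    "<story><headline>Markup endures</headline>"
    "<body>Pipelines keep stages small.</body>"
    '<editor-note internal="yes">Check facts</editor-note></story>'
)

# The intermediate form is itself a usable document for an XML consumer.
intermediate = run_pipeline(SOURCE, [strip_internal, rename_tags])
intermediate_xml = ET.tostring(intermediate, encoding="unicode")

# A text-only consumer simply adds one more stage to the same chain.
final_text = run_pipeline(SOURCE, [strip_internal, rename_tags, to_plain_text])

print(intermediate_xml)
print(final_text)
```

Serving a new kind of consumer here means appending a stage rather than writing a converter for every pair of systems, and each intermediate form remains useful in its own right.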

Markup systems have enjoyed the sort of large-scale successes and longevity of deployments that have often eluded mainstream code. Perhaps it isn't so clever to disdain the wizened old master.


About the Author

Uche Ogbuji is a consultant and co-founder at Fourthought Inc. in Boulder, Colo. He may be contacted at [email protected].