XML: The next Esperanto?

"How about a date?" he asked. "No thanks, but I'll take an orange if you have one," she replied. What was this conversation about? Was the gentleman asking the lady out on a date or offering her a piece of fruit? Was she declining a date, messing with his mind or asking for a different kind of fruit? Without a context for the conversation, it's difficult to tell what's going on. What does this have to do with XML? Quite a bit. There are significant parallels between communication among humans and communication among systems. The challenges of structure, grammar and context apply to both types of communication. What does that mean for XML and communication between systems?

Some have touted XML as the solution for integrating systems. They believe that XML provides a common language for intersystem communication. It should allow systems to exchange understandable information, not just raw data. XML's tagging approach allows systems to move beyond simple wire protocols for raw data communication. Tagging adds structure to the data. This structure in turn helps convert raw data into information. Without a doubt, this is a very good thing. We are already seeing the benefits of this approach as we use XML and SOAP to integrate systems. But is this sufficient for all our system integration needs? Do we need more than structure to create information and facilitate communication?

Think of the dialogue at the beginning of the column. The sentences have structure: nouns and verbs, all in the correct order. Is this enough to understand the dialogue? No. Just as understanding the dialogue requires moving beyond structure to grammar, system communication requires moving beyond tags to grammar. This is where XML Schema comes in. XML Schema can fill the role of grammar for us in the world of XML. Schema allows us to put boundaries around our structure. For example, we can do more than create a <MONTH> tag for a document. We can constrain it to have only the twelve months as valid values. This is a big step beyond structure by itself. Grammar lets us convey more information than structure alone provides. This capability is already showing its value in system integration.
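To make the difference between structure and grammar concrete, here is a minimal sketch in Python. It is not XML Schema itself; it only imitates the kind of enumeration constraint a schema could place on the <MONTH> tag. The function name invalid_months, the <ORDER> wrapper element and the English month spellings are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumed value space for <MONTH>: English month names, capitalized.
VALID_MONTHS = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
}

def invalid_months(xml_text: str) -> List[str]:
    """Return the text of every <MONTH> element that is not one of the twelve months."""
    root = ET.fromstring(xml_text)  # raises ParseError if the structure is broken
    bad = []
    for elem in root.iter("MONTH"):
        value = (elem.text or "").strip()
        if value not in VALID_MONTHS:
            bad.append(value)
    return bad

# Both documents are well-formed (structure), but only one obeys the constraint (grammar).
structured_only = "<ORDER><MONTH>Smarch</MONTH></ORDER>"
structured_and_valid = "<ORDER><MONTH>March</MONTH></ORDER>"
```

Both sample documents parse without error, so tags alone cannot tell them apart. Only the added constraint flags the first one.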

So far, with XML, we've moved from raw data to structure and beyond structure to grammar, to add value and information to our system communications. Is this enough for all our system communication needs? Think back once again to our dialogue. The dialogue obeys the rules of both structure and grammar, yet we are still unsure of its meaning. Does this kind of ambiguity happen in system-to-system communication? It certainly does. As we will see, this can be an issue in developing B2B exchanges. So where does that leave us with XML today? We have structure. We have grammar. Unfortunately, we have no context. Why is this a problem?

I think the current state of XML is like the Esperanto language. Esperanto was introduced in 1887 by Dr. L.L. Zamenhof. He proposed Esperanto as a second language that would allow people who speak different native languages to communicate, yet at the same time retain their own languages and cultural identities. Sounds like a great idea, doesn't it? It sounds much like the rationale behind XML. So how has it worked? While there is a claim that millions of people around the world speak Esperanto, evidence seems to suggest that the language is not widely used for communication. Why is this, and how does it relate to intersystem communications?

Like XML, Esperanto has structure and grammar, but that isn't sufficient for it to become the predominant language people use to communicate. There is a line of thought from anthropology that says language shapes the way we think and perceive. The Sapir-Whorf hypothesis states that the structure of a language constrains thought in that language, and constrains and influences the culture that uses it. In other words, if concepts or structural patterns are difficult to express in a language, the society and culture using the language will tend to avoid them. Individuals might overcome this barrier, but the society as a whole will not. Esperanto does not have this type of influence, because it is an artificial language. People do not think natively in Esperanto. It does not provide anyone the native context for understanding and interpreting their world. So what?

Think back to our dialogue one more time. Sometimes, communication requires more than structure and grammar. Remember how the lack of context makes the message ambiguous. There are often unspoken messages that shape our understanding of a message. This is known as metacommunication. Metacommunication doesn't just happen on the level of the individual message. Instead, there are whole sets of metamessages that, together, tell us how to interpret what is happening by invoking a particular set of expectations.

Such a set of expectations is what Gregory Bateson called a frame, like the physical frame that separates a picture from the wall on which it is hung. The picture frame tells us that we are to interpret the patterns within it as "art," not as "background." In other words, the frame tells us that "a work of art is happening here," not an accidental splashing on the wall or a continuation of the wallpaper pattern.

In any social situation, the "frame" is the answer to the question, "What is going on here?" And our perception of the frame determines how we interpret what happens within it. For XML to move beyond the communication limitations we see in Esperanto, we need to add the concept of a frame to the XML environment. Frames can take us beyond tags and grammar to context and meaning. Why would we care about this for systems?

Think for a moment about creating a B2B system. What does it mean to buy and sell something? Charles Fillmore defined the basic commercial event frame. There are six key aspects of this frame: 1) one person is the buyer, 2) one person is the seller, 3) there must be some object to be purchased, 4) the ownership of the object changes from seller to buyer, 5) in exchange for money, 6) whose ownership changes from buyer to seller. Even understanding this level of frame can require additional frame information. For example, the concept of ownership implies the rights to use, give or exchange an object however one wishes. The concept of rights in turn implies some type of social agreement. Rights could include things like buy, loan or mortgage. There are also types of failures that need to be considered in this frame. These would include steal, default and defraud. As you can see, even a simple understanding of a basic commercial exchange can require extensive context to interpret what is going on. This is why building B2B exchanges will always take significant effort. What is a purchase order and what do the items mean to your business? What information do you need to create a contract?
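The six aspects of the commercial event frame can be written down as a small data structure. The sketch below is only one possible reading of the frame; the class name CommercialEvent, its field names and the two ownership methods are hypothetical, and the failure cases (steal, default, defraud) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class CommercialEvent:
    """A hypothetical rendering of the basic commercial event frame."""
    buyer: str    # aspect 1: one party is the buyer
    seller: str   # aspect 2: one party is the seller
    goods: str    # aspect 3: the object to be purchased
    money: float  # aspect 5: what is given in exchange

    def ownership_before(self) -> Dict[str, str]:
        # Before the exchange the seller owns the goods and the buyer owns the money.
        return {"goods": self.seller, "money": self.buyer}

    def ownership_after(self) -> Dict[str, str]:
        # Aspect 4: goods pass from seller to buyer.
        # Aspect 6: money passes from buyer to seller.
        return {"goods": self.buyer, "money": self.seller}

# Example values are invented for illustration.
sale = CommercialEvent(buyer="Acme Corp", seller="Widget Co", goods="widgets", money=100.0)
```

Even this toy version takes the meaning of "ownership" for granted, which is exactly the kind of implicit context the frame is meant to make explicit.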

Typically, when we define system interfaces, we don't deal explicitly at this level. Often the definition and understanding of these items is negotiated outside the system. This understanding is often implicit in the way we build the interfaces. Is this implicit approach adequate? I don't think so. We need a mechanism for creating context and meaning in system and business exchanges. Look at the work with ebXML, RosettaNet, OBI and the eCo Framework. Much of this work can be characterized as creating specific frames for commercial exchange. This work is needed because implicit assumptions don't work in creating large-scale commercial environments. Systems need a context in which to interpret information.

I propose that we create a standard, called XML Frames, to provide this context for systems. Systems would use these frames for interpreting the information structured by XML tags and constrained by XML Schema. Frames should not become buckets full of everything. It makes little sense to try to define the whole world in a single frame. Instead, frames should be focused and contain only the information needed to provide context for a given exchange (though that may be fairly complex). We could supplement frames with a methodology for understanding and resolving differences between frames. As an example, Michael Agar, in his book Language Shock, describes discontinuities between frames as "rich points" that provide a fertile ground for building new understanding and context.
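No XML Frames standard exists, so the following is purely a thought experiment in Python about what frame-based interpretation might look like. The FRAMES table, the frame names, the <OFFER> and <ITEM> elements and the interpret function are all invented for illustration. The sketch returns to the opening dialogue: the same well-formed message means different things under different frames.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Two small, focused frames (hypothetical). Each maps a term to its meaning in that context.
FRAMES = {
    "grocery": {"date": "a piece of fruit", "orange": "a piece of fruit"},
    "social": {"date": "a romantic outing"},
}

def interpret(xml_text: str, frame_name: str) -> Optional[str]:
    """Interpret the <ITEM> of an <OFFER> message within the named frame.

    Returns None when the frame has no meaning for the item, which marks a
    discontinuity between frames that the parties would need to resolve.
    """
    root = ET.fromstring(xml_text)
    item = (root.findtext("ITEM") or "").strip()
    return FRAMES[frame_name].get(item)

message = "<OFFER><ITEM>date</ITEM></OFFER>"
reply = "<OFFER><ITEM>orange</ITEM></OFFER>"
```

The structure and content of message never change; only the frame passed to interpret does. The reply makes sense in the grocery frame but has no meaning in the social one, which is roughly the confusion in the opening dialogue.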

If this discussion of frames seems a bit esoteric or theoretical, I assure you it's not. Significant barriers exist in building sophisticated interfaces and exchanges between systems. While some tout XML as the answer, as we've seen, plain XML only goes so far. Simply adding structure to data is not enough for all types of system exchanges. Even adding a grammar through constraints defined in XML Schema is not enough. Systems need context in which to interpret and act on information. All of the work on standards for commercial exchanges bears witness to this. Let's take our use of XML to the next level by creating a standard for XML Frames that will provide context to systems. This should greatly aid in the creation of complex system exchanges. Let me know what you think.

About the Author

John D. Williams is a contributor to Application Development Trends. He is president of Blue Mountain Commerce, a Cary, N.C.-based consulting firm specializing in enterprise, domain and application architectures. He can be reached via e-mail at [email protected].