Binary XML is fast in theory but may be slow in adoption

As any coder can tell you, XML documents are verbose.

If you wanted to design a language for high-speed processing and transactions, you could probably do better than XML. But the question is: Could a new format or standard achieve the widespread popularity of XML?

As Rich Salz, chief security architect for Cambridge, Mass.-based DataPower, points out, XML is similar to English. It’s not the most precise language in the world, but it’s the one most programmers speak. If you want to talk to programmers, English is good to know. If you want to develop Web services, XML is good to use.

"XML’s success is that while it's not great at everything, it's good enough for almost anything," he says.

But there is a move afoot to create a binary version of XML that would travel lighter and process faster than the current text-based, HTML-like version of the standard. Salz is a member of the W3C XML Binary Characterization Working Group, which is currently looking into the possibility of expressing XML in ones and zeros. He is quick to point out that this working group is not tasked with creating such a standard. It is only looking into possible uses of it.

Or as the W3C Web site explains in its own version of English: "The XML Binary Characterization Working Group is tasked with gathering information about use cases where the overhead of generating, parsing, transmitting, storing, or accessing XML-based data may be deemed too great for a particular application, characterizing the properties that XML provides as well as those that are required by the use cases, and establishing objective, shared measurements to help judge whether XML 1.x and alternate (binary) encodings provide the required properties."

There are many issues to consider in studying how workable binary XML might be in business applications, according to Ron Schmelzer, senior analyst at ZapThink, the analyst firm specializing in XML standards and applications.

In a report on the issue, "Will Binary XML Solve XML Performance Woes?" he notes that because XML is designed to be read both by machines and humans, it results in "message sizes that can easily be 10 to 50 times larger than equivalent messages sent via binary encodings."
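To make the size gap concrete, here is a minimal sketch in Python that encodes the same small record once as XML text and once as packed binary. The record, its field names, and the binary layout are assumptions made up for illustration; they are not taken from the ZapThink report or from any proposed binary XML format, and the ratio for a real message will depend heavily on its structure.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical trade record; the field names are illustrative only.
record = {"id": 1042, "price": 19.99, "quantity": 3}

def to_xml(rec):
    """Encode the record as XML text, one child element per field."""
    root = ET.Element("trade")
    for name, value in rec.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root)

def to_binary(rec):
    """Encode the record as fixed-width binary fields.

    "!IdH" means network byte order, then an unsigned 32-bit integer,
    a 64-bit float, and an unsigned 16-bit integer (an assumed layout).
    """
    return struct.pack("!IdH", rec["id"], rec["price"], rec["quantity"])

xml_bytes = to_xml(record)
bin_bytes = to_binary(record)
ratio = len(xml_bytes) / len(bin_bytes)

print(len(xml_bytes), "bytes as XML text")
print(len(bin_bytes), "bytes as packed binary")
print(f"XML is about {ratio:.1f} times larger for this record")
```

The binary form is smaller partly because it drops the element names entirely, which also means the receiver has to know the layout in advance. That is the trade-off the rest of the debate turns on.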

"To make matters worse," Schmelzer writes, "conducting a simple point-to-point exchange between XML conversant endpoints might require each of the following operations: decryption, validation, parsing, marshalling, serialization, canonicalization, document signing, and encryption."

Binary XML would speed up this processing and would be cleaner than the alternative approach of compressing the XML text into a zip file, which has the downside of adding a processing step (zipping and unzipping) at each end of any communication, he notes.
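The compression alternative can be sketched in a few lines of Python. This is only an illustration: it uses the standard zlib module (DEFLATE, the method commonly used inside zip files) as a stand-in for zipping, and the XML payload and the send and receive function names are hypothetical.

```python
import zlib

# Hypothetical XML payload: 200 similar order elements (illustrative data only).
items = "".join(
    f"<order><id>{i}</id><status>shipped</status></order>" for i in range(200)
)
document = f"<orders>{items}</orders>".encode("utf-8")

def send(xml_bytes: bytes) -> bytes:
    """Sender side: the extra step is compressing before transmission."""
    return zlib.compress(xml_bytes)

def receive(wire_bytes: bytes) -> bytes:
    """Receiver side: the extra step is decompressing before any parsing."""
    return zlib.decompress(wire_bytes)

wire = send(document)
restored = receive(wire)

print(len(document), "bytes of XML text")
print(len(wire), "bytes on the wire")
```

Repetitive markup like this compresses well, but the receiver still ends up with the full XML text and must parse it as usual, so compression saves bandwidth rather than parsing time.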

That said, the ZapThink analyst does not see binary XML as the mythological silver bullet. The main downside, he says, is that every point in an XML process would have to be set up to handle the binary format. While this might seem like a simple requirement, it may not be.

"While proponents often talk about how endpoints can easily be configured to deal with binary XML," Schmelzer writes, "they often neglect the fact that intermediaries between the communicating parties often must be able to inspect and make decisions on that traffic. As a result, binary XML's global acceptance hinges upon all security, process, management, and transformation systems or devices being able to understand and process the binary XML format."

The issue of proprietary formats also rears its ugly head.

"Furthermore," the ZapThink analyst writes, "binary XML raises the specter of potential compatibility and vendor lock-in concerns. For example, the format chosen to represent numerical data, such as integers, floating point numbers, or arrays, must be platform independent, so that different consuming platforms are able to take advantage of the performance boost that such native formatting offers–a tall order in today’s complex, heterogeneous IT environment."
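A small Python sketch shows why the numeric format matters. The same integer produces different bytes depending on byte order, so a binary format has to fix one ordering rather than use whatever is native to the sending machine. The values here are arbitrary, and the choice of big-endian as the agreed ordering is an assumption for illustration, not something specified by the report or by the W3C group.

```python
import struct

value = 1042  # arbitrary example value

# The same 32-bit unsigned integer in the two common byte orders.
big_endian = struct.pack(">I", value)
little_endian = struct.pack("<I", value)

# "=" uses the byte order of whatever machine runs this code,
# so the result matches one of the two encodings above.
native = struct.pack("=I", value)

print("big-endian:   ", big_endian.hex())
print("little-endian:", little_endian.hex())
print("native:       ", native.hex())

# With an explicit, agreed byte order, a floating point value
# round-trips to the same number on any platform.
encoded_price = struct.pack(">d", 19.99)
decoded_price = struct.unpack(">d", encoded_price)[0]
print("price after round trip:", decoded_price)
```

A receiver that assumes the wrong ordering would read these four bytes as a completely different number, which is the kind of compatibility problem Schmelzer is pointing to.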

So whither binary XML?

DataPower’s Salz has been around the computer standards wars long enough to be wary of making big predictions. But what he sees most likely happening is the creation of "targeted binary standards in generic XML." This could include a binary XML format for the wireless world, in which the verbose text format could be a hang-up. Another possibility is a MilSpec version of binary XML for military applications, such as sending coded messages to soldiers in the field. Transaction processing in the financial world is another possibility.

The W3C committee will finish looking into the possibilities later this year, and then there may or may not be a next step toward creating a binary XML standard.

It may be good to keep in mind that Jean-François Sudre’s universal Solresol language, which might replace English, is currently said to be gathering supporters, just 150 years after it was first proposed.

About the Author

Rich Seeley is Web Editor for Campus Technology.