In-Depth

XML Is Not Yet A Cornerstone Technology

XML is the high-tech industry's new "favorite word," as the technology rides a wave of popularity last witnessed just a few short years ago with Java. Along with the hyperbole comes an overload of information, valid and otherwise, from a variety of sources describing how XML technology can solve most Enterprise Application Integration (EAI) problems.

Despite the promises, corporate developers need to make smart decisions about how to apply the technology as it is today to specific integration problems and challenges. Perhaps just as important, developers have to disregard some of the growing myths that surround the eXtensible Markup Language (XML). This article will show that while XML is not the cornerstone of EAI, it is an important enabler that, when used correctly, can be a key weapon in any corporation's IT arsenal.

Nevertheless, the Web as a delivery mechanism and XML as the delivery format is already a very powerful combination that can enable integration across the board for business-to-business (B2B), business-to-consumer (B2C) and application-to-application (A2A) connectivity.

XML/Web and information integration

As an EAI practitioner, I hear XML described in many different ways. It is important to remember that XML is not a product and it is not an application. It is simply a specification of a technology. The power of the XML model is its ability to separate presentation from content.

The term XML is often used to mean a category of middleware. This is a misnomer and certainly a misapplication of the technology. The true value of XML is its ability to provide a standard means of exchanging information between apps, business partners, businesses and consumers. As such, XML should be described as "informational middleware," a phrase that reinforces its strengths as an implementation-independent information exchange mechanism.

Information exchange in the form of XML messages has emerged as a new trend in information connectivity. The XML messaging drive has become the technology du jour for enabling B2B, B2C and A2A information connectivity (see Fig. 1), supplementing and sometimes replacing older and somewhat less-flexible mechanisms such as EDI and file transfers.

Combining XML with the World Wide Web (dubbed XML/Web) provides a basis for functionality with widespread potential usage. A new form of XML messaging, called XML/HTTP, has emerged from this powerful combination. Joining the openness and extensibility of XML with the ubiquity of HTTP as a connectivity mechanism provides a foundation for a lightweight and universal information-messaging platform that enables the exchange of information between businesses, as well as businesses and consumers. This significantly broadens the range of problems that can be solved. Because XML can be used at various levels in A2A, B2B and B2C integration projects, the technology holds the keys to the exchange of information and data inside and across organizations.
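
As a rough illustration of what XML/HTTP amounts to in practice, the sketch below wraps a small XML document in an HTTP POST request using Python's standard library. The partner URL, the element names and the build_order_request helper are all assumptions invented for this example, and the request is only constructed, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical partner endpoint; not a real service.
PARTNER_URL = "http://partner.example.com/orders"


def build_order_request(sku: str, quantity: int) -> urllib.request.Request:
    """Wrap a small XML order document in an HTTP POST request."""
    order = ET.Element("order")
    ET.SubElement(order, "sku").text = sku
    ET.SubElement(order, "quantity").text = str(quantity)
    body = ET.tostring(order, encoding="utf-8")  # bytes, ready for the wire
    return urllib.request.Request(
        PARTNER_URL,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = build_order_request("A-100", 12)
# urllib.request.urlopen(request) would transmit it; that call is left out
# here because the endpoint above is made up.
```

The receiving partner needs nothing more exotic than a Web server and an XML parser, which is the "commodity technology" appeal of the combination.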

The XML/HTTP messaging specification allows partners to treat their business as black boxes and to describe relationships at the informational level, which can lead to "componentization" of the supply chain. By using standard technologies, "standard information exchange interfaces" can be created that allow the swapping of trading partners to be driven by business needs and not by technical dependencies.

In the B2B arena, XML/Web provides a solid foundation for establishing an "information pipeline" between partners. XML is used to capture information content by specifying how the data is organized and described in a hierarchical fashion. The power of XML is its ability to create self-describing documents.
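
To make "hierarchical" and "self-describing" concrete, here is a minimal sketch that builds a purchase order and then walks it back as a tree. The tag and attribute names (purchaseOrder, buyer, unitPrice and so on) are invented for illustration and do not come from any published vocabulary.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical purchase order; every name here is made up.
po = ET.Element("purchaseOrder", attrib={"number": "1001"})
buyer = ET.SubElement(po, "buyer")
ET.SubElement(buyer, "name").text = "Acme Corp."
items = ET.SubElement(po, "items")
item = ET.SubElement(items, "item", attrib={"sku": "A-100"})
ET.SubElement(item, "quantity").text = "12"
ET.SubElement(item, "unitPrice", attrib={"currency": "USD"}).text = "4.50"

document = ET.tostring(po, encoding="unicode")
print(document)


def outline(element, depth=0):
    """Return (depth, tag) pairs describing the document hierarchy."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


# A receiver can recover the structure from the document alone.
for depth, tag in outline(ET.fromstring(document)):
    print("  " * depth + tag)
```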

Beyond the self-describing nature of XML is the creation and adoption of vocabularies. These vocabularies contain the meaning of the tags and allow partners to exchange information in a coordinated fashion by capturing rules regarding how data in an XML document should be interpreted. Currently, this is where a significant amount of work is being conducted by organizations such as OASIS, an industry standards consortium formed late last year that now boasts 100-plus sponsors, including IBM, Sun Microsystems, Oracle Corp. and Microsoft Corp. The OASIS team is charged with building a framework that will allow businesses to exchange data using XML.

Document Type Definitions (DTDs), which define the allowable structures in an XML document, cannot capture vocabularies; rather, they contain the rules that dictate whether a document is "valid." Valid is a sticky word here because validity means very little when dealing with XML documents, as evidenced by the fact that DTDs are optional when using XML.

A popular misconception regarding XML-based data is that it conveys semantics about the data in an XML document. In most cases, this is false because XML data is expressed in the form of syntax-based tags that provide no more meaning than the simple identification of the data within the tags. Therefore, applications that parse XML documents can extract values from the tags but cannot understand the data.
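
A small sketch of the problem, under invented assumptions: two hypothetical suppliers both send a price tag, one meaning cents and the other dollars. The parser extracts both values without complaint; the interpretation has to come from a rule agreed outside the documents, represented here by the made-up DOLLARS_PER_UNIT table.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Same tag, different unstated meanings (both documents are hypothetical).
SUPPLIER_A = "<item><sku>A-100</sku><price>450</price></item>"   # cents
SUPPLIER_B = "<item><sku>A-100</sku><price>4.50</price></item>"  # dollars


def extract_price(xml_text: str) -> str:
    """Parsing yields the text inside <price>, nothing about its meaning."""
    return ET.fromstring(xml_text).findtext("price")


# Interpretation rules that live outside the XML (an assumption of this sketch).
DOLLARS_PER_UNIT = {"supplier_a": Decimal("0.01"), "supplier_b": Decimal("1")}


def price_in_dollars(supplier: str, xml_text: str) -> Decimal:
    """Apply the out-of-band rule to turn extracted text into a usable value."""
    return Decimal(extract_price(xml_text)) * DOLLARS_PER_UNIT[supplier]


print(extract_price(SUPPLIER_A), extract_price(SUPPLIER_B))
print(price_in_dollars("supplier_a", SUPPLIER_A))
print(price_in_dollars("supplier_b", SUPPLIER_B))
```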

A recent industry development is the proposal of a standard for information exchange based on XML, called Trading Partner Agreement Markup Language or tpaML. The tpaML specifications submitted to OASIS are an attempt to standardize in XML what is required for e-business information exchange and collaboration. The promise of tpaML is the creation of a vocabulary and related semantics to allow open information exchange between trading partners. The significant benefit here is that tpaML will let partners set up an agreement in hours instead of days. The vocabulary and relationships are captured in a tpaML document.

XML/Web can lower the cost of entry into information integration by enabling information exchange over standard and commodity technologies. XML/HTTP provides an open information transfer mechanism in a non-proprietary format. In terms of information exchange, this lets content providers furnish information in a standard format while giving the format provider ultimate flexibility in how to present it.

Such capabilities are crucial for service providers and other businesses whose function is to provide information to customers. For example, Emeryville, Calif.-based Ask Jeeves Inc., the company behind ask.com, leverages XML to exchange information with its service provider partners. By standardizing on XML, Ask Jeeves achieves economies of scale for information integration from multiple sources while maintaining independence in terms of presentation customization. This allows Ask Jeeves to "re-skin" applications by changing the user interface without impacting content.
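
The "re-skin" idea can be sketched in a few lines: one XML content document, two interchangeable presentations. This is not a description of how Ask Jeeves implements it; the answers vocabulary and both renderer functions are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content feed from partner providers.
CONTENT = """<answers>
  <answer><title>What is XML?</title><source>Provider A</source></answer>
  <answer><title>What is EDI?</title><source>Provider B</source></answer>
</answers>"""


def render_html(xml_text: str) -> str:
    """One 'skin': an HTML bullet list."""
    root = ET.fromstring(xml_text)
    rows = [
        "<li><b>{}</b> ({})</li>".format(
            escape(a.findtext("title")), escape(a.findtext("source"))
        )
        for a in root.iter("answer")
    ]
    return "<ul>" + "".join(rows) + "</ul>"


def render_text(xml_text: str) -> str:
    """Another 'skin': plain text, built from the same content."""
    root = ET.fromstring(xml_text)
    return "\n".join(
        "* {} [{}]".format(a.findtext("title"), a.findtext("source"))
        for a in root.iter("answer")
    )


print(render_html(CONTENT))
print(render_text(CONTENT))
```

Changing the look of the application means swapping the renderer; the content feed and its providers are untouched.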

It's all in the interpretation

Establishing relationships requires more than placing data in a self-describing and standard format. In fact, the ability to exchange data in this manner adds very little value to a B2B relationship. In such relationships, both partners must be able to interpret the information in XML messages. Parsing an XML Document Object Model (DOM) lets data be "extracted" into program code, but tells little to nothing about how to interpret and use this information.

Industry literature touts XML as a key step toward plug-and-play interoperability between different applications. In this vein, XML technology does provide a standard approach that can replace and supplement patchwork and proprietary information exchange mechanisms such as database extracts and file transfers. Unfortunately, these statements are more hopeful in nature given the potential of XML, and are not based in the harsh realities of the A2A inter-application integration problem space. Despite the obvious benefits and temptation of XML as an inter-application integration mechanism, it can provide only part of an EAI solution.

Current EAI efforts focus on leveraging some type of integration middleware environment that can exchange information between application systems in the form of messages. In this set of circumstances, the value of XML as an open information interchange format cannot be overstated. XML can provide the canonical format all applications support, replacing proprietary message and meta data formats. Companies can continue to leverage integration broker products because XML is not sufficient to address the gamut of enterprise-specific, inter-application data transformation requirements, including delivery, routing, marshalling and transformation.
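
A minimal sketch of the canonical-format idea, with everything hypothetical: a billing application that exports pipe-delimited records, a CRM application that exports keyed records, and one adapter for each that emits the same customer document. In a real deployment the integration broker would still handle routing and delivery; this shows only the format convergence.

```python
import xml.etree.ElementTree as ET


def _canonical(cust_id, name, balance) -> ET.Element:
    """Build the shared (made-up) canonical customer document."""
    customer = ET.Element("customer", id=str(cust_id))
    ET.SubElement(customer, "name").text = name
    ET.SubElement(customer, "balance").text = str(balance)
    return customer


def from_billing(record: str) -> ET.Element:
    """Adapter for a hypothetical billing export: 'id|name|balance'."""
    cust_id, name, balance = record.split("|")
    return _canonical(cust_id, name, balance)


def from_crm(record: dict) -> ET.Element:
    """Adapter for a hypothetical CRM export with its own field names."""
    return _canonical(record["CustomerNo"], record["FullName"], record["Owed"])


billing_xml = ET.tostring(from_billing("42|Acme Corp.|19.95"), encoding="unicode")
crm_xml = ET.tostring(
    from_crm({"CustomerNo": 42, "FullName": "Acme Corp.", "Owed": "19.95"}),
    encoding="unicode",
)
print(billing_xml)
print(crm_xml)
```

With N applications, each needs one adapter to the canonical form rather than one per partner application.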

For such environments, XML provides mechanisms for describing and capturing meta data structures and formats, while XML messaging provides the vehicle for passing this meta data among apps. The obvious advantage is that XML can provide the basis for inter-application integration through the sharing and use of common data elements. The benefit to XML/Web comes when one of the apps is Web-based, allowing data from an internal app to be "rendered" to an external app or partner in a standard and cost-effective manner.

Indeed, the vendor market has been quick to recognize this and several leading integration broker products — such as ActiveWorks from Active Software, Santa Clara, Calif., and Wilton, Conn.-based Mercator Software's Broker product line, including Enterprise Broker and Web Broker — include built-in support for XML. Using XML/Web, such products can extend the reach of information exchange from A2A to B2B and B2C initiatives using a single and consistent technology base that saves time and money.

XML/Web has also made inroads as a distributed application processing mechanism due to its ability to support Java. For instance, XML/Web can be used to "flatten" objects into streams for exchange between applications (see Fig. 2). The advantage is that these objects can be written in any programming language as long as both the receiver and sender understand the semantics of the grammar, and how to construct and reconstruct objects at each end of the pipe.

In this case, XML messages are used to carry state and data between processes. On the server side, a Java object can "write" its state to an XML document. This document is then shipped over HTTP to the client browser, and parsed into a DOM. From this DOM, an object is then created. Conceptually, this example appears to be a great use of the technology and it can be used to overcome some of the current limitations of distributed processing over Web protocols such as HTTP, a session-less protocol with no associated state or context semantics.
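
The round trip described above can be sketched as follows. Python stands in for Java here, the HTTP hop is omitted, and the Cart class with its customer and itemCount elements is an invented example rather than anything from a real product.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from xml.dom import minidom


@dataclass
class Cart:
    """Hypothetical object whose state travels as an XML document."""

    customer: str
    item_count: int

    def to_xml(self) -> str:
        """Sender side: 'write' the object's state to XML."""
        root = ET.Element("cart")
        ET.SubElement(root, "customer").text = self.customer
        ET.SubElement(root, "itemCount").text = str(self.item_count)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "Cart":
        """Receiver side: parse into a DOM, then rebuild the object."""
        dom = minidom.parseString(text)

        def value(tag: str) -> str:
            return dom.getElementsByTagName(tag)[0].firstChild.data

        # The int() conversion is a rule that lives in code, not in the XML.
        return cls(customer=value("customer"), item_count=int(value("itemCount")))


wire_format = Cart("alice", 3).to_xml()
restored = Cart.from_xml(wire_format)
print(wire_format)
print(restored)
```

Note that both ends must already agree on the element names and on the fact that itemCount is an integer; the document itself carries neither agreement.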

However, this technology remains limited, even in cases where XML is used to transport "objects" as a distributed object protocol. Developers still require rules for interpreting what to do with the DOM objects to create runtime entities. Certainly, those rules can be placed in the code that processes the DOM objects, but that solution defeats the purpose of using XML/Web as a data exchange mechanism.

In order for XML/Web to work, the semantics and rules have to come along for the ride with the XML data message. DTDs are not sufficient, and the relative immaturity and lack of standardization of XML schemas make this more of a promise now and a reality over the next few years.

The promise of XSL

However, all is not doom and gloom. A key XML technology, eXtensible Stylesheet Language (XSL), holds great promise in filling in some of the gaps. Even though XSL has not been formally standardized, a significant amount of work is being done in this area to "activate" XML as a viable application processing technology. XSL promises to provide the layer that allows processing rules to be associated with XML elements. The key to XSL is the XSL style sheet that holds the rules for how an XML document is processed by an XSL processor.
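
The notion of rules associated with elements can be mimicked with a simple rule table. To be clear, this is not XSL and does not use an XSL engine (Python's standard library has none); the RULES table, its template syntax and the catalog vocabulary are all assumptions chosen only to show the shape of the idea.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical "style sheet": element name -> output template.
RULES = {
    "catalog": "<h1>Catalog</h1><ul>{children}</ul>",
    "part": "<li>{children}</li>",
    "sku": "<b>{text}</b> ",
    "qty": "qty {text}",
}


def apply_rules(element: ET.Element) -> str:
    """Render an element by looking up its rule, recursing into children."""
    children = "".join(apply_rules(child) for child in element)
    text = escape((element.text or "").strip())
    # Elements without a rule fall through to their text and children.
    template = RULES.get(element.tag, "{text}{children}")
    return template.format(text=text, children=children)


source = "<catalog><part><sku>A-100</sku><qty>12</qty></part></catalog>"
print(apply_rules(ET.fromstring(source)))
```

The point is that the rules sit beside the data rather than inside the application code, so they can change without touching either the document or the program.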

Some vendors are now leveraging the XSL model to provide XSL-type functionality. Such products focus on "uniting" data in XML documents and presentation formats in the form of HTML pages to provide a key first step in information integration. These products provide a processing engine that takes data in an XML document and renders it in the form of an HTML page that is sent back to a client browser. In addition, some products, such as Rhythmyx from Stoneham, Mass.-based Percussion Software, provide a built-in engine to build XML documents from database queries.

This approach, while not on the magnitude of a B2B information exchange event, provides a solid first step in leveraging XML as an information integration mechanism. It also allows XML to complement and not replace HTML as the preferred presentation format of the Web.

Most suppliers of integration tools see the value of XML as a standard exchange format and are rushing to "XML-itize" or XML-enable products in an attempt to capture XML market share. Today's first-generation products focus on providing basic XML functionality by rendering data in the form of XML documents or processing data in XML documents. For companies whose main competency is grounded in data transformation and document generation, XML is a natural extension to their products. Figure 3 captures some of the "XML usage patterns" discussed below.

The ieIntegration Server from Burlington, Mass.-based iE (previously called Intelligent Environments), connects middle-tier processing components to back-end systems running on a variety of technologies, including CICS, MQSeries and multiple database management systems. The iE technology views XML as just another information delivery mechanism or connector type. XML DBLink and XML CORBALink, from Boulder, Colo.-based Rogue Wave Software, are also products that leverage XML's power as an information exchange vehicle.

XML DBLink combines the power of the current XML model with what I feel will be the next great application of XML technology: service-based interfaces. With DBLink, the client app is not exposed to SQL at all. Instead, data is requested on a service basis where the SQL is stored in a repository and associated with the service. At runtime, a client app issues a call to a service that is intercepted by the DBLink runtime, which then issues the appropriate SQL call. When the result set is returned to DBLink, it is reformatted as XML and sent back to the client. XML CORBALink provides a similar mechanism for mapping services to CORBA server objects.
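
The service-based pattern can be sketched generically. This is not Rogue Wave's API; the SERVICE_REPOSITORY table, the getOpenOrders service name, the call_service function and the orders schema are assumptions made up for the example, with an in-memory SQLite database standing in for the back end.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical repository: service name -> stored SQL. Clients never see SQL.
SERVICE_REPOSITORY = {
    "getOpenOrders": (
        "SELECT order_id, customer FROM orders "
        "WHERE status = 'open' ORDER BY order_id"
    ),
}


def call_service(conn: sqlite3.Connection, service_name: str) -> str:
    """Look up the SQL for a service, run it, and return the rows as XML."""
    cursor = conn.execute(SERVICE_REPOSITORY[service_name])
    columns = [description[0] for description in cursor.description]
    result = ET.Element("result", service=service_name)
    for values in cursor:
        row = ET.SubElement(result, "row")
        for column, value in zip(columns, values):
            ET.SubElement(row, column).text = str(value)
    return ET.tostring(result, encoding="unicode")


# Stand-in back end with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", "open"), (2, "Globex", "closed"), (3, "Initech", "open")],
)

reply = call_service(conn, "getOpenOrders")
print(reply)
```

Because the client knows only the service name, the SQL or even the database behind it can change without the client noticing.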

This separation of content from presentation fits in perfectly with the Domino/Notes model, so it should not be surprising that the Lotus Development Corp. unit of IBM has made a significant commitment to XML. The Domino Application Server (DAS) treats XML as a first-class data type, allowing XML data documents to be assembled on demand. In this sense, XML becomes an extension of the Notes environment. The new version of DAS, due out in the middle of this year, will provide a series of integrated tools to support XML across the board for information exchange.

DataChannel Inc., Bellevue, Wash., leverages XML as an enabling technology to deliver information portals. The firm's Enterprise Information Portal (EIP) can personalize information delivery from disparate data sources, giving subscribed users the ability to customize information delivery based upon categories. EIP leverages XML on the front end as the standard delivery mechanism.

The major database vendors and packaged apps suppliers have either started shipping or are promising to add XML interfaces to their products, enabling data from those sources to be rendered in XML and used in application server environments and integration broker environments. The interfaces can also send data to a client via XML messaging. Even though such capabilities can only facilitate syntactic data interoperability, the benefit statement is clear — users can reap the rewards of having all corporate information in one common format. Developers can replace the user-unfriendly APIs of packaged applications with a standard data format and leverage semantic data interoperability standards as they are introduced.

The true value of XML messaging as an application interoperability protocol will be realized when XML messages are exchanged over application processing transports that provide the qualities of service required for real-time transactional exchange of information in the form of XML documents. While HTTP as a transport mechanism falls far short in this area, such functionality can be achieved by using XML as the message format over traditional message-oriented middleware like IBM's MQSeries. However, that approach is not in line with the loosely connected nature of the XML/Web "black-box" model, in which a partner cannot be forced to use a certain requisite set of technology. What is required is a middle ground that provides these features over a standard technology set. This is where a technology such as Java Message Service (JMS) comes in. JMS provides message queuing and topic-based publishing of information in the form of messages. The messaging mechanism provides guaranteed and reliable qualities of service required for application-level message exchange in a Web environment.
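
The two messaging styles mentioned above, queuing and topic-based publishing, can be contrasted with a toy in-process broker. This is not JMS and offers none of its delivery guarantees, persistence or transactions; the Broker class and its method names are assumptions invented for the sketch.

```python
import queue
import xml.etree.ElementTree as ET
from collections import defaultdict


class Broker:
    """Toy stand-in for a message broker (in-process, no guarantees)."""

    def __init__(self):
        self._queues = defaultdict(queue.Queue)
        self._subscribers = defaultdict(list)

    def send(self, queue_name: str, xml_text: str) -> None:
        """Queuing: each message is taken by exactly one receiver."""
        self._queues[queue_name].put(xml_text)

    def receive(self, queue_name: str) -> str:
        return self._queues[queue_name].get_nowait()

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, xml_text: str) -> None:
        """Topic publishing: every subscriber gets its own copy."""
        for callback in self._subscribers[topic]:
            callback(xml_text)


broker = Broker()

# Point-to-point: one XML order, one consumer.
broker.send("orders", "<order><sku>A-100</sku></order>")
received_sku = ET.fromstring(broker.receive("orders")).findtext("sku")

# Publish/subscribe: one XML price update, two listeners.
warehouse_inbox, storefront_inbox = [], []
broker.subscribe("prices", warehouse_inbox.append)
broker.subscribe("prices", storefront_inbox.append)
broker.publish("prices", "<price sku='A-100'>4.50</price>")

print(received_sku, len(warehouse_inbox), len(storefront_inbox))
```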

Combining this messaging infrastructure with the power of XML messages can yield a powerful and pervasive technology combination. For example, SonicMQ from Bedford, Mass.-based Progress Software is described as a fully JMS-compliant product that allows quality relationships to be established utilizing XML messaging over Java. ChanneLinx.com, Greenville, S.C., which creates complete e-business solutions by integrating entities into supply chains, leverages SonicMQ as the basis for its connectivity. The SonicMQ technology enables ChanneLinx.com to build supply-chain relationships that leverage a point-to-point query/response mechanism on the supply chain. By allowing information systems to listen on "information queues," SonicMQ enables XML messaging with the security and transactional capabilities required of e-business.

A taste of things to come

Given today's XML limitations, an IT development manager must wonder whether it is worth jumping on the XML/Web bandwagon. Though the technology is evolving rapidly, XML is still in its first generation of maturity. Conservative managers are adopting a wait-and-see approach before "betting the business" on XML. More aggressive operations are viewing XML as a basis for a common data exchange dialect for internal users, as well as outside suppliers and partners.

EDI has a reputation, and deservedly so, as an expensive and inflexible information exchange mechanism. Because of EDI's limitations, and because of the flexibility and functionality of XML/Web, many experts and technologists have seriously discussed abandoning EDI in favor of XML as the basis for B2B information exchange. But that change will not happen anytime soon. The many large companies that have invested heavily in EDI and established well-formed relationships based on the technology will not replace EDI solutions that have worked for many years with a new technology that still lacks a standards base.

I also do not expect existing B2B solutions to be supplanted and replaced by XML/Web technologies within the next two to four years. Installed EDI solutions provide a "Keep the business" approach to information exchange. Existing solutions have a solid "infrastructure" in terms of information relationships and technical connectivity. These relationships contain the semantics of information exchange that allow partners to interpret exchanged information. Thus, businesses can continue to operate without significant disruptions.

The attraction of XML/Web to many established operations is its ability to supplement existing relationships while adding new relationships. Under such a mindset, organizations can create an "Extend the business" functionality that can enable the establishment of new business delivery channels by providing information and data over standard and commodity technologies. These delivery channels can be real-time or near real-time in nature, which overcomes a significant limitation due to the batch nature of EDI relationships.

XML messaging starts to fall short when information becomes an integral component of a business process. As shown earlier, standalone XML lacks the ability to integrate semantics or interpret information. This is a considerable barrier when undertaking A2A integration projects because the value of information is based on how it is interpreted. It is not sufficient to just render data in a standard message format in A2A systems.

Therefore, XML is not yet the answer to all of the integration requirements of A2A and B2B connectivity. In fact, it is just a small part of the overall puzzle. If used correctly, however, XML can be a great benefit to IT development organizations.

With the momentum that has gathered behind XML, it will only be a matter of time before XML/Web can provide the technologies and capabilities needed for true integration. An investment in the technology today can pay substantial dividends as the technology matures. Every company and industry has its own set of specific problems and issues to solve. XML messaging provides a common and simple way to resolve most of them.