In-Depth

XML lets loose the data stream

Is XML an unstoppable tidal wave? Like most things, that depends on who you talk to. Ted Friedman, a principal analyst who covers database and business intelligence technologies at Gartner, says it’s worth recalling all the hype around the marketing of XML-enabled DBMS products two to three years ago. “That’s all dead and gone, and it never amounted to a hill of beans because no one wanted to buy another type of database,” he says.


Possible exceptions include the likes of Software AG’s Tamino XML Server database product, Friedman says. “They have marketed it for several years. I don’t think they have gotten much traction with that, and I think they are now re-orienting that toward integration,” he says. Another XML database vendor Friedman mentions is Excelon, which, along with its XML database product, was acquired by Progress Software in 2002. However, he notes, many other XML players have disappeared or been acquired.

The problem, Friedman says, is that customers in the real world wanted to use what they had, and they judged the XML capabilities of database vendors like IBM and Oracle as good enough.

At the time, he says, many observers believed XML was the future of computing. “We may be moving that way, but I don’t see our clients quickly getting rid of legacy databases or their current investment in relational databases like Microsoft SQL,” he says. Friedman sees the relational database market as a huge river that can’t be stopped, though it sometimes spawns branches like object-oriented or XML databases.

“If you look back at the late 1980s and early 1990s,” he says, “object-oriented was the big hype. But the relational databases figured out object-oriented data types and drew it back into the main flow. I think XML is going the same way.” The IT world will become more XML-oriented as time goes by, “but I don’t think it will eradicate the RDBMS,” he says.

Although Friedman captures the skepticism of the market, not everyone agrees. Peter O’Kelly, senior analyst at the Burton Group, paints the emerging crop of XML technologies as a powerful force that will re-orient the database river as well as other parts of IT. The promise of XML is of easier, more flexible and more powerful databases that integrate more effectively across apps and firms. The question is whether XML can deliver on that promise in a way that will appeal to the market.

Betting on the future of RDBMS
In thinking about XML, O’Kelly urges a broader perspective, starting with the history of the database. “For almost the last 10 years, DBMSs have been relegated to a reduced role,” he says. People saw the DBMS market as mature or staid -- a place where not much was happening. “There was even background noise about RDBMSs being obsolete,” he says. “I see them as resurgent with an expanding role in the server software stack for a variety of reasons, and XML is one of the most important.”

In O’Kelly’s view, a DBMS is something that “allows you to put stuff into it and get stuff out of it.” It is robust and reliable, and it manages concurrency: The DBMS will ensure that one person’s changes don’t walk over someone else’s changes. And it retains data in a state of high fidelity.

“There is a growing awareness that those are useful attributes for all sorts of things, including documents and e-mails. If the government tells me I must have that sort of control for everything, then what is not to like about the DBMS?” he asks. Until now, however, he says people haven’t done more with those capabilities because of cost, complexity and development issues. “It raised the bar too high; even just the fact that you might need to know some SQL was a problem,” he says.

“Part of the XML dimension to the story is that the type of content you can manage on a practical basis with a DBMS is greatly expanded with XML,” O’Kelly says. With XML, you can have a wider range of data types and work with them in a variety of ways.

Some of the changes this could bring about are subtle, but some have radical potential in terms of market implications such as being able to take semi-structured documents and store them in a DBMS. O’Kelly says the fact that XML is on the verge of some major shifts makes things all the more interesting. “People talk about XML as if it was from the Jurassic, but it is really only six years old,” he says. “In many areas, in terms of its completeness and maturity, it is just coming together now.” In particular, he cites XQuery and XPath 2.0.

In his view, XQuery is the key to XML in databases. “The early versions were limited and complex to use,” he says. In particular, they relied on shredding, which breaks an XML document into rows and columns so that its content can reside within the database.

Similarly, he says XPath 1.0 was very difficult to use “but with 2.0 and XQuery, you are seeing the decades of best practices in the database community coming into focus for XML.”
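For readers who have not seen a path expression, the following is a minimal sketch of the idea behind XPath-style querying. It uses Python's standard xml.etree.ElementTree module, which supports only a small subset of XPath 1.0 and neither XPath 2.0 nor XQuery; the order document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and attribute names are assumptions.
doc = ET.fromstring("""
<orders>
  <order id="1" status="open"><customer>Acme</customer><total>120.50</total></order>
  <order id="2" status="closed"><customer>Globex</customer><total>75.00</total></order>
  <order id="3" status="open"><customer>Initech</customer><total>310.00</total></order>
</orders>
""")

# A path expression selects nodes by position in the tree plus a predicate,
# here: the customer of every order whose status attribute is "open".
open_customers = [c.text for c in doc.findall("./order[@status='open']/customer")]
print(open_customers)
```

The query names what is wanted rather than how to walk the tree, which is the same declarative quality that SQL brought to relational data.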

What will a kinder, gentler XML do to the database market? O’Kelly predicts it will lead to some consolidation. However, he cautions, it is important to understand that XML isn’t a universal solution.

“If you have a departmental or personal database, it isn’t clear if you will ever need to extend it with XML,” he says. Rather, it is more for those who need app integration or data interchange because it is platform independent.

For those companies trying to figure out their own stance relative to XML, O’Kelly says the most important step is “to check your assumptions about the product category.” Many people have made assumptions about what DBMSs can do and are now converging in a new middle ground largely because of the XML schema, he says.

Another point O’Kelly makes is that for the last few years many people have been saying that app logic should go in the app tier and data should go in the DBMS tier. “Now that you can be more flexible with XML and Web services, those assumptions should be checked,” he says.

Untapped potential
Taking a slightly different but equally upbeat tack is Rita Knox, a Gartner research director and vice president who focuses primarily on XML. She says part of the technology’s problem is simply a matter of people adjusting to something new. “In the past,” she notes, “people have agreed on what the data stream will look like on both ends, including things like the order of the data and the length of the data elements.” That way, she says, when you send and receive data you know how to parse it and bring it into an application.

Thus, she says, people regard XML as rather alien because it explicitly labels what each component is so that it can be parsed and extracted on the receiving end and then put into an app.
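The contrast Knox draws can be sketched in a few lines of Python. This is an illustration only: the record layout, field names and sample values are assumptions, not drawn from any real system. The first parser works only if both ends agree on field order and width; the second reads labeled XML and does not care about order.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed layout agreed on by both ends: (field name, width in characters).
FIELDS = [("name", 10), ("account", 8), ("amount", 6)]

def parse_fixed(record):
    """Slice a fixed-width record using the agreed order and lengths."""
    out, pos = {}, 0
    for field, width in FIELDS:
        out[field] = record[pos:pos + width].strip()
        pos += width
    return out

def parse_labeled(xml_text):
    """Read a record whose fields are explicitly labeled; order is irrelevant."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}

fixed = "Ada Smith".ljust(10) + "00123456" + "000250"
labeled = (
    "<payment><amount>000250</amount><name>Ada Smith</name>"
    "<account>00123456</account></payment>"
)

print(parse_fixed(fixed))
print(parse_labeled(labeled))
```

Both calls yield the same three fields, even though the XML version lists them in a different order than the fixed layout requires.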

Still, she says, XML is now stable and its use has grown by leaps and bounds. However, each of the different verticals, such as medical and financial, has developed its own variants.

Knox says the first generation of XML-based database technology “fell flat on its face” and everyone went right back to relational mode. “All those vendors with XML-aware databases have pretty much disappeared, except for Software AG,” she says. However, she notes, there are some new players coming around and a new area of development called XML appliances. These, she explains, are board-level technologies for offloading different kinds of XML processing, XML security and XML translation.

These devices will “have done the work for you before the XML gets to the database,” she says. If you needed to go into a relational model you could do that, she says, with the appliance providing a preamble to hitting the general-purpose machine. “We have had some discussion with the end-user community and they are beginning to look at these choices,” she says.

“People complain that XML is very verbose, but it is so much more flexible than a message where you don’t know what it is all about,” she says. That’s driving a huge growth of different enterprises using XML to share data with partners, or within a company for different apps or business units, she explains. “Because XML is really just text, you can change it and how it is used. It is a foundation for data-driven software.”

Buzzwords vs. real world
On the skeptical side of the XML question, Gartner’s Friedman says XML still needs to be kept in focus as nothing more than a markup language “designed to put some boundaries around and define the structure of different types of information.

“What we have seen,” he says, “is that the major DBMS vendors have all added XML capabilities in their DBMS products in some form.” It began with simple things like the ability to take a piece of XML and shred it to rows and columns so that it could map to standard tables in a relational structure. “It was typically a process of wrapping the DBMS with something that made it look like they understood XML,” says Friedman. Some vendors, such as Microsoft, haven’t progressed beyond that point, he notes. “They probably won’t take another step with XML until they deliver their next version of SQL Server.

“The way we see it, the DBMS vendors view XML as important but they are mostly treating it as another data type, like they did with pictures and audio,” Friedman says. “For them, it is another evolution in the extension of relational database management systems -- it is not like they are re-architecting.” Nor, says Friedman, should they. “In the end, it is still just bits and bytes, and I’m not sure they need to actually do a wholesale rearchitecting.”
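The shredding Friedman describes, taking a piece of XML and mapping it to rows and columns, can be sketched roughly as follows. This is a minimal illustration using Python's standard library and an in-memory SQLite database; the table layout and the sample document are assumptions, and no vendor's actual implementation is implied.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document.
XML_DOC = """
<orders>
  <order id="1"><customer>Acme</customer><total>120.50</total></order>
  <order id="2"><customer>Globex</customer><total>75.00</total></order>
</orders>
"""

def shred(xml_text, conn):
    """Map each <order> element to one row in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    rows = [
        (int(o.get("id")), o.findtext("customer"), float(o.findtext("total")))
        for o in ET.fromstring(xml_text).iter("order")
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
shred(XML_DOC, conn)

# Once shredded, the data is queried with ordinary SQL; the XML structure is gone.
result = conn.execute(
    "SELECT customer, total FROM orders WHERE total > 100"
).fetchall()
print(result)
```

The trade-off is visible even at this scale: the database never sees XML, only the tables it already understands, and the original document cannot be queried as a document.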

The argument in favor of broader employment of XML boils down to favoring interoperability and flexibility, he says. “That is what’s nice about XML -- it can be an agreed upon way to share information with customers, suppliers and within different applications within a business,” he says. It is also portable and platform independent.

Despite the current round of XML hype, Friedman thinks it is still early for XQuery. For one thing, it remains only a working draft within the W3C, though some vendors, BEA for example, have built it into their products. “BEA’s product operates through an XQuery model and engine where you pose the queries and it gets the data,” Friedman explains.

Still, he says, “I don’t hear much demand from clients for XML query; they are mostly oriented toward the SQL paradigm. Thus, it will be a big leap in terms of paradigms and skills from that to XML.” If you think about BI tools, he says, they are all SQL-oriented and have no support for XML. But, Friedman adds, it is a transition that will come eventually, especially when XQuery matures.

Putting XML on its label
In the database realm, says OpenLink Software CEO Kingsley Idehen, XML provides “the perfect vehicle for a long-forgotten technology called the InfoBase, a storage vehicle for contextualized data.” Prior to XML, he says, relational databases provided views as the mechanism for information persistence and access. Unfortunately, this also meant that the ability to exploit views was strictly database engine-specific. Database-independent data access technologies such as ODBC, JDBC and ADO.NET attempted to address this matter, he says, by broadening the accessibility of SQL views via generic data access APIs; but they demand SQL centricity on the part of the view-consuming technology.

“XML doesn’t suffer from the same limitations,” says Idehen, “and it is in this regard that XML provides real value in the DBMS realm via SQL-XML integration.” He says XML provides the critical technology for rejuvenating the InfoBase via its ability to provide persistence and platform-independent accessibility to contextualized data from either homogeneous or heterogeneous data sources.

OpenLink’s Virtuoso 3.5 is a cross-platform Universal Server that provides virtual database, XML database, SQL database and Web app server functionality as part of a single server solution. “It can generate XML and derivative formats from existing data sources, and publish time-tested, mission-critical app logic as remotely invoked Web services endpoints,” Idehen says. Perhaps the most important feature of Virtuoso, he says, is that it can integrate multiple platforms, OSes and programming languages seamlessly.

Big Blue also has a focus on XML. Jim Kleewein, distinguished engineer at IBM’s Silicon Valley Lab, is in charge of the firm’s XML database forays. In fact, he notes, “we have had the first commercial implementation of XML [XML Extender] in the database available since 1998.”

Back then “XML looked important, but no one was sure.” Since then, XML usage has exploded as a data interchange format not only between businesses but within businesses. In the case of XML Extender, Kleewein says experience has shown that its two options -- storing XML as a character string or “decomposing it” -- are not sufficient. “In a future version of DB2 you will see true support of XML as a data type,” he says. XQuery will be a crucial part of that development.
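To see why storing XML as a character string falls short, consider this rough sketch. It is not the XML Extender API; it uses Python's standard library with an in-memory SQLite database, and the table and document contents are invented. Because the database sees only opaque text, any question about what is inside a document has to be answered by application code that parses every row.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")

# Hypothetical documents stored whole, as plain character strings.
docs = [
    "<invoice><customer>Acme</customer><total>120.50</total></invoice>",
    "<invoice><customer>Globex</customer><total>75.00</total></invoice>",
]
conn.executemany("INSERT INTO docs (body) VALUES (?)", [(d,) for d in docs])

def customers_over(conn, threshold):
    """Fetch every document and parse it in the application to filter on <total>."""
    matches = []
    for (body,) in conn.execute("SELECT body FROM docs ORDER BY id"):
        root = ET.fromstring(body)
        if float(root.findtext("total")) > threshold:
            matches.append(root.findtext("customer"))
    return matches

print(customers_over(conn, 100))
```

A native XML data type with a query language such as XQuery would, in principle, let the database do this filtering itself.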

“Trying to query XML without XQuery is like asking directions in Mexico City in Japanese,” Kleewein says.

Keith Swenson, chief architect and director of development at Fujitsu Software, says his excitement regarding XML comes from a simple fact. “People need to grasp that indexing doesn’t necessarily make a database work faster,” he says.

He champions a new search algorithm that, in effect, searches all the data. “It sounds like a brute-force method, but we can show that for mid-sized and larger collections of data it can actually be faster,” he says. And that means not only opening the door even wider to XML, but perhaps even storing many different kinds of data within one structure.

It is a vision that seems to go to the heart of XML’s promise. “With this approach you can just start adding records; you don’t need to set up a schema in advance,” Swenson says.
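The idea of adding records without a schema and searching by scanning everything can be illustrated with a toy example. This is not Fujitsu's algorithm, whose details the article does not give, and it says nothing about performance; the record fields are invented. It only shows records of different shapes living in one collection and being found by a full scan.

```python
def scan(records, **criteria):
    """Brute-force search: test every record against every criterion."""
    return [
        r for r in records
        if all(k in r and r[k] == v for k, v in criteria.items())
    ]

# No schema is declared up front; each record carries whatever fields it has.
records = []
records.append({"type": "invoice", "customer": "Acme", "total": 120.5})
records.append({"type": "email", "from": "ada@example.com", "subject": "Renewal"})
records.append({"type": "invoice", "customer": "Globex", "total": 75.0, "currency": "EUR"})

print(scan(records, type="invoice"))
print(scan(records, currency="EUR"))
```

A field that appears in only one record, such as currency here, is searchable immediately, with no table to alter and no index to rebuild.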