News

XQuery marks the spot

XML has emerged in only five years as a startlingly powerful means of handling data. It has also been accompanied by a slew of "X-centric" helper tools, APIs and standards such as XSLT, XPath and, of late, XQuery.

Generally, XML must exist within the context of other software system languages, and so Java developers, .NET developers, Cobol developers, SQL developers and others have had to learn something about this markup language that grew out of SGML, a relatively obscure document-related language.

With more developers encountering XML all the time, some will soon look at XQuery as an option for system building.

According to the W3C group that is working to specify XQuery, the query language builds on the idea that a language that uses the structure of XML intelligently can express queries across all kinds of data.

To cast some light on this technology, we spoke recently with Jonathan Robie, XML program manager at DataDirect and a member of the W3C's XML Query Working Group, which is developing XQuery. Robie is an editor of the XQuery specification and several related specifications.

"In many development environments, people have to work with relational data, XML and data found in objects. These have three very different data models -- three very different ways of representing and manipulating data," said Robie. "For XML and relational data, XQuery allows you to work in one common model, an XML-based model that can also be used with XML views of relational data."

Thus, the data architect on a team may soon begin to look at some type of XML model to handle diverse needs. XML is, among other things, hierarchical in structure, and some modelers may seek to exploit this and other attributes. Of course, some critics suggest one model may not turn out to be the answer.

But some XML-centric apps may do well with an XML-centric model, Robie suggests. "Most Web sites have some connection to a database. Many are using XML to transfer data from the database to the Web site. When you want to present hierarchical information to site users, you don't give them a series of tables and ask them to do joins in their mind," Robie jokes. "You create a hierarchy on screen as an outline or graphical representation that shows the hierarchy. All the relational databases can give you is a table," he said.

"And that's generally not what you want if you have to build a Web page or a SOAP message," he noted. "The programmer winds up writing code to restructure the data. That's tedious and not very maintainable."

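To make the restructuring chore Robie describes concrete, here is a minimal Python sketch of the hand-written code a programmer might use to turn flat, table-shaped rows into nested XML. The rows, the column layout and the element names (customers, customer, order, item) are all hypothetical and chosen only for illustration; nothing here comes from DataDirect or the XQuery specification.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical flat result of a relational join: (customer, order_id, item).
rows = [
    ("Acme", 1, "bolt"),
    ("Acme", 1, "nut"),
    ("Acme", 2, "washer"),
    ("Zenith", 3, "gear"),
]

def rows_to_xml(rows):
    """Regroup flat join rows into a customer > order > item hierarchy."""
    root = ET.Element("customers")
    # groupby only merges adjacent rows, so sort before grouping.
    ordered = sorted(rows)
    for customer, cust_rows in groupby(ordered, key=lambda r: r[0]):
        cust_el = ET.SubElement(root, "customer", name=customer)
        for order_id, order_rows in groupby(cust_rows, key=lambda r: r[1]):
            order_el = ET.SubElement(cust_el, "order", id=str(order_id))
            for _, _, item in order_rows:
                ET.SubElement(order_el, "item").text = item
    return root

if __name__ == "__main__":
    print(ET.tostring(rows_to_xml(rows), encoding="unicode"))
```

Even in this small case, the nesting logic lives in procedural loops rather than in the query, which is the kind of maintenance burden Robie is pointing to.
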
A variety of techniques have arisen, some now discarded, to help merge XML data with programmatic techniques. XQuery has gained attention in part because of its apparent support from major database vendors, in part because it seeks to handle both relational and non-relational data in one fashion, and in part because of the notion that it might be familiar -- like SQL.

The notion that XQuery is simply a familiar, SQL-like language is a simplistic view that does not hold up too well. And SQL zealots may look more favorably on another emerging XML standard known as SQL/XML, which is also supported by some major database makers.

"XQuery as 'SQL for XML' is an analogy that is both very true and very false," chides Robie. In fact, he said, it is "as revolutionary as it is evolutionary.

"XQuery and SQL/XML are two standards that use declarative, portable queries to return XML by querying relational data. The XML can have any desired structure, and the queries can be arbitrarily complex. The database vendors have their proprietary ways to do it," he explained.

"But these are not portable [ways]; they often use open standards like XSLT and XPath, and Java APIs like DOM, SAX or JDOM, but the basic framework for integration is proprietary, works only on databases from that vendor, and is completely different from the approach used for another vendor's database," Robie said.

A "native" XML modeling approach may be in order. And XQuery may lead to that.

"Native XML programming is at the heart of XQuery. XML is not object oriented, nor is it relational. XQuery is portable, type safe and can be optimized for database access. It is ideal for queries that must represent results as XML, to query XML stored inside or outside the database, or to span relational and XML sources," said Robie.

"If you're in the XML world and you want to integrate data sources, XQuery is the way to do it. If you are in a SQL world and XML is just one more type of data, then SQL/XML is good for that," he noted.

"XQuery is XML-centric. SQL/XML is SQL-centric. People who want to think in XML will use XQuery, while people who want to think in SQL will use SQL/XML," Robie said.

Speaking of the W3C group's efforts, Robie said, "We leaned heavily on the design of existing languages, including SQL, XPath, XQL, XML-QL and OQL, but we combined them in new ways and needed to innovate because of the structure of XML."
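
For readers who have not seen a path-style query, the following Python sketch gives a rough feel for selecting XML by its structure and returning the result as XML. It is not XQuery: it uses the small XPath-like subset built into Python's standard xml.etree.ElementTree module, and the catalog document, its element names and the titles_for_year helper are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; element and attribute names are illustrative only.
CATALOG = """
<catalog>
  <book year="2003"><title>XML Basics</title><price>30</price></book>
  <book year="2001"><title>SQL Primer</title><price>45</price></book>
  <book year="2003"><title>Query Design</title><price>55</price></book>
</catalog>
"""

def titles_for_year(xml_text, year):
    """Select books by attribute value and return their titles as new XML."""
    root = ET.fromstring(xml_text)
    # Path expression with an attribute predicate (ElementTree's XPath subset).
    books = root.findall(f"book[@year='{year}']")
    result = ET.Element("titles", year=str(year))
    for book in books:
        ET.SubElement(result, "title").text = book.findtext("title")
    return result

if __name__ == "__main__":
    print(ET.tostring(titles_for_year(CATALOG, 2003), encoding="unicode"))
```

The selection is stated as a path over the document's hierarchy rather than as a join over tables, which hints at why Robie says a language designed around XML's structure had to go beyond a dialect of SQL.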

As is their lot in development, programmers will need to consider their options and learn much by doing. "There's more learning involved because this is a new language, not just a dialect of SQL," Robie said.


Related stories and links:

"Blue Titan and BEA join XQuery forces" by Rich Seeley, www.adtmag.com/article.asp?id=7646

"XML engine combines varied data sources" by Jack Vaughan, www.adtmag.com/article.asp?id=6723

"New DB2 tools spring from IBM initiatives" by Colleen Frye, www.adtmag.com/article.asp?id=7301

"What's in store for XML storage?" by Jack Vaughan, www.adtmag.com/article.asp?id=6418

To find out more about the XQuery spec on the W3C.org Web site, please go to www.w3.org/TR/xquery/

For other Programmers Report articles, please go to www.adtmag.com/article.asp?id=6265

About the Author

Jack Vaughan is former Editor-at-Large at Application Development Trends magazine.