XML meets the data warehouse
- By Jack Vaughan
- December 31, 2002
At first glance, the marriage of data warehousing and Web
services seems somewhat suspect -- like a Hollywood marriage that is not given
much chance to last. Data warehouses are big and mission-critical; they are
sturdy, almost heavy. Web services are more ephemeral, almost like applets; they
are lightweight and loosely coupled. Thus, this marriage may be a mismatch. One,
some would say, made in Market Hype Heaven.
But useful data management advances may come from this match. There may be
something to the application of Web service techniques to BI problems, say
several industry insiders. But, some suggest, it might be wise not to expect Web
services to change this area overnight.
Coming to terms
What are Web services? They are still as much
high-concept as they are technical reality. Underlying Web services is XML, a
descriptive means of handling data that has gained wide industry support. When
IBM, Microsoft and others agreed upon SOAP in 1999, the basic means for an
XML-based, industry standard communication mechanism was put in place.
SOAP is a simple pipe, and its authors left room for enhancements. A SOAP
message may or may not carry a remote procedure call (RPC). But it stands as an
alternative to the tightly coupled computing protocols of the past. Moreover, it
is positioned as easier to program. XML is somewhat more readily understandable
than lower-level languages. Closely associated with SOAP are WSDL, which
describes a service's interface, and UDDI, a directory for finding services.
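To make the ''simple pipe'' idea concrete, here is a minimal sketch of a SOAP 1.1 message built and read with Python's standard library. The application namespace, the GetSalesByRegion operation and the region parameter are hypothetical, invented for illustration; only the SOAP envelope namespace comes from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (from the SOAP 1.1 specification).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, assumed for this sketch only.
APP_NS = "urn:example:sales"


def build_request(region: str) -> bytes:
    """Wrap a hypothetical GetSalesByRegion call in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetSalesByRegion")
    ET.SubElement(call, f"{{{APP_NS}}}region").text = region
    return ET.tostring(envelope, encoding="utf-8")


def read_region(message: bytes) -> str:
    """Pull the region parameter back out of the envelope."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_ENV}}}Body/{{{APP_NS}}}GetSalesByRegion/{{{APP_NS}}}region"
    return root.find(path).text


if __name__ == "__main__":
    message = build_request("Midwest")
    print(message.decode("utf-8"))
    print(read_region(message))
```

The message is plain text that either side can inspect, which is part of the appeal. Everything beyond the envelope and body is left to the application.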
While easier integration is more than most application development managers
can ever hope for, there is more to the Web services story. The idea of light,
unfettered apps and data marts has infiltrated the world of business
intelligence, data analytics and data warehousing.
Meanwhile, Microsoft and others have painted a picture of small, general apps
that tap into a Web services architecture as needed. In data warehousing, this
could lead to, for example, a calculating formula engine that slices data for
analysis, or even pre-packaged multidimensional data cubes available on demand
and on the Internet.
Some envision ''supply chains'' of data interchange. This can happen now, for
example, between K-Mart and Procter & Gamble. The P&G soap product line
manager can get a feed on how a line of merchandise is doing in the Midwest,
look at in-house inventory reads and cross-reference this to demographic
databases, all to formulate a promotional campaign to run at K-Mart's Midwest
stores. With Web services, theoretically, the data will come to the P&G
power user's desktop with less need for drastic transformation; and, again,
theoretically, a third-party demographic database or analytical algorithm could
be briefly rented and immediately applied to the K-Mart store reports.
It happened one night
Why will this not happen overnight? XML may be a great lingua franca for
integration, but the industry will not suddenly move all of its data into XML
data stagers or integrating data hubs.
Only when a number of these are built will IT shops be able to gauge the
''speed hit'' that can come with XML's descriptive (''verbose,'' some would say)
prowess. Already, makers of novel XML-specific-processing engines are readying
hardware accelerators to push XML integration more effortlessly over speed
bumps.
Speed gained here helps throughput only on the middle tier and back end. What
about the client side?
Analytic tool vendors have spent years creating and touting HTML, then
JavaScript, then Java and DHTML interfaces for clients to do data analysis. They
have tried to create intuitive data navigation schemes, and to run end-around
plays to avoid cumbersome trips to the server and data sources.
The dirty little secret of the industry may be that power users want a robust
client and a tight connection to data. If Web services in analytics succeed, new
types of user interfaces may be needed.
Deeper, more specific standards must be built, beyond SOAP, for Web services
to flourish. Under way are data analysis-oriented initiatives like XMLA and
JOLAP, as well as many others, including vertical industry-specific formats.
Applying Web services to data integration problems within the firewall is one
challenge. Creating chains of services across the Web is another, say industry
players close to this issue. UDDI, the ''yellow pages''-style directory for
finding services, is still in its infancy. The special requirements of data
warehouses must be met before XML and business intelligence truly become a
marriage. Assuring the safe transmission of services and creating acceptable
mechanisms for paying for services will take time.
These challenges are likely to temper unbridled enthusiasm for Web services
in warehousing, not to dash it. This year should be significant. XMLA proponents
are preparing to engage in public ''bake-offs'' that show interoperability of this
protocol.
Why a warehouse?
What is the purpose of data warehousing today?
''To provide a unified view of something like customer information,'' said Aaron
Zornes, executive VP, app delivery strategies at Meta Group. ''Today, it is
mostly internal. And they use fixed tools going against a subset of a data
warehouse.
''At some point, we have to decouple those tools,'' added Zornes. ''People don't
want to hard-code proprietary tools to networks that hook into information from
all over the world, or even internally. They want to hook in flexibly on the
fly.''
Issues to overcome include XML throughput. ''Some say there's too much
overhead for some 'real-time' data transformations,'' noted Zornes. ''But for
batch transforms after an initial [intersystem] handshake, it may be adequate.
If we are looking for information on a single customer, and aggregating and
feeding back [to an analytical cube], that's a tremendous amount of overhead,''
he said.
''We are going to have to have a loosely coupled connection,'' he continued,
''so servers can recognize each other on the fly, rather than be hard-wired, and
without having the [user interface] software installed on the computer.''
At the line of scrimmage
Among the data warehousing vendors to
talk up the idea of Web services early last year was Ascential. ''We were the
first data integration vendor to aggressively push a vision around Web services,
and how it fits with data warehouse and operational data mart initiatives,'' said
Bob Zurek, Ascential VP of advanced technology.
Ascential's plan to embed XML capabilities in its products does not force a
customer to use those capabilities, Zurek said. Products are expected soon.
''It's straightforward: We're providing the functionality as a basic feature.
Those who will use it are toward the innovative side,'' he said.
How will Ascential support these new capabilities, we asked?
''Today, if you use our product, you're reaching into all sorts of data silos.
These could be relational databases or flat files,'' said Zurek. ''They've done
some cleansing and then [staged it] for transform. For us, it doesn't matter
what it is. We let you expose that process or job as a Web service. We have in
the lab dozens of data sources exposed as Web services. You can expose your data
to anything that can consume a Web service.''
It is still early, and views on how to constitute Web services vary. Vendors
preceding or following Ascential with Web service-related data integration, data
warehouse or BI announcements are many -- Actuate, Brio, Business Objects,
ClearForest, Cognos, Crystal Decisions, Dimensional Insight, Hummingbird,
Hyperion, IBI, IBM, Informatica, Microsoft, MicroStrategy, NCR, Oracle, Sagent,
SAS Institute and Sybase among others.
''iWay uses Web services as a new kind of interface to access business
processes,'' said Jake Freivald, director of marketing at iWay Software, an IBI
unit.
''We provide Web services as part of our XML transformation engine,'' he said.
And the iWay engine (which evolved from EDA/SQL) can support extensions like
ebXML because of its support of Web services. ''We can handle ebXML using Web
services,'' Freivald added.
Meanwhile, Actuate supports Web services as part of Actuate Version 6. ''We
come from the application development space, unlike the data warehouse and OLAP
[vendors],'' said Nobby Akiha, Actuate's VP of marketing. ''We were able to build
a SOAP-based API from scratch. This lets us leverage existing technology. Web
services expose 100% of the report server,'' he said, which eases adding new
reports, scheduling reports and conducting administrative functions.
In 2000, Sagent created a Web services layer on top of the Sagent platform
that can use SOAP messaging for tasks like adding and deleting users across
multiple platforms and sites.
For its part, Cognos last summer announced Cognos Web services, supporting
XML, SOAP and WSDL to build connections between the Cognos suite and multiple
apps. Meanwhile, Business Objects incorporated Web services into the
BusinessObjects Developer Suite.
A clearer picture
''At first I wasn't convinced there was any
truly useful application for Web services in business intelligence or data
warehousing,'' admitted Sanju Bansal, vice chairman and COO, MicroStrategy. ''But
in the last year to 18 months, as I visit clients, I'm getting a clear picture
of how they want to use it with their BI systems.''
Bansal described a scenario in which a retailer's sales data is made
available to suppliers, and Web service calls are used to interrogate the
retailer's database using the supplier's own pricing algorithms.
People want historical data warehouse and up-to-the-minute operational data
feeding into pricing applications, he said.
''But nobody wants to rebuild a pricing application to make the data warehouse
happy,'' chided Bansal.
''People want to connect applications [within organizations], and they want
the connection to be automatic. My prediction: That is where the real value of
Web services applications will be,'' he said. These in-house applications will be
more valuable, he suggested, than even the sizable advances Web services might
make in supply-chain partner connections.
Bansal noted that his firm moved forward on the Web services front this year
with MicroStrategy Web Universal. The product marks the company's foray into
Unix. It runs on J2EE servers, including BEA's WebLogic Server. Web Universal
APIs are exposed and accessible via XML.
Although upbeat on Web services, Bansal is less than a champion of present
XML-oriented analytical standard proposals. '''XML for OLAP' is not taking off
in the industry,'' he asserted.
Moreover, he said, ''JOLAP has not caught on.'' A key standard, he said, is a
de facto standard: Microsoft's Multidimensional Expressions (MDX), which can be
enhanced with recent XML-oriented add-ons to Microsoft's SQL database.
The elite and the masses
As with many advances in Web-enabling
data analytics, Web services may broaden the casual user base of such tools,
without particularly easing the task of analysts who work with such tools every
day.
''The experts are always going to say 'It's not as powerful, it's not as
fast,''' said Ragnar Edholm, director of Essbase tools at Hyperion. ''Those are
the power users.
''But you should also listen to the people that never tried to do this
before,'' added Edholm. A major driver for both XML and Web services in data
integration apps is the accelerating effort of relational database vendors to
add better XML data handling traits to their products.
''I think vendors are moving faster than users,'' noted Edholm.
Edholm pointed to the OMG's Common Warehouse Metamodel (CWM) as a useful
XML-oriented standard for data integration. In fact, the JOLAP API criticized by
MicroStrategy's Bansal and others relies on CWM to some extent for OLAP service
interfacing.
JOLAP is a significant continuation in the definition of industry standards
for software interoperability, according to John Kopcke, CTO at Hyperion, which
has taken a leading role in creating JOLAP.
Hyperion has also been active in the effort to create XMLA, which was
recently published as an updated spec and API standard for vendors to access
multidimensional databases as a Web service. Other XMLA Council members are
Microsoft, Crystal Decisions, SAP AG and Silvon Software.
Overhead, overheard
Also active on XMLA standards is SAS
Institute. ''We view it as having great promise,'' said Jim Metcalf, the firm's
director of foundation technology strategy, while noting recent progress in
preliminary interoperability tests between XMLA-enabled systems. ''We are
watching JOLAP unfold,'' he said.
''Web services hold promise as an 'interoperability' standard,'' said
Metcalf, emphasizing that one should put ''interoperability'' in quotes.
''It's a fancy API and, if everyone can agree, we'll have interapplication
communication over the Internet. We still have a long way to go,'' said
Metcalf.
''Before Web services can really be effectively used, especially outside the
firewall, we have to get a robust security model in place,'' explained
Metcalf.
XML overhead may be an issue, he said. Also an issue: How people package
their services. ''It's not binary. It's 'text strings' with lots of markup,'' he
noted. Things could tend to break, Metcalf indicated, ''if people stretch it to
its limits.''
The history of computing, he added, is one of people chasing bottlenecks. And
when they fix one, another one will pop up elsewhere.
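The overhead of ''text strings with lots of markup'' can be gauged with a rough sketch like the one below. It encodes the same made-up sales rows once as packed binary records and once as XML, then compares byte counts. The row layout, element names and data are assumptions for illustration, not any vendor's wire format.

```python
import struct
import xml.etree.ElementTree as ET

# Made-up rows of (store id, sales amount), assumed for this sketch only.
rows = [(store_id, 19.99 + store_id) for store_id in range(1000)]


def as_binary(rows):
    """Pack each row as a 4-byte int plus an 8-byte float, with no padding."""
    return b"".join(struct.pack("<id", sid, amt) for sid, amt in rows)


def as_xml(rows):
    """Describe each row with hypothetical <row>, <storeId>, <amount> tags."""
    root = ET.Element("sales")
    for sid, amt in rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "storeId").text = str(sid)
        ET.SubElement(row, "amount").text = f"{amt:.2f}"
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    binary_size = len(as_binary(rows))
    xml_size = len(as_xml(rows))
    print(f"binary: {binary_size} bytes, XML: {xml_size} bytes")
    print(f"XML is about {xml_size / binary_size:.1f} times larger")
```

The ratio depends heavily on tag names and data types, and compression narrows the gap, so the numbers are only a starting point for the kind of measurement IT shops would need to make on their own data.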
Warehouse insulation
''Vendors will use Web services as an
interface,'' said Wayne Eckerson, director of education and research, The Data
Warehousing Institute.
''From the vendor's viewpoint, the promise is that we will have true
Internet-based services that people can dial into; and people will be able to
find those services from a variety of folks in a variety of categories as easily
as in the [telephone] Yellow Pages,'' said Eckerson.
Some will be fee-based, others will be free. Some will reside on extranets,
some will be internal. ''We won't get there for a long time because, although
those services exist today, they are now delivered in proprietary format,'' said
Eckerson.
Firms have already made progress in moving toward XML-enabled data
architectures, he noted.
''I've seen customers use it to insulate layers of their BI architecture
so they can plug-and-play components or swap out components as necessary,'' he
said.
But, Eckerson added, people will be very cautious about opening up their
systems to the world. Overall, standards bear watching, Eckerson said, noting
that XMLA was definitely becoming a reality.
It may be said that SOAP -- the original driver of the notion of XML-based
Web services -- obtained its simplicity by overlooking niggling details. But the
idea has a hold on many in the data warehousing segment. Filling in the details,
however, will be an ongoing effort. Stay tuned.
Includes reporting by Mike Bucken
See the representative product listing ''Snapshot of BI tools
supporting Web services''