In-Depth
Meta is the word
The basic definition of meta data is simple -- 'Data about data.' With definitions out of the way, however, the subject quickly becomes more complex. The reason: Meta data means different things to different people in different industries. In particular, the meta data view of the object developer is quite different from that of the data warehouse builder. But a 'unified,' all-encompassing meta data format, sometimes seemingly the goal of meta data repository standards, is probably not going to appear any time soon, say experienced industry practitioners.
Still, standards would help, and the data warehouse segment may have stolen a march in this regard. It is a best-of-breed business in which standard models are increasingly critical. Traditional data expert Oracle Corp. is working on such a standard, with something of an eye toward broader meta data standard solutions as well. Meanwhile, the meta data standard created by powerful industry force and relative data upstart Microsoft Corp. seems further ahead in the standards race. Just last month, the Open Information Model (OIM) based on Microsoft technology was accepted as a standard by the Meta Data Coalition (MDC), a standards coalition of software and services companies.
Microsoft's repository efforts actually go back a number of years.
The company initially worked with Texas Instruments Software (now part
of Sterling Software) and other software players to define its repository.
More recently, it has worked closely with Rational Software and Platinum
Technology (now part of Computer Associates) to complete the work. As
Microsoft has increased its presence in the database business, and more
recently launched an online analytical processing (OLAP) product line,
its repository efforts have come to include OIM. "Microsoft went after
the architecture," said David Marco, president of Enterprise Warehousing
Solutions Inc., Palos Hills, Ill. "Architecture is tough. But they brought
in all the data warehouse majors. Now the Meta Data Coalition owns those
Microsoft OIM models," he added. In a way, OIM works as the model store,
while the Microsoft repository acts as the meta data engine.
Of course, users and observers alike take standards efforts with a grain
of salt. While a meta data standard may yet emerge as a universal model
embraced by all software developers everywhere, you may be hard-pressed
to find true believers among the practitioners getting their hands dirty
building meta data repositories.
Like other technologies, meta data repository plans are influenced
by the emergent eXtensible Markup Language (XML) standard for data description.
Besides representing data types in ways specific to specific industries,
XML may also have use as a meta data interchange format between distinct
meta data repositories. The potential for distributed architectures,
rather than centralized repositories with their spotty record, is enough
to give some pause.
In fact, the specification Microsoft and its partners developed through
the Meta Data Coalition, and the competing and as-yet unpublished data
warehousing standard created by Oracle and partners including IBM, still
seem pretty academic to developers of meta data repositories. For Microsoft,
it has become something of a competitive issue. The SQL Server database
has been a primary part of Microsoft's push for continued growth. The
market target is therefore relational leader Oracle. One niche that
Microsoft has targeted is the growing area of OLAP and data warehousing.
As part of its push, the company went after the data architecture --
or infrastructure -- which led to an effort with partners and the Meta
Data Coalition to develop a standard way of dealing with meta data.
The view that Microsoft is suddenly far ahead of the competition may
be tempered by advisories from Meta Group analyst (and OLAP and warehouse
maven) Aaron Zornes, who recently said that many users could be said
to be waiting to see what OIM really is. A clear disposition of the
status of the Platinum Technology repository (an important element of
the Microsoft repository), said Zornes, is also awaited from Computer
Associates. "Is OIM ready for prime time? Have people deployed it?"
he asks. "It's the best effort to date," said Zornes, answering question
one. "I haven't seen it deployed yet," he added, answering question
two.
Does meta matter?
Why bother with meta data at all? Without a meta data standard, developers
are forced to write interfaces between best-of-breed tools available
in larger database, data warehousing and data mart markets. With one
standard, this time-consuming task goes away. When you talk to developers
who are actually building meta data repositories, the skepticism about
standards borders on cynicism. They have heard big announcements of
universal standards before and have seen them evaporate. Some developers
of meta data repositories hold out hope for the eventual industry-wide
adoption of one of two competing standards being marketed by Microsoft
and Oracle, but there is a bit of disparagement.
While some industry analysts suggest the Y2K problem has convinced
the developer community of the need for standard models for applications
and databases -- in Y2K remediation, much time was lost figuring out
what a corporation's data was about -- one expert pronounced a pox on
both the Microsoft and Oracle houses, noting "We'll be fixing the Year
3000 problem before there is a meta data standard."
Other impressions are more measured, but still laced with cynicism.
"My feeling on the standards is a little cynical," admits Brenda Moncla,
senior national data warehouse consultant at Chicago-based Sysix Technologies.
"My cynicism comes from being a practitioner in this environment for
a very long time, and having worked for a software product company.
I understand the claims, politics, vendor alliances and things like
that. And I think what we tend to overlook is how big and how complex
this problem is.
"One of the big issues is that all meta data is not created equal,"
she added. "Therefore, it is very difficult to trace an element from
our operational environment through our DSS [decision support system]
environment using meta data in an easy and intuitive fashion."
Moncla says vendors of meta data tools have not yet gotten their acts
together. "Then we come to the tool vendors and products that are actually
considered meta data tools," she said. "We are talking typically about
repositories and meta models and mechanisms for moving data in and out.
Do we have a standard for the manner for which we store meta data, or
the form and format in which we store meta data? Do we have a standard
for the form and format in which we interchange meta data? Those are
two different philosophies and we've seen different vendors adopt those
philosophies over some period of time. They've either said, 'Well, gosh,
we all need to strive to make the meta data fit a particular physical
model.' Or [other vendors say], 'It doesn't matter what kind of physical
model you store it in, as long as we can all speak English when we exchange
it.'"
This can be rephrased in a question: Should developers work with a
version of the meta data models Microsoft and Oracle advocate, or will
it be enough just to accept new-on-the-scene XML as an appropriate lingua
franca? Some observers refer to XML as the ASCII of data.
Moncla is of the opinion that when you get to the nitty gritty of
a client wanting to sell products over the Internet, the meta data model
must be pretty specific as to what users need. Among developers, Moncla
is not alone in her skepticism about a grand, unified field theory for
meta data.
"I've found that meta data is different depending on which client
you go to," said Russ Burry, who works in the data warehousing practice
of Keane Inc., a Boston-based consulting firm.
This view is shared by John Spiers, formerly of One Meaning, a meta
data integration tools company acquired by Oracle last year. Spiers
now works in the EAI business unit of Forté Software Inc., Oakland,
Calif. He believes that rather than adopting a grand, unified theory,
specific meta data standards will emerge to meet vertical industry needs.
"I would say the world of meta data is much more toward defining limited
vertical market sets as opposed to a much greater framework for modeling
everything that I could ever encounter in my environment," he said.
Spiers sees the real action in standards coming in the form of XML
Document Type Definitions (DTDs), which basically define shared data structures
to be used across applications and organizations. Specific industries,
such as healthcare and telecommunications, are setting standard DTDs
so that organizations can exchange data within their specific industry.
This movement is motivated by the growing trend toward moving business-to-business
transactions to the Internet.
Taking an even more micro view of a meta data standard, Spiers believes
XML may become a universal language for data integration. "As people
look to integrate loosely coupled systems, they seem to be looking increasingly
toward XML to provide the shared data representation," said Spiers.
"What XML promises to do is to become the mechanism for sharing information
between loosely coupled applications over Web-styled networks. And the
important thing about XML is that it is basically a meta language. It's
a meta language in which you can define any data structure you want,
and within which you can represent both data and meta data."
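To make Spiers's point concrete, here is a minimal sketch in Python of a single XML document that carries both meta data (descriptions of columns) and data (one row of values). The element and attribute names used here (table, column, row, source) are invented for illustration; they do not come from OIM, XMI or any published industry DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical interchange document. The <column> elements are meta data
# (they describe the data); the <row> element is the data itself.
DOC = """
<table name="customer">
  <column name="cust_id" type="integer" source="orders_db"/>
  <column name="signup_date" type="date" source="crm"/>
  <row><cust_id>42</cust_id><signup_date>1999-07-01</signup_date></row>
</table>
"""


def split_document(xml_text):
    """Return (meta data, rows) parsed from one XML document."""
    root = ET.fromstring(xml_text.strip())
    # Meta data: one entry per column, keyed by column name.
    metadata = {
        col.get("name"): {"type": col.get("type"), "source": col.get("source")}
        for col in root.findall("column")
    }
    # Data: each row becomes a dict of field name -> text value.
    rows = [{field.tag: field.text for field in row} for row in root.findall("row")]
    return metadata, rows


metadata, rows = split_document(DOC)
print(metadata["signup_date"]["source"])
print(rows[0]["cust_id"])
```

The same parser reads both halves of the document, which is the sense in which XML can serve as a shared representation: the receiving application needs no separate channel to learn what each field means or where it came from.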
Up and humming
Bucking the skepticism and cynicism surrounding the development of
a universally accepted meta data model is Enterprise Warehousing Solutions'
Marco. While agreeing that there is currently no standard model for
meta data repositories and the applications that use them, he sees one
coming -- and soon.
"Certainly within a year, I think maybe closer to six months, you're
going to see some repositories up and humming on the Microsoft/Meta
Data Coalition standard," predicts Marco. One of the driving forces
behind this coming standardization is the production gains overtasked
IT departments stand to gain, said Marco, who is himself in the business
of building meta data repositories.
As Marco sees it: "I could just plug into this standard model. It's
just two interfaces for me to write: one to send data to the model,
[and] one for me to pull the information I need. You have gone from
a hundred interfaces at the largest end, to two. From a software development
standpoint, it is a great idea."
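Marco's arithmetic can be sketched in a few lines of Python. The counting rule is an assumption made here for illustration, not a figure supplied by Marco: without a shared model, a tool needs one interface to send meta data to each other tool and one to receive from it; with a shared model, it needs exactly one exporter and one importer.

```python
def interfaces_without_standard(n_tools):
    """Interfaces one vendor writes when its tool talks directly to every
    other tool: one outbound and one inbound per partner (assumed rule)."""
    return 2 * (n_tools - 1)


def interfaces_with_standard():
    """Interfaces one vendor writes against a shared model:
    one to send meta data to the model, one to pull it back out."""
    return 2


for n_tools in (5, 20, 51):
    print(n_tools, interfaces_without_standard(n_tools), interfaces_with_standard())
```

Under that assumption, a vendor in a field of 51 tools would write 100 interfaces without a shared model and two with one, which is the scale of saving Marco describes.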
He also sees it as a positive for those clients who need meta data
repositories, but want to continue to benefit from having a best-of-breed
market for tools. Rather than creating a Tower of Babel, all of the
available tools could work together as long as software vendors embrace
the standard.
IT departments in major corporations would also gain; if meta data
standards are adhered to in applications development, it would be easier
for new hires in the glass house to maintain these systems. It would
also eliminate some of the "quirkiness" built into many legacy COBOL
applications that has come back to haunt Y2K teams as they sort out
how a long dead or retired programmer defined dates.
Marco believes the enormous cost of fixing Y2K has been a wakeup call
to many organizations that now realize that had data standards been
in place in 1970, the century date change problem would have been easier
to fix.
A meta data standard would therefore have a lot of advantages, on
a number of levels. And yet, there still is no accepted standard. The
major obstacle now to the adoption of a standard, as Marco views it,
is yet another grudge match between Bill Gates and Larry Ellison.
As Marco sees the competition in the real world, the battle over meta
data standards is partially due to the Microsoft strategy of moving
into and taking over markets it does not yet own. Microsoft wants SQL
Server to be the dominant enterprise database. But in Marco's view,
SQL Server is not dominant either technologically or in market share.
Oracle has the better database for enterprise applications. But examples
of Microsoft succeeding in new markets are plentiful.
Microsoft is enticing IT departments to buy SQL Server by offering
a free meta data repository based on the OIM. Since people like getting
things for free, it is a marketing strategy that could worry Oracle.
Or as Marco whimsically puts it: "You're Oracle and you're the 600-pound
gorilla in the database world. But you look up and see that the 1,000-pound
gorilla has just entered your jungle."
For many database developers and administrators, implementing repositories
is too daunting a task today. While the Microsoft repository may be
viewed as simplified by some, its use of some object-oriented development
methods (read: COM) and the Unified Modeling Language (UML) can make
it less than intuitive to the uninitiated.
There are shortcomings to the Microsoft standard, Marco said, but
he predicts it will begin to be used in meta data repository development
within the next six months. He notes that COM is being removed from
repository models, though not from the Microsoft Repository. In fact,
COM is not all that familiar to a lot of application developers, much
less to database developers.
"I can talk to you all day about what the Microsoft repository doesn't
do because I have a product to look at," Marco said. "With Oracle, I
don't have a product to look at. Certainly, both companies have a ways
to go. But the Microsoft standard has had more input from everybody."
Playing catch up
In order to catch up, Oracle has acquired One Meaning, a meta data
tools and standards technology company. It then partnered with IBM and
began working on a standard it plans to offer the OMG in September.
Marco leans toward the Microsoft OIM standard, because while he does
not believe very many, if any, developers are using it at the moment,
it is available with SQL Server.
The reason you can find fault with OIM, he said, is that it is published
and available on the Meta Data Coalition Web site. The Oracle model,
on the other hand, is not yet published and the tools that will incorporate
it are still in what Oracle calls internal beta testing.
Understanding the status of the Oracle repository effort requires
some background. In mid-1998, Oracle outlined its future meta data repository
strategy. Details and timetables are still awaited at the time of this
writing. Early in 1999, Oracle joined with IBM and Unisys to promote
the XML Meta Data Interchange (XMI) format within the Object Management
Group. At the time, XMI's purpose was described as providing a vendor-
and middleware-neutral open interface format for meta data in distributed
environments. The plan was to start with modeling and programming meta
data and to expand to data warehouse, component and other data types.
Some observers suggest that Oracle's efforts in the data warehouse realm
-- which may use the banner of the Common Warehouse Metadata (CWM) model
-- have been complicated by its efforts to support a more encompassing
meta data effort. Clearly, Oracle wants to expand its XMI efforts into
data warehouse formats.
What's next?
But what happens when Oracle's model becomes available this fall or
early next year? Does the industry end up with two incompatible models,
and a prolonged marketing war between Microsoft and Oracle with the
major casualty being the hope for a standard model? Oracle's view, as
voiced by Michael Howard, the company's vice president of data warehousing,
is that the Oracle CWM model will emerge from the war as the industry
standard. He said Oracle has been able to leverage its dominant position
in the enterprise database market to develop a standard based on input
from its large customer base. Thus, CWM is designed to meet the real
world needs of corporations.
Howard argues that Microsoft had limited access to information from
both tool vendors and corporations because SQL Server is not an enterprise
solution and nobody trusts Microsoft enough to share much information
with them.
In Oracle's scenario, Microsoft will eventually build a bridge from
OIM to CWM in the same way that it had to make some accommodation between
ActiveX and Java.
However, Enterprise Warehousing Solutions' Marco is not so sure it
won't be the other way around. "Microsoft has done some very good work
in developing their standard with input from all its software vendor
partners," he said.
It cannot be said that the two camps are not talking to one another.
In April, the Meta Data Coalition and the Object Management Group announced
a cooperative effort to develop meta data standards. They describe this
as a "formal technical liaison." So the notion of détente, as
opposed to wrestling, seems proper. Marco, for one, said détente
may work. Rather than see the two software industry gorillas stage a
Wrestlemania over meta data standards, Marco voices hope that Microsoft
and Oracle can reach an accommodation. Could, perhaps, the best of the
two models be merged into one grand, unified standard?