In-Depth
Overcoming the Obstacles to META DATA Management
- By Mike Budd
- January 1, 2001
Meta data "repositories" are slow to catch on, but XML and other technologies
promise to help overcome obstacles; emerging integrated meta data management
systems promise to better link corporate islands of information.
In September 1999, although the cause drew little attention in the general media, the lack of meta
data management led to NASA's $300 million Mars expedition failure. According
to a NASA statement dated September 30, 1999, "a failure to recognize and correct
an error in a transfer of information between the Mars Climate Orbiter spacecraft
team in Colorado and the mission navigation team in California led to the loss
of the spacecraft." The statement went on to say that "preliminary findings
indicate that one team used English units (e.g., inches, feet and pounds), while
the other used metric units for a key spacecraft operation."
If basic meta data management processes had been used, the key variables would
have been explicitly and visibly linked to the units in which they were to be
measured; these links would then have been replicated for each team. The software
failure would have been avoided because the unit of measurement information
— relevant meta data for any variable — would have been managed with the variable
definition.
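The idea can be illustrated with a minimal Python sketch in which the unit travels with the variable definition, so a value cannot be handed to another team without an explicit conversion. The class names, the variable name and the sample value are hypothetical and are not taken from NASA's software; impulse units were chosen only for illustration, and the factor of 4.448222 newton-seconds per pound-force-second is the standard conversion.

```python
from dataclasses import dataclass

# Conversion factors to a common base unit (newton-seconds) for impulse.
# 1 lbf*s = 4.448222 N*s (standard factor).
TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": 4.448222}


@dataclass(frozen=True)
class VariableDefinition:
    """Hypothetical definition record: the unit is managed with the variable."""
    name: str
    unit: str


@dataclass(frozen=True)
class Measurement:
    """A value that always carries its definition, and therefore its unit."""
    definition: VariableDefinition
    value: float

    def to_unit(self, target_unit: str) -> "Measurement":
        """Convert to target_unit, raising ValueError for an unknown unit."""
        source_unit = self.definition.unit
        if source_unit not in TO_NEWTON_SECONDS or target_unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"no conversion from {source_unit!r} to {target_unit!r}")
        base_value = self.value * TO_NEWTON_SECONDS[source_unit]
        new_definition = VariableDefinition(self.definition.name, target_unit)
        return Measurement(new_definition, base_value / TO_NEWTON_SECONDS[target_unit])


# One team publishes a value in pound-force seconds...
impulse_definition = VariableDefinition("thruster_impulse", "lbf*s")
reported = Measurement(impulse_definition, 10.0)

# ...and the other team converts explicitly before using it.
metric = reported.to_unit("N*s")
print(f"{metric.value:.3f} {metric.definition.unit}")
```

Because the unit is part of the definition rather than a convention held in someone's head, a mismatch surfaces as a visible conversion step or an error instead of a silent miscalculation.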
Integrated meta data management (IMDM) aims to provide a coherent, consistent
and complete view of a company's business processes, the IT systems that support
them and the company's structured knowledge. It is not just about software development.
Because of this, IMDM is becoming an increasingly attractive prospect. Since
the early 1990s, the accelerating pace of business change has put pressure on
companies to adapt more quickly to the changing competitive environment. Meta
data management is a key enabler of such change — making it easier to understand
the current state of the business, easier to experiment with potential changes,
and easier to implement changes.
The drivers for meta data management are not new. Since the late 1980s there
has been software that has offered to manage meta data in "repositories," but
it failed to find a mass market. During the last two years, however, a number
of things have changed that now have the potential to make meta data management
and/or repository technology into a mainstream IT market. Among the changes:
- Data warehousing has increased the visibility of the value of meta data.
- The year 2000 issue has made the value of understanding existing IT applications
more apparent.
- There has been considerable hype suggesting that the galaxy of eXtensible
Markup Language (XML)-related standards will "solve the meta data problem."
- Microsoft entered the repository market, thus (initially, at least) boosting
the subject's credibility.
- Two significant standards bodies, the Object Management Group (OMG) and
the Meta Data Coalition (MDC), have embarked on meta data management standards.
Meta data management software is now available from companies ranging from
Microsoft, IBM and Oracle to numerous start-ups. Some vendors base their marketing
on the potential promise of XML, others emphasize a mature technology base,
and some vendors do both.
What is meta data?
Ovum defines meta data as the information you need to store about an item in
order for an agent to be able to use it. The agent can be a human being, a machine
or a software component. The item can be an information item, software component
or a physical object (for example, a car) that can be at any level of abstraction.
The precise "information you need" will be relative to its use; in the case
of an information or software item, however, you always have to be able to find
it, understand it (this includes being able to evaluate it), and discover any
usage restrictions/costs.
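As a rough illustration of that minimum, the sketch below models a meta data record holding the three things an agent needs: where to find the item, how to understand it, and what restrictions apply. The class name, field names and sample values are hypothetical; they are not part of Ovum's definition or of any product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaDataRecord:
    """Hypothetical record covering find / understand / usage restrictions."""
    item_name: str
    location: str       # how an agent finds the item
    description: str    # how an agent understands (and evaluates) it
    usage_restrictions: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are empty or blank."""
        required = {"location": self.location, "description": self.description}
        return [name for name, value in required.items() if not value.strip()]


complete = MetaDataRecord(
    item_name="customer_table",
    location="warehouse.sales.customer",  # illustrative location only
    description="One row per active customer account",
    usage_restrictions=["internal use only"],
)
incomplete = MetaDataRecord("orders_extract", location="", description="Nightly orders")

print(complete.missing_fields())    # []
print(incomplete.missing_fields())  # ['location']
```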
Meta data's relativity means that it can be described as a temporary role for
information that is defined in terms of a particular purpose. Making meta data
available for each particular purpose would result in a series of overlapping,
partial solutions.
Creating these overlapping, partial solutions implies an overlapping effort
— re-inventing the wheel or at least parts of a wheel. Overlapping solutions
also create multiple, redundant copies of the meta data they manage. Keeping
these copies updated and in step involves extra effort that either incurs ongoing
maintenance costs or simply does not happen — with dire consequences for data
quality and the long-term success of meta data projects.
There are opportunities for removing redundancy and sharing, rather than copying,
meta data. This is the goal of integrated meta data management — to deliver
total availability of meta data for all purposes and across all levels of abstraction,
and between different types of information and software, as well as between
business areas.
Ovum defines integrated meta data management as the processes of:
- assembling and integrating information from disparate sources;
- keeping the information up-to-date and in line with business needs;
- simplifying it, and making it accessible and comprehensible to end users
and integration tools; and
- propagating change from integrated information to the real business world
(and vice versa).
In this definition, integrating information means interconnecting related information
so that one item can be accessed easily by another.
In practice, integration costs money, so a business must set priorities for
what should be integrated and when. This sets limits on what a business chooses
to manage using integrated meta data management solutions, and what it chooses
to regard as meta data.
Why integrated meta data management?
Integrated meta data management helps users decide how to change the business
and its software. Once that decision is made, IMDM makes it less costly to effect
those changes by:
- providing easy access to information and software definitions, and helping
users understand them;
- helping users simplify information and software; and
- providing services to help users make change decisions and manage the process
of change to their information, software or business.
In each case, the value-add of IMDM arises from the integration of information
and the consistency of the services by which it is managed, as well as from
supplementing the inadequate information management services provided by many pieces
of software (particularly for information tangential to their main purposes).
There are three types of meta data management software: repository software,
"stovepipe" meta data management solutions and XML-based software. Repositories
are database-based meta data management solutions. Most of the repository software
products are mature — many of them are a decade or more old — although the Microsoft
repository is a notable exception. They were originally developed to support
application development, but vendors claim that modern repositories are adaptable
enough to provide integrated meta data management across a variety of application
domains.
"Stovepipe" meta data management solutions manage meta data in a single application
context; for example, data warehousing, application development or systems management.
These solutions benefit from their specialization, but almost always lack the
sophisticated meta data management functionality offered by most repository
products.
An increasing number of new vendors are also venturing into the meta data management
market based on the promise of XML and its supporting standards and technologies.
XML technology is being used to support stovepipe and repository solutions,
and also as an alternative solution. As an alternative solution, XML technology
is considerably less capable than the more mature repository-based products.
While these solution types will compete with each other in the marketplace
over the next five years, repository products are the most capable in terms
of core functionality. However, they are hindered by the perception that they
are an aging technology — a perception that both the products and their vendors
need to do more to counter. The main repository offerings are the Oracle Repository,
the Microsoft Repository, Softlab Enabler, Viasoft Rochade, IBM Team Connection
and the Unisys Repository.
Stovepipe solutions are generally focused and effective, but do not deliver
the benefits of widely scoped integration and are frequently weak in important
functionality areas, such as configuration management. Examples include data
warehouse meta data management tools and middleware directories, such as Microsoft's
Active Directory.
The XML-based solutions are the wildcards of the market. There is a big gap
between the "XML-as-panacea" hype and what it actually delivers to assist in
meta data management. This gives vendors of XML-based products a tremendous
marketing opportunity and a significant technology challenge. XML solutions
include eXcelon's Integration Server, Sequoia's XML Portal server and Novell's
DirXML. At present, these tools integrate standard business/administration data
and may hold some more abstract information as a result — so while they fall
within our very broad definition by integrating information, most people would
not see them as meta data management tools. Indeed, the main current significance
of XML in the mainstream meta data management market is as part of the new wave
of meta data exchange standards.
Meta data solutions checklist
Meta data management is a complex market, and the attitudes of vendor groups
toward the realities of deploying meta data management solutions in real-world
situations make it more so. Choosing the right meta data management technology
is a core strategic choice for IT user firms, but the choice is not straightforward.
In fact, there are six things users need to know about meta data management:
Meta data management is integrated information management. Meta data is the
information needed to make use of a company's software and information resources.
Making this information available requires integrated meta data management,
which means integrated information management. It also requires the integration
of information from documents, including multimedia; databases, both schema
and contents; program code; and specialized information from a variety of application
areas, from applications development through business modeling to knowledge
management.
The information varies from concrete to abstract, and comes from several areas:
software and application development; app integration; data warehousing and
business intelligence; business, process and data modeling; knowledge management
and group working systems; and network management.
Effectively managing this information requires consistent management services
to help users access, understand, simplify and change the diverse data. Repository
products promise to provide such functionality and, as stated earlier, many
are mature products dating back to the 1980s.
A new generation of meta data management products has come through in the last
few years, many of them making use of XML-based technologies. The battle between
sophisticated repository products and recently introduced meta data management
products that take full advantage of both the technological and the mind-share
advantages of XML will dominate this market over the next five years.
There are big benefits, but there are also implementation risks. Meta data
management has huge potential benefits. Integrated information leads to better
understanding, which, in turn, leads to better business decisions. Furthermore,
business changes can be more easily implemented with the help of meta data management's
change management facilities.
Successful meta data management implementations must be sensitive to the implementation
risks of integration technologies. In particular, they must align control over
information with business responsibilities and demonstrate short-term value,
while keeping the long-term imperative to minimize complexity in view.
There is a big market, but much investment is needed. The potential market
for integrated information management solutions is vast. The historical evolution
of software has left disconnected information islands in almost every area —
from application development to business intelligence, and from knowledge management
to operational data management.
Integrating the management of that information adds a substantial percentage
— Ovum estimates at least 10% — to the value delivered by each of
these markets (a total of about $10 billion). But today's meta data management
solutions do not provide the necessary facilities. Today's repositories, the
most general solution, emphasize software development and business modeling
needs, and provide only partial support for business intelligence (and very
little in other areas). Information integration levels are variable. Information
cannot be easily kept up-to-date because of insufficient tool integration. Management
services are inconsistent and usually basic. All other solutions integrate information
from just one application area — they are stovepipe integration solutions.
Standards are essential, and they are coming. To integrate information, you
must agree on how information with a given meaning should be expressed. Competing
standards for the expression of abstract information are now on offer from Microsoft
(via the Meta Data Coalition) and the OMG, although both agree on the Unified
Modeling Language (UML) in the software development area.
Microsoft's Open Information Model (OIM) standard covers software development and business intelligence,
and is expanding into knowledge management and business modeling. The OMG's
UML covers software development only, but is expanding into the other areas.
OIM and UML both use an XML encoding of their models for data interchange. Standards
for less-abstract information for schemas are being developed by vertical industry
groupings. Unfortunately, these are not consistent with the more abstract standards,
which is likely to result in another software cottage industry in format transfer.
XML is important, but it is not a meta data management solution. XML is a mark-up
language, not a full meta data management solution. It can represent information,
but it provides no management services. XML can solve some integrated information
management problems by enabling information exchange between one tool and another,
and between tools and integrated information management solutions. XML-enabled
databases and other software-assisted XML solutions have the potential to provide
more functionality, but are currently limited to enterprise application integration (EAI) roles.
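The distinction can be seen in a small Python sketch that uses the standard library's `xml.etree.ElementTree` module. One tool writes a variable definition as XML and another reads it back. The element and attribute names are invented for this example and do not follow OIM, UML or any other standard.

```python
import xml.etree.ElementTree as ET

# Tool A serializes a variable definition, including its unit, as XML.
# Element and attribute names are made up for this example.
variable = ET.Element("variable", {"name": "thruster_impulse", "unit": "N*s"})
ET.SubElement(variable, "description").text = "Impulse per thruster firing"
document = ET.tostring(variable, encoding="unicode")

# Tool B parses the document and recovers the same meta data.
parsed = ET.fromstring(document)
received = {
    "name": parsed.get("name"),
    "unit": parsed.get("unit"),
    "description": parsed.findtext("description"),
}
print(received)
```

The exchange succeeds, but nothing in it records versions, ownership or the impact of a change. Those management services would have to come from a repository or from other software built around the XML.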
Despite its lack of appropriate management services, people are trying to use
XML for broader meta data management purposes. Ovum forecasts a large market
in legacy XML maintenance in a few years.
A sea change is on the way. Encouraged by the development of standards, the
repository market is changing from a centralized to a distributed model, and
from supporting application development alone to supporting other application areas as well.
The new distributed model recognizes that software vendors wish to provide
their own data storage layer; however, it resolves the consequent difficulties
in keeping repository information up-to-date by seeking out interesting information
from a user's network and requesting information from tools using standard APIs.
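A minimal sketch of that distributed style, assuming a hypothetical common interface that every tool exposes, might look like the following. The repository keeps no copies of its own; it asks each registered tool at query time. All class, method and tool names here are illustrative and do not describe any vendor's API.

```python
from typing import Dict, List, Protocol


class MetaDataSource(Protocol):
    """Hypothetical standard API that each participating tool would expose."""
    tool_name: str

    def describe(self, item_name: str) -> Dict[str, str]: ...


class DictBackedTool:
    """Stand-in for a tool that keeps meta data in its own storage layer."""

    def __init__(self, tool_name: str, items: Dict[str, Dict[str, str]]) -> None:
        self.tool_name = tool_name
        self._items = items

    def describe(self, item_name: str) -> Dict[str, str]:
        # Return a copy so callers cannot alter the tool's own store.
        return dict(self._items.get(item_name, {}))


class FederatedRepository:
    """Holds no meta data itself; queries every registered tool on demand."""

    def __init__(self) -> None:
        self._sources: List[MetaDataSource] = []

    def register(self, source: MetaDataSource) -> None:
        self._sources.append(source)

    def lookup(self, item_name: str) -> Dict[str, Dict[str, str]]:
        """Collect what each tool knows about item_name, keyed by tool name."""
        results: Dict[str, Dict[str, str]] = {}
        for source in self._sources:
            info = source.describe(item_name)
            if info:
                results[source.tool_name] = info
        return results


repo = FederatedRepository()
repo.register(DictBackedTool("modeling_tool", {"customer": {"type": "entity"}}))
repo.register(DictBackedTool("warehouse_tool", {"customer": {"table": "dim_customer"}}))
view = repo.lookup("customer")
print(view)
```

Because each answer is fetched from the owning tool when it is needed, there is no second copy to fall out of date, which is the difficulty the distributed model sets out to resolve.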
From applications development, repositories are beginning to extend outward
into business intelligence and knowledge management, and (in one case) into
distributed systems management. In so doing, they are multiplying their value
proposition — they can now deliver value by integrating information between
software areas, as well as within one area. They can also provide the background
integration structure for enterprise information portals. This role extension
will require a change in style — from controlling to assisting, and from passive
to active — that may paradoxically make them more acceptable to their original
users.
Today's business needs integrated information management, and meta data management
technology helps to bring it all together. In principle, with good meta data
management, organizations should be able to provide a coherent and consistent
view of their business processes, the IT systems that support them and their
unique knowledge structures. This is an important enabler for business change
and growth.
Many current technologies still fall short of the mark. However, forthcoming
developments suggest that we can look forward to a future of more competent,
friendlier and less-demanding software. This software will be informed by knowledge
management approaches and supported by industry meta data standards.