In-Depth

The next step for meta data: Application integration

Talk about teaching old dogs new tricks: Meta data repositories, around since at least the 1990s and employed primarily for data warehousing and business intelligence applications, are now being used to help companies develop an enterprise-wide approach to application integration.

If done correctly, a meta data-based approach can allow for reuse of interface definitions, messages and other pieces of the integration puzzle. For reuse to happen, though, developers must be able to quickly and easily find previously developed pieces. A centralized repository is one way to make that happen.

The Holy Grail for this technology is creating, managing and reusing meta data models that are then deployed to execution engines to reduce the amount of code that must be developed manually. But given the holes in today’s technology and the organizational issues that remain around reuse, the consensus is that this scenario is still a long way off.

Benefits beyond reuse
Even without reuse, meta data can provide substantial benefits, namely system-wide change management and impact assessment. In other words, meta data can help an IT team figure out that hooking application X into application Y will require changes to 15 specific interfaces, adapters, business processes and messages. Future changes to either side of the connection can likewise be tracked and implemented more easily, with fewer resulting errors, especially in applications that interact with many different data sources.
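A minimal sketch of that kind of impact assessment, assuming a hypothetical repository that records which artifacts depend on which others (all names below are invented for illustration and do not come from any product mentioned here):

    import java.util.*;

    // Illustrative impact analysis: walk a dependency graph held in a
    // meta data repository to find every artifact touched by a change.
    public class ImpactAnalysis {
        // artifact -> artifacts that depend on it
        private final Map<String, List<String>> dependents = new HashMap<>();

        public void record(String artifact, String dependentArtifact) {
            dependents.computeIfAbsent(artifact, k -> new ArrayList<>()).add(dependentArtifact);
        }

        // Everything that must be reviewed if 'changed' changes.
        public Set<String> impactOf(String changed) {
            Set<String> affected = new LinkedHashSet<>();
            Deque<String> toVisit = new ArrayDeque<>();
            toVisit.push(changed);
            while (!toVisit.isEmpty()) {
                for (String d : dependents.getOrDefault(toVisit.pop(), List.of())) {
                    if (affected.add(d)) toVisit.push(d);  // visit each artifact once
                }
            }
            return affected;
        }

        public static void main(String[] args) {
            ImpactAnalysis repo = new ImpactAnalysis();
            repo.record("AppY.customerFeed", "adapter.X-to-Y");
            repo.record("adapter.X-to-Y", "message.CustomerUpdate");
            repo.record("message.CustomerUpdate", "process.NewAccount");
            System.out.println(repo.impactOf("AppY.customerFeed"));
            // -> [adapter.X-to-Y, message.CustomerUpdate, process.NewAccount]
        }
    }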

A meta data repository can help with integration in two primary ways, said Ron Schmelzer, founder and senior analyst at ZapThink LLC in Waltham, Mass. “It can help you access data you may not be aware of -- loosely coupling the application source with the consumer -- and it can help smooth differences between the information sources themselves.”

A big reason behind the renewed interest in meta data is a bevy of new regulations, including Sarbanes-Oxley, a federal law that requires public companies to provide better disclosure to protect investors. Among other things, it calls for a four-year retention period for many different types of documents, as well as for e-mail and instant-messaging communications. It often falls to IT to prove to regulators and others that these rules have been followed, and meta data can play a key role here.

“Meta data is gaining in exposure because of a need to manage assets better,” explained David Marco, president of independent consultancy Enterprise Warehousing Solutions Inc. in Hinsdale, Ill. “The second you have to cut across the enterprise, meta data comes front and center. As one of our customers said, ‘If we had a fully functioning repository, Sarbanes-Oxley would be a non-issue.’”

In essence, said Marco, “meta data helps you to manage your systems. Every company has a ton of meta data; the question is if you’re going to manage it” or just leave it in its silos of separate applications: customer relationship management, enterprise resource planning, generic databases or spreadsheets.

In addition, there is a shift going on in how people are approaching integration projects in general, said ZapThink’s Schmelzer. “Optimally, you don’t want to move information from one place to another and keep replicating it -- like you do with the older extract-transform-load approaches.” The newer method, with much less impact on network traffic and system resources in general, is to “leave the information where it’s at” and create meta data and hooks to allow applications to use it right where it already exists, he noted.
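A rough sketch of that leave-it-in-place pattern, with invented names throughout (this illustrates the idea, not any vendor’s API): meta data resolves a logical attribute to the system that owns it, and the value is fetched on demand rather than copied by an ETL job.

    import java.util.Map;

    // Illustrative "leave it where it is" access: meta data maps each
    // logical attribute to its owning system, and values are fetched on
    // demand instead of being replicated into a warehouse.
    public class VirtualAccess {
        interface SourceSystem { String fetch(String attribute, String key); }

        private final Map<String, SourceSystem> ownerOf;  // attribute -> owning system

        VirtualAccess(Map<String, SourceSystem> ownerOf) { this.ownerOf = ownerOf; }

        String get(String attribute, String key) {
            // Resolution happens at query time; nothing is copied.
            return ownerOf.get(attribute).fetch(attribute, key);
        }

        public static void main(String[] args) {
            SourceSystem crm = (attr, key) -> "42 Main St.";  // stand-in for a live system
            VirtualAccess va = new VirtualAccess(Map.of("customer.address", crm));
            System.out.println(va.get("customer.address", "C-1001"));  // 42 Main St.
        }
    }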

Technical, organizational challenges
Whatever the approach, using meta data for integration is not for the meek. Integration can be challenging enough on its own, and then you need to factor in the issues around managing meta data, among them the lack of well-understood industry standards and the lack of infrastructure in most companies for effectively collecting and using meta data. Using multiple meta data tools or repositories for different integration projects, as most customers currently do, only compounds the technical and management problems of eventually uniting it all.

And there are cultural and organizational issues to muddle through, from turf wars over which piece of an enterprise “owns” which data, to helping programmers understand and make better use of meta data.

Fundamentally, effective use of meta data helps programmers avoid having to re-invent the same wheels over and over again. “We’re pre-analyzing the information that a project team would need,” said Charles Betz, who heads up the meta data capability for a Fortune 100 retailer he was not allowed to name. Traditionally, “every time a project team gets together, they’d get out their machetes and hack their way through, and they’d wind up with spreadsheets and Visio diagrams,” he explained. “Then the next application would come along, and they’d repeat the exercise. Stuff gets redone because there’s no system of record for it. You’re always re-analyzing.”

Here, meta data can help by telling that team that the customer address, for instance, is always up-to-date in one particular system of record, that the customer number is current in a second system, and that the customer name lives in a third.
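In code terms, that guidance can be as simple as a lookup table the repository maintains; a minimal sketch, with invented system and attribute names:

    import java.util.Map;

    // Illustrative system-of-record lookup: the repository answers
    // "which system is authoritative for this attribute?" so project
    // teams stop re-discovering it with spreadsheets and Visio diagrams.
    public class SystemOfRecord {
        private static final Map<String, String> SOR = Map.of(
            "customer.address", "CRM",        // always current here
            "customer.number",  "Billing",
            "customer.name",    "MasterData");

        public static String authoritySourceFor(String attribute) {
            return SOR.getOrDefault(attribute, "unknown -- analyze before use");
        }

        public static void main(String[] args) {
            System.out.println(authoritySourceFor("customer.address"));  // CRM
        }
    }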

At its core, meta data helps the IT group to better understand its own environment -- which systems it is supporting and how they interact.

“You start to break down the IT infrastructure so it doesn’t look like one giant blob of servers, to inventory the systems and do it precisely,” Betz said. He feels so strongly about this notion of using meta data to manage information technology itself that he has started a weblog on the issue; it can be found at http://www.erp4it.com

Technology catch-up
In general, the technology is just catching up to where Betz and others need it to be. The problem with most of today’s repositories is that they can collect only the information related to a specific toolset or information type. Some repositories handle only relational data; some handle XML. Others handle UML models; some do not. It has been difficult to find one uber-repository that can accommodate all the types of information found in most large companies.

While UDDI registries certainly can and do handle meta data, they are pretty much the “ugly ducklings” of the Web services world, said Jason Bloomberg, senior analyst at ZapThink. “UDDI as a standard isn’t really complete, so you wouldn’t go and buy yourself one. They always do other stuff, like LDAP or other things.”

Also, “there are modeling tools that can help to define application integration meta data,” said Jess Thompson, research director at Gartner Inc. in Stamford, Conn., “but there’s no consensus.” Today’s application integration tools -- including the broker suites from the likes of Tibco or webMethods -- all have repositories, sure. But most are based on proprietary technology, meaning that users must store all the meta data in the specific tool, which has its limitations.

“You can ask it the impact of changing a specific field in an application, but it can’t tell you all the adapters or other things that might be affected by a change,” explained Thompson. “You’re solving part of the problem, but not the entire problem.”

And, added Thompson, if you have more than one tool for integration, “you’re in trouble” because a particular tool generally captures only the meta data associated with it. So all integration-related meta data needs to be in Tibco’s suite, for instance, to be able to do system-wide impact assessment or change management.

That situation is starting to change, thanks to the Meta-Object Facility (MOF), a standard from the Object Management Group (OMG) that underpins a family of related specifications, including the Common Warehouse Metamodel.

The tools
Vendors, including Redwood City, Calif.-based Informatica Corp. and Bournemouth, U.K.-based Adaptive, have implemented MOF for their warehouse tools and engines. Informatica’s SuperGlue helps to put context around information used in traditional enterprise application integration (EAI), said Sanjay Poonen, senior vice president of worldwide marketing at Informatica.

“EAI’s done a very good job of messaging, routing, workflow and so on,” he noted. “But a messaging system does not help you to find out what the context is around a number, say 400 million. Is that dollars? Where did that number come from? And which country conversion code was used in the calculation?”
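A minimal sketch of the kind of context record Poonen is describing, with invented field names (no Informatica API is implied): the figure travels with its unit, source and conversion rule instead of as a bare number.

    // Illustrative lineage record: the kind of context a meta data layer
    // can attach to a figure that a pure messaging backbone cannot.
    // All field names are made up for this sketch.
    public record Lineage(String metric, double value, String unit,
                          String sourceSystem, String conversionRule) {
        public static void main(String[] args) {
            Lineage l = new Lineage("Q3 revenue", 400_000_000, "USD",
                                    "EU sales ledger", "monthly average exchange rate");
            System.out.printf("%s = %,.0f %s (from %s, converted via %s)%n",
                    l.metric(), l.value(), l.unit(), l.sourceSystem(), l.conversionRule());
        }
    }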

Informatica SuperGlue, introduced in August 2003, collects, stores and helps to analyze meta data, including providing audit trails. “The value is in being able to see the dependencies and linkages between data,” Poonen explained. Six customers, including one federal government agency that Poonen could not name, currently use SuperGlue.

For its part, Adaptive’s IT Portfolio Manager separates the assets being managed -- applications, data, business processes and the like -- from the management features. New asset categories can be added.

“We can work with pretty much any data or object-oriented design tools to capture a schema directly from a database,” said Pete Rivett, consulting architect for the firm and the editor of the MOF standard. “We don’t move data around; we’re managing the design of the data from an enterprise context.”

Around 20 organizations use Adaptive’s wares for heavy-duty integration work, Rivett estimates.

MetaMatrix, New York City, is another player in this area. Its MetaBase repository product is about to hit Version 4.0, which is promised by the end of March. When that happens, it will be “fully” MOF-compliant, explained Michael Lang, the company’s founder and executive vice president of development.

“You need a way of capturing meta data from disparate systems. But if I try to represent a relational structure in XML,” noted Lang, “that causes me to make too many compromises.”

That problem will be solved with Version 4.0 of MetaBase, he said. This will allow customers to collect meta data from Rational Rose, relational databases, Popkin diagrams and pretty much anything they want.
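For the relational piece, the standard JDBC DatabaseMetaData API is one well-established way to harvest such meta data into a repository; a minimal sketch follows (the in-memory H2 connection URL is a placeholder -- any JDBC driver on the classpath would do):

    import java.sql.*;

    // Illustrative schema harvesting via the standard JDBC
    // DatabaseMetaData API: enumerate every table, column and type.
    public class SchemaHarvester {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo")) {
                DatabaseMetaData md = conn.getMetaData();
                try (ResultSet cols = md.getColumns(null, null, "%", "%")) {
                    while (cols.next()) {
                        System.out.printf("%s.%s : %s%n",
                                cols.getString("TABLE_NAME"),
                                cols.getString("COLUMN_NAME"),
                                cols.getString("TYPE_NAME"));
                    }
                }
            }
        }
    }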

Other vendors take different approaches, and not all are using MOF as their basis. San Jose, Calif.-based BEA Systems’ LiquidData, for instance, allows customers to access, aggregate and share information via a repository-based approach. Although it uses J2EE and XQuery, it is not yet MOF-compliant, and the tool is available only to customers of BEA’s WebLogic development platform.

Infravio Ensemble, from Cupertino, Calif.-based Infravio Inc., allows customers to manage Web services delivery contracts. So if there are different classes of users on mobile devices or a browser, each might have different service-level requirements. Infravio’s repository holds information about service levels and specific clients, among other things. And the TigerLogic XML Data Management System (XDMS), from Irvine, Calif.-based Raining Data Corp., provides a native XML database that caches models and their underlying data, so even if the data source goes down it is still available.
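A rough sketch of the delivery-contract idea, with invented client classes and numbers (this is not Infravio’s API): each class of consumer maps to its own service level.

    import java.util.Map;

    // Illustrative delivery-contract lookup: the repository binds each
    // class of consumer to its own service level. Names and numbers are
    // invented for this sketch.
    public class DeliveryContracts {
        record ServiceLevel(int maxLatencyMs, String payloadFormat) {}

        private static final Map<String, ServiceLevel> CONTRACTS = Map.of(
            "mobile",  new ServiceLevel(2000, "compact XML"),
            "browser", new ServiceLevel(500, "full XML"));

        public static void main(String[] args) {
            System.out.println(CONTRACTS.get("mobile"));
            // -> ServiceLevel[maxLatencyMs=2000, payloadFormat=compact XML]
        }
    }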

Composite Software Inc., San Mateo, Calif., is brand new to the scene and uses a repository approach to help provide a unified view of all corporate data. At the moment, its Composite Information Server product is being marketed more as a way to process queries and present the resulting information to end users than as a tool for IT to handle heavy-duty application integration.

More than technology

Beyond technology, successful meta data-based integration will depend on corporate culture and practices. Not everyone agrees that a centralized “center of excellence” for integration meta data is an absolute prerequisite, but some methodology or approach clearly is -- especially if reuse is ever going to happen.

Brian Mulloy, director of product marketing for BEA WebLogic Integration, said that a centralized approach can be great, if it is done right. At one leading-edge company, he said, the integration group is set up as a central developer resource. When developers begin a project, they e-mail the integration center to find out if there is anything already done that they could use or build on.

“But if I send the e-mail and there’s no response in a day or two or three, I quickly stop doing that,” he explained. “I was talking to another company who tried that, but there wasn’t any response to developer requests; so now each person manages their own meta data models.”

Adaptive’s Rivett noted that there is a lot to be said for an organic, grassroots approach to integration meta data. “It doesn’t need to be a big-bang, huge corporate change; it can start with a small number of projects that use meta data,” he noted. Besides saving money, that approach saves time, since there is no need to first win over the IT chiefs, among other things.

But at some level, “you need the senior management pushing a paradigm shift” about meta data being the way to go, said MetaMatrix’s Lang. “They need to insist on this as an alternative to coding or building the project, and then pick projects that seem well-suited” to this approach. One way to convince the big guns, he said, is to look at meta data as a way of saving time and money for application development and maintenance. “If I knew where everything was, and its relationship to everything else, when a business manager gives me a task for a new application I can go to the central repository, discover everything I need and then begin writing my business logic,” explained Lang.

On the cultural side, BEA’s Mulloy said “tension” often exists between three key constituencies -- data architects, developers and enterprise architects. Each traditionally owns different pieces of the IT pie, but they must all learn to work together, especially on this issue.

What it often comes down to, said Gartner’s Thompson, is how much overall integration work a firm has done and where it is on the maturity scale. “You have to crawl before you walk, and walk before you run,” he said. “And many of the organizations deploying integration technologies are still learning to crawl.” Initially, integration begins with one specific project, then expands from there; business processes around integration can then be figured out.

By 2005, predict Gartner analysts, more than half of large organizations will have multiple sources of integration technology. “As that proliferation occurs, being able to recognize the use of meta data and have consistent use across all the different deployments becomes important,” said Thompson.

Because ultimately, as meta data implementer Betz said, “to handle integration meta data, you have to understand the meaning and semantics of the messages being passed back and forth. If you don’t understand how the data is flowing, you don’t understand your environment.”

Please see the following related stories:

“What integration meta data should you collect?” by Johanna Ambrosio
“About standards” by Johanna Ambrosio
“Best practices: Meta data integration” by Johanna Ambrosio