Evaluating meta data tools

It is no surprise that meta data has become a hot issue. Companies increasingly want business users to do more with less help from centralized IT, and they want the IT organization to produce better results faster. To leverage all of the meta data from the disparate sources in your organization, you need a way to gather it into a central location: a meta data repository. This brings meta data integration tools into the picture. These tools can help you integrate your enterprise meta data into the repository and make it available to all the user groups in your company, such as business, IT and executive users. But how do you determine which tool is right for your organization? A wide variety of tools on the market claim to solve all of your meta data problems but, as we all know, vendor hype often does not match reality.

Within any organization there are going to be different "sources" of meta data. Meta data integration tools can integrate meta data from three broad source categories, each of which involves a different level of integration complexity. It is important to identify every source of meta data that you need to integrate into the repository.

Meta data sources fall into three classifications: certified, generic and non-supported. A certified source is one that the meta data integration tool can read directly, interpret properly and load into the correct attributes of the meta model without changing the model. An example of a certified source is a Computer-Aided Software Engineering (CASE) tool such as ERwin. Most meta data integration tools are certified for several vendor tools.

A generic source is one in a common format (for example, tab-, space- or comma-delimited) that the tool can read but cannot interpret, and which may require a change to the meta model. Most tools support one or more generic meta data sources. Examples of generic meta data sources include data stored in databases and spreadsheets.

The last classification is the non-supported source, which is neither certified nor generic: the meta data integration tool can neither read nor interpret it. Accessing a non-supported source may require extensive analysis to produce an in-house solution.

Once you have classified the meta data sources available, you will quickly be able to determine the complexity of your project and whether the tools you are evaluating support all of your sources. Table 1 lists some of the more common meta data sources and how they are typically classified. (Note that your specific implementation may differ.)

Table 1: Example of meta data sources
Meta data source | Meta data description | Type | Model extension
CASE tool | Physical and logical models, domain values, technical entity definitions and technical attribute definitions | Certified | No
Extraction/transformation tool | Technical transformation rules | Certified | No
Custom data dictionary | Business attribute and entity definitions | Non-supported | No
MS Excel | Data stewards list | Generic | Yes
Reporting tool | Access patterns and frequency of use | Generic | Yes
Source: David Marco
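
The three classifications reduce to two questions about each source: can the tool read it, and can the tool interpret it? A minimal Python sketch of that decision follows. The names MetaDataSource, tool_can_read, tool_can_interpret and classify are hypothetical, introduced only for illustration, and the yes/no answers given for the three sample sources are assumptions chosen to match their types in Table 1; your own tool may classify them differently.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    CERTIFIED = "certified"
    GENERIC = "generic"
    NON_SUPPORTED = "non-supported"


@dataclass(frozen=True)
class MetaDataSource:
    """Hypothetical record describing one source of meta data."""
    name: str
    tool_can_read: bool       # the tool can read the source's format
    tool_can_interpret: bool  # the tool can map the content onto its meta model


def classify(source: MetaDataSource) -> SourceType:
    """Apply the certified / generic / non-supported definitions."""
    if source.tool_can_read and source.tool_can_interpret:
        return SourceType.CERTIFIED
    if source.tool_can_read:
        # Readable but not interpretable: may need a meta model change.
        return SourceType.GENERIC
    # Cannot be read, so it cannot be interpreted either.
    return SourceType.NON_SUPPORTED


# Assumed answers for three of the sources listed in Table 1.
sources = [
    MetaDataSource("CASE tool", tool_can_read=True, tool_can_interpret=True),
    MetaDataSource("MS Excel", tool_can_read=True, tool_can_interpret=False),
    MetaDataSource("Custom data dictionary", tool_can_read=False, tool_can_interpret=False),
]

classified = {s.name: classify(s).value for s in sources}
for name, source_type in classified.items():
    print(f"{name}: {source_type}")
```

Running the same check over your full source inventory gives a quick count of how many sources will need model changes or in-house work for each candidate tool.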

Vendor interview process
After determining your requirements, you are ready to begin the vendor tool interview process.

By researching product information that is readily available on the Internet and in industry magazines and journals, you can narrow the field to the vendors whose tools meet your general criteria. Once you have prepared a preliminary list of vendors, you can start interviewing and evaluating them. Try to have each vendor come on-site and demonstrate the tool against your existing meta data sources rather than relying on the vendor's canned demos.

To weigh all of the criteria objectively, use a weighted vendor checklist to perform the tool analysis. The checklist produces a numerical score for each tool based on the criteria you specify. Table 2 shows a small excerpt of a completed checklist, outlining the kinds of questions that need to be answered when evaluating meta data tools. An in-depth checklist gives you an unbiased view of how well each tool fits your company's needs, as well as the tool's strengths and weaknesses.

Table 2: Excerpt from the complete weighted vendor checklist
Section/Description | Weight | %Met | Score | Comments
Technical Requirements
What programming requirements are needed to support the proposed meta data repository solution (that is, script, SQL and so on)? | 9 | .75 | 6.75 | Learning curve
How does the product allow multiple meta data developers to work simultaneously with the same IT project? | 6 | 1 | 6 | What memory and processing requirements are needed for each user? How does the vendor suggest calculating these needs?
Meta data management
Is the meta data repository tool active or passive in controlling the processes of the IT environment? If active, explain. | 7 | .3 | 2.1 | What agents and/or triggers can be used to make the repository proactive?
Source: David Marco
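
The scoring behind a checklist like Table 2 can be sketched in a few lines of Python. The sketch assumes that each item's score is its weight multiplied by the fraction of the requirement the tool meets (%Met), which is how the rows with weights 6 and 7 work out (6 x 1 = 6 and 7 x .3 = 2.1); under the same assumption the first row comes to 9 x .75 = 6.75. The names ChecklistItem, total_score and max_score are hypothetical, and the percentage-of-maximum figure is an added convenience for comparing tools, not something the checklist itself prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One question from a weighted vendor checklist (hypothetical structure)."""
    question: str
    weight: float        # importance of the requirement to your company
    fraction_met: float  # 0.0 (not met) to 1.0 (fully met), the %Met column

    @property
    def score(self) -> float:
        # Assumed rule: score = weight x fraction met.
        return self.weight * self.fraction_met


def total_score(items: list[ChecklistItem]) -> float:
    """Sum of the individual item scores for one tool."""
    return sum(item.score for item in items)


def max_score(items: list[ChecklistItem]) -> float:
    """Score a tool would earn if it fully met every requirement."""
    return sum(item.weight for item in items)


# Weights and %Met values taken from the three rows of the Table 2 excerpt.
excerpt = [
    ChecklistItem("Programming requirements", weight=9, fraction_met=0.75),
    ChecklistItem("Simultaneous multi-developer support", weight=6, fraction_met=1.0),
    ChecklistItem("Active or passive repository", weight=7, fraction_met=0.3),
]

for item in excerpt:
    print(f"{item.question}: {item.score:.2f}")

total = total_score(excerpt)
best = max_score(excerpt)
print(f"Total: {total:.2f} of {best:.2f} ({total / best:.1%})")
```

Scoring every candidate tool against the same weighted items lets you compare totals directly, and the per-item scores show where each tool is strong or weak.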

Once the checklist is complete, you can look at the scores to see just how well or how poorly each product did. The higher the score, the better the tool fits your needs. Always remember that every company has its own unique meta data requirements and that no tool is perfect. Be prepared to compromise along the way, but if you keep your requirements and priorities clearly in mind, and manage to ignore the bells and whistles that vendors will try to distract you with, you can't go wrong.

About the Author

David Marco is the author of Building and Managing the Meta Data Repository: A Full Life-Cycle Guide from John Wiley & Sons. He is founder and president of Enterprise Warehousing Solutions Inc. (EWS), a Chicago-based system integrator. He can be reached at 708-233-6330 or via E-mail at [email protected].