Rewiring EDI: XML makes new connections

AMR Research projects that B2B e-commerce will grow into a $5.7 trillion market by 2004. In the midst of this very large growth cycle, corporations that use Electronic Data Interchange (EDI) for automated links to hundreds of trading partners are moving to the Internet to connect with tens of thousands of trading partners. For these implementations to succeed, it is essential for these companies to rapidly deploy B2B trading partner networks for e-commerce so that transactions can be sent and received.

As B2B e-commerce grows, trading partners seeking to capitalize on XML's benefits are moving to streamline and automate their relationships. With thousands of products and trading partners, many firms must support extremely reliable connections to ensure quick delivery and efficient operations. Yet, challenging the growth of B2B e-commerce is the complex task of linking applications between trading partners. Many companies are now aiming to eliminate this obstacle, which represents the potential for costly, time-consuming friction in the supply chain.

For companies that have already made significant investments in EDI technology, processes and training, the new generation of XML-based e-business infrastructure presents a number of significant opportunities. While the promise of dynamic and flexible trading partner relationships has many advantages, most companies are justifiably unwilling to tear out workable systems and replace them with relatively unproven systems. Wisely, many companies are taking a hard look at the cost of evolving their EDI systems for the next generation of e-business.

The evolution has only just begun, and the transition from EDI to XML-based systems will not occur overnight. Many companies will migrate systems to XML-driven infrastructure only as standards consolidate and the technology stabilizes as the mainstay of business computing. One of the challenges facing companies looking to integrate EDI and XML-based systems is linking and synchronizing the business content, including the documents, policies and procedures that form the foundation of the two types of infrastructure. Firms planning to make the EDI-to-XML migration can apply a different technological approach—canonical modeling—to create a smooth transition with a stable foundation for future adaptability and rapid deployment timelines.

Figure 1. Scope of management: The modeling approach can automatically create maps to configure e-business systems.

Industry dynamics
Though EDI continues to dominate transactions between large enterprises, it is too expensive and cumbersome for small to mid-sized enterprises (SMEs) to adopt. Since many large firms seek to upgrade their SME vendors to electronic transactions, XML offers a viable communications channel. But XML itself is not a standard—it is a way of expressing data that is particularly well suited to communities defining their own standards. The result is nearly 900 published XML schemas, all of which require a significant amount of EDI development unique to each firm using them.

For example, manually programming a single link for a common business document such as a purchase order can require up to 40 hours and rare domain expertise in multiple EDI and XML standards. Multiplied by hundreds of transaction types and standards, the linking process becomes too cumbersome to be fast, flexible or practical. It is little wonder that EDI remains a trusted, reliable standard with ample vendor resources to ensure continued operations.

For these reasons, the market transformation from EDI to XML is estimated to require several years. Meanwhile, a dilemma emerges: Should larger companies continue to leave SME vendors out of the e-business loop until XML achieves enterprise-caliber scalability, or forge ahead before standards have stabilized? After all, connecting these disparate technologies to address SME vendors will require applications that save deployment time, increase the flexibility of communications and facilitate trading partner relationships—a tall order. Nevertheless, a number of factors will drive the EDI-to-XML evolution. Cheaper bandwidth will allow more information to be exchanged between companies for any given transaction. XML drives down implementation costs, especially in light of the well-established commodity pricing for EDI VANs that cannot go lower. Also, because many EDI implementations require significant custom development, XML conversion can reduce the amount of code written to make applications at different companies operate together seamlessly.

But to take advantage of these benefits, companies will have to redefine their integration strategies, not as discrete tasks that arise as new business needs dictate, but rather as integration ecosystems that can leverage prior efforts and introduce reuse into code development. This type of integration architecture can rapidly model communication protocols for common business documents between trading partners. For example, basic business documents such as purchase orders, invoices and shipment validation must be linked between the two systems, and the trading partners must be able to easily modify their e-commerce infrastructure to adapt to new trading partners, new connectivity options and new and ever-changing business content.

Traditionally, this integration has been done by hand, which is a complex, costly and labor-intensive process. Business analysts and programmers had to analyze and code the relationships between the data of any two documents. When the number of systems and documents increases, this leads to what is often described as an exponential curve problem: each additional system must be linked to every system already in place, so the number of connections grows far faster than the number of systems.
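A rough count illustrates how quickly hand-built links accumulate. The sketch below assumes that every system must exchange documents with every other system and that one link serves both directions; both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
def point_to_point_links(systems: int) -> int:
    """Count the distinct pairwise links needed to connect `systems` systems.

    Assumes every system talks to every other one and that a single link
    covers both directions. Double the result if each direction needs its
    own hand-coded map.
    """
    return systems * (systems - 1) // 2


for n in (2, 5, 10, 50, 100):
    print(f"{n:>3} systems need {point_to_point_links(n):>5} links")
```

Strictly speaking, under these assumptions the count grows roughly with the square of the number of systems, which is still steep enough to make hand-built links impractical at scale.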

But a new approach to B2B integration architecture can automate linking and synchronizing XML and EDI business content. Known as Enterprise Integration Modeling (EIM), this approach uses a repository-based model, much like a thesaurus, to automatically identify matches between trading partners' business content, regardless of whether it is contained in an XML- or EDI-based system. By automating this critical process, EIM has the potential to accelerate adoption of XML-based trading partner networks, while preserving and leveraging vast corporate investments in EDI-based systems.

So what is the technology that makes the EIM approach possible, and how can automation be applied to the tasks of connecting, linking and synchronizing business documents, policies and procedures between XML and EDI systems?

About modeling
Modeling and model-driven technologies enable teams to leverage their previous development efforts. In IT systems, model-driven technologies help automate the process of interconnecting systems and managing shared processes. Modeling, already broadly used in applications, is poised to take on increasing importance in interface development as it provides the means to directly impact the behavior of systems based on business models. Modeling technologies aggregate and reuse knowledge regarding the documents and processes that fuel business. This in turn enables tight integration with existing business systems without giving up flexibility and adaptability. For the emerging EDI-to-XML transition, modeling represents a primary method of transferring business rules and common business document information between applications and trading partners.

Model-driven systems change their behavior based on abstract models that describe the information and processes. For example, a picture of a dog can elicit several "interpretations" in different languages: "cane" in Italian, "perro" in Spanish and "chien" in French. A model is just like a picture that enables the correct conversion of data fields, processes and policies. The tools and user interfaces used to develop model-based systems can significantly lower development time relative to hard-coded systems because each item needs to be defined in the model only once.

Modeling systems can flatten the exponential EDI-to-XML transition curve not only by reducing the amount of time it takes to configure a system, but also by sharing models across the many systems that must be integrated. As all of the various systems involved in performing a business function evolve to use shared models, the exponential curve flattens; that is, the effort to add the 101st partner is no greater than the effort to add the first.
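The effect of a shared model can be sketched in a few lines. In this minimal illustration, each partner format is described once in terms of a canonical vocabulary, and any pair of partners is then translated by passing through that vocabulary. The partner names and field names are hypothetical and do not come from any real standard.

```python
# Each partner's fields are mapped once to canonical names.
# All names here are hypothetical, chosen only for illustration.
TO_CANONICAL = {
    "partner_a": {"PONum": "order_id", "Qty": "quantity"},
    "partner_b": {"order-number": "order_id", "amount": "quantity"},
    "partner_c": {"po": "order_id", "units": "quantity"},
}


def translate(doc: dict, source: str, target: str) -> dict:
    """Translate `doc` from the source format to the target format
    by way of the canonical model."""
    # Step 1: rename the source fields to their canonical names.
    canonical = {TO_CANONICAL[source][field]: value for field, value in doc.items()}
    # Step 2: invert the target's map and rename canonical fields to target fields.
    from_canonical = {canon: field for field, canon in TO_CANONICAL[target].items()}
    return {from_canonical[canon]: value for canon, value in canonical.items()}


print(translate({"PONum": "4501", "Qty": 12}, "partner_a", "partner_c"))
```

Adding a fourth partner in this sketch means adding one entry to the table, not writing a separate map for each existing partner.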

Characteristics of canonical modeling
Canonical modeling is an information architecture strategy to establish a shared, common view of business information by eliminating insignificant differences in the representation of enterprise data resources. Eliminating these differences improves data integrity and greatly eases the translation of EDI systems to XML-based networks.

If the basis for the canonical model is adopted from organized open standards, the value added to the enterprise data resources is increased. Done correctly, standards-based models benefit from the collective experience of the membership of the standards group; that is, defining the correct representation of data may have already been done by another member. So, if canonical modeling is a key component of an integration architecture, what should it look like?

A common assumption when working with enterprise vocabularies and canonical models is that they should be as large as possible. Some models include everything that anyone might want to use. However, the size of the model has a direct impact on costs, benefits and acceptance.

Large, precise models can make data more reusable in more business contexts. If the model is too large, however, publishing costs pose a barrier to success. In general, the larger the model and the business objects described in the model, the more effort it will take to publish. In application integration, the publisher's burden is primarily driven by manual data entry and cleaning.

Small models are easier to develop, maintain and learn. Yet if the model is too small, trading partner applications on the other end of an XML-driven transaction cannot process data properly.

The approach recommended here is to make the processes that are used to maintain the model flexible enough to enlarge or shrink the model as circumstances require. The canonical model chosen should meet about 80% of potential data type requirements.
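One way to keep an eye on the 80% guideline is to measure how many of the data elements that partners actually require are present in the model. The sketch below is an assumption about how such a check might look; the function name and field names are hypothetical.

```python
def coverage(model_fields: set[str], required_fields: set[str]) -> float:
    """Return the fraction of required data elements that the model defines."""
    if not required_fields:
        return 1.0  # nothing is required, so nothing is missing
    return len(model_fields & required_fields) / len(required_fields)


# Hypothetical canonical model and partner requirements.
model = {"order_id", "order_date", "quantity", "unit_price"}
required = {"order_id", "order_date", "quantity", "unit_price", "hazmat_code"}

print(f"Model covers {coverage(model, required):.0%} of required elements")
```

Elements that fall outside the model, such as the hypothetical hazmat_code here, are candidates for enlarging the model only if they recur across enough partners to justify the publishing effort.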

Leverage, leverage, leverage
Combined with an integration infrastructure and supported by methodology, processes and technology, canonical modeling can enable technical efforts to be leveraged across departments and business partners. By overcoming many of the limitations of traditional point-to-point integration, canonical modeling can significantly improve ROI and agility.

At the core of a canonical model for information architecture is the identification and description of an enterprise's data resources. The descriptions must include syntax and semantics. Syntax involves establishing a preferred representation for each significant data element. Semantics involves deciding on definitions for significant data elements. These decisions are based on subjective evaluation of the application functionality, business requirements and data sources.
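The pairing of syntax and semantics can be pictured as a small record per data element. The following is a minimal sketch, assuming a regular expression is an acceptable way to state the preferred representation; the class, the element name and its definition are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalElement:
    name: str
    definition: str  # semantics: the agreed meaning of the element
    pattern: str     # syntax: regular expression for the preferred representation

    def is_valid(self, value: str) -> bool:
        """Check that a value uses the preferred representation."""
        return re.fullmatch(self.pattern, value) is not None


# Hypothetical element: an order date written as YYYY-MM-DD.
ORDER_DATE = CanonicalElement(
    name="order_date",
    definition="Calendar date on which the buyer issued the purchase order.",
    pattern=r"\d{4}-\d{2}-\d{2}",
)

print(ORDER_DATE.is_valid("2001-07-01"))
print(ORDER_DATE.is_valid("01-Jul-01"))
```

Note that the pattern checks only the shape of the value. Whether the value means what the definition says remains a matter of agreement between the parties, which is why the semantic description is stored alongside it.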

In addition to these basic definitions, fully developed canonical models can include identifying systems of record and reference, and defining the processes to create, access, manage, validate and delete data resources.

Drilling down: Modeling for mapping
To illustrate the benefits of canonical modeling, this article examines a persistent, yet underrated, component of integration: mapping. Mapping is the process of defining a set of relationships between the data in one business document and the data in another. These relationships are expressed in the rules that are to be applied as the source document is transformed into the target document format. Rules can include how to change the syntax or how to validate or modify the contents. For example, one date format, say 01-Jul-01, may be mapped to a different format, say 07-01-2001.
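The date rule mentioned above can be written as a one-step transformation. This sketch assumes that the target format is month-day-year, that month abbreviations are in English, and that two-digit years follow the usual pivot in Python's standard library (00 to 68 are read as 2000 to 2068). The function name is hypothetical.

```python
from datetime import datetime


def map_date(value: str) -> str:
    """Map a date such as 01-Jul-01 to the form 07-01-2001."""
    parsed = datetime.strptime(value, "%d-%b-%y")  # day, abbreviated month, 2-digit year
    return parsed.strftime("%m-%d-%Y")             # month, day, 4-digit year


print(map_date("01-Jul-01"))  # prints 07-01-2001
```

A validation rule comes along for free in this form: a value that does not match the source format raises a ValueError and never reaches the target document.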

By most industry estimates, mapping constitutes 60% of the integration effort and is the main impediment to broad deployment of XML in trading partner networks. To date, the mapping process—that of linking business content—has been very slow, sometimes requiring a minimum of 40 hours per link. When there are thousands of links per hub or trading partner, a huge trading partner integration roadblock must be overcome. Moreover, each link must be separately analyzed by a business analyst, then coded by a programmer for use in a middleware architecture.

Traditionally, a distinct map is made for each information source and target pair. The process is then repeated as new integration requirements are identified; for example, a new trading partner is added or information from another application is required. Each map is specific both to the information requirements of the process and the technologies being integrated.

Traditional manual mapping appears to work for point-to-point integration. However, limitations become apparent over time and as more elements of an enterprise's infrastructure are added. Typically, later projects gain little leverage from earlier ones. The specific detail required in each map limits one's ability to transfer code or knowledge from the earlier mapping projects.

In addition to the difficulty of transferring detail between maps, new mappings can introduce new errors and inefficiencies, even for data fields that have not changed. Newly created maps are often unnecessarily different because the individuals who created them took different approaches. These differences include the relationships between data fields and the software code that implements the transformation. Upgrades and revisions to applications and trading partner standards require several maps to be updated.

Adding to the complexity of this process is the necessity to maintain maps over time as specifications change, the inability to transfer disparate technical and business skills and knowledge, and the sheer number of systems to include for integrated IT solutions. With manual mapping methods, these complexities combine to make mapping one of the most time-consuming and error-prone tasks in any integration project.

EIM reduces the manual effort needed to build links between XML trading partners, introduces reusability and reduces programming errors and inefficiencies. By drawing on the canonical model to analyze the relationships between business documents, the modeling approach can then automatically create maps to configure e-business software systems. Valuable knowledge about new business documents is retained in an evolving knowledge base that is made available to the constituents of the trading community. With the knowledge base, the work of building links does not have to be duplicated, further improving efficiency. By automating this linking process, organizations can dramatically reduce their time-to-market for new B2B trading partnerships. Moreover, a canonical model helps to standardize the methods of exchanging and collaborating on business documents with trading partners.

As the representation of a canonical model, an auto-mapping thesaurus is used to create a dictionary of business terms (semantics), rules, concepts and maps. The software looks for matches between the source and target message formats and applies the appropriate rules when matches are found. A well-designed thesaurus is self-learning and constantly growing with each deployment.
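A toy version of thesaurus-based matching is sketched below. It is an illustration under stated assumptions, not a description of any product: the thesaurus entries, field names and function names are all hypothetical, and matching is done by simple normalized string lookup.

```python
def normalize(name: str) -> str:
    """Lower-case a field name and drop punctuation and spaces."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_index(thesaurus: dict) -> dict:
    """Index every known term (and the concept name itself) by its concept."""
    index = {}
    for concept, terms in thesaurus.items():
        index[normalize(concept)] = concept
        for term in terms:
            index[normalize(term)] = concept
    return index


def auto_map(source_fields, target_fields, thesaurus):
    """Pair source and target fields that the thesaurus ties to the same concept."""
    index = build_index(thesaurus)
    targets = {index[normalize(t)]: t for t in target_fields if normalize(t) in index}
    mapping, unmatched = {}, []
    for field in source_fields:
        concept = index.get(normalize(field))
        if concept in targets:
            mapping[field] = targets[concept]
        else:
            unmatched.append(field)  # left for an analyst to resolve
    return mapping, unmatched


# Hypothetical thesaurus: concept -> terms seen in earlier deployments.
thesaurus = {
    "purchase_order_number": {"PO_Num", "PurchaseOrderIdentifier"},
    "ship_to_address": {"ShipTo", "DeliveryAddress"},
}
source = ["PO_Num", "ShipTo", "Buyer Ref"]
target = ["PurchaseOrderIdentifier", "DeliveryAddress", "CustomerReference"]

mapping, unmatched = auto_map(source, target, thesaurus)
print(mapping, unmatched)

# "Learning": once an analyst confirms a new pairing, record it for reuse.
thesaurus["buyer_reference"] = {"Buyer Ref", "CustomerReference"}
learned_mapping, still_unmatched = auto_map(source, target, thesaurus)
```

In this sketch the growth of the thesaurus is a manual step after an analyst confirms a match. How a production system captures and validates new entries is outside what the example shows.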

For example, an EDI programmer with expertise in the ANSI X12 standard may need to connect a purchase order between the programmer's own company and an SME partner that uses the RosettaNet XML standard. In ANSI X12, a purchase order is simply known as an "850." It has a standard structure, set of contents and method of communicating with other EDI systems. Even if this programmer logged onto the RosettaNet Web site, the programmer might not know that the equivalent of an ANSI 850 is called a PIP3A4. Further, it is extremely unlikely that the programmer would know how to correlate X12's concise header-level information to the PIP3A4's verbose content. With pre-built maps that address this specific correlation and hundreds of others, a thesaurus can save significant amounts of time. In addition to accelerating the deployment process, this domain expertise is preserved in an auto-mapping thesaurus, so it can be shared with other partners, saving time and effort.
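At the document level, the equivalence described above amounts to a lookup keyed by business concept. The sketch below records only the one correspondence named in this article (ANSI X12 850 and RosettaNet PIP3A4); the table layout and function name are hypothetical.

```python
from typing import Optional

# One business concept, and what each standard calls the matching document.
EQUIVALENTS = {
    "purchase_order": {"ANSI X12": "850", "RosettaNet": "PIP3A4"},
}


def equivalent(standard: str, code: str, target_standard: str) -> Optional[str]:
    """Find the target standard's name for the same business document."""
    for identifiers in EQUIVALENTS.values():
        if identifiers.get(standard) == code:
            return identifiers.get(target_standard)
    return None  # no known equivalent recorded


print(equivalent("ANSI X12", "850", "RosettaNet"))
```

Field-level correlation between the two documents would sit beneath each entry in a fuller thesaurus; it is omitted here because those details depend on the specific versions of the standards in use.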

There are several benefits of EIM for mapping business documents.

  • Speed—By using pre-built models of documents and capitalizing on a thesaurus of established links, deployments occur much faster and require fewer staff members.
  • Reusability—Each link established has value in saving time for future connections. Also, domain expertise is spread automatically across the company and between trading partners for each deployment, which is especially important for highly specialized areas such as EDI standards knowledge.
  • Cost reduction—Companies pay less to consultants, if consultants are used for linking, and devote fewer person-hours of staff time to linking per partner, which decreases the cost of establishing each relationship. In addition, ongoing maintenance costs drop as additional documents used in trading relationships are easier to integrate.
  • Error reduction—By using maps that are proven to work, companies can reduce the number of mistakes made by manually programming each link, further accelerating deployment.
  • Sharable data—Trading partners benefit from each other's EDI-to-XML expertise, accelerating their time to market and ability to capitalize on new business opportunities.
  • Process integration—The ease of integrating and improved understanding of the nature of the data available between trading partners improves process integration.
  • Project scheduling independence—By increasing the reusability of code and diversifying domain expertise, departments can reduce the amount of inter-team dependencies, increasing their independence to complete projects on their own timelines.

Moving ahead
Even with the extended timelines that industry analysts project for the EDI-to-XML transformation, the issue of how to address the complexities of the problem remains, and it will continue to grow in importance. To successfully overcome the challenges of modeling and mapping, companies must consistently move toward a shared, common and reusable view of the business information without excluding the particular information requirements of individual processes and applications.

EIM provides companies with a conceptual approach to successfully implement coordinated, enterprise-wide EDI-to-XML or XML-to-XML integration projects. The successful implementation of the modeling/auto-mapping approach helps companies to quickly pilot, build out and test integration projects; keep pace with business integration demands; improve business analytics with more consistent data; and improve the deployment of key technical human resources.

In short, EIM improves a company's ability to focus investment at the business level—where competitive differentiation resides. Well-managed organizations will use this approach as a management practice, not just an IT program, ensuring sustainable trading partner relationships for years to come.