The Next Generation of Business Intelligence

Decision processing systems, and their underlying analytic applications, provide business users with the information they need to track and analyze business trends, and to explore new business opportunities. As businesses become increasingly competitive and complex, effective decision processing systems are essential for success.

A decision processing system analyzes business information captured from operational systems (back- and front-office, and e-business applications) and distributes it to corporate decision-makers. As data flows from operational files and databases to a decision processing system -- and is distributed to business users via corporate intranets and extranets -- its information content improves in quality, accuracy and business value. This flow of data can be thought of as an information supply chain whose objective is to convert raw operational data into useful business information that helps managers and executives make informed business decisions. Such decisions usually result in changes to operational systems in areas like product pricing, channel marketing, sales quotas and so forth. The positive or negative effect of these changes can then be measured by business intelligence tools and analytic applications to form a closed-loop decision-making environment (see Fig. 1).

Closing the loop between decision processing and operational processing is, at present, primarily a manual process; it often involves exchanging information via collaborative processing in the form of E-mails, presentations, office documents and memos. Tracking this information, as well as the business decisions arising from "closing the loop," allows managers and executives to discover why a particular decision was made and its impact. In order to gather this data, however, it is essential to integrate collaborative processing into the decision-making environment.

E-business applications are also becoming an important data source for a data warehouse. But feedback from a decision processing environment to an e-business application may need to occur more rapidly (perhaps even in real time) than feedback to back- or front-office applications. This is because analytic results from the decision processing environment may be used to control the interaction (the sequence of Web pages displayed) between the e-business application and the user.

However, this need can only be satisfied if existing manual feedback mechanisms are automated. This is easier to achieve if the same vendor supplies both the decision processing and e-business products. If this is not possible, an open framework is required to allow products from different vendors to communicate. As we shall see later, an enterprise information portal plays an important role in this stage by managing the information supply chain and automating a closed-loop decision-making environment.

Four tasks to master

Decision processing involves four distinct, but related, tasks: extracting and transforming information, managing information, analyzing and modeling information, and distributing information.

Extracting and transforming information -- this involves capturing data from operational systems, transforming it into business information, and loading it into a data warehouse information store. Many organizations employ data warehouse extraction, transformation and loading (ETL) tools to perform this task. They are finding, however, that ETL tools cannot always deal with the quality problems that occur in source data. ETL tools are ideal for automating tasks like restructuring and decoding input fields, merging data from multiple files and building data aggregations; when significant quality problems exist in source data, data profiling and reengineering tools should first be used to analyze and identify problem areas and to clean the data prior to ETL tool processing. Product examples include Evoke Migration Architect, Trillium Software System and Vality Integrity.
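The ETL tasks named above (decoding input fields, merging data from multiple files and building aggregations) can be illustrated with a minimal Python sketch. It does not represent any of the products mentioned; the code table, field names and sample records are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical code table used to decode a region field.
REGION_CODES = {"01": "East", "02": "West"}

def decode(record):
    """Restructure and decode one raw input record."""
    return {
        "region": REGION_CODES.get(record["region_cd"], "Unknown"),
        "product": record["product"].strip().upper(),
        "amount": float(record["amount"]),
    }

def merge(*sources):
    """Merge records from several source extracts into one stream."""
    for source in sources:
        yield from source

def aggregate(records):
    """Build a simple aggregation: total amount per (region, product)."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["region"], r["product"])] += r["amount"]
    return dict(totals)

# Two hypothetical source extracts with inconsistent formatting.
orders_file = [{"region_cd": "01", "product": " widget ", "amount": "10.50"}]
web_file = [
    {"region_cd": "01", "product": "Widget", "amount": "4.50"},
    {"region_cd": "02", "product": "gadget", "amount": "7.00"},
]

totals = aggregate(decode(r) for r in merge(orders_file, web_file))
```

Note that the sketch assumes reasonably clean input. A missing or malformed amount would raise an error, which is the kind of source-data quality problem the profiling and reengineering step is meant to catch before ETL processing.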

The direction of vendors is toward providing customizable data warehouse templates that contain a starter kit for constructing a business area-specific data warehouse (usually called a data mart). These templates consist of three components: a business-area template, an extract template and a transformation template. Not all products contain all three components, however.

The business-area template documents the business metrics and underlying business rules that are typically required by users when analyzing and modeling a specific business process. In addition, the business-area template often comes with a data model that provides a customizable data warehouse design.

The extract template provides easy-to-use interfaces for capturing business data from operational applications. Current extract templates on the market are aimed primarily at capturing data from ERP transaction processing systems. Some of these products support ERP-centric data warehouses (for example, Informatica PowerCenter for SAP BW, SAP Business Information Warehouse and PeopleSoft BPM Data Warehouse), while others provide ERP-independent solutions (for example, Acta ActaWorks and RapidMarts, and Informatica PowerConnect).

A transformation template defines the processing required to transform data captured using the extract template into a form for loading into a data warehouse built to a design outlined by the business-area template.

Managing information -- this task encompasses the maintenance of business information in information stores, and how these information stores are accessed by business intelligence tools and analytic applications. The cornerstone of decision processing is data warehousing, and warehouse information stores should be organized in a federated data warehouse topology (see "The federated data warehouse" below for more details). In such a topology, data captured from operational applications is maintained in information stores managed by relational and/or multidimensional database products.

Analyzing and modeling information -- the traditional approach to decision processing is to build a data warehouse and then supply business users with a set of business intelligence tools (query, reporting, OLAP and data mining products, for example) to process information in data warehouse information stores. This approach may be acceptable for query and reporting, or for experienced users, but it does not work for business managers who need detailed analyses, or who do not have the time and experience to master complex business intelligence tools. Even when business users feel comfortable with a particular tool, they often find navigating a data warehouse to be a difficult and time-consuming task.

Canned queries and reports, as well as vendor-supplied reporting and analysis templates, can help reduce the learning curve associated with a business intelligence tool; however, this approach is only a partial solution for business users who need to do in-depth analyses involving drill-down queries and to model different business scenarios. A better approach is to employ turn-key and Web-based analytic application packages that are designed to provide comprehensive analyses for the business area being researched, and that offer a familiar and simple Web interface for the business user.

In many organizations, there are a handful of key business metrics (for example, revenue dollars per sales rep per day) that are employed by business managers to monitor the health of the company or to determine the success/failure of sales campaigns, new product introductions and so forth. The business rules behind these metrics are often very complex, and it is sometimes difficult to configure business intelligence tools to provide these metrics. One benefit of analytic application packages is that they can ease the burden of building applications to create and maintain these metrics. A key requirement of analytic applications is that they should store metrics, and the business rules behind those metrics, in an open repository so that they can be customized to suit an organization's requirements.
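One way to picture an open repository of metrics and their business rules is the Python sketch below. It is an illustration only: the repository layout, the `evaluate` helper and the input figures are assumptions, and the rule shown is a deliberately simple reading of "revenue dollars per sales rep per day."

```python
# Hypothetical open repository: each metric is stored with the rule that computes it.
METRIC_REPOSITORY = {
    "revenue_per_rep_per_day": {
        "description": "Revenue dollars per sales rep per selling day",
        "rule": lambda f: f["revenue"] / (f["sales_reps"] * f["selling_days"]),
    },
}

def evaluate(metric_name, facts, repository=METRIC_REPOSITORY):
    """Look up a metric's rule in the repository and apply it to the facts."""
    return repository[metric_name]["rule"](facts)

facts = {"revenue": 120000.0, "sales_reps": 10, "selling_days": 20, "returns": 20000.0}
standard_value = evaluate("revenue_per_rep_per_day", facts)

# Because the rule lives in the repository, an organization can customize it,
# here by netting out returns, without changing the code that uses the metric.
custom_repository = dict(METRIC_REPOSITORY)
custom_repository["revenue_per_rep_per_day"] = {
    "description": "Net revenue dollars per sales rep per selling day",
    "rule": lambda f: (f["revenue"] - f["returns"]) / (f["sales_reps"] * f["selling_days"]),
}
custom_value = evaluate("revenue_per_rep_per_day", facts, custom_repository)
```

With the sample figures, the standard rule yields 600.0 and the customized rule yields 500.0.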

Distributing information -- business intelligence tools and analytic applications distribute information and the results of analysis operations to business users via standard graphical and Web interfaces. Many of these products also support the schedule- and event-driven delivery of information and analyses to Web servers and E-mail systems. As the amount of business information managed by a decision processing system increases, it is likely that this information will be distributed across a range of information stores on different servers. This will be especially true as organizations begin to capture business knowledge from external sources, internal collaborative and office systems, and Web servers. To help users uncover and organize this range of business information, an enterprise information portal is required.

An enterprise information portal (EIP) provides a single point of entry to any piece of business information, no matter where it resides. It provides access to all of the information flowing from operational applications to decision and collaborative processing systems. Information viewed through an EIP is customized (personalized) to match the user's role in the organization -- users see only the information they are interested in or are authorized to access. Executives can be quickly notified about items requiring urgent action, while business analysts can drill down through multiple levels of information when doing detailed analysis tasks like financial analysis or supply-chain optimization.

The main components of an EIP are the information assistant, a business information directory and a subscription facility. The information assistant provides a customizable Web browser interface that works in conjunction with a navigation and delivery engine to process user requests for business information. The business information directory is a Web server-based index of an organization's business information. The index is maintained via an interactive publishing facility that uses automated information scanners to regularly scan selected servers for new business information, and by import/export interfaces that let external applications maintain directory information via flat files or a programmatic interface. The subscription facility allows users to have business information distributed to them on a regular basis (immediately, at a certain time and date, at user-defined intervals or when certain business rules are satisfied).
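The four delivery triggers of the subscription facility (immediately, at a certain time and date, at user-defined intervals, or when business rules are satisfied) can be sketched as a small decision function. This is a hypothetical model, not the interface of any portal product; the `Subscription` class, its field names and the `is_due` function are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

@dataclass
class Subscription:
    user: str
    item: str
    mode: str  # "immediate", "at", "interval" or "rule"
    at: Optional[datetime] = None                   # used by mode "at"
    every: Optional[timedelta] = None               # used by mode "interval"
    rule: Optional[Callable[[dict], bool]] = None   # used by mode "rule"

def is_due(sub, now, last_sent=None, facts=None):
    """Decide whether a subscribed item should be delivered now."""
    if sub.mode == "immediate":
        return last_sent is None
    if sub.mode == "at":
        return last_sent is None and now >= sub.at
    if sub.mode == "interval":
        return last_sent is None or now - last_sent >= sub.every
    if sub.mode == "rule":
        return sub.rule(facts or {})
    raise ValueError(f"unknown delivery mode: {sub.mode}")

# Hypothetical subscriptions for an analyst and an executive.
weekly = Subscription("analyst", "sales report", "interval", every=timedelta(days=7))
alert = Subscription("executive", "inventory alert", "rule",
                     rule=lambda facts: facts.get("stock", 0) < 100)

now = datetime(2000, 1, 10, 9, 0)
weekly_due = is_due(weekly, now, last_sent=now - timedelta(days=8))
alert_due = is_due(alert, now, facts={"stock": 40})
```

In this example both subscriptions are due: eight days have passed since the weekly report was last sent, and the stock level has fallen below the alert threshold.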

In the future, we are likely to see a business rules directory added to an EIP, which will help automate the feedback loop between the decision processing environment and operational systems. The subscription facility will route feedback messages to corporate business users, and transaction and e-business applications, based on how the results of information analyses satisfy rules defined to the business rules directory.

In addition to independent portals like the Plumtree Corporate Portal and VIT SeeChain Portal, enterprise information portals are being integrated into business intelligence tools (for example, Brio ReportMart, Sterling MyEureka, Viador E-Portal Suite) and packaged applications (for example, Onyx Enterprise Portal, VIT SeeChain Supply Chain Performance Measurement Applications).

Integrating the pieces

Given the diversity of corporate decision processing requirements, it is unlikely that one vendor will be able to provide an integrated application suite that can support all of an organization's requirements. Companies will therefore have to employ multiple products -- but it is essential that these products can be integrated to form a cohesive decision processing environment. DataBase Associates' decision processing blueprint documents a set of interfaces and tools for creating an integrated decision processing system infrastructure. This enables products from multiple decision processing vendors to interoperate and exchange business information and associated meta data.

Colin White, founder of DataBase Associates International Inc., is a leading information technology consultant. He specializes in data warehousing, business intelligence tools and analytic applications, enterprise information portals and database systems. He can be reached at [email protected].

The federated data warehouse

Most organizations building a data warehousing system use either a top-down or bottom-up development approach. In the top-down approach, an enterprise data warehouse (EDW) is built in an iterative manner -- business area by business area -- and underlying dependent data marts are created as required from the EDW contents. In the bottom-up approach, independent data marts are created so that, at some time in the future, they can be integrated into an enterprise data warehouse. While there is much industry debate about the pros and cons of each approach, there is a steady trend toward the use of independent data marts. The move toward turn-key analytic application packages will accelerate this trend.

What organizations require is a solution that offers the low cost and rapid ROI advantages of the independent data mart approach, without the problems of data integration in the future. To achieve this, the design and development of independent data marts must be managed and based on a common shared business model of an organization's decision processing requirements. In the decision processing blueprint introduced in this article, this hybrid solution is called a federated data warehouse. Two key components of a federated data warehouse are the common business model and shared information staging areas.

Common business model

The federated data warehouse approach supports the iterative development of a data warehousing system containing independent data marts. The key to data integration in a federated data warehouse is a common business model of the business information stored and managed by the warehousing system. This ensures consistency in data names and business definitions across warehousing projects.

The common business model is updated as new data marts are built. When the data mart design is dictated by the data in operational systems, the common business model is used -- and updated as appropriate -- in parallel with the development of underlying data mart data models. In an environment driven directly from business-user decision processing requirements, the common business model is developed first and then used to create one or more underlying data mart data models. In this latter case, the use of customizable business area and data model templates can significantly reduce the effort involved in developing the common business model and its associated data models. Products like Appsco's AppsMart and Decisionism's Aclue also provide a business model-driven approach to data mart development. If packaged analytic applications are used, it is important that the business models used by these applications can be customized and integrated with the organization's common business model.
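As a rough illustration of how a common business model can enforce consistent data names, the Python sketch below checks the columns of a proposed data mart against a shared dictionary of names and definitions. The model contents, column names and `undefined_terms` helper are hypothetical.

```python
# Hypothetical common business model: shared data names and business definitions.
COMMON_MODEL = {
    "customer_id": "Unique identifier assigned to a customer",
    "net_revenue": "Invoiced revenue less returns and discounts",
    "order_date": "Date the order was accepted",
}

def undefined_terms(mart_columns, model=COMMON_MODEL):
    """Return the data mart columns that the common model does not define."""
    return sorted(set(mart_columns) - set(model))

# Two hypothetical data mart designs.
sales_mart = ["customer_id", "net_revenue", "order_date"]
marketing_mart = ["customer_id", "cust_revenue"]

sales_gaps = undefined_terms(sales_mart)
marketing_gaps = undefined_terms(marketing_mart)
```

Here the sales mart conforms to the model, while the marketing mart introduces `cust_revenue`, a name the model does not define. Under the approach described above, that term would either be mapped to an existing definition or added to the common business model.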

Shared information staging areas

When a new independent data mart is built, developers typically create a new suite of data extraction and transformation applications that are rarely integrated with the applications used to build other independent data marts. The net result is that as the use of independent data marts increases, so does the number of extraction and transformation routines. Maintaining these routines is resource intensive, and coordinating their execution at runtime is an operational nightmare.

The solution to this problem is to break the processing into multiple steps. This involves developing a set of routines that extract and clean (using data profiling and data reengineering tools, for example) the source data and load it into shared information staging areas. The staging areas are then used to feed data into independent data marts. As data flows out of the staging areas, it is enhanced and mapped by ETL tools into the format required by the target warehouse information store. As new data marts are added, existing extraction routines and staging area data can be re-used or enhanced as required. This technique works very well in a federated data warehouse environment where the common business model can be used in the design of the staging areas and data extraction routines.
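The multistep flow described above can be sketched in Python as follows: one shared routine extracts and cleans the source data into a staging area, and each data mart is then fed from that staging area rather than from the source. The function names, cleaning rules and sample rows are assumptions for illustration, and real profiling and ETL tools would do far more.

```python
def extract_and_clean(source_rows):
    """Extract once: skip rows with no customer and normalize the fields."""
    staged = []
    for row in source_rows:
        if not row.get("customer"):
            continue  # a real profiling tool would report and repair such rows
        staged.append({
            "customer": row["customer"].strip().title(),
            "region": row["region"].strip().upper(),
            "amount": round(float(row["amount"]), 2),
        })
    return staged

def feed_sales_mart(staging):
    """Map staged data into a sales mart: total amount per region."""
    totals = {}
    for row in staging:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

def feed_customer_mart(staging):
    """Map the same staged data into a customer mart: distinct customers."""
    return sorted({row["customer"] for row in staging})

# Hypothetical operational source data with quality problems.
source = [
    {"customer": " acme corp ", "region": "east", "amount": "100.00"},
    {"customer": "", "region": "west", "amount": "55.00"},
    {"customer": "Bolt Ltd", "region": "west", "amount": "40.25"},
]

staging_area = extract_and_clean(source)          # shared staging area
sales_by_region = feed_sales_mart(staging_area)   # first data mart
customers = feed_customer_mart(staging_area)      # second data mart reuses the staging data
```

Adding a third data mart in this sketch means writing only another feed function. The extraction and cleaning routine, and the staged data it produces, are reused as they stand.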