In-Depth
The Next Generation of Business Intelligence
- By Colin White
- November 1, 1999
Decision processing systems and their underlying analytic applications
provide business users with the information they need to track and analyze
business trends, and to explore new business opportunities. As businesses
become increasingly competitive and complex, effective decision processing
systems are essential for success.
A decision processing system analyzes business information captured
from operational systems (back- and front-office, and e-business applications)
and distributes it to corporate decision-makers. As data flows from operational
files and databases to a decision processing system -- and is distributed
to business users via corporate intranets and extranets -- its information
content improves in quality, accuracy and business value. This flow of
data can be thought of as an information supply chain whose objective
is to convert raw operational data into useful business information that
helps managers and executives make informed business decisions. Such decisions
usually result in changes to operational systems in areas like product
pricing, channel marketing, sales quotas and so forth. The positive or
negative effect of these changes can then be measured by business intelligence
tools and analytic applications to form a closed-loop decision-making
environment (see Fig. 1).
Closing the loop between decision processing and operational processing
is, at present, primarily a manual process; it often involves exchanging
information via collaborative processing in the form of E-mails, presentations,
office documents and memos. Tracking this information, as well as the
business decisions arising from "closing the loop," allows managers and
executives to discover why a particular decision was made and its impact.
In order to gather this data, however, it is essential to integrate collaborative
processing into the decision-making environment.
E-business applications are also becoming an important data source for
a data warehouse. But feedback from a decision processing environment
to an e-business application may need to occur more rapidly (perhaps even
in real time) than it does for back- or front-office applications. This is because
analytic results from the decision processing environment may be used
to control the interaction (sequence of Web pages displayed) between the
e-business application and the user.
However, this need can only be satisfied if existing manual feedback
mechanisms are automated. This is easier to achieve if the same vendor
supplies both the decision processing and e-business products. If this
is not possible, an open framework is required to allow products from
different vendors to communicate. As we shall see later, an enterprise
information portal plays an important role in this stage by managing the
information supply chain and automating a closed-loop decision-making
environment.
Four tasks to master
Decision processing involves four distinct, but related, tasks: extracting
and transforming information, managing information, analyzing and modeling
information, and distributing information.
Extracting and transforming information -- this involves capturing data
from operational systems, transforming it into business information, and
loading it into a data warehouse information store. Many organizations
employ data warehouse extraction, transformation and loading (ETL) tools
to perform this task. They are finding, however, that ETL tools cannot
always deal with the quality problems that occur in source data. ETL tools
are ideal for automating tasks like restructuring and decoding input fields,
merging data from multiple files and building data aggregations. When
significant quality problems exist in source data, data profiling and
reengineering tools should first be used to analyze and identify problem
areas and to clean the data before ETL processing begins. Product examples
include Evoke Migration Architect, Trillium Software System and Vality
Integrity.
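As a simple illustration of the kinds of tasks an ETL tool automates, the following sketch (in Python, with invented file layouts and field names, not any vendor's actual interface) decodes an input field, merges data from two source files and builds a daily aggregation before loading the result:

```python
import csv
from collections import defaultdict

# Hypothetical decode table for a coded input field (invented for illustration).
REGION_CODES = {"01": "North", "02": "South", "03": "East", "04": "West"}

def extract(path):
    """Read raw operational records from a delimited file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(orders, customers):
    """Decode fields, merge the two sources and build a daily aggregation."""
    names = {c["cust_id"]: c["cust_name"] for c in customers}     # merge key
    totals = defaultdict(float)
    for row in orders:
        region = REGION_CODES.get(row["region_code"], "Unknown")  # decode
        key = (row["order_date"], region, names.get(row["cust_id"], "?"))
        totals[key] += float(row["amount"])                       # aggregate
    return totals

def load(totals, path):
    """Write the transformed rows to a file ready for warehouse loading."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["order_date", "region", "customer", "total_amount"])
        for (day, region, cust), amount in sorted(totals.items()):
            writer.writerow([day, region, cust, round(amount, 2)])

load(transform(extract("orders.csv"), extract("customers.csv")), "daily_sales.csv")
```

In practice an ETL tool generates or manages this logic; the point is that each step -- decode, merge, aggregate -- is mechanical once the source data is clean.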
Vendors are moving toward providing customizable data warehouse templates
that act as starter kits for constructing a business area-specific
data warehouse (usually called a data mart). These templates consist of
three components: a business-area template, an extract template and a
transformation template. Not all products contain all three components,
however.
The business-area template documents the business metrics and underlying
business rules that are typically required by users when analyzing and
modeling a specific business process. In addition, the business-area template
often comes with a data model that provides a customizable data warehouse
design.
The extract template provides easy-to-use interfaces for capturing business
data from operational applications. Current extract templates on the market
are aimed primarily at capturing data from ERP transaction processing
systems. Some of these products support ERP-centric data warehouses (for
example, Informatica PowerCenter for SAP BW, SAP Business Information
Warehouse and PeopleSoft BPM Data Warehouse), while others provide ERP-independent
solutions (for example, Acta ActaWorks and RapidMarts, and
Informatica PowerConnect).
A transformation template defines the processing required to transform
data captured using the extract template into a form for loading into
a data warehouse built to a design outlined by the business-area template.
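One way to picture how the three templates divide the work is as a set of declarative definitions. The sketch below uses invented names and structures (not any vendor's actual template format) to show the division of labor:

```python
# A sketch of the three template components as plain data structures.
# All names and mappings here are invented for illustration.

business_area_template = {
    "business_area": "Sales Analysis",
    "metrics": {"revenue_per_rep_day": "total revenue / (reps x days)"},
    "data_model": ["fact_sales", "dim_rep", "dim_product", "dim_date"],
}

extract_template = {
    "source_system": "ERP order entry",        # where the data comes from
    "tables": ["ORDERS", "ORDER_LINES"],       # operational tables to capture
    "capture": "incremental by last_updated",  # extract strategy
}

transformation_template = {
    # maps extracted fields into the business-area data model
    "fact_sales.revenue": "ORDER_LINES.qty * ORDER_LINES.unit_price",
    "dim_date.key": "date(ORDERS.order_date)",
}
```

Customization then means editing these definitions rather than rewriting extract and transformation code.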
Managing information -- this task encompasses the maintenance of business
information in information stores, and how these information stores are
accessed by business intelligence tools and analytic applications. The
cornerstone of decision processing is data warehousing, and warehouse
information stores should be organized in a federated data warehouse topology
(see "The federated data warehouse," p. 64 for more details). In such
a topology, data captured from operational applications is maintained
in information stores managed by relational and/or multidimensional database
products.
Analyzing and modeling information -- the traditional approach to decision
processing is to build a data warehouse and then supply business users
with a set of business intelligence tools (query, reporting, OLAP and
data mining products, for example) to process information in data warehouse
information stores. This approach may be acceptable for query and reporting,
or for experienced users, but it does not work for business managers who
need detailed analyses, or who do not have the time and experience to
master complex business intelligence tools. Even when business users feel
comfortable with a particular tool, they often find navigating a data
warehouse to be a difficult and time-consuming task.
Canned queries and reports, as well as vendor-supplied reporting and
analysis templates, can help reduce the learning curve associated with
a business intelligence tool; however, this approach is only a partial
solution for business users who need to do in-depth analyses involving
drill-down queries and to model different business scenarios. A better
approach is to employ turn-key and Web-based analytic application packages
that are designed to provide comprehensive analyses for the business area
being researched, and that offer a familiar and simple Web interface for
the business user.
In many organizations, there are a handful of key business metrics (for
example, revenue dollars per sales rep per day) that are employed by business
managers to monitor the health of the company or to determine the success/failure
of sales campaigns, new product introductions and so forth. The business
rules behind these metrics are often very complex, and it is sometimes
difficult to configure business intelligence tools to provide these metrics.
One benefit of analytic application packages is that they can ease the
burden of building applications to create and maintain these metrics.
A key requirement of analytic applications is that they should store metrics,
and the business rules behind those metrics, in an open repository so
that they can be customized to suit an organization's requirements.
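To make this concrete, the sketch below (invented structures, not a particular vendor's repository format) stores the "revenue dollars per sales rep per day" metric and one of its business rules as data, so the rule can be customized without rewriting the calculation:

```python
from datetime import date

# A hypothetical "open repository" entry: the metric and the business rule
# behind it are stored as data, so an organization can customize the rule.
METRICS = {
    "revenue_per_rep_per_day": {
        "description": "Revenue dollars per sales rep per day",
        "exclude_returns": True,   # a customizable business rule
    }
}

def revenue_per_rep_per_day(sales, start, end):
    """Compute the metric over a date range from simple sale records."""
    rule = METRICS["revenue_per_rep_per_day"]
    rows = [s for s in sales
            if start <= s["date"] <= end
            and not (rule["exclude_returns"] and s["amount"] < 0)]
    reps = {s["rep"] for s in rows}
    days = (end - start).days + 1
    total = sum(s["amount"] for s in rows)
    return total / (len(reps) * days) if reps else 0.0

sales = [
    {"rep": "Jones", "date": date(1999, 11, 1), "amount": 1200.0},
    {"rep": "Smith", "date": date(1999, 11, 1), "amount": -150.0},  # a return
    {"rep": "Smith", "date": date(1999, 11, 2), "amount": 900.0},
]
print(revenue_per_rep_per_day(sales, date(1999, 11, 1), date(1999, 11, 2)))
# 2100.0 dollars / (2 reps x 2 days) = 525.0
```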
Distributing information -- business intelligence tools and analytic
applications distribute information and the results of analysis operations
to business users via standard graphical and Web interfaces. Many of these
products also support the schedule- and event-driven delivery of information
and analyses to Web servers and E-mail systems. As the amount of business
information managed by a decision processing system increases, it is likely
that this information will be distributed across a range of information
stores on different servers. This will be especially true as organizations
begin to capture business knowledge from external sources, internal collaborative
and office systems, and Web servers. To help users uncover and organize
this range of business information, an enterprise information portal is
required.
An enterprise information portal (EIP) provides a single point of entry
to any piece of business information, no matter where it resides. It provides
access to all of the information flowing from operational applications
to decision and collaborative processing systems. Information viewed through
an EIP is customized (personalized) to match the user's role in the organization
-- users see only the information they are interested in or are authorized
to access. Executives can be quickly notified about items requiring urgent
action, while business analysts can drill down through multiple levels
of information when doing detailed analysis tasks like financial analysis
or supply-chain optimization.
The main components of an EIP are an information assistant, a business
information directory and a subscription facility. The information assistant
provides a customizable Web browser interface that works in conjunction
with a navigation and delivery engine to process user requests for business
information. The business information directory is a Web server-based
index of an organization's business information. The index is maintained
via an interactive publishing facility that uses automated information
scanners to regularly scan selected servers for new business information,
and via import/export interfaces that let external applications maintain
directory information via flat files or a programmatic interface. The
subscription facility allows users to have business information distributed
to them on a regular basis (immediately, at a certain time and date, at
user-defined intervals or when certain business rules are satisfied).
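The sketch below (with invented interfaces) illustrates these delivery options, including a rule-driven subscription that fires when new analysis results satisfy a business rule:

```python
import sched
import time

# A sketch of a subscription facility; all interfaces here are invented.

def deliver(user, item):
    print(f"delivering {item!r} to {user}")

class Subscriptions:
    def __init__(self):
        self.scheduler = sched.scheduler(time.time, time.sleep)
        self.rules = []   # (user, item, predicate over latest metrics)

    def immediately(self, user, item):
        deliver(user, item)

    def at(self, when, user, item):
        """Deliver at a certain time and date (an absolute timestamp)."""
        self.scheduler.enterabs(when, 1, deliver, (user, item))

    def every(self, interval, user, item):
        """Deliver repeatedly at a user-defined interval, in seconds."""
        def repeat():
            deliver(user, item)
            self.scheduler.enter(interval, 1, repeat)
        self.scheduler.enter(interval, 1, repeat)

    def when(self, predicate, user, item):
        """Deliver whenever a business rule over the metrics is satisfied."""
        self.rules.append((user, item, predicate))

    def publish(self, metrics):
        """Called as new analysis results arrive; fires rule-based subscriptions."""
        for user, item, predicate in self.rules:
            if predicate(metrics):
                deliver(user, item)

subs = Subscriptions()
subs.when(lambda m: m["revenue_per_rep_per_day"] < 500, "exec01", "sales alert")
subs.publish({"revenue_per_rep_per_day": 430})  # rule satisfied -> alert sent
```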
In the future, we are likely to see a business rules directory added
to an EIP, which will help automate the feedback loop between the decision
processing environment and operational systems. The subscription facility
will route feedback messages to corporate business users, and transaction
and e-business applications, based on how the results of information analyses
satisfy rules defined in the business rules directory.
In addition to independent portals like the Plumtree Corporate Portal
and VIT SeeChain Portal, enterprise information portals are being integrated
into business intelligence tools (for example, Brio ReportMart, Sterling
MyEureka, Viador E-Portal Suite) and packaged applications (for example,
Onyx Enterprise Portal, VIT SeeChain Supply Chain Performance Measurement
Applications).
Integrating the pieces
Given the diversity of corporate decision processing requirements, it
is unlikely that one vendor will be able to provide an integrated application
suite that can support all of an organization's requirements. Companies
will therefore have to employ multiple products -- but it is essential
that these products can be integrated to form a cohesive decision processing
environment. DataBase Associates' decision processing blueprint documents
a set of interfaces and tools for creating an integrated decision processing
system infrastructure. This enables products from multiple decision processing
vendors to interoperate and exchange business information and associated
meta data.
Colin White, founder of DataBase Associates International Inc., is a
leading information technology consultant. He specializes in data warehousing,
business intelligence tools and analytic applications, enterprise information
portals and database systems. He can be reached at [email protected].
The federated data warehouse
Most organizations building a data warehousing system use either a top-down
or bottom-up development approach. In the top-down approach, an enterprise
data warehouse (EDW) is built in an iterative manner -- business area
by business area -- and underlying dependent data marts are created as
required from the EDW contents. In the bottom-up approach, independent
data marts are created so that, at some time in the future, they can be
integrated into an enterprise data warehouse. While there is much industry
debate about the pros and cons of each approach, there is a steady trend
toward the use of independent data marts. The move toward turn-key analytic
application packages will accelerate this trend.
What organizations require is a solution that offers the low cost and
rapid ROI advantages of the independent data mart approach, without the
problems of data integration in the future. To achieve this, the design
and development of independent data marts must be managed and based on
a common shared business model of an organization's decision processing
requirements. In the decision processing blueprint introduced in this
article, this hybrid solution is called a federated data warehouse. Two
key components of a federated data warehouse are the common business model
and shared information staging areas.
Common business model
The federated data warehouse approach supports the iterative development
of a data warehousing system containing independent data marts. The key
to data integration in a federated data warehouse is a common business
model of the business information stored and managed by the warehousing
system. This ensures consistency in data names and business definitions
across warehousing projects.
The common business model is updated as new data marts are built. When
the data mart design is dictated by the data in operational systems, the
common business model is used -- and updated as appropriate -- in parallel
with the development of underlying data mart data models. In an environment
driven directly from business-user decision processing requirements, the
common business model is developed first and then used to create one or
more underlying data mart data models. In this latter case, the use of
customizable business area and data model templates can significantly
reduce the effort involved in developing the common business model and
its associated data models. Products like Appsco's AppsMart and Decisionism's
Aclue also provide a business model-driven approach to data mart development.
If packaged analytic applications are used, it is important that the business
models used by these applications can be customized and integrated with
the organization's common business model.
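As a sketch of how a common business model keeps projects consistent, the example below (invented structures) records shared data names and business definitions, and flags any data mart column that lacks an agreed definition:

```python
# A sketch of a common business model used to keep data names and business
# definitions consistent across independent data marts; entries are invented.

COMMON_MODEL = {
    "customer_id": "Unique identifier assigned by the order-entry system",
    "net_revenue": "Gross revenue minus returns and discounts",
    "order_date":  "Date the order was accepted, not shipped",
}

def check_mart_schema(mart_name, columns):
    """Flag mart columns that have no common-model definition."""
    unknown = [c for c in columns if c not in COMMON_MODEL]
    for c in unknown:
        print(f"{mart_name}: column {c!r} has no common-model definition")
    return not unknown

# Each new data mart design is validated against the shared model.
check_mart_schema("sales_mart", ["customer_id", "net_revenue", "rev_amt"])
```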
Shared information staging areas
When a new independent data mart is built, developers typically create
a new suite of data extraction and transformation applications that are
rarely integrated with the applications used to build other independent
data marts. The net result is that as the use of independent data marts
increases, so does the number of extraction and transformation routines.
Maintaining these routines is resource intensive, and coordinating their
execution at runtime is an operational nightmare.
The solution to this problem is to break the processing into multiple
steps. This involves developing a set of routines that extract and clean
(using data profiling and data reengineering tools, for example) the source
data and load it into shared information staging areas. The staging areas
are then used to feed data into independent data marts. As data flows
out of the staging areas, it is enhanced and mapped by ETL tools into
the format required by the target warehouse information store. As new
data marts are added, existing extraction routines and staging area data
can be re-used or enhanced as required. This technique works very well
in a federated data warehouse environment where the common business model
can be used in the design of the staging areas and data extraction routines.
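A sketch of this staged approach (invented routines and field names) is shown below: one shared routine extracts and cleans the source data into a staging area, and each data mart then maps the staged data into its own format instead of re-extracting from the source:

```python
# A sketch of shared information staging; all routines here are invented.

def extract_and_clean(source_rows):
    """One shared routine: standardize values and discard bad records."""
    staged = []
    for row in source_rows:
        if not row.get("cust_id"):                     # quality rule: drop orphans
            continue
        row["region"] = row["region"].strip().title()  # standardize the value
        staged.append(row)
    return staged

def feed_sales_mart(staging):
    """Mart-specific mapping from the shared staging area."""
    return [{"cust": r["cust_id"], "amt": float(r["amount"])} for r in staging]

def feed_marketing_mart(staging):
    """A second mart re-uses the same staged data instead of re-extracting."""
    return [{"cust": r["cust_id"], "region": r["region"]} for r in staging]

source = [
    {"cust_id": "C1", "region": " north ", "amount": "120.50"},
    {"cust_id": "",   "region": "south",   "amount": "75.00"},  # dropped
]
staging = extract_and_clean(source)
print(feed_sales_mart(staging))
print(feed_marketing_mart(staging))
```

Adding a new data mart then means writing only a new mapping routine; the extraction and cleaning work, and the staged data itself, are re-used.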