ETL beats at the heart of BI
By Wayne W. Eckerson
February 28, 2003
ETL (extract, transform and load) is the heart and soul of business intelligence (BI). ETL processes bring
together and combine data from multiple, different source systems, enabling all
users to work off a single, integrated set of data -- a single version of the
truth. The result is that an organization no longer spins its wheels collecting
data or arguing about whose data is correct; instead, it uses information as a
key process enabler and competitive weapon.
In these organizations, BI systems are no longer nice to have, but are
essential to success. These systems are no longer standalone and separate from
operational processing -- they are integrated with overall business processes.
As a result, an effective BI environment based on integrated data enables users
to make strategic, tactical and operational decisions that drive the business on
a daily basis.
Why ETL is hard
According to most practitioners, ETL design work
consumes between 60% and 80% of an entire BI project. With such an inordinate
amount of resources tied up in ETL work, it behooves BI teams to optimize this
layer of their BI environment.
ETL is so time-consuming because it has the unenviable
task of re-integrating the enterprise's data from scratch. Over the span of many
years, organizations have allowed their business processes to dis-integrate into
dozens or hundreds of local processes, each managed by a single fiefdom (such as a department,
business unit or division) with its own systems and data, as well as its own
view of the world.
With the goal of achieving a single version of the
truth, business executives are appointing BI teams to re-integrate
what has taken years or decades to come apart.
Equipped with ETL and modeling tools, BI teams are now expected to swoop in like
conquering heroes and rescue the organization from information chaos. Obviously,
the challenges and risks are daunting.
Although ETL tools will never transform a database administrator into General
Patton, the tools are perhaps the most critical instruments in a BI team's
toolbox. Whether built or bought, a good ETL tool in the hands of an experienced
ETL designer can speed development, minimize the impact of systems changes and
new user requirements, and mitigate project risk. On the other hand, a weak ETL
tool in the hands of an untrained developer can wreak havoc on BI project
schedules and budgets.
ETL in flux
Given the demands placed on ETL, and the more
prominent role BI is playing in corporate boardrooms, it is no wonder that this
technology is now in a state of flux.
More complete solutions. Organizations are now pushing ETL vendors to deliver
more complete BI "solutions." Primarily, this means handling additional back-end
data management and processing responsibilities, such as providing data
profiling, data cleansing and enterprise metadata management utilities. But a
growing number of users also want ETL vendors to deliver a complete BI solution
that spans both back-end data management functions and front-end reporting and
analysis applications.
Better throughput and scalability. Users also want ETL tools to increase
throughput and performance to handle exploding volumes of data and shrinking
batch windows. Rather than refresh the entire data warehouse from scratch, they
want ETL tools to capture and update changes that have occurred in source
systems since the last load.
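As a rough illustration of loading only what has changed since the last run, here is a minimal Python sketch. It assumes each source row carries an `updated_at` timestamp and a unique `id`, and it models the warehouse as an in-memory dictionary. The function name `incremental_load` and all field names are hypothetical, chosen for this example; they are not drawn from any particular ETL product.

```python
from datetime import datetime


def incremental_load(source_rows, warehouse, last_load):
    """Apply only rows changed after last_load.

    Returns (rows_applied, new_watermark). The watermark is the latest
    updated_at value seen, to be passed in as last_load on the next run.
    """
    applied = 0
    new_watermark = last_load
    for row in source_rows:
        if row["updated_at"] > last_load:
            # Upsert: insert a new row or overwrite the existing one.
            warehouse[row["id"]] = row
            applied += 1
            new_watermark = max(new_watermark, row["updated_at"])
    return applied, new_watermark


# Hypothetical source data: one unchanged row, two changed since Feb. 15.
source = [
    {"id": 1, "name": "Acme", "updated_at": datetime(2003, 2, 1)},
    {"id": 2, "name": "Globex", "updated_at": datetime(2003, 2, 20)},
    {"id": 3, "name": "Initech", "updated_at": datetime(2003, 2, 27)},
]
warehouse = {1: {"id": 1, "name": "Acme", "updated_at": datetime(2003, 2, 1)}}

applied, watermark = incremental_load(source, warehouse, datetime(2003, 2, 15))
```

A timestamp comparison like this is only one way to detect changes, and it depends on source systems maintaining a reliable modification time. It also does not notice rows deleted at the source, which is one reason some tools read database logs instead.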
More sources, greater complexity and better administration. ETL tools also
need to handle a wider variety of source system data, including Web, XML and
packaged applications. To integrate these diverse data sets, ETL tools must also
handle more complex mappings and transformations, as well as offer enhanced
administration to improve reliability and speed deployments.
"Near real-time" data. Finally, ETL tools need to feed data warehouses more
quickly with more up-to-date information. This is because batch processing
windows are shrinking and business users want integrated data delivered on a
more timely basis (such as the previous day, hour or minute) so they can make
critical operational decisions without delay.
Clearly, the market for ETL tools is changing and expanding. In response to
user requirements, ETL vendors are transforming their products from
single-purpose ETL products into multi-purpose data integration platforms. BI
professionals now need to leverage the convergence of ETL with new technologies
to optimize their BI architectures and ensure a healthy return on their ETL
investments.
About the Author
Wayne W. Eckerson is director of education and research for The Data Warehousing Institute, where he oversees TDWI's educational curriculum, member publications, and various research and consulting services. He has published and spoken extensively on data warehousing and business intelligence subjects since 1994.