ETL beats at the heart of BI

Extract, transform and load (ETL) is the heart and soul of business intelligence (BI). ETL processes bring together and combine data from multiple, disparate source systems, enabling all users to work off a single, integrated set of data -- a single version of the truth. The result is that an organization no longer spins its wheels collecting data or arguing about whose data is correct; instead, it uses information as a key process enabler and competitive weapon.

In these organizations, BI systems are no longer nice to have, but are essential to success. These systems are no longer standalone and separate from operational processing -- they are integrated with overall business processes. As a result, an effective BI environment based on integrated data enables users to make strategic, tactical and operational decisions that drive the business on a daily basis.

Why ETL is hard
According to most practitioners, ETL design work consumes between 60% and 80% of an entire BI project. With such an inordinate amount of resources tied up in ETL work, it behooves BI teams to optimize this layer of their BI environment.

ETL is so time-consuming because it has the unenviable task of re-integrating the enterprise's data from scratch. Over the span of many years, organizations have allowed their business processes to dis-integrate into dozens or hundreds of local processes, each managed by a single fiefdom (such as a department, business unit or division) with its own systems and data, as well as its own view of the world.

With the goal of achieving a single version of the truth, business executives are appointing BI teams to re-integrate what has taken years or decades to come apart. Equipped with ETL and modeling tools, BI teams are now expected to swoop in like conquering heroes and rescue the organization from information chaos. Obviously, the challenges and risks are daunting.

Although ETL tools will never transform a database administrator into General Patton, the tools are perhaps the most critical instruments in a BI team's toolbox. Whether built or bought, a good ETL tool in the hands of an experienced ETL designer can speed development, minimize the impact of systems changes and new user requirements, and mitigate project risk. On the other hand, a weak ETL tool in the hands of an untrained developer can wreak havoc on BI project schedules and budgets.

ETL in flux
Given the demands placed on ETL, and the more prominent role BI is playing in corporate boardrooms, it is no wonder that this technology is now in a state of flux.

More complete solutions. Organizations are now pushing ETL vendors to deliver more complete BI "solutions." Primarily, this means handling additional back-end data management and processing responsibilities, such as providing data profiling, data cleansing and enterprise meta data management utilities. But a growing number of users also want ETL vendors to deliver a complete BI solution that spans both back-end data management functions and front-end reporting and analysis applications.

Better throughput and scalability. Users also want ETL tools to increase throughput and performance to handle exploding volumes of data and shrinking batch windows. Rather than refresh the entire data warehouse from scratch, they want ETL tools to capture and update changes that have occurred in source systems since the last load.

More sources, greater complexity and better administration. ETL tools also need to handle a wider variety of source system data, including Web, XML and packaged applications. To integrate these diverse data sets, ETL tools must also handle more complex mappings and transformations, as well as offer enhanced administration to improve reliability and speed deployments.

"Near real-time" data. Finally, ETL tools need to feed data warehouses more quickly with more up-to-date information. This is because batch processing windows are shrinking and business users want integrated data delivered on a more timely basis (such as the previous day, hour or minute) so they can make critical operational decisions without delay.
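To make the throughput requirement above concrete, here is a minimal sketch of an incremental load: rather than rebuilding the warehouse, it applies only the source rows changed since the last run. It is an illustration under stated assumptions, not a description of how any particular ETL tool works. The function name incremental_load and the fields id and updated_at are hypothetical, and the sketch assumes every source row carries a reliable last-modified timestamp. It does not detect rows deleted at the source.

```python
from datetime import datetime


def incremental_load(source_rows, warehouse, last_load):
    """Apply only rows modified after `last_load` and return the new watermark.

    Hypothetical sketch: `source_rows` is an iterable of dicts with an `id`
    key and an `updated_at` datetime; `warehouse` is a dict keyed by `id`.
    """
    new_watermark = last_load
    for row in source_rows:
        # Skip rows that have not changed since the previous load.
        if row["updated_at"] > last_load:
            # Insert the row, or overwrite the existing copy (an "upsert").
            warehouse[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark
```

The returned watermark would be stored and passed in as last_load on the next run, so each batch touches only new or changed rows. The same idea supports shorter batch windows and more frequent loads.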

Clearly, the market for ETL tools is changing and expanding. In response to user requirements, ETL vendors are transforming their products from single-purpose ETL products into multi-purpose data integration platforms. BI professionals now need to leverage the convergence of ETL with new technologies to optimize their BI architectures and ensure a healthy return on their ETL investments.

About the Author

Wayne W. Eckerson is director of education and research for The Data Warehousing Institute, where he oversees TDWI's educational curriculum, member publications, and various research and consulting services. He has published and spoken extensively on data warehousing and business intelligence subjects since 1994.