Taming runaway spreadmarts

In 2002, I defined “spreadmart” to describe the renegade spreadsheets and desktop databases wreaking havoc on organizations. Since then, many people have adopted the term because it highlights a painful, yet largely ignored, problem that plagues organizations today.

Spreadmarts contain vital pieces of data the business needs to run. However, because individuals create spreadmarts at different times using different data sources and rules to define metrics, they create a fractured view of the enterprise. Without a single version of corporate data and centrally defined metrics, employees can’t share a common understanding of the business.

TDWI’s latest research report, In Search of a Single Version of Truth: Strategies for Consolidating Analytic Silos, reveals organizations want to consolidate on average 28.5 spreadmarts. In reality, most organizations have no idea how many spreadmarts they have, and some are afraid to count!

Controlling spreadmarts
Recognizing that you have a spreadmart problem and fixing it are two different matters. Some firms believe eradicating spreadmarts is impossible. Of course, this approach is shortsighted; unless you develop a strategy to deal with spreadmarts, they will end up choking your business like invasive weeds.

Microsoft’s Excel spreadsheet and Access database have succeeded beyond the imagination of even the most rosy-eyed Microsoft developer. They have greatly empowered business professionals, but at a tremendous cost to enterprise productivity and alignment.

The problem is that these tools make it too easy for knowledge workers to collect, integrate, define and analyze data. Although knowledge workers are paid to analyze data, they are not paid to do the other three tasks. Presumably, organizations maintain IT departments to collect and integrate data, and assign data stewards to define and manage the meaning of corporate metrics and data.

Another reason it is difficult to wean users from spreadmarts is that corporate BI environments often lack adequate functionality. If a BI tool doesn’t provide enough analytical power, users will simply dump the results of a BI query into Excel for analysis. In addition, if the information architecture doesn’t let power users readily access data from multiple sources, users will be tempted to use Excel or Access to circumvent IT to get the data they need.

How do you ensure that knowledge workers use Excel and Access to analyze data but not collect, integrate or define it? And how do you wean them from the more pernicious qualities of spreadsheets and personal databases without taking away their ability to effectively analyze data? There are three options: coercion, conversion and co-existence.

Coercion is the least-effective strategy. Unless you work for the military where high-level commands are enforced, you can’t banish Excel and Access from corporate desktops unless you want an angry mob on your hands. In fact, coercion usually makes the problem worse. Users go underground with their data, managing their departments with clandestine spreadmarts that run parallel to official systems.

One symptom of this problem is poor query performance. Secretive spreadmart users will simply use a BI product as an ETL tool to create a data set they want to analyze in Excel. Often, they submit runaway queries that bog down query performance for everyone. This forces IT administrators to constrain data access or limit the size of result sets, causing power users to take more drastic measures to circumvent IT and its information architecture.

Conversion is a better strategy. Most organizations have a commercial BI tool. The conversion strategy involves persuading workers to use the standard BI tool to access data and perform standard analytical tasks rather than Excel or Access. This can be a hard sell. First, you need to make sure the BI environment provides at least 150% of the value of the spreadmarts. Then you need to relentlessly sell the benefits of the BI environment. Here are some tips on selling your high-value BI environment:

1. It saves time. Tell spreadmart users the BI tool runs on top of a data warehouse that collects and integrates data across more subject areas and data sources than they could possibly access and integrate on their own using Excel or Access. Calculate the amount of time and money they will save by converting to the standard BI environment.

2. It validates data quality. Show spreadmart users how the data in the warehouse is cleaner and more accurate than the data in their spreadsheets.

3. It can provide deeper insights. Show them how analyzing data across many subject areas can lead to deeper insights.

4. It has comparable functions. Show them how the BI tool provides the same analytical functions and features as their spreadmarts.

5. It has additional functions. Show them the additional functions the BI tool supports, such as the ability to schedule, share or annotate reports; and publish reports to a portal.

6. It provides better support. Remind them IT will support them if they run into problems with the tool or the data, but they won’t get any support for their spreadmarts.

With a robust BI environment, a vigilant sales campaign and top management support, these holdouts will eventually convert to the new environment.

The best strategy to eliminate spreadmarts is to co-opt them. You do this by making Excel a true client to your BI server. Here, workers use Excel to access data and reports governed by a central server administered by IT. The client/server interface is hardwired into users’ desktops using a Web service or Excel plug-in. Users get all the analytical features of a spreadsheet, but the organization gets to manage access to data and reports in a controlled manner. This co-existence strategy is known as a “managed spreadsheet” approach, and it is an extension of the conversion strategy.

The co-existence strategy gives organizations two levels of control over the way users access data. In a tightly controlled, managed spreadsheet environment, the IT department can create and run reports on the server. The Excel client can access and interact with these pre-defined reports but cannot access data directly. Excel users can view and interact with any report on the BI server, whether or not it was created or modified to work in Excel.

In a more open environment, the IT department sets up a semantic layer consisting of information objects that represent data elements and measures in one or more source systems. Users drag and drop these objects into an Excel query builder to create a custom query and report. Because IT defines the objects, it ensures all metrics and data elements are defined, calculated and accessed in a consistent fashion.

Ideally, the semantic layer possesses a distributed query and join capability that lets users create a report or table that blends data from multiple systems. This lets them, for example, combine historical data from a data warehouse and real-time data from an operational system. Only the most recalcitrant users will want, or need, to bypass this layer to obtain data on their own.

Many leading BI vendors have greatly enhanced their Excel integration capabilities. They have transformed these modules from static import/export mechanisms to true front ends to analytical servers. Most important, Excel users can access data in BI reports as if the reports were created in Excel.

Endorsement is mandatory
Although we’ve defined three strategies for taming runaway spreadmarts, none is effective unless top management says it will no longer tolerate spreadmarts.

One way for executives to do this is to stand behind a corporate BI standard -- use the BI tool or its output to run the business. It is also important that executives require each business unit or division to hash out definitions and rules for common metrics and terms used in the BI environment. These actions serve as a catalyst to change individual attitudes and habits that free the organization of its relentless addiction to spreadmarts.

Whatever strategies you use to deliver a single version of the truth, the key is patience. Analytic habits don’t change overnight. With a heavy dose of patience, strong communications skills and a robust data warehousing environment, you should be able to tame the spreadsheet jockeys or, better yet, convert them into enthusiastic proselytizers of the new BI environment.

About the Author

Wayne W. Eckerson is director of education and research for The Data Warehousing Institute, where he oversees TDWI's educational curriculum, member publications, and various research and consulting services. He has published and spoken extensively on data warehousing and business intelligence subjects since 1994.