Real Time Is the Right Time

Talking Points

  • There’s an undeniable trend toward faster and better access to ever-increasing amounts of data. The trend is evolutionary, more often based on existing
    investments and familiar products.
  • Instead of extract, transform and load, data warehouses need to do something that is more like data integration, transform and load.
  • Driven by demand for real time, ETL vendors have begun to move away from the idea of microbatches to real time by exposing their ETL engines as a service.

Although the U.S. Air Force might be best known for its awe-inspiring fighter jets and precision-guided rockets, it’s also a vast global organization that depends on smooth and efficient business practices. So, in the late 1990s, when the service was faced with pressures to squeeze more performance from a massive $37 billion annual budget, the Air Force hatched plans to create a global data warehouse backed by financial management analysis and executive decision support tools called CRIS, short for the Commanders’ Resource Integration System.

Because the Air Force’s data was widely scattered among many systems, the system engineers chose a scale-out architecture based on independent modules running on top of SQL Server, according to David Reeves, marketing manager at Teksouth, which built the system.

The system had to deliver all kinds of information to a huge and varied user audience, and it had to do it as much as possible in real time. Today, in its current refined form, CRIS provides for ad hoc queries as well as flash Web views for managers who want a broad, up-to-date overview and extensive drill-down capability. Most important, CRIS fulfills the mantra of faster and better, more than doubling the number of users served between 2003 and 2004, and handling a five-fold increase in individual queries while reducing response time to seconds.

Says Reeves, “What it boils down to is that because queries can now be performed for pennies, in addition to being fast, this is also data warehousing for the masses.” The Air Force’s CRIS is part of an undeniable trend toward faster and better access to ever-increasing amounts of data. For many organizations, the benefits are obvious.

That’s because unlike some other IT trends, which have turned out to be built on shaky new technology, this one is evolutionary, more often based on existing investments and familiar products. In short, it takes money and time, but it’s not rocket science. (See related story, “On-demand data points.”)

Right time is the right time
Still, cautionary voices point out that there can be some potentially daunting complexities and problems. The goal isn’t necessarily on-demand or real-time data access but right-time access, which highlights the notion that the business cycle differs from one enterprise to the next and that data warehousing speed need only match that cycle to bring benefits. Daily data may therefore be perfectly fine for some businesses, and anything faster a waste of money.

Of course, some industries move so fast that minutes count, and they must seek out the best available technology, regardless of cost. Noel-Levitz is a good example of that. For a top player in the highly competitive business of serving higher education, faster is always better. Tim Thein, senior VP at Noel-Levitz, explains that his company uses its proprietary information and modeling expertise to give its higher-ed clients the ability to score student records on the fly, which helps them determine who among their thousands of applicants is most likely to enroll, obviously a key factor in admission decisions.

“One of the challenges in college admissions is being first in the mailbox,” Thein says. Previously, Noel-Levitz’s clients had to wait at least 72 hours before receiving a scored file of student records, on which the mailings are based. Now, using software from SAS Institute delivered through Web services, it takes only a matter of minutes to access key data, making it possible for schools to communicate more quickly with qualified student candidates.

“These records can number into the hundreds of thousands, and SAS makes it easy to not only predict potential student interest, but [also] to do so much faster and more inexpensively than previously possible,” Thein says.

Thein’s experience also illustrates the way in which evolving business demands and ever more capable technology are conspiring to rewrite the once straightforward definitions that used to describe data warehousing. On-demand is one of a number of related terms that describes a faster-better variant of yesterday’s typical data warehouse. (See related story, “Words to live by.”)

As Mark Robinson, a consultant at Greenbrier & Russel, points out: “Now instead of extract, transform and load, data warehouses need to do something that is more like data integration, transform and load.”

One version of truth
What’s driving on-demand is the trend toward harnessing BI applications and analysis tools to tactical, day-to-day decisions, says Gary O’Connell, market manager for IBM Information Integration Solutions (formerly Ascential).

“It used to be the data warehouse was an organization’s central view and often the official version of the truth,” he says. On the other hand, he adds, BI applications usually were not mission critical: if they stopped running, it didn’t have an immediate impact on the business.

Now, O’Connell says, businesses are working to get a single view of data and to make that view available for BI tools for operational as well as strategic purposes.

“That is leading to a shift in how data warehouses are designed and even in how organizations are operating,” he says.

Sense of urgency
Forrester analyst Phil Russom says although the concept may not be new, the urgency is. “People are being driven by the need to react faster to information,” he says. “It is the whole compression that business is experiencing.” Thus, in the past, data warehousing tended to be focused on historic rather than operational data. Refresh rates were low. Increasingly, however, daily refresh is the standard, and hourly, or even more frequent, refresh rates are common.

Russom says the average data warehouse is not designed to receive a real-time information push. So if a company has a data warehouse and follows the common model of orienting it toward historical information, the enterprise can’t suddenly go to real time; it must redesign the warehouse, usually by adding a whole new layer of processing functionality, he says.

Although the field has its leading players, such as Teradata, the unique aspect of on-demand is that anyone can build it using familiar technologies. Jazzed-up Oracle, DB2 and SQL Server back ends (as well as open-source alternatives) can become real-time focused with the right investments.

“A real-time or right-time data warehouse is one in which the updates to data are occurring in less time than the standard business cycle,” says Mark Beyer, an analyst at Gartner. Beyer says businesses don’t need to build anything faster than their standard business cycle, which he defines as the time it takes to complete a successful transaction that generates revenue or avoids cost.

Beyer says the differences in what constitutes real time can be seen in contrasting examples of operating cycles. Battlefield situations and emergency response activities must be analyzed within seconds. Online furniture shopping normally has a multi-day cycle while shoppers mull choices and consider options.
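Beyer’s definition reduces to a simple comparison between how often the warehouse is refreshed and how long the business cycle lasts. The sketch below is only an illustration of that comparison; the function name and the specific cycle lengths are assumptions chosen to mirror the examples above, not figures from Gartner.

```python
from datetime import timedelta


def is_right_time(refresh_interval: timedelta, business_cycle: timedelta) -> bool:
    """Return True when warehouse updates arrive in less time than one business cycle."""
    return refresh_interval < business_cycle


# Illustrative, assumed cycle lengths (not taken from the article's sources).
emergency_response_cycle = timedelta(seconds=30)
furniture_shopping_cycle = timedelta(days=3)

# A nightly refresh is "right time" for a multi-day shopping cycle...
print(is_right_time(timedelta(hours=24), furniture_shopping_cycle))
# ...but an hourly refresh is far too slow for emergency response.
print(is_right_time(timedelta(hours=1), emergency_response_cycle))
```

Under this reading, any refresh rate much faster than the cycle adds cost without adding business value.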

Doable on demand
Although there is a need for speed, the key is often extract, transform and load functions. Driven by demand for real time, Beyer says ETL vendors have begun to move away from the idea of microbatches that simply use the same basic batch processing engine. “Some are now moving toward honest real time, such as Informatica, Ascential and possibly Pervasive and Ab Initio,” he says. All are exposing their ETL engines as a service.

On-demand is entirely doable, Beyer says, but those moving to real time need to recognize that architectural issues are paramount. For instance, if an enterprise wants to analyze in real time, using up-to-the-minute source data, it can’t force source systems to keep older data, too. “If you want to be optimized and effective, you can’t say you must keep four years of data,” he says.

Beyer suggests building an infrastructure that optimizes data for accuracy according to a service-level agreement (SLA). “In the case of a database structure, I could choose to have a big table with all the old historic information, and a small table that follows the same structure but with just the newer data,” he says.
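One way to picture the two-table structure Beyer describes is sketched below with Python’s built-in sqlite3 module. The table names, columns and the combined view are hypothetical, chosen only for illustration; a production warehouse on Oracle, DB2 or SQL Server would more likely rely on its own partitioning features.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a large history table, a small recent table and a view over both."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales_history (sale_id INTEGER, amount REAL, sold_on TEXT);
        CREATE TABLE sales_recent  (sale_id INTEGER, amount REAL, sold_on TEXT);
        -- Both tables share one structure, so a view can present them as one.
        CREATE VIEW sales_all AS
            SELECT * FROM sales_history
            UNION ALL
            SELECT * FROM sales_recent;
        """
    )
    return conn


def age_out(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows dated before `cutoff` (ISO date) into history; return rows moved."""
    with conn:  # run the copy and the delete in a single transaction
        conn.execute(
            "INSERT INTO sales_history SELECT * FROM sales_recent WHERE sold_on < ?",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM sales_recent WHERE sold_on < ?", (cutoff,)
        )
    return deleted.rowcount
```

Frequent updates and tactical queries touch only the small recent table, while strategic queries read the combined view.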

By the dashboard lights
The on-demand data warehouse might be thought of primarily as an offspring of traditional data warehouses, but its characteristics are also inviting other approaches. For instance, some organizations are pursuing much the same vision using a dashboard. UNICCO Service, one of North America’s largest integrated facilities services outsourcing companies ($700 million in annual sales), recently selected Bowstreet’s Corporate Performance Suite for its executive dashboard. According to Bill Jenkins, senior director of IT, the executive dashboard will drive UNICCO’s decision making and provide customers with improved process visibility, too.

Jenkins says UNICCO is using the dashboard to access batch data, track balanced scorecard initiatives and identify trends for everyone from top executives down. “It will provide information for decision makers to help them better focus and to identify areas of opportunity,” Jenkins says.

Eric Lofstrom, business intelligence manager at Quicken Loans, also uses a dashboard for an on-demand data warehouse. Quicken Loans deployed SQL Server for a 64-bit data warehouse to build development and production data marts. It also employs SQL Server Analysis Services and Reporting Services for real-time BI.

Lofstrom says when the project started about 3 years ago, it was not intended to produce real-time results. “When we started, it was more of a traditional BI project that pulled information from the main production system,” he says. “As the project evolved, though, it gradually turned into a dashboard system.”

Quicken Loans approached data warehousing to yield an analytical point of view. “As we built, we realized the types of information that we wanted to analyze and that we also had a tactical component: we knew we wanted to analyze not only the last 18 months but also the last 18 minutes,” he says.

At the same time, the production systems were getting clobbered by reporting requests, he says. Quicken Loans initially looked at purchasing new hardware, Lofstrom says, but decided instead to accelerate its BI system by refreshing it during the day. At first the refresh ran once an hour, then every half hour and then every 10 minutes. “Now we are almost up to real time,” he says.

“We do a classic change data capture method,” Lofstrom explains. “We get data throughout the day. If certain attributes, or the status of the loan changes, we apply that information as it occurs.” The source system posts the change using HTTP to send it to the ETL engine.
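The flow Lofstrom describes, in which a source system posts each change over HTTP and the receiving side applies it immediately, can be sketched roughly as follows. This is not Quicken Loans’ implementation: the JSON event shape, the in-memory store and every name below are assumptions made for illustration, using only Python’s standard library.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory stand-in for the warehouse's loan records.
WAREHOUSE: dict = {}


def apply_change(event: dict, store: dict = WAREHOUSE) -> dict:
    """Merge only the attributes that changed into the stored loan record."""
    record = store.setdefault(event["loan_id"], {})
    record.update(event["changes"])
    return record


class ChangeHandler(BaseHTTPRequestHandler):
    """Accepts one change event per POST, e.g.
    {"loan_id": "A1", "changes": {"status": "approved"}} (assumed format)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))
        apply_change(event)
        self.send_response(204)  # applied; nothing to return
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen for change events until interrupted."""
    HTTPServer(("127.0.0.1", port), ChangeHandler).serve_forever()
```

Because each event carries only the changed attributes, the warehouse stays current without the full reloads that batch reporting would require.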

The initial system rollout included about 1,500 users, but will eventually roll out to 2,500 users, he says.

Right time to make the right decision
But as systems get faster, some, like TDWI’s Wayne Eckerson, wonder if the human element will keep up. Information latency may soon be a thing of the past, but what he calls “decision latency” may remain. [Note: TDWI and ADT share the same parent company.]

Indeed, he notes, as the last stop in the chain, the business user may just turn into the biggest bottleneck. So, as users move toward on-demand data warehousing, Eckerson says, enterprises need to look at their business processes and their people. It may be that on-demand data warehousing will only reach its full potential when business processes are optimized and when business users receive proper training, too.

Finally, Ray Johnson at consulting firm Greenbrier & Russel warns: “The closer you get to real time, the more expensive it gets.”

Sidebar: On-demand data points
Sidebar: Words to live by