A Closer Look: Oracle’s New Warehouse Builder R2

It’s been a long time coming, but Oracle Corp.’s next-gen Warehouse Builder (OWB) is finally here. It’s actually been here for a while now, according to Oracle officials, who note that OWB Release 2 has been available (to select customers, anyway) since early May.

In any event, Oracle last month delivered OWB R2 for the rest of us, pulling out a number of stops to help trumpet the availability of its would-be enterprise ETL contender. OWB’s next-gen feature set has been discussed to death, but its new-and-improved pricing and packaging are almost certainly worth a closer look. Is OWB R2, which Oracle bills as a “completely free” core ETL tool, really that?

OWB used to be sold as part of Oracle Developer Suite, but Oracle now packages it with its bread-and-butter 10g database. Why the change? Paul Narth, the Oracle senior group manager who heads up OWB development, says the shift was driven largely by feedback from customers.

“Beforehand, we were part of Developer Suite, but feedback we got from our customers and partners on that was that [OWB] belonged with the Oracle database,” he comments. “There’s a broad space that we play in, and the changes that we made: we’ve kind of tried to align ourselves with expectations. What we have is a set of core functionality that is actually now free with the Oracle database—core ETL functionality, free with the Standard Edition, Standard Edition One, and Enterprise Editions of the database.”

And that’s that, Oracle officials conclude. Of course, as former President Bill Clinton might say, the “free”-ness of OWB R2 could be said to hinge on just what the meaning of the word “core” is. Nor is there any dissimulation in this regard, insists Oracle’s Narth: “It’s full-fledged ETL functionality. If you consider not the product but the functionality we had in the past, the previous release of Oracle Warehouse Builder, all of that plus the new feature enhancements that we’ve made in those [existing] areas, is actually now in the core ETL product.”

At the same time, Narth concedes, “core” does not necessarily connote heterogeneous connectivity—at least in the form of Oracle’s Transparent Gateway adapters for PeopleSoft, Siebel, SAP, and other platforms. “Core ETL is basically the OWB functionality that you need to design and deploy and execute ETL jobs. In the case of the Transparent Gateways, you would still need to license the Transparent Gateways, but the enabling technology in the past that you would have had to pay for, OWB itself, that is now free,” Narth comments.

This doesn’t mean OWB R2 is a homogeneous, Oracle-only play, he stresses, just that Oracle’s high-performance Transparent Gateway adapters, which facilitate bi-directional synchronization into and out of PeopleSoft and Siebel applications, as well as one-way synchronization out of SAP, still haven’t trickled down into OWB’s “core” ETL set. This makes ODBC and JDBC the go-to connectivity options for the core OWB R2 suite.

Not that this is a handicap, Narth insists; users can tap OWB’s native functionality—along with either ODBC or JDBC—to perform “all sorts” of integration activities: “You can import metadata, design objects, you can design the dataflow mapping, you can design dimensions, and one of the things we’ve added is time dimensions, so you can quickly build a dimension that does time. Then you can deploy and execute these jobs, so that functionality is available for free, and with the Oracle database, you get the interface for ODBC [and JDBC, too], so we will read and write to that.”

Oracle’s understanding of “core” ETL functionality also encompasses a basic data quality feature set, Narth says. “There is some data quality functionality in [the core OWB R2], support for things like match-merge, probabilistic [or] deterministic, and fuzzy logic,” he comments.
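For readers unfamiliar with the terms, the difference between deterministic and fuzzy matching in a match-merge step can be sketched in a few lines of Python. This is a generic illustration only, not OWB code: the function names, the sample records, and the 0.85 similarity threshold are all assumptions made for the example, and the similarity measure comes from Python's standard `difflib` module rather than from anything Oracle ships.

```python
from difflib import SequenceMatcher


def deterministic_match(a: str, b: str) -> bool:
    """Exact match after trivial normalization (case, surrounding whitespace)."""
    return a.strip().lower() == b.strip().lower()


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Similarity-based match; the 0.85 threshold is an arbitrary illustrative value."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def merge(records, match):
    """Group records: each record joins the first group whose first member it matches."""
    groups = []
    for rec in records:
        for group in groups:
            if match(group[0], rec):
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new one.
            groups.append([rec])
    return groups


# Hypothetical customer names with typical data-entry variations.
RECORDS = ["Acme Corp", "acme corp ", "Acme Corp.", "Globex"]

exact_groups = merge(RECORDS, deterministic_match)
fuzzy_groups = merge(RECORDS, fuzzy_match)
```

With these sample records, the deterministic rule keeps "Acme Corp." (with the trailing period) in a group of its own, while the fuzzy rule folds it into the same group as the other two Acme variants.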

On top of this, however, Oracle now offers a premium “Data Quality Option,” which provides what Narth describes as “more sophisticated” data quality capabilities. “We’re building on top of [OWB R2’s native data quality functionality], so this [Option] is the profiling, this is the data rules, this is the auto generation of ETL maps to correct the data according to rules, this is the data auditing, the ability to audit data sources according to rules, even before you take the hit with doing any kind of transformations,” Narth explains. “This also includes data profiling, where we go and analyze the whole data or a subset of it, either remotely or locally, and we can determine various bits of information.”

OWB’s Data Quality Option delivers robust profiling capabilities, he says. “It will do data type analysis. It will profile data according to rules, so our profiling looks across tables and files—and by the way, we’ll even profile SAP data—it will look at data within a single column, it will look at data across a single table,” he says. “We can annotate specific rows, so what we actually do is clone the object and add columns, so we can say, ‘This row of data violated rules in this way,’ but we can take it another step further and automatically correct the data, because the customer has already defined the rules, so we can automatically correct or generate ETL maps to correct the data according to those rules.”
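The pattern Narth describes (clone the object, add columns that record which rules each row violated, then correct the data according to those same rules) can be sketched generically in Python. Nothing here reflects OWB's actual interfaces: the `Rule` class, the `state_code` rule, and the sample rows are hypothetical and exist only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    column: str
    is_valid: Callable[[object], bool]
    correct: Optional[Callable[[object], object]] = None  # optional auto-fix


def annotate(rows, rules):
    """Return clones of the rows with an added 'violations' column; originals are untouched."""
    out = []
    for row in rows:
        clone = dict(row)
        clone["violations"] = [
            r.name for r in rules if not r.is_valid(row.get(r.column))
        ]
        out.append(clone)
    return out


def auto_correct(rows, rules):
    """Return corrected clones, applying each rule's fix where the rule is violated."""
    out = []
    for row in rows:
        fixed = dict(row)
        for r in rules:
            if r.correct is not None and not r.is_valid(fixed.get(r.column)):
                fixed[r.column] = r.correct(fixed.get(r.column))
        out.append(fixed)
    return out


# Hypothetical rule: state codes must be two uppercase letters.
RULES = [
    Rule(
        name="state_code",
        column="state",
        is_valid=lambda v: isinstance(v, str) and len(v) == 2 and v.isupper(),
        correct=lambda v: str(v).strip().upper(),
    )
]
ROWS = [{"id": 1, "state": "CA"}, {"id": 2, "state": "ny"}]

annotated = annotate(ROWS, RULES)
corrected = auto_correct(ROWS, RULES)
```

Because the customer has already defined the rule, the same definition drives both the audit (which rows failed, and why) and the correction, which is the point Narth makes about generating corrective maps automatically.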

On top of its new Data Quality Option, Oracle also offers an Enterprise ETL Option, which extends the core OWB R2 feature set with an interactive, drag-and-drop design and analysis environment. “There’s functionality in the core [OWB] for interactive lineage and impact analysis, and it’s basically Web-based, so [customers] can see the pics so they can see where the data’s coming from. What they get as part of the Enterprise ETL functionality is an interactive version of this, an interactive editor that allows you to drag and drop [objects into the flow],” he explains. “It’s interactive, so I can expand these objects in space, and I can see not just that this dimension was fed by these maps which was fed by these three tables, I can see that the attributes in these dimensions were fed by these columns,… which in turn was sourced from this column.”

The Enterprise ETL Option also makes it easier to tweak ETL jobs once they’ve been defined and deployed. “Sometimes when you deploy these things out into production, and the user says ‘Instead of being 100 characters, this description field needs to be 1,000.’ So what we have inside of this is for the objects that Oracle Warehouse Builder generates and deploys, right inside this [interactive] impact analysis or lineage [console], I can right-click on it and change the data type, the size, things like that, even actually the name. I can click on it actually and change it, and it will ripple these changes [to all of the associated objects],” Narth says.
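Conceptually, impact analysis of this kind is a walk over a dependency graph: start at the changed column, follow every edge to the objects it feeds, and apply the change along the way. The Python sketch below illustrates that idea under stated assumptions. The column names, the `FEEDS` mapping, and both functions are hypothetical, and it makes no claim about how OWB stores or traverses its metadata.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed in its value.
FEEDS = {
    "src.products.descr": ["map_products.descr"],
    "map_products.descr": ["dim_product.description"],
    "dim_product.description": ["cube_sales.product_description"],
}


def downstream(column, feeds):
    """Breadth-first walk returning every column affected by a change to `column`."""
    seen, order, queue = {column}, [], deque([column])
    while queue:
        for nxt in feeds.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def ripple_resize(column, new_size, sizes, feeds):
    """Return a new size table with `column` and everything downstream of it resized."""
    updated = dict(sizes)
    for col in [column, *downstream(column, feeds)]:
        updated[col] = new_size
    return updated


# Every column starts at 100 characters, as in Narth's example.
SIZES = {
    "src.products.descr": 100,
    "map_products.descr": 100,
    "dim_product.description": 100,
    "cube_sales.product_description": 100,
}
resized = ripple_resize("dim_product.description", 1000, SIZES, FEEDS)
```

In this sketch, widening the dimension's description column also widens the cube column it feeds, while the upstream source and mapping columns keep their original size.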

Narth says both options are analogous to premium, best-of-breed offerings. And to some extent, their pricing is consistent with that: Oracle plans to charge $15,000 per CPU (or $300 per named user) for its Data Quality Option, and $10,000 per CPU (or $200 per named user) for its Enterprise ETL Option. Its ERP connectors, on the other hand, are $20,000 per connector per target, such that a customer with five instances of SAP but only a single Oracle database need pay for only one connector, not five, Narth says.
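To make the arithmetic concrete, here is a small Python sketch built on the list prices quoted above. It is a back-of-the-envelope calculator, not an Oracle pricing tool: the function names are invented for the example, the $10,000 Enterprise ETL figure is assumed to be per CPU (parallel to the Data Quality Option), and it ignores discounts, minimums, and any other licensing terms the article does not cover.

```python
# List prices as quoted in the article (US dollars).
PER_CPU = {"data_quality": 15_000, "enterprise_etl": 10_000}  # ETL per-CPU basis is an assumption
PER_NAMED_USER = {"data_quality": 300, "enterprise_etl": 200}
CONNECTOR_PER_TARGET = 20_000


def option_cost(option, cpus=0, named_users=0):
    """Cost of one option under exactly one licensing metric (CPUs or named users)."""
    if bool(cpus) == bool(named_users):
        raise ValueError("specify either cpus or named_users, not both or neither")
    if cpus:
        return PER_CPU[option] * cpus
    return PER_NAMED_USER[option] * named_users


def connector_cost(connector_types, targets):
    """Connectors are priced per connector per target; source instances are not counted."""
    return CONNECTOR_PER_TARGET * connector_types * targets


dq_four_cpus = option_cost("data_quality", cpus=4)
etl_25_users = option_cost("enterprise_etl", named_users=25)
# Five SAP instances feeding one Oracle database: one connector type, one target.
sap_connector = connector_cost(connector_types=1, targets=1)
```

On these figures, both options break even at 50 named users per CPU ($15,000 / $300 and $10,000 / $200), so at list price the named-user metric is the cheaper one only for teams smaller than that.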

Tough Luck for Developer Suite Customers?

Oracle used to sell OWB as part of its Developer Suite. In the R2 release, however, Oracle has decoupled Warehouse Builder from Developer Suite and folded it into the bread-and-butter Oracle 10g database itself.

Does this mean customers who ponied up the cash for Developer Suite (even as recently as a month ago) are, as it were, out of luck? Not at all, says Narth: Oracle intends to do right by these customers. “For the Developer Suite customers, we’ve gone through to make sure that we want to keep them as whole as possible, so they don’t lose out on this packaging, so those customers get the core functionality plus the SAP connector for free,” he indicates.

About the Author

Stephen Swoyer is a contributing editor for Enterprise Systems. He can be reached at [email protected].