In-Depth

Plato vs. the republic

This summer, Microsoft is on final approach for its long-awaited entrée into the online analytical processing (Olap) market with its product, code-named Plato. For the last year and a half, the Olap market has been girding itself for Plato's entrance onto the scene. Ever mindful of the potential returns that might be garnered from riding Microsoft's coattails, vendors have been quick to announce alliances with the Redmond, Wash., software juggernaut, and their support of Plato and OLE DB for Olap (code-named Tensor). Out of the spotlight, however, there is healthy skepticism about Plato and what it can deliver.

Having decided relatively late in the game to incorporate a data warehouse framework into its NT BackOffice architecture, Microsoft followed its standard operating procedure of buy, build out and bundle. In October 1996, the firm announced its acquisition of Olap technology from Panorama Software Systems, an Israel-based company. With an explosive development effort that few other than Microsoft could afford, the six-member Panorama team has expanded to 50, and Plato -- now in its final beta -- has been readied for an expected general release in Q4/'98. Microsoft has also announced that it will bundle Plato with SQL Server 7.0, thus providing a free Olap engine to SQL Server customers.

Plato's storage engine

An Olap environment has three fundamental components: an Olap storage engine, a method of data access (often an API), and an analytic application that uses data queried from the Olap engine. While this is a very simplified view, it helps clarify what Plato and Tensor are, and what they can offer. Plato is an Olap storage engine -- it provides no associated Olap application to use the data it stores. OLE DB for Olap (Tensor) is an API that can be used to query data from Plato. Together, Plato and Tensor can provide Olap storage and access, but an analytic application is required to actually do something with the data. Microsoft has made bold claims about Plato's ability to manage enterprise Olap storage. In order to support these claims, however, Plato needs to deliver in several key areas.

PERFORMANCE: Olap engines must be able to load data quickly, as well as process large volumes of queries without delays. Yet these two functions actually work at odds with each other. While fast query performance is best achieved by pre-aggregating the data at load time, extensive pre-calculation can bog down the load or refresh cycle. The best measure of an Olap engine's performance considers both load and query performance for a given example. In this way, the results take into account the degree of pre-aggregation for both loads and queries, rather than showing only the more favorable result.

The question of when and how much to aggregate is key to an Olap engine's performance, and is far from trivial. Ideally, dense dimensions that are frequently queried should be pre-calculated at load time. Sparse dimensions that are lightly accessed can be dynamically calculated at query time with less of an impact on query performance. This balancing of pre- and dynamic aggregation (also referred to as sparse aggregation) optimizes the Olap engine's performance for both loads and queries.
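The trade-off between load-time and query-time aggregation can be illustrated with a minimal Python sketch. It is not Plato's algorithm; the fact rows, dimension names and function names are all hypothetical, chosen only to show one total being paid for once at load time while another is recomputed on every query.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, sales). Illustrative data only.
FACTS = [
    ("widget", "east", 100),
    ("widget", "west", 150),
    ("gadget", "east", 200),
    ("gadget", "west", 50),
]


def preaggregate_by_product(facts):
    """Load-time work: compute total sales per product once and store it."""
    totals = defaultdict(int)
    for product, _region, sales in facts:
        totals[product] += sales
    return dict(totals)


# The cost of this aggregate is paid during the load or refresh cycle.
PRODUCT_TOTALS = preaggregate_by_product(FACTS)


def query_product(product):
    """Frequently asked question: answered by a lookup on the stored total."""
    return PRODUCT_TOTALS[product]


def query_region(facts, region):
    """Rarely asked question: aggregated dynamically by scanning base rows."""
    return sum(sales for _product, r, sales in facts if r == region)
```

Pre-aggregating every possible total would make all queries lookups but lengthen the load; aggregating nothing would shorten the load but make every query a scan. The sketch simply picks one dimension for each treatment.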

Plato provides sophisticated heuristics to determine the ideal balance of pre- and dynamic aggregations based on a given data model. It can then refine the aggregation balance after the fact, based on observed usage patterns. This means that customers can more easily balance their aggregations for optimal performance out of the box. While this will help customers tune aggregation, it still does not indicate what the engine's performance will be.

The Olap Council's APB-1 benchmark is one of the best ways to assess a product's performance relative to other Olap tools (see "Using the Olap benchmark," Application Development Trends, Sept. 1997, p. 67). The benchmark measures both load and query times under a set of realistic, audited conditions. According to Steve Murchie, Microsoft's product manager for data warehousing, Plato's performance will be measured under the APB-1 benchmark. While this is something of a surprise (Microsoft is notably not on the Olap Council), it is an important step in validating Microsoft's vaunted performance claims.

SCALABILITY: Scalability has always been a particular challenge to multidimensional storage engines. Numbers of dimensions, hierarchy complexity, sparsity and degree of aggregation can quickly explode a multidimensional database into Brobdingnagian proportions. One of the most effective solutions to multidimensional scalability is partitioning.

There are two flavors of partitioning: physical and dimensional. Physical partitioning allows an application to work with data from a larger, virtual data cube that is essentially composed of multiple, smaller data cubes. The Olap engine manages the metadata, semantic mappings and middleware necessary to process the queries and return the data sets transparently, leaving the application with the appearance of a single data cube. Because each data cube can potentially sit on its own server, physical partitioning can greatly improve enterprise scalability by taking full advantage of dedicated server resources and the proximity to physically disparate data sets.
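As a rough illustration of physical partitioning, the Python sketch below presents two small cubes as one virtual cube. The class names, the region-based split and the sample figures are assumptions made for the example; a real engine would place each partition on its own server and handle the routing in middleware.

```python
class CubePartition:
    """One smaller cube; in practice it could sit on its own server."""

    def __init__(self, name, cells):
        self.name = name
        self.cells = cells  # {(region, product): sales}

    def total(self, product):
        """Partial total for one product within this partition only."""
        return sum(v for (_r, p), v in self.cells.items() if p == product)


class VirtualCube:
    """Presents several partitions to the application as a single cube."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        # Metadata kept by the engine: which partition owns which region.
        self.owner = {
            region: part
            for part in self.partitions
            for (region, _p) in part.cells
        }

    def cell(self, region, product):
        """Route a single-cell lookup to the partition that owns the region."""
        return self.owner[region].cells.get((region, product), 0)

    def total(self, product):
        """Fan the query out to every partition and combine the results."""
        return sum(part.total(product) for part in self.partitions)


east = CubePartition("east_server", {("east", "widget"): 100,
                                     ("east", "gadget"): 200})
west = CubePartition("west_server", {("west", "widget"): 150,
                                     ("west", "gadget"): 50})
cube = VirtualCube([east, west])
```

The application calls only `cube.cell(...)` and `cube.total(...)`; it never needs to know how many partitions exist or where they live.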

In dimensional partitioning, data can be traced throughout multiple cubes as long as each cube shares at least a single dimension with another cube. The shared dimensions act as a join between data cubes, allowing a query to navigate through the cubes to find the necessary data. Dimensional partitioning allows different application cubes that are optimized to solve different business problems to share data. While there must be sufficient consistency in the dimensional model to allow the cubes to be "joined," all of the application-specific characteristics of aggregations, calculations, attributes and measures can vary based on the given application's specific requirements.
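The idea of a shared dimension acting as a join can be sketched in a few lines of Python. The two cubes, their measures and the helper names below are hypothetical; the only thing they have in common is a product dimension, which is what allows a query to move from one cube to the other.

```python
# Two hypothetical application cubes that share only the product dimension.
SALES_CUBE = {            # {(product, region): revenue}
    ("widget", "east"): 100,
    ("widget", "west"): 150,
    ("gadget", "east"): 200,
}
INVENTORY_CUBE = {        # {(product, warehouse): units on hand}
    ("widget", "w1"): 30,
    ("gadget", "w1"): 10,
    ("gadget", "w2"): 5,
}


def rollup(cube, shared_index=0):
    """Collapse a cube onto the shared dimension (position shared_index)."""
    totals = {}
    for key, value in cube.items():
        member = key[shared_index]
        totals[member] = totals.get(member, 0) + value
    return totals


def join_on_shared_dimension(left, right):
    """Pair up the measures of two cubes for members present in both."""
    left_totals, right_totals = rollup(left), rollup(right)
    shared_members = left_totals.keys() & right_totals.keys()
    return {m: (left_totals[m], right_totals[m]) for m in shared_members}


revenue_and_stock = join_on_shared_dimension(SALES_CUBE, INVENTORY_CUBE)
```

Each cube keeps its own non-shared dimension (region or warehouse) and its own measure; only the product members need to be defined consistently for the join to work.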

Microsoft has built partitioning functionality into Plato, claiming support of both physical and dimensional partitioning. While this functionality is important to multidimensional scalability, Plato will have tough competition. [Ed. note: In May, Arbor Software Corp., Sunnyvale, Calif., announced an agreement to merge with Hyperion Software Corp., Stamford, Conn. See "Arbor and Hyperion plan to merge," p. 22.]

CROSS-PLATFORM SUPPORT: While Microsoft is saying all the right things about performance and scalability, its message regarding cross-platform support is disappointing, albeit vintage Microsoft -- there is none. Plato will be bundled with SQL Server 7.0 and will, therefore, run only on NT. While the attempt is to validate and extend NT as the primary (and, preferably, only) enterprise-server platform, the result is a niche, single-platform Olap engine. Because an enterprise-level product must be able to support both client and server software on a variety of platforms, Plato's single-platform server makes it less suitable on an enterprise scale.

Single-platform support quickly reduces the scalable growth partitioning can offer an Olap engine. While cubes can be partitioned across servers, the servers must all be NT. Vendor-enforced constraints on platform selection for software products are not acceptable at the enterprise server level. There is simply too much good competition offering multiplatform support. Microsoft has hurt itself with this decision, and has moved Plato off the enterprise level and relegated it to the workgroup environment.

API access to Olap data

The first question to ask when considering access to an Olap server is whether or not an open API is even needed. If you use a front-end application that is compatible with a particular vendor's Olap engine, a standard API is not required to access an Olap server. For example, an application developed in Oracle's Express could directly communicate with the Express Olap engine. This also holds true in situations where vendor partnerships guarantee interoperability between products. Native access within a given product suite or among interoperable products is a very efficient form of application data access. An open API standard becomes essential, however, as different Olap applications and servers are introduced into the same environment and need to talk to one another. An open API also encourages the development of third-party Olap applications that can then run against a broader range of Olap servers.

When an open API is required, there are currently two choices: Microsoft's OLE DB for Olap (aka Tensor) or the Olap Council's Multi-Dimensional API (MDAPI). Tensor is an extension to OLE DB, Microsoft's intended data access mechanism within its Component Object Model (COM) architecture. MDAPI is a second-generation API drafted by the Olap Council, a vendor consortium comprising many of the largest Olap vendors, although not Microsoft.

The short version of the API battle is that both APIs provide good Olap query support. However, there are some notable differences (see Table 1):

*Tensor is a stateless API and therefore cannot readily support the dynamic queries that are typical of interactive Olap analysis. It is very well suited, however, to the type of static queries typically used in report generation. In contrast, MDAPI is stateful and very well suited to dynamic queries, pivots and drill downs.

*Tensor supports Windows 95 and NT clients. In addition to Windows clients, MDAPI supports COM and Java object interfaces (making it compatible with any Java Virtual Machine) and is fully Corba-compliant. In fact, MDAPI's object model easily enables custom Olap application development.

*Tensor supports data writebacks to the Olap engine. This is an important feature for business forecasting and "what-if" analysis. MDAPI currently does not support data writebacks. Interestingly enough, while Tensor supports writebacks, Plato does not have writeback capability.
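The stateless/stateful distinction in the first point above can be made concrete with a small Python sketch. It does not reproduce either API; the data, the `evaluate` function and the `Session` class are hypothetical and only contrast a call that must carry the whole query with a session that remembers the current view between calls.

```python
# Hypothetical cube cells keyed by (year, quarter, region).
DIMS = ("year", "quarter", "region")
DATA = {
    ("1998", "Q1", "east"): 10,
    ("1998", "Q1", "west"): 20,
    ("1998", "Q2", "east"): 30,
    ("1998", "Q2", "west"): 40,
}


def evaluate(filters):
    """Stateless style: every call carries the full description of the slice."""
    return sum(
        value
        for key, value in DATA.items()
        if all(key[DIMS.index(dim)] == member for dim, member in filters.items())
    )


class Session:
    """Stateful style: the session remembers the current view between calls."""

    def __init__(self):
        self.filters = {}

    def drill_down(self, dim, member):
        """Narrow the current view by one more dimension member."""
        self.filters[dim] = member
        return evaluate(self.filters)

    def drill_up(self, dim):
        """Widen the current view by dropping the filter on one dimension."""
        self.filters.pop(dim, None)
        return evaluate(self.filters)
```

With a session, an analyst's next step is expressed relative to the current view (`drill_down("region", "east")`); without one, the client has to rebuild and resend the complete filter set for each step, which suits fixed reports better than interactive navigation.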

Most of the major Olap vendors support both APIs, but even here the picture is not so simple. Of the Olap Council members, Arbor Software Corp. is the only one not supporting MDAPI. Arbor's history of aggressive partnering, however, has given the firm interoperability with many Olap front-end products. It is no surprise that Plato will not support MDAPI, considering Microsoft's effort to drive Tensor as the industry standard. Of the major vendors, only Oracle Corp., Redwood Shores, Calif., has taken an anti-Microsoft, pro-Council position by not supporting Tensor.

Table 1: How Tensor and MDAPI compare

OLE DB for Olap (Tensor) | MDAPI
Developed by Microsoft | Developed by the Olap Council
Stateless | Stateful
Well suited for static reporting | Well suited for dynamic, interactive analysis
Supports Windows 95 & NT clients | Supports Windows 95 & NT clients, as well as COM and Java interfaces. Fully Corba-compliant.
Difficult to develop custom applications | Easy to develop custom applications
Supports data writes to Olap server (Plato does not support writes) | No support for data writes to Olap server
All major vendors, except Oracle, support Tensor | All major vendors, except Arbor and Microsoft, support MDAPI

Source: IDT Tech

Applications supported by Plato

Software Productivity Group's (SPG's) recently published Data Warehousing Market Intelligence Report defines a data warehousing market maturity model that describes the successive stages of analytic application usage. According to this model, users new to an analytic environment fall under the first stage, data access. Data access is typified by data familiarization at the atomic level, and most SQL-based query and reporting tools fit into this stage. As users become more sophisticated in their business analysis, they want to see more summary information in a comparative format. This second stage, data summarization, is generally the entry level for Olap applications. Once acquainted with general Olap navigation and reporting, users' needs shift to the third stage, business-specific analysis. This stage applies Olap functionality to particular business areas, such as marketing analysis or finance. The fourth and final stage in the maturity model, complex modeling and forecasting, represents use of the most sophisticated analytic applications, potentially employing both Olap and data mining techniques.

SPG's market maturity model provides a useful framework for evaluating the applications likely to run atop Plato. If application access to Plato is through Tensor, application functionality will be limited to the more static querying and reporting within the second stage (data summarization) because of Tensor's lack of state. This positions Plato squarely in the first-stage to early-second-stage range. Plato would then be appropriate for early Olap adopters who have not yet developed sophisticated analytic requirements. As a user grows upward through the maturity model, and develops more refined needs and sophisticated analytic requirements, Plato will be less able to support the required functionality. A more robust, full-featured Olap server will then be necessary.

If the maturity model is viewed as a pyramid with the earliest stages on the bottom, it is easy to see why the higher end Olap vendors do not view Microsoft as a threat. If Plato can broaden its base of Olap users in the early use stages, more users will ultimately migrate upwards to the later stages of the maturity model. Microsoft will introduce more companies and users to Olap, enabling users to develop basic analytic skills and qualifying them for more complex types of Olap analysis supported by more sophisticated products.

While most existing Olap client products will be able to run against Plato via Tensor, Ottawa-based Cognos Inc. recently added a Plato-specific client product to its wares. In May, Cognos announced its acquisition of the worldwide marketing and distribution rights to Panorama Software Systems' Plato-specific Olap client, now called Aristotle. It is difficult to view this move by Cognos as anything other than tactical. Aristotle does not fit in with the Cognos product architecture, and does not provide any additional functionality not already available in its PowerPlay product. The firm is positioning Aristotle as uniquely able to leverage Plato functionality, yet Plato is not nearly as full-featured as PowerPlay.

Cognos will follow Microsoft's pricing strategy, short of giving away Aristotle, and set the price point very low. The hope is to sell volume through the Plato and Microsoft association, and to develop a Cognos-customer relationship with Aristotle that will lead to future sales of its other products. Unfortunately, for users opting to go this route, there will be little consistency or integration between Aristotle and Cognos' other products should users require more sophisticated functionality down the road.

Microsoft's pricing and distribution of Plato will also likely impact the existing Olap market, but not nearly as much as many observers believe. By bundling Plato free with SQL Server, Microsoft is essentially commoditizing its Olap engine as yet one more backoffice function. While this will certainly introduce Olap to many more organizations, one must ask at what level. Plato might be attractive as a low-cost trial product for individuals or departments unfamiliar with Olap technology. However, larger groups with a specific, critical business need will spend the money on a higher end product and a vendor relationship that offers a complete business solution.

Do not expect to see Arbor or Oracle slashing prices and running toward indirect distribution channels -- neither of these firms sees a need to compete with Microsoft at this level. Of the major multidimensional Olap engines, TM1 from Applix Inc., Westborough, Mass., has the most developed indirect channel strategy (as an outgrowth of its early days as Sinper Corp., where it offered the TM1 product but lacked an internal distribution mechanism). Applix, however, does not see Plato as a significant threat. What Microsoft's pricing and distribution strategy is likely to cause is a mushrooming market for third-party value-added services providing lightweight Olap development, administration and support.

Et tu, Microsoft?

One of the most fascinating questions surrounding Microsoft's development and release of Plato is why the company did it. Why did the undisputed 500-lb. gorilla of the desktop choose to develop backoffice server functionality, while completely ignoring the potential desktop data access applications that could run against that server? Microsoft's client-side solution to accessing Plato's data is to use pivot tables in Excel. (Not a very compelling desktop strategy.) Throw into the mix the fact that Microsoft's public relations and marketing have focused primarily on Tensor, rather than Plato, and a neat little puzzle emerges. What are those folks in Redmond thinking?

Clearly the dominance of Oracle in the relational DBMS market weighs heavily in Microsoft's planning. SQL Server's market share is growing, but not nearly enough to pose a serious threat to Oracle -- yet. The slight degree of marketing attention Plato has received, together with its free bundling with SQL Server, would suggest that Plato is a checklist item for SQL Server, and not a strategic product direction for Microsoft. However, Tensor may not be so easily dismissed. If OLE DB (including OLE DB for Olap) can be established as the de facto standard for distributed data access across the enterprise, Microsoft would have a powerful middleware stake in the enterprise systems game. How simple it would then be to develop some slick front-end tools providing analysis and Olap functionality with tight integration to the Microsoft Office desktop.