Integrating Data Across Enterprises as a Service

Software-as-a-service (SaaS) proponents advocate integrating data across and between enterprises as a service. One company providing such functionality is Informatica on Demand, whose director of product management, John Hegstrom, came to SaaScon 2007 to speak on this topic.

Hegstrom started with some general observations about SaaS's multitenant architecture, which supports multiple Web-based customers, each with a distinct look and feel, workflow and data access. He pointed out that multitenancy eliminates the hassle of supporting multiple operating systems, minimizes R&D investment in porting and shortens release cycles. Moreover, all customers use the current version of your software, which eliminates the cost of supporting legacy versions. Consequently, the whole R&D budget can go toward developing new features.

And companies can focus on their core competence -- not their IT infrastructure.

Multitenancy also lowers sales costs for SaaS providers. You can let customers try out the software for free, without having to send out a sales engineer to install a copy. A free trial involves registration, which provisions the prospective customer as a tenant. The provisioning, in turn, enables flow-through business processing, creating leads, customers, billing and more for the SaaS provider.

Multitenancy even improves product testing. SaaS applications can log customer usage patterns, giving product managers fast feedback about how customers use a new feature.

Improving Data Access
Turning to the main point of his SaaScon talk, Hegstrom described ways you can improve data access in SaaS environments. Traditionally, data resides in silos within an enterprise. This makes it hard to access data across all on-premises sources, reconcile data definitions and certify data quality.

All of these hassles multiply when you try to access and integrate data in other companies' silos as well. You have to standardize integration options, work with disparate APIs (application programming interfaces) and, above all, protect sensitive data.

But SaaS data integration won't automatically fix these problems, Hegstrom cautioned. You can get an API "salad," for one thing. Some SaaS companies' APIs are easy to work with; some are not. Web services APIs can be proprietary, or incomplete, or designed only for data loads and not for business processes such as responding to events and triggering workflows. Point-to-point solutions can be hard to manage if they come from multiple vendors, and Hegstrom said he believes that few integration vendors have robust API support. Those vendors may limit cross-firewall communications too much, or provide limited data integration tool support for SaaS vendor APIs. And system integrators may have insufficient expertise with SaaS integration.

Data integration matters. SaaS applications are starting to be used as enterprise-wide standards. Companies are building systems infrastructure using multiple SaaS applications. The aim is to extract maximum business utility from all of their data, wherever it sits. Hegstrom asserted that this kind of utility can be achieved best through business process-specific data integration.

Data Integration Example
The process starts with your marketing department running a campaign, creating leads in the CRM (customer relationship management) system. Inside sales qualifies the leads, creating prospects. Field sales closes deals, creating customers that sales operations add to the ERP (enterprise resource planning) system. Then sales and marketing use the customer data to promote upgrades, add-ons, and new products and services. Data integration is how you link together those end-to-end business processes.

The key to linking all of those processes is to focus on the cross-functional touchpoints -- the places where there's a handoff from one process to another. You can only do this if the SaaS provider's API exposes workflow and events.

For example, different employees need data in a particular form at a particular time. Sales reps need to see the sales pipeline, forecast, service requests, receivables balance and past-due invoices. However, they don't need to see line-item details on payments for every invoice. In contrast, accounts payable staff need to see line-item details on invoices, payments and billing addresses. You need both real-time and batch processing, depending on the process.
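
As a rough illustration of the role-specific views described above, the following Python sketch filters one integrated account record down to the fields each role needs. The role names, field names and sample values are all hypothetical; nothing here reflects an actual Informatica or SaaS vendor API.

```python
# Hypothetical mapping of roles to the fields they need to see.
# The field names are illustrative, loosely following the examples in the text.
ROLE_FIELDS = {
    "sales_rep": [
        "pipeline", "forecast", "service_requests",
        "receivables_balance", "past_due_invoices",
    ],
    "accounts_payable": ["invoice_line_items", "payments", "billing_address"],
}


def view_for(role, account):
    """Return only the fields of the account record that the role needs."""
    return {name: account[name] for name in ROLE_FIELDS[role] if name in account}


# Made-up sample record combining summary and line-item data.
account = {
    "pipeline": 3,
    "forecast": 120000,
    "receivables_balance": 4500,
    "invoice_line_items": [{"sku": "A-1", "amount": 4500}],
    "billing_address": "1 Main St",
}

sales_view = view_for("sales_rep", account)
ap_view = view_for("accounts_payable", account)
```

In this sketch the sales view carries the balance but not the invoice line items, while the accounts payable view carries the line items and billing address.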

Real-World SaaS
Hegstrom stressed the real-world aspects of SaaS implementation. In his experience, some customers try to enforce the same level of data quality for sales leads as for customer orders. But that's wrong. Instead, the quality of the data should increase incrementally throughout the cycle. You never want to ignore data quality, but it becomes more critical as a record moves from lead to customer. So you collect and refine prospect data in your CRM application during the sales cycle, while you manage customer data in your ERP, always defining data quality metrics appropriately.
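
One way to picture a quality bar that rises through the cycle is a set of required fields per stage, as in the Python sketch below. The stage names follow the article's lead, prospect and customer progression, but the specific required fields are assumptions made for illustration, not rules Hegstrom gave.

```python
# Hypothetical required fields per stage: the bar rises from lead to customer.
REQUIRED_FIELDS = {
    "lead": {"email"},
    "prospect": {"email", "company", "phone"},
    "customer": {"email", "company", "phone", "billing_address", "tax_id"},
}


def missing_fields(record, stage):
    """Return the required fields that are absent or empty for the given stage."""
    return {name for name in REQUIRED_FIELDS[stage] if not record.get(name)}


def meets_quality_bar(record, stage):
    """True when the record has every field required at this stage."""
    return not missing_fields(record, stage)


# A made-up record that is good enough for a lead but not yet for a prospect.
record = {"email": "pat@example.com", "company": "Acme"}
lead_ok = meets_quality_bar(record, "lead")
prospect_gaps = missing_fields(record, "prospect")
customer_gaps = missing_fields(record, "customer")
```

The same record passes as a lead, is missing a phone number as a prospect, and is missing three fields as a customer, which is the incremental tightening the talk described.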

This means that SaaS providers need to provide data quality extensions to their applications, publishing "events" when records are created, merged and deleted, and providing callouts for data quality services.
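
The event-and-callout idea can be sketched as a small publish/subscribe mechanism in Python. The class, the event names and the email-cleaning callout are hypothetical stand-ins chosen for illustration; a real SaaS provider's API would define its own events and delivery mechanism.

```python
from collections import defaultdict


class RecordEventBus:
    """Minimal in-process stand-in for a provider that publishes record events."""

    def __init__(self):
        # Maps an event type such as "created" to its subscribed callouts.
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        """Register a data quality callout for one event type."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, record):
        """Invoke every callout for the event and collect their results."""
        return [handler(record) for handler in self._handlers[event_type]]


def normalize_email(record):
    """Hypothetical data quality callout: trim and lowercase the email field."""
    cleaned = dict(record)
    cleaned["email"] = record.get("email", "").strip().lower()
    return cleaned


bus = RecordEventBus()
bus.subscribe("created", normalize_email)
results = bus.publish("created", {"email": "  Pat@Example.COM "})
no_handlers = bus.publish("deleted", {"email": "old@example.com"})
```

Here the "created" event triggers the cleaning callout, while the "deleted" event has no subscribers and returns an empty list. Merge events would be handled the same way with their own callouts.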

As you'd expect, Hegstrom wrapped up his presentation with a pitch for his company's products, describing their use in cross-enterprise data integration, SaaS connectivity in large enterprises, on-demand data integration solutions for service providers, Informatica-hosted integration solutions for mid-size SaaS customers, and lastly the Informatica Secure Agent. Hegstrom described this Agent as a solution to the "firewall problem." It's a downloadable component that nestles inside customer firewalls, obviating the need for customers to open up holes in their firewalls.

And just to show he "eats his own dogfood," his company's products are available for free trials.

About the Author

Lee Thé's first computer was a state-of-the-art unit with 48K RAM and a 1MHz processor. He has been writing and editing computer magazine articles since then, in between scuba diving trips. He's based in the San Francisco Bay Area.