Keeping SOA Under Control

By now, most IT professionals and managers are well-acquainted with the concept of service oriented architecture (SOA). Many have undertaken projects that employ its fundamental precepts -- building standardized, loosely coupled application components that are available for reuse across the enterprise. A growing number have reaped some of the benefits of SOA such as decreased development costs and increased agility for the business.

Unfortunately, the path to SOA success can be quite circuitous. Just ask anyone who has tried to deploy a services network without regard to the heightened needs for monitoring, control, and validation that it brings. SOA-based applications are heterogeneous, federated, evolving systems. They are composite applications that become complex in the blink of an eye and include application components beyond an organization’s control (such as a partner’s services). They comprise more than just Web services, often tying in mainframes, packaged applications, and messaging systems other than SOAP. Ensuring the health of the running system calls for capabilities beyond the scope of conventional enterprise system management and network system management solutions.

Consider a simple example of the new issues raised by loosely coupled applications. Imagine that a developer learns of a new order entry service built by a colleague from another department and decides to incorporate that service into his application. Suddenly there’s a dependency -- one for which there’s no formal record -- from a service making unauthorized calls that may overload the order entry service. With SOA, a single component going down can cripple the entire system unless there are provisions for handling such circumstances.

This is why SOA runtime governance -- the oversight of SOA applications and their constituent components -- is so important for growing SOA-based infrastructures. Runtime governance ensures that applications perform as expected and withstand changes as the SOA evolves. Unlike design-time governance, which oversees people and processes, runtime governance watches over services in their runtime environment.

Beyond simply monitoring service networks, runtime governance helps enterprises control systems and their components. Governance helps operations teams understand the composition and behavior of the SOA-based service network, as well as detect, diagnose, and (ultimately) prevent problems that arise during the operation of the service network. Ideally, to bring reliability to SOA applications, runtime governance must detect issues and resolve them before these problems can affect the business.

Wherever possible, runtime governance needs to be an automated process. The large number of moving parts in a SOA environment would overwhelm manual efforts to handle governance tasks. Putting a runtime governance solution in place also means spanning a range of heterogeneous systems -- from back-end mainframes to .NET and Java. Therefore, the governance solution should be well integrated with leading application servers, enterprise service buses, and other SOA infrastructure products. Close vendor partnerships in this industry can take some of the bumps out of your SOA adoption path.  

Runtime governance provides a wide range of benefits. The most prominent are helping enterprises understand their service network topology, manage operational health, detect and diagnose exceptions, enhance security, and ensure operational integrity.

Understanding the Service Network Topology

SOA-based systems can and should be dynamic. Services can be added, updated, or removed at any time. In such a shifting environment, it can be a challenge to understand what is installed and running. This problem is much greater in the SOA world, where any service may be added to the topology simply by calling it, yet there may be no record of the existence of this call.

Done properly, runtime governance dynamically discovers the topology of the SOA service network. It observes the actual components that are installed in the environment (whether development, staging, or production) and records their existence. The governance system can also record the details of each discovered service’s interface. This discovery information can be stored in a registry or repository, making it available to architecture, development, and operational teams.
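As a rough illustration of discovery by observation, the sketch below records each call seen on the network and compares the result with what a registry already lists. It is a minimal sketch, not a description of any particular product: the TopologyRecorder class, its method names, and the service names are all hypothetical.

```python
from collections import defaultdict


class TopologyRecorder:
    """Builds a service dependency map from observed calls (hypothetical sketch)."""

    def __init__(self, registered_services):
        self.registered = set(registered_services)
        self.edges = defaultdict(int)  # (consumer, provider) -> call count

    def observe(self, consumer, provider):
        """Record one observed call from consumer to provider."""
        self.edges[(consumer, provider)] += 1

    def consumers_of(self, provider):
        """Return the sorted names of every consumer seen calling provider."""
        return sorted({c for (c, p) in self.edges if p == provider})

    def unregistered(self):
        """Return services seen in traffic that the registry does not list."""
        seen = {name for edge in self.edges for name in edge}
        return sorted(seen - self.registered)


recorder = TopologyRecorder(registered_services=["billing", "order-entry"])
recorder.observe("billing", "order-entry")
recorder.observe("marketing-portal", "order-entry")  # an undocumented dependency
recorder.observe("marketing-portal", "order-entry")

print(recorder.consumers_of("order-entry"))  # ['billing', 'marketing-portal']
print(recorder.unregistered())               # ['marketing-portal']
```

A real governance system would gather these observations from agents or intermediaries on the message path rather than from explicit calls, but the bookkeeping is similar: every observed call becomes an edge, and anything not yet in the registry gets flagged.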

By recording which services exist, their current state, and the rate at which the services are being promoted from one lifecycle stage to another, the enterprise will have a clearer picture of service reuse rates, and thus the effectiveness of its SOA initiatives.

Ensuring the Operational Health

Managing performance, availability, and service levels has long been a challenge for IT. However, SOA-based applications introduce additional wrinkles. For example, services are reused, and the services that get the most reuse may also experience the most performance problems. Loads on the services themselves may change independently of any particular application that uses those services. Thus, the performance of each service must be tracked over time and correlated against the known reuse of the service to determine if new uses of the service will prevent it from properly supporting existing applications. Under unexpectedly heavy loads, the service might not meet its performance criteria.

The trick is to keep the service from getting overloaded. Your runtime governance system can track service reuse rates and performance metrics so you can keep request loads in line with service-level agreements or add capacity as needed.
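One common way to keep request loads within an agreed limit is a sliding-window counter placed in front of the service. The sketch below is an assumption about how such a guard might look, not a prescribed mechanism. The LoadGuard class and its parameters are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque


class LoadGuard:
    """Admits requests only while the recent request count stays under a limit."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # arrival times of admitted requests

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        # Drop admitted requests that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            return False
        self.timestamps.append(now)
        return True


guard = LoadGuard(max_requests=3, window_seconds=1.0)
decisions = [guard.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(decisions)  # [True, True, True, False, True]
```

The fourth request is refused because three requests were already admitted within the last second. In practice a refusal could just as well trigger queuing or a request for more capacity rather than outright rejection.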

The governance system should provide detailed information -- down to the per-end-user or per-transaction level -- enabling operations teams to make the most of service level agreement monitoring and enforcement. The runtime governance system can slice and dice this data in various dimensions, enabling inspection of performance statistics from many vantage points. Runtime governance systems can also be applied to services still under development to ensure that they are delivered in a state that will meet performance requirements.
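To make "slice and dice" concrete, the following sketch groups per-transaction records by whichever dimension is of interest and flags groups whose average latency exceeds a limit. The record fields, the sample values, and the 300 ms threshold are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-transaction records a governance system might collect.
records = [
    {"service": "order-entry", "consumer": "billing", "latency_ms": 120},
    {"service": "order-entry", "consumer": "portal", "latency_ms": 480},
    {"service": "order-entry", "consumer": "portal", "latency_ms": 520},
    {"service": "inventory", "consumer": "billing", "latency_ms": 60},
]


def mean_latency_by(records, dimension):
    """Average latency grouped by the given record field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record["latency_ms"])
    return {key: mean(values) for key, values in groups.items()}


def sla_violations(records, dimension, limit_ms):
    """Sorted group names whose average latency exceeds limit_ms."""
    averages = mean_latency_by(records, dimension)
    return sorted(key for key, avg in averages.items() if avg > limit_ms)


print(mean_latency_by(records, "consumer"))
print(sla_violations(records, "consumer", limit_ms=300))  # ['portal']
print(sla_violations(records, "service", limit_ms=300))   # ['order-entry']
```

The same records answer two different questions: which consumer is seeing slow responses, and which service is responsible for them.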

Detecting and Diagnosing Exceptions

Discovery is the first step to visibility. Once the topology of the service network is known, its dynamic behavior must be understood. Is it up and running? Is it properly processing business transactions? Is it performing as expected?

In times gone by, when an error occurred, figuring out what went wrong and where it went wrong was a difficult task. In many cases, technicians responsible for each service would get together and manually trace their way through the system information logs, correlating messages and looking for anomalies. One user organization spent more than 14 hours looking for such a problem that impacted only one customer’s transactions. They finally realized that one service had been updated in a minor way, but that change affected transactions whose serial numbers were encoded in a specific format used by the one customer.

Runtime governance can reduce much of the labor of such tasks. Governance systems can generate correlated log information from all participating services and make it readily available to the diagnostic team. In addition, these systems can automatically discover anomalies within service networks and initiate corrective action, work that used to take days of manually sifting through error logs.

With a runtime governance system, messages can be recorded and correlated automatically. Standard patterns can be detected automatically and queries and inspections can be applied to the correlated messages in an effort to find anomalous behavior. If the problem is chronic, perhaps due to some physical failure or some recurring logical inconsistency, the runtime governance system can automatically detect such conditions and initiate corrective actions. As each problem is diagnosed, further rules can be added to the exception system to detect similar problems in the future, making the system even more responsive.
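The correlation and rule-based detection described here can be sketched in a few lines. The example assumes every log entry carries a transaction ID, which is the key that makes automatic correlation possible. The field names, the two rules, and the expected service sequence are hypothetical.

```python
from collections import defaultdict


def correlate(log_entries):
    """Group log entries by transaction ID, ordered by timestamp."""
    by_txn = defaultdict(list)
    for entry in sorted(log_entries, key=lambda e: e["ts"]):
        by_txn[entry["txn"]].append(entry)
    return dict(by_txn)


# Each rule inspects one transaction's ordered entries and returns a finding or None.
def failed_step(entries):
    for entry in entries:
        if entry["status"] == "error":
            return f"error in {entry['service']}"
    return None


def incomplete(entries, expected_services=("order-entry", "billing", "shipping")):
    seen = {entry["service"] for entry in entries}
    missing = [s for s in expected_services if s not in seen]
    return f"never reached {', '.join(missing)}" if missing else None


RULES = [failed_step, incomplete]  # new rules are appended as problems are diagnosed


def diagnose(log_entries):
    """Apply every rule to every correlated transaction; return findings by txn."""
    findings = {}
    for txn, entries in correlate(log_entries).items():
        hits = [msg for rule in RULES if (msg := rule(entries)) is not None]
        if hits:
            findings[txn] = hits
    return findings


logs = [
    {"ts": 3, "txn": "T1", "service": "shipping", "status": "ok"},
    {"ts": 1, "txn": "T1", "service": "order-entry", "status": "ok"},
    {"ts": 2, "txn": "T1", "service": "billing", "status": "ok"},
    {"ts": 1, "txn": "T2", "service": "order-entry", "status": "ok"},
    {"ts": 2, "txn": "T2", "service": "billing", "status": "error"},
]
print(diagnose(logs))
# {'T2': ['error in billing', 'never reached shipping']}
```

Keeping the rules in a plain list mirrors the idea that each diagnosed problem can leave behind a new check for next time.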

Securing the Service Network

There are two main challenges to SOA security. One is the ability to authenticate users and authorize their access to specific services and applications. The other is to ensure the privacy and integrity of data managed by the service or application.

Traditional applications are typically “tightly coupled” and secured at the application level. That is, the user signs in to the application using a username and password. Once the user has been authenticated, it is up to the application to authorize the use of its features.

In a service network, this one-to-one model no longer holds. A SOA application consists of an aggregated set of discrete services, each of which is an independent entity that can be reused across multiple applications. Thus, SOA services cannot depend on a single application to implement authentication and authorization policies. Each service must be able to perform a range of security processing, including authentication and authorization, independently. However, implementing security processing in every application would obliterate the core value proposition associated with SOA, which is business agility.

Runtime governance solutions address this conundrum by simultaneously offloading security processing and policy enforcement from the applications themselves while enabling embedded security processing on their behalf. Implementing authentication and authorization at the service interface offloads security programming and configuration from application developers and places responsibility for security in the hands of security administrators. This enables “last-mile security,” or policy enforcement at the service endpoint, where it belongs.

Specific security needs that can be met by the runtime governance system working in conjunction with the application infrastructure include:

  • populating messages with user credentials
  • authenticating requesters
  • determining if an authenticated requester is authorized to make a specific request
  • managing privacy and integrity
  • propagating identity information across multiple service invocations

Runtime governance systems also enforce the mapping of a user’s role to the features of the services available to users in that role. The system can also delegate all user and role management to a dedicated identity management solution. In addition, the governance system can provide the functionality necessary to leverage the identity management system and perform fine-grained authorization at the service endpoint.
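The sketch below shows one way role-to-operation mapping might be enforced at the service endpoint while the service code itself stays free of security logic. It is illustrative only. The policy table, the enforce wrapper, and the identity dictionary are hypothetical stand-ins for what a governance product and an identity management system would supply.

```python
import functools

# Hypothetical policy table: (service, operation) -> roles allowed to call it.
POLICY = {
    ("order-entry", "create_order"): {"sales", "admin"},
    ("order-entry", "cancel_order"): {"admin"},
}


class AccessDenied(Exception):
    """Raised when the caller holds no role permitted for the operation."""


def enforce(service, operation):
    """Wrap a service operation so policy is checked before the operation runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(identity, *args, **kwargs):
            allowed = POLICY.get((service, operation), set())
            if not allowed & set(identity["roles"]):
                raise AccessDenied(f"{identity['user']} may not call {operation}")
            return func(identity, *args, **kwargs)
        return wrapper
    return decorator


@enforce("order-entry", "cancel_order")
def cancel_order(identity, order_id):
    # Business logic only; no security code lives here.
    return f"order {order_id} cancelled by {identity['user']}"


admin = {"user": "dana", "roles": ["admin"]}
clerk = {"user": "lee", "roles": ["sales"]}

print(cancel_order(admin, 42))  # order 42 cancelled by dana
try:
    cancel_order(clerk, 42)
except AccessDenied as exc:
    print("denied:", exc)       # denied: lee may not call cancel_order
```

Because the policy lives in a table outside the service, a security administrator can change who may cancel orders without touching or redeploying the service code.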

In terms of data privacy and integrity, SOA services are often responsible for transmitting sensitive or regulated data across the network. As an SOA evolves, more consumers may come to rely on that data. Runtime governance controls access to that data following corporate policies and regulations about data sharing. Governance puts censorship (content filtering) policies in place to ensure that unless a consumer has the appropriate entitlements, sensitive or regulated data never leaves the container where the service is running. Runtime governance also supports privacy and integrity requirements within these services by implementing standards including XML Signature, XML Encryption, and WS-Security.
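Content filtering of this kind can be pictured as a step applied to each response before it leaves the service container. In the sketch below, the field names and entitlement labels are invented. A real deployment would take them from corporate policy and would typically operate on XML messages rather than dictionaries.

```python
# Hypothetical mapping of sensitive fields to the entitlement needed to see them.
FIELD_ENTITLEMENTS = {
    "credit_card": "pci-data",
    "ssn": "pii-data",
}


def filter_response(payload, entitlements):
    """Return a copy of payload without fields the consumer is not entitled to."""
    granted = set(entitlements)
    return {
        field: value
        for field, value in payload.items()
        if field not in FIELD_ENTITLEMENTS or FIELD_ENTITLEMENTS[field] in granted
    }


order = {
    "order_id": 42,
    "total": 99.5,
    "credit_card": "4111-xxxx",
    "ssn": "xxx-xx-1234",
}

print(filter_response(order, entitlements=["pci-data"]))
# {'order_id': 42, 'total': 99.5, 'credit_card': '4111-xxxx'}
print(filter_response(order, entitlements=[]))
# {'order_id': 42, 'total': 99.5}
```

Fields with no entry in the table pass through untouched, so only data explicitly marked as sensitive is subject to the entitlement check.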

Ensuring Operational Integrity

One of the great SOA challenges is validating the correct operation of the service network when changes are introduced. The operational integrity problem is amplified in SOA environments, since services are shared among applications and a change to a service may impact many applications. In addition, services may change dynamically since a change to a service is “effective” as soon as the updated service is installed and message traffic is delivered to it. Since a service in the operational environment may require a change to support a new or existing application, all applications that use the service may be impacted by the change.

In the case of federated services, a service may change without notice, affecting the consumers of that service. It also means a test version of the service may not be available for validating changes to the application. Plus, the owners of a service may not have access to the consumers of the service in order to validate changes made to the service.

Runtime governance can also support operational validation, which specifically addresses the problem of validating the service network in the face of continuous and dynamic change involving shared services, federated services, and service consumers. By capturing traffic from all consumers of a service, the validation system provides operations teams a realistic sampling of the traffic they have to support for the service (versus directly accessing federated consumers for testing purposes). This traffic report is continuously updated.
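One way to use captured traffic for validation is to replay it against an updated service and compare the responses with those originally recorded. The sketch below assumes this replay-and-compare approach. Both service functions are hypothetical stand-ins, and the serial-number formats are invented to echo the kind of single-customer breakage described earlier.

```python
def current_service(request):
    """Stand-in for the version in production: accepts any serial number."""
    return {"serial": request["serial"], "accepted": True}


def updated_service(request):
    """Stand-in for a 'minor' update that only accepts purely numeric serials."""
    return {"serial": request["serial"], "accepted": request["serial"].isdigit()}


def capture(service, requests):
    """Record each request together with the response the service gave."""
    return [(request, service(request)) for request in requests]


def validate(candidate, captured):
    """Replay captured requests; return those whose response has changed."""
    return [request for request, expected in captured if candidate(request) != expected]


traffic = [{"serial": "10042"}, {"serial": "A-77-X"}, {"serial": "20913"}]
captured = capture(current_service, traffic)
regressions = validate(updated_service, captured)

print(regressions)  # [{'serial': 'A-77-X'}]
```

Here the replay surfaces the one request format that the update breaks before any consumer is affected, and without needing access to the consumer that sends it.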


With the increasing popularity of SOA and the growing number of services now in production, runtime governance has become critical to SOA success. SOA projects can be full of surprises, including performance issues arising from spikes in demand for services from unknown or new applications, security shortfalls, and difficult-to-locate system anomalies. Left unchecked, these issues will eliminate any cost savings or increased agility the SOA provided.

Runtime governance reduces costs, increases operational effectiveness, and ensures that applications perform as expected and withstand changes as the service network evolves. With a runtime governance system that provides visibility into, and automated control of, your complete services network, you’ll be better prepared to reap the many benefits of SOA.