In-Depth

Managing Data Center Performance and Availability (Part 1 of 2)

As companies become ever more reliant on IT, system availability becomes essential. Application performance management tools can help.

Availability is one of the fundamental principles of doing business in the digital age. Information must be accessible anytime, anywhere—and so must the systems and applications that put information into the hands of end users.

In today’s fast-paced, information-driven world, availability is no longer simply about uptime or downtime. It is also about quality of service. After all, the objective of availability is to ensure reliable and timely access to the information and applications users need. Consequently, when application performance problems prevent a user from getting what they need when they need it, those problems ultimately affect the user’s productivity and customer service experience. And, if customers do not receive the level of service they demand, they will take their business elsewhere.

What’s more, as companies become even more reliant on information systems for performing high-value operations (such as accounting, payroll, and inventory management), availability becomes essential.

Performance and End Users

Just as customers would find it unacceptable for their bank or another brick-and-mortar business to be open only intermittently and unpredictably, end users become dissatisfied with institutions that fail to meet their expectations for rapid, dependable, and competent service online. Web sites and services that are slow, error-prone, or unreliable fall far short of user demands.

There is a price to pay for poor online service. Performance problems disrupt more than a user’s browsing habits. They thwart money-making transactions, erode customer trust, disrupt critical business operations, and undermine a company’s brand. Users have too little patience and too many options to waste their time waiting for Web pages to load or transactions to complete.

What’s more, sluggish performance of mission-critical applications and databases threatens to undermine the very purpose for which they were put in place: enabling business processes that create value for the company. Diagnosing performance problems, however, is both difficult and time-consuming in an operational environment where information constantly flows along diverse network paths that connect a variety of application servers, database servers, and storage subsystems. Each path, and every server and storage subsystem along it, has the potential to limit overall application performance. Consequently, translating an end user’s complaint about poor performance into an actionable task that resolves a bottleneck requires navigating a broad spectrum of monitoring and diagnostic technologies and techniques.

Additional challenges appear as companies turn to the Web as the primary channel for consumer and business applications. Consider an external self-service Web process, such as an airline check-in kiosk that integrates with the airline’s passenger management system. Web-enabled applications like these are service-delivery channels in their own right, so poor application performance has a direct, detrimental impact on productivity and revenue.

As a result, managing application performance to meet service level agreements (SLAs) has become a concern for IT organizations as well as corporate decision-makers. Organizations need a consistent and repeatable way to measure the service level of each tier within an application, from the source of the request to the source of the data. Organizations must be able to achieve measurable application performance, productivity, and availability improvements that directly translate into bottom-line savings. They must be able to detect, diagnose, and correct performance problems before service levels are affected.
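
To make per-tier measurement concrete, the following minimal Python sketch times each tier of a request path and flags any tier that exceeds its service-level target. The tier names, targets, and simulated workloads are illustrative assumptions, not taken from any particular product:

import time
from contextlib import contextmanager

# Hypothetical service-level targets, in seconds, for each tier of the
# request path from the source of the request to the source of the data.
SLA_TARGETS = {"web": 0.2, "app": 0.5, "database": 0.3}

timings = {}

@contextmanager
def measure(tier):
    """Record the wall-clock time spent in one tier of the request path."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[tier] = time.perf_counter() - start

def handle_request():
    # time.sleep() stands in for each tier's real processing.
    with measure("web"):
        time.sleep(0.05)       # request routing and page rendering
    with measure("app"):
        time.sleep(0.10)       # business logic
    with measure("database"):
        time.sleep(0.35)       # query execution (breaches its target)

handle_request()
for tier, elapsed in timings.items():
    status = "OK" if elapsed <= SLA_TARGETS[tier] else "SLA BREACH"
    print(f"{tier:8s} {elapsed:.3f}s  {status}")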

Application Performance Management

The complex infrastructure navigated by application data-access requests can significantly impede the process of diagnosing performance problems and undermine day-to-day management. Bottlenecks can appear at any connection in the data path. With limited visibility into a constantly changing infrastructure, IT must rely on automated tools that streamline the identification and isolation of performance problems wherever they occur.

Traditional system management frameworks are effective at detecting and alerting on system-level and network issues, such as servers and applications with high utilization, high packet rates, or numerous retries. However, they provide neither a view of what the user actually experiences at the desktop nor direction on where a problem lies or how to correct it. Other technologies, such as "robots," generate synthetic transactions that are entirely automated and have no real user behind them, while stovepipe management technologies, such as database- or Java-specific monitors, manage tiers in isolation without any indication of how performance degradation in one tier affects the others.
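
As an illustration of the synthetic-transaction approach, a minimal "robot" probe might look like the sketch below. The URL, response-time threshold, and probe interval are hypothetical; note that the probe measures availability and speed from the outside, with no real user behind the request:

import time
import urllib.request

# Illustrative values; a real probe would be configured per application.
TARGET_URL = "http://example.com/checkin"   # hypothetical endpoint
THRESHOLD_SECONDS = 2.0                     # assumed response-time target
PROBE_INTERVAL_SECONDS = 60

def probe_once(url):
    """Issue one scripted request and time the complete response."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            response.read()
            ok = response.status == 200
    except OSError:
        ok = False                          # connection failure or timeout
    return ok, time.perf_counter() - start

# Runs until stopped, like a monitoring daemon.
while True:
    ok, elapsed = probe_once(TARGET_URL)
    if not ok:
        print("ALERT: synthetic transaction failed")
    elif elapsed > THRESHOLD_SECONDS:
        print(f"ALERT: response took {elapsed:.2f}s "
              f"(target {THRESHOLD_SECONDS:.1f}s)")
    time.sleep(PROBE_INTERVAL_SECONDS)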

These approaches cannot find, isolate, and focus on the root cause of a problem or capture the real user response-time experience. Synthetic solutions do not represent the real end-user experience. Stovepipe solutions do not allow data collected across the tiers of the IT architecture to be correlated, so the true root cause of a performance issue cannot be accurately assessed. Without this information, it is virtually impossible to correct a problem before the end user experiences it.

Application performance management tools eliminate this challenge. They correlate the metrics collected throughout the composite architecture to pinpoint the actual cause of a performance problem by decomposing the problem into a ranked order of contributing offenders. Metrics derived from collection and instrumentation are used to fine-tune the application, which results in faster transactions and response times. Further, metrics collected over time provide a historical view of how applications are behaving against expected service levels to enable better performance and capacity planning.
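
A minimal sketch of this decomposition, assuming per-tier timings have already been collected and correlated for one slow transaction (the tier names and values here are hypothetical), ranks the contributing offenders by their share of total response time:

# Per-tier timings (in seconds) already collected and correlated for one
# slow transaction; the tier names and values are hypothetical.
tier_timings = {
    "browser/render": 0.4,
    "web server": 0.3,
    "app server": 0.9,
    "database": 2.6,
    "storage": 0.8,
}

total = sum(tier_timings.values())

# Decompose the response time into a ranked order of contributing offenders.
ranked = sorted(tier_timings.items(), key=lambda item: item[1], reverse=True)

print(f"Total response time: {total:.1f}s")
for tier, seconds in ranked:
    print(f"{tier:15s} {seconds:4.1f}s  {seconds / total:6.1%}")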

Application performance management tools isolate the causes of performance degradation and provide the steps needed to quickly address application performance issues. As a result, the application infrastructure runs at peak efficiency, delivering the service levels users demand while allowing IT to focus on additional high-value projects.

By providing visibility and an understanding of application availability and performance trends through service-level reporting, trending, and capacity planning, application performance management tools also enable organizations to better justify their IT investments and add quantifiable value for end users.
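
As a simple illustration of service-level reporting and trending, the sketch below computes SLA compliance and a crude trend over a set of hypothetical daily response-time samples; both the target and the samples are assumptions for the example:

# Hypothetical daily average response times (seconds) for one application,
# with an assumed service-level target of 2.0 seconds.
SLA_TARGET = 2.0
daily_response_times = [1.4, 1.6, 1.5, 1.9, 2.3, 2.1, 1.7, 2.5, 2.8, 2.6]

within_target = sum(1 for t in daily_response_times if t <= SLA_TARGET)
compliance = within_target / len(daily_response_times)
print(f"SLA compliance: {compliance:.0%}")

# A crude trend check for capacity planning: compare the average response
# time of the first half of the history with the second half.
half = len(daily_response_times) // 2
early = sum(daily_response_times[:half]) / half
late = sum(daily_response_times[half:]) / (len(daily_response_times) - half)
trend = "degrading" if late > early else "stable or improving"
print(f"Average response time: {early:.2f}s -> {late:.2f}s ({trend})")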

The deployment of application performance management tools allows organizations to immediately identify and prioritize application slowdowns and assign each an owner, which can significantly reduce business cycle times. Organizations can also see the actual response times users are experiencing and respond accordingly, improving customer service and satisfaction. Application performance management tools are therefore a critical component of an overall high-availability strategy: performance is a natural extension of availability, and applications are where users experience IT. Enterprises must manage performance degradation as well as the major failures that result in downtime. To ensure data center uptime, organizations should consider implementing data protection, replication, performance management, and clustering technologies.

- - -

Next week, in part 2 of this article, we discuss technologies and practices for ensuring data center availability.

About the Author

Bob Maness is vice president of product operations for the Data Center Management Group at Symantec Corporation, where he currently focuses on Application Performance Management solutions. Bob joined Symantec through the company's merger with VERITAS Software.