In-Depth
Load testing: A developer's crystal ball
- By Sandra Taylor
- July 31, 2001
The reason to load test your client/server application is to detect multi-user interaction errors, right? Wrong! If this is your view of load testing, you are seeing only one dimension of this multifaceted tool. Consider the following.
More than a few client/server systems have been blind-sided by scalability issues. The term "hitting the scalability wall" evolved from the far too common occurrence of a client/server application's inability to accommodate a growing number of users. While proponents of a three-tier model will argue the merits of that alternative, the truth of the matter is that it is virtually impossible to predict whether and when any client/server application (two-, three- or n-tier) will begin to degrade and whether a given configuration can support the desired number of users.
Most discussions on scalability deal with technological reasons for a degradation. What tends to be ignored is the impact the degradation has on the business. As more and more client/server applications can rightfully be termed mission-critical, the impact of systems that fail to scale becomes more acute. Research conducted by Software Productivity Group, Natick, Mass., in 1995 showed that 31.1% of the Fortune 1000 respondents to our survey were using client/server systems for critical business applications. In 1996, that number had risen to 41.1%. Client/server is growing up -- the applications are more central to the heart of the business -- and the stakes are getting higher.
Some companies have already discovered the one way to determine if their system will scale or fail. These organizations are using automated load testing tool suites that provide a virtual software crystal ball into an application's performance characteristics before the application is deployed. Not only can these organizations graphically see the characteristics of an application as it is being implemented, they also can look into the future and see what an increasing load will do to the applications -- and more importantly, to the business functions and users of that application.
Load testing in the financial services sector
Physically moving a multimillion-dollar-a-day financial trading
application in the space of a weekend is not for the faint-hearted.
Add the unknowns of a totally new environment -- new terminals,
new LAN, new servers, new database -- and the move takes on another
dimension of complexity and pressure. Such a move was undertaken
recently by a large Wall Street securities-brokerage firm, which
ADT was asked not to identify.
In the firm's preparation for the combined physical move/database
transition, each major system component was attended to by its team
of experts -- database experts, networking experts, operating system
experts and other experts. Each subsystem had to be checked, re-checked,
and checked again. Yet in this system, as in the vast majority of
client/server systems, there was no unifying element that could
pull everything together and orchestrate a full-bore system level
test.
Real-Time Technology Solutions (RTTS), a New York City-based consulting
firm specializing in automated testing software, was initially brought
in to develop a test methodology, then implement and deploy SQA's
Team Test automated testing system. Bill Hayduk, president of
RTTS, said he started the project looking for application errors.
As the project advanced, he and the other members of the development
team found that load testing offered far more than multi-user error
detection. It provided the basis for consistent, repeatable system
level testing. It is one thing to run a subsystem test to validate
that records can be read from and written to the database. It is
another thing to validate transactions and responses when a few
users are driving the application from their PCs. It is yet another
plateau to systematically emulate an increasing number of users
performing a variety of typical operations and graphically watch
response times and system resource utilization. Said Bill Hayduk,
"We started the project looking for errors. We ended with a framework
for an integrated system level test."
Following the close of trading on a Friday afternoon, the system
was powered down and moved to its new home. The following Monday
morning, a switch was toggled and the system went live. To date,
no major errors have been detected.
Orchestrating an environmental test
The subsystems of a client/server system are like an orchestra. If each component
is tuned and working well with the other subsystems, the result is harmony.
If only one subsystem is off, the result is discord.
Load testing introduces a conductor into the picture -- a conductor that can
direct the operation of the entire environment. If an aberration develops and
a certain part of the score needs work, test cases that isolate the errant operation
can be run again and again until the problem is solved and the system is functioning
properly and within specs.
Just as a conductor cannot guarantee that a member of the orchestra will not
make some unexpected faux pas, automated testing software cannot guarantee error-free
code. However, it can trap -- and has in an increasing number of high-profile
installations -- many of the show stoppers that lie in wait in today's complex
client/server systems. Equally important, it can also provide the knowledge
and confidence that the system will function under a growing user load.
Different approaches
Today's load testing tool suites generally take one of two approaches. Each
approach involves the use of test scripts. Scripts can be generated by recording
actual user input, programmed using a test tool's scripting language, or generated
using a combination -- first recording actual user input and then modifying
those scripts with an editor provided with the test tool. Executing a script
is essentially (and simply) the same as a user sitting at a PC and interacting
with the application.
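To make that concrete, here is a minimal sketch in Python (purely illustrative; the function names and the stubbed transaction are assumptions, not any vendor's scripting language) of what a recorded test script reduces to: the same sequence of operations a user would perform at a PC, expressed as replayable code with response time captured along the way.

```python
import time

def submit_order(account_id: str, quantity: int) -> None:
    """Hypothetical stand-in for one client/server request-response round trip."""
    time.sleep(0.05)  # placeholder; a real script would drive the actual application

def order_entry_script(user_name: str) -> float:
    """One 'recorded' script: the steps a user would perform, timed end to end."""
    start = time.perf_counter()
    submit_order(account_id=user_name, quantity=100)  # step 1: enter an order
    submit_order(account_id=user_name, quantity=250)  # step 2: enter a second order
    return time.perf_counter() - start                # elapsed response time, in seconds

if __name__ == "__main__":
    print(f"single-user response time: {order_entry_script('user01'):.3f}s")
```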
Fig. 1: Single test per processor. The load is driven by test scripts, generated by recording actual user input, executing on multiple PCs. (Source: Software Productivity Group)
Fig. 2: Multiple tests per processor. Test scripts programmatically drive the application from the PCs in essentially the same manner as multiple users would drive an application in a live production environment. (Source: Software Productivity Group)
Fig. 3: Hybrid environment. Some PCs execute individual scripts while multithreaded systems execute multiple simulated users. (Source: Software Productivity Group)
In one approach to load testing, the load is driven by these test scripts executing on multiple PCs. (See Fig. 1.) In the other approach, the load testing software drives multiple instances of the tests on a single system. (See Fig. 2.)
The multiple PC approach offers the most realistic approach to system load
testing. Test scripts programmatically drive the application from the PCs in
essentially the same manner as multiple users would drive an application in
a live production environment. While this approach is intuitively sound, the
reality of the matter is that it is impracticable to build a test system that
reflects a full-blown production configuration. Modern day client/server systems
comprise hundreds, and in some cases thousands, of PC clients. Physically amassing
even a small percentage of those PCs would be a monumental task. For testing
such high-end systems, the simulation approach to load testing is more appropriate.
If the application demands more simulated users than can be accommodated on
one system, multiple systems -- each running multiple virtual users -- can be
added. It is also not unusual to see a combination of the approaches with some
PCs executing individual scripts and the multithreaded systems executing multiple
simulated users. (See Fig. 3.)
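As a rough illustration of the simulated-user approach, the sketch below (Python; the threading harness and the stubbed transaction are assumptions for illustration, not a depiction of any vendor's product) shows how a single driver machine can emulate many virtual users by running one thread per user and aggregating the response times.

```python
import statistics
import threading
import time

def scripted_transaction() -> float:
    """Stand-in for one replayed user transaction; returns elapsed seconds."""
    start = time.perf_counter()
    time.sleep(0.05)  # placeholder for the real request/response round trip
    return time.perf_counter() - start

def run_virtual_users(user_count: int) -> list[float]:
    """Emulate user_count simultaneous users on one machine, one thread per user."""
    timings: list[float] = []

    def worker() -> None:
        timings.append(scripted_transaction())  # list.append is thread-safe in CPython

    threads = [threading.Thread(target=worker) for _ in range(user_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings

if __name__ == "__main__":
    timings = run_virtual_users(50)
    print(f"50 virtual users: mean {statistics.mean(timings):.3f}s, "
          f"max {max(timings):.3f}s")
```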
Initially, test tool vendors offered just one of the testing capabilities.
The single-test-per-PC camp held that performance characteristics could be determined
and then interpolated to ascertain the results of adding more users. The simulated
user camp retorted that interpolation is not reliable when talking about high-end
systems of tremendous complexity. The reality is there is a valid case for both
approaches. In fact, it is now common for vendors to offer both capabilities.
Some companies provide internally built product suites, while others augment
product sets through alliances. The two leaders in the field today are Mercury
Interactive Corp., Sunnyvale, Calif., and the McLean, Va.-based Performix unit
of Pure Atria, Sunnyvale, Calif. Mercury Interactive offers virtual user testing
via its LoadRunner product suite. Performix, acquired by Pure Atria in 1995,
focuses on simulated load testing. Pure Atria will add a GUI-based system from
technology recently acquired with Integrity QA. Those plans have yet to be disclosed.
Newton Centre, Mass.-based Segue Software Inc. and Burlington, Mass.-based SQA Inc. (which has since announced a merger agreement with Rational Software Corp., Santa Clara, Calif.) initially offered their own PC-based load-testing facilities. By the time this article appears in print, Segue Software Inc. will have followed suit with its own simulation facility.
Other vendors offering various load testing facilities include companies such as: the Dallas-based AutoTester unit of Software Recording Corp.; Compuware Corp., Farmington Hills, Mich.; Performance Awareness Corp., Raleigh, N.C.; and Softbridge Inc., Cambridge, Mass.
More than one nugget
Load testing tool suites can offer a veritable gold mine of benefits. The list below identifies only a handful of possibilities. Note that the terms used below may vary from vendor to vendor. The important issue is the range of capabilities these tools offer and the benefits they can bring to the test tool user.
Indeed, determining the focus of a test is one of the challenges facing organizations that opt to use the tools. Every vendor offers classes on how to use its toolset. But the methodology of automated testing -- what to test versus the mechanics of using the tool suite -- is still a relatively new discipline. These tool suites offer so much power and scope that the most appropriate approach is not always clear to an automated testing neophyte.
Multi-user testing This is the classic error detection functionality. Today's client/server systems have an extraordinary number of moving parts. Not only must each part work correctly on its own, the parts must also work together in an extremely complex environment involving application, operating system, middleware, networking and database software. Unfortunately, what may work correctly with only a few users can fall apart under the load of many users. By stressing the system with multi-user testing, many of these insidious errors can be uncovered.
Architectural validation Many organizations are currently in the process of developing computing architectures that will carry them into the next century. It is one thing to talk of strategies and products. It is another thing entirely to put the architecture into action -- to select the products, and then build the first application that can make or break the architecture. One company is well on its way in this process -- Gap Inc., a $4 billion clothing retailer.
Load testing and Gap Inc.
With a chain of over 1700 customer-oriented retail stores, Gap
Inc. has made its mark in the retail industry. Like many organizations
of this size, the company is looking at its computing architecture
and beginning the process of developing an architecture that will
span its computing environment. According to Phil Wilkerson, director
of Technical Architecture, Gap Inc. first had to develop its strategy
and then select the components to implement that strategy. Core
elements of the company's strategy include a three-tier client/server
model, a move to object technology, plus an ability to access corporate
data in an MVS-based mainframe environment. For the mainstay system
components, Gap selected Sun from Sun Microsystems Inc., Mountain
View, Calif., for its server engines, Forté from Forté
Software Inc., Oakland, Calif., for the application development
tool, Tuxedo for the transaction processing monitor and Informix
for the database.
With many of the development components selected, Wilkerson believed
there was one more piece to the puzzle -- automated testing software
that would objectively manage the system level tests and would prove
the architecture could handle the load. According to Wilkerson,
"We needed a mechanism to reduce the fear factor -- to control the
client/server monster."
Mercury Interactive's automated testing tool suite was selected
as that mechanism and put to work on a series of tests. The testing
was a success, according to Wilkerson. The Mercury tool suite allowed
Gap to test whether "the client/server monster" is functioning properly
end-to-end, from the interfaces to network connectivity to the build
process to the business process. Validating the architecture was
not the only hard payback Gap received from these tools, he said.
The retailer said the tools also discovered a memory allocation
problem and a RAID channel problem. As a result of system readjustments,
the end-to-end response times became phenomenal -- sub-second response
time on a test query into a 40,000-product database. Having this
kind of information enables an organization to scale networks and
hardware to handle production loads accessing huge amounts of information.
Configuration tuning Not only are there more discrete pieces in today's
systems, each component alone often involves complex configuration options that
can ripple through a system and ultimately affect response time. Manual testing
cannot provide the consistent and repetitive testing environment needed to truly
stress a system and objectively capture the performance metrics. Load testing,
on the other hand, can take configuration optimization from a subjective realm
into a world of objective charts and graphs that show the effect of various
configuration optimizations. Another issue here is whether a performance bottleneck
is the result of a problem with the client/server model (such as two-tier versus
three-tier) or because of a hardware configuration problem. At Gap, for example,
the hardware caused the problem and a modification to the configuration had
a very positive impact on system performance.
System upgrade comparisons Some of the most interesting aspects of
the growing acceptance and maturity of the load testing market are the various
ways companies use these tools. One of the most resourceful uses involves a
system upgrade "shoot-out."
With the flexibility in and competitiveness between today's hardware systems,
choosing from the available options for a system upgrade can be a daunting task.
Load testing can provide an objective measure of these options -- essentially
using the very application that will run on the upgraded hardware. Not only
can a company see what the upgrade will (or will not) do in terms of response
time for the existing number of users, the company can increase the number of
users to determine how much headroom the new hardware offers before another
upgrade is required.
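A simple way to picture such a shoot-out, assuming the kind of virtual-user harness sketched earlier: step the simulated load upward on each candidate configuration and note where an assumed response-time target is first exceeded. Everything in the Python sketch below (the 1.0-second target, the step sizes, the stubbed transaction) is a hypothetical placeholder rather than a real benchmark.

```python
import statistics
import threading
import time

RESPONSE_TARGET_SECONDS = 1.0  # assumed service-level target, for illustration only

def scripted_transaction() -> float:
    """Stand-in for one replayed user transaction; returns elapsed seconds."""
    start = time.perf_counter()
    time.sleep(0.05)  # placeholder for the real round trip
    return time.perf_counter() - start

def p95_for(user_count: int) -> float:
    """Run user_count concurrent virtual users once; return a rough 95th percentile."""
    timings: list[float] = []

    def worker() -> None:
        timings.append(scripted_transaction())

    threads = [threading.Thread(target=worker) for _ in range(user_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statistics.quantiles(timings, n=20)[-1]

def find_headroom(max_users: int = 500, step: int = 50) -> int:
    """Largest tested user count whose 95th percentile still meets the target."""
    supported = 0
    for users in range(step, max_users + 1, step):
        p95 = p95_for(users)
        print(f"{users:4d} users -> 95th percentile {p95:.3f}s")
        if p95 > RESPONSE_TARGET_SECONDS:
            break
        supported = users
    return supported

if __name__ == "__main__":
    print(f"headroom before the target is exceeded: {find_headroom()} users")
```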
User-acceptance testing User acceptance is always a relatively sticky issue;
load testing software provides a mechanism for demonstrating to
the acceptance team how a system performs under load. Load testing can also
provide feedback to a development team and can be used to measure how the team
is doing against any performance requirements defined by the system specifications.
System software revision verification Some of the more interesting times in a development group can occur when it upgrades an application to a new version of a vendor's packaged software. Whether an operating system, middleware or other system software component, a new software version can cause havoc in an application that had been working perfectly. Load testing software can provide a safe pre-deployment venue for discovering multi-user errors and can compare performance characteristics with those from the previous version. With the time, effort and expense involved in deploying client/server systems, the last thing an organization wants to do is distribute an application that is inferior to an earlier version.
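As a minimal sketch of that comparison step (Python; the timing samples and the 10% tolerance are invented placeholders, not measurements), the same scripted workload can be replayed against both versions and the captured response times summarized side by side, with a simple regression check before deployment:

```python
import statistics

# Hypothetical response-time samples (in seconds) captured by replaying the same
# scripted load against the current release and the new software version.
baseline_times = [0.42, 0.45, 0.40, 0.47, 0.44, 0.51, 0.43]
upgraded_times = [0.48, 0.55, 0.47, 0.59, 0.52, 0.61, 0.50]

def summarize(label: str, samples: list[float]) -> None:
    print(f"{label:>9}: mean {statistics.mean(samples):.3f}s, max {max(samples):.3f}s")

summarize("baseline", baseline_times)
summarize("upgraded", upgraded_times)

# Flag the new version if mean response time degrades by more than an assumed
# 10% tolerance; a real threshold would come from the system's own specifications.
degradation = statistics.mean(upgraded_times) / statistics.mean(baseline_times) - 1.0
print(f"mean response time changed by {degradation:+.1%}")
if degradation > 0.10:
    print("WARNING: new version exceeds the regression tolerance; investigate before deployment")
```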
Not a free lunch
There is tremendous potential within load testing tool suites. There is, however, a price to pay financially and, perhaps more importantly, in terms of organizational operations and infrastructure. The purchase price of these tool suites is only the tip of the iceberg.
Staffing The staffing issue is often one of the hidden land mines in automated testing, and the majority of companies underestimate the quality of staff needed to develop and manage an automated testing effort.
Companies have come to understand that implementing a client/server system is a non-trivial exercise. Testing such systems is no less complex. Therein lies the surprise for many organizations. Test tool suites are software systems in their own right and require all the resources necessary for the care and feeding of such a system. The skills present in a manual quality assurance (QA) group do not meet the needs of an automation-based group and must be augmented with software development skillsets.
Many companies that have made the leap to automated testing now recognize this fact of life and have turned to a growing number of third-party testing experts for help with their first project. While this strategy requires an additional up-front expenditure, it can be the most effective way to:
- acquire intelligence about the methodology of automated testing;
- establish a formal automated testing program; and
- test an application.
Infrastructure This aspect may be the most difficult area for many companies to tackle. In many organizations, automated testing tools fall under the auspices of a QA group. In spite of the emphasis on quality, the QA function is frequently relegated to the bottom of the development food chain. For automated testing tools to be effective, such thinking must change. Testing tools work best when integrated into the development cycle from the start. As development progresses, testing can often identify issues early on, before changes become a major undertaking. To do this, the QA and development groups must work hand-in-hand. With development groups' time at a premium, this can be a fairly significant barrier.
The Web -- Let the game begin!
Most end-user client/server systems are currently intra-company; that is, the users are typically employees of that company. As such, any dissatisfaction regarding performance is within the confines of the company, and solutions can be negotiated within the realm of internal politics and corporate memos.
As companies expose their business systems to the public, performance will become a critical factor. And we need not be talking about high-profile, business-critical applications. Someone just browsing for information will soon become frustrated with a lethargic system. That company, in the brief space of a mouse click, can lose a customer. With serious buy-sell business applications and the Web's ability to even the competitive playing field, the issue is magnified. With such applications, the question is not whether you should consider these tools, but when and which ones.