In-Depth
Put Your Apps to the Test
- By Linda L. Briggs
- November 1, 2005
In many years of testing software application performance at a variety of companies,
quality engineer Carl Gerstle says he's seen nearly every application he's ever
tested fail, often at just 1 to 5 percent of its intended load. That means that
when he first tests an application intended for 1,000 users, "somewhere around
10 users, it falls over."
Gerstle, currently a principal quality engineer at a distributed Web and content
services company called Mirror Image Internet, worked on hardware, software
and design teams before becoming a QA engineer. "As smart as the people are
who design the code and the systems," Gerstle says, "the systems are complex
enough that if you don't get to see how they all interact before you field the
code, you're usually in for a horrible embarrassment."
That sort of commentary points to the need for application performance testing
software, along with the expertise to apply it correctly. Rapidly increasing
user loads, wildly complex software and distributed development that includes
worldwide outsourcing all ratchet up the need for more and better application
testing processes and tools.
Simulated software in the real world
Application performance testing software usually describes tools used during
and after product development to simulate user loads and ensure that an application
remains available and accurate under its anticipated use. The term also typically
includes performance optimization: analyzing code to see whether there are patterns causing it to perform poorly. While the speed and accuracy of Internet sites are big considerations today, in-house applications also need testing, especially
if they affect the business.
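In rough terms, that is all a load simulation does: stand up many concurrent virtual users, check that the application stays available and returns correct content, and record how latency behaves. The sketch below is a minimal, tool-agnostic illustration in plain Python; the URL, expected text and user counts are placeholders, not anything from the products discussed in this story.

```python
# Minimal illustration (not any vendor's tool): simulate N concurrent virtual
# users hitting one URL and report availability (errors) and latency.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

TARGET_URL = "https://example.com/"   # placeholder application under test
EXPECTED_TEXT = b"Example Domain"     # "accuracy" check: response content is right
VIRTUAL_USERS = 25
REQUESTS_PER_USER = 4

def one_user() -> list[tuple[float, bool]]:
    """Each virtual user issues a few requests, recording latency and correctness."""
    samples = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        try:
            with urlopen(TARGET_URL, timeout=10) as resp:
                ok = resp.status == 200 and EXPECTED_TEXT in resp.read()
        except Exception:
            ok = False
        samples.append((time.perf_counter() - start, ok))
    return samples

with ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
    results = [s for user in pool.map(lambda _: one_user(), range(VIRTUAL_USERS))
               for s in user]

latencies = [latency for latency, _ in results]
errors = sum(1 for _, ok in results if not ok)
print(f"requests: {len(results)}  errors: {errors}")
print(f"median latency: {statistics.median(latencies):.3f}s  "
      f"max latency: {max(latencies):.3f}s")
```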
"Anything customer-facing should be tested," according to Gartner analyst Theresa
Lanowitz. That includes the Web sites used by help desks or order centers in
response to customer phone calls. When you call an airline or a catalog company
and hear, "Oh, the computers are slow today," Lanowitz says, "it's because they
haven't done proper performance testing. You might think [those applications]
aren't customer-facing, but in reality they are."
Purchasing a performance testing package, creating tests, running them and
interpreting the results is one way to beef up your software application testing
process. However, Lanowitz warns, a good software performance testing strategy
requires a degree of expertise that many companies lack. "The number of companies
with performance engineers on staff is really very small," she says. (See related
story, "Testing on the outside," below.)
With any kind of testing, of course, the rule is to begin testing as early
as possible, because study after study shows the cost of making a repair increases
exponentially the deeper developers are into the development lifecycle. "The
worst-case scenario," says Forrester analyst Carey Schwaber, "is testing [for
the first time] in production."
Testing on the outside
Even if companies regularly handle some testing functions internally, they may want to outsource larger tests. One advantage of outsourcing to a specialist is the savings on the extra bandwidth a company would otherwise need to rent to adequately test a large application. According to Mike McCormick with AppLabs, for example, his company “already has infrastructure outside of the firewall, out in the cloud, at different data centers, that is used to drive the tests.” McCormick says AppLabs has some 550 megabits of bandwidth available across the U.S. for testing customer applications, and more can be added if needed.
There’s another advantage to outsourcing to a company that tests applications for a living: its staff includes specialists who performance-test a wide range of applications for many different customers. That experience can help with some of the considerable challenges of good performance testing, such as understanding and then applying the results.
Done correctly, performance testing isn’t simple, according to Gartner analyst Theresa Lanowitz. “It’s not something where anyone can take one of these tools and get a great bang for the buck.” Although writing the actual scripts typically isn’t difficult, she says, interpreting the resulting reports can be. “You have to figure out if [the results] are about the application or the infrastructure or the architecture. Performance testing is not something for the weak of heart.”
—Linda L. Briggs
Different jobs have different tools
Testing tools are generally priced in terms of the load simulated. Vendors sell
virtual user packs, in which cost is based on the size of the user load. Other
sorts of tests may overlap with or fall within performance testing. Load testing,
for example, is generally a subset of performance testing; functional testing
compares actual performance to the functional spec. According to Schwaber, some
vendors offer some degree of integration between functional testing and performance
testing. That allows reuse of functional test scripts created earlier, as use
cases for performance testing. (See related story, "Right tool for the right
job," on below.)
At Mirror Image Internet, Gerstle's company, the QA group is responsible for
testing across a wide range of hardware and software platforms, including Windows
NT and 2000, Sun Solaris and various versions of Linux and Apache. Through huge facilities worldwide, the company provides
distributed Web and content services that include
streaming media and managed caching.
Gerstle uses a load and performance testing tool, Segue Software's SilkPerformer 6.5. The product's powerful dedicated scripting language is a big draw for Gerstle, since it lets him write test scripts by hand, generate them through recording sessions, or use a combination of the two.
Recording a sequence of keystrokes into a script for later playback is one
way to create test scripts, although it has limitations in flexibility. In Gerstle's
case, he often uses recorded sessions as a foundation "to get the actual flavor
of the protocol...what applications and servers are saying to each other and
to users." Later, he might shape the performance emulation script further by
hand-coding. When external code is needed, Gerstle says, his staff writes it,
then calls the code as a DLL from SilkPerformer.
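SilkPerformer's own mechanism for this isn't shown here, but the general pattern, a test script loading compiled code and calling into it where the scripting language runs out, looks roughly like this Python/ctypes analogue. The C runtime's abs() stands in for the custom library a QA team would actually write:

```python
# Rough analogue of calling out to compiled native code from a test script.
# Here ctypes loads the C runtime and calls a trivial function; in practice the
# library would be custom code the QA staff wrote (e.g., a protocol encoder).
import ctypes
import ctypes.util

libc_name = ctypes.util.find_library("c") or "msvcrt"  # POSIX libc, or Windows CRT
libc = ctypes.CDLL(libc_name)

libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# The test script can now mix scripted steps with native calls where needed.
print(libc.abs(-42))  # -> 42
```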
Although SilkPerformer supports a wide array of protocols and can access all
layers of an enterprise application stack, that isn't currently its principal draw
for Gerstle. "In the space I'm working in now, I really only need to be able
to talk to databases and to talk over HTTP," he says. "To a large extent, I'm
testing either right at the database, [or] I'm communicating over OCI, or I'm
testing over HTTP and HTTPS."
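Testing "right at the database" simply means timing queries with no web tier in between. A hedged sketch of that idea, using the python-oracledb driver as a stand-in for OCI-level access; the connection details and the orders table are invented for illustration:

```python
# Sketch of timing a query directly against the database, no web tier involved.
# python-oracledb stands in for OCI-level access; credentials, DSN and the
# orders table are placeholders.
import time
import oracledb  # pip install oracledb

conn = oracledb.connect(user="qa_user", password="not_a_real_password",
                        dsn="dbhost.example.com/ORCLPDB1")
cur = conn.cursor()

start = time.perf_counter()
cur.execute("SELECT COUNT(*) FROM orders WHERE status = :s", {"s": "OPEN"})
rows = cur.fetchall()
elapsed = time.perf_counter() - start

print(f"rows: {rows[0][0]}  query time: {elapsed:.3f}s")
conn.close()
```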
Performance testing large loads
One solution for companies interested in introducing or increasing their application
testing is outsourcing the job. AppLabs Technologies, for example, is a global
IT services company specializing in software testing and development. Mike McCormick,
manager of partner solutions and business development, used RadView's WebLoad
to test applications for customers ranging from online retail businesses to
financial services to travel to news agencies.
AppLabs used RadView's product, for example, to test the MSNBC site for the
2002 Olympics in Salt Lake City, at loads exceeding 100,000 users. AppLabs uses
a variety of testing products from different vendors, selecting the appropriate
tool depending on the application under test. When the testing firm tried WebLoad
initially, it scaled well as virtual users were added.
For AppLabs, another plus with WebLoad is its use of JavaScript as a scripting
language, rather than the sort of proprietary language that SilkPerformer uses.
Use of a well-known language such as JavaScript "makes it easier for us to bring
on new people and get them up to speed," McCormick says.
Since JavaScript supports regular expressions, it gives the AppLabs engineers more control over complex testing engagements, along with the ability to read from and write
to files. "That really enhances the capability of the tool," McCormick says.
"You can create a single test script that's flexible enough to represent multiple
users and react appropriately to dynamic data returned by the Web site." For
example, to simulate various users in applications that require logins, the
AppLabs staff can write test scripts that automatically insert unique user names
and passwords.
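Conceptually, that parameterization is just file input plus a regular expression applied to each response. The sketch below shows the idea in Python rather than WebLoad's JavaScript; the credentials file, login URL and session-token markup are all invented for illustration:

```python
# Sketch of a parameterized login script: each virtual user pulls its own
# credentials from a file, and a regular expression extracts dynamic data
# (here an invented session token) from the server's response.
import re
import urllib.parse
import urllib.request

# credentials.csv is hypothetical: one "username,password" pair per line.
with open("credentials.csv", encoding="utf-8") as f:
    credentials = [line.strip().split(",") for line in f if line.strip()]

TOKEN_PATTERN = re.compile(r'name="session_token" value="([^"]+)"')  # made-up markup

def login(username: str, password: str) -> str:
    data = urllib.parse.urlencode({"user": username, "pass": password}).encode()
    with urllib.request.urlopen("https://example.com/login", data=data,
                                timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    match = TOKEN_PATTERN.search(body)
    if match is None:
        raise RuntimeError(f"no session token returned for {username}")
    return match.group(1)  # reused in the rest of that virtual user's script

for username, password in credentials:
    token = login(username, password)
    print(username, "->", token[:8])
```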
Right tool for the right job
Selecting the right performance and load testing tool for the job is something that AppLabs does on a regular basis. The global IT services company specializes in software testing and development, with customers such as American Airlines, InstallShield, Hewlett-Packard, Novell and SAP. The company tests applications for customers using a range of different tools from different vendors.
So what does AppLabs look for when selecting from its range of available testing tools? Once the company knows the characteristics of the application it will be testing, according to Mike McCormick, manager of partner solutions and business development for the labs, it first weighs whether a tool is applicable to the particular test. The product he selects to test a .NET application might be different from the one used to load test a Citrix implementation.
Second, AppLabs looks at the functionality available in the tool, McCormick says. Does it provide the flexibility needed for the application under test? Third, he considers the stability of the product, including how many users the testing tool can scale up to and then sustain.
In making the final purchase decision on a testing product, the company also weighs ease of use, the availability of the tool—meaning the ease of working with the vendor on licensing—and finally, pricing.
—Linda L. Briggs
Somewhere there's a payoff
Iron Mountain saw the performance testing tool it installed four years ago pay
off in cost reductions and more. According to Michael Anthony, manager of corporate
systems' quality assurance for the records and information management company,
"we've seen a direct reduction in the amount of time and effort to test [systems.]"
Iron Mountain uses Mercury Interactive's LoadRunner for testing both Web-based
and character-based systems, including Oracle, Siebel and two proprietary systems.
Mercury's ability to handle the company's proprietary character-based systems
was an important determinant in selecting it, Anthony says. Another plus: The
product offers reports that can be customized with a high-level view for management,
or can allow network staff to drill down to the bits per second level if necessary.
Five years ago, Anthony says, Iron Mountain realized it needed a performance
testing system. The company has grown rapidly over the past 10 years, to $1.8 billion
in gross revenue in 2004. Iron Mountain's large databases and diversity of systems,
Anthony says, made automating the testing process a must. "The databases that
we use are huge...There's a critical need for performance testing."
After ramp-up and training costs, LoadRunner has provided a solid return on
investment by reducing the testing labor pool, dramatically increasing how quickly
tests can be performed and increasing the sophistication of the testing process.
"Some load and performance testing just cannot be done in a practical way"
using manual testers, Anthony says. In addition, he's found that automated testing
is faster and more structured, producing results that can be documented and
repeated. If incremental changes are made to a system, for example, or a new
version is installed, the same performance test can be rerun, and results compared.
"It goes much more quickly than someone sitting at a terminal manually operating
the system," he says.
Using a performance testing tool also has forced more structured processes
on the company. Anthony and his staff run standard, repetitive tests regularly
against the Oracle and Siebel systems, for example, some as often as weekly.
"It's forced us to look at the updates we were doing and make sure our automation
scripts were in sync with our updates. It's a way to guarantee that incremental
updates to our production systems land in our regression test pool." (See related
story, "Lifecycle under new management," below.)
Lifecycle under new management
The growing focus on lifecycle management in software development is spilling over into performance testing. According to Forrester analyst Carey Schwaber, lifecycle management simply means determining what changes you make to an application and when you make them. With performance testing, it can mean managing the lifecycle so performance testing takes place at appropriate points in the lifecycle and is followed by appropriate changes back to the application.
For example, Schwaber says, a lifecycle management system that includes performance testing might mean QA engineers or performance testers use performance testing software to run a script, uncovering a performance problem. They send that information back through a defect tracking system to the developer. The developer then makes a change to the application. The change is tracked by the change management system and linked back to the original request from the performance testing software. Finally, the system notifies the performance test group that the problem was resolved and the software should be retested.
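In data terms, that round trip is a chain of linked records across tools. A hypothetical sketch of the linkage; the record types and field names are invented, not any vendor's schema:

```python
# Hypothetical records showing how the round trip links together; the field
# names are invented, not any vendor's schema.
from dataclasses import dataclass

@dataclass
class PerfTestRun:          # produced by the performance testing tool
    run_id: str
    script: str
    finding: str

@dataclass
class Defect:               # tracked in the defect tracking system
    defect_id: str
    source_run: str         # link back to the performance test run
    status: str = "open"

@dataclass
class CodeChange:           # recorded by the change management system
    change_id: str
    fixes_defect: str       # link back to the defect

run = PerfTestRun("run-118", "checkout_load.js", "slow response at 500 users")
defect = Defect("DEF-2041", source_run=run.run_id)
change = CodeChange("CHG-7733", fixes_defect=defect.defect_id)

# Once the change lands, the defect is resolved and the testers are told to retest.
defect.status = "resolved: retest with " + run.script
print(defect)
```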
One vendor who does this well is IBM, according to Schwaber, with its Performance Optimization Toolkit. The toolkit includes Rational Optimization Developer as the IDE, ClearQuest as the defect tracking system, ClearCase as the version control system or software configuration management tool, and Performance Tester as the performance testing software. That suite comprises a lifecycle management system.
“The reason they’re able to do so well is because they also have very strong development tools,” Schwaber says. “IBM’s differentiator is they’re able to get down to the line of code instead of just the component.”
—Linda L. Briggs
Testing .NET applications
One challenge with performance testing .NET applications is their complexity.
At Dataquick, a real property information firm, the QA staff uses an integrated
group of products from Empirix called e-TEST suite for load testing and test
management of the company's large property information database. At the all-Microsoft
firm, applications are built in .NET and tested using an Empirix add-on module
specifically designed for .NET testing. The product "knew exactly what kind
of performance [issues] to look at in .NET apps," according to QA tester Glen
Pascual. "A lot of our developers were pretty impressed with how it tested .NET."
According to Pascual, the automation functions in the Empirix product make
it a good choice for his company. The suite includes e-Load, which Dataquick uses for performance testing, load testing and stress testing. Using e-Load,
Pascual says, he can add more users or more load, while maintaining a dynamic
view of the testing process. "We could say, let's add 30 users and run it for
12 hours. If that passes, we can add more scripts to that test, and we can benchmark
it."
Because Empirix uses the same scripting language for testing and monitoring,
scripts can be run over and over at different times. If a performance problem
crops up in production because of a software patch, for example, the original
test script can be rerun, and its results compared to earlier reports. The result
"helps development know where our problems are and where we need to focus,"
Pascual says.
Before Empirix, Dataquick used a Windows-based stress-test application, Pascual
says. However, the staff found test results vague and hard to interpret. With
the Empirix products, he says, the graphical user interface tells him "exactly
what [I'm] looking at."
Testing gets some respect
The increasing complexity of software, driven in part by service-oriented architecture,
is starting to give performance testing the respect it deserves. At AppLabs,
which specializes in testing software for customers, McCormick says his company
has seen an increased buzz around performance testing during the last year or
so.
"Performance has been one of those things that has been largely ignored" as
firms focused on features, functionality and getting product to market, Gartner's
Lanowitz says. During the height of the Internet craze, she says, "many dotcoms
died because performance was ignored." That's changing as companies realize
the business value of solid application performance. (See related story, "Don't
upset the developers," below.)
"There is a huge and ongoing need to do this kind of testing," Gerstle says.
A well-behaved system, he says, shouldn't "go up in flames" when pushed to and
beyond capacity. Instead, "we [expect] performance to degrade gracefully as
load comes on. And hopefully, that degradation happens well beyond the expected
maximum that you want the system to perform at."
Don’t upset the developers
Assuming companies choose and implement the right application performance testing product for the job, their QA groups then face another big challenge— learning to work with the various developer groups whose products they’re essentially criticizing.
After all, testing a product’s performance and reliability often means pointing out the problems and sending it back for fixes.
At Mirror Image Internet, principal quality engineer Carl Gerstle fields a team of QA engineers staffed one-to-one with the development group, a far more balanced ratio than at most companies.
Gerstle maintains harmonious relations with developers, partly by hiring technically savvy QA engineers, and by counseling junior developers on the art of collaboration, “so the developers understand that we’re not throwing rocks at them,” Gerstle says. “You have to get [beyond] the point of thinking ‘Part of my job is to upset developers,’ which it isn’t.”
It helps that Gerstle himself held a variety of technical jobs, including hardware development, software development, systems engineering and user interface testing. “I can look at a complex system from most angles, including the user’s. A good job of performance engineering includes empathy for the end user as well as [understanding] the system on which the code base is running,” he says.
—Linda L. Briggs