In-Depth

Put Your Apps to the Test

Performance Testing Software

In many years of testing software application performance at a variety of companies, quality engineer Carl Gerstle says he's seen nearly every application he's ever tested fail, often at just 1 to 5 percent of its intended load. That means that when he first tests an application intended for 1,000 users, "somewhere around 10 users, it falls over."

Gerstle, currently a principal quality engineer at a distributed Web and content services company called Mirror Image Internet, worked on hardware, software and design teams before becoming a QA engineer. "As smart as the people are who design the code and the systems," Gerstle says, "the systems are complex enough that if you don't get to see how they all interact before you field the code, you're usually in for a horrible embarrassment."

That sort of commentary points to the need for application performance testing software, along with the expertise to apply it correctly. Rapidly increasing user loads, wildly complex software and distributed development that includes worldwide outsourcing all ratchet up the need for more and better application testing processes and tools.

Simulated software in the real world
Application performance testing software generally refers to tools used during and after product development to simulate user loads and ensure that an application remains available and accurate under its anticipated use. The term also typically covers performance optimization: analyzing code for patterns that cause it to perform poorly. While the speed and accuracy of Internet sites is a big consideration today, in-house applications also need testing, especially if they affect the business.

"Anything customer-facing should be tested," according to Gartner analyst Theresa Lanowitz. That includes the Web sites used by help desks or order centers in response to customer phone calls. When you call an airline or a catalog company and hear, "Oh, the computers are slow today," Lanowitz says, "it's because they haven't done proper performance testing. You might think [those applications] aren't customer-facing, but in reality they are."

Purchasing a performance testing package, creating tests, running them and interpreting the results is one way to beef up your software application testing process. However, Lanowitz warns, a good software performance testing strategy requires a degree of expertise that many companies lack. "The number of companies with performance engineers on staff is really very small," she says. (See related story, "Testing on the outside," below.)

With any kind of testing, of course, the rule is to begin testing as early as possible, because study after study shows the cost of making a repair increases exponentially the deeper developers are into the development lifecycle. "The worst-case scenario," says Forrester analyst Carey Schwaber, "is testing [for the first time] in production."

Testing on the outside

Even if companies regularly handle some testing functions internally, they may want to outsource larger tests. One advantage of outsourcing to a specialist is avoiding the cost of renting the extra bandwidth they’d need to adequately test a large application. According to Mike McCormick with AppLabs, for example, his company “already has infrastructure outside of the firewall, out in the cloud, at different data centers, that is used to drive the tests.” McCormick says AppLabs has some 550 megabits of bandwidth available across the U.S. for testing customer applications, and more can be added if needed.

There’s another advantage to outsourcing to a company that tests applications for a living: its experts performance-test a wide range of applications for many customers. That experience can help with some of the considerable challenges of good performance testing, such as understanding and then applying the results.

Done correctly, performance testing isn’t simple, according to Gartner analyst Theresa Lanowitz. “It’s not something where anyone can take one of these tools and get a great bang for the buck.” Although writing the actual scripts typically isn’t difficult, she says, interpreting the resulting reports can be. “You have to figure out if [the results] are about the application or the infrastructure or the architecture. Performance testing is not something for the weak of heart.”

Linda L. Briggs

Different jobs have different tools
Testing tools are generally priced in terms of the load simulated; vendors sell virtual user packs, with cost based on the size of the user load. Other sorts of tests may overlap with or fall within performance testing. Load testing, for example, is generally a subset of performance testing, while functional testing compares an application's actual behavior to its functional spec. According to Schwaber, some vendors offer a degree of integration between functional and performance testing, which allows functional test scripts created earlier to be reused as use cases for performance testing. (See related story, "Right tool for the right job," below.)

At Mirror Image Internet, Gerstle's company, the QA group is responsible for testing across a wide range of hardware and software platforms, including Windows NT and 2000, Sun Solaris and various versions of Linux and Apache. Through huge facilities worldwide, the company provides distributed Web and content services that include streaming media and managed caching.

Gerstle uses a load and performance testing tool—Segue Software’s SilkPerformer 6.5. The product’s powerful dedicated scripting language is a big draw for Gerstle, since it allows him to write test scripts natively, or produce them through recording sessions, or use a combination.

Recording a sequence of keystrokes into a script for later playback is one way to create test scripts, although it has limitations in flexibility. In Gerstle's case, he often uses recorded sessions as a foundation "to get the actual flavor of the protocol...what applications and servers are saying to each other and to users." Later, he might shape the performance emulation script further by hand-coding. When external code is needed, Gerstle says, his staff writes it, then calls the code as a DLL from SilkPerformer.

Although SilkPerformer supports a wide array of protocols and can access all layers of an enterprise application stack, that isn't its principal draw currently for Gerstle. "In the space I'm working in now, I really only need to be able to talk to databases and to talk over HTTP," he says. "To a large extent, I'm testing either right at the database, [or] I'm communicating over OCI, or I'm testing over HTTP and HTTPS."

Performance testing large loads
One solution for companies interested in introducing or expanding their application testing is outsourcing the job. AppLabs Technologies, for example, is a global IT services company specializing in software testing and development. Mike McCormick, manager of partner solutions and business development, has used RadView's WebLoad to test applications for customers ranging from online retailers to financial services, travel and news organizations.

AppLabs used RadView's product, for example, to test the MSNBC site for the 2002 Olympics in Salt Lake City at loads exceeding 100,000 users. When the testing firm first tried WebLoad, the product scaled well as virtual users were added. AppLabs uses a variety of testing products from different vendors, selecting the appropriate tool for the application under test.

For AppLabs, another plus with WebLoad is its use of JavaScript as a scripting language, rather than the sort of proprietary language that SilkPerformer uses. Use of a well-known language such as JavaScript "makes it easier for us to bring on new people and get them up to speed," McCormick says.

Because JavaScript supports regular expressions and can read and write files, it gives the AppLabs engineers more control in complex testing engagements. "That really enhances the capability of the tool," McCormick says. "You can create a single test script that's flexible enough to represent multiple users and react appropriately to dynamic data returned by the Web site." For example, to simulate various users in applications that require logins, the AppLabs staff can write test scripts that automatically insert unique user names and passwords.
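To make that concrete, here is a minimal sketch of the kind of script McCormick describes. It is written in plain TypeScript/JavaScript rather than WebLoad's actual API, and the URLs, form fields and credentials file are hypothetical: each virtual user pulls unique credentials from a data file, a regular expression captures a dynamic value returned by the site, and timings are written out so runs can be compared later.

```typescript
// Illustrative only: not WebLoad's API. URLs, field names and files are hypothetical.
import { readFileSync, appendFileSync } from "node:fs";

// One "username,password" line per virtual user in a hypothetical data file.
const credentials: string[][] = readFileSync("users.csv", "utf8")
  .trim()
  .split("\n")
  .map((line) => line.split(","));

async function virtualUser(index: number): Promise<void> {
  // Give each simulated user its own login, as the AppLabs scripts do.
  const [username, password] = credentials[index % credentials.length];

  const loginResponse = await fetch("https://example.test/login", {
    method: "POST",
    body: new URLSearchParams({ username, password }),
  });
  const page = await loginResponse.text();

  // Use a regular expression to capture dynamic data returned by the site,
  // here a session token that must be sent back on the next request.
  const token = page.match(/name="sessionToken" value="([^"]+)"/)?.[1];
  if (!token) throw new Error(`no session token returned for ${username}`);

  const start = Date.now();
  await fetch(`https://example.test/account?sessionToken=${token}`);

  // Record a timing so this run can be compared with earlier ones.
  appendFileSync("results.csv", `${username},${Date.now() - start}\n`);
}

// Drive a small batch of concurrent virtual users.
await Promise.all(Array.from({ length: 30 }, (_, i) => virtualUser(i)));
```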

Right tool for the right job

Selecting the right performance and load testing tool for the job is something that AppLabs does on a regular basis. The global IT services company specializes in software testing and development, with customers such as American Airlines, InstallShield, Hewlett-Packard, Novell and SAP. The company tests applications for customers using a range of different tools from different vendors.

So what does AppLabs look for when selecting from the company’s range of available testing tools? Once the company knows the characteristics of the application they’ll be testing, according to Mike McCormick, manager of partner solutions and business development for the labs, the company first weighs whether a tool is applicable to a particular test. The product he selects to test a .NET application might be different from that used to load test a Citrix implementation.

Second, AppLabs looks at the functionality available in the tool, McCormick says. Does it provide the flexibility needed for the application under test? Third, he considers the stability of the product, including how many users the testing tool can scale up to and then sustain.

In making the final purchase decision on a testing product, the company also weighs ease of use, the availability of the tool—meaning the ease of working with the vendor on licensing—and finally, pricing.

Linda L. Briggs

Somewhere there's a payoff
Iron Mountain saw the performance testing tool it installed four years ago pay off in cost reductions and more. According to Michael Anthony, manager of corporate systems' quality assurance for the records and information management company, "we've seen a direct reduction in the amount of time and effort to test [systems]."

Iron Mountain uses Mercury Interactive's LoadRunner for testing both Web-based and character-based systems, including Oracle, Siebel and two proprietary systems.

Mercury's ability to handle the company's proprietary character-based systems was an important determinant in selecting it, Anthony says. Another plus: The product offers reports that can be customized with a high-level view for management, or can allow network staff to drill down to the bits per second level if necessary.

Five years ago, Anthony says, Iron Mountain realized it needed a performance testing system. The company has grown rapidly over the past 10 years, to $1.8 billion in gross revenue in 2004. Iron Mountain's large databases and diversity of systems, Anthony says, made automating the testing process a must. "The databases that we use are huge...There's a critical need for performance testing."

After ramp-up and training costs, LoadRunner has provided a solid return on investment by reducing the testing labor pool, dramatically increasing how quickly tests can be performed and increasing the sophistication of the testing process.

"Some load and performance testing just cannot be done in a practical way" using manual testers, Anthony says. In addition, he's found that automated testing is faster and more structured, producing results that can be documented and repeated. If incremental changes are made to a system, for example, or a new version is installed, the same performance test can be rerun, and results compared. "It goes much more quickly than someone sitting at a terminal manually operating the system," he says.

Using a performance testing tool also has forced more structured processes on the company. Anthony and his staff run standard, repetitive tests regularly against the Oracle and Siebel systems, for example, some as often as weekly. "It's forced us to look at the updates we were doing and make sure our automation scripts were in sync with our updates. It's a way to guarantee that incremental updates to our production systems land in our regression test pool." (See related story, "Lifecycle under new management," below.)

Lifecycle under new management

The growing focus on lifecycle management in software development is spilling over into performance testing. According to Forrester analyst Carey Schwaber, lifecycle management simply means determining what changes you make to an application and when you make them. Applied to performance testing, it means making sure tests run at appropriate points in the lifecycle and are followed by appropriate changes back to the application.

For example, Schwaber says, a lifecycle management system that includes performance testing might mean QA engineers or performance testers use performance testing software to run a script, uncovering a performance problem. They send that information back through a defect tracking system to the developer. The developer then makes a change to the application. The change is tracked by the change management system and linked back to the original request from the performance testing software. Finally, the system notifies the performance test group that the problem was resolved and the software should be retested.

One vendor who does this well is IBM, according to Schwaber, with its Performance Optimization Toolkit. The toolkit includes Rational Optimization Developer as the IDE, ClearQuest as the defect tracking system, ClearCase as the version control system or software configuration management tool, and Performance Tester as the performance testing software. That suite comprises a lifecycle management system.

“The reason they’re able to do so well is because they also have very strong development tools,” Schwaber says. “IBM’s differentiator is they’re able to get down to the line of code instead of just the component.”

Linda L. Briggs

Testing .NET applications
One challenge with performance testing .NET applications is their complexity. At Dataquick, a real property information firm, the QA staff uses an integrated group of products from Empirix called e-TEST Suite for load testing and test management of the company's large property information database. At the all-Microsoft firm, applications are built in .NET and tested using an Empirix add-on module specifically designed for .NET testing. The product "knew exactly what kind of performance [issues] to look at in .NET apps," according to QA tester Glen Pascual. "A lot of our developers were pretty impressed with how it tested .NET."

According to Pascual, the automation functions in the Empirix product make it a good choice for his company. The suite includes e-Load, which Dataquick uses for performance testing, load testing and stress testing. Using e-Load, Pascual says, he can add more users or more load, while maintaining a dynamic view of the testing process. "We could say, let's add 30 users and run it for 12 hours. If that passes, we can add more scripts to that test, and we can benchmark it."

Because Empirix uses the same scripting language for testing and monitoring, scripts can be run over and over at different times. If a performance problem crops up in production because of a software patch, for example, the original test script can be rerun, and its results compared to earlier reports. The result "helps development know where our problems are and where we need to focus," Pascual says.

Before Empirix, Dataquick used a Windows-based stress-test application, Pascual says. However, the staff found test results vague and hard to interpret. With the Empirix products, he says, the graphical user interface tells him "exactly what [I'm] looking at."

Testing gets some respect
The increasing complexity of software, driven in part by service-oriented architecture, is starting to give performance testing the respect it deserves. At AppLabs, which specializes in testing software for customers, McCormick says his company has seen an increased buzz around performance testing during the last year or so.

"Performance has been one of those things that has been largely ignored" as firms focused on features, functionality and getting product to market, Gartner's Lanowitz says. During the height of the Internet craze, she says, "many dotcoms died because performance was ignored." That's changing as companies realize the business value of solid application performance. (See related story, "Don't upset the developers," below.)

"There is a huge and ongoing need to do this kind of testing," Gerstle says. Awell-behaved system, he says, shouldn't "go up in flames" when pushed to and beyond capacity. Instead, "we [expect] performance to degrade gracefully as load comes on. And hopefully, that degradation happens well beyond the expected maximum that you want the system to perform at."

Don’t upset the developers

Assuming companies choose and implement the right application performance testing product for the job, their QA groups then face another big challenge: learning to work with the various developer groups whose products they’re essentially criticizing.

After all, testing a product’s performance and reliability often means pointing out the problems and sending it back for fixes.

At Mirror Image Internet, principal quality engineer Carl Gerstle fields a team of QA engineers in a one-to-one ratio with the development group, a far more even balance between the two groups than at most companies.

Gerstle maintains harmonious relations with developers, partly by hiring technically savvy QA engineers, and by counseling junior QA engineers on the art of collaboration, “so the developers understand that we’re not throwing rocks at them,” Gerstle says. “You have to get [beyond] the point of thinking ‘Part of my job is to upset developers,’ which it isn’t.”

It helps that Gerstle himself held a variety of technical jobs, including hardware development, software development, systems engineering and user interface testing. “I can look at a complex system from most angles, including the user’s. A good job of performance engineering includes empathy for the end user as well as [understanding] the system on which the code base is running,” he says.

Linda L. Briggs