Integrating load testing with software development

Welcome to this installment of the "Architect's Corner." This month we examine how an enterprise Java™ technology development project can mitigate the risk of not achieving its performance and scalability objectives.

Successful projects require mitigating risks early on. Developing an Internet-enabled system exposes project teams to more uncertainties than ever before. With global access to systems, nonfunctional requirements such as security, performance, scalability, and availability suddenly become strategic. These nonfunctional requirements are also grouped under "quality of service" requirements for the burgeoning ASP market. To make the software development process even more interesting to manage, the risk of not providing the right quality of service comes packaged with the Internet time line. Gone are the days when application development schedules were multiyear. In a rush to meet the market, and thus project deadlines, software teams generally end up compromising the testing of their systems. What follows is predictable: considerable rework, schedule slippage, cost overrun, and possibly a lost business opportunity.

Many Internet systems are tested for performance and scalability only after the bulk of the functionality is built. While several commercial products are currently available for load-testing a Web site, the cost and complexity of these products seem to deter organizations from using them in the early phases of a project. In addition, many load testing tools are not intended to test individual application components/tiers. Arguably, a high-powered vendor solution is necessary for load-testing a system prior to taking it live. However, there is a need for a readily usable, lightweight, low-cost solution that can be used easily by system developers to conduct a "sanity check" of their system architecture for its performance and scalability attributes. Such a tool would encourage developers to load-test their systems during the very early iterations.

System requirements can generally be classified as functional and nonfunctional. The functional requirements, which cover the "functionality" of the system, take the form, "the user does this and then the system responds with this." The nonfunctional requirements cover systemic qualities such as scalability, reliability, maintainability, manageability, and security. Because nonfunctional requirements span multiple use cases, they can be captured separately under a supplementary requirements section in the use case document. It is important to note that the nonfunctional requirements drive the system architecture. The architecture for an accounts receivable system that is meant for intranet access by a small organization, for instance, would be quite different from that of a system accessed over the Internet by several thousand simultaneous users.

Performance is the speed at which a system responds to user actions. Often the user of a system is yet another system. Scalability is the relative ability of a system to maintain its performance when under load. Load is measured by the number of simultaneous requests that are dispatched to a system.

The objectives of load testing include, but are not limited to:

  • Comparing varying architectural approaches
  • Performance tuning
  • Capacity planning

When the hardware (including network) resources available are kept constant for a system, its performance will eventually degrade with increased load.
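The effect of fixed resources can be illustrated with a deliberately simplified model. The sketch below assumes a system that can work on only a fixed number of requests at once and that every request takes the same service time; the class name, method name, and all numbers are hypothetical and chosen only for illustration. Real systems degrade less regularly, so this is a back-of-the-envelope aid, not a prediction.

```java
// A simplified model, not a measurement: requests are served in "waves" of
// at most `capacity` at a time, and each wave takes `serviceMs` milliseconds.
final class FixedCapacityModel {

    // Mean response time (ms) seen by `requests` simultaneous requests.
    static double meanResponseMs(int requests, int capacity, double serviceMs) {
        if (requests <= 0 || capacity <= 0) {
            throw new IllegalArgumentException("requests and capacity must be positive");
        }
        double total = 0.0;
        for (int i = 0; i < requests; i++) {
            int wave = i / capacity + 1;   // which wave this request is served in
            total += wave * serviceMs;     // it completes when its wave completes
        }
        return total / requests;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 10 requests at a time, 50 ms per request.
        for (int load : new int[] {10, 20, 40, 80}) {
            System.out.printf("%d simultaneous requests -> mean %.1f ms%n",
                    load, meanResponseMs(load, 10, 50.0));
        }
    }
}
```

With these assumed figures the mean response time grows from 50 ms at 10 simultaneous requests to 75, 125, and 225 ms at 20, 40, and 80 requests, which is the kind of curve a load test is meant to expose on a real system.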

Adapting Development Process to Incorporate Early Load Testing
Iterative software development processes, such as the Rational Unified Process™ (RUP), have been used by software teams for some time. RUP is a use-case-driven, architecture-centric, risk-focused approach to managing a software project. RUP divides the system development life cycle into four distinct phases: inception, elaboration, construction, and transition. Depending on the complexity of the system, each of these phases can be organized as one or more iterations.

The following sections list the principal activities for the inception and elaboration phases when seen in the context of identifying and testing against the performance and scalability requirements.

Inception Phase. The primary focus of the inception phase is to identify the key functional and nonfunctional requirements. Using a use-case-driven approach, the performance and scalability of a system can be described by specifying response times of a system while executing the primary scenarios from selected use cases under varying load conditions. The load conditions used should be defined under nonfunctional requirements. Answering the following questions will help establish the performance and scalability requirements:

  • Who are the different actors?
  • How are individual actors going to be interfacing with the system? Define all the system interfaces (user interfaces as well as interfaces to/from other systems) that need to be supported.
  • How many users (instances of actors) will be simultaneously using each of these system interfaces?
  • How many services will each user request per unit of time—i.e., what is the frequency of interaction? Frequency of interaction is an important consideration in determining peak load levels on a system.
  • What are the principal activities for each actor? Which use cases cover the principal (most frequent, or otherwise deemed critical) interactions that the actors (human users as well as other systems) will have?
  • What are the specific performance requirements (response time) for the system in general as well as for specific use cases? Are the performance requirements relaxed when the system is under load?
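The answers to the questions on simultaneous users and frequency of interaction can be combined into a rough peak-load figure per system interface. The following is a minimal sketch under assumed numbers; the class, the method, and the two example interfaces (browser users and partner systems) are hypothetical.

```java
// Rough peak-load estimate: simultaneous users multiplied by how often
// each one requests a service. All figures below are illustrative only.
final class PeakLoadEstimate {

    // Requests per second generated through one system interface.
    static double requestsPerSecond(int simultaneousUsers, double requestsPerUserPerMinute) {
        return simultaneousUsers * requestsPerUserPerMinute / 60.0;
    }

    public static void main(String[] args) {
        // Hypothetical interface 1: 2000 browser users, 3 requests per minute each.
        double browser = requestsPerSecond(2000, 3.0);
        // Hypothetical interface 2: 10 partner systems, 600 requests per minute each.
        double partner = requestsPerSecond(10, 600.0);
        System.out.println("Estimated peak load: " + (browser + partner) + " requests/s");
    }
}
```

In this example a handful of partner systems contributes as much load as two thousand human users, which is why frequency of interaction deserves as much attention as the user count.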

In addition, any dependencies that might affect the performance and scalability requirements must be evaluated. For example:

  • Are there any batch or other resource-intensive processes that when run will slow down the system appreciably? If so, should the performance/scalability requirements accommodate these batch processes?
  • What are the performance and scalability characteristics of the dependent systems? For instance, consider if a back-end system can support only half the transaction volume that is required by the system being built.
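The back-end example above amounts to saying that a request path can sustain no more than its slowest dependency. A minimal sketch of that check follows; the class and method names and the transaction-per-second figures are assumptions made for illustration.

```java
// Sketch: the sustainable transaction volume of a chain of systems is bounded
// by the smallest capacity in the chain. Figures are hypothetical.
final class DependencyCapacity {

    // Smallest capacity (transactions per second) among the systems on the path.
    static double sustainableTps(double... capacitiesTps) {
        if (capacitiesTps.length == 0) {
            throw new IllegalArgumentException("at least one capacity is required");
        }
        double min = Double.POSITIVE_INFINITY;
        for (double capacity : capacitiesTps) {
            min = Math.min(min, capacity);
        }
        return min;
    }

    // How far the chain falls short of the required volume (0 if it does not).
    static double shortfallTps(double requiredTps, double... capacitiesTps) {
        return Math.max(0.0, requiredTps - sustainableTps(capacitiesTps));
    }

    public static void main(String[] args) {
        // New system handles 250 tps, the back-end only 100 tps; 200 tps are required.
        System.out.println("Sustainable: " + sustainableTps(250.0, 100.0) + " tps");
        System.out.println("Shortfall:   " + shortfallTps(200.0, 250.0, 100.0) + " tps");
    }
}
```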

It is important to note the dependencies that the various nonfunctional requirements may have on each other. For instance, adding layers of security generally requires more processing and consumes more system resources. Generous logging helps with manageability; however, it generally reduces performance. In-memory replication of the state of certain components (e.g., stateful EJBs or servlets) may help with a system's availability, but tends to degrade its performance. It is therefore critical to capture all the nonfunctional requirements during inception as well as to understand that meaningful load testing can only be done after the system architecture supports the other nonfunctional requirements.

It is important to identify the candidate deployment environment during the inception phase. The deployment architecture should ideally allow for future vertical as well as horizontal scalability. Load tests done later in the elaboration phase can help validate these assumptions and help with capacity planning.

Elaboration Phase. The elaboration phase of a project focuses primarily on building a baseline architecture. This baseline architecture is demonstrated by implementing the architecturally significant use cases, as identified in the inception phase. Along with implementing all the functionality specified in the selected use cases, the baseline architecture must also meet all the nonfunctional requirements. Depending on the size of a project, the elaboration phase can be structured as one or more iterations.

Table 1 shows the steps in the individual iterations for an elaboration phase. As shown, all the respective tests should be developed and run in parallel with developing a system's functionality.

Table 1. Individual iteration steps for an elaboration phase.
Step | Activities
Requirements | Identify and elaborate on the use cases to be implemented. Revisit performance and scalability requirements.
Analysis/Design | Apart from architecting and designing the functioning use cases, the project team should also identify a deployment environment for load testing.
Implementation | Build end-to-end slices of application functionality.
Test | Develop load test suites. Configure load test environment. Deploy load tests. Run load tests. Analyze results. Follow up with white box testing.
Refactor | Revisit the architecture, design, and implementation based on the load-test results as well as the white box tests.

Until the system is regularly tested against its functional and nonfunctional requirements, its quality will remain suspect.

Development teams often develop unit-testing code and reuse it for subsequent regression testing. Frameworks such as JUnit are available to assist with unit testing. In the same way, a lightweight, easy-to-use load testing framework can give the initial performance and scalability numbers. This kind of framework can readily be used in conjunction with third-party tools such as code profilers, thread analyzers, and memory debuggers for effective white box testing.

Only when the baseline architecture supports all the architecturally significant use cases and meets all the nonfunctional requirements can the elaboration phase really be complete.

Construction Phase. The construction phase is normally structured as a series of iterations. It is critical to reconfigure and run load tests at the end of each iteration to help ensure that the performance and scalability requirements are still met.
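One way to make the end-of-iteration check repeatable is to compare the measured response times against the stated requirement automatically. The sketch below assumes the requirement is phrased as a percentile limit (for example, "90% of requests within 200 ms"); that phrasing, the nearest-rank percentile method, and the sample values are assumptions for illustration, not something this column prescribes.

```java
import java.util.Arrays;

// Sketch of an automated check that load-test samples still meet a
// response-time requirement expressed as a percentile limit.
final class ResponseTimeCheck {

    // Nearest-rank percentile of the samples (pct in the range (0, 100]).
    static long percentile(long[] samplesMs, double pct) {
        if (samplesMs.length == 0 || pct <= 0.0 || pct > 100.0) {
            throw new IllegalArgumentException("need samples and 0 < pct <= 100");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // True if the given percentile of the samples is within the limit.
    static boolean meets(long[] samplesMs, double pct, long limitMs) {
        return percentile(samplesMs, pct) <= limitMs;
    }

    public static void main(String[] args) {
        // Hypothetical response times (ms) collected during one load test run.
        long[] samples = {120, 80, 95, 300, 110, 90, 100, 105, 85, 130};
        System.out.println("90th percentile: " + percentile(samples, 90.0) + " ms");
        System.out.println("Meets 200 ms at 90%: " + meets(samples, 90.0, 200));
    }
}
```

A check of this kind can run alongside the regression tests at the end of each iteration, so that a performance regression is noticed in the iteration that introduced it.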

Transition Phase. After deploying the application to the production environment, full-scale load tests should be conducted to help ensure the quality of service requirements are met. Companies at this stage should consider bringing in commercially available Web site load testing tools. However, by conducting load tests early in the life cycle on individual tiers of architecture, the probability that the completed system meets the performance and scalability requirements will be much higher.

Figure 1. A typical J2EE-based system.

A Guide to Load Testing
Let's answer some frequently asked questions on the topic of load testing:

  • Which components/tiers are candidates for load testing? Take a look at Figure 1. Although performance hotspots can exist in any tier, scalability hotspots are more likely to exist in the tiers that are shared across multiple client sessions. Therefore, the application, business logic, persistence, and integration tiers are effective candidates.
  • How should the overall performance/scalability requirements be translated to the individual tiers? Performance is measured by observing the total elapsed time. The total elapsed time to access a service from the presentation tier is the cumulative time spent in all the architectural tiers. There is no set formula for proportionately allocating the elapsed time to the tiers. The allocation can vary significantly depending on specifics of the design of individual tiers. For instance, if an architecture caches data in the Business Logic/Object tier, the elapsed time in the persistence tier will be significantly higher for the initial request for a resource than for subsequent requests for the same resource. Scalability hotspots can be identified by plotting the proportion of time spent in each tier under various load conditions.

To translate the scalability requirements, certain questions need to be considered: Are components in the tier being pooled (such as EJB instances)? Are stateful components being used as opposed to stateless? In general, it is more difficult to horizontally scale stateful components.
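To make the idea of proportional time per tier concrete, the following sketch converts elapsed times per tier into percentage shares for two load levels. The tier names and millisecond figures are invented for illustration, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: turn elapsed time per tier into a percentage of the total, so that
// the shares can be compared (or plotted) across load conditions.
final class TierShare {

    // Percentage of total elapsed time spent in each tier, in insertion order.
    static Map<String, Double> shares(Map<String, Long> elapsedMsByTier) {
        long total = 0;
        for (long ms : elapsedMsByTier.values()) {
            total += ms;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : elapsedMsByTier.entrySet()) {
            double share = total == 0 ? 0.0 : 100.0 * entry.getValue() / total;
            result.put(entry.getKey(), share);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical measurements (ms) at a light and a heavy load level.
        Map<String, Long> light = new LinkedHashMap<>();
        light.put("web", 20L);
        light.put("business logic", 30L);
        light.put("persistence", 50L);

        Map<String, Long> heavy = new LinkedHashMap<>();
        heavy.put("web", 40L);
        heavy.put("business logic", 60L);
        heavy.put("persistence", 300L);

        System.out.println("Light load: " + shares(light));
        System.out.println("Heavy load: " + shares(heavy));
    }
}
```

In this made-up data the persistence tier's share rises from 50% to 75% as load increases, which would mark it as the first place to look for a scalability hotspot.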

Figure 2. Sequence diagram for a Web application that uses servlets and session EJB components.

  • What is a load test suite? A load test suite is a test program that is intended to test the performance of a component (or a group of components) under various load conditions. A load test suite should capture the elapsed time to access various services rendered by the component being tested. Figure 2 shows an example sequence diagram for a typical thin client-based Web application that uses servlets and session EJB components. The objects to the right of the red line are marked as EJB tier. As shown in Figure 3, the sequence of interactions with the EJB tier to access a specific service can be packaged as a test suite. In the example shown, such a load test suite performs the following sequence of actions:

    1. Get an initial context from the Java Naming and Directory Interface (JNDI) lookup.
    2. Locate the EJB component home interface.
    3. Create an instance of the session bean.
    4. Request the session bean to perform a service.

    Load test suites can be similarly created for the persistence tier or the servlets (Web) tier.

  • What is a load test framework? A load test framework provides the building blocks for creating load test suites and executing them under various load conditions. A good load testing framework should scale horizontally by running multiple load simulators in parallel. A minimal load-testing framework should in addition provide a console to start and stop load simulation and a repository for collecting all the benchmark data. In addition to capturing an application's benchmark data, a load test framework can provide hooks to capture system and network resource usage.
  • What are the additional considerations in devising load tests? Specifics of system behavior such as caching, distribution of data, object pooling, network bandwidth as well as traffic, etc., should be carefully considered to produce the most realistic load tests. A load test suite should also adequately instrument the services being called such that the relative distribution of elapsed time across component boundaries can be later analyzed. Because devising a good load test requires intimate knowledge of a system's architecture and design, it is important to allocate senior developers' time for developing and conducting the initial load tests.
  • A production-like environment is not available for load testing (during the elaboration phase). How do I achieve realistic load tests? Even though it is ideal to get the exact production deployment environment for early load testing, it is seldom available. Nevertheless, a project can still benefit from performing load tests on the best production-like environment that it can access. I have observed many system architectures come apart with as little as 100 parallel sessions. It is acceptable to have a load test environment with less horsepower and/or bandwidth than the final production environment, but it is important to do the most realistic load test possible with the resources available—however scaled down it is. Ideally, the project manager should understand the importance of early load testing and influence sponsors to allocate enough resources up front to set up a lab.
  • What is white box testing? How does it tie in with the load tests? Load testing is considered a black box test: the only instrumentation is what is put in the test suite, and the test engineer cannot see the internal workings of the system. For example, a load test cannot determine how many times a specific method was called or how many instances of a specific type were created. A white box test allows a runtime view into specific system characteristics. Code profilers, memory analyzers, thread analyzers, etc. are examples of white box testing tools.
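The load test framework described in the list above can be reduced to a very small core: a simulator that runs a suite in a number of parallel sessions and a repository that collects the elapsed times. The sketch below is one possible shape for that core using the standard java.util.concurrent classes; the class name LoadSimulator, the use of a Runnable as the suite, and the 60-second timeout are assumptions, and a console, multiple simulator machines, and resource-usage hooks are left out.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal single-machine load simulator: runs one suite in N parallel
// sessions and collects the elapsed time of each session.
final class LoadSimulator {

    // Returns the elapsed nanoseconds of every session that completed.
    static List<Long> run(Runnable suite, int sessions) {
        if (sessions <= 0) {
            throw new IllegalArgumentException("sessions must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(sessions);
        CountDownLatch startSignal = new CountDownLatch(1);
        List<Long> repository = new CopyOnWriteArrayList<>();

        for (int i = 0; i < sessions; i++) {
            pool.execute(() -> {
                try {
                    startSignal.await();              // wait so all sessions start together
                    long begin = System.nanoTime();
                    suite.run();
                    repository.add(System.nanoTime() - begin);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        startSignal.countDown();                      // release all sessions at once
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                pool.shutdownNow();                   // give up on sessions that hang
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return repository;
    }

    public static void main(String[] args) {
        // Placeholder suite; a real one would call the component under test.
        Runnable suite = () -> Math.sqrt(12345.0);
        for (int sessions : new int[] {1, 10, 100}) {
            List<Long> results = run(suite, sessions);
            System.out.println(sessions + " sessions -> " + results.size() + " samples");
        }
    }
}
```

Scaling horizontally, as recommended above, would mean running several such simulators on different machines and merging their repositories; that part is not shown.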

Figure 3. Packaging the EJB tier interaction sequence as a test suite.
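The four-step EJB tier sequence listed earlier might be packaged along the following lines. To keep the sketch runnable without an application server, the naming context, home interface, and bean are replaced by hypothetical stand-in interfaces (NamingContext, QuoteHome, QuoteService) and an in-memory stub; in a real suite, step 1 would typically create a javax.naming.InitialContext and step 2 would look up and narrow the actual home interface. The JNDI name and the service shown are likewise invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch of a load test suite for the EJB tier that records the elapsed time
// of each step. All three nested interfaces are hypothetical stand-ins.
final class EjbTierSuite {

    interface NamingContext { Object lookup(String name); }       // stands in for a JNDI context
    interface QuoteHome { QuoteService create(); }                // stands in for an EJB home interface
    interface QuoteService { String quoteFor(String symbol); }    // stands in for a session bean

    // Runs the four steps once; returns elapsed nanoseconds per step, in order.
    static Map<String, Long> runOnce(Supplier<NamingContext> contextFactory,
                                     String homeName, String symbol) {
        Map<String, Long> elapsedNanos = new LinkedHashMap<>();

        long begin = System.nanoTime();
        NamingContext context = contextFactory.get();             // 1. get an initial context
        elapsedNanos.put("1-initial-context", System.nanoTime() - begin);

        begin = System.nanoTime();
        QuoteHome home = (QuoteHome) context.lookup(homeName);    // 2. locate the home interface
        elapsedNanos.put("2-locate-home", System.nanoTime() - begin);

        begin = System.nanoTime();
        QuoteService bean = home.create();                        // 3. create a session bean instance
        elapsedNanos.put("3-create-bean", System.nanoTime() - begin);

        begin = System.nanoTime();
        bean.quoteFor(symbol);                                    // 4. request a service
        elapsedNanos.put("4-invoke-service", System.nanoTime() - begin);

        return elapsedNanos;
    }

    public static void main(String[] args) {
        // In-memory stub so the sketch runs without a container.
        QuoteHome stubHome = () -> symbol -> "stub quote for " + symbol;
        Map<String, Long> timings = runOnce(() -> name -> stubHome, "ejb/QuoteHome", "ACME");
        timings.forEach((step, nanos) -> System.out.println(step + ": " + nanos + " ns"));
    }
}
```

Recording each step separately, rather than only the total, is what later allows the elapsed time to be attributed across component boundaries.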

A white box testing tool will allow the tester to hook up to (or launch) a JVM. By using a Java technology-based load test tool, test engineers can monitor various aspects of their application at runtime using one of many commercially available white box testing tools.
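As a much lighter stand-in for a commercial white box tool, a Java load test can also sample a few runtime figures from inside the JVM it runs in. The sketch below uses the standard java.lang.management API, which assumes a JDK newer than those available when this column was written; the JvmSnapshot class is hypothetical, and the figures it reports are far coarser than what a profiler provides.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Sketch: capture a coarse runtime view (live threads, heap in use) of the
// current JVM. Not a substitute for a profiler or thread analyzer.
final class JvmSnapshot {
    final int liveThreads;
    final long heapUsedBytes;

    private JvmSnapshot(int liveThreads, long heapUsedBytes) {
        this.liveThreads = liveThreads;
        this.heapUsedBytes = heapUsedBytes;
    }

    // Reads the current thread count and heap usage from the platform MXBeans.
    static JvmSnapshot capture() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return new JvmSnapshot(threads.getThreadCount(),
                memory.getHeapMemoryUsage().getUsed());
    }

    public static void main(String[] args) {
        JvmSnapshot snapshot = capture();
        System.out.println("Live threads: " + snapshot.liveThreads);
        System.out.println("Heap used:    " + snapshot.heapUsedBytes + " bytes");
    }
}
```

Taking a snapshot before and after a load test run gives a first hint of thread or memory growth that a dedicated white box tool can then investigate in detail.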

Nonfunctional requirements, such as performance and scalability, are often ignored until a system is about to be released for acceptance testing. Even organizations with financial muscle tend to bring in high-power consulting to help them analyze the performance and scalability aspects of the system that they have just about built, and to help with subsequent capacity planning.

I have proposed a methodology for incorporating performance and load testing in the early phases of the development life cycle. Ideally, every tier of the system, before it is released for integration, should be tested for performance and scalability as much as it is regression-tested for functional requirements.

If you would like to know more about these types of architectures or have questions or comments, please send an email to We will do our best to address as many requests as possible.

Copyright 2000 Sun Microsystems Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Java, Enterprise Java Beans, EJB, and JVM are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Sun Microsystems, Inc. may have intellectual property rights relating to implementations of the technology described in this article, and no license of any kind is given here. Please visit for licensing information.

The information in this article (the "information") is provided "as is," for discussion purposes only. All express or implied conditions, representations, and warranties, including any implied warranty of merchantability, fitness for a particular purpose, or noninfringement, are disclaimed, except to the extent that such disclaimers are held to be legally invalid. Neither Sun nor the author makes any representations, warranties, or guaranties as to the quality, suitability, truth, accuracy, or completeness of any of the Information. Neither Sun nor the authors shall be liable for any damages suffered as a result of using, modifying, contributing, copying, or distributing the information.

