In-Depth

Testing focus boosts XP

The abysmal quality of software has led to Extreme Programming (XP), a discipline for developing software that requires test-driven design, continuous testing and acceptance of constant change. As one application services provider (ASP) delivering complex demand chain management solutions has found, XP has enabled rapid development of quality software. The XP movement, which Kent Beck kicked off with a paper on Smalltalk programming, emphasizes testing that helps to bring new developers up to speed rapidly. It also aids developers when they refactor, or change, code. Every programmer writing code has to refactor existing code to streamline it, and continuous testing gives developers confidence that their refactoring works.

Because developers spend more time on testing than they do on coding, writing the tests is a skill in itself. Tests have to be designed carefully and the results must be scrutinized closely. The test code has to be changed if necessary so the tests do not consume too much time.

X/Unit testing frameworks for several languages are freely available, so programmers can use them without reinventing the wheel. The main XP site is at http://c2.com/cgi/wiki?ExtremeProgramming. For more information about the XP testing framework and available tests, go to http://c2.com/cgi/wiki?TestingFramework.

More tests, more speed
If real-world products such as cars were built the way software is, there would be riots in the streets. "It's insane the horrible quality of software people put up with," said Erik Meade, senior consultant at Chicago-based consultancy ObjectMentor Inc. The test-driven design approach of XP, however, where one of the rules is to never write a line of code without a failing test, improves code quality. "Testing is an integral part of the application development environment, not an afterthought," noted Gary Baney, chief technology officer at Flashline.com Inc., Cleveland.

Using the XP approach, San Francisco-based Evant Solutions took only six months to develop and roll out a demand chain management solution covering areas of the supply chain, catalogs, production management and planning. The application is delivered over the Web through an ASP model to both e-tailers and brick-and-mortar retailers. The application, which was "very large-scale, complicated and detailed with a very large database behind it," was written with "no client-side script, a very thin client in mind, no applets and no ActiveX," noted Edward Hieatt, one of Evant's programmers.

Evant's development environment is server-side Java, XML and Linux. XP gurus Kent Beck of First Class Software Inc. and Rob Mee, who has since joined Evant, coached the firm's programmers. Work commenced in January 2000 and the application went live that June. Clients included Disney, NASCAR and a division of Hewlett-Packard, said Evant's Hieatt. Development was incremental because of the XP approach, which "encourages you to develop gradually—nothing is ever complete, it's always growing," he added. This allowed developers to get code out quickly. "Every couple of months we come up with new code and our clients love it because they can direct our development and we can give them new features they want," noted Hieatt.

Frequent testing contributed to the rapid pace of development. Developers using the traditional approach would be "very nervous about breaking something" whenever they wrote new code, but "with these tests, as long as they run, you know you haven't broken anything, so you can code faster," explained Hieatt. For example, about three weeks before going live, Evant's developers realized that one part of the system, the Enterprise JavaBean layer, was "extremely slow," said Hieatt. The developers decided to remove the layer and rewrite it. "Looking back, that seems crazy now; but at the time, we were so used to changing massive pieces of our code because of our tests that we had no qualms at all. We removed this huge piece of code and rewrote it in 10 days," he said. "We could never have done that without all our tests." All the developers had to do was remove the layer, write their new code and run the tests several times. That saved Evant "a lot of money, because putting a commercial license into production is expensive," said Hieatt.

Evant's developers discovered the problem EJB code because they adhered to the XP principle of continuous refactoring, or redesigning, of code. "Whenever anybody writes new code, it's their duty to refactor the code as they go along," explained Hieatt. This is possible because there is no large, up-front design in XP. Code has to be retested once it has been refactored. One benefit of this constant refactoring and retesting is that the learning curve for new developers is shortened. "They don't have to wade through a stack of documents to figure out how the system works and have all these diagrams drawn for them," commented Hieatt.

Despite the amount of work they had to do, Evant's developers averaged 50-hour workweeks and did not have to keep the crazy hours most large application development projects require. "When you have constant testing, everyone feels more comfortable, and that's the kind of thing that lets people go home earlier because they're not worried or nervous," said Hieatt. "No one worked the weekend before we launched our product."

This is in keeping with Kent Beck's teachings. While developers can work overtime for a week once or twice a year, a second week of overtime in a row "is a clear signal that something else is wrong with the project," said Beck in a post at http://ootips.org/xp.html. This emphasis on keeping normal working hours pays off over the course of months, because the team's productivity will be higher; rested programmers "are more likely to find valuable refactorings, to think of that one more test that breaks the system, to be able to handle the intense inter-personal interaction on the team," reads Beck's post.

The testing process
In X/Unit testing, developers write tests before they write any code. Typically, said Evant's Hieatt, they have "maybe just an idea, but no concrete requirements documented." They then write a "very concrete" suite of tests that correspond closely to the requirements. "The tests almost define the requirement and they're very readable," he said.

Ted Farrell, chief technology officer at WebGain Inc., Santa Clara, Calif., agrees. "I can come into a project I didn't work on before and can use the unit tests to validate that I'm not breaking someone else's code, and get documentation on what the code is," he noted.

Developers then begin to write the code, and immediately run the tests. "Typically, you write the very bare bones of the code so it's barely compiled; then you run the test and it fails," said Evant's Hieatt. There are two kinds of failures. One is an error where the code crashes; the other is a failure where the code does not perform as required. At that point, the developer rewrites the code and keeps testing it. At some point, the code will pass all the tests and "you'll know the code is done, because you know you've satisfied the requirements," he said. If the developer needs to change the requirements the next day, the tests must be run again. "You keep running the tests as long as you run the code," Hieatt explained.

There are two types of tests in XP: unit tests and functional tests. Unit tests are written by the developer to test each component individually. This ensures reusability because "the tests that define the interface don't worry about the context in which the object is being used," said Hieatt.

WebGain, which provides a suite of products for accelerating the development of e-business applications, has incorporated unit testing into its development process because "we believe it is an essential part of delivering high-quality software in a shorter period of time," the firm's Farrell said. WebGain's approach includes integrating unit tests "at the earliest stages of development," he noted. Flashline's Baney said that, at the unit test level, testing tools like JUnit "should be very, very close to the IDE."

Developers should write unit tests before writing or revising any method that is at all complicated, said Beck in his Web posting. In the long term, the tests "dramatically reduce" the chance that someone will harm the code because they "communicate much of the information that would otherwise be recorded in documentation that would have to be separately updated [or not]," said Beck. Also, writing tests tends to simplify design because it is easier to test a simple design than a complex one. They also reduce over-engineering because developers only implement what they need for tests, reads Beck's post.

Writing unit tests is "as sophisticated and important a skill as coding," Evant's Hieatt said, because developers have to figure out how granular they want to make their tests. Hieatt prefers to write "a couple of methods for the typical case [and] write a few boundaries" because otherwise "you can get dragged down by the testing and you get so worried about every case that you get defensive about your testing." Evant's philosophy is "don't be defensive about your testing or it will bog you down; there's definitely a balance to be struck," Hieatt said.
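One way to picture that balance is a suite with one typical case and a few boundary cases, and nothing more. The sketch below uses Python's standard unittest module; the function clamp_quantity and its rules are hypothetical, chosen only to give the tests something to check.

```python
import unittest


def clamp_quantity(requested, in_stock):
    """Hypothetical function under test: how many units of an order can be filled."""
    if requested < 0:
        raise ValueError("requested quantity cannot be negative")
    return min(requested, in_stock)


class ClampQuantityTest(unittest.TestCase):
    def test_typical_order(self):
        # The typical case: an order well within stock.
        self.assertEqual(clamp_quantity(3, 10), 3)

    def test_order_equal_to_stock(self):
        # Boundary: exactly at the limit.
        self.assertEqual(clamp_quantity(10, 10), 10)

    def test_order_exceeds_stock(self):
        # Boundary: one past the limit.
        self.assertEqual(clamp_quantity(11, 10), 10)

    def test_negative_order_rejected(self):
        # Boundary: just below zero.
        with self.assertRaises(ValueError):
            clamp_quantity(-1, 10)
```

Four tests cover the behavior that matters here; enumerating every possible quantity would add running time without adding much confidence.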

Hieatt wrote a new test, JSUnit, for Evant because "we became very aware that we had excellent tests—JUnit for the server and SQLUnit for database queries—but there was no client-side JavaScript testing going on," he said. JSUnit is essentially a port of JUnit to JavaScript, he explained. Information about JSUnit tests can be found at www.edwardh.com/jsunit.

Functional tests are ideally written by customers, who are in-house product managers, and should be more like black-box tests, noted Hieatt. Customers should be "non-technical people who have the business knowledge," he added. At Evant, for example, the customers are product managers who have a strong retail background and domain knowledge of retail systems; functional tests are written in a language they understand. Initially, tests were written in XML. Although XML proved easy for the product managers to learn, Evant found it was not easy enough, so it developed an in-house natural language it calls ESP (Evant Script Programming). ESP was written in Rebol, a relatively new scripting language, and "has been very successful," Hieatt said.

Evant's product managers developed more than 100 functional tests, said Hieatt. They wrote functional tests before developers wrote any code, and ran those tests whenever they wanted throughout the cycle of development, independent of the developers' work. This ensured the applications being developed met the product managers' requirements.
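The article does not show what an ESP script looks like, so the sketch below invents a tiny line-based script format purely for illustration. The commands, the run_script function and the Inventory stand-in are all hypothetical. It shows the general idea of a black-box functional test: a non-programmer writes the steps and expected outcomes in plain words, and a small interpreter drives the system through its public interface.

```python
def run_script(script, system):
    """Run a hypothetical line-based functional test script against `system`.

    Supported lines (invented for this sketch, not ESP syntax):
      set <item> stock <n>
      order <item> <n>
      expect <item> stock <n>
    Returns a list of messages, one per unmet expectation or unknown command.
    """
    problems = []
    for number, line in enumerate(script.strip().splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # skip blank lines
        if words[0] == "set":
            system.set_stock(words[1], int(words[3]))
        elif words[0] == "order":
            system.order(words[1], int(words[2]))
        elif words[0] == "expect":
            actual = system.stock(words[1])
            if actual != int(words[3]):
                problems.append(f"line {number}: expected {words[3]}, got {actual}")
        else:
            problems.append(f"line {number}: unknown command {words[0]!r}")
    return problems


class Inventory:
    """Stand-in for the application; the script only touches its public methods."""

    def __init__(self):
        self._stock = {}

    def set_stock(self, item, n):
        self._stock[item] = n

    def order(self, item, n):
        self._stock[item] = self._stock.get(item, 0) - n

    def stock(self, item):
        return self._stock.get(item, 0)
```

A script such as "set widget stock 10", "order widget 4", "expect widget stock 6" (one command per line) returns an empty list when the system behaves as the product manager expects.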

Managing the tests
At Evant, as in most other XP shops, there is a main trunk of code, and developers integrate the code they write into it. Every time anyone integrates code into the main trunk, they run all the tests ever written for the code, not just the tests they wrote. "In Extreme Programming, there's almost more time spent on testing than on the code; we say if you can't finish your integration of code, you can't go home until all tests are run," Hieatt said.

The trick is to ensure that the tests run within a reasonable timeframe. "You don't want the tests to run so slowly that people don't want to run them," said Hieatt. Evant has 1,750 tests and they "take about 15 minutes" to run, he added. The time taken fluctuates "dramatically"—at one time, Evant had 50 tests that took 20 minutes. Kent Beck, who was coaching Evant, said a rule of thumb is that whenever tests take more than 10 minutes, developers should look at the tests, profile them, "maybe think about running them on multiple machines at once or buy a faster machine to run them," noted Hieatt.
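A first step in profiling a slow suite is to find out which tests take the time. The sketch below, written against Python's standard unittest module, is one possible approach rather than anything Evant described: the names TimingResult and slowest are hypothetical. It times each test, including its setup and teardown, and ranks the slowest.

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """Records how long each test takes, in seconds (setup and teardown included)."""

    def __init__(self):
        super().__init__()
        self.durations = {}
        self._started = None

    def startTest(self, test):
        super().startTest(test)
        self._started = time.perf_counter()

    def stopTest(self, test):
        self.durations[test.id()] = time.perf_counter() - self._started
        super().stopTest(test)


def slowest(suite, count=5):
    """Run `suite` and return the `count` slowest (test id, seconds) pairs."""
    result = TimingResult()
    suite.run(result)
    ranked = sorted(result.durations.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:count]
```

Passing a loaded suite to slowest() gives a short list of candidates to streamline before reaching for more hardware.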

While throwing more hardware at the problem and taking a distributed testing approach work, Evant found that the main issue when tests ran slowly was the testing code. "Often you find you are running tests in a silly way, and you just have to streamline it," Hieatt said. Every test in X/Unit has routines for setting it up and tearing it down, and "sometimes you find you can do the setup or teardown more efficiently or find a redundancy you can eliminate, or find you don't have to generate the user interface for every test," Hieatt said. "More often than not, it's the code that's doing something silly." That makes it crucial to monitor the performance of tests and encourage all developers to keep running tests, Hieatt added.

The bottom line: Testing early and often helps produce quality software quickly.