XP lessons learned
- By Natraj Kini, Steve Collins
This article is excerpted from Chapter 31 of ''Extreme Programming
Perspectives'' by Michele Marchesi, Giancarlo Succi, Don Wells and Laurie
Williams. Used with the permission of the authors and Addison-Wesley.
This article is based on our experience while working for a software
outsourcing company that was built around the Extreme Programming (XP)
methodology. One difficulty we had in adopting XP was that the original ''white
book'' (see Reference 1), though inspiring and persuasive, was somewhat short on
detail. We feel that it would be valuable to the XP community if groups doing
real projects provided detailed accounts of their experiences in applying XP.
Finally, some issues took on additional significance for us because our client
had outsourced their project to us.
What facets of XP worked well and were easily accepted
by the customer (including some practices we expected the customer
wouldn't like)? What practices proved their worth when requirements or
conditions changed? What practices were difficult to implement? What
modifications did we make to some of the XP practices and why?
We don't have definitive answers to these questions that can be extrapolated
across all projects, but our account adds one more data point to the sample.
Description of the context of the project
Our company was
formed around XP, which meant that all our developers and our management were
committed to XP from the conception of the company. The developers on this
project had worked together on other XP projects for a few months, and some had
collaborated for years earlier in their careers, so we had a team that was ready
to hit the ground running.
Our client for this project was also a start-up, which meant that they were
more open to XP than a larger company with an existing methodology might have
been. Many of our customers had backgrounds in development, so they were
familiar with the shortcomings of conventional methodologies. They were also at
a stage in their company's growth where it was possible for them to remain
on-site through most of the life of the project.
The XP-standard environment prescribed by Kent Beck proved its worth in this
project (see Reference 1). We often had four pairs (including client developers)
engaged on the four workstations in the corners of our development room. And
much of the communication during the project happened indirectly as people
overheard and responded to what an adjacent pair was discussing. The walls
covered with whiteboard wallpaper also worked out well; they always held design
scratchings and project notes. Even when we had online data such as bug reports,
we put highlights up on the wall because they acted as a physical reminder of
our priorities. Our wireless network and dockable laptops worked well, too --
they made it much easier for us to switch pairs and get together in another room
for an impromptu design session.
We decided to allocate a specific and separate space for customers outside
the development room. This helped them remain productive with their ''real'' jobs
while still remaining accessible to the development team. It also meant that
they didn't get in the way of the developers. Although we tried to be as open as
possible, we also felt it necessary on occasion to meet in a separate room
without customers present. Likewise, we made space available for the customers
so that they could meet among themselves when they felt the need.
Results from the experience
Billing and Contracts: Based on a
recommendation from ''Uncle Bob'' Martin, we decided to bill by the iteration
instead of by the hour to prevent any squabbling over hours worked. This also
eased the time-tracking burden on the developers and made it easier to swap
developers when needed. More subtly, we felt that this helped show the client
that we were sharing the risk with them and that we were invested in the project.
We found later that this also acted as a feedback mechanism -- overtime is an
effective cure for developer overreaching. We developers had no one to blame but
ourselves if our optimism resulted in long hours. This can be taken only so far
-- we still believe that we must go back to the customer and reduce scope when
we find that the estimates are way off the mark.
The contract we agreed on with our client didn't attempt to nail down the
deliverables for the seven-week life of the project; that isn't possible when
you're going through iteration planning and changing direction every two weeks.
We agreed on the size of the team and that we would augment the contract every
two weeks with a list of stories for the next iteration.
Iteration planning: We anticipated problems in convincing the customers to
narrow iteration scope to a set of specific stories that made sense for both
business and development. However, this negotiation actually went quite well --
our customers were willing to listen to our reasons for wanting to do some
stories ahead of others (though we tried to keep such dependencies to a minimum)
and for giving high-risk stories a large estimate.
The customers loved the quick turnaround of the planning game. They were
afraid that it would take us much longer to become productive than it did and
were pleasantly surprised that we were able to identify the important stories as
quickly as we did.
Team size: Our team size for this project was four developers and one tester,
whereas we had identified an ideal team size of eight developers and one tester
and usually had at least six developers on previous projects. We found that
there were some advantages and disadvantages to this smaller team size. On
balance, we'd probably still prefer to have six to eight developers in a team.
* We have struggled with whether everybody who is part of the
development team should be in the planning game -- progress sometimes bogs down
with too many participants, but a lot of knowledge transfer does take place.
This was not a problem here with the smaller team size.
* Communication and
the coordination of schedules were easier.
* It was a little more difficult to recover when a developer had to
leave for personal reasons.
* The overhead of tracking the project was more
visible as a significant component of the total time spent on the project.
* We had a smaller spectrum of skills available among the developers.
* With a
smaller number of stories in each iteration, we sometimes found that one of the
assigned tasks was a bottleneck because other tasks depended on it.
Stand-up meetings: How do you avoid discussing design during stand-ups? We
had repeated problems in limiting the length of our daily stand-ups; the issues
raised were often valid and needed further amplification.
One device that worked was to write on a whiteboard all the issues that
people raised for discussion during the stand-up. Then, after the stand-up was
completed, those who needed to be involved in each discussion could split into
smaller groups and follow through on the ''promise to have a conversation.''
Another minor innovation was the use of an egg timer to limit stand-ups. We
set the egg timer to 10 or 15 minutes at the start of the meeting. Then the
person who was talking at any time during the stand-up had to hold the timer in
their hand. This acted as a reminder to the speaker to be brief and a reminder
to others in the stand-up to avoid any private conversations while someone else
had the floor.
We found that the client wanted a much more detailed status than we had
planned to supply -- they were used to the traditional spreadsheet view of
project tasks with a percent complete for each task. We compromised with a daily
status message that summarized the state of the project and the outstanding
issues -- most of the work to compile this daily message was done by one person
at each stand-up meeting.
Pairing with client developers: We knew from day one that we would need to
hand off our code to the customers in-house. For much of the project, their
technical lead was on-site and worked with us. Four other newly hired
developers paired with us for different stretches of the last three weeks.
The objective of accelerating knowledge transfer by means of pairing worked
very well. It also helped that XP is a cool new methodology that many developers
are eager to experience. But that initial enthusiasm was probably sustained only
because these developers were able to make constructive contributions. One
developer didn't like the idea of pairing at all and quit when he found out that
he would have to pair at least until the code was handed off to the client.
Our experience with the technical lead was more complex. He was technically
very capable and made several suggestions and observations that led us down
unexplored paths. However, pairing with him was hampered by the fact that he was
playing multiple roles, trying to get the most value for the client's investment
while still acting as a developer. Therefore, he tried to steer us toward
solving the difficult technical problems that he thought would crop up later,
instead of focusing on the stories and tasks immediately at hand.
Finally, at the end of the sixth week, the team captain (our instantiation of
the ''coach'' role) and another developer both had to leave the team unexpectedly.
We introduced two new developers to the team and were able to deliver all
agreed-on functionality on schedule, which further validated the worth of
pairing and shared code ownership.
During the first iteration, we felt the natural
pressure to please the customer and bit off more than we could chew. We found
that our attention to tracking and to promptly creating the acceptance tests as
well as our discipline in sticking to our XP practices all suffered when we were
under the gun. We continued to practice test-first programming but neglected to
pair up when we thought that the tasks being tackled were simple enough that
they didn't need the spotlight of a continuous code review.
As our customers came to trust us more in later iterations, we felt less
pressure to prove that we were delivering value by stretching ourselves to the
point that our discipline degenerated. We also learned from our failure to meet
the optimistic velocity of the first iteration: We reduced our velocity by about
20% from the first to the fourth and last iteration and felt that the quality of
our code improved as a result.
Bug fixes: By the third iteration, a substantial fraction of our time was
spent on resolving bugs from previous iterations. We found that we had to take
this into account when estimating our velocity. Part of the problem was that our
client was new to the idea that we could throw away work when we realized, after
it had been completed, that it should be done differently. They saw these issues
as bugs -- we saw them as new stories in the making.
The ''green book'' (see Reference 2) suggests that significant bugs should
become stories in future iterations. We probably should have tried harder to
convince our client that this was the right course to follow -- there's a
natural tendency for the client to feel that bug fixes are ''owed'' to them.
One approach to bug fixes that worked quite well was to have one pair at the
start of each new iteration working on cleaning up significant bugs -- those the
customer had decided definitely needed immediate attention. At times we had
significant dependencies on one or two of the tasks in the new iteration.
Especially in that situation, we found that it was an efficient use of our
developer resources to have one pair working on bug fixes while these
foundational tasks were tackled by others.
Overall, we were not satisfied with our handling of bug fixes during this
project -- we wanted to convert them into stories, but our customers always felt
that they were ''owed'' bug fixes as part of the previous iteration, above and
beyond our work on new stories.
Acceptance testing: One thing we realized was the importance of a full-time
tester to keep the developers honest. When we did not have a full-time tester
for the first iteration, we got 90% of every story done, which made the client
very unhappy during acceptance tests -- they perceived that everything was
broken. We also found that our tester provided an impartial source of feedback
to the developers on their progress.
We made one modification to our process specifically to facilitate testing.
Our tester felt burdened by having to ask the developers to interrupt their
paired development tasks whenever she needed a significant chunk of their time.
So we decided to assign the role of test support to a specific developer for
each week of the project.
Because we had multiple customer representatives, we found that a couple of
times one customer helped create the acceptance tests, but a different one went
through them with our tester at the end of the iteration. This necessitated many
delays for explanation during the acceptance testing and some confusion over
whether acceptance tests had been correctly specified. We concluded that in the
future we would strongly push for one customer to both help create and approve
the acceptance tests.
Unit testing: Our unit tests proved invaluable in ensuring the quality of our
code -- we found on numerous occasions that refactorings in one area of the code
caused side-effects elsewhere that we caught with our unit tests. Because we
relied so heavily on the unit tests and ran them so often, a full run eventually
took almost two minutes. Our first response was
to do some refactoring to reduce this time. We then made use of the flexibility
of Apache Ant's XML configuration to sometimes run only a specified subset of
all the unit tests.
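We no longer have that build file, but the subset-selection idea can be sketched as an Ant target; the `project.classpath` reference, the `compile` target, and the `test.pattern` property below are illustrative, not from the original project:

```xml
<!-- Sketch only: property, target, and classpath names are hypothetical. -->
<!-- The default pattern runs everything; override it on the command line,
     e.g.  ant -Dtest.pattern=**/engine/*Test.java run-tests -->
<property name="test.pattern" value="**/*Test.java"/>

<target name="run-tests" depends="compile">
  <junit printsummary="on" haltonfailure="no">
    <classpath refid="project.classpath"/>
    <batchtest>
      <fileset dir="src" includes="${test.pattern}"/>
    </batchtest>
  </junit>
</target>
```

Because the pattern is an ordinary Ant property, a pair working inside one package can rerun just that package's tests in seconds while the full suite still runs before integration.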
During one iteration, we implemented a story that required a multithreaded
producer-consumer engine that was difficult to test using JUnit. We created
pluggable stubs for each module of the engine so we could test any one module
while simulating the functionality of the other modules.
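The pluggable-stub idea can be sketched as follows; all names here are illustrative, since the original engine's interfaces are not described. The module under test depends only on a small interface, so a deterministic, single-threaded stub can stand in for the real multithreaded producer in a unit test:

```java
// Sketch of the pluggable-stub pattern; names are hypothetical.
interface Producer {
    String next(); // returns null when no more items are available
}

// The real producer would pull items from a queue fed by worker threads.
// This stub hands out a fixed list of items on a single thread, making
// the consumer's behavior deterministic and easy to assert on.
class StubProducer implements Producer {
    private final String[] items;
    private int index = 0;

    StubProducer(String... items) { this.items = items; }

    public String next() {
        return index < items.length ? items[index++] : null;
    }
}

// The module under test never knows whether its input is real or stubbed.
class Consumer {
    static int countProcessed(Producer producer) {
        int processed = 0;
        for (String item = producer.next(); item != null; item = producer.next()) {
            processed++; // real code would transform or route the item here
        }
        return processed;
    }
}
```

A JUnit test can then exercise `Consumer` against a `StubProducer` without starting any threads.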
Metrics: As a means of encouraging the creation of unit tests, we wrote a
simple script that traversed our source tree daily and sent e-mail with details
of unit tests written, organized by package and class.
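The article doesn't reproduce the script itself, so the following is only a sketch of the traversal step, written in Java; the filename convention and class name are assumptions, and the e-mail and per-package reporting are omitted:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch: inventory unit-test classes by filename convention.
class TestCounter {
    // Assumed convention: test classes are named *Test.java.
    static boolean isTestFile(String fileName) {
        return fileName.endsWith("Test.java");
    }

    // Walk the source tree and count test files; a full script would also
    // group the results by package and class and mail them to the team.
    static long countTestFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> isTestFile(p.getFileName().toString()))
                        .count();
        }
    }
}
```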
We also used JavaNCSS (distributed under the GNU GPL), which generates
global, class, and function-level metrics for the quality of code. We
automatically generated these metrics daily and wrote the results to the project
wiki to help us determine what parts of the code were ripe (smelly?) for
refactoring and whether test coverage was adequate.
In addition to these automatically generated metrics, our tester manually
created a graph of acceptance-test status, showing tests written, run, passed,
and failed. This information was available on the development room
whiteboard with the tasks and status of the project. A snapshot of the current
state of the project was thus available on the whiteboard, while a more detailed
view of the project could be found on our project wiki.
The grade card: After each iteration, we graded ourselves on the main XP
practices and a few other aspects of the process that we felt were important
(tester-to-developer communication, clarity of the stories, accuracy of
estimates). The scores showed us the areas where the development team needed to
focus, and provided some useful and immediate feedback into the process. They
served as a check against our sacrificing the long-term benefits of sticking
with the process for the short-term benefits of churning out more code. We found
that our scores improved substantially with each iteration, with the lowest
grade in the final iteration being a B-. We made these grade cards available
publicly on the wiki, although we did not invite the customer into the grading
process. We will consider that step for future projects, at least after a couple
of iterations, when some trust has developed between developers and clients.
Object-oriented databases: We were fortunate to be able to use an
object-oriented database management system (OODBMS), rather than a traditional
relational database management system (RDBMS), which enabled us to treat the
domain model as identical to the persistence model and therefore to be agile
when refactoring the code. It's much more difficult to refactor the model when
the data representation cannot be changed at will.
Documentation: One aspect of XP that we had to rethink in our
circumstances was the amount of documentation that was necessary. Because the
client developers would be responsible for maintenance and enhancements, we
needed more documentation than for an in-house project. So we put together an
overview of the design along with some automatically generated UML diagrams and
made sure that all our classes had Javadoc comments. We also added some
installation and release documents and a developer FAQ. For most outsourcing
situations, this level of documentation is probably necessary.
The final iteration was completed on Wednesday of the seventh week, with the
contract concluded on Friday. As delivery day approached, we noticed how
different this was compared with past experiences at other companies. The
developers all left before sunset on the day before D-day, and the atmosphere
during the last couple of days was relaxed and cordial, even celebratory. For
our final handoff to the client, we followed Alistair Cockburn's recommendation
to videotape a design discussion. We pushed all our code and documentation over
to their CVS repository, burned a CD containing the code and documentation, and
celebrated with beer and foosball.
All in all, we were pleasantly surprised with the way our customers (both the
developers and the business people) embraced XP during this project. They had
some knowledge of XP when we started and were eager to learn more. The CTO had
previously worked on a similar application and had some definite ideas on how to
handle the complexities of the project, but was still receptive to an
incremental and iterative approach. We found that XP worked well when dealing
with our technically sophisticated customers. Rather than going around in
circles when we disagreed, we could prove (or disprove) our design ideas using
the concrete feedback of our code.
What to do next
In retrospect, though a couple of our customers
did read the white book, we felt that it would have been useful if we had
created a brief ''XP owner's manual,'' perhaps a few pages long. Such a document
would include some items intended to educate the customer, such as the following:
* Planning game -- story creation, deferring in-depth discussion of each
story until it is selected or needs clarification to be estimated.
* Acceptance tests -- expected customer input; what happens at the end of an
iteration.
We would use other items as a checklist for discussion, such as the following:
* Bug fixes -- prioritizing bugs, possibly assigning them as stories.
* Documentation and status -- determining how much documentation and
reporting is essential.
We also found that many aspects of our process worked better as we developed
more trust in our relationship with our customers. With future projects, we
would like to focus on whether projects with repeat customers do indeed bear out
this trend.
References
1. Beck, Kent. ''Extreme Programming Explained.'' Reading, Mass.: Addison-Wesley, 2000.
2. Beck, Kent and Martin Fowler. ''Planning Extreme
Programming.'' Boston: Addison-Wesley, 2001.
We thank Kevin Blankenship and John Sims for making the
Tensegrent experience possible.
Natraj Kini is a founder of Agile Development, a Denver-based software
engineering firm specializing in the XP methodology. He can be contacted via
e-mail at firstname.lastname@example.org. Steve Collins is
a founder of and senior architect for Agile Development. He can be contacted at