The Agile Architect
Dealing with Technical Debt: An Agile Horror Story
It always seems like the right time to incur technical debt but never the right time to pay it off. So don't incur it or this could happen to you...
- By Mark J. Balbes, Ph.D.
Allow me to brag for a minute. I have a great team. They work hard and accomplish miracles for our customers. We have developed a product base that is extremely adaptable and reusable so that new value can be rapidly added to the product without much effort. Although we are continuously changing and improving our processes and our technical skills, we move smoothly from story to story, knocking them down like dominoes. Everyone on the team gets along. We have a lot of group conversations to share ideas. While we don't always see eye to eye, we trust each other to do the right thing. Yes, life is ideal.
Or rather, it was. Before "it" arrived.
Suddenly, we were at each other's throats. We couldn't pair effectively because we couldn't agree on a direction. We thrashed on stories, not getting traction. We couldn't even figure out how to organize our ideas in order to make cohesive stories. Our team was breaking down, tension was high and the term "war room" was beginning to take on literal meaning.
"What happened?" you ask. Here's the dark tale…
Our software is built on a pluggable architecture based on the Eclipse Rich Client Platform. Third-party vendors can drop new tools into our application and, as long as they follow our interface guidelines, they can use our data, influence how our tools work and provide entirely new functionality. We've worked with vendors very successfully to provide software that combines our expertise and theirs into a seamless software experience for our mutual end users. But it's not always a happy story.
In this particular story, we were working with a vendor that followed a non-agile path. A single software developer was assigned to design and create the software; we were only peripherally involved. As we are all inclined to do when working by ourselves, this developer kept almost everything in his head. With only one developer, pair programming couldn't happen. He didn't write automated tests or even manual tests, so test-driven development was out the window. There was neither a continuous integration system nor even an automated build.
If you read my last column ("Project Management Is Not Enough: 4 Crucial Agile Techniques"), you'll notice a familiar pattern. This company didn't use any of the critical software disciplines needed to create high quality agile software. And, of course, it got the expected results: The software was brittle, buggy and hard to modify. In fact, the software was rewritten from scratch multiple times as new requirements were levied against it because it was cheaper to recreate it than to figure out how to modify it.
With the software continuing to deteriorate after multiple years of development (and redevelopment), we were handed the code and asked to make it work. In two months. (No, I'm not exaggerating.)
And that's when our team started to break down. It took us about a week to realize what was going on. We had inherited a huge amount of technical debt.
Technical debt is the work that should have been done on a project but wasn't. Often this is a very deliberate decision. For example, if there is a looming release deadline, it may make sense to defer some activities until after the release.
Technical debt can take many forms, including deferring bug fixes or skimping on automated testing. It can also mean waiting to restructure or refactor the code until after a release. These activities buy you time in the short term but cost you in the long term. A disciplined team (agile or not) may decide to incur technical debt but will then pay it off as quickly as possible. Undisciplined teams may never pay off their technical debt.
When technical debt is not paid off, it accumulates, and just like with financial debt, you accumulate interest. Didn't fix that bug? Now it's out in the wild and your users are calling you about it. Time to put aside the new features you are developing and get that patch out! Deferred a needed refactoring? Now your developers are struggling to add a new feature that should have been easy. And what was that brilliant refactoring idea you had, anyway?
When technical debt becomes too much, when the maintenance costs skyrocket and new features are nearly impossible to create, it's time to scrap the software and write a new application. This happens to every project eventually, but the longer you can delay it, the more return you get on your investment in the software. In the story above, the technical debt built up so fast that the product was dead before it was completed.
And there is a human cost to technical debt, too. It leads to discord in the team. When you are working around problems, when it becomes harder and harder to do your work for fear of breaking the existing code, when every solution you try hits another bug that stops you, life becomes frustrating. It's easy to blame others on the team for the problems you are facing.
And that's what happened to us. We inherited code that had no automated tests, no manual tests, no documentation, dead code, spaghetti code and more undocumented bugs than we could count. (Who are we kidding? They were all undocumented.) Now that's a lot of technical debt. And we almost instantly turned on each other, each of us trying to make progress, pulling in different directions, stepping on each other's code and becoming extremely frustrated.
The road to recovery was not easy. We recognized we had a problem. They say that's the first step.
We took the approach of having one pair decouple the code into three major sections. From there, we used a technique of divide and conquer. Some sections we wrapped with automated tests and refactored. Others we simply rewrote. We continued to fix bugs as we found them. And of course, before any of this, we added the software to our continuous integration server and automated build.
Our strategy was to pay off the technical debt as needed. While it would have been satisfying to scrap the existing code and start from scratch, this would have thrown out the good with the bad and left us with no solution until we completed the rewrite. By attacking the problem in pieces, we could ship an improved (albeit not completely polished) version at any time.
As we worked on improving and stabilizing the software, we were, of course, writing automated tests for everything we could. It was not always easy since the code was not written with automated testing in mind. The book "Working Effectively with Legacy Code" by Michael Feathers is an excellent reference for techniques on how to deal with code that was written without tests. That's his definition of legacy code: code with no automated tests. According to that definition, you may be writing new legacy code today.
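One technique from that book, the characterization test, shows what wrapping a section with automated tests can look like in practice: you record what the code does today, pin it with a test, and only then refactor. The sketch below is purely illustrative and is not taken from our product. The functions legacy_shipping_cost and shipping_cost, the pricing rules and the recorded values are all hypothetical, and it is written in Python for brevity.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for inherited code: untested and hard to read.
    c = 0
    if weight_kg > 0:
        if weight_kg <= 5:
            c = 4
        else:
            c = 4 + (weight_kg - 5) * 2
        if express == True:
            c = c * 2
    return c


def shipping_cost(weight_kg, express):
    # Hypothetical refactoring, meant to behave the same as the legacy
    # function for numeric weights and a boolean express flag.
    if weight_kg <= 0:
        return 0
    base = 4 + max(0, weight_kg - 5) * 2
    return base * 2 if express else base


# Outputs recorded by running the legacy function, not taken from a spec.
# They describe what the code does today, right or wrong.
RECORDED_BEHAVIOR = [
    ((0, False), 0),
    ((3, False), 4),
    ((5, True), 8),
    ((7, False), 8),
    ((7, True), 16),
]


class CharacterizationTest(unittest.TestCase):
    def test_legacy_behavior_is_pinned(self):
        # Step 1: lock in the current behavior before touching the code.
        for args, expected in RECORDED_BEHAVIOR:
            self.assertEqual(legacy_shipping_cost(*args), expected)

    def test_refactoring_preserves_behavior(self):
        # Step 2: the cleaned-up code must reproduce the recorded outputs.
        for args, expected in RECORDED_BEHAVIOR:
            self.assertEqual(shipping_cost(*args), expected)
```

Saved as a module, the tests can be run with python -m unittest. The recorded values make no claim about what the code should do, only about what it does now, so a failure after a refactoring means the behavior changed, and the team can then decide whether that change was intended.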
With the divide and conquer technique, an interesting phenomenon occurred. The areas of the software that were buggy quickly received a lot of attention, and thus a lot of tests were written for them. As we added new functionality, we wrote new tests. What was left with no tests was only the original code that was stable, not buggy and didn't need to be modified to accommodate new functionality. Over time, that shrank to nothing.
And so, step-by-step, little by little, we got the software under control. The horror receded and life in our war room returned to normal.
If you've been paying attention, you'll realize that I glossed over the details of how my team recognized the source of our confrontations. Reflection and continuous improvement are important aspects of an agile team. Let's talk about that next time.