The Agile Architect

Want To Really Be Agile? Swarm!

The hackneyed phrase, "Good, fast, cheap -- choose two," is wrong in the agile world, where collaboration is more powerful than divide and conquer. Our Agile Architect explains how to increase productivity significantly while also increasing code quality with a simple process change.

My team made a breakthrough a couple of weeks ago that was so profound, yet so simple and obvious after the fact, that I wanted to share it with everyone. You see, for years, I've been saying that the team doesn't refactor enough. In our retrospective, we managed to peel back some of the layers of complexity and get to the core problem.

We don't refactor enough because we are not organized for it. Let me explain.

We manage our releases by creating sets of stories. Each set of stories makes up a single feature in the product. We isolate the work performed on each feature by creating a copy of the source code (a branch) for that feature in our configuration management tool, Git. When we finish work on a feature, all of its changes are merged back into the master version of the source code. In that way, work that is done for one feature doesn't affect others. This allows us, for example, to release our software while a feature is still in progress because its changes are not in the master version.

We go even further than this, making a branch of the code for each story. One pair works on the story and, when that story is completed, its changes are merged into the feature branch. If the story fails, we abandon the branch with no harm done. If a story doesn't get finished before a release, we can still decide to release the stories that have been completed for the feature.

If you know configuration management, then you probably read the above paragraph and thought, "Yeah! That's awesome." In my experience working with many teams and many companies over the years, dare I say decades, what I described above is the Holy Grail of configuration management and is considered a best practice by many.

So what was the problem? Glad you asked!

Picture this. You are a developer on my team. You and your pair just completed a story, so you step up to the Kanban board to figure out what to work on next.

Because there is no unassigned work in process, the rules of the board tell you that you should start work on a new story. So you look to the highest priority feature for a story. However, a different pair is working on a story in that feature already. You know that they are doing some foundational work for the feature, so you can't start on another story until they finish.

So what do you do? You look to the next highest priority feature on the board. It's in the same situation with two stories in development blocking everything else. Sighing, you go to the third highest priority feature. You're in luck! No one has started working on stories for this feature yet. Giddy, you grab the highest priority story card, move it to the development queue to indicate you are working on it, and head back to your development workstation.

Euphoria quickly fades as you start to investigate the code you need to modify. You realize that, despite your best efforts to work in a different part of the code from everyone else, there's a particularly nasty bit that you'd love to refactor to make your story easier to implement. The problem is that it's smack dab in the middle of code being used by the other stories in flight. If you change it, it's going to make merging your changes and theirs extremely difficult, with high risk that something will break and you'll have to rework the implementation. And worse, that merge won't happen for at least a week, when both features are completed and merged back to the master version.

So you do what we humans do. Despite knowing that you should make the code better, you work around it. It's efficient, expedient and there's always another opportunity to refactor the code later, right?

And at this point, you've utterly failed as an agile developer. You've made three critical mistakes:

  1. You started work on a new feature while more important features were not completed. It will take that much longer before any of the features are finished and ready to ship. Congratulations. You've just delayed the release.
  2. You haven't made the code better. You've probably made it worse by not refactoring when you should have. Your implementation for the story is sub-optimal and you've added technical debt to the project. And no, there will never be a right time to refactor that nasty bit of code. Mazel Tov. Not only have you delayed this release, but you've made it that much harder to get every future release out on time.
  3. While you were gnashing your teeth over your problems, your pairing partner died of boredom, was resurrected as a zombie, ate the rest of the team, and once again, your actions have delayed the release.

Okay, maybe you only made two critical mistakes.

Joking aside, this was precisely the scenario my team was struggling with. We could see that our code was getting worse despite the fact that we were supposed to be set up for success.

Swarm to the Rescue!
As we discussed these problems in the retrospective, the first key insight was to ask ourselves how we could constrain the number of features in flight at any given time. If the entire team could work on the same feature, then we could refactor on our story branch and, when the story is completed, everyone else would get the changes. That means we can get code changes propagated to the rest of the story branches in hours or days instead of days or weeks. But first we had to solve the dependency issue.

In order to all be working on the same feature without running into dependency problems, we have to all work on the same story, known colloquially as swarming. That means that the team has to discuss the story, divide it into tasks and have each pair work on a task. Close collaboration is incredibly important since we want to make sure we are all working toward the same goal.

In practice, we know that not everyone on the team will be able to work on the same story. So how do we share our code changes quickly? Instead of working on story branches, we will all work on the same feature branch. As each pair develops its story, it shares its changes as it goes. The benefit is that code changes are shared extremely quickly. The perceived disadvantage is that we can't release the feature until every story is done.
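
One way to see the difference is to count the merges a refactoring has to pass through before another pair can build on it. The sketch below is purely illustrative: the branch names and the `path_to_root` and `merge_hops` helpers are hypothetical, and it assumes a change travels between branches one merge at a time along parent and child links.

```python
# Illustrative model only: branch names and helpers are hypothetical.
# Assumption: a change moves between branches one merge at a time,
# following the parent/child links between branches.

def path_to_root(branch, parent):
    """Return the branch followed by its ancestors, ending at the root."""
    path = [branch]
    while branch in parent:
        branch = parent[branch]
        path.append(branch)
    return path


def merge_hops(src, dst, parent):
    """Count the merges needed for a change made on src to reach dst."""
    up = path_to_root(src, parent)
    down = path_to_root(dst, parent)
    # First branch on the way up from src that dst also descends from.
    common = next(b for b in up if b in down)
    return up.index(common) + down.index(common)


# Story branches hang off feature branches, which hang off master.
parent = {
    "story-1": "feature-a",
    "story-2": "feature-a",
    "story-3": "feature-b",
    "feature-a": "master",
    "feature-b": "master",
}

across_features = merge_hops("story-1", "story-3", parent)    # through master
within_feature = merge_hops("story-1", "story-2", parent)     # through feature-a
shared_branch = merge_hops("feature-a", "feature-a", parent)  # same branch

print(across_features, within_feature, shared_branch)
```

In this toy model, a refactoring made on a story branch needs four merges to reach a story in another feature, and two of those wait for entire features to finish. Two pairs on story branches of the same feature are two merges apart. Pairs committing to the same feature branch are zero merges apart, so the only delay is how often they share and pick up each other's changes.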

So we tried it and an interesting thing happened. It worked!

We had a very difficult story to implement. Rather than having one pair work on it, the whole team worked together. We discussed our progress constantly. We refactored and immediately shared the changes with everyone else. We made changes to the code that we'd wanted to make for months. Tasks were created and just as quickly abandoned as we recognized them as red herrings. The war room was filled with energy and excitement. In the end, we completed the story in a couple of days rather than a couple of weeks. (I told you it was a hard story!) More importantly, our code was better than when we started and our feature could be delivered that much quicker.

This experience led us to change our development process only slightly, but the change has fundamentally altered how we work. The rules now are:

  • Each feature is developed on its own branch.
  • As many developer pairs as possible work on the most important story in process.
  • Each story is developed directly on the feature branch, allowing stories to share code changes.
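
To make the contrast with the old board rule concrete, here is a minimal sketch of how a free pair might pick its next story under each approach. Everything in it is an assumption for illustration: the `Story` and `Feature` classes, the function names, and the simplification that any story in progress blocks the rest of its feature under the old rule.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    name: str
    in_progress: bool = False


@dataclass
class Feature:
    name: str
    stories: list = field(default_factory=list)  # highest priority first


def next_story_isolated(features):
    """Old rule (simplified): skip any feature that already has a story
    in progress and start the top story of the first untouched feature."""
    for feature in features:
        if any(story.in_progress for story in feature.stories):
            continue
        if feature.stories:
            return feature.name, feature.stories[0].name
    return None


def next_story_swarm(features):
    """Swarm rule: join the most important story already in progress.
    Only if nothing is in progress, start the top story overall."""
    for feature in features:
        for story in feature.stories:
            if story.in_progress:
                return feature.name, story.name
    for feature in features:
        if feature.stories:
            return feature.name, feature.stories[0].name
    return None


# The board from the scenario above: the two most important features
# each have a story in progress, and the third is untouched.
board = [
    Feature("feature-a", [Story("a1", in_progress=True), Story("a2")]),
    Feature("feature-b", [Story("b1", in_progress=True), Story("b2")]),
    Feature("feature-c", [Story("c1"), Story("c2")]),
]

isolated_choice = next_story_isolated(board)
swarm_choice = next_story_swarm(board)
print(isolated_choice, swarm_choice)
```

Under the old rule, the free pair ends up on the third most important feature. Under the swarm rule, it joins the story already in progress on the most important one.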

Final Thoughts
What we've done is nothing new. It's basically a hybrid of the decades-old "stable trunk" and "unstable trunk" methodologies, where we now have a stable master and unstable feature branches. So if we've known about and used these methodologies for years, where did we go wrong?

Our configuration management tools have led us down the wrong path. They make it so easy to develop code in isolation that they imply that we should develop code in isolation. Nothing could be further from the truth. Lean principles tell us that we want to complete stories and features as fast as possible. That means using the entire team on a single story as though they are of one mind. And the only effective way to do that is to swarm!

About the Author

Dr. Mark Balbes is Chief Technology Officer at Docuverus. He received his Ph.D. in Nuclear Physics from Duke University in 1992, then continued his research in nuclear astrophysics at Ohio State University. Dr. Balbes has worked in the industrial sector since 1995, applying his scientific expertise to the disciplines of software development. He has led teams ranging from a few software developers to a multi-national Engineering department with development centers in the U.S., Canada, and India. Whether serving as product manager, chief scientist, or chief architect, he provides both technical and thought leadership around Agile development, Agile architecture, and Agile project management principles.