News

What Makes the Most Successful Software Engineering Teams? CircleCI Report Suggests Leading Performance Indicators

In its recently published analysis of data from millions of workflows on its namesake Continuous Integration and Delivery (CI/CD) platform, CircleCI has identified a set of benchmarks it claims are routinely met by the highest performing engineering teams.

In its "2022 State of Software Delivery Report, the CircleCI researchers found that the most successful team:

• Prioritized being in a state of deploy-readiness, rather than the number of workflows run
• Kept their workflow durations between five and ten minutes on average
• Recovered from any failed runs by fixing or reverting in under an hour
• Had success rates above 90% for the default branch of their application

"To achieve top-performing status costs time and money but it’s clear that more organizations are realizing that it’s worth the investment," wrote the report's author, Ron Powell, the company's manager of marketing insights and strategy. "Business leaders that outfit their teams with the most performant and powerful tools allow their software teams to be engines of innovation, unlocking new ways for their entire company to operate more effectively and opportunities to get better products to customers sooner."

The report also defined a set of baseline metrics for engineering teams to target to deliver software at scale: Duration (the length of time it takes for a workflow to run), Mean Time to Recovery (the average time between a workflow’s failure and its next success), Throughput (the average number of workflow runs per day), and Success Rate (the number of successful runs divided by the total number of runs over a period of time).
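
For illustration (these numbers are hypothetical, not drawn from the report): a team whose pipeline ran 300 times over a 30-day month has a Throughput of 10 runs per day; if 270 of those runs succeeded, its Success Rate is 270/300 = 90 percent; if a typical run finishes in eight minutes, that is its Duration; and if each failed run was followed by a passing run 45 minutes later on average, its MTTR is 45 minutes.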

Duration: "Our industry-leading benchmark for duration is 10 minutes because it’s essential to maximize the amount of information you can get from your pipeline while still moving as quickly as possible," the report's author wrote. "10 minutes is where we feel developers can move fast without losing focus and will benefit from the volume of information generated through their CI pipelines — it’s the optimal time for fast feedback, robust data, and speed."

To reduce Duration, the report recommends (a config sketch follows the list):
• Using test splitting to split tests and take advantage of parallelism. Splitting tests by timing data is particularly efficient.
• Using Docker images made specifically for CI. Fast spin-up of lean, deterministic images for your testing environment saves you time.
• Using caching strategies that allow you to reuse existing data from previous builds and workflows.
• Using the optimal size machine to run your workflow. Larger jobs benefit from more compute and run faster on larger instances.
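
The report itself does not include configuration samples, but a minimal CircleCI config sketch along these lines shows how the four techniques fit together; the image tag, cache key, resource class, parallelism value, and test paths below are illustrative assumptions:

    # Hypothetical .circleci/config.yml sketch; all names and values here
    # are illustrative assumptions, not details from the report.
    version: 2.1
    jobs:
      test:
        docker:
          - image: cimg/python:3.12   # lean convenience image built for CI
        resource_class: large         # right-size compute for the job
        parallelism: 4                # fan the suite out across 4 containers
        steps:
          - checkout
          - restore_cache:            # reuse dependencies from previous builds
              keys:
                - deps-v1-{{ checksum "requirements.txt" }}
          - run: pip install -r requirements.txt
          - save_cache:
              key: deps-v1-{{ checksum "requirements.txt" }}
              paths:
                - ~/.cache/pip
          - run:
              name: Split tests by timing data and run in parallel
              command: |
                mkdir -p test-results
                TESTS=$(circleci tests glob "tests/**/test_*.py" | circleci tests split --split-by=timings)
                python -m pytest --junitxml=test-results/junit.xml $TESTS
          - store_test_results:       # feeds timing data back to the splitter
              path: test-results
    workflows:
      build-and-test:
        jobs:
          - test

Splitting by timing data depends on test results stored from earlier runs, which is why the sketch uploads JUnit output at the end of the job.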

Mean Time to Recovery (MTTR): This metric is the most important on the list, the author says. "The ability of your team to recover as quickly as possible when an update fails, time and time again, is the ultimate goal of Agile development teams," he wrote.

To lower your MTTR, the report recommends:
• Optimizing duration first.
• Using tooling that supports the rapid identification of failure information through the UI and through messaging tools such as Twilio, Slack, and PagerDuty, which allow the user to be notified as soon as possible when a failure occurs (see the sketch after this list).
• Writing tests that include expert error reporting, which will help you quickly identify the problem when you go to fix it.
• Debugging on the remote machine that fails. "The ability to SSH (Secure Shell Protocol) onto the failed machine of a workflow is massively helpful for an engineer who is still looking for clues as to why an error occurred. Rich, robust, and verbose log output is useful without access to the remote machine," Powell wrote.
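
As a sketch of the messaging tip above, CircleCI's Slack orb can notify a channel the moment a job fails; the orb version, test command, and context name below are assumptions for illustration:

    # Hypothetical sketch of failure alerting with CircleCI's Slack orb;
    # the orb version, test command, and context name are assumptions.
    version: 2.1
    orbs:
      slack: circleci/slack@4         # resolves to the latest 4.x release
    jobs:
      test:
        docker:
          - image: cimg/base:stable
        steps:
          - checkout
          - run: make test            # assumed test entry point
          - slack/notify:             # message the team the moment the job fails
              event: fail
              template: basic_fail_1
    workflows:
      build-and-test:
        jobs:
          - test:
              context: slack-credentials  # assumed context holding the Slack token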

Throughput: "Measuring your baseline Throughput and then monitoring for fluctuations will tell you more about the health of your development pipeline than aiming for an arbitrary Throughput number or comparing your stat to others. A particular number of deploys per day is not the goal but continuous validation of your codebase via your pipeline is.

To achieve optimal Throughput, the report recommends:
• Tracking your own changes and progress week-over-week, which is more valuable for organizations than comparing to industry standards. "Once your development patterns have been decided, your Throughput baseline can be measured and then observed for health and efficiency," Powell wrote.
• Prioritizing lean, Agile software development patterns that involve small, incremental changes to projects, with a full suite of automated tests that runs on every commit (sketched after this list).
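
A minimal, hypothetical config illustrating that pattern: with no branch filters, the workflow runs on every push to every branch, so each small commit is validated and a Throughput baseline accumulates naturally. Job and command names are assumptions.

    # Hypothetical sketch: an unfiltered workflow runs on every push to
    # every branch, so each small commit is validated.
    version: 2.1
    jobs:
      test:
        docker:
          - image: cimg/base:stable
        steps:
          - checkout
          - run: make test            # assumed full automated test suite
    workflows:
      validate-every-commit:
        jobs:
          - test                      # no branch filters: every push triggers a run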

Success Rate: The ability to measure the Success Rate of your current workflows will be essential in establishing targets for your team, Powell said. "Remember, failed builds are not a bad thing, especially if you are getting a fast, valuable feedback signal, and your team can resolve issues quickly," he wrote.

To achieve an optimal Success Rate, the report recommends:
• Choosing a Git-flow model, such as short-lived feature branches or long-lived development branches, that allows your team to innovate without polluting the primary branch. This will keep your product stable and deployable (a sketch follows this list).
• Monitoring the Success Rate on these branches along with MTTR. "Low success accompanied by long MTTR is a sign that your testing output is not sufficient for debugging and resolving issues quickly," Powell wrote.
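
A hypothetical sketch of such a setup runs the test suite on every branch, including short-lived feature branches, while deploying only from the primary branch; job names, commands, and the branch name are assumptions, not from the report:

    # Hypothetical sketch: tests run on every branch, deploys only from the
    # primary branch.
    version: 2.1
    jobs:
      test:
        docker:
          - image: cimg/base:stable
        steps:
          - checkout
          - run: make test            # assumed test entry point
      deploy:
        docker:
          - image: cimg/base:stable
        steps:
          - checkout
          - run: make deploy          # assumed deploy step
    workflows:
      test-and-deploy:
        jobs:
          - test                      # runs on feature branches and main alike
          - deploy:
              requires:
                - test
              filters:
                branches:
                  only: main          # keep the default branch deployable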

"The bottom line is that the four metrics together provide a constant feedback loop to give you better visibility into your software development pipeline," he concluded. "Remember, the goal isn’t to make updates to your application; the goal is to constantly innovate on your software while preventing the introduction of faulty changes.

The report's conclusions were derived from an analysis of millions of workflows from thousands of organizations across hundreds of thousands of projects. In addition to meeting these four benchmarks, the report concluded that the most successful teams are larger and build extensive testing into their DevOps practice.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].