As an engineering leader, your role is to build a high-performing team that can scale, and that gets harder as the team grows. Identifying your bottlenecks is a good first step toward creating a productive environment and building a culture of continuous improvement.
Unfortunately, bottlenecks are not always easy to spot. Your gut may be telling you where the problem is, but even if you're right, how can you get an action plan rolling without data? What if there are other hidden bottlenecks? What if you are wrong?
The best way to identify bottlenecks is by having in-depth, end-to-end visibility of your software delivery pipeline. I'm talking about actionable data and metrics.
By separating the five phases of the delivery pipeline, Athenian highlights any issues that may be reducing the speed and quality of your team's output. And today, I'm going to show you how you can use these metrics to pinpoint the bottlenecks in your engineering process.
We asked 50 engineering leaders: "At what stage of the software engineering pipeline do you find the most bottlenecks?" More than 50% said "Review." So let's explore an example of how you can dig deeper using data and figure out what's causing issues in your Review Time.
💡 You can use this process for any stage or issue in your software delivery pipeline. You can replace Review Time with MTTR, Merge Time, etc. It's all about asking the right questions and looking for the right metrics to answer them.
Step 1: Investigate Your Suspicions
Business is booming, so you added 20 new people to your engineering team. You expect productivity to rise and more features to be released faster. However, after a month, you notice the opposite has happened.
Your gut tells you that there’s an issue in the review time - you heard a few engineers mention that they were waiting on code reviews in the sprint retrospective. So you decide to check the Review Time.
That’s a great place to start:
Your hunch was right: your Review Time increased by 39% last week. But this doesn’t tell you where the bottlenecks are. So let’s look further into this.
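If you want to sanity-check a number like this against your own pull request data, the week-over-week comparison can be sketched in a few lines. This is only an illustration: the record fields (`review_requested_at`, `approved_at`), the sample dates, and the definition of review time as "review request to approval" are assumptions made for the example, not how Athenian computes the metric.

```python
from datetime import datetime, timedelta
from statistics import mean

def review_hours(pr):
    """Hours from review request to approval (an assumed definition of Review Time)."""
    return (pr["approved_at"] - pr["review_requested_at"]).total_seconds() / 3600

def week_over_week_change(prs, week_start):
    """Percentage change in mean review time for the week starting at
    week_start, compared with the seven days before it."""
    week_end = week_start + timedelta(days=7)
    prev_start = week_start - timedelta(days=7)
    current = [review_hours(p) for p in prs
               if week_start <= p["review_requested_at"] < week_end]
    previous = [review_hours(p) for p in prs
                if prev_start <= p["review_requested_at"] < week_start]
    if not current or not previous:
        return None  # not enough data to compare
    baseline = mean(previous)
    return (mean(current) - baseline) / baseline * 100

def make_pr(requested, hours):
    """Build a hypothetical PR record that took `hours` to get approved."""
    return {"review_requested_at": requested,
            "approved_at": requested + timedelta(hours=hours)}

# Made-up sample data: two PRs in the previous week, two in the current one.
this_week = datetime(2024, 1, 8)
sample_prs = [
    make_pr(datetime(2024, 1, 2, 9), 10),
    make_pr(datetime(2024, 1, 4, 9), 10),
    make_pr(datetime(2024, 1, 9, 9), 13),
    make_pr(datetime(2024, 1, 11, 9), 15),
]
change = week_over_week_change(sample_prs, this_week)
print(f"Review time changed by {change:.0f}% week over week")
```

A single percentage like this confirms the trend but, as noted above, says nothing yet about where the slowdown comes from.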
Step 2: Branch Out Your Analysis
Check if this issue is linked to a specific repository:
As you can see, one of your repositories, “precomputer”, has an average review time that is significantly longer than the others.
Then, check if the Review Time issue is linked to a specific team:
You notice that the Product Team has been spending more time in Review compared to the others.
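The same breakdown can be approximated on raw data by grouping review times by repository and by team, then flagging any group that stands well above the rest. The sketch below makes several assumptions: the record fields, the "api" and "web" repositories, the "Platform" and "Data" teams, all the hour values, and the 1.5x-the-median threshold are hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict
from statistics import mean, median

def mean_review_hours_by(prs, key):
    """Average review time per group, where key is 'repo' or 'team'."""
    groups = defaultdict(list)
    for pr in prs:
        groups[pr[key]].append(pr["review_hours"])
    return {name: mean(hours) for name, hours in groups.items()}

def flag_slow_groups(averages, factor=1.5):
    """Groups whose average exceeds factor x the median group average.
    The factor is an arbitrary assumption; tune it to your own data."""
    threshold = factor * median(averages.values())
    return sorted(name for name, avg in averages.items() if avg > threshold)

# Made-up sample records.
prs = [
    {"repo": "precomputer", "team": "Product", "review_hours": 30},
    {"repo": "precomputer", "team": "Product", "review_hours": 26},
    {"repo": "api", "team": "Platform", "review_hours": 8},
    {"repo": "api", "team": "Product", "review_hours": 12},
    {"repo": "web", "team": "Data", "review_hours": 9},
    {"repo": "web", "team": "Platform", "review_hours": 7},
]

by_repo = mean_review_hours_by(prs, "repo")
by_team = mean_review_hours_by(prs, "team")
print("Slow repositories:", flag_slow_groups(by_repo))
print("Slow teams:", flag_slow_groups(by_team))
```

Comparing against the median rather than the overall mean keeps one very slow group from raising the bar it is measured against.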
Step 3: Dig Deeper & Compare
Now it’s time to identify what has led to this increase in Review Time by looking at the right leading indicators and asking the right questions.
You know that Review Time is linked to several leading indicators. Checking metrics that are linked to your KPIs is crucial. It will lead to a well-founded diagnosis and help determine exactly what to measure when you decide what your next steps are.
💡 From our experience, this is where most engineering leaders fail to leverage engineering metrics. Looking at DORA metrics is a great first step, but where do you start if your deployment frequency is decreasing? You need to dig deeper into the root causes that you have an impact on.
So, what are the leading indicators for Review Time?
1. “Wait Time for 1st Review” - Are we spending more time waiting to start a review?
There was an increase in “Wait Time for 1st Review” in the last month!
2. Code Reviews / Pull Request - Are we having more reviews per PR?
This doesn’t seem to be the issue. Although there was an increase in one of the PRs, the metric has stayed mostly below the threshold.
3. Review Comments / Pull Request - Are we having more comments per PR?
Review comments per pull request increased over the period but have come back down in the last couple of days.
4. Participants / Pull Request - Are we having more participants per PR?
The number of participants has definitely increased, but this is expected since the team has grown, and it’s a great way to onboard new hires.
Overall, it looks like Comments per PR and Participants per PR increased, but that's not a bad indicator. These are symptoms of a growing team. Regardless, you can now try to optimize these items, and particularly focus your efforts on improving Wait Time for 1st Review.
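The four questions above can be approximated on raw pull request data by computing each leading indicator for two periods and comparing them. The following is a rough sketch under stated assumptions: the field names, sample values, and the two "before" and "after" periods are hypothetical, and the definitions may differ from the ones Athenian uses.

```python
from datetime import datetime, timedelta
from statistics import mean

def leading_indicators(prs):
    """Average of each leading indicator over a list of PR records."""
    return {
        "wait_for_first_review_h": mean(
            [(p["first_review_at"] - p["review_requested_at"]).total_seconds() / 3600
             for p in prs]),
        "reviews_per_pr": mean([p["reviews"] for p in prs]),
        "comments_per_pr": mean([p["comments"] for p in prs]),
        "participants_per_pr": mean([len(p["participants"]) for p in prs]),
    }

def percent_changes(before, after):
    """Percentage change of every indicator between two periods."""
    old, new = leading_indicators(before), leading_indicators(after)
    return {name: (new[name] - old[name]) / old[name] * 100 for name in old}

def make_pr(wait_hours, reviews, comments, participants):
    """Build a hypothetical PR record for the example."""
    requested = datetime(2024, 1, 1, 9)
    return {"review_requested_at": requested,
            "first_review_at": requested + timedelta(hours=wait_hours),
            "reviews": reviews,
            "comments": comments,
            "participants": set(participants)}

# Made-up data: the period before the team grew, and the period after.
before = [make_pr(2, 2, 3, ["ana", "ben"]), make_pr(4, 2, 5, ["ana", "cy"])]
after = [make_pr(6, 2, 5, ["ana", "ben", "dee"]), make_pr(12, 2, 7, ["cy", "dee", "eli"])]

changes = percent_changes(before, after)
biggest = max(changes, key=changes.get)
for name, pct in changes.items():
    print(f"{name}: {pct:+.0f}%")
print("Largest increase:", biggest)
```

With this sample data the wait for the first review grows the most, which mirrors the conclusion above. On real data, the largest increase is only a starting point for the conversation with the team, not a diagnosis by itself.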
Step 4: Take Action
Now it's time to align the team by shedding light on this issue and then drawing up a plan to optimize Review Time so this doesn't become a bigger problem later. A high-performing team should review bottlenecks like these regularly - this is what will allow your engineering org to scale.
We dove deeper into this approach in The Engineering Leader's Process for Continuous Improvement.