5 Common Issues In The Software Delivery Pipeline

Eiso Kant
July 12, 2021

Even High-Performance Organisations need to actively monitor their Software Delivery Pipeline to improve. But you don’t need to turn your team processes upside down, build highly scalable infrastructure or have 100% automated test code coverage for this. 

This article covers 5 quick but important wins that will help you improve your Software Delivery Pipeline and become a High-Performance Organisation, with guidance on how to use Athenian to spot them. 

We'll cover the problem, and then the action item with examples from High-Performance Organisations.

  1. Large Pull Request size
  2. Code review waiting time and flexible review policy
  3. Long Pull Request review discussions
  4. Merge blockers
  5. Keep a healthy ratio between new and released features

Large Pull Request size

First, let’s understand the effects of large Pull Requests and why you should avoid them. Even though there are cases where they can’t be completely avoided (e.g. removing deprecated code or updating an embedded library), these should always be exceptions to the rule. Large Pull Requests affect almost every aspect of the Software Delivery Pipeline:

  1. Work In Progress takes more days
  2. Code Reviews are ineffective 
  3. Merge and Release delays are introduced due to complexity

Taking into account the above, Pull Request size can’t just be measured by lines of code. You should structure your work so that:

  1. Each developer submits Pull Requests at least once per day
  2. Each Pull Request only includes code related to one meaningful feature
  3. Changes stay between 200 and 400 lines of code, the range where Code Reviews are most effective
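
As a rough illustration of the size guideline above, here is a minimal Python sketch that flags Pull Requests whose changed lines exceed the 400-line upper bound. The `PullRequest` record, its field names and the sample numbers are hypothetical; they are not part of Athenian or of any Git hosting API.

```python
from dataclasses import dataclass

# Upper bound taken from the 200-400 lines of code guideline above.
MAX_REVIEWABLE_LINES = 400


@dataclass
class PullRequest:
    """Hypothetical record; the field names are illustrative assumptions."""
    number: int
    lines_added: int
    lines_deleted: int

    @property
    def size(self) -> int:
        # Size here means total changed lines: additions plus deletions.
        return self.lines_added + self.lines_deleted


def oversized(prs: list[PullRequest], limit: int = MAX_REVIEWABLE_LINES) -> list[PullRequest]:
    """Return the Pull Requests whose changed lines exceed the limit."""
    return [pr for pr in prs if pr.size > limit]


# Made-up sample data for illustration only.
sample = [
    PullRequest(number=1, lines_added=120, lines_deleted=30),
    PullRequest(number=2, lines_added=900, lines_deleted=250),
    PullRequest(number=3, lines_added=8, lines_deleted=1),
]
print([pr.number for pr in oversized(sample)])  # prints [2]
```

Lines of code are only one signal here. As noted above, whether a Pull Request covers a single meaningful feature matters just as much.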

In this context, it’s important to understand how to structure work. At SquareSpace, work is structured into ‘vertical slices’, each of which is the smallest unit of work that ships meaningful functionality to the user. This can be a new feature, bug fix or usability improvement, and a Pull Request should only contain code related to that slice. A similar concept is described in the book Shape Up by the Basecamp team.

In practice, a review of 200 - 400 LOC over 60 to 90 minutes should yield 70 - 90% defect discovery. - Cisco

Work on the smallest unit of work that ships meaningful functionality to the user. - SquareSpace

Work In Progress, Review, Merge and Release time can all be affected by Pull Requests that are too large. With Athenian you can check how long each Pull Request takes to pass through each stage of the Software Delivery Pipeline. You can also see how many lines of code were affected, and other consequences of large Pull Requests such as review waiting time.

Athenian's Software Delivery Chart on Pull Request Size.

Code review waiting time and flexible review policy

It’s not unusual to see Pull Requests waiting a few days to be reviewed. After all, a proper code review takes time. But it’s not just the reviewer’s responsibility to jump on a review quickly. There are many things the author can do beforehand to make everyone’s life easier:

  1. Do a self code review
  2. Commit small and incremental changes
  3. Notify reviewers beforehand that a code review is coming
  4. Describe the purpose and motivation of the change
  5. Run tests before submitting the code for review

In High-Performance Organisations, finished work is prioritized over work that’s in progress. Code reviews, especially smaller ones, therefore shouldn’t wait for more than an hour. According to an analysis conducted at Athenian, more than 20 percent of Pull Requests affect fewer than 10 lines of code. This shouldn’t be neglected!

If code reviews are required, they should be performed synchronously: when the developer is ready to commit the code, they should ask somebody else on the team to review the code right then. They should not ask for an asynchronous review. - Google

Skipping code reviews is only advisable for trivial changes that do not change the logic such as commenting, formatting issues, renaming of a local variable, or stylistic fixes. -

With Athenian you can quickly spot Pull Requests blocked by not being reviewed on time. You can also check the average waiting time for your Pull Requests over a longer time period, to see if your team is making progress. 
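
To make the metric concrete, the sketch below computes the wait for first review for a handful of Pull Requests, averages it, and counts how many waited longer than the one-hour target mentioned above. The timestamps are invented for illustration, and the helper name `wait_for_first_review` is an assumption rather than an Athenian API.

```python
from datetime import datetime, timedelta
from statistics import mean

# Target from the text above: reviews should not wait more than an hour.
REVIEW_TARGET = timedelta(hours=1)


def wait_for_first_review(requested_at: datetime, first_review_at: datetime) -> timedelta:
    """Time between the review request and the first submitted review."""
    return first_review_at - requested_at


# Hypothetical (review requested, first review submitted) timestamp pairs.
reviews = [
    (datetime(2021, 7, 5, 9, 0), datetime(2021, 7, 5, 9, 40)),
    (datetime(2021, 7, 5, 11, 0), datetime(2021, 7, 6, 11, 0)),
    (datetime(2021, 7, 6, 14, 0), datetime(2021, 7, 6, 16, 20)),
]

waits = [wait_for_first_review(requested, reviewed) for requested, reviewed in reviews]
average_hours = mean(w.total_seconds() for w in waits) / 3600
late = [w for w in waits if w > REVIEW_TARGET]

print(f"average wait: {average_hours:.1f} h, over target: {len(late)} of {len(waits)}")
```

In this made-up sample a single Pull Request that waited a full day pulls the average up to 9 hours, which is why it can help to look at the individual outliers as well as the average.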

Athenian's Software Delivery Chart on Wait Time for First Review.

Long Pull Request review discussions

A single Pull Request can end up with 50+ comments and a review time of weeks. Sometimes a longer discussion really is needed to resolve the concerns that reviewers have raised, for example because of:

  1. A non-functioning feature
  2. A lack of familiarity with existing code
  3. A lack of rational discussion between authors and reviewers
  4. General confusion

Whatever the cause, the team’s output quality decreases as the merge decision is delayed and the number of messages exchanged increases. The best way to resolve long discussions is to spot them early. Many software development teams have adopted the following practices to avoid these discussions during code review:

  1. Discuss them offline (i.e. not through the forum)
  2. Focus code reviews around concepts rather than single lines of code
  3. Assign reviewers who wrote related code

Don’t hesitate to suggest a quick meeting (face-to-face or via VC). - Chromium

To enhance a great code review culture and maintain consistency, create a code review checklist. - Thoughtbot, Hadoop

Athenian helps you identify code review bottlenecks before they impact your Software Delivery Pipeline. You can quickly sort ongoing Pull Requests by time from request or by number of comments, which will identify those blocked or in need of close monitoring.
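
The two sort orders described above can be sketched in a few lines of Python. The tuples below are hypothetical open Pull Requests with invented numbers; in practice the data would come from your Git hosting provider or from a tool like Athenian.

```python
from datetime import datetime

# Hypothetical open Pull Requests: (number, review requested at, comment count).
open_prs = [
    (101, datetime(2021, 7, 1, 10, 0), 4),
    (102, datetime(2021, 6, 20, 15, 30), 57),
    (103, datetime(2021, 7, 8, 9, 15), 12),
]

# Most-discussed first: long threads are candidates for a quick call.
by_comments = sorted(open_prs, key=lambda pr: pr[2], reverse=True)

# Oldest review request first: these have been waiting the longest.
by_age = sorted(open_prs, key=lambda pr: pr[1])

print([pr[0] for pr in by_comments])  # prints [102, 103, 101]
print([pr[0] for pr in by_age])       # prints [102, 101, 103]
```

A Pull Request that tops both lists, like number 102 in this made-up sample, is a likely candidate for the offline discussion suggested above.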

How Athenian shows you your created Pull Requests.

Merge blockers

You may have heard the saying “Never deploy on a Friday”. That’s because every deployment carries the risk of shipping new issues along with it. It’s hard to know if a release is 100% bug-free!

As a consequence, some companies don’t trust everyone to merge code into the production branch. This accountability is assigned only to specific people, who do the final testing before updates are delivered to customers. The downsides of this are that:

  1. The merge process becomes a bottleneck reliant on specific people
  2. The delivered code is usually only manually tested
  3. Large amounts of code become released at the same time
  4. Accountability for problems is not distributed equitably

To overcome these problems, High-Performance Organisations use Automated Testing and Continuous Integration tools. The main advantages are:

  1. Distributing accountability amongst those who wrote the code
  2. Giving the ability to release new features on a daily basis
  3. Introducing many tools that monitor the production services
  4. Helping developers be comfortable with the code they write

To successfully introduce a Continuous Integration process into your team, follow these steps:

  1. Enable linter and other static analysis tools
  2. Start writing tests for critical parts of the codebase
  3. Add tools for proactive monitoring of the production environment
  4. Use a CI service to run automatic tests on every push to the main repository
  5. Have your team integrate changes every day
  6. Fix the build as soon as it breaks
  7. Write tests for every new story that you implement

A team that develops a system is also responsible for operating and supporting that system - Netflix

The whole process of merging an accepted Pull Request takes around 10 minutes end-to-end. - Intercom

With Athenian you can easily track which repositories take the longest to merge newly created features, as well as how many daily merges there are.

The Merge Time chart from Athenian.

Keep a healthy ratio between new and released features

If a team’s output starts to slow, the backlog will inevitably fill, leading to developers working on multiple tasks simultaneously. The main reasons teams become unproductive are:

  1. An overcommitment to too many products
  2. Too many separate tasks per engineer
  3. Too large chunks of work per engineer

A great metric to measure if your team members are mostly focused on one task at a time is the Pull Request Ratio Flow. If the ratio between opened and closed Pull Requests is way above 1.00, we can anticipate congestion as developers are multi-tasking.
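
As a minimal sketch, the ratio can be computed by dividing the number of Pull Requests opened in a period by the number closed in the same period. The function name `pr_ratio_flow`, the handling of periods with no closed Pull Requests, and the weekly counts are assumptions made for illustration.

```python
def pr_ratio_flow(opened: int, closed: int) -> float:
    """Ratio of opened to closed Pull Requests over the same period."""
    if closed == 0:
        # Assumption: with nothing closed, any opened work counts as
        # unbounded congestion, and an idle period counts as balanced.
        return float("inf") if opened > 0 else 1.0
    return opened / closed


# Hypothetical weekly counts: (opened, closed).
weekly = {"week 1": (20, 20), "week 2": (27, 15)}

for label, (opened, closed) in weekly.items():
    print(f"{label}: ratio {pr_ratio_flow(opened, closed):.2f}")
# week 1: ratio 1.00
# week 2: ratio 1.80
```

In this invented example, week 2 opens almost twice as many Pull Requests as it closes, which is the kind of value that signals work piling up.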

To keep a healthy ratio between opened and closed Pull Requests:

  1. Teams should commit to fewer product features at a time
  2. Engineering should commit their code daily, shortening the feedback loop
  3. Engineering should monitor and reduce the number of bugs over time

We also need to keep in mind that developers need to multitask between planning, developing, reviewing, and fixing bugs. If we interrupt any of these, it’ll increase the time to finish. As a consequence, work should be structured in smaller pieces to avoid the possibility of interruption and breaking focus. 

We constantly saw that the Work In Progress ratio per engineer was over 2. In three months we were able to bring it much closer to 1. - Soundcloud

With continuous delivery, engineers don’t have to wait a week or longer to get feedback about a change they made. - Facebook

With Athenian you can easily monitor your Pull Request Ratio Flow. If it is above 1, we recommend that you closely monitor your Review, Merge and Release times to avoid congestion.

The Pull Request Ratio Flow chart from Athenian.
