We all know that engineering metrics are essential in guiding our decisions.
As engineering leaders, we're responsible for product development and the engineering team's experience. Athenian provides dozens of relevant data points for engineering leaders, but figuring out where to start can be challenging.
These are the top 9 metrics we recommend looking at, so you can pick 2 or 3 to start monitoring and discussing with your engineering teams.
Once you've picked your metrics, here's how you can use them to set your engineering org up for success.
But before that, let's dive in!
1. PR Cycle Time
Definition
The time between a code change being made and that change running in production.
How
Athenian calls this metric PR Cycle Time; it is also known as "Lead Time". The data is obtained from GitHub and the customer's CI/CD system to understand how work is flowing. This data is then enriched with ticketing information.
Athenian mainly looks at how work flows, not how developers report it in the ticketing system. This gives customers full visibility of the entire development pipeline, from the first commit to code deployed in production, without disrupting development teams.
Athenian reports the customer's overall PR Cycle Time and breaks it down into the stages the code goes through, so teams can identify bottlenecks.
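To make the definition concrete, here is a minimal sketch of the calculation. It assumes you already have, for each PR, the time of the first commit and the time it reached production. The record layout, the sample data, and the choice of the median as the summary are illustrative assumptions, not a description of Athenian's implementation.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: first commit time and production deploy time per PR.
prs = [
    {"first_commit": datetime(2023, 1, 2, 9, 0), "deployed": datetime(2023, 1, 3, 9, 0)},
    {"first_commit": datetime(2023, 1, 4, 10, 0), "deployed": datetime(2023, 1, 4, 16, 0)},
    {"first_commit": datetime(2023, 1, 5, 8, 0), "deployed": datetime(2023, 1, 7, 8, 0)},
]


def cycle_time_hours(pr):
    """Hours between the first commit and the production deployment."""
    return (pr["deployed"] - pr["first_commit"]).total_seconds() / 3600


def median_cycle_time_hours(prs):
    """Median cycle time; less sensitive to a few very slow PRs than the mean."""
    return median(cycle_time_hours(pr) for pr in prs)


print(median_cycle_time_hours(prs))
```

Splitting the same interval into stages (for example review and release) works the same way: subtract the timestamps that bound each stage.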
Outcome
Improve delivery speed by identifying bottlenecks.
2. Deployment Frequency
Definition
The number of deployments to production.
How
The customer's deployment system calls the Athenian API on each deployment. From these calls, Athenian derives the deployment frequency for production or for any other environment relevant to the engineering teams (e.g., a staging environment).
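Once deployment events are recorded, the frequency is a simple count per time window. The sketch below groups hypothetical deployment records by ISO week and environment; the field names and sample data are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical deployment events, one per notification.
deployments = [
    {"env": "production", "day": date(2023, 1, 2)},
    {"env": "production", "day": date(2023, 1, 4)},
    {"env": "staging", "day": date(2023, 1, 4)},
    {"env": "production", "day": date(2023, 1, 10)},
]


def deployments_per_week(deployments, env):
    """Count deployments to `env`, keyed by (ISO year, ISO week)."""
    counts = Counter()
    for d in deployments:
        if d["env"] == env:
            iso = d["day"].isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


print(deployments_per_week(deployments, "production"))
```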
Outcome
Improve agility by identifying and increasing deployment throughput.
3. Change Failure Rate
Definition
The percentage of failed deployments to production.
How
Customers can notify the Athenian API every time they perform a deployment, allowing Athenian to report the success ratio of deployment activities. Athenian presents the success ratio instead of the failure rate.
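The success ratio and the failure rate are two views of the same data: one is simply 1 minus the other. A minimal sketch, assuming each deployment is recorded as a boolean outcome (a hypothetical representation):

```python
def success_ratio(outcomes):
    """Share of deployments that succeeded; `outcomes` is a list of booleans."""
    if not outcomes:
        raise ValueError("no deployments recorded")
    return sum(outcomes) / len(outcomes)


def change_failure_rate(outcomes):
    """Share of deployments that failed."""
    return 1 - success_ratio(outcomes)


outcomes = [True, True, False, True]  # hypothetical: 3 successes, 1 failure
print(success_ratio(outcomes), change_failure_rate(outcomes))
```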
Outcome
Increase deployment quality by monitoring the percentage of deployments that are not successful.
4. Mean Time to Recover
Definition
The mean time to resolve an incident in production.
How
Athenian counts the time it takes for customer bugs to be acknowledged by teams (MTTA) and sums it with the time it takes the teams to solve issues (MTTRepair).
These two measurements allow customers to have clear visibility of their SLOs from the moment issues are reported into the ticketing tool until they are solved and deployed to production.
To do this accurately, Athenian combines information from the ticketing system, GitHub, and the customer's CI/CD system.
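The sum described above can be sketched as follows. The incident records, field names, and timestamps are hypothetical; "resolved" here stands for the moment the fix is deployed to production.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incidents with the three timestamps the calculation needs.
incidents = [
    {
        "reported": datetime(2023, 3, 1, 9, 0),
        "acknowledged": datetime(2023, 3, 1, 10, 0),
        "resolved": datetime(2023, 3, 1, 14, 0),
    },
    {
        "reported": datetime(2023, 3, 2, 12, 0),
        "acknowledged": datetime(2023, 3, 2, 15, 0),
        "resolved": datetime(2023, 3, 2, 17, 0),
    },
]


def mean_hours(incidents, start, end):
    """Mean number of hours between two timestamp fields."""
    return mean((i[end] - i[start]).total_seconds() / 3600 for i in incidents)


mtta = mean_hours(incidents, "reported", "acknowledged")
mtt_repair = mean_hours(incidents, "acknowledged", "resolved")
mttr = mtta + mtt_repair
print(mtta, mtt_repair, mttr)
```

Because the two intervals are adjacent, the sum equals the mean time from report to resolution; keeping them separate shows whether time is lost before or after a team picks up the issue.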
Outcome
Increase agility by decreasing response time to resolve issues.
5. CI Velocity
Definition
The average time to run the test suite.
How
All GitHub Actions runs are tracked to let customers know the average run time of their tests and other CI activities. You can drill down into this information to identify improvement points in the CI system. For example, you can investigate the build run time to optimize it.
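Drilling down usually means averaging run time per job rather than across the whole pipeline. A minimal sketch, assuming run durations have already been exported as simple records (the job names and numbers are made up):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical CI run durations, in seconds.
runs = [
    {"job": "build", "seconds": 300},
    {"job": "build", "seconds": 420},
    {"job": "unit-tests", "seconds": 180},
    {"job": "unit-tests", "seconds": 240},
]


def average_seconds_by_job(runs):
    """Average duration per CI job, so the slowest jobs stand out."""
    by_job = defaultdict(list)
    for run in runs:
        by_job[run["job"]].append(run["seconds"])
    return {job: mean(durations) for job, durations in by_job.items()}


print(average_seconds_by_job(runs))
```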
Outcome
Accelerate delivery by identifying CI bottlenecks.
6. CI Quality
Definition
The flakiness and success ratio of the test suite.
How
Observe the results of test runs in GitHub Actions to understand how effective the tests are. Increase customer release confidence by understanding how reliable tests are (flakiness) and how many problems are caught before a release (failing checks).
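Both numbers can be derived from raw check results. The sketch below uses one common working definition of flakiness, stated here as an assumption: a check is flaky on a commit if it both failed and passed on that same commit. The record layout and data are hypothetical.

```python
from collections import defaultdict

# Hypothetical check results; the same commit can be run more than once.
check_runs = [
    {"commit": "a1", "check": "unit", "passed": False},
    {"commit": "a1", "check": "unit", "passed": True},   # rerun passed: flaky
    {"commit": "b2", "check": "unit", "passed": True},
    {"commit": "c3", "check": "unit", "passed": False},  # consistent failure
]


def ci_success_ratio(check_runs):
    """Share of check runs that passed."""
    return sum(r["passed"] for r in check_runs) / len(check_runs)


def flaky_checks(check_runs):
    """(commit, check) pairs that both passed and failed (assumed definition)."""
    seen = defaultdict(set)
    for r in check_runs:
        seen[(r["commit"], r["check"])].add(r["passed"])
    return sorted(key for key, results in seen.items() if results == {True, False})


print(ci_success_ratio(check_runs), flaky_checks(check_runs))
```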
Outcome
Speed up releases by monitoring quality of code.
7. Bug Resolution Ratio
Definition
The percentage of bugs solved versus bugs identified.
How
Athenian tracks all bugs submitted into the customer issue tracking system over time. This way, teams can understand how the bug backlog has been evolving, allowing them to properly manage the quality of the product.
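As a quick worked example, the ratio for a period is the number of bugs resolved divided by the number of bugs opened. The function below is an illustrative sketch with made-up counts; a ratio below 1 means the backlog grew during the period.

```python
def bug_resolution_ratio(opened, resolved):
    """Bugs resolved divided by bugs opened over the same period."""
    if opened == 0:
        raise ValueError("no bugs opened in this period")
    return resolved / opened


# Hypothetical month: 40 bugs opened, 30 resolved.
print(bug_resolution_ratio(opened=40, resolved=30))
```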
Outcome
Ensure product quality by keeping the bug ratio under control.
8. Code Complexity
Definition
The average size of the changes made to code.
How
Athenian measures the average PR size to help customers identify patterns of complex code changes, which typically increase deployment risk. The analysis uses data from GitHub to show, per team, the biggest changes made to the customer's code.
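A per-team average is straightforward to compute from PR data. This sketch assumes PR size means lines added plus lines deleted, which is one possible definition rather than necessarily the one Athenian uses; the team names and numbers are invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical PR records with line counts.
prs = [
    {"team": "payments", "additions": 120, "deletions": 30},
    {"team": "payments", "additions": 40, "deletions": 10},
    {"team": "search", "additions": 900, "deletions": 100},
]


def average_pr_size_by_team(prs):
    """Average changed lines (additions + deletions) per PR, grouped by team."""
    sizes = defaultdict(list)
    for pr in prs:
        sizes[pr["team"]].append(pr["additions"] + pr["deletions"])
    return {team: mean(values) for team, values in sizes.items()}


print(average_pr_size_by_team(prs))
```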
Outcome
Reduce deployment risk by identifying large changes in the code base.
9. Team Investment
Definition
The distribution of time invested by teams.
How
Athenian tracks all work reported in the issue tracking system or through PRs to help customers understand where their teams are investing the most effort.
Customers can define customizable work categories to see how different activities are moving forward.
In addition, Athenian allows you to categorize the same items across different views to increase visibility. For example, obtain all investment on bugs while identifying the distribution of customer-facing bugs and bugs discovered internally.
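The idea of slicing the same items along different views can be sketched with a small helper. It uses the number of work items as a rough proxy for effort, which is a simplifying assumption, and the categories and fields are hypothetical.

```python
from collections import Counter

# Hypothetical work items, each tagged with a category and, for bugs, an origin.
items = [
    {"category": "feature", "origin": None},
    {"category": "feature", "origin": None},
    {"category": "bug", "origin": "customer"},
    {"category": "bug", "origin": "internal"},
]


def investment_share(items, key):
    """Fraction of items per value of `key` (item count as a proxy for effort)."""
    counts = Counter(item[key] for item in items)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


# View 1: overall distribution by category.
print(investment_share(items, "category"))

# View 2: the same bugs, split by where they were discovered.
bugs = [item for item in items if item["category"] == "bug"]
print(investment_share(bugs, "origin"))
```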
Outcome
Improve decisions by having clear visibility of team investment levels.
Ready to see how your engineering org is doing on these metrics? Let's get started!
Oh, and we made a cheat sheet for you!
Save it, share it, print it, stick it on your office wall (or fridge, if you're WFH).