You Have Metrics – Now What? 🧠
How to turn metrics into effective processes and improvement loops.
Hey there! Last month we published an article about the recent shifts in measuring developer productivity, based on my chat with Abi Noda, CEO of DX.
The piece had an awesome reception (thank you 🙏) and many of you asked via email about what we defined as the next frontier of metrics, that is, their operationalization – how to turn metrics into effective processes and improvement loops.
So, to talk about this, today we are bringing in Laura Tacho, CTO at DX and former guest of our own podcast!
Here is the agenda for today:
🎯 Use cases for metrics – who is using metrics? Talking about engineering orgs and systems teams.
⚙️ Activities with metrics – the difference between diagnostic and improvement metrics.
🗺️ Metric mapping – how to get from the big picture to actionable improvements for your processes.
🏃‍♀️ Getting started – putting everything together in a three-step process.
Let's dive in!
I am a fan of what Laura is building at DX, and I am grateful to have them as partners on this piece. You can learn more below 👇
Hey, Laura here!
Developer productivity metrics are no stranger to the Refactoring audience. They're on everyone's mind, answering questions like "how do we know if we're moving as fast as we can?" or "how can I measure the impact of tooling investments, like using GenAI for coding?"
Right now, a lot of conversations about metrics are focused on what to measure. While I don't believe that problem is perfectly solved, starting with a framework like the DX Core 4 helps you skip a lot of the brute-forcing and guessing about which metrics really matter.
Now we can bring the conversation up the ladder and tackle another thorny problem: once you have metrics, what on earth are you supposed to do with them?
It's very common for teams to struggle with interpreting and acting upon developer productivity metrics – even high-performing teams that effectively use data in other parts of their work. Using data for continuous improvement is a separate skill that needs support at the organizational level.
We frequently talk with organizations who say "we've just spent the last 6 months setting up DORA metrics, and now we're trying to figure out what to do with them."
When this happens, organizations often fall into the trap of:
↩️ Reverting to old habits – simply adding the metrics to leadership reports without driving real change.
🗄️ Overwhelming teams with data – expecting teams to derive meaning from hundreds of measurements without providing adequate support or clear expectations.
📊 Failing to connect metrics with decision-making – collecting data that sits unused in dashboards rather than influencing team behavior and strategy.
Instead, high-performing engineering teams approach developer productivity metrics with the following questions:
Who is this data for?
Are we diagnosing or improving?
How will this data be used in decision-making?
By answering these questions, you can move from data collection to real impact, making your developer productivity metrics truly useful.
The key to making metrics useful is to integrate them into decision-making processes at every level – engineering teams, leadership, and platform teams – while ensuring that the right people are looking at the right data at the right time.
🎯 Use cases for metrics – engineering orgs vs systems teams
There are two primary use cases for developer productivity metrics:
🏢 Engineering Organizations – these teams use metrics to assess overall efficiency and drive continuous improvement at the organizational level. Leadership uses this data to guide transformation efforts, ensure alignment with business goals, increase quality, and improve engineering velocity. Teams use metrics to make daily and weekly decisions to improve their own performance and velocity.
🏗️ Systems teams (Platform Eng, DevEx, DevProd) – these teams use metrics to understand how engineering teams interact with internal systems and to assess the ROI of DevEx investments. These metrics are existential for such teams: they need to demonstrate the impact of their work to justify the investment behind it. Measurements are also crucial for setting future priorities.
Understanding your use case will guide both how you collect the data (what metrics do I need to fulfill my use case?) and how you interrogate it (what questions am I trying to answer, and for what purpose?).
⚙️ Activities with metrics – diagnostics vs improvement
Different metrics are useful in different contexts: some help us see trends, while others can drive daily decisions at the team level.
We call this the difference between diagnostic and improvement metrics:
🩺 Diagnostic Metrics – these are high-level, summary metrics that provide insights into trends over time. They are collected with lower frequency, benefit from industry benchmarks to contextualize performance, and are best used for directional or strategic decision-making. Examples: DX Core 4 primary metrics, DORA metrics.
🔧 Improvement Metrics – these metrics drive behavior change. They are collected with higher frequency, are focused on smaller variables, and are often in teams' locus of control (see the sketch below).
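To make the distinction concrete, here is a toy sketch in Python (with made-up data, not DX's implementation) showing how the same underlying PR events can feed both a low-frequency diagnostic metric and a high-frequency improvement metric:

```python
from datetime import date
from statistics import mean

# Hypothetical merge events: (merge_date, hours_until_first_review)
merges = [
    (date(2024, 5, 2), 3.5),
    (date(2024, 5, 3), 26.0),
    (date(2024, 6, 10), 1.2),
    (date(2024, 6, 11), 8.0),
]

# Diagnostic: low-frequency, summary-level, good for trend lines and
# benchmark comparisons -- e.g. PRs merged per month.
prs_per_month = {}
for day, _ in merges:
    key = (day.year, day.month)
    prs_per_month[key] = prs_per_month.get(key, 0) + 1

# Improvement: high-frequency, granular, within the team's locus of
# control -- e.g. average hours to first review over the last 7 days,
# something a team can look at in standup and act on the same day.
recent = [hours for day, hours in merges if (date.today() - day).days <= 7]
avg_review_latency = mean(recent) if recent else None
```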
Use this table for guidance on how to generally distinguish between a diagnostic and improvement metric 👇
So, you may go to the doctor once a year and get a blood panel to look at your cholesterol, glucose, or iron levels. This is a diagnostic metric: it gives you a high-level overview of your overall health, and serves as an input into other decisions (like changing your diet to include more iron-rich foods).
From this diagnostic, more granular improvement metrics can be defined. Some people wear a Continuous Glucose Monitor to keep an eye on their blood glucose after their test indicated that they should work on improving their metabolic health. This real-time data helps them make fine-tuned decisions each day. Then, we expect to see the sum of this effort reflected in the next diagnostic measurement.
For engineering organizations, a diagnostic measurement like PR Throughput can show an overall picture of velocity and contextualize your performance against industry benchmarks.
Organizations that want to drive velocity then need to identify improvement metrics that support this goal, such as time to first PR review.
For example, they could get a ping in their team Slack to let them know when a new PR is awaiting review, or when a PR has crossed a threshold of time without an approval.
These metrics are more granular and targeted, and allow the team to make in-the-moment decisions to drive improvement.
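As a rough illustration of that kind of nudge, here is a minimal sketch assuming a GitHub repo and a Slack incoming webhook. The repo name, webhook URL, and threshold are all placeholders, and authentication and pagination are omitted (unauthenticated calls only work for public repos and are heavily rate-limited):

```python
from datetime import datetime, timezone

import requests

GITHUB_API = "https://api.github.com"
REPO = "acme/widgets"                                 # hypothetical repo
WEBHOOK_URL = "https://hooks.slack.com/services/..."  # hypothetical webhook
THRESHOLD_HOURS = 4  # nudge once a PR has waited this long for a first review


def stale_prs():
    """Yield (pr, hours_waiting) for open PRs with no review yet."""
    prs = requests.get(f"{GITHUB_API}/repos/{REPO}/pulls",
                       params={"state": "open"}).json()
    now = datetime.now(timezone.utc)
    for pr in prs:
        reviews = requests.get(
            f"{GITHUB_API}/repos/{REPO}/pulls/{pr['number']}/reviews").json()
        if reviews:
            continue  # already has a first review
        opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
        waiting = (now - opened).total_seconds() / 3600
        if waiting > THRESHOLD_HOURS:
            yield pr, waiting


for pr, hours in stale_prs():
    requests.post(WEBHOOK_URL, json={
        "text": f"PR #{pr['number']} \"{pr['title']}\" has been waiting "
                f"{hours:.1f}h for a first review: {pr['html_url']}"
    })
```

Whether the nudge fires in Slack, a dashboard, or CI matters less than the loop itself: the team sees the signal while they can still act on it.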
🗺️ Metric Mapping
You can get from a bigger-picture diagnostic metric to an actionable improvement metric through a process called metric mapping:
Start with your diagnostic metric, for example, Change Failure Rate (a minimal computation sketch follows these steps).
Think about the boundaries of this metric – what is the big idea of what the metric is trying to capture? What are the starting and ending points of any processes it measures, and does it include any sub-processes? What areas of your system would need to improve in order to influence this metric? What do developers think about it?
The answers to these questions will give you smaller, more actionable measurements that are easier for teams to reason about, and more likely to be within a teamโs locus of control, or the area where they have autonomy and influence.
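As a concrete anchor for the first step, here is a minimal sketch of how Change Failure Rate is commonly computed: the share of deployments that caused a failure in production. The record shape is made up for illustration; in practice this data would come from your deploy pipeline and incident tracker.

```python
# Hypothetical deployment records.
deployments = [
    {"id": "d1", "caused_incident": False},
    {"id": "d2", "caused_incident": True},
    {"id": "d3", "caused_incident": False},
    {"id": "d4", "caused_incident": False},
]

failed = sum(1 for d in deployments if d["caused_incident"])
change_failure_rate = failed / len(deployments)
print(f"Change Failure Rate: {change_failure_rate:.0%}")  # 25%
```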
Let's use Change Failure Rate as an example: