You Have Metrics – Now What?
How to turn metrics into effective processes and improvement loops.
Hey there! Last month we published an article about the recent shifts in measuring developer productivity, based on my chat with Abi Noda, CEO of DX.
The piece had an awesome reception (thank you!) and many of you asked via email about what we defined as the next frontier of metrics, that is, their operationalization – how to turn metrics into effective processes and improvement loops.
So, to talk about this, today we are bringing in Laura Tacho, CTO at DX and former guest of our own podcast!
Here is the agenda for today:
Use cases for metrics – who is using metrics? Talking about engineering orgs and systems teams.
Activities with metrics – the difference between diagnostic and improvement metrics.
Metric mapping – how to get from the big picture to actionable improvements for your processes.
Getting started – putting everything together in a three-step process.
Let's dive in!
I am a fan of what Laura is building at DX, and I am grateful to have them as partners on this piece. You can learn more below.
Hey, Laura here!
Developer productivity metrics are no stranger to the Refactoring audience. They're on everyone's mind, answering questions like "how do we know if we're moving as fast as we can?" or "how can I measure the impact of tooling investments, like using GenAI for coding?"
Right now, a lot of conversations about metrics are focused on what to measure. While I don't believe that problem is perfectly solved, starting with a framework like the DX Core 4 helps you skip a lot of the brute-forcing and guessing about which metrics really matter.
Now we can bring the conversation up the problem ladder and talk about another thorny problem: once you have metrics, what on earth are you supposed to do with them?
It's very common for teams to struggle with interpreting and acting upon developer productivity metrics – even high-performing teams that effectively use data in other parts of their work. Using data for continuous improvement is a separate skill that needs support at the organizational level.
We frequently talk with organizations that say "we've just spent the last 6 months setting up DORA metrics, and now we're trying to figure out what to do with them."
When this happens, organizations often fall into the trap of:
Reverting to old habits – simply adding the metrics to leadership reports without driving real change.
Overwhelming teams with data – expecting teams to derive meaning from hundreds of measurements without providing adequate support or clear expectations.
Failing to connect metrics with decision-making – collecting data that sits unused in dashboards rather than influencing team behavior and strategy.
Instead, high-performing engineering teams approach developer productivity metrics with the following questions:
Who is this data for?
Are we diagnosing or improving?
How will this data be used in decision-making?
By answering these questions, you can move from data collection to real impact, making your developer productivity metrics truly useful.
The key to making metrics useful is to integrate them into decision-making processes at every level – engineering teams, leadership, and platform teams – while ensuring that the right people are looking at the right data at the right time.
Use cases for metrics – engineering orgs vs systems teams
There are two primary use cases for developer productivity metrics:
Engineering Organizations – these teams use metrics to assess overall efficiency and drive continuous improvement at the organizational level. Leadership uses this data to guide transformation efforts, ensure alignment with business goals, increase quality, and improve engineering velocity. Teams use metrics to make daily and weekly decisions to improve their own performance and velocity.
Systems teams (Platform Eng, DevEx, DevProd) – these teams use metrics to understand how engineering teams interact with internal systems and to assess the ROI of DevEx investments. These metrics are existential for systems teams: they need to demonstrate the impact of their work to show that the investment is paying off. Measurements are also crucial for setting future priorities.
Understanding your use case will guide your approach to collecting the data (what metrics do I need to fulfill my use case?) as well as your approach to interrogating the data (what questions am I trying to answer, and for what purpose?).
Activities with metrics – diagnostics vs improvement
Metrics have characteristics that make them most useful in particular contexts. Some metrics help us see trends, while others can drive daily decisions at the team level.
We call this the difference between diagnostic and improvement metrics:
Diagnostic Metrics – these are high-level, summary metrics that provide insights into trends over time. They are collected with lower frequency, benefit from industry benchmarks to contextualize performance, and are best used for directional or strategic decision making. Examples: DX Core 4 primary metrics, DORA metrics.
Improvement Metrics – these metrics drive behavior change. They are collected with higher frequency, are focused on smaller variables, and are often within teams' locus of control.
Use the table below for general guidance on how to distinguish between a diagnostic and an improvement metric.
So, you may go to the doctor once a year and get a blood panel to look at your cholesterol, glucose, or iron levels. This is a diagnostic metric: it gives you a high-level overview of your overall health and is meant to be an input into other systems (like changing your diet to include more iron-rich foods).
From this diagnostic, more granular improvement metrics can be defined. Some people wear a Continuous Glucose Monitor to keep an eye on their blood glucose after their test indicated that they should work on improving their metabolic health. This real-time data helps them make fine-tuned decisions each day. Then, we expect to see the sum of this effort reflected in the next diagnostic measurement.
For engineering organizations, a diagnostic measurement like PR Throughput can show an overall picture of velocity and contextualize your performance against industry benchmarks.
Organizations that want to drive velocity then need to identify improvement metrics that support this goal, such as time to first PR review.
For example, they could get a ping in their team Slack to let them know when a new PR is awaiting review, or when a PR has crossed a threshold of time without an approval.
These metrics are more granular and targeted, and allow the team to make in-the-moment decisions to drive improvement.
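To make this concrete, here is a minimal sketch of what such a nudge could look like, assuming a GitHub repository and a Slack incoming webhook. The repository name, environment variables, and review threshold are placeholders, and the script is an illustration rather than a prescribed implementation:

```python
# Sketch: ping Slack when an open PR has waited too long for its first review.
# Assumes a GitHub repo, a GitHub token, and a Slack incoming webhook (all placeholders).
from datetime import datetime, timezone
import os
import requests

GITHUB_REPO = "acme/example-service"   # hypothetical repository
REVIEW_SLA_HOURS = 4                   # threshold before we nudge the team
SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

def open_pull_requests():
    """Return open PRs for the repo via the GitHub REST API."""
    url = f"https://api.github.com/repos/{GITHUB_REPO}/pulls"
    return requests.get(url, headers=HEADERS, params={"state": "open"}).json()

def has_review(pr_number: int) -> bool:
    """True if at least one review has been submitted for the PR."""
    url = f"https://api.github.com/repos/{GITHUB_REPO}/pulls/{pr_number}/reviews"
    return len(requests.get(url, headers=HEADERS).json()) > 0

def nudge(pr):
    """Post a reminder in the team's Slack channel."""
    text = f"PR #{pr['number']} ({pr['title']}) is still waiting for a first review: {pr['html_url']}"
    requests.post(SLACK_WEBHOOK_URL, json={"text": text})

for pr in open_pull_requests():
    opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
    waited_hours = (datetime.now(timezone.utc) - opened).total_seconds() / 3600
    if waited_hours > REVIEW_SLA_HOURS and not has_review(pr["number"]):
        nudge(pr)
```

Run on a schedule (a cron job or CI workflow, for example), a script like this keeps the improvement metric visible where the work happens, instead of only in a dashboard.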
Metric Mapping
You can get from a bigger-picture diagnostic metric to an actionable improvement metric through a process called metric mapping:
Start with your diagnostic metric, for example, Change Failure Rate.
Think about the boundaries of this metric – what is the big idea of what the metric is trying to capture? What are the starting and ending points of any processes it measures, and does it include any sub-processes? What areas of your system would need to improve in order to influence this metric? What do developers think about it?
The answers to these questions will give you smaller, more actionable measurements that are easier for teams to reason about, and more likely to be within a teamβs locus of control, or the area where they have autonomy and influence.
Let's use Change Failure Rate as an example:
What is the big idea the metric is trying to capture? Software quality
What are the starting and ending points of any processes it measures, and does it include any sub-processes? CFR is the result of a few different processes: local testing workflows, CI/CD, QA (if any), and is influenced by batch size, build speed, test flakiness, etc.
What areas of your system would need to improve in order to influence this metric? We know that our CI processes are slow and unreliable. We also work on really big changes most of the time, and we know that bigger changes are riskier to deliver.
What do developers think about the big idea? We can measure satisfaction with software quality to see if we're heading in the right direction with all of these other interventions.
The hypothesis? If this team reduces batch size, reduces CI flakiness, and increases satisfaction with quality practices, then Change Failure Rate will decrease. The improvement metrics give teams a clearer picture of where to focus.
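As a rough illustration of how the diagnostic and its mapped improvement metrics sit next to each other, here is a minimal sketch that computes Change Failure Rate alongside two of the improvement metrics above. The record formats are hypothetical stand-ins for whatever your deployment and CI systems actually expose:

```python
# Sketch: compute a diagnostic metric (Change Failure Rate) alongside the
# improvement metrics mapped to it (batch size, CI flakiness).
# The record formats below are hypothetical stand-ins for real data sources.
from statistics import median

deployments = [  # one record per production change
    {"failed": False, "lines_changed": 120},
    {"failed": True,  "lines_changed": 2400},
    {"failed": False, "lines_changed": 300},
]
ci_runs = [  # one record per CI pipeline run
    {"flaky": False}, {"flaky": True}, {"flaky": False}, {"flaky": False},
]

# Diagnostic: failed changes / total changes, reviewed monthly or quarterly.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

# Improvement metrics: smaller, higher-frequency, within the team's control.
median_batch_size = median(d["lines_changed"] for d in deployments)
ci_flakiness_rate = sum(r["flaky"] for r in ci_runs) / len(ci_runs)

print(f"Change Failure Rate: {change_failure_rate:.0%}")
print(f"Median batch size:   {median_batch_size} lines changed")
print(f"CI flakiness rate:   {ci_flakiness_rate:.0%}")
```

If the team drives the improvement numbers down week over week, the hypothesis above predicts that the diagnostic measurement should follow in the next review cycle.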
Getting Started
So let's say you are sold on this – how do you actually get started? Go for a three-step process:
1) Identify your use case and activities
When approaching data, you need to ask yourself what your use case is and what the related activity is:
Use case – is this data meant for an engineering organization trying to improve, or is this data for a platform engineering team assessing the impact of their work? This is your use case.
Activity – is this data meant to show high-level trends and be useful in a report-card-style summary? Or is it meant to be zoomed-in and granular, focusing on a specific action or decision? This is your activity: either diagnosing or improving.
If you're leading an engineering org looking to improve efficiency, focus on diagnostic metrics first to identify key problem areas, then use improvement metrics to guide day-to-day actions.
If you're on a platform team, use diagnostic metrics to demonstrate success and adoption, and to identify new opportunities for impact. Then use improvement metrics to iterate on internal tools and processes.
2) Set expectations for how metrics will be used
"If you build it, they will come" doesn't apply here. Don't assume that making metrics available will lead to action. Without clear expectations and a system of accountability, it's easy for metrics and continuous improvement to take a backseat to delivery pressures.
Pressurize the system – senior leadership should emphasize the importance of metrics in evaluating success and setting priorities. Microsoft's Edge Thrive initiative, for example, ensures that engineering leaders are accountable for their productivity metrics, which trickles down to teams.
Integrate metrics into organizational workflows – use metrics in all areas of the business, like retrospectives, planning meetings, and all-hands meetings. Leaders should be talking about these metrics at every opportunity. Even if you feel like a broken record, it's important to keep the message consistent and top of mind. For example, if your teams are trying to decide whether to prioritize improvements to code coverage or building a new feature, looking at a quality metric like Change Failure Rate can guide the discussion.
3) Make change the goal
Metrics are only useful if they lead to action. To ensure metrics drive change:
Tell a story with data – rather than presenting raw numbers, frame metrics in the context of progress toward key business goals.
Use industry benchmarks for context – comparing your organization's metrics to industry benchmarks can help make data actionable. You can download the full set of DX Core 4 benchmarks here.
Mix qualitative and quantitative data – looking at quantitative data from systems can tell you what is happening, but only self-reported data from developers can tell you why. For improvement, the "why" is critical.
By structuring your approach using the dimensions of use case (engineering org vs. platform teams) and activity type (diagnostic vs. improvement), you can ensure that data is driving meaningful change rather than becoming an overwhelming reporting exercise.
Next time you find yourself wondering, "Now what?" after collecting developer productivity metrics, ask:
Who is this data for?
Are we diagnosing or improving?
How will this data be used in decision-making?
By answering these questions, you can move from data collection to real impact, making your developer productivity metrics truly useful.
Bottom line
And that's it for today! Here are the main takeaways:
Metrics alone aren't enough – most organizations struggle with what to do after collecting data. Without proper implementation, metrics often end up unused in dashboards or reports.
Know your purpose – before implementing metrics, clarify whether you're using them for diagnostics (broad trends) or improvement (driving specific behaviors), and whether they're for engineering organizations or platform teams.
Diagnostic vs improvement metrics – diagnostic metrics (like DORA) help identify trends quarterly/monthly, while improvement metrics drive daily/weekly behaviors with specific, actionable insights.
Metric mapping – transform high-level diagnostic metrics into actionable improvement ones by analyzing boundaries, processes, and developer feedback to identify specific areas within teams' control.
Integration is crucial – leadership must pressurize the system by incorporating metrics into workflows, planning meetings, and retrospectives to create accountability.
Change is the goal – tell stories with your data, use industry benchmarks for context, and combine quantitative and qualitative feedback to drive meaningful improvement rather than just collecting numbers.
I am a fan of what Laura is building at DX, and I am grateful to have them as partners on this piece. You can learn more about DX below.
See you next week!
Sincerely,
Luca