Hey! Luca here — welcome to a new edition of Refactoring.
To figure out what’s really happening in software engineering, I usually do three things:
I try things myself — recently by working on Tolaria.
I speak with people who know better — on the podcast and in the community.
I go through relevant research — to go big picture and get a grasp of the state of the industry.
Good research is hard to find, but a few weeks ago I was sent a copy of this report by the guys at Faros (thank you for that 🙏), which was very interesting for two reasons:
Quantitatively — it’s one of the largest reports I have ever seen about AI coding: it surveyed 22K+ developers across 4K+ teams.
Qualitatively — it’s very thorough. It doesn’t stop at measuring activity, but tries to understand the downstream impact of AI on all the stages of the SDLC.
It also paints an interesting trajectory when compared with other similar works from the past months, like the State of Software Delivery, which we covered just recently, and DORA’s State of AI Assisted Development from the end of last year.
So today I will report and comment on the main findings, trying to interpret them and connect the dots with the other things I have been observing lately.
Here is the agenda, with my main takeaways:
🏁 Finishing work is hard — starting is now easy, and shipping is hard.
🔍 Reviews are utterly broken — it’s now obvious that we need to do something about them.
🔥 Poor quality is reaching prod (anyway) — production is suffering, big time.
🪴 The way forward — some words of hope to close this on a high note!
Let’s dive in!
I want to thank the guys at Faros for giving me preview access to these results and partnering on this. If you want to download the full report, you can find it below 👇
🏁 Finishing work is hard
Before we get to the first idea, there is an important preamble to be made.
We have said many times that AI is an amplifier, and that the *average* numbers in these reports don’t say much, because they mix the good teams that are getting a lot out of AI with the average or bad ones that are getting nothing, or sometimes even negative results.
Now, Faros literally opens the report by saying that no — this is not the case anymore:
Across every downstream stage, the signal is the same: volume is up, quality is down, and the gap between the two is widening as adoption deepens. High-performing teams are experiencing the same downstream degradation as the median and average ones.
Words of hope! But what does it mean that volume is up and quality is down?
If we look at software dev as a pipeline, this means three things:
More work gets started, but
The amount of work that gets shipped is pretty much the same, and
The quality of such work is noticeably lower
Let’s go through all of this, starting with the most important point to me: starting work is now easier, but finishing it is harder than ever.
In particular, developers are indeed doing more work, largely by working on more things at the same time:
+67% PRs touched / day
+18% separate tasks
+33% tasks completed, as measured by opening a final PR about them.
The problem is that this work doesn’t reach production any faster. In fact, teams are noticeably slower at the last mile, shipping:
+26% in-progress tasks stalled for 7+ days
+14% work restarts per developer (tasks returning to “in progress” after moving to another stage)
-12% deployments per week
Qualitatively, this is the same trend that both the State of Software Delivery and the latest DORA report found, but quantitatively it is worse. The numbers about coding activity are better than in previous reports and those about deployment frequency are worse, pointing at even more polarization.
One of the most visible results of this trend is lead time skyrocketing by +480%, fueled by an average +80% in waiting time between pipeline steps.
This is not surprising if we remember that:
Lead Time = Work In Progress / Throughput
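This relation is known as Little’s Law, and it holds for long-run averages in a reasonably stable system. Here is a minimal sketch of the arithmetic in Python. The team and its numbers are invented for illustration and are not taken from the report; `lead_time` is just a hypothetical helper name.

```python
def lead_time(wip: float, throughput: float) -> float:
    """Average lead time from Little's Law: WIP divided by throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical team (illustrative numbers only, not from the report).
# Throughput is in tasks shipped per week, so lead time comes out in weeks.
before = lead_time(wip=20, throughput=10)  # 20 tasks in flight
after = lead_time(wip=30, throughput=10)   # more work started, same shipping rate

print(f"before: {before:.1f} weeks, after: {after:.1f} weeks")
print(f"lead time change: {after / before - 1:+.0%}")
```

With throughput held constant, a 50% increase in work in progress produces a 50% increase in average lead time: the extra work simply waits longer in the queue.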
So if WIP grows and throughput stays the same, lead time is bound to grow. But why does the throughput stay the same? Where’s the bottleneck?
Enter the elephant in the room: code reviews 👇




