Code Quality in the Age of AI ✅
How to create a process for quality throughout the whole SDLC.
Last month, the 2024 DORA report came out, and as always it was full of interesting insights.
A good part of it explored the impact of AI on engineering teams, and reported a surprising finding:
Despite AI’s potential benefits, our research revealed [that] AI adoption may negatively impact software delivery performance. As AI adoption increased, it was accompanied by an estimated decrease in delivery throughput by 1.5%, and an estimated reduction in delivery stability by 7.2%.
So, in many cases, the gains in pure coding throughput do not translate into general software delivery improvements.
Why is that?
The answer is unclear — I guess it will take years to figure out all the downstream effects of AI in coding. The key tradeoff, though, looks simple: AI allows us to write more code, faster, in exchange for having less control over it.
It is easy to see how bad things can happen from this: faster and sloppier coding leads to bigger batches, which mean more risk, and in turn more regressions and maintenance down the line.
This is not inevitable. I am sure the best teams will be able to constrain the use of AI under their own good practices, and make it a huge net positive — but it will take work.
In particular, I believe it will require a new, intentional approach to code quality.
Quoting Joel Chippindale, our coach-in-residence:
"High quality code is easy to keep changing"
But how do we ensure this when more and more of our code is written by AI?
This is what we are going to explore today. To do this, I am partnering with Cédric Teyton, CTO of Packmind. Cédric works at the forefront of this transition, building tools that help enforce quality through AI. So, we put together our respective experiences to create the best guide we are capable of.
Here is the agenda for today:
📖 What makes code easy to change — let’s explore the fundamentals of maintainable code, from readability to testing, and why they matter more than ever.
🤖 Coding in the Age of AI — how AI is changing the way we write and maintain code, and why this makes quality even more crucial.
🔄 The Lifecycle of Quality — a practical framework for building systems that consistently produce good code, from early practices to final reviews.
Let’s dive in!
Disclaimer: I am a fan of Packmind and what Cédric is building, and they are also a long-time partner of Refactoring. However, you will only get my unbiased opinion about all the practices and services mentioned here, Packmind included.
📖 What makes code easy to change
Before discussing how AI changes things, we need to establish what makes code truly maintainable. After all, you can't improve what you don't understand.
Many people conflate code quality with abstract notions of "cleanliness" or adherence to specific patterns. But in my opinion there's only one thing that really matters: how easy it is to change the code.
This is not just theoretical. When code is hard to change, velocity slows down and engineers spend more time maintaining existing code than writing new features.
To me, code that is easy to change displays three fundamental traits:
1) Easy to understand 📖
The first trait of maintainable code is readability. Code that can be easily understood by anyone on the team, regardless of who wrote it or when it was written, is code that can be safely modified.
Here are the key elements that make code easy to understand:
🏅 Clear responsibilities — each component having a single, well-defined purpose is the foundation of easy understanding. When a file or class tries to do too many things, it becomes harder to grasp and riskier to modify.
📁 Intuitive structure — your codebase should be organized in a way that makes things easy to find. This includes: meaningful folder structure that reflects your domain, consistent file naming conventions, clear separation between different layers (e.g., UI, business logic, data access). I wrote more about naming & structure in this previous piece.
💬 Good comments — some argue for self-documenting code, but I have found that the best teams consistently write good comments. The key is finding the right balance: I am not a fan of inline comments and overly micro stuff, but I always appreciate comments at the top of files/classes describing their primary goal, or a quick explanation of complex business logic that isn’t immediately obvious from the code.
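To make these traits concrete, here is a minimal sketch (all names are illustrative, not from the article) of a class with a single, well-defined responsibility and a top-level comment explaining its goal rather than its mechanics:

```python
class InvoiceNumberGenerator:
    """Generates sequential invoice numbers like 'INV-2024-0042'.

    Kept separate from invoice creation and persistence, so each
    part of the billing flow stays easy to understand and change.
    """

    def __init__(self, year: int, last_sequence: int = 0):
        self.year = year
        self.sequence = last_sequence

    def next_number(self) -> str:
        # Business rule: the sequence is zero-padded to four digits.
        self.sequence += 1
        return f"INV-{self.year}-{self.sequence:04d}"
```

Notice that the docstring states the purpose and the reason for the separation, while the single inline comment flags a business rule that isn't obvious from the code alone.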
2) Small chance of regressions 🎯
The second trait of maintainable code is confidence — how sure are you that your changes won't break something else?
This confidence primarily comes from good testing. This is not just about high coverage, it's about having the right tests that give you the most value:
Integration tests — in my book, these often provide the best ROI, as they can cover large parts of your codebase while being more resilient to refactoring than unit tests.
E2E critical testing — identify the core business flows and harden them first. A bug in your login page is more critical than one in an admin dashboard.
Test readability — good tests also serve as docs. When a test fails, it should be immediately clear what went wrong and why.
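As a hypothetical sketch of why integration tests often give the best ROI: the test below exercises a service together with its (in-memory) repository, instead of mocking each layer call-by-call, so it keeps passing through internal refactorings as long as the behavior holds. All names here are invented for illustration:

```python
class InMemoryUserRepo:
    """Simple in-memory stand-in for a real database repository."""

    def __init__(self):
        self._users = {}

    def save(self, email: str) -> None:
        self._users[email] = {"email": email}

    def find(self, email: str):
        return self._users.get(email)


class SignupService:
    def __init__(self, repo):
        self.repo = repo

    def signup(self, email: str) -> bool:
        # Business rule: reject duplicate signups.
        if self.repo.find(email):
            return False
        self.repo.save(email)
        return True


def test_signup_rejects_duplicates():
    # Covers service + repository in one test; no mocks to update
    # when the internals of either class change.
    service = SignupService(InMemoryUserRepo())
    assert service.signup("ada@example.com") is True
    assert service.signup("ada@example.com") is False
```

The test name and assertions also double as documentation: when it fails, it is immediately clear which business rule broke.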
I wrote a lot about 1) good testing, and 2) modern QA, if you want to dig deeper!

3) Good abstractions 🏗️
The final trait is having abstractions that match your business domain, which to me is the very definition of low technical debt. This is perhaps the hardest to get right, as it requires tech expertise + domain knowledge, for which you need good collaboration between stakeholders.
When abstractions are poor, tech debt sneaks in. The best teams handle this in a variety of ways, which we covered in our full guide.
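To illustrate what "abstractions that match your business domain" can look like, here is a minimal sketch (a hypothetical subscription domain, invented for this example): instead of scattering status booleans and date checks across the codebase, the business rule lives in one place, named the way stakeholders talk about it:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Subscription:
    plan: str
    started_on: date
    cancelled_on: Optional[date] = None

    def is_active(self, today: date) -> bool:
        # The domain rule in one named place: a subscription is active
        # until (and including) the day before its cancellation date.
        return self.cancelled_on is None or self.cancelled_on > today
```

When the rule changes (say, cancellations take effect at the end of the billing period), there is exactly one method to update, which is the low-tech-debt property the section describes.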
The relationship between these three traits is often hierarchical: readable code makes it easier to write good tests, and good tests give you the confidence to improve your abstractions.
However, maintaining these qualities becomes more challenging in the age of AI. When code is increasingly being generated by machines, how do we ensure it remains easy to change?
🤖 Coding in the age of AI
It is clear by now that the new coding workflow enabled by AI brings in a sneaky tradeoff: we produce more code, but have less control over it.
But GitHub reported that Copilot helps developers write code ~15% faster, and similar tools showed comparable improvements. How can that not translate into overall better delivery?
My guess is that it comes down to three things: