Code Quality in the Age of AI ✅
How to create a process for quality throughout the whole SDLC.
Last month, the 2024 DORA report came out, and as always it was full of interesting insights.
A good part of it explored the impact of AI on engineering teams, and reported a surprising finding:
Despite AI’s potential benefits, our research revealed [that] AI adoption may negatively impact software delivery performance. As AI adoption increased, it was accompanied by an estimated decrease in delivery throughput by 1.5%, and an estimated reduction in delivery stability by 7.2%.
So, in many cases, the gains in pure coding throughput do not translate into general software delivery improvements.
Why is that?
The answer is unclear, and I suspect it will take years to figure out all the downstream effects of AI on coding. The key tradeoff, though, looks simple: AI lets us write more code, faster, in exchange for having less control over it.
It is easy to see how this can go wrong: faster, sloppier coding leads to bigger batches, which means more risk, and more risk leads to more regressions and maintenance down the line.
This is not inevitable. I am sure the best teams will be able to constrain the use of AI under their own good practices, and make it a huge net positive — but it will take work.
In particular, I believe it will require a new, intentional approach to code quality.
Quoting Joel Chippindale, our coach-in-residence:
"High quality code is easy to keep changing"
But how do we ensure this when more and more of our code is written by AI?
This is what we are going to explore today. To do this, I am partnering with Cédric Teyton, CTO of Packmind. Cédric works at the forefront of this transition, building tools that help enforce quality through AI. So, we put together our respective experiences to create the best guide we could.
Here is the agenda for today:
📖 What makes code easy to change — let’s explore the fundamentals of maintainable code, from readability to testing, and why they matter more than ever.
🤖 Coding in the Age of AI — how AI is changing the way we write and maintain code, and why this makes quality even more crucial.
🔄 The Lifecycle of Quality — a practical framework for building systems that consistently produce good code, from early practices to final reviews.
Let’s dive in!
Disclaimer: I am a fan of Packmind and what Cédric is building, and they are also a long-time partner of Refactoring. However, you will only get my unbiased opinion about all the practices and services mentioned here, Packmind included.
You can try Packmind for free below 👇
📖 What makes code easy to change
Before discussing how AI changes things, we need to establish what makes code truly maintainable. After all, you can't improve what you don't understand.
Many people conflate code quality with abstract notions of "cleanliness" or adherence to specific patterns. But in my opinion there's only one thing that really matters: how easy it is to change the code.
This is not just theoretical. When code is hard to change, velocity slows down and engineers spend more time maintaining existing stuff than writing new features.
To me, code that is easy to change displays three fundamental traits:
1) Easy to understand 📖
The first trait of maintainable code is readability. Code that can be easily understood by anyone on the team, regardless of who wrote it or when it was written, is code that can be safely modified.
Here are the key elements that make code easy to understand:
🏅 Clear responsibilities — each component having a single, well-defined purpose is the foundation of easy understanding. When a file or class tries to do too many things, it becomes harder to grasp and riskier to modify.
📁 Intuitive structure — your codebase should be organized in a way that makes things easy to find. This includes: meaningful folder structure that reflects your domain, consistent file naming conventions, clear separation between different layers (e.g., UI, business logic, data access). I wrote more about naming & structure in this previous piece.
💬 Good comments — some argue for self-documenting code, but I have found that the best teams consistently write good comments. The key is finding the right balance: I am not a fan of inline comments and overly micro stuff, but I always appreciate comments at the top of files/classes describing their primary goal, or a quick explanation of complex business logic that isn’t immediately obvious from the code (a small sketch follows below).
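To make these elements a bit more concrete, here is a minimal sketch of what they can look like together. Everything in it is hypothetical (the file, the names, the tax rule); it is only meant to show a file-level comment that states the module’s purpose, a single clear responsibility, and a short note on business logic that isn’t obvious from the code alone:

```typescript
/**
 * invoice-totals.ts
 *
 * Computes the amounts shown on a customer invoice.
 * Pure calculation only: no I/O, no persistence, no formatting.
 */

export interface LineItem {
  description: string;
  unitPriceCents: number;
  quantity: number;
}

// Tax is applied to the discounted subtotal, not the gross amount.
// This mirrors how our (hypothetical) finance team reports VAT.
export function invoiceTotalCents(
  items: LineItem[],
  discountRate: number,
  taxRate: number,
): number {
  const subtotal = items.reduce(
    (sum, item) => sum + item.unitPriceCents * item.quantity,
    0,
  );
  const discounted = subtotal * (1 - discountRate);
  return Math.round(discounted * (1 + taxRate));
}
```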
2) Small chance of regressions 🎯
The second trait of maintainable code is confidence — how sure are you that your changes won't break something else?
This confidence primarily comes from good testing. It’s not just about high coverage; it’s about having the right tests that give you the most value:
Integration tests — in my book, these often provide the best ROI, as they can cover large parts of your codebase while being more resilient to refactoring than unit tests.
Critical-path E2E tests — identify your core business flows and harden those first. A bug in your login page is more critical than one in an admin dashboard.
Test readability — good tests also serve as docs. When a test fails, it should be immediately clear what went wrong and why (see the example below).
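Here is what that can look like in practice: a minimal sketch of a readable integration test, assuming a Jest-style runner. The createTestApi helper and its methods are hypothetical; the point is that the test exercises a real flow through the API layer and reads like a small spec:

```typescript
// signup.integration.test.ts
// Assumes a Jest-style runner; createTestApi() is a hypothetical helper
// that wires the real API layer to a test database and a fake mailer.
import { createTestApi } from "./helpers/test-api";

describe("signup flow", () => {
  it("creates an account and sends a welcome email", async () => {
    const api = await createTestApi();

    const response = await api.post("/signup", {
      email: "ada@example.com",
      password: "correct-horse-battery-staple",
    });

    expect(response.status).toBe(201);
    expect(await api.db.users.findByEmail("ada@example.com")).toBeDefined();
    expect(api.sentEmails()).toContainEqual(
      expect.objectContaining({ to: "ada@example.com", template: "welcome" }),
    );
  });
});
```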
I wrote a lot about 1) good testing, and 2) modern QA, if you want to dig deeper!
3) Good abstractions 🏗️
The final trait is having abstractions that match your business domain, which to me is the very definition of low technical debt. This is perhaps the hardest to get right, as it requires both technical expertise and domain knowledge, which in turn requires good collaboration between stakeholders.
When abstractions are poor, tech debt sneaks in. The best teams handle this in a variety of ways, which we covered in our full guide.
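A classic, if hypothetical, illustration of a domain-matching abstraction is a Money value type. It is a tiny sketch, but it shows how the right abstraction keeps a domain rule (never mix currencies) in the code itself rather than in people’s heads:

```typescript
// Hypothetical example: money is an amount *in a currency*, so the type
// prevents the classic bug of silently adding dollars to euros.
export class Money {
  constructor(
    readonly amountCents: number,
    readonly currency: "USD" | "EUR",
  ) {}

  add(other: Money): Money {
    if (other.currency !== this.currency) {
      throw new Error(`Cannot add ${other.currency} to ${this.currency}`);
    }
    return new Money(this.amountCents + other.amountCents, this.currency);
  }
}

// Compare with passing around bare numbers: the domain rule would live only
// in people's heads, and every caller would have to remember it.
```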
The relationship between these three traits is often hierarchical: readable code makes it easier to write good tests, and good tests give you the confidence to improve your abstractions.
However, maintaining these qualities becomes more challenging in the age of AI. When code is increasingly being generated by machines, how do we ensure it remains easy to change?
🤖 Coding in the age of AI
It is clear by now that the new coding workflow enabled by AI brings in a sneaky tradeoff: we produce more code, but have less control over it.
But GitHub reported that Copilot helps developers write code ~15% faster, and similar tools showed comparable improvements. How can that not translate into overall better delivery?
My guess is that it comes down to three things:
Larger changesets — AI makes it easy to write more code at once. Bigger PRs cause slower + sloppier reviews, and bring in more risk when deployed. More risk equals more bugs and more churn.
Non-obvious issues — if you have played with AI coding for a while, you may have noticed that AI makes weird mistakes, quite different from the ones humans make. It messes up library versions, sometimes hallucinates entire functions, or rewrites a piece of code, deleting existing functionality in the process. These errors are rare, but they take a long time to spot precisely because they are unlike our own mistakes.
Long-term maintenance — I suspect the real cost of AI code shows up in maintenance. More code naturally requires more maintenance, and the less we understand some code, the harder the maintenance.
So what’s the way forward? It is too early to say for sure, but my bet is on investing in your process for quality 👇
🔄 The Lifecycle of Quality
Code quality is not about individual performance or heroic efforts. It's about having systems and processes that consistently produce good outcomes.
I believe a team of average developers working within a well-designed system will outperform a group of exceptional developers working within a suboptimal system.
So let's explore the five steps that make up your quality system, and how they reinforce each other 👇
1) Encoded coding practices 📝
Everything starts with having clear, documented practices that define what good code looks like for your team. These may include: system design principles and patterns, code organization and naming conventions, security and performance guidelines, docs requirements, and more.
Good encoded practices are the foundation of your quality system because they inform everything else: they make coaching and pairing easier, enable laser-focused static analysis, and provide clear guidelines for code reviews.
They create a shared language that makes all subsequent steps more effective.
2) Coaching & pairing 👥
Knowledge doesn't spread by osmosis. The best teams actively share it and discuss it through good collaboration.
In my experience, the two most effective forms of collaboration in engineering teams are:
👯 Pairing — it’s no secret that I am a big fan of pair programming. I wrote about it plenty of times, and will continue to do so!
✏️ Design discussions — a good design process, possibly supported by good design docs, does wonders for your team’s growth. It helps you get to better solutions, coaches younger co-workers, and intercepts issues before it’s too late. When the design is good, pure coding mistakes can’t be too bad.
3) Static analysis 🤖
One of my predictions about AI is that static analysis is going to be huge.
What started as simple linting for stylistic errors is gradually evolving to intercept code smells, security vulnerabilities, optimization opportunities, and more.
This is healthy because the more you can catch with automation, the easier code reviews become. To the point where I believe good testing + good static analysis will remove a lot of the need for blocking reviews, and many teams will be able to switch to a merge-first, review-later workflow for most changes.
A limitation of most static analysis tools today — which makes me bullish about the work of the team at Packmind — is that their rules and criteria are fixed. The next step, to me, is AI taking your own coding standards and enforcing them in static analysis, like a human reviewer would.
That’s exactly what Packmind is up to, so check it out if you want.
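To ground this: today, the manual way to encode a team standard into static analysis is usually a custom lint rule. Here is a rough sketch against ESLint’s rule API; the folder layout and the convention itself (service-layer code must not import UI code) are made up for illustration:

```typescript
// no-ui-imports-in-services.ts
// Sketch of a custom ESLint rule enforcing a hypothetical team convention:
// files under src/services/ must not import from the UI layer.
import type { Rule } from "eslint";

const rule: Rule.RuleModule = {
  meta: {
    type: "problem",
    docs: {
      description: "Service-layer code must not depend on UI code",
    },
    schema: [],
  },
  create(context) {
    const filename = context.getFilename();
    if (!filename.includes("/src/services/")) {
      return {}; // The rule only applies to the service layer.
    }
    return {
      ImportDeclaration(node) {
        const source = String(node.source.value);
        if (source.includes("/ui/") || source.startsWith("@app/ui")) {
          context.report({
            node,
            message: "Service-layer modules must not import UI code.",
          });
        }
      },
    };
  },
};

export default rule;
```

Writing and maintaining dozens of rules like this by hand is exactly the kind of toil that AI-powered tools can take over, by turning your own written standards into checks automatically.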
4) Automated testing ✅
We covered this already — a solid test suite makes you faster by creating the confidence to change code without the fear of breaking things.
I am very opinionated about testing and I don’t believe all tests are good. You should be intentional about which kinds of tests to invest in, and for which parts of your codebase, to make sure the ROI is positive. I wrote more about this here.
5) Code review 🔍
Finally, code reviews are your last line of defense.
When everything else works, reviews should rarely spot crucial issues. Conversely, if they often do, you should ask yourself what other parts of the process are failing: Should you discuss design more? Do you have enough encoded practices? Is static analysis powerful enough?
Code reviews are literally the worst moment (before shipping) to spot problems, because they are the last one. By shifting left and investing in what comes before, you can shrink what goes into reviews considerably, and arguably address many issues in better ways.
One of the goals of code reviews should be to continuously reduce their own scope, by having engineers turn recurring feedback into rules that can be enforced by earlier parts of the dev process.
So, when done right, reviews are less about catching basic issues (handled by good design + static analysis) or verifying correctness (covered by tests), and more about sharing knowledge, identifying new patterns, and creating new coding practices.
This way, reviews feed back into the rest of the process, creating a virtuous cycle of continuous improvement.
More thoughts (and hot takes!) about reviews in this previous article.
📌 Bottom line
So, in an era where AI is reshaping how we write code, maintaining quality becomes at once more challenging and more crucial. Here are the main takeaways from today:
📖 Focus on changeability — high-quality code is fundamentally about being easy to change. Prioritize readability, testability, and good abstractions in your codebase.
🤖 Understand AI's tradeoffs — while AI can boost coding speed, it can lead to larger changesets, non-obvious issues, and increased maintenance burden. Be aware of these pitfalls.
🔄 Implement a quality lifecycle — create a system that consistently produces good outcomes. This includes encoded practices, coaching, static analysis, automated testing, and effective code reviews.
⬅️ Shift quality left — develop the mindset of catching issues as early as possible in the development process. Invest in practices and tools that prevent problems before they reach the code review stage.
I want to thank Cédric and the Packmind team again for partnering on this. Their ideas about code quality and static analysis were invaluable in shaping my opinions, and this piece.
I am a big fan of what they are building — in fact, Packmind helps with most of the quality steps above:
It enables teams to capture good and bad coding patterns as they code, to define their coding practices (step 1).
After the practice is validated by the team, it does static analysis on the fly in the IDE (step 3).
Finally, it provides a plug-in during code review to capture and capitalize on new coding patterns (step 5).
You can start using Packmind for free below 👇
And that’s it for today! I wish you a great week ☀️
Sincerely
Luca