How to Plan for Maintenance 🛠️
The best strategies, including cycling, swimlanes, the Boy Scout Rule, backlogs, and more.
One of the most important duties of any engineering team is to spend their time and effort on the right things.
So, in an ideal world, you would list all the possible tasks, estimate their cost and their value, and address them in descending order of ROI.
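To make that ideal concrete, here is a minimal Python sketch of such a ranking. The `Task` class, the numbers, and the ROI formula `(value - cost) / cost` are illustrative assumptions rather than real data, and the sketch presumes that value and cost can be expressed in the same unit.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    value: float  # estimated value (hypothetical, same unit as cost)
    cost: float   # estimated cost (hypothetical)

    @property
    def roi(self) -> float:
        # Simple return on investment: net gain divided by cost.
        return (self.value - self.cost) / self.cost


def rank_by_roi(tasks):
    """Return tasks sorted from highest to lowest ROI."""
    return sorted(tasks, key=lambda t: t.roi, reverse=True)


# Made-up estimates, for illustration only.
tasks = [
    Task("big feature", value=50, cost=20),
    Task("dependency update", value=6, cost=2),
    Task("large refactor", value=30, cost=25),
]
ranked = [t.name for t in rank_by_roi(tasks)]
```

With these made-up numbers, the small dependency update ranks first. The whole exercise only works if every task gets a trustworthy `value` and `cost` up front.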
In the real world, though, this doesn’t happen.
It doesn’t happen because, in product & engineering, there are radically different types of work: think of a big feature vs a large refactor, or a small product improvement vs updating a dependency. For many of these, figuring out the precise value or cost beforehand is tricky. They may also bring different types of value (e.g. more revenue vs more productivity), so the playing field is not even.
Maintenance tasks are the ones that suffer the most from this mismatch. They are extremely valuable in the long run, but they can also be deeply technical and hard for PMs to grasp.
So, in the most successful companies I know, maintenance is usually addressed in very specific ways, to protect it and make sure people actually do it.
This article covers the best strategies to perform maintenance in engineering teams.
It includes real-world examples from the teams at Product Hunt, Swarmia, and Codacy, and more ideas from yours truly.
Here is what we will cover:
❓ What is maintenance? – let’s talk about size and urgency.
🏕️ Boy Scout Rule – a cultural staple for handling everyday tasks.
🔄 Cycling – assigning people to maintenance with rotating processes.
🏊‍♂️ Swimlanes – allocating fixed time to maintenance, separate from product dev.
👷 Dedicated teams – having a permanent team for it.
📋 Tracking tasks – on the need for, and perils of, backlogs.
Let’s dive in!
❓ What is maintenance
When talking about maintenance, you may immediately think of various technical tasks like fixing bugs, refactoring code, or updating dependencies.
However, since I am writing this to help with planning and resource allocation, I feel we should enlarge the scope a little bit, and ask ourselves: what feels hard to plan and prioritize?
In my experience, there are two elements that play a strong role in this: urgency and size.
1) High urgency
Urgency makes things easy from a planning perspective: you just do them.
Aside from stopping in your tracks when incidents happen, you may also apply hard-and-fast rules for addressing P1 items. Kendrick Curtis, VP of Engineering at Codacy, has a zero-tolerance policy:
“All outstanding ‘high’ bugs go into the next sprint – PM gets what’s left”.
It’s simple, and it works. But in general, P1 stuff is not what people have the most trouble with.
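As a rough illustration of the rule quoted above, here is a small Python sketch of how a sprint could be filled. The `plan_sprint` function, the point estimates, and the greedy "skip what doesn't fit" step for product items are my assumptions; the quote only says that high bugs come first and product work gets the rest.

```python
def plan_sprint(capacity, high_bugs, product_items):
    """Fill a sprint: all high bugs first, product work with what is left.

    high_bugs and product_items are lists of (name, estimate) tuples;
    product_items is assumed to be in the PM's priority order.
    """
    # Every outstanding high bug goes in, no questions asked.
    sprint = list(high_bugs)
    remaining = capacity - sum(estimate for _, estimate in high_bugs)

    # Product items only get whatever capacity is left (assumed greedy fill).
    for name, estimate in product_items:
        if estimate <= remaining:
            sprint.append((name, estimate))
            remaining -= estimate
    return sprint, remaining


# Hypothetical sprint of 20 points.
sprint, leftover = plan_sprint(
    capacity=20,
    high_bugs=[("login crash", 5), ("data loss on export", 8)],
    product_items=[("new onboarding", 6), ("dark mode", 3), ("tooltip copy", 1)],
)
planned_names = [name for name, _ in sprint]
```

In this made-up example the two bugs take 13 of the 20 points, so product work has to fit in the remaining 7.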
2) Large tasks
Large tasks also kind of make things easier: you sit down, explore the whys, figure out effort, and approximate ROI.
You can afford to be thorough because, of course, for big items it is worth it.
Pitting large engineering work (e.g. migrating a framework) against other feature work may be a headache, but it’s the right thing to do, and there is little doubt about it.
3) Small & non-urgent
Instead, where I have found people struggle the most is with the small & non-urgent:
Fixing a nasty bug that few people care about.
Updating a non-trivial dependency because of a minor vulnerability.
Spending 2 days reworking an abstraction that has been leaking for a while, but causes no immediate pain.
This is the true maintenance, and it often falls into no man’s land. Not urgent enough to be addressed immediately, but relevant enough that you can’t ignore it.
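The three buckets above can be summed up in a tiny Python sketch. The `triage` function and the 10-day threshold for "large" are hypothetical; every team would draw these lines differently.

```python
def triage(urgent: bool, effort_days: float, large_threshold_days: float = 10) -> str:
    """Sort a task into one of the three buckets discussed above.

    large_threshold_days is an arbitrary, illustrative cut-off.
    """
    if urgent:
        return "high urgency: just do it"
    if effort_days >= large_threshold_days:
        return "large: size it and weigh it against feature work"
    return "small & non-urgent: needs a dedicated strategy"


# The leaky abstraction example: 2 days of work, no immediate pain.
bucket = triage(urgent=False, effort_days=2)
```

Only the last bucket lacks an obvious planning answer, which is why the rest of this article focuses on it.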
How do you deal with this? Here are the most common approaches, with upsides and downsides 👇