How to Manage Context in AI Coding 📑
Exploring an evergreen art, made 10x more relevant by AI
Hey, Luca here! This is a ✨ monthly free essay ✨ from Refactoring! To access all our articles, library, and community, subscribe to the full version:
Last month I had the privilege to keynote at the CTO Craft Conference in Berlin, where I tried to paint a broad picture of how engineering teams are using AI today.
To prepare for the keynote, I did a lot of research. I started with our own survey data from the newsletter, but I also took stories from our podcast interviews, plus ad-hoc 1:1s with tech leaders who I knew made good use of AI.
As I eventually said in the keynote, all this data proved a bit inconclusive. For some teams, AI is a runaway success. For others, it’s nice to have at most.
Assuming—as I do believe—that both camps are right in assessing the impact of AI on their teams, the next question is obvious: what separates the early winners from the rest of us?
In recent weeks we have often said that what’s good for humans is good for AI — so AI, for the most part, acts as an amplifier, making good teams even better, and increasing the gap with average ones.
But it would be simplistic to say that’s all there is. There are indeed AI-specific things that teams are doing to work better with it — many of them are tactical and probably transient, but there are a few that feel foundational and here to stay.
The most interesting one, to me, is context engineering.
Context engineering is a relatively new term, initially popularized by Tobi Lutke and Andrej Karpathy, and then picked up by many others.
I really like the term “context engineering” over prompt engineering.
It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.
— Tobi Lutke, CEO of Shopify
It started as an attempt to go beyond prompt engineering, which, if you ask me, never felt completely right 👇
Merely crafting prompts does not seem like a real fulltime role, but figuring out how to compress context, chain prompts, recover from errors, and measure improvements is super challenging — is that what people mean when they say “prompt engineer?”
— Amjad Masad, CEO of Replit (in 2023!)
Prompt engineering ideas feel like tactical advice at best (focused on form, tone, and structure rather than the actual outcome) and magical incantations at worst (“You are an expert...”, “Let’s think step by step”).
So how is context engineering different from prompt engineering? First of all, I believe “context” is a better abstraction than “prompt”:
📑 Providing the right context — conveys the idea of grounding the model into accurate and exhaustive information.
🪄 Providing the right prompt — conveys the idea of nailing some specific wording.
You might argue it’s just words, but words matter: over the long run, I believe we want teams to focus on good context, rather than good prompts.
Context engineering, then, should refer to the discipline of designing systems and workflows that reliably ensure AI is provided with such right information.
But what makes this a system, worthy of the engineering word? To me, it’s about two qualities:
👯‍♀️ Multi-player — it should be designed with the team in mind. Shared workflows and practices.
⚙️ Dynamic — it should enable AI to fetch the right content by itself as much as possible, rather than humans passing it every time.
Note that these qualities are orthogonal. In fact, you can work in modes that are:
Dynamic but single-player — a zealous developer who configures 10+ MCPs on their single Claude Code installation (a config sketch follows below).
Multi-player but static — a giant shared CLAUDE.md file that contains a lot of info about the repo.
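To make the dynamic part concrete: for coding agents this is usually wired up through MCP servers. In Claude Code, for instance, they can be declared in a project-level .mcp.json file. A minimal sketch (the server and package names here are purely illustrative):

```json
{
  "mcpServers": {
    "issue-tracker": {
      "command": "npx",
      "args": ["-y", "example-issue-tracker-mcp"]
    },
    "internal-docs": {
      "command": "npx",
      "args": ["-y", "example-docs-mcp"]
    }
  }
}
```

Committing a file like this to the repo is also what turns a single-player setup into a multi-player one: everyone's agent gets the same connectors, while secrets stay in each developer's environment rather than in the file.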

So let’s explore this in a more systematic way, with a particular focus on coding workflows. Here is the agenda:
✅ Tasks vs procedures — what do people (and AI) need to know to do… things?
🔭 Sharing the why — the often neglected part.
📖 Task-relevant context — separating between mandatory and best effort context.
🤏 Keep context small — to keep your life simple.
Let’s dive in!
I did a ton of research for this piece, and to do so I got a lot of help from the team at Unblocked.
Unblocked connects your code, docs, and conversations so Cursor, Claude and Copilot finally understand your system like your best engineer.
Context engineering is their bread and butter, and they’ve recently written a deep-dive article about why LLMs need more than prompts and MCP servers.
Disclaimer: I only write my unbiased opinion here on all tools and practices covered, Unblocked included!
✅ Tasks vs procedures
What does good context look like?
Whenever I find myself wondering how to make something good for AI, I try to stop and think about how to make it good for humans first. I find it easier and it’s usually ~90% about the same things.
So, what does good context engineering look like… for humans? In normal work, how do you make sure people know exactly what they need to know to… do things?
Let’s unpack what goes into context.
It obviously depends on what you need to do, so let’s say an engineer needs to develop a new feature — what do they need to know? The first things that come to mind are things like product specs, UI design, and system design specs.
If you get deeper, though, and imagine this is not an engineer on your team, but a contractor that you have just summoned for this task (which is closer to what an AI is), there is a lot more they need to know:
Do you enforce writing tests?
How do you instrument features?
What about feature flags?
Any naming conventions?
…
So, a good mental model to think through all of this is to separate between two types of context:
☑️ Task-specific instructions — the what needs to be done
🔄 Procedures / principles — the how it needs to be done
In any team, with or without AI, a good trend is to continuously shrink the what and grow the how.
You don’t want to explicitly say “write tests” in a prompt, or in a Jira ticket — you want it to be a standard practice that everybody knows.
So if we need a vision for what the best context system looks like, it’s the one where developers (and AI) 1) succeed at what they need to do, with 2) as few instructions as possible.
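As a rough sketch of what that split can look like in practice (the file names and rules below are illustrative, not a standard):

```text
# AGENTS.md: the "how", written once and shared by the team
- Every new endpoint ships with unit tests and sits behind a feature flag.
- Instrument features through the shared analytics helper, not ad-hoc events.
- Follow the naming conventions documented in docs/conventions.md.

# Task prompt / ticket: the "what", written per task and kept lean
Add an endpoint that exports a user's invoices as CSV.
```

The task prompt can stay short precisely because the standards already live in the shared file.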
The corollary of that is that it’s generally good to focus on the inputs, rather than the outputs.
If a feature turns out different from what you had in mind, or sub-par in places, what went wrong? Was the person (or the AI) just sloppy, or was something missing from the context? In my experience, the latter is more likely.
So, other than nailing what context looks like, you should nurture the feedback loop. The flywheel that allows the system—and the context—to improve over time. Every time something fails, sure, fix the outputs, but also go back and improve the inputs.
🔭 Sharing the why
There is a third category of context, other than the what and the how, which is the why.
Let’s say you are not simply developing a new feature, but you want to find architecture improvements, refactoring opportunities, or shape product strategy.
In that case, the what is still mostly up in the air, and the how is largely not relevant yet. Instead, you need context on why something would be valuable to have.
Why context can be evergreen (e.g. your principles, values, the business domain), about the future (e.g. your quarter goals), or about the past (e.g. your past goals, ADRs).
All of this is not always important for the task at hand, but in cases where it is, it is invaluable. So, before you even think about AI, you can ask yourself questions like:
How much of this is written down?
Does everyone have access to it? Do they know where it is?
Do people think it’s important? What happens if they ignore it?
📖 Task-relevant context
In his seminal book High Output Management, Andy Grove coined the term task-relevant maturity to express an employee’s readiness to take on some responsibility. This is separate from the employee’s general seniority: you might be an expert in some domain in general, but still be unfamiliar with a specific task you need to perform within it.
I believe giving AI tools context requires a similar line of thought.
Besides thinking about what’s important per se, and how to keep things neat and tidy, you need to identify what matters for the task at hand. Things are usually not black-or-white here, so, for any category of task, I find it useful to separate what is mandatory to know from what is best effort:
🟢 Mandatory context — needs to be fetched right 100% of the time
🟡 Best effort context — is nice to have as additional info, but it’s ok if retrieval sometimes fails or is not 100% accurate.
In my experience, especially with AI systems, getting this right is crucial. If AI fails to account for the mandatory stuff even a small percentage of times, engineers get frustrated and might stop using it for complex work.
The more AI can be trusted, the looser we can keep the feedback loop, and the bigger the problems we can make it take on.
1) 🟢 Mandatory context
My #1 rule for mandatory context is that it should be co-located, as much as possible, with where the task happens.
For example, in coding this means storing a lot of stuff in the repo, alongside the code, like key dev practices, key info about the repo, and the system design / tech specs.
This could be as simple as having three files:
A CLAUDE.md or AGENTS.md file for how AI should write code.
A README file for instructions on building / testing / executing code.
A SYSTEM-DESIGN.md file that captures the main design features of the repo.
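Concretely, a minimal layout could look something like this (a sketch; anything beyond the three files above is just a placeholder):

```text
my-service/
├── CLAUDE.md           # or AGENTS.md: conventions and rules for AI-written code
├── README.md           # how to build, test, and run the project
├── SYSTEM-DESIGN.md    # main architectural decisions and module boundaries
└── src/
```

Claude Code picks up a root-level CLAUDE.md automatically, and AGENTS.md is emerging as a tool-agnostic equivalent; check which file your agent of choice actually reads.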
I also tend to not rely on conversation threads / memories too much, and I consolidate everything that’s useful — even if provisional — into files.
For example, for meaningful tasks, I find it useful to make agents create a plan file they can use, update, and come back to, so they know where they stand. I use a ROADMAP.md file for that: I tell AI to create and use it as part of the procedure in the CLAUDE / AGENTS file, and it gets git-ignored.
This makes it easier to start over when conversations degrade (i.e. context rot), and also to spin multiple agents and parallelize work.
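What that procedure could look like inside the CLAUDE.md / AGENTS.md file, as a sketch (the section name and wording are mine, adapt them to your own workflow):

```markdown
## Plan file

For any non-trivial task:

1. Before writing code, create a ROADMAP.md at the repo root with the plan,
   broken into small steps, one checkbox per step.
2. After completing each step, update ROADMAP.md: check off what is done and
   note anything that changed along the way.
3. When a conversation is restarted, read ROADMAP.md first and resume from
   the first unchecked step.
```

A ROADMAP.md entry in .gitignore keeps the file local, as mentioned above.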
2) 🟡 Best effort context
A lot of the best effort context is not appropriate (or downright impossible) to co-locate: it’s the Slack chats, the ADRs of the relevant part of the code, and other info you might fetch from the various Notion, Linear, Github history, and so on.
Most of these tools have individual MCPs these days, but I feel the best way to fetch everything in a reliable way is to centralize access through a single tool, and—small plug here—I am very happy with Unblocked for this.
It’s hard to explain why individual MCPs fail, because it’s death by a thousand cuts:
If you have too many, agents get confused about when and how to use them. An agent may skip some of them at random, or stop calling them once it thinks it has an answer (even if it’s wrong).
Many MCPs are too simplistic and not designed to fetch exactly what you need. So the agent has to do extra work and extra calls, and it may fail at that.
They eat into reasoning / context window space: the less of it you spend on tool calls, the better.
Centralizing through a tool (Unblocked or others) also has the benefit that you can deliver the experience in multiple places, most crucially in Slack and Teams. In fact, while all of this long-tail content shouldn’t be crucial for deciding how to write the next line of code, it certainly is when you are still figuring things out. It’s vital for understanding old code (and how it came to be), for onboarding new folks, and so on.
🤏 Keep context small
My final piece of advice should have probably been the first. The easiest way to make both humans and AI work better with context is to keep it simple and small.
How so? In many ways:
Low coupling — the more code is componentized and isolated, the less stuff needs to be fetched / taken into account / updated.
Convention over configuration — great defaults can go a long way to keeping the cognitive load (and therefore context) small. Less code to be written means less stuff that needs to be remembered.
Keep the tech stack simple — using the same language everywhere, favoring buy over build, using tech that is well understood and doesn’t need a lot of documenting.
📌 Bottom line
And that’s it for today! Here are the main takeaways:
🎯 Context engineering beats prompt engineering — Focus on building multiplayer, dynamic systems that provide the right information reliably, rather than crafting magical wording. Design workflows where AI can fetch what it needs automatically.
📋 Separate what, how, and why — Good context includes task-specific instructions (what), procedures and principles (how), and strategic reasoning (why). Gradually shift from explicit instructions to embedded standards.
🔄 Focus on inputs, not just outputs — When something fails, don’t just fix the result—improve the context that created it. Nurture the feedback loop that makes the system better over time.
🟢 Co-locate mandatory context, centralize access for the rest — Store critical information directly in your repo (AGENTS.md, README, system design docs) for 100% reliable retrieval, while using a single tool to access distributed best-effort context across Slack, Notion, and other platforms.
🤏 Simplicity is your superpower — Keep context small through low coupling, convention over configuration, and a simple tech stack. Less context to manage means better results for both humans and AI.
See you next week!
Sincerely 👋
Luca
I want to thank again the folks at Unblocked for partnering on this! I am a fan of what they are building; you can learn more below 👇