Coding Agents Do Not Need Personal Memory

Why coding agents should learn codebases, not developers

May 15, 2026

Introduction

Every major coding agent now ships with some form of memory. Cursor has Project Rules, Team Rules, and AGENTS.md. Claude Code has CLAUDE.md plus auto memory that quietly accumulates what it thinks is worth remembering across sessions. Codex has AGENTS.md plus a Memories feature that learns your preferences, workflows, and recurring mistakes over time. The file based half of this is good. AGENTS.md, CLAUDE.md, and repo scoped rules are version controlled, reviewable, and live next to the code they describe. But there is now a second layer getting bolted on. Auto extracted memory. Per developer behavioral memory. Systems that watch how you work, track your corrections, and quietly build a profile of you over time.

This is the wrong problem to solve, at least for coding agents. Coding agents do not need personal memory the way customer support bots or personal assistants do. The important thing to remember when building software is usually not the human sitting at the keyboard. It is the architecture, constraints, conventions, and decisions inside the codebase itself.

And software already has a memory system. It is the repo. The bet of this post is that repo scoped context will keep winning over user scoped memory. The products that lean into this model will look obviously correct in hindsight.

Domain Agents vs Coding Agents

Memory makes obvious sense for some kinds of agents. A customer support bot needs to remember that you’ve already tried restarting the router. A personal assistant should know you’re vegetarian, that your daughter’s birthday is in March, that you prefer morning flights. A health coach is useless without a history of what you ate and how you slept. In all of these cases, the agent is helping a specific human, and the relevant context is genuinely about that human. Per user memory is the entire point of the product.

Coding agents are structurally different. The artifact they help create belongs to the team, the repo, and the company, not just the developer. The developer is just a temporary collaborator on a shared, long lived object. That changes what’s worth remembering and how.

Trying to treat a coding agent like a personal assistant creates some problems:

Conflict: Whose personal preferences win when two developers work together?
Leakage: Do a contractor’s habits override repo conventions when they work across multiple projects?
Continuity: When a developer leaves, does their accumulated agent memory vanish or quietly influence the rest of the team?

The framing is wrong because the agent is working on the code, not for the developer. Rules like “use camelCase here” or “validate inputs in a separate function” are facts about the repo, not personal preferences. Storing them as user memory leads to conflict and drift from team standards. In coding, the persistent entity is the codebase.
This does not mean coding agents have no per-session state. They obviously track what they're working on, what they tried, what failed, and what's next. But that state is ephemeral and it belongs in the active conversation, not in a persistent store. Task state is not personal memory. It's just the work.

Memory vs Context

The distinction between memory and context isn’t just semantics. It is a fundamental choice in software architecture that dictates infrastructure cost and system complexity. While both feed information to a model, they sit at two different points on a cost curve.

Memory requires heavy lifting. It updates constantly and needs automated policies for extraction, deduplication, and retrieval. This is the world of vector stores and semantic search where the system must constantly arbitrate between old and new data. Context is far leaner. It updates occasionally, usually through a git commit or a documentation change. Because it is stable and scoped, it can be loaded directly into a prompt without complex retrieval systems. AGENTS.md, CLAUDE.md, and a docs folder for Architectural Decision Records (ADRs) sitting in the same repo as the code are usually enough.

The cost curve has a sharp elbow. When update volatility is low, a markdown file and git history suffice and the cost of implementation is near zero. As update frequency rises, manual management breaks down and you are pushed into a high cost bracket requiring extraction pipelines and database maintenance to keep the agent from drifting.

For coding agents, most valuable information lives well below the volatility threshold. Architectural decisions and coding conventions do not change daily. Treating these stable truths as memory pushes teams toward unnecessary infrastructure when markdown files in the repo are usually sufficient.

The Repo as Memory

The repo is the primary memory for a coding agent. Once you accept that these agents need stable, project scoped context rather than volatile, user scoped memory, the repo becomes the obvious home for that information. This is not a new concept. A healthy engineering organization already uses the repo as a structural memory system, even if it is not framed in those terms.

Consider the artifacts already present in a typical repo:

Project Instructions: AGENTS.md or CLAUDE.md capture framework conventions and naming rules that act as always on instructions for the agent.
Decision History: Architectural Decision Records (ADRs) capture the reasoning behind major technical choices and provide an append only history of how the project evolved.
In flight Thinking: Design docs and RFCs capture upcoming migrations and open questions.

In practice, this looks like:

Each of these is version controlled, human readable, and governed by the same PR process the team already uses for code. The git diff serves as an audit log. git revert is the rollback mechanism. A team moving between agentic tools carries its ADRs and project rules with it effortlessly, never siloed in a secondary system.

The design naturally mitigates hallucinations. A flawed update to AGENTS.md shows up in a PR where a reviewer can catch it. Autogeneration of AGENTS.md is increasingly common, but I still recommend a human review pass before it is merged. The file is read by the agent every session and shapes everything it produces. A bad line in AGENTS.md compounds. The same mistake silently written into a vector store would quietly poison future sessions. Markdown in git makes agent generated context legible. The agent is just another contributor operating inside infrastructure the team already understands.

Conclusion

The industry is building two distinct memory layers into every coding agent. One is repo scoped, file based, and reviewable. The other is user scoped, auto extracted, and opaque. Both ship under the same label of memory, and this conflation causes teams to invest in expensive machinery they do not actually need. I expect the first layer to keep winning while the second quietly atrophies. Not because auto memory is a bad idea in general, but because coding is the wrong domain for it. The thing worth persisting in software is the codebase itself. The artifact already has a robust memory system, version control, a review process, and a portable storage format. Adding a vector store often solves a problem the repo already handles well.

In my work with various teams, we use AGENTS.md or CLAUDE.md files to capture project conventions and track architectural decisions in markdown files. ADRs were previously tracked in external wikis, but the modern approach is to store them in a docs folder within the same repo as the code. This keeps the context physically coupled with the logic. One open question remains. How should an agent get additional technical context like dependency graphs, call relationships, and type flows? Deterministic tools can generate these structures and an LLM can reason over the results. The real question is whether the agent should regenerate this data on demand each session or persist it in an external graph store. I lean toward on demand for now, because regeneration is cheap and the graph is derived from version controlled code. I can imagine specific workloads, such as autonomous code review agents, where persistence pays for itself. At that point, a graph store becomes a defensible third layer.

Put context where the code is and keep the harness thin. Add a new layer only when the one you have demonstrably stops working. When you do scale up, add the smallest thing that solves the actual problem rather than the most fashionable tool. The repository is the memory for a coding agent. Everything else is overhead until proven otherwise.

Bits and Bytes

Discussion about this post

Ready for more?