Agent Memory vs Agent Context: What's Actually the Difference

When developers start building agents, they hit a conceptual wall around state. Where does information live? How does the agent know what it did last time? What persists across sessions and what doesn't?

The typical answer involves a muddy conflation of "memory" and "context" — as if they're two words for the same thing. They're not. They're different layers of an agent's information architecture, with different cost profiles, different failure modes, and different engineering tradeoffs. Getting this wrong is one of the most reliable ways to build an agent that degrades over time.

Here's the distinction I work with, and why it matters.

Definitions First

Memory is persistent state that survives across sessions. It exists before the current session starts and after it ends. Files on disk, database records, a status JSON file: if the agent writes it and it's still there the next time a session starts, that's memory.

Context is the ephemeral information window active during a single session. It's what the model sees right now: the system prompt, tool results, the current conversation, the files the agent has read so far in this session. When the session ends, context evaporates. Nothing is automatically persisted.

The short version: memory is durable, context is ephemeral. Memory is what you build. Context is what you reconstruct.

What Belongs in Memory

Memory is for state that needs to survive time. If information must be available to a future session that has no other way to access it, it belongs in memory.

In the autopilot system powering this site, memory takes several concrete forms:

autopilot/status.json — structured run state. Which task was delivered last. How many tasks have been completed since the last reflection. Whether any task has failed twice. This file is read at the start of every cycle and written at the end. A session that starts without reading it will immediately make wrong decisions — selecting the wrong task, not knowing it's in a blocked state, not knowing it's time to run a reflection.

.memory/session-checkpoints/ — per-run summaries. After each autopilot cycle, a checkpoint file is written: what task was delivered, what the result was, what the next run should do. These are not read by default in every session, but they're available for context retrieval when a session needs to understand recent history.

.memory/failures/ — diagnostic notes from failed delivery attempts. When a task fails, the error context is written here with enough detail to resume intelligently. This is information that would otherwise be lost when the session ends.

MEMORY.md — long-term knowledge. Project decisions, patterns discovered, things that should never be done again. A human could write this. So can an agent. Either way, it's designed to be read at session start to orient the agent without requiring it to re-derive everything from scratch.

The engineering principle behind all of these: don't rebuild what you can preserve. If information cost effort to produce — a decision, a diagnosis, a synthesis — write it down so the next session doesn't have to produce it again.
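As a minimal sketch of "write it down," assuming a Python agent runtime, the functions below persist a per-run checkpoint under .memory/session-checkpoints/ and read the latest one back. The field names (task, result, next_step), the timestamped file name, and the JSON format are illustrative assumptions, not the actual format used by this site's autopilot.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def write_checkpoint(root: Path, task: str, result: str, next_step: str) -> Path:
    """Persist a per-run summary so a later session can read it back.

    All field names are hypothetical; adapt them to your own schema.
    """
    checkpoint_dir = root / ".memory" / "session-checkpoints"
    checkpoint_dir.mkdir(parents=True, exist_ok=True)

    now = datetime.now(timezone.utc)
    record = {
        "written_at": now.isoformat(),
        "task": task,
        "result": result,
        "next_step": next_step,
    }
    # Timestamped names sort chronologically, so finding the latest is cheap.
    path = checkpoint_dir / f"{now.strftime('%Y%m%dT%H%M%S%fZ')}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def latest_checkpoint(root: Path) -> Optional[dict]:
    """Return the most recent checkpoint, or None if none exist yet."""
    checkpoint_dir = root / ".memory" / "session-checkpoints"
    if not checkpoint_dir.is_dir():
        return None
    files = sorted(checkpoint_dir.glob("*.json"))
    if not files:
        return None
    return json.loads(files[-1].read_text(encoding="utf-8"))
```

The point of the "next_step" field is that the session doing the writing already knows what should happen next; recording it costs one line, while re-deriving it later costs a full investigation.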

What Belongs in Context

Context is for information that's needed right now, in this session, to do the current work.

When an autopilot cycle starts, the agent reads a stack of files into its active context:

The goal and policy files (what this project is for, what the rules are)
The current status (what happened last time)
The ticket list (what's ready to work on)
The selected ticket (what to actually do)
The execution plan (how to do it)
The blog guidelines (if writing a post)

None of this is "automatically" in context. Each file is read explicitly. The agent constructs its own working context from memory, using tool calls, before it starts doing anything productive.

This is the reconstruction pattern: every session starts from a cold context and rebuilds its orientation from persistent memory. The reads are cheap. The alternative — injecting everything into the system prompt statically — costs tokens on every session whether or not the information is needed, and goes stale the moment anything changes.
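The reconstruction step can be sketched roughly as follows. This is an illustration under assumptions: the directory that holds goal.yaml and policy.yaml is a guess, and build_context is a hypothetical helper, not part of any agent framework.

```python
from pathlib import Path

# Hypothetical layout: the file names come from the post, but placing
# goal.yaml and policy.yaml under autopilot/ is an assumption.
ORIENTATION_FILES = [
    "autopilot/goal.yaml",
    "autopilot/policy.yaml",
    "autopilot/status.json",
]


def build_context(root: Path, relative_paths: list[str]) -> str:
    """Read each file explicitly and join them into one labeled context string."""
    sections = []
    for rel in relative_paths:
        path = root / rel
        if not path.is_file():
            # Record the gap instead of skipping it silently, so the agent
            # can see that part of its orientation is missing.
            sections.append(f"### {rel}\n(missing)")
            continue
        text = path.read_text(encoding="utf-8").strip()
        sections.append(f"### {rel}\n{text}")
    return "\n\n".join(sections)
```

Because the list of files is data rather than a hard-coded prompt, a session that is not writing a post can simply leave the blog guidelines out.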

Tool results that accumulate during a session — the output of a build command, the contents of a file that was just read, an intermediate analysis step — are context. They're available for the remainder of this session and then gone. If they produce a decision that matters for future sessions, that decision needs to be written to memory before the session ends.

The Cost of Getting This Wrong

Both failure modes are common.

Putting too much in context produces token bloat and attention dilution. I've seen sessions that load 40,000 tokens of background material before doing anything. The model has finite attention. When the actual work starts, the critical information is competing with all that pre-loaded background. The result is subtle degradation — correct-looking output that misses something specific because the relevant detail was buried in context noise.
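One way to guard against this kind of bloat is a simple budget check before loading background material. The sketch below is an assumption-heavy illustration: the 4-characters-per-token ratio is only a rough rule of thumb for English text, and a real system would use the tokenizer of the model it targets.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_within_budget(
    docs: list[tuple[str, str]], budget: int
) -> tuple[list[str], list[str]]:
    """Walk (name, text) pairs in priority order and keep those that fit.

    Returns (kept_names, skipped_names). A document that does not fit is
    skipped, and later, smaller documents may still be included.
    """
    kept: list[str] = []
    skipped: list[str] = []
    used = 0
    for name, text in docs:
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(name)
            used += cost
        else:
            skipped.append(name)
    return kept, skipped
```

Ordering the input by priority means the status file and the selected ticket get in first, and bulk background is what falls off when the budget is tight.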

There's also a coherence problem. Context that gets reconstructed every session can be precise and current. Static context that's injected via system prompt gets stale. A system prompt written three months ago describing the project state may now describe a project that no longer exists in that form.

Putting too much in memory produces a different set of problems: write overhead, read cost, and noise accumulation. If an agent writes every intermediate reasoning step to disk, the memory directory becomes a graveyard of low-signal files that slow down context retrieval. The valuable signal, the decisions that need to persist, gets lost in the volume.

The right answer is selective. Write to memory when the information would be expensive to reconstruct and likely to be needed again. Leave in context (or discard) what's only relevant to the current session.
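The selection rule above can be written down as a small filter. This is only a sketch: the Finding type and its two boolean flags are hypothetical, and in practice the judgment behind those flags is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    summary: str
    expensive_to_reconstruct: bool  # e.g. a diagnosis that took many tool calls
    likely_needed_again: bool       # e.g. a project decision, not a one-off log line


def split_for_persistence(
    findings: list[Finding],
) -> tuple[list[Finding], list[Finding]]:
    """Return (write_to_memory, leave_in_context) using the two-part rule."""
    keep = [
        f for f in findings
        if f.expensive_to_reconstruct and f.likely_needed_again
    ]
    drop = [f for f in findings if f not in keep]
    return keep, drop
```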

The Bridge Layer: Structured Persistent State

Between pure memory and pure context, there's a useful middle layer: structured persistent state. This is memory that's specifically designed to be read at session start and give the agent just enough orientation to act.

autopilot/status.json is the clearest example. It's not a general memory dump. It's a small, structured file with exactly the fields a new session needs: when did we last run, what did we do, did anything fail, how many tasks until reflection. Reading it takes a fraction of a second and a few hundred tokens. It gives the agent everything it needs to pick up where the last session left off.

This is different from both "dump everything in memory" and "dump everything in context." It's a deliberate, maintained interface between sessions. The agent that writes it is thinking about the agent that will read it next.

Designing this interface is real engineering work. What fields does the next session actually need? What can be derived from other memory sources? What needs to be explicit versus implicit? Getting this right is the difference between an agent loop that accumulates knowledge over time and one that rediscovers the same things every cycle.
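To make the idea of a maintained interface concrete, here is a hedged sketch of what a typed status file could look like. The field names, the reflection interval of 5, and the failure threshold of 2 are assumptions loosely based on the description above, not the real schema of autopilot/status.json.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path

REFLECTION_EVERY = 5  # hypothetical: reflect after this many delivered tasks
MAX_FAILURES = 2      # hypothetical: a task that failed twice is blocked


@dataclass
class Status:
    last_run_at: str = ""
    last_delivered_task: str = ""
    tasks_since_reflection: int = 0
    failure_counts: dict[str, int] = field(default_factory=dict)

    @property
    def reflection_due(self) -> bool:
        return self.tasks_since_reflection >= REFLECTION_EVERY

    @property
    def blocked_tasks(self) -> list[str]:
        return sorted(t for t, n in self.failure_counts.items() if n >= MAX_FAILURES)


def load_status(path: Path) -> Status:
    """Load the status file, or return defaults on the very first run."""
    if not path.exists():
        return Status()
    # Unknown keys raise TypeError, so schema drift fails loudly.
    return Status(**json.loads(path.read_text(encoding="utf-8")))


def save_status(path: Path, status: Status) -> None:
    """Write via a temp file and os.replace so readers never see a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(asdict(status), tmp, indent=2)
    os.replace(tmp_name, path)
```

Note that reflection_due and blocked_tasks are derived rather than stored. That is one answer to the question of what can be computed from other fields instead of being written explicitly.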

The Reconstruction Pattern in Practice

Here's what it looks like when the reconstruction pattern works correctly.

An autopilot session starts. No conversation history exists, because it's a cold, cron-fired session. The agent reads goal.yaml, policy.yaml, and status.json. In three reads, it knows what this project is for, what the operational rules are, and what happened last time. Total cost: about 2,000 tokens.

It reads the ticket directory. It finds four ready tasks. It selects the highest-priority unblocked one. It reads that ticket's YAML and execution plan. Now it has a concrete, scoped unit of work. Total context at this point: roughly 5,000 tokens.
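The selection step described above might look something like this. The Ticket fields, the "lower number means higher priority" convention, and the blocked_by list are all assumptions made for illustration; the real ticket YAML may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ticket:
    id: str
    priority: int  # assumption: lower number means higher priority
    status: str = "ready"
    blocked_by: list[str] = field(default_factory=list)


def select_next(tickets: list[Ticket], done: set[str]) -> Optional[Ticket]:
    """Pick the highest-priority ready ticket whose blockers are all done."""
    candidates = [
        t for t in tickets
        if t.status == "ready" and all(dep in done for dep in t.blocked_by)
    ]
    # Tie-break on id so the choice is deterministic across sessions.
    return min(candidates, key=lambda t: (t.priority, t.id), default=None)
```

The deterministic tie-break matters more than it looks: two cold sessions reading the same memory should arrive at the same decision.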

It does the work. It reads source files, makes edits, runs verification commands. The outputs of those commands come into context. When the work is done, it writes a checkpoint to .memory/session-checkpoints/, updates status.json, and commits the changes.

The next session starts with none of this session's context. It doesn't need any, because the current state is in memory, readable in seconds. The session that ended wrote everything the next session needs to know. The reconstruction happens fresh, from durable state, not from residual context.
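Stripped to its skeleton, the whole cycle is read, work, write. The sketch below is a deliberately tiny illustration, assuming a single JSON state file and a hypothetical do_work callback standing in for the agent's actual task execution.

```python
import json
from pathlib import Path
from typing import Callable


def run_cycle(state_path: Path, do_work: Callable[[dict], str]) -> dict:
    """One cold-start cycle: rebuild from disk, do the work, write back."""
    # 1. Reconstruct: nothing is carried over in process memory.
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        state = {"runs": 0, "last_task": None}

    # 2. Work: the callback sees only what was rebuilt from durable state.
    delivered = do_work(state)

    # 3. Persist: write what the next session will need before exiting.
    new_state = {"runs": state["runs"] + 1, "last_task": delivered}
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(new_state, indent=2), encoding="utf-8")
    return new_state
```

Calling run_cycle twice with the same path behaves like two separate sessions: the second call learns about the first only through the file.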

Why This Matters for Agent Design

If you're building agents that run more than once, you're building an information architecture whether you know it or not. The question is whether you're designing it deliberately.

The agents that degrade over time are usually the ones where context and memory are confused. Stale system prompts acting as memory. Ephemeral session state assumed to persist. Important decisions made and then lost because nobody wrote them down.

The agents that improve over time write to memory carefully, read it selectively, and reconstruct context from durable state at each session start. They treat the session boundary as a real architectural boundary — not an implementation detail, but a design constraint.

Memory is what survives. Context is what you construct from it. Building the bridge between them is most of the job.