April 2026
Context Is a Budget
Agent operators talk about context like it is free oxygen. It is not. It is closer to cash, battery, and attention packed into one constraint.
Every extra token costs something: money, latency, retrieval noise, weaker focus, and more opportunities for stale information to masquerade as truth.
Position
Context should be spent where it changes the next decision. Everything else is clutter wearing a lanyard.
What Context Actually Buys
Useful context does one of four jobs:
- exposes the current objective
- states the active constraints
- shows the artifact under change
- defines how success will be judged
That is the whole game. If a block of text does not help with one of those four jobs, it is probably there because nobody wanted to decide what mattered.
The Common Failure
Most agent setups die from one of two bad instincts.
Instinct one: dump in everything "just in case."
Instinct two: keep adding summaries of summaries until the system cannot tell current truth from yesterday's interpretation of it.
The result looks thorough and feels intelligent. It is neither.
Wide context often makes agents more obedient to the wrong thing. They follow whatever is easiest to quote, not whatever is most relevant to the live decision.
Why More Context Does Not Mean More Truth
Truth in an agent workflow is not the total volume of available text. It is the subset that still governs the present task.
A stale note, an old roadmap, an outdated README section, and a one-line user correction do not have equal authority. But once they enter the same prompt soup, they start competing as peers.
This is the real problem with context inflation. It flattens authority.
The model sees everything. The operator forgets that not everything should count equally.
Authority Matters More Than Recall
Useful context is hierarchical.
- highest authority - current spec, explicit user correction, failing test, live interface contract
- middle authority - recent decision log, current implementation, current task brief
- low authority - brainstorming notes, old summaries, generic project lore
Most teams have this hierarchy implicitly. Few encode it explicitly. Then they wonder why the agent keeps honoring obsolete notes like a dutiful little historian.
The Four Good Uses of Context
1. State the Objective
What is happening now? Not what the project is "about." Not the company mission. What is the current move?
task: add retry backoff to outbound webhook delivery
done_when:
- retries use exponential backoff
- max attempt count is configurable
- existing webhook tests pass
- failure metrics remain emitted
2. State the Constraints
Good constraints save tokens because they eliminate branches of thought before the model wanders into them.
constraints:
- do not change database schema
- preserve public API response shape
- prefer existing queue abstractions
- no new dependencies
3. Show the Work Surface
The most valuable context is usually the artifact under change: the file, interface, failing test, log line, schema, or diff. Not a sermon about the repository.
4. Define Judgment
Agents behave better when the evaluation surface is present. See Reward Engineering for Coding Agents. If success is visible, less narration is needed.
Bad Context Smells
- long codebase overviews before a narrow task
- duplicate summaries at multiple levels of abstraction
- persona fluff that outranks operational facts
- old decisions with no validity window
- notes that explain what the code already says clearly
- entire browser snapshots shoved into every turn
If context starts looking like archival preservation, the workflow has already gone off the rails.
Memory Is Not a Junk Drawer
Persistent memory gets sold as the answer to drift. Sometimes it is. Often it just preserves mistakes at scale.
Memory only helps when it stores things that should survive task boundaries:
- stable project facts
- decisions that remain binding
- open questions that still block progress
- operator preferences that genuinely affect output
Everything else should decay.
That sounds harsh. It is correct. A memory system without forgetting is just an optimism-powered landfill.
Compression Beats Accumulation
Strong operators compress state after each meaningful step.
Weak operators accumulate every intermediate thought because deleting anything feels risky.
Compression is not summarization theater. It means writing the smallest artifact that preserves the next valid move.
current_state:
objective: ship webhook backoff without API changes
decisions:
- use existing retry queue
- default max attempts stays 5
unresolved:
- metric name for terminal failure event
next_step:
- patch dispatcher and update tests
That is enough. The rest can be reconstructed from code and git history if needed.
Prompt Caching Does Not Solve Epistemology
Cheap cached context is still context. Lower price does not turn fluff into signal.
Caching is useful when the same high-value substrate must stay present across turns. It is not a moral license to keep every paragraph ever written in the loop forever.
Heartbeats and Checkpoints
Long-running agent work benefits from tiny recurring checks. But the heartbeat should ask whether reality changed, not restate the entire universe.
A good heartbeat pattern is boring:
if nothing_changed:
return HEARTBEAT_OK
if state_changed:
return {
changed_fact,
impact_on_task,
required_next_action
}
The point is to surface deltas. If every heartbeat rehydrates the full workflow state, the system is paying rent on its own indecision.
The Better Rule
Pass full context only at real boundaries:
- new task class
- new artifact surface
- spec revision
- handoff between roles
Inside a narrow loop, context should shrink, not grow.
Practical Consequences
- Prefer references to artifacts over pasting entire artifacts.
- Store state in versioned files, not only in conversation history.
- Expire summaries when the underlying truth changes.
- Make authority explicit: spec beats note, test beats plan, operator correction beats both.
- Measure context by decision value, not token count alone.
Context is a spending decision. Spend it on the facts that change the next move. Cut the rest before it starts pretending to be knowledge.
Further Reading
- AGENTS.md Effectiveness - When extra repository context helps and when it just burns money
- Modular Workflow Stack - Why context belongs in a distinct layer
- Reward Engineering for Coding Agents - The evaluation surface matters more than prompt bulk