April 2026
Specs as Shared Reality
Agents do not share understanding in any mystical sense. They do not "align" by wanting the same thing. They align, when they align at all, by reading the same artifacts and being judged against the same boundaries.
The spec is the closest thing an agent workflow has to common reality.
Position
If the truth of the task only exists in conversation history, then the task does not have a durable truth.
Human teams can survive a surprising amount of ambiguity because they carry shared background, tacit knowledge, and the ability to ask annoying follow-up questions.
Agents are worse at all three.
That changes the epistemology of delivery.
Instead of asking whether the model "understood," the better question is: what artifact would make misunderstanding harder?
Conversation Is Not a Stable Substrate
Conversation history feels like shared reality because it is chronologically complete. That is not the same thing as being authoritative.
History contains:
- draft ideas
- obsolete assumptions
- midstream reversals
- soft language that never became a decision
- local clarifications buried under later turns
Asking each new agent to reconstruct truth from that pile is not robust. It is archaeological cosplay.
What a Spec Does
A spec does not need to be grand or bureaucratic. It just needs to collapse uncertainty into a current, inspectable statement.
At minimum, a useful spec names:
- the objective
- the actors
- the expected behavior
- the constraints
- the rejection conditions
    feature: webhook retry backoff
    objective: prevent rapid retry storms on downstream failure
    behavior:
      - failed deliveries retry with exponential backoff
      - max attempts default to 5
      - terminal failure emits metric and marks delivery dead
    constraints:
      - no schema changes
      - preserve current API payload shape
      - reuse existing queue system
    reject_if:
      - retries become unbounded
      - delivery ordering semantics change
      - failure metrics disappear
That is enough to ground planning, implementation, QA, and review in the same world.
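One way to keep that minimum honest is to check it mechanically before any agent starts work. The following is a minimal sketch, assuming the spec has already been parsed into a dictionary. The field names mirror the "at minimum" list, and `missing_fields` is a hypothetical helper, not part of any existing tool.

```python
# Hypothetical required fields, taken from the "at minimum" list in this post.
REQUIRED_FIELDS = ("objective", "actors", "behavior", "constraints", "reject_if")


def missing_fields(spec: dict) -> list[str]:
    """Return the required fields that are absent or empty in a parsed spec."""
    return [name for name in REQUIRED_FIELDS if not spec.get(name)]


# The webhook example from this post, written out as a plain dictionary.
webhook_spec = {
    "feature": "webhook retry backoff",
    "objective": "prevent rapid retry storms on downstream failure",
    "behavior": [
        "failed deliveries retry with exponential backoff",
        "max attempts default to 5",
        "terminal failure emits metric and marks delivery dead",
    ],
    "constraints": [
        "no schema changes",
        "preserve current API payload shape",
        "reuse existing queue system",
    ],
    "reject_if": [
        "retries become unbounded",
        "delivery ordering semantics change",
        "failure metrics disappear",
    ],
}

print(missing_fields(webhook_spec))  # ['actors']
```

Run against the example as written, the check reports that `actors` is not named. That is the kind of gap a cheap structural check surfaces before an agent fills it in by guessing.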
Specs Beat Memory for the Important Things
Memory is useful for durable preferences and persistent facts. Specs are better for current truth because they are explicit, local, and revisable on purpose.
Memories linger. Specs can be versioned, superseded, diffed, and approved.
A good spec is a legitimacy mechanism: it marks which statement of the task is the approved one.
Shared Reality Requires Versioning
The moment multiple agents or sessions touch the same feature, versioning stops being optional.
If one agent read spec v2 and another implemented against v1, the team no longer shares a reality. It shares a category name.
    spec: specs/auth-v2.md
    version: 6f1e0d9
    status: approved
    supersedes: auth-v1.md
That tiny bit of ceremony pays for itself immediately. It prevents the soft chaos where everyone thinks they are talking about the same thing because the filename still sounds familiar.
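The header only helps if something actually compares it. Here is a rough sketch of a version pin check, assuming each agent records the spec path and version it read. `SpecRef`, `content_version`, `check_pin`, and `StaleSpecError` are all hypothetical names, and deriving the version from a truncated SHA-256 of the spec text is an assumption; a commit hash from version control would serve the same purpose.

```python
import hashlib
from dataclasses import dataclass


class StaleSpecError(Exception):
    """Raised when an agent's pinned spec differs from the current one."""


@dataclass(frozen=True)
class SpecRef:
    path: str
    version: str


def content_version(text: str) -> str:
    """Derive a short version id from the spec text, so any edit changes it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:7]


def check_pin(pinned: SpecRef, current: SpecRef) -> None:
    """Refuse to proceed when the agent read a different spec than the current one."""
    if pinned != current:
        raise StaleSpecError(
            f"agent read {pinned.path}@{pinned.version}, "
            f"current is {current.path}@{current.version}"
        )
```

An agent would call `check_pin` before starting work and again before handing off, so a mismatch surfaces as an explicit error instead of a quiet divergence.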
Tests Are Spec Fragments
Tests do not replace specs. They do, however, encode parts of the same reality.
The best workflows make the relationship explicit:
- spec says what should happen
- tests prove selected claims about what should happen
- implementation attempts to satisfy both
When the spec and tests disagree, that disagreement is gold. It reveals that the workflow does not actually have one truth yet.
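As an illustration, tests can name the spec claim they cover, so a disagreement points at a specific line of the spec. The sketch below uses the webhook example from earlier in this post. `backoff_schedule` is a hypothetical function, and it assumes one delay per attempt with a doubling factor; a real system might send the first attempt immediately or add jitter.

```python
def backoff_schedule(base_seconds: float = 1.0, max_attempts: int = 5) -> list[float]:
    """Return one delay per attempt, doubling each time; the list length bounds retries."""
    return [base_seconds * 2 ** attempt for attempt in range(max_attempts)]


def test_backoff_is_exponential():
    """Spec claim (behavior): failed deliveries retry with exponential backoff."""
    delays = backoff_schedule(base_seconds=1.0)
    assert all(later == 2 * earlier for earlier, later in zip(delays, delays[1:]))


def test_max_attempts_defaults_to_five():
    """Spec claim (behavior): max attempts default to 5."""
    assert len(backoff_schedule()) == 5


def test_retries_are_bounded():
    """Spec claim (reject_if): retries become unbounded."""
    assert len(backoff_schedule(max_attempts=3)) == 3
```

If the spec later changes the default to 3 attempts, `test_max_attempts_defaults_to_five` fails and its docstring says which claim is now in dispute.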
What Happens Without a Spec
Planning Drifts Into Invention
The planning step quietly becomes requirements generation.
Implementation Overfits to the Last Message
The coder agent obeys whatever was said most recently, whether or not that message outranks earlier decisions.
Review Turns Vague
Without a grounded artifact, review comments become aesthetic: "this feels wrong," "maybe cleaner," "not quite what was intended." Translation: nobody can point to shared truth.
Handoffs Become Storytelling
Instead of passing a stable document, agents summarize their own interpretation of what happened. Now every handoff introduces a chance to mutate reality by accident.
Spec Writing for Agents Is Different
Traditional specs often assume patient human readers who can tolerate narrative, politics, and historical context. Agent-facing specs should be tighter.
Better agent specs are:
- operational instead of inspirational
- bounded instead of visionary
- explicit about rejection conditions
- split into sections that can be referenced directly
- short enough to reload without resentment
The good news is that this is also better for humans.
Specs Need Update Rules
A stale spec is worse than no spec because it advertises confidence it does not deserve.
Every workflow needs to answer:
- who is allowed to change the spec
- which changes need human approval
- how downstream agents are told a revision happened
- whether in-flight work must restart on spec change
If that policy does not exist, "the spec" is just a nice-looking prop.
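Those four questions can be written down as data rather than left as habit. The following is a minimal sketch, assuming spec sections have stable names and changes are described by the set of sections they touch. `UpdatePolicy`, `Decision`, and `evaluate_change` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdatePolicy:
    editors: frozenset[str]            # who is allowed to change the spec
    approval_sections: frozenset[str]  # sections whose changes need human approval
    restart_sections: frozenset[str]   # sections whose changes restart in-flight work


@dataclass(frozen=True)
class Decision:
    allowed: bool
    needs_human_approval: bool
    restart_in_flight: bool
    notify_downstream: bool


def evaluate_change(policy: UpdatePolicy, author: str, changed: set[str]) -> Decision:
    """Answer the four update questions for one proposed spec revision."""
    allowed = author in policy.editors
    return Decision(
        allowed=allowed,
        needs_human_approval=allowed and bool(changed & policy.approval_sections),
        restart_in_flight=allowed and bool(changed & policy.restart_sections),
        # Any accepted revision is announced; silent edits are what make specs stale.
        notify_downstream=allowed and bool(changed),
    )


policy = UpdatePolicy(
    editors=frozenset({"planner", "human"}),
    approval_sections=frozenset({"objective", "reject_if"}),
    restart_sections=frozenset({"behavior", "constraints"}),
)
decision = evaluate_change(policy, "planner", {"reject_if"})
```

In this example the planner may edit `reject_if`, but the change waits for human approval and does not force in-flight work to restart. Which sections land in which set is a policy choice for each workflow, not something this sketch decides.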
Specs Are the Human-Agent Treaty
The spec is where human intent becomes something the system can inspect repeatedly without asking the human to restate it every time.
This matters because repeated restatement is a hidden tax. Each retelling introduces drift, omission, and accidental re-prioritization.
A spec stops that bleed. It turns intent into a revisable object.
Practical Rules
- Write a spec before parallel work begins.
- Version it when multiple sessions or agents depend on it.
- Keep it shorter than the conversation it replaces.
- Link tests and rubrics back to it explicitly.
- Treat undocumented midstream changes as scope changes, not casual chat.
Agents do not need a perfect theory of the project.
They need the same binding artifact, the same revision history, and the same criteria for being wrong. That is what shared reality looks like in practice.
Further Reading
- Project AI Philosophy - Why bounded operating rules matter
- Modular Workflow Stack - Specs as a distinct context layer
- Context Is a Budget - Why stable truth should be compressed, not endlessly repeated