Specs as Shared Reality

April 2026

Specs as Shared Reality

Agents do not share understanding in any mystical sense. They do not "align" by wanting the same thing. They align, when they align at all, by reading the same artifacts and being judged against the same boundaries.

The spec is the closest thing an agent workflow has to common reality.

Position

If the truth of the task only exists in conversation history, then the task does not have a durable truth.

Human teams can survive a surprising amount of ambiguity because they carry shared background, tacit knowledge, and the ability to ask annoying follow-up questions.

Agents are worse at all three.

That changes the epistemology of delivery.

Instead of asking whether the model "understood," the better question is: what artifact would make misunderstanding harder?

Conversation Is Not a Stable Substrate

Conversation history feels like shared reality because it is chronologically complete. That is not the same thing as being authoritative.

History contains:

draft ideas
obsolete assumptions
midstream reversals
soft language that never became a decision
local clarifications buried under later turns

Asking each new agent to reconstruct truth from that pile is not robust. It is archaeological cosplay.

What a Spec Does

A spec does not need to be grand or bureaucratic. It just needs to collapse uncertainty into a current, inspectable statement.

At minimum, a useful spec names:

the objective
the actors
the expected behavior
the constraints
the rejection conditions

feature: webhook retry backoff

objective:
  prevent rapid retry storms on downstream failure

behavior:
  - failed deliveries retry with exponential backoff
  - max attempts default to 5
  - terminal failure emits metric and marks delivery dead

constraints:
  - no schema changes
  - preserve current API payload shape
  - reuse existing queue system

reject_if:
  - retries become unbounded
  - delivery ordering semantics change
  - failure metrics disappear

That is enough to ground planning, implementation, QA, and review in the same world.

Specs Beat Memory for the Important Things

Memory is useful for durable preferences and persistent facts. Specs are better for current truth because they are explicit, local, and revisable on purpose.

Memories linger. Specs can be versioned, superseded, diffed, and approved.

A good spec is a legitimacy mechanism.

Shared Reality Requires Versioning

The moment multiple agents or sessions touch the same feature, versioning stops being optional.

If one agent read spec v2 and another implemented against v1, the team no longer shares a reality. It shares a category name.

spec: specs/auth-v2.md
version: 6f1e0d9
status: approved
supersedes: auth-v1.md

That tiny bit of ceremony pays for itself immediately. It prevents the soft chaos where everyone thinks they are talking about the same thing because the filename still sounds familiar.

Tests Are Spec Fragments

Tests do not replace specs. They do, however, encode parts of the same reality.

The best workflows make the relationship explicit:

spec says what should happen
tests prove selected claims about what should happen
implementation attempts to satisfy both

When the spec and tests disagree, that disagreement is gold. It reveals that the workflow does not actually have one truth yet.

What Happens Without a Spec

Planning Drifts Into Invention

The planning step quietly becomes requirements generation.

Implementation Overfits to the Last Message

The coder agent obeys whatever was said most recently, whether or not it outranks the rest of the project.

Review Turns Vague

Without a grounded artifact, review comments become aesthetic: "this feels wrong," "maybe cleaner," "not quite what was intended." Translation: nobody can point to shared truth.

Handoffs Become Storytelling

Instead of passing a stable document, agents summarize their own interpretation of what happened. Now every handoff introduces a chance to mutate reality by accident.

Spec Writing for Agents Is Different

Traditional specs often assume patient human readers who can tolerate narrative, politics, and historical context. Agent-facing specs should be tighter.

Better agent specs are:

operational instead of inspirational
bounded instead of visionary
explicit about rejection conditions
split into sections that can be referenced directly
short enough to reload without resentment

The good news is that this is also better for humans.

Specs Need Update Rules

A stale spec is worse than no spec because it advertises confidence it does not deserve.

Every workflow needs to answer:

who is allowed to change the spec
which changes need human approval
how downstream agents are told a revision happened
whether in-flight work must restart on spec change

If that policy does not exist, "the spec" is just a nice-looking prop.

Specs Are the Human-Agent Treaty

The spec is where human intent becomes something the system can inspect repeatedly without asking the human to restate it every time.

This matters because repeated restatement is a hidden tax. Each retelling introduces drift, omission, and accidental re-prioritization.

A spec stops that bleed. It turns intent into a revisable object.

Practical Rules

Write a spec before parallel work begins.
Version it when multiple sessions or agents depend on it.
Keep it shorter than the conversation it replaces.
Link tests and rubrics back to it explicitly.
Treat undocumented midstream changes as scope changes, not casual chat.

Agents do not need a perfect theory of the project.

They need the same binding artifact, the same revision history, and the same criteria for being wrong. That is what shared reality looks like in practice.

StayFresh