StayFresh

Static archive of workflow research and patterns

April 2026

Modular Workflow Stack

The right tool for the right job at the right time. In sequence, with deliberate hand-offs and human checkpoints between layers.

A modular workflow is a decision surface with execution attached.

The Skills Stack pattern (role-based orchestration via gstack, context stability via GSD, execution via Superpowers) makes this concrete. Each layer has a job. Each layer hands off cleanly. None of them run simultaneously from a single monolithic prompt.

The Three Layers

Every durable workflow decomposes into three concerns:

| Layer | Job | Failure Mode When Missing |
|---|---|---|
| Orchestration | Decide what to do next, route to the right agent, gate on human input | Agents start work before the goal is clear; you debug outputs instead of inputs |
| Context | Keep specifications stable across a long chain of steps | Context drift: later steps contradict earlier decisions; agents re-argue closed questions |
| Execution | Do one narrow task well with minimal context | Bloated prompts, multi-objective confusion, unpredictable output |

The layers must be physically separate. An orchestration prompt that also does execution collapses both failure modes into one.

Layer 1: Orchestration

The orchestrator is the conductor, not a performer. Its only job is to route: given the current state and user input, which agent runs next, with what context, and who reviews the result?

Orchestrator Responsibilities

Role-Based Routing

gstack formalizes what most effective teams do informally: different decisions belong to different roles. The same principle applies to agents.

| Decision Type | Route To | Why |
|---|---|---|
| Product scope, priority | Product/CEO agent | Avoids over-engineering at the execution layer |
| Architecture, interfaces | Engineering Manager agent | Separates design from implementation concerns |
| UX, component structure | Designer agent | Keeps visual decisions out of backend prompts |
| Implementation | Execution agent | Single-objective, narrow context, fast |
| Correctness, edge cases | QA agent | Fresh context; no sunk-cost bias from implementation |
| Security, injection, auth | Security agent | Adversarial lens requires explicit framing |
| Merge, deploy, release | Release agent | Separate concern; human gate before irreversible push |

The orchestrator does not implement any of these roles. It knows which role applies and routes accordingly.
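The routing table above can be expressed as a pure lookup, which keeps the orchestrator honest: it returns a role, never a result. A minimal sketch in Python; the `Agent` enum and `route` function are illustrative names, not gstack's API:

```python
from enum import Enum

class Agent(Enum):
    PRODUCT = "product"
    ENG_MANAGER = "eng-manager"
    DESIGNER = "designer"
    EXECUTION = "execution"
    QA = "qa"
    SECURITY = "security"
    RELEASE = "release"

# Decision-type -> role mapping, mirroring the table above.
ROUTES = {
    "scope": Agent.PRODUCT,
    "priority": Agent.PRODUCT,
    "architecture": Agent.ENG_MANAGER,
    "interfaces": Agent.ENG_MANAGER,
    "ux": Agent.DESIGNER,
    "implementation": Agent.EXECUTION,
    "correctness": Agent.QA,
    "security": Agent.SECURITY,
    "release": Agent.RELEASE,
}

def route(decision_type: str) -> Agent:
    """Return the role responsible for a decision; never do the work here."""
    try:
        return ROUTES[decision_type]
    except KeyError:
        raise ValueError(f"No route for decision type: {decision_type!r}")
```

An unknown decision type raises rather than defaulting to an agent, matching the principle that routing failures should surface, not silently execute.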

Layer 2: Context Stability

Long chains of agents fail from context drift. The specification decided in step 2 is forgotten by step 8. The GSD pattern addresses this by treating the spec as a first-class artifact, not a conversation thread.

What Context Stability Requires

# Context handoff pattern
# At each agent boundary, pass the spec explicitly:

SPEC: See SPEC.md at commit abc123
TASK: Implement the authentication module as defined in section 3.2
CONSTRAINTS: Do not modify the user model schema
OUTPUT: PR ready for QA agent review

# The spec is not in the prompt. It is referenced by the prompt.

This separates context (stable, versioned) from instructions (per-task, ephemeral). Token cost drops. Drift disappears. Disputes resolve against the written spec, not conversation history.
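A sketch of the handoff as a small prompt builder, assuming the field names above; the helper is illustrative, not part of GSD:

```python
def handoff(spec_path: str, spec_commit: str, task: str,
            constraints: str, output: str) -> str:
    """Build a per-task prompt that references the spec instead of inlining it.

    The spec stays a versioned artifact; only the pointer (path + commit)
    travels with the task, so every agent resolves disputes against the
    same written document.
    """
    return "\n".join([
        f"SPEC: See {spec_path} at commit {spec_commit}",
        f"TASK: {task}",
        f"CONSTRAINTS: {constraints}",
        f"OUTPUT: {output}",
    ])
```

Each call produces a four-line, per-task prompt whose token cost is independent of the spec's length.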

Layer 3: Execution

Execution agents are narrow by design. They receive a single objective, minimal context, and a verifiable exit condition. Width is the orchestrator's job. Depth is the execution agent's job.

# Good execution prompt
ROLE: QA agent
CONTEXT: See SPEC.md §3.2, auth module PR #47
TASK: Find edge cases not covered by the current test suite
OUTPUT: Numbered list of uncovered cases with reproduction steps
HALT: When list is complete or you have checked all spec assertions

# Bad execution prompt (orchestration collapsed into execution)
You are a full-stack engineer. Review the spec, implement auth,
write tests, check security, prepare the PR, and make sure
it matches the design. Be thorough.

Human-in-the-Loop

Human gates are not an apology for agent unreliability. They are the architecture. The workflow is designed around them.

Where to Insert Gates

| Step | Gate Type | Question to the Human |
|---|---|---|
| Before spec is finalized | Approval | Does this spec match your intent? |
| After architecture decision | Approval | Does this design fit constraints we haven't told the agent? |
| After QA report | Triage | Which findings are blockers vs. accepted risk? |
| Before any push/deploy | Hard gate | Explicit approval; no default proceed |
| When agent signals uncertainty | Escalation | Agent surfaces ambiguity; human resolves it |

Gates must be explicit in the workflow definition. An implicit assumption that the human will "just notice" when to intervene is an absence of architecture, not a gate.
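The hard-gate row implies a strict predicate: only an explicit approval proceeds, and there is no default path forward. A minimal sketch (the names `GateDecision` and `hard_gate` are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GateDecision:
    step: str
    question: str
    approved: bool

def hard_gate(step: str, question: str, response: str) -> GateDecision:
    """Resolve a hard gate from an explicit human response.

    Only a literal 'yes' approves; silence, 'ok', or an empty reply all
    block. There is deliberately no default-proceed branch and no timeout
    fallthrough.
    """
    approved = response.strip().lower() == "yes"
    return GateDecision(step=step, question=question, approved=approved)
```

Recording the decision as a value, rather than branching inline, leaves an audit trail of what the human approved at which step.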

Designing for Interruption

A workflow that cannot be interrupted mid-chain is fragile. Every long chain should support interruption at each of its gates, not only at the end.

Parallelism

Independent tasks should not run sequentially. The constraint is dependency, not caution.

When to Parallelize

# Sequential (correct: B depends on A's output)
A: Finalize spec
B: Implement auth module per spec

# Parallel (correct: no dependency between B and C)
A: Finalize spec
B: Implement auth module per spec     ← launch together
C: Write E2E test scaffold per spec   ← launch together
D: Security review of spec            ← launch together
E: Merge B+C+D results, resolve conflicts

Parallelism Boundaries

| Safe to Parallelize | Must Be Sequential |
|---|---|
| Independent feature branches | Spec finalization → implementation |
| QA + security review of same PR | Implementation → QA |
| Multiple execution agents on different modules | Architecture → any implementation |
| Competing design proposals | Human gate → next phase |
| Background context refresh | Merge + conflict resolution |

Parallelism multiplies throughput only when the merge step is cheap. If parallel outputs require substantial reconciliation, the cost is hidden, not eliminated. Design merge steps explicitly; they are not free.
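Assuming agent dispatch is just a callable, the fan-out above (finalize the spec, then launch B, C, and D together) can be sketched with a thread pool; `run_agent` is a stand-in for real dispatch:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Stand-in for dispatching a task to an execution agent."""
    return f"result of {task}"

def fan_out(spec_ready: bool, tasks: list[str]) -> dict[str, str]:
    """Launch independent tasks together once the shared dependency
    (the finalized spec) exists, then collect results for an explicit
    merge step."""
    if not spec_ready:
        # The constraint is dependency: nothing launches before the spec.
        raise RuntimeError("spec must be finalized before fan-out")
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {task: pool.submit(run_agent, task) for task in tasks}
        return {task: f.result() for task, f in futures.items()}
```

The merge step receives all outputs at once, which is what makes its cost visible rather than hidden.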

Loops

Loops are the mechanism for refinement. They require halting conditions, not just goals.

Loop Anatomy

LOOP:
  INPUT:  Current state + failure signal
  TASK:   Fix one thing
  VERIFY: Run oracle (tests, lint, typecheck)
  HALT:   Oracle passes OR loop count exceeds N
  ON HALT EXCEEDED: Escalate to human, do not auto-proceed
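The anatomy above translates directly into code. A sketch, assuming the oracle returns a machine-verifiable pass/fail plus a failure signal; on budget exhaustion it escalates rather than auto-proceeding:

```python
def fix_loop(oracle, fix_one, state, max_iters: int = 3):
    """Bounded refinement loop.

    `oracle` is a machine-verifiable check (tests, lint, typecheck)
    returning (passed, failure_signal); `fix_one` fixes exactly one
    thing per iteration. Exceeding max_iters escalates to a human
    instead of proceeding.
    """
    for _ in range(max_iters):
        passed, signal = oracle(state)
        if passed:
            return ("halt", state)
        state = fix_one(state, signal)
    passed, _ = oracle(state)  # last fix may have succeeded
    if passed:
        return ("halt", state)
    return ("escalate", state)  # hand to a human; do not auto-proceed
```

A trivial oracle (counter reaches zero) is enough to exercise both exits; real oracles are test suites and linters.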

Loop Types

| Loop Type | Trigger | Halting Condition |
|---|---|---|
| Fix-CI loop | Test/lint failure | All checks pass |
| Review loop | QA or human feedback | All blockers addressed |
| Refinement loop | Output quality below rubric threshold | Score exceeds threshold or max iterations |
| Exploration loop | Unknown solution space | N candidates generated; human selects |
| Context-refresh loop | Spec version mismatch | Agent confirms spec alignment |

A loop without a halting condition is a runaway. A loop that halts when the agent declares itself "done" never halts on time, because "done" is self-reported. Halting conditions must be machine-verifiable.

Composing Many Steps

A 30-step workflow is not a 30-prompt workflow. Most prompts are small. The complexity is in the graph, not the nodes.

Step Graph Properties

Workflow Definition Pattern

# Minimal workflow definition
workflow: auth-feature
spec: specs/auth-v2.md

steps:
  - id: scope
    agent: product
    input: user_request
    gate: human_approval

  - id: design
    agent: eng-manager
    input: scope.output
    gate: human_approval

  - id: implement
    agent: execution
    parallel:
      - id: impl-backend
        input: design.output
      - id: impl-tests
        input: design.output
      - id: security-review
        input: design.output

  - id: merge
    agent: eng-manager
    input: [impl-backend.output, impl-tests.output, security-review.output]

  - id: qa
    agent: qa
    input: merge.output
    loop:
      on: qa_findings
      until: no_blockers
      max: 3

  - id: release
    agent: release
    input: qa.output
    gate: human_approval  # hard gate; no default proceed

This is not a prompt. It is a schema. The prompts are inside the agent definitions, kept separate from the workflow graph. When a step fails, you debug the step definition, not the entire chain.
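Because the definition is a schema, it can be checked before any agent runs. A sketch that verifies every step's inputs reference an already-declared step, catching implicit sequencing early; the step shape mirrors the definition above, and this validator is illustrative (sub-step inputs are omitted for brevity):

```python
def validate_workflow(steps: list[dict]) -> list[str]:
    """Return a list of dependency errors in a workflow definition.

    Each step may declare 'input' as a string or list of references
    like 'scope.output'; a reference is valid only if its source step
    (or the initial 'user_request') was declared earlier.
    """
    errors, seen = [], {"user_request"}
    for step in steps:
        inputs = step.get("input", [])
        if isinstance(inputs, str):
            inputs = [inputs]
        for ref in inputs:
            source = ref.split(".")[0]  # 'scope.output' -> 'scope'
            if source not in seen:
                errors.append(f"{step['id']}: undeclared dependency {ref!r}")
        seen.add(step["id"])
        for sub in step.get("parallel", []):
            seen.add(sub["id"])
    return errors
```

Run against the definition above, an empty error list means every edge in the step graph is explicit; a reordered or orphaned step fails loudly instead of breaking silently.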

Token Economics at Scale

Long workflows amplify token decisions made early. A 600-token context file loaded at every step of a 30-step workflow is 18,000 tokens spent on generic context. Task-specific context passed only to the relevant step costs a fraction of that.
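The arithmetic is worth making explicit. A sketch comparing broadcast context against targeted context; the step names and token counts in the usage example are hypothetical:

```python
def broadcast_cost(context_tokens: int, steps: int) -> int:
    """Tokens spent when the same generic context loads at every step."""
    return context_tokens * steps

def targeted_cost(per_step_context: dict[str, int]) -> int:
    """Tokens spent when each step receives only its own context."""
    return sum(per_step_context.values())
```

`broadcast_cost(600, 30)` reproduces the 18,000-token figure above, while a targeted budget like `{"implement": 400, "qa": 250, "release": 100}` totals 750 tokens for the same work.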

Rule of thumb: reference the spec instead of inlining it, and pass each step only the context that step needs.

Anti-Patterns

The Monolith Prompt

A single prompt that asks an agent to plan, implement, review, and ship. All three layers collapsed into one. When it fails (and it will), there is nowhere to debug.

Implicit Sequencing

Running steps in order without documenting why. When a step needs to move or be parallelized, the dependency is unknown. The sequence breaks silently.

Unbounded Loops

Loops without maximum iterations or without escalation on failure. The agent retries indefinitely. The human discovers it hours later.

Framework Stacking Without Layer Separation

Running gstack, GSD, and Skills from a single prompt collapses all three layers into one context. The frameworks are complements because they operate at different layers; running them in parallel from one context eliminates the benefit of any of them.

Parallelism Without Merge Design

Launching parallel agents without planning how their outputs reconcile. Merge conflicts in parallel agent output are harder to resolve than sequential conflicts because neither agent knows about the other's decisions.

Missing Human Gates

Automating past a decision point that requires human judgment. The workflow moves fast and lands in the wrong place. The issue is irreversibility, not speed.

The Compounding Effect

Garry Tan's reported output (10,000 lines of code and 100 pull requests per week over 50 days) is the compounding product of clean layer separation applied consistently, not any single tool or prompt.

Each layer runs at its own level of abstraction.

The workflow does not scale because it runs faster. It scales because it fails locally. Failures in execution do not corrupt orchestration. Failures in orchestration do not corrupt the spec. Each layer's failure mode is contained to that layer.
