April 2026
Modular Workflow Stack
The right tool for the right job at the right time. In sequence, with deliberate hand-offs and human checkpoints between layers.
A modular workflow is a decision surface with execution attached.
The Skills Stack pattern (role-based orchestration via gstack, context stability via GSD, execution via Superpowers) makes this concrete. Each layer has a job. Each layer hands off cleanly. None of them run simultaneously from a single monolithic prompt.
The Three Layers
Every durable workflow decomposes into three concerns:
| Layer | Job | Failure Mode When Missing |
|---|---|---|
| Orchestration | Decide what to do next, route to the right agent, gate on human input | Agents start work before the goal is clear; you debug outputs instead of inputs |
| Context | Keep specifications stable across a long chain of steps | Context drift: later steps contradict earlier decisions; agents re-argue closed questions |
| Execution | Do one narrow task well with minimal context | Bloated prompts, multi-objective confusion, unpredictable output |
The layers must be physically separate. An orchestration prompt that also does execution inherits both layers' failure modes at once, and neither can be debugged in isolation.
Layer 1: Orchestration
The orchestrator is the conductor, not a performer. Its only job is to route: given the current state and user input, which agent runs next, with what context, and who reviews the result?
Orchestrator Responsibilities
- Accept structured user input at each decision point
- Choose agent type based on the task class
- Decide whether steps run sequentially or in parallel
- Insert human-in-the-loop gates before irreversible actions
- Decide when to loop (retry, refine, escalate)
- Decide when to halt
Role-Based Routing
gstack formalizes what most effective teams do informally: different decisions belong to different roles. The same principle applies to agents.
| Decision Type | Route To | Why |
|---|---|---|
| Product scope, priority | Product/CEO agent | Avoids over-engineering at the execution layer |
| Architecture, interfaces | Engineering Manager agent | Separates design from implementation concerns |
| UX, component structure | Designer agent | Keeps visual decisions out of backend prompts |
| Implementation | Execution agent | Single-objective, narrow context, fast |
| Correctness, edge cases | QA agent | Fresh context; no sunk-cost bias from implementation |
| Security, injection, auth | Security agent | Adversarial lens requires explicit framing |
| Merge, deploy, release | Release agent | Separate concern; human gate before irreversible push |
The orchestrator does not implement any of these roles. It knows which role applies and routes accordingly.
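The routing table above can be sketched as a plain lookup. Everything here (the `Role` enum, `TASK_ROUTES`, `route`) is illustrative, not gstack's actual API:

```python
from enum import Enum

class Role(Enum):
    PRODUCT = "product"
    ENG_MANAGER = "eng-manager"
    DESIGNER = "designer"
    EXECUTION = "execution"
    QA = "qa"
    SECURITY = "security"
    RELEASE = "release"

# Task classes map to roles; the orchestrator dispatches, never implements.
TASK_ROUTES = {
    "scope": Role.PRODUCT,
    "architecture": Role.ENG_MANAGER,
    "ux": Role.DESIGNER,
    "implement": Role.EXECUTION,
    "edge-cases": Role.QA,
    "auth-review": Role.SECURITY,
    "deploy": Role.RELEASE,
}

def route(task_class: str) -> Role:
    """Return the role responsible for a task class; unknown classes escalate."""
    try:
        return TASK_ROUTES[task_class]
    except KeyError:
        raise ValueError(f"No route for task class {task_class!r}; escalate to human")
```

The point of making the table explicit in code is that an unknown task class fails loudly (escalation) instead of defaulting to the execution agent.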
Layer 2: Context Stability
Long chains of agents fail from context drift. The specification decided in step 2 is forgotten by step 8. The GSD pattern addresses this by treating the spec as a first-class artifact, not a conversation thread.
What Context Stability Requires
- A written, versioned spec that agents load, not reconstruct from history
- Explicit update steps when the spec changes (not implicit drift)
- A human gate before the spec changes mid-chain
- Agents that confirm spec alignment before proceeding
```
# Context handoff pattern
# At each agent boundary, pass the spec explicitly:

SPEC: See SPEC.md at commit abc123
TASK: Implement the authentication module as defined in section 3.2
CONSTRAINTS: Do not modify the user model schema
OUTPUT: PR ready for QA agent review

# The spec is not in the prompt. It is referenced by the prompt.
```
This separates context (stable, versioned) from instructions (per-task, ephemeral). Token cost drops. Drift disappears. Disputes resolve against the written spec, not conversation history.
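A minimal sketch of building that handoff, assuming a simple helper (`handoff_prompt` is hypothetical, not part of GSD):

```python
def handoff_prompt(spec_path: str, spec_version: str, task: str,
                   constraints: str, output: str) -> str:
    """Build a per-task prompt that references the versioned spec
    instead of inlining its full text."""
    return "\n".join([
        f"SPEC: See {spec_path} at commit {spec_version}",
        f"TASK: {task}",
        f"CONSTRAINTS: {constraints}",
        f"OUTPUT: {output}",
    ])
```

Because the spec travels as a path plus version, two agents that disagree are provably reading the same document, and the prompt stays a few lines long regardless of spec size.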
Layer 3: Execution
Execution agents are narrow by design. They receive a single objective, minimal context, and a verifiable exit condition. Width is the orchestrator's job. Depth is the execution agent's job.
```
# Good execution prompt
ROLE: QA agent
CONTEXT: See SPEC.md §3.2, auth module PR #47
TASK: Find edge cases not covered by the current test suite
OUTPUT: Numbered list of uncovered cases with reproduction steps
HALT: When list is complete or you have checked all spec assertions

# Bad execution prompt (orchestration collapsed into execution)
You are a full-stack engineer. Review the spec, implement auth,
write tests, check security, prepare the PR, and make sure
it matches the design. Be thorough.
```
Human-in-the-Loop
Human gates are not an apology for agent unreliability. They are the architecture. The workflow is designed around them.
Where to Insert Gates
| Step | Gate Type | Question to the Human |
|---|---|---|
| Before spec is finalized | Approval | Does this spec match your intent? |
| After architecture decision | Approval | Does this design fit constraints we haven't told the agent about? |
| After QA report | Triage | Which findings are blockers vs. accepted risk? |
| Before any push/deploy | Hard gate | Explicit approval; no default proceed |
| When agent signals uncertainty | Escalation | Agent surfaces ambiguity; human resolves it |
Gates must be explicit in the workflow definition. An implicit assumption that the human will "just notice" when to intervene is an absence of architecture, not a gate.
Designing for Interruption
A workflow that cannot be interrupted mid-chain is fragile. Every long chain should support:
- Checkpoint saves: State is written to disk at each gate so the chain can resume
- Step-back: Human can reject a step and re-run from the previous checkpoint
- Override: Human can inject context or change direction at any gate
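A checkpoint layer supporting all three can be sketched with stdlib Python; the file layout and field names are assumptions for illustration:

```python
import json
from pathlib import Path

def save_checkpoint(dir_: Path, step_id: str, state: dict) -> Path:
    """Persist workflow state at a gate; one file per step, so the
    chain can resume, step back, or accept an override from here."""
    dir_.mkdir(parents=True, exist_ok=True)
    path = dir_ / f"{step_id}.json"
    path.write_text(json.dumps({"step": step_id, "state": state}))
    return path

def resume_from(dir_: Path, step_id: str) -> dict:
    """Reload the state saved at a gate to re-run from that checkpoint."""
    return json.loads((dir_ / f"{step_id}.json").read_text())["state"]
```

Step-back is then just `resume_from` on the previous step's ID; override is editing the saved state before resuming.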
Parallelism
Independent tasks should not run sequentially. The constraint is dependency, not caution.
When to Parallelize
```
# Sequential (correct: B depends on A's output)
A: Finalize spec
B: Implement auth module per spec

# Parallel (correct: no dependency between B and C)
A: Finalize spec
B: Implement auth module per spec   ← launch together
C: Write E2E test scaffold per spec ← launch together
D: Security review of spec          ← launch together
E: Merge B+C+D results, resolve conflicts
```
Parallelism Boundaries
| Safe to Parallelize | Must Be Sequential |
|---|---|
| Independent feature branches | Spec finalization → implementation |
| QA + security review of same PR | Implementation → QA |
| Multiple execution agents on different modules | Architecture → any implementation |
| Competing design proposals | Human gate → next phase |
| Background context refresh | Merge + conflict resolution |
Parallelism multiplies throughput only when the merge step is cheap. If parallel outputs require substantial reconciliation, the cost is hidden, not eliminated. Design merge steps explicitly; they are not free.
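The launch-together, merge-explicitly shape can be sketched with stdlib concurrency. `run_agent` is a stand-in for a real agent invocation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(step_id: str) -> str:
    # Stand-in: a real system would invoke the agent for this step.
    return f"{step_id}:done"

def run_parallel(step_ids: list[str]) -> dict[str, str]:
    """Launch independent steps concurrently; the caller owns the merge."""
    with ThreadPoolExecutor(max_workers=len(step_ids)) as pool:
        results = pool.map(run_agent, step_ids)
    return dict(zip(step_ids, results))

def merge(outputs: dict[str, str]) -> str:
    """Explicit merge node: reconciliation happens here, not implicitly."""
    return " | ".join(outputs[k] for k in sorted(outputs))
```

The design point is that `merge` is a named function, not an emergent property of whichever branch finishes last.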
Loops
Loops are the mechanism for refinement. They require halting conditions, not just goals.
Loop Anatomy
```
LOOP:
  INPUT: Current state + failure signal
  TASK: Fix one thing
  VERIFY: Run oracle (tests, lint, typecheck)
  HALT: Oracle passes OR loop count exceeds N
  ON HALT EXCEEDED: Escalate to human, do not auto-proceed
```
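The same anatomy in executable form, as a sketch; `fix_one_thing` and `oracle` are caller-supplied stand-ins:

```python
def bounded_loop(state, fix_one_thing, oracle, max_iters: int = 3):
    """Run fix -> verify until the oracle passes or the budget is spent.
    Exceeding the budget escalates; it never auto-proceeds."""
    for _ in range(max_iters):
        if oracle(state):
            return state  # machine-verifiable halting condition met
        state = fix_one_thing(state)
    if oracle(state):
        return state
    raise RuntimeError("Loop budget exceeded: escalate to human")
```

The raised error is the escalation path: the orchestrator catches it and inserts a human gate rather than retrying.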
Loop Types
| Loop Type | Trigger | Halting Condition |
|---|---|---|
| Fix-CI loop | Test/lint failure | All checks pass |
| Review loop | QA or human feedback | All blockers addressed |
| Refinement loop | Output quality below rubric threshold | Score exceeds threshold or max iterations |
| Exploration loop | Unknown solution space | N candidates generated; human selects |
| Context-refresh loop | Spec version mismatch | Agent confirms spec alignment |
A loop without a halting condition is a runaway. A loop whose halting condition is the agent declaring itself "done" never halts on time. Halting conditions must be machine-verifiable.
Composing Many Steps
A 30-step workflow is not a 30-prompt workflow. Most prompts are small. The complexity is in the graph, not the nodes.
Step Graph Properties
- Acyclic by default: Loops are explicit subgraphs, not accidental cycles
- Typed edges: Each edge carries a type (sequential, parallel, gate, loop-back)
- Named steps: Steps have IDs. Checkpoints reference IDs. Humans refer to steps by name, not by memory
- Explicit merge nodes: Parallel branches always converge at a named merge step
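These properties can be sketched as a small data structure. The edge types mirror the list above; the class and method names are illustrative:

```python
from dataclasses import dataclass, field

EDGE_TYPES = {"sequential", "parallel", "gate", "loop-back"}

@dataclass
class StepGraph:
    # Each edge is (src_step_id, dst_step_id, edge_type).
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_edge(self, src: str, dst: str, kind: str) -> None:
        if kind not in EDGE_TYPES:
            raise ValueError(f"Unknown edge type: {kind}")
        self.edges.append((src, dst, kind))

    def is_acyclic(self) -> bool:
        """Loops are explicit loop-back edges; everything else must be a DAG."""
        dag = [(s, d) for s, d, k in self.edges if k != "loop-back"]
        nodes = {n for e in dag for n in e}
        visited, stack = set(), set()
        def visit(n):
            if n in stack:
                return False  # accidental cycle found
            if n in visited:
                return True
            stack.add(n)
            ok = all(visit(d) for s, d in dag if s == n)
            stack.discard(n)
            visited.add(n)
            return ok
        return all(visit(n) for n in nodes)
```

Excluding loop-back edges from the cycle check is the "acyclic by default" property: refinement loops are declared, never accidental.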
Workflow Definition Pattern
```yaml
# Minimal workflow definition
workflow: auth-feature
spec: specs/auth-v2.md
steps:
  - id: scope
    agent: product
    input: user_request
    gate: human_approval
  - id: design
    agent: eng-manager
    input: scope.output
    gate: human_approval
  - id: implement
    agent: execution
    parallel:
      - id: impl-backend
        input: design.output
      - id: impl-tests
        input: design.output
      - id: security-review
        input: design.output
  - id: merge
    agent: eng-manager
    input: [impl-backend.output, impl-tests.output, security-review.output]
  - id: qa
    agent: qa
    input: merge.output
    loop:
      on: qa_findings
      until: no_blockers
      max: 3
  - id: release
    agent: release
    input: qa.output
    gate: human_approval  # hard gate; no default proceed
```
This is not a prompt. It is a schema. The prompts are inside the agent definitions, kept separate from the workflow graph. When a step fails, you debug the step definition, not the entire chain.
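One concrete benefit of a schema over a prompt is that it can be checked mechanically before anything runs. A hypothetical validator, assuming the field names from the example definition (this is not a published format):

```python
def validate_gates(steps: list[dict],
                   irreversible: frozenset = frozenset({"release"})) -> list[str]:
    """Return a list of violations; empty means the workflow passes.
    Every step run by an irreversible agent must carry a hard human gate."""
    problems = []
    for step in steps:
        if step.get("agent") in irreversible and step.get("gate") != "human_approval":
            problems.append(f"step {step['id']!r} lacks a hard human gate")
    return problems
```

Run at load time, a check like this turns a missing gate from a production incident into a schema error.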
Token Economics at Scale
Long workflows amplify token decisions made early. A 600-token context file loaded at every step of a 30-step workflow is 18,000 tokens spent on generic context. Task-specific context passed only to the relevant step costs a fraction of that.
Rules of thumb:
- Pass the spec by reference (path + version), not by value (full text), except at the context layer
- Execution agents get the minimum context required for their single task
- Orchestration agents get workflow state, not file contents
- Human gates are the correct place to surface summaries, not inside agent prompts
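The arithmetic behind the opening figure, with the roughly 20-token size of a path-plus-version reference as an assumption:

```python
# Cost of loading generic context at every step vs. passing a reference.
STEPS = 30          # steps in the workflow
FULL_SPEC = 600     # tokens when the spec is inlined by value
REFERENCE = 20      # tokens for a path + version pointer (assumed size)

by_value = STEPS * FULL_SPEC      # 18,000 tokens of repeated generic context
by_reference = STEPS * REFERENCE  # 600 tokens of pointers across the whole chain
```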
Anti-Patterns
The Monolith Prompt
A single prompt that asks an agent to plan, implement, review, and ship. All three layers collapsed into one. When it fails (and it will), there is nowhere to debug.
Implicit Sequencing
Running steps in order without documenting why. When a step needs to move or be parallelized, the dependency is unknown. The sequence breaks silently.
Unbounded Loops
Loops without maximum iterations or without escalation on failure. The agent retries indefinitely. The human discovers it hours later.
Framework Stacking Without Layer Separation
Running gstack, GSD, and Skills from a single prompt collapses all three layers into one context. The frameworks are complements because they operate at different layers; running them in parallel from one context eliminates the benefit of any of them.
Parallelism Without Merge Design
Launching parallel agents without planning how their outputs reconcile. Merge conflicts in parallel agent output are harder to resolve than sequential conflicts because neither agent knows about the other's decisions.
Missing Human Gates
Automating past a decision point that requires human judgment. The workflow moves fast and lands in the wrong place. The issue is irreversibility, not speed.
The Compounding Effect
Garry Tan's reported output (10,000 lines of code and 100 pull requests per week over 50 days) is the compounding product of clean layer separation applied consistently, not any single tool or prompt.
Each layer running at its level of abstraction means:
- Orchestration is never re-litigating implementation details
- Execution is never making architectural decisions
- Context is never reconstructed from memory
- Humans are never reviewing work that hasn't passed its own layer's gate
The workflow does not scale because it runs faster. It scales because it fails locally. Failures in execution do not corrupt orchestration. Failures in orchestration do not corrupt the spec. Each layer's failure mode is contained to that layer.
Related Workflows
- CI Automation: loop patterns, halting conditions, and CI integration
- Reward Rubric DSL: machine-verifiable halting conditions for refinement loops
- Prompt Patterns: single-objective execution prompt structure
- Agent Psychology: how agents reason within a step; why narrow context wins
- Enterprise Agent Design: production-grade agent architecture patterns