Layer 1: Topology / Control Flow

Planning

Also known as: Plan-and-Execute, Decompose-then-Act

Drafts a multi-step plan, executes it, and revises the tail when a step fails.

Where ReAct improvises at every turn, Planning commits upfront: a planner generates a written step list before execution begins, and only the remaining tail is rewritten when a step fails — completed work stays intact.

Claude Code

  • Write the step list to a plan.md file before executing any steps — the plan is a disk artifact a reviewer can inspect.
  • Execute each step as a separate subagent dispatch; pass only the current step brief and prior observations, not the full plan.
  • Assert step success before advancing — a PreToolUse hook or a structured tool result gates the next dispatch.
  • Rewrite the tail of plan.md when a step fails; preserve completed steps so the resume starts from the last successful checkpoint.

Primitives

  • Disk plan.md (inspectable step list)
  • Task tool (per-step executor)
  • PreToolUse hooks (step gate)
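
The "rewrite the tail, preserve completed steps" rule above can be sketched as plain string handling. This is a minimal illustration, not Claude Code's own format: the numbered-checkbox layout and the names `PlanStep`, `renderPlan`, `parsePlan`, and `rewriteTail` are assumptions made for the example, and reading or writing the actual plan.md file is left to the caller.

```typescript
// Hypothetical plan.md layout: one numbered markdown checkbox per step.
type StepStatus = 'done' | 'pending'
interface PlanStep { text: string; status: StepStatus }

// Render the step list as the text that would be written to plan.md.
function renderPlan(steps: PlanStep[]): string {
  return steps
    .map((s, i) => `${i + 1}. [${s.status === 'done' ? 'x' : ' '}] ${s.text}`)
    .join('\n')
}

// Parse plan.md text back into steps; lines that are not checkboxes are ignored.
function parsePlan(markdown: string): PlanStep[] {
  const steps: PlanStep[] = []
  for (const line of markdown.split('\n')) {
    const m = /^\d+\.\s\[( |x)\]\s(.+)$/.exec(line.trim())
    if (m) steps.push({ text: m[2], status: m[1] === 'x' ? 'done' : 'pending' })
  }
  return steps
}

// Keep every completed step, drop the stale pending tail, append the revised tail.
function rewriteTail(steps: PlanStep[], revisedTail: string[]): PlanStep[] {
  const completed = steps.filter(s => s.status === 'done')
  const tail = revisedTail.map(text => ({ text, status: 'pending' as const }))
  return [...completed, ...tail]
}
```

Because completed steps are copied through untouched, a resumed run can skip straight to the first unchecked line.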

Cursor

  • Use Plan mode to generate the step list; review and edit it before execution begins — the plan is the contract.
  • Execute one step at a time in Agent mode; review each diff before accepting and moving to the next step.
  • Write step outputs to named files; reference them via @file in the next step's prompt to carry context forward without relying on chat history.
  • Use cloud agents for long-running plans; the branch-and-PR shape gives you a review gate at each phase boundary.

Primitives

  • Plan mode (step list generation and review)
  • Agent mode (step executor)
  • @file (step output forwarding)
  • Cloud agents (long-running plans)

Decision

Use when ✓

  • Apply when the task has interdependent sub-goals whose order matters and at least one early step constrains the choices available later (research reports, multi-tool workflows, code-mod sequences).
  • Worth the cost where a wrong step is expensive enough that committing to a forward plan and reviewing it once is cheaper than ten reactive turns.
  • Useful when the executor benefits from a written contract a different process (a reviewer, a checkpoint, a human) can read between steps.
  • A good fit when tools or sub-agents are heterogeneous and the planner needs to pick which to invoke, in what order, and on what input.

Avoid when ✗

  • When the goal is single-shot or a fixed pipeline already encodes the right order, the planner step pays no rent and adds an extra LLM call to every run.
  • Without a check that can detect a bad step (tool error, unmet precondition, evaluator), the re-plan loop has nothing to fire on and the agent will execute a wrong plan to completion.
  • When step latency dominates and the goal tolerates greedy local choice, ReAct-style interleaved reasoning is usually faster and indistinguishable in quality.

In the wild

  • cognition.ai: Cognition’s Devin produces a step-by-step task plan that streams in the side panel and is rewritten as the agent learns the codebase, exposing the plan as the user-facing artifact of the run.
  • github.com: AutoGPT’s classic agent loop instructs the model to think, plan, criticise, and choose the next command on every turn, then writes the resulting plan into the prompt for the following step. It became the canonical popular-press demonstration of the planner-executor split.
  • github.com: BabyAGI maintains an explicit task list, executes the top task, and uses a task-creation step to append new tasks based on the result before reprioritising. The whole minimum viable planning loop ships in roughly a hundred lines of Python.
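
The task-list loop described for BabyAGI can be sketched in a few lines. This is an illustrative reconstruction of the loop's shape, not BabyAGI's actual code (which is Python): `Task`, `LoopDeps`, `taskListLoop`, and the `maxIterations` cap are hypothetical names, and the three injected functions stand in for the model calls.

```typescript
interface Task { id: number; text: string }

// Hypothetical dependencies; each would normally wrap an LLM or tool call.
interface LoopDeps {
  execute: (task: Task) => Promise<string>
  createTasks: (done: Task, result: string, pending: Task[]) => Promise<string[]>
  prioritise: (pending: Task[]) => Promise<Task[]>
}

async function taskListLoop(
  firstTask: string,
  deps: LoopDeps,
  maxIterations = 10, // assumed safety cap so the list cannot grow forever
): Promise<string[]> {
  let nextId = 1
  let pending: Task[] = [{ id: nextId++, text: firstTask }]
  const results: string[] = []
  for (let i = 0; i < maxIterations && pending.length > 0; i++) {
    const task = pending.shift()!                       // execute the top task
    const result = await deps.execute(task)
    results.push(`${task.text}: ${result}`)
    const created = await deps.createTasks(task, result, pending)
    for (const text of created) pending.push({ id: nextId++, text }) // append new tasks
    pending = await deps.prioritise(pending)            // reorder before the next turn
  }
  return results
}
```

Unlike the tail-rewrite loop, nothing here marks a step as failed: the plan changes only because new tasks are appended and reordered.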

Reader gotcha

Plan-and-Solve prompting (Wang et al., 2023) reports gains over chain-of-thought because the explicit plan suppresses missed-step errors, but the same paper documents that calculation errors and step-ordering mistakes survive untouched. The plan looks coherent while a single arithmetic slip propagates through every downstream step. Planning without a per-step check produces a confidently wrong execution trace.
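
A per-step check of the kind this gotcha calls for might look like the sketch below. It assumes each step can carry an independent verification function; `CheckedStep`, `runChecked`, and the invoice example are hypothetical and exist only to show a slip being caught at the step where it happens instead of flowing downstream.

```typescript
interface CheckedStep<S> {
  name: string
  run: (state: S) => S                     // produces the next state
  check: (before: S, after: S) => boolean  // independent verification of that state
}

type RunResult<S> =
  | { ok: true; state: S }
  | { ok: false; failedStep: string; state: S }

// Stop at the first step whose check fails; return the last verified state.
function runChecked<S>(steps: CheckedStep<S>[], initial: S): RunResult<S> {
  let state = initial
  for (const step of steps) {
    const next = step.run(state)
    if (!step.check(state, next)) return { ok: false, failedStep: step.name, state }
    state = next
  }
  return { ok: true, state }
}

// Amounts are integer cents to avoid floating-point noise.
interface Invoice { subtotal: number; tax: number; total: number }

const invoiceSteps: CheckedStep<Invoice>[] = [
  {
    name: 'compute tax',
    run: s => ({ ...s, tax: s.subtotal / 10 }),
    check: (_before, after) => after.tax * 10 === after.subtotal,
  },
  {
    name: 'sum total',
    run: s => ({ ...s, total: s.subtotal - s.tax }), // deliberate slip: should add
    check: (_before, after) => after.total === after.subtotal + after.tax,
  },
]

const result = runChecked(invoiceSteps, { subtotal: 2000, tax: 0, total: 0 })
```

Here the run stops at 'sum total' and the returned state still holds the verified tax value, which is the checkpoint a re-plan would resume from.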

Implementation sketch

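import { generateObject, generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

// A plan is an ordered list of 1-8 step descriptions.
const Plan = z.object({ steps: z.array(z.string()).min(1).max(8) })

// Supplied by the caller: runs one step and reports whether its check passed.
declare function execute(step: string): Promise<{ ok: boolean; observation: string }>

async function planAndExecute(goal: string, maxRevisions = 3): Promise<string> {
  // Planner call: commit to a full step list before any step runs.
  let plan = (await generateObject({
    model: openai('gpt-4o'),
    schema: Plan,
    prompt: `Goal: ${goal}\nDecompose into 3-8 ordered steps.`,
  })).object.steps
  const trace: string[] = []
  let revisions = 0
  while (plan.length > 0) {
    const step = plan.shift()!
    const { ok, observation } = await execute(step)
    // Record the outcome so the planner can tell completed steps from the failed one.
    trace.push(`[${ok ? 'ok' : 'failed'}] ${step} -> ${observation}`)
    if (ok) continue
    if (++revisions > maxRevisions) throw new Error('Re-plan budget exhausted')
    // Re-plan only the tail: the trace is kept, and the planner also sees the
    // steps that were still pending so it can reuse or reorder them.
    plan = (await generateObject({
      model: openai('gpt-4o'),
      schema: Plan,
      prompt: `Goal: ${goal}\nTrace:\n${trace.join('\n')}\nPending steps:\n${plan.join('\n')}\nThe last step failed. Rewrite the remaining steps.`,
    })).object.steps
  }
  // Final call: turn the execution trace into the answer returned to the caller.
  return (await generateText({
    model: openai('gpt-4o'),
    prompt: `Goal: ${goal}\nTrace:\n${trace.join('\n')}\nSummarise the result.`,
  })).text
}

export {}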
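
The sketch above only declares `execute`. One way to build a function with that signature is shown below, as an assumption-laden illustration: `makeExecute`, `runStep`, and `validate` are hypothetical names, with `runStep` standing in for whatever tool or sub-agent call performs the step and `validate` for the check that decides whether the loop should re-plan.

```typescript
type StepResult = { ok: boolean; observation: string }

// Hypothetical factory returning an `execute` with the signature the sketch expects.
function makeExecute(
  runStep: (step: string) => Promise<string>,   // performs the step (tool or sub-agent call)
  validate: (output: string) => string | null,  // returns a problem description, or null if fine
): (step: string) => Promise<StepResult> {
  return async function execute(step: string): Promise<StepResult> {
    try {
      const output = await runStep(step)
      const problem = validate(output)
      return problem === null
        ? { ok: true, observation: output }
        : { ok: false, observation: `check failed: ${problem}` }
    } catch (err) {
      // A thrown tool error becomes an observation the planner can read.
      const message = err instanceof Error ? err.message : String(err)
      return { ok: false, observation: `tool error: ${message}` }
    }
  }
}
```

Returning failures as data instead of throwing keeps the decision about retrying, re-planning, or aborting inside the planning loop.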
First-party TS SDK
  • LangGraph
  • CrewAI
  • Vercel AI SDK

References

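  1. Yao et al. · 2023 · Tree of Thoughts: Deliberate Problem Solving with Large Language Models · NeurIPS 2023 · DOI: 10.48550/arXiv.2305.10601

    branching planner that searches over candidate plans

  2. Shen et al. · 2023 · HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face · NeurIPS 2023 · DOI: 10.48550/arXiv.2303.17580

    planner picks specialist models from a registry per step

  3. Wang et al. · 2023 · Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models · ACL 2023 · DOI: 10.48550/arXiv.2305.04091

    collapses the planner-executor split into a single prompt

  4. Anthropic · 2024 · Building effective agents

    orchestrator-workers section frames planning as the dynamic-decomposition workflow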

  5. Antonio Gulli · 2026 · Springer · pp. 89-101
  6. LangChain team·2024·accessed
  7. CrewAI team·2024·accessed

Planning splits an agent run into two phases the literature treats as separable: a planner that decomposes a goal into ordered steps, and an executor that takes one step at a time. The planner reasons about the task end-to-end before any tool fires, so the system commits to a structure rather than improvising token by token. The executor walks the step list, calling tools or sub-agents, and the orchestration layer holds onto the plan as a contract the run can be checked against. The split exists because language-model strengths on long-horizon tasks live mostly in the upfront decomposition, not in step-wise reactive choice.

Background · context and trade-offs

The pattern only earns its name when the plan can be revised. After each step the executor compares results against the plan and routes back to the planner whenever a precondition fails, a tool returns the wrong shape, or a later step turns out to be unreachable. The planner edits the tail rather than restarting, preserving work already done and bounding the cost of a mistake. Variants branch differently: tree-of-thoughts expands candidate plans in parallel and prunes by a value heuristic; HuggingGPT picks tools eagerly from a registry; Plan-and-Solve prompting collapses the loop into a single chain-of-thought that emits the plan inline.
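
The tree-of-thoughts variant mentioned above, expanding candidate plans and pruning by a value heuristic, can be approximated with a small beam search. This is a simplified sketch, not the algorithm from the paper: `beamSearchPlans`, `expand`, and `score` are hypothetical, and in practice both callbacks would be model calls.

```typescript
type CandidatePlan = string[]

function beamSearchPlans(
  expand: (plan: CandidatePlan) => string[], // hypothetical: proposes possible next steps
  score: (plan: CandidatePlan) => number,    // hypothetical value heuristic, higher is better
  depth: number,
  beamWidth: number,
): CandidatePlan {
  let beam: CandidatePlan[] = [[]]
  for (let d = 0; d < depth; d++) {
    const candidates: CandidatePlan[] = []
    for (const plan of beam) {
      const nextSteps = expand(plan)
      // A plan with no proposed continuation is carried forward unchanged.
      if (nextSteps.length === 0) candidates.push(plan)
      for (const step of nextSteps) candidates.push([...plan, step])
    }
    // Prune: keep only the highest-scoring candidates for the next round.
    candidates.sort((a, b) => score(b) - score(a))
    beam = candidates.slice(0, Math.max(1, beamWidth))
  }
  return beam[0]
}
```

The trade-off is cost: every level multiplies planner calls by the branching factor, which is why the width of the beam is an explicit parameter.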

Planning sits next to, but is distinct from, chain-of-thought, ReAct, and orchestrator-workers. Chain-of-thought reasons within one response and never names a step list the rest of the system can read; ReAct interleaves reasoning and acting at every turn without an explicit forward plan; orchestrator-workers fans a fixed plan out to specialists. Planning insists on a written, inspectable plan that survives across LLM calls and can be re-edited. The cost is operational: someone has to decide step granularity, when a deviation justifies a re-plan rather than a retry, and how many revisions are allowed before the run aborts. Plans that cannot be checked against ground truth degrade into expensive prose.
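
The retry-versus-re-plan-versus-abort decision described above can be written down as an explicit policy. The sketch below is one possible shape under stated assumptions: the failure labels, the `Budget` fields, and the rule that only transient failures are retried are illustrative choices, not something the pattern prescribes.

```typescript
// Hypothetical failure labels; a real system would derive them from tool results.
type FailureKind = 'transient' | 'precondition' | 'unreachable'
type Action = 'retry' | 'replan' | 'abort'

interface Budget { retriesLeft: number; revisionsLeft: number }

function decide(kind: FailureKind, budget: Budget): { action: Action; budget: Budget } {
  // Transient failures (timeouts, rate limits) are retried while retries remain.
  if (kind === 'transient' && budget.retriesLeft > 0) {
    return { action: 'retry', budget: { ...budget, retriesLeft: budget.retriesLeft - 1 } }
  }
  // Structural failures, or transient ones with no retries left, go back to the planner.
  if (budget.revisionsLeft > 0) {
    return { action: 'replan', budget: { ...budget, revisionsLeft: budget.revisionsLeft - 1 } }
  }
  // Both budgets are spent: stop the run instead of looping.
  return { action: 'abort', budget }
}
```

Keeping the two budgets separate means a flaky tool cannot silently consume the revisions reserved for genuine plan errors.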