Layer 1: Topology / Control Flow

Prompt Chaining

Also known as: Pipeline, Sequential Prompting, LLM Pipeline

Decomposes a task into a fixed chain of LLM calls, each consuming the previous output.

The simplest agentic topology: a fixed sequence of LLM calls authored by a human at design time, each stage feeding the next through a schema gate that validates the handoff or retries on failure — no planner, no fan-out.

Claude Code

  • Define each pipeline stage as a named step in `CLAUDE.md` so the agent follows the same decomposition every run.
  • Use a hook to validate structured output between stages; exit with code 2 to block the handoff and return the error to the agent so it can retry the failing stage.
  • Limit the tools each stage can call with the allow-list in `settings.json`, so the agent cannot route around a gate under time pressure.
  • Keep each stage in a separate subagent when stages need clean context isolation from one another.

Primitives

  • CLAUDE.md stage definitions
  • PreToolUse hooks (inter-stage gates)
  • Task tool (per-stage subagents)
  • settings.json tool allow-list

Cursor

  • Add a .cursor/rules/*.mdc file listing each pipeline stage and its acceptance criteria.
  • Start in Plan mode and review the generated step list before Agent mode begins execution.
  • Reference the output schema of each stage with @file so the next stage prompt sees the contract.
  • Set alwaysApply: false on stage rules and trigger them via @rule-name only when that stage is active.

Primitives

  • .cursor/rules/*.mdc (stage rules)
  • Plan mode (upfront step review)
  • @file references
  • Agent mode

Decision

Use when ✓

  • Use this when the task decomposes cleanly into 2–5 fixed stages whose order is known up front (extract, normalize, draft, review).
  • Each stage should produce a parseable artifact (JSON object, list, scored shortlist) so a deterministic gate can validate the handoff between calls.
  • A good fit when accuracy outranks latency and a single monolithic prompt has started to drop instructions or hallucinate intermediate state.
  • Reach for it as the first agentic shape on a new problem. It ships in a day, runs deterministically, and exposes which stage is the real source of error before any planner is introduced.

Avoid when ✗

  • When the path through the work is data-dependent and changes per request, a fixed chain forces a wrong shape on the task. Route or plan instead.
  • Without inter-stage validation, a chain becomes a longer monolith: each stage launders the previous stage's errors into the next, and end-to-end accuracy falls below the single-prompt baseline.
  • When stages are mutually independent, sequencing them wastes wall-clock time the parallelization pattern would recover.
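
The accuracy argument above can be made concrete with a back-of-envelope calculation. The sketch below is an illustration, not a measurement. It assumes that stage failures are independent, that a gate catches every failure, and that each retry is a fresh, independent attempt. The function names and the 0.9 per-stage success rate are hypothetical.

```typescript
// Back-of-envelope model (an assumption, not a benchmark): stage failures are
// independent, a gate detects every failure, and each retry is independent.

// Without gates, one bad stage poisons the result, so the chain succeeds only
// if every stage succeeds on its single attempt.
function ungatedChainSuccess(stageSuccess: number[]): number {
  return stageSuccess.reduce((acc, p) => acc * p, 1)
}

// With a perfect gate and up to `attempts` tries per stage, a stage fails only
// if all of its attempts fail.
function gatedChainSuccess(stageSuccess: number[], attempts: number): number {
  return stageSuccess.reduce((acc, p) => acc * (1 - Math.pow(1 - p, attempts)), 1)
}

const stages = [0.9, 0.9, 0.9] // hypothetical per-stage success rates
console.log(ungatedChainSuccess(stages).toFixed(3)) // 0.729
console.log(gatedChainSuccess(stages, 3).toFixed(3)) // 0.997
```

Real gates miss some failures and retries against the same prompt tend to be correlated, so the gated figure is best read as an upper bound under these assumptions.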

In the wild

  • github.com: Anthropic's public agent cookbook ships a runnable prompt-chain workflow that drafts marketing copy, gates the result with a programmatic check, then translates it. The decomposition matches the essay exactly.
  • docs.langchain.com: LangGraph documents prompt chaining as one of the canonical workflow shapes, modelled as a small graph of typed nodes whose edges carry the validated artifact between stages.
  • openai.com: OpenAI's Deep Research product runs a multi-stage pipeline that plans sub-questions, retrieves and reads sources for each, then synthesises a report. The chain is long-running, and its stages are visible to the user as a streaming progress trace.

Reader gotcha

Anthropic's essay names the failure mode bluntly: the trade is latency for accuracy, and the win evaporates when the chain has no programmatic gate between stages. Malformed JSON or an off-topic draft at stage one rides through, and the chain returns a confidently wrong final artifact. The gate is the pattern; the chain without it is just a longer prompt.

Implementation sketch

import { generateText, generateObject } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const Outline = z.object({ topic: z.string(), sections: z.array(z.string()).min(3) })

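async function draftReport(brief: string): Promise<string> {
  // Stage 1: produce a structured outline; the result is validated against the Outline schema.
  const { object: outline } = await generateObject({
    model: openai('gpt-4o-mini'),
    schema: Outline,
    prompt: `Produce an outline for: ${brief}`,
  })
  // Gate: the schema already requires three sections; this explicit check marks
  // where further deterministic rules would go. Fail closed if it does not hold.
  if (outline.sections.length < 3) throw new Error('outline gate: too few sections')
  // Stage 2: draft from the validated outline fields, not from free text.
  const { text: draft } = await generateText({
    model: openai('gpt-4o'),
    prompt: `Write the report. Topic: ${outline.topic}. Sections: ${outline.sections.join(', ')}`,
  })
  // Stage 3: polish the draft for tone and clarity.
  const { text: polished } = await generateText({
    model: openai('gpt-4o-mini'),
    prompt: `Edit for tone and clarity. Return only the edited text.\n\n${draft}`,
  })
  return polished
}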

export {}
First-party TS SDK
  • LangChain
  • LangGraph
  • Vercel AI SDK
  • Mastra

References

  1. Wu et al.·2022·CHI 2022 · DOI: 10.1145/3491102.3517582

    foundational paper — coined "AI chains" and introduced the prompt-chain idiom

  2. Anthropic·2024

    names prompt chaining as the first workflow to reach for, with the latency-for-accuracy trade and the gate discipline

  3. Khattab et al.·2022·arXiv · DOI: 10.48550/arXiv.2212.14024

    shows the chain-of-prompts shape for retrieval-augmented question answering — predecessor to DSPy

  4. Antonio Gulli·2026·Springer·pp. 112
  5. LangChain team·2026·accessed
  6. Anthropic·2024·accessed

Prompt chaining decomposes a task into a fixed sequence of LLM calls in which each step's output becomes the next step's input. The decomposition is editorial: a human picks the seams, names the intermediate artifacts, and writes one prompt per stage so each prompt is small enough to be reliable on its own. Between stages the program holds the artifact in plain memory, and often runs a deterministic gate (a schema validator, a regex, a length check) that decides whether to advance, retry, or fail closed.
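
The advance, retry, or fail-closed decision described above can be sketched as a small wrapper around a single stage. This is one possible shape, not a reference implementation: `runGated`, `Stage`, `Gate`, `minLength`, and the default of three attempts are all hypothetical names and values, and the stage is injected as a plain async function so the sketch runs without a model.

```typescript
// A stage is any async function from one artifact to the next. In a real chain
// it would wrap an LLM call; here it is injected so the sketch runs offline.
type Stage<In, Out> = (input: In) => Promise<Out>

// A gate is a deterministic check: null means advance, a string is the reason to reject.
type Gate<Out> = (output: Out) => string | null

// Run one stage behind its gate: advance on a pass, retry on a rejection,
// and fail closed (throw) once the attempt budget is spent.
async function runGated<In, Out>(
  name: string,
  stage: Stage<In, Out>,
  gate: Gate<Out>,
  input: In,
  maxAttempts = 3, // hypothetical default
): Promise<Out> {
  let lastReason = 'no attempts made'
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await stage(input)
    const reason = gate(output)
    if (reason === null) return output // advance
    lastReason = reason // retry with the same input
  }
  throw new Error(`${name} gate failed after ${maxAttempts} attempts: ${lastReason}`)
}

// Example gate: a length check, one of the gate kinds named above.
const minLength = (n: number): Gate<string> => (s) =>
  s.length >= n ? null : `expected at least ${n} characters, got ${s.length}`
```

A caller would chain stages by awaiting `runGated` once per stage and passing each returned artifact as the next stage's input. Throwing on exhaustion means a bad artifact can never reach a later stage silently.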

Background · context and trade-offs

The pattern is the first thing the Anthropic taxonomy reaches for when a task can be split cleanly. The trade is latency for accuracy: two or three smaller prompts produce more dependable answers than one heroic prompt that has to plan, retrieve, draft, and proofread inside one inference. Structured output earns the chain its reliability: each stage emits JSON or some other parseable shape so the next stage receives an object, not a paragraph, and the seams stay machine-readable. Without the gates, errors at stage one quietly ride through the rest of the chain.
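
To make the "object, not a paragraph" handoff concrete, here is a minimal sketch of a gate that turns raw model text into a typed artifact before the next prompt is built. The `Outline` shape mirrors the one in the implementation sketch; `GateResult`, `parseOutline`, and `draftPrompt` are hypothetical names, and the hand-rolled checks stand in for a schema library only to keep the example dependency-free.

```typescript
// Artifact passed from an outline stage to a draft stage.
interface Outline {
  topic: string
  sections: string[]
}

type GateResult<T> = { ok: true; value: T } | { ok: false; reason: string }

// Deterministic gate: turn raw model text into a typed Outline or reject it.
function parseOutline(raw: string): GateResult<Outline> {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch {
    return { ok: false, reason: 'not valid JSON' }
  }
  if (typeof data !== 'object' || data === null) {
    return { ok: false, reason: 'expected a JSON object' }
  }
  const { topic, sections } = data as Record<string, unknown>
  if (typeof topic !== 'string' || topic.trim() === '') {
    return { ok: false, reason: 'missing topic' }
  }
  if (!Array.isArray(sections) || !sections.every((s) => typeof s === 'string')) {
    return { ok: false, reason: 'sections must be an array of strings' }
  }
  if (sections.length < 3) {
    return { ok: false, reason: 'too few sections' }
  }
  return { ok: true, value: { topic, sections: sections as string[] } }
}

// The next stage's prompt is built from validated fields, not from free text.
function draftPrompt(outline: Outline): string {
  return `Write the report. Topic: ${outline.topic}. Sections: ${outline.sections.join(', ')}`
}
```

Returning a reason alongside the rejection keeps the seam machine-readable in both directions: the caller can log it, or feed it back into a retry prompt.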

Prompt chaining sits underneath the more elaborate topologies. Routing picks a chain at runtime; parallelization fans out independent stages; orchestrator-workers and ReAct add a planner that the chain itself does not have. The chain is fixed at author time, which is why it ships the soonest and breaks the latest. The cost is editorial debt: when the task changes shape, the seams have to be redrawn by hand, because no part of the chain decides for itself whether the next call is the right one.
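
The relationship between the chain and its neighbouring topologies can be sketched in a few lines. This is a simplified illustration under stated assumptions: stages are stubbed as string transforms instead of LLM calls, and `runChain`, `route`, `fanOut`, `tag`, and the two example chains are hypothetical names.

```typescript
// A stage maps one artifact to the next; real stages would wrap LLM calls.
type Stage = (input: string) => Promise<string>

// A chain is an ordered list fixed at author time: nothing inside it
// decides which stage comes next.
async function runChain(chain: Stage[], input: string): Promise<string> {
  let artifact = input
  for (const stage of chain) artifact = await stage(artifact)
  return artifact
}

// Stub stages so the sketch runs without a model.
const tag = (label: string): Stage => async (s) => `${s} -> ${label}`
const summaryChain: Stage[] = [tag('extract'), tag('summarize')]
const reportChain: Stage[] = [tag('extract'), tag('normalize'), tag('draft'), tag('review')]

// Routing sits one level up: it picks which fixed chain to run for a request.
function route(request: string): Stage[] {
  return request.startsWith('summarize:') ? summaryChain : reportChain
}

// Parallelization, by contrast, runs independent stages on the same input at once.
async function fanOut(stages: Stage[], input: string): Promise<string[]> {
  return Promise.all(stages.map((stage) => stage(input)))
}
```

Changing what the chain does means editing `summaryChain` or `reportChain` by hand, which is the editorial debt the paragraph above describes.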