Orchestrator-Workers
Also known as: Manager-Workers, Hierarchical Multi-Agent, Coordinator-Specialist
Has a central orchestrator dispatch sub-tasks to fresh workers and merge their outputs.
Claude Code
- Keep the orchestrator session as a planner only — it issues `Task()` calls and reads outputs; it never edits files.
- Dispatch all workers in a single assistant message so they run concurrently; serial dispatch multiplies wall-clock by N.
- Write each worker's output to a deterministic path on disk; the orchestrator reads those files for aggregation.
- Encode the decomposition policy in a SKILL.md loaded before dispatch so the planner prompt is versioned and reviewable.
Primitives
Related patterns
Cursor
- Use Plan mode to draft the worker decomposition before any code changes; review the step list before execution.
- Launch one Cloud Agent per worker task — each runs in an isolated VM on its own branch.
- Set shared invariants in a
.cursor/rules/*.mdcfile withalwaysApply: trueso every worker context loads the same constraints. - Aggregate by opening an Agent chat that references all worker branches via
@branchafter workers complete.
Primitives
Related patterns
Decision
| Use when ✓ | Avoid when ✗ |
|---|---|
| +Use this when the request decomposes into sub-tasks the planner can only enumerate after reading the input (multi-file code edits, deep research over an open question set, document pipelines that branch by content type). | −When the sub-tasks are known in advance and identical, Parallelization is the simpler shape and avoids paying for the orchestrator turn on every request. |
| +Justified where each sub-task benefits from a clean context (its own prompt, tool subset, and scratch space) so workers do not pollute each other or the orchestrator. | −Without observable per-worker outcomes, partial failures vanish into the aggregated response and the orchestrator silently presents broken work as finished. |
| +A good fit when the orchestrator can be a stronger model than the workers and the cost is dominated by worker turns. Anthropic's research system pairs an Opus orchestrator with Sonnet workers for that reason. | −When workers need to coordinate mid-task or react to each other's findings, the tree topology forces every exchange through the orchestrator and an a2a-style peer protocol or debate is a better fit. |
| +Best when the aggregation step has clear merge semantics (concatenation by section, voting, schema-typed merge) so the orchestrator can write an aggregator the workers all target. |
In the wild
| Source | Claim |
|---|---|
| anthropic.com → | Anthropic documents Claude's Research feature as an orchestrator-workers system: a lead Claude agent plans the search, spawns subagents that explore branches in parallel with their own context windows, and a final agent synthesises the report. The wiring matches the hub-and-spoke shape this pattern names. |
| docs.crewai.com → | CrewAI ships a hierarchical Process in which a manager agent (generated by the framework or supplied by the developer) assigns tasks to specialist agents and reviews their outputs before composing the final result. |
| github.com → | MetaGPT encodes Standard Operating Procedures as a hierarchy of specialised role agents (Product Manager, Architect, Engineer, QA) coordinated by an upstream planner that routes artifacts between them, demonstrating the pattern on a software-company workflow. |
Reader gotcha
Anthropic reports their multi-agent research system uses about 15× the tokens of a single Claude chat: the planner pays once, every spawned worker pays the prompt overhead again, and the aggregator pays a third time over the merged context. The pattern only earns the cost when the task is parallelisable enough that wall-clock and quality wins outweigh the multiplier. source
Implementation sketch
import { generateObject, generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'
const Plan = z.object({
workers: z.array(z.object({ role: z.string(), brief: z.string() })).min(1).max(6),
})
export async function orchestrate(task: string): Promise<string> {
const { object: plan } = await generateObject({
model: openai('gpt-4o'),
schema: Plan,
prompt: `Decompose into 1-6 worker briefs.\nTask: ${task}`,
})
const outputs = await Promise.all(
plan.workers.map(({ role, brief }) =>
generateText({
model: openai('gpt-4o-mini'),
system: `You are the ${role} worker. Return only your slice.`,
prompt: brief,
}).then((r) => `[${role}] ${r.text}`),
),
)
const { text } = await generateText({
model: openai('gpt-4o'),
prompt: `Task: ${task}\nWorker outputs:\n${outputs.join('\n\n')}\nMerge into one coherent response.`,
})
return text
}
export {}
- LangGraph
- CrewAI
- AutoGen
- Vercel AI SDK
References
Orchestrator-Workers structures a multi-agent run as a hub with spokes. A central orchestrator inspects the incoming request, decides at runtime which sub-tasks the work decomposes into, and dispatches each one to a fresh worker agent with its own prompt, tool surface, and context window. When the workers return, the orchestrator aggregates their outputs into the response the caller sees. The decomposition is dynamic: the orchestrator chooses how many workers to spawn and what each one is asked to do based on the request, not by reading off a fixed pipeline. Workers do not talk to each other; they talk only to the orchestrator, and the topology stays a tree.
Background · context and trade-offs
The pattern sits between two simpler shapes that resemble it. Parallelization fans the same prompt across a fixed number of workers and votes; the decomposition is decided at design time, not by an LLM at runtime. Planning splits a single agent into a planner and an executor, but the executor walks one step list in one process; there is no second model call per step in a separate context. Orchestrator-Workers earns its name when both conditions hold: the sub-tasks are not enumerable up front, and each one benefits from running in isolation. AutoGen and MetaGPT formalise the role split; Anthropic frames the workflow as the right answer when "you can't predict the subtasks."
The hard part is the orchestrator, not the workers. A planner that under-decomposes hands a single worker the whole job and adds latency for nothing; one that over-decomposes shatters context the workers needed and pays N times for the same prompt overhead. The aggregation step is where partial worker failures and disagreements surface, and a naive orchestrator that concatenates worker outputs verbatim leaks duplicate sentences and contradicts itself. Production deployments instrument worker count, fan-out latency, and aggregation conflicts, and treat the orchestrator prompt as the critical path. Every regression there multiplies through the workers below.