Layer 5: Methodology

12-Factor Agent

Also known as: 12-Factor Agents, 12FA

Catalogues twelve principles a team applies to push an agent from demo to production.

Twelve principles applied as a methodology: an existing agent is audited against the factors, and four refactor tracks — own the prompt and context, tools as structured outputs, own the control flow, expose lifecycle APIs — converge on a production-grade agent the team can step through and own end to end.

Claude Code

12-Factor Agent is a methodology, not a single mechanism. Each factor maps to one or more patterns in this catalog; the table below names the most direct correspondence. Apply the factors by reaching for the matching patterns in the order your agent needs them.

Primitives

  • CLAUDE.md (prompt ownership)
  • settings.json (loop configuration)
  • Task tool (subagent boundaries)
  • Disk artifacts (unified state store)

Pattern index

  • context-engineering: Factor 3 (own your context window). Explicit context assembly over framework defaults.
  • checkpointing: Factor 5 (unify state) and Factor 12 (stateless reducer). Durable state machine the reducer runs over.
  • tool-use-react: Factor 4 (tools as structured outputs). Structured tool call shape the ReAct loop emits.
  • human-in-the-loop: Factor 7 (contact humans via tool calls). Pause-and-resume pattern wired to the human approval gate.
  • guardrails: Factor 8 (own your control flow). Runtime enforcement layer the owning code can audit and update.
  • evaluation-llm-as-judge: Factor 11 (trigger from anywhere). Evaluator running as a CI step independent of the agent runtime.

Meta-rule

If a rule has a trigger ("when adding screenshots", "before committing", "during PR review"), it belongs in a skill, not in CLAUDE.md.

Cursor

  • Write the agent's prompt in a .cursor/rules/*.mdc file with alwaysApply: true — factor 2: prompt lives in the repo.
  • Use Agent mode for the executor loop; write intermediate state to disk files and read them back via @file — factor 5.
  • Use Plan mode to draft the step list before execution — factor 8 (own the loop).
  • Gate each phase on a CI check via the cloud agents' PR-based flow (factors 9 and 10).

Primitives

  • .cursor/rules/*.mdc (prompt ownership)
  • @file (unified state between steps)
  • Plan mode (explicit step list)
  • Cloud agents (CI-gated phases)

Decision

Use when ✓

  • Right call when an agent prototype has hit the 70–80 percent quality wall and the next gain requires reaching past framework abstractions into the prompt, the loop, and the state machine.
  • Justified where a team owns the agent in production (they answer the pages, write the evals, and ship the patches) and needs every layer of the runtime to be code they can read and step through.
  • A good fit when porting an agent across runtimes (web request, queue worker, cron, webhook, Slack bot) and the same business logic must trigger from anywhere without a rewrite.
  • Worth the cost when an audit, compliance review, or incident review will demand an account of exactly what the model saw, what it returned, and which deterministic code acted on the result.

Avoid when ✗

  • When the agent is a one-off prototype, an internal demo, or a research notebook whose lifetime is shorter than the cost of dismantling a framework abstraction.
  • Without a team that owns the runtime end to end, the methodology produces a half-refactored codebase whose framework defaults and hand-rolled pieces fight each other in production.
  • When the bottleneck is base-model capability rather than engineering: owning your prompts will not rescue an agent whose underlying reasoning cannot solve the task at any prompt.

In the wild

  • github.com: HumanLayer publishes the methodology as an open guide with twelve linked factor essays, an AI Engineer World's Fair conference talk, and a public Discord. It is the canonical reference and the only widely cited treatment of 12-factor for agents.
  • github.com: The got-agents/agents repository ships deploybot-ts and linear-assistant-ts as worked examples that embody the methodology: small, focused TypeScript agents whose loop, prompts, and state are owned by the application code rather than a framework.
  • github.com: HumanLayer's own kubechain runtime applies the principles to distributed agents on Kubernetes, with explicit launch/pause/resume APIs and a stateless reducer model so workers can be evicted and rescheduled without losing run state.

Reader gotcha

Factor 12's stateless reducer only holds if the reducer is actually pure: a stray Date.now(), an unguarded fetch, or a closure over module-level mutable state inside the step function silently breaks resume, replay, and audit. HumanLayer documents this as the price of admission for owning your control flow: side-effecting work belongs in the executeTool boundary, not in the reducer that decides the next step.
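
One way to keep a clock reading out of the reducer is to record it in the event log at the impure edge and let the reducer read it from there. The sketch below is illustrative only: the LogEvent shape, the TIMEOUT_MS threshold, and the decide and recordTick helpers are hypothetical names, not part of HumanLayer's guide.

```typescript
// Hypothetical event shape: clock readings are stored in the log by the caller,
// so the reducer never calls Date.now() itself.
type LogEvent =
  | { kind: 'task_started'; startedAtMs: number }
  | { kind: 'tick'; nowMs: number }

type Decision = { intent: 'wait' } | { intent: 'timeout'; elapsedMs: number }

const TIMEOUT_MS = 30_000 // illustrative threshold, not from the source

// Pure: the decision depends only on the events passed in.
function decide(events: readonly LogEvent[]): Decision {
  let startedAtMs: number | undefined
  let nowMs: number | undefined
  for (const e of events) {
    if (e.kind === 'task_started') startedAtMs = e.startedAtMs
    if (e.kind === 'tick') nowMs = e.nowMs
  }
  if (startedAtMs === undefined || nowMs === undefined) return { intent: 'wait' }
  const elapsedMs = nowMs - startedAtMs
  return elapsedMs > TIMEOUT_MS ? { intent: 'timeout', elapsedMs } : { intent: 'wait' }
}

// Impure edge: reads the clock once and appends the reading to the log.
// The clock is injectable so tests and replays can supply a fixed value.
function recordTick(events: LogEvent[], clock: () => number = Date.now): void {
  events.push({ kind: 'tick', nowMs: clock() })
}
```

Because decide reads time only from the log, replaying a stored log yields the same decision as the original run, which is the property resume and audit depend on.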

Implementation sketch

// Demonstrates Factor 12: agent as a stateless reducer over an event log.
// The loop, the prompt, and the context assembly are all owned by application
// code rather than hidden behind a framework Agent class.
import { generateObject } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

type Event =
  | { kind: 'user_message'; text: string }
  | { kind: 'tool_call'; name: string; args: unknown }
  | { kind: 'tool_result'; name: string; result: unknown }
  | { kind: 'done'; answer: string }

const NextStep = z.discriminatedUnion('intent', [
  z.object({ intent: z.literal('call_tool'), name: z.string(), args: z.record(z.string(), z.unknown()) }),
  z.object({ intent: z.literal('done'), answer: z.string() }),
])

declare function renderContext(events: Event[]): string                // Factor 3: own the window
declare function executeTool(name: string, args: unknown): Promise<unknown>

// step decides the next event from the log alone. It holds no state between
// calls and runs no tools, so side effects stay out of the reducer.
async function step(events: Event[]): Promise<Event> {                  // Factor 12: stateless reducer
  const { object } = await generateObject({                             // Factor 4: tools as structured output
    model: openai('gpt-4o'),
    schema: NextStep,
    prompt: renderContext(events),                                      // Factor 2: own the prompt
  })
  if (object.intent === 'done') return { kind: 'done', answer: object.answer }
  return { kind: 'tool_call', name: object.name, args: object.args }
}

async function run(initial: Event[]): Promise<string> {                 // Factor 8: own the loop
  const events = [...initial]
  while (true) {
    const next = await step(events)
    events.push(next)                                                   // Factor 5: unified state
    if (next.kind === 'done') return next.answer
    if (next.kind === 'tool_call') {
      // Side-effecting work happens at the executeTool boundary, outside step.
      const result = await executeTool(next.name, next.args)
      events.push({ kind: 'tool_result', name: next.name, result })
    }
  }
}

export {}
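
The sketch above only declares renderContext and executeTool. A minimal way to fill them in might look like the following. Everything here is an assumption for illustration: the XML-like tag layout, the system instruction text, the add tool, and the AgentEvent alias (which mirrors the Event type above under a name that avoids clashing with the DOM's global Event).

```typescript
// Mirrors the Event union from the sketch above (renamed for this example).
type AgentEvent =
  | { kind: 'user_message'; text: string }
  | { kind: 'tool_call'; name: string; args: unknown }
  | { kind: 'tool_result'; name: string; result: unknown }
  | { kind: 'done'; answer: string }

// Factor 3: the whole window is built by one readable function.
// The tag layout is one possible choice; values are not escaped in this sketch.
function renderContext(events: readonly AgentEvent[]): string {
  const lines = events.map((e): string => {
    switch (e.kind) {
      case 'user_message':
        return `<user_message>${e.text}</user_message>`
      case 'tool_call':
        return `<tool_call name="${e.name}">${JSON.stringify(e.args)}</tool_call>`
      case 'tool_result':
        return `<tool_result name="${e.name}">${JSON.stringify(e.result)}</tool_result>`
      case 'done':
        return `<done>${e.answer}</done>`
    }
  })
  // Hypothetical instruction text; in practice this lives in the repo (Factor 2).
  return ['Decide the next step for the conversation below.', ...lines].join('\n')
}

type ToolHandler = (args: unknown) => Promise<unknown>

// Hypothetical registry with a single toy tool.
const tools: Record<string, ToolHandler> = {
  add: async (args) => {
    const { a, b } = args as { a: number; b: number }
    return a + b
  },
}

// All side effects pass through this boundary; unknown tools come back as
// data the model can read rather than as a thrown exception.
async function executeTool(name: string, args: unknown): Promise<unknown> {
  const handler = tools[name]
  if (!handler) return { error: `unknown tool: ${name}` }
  return handler(args)
}
```

Keeping the registry as plain data means a reviewer can list every side effect the agent is able to trigger by reading one object literal.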

References

  1. Dexter Horthy and HumanLayer contributors·2025

    canonical and load-bearing reference; the twelve factors and their per-factor essays

  2. Dexter Horthy·2025

    the factor that sits underneath Context Engineering; explicit prompt assembly over framework defaults

  3. Dexter Horthy·2025

    reducer-shaped agent that composes with Checkpointing and journal-based durable execution

  4. Adam Wiggins·2011

    the Heroku-era predecessor whose framing 12-factor agents adapts; methodology lineage, not direct technical content

  5. Anthropic·2024

    companion field guide; the 12-factor essays cite this as the catalog of mechanisms the methodology selects among

  6. Dexter Horthy·2025

    conference talk presenting the methodology; treated as a primary source by HumanLayer

  7. Marco Argenti (foreword); Antonio Gulli (book)·2026·Springer·pp. 46

    frames agent engineering as a tenet-driven discipline ("Build with Purpose, Look Around Corners, Inspire Trust") — a parallel methodology argument from the CIO of Goldman Sachs

12-Factor Agent is a methodology, not a mechanism. Dexter Horthy and the HumanLayer community catalogued twelve principles a team applies to push an agent across the gap between a 70-percent demo and software a customer touches: translate natural language into tool calls, own your prompts, own your context window, treat tools as structured outputs, unify execution and business state, expose launch / pause / resume as APIs, contact humans through tool calls, own your control flow, compact errors back into the window, keep agents small and focused, trigger from anywhere, and make the agent a stateless reducer over an event log. The lineage to Adam Wiggins' 12-Factor App is deliberate: a cloud-portable operational mindset, applied to LLM-powered software.
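
Of the principles listed, "compact errors back into the window" is perhaps the least self-explanatory. A rough sketch of the idea: a failed tool call becomes a short event the model can read on the next step, and a cap on consecutive failures hands control back to deterministic code. The ToolEvent shape, both limits, and the tryTool helper are hypothetical and chosen only for illustration.

```typescript
type ToolEvent =
  | { kind: 'tool_result'; name: string; result: unknown }
  | { kind: 'tool_error'; name: string; message: string }

const MAX_ERROR_CHARS = 200      // illustrative cap on what enters the window
const MAX_CONSECUTIVE_ERRORS = 3 // illustrative escalation threshold

// Turn any thrown value into a short, model-readable event.
function compactError(name: string, err: unknown): ToolEvent {
  const raw = err instanceof Error ? err.message : String(err)
  const message = raw.length > MAX_ERROR_CHARS ? raw.slice(0, MAX_ERROR_CHARS) + '...' : raw
  return { kind: 'tool_error', name, message }
}

// Count tool_error events at the tail of the log.
function consecutiveErrors(events: readonly ToolEvent[]): number {
  let n = 0
  for (let i = events.length - 1; i >= 0 && events[i]?.kind === 'tool_error'; i--) n++
  return n
}

// Run one tool, append the outcome to the log, and report whether the loop
// should keep going or escalate (for example, to a human).
function tryTool(events: ToolEvent[], name: string, tool: () => unknown): 'continue' | 'escalate' {
  try {
    events.push({ kind: 'tool_result', name, result: tool() })
  } catch (err) {
    events.push(compactError(name, err))
  }
  return consecutiveErrors(events) >= MAX_CONSECUTIVE_ERRORS ? 'escalate' : 'continue'
}
```

The truncation keeps a long stack trace from crowding out the rest of the context, and the counter stops the model from retrying the same failing call indefinitely.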

Background · context and trade-offs

The pattern sits one layer up from the rest of this catalog. Reflexion, Checkpointing, Tool Use ReAct, and the others are mechanisms: concrete loops or protocols a system either runs or does not. 12-Factor is the lens that decides which mechanisms an agent assembles, which abstractions it owns rather than imports, and which framework defaults it dismantles to keep production behaviour legible. The methodology is opinionated about plumbing: the loop is code a developer can step through, the prompt lives in the repo, the context window is assembled by a function a reviewer can read, and the run survives a restart because its state is durable and its reducer is pure.
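
The claim that a run survives a restart can be made concrete with a small sketch. It assumes a hypothetical EventStore interface, backed here by an in-memory map that stores serialized JSON (a file or database would sit behind the same interface), and a deterministic fakeStep that stands in for the model call so the example runs offline.

```typescript
type RunEvent =
  | { kind: 'user_message'; text: string }
  | { kind: 'tool_result'; name: string; result: unknown }
  | { kind: 'done'; answer: string }

// Hypothetical storage boundary; not an API from the source.
interface EventStore {
  load(runId: string): RunEvent[]
  append(runId: string, event: RunEvent): void
}

// In-memory stand-in. Events are kept as JSON strings to show that the log
// survives serialization, which is what a restart requires.
class MemoryStore implements EventStore {
  private runs = new Map<string, string[]>()
  load(runId: string): RunEvent[] {
    return (this.runs.get(runId) ?? []).map((s) => JSON.parse(s) as RunEvent)
  }
  append(runId: string, event: RunEvent): void {
    const existing = this.runs.get(runId) ?? []
    this.runs.set(runId, [...existing, JSON.stringify(event)])
  }
}

// Stand-in for the model: finishes after two lookups.
function fakeStep(events: readonly RunEvent[]): RunEvent {
  const lookups = events.filter((e) => e.kind === 'tool_result').length
  return lookups < 2
    ? { kind: 'tool_result', name: 'lookup', result: lookups }
    : { kind: 'done', answer: `finished after ${lookups} lookups` }
}

// Advance a run by at most maxSteps. State is re-read from the store on every
// iteration, so a worker holds nothing that is lost when it is killed.
function advance(store: EventStore, runId: string, maxSteps: number): RunEvent | undefined {
  for (let i = 0; i < maxSteps; i++) {
    const events = store.load(runId)
    const last = events[events.length - 1]
    if (last && last.kind === 'done') return last
    store.append(runId, fakeStep(events))
  }
  const events = store.load(runId)
  return events[events.length - 1]
}
```

Calling advance with a small maxSteps, discarding the worker, and calling it again from a fresh process against the same store picks up where the first call stopped.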

The principles compose with the rest of the catalog rather than competing. Factor 3 (own your context window) is the operational shape of Context Engineering. Factor 5 (unify state) and factor 12 (stateless reducer) are the contract Checkpointing relies on. Factor 7 (contact humans with tool calls) is one Human-in-the-Loop implementation. Factor 4 (tools as structured outputs) is the wire shape Tool Use ReAct emits. Adoption costs the work of taking abstractions back from a framework; the payoff is that every later debugging session, eval, or incident touches code the team owns end to end.
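
As one illustration of how factor 7 composes with pause and resume, the sketch below treats a request for human approval as just another event. The loop returns when it needs a person, and a later call continues once the response has been appended. The HitlEvent shape, the decideNext stand-in for the model, and the deploy question are all hypothetical.

```typescript
type HitlEvent =
  | { kind: 'user_message'; text: string }
  | { kind: 'human_request'; question: string }
  | { kind: 'human_response'; approved: boolean }
  | { kind: 'done'; answer: string }

type Outcome =
  | { status: 'paused'; question: string }
  | { status: 'done'; answer: string }

type HumanResponse = Extract<HitlEvent, { kind: 'human_response' }>

// Stand-in for the model: asks for approval once, then acts on the answer.
function decideNext(events: readonly HitlEvent[]): HitlEvent {
  const response = events.find((e): e is HumanResponse => e.kind === 'human_response')
  if (!response) return { kind: 'human_request', question: 'Deploy to production?' }
  return { kind: 'done', answer: response.approved ? 'deployed' : 'cancelled' }
}

// Runs until the agent finishes or needs a human. The caller is expected to
// persist the log, deliver the question (Slack, email, and so on), and call
// again only after appending the human_response event.
function runUntilBlocked(events: HitlEvent[]): Outcome {
  while (true) {
    const next = decideNext(events)
    events.push(next)
    if (next.kind === 'human_request') return { status: 'paused', question: next.question }
    if (next.kind === 'done') return { status: 'done', answer: next.answer }
  }
}
```

Because the pause is a return value rather than a blocked thread, the wait for a human can last minutes or days without holding a worker.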