Reflexion
Also known as: Verbal Reinforcement Learning, Self-Reflection
Writes a critique after each attempt and retrieves prior critiques on the next attempt.
Claude Code
- Write verbal critiques to a named file after each failed attempt; load that file at the top of the next attempt's `CLAUDE.md` context.
- Scope critique storage to the task class — a per-project directory in
.claude/works; key by task type, not by individual run. - Use a separate subagent as the critic to avoid same-model self-approval; pass it the trajectory as a file reference.
- Compact the critique buffer periodically — a long-term store of all failures degrades retrieval quality for the next attempt.
Primitives
Related patterns
Cursor
- Append critique notes to a `.cursor/rules/*.mdc` file after each failed attempt;
alwaysApply: trueloads them on the next session. - Reference the critique file explicitly via
@fileat the start of the retry attempt so the agent reads it first. - Scope critique rules by task class — one
.mdcfile per problem domain keeps lessons targeted and avoids stale notes diluting unrelated tasks. - Limit each critique rule file to under 100 lines; split by task class if the file grows beyond that to preserve retrieval quality.
Primitives
Related patterns
Decision
| Use when ✓ | Avoid when ✗ |
|---|---|
| +This earns its rent when the agent attempts the same task class repeatedly and you have a place to keep critiques across runs (multi-turn assistants, recurring job types, agent benchmarks). | −When the task is single-shot, the lesson has no future attempt to inform and the critique step pays no rent. |
| +Justified where failure is diagnosable from the trajectory: the agent can name the mistake in words an outside reviewer could verify. | −Without external grounding, same-model self-critique tends to approve its own work even when it should not. Substitute a different model, a tool-grounded check, or the CRITIC pattern. |
| +A good fit when fine-tuning is too slow or too expensive but you can afford a second LLM call per failed attempt and a small key-value store for lessons. | −When failures are not visible in the trajectory (stale data the agent could not have known about, hidden environment changes), the critique will hallucinate a cause. |
| +Useful when you want behavioral improvements that survive a deploy: the lessons are inspectable text you can read, edit, or evict by hand. |
In the wild
| Source | Claim |
|---|---|
| cognition.ai → | Cognition documents Devin keeping notes on what worked and what failed across sessions on the same project, then reading those notes back when it picks the work up again. |
| langchain-ai.github.io → | LangGraph ships a runnable Reflection tutorial that wires a generator and reflector around a shared message thread, exactly the loop this pattern describes. |
| arxiv.org → | AgentBench evaluates language agents across eight environments and reports that Reflexion-style verbal feedback measurably improves performance on programming and operating-system tasks where trajectory signal is rich. |
Reader gotcha
A same-model critic trained on the same prompt will systematically approve its own output, producing sycophantic agreement that looks like self-correction but adds no signal. Ground the critique externally (a different model, a code interpreter, a search-grounded checker), as CRITIC argues. source
Implementation sketch
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
type Episode = { task: string; outcome: 'success' | 'failure'; critique: string }
const memory: Episode[] = []
declare function evaluate(output: string): Promise<boolean>
async function attemptWithReflexion(task: string, maxAttempts = 3): Promise<string> {
for (let attempt = 0; attempt < maxAttempts; attempt++) {
const lessons = memory
.filter((m) => m.outcome === 'failure')
.slice(-3)
.map((m) => m.critique)
.join('\n')
const { text } = await generateText({
model: openai('gpt-4o'),
prompt: `Lessons from prior attempts:\n${lessons}\n\nTask: ${task}`,
})
if (await evaluate(text)) return text
const critique = await generateText({
model: openai('gpt-4o'),
prompt: `Task: ${task}\nAttempt: ${text}\nWhy did this fail? Write a one-paragraph lesson for next time.`,
})
memory.push({ task, outcome: 'failure', critique: critique.text })
}
throw new Error('Max attempts exceeded')
}
export {}
- LangChain
- LangGraph
References
Reflexion teaches a language agent to learn from its own trajectories without touching model weights. After each attempt, the agent inspects the trajectory and any environment feedback, then writes a short verbal critique: a paragraph that names what went wrong and what to try next. That critique is appended to an episodic memory buffer keyed by task or task class. On the next attempt, the agent retrieves the most recent and most relevant critiques and conditions its plan on them, treating prior failures as instructions rather than as silent gradient signal.
Background · context and trade-offs
The pattern straddles two layers. As Topology it shapes the control flow: a generator-execute-evaluate loop that, on failure, branches into a critique step before retrying. As State it maintains durable, retrievable memory whose unit is a natural-language lesson, not an embedding of a previous answer. The mechanism only earns its keep when failures are diagnosable from the trajectory itself: the agent must be able to articulate, in words, what an outside observer could also see. Tasks where failure is invisible (a stale tool, a wrong premise the agent never questioned) defeat the loop, because the critique is grounded in nothing.
Reflexion sits next to but distinct from a within-attempt generator-critic loop. Self-Refine iterates on a single output until a critic stops complaining; Reflexion iterates across attempts, so the lesson outlives the run and the next encounter with the same problem class starts informed. The cost is operational, not algorithmic: someone has to decide what counts as the same task, how many critiques to retrieve, when to compact the buffer, and which model writes the critique. The default of having the same model judge its own work is the hazard the pattern is most often deployed without noticing.