Tool Use / ReAct
Also known as: Reasoning + Acting, Function Calling, Tool-Calling Loop
Agent interleaves a thought, a tool call, and the result until it can answer.
Decision
| Use when ✓ | Avoid when ✗ |
|---|---|
| +Apply when answering the question requires data the model does not have at training time (current prices, account state, the contents of a file, the result of a search). | −When the task can be answered from the prompt and a single retrieval, a static RAG pipeline is cheaper and more predictable than letting the model decide whether to search. |
| +Use where each tool result genuinely shifts the next step — code execution, database queries, retrieval against a corpus the model has not memorised. | −When the action space is closed and known in advance, a router or finite state machine pays less per step than re-asking the model what to do at every turn. |
| +Reach for it when the underlying provider already exposes structured tool calling, so the parser, retry semantics, and schema validation are not your code to write. | −Without a hard step budget and a way to detect repeated near-identical tool calls, the loop will burn tokens chasing its own tail on tasks that have no answer. |
| +Prefer it over a hand-rolled chain when the number of steps is data-dependent and a fixed pipeline would either over- or under-fetch. | |
In the wild
| Source | Claim |
|---|---|
| docs.anthropic.com → | Anthropic's tool-use API documents the exact loop this pattern names — the model emits a tool_use block, the client runs the tool, returns a tool_result, and the model is re-invoked until it stops requesting tools. |
| docs.claude.com → | Claude Code ships the loop as its core runtime: the CLI exposes filesystem, shell, and search tools, parses the model's tool calls, executes them locally, and feeds results back until the agent emits a final response. |
| cognition.ai → | Cognition's Devin documentation describes the same Thought-Action-Observation cycle layered on a sandboxed shell, browser, and code editor — the agent's plan branches on each tool result rather than running to a fixed script. |
Reader gotcha
Tool catalogues that grow past roughly a dozen entries degrade selection accuracy: the model picks plausible-but-wrong tools and reasoning quality drops. Anthropic documents the same effect in production deployments and recommends scoping the catalogue per task or routing to subagents with smaller toolsets.
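One way to apply that advice with the same SDK used in the sketch below is to key a small toolset off the task before calling the model. This is a minimal sketch under AI SDK 4-style assumptions; the catalogue, the task names, and the stub tools are all illustrative, not part of any API.

```ts
import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

// Stub factory so the example stays self-contained; real tools would differ.
const stub = (description: string) =>
  tool({
    description,
    parameters: z.object({ input: z.string() }),
    execute: async ({ input }) => input,
  })

// Full catalogue: too large to hand to the model wholesale.
const catalogue = {
  searchDocs: stub('Search the docs corpus.'),
  fetchUrl: stub('Fetch a URL and return its text body.'),
  runSql: stub('Run a read-only SQL query.'),
  createTicket: stub('Open a support ticket.'),
  // ...dozens more in a real deployment
}

// Each task route sees a handful of tools, not the whole catalogue.
const toolsets: Record<string, (keyof typeof catalogue)[]> = {
  docs_qa: ['searchDocs', 'fetchUrl'],
  reporting: ['runSql'],
}

const pick = (task: keyof typeof toolsets) =>
  Object.fromEntries(toolsets[task].map((name) => [name, catalogue[name]]))

await generateText({
  model: openai('gpt-4o'),
  tools: pick('docs_qa'),
  maxSteps: 6,
  prompt: 'Summarise the current rate-limit policy from the docs.',
})
```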
Implementation sketch
```ts
import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const tools = {
  searchDocs: tool({
    description: 'Search the docs corpus for a query and return top-k snippets.',
    parameters: z.object({ query: z.string(), k: z.number().int().min(1).max(10).default(5) }),
    execute: async ({ query, k }) => {
      // Real implementation hits a vector store; stubbed here.
      return { snippets: Array.from({ length: k }, (_, i) => `hit ${i} for ${query}`) }
    },
  }),
  fetchUrl: tool({
    description: 'Fetch a URL and return its text body, truncated to 4 KB.',
    parameters: z.object({ url: z.string().url() }),
    execute: async ({ url }) => (await fetch(url)).text().then((t) => t.slice(0, 4096)),
  }),
}

const { text } = await generateText({
  model: openai('gpt-4o'),
  tools,
  maxSteps: 6, // bounded ReAct loop; runtime parses tool calls and feeds results back
  prompt: 'Find the current rate-limit policy in our docs and summarise it in one sentence.',
})

export {}
```
- Vercel AI SDK
- LangGraph
- OpenAI Agents
- Mastra
References
- Yao et al. · 2022 · ICLR 2023 · DOI: 10.48550/arXiv.2210.03629
  foundational paper; introduces the Thought–Action–Observation interleaving
- Schick et al. · 2023 · NeurIPS 2023 · DOI: 10.48550/arXiv.2302.04761
  self-supervised tool-use training; complementary to inference-time ReAct
- Anthropic · 2024
  frames the loop as the canonical "augmented LLM" and warns about tool catalog size
- Anthropic · 2025 · accessed
  canonical client-side loop documented end-to-end
- Vercel · 2025 · accessed
  first-party TypeScript implementation used in this pattern's sketch
- Antonio Gulli · 2026 · Springer · pp. 69–86
Overview · 1-paragraph mechanism
Tool Use / ReAct fuses two capabilities the model has on its own — generating a chain of thought and emitting a structured function call — into a single control loop. At each step the agent writes a private rationale (the Thought), chooses an action from a typed tool catalog (the Action), and is then handed the tool's output (the Observation) before it decides what to do next. The loop continues until the agent emits a terminal answer rather than another action. The rationale is what makes the trajectory legible and what lets the next step condition on more than just the last observation.
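To make that vocabulary concrete, here is an illustrative shape for one trajectory. No SDK defines these exact types, and the tool name, query, and answer text are hypothetical.

```ts
// Illustrative trajectory types; no SDK defines these exact names.
type Thought = { kind: 'thought'; text: string }
type Action = { kind: 'action'; tool: string; args: Record<string, unknown> }
type Observation = { kind: 'observation'; result: unknown }
type Answer = { kind: 'answer'; text: string }

// One full pass: rationale, tool call, tool result, then a terminal answer.
const trajectory: (Thought | Action | Observation | Answer)[] = [
  { kind: 'thought', text: 'The docs corpus should contain the rate-limit policy.' },
  { kind: 'action', tool: 'searchDocs', args: { query: 'rate limit policy' } },
  { kind: 'observation', result: { snippets: ['Requests are capped at 100/min per key…'] } },
  { kind: 'answer', text: 'The policy caps clients at 100 requests per minute per key.' },
]
```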
Background · context and trade-offs
Mechanically, the runtime owns the loop: it parses the tool call out of the model output, dispatches it to the registered handler, appends the result back into the conversation, and re-invokes the model. A maximum-step budget bounds the recursion. The tool schema is the contract — names, JSON-schema arguments, and a one-line description per tool — and the model only sees actions it could plausibly take. Provider-side function calling pushes the parsing concern down into the model API, so the loop above is now the canonical agent skeleton in the OpenAI, Anthropic, and Vercel AI SDKs.
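Here is what that loop looks like when you do own it, as a minimal hand-rolled sketch against the OpenAI Chat Completions tool-calling API. The single searchDocs tool, its stubbed handler, and the error on budget exhaustion are illustrative; production code would validate arguments and handle parallel calls and refusals more carefully.

```ts
import OpenAI from 'openai'

const client = new OpenAI()

// One illustrative tool; `parameters` is plain JSON Schema.
const toolDefs = [
  {
    type: 'function' as const,
    function: {
      name: 'searchDocs',
      description: 'Search the docs corpus for a query.',
      parameters: {
        type: 'object',
        properties: { query: { type: 'string' } },
        required: ['query'],
      },
    },
  },
]

// Registered handlers keyed by tool name; stubbed for the sketch.
const handlers: Record<string, (args: any) => Promise<unknown>> = {
  searchDocs: async ({ query }) => ({ snippets: [`stub hit for ${query}`] }),
}

async function runLoop(prompt: string, maxSteps = 6): Promise<string> {
  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    { role: 'user', content: prompt },
  ]

  for (let step = 0; step < maxSteps; step++) {
    const res = await client.chat.completions.create({ model: 'gpt-4o', messages, tools: toolDefs })
    const msg = res.choices[0].message
    messages.push(msg) // keep the assistant turn, including its tool calls, in the transcript

    // Terminal answer: the model stopped requesting tools.
    if (!msg.tool_calls?.length) return msg.content ?? ''

    // Dispatch each requested call and append its observation as a tool message.
    for (const call of msg.tool_calls) {
      const result = await handlers[call.function.name](JSON.parse(call.function.arguments))
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) })
    }
  }
  throw new Error('step budget exhausted without a terminal answer')
}
```

The SDK sketch earlier collapses all of this into a tools record and a maxSteps bound; the loop underneath is the same.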
The pattern earns its keep when the answer requires fresh data, side effects, or computation the model is bad at — search, code execution, database lookup, calendar mutation — and where each tool result genuinely changes what to do next. Two failure modes recur. The catalog grows past the model's working memory and tool selection degrades; or the agent loops between two near-identical tool calls because nothing in the trajectory disconfirms its current hypothesis. Both have the same root: the loop runs without a stopping criterion that is independent of the model's own confidence.
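One way to make that stopping criterion explicit, sketched with hypothetical helper names: a hard step budget plus a repeat detector that fires when the agent re-issues a near-identical call.

```ts
type ToolCall = { name: string; args: Record<string, unknown> }

// Hypothetical guard: stops the loop on a hard budget or on repeated near-identical calls,
// independent of anything the model reports about its own progress.
function makeLoopGuard(maxSteps: number, maxRepeats = 2) {
  const seen = new Map<string, number>()
  let steps = 0

  return function shouldStop(call: ToolCall): { stop: boolean; reason?: string } {
    if (++steps > maxSteps) return { stop: true, reason: 'step budget exhausted' }

    // Canonicalise the call so trivially reordered arguments still count as a repeat.
    const key = `${call.name}:${JSON.stringify(Object.entries(call.args).sort())}`
    const count = (seen.get(key) ?? 0) + 1
    seen.set(key, count)
    if (count > maxRepeats) return { stop: true, reason: `repeated call: ${call.name}` }

    return { stop: false }
  }
}

// Inside the loop: check the guard before dispatching; on stop, force a final answer
// or surface the reason to the caller instead of burning more steps.
const guard = makeLoopGuard(6)
// guard({ name: 'searchDocs', args: { query: 'rate limits' } }) -> { stop: false }
```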