Layer 2 — Quality & Control Gates

Human in the Loop

Also known as: Human Approval, Tool Approval Gates, Pause-and-Resume, Principal-Agent Approval

Agent pauses on a designated step and waits for a human signal before continuing.

A top-down flowchart in which an Agent step branches on whether the next action is sensitive: non-sensitive actions execute and loop back, while sensitive ones snapshot state, pause, and surface the request to a reviewer whose approve, edit, or reject signal either resumes the loop with original or edited arguments, or aborts the run.

Decision

Use when ✓	Avoid when ✗
+Apply when a tool call is irreversible or hard to recover from — money movement, destructive filesystem or database operations, sending external messages — and a wrong call costs more than the latency of waiting.	−When the action volume exceeds the reviewer staffing budget, the queue grows without bound and the pattern collapses into either silent timeouts or rubber-stamping that recovers no safety.
+Use where regulation or policy demands a named human on the commit (clinical orders, legal filings, hiring decisions) so the audit trail attributes the action to a person, not the model.	−Without a UI that exposes the agent's rationale, the proposed arguments, and the consequences of approval, reviewers default to clicking yes and the gate becomes ceremony rather than control.
+Reach for it when the task is rare or low-volume enough that a queue of pending approvals stays small, and a reviewer answers within the trust budget of the upstream caller.	−When the upstream caller cannot tolerate the wall-clock latency of an asynchronous approval round-trip (interactive chat, real-time pipelines), automated Guardrails or a policy-bound autonomous mode is the better fit.
+Prefer it over fully autonomous execution when the agent's confidence on the next step is low, ambiguous, or contested — high-stakes branch points where a fallback prompt is cheaper than a recovery rollback.

In the wild

Source	Claim
code.claude.com →	Claude Code's permission system pauses execution on every tool invocation that matches a non-allowlisted pattern (`Bash(rm:)`, `Write(/etc/)`) and prompts the operator in-terminal to allow once, allow always, or deny — a per-call human approval gate built on the same pause-and-resume primitive the pattern names.
openai.github.io →	OpenAI's Agents SDK ships a `needs_approval` flag on `function_tool` and `Agent.as_tool`; when the flag trips the run pauses, pending items appear in `result.interruptions`, and the orchestrator resumes only after the host application calls `state.approve()` or `state.reject()`.
platform.claude.com →	Anthropic's Computer Use documentation explicitly recommends asking a human to confirm decisions with meaningful real-world consequences and any tasks requiring affirmative consent — accepting cookies, executing financial transactions, agreeing to terms of service — as a primary risk mitigation for the beta agent loop.

Reader gotcha

Reviewers exposed to a long stream of agent proposals drift toward automation bias: they approve faster than they read, the gate becomes a click-through, and the pattern recovers none of the safety it was deployed for. Goddard, Roudsari and Wyatt's systematic review of decision-support studies finds that the rate of automation-driven errors rises with reviewer workload and falls when the system surfaces its rationale alongside the proposal. A HITL UI that hides the agent's reasoning replicates the same failure mode at agent scale: the reviewer rubber-stamps and the audit trail records consent the operator never actually exercised.
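One way to make the rationale hard to omit is to refuse to enqueue a review request that lacks it. The sketch below is a minimal illustration under that assumption; `ReviewCard`, `buildReviewCard`, and `renderReviewCard` are hypothetical names, not part of any SDK mentioned on this page.

```typescript
// Hypothetical payload shape; the field names are illustrative only.
type ReviewCard = {
  tool: string
  args: Record<string, unknown>
  rationale: string // why the agent proposes this call
  consequences: string[] // what happens if the reviewer approves
}

// Refuses to build a card that the reviewer could only rubber-stamp.
function buildReviewCard(card: ReviewCard): ReviewCard {
  if (card.rationale.trim() === '') {
    throw new Error('review card needs the agent rationale')
  }
  if (card.consequences.length === 0) {
    throw new Error('review card needs at least one consequence')
  }
  return card
}

// Plain-text rendering that puts rationale and consequences ahead of the prompt.
function renderReviewCard(card: ReviewCard): string {
  return [
    `Proposed call: ${card.tool}(${JSON.stringify(card.args)})`,
    `Why: ${card.rationale}`,
    ...card.consequences.map((c) => `If approved: ${c}`),
    'Approve, edit, or reject?',
  ].join('\n')
}
```

A check like this does not stop a tired reviewer from clicking yes, but it does guarantee that the information needed for a real decision is on the screen.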

Implementation sketch

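import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

type Decision = 'approve' | 'reject'
type Pending = { resolve: (decision: Decision) => void; args: unknown }

// In-memory registry of paused tool calls, keyed by approval id.
// A restart loses it; a production system would persist pending approvals.
const pending = new Map<string, Pending>()

// Host-provided hooks, declared only. issueRefund is a hypothetical stand-in
// for whatever payment call actually moves the money.
declare function notifyReviewer(id: string, args: unknown): void
declare function issueRefund(orderId: string, cents: number): Promise<void>

// Called by the review UI. Returns false if the id is unknown or already decided.
export function decide(id: string, decision: Decision): boolean {
  const entry = pending.get(id)
  if (!entry) return false
  pending.delete(id)
  entry.resolve(decision)
  return true
}

const refund = tool({
  description: 'Issue a customer refund. Requires human approval before any money moves.',
  parameters: z.object({ orderId: z.string(), cents: z.number().int().positive() }),
  execute: async (args) => {
    const id = crypto.randomUUID()
    // Register the pending entry before notifying, so a reviewer who answers
    // immediately cannot hit an id that is not in the map yet.
    const decision = new Promise<Decision>((resolve) => pending.set(id, { resolve, args }))
    notifyReviewer(id, args)
    if ((await decision) === 'reject') return { status: 'rejected' as const }
    await issueRefund(args.orderId, args.cents)
    return { status: 'refunded' as const, orderId: args.orderId }
  },
})

export async function runAgent(userInput: string) {
  return generateText({ model: openai('gpt-4o'), prompt: userInput, tools: { refund }, maxSteps: 5 })
}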

First-party TS SDK

LangGraph
OpenAI Agents
Vercel AI SDK

References

PAPER · Deep Reinforcement Learning from Human Preferences
Christiano et al.·2017·NeurIPS 2017 · DOI: 10.48550/arXiv.1706.03741
foundational paper for HITL-via-preferences; the training-time variant of the runtime pattern
ESSAY · Building Effective Agents
Anthropic·2024
frames human approval as a first-class checkpoint in the agent control loop
DOCS · LangGraph: Interrupts and Human-in-the-Loop
LangChain team·2025·accessed 2026-05-04
interrupt() primitive, state edits between checkpoints, resume semantics
DOCS · OpenAI Agents SDK: Tools and Approval Flow
OpenAI·2025·accessed 2026-05-04
needs_approval, on_invoke, result.interruptions and resume contract
DOCS · Vercel AI SDK: Human in the Loop cookbook
Vercel·2025·accessed 2026-05-04
tool-confirmation pattern wired into the Next.js streaming UI
DOCS · Claude Code: Identity and Access Management
Anthropic·2025·accessed 2026-05-04
permission rules and the per-call approval prompt that gates non-allowlisted tool use
BOOK · Agentic Design Patterns, Chapter 16: Human-in-the-Loop
Antonio Gulli·2026·Springer·pp. 262–285

Overview · 1-paragraph mechanism

Human in the Loop (HITL) is the runtime discipline of pausing an agent at a designated step, surfacing the pending decision to a person, and resuming only after that person approves, rejects, or edits what the agent proposed. The interruption is structural rather than incidental: the orchestrator declares which tool calls or branch transitions require a signal, serialises the run state at the boundary, and exposes a queue or callback the human reviewer drives. When the signal arrives the run rehydrates from the same checkpoint and continues with the approved arguments, the rejection reason, or the edited payload spliced in.
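The mechanism above can be sketched without any framework. The following is a minimal illustration, assuming a run is a fixed list of planned tool calls and that "executing" a call just appends to a log; `RunState`, `Signal`, `SENSITIVE`, `advance`, and `resume` are hypothetical names invented for this sketch, not any SDK's API.

```typescript
// Hypothetical types for illustration; not taken from any SDK.
type ToolCall = { tool: string; args: Record<string, unknown> }
type Signal =
  | { kind: 'approve' }
  | { kind: 'edit'; args: Record<string, unknown> }
  | { kind: 'reject'; reason: string }

type RunState = { cursor: number; plan: ToolCall[]; log: string[] }
type StepResult =
  | { status: 'done'; log: string[] }
  | { status: 'aborted'; log: string[] }
  | { status: 'paused'; checkpoint: string; pendingCall: ToolCall }

// Tools the orchestrator declares as needing a human signal (assumed policy).
const SENSITIVE = new Set(['refund', 'delete_file'])

// Runs plan steps until the plan ends or a sensitive call has no signal yet.
function advance(state: RunState, signal?: Signal): StepResult {
  let pendingSignal = signal
  while (state.cursor < state.plan.length) {
    let call = state.plan[state.cursor]!
    if (SENSITIVE.has(call.tool)) {
      if (!pendingSignal) {
        // Serialise at the boundary so the run can outlive this process.
        return { status: 'paused', checkpoint: JSON.stringify(state), pendingCall: call }
      }
      if (pendingSignal.kind === 'reject') {
        state.log.push(`rejected ${call.tool}: ${pendingSignal.reason}`)
        return { status: 'aborted', log: state.log }
      }
      if (pendingSignal.kind === 'edit') {
        call = { ...call, args: pendingSignal.args } // splice in the edited payload
      }
      pendingSignal = undefined // one signal covers exactly one gated call
    }
    state.log.push(`executed ${call.tool} ${JSON.stringify(call.args)}`)
    state.cursor += 1
  }
  return { status: 'done', log: state.log }
}

// Rehydrates from the checkpoint string and continues with the reviewer's signal.
function resume(checkpoint: string, signal: Signal): StepResult {
  return advance(JSON.parse(checkpoint) as RunState, signal)
}
```

Because the checkpoint is a plain string, it can be stored in a queue or database between the pause and the reviewer's answer, and the resume call does not need to happen in the process that paused.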

Background · context and trade-offs

Three sub-variants share that skeleton. Approve-before-tool-call gates a single side-effecting action (a payment, a destructive shell command, an email send) on a yes-or-no acknowledgement; Claude Code's permission prompts and the OpenAI Agents SDK's `needs_approval` flag both ship this shape. Choose-between-options surfaces N candidates the agent generated and asks the reviewer to pick or rank — Cursor's diff-by-diff accept-and-reject UI on a multi-file edit, or RLHF preference labelling at training time, are both this variant moved to different lifecycle points. Free-form correction lets the reviewer rewrite the proposed arguments, the trajectory, or the next plan step before resume, which LangGraph's interrupt primitive supports through state edits between checkpoints.
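The three sub-variants differ mainly in what the reviewer is shown and what comes back. As a rough sketch, they can be modelled as one discriminated union; the type and function names here (`ReviewRequest`, `ReviewResponse`, `resolveReview`) are assumptions made for illustration and do not mirror the APIs of Claude Code, the OpenAI Agents SDK, Cursor, or LangGraph.

```typescript
// Hypothetical request and response shapes, one pair per sub-variant.
type ReviewRequest<A> =
  | { kind: 'approve'; proposed: A } // approve-before-tool-call
  | { kind: 'choose'; candidates: A[] } // choose-between-options
  | { kind: 'correct'; proposed: A } // free-form correction

type ReviewResponse<A> =
  | { kind: 'approve'; approved: boolean }
  | { kind: 'choose'; index: number }
  | { kind: 'correct'; edited: A }

// Returns the arguments to resume with, or null when the reviewer declined.
function resolveReview<A>(req: ReviewRequest<A>, res: ReviewResponse<A>): A | null {
  if (req.kind === 'approve' && res.kind === 'approve') {
    return res.approved ? req.proposed : null
  }
  if (req.kind === 'choose' && res.kind === 'choose') {
    const picked = req.candidates[res.index]
    if (picked === undefined) throw new Error(`no candidate at index ${res.index}`)
    return picked
  }
  if (req.kind === 'correct' && res.kind === 'correct') {
    return res.edited
  }
  // A response must answer the kind of question that was asked.
  throw new Error(`response kind "${res.kind}" does not answer request kind "${req.kind}"`)
}
```

Keeping the request kind on the payload lets the review UI pick the right widget (a yes/no button, a picker, or an editor) from the data alone.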

HITL is distinct from Guardrails, which screens input and output automatically without a person on the path, and from Evaluator-Optimizer, where the critic is another model. The pattern earns its keep when an irreversible action or a reputational risk wants a person's name on the commit, and when the queue of pending approvals is staffed densely enough that the wait does not eclipse the work. The cost is wall-clock and operational: an interruption that nobody answers strands the run, a queue routed to one overloaded reviewer becomes the bottleneck, and a UI that hides the agent's rationale produces rubber-stamping that recovers none of the safety the pattern was deployed for.
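One common mitigation for the stranded-run case is to put a deadline on the wait. The sketch below assumes the reviewer's answer arrives as a promise and that expiry should be reported as its own outcome instead of being treated as approval; `awaitDecision` and the `'expired'` outcome are hypothetical names chosen for this example.

```typescript
type Decision = 'approve' | 'reject'
type Outcome = Decision | 'expired'

// Resolves with the reviewer's decision, or 'expired' once the deadline passes.
// Reporting expiry separately (never as silent approval) is an assumed policy.
function awaitDecision(decision: Promise<Decision>, timeoutMs: number): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<Outcome>((resolve) => {
    timer = setTimeout(() => resolve('expired'), timeoutMs)
  })
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([decision, deadline]).finally(() => clearTimeout(timer))
}
```

The caller can then decide what expiry means for the run: abort it, escalate to a second reviewer, or re-queue the request. None of those choices is implied by the helper itself.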