Layer 2: Quality & Control Gates

Human in the Loop

Also known as: Human Approval, Tool Approval Gates, Pause-and-Resume, Principal-Agent Approval

Pauses at designated steps and resumes only after a person approves, edits, or rejects.

The orchestrator declares which actions pause for a human signal, snapshots run state at the boundary, and resumes only after the reviewer approves, edits, or rejects.

Claude Code

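  • Use a PreToolUse hook to stop side-effecting tool calls before they run: exit code 2 blocks the call and feeds the hook's stderr back to Claude, while other non-zero exit codes are treated as non-blocking errors.
  • List sensitive tool patterns under permissions.ask in `settings.json` so that Claude prompts for approval before each match; reserve permissions.deny for calls that must never run.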
  • Use GitHub branch protection together with the merge queue as the deployment-layer HITL gate: require at least one approving human review before any PR can merge.
  • Expose the agent's rationale alongside the approval request — reviewers who cannot see the reasoning default to rubber-stamping.

Primitives

  • PreToolUse hooks (pause gate)
  • settings.json permissions.ask and permissions.deny
  • Claude Code approval prompts
  • Merge queue (deployment HITL)
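
The hook-based gate listed above can be sketched as a small decision function. This is a minimal sketch, assuming the hook receives a JSON payload with `tool_name` and `tool_input` fields on stdin and that exit code 2 is the blocking code; check both against the current Claude Code hooks reference. The pattern list and the name `evaluateToolCall` are hypothetical.

```typescript
// Only the payload fields this sketch reads; the real payload carries more.
type HookInput = { tool_name: string; tool_input: { command?: string } }
type HookResult = { exitCode: 0 | 2; stderr: string }

// Hypothetical list of shell patterns that should wait for a person.
const RISKY_COMMANDS: RegExp[] = [/\brm\s+-rf?\b/, /\bgit\s+push\b.*--force/]

function evaluateToolCall(input: HookInput): HookResult {
  // Only shell commands are screened here; every other tool passes through.
  if (input.tool_name !== 'Bash') return { exitCode: 0, stderr: '' }
  const command = input.tool_input.command ?? ''
  const risky = RISKY_COMMANDS.some((pattern) => pattern.test(command))
  return risky
    ? { exitCode: 2, stderr: `Blocked pending human review: ${command}` }
    : { exitCode: 0, stderr: '' }
}
```

A hook script would parse stdin, call `evaluateToolCall`, write `stderr`, and exit with `exitCode`. Keeping the decision in a pure function makes the deny-list testable without running the agent.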

Cursor

  • Use Agent mode with Cursor's diff-review UI — every file edit surfaces as a reviewable diff; accept or reject each change individually before it lands.
  • For destructive operations (file deletions, config changes), add a .cursor/rules/*.mdc instruction to ask for confirmation before proceeding.
  • Use cloud agents for long tasks — they push to a branch and open a PR for human review before any merge occurs.
  • Use Plan mode as the lightweight HITL step for complex tasks — review and edit the plan before execution begins.

Primitives

  • Diff-by-diff review UI
  • Cloud agents (PR-based HITL)
  • Plan mode (pre-execution approval)
  • .cursor/rules/*.mdc (confirmation rules)

Decision

Use when ✓

  • Worth the cost when a tool call is irreversible or hard to recover from (money movement, destructive filesystem or database operations, sending external messages) and a wrong call costs more than the latency of waiting.
  • Required where regulation or policy demands a named human on the commit (clinical orders, legal filings, hiring decisions) so the audit trail attributes the action to a person, not the model.
  • A good fit when the task is rare or low-volume enough that a queue of pending approvals stays small, and a reviewer answers within the trust budget of the upstream caller.
  • Better than fully autonomous execution when the agent's confidence in the next step is low, or the step is ambiguous or contested. Those are the high-stakes branch points where a fallback prompt is cheaper than a recovery rollback.

Avoid when ✗

  • When the action volume exceeds the reviewer staffing budget, the queue grows without bound and the pattern collapses into either silent timeouts or rubber-stamping that recovers no safety.
  • Without a UI that exposes the agent's rationale, the proposed arguments, and the consequences of approval, reviewers default to clicking yes and the gate becomes ceremony rather than control.
  • When the upstream caller cannot tolerate the wall-clock latency of an asynchronous approval round-trip (interactive chat, real-time pipelines), automated Guardrails or policy-bound autonomous execution is the right shape.
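
The criteria above can be read as a routing rule. The following is a rough sketch only: the field names, the confidence score, and the thresholds are all hypothetical, and a real deployment would tune them per tool.

```typescript
// Hypothetical description of a proposed action; none of these fields come from a real SDK.
type ProposedAction = {
  reversible: boolean      // can the effect be undone cheaply?
  regulated: boolean       // does policy demand a named human on the commit?
  confidence: number       // agent's self-reported confidence in [0, 1]
  callerDeadlineMs: number // how long the upstream caller can wait
}

type Route = 'auto' | 'human' | 'guardrail'

function routeAction(
  action: ProposedAction,
  cfg: { expectedReviewMs: number; minConfidence: number },
): Route {
  const needsHuman =
    !action.reversible || action.regulated || action.confidence < cfg.minConfidence
  if (!needsHuman) return 'auto'
  // Regulation leaves no automated alternative, whatever the latency.
  if (action.regulated) return 'human'
  // If the caller cannot wait for a reviewer, fall back to automated screening.
  return action.callerDeadlineMs >= cfg.expectedReviewMs ? 'human' : 'guardrail'
}
```

An irreversible action from an interactive caller with a one-second deadline routes to `'guardrail'`, while the same action from a batch job that can wait an hour routes to `'human'`.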

In the wild

  • code.claude.com: Claude Code's permission system pauses execution on every tool invocation that matches a non-allowlisted pattern (`Bash(rm:*)`, `Write(/etc/*)`) and prompts the operator in-terminal to allow once, allow always, or deny. The result is a per-call human approval gate built on the same pause-and-resume primitive the pattern names.
  • openai.github.io: OpenAI's Agents SDK ships a `needs_approval` flag on `function_tool` and `Agent.as_tool`; when the flag trips the run pauses, pending items appear in `result.interruptions`, and the orchestrator resumes only after the host application calls `state.approve()` or `state.reject()`.
  • platform.claude.com: Anthropic's Computer Use documentation explicitly recommends asking a human to confirm decisions with meaningful real-world consequences and any tasks requiring affirmative consent (accepting cookies, executing financial transactions, agreeing to terms of service) as a primary risk mitigation for the beta agent loop.

Reader gotcha

Reviewers exposed to a long stream of agent proposals drift toward automation bias: they approve faster than they read, the gate becomes a click-through, and the pattern recovers none of the safety it was deployed for. Goddard, Roudsari and Wyatt's systematic review of decision-support studies finds that the rate of automation-driven errors rises with reviewer workload and falls when the system surfaces its rationale alongside the proposal. A HITL UI that hides the agent's reasoning replicates the same failure mode at agent scale: the reviewer rubber-stamps and the audit trail records consent the operator never actually exercised.
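
One way to act on this is to refuse to raise an approval request that carries no rationale. The sketch below assumes a hypothetical `ApprovalRequest` shape and a plain-text reviewer view; neither comes from a specific SDK.

```typescript
// Hypothetical payload shown to the reviewer.
type ApprovalRequest = {
  tool: string
  args: Record<string, unknown>
  rationale: string   // why the agent wants to make this call
  consequence: string // what happens in the world if the reviewer approves
}

function formatForReview(req: ApprovalRequest): string {
  // A request without reasoning invites rubber-stamping, so fail before it reaches the queue.
  if (req.rationale.trim() === '') {
    throw new Error('refusing to request approval without a rationale')
  }
  return [
    `Tool: ${req.tool}`,
    `Arguments: ${JSON.stringify(req.args)}`,
    `Why: ${req.rationale}`,
    `If approved: ${req.consequence}`,
  ].join('\n')
}
```

Failing at request time moves the problem to the agent side, where it can be fixed, instead of leaving the reviewer to approve a call they cannot evaluate.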

Implementation sketch

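// Sketch written against the Vercel AI SDK v4-era options (`parameters`, `maxSteps`);
// newer major versions rename these, so check the version you install.
import { randomUUID } from 'node:crypto'
import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

type Decision = 'approve' | 'reject'
type Pending = { resolve: (decision: Decision) => void; args: unknown }

// In-memory registry of tool calls parked on a human decision, keyed by approval id.
// A process restart loses these entries; production code needs a durable store.
const pending = new Map<string, Pending>()

// Supplied by the host application: delivers the request to the reviewer's UI or queue.
declare function notifyReviewer(id: string, args: unknown): void

// Called by the review UI; settles the parked promise so the tool call can continue.
export function decide(id: string, decision: Decision) {
  pending.get(id)?.resolve(decision)
  pending.delete(id)
}

const refund = tool({
  description: 'Issue a customer refund. Requires human approval before the refund is issued.',
  parameters: z.object({ orderId: z.string(), cents: z.number().int().positive() }),
  execute: async (args) => {
    const id = randomUUID()
    // Register the entry before notifying, so an immediate decision is not lost.
    const decision = new Promise<Decision>((resolve) => pending.set(id, { resolve, args }))
    notifyReviewer(id, args)
    // No timeout here: an unanswered request strands the run.
    if ((await decision) === 'reject') return { status: 'rejected' as const }
    // The call to the payment provider would go here; the sketch only reports the outcome.
    return { status: 'refunded' as const, orderId: args.orderId }
  },
})

export async function runAgent(userInput: string) {
  return generateText({ model: openai('gpt-4o'), prompt: userInput, tools: { refund }, maxSteps: 5 })
}
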
First-party TS SDK
  • LangGraph
  • OpenAI Agents
  • Vercel AI SDK
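
The sketch above waits for a decision indefinitely. One possible way to keep an unanswered request from stranding the run is to give each pending entry a deadline and sweep expired entries on a timer. This is an illustration under stated assumptions: the names `requestApproval`, `settle`, and `sweepExpired` and the `'expired'` outcome are hypothetical and not part of any SDK.

```typescript
type Outcome = 'approve' | 'reject' | 'expired'
type PendingEntry = { resolve: (outcome: Outcome) => void; deadline: number }

const pendingApprovals = new Map<string, PendingEntry>()

// Parks a request until a reviewer or the sweeper settles it.
// `now` is injectable so the expiry logic can be tested without real timers.
function requestApproval(id: string, ttlMs: number, now: number = Date.now()): Promise<Outcome> {
  return new Promise((resolve) => {
    pendingApprovals.set(id, { resolve, deadline: now + ttlMs })
  })
}

// Settles one request; returns false if it was already settled or never existed.
function settle(id: string, outcome: Outcome): boolean {
  const entry = pendingApprovals.get(id)
  if (!entry) return false
  pendingApprovals.delete(id)
  entry.resolve(outcome)
  return true
}

// Settles every request whose deadline has passed and returns the expired ids.
function sweepExpired(now: number = Date.now()): string[] {
  const expired: string[] = []
  pendingApprovals.forEach((entry, id) => {
    if (entry.deadline <= now) expired.push(id)
  })
  expired.forEach((id) => settle(id, 'expired'))
  return expired
}
```

A host would call `sweepExpired()` from a periodic timer and treat `'expired'` like a rejection, or escalate it to another reviewer.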

References

  1. Christiano et al. · Deep Reinforcement Learning from Human Preferences · 2017 · NeurIPS 2017 · DOI: 10.48550/arXiv.1706.03741

    foundational paper for HITL-via-preferences; the training-time variant of the runtime pattern

  2. Anthropic · 2024

    frames human approval as a first-class checkpoint in the agent control loop

  3. LangChain team · 2025 · accessed

    interrupt() primitive, state edits between checkpoints, resume semantics

  4. OpenAI · 2025 · accessed

    needs_approval, on_invoke, result.interruptions and resume contract

  5. Vercel · 2025 · accessed

    tool-confirmation pattern wired into the Next.js streaming UI

  6. Anthropic · 2025 · accessed

    permission rules and the per-call approval prompt that gates non-allowlisted tool use

  7. Antonio Gulli · 2026 · Springer · pp. 262-285

Human in the Loop (HITL) is the runtime discipline of pausing an agent at a designated step, surfacing the pending decision to a person, and resuming only after that person approves, rejects, or edits what the agent proposed. The interruption is structural rather than incidental: the orchestrator declares which tool calls or branch transitions require a signal, serialises the run state at the boundary, and exposes a queue or callback the human reviewer drives. When the signal arrives the run rehydrates from the same checkpoint and continues with the approved arguments, the rejection reason, or the edited payload spliced in.
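
The serialise-and-rehydrate cycle can be sketched as follows. This is a minimal illustration: the `RunState` shape, the in-memory `checkpoints` map (a stand-in for a durable store), and the function names are all hypothetical.

```typescript
type RunState = { pendingTool: string; pendingArgs: Record<string, unknown>; history: string[] }

type Signal =
  | { kind: 'approve' }
  | { kind: 'reject'; reason: string }
  | { kind: 'edit'; args: Record<string, unknown> }

// Stand-in for a durable store; values are serialised so nothing live is shared.
const checkpoints = new Map<string, string>()

// Snapshot the run at the approval boundary.
function pauseRun(runId: string, state: RunState): void {
  checkpoints.set(runId, JSON.stringify(state))
}

// Rehydrate from the snapshot and splice in the reviewer's signal.
function resumeRun(runId: string, signal: Signal): { state: RunState; execute: boolean } {
  const snapshot = checkpoints.get(runId)
  if (snapshot === undefined) throw new Error(`no checkpoint for run ${runId}`)
  checkpoints.delete(runId) // each checkpoint resumes at most once
  const state = JSON.parse(snapshot) as RunState
  switch (signal.kind) {
    case 'approve':
      return { state, execute: true }
    case 'edit':
      return { state: { ...state, pendingArgs: signal.args }, execute: true }
    case 'reject':
      return {
        state: { ...state, history: [...state.history, `rejected: ${signal.reason}`] },
        execute: false,
      }
  }
}
```

Deleting the checkpoint on resume makes a duplicate signal fail loudly instead of executing the tool call twice.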

Background · context and trade-offs

Three sub-variants share that skeleton. Approve-before-tool-call gates a single side-effecting action (a payment, a destructive shell command, an email send) on a yes-or-no acknowledgement; Claude Code's permission prompts and the OpenAI Agents SDK's `needs_approval` flag both ship this shape. Choose-between-options surfaces N candidates the agent generated and asks the reviewer to pick or rank: Cursor's diff-by-diff accept-and-reject UI on a multi-file edit and RLHF preference labelling at training time are the same variant at different lifecycle points. Free-form correction lets the reviewer rewrite the proposed arguments, the trajectory, or the next plan step before resume, which LangGraph's interrupt primitive supports through state edits between checkpoints.
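
The three variants differ mainly in what the reviewer sends back. A rough way to model that, using hypothetical type and function names, is a pair of discriminated unions and one resolver:

```typescript
type ReviewRequest<A> =
  | { variant: 'approve'; proposal: A }   // yes-or-no on one action
  | { variant: 'choose'; candidates: A[] } // pick one of N
  | { variant: 'correct'; proposal: A }   // reviewer may rewrite

type ReviewResponse<A> =
  | { variant: 'approve'; approved: boolean }
  | { variant: 'choose'; index: number }
  | { variant: 'correct'; corrected: A }

// Returns the action to run after review, or null when the reviewer declined.
function resolveReview<A>(req: ReviewRequest<A>, res: ReviewResponse<A>): A | null {
  if (req.variant === 'approve' && res.variant === 'approve') {
    return res.approved ? req.proposal : null
  }
  if (req.variant === 'choose' && res.variant === 'choose') {
    const picked = req.candidates[res.index]
    if (picked === undefined) throw new Error(`no candidate at index ${res.index}`)
    return picked
  }
  if (req.variant === 'correct' && res.variant === 'correct') {
    return res.corrected
  }
  throw new Error('response does not match the request variant')
}
```

Rejecting a mismatched response keeps a stale or misrouted reviewer message from being applied to the wrong request.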

HITL is distinct from Guardrails, which screens input and output automatically without a person on the path, and from Evaluator-Optimizer, where the critic is another model. The pattern earns its keep when an irreversible action or a reputational risk calls for a person's name on the commit, and when the queue of pending approvals is staffed densely enough that the wait does not eclipse the work. The cost is wall-clock and operational: an interruption that nobody answers strands the run, a queue routed to one overloaded reviewer becomes the bottleneck, and a UI that hides the agent's rationale produces rubber-stamping that recovers none of the safety the pattern was deployed for.
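
Whether the queue is "staffed densely enough" can be estimated with basic queueing arithmetic: utilisation is the review work arriving per hour divided by the review capacity per hour. The function below is a back-of-the-envelope sketch with hypothetical parameter names, and it ignores bursts and reviewer availability.

```typescript
// Utilisation of the reviewer pool: at or above 1 the queue grows without bound.
function reviewerUtilisation(
  approvalsPerHour: number,
  minutesPerReview: number,
  reviewers: number,
): number {
  if (reviewers <= 0) return Infinity
  const workMinutesPerHour = approvalsPerHour * minutesPerReview
  const capacityMinutesPerHour = 60 * reviewers
  return workMinutesPerHour / capacityMinutesPerHour
}
```

With 30 approvals an hour at 4 minutes each, one reviewer sits at a utilisation of 2 and falls further behind every hour, while four reviewers sit at 0.5. Values close to 1 still produce long waits, so the target should be well below it.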