Most production agents need to escalate to humans sometimes — for hard cases, low confidence, or sensitive operations. The handoff itself is UX-critical: bad handoffs lose context and frustrate users.
When to handoff
Low confidence (model's own uncertainty signal, multi-sample disagreement). Out-of-scope (capability the agent doesn't have). Sensitive (operations the agent shouldn't authorize). Repeated failures (agent retried 3 times unsuccessfully).
What to hand over
Conversation transcript, agent's reasoning, what it tried, what failed. Don't make the human re-read the whole conversation; surface the relevant context and the specific blocker.
Handoff back
Human resolves, decides to either return to agent or stay in human mode. State preserved both directions. Logged for training data ('these are the cases the agent should have handled differently').