LangGraph models agent workflows as directed graphs: nodes do the work (LLM calls, tools, custom logic), edges decide what runs next, and explicit state is passed between them. Unlike ad-hoc agent loops, it gives you checkpointing, observability, and human-in-the-loop pauses, all properties that production agents need.

Graph model

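Nodes are functions (LLM call, tool, custom logic). Edges are either fixed transitions or conditional, where a routing function inspects the state and returns the name of the next node. State is a typed schema, most commonly a TypedDict, though Pydantic models and dataclasses also work; each node returns an update that is merged into it. Cycles are allowed, and that is how loops are expressed.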
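
To make the model concrete, here is a minimal, library-free sketch of the same idea in plain Python. It is not LangGraph's implementation or API: `run_graph`, the `draft` and `review` nodes, and the `max_steps` guard are hypothetical names chosen for illustration. It shows a fixed edge, a conditional edge, and a cycle that repeats until a condition holds.

```python
from typing import Any, Callable, Dict, Union

State = Dict[str, Any]
Node = Callable[[State], State]
# An edge is either a fixed target name or a function that picks the target.
Edge = Union[str, Callable[[State], str]]

END = "__end__"


def run_graph(nodes: Dict[str, Node], edges: Dict[str, Edge],
              entry: str, state: State, max_steps: int = 50) -> State:
    """Walk the graph from `entry` until END, merging each node's update into state."""
    current = entry
    for _ in range(max_steps):  # guard against runaway cycles
        if current == END:
            return state
        state = {**state, **nodes[current](state)}
        edge = edges[current]
        current = edge(state) if callable(edge) else edge
    raise RuntimeError("step limit reached before END")


# Hypothetical nodes: draft, then review; review sends work back until it passes.
def draft(state: State) -> State:
    return {"attempts": state.get("attempts", 0) + 1}


def review(state: State) -> State:
    return {"approved": state["attempts"] >= 3}


nodes = {"draft": draft, "review": review}
edges = {
    "draft": "review",                                      # fixed edge
    "review": lambda s: END if s["approved"] else "draft",  # conditional edge, forms a cycle
}

final = run_graph(nodes, edges, entry="draft", state={})
print(final)
```

The loop lives in the edge table rather than in a `while` statement inside a node, which is what lets a framework inspect, pause, or checkpoint it between steps.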

Checkpointing

After each step, a checkpointer persists the state (in-memory, Redis, or SQLite backends are available). On failure or process restart, the graph resumes from the last checkpoint instead of starting over. This is critical for long-running agents that may take 5+ minutes. Production deployments typically use a durable store such as Postgres or DynamoDB for checkpoints.
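
The save-and-resume behavior can be sketched without LangGraph using the standard library's `sqlite3`. This is an illustrative model only, not how LangGraph's checkpointers are implemented; the `checkpoints` table layout, the `thread-1` key, and the `fetch` and `flaky_total` steps are assumptions made up for the example.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "thread_id TEXT PRIMARY KEY, next_step INTEGER, state TEXT)"
    )
    return conn


def load(conn: sqlite3.Connection, thread_id: str) -> Tuple[int, State]:
    row = conn.execute(
        "SELECT next_step, state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return (row[0], json.loads(row[1])) if row else (0, {})


def run(conn: sqlite3.Connection, thread_id: str,
        steps: List[Callable[[State], State]]) -> State:
    """Run steps in order, saving a checkpoint after each one completes."""
    index, state = load(conn, thread_id)  # resume point, or (0, {}) for a new thread
    while index < len(steps):
        state = {**state, **steps[index](state)}
        index += 1
        conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
            (thread_id, index, json.dumps(state)),
        )
        conn.commit()
    return state


calls = {"fetch": 0, "flaky": 0}


def fetch(state: State) -> State:
    calls["fetch"] += 1
    return {"data": [1, 2, 3]}


def flaky_total(state: State) -> State:
    calls["flaky"] += 1
    if calls["flaky"] == 1:
        raise ConnectionError("simulated crash")  # fails on the first attempt only
    return {"total": sum(state["data"])}


steps = [fetch, flaky_total]
conn = open_store()
try:
    run(conn, "thread-1", steps)
except ConnectionError:
    pass  # the run dies after fetch was already checkpointed

result = run(conn, "thread-1", steps)  # resumes at the second step
print(result, calls)
```

On the second call, `fetch` is not executed again: its output was restored from the checkpoint, which is the property that matters when a step is slow or expensive.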

Human-in-the-loop

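A graph can be told to interrupt before or after a given node (the interrupt_before and interrupt_after options at compile time), or a node can call interrupt() itself. The graph pauses, the checkpointer saves the state so it can be surfaced in a UI for human review, and execution resumes only when the human approves or edits it. This requires a checkpointer, since the paused state has to live somewhere. It is well suited to high-stakes decisions such as sending an email or executing a payment.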
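
The pause-and-resume flow can be modeled in a few lines of plain Python. This is a conceptual sketch rather than LangGraph's API: the `PausableRun` class, its `advance` method, and the email steps are hypothetical, and a real deployment would persist the paused state in a checkpoint store instead of keeping it in memory.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

State = Dict[str, Any]


@dataclass
class PausableRun:
    steps: List[Tuple[str, Callable[[State], State]]]  # (name, function), run in order
    interrupt_before: Set[str]                          # step names that need sign-off
    state: State = field(default_factory=dict)
    index: int = 0

    def advance(self, approved: bool = False,
                edits: Optional[State] = None) -> Optional[str]:
        """Run until done (returns None) or until a gated step is reached (returns its name)."""
        if edits:
            self.state.update(edits)  # reviewer's changes are applied before resuming
        while self.index < len(self.steps):
            name, fn = self.steps[self.index]
            if name in self.interrupt_before and not approved:
                return name           # pause: caller shows self.state to a reviewer
            approved = False          # an approval covers only the step it was given for
            self.state.update(fn(self.state))
            self.index += 1
        return None


sent: List[str] = []


def draft_email(state: State) -> State:
    return {"body": "Hi, your refund of $40 is approved."}


def send_email(state: State) -> State:
    sent.append(state["body"])  # stands in for the irreversible side effect
    return {"sent": True}


run = PausableRun(
    steps=[("draft", draft_email), ("send", send_email)],
    interrupt_before={"send"},
)

paused_at = run.advance()       # runs "draft", then stops before "send"
sent_while_paused = list(sent)  # nothing has been sent yet

# The reviewer edits the draft, then approves.
finished = run.advance(approved=True,
                       edits={"body": "Hi, your refund of $40 has been issued."})
print(paused_at, sent_while_paused, sent, finished)
```

The side effect happens only after the second `advance` call, and it uses the reviewer's edited text rather than the original draft.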

Sample graph

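from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.postgres import PostgresSaver  # pip install langgraph-checkpoint-postgres

class AgentState(TypedDict, total=False):
    needs_data: bool  # set by classify; other fields omitted here

# Node stubs: each takes the state and returns a dict of updates.
def classify(state): ...
def do_research(state): ...
def summarize(state): ...

g = StateGraph(AgentState)
g.add_node('classify', classify)
g.add_node('research', do_research)
g.add_node('summarize', summarize)
g.add_edge(START, 'classify')  # the graph needs an entry point
g.add_conditional_edges('classify', lambda s: 'research' if s.get('needs_data') else 'summarize')
g.add_edge('research', 'summarize')
g.add_edge('summarize', END)

DB_URI = 'postgresql://user:pass@localhost:5432/agent'  # placeholder connection string
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first use
    agent = g.compile(checkpointer=checkpointer)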

When NOT to use it

For simple chatbots, one-off tool calls, and stateless agents, LangGraph is overkill. Use it when you have multi-step workflows, need to checkpoint and resume, need human review, or have multiple agents collaborating. Below that complexity, raw API calls are simpler.

LangGraph = state machine + checkpointing + human-in-the-loop (HITL). It is the right tool for multi-step agents and overkill for one-shot LLM calls.