Thinking in LangGraph - LangChain Chinese documentation

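When you build an agent with LangGraph, you first break it down into discrete steps called nodes. You then describe the decisions and transitions for each node. Finally, you connect the nodes through a shared state that every node can read from and write to. This tutorial walks through the thought process of building a customer support email agent with LangGraph.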

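Start with the process you want to automate

Imagine you need to build an AI agent that handles customer support emails. Your product team has given you the following requirements:

The agent should:

- Read incoming customer emails
- Classify them by urgency and topic
- Search relevant documentation to answer questions
- Draft appropriate replies
- Escalate complex issues to human agents
- Schedule follow-ups when needed

Example scenarios to handle:

1. Simple product question: "How do I reset my password?"
2. Bug report: "The export feature crashes when I select PDF format"
3. Urgent billing issue: "I was charged twice for my subscription!"
4. Feature request: "Can you add dark mode to the mobile app?"
5. Complex technical issue: "Our API integration fails intermittently with 504 errors"

To implement an agent in LangGraph, you will usually follow the same five steps.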

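Step 1: Map out your workflow as discrete steps

Start by identifying the distinct steps in your process. Each step becomes a node (a function that does one specific thing). Then sketch how these steps connect to each other. The arrows in such a diagram show possible paths, but the actual decision about which path to take happens inside each node. Now that the workflow's components are identified, here is what each node needs to do:

Read Email: Extract and parse the email content
Classify Intent: Use an LLM to classify urgency and topic, then route to the appropriate action
Doc Search: Query the knowledge base for relevant information
Bug Track: Create or update an issue in the tracking system
Draft Reply: Generate an appropriate reply
Human Review: Escalate to a human agent for review or handling
Send Reply: Send the email reply

Note that some nodes decide where to go next (Classify Intent, Draft Reply, Human Review), while others always proceed to the same next step (Read Email always goes to Classify Intent, Doc Search always goes to Draft Reply).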
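
One way to make this map concrete before writing any LangGraph code is to write it down as plain data. The sketch below is only an illustration: the snake_case node names and the `POSSIBLE_NEXT` table are hypothetical, and the successors are taken from the routing described in this tutorial. It separates the nodes that must choose a path from the nodes that always hand off to the same successor.

```python
# Hypothetical adjacency map of the email workflow: node -> possible next nodes.
# The names are illustrative and mirror the steps listed above.
POSSIBLE_NEXT = {
    "read_email": ["classify_intent"],
    "classify_intent": ["doc_search", "bug_track", "draft_reply", "human_review"],
    "doc_search": ["draft_reply"],
    "bug_track": ["draft_reply"],
    "draft_reply": ["human_review", "send_reply"],
    "human_review": ["send_reply", "END"],
    "send_reply": ["END"],
}


def routing_nodes(graph: dict[str, list[str]]) -> list[str]:
    """Return the nodes that must choose among several successors."""
    return sorted(node for node, targets in graph.items() if len(targets) > 1)


def fixed_nodes(graph: dict[str, list[str]]) -> list[str]:
    """Return the nodes that always hand off to the same successor."""
    return sorted(node for node, targets in graph.items() if len(targets) == 1)


print(routing_nodes(POSSIBLE_NEXT))
print(fixed_nodes(POSSIBLE_NEXT))
```

The routing nodes are the ones that will later need decision logic inside them; the fixed nodes need nothing more than a single outgoing connection.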

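Step 2: Identify what each step needs to do

For each node in the graph, determine what type of operation it represents and what context it needs to work properly.

LLM steps

Use when you need to understand, analyze, generate text, or make reasoning decisions

Data steps

Use when you need to retrieve information from external sources

Action steps

Use when you need to perform external actions

User input steps

Use when you need human intervention

LLM steps

When a step needs to understand, analyze, generate text, or make reasoning decisions:

Classify intent

Static context (prompt): classification categories, urgency definitions, response format
Dynamic context (from state): email content, sender information
Desired outcome: a structured classification that determines routing

Draft reply

Static context (prompt): tone guidelines, company policies, response templates
Dynamic context (from state): classification results, search results, customer history
Desired outcome: a professional email reply that is ready for review

Data steps

When a step needs to retrieve information from external sources:

Document search

Parameters: a query built from the intent and topic
Retry strategy: yes, with exponential backoff for transient failures
Caching: common queries can be cached to reduce API calls

Customer history lookup

Parameters: the customer email address or ID from state
Retry strategy: yes, but fall back to basic information if the service is unavailable
Caching: yes, with a time-to-live to balance freshness and performance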

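Action steps

When a step needs to perform an external action:

Send reply

When to execute: after approval (human or automated)
Retry strategy: yes, with exponential backoff for network issues
Caching: no, each send is a unique action

Bug tracking

When to execute: always, when the intent is "bug"
Retry strategy: yes, it is critical not to lose bug reports
Returns: a ticket ID to include in the reply

User input steps

When a step needs human intervention:

Human review node

Decision context: original email, draft reply, urgency, classification
Expected input format: an approval boolean plus an optional edited reply
When triggered: high urgency, complex issues, or quality concerns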

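Step 3: Design your state

State is the shared memory that every node in your agent can access. Think of it as the notebook the agent uses to record what it learns and decides as it works through the process.

What belongs in state?

Ask yourself these questions about each piece of data:

Include in state

Does it need to persist across steps? If so, put it in state.

Don't store

Can you derive it from other data? If so, compute it when needed instead of storing it in state.

For our email agent, we need to track:

The original email and sender information (cannot be reconstructed later)
Classification results (needed by multiple downstream nodes)
Search results and customer data (expensive to re-fetch)
The draft reply (must persist through review)
Execution metadata (for debugging and recovery)

Keep state raw, format prompts on demand

A key principle: your state should store raw data, not formatted text. Format prompts inside the nodes that need them.

This separation means:

Different nodes can format the same data differently for their own needs
You can change prompt templates without modifying your state schema
Debugging is clearer, because you can see exactly what data each node received
Your agent can evolve without breaking existing state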
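
To illustrate the first point, here is a small sketch in which two consumers format the same raw state in different ways. The function names and the sample data are hypothetical; only the field names `search_results` and `customer_history` come from this tutorial.

```python
# Raw data only: a list of document chunks and a plain customer record.
raw_state = {
    "search_results": [
        "Reset password via Settings > Security > Change Password",
        "Password must be at least 12 characters",
    ],
    "customer_history": {"tier": "premium"},
}


def format_for_drafting(state: dict) -> str:
    """Bullet list suitable for inclusion in an LLM prompt."""
    docs = "\n".join(f"- {doc}" for doc in state["search_results"])
    return f"Relevant documentation:\n{docs}"


def format_for_review(state: dict) -> str:
    """Compact one-line summary for a human reviewer."""
    tier = state["customer_history"].get("tier", "standard")
    return f"{len(state['search_results'])} docs found; customer tier: {tier}"


print(format_for_drafting(raw_state))
print(format_for_review(raw_state))
```

Had the state stored the bullet list as a single string, the reviewer summary would have to parse it back apart; keeping the list raw avoids that.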

Let's define our state:

from typing import TypedDict, Literal

# Define the structure for email classification
class EmailClassification(TypedDict):
    intent: Literal["question", "bug", "billing", "feature", "complex"]
    urgency: Literal["low", "medium", "high", "critical"]
    topic: str
    summary: str

class EmailAgentState(TypedDict):
    # Raw email data
    email_content: str
    sender_email: str
    email_id: str

    # Classification result
    classification: EmailClassification | None

    # Raw search/API results
    search_results: list[str] | None  # List of raw document chunks
    customer_history: dict | None  # Raw customer data from CRM

    # Generated content
    draft_response: str | None
    messages: list | None  # Message objects (e.g. HumanMessage), as written by read_email

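Note that the state contains only raw data: no prompt templates, no formatted strings, no instructions. The classification output is stored as a single dictionary, straight from the LLM.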

Step 4: Build your nodes

Now we implement each step as a function. A node in LangGraph is just a Python function that takes the current state and returns updates to it.
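
The contract can be shown without the framework at all. In the sketch below, `count_words` plays the role of a node and `apply_update` is a hypothetical stand-in for the merge the framework performs; it assumes plain overwrite semantics, which is a simplification of how LangGraph applies updates.

```python
def count_words(state: dict) -> dict:
    """A node: read what it needs from state, return only the keys it changes."""
    return {"word_count": len(state["email_content"].split())}


def apply_update(state: dict, update: dict) -> dict:
    """Stand-in for the framework step that merges a node's update into state."""
    return {**state, **update}


state = {
    "email_content": "How do I reset my password?",
    "sender_email": "customer@example.com",
}
state = apply_update(state, count_words(state))
print(state["word_count"])
```

The node never mutates the state it receives and never returns keys it did not change, which keeps each node easy to test on its own.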

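Handle errors appropriately

Different errors call for different handling strategies:

Error type	Who fixes it	Strategy	When to use
Transient errors (network issues, rate limits)	System (automatic)	Retry policy	Temporary failures that usually resolve on retry
LLM-recoverable errors (tool failures, parsing issues)	LLM	Store the error in state and loop back	The LLM can see the error and adjust its approach
User-fixable errors (missing information, unclear instructions)	Human	Pause with `interrupt()`	User input is needed to proceed
Unexpected errors	Developer	Let them bubble up	Unknown issues that need debugging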

Add a retry policy to automatically retry on network issues and rate limits:

from langgraph.types import RetryPolicy

workflow.add_node(
    "search_documentation",
    search_documentation,
    retry_policy=RetryPolicy(max_attempts=3, initial_interval=1.0)
)
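
To get a feel for what a retry policy with exponential backoff implies for waiting time, the sketch below computes a delay schedule. It illustrates the general technique only and is not LangGraph's implementation: `backoff_delays` is a hypothetical helper, `max_attempts` and `initial_interval` mirror the call above, and `backoff_factor`, `max_interval`, and their defaults are assumptions of this sketch. Jitter is left out for clarity.

```python
def backoff_delays(
    max_attempts: int,
    initial_interval: float,
    backoff_factor: float = 2.0,
    max_interval: float = 128.0,
) -> list[float]:
    """Delays in seconds to wait before each retry; the first attempt has no delay."""
    delays = []
    for retry in range(max_attempts - 1):
        # Each retry waits backoff_factor times longer, capped at max_interval.
        delays.append(min(initial_interval * backoff_factor ** retry, max_interval))
    return delays


print(backoff_delays(max_attempts=3, initial_interval=1.0))
```

With three attempts there are at most two waits, so a node that keeps failing gives up after a few seconds rather than stalling the whole run.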

Store the error in state and loop back so the LLM can see what went wrong and try again:

from typing import Literal

from langgraph.types import Command


# State, run_tool, and ToolError are placeholders for your own state schema,
# tool runner, and tool exception type.
def execute_tool(state: State) -> Command[Literal["agent", "execute_tool"]]:
    try:
        result = run_tool(state['tool_call'])
        return Command(update={"tool_result": result}, goto="agent")
    except ToolError as e:
        # Let the LLM see what went wrong and try again
        return Command(
            update={"tool_result": f"Tool error: {str(e)}"},
            goto="agent"
        )

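Pause and collect information from the user when needed (such as an account ID, order number, or clarification):

from typing import Literal

from langgraph.types import Command, interrupt


# State and fetch_customer_history are placeholders for your own state schema and CRM client.
def lookup_customer_history(state: State) -> Command[Literal["lookup_customer_history", "draft_response"]]:
    if not state.get('customer_id'):
        user_input = interrupt({
            "message": "Customer ID needed",
            "request": "Please provide the customer's account ID to look up their subscription history"
        })
        # Store the ID and re-enter this node so the lookup below runs with it
        return Command(
            update={"customer_id": user_input['customer_id']},
            goto="lookup_customer_history"
        )
    # Now proceed with the lookup
    customer_data = fetch_customer_history(state['customer_id'])
    return Command(update={"customer_history": customer_data}, goto="draft_response")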

Let them bubble up for debugging. Don't catch what you can't handle:

def send_reply(state: EmailAgentState):
    try:
        email_service.send(state["draft_response"])
    except Exception:
        raise  # Surface unexpected errors

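Implementing our email agent nodes

We implement each node as a simple function. Remember: nodes take state, do work, and return updates.

Read and classify nodes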

from typing import Literal
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt, Command, RetryPolicy
from langchain_openai import ChatOpenAI
from langchain.messages import HumanMessage

llm = ChatOpenAI(model="gpt-5-nano")

def read_email(state: EmailAgentState) -> dict:
    """Extract and parse email content"""
    # In production, this would connect to your email service
    return {
        "messages": [HumanMessage(content=f"Processing email: {state['email_content']}")]
    }

def classify_intent(state: EmailAgentState) -> Command[Literal["search_documentation", "human_review", "draft_response", "bug_tracking"]]:
    """Use LLM to classify email intent and urgency, then route accordingly"""

    # Create structured LLM that returns EmailClassification dict
    structured_llm = llm.with_structured_output(EmailClassification)

    # Format the prompt on-demand, not stored in state
    classification_prompt = f"""
    Analyze this customer email and classify it:

    Email: {state['email_content']}
    From: {state['sender_email']}

    Provide classification including intent, urgency, topic, and summary.
    """

    # Get structured response directly as dict
    classification = structured_llm.invoke(classification_prompt)

    # Determine next node based on classification
    if classification['intent'] == 'billing' or classification['urgency'] == 'critical':
        goto = "human_review"
    elif classification['intent'] in ['question', 'feature']:
        goto = "search_documentation"
    elif classification['intent'] == 'bug':
        goto = "bug_tracking"
    else:
        goto = "draft_response"

    # Store classification as a single dict in state
    return Command(
        update={"classification": classification},
        goto=goto
    )

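Search and tracking nodes

class SearchAPIError(Exception):
    """Placeholder for the exception type raised by your search client."""

def search_documentation(state: EmailAgentState) -> Command[Literal["draft_response"]]:
    """Search knowledge base for relevant information"""

    # Build search query from classification (tolerate a missing or None value)
    classification = state.get('classification') or {}
    query = f"{classification.get('intent', '')} {classification.get('topic', '')}"  # Pass this to your search backend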

    try:
        # Implement your search logic here
        # Store raw search results, not formatted text
        search_results = [
            "Reset password via Settings > Security > Change Password",
            "Password must be at least 12 characters",
            "Include uppercase, lowercase, numbers, and symbols"
        ]
    except SearchAPIError as e:
        # For recoverable search errors, store error and continue
        search_results = [f"Search temporarily unavailable: {str(e)}"]

    return Command(
        update={"search_results": search_results},  # Store raw results or error
        goto="draft_response"
    )

def bug_tracking(state: EmailAgentState) -> Command[Literal["draft_response"]]:
    """Create or update bug tracking ticket"""

    # Create ticket in your bug tracking system
    ticket_id = "BUG-12345"  # Would be created via API

    return Command(
        update={
            # current_step is not a field of EmailAgentState, so it is not set here
            "search_results": [f"Bug ticket {ticket_id} created"]
        },
        goto="draft_response"
    )

Response nodes

def draft_response(state: EmailAgentState) -> Command[Literal["human_review", "send_reply"]]:
    """Generate response using context and route based on quality"""

    classification = state.get('classification', {})

    # Format context from raw state data on-demand
    context_sections = []

    if state.get('search_results'):
        # Format search results for the prompt
        formatted_docs = "\n".join([f"- {doc}" for doc in state['search_results']])
        context_sections.append(f"Relevant documentation:\n{formatted_docs}")

    if state.get('customer_history'):
        # Format customer data for the prompt
        context_sections.append(f"Customer tier: {state['customer_history'].get('tier', 'standard')}")

    # Build the prompt with formatted context
    draft_prompt = f"""
    Draft a response to this customer email:
    {state['email_content']}

    Email intent: {classification.get('intent', 'unknown')}
    Urgency level: {classification.get('urgency', 'medium')}

    {chr(10).join(context_sections)}

    Guidelines:
    - Be professional and helpful
    - Address their specific concern
    - Use the provided documentation when relevant
    """

    response = llm.invoke(draft_prompt)

    # Determine if human review needed based on urgency and intent
    needs_review = (
        classification.get('urgency') in ['high', 'critical'] or
        classification.get('intent') == 'complex'
    )

    # Route to appropriate next node
    goto = "human_review" if needs_review else "send_reply"

    return Command(
        update={"draft_response": response.content},  # Store only the raw response
        goto=goto
    )

def human_review(state: EmailAgentState) -> Command[Literal["send_reply", END]]:
    """Pause for human review using interrupt and route based on decision"""

    classification = state.get('classification', {})

    # interrupt() must come first - any code before it will re-run on resume
    human_decision = interrupt({
        "email_id": state.get('email_id',''),
        "original_email": state.get('email_content',''),
        "draft_response": state.get('draft_response',''),
        "urgency": classification.get('urgency'),
        "intent": classification.get('intent'),
        "action": "Please review and approve/edit this response"
    })

    # Now process the human's decision
    if human_decision.get("approved"):
        return Command(
            update={"draft_response": human_decision.get("edited_response", state.get('draft_response',''))},
            goto="send_reply"
        )
    else:
        # Rejection means human will handle directly
        return Command(update={}, goto=END)

def send_reply(state: EmailAgentState) -> dict:
    """Send the email response"""
    # Integrate with email service
    print(f"Sending reply: {state['draft_response'][:100]}...")
    return {}

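Step 5: Wire it together

Now we connect the nodes into a working graph. Because our nodes handle their own routing decisions, we only need a few essential edges. To enable human-in-the-loop with interrupt(), we need to compile with a checkpointer so that state is saved between runs:

Graph compilation code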

from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import RetryPolicy

# Create the graph
workflow = StateGraph(EmailAgentState)

# Add nodes with appropriate error handling
workflow.add_node("read_email", read_email)
workflow.add_node("classify_intent", classify_intent)

# Add retry policy for nodes that might have transient failures
workflow.add_node(
    "search_documentation",
    search_documentation,
    retry_policy=RetryPolicy(max_attempts=3)
)
workflow.add_node("bug_tracking", bug_tracking)
workflow.add_node("draft_response", draft_response)
workflow.add_node("human_review", human_review)
workflow.add_node("send_reply", send_reply)

# Add only the essential edges
workflow.add_edge(START, "read_email")
workflow.add_edge("read_email", "classify_intent")
workflow.add_edge("send_reply", END)

# Compile with a checkpointer for persistence.
# If you run the graph with Local_Server, compile without a checkpointer instead.
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

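The graph structure is minimal because routing happens inside the nodes through Command objects. Each node declares where it can go with a type hint such as Command[Literal["node1", "node2"]], which makes the flow explicit and traceable.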
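
The idea that each node names its own successor can be demonstrated with a toy runner in plain Python. Everything in this sketch is hypothetical and much simpler than LangGraph itself: `Step` stands in for a Command, `run` stands in for the graph executor, and the keyword-based `classify` replaces the LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """Toy stand-in for a Command: a state update plus the next node's name."""
    update: dict = field(default_factory=dict)
    goto: str = "END"


def classify(state: dict) -> Step:
    # Route on the data itself, as classify_intent does above.
    intent = "billing" if "charged" in state["email_content"].lower() else "question"
    return Step(
        update={"intent": intent},
        goto="human_review" if intent == "billing" else "draft",
    )


def draft(state: dict) -> Step:
    return Step(update={"draft": f"Re: {state['intent']}"}, goto="END")


def human_review(state: dict) -> Step:
    return Step(update={"needs_human": True}, goto="END")


NODES = {"classify": classify, "draft": draft, "human_review": human_review}


def run(state: dict, start: str = "classify") -> tuple[dict, list[str]]:
    """Follow each node's goto until END, recording the path taken."""
    current, path = start, []
    while current != "END":
        path.append(current)
        step = NODES[current](state)
        state = {**state, **step.update}
        current = step.goto
    return state, path


final_state, path = run({"email_content": "I was charged twice!"})
print(path)
```

No edge table is consulted anywhere in `run`; the path is determined entirely by what each node returns.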

Try out your agent

Let's run our agent with an urgent billing issue that needs human review:

Testing the agent

# Test with an urgent billing issue
initial_state = {
    "email_content": "I was charged twice for my subscription! This is urgent!",
    "sender_email": "customer@example.com",
    "email_id": "email_123",
    "messages": []
}

# Run with a thread_id for persistence
config = {"configurable": {"thread_id": "customer_123"}}
result = app.invoke(initial_state, config)
# The graph will pause at human_review
print(f"human review interrupt:{result['__interrupt__']}")

# When ready, provide human input to resume
from langgraph.types import Command

human_response = Command(
    resume={
        "approved": True,
        "edited_response": "We sincerely apologize for the double charge. I've initiated an immediate refund..."
    }
)

# Resume execution
final_result = app.invoke(human_response, config)
print("Email sent successfully!")

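When the graph hits interrupt(), it pauses, saves everything to the checkpointer, and waits. It can resume days later, picking up exactly where it left off. The thread_id ensures that all state for this conversation is kept together.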

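Summary and next steps

Key insights

Building this email agent has shown us the LangGraph way of thinking:

Break into discrete steps

Each node does one thing well. This decomposition enables streaming progress updates, durable execution that can pause and resume, and clear debugging, because you can inspect state between steps.

State is shared memory

Store raw data, not formatted text. This lets different nodes use the same information in different ways.

Nodes are functions

They take state, do work, and return updates. When they need to make routing decisions, they specify both the state updates and the next destination.

Errors are part of the flow

Transient failures get retries, LLM-recoverable errors loop back with context, user-fixable problems pause for input, and unexpected errors bubble up for debugging.

Human input is first-class

The interrupt() function pauses execution indefinitely, saves all state, and resumes exactly where it left off when you provide input. When combined with other operations in a node, it must come first.

Graph structure emerges naturally

You define the essential connections, and your nodes handle their own routing logic. This keeps control flow explicit and traceable: you can always understand what your agent will do next by looking at the current node.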

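Advanced considerations

Node granularity trade-offs

This section explores the trade-offs in node granularity design. Most applications can skip it and use the patterns shown above.

You might wonder: why not combine Read Email and Classify Intent into one node? Or why separate Doc Search from Draft Reply? The answer involves trade-offs between resilience and observability.

Resilience considerations: LangGraph's durable execution creates checkpoints at node boundaries. When a workflow resumes after an interruption or failure, it starts from the beginning of the node where execution stopped. Smaller nodes mean more frequent checkpoints, which means less work to repeat if something goes wrong. If you combine multiple operations into one large node, a failure near the end means re-executing everything in that node from the start.

Why we chose this breakdown for the email agent: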

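Isolation of external services: Doc Search and Bug Track are separate nodes because they call external APIs. If the search service is slow or fails, we want to isolate that from the LLM calls. We can add retry policies to these specific nodes without affecting others.
Intermediate visibility: Having Classify Intent as its own node lets us inspect what the LLM decided before taking action. This is valuable for debugging and monitoring, since you can see exactly when and why the agent routes to human review.
Different failure modes: LLM calls, database lookups, and email sending need different retry strategies. Separate nodes let you configure these independently.
Reusability and testing: Smaller nodes are easier to test in isolation and to reuse in other workflows.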
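
The resilience argument can be checked with a small simulation. This is a toy model, not LangGraph's executor: `run_with_checkpoints` is a hypothetical function that assumes a checkpoint after every node, one failure at a chosen operation, and a resume that restarts the failed node from its beginning.

```python
def run_with_checkpoints(nodes: list[list[str]], fail_at: str) -> list[str]:
    """Simulate one failure at operation `fail_at`, then a resume.

    `nodes` groups operations into nodes; a checkpoint is taken after each node.
    Returns every operation executed, including work repeated after the resume.
    """
    executed: list[str] = []
    failed_once = False
    index = 0  # index of the next node to run, i.e. the last checkpoint
    while index < len(nodes):
        try:
            for op in nodes[index]:
                if op == fail_at and not failed_once:
                    failed_once = True
                    raise RuntimeError(f"{op} failed")
                executed.append(op)
        except RuntimeError:
            continue  # resume: restart the same node from its beginning
        index += 1
    return executed


fine = run_with_checkpoints([["read"], ["classify"], ["search"], ["draft"]], fail_at="draft")
coarse = run_with_checkpoints([["read", "classify", "search", "draft"]], fail_at="draft")
print(len(fine), len(coarse))
```

With one operation per node, nothing that already succeeded is repeated. With all four operations in one node, the three that succeeded before the failure run a second time.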

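A different valid approach: You could combine Read Email and Classify Intent into a single node. You would lose the ability to inspect the raw email before classification, and a failure anywhere in that node would repeat both operations. For most applications, the observability and debugging benefits of separate nodes are worth the trade-off.

Application-level concerns: The caching discussion in Step 2 (whether to cache search results) is an application-level decision, not a LangGraph framework feature. You implement caching inside your node functions based on your specific requirements; LangGraph does not prescribe this.

Performance considerations: More nodes does not mean slower execution. By default, LangGraph writes checkpoints in the background (async durability mode), so your graph keeps running without waiting for checkpoints to complete. You get frequent checkpoints with minimal performance impact. You can adjust this behavior if needed: use "exit" mode to checkpoint only on completion, or "sync" mode to block execution until each checkpoint is written.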

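Where to go from here

This was an introduction to thinking about building agents with LangGraph. You can extend this foundation with:

Human-in-the-loop patterns

Learn how to add tool approval before execution, batch approval, and other patterns

Subgraphs

Create subgraphs for complex multi-step operations

Streaming

Add streaming to show real-time progress to users

Observability

Add observability with LangSmith for debugging and monitoring

Tool integration

Integrate more tools for web search, database queries, and API calls

Retry logic

Implement retry logic with exponential backoff for failed operations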
