Agent Tool-Calling Patterns and Function Calling

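Original content by the Lingque (灵阙) Teaching and Research Team

S | Featured | Advanced | Best practices | About 8 min read | Updated 2026-02-28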


Tool schema design, parallel tool calls, structured output, error recovery, and tool selection strategies

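Introduction

Tool calling (function calling / tool use) is the step that takes an agent from "can only talk" to "can get things done." An LLM on its own can only generate text, but through tool calls it can query databases, call APIs, and operate on the file system, turning reasoning into action.

But tool calling is far more than "passing a function name and arguments to the model." How should a tool schema be designed so that the model understands it accurately? How can multiple tools be called in parallel? How do you recover gracefully when a tool fails? These engineering questions determine how usable an agent is in practice.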

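Tool Schema Design

Schema design principles

Principle	Description	Good example	Bad example
Clear naming	Verb + noun, understandable at a glance	`search_products`	`do_thing`
Sufficient description	States when to use the tool and its limits	"Search by name, max 50"	"Search"
Precise parameters	Type + constraints + default + enum	`limit: int, 1-100, default 10`	`limit: any`
Right granularity	One tool does one thing	`get_user` + `update_user`	`manage_user`
Idempotency first	Read operations have no side effects	GET request	Stateless cleanup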
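
The principles in the table can be turned into a mechanical check. The sketch below is one possible lint pass over a JSON-Schema-style tool definition. The function name `lint_tool_schema`, the verb list, and the 20-character description threshold are illustrative assumptions, not part of any SDK.

```python
import re

# Hypothetical verb list used to check the "verb + noun" naming rule.
KNOWN_VERBS = {"get", "list", "search", "create", "update", "delete", "send", "calculate"}


def lint_tool_schema(tool: dict) -> list[str]:
    """Return a list of human-readable problems found in a tool definition."""
    problems = []
    name = tool.get("name", "")
    parts = name.split("_")
    # Naming: snake_case, starting with a known verb, followed by at least one more word.
    if not re.fullmatch(r"[a-z]+(_[a-z]+)+", name) or parts[0] not in KNOWN_VERBS:
        problems.append(f"name '{name}' is not verb_noun")
    # Description: long enough to say when to use the tool and what its limits are.
    if len(tool.get("description", "")) < 20:
        problems.append("description is too short")
    # Parameters: each one needs at least a type and a description.
    props = tool.get("parameters", {}).get("properties", {})
    for param, spec in props.items():
        if "type" not in spec:
            problems.append(f"parameter '{param}' has no type")
        if not spec.get("description"):
            problems.append(f"parameter '{param}' has no description")
    return problems


good = {
    "name": "search_products",
    "description": "Search the catalog by name. Returns at most 50 items.",
    "parameters": {
        "type": "object",
        "properties": {
            "limit": {"type": "integer", "minimum": 1, "maximum": 50,
                      "default": 10, "description": "Max results"},
        },
    },
}
bad = {
    "name": "do_thing",
    "description": "Search",
    "parameters": {"type": "object", "properties": {"limit": {}}},
}

print(lint_tool_schema(good))  # []
for problem in lint_tool_schema(bad):
    print(problem)
```

A check like this could run in CI so that a vague tool definition is caught before it ever reaches the model.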

TypeScript tool definitions

// src/tools/definitions.ts
import { z } from "zod";

// Good: Clear schema with descriptions, constraints, and enums
const searchProductsTool = {
  name: "search_products",
  description: `Search the product catalog by query string.
Returns up to 'limit' products matching the search criteria.
Use this when the user asks about available products, prices, or product details.
Do NOT use this for order-related queries (use search_orders instead).`,
  parameters: z.object({
    query: z.string()
      .min(1)
      .max(200)
      .describe("Search query: product name, category, or keywords"),
    category: z.enum(["electronics", "clothing", "food", "books", "all"])
      .default("all")
      .describe("Filter by product category"),
    price_min: z.number()
      .min(0)
      .optional()
      .describe("Minimum price in USD"),
    price_max: z.number()
      .min(0)
      .max(100000)
      .optional()
      .describe("Maximum price in USD"),
    sort_by: z.enum(["relevance", "price_asc", "price_desc", "rating"])
      .default("relevance")
      .describe("Sort order for results"),
    limit: z.number()
      .int()
      .min(1)
      .max(50)
      .default(10)
      .describe("Maximum number of results to return"),
  }),
};

// Tool for creating orders (with confirmation requirement)
const createOrderTool = {
  name: "create_order",
  description: `Create a new order for the customer.
IMPORTANT: Always confirm the order details with the user before calling this tool.
This action is NOT reversible. Returns order ID on success.`,
  parameters: z.object({
    product_id: z.string()
      .describe("Product ID from search results"),
    quantity: z.number()
      .int()
      .min(1)
      .max(100)
      .describe("Number of items to order"),
    shipping_address_id: z.string()
      .describe("ID of saved shipping address"),
    payment_method_id: z.string()
      .describe("ID of saved payment method"),
  }),
};

Python tool definitions

# src/tools/product_tools.py
from langchain_core.tools import tool
from pydantic import BaseModel, Field
from typing import Optional, Literal

class SearchProductsInput(BaseModel):
    query: str = Field(
        ...,
        description="Search query: product name, category, or keywords",
        min_length=1,
        max_length=200,
    )
    category: Literal["electronics", "clothing", "food", "books", "all"] = Field(
        default="all",
        description="Filter by product category",
    )
    price_min: Optional[float] = Field(
        default=None,
        description="Minimum price in USD",
        ge=0,
    )
    price_max: Optional[float] = Field(
        default=None,
        description="Maximum price in USD",
        ge=0,
        le=100000,
    )
    limit: int = Field(
        default=10,
        description="Maximum number of results to return",
        ge=1,
        le=50,
    )

@tool(args_schema=SearchProductsInput)
async def search_products(
    query: str,
    category: str = "all",
    price_min: Optional[float] = None,
    price_max: Optional[float] = None,
    limit: int = 10,
) -> str:
    """Search the product catalog by query string.
    Returns product names, prices, ratings, and IDs.
    Use this when the user asks about available products.
    Do NOT use this for order-related queries."""

    # product_service is assumed to be an application-level async client defined elsewhere
    products = await product_service.search(
        query=query,
        category=None if category == "all" else category,
        price_range=(price_min, price_max),
        limit=limit,
    )

    if not products:
        return "No products found matching your criteria."

    results = []
    for p in products:
        results.append(
            f"- {p.name} (ID: {p.id}): ${p.price:.2f}, Rating: {p.rating}/5"
        )

    return f"Found {len(products)} products:\n" + "\n".join(results)

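Parallel Tool Calls

Model-native parallelism

# OpenAI and Anthropic both support parallel tool calls natively.
# weather_tool_schema and calculator_tool_schema are assumed to be defined elsewhere.
# The top-level await below needs an async context (an async function or a notebook).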
from openai import AsyncOpenAI

client = AsyncOpenAI()

response = await client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo and London, and calculate 42*17?"},
    ],
    tools=[weather_tool_schema, calculator_tool_schema],
    parallel_tool_calls=True,  # Enabled by default
)

# Model returns multiple tool calls in a single response
# response.choices[0].message.tool_calls = [
#   ToolCall(id="call_1", function=Function(name="get_weather", arguments='{"city":"Tokyo"}')),
#   ToolCall(id="call_2", function=Function(name="get_weather", arguments='{"city":"London"}')),
#   ToolCall(id="call_3", function=Function(name="calculate", arguments='{"expression":"42*17"}')),
# ]
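
Once the model has returned several tool calls, each call needs its own `tool` message carrying the matching `tool_call_id` before the next model request; OpenAI's Chat Completions API rejects a follow-up request that leaves a call unanswered. The sketch below shows that bookkeeping with plain dicts in the Chat Completions message shape. The helper name `append_tool_results` and the sample weather strings are made up for illustration.

```python
def append_tool_results(
    messages: list[dict],
    assistant_message: dict,
    results: dict[str, str],
) -> list[dict]:
    """Return a new message list with the assistant turn and one tool message per call.

    `results` maps tool_call_id -> tool output text.
    """
    call_ids = [call["id"] for call in assistant_message.get("tool_calls", [])]
    missing = [cid for cid in call_ids if cid not in results]
    if missing:
        # Every tool call must be answered before the conversation continues.
        raise ValueError(f"missing results for tool calls: {missing}")
    tool_messages = [
        {"role": "tool", "tool_call_id": cid, "content": results[cid]}
        for cid in call_ids
    ]
    return messages + [assistant_message] + tool_messages


history = [{"role": "user", "content": "What's the weather in Tokyo and London?"}]
assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city":"Tokyo"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city":"London"}'}},
    ],
}
# Illustrative tool outputs, not real data.
followup = append_tool_results(
    history, assistant, {"call_1": "18C, clear", "call_2": "11C, rain"}
)
print(len(followup))  # 4
```

The returned list can be passed as `messages` in the next request so the model can compose its final answer from all results at once.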

Parallel execution engine

# src/tools/parallel_executor.py
import asyncio
import json
from typing import Any, Awaitable, Callable

class ParallelToolExecutor:
    """Execute multiple tool calls concurrently."""

    def __init__(
        self,
        tools: dict[str, Callable[..., Awaitable[Any]]],
        max_concurrent: int = 5,
    ):
        self.tools = tools
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def execute_all(
        self,
        tool_calls: list[dict],
        timeout: float = 30.0,
    ) -> list[dict]:
        """Execute all tool calls in parallel with timeout."""

        async def execute_one(call: dict) -> dict:
            async with self.semaphore:
                tool_name = call["function"]["name"]
                tool_fn = self.tools.get(tool_name)

                if not tool_fn:
                    return {
                        "tool_call_id": call["id"],
                        "role": "tool",
                        "content": f"Error: Unknown tool '{tool_name}'",
                    }

                try:
                    args = json.loads(call["function"]["arguments"])
                    result = await asyncio.wait_for(
                        tool_fn(**args),
                        timeout=timeout,
                    )
                    return {
                        "tool_call_id": call["id"],
                        "role": "tool",
                        "content": str(result),
                    }
                except asyncio.TimeoutError:
                    return {
                        "tool_call_id": call["id"],
                        "role": "tool",
                        "content": f"Error: Tool '{tool_name}' timed out after {timeout}s",
                    }
                except Exception as e:
                    return {
                        "tool_call_id": call["id"],
                        "role": "tool",
                        "content": f"Error executing '{tool_name}': {str(e)}",
                    }

        # Execute all calls concurrently
        results = await asyncio.gather(
            *[execute_one(call) for call in tool_calls],
            return_exceptions=False,
        )

        return results
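
To see the executor pattern end to end, here is a condensed, self-contained sketch of the same idea (semaphore, per-call timeout, errors returned as tool messages) driven by stand-in tools. `get_weather` and `multiply` are fake tools that only sleep and return a string, and `run_tool_calls` is a hypothetical function-style variant of the class above.

```python
import asyncio
import json
import time


async def get_weather(city: str) -> str:
    await asyncio.sleep(0.2)  # Simulated network latency
    return f"weather report for {city}"


async def multiply(a: int, b: int) -> str:
    await asyncio.sleep(0.2)  # Simulated work
    return str(a * b)


async def run_tool_calls(tools, tool_calls, max_concurrent=5, timeout=1.0):
    """Run all tool calls concurrently; results keep the order of tool_calls."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(call: dict) -> dict:
        name = call["function"]["name"]
        fn = tools.get(name)
        if fn is None:
            content = f"Error: Unknown tool '{name}'"
        else:
            try:
                args = json.loads(call["function"]["arguments"])
                async with semaphore:
                    content = str(await asyncio.wait_for(fn(**args), timeout))
            except asyncio.TimeoutError:
                content = f"Error: Tool '{name}' timed out after {timeout}s"
            except Exception as exc:
                content = f"Error executing '{name}': {exc}"
        return {"tool_call_id": call["id"], "role": "tool", "content": content}

    return await asyncio.gather(*(run_one(call) for call in tool_calls))


calls = [
    {"id": "call_1", "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}},
    {"id": "call_2", "function": {"name": "get_weather", "arguments": '{"city": "London"}'}},
    {"id": "call_3", "function": {"name": "multiply", "arguments": '{"a": 42, "b": 17}'}},
    {"id": "call_4", "function": {"name": "missing_tool", "arguments": "{}"}},
]
tools = {"get_weather": get_weather, "multiply": multiply}

start = time.perf_counter()
results = asyncio.run(run_tool_calls(tools, calls))
elapsed = time.perf_counter() - start

for message in results:
    print(message["tool_call_id"], "->", message["content"])
print(f"elapsed: {elapsed:.2f}s")
```

Because the three real calls overlap, the elapsed time should be close to one sleep interval rather than the sum of all three. The unknown tool is reported as an error message instead of raising, so the model still receives one answer per call.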

Structured Output

Enforcing JSON Schema output

# Using OpenAI Structured Outputs
from pydantic import BaseModel
from openai import AsyncOpenAI

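class RecommendedProduct(BaseModel):
    # Illustrative fields; adapt them to your own catalog
    name: str
    price_usd: float
    reason: str

class ProductRecommendation(BaseModel):
    # Strict structured outputs need every object fully specified,
    # so use an explicit model here rather than a bare dict
    products: list[RecommendedProduct]
    reasoning: str
    confidence: float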

client = AsyncOpenAI()

response = await client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a product recommendation engine."},
        {"role": "user", "content": "Recommend laptops under $1000 for coding"},
    ],
    response_format=ProductRecommendation,
)

recommendation = response.choices[0].message.parsed
# Type-safe access: recommendation.products, recommendation.reasoning

Anthropic tool output

# Using Anthropic tool_choice for structured output
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{
        "name": "structured_response",
        "description": "Output a structured analysis",
        "input_schema": {
            "type": "object",
            "properties": {
                "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                "key_topics": {"type": "array", "items": {"type": "string"}},
                "summary": {"type": "string"},
                "confidence": {"type": "number", "minimum": 0, "maximum": 1},
            },
            "required": ["sentiment", "key_topics", "summary", "confidence"],
        },
    }],
    tool_choice={"type": "tool", "name": "structured_response"},
    messages=[{"role": "user", "content": "Analyze this review: ..."}],
)

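# The response will always use the structured_response tool.
# Look the tool_use block up by type rather than assuming it is the first content block.
tool_use = next(block for block in response.content if block.type == "tool_use")
result = tool_use.input  # Already-parsed dict shaped by input_schema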
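
Depending on the provider and mode, a tool schema may guide the model without being a hard guarantee, so a cheap defensive check on the parsed input is often worthwhile. The sketch below locates the `tool_use` block and validates two fields from the schema above. `extract_structured` is a hypothetical helper, and the content blocks are simulated with `SimpleNamespace` objects rather than a real API response.

```python
from types import SimpleNamespace

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def extract_structured(content_blocks, tool_name: str) -> dict:
    """Find the tool_use block for tool_name and sanity-check its input."""
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            data = block.input
            break
    else:
        raise ValueError(f"no tool_use block named '{tool_name}'")

    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {data.get('sentiment')!r}")
    confidence = data.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        raise ValueError(f"confidence out of range: {confidence!r}")
    return data


# Simulated response content: a text block followed by the forced tool call.
fake_content = [
    SimpleNamespace(type="text", text="Here is the analysis."),
    SimpleNamespace(
        type="tool_use",
        name="structured_response",
        input={
            "sentiment": "positive",
            "key_topics": ["battery life"],
            "summary": "The reviewer likes the battery life.",
            "confidence": 0.9,
        },
    ),
]

analysis = extract_structured(fake_content, "structured_response")
print(analysis["sentiment"], analysis["confidence"])  # positive 0.9
```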

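Error Recovery Strategies

Three-tier error handling

# src/tools/error_recovery.py
import asyncio
import json

# classify_error(), self.retry_tool(), and the llm client are assumed to be defined elsewhere.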

class ToolErrorHandler:
    """Three-tier error recovery for tool calls."""

    async def handle_tool_error(
        self,
        tool_name: str,
        error: Exception,
        original_args: dict,
        messages: list,
        retry_count: int = 0,
    ) -> dict:
        """
        Tier 1: Auto-retry with same args (transient errors)
        Tier 2: Ask LLM to fix args (parameter errors)
        Tier 3: Report to user (unrecoverable)
        """

        error_type = classify_error(error)

        # Tier 1: Transient errors -> retry
        if error_type == "transient" and retry_count < 2:
            await asyncio.sleep(2 ** retry_count)
            return await self.retry_tool(tool_name, original_args)

        # Tier 2: Parameter errors -> ask LLM to fix
        if error_type == "parameter" and retry_count < 1:
            fixed_args = await self.ask_llm_to_fix(
                tool_name, original_args, str(error), messages
            )
            return await self.retry_tool(tool_name, fixed_args)

        # Tier 3: Unrecoverable -> inform the model
        # The caller must add the matching "tool_call_id" before sending this message.
        return {
            "role": "tool",
            "content": f"Tool '{tool_name}' failed: {error}. "
                      f"Please try a different approach or inform the user.",
        }

    async def ask_llm_to_fix(
        self,
        tool_name: str,
        args: dict,
        error: str,
        messages: list,
    ) -> dict:
        """Ask LLM to correct the tool arguments."""
        fix_prompt = f"""The tool call failed. Fix the arguments.

Tool: {tool_name}
Arguments: {json.dumps(args)}
Error: {error}

Return corrected arguments as JSON."""

        response = await llm.ainvoke(fix_prompt)
        return json.loads(response.content)
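
The handler above calls `classify_error`, which the article does not define. One possible version maps exception types to the three tiers, paired with a small backoff loop for the transient tier. The mapping, the names `call_with_retry` and `flaky_lookup`, and the short demo delay are all assumptions made for illustration.

```python
import asyncio


def classify_error(error: Exception) -> str:
    """Map an exception to 'transient', 'parameter', or 'fatal' (assumed mapping)."""
    if isinstance(error, (TimeoutError, ConnectionError, asyncio.TimeoutError)):
        return "transient"
    # TypeError covers wrong keyword arguments; ValueError covers bad values and bad JSON.
    if isinstance(error, (TypeError, ValueError)):
        return "parameter"
    return "fatal"


async def call_with_retry(tool_fn, args: dict, max_retries: int = 2,
                          base_delay: float = 0.01):
    """Retry transient failures with exponential backoff; re-raise everything else."""
    for attempt in range(max_retries + 1):
        try:
            return await tool_fn(**args)
        except Exception as error:
            if classify_error(error) != "transient" or attempt == max_retries:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)


attempts = {"count": 0}


async def flaky_lookup(order_id: str) -> str:
    """Stand-in tool that fails twice with a transient error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("upstream unavailable")
    return f"order {order_id}: shipped"


result = asyncio.run(call_with_retry(flaky_lookup, {"order_id": "A1"}))
print(result, attempts["count"])  # order A1: shipped 3
```

The demo uses a 10 ms base delay so it finishes quickly; the handler in the article waits `2 ** retry_count` seconds, which is a more realistic scale for real services.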

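Tool Selection Strategies

Dynamic tool sets

# src/tools/tool_selector.py
# embed() and cosine_similarity() are assumed to be provided elsewhere
# (for example by an embedding client and a small math helper).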

class DynamicToolSelector:
    """Select relevant tools based on conversation context."""

    def __init__(self, all_tools: list, max_tools: int = 10):
        self.all_tools = all_tools
        self.max_tools = max_tools
        self.tool_embeddings = {}

    async def select_tools(self, query: str, context: str = "") -> list:
        """Select the most relevant tools for the current query."""

        # Always include core tools
        core_tools = [t for t in self.all_tools if t.metadata.get("always_available")]

        # Semantic matching for optional tools
        optional_tools = [t for t in self.all_tools if not t.metadata.get("always_available")]

        if not optional_tools:
            return core_tools

        query_embedding = await embed(query + " " + context)

        scored = []
        for tool in optional_tools:
            tool_embedding = await self.get_tool_embedding(tool)
            similarity = cosine_similarity(query_embedding, tool_embedding)
            scored.append((similarity, tool))

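        scored.sort(key=lambda x: x[0], reverse=True)
        # Clamp at zero: a negative slice bound would select tools instead of none
        remaining = max(0, self.max_tools - len(core_tools))
        selected = [t for _, t in scored[:remaining]]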

        return core_tools + selected

    async def get_tool_embedding(self, tool) -> list[float]:
        key = tool.name
        if key not in self.tool_embeddings:
            text = f"{tool.name}: {tool.description}"
            self.tool_embeddings[key] = await embed(text)
        return self.tool_embeddings[key]
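
The selector relies on `embed` and `cosine_similarity`, neither of which appears in the article. The sketch below fills them in with a pure-Python cosine similarity and a toy bag-of-words embedding so the ranking step can be run locally. `toy_embed`, `select_top_k`, and the tool descriptions are hypothetical; a real system would call an embedding model instead.

```python
import math
import re


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def toy_embed(text: str, vocabulary: list[str]) -> list[float]:
    """Bag-of-words counts over a fixed vocabulary; a stand-in for a real embedding."""
    words = tokenize(text)
    return [float(words.count(term)) for term in vocabulary]


def select_top_k(query: str, tool_descriptions: dict[str, str], k: int) -> list[str]:
    """Rank tools by similarity between the query and each description."""
    texts = [query, *tool_descriptions.values()]
    vocabulary = sorted({word for text in texts for word in tokenize(text)})
    query_vec = toy_embed(query, vocabulary)
    scored = sorted(
        (
            (cosine_similarity(query_vec, toy_embed(desc, vocabulary)), name)
            for name, desc in tool_descriptions.items()
        ),
        reverse=True,
    )
    return [name for _, name in scored[: max(0, k)]]


tool_descriptions = {
    "search_products": "search the product catalog by name or category",
    "search_orders": "look up customer orders and shipping status",
    "get_weather": "get the current weather forecast for a city",
}

print(select_top_k("what is the shipping status of my orders", tool_descriptions, k=1))
# ['search_orders']
```

Word overlap is a crude proxy for semantic similarity, which is why the article's version uses embeddings; the ranking and truncation logic is the same either way.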

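Design Checklist

Check item	Requirement	Priority
Tool naming	Verb + noun, clear and unambiguous	Required
Parameter descriptions	Every parameter has a description	Required
Type constraints	Enums / ranges / defaults	Required
Error handling	A failed tool returns a meaningful error message	Required
Idempotent design	Read operations have no side effects	Recommended
Parallel calls	Independent tools support parallel execution	Recommended
Timeout protection	Every tool has an execution timeout	Required
Tool count	No more than 20 tool definitions per call	Recommended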

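Summary

Schema quality determines call accuracy: clear naming, sufficient descriptions, and precise type constraints are the foundation of successful tool calls.
Parallel calls raise throughput: independent tool calls should run in parallel, with a semaphore limiting concurrency.
Structured output removes parsing risk: enforcing the output format with a schema is far more reliable than letting the model "improvise" and parsing the result afterwards.
Error recovery should be tiered: retry transient errors automatically, let the LLM correct parameter errors, and report unrecoverable errors gracefully.
Keep the tool count in check: giving the model too many tools lowers selection accuracy, so use dynamic tool selection to stay within 10-15 tools.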

Maurice | [email protected]

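Derived content (generated by NotebookLM)

A slide outline, blog summary, short-video script, and Deep Dive podcast generated from this article for reuse across formats.

Slide outline (5-8 slides)

Agent Tool-Calling Patterns and Function Calling: PPT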

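Slide 1: Agent tool calling: from "can only talk" to "can get things done"

Core value: Tool calling (Function Calling) is the key step that takes an LLM from merely generating text to taking real actions such as querying databases and operating on files [1].
Engineering challenges: Passing a function name and arguments is not enough; engineering work on schema design, parallel calls, and error recovery directly determines how usable an agent is in practice [1].
Key elements: A highly available agent depends on high-quality schema design, structured output, fault tolerance and retry mechanisms, and a dynamic tool selection strategy [1, 2].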

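Slide 2: Tool schema design principles

Clear naming and sufficient description: Tool names should follow "verb + noun" (for example, search_products) and the description should spell out when to use the tool and what its limits are [1].
Precise, unambiguous parameters: The schema should state each parameter's type, constraints (such as ranges and enum values), and default value [1].
Right granularity and idempotency first: Each tool should do one thing, and read operations in particular should be designed to have no side effects (that is, to be idempotent) [1].
Decisive role: The quality of the schema design (clear naming, precise constraints) directly determines how accurately the model selects tools and generates arguments [2].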

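Slide 3: Parallel tool-calling engine

Native model support: Current mainstream models (such as those from OpenAI and Anthropic) natively support returning multiple parallel tool calls in a single response [3].
Concurrent execution raises throughput: For multiple independent tool calls emitted by the model, build a parallel execution engine (for example with Python's asyncio.gather) to raise system throughput [2, 4].
Concurrency control: Inside the parallel execution engine, a semaphore is recommended for capping the number of concurrent executions [2, 4].
Timeout protection: Every tool run in parallel must have a timeout so that a single misbehaving tool cannot block the whole process [2, 4].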

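Slide 4: Structured output and parsing

Enforcing JSON Schema output: The model's structured-output capability (such as OpenAI Structured Outputs) can constrain the generated result to a predefined model structure [5].
Constraining with tool choice: Specifying a particular tool_choice (as in the Anthropic example) forces the model to answer in the predefined structured analysis format [5, 6].
Removing parsing risk: Compared with letting the model "improvise" free text and then parsing it with regular expressions, enforcing the output format with a schema greatly improves system reliability [2].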

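Slide 5: Three-tier error recovery strategy

Tier 1 (automatic retry for transient errors): For transient errors such as network fluctuations, retry automatically with backoff, passing the same arguments again [2, 6, 7].
Tier 2 (parameter errors fixed by the LLM): When argument validation fails, feed the error message and the original arguments back to the LLM and let it correct the arguments itself [2, 7].
Tier 3 (reporting unrecoverable errors): When a problem cannot be resolved automatically, return a meaningful error message to the model gracefully so that it can try another approach or inform the user directly [2, 7].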

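Slide 6: Dynamic tool selection strategy

On-demand loading: To keep an oversized tool list from lowering selection accuracy, filter the relevant tools dynamically based on the current conversation context [2, 8].
Separating core and optional tools: The system should always provide the basic "core tools" and supply "optional tools" on demand through matching logic [2].
Vector similarity matching: Compute the embedding of the user query plus context, compare it with the embedding of each candidate tool's description by cosine similarity, and pick the most relevant tools [2].
Restraint in numbers: The total number of tool definitions given to the model in a single call should be kept strictly within 10-15 [2].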

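Slide 7: Best practices and design summary

A strict checklist: Every tool must have an unambiguous name, a description for each parameter, precise type constraints, and execution timeout protection [2].
Robust error handling: When a tool fails, the system must not simply crash; it must return a meaningful error message for downstream logic to handle [2].
Systematic optimization: High-quality schemas (for accuracy), parallel calls (for throughput), structured output (for lower parsing risk), and tiered error recovery (for stability) together make up a highly available agent engine [2].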

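Blog summary and key takeaways

Agent Tool-Calling Patterns and Function Calling: summary

SEO-friendly blog summary

This article takes an in-depth look at advanced engineering practice for tool calling (Function Calling), the core technique that takes AI agents from "can only talk" to "can get things done" [1]. It covers the principles of high-quality tool schema design, using a parallel execution engine to raise system throughput, and enforcing structured output to remove parsing risk [1-4]. It also walks through a practical three-tier error recovery strategy and a dynamic tool selection mechanism [5, 6]. A practical guide not to be missed, it helps you build more stable, higher-performance AI agents. [4]

Key takeaways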