You can find information about the latest OpenAI models, costs, context windows, and supported input types in the OpenAI Platform docs.
API reference: For detailed documentation of all features and configuration options, head to the ChatOpenAI API reference.
API scope: ChatOpenAI targets the official OpenAI API spec only. Non-standard response fields from third-party providers (e.g. reasoning_content, reasoning, or reasoning_details) are not extracted or preserved. If you are using a provider that extends the Chat Completions or Responses format, such as OpenRouter, LiteLLM, vLLM, or DeepSeek, use a provider-specific package. See Chat Completions API compatibility for details.

Overview

Integration details

Class: ChatOpenAI · Package: langchain-openai · Serializable: beta · JS/TS support: ✅

Model features

Tool calling · Structured output · Image input · Audio input · Video input · Token-level streaming · Native async · Token usage · Logprobs

Setup

To access OpenAI models you'll need to install the langchain-openai integration package and obtain an API key from the OpenAI Platform.

Installation

pip install -U langchain-openai

Credentials

Head to the OpenAI Platform to sign up and generate an API key. Once you've done this, set the OPENAI_API_KEY environment variable:
import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
To enable automated tracing of your model calls, you can also set your LangSmith API key:
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Instantiation

Now we can instantiate our model object and generate a response:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-5-nano",
    # stream_usage=True,
    # temperature=None,
    # max_tokens=None,
    # timeout=None,
    # reasoning_effort="low",
    # max_retries=2,
    # api_key="...",  # If you prefer to pass api key in directly
    # base_url="...",
    # organization="...",
    # other params...
)
For a full list of available model parameters, see the ChatOpenAI API reference.
Token parameter deprecation: OpenAI deprecated max_tokens in September 2024 in favor of max_completion_tokens. While max_tokens is still supported for backwards compatibility, it is automatically converted to max_completion_tokens internally.
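The conversion can be pictured as a simple payload rewrite (a minimal sketch for intuition, not the library's actual implementation):

```python
def normalize_token_params(payload: dict) -> dict:
    """Sketch: rewrite the deprecated max_tokens key to max_completion_tokens."""
    payload = dict(payload)  # avoid mutating the caller's dict
    if "max_tokens" in payload and "max_completion_tokens" not in payload:
        payload["max_completion_tokens"] = payload.pop("max_tokens")
    return payload
```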

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-63219b22-03e3-4561-8cc4-78b7c7c3a3ca-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
print(ai_msg.text)
J'adore la programmation.

Streaming usage metadata

OpenAI's Chat Completions API does not stream token usage statistics by default (see the stream options section of the OpenAI API reference). To recover token counts when streaming with ChatOpenAI or AzureChatOpenAI, set stream_usage=True as an initialization parameter or at invocation time:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", stream_usage=True)

Using with Azure OpenAI

Azure OpenAI v1 API support: As of langchain-openai>=1.0.1, ChatOpenAI can be used directly with Azure OpenAI endpoints via the new v1 API. This provides a unified way to use OpenAI models whether they are hosted on OpenAI or Azure. For the legacy Azure-specific implementation, continue to use AzureChatOpenAI.
To use ChatOpenAI with Azure OpenAI, set base_url to your Azure endpoint with /openai/v1/ appended:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-5-mini",  # Your Azure deployment name
    base_url="https://{your-resource-name}.openai.azure.com/openai/v1/",
    api_key="your-azure-api-key"
)

response = llm.invoke("Hello, how are you?")
print(response.content)
The v1 API adds native support for Microsoft Entra ID (formerly Azure AD) authentication with automatic token refresh. Pass a token provider callable to the api_key parameter:
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from langchain_openai import ChatOpenAI

# Create a token provider that handles automatic refresh
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

llm = ChatOpenAI(
    model="gpt-5-mini",  # Your Azure deployment name
    base_url="https://{your-resource-name}.openai.azure.com/openai/v1/",
    api_key=token_provider  # Callable that handles token refresh
)

# Use the model as normal
messages = [
    ("system", "You are a helpful assistant."),
    ("human", "Translate 'I love programming' to French.")
]
response = llm.invoke(messages)
print(response.content)
A token provider is a callable that automatically retrieves and refreshes authentication tokens, so you don't have to manage token expiry manually.
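Conceptually, a refreshing token provider can be sketched like this (a simplified stand-in for get_bearer_token_provider; the fetch callable and ttl are illustrative assumptions):

```python
import time


def make_token_provider(fetch, ttl: float = 3000.0):
    """Cache a token from `fetch` and refetch it once `ttl` seconds elapse."""
    state = {"token": None, "expires": 0.0}

    def provider() -> str:
        now = time.time()
        if state["token"] is None or now >= state["expires"]:
            state["token"] = fetch()
            state["expires"] = now + ttl
        return state["token"]

    return provider
```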
Installation requirement: To use Microsoft Entra ID authentication, install the Azure Identity library:
pip install azure-identity
When working with async code, you can likewise pass a token provider callable to the api_key parameter. You must import DefaultAzureCredential from azure.identity.aio:
from azure.identity.aio import DefaultAzureCredential
from langchain_openai import ChatOpenAI

credential = DefaultAzureCredential()

llm_async = ChatOpenAI(
    model="gpt-5-nano",
    api_key=credential
)

# Use async methods when using async callable
response = await llm_async.ainvoke("Hello!")
When using an async callable for the API key, you must use async methods (ainvoke, astream, etc.). Synchronous methods will raise an error.

Tool calling

OpenAI has a tool calling API (we use "tool calling" and "function calling" interchangeably here) that lets you describe tools and their arguments, and have the model return a JSON object with a tool to invoke and the inputs to that tool. Tool calling is extremely useful for building tool-using chains and agents, and for getting structured outputs from models more generally.

Bind tools

With ChatOpenAI.bind_tools, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even plain functions as tools to the model. Under the hood these are converted to an OpenAI tool schema, which looks like:
{
    "name": "...",
    "description": "...",
    "parameters": {...}  # JSONSchema
}
…and passed in every model invocation.
from pydantic import BaseModel, Field


class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(description="The city and state, e.g. San Francisco, CA")


llm_with_tools = llm.bind_tools([GetWeather])
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-1617c9b2-dda5-4120-996b-0333ed5992e2-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})

Strict mode

Requires langchain-openai>=0.1.21
As of August 6, 2024, OpenAI supports a strict argument when calling tools, which will enforce that the model adheres to the tool argument schema. Learn more.
If strict=True, the tool definition will also be validated, and only a subset of JSON Schema is accepted. Crucially, the schema cannot have optional arguments (i.e., those with default values). Read the full docs on which schema types are supported.
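For intuition, one of those constraints, that every property must be required, can be checked with a small helper (a sketch; OpenAI's actual validation enforces more rules than this):

```python
def all_properties_required(schema: dict) -> bool:
    """Check the strict-mode rule that no parameter may be optional."""
    params = schema.get("parameters", {})
    properties = set(params.get("properties", {}))
    required = set(params.get("required", []))
    return properties == required
```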
llm_with_tools = llm.bind_tools([GetWeather], strict=True)
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-5e3356a9-132d-4623-8e73-dd5a898cf4a6-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})

Tool calls

Note that the AIMessage has a tool_calls attribute. This contains the calls in a standardized ToolCall format that is model-provider agnostic.
ai_msg.tool_calls
[{'name': 'GetWeather',
  'args': {'location': 'San Francisco, CA'},
  'id': 'call_jUqhd8wzAIzInTJl72Rla8ht',
  'type': 'tool_call'}]
For more on binding tools and tool call outputs, head to the tool calling docs.
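Because the ToolCall format is provider-agnostic, executing the calls locally reduces to a dispatch over name and args. A minimal sketch using plain dicts shaped like the output above:

```python
def dispatch_tool_calls(tool_calls: list, registry: dict) -> list:
    """Look up each tool call by name and invoke it with its parsed args."""
    return [registry[call["name"]](**call["args"]) for call in tool_calls]
```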

Custom tools

Requires langchain-openai>=0.3.29
Custom tools support tools with arbitrary string inputs. They can be particularly useful when you expect your string arguments to be long or complex.
from langchain_openai import ChatOpenAI, custom_tool
from langchain.agents import create_agent


@custom_tool
def execute_code(code: str) -> str:
    """Execute python code."""
    return "27"


llm = ChatOpenAI(model="gpt-5", use_responses_api=True)

agent = create_agent(llm, [execute_code])

input_message = {"role": "user", "content": "Use the tool to calculate 3^3."}
for step in agent.stream(
    {"messages": [input_message]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================

Use the tool to calculate 3^3.
================================== Ai Message ==================================

[{'id': 'rs_68b7336cb72081a080da70bf5e980e4e0d6082d28f91357a', 'summary': [], 'type': 'reasoning'}, {'call_id': 'call_qyKsJ4XlGRudbIJDrXVA2nQa', 'input': 'print(3**3)', 'name': 'execute_code', 'type': 'custom_tool_call', 'id': 'ctc_68b7336f718481a0b39584cd35fbaa5d0d6082d28f91357a', 'status': 'completed'}]
Tool Calls:
  execute_code (call_qyKsJ4XlGRudbIJDrXVA2nQa)
 Call ID: call_qyKsJ4XlGRudbIJDrXVA2nQa
  Args:
    __arg1: print(3**3)
================================= Tool Message =================================
Name: execute_code

[{'type': 'custom_tool_call_output', 'output': '27'}]
================================== Ai Message ==================================

[{'type': 'text', 'text': '27', 'annotations': [], 'id': 'msg_68b73371e9e081a0927f54f88f2cd7a20d6082d28f91357a'}]
OpenAI supports the specification of context-free grammars for custom tool inputs in lark and regex formats. See the OpenAI docs for details. The format parameter can be passed into @custom_tool as shown below:
from langchain_openai import ChatOpenAI, custom_tool
from langchain.agents import create_agent


grammar = """
start: expr
expr: term (SP ADD SP term)* -> add
| term
term: factor (SP MUL SP factor)* -> mul
| factor
factor: INT
SP: " "
ADD: "+"
MUL: "*"
%import common.INT
"""

format_ = {"type": "grammar", "syntax": "lark", "definition": grammar}


@custom_tool(format=format_)
def do_math(input_string: str) -> str:
    """Do a mathematical operation."""
    return "27"


llm = ChatOpenAI(model="gpt-5", use_responses_api=True)

agent = create_agent(llm, [do_math])

input_message = {"role": "user", "content": "Use the tool to calculate 3^3."}
for step in agent.stream(
    {"messages": [input_message]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================

Use the tool to calculate 3^3.
================================== Ai Message ==================================

[{'id': 'rs_68b733f066a48194a41001c0cc1081760811f11b6f4bae47', 'summary': [], 'type': 'reasoning'}, {'call_id': 'call_7hTYtlTj9NgWyw8AQGqETtV9', 'input': '3 * 3 * 3', 'name': 'do_math', 'type': 'custom_tool_call', 'id': 'ctc_68b733f3a0a08194968b8338d33ad89f0811f11b6f4bae47', 'status': 'completed'}]
Tool Calls:
  do_math (call_7hTYtlTj9NgWyw8AQGqETtV9)
 Call ID: call_7hTYtlTj9NgWyw8AQGqETtV9
  Args:
    __arg1: 3 * 3 * 3
================================= Tool Message =================================
Name: do_math

[{'type': 'custom_tool_call_output', 'output': '27'}]
================================== Ai Message ==================================

[{'type': 'text', 'text': '27', 'annotations': [], 'id': 'msg_68b733f4bb008194937130796372bd0f0811f11b6f4bae47'}]

Structured output

OpenAI supports native structured output, which guarantees that responses adhere to a given schema. You can access this feature in a single model invocation, or by specifying a LangChain agent response format. Examples are shown below.
Use the with_structured_output method to generate structured model responses. Specify method="json_schema" to engage OpenAI's native structured output feature; otherwise, the method defaults to function calling.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

llm = ChatOpenAI(model="gpt-4.1")

class Movie(BaseModel):
    """A movie with details."""
    title: str = Field(description="The title of the movie")
    year: int = Field(description="The year the movie was released")
    director: str = Field(description="The director of the movie")
    rating: float = Field(description="The movie's rating out of 10")

structured_llm = llm.with_structured_output(Movie, method="json_schema")
response = structured_llm.invoke("Provide details about the movie Inception")
response
Movie(title='Inception', year=2010, director='Christopher Nolan', rating=8.8)
Specify response_format with a ProviderStrategy to engage OpenAI's structured output feature when generating the final response:
from langchain.agents import create_agent
from langchain.agents.structured_output import ProviderStrategy
from pydantic import BaseModel

class Weather(BaseModel):
    temperature: float
    condition: str

def weather_tool(location: str) -> str:
    """Get the weather at a location."""
    return "Sunny and 75 degrees F."

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[weather_tool],
    response_format=ProviderStrategy(Weather),
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in SF?"}]
})

result["structured_response"]
Weather(temperature=75.0, condition='Sunny')

Structured output with tool calling

OpenAI's structured output feature can be used simultaneously with tool calling. The model will generate either tool calls or a response adhering to the desired schema. See the example below:
from langchain_openai import ChatOpenAI
from pydantic import BaseModel


def get_weather(location: str) -> None:
    """Get weather at a location."""
    return "It's sunny."


class OutputSchema(BaseModel):
    """Schema for response."""

    answer: str
    justification: str


llm = ChatOpenAI(model="gpt-4.1")

structured_llm = llm.bind_tools(
    [get_weather],
    response_format=OutputSchema,
    strict=True,
)

# Response contains tool calls:
tool_call_response = structured_llm.invoke("What is the weather in SF?")

# structured_response.additional_kwargs["parsed"] contains parsed output
structured_response = structured_llm.invoke(
    "What weighs more, a pound of feathers or a pound of gold?"
)
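When consuming these responses, you branch on whether tool calls are present. A minimal routing sketch over plain dicts shaped like the responses above (attribute access on real AIMessage objects differs slightly, and the parsed output may be a Pydantic object rather than a dict):

```python
def route_response(message: dict) -> str:
    """Dispatch on tool calls first; otherwise fall back to the parsed schema."""
    if message.get("tool_calls"):
        return "tool:" + message["tool_calls"][0]["name"]
    parsed = message["additional_kwargs"]["parsed"]
    return "answer:" + parsed["answer"]
```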

Responses API

Requires langchain-openai>=0.3.9
OpenAI supports a Responses API that is oriented toward building agentic applications. It includes a suite of built-in tools, including web and file search. It also supports management of conversation state, allowing you to continue a conversational thread without explicitly passing in previous messages, as well as output from reasoning processes. ChatOpenAI will route to the Responses API if one of these features is used. You can also specify use_responses_api=True when instantiating ChatOpenAI.

Web search

To trigger a web search, pass {"type": "web_search_preview"} to the model as you would another tool.
You can also pass built-in tools as invocation params:
llm.invoke("...", tools=[{"type": "web_search_preview"}])
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini")

tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What was a positive news story from today?")
Note that the response includes structured content blocks containing both the text of the response and OpenAI citations for its sources. The output message will also contain information from any tool invocations:
response.content_blocks
[{'type': 'server_tool_call',
  'name': 'web_search',
  'args': {'query': 'positive news stories today', 'type': 'search'},
  'id': 'ws_68cd6f8d72e4819591dab080f4b0c340080067ad5ea8144a'},
 {'type': 'server_tool_result',
  'tool_call_id': 'ws_68cd6f8d72e4819591dab080f4b0c340080067ad5ea8144a',
  'status': 'success'},
 {'type': 'text',
  'text': 'Here are some positive news stories from today...',
  'annotations': [{'end_index': 410,
    'start_index': 337,
    'title': 'Positive News | Real Stories. Real Positive Impact',
    'type': 'citation',
    'url': 'https://www.positivenews.press/?utm_source=openai'},
   {'end_index': 969,
    'start_index': 798,
    'title': "From Green Innovation to Community Triumphs: Uplifting US Stories Lighting Up September 2025 | That's Great News",
    'type': 'citation',
    'url': 'https://info.thatsgreatnews.com/from-green-innovation-to-community-triumphs-uplifting-us-stories-lighting-up-september-2025/?utm_source=openai'},
  'id': 'msg_68cd6f8e8d448195a807b89f483a1277080067ad5ea8144a'}]
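For post-processing, citation URLs can be pulled out of the text blocks' annotations. A sketch against the block shapes shown above:

```python
def extract_citation_urls(blocks: list) -> list:
    """Collect citation URLs from the annotations of text content blocks."""
    urls = []
    for block in blocks:
        if block.get("type") != "text":
            continue
        for annotation in block.get("annotations", []):
            if annotation.get("type") == "citation" and "url" in annotation:
                urls.append(annotation["url"])
    return urls
```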
You can recover just the text content of the response as a string by using response.text. For example, to stream the response text:
for token in llm_with_tools.stream("..."):
    print(token.text, end="|")
See the streaming guide for more detail.

Image generation

Requires langchain-openai>=0.3.19
To trigger an image generation, pass {"type": "image_generation"} to the model as you would another tool.
You can also pass built-in tools as invocation params:
llm.invoke("...", tools=[{"type": "image_generation"}])
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini")

tool = {"type": "image_generation", "quality": "low"}

llm_with_tools = llm.bind_tools([tool])

ai_message = llm_with_tools.invoke(
    "Draw a picture of a cute fuzzy cat with an umbrella"
)
import base64

from IPython.display import Image

image = next(
    item for item in ai_message.content_blocks if item["type"] == "image"
)
Image(base64.b64decode(image["base64"]), width=200)

File search

To trigger a file search, pass a file search tool to the model as you would another tool. You will need to populate an OpenAI-managed vector store and include the vector store ID in the tool definition. See the OpenAI docs for more detail.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    include=["file_search_call.results"],  # optionally include search results
)

openai_vector_store_ids = [
    "vs_...",  # your IDs here
]

tool = {
    "type": "file_search",
    "vector_store_ids": openai_vector_store_ids,
}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What is deep research by OpenAI?")
print(response.text)
Deep Research by OpenAI is...
网页搜索 一样,响应将包含带有引用的内容块:
[block["type"] for block in response.content_blocks]
['server_tool_call', 'server_tool_result', 'text']
text_block = next(block for block in response.content_blocks if block["type"] == "text")

text_block["annotations"][:2]
[{'type': 'citation',
  'title': 'deep_research_blog.pdf',
  'extras': {'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k', 'index': 2712}},
 {'type': 'citation',
  'title': 'deep_research_blog.pdf',
  'extras': {'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k', 'index': 2712}}]
It will also include information from the built-in tool invocation:
response.content_blocks[0]
{'type': 'server_tool_call',
 'name': 'file_search',
 'id': 'fs_68cd704c191c81959281b3b2ec6b139908f8f7fb31b1123c',
 'args': {'queries': ['deep research by OpenAI']}}

Tool search

Requires langchain-openai>=1.1.11
OpenAI supports a tool search feature that lets the model search over tools and load them into its context on demand. OpenAI will inject the retrieved tool definitions at the end of the active context to preserve its cache. To enable tool search, mark tools with @tool(extras={"defer_loading": True}) and add OpenAI's search tool to the available tools. Examples are shown below.
OpenAI can search over the available tools and return loaded tools (along with the appropriate tool calls, if any) in the same response:
@tool(extras={"defer_loading": True})
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"The weather in {location} is sunny and 72°F"

@tool(extras={"defer_loading": True})
def get_recipe(query: str) -> None:
    """Get a recipe for chicken soup."""

model = ChatOpenAI(model="gpt-5.4", use_responses_api=True)

agent = create_agent(
    model=model,
    tools=[
        get_weather,
        get_recipe,
        {"type": "tool_search"}
    ],
)
input_message = {"role": "user", "content": "What's the weather in San Francisco?"}
result = agent.invoke({"messages": [input_message]})

for message in result["messages"]:
    message.pretty_print()
================================ Human Message =================================

What's the weather in San Francisco?
================================== Ai Message ==================================

[
  {
    "id": "tsc_0667642bae2ae6c70069ad6cb31f0c819c838b18b0e1cf1279",
    "arguments": {
      "paths": [
        "get_weather"
      ]
    },
    "execution": "server",
    "status": "completed",
    "type": "tool_search_call"
  },
  {
    "id": "tso_0667642bae2ae6c70069ad6cb339dc819c9bbc05cb432f347e",
    "execution": "server",
    "status": "completed",
    "tools": [
      {
        "name": "get_weather",
        "parameters": {
          "properties": {
            "location": {
              "type": "string"
            }
          },
          "required": [
            "location"
          ],
          "type": "object",
          "additionalProperties": false
        },
        "strict": true,
        "type": "function",
        "defer_loading": true,
        "description": "Get the current weather for a location."
      }
    ],
    "type": "tool_search_output"
  },
  {
    "arguments": "{\"location\":\"San Francisco\"}",
    "call_id": "call_nwy9NDI24fTe8qESIRqZGtYm",
    "name": "get_weather",
    "type": "function_call",
    "id": "fc_0667642bae2ae6c70069ad6cb37adc819cbc55cde85e111e32",
    "namespace": "get_weather",
    "status": "completed"
  }
]
Tool Calls:
  get_weather (call_nwy9NDI24fTe8qESIRqZGtYm)
 Call ID: call_nwy9NDI24fTe8qESIRqZGtYm
  Args:
    location: San Francisco
================================= Tool Message =================================
Name: get_weather

The weather in San Francisco is sunny and 72°F
================================== Ai Message ==================================

[
  {
    "type": "text",
    "text": "It\u2019s currently sunny and 72\u00b0F in San Francisco.",
    "annotations": [],
    "id": "msg_0667642bae2ae6c70069ad6cb4829c819c8e26bc7ccc68dcd7"
  }
]
For full control over the underlying tool search process, you can specify "execution": "client" in the search tool definition. If the model elects to search for tools, it will include a tool_search_call block in its response. You can then supply a tool_search_output block containing the tool definitions. The example below shows how to orchestrate this with custom middleware. The example implements a callable that defines the search logic. The middleware then includes:
  1. An after_model hook to check for tool_search_call blocks and invoke our callable
  2. A wrap_tool_call hook for run-time tool registration
@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"The weather in {location} is sunny and 72°F"

# Implement a callable that returns a tool definition
def search_tools(goal: str) -> list[dict]:
    """Search for available tools to help answer the question."""
    # Arbitrary logic here
    return [
        {
            "type": "function",
            "defer_loading": True,
            **convert_to_openai_tool(get_weather)["function"],
        }
    ]

tool_search_schema = convert_to_openai_tool(search_tools, strict=True)
tool_search_config: dict = {
    "type": "tool_search",
    "execution": "client",
    "description": tool_search_schema["function"]["description"],
    "parameters": tool_search_schema["function"]["parameters"],
}

# Implement middleware to invoke the callable and register the tool.
class ClientToolSearchMiddleware(AgentMiddleware):

    @hook_config(can_jump_to=["model"])
    def after_model(self, state: AgentState, runtime: Any) -> dict[str, Any] | None:
        last_message = state["messages"][-1]
        if not isinstance(last_message, AIMessage):
            return None
        for block in last_message.content:
            if isinstance(block, dict) and block.get("type") == "tool_search_call":
                call_id = block.get("call_id")
                args = block.get("arguments", {})
                goal = args.get("goal", "") if isinstance(args, dict) else ""
                loaded_tools = search_tools(goal)
                tool_search_output = {
                    "type": "tool_search_output",
                    "execution": "client",
                    "call_id": call_id,
                    "status": "completed",
                    "tools": loaded_tools,
                }
                return {
                    "messages": [HumanMessage(content=[tool_search_output])],
                    "jump_to": "model",
                }
        return None

    def wrap_tool_call(
        self,
        request: ToolCallRequest,
        handler: Any,
    ) -> Any:
        if request.tool_call["name"] == "get_weather":
            return handler(request.override(tool=get_weather))
        return handler(request)

llm = ChatOpenAI(model="gpt-5.4", use_responses_api=True)

agent = create_agent(
    model=llm,
    tools=[tool_search_config],
    middleware=[ClientToolSearchMiddleware()],
)

result = agent.invoke(
    {"messages": [HumanMessage("What's the weather in San Francisco?")]}
)

for message in result["messages"]:
    message.pretty_print()
================================ Human Message =================================

What's the weather in San Francisco?
================================== Ai Message ==================================

[
  {
    "id": "tsc_0311ca847e392d540069acdd40394c8196a99345e2992eb657",
    "arguments": {
      "goal": "Find available tool(s) or weather capability to get current weather for San Francisco."
    },
    "call_id": "call_EcvKsh3r9IamaBW4Zz9r7RiK",
    "execution": "client",
    "status": "completed",
    "type": "tool_search_call"
  }
]
================================ Human Message =================================

[
  {
    "type": "tool_search_output",
    "execution": "client",
    "call_id": "call_EcvKsh3r9IamaBW4Zz9r7RiK",
    "status": "completed",
    "tools": [
      {
        "type": "function",
        "defer_loading": true,
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
          "properties": {
            "location": {
              "type": "string"
            }
          },
          "required": [
            "location"
          ],
          "type": "object"
        }
      }
    ]
  }
]
================================== Ai Message ==================================

[
  {
    "arguments": "{\"location\":\"San Francisco\"}",
    "call_id": "call_wH09dZpqDoVtpeu7uBdvY91l",
    "name": "get_weather",
    "type": "function_call",
    "id": "fc_0311ca847e392d540069acdd41502881968b29d96840633746",
    "namespace": "get_weather",
    "status": "completed"
  }
]
Tool Calls:
  get_weather (call_wH09dZpqDoVtpeu7uBdvY91l)
 Call ID: call_wH09dZpqDoVtpeu7uBdvY91l
  Args:
    location: San Francisco
================================= Tool Message =================================
Name: get_weather

The weather in San Francisco is sunny and 72°F
================================== Ai Message ==================================

[
  {
    "type": "text",
    "text": "San Francisco is sunny and 72\u00b0F.",
    "annotations": [],
    "id": "msg_0311ca847e392d540069acdd420b648196a603306f5546fabd"
  }
]

Computer use

ChatOpenAI supports the "computer-use-preview" model, a specialized model for the built-in computer use tool. To enable, pass a computer use tool as you would pass another tool. Currently, tool outputs for computer use are present in the message content field. To reply to a computer use tool call, construct a ToolMessage with {"type": "computer_call_output"} in its additional_kwargs. The content of the message will be a screenshot. Below, we demonstrate a simple example. First, load two screenshots:
import base64


def load_png_as_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
        return encoded_string.decode("utf-8")


screenshot_1_base64 = load_png_as_base64(
    "/path/to/screenshot_1.png"
)  # perhaps a screenshot of an application
screenshot_2_base64 = load_png_as_base64(
    "/path/to/screenshot_2.png"
)  # perhaps a screenshot of the Desktop
from langchain_openai import ChatOpenAI

# Initialize model
llm = ChatOpenAI(model="computer-use-preview", truncation="auto")

# Bind computer-use tool
tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}
llm_with_tools = llm.bind_tools([tool])

# Construct input message
input_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            ),
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        },
    ],
}

# Invoke model
response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)
The response will include a call to the computer use tool in its content:
response.content
[{'id': 'rs_685da051742c81a1bb35ce46a9f3f53406b50b8696b0f590',
  'summary': [{'text': "Clicking red 'X' to show desktop",
    'type': 'summary_text'}],
  'type': 'reasoning'},
 {'id': 'cu_685da054302481a1b2cc43b56e0b381706b50b8696b0f590',
  'action': {'button': 'left', 'type': 'click', 'x': 14, 'y': 38},
  'call_id': 'call_zmQerFBh4PbBE8mQoQHkfkwy',
  'pending_safety_checks': [],
  'status': 'completed',
  'type': 'computer_call'}]
Next, we construct a ToolMessage such that:
  1. It has a tool_call_id matching the call_id from the computer call
  2. It has {"type": "computer_call_output"} in its additional_kwargs
  3. Its content is either an image_url or an input_image output block (see the OpenAI docs for formatting details).
from langchain.messages import ToolMessage

tool_call_id = next(
    item["call_id"] for item in response.content if item["type"] == "computer_call"
)

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}",
        }
    ],
    # content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)
We can now invoke the model again using the message history:
messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text
'VS Code has been closed, and the desktop is now visible.'
In addition to passing back the entire sequence, we can also use the previous_response_id:
previous_response_id = response.response_metadata["id"]

response_2 = llm_with_tools.invoke(
    [tool_message],
    previous_response_id=previous_response_id,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text
'The VS Code window is closed, and the desktop is now visible. Let me know if you need any further assistance.'

Code interpreter

OpenAI implements a code interpreter tool to support the sandboxed generation and execution of code.
Example use
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    include=["code_interpreter_call.outputs"],  # optionally include outputs
)

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Create a new container
            "container": {"type": "auto"},
        }
    ]
)
response = llm_with_tools.invoke(
    "Write and run code to answer the question: what is 3^3?"
)
Note that the above command created a new container. We can also specify an existing container ID:
code_interpreter_calls = [
    item for item in response.content if item["type"] == "code_interpreter_call"
]
assert len(code_interpreter_calls) == 1
container_id = code_interpreter_calls[0]["extras"]["container_id"]

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Use an existing container
            "container": container_id,
        }
    ]
)

Remote MCP

OpenAI implements a remote MCP tool that allows for model-generated calls to MCP servers.
Example use
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini")

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": "never",
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)
OpenAI will at times request approval before sharing data with a remote MCP server. In the command above, we instructed the model to never require approval. We can also configure the model to always request approval, or to always request approval for specific tools:
llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": {
                "always": {
                    "tool_names": ["read_wiki_structure"]
                }
            }
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)
Responses may then include blocks with type "mcp_approval_request". To submit approvals for an approval request, structure them into a content block of an input message:
approval_message = {
    "role": "user",
    "content": [
        {
            "type": "mcp_approval_response",
            "approve": True,
            "approval_request_id": block["id"],
        }
        for block in response.content
        if block["type"] == "mcp_approval_request"
    ]
}

next_response = llm_with_tools.invoke(
    [approval_message],
    # continue existing thread
    previous_response_id=response.response_metadata["id"]
)

Managing conversation state

The Responses API supports management of conversation state.

Manually manage state

You can manage state manually, or using LangGraph, as with other chat models:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", use_responses_api=True)

first_query = "Hi, I'm Bob."
messages = [{"role": "user", "content": first_query}]

response = llm.invoke(messages)
print(response.text)
Hi Bob! Nice to meet you. How can I assist you today?
second_query = "What is my name?"

messages.extend(
    [
        response,
        {"role": "user", "content": second_query},
    ]
)
second_response = llm.invoke(messages)
print(second_response.text)
You mentioned that your name is Bob. How can I assist you further, Bob?
You can use LangGraph to manage conversational threads for you in a variety of backends, including in-memory and Postgres. See this tutorial to get started.

Passing previous_response_id

When using the Responses API, LangChain messages will include an "id" field in their metadata. Passing this ID to subsequent invocations will continue the conversation. Note that this is equivalent to manually passing in messages from a billing perspective.
second_response = llm.invoke(
    "What is my name?",
    previous_response_id=response.id,
)
print(second_response.text)
Your name is Bob. How can I help you today, Bob?
ChatOpenAI can also automatically specify previous_response_id using the last response in a message sequence:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    use_previous_response_id=True,
)
If we set use_previous_response_id=True, input messages up to the most recent response will be dropped from the request payload, and previous_response_id will be set using the ID of the most recent response. That is,
llm.invoke(
    [
        HumanMessage("Hello"),
        AIMessage("Hi there!", id="resp_123"),
        HumanMessage("How are you?"),
    ]
)
…is equivalent to:
llm.invoke([HumanMessage("How are you?")], previous_response_id="resp_123")
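The trimming behavior can be sketched as follows (a simplified model over plain dicts; the actual implementation operates on LangChain message objects):

```python
def trim_for_previous_response(messages: list):
    """Drop messages up to the most recent assistant response; return (rest, its id)."""
    last_ai_index = None
    for i, message in enumerate(messages):
        if message.get("role") == "assistant" and message.get("id"):
            last_ai_index = i
    if last_ai_index is None:
        return messages, None
    return messages[last_ai_index + 1:], messages[last_ai_index]["id"]
```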

Context management

The Responses API supports automatic server-side context compaction. This reduces the size of a conversation when it reaches a token threshold, supporting long-running interactions:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    model="gpt-5.2",
    context_management=[
        {"type": "compaction", "compact_threshold": 100_000}
    ],
)
Once enabled, AIMessage responses may include content blocks of type "compaction". These should be retained in the conversation history and can be appended to message sequences in the usual way. Messages before the most recent compaction item can be kept, or discarded to improve latency.
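For instance, discarding everything before the most recent compaction block could be sketched as follows (the helper is illustrative, not a library API):

```python
# Illustrative helper: keep only the most recent "compaction" block and
# everything after it, dropping older context blocks.
def trim_to_last_compaction(content_blocks):
    last = None
    for i, block in enumerate(content_blocks):
        if block.get("type") == "compaction":
            last = i
    return content_blocks if last is None else content_blocks[last:]

blocks = [
    {"type": "text", "text": "an old turn"},
    {"type": "compaction", "id": "cmp_1"},
    {"type": "text", "text": "a recent turn"},
]
trimmed = trim_to_last_compaction(blocks)
# trimmed begins with the compaction block; earlier content is dropped
```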

Reasoning output

Some OpenAI models generate separate text content illustrating their reasoning process; see OpenAI's reasoning documentation for details. OpenAI can return a summary of the model's reasoning (though it does not expose the raw reasoning tokens). To configure ChatOpenAI to return this summary, specify the reasoning parameter. If this parameter is set, ChatOpenAI will automatically route to the Responses API.
from langchain_openai import ChatOpenAI

reasoning = {
    "effort": "medium",  # 'low', 'medium', or 'high'
    "summary": "auto",  # 'detailed', 'auto', or None
}

llm = ChatOpenAI(model="gpt-5-nano", reasoning=reasoning)
response = llm.invoke("What is 3^3?")

# Output
response.text
'3³ = 3 × 3 × 3 = 27.'
# Reasoning
for block in response.content_blocks:
    if block["type"] == "reasoning":
        print(block["reasoning"])
**Calculating the power of three**

The user is asking about 3 raised to the power of 3. That's a pretty simple calculation! I know that 3^3 equals 27, so I can say, "3 to the power of 3 equals 27." I might also include a quick explanation that it's 3 multiplied by itself three times: 3 × 3 × 3 = 27. So, the answer is definitely 27.
Troubleshooting: empty responses from reasoning models. If you get empty responses from reasoning models such as gpt-5-nano, this may be due to a restrictive token limit: the model spends tokens on internal reasoning and may have none left for the final output. Make sure max_tokens is set to None, or increase the token limit so there are enough tokens for both reasoning and output generation.

Fine-tuning

You can call fine-tuned OpenAI models by passing the corresponding model_name parameter. This generally takes the form ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID}. For example:
fine_tuned_model = ChatOpenAI(
    temperature=0, model_name="ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR"
)

fine_tuned_model.invoke(messages)
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0f39b30e-c56e-4f3b-af99-5c948c984146-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})
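As a sanity check on the naming scheme above, the components can be recovered by splitting on ":". This parsing helper is illustrative only, not part of the SDK:

```python
# Illustrative parser for the naming scheme
# ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID} (note the empty field
# produced by the double colon).
def parse_fine_tuned_name(model_name):
    parts = model_name.split(":")
    if len(parts) != 5 or parts[0] != "ft" or parts[3] != "":
        raise ValueError(f"not a fine-tuned model name: {model_name!r}")
    return {"base_model": parts[1], "org": parts[2], "model_id": parts[4]}

parsed = parse_fine_tuned_name("ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR")
# {'base_model': 'gpt-3.5-turbo-0613', 'org': 'langchain', 'model_id': '7qTVM5AR'}
```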

Multimodal inputs (images, PDFs, audio)

OpenAI has models that support multimodal inputs. You can pass images, PDFs, or audio to these models. For more information on how to do this in LangChain, head to the multimodal inputs docs. You can see the list of models that support different modalities in OpenAI's documentation. For all modalities, LangChain supports both its cross-provider standard and OpenAI's native content-block formats. To pass multimodal data to ChatOpenAI, create content blocks containing the data and incorporate them into messages, e.g. as below:
message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            # Update prompt as desired
            "text": "Describe the (image / PDF / audio...)",
        },
        content_block,
    ],
}
Below are examples of content blocks. See the multimodal messages how-to guide for examples.
URLs
# LangChain format
content_block = {
    "type": "image",
    "url": url_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {"url": url_string},
}
In-line base64 data
# LangChain format
content_block = {
    "type": "image",
    "base64": base64_string,
    "mime_type": "image/jpeg",
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {
        "url": f"data:image/jpeg;base64,{base64_string}",
    },
}
Note: OpenAI requires that a filename be specified for PDF inputs. When using LangChain's format, include the filename key. Read more about filenames for OpenAI multimodal messages. See the PDF documents how-to guide for examples.
In-line base64 data
# LangChain format
content_block = {
    "type": "file",
    "base64": base64_string,
    "mime_type": "application/pdf",
    "filename": "my-file.pdf",
}

# OpenAI Chat Completions format
content_block = {
    "type": "file",
    "file": {
        "filename": "my-file.pdf",
        "file_data": f"data:application/pdf;base64,{base64_string}",
    }
}
See supported models, e.g. "gpt-4o-audio-preview". See the audio how-to guide for examples.
In-line base64 data
# LangChain format
content_block = {
    "type": "audio",
    "mime_type": "audio/wav",  # or appropriate mime-type
    "base64": base64_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "input_audio",
    "input_audio": {"data": base64_string, "format": "wav"},
}
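The mappings above are mechanical; as a sketch, a converter from the LangChain format to the Chat Completions format might look like this (the function itself is illustrative, not part of the library):

```python
# Illustrative converter from LangChain's standard content blocks to
# their OpenAI Chat Completions equivalents, covering the cases above.
def to_chat_completions_block(block):
    t = block["type"]
    if t == "image" and "url" in block:
        return {"type": "image_url", "image_url": {"url": block["url"]}}
    if t == "image":  # in-line base64 data
        data_url = f"data:{block['mime_type']};base64,{block['base64']}"
        return {"type": "image_url", "image_url": {"url": data_url}}
    if t == "file":  # PDF; OpenAI requires a filename
        return {
            "type": "file",
            "file": {
                "filename": block["filename"],
                "file_data": f"data:{block['mime_type']};base64,{block['base64']}",
            },
        }
    if t == "audio":
        fmt = block["mime_type"].split("/", 1)[1]  # "audio/wav" -> "wav"
        return {
            "type": "input_audio",
            "input_audio": {"data": block["base64"], "format": fmt},
        }
    raise ValueError(f"unsupported block type: {t}")
```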

Predicted output

Requires langchain-openai>=0.2.6
Some OpenAI models (such as their gpt-4o and gpt-4o-mini series) support predicted outputs, which let you pass in a known portion of the LLM's expected output ahead of time to reduce latency. This is useful for cases such as editing text or code, where only a small part of the model's output will change. Here's an example:
code = """
/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}
"""

llm = ChatOpenAI(model="gpt-4.1")
query = (
    "Replace the Username property with an Email property. "
    "Respond only with code, and with no markdown formatting."
)
response = llm.invoke(
    [{"role": "user", "content": query}, {"role": "user", "content": code}],
    prediction={"type": "content", "content": code},
)
print(response.content)
print(response.response_metadata)
/// <summary>
/// Represents a user with a first name, last name, and email.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's email.
    /// </summary>
    public string Email { get; set; }
}
{'token_usage': {'completion_tokens': 226, 'prompt_tokens': 166, 'total_tokens': 392, 'completion_tokens_details': {'accepted_prediction_tokens': 49, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 107}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_45cf54deae', 'finish_reason': 'stop', 'logprobs': None}
Predictions are billed as additional tokens and may increase your usage and costs, in exchange for this reduced latency.
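Since rejected prediction tokens are billed, it can be worth surfacing them. A small helper over the response_metadata dict shown above (the helper itself is illustrative, not a library API):

```python
# Illustrative helper: surface billed prediction-token counts from the
# token_usage details in response_metadata.
def prediction_token_stats(response_metadata):
    details = response_metadata["token_usage"]["completion_tokens_details"]
    return {
        "accepted": details.get("accepted_prediction_tokens") or 0,
        "rejected": details.get("rejected_prediction_tokens") or 0,
    }

stats = prediction_token_stats(
    {
        "token_usage": {
            "completion_tokens_details": {
                "accepted_prediction_tokens": 49,
                "rejected_prediction_tokens": 107,
            }
        }
    }
)
# {'accepted': 49, 'rejected': 107}
```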

Audio generation (preview)

Requires langchain-openai>=0.2.3
OpenAI has a new audio generation feature that allows you to use audio inputs and outputs with the gpt-4o-audio-preview model.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-audio-preview",
    temperature=0,
    model_kwargs={
        "modalities": ["text", "audio"],
        "audio": {"voice": "alloy", "format": "wav"},
    },
)

output_message = llm.invoke(
    [
        ("human", "Are you made by OpenAI? Just answer yes or no"),
    ]
)
output_message.additional_kwargs['audio'] will contain a dictionary like
{
    'data': '<audio data b64-encoded>',
    'expires_at': 1729268602,
    'id': 'audio_67127d6a44348190af62c1530ef0955a',
    'transcript': 'Yes.'
}
…and the format will be whatever you passed in model_kwargs['audio']['format']. We can also pass this message with audio data back to the model as part of a message history, before the OpenAI expires_at is reached.
Output audio is stored under the audio key in AIMessage.additional_kwargs, but input content blocks are typed with an input_audio type and key in HumanMessage.content lists. For more information, see OpenAI's audio docs.
history = [
    ("human", "Are you made by OpenAI? Just answer yes or no"),
    output_message,
    ("human", "And what is your name? Just give your name."),
]
second_output_message = llm.invoke(history)

Prompt caching

OpenAI's prompt caching feature automatically caches prompts longer than 1024 tokens to reduce costs and improve response times. It is enabled for all recent models (gpt-4o and newer).

Manual caching

You can use the prompt_cache_key parameter to influence OpenAI's caching and optimize cache hit rates:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1")

# Use a cache key for repeated prompts
messages = [
    {"role": "system", "content": "You are a helpful assistant that translates English to French."},
    {"role": "user", "content": "I love programming."},
]

response = llm.invoke(
    messages,
    prompt_cache_key="translation-assistant-v1"
)

# Check cache usage
cache_read_tokens = response.usage_metadata["input_token_details"]["cache_read"]
print(f"Cached tokens used: {cache_read_tokens}")
Cache hits require exact prompt prefix matches.

Cache key strategies

You can use different cache key strategies depending on your application's needs:
# Static cache keys for consistent prompt templates
customer_response = llm.invoke(
    messages,
    prompt_cache_key="customer-support-v1"
)

support_response = llm.invoke(
    messages,
    prompt_cache_key="internal-support-v1"
)

# Dynamic cache keys based on context
user_type = "premium"
cache_key = f"assistant-{user_type}-v1"
response = llm.invoke(messages, prompt_cache_key=cache_key)

Model-level caching

You can also set a default cache key at the model level using model_kwargs:
llm = ChatOpenAI(
    model="gpt-4.1-mini",
    model_kwargs={"prompt_cache_key": "default-cache-v1"}
)

# Uses default cache key
response1 = llm.invoke(messages)

# Override with specific cache key
response2 = llm.invoke(messages, prompt_cache_key="override-cache-v1")

Flex processing

OpenAI offers a variety of service tiers. The "flex" tier offers cheaper pricing in exchange for responses that may take longer, and resources may not always be available. This approach is best for non-critical tasks, including model testing, data augmentation, or jobs that can be run asynchronously. To use it, initialize the model with service_tier="flex":
llm = ChatOpenAI(model="o4-mini", service_tier="flex")
Note that this is a beta feature that is only available for a subset of models. See the OpenAI docs for more details.
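Because flex capacity may be unavailable, one reasonable pattern is to retry on the standard tier when a flex call fails. A minimal sketch with the invocations injected as callables (the function and names are illustrative; in practice you would bind them to ChatOpenAI(..., service_tier="flex").invoke and a standard model's .invoke):

```python
# Illustrative fallback pattern: try the cheaper flex tier first, and
# retry on the standard tier if the flex call fails for any reason.
def invoke_with_fallback(flex_invoke, standard_invoke, messages):
    try:
        return flex_invoke(messages)
    except Exception:
        return standard_invoke(messages)
```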

API reference

For detailed documentation of all features and configuration options, head to the ChatOpenAI API reference.