AzureMLChatOnlineEndpoint 集成 - LangChain中文版文档

Azure Machine Learning 是一个用于构建、训练和部署机器学习模型的平台。用户可以在模型目录中探索要部署的模型类型，该目录提供了来自不同提供商的基础模型和通用模型。通常，您需要部署模型才能使用其预测（推理）。在 Azure Machine Learning 中，在线端点用于通过实时服务部署这些模型。它们基于 端点 和 部署 的概念，允许您将生产工作负载的接口与提供服务的实现解耦。

本笔记本介绍了如何使用托管在 Azure Machine Learning 端点 上的聊天模型。

from langchain_community.chat_models.azureml_endpoint import AzureMLChatOnlineEndpoint

设置

您必须在 Azure ML 上部署模型或部署到 Azure AI Foundry（原 Azure AI Studio）并获取以下参数：

endpoint_url：端点提供的 REST 端点 URL。
endpoint_api_type：将模型部署到 专用端点（托管托管基础设施）时，使用 endpoint_type='dedicated'。使用 按需付费 服务（模型即服务）部署模型时，使用 endpoint_type='serverless'。
endpoint_api_key：端点提供的 API 密钥。

内容格式化器

content_formatter 参数是一个处理程序类，用于转换 AzureML 端点的请求和响应以匹配所需的架构。由于模型目录中有多种模型，每种模型处理数据的方式可能不同，因此提供了 ContentFormatterBase 类，允许用户根据需要转换数据。提供了以下内容格式化器：

CustomOpenAIChatContentFormatter：为遵循 OpenAI API 规范的请求和响应的模型（如 LLaMa2-chat）格式化请求和响应数据。

注意：langchain.chat_models.azureml_endpoint.LlamaChatContentFormatter 正在被弃用，并由 langchain.chat_models.azureml_endpoint.CustomOpenAIChatContentFormatter 替代。 您可以从类 langchain_community.llms.azureml_endpoint.ContentFormatterBase 派生，为您的模型实现特定的自定义内容格式化器。

示例

以下部分包含如何使用此类的示例：

示例：使用实时端点进行聊天补全

from langchain_community.chat_models.azureml_endpoint import (
    AzureMLEndpointApiType,
    CustomOpenAIChatContentFormatter,
)
from langchain.messages import HumanMessage

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.dedicated,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter(),
)
response = chat.invoke(
    [HumanMessage(content="科拉茨猜想最终会被解决吗？")]
)
response

AIMessage(content='  科拉茨猜想是数学中最著名的未解问题之一，多年来一直是许多研究和探讨的主题。虽然无法确定地预测该猜想是否会被解决，但有几个原因使其被视为一个具有挑战性和重要性的问题：\n\n1. 简单却难以捉摸：科拉茨猜想是一个看似简单但证明或证伪却异常困难的陈述。尽管其表述简单，但该猜想已经难倒了一些最杰出的数学家，并且仍然是该领域最著名的开放问题之一。\n\n2. 广泛的影响：科拉茨猜想对数学的许多领域具有深远的影响，包括数论、代数和分析。解决该猜想可能对这些领域产生重大影响，并可能带来新的见解和发现。\n\n3. 计算证据：虽然该猜想尚未被证明，但大量的计算证据支持其有效性。事实上，对于任何起始值直到 2^64（一个数字），尚未发现该猜想的反例。', additional_kwargs={}, example=False)

示例：使用按需付费部署（模型即服务）进行聊天补全

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/v1/chat/completions",
    endpoint_api_type=AzureMLEndpointApiType.serverless,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter,
)
response = chat.invoke(
    [HumanMessage(content="科拉茨猜想最终会被解决吗？")]
)
response

如果需要向模型传递额外参数，请使用 model_kwargs 参数：

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/v1/chat/completions",
    endpoint_api_type=AzureMLEndpointApiType.serverless,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter,
    model_kwargs={"temperature": 0.8},
)

参数也可以在调用时传递：

response = chat.invoke(
    [HumanMessage(content="科拉茨猜想最终会被解决吗？")],
    max_tokens=512,
)
response

Edit this page on GitHub or file an issue.

Connect these docs to Claude, VSCode, and more via MCP for real-time answers.

​设置

​内容格式化器

​示例

​示例：使用实时端点进行聊天补全

​示例：使用按需付费部署（模型即服务）进行聊天补全

设置

内容格式化器

示例

示例：使用实时端点进行聊天补全

示例：使用按需付费部署（模型即服务）进行聊天补全