Access Google's generative AI models, including the Gemini family, through the Gemini Developer API or Vertex AI. The Gemini Developer API offers fast setup with an API key and is well suited to individual developers. Vertex AI provides enterprise-grade features and integrates with Google Cloud Platform. For the latest models, model IDs, their capabilities, context windows, and more, see the Google AI documentation.
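To follow the examples below, install the package and expose an API key. This is a minimal setup sketch; the `GOOGLE_API_KEY` environment variable follows the google-genai SDK convention, and the key value shown is a placeholder:

```shell
# Install the LangChain integration package
pip install -U langchain-google-genai

# Expose your Gemini Developer API key (placeholder value)
export GOOGLE_API_KEY="your-api-key"
```

With the key set, `ChatGoogleGenerativeAI` picks it up automatically; for Vertex AI, authenticate with your GCP project credentials instead.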
Vertex AI consolidation and compatibility: As of langchain-google-genai 4.0.0, this package uses the consolidated google-genai SDK rather than the legacy google-ai-generativelanguage SDK. This migration adds support for Gemini models through both the Gemini Developer API and the Gemini API in Vertex AI, superseding certain classes in langchain-google-vertexai, such as ChatVertexAI. Read the full announcement and migration guide.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = model.invoke(messages)
ai_msg
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

# Set at instantiation (applies to all calls)
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash-image",
    image_config={"aspect_ratio": "16:9"},
)

# Or override per call
response = model.invoke(
    "Generate a photorealistic image of a cuddly cat wearing a hat.",
    image_config={"aspect_ratio": "1:1"},
)
```
By default, image generation models may return both text and an image (e.g. "Ok! Here's an image of a…"). You can request that the model return only images by setting the response_modalities parameter:
```python
from langchain_google_genai import ChatGoogleGenerativeAI, Modality

model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash-image",
    response_modalities=[Modality.IMAGE],
)

# All invocations will return only images
response = model.invoke("Generate a photorealistic image of a cuddly cat wearing a hat.")
```
Vertex AI limitation: Audio generation models are currently in limited preview on Vertex AI and may require allowlisted access. If you encounter an INVALID_ARGUMENT error when using a TTS model with vertexai=True, your GCP project may need to be allowlisted. For more details, see this Google AI forum discussion.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-flash-preview-tts")

response = model.invoke("Please say The quick brown fox jumps over the lazy dog")

# Base64 encoded binary data of the audio
wav_data = response.additional_kwargs.get("audio")
with open("output.wav", "wb") as f:
    f.write(wav_data)
```
```python
from langchain.tools import tool
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI


# Define the tool
@tool(description="Get the current weather in a given location")
def get_weather(location: str) -> str:
    return "It's sunny."


# Initialize and bind (potentially multiple) tools to the model
model_with_tools = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview").bind_tools([get_weather])

# Step 1: Model generates tool calls
messages = [HumanMessage("What's the weather in Boston?")]
ai_msg = model_with_tools.invoke(messages)
messages.append(ai_msg)

# Check the tool calls in the response
print(ai_msg.tool_calls)

# Step 2: Execute tools and collect results
for tool_call in ai_msg.tool_calls:
    # Execute the tool with the generated arguments
    tool_result = get_weather.invoke(tool_call)
    messages.append(tool_result)

# Step 3: Pass results back to model for final response
final_response = model_with_tools.invoke(messages)
final_response
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")

result = model.invoke("Explain the concept of prompt engineering in one sentence.")
print(result.content)
print("\nUsage Metadata:")
print(result.usage_metadata)
```
```
Prompt engineering is the art and science of crafting effective text prompts to elicit desired and accurate responses from large language models.

Usage Metadata:
{'input_tokens': 10, 'output_tokens': 24, 'total_tokens': 34, 'input_token_details': {'cache_read': 0}}
```
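Because `usage_metadata` is a plain dict, token totals can be accumulated across calls without extra tooling. A minimal sketch assuming the structure shown above (the sample values are illustrative, not real API output):

```python
def add_usage(totals: dict, usage: dict) -> dict:
    """Accumulate token counts from one usage_metadata dict into running totals."""
    for key in ("input_tokens", "output_tokens", "total_tokens"):
        totals[key] = totals.get(key, 0) + usage.get(key, 0)
    return totals


# Illustrative usage_metadata dicts from two hypothetical calls
calls = [
    {"input_tokens": 10, "output_tokens": 24, "total_tokens": 34},
    {"input_tokens": 7, "output_tokens": 12, "total_tokens": 19},
]

totals = {}
for usage in calls:
    totals = add_usage(totals, usage)

print(totals)  # {'input_tokens': 17, 'output_tokens': 36, 'total_tokens': 53}
```

The same pattern works with `result.usage_metadata` from real invocations, which is useful for per-session cost tracking.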
```python
from langchain_google_genai import ChatGoogleGenerativeAI

# Gemini 3+: use thinking_level
llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    thinking_level="low",
)

response = llm.invoke("How many O's are in Google?")
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    include_thoughts=True,
)

response = llm.invoke("How many O's are in Google? How did you verify your answer?")

reasoning_tokens = response.usage_metadata["output_token_details"]["reasoning"]
print("Response:", response.content)
print("Reasoning tokens used:", reasoning_tokens)
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])

response = model_with_search.invoke("When is the next total solar eclipse in US?")
response.content_blocks
```
```
[{'type': 'text',
  'text': 'The next total solar eclipse visible in the contiguous United States will occur on...',
  'annotations': [{'type': 'citation',
    'id': 'abc123',
    'url': '<url for source 1>',
    'title': '<source 1 title>',
    'start_index': 0,
    'end_index': 99,
    'cited_text': 'The next total solar eclipse...',
    'extras': {'google_ai_metadata': {'web_search_queries': ['next total solar eclipse in US'],
      'grounding_chunk_index': 0,
      'confidence_scores': []}}}],
 ...
```
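Since the citation annotations in `content_blocks` are plain dicts, the grounding sources can be collected with a small helper. A sketch assuming the block shape shown above; the sample data below is illustrative, not real API output:

```python
def extract_citation_urls(content_blocks: list) -> list:
    """Collect the URL of every citation annotation across all text blocks."""
    urls = []
    for block in content_blocks:
        if block.get("type") != "text":
            continue
        for annotation in block.get("annotations", []):
            if annotation.get("type") == "citation" and "url" in annotation:
                urls.append(annotation["url"])
    return urls


# Illustrative sample mirroring the structure shown above
blocks = [
    {
        "type": "text",
        "text": "The next total solar eclipse...",
        "annotations": [
            {"type": "citation", "id": "abc123", "url": "https://example.com/eclipse"},
        ],
    },
]
print(extract_citation_urls(blocks))  # ['https://example.com/eclipse']
```

In a real application you would pass `response.content_blocks` to the helper and render the URLs as source links alongside the answer.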
Some models support grounding with Google Maps. Maps grounding connects Gemini's generative capabilities with current, factual location data from Google Maps, enabling location-aware applications that deliver accurate, geography-specific responses. See the Gemini documentation for details.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")
model_with_maps = model.bind_tools([{"google_maps": {}}])

response = model_with_maps.invoke(
    "What are some good Italian restaurants near the Eiffel Tower in Paris?"
)
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

# Provide location context (latitude and longitude)
model_with_maps = model.bind_tools(
    [{"google_maps": {}}],
    tool_config={
        "retrieval_config": {
            # Eiffel Tower
            "lat_lng": {
                "latitude": 48.858844,
                "longitude": 2.294351,
            }
        }
    },
)

response = model_with_maps.invoke(
    "What Italian restaurants are within a 5 minute walk from here?"
)
```
Context caching lets you store and reuse content (such as PDFs or images) for faster processing. The cached_content parameter accepts the name of a cache created through the Google Generative AI API.
Single-file example
This example caches a single file and queries it.
```python
import time

from google import genai
from google.genai import types
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)

# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=types.CreateCachedContentConfig(
        display_name="Cached Content",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the file you have access to."
        ),
        contents=[file],
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])
```
Multi-file example
This example uses Part to cache two files and query them together.
```python
import time

from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)

file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)

# Create cache with multiple files
contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
            Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
        ],
    )
]

model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=CreateCachedContentConfig(
        display_name="Cached Contents",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the files you have access to."
        ),
        contents=contents,
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(
    content="Provide a summary of the key information across both files."
)
llm.invoke([message])
```