--- title: Node.js HTTP Server description: Learn how to use the AI SDK in a Node.js HTTP server tags: ['api servers', 'streaming'] --- # Node.js HTTP Server You can use the AI SDK in a Node.js HTTP server to generate text and stream it to the client. ## Examples The examples start a simple HTTP server that listens on port 8080. You can e.g. test it using `curl`: ```bash curl -X POST http://localhost:8080 ``` The examples use the OpenAI `gpt-4o` model. Ensure that the OpenAI API key is set in the `OPENAI_API_KEY` environment variable. **Full example**: [github.com/vercel/ai/examples/node-http-server](https://github.com/vercel/ai/tree/main/examples/node-http-server) ### Data Stream You can use the `pipeDataStreamToResponse` method to pipe the stream data to the server response. ```ts filename='index.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { createServer } from 'http'; createServer(async (req, res) => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeDataStreamToResponse(res); }).listen(8080); ``` ### Sending Custom Data `pipeDataStreamToResponse` can be used to send custom data to the client. ```ts filename='index.ts' highlight="6-9,16" import { openai } from '@ai-sdk/openai'; import { pipeDataStreamToResponse, streamText } from 'ai'; import { createServer } from 'http'; createServer(async (req, res) => { // immediately start streaming the response pipeDataStreamToResponse(res, { execute: async dataStreamWriter => { dataStreamWriter.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.mergeIntoDataStream(dataStreamWriter); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); }).listen(8080); ``` ### Text Stream You can send a text stream to the client using `pipeTextStreamToResponse`. ```ts filename='index.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { createServer } from 'http'; createServer(async (req, res) => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeTextStreamToResponse(res); }).listen(8080); ``` --- title: Express description: Learn how to use the AI SDK in an Express server tags: ['api servers', 'streaming'] --- # Express You can use the AI SDK in an [Express](https://expressjs.com/) server to generate and stream text and objects to the client. ## Examples The examples start a simple HTTP server that listens on port 8080. You can e.g. test it using `curl`: ```bash curl -X POST http://localhost:8080 ``` The examples use the OpenAI `gpt-4o` model. Ensure that the OpenAI API key is set in the `OPENAI_API_KEY` environment variable. **Full example**: [github.com/vercel/ai/examples/express](https://github.com/vercel/ai/tree/main/examples/express) ### Data Stream You can use the `pipeDataStreamToResponse` method to pipe the stream data to the server response. 
```ts filename='index.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import express, { Request, Response } from 'express'; const app = express(); app.post('/', async (req: Request, res: Response) => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeDataStreamToResponse(res); }); app.listen(8080, () => { console.log(`Example app listening on port ${8080}`); }); ``` ### Sending Custom Data `pipeDataStreamToResponse` can be used to send custom data to the client. ```ts filename='index.ts' highlight="8-11,18" import { openai } from '@ai-sdk/openai'; import { pipeDataStreamToResponse, streamText } from 'ai'; import express, { Request, Response } from 'express'; const app = express(); app.post('/stream-data', async (req: Request, res: Response) => { // immediately start streaming the response pipeDataStreamToResponse(res, { execute: async dataStreamWriter => { dataStreamWriter.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.mergeIntoDataStream(dataStreamWriter); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); }); app.listen(8080, () => { console.log(`Example app listening on port ${8080}`); }); ``` ### Text Stream You can send a text stream to the client using `pipeTextStreamToResponse`. ```ts filename='index.ts' highlight="13" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import express, { Request, Response } from 'express'; const app = express(); app.post('/', async (req: Request, res: Response) => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeTextStreamToResponse(res); }); app.listen(8080, () => { console.log(`Example app listening on port ${8080}`); }); ``` --- title: Hono description: Example of using the AI SDK in a Hono server. tags: ['api servers', 'streaming'] --- # Hono You can use the AI SDK in a [Hono](https://hono.dev/) server to generate and stream text and objects to the client. ## Examples The examples start a simple HTTP server that listens on port 8080. You can e.g. test it using `curl`: ```bash curl -X POST http://localhost:8080 ``` The examples use the OpenAI `gpt-4o` model. Ensure that the OpenAI API key is set in the `OPENAI_API_KEY` environment variable. **Full example**: [github.com/vercel/ai/examples/hono](https://github.com/vercel/ai/tree/main/examples/hono) ### Data Stream You can use the `toDataStream` method to get a data stream from the result and then pipe it to the response. 
```ts filename='index.ts' import { openai } from '@ai-sdk/openai'; import { serve } from '@hono/node-server'; import { streamText } from 'ai'; import { Hono } from 'hono'; import { stream } from 'hono/streaming'; const app = new Hono(); app.post('/', async c => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); // Mark the response as a v1 data stream: c.header('X-Vercel-AI-Data-Stream', 'v1'); c.header('Content-Type', 'text/plain; charset=utf-8'); return stream(c, stream => stream.pipe(result.toDataStream())); }); serve({ fetch: app.fetch, port: 8080 }); ``` ### Sending Custom Data `createDataStream` can be used to send custom data to the client. ```ts filename='index.ts' highlight="10-13,20" import { openai } from '@ai-sdk/openai'; import { serve } from '@hono/node-server'; import { createDataStream, streamText } from 'ai'; import { Hono } from 'hono'; import { stream } from 'hono/streaming'; const app = new Hono(); app.post('/stream-data', async c => { // immediately start streaming the response const dataStream = createDataStream({ execute: async dataStreamWriter => { dataStreamWriter.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.mergeIntoDataStream(dataStreamWriter); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); // Mark the response as a v1 data stream: c.header('X-Vercel-AI-Data-Stream', 'v1'); c.header('Content-Type', 'text/plain; charset=utf-8'); return stream(c, stream => stream.pipe(dataStream.pipeThrough(new TextEncoderStream())), ); }); serve({ fetch: app.fetch, port: 8080 }); ``` ### Text Stream You can use the `textStream` property to get a text stream from the result and then pipe it to the response. ```ts filename='index.ts' highlight="17" import { openai } from '@ai-sdk/openai'; import { serve } from '@hono/node-server'; import { streamText } from 'ai'; import { Hono } from 'hono'; import { stream } from 'hono/streaming'; const app = new Hono(); app.post('/', async c => { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); c.header('Content-Type', 'text/plain; charset=utf-8'); return stream(c, stream => stream.pipe(result.textStream)); }); serve({ fetch: app.fetch, port: 8080 }); ``` --- title: Fastify description: Learn how to use the AI SDK in a Fastify server tags: ['api servers', 'streaming'] --- # Fastify You can use the AI SDK in a [Fastify](https://fastify.dev/) server to generate and stream text and objects to the client. ## Examples The examples start a simple HTTP server that listens on port 8080. You can e.g. test it using `curl`: ```bash curl -X POST http://localhost:8080 ``` The examples use the OpenAI `gpt-4o` model. Ensure that the OpenAI API key is set in the `OPENAI_API_KEY` environment variable. **Full example**: [github.com/vercel/ai/examples/fastify](https://github.com/vercel/ai/tree/main/examples/fastify) ### Data Stream You can use the `toDataStream` method to get a data stream from the result and then pipe it to the response. 
```ts filename='index.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import Fastify from 'fastify'; const fastify = Fastify({ logger: true }); fastify.post('/', async function (request, reply) { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); // Mark the response as a v1 data stream: reply.header('X-Vercel-AI-Data-Stream', 'v1'); reply.header('Content-Type', 'text/plain; charset=utf-8'); return reply.send(result.toDataStream()); }); fastify.listen({ port: 8080 }); ``` ### Sending Custom Data `createDataStream` can be used to send custom data to the client. ```ts filename='index.ts' highlight="8-11,18" import { openai } from '@ai-sdk/openai'; import { createDataStream, streamText } from 'ai'; import Fastify from 'fastify'; const fastify = Fastify({ logger: true }); fastify.post('/stream-data', async function (request, reply) { // immediately start streaming the response const dataStream = createDataStream({ execute: async dataStreamWriter => { dataStreamWriter.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.mergeIntoDataStream(dataStreamWriter); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); // Mark the response as a v1 data stream: reply.header('X-Vercel-AI-Data-Stream', 'v1'); reply.header('Content-Type', 'text/plain; charset=utf-8'); return reply.send(dataStream); }); fastify.listen({ port: 8080 }); ``` ### Text Stream You can use the `textStream` property to get a text stream from the result and then pipe it to the response. ```ts filename='index.ts' highlight="15" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import Fastify from 'fastify'; const fastify = Fastify({ logger: true }); fastify.post('/', async function (request, reply) { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); reply.header('Content-Type', 'text/plain; charset=utf-8'); return reply.send(result.textStream); }); fastify.listen({ port: 8080 }); ``` --- title: Nest.js description: Learn how to use the AI SDK in a Nest.js server tags: ['api servers', 'streaming'] --- # Nest.js You can use the AI SDK in a [Nest.js](https://nestjs.com/) server to generate and stream text and objects to the client. ## Examples The examples show how to implement a Nest.js controller that uses the AI SDK to stream text and objects to the client. **Full example**: [github.com/vercel/ai/examples/nest](https://github.com/vercel/ai/tree/main/examples/nest) ### Data Stream You can use the `pipeDataStreamToResponse` method to get a data stream from the result and then pipe it to the response. ```ts filename='app.controller.ts' import { Controller, Post, Res } from '@nestjs/common'; import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { Response } from 'express'; @Controller() export class AppController { @Post() async example(@Res() res: Response) { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeDataStreamToResponse(res); } } ``` ### Sending Custom Data `pipeDataStreamToResponse` can be used to send custom data to the client.
```ts filename='app.controller.ts' highlight="10-12,19" import { Controller, Post, Res } from '@nestjs/common'; import { openai } from '@ai-sdk/openai'; import { pipeDataStreamToResponse, streamText } from 'ai'; import { Response } from 'express'; @Controller() export class AppController { @Post('/stream-data') async streamData(@Res() res: Response) { pipeDataStreamToResponse(res, { execute: async dataStreamWriter => { dataStreamWriter.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.mergeIntoDataStream(dataStreamWriter); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); } } ``` ### Text Stream You can use the `pipeTextStreamToResponse` method to get a text stream from the result and then pipe it to the response. ```ts filename='app.controller.ts' highlight="15" import { Controller, Post, Res } from '@nestjs/common'; import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { Response } from 'express'; @Controller() export class AppController { @Post() async example(@Res() res: Response) { const result = streamText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', }); result.pipeTextStreamToResponse(res); } } ``` --- title: Overview description: An overview of AI SDK Core. --- # AI SDK Core Large Language Models (LLMs) are advanced programs that can understand, create, and engage with human language on a large scale. They are trained on vast amounts of written material to recognize patterns in language and predict what might come next in a given piece of text. AI SDK Core **simplifies working with LLMs by offering a standardized way of integrating them into your app** - so you can focus on building great AI applications for your users, not waste time on technical details. For example, here’s how you can generate text with various models using the AI SDK: ## AI SDK Core Functions AI SDK Core has various functions designed for [text generation](./generating-text), [structured data generation](./generating-structured-data), and [tool usage](./tools-and-tool-calling). These functions take a standardized approach to setting up [prompts](./prompts) and [settings](./settings), making it easier to work with different models. - [`generateText`](/docs/ai-sdk-core/generating-text): Generates text and [tool calls](./tools-and-tool-calling). This function is ideal for non-interactive use cases such as automation tasks where you need to write text (e.g. drafting email or summarizing web pages) and for agents that use tools. - [`streamText`](/docs/ai-sdk-core/generating-text): Stream text and tool calls. You can use the `streamText` function for interactive use cases such as [chat bots](/docs/ai-sdk-ui/chatbot) and [content streaming](/docs/ai-sdk-ui/completion). - [`generateObject`](/docs/ai-sdk-core/generating-structured-data): Generates a typed, structured object that matches a [Zod](https://zod.dev/) schema. You can use this function to force the language model to return structured data, e.g. for information extraction, synthetic data generation, or classification tasks. - [`streamObject`](/docs/ai-sdk-core/generating-structured-data): Stream a structured object that matches a Zod schema. 
You can use this function to [stream generated UIs](/docs/ai-sdk-ui/object-generation). ## API Reference Please check out the [AI SDK Core API Reference](/docs/reference/ai-sdk-core) for more details on each function. --- title: Generating Text description: Learn how to generate text with the AI SDK. --- # Generating and Streaming Text Large language models (LLMs) can generate text in response to a prompt, which can contain instructions and information to process. For example, you can ask a model to come up with a recipe, draft an email, or summarize a document. The AI SDK Core provides two functions to generate text and stream it from LLMs: - [`generateText`](#generatetext): Generates text for a given prompt and model. - [`streamText`](#streamtext): Streams text from a given prompt and model. Advanced LLM features such as [tool calling](./tools-and-tool-calling) and [structured data generation](./generating-structured-data) are built on top of text generation. ## `generateText` You can generate text using the [`generateText`](/docs/reference/ai-sdk-core/generate-text) function. This function is ideal for non-interactive use cases where you need to write text (e.g. drafting email or summarizing web pages) and for agents that use tools. ```tsx import { generateText } from 'ai'; const { text } = await generateText({ model: yourModel, prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` You can use more [advanced prompts](./prompts) to generate text with more complex instructions and content: ```tsx import { generateText } from 'ai'; const { text } = await generateText({ model: yourModel, system: 'You are a professional writer. ' + 'You write simple, clear, and concise content.', prompt: `Summarize the following article in 3-5 sentences: ${article}`, }); ``` The result object of `generateText` contains several promises that resolve when all required data is available: - `result.text`: The generated text. - `result.reasoning`: The reasoning text of the model (only available for some models). - `result.sources`: Sources that have been used as input to generate the response (only available for some models). - `result.finishReason`: The reason the model finished generating text. - `result.usage`: The usage of the model during text generation. ### Accessing response headers & body Sometimes you need access to the full response from the model provider, e.g. to access some provider-specific headers or body content. You can access the raw response headers and body using the `response` property: ```ts import { generateText } from 'ai'; const result = await generateText({ // ... }); console.log(JSON.stringify(result.response.headers, null, 2)); console.log(JSON.stringify(result.response.body, null, 2)); ``` ## `streamText` Depending on your model and prompt, it can take a large language model (LLM) up to a minute to finish generating its response. This delay can be unacceptable for interactive use cases such as chatbots or real-time applications, where users expect immediate responses. AI SDK Core provides the [`streamText`](/docs/reference/ai-sdk-core/stream-text) function which simplifies streaming text from LLMs: ```ts import { streamText } from 'ai'; const result = streamText({ model: yourModel, prompt: 'Invent a new holiday and describe its traditions.', }); // example: use textStream as an async iterable for await (const textPart of result.textStream) { console.log(textPart); } ``` `result.textStream` is both a `ReadableStream` and an `AsyncIterable`. 
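Because `textStream` is also a `ReadableStream`, you can consume it with a stream reader instead of `for await`. Here is a minimal sketch (assuming `yourModel` stands in for a configured provider model, as in the examples above):

```ts
import { streamText } from 'ai';

const result = streamText({
  model: yourModel, // assumption: a configured provider model
  prompt: 'Invent a new holiday and describe its traditions.',
});

// example: consume textStream as a ReadableStream via a reader
const reader = result.textStream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(value);
}
```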
`streamText` immediately starts streaming and suppresses errors to prevent server crashes. Use the `onError` callback to log errors. You can use `streamText` on its own or in combination with [AI SDK UI](/examples/next-pages/basics/streaming-text-generation) and [AI SDK RSC](/examples/next-app/basics/streaming-text-generation). The result object contains several helper functions to make the integration into [AI SDK UI](/docs/ai-sdk-ui) easier: - `result.toDataStreamResponse()`: Creates a data stream HTTP response (with tool calls etc.) that can be used in a Next.js App Router API route. - `result.pipeDataStreamToResponse()`: Writes data stream delta output to a Node.js response-like object. - `result.toTextStreamResponse()`: Creates a simple text stream HTTP response. - `result.pipeTextStreamToResponse()`: Writes text delta output to a Node.js response-like object. `streamText` is using backpressure and only generates tokens as they are requested. You need to consume the stream in order for it to finish. It also provides several promises that resolve when the stream is finished: - `result.text`: The generated text. - `result.reasoning`: The reasoning text of the model (only available for some models). - `result.sources`: Sources that have been used as input to generate the response (only available for some models). - `result.finishReason`: The reason the model finished generating text. - `result.usage`: The usage of the model during text generation. ### `onError` callback `streamText` immediately starts streaming to enable sending data without waiting for the model. Errors become part of the stream and are not thrown to prevent e.g. servers from crashing. To log errors, you can provide an `onError` callback that is triggered when an error occurs. ```tsx highlight="6-8" import { streamText } from 'ai'; const result = streamText({ model: yourModel, prompt: 'Invent a new holiday and describe its traditions.', onError({ error }) { console.error(error); // your error logging logic here }, }); ``` ### `onChunk` callback When using `streamText`, you can provide an `onChunk` callback that is triggered for each chunk of the stream. It receives the following chunk types: - `text-delta` - `reasoning` - `source` - `tool-call` - `tool-result` - `tool-call-streaming-start` (when `toolCallStreaming` is enabled) - `tool-call-delta` (when `toolCallStreaming` is enabled) ```tsx highlight="6-11" import { streamText } from 'ai'; const result = streamText({ model: yourModel, prompt: 'Invent a new holiday and describe its traditions.', onChunk({ chunk }) { // implement your own logic here, e.g.: if (chunk.type === 'text-delta') { console.log(chunk.text); } }, }); ``` ### `onFinish` callback When using `streamText`, you can provide an `onFinish` callback that is triggered when the stream is finished ( [API Reference](/docs/reference/ai-sdk-core/stream-text#on-finish) ). It contains the text, usage information, finish reason, messages, and more: ```tsx highlight="6-8" import { streamText } from 'ai'; const result = streamText({ model: yourModel, prompt: 'Invent a new holiday and describe its traditions.', onFinish({ text, finishReason, usage, response }) { // your own logic, e.g. for saving the chat history or recording usage const messages = response.messages; // messages that were generated }, }); ``` ### `fullStream` property You can read a stream with all events using the `fullStream` property. This can be useful if you want to implement your own UI or handle the stream in a different way. 
Here is an example of how to use the `fullStream` property: ```tsx import { streamText } from 'ai'; import { z } from 'zod'; const result = streamText({ model: yourModel, tools: { cityAttractions: { parameters: z.object({ city: z.string() }), execute: async ({ city }) => ({ attractions: ['attraction1', 'attraction2', 'attraction3'], }), }, }, prompt: 'What are some San Francisco tourist attractions?', }); for await (const part of result.fullStream) { switch (part.type) { case 'text-delta': { // handle text delta here break; } case 'reasoning': { // handle reasoning here break; } case 'source': { // handle source here break; } case 'tool-call': { switch (part.toolName) { case 'cityAttractions': { // handle tool call here break; } } break; } case 'tool-result': { switch (part.toolName) { case 'cityAttractions': { // handle tool result here break; } } break; } case 'finish': { // handle finish here break; } case 'error': { // handle error here break; } } } ``` ### Stream transformation You can use the `experimental_transform` option to transform the stream. This is useful for e.g. filtering, changing, or smoothing the text stream. The transformations are applied before the callbacks are invoked and the promises are resolved. If you e.g. have a transformation that changes all text to uppercase, the `onFinish` callback will receive the transformed text. #### Smoothing streams The AI SDK Core provides a [`smoothStream` function](/docs/reference/ai-sdk-core/smooth-stream) that can be used to smooth out text streaming. ```tsx highlight="6" import { smoothStream, streamText } from 'ai'; const result = streamText({ model, prompt, experimental_transform: smoothStream(), }); ``` #### Custom transformations You can also implement your own custom transformations. The transformation function receives the tools that are available to the model, and returns a function that is used to transform the stream. Tools can either be generic or limited to the tools that you are using. Here is an example of how to implement a custom transformation that converts all text to uppercase: ```ts const upperCaseTransform = <TOOLS extends ToolSet>() => (options: { tools: TOOLS; stopStream: () => void }) => new TransformStream<TextStreamPart<TOOLS>, TextStreamPart<TOOLS>>({ transform(chunk, controller) { controller.enqueue( // for text-delta chunks, convert the text to uppercase: chunk.type === 'text-delta' ? { ...chunk, textDelta: chunk.textDelta.toUpperCase() } : chunk, ); }, }); ``` You can also stop the stream using the `stopStream` function. This is e.g. useful if you want to stop the stream when model guardrails are violated, e.g. by generating inappropriate content. When you invoke `stopStream`, it is important to simulate the `step-finish` and `finish` events to guarantee that a well-formed stream is returned and all callbacks are invoked. ```ts const stopWordTransform = <TOOLS extends ToolSet>() => ({ stopStream }: { stopStream: () => void }) => new TransformStream<TextStreamPart<TOOLS>, TextStreamPart<TOOLS>>({ // note: this is a simplified transformation for testing; // in a real-world version there would need to be // stream buffering and scanning to correctly emit prior text // and to detect all STOP occurrences.
transform(chunk, controller) { if (chunk.type !== 'text-delta') { controller.enqueue(chunk); return; } if (chunk.textDelta.includes('STOP')) { // stop the stream stopStream(); // simulate the step-finish event controller.enqueue({ type: 'step-finish', finishReason: 'stop', logprobs: undefined, usage: { completionTokens: NaN, promptTokens: NaN, totalTokens: NaN, }, request: {}, response: { id: 'response-id', modelId: 'mock-model-id', timestamp: new Date(0), }, warnings: [], isContinued: false, }); // simulate the finish event controller.enqueue({ type: 'finish', finishReason: 'stop', logprobs: undefined, usage: { completionTokens: NaN, promptTokens: NaN, totalTokens: NaN, }, response: { id: 'response-id', modelId: 'mock-model-id', timestamp: new Date(0), }, }); return; } controller.enqueue(chunk); }, }); ``` #### Multiple transformations You can also provide multiple transformations. They are applied in the order they are provided. ```tsx highlight="4" const result = streamText({ model, prompt, experimental_transform: [firstTransform, secondTransform], }); ``` ## Sources Some providers such as [Perplexity](/providers/ai-sdk-providers/perplexity#sources) and [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai#sources) include sources in the response. Currently sources are limited to web pages that ground the response. You can access them using the `sources` property of the result. Each `url` source contains the following properties: - `id`: The ID of the source. - `url`: The URL of the source. - `title`: The optional title of the source. - `providerMetadata`: Provider metadata for the source. When you use `generateText`, you can access the sources using the `sources` property: ```ts const result = await generateText({ model: google('gemini-2.0-flash-exp', { useSearchGrounding: true }), prompt: 'List the top 5 San Francisco news from the past week.', }); for (const source of result.sources) { if (source.sourceType === 'url') { console.log('ID:', source.id); console.log('Title:', source.title); console.log('URL:', source.url); console.log('Provider metadata:', source.providerMetadata); console.log(); } } ``` When you use `streamText`, you can access the sources using the `fullStream` property: ```tsx const result = streamText({ model: google('gemini-2.0-flash-exp', { useSearchGrounding: true }), prompt: 'List the top 5 San Francisco news from the past week.', }); for await (const part of result.fullStream) { if (part.type === 'source' && part.source.sourceType === 'url') { console.log('ID:', part.source.id); console.log('Title:', part.source.title); console.log('URL:', part.source.url); console.log('Provider metadata:', part.source.providerMetadata); console.log(); } } ``` The sources are also available in the `result.sources` promise. ## Generating Long Text Most language models have an output limit that is much shorter than their context window. This means that you cannot generate long text in one go, but it is possible to add responses back to the input and continue generating to create longer text. 
`generateText` and `streamText` support such continuations for long text generation using the experimental `continueSteps` setting: ```tsx highlight="5-6,9-10" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const { text, // combined text usage, // combined usage of all steps } = await generateText({ model: openai('gpt-4o'), // 4096 output tokens maxSteps: 5, // enable multi-step calls experimental_continueSteps: true, prompt: 'Write a book about Roman history, ' + 'from the founding of the city of Rome ' + 'to the fall of the Western Roman Empire. ' + 'Each chapter MUST HAVE at least 1000 words.', }); ``` When `experimental_continueSteps` is enabled, only full words are streamed in `streamText`, and both `generateText` and `streamText` might drop the trailing tokens of some calls to prevent whitespace issues. Some models might not always stop correctly on their own and keep generating until `maxSteps` is reached. You can hint the model to stop by e.g. using a system message such as "Stop when sufficient information was provided." ## Examples You can see `generateText` and `streamText` in action using various frameworks in the following examples: ### `generateText` ### `streamText` --- title: Generating Structured Data description: Learn how to generate structured data with the AI SDK. --- # Generating Structured Data While text generation can be useful, your use case will likely call for generating structured data. For example, you might want to extract information from text, classify data, or generate synthetic data. Many language models are capable of generating structured data, often defined as using "JSON modes" or "tools". However, you need to manually provide schemas and then validate the generated data as LLMs can produce incorrect or incomplete structured data. The AI SDK standardises structured object generation across model providers with the [`generateObject`](/docs/reference/ai-sdk-core/generate-object) and [`streamObject`](/docs/reference/ai-sdk-core/stream-object) functions. You can use both functions with different output strategies, e.g. `array`, `object`, or `no-schema`, and with different generation modes, e.g. `auto`, `tool`, or `json`. You can use [Zod schemas](/docs/reference/ai-sdk-core/zod-schema), [Valibot](/docs/reference/ai-sdk-core/valibot-schema), or [JSON schemas](/docs/reference/ai-sdk-core/json-schema) to specify the shape of the data that you want, and the AI model will generate data that conforms to that structure. You can pass Zod objects directly to the AI SDK functions or use the `zodSchema` helper function. ## Generate Object The `generateObject` generates structured data from a prompt. The schema is also used to validate the generated data, ensuring type safety and correctness. ```ts import { generateObject } from 'ai'; import { z } from 'zod'; const { object } = await generateObject({ model: yourModel, schema: z.object({ recipe: z.object({ name: z.string(), ingredients: z.array(z.object({ name: z.string(), amount: z.string() })), steps: z.array(z.string()), }), }), prompt: 'Generate a lasagna recipe.', }); ``` See `generateObject` in action with [these examples](#more-examples) ### Accessing response headers & body Sometimes you need access to the full response from the model provider, e.g. to access some provider-specific headers or body content. You can access the raw response headers and body using the `response` property: ```ts import { generateText } from 'ai'; const result = await generateText({ // ... 
}); console.log(JSON.stringify(result.response.headers, null, 2)); console.log(JSON.stringify(result.response.body, null, 2)); ``` ## Stream Object Given the added complexity of returning structured data, model response time can be unacceptable for your interactive use case. With the [`streamObject`](/docs/reference/ai-sdk-core/stream-object) function, you can stream the model's response as it is generated. ```ts import { streamObject } from 'ai'; const { partialObjectStream } = streamObject({ // ... }); // use partialObjectStream as an async iterable for await (const partialObject of partialObjectStream) { console.log(partialObject); } ``` You can use `streamObject` to stream generated UIs in combination with React Server Components (see [Generative UI](../ai-sdk-rsc))) or the [`useObject`](/docs/reference/ai-sdk-ui/use-object) hook. See `streamObject` in action with [these examples](#more-examples) ### `onError` callback `streamObject` immediately starts streaming. Errors become part of the stream and are not thrown to prevent e.g. servers from crashing. To log errors, you can provide an `onError` callback that is triggered when an error occurs. ```tsx highlight="5-7" import { streamObject } from 'ai'; const result = streamObject({ // ... onError({ error }) { console.error(error); // your error logging logic here }, }); ``` ## Output Strategy You can use both functions with different output strategies, e.g. `array`, `object`, or `no-schema`. ### Object The default output strategy is `object`, which returns the generated data as an object. You don't need to specify the output strategy if you want to use the default. ### Array If you want to generate an array of objects, you can set the output strategy to `array`. When you use the `array` output strategy, the schema specifies the shape of an array element. With `streamObject`, you can also stream the generated array elements using `elementStream`. ```ts highlight="7,18" import { openai } from '@ai-sdk/openai'; import { streamObject } from 'ai'; import { z } from 'zod'; const { elementStream } = streamObject({ model: openai('gpt-4-turbo'), output: 'array', schema: z.object({ name: z.string(), class: z .string() .describe('Character class, e.g. warrior, mage, or thief.'), description: z.string(), }), prompt: 'Generate 3 hero descriptions for a fantasy role playing game.', }); for await (const hero of elementStream) { console.log(hero); } ``` ### Enum If you want to generate a specific enum value, e.g. for classification tasks, you can set the output strategy to `enum` and provide a list of possible values in the `enum` parameter. Enum output is only available with `generateObject`. ```ts highlight="5-6" import { generateObject } from 'ai'; const { object } = await generateObject({ model: yourModel, output: 'enum', enum: ['action', 'comedy', 'drama', 'horror', 'sci-fi'], prompt: 'Classify the genre of this movie plot: ' + '"A group of astronauts travel through a wormhole in search of a ' + 'new habitable planet for humanity."', }); ``` ### No Schema In some cases, you might not want to use a schema, for example when the data is a dynamic user request. You can use the `output` setting to set the output format to `no-schema` in those cases and omit the schema parameter. 
```ts highlight="6" import { openai } from '@ai-sdk/openai'; import { generateObject } from 'ai'; const { object } = await generateObject({ model: openai('gpt-4-turbo'), output: 'no-schema', prompt: 'Generate a lasagna recipe.', }); ``` ## Generation Mode While some models (like OpenAI) natively support object generation, others require alternative methods, like modified [tool calling](/docs/ai-sdk-core/tools-and-tool-calling). The `generateObject` function allows you to specify the method it will use to return structured data. - `auto`: The provider will choose the best mode for the model. This recommended mode is used by default. - `tool`: A tool with the JSON schema as parameters is provided and the provider is instructed to use it. - `json`: The response format is set to JSON when supported by the provider, e.g. via json modes or grammar-guided generation. If grammar-guided generation is not supported, the JSON schema and instructions to generate JSON that conforms to the schema are injected into the system prompt. Please note that not every provider supports all generation modes. Some providers do not support object generation at all. ## Schema Name and Description You can optionally specify a name and description for the schema. These are used by some providers for additional LLM guidance, e.g. via tool or schema name. ```ts highlight="6-7" import { generateObject } from 'ai'; import { z } from 'zod'; const { object } = await generateObject({ model: yourModel, schemaName: 'Recipe', schemaDescription: 'A recipe for a dish.', schema: z.object({ name: z.string(), ingredients: z.array(z.object({ name: z.string(), amount: z.string() })), steps: z.array(z.string()), }), prompt: 'Generate a lasagna recipe.', }); ``` ## Error Handling When `generateObject` cannot generate a valid object, it throws a [`AI_NoObjectGeneratedError`](/docs/reference/ai-sdk-errors/ai-no-object-generated-error). This error occurs when the AI provider fails to generate a parsable object that conforms to the schema. It can arise due to the following reasons: - The model failed to generate a response. - The model generated a response that could not be parsed. - The model generated a response that could not be validated against the schema. The error preserves the following information to help you log the issue: - `text`: The text that was generated by the model. This can be the raw text or the tool call text, depending on the object generation mode. - `response`: Metadata about the language model response, including response id, timestamp, and model. - `usage`: Request token usage. - `cause`: The cause of the error (e.g. a JSON parsing error). You can use this for more detailed error handling. ```ts import { generateObject, NoObjectGeneratedError } from 'ai'; try { await generateObject({ model, schema, prompt }); } catch (error) { if (NoObjectGeneratedError.isInstance(error)) { console.log('NoObjectGeneratedError'); console.log('Cause:', error.cause); console.log('Text:', error.text); console.log('Response:', error.response); console.log('Usage:', error.usage); } } ``` ## Repairing Invalid or Malformed JSON The `repairText` function is experimental and may change in the future. Sometimes the model will generate invalid or malformed JSON. You can use the `repairText` function to attempt to repair the JSON. It receives the error, either a `JSONParseError` or a `TypeValidationError`, and the text that was generated by the model. You can then attempt to repair the text and return the repaired text. 
```ts highlight="7-10" import { generateObject } from 'ai'; const { object } = await generateObject({ model, schema, prompt, experimental_repairText: async ({ text, error }) => { // example: add a closing brace to the text return text + '}'; }, }); ``` ## Structured outputs with `generateText` and `streamText` You can generate structured data with `generateText` and `streamText` by using the `experimental_output` setting. Some models, e.g. those by OpenAI, support structured outputs and tool calling at the same time. This is only possible with `generateText` and `streamText`. Structured output generation with `generateText` and `streamText` is experimental and may change in the future. ### `generateText` ```ts highlight="2,4-18" // experimental_output is a structured object that matches the schema: const { experimental_output } = await generateText({ // ... experimental_output: Output.object({ schema: z.object({ name: z.string(), age: z.number().nullable().describe('Age of the person.'), contact: z.object({ type: z.literal('email'), value: z.string(), }), occupation: z.object({ type: z.literal('employed'), company: z.string(), position: z.string(), }), }), }), prompt: 'Generate an example person for testing.', }); ``` ### `streamText` ```ts highlight="2,4-18" // experimental_partialOutputStream contains generated partial objects: const { experimental_partialOutputStream } = await streamText({ // ... experimental_output: Output.object({ schema: z.object({ name: z.string(), age: z.number().nullable().describe('Age of the person.'), contact: z.object({ type: z.literal('email'), value: z.string(), }), occupation: z.object({ type: z.literal('employed'), company: z.string(), position: z.string(), }), }), }), prompt: 'Generate an example person for testing.', }); ``` ## More Examples You can see `generateObject` and `streamObject` in action using various frameworks in the following examples: ### `generateObject` ### `streamObject` --- title: Tool Calling description: Learn about tool calling and multi-step calls (using maxSteps) with AI SDK Core. --- # Tool Calling As covered under Foundations, [tools](/docs/foundations/tools) are objects that can be called by the model to perform a specific task. AI SDK Core tools contain three elements: - **`description`**: An optional description of the tool that can influence when the tool is picked. - **`parameters`**: A [Zod schema](/docs/foundations/tools#schemas) or a [JSON schema](/docs/reference/ai-sdk-core/json-schema) that defines the parameters. The schema is consumed by the LLM, and also used to validate the LLM tool calls. - **`execute`**: An optional async function that is called with the arguments from the tool call. It produces a value of type `RESULT` (generic type). It is optional because you might want to forward tool calls to the client or to a queue instead of executing them in the same process. You can use the [`tool`](/docs/reference/ai-sdk-core/tool) helper function to infer the types of the `execute` parameters. 
The `tools` parameter of `generateText` and `streamText` is an object that has the tool names as keys and the tools as values: ```ts highlight="6-17" import { z } from 'zod'; import { generateText, tool } from 'ai'; const result = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, prompt: 'What is the weather in San Francisco?', }); ``` When a model uses a tool, it is called a "tool call" and the output of the tool is called a "tool result". Tool calling is not restricted to only text generation. You can also use it to render user interfaces (Generative UI). ## Multi-Step Calls (using maxSteps) With the `maxSteps` setting, you can enable multi-step calls in `generateText` and `streamText`. When `maxSteps` is set to a number greater than 1 and the model generates a tool call, the AI SDK will trigger a new generation passing in the tool result until there are no further tool calls or the maximum number of tool steps is reached. To decide what value to set for `maxSteps`, consider the most complex task the call might handle and the number of sequential steps required for completion, rather than just the number of available tools. By default, when you use `generateText` or `streamText`, it triggers a single generation (`maxSteps: 1`). This works well for many use cases where you can rely on the model's training data to generate a response. However, when you provide tools, the model now has the choice to either generate a normal text response, or generate a tool call. If the model generates a tool call, its generation is complete and that step is finished. You may want the model to generate text after the tool has been executed, e.g. to summarize the tool results in the context of the user's query. In many cases, you may also want the model to use multiple tools in a single response. This is where multi-step calls come in. You can think of multi-step calls in a similar way to a conversation with a human. When you ask a question, if the person does not have the requisite knowledge at hand (a model's training data), they may need to look up information (use a tool) before they can provide you with an answer. In the same way, the model may need to call a tool to get the information it needs to answer your question where each generation (tool call or text generation) is a step. ### Example In the following example, there are two steps: 1. **Step 1** 1. The prompt `'What is the weather in San Francisco?'` is sent to the model. 1. The model generates a tool call. 1. The tool call is executed. 1. **Step 2** 1. The tool result is sent to the model. 1. The model generates a response considering the tool result. ```ts highlight="18" import { z } from 'zod'; import { generateText, tool } from 'ai'; const { text, steps } = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, maxSteps: 5, // allow up to 5 steps prompt: 'What is the weather in San Francisco?', }); ``` You can use `streamText` in a similar way.
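Here is a minimal sketch of the streaming variant, under the same assumptions as above (`yourModel` stands in for a configured provider model):

```ts
import { z } from 'zod';
import { streamText, tool } from 'ai';

const result = streamText({
  model: yourModel, // assumption: a configured provider model
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      parameters: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  maxSteps: 5, // allow up to 5 steps
  prompt: 'What is the weather in San Francisco?',
});

// the text stream spans all steps, including text generated after tool results
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```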
### Steps To access intermediate tool calls and results, you can use the `steps` property in the result object or the `streamText` `onFinish` callback. It contains all the text, tool calls, tool results, and more from each step. #### Example: Extract tool results from all steps ```ts highlight="3,9-10" import { generateText } from 'ai'; const { steps } = await generateText({ model: openai('gpt-4-turbo'), maxSteps: 10, // ... }); // extract all tool calls from the steps: const allToolCalls = steps.flatMap(step => step.toolCalls); ``` ### `onStepFinish` callback When using `generateText` or `streamText`, you can provide an `onStepFinish` callback that is triggered when a step is finished, i.e. all text deltas, tool calls, and tool results for the step are available. When you have multiple steps, the callback is triggered for each step. ```tsx highlight="5-7" import { generateText } from 'ai'; const result = await generateText({ // ... onStepFinish({ text, toolCalls, toolResults, finishReason, usage }) { // your own logic, e.g. for saving the chat history or recording usage }, }); ``` ## Response Messages Adding the generated assistant and tool messages to your conversation history is a common task, especially if you are using multi-step tool calls. Both `generateText` and `streamText` have a `response.messages` property that you can use to add the assistant and tool messages to your conversation history. It is also available in the `onFinish` callback of `streamText`. The `response.messages` property contains an array of `CoreMessage` objects that you can add to your conversation history: ```ts import { generateText } from 'ai'; const messages: CoreMessage[] = [ // ... ]; const { response } = await generateText({ // ... messages, }); // add the response messages to your conversation history: messages.push(...response.messages); // streamText: ...((await response).messages) ``` ## Tool Choice You can use the `toolChoice` setting to influence when a tool is selected. It supports the following settings: - `auto` (default): the model can choose whether and which tools to call. - `required`: the model must call a tool. It can choose which tool to call. - `none`: the model must not call tools - `{ type: 'tool', toolName: string (typed) }`: the model must call the specified tool ```ts highlight="18" import { z } from 'zod'; import { generateText, tool } from 'ai'; const result = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, toolChoice: 'required', // force the model to call a tool prompt: 'What is the weather in San Francisco?', }); ``` ## Tool Execution Options When tools are called, they receive additional options as a second parameter. ### Tool Call ID The ID of the tool call is forwarded to the tool execution. You can use it e.g. when sending tool-call related information with stream data. ```ts highlight="14-20" import { StreamData, streamText, tool } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); const data = new StreamData(); const result = streamText({ // ... messages, tools: { myTool: tool({ // ... execute: async (args, { toolCallId }) => { // return e.g. 
custom status for tool call data.appendMessageAnnotation({ type: 'tool-status', toolCallId, status: 'in-progress', }); // ... }, }), }, onFinish() { data.close(); }, }); return result.toDataStreamResponse({ data }); } ``` ### Messages The messages that were sent to the language model to initiate the response that contained the tool call are forwarded to the tool execution. You can access them in the second parameter of the `execute` function. In multi-step calls, the messages contain the text, tool calls, and tool results from all previous steps. ```ts highlight="8-9" import { generateText, tool } from 'ai'; const result = await generateText({ // ... tools: { myTool: tool({ // ... execute: async (args, { messages }) => { // use the message history in e.g. calls to other language models return something; }, }), }, }); ``` ### Abort Signals The abort signals from `generateText` and `streamText` are forwarded to the tool execution. You can access them in the second parameter of the `execute` function and e.g. abort long-running computations or forward them to fetch calls inside tools. ```ts highlight="6,11,14" import { z } from 'zod'; import { generateText, tool } from 'ai'; const result = await generateText({ model: yourModel, abortSignal: myAbortSignal, // signal that will be forwarded to tools tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string() }), execute: async ({ location }, { abortSignal }) => { return fetch( `https://api.weatherapi.com/v1/current.json?q=${location}`, { signal: abortSignal }, // forward the abort signal to fetch ); }, }), }, prompt: 'What is the weather in San Francisco?', }); ``` ## Types Modularizing your code often requires defining types to ensure type safety and reusability. To enable this, the AI SDK provides several helper types for tools, tool calls, and tool results. You can use them to strongly type your variables, function parameters, and return types in parts of the code that are not directly related to `streamText` or `generateText`. Each tool call is typed with `ToolCall`, depending on the tool that has been invoked. Similarly, the tool results are typed with `ToolResult`. The tools in `streamText` and `generateText` are defined as a `ToolSet`. The type inference helpers `ToolCallUnion` and `ToolResultUnion` can be used to extract the tool call and tool result types from the tools. 
```ts highlight="18-19,23-24" import { openai } from '@ai-sdk/openai'; import { ToolCallUnion, ToolResultUnion, generateText, tool } from 'ai'; import { z } from 'zod'; const myToolSet = { firstTool: tool({ description: 'Greets the user', parameters: z.object({ name: z.string() }), execute: async ({ name }) => `Hello, ${name}!`, }), secondTool: tool({ description: 'Tells the user their age', parameters: z.object({ age: z.number() }), execute: async ({ age }) => `You are ${age} years old!`, }), }; type MyToolCall = ToolCallUnion; type MyToolResult = ToolResultUnion; async function generateSomething(prompt: string): Promise<{ text: string; toolCalls: Array; // typed tool calls toolResults: Array; // typed tool results }> { return generateText({ model: openai('gpt-4o'), tools: myToolSet, prompt, }); } ``` ## Handling Errors The AI SDK has three tool-call related errors: - [`NoSuchToolError`](/docs/reference/ai-sdk-errors/ai-no-such-tool-error): the model tries to call a tool that is not defined in the tools object - [`InvalidToolArgumentsError`](/docs/reference/ai-sdk-errors/ai-invalid-tool-arguments-error): the model calls a tool with arguments that do not match the tool's parameters - [`ToolExecutionError`](/docs/reference/ai-sdk-errors/ai-tool-execution-error): an error that occurred during tool execution - [`ToolCallRepairError`](/docs/reference/ai-sdk-errors/ai-tool-call-repair-error): an error that occurred during tool call repair ### `generateText` `generateText` throws errors and can be handled using a `try`/`catch` block: ```ts try { const result = await generateText({ //... }); } catch (error) { if (NoSuchToolError.isInstance(error)) { // handle the no such tool error } else if (InvalidToolArgumentsError.isInstance(error)) { // handle the invalid tool arguments error } else if (ToolExecutionError.isInstance(error)) { // handle the tool execution error } else { // handle other errors } } ``` ### `streamText` `streamText` sends the errors as part of the full stream. The error parts contain the error object. When using `toDataStreamResponse`, you can pass an `getErrorMessage` function to extract the error message from the error part and forward it as part of the data stream response: ```ts const result = streamText({ // ... }); return result.toDataStreamResponse({ getErrorMessage: error => { if (NoSuchToolError.isInstance(error)) { return 'The model tried to call a unknown tool.'; } else if (InvalidToolArgumentsError.isInstance(error)) { return 'The model called a tool with invalid arguments.'; } else if (ToolExecutionError.isInstance(error)) { return 'An error occurred during tool execution.'; } else { return 'An unknown error occurred.'; } }, }); ``` ## Tool Call Repair The tool call repair feature is experimental and may change in the future. Language models sometimes fail to generate valid tool calls, especially when the parameters are complex or the model is smaller. You can use the `experimental_repairToolCall` function to attempt to repair the tool call with a custom function. You can use different strategies to repair the tool call: - Use a model with structured outputs to generate the arguments. - Send the messages, system prompt, and tool schema to a stronger model to generate the arguments. - Provide more specific repair instructions based on which tool was called. 
### Example: Use a model with structured outputs for repair ```ts import { openai } from '@ai-sdk/openai'; import { generateObject, generateText, NoSuchToolError, tool } from 'ai'; const result = await generateText({ model, tools, prompt, experimental_repairToolCall: async ({ toolCall, tools, parameterSchema, error, }) => { if (NoSuchToolError.isInstance(error)) { return null; // do not attempt to fix invalid tool names } const tool = tools[toolCall.toolName as keyof typeof tools]; const { object: repairedArgs } = await generateObject({ model: openai('gpt-4o', { structuredOutputs: true }), schema: tool.parameters, prompt: [ `The model tried to call the tool "${toolCall.toolName}"` + ` with the following arguments:`, JSON.stringify(toolCall.args), `The tool accepts the following schema:`, JSON.stringify(parameterSchema(toolCall)), 'Please fix the arguments.', ].join('\n'), }); return { ...toolCall, args: JSON.stringify(repairedArgs) }; }, }); ``` ### Example: Use the re-ask strategy for repair ```ts import { openai } from '@ai-sdk/openai'; import { generateObject, generateText, NoSuchToolError, tool } from 'ai'; const result = await generateText({ model, tools, prompt, experimental_repairToolCall: async ({ toolCall, tools, error, messages, system, }) => { const result = await generateText({ model, system, messages: [ ...messages, { role: 'assistant', content: [ { type: 'tool-call', toolCallId: toolCall.toolCallId, toolName: toolCall.toolName, args: toolCall.args, }, ], }, { role: 'tool' as const, content: [ { type: 'tool-result', toolCallId: toolCall.toolCallId, toolName: toolCall.toolName, result: error.message, }, ], }, ], tools, }); const newToolCall = result.toolCalls.find( newToolCall => newToolCall.toolName === toolCall.toolName, ); return newToolCall != null ? { toolCallType: 'function' as const, toolCallId: toolCall.toolCallId, toolName: toolCall.toolName, args: JSON.stringify(newToolCall.args), } : null; }, }); ``` ## Active Tools The `activeTools` property is experimental and may change in the future. Language models can only handle a limited number of tools at a time, depending on the model. To allow for static typing using a large number of tools and limiting the available tools to the model at the same time, the AI SDK provides the `experimental_activeTools` property. It is an array of tool names that are currently active. By default, the value is `undefined` and all tools are active. ```ts highlight="7" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const { text } = await generateText({ model: openai('gpt-4o'), tools: myToolSet, experimental_activeTools: ['firstTool'], }); ``` ## Multi-modal Tool Results Multi-modal tool results are experimental and only supported by Anthropic. In order to send multi-modal tool results, e.g. screenshots, back to the model, they need to be converted into a specific format. AI SDK Core tools have an optional `experimental_toToolResultContent` function that converts the tool result into a content part. Here is an example for converting a screenshot into a content part: ```ts highlight="22-27" const result = await generateText({ model: anthropic('claude-3-5-sonnet-20241022'), tools: { computer: anthropic.tools.computer_20241022({ // ... 
async execute({ action, coordinate, text }) { switch (action) { case 'screenshot': { return { type: 'image', data: fs .readFileSync('./data/screenshot-editor.png') .toString('base64'), }; } default: { return `executed ${action}`; } } }, // map to tool result content for LLM consumption: experimental_toToolResultContent(result) { return typeof result === 'string' ? [{ type: 'text', text: result }] : [{ type: 'image', data: result.data, mimeType: 'image/png' }]; }, }), }, // ... }); ``` ## Extracting Tools Once you start having many tools, you might want to extract them into separate files. The `tool` helper function is crucial for this, because it ensures correct type inference. Here is an example of an extracted tool: ```ts filename="tools/weather-tool.ts" highlight="1,4-5" import { tool } from 'ai'; import { z } from 'zod'; // the `tool` helper function ensures correct type inference: export const weatherTool = tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }); ``` ## MCP Tools The MCP tools feature is experimental and may change in the future. The AI SDK supports connecting to [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) servers to access their tools. This enables your AI applications to discover and use tools across various services through a standardized interface. ### Initializing an MCP Client Create an MCP client using either: - `stdio`: Uses standard input and output streams for communication, ideal for local tool servers running on the same machine (like CLI tools or local services) - `SSE` (Server-Sent Events): Uses HTTP-based real-time communication, better suited for remote servers that need to send data over the network ```typescript // Example of using MCP with stdio const mcpClient = await createMCPClient({ transport: { type: 'stdio', command: 'node', args: ['src/stdio/dist/server.js'], }, }); // Example of using MCP with SSE const mcpClient = await createMCPClient({ transport: { type: 'sse', url: 'https://my-server.com/sse', }, }); ``` After initialization, always close the MCP client when you're done to prevent resource leaks. Use try/finally or cleanup functions in your framework: ```typescript let mcpClient; try { mcpClient = await createMCPClient({...}); const tools = await mcpClient.tools(); // use tools... } finally { await mcpClient?.close(); } ``` ### Using MCP Tools The client's `tools` method acts as an adapter between MCP tools and AI SDK tools.
It supports two approaches for working with tool schemas: #### Schema Discovery The simplest approach where all tools offered by the server are listed, and input parameter types are inferred based on the schemas provided by the server: ```typescript const tools = await mcpClient.tools(); ``` **Pros:** - Simpler to implement - Automatically stays in sync with server changes **Cons:** - No TypeScript type safety during development - No IDE autocompletion for tool parameters - Errors only surface at runtime - Loads all tools from the server #### Schema Definition You can also define the tools and their input schemas explicitly in your client code: ```typescript import { z } from 'zod'; const tools = await mcpClient.tools({ schemas: { 'get-data': { parameters: z.object({ query: z.string().describe('The data query'), format: z.enum(['json', 'text']).optional(), }), }, }, }); ``` **Pros:** - Control over which tools are loaded - Full TypeScript type safety - Better IDE support with autocompletion - Catch parameter mismatches during development **Cons:** - Need to manually keep schemas in sync with server - More code to maintain When you define `schemas`, the client will only pull the explicitly defined tools, even if the server offers additional tools. This can be beneficial for: - Keeping your application focused on the tools it needs - Reducing unnecessary tool loading - Making your tool dependencies explicit
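Either way, the tools returned by `mcpClient.tools()` can be passed to functions like `generateText` or `streamText` like any other AI SDK tools. The following is a minimal sketch, not part of the official example: it assumes a client created with `createMCPClient` as shown above and relies on whatever tools the connected server exposes.

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// assumes `createMCPClient` is set up as in the initialization examples above
const mcpClient = await createMCPClient({
  transport: { type: 'sse', url: 'https://my-server.com/sse' },
});

try {
  // schema discovery: load all tools offered by the server
  const tools = await mcpClient.tools();

  const { text } = await generateText({
    model: openai('gpt-4o'),
    tools,
    maxSteps: 2, // allow a text response after the tool result
    prompt: 'Use the available tools to answer the user question.',
  });

  console.log(text);
} finally {
  // release the connection even if the generation fails
  await mcpClient.close();
}
```

Closing the client in `finally` mirrors the cleanup pattern shown above.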
## Examples You can see tools in action using various frameworks in the following examples: --- title: Prompt Engineering description: Learn how to develop prompts with AI SDK Core. --- # Prompt Engineering ## Tips ### Prompts for Tools When you create prompts that include tools, getting good results can be tricky as the number and complexity of your tools increases. Here are a few tips to help you get the best results: 1. Use a model that is strong at tool calling, such as `gpt-4` or `gpt-4-turbo`. Weaker models will often struggle to call tools effectively and flawlessly. 1. Keep the number of tools low, e.g. to 5 or fewer. 1. Keep the complexity of the tool parameters low. Complex Zod schemas with many nested and optional elements, unions, etc. can be challenging for the model to work with. 1. Use semantically meaningful names for your tools, parameters, parameter properties, etc. The more information you pass to the model, the better it can understand what you want. 1. Add `.describe("...")` to your Zod schema properties to give the model hints about what a particular property is for. 1. When the output of a tool might be unclear to the model and there are dependencies between tools, use the `description` field of a tool to provide information about the output of the tool execution. 1. You can include example input/outputs of tool calls in your prompt to help the model understand how to use the tools. Keep in mind that the tools work with JSON objects, so the examples should use JSON. In general, the goal should be to give the model all information it needs in a clear way. ### Tool & Structured Data Schemas The mapping from Zod schemas to LLM inputs (typically JSON schema) is not always straightforward, since the mapping is not one-to-one. #### Zod Dates Zod expects JavaScript Date objects, but models return dates as strings. You can specify and validate the date format using `z.string().datetime()` or `z.string().date()`, and then use a Zod transformer to convert the string to a Date object. ```ts highlight="7-10" const result = await generateObject({ model: openai('gpt-4-turbo'), schema: z.object({ events: z.array( z.object({ event: z.string(), date: z .string() .date() .transform(value => new Date(value)), }), ), }), prompt: 'List 5 important events from the year 2000.', }); ``` ## Debugging ### Inspecting Warnings Not all providers support all AI SDK features. Providers either throw exceptions or return warnings when they do not support a feature. To check if your prompt, tools, and settings are handled correctly by the provider, you can check the call warnings: ```ts const result = await generateText({ model: openai('gpt-4o'), prompt: 'Hello, world!', }); console.log(result.warnings); ``` ### HTTP Request Bodies You can inspect the raw HTTP request bodies for models that expose them, e.g. [OpenAI](/providers/ai-sdk-providers/openai). This allows you to inspect the exact payload that is sent to the model provider in the provider-specific way. Request bodies are available via the `request.body` property of the response: ```ts highlight="6" const result = await generateText({ model: openai('gpt-4o'), prompt: 'Hello, world!', }); console.log(result.request.body); ``` --- title: Settings description: Learn how to configure the AI SDK. --- # Settings Large language models (LLMs) typically provide settings to augment their output. All AI SDK functions support the following common settings in addition to the model, the [prompt](./prompts), and additional provider-specific settings: ```ts highlight="3-5" const result = await generateText({ model: yourModel, maxTokens: 512, temperature: 0.3, maxRetries: 5, prompt: 'Invent a new holiday and describe its traditions.', }); ``` Some providers do not support all common settings. If you use a setting with a provider that does not support it, a warning will be generated. You can check the `warnings` property in the result object to see if any warnings were generated. ### `maxTokens` Maximum number of tokens to generate. ### `temperature` Temperature setting. The value is passed through to the provider. The range depends on the provider and model. For most providers, `0` means almost deterministic results, and higher values mean more randomness. It is recommended to set either `temperature` or `topP`, but not both. ### `topP` Nucleus sampling. The value is passed through to the provider. The range depends on the provider and model. For most providers, nucleus sampling is a number between 0 and 1. E.g. 0.1 would mean that only tokens with the top 10% probability mass are considered. It is recommended to set either `temperature` or `topP`, but not both. ### `topK` Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use `temperature`. ### `presencePenalty` The presence penalty affects the likelihood of the model to repeat information that is already in the prompt. The value is passed through to the provider. The range depends on the provider and model. For most providers, `0` means no penalty. ### `frequencyPenalty` The frequency penalty affects the likelihood of the model to repeatedly use the same words or phrases. The value is passed through to the provider. The range depends on the provider and model. For most providers, `0` means no penalty. ### `stopSequences` The stop sequences to use for stopping the text generation. If set, the model will stop generating text when one of the stop sequences is generated. Providers may have limits on the number of stop sequences.
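For example, here is a minimal sketch that stops generation at a custom marker (the `END` marker and the prompt are only an illustration):

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const result = await generateText({
  model: openai('gpt-4o'),
  // generation stops as soon as the model emits the sequence 'END'
  stopSequences: ['END'],
  prompt: 'List three holiday traditions, then write END.',
});

console.log(result.text);
```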
### `seed` The seed (integer) to use for random sampling. If set and supported by the model, calls will generate deterministic results. ### `maxRetries` Maximum number of retries. Set to 0 to disable retries. Default: `2`. ### `abortSignal` An optional abort signal that can be used to cancel the call. The abort signal can e.g. be forwarded from a user interface to cancel the call, or to define a timeout. #### Example: Timeout ```ts const result = await generateText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', abortSignal: AbortSignal.timeout(5000), // 5 seconds }); ``` ### `headers` Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers. You can use the request headers to provide additional information to the provider, depending on what the provider supports. For example, some observability providers support headers such as `Prompt-Id`. ```ts import { generateText } from 'ai'; import { openai } from '@ai-sdk/openai'; const result = await generateText({ model: openai('gpt-4o'), prompt: 'Invent a new holiday and describe its traditions.', headers: { 'Prompt-Id': 'my-prompt-id', }, }); ``` The `headers` setting is for request-specific headers. You can also set `headers` in the provider configuration. These headers will be sent with every request made by the provider. --- title: Embeddings description: Learn how to embed values with the AI SDK. --- # Embeddings Embeddings are a way to represent words, phrases, or images as vectors in a high-dimensional space. In this space, similar words are close to each other, and the distance between words can be used to measure their similarity. ## Embedding a Single Value The AI SDK provides the [`embed`](/docs/reference/ai-sdk-core/embed) function to embed single values, which is useful for tasks such as finding similar words or phrases or clustering text. You can use it with embeddings models, e.g. `openai.embedding('text-embedding-3-large')` or `mistral.embedding('mistral-embed')`. ```tsx import { embed } from 'ai'; import { openai } from '@ai-sdk/openai'; // 'embedding' is a single embedding object (number[]) const { embedding } = await embed({ model: openai.embedding('text-embedding-3-small'), value: 'sunny day at the beach', }); ``` ## Embedding Many Values When loading data, e.g. when preparing a data store for retrieval-augmented generation (RAG), it is often useful to embed many values at once (batch embedding). The AI SDK provides the [`embedMany`](/docs/reference/ai-sdk-core/embed-many) function for this purpose. Similar to `embed`, you can use it with embeddings models, e.g. `openai.embedding('text-embedding-3-large')` or `mistral.embedding('mistral-embed')`. ```tsx import { openai } from '@ai-sdk/openai'; import { embedMany } from 'ai'; // 'embeddings' is an array of embedding objects (number[][]). // It is sorted in the same order as the input values. const { embeddings } = await embedMany({ model: openai.embedding('text-embedding-3-small'), values: [ 'sunny day at the beach', 'rainy afternoon in the city', 'snowy night in the mountains', ], }); ``` ## Embedding Similarity After embedding values, you can calculate the similarity between them using the [`cosineSimilarity`](/docs/reference/ai-sdk-core/cosine-similarity) function. This is useful to e.g. find similar words or phrases in a dataset. You can also rank and filter related items based on their similarity. ```ts highlight={"2,10"} import { openai } from '@ai-sdk/openai'; import { cosineSimilarity, embedMany } from 'ai'; const { embeddings } = await embedMany({ model: openai.embedding('text-embedding-3-small'), values: ['sunny day at the beach', 'rainy afternoon in the city'], }); console.log( `cosine similarity: ${cosineSimilarity(embeddings[0], embeddings[1])}`, ); ```
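Building on that, here is a small sketch (not from the official examples) that embeds a query and ranks documents by their cosine similarity to it:

```ts
import { openai } from '@ai-sdk/openai';
import { cosineSimilarity, embed, embedMany } from 'ai';

const documents = [
  'sunny day at the beach',
  'rainy afternoon in the city',
  'snowy night in the mountains',
];

// embed the documents and the query with the same model
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: documents,
});

const { embedding: queryEmbedding } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'what is the weather like on the coast?',
});

// rank documents by cosine similarity to the query, highest first
const ranked = documents
  .map((document, index) => ({
    document,
    similarity: cosineSimilarity(queryEmbedding, embeddings[index]),
  }))
  .sort((a, b) => b.similarity - a.similarity);

console.log(ranked);
```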
## Token Usage Many providers charge based on the number of tokens used to generate embeddings. Both `embed` and `embedMany` provide token usage information in the `usage` property of the result object: ```ts highlight={"4,9"} import { openai } from '@ai-sdk/openai'; import { embed } from 'ai'; const { embedding, usage } = await embed({ model: openai.embedding('text-embedding-3-small'), value: 'sunny day at the beach', }); console.log(usage); // { tokens: 10 } ``` ## Settings ### Retries Both `embed` and `embedMany` accept an optional `maxRetries` parameter of type `number` that you can use to set the maximum number of retries for the embedding process. It defaults to `2` retries (3 attempts in total). You can set it to `0` to disable retries. ```ts highlight={"7"} import { openai } from '@ai-sdk/openai'; import { embed } from 'ai'; const { embedding } = await embed({ model: openai.embedding('text-embedding-3-small'), value: 'sunny day at the beach', maxRetries: 0, // Disable retries }); ``` ### Abort Signals and Timeouts Both `embed` and `embedMany` accept an optional `abortSignal` parameter of type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal) that you can use to abort the embedding process or set a timeout. ```ts highlight={"7"} import { openai } from '@ai-sdk/openai'; import { embed } from 'ai'; const { embedding } = await embed({ model: openai.embedding('text-embedding-3-small'), value: 'sunny day at the beach', abortSignal: AbortSignal.timeout(1000), // Abort after 1 second }); ``` ### Custom Headers Both `embed` and `embedMany` accept an optional `headers` parameter of type `Record<string, string>` that you can use to add custom headers to the embedding request.
```ts highlight={"7"} import { openai } from '@ai-sdk/openai'; import { embed } from 'ai'; const { embedding } = await embed({ model: openai.embedding('text-embedding-3-small'), value: 'sunny day at the beach', headers: { 'X-Custom-Header': 'custom-value' }, }); ``` ## Embedding Providers & Models Several providers offer embedding models: | Provider | Model | Embedding Dimensions | | ----------------------------------------------------------------------------------------- | ------------------------------- | -------------------- | | [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-3-large` | 3072 | | [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-3-small` | 1536 | | [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-ada-002` | 1536 | | [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai#embedding-models) | `text-embedding-004` | 768 | | [Mistral](/providers/ai-sdk-providers/mistral#embedding-models) | `mistral-embed` | 1024 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-v3.0` | 1024 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-v3.0` | 1024 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-light-v3.0` | 384 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-light-v3.0` | 384 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-v2.0` | 4096 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-light-v2.0` | 1024 | | [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-v2.0` | 768 | | [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#embedding-models) | `amazon.titan-embed-text-v1` | 1024 | | [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#embedding-models) | `amazon.titan-embed-text-v2:0` | 1024 | --- title: Image Generation description: Learn how to generate images with the AI SDK. --- # Image Generation Image generation is an experimental feature. The AI SDK provides the [`generateImage`](/docs/reference/ai-sdk-core/generate-image) function to generate images based on a given prompt using an image model. ```tsx import { experimental_generateImage as generateImage } from 'ai'; import { openai } from '@ai-sdk/openai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', }); ``` You can access the image data using the `base64` or `uint8Array` properties: ```tsx const base64 = image.base64; // base64 image data const uint8Array = image.uint8Array; // Uint8Array image data ``` ### Size and Aspect Ratio Depending on the model, you can either specify the size or the aspect ratio. ##### Size The size is specified as a string in the format `{width}x{height}`. Models only support a few sizes, and the supported sizes are different for each model and provider. ```tsx highlight={"7"} import { experimental_generateImage as generateImage } from 'ai'; import { openai } from '@ai-sdk/openai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', size: '1024x1024', }); ``` ##### Aspect Ratio The aspect ratio is specified as a string in the format `{width}:{height}`. Models only support a few aspect ratios, and the supported aspect ratios are different for each model and provider. 
```tsx highlight={"7"} import { experimental_generateImage as generateImage } from 'ai'; import { vertex } from '@ai-sdk/google-vertex'; const { image } = await generateImage({ model: vertex.image('imagen-3.0-generate-001'), prompt: 'Santa Claus driving a Cadillac', aspectRatio: '16:9', }); ``` ### Generating Multiple Images `generateImage` also supports generating multiple images at once: ```tsx highlight={"7"} import { experimental_generateImage as generateImage } from 'ai'; import { openai } from '@ai-sdk/openai'; const { images } = await generateImage({ model: openai.image('dall-e-2'), prompt: 'Santa Claus driving a Cadillac', n: 4, // number of images to generate }); ``` `generateImage` will automatically call the model as often as needed (in parallel) to generate the requested number of images. Each image model has an internal limit on how many images it can generate in a single API call. The AI SDK manages this automatically by batching requests appropriately when you request multiple images using the `n` parameter. By default, the SDK uses provider-documented limits (for example, DALL-E 3 can only generate 1 image per call, while DALL-E 2 supports up to 10). If needed, you can override this behavior using the `maxImagesPerCall` setting when configuring your model. This is particularly useful when working with new or custom models where the default batch size might not be optimal: ```tsx const model = openai.image('dall-e-2', { maxImagesPerCall: 5, // Override the default batch size }); const { images } = await generateImage({ model, prompt: 'Santa Claus driving a Cadillac', n: 10, // Will make 2 calls of 5 images each }); ``` ### Providing a Seed You can provide a seed to the `generateImage` function to control the output of the image generation process. If supported by the model, the same seed will always produce the same image. ```tsx highlight={"7"} import { experimental_generateImage as generateImage } from 'ai'; import { openai } from '@ai-sdk/openai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', seed: 1234567890, }); ``` ### Provider-specific Settings Image models often have provider- or even model-specific settings. You can pass such settings to the `generateImage` function using the `providerOptions` parameter. The options for the provider (`openai` in the example below) become request body properties. ```tsx highlight={"9"} import { experimental_generateImage as generateImage } from 'ai'; import { openai } from '@ai-sdk/openai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', size: '1024x1024', providerOptions: { openai: { style: 'vivid', quality: 'hd' }, }, }); ``` ### Abort Signals and Timeouts `generateImage` accepts an optional `abortSignal` parameter of type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal) that you can use to abort the image generation process or set a timeout. ```ts highlight={"7"} import { openai } from '@ai-sdk/openai'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', abortSignal: AbortSignal.timeout(1000), // Abort after 1 second }); ``` ### Custom Headers `generateImage` accepts an optional `headers` parameter of type `Record` that you can use to add custom headers to the image generation request. 
```ts highlight={"7"} import { openai } from '@ai-sdk/openai'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', headers: { 'X-Custom-Header': 'custom-value' }, }); ``` ### Warnings If the model returns warnings, e.g. for unsupported parameters, they will be available in the `warnings` property of the response. ```tsx const { image, warnings } = await generateImage({ model: openai.image('dall-e-3'), prompt: 'Santa Claus driving a Cadillac', }); ``` ### Error Handling When `generateImage` cannot generate a valid image, it throws an [`AI_NoImageGeneratedError`](/docs/reference/ai-sdk-errors/ai-no-image-generated-error). This error occurs when the AI provider fails to generate an image. It can arise due to the following reasons: - The model failed to generate a response - The model generated a response that could not be parsed The error preserves the following information to help you log the issue: - `responses`: Metadata about the image model responses, including timestamp, model, and headers. - `cause`: The cause of the error. You can use this for more detailed error handling. ```ts import { experimental_generateImage as generateImage, NoImageGeneratedError } from 'ai'; try { await generateImage({ model, prompt }); } catch (error) { if (NoImageGeneratedError.isInstance(error)) { console.log('NoImageGeneratedError'); console.log('Cause:', error.cause); console.log('Responses:', error.responses); } } ``` ## Image Models | Provider | Model | Supported sizes (`width x height`) or aspect ratios (`width : height`) | | ---------- | ---------- | ---------- | | [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#image-models) | `amazon.nova-canvas-v1:0` | 320-4096 (multiples of 16), 1:4 to 4:1, max 4.2M pixels | | [Replicate](/providers/ai-sdk-providers/replicate) | `black-forest-labs/flux-schnell` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 | | [Replicate](/providers/ai-sdk-providers/replicate) | `recraft-ai/recraft-v3` | 1024x1024, 1365x1024, 1024x1365, 1536x1024, 1024x1536, 1820x1024, 1024x1820, 1024x2048, 2048x1024, 1434x1024, 1024x1434, 1024x1280, 1280x1024, 1024x1707, 1707x1024 | | [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 | | [Google Vertex](/providers/ai-sdk-providers/google-vertex#image-models) | `imagen-3.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 | | [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 | | [OpenAI](/providers/ai-sdk-providers/openai#image-models) | `dall-e-2` | 256x256, 512x512, 1024x1024 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-dev-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/flux-1-schnell-fp8` | 1:1, 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/playground-v2-5-1024px-aesthetic` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 | |
[Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/japanese-stable-diffusion-xl` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/playground-v2-1024px-aesthetic` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/SSD-1B` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 | | [Fireworks](/providers/ai-sdk-providers/fireworks#image-models) | `accounts/fireworks/models/stable-diffusion-xl-1024-v1-0` | 640x1536, 768x1344, 832x1216, 896x1152, 1024x1024, 1152x896, 1216x832, 1344x768, 1536x640 | | [Luma](/providers/ai-sdk-providers/luma#image-models) | `photon-1` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Luma](/providers/ai-sdk-providers/luma#image-models) | `photon-flash-1` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/flux/dev` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/fast-sdxl` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/flux-pro/v1.1-ultra` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/ideogram/v2` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/recraft-v3` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/stable-diffusion-3.5-large` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Fal](/providers/ai-sdk-providers/fal#image-models) | `fal-ai/hyper-sdxl` | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `stabilityai/stable-diffusion-xl-base-1.0` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-dev` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-dev-lora` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-schnell` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-canny` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-depth` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-redux` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1.1-pro` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-pro` | 512x512, 768x768, 1024x1024 | | [Together.ai](/providers/ai-sdk-providers/togetherai#image-models) | `black-forest-labs/FLUX.1-schnell-Free` | 512x512, 768x768, 1024x1024 | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sd3.5` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1.1-pro` | 
256-1440 (multiples of 32) | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1-schnell` | 256-1440 (multiples of 32) | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-1-dev` | 256-1440 (multiples of 32) | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `black-forest-labs/FLUX-pro` | 256-1440 (multiples of 32) | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sd3.5-medium` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 | | [DeepInfra](/providers/ai-sdk-providers/deepinfra#image-models) | `stabilityai/sdxl-turbo` | 1:1, 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21 | Above are a small subset of the image models supported by the AI SDK providers. For more, see the respective provider documentation. --- title: Provider Management description: Learn how to work with multiple providers --- # Provider Management When you work with multiple providers and models, it is often desirable to manage them in a central place and access the models through simple string ids. The AI SDK offers [custom providers](/docs/reference/ai-sdk-core/custom-provider) and a [provider registry](/docs/reference/ai-sdk-core/provider-registry) for this purpose. With custom providers, you can **pre-configure model settings**, **provide model name aliases**, and **limit the available models** . The provider registry lets you mix **multiple providers** and access them through simple string ids. ## Custom Providers You can create a [custom provider](/docs/reference/ai-sdk-core/custom-provider) using `customProvider`. ### Example: custom model settings You might want to override the default model settings for a provider or provide model name aliases with pre-configured settings. ```ts import { openai as originalOpenAI } from '@ai-sdk/openai'; import { customProvider } from 'ai'; // custom provider with different model settings: export const openai = customProvider({ languageModels: { // replacement model with custom settings: 'gpt-4o': originalOpenAI('gpt-4o', { structuredOutputs: true }), // alias model with custom settings: 'gpt-4o-mini-structured': originalOpenAI('gpt-4o-mini', { structuredOutputs: true, }), }, fallbackProvider: originalOpenAI, }); ``` ### Example: model name alias You can also provide model name aliases, so you can update the model version in one place in the future: ```ts import { anthropic as originalAnthropic } from '@ai-sdk/anthropic'; import { customProvider } from 'ai'; // custom provider with alias names: export const anthropic = customProvider({ languageModels: { opus: originalAnthropic('claude-3-opus-20240229'), sonnet: originalAnthropic('claude-3-5-sonnet-20240620'), haiku: originalAnthropic('claude-3-haiku-20240307'), }, fallbackProvider: originalAnthropic, }); ``` ### Example: limit available models You can limit the available models in the system, even if you have multiple providers. 
```ts import { anthropic } from '@ai-sdk/anthropic'; import { openai } from '@ai-sdk/openai'; import { customProvider } from 'ai'; export const myProvider = customProvider({ languageModels: { 'text-medium': anthropic('claude-3-5-sonnet-20240620'), 'text-small': openai('gpt-4o-mini'), 'structure-medium': openai('gpt-4o', { structuredOutputs: true }), 'structure-fast': openai('gpt-4o-mini', { structuredOutputs: true }), }, embeddingModels: { embedding: openai.textEmbeddingModel('text-embedding-3-small'), }, // no fallback provider }); ``` ## Provider Registry The provider registry is an experimental feature. You can create a [provider registry](/docs/reference/ai-sdk-core/provider-registry) with multiple providers and models using `experimental_createProviderRegistry`. ### Example: Setup ```ts filename={"registry.ts"} import { anthropic } from '@ai-sdk/anthropic'; import { createOpenAI } from '@ai-sdk/openai'; import { experimental_createProviderRegistry as createProviderRegistry } from 'ai'; export const registry = createProviderRegistry({ // register provider with prefix and default setup: anthropic, // register provider with prefix and custom setup: openai: createOpenAI({ apiKey: process.env.OPENAI_API_KEY, }), }); ``` ### Example: Use language models You can access language models by using the `languageModel` method on the registry. The provider id will become the prefix of the model id: `providerId:modelId`. ```ts highlight={"5"} import { generateText } from 'ai'; import { registry } from './registry'; const { text } = await generateText({ model: registry.languageModel('openai:gpt-4-turbo'), prompt: 'Invent a new holiday and describe its traditions.', }); ``` ### Example: Use text embedding models You can access text embedding models by using the `textEmbeddingModel` method on the registry. The provider id will become the prefix of the model id: `providerId:modelId`. ```ts highlight={"5"} import { embed } from 'ai'; import { registry } from './registry'; const { embedding } = await embed({ model: registry.textEmbeddingModel('openai:text-embedding-3-small'), value: 'sunny day at the beach', }); ``` ### Example: Use image models You can access image models by using the `imageModel` method on the registry. The provider id will become the prefix of the model id: `providerId:modelId`. ```ts highlight={"5"} import { experimental_generateImage as generateImage } from 'ai'; import { registry } from './registry'; const { image } = await generateImage({ model: registry.imageModel('openai:dall-e-3'), prompt: 'A beautiful sunset over a calm ocean', }); ``` --- title: Language Model Middleware description: Learn how to use middleware to enhance the behavior of language models --- # Language Model Middleware Language model middleware is a way to enhance the behavior of language models by intercepting and modifying the calls to the language model. It can be used to add features like guardrails, RAG, caching, and logging in a language model agnostic way. Such middleware can be developed and distributed independently from the language models that they are applied to. ## Using Language Model Middleware You can use language model middleware with the `wrapLanguageModel` function. It takes a language model and a language model middleware and returns a new language model that incorporates the middleware.
```ts import { wrapLanguageModel } from 'ai'; const wrappedLanguageModel = wrapLanguageModel({ model: yourModel, middleware: yourLanguageModelMiddleware, }); ``` The wrapped language model can be used just like any other language model, e.g. in `streamText`: ```ts highlight="2" const result = streamText({ model: wrappedLanguageModel, prompt: 'What cities are in the United States?', }); ``` ## Multiple Middlewares You can provide multiple middlewares to the `wrapLanguageModel` function. The middlewares will be applied in the order they are provided. ```ts const wrappedLanguageModel = wrapLanguageModel({ model: yourModel, middleware: [firstMiddleware, secondMiddleware], }); // applied as: firstMiddleware(secondMiddleware(yourModel)) ``` ## Built-in Middleware The AI SDK comes with several built-in middlewares that you can use to configure language models: - `extractReasoningMiddleware`: Extracts reasoning information from the generated text and exposes it as a `reasoning` property on the result. - `simulateStreamingMiddleware`: Simulates streaming behavior with responses from non-streaming language models. - `defaultSettingsMiddleware`: Applies default settings to a language model. ### Extract Reasoning Some providers and models expose reasoning information in the generated text using special tags, e.g. `<think>` and `</think>`. The `extractReasoningMiddleware` function can be used to extract this reasoning information and expose it as a `reasoning` property on the result. ```ts import { wrapLanguageModel, extractReasoningMiddleware } from 'ai'; const model = wrapLanguageModel({ model: yourModel, middleware: extractReasoningMiddleware({ tagName: 'think' }), }); ``` You can then use that enhanced model in functions like `generateText` and `streamText`. The `extractReasoningMiddleware` function also includes a `startWithReasoning` option. When set to `true`, the reasoning tag will be prepended to the generated text. This is useful for models that do not include the reasoning tag at the beginning of the response. For more details, see the [DeepSeek R1 guide](https://sdk.vercel.ai/docs/guides/r1#deepseek-r1-middleware). ### Simulate Streaming The `simulateStreamingMiddleware` function can be used to simulate streaming behavior with responses from non-streaming language models. This is useful when you want to maintain a consistent streaming interface even when using models that only provide complete responses. ```ts import { wrapLanguageModel, simulateStreamingMiddleware } from 'ai'; const model = wrapLanguageModel({ model: yourModel, middleware: simulateStreamingMiddleware(), }); ``` ### Default Settings The `defaultSettingsMiddleware` function can be used to apply default settings to a language model. ```ts import { wrapLanguageModel, defaultSettingsMiddleware } from 'ai'; const model = wrapLanguageModel({ model: yourModel, middleware: defaultSettingsMiddleware({ settings: { temperature: 0.5, maxTokens: 800, // note: use providerMetadata instead of providerOptions here: providerMetadata: { openai: { store: false } }, }, }), }); ``` ## Implementing Language Model Middleware Implementing language model middleware is advanced functionality and requires a solid understanding of the [language model specification](https://github.com/vercel/ai/blob/main/packages/provider/src/language-model/v1/language-model-v1.ts). You can implement any of the following three functions to modify the behavior of the language model: 1.
`transformParams`: Transforms the parameters before they are passed to the language model, for both `doGenerate` and `doStream`. 2. `wrapGenerate`: Wraps the `doGenerate` method of the [language model](https://github.com/vercel/ai/blob/main/packages/provider/src/language-model/v1/language-model-v1.ts). You can modify the parameters, call the language model, and modify the result. 3. `wrapStream`: Wraps the `doStream` method of the [language model](https://github.com/vercel/ai/blob/main/packages/provider/src/language-model/v1/language-model-v1.ts). You can modify the parameters, call the language model, and modify the result. Here are some examples of how to implement language model middleware: ## Examples These examples are not meant to be used in production. They are just to show how you can use middleware to enhance the behavior of language models. ### Logging This example shows how to log the parameters and generated text of a language model call. ```ts import type { LanguageModelV1Middleware, LanguageModelV1StreamPart } from 'ai'; export const yourLogMiddleware: LanguageModelV1Middleware = { wrapGenerate: async ({ doGenerate, params }) => { console.log('doGenerate called'); console.log(`params: ${JSON.stringify(params, null, 2)}`); const result = await doGenerate(); console.log('doGenerate finished'); console.log(`generated text: ${result.text}`); return result; }, wrapStream: async ({ doStream, params }) => { console.log('doStream called'); console.log(`params: ${JSON.stringify(params, null, 2)}`); const { stream, ...rest } = await doStream(); let generatedText = ''; const transformStream = new TransformStream< LanguageModelV1StreamPart, LanguageModelV1StreamPart >({ transform(chunk, controller) { if (chunk.type === 'text-delta') { generatedText += chunk.textDelta; } controller.enqueue(chunk); }, flush() { console.log('doStream finished'); console.log(`generated text: ${generatedText}`); }, }); return { stream: stream.pipeThrough(transformStream), ...rest, }; }, }; ``` ### Caching This example shows how to build a simple cache for the generated text of a language model call. ```ts import type { LanguageModelV1Middleware } from 'ai'; const cache = new Map(); export const yourCacheMiddleware: LanguageModelV1Middleware = { wrapGenerate: async ({ doGenerate, params }) => { const cacheKey = JSON.stringify(params); if (cache.has(cacheKey)) { return cache.get(cacheKey); } const result = await doGenerate(); cache.set(cacheKey, result); return result; }, // here you would implement the caching logic for streaming }; ``` ### Retrieval Augmented Generation (RAG) This example shows how to use RAG as middleware. Helper functions like `getLastUserMessageText` and `findSources` are not part of the AI SDK. They are just used in this example to illustrate the concept of RAG. ```ts import type { LanguageModelV1Middleware } from 'ai'; export const yourRagMiddleware: LanguageModelV1Middleware = { transformParams: async ({ params }) => { const lastUserMessageText = getLastUserMessageText({ prompt: params.prompt, }); if (lastUserMessageText == null) { return params; // do not use RAG (send unmodified parameters) } const instruction = 'Use the following information to answer the question:\n' + findSources({ text: lastUserMessageText }) .map(chunk => JSON.stringify(chunk)) .join('\n'); return addToLastUserMessage({ params, text: instruction }); }, }; ``` ### Guardrails Guard rails are a way to ensure that the generated text of a language model call is safe and appropriate. 
This example shows how to use guardrails as middleware. ```ts import type { LanguageModelV1Middleware } from 'ai'; export const yourGuardrailMiddleware: LanguageModelV1Middleware = { wrapGenerate: async ({ doGenerate }) => { const { text, ...rest } = await doGenerate(); // filtering approach, e.g. for PII or other sensitive information: const cleanedText = text?.replace(/badword/g, ''); return { text: cleanedText, ...rest }; }, // here you would implement the guardrail logic for streaming // Note: streaming guardrails are difficult to implement, because // you do not know the full content of the stream until it's finished. }; ``` --- title: Error Handling description: Learn how to handle errors in the AI SDK Core --- # Error Handling ## Handling regular errors Regular errors are thrown and can be handled using the `try/catch` block. ```ts highlight="3,8-10" import { generateText } from 'ai'; try { const { text } = await generateText({ model: yourModel, prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); } catch (error) { // handle error } ``` See [Error Types](/docs/reference/ai-sdk-errors) for more information on the different types of errors that may be thrown. ## Handling streaming errors (simple streams) When errors occur during streams that do not support error chunks, the error is thrown as a regular error. You can handle these errors using the `try/catch` block. ```ts highlight="3,12-14" import { streamText } from 'ai'; try { const { textStream } = streamText({ model: yourModel, prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); for await (const textPart of textStream) { process.stdout.write(textPart); } } catch (error) { // handle error } ``` ## Handling streaming errors (streaming with `error` support) Full streams support error parts. You can handle those parts similar to other parts. It is recommended to also add a try-catch block for errors that happen outside of the streaming. ```ts highlight="13-17" import { streamText } from 'ai'; try { const { fullStream } = streamText({ model: yourModel, prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); for await (const part of fullStream) { switch (part.type) { // ... handle other part types case 'error': { const error = part.error; // handle error break; } } } } catch (error) { // handle error } ``` --- title: Testing description: Learn how to use AI SDK Core mock providers for testing. --- # Testing Testing language models can be challenging, because they are non-deterministic and calling them is slow and expensive. To enable you to unit test your code that uses the AI SDK, the AI SDK Core includes mock providers and test helpers. You can import the following helpers from `ai/test`: - `MockEmbeddingModelV1`: A mock embedding model using the [embedding model v1 specification](https://github.com/vercel/ai/blob/main/packages/provider/src/embedding-model/v1/embedding-model-v1.ts). - `MockLanguageModelV1`: A mock language model using the [language model v1 specification](https://github.com/vercel/ai/blob/main/packages/provider/src/language-model/v1/language-model-v1.ts). - `mockId`: Provides an incrementing integer ID. - `mockValues`: Iterates over an array of values with each call. Returns the last value when the array is exhausted. - [`simulateReadableStream`](/docs/reference/ai-sdk-core/simulate-readable-stream): Simulates a readable stream with delays.
With mock providers and test helpers, you can control the output of the AI SDK and test your code in a repeatable and deterministic way without actually calling a language model provider. ## Examples You can use the test helpers with the AI Core functions in your unit tests: ### generateText ```ts import { generateText } from 'ai'; import { MockLanguageModelV1 } from 'ai/test'; const result = await generateText({ model: new MockLanguageModelV1({ doGenerate: async () => ({ rawCall: { rawPrompt: null, rawSettings: {} }, finishReason: 'stop', usage: { promptTokens: 10, completionTokens: 20 }, text: `Hello, world!`, }), }), prompt: 'Hello, test!', }); ``` ### streamText ```ts import { streamText, simulateReadableStream } from 'ai'; import { MockLanguageModelV1 } from 'ai/test'; const result = streamText({ model: new MockLanguageModelV1({ doStream: async () => ({ stream: simulateReadableStream({ chunks: [ { type: 'text-delta', textDelta: 'Hello' }, { type: 'text-delta', textDelta: ', ' }, { type: 'text-delta', textDelta: `world!` }, { type: 'finish', finishReason: 'stop', logprobs: undefined, usage: { completionTokens: 10, promptTokens: 3 }, }, ], }), rawCall: { rawPrompt: null, rawSettings: {} }, }), }), prompt: 'Hello, test!', }); ``` ### generateObject ```ts import { generateObject } from 'ai'; import { MockLanguageModelV1 } from 'ai/test'; import { z } from 'zod'; const result = await generateObject({ model: new MockLanguageModelV1({ defaultObjectGenerationMode: 'json', doGenerate: async () => ({ rawCall: { rawPrompt: null, rawSettings: {} }, finishReason: 'stop', usage: { promptTokens: 10, completionTokens: 20 }, text: `{"content":"Hello, world!"}`, }), }), schema: z.object({ content: z.string() }), prompt: 'Hello, test!', }); ``` ### streamObject ```ts import { streamObject, simulateReadableStream } from 'ai'; import { MockLanguageModelV1 } from 'ai/test'; import { z } from 'zod'; const result = streamObject({ model: new MockLanguageModelV1({ defaultObjectGenerationMode: 'json', doStream: async () => ({ stream: simulateReadableStream({ chunks: [ { type: 'text-delta', textDelta: '{ ' }, { type: 'text-delta', textDelta: '"content": ' }, { type: 'text-delta', textDelta: `"Hello, ` }, { type: 'text-delta', textDelta: `world` }, { type: 'text-delta', textDelta: `!"` }, { type: 'text-delta', textDelta: ' }' }, { type: 'finish', finishReason: 'stop', logprobs: undefined, usage: { completionTokens: 10, promptTokens: 3 }, }, ], }), rawCall: { rawPrompt: null, rawSettings: {} }, }), }), schema: z.object({ content: z.string() }), prompt: 'Hello, test!', }); ``` ### Simulate Data Stream Protocol Responses You can also simulate [Data Stream Protocol](/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) responses for testing, debugging, or demonstration purposes. 
Here is a Next example: ```ts filename="route.ts" import { simulateReadableStream } from 'ai'; export async function POST(req: Request) { return new Response( simulateReadableStream({ initialDelayInMs: 1000, // Delay before the first chunk chunkDelayInMs: 300, // Delay between chunks chunks: [ `0:"This"\n`, `0:" is an"\n`, `0:"example."\n`, `e:{"finishReason":"stop","usage":{"promptTokens":20,"completionTokens":50},"isContinued":false}\n`, `d:{"finishReason":"stop","usage":{"promptTokens":20,"completionTokens":50}}\n`, ], }).pipeThrough(new TextEncoderStream()), { status: 200, headers: { 'X-Vercel-AI-Data-Stream': 'v1', 'Content-Type': 'text/plain; charset=utf-8', }, }, ); } ``` --- title: Telemetry description: Using OpenTelemetry with AI SDK Core --- # Telemetry AI SDK Telemetry is experimental and may change in the future. The AI SDK uses [OpenTelemetry](https://opentelemetry.io/) to collect telemetry data. OpenTelemetry is an open-source observability framework designed to provide standardized instrumentation for collecting telemetry data. Check out the [AI SDK Observability Integrations](/providers/observability) to see providers that offer monitoring and tracing for AI SDK applications. ## Enabling telemetry For Next.js applications, please follow the [Next.js OpenTelemetry guide](https://nextjs.org/docs/app/building-your-application/optimizing/open-telemetry) to enable telemetry first. You can then use the `experimental_telemetry` option to enable telemetry on specific function calls while the feature is experimental: ```ts highlight="4" const result = await generateText({ model: openai('gpt-4-turbo'), prompt: 'Write a short story about a cat.', experimental_telemetry: { isEnabled: true }, }); ``` When telemetry is enabled, you can also control if you want to record the input values and the output values for the function. By default, both are enabled. You can disable them by setting the `recordInputs` and `recordOutputs` options to `false`. Disabling the recording of inputs and outputs can be useful for privacy, data transfer, and performance reasons. You might for example want to disable recording inputs if they contain sensitive information. ## Telemetry Metadata You can provide a `functionId` to identify the function that the telemetry data is for, and `metadata` to include additional information in the telemetry data. ```ts highlight="6-10" const result = await generateText({ model: openai('gpt-4-turbo'), prompt: 'Write a short story about a cat.', experimental_telemetry: { isEnabled: true, functionId: 'my-awesome-function', metadata: { something: 'custom', someOtherThing: 'other-value', }, }, }); ``` ## Custom Tracer You may provide a `tracer` which must return an OpenTelemetry `Tracer`. This is useful in situations where you want your traces to use a `TracerProvider` other than the one provided by the `@opentelemetry/api` singleton. ```ts highlight="7" const tracerProvider = new NodeTracerProvider(); const result = await generateText({ model: openai('gpt-4-turbo'), prompt: 'Write a short story about a cat.', experimental_telemetry: { isEnabled: true, tracer: tracerProvider.getTracer('ai'), }, }); ``` ## Collected Data ### generateText function `generateText` records 3 types of spans: - `ai.generateText` (span): the full length of the generateText call. It contains 1 or more `ai.generateText.doGenerate` spans. 
It contains the [basic LLM span information](#basic-llm-span-information) and the following attributes: - `operation.name`: `ai.generateText` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.generateText"` - `ai.prompt`: the prompt that was used when calling `generateText` - `ai.response.text`: the text that was generated - `ai.response.toolCalls`: the tool calls that were made as part of the generation (stringified JSON) - `ai.response.finishReason`: the reason why the generation finished - `ai.settings.maxSteps`: the maximum number of steps that were set - `ai.generateText.doGenerate` (span): a provider doGenerate call. It can contain `ai.toolCall` spans. It contains the [call LLM span information](#call-llm-span-information) and the following attributes: - `operation.name`: `ai.generateText.doGenerate` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.generateText.doGenerate"` - `ai.prompt.format`: the format of the prompt - `ai.prompt.messages`: the messages that were passed into the provider - `ai.prompt.tools`: array of stringified tool definitions. The tools can be of type `function` or `provider-defined`. Function tools have a `name`, `description` (optional), and `parameters` (JSON schema). Provider-defined tools have a `name`, `id`, and `args` (Record). - `ai.prompt.toolChoice`: the stringified tool choice setting (JSON). It has a `type` property (`auto`, `none`, `required`, `tool`), and if the type is `tool`, a `toolName` property with the specific tool. - `ai.response.text`: the text that was generated - `ai.response.toolCalls`: the tool calls that were made as part of the generation (stringified JSON) - `ai.response.finishReason`: the reason why the generation finished - `ai.toolCall` (span): a tool call that is made as part of the generateText call. See [Tool call spans](#tool-call-spans) for more details. ### streamText function `streamText` records 3 types of spans and 2 types of events: - `ai.streamText` (span): the full length of the streamText call. It contains a `ai.streamText.doStream` span. It contains the [basic LLM span information](#basic-llm-span-information) and the following attributes: - `operation.name`: `ai.streamText` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.streamText"` - `ai.prompt`: the prompt that was used when calling `streamText` - `ai.response.text`: the text that was generated - `ai.response.toolCalls`: the tool calls that were made as part of the generation (stringified JSON) - `ai.response.finishReason`: the reason why the generation finished - `ai.settings.maxSteps`: the maximum number of steps that were set - `ai.streamText.doStream` (span): a provider doStream call. This span contains an `ai.stream.firstChunk` event and `ai.toolCall` spans. It contains the [call LLM span information](#call-llm-span-information) and the following attributes: - `operation.name`: `ai.streamText.doStream` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.streamText.doStream"` - `ai.prompt.format`: the format of the prompt - `ai.prompt.messages`: the messages that were passed into the provider - `ai.prompt.tools`: array of stringified tool definitions. The tools can be of type `function` or `provider-defined`. Function tools have a `name`, `description` (optional), and `parameters` (JSON schema). Provider-defined tools have a `name`, `id`, and `args` (Record). 
- `ai.prompt.toolChoice`: the stringified tool choice setting (JSON). It has a `type` property (`auto`, `none`, `required`, `tool`), and if the type is `tool`, a `toolName` property with the specific tool. - `ai.response.text`: the text that was generated - `ai.response.toolCalls`: the tool calls that were made as part of the generation (stringified JSON) - `ai.response.msToFirstChunk`: the time it took to receive the first chunk in milliseconds - `ai.response.msToFinish`: the time it took to receive the finish part of the LLM stream in milliseconds - `ai.response.avgCompletionTokensPerSecond`: the average number of completion tokens per second - `ai.response.finishReason`: the reason why the generation finished - `ai.toolCall` (span): a tool call that is made as part of the generateText call. See [Tool call spans](#tool-call-spans) for more details. - `ai.stream.firstChunk` (event): an event that is emitted when the first chunk of the stream is received. - `ai.response.msToFirstChunk`: the time it took to receive the first chunk - `ai.stream.finish` (event): an event that is emitted when the finish part of the LLM stream is received. It also records a `ai.stream.firstChunk` event when the first chunk of the stream is received. ### generateObject function `generateObject` records 2 types of spans: - `ai.generateObject` (span): the full length of the generateObject call. It contains 1 or more `ai.generateObject.doGenerate` spans. It contains the [basic LLM span information](#basic-llm-span-information) and the following attributes: - `operation.name`: `ai.generateObject` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.generateObject"` - `ai.prompt`: the prompt that was used when calling `generateObject` - `ai.schema`: Stringified JSON schema version of the schema that was passed into the `generateObject` function - `ai.schema.name`: the name of the schema that was passed into the `generateObject` function - `ai.schema.description`: the description of the schema that was passed into the `generateObject` function - `ai.response.object`: the object that was generated (stringified JSON) - `ai.settings.mode`: the object generation mode, e.g. `json` - `ai.settings.output`: the output type that was used, e.g. `object` or `no-schema` - `ai.generateObject.doGenerate` (span): a provider doGenerate call. It contains the [call LLM span information](#call-llm-span-information) and the following attributes: - `operation.name`: `ai.generateObject.doGenerate` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.generateObject.doGenerate"` - `ai.prompt.format`: the format of the prompt - `ai.prompt.messages`: the messages that were passed into the provider - `ai.response.object`: the object that was generated (stringified JSON) - `ai.settings.mode`: the object generation mode - `ai.response.finishReason`: the reason why the generation finished ### streamObject function `streamObject` records 2 types of spans and 1 type of event: - `ai.streamObject` (span): the full length of the streamObject call. It contains 1 or more `ai.streamObject.doStream` spans. 
It contains the [basic LLM span information](#basic-llm-span-information) and the following attributes: - `operation.name`: `ai.streamObject` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.streamObject"` - `ai.prompt`: the prompt that was used when calling `streamObject` - `ai.schema`: Stringified JSON schema version of the schema that was passed into the `streamObject` function - `ai.schema.name`: the name of the schema that was passed into the `streamObject` function - `ai.schema.description`: the description of the schema that was passed into the `streamObject` function - `ai.response.object`: the object that was generated (stringified JSON) - `ai.settings.mode`: the object generation mode, e.g. `json` - `ai.settings.output`: the output type that was used, e.g. `object` or `no-schema` - `ai.streamObject.doStream` (span): a provider doStream call. This span contains an `ai.stream.firstChunk` event. It contains the [call LLM span information](#call-llm-span-information) and the following attributes: - `operation.name`: `ai.streamObject.doStream` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.streamObject.doStream"` - `ai.prompt.format`: the format of the prompt - `ai.prompt.messages`: the messages that were passed into the provider - `ai.settings.mode`: the object generation mode - `ai.response.object`: the object that was generated (stringified JSON) - `ai.response.msToFirstChunk`: the time it took to receive the first chunk - `ai.response.finishReason`: the reason why the generation finished - `ai.stream.firstChunk` (event): an event that is emitted when the first chunk of the stream is received. - `ai.response.msToFirstChunk`: the time it took to receive the first chunk ### embed function `embed` records 2 types of spans: - `ai.embed` (span): the full length of the embed call. It contains 1 `ai.embed.doEmbed` spans. It contains the [basic embedding span information](#basic-embedding-span-information) and the following attributes: - `operation.name`: `ai.embed` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.embed"` - `ai.value`: the value that was passed into the `embed` function - `ai.embedding`: a JSON-stringified embedding - `ai.embed.doEmbed` (span): a provider doEmbed call. It contains the [basic embedding span information](#basic-embedding-span-information) and the following attributes: - `operation.name`: `ai.embed.doEmbed` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.embed.doEmbed"` - `ai.values`: the values that were passed into the provider (array) - `ai.embeddings`: an array of JSON-stringified embeddings ### embedMany function `embedMany` records 2 types of spans: - `ai.embedMany` (span): the full length of the embedMany call. It contains 1 or more `ai.embedMany.doEmbed` spans. It contains the [basic embedding span information](#basic-embedding-span-information) and the following attributes: - `operation.name`: `ai.embedMany` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.embedMany"` - `ai.values`: the values that were passed into the `embedMany` function - `ai.embeddings`: an array of JSON-stringified embedding - `ai.embedMany.doEmbed` (span): a provider doEmbed call. 
It contains the [basic embedding span information](#basic-embedding-span-information) and the following attributes: - `operation.name`: `ai.embedMany.doEmbed` and the functionId that was set through `telemetry.functionId` - `ai.operationId`: `"ai.embedMany.doEmbed"` - `ai.values`: the values that were sent to the provider - `ai.embeddings`: an array of JSON-stringified embeddings for each value ## Span Details ### Basic LLM span information Many spans that use LLMs (`ai.generateText`, `ai.generateText.doGenerate`, `ai.streamText`, `ai.streamText.doStream`, `ai.generateObject`, `ai.generateObject.doGenerate`, `ai.streamObject`, `ai.streamObject.doStream`) contain the following attributes: - `resource.name`: the functionId that was set through `telemetry.functionId` - `ai.model.id`: the id of the model - `ai.model.provider`: the provider of the model - `ai.request.headers.*`: the request headers that were passed in through `headers` - `ai.settings.maxRetries`: the maximum number of retries that were set - `ai.telemetry.functionId`: the functionId that was set through `telemetry.functionId` - `ai.telemetry.metadata.*`: the metadata that was passed in through `telemetry.metadata` - `ai.usage.completionTokens`: the number of completion tokens that were used - `ai.usage.promptTokens`: the number of prompt tokens that were used ### Call LLM span information Spans that correspond to individual LLM calls (`ai.generateText.doGenerate`, `ai.streamText.doStream`, `ai.generateObject.doGenerate`, `ai.streamObject.doStream`) contain [basic LLM span information](#basic-llm-span-information) and the following attributes: - `ai.response.model`: the model that was used to generate the response. This can be different from the model that was requested if the provider supports aliases. - `ai.response.id`: the id of the response. Uses the ID from the provider when available. - `ai.response.timestamp`: the timestamp of the response. Uses the timestamp from the provider when available. - [Semantic Conventions for GenAI operations](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/) - `gen_ai.system`: the provider that was used - `gen_ai.request.model`: the model that was requested - `gen_ai.request.temperature`: the temperature that was set - `gen_ai.request.max_tokens`: the maximum number of tokens that were set - `gen_ai.request.frequency_penalty`: the frequency penalty that was set - `gen_ai.request.presence_penalty`: the presence penalty that was set - `gen_ai.request.top_k`: the topK parameter value that was set - `gen_ai.request.top_p`: the topP parameter value that was set - `gen_ai.request.stop_sequences`: the stop sequences - `gen_ai.response.finish_reasons`: the finish reasons that were returned by the provider - `gen_ai.response.model`: the model that was used to generate the response. This can be different from the model that was requested if the provider supports aliases. - `gen_ai.response.id`: the id of the response. Uses the ID from the provider when available. 
- `gen_ai.usage.input_tokens`: the number of prompt tokens that were used - `gen_ai.usage.output_tokens`: the number of completion tokens that were used ### Basic embedding span information Many spans that use embedding models (`ai.embed`, `ai.embed.doEmbed`, `ai.embedMany`, `ai.embedMany.doEmbed`) contain the following attributes: - `ai.model.id`: the id of the model - `ai.model.provider`: the provider of the model - `ai.request.headers.*`: the request headers that were passed in through `headers` - `ai.settings.maxRetries`: the maximum number of retries that were set - `ai.telemetry.functionId`: the functionId that was set through `telemetry.functionId` - `ai.telemetry.metadata.*`: the metadata that was passed in through `telemetry.metadata` - `ai.usage.tokens`: the number of tokens that were used - `resource.name`: the functionId that was set through `telemetry.functionId` ### Tool call spans Tool call spans (`ai.toolCall`) contain the following attributes: - `operation.name`: `"ai.toolCall"` - `ai.operationId`: `"ai.toolCall"` - `ai.toolCall.name`: the name of the tool - `ai.toolCall.id`: the id of the tool call - `ai.toolCall.args`: the parameters of the tool call - `ai.toolCall.result`: the result of the tool call. Only available if the tool call is successful and the result is serializable. --- title: Overview description: An overview of AI SDK UI. --- # AI SDK UI AI SDK UI is designed to help you build interactive chat, completion, and assistant applications with ease. It is a **framework-agnostic toolkit**, streamlining the integration of advanced AI functionalities into your applications. AI SDK UI provides robust abstractions that simplify the complex tasks of managing chat streams and UI updates on the frontend, enabling you to develop dynamic AI-driven interfaces more efficiently. With four main hooks — **`useChat`**, **`useCompletion`**, **`useObject`**, and **`useAssistant`** — you can incorporate real-time chat capabilities, text completions, streamed JSON, and interactive assistant features into your app. - **[`useChat`](/docs/ai-sdk-ui/chatbot)** offers real-time streaming of chat messages, abstracting state management for inputs, messages, loading, and errors, allowing for seamless integration into any UI design. - **[`useCompletion`](/docs/ai-sdk-ui/completion)** enables you to handle text completions in your applications, managing the prompt input and automatically updating the UI as new completions are streamed. - **[`useObject`](/docs/ai-sdk-ui/object-generation)** is a hook that allows you to consume streamed JSON objects, providing a simple way to handle and display structured data in your application. - **[`useAssistant`](/docs/ai-sdk-ui/openai-assistants)** is designed to facilitate interaction with OpenAI-compatible assistant APIs, managing UI state and updating it automatically as responses are streamed. These hooks are designed to reduce the complexity and time required to implement AI interactions, letting you focus on creating exceptional user experiences. ## UI Framework Support AI SDK UI supports the following frameworks: [React](https://react.dev/), [Svelte](https://svelte.dev/), [Vue.js](https://vuejs.org/), and [SolidJS](https://www.solidjs.com/). 
Here is a comparison of the supported functions across these frameworks: | Function | React | Svelte | Vue.js | SolidJS | | --------------------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | [useChat](/docs/reference/ai-sdk-ui/use-chat) | | | | | | [useCompletion](/docs/reference/ai-sdk-ui/use-completion) | | | | | | [useObject](/docs/reference/ai-sdk-ui/use-object) | | | | | | [useAssistant](/docs/reference/ai-sdk-ui/use-assistant) | | | | | [Contributions](https://github.com/vercel/ai/blob/main/CONTRIBUTING.md) are welcome to implement missing features for non-React frameworks. ## API Reference Please check out the [AI SDK UI API Reference](/docs/reference/ai-sdk-ui) for more details on each function. --- title: Chatbot description: Learn how to use the useChat hook. --- # Chatbot The `useChat` hook makes it effortless to create a conversational user interface for your chatbot application. It enables the streaming of chat messages from your AI provider, manages the chat state, and updates the UI automatically as new messages arrive. To summarize, the `useChat` hook provides the following features: - **Message Streaming**: All the messages from the AI provider are streamed to the chat UI in real-time. - **Managed States**: The hook manages the states for input, messages, status, error and more for you. - **Seamless Integration**: Easily integrate your chat AI into any design or layout with minimal effort. In this guide, you will learn how to use the `useChat` hook to create a chatbot application with real-time message streaming. Check out our [chatbot with tools guide](/docs/ai-sdk-ui/chatbot-with-tool-calling) to learn how to use tools in your chatbot. Let's start with the following example first. ## Example ```tsx filename='app/page.tsx' 'use client'; import { useChat } from '@ai-sdk/react'; export default function Page() { const { messages, input, handleInputChange, handleSubmit } = useChat({}); return ( <> {messages.map(message => (
        <div key={message.id}>
          {message.role === 'user' ? 'User: ' : 'AI: '}
          {message.content}
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input name="prompt" value={input} onChange={handleInputChange} />
        <button type="submit">Submit</button>
      </form>
    </>
); } ``` ```ts filename='app/api/chat/route.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; // Allow streaming responses up to 30 seconds export const maxDuration = 30; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: openai('gpt-4-turbo'), system: 'You are a helpful assistant.', messages, }); return result.toDataStreamResponse(); } ``` The UI messages have a new `parts` property that contains the message parts. We recommend rendering the messages using the `parts` property instead of the `content` property. The parts property supports different message types, including text, tool invocation, and tool result, and allows for more flexible and complex chat UIs. In the `Page` component, the `useChat` hook will request to your AI provider endpoint whenever the user submits a message. The messages are then streamed back in real-time and displayed in the chat UI. This enables a seamless chat experience where the user can see the AI response as soon as it is available, without having to wait for the entire response to be received. ## Customized UI `useChat` also provides ways to manage the chat message and input states via code, show status, and update messages without being triggered by user interactions. ### Status The `useChat` hook returns a `status`. It has the following possible values: - `submitted`: The message has been sent to the API and we're awaiting the start of the response stream. - `streaming`: The response is actively streaming in from the API, receiving chunks of data. - `ready`: The full response has been received and processed; a new user message can be submitted. - `error`: An error occurred during the API request, preventing successful completion. You can use `status` for e.g. the following purposes: - To show a loading spinner while the chatbot is processing the user's message. - To show a "Stop" button to abort the current message. - To disable the submit button. ```tsx filename='app/page.tsx' highlight="6,20-27,34" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Page() { const { messages, input, handleInputChange, handleSubmit, status, stop } = useChat({}); return ( <> {messages.map(message => (
        <div key={message.id}>
          {message.role === 'user' ? 'User: ' : 'AI: '}
          {message.content}
        </div>
      ))}

      {(status === 'submitted' || status === 'streaming') && (
        <div>
          {status === 'submitted' && <Spinner />}
          <button type="button" onClick={() => stop()}>
            Stop
          </button>
        </div>
      )}

      <form onSubmit={handleSubmit}>
        <input
          name="prompt"
          value={input}
          onChange={handleInputChange}
          disabled={status !== 'ready'}
        />
        <button type="submit">Submit</button>
      </form>
    </>
); } ``` ### Error State Similarly, the `error` state reflects the error object thrown during the fetch request. It can be used to display an error message, disable the submit button, or show a retry button: We recommend showing a generic error message to the user, such as "Something went wrong." This is a good practice to avoid leaking information from the server. ```tsx file="app/page.tsx" highlight="6,18-25,31" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit, error, reload } = useChat({}); return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}

      {error && (
        <>
          <div>An error occurred.</div>
          <button type="button" onClick={() => reload()}>
            Retry
          </button>
        </>
      )}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} disabled={error != null} />
      </form>
    </div>
); } ``` Please also see the [error handling](/docs/ai-sdk-ui/error-handling) guide for more information. ### Modify messages Sometimes, you may want to directly modify some existing messages. For example, a delete button can be added to each message to allow users to remove them from the chat history. The `setMessages` function can help you achieve these tasks: ```tsx const { messages, setMessages, ... } = useChat() const handleDelete = (id) => { setMessages(messages.filter(message => message.id !== id)) } return <> {messages.map(message => (
    <div key={message.id}>
      {message.role === 'user' ? 'User: ' : 'AI: '}
      {message.content}
      <button onClick={() => handleDelete(message.id)}>Delete</button>
    </div>
))} ... ``` You can think of `messages` and `setMessages` as a pair of `state` and `setState` in React. ### Controlled input In the initial example, we have `handleSubmit` and `handleInputChange` callbacks that manage the input changes and form submissions. These are handy for common use cases, but you can also use uncontrolled APIs for more advanced scenarios such as form validation or customized components. The following example demonstrates how to use more granular APIs like `setInput` and `append` with your custom input and submit button components: ```tsx const { input, setInput, append } = useChat() return <> setInput(value)} /> { // Send a new message to the AI provider append({ role: 'user', content: input, }) }}/> ... ``` ### Cancellation and regeneration It's also a common use case to abort the response message while it's still streaming back from the AI provider. You can do this by calling the `stop` function returned by the `useChat` hook. ```tsx const { stop, status, ... } = useChat() return <> ... ``` When the user clicks the "Stop" button, the fetch request will be aborted. This avoids consuming unnecessary resources and improves the UX of your chatbot application. Similarly, you can also request the AI provider to reprocess the last message by calling the `reload` function returned by the `useChat` hook: ```tsx const { reload, status, ... } = useChat() return <> ... ``` When the user clicks the "Regenerate" button, the AI provider will regenerate the last message and replace the current one correspondingly. ### Throttling UI Updates This feature is currently only available for React. By default, the `useChat` hook will trigger a render every time a new chunk is received. You can throttle the UI updates with the `experimental_throttle` option. ```tsx filename="page.tsx" highlight="2-3" const { messages, ... } = useChat({ // Throttle the messages and data updates to 50ms: experimental_throttle: 50 }) ``` ## Event Callbacks `useChat` provides optional event callbacks that you can use to handle different stages of the chatbot lifecycle: - `onFinish`: Called when the assistant message is completed - `onError`: Called when an error occurs during the fetch request. - `onResponse`: Called when the response from the API is received. These callbacks can be used to trigger additional actions, such as logging, analytics, or custom UI updates. ```tsx import { Message } from '@ai-sdk/react'; const { /* ... */ } = useChat({ onFinish: (message, { usage, finishReason }) => { console.log('Finished streaming message:', message); console.log('Token usage:', usage); console.log('Finish reason:', finishReason); }, onError: error => { console.error('An error occurred:', error); }, onResponse: response => { console.log('Received HTTP response from server:', response); }, }); ``` It's worth noting that you can abort the processing by throwing an error in the `onResponse` callback. This will trigger the `onError` callback and stop the message from being appended to the chat UI. This can be useful for handling unexpected responses from the AI provider. ## Request Configuration ### Custom headers, body, and credentials By default, the `useChat` hook sends a HTTP POST request to the `/api/chat` endpoint with the message list as the request body. 
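If you keep the defaults, a minimal route handler that receives this request can look like the following sketch (the `gpt-4o` model is just an example; the body also contains the chat `id` if you need it):

```ts filename="app/api/chat/route.ts"
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req: Request) {
  // the default request body contains the chat id and the full message list
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'),
    messages,
  });

  return result.toDataStreamResponse();
}
```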
You can customize the request by passing additional options to the `useChat` hook: ```tsx const { messages, input, handleInputChange, handleSubmit } = useChat({ api: '/api/custom-chat', headers: { Authorization: 'your_token', }, body: { user_id: '123', }, credentials: 'same-origin', }); ``` In this example, the `useChat` hook sends a POST request to the `/api/custom-chat` endpoint with the specified headers, additional body fields, and credentials for that fetch request. On your server side, you can handle the request with these additional information. ### Setting custom body fields per request You can configure custom `body` fields on a per-request basis using the `body` option of the `handleSubmit` function. This is useful if you want to pass in additional information to your backend that is not part of the message list. ```tsx filename="app/page.tsx" highlight="18-20" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit } = useChat(); return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}

      <form
        onSubmit={event => {
          handleSubmit(event, {
            body: {
              customKey: 'customValue',
            },
          });
        }}
      >
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` You can retrieve these custom fields on your server side by destructuring the request body: ```ts filename="app/api/chat/route.ts" highlight="3" export async function POST(req: Request) { // Extract addition information ("customKey") from the body of the request: const { messages, customKey } = await req.json(); //... } ``` ## Controlling the response stream With `streamText`, you can control how error messages and usage information are sent back to the client. ### Error Messages By default, the error message is masked for security reasons. The default error message is "An error occurred." You can forward error messages or send your own error message by providing a `getErrorMessage` function: ```ts filename="app/api/chat/route.ts" highlight="13-27" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: openai('gpt-4o'), messages, }); return result.toDataStreamResponse({ getErrorMessage: error => { if (error == null) { return 'unknown error'; } if (typeof error === 'string') { return error; } if (error instanceof Error) { return error.message; } return JSON.stringify(error); }, }); } ``` ### Usage Information By default, the usage information is sent back to the client. You can disable it by setting the `sendUsage` option to `false`: ```ts filename="app/api/chat/route.ts" highlight="13" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: openai('gpt-4o'), messages, }); return result.toDataStreamResponse({ sendUsage: false, }); } ``` ### Text Streams `useChat` can handle plain text streams by setting the `streamProtocol` option to `text`: ```tsx filename="app/page.tsx" highlight="7" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages } = useChat({ streamProtocol: 'text', }); return <>...; } ``` This configuration also works with other backend servers that stream plain text. Check out the [stream protocol guide](/docs/ai-sdk-ui/stream-protocol) for more information. When using `streamProtocol: 'text'`, tool calls, usage information and finish reasons are not available. ## Empty Submissions You can configure the `useChat` hook to allow empty submissions by setting the `allowEmptySubmit` option to `true`. ```tsx filename="app/page.tsx" highlight="18" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit } = useChat(); return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}

      <form
        onSubmit={event => {
          handleSubmit(event, {
            allowEmptySubmit: true,
          });
        }}
      >
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` ## Reasoning Some models such as as DeepSeek `deepseek-reasoner` and Anthropic `claude-3-7-sonnet-20250219` support reasoning tokens. These tokens are typically sent before the message content. You can forward them to the client with the `sendReasoning` option: ```ts filename="app/api/chat/route.ts" highlight="13" import { deepseek } from '@ai-sdk/deepseek'; import { streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: deepseek('deepseek-reasoner'), messages, }); return result.toDataStreamResponse({ sendReasoning: true, }); } ``` On the client side, you can access the reasoning parts of the message object. They have a `details` property that contains the reasoning and redacted reasoning parts. You can also use `reasoning` to access just the reasoning as a string. ```tsx filename="app/page.tsx" messages.map(message => (
  <div key={message.id}>
    {message.role === 'user' ? 'User: ' : 'AI: '}
    {message.parts.map((part, index) => {
      // text parts:
      if (part.type === 'text') {
        return <div key={index}>{part.text}</div>;
      }

      // reasoning parts:
      if (part.type === 'reasoning') {
        return (
          <pre key={index}>
            {part.details.map(detail =>
              detail.type === 'text' ? detail.text : '<redacted>',
            )}
          </pre>
        );
      }
    })}
  </div>
)); ``` ## Sources Some providers such as [Perplexity](/providers/ai-sdk-providers/perplexity#sources) and [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai#sources) include sources in the response. Currently sources are limited to web pages that ground the response. You can forward them to the client with the `sendSources` option: ```ts filename="app/api/chat/route.ts" highlight="13" import { perplexity } from '@ai-sdk/perplexity'; import { streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: perplexity('sonar-pro'), messages, }); return result.toDataStreamResponse({ sendSources: true, }); } ``` On the client side, you can access source parts of the message object. Here is an example that renders the sources as links at the bottom of the message: ```tsx filename="app/page.tsx" messages.map(message => (
  <div key={message.id}>
    {message.role === 'user' ? 'User: ' : 'AI: '}
    {message.parts
      .filter(part => part.type !== 'source')
      .map((part, index) => {
        if (part.type === 'text') {
          return <div key={index}>{part.text}</div>;
        }
      })}
    {message.parts
      .filter(part => part.type === 'source')
      .map(part => (
        <span key={`source-${part.source.id}`}>
          [
          <a href={part.source.url} target="_blank">
            {part.source.title ?? new URL(part.source.url).hostname}
          </a>
          ]
        </span>
      ))}
  </div>
)); ``` ## Attachments (Experimental) The `useChat` hook supports sending attachments along with a message as well as rendering them on the client. This can be useful for building applications that involve sending images, files, or other media content to the AI provider. There are two ways to send attachments with a message, either by providing a `FileList` object or a list of URLs to the `handleSubmit` function: ### FileList By using `FileList`, you can send multiple files as attachments along with a message using the file input element. The `useChat` hook will automatically convert them into data URLs and send them to the AI provider. Currently, only `image/*` and `text/*` content types get automatically converted into [multi-modal content parts](https://sdk.vercel.ai/docs/foundations/prompts#multi-modal-messages). You will need to handle other content types manually. ```tsx filename="app/page.tsx" 'use client'; import { useChat } from '@ai-sdk/react'; import { useRef, useState } from 'react'; export default function Page() { const { messages, input, handleSubmit, handleInputChange, status } = useChat(); const [files, setFiles] = useState(undefined); const fileInputRef = useRef(null); return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          <div>{`${message.role}: `}</div>
          <div>{message.content}</div>
          <div>
            {message.experimental_attachments
              ?.filter(attachment =>
                attachment.contentType.startsWith('image/'),
              )
              .map((attachment, index) => (
                <img
                  key={`${message.id}-${index}`}
                  src={attachment.url}
                  alt={attachment.name}
                />
              ))}
          </div>
        </div>
      ))}

      <form
        onSubmit={event => {
          handleSubmit(event, {
            experimental_attachments: files,
          });

          setFiles(undefined);

          if (fileInputRef.current) {
            fileInputRef.current.value = '';
          }
        }}
      >
        <input
          type="file"
          onChange={event => {
            if (event.target.files) {
              setFiles(event.target.files);
            }
          }}
          multiple
          ref={fileInputRef}
        />
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` ### URLs You can also send URLs as attachments along with a message. This can be useful for sending links to external resources or media content. > **Note:** The URL can also be a data URL, which is a base64-encoded string that represents the content of a file. Currently, only `image/*` content types get automatically converted into [multi-modal content parts](https://sdk.vercel.ai/docs/foundations/prompts#multi-modal-messages). You will need to handle other content types manually. ```tsx filename="app/page.tsx" 'use client'; import { useChat } from '@ai-sdk/react'; import { useState } from 'react'; import { Attachment } from '@ai-sdk/ui-utils'; export default function Page() { const { messages, input, handleSubmit, handleInputChange, status } = useChat(); const [attachments] = useState([ { name: 'earth.png', contentType: 'image/png', url: 'https://example.com/earth.png', }, { name: 'moon.png', contentType: 'image/png', url: 'data:image/png;base64,iVBORw0KGgo...', }, ]); return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          <div>{`${message.role}: `}</div>
          <div>{message.content}</div>
          <div>
            {message.experimental_attachments
              ?.filter(attachment =>
                attachment.contentType?.startsWith('image/'),
              )
              .map((attachment, index) => (
                <img
                  key={`${message.id}-${index}`}
                  src={attachment.url}
                  alt={attachment.name}
                />
              ))}
          </div>
        </div>
      ))}

      <form
        onSubmit={event => {
          handleSubmit(event, {
            experimental_attachments: attachments,
          });
        }}
      >
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` --- title: Chatbot Message Persistence description: Learn how to store and load chat messages in a chatbot. --- # Chatbot Message Persistence Being able to store and load chat messages is crucial for most AI chatbots. In this guide, we'll show how to implement message persistence with `useChat` and `streamText`. This guide does not cover authorization, error handling, or other real-world considerations. It is intended to be a simple example of how to implement message persistence. ## Starting a new chat When the user navigates to the chat page without providing a chat ID, we need to create a new chat and redirect to the chat page with the new chat ID. ```tsx filename="app/chat/page.tsx" import { redirect } from 'next/navigation'; import { createChat } from '@tools/chat-store'; export default async function Page() { const id = await createChat(); // create a new chat redirect(`/chat/${id}`); // redirect to chat page, see below } ``` Our example chat store implementation uses files to store the chat messages. In a real-world application, you would use a database or a cloud storage service, and get the chat ID from the database. That being said, the function interfaces are designed to be easily replaced with other implementations. ```tsx filename="tools/chat-store.ts" import { generateId } from 'ai'; import { existsSync, mkdirSync } from 'fs'; import { writeFile } from 'fs/promises'; import path from 'path'; export async function createChat(): Promise { const id = generateId(); // generate a unique chat ID await writeFile(getChatFile(id), '[]'); // create an empty chat file return id; } function getChatFile(id: string): string { const chatDir = path.join(process.cwd(), '.chats'); if (!existsSync(chatDir)) mkdirSync(chatDir, { recursive: true }); return path.join(chatDir, `${id}.json`); } ``` ## Loading an existing chat When the user navigates to the chat page with a chat ID, we need to load the chat messages and display them. ```tsx filename="app/chat/[id]/page.tsx" import { loadChat } from '@tools/chat-store'; import Chat from '@ui/chat'; export default async function Page(props: { params: Promise<{ id: string }> }) { const { id } = await props.params; // get the chat ID from the URL const messages = await loadChat(id); // load the chat messages return ; // display the chat } ``` The `loadChat` function in our file-based chat store is implemented as follows: ```tsx filename="tools/chat-store.ts" import { Message } from 'ai'; import { readFile } from 'fs/promises'; export async function loadChat(id: string): Promise { return JSON.parse(await readFile(getChatFile(id), 'utf8')); } // ... rest of the file ``` The display component is a simple chat component that uses the `useChat` hook to send and receive messages: ```tsx filename="ui/chat.tsx" highlight="10-12" 'use client'; import { Message, useChat } from '@ai-sdk/react'; export default function Chat({ id, initialMessages, }: { id?: string | undefined; initialMessages?: Message[] } = {}) { const { input, handleInputChange, handleSubmit, messages } = useChat({ id, // use the provided chat ID initialMessages, // initial messages if provided sendExtraMessageFields: true, // send id and createdAt for each message }); // simplified rendering code, extend as needed: return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role === 'user' ? 'User: ' : 'AI: '}
          {m.content}
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` ## Storing messages `useChat` sends the chat id and the messages to the backend. We have enabled the `sendExtraMessageFields` option to send the id and createdAt fields, meaning that we store messages in the `useChat` message format. The `useChat` message format is different from the `CoreMessage` format. The `useChat` message format is designed for frontend display, and contains additional fields such as `id` and `createdAt`. We recommend storing the messages in the `useChat` message format. Storing messages is done in the `onFinish` callback of the `streamText` function. `onFinish` receives the messages from the AI response as a `CoreMessage[]`, and we use the [`appendResponseMessages`](/docs/reference/ai-sdk-ui/append-response-messages) helper to append the AI response messages to the chat messages. ```tsx filename="app/api/chat/route.ts" highlight="6,11-19" import { openai } from '@ai-sdk/openai'; import { appendResponseMessages, streamText } from 'ai'; import { saveChat } from '@tools/chat-store'; export async function POST(req: Request) { const { messages, id } = await req.json(); const result = streamText({ model: openai('gpt-4o-mini'), messages, async onFinish({ response }) { await saveChat({ id, messages: appendResponseMessages({ messages, responseMessages: response.messages, }), }); }, }); return result.toDataStreamResponse(); } ``` The actual storage of the messages is done in the `saveChat` function, which in our file-based chat store is implemented as follows: ```tsx filename="tools/chat-store.ts" import { Message } from 'ai'; import { writeFile } from 'fs/promises'; export async function saveChat({ id, messages, }: { id: string; messages: Message[]; }): Promise { const content = JSON.stringify(messages, null, 2); await writeFile(getChatFile(id), content); } // ... rest of the file ``` ## Message IDs In addition to a chat ID, each message has an ID. You can use this message ID to e.g. manipulate individual messages. The IDs for user messages are generated by the `useChat` hook on the client, and the IDs for AI response messages are generated by `streamText`. You can control the ID format by providing ID generators (see [`createIdGenerator()`](/docs/reference/ai-sdk-core/create-id-generator): ```tsx filename="ui/chat.tsx" highlight="8-12" import { createIdGenerator } from 'ai'; import { useChat } from '@ai-sdk/react'; const { // ... } = useChat({ // ... // id format for client-side messages: generateId: createIdGenerator({ prefix: 'msgc', size: 16, }), }); ``` ```tsx filename="app/api/chat/route.ts" highlight="7-11" import { createIdGenerator, streamText } from 'ai'; export async function POST(req: Request) { // ... const result = streamText({ // ... // id format for server-side messages: experimental_generateMessageId: createIdGenerator({ prefix: 'msgs', size: 16, }), }); // ... } ``` ## Sending only the last message Once you have implemented message persistence, you might want to send only the last message to the server. This reduces the amount of data sent to the server on each request and can improve performance. To achieve this, you can provide an `experimental_prepareRequestBody` function to the `useChat` hook (React only). This function receives the messages and the chat ID, and returns the request body to be sent to the server. ```tsx filename="ui/chat.tsx" highlight="7-10" import { useChat } from '@ai-sdk/react'; const { // ... } = useChat({ // ... 
// only send the last message to the server: experimental_prepareRequestBody({ messages, id }) { return { message: messages[messages.length - 1], id }; }, }); ``` On the server, you can then load the previous messages and append the new message to the previous messages: ```tsx filename="app/api/chat/route.ts" highlight="2-9" import { appendClientMessage } from 'ai'; export async function POST(req: Request) { // get the last message from the client: const { message, id } = await req.json(); // load the previous messages from the server: const previousMessages = await loadChat(id); // append the new message to the previous messages: const messages = appendClientMessage({ messages: previousMessages, message, }); const result = streamText({ // ... messages, }); // ... } ``` ## Handling client disconnects By default, the AI SDK `streamText` function uses backpressure to the language model provider to prevent the consumption of tokens that are not yet requested. However, this means that when the client disconnects, e.g. by closing the browser tab or because of a network issue, the stream from the LLM will be aborted and the conversation may end up in a broken state. Assuming that you have a [storage solution](#storing-messages) in place, you can use the `consumeStream` method to consume the stream on the backend, and then save the result as usual. `consumeStream` effectively removes the backpressure, meaning that the result is stored even when the client has already disconnected. ```tsx filename="app/api/chat/route.ts" highlight="21-23" import { appendResponseMessages, streamText } from 'ai'; import { saveChat } from '@tools/chat-store'; export async function POST(req: Request) { const { messages, id } = await req.json(); const result = streamText({ model, messages, async onFinish({ response }) { await saveChat({ id, messages: appendResponseMessages({ messages, responseMessages: response.messages, }), }); }, }); // consume the stream to ensure it runs to completion & triggers onFinish // even when the client response is aborted: result.consumeStream(); // no await return result.toDataStreamResponse(); } ``` When the client reloads the page after a disconnect, the chat will be restored from the storage solution. In production applications, you would also track the state of the request (in progress, complete) in your stored messages and use it on the client to cover the case where the client reloads the page after a disconnection, but the streaming is not yet complete. --- title: Chatbot Tool Usage description: Learn how to use tools with the useChat hook. --- # Chatbot Tool Usage With [`useChat`](/docs/reference/ai-sdk-ui/use-chat) and [`streamText`](/docs/reference/ai-sdk-core/stream-text), you can use tools in your chatbot application. The AI SDK supports three types of tools in this context: 1. Automatically executed server-side tools 2. Automatically executed client-side tools 3. Tools that require user interaction, such as confirmation dialogs The flow is as follows: 1. The user enters a message in the chat UI. 1. The message is sent to the API route. 1. In your server side route, the language model generates tool calls during the `streamText` call. 1. All tool calls are forwarded to the client. 1. Server-side tools are executed using their `execute` method and their results are forwarded to the client. 1. Client-side tools that should be automatically executed are handled with the `onToolCall` callback. You can return the tool result from the callback. 1. 
Client-side tool that require user interactions can be displayed in the UI. The tool calls and results are available as tool invocation parts in the `parts` property of the last assistant message. 1. When the user interaction is done, `addToolResult` can be used to add the tool result to the chat. 1. When there are tool calls in the last assistant message and all tool results are available, the client sends the updated messages back to the server. This triggers another iteration of this flow. The tool call and tool executions are integrated into the assistant message as tool invocation parts. A tool invocation is at first a tool call, and then it becomes a tool result when the tool is executed. The tool result contains all information about the tool call as well as the result of the tool execution. In order to automatically send another request to the server when all tool calls are server-side, you need to set [`maxSteps`](/docs/reference/ai-sdk-ui/use-chat#max-steps) to a value greater than 1 in the `useChat` options. It is disabled by default for backward compatibility. ## Example In this example, we'll use three tools: - `getWeatherInformation`: An automatically executed server-side tool that returns the weather in a given city. - `askForConfirmation`: A user-interaction client-side tool that asks the user for confirmation. - `getLocation`: An automatically executed client-side tool that returns a random city. ### API route ```tsx filename='app/api/chat/route.ts' import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { z } from 'zod'; // Allow streaming responses up to 30 seconds export const maxDuration = 30; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: openai('gpt-4o'), messages, tools: { // server-side tool with execute function: getWeatherInformation: { description: 'show the weather in a given city to the user', parameters: z.object({ city: z.string() }), execute: async ({}: { city: string }) => { const weatherOptions = ['sunny', 'cloudy', 'rainy', 'snowy', 'windy']; return weatherOptions[ Math.floor(Math.random() * weatherOptions.length) ]; }, }, // client-side tool that starts user interaction: askForConfirmation: { description: 'Ask the user for confirmation.', parameters: z.object({ message: z.string().describe('The message to ask for confirmation.'), }), }, // client-side tool that is automatically executed on the client: getLocation: { description: 'Get the user location. Always ask for confirmation before using this tool.', parameters: z.object({}), }, }, }); return result.toDataStreamResponse(); } ``` ### Client-side page The client-side page uses the `useChat` hook to create a chatbot application with real-time message streaming. Tool invocations are displayed in the chat UI as tool invocation parts. Please make sure to render the messages using the `parts` property of the message. There are three things worth mentioning: 1. The [`onToolCall`](/docs/reference/ai-sdk-ui/use-chat#on-tool-call) callback is used to handle client-side tools that should be automatically executed. In this example, the `getLocation` tool is a client-side tool that returns a random city. 2. The `toolInvocations` property of the last assistant message contains all tool calls and results. The client-side tool `askForConfirmation` is displayed in the UI. It asks the user for confirmation and displays the result once the user confirms or denies the execution. 
The result is added to the chat using `addToolResult`. 3. The [`maxSteps`](/docs/reference/ai-sdk-ui/use-chat#max-steps) option is set to 5. This enables several tool use iterations between the client and the server. ```tsx filename='app/page.tsx' highlight="9,12,31" 'use client'; import { ToolInvocation } from 'ai'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit, addToolResult } = useChat({ maxSteps: 5, // run client-side tools that are automatically executed: async onToolCall({ toolCall }) { if (toolCall.toolName === 'getLocation') { const cities = [ 'New York', 'Los Angeles', 'Chicago', 'San Francisco', ]; return cities[Math.floor(Math.random() * cities.length)]; } }, }); return ( <> {messages?.map(message => (
        <div key={message.id}>
          <strong>{`${message.role}: `}</strong>
          {message.parts.map(part => {
            switch (part.type) {
              // render text parts as simple text:
              case 'text':
                return part.text;

              // for tool invocations, distinguish between the tools and the state:
              case 'tool-invocation': {
                const callId = part.toolInvocation.toolCallId;

                switch (part.toolInvocation.toolName) {
                  case 'askForConfirmation': {
                    switch (part.toolInvocation.state) {
                      case 'call':
                        return (
                          <div key={callId}>
                            {part.toolInvocation.args.message}
                            <div>
                              <button
                                onClick={() =>
                                  addToolResult({
                                    toolCallId: callId,
                                    result: 'Yes, confirmed.',
                                  })
                                }
                              >
                                Yes
                              </button>
                              <button
                                onClick={() =>
                                  addToolResult({
                                    toolCallId: callId,
                                    result: 'No, denied.',
                                  })
                                }
                              >
                                No
                              </button>
                            </div>
                          </div>
                        );
                      case 'result':
                        return (
                          <div key={callId}>
                            Location access allowed:{' '}
                            {part.toolInvocation.result}
                          </div>
                        );
                    }
                    break;
                  }

                  case 'getLocation': {
                    switch (part.toolInvocation.state) {
                      case 'call':
                        return <div key={callId}>Getting location...</div>;
                      case 'result':
                        return (
                          <div key={callId}>
                            Location: {part.toolInvocation.result}
                          </div>
                        );
                    }
                    break;
                  }

                  case 'getWeatherInformation': {
                    switch (part.toolInvocation.state) {
                      // example of pre-rendering streaming tool calls:
                      case 'partial-call':
                        return (
                          <pre key={callId}>
                            {JSON.stringify(part.toolInvocation, null, 2)}
                          </pre>
                        );
                      case 'call':
                        return (
                          <div key={callId}>
                            Getting weather information for{' '}
                            {part.toolInvocation.args.city}...
                          </div>
                        );
                      case 'result':
                        return (
                          <div key={callId}>
                            Weather in {part.toolInvocation.args.city}:{' '}
                            {part.toolInvocation.result}
                          </div>
                        );
                    }
                    break;
                  }
                }
              }
            }
          })}
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </>
); } ``` ## Tool call streaming You can stream tool calls while they are being generated by enabling the `toolCallStreaming` option in `streamText`. ```tsx filename='app/api/chat/route.ts' highlight="5" export async function POST(req: Request) { // ... const result = streamText({ toolCallStreaming: true, // ... }); return result.toDataStreamResponse(); } ``` When the flag is enabled, partial tool calls will be streamed as part of the data stream. They are available through the `useChat` hook. The tool invocation parts of assistant messages will also contain partial tool calls. You can use the `state` property of the tool invocation to render the correct UI. ```tsx filename='app/page.tsx' highlight="9,10" export default function Chat() { // ... return ( <> {messages?.map(message => (
        <div key={message.id}>
          {message.parts.map(part => {
            if (part.type === 'tool-invocation') {
              switch (part.toolInvocation.state) {
                case 'partial-call':
                  return <>render partial tool call</>;
                case 'call':
                  return <>render full tool call</>;
                case 'result':
                  return <>render tool result</>;
              }
            }
          })}
        </div>
))} ); } ``` ## Server-side Multi-Step Calls You can also use multi-step calls on the server-side with `streamText`. This works when all invoked tools have an `execute` function on the server side. ```tsx filename='app/api/chat/route.ts' highlight="15-21,24" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { z } from 'zod'; export async function POST(req: Request) { const { messages } = await req.json(); const result = streamText({ model: openai('gpt-4o'), messages, tools: { getWeatherInformation: { description: 'show the weather in a given city to the user', parameters: z.object({ city: z.string() }), // tool has execute function: execute: async ({}: { city: string }) => { const weatherOptions = ['sunny', 'cloudy', 'rainy', 'snowy', 'windy']; return weatherOptions[ Math.floor(Math.random() * weatherOptions.length) ]; }, }, }, maxSteps: 5, }); return result.toDataStreamResponse(); } ``` ## Errors Language models can make errors when calling tools. By default, these errors are masked for security reasons, and show up as "An error occurred" in the UI. To surface the errors, you can use the `getErrorMessage` function when calling `toDataStreamResponse`. ```tsx export function errorHandler(error: unknown) { if (error == null) { return 'unknown error'; } if (typeof error === 'string') { return error; } if (error instanceof Error) { return error.message; } return JSON.stringify(error); } ``` ```tsx const result = streamText({ // ... }); return result.toDataStreamResponse({ getErrorMessage: errorHandler, }); ``` In case you are using `createDataStreamResponse`, you can use the `onError` function when calling `toDataStreamResponse`: ```tsx const response = createDataStreamResponse({ // ... async execute(dataStream) { // ... }, onError: error => `Custom error: ${error.message}`, }); ``` --- title: Generative User Interfaces description: Learn how to build Generative UI with AI SDK UI. --- # Generative User Interfaces Generative user interfaces (generative UI) is the process of allowing a large language model (LLM) to go beyond text and "generate UI". This creates a more engaging and AI-native experience for users. At the core of generative UI are [ tools ](/docs/ai-sdk-core/tools-and-tool-calling), which are functions you provide to the model to perform specialized tasks like getting the weather in a location. The model can decide when and how to use these tools based on the context of the conversation. Generative UI is the process of connecting the results of a tool call to a React component. Here's how it works: 1. You provide the model with a prompt or conversation history, along with a set of tools. 2. Based on the context, the model may decide to call a tool. 3. If a tool is called, it will execute and return data. 4. This data can then be passed to a React component for rendering. By passing the tool results to React components, you can create a generative UI experience that's more engaging and adaptive to your needs. ## Build a Generative UI Chat Interface Let's create a chat interface that handles text-based conversations and incorporates dynamic UI elements based on model responses. ### Basic Chat Implementation Start with a basic chat implementation using the `useChat` hook: ```tsx filename="app/page.tsx" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Page() { const { messages, input, handleInputChange, handleSubmit } = useChat(); return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          <div>{message.role === 'user' ? 'User: ' : 'AI: '}</div>
          <div>{message.content}</div>
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
); } ``` To handle the chat requests and model responses, set up an API route: ```ts filename="app/api/chat/route.ts" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; export async function POST(request: Request) { const { messages } = await request.json(); const result = streamText({ model: openai('gpt-4o'), system: 'You are a friendly assistant!', messages, maxSteps: 5, }); return result.toDataStreamResponse(); } ``` This API route uses the `streamText` function to process chat messages and stream the model's responses back to the client. ### Create a Tool Before enhancing your chat interface with dynamic UI elements, you need to create a tool and corresponding React component. A tool will allow the model to perform a specific action, such as fetching weather information. Create a new file called `ai/tools.ts` with the following content: ```ts filename="ai/tools.ts" import { tool as createTool } from 'ai'; import { z } from 'zod'; export const weatherTool = createTool({ description: 'Display the weather for a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async function ({ location }) { await new Promise(resolve => setTimeout(resolve, 2000)); return { weather: 'Sunny', temperature: 75, location }; }, }); export const tools = { displayWeather: weatherTool, }; ``` In this file, you've created a tool called `weatherTool`. This tool simulates fetching weather information for a given location. This tool will return simulated data after a 2-second delay. In a real-world application, you would replace this simulation with an actual API call to a weather service. ### Update the API Route Update the API route to include the tool you've defined: ```ts filename="app/api/chat/route.ts" highlight="3,13" import { openai } from '@ai-sdk/openai'; import { streamText } from 'ai'; import { tools } from '@/ai/tools'; export async function POST(request: Request) { const { messages } = await request.json(); const result = streamText({ model: openai('gpt-4o'), system: 'You are a friendly assistant!', messages, maxSteps: 5, tools, }); return result.toDataStreamResponse(); } ``` Now that you've defined the tool and added it to your `streamText` call, let's build a React component to display the weather information it returns. ### Create UI Components Create a new file called `components/weather.tsx`: ```tsx filename="components/weather.tsx" type WeatherProps = { temperature: number; weather: string; location: string; }; export const Weather = ({ temperature, weather, location }: WeatherProps) => { return (

    <div>
      <h2>Current Weather for {location}</h2>
      <p>Condition: {weather}</p>
      <p>Temperature: {temperature}°C</p>
    </div>
); }; ``` This component will display the weather information for a given location. It takes three props: `temperature`, `weather`, and `location` (exactly what the `weatherTool` returns). ### Render the Weather Component Now that you have your tool and corresponding React component, let's integrate them into your chat interface. You'll render the Weather component when the model calls the weather tool. To check if the model has called a tool, you can use the `toolInvocations` property of the message object. This property contains information about any tools that were invoked in that generation including `toolCallId`, `toolName`, `args`, `toolState`, and `result`. Update your `page.tsx` file: ```tsx filename="app/page.tsx" highlight="4,16-39" 'use client'; import { useChat } from '@ai-sdk/react'; import { Weather } from '@/components/weather'; export default function Page() { const { messages, input, handleInputChange, handleSubmit } = useChat(); return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          <div>{message.role === 'user' ? 'User: ' : 'AI: '}</div>
          <div>{message.content}</div>

          <div>
            {message.toolInvocations?.map(toolInvocation => {
              const { toolName, toolCallId, state } = toolInvocation;

              if (state === 'result') {
                if (toolName === 'displayWeather') {
                  const { result } = toolInvocation;
                  return (
                    <div key={toolCallId}>
                      <Weather {...result} />
                    </div>
                  );
                }
              } else {
                return (
                  <div key={toolCallId}>
                    {toolName === 'displayWeather' ? (
                      <div>Loading weather...</div>
                    ) : null}
                  </div>
                );
              }
            })}
          </div>
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
); } ``` In this updated code snippet, you: 1. Check if the message has `toolInvocations`. 2. Check if the tool invocation state is 'result'. 3. If it's a result and the tool name is 'displayWeather', render the Weather component. 4. If the tool invocation state is not 'result', show a loading message. This approach allows you to dynamically render UI components based on the model's responses, creating a more interactive and context-aware chat experience. ## Expanding Your Generative UI Application You can enhance your chat application by adding more tools and components, creating a richer and more versatile user experience. Here's how you can expand your application: ### Adding More Tools To add more tools, simply define them in your `ai/tools.ts` file: ```ts // Add a new stock tool export const stockTool = createTool({ description: 'Get price for a stock', parameters: z.object({ symbol: z.string().describe('The stock symbol to get the price for'), }), execute: async function ({ symbol }) { // Simulated API call await new Promise(resolve => setTimeout(resolve, 2000)); return { symbol, price: 100 }; }, }); // Update the tools object export const tools = { displayWeather: weatherTool, getStockPrice: stockTool, }; ``` Now, create a new file called `components/stock.tsx`: ```tsx type StockProps = { price: number; symbol: string; }; export const Stock = ({ price, symbol }: StockProps) => { return (

    <div>
      <h2>Stock Information</h2>
      <p>Symbol: {symbol}</p>
      <p>Price: ${price}</p>
    </div>
  );
};
{messages.map(message => (
{message.role}
{message.content}
{message.toolInvocations?.map(toolInvocation => { const { toolName, toolCallId, state } = toolInvocation; if (state === 'result') { if (toolName === 'displayWeather') { const { result } = toolInvocation; return (
); } else if (toolName === 'getStockPrice') { const { result } = toolInvocation; return ; } } else { return (
{toolName === 'displayWeather' ? (
Loading weather...
) : toolName === 'getStockPrice' ? (
Loading stock price...
) : (
Loading...
)}
); } })}
))}
{ setInput(event.target.value); }} />
); } ``` By following this pattern, you can continue to add more tools and components, expanding the capabilities of your Generative UI application. --- title: Completion description: Learn how to use the useCompletion hook. --- # Completion The `useCompletion` hook allows you to create a user interface to handle text completions in your application. It enables the streaming of text completions from your AI provider, manages the state for chat input, and updates the UI automatically as new messages are received. In this guide, you will learn how to use the `useCompletion` hook in your application to generate text completions and stream them in real-time to your users. ## Example ```tsx filename='app/page.tsx' 'use client'; import { useCompletion } from '@ai-sdk/react'; export default function Page() { const { completion, input, handleInputChange, handleSubmit } = useCompletion({ api: '/api/completion', }); return (
    <form onSubmit={handleSubmit}>
      <input
        name="prompt"
        value={input}
        onChange={handleInputChange}
        id="input"
      />
      <button type="submit">Submit</button>
      <div>{completion}</div>
    </form>
); } ``` ```ts filename='app/api/completion/route.ts' import { streamText } from 'ai'; import { openai } from '@ai-sdk/openai'; // Allow streaming responses up to 30 seconds export const maxDuration = 30; export async function POST(req: Request) { const { prompt }: { prompt: string } = await req.json(); const result = streamText({ model: openai('gpt-3.5-turbo'), prompt, }); return result.toDataStreamResponse(); } ``` In the `Page` component, the `useCompletion` hook will request to your AI provider endpoint whenever the user submits a message. The completion is then streamed back in real-time and displayed in the UI. This enables a seamless text completion experience where the user can see the AI response as soon as it is available, without having to wait for the entire response to be received. ## Customized UI `useCompletion` also provides ways to manage the prompt via code, show loading and error states, and update messages without being triggered by user interactions. ### Loading and error states To show a loading spinner while the chatbot is processing the user's message, you can use the `isLoading` state returned by the `useCompletion` hook: ```tsx const { isLoading, ... } = useCompletion() return( <> {isLoading ? : null} ) ``` Similarly, the `error` state reflects the error object thrown during the fetch request. It can be used to display an error message, or show a toast notification: ```tsx const { error, ... } = useCompletion() useEffect(() => { if (error) { toast.error(error.message) } }, [error]) // Or display the error message in the UI: return ( <> {error ?
<div>{error.message}</div>
: null} ) ``` ### Controlled input In the initial example, we have `handleSubmit` and `handleInputChange` callbacks that manage the input changes and form submissions. These are handy for common use cases, but you can also use uncontrolled APIs for more advanced scenarios such as form validation or customized components. The following example demonstrates how to use more granular APIs like `setInput` with your custom input and submit button components: ```tsx const { input, setInput } = useCompletion(); return ( <> setInput(value)} /> ); ``` ### Cancelation It's also a common use case to abort the response message while it's still streaming back from the AI provider. You can do this by calling the `stop` function returned by the `useCompletion` hook. ```tsx const { stop, isLoading, ... } = useCompletion() return ( <> ) ``` When the user clicks the "Stop" button, the fetch request will be aborted. This avoids consuming unnecessary resources and improves the UX of your application. ### Throttling UI Updates This feature is currently only available for React. By default, the `useCompletion` hook will trigger a render every time a new chunk is received. You can throttle the UI updates with the `experimental_throttle` option. ```tsx filename="page.tsx" highlight="2-3" const { completion, ... } = useCompletion({ // Throttle the completion and data updates to 50ms: experimental_throttle: 50 }) ``` ## Event Callbacks `useCompletion` also provides optional event callbacks that you can use to handle different stages of the chatbot lifecycle. These callbacks can be used to trigger additional actions, such as logging, analytics, or custom UI updates. ```tsx const { ... } = useCompletion({ onResponse: (response: Response) => { console.log('Received response from server:', response) }, onFinish: (message: Message) => { console.log('Finished streaming message:', message) }, onError: (error: Error) => { console.error('An error occurred:', error) }, }) ``` It's worth noting that you can abort the processing by throwing an error in the `onResponse` callback. This will trigger the `onError` callback and stop the message from being appended to the chat UI. This can be useful for handling unexpected responses from the AI provider. ## Configure Request Options By default, the `useCompletion` hook sends a HTTP POST request to the `/api/completion` endpoint with the prompt as part of the request body. You can customize the request by passing additional options to the `useCompletion` hook: ```tsx const { messages, input, handleInputChange, handleSubmit } = useCompletion({ api: '/api/custom-completion', headers: { Authorization: 'your_token', }, body: { user_id: '123', }, credentials: 'same-origin', }); ``` In this example, the `useCompletion` hook sends a POST request to the `/api/completion` endpoint with the specified headers, additional body fields, and credentials for that fetch request. On your server side, you can handle the request with these additional information. --- title: Object Generation description: Learn how to use the useObject hook. --- # Object Generation `useObject` is an experimental feature and only available in React. The [`useObject`](/docs/reference/ai-sdk-ui/use-object) hook allows you to create interfaces that represent a structured JSON object that is being streamed. In this guide, you will learn how to use the `useObject` hook in your application to generate UIs for structured data on the fly. 
## Example The example shows a small notifications demo app that generates fake notifications in real-time. ### Schema It is helpful to set up the schema in a separate file that is imported on both the client and server. ```ts filename='app/api/notifications/schema.ts' import { z } from 'zod'; // define a schema for the notifications export const notificationSchema = z.object({ notifications: z.array( z.object({ name: z.string().describe('Name of a fictional person.'), message: z.string().describe('Message. Do not use emojis or links.'), }), ), }); ``` ### Client The client uses [`useObject`](/docs/reference/ai-sdk-ui/use-object) to stream the object generation process. The results are partial and are displayed as they are received. Please note the code for handling `undefined` values in the JSX. ```tsx filename='app/page.tsx' 'use client'; import { experimental_useObject as useObject } from '@ai-sdk/react'; import { notificationSchema } from './api/notifications/schema'; export default function Page() { const { object, submit } = useObject({ api: '/api/notifications', schema: notificationSchema, }); return ( <> {object?.notifications?.map((notification, index) => (

        <div key={index}>
          <p>{notification?.name}</p>
          <p>{notification?.message}</p>
        </div>

))} ); } ``` ### Server On the server, we use [`streamObject`](/docs/reference/ai-sdk-core/stream-object) to stream the object generation process. ```typescript filename='app/api/notifications/route.ts' import { openai } from '@ai-sdk/openai'; import { streamObject } from 'ai'; import { notificationSchema } from './schema'; // Allow streaming responses up to 30 seconds export const maxDuration = 30; export async function POST(req: Request) { const context = await req.json(); const result = streamObject({ model: openai('gpt-4-turbo'), schema: notificationSchema, prompt: `Generate 3 notifications for a messages app in this context:` + context, }); return result.toTextStreamResponse(); } ``` ## Customized UI `useObject` also provides ways to show loading and error states: ### Loading State The `isLoading` state returned by the `useObject` hook can be used for several purposes: - To show a loading spinner while the object is generated. - To disable the submit button. ```tsx filename='app/page.tsx' highlight="6,13-20,24" 'use client'; import { useObject } from '@ai-sdk/react'; export default function Page() { const { isLoading, object, submit } = useObject({ api: '/api/notifications', schema: notificationSchema, }); return ( <> {isLoading && } {object?.notifications?.map((notification, index) => (

        <div key={index}>
          <p>{notification?.name}</p>
          <p>{notification?.message}</p>
        </div>

))} ); } ``` ### Stop Handler The `stop` function can be used to stop the object generation process. This can be useful if the user wants to cancel the request or if the server is taking too long to respond. ```tsx filename='app/page.tsx' highlight="6,14-16" 'use client'; import { useObject } from '@ai-sdk/react'; export default function Page() { const { isLoading, stop, object, submit } = useObject({ api: '/api/notifications', schema: notificationSchema, }); return ( <> {isLoading && ( )} {object?.notifications?.map((notification, index) => (

        <div key={index}>
          <p>{notification?.name}</p>
          <p>{notification?.message}</p>
        </div>

))} ); } ``` ### Error State Similarly, the `error` state reflects the error object thrown during the fetch request. It can be used to display an error message, or to disable the submit button: We recommend showing a generic error message to the user, such as "Something went wrong." This is a good practice to avoid leaking information from the server. ```tsx file="app/page.tsx" highlight="6,13" 'use client'; import { useObject } from '@ai-sdk/react'; export default function Page() { const { error, object, submit } = useObject({ api: '/api/notifications', schema: notificationSchema, }); return ( <> {error &&
        <div>An error occurred.</div>}

      {object?.notifications?.map((notification, index) => (
        <div key={index}>
          <p>{notification?.name}</p>
          <p>{notification?.message}</p>
        </div>

))} ); } ``` ## Event Callbacks `useObject` provides optional event callbacks that you can use to handle life-cycle events. - `onFinish`: Called when the object generation is completed. - `onError`: Called when an error occurs during the fetch request. These callbacks can be used to trigger additional actions, such as logging, analytics, or custom UI updates. ```tsx filename='app/page.tsx' highlight="10-20" 'use client'; import { experimental_useObject as useObject } from '@ai-sdk/react'; import { notificationSchema } from './api/notifications/schema'; export default function Page() { const { object, submit } = useObject({ api: '/api/notifications', schema: notificationSchema, onFinish({ object, error }) { // typed object, undefined if schema validation fails: console.log('Object generation completed:', object); // error, undefined if schema validation succeeds: console.log('Schema validation error:', error); }, onError(error) { // error during fetch request: console.error('An error occurred:', error); }, }); return (
    <div>
      <button onClick={() => submit('Messages during finals week.')}>
        Generate notifications
      </button>

      {object?.notifications?.map((notification, index) => (
        <div key={index}>
          <p>{notification?.name}</p>
          <p>{notification?.message}</p>
        </div>
      ))}
    </div>
); } ``` ## Configure Request Options You can configure the API endpoint and optional headers using the `api` and `headers` settings. ```tsx highlight="2-5" const { submit, object } = useObject({ api: '/api/use-object', headers: { 'X-Custom-Header': 'CustomValue', }, schema: yourSchema, }); ``` --- title: OpenAI Assistants description: Learn how to use the useAssistant hook. --- # OpenAI Assistants The `useAssistant` hook allows you to handle the client state when interacting with an OpenAI compatible assistant API. This hook is useful when you want to integrate assistant capabilities into your application, with the UI updated automatically as the assistant is streaming its execution. The `useAssistant` hook is supported in `@ai-sdk/react`, `ai/svelte`, and `ai/vue`. ## Example ```tsx filename='app/page.tsx' 'use client'; import { Message, useAssistant } from '@ai-sdk/react'; export default function Chat() { const { status, messages, input, submitMessage, handleInputChange } = useAssistant({ api: '/api/assistant' }); return (
    <div>
      {messages.map((m: Message) => (
        <div key={m.id}>
          <strong>{`${m.role}: `}</strong>
          {m.role !== 'data' && m.content}
          {m.role === 'data' && (
            <>
              {(m.data as any).description}
              <br />
              <pre>{JSON.stringify(m.data, null, 2)}</pre>
            </>
          )}
        </div>
      ))}

      {status === 'in_progress' && <div>Loading...</div>}

      <form onSubmit={submitMessage}>
        <input
          disabled={status !== 'awaiting_message'}
          value={input}
          onChange={handleInputChange}
        />
      </form>
    </div>
); } ``` ```tsx filename='app/api/assistant/route.ts' import { AssistantResponse } from 'ai'; import OpenAI from 'openai'; const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY || '', }); // Allow streaming responses up to 30 seconds export const maxDuration = 30; export async function POST(req: Request) { // Parse the request body const input: { threadId: string | null; message: string; } = await req.json(); // Create a thread if needed const threadId = input.threadId ?? (await openai.beta.threads.create({})).id; // Add a message to the thread const createdMessage = await openai.beta.threads.messages.create(threadId, { role: 'user', content: input.message, }); return AssistantResponse( { threadId, messageId: createdMessage.id }, async ({ forwardStream, sendDataMessage }) => { // Run the assistant on the thread const runStream = openai.beta.threads.runs.stream(threadId, { assistant_id: process.env.ASSISTANT_ID ?? (() => { throw new Error('ASSISTANT_ID is not set'); })(), }); // forward run status would stream message deltas let runResult = await forwardStream(runStream); // status can be: queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired while ( runResult?.status === 'requires_action' && runResult.required_action?.type === 'submit_tool_outputs' ) { const tool_outputs = runResult.required_action.submit_tool_outputs.tool_calls.map( (toolCall: any) => { const parameters = JSON.parse(toolCall.function.arguments); switch (toolCall.function.name) { // configure your tool calls here default: throw new Error( `Unknown tool call function: ${toolCall.function.name}`, ); } }, ); runResult = await forwardStream( openai.beta.threads.runs.submitToolOutputsStream( threadId, runResult.id, { tool_outputs }, ), ); } }, ); } ``` ## Customized UI `useAssistant` also provides ways to manage the chat message and input states via code and show loading and error states. ### Loading and error states To show a loading spinner while the assistant is running the thread, you can use the `status` state returned by the `useAssistant` hook: ```tsx const { status, ... } = useAssistant() return( <> {status === "in_progress" ? : null} ) ``` Similarly, the `error` state reflects the error object thrown during the fetch request. It can be used to display an error message, or show a toast notification: ```tsx const { error, ... } = useAssistant() useEffect(() => { if (error) { toast.error(error.message) } }, [error]) // Or display the error message in the UI: return ( <> {error ?
<div>{error.message}</div>
: null} ) ``` ### Controlled input In the initial example, we have `handleSubmit` and `handleInputChange` callbacks that manage the input changes and form submissions. These are handy for common use cases, but you can also use uncontrolled APIs for more advanced scenarios such as form validation or customized components. The following example demonstrates how to use more granular APIs like `append` with your custom input and submit button components: ```tsx const { append } = useAssistant(); return ( <> { // Send a new message to the AI provider append({ role: 'user', content: input, }); }} /> ); ``` ## Configure Request Options By default, the `useAssistant` hook sends a HTTP POST request to the `/api/assistant` endpoint with the prompt as part of the request body. You can customize the request by passing additional options to the `useAssistant` hook: ```tsx const { messages, input, handleInputChange, handleSubmit } = useAssistant({ api: '/api/custom-completion', headers: { Authorization: 'your_token', }, body: { user_id: '123', }, credentials: 'same-origin', }); ``` In this example, the `useAssistant` hook sends a POST request to the `/api/custom-completion` endpoint with the specified headers, additional body fields, and credentials for that fetch request. On your server side, you can handle the request with these additional information. --- title: Streaming Custom Data description: Learn how to stream custom data to the client. --- # Streaming Custom Data It is often useful to send additional data alongside the model's response. For example, you may want to send status information, the message ids after storing them, or references to content that the language model is referring to. The AI SDK provides several helpers that allows you to stream additional data to the client and attach it either to the `Message` or to the `data` object of the `useChat` hook: - `createDataStream`: creates a data stream - `createDataStreamResponse`: creates a response object that streams data - `pipeDataStreamToResponse`: pipes a data stream to a server response object The data is streamed as part of the response stream. ## Sending Custom Data from the Server In your server-side route handler, you can use `createDataStreamResponse` and `pipeDataStreamToResponse` in combination with `streamText`. You need to: 1. Call `createDataStreamResponse` or `pipeDataStreamToResponse` to get a callback function with a `DataStreamWriter`. 2. Write to the `DataStreamWriter` to stream additional data. 3. Merge the `streamText` result into the `DataStreamWriter`. 4. Return the response from `createDataStreamResponse` (if that method is used) Here is an example: ```tsx filename="route.ts" highlight="7-10,16,19-23,25-26,30" import { openai } from '@ai-sdk/openai'; import { generateId, createDataStreamResponse, streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); // immediately start streaming (solves RAG issues with status, etc.) return createDataStreamResponse({ execute: dataStream => { dataStream.writeData('initialized call'); const result = streamText({ model: openai('gpt-4o'), messages, onChunk() { dataStream.writeMessageAnnotation({ chunk: '123' }); }, onFinish() { // message annotation: dataStream.writeMessageAnnotation({ id: generateId(), // e.g. 
id from saved DB record other: 'information', }); // call annotation: dataStream.writeData('call completed'); }, }); result.mergeIntoDataStream(dataStream); }, onError: error => { // Error messages are masked by default for security reasons. // If you want to expose the error message to the client, you can do so here: return error instanceof Error ? error.message : String(error); }, }); } ``` You can also send stream data from custom backends, e.g. Python / FastAPI, using the [Data Stream Protocol](/docs/ai-sdk-ui/stream-protocol#data-stream-protocol). ## Sending Custom Sources You can send custom sources to the client using the `writeSource` method on the `DataStreamWriter`: ```tsx filename="route.ts" highlight="9-15" import { openai } from '@ai-sdk/openai'; import { createDataStreamResponse, streamText } from 'ai'; export async function POST(req: Request) { const { messages } = await req.json(); return createDataStreamResponse({ execute: dataStream => { // write a custom url source to the stream: dataStream.writeSource({ sourceType: 'url', id: 'source-1', url: 'https://example.com', title: 'Example Source', }); const result = streamText({ model: openai('gpt-4o'), messages, }); result.mergeIntoDataStream(dataStream); }, }); } ``` ## Processing Custom Data in `useChat` The `useChat` hook automatically processes the streamed data and makes it available to you. ### Accessing Data On the client, you can destructure `data` from the `useChat` hook which stores all `StreamData` as a `JSONValue[]`. ```tsx filename="page.tsx" import { useChat } from '@ai-sdk/react'; const { data } = useChat(); ``` ### Accessing Message Annotations Each message from the `useChat` hook has an optional `annotations` property that contains the message annotations sent from the server. Since the shape of the annotations depends on what you send from the server, you have to destructure them in a type-safe way on the client side. Here we just show the annotations as a JSON string: ```tsx filename="page.tsx" highlight="9" import { Message, useChat } from '@ai-sdk/react'; const { messages } = useChat(); const result = ( <> {messages?.map((m: Message) => (
        <div key={m.id}>
          {m.annotations && <>{JSON.stringify(m.annotations)}</>}
        </div>
))} ); ``` ### Updating and Clearing Data You can update and clear the `data` object of the `useChat` hook using the `setData` function. ```tsx filename="page.tsx" const { setData } = useChat(); // clear existing data setData(undefined); // set new data setData([{ test: 'value' }]); // transform existing data, e.g. adding additional values: setData(currentData => [...currentData, { test: 'value' }]); ``` #### Example: Clear on Submit ```tsx filename="page.tsx" highlight="18-21" 'use client'; import { Message, useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit, data, setData } = useChat(); return ( <> {data &&
        <pre>{JSON.stringify(data, null, 2)}</pre>}

      {messages?.map((m: Message) => (
        <div key={m.id}>{`${m.role}: ${m.content}`}</div>
      ))}

      <form
        onSubmit={e => {
          setData(undefined); // clear stream data
          handleSubmit(e);
        }}
      >
        <input value={input} onChange={handleInputChange} />
      </form>
    </>
); } ``` --- title: Error Handling description: Learn how to handle errors in the AI SDK UI --- # Error Handling ### Error Helper Object Each AI SDK UI hook also returns an [error](/docs/reference/ai-sdk-ui/use-chat#error) object that you can use to render the error in your UI. You can use the error object to show an error message, disable the submit button, or show a retry button. We recommend showing a generic error message to the user, such as "Something went wrong." This is a good practice to avoid leaking information from the server. ```tsx file="app/page.tsx" highlight="7,17-24,30" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { messages, input, handleInputChange, handleSubmit, error, reload } = useChat({}); return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}

      {error && (
        <>
          <div>An error occurred.</div>
          <button type="button" onClick={() => reload()}>
            Retry
          </button>
        </>
      )}

      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
); } ``` #### Alternative: replace last message Alternatively you can write a custom submit handler that replaces the last message when an error is present. ```tsx file="app/page.tsx" highlight="15-21,33" 'use client'; import { useChat } from '@ai-sdk/react'; export default function Chat() { const { handleInputChange, handleSubmit, error, input, messages, setMessages, } = useChat({}); function customSubmit(event: React.FormEvent) { if (error != null) { setMessages(messages.slice(0, -1)); // remove last message } handleSubmit(event); } return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}

      {error && <div>An error occurred.</div>}

      <form onSubmit={customSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
); } ``` ### Error Handling Callback Errors can be processed by passing an [`onError`](/docs/reference/ai-sdk-ui/use-chat#on-error) callback function as an option to the [`useChat`](/docs/reference/ai-sdk-ui/use-chat), [`useCompletion`](/docs/reference/ai-sdk-ui/use-completion) or [`useAssistant`](/docs/reference/ai-sdk-ui/use-assistant) hooks. The callback function receives an error object as an argument. ```tsx file="app/page.tsx" highlight="8-11" import { useChat } from '@ai-sdk/react'; export default function Page() { const { /* ... */ } = useChat({ // handle error: onError: error => { console.error(error); }, }); } ``` ### Injecting Errors for Testing You might want to create errors for testing. You can easily do so by throwing an error in your route handler: ```ts file="app/api/chat/route.ts" export async function POST(req: Request) { throw new Error('This is a test error'); } ``` --- title: AI_APICallError description: Learn how to fix AI_APICallError --- # AI_APICallError This error occurs when an API call fails. ## Properties - `url`: The URL of the API request that failed - `requestBodyValues`: The request body values sent to the API - `statusCode`: The HTTP status code returned by the API - `responseHeaders`: The response headers returned by the API - `responseBody`: The response body returned by the API - `isRetryable`: Whether the request can be retried based on the status code - `data`: Any additional data associated with the error ## Checking for this Error You can check if an error is an instance of `AI_APICallError` using: ```typescript import { APICallError } from 'ai'; if (APICallError.isInstance(error)) { // Handle the error } ``` --- title: AI_DownloadError description: Learn how to fix AI_DownloadError --- # AI_DownloadError This error occurs when a download fails. ## Properties - `url`: The URL that failed to download - `statusCode`: The HTTP status code returned by the server - `statusText`: The HTTP status text returned by the server - `message`: The error message containing details about the download failure ## Checking for this Error You can check if an error is an instance of `AI_DownloadError` using: ```typescript import { DownloadError } from 'ai'; if (DownloadError.isInstance(error)) { // Handle the error } ``` --- title: AI_EmptyResponseBodyError description: Learn how to fix AI_EmptyResponseBodyError --- # AI_EmptyResponseBodyError This error occurs when the server returns an empty response body. ## Properties - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_EmptyResponseBodyError` using: ```typescript import { EmptyResponseBodyError } from 'ai'; if (EmptyResponseBodyError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidArgumentError description: Learn how to fix AI_InvalidArgumentError --- # AI_InvalidArgumentError This error occurs when an invalid argument was provided. ## Properties - `parameter`: The name of the parameter that is invalid - `value`: The invalid value - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_InvalidArgumentError` using: ```typescript import { InvalidArgumentError } from 'ai'; if (InvalidArgumentError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidDataContentError description: How to fix AI_InvalidDataContentError --- # AI_InvalidDataContentError This error occurs when the data content provided in a multi-modal message part is invalid. 
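Valid data content for image and file parts is typically a URL, a `Uint8Array`/`Buffer`, an `ArrayBuffer`, or a base64-encoded string; passing anything else (for example, a plain object) raises this error. A minimal sketch of a valid multi-modal message (the file path is illustrative):

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
import fs from 'node:fs';

const result = await generateText({
  model: openai('gpt-4o'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this image.' },
        // valid data content: URL, Uint8Array/Buffer, ArrayBuffer, or base64 string
        { type: 'image', image: fs.readFileSync('./data/comic-cat.png') },
      ],
    },
  ],
});
```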
Check out the [ prompt examples for multi-modal messages ](/docs/foundations/prompts#message-prompts). ## Properties - `content`: The invalid content value - `message`: The error message describing the expected and received content types ## Checking for this Error You can check if an error is an instance of `AI_InvalidDataContentError` using: ```typescript import { InvalidDataContentError } from 'ai'; if (InvalidDataContentError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidDataContent description: Learn how to fix AI_InvalidDataContent --- # AI_InvalidDataContent This error occurs when invalid data content is provided. ## Properties - `content`: The invalid content value - `message`: The error message - `cause`: The cause of the error ## Checking for this Error You can check if an error is an instance of `AI_InvalidDataContent` using: ```typescript import { InvalidDataContent } from 'ai'; if (InvalidDataContent.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidMessageRoleError description: Learn how to fix AI_InvalidMessageRoleError --- # AI_InvalidMessageRoleError This error occurs when an invalid message role is provided. ## Properties - `role`: The invalid role value - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_InvalidMessageRoleError` using: ```typescript import { InvalidMessageRoleError } from 'ai'; if (InvalidMessageRoleError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidPromptError description: Learn how to fix AI_InvalidPromptError --- # AI_InvalidPromptError This error occurs when the prompt provided is invalid. ## Properties - `prompt`: The invalid prompt value - `message`: The error message - `cause`: The cause of the error ## Checking for this Error You can check if an error is an instance of `AI_InvalidPromptError` using: ```typescript import { InvalidPromptError } from 'ai'; if (InvalidPromptError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidResponseDataError description: Learn how to fix AI_InvalidResponseDataError --- # AI_InvalidResponseDataError This error occurs when the server returns a response with invalid data content. ## Properties - `data`: The invalid response data value - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_InvalidResponseDataError` using: ```typescript import { InvalidResponseDataError } from 'ai'; if (InvalidResponseDataError.isInstance(error)) { // Handle the error } ``` --- title: AI_InvalidToolArgumentsError description: Learn how to fix AI_InvalidToolArgumentsError --- # AI_InvalidToolArgumentsError This error occurs when invalid tool argument was provided. ## Properties - `toolName`: The name of the tool with invalid arguments - `toolArgs`: The invalid tool arguments - `message`: The error message - `cause`: The cause of the error ## Checking for this Error You can check if an error is an instance of `AI_InvalidToolArgumentsError` using: ```typescript import { InvalidToolArgumentsError } from 'ai'; if (InvalidToolArgumentsError.isInstance(error)) { // Handle the error } ``` --- title: AI_JSONParseError description: Learn how to fix AI_JSONParseError --- # AI_JSONParseError This error occurs when JSON fails to parse. 
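During object generation, this error often appears as the `cause` of an `AI_NoObjectGeneratedError`, with the unparsable model output preserved on its `text` property. A small sketch, assuming a `generateObject` call as elsewhere in this guide:

```ts
import { generateObject, JSONParseError, NoObjectGeneratedError } from 'ai';

try {
  await generateObject({ model, schema, prompt });
} catch (error) {
  if (
    NoObjectGeneratedError.isInstance(error) &&
    JSONParseError.isInstance(error.cause)
  ) {
    // the raw text that failed to parse:
    console.error('Malformed JSON from the model:', error.cause.text);
  }
}
```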
## Properties - `text`: The text value that could not be parsed - `message`: The error message including parse error details ## Checking for this Error You can check if an error is an instance of `AI_JSONParseError` using: ```typescript import { JSONParseError } from 'ai'; if (JSONParseError.isInstance(error)) { // Handle the error } ``` --- title: AI_LoadAPIKeyError description: Learn how to fix AI_LoadAPIKeyError --- # AI_LoadAPIKeyError This error occurs when API key is not loaded successfully. ## Properties - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_LoadAPIKeyError` using: ```typescript import { LoadAPIKeyError } from 'ai'; if (LoadAPIKeyError.isInstance(error)) { // Handle the error } ``` --- title: AI_LoadSettingError description: Learn how to fix AI_LoadSettingError --- # AI_LoadSettingError This error occurs when a setting is not loaded successfully. ## Properties - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_LoadSettingError` using: ```typescript import { LoadSettingError } from 'ai'; if (LoadSettingError.isInstance(error)) { // Handle the error } ``` --- title: AI_MessageConversionError description: Learn how to fix AI_MessageConversionError --- # AI_MessageConversionError This error occurs when message conversion fails. ## Properties - `originalMessage`: The original message that failed conversion - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_MessageConversionError` using: ```typescript import { MessageConversionError } from 'ai'; if (MessageConversionError.isInstance(error)) { // Handle the error } ``` --- title: AI_NoContentGeneratedError description: Learn how to fix AI_NoContentGeneratedError --- # AI_NoContentGeneratedError This error occurs when the AI provider fails to generate content. ## Properties - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_NoContentGeneratedError` using: ```typescript import { NoContentGeneratedError } from 'ai'; if (NoContentGeneratedError.isInstance(error)) { // Handle the error } ``` --- title: AI_NoImageGeneratedError description: Learn how to fix AI_NoImageGeneratedError --- # AI_NoImageGeneratedError This error occurs when the AI provider fails to generate an image. It can arise due to the following reasons: - The model failed to generate a response. - The model generated an invalid response. ## Properties - `message`: The error message. - `responses`: Metadata about the image model responses, including timestamp, model, and headers. - `cause`: The cause of the error. You can use this for more detailed error handling. ## Checking for this Error You can check if an error is an instance of `AI_NoImageGeneratedError` using: ```typescript import { generateImage, NoImageGeneratedError } from 'ai'; try { await generateImage({ model, prompt }); } catch (error) { if (NoImageGeneratedError.isInstance(error)) { console.log('NoImageGeneratedError'); console.log('Cause:', error.cause); console.log('Responses:', error.responses); } } ``` --- title: AI_NoObjectGeneratedError description: Learn how to fix AI_NoObjectGeneratedError --- # AI_NoObjectGeneratedError This error occurs when the AI provider fails to generate a parsable object that conforms to the schema. It can arise due to the following reasons: - The model failed to generate a response. - The model generated a response that could not be parsed. 
- The model generated a response that could not be validated against the schema. ## Properties - `message`: The error message. - `text`: The text that was generated by the model. This can be the raw text or the tool call text, depending on the object generation mode. - `response`: Metadata about the language model response, including response id, timestamp, and model. - `usage`: Request token usage. - `cause`: The cause of the error (e.g. a JSON parsing error). You can use this for more detailed error handling. ## Checking for this Error You can check if an error is an instance of `AI_NoObjectGeneratedError` using: ```typescript import { generateObject, NoObjectGeneratedError } from 'ai'; try { await generateObject({ model, schema, prompt }); } catch (error) { if (NoObjectGeneratedError.isInstance(error)) { console.log('NoObjectGeneratedError'); console.log('Cause:', error.cause); console.log('Text:', error.text); console.log('Response:', error.response); console.log('Usage:', error.usage); } } ``` --- title: AI_NoOutputSpecifiedError description: Learn how to fix AI_NoOutputSpecifiedError --- # AI_NoOutputSpecifiedError This error occurs when no output format was specified for the AI response, and output-related methods are called. ## Properties - `message`: The error message (defaults to 'No output specified.') ## Checking for this Error You can check if an error is an instance of `AI_NoOutputSpecifiedError` using: ```typescript import { NoOutputSpecifiedError } from 'ai'; if (NoOutputSpecifiedError.isInstance(error)) { // Handle the error } ``` --- title: AI_NoSuchModelError description: Learn how to fix AI_NoSuchModelError --- # AI_NoSuchModelError This error occurs when a model ID is not found. ## Properties - `modelId`: The ID of the model that was not found - `modelType`: The type of model - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_NoSuchModelError` using: ```typescript import { NoSuchModelError } from 'ai'; if (NoSuchModelError.isInstance(error)) { // Handle the error } ``` --- title: AI_NoSuchProviderError description: Learn how to fix AI_NoSuchProviderError --- # AI_NoSuchProviderError This error occurs when a provider ID is not found. ## Properties - `providerId`: The ID of the provider that was not found - `availableProviders`: Array of available provider IDs - `modelId`: The ID of the model - `modelType`: The type of model - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_NoSuchProviderError` using: ```typescript import { NoSuchProviderError } from 'ai'; if (NoSuchProviderError.isInstance(error)) { // Handle the error } ``` --- title: AI_NoSuchToolError description: Learn how to fix AI_NoSuchToolError --- # AI_NoSuchToolError This error occurs when a model tries to call an unavailable tool. ## Properties - `toolName`: The name of the tool that was not found - `availableTools`: Array of available tool names - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_NoSuchToolError` using: ```typescript import { NoSuchToolError } from 'ai'; if (NoSuchToolError.isInstance(error)) { // Handle the error } ``` --- title: AI_RetryError description: Learn how to fix AI_RetryError --- # AI_RetryError This error occurs when a retry operation fails. 
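Retries are controlled by the `maxRetries` setting on the generation functions; once every attempt has failed, the thrown `RetryError` collects the individual failures. A minimal sketch using `generateText`:

```ts
import { openai } from '@ai-sdk/openai';
import { generateText, RetryError } from 'ai';

try {
  await generateText({
    model: openai('gpt-4o'),
    prompt: 'Invent a new holiday and describe its traditions.',
    maxRetries: 2, // initial call plus up to 2 retries
  });
} catch (error) {
  if (RetryError.isInstance(error)) {
    console.error('Reason:', error.reason);
    console.error('All errors:', error.errors);
  }
}
```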
## Properties - `reason`: The reason for the retry failure - `lastError`: The most recent error that occurred during retries - `errors`: Array of all errors that occurred during retry attempts - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_RetryError` using: ```typescript import { RetryError } from 'ai'; if (RetryError.isInstance(error)) { // Handle the error } ``` --- title: AI_TooManyEmbeddingValuesForCallError description: Learn how to fix AI_TooManyEmbeddingValuesForCallError --- # AI_TooManyEmbeddingValuesForCallError This error occurs when too many values are provided in a single embedding call. ## Properties - `provider`: The AI provider name - `modelId`: The ID of the embedding model - `maxEmbeddingsPerCall`: The maximum number of embeddings allowed per call - `values`: The array of values that was provided ## Checking for this Error You can check if an error is an instance of `AI_TooManyEmbeddingValuesForCallError` using: ```typescript import { TooManyEmbeddingValuesForCallError } from 'ai'; if (TooManyEmbeddingValuesForCallError.isInstance(error)) { // Handle the error } ``` --- title: ToolCallRepairError description: Learn how to fix AI SDK ToolCallRepairError --- # ToolCallRepairError This error occurs when there is a failure while attempting to repair an invalid tool call. This typically happens when the AI attempts to fix either a `NoSuchToolError` or `InvalidToolArgumentsError`. ## Properties - `originalError`: The original error that triggered the repair attempt (either `NoSuchToolError` or `InvalidToolArgumentsError`) - `message`: The error message - `cause`: The underlying error that caused the repair to fail ## Checking for this Error You can check if an error is an instance of `ToolCallRepairError` using: ```typescript import { ToolCallRepairError } from 'ai'; if (ToolCallRepairError.isInstance(error)) { // Handle the error } ``` --- title: AI_ToolExecutionError description: Learn how to fix AI_ToolExecutionError --- # AI_ToolExecutionError This error occurs when there is a failure during the execution of a tool. ## Properties - `toolName`: The name of the tool that failed - `toolArgs`: The arguments passed to the tool - `toolCallId`: The ID of the tool call that failed - `message`: The error message - `cause`: The underlying error that caused the tool execution to fail ## Checking for this Error You can check if an error is an instance of `AI_ToolExecutionError` using: ```typescript import { ToolExecutionError } from 'ai'; if (ToolExecutionError.isInstance(error)) { // Handle the error } ``` --- title: AI_TypeValidationError description: Learn how to fix AI_TypeValidationError --- # AI_TypeValidationError This error occurs when type validation fails. ## Properties - `value`: The value that failed validation - `message`: The error message including validation details ## Checking for this Error You can check if an error is an instance of `AI_TypeValidationError` using: ```typescript import { TypeValidationError } from 'ai'; if (TypeValidationError.isInstance(error)) { // Handle the error } ``` --- title: AI_UnsupportedFunctionalityError description: Learn how to fix AI_UnsupportedFunctionalityError --- # AI_UnsupportedFunctionalityError This error occurs when functionality is not supported.
## Properties - `functionality`: The name of the unsupported functionality - `message`: The error message ## Checking for this Error You can check if an error is an instance of `AI_UnsupportedFunctionalityError` using: ```typescript import { UnsupportedFunctionalityError } from 'ai'; if (UnsupportedFunctionalityError.isInstance(error)) { // Handle the error } ``` --- title: OpenAI description: Learn how to use the OpenAI provider for the AI SDK. --- # OpenAI Provider The [OpenAI](https://openai.com/) provider contains language model support for the OpenAI responses, chat, and completion APIs, as well as embedding model support for the OpenAI embeddings API. ## Setup The OpenAI provider is available in the `@ai-sdk/openai` module. You can install it with ## Provider Instance You can import the default provider instance `openai` from `@ai-sdk/openai`: ```ts import { openai } from '@ai-sdk/openai'; ``` If you need a customized setup, you can import `createOpenAI` from `@ai-sdk/openai` and create a provider instance with your settings: ```ts import { createOpenAI } from '@ai-sdk/openai'; const openai = createOpenAI({ // custom settings, e.g. compatibility: 'strict', // strict mode, enable when using the OpenAI API }); ``` You can use the following optional settings to customize the OpenAI provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.openai.com/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `OPENAI_API_KEY` environment variable. - **name** _string_ The provider name. You can set this when using OpenAI compatible providers to change the model provider property. Defaults to `openai`. - **organization** _string_ OpenAI Organization. - **project** _string_ OpenAI project. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. - **compatibility** _"strict" | "compatible"_ OpenAI compatibility mode. Should be set to `strict` when using the OpenAI API, and `compatible` when using 3rd party providers. In `compatible` mode, newer information such as `streamOptions` are not being sent, resulting in `NaN` token usage. Defaults to 'compatible'. ## Language Models The OpenAI provider instance is a function that you can invoke to create a language model: ```ts const model = openai('gpt-4-turbo'); ``` It automatically selects the correct API based on the model id. You can also pass additional settings in the second argument: ```ts const model = openai('gpt-4-turbo', { // additional settings }); ``` The available options depend on the API that's automatically chosen for the model (see below). If you want to explicitly select a specific model API, you can use `.chat` or `.completion`. 
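For example, both of the following are valid and bypass the automatic API selection:

```ts
// explicitly use the chat API:
const chatModel = openai.chat('gpt-3.5-turbo');

// explicitly use the completion API:
const completionModel = openai.completion('gpt-3.5-turbo-instruct');
```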
### Example You can use OpenAI language models to generate text with the `generateText` function: ```ts import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const { text } = await generateText({ model: openai('gpt-4-turbo'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` OpenAI language models can also be used in the `streamText`, `generateObject`, `streamObject`, and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### Chat Models You can create models that call the [OpenAI chat API](https://platform.openai.com/docs/api-reference/chat) using the `.chat()` factory method. The first argument is the model id, e.g. `gpt-4`. The OpenAI chat models support tool calls and some have multi-modal capabilities. ```ts const model = openai.chat('gpt-3.5-turbo'); ``` OpenAI chat models support also some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = openai.chat('gpt-3.5-turbo', { logitBias: { // optional likelihood for specific tokens '50256': -100, }, user: 'test-user', // optional unique user identifier }); ``` The following optional settings are available for OpenAI chat models: - **logitBias** _Record<number, number>_ Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the token from being generated. - **logprobs** _boolean | number_ Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated. - **parallelToolCalls** _boolean_ Whether to enable parallel function calling during tool use. Defaults to `true`. - **useLegacyFunctionCalls** _boolean_ Whether to use legacy function calling. Defaults to false. Required by some open source inference engines which do not support the `tools` API. May also provide a workaround for `parallelToolCalls` resulting in the provider buffering tool calls, which causes `streamObject` to be non-streaming. Prefer setting `parallelToolCalls: false` over this option. - **structuredOutputs** _boolean_ Whether to use [structured outputs](#structured-outputs). Defaults to `false` for normal models, and `true` for reasoning models. When enabled, tool calls and object generation will be strict and follow the provided schema. - **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids). - **downloadImages** _boolean_ Automatically download images and pass the image as data to the model. 
OpenAI supports image URLs for public models, so this is only needed for private models or when the images are not publicly accessible. Defaults to `false`. - **simulateStreaming** _boolean_ Simulates streaming by using a normal generate call and returning it as a stream. Enable this if the model that you are using does not support streaming. Defaults to `false`. - **reasoningEffort** _'low' | 'medium' | 'high'_ Reasoning effort for reasoning models. Defaults to `medium`. If you use `providerOptions` to set the `reasoningEffort` option, this model setting will be ignored. #### Reasoning OpenAI has introduced the `o1` and `o3` series of [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently, `o3-mini`, `o1`, `o1-mini`, and `o1-preview` are available. Reasoning models currently only generate text, have several limitations, and are only supported using `generateText` and `streamText`. They support additional settings and response metadata: - You can use `providerOptions` to set - the `reasoningEffort` option (or alternatively the `reasoningEffort` model setting), which determines the amount of reasoning the model performs. - You can use response `providerMetadata` to access the number of reasoning tokens that the model generated. ```ts highlight="4,7-11,17" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const { text, usage, providerMetadata } = await generateText({ model: openai('o3-mini'), prompt: 'Invent a new holiday and describe its traditions.', providerOptions: { openai: { reasoningEffort: 'low', }, }, }); console.log(text); console.log('Usage:', { ...usage, reasoningTokens: providerMetadata?.openai?.reasoningTokens, }); ``` System messages are automatically converted to OpenAI developer messages for reasoning models when supported. For models that do not support developer messages, such as `o1-preview`, system messages are removed and a warning is added. Reasoning models like `o1-mini` and `o1-preview` require additional runtime inference to complete their reasoning phase before generating a response. This introduces longer latency compared to other models, with `o1-preview` exhibiting significantly more inference time than `o1-mini`. `maxTokens` is automatically mapped to `max_completion_tokens` for reasoning models. #### Structured Outputs You can enable [OpenAI structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) by setting the `structuredOutputs` option to `true`. Structured outputs are a form of grammar-guided generation. The JSON schema is used as a grammar and the outputs will always conform to the schema. ```ts highlight="7" import { openai } from '@ai-sdk/openai'; import { generateObject } from 'ai'; import { z } from 'zod'; const result = await generateObject({ model: openai('gpt-4o-2024-08-06', { structuredOutputs: true, }), schemaName: 'recipe', schemaDescription: 'A recipe for lasagna.', schema: z.object({ name: z.string(), ingredients: z.array( z.object({ name: z.string(), amount: z.string(), }), ), steps: z.array(z.string()), }), prompt: 'Generate a lasagna recipe.', }); console.log(JSON.stringify(result.object, null, 2)); ``` OpenAI structured outputs have several [limitations](https://openai.com/index/introducing-structured-outputs-in-the-api), in particular around the [supported schemas](https://platform.openai.com/docs/guides/structured-outputs/supported-schemas), and are therefore opt-in. For example, optional schema properties are not supported. 
You need to change Zod `.nullish()` and `.optional()` to `.nullable()`. #### Predicted Outputs OpenAI supports [predicted outputs](https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs) for `gpt-4o` and `gpt-4o-mini`. Predicted outputs help you reduce latency by allowing you to specify a base text that the model should modify. You can enable predicted outputs by adding the `prediction` option to the `providerOptions.openai` object: ```ts highlight="15-18" const result = streamText({ model: openai('gpt-4o'), messages: [ { role: 'user', content: 'Replace the Username property with an Email property.', }, { role: 'user', content: existingCode, }, ], providerOptions: { openai: { prediction: { type: 'content', content: existingCode, }, }, }, }); ``` OpenAI provides usage information for predicted outputs (`acceptedPredictionTokens` and `rejectedPredictionTokens`). You can access it in the `providerMetadata` object. ```ts highlight="11" const openaiMetadata = (await result.providerMetadata)?.openai; const acceptedPredictionTokens = openaiMetadata?.acceptedPredictionTokens; const rejectedPredictionTokens = openaiMetadata?.rejectedPredictionTokens; ``` OpenAI Predicted Outputs have several [limitations](https://platform.openai.com/docs/guides/predicted-outputs#limitations), e.g. unsupported API parameters and no tool calling support. #### Image Detail You can use the `openai` provider option to set the [image generation detail](https://platform.openai.com/docs/guides/vision/low-or-high-fidelity-image-understanding) to `high`, `low`, or `auto`: ```ts highlight="13-16" const result = await generateText({ model: openai('gpt-4o'), messages: [ { role: 'user', content: [ { type: 'text', text: 'Describe the image in detail.' }, { type: 'image', image: 'https://github.com/vercel/ai/blob/main/examples/ai-core/data/comic-cat.png?raw=true', // OpenAI specific options - image detail: providerOptions: { openai: { imageDetail: 'low' }, }, }, ], }, ], }); ``` #### Distillation OpenAI supports model distillation for some models. If you want to store a generation for use in the distillation process, you can add the `store` option to the `providerOptions.openai` object. This will save the generation to the OpenAI platform for later use in distillation. ```typescript highlight="9-16" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; import 'dotenv/config'; async function main() { const { text, usage } = await generateText({ model: openai('gpt-4o-mini'), prompt: 'Who worked on the original macintosh?', providerOptions: { openai: { store: true, metadata: { custom: 'value', }, }, }, }); console.log(text); console.log(); console.log('Usage:', usage); } main().catch(console.error); ``` #### Prompt Caching OpenAI has introduced [Prompt Caching](https://platform.openai.com/docs/guides/prompt-caching) for supported models including `gpt-4o`, `gpt-4o-mini`, `o1-preview`, and `o1-mini`. - Prompt caching is automatically enabled for these models, when the prompt is 1024 tokens or longer. It does not need to be explicitly enabled. - You can use response `providerMetadata` to access the number of prompt tokens that were a cache hit. - Note that caching behavior is dependent on load on OpenAI's infrastructure. Prompt prefixes generally remain in the cache following 5-10 minutes of inactivity before they are evicted, but during off-peak periods they may persist for up to an hour. 
```ts highlight="11" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const { text, usage, providerMetadata } = await generateText({ model: openai('gpt-4o-mini'), prompt: `A 1024-token or longer prompt...`, }); console.log(`usage:`, { ...usage, cachedPromptTokens: providerMetadata?.openai?.cachedPromptTokens, }); ``` #### Audio Input With the `gpt-4o-audio-preview` model, you can pass audio files to the model. The `gpt-4o-audio-preview` model is currently in preview and requires at least some audio inputs. It will not work with non-audio data. ```ts highlight="12-14" import { openai } from '@ai-sdk/openai'; import { generateText } from 'ai'; const result = await generateText({ model: openai('gpt-4o-audio-preview'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is the audio saying?' }, { type: 'file', mimeType: 'audio/mpeg', data: fs.readFileSync('./data/galileo.mp3'), }, ], }, ], }); ``` ### Responses Models You can use the OpenAI responses API with the `openai.responses(modelId)` factory method. ```ts const model = openai.responses('gpt-4o-mini'); ``` Further configuration can be done using OpenAI provider options: ```ts const result = await generateText({ model: openai.responses('gpt-4o-mini'), providerOptions: { openai: { parallelToolCalls: false, store: false, user: 'user_123', // ... }, }, // ... }); ``` The following provider options are available: - **parallelToolCalls** _boolean_ Whether to use parallel tool calls. Defaults to `true`. - **store** _boolean_ Whether to store the generation. Defaults to `true`. - **metadata** _Record<string, string>_ Additional metadata to store with the generation. - **previousResponseId** _string_ The ID of the previous response. You can use it to continue a conversation. Defaults to `undefined`. - **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Defaults to `undefined`. - **reasoningEffort** _'low' | 'medium' | 'high'_ Reasoning effort for reasoning models. Defaults to `medium`. If you use `providerOptions` to set the `reasoningEffort` option, this model setting will be ignored. - **strictSchemas** _boolean_ Whether to use strict JSON schemas in tools and when generating JSON outputs. Defaults to `true`. The OpenAI responses provider also returns provider-specific metadata: ```ts const { providerMetadata } = await generateText({ model: openai.responses('gpt-4o-mini'), }); const openaiMetadata = providerMetadata?.openai; ``` The following OpenAI-specific metadata is returned: - **responseId** _string_ The ID of the response. Can be used to continue a conversation. - **cachedPromptTokens** _number_ The number of prompt tokens that were a cache hit. - **reasoningTokens** _number_ The number of reasoning tokens that the model generated. #### Web Search The OpenAI responses provider supports web search through the `openai.tools.webSearchPreview` tool. ```ts const result = await generateText({ model: openai.responses('gpt-4o-mini'), prompt: 'What happened in San Francisco last week?', tools: { web_search_preview: openai.tools.webSearchPreview({ // optional configuration: searchContextSize: 'high', userLocation: { type: 'approximate', city: 'San Francisco', region: 'California', }, }), }, }); // URL sources const sources = result.sources; ``` #### PDF support The OpenAI Responses API supports reading PDF files. 
You can pass PDF files as part of the message content using the `file` type: ```ts const result = await generateText({ model: openai.responses('gpt-4o'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is an embedding model?', }, { type: 'file', data: fs.readFileSync('./data/ai.pdf'), mimeType: 'application/pdf', filename: 'ai.pdf', // optional }, ], }, ], }); ``` The model will have access to the contents of the PDF file and respond to questions about it. The PDF file should be passed using the `data` field, and the `mimeType` should be set to `'application/pdf'`. ### Completion Models You can create models that call the [OpenAI completions API](https://platform.openai.com/docs/api-reference/completions) using the `.completion()` factory method. The first argument is the model id. Currently only `gpt-3.5-turbo-instruct` is supported. ```ts const model = openai.completion('gpt-3.5-turbo-instruct'); ``` OpenAI completion models support also some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = openai.completion('gpt-3.5-turbo-instruct', { echo: true, // optional, echo the prompt in addition to the completion logitBias: { // optional likelihood for specific tokens '50256': -100, }, suffix: 'some text', // optional suffix that comes after a completion of inserted text user: 'test-user', // optional unique user identifier }); ``` The following optional settings are available for OpenAI completion models: - **echo**: _boolean_ Echo back the prompt in addition to the completion. - **logitBias** _Record<number, number>_ Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated. - **logprobs** _boolean | number_ Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated. - **suffix** _string_ The suffix that comes after a completion of inserted text. - **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids). ### Model Capabilities | Model | Image Input | Audio Input | Object Generation | Tool Usage | | ---------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `gpt-4o` | | | | | | `gpt-4o-mini` | | | | | | `gpt-4o-audio-preview` | | | | | | `gpt-4-turbo` | | | | | | `gpt-4` | | | | | | `gpt-3.5-turbo` | | | | | | `o1` | | | | | | `o1-mini` | | | | | | `o1-preview` | | | | | | `o3-mini` | | | | | The table above lists popular models. 
Please see the [OpenAI docs](https://platform.openai.com/docs/models) for a full list of available models. The table above lists popular models. You can also pass any available provider model ID as a string if needed. ## Embedding Models You can create models that call the [OpenAI embeddings API](https://platform.openai.com/docs/api-reference/embeddings) using the `.embedding()` factory method. ```ts const model = openai.embedding('text-embedding-3-large'); ``` OpenAI embedding models support several additional settings. You can pass them as an options argument: ```ts const model = openai.embedding('text-embedding-3-large', { dimensions: 512 // optional, number of dimensions for the embedding user: 'test-user' // optional unique user identifier }) ``` The following optional settings are available for OpenAI embedding models: - **dimensions**: _number_ The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. - **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids). ### Model Capabilities | Model | Default Dimensions | Custom Dimensions | | ------------------------ | ------------------ | ------------------- | | `text-embedding-3-large` | 3072 | | | `text-embedding-3-small` | 1536 | | | `text-embedding-ada-002` | 1536 | | ## Image Models You can create models that call the [OpenAI image generation API](https://platform.openai.com/docs/api-reference/images) using the `.image()` factory method. ```ts const model = openai.image('dall-e-3'); ``` Dall-E models do not support the `aspectRatio` parameter. Use the `size` parameter instead. ### Model Capabilities | Model | Sizes | | ---------- | ------------------------------- | | `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 | | `dall-e-2` | 256x256, 512x512, 1024x1024 | --- title: Azure OpenAI description: Learn how to use the Azure OpenAI provider for the AI SDK. --- # Azure OpenAI Provider The [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) provider contains language model support for the Azure OpenAI chat API. ## Setup The Azure OpenAI provider is available in the `@ai-sdk/azure` module. You can install it with ## Provider Instance You can import the default provider instance `azure` from `@ai-sdk/azure`: ```ts import { azure } from '@ai-sdk/azure'; ``` If you need a customized setup, you can import `createAzure` from `@ai-sdk/azure` and create a provider instance with your settings: ```ts import { createAzure } from '@ai-sdk/azure'; const azure = createAzure({ resourceName: 'your-resource-name', // Azure resource name apiKey: 'your-api-key', }); ``` You can use the following optional settings to customize the OpenAI provider instance: - **resourceName** _string_ Azure resource name. It defaults to the `AZURE_RESOURCE_NAME` environment variable. The resource name is used in the assembled URL: `https://{resourceName}.openai.azure.com/openai/deployments/{modelId}{path}`. You can use `baseURL` instead to specify the URL prefix. - **apiKey** _string_ API key that is being sent using the `api-key` header. It defaults to the `AZURE_API_KEY` environment variable. - **apiVersion** _string_ Sets a custom [api version](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation). Defaults to `2024-10-01-preview`. - **baseURL** _string_ Use a different URL prefix for API calls, e.g. 
to use proxy servers. Either this or `resourceName` can be used. When a baseURL is provided, the resourceName is ignored. With a baseURL, the resolved URL is `{baseURL}/{modelId}{path}`.

- **headers** _Record<string,string>_ Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.

## Language Models

The Azure OpenAI provider instance is a function that you can invoke to create a language model:

```ts
const model = azure('your-deployment-name');
```

You need to pass your deployment name as the first argument.

### Reasoning Models

Azure exposes the thinking of `DeepSeek-R1` in the generated text using the `<think>` tag. You can use the `extractReasoningMiddleware` to extract this reasoning and expose it as a `reasoning` property on the result:

```ts
import { azure } from '@ai-sdk/azure';
import { wrapLanguageModel, extractReasoningMiddleware } from 'ai';

const enhancedModel = wrapLanguageModel({
  model: azure('your-deepseek-r1-deployment-name'),
  middleware: extractReasoningMiddleware({ tagName: 'think' }),
});
```

You can then use that enhanced model in functions like `generateText` and `streamText`.

### Example

You can use Azure OpenAI language models to generate text with the `generateText` function:

```ts
import { azure } from '@ai-sdk/azure';
import { generateText } from 'ai';

const { text } = await generateText({
  model: azure('your-deployment-name'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```

Azure OpenAI language models can also be used in the `streamText`, `generateObject`, `streamObject`, and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)).

Azure OpenAI sends larger chunks than OpenAI. This can lead to the perception that the response is slower. See [Troubleshooting: Azure OpenAI Slow To Stream](/docs/troubleshooting/common-issues/azure-stream-slow).

### Chat Models

The URL for calling Azure chat models will be constructed as follows: `https://RESOURCE_NAME.openai.azure.com/openai/deployments/DEPLOYMENT_NAME/chat/completions?api-version=API_VERSION`

Azure OpenAI chat models also support some model-specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument:

```ts
const model = azure('your-deployment-name', {
  logitBias: {
    // optional likelihood for specific tokens
    '50256': -100,
  },
  user: 'test-user', // optional unique user identifier
});
```

The following optional settings are available for Azure OpenAI chat models:

- **logitBias** _Record<number, number>_ Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the `<|endoftext|>` token from being generated.
- **logprobs** _boolean | number_ Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.
- **parallelToolCalls** _boolean_ Whether to enable parallel function calling during tool use. Defaults to `true`.
- **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids).

### Completion Models

You can create models that call the completions API using the `.completion()` factory method. The first argument is the model id. Currently only `gpt-35-turbo-instruct` is supported.

```ts
const model = azure.completion('your-gpt-35-turbo-instruct-deployment');
```

Azure OpenAI completion models also support some model-specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument:

```ts
const model = azure.completion('your-gpt-35-turbo-instruct-deployment', {
  echo: true, // optional, echo the prompt in addition to the completion
  logitBias: {
    // optional likelihood for specific tokens
    '50256': -100,
  },
  suffix: 'some text', // optional suffix that comes after a completion of inserted text
  user: 'test-user', // optional unique user identifier
});
```

The following optional settings are available for Azure OpenAI completion models:

- **echo**: _boolean_ Echo back the prompt in addition to the completion.
- **logitBias** _Record<number, number>_ Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the `<|endoftext|>` token from being generated.
- **logprobs** _boolean | number_ Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.
- **suffix** _string_ The suffix that comes after a completion of inserted text.
- **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids).

## Embedding Models

You can create models that call the Azure OpenAI embeddings API using the `.embedding()` factory method.

```ts
const model = azure.embedding('your-embedding-deployment');
```

Azure OpenAI embedding models support several additional settings.
You can pass them as an options argument: ```ts const model = azure.embedding('your-embedding-deployment', { dimensions: 512 // optional, number of dimensions for the embedding user: 'test-user' // optional unique user identifier }) ``` The following optional settings are available for Azure OpenAI embedding models: - **dimensions**: _number_ The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. - **user** _string_ A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Learn more. ## Image Models You can create models that call the Azure OpenAI image generation API (DALL-E) using the `.imageModel()` factory method. The first argument is your deployment name for the DALL-E model. ```ts const model = azure.imageModel('your-dalle-deployment-name'); ``` Azure OpenAI image models support several additional settings. You can pass them as an options argument: ```ts const model = azure.imageModel('your-dalle-deployment-name', { user: 'test-user', // optional unique user identifier responseFormat: 'url', // 'url' or 'b64_json', defaults to 'url' }); ``` ### Example You can use Azure OpenAI image models to generate images with the `generateImage` function: ```ts import { azure } from '@ai-sdk/azure'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: azure.imageModel('your-dalle-deployment-name'), prompt: 'A photorealistic image of a cat astronaut floating in space', size: '1024x1024', // '1024x1024', '1792x1024', or '1024x1792' for DALL-E 3 }); // image contains the URL or base64 data of the generated image console.log(image); ``` ### Model Capabilities Azure OpenAI supports DALL-E 2 and DALL-E 3 models through deployments. The capabilities depend on which model version your deployment is using: | Model Version | Sizes | | ------------- | ------------------------------- | | DALL-E 3 | 1024x1024, 1792x1024, 1024x1792 | | DALL-E 2 | 256x256, 512x512, 1024x1024 | DALL-E models do not support the `aspectRatio` parameter. Use the `size` parameter instead. When creating your Azure OpenAI deployment, make sure to set the DALL-E model version you want to use. --- title: Anthropic description: Learn how to use the Anthropic provider for the AI SDK. --- # Anthropic Provider The [Anthropic](https://www.anthropic.com/) provider contains language model support for the [Anthropic Messages API](https://docs.anthropic.com/claude/reference/messages_post). ## Setup The Anthropic provider is available in the `@ai-sdk/anthropic` module. You can install it with ## Provider Instance You can import the default provider instance `anthropic` from `@ai-sdk/anthropic`: ```ts import { anthropic } from '@ai-sdk/anthropic'; ``` If you need a customized setup, you can import `createAnthropic` from `@ai-sdk/anthropic` and create a provider instance with your settings: ```ts import { createAnthropic } from '@ai-sdk/anthropic'; const anthropic = createAnthropic({ // custom settings }); ``` You can use the following optional settings to customize the Anthropic provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.anthropic.com/v1`. - **apiKey** _string_ API key that is being sent using the `x-api-key` header. It defaults to the `ANTHROPIC_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. 
- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create models that call the [Anthropic Messages API](https://docs.anthropic.com/claude/reference/messages_post) using the provider instance. The first argument is the model id, e.g. `claude-3-haiku-20240307`. Some models have multi-modal capabilities. ```ts const model = anthropic('claude-3-haiku-20240307'); ``` You can use Anthropic language models to generate text with the `generateText` function: ```ts import { anthropic } from '@ai-sdk/anthropic'; import { generateText } from 'ai'; const { text } = await generateText({ model: anthropic('claude-3-haiku-20240307'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Anthropic language models can also be used in the `streamText`, `generateObject`, and `streamObject` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). The Anthropic API returns streaming tool calls all at once after a delay. This causes the `streamObject` function to generate the object fully after a delay instead of streaming it incrementally. The following optional settings are available for Anthropic models: - `sendReasoning` _boolean_ Optional. Include reasoning content in requests sent to the model. Defaults to `true`. If you are experiencing issues with the model handling requests involving reasoning content, you can set this to `false` to omit them from the request. ### Reasoning Anthropic has reasoning support for the `claude-3-7-sonnet-20250219` model. You can enable it using the `thinking` provider option and specifying a thinking budget in tokens. ```ts import { anthropic } from '@ai-sdk/anthropic'; import { generateText } from 'ai'; const { text, reasoning, reasoningDetails } = await generateText({ model: anthropic('claude-3-7-sonnet-20250219'), prompt: 'How many people will live in the world in 2040?', providerOptions: { anthropic: { thinking: { type: 'enabled', budgetTokens: 12000 }, }, }, }); console.log(reasoning); // reasoning text console.log(reasoningDetails); // reasoning details including redacted reasoning console.log(text); // text response ``` See [AI SDK UI: Chatbot](/docs/ai-sdk-ui/chatbot#reasoning) for more details on how to integrate reasoning into your chatbot. ### Cache Control Anthropic cache control was originally a beta feature and required passing an opt-in `cacheControl` setting when creating the model instance. It is now Generally Available and enabled by default. The `cacheControl` setting is no longer needed and will be removed in a future release. In the messages and message parts, you can use the `providerOptions` property to set cache control breakpoints. You need to set the `anthropic` property in the `providerOptions` object to `{ cacheControl: { type: 'ephemeral' } }` to set a cache control breakpoint. The cache creation input tokens are then returned in the `providerMetadata` object for `generateText` and `generateObject`, again under the `anthropic` property. When you use `streamText` or `streamObject`, the response contains a promise that resolves to the metadata. Alternatively you can receive it in the `onFinish` callback. 
```ts highlight="8,18-20,29-30" import { anthropic } from '@ai-sdk/anthropic'; import { generateText } from 'ai'; const errorMessage = '... long error message ...'; const result = await generateText({ model: anthropic('claude-3-5-sonnet-20240620'), messages: [ { role: 'user', content: [ { type: 'text', text: 'You are a JavaScript expert.' }, { type: 'text', text: `Error message: ${errorMessage}`, providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } }, }, }, { type: 'text', text: 'Explain the error message.' }, ], }, ], }); console.log(result.text); console.log(result.providerMetadata?.anthropic); // e.g. { cacheCreationInputTokens: 2118, cacheReadInputTokens: 0 } ``` You can also use cache control on system messages by providing multiple system messages at the head of your messages array: ```ts highlight="3,7-9" const result = await generateText({ model: anthropic('claude-3-5-sonnet-20240620'), messages: [ { role: 'system', content: 'Cached system message part', providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } }, }, }, { role: 'system', content: 'Uncached system message part', }, { role: 'user', content: 'User prompt', }, ], }); ``` The minimum cacheable prompt length is: - 1024 tokens for Claude 3.7 Sonnet, Claude 3.5 Sonnet and Claude 3 Opus - 2048 tokens for Claude 3.5 Haiku and Claude 3 Haiku Shorter prompts cannot be cached, even if marked with `cacheControl`. Any requests to cache fewer than this number of tokens will be processed without caching. For more on prompt caching with Anthropic, see [Anthropic's Cache Control documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching). ### Computer Use Anthropic provides three built-in tools that can be used to interact with external systems: 1. **Bash Tool**: Allows running bash commands. 2. **Text Editor Tool**: Provides functionality for viewing and editing text files. 3. **Computer Tool**: Enables control of keyboard and mouse actions on a computer. They are available via the `tools` property of the provider instance. #### Bash Tool The Bash Tool allows running bash commands. Here's how to create and use it: ```ts const bashTool = anthropic.tools.bash_20241022({ execute: async ({ command, restart }) => { // Implement your bash command execution logic here // Return the result of the command execution }, }); ``` Parameters: - `command` (string): The bash command to run. Required unless the tool is being restarted. - `restart` (boolean, optional): Specifying true will restart this tool. #### Text Editor Tool The Text Editor Tool provides functionality for viewing and editing text files: ```ts const textEditorTool = anthropic.tools.textEditor_20241022({ execute: async ({ command, path, file_text, insert_line, new_str, old_str, view_range, }) => { // Implement your text editing logic here // Return the result of the text editing operation }, }); ``` Parameters: - `command` ('view' | 'create' | 'str_replace' | 'insert' | 'undo_edit'): The command to run. - `path` (string): Absolute path to file or directory, e.g. `/repo/file.py` or `/repo`. - `file_text` (string, optional): Required for `create` command, with the content of the file to be created. - `insert_line` (number, optional): Required for `insert` command. The line number after which to insert the new string. - `new_str` (string, optional): New string for `str_replace` or `insert` commands. - `old_str` (string, optional): Required for `str_replace` command, containing the string to replace. 
- `view_range` (number[], optional): Optional for `view` command to specify line range to show. #### Computer Tool The Computer Tool enables control of keyboard and mouse actions on a computer: ```ts const computerTool = anthropic.tools.computer_20241022({ displayWidthPx: 1920, displayHeightPx: 1080, displayNumber: 0, // Optional, for X11 environments execute: async ({ action, coordinate, text }) => { // Implement your computer control logic here // Return the result of the action // Example code: switch (action) { case 'screenshot': { // multipart result: return { type: 'image', data: fs .readFileSync('./data/screenshot-editor.png') .toString('base64'), }; } default: { console.log('Action:', action); console.log('Coordinate:', coordinate); console.log('Text:', text); return `executed ${action}`; } } }, // map to tool result content for LLM consumption: experimental_toToolResultContent(result) { return typeof result === 'string' ? [{ type: 'text', text: result }] : [{ type: 'image', data: result.data, mimeType: 'image/png' }]; }, }); ``` Parameters: - `action` ('key' | 'type' | 'mouse_move' | 'left_click' | 'left_click_drag' | 'right_click' | 'middle_click' | 'double_click' | 'screenshot' | 'cursor_position'): The action to perform. - `coordinate` (number[], optional): Required for `mouse_move` and `left_click_drag` actions. Specifies the (x, y) coordinates. - `text` (string, optional): Required for `type` and `key` actions. These tools can be used in conjunction with the `sonnet-3-5-sonnet-20240620` model to enable more complex interactions and tasks. ### PDF support Anthropic Sonnet `claude-3-5-sonnet-20241022` supports reading PDF files. You can pass PDF files as part of the message content using the `file` type: ```ts const result = await generateText({ model: anthropic('claude-3-5-sonnet-20241022'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is an embedding model according to this document?', }, { type: 'file', data: fs.readFileSync('./data/ai.pdf'), mimeType: 'application/pdf', }, ], }, ], }); ``` The model will have access to the contents of the PDF file and respond to questions about it. The PDF file should be passed using the `data` field, and the `mimeType` should be set to `'application/pdf'`. ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Computer Use | | ---------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `claude-3-7-sonnet-20250219` | | | | | | `claude-3-5-sonnet-20241022` | | | | | | `claude-3-5-sonnet-20240620` | | | | | | `claude-3-5-haiku-20241022` | | | | | | `claude-3-opus-20240229` | | | | | | `claude-3-sonnet-20240229` | | | | | | `claude-3-haiku-20240307` | | | | | The table above lists popular models. Please see the [Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list of available models. The table above lists popular models. You can also pass any available provider model ID as a string if needed. --- title: Amazon Bedrock description: Learn how to use the Amazon Bedrock provider. --- # Amazon Bedrock Provider The Amazon Bedrock provider for the [AI SDK](https://sdk.vercel.ai/docs) contains language model support for the [Amazon Bedrock](https://aws.amazon.com/bedrock) APIs. ## Setup The Bedrock provider is available in the `@ai-sdk/amazon-bedrock` module. You can install it with ### Prerequisites Access to Amazon Bedrock foundation models isn't granted by default. 
In order to gain access to a foundation model, an IAM user with sufficient permissions needs to request access to it through the console. Once access is provided to a model, it is available for all users in the account. See the [Model Access Docs](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) for more information. ### Authentication **Step 1: Creating AWS Access Key and Secret Key** To get started, you'll need to create an AWS access key and secret key. Here's how: **Login to AWS Management Console** - Go to the [AWS Management Console](https://console.aws.amazon.com/) and log in with your AWS account credentials. **Create an IAM User** - Navigate to the [IAM dashboard](https://console.aws.amazon.com/iam/home) and click on "Users" in the left-hand navigation menu. - Click on "Create user" and fill in the required details to create a new IAM user. - Make sure to select "Programmatic access" as the access type. - The user account needs the `AmazonBedrockFullAccess` policy attached to it. **Create Access Key** - Click on the "Security credentials" tab and then click on "Create access key". - Click "Create access key" to generate a new access key pair. - Download the `.csv` file containing the access key ID and secret access key. **Step 2: Configuring the Access Key and Secret Key** Within your project add a `.env` file if you don't already have one. This file will be used to set the access key and secret key as environment variables. Add the following lines to the `.env` file: ```makefile AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY AWS_REGION=YOUR_REGION ``` Many frameworks such as [Next.js](https://nextjs.org/) load the `.env` file automatically. If you're using a different framework, you may need to load the `.env` file manually using a package like [`dotenv`](https://github.com/motdotla/dotenv). Remember to replace `YOUR_ACCESS_KEY_ID`, `YOUR_SECRET_ACCESS_KEY`, and `YOUR_REGION` with the actual values from your AWS account. ## Provider Instance You can import the default provider instance `bedrock` from `@ai-sdk/amazon-bedrock`: ```ts import { bedrock } from '@ai-sdk/amazon-bedrock'; ``` If you need a customized setup, you can import `createAmazonBedrock` from `@ai-sdk/amazon-bedrock` and create a provider instance with your settings: ```ts import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock'; const bedrock = createAmazonBedrock({ region: 'us-east-1', accessKeyId: 'xxxxxxxxx', secretAccessKey: 'xxxxxxxxx', sessionToken: 'xxxxxxxxx', }); ``` The credentials settings fall back to environment variable defaults described below. These may be set by your serverless environment without your awareness, which can lead to merged/conflicting credential values and provider errors around failed authentication. If you're experiencing issues be sure you are explicitly specifying all settings (even if `undefined`) to avoid any defaults. You can use the following optional settings to customize the Amazon Bedrock provider instance: - **region** _string_ The AWS region that you want to use for the API calls. It uses the `AWS_REGION` environment variable by default. - **accessKeyId** _string_ The AWS access key ID that you want to use for the API calls. It uses the `AWS_ACCESS_KEY_ID` environment variable by default. - **secretAccessKey** _string_ The AWS secret access key that you want to use for the API calls. It uses the `AWS_SECRET_ACCESS_KEY` environment variable by default. - **sessionToken** _string_ Optional. 
The AWS session token that you want to use for the API calls. It uses the `AWS_SESSION_TOKEN` environment variable by default.

## Language Models

You can create models that call the Bedrock API using the provider instance. The first argument is the model id, e.g. `meta.llama3-70b-instruct-v1:0`.

```ts
const model = bedrock('meta.llama3-70b-instruct-v1:0');
```

Amazon Bedrock models also support some model-specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument:

```ts
const model = bedrock('anthropic.claude-3-sonnet-20240229-v1:0', {
  additionalModelRequestFields: { top_k: 350 },
});
```

Documentation for additional settings based on the selected model can be found within the [Amazon Bedrock Inference Parameter Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).

You can use Amazon Bedrock language models to generate text with the `generateText` function:

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';

const { text } = await generateText({
  model: bedrock('meta.llama3-70b-instruct-v1:0'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```

Amazon Bedrock language models can also be used in the `streamText` function (see [AI SDK Core](/docs/ai-sdk-core)).

### File Inputs

Amazon Bedrock supports file inputs, e.g. PDF files, in combination with specific models such as `anthropic.claude-3-haiku-20240307-v1:0`.

```ts
import fs from 'node:fs';
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';

const result = await generateText({
  model: bedrock('anthropic.claude-3-haiku-20240307-v1:0'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe the pdf in detail.' },
        {
          type: 'file',
          data: fs.readFileSync('./data/ai.pdf'),
          mimeType: 'application/pdf',
        },
      ],
    },
  ],
});
```

### Guardrails

You can use the `bedrock` provider options to utilize [Amazon Bedrock Guardrails](https://aws.amazon.com/bedrock/guardrails/):

```ts
const result = await generateText({
  model: bedrock('anthropic.claude-3-sonnet-20240229-v1:0'),
  providerOptions: {
    bedrock: {
      guardrailConfig: {
        guardrailIdentifier: '1abcd2ef34gh',
        guardrailVersion: '1',
        trace: 'enabled' as const,
        streamProcessingMode: 'async',
      },
    },
  },
});
```

Tracing information will be returned in the provider metadata if you have tracing enabled.

```ts
if (result.providerMetadata?.bedrock.trace) {
  // ...
}
```

See the [Amazon Bedrock Guardrails documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) for more information.

### Cache Points

Amazon Bedrock prompt caching is currently in preview release. To request access, visit the [Amazon Bedrock prompt caching page](https://aws.amazon.com/bedrock/prompt-caching/).

In messages, you can use the `providerOptions` property to set cache points. Set the `bedrock` property in the `providerOptions` object to `{ cachePoint: { type: 'default' } }` to create a cache point. Cache usage information is returned in the `providerMetadata` object. See examples below.

Cache points have model-specific token minimums and limits. For example, Claude 3.5 Sonnet v2 requires at least 1,024 tokens for a cache point and allows up to 4 cache points. See the [Amazon Bedrock prompt caching documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for details on supported models, regions, and limits.
```ts import { bedrock } from '@ai-sdk/amazon-bedrock'; import { generateText } from 'ai'; const cyberpunkAnalysis = '... literary analysis of cyberpunk themes and concepts ...'; const result = await generateText({ model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'), messages: [ { role: 'system', content: `You are an expert on William Gibson's cyberpunk literature and themes. You have access to the following academic analysis: ${cyberpunkAnalysis}`, providerOptions: { bedrock: { cachePoint: { type: 'default' } }, }, }, { role: 'user', content: 'What are the key cyberpunk themes that Gibson explores in Neuromancer?', }, ], }); console.log(result.text); console.log(result.providerMetadata?.bedrock?.usage); // Shows cache read/write token usage, e.g.: // { // cacheReadInputTokens: 1337, // cacheWriteInputTokens: 42, // } ``` Cache points also work with streaming responses: ```ts import { bedrock } from '@ai-sdk/amazon-bedrock'; import { streamText } from 'ai'; const cyberpunkAnalysis = '... literary analysis of cyberpunk themes and concepts ...'; const result = streamText({ model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'), messages: [ { role: 'assistant', content: [ { type: 'text', text: 'You are an expert on cyberpunk literature.' }, { type: 'text', text: `Academic analysis: ${cyberpunkAnalysis}` }, ], providerOptions: { bedrock: { cachePoint: { type: 'default' } } }, }, { role: 'user', content: 'How does Gibson explore the relationship between humanity and technology?', }, ], }); for await (const textPart of result.textStream) { process.stdout.write(textPart); } console.log( 'Cache token usage:', (await result.providerMetadata)?.bedrock?.usage, ); // Shows cache read/write token usage, e.g.: // { // cacheReadInputTokens: 1337, // cacheWriteInputTokens: 42, // } ``` ## Reasoning Amazon Bedrock has reasoning support for the `claude-3-7-sonnet-20250219` model. You can enable it using the `reasoning_config` provider option and specifying a thinking budget in tokens (minimum: `1024`, maximum: `64000`). ```ts import { bedrock } from '@ai-sdk/amazon-bedrock'; import { generateText } from 'ai'; const { text, reasoning, reasoningDetails } = await generateText({ model: bedrock('us.anthropic.claude-3-7-sonnet-20250219-v1:0'), prompt: 'How many people will live in the world in 2040?', providerOptions: { bedrock: { reasoningConfig: { type: 'enabled', budgetTokens: 1024 }, }, }, }); console.log(reasoning); // reasoning text console.log(reasoningDetails); // reasoning details including redacted reasoning console.log(text); // text response ``` See [AI SDK UI: Chatbot](/docs/ai-sdk-ui/chatbot#reasoning) for more details on how to integrate reasoning into your chatbot. 
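The reasoning configuration also works with streaming. The sketch below is illustrative rather than definitive: it assumes the same `reasoningConfig` provider option applies to `streamText`, and that the stream result exposes the final reasoning text as an awaitable `reasoning` property, mirroring the `generateText` example above.

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { streamText } from 'ai';

const result = streamText({
  model: bedrock('us.anthropic.claude-3-7-sonnet-20250219-v1:0'),
  prompt: 'How many people will live in the world in 2040?',
  providerOptions: {
    bedrock: {
      reasoningConfig: { type: 'enabled', budgetTokens: 1024 },
    },
  },
});

// stream the answer text as it arrives
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}

// assumption: the reasoning text resolves once the stream has finished
console.log('\nReasoning:', await result.reasoning);
```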
### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `amazon.titan-tg1-large` | | | | | | `amazon.titan-text-express-v1` | | | | | | `amazon.nova-micro-v1:0` | | | | | | `amazon.nova-lite-v1:0` | | | | | | `amazon.nova-pro-v1:0` | | | | | | `anthropic.claude-3-7-sonnet-20250219-v1:0` | | | | | | `anthropic.claude-3-5-sonnet-20241022-v2:0` | | | | | | `anthropic.claude-3-5-sonnet-20240620-v1:0` | | | | | | `anthropic.claude-3-5-haiku-20241022-v1:0` | | | | | | `anthropic.claude-3-opus-20240229-v1:0` | | | | | | `anthropic.claude-3-sonnet-20240229-v1:0` | | | | | | `anthropic.claude-3-haiku-20240307-v1:0` | | | | | | `anthropic.claude-v2:1` | | | | | | `cohere.command-r-v1:0` | | | | | | `cohere.command-r-plus-v1:0` | | | | | | `deepseek.r1-v1:0` | | | | | | `meta.llama2-13b-chat-v1` | | | | | | `meta.llama2-70b-chat-v1` | | | | | | `meta.llama3-8b-instruct-v1:0` | | | | | | `meta.llama3-70b-instruct-v1:0` | | | | | | `meta.llama3-1-8b-instruct-v1:0` | | | | | | `meta.llama3-1-70b-instruct-v1:0` | | | | | | `meta.llama3-1-405b-instruct-v1:0` | | | | | | `meta.llama3-2-1b-instruct-v1:0` | | | | | | `meta.llama3-2-3b-instruct-v1:0` | | | | | | `meta.llama3-2-11b-instruct-v1:0` | | | | | | `meta.llama3-2-90b-instruct-v1:0` | | | | | | `mistral.mistral-7b-instruct-v0:2` | | | | | | `mistral.mixtral-8x7b-instruct-v0:1` | | | | | | `mistral.mistral-large-2402-v1:0` | | | | | | `mistral.mistral-small-2402-v1:0` | | | | | The table above lists popular models. Please see the [Amazon Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html) for a full list of available models. The table above lists popular models. You can also pass any available provider model ID as a string if needed. ## Embedding Models You can create models that call the Bedrock API [Bedrock API](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html) using the `.embedding()` factory method. ```ts const model = bedrock.embedding('amazon.titan-embed-text-v1'); ``` Bedrock Titan embedding model amazon.titan-embed-text-v2:0 supports several aditional settings. You can pass them as an options argument: ```ts const model = bedrock.embedding('amazon.titan-embed-text-v2:0', { dimensions: 512 // optional, number of dimensions for the embedding normalize: true // optional normalize the output embeddings }) ``` The following optional settings are available for Bedrock Titan embedding models: - **dimensions**: _number_ The number of dimensions the output embeddings should have. The following values are accepted: 1024 (default), 512, 256. - **normalize** _boolean_ Flag indicating whether or not to normalize the output embeddings. Defaults to true. ### Model Capabilities | Model | Default Dimensions | Custom Dimensions | | ------------------------------ | ------------------ | ------------------- | | `amazon.titan-embed-text-v1` | 1536 | | | `amazon.titan-embed-text-v2:0` | 1024 | | ## Image Models You can create models that call the Bedrock API [Bedrock API](https://docs.aws.amazon.com/nova/latest/userguide/image-generation.html) using the `.image()` factory method. For more on the Amazon Nova Canvas image model, see the [Nova Canvas Overview](https://docs.aws.amazon.com/ai/responsible-ai/nova-canvas/overview.html). 
The `amazon.nova-canvas-v1:0` model is available in the `us-east-1` region.

```ts
const model = bedrock.image('amazon.nova-canvas-v1:0');
```

You can then generate images with the `experimental_generateImage` function:

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  model: bedrock.imageModel('amazon.nova-canvas-v1:0'),
  prompt: 'A beautiful sunset over a calm ocean',
  size: '512x512',
  seed: 42,
});
```

You can also pass the `providerOptions` object to the `generateImage` function to customize the generation behavior:

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  model: bedrock.imageModel('amazon.nova-canvas-v1:0'),
  prompt: 'A beautiful sunset over a calm ocean',
  size: '512x512',
  seed: 42,
  providerOptions: { bedrock: { quality: 'premium' } },
});
```

Documentation for additional settings can be found within the [Amazon Bedrock User Guide for Amazon Nova Documentation](https://docs.aws.amazon.com/nova/latest/userguide/image-gen-req-resp-structure.html).

### Image Model Settings

When creating an image model, you can customize the generation behavior with optional settings:

```ts
const model = bedrock.imageModel('amazon.nova-canvas-v1:0', {
  maxImagesPerCall: 1, // Maximum number of images to generate per API call
});
```

- **maxImagesPerCall** _number_ Override the maximum number of images generated per API call. Default can vary by model, with 5 as a common default.

### Model Capabilities

The Amazon Nova Canvas model supports custom sizes with constraints as follows:

- Each side must be between 320-4096 pixels, inclusive.
- Each side must be evenly divisible by 16.
- The aspect ratio must be between 1:4 and 4:1. That is, one side can't be more than 4 times longer than the other side.
- The total pixel count must be less than 4,194,304.

For more, see [Image generation access and usage](https://docs.aws.amazon.com/nova/latest/userguide/image-gen-access.html).

| Model                     | Sizes                                                                                                  |
| ------------------------- | ------------------------------------------------------------------------------------------------------ |
| `amazon.nova-canvas-v1:0` | Custom sizes: 320-4096px per side (must be divisible by 16), aspect ratio 1:4 to 4:1, max 4.2M pixels |

## Response Headers

The Amazon Bedrock provider will return the response headers associated with network requests made to the Bedrock servers.

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';

const result = await generateText({
  model: bedrock('meta.llama3-70b-instruct-v1:0'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});

console.log(result.response.headers);
```

Below is sample output where you can see the `x-amzn-requestid` header.
This can be useful for correlating Bedrock API calls with requests made by the AI SDK: ```js highlight="6" { connection: 'keep-alive', 'content-length': '2399', 'content-type': 'application/json', date: 'Fri, 07 Feb 2025 04:28:30 GMT', 'x-amzn-requestid': 'c9f3ace4-dd5d-49e5-9807-39aedfa47c8e' } ``` This information is also available with `streamText`: ```ts import { bedrock } from '@ai-sdk/amazon-bedrock'; import { streamText } from 'ai'; const result = streamText({ model: bedrock('meta.llama3-70b-instruct-v1:0'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); for await (const textPart of result.textStream) { process.stdout.write(textPart); } console.log('Response headers:', (await result.response).headers); ``` With sample output as: ```js highlight="6" { connection: 'keep-alive', 'content-type': 'application/vnd.amazon.eventstream', date: 'Fri, 07 Feb 2025 04:33:37 GMT', 'transfer-encoding': 'chunked', 'x-amzn-requestid': 'a976e3fc-0e45-4241-9954-b9bdd80ab407' } ``` ## Migrating to `@ai-sdk/amazon-bedrock` 2.x The Amazon Bedrock provider was rewritten in version 2.x to remove the dependency on the `@aws-sdk/client-bedrock-runtime` package. The `bedrockOptions` provider setting previously available has been removed. If you were using the `bedrockOptions` object, you should now use the `region`, `accessKeyId`, `secretAccessKey`, and `sessionToken` settings directly instead. Note that you may need to set all of these explicitly, e.g. even if you're not using `sessionToken`, set it to `undefined`. If you're running in a serverless environment, there may be default environment variables set by your containing environment that the Amazon Bedrock provider will then pick up and could conflict with the ones you're intending to use. --- title: Google Generative AI description: Learn how to use Google Generative AI Provider. --- # Google Generative AI Provider The [Google Generative AI](https://ai.google/discover/generativeai/) provider contains language and embedding model support for the [Google Generative AI](https://ai.google.dev/api/rest) APIs. ## Setup The Google provider is available in the `@ai-sdk/google` module. You can install it with ## Provider Instance You can import the default provider instance `google` from `@ai-sdk/google`: ```ts import { google } from '@ai-sdk/google'; ``` If you need a customized setup, you can import `createGoogleGenerativeAI` from `@ai-sdk/google` and create a provider instance with your settings: ```ts import { createGoogleGenerativeAI } from '@ai-sdk/google'; const google = createGoogleGenerativeAI({ // custom settings }); ``` You can use the following optional settings to customize the Google Generative AI provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://generativelanguage.googleapis.com/v1beta`. - **apiKey** _string_ API key that is being sent using the `x-goog-api-key` header. It defaults to the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. 
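As an illustration of the settings above, the following sketch creates a customized provider instance that routes requests through a proxy and attaches a custom header. The proxy URL and header name are placeholders for this example, not values from the documentation:

```ts
import { createGoogleGenerativeAI } from '@ai-sdk/google';

const google = createGoogleGenerativeAI({
  // hypothetical proxy in front of the Generative Language API
  baseURL: 'https://my-proxy.example.com/v1beta',
  // falls back to the GOOGLE_GENERATIVE_AI_API_KEY environment variable if omitted
  apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
  headers: {
    'x-request-source': 'docs-example', // hypothetical custom header
  },
});
```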
## Language Models You can create models that call the [Google Generative AI API](https://ai.google.dev/api/rest) using the provider instance. The first argument is the model id, e.g. `gemini-1.5-pro-latest`. The models support tool calls and some have multi-modal capabilities. ```ts const model = google('gemini-1.5-pro-latest'); ``` You can use fine-tuned models by prefixing the model id with `tunedModels/`, e.g. `tunedModels/my-model`. Google Generative AI models support also some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = google('gemini-1.5-pro-latest', { safetySettings: [ { category: 'HARM_CATEGORY_UNSPECIFIED', threshold: 'BLOCK_LOW_AND_ABOVE' }, ], }); ``` The following optional settings are available for Google Generative AI models: - **cachedContent** _string_ Optional. The name of the cached content used as context to serve the prediction. Format: cachedContents/\{cachedContent\} - **structuredOutputs** _boolean_ Optional. Enable structured output. Default is true. This is useful when the JSON Schema contains elements that are not supported by the OpenAPI schema version that Google Generative AI uses. You can use this to disable structured outputs if you need to. See [Troubleshooting: Schema Limitations](#schema-limitations) for more details. - **safetySettings** _Array\<\{ category: string; threshold: string \}\>_ Optional. Safety settings for the model. - **category** _string_ The category of the safety setting. Can be one of the following: - `HARM_CATEGORY_HATE_SPEECH` - `HARM_CATEGORY_DANGEROUS_CONTENT` - `HARM_CATEGORY_HARASSMENT` - `HARM_CATEGORY_SEXUALLY_EXPLICIT` - **threshold** _string_ The threshold of the safety setting. Can be one of the following: - `HARM_BLOCK_THRESHOLD_UNSPECIFIED` - `BLOCK_LOW_AND_ABOVE` - `BLOCK_MEDIUM_AND_ABOVE` - `BLOCK_ONLY_HIGH` - `BLOCK_NONE` You can use Google Generative AI language models to generate text with the `generateText` function: ```ts import { google } from '@ai-sdk/google'; import { generateText } from 'ai'; const { text } = await generateText({ model: google('gemini-1.5-pro-latest'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Google Generative AI language models can also be used in the `streamText`, `generateObject`, `streamObject`, and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### File Inputs The Google Generative AI provider supports file inputs, e.g. PDF files. ```ts import { google } from '@ai-sdk/google'; import { generateText } from 'ai'; const result = await generateText({ model: google('gemini-1.5-flash'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is an embedding model according to this document?', }, { type: 'file', data: fs.readFileSync('./data/ai.pdf'), mimeType: 'application/pdf', }, ], }, ], }); ``` The AI SDK will automatically download URLs if you pass them as data, except for `https://generativelanguage.googleapis.com/v1beta/files/`. You can use the Google Generative AI Files API to upload larger files to that location. See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use files in prompts. 
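Since the AI SDK downloads URLs passed as `data` (with the exception noted above), you can also reference a remotely hosted file instead of reading it from disk. A minimal sketch, assuming a placeholder URL:

```ts
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';

const result = await generateText({
  model: google('gemini-1.5-flash'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Summarize this document in three sentences.' },
        {
          type: 'file',
          // placeholder URL; the SDK downloads it and sends the file contents
          data: new URL('https://example.com/whitepaper.pdf'),
          mimeType: 'application/pdf',
        },
      ],
    },
  ],
});

console.log(result.text);
```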
### Cached Content You can use Google Generative AI language models to cache content: ```ts import { google } from '@ai-sdk/google'; import { GoogleAICacheManager } from '@google/generative-ai/server'; import { generateText } from 'ai'; const cacheManager = new GoogleAICacheManager( process.env.GOOGLE_GENERATIVE_AI_API_KEY, ); // As of August 23rd, 2024, these are the only models that support caching type GoogleModelCacheableId = | 'models/gemini-1.5-flash-001' | 'models/gemini-1.5-pro-001'; const model: GoogleModelCacheableId = 'models/gemini-1.5-pro-001'; const { name: cachedContent } = await cacheManager.create({ model, contents: [ { role: 'user', parts: [{ text: '1000 Lasanga Recipes...' }], }, ], ttlSeconds: 60 * 5, }); const { text: veggieLasangaRecipe } = await generateText({ model: google(model, { cachedContent }), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); const { text: meatLasangaRecipe } = await generateText({ model: google(model, { cachedContent }), prompt: 'Write a meat lasagna recipe for 12 people.', }); ``` ### Search Grounding With [search grounding](https://ai.google.dev/gemini-api/docs/grounding), the model has access to the latest information using Google search. Search grounding can be used to provide answers around current events: ```ts highlight="7,14-20" import { google } from '@ai-sdk/google'; import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google'; import { generateText } from 'ai'; const { text, providerMetadata } = await generateText({ model: google('gemini-1.5-pro', { useSearchGrounding: true, }), prompt: 'List the top 5 San Francisco news from the past week.' + 'You must include the date of each article.', }); // access the grounding metadata. Casting to the provider metadata type // is optional but provides autocomplete and type safety. const metadata = providerMetadata?.google as | GoogleGenerativeAIProviderMetadata | undefined; const groundingMetadata = metadata?.groundingMetadata; const safetyRatings = metadata?.safetyRatings; ``` The grounding metadata includes detailed information about how search results were used to ground the model's response. Here are the available fields: - **`webSearchQueries`** (`string[] | null`) - Array of search queries used to retrieve information - Example: `["What's the weather in Chicago this weekend?"]` - **`searchEntryPoint`** (`{ renderedContent: string } | null`) - Contains the main search result content used as an entry point - The `renderedContent` field contains the formatted content - **`groundingSupports`** (Array of support objects | null) - Contains details about how specific response parts are supported by search results - Each support object includes: - **`segment`**: Information about the grounded text segment - `text`: The actual text segment - `startIndex`: Starting position in the response - `endIndex`: Ending position in the response - **`groundingChunkIndices`**: References to supporting search result chunks - **`confidenceScores`**: Confidence scores (0-1) for each supporting chunk Example response: ```json { "groundingMetadata": { "webSearchQueries": ["What's the weather in Chicago this weekend?"], "searchEntryPoint": { "renderedContent": "..." }, "groundingSupports": [ { "segment": { "startIndex": 0, "endIndex": 65, "text": "Chicago weather changes rapidly, so layers let you adjust easily." 
}, "groundingChunkIndices": [0], "confidenceScores": [0.99] } ] } } ``` #### Dynamic Retrieval With [dynamic retrieval](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-with-google-search#dynamic-retrieval), you can configure how the model decides when to turn on Grounding with Google Search. This gives you more control over when and how the model grounds its responses. ```ts highlight="7-10" import { google } from '@ai-sdk/google'; import { generateText } from 'ai'; const { text, providerMetadata } = await generateText({ model: google('gemini-1.5-flash', { useSearchGrounding: true, dynamicRetrievalConfig: { mode: 'MODE_DYNAMIC', dynamicThreshold: 0.8, }, }), prompt: 'Who won the latest F1 grand prix?', }); ``` The `dynamicRetrievalConfig` describes the options to customize dynamic retrieval: - `mode`: The mode of the predictor to be used in dynamic retrieval. The following modes are supported: - `MODE_DYNAMIC`: Run retrieval only when system decides it is necessary - `MODE_UNSPECIFIED`: Always trigger retrieval - `dynamicThreshold`: The threshold to be used in dynamic retrieval (if not set, a system default value is used). Dynamic retrieval is only available with Gemini 1.5 Flash models and is not supported with 8B variants. ### Sources When you use [Search Grounding](#search-grounding), the model will include sources in the response. You can access them using the `sources` property of the result: ```ts import { google } from '@ai-sdk/google'; import { generateText } from 'ai'; const { sources } = await generateText({ model: google('gemini-2.0-flash-exp', { useSearchGrounding: true }), prompt: 'List the top 5 San Francisco news from the past week.', }); ``` ### Safety Ratings The safety ratings provide insight into the safety of the model's response. See [Google AI documentation on safety settings](https://ai.google.dev/gemini-api/docs/safety-settings). Example response excerpt: ```json { "safetyRatings": [ { "category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE", "probabilityScore": 0.11027937, "severity": "HARM_SEVERITY_LOW", "severityScore": 0.28487435 }, { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "HIGH", "blocked": true, "probabilityScore": 0.95422274, "severity": "HARM_SEVERITY_MEDIUM", "severityScore": 0.43398145 }, { "category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE", "probabilityScore": 0.11085559, "severity": "HARM_SEVERITY_NEGLIGIBLE", "severityScore": 0.19027223 }, { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE", "probabilityScore": 0.22901751, "severity": "HARM_SEVERITY_NEGLIGIBLE", "severityScore": 0.09089675 } ] } ``` ### Troubleshooting #### Schema Limitations The Google Generative AI API uses a subset of the OpenAPI 3.0 schema, which does not support features such as unions. The errors that you get in this case look like this: `GenerateContentRequest.generation_config.response_schema.properties[occupation].type: must be specified` By default, structured outputs are enabled (and for tool calling they are required). 
You can disable structured outputs for object generation as a workaround: ```ts highlight="3,8" const result = await generateObject({ model: google('gemini-1.5-pro-latest', { structuredOutputs: false, }), schema: z.object({ name: z.string(), age: z.number(), contact: z.union([ z.object({ type: z.literal('email'), value: z.string(), }), z.object({ type: z.literal('phone'), value: z.string(), }), ]), }), prompt: 'Generate an example person for testing.', }); ``` The following Zod features are known to not work with Google Generative AI: - `z.union` - `z.record` ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `gemini-2.0-flash-001` | | | | | | `gemini-1.5-pro` | | | | | | `gemini-1.5-pro-latest` | | | | | | `gemini-1.5-flash` | | | | | | `gemini-1.5-flash-latest` | | | | | | `gemini-1.5-flash-8b` | | | | | | `gemini-1.5-flash-8b-latest` | | | | | The table above lists popular models. Please see the [Google Generative AI docs](https://ai.google.dev/gemini-api/docs/models/gemini) for a full list of available models. The table above lists popular models. You can also pass any available provider model ID as a string if needed. ## Embedding Models You can create models that call the [Google Generative AI embeddings API](https://ai.google.dev/api/embeddings) using the `.textEmbeddingModel()` factory method. ```ts const model = google.textEmbeddingModel('text-embedding-004'); ``` Google Generative AI embedding models support aditional settings. You can pass them as an options argument: ```ts const model = google.textEmbeddingModel('text-embedding-004', { outputDimensionality: 512, // optional, number of dimensions for the embedding }); ``` The following optional settings are available for Google Generative AI embedding models: - **outputDimensionality**: _number_ Optional reduced dimension for the output embedding. If set, excessive values in the output embedding are truncated from the end. ### Model Capabilities | Model | Default Dimensions | Custom Dimensions | | -------------------- | ------------------ | ------------------- | | `text-embedding-004` | 768 | | --- title: Google Vertex AI description: Learn how to use the Google Vertex AI provider. --- # Google Vertex Provider The Google Vertex provider for the [AI SDK](https://sdk.vercel.ai/docs) contains language model support for the [Google Vertex AI](https://cloud.google.com/vertex-ai) APIs. This includes support for [Google's Gemini models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) and [Anthropic's Claude partner models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude). The Google Vertex provider is compatible with both Node.js and Edge runtimes. The Edge runtime is supported through the `@ai-sdk/google-vertex/edge` sub-module. More details can be found in the [Google Vertex Edge Runtime](#google-vertex-edge-runtime) and [Google Vertex Anthropic Edge Runtime](#google-vertex-anthropic-edge-runtime) sections below. ## Setup The Google Vertex and Google Vertex Anthropic providers are both available in the `@ai-sdk/google-vertex` module. You can install it with ## Google Vertex Provider Usage The Google Vertex provider instance is used to create model instances that call the Vertex AI API. 
The models available with this provider include [Google's Gemini models](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models). If you're looking to use [Anthropic's Claude models](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), see the [Google Vertex Anthropic Provider](#google-vertex-anthropic-provider-usage) section below. ### Provider Instance You can import the default provider instance `vertex` from `@ai-sdk/google-vertex`: ```ts import { vertex } from '@ai-sdk/google-vertex'; ``` If you need a customized setup, you can import `createVertex` from `@ai-sdk/google-vertex` and create a provider instance with your settings: ```ts import { createVertex } from '@ai-sdk/google-vertex'; const vertex = createVertex({ project: 'my-project', // optional location: 'us-central1', // optional }); ``` Google Vertex supports two different authentication implementations depending on your runtime environment. #### Node.js Runtime The Node.js runtime is the default runtime supported by the AI SDK. It supports all standard Google Cloud authentication options through the [`google-auth-library`](https://github.com/googleapis/google-auth-library-nodejs?tab=readme-ov-file#ways-to-authenticate). Typical use involves setting a path to a json credentials file in the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The credentials file can be obtained from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials). If you want to customize the Google authentication options you can pass them as options to the `createVertex` function, for example: ```ts import { createVertex } from '@ai-sdk/google-vertex'; const vertex = createVertex({ googleAuthOptions: { credentials: { client_email: 'my-email', private_key: 'my-private-key', }, }, }); ``` ##### Optional Provider Settings You can use the following optional settings to customize the provider instance: - **project** _string_ The Google Cloud project ID that you want to use for the API calls. It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default. - **location** _string_ The Google Cloud location that you want to use for the API calls, e.g. `us-central1`. It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default. - **googleAuthOptions** _object_ Optional. The Authentication options used by the [Google Auth Library](https://github.com/googleapis/google-auth-library-nodejs/). See also the [GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/08978822e1b7b5961f0e355df51d738e012be392/src/auth/googleauth.ts#L87C18-L87C35) interface. - **authClient** _object_ An `AuthClient` to use. - **keyFilename** _string_ Path to a .json, .pem, or .p12 key file. - **keyFile** _string_ Path to a .json, .pem, or .p12 key file. - **credentials** _object_ Object containing client_email and private_key properties, or the external account client options. - **clientOptions** _object_ Options object passed to the constructor of the client. - **scopes** _string | string[]_ Required scopes for the desired API request. - **projectId** _string_ Your project ID. - **universeDomain** _string_ The default service domain for a given Cloud universe. - **headers** _Resolvable<Record<string, string | undefined>>_ Headers to include in the requests. 
Can be provided in multiple formats: - A record of header key-value pairs: `Record<string, string | undefined>` - A function that returns headers: `() => Record<string, string | undefined>` - An async function that returns headers: `async () => Record<string, string | undefined>` - A promise that resolves to headers: `Promise<Record<string, string | undefined>>` - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. - **baseURL** _string_ Optional. Base URL for the Google Vertex API calls e.g. to use proxy servers. By default, it is constructed using the location and project: `https://${location}-aiplatform.googleapis.com/v1/projects/${project}/locations/${location}/publishers/google` #### Edge Runtime Edge runtimes (like Vercel Edge Functions and Cloudflare Workers) are lightweight JavaScript environments that run closer to users at the network edge. They only provide a subset of the standard Node.js APIs. For example, direct file system access is not available, and many Node.js-specific libraries (including the standard Google Auth library) are not compatible. The Edge runtime version of the Google Vertex provider supports Google's [Application Default Credentials](https://github.com/googleapis/google-auth-library-nodejs?tab=readme-ov-file#application-default-credentials) through environment variables. The values can be obtained from a json credentials file from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials). You can import the default provider instance `vertex` from `@ai-sdk/google-vertex/edge`: ```ts import { vertex } from '@ai-sdk/google-vertex/edge'; ``` The `/edge` sub-module is included in the `@ai-sdk/google-vertex` package, so you don't need to install it separately. You must import from `@ai-sdk/google-vertex/edge` to differentiate it from the Node.js provider. If you need a customized setup, you can import `createVertex` from `@ai-sdk/google-vertex/edge` and create a provider instance with your settings: ```ts import { createVertex } from '@ai-sdk/google-vertex/edge'; const vertex = createVertex({ project: 'my-project', // optional location: 'us-central1', // optional }); ``` For Edge runtime authentication, you'll need to set these environment variables from your Google Default Application Credentials JSON file: - `GOOGLE_CLIENT_EMAIL` - `GOOGLE_PRIVATE_KEY` - `GOOGLE_PRIVATE_KEY_ID` (optional) These values can be obtained from a service account JSON file from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials). ##### Optional Provider Settings You can use the following optional settings to customize the provider instance: - **project** _string_ The Google Cloud project ID that you want to use for the API calls. It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default. - **location** _string_ The Google Cloud location that you want to use for the API calls, e.g. `us-central1`. It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default. - **googleCredentials** _object_ Optional. The credentials used by the Edge provider for authentication. These credentials are typically set through environment variables and are derived from a service account JSON file. - **clientEmail** _string_ The client email from the service account JSON file. Defaults to the contents of the `GOOGLE_CLIENT_EMAIL` environment variable.
- **privateKey** _string_ The private key from the service account JSON file. Defaults to the contents of the `GOOGLE_PRIVATE_KEY` environment variable. - **privateKeyId** _string_ The private key ID from the service account JSON file (optional). Defaults to the contents of the `GOOGLE_PRIVATE_KEY_ID` environment variable. - **headers** _Resolvable<Record<string, string | undefined>>_ Headers to include in the requests. Can be provided in multiple formats: - A record of header key-value pairs: `Record<string, string | undefined>` - A function that returns headers: `() => Record<string, string | undefined>` - An async function that returns headers: `async () => Record<string, string | undefined>` - A promise that resolves to headers: `Promise<Record<string, string | undefined>>` - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ### Language Models You can create models that call the Vertex API using the provider instance. The first argument is the model id, e.g. `gemini-1.5-pro`. ```ts const model = vertex('gemini-1.5-pro'); ``` If you are using [your own models](https://cloud.google.com/vertex-ai/docs/training-overview), the name of your model needs to start with `projects/`. Google Vertex models also support some model-specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = vertex('gemini-1.5-pro', { safetySettings: [ { category: 'HARM_CATEGORY_UNSPECIFIED', threshold: 'BLOCK_LOW_AND_ABOVE' }, ], }); ``` The following optional settings are available for Google Vertex models: - **structuredOutputs** _boolean_ Optional. Enable structured output. Default is true. This is useful when the JSON Schema contains elements that are not supported by the OpenAPI schema version that Google Vertex uses. You can use this to disable structured outputs if you need to. See [Troubleshooting: Schema Limitations](#schema-limitations) for more details. - **safetySettings** _Array\<\{ category: string; threshold: string \}\>_ Optional. Safety settings for the model. - **category** _string_ The category of the safety setting. Can be one of the following: - `HARM_CATEGORY_UNSPECIFIED` - `HARM_CATEGORY_HATE_SPEECH` - `HARM_CATEGORY_DANGEROUS_CONTENT` - `HARM_CATEGORY_HARASSMENT` - `HARM_CATEGORY_SEXUALLY_EXPLICIT` - `HARM_CATEGORY_CIVIC_INTEGRITY` - **threshold** _string_ The threshold of the safety setting. Can be one of the following: - `HARM_BLOCK_THRESHOLD_UNSPECIFIED` - `BLOCK_LOW_AND_ABOVE` - `BLOCK_MEDIUM_AND_ABOVE` - `BLOCK_ONLY_HIGH` - `BLOCK_NONE` - **useSearchGrounding** _boolean_ Optional. When enabled, the model will [use Google search to ground the response](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/overview). - **audioTimestamp** _boolean_ Optional. Enables timestamp understanding for audio files. Defaults to false. This is useful for generating transcripts with accurate timestamps. Consult [Google's Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding) for usage details.
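For example, a setting such as `audioTimestamp` is passed through the same options argument shown above for `safetySettings`. A minimal sketch (the model ID is illustrative):

```ts
import { vertex } from '@ai-sdk/google-vertex';

// Minimal sketch: enable timestamp understanding for audio inputs.
const audioModel = vertex('gemini-1.5-pro', {
  audioTimestamp: true,
});
```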
You can use Google Vertex language models to generate text with the `generateText` function: ```ts highlight="1,4" import { vertex } from '@ai-sdk/google-vertex'; import { generateText } from 'ai'; const { text } = await generateText({ model: vertex('gemini-1.5-pro'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Google Vertex language models can also be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). #### File Inputs The Google Vertex provider supports file inputs, e.g. PDF files. ```ts import { vertex } from '@ai-sdk/google-vertex'; import { generateText } from 'ai'; import fs from 'node:fs'; const { text } = await generateText({ model: vertex('gemini-1.5-pro'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is an embedding model according to this document?', }, { type: 'file', data: fs.readFileSync('./data/ai.pdf'), mimeType: 'application/pdf', }, ], }, ], }); ``` The AI SDK will automatically download URLs if you pass them as data, except for `gs://` URLs. You can use the Google Cloud Storage API to upload larger files to that location. See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use files in prompts. #### Search Grounding With [search grounding](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/overview), the model has access to the latest information using Google search. Search grounding can be used to provide answers around current events: ```ts highlight="7,14-20" import { vertex } from '@ai-sdk/google-vertex'; import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google'; import { generateText } from 'ai'; const { text, providerMetadata } = await generateText({ model: vertex('gemini-1.5-pro', { useSearchGrounding: true, }), prompt: 'List the top 5 San Francisco news from the past week. ' + 'You must include the date of each article.', }); // access the grounding metadata. Casting to the provider metadata type // is optional but provides autocomplete and type safety. const metadata = providerMetadata?.google as | GoogleGenerativeAIProviderMetadata | undefined; const groundingMetadata = metadata?.groundingMetadata; const safetyRatings = metadata?.safetyRatings; ``` The grounding metadata includes detailed information about how search results were used to ground the model's response. Here are the available fields: - **`webSearchQueries`** (`string[] | null`) - Array of search queries used to retrieve information - Example: `["What's the weather in Chicago this weekend?"]` - **`searchEntryPoint`** (`{ renderedContent: string } | null`) - Contains the main search result content used as an entry point - The `renderedContent` field contains the formatted content - **`groundingSupports`** (Array of support objects | null) - Contains details about how specific response parts are supported by search results - Each support object includes: - **`segment`**: Information about the grounded text segment - `text`: The actual text segment - `startIndex`: Starting position in the response - `endIndex`: Ending position in the response - **`groundingChunkIndices`**: References to supporting search result chunks - **`confidenceScores`**: Confidence scores (0-1) for each supporting chunk Example response excerpt: ```json { "groundingMetadata": { "retrievalQueries": ["What's the weather in Chicago this weekend?"], "searchEntryPoint": { "renderedContent": "..."
}, "groundingSupports": [ { "segment": { "startIndex": 0, "endIndex": 65, "text": "Chicago weather changes rapidly, so layers let you adjust easily." }, "groundingChunkIndices": [0], "confidenceScores": [0.99] } ] } } ``` The Google Vertex provider does not yet support [dynamic retrieval mode and threshold](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#dynamic-retrieval). ### Sources When you use [Search Grounding](#search-grounding), the model will include sources in the response. You can access them using the `sources` property of the result: ```ts import { vertex } from '@ai-sdk/google-vertex'; import { generateText } from 'ai'; const { sources } = await generateText({ model: vertex('gemini-1.5-pro', { useSearchGrounding: true }), prompt: 'List the top 5 San Francisco news from the past week.', }); ``` ### Safety Ratings The safety ratings provide insight into the safety of the model's response. See [Google Vertex AI documentation on configuring safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters). Example response excerpt: ```json { "safetyRatings": [ { "category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE", "probabilityScore": 0.11027937, "severity": "HARM_SEVERITY_LOW", "severityScore": 0.28487435 }, { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "HIGH", "blocked": true, "probabilityScore": 0.95422274, "severity": "HARM_SEVERITY_MEDIUM", "severityScore": 0.43398145 }, { "category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE", "probabilityScore": 0.11085559, "severity": "HARM_SEVERITY_NEGLIGIBLE", "severityScore": 0.19027223 }, { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE", "probabilityScore": 0.22901751, "severity": "HARM_SEVERITY_NEGLIGIBLE", "severityScore": 0.09089675 } ] } ``` For more details, see the [Google Vertex AI documentation on grounding with Google Search](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#ground-to-search). ### Troubleshooting #### Schema Limitations The Google Vertex API uses a subset of the OpenAPI 3.0 schema, which does not support features such as unions. The errors that you get in this case look like this: `GenerateContentRequest.generation_config.response_schema.properties[occupation].type: must be specified` By default, structured outputs are enabled (and for tool calling they are required). You can disable structured outputs for object generation as a workaround: ```ts highlight="3,8" const result = await generateObject({ model: vertex('gemini-1.5-pro', { structuredOutputs: false, }), schema: z.object({ name: z.string(), age: z.number(), contact: z.union([ z.object({ type: z.literal('email'), value: z.string(), }), z.object({ type: z.literal('phone'), value: z.string(), }), ]), }), prompt: 'Generate an example person for testing.', }); ``` The following Zod features are known to not work with Google Vertex: - `z.union` - `z.record` ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `gemini-2.0-flash-001` | | | | | | `gemini-2.0-flash-exp` | | | | | | `gemini-1.5-flash` | | | | | | `gemini-1.5-pro` | | | | | The table above lists popular models. 
Please see the [Google Vertex AI docs](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#supported-models) for a full list of available models. You can also pass any available provider model ID as a string if needed. ### Embedding Models You can create models that call the Google Vertex AI embeddings API using the `.textEmbeddingModel()` factory method: ```ts const model = vertex.textEmbeddingModel('text-embedding-004'); ``` Google Vertex AI embedding models support additional settings. You can pass them as an options argument: ```ts const model = vertex.textEmbeddingModel('text-embedding-004', { outputDimensionality: 512, // optional, number of dimensions for the embedding }); ``` The following optional settings are available for Google Vertex AI embedding models: - **outputDimensionality**: _number_ Optional reduced dimension for the output embedding. If set, excessive values in the output embedding are truncated from the end. #### Model Capabilities | Model | Max Values Per Call | Parallel Calls | | -------------------- | ------------------- | ------------------- | | `text-embedding-004` | 2048 | | The table above lists popular models. You can also pass any available provider model ID as a string if needed. ### Image Models You can create [Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) models that call the [Imagen on Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images) using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ```ts import { vertex } from '@ai-sdk/google-vertex'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: vertex.image('imagen-3.0-generate-001'), prompt: 'A futuristic cityscape at sunset', aspectRatio: '16:9', }); ``` Imagen models do not support the `size` parameter. Use the `aspectRatio` parameter instead. #### Model Capabilities | Model | Aspect Ratios | | ------------------------------ | ------------------------- | | `imagen-3.0-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 | | `imagen-3.0-fast-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 | ## Google Vertex Anthropic Provider Usage The Google Vertex Anthropic provider for the [AI SDK](https://sdk.vercel.ai/docs) offers support for Anthropic's Claude models through the Google Vertex AI APIs. This section provides details on how to set up and use the Google Vertex Anthropic provider. ### Provider Instance You can import the default provider instance `vertexAnthropic` from `@ai-sdk/google-vertex/anthropic`: ```typescript import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; ``` If you need a customized setup, you can import `createVertexAnthropic` from `@ai-sdk/google-vertex/anthropic` and create a provider instance with your settings: ```typescript import { createVertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; const vertexAnthropic = createVertexAnthropic({ project: 'my-project', // optional location: 'us-central1', // optional }); ``` #### Node.js Runtime For Node.js environments, the Google Vertex Anthropic provider supports all standard Google Cloud authentication options through the `google-auth-library`.
You can customize the authentication options by passing them to the `createVertexAnthropic` function: ```typescript import { createVertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; const vertexAnthropic = createVertexAnthropic({ googleAuthOptions: { credentials: { client_email: 'my-email', private_key: 'my-private-key', }, }, }); ``` ##### Optional Provider Settings You can use the following optional settings to customize the Google Vertex Anthropic provider instance: - **project** _string_ The Google Cloud project ID that you want to use for the API calls. It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default. - **location** _string_ The Google Cloud location that you want to use for the API calls, e.g. `us-central1`. It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default. - **googleAuthOptions** _object_ Optional. The Authentication options used by the [Google Auth Library](https://github.com/googleapis/google-auth-library-nodejs/). See also the [GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/08978822e1b7b5961f0e355df51d738e012be392/src/auth/googleauth.ts#L87C18-L87C35) interface. - **authClient** _object_ An `AuthClient` to use. - **keyFilename** _string_ Path to a .json, .pem, or .p12 key file. - **keyFile** _string_ Path to a .json, .pem, or .p12 key file. - **credentials** _object_ Object containing client_email and private_key properties, or the external account client options. - **clientOptions** _object_ Options object passed to the constructor of the client. - **scopes** _string | string[]_ Required scopes for the desired API request. - **projectId** _string_ Your project ID. - **universeDomain** _string_ The default service domain for a given Cloud universe. - **headers** _Resolvable<Record<string, string | undefined>>_ Headers to include in the requests. Can be provided in multiple formats: - A record of header key-value pairs: `Record<string, string | undefined>` - A function that returns headers: `() => Record<string, string | undefined>` - An async function that returns headers: `async () => Record<string, string | undefined>` - A promise that resolves to headers: `Promise<Record<string, string | undefined>>` - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. #### Edge Runtime Edge runtimes (like Vercel Edge Functions and Cloudflare Workers) are lightweight JavaScript environments that run closer to users at the network edge. They only provide a subset of the standard Node.js APIs. For example, direct file system access is not available, and many Node.js-specific libraries (including the standard Google Auth library) are not compatible. The Edge runtime version of the Google Vertex Anthropic provider supports Google's [Application Default Credentials](https://github.com/googleapis/google-auth-library-nodejs?tab=readme-ov-file#application-default-credentials) through environment variables. The values can be obtained from a json credentials file from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).
For Edge runtimes, you can import the provider instance from `@ai-sdk/google-vertex/anthropic/edge`: ```typescript import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic/edge'; ``` To customize the setup, use `createVertexAnthropic` from the same module: ```typescript import { createVertexAnthropic } from '@ai-sdk/google-vertex/anthropic/edge'; const vertexAnthropic = createVertexAnthropic({ project: 'my-project', // optional location: 'us-central1', // optional }); ``` For Edge runtime authentication, set these environment variables from your Google Default Application Credentials JSON file: - `GOOGLE_CLIENT_EMAIL` - `GOOGLE_PRIVATE_KEY` - `GOOGLE_PRIVATE_KEY_ID` (optional) ##### Optional Provider Settings You can use the following optional settings to customize the provider instance: - **project** _string_ The Google Cloud project ID that you want to use for the API calls. It uses the `GOOGLE_VERTEX_PROJECT` environment variable by default. - **location** _string_ The Google Cloud location that you want to use for the API calls, e.g. `us-central1`. It uses the `GOOGLE_VERTEX_LOCATION` environment variable by default. - **googleCredentials** _object_ Optional. The credentials used by the Edge provider for authentication. These credentials are typically set through environment variables and are derived from a service account JSON file. - **clientEmail** _string_ The client email from the service account JSON file. Defaults to the contents of the `GOOGLE_CLIENT_EMAIL` environment variable. - **privateKey** _string_ The private key from the service account JSON file. Defaults to the contents of the `GOOGLE_PRIVATE_KEY` environment variable. - **privateKeyId** _string_ The private key ID from the service account JSON file (optional). Defaults to the contents of the `GOOGLE_PRIVATE_KEY_ID` environment variable. - **headers** _Resolvable<Record<string, string | undefined>>_ Headers to include in the requests. Can be provided in multiple formats: - A record of header key-value pairs: `Record<string, string | undefined>` - A function that returns headers: `() => Record<string, string | undefined>` - An async function that returns headers: `async () => Record<string, string | undefined>` - A promise that resolves to headers: `Promise<Record<string, string | undefined>>` - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ### Language Models You can create models that call the [Anthropic Messages API](https://docs.anthropic.com/claude/reference/messages_post) using the provider instance. The first argument is the model id, e.g. `claude-3-haiku-20240307`. Some models have multi-modal capabilities. ```ts const model = vertexAnthropic('claude-3-haiku-20240307'); ``` You can use Anthropic language models to generate text with the `generateText` function: ```ts import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; import { generateText } from 'ai'; const { text } = await generateText({ model: vertexAnthropic('claude-3-haiku-20240307'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Anthropic language models can also be used in the `streamText`, `generateObject`, and `streamObject` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). The Anthropic API returns streaming tool calls all at once after a delay.
This causes the `streamObject` function to generate the object fully after a delay instead of streaming it incrementally. The following optional settings are available for Anthropic models: - `sendReasoning` _boolean_ Optional. Include reasoning content in requests sent to the model. Defaults to `true`. If you are experiencing issues with the model handling requests involving reasoning content, you can set this to `false` to omit them from the request. ### Reasoning Anthropic has reasoning support for the `claude-3-7-sonnet@20250219` model. You can enable it using the `thinking` provider option and specifying a thinking budget in tokens. ```ts import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; import { generateText } from 'ai'; const { text, reasoning, reasoningDetails } = await generateText({ model: vertexAnthropic('claude-3-7-sonnet@20250219'), prompt: 'How many people will live in the world in 2040?', providerOptions: { anthropic: { thinking: { type: 'enabled', budgetTokens: 12000 }, }, }, }); console.log(reasoning); // reasoning text console.log(reasoningDetails); // reasoning details including redacted reasoning console.log(text); // text response ``` See [AI SDK UI: Chatbot](/docs/ai-sdk-ui/chatbot#reasoning) for more details on how to integrate reasoning into your chatbot. #### Cache Control Anthropic cache control is in a Pre-Generally Available (GA) state on Google Vertex. For more see [Google Vertex Anthropic cache control documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude-prompt-caching). In the messages and message parts, you can use the `providerOptions` property to set cache control breakpoints. You need to set the `anthropic` property in the `providerOptions` object to `{ cacheControl: { type: 'ephemeral' } }` to set a cache control breakpoint. The cache creation input tokens are then returned in the `providerMetadata` object for `generateText` and `generateObject`, again under the `anthropic` property. When you use `streamText` or `streamObject`, the response contains a promise that resolves to the metadata. Alternatively you can receive it in the `onFinish` callback. ```ts highlight="8,18-20,29-30" import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic'; import { generateText } from 'ai'; const errorMessage = '... long error message ...'; const result = await generateText({ model: vertexAnthropic('claude-3-5-sonnet-20240620'), messages: [ { role: 'user', content: [ { type: 'text', text: 'You are a JavaScript expert.' }, { type: 'text', text: `Error message: ${errorMessage}`, providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } }, }, }, { type: 'text', text: 'Explain the error message.' }, ], }, ], }); console.log(result.text); console.log(result.providerMetadata?.anthropic); // e.g. 
{ cacheCreationInputTokens: 2118, cacheReadInputTokens: 0 } ``` You can also use cache control on system messages by providing multiple system messages at the head of your messages array: ```ts highlight="3,9-11" const result = await generateText({ model: vertexAnthropic('claude-3-5-sonnet-20240620'), messages: [ { role: 'system', content: 'Cached system message part', providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } }, }, }, { role: 'system', content: 'Uncached system message part', }, { role: 'user', content: 'User prompt', }, ], }); ``` For more on prompt caching with Anthropic, see [Google Vertex AI's Claude prompt caching documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude-prompt-caching) and [Anthropic's Cache Control documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching). ### Computer Use Anthropic provides three built-in tools that can be used to interact with external systems: 1. **Bash Tool**: Allows running bash commands. 2. **Text Editor Tool**: Provides functionality for viewing and editing text files. 3. **Computer Tool**: Enables control of keyboard and mouse actions on a computer. They are available via the `tools` property of the provider instance. For more background see [Anthropic's Computer Use documentation](https://docs.anthropic.com/en/docs/build-with-claude/computer-use). #### Bash Tool The Bash Tool allows running bash commands. Here's how to create and use it: ```ts const bashTool = vertexAnthropic.tools.bash_20241022({ execute: async ({ command, restart }) => { // Implement your bash command execution logic here // Return the result of the command execution }, }); ``` Parameters: - `command` (string): The bash command to run. Required unless the tool is being restarted. - `restart` (boolean, optional): Specifying true will restart this tool. #### Text Editor Tool The Text Editor Tool provides functionality for viewing and editing text files: ```ts const textEditorTool = vertexAnthropic.tools.textEditor_20241022({ execute: async ({ command, path, file_text, insert_line, new_str, old_str, view_range, }) => { // Implement your text editing logic here // Return the result of the text editing operation }, }); ``` Parameters: - `command` ('view' | 'create' | 'str_replace' | 'insert' | 'undo_edit'): The command to run. - `path` (string): Absolute path to file or directory, e.g. `/repo/file.py` or `/repo`. - `file_text` (string, optional): Required for `create` command, with the content of the file to be created. - `insert_line` (number, optional): Required for `insert` command. The line number after which to insert the new string. - `new_str` (string, optional): New string for `str_replace` or `insert` commands. - `old_str` (string, optional): Required for `str_replace` command, containing the string to replace. - `view_range` (number[], optional): Optional for `view` command to specify line range to show. 
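Tools created this way (for example, the bash tool above) are passed to a generation call through the standard `tools` option of the AI SDK. The following is a minimal sketch under stated assumptions: the tool key `bash`, the stubbed `execute` implementation, and the prompt are illustrative choices, not a prescribed setup.

```ts
import { vertexAnthropic } from '@ai-sdk/google-vertex/anthropic';
import { generateText } from 'ai';

// Minimal sketch: a bash tool with a stubbed execute function.
const bashTool = vertexAnthropic.tools.bash_20241022({
  execute: async ({ command }) => {
    // A real implementation would run the command and return its output.
    return `(stub) would run: ${command}`;
  },
});

const { text } = await generateText({
  model: vertexAnthropic('claude-3-5-sonnet-v2@20241022'),
  // The tool key 'bash' is an illustrative assumption.
  tools: { bash: bashTool },
  // Allow a follow-up step so the tool result can be fed back to the model.
  maxSteps: 2,
  prompt: 'Show the contents of the current directory.',
});
```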
#### Computer Tool The Computer Tool enables control of keyboard and mouse actions on a computer: ```ts const computerTool = vertexAnthropic.tools.computer_20241022({ displayWidthPx: 1920, displayHeightPx: 1080, displayNumber: 0, // Optional, for X11 environments execute: async ({ action, coordinate, text }) => { // Implement your computer control logic here // Return the result of the action // Example code: switch (action) { case 'screenshot': { // multipart result: return { type: 'image', data: fs .readFileSync('./data/screenshot-editor.png') .toString('base64'), }; } default: { console.log('Action:', action); console.log('Coordinate:', coordinate); console.log('Text:', text); return `executed ${action}`; } } }, // map to tool result content for LLM consumption: experimental_toToolResultContent(result) { return typeof result === 'string' ? [{ type: 'text', text: result }] : [{ type: 'image', data: result.data, mimeType: 'image/png' }]; }, }); ``` Parameters: - `action` ('key' | 'type' | 'mouse_move' | 'left_click' | 'left_click_drag' | 'right_click' | 'middle_click' | 'double_click' | 'screenshot' | 'cursor_position'): The action to perform. - `coordinate` (number[], optional): Required for `mouse_move` and `left_click_drag` actions. Specifies the (x, y) coordinates. - `text` (string, optional): Required for `type` and `key` actions. These tools can be used in conjunction with the `claude-3-5-sonnet-v2@20241022` model to enable more complex interactions and tasks. ### Model Capabilities The latest Anthropic model list on Vertex AI is available [here](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#model-list). See also [Anthropic Model Comparison](https://docs.anthropic.com/en/docs/about-claude/models#model-comparison). | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | Computer Use | | ------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `claude-3-7-sonnet@20250219` | | | | | | | `claude-3-5-sonnet-v2@20241022` | | | | | | | `claude-3-5-sonnet@20240620` | | | | | | | `claude-3-5-haiku@20241022` | | | | | | | `claude-3-sonnet@20240229` | | | | | | | `claude-3-haiku@20240307` | | | | | | | `claude-3-opus@20240229` | | | | | | The table above lists popular models. You can also pass any available provider model ID as a string if needed. --- title: Mistral AI description: Learn how to use Mistral. --- # Mistral AI Provider The [Mistral AI](https://mistral.ai/) provider contains language model support for the Mistral chat API. ## Setup The Mistral provider is available in the `@ai-sdk/mistral` module. You can install it with ## Provider Instance You can import the default provider instance `mistral` from `@ai-sdk/mistral`: ```ts import { mistral } from '@ai-sdk/mistral'; ``` If you need a customized setup, you can import `createMistral` from `@ai-sdk/mistral` and create a provider instance with your settings: ```ts import { createMistral } from '@ai-sdk/mistral'; const mistral = createMistral({ // custom settings }); ``` You can use the following optional settings to customize the Mistral provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.mistral.ai/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `MISTRAL_API_KEY` environment variable. 
- **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create models that call the [Mistral chat API](https://docs.mistral.ai/api/#operation/createChatCompletion) using a provider instance. The first argument is the model id, e.g. `mistral-large-latest`. Some Mistral chat models support tool calls. ```ts const model = mistral('mistral-large-latest'); ``` Mistral chat models also support additional model settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = mistral('mistral-large-latest', { safePrompt: true, // optional safety prompt injection }); ``` The following optional settings are available for Mistral models: - **safePrompt** _boolean_ Whether to inject a safety prompt before all conversations. Defaults to `false`. ### Document OCR Mistral chat models support document OCR for PDF files. You can optionally set image and page limits using the provider options. ```ts import { mistral } from '@ai-sdk/mistral'; import { generateText } from 'ai'; const result = await generateText({ model: mistral('mistral-small-latest'), messages: [ { role: 'user', content: [ { type: 'text', text: 'What is an embedding model according to this document?', }, { type: 'file', data: new URL( 'https://github.com/vercel/ai/blob/main/examples/ai-core/data/ai.pdf?raw=true', ), mimeType: 'application/pdf', }, ], }, ], // optional settings: providerOptions: { mistral: { documentImageLimit: 8, documentPageLimit: 64, }, }, }); ``` ### Example You can use Mistral language models to generate text with the `generateText` function: ```ts import { mistral } from '@ai-sdk/mistral'; import { generateText } from 'ai'; const { text } = await generateText({ model: mistral('mistral-large-latest'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Mistral language models can also be used in the `streamText`, `generateObject`, `streamObject`, and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `pixtral-large-latest` | | | | | | `mistral-large-latest` | | | | | | `mistral-small-latest` | | | | | | `ministral-3b-latest` | | | | | | `ministral-8b-latest` | | | | | | `pixtral-12b-2409` | | | | | The table above lists popular models. Please see the [Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list of available models. You can also pass any available provider model ID as a string if needed. ## Embedding Models You can create models that call the [Mistral embeddings API](https://docs.mistral.ai/api/#operation/createEmbedding) using the `.embedding()` factory method. ```ts const model = mistral.embedding('mistral-embed'); ``` ### Model Capabilities | Model | Default Dimensions | | --------------- | ------------------ | | `mistral-embed` | 1024 | --- title: xAI Grok description: Learn how to use xAI Grok.
--- # xAI Grok Provider The [xAI Grok](https://x.ai) provider contains language model support for the [xAI API](https://x.ai/api). ## Setup The xAI Grok provider is available via the `@ai-sdk/xai` module. You can install it with ## Provider Instance You can import the default provider instance `xai` from `@ai-sdk/xai`: ```ts import { xai } from '@ai-sdk/xai'; ``` If you need a customized setup, you can import `createXai` from `@ai-sdk/xai` and create a provider instance with your settings: ```ts import { createXai } from '@ai-sdk/xai'; const xai = createXai({ apiKey: 'your-api-key', }); ``` You can use the following optional settings to customize the xAI provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.x.ai/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `XAI_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create [xAI models](https://console.x.ai) using a provider instance. The first argument is the model id, e.g. `grok-beta`. ```ts const model = xai('grok-beta'); ``` ### Example You can use xAI language models to generate text with the `generateText` function: ```ts import { xai } from '@ai-sdk/xai'; import { generateText } from 'ai'; const { text } = await generateText({ model: xai('grok-2-1212'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` xAI language models can also be used in the `streamText`, `generateObject`, `streamObject`, and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### Chat Models xAI chat models also support some model-specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings). You can pass them as an options argument: ```ts const model = xai('grok-2-1212', { user: 'test-user', // optional unique user identifier }); ``` The following optional settings are available for xAI chat models: - **user** _string_ A unique identifier representing your end-user, which can help xAI to monitor and detect abuse. ## Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | -------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `grok-2-1212` | | | | | | `grok-2-vision-1212` | | | | | | `grok-beta` | | | | | | `grok-vision-beta` | | | | | The table above lists popular models. Please see the [xAI docs](https://docs.x.ai/docs#models) for a full list of available models. You can also pass any available provider model ID as a string if needed. --- title: Together.ai description: Learn how to use Together.ai's models with the AI SDK. --- # Together.ai Provider The [Together.ai](https://together.ai) provider contains support for 200+ open-source models through the [Together.ai API](https://docs.together.ai/reference). ## Setup The Together.ai provider is available via the `@ai-sdk/togetherai` module.
You can install it with ## Provider Instance You can import the default provider instance `togetherai` from `@ai-sdk/togetherai`: ```ts import { togetherai } from '@ai-sdk/togetherai'; ``` If you need a customized setup, you can import `createTogetherAI` from `@ai-sdk/togetherai` and create a provider instance with your settings: ```ts import { createTogetherAI } from '@ai-sdk/togetherai'; const togetherai = createTogetherAI({ apiKey: process.env.TOGETHER_AI_API_KEY ?? '', }); ``` You can use the following optional settings to customize the Together.ai provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.together.xyz/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `TOGETHER_AI_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create [Together.ai models](https://docs.together.ai/docs/serverless-models) using a provider instance. The first argument is the model id, e.g. `google/gemma-2-9b-it`. ```ts const model = togetherai('google/gemma-2-9b-it'); ``` ### Reasoning Models Together.ai exposes the thinking of `deepseek-ai/DeepSeek-R1` in the generated text using the `<think>` tag. You can use the `extractReasoningMiddleware` to extract this reasoning and expose it as a `reasoning` property on the result: ```ts import { togetherai } from '@ai-sdk/togetherai'; import { wrapLanguageModel, extractReasoningMiddleware } from 'ai'; const enhancedModel = wrapLanguageModel({ model: togetherai('deepseek-ai/DeepSeek-R1'), middleware: extractReasoningMiddleware({ tagName: 'think' }), }); ``` You can then use that enhanced model in functions like `generateText` and `streamText`. ### Example You can use Together.ai language models to generate text with the `generateText` function: ```ts import { togetherai } from '@ai-sdk/togetherai'; import { generateText } from 'ai'; const { text } = await generateText({ model: togetherai('meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Together.ai language models can also be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). The Together.ai provider also supports [completion models](https://docs.together.ai/docs/serverless-models#language-models) via `togetherai.completionModel()` and [embedding models](https://docs.together.ai/docs/serverless-models#embedding-models) via `togetherai.textEmbeddingModel()`, following the same usage pattern as the examples above.
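For instance, here is a minimal sketch of the embedding factory method used with the `embed` function from the AI SDK. The embedding model ID is an illustrative assumption; check the Together.ai model list for the embedding models that are actually available to you.

```ts
import { togetherai } from '@ai-sdk/togetherai';
import { embed } from 'ai';

// Minimal sketch: embed a single value with a Together.ai embedding model.
const { embedding } = await embed({
  model: togetherai.textEmbeddingModel(
    'togethercomputer/m2-bert-80M-8k-retrieval', // illustrative model ID
  ),
  value: 'sunny day at the beach',
});
```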
## Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `meta-llama/Meta-Llama-3.3-70B-Instruct-Turbo` | | | | | | `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` | | | | | | `mistralai/Mixtral-8x22B-Instruct-v0.1` | | | | | | `mistralai/Mistral-7B-Instruct-v0.3` | | | | | | `deepseek-ai/DeepSeek-V3` | | | | | | `google/gemma-2b-it` | | | | | | `Qwen/Qwen2.5-72B-Instruct-Turbo` | | | | | | `databricks/dbrx-instruct` | | | | | The table above lists popular models. Please see the [Together.ai docs](https://docs.together.ai/docs/serverless-models) for a full list of available models. You can also pass any available provider model ID as a string if needed. ## Image Models You can create Together.ai image models using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ```ts import { togetherai } from '@ai-sdk/togetherai'; import { experimental_generateImage as generateImage } from 'ai'; const { images } = await generateImage({ model: togetherai.image('black-forest-labs/FLUX.1-dev'), prompt: 'A delighted resplendent quetzal mid flight amidst raindrops', }); ``` You can pass optional provider-specific request parameters using the `providerOptions` argument. ```ts import { togetherai } from '@ai-sdk/togetherai'; import { experimental_generateImage as generateImage } from 'ai'; const { images } = await generateImage({ model: togetherai.image('black-forest-labs/FLUX.1-dev'), prompt: 'A delighted resplendent quetzal mid flight amidst raindrops', size: '512x512', // Optional additional provider-specific request parameters providerOptions: { togetherai: { steps: 40, }, }, }); ``` For a complete list of available provider-specific options, see the [Together.ai Image Generation API Reference](https://docs.together.ai/reference/post_images-generations). ### Model Capabilities Together.ai image models support various image dimensions that vary by model. Common sizes include 512x512, 768x768, and 1024x1024, with some models supporting up to 1792x1792. The default size is 1024x1024. | Available Models | | ------------------------------------------ | | `stabilityai/stable-diffusion-xl-base-1.0` | | `black-forest-labs/FLUX.1-dev` | | `black-forest-labs/FLUX.1-dev-lora` | | `black-forest-labs/FLUX.1-schnell` | | `black-forest-labs/FLUX.1-canny` | | `black-forest-labs/FLUX.1-depth` | | `black-forest-labs/FLUX.1-redux` | | `black-forest-labs/FLUX.1.1-pro` | | `black-forest-labs/FLUX.1-pro` | | `black-forest-labs/FLUX.1-schnell-Free` | Please see the [Together.ai models page](https://docs.together.ai/docs/serverless-models#image-models) for a full list of available image models and their capabilities. --- title: Cohere description: Learn how to use the Cohere provider for the AI SDK. --- # Cohere Provider The [Cohere](https://cohere.com/) provider contains language and embedding model support for the Cohere chat API. ## Setup The Cohere provider is available in the `@ai-sdk/cohere` module.
You can install it with ## Provider Instance You can import the default provider instance `cohere` from `@ai-sdk/cohere`: ```ts import { cohere } from '@ai-sdk/cohere'; ``` If you need a customized setup, you can import `createCohere` from `@ai-sdk/cohere` and create a provider instance with your settings: ```ts import { createCohere } from '@ai-sdk/cohere'; const cohere = createCohere({ // custom settings }); ``` You can use the following optional settings to customize the Cohere provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.cohere.com/v2`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `COHERE_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create models that call the [Cohere chat API](https://docs.cohere.com/v2/docs/chat-api) using a provider instance. The first argument is the model id, e.g. `command-r-plus`. Some Cohere chat models support tool calls. ```ts const model = cohere('command-r-plus'); ``` ### Example You can use Cohere language models to generate text with the `generateText` function: ```ts import { cohere } from '@ai-sdk/cohere'; import { generateText } from 'ai'; const { text } = await generateText({ model: cohere('command-r-plus'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Cohere language models can also be used in the `streamText` function (see [AI SDK Core](/docs/ai-sdk-core)). ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `command-r-plus` | | | | | | `command-r` | | | | | | `command-a-03-2025` | | | | | | `command` | | | | | | `command-light` | | | | | The table above lists popular models. Please see the [Cohere docs](https://docs.cohere.com/v2/docs/models#command) for a full list of available models. You can also pass any available provider model ID as a string if needed. ## Embedding Models You can create models that call the [Cohere embed API](https://docs.cohere.com/v2/reference/embed) using the `.embedding()` factory method. ```ts const model = cohere.embedding('embed-english-v3.0'); ``` Cohere embedding models support additional settings. You can pass them as an options argument: ```ts const model = cohere.embedding('embed-english-v3.0', { inputType: 'search_document', }); ``` The following optional settings are available for Cohere embedding models: - **inputType** _'search_document' | 'search_query' | 'classification' | 'clustering'_ Specifies the type of input passed to the model. Default is `search_query`. - `search_document`: Used for embeddings stored in a vector database for search use-cases. - `search_query`: Used for embeddings of search queries run against a vector DB to find relevant documents. - `classification`: Used for embeddings passed through a text classifier. - `clustering`: Used for embeddings run through a clustering algorithm. 
- **truncate** _'NONE' | 'START' | 'END'_ Specifies how the API will handle inputs longer than the maximum token length. Default is `END`. - `NONE`: If selected, inputs that exceed the maximum input token length will return an error. - `START`: Will discard the start of the input until the remaining input is exactly the maximum input token length for the model. - `END`: Will discard the end of the input until the remaining input is exactly the maximum input token length for the model. ### Model Capabilities | Model | Embedding Dimensions | | ------------------------------- | -------------------- | | `embed-english-v3.0` | 1024 | | `embed-multilingual-v3.0` | 1024 | | `embed-english-light-v3.0` | 384 | | `embed-multilingual-light-v3.0` | 384 | | `embed-english-v2.0` | 4096 | | `embed-english-light-v2.0` | 1024 | | `embed-multilingual-v2.0` | 768 | --- title: Fireworks description: Learn how to use Fireworks models with the AI SDK. --- # Fireworks Provider [Fireworks](https://fireworks.ai/) is a platform for running and testing LLMs through their [API](https://readme.fireworks.ai/). ## Setup The Fireworks provider is available via the `@ai-sdk/fireworks` module. You can install it with ## Provider Instance You can import the default provider instance `fireworks` from `@ai-sdk/fireworks`: ```ts import { fireworks } from '@ai-sdk/fireworks'; ``` If you need a customized setup, you can import `createFireworks` from `@ai-sdk/fireworks` and create a provider instance with your settings: ```ts import { createFireworks } from '@ai-sdk/fireworks'; const fireworks = createFireworks({ apiKey: process.env.FIREWORKS_API_KEY ?? '', }); ``` You can use the following optional settings to customize the Fireworks provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.fireworks.ai/inference/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `FIREWORKS_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. ## Language Models You can create [Fireworks models](https://fireworks.ai/models) using a provider instance. The first argument is the model id, e.g. `accounts/fireworks/models/firefunction-v1`: ```ts const model = fireworks('accounts/fireworks/models/firefunction-v1'); ``` ### Reasoning Models Fireworks exposes the thinking of `deepseek-r1` in the generated text using the `<think>` tag. You can use the `extractReasoningMiddleware` to extract this reasoning and expose it as a `reasoning` property on the result: ```ts import { fireworks } from '@ai-sdk/fireworks'; import { wrapLanguageModel, extractReasoningMiddleware } from 'ai'; const enhancedModel = wrapLanguageModel({ model: fireworks('accounts/fireworks/models/deepseek-r1'), middleware: extractReasoningMiddleware({ tagName: 'think' }), }); ``` You can then use that enhanced model in functions like `generateText` and `streamText`.
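For example, a minimal sketch of calling the wrapped model and reading the extracted reasoning (the prompt is illustrative):

```ts
import { fireworks } from '@ai-sdk/fireworks';
import {
  generateText,
  wrapLanguageModel,
  extractReasoningMiddleware,
} from 'ai';

const enhancedModel = wrapLanguageModel({
  model: fireworks('accounts/fireworks/models/deepseek-r1'),
  middleware: extractReasoningMiddleware({ tagName: 'think' }),
});

// The extracted reasoning is exposed separately from the final text.
const { text, reasoning } = await generateText({
  model: enhancedModel,
  prompt: 'How many prime numbers are there below 100?',
});

console.log(reasoning); // extracted <think> content
console.log(text); // final answer without the reasoning tags
```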
### Example You can use Fireworks language models to generate text with the `generateText` function: ```ts import { fireworks } from '@ai-sdk/fireworks'; import { generateText } from 'ai'; const { text } = await generateText({ model: fireworks('accounts/fireworks/models/firefunction-v1'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` Fireworks language models can also be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### Completion Models You can create models that call the Fireworks completions API using the `.completion()` factory method: ```ts const model = fireworks.completion('accounts/fireworks/models/firefunction-v1'); ``` ### Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `accounts/fireworks/models/deepseek-r1` | | | | | | `accounts/fireworks/models/deepseek-v3` | | | | | | `accounts/fireworks/models/llama-v3p1-405b-instruct` | | | | | | `accounts/fireworks/models/llama-v3p1-8b-instruct` | | | | | | `accounts/fireworks/models/llama-v3p2-3b-instruct` | | | | | | `accounts/fireworks/models/llama-v3p3-70b-instruct` | | | | | | `accounts/fireworks/models/mixtral-8x7b-instruct-hf` | | | | | | `accounts/fireworks/models/mixtral-8x22b-instruct` | | | | | | `accounts/fireworks/models/qwen2p5-coder-32b-instruct` | | | | | | `accounts/fireworks/models/llama-v3p2-11b-vision-instruct` | | | | | | `accounts/fireworks/models/yi-large` | | | | | The table above lists popular models. Please see the [Fireworks models page](https://fireworks.ai/models) for a full list of available models. ## Embedding Models You can create models that call the Fireworks embeddings API using the `.textEmbeddingModel()` factory method: ```ts const model = fireworks.textEmbeddingModel( 'accounts/fireworks/models/nomic-embed-text-v1', ); ``` ## Image Models You can create Fireworks image models using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ```ts import { fireworks } from '@ai-sdk/fireworks'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: fireworks.image('accounts/fireworks/models/flux-1-dev-fp8'), prompt: 'A futuristic cityscape at sunset', aspectRatio: '16:9', }); ``` Model support for `size` and `aspectRatio` parameters varies. See the [Model Capabilities](#model-capabilities-1) section below for supported dimensions, or check the model's documentation on [Fireworks models page](https://fireworks.ai/models) for more details. 
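For a size-based model, a minimal sketch would pass `size` instead of `aspectRatio`. The model ID below is taken from the capability table that follows; treat the exact dimensions as an assumption to verify against the model's documentation.

```ts
import { fireworks } from '@ai-sdk/fireworks';
import { experimental_generateImage as generateImage } from 'ai';

// Minimal sketch: a model that expects explicit dimensions rather than an aspect ratio.
const { image } = await generateImage({
  model: fireworks.image(
    'accounts/fireworks/models/playground-v2-5-1024px-aesthetic',
  ),
  prompt: 'A futuristic cityscape at sunset',
  size: '1024x1024',
});
```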
### Model Capabilities For all models supporting aspect ratios, the following aspect ratios are supported: `1:1 (default), 2:3, 3:2, 4:5, 5:4, 16:9, 9:16, 9:21, 21:9` For all models supporting size, the following sizes are supported: `640 x 1536, 768 x 1344, 832 x 1216, 896 x 1152, 1024x1024 (default), 1152 x 896, 1216 x 832, 1344 x 768, 1536 x 640` | Model | Dimensions Specification | | ------------------------------------------------------------ | ------------------------ | | `accounts/fireworks/models/flux-1-dev-fp8` | Aspect Ratio | | `accounts/fireworks/models/flux-1-schnell-fp8` | Aspect Ratio | | `accounts/fireworks/models/playground-v2-5-1024px-aesthetic` | Size | | `accounts/fireworks/models/japanese-stable-diffusion-xl` | Size | | `accounts/fireworks/models/playground-v2-1024px-aesthetic` | Size | | `accounts/fireworks/models/SSD-1B` | Size | | `accounts/fireworks/models/stable-diffusion-xl-1024-v1-0` | Size | For more details, see the [Fireworks models page](https://fireworks.ai/models). #### Stability AI Models Fireworks also presents several Stability AI models backed by Stability AI API keys and endpoint. The AI SDK Fireworks provider does not currently include support for these models: | Model ID | | -------------------------------------- | | `accounts/stability/models/sd3-turbo` | | `accounts/stability/models/sd3-medium` | | `accounts/stability/models/sd3` | --- title: DeepInfra description: Learn how to use DeepInfra's models with the AI SDK. --- # DeepInfra Provider The [DeepInfra](https://deepinfra.com) provider contains support for state-of-the-art models through the DeepInfra API, including Llama 3, Mixtral, Qwen, and many other popular open-source models. ## Setup The DeepInfra provider is available via the `@ai-sdk/deepinfra` module. You can install it with: ## Provider Instance You can import the default provider instance `deepinfra` from `@ai-sdk/deepinfra`: ```ts import { deepinfra } from '@ai-sdk/deepinfra'; ``` If you need a customized setup, you can import `createDeepInfra` from `@ai-sdk/deepinfra` and create a provider instance with your settings: ```ts import { createDeepInfra } from '@ai-sdk/deepinfra'; const deepinfra = createDeepInfra({ apiKey: process.env.DEEPINFRA_API_KEY ?? '', }); ``` You can use the following optional settings to customize the DeepInfra provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.deepinfra.com/v1/openai`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `DEEPINFRA_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create language models using a provider instance. 
The first argument is the model ID, for example: ```ts import { deepinfra } from '@ai-sdk/deepinfra'; import { generateText } from 'ai'; const { text } = await generateText({ model: deepinfra('meta-llama/Meta-Llama-3.1-70B-Instruct'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` DeepInfra language models can also be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ## Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ---------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `meta-llama/Llama-3.3-70B-Instruct-Turbo` | | | | | | `meta-llama/Llama-3.3-70B-Instruct` | | | | | | `meta-llama/Meta-Llama-3.1-405B-Instruct` | | | | | | `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` | | | | | | `meta-llama/Meta-Llama-3.1-70B-Instruct` | | | | | | `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` | | | | | | `meta-llama/Meta-Llama-3.1-8B-Instruct` | | | | | | `meta-llama/Llama-3.2-11B-Vision-Instruct` | | | | | | `meta-llama/Llama-3.2-90B-Vision-Instruct` | | | | | | `mistralai/Mixtral-8x7B-Instruct-v0.1` | | | | | | `deepseek-ai/DeepSeek-V3` | | | | | | `nvidia/Llama-3.1-Nemotron-70B-Instruct` | | | | | | `Qwen/Qwen2-7B-Instruct` | | | | | | `Qwen/Qwen2.5-72B-Instruct` | | | | | | `Qwen/Qwen2.5-Coder-32B-Instruct` | | | | | | `Qwen/QwQ-32B-Preview` | | | | | | `google/codegemma-7b-it` | | | | | | `google/gemma-2-9b-it` | | | | | | `microsoft/WizardLM-2-8x22B` | | | | | The table above lists popular models. Please see the [DeepInfra docs](https://deepinfra.com) for a full list of available models. You can also pass any available provider model ID as a string if needed. ## Image Models You can create DeepInfra image models using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ```ts import { deepinfra } from '@ai-sdk/deepinfra'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: deepinfra.image('stabilityai/sd3.5'), prompt: 'A futuristic cityscape at sunset', aspectRatio: '16:9', }); ``` Model support for `size` and `aspectRatio` parameters varies by model. Please check the individual model documentation on [DeepInfra's models page](https://deepinfra.com/models/text-to-image) for supported options and additional parameters. 
### Model-specific options You can pass model-specific parameters using the `providerOptions.deepinfra` field: ```ts import { deepinfra } from '@ai-sdk/deepinfra'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: deepinfra.image('stabilityai/sd3.5'), prompt: 'A futuristic cityscape at sunset', aspectRatio: '16:9', providerOptions: { deepinfra: { num_inference_steps: 30, // Control the number of denoising steps (1-50) }, }, }); ``` ### Model Capabilities For models supporting aspect ratios, the following ratios are typically supported: `1:1 (default), 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21` For models supporting size parameters, dimensions must typically be: - Multiples of 32 - Width and height between 256 and 1440 pixels - Default size is 1024x1024 | Model | Dimensions Specification | Notes | | ---------------------------------- | ------------------------ | -------------------------------------------------------- | | `stabilityai/sd3.5` | Aspect Ratio | Premium quality base model, 8B parameters | | `black-forest-labs/FLUX-1.1-pro` | Size | Latest state-of-art model with superior prompt following | | `black-forest-labs/FLUX-1-schnell` | Size | Fast generation in 1-4 steps | | `black-forest-labs/FLUX-1-dev` | Size | Optimized for anatomical accuracy | | `black-forest-labs/FLUX-pro` | Size | Flagship Flux model | | `stabilityai/sd3.5-medium` | Aspect Ratio | Balanced 2.5B parameter model | | `stabilityai/sdxl-turbo` | Aspect Ratio | Optimized for fast generation | For more details and pricing information, see the [DeepInfra text-to-image models page](https://deepinfra.com/models/text-to-image). --- title: DeepSeek description: Learn how to use DeepSeek's models with the AI SDK. --- # DeepSeek Provider The [DeepSeek](https://www.deepseek.com) provider offers access to powerful language models through the DeepSeek API, including their [DeepSeek-V3 model](https://github.com/deepseek-ai/DeepSeek-V3). API keys can be obtained from the [DeepSeek Platform](https://platform.deepseek.com/api_keys). ## Setup The DeepSeek provider is available via the `@ai-sdk/deepseek` module. You can install it with: ## Provider Instance You can import the default provider instance `deepseek` from `@ai-sdk/deepseek`: ```ts import { deepseek } from '@ai-sdk/deepseek'; ``` For custom configuration, you can import `createDeepSeek` and create a provider instance with your settings: ```ts import { createDeepSeek } from '@ai-sdk/deepseek'; const deepseek = createDeepSeek({ apiKey: process.env.DEEPSEEK_API_KEY ?? '', }); ``` You can use the following optional settings to customize the DeepSeek provider instance: - **baseURL** _string_ Use a different URL prefix for API calls. The default prefix is `https://api.deepseek.com/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `DEEPSEEK_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. 
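For example, the `fetch` setting above can be used to observe outgoing requests. A minimal sketch that logs the request URL before delegating to the global `fetch` (the logging itself is illustrative):

```ts
import { createDeepSeek } from '@ai-sdk/deepseek';

const deepseek = createDeepSeek({
  apiKey: process.env.DEEPSEEK_API_KEY ?? '',
  fetch: async (input, init) => {
    // log each request before forwarding it unchanged
    console.log('DeepSeek request:', input);
    return fetch(input, init);
  },
});
```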
## Language Models You can create language models using a provider instance: ```ts import { deepseek } from '@ai-sdk/deepseek'; import { generateText } from 'ai'; const { text } = await generateText({ model: deepseek('deepseek-chat'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` DeepSeek language models can be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)). ### Reasoning DeepSeek has reasoning support for the `deepseek-reasoner` model: ```ts import { deepseek } from '@ai-sdk/deepseek'; import { generateText } from 'ai'; const { text, reasoning } = await generateText({ model: deepseek('deepseek-reasoner'), prompt: 'How many people will live in the world in 2040?', }); console.log(reasoning); console.log(text); ``` See [AI SDK UI: Chatbot](/docs/ai-sdk-ui/chatbot#reasoning) for more details on how to integrate reasoning into your chatbot. ### Cache Token Usage DeepSeek provides context caching on disk technology that can significantly reduce token costs for repeated content. You can access the cache hit/miss metrics through the `providerMetadata` property in the response: ```ts import { deepseek } from '@ai-sdk/deepseek'; import { generateText } from 'ai'; const result = await generateText({ model: deepseek('deepseek-chat'), prompt: 'Your prompt here', }); console.log(result.providerMetadata); // Example output: { deepseek: { promptCacheHitTokens: 1856, promptCacheMissTokens: 5 } } ``` The metrics include: - `promptCacheHitTokens`: Number of input tokens that were cached - `promptCacheMissTokens`: Number of input tokens that were not cached For more details about DeepSeek's caching system, see the [DeepSeek caching documentation](https://api-docs.deepseek.com/guides/kv_cache#checking-cache-hit-status). ## Model Capabilities | Model | Text Generation | Object Generation | Image Input | Tool Usage | Tool Streaming | | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | | `deepseek-chat` | | | | | | | `deepseek-reasoner` | | | | | | Please see the [DeepSeek docs](https://api-docs.deepseek.com) for a full list of available models. You can also pass any available provider model ID as a string if needed. --- title: Cerebras description: Learn how to use Cerebras's models with the AI SDK. --- # Cerebras Provider The [Cerebras](https://cerebras.ai) provider offers access to powerful language models through the Cerebras API, including their high-speed inference capabilities powered by Wafer-Scale Engines and CS-3 systems. API keys can be obtained from the [Cerebras Platform](https://cloud.cerebras.ai). ## Setup The Cerebras provider is available via the `@ai-sdk/cerebras` module. You can install it with: ## Provider Instance You can import the default provider instance `cerebras` from `@ai-sdk/cerebras`: ```ts import { cerebras } from '@ai-sdk/cerebras'; ``` For custom configuration, you can import `createCerebras` and create a provider instance with your settings: ```ts import { createCerebras } from '@ai-sdk/cerebras'; const cerebras = createCerebras({ apiKey: process.env.CEREBRAS_API_KEY ?? '', }); ``` You can use the following optional settings to customize the Cerebras provider instance: - **baseURL** _string_ Use a different URL prefix for API calls. The default prefix is `https://api.cerebras.ai/v1`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. 
It defaults to the `CEREBRAS_API_KEY` environment variable.

- **headers** _Record<string,string>_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.

## Language Models

You can create language models using a provider instance:

```ts
import { cerebras } from '@ai-sdk/cerebras';
import { generateText } from 'ai';

const { text } = await generateText({
  model: cerebras('llama3.1-8b'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```

Cerebras language models can be used in the `streamText` and `streamUI` functions (see [AI SDK Core](/docs/ai-sdk-core) and [AI SDK RSC](/docs/ai-sdk-rsc)).

## Model Capabilities

| Model          | Image Input | Object Generation | Tool Usage | Tool Streaming |
| -------------- | ----------- | ----------------- | ---------- | -------------- |
| `llama3.1-8b`  |             |                   |            |                |
| `llama3.1-70b` |             |                   |            |                |
| `llama3.3-70b` |             |                   |            |                |

Please see the [Cerebras docs](https://inference-docs.cerebras.ai/introduction) for more details about the available models. Note that context windows are temporarily limited to 8192 tokens in the Free Tier.

---
title: Groq
description: Learn how to use Groq.
---

# Groq Provider

The [Groq](https://groq.com/) provider contains language model support for the Groq API.

## Setup

The Groq provider is available via the `@ai-sdk/groq` module. You can install it with:

## Provider Instance

You can import the default provider instance `groq` from `@ai-sdk/groq`:

```ts
import { groq } from '@ai-sdk/groq';
```

If you need a customized setup, you can import `createGroq` from `@ai-sdk/groq` and create a provider instance with your settings:

```ts
import { createGroq } from '@ai-sdk/groq';

const groq = createGroq({
  // custom settings
});
```

You can use the following optional settings to customize the Groq provider instance:

- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://api.groq.com/openai/v1`.

- **apiKey** _string_

  API key that is being sent using the `Authorization` header.
  It defaults to the `GROQ_API_KEY` environment variable.

- **headers** _Record<string,string>_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
  Defaults to the global `fetch` function.
  You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.

## Language Models

You can create [Groq models](https://console.groq.com/docs/models) using a provider instance.
The first argument is the model id, e.g. `gemma2-9b-it`.

```ts
const model = groq('gemma2-9b-it');
```

### Reasoning Models

Groq exposes the thinking of `deepseek-r1-distill-llama-70b` in the generated text using the `<think>` tag.
You can use the `extractReasoningMiddleware` to extract this reasoning and expose it as a `reasoning` property on the result:

```ts
import { groq } from '@ai-sdk/groq';
import { wrapLanguageModel, extractReasoningMiddleware } from 'ai';

const enhancedModel = wrapLanguageModel({
  model: groq('deepseek-r1-distill-llama-70b'),
  middleware: extractReasoningMiddleware({ tagName: 'think' }),
});
```

You can then use that enhanced model in functions like `generateText` and `streamText`.
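When streaming, the middleware emits the extracted reasoning as separate stream parts. A minimal sketch, assuming the `reasoning` and `text-delta` part types of `streamText`'s `fullStream`:

```ts
import { groq } from '@ai-sdk/groq';
import { wrapLanguageModel, extractReasoningMiddleware, streamText } from 'ai';

const enhancedModel = wrapLanguageModel({
  model: groq('deepseek-r1-distill-llama-70b'),
  middleware: extractReasoningMiddleware({ tagName: 'think' }),
});

const result = streamText({
  model: enhancedModel,
  prompt: 'How many prime numbers are there below 100?',
});

for await (const part of result.fullStream) {
  if (part.type === 'reasoning') {
    process.stdout.write(part.textDelta); // extracted <think> content
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.textDelta); // final answer tokens
  }
}
```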
### Example

You can use Groq language models to generate text with the `generateText` function:

```ts
import { groq } from '@ai-sdk/groq';
import { generateText } from 'ai';

const { text } = await generateText({
  model: groq('gemma2-9b-it'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```

## Model Capabilities

| Model                           | Image Input | Object Generation | Tool Usage | Tool Streaming |
| ------------------------------- | ----------- | ----------------- | ---------- | -------------- |
| `deepseek-r1-distill-llama-70b` |             |                   |            |                |
| `llama-3.3-70b-versatile`       |             |                   |            |                |
| `llama-3.1-8b-instant`          |             |                   |            |                |
| `gemma2-9b-it`                  |             |                   |            |                |
| `mixtral-8x7b-32768`            |             |                   |            |                |

The table above lists popular models. Please see the [Groq docs](https://console.groq.com/docs/models) for a full list of available models. You can also pass any available provider model ID as a string if needed.

---
title: Replicate
description: Learn how to use Replicate models with the AI SDK.
---

# Replicate Provider

[Replicate](https://replicate.com/) is a platform for running open-source AI models. It is a popular choice for running image generation models.

## Setup

The Replicate provider is available via the `@ai-sdk/replicate` module. You can install it with:

## Provider Instance

You can import the default provider instance `replicate` from `@ai-sdk/replicate`:

```ts
import { replicate } from '@ai-sdk/replicate';
```

If you need a customized setup, you can import `createReplicate` from `@ai-sdk/replicate` and create a provider instance with your settings:

```ts
import { createReplicate } from '@ai-sdk/replicate';

const replicate = createReplicate({
  apiToken: process.env.REPLICATE_API_TOKEN ?? '',
});
```

You can use the following optional settings to customize the Replicate provider instance:

- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://api.replicate.com/v1`.

- **apiToken** _string_

  API token that is being sent using the `Authorization` header.
  It defaults to the `REPLICATE_API_TOKEN` environment variable.

- **headers** _Record<string,string>_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.

## Image Models

You can create Replicate image models using the `.image()` factory method.
For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).

Model support for `size` and other parameters varies by model. Check the model's documentation on [Replicate](https://replicate.com/explore) for supported options and additional parameters that can be passed via `providerOptions.replicate`.
### Supported Image Models The following image models are currently supported by the Replicate provider: - [black-forest-labs/flux-1.1-pro-ultra](https://replicate.com/black-forest-labs/flux-1.1-pro-ultra) - [black-forest-labs/flux-1.1-pro](https://replicate.com/black-forest-labs/flux-1.1-pro) - [black-forest-labs/flux-dev](https://replicate.com/black-forest-labs/flux-dev) - [black-forest-labs/flux-pro](https://replicate.com/black-forest-labs/flux-pro) - [black-forest-labs/flux-schnell](https://replicate.com/black-forest-labs/flux-schnell) - [ideogram-ai/ideogram-v2-turbo](https://replicate.com/ideogram-ai/ideogram-v2-turbo) - [ideogram-ai/ideogram-v2](https://replicate.com/ideogram-ai/ideogram-v2) - [luma/photon-flash](https://replicate.com/luma/photon-flash) - [luma/photon](https://replicate.com/luma/photon) - [recraft-ai/recraft-v3-svg](https://replicate.com/recraft-ai/recraft-v3-svg) - [recraft-ai/recraft-v3](https://replicate.com/recraft-ai/recraft-v3) - [stability-ai/stable-diffusion-3.5-large-turbo](https://replicate.com/stability-ai/stable-diffusion-3.5-large-turbo) - [stability-ai/stable-diffusion-3.5-large](https://replicate.com/stability-ai/stable-diffusion-3.5-large) - [stability-ai/stable-diffusion-3.5-medium](https://replicate.com/stability-ai/stable-diffusion-3.5-medium) You can also use [versioned models](https://replicate.com/docs/topics/models/versions). The id for versioned models is the Replicate model id followed by a colon and the version ID (`$modelId:$versionId`), e.g. `bytedance/sdxl-lightning-4step:5599ed30703defd1d160a25a63321b4dec97101d98b4674bcc56e41f62f35637`. ### Basic Usage ```ts import { replicate } from '@ai-sdk/replicate'; import { experimental_generateImage as generateImage } from 'ai'; import { writeFile } from 'node:fs/promises'; const { image } = await generateImage({ model: replicate.image('black-forest-labs/flux-schnell'), prompt: 'The Loch Ness Monster getting a manicure', aspectRatio: '16:9', }); await writeFile('image.webp', image.uint8Array); console.log('Image saved as image.webp'); ``` ### Model-specific options ```ts highlight="9-11" import { replicate } from '@ai-sdk/replicate'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: replicate.image('recraft-ai/recraft-v3'), prompt: 'The Loch Ness Monster getting a manicure', size: '1365x1024', providerOptions: { replicate: { style: 'realistic_image', }, }, }); ``` ### Versioned Models ```ts import { replicate } from '@ai-sdk/replicate'; import { experimental_generateImage as generateImage } from 'ai'; const { image } = await generateImage({ model: replicate.image( 'bytedance/sdxl-lightning-4step:5599ed30703defd1d160a25a63321b4dec97101d98b4674bcc56e41f62f35637', ), prompt: 'The Loch Ness Monster getting a manicure', }); ``` For more details, see the [Replicate models page](https://replicate.com/explore). --- title: Perplexity description: Learn how to use Perplexity's Sonar API with the AI SDK. --- # Perplexity Provider The [Perplexity](https://sonar.perplexity.ai) provider offers access to Sonar API - a language model that uniquely combines real-time web search with natural language processing. Each response is grounded in current web data and includes detailed citations, making it ideal for research, fact-checking, and obtaining up-to-date information. API keys can be obtained from the [Perplexity Platform](https://docs.perplexity.ai). ## Setup The Perplexity provider is available via the `@ai-sdk/perplexity` module. 
You can install it with: ## Provider Instance You can import the default provider instance `perplexity` from `@ai-sdk/perplexity`: ```ts import { perplexity } from '@ai-sdk/perplexity'; ``` For custom configuration, you can import `createPerplexity` and create a provider instance with your settings: ```ts import { createPerplexity } from '@ai-sdk/perplexity'; const perplexity = createPerplexity({ apiKey: process.env.PERPLEXITY_API_KEY ?? '', }); ``` You can use the following optional settings to customize the Perplexity provider instance: - **baseURL** _string_ Use a different URL prefix for API calls. The default prefix is `https://api.perplexity.ai`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `PERPLEXITY_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. ## Language Models You can create Perplexity models using a provider instance: ```ts import { perplexity } from '@ai-sdk/perplexity'; import { generateText } from 'ai'; const { text } = await generateText({ model: perplexity('sonar-pro'), prompt: 'What are the latest developments in quantum computing?', }); ``` ### Sources Websites that have been used to generate the response are included in the `sources` property of the result: ```ts import { perplexity } from '@ai-sdk/perplexity'; import { generateText } from 'ai'; const { text, sources } = await generateText({ model: perplexity('sonar-pro'), prompt: 'What are the latest developments in quantum computing?', }); console.log(sources); ``` ### Provider Options & Metadata The Perplexity provider includes additional metadata in the response through `providerMetadata`. Additional configuration options are available through `providerOptions`. ```ts const result = await generateText({ model: perplexity('sonar-pro'), prompt: 'What are the latest developments in quantum computing?', providerOptions: { perplexity: { return_images: true, // Enable image responses (Tier-2 Perplexity users only) }, }, }); console.log(result.providerMetadata); // Example output: // { // perplexity: { // usage: { citationTokens: 5286, numSearchQueries: 1 }, // images: [ // { imageUrl: "https://example.com/image1.jpg", originUrl: "https://elsewhere.com/page1", height: 1280, width: 720 }, // { imageUrl: "https://example.com/image2.jpg", originUrl: "https://elsewhere.com/page2", height: 1280, width: 720 } // ] // }, // } ``` The metadata includes: - `usage`: Object containing `citationTokens` and `numSearchQueries` metrics - `images`: Array of image URLs when `return_images` is enabled (Tier-2 users only) You can enable image responses by setting `return_images: true` in the provider options. This feature is only available to Perplexity Tier-2 users and above. For more details about Perplexity's capabilities, see the [Perplexity chat completion docs](https://docs.perplexity.ai/api-reference/chat-completions). ## Model Capabilities | Model | Image Input | Object Generation | Tool Usage | Tool Streaming | | ----------- | ------------------- | ------------------- | ------------------- | ------------------- | | `sonar-pro` | | | | | | `sonar` | | | | | Please see the [Perplexity docs](https://docs.perplexity.ai) for detailed API documentation and the latest updates. --- title: Luma description: Learn how to use Luma AI models with the AI SDK. 
--- # Luma Provider [Luma AI](https://lumalabs.ai/) provides state-of-the-art image generation models through their Dream Machine platform. Their models offer ultra-high quality image generation with superior prompt understanding and unique capabilities like character consistency and multi-image reference support. ## Setup The Luma provider is available via the `@ai-sdk/luma` module. You can install it with ## Provider Instance You can import the default provider instance `luma` from `@ai-sdk/luma`: ```ts import { luma } from '@ai-sdk/luma'; ``` If you need a customized setup, you can import `createLuma` and create a provider instance with your settings: ```ts import { createLuma } from '@ai-sdk/luma'; const luma = createLuma({ apiKey: 'your-api-key', // optional, defaults to LUMA_API_KEY environment variable baseURL: 'custom-url', // optional headers: { /* custom headers */ }, // optional }); ``` You can use the following optional settings to customize the Luma provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api.lumalabs.ai`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `LUMA_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Image Models You can create Luma image models using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ### Basic Usage ```ts import { luma } from '@ai-sdk/luma'; import { experimental_generateImage as generateImage } from 'ai'; import fs from 'fs'; const { image } = await generateImage({ model: luma.image('photon-1'), prompt: 'A serene mountain landscape at sunset', aspectRatio: '16:9', }); const filename = `image-${Date.now()}.png`; fs.writeFileSync(filename, image.uint8Array); console.log(`Image saved to ${filename}`); ``` ### Image Model Settings When creating an image model, you can customize the generation behavior with optional settings: ```ts const model = luma.image('photon-1', { maxImagesPerCall: 1, // Maximum number of images to generate per API call pollIntervalMillis: 5000, // How often to check for completed images (in ms) maxPollAttempts: 10, // Maximum number of polling attempts before timeout }); ``` Since Luma processes images through an asynchronous queue system, these settings allow you to tune the polling behavior: - **maxImagesPerCall** _number_ Override the maximum number of images generated per API call. Defaults to 1. - **pollIntervalMillis** _number_ Control how frequently the API is checked for completed images while they are being processed. Defaults to 500ms. - **maxPollAttempts** _number_ Limit how long to wait for results before timing out, since image generation is queued asynchronously. Defaults to 120 attempts. 
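These settings combine with `generateImage` as usual. A minimal sketch that polls more frequently for the speed-optimized model (the values are illustrative):

```ts
import { luma } from '@ai-sdk/luma';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  // photon-flash-1 is the faster Luma model (see the table below)
  model: luma.image('photon-flash-1', {
    pollIntervalMillis: 250, // check for results more often than the 500ms default
    maxPollAttempts: 240, // keep roughly the same overall timeout
  }),
  prompt: 'A lighthouse on a rocky coast at dawn',
});
```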
### Model Capabilities Luma offers two main models: | Model | Description | | ---------------- | ---------------------------------------------------------------- | | `photon-1` | High-quality image generation with superior prompt understanding | | `photon-flash-1` | Faster generation optimized for speed while maintaining quality | Both models support the following aspect ratios: - 1:1 - 3:4 - 4:3 - 9:16 - 16:9 (default) - 9:21 - 21:9 For more details about supported aspect ratios, see the [Luma Image Generation documentation](https://docs.lumalabs.ai/docs/image-generation). Key features of Luma models include: - Ultra-high quality image generation - 10x higher cost efficiency compared to similar models - Superior prompt understanding and adherence - Unique character consistency capabilities from single reference images - Multi-image reference support for precise style matching ### Advanced Options Luma models support several advanced features through the `providerOptions.luma` parameter. #### Image Reference Use up to 4 reference images to guide your generation. Useful for creating variations or visualizing complex concepts. Adjust the `weight` (0-1) to control the influence of reference images. ```ts // Example: Generate a salamander with reference await generateImage({ model: luma.image('photon-1'), prompt: 'A salamander at dusk in a forest pond, in the style of ukiyo-e', providerOptions: { luma: { image_ref: [ { url: 'https://example.com/reference.jpg', weight: 0.85, }, ], }, }, }); ``` #### Style Reference Apply specific visual styles to your generations using reference images. Control the style influence using the `weight` parameter. ```ts // Example: Generate with style reference await generateImage({ model: luma.image('photon-1'), prompt: 'A blue cream Persian cat launching its website on Vercel', providerOptions: { luma: { style_ref: [ { url: 'https://example.com/style.jpg', weight: 0.8, }, ], }, }, }); ``` #### Character Reference Create consistent and personalized characters using up to 4 reference images of the same subject. More reference images improve character representation. ```ts // Example: Generate character-based image await generateImage({ model: luma.image('photon-1'), prompt: 'A woman with a cat riding a broomstick in a forest', providerOptions: { luma: { character_ref: { identity0: { images: ['https://example.com/character.jpg'], }, }, }, }, }); ``` #### Modify Image Transform existing images using text prompts. Use the `weight` parameter to control how closely the result matches the input image (higher weight = closer to input but less creative). For color changes, it's recommended to use a lower weight value (0.0-0.1). ```ts // Example: Modify existing image await generateImage({ model: luma.image('photon-1'), prompt: 'transform the bike to a boat', providerOptions: { luma: { modify_image_ref: { url: 'https://example.com/image.jpg', weight: 1.0, }, }, }, }); ``` For more details about Luma's capabilities and features, visit the [Luma Image Generation documentation](https://docs.lumalabs.ai/docs/image-generation). --- title: Fal description: Learn how to use Fal AI models with the AI SDK. --- # Fal Provider [Fal AI](https://fal.ai/) provides a generative media platform for developers with lightning-fast inference capabilities. Their platform offers optimized performance for running diffusion models, with speeds up to 4x faster than alternatives. ## Setup The Fal provider is available via the `@ai-sdk/fal` module. 
You can install it with ## Provider Instance You can import the default provider instance `fal` from `@ai-sdk/fal`: ```ts import { fal } from '@ai-sdk/fal'; ``` If you need a customized setup, you can import `createFal` and create a provider instance with your settings: ```ts import { createFal } from '@ai-sdk/fal'; const fal = createFal({ apiKey: 'your-api-key', // optional, defaults to FAL_API_KEY environment variable baseURL: 'custom-url', // optional headers: { /* custom headers */ }, // optional }); ``` You can use the following optional settings to customize the Fal provider instance: - **baseURL** _string_ Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://fal.run`. - **apiKey** _string_ API key that is being sent using the `Authorization` header. It defaults to the `FAL_API_KEY` environment variable. - **headers** _Record<string,string>_ Custom headers to include in the requests. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Image Models You can create Fal image models using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image). ### Basic Usage ```ts import { fal } from '@ai-sdk/fal'; import { experimental_generateImage as generateImage } from 'ai'; import fs from 'fs'; const { image } = await generateImage({ model: fal.image('fal-ai/fast-sdxl'), prompt: 'A serene mountain landscape at sunset', }); const filename = `image-${Date.now()}.png`; fs.writeFileSync(filename, image.uint8Array); console.log(`Image saved to ${filename}`); ``` ### Model Capabilities Fal offers many models optimized for different use cases. Here are a few popular examples. For a full list of models, see the [Fal AI documentation](https://fal.ai/models). 
| Model                               | Description                                                                             |
| ----------------------------------- | --------------------------------------------------------------------------------------- |
| `fal-ai/fast-sdxl`                  | High-speed SDXL model optimized for quick inference with up to 4x faster speeds          |
| `fal-ai/flux-pro/v1.1-ultra`        | Professional-grade image generation with up to 2K resolution and enhanced photorealism   |
| `fal-ai/ideogram/v2`                | Specialized for high-quality posters and logos with exceptional typography handling      |
| `fal-ai/recraft-v3`                 | SOTA in image generation with vector art and brand style capabilities                    |
| `fal-ai/stable-diffusion-3.5-large` | Advanced MMDiT model with improved typography and complex prompt understanding           |
| `fal-ai/hyper-sdxl`                 | Performance-optimized SDXL variant with enhanced creative capabilities                   |

Fal models support the following aspect ratios:

- 1:1 (square HD)
- 16:9 (landscape)
- 9:16 (portrait)
- 4:3 (landscape)
- 3:4 (portrait)
- 16:10 (1280x800)
- 10:16 (800x1280)
- 21:9 (2560x1080)
- 9:21 (1080x2560)

Key features of Fal models include:

- Up to 4x faster inference speeds compared to alternatives
- Optimized by the Fal Inference Engine™
- Support for real-time infrastructure
- Cost-effective scaling with pay-per-use pricing
- LoRA training capabilities for model personalization

### Advanced Features

Fal's platform offers several advanced capabilities:

- **Private Model Inference**: Run your own diffusion transformer models with up to 50% faster inference
- **LoRA Training**: Train and personalize models in under 5 minutes
- **Real-time Infrastructure**: Enable new user experiences with fast inference times
- **Scalable Architecture**: Scale to thousands of GPUs when needed

For more details about Fal's capabilities and features, visit the [Fal AI documentation](https://fal.ai/docs).

---
title: LM Studio
description: Use the LM Studio OpenAI compatible API with the AI SDK.
---

# LM Studio Provider

[LM Studio](https://lmstudio.ai/) is a user interface for running local models.
It contains an OpenAI compatible API server that you can use with the AI SDK.
You can start the local server under the [Local Server tab](https://lmstudio.ai/docs/basics/server) in the LM Studio UI ("Start Server" button).

## Setup

The LM Studio provider is available via the `@ai-sdk/openai-compatible` module as it is compatible with the OpenAI API. You can install it with:

## Provider Instance

To use LM Studio, you can create a custom provider instance with the `createOpenAICompatible` function from `@ai-sdk/openai-compatible`:

```ts
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';

const lmstudio = createOpenAICompatible({
  name: 'lmstudio',
  baseURL: 'http://localhost:1234/v1',
});
```

LM Studio uses port `1234` by default, but you can change it in the [app's Local Server tab](https://lmstudio.ai/docs/basics/server).

## Language Models

You can interact with local LLMs in [LM Studio](https://lmstudio.ai/docs/basics/server#endpoints-overview) using a provider instance.
The first argument is the model id, e.g. `llama-3.2-1b`.

```ts
const model = lmstudio('llama-3.2-1b');
```

To be able to use a model, you need to [download it first](https://lmstudio.ai/docs/basics/download-model).
### Example

You can use LM Studio language models to generate text with the `generateText` function:

```ts
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
import { generateText } from 'ai';

const lmstudio = createOpenAICompatible({
  name: 'lmstudio',
  baseURL: 'http://localhost:1234/v1',
});

const { text } = await generateText({
  model: lmstudio('llama-3.2-1b'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
  maxRetries: 1, // immediately error if the server is not running
});
```

LM Studio language models can also be used with `streamText`.

## Embedding Models

You can create models that call the [LM Studio embeddings API](https://lmstudio.ai/docs/basics/server#endpoints-overview) using the `.textEmbeddingModel()` factory method.

```ts
const model = lmstudio.textEmbeddingModel('text-embedding-nomic-embed-text-v1.5');
```

### Example - Embedding a Single Value

```ts
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
import { embed } from 'ai';

const lmstudio = createOpenAICompatible({
  name: 'lmstudio',
  baseURL: 'http://localhost:1234/v1',
});

// 'embedding' is a single embedding object (number[])
const { embedding } = await embed({
  model: lmstudio.textEmbeddingModel('text-embedding-nomic-embed-text-v1.5'),
  value: 'sunny day at the beach',
});
```

### Example - Embedding Many Values

When loading data, e.g. when preparing a data store for retrieval-augmented generation (RAG), it is often useful to embed many values at once (batch embedding).

The AI SDK provides the [`embedMany`](/docs/reference/ai-sdk-core/embed-many) function for this purpose. Similar to `embed`, you can use it with embeddings models, e.g. `lmstudio.textEmbeddingModel('text-embedding-nomic-embed-text-v1.5')` or `lmstudio.textEmbeddingModel('text-embedding-bge-small-en-v1.5')`.

```ts
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
import { embedMany } from 'ai';

const lmstudio = createOpenAICompatible({
  name: 'lmstudio',
  baseURL: 'http://localhost:1234/v1',
});

// 'embeddings' is an array of embedding objects (number[][]).
// It is sorted in the same order as the input values.
const { embeddings } = await embedMany({
  model: lmstudio.textEmbeddingModel('text-embedding-nomic-embed-text-v1.5'),
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
    'snowy night in the mountains',
  ],
});
```

---
title: NVIDIA NIM
description: Use NVIDIA NIM OpenAI compatible API with the AI SDK.
---

# NVIDIA NIM Provider

[NVIDIA NIM](https://www.nvidia.com/en-us/ai/) provides optimized inference microservices for deploying foundation models. It offers an OpenAI-compatible API that you can use with the AI SDK.

## Setup

The NVIDIA NIM provider is available via the `@ai-sdk/openai-compatible` module as it is compatible with the OpenAI API. You can install it with:

## Provider Instance

To use NVIDIA NIM, you can create a custom provider instance with the `createOpenAICompatible` function from `@ai-sdk/openai-compatible`:

```ts
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';

const nim = createOpenAICompatible({
  name: 'nim',
  baseURL: 'https://integrate.api.nvidia.com/v1',
  headers: {
    Authorization: `Bearer ${process.env.NIM_API_KEY}`,
  },
});
```

You can obtain an API key and free credits by registering at [NVIDIA Build](https://build.nvidia.com/explore/discover). New users receive 1,000 inference credits to get started.

## Language Models

You can interact with NIM models using a provider instance.
For example, to use [DeepSeek-R1](https://build.nvidia.com/deepseek-ai/deepseek-r1), a powerful open-source language model: ```ts const model = nim.chatModel('deepseek-ai/deepseek-r1'); ``` ### Example - Generate Text You can use NIM language models to generate text with the `generateText` function: ```ts import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; import { generateText } from 'ai'; const nim = createOpenAICompatible({ name: 'nim', baseURL: 'https://integrate.api.nvidia.com/v1', headers: { Authorization: `Bearer ${process.env.NIM_API_KEY}`, }, }); const { text, usage, finishReason } = await generateText({ model: nim.chatModel('deepseek-ai/deepseek-r1'), prompt: 'Tell me the history of the San Francisco Mission-style burrito.', }); console.log(text); console.log('Token usage:', usage); console.log('Finish reason:', finishReason); ``` ### Example - Stream Text NIM language models can also generate text in a streaming fashion with the `streamText` function: ```ts import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; import { streamText } from 'ai'; const nim = createOpenAICompatible({ name: 'nim', baseURL: 'https://integrate.api.nvidia.com/v1', headers: { Authorization: `Bearer ${process.env.NIM_API_KEY}`, }, }); const result = streamText({ model: nim.chatModel('deepseek-ai/deepseek-r1'), prompt: 'Tell me the history of the Northern White Rhino.', }); for await (const textPart of result.textStream) { process.stdout.write(textPart); } console.log(); console.log('Token usage:', await result.usage); console.log('Finish reason:', await result.finishReason); ``` NIM language models can also be used with other AI SDK functions like `generateObject` and `streamObject`. Model support for tool calls and structured object generation varies. For example, the [`meta/llama-3.3-70b-instruct`](https://build.nvidia.com/meta/llama-3_3-70b-instruct) model supports object generation capabilities. Check each model's documentation on NVIDIA Build for specific supported features. --- title: OpenAI Compatible Providers description: Use OpenAI compatible providers with the AI SDK. --- # OpenAI Compatible Providers You can use the [OpenAI Compatible Provider](https://www.npmjs.com/package/@ai-sdk/openai-compatible) package to use language model providers that implement the OpenAI API. Below we focus on the general setup and provider instance creation. You can also [write a custom provider package leveraging the OpenAI Compatible package](/providers/openai-compatible-providers/custom-providers). We provide detailed documentation for the following OpenAI compatible providers: - [LM Studio](/providers/openai-compatible-providers/lmstudio) - [NIM](/providers/openai-compatible-providers/nim) - [Baseten](/providers/openai-compatible-providers/baseten) The general setup and provider instance creation is the same for all of these providers. ## Setup The OpenAI Compatible provider is available via the `@ai-sdk/openai-compatible` module. 
You can install it with: ## Provider Instance To use an OpenAI compatible provider, you can create a custom provider instance with the `createOpenAICompatible` function from `@ai-sdk/openai-compatible`: ```ts import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; const provider = createOpenAICompatible({ name: 'provider-name', apiKey: process.env.PROVIDER_API_KEY, baseURL: 'https://api.provider.com/v1', }); ``` You can use the following optional settings to customize the provider instance: - **baseURL** _string_ Set the URL prefix for API calls. - **apiKey** _string_ API key for authenticating requests. If specified, adds an `Authorization` header to request headers with the value `Bearer `. This will be added before any headers potentially specified in the `headers` option. - **headers** _Record<string,string>_ Optional custom headers to include in requests. These will be added to request headers after any headers potentially added by use of the `apiKey` option. - **queryParams** _Record<string,string>_ Optional custom url query parameters to include in request urls. - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation. Defaults to the global `fetch` function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing. ## Language Models You can create provider models using a provider instance. The first argument is the model id, e.g. `model-id`. ```ts const model = provider('model-id'); ``` ### Example You can use provider language models to generate text with the `generateText` function: ```ts import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; import { generateText } from 'ai'; const provider = createOpenAICompatible({ name: 'provider-name', apiKey: process.env.PROVIDER_API_KEY, baseURL: 'https://api.provider.com/v1', }); const { text } = await generateText({ model: provider('model-id'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` ### Including model ids for auto-completion ```ts import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; import { generateText } from 'ai'; type ExampleChatModelIds = | 'meta-llama/Llama-3-70b-chat-hf' | 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo' | (string & {}); type ExampleCompletionModelIds = | 'codellama/CodeLlama-34b-Instruct-hf' | 'Qwen/Qwen2.5-Coder-32B-Instruct' | (string & {}); type ExampleEmbeddingModelIds = | 'BAAI/bge-large-en-v1.5' | 'bert-base-uncased' | (string & {}); const model = createOpenAICompatible< ExampleChatModelIds, ExampleCompletionModelIds, ExampleEmbeddingModelIds >({ name: 'example', apiKey: process.env.PROVIDER_API_KEY, baseURL: 'https://api.example.com/v1', }); // Subsequent calls to e.g. `model.chatModel` will auto-complete the model id // from the list of `ExampleChatModelIds` while still allowing free-form // strings as well. const { text } = await generateText({ model: model.chatModel('meta-llama/Llama-3-70b-chat-hf'), prompt: 'Write a vegetarian lasagna recipe for 4 people.', }); ``` ### Custom query parameters Some providers may require custom query parameters. An example is the [Azure AI Model Inference API](https://learn.microsoft.com/en-us/azure/machine-learning/reference-model-inference-chat-completions?view=azureml-api-2) which requires an `api-version` query parameter. You can set these via the optional `queryParams` provider setting. 
These will be added to all requests made by the provider. ```ts highlight="7-9" import { createOpenAICompatible } from '@ai-sdk/openai-compatible'; const provider = createOpenAICompatible({ name: 'provider-name', apiKey: process.env.PROVIDER_API_KEY, baseURL: 'https://api.provider.com/v1', queryParams: { 'api-version': '1.0.0', }, }); ``` For example, with the above configuration, API requests would include the query parameter in the URL like: `https://api.provider.com/v1/chat/completions?api-version=1.0.0`. ## Provider-specific options The OpenAI Compatible provider supports adding provider-specific options to the request body. These are specified with the `providerOptions` field in the request body. For example, if you create a provider instance with the name `provider-name`, you can add a `custom-option` field to the request body like this: ```ts const provider = createOpenAICompatible({ name: 'provider-name', apiKey: process.env.PROVIDER_API_KEY, baseURL: 'https://api.provider.com/v1', }); const { text } = await generateText({ model: provider('model-id'), prompt: 'Hello', providerOptions: { 'provider-name': { customOption: 'magic-value' }, }, }); ``` The request body sent to the provider will include the `customOption` field with the value `magic-value`. This gives you an easy way to add provider-specific options to requests without having to modify the provider or AI SDK code. ## Custom Metadata Extraction The OpenAI Compatible provider supports extracting provider-specific metadata from API responses through metadata extractors. These extractors allow you to capture additional information returned by the provider beyond the standard response format. Metadata extractors receive the raw, unprocessed response data from the provider, giving you complete flexibility to extract any custom fields or experimental features that the provider may include. This is particularly useful when: - Working with providers that include non-standard response fields - Experimenting with beta or preview features - Capturing provider-specific metrics or debugging information - Supporting rapid provider API evolution without SDK changes Metadata extractors work with both streaming and non-streaming chat completions and consist of two main components: 1. A function to extract metadata from complete responses 2. A streaming extractor that can accumulate metadata across chunks in a streaming response Here's an example metadata extractor that captures both standard and custom provider data: ```typescript const myMetadataExtractor: MetadataExtractor = { // Process complete, non-streaming responses extractMetadata: ({ parsedBody }) => { // You have access to the complete raw response // Extract any fields the provider includes return { myProvider: { standardUsage: parsedBody.usage, experimentalFeatures: parsedBody.beta_features, customMetrics: { processingTime: parsedBody.server_timing?.total_ms, modelVersion: parsedBody.model_version, // ... 
// any other provider-specific data
        },
      },
    };
  },

  // Process streaming responses
  createStreamExtractor: () => {
    let accumulatedData = {
      timing: [],
      customFields: {},
    };

    return {
      // Process each chunk's raw data
      processChunk: parsedChunk => {
        if (parsedChunk.server_timing) {
          accumulatedData.timing.push(parsedChunk.server_timing);
        }
        if (parsedChunk.custom_data) {
          Object.assign(accumulatedData.customFields, parsedChunk.custom_data);
        }
      },

      // Build final metadata from accumulated data
      buildMetadata: () => ({
        myProvider: {
          streamTiming: accumulatedData.timing,
          customData: accumulatedData.customFields,
        },
      }),
    };
  },
};
```

You can provide a metadata extractor when creating your provider instance:

```typescript
const provider = createOpenAICompatible({
  name: 'my-provider',
  apiKey: process.env.PROVIDER_API_KEY,
  baseURL: 'https://api.provider.com/v1',
  metadataExtractor: myMetadataExtractor,
});
```

The extracted metadata will be included in the response under the `providerMetadata` field:

```typescript
const { text, providerMetadata } = await generateText({
  model: provider('model-id'),
  prompt: 'Hello',
});

console.log(providerMetadata.myProvider.customMetrics);
```

This allows you to access provider-specific information while maintaining a consistent interface across different providers.