Tool Calling
As covered under Foundations, tools are objects that can be called by the model to perform a specific task. AI SDK Core tools contain three elements:
description
: An optional description of the tool that can influence when the tool is picked.parameters
: A Zod schema or a JSON schema that defines the parameters. The schema is consumed by the LLM, and also used to validate the LLM tool calls.execute
: An optional async function that is called with the arguments from the tool call. It produces a value of typeRESULT
(generic type). It is optional because you might want to forward tool calls to the client or to a queue instead of executing them in the same process.
You can use the tool
helper function to
infer the types of the execute
parameters.
The tools
parameter of generateText
and streamText
is an object that has the tool names as keys and the tools as values:
import { z } from 'zod';import { generateText, tool } from 'ai';
const result = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, prompt: 'What is the weather in San Francisco?',});
When a model uses a tool, it is called a "tool call" and the output of the tool is called a "tool result".
Tool calling is not restricted to only text generation. You can also use it to render user interfaces (Generative UI).
Multi-Step Calls
Large language models need to know the tool results before they can continue to generate text.
This requires sending the tool results back to the model.
You can enable this feature by setting the maxSteps
setting to a number greater than 1.
When maxSteps
is set to a number greater than 1, the language model will be called
in a loop when there are tool calls and for every tool call there is a tool result, until there
are no further tool calls or the maximum number of tool steps is reached.
Example
In the following example, there are two steps:
- Step 1
- The prompt
'What is the weather in San Francisco?'
is sent to the model. - The model generates a tool call.
- The tool call is executed.
- The prompt
- Step 2
- The tool result is sent to the model.
- The model generates a response considering the tool result.
import { z } from 'zod';import { generateText, tool } from 'ai';
const { text, steps } = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, maxSteps: 5, // allow up to 5 steps prompt: 'What is the weather in San Francisco?',});
streamText
in a similar way.Steps
To access intermediate tool calls and results, you can use the steps
property in the result object
or the streamText
onFinish
callback.
It contains all the text, tool calls, tool results, and more from each step.
Example: Extract tool results from all steps
import { generateText } from 'ai';
const { steps } = await generateText({ model: openai('gpt-4-turbo'), maxSteps: 10, // ...});
// extract all tool calls from the steps:const allToolCalls = steps.flatMap(step => step.toolCalls);
onStepFinish
callback
When using generateText
or streamText
, you can provide an onStepFinish
callback that
is triggered when a step is finished,
i.e. all text deltas, tool calls, and tool results for the step are available.
When you have multiple steps, the callback is triggered for each step.
import { generateText } from 'ai';
const result = await generateText({ // ... onStepFinish({ text, toolCalls, toolResults, finishReason, usage }) { // your own logic, e.g. for saving the chat history or recording usage },});
Response Messages
Adding the generated assistant and tool messages to your conversation history is a common task, especially if you are using multi-step tool calls.
Both generateText
and streamText
have a responseMessages
property that you can use to
add the assistant and tool messages to your conversation history.
It is also available in the onFinish
callback of streamText
.
The responseMessages
property contains an array of CoreMessage
objects that you can add to your conversation history:
import { generateText } from 'ai';
const messages: CoreMessage[] = [ // ...];
const { responseMessages } = await generateText({ // ... messages,});
// add the response messages to your conversation history:messages.push(...responseMessages); // streamText: ...(await responseMessages)
Tool Choice
You can use the toolChoice
setting to influence when a tool is selected.
It supports the following settings:
auto
(default): the model can choose whether and which tools to call.required
: the model must call a tool. It can choose which tool to call.none
: the model must not call tools{ type: 'tool', toolName: string (typed) }
: the model must call the specified tool
import { z } from 'zod';import { generateText, tool } from 'ai';
const result = await generateText({ model: yourModel, tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string().describe('The location to get the weather for'), }), execute: async ({ location }) => ({ location, temperature: 72 + Math.floor(Math.random() * 21) - 10, }), }), }, toolChoice: 'required', // force the model to call a tool prompt: 'What is the weather in San Francisco?',});
Tool Execution Options
When tools are called, they receive additional options as a second parameter.
Messages
The messages that were sent to the language model to initiate the response that contained the tool call are forwarded to the tool execution.
You can access them in the second parameter of the execute
function.
In multi-step calls, the messages contain the text, tool calls, and tool results from all previous steps.
import { generateText, tool } from 'ai';
const result = await generateText({ // ... tools: { myTool: tool({ // ... execute: async (args, { messages }) => { // use the message history in e.g. calls to other language models return something; }, }), },});
Abort Signals
The abort signals from generateText
and streamText
are forwarded to the tool execution.
You can access them in the second parameter of the execute
function and e.g. abort long-running computations or forward them to fetch calls inside tools.
import { z } from 'zod';import { generateText, tool } from 'ai';
const result = await generateText({ model: yourModel, abortSignal: myAbortSignal, // signal that will be forwarded to tools tools: { weather: tool({ description: 'Get the weather in a location', parameters: z.object({ location: z.string() }), execute: async ({ location }, { abortSignal }) => { return fetch( `https://api.weatherapi.com/v1/current.json?q=${location}`, { signal: abortSignal }, // forward the abort signal to fetch ); }, }), }, prompt: 'What is the weather in San Francisco?',});
Types
Modularizing your code often requires defining types to ensure type safety and reusability. To enable this, the AI SDK provides several helper types for tools, tool calls, and tool results.
You can use them to strongly type your variables, function parameters, and return types
in parts of the code that are not directly related to streamText
or generateText
.
Each tool call is typed with CoreToolCall<NAME extends string, ARGS>
, depending
on the tool that has been invoked.
Similarly, the tool results are typed with CoreToolResult<NAME extends string, ARGS, RESULT>
.
The tools in streamText
and generateText
are defined as a Record<string, CoreTool>
.
The type inference helpers CoreToolCallUnion<TOOLS extends Record<string, CoreTool>>
and CoreToolResultUnion<TOOLS extends Record<string, CoreTool>>
can be used to
extract the tool call and tool result types from the tools.
import { openai } from '@ai-sdk/openai';import { CoreToolCallUnion, CoreToolResultUnion, generateText, tool } from 'ai';import { z } from 'zod';
const myToolSet = { firstTool: tool({ description: 'Greets the user', parameters: z.object({ name: z.string() }), execute: async ({ name }) => `Hello, ${name}!`, }), secondTool: tool({ description: 'Tells the user their age', parameters: z.object({ age: z.number() }), execute: async ({ age }) => `You are ${age} years old!`, }),};
type MyToolCall = CoreToolCallUnion<typeof myToolSet>;type MyToolResult = CoreToolResultUnion<typeof myToolSet>;
async function generateSomething(prompt: string): Promise<{ text: string; toolCalls: Array<MyToolCall>; // typed tool calls toolResults: Array<MyToolResult>; // typed tool results}> { return generateText({ model: openai('gpt-4o'), tools: myToolSet, prompt, });}
Active Tools
The activeTools
property is experimental and may change in the future.
Language models can only handle a limited number of tools at a time, depending on the model.
To allow for static typing using a large number of tools and limiting the available tools to the model at the same time,
the AI SDK provides the experimental_activeTools
property.
It is an array of tool names that are currently active.
By default, the value is undefined
and all tools are active.
import { openai } from '@ai-sdk/openai';import { generateText } from 'ai';
const { text } = await generateText({ model: openai('gpt-4o'), tools: myToolSet, experimental_activeTools: ['firstTool'],});
Multi-modal Tool Results
Multi-modal tool results are experimental and only supported by Anthropic.
In order to send multi-modal tool results, e.g. screenshots, back to the model, they need to be converted into a specific format.
AI SDK Core tools have an optional experimental_toToolResultContent
function
that converts the tool result into a content part.
Here is an example for converting a screenshot into a content part:
const result = await generateText({ model: anthropic('claude-3-5-sonnet-20241022'), tools: { computer: anthropic.tools.computer_20241022({ // ... async execute({ action, coordinate, text }) { switch (action) { case 'screenshot': { return { type: 'image', data: fs .readFileSync('./data/screenshot-editor.png') .toString('base64'), }; } default: { return `executed ${action}`; } } },
// map to tool result content for LLM consumption: experimental_toToolResultContent(result) { return typeof result === 'string' ? [{ type: 'text', text: result }] : [{ type: 'image', data: result.data, mimeType: 'image/png' }]; }, }), }, // ...});
Examples
You can see tools in action using various frameworks in the following examples: