Generating and Streaming Text

Large language models (LLMs) can generate text in response to a prompt, which can contain instructions and information to process. For example, you can ask a model to come up with a recipe, draft an email, or summarize a document.

The AI SDK Core provides two functions to generate text and stream it from LLMs:

  • generateText: Generates text for a given prompt and model.
  • streamText: Streams text from a given prompt and model.

Advanced LLM features such as tool calling and structured data generation are built on top of text generation.

generateText

You can generate text using the generateText function. This function is ideal for non-interactive use cases where you need to write text (e.g. drafting emails or summarizing web pages) and for agents that use tools.

import { generateText } from 'ai';

const { text } = await generateText({
  model: yourModel,
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});

You can use more advanced prompts to generate text with more complex instructions and content:

import { generateText } from 'ai';

const { text } = await generateText({
  model: yourModel,
  system:
    'You are a professional writer. ' +
    'You write simple, clear, and concise content.',
  prompt: `Summarize the following article in 3-5 sentences: ${article}`,
});

Once the generateText call has been awaited, its result object contains several properties:

  • result.text: The generated text.
  • result.finishReason: The reason the model finished generating text.
  • result.usage: The usage of the model during text generation.

streamText

Depending on your model and prompt, it can take a large language model (LLM) up to a minute to finish generating its response. This delay can be unacceptable for interactive use cases such as chatbots or real-time applications, where users expect immediate responses.

AI SDK Core provides the streamText function which simplifies streaming text from LLMs:

import { streamText } from 'ai';

const result = streamText({
  model: yourModel,
  prompt: 'Invent a new holiday and describe its traditions.',
});

// example: use textStream as an async iterable
for await (const textPart of result.textStream) {
  console.log(textPart);
}

result.textStream is both a ReadableStream and an AsyncIterable.
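
Because the stream is also a ReadableStream, it can be consumed with the standard reader API instead of for await. The sketch below is a minimal illustration of that second style: readAllText and fakeTextStream are hypothetical helpers, and fakeTextStream only stands in for result.textStream so that the example runs without a model.

```typescript
// Reads a ReadableStream<string> to completion using the reader API.
async function readAllText(stream: ReadableStream<string>): Promise<string> {
  const reader = stream.getReader();
  let text = '';
  while (true) {
    const result = await reader.read();
    if (result.done) break; // the stream has been closed
    text += result.value;
  }
  return text;
}

// Stand-in for result.textStream (hypothetical, for illustration only).
function fakeTextStream(parts: string[]): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      for (const part of parts) controller.enqueue(part);
      controller.close();
    },
  });
}
```

With a real call, readAllText(result.textStream) would be the equivalent of concatenating every textPart from the loop above.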

You can use streamText on its own or in combination with AI SDK UI and AI SDK RSC. The result object contains several helper functions to make the integration into AI SDK UI easier:

  • result.toDataStreamResponse(): Creates a data stream HTTP response (with tool calls etc.) that can be used in a Next.js App Router API route.
  • result.pipeDataStreamToResponse(): Writes data stream delta output to a Node.js response-like object.
  • result.toTextStreamResponse(): Creates a simple text stream HTTP response.
  • result.pipeTextStreamToResponse(): Writes text delta output to a Node.js response-like object.

streamText uses backpressure and only generates tokens as they are requested. You need to consume the stream in order for it to finish.
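
The general idea of backpressure can be shown with a plain web ReadableStream. The sketch below is not how streamText is implemented; createLazyStream and demoBackpressure are hypothetical names, and the stream is a stand-in token source that counts how many chunks it has actually produced. Nothing is produced until a consumer reads.

```typescript
// Stand-in token source that records how many chunks it has produced.
function createLazyStream(): {
  stream: ReadableStream<string>;
  produced: () => number;
} {
  let produced = 0;
  const stream = new ReadableStream<string>(
    {
      // pull() runs only when a consumer asks for more data.
      pull(controller) {
        produced += 1;
        controller.enqueue(`token-${produced} `);
      },
    },
    { highWaterMark: 0 }, // no read-ahead buffering
  );
  return { stream, produced: () => produced };
}

// Returns the production count before any read and after two reads.
async function demoBackpressure(): Promise<[number, number]> {
  const { stream, produced } = createLazyStream();
  const before = produced(); // nothing has been requested yet
  const reader = stream.getReader();
  await reader.read();
  await reader.read();
  const after = produced(); // one chunk per read
  await reader.cancel();
  return [before, after];
}
```

The same reasoning explains why an unconsumed stream never finishes: without reads, the producer is never asked for the next chunk.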

It also provides several promises that resolve when the stream is finished:

  • result.text: The generated text.
  • result.finishReason: The reason the model finished generating text.
  • result.usage: The usage of the model during text generation.

onChunk callback

When using streamText, you can provide an onChunk callback that is triggered for each chunk of the stream.

It receives the following chunk types:

  • text-delta
  • tool-call
  • tool-result
  • tool-call-streaming-start (when experimental_toolCallStreaming is enabled)
  • tool-call-delta (when experimental_toolCallStreaming is enabled)

import { streamText } from 'ai';

const result = streamText({
  model: yourModel,
  prompt: 'Invent a new holiday and describe its traditions.',
  onChunk({ chunk }) {
    // implement your own logic here, e.g.:
    if (chunk.type === 'text-delta') {
      console.log(chunk.text);
    }
  },
});
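
A callback of this shape can also be built by a small factory, which keeps its state out of the streamText call. The sketch below tallies chunks by type. ChunkLike and createChunkCounter are hypothetical names, and the only assumption about a chunk is that it has a type field, as in the list above.

```typescript
// Hypothetical minimal chunk shape: only the `type` field is assumed.
type ChunkLike = { type: string };

// Builds an onChunk-style handler that tallies chunks by type.
function createChunkCounter() {
  const counts = new Map<string, number>();
  return {
    counts,
    onChunk({ chunk }: { chunk: ChunkLike }): void {
      counts.set(chunk.type, (counts.get(chunk.type) ?? 0) + 1);
    },
  };
}
```

The returned onChunk function could then be passed as the onChunk option, and counts inspected once the stream has been consumed.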

onFinish callback

When using streamText, you can provide an onFinish callback that is triggered when the stream is finished (see the API reference). It receives the text, usage information, finish reason, messages, and more:

import { streamText } from 'ai';

const result = streamText({
  model: yourModel,
  prompt: 'Invent a new holiday and describe its traditions.',
  onFinish({ text, finishReason, usage, response }) {
    // your own logic, e.g. for saving the chat history or recording usage
    const messages = response.messages; // messages that were generated
  },
});

fullStream property

You can read a stream with all events using the fullStream property. This can be useful if you want to implement your own UI or handle the stream in a different way. Here is an example of how to use the fullStream property:

import { streamText } from 'ai';
import { z } from 'zod';

const result = streamText({
  model: yourModel,
  tools: {
    cityAttractions: {
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => ({
        attractions: ['attraction1', 'attraction2', 'attraction3'],
      }),
    },
  },
  prompt: 'What are some San Francisco tourist attractions?',
});

for await (const part of result.fullStream) {
  switch (part.type) {
    case 'text-delta': {
      // handle text delta here
      break;
    }
    case 'tool-call': {
      switch (part.toolName) {
        case 'cityAttractions': {
          // handle tool call here
          break;
        }
      }
      break;
    }
    case 'tool-result': {
      switch (part.toolName) {
        case 'cityAttractions': {
          // handle tool result here
          break;
        }
      }
      break;
    }
    case 'finish': {
      // handle finish here
      break;
    }
    case 'error': {
      // handle error here
      break;
    }
  }
}

Stream transformation

You can use the experimental_transform option to transform the stream. This is useful for filtering, changing, or smoothing the text stream, for example.

The transformations are applied before the callbacks are invoked and the promises are resolved. For example, if a transformation changes all text to uppercase, the onFinish callback receives the transformed text.
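
To make the uppercase example concrete, the sketch below uses the web-standard TransformStream on plain string chunks. It illustrates the mechanism only: the exact signature that experimental_transform expects is not shown in this section, and a real transformation may have to handle structured stream parts rather than strings. upperCaseTransform and demoUpperCase are hypothetical names.

```typescript
// Web-standard TransformStream that upper-cases each string chunk.
function upperCaseTransform(): TransformStream<string, string> {
  return new TransformStream<string, string>({
    transform(chunk, controller) {
      controller.enqueue(chunk.toUpperCase());
    },
  });
}

// Pipes a stand-in source through the transform and collects the output.
async function demoUpperCase(): Promise<string> {
  const source = new ReadableStream<string>({
    start(controller) {
      controller.enqueue('hello ');
      controller.enqueue('world');
      controller.close();
    },
  });
  const reader = source.pipeThrough(upperCaseTransform()).getReader();
  let out = '';
  while (true) {
    const result = await reader.read();
    if (result.done) break;
    out += result.value;
  }
  return out;
}
```

Every consumer downstream of the transform sees only the transformed chunks, which mirrors why callbacks and promises observe the transformed text.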

Smoothing streams

The AI SDK Core provides a smoothStream function that can be used to smooth out text streaming.

import { smoothStream, streamText } from 'ai';

const result = streamText({
  model,
  prompt,
  experimental_transform: smoothStream(),
});
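
This section does not describe how smoothStream works internally. As one possible interpretation of smoothing, the sketch below re-chunks arbitrary string fragments into whole words, so that a consumer never receives half a word. wordChunker and demoWordChunker are hypothetical names, and the code is not the library's implementation.

```typescript
// Re-chunks incoming string fragments so each emitted chunk is a whole word
// followed by its trailing whitespace.
function wordChunker(): TransformStream<string, string> {
  let buffer = '';
  return new TransformStream<string, string>({
    transform(chunk, controller) {
      buffer += chunk;
      // Emit every complete word; keep the unfinished tail in the buffer.
      let match = /^\s*\S+\s+/.exec(buffer);
      while (match !== null) {
        controller.enqueue(match[0]);
        buffer = buffer.slice(match[0].length);
        match = /^\s*\S+\s+/.exec(buffer);
      }
    },
    flush(controller) {
      // Emit whatever is left when the source ends.
      if (buffer.length > 0) controller.enqueue(buffer);
    },
  });
}

// Feeds fragments through the chunker and returns the emitted chunks.
async function demoWordChunker(fragments: string[]): Promise<string[]> {
  const source = new ReadableStream<string>({
    start(controller) {
      for (const fragment of fragments) controller.enqueue(fragment);
      controller.close();
    },
  });
  const reader = source.pipeThrough(wordChunker()).getReader();
  const chunks: string[] = [];
  while (true) {
    const result = await reader.read();
    if (result.done) break;
    chunks.push(result.value);
  }
  return chunks;
}
```

A smoother used in practice would typically also pace the output over time; this sketch leaves timing out to stay deterministic.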

Generating Long Text

Most language models have an output limit that is much shorter than their context window. This means that you cannot generate long text in one go, but it is possible to add responses back to the input and continue generating to create longer text.

generateText and streamText support such continuations for long text generation using the experimental_continueSteps setting:

import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const {
  text, // combined text
  usage, // combined usage of all steps
} = await generateText({
  model: openai('gpt-4o'), // 4096 output tokens
  maxSteps: 5, // enable multi-step calls
  experimental_continueSteps: true,
  prompt:
    'Write a book about Roman history, ' +
    'from the founding of the city of Rome ' +
    'to the fall of the Western Roman Empire. ' +
    'Each chapter MUST HAVE at least 1000 words.',
});

When experimental_continueSteps is enabled, only full words are streamed in streamText, and both generateText and streamText might drop the trailing tokens of some calls to prevent whitespace issues.

Some models might not always stop correctly on their own and keep generating until maxSteps is reached. You can hint the model to stop, for example with a system message such as "Stop when sufficient information was provided."
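
The continuation idea itself (feed the text generated so far back in and keep going until the model stops on its own or the step limit is reached) can be sketched without the SDK. Everything in the example below is hypothetical: StepResult, GenerateStep, generateLongText, and fakeModel are illustrative names, and the loop is a simplified model of the behavior described above, not the library's implementation.

```typescript
// Hypothetical step result: the generated piece and why generation stopped.
interface StepResult {
  text: string;
  finishReason: 'stop' | 'length';
}

// Hypothetical model call: receives the prompt and the text generated so far.
type GenerateStep = (prompt: string, textSoFar: string) => Promise<StepResult>;

// Keeps generating while the model is cut off by its output limit.
async function generateLongText(
  generateStep: GenerateStep,
  prompt: string,
  maxSteps: number,
): Promise<string> {
  let text = '';
  for (let step = 0; step < maxSteps; step++) {
    const result = await generateStep(prompt, text);
    text += result.text;
    // Stop as soon as the model finished on its own.
    if (result.finishReason !== 'length') break;
  }
  return text;
}

// Stand-in model that returns fixed pieces and is "cut off" until the last one.
function fakeModel(pieces: string[]): GenerateStep {
  let index = 0;
  return async () => {
    const text = pieces[index] ?? '';
    index += 1;
    return { text, finishReason: index < pieces.length ? 'length' : 'stop' };
  };
}
```

With three pieces and a limit of five steps the loop ends after the third call; with a limit of two steps it returns only the first two pieces, which corresponds to a model that keeps generating until the step limit is reached.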

Examples

You can see generateText and streamText in action using various frameworks in the following examples:

generateText

Learn to generate text in Node.js
Learn to generate text in Next.js with Route Handlers (AI SDK UI)
Learn to generate text in Next.js with Server Actions (AI SDK RSC)

streamText

Learn to stream text in Node.js
Learn to stream text in Next.js with Route Handlers (AI SDK UI)
Learn to stream text in Next.js with Server Actions (AI SDK RSC)