Generating Structured Data

While text generation can be useful, your use case will likely call for generating structured data. For example, you might want to extract information from text, classify data, or generate synthetic data.

Many language models can generate structured data, typically through features such as "JSON mode" or "tools". However, you need to provide schemas manually and then validate the generated data yourself, because LLMs can produce incorrect or incomplete structured data.
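To make the manual validation burden concrete, here is a minimal sketch of checking a model's raw JSON reply by hand, without any library. The Recipe shape and the parseRecipe helper are hypothetical names used only for illustration; they are not part of the AI SDK.

```typescript
// Hand-rolled validation of a model's JSON reply (illustrative only).
interface Recipe {
  name: string;
  steps: string[];
}

// Returns a typed Recipe, or throws a descriptive error.
function parseRecipe(raw: string): Recipe {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error('Model output is not valid JSON');
  }
  if (typeof data !== 'object' || data === null) {
    throw new Error('Expected a JSON object');
  }
  const candidate = data as Record<string, unknown>;
  if (typeof candidate.name !== 'string') {
    throw new Error('Missing or invalid "name"');
  }
  const steps = candidate.steps;
  if (!Array.isArray(steps) || !steps.every((s) => typeof s === 'string')) {
    throw new Error('Missing or invalid "steps"');
  }
  return { name: candidate.name, steps: steps as string[] };
}
```

Every extra field or nesting level needs more checks of this kind, which is the work a schema-driven approach is meant to take over.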

The Vercel AI SDK standardises structured object generation across model providers with the generateObject function.

The generateObject function uses Zod schemas to specify the shape of the data that you want, and the AI model will generate data that conforms to that structure. The schema is also used to validate the generated data, ensuring type safety and correctness.

import { generateObject } from 'ai';
import { z } from 'zod';

const { object } = await generateObject({
  model: yourModel,
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
      steps: z.array(z.string()),
    }),
  }),
  prompt: 'Generate a lasagna recipe.',
});

Specifying Generation Mode

While models from some providers (such as OpenAI) natively support object generation, others require alternative methods, like modified tool calling. The generateObject function allows you to specify the method it will use to return structured data.

  • auto: The provider will choose the best mode for the model. This recommended mode is used by default.
  • tool: A tool with the JSON schema as parameters is provided and the provider is instructed to use it.
  • json: The JSON schema and an instruction are injected into the prompt. If the provider supports JSON mode, it is enabled.
  • grammar: The provider is instructed to convert the JSON schema into a provider-specific grammar and use it to select the output tokens.

Please note that not every provider supports all generation modes.
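Because mode support differs between providers, one option is to fall back to auto whenever a preferred mode is unavailable. The sketch below assumes the mode is passed to generateObject as a mode option; the capability table, the provider names, and the resolveMode helper are all hypothetical and exist only to illustrate the fallback.

```typescript
// The four generation modes listed above.
type GenerationMode = 'auto' | 'tool' | 'json' | 'grammar';

// Hypothetical capability table; real support varies by provider and model.
const supportedModes: Record<string, GenerationMode[]> = {
  'provider-a': ['auto', 'tool', 'json'],
  'provider-b': ['auto', 'tool'],
};

// Use the preferred mode when the provider lists it, otherwise fall back to 'auto'.
function resolveMode(provider: string, preferred: GenerationMode): GenerationMode {
  const modes = supportedModes[provider] ?? [];
  return modes.includes(preferred) ? preferred : 'auto';
}

// Assumed usage (not executed here):
// generateObject({ model: yourModel, mode: resolveMode('provider-b', 'json'), schema, prompt });
```

In most cases leaving the mode at its default is simpler. A helper like this only makes sense when you have a reason to prefer a specific mode.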

Streaming Objects

Given the added complexity of returning structured data, the model's response time can be too long for an interactive use case. With the streamObject function, you can stream the model's response as it is generated.

import { streamObject } from 'ai';

const { partialObjectStream } = await streamObject({
  // ...
});

// use partialObjectStream as an async iterable
for await (const partialObject of partialObjectStream) {
  console.log(partialObject);
}

You can use streamObject to stream generated UIs in combination with React Server Components (see Generative UI) or the useObject hook.
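When consuming partial objects, fields that have not arrived yet are simply missing, so rendering code needs fallbacks. The following sketch uses a hand-written async generator as a stand-in for partialObjectStream. The StreamedRecipe shape and the fakePartialStream, renderPartial, and main names are hypothetical and not part of the SDK.

```typescript
// Shape of the finished object; while streaming, any field may still be missing.
interface StreamedRecipe {
  name: string;
  steps: string[];
}

// Stand-in for partialObjectStream: yields progressively more complete objects.
async function* fakePartialStream(): AsyncGenerator<Partial<StreamedRecipe>> {
  yield {};
  yield { name: 'Lasagna' };
  yield { name: 'Lasagna', steps: ['Boil the pasta'] };
  yield { name: 'Lasagna', steps: ['Boil the pasta', 'Layer and bake'] };
}

// Render whatever has arrived so far, with fallbacks for missing fields.
function renderPartial(partial: Partial<StreamedRecipe>): string {
  const name = partial.name ?? '(loading name...)';
  const stepCount = partial.steps?.length ?? 0;
  return `${name}: ${stepCount} step(s) so far`;
}

// Collect one rendered frame per partial object, in arrival order.
async function main(): Promise<string[]> {
  const frames: string[] = [];
  for await (const partial of fakePartialStream()) {
    frames.push(renderPartial(partial));
  }
  return frames;
}
```

Calling main() resolves to one rendered line per partial object. In a real UI, each frame would replace the previous one rather than being collected.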

Guide

Generating Arrays

Most models require an object as the top-level schema. If you want to generate an array, you can wrap it in an object with a single descriptive key and use destructuring to access the array.

const {
  object: { users },
} = await generateObject({
  model: yourModel,
  schema: z.object({
    users: z.array(
      z.object({
        login: z.string(),
        fullName: z.string(),
        age: z.number(),
      }),
    ),
  }),
  prompt: 'Generate a list of fake user profiles for testing.',
});

console.log('users', users);

Dates

Zod expects JavaScript Date objects, but most models return dates as strings. You can use the z.string().datetime() method to specify and validate datetime strings.

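import { openai } from '@ai-sdk/openai';
import { generateObject } from 'ai';
import { z } from 'zod';

const result = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    user: z.object({
      login: z.string(),
      // validated as an ISO 8601 datetime string, not as a Date object
      lastSeen: z
        .string()
        .datetime()
        .describe('Last time the user was seen (ISO 8601 date string).'),
    }),
  }),
  prompt: 'Generate a fake user profile for testing.',
});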
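Since the validated value is still a string, you may want to convert it to a Date before doing any date arithmetic. A minimal sketch, using a hard-coded object as a stand-in for the generated result; the sample values and the toDate helper are hypothetical.

```typescript
// Stand-in for the validated object returned by generateObject (sample values).
const validated = {
  user: { login: 'test-user', lastSeen: '2024-05-01T12:30:00Z' },
};

// Convert an ISO 8601 string to a Date, rejecting unparseable input.
function toDate(iso: string): Date {
  const date = new Date(iso);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Not a parseable date: ${iso}`);
  }
  return date;
}

const lastSeen = toDate(validated.user.lastSeen);
console.log(lastSeen.toISOString());
```

The extra NaN check is defensive: the Date constructor does not throw on bad input, it returns an invalid Date instead.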