Generating Structured Data
While text generation can be useful, your usecase will likely call for generating structured data. For example, you might want to extract information from text, classify data, or generate synthetic data.
Many language models are capable of generating structured data, often defined as using "JSON modes" or "tools". However, you need to manually provide schemas and then validate the generated data as LLMs can produce incorrect or incomplete structured data.
The Vercel AI SDK standardises structured object generation across model providers with the generateObject
function.
The generateObject
function uses Zod schemas or JSON schemas to specify the shape of the data that you want, and the AI model will generate data that conforms to that structure.
The schema is also used to validate the generated data, ensuring type safety and correctness.
import { generateObject } from 'ai';import { z } from 'zod';
const { object } = await generateObject({ model: yourModel, schema: z.object({ recipe: z.object({ name: z.string(), ingredients: z.array(z.object({ name: z.string(), amount: z.string() })), steps: z.array(z.string()), }), }), prompt: 'Generate a lasagna recipe.',});
Specifying Generation Mode
While some models (like OpenAI) natively support object generation, others require alternative methods, like modified tool calling. The generateObject
function allows you to specify the method it will use to return structured data.
auto
: The provider will choose the best mode for the model. This recommended mode is used by default.tool
: A tool with the JSON schema as parameters is provided and the provider is instructed to use it.json
: The JSON schema and an instruction is injected into the prompt. If the provider supports JSON mode, it is enabled. If the provider supports JSON grammars, the grammar is used.
Streaming Objects
Given the added complexity of returning structured data, model response time can be unacceptable for your interactive use case. With the streamObject
function, you can stream the model's response as it is generated.
import { streamObject } from 'ai';
const { partialObjectStream } = await streamObject({ // ...});
// use partialObjectStream as an async iterablefor await (const partialObject of partialObjectStream) { console.log(partialObject);}
You can use streamObject
to stream generated UIs in combination with React Server Components (see Generative UI)) or the useObject
hook.
Schema Writing Tips
The mapping from Zod schemas to LLM inputs (typically JSON schema) is not always straightforward, since the mapping is not one-to-one. Please checkout the following tips and the Prompt Engineering with Tools guide.
Generating Arrays
Most models require an object as the top-level schema. If you want to generate an array, you can wrap it in an object with a single descriptive key and use destructuring to access the array.
const { object: { users },} = await generateObject({ model: yourModel, schema: z.object({ users: z.array( z.object({ login: z.string(), fullName: z.string(), age: z.number(), }), ), }), prompt: 'Generate a list of fake user profiles for testing.',});
console.log('users', users);
Dates
Zod expects JavaScript Date objects, but most models return dates as strings. You can use the z.string().datetime()
method to specify and validate datetime strings.
const result = await generateObject({ model: openai('gpt-4o'), schema: z.object({ user: z.object({ login: z.string(), lastSeen: z .string() .datetime() .describe('Last time the user was seen (ISO 8601 date string().'), }), }), prompt: 'Generate a fake user profile for testing.',});
Error Handling
When you use generateObject
, errors are thrown when the model fails to generate proper JSON (JSONParseError
)
or when the generated JSON does not match the schema (TypeValidationError
).
Both error types contain additional information, e.g. the generated text or the invalid value.
You can use this to e.g. design a function that safely process the result object and also returns values in error cases:
import { openai } from '@ai-sdk/openai';import { JSONParseError, TypeValidationError, generateObject } from 'ai';import { z } from 'zod';
const recipeSchema = z.object({ recipe: z.object({ name: z.string(), ingredients: z.array(z.object({ name: z.string(), amount: z.string() })), steps: z.array(z.string()), }),});
type Recipe = z.infer<typeof recipeSchema>;
async function generateRecipe( food: string,): Promise< | { type: 'success'; recipe: Recipe } | { type: 'parse-error'; text: string } | { type: 'validation-error'; value: unknown } | { type: 'unknown-error'; error: unknown }> { try { const result = await generateObject({ model: openai('gpt-4-turbo'), schema: recipeSchema, prompt: `Generate a ${food} recipe.`, });
return { type: 'success', recipe: result.object }; } catch (error) { if (TypeValidationError.isTypeValidationError(error)) { return { type: 'validation-error', value: error.value }; } else if (JSONParseError.isJSONParseError(error)) { return { type: 'parse-error', text: error.text }; } else { return { type: 'unknown-error', error }; } }}
Examples
You can see generateObject
and streamObject
in action using various frameworks in the following examples: