Generate Image with Chat Prompt
When building a chatbot, you may want to allow the user to generate an image. This can be done by creating a tool that generates an image using the experimental_generateImage
function from the AI SDK.
Server
Let's create an endpoint at /api/chat
that generates the assistant's response based on the conversation history. You will also define a tool called generateImage
that will generate an image based on the assistant's response.
import { openai } from '@ai-sdk/openai';import { experimental_generateImage, Message, streamText, tool } from 'ai';import { z } from 'zod';
export async function POST(request: Request) { const { messages }: { messages: Message[] } = await request.json();
// filter through messages and remove base64 image data to avoid sending to the model const formattedMessages = messages.map(m => { if (m.role === 'assistant' && m.toolInvocations) { m.toolInvocations.forEach(ti => { if (ti.toolName === 'generateImage' && ti.state === 'result') { ti.result.image = `redacted-for-length`; } }); } return m; });
const result = streamText({ model: openai('gpt-4o'), messages: formattedMessages, tools: { generateImage: tool({ description: 'Generate an image', parameters: z.object({ prompt: z.string().describe('The prompt to generate the image from'), }), execute: async ({ prompt }) => { const { image } = await experimental_generateImage({ model: openai.image('dall-e-3'), prompt, }); // in production, save this image to blob storage and return a URL return { image: image.base64, prompt }; }, }), }, }); return result.toDataStreamResponse();}
In production, you should save the generated image to a blob storage and return a URL instead of the base64 image data. If you don't, the base64 image data will be sent to the model which may cause the generation to fail.
Client
Let's create a simple chat interface with useChat
. You will call the /api/chat
endpoint to generate the assistant's response. If the assistant's response contains a generateImage
tool invocation, you will display the tool result (the image in base64 format and the prompt) using the Next Image
component.
'use client';
import { useChat } from 'ai/react';import Image from 'next/image';
export default function Chat() { const { messages, input, handleInputChange, handleSubmit } = useChat();
return ( <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch"> <div className="space-y-4"> {messages.map(m => ( <div key={m.id} className="whitespace-pre-wrap"> <div key={m.id}> <div className="font-bold">{m.role}</div> {m.toolInvocations ? ( m.toolInvocations.map(ti => ti.toolName === 'generateImage' ? ( ti.state === 'result' ? ( <Image key={ti.toolCallId} src={`data:image/png;base64,${ti.result.image}`} alt={ti.result.prompt} height={400} width={400} /> ) : ( <div key={ti.toolCallId} className="animate-pulse"> Generating image... </div> ) ) : null, ) ) : ( <p>{m.content}</p> )} </div> </div> ))} </div>
<form onSubmit={handleSubmit}> <input className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl" value={input} placeholder="Say something..." onChange={handleInputChange} /> </form> </div> );}