Caching Responses
Depending on the type of application you're building, you may want to cache the responses you receive from your AI provider, at least temporarily.
Each stream helper for each provider has special lifecycle callbacks you can use.
The one of interest is likely onFinish, which is called when the stream is closed. This is where you can cache the full response.
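To make the callback pattern concrete, here is a minimal, provider-independent sketch of a stream helper that accumulates chunks and calls onFinish with the full text once the stream closes. The names `streamWithCallback`, `responseCache`, and `demo` are hypothetical and not part of the AI SDK; the in-memory Map only stands in for a real cache.

```typescript
// Hypothetical stand-in for a provider stream helper; not an AI SDK API.
type OnFinish = (event: { text: string }) => void | Promise<void>;

async function* streamWithCallback(
  chunks: string[],
  onFinish: OnFinish,
): AsyncGenerator<string> {
  let text = '';
  for (const chunk of chunks) {
    text += chunk; // accumulate the full response while streaming
    yield chunk;
  }
  // The stream is closed: hand the complete text to the callback.
  await onFinish({ text });
}

// In-memory cache used only for illustration.
const responseCache = new Map<string, string>();

async function demo(): Promise<string[]> {
  const received: string[] = [];
  const stream = streamWithCallback(['Hello', ', ', 'world'], ({ text }) => {
    // Cache the full response once streaming has finished.
    responseCache.set('greeting', text);
  });
  for await (const chunk of stream) {
    received.push(chunk); // the client still receives every chunk as it arrives
  }
  return received;
}
```

The point of the sketch is that caching happens after the last chunk has been delivered, so it does not delay the stream itself.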
Example: Vercel KV
This example uses Vercel KV and Next.js to cache the OpenAI response for 1 hour.
app/api/chat/route.ts
import { openai } from '@ai-sdk/openai';
import { convertToCoreMessages, formatStreamPart, streamText } from 'ai';
import kv from '@vercel/kv';

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Come up with a key based on the request:
  const key = JSON.stringify(messages);

  // Check if we have a cached response
  const cached = await kv.get<string>(key);
  if (cached != null) {
    return new Response(formatStreamPart('text', cached), {
      status: 200,
      headers: { 'Content-Type': 'text/plain' },
    });
  }

  // Call the language model:
  const result = await streamText({
    model: openai('gpt-4o'),
    messages: convertToCoreMessages(messages),
    async onFinish({ text }) {
      // Cache the response text and expire it after 1 hour:
      await kv.set(key, text);
      await kv.expire(key, 60 * 60);
    },
  });

  // Respond with the stream
  return result.toDataStreamResponse();
}
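The example uses the serialized messages directly as the cache key, which grows with the length of the conversation. One possible variation, sketched here as an assumption rather than as part of the original example, is to hash the serialized messages into a fixed-length key. The `ChatMessage` shape, the `cacheKey` helper, and the `chat-cache:` prefix are hypothetical names chosen for illustration.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical message shape for illustration; real requests may carry more fields.
interface ChatMessage {
  role: string;
  content: string;
}

// Derive a fixed-length cache key from the conversation.
// Assumption: two requests count as identical only if their serialized messages match exactly.
function cacheKey(messages: ChatMessage[], prefix = 'chat-cache:'): string {
  const digest = createHash('sha256')
    .update(JSON.stringify(messages))
    .digest('hex'); // 64 hexadecimal characters
  return prefix + digest;
}
```

In the route handler, this would replace the `JSON.stringify(messages)` key. Note that `JSON.stringify` is sensitive to property order, so messages with the same content but a different field order produce different keys.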