AWSBedrock*Stream

The AWS Bedrock stream functions are utilities that transform the output of the AWS Bedrock API into a ReadableStream. They use AIStream under the hood and handle parsing Bedrock's response.

There are currently 4 AWS Bedrock stream wrappers:

  • AWSBedrockAnthropicStream for Anthropic models
  • AWSBedrockAnthropicMessagesStream for Anthropic models using Messages API
  • AWSBedrockCohereStream for Cohere models
  • AWSBedrockLlama2Stream for Llama 2 models

This works with the official Bedrock API, and it's supported in Node.js, Edge Runtime, and browser environments.

AWSBedrock*Stream(res: Response, cb?: AIStreamCallbacks): ReadableStream

Parameters

res: Response

The Response object returned by the request to the AWS Bedrock API.

cb?: AIStreamCallbacks

This optional parameter is an object of callback functions that handle the start of the stream, each token, and the completion of the AI response. If omitted, default behavior is used.
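For example, the callbacks can be used to log each token as it arrives or to persist the full completion once streaming ends. A minimal sketch, assuming the onStart, onToken, and onCompletion callbacks of AIStreamCallbacks:

const stream = AWSBedrockAnthropicStream(bedrockResponse, {
  onStart: async () => {
    // Called once, when the stream begins
    console.log('Stream started');
  },
  onToken: async (token: string) => {
    // Called for each parsed token of the response
    console.log('Token:', token);
  },
  onCompletion: async (completion: string) => {
    // Called with the full completion text, e.g. to save it to a database
    console.log('Completion:', completion);
  },
});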

Example: Anthropic Models

The AWSBedrockAnthropicStream function can be coupled with a call to an Anthropic model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

app/api/chat/route.ts
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockAnthropicStream, StreamingTextResponse } from 'ai';
import { experimental_buildAnthropicPrompt } from 'ai/prompts';
 
// IMPORTANT! Set the runtime to edge
export const runtime = 'edge';
 
const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});
 
export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();
 
  // Ask Claude for a streaming chat completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'anthropic.claude-v2',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: experimental_buildAnthropicPrompt(messages),
        max_tokens_to_sample: 300,
      }),
    }),
  );
 
  // Convert the response into a friendly text-stream
  const stream = AWSBedrockAnthropicStream(bedrockResponse);
 
  // Respond with the stream
  return new StreamingTextResponse(stream);
}

In this example, the AWSBedrockAnthropicStream function transforms the text generation stream returned by Bedrock into a ReadableStream of parsed results. This allows clients to consume AI outputs in real time as they're generated, instead of waiting for the complete response.
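On the client, the streamed response can be consumed with a hook such as useChat from ai/react. A minimal sketch of a page component, assuming the route above is served at /api/chat (the hook's default endpoint):

app/page.tsx
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  // useChat POSTs the conversation to /api/chat and streams tokens into `messages`
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}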

Example: Anthropic Models using Messages API

The AWSBedrockAnthropicMessagesStream function can be coupled with a call to an Anthropic model using the Messages API. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

app/api/chat/route.ts
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockAnthropicMessagesStream, StreamingTextResponse } from 'ai';
import { experimental_buildAnthropicMessages } from 'ai/prompts';
 
// IMPORTANT! Set the runtime to edge
export const runtime = 'edge';
 
const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});
 
export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();
 
  // Ask Claude for a streaming chat completion given the messages
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'anthropic.claude-3-haiku-20240307-v1:0',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        anthropic_version: 'bedrock-2023-05-31',
        messages: experimental_buildAnthropicMessages(messages),
        max_tokens: 300,
      }),
    }),
  );
 
  // Convert the response into a friendly text-stream
  const stream = AWSBedrockAnthropicMessagesStream(bedrockResponse);
 
  // Respond with the stream
  return new StreamingTextResponse(stream);
}

In this example, the AWSBedrockAnthropicMessagesStream function transforms the text generation stream returned by Bedrock into a ReadableStream of parsed results. This allows clients to consume AI outputs in real time as they're generated, instead of waiting for the complete response.

Example: Cohere Models

The AWSBedrockCohereStream function can be coupled with a call to a Cohere model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

app/api/chat/route.ts
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockCohereStream, StreamingTextResponse } from 'ai';
 
const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});
 
function buildPrompt(
  messages: { content: string, role: 'system' | 'user' | 'assistant' }[],
) {
  // map() returns an array, so join the turns before appending the final cue
  return (
    messages
      .map(({ content, role }) => {
        if (role === 'user') {
          return `Human: ${content}\n`;
        } else {
          return `Assistant: ${content}\n`;
        }
      })
      .join('') + 'Assistant:\n'
  );
}
 
export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();
 
  // Ask Cohere for a streaming completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'cohere.command-light-text-v14',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: buildPrompt(messages),
        stream: true,
        max_tokens: 300,
      }),
    }),
  );
 
  // Convert the response into a friendly text-stream
  const stream = AWSBedrockCohereStream(bedrockResponse);
 
  // Respond with the stream
  return new StreamingTextResponse(stream);
}

In this example, the AWSBedrockCohereStream function transforms the text generation stream returned by Bedrock into a ReadableStream of parsed results. This allows clients to consume AI outputs in real time as they're generated, instead of waiting for the complete response.

Example: Llama 2 Models

The AWSBedrockLlama2Stream function can be coupled with a call to a Llama 2 model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

app/api/chat/route.ts
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockLlama2Stream, StreamingTextResponse } from 'ai';
import { experimental_buildLlama2Prompt } from 'ai/prompts';
 
const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});
 
export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();
 
  // Ask Llama 2 for a streaming completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'meta.llama2-13b-chat-v1',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: experimental_buildLlama2Prompt(messages),
        max_gen_len: 300,
      }),
    }),
  );
 
  // Convert the response into a friendly text-stream
  const stream = AWSBedrockLlama2Stream(bedrockResponse);
 
  // Respond with the stream
  return new StreamingTextResponse(stream);
}

In this example, the AWSBedrockLlama2Stream function transforms the text generation stream returned by Bedrock into a ReadableStream of parsed results. This allows clients to consume AI outputs in real time as they're generated, instead of waiting for the complete response.

