Rev.ai Provider

The Rev.ai provider contains transcription model support for the Rev.ai transcription API.

Setup

The Rev.ai provider is available in the @ai-sdk/revai module. You can install it with:

pnpm add @ai-sdk/revai

Provider Instance

You can import the default provider instance revai from @ai-sdk/revai:

import { revai } from '@ai-sdk/revai';

If you need a customized setup, you can import createRevai from @ai-sdk/revai and create a provider instance with your settings:

import { createRevai } from '@ai-sdk/revai';

const revai = createRevai({
  // custom settings, e.g.
  fetch: customFetch,
});

You can use the following optional settings to customize the Rev.ai provider instance:

  • apiKey string

    API key sent in the Authorization header. It defaults to the REVAI_API_KEY environment variable.

  • headers Record<string,string>

    Custom headers to include in the requests.

  • fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>

    Custom fetch implementation. Defaults to the global fetch function. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.
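As a minimal sketch of the middleware use case, the wrapper below logs each request and response status before delegating to an underlying fetch. The names describeRequest and createLoggingFetch are hypothetical helpers written for this example and are not part of @ai-sdk/revai. Headers are deliberately not logged, since they may carry the API key.

```typescript
// Hypothetical helpers for illustration; not exported by @ai-sdk/revai.
type FetchInput = Parameters<typeof fetch>[0];
type FetchInit = Parameters<typeof fetch>[1];

// Summarize a request as "METHOD url" without reading its body or headers.
export function describeRequest(input: FetchInput, init?: FetchInit): string {
  const url =
    typeof input === 'string'
      ? input
      : input instanceof URL
        ? input.href
        : input.url;
  const method =
    init?.method ?? (input instanceof Request ? input.method : 'GET');
  return `${method.toUpperCase()} ${url}`;
}

// Wrap a fetch implementation so every call is logged before and after.
export function createLoggingFetch(
  baseFetch: typeof fetch,
  log: (line: string) => void = console.log,
): typeof fetch {
  return async (input, init) => {
    const summary = describeRequest(input, init);
    log(`-> ${summary}`);
    const response = await baseFetch(input, init);
    log(`<- ${response.status} ${summary}`);
    return response;
  };
}
```

Assuming this wrapper, you could pass it through the fetch setting described above, e.g. createRevai({ fetch: createLoggingFetch(fetch) }).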

Transcription Models

You can create models that call the Rev.ai transcription API using the .transcription() factory method.

The first argument is the model id, e.g. machine.

const model = revai.transcription('machine');

You can also pass additional provider-specific options using the providerOptions argument. For example, supplying the input language in ISO 639-1 format (e.g. en) can sometimes improve transcription performance if it is known beforehand.

import { experimental_transcribe as transcribe } from 'ai';
import { revai } from '@ai-sdk/revai';
import { readFile } from 'fs/promises';

const result = await transcribe({
  model: revai.transcription('machine'),
  audio: await readFile('audio.mp3'),
  providerOptions: { revai: { language: 'en' } },
});

The following provider options are available:

  • metadata string

    Optional metadata to attach to the job at submission.

  • notification_config object

    Optional configuration for a callback url to invoke when processing is complete.

    • url string - Callback url to invoke when processing is complete.
    • auth_headers object - Optional authorization headers, if needed to invoke the callback.
  • delete_after_seconds integer

    Amount of time after job completion when job is auto-deleted.

  • verbatim boolean

    Configures the transcriber to transcribe every syllable, including all false starts and disfluencies.

  • rush boolean

    [HIPAA Unsupported] Only available for the human transcriber option. When set to true, your job is given higher priority.

  • skip_diarization boolean

    Specify if speaker diarization will be skipped by the speech engine.

  • skip_postprocessing boolean

    Only available for English and Spanish languages. User-supplied preference on whether to skip post-processing operations.

  • skip_punctuation boolean

    Specify if "punct" type elements will be skipped by the speech engine.

  • remove_disfluencies boolean

    When set to true, disfluencies (like 'ums' and 'uhs') will not appear in the transcript.

  • remove_atmospherics boolean

    When set to true, atmospherics (like <laugh>, <affirmative>) will not appear in the transcript.

  • filter_profanity boolean

    When enabled, profanities will be filtered by replacing characters with asterisks except for the first and last.

  • speaker_channels_count integer

    Only available for English, Spanish and French languages. Specify the total number of unique speaker channels in the audio.

  • speakers_count integer

    Only available for English, Spanish and French languages. Specify the total number of unique speakers in the audio.

  • diarization_type string

    Specify diarization type. Possible values: "standard" (default), "premium".

  • custom_vocabulary_id string

    Supply the id of a pre-completed custom vocabulary submitted through the Custom Vocabularies API.

  • custom_vocabularies Array

    Specify a collection of custom vocabulary to be used for this job.

  • strict_custom_vocabulary boolean

    If true, only exact phrases will be used as custom vocabulary.

  • summarization_config object

    Specify summarization options.

    • model string - Model type for summarization. Possible values: "standard" (default), "premium".
    • type string - Summarization formatting type. Possible values: "paragraph" (default), "bullets".
    • prompt string - Custom prompt for flexible summaries (mutually exclusive with type).
  • translation_config object

    Specify translation options.

    • target_languages Array - Array of target languages for translation.
    • model string - Model type for translation. Possible values: "standard" (default), "premium".
  • language string

    Language is provided as an ISO 639-1 language code. Default is "en".

  • forced_alignment boolean

    When enabled, provides improved accuracy for per-word timestamps for a transcript. Default is false.

    Currently supported languages:

    • English (en, en-us, en-gb)
    • French (fr)
    • Italian (it)
    • German (de)
    • Spanish (es)

    Note: This option is not available in the low-cost environment.
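Several of the options above carry constraints (rush, the summarization prompt and type, forced_alignment). The sketch below checks a few of them before a job is submitted. RevaiOptionsSketch and checkRevaiOptions are hypothetical names declared only for this example and are not exported by @ai-sdk/revai. The sketch also assumes that the "human transcriber option" corresponds to the human model id and the "low-cost environment" to the low_cost model id.

```typescript
// Hypothetical local type covering a subset of the options listed above.
// Declared for illustration only; not exported by @ai-sdk/revai.
interface RevaiOptionsSketch {
  language?: string;
  rush?: boolean;
  forced_alignment?: boolean;
  summarization_config?: {
    model?: 'standard' | 'premium';
    type?: 'paragraph' | 'bullets';
    prompt?: string;
  };
}

// Language codes listed as supported for forced_alignment.
const FORCED_ALIGNMENT_LANGUAGES = ['en', 'en-us', 'en-gb', 'fr', 'it', 'de', 'es'];

// Return a list of human-readable problems; an empty list means no issue found.
export function checkRevaiOptions(
  modelId: string,
  options: RevaiOptionsSketch,
): string[] {
  const problems: string[] = [];
  // The documented default language is "en".
  const language = (options.language ?? 'en').toLowerCase();

  // Assumption: the "human transcriber option" maps to the "human" model id.
  if (options.rush && modelId !== 'human') {
    problems.push('rush is only available for the human transcriber');
  }

  const summary = options.summarization_config;
  if (summary?.type !== undefined && summary?.prompt !== undefined) {
    problems.push('summarization_config.prompt and type are mutually exclusive');
  }

  if (options.forced_alignment) {
    // Assumption: the "low-cost environment" maps to the "low_cost" model id.
    if (modelId === 'low_cost') {
      problems.push('forced_alignment is not available for low_cost');
    }
    if (!FORCED_ALIGNMENT_LANGUAGES.includes(language)) {
      problems.push(`forced_alignment does not support language "${language}"`);
    }
  }

  return problems;
}
```

A check like this can run before calling transcribe, so that a conflicting combination is reported locally rather than by the API.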

Model Capabilities

Model       Transcription   Duration   Segments   Language
machine
human
low_cost
fusion
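The capability columns correspond to fields on the transcription result. As a rough sketch, the helper below formats such a result for display. It assumes the result exposes text, segments (each with text, startSecond and endSecond), language and durationInSeconds. TranscriptSketch and summarizeTranscript are hypothetical names declared for this example, so check the field names against the SDK version you use.

```typescript
// Hypothetical local types mirroring the assumed result shape.
interface SegmentSketch {
  text: string;
  startSecond: number;
  endSecond: number;
}

interface TranscriptSketch {
  text: string;
  segments: SegmentSketch[];
  language?: string;
  durationInSeconds?: number;
}

// Format a second offset as mm:ss, truncating fractional seconds.
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const mm = String(Math.floor(total / 60)).padStart(2, '0');
  const ss = String(total % 60).padStart(2, '0');
  return `${mm}:${ss}`;
}

// Produce one header line followed by one line per segment.
export function summarizeTranscript(t: TranscriptSketch): string[] {
  const header =
    `language=${t.language ?? 'unknown'} ` +
    `duration=${t.durationInSeconds ?? 'unknown'}s`;
  const lines = t.segments.map(
    (s) =>
      `[${formatTimestamp(s.startSecond)}-${formatTimestamp(s.endSecond)}] ${s.text}`,
  );
  return [header, ...lines];
}
```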