Helicone Observability

Helicone is an open-source LLM observability platform that helps you monitor, analyze, and optimize your AI applications through a proxy-based approach, requiring minimal setup and zero additional dependencies.

Setup

Setting up Helicone:

  1. Create a Helicone account at helicone.ai

  2. Set your API key as an environment variable:

    .env
    HELICONE_API_KEY=your-helicone-api-key
  3. Update your model provider configuration to use Helicone's proxy:

    import { createOpenAI } from '@ai-sdk/openai';
    import { generateText } from 'ai';

    const openai = createOpenAI({
      baseURL: 'https://oai.helicone.ai/v1',
      headers: {
        'Helicone-Auth': `Bearer ${process.env.HELICONE_API_KEY}`,
      },
    });

    // Use normally with the AI SDK
    const response = await generateText({
      model: openai('gpt-4o-mini'),
      prompt: 'Hello world',
    });

That's it! Your requests are now being logged and monitored through Helicone.

→ Learn more about getting started with Helicone on AI SDK

Integration Approach

While other observability solutions require OpenTelemetry instrumentation, Helicone uses a simple proxy approach:

Helicone Proxy (3 lines):

const openai = createOpenAI({
  baseURL: 'https://oai.helicone.ai/v1',
  headers: { 'Helicone-Auth': `Bearer ${process.env.HELICONE_API_KEY}` },
});

Characteristics of Helicone's Proxy Approach:

  • No additional packages required
  • Compatible with JavaScript environments
  • Minimal code changes to existing implementations (see the sketch after this list)
  • Supports features such as caching and rate limiting
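
Because the integration is just a base URL and a header, it is also easy to make conditional. Here is a minimal sketch (our pattern, not from Helicone's docs) that enables the proxy only when a key is configured:

import { createOpenAI } from '@ai-sdk/openai';

// Enable the Helicone proxy only when HELICONE_API_KEY is set;
// otherwise fall back to the default OpenAI endpoint.
const openai = createOpenAI(
  process.env.HELICONE_API_KEY
    ? {
        baseURL: 'https://oai.helicone.ai/v1',
        headers: {
          'Helicone-Auth': `Bearer ${process.env.HELICONE_API_KEY}`,
        },
      }
    : {},
);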

→ Learn more about Helicone's proxy approach

Core Features

User Tracking

Monitor how individual users interact with your AI application:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Hello world',
  headers: {
    'Helicone-User-Id': 'user@example.com',
  },
});

→ Learn more about User Metrics

Custom Properties

Add structured metadata to filter and analyze requests:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Translate this text to French',
  headers: {
    'Helicone-Property-Feature': 'translation',
    'Helicone-Property-Source': 'mobile-app',
    'Helicone-Property-Language': 'French',
  },
});
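
Because property headers are plain string key-value pairs, they are easy to build from application state. A small sketch (the heliconeProperties helper is ours, not part of Helicone or the AI SDK):

// Turn a metadata object into Helicone-Property-* headers.
function heliconeProperties(props: Record<string, string>) {
  return Object.fromEntries(
    Object.entries(props).map(([key, value]) => [`Helicone-Property-${key}`, value]),
  );
}

const translated = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Translate this text to French',
  headers: heliconeProperties({
    Feature: 'translation',
    Source: 'mobile-app',
    Language: 'French',
  }),
});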

→ Learn more about Custom Properties

Session Tracking

Group related requests into coherent conversations:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Tell me more about that',
  headers: {
    'Helicone-Session-Id': 'convo-123',
    'Helicone-Session-Name': 'Travel Planning',
    'Helicone-Session-Path': '/chats/travel',
  },
});
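
Requests that share a Helicone-Session-Id are grouped into one session, so a multi-turn conversation can simply reuse the same headers. A sketch (the prompts are illustrative):

// Reuse one set of session headers across the turns of a conversation.
const sessionHeaders = {
  'Helicone-Session-Id': 'convo-123',
  'Helicone-Session-Name': 'Travel Planning',
  'Helicone-Session-Path': '/chats/travel',
};

const first = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Plan a weekend trip to Lisbon',
  headers: sessionHeaders,
});

const followUp = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Tell me more about that',
  headers: sessionHeaders,
});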

→ Learn more about Sessions

Advanced Configuration

Request Caching

Reduce costs by caching identical requests:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'What is the capital of France?',
  headers: {
    'Helicone-Cache-Enabled': 'true',
  },
});

→ Learn more about Caching

Rate Limiting

Control usage by adding a rate limit policy:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Generate creative content',
  headers: {
    // Allow 10,000 requests per hour
    'Helicone-RateLimit-Policy': '10000;w=3600',
    // Optional: limit by user
    'Helicone-User-Id': 'user@example.com',
  },
});

Format: [quota];w=[time_window];u=[unit];s=[segment] (see the examples after this list), where:

  • quota: Maximum requests allowed in the time window
  • w: Time window in seconds (minimum 60)
  • u: Optional unit - "request" (default) or "cents"
  • s: Optional segment - "user", a custom property name, or omitted for a global limit (default)
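
For example, combining these fields (the policy values are illustrative):

// 10,000 requests per hour, applied globally:
//   '10000;w=3600'
// 500 requests per day per user (requires a Helicone-User-Id header):
//   '500;w=86400;s=user'
// 100 cents of spend per hour per user:
//   '100;w=3600;u=cents;s=user'
const headers = {
  'Helicone-RateLimit-Policy': '500;w=86400;s=user',
  'Helicone-User-Id': 'user@example.com',
};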

→ Learn more about Rate Limiting

LLM Security

Protect against prompt injection, jailbreaking, and other LLM-specific threats:

const response = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: userInput,
  headers: {
    // Basic protection (Prompt Guard model)
    'Helicone-LLM-Security-Enabled': 'true',
    // Optional: advanced protection (Llama Guard model)
    'Helicone-LLM-Security-Advanced': 'true',
  },
});

Protects against multiple attack vectors in 8 languages with minimal latency. Advanced mode adds protection across 14 threat categories.

→ Learn more about LLM Security

Resources