OpenRouter Enhanced - Ruby Gem

The future will bring us hundreds of language models and dozens of providers for each. How will you choose the best?

The OpenRouter API is a single unified interface for all LLMs! And now you can easily use it with Ruby! 🤖🌌

OpenRouter Enhanced is an advanced fork of the original OpenRouter Ruby gem by Obie Fernandez that adds comprehensive AI application development features including tool calling, structured outputs, intelligent model selection, prompt templates, observability, and automatic response healing, all while maintaining full backward compatibility.

📖 Read the story behind OpenRouter Enhanced - Learn why this gem was built and the philosophy behind its design.

Enhanced Features

This fork extends the original OpenRouter gem with enterprise-grade AI development capabilities:

Core AI Features

  • Tool Calling: Full support for OpenRouter's function calling API with Ruby-idiomatic DSL for tool definitions
  • Structured Outputs: JSON Schema validation with automatic healing for non-native models and Ruby DSL for schema definitions
  • Smart Model Selection: Intelligent model selection with fluent DSL for cost optimization, capability requirements, and provider preferences
  • Prompt Templates: Reusable prompt templates with variable interpolation and few-shot learning support

Performance & Reliability

  • Model Registry: Local caching and querying of OpenRouter model data with capability detection
  • Enhanced Response Handling: Rich Response objects with automatic parsing for tool calls and structured outputs
  • Automatic Healing: Self-healing responses for malformed JSON from models that don't natively support structured outputs
  • Model Fallbacks: Automatic failover between models with graceful degradation
  • Streaming Support: Enhanced streaming client with callback system and response reconstruction

Observability & Analytics

  • Usage Tracking: Comprehensive token usage and cost tracking across all API calls
  • Response Analytics: Detailed metadata including tokens, costs, cache hits, and performance metrics
  • Callback System: Extensible event system for monitoring requests, responses, and errors
  • Cost Management: Built-in cost estimation and budget constraints

Development & Testing

  • Comprehensive Testing: VCR-based integration tests with real API interactions
  • Debug Support: Detailed error reporting and validation feedback
  • Configuration Options: Extensive configuration for healing, validation, and performance tuning
  • Backward Compatible: All existing code continues to work unchanged

Core OpenRouter Benefits

  • Prioritize price or performance: OpenRouter scouts for the lowest prices and best latencies/throughputs across dozens of providers, and lets you choose how to prioritize them.
  • Standardized API: No need to change your code when switching between models or providers. You can even let users choose and pay for their own.
  • Easy integration: This Ruby gem provides a simple and intuitive interface to interact with the OpenRouter API, making it effortless to integrate AI capabilities into your Ruby applications.

Installation

Bundler

Add this line to your application's Gemfile:

gem "open_router_enhanced"

And then execute:

bundle install

Gem install

Or install it directly:

gem install open_router_enhanced

And require it in your code:

require "open_router"

Quick Start

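1. Get Your API Key

Create or look up your key at https://openrouter.ai/keys, then expose it to your app as the OPENROUTER_API_KEY environment variable.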

2. Basic Setup and Usage

require "open_router"

# Configure the gem
OpenRouter.configure do |config|
  config.access_token = ENV["OPENROUTER_API_KEY"]
  config.site_name = "Your App Name"
  config.site_url = "https://yourapp.com"
end

# Create a client
client = OpenRouter::Client.new

# Basic completion
response = client.complete([
  { role: "user", content: "What is the capital of France?" }
])

puts response.content
# => "The capital of France is Paris."

3. Enhanced Features Quick Example

# Smart model selection
model = OpenRouter::ModelSelector.new
                                 .require(:function_calling)
                                 .optimize_for(:cost)
                                 .choose

# Tool calling with structured output
weather_tool = OpenRouter::Tool.define do
  name "get_weather"
  description "Get current weather"
  parameters do
    string :location, required: true
  end
end

weather_schema = OpenRouter::Schema.define("weather") do
  string :location, required: true
  number :temperature, required: true
  string :conditions, required: true
end

response = client.complete(
  [{ role: "user", content: "What's the weather in Tokyo?" }],
  model: model,
  tools: [weather_tool],
  response_format: weather_schema
)

# Process results
if response.has_tool_calls?
  weather_data = response.structured_output
  puts "Temperature in #{weather_data['location']}: #{weather_data['temperature']}°"
end

Configuration

Global Configuration

Configure the gem globally, for example in an open_router.rb initializer file. Never hardcode secrets into your codebase. Instead, use Rails.application.credentials or a tool such as dotenv to pass the keys safely into your environments.

OpenRouter.configure do |config|
  config.access_token = ENV["OPENROUTER_API_KEY"]
  config.site_name = "Your App Name"
  config.site_url = "https://yourapp.com"

  # Optional: Configure response healing for non-native structured output models
  config.auto_heal_responses = true
  config.healer_model = "openai/gpt-4o-mini"
  config.max_heal_attempts = 2

  # Optional: Configure strict mode for capability validation
  config.strict_mode = true

  # Optional: Configure automatic forcing for unsupported models
  config.auto_force_on_unsupported_models = true
end

Per-Client Configuration

You can also configure clients individually:

client = OpenRouter::Client.new(
  access_token: ENV["OPENROUTER_API_KEY"],
  request_timeout: 120
)

Faraday Configuration

The configuration object exposes a faraday method that you can pass a block to configure Faraday settings and middleware.

This example adds faraday-retry and a logger that redacts the API key so it doesn't leak into logs.

require 'faraday/retry'

retry_options = {
  max: 2,
  interval: 0.05,
  interval_randomness: 0.5,
  backoff_factor: 2
}

OpenRouter::Client.new(access_token: ENV["ACCESS_TOKEN"]) do |config|
  config.faraday do |f|
    f.request :retry, retry_options
    f.response :logger, ::Logger.new($stdout), { headers: true, bodies: true, errors: true } do |logger|
      logger.filter(/(Bearer) (\S+)/, '\1[REDACTED]')
    end
  end
end
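
To see what that filter does to a logged header, the same pattern and replacement can be tried with plain String#gsub. This is a standalone sketch: the token value is invented, and it assumes the logger applies the filter as a simple pattern substitution on each logged line.

```ruby
# Same pattern and replacement string as the logger.filter call above.
FILTER_PATTERN = /(Bearer) (\S+)/
FILTER_REPLACEMENT = '\1[REDACTED]'

# Hypothetical header line; the token is made up for illustration.
header_line = "Authorization: Bearer sk-or-v1-not-a-real-token"

redacted = header_line.gsub(FILTER_PATTERN, FILTER_REPLACEMENT)
puts redacted
# => Authorization: Bearer[REDACTED]
```

The space after Bearer is part of the match, so it disappears from the output. Use '\1 [REDACTED]' as the replacement if you prefer to keep it.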

Change version or timeout

The default timeout for any request using this library is 120 seconds. You can change it by passing a number of seconds as request_timeout when initializing the client.

client = OpenRouter::Client.new(
    access_token: "access_token_goes_here",
    request_timeout: 240 # Optional
)

Core Features

Basic Completions

Hit the OpenRouter API for a completion:

messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is the color of the sky?" }
]

response = client.complete(messages)
puts response.content
# => "The sky is typically blue during the day due to a phenomenon called Rayleigh scattering. Sunlight..."

Model Selection

Pass an array to the model parameter to enable explicit model routing.

OpenRouter::Client.new.complete(
  [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: "Provide analysis of the data formatted as JSON:" }
  ],
  model: [
    "mistralai/mixtral-8x7b-instruct:nitro",
    "mistralai/mixtral-8x7b-instruct"
  ],
  extras: {
    response_format: {
      type: "json_object"
    }
  }
)

Browse the full list of available models, or fetch it from the OpenRouter API:

models = client.models
puts models
# => [{"id"=>"openrouter/auto", "object"=>"model", "created"=>1684195200, "owned_by"=>"openrouter", "permission"=>[], "root"=>"openrouter", "parent"=>nil}, ...]

Generation Stats

Query the generation stats for a given generation ID:

generation_id = "generation-abcdefg"
stats = client.query_generation_stats(generation_id)
puts stats
# => {"id"=>"generation-abcdefg", "object"=>"generation", "created"=>1684195200, "model"=>"openrouter/auto", "usage"=>{"prompt_tokens"=>10, "completion_tokens"=>50, "total_tokens"=>60}, "cost"=>0.0006}

Enhanced AI Features

Tool Calling

Enable AI models to call functions and interact with external APIs using OpenRouter's function calling with an intuitive Ruby DSL.

Quick Example

# Define a tool using the DSL
weather_tool = OpenRouter::Tool.define do
  name "get_weather"
  description "Get current weather for a location"

  parameters do
    string :location, required: true, description: "City name"
    string :units, enum: ["celsius", "fahrenheit"], default: "celsius"
  end
end

# Use in completion
response = client.complete(
  [{ role: "user", content: "What's the weather in London?" }],
  model: "anthropic/claude-3.5-sonnet",
  tools: [weather_tool],
  tool_choice: "auto"
)

# Handle tool calls
if response.has_tool_calls?
  response.tool_calls.each do |tool_call|
    result = fetch_weather(tool_call.arguments["location"], tool_call.arguments["units"])
    puts "Weather in #{tool_call.arguments['location']}: #{result}"
  end
end
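
The example above calls a single fetch_weather helper directly. With several tools, one common pattern is a dispatch table keyed by tool name. The sketch below is self-contained and does not use the gem: ToolCall, TOOL_HANDLERS, and dispatch_tool are hypothetical names, and the weather result is canned.

```ruby
# Hypothetical stand-in for the gem's tool call object (name plus parsed arguments).
ToolCall = Struct.new(:name, :arguments)

# Map each tool name to the local code that implements it.
TOOL_HANDLERS = {
  "get_weather" => lambda { |args|
    units = args.fetch("units", "celsius")
    "Sunny in #{args.fetch('location')} (#{units})" # canned result for the sketch
  }
}.freeze

# Run one tool call and return either a result or an error hash.
def dispatch_tool(tool_call)
  handler = TOOL_HANDLERS[tool_call.name]
  return { error: "unknown tool: #{tool_call.name}" } unless handler

  { result: handler.call(tool_call.arguments) }
rescue KeyError => e
  # Hash#fetch raises KeyError when a required argument is absent.
  { error: "missing argument: #{e.key}" }
end

p dispatch_tool(ToolCall.new("get_weather", { "location" => "London" }))
p dispatch_tool(ToolCall.new("get_time", {}))
```

Returning an error hash instead of raising keeps the conversation going: the error can be sent back to the model as the tool result so it can correct its call.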

Key Features

  • Ruby DSL: Define tools with intuitive Ruby syntax
  • Parameter Validation: Automatic validation against JSON Schema
  • Tool Choice Control: Auto, required, none, or specific tool selection
  • Conversation Continuation: Easy message building for multi-turn conversations
  • Error Handling: Graceful error handling and validation

📖 Complete Tool Calling Documentation

Structured Outputs

Get JSON responses that conform to specific schemas with automatic validation and healing for non-native models.

Quick Example

# Define a schema using the DSL
user_schema = OpenRouter::Schema.define("user") do
  string :name, required: true, description: "Full name"
  integer :age, required: true, minimum: 0, maximum: 150
  string :email, required: true, description: "Email address"
  boolean :premium, description: "Premium account status"
end

# Get structured response
response = client.complete(
  [{ role: "user", content: "Create a user: John Doe, 30, john@example.com" }],
  model: "openai/gpt-4o",
  response_format: user_schema
)

# Access parsed JSON data
user = response.structured_output
puts user["name"]    # => "John Doe"
puts user["age"]     # => 30
puts user["email"]   # => "john@example.com"

Key Features

  • Ruby DSL: Define JSON schemas with Ruby syntax
  • Automatic Healing: Self-healing for models without native structured output support
  • Validation: Optional validation with detailed error reporting
  • Complex Schemas: Support for nested objects, arrays, and advanced constraints
  • Fallback Support: Graceful degradation for unsupported models
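
To make the validation step concrete, here is a minimal sketch of checking a parsed response against required fields, types, and numeric bounds. It is not the gem's validator (which works against JSON Schema). USER_SCHEMA and validation_errors are hypothetical names, and the rule format is simplified for illustration.

```ruby
# Hypothetical, simplified schema: field name => rules.
USER_SCHEMA = {
  "name"  => { type: String,  required: true },
  "age"   => { type: Integer, required: true, minimum: 0, maximum: 150 },
  "email" => { type: String,  required: true }
}.freeze

# Return a list of human-readable problems; an empty list means the data is valid.
def validation_errors(data, schema)
  schema.flat_map do |field, rules|
    value = data[field]
    if value.nil?
      rules[:required] ? ["#{field} is required"] : []
    elsif !value.is_a?(rules[:type])
      ["#{field} must be a #{rules[:type]}"]
    elsif rules[:minimum] && value < rules[:minimum]
      ["#{field} must be >= #{rules[:minimum]}"]
    elsif rules[:maximum] && value > rules[:maximum]
      ["#{field} must be <= #{rules[:maximum]}"]
    else
      []
    end
  end
end

p validation_errors({ "name" => "John Doe", "age" => 30, "email" => "john@example.com" }, USER_SCHEMA)
# => []
p validation_errors({ "name" => "John Doe", "age" => 200 }, USER_SCHEMA)
# => ["age must be <= 150", "email is required"]
```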

📖 Complete Structured Outputs Documentation

Smart Model Selection

Automatically choose the best AI model based on your specific requirements using a fluent DSL.

Quick Example

# Find the cheapest model with function calling
model = OpenRouter::ModelSelector.new
                                 .require(:function_calling)
                                 .optimize_for(:cost)
                                 .choose

# Advanced selection with multiple criteria
model = OpenRouter::ModelSelector.new
                                 .require(:function_calling, :vision)
                                 .within_budget(max_cost: 0.01)
                                 .min_context(50_000)
                                 .prefer_providers("anthropic", "openai")
                                 .optimize_for(:performance)
                                 .choose

# Get multiple options with fallbacks
models = OpenRouter::ModelSelector.new
                                  .require(:structured_outputs)
                                  .choose_with_fallbacks(limit: 3)
# => ["openai/gpt-4o-mini", "anthropic/claude-3-haiku", "google/gemini-flash"]
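
Conceptually, this kind of selection is a filter over model metadata followed by a sort. The sketch below shows that idea in plain Ruby. The model records, their field names, and select_models are all hypothetical, and the gem's actual selection logic may differ.

```ruby
# Hypothetical model records; real data would come from the model registry.
SKETCH_MODELS = [
  { id: "example/small", capabilities: %i[chat function_calling],        input_cost: 0.0002, context: 128_000 },
  { id: "example/large", capabilities: %i[chat function_calling vision], input_cost: 0.003,  context: 200_000 },
  { id: "example/basic", capabilities: %i[chat],                         input_cost: 0.0001, context: 32_000 }
].freeze

# Keep models that have every required capability and fit the limits, cheapest first.
def select_models(models, required:, max_cost: nil, min_context: 0)
  models
    .select { |model| (required - model[:capabilities]).empty? }
    .select { |model| max_cost.nil? || model[:input_cost] <= max_cost }
    .select { |model| model[:context] >= min_context }
    .sort_by { |model| model[:input_cost] }
end

cheapest = select_models(SKETCH_MODELS, required: [:function_calling]).first
puts cheapest[:id]
# => example/small

with_vision = select_models(SKETCH_MODELS, required: %i[function_calling vision], min_context: 50_000)
puts with_vision.map { |model| model[:id] }.inspect
# => ["example/large"]
```

A fallback list is then just the first few entries of the sorted result rather than only the first one.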

Key Features

  • Fluent DSL: Chain requirements and preferences intuitively
  • Cost Optimization: Find models within budget constraints
  • Capability Matching: Require specific features like function calling or vision
  • Provider Preferences: Prefer or avoid specific providers
  • Graceful Fallbacks: Automatic fallback with requirement relaxation
  • Performance Tiers: Choose between cost and performance optimization

📖 Complete Model Selection Documentation

Prompt Templates

Create reusable, parameterized prompts with variable interpolation and few-shot learning support.

Quick Example

# Basic template with variables
translation_template = OpenRouter::PromptTemplate.new(
  template: "Translate '{text}' from {source_lang} to {target_lang}",
  input_variables: [:text, :source_lang, :target_lang]
)

# Use with client
client = OpenRouter::Client.new
response = client.complete(
  translation_template.to_messages(
    text: "Hello world",
    source_lang: "English",
    target_lang: "French"
  ),
  model: "openai/gpt-4o-mini"
)

# Few-shot learning template
classification_template = OpenRouter::PromptTemplate.new(
  prefix: "Classify the sentiment of the following text. Examples:",
  suffix: "Now classify: {text}",
  examples: [
    { text: "I love this product!", sentiment: "positive" },
    { text: "This is terrible.", sentiment: "negative" },
    { text: "It's okay, nothing special.", sentiment: "neutral" }
  ],
  example_template: "Text: {text}\nSentiment: {sentiment}",
  input_variables: [:text]
)

# Render complete prompt
prompt = classification_template.format(text: "This is amazing!")
puts prompt
# =>
# Classify the sentiment of the following text. Examples:
#
# Text: I love this product!
# Sentiment: positive
#
# Text: This is terrible.
# Sentiment: negative
#
# Text: It's okay, nothing special.
# Sentiment: neutral
#
# Now classify: This is amazing!
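
The {variable} interpolation and few-shot layout shown above can be sketched in a few lines of plain Ruby. This is an illustration of the technique, not the gem's implementation. render_template and render_few_shot are hypothetical helpers, and joining sections with a blank line is an assumption based on the rendered output above.

```ruby
# Replace each {name} placeholder with the matching value; fail loudly on a missing one.
def render_template(template, variables)
  template.gsub(/\{(\w+)\}/) do
    key = Regexp.last_match(1).to_sym
    variables.fetch(key) { raise ArgumentError, "missing variable: #{key}" }.to_s
  end
end

# Few-shot prompt: prefix, one rendered block per example, then the rendered suffix.
def render_few_shot(prefix:, examples:, example_template:, suffix:, inputs:)
  blocks = examples.map { |example| render_template(example_template, example) }
  [prefix, *blocks, render_template(suffix, inputs)].join("\n\n")
end

puts render_template(
  "Translate '{text}' from {source_lang} to {target_lang}",
  { text: "Hello world", source_lang: "English", target_lang: "French" }
)
# => Translate 'Hello world' from English to French

puts render_few_shot(
  prefix: "Classify the sentiment of the following text. Examples:",
  examples: [{ text: "I love this product!", sentiment: "positive" }],
  example_template: "Text: {text}\nSentiment: {sentiment}",
  suffix: "Now classify: {text}",
  inputs: { text: "This is amazing!" }
)
```

Raising on a missing variable mirrors the "validation of required input variables" idea: a half-filled prompt is usually worse than an early error.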

Key Features

  • Variable Interpolation: Use {variable} syntax for dynamic content
  • Few-Shot Learning: Include examples to improve model performance
  • Chat Formatting: Automatic conversion to OpenRouter message format
  • Partial Variables: Pre-fill common variables for reuse
  • Template Composition: Combine templates for complex prompts
  • Validation: Automatic validation of required input variables

📖 Complete Prompt Templates Documentation

Model Registry

Access detailed information about available models and their capabilities.

Quick Example

# Get specific model information
model_info = OpenRouter::ModelRegistry.get_model_info("anthropic/claude-3.5-sonnet")
puts model_info[:capabilities]  # [:chat, :function_calling, :structured_outputs, :vision]
puts model_info[:cost_per_1k_tokens]  # { input: 0.003, output: 0.015 }

# Find models matching requirements
candidates = OpenRouter::ModelRegistry.models_meeting_requirements(
  capabilities: [:function_calling],
  max_input_cost: 0.01
)

# Estimate costs for specific usage
cost = OpenRouter::ModelRegistry.calculate_estimated_cost(
  "openai/gpt-4o",
  input_tokens: 1000,
  output_tokens: 500
)
puts "Estimated cost: $#{cost.round(4)}"  # => "Estimated cost: $0.0105"
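
The arithmetic behind a per-1k-token estimate is simple enough to check by hand. The sketch below uses the illustrative pricing hash shown above ({ input: 0.003, output: 0.015 }). estimate_cost is a hypothetical helper, not the registry method, and real prices vary by model.

```ruby
# cost = (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price
def estimate_cost(pricing, input_tokens:, output_tokens:)
  (input_tokens / 1000.0) * pricing[:input] + (output_tokens / 1000.0) * pricing[:output]
end

# Illustrative per-1k-token prices in USD.
pricing = { input: 0.003, output: 0.015 }

cost = estimate_cost(pricing, input_tokens: 1000, output_tokens: 500)
puts "Estimated cost: $#{cost.round(4)}"
# => Estimated cost: $0.0105
```

Here 1000 input tokens cost 1.0 x 0.003 = 0.003 and 500 output tokens cost 0.5 x 0.015 = 0.0075, for a total of 0.0105.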

Key Features

  • Model Discovery: Browse all available models and their specifications
  • Capability Detection: Check which features each model supports
  • Cost Calculation: Estimate costs for specific token usage
  • Local Caching: Fast model data access with automatic cache management
  • Real-time Updates: Refresh model data from OpenRouter API

Streaming & Real-time

Streaming Client

The enhanced streaming client provides real-time response streaming with callback support and automatic response reconstruction.

Quick Example

# Create streaming client
streaming_client = OpenRouter::StreamingClient.new

# Set up callbacks
streaming_client
  .on_stream(:on_start) { |data| puts "Starting request to #{data[:model]}" }
  .on_stream(:on_chunk) { |chunk| print chunk.content }
  .on_stream(:on_tool_call_chunk) { |chunk| puts "Tool call: #{chunk.name}" }
  .on_stream(:on_finish) { |response| puts "\nCompleted. Total tokens: #{response.total_tokens}" }
  .on_stream(:on_error) { |error| puts "Error: #{error.message}" }

# Stream with automatic response accumulation
response = streaming_client.stream_complete(
  [{ role: "user", content: "Write a short story about a robot" }],
  model: "openai/gpt-4o-mini",
  accumulate_response: true
)

# Access complete response after streaming
puts "Final response: #{response.content}"
puts "Cost: $#{response.cost_estimate}"
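
"Response reconstruction" means folding many small chunks back into one response. The sketch below shows the idea without the gem. The chunk shapes are hypothetical and simplified, modeled on OpenAI-style streaming in which tool call arguments arrive as JSON string fragments grouped by index.

```ruby
require "json"

# Hypothetical, simplified chunk shapes: text deltas and tool-call fragments.
chunks = [
  { content: "Checking" },
  { content: " the weather." },
  { tool_call: { index: 0, name: "get_weather", arguments: '{"loc' } },
  { tool_call: { index: 0, arguments: 'ation":"Tokyo"}' } }
]

# Fold the chunks into one response: concatenate text, and join each tool call's
# argument fragments (grouped by index) before parsing them as JSON.
def accumulate(chunks)
  content = +""
  calls = Hash.new { |hash, index| hash[index] = { name: nil, arguments: +"" } }

  chunks.each do |chunk|
    content << chunk[:content] if chunk[:content]
    next unless (fragment = chunk[:tool_call])

    call = calls[fragment[:index]]
    call[:name] ||= fragment[:name]
    call[:arguments] << fragment[:arguments].to_s
  end

  tool_calls = calls.sort_by { |index, _call| index }.map do |_index, call|
    { name: call[:name], arguments: JSON.parse(call[:arguments]) }
  end
  { content: content, tool_calls: tool_calls }
end

response = accumulate(chunks)
puts response[:content]
# => Checking the weather.
puts response[:tool_calls].first[:arguments]["location"]
# => Tokyo
```

Arguments are parsed only after the stream ends, because an individual fragment is rarely valid JSON on its own.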

Streaming with Tool Calls

# Define a tool
weather_tool = OpenRouter::Tool.define do
  name "get_weather"
  description "Get current weather"
  parameters { string :location, required: true }
end

# Stream with tool calling
streaming_client.stream_complete(
  [{ role: "user", content: "What's the weather in Tokyo?" }],
  model: "anthropic/claude-3.5-sonnet",
  tools: [weather_tool]
) do |chunk|
  if chunk.has_tool_calls?
    chunk.tool_calls.each do |tool_call|
      puts "Calling #{tool_call.name} with #{tool_call.arguments}"
    end
  else
    print chunk.content
  end
end

Streaming Callbacks

The streaming client supports extensive callback events for monitoring and analytics.

streaming_client = OpenRouter::StreamingClient.new

# Monitor token usage in real-time
streaming_client.on_stream(:on_chunk) do |chunk|
  if chunk.usage
    puts "Tokens so far: #{chunk.usage['total_tokens']}"
  end
end

# Handle errors gracefully
streaming_client.on_stream(:on_error) do |error|
  logger.error "Streaming failed: #{error.message}"
  # Implement fallback logic
  fallback_response = client.complete(messages, model: "openai/gpt-4o-mini")
end

# Track performance metrics
start_time = nil
streaming_client
  .on_stream(:on_start) { |data| start_time = Time.now }
  .on_stream(:on_finish) do |response|
    duration = Time.now - start_time
    puts "Request completed in #{duration.round(2)}s"
    puts "Tokens per second: #{response.total_tokens / duration}"
  end

Observability & Analytics

Usage Tracking

Track token usage, costs, and performance metrics across all API calls.

Quick Example

# Create client with usage tracking enabled
client = OpenRouter::Client.new(track_usage: true)

# Make multiple requests
3.times do |i|
  response = client.complete(
    [{ role: "user", content: "Tell me a fact about space #{i + 1}" }],
    model: "openai/gpt-4o-mini"
  )
  puts "Request #{i + 1}: #{response.total_tokens} tokens, $#{response.cost_estimate}"
end

# View comprehensive usage statistics
tracker = client.usage_tracker
puts "\n=== Usage Summary ==="
puts "Total requests: #{tracker.request_count}"
puts "Total tokens: #{tracker.total_tokens}"
puts "Total cost: $#{tracker.total_cost.round(4)}"
puts "Average cost per request: $#{(tracker.total_cost / tracker.request_count).round(4)}"

# View per-model breakdown
tracker.model_usage.each do |model, stats|
  puts "\n#{model}:"
  puts "  Requests: #{stats[:request_count]}"
  puts "  Tokens: #{stats[:total_tokens]}"
  puts "  Cost: $#{stats[:cost].round(4)}"
end

# Print detailed report
tracker.print_summary
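
For intuition about what such a tracker aggregates, here is a minimal standalone sketch that keeps session totals and a per-model breakdown. SketchUsageTracker and its record method are hypothetical. The reader names mirror the ones used above, but this is not the gem's class.

```ruby
# Minimal in-memory tracker: session totals plus a per-model breakdown.
class SketchUsageTracker
  attr_reader :request_count, :total_tokens, :total_cost, :model_usage

  def initialize
    reset!
  end

  # Clear all counters, for example at the start of a new reporting window.
  def reset!
    @request_count = 0
    @total_tokens = 0
    @total_cost = 0.0
    @model_usage = Hash.new do |hash, model|
      hash[model] = { request_count: 0, total_tokens: 0, cost: 0.0 }
    end
    self
  end

  # Record one completed request.
  def record(model:, tokens:, cost:)
    @request_count += 1
    @total_tokens += tokens
    @total_cost += cost

    stats = @model_usage[model]
    stats[:request_count] += 1
    stats[:total_tokens] += tokens
    stats[:cost] += cost
    self
  end
end

tracker = SketchUsageTracker.new
tracker.record(model: "openai/gpt-4o-mini", tokens: 120, cost: 0.0001)
tracker.record(model: "openai/gpt-4o-mini", tokens: 80, cost: 0.0002)
tracker.record(model: "openai/gpt-4o", tokens: 200, cost: 0.003)

puts "Total requests: #{tracker.request_count}"
# => Total requests: 3
puts "Total tokens: #{tracker.total_tokens}"
# => Total tokens: 400
```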

Advanced Usage Tracking

# Track specific operations
client.usage_tracker.reset! # Start fresh

# Simulate different workload types
client.complete(messages, model: "openai/gpt-4o")  # Expensive, high-quality
client.complete(messages, model: "openai/gpt-4o-mini")  # Cheap, fast

# Get usage metrics
cache_hit_rate = client.usage_tracker.cache_hit_rate
tokens_per_second = client.usage_tracker.tokens_per_second

puts "Cache hit rate: #{cache_hit_rate}%"
puts "Tokens per second: #{tokens_per_second}"

# Export usage data as CSV for analysis
csv_data = client.usage_tracker.export_csv
File.write("usage_report.csv", csv_data)

Response Analytics

Every response includes comprehensive metadata for monitoring and optimization.

response = client.complete(messages, model: "anthropic/claude-3.5-sonnet")

# Token metrics
puts "Input tokens: #{response.prompt_tokens}"
puts "Output tokens: #{response.completion_tokens}"
puts "Cached tokens: #{response.cached_tokens}"
puts "Total tokens: #{response.total_tokens}"

# Cost information (requires generation stats query)
puts "Total cost: $#{response.cost_estimate}"

# Model information
puts "Provider: #{response.provider}"
puts "Model: #{response.model}"
puts "System fingerprint: #{response.system_fingerprint}"
puts "Finish reason: #{response.finish_reason}"

Callback System

The client provides an extensible callback system for monitoring requests, responses, and errors.

Basic Callbacks

client = OpenRouter::Client.new

# Monitor all requests
client.on(:before_request) do |params|
  puts "Making request to #{params[:model]} with #{params[:messages].size} messages"
end

# Monitor all responses
client.on(:after_response) do |response|
  puts "Received response: #{response.total_tokens} tokens, $#{response.cost_estimate}"
end

# Monitor tool calls
client.on(:on_tool_call) do |tool_calls|
  tool_calls.each do |call|
    puts "Tool called: #{call.name} with args #{call.arguments}"
  end
end

# Monitor errors
client.on(:on_error) do |error|
  logger.error "API error: #{error.message}"
  # Send to monitoring service
  ErrorReporter.notify(error)
end

Advanced Callback Usage

# Cost monitoring with alerts
client.on(:after_response) do |response|
  if response.cost_estimate > 0.10
    AlertService.send_alert(
      "High cost request: $#{response.cost_estimate} for #{response.total_tokens} tokens"
    )
  end
end

# Performance monitoring
client.on(:before_request) { |params| @start_time = Time.now }
client.on(:after_response) do |response|
  duration = Time.now - @start_time
  if duration > 10.0
    puts "Slow request detected: #{duration.round(2)}s"
  end
end

# Usage analytics
request_count = 0
total_cost = 0.0

client.on(:after_response) do |response|
  request_count += 1
  total_cost += response.cost_estimate || 0.0

  if request_count % 100 == 0
    puts "100 requests processed. Average cost: $#{(total_cost / request_count).round(4)}"
  end
end

# Chain callbacks for complex workflows
client
  .on(:before_request) { |params| log_request(params) }
  .on(:after_response) { |response| log_response(response) }
  .on(:on_tool_call) { |calls| execute_tools(calls) }
  .on(:on_error) { |error| handle_error(error) }
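
Chaining works when each registration returns the receiver. The sketch below shows one way such an event registry could be built. CallbackRegistry and trigger are hypothetical names, and isolating a failing callback from the others is a design choice of this sketch, not documented gem behavior.

```ruby
# Minimal chainable event registry.
class CallbackRegistry
  def initialize
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register a block for an event; returns self so calls can be chained.
  def on(event, &block)
    raise ArgumentError, "block required" unless block

    @callbacks[event] << block
    self
  end

  # Remove every callback registered for an event.
  def clear_callbacks(event)
    @callbacks.delete(event)
    self
  end

  # Invoke callbacks in registration order; one failing callback does not stop the rest.
  def trigger(event, payload)
    @callbacks[event].each do |callback|
      callback.call(payload)
    rescue StandardError => e
      warn "callback for #{event} failed: #{e.message}"
    end
  end
end

seen = []
registry = CallbackRegistry.new
registry
  .on(:after_response) { |tokens| seen << "tokens=#{tokens}" }
  .on(:after_response) { |_tokens| raise "metrics backend down" }
  .on(:after_response) { |tokens| seen << "logged #{tokens}" }

registry.trigger(:after_response, 42)
puts seen.inspect
# => ["tokens=42", "logged 42"]
```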

Cost Management

Built-in cost estimation and usage tracking tools.

# Pre-flight cost estimation
estimated_cost = OpenRouter::ModelRegistry.calculate_estimated_cost(
  "anthropic/claude-3.5-sonnet",
  input_tokens: 1500,
  output_tokens: 800
)

puts "Estimated cost: $#{estimated_cost}"

# Use model selector to stay within budget
if estimated_cost > 0.01
  puts "Switching to cheaper model"
  model = OpenRouter::ModelSelector.new
                                   .within_budget(max_cost: 0.01)
                                   .require(:chat)
                                   .choose
end

# Track costs in real-time
client = OpenRouter::Client.new(track_usage: true)

client.on(:after_response) do |response|
  total_spent = client.usage_tracker.total_cost
  puts "Total spent this session: $#{total_spent.round(4)}"

  if total_spent > 5.00
    puts "⚠️  Session cost exceeds $5.00"
  end
end

Advanced Features

Model Fallbacks

Use multiple models with automatic failover for increased reliability.

# Define fallback chain
response = client.complete(
  messages,
  model: ["openai/gpt-4o", "anthropic/claude-3.5-sonnet", "anthropic/claude-3-haiku"],
  tools: tools
)

# Or use ModelSelector for intelligent fallbacks
models = OpenRouter::ModelSelector.new
                                  .require(:function_calling)
                                  .choose_with_fallbacks(limit: 3)

response = client.complete(messages, model: models, tools: tools)
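
The failover idea itself is easy to show in isolation: try each model in order and return the first success. The sketch below does this on the client side with a simulated outage. complete_with_fallbacks is a hypothetical helper, and this is not a description of how the gem or OpenRouter handles a model array internally.

```ruby
# Try each model in order; return the first successful result with the model that produced it.
def complete_with_fallbacks(models)
  errors = {}
  models.each do |model|
    return { model: model, result: yield(model) }
  rescue StandardError => e
    errors[model] = e.message # remember why this model failed, then try the next one
  end
  raise "all models failed: #{errors}"
end

outcome = complete_with_fallbacks(["example/primary", "example/backup"]) do |model|
  raise "503 from provider" if model == "example/primary" # simulated outage

  "response from #{model}"
end

puts outcome[:model]
# => example/backup
```

Collecting the per-model error messages makes the final exception useful when every model in the chain fails.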

Response Healing

Automatically heal malformed responses from models that don't natively support structured outputs.

# Configure global healing
OpenRouter.configure do |config|
  config.auto_heal_responses = true
  config.healer_model = "openai/gpt-4o-mini"
  config.max_heal_attempts = 2
end

# The gem automatically heals malformed JSON responses
response = client.complete(
  messages,
  model: "some/model-without-native-structured-outputs",
  response_format: schema  # Will be automatically healed if malformed
)
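
Before spending a request on a healer model, a lot of malformed output can be repaired locally. The sketch below shows that kind of cheap first pass: pull the JSON object out of surrounding prose and drop trailing commas. extract_json is a hypothetical helper, and this is not the gem's healing algorithm.

```ruby
require "json"

# Try to recover a JSON object from model output that wraps it in prose
# and may contain trailing commas. Returns nil when nothing parseable is found.
def extract_json(raw)
  candidate = raw[/\{.*\}/m] # from the first "{" to the last "}"
  return nil unless candidate

  JSON.parse(candidate.gsub(/,\s*([}\]])/, '\1')) # drop trailing commas
rescue JSON::ParserError
  nil
end

raw = 'Sure! Here is the user: {"name": "John Doe", "age": 30,} Hope that helps.'
puts JSON.generate(extract_json(raw))
# => {"name":"John Doe","age":30}
```

The trailing-comma regex is deliberately naive and would also rewrite a ",}" sequence inside a string value, so treat a nil or surprising result as the signal to fall back to a model-based repair.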

Performance Optimization

Optimize performance for high-throughput applications.

Batching and Parallelization

require 'concurrent-ruby'

# Process multiple requests in parallel
messages_batch = [
  [{ role: "user", content: "Summarize this: #{text1}" }],
  [{ role: "user", content: "Summarize this: #{text2}" }],
  [{ role: "user", content: "Summarize this: #{text3}" }]
]

# Create thread pool
thread_pool = Concurrent::FixedThreadPool.new(5)

# Process batch with shared model selection
model = OpenRouter::ModelSelector.new
                                 .optimize_for(:performance)
                                 .require(:chat)
                                 .choose

futures = messages_batch.map do |messages|
  Concurrent::Future.execute(executor: thread_pool) do
    client.complete(messages, model: model)
  end
end

# Collect results
results = futures.map(&:value)
thread_pool.shutdown

Caching and Optimization

# Enable aggressive caching
OpenRouter.configure do |config|
  config.cache_ttl = 24 * 60 * 60  # 24 hours
  config.auto_heal_responses = true
  config.strict_mode = false  # Better performance
end

# Use cheaper models for development/testing
if Rails.env.development?
  client = OpenRouter::Client.new(
    default_model: "openai/gpt-4o-mini",  # Cheaper for development
    track_usage: true
  )
else
  client = OpenRouter::Client.new(
    default_model: "anthropic/claude-3.5-sonnet"  # Production quality
  )
end

# Pre-warm model registry cache
OpenRouter::ModelRegistry.refresh_cache!

# Optimize for specific workloads
fast_client = OpenRouter::Client.new(
  request_timeout: 30,  # Shorter timeout
  auto_heal_responses: false,  # Skip healing for speed
  strict_mode: false  # Skip capability validation
)

Memory Management

# Reset usage tracking periodically for long-running apps
client.usage_tracker.reset! if client.usage_tracker.request_count > 1000

# Clear callback chains when not needed
client.clear_callbacks(:after_response) if Rails.env.production?

# Use streaming for large responses to reduce memory usage
streaming_client = OpenRouter::StreamingClient.new

streaming_client.stream_complete(
  [{ role: "user", content: "Write a detailed report on AI trends" }],
  model: "anthropic/claude-3.5-sonnet",
  accumulate_response: false  # Don't store full response
) do |chunk|
  # Process chunk immediately and discard
  process_chunk(chunk.content)
end

Testing & Development

The gem includes comprehensive test coverage with VCR integration for real API testing.

Running Tests

# Run all tests
bundle exec rspec

# Run with documentation format
bundle exec rspec --format documentation

# Run specific test types
bundle exec rspec spec/unit/           # Unit tests only
bundle exec rspec spec/vcr/            # VCR integration tests (requires API key)

VCR Testing

The project includes VCR tests that record real API interactions:

# Set API key for VCR tests
export OPENROUTER_API_KEY="your_api_key"

# Run VCR tests
bundle exec rspec spec/vcr/

# Re-record cassettes (deletes old recordings)
rm -rf spec/fixtures/vcr_cassettes/
bundle exec rspec spec/vcr/

Examples

The project includes comprehensive examples for all features:

# Set your API key
export OPENROUTER_API_KEY="your_key_here"

# Run individual examples
ruby -I lib examples/basic_completion.rb
ruby -I lib examples/tool_calling_example.rb
ruby -I lib examples/structured_outputs_example.rb
ruby -I lib examples/model_selection_example.rb
ruby -I lib examples/prompt_template_example.rb
ruby -I lib examples/streaming_example.rb
ruby -I lib examples/observability_example.rb
ruby -I lib examples/smart_completion_example.rb

# Run all examples
find examples -name "*.rb" -exec ruby -I lib {} \;

Model Exploration Rake Tasks

The gem includes convenient rake tasks for exploring and searching available models without writing code:

Model Summary

View an overview of all available models, including provider breakdown, capabilities, costs, and context lengths:

bundle exec rake models:summary

Output includes:

  • Total model count and breakdown by provider
  • Available capabilities across all models
  • Cost analysis (min/max/median for input and output tokens)
  • Context length statistics
  • Performance tier distribution

Model Search

Search for models using various filters and optimization strategies:

# Basic search by provider
bundle exec rake models:search provider=anthropic

# Search by capabilities
bundle exec rake models:search capability=function_calling,vision

# Optimize for cost with capability requirements
bundle exec rake models:search capability=function_calling optimize=cost limit=10

# Filter by context length
bundle exec rake models:search min_context=200000

# Filter by cost
bundle exec rake models:search max_cost=0.01

# Filter by release date
bundle exec rake models:search newer_than=2024-01-01

# Combine multiple filters
bundle exec rake models:search provider=anthropic capability=function_calling min_context=100000 optimize=cost limit=5

Available search parameters:

  • provider=name - Filter by provider (comma-separated for multiple)
  • capability=cap1,cap2 - Required capabilities (function_calling, vision, structured_outputs, etc.)
  • optimize=strategy - Optimization strategy (cost, performance, latest, context)
  • min_context=tokens - Minimum context length
  • max_cost=amount - Maximum input cost per 1k tokens
  • max_output_cost=amount - Maximum output cost per 1k tokens
  • newer_than=YYYY-MM-DD - Filter models released after date
  • limit=N - Maximum number of results to show (default: 20)
  • fallbacks=true - Show models with fallback support

Examples:

# Find cheapest models with vision support
bundle exec rake models:search capability=vision optimize=cost limit=5

# Find latest Anthropic models with function calling
bundle exec rake models:search provider=anthropic optimize=latest capability=function_calling

# Find high-context models for long documents
bundle exec rake models:search min_context=500000 optimize=context

Troubleshooting

Common Issues and Solutions

Authentication Errors

# Error: "OpenRouter access token missing!"
# Solution: Set your API key
export OPENROUTER_API_KEY="your_key_here"

# Or configure in code
OpenRouter.configure do |config|
  config.access_token = ENV["OPENROUTER_API_KEY"]
end

# Error: "Invalid API key"
# Solution: Verify your key at https://openrouter.ai/keys

Model Selection Issues

# Error: "Model not found or access denied"
# Solution: Check model availability and your account limits
begin
  client.complete(messages, model: "gpt-4")
rescue OpenRouter::ServerError => e
  if e.message.include?("not found")
    puts "Model not available, falling back to default"
    client.complete(messages, model: "openai/gpt-4o-mini")
  end
end

# Error: "Model doesn't support feature X"
# Solution: Use ModelSelector to find compatible models
model = OpenRouter::ModelSelector.new
                                 .require(:function_calling)
                                 .choose

Rate Limiting and Costs

# Error: "Rate limit exceeded"
# Solution: Implement exponential backoff
require 'retries'

with_retries(max_tries: 3, base_sleep_seconds: 1, max_sleep_seconds: 60) do |attempt|
  client.complete(messages, model: model)
end

# Error: "Request too expensive"
# Solution: Use cheaper models or budget constraints
client = OpenRouter::Client.new
model = OpenRouter::ModelSelector.new
                                 .within_budget(max_cost: 0.01)
                                 .choose
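
If you would rather not add a dependency for retries, exponential backoff is short to write by hand. The sketch below doubles the wait after each failed attempt, up to a cap. with_backoff and its parameters are hypothetical, and the sleeper argument exists only so the example can run without actually waiting.

```ruby
# Retry the block with exponentially growing waits: base, 2*base, 4*base, ... capped at max_sleep.
# `sleeper` is injectable so the sketch can be exercised without really sleeping.
def with_backoff(max_tries: 3, base_sleep: 1.0, max_sleep: 60.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_tries # out of attempts: surface the last error

    sleeper.call([base_sleep * (2**(attempt - 1)), max_sleep].min)
    retry
  end
end

waits = []
result = with_backoff(sleeper: ->(seconds) { waits << seconds }) do |attempt|
  raise "Rate limit exceeded" if attempt < 3 # simulate two throttled attempts

  "ok on attempt #{attempt}"
end

puts result
# => ok on attempt 3
puts waits.inspect
# => [1.0, 2.0]
```

In real code you would rescue only the error class that signals rate limiting, and usually add random jitter to the wait so that many clients do not retry in lockstep.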

Structured Output Issues

# Error: "Invalid JSON response"
# Solution: Enable response healing
OpenRouter.configure do |config|
  config.auto_heal_responses = true
  config.healer_model = "openai/gpt-4o-mini"
end

# Error: "Schema validation failed"
# Solution: Check schema definitions and model capability
schema = OpenRouter::Schema.define("user") do
  string :name, required: true
  integer :age, minimum: 0  # Add constraints
end

# Use models that support structured outputs natively
model = OpenRouter::ModelSelector.new
                                 .require(:structured_outputs)
                                 .choose
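
Some "Invalid JSON response" failures are simply valid JSON wrapped in prose or markdown fences, and can be repaired locally before involving a healer model. The sketch below illustrates that idea only; `extract_json` is a hypothetical helper and is not how the gem's healing is implemented.

```ruby
require "json"

# Hypothetical helper: parse a model reply as JSON, tolerating text around
# the JSON object. Returns nil when nothing parseable is found.
def extract_json(text)
  # Fast path: the reply is already valid JSON.
  JSON.parse(text)
rescue JSON::ParserError
  # Fall back to the span between the first "{" and the last "}".
  start = text.index("{")
  finish = text.rindex("}")
  return nil if start.nil? || finish.nil? || finish < start

  begin
    JSON.parse(text[start..finish])
  rescue JSON::ParserError
    nil
  end
end

reply = "Sure! Here is the user:\n{\"name\": \"Ada\", \"age\": 36}\nLet me know if you need more."
parsed_reply = extract_json(reply)
unparseable = extract_json("no json here")
```

A `nil` result is the signal to fall back to a heavier strategy, such as the configured healer model or a retry with a model that supports structured outputs natively.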

Performance Issues

# Issue: Slow responses
# Solution: Optimize client configuration
client = OpenRouter::Client.new(
  request_timeout: 30,  # Lower timeout
  strict_mode: false,   # Skip capability validation
  auto_heal_responses: false  # Skip healing for speed
)

# Issue: High memory usage
# Solution: Use streaming for large responses
streaming_client = OpenRouter::StreamingClient.new
streaming_client.stream_complete(messages, accumulate_response: false) do |chunk|
  process_chunk_immediately(chunk)
end

# Issue: Too many API calls
# Solution: Implement request batching
messages_batch = [...] # Multiple message sets
results = process_batch_concurrently(messages_batch, thread_pool_size: 5)
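
`process_batch_concurrently` is not defined in the snippet above. One possible shape for it, assuming you want a fixed number of worker threads and results in input order, is sketched here. The block stands in for a real `client.complete` call. Note that this runs requests in parallel to cut wall-clock time; it does not reduce the number of API calls.

```ruby
# Hypothetical helper: run one request per message set on a small thread pool.
# The block receives a message set and returns its result; results come back
# in the same order as the input.
def process_batch_concurrently(messages_batch, thread_pool_size: 5)
  jobs = Queue.new
  messages_batch.each_with_index { |messages, index| jobs << [messages, index] }
  jobs.close # after close, pop returns nil once the queue is drained

  results = Array.new(messages_batch.size)
  worker_count = [thread_pool_size, messages_batch.size].min

  workers = Array.new(worker_count) do
    Thread.new do
      while (job = jobs.pop)
        messages, index = job
        results[index] = yield(messages)
      end
    end
  end

  # join re-raises any exception raised inside a worker thread.
  workers.each(&:join)
  results
end

messages_batch = [
  [{ role: "user", content: "first" }],
  [{ role: "user", content: "second" }],
  [{ role: "user", content: "third" }]
]

batch_results = process_batch_concurrently(messages_batch, thread_pool_size: 2) do |messages|
  # Replace with a real request, e.g. client.complete(messages, model: model).
  messages.first[:content].upcase
end
```

Keep the pool small: more threads means more simultaneous requests, which makes rate-limit errors more likely.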

Tool Calling Issues

# Error: "Tool not found"
# Solution: Verify tool definitions match exactly
tool = OpenRouter::Tool.define do
  name "get_weather"  # Must match exactly in model response
  description "Get current weather for a location"
  parameters do
    string :location, required: true
  end
end

# Error: "Invalid tool parameters"
# Solution: Add parameter validation
def handle_weather_tool(tool_call)
  # to_s guards against nil and non-string values sent by the model
  location = tool_call.arguments["location"].to_s.strip
  raise ArgumentError, "Location required" if location.empty?

  get_weather_data(location)
end
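
Both tool errors above come down to matching the name the model returns against the handlers you have, then validating arguments before doing any work. A small dispatch table makes that explicit. This is a sketch: `fake_tool_call` is a stand-in struct exposing only `id`, `name`, and `arguments`, and `dispatch_tool_call` is a hypothetical helper.

```ruby
# Stand-in for a tool call object, exposing only the fields used here.
fake_tool_call = Struct.new(:id, :name, :arguments)

# Hypothetical registry mapping exact tool names to handler lambdas.
handlers = {
  "get_weather" => lambda { |args|
    location = args["location"].to_s.strip
    raise ArgumentError, "Location required" if location.empty?

    "Sunny in #{location}" # placeholder for a real weather lookup
  }
}

# Look up the handler by exact name and fail loudly on unknown tools.
def dispatch_tool_call(tool_call, handlers)
  handler = handlers.fetch(tool_call.name) do
    raise KeyError, "Tool not found: #{tool_call.name}"
  end
  handler.call(tool_call.arguments)
end

weather = dispatch_tool_call(
  fake_tool_call.new("call_1", "get_weather", { "location" => "Paris" }),
  handlers
)
```

Raising on an unknown name, rather than returning `nil`, surfaces mismatches between the tool definition and the handler table during development.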

Debug Mode

Enable detailed logging for troubleshooting:

require 'logger'

OpenRouter.configure do |config|
  config.log_errors = true
  config.faraday do |f|
    f.response :logger, Logger.new($stdout), { headers: true, bodies: true, errors: true }
  end
end

# Enable callback debugging
client = OpenRouter::Client.new
client.on(:before_request) { |params| puts "REQUEST: #{params.inspect}" }
client.on(:after_response) { |response| puts "RESPONSE: #{response.inspect}" }
client.on(:on_error) { |error| puts "ERROR: #{error.message}" }

Performance Monitoring

# Monitor request performance
client.on(:before_request) { @start_time = Time.now }
client.on(:after_response) do |response|
  duration = Time.now - @start_time
  if duration > 5.0
    puts "SLOW REQUEST: #{duration.round(2)}s for #{response.total_tokens} tokens"
  end
end

# Monitor costs
client.on(:after_response) do |response|
  if response.cost_estimate > 0.10
    puts "EXPENSIVE REQUEST: $#{response.cost_estimate}"
  end
end

# Export usage data as CSV for analysis
csv_data = client.usage_tracker.export_csv
File.write("debug_usage.csv", csv_data)
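
The timing callbacks above use `Time.now`, which follows the wall clock and can jump if the system time is adjusted. For measuring durations, a monotonic clock is the safer choice. The sketch below builds a pair of callbacks around one; `build_latency_callbacks` is a hypothetical helper, and it assumes requests on a given client do not overlap, since it tracks a single start time.

```ruby
# Hypothetical helper: returns [before, after] callbacks that share a
# monotonic start time. The `clock` parameter exists so tests can stub time.
def build_latency_callbacks(threshold_seconds: 5.0,
                            clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                            &on_slow)
  started_at = nil
  before = proc { |_params| started_at = clock.call }
  after = proc do |response|
    next if started_at.nil? # no matching before_request was seen

    duration = clock.call - started_at
    started_at = nil
    on_slow&.call(duration, response) if duration > threshold_seconds
    duration
  end
  [before, after]
end

# Stubbed clock so the example is deterministic: 10.0s, then 16.5s.
ticks = [10.0, 16.5]
slow_requests = []
before, after = build_latency_callbacks(clock: -> { ticks.shift }) do |duration, response|
  slow_requests << [duration.round(2), response]
end

before.call({ model: "openai/gpt-4o-mini" })
measured = after.call(:fake_response)
```

With a real client, the two procs could be registered as `client.on(:before_request, &before)` and `client.on(:after_response, &after)`.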

Getting Help

  1. Check the documentation: Each feature has detailed documentation in the docs/ directory
  2. Review examples: Look at working examples in the examples/ directory
  3. Enable debug mode: Turn on logging to see request/response details
  4. Check OpenRouter status: Visit OpenRouter Status
  5. Open an issue: Report bugs at GitHub Issues

API Reference

Client Classes

OpenRouter::Client

Main client for OpenRouter API interactions.

client = OpenRouter::Client.new(
  access_token: "...",
  track_usage: false,
  request_timeout: 120
)

# Core methods
client.complete(messages, **options)  # Chat completions with full feature support
client.models                         # List available models
client.query_generation_stats(id)     # Query generation statistics

# Callback methods
client.on(event, &block)              # Register event callback
client.clear_callbacks(event)         # Clear callbacks for event
client.trigger_callbacks(event, data) # Manually trigger callbacks

# Usage tracking
client.usage_tracker                  # Access usage tracker instance

OpenRouter::StreamingClient

Enhanced streaming client with callback support.

streaming_client = OpenRouter::StreamingClient.new

# Streaming methods
streaming_client.stream_complete(messages, **options)  # Stream with callbacks
streaming_client.on_stream(event, &block)              # Register streaming callbacks

# Available streaming events: :on_start, :on_chunk, :on_tool_call_chunk, :on_finish, :on_error

Enhanced Classes

OpenRouter::Tool

Define and manage function calling tools.

# DSL definition
tool = OpenRouter::Tool.define do
  name "function_name"
  description "Function description"
  parameters do
    string :param1, required: true, description: "Parameter description"
    integer :param2, minimum: 0, maximum: 100
    boolean :param3, default: false
  end
end

# Hash definition
tool = OpenRouter::Tool.from_hash({
  name: "function_name",
  description: "Function description",
  parameters: {
    type: "object",
    properties: { ... }
  }
})

# Methods
tool.name                    # Get tool name
tool.description             # Get tool description
tool.parameters              # Get parameters schema
tool.to_h                    # Convert to hash format
tool.validate_arguments(args) # Validate arguments against schema

OpenRouter::Schema

Define JSON schemas for structured outputs.

# DSL definition
schema = OpenRouter::Schema.define("schema_name") do
  string :name, required: true, description: "User's name"
  integer :age, minimum: 0, maximum: 150
  boolean :active, default: true
  array :tags, items: { type: "string" }
  object :address do
    string :street, required: true
    string :city, required: true
    string :country, default: "US"
  end
end

# Hash definition
schema = OpenRouter::Schema.from_hash("schema_name", {
  type: "object",
  properties: { ... },
  required: [...]
})

# Methods
schema.name                   # Get schema name
schema.schema                 # Get JSON schema hash
schema.validate(data)         # Validate data against schema
schema.to_h                   # Convert to hash format

OpenRouter::PromptTemplate

Create reusable prompt templates with variable interpolation.

# Basic template
template = OpenRouter::PromptTemplate.new(
  template: "Translate '{text}' from {source} to {target}",
  input_variables: [:text, :source, :target]
)

# Few-shot template
template = OpenRouter::PromptTemplate.new(
  prefix: "Classification examples:",
  suffix: "Classify: {input}",
  examples: [{ input: "...", output: "..." }],
  example_template: "Input: {input}\nOutput: {output}",
  input_variables: [:input]
)

# Methods
template.format(**variables)        # Format template with variables
template.to_messages(**variables)   # Convert to OpenRouter message format
template.input_variables           # Get required input variables
template.partial(**variables)       # Create partial template with some variables filled
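
To make the `{variable}` placeholder behavior concrete, here is a plain-Ruby illustration of interpolation and few-shot assembly. It is a sketch of the concept only: `format_template` and `format_few_shot` are hypothetical functions, and the gem's actual output format (separators, error handling) may differ.

```ruby
# Hypothetical illustration of `{name}` placeholder interpolation.
# Raises KeyError when a placeholder has no matching variable.
def format_template(template, **variables)
  template.gsub(/\{(\w+)\}/) do
    variables.fetch(Regexp.last_match(1).to_sym)
  end
end

# Hypothetical few-shot assembly: prefix, rendered examples, then the suffix
# rendered with the caller's variables, separated by blank lines.
def format_few_shot(prefix:, examples:, example_template:, suffix:, **variables)
  rendered = examples.map { |example| format_template(example_template, **example) }
  [prefix, *rendered, format_template(suffix, **variables)].join("\n\n")
end

translated = format_template(
  "Translate '{text}' from {source} to {target}",
  text: "Hello", source: "English", target: "French"
)

few_shot = format_few_shot(
  prefix: "Classification examples:",
  examples: [{ input: "I love it", output: "positive" }],
  example_template: "Input: {input}\nOutput: {output}",
  suffix: "Classify: {input}",
  input: "Not great"
)
```

Failing on a missing variable, rather than leaving the placeholder in place, keeps half-filled prompts from reaching the model.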

OpenRouter::ModelSelector

Intelligent model selection with fluent DSL.

selector = OpenRouter::ModelSelector.new

# Requirement methods
selector.require(*capabilities)           # Require specific capabilities
selector.within_budget(max_cost: 0.01)    # Set maximum cost constraint
selector.min_context(tokens)              # Minimum context length
selector.prefer_providers(*providers)     # Prefer specific providers
selector.avoid_providers(*providers)      # Avoid specific providers
selector.optimize_for(strategy)           # Optimization strategy (:cost, :performance, :balanced)

# Selection methods
selector.choose                           # Choose best single model
selector.choose_with_fallbacks(limit: 3)  # Choose multiple models for fallback
selector.candidates                       # Get all matching models
selector.explain_choice                   # Get explanation of selection

# Available capabilities: :chat, :function_calling, :structured_outputs, :vision, :code_generation
# Available strategies: :cost, :performance, :balanced

OpenRouter::ModelRegistry

Model information and capability detection.

# Class methods
OpenRouter::ModelRegistry.all_models                               # Get all cached models
OpenRouter::ModelRegistry.get_model_info(model)                    # Get specific model info
OpenRouter::ModelRegistry.models_meeting_requirements(...)         # Find models matching criteria
OpenRouter::ModelRegistry.calculate_estimated_cost(model, tokens)  # Estimate cost
OpenRouter::ModelRegistry.refresh_cache!                           # Refresh model cache
OpenRouter::ModelRegistry.cache_status                             # Get cache status

OpenRouter::UsageTracker

Track token usage, costs, and performance metrics.

tracker = client.usage_tracker

# Metrics
tracker.total_tokens              # Total tokens used
tracker.total_cost               # Total estimated cost
tracker.request_count            # Number of requests made
tracker.model_usage              # Per-model usage breakdown
tracker.session_duration         # Time since tracking started

# Analysis methods
tracker.cache_hit_rate          # Cache hit rate percentage
tracker.tokens_per_second       # Tokens processed per second
tracker.print_summary           # Print detailed usage report
tracker.export_csv              # Export usage data as CSV
tracker.summary                 # Get usage summary hash
tracker.reset!                  # Reset all counters

Response Objects

OpenRouter::Response

Enhanced response wrapper with metadata and feature support.

response = client.complete(messages)

# Content access
response.content                    # Response content
response.structured_output         # Parsed JSON for structured outputs

# Tool calling
response.has_tool_calls?          # Check if response has tool calls
response.tool_calls               # Array of ToolCall objects

# Token metrics
response.prompt_tokens            # Input tokens
response.completion_tokens        # Output tokens
response.cached_tokens           # Cached tokens
response.total_tokens            # Total tokens

# Cost information
response.input_cost              # Input cost
response.output_cost             # Output cost
response.cost_estimate           # Total estimated cost

# Performance metrics
response.response_time           # Response time in milliseconds
response.tokens_per_second       # Processing speed

# Model information
response.model                   # Model used
response.provider               # Provider name
response.system_fingerprint     # System fingerprint
response.finish_reason          # Why generation stopped

# Cache information
response.cache_hit?             # Whether response used cache
response.cache_efficiency       # Cache efficiency percentage

# Backward compatibility - delegates hash methods to raw response
response["key"]                 # Hash-style access
response.dig("path", "to", "value") # Deep hash access

OpenRouter::ToolCall

Individual tool call handling and execution.

tool_call = response.tool_calls.first

# Properties
tool_call.id                    # Tool call ID
tool_call.name                  # Tool name
tool_call.arguments             # Tool arguments (Hash)

# Methods
tool_call.validate_arguments!   # Validate arguments against tool schema
tool_call.to_message           # Convert to continuation message format
tool_call.execute(&block)      # Execute tool with block

Error Classes

OpenRouter::Error                    # Base error class
OpenRouter::ConfigurationError       # Configuration issues
OpenRouter::CapabilityError         # Capability validation errors
OpenRouter::ServerError             # API server errors
OpenRouter::ToolCallError           # Tool execution errors
OpenRouter::SchemaValidationError   # Schema validation errors
OpenRouter::StructuredOutputError   # JSON parsing/healing errors
OpenRouter::ModelRegistryError      # Model registry errors
OpenRouter::ModelSelectionError     # Model selection errors

Configuration Options

OpenRouter.configure do |config|
  # Authentication
  config.access_token = "sk-..."
  config.site_name = "Your App Name"
  config.site_url = "https://yourapp.com"

  # Request settings
  config.request_timeout = 120
  config.api_version = "v1"
  config.uri_base = "https://openrouter.ai/api"
  config.extra_headers = {}

  # Response healing
  config.auto_heal_responses = true
  config.healer_model = "openai/gpt-4o-mini"
  config.max_heal_attempts = 2

  # Capability validation
  config.strict_mode = true
  config.auto_force_on_unsupported_models = true

  # Structured outputs
  config.default_structured_output_mode = :strict

  # Caching
  config.cache_ttl = 7 * 24 * 60 * 60  # 7 days

  # Model registry
  config.model_registry_timeout = 30
  config.model_registry_retries = 3

  # Logging
  config.log_errors = false
  config.faraday do |f|
    f.response :logger
  end
end

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/estiens/open_router_enhanced.

This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

For detailed contribution guidelines, see CONTRIBUTING.md.

Branch Strategy

We use a two-branch workflow:

  • main - Stable releases only. Protected branch.
  • dev - Active development. All PRs should target this branch.

⚠️ Important: Always target your PRs to the dev branch, not main. The main branch is reserved for stable releases.

Development Setup

git clone https://github.com/estiens/open_router_enhanced.git
cd open_router_enhanced
bundle install
bundle exec rspec

Running Examples

# Set your API key
export OPENROUTER_API_KEY="your_key_here"

# Run examples
ruby -I lib examples/tool_calling_example.rb
ruby -I lib examples/structured_outputs_example.rb
ruby -I lib examples/model_selection_example.rb

Acknowledgments

This enhanced fork builds upon the excellent foundation laid by Obie Fernandez and the original OpenRouter Ruby gem. The original library was bootstrapped from the Anthropic gem by Alex Rudall and extracted from the codebase of Olympia, Obie's AI startup.

We extend our heartfelt gratitude to:

  • Obie Fernandez - Original OpenRouter gem author and visionary
  • Alex Rudall - Creator of the Anthropic gem that served as the foundation
  • The OpenRouter Team - For creating an amazing unified AI API
  • The Ruby Community - For continuous support and contributions

Maintainer & Consulting

This enhanced fork is maintained by:

Eric Stiens

Need Help with AI Integration?

I'm available for consulting on Ruby AI applications, LLM integration, and building production-ready AI systems. My work extends beyond Ruby to include real-time AI orchestration, character-based AI systems, multi-agent architectures, and low-latency voice/streaming applications. Whether you need help with tool calling workflows, cost optimization, building AI characters with persistent memory, or orchestrating complex multi-model systems, I'd be happy to help.

License

The gem is available as open source under the terms of the MIT License.

The MIT License was chosen for its permissiveness and compatibility: it allows use, modification, and distribution, provided the copyright and license notice are retained.