PromptWarden
Record, audit, and guard AI prompt usage with automatic SDK instrumentation, policy enforcement, and real-time monitoring.
Features
- Automatic SDK Capture: Zero-code integration with OpenAI, Anthropic, and Langchain
- Policy Guardrails: YAML-based rules for cost limits, regex patterns, and alerts
- Enhanced Cost Calculation: Accurate token counting and model-specific pricing
- Real-time Monitoring: CLI tool for live event streaming and filtering
- Alert System: Non-blocking warnings and blocking rejections based on patterns
- Automatic Alert Recording: Alerts included in events and uploaded to SaaS
- Asynchronous Uploads: Batched, gzipped events with disk-retry fallback
Installation
gem install prompt_warden
Or add to your Gemfile:
gem 'prompt_warden'
Quick Start
- Configure (in your app's initializer):
PromptWarden.configure do |config|
config.project_token = 'your-project-token'
config.api_url = 'https://your-saas.com/api/v1/ingest'
end
- Create Policy (config/promptwarden.yml):
max_cost_usd: 0.50 # Block if projected call cost > $0.50
reject_if_regex:
- /password/i
- /(ssn|social\s*security)/i
warn_if_regex:
- /\bETA\b/i
- Use AI SDKs (automatically instrumented):
# OpenAI
client = OpenAI::Client.new
response = client.chat(parameters: {
model: "gpt-4o",
messages: [{ role: "user", content: "What is the ETA?" }]
})
# Anthropic
client = Anthropic::Client.new
response = client.messages(
model: "claude-3-opus-20240229",
max_tokens: 1000,
messages: [{ role: "user", content: "What is the ETA?" }]
)
CLI Tool
Monitor events in real time with the pw_tail command:
# Follow all events
./bin/pw_tail
# Show only events with alerts
./bin/pw_tail --alerts
# Filter by model
./bin/pw_tail --model gpt-4o
# Show events above cost threshold
./bin/pw_tail --cost 0.01
# Filter by status
./bin/pw_tail --status failed
# Limit number of events
./bin/pw_tail --limit 10
# Output in JSON format
./bin/pw_tail --json
# Show recent events without following
./bin/pw_tail --no-follow
CLI Output Format
10:30:00 gpt-4o $0.005 ok [⚠️ /ETA/i] | What is the ETA for this project?
10:31:15 claude-3 $0.75 ok [💰 >$0.5] | How much does this cost?
10:32:30 gpt-4o $0.001 ok | Simple question without alerts
Policy Features
Cost Limits
max_cost_usd: 0.50 # Block requests exceeding $0.50
Regex Patterns
reject_if_regex: # Block requests matching patterns
- /password/i
- /(ssn|social\s*security)/i
warn_if_regex: # Log warnings for patterns
- /\bETA\b/i
- /urgent/i
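The policy file writes patterns in `/pattern/flags` form, which YAML reads as plain strings. How the gem converts them internally is not documented here; the following is a minimal sketch, assuming each entry arrives as a string. `parse_regex_literal` is a hypothetical helper, not part of PromptWarden's API.

```ruby
require "yaml"

# Hypothetical helper (an assumption, not PromptWarden's API):
# turns a "/pattern/flags" string into a Regexp.
def parse_regex_literal(literal)
  match = literal.match(%r{\A/(.*)/([imx]*)\z}m)
  raise ArgumentError, "not a regex literal: #{literal.inspect}" unless match

  source, flags = match.captures
  options = 0
  options |= Regexp::IGNORECASE if flags.include?("i")
  options |= Regexp::MULTILINE  if flags.include?("m")
  options |= Regexp::EXTENDED   if flags.include?("x")
  Regexp.new(source, options)
end

# Same keys as config/promptwarden.yml; the quoted heredoc keeps backslashes intact.
policy = YAML.safe_load(<<~'YAML')
  max_cost_usd: 0.50
  reject_if_regex:
    - /password/i
    - /(ssn|social\s*security)/i
  warn_if_regex:
    - /\bETA\b/i
    - /urgent/i
YAML

reject_patterns = policy["reject_if_regex"].map { |entry| parse_regex_literal(entry) }
warn_patterns   = policy["warn_if_regex"].map { |entry| parse_regex_literal(entry) }
```

Because `\bETA\b` uses word boundaries, a prompt containing "metadata" would not trigger the warning, while "What is the eta?" would.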
Programmatic Checks
# Check for alerts (non-blocking)
alerts = PromptWarden::Policy.instance.check_alerts(
prompt: "What is the ETA?",
cost_estimate: 0.005
)
# Check for blocks (raises PolicyError)
PromptWarden::Policy.instance.check!(
prompt: "What is the password?",
cost_estimate: 0.001
)
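To make the difference between the two calls concrete, here is a self-contained sketch of the same behavior. `SketchPolicy` and `SketchPolicyError` are stand-in names for illustration only, and the shape of the cost alert hash is an assumption; the gem's real classes may differ.

```ruby
# Stand-in names for illustration; these are not the gem's classes.
class SketchPolicyError < StandardError; end

class SketchPolicy
  def initialize(max_cost_usd:, reject_if_regex: [], warn_if_regex: [])
    @max_cost_usd = max_cost_usd
    @reject_if_regex = reject_if_regex
    @warn_if_regex = warn_if_regex
  end

  # Non-blocking: returns a list of alert hashes and never raises.
  def check_alerts(prompt:, cost_estimate:)
    alerts = @warn_if_regex
             .select { |re| re.match?(prompt) }
             .map { |re| { "type" => "regex", "rule" => re.inspect, "level" => "warn" } }
    if cost_estimate > @max_cost_usd
      # Assumed shape for a cost alert; only regex alerts appear in the README.
      alerts << { "type" => "cost", "rule" => ">$#{@max_cost_usd}", "level" => "block" }
    end
    alerts
  end

  # Blocking: raises when the cost limit or a reject pattern is hit.
  def check!(prompt:, cost_estimate:)
    if cost_estimate > @max_cost_usd
      raise SketchPolicyError, "cost #{cost_estimate} exceeds #{@max_cost_usd}"
    end

    hit = @reject_if_regex.find { |re| re.match?(prompt) }
    raise SketchPolicyError, "prompt matches #{hit.inspect}" if hit

    true
  end
end

policy = SketchPolicy.new(
  max_cost_usd: 0.50,
  reject_if_regex: [/password/i, /(ssn|social\s*security)/i],
  warn_if_regex: [/\bETA\b/i, /urgent/i]
)

alerts = policy.check_alerts(prompt: "What is the ETA?", cost_estimate: 0.005)

blocked =
  begin
    policy.check!(prompt: "What is the password?", cost_estimate: 0.001)
    false
  rescue SketchPolicyError
    true
  end
```

In application code the same pattern applies: collect alerts for logging, and wrap the blocking check in a `rescue` so a rejected prompt never reaches the provider.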
Enhanced Cost Calculation
PromptWarden provides accurate cost calculation with:
- Model-specific pricing for OpenAI and Anthropic models
- Token counting with tiktoken integration for OpenAI models
- Response token integration for accurate post-request costs
- Fallback estimation for unknown models
# Calculate cost for a prompt
cost = PromptWarden.calculate_cost(
prompt: "Explain quantum computing",
model: "gpt-4o"
)
# Calculate cost with actual response tokens
actual_cost = PromptWarden.calculate_cost(
prompt: "Explain quantum computing",
model: "gpt-4o",
response_tokens: 150
)
Supported Models
OpenAI Models:
- gpt-4o ($0.0025/1K input, $0.01/1K output)
- gpt-4o-mini ($0.00015/1K input, $0.0006/1K output)
- gpt-4-turbo ($0.01/1K input, $0.03/1K output)
- gpt-3.5-turbo ($0.0005/1K input, $0.0015/1K output)
Anthropic Models:
- claude-3-opus-20240229 ($0.015/1K input, $0.075/1K output)
- claude-3-sonnet-20240229 ($0.003/1K input, $0.015/1K output)
- claude-3-haiku-20240307 ($0.00025/1K input, $0.00125/1K output)
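The per-1K prices above translate into a request cost with simple arithmetic. The sketch below is not the gem's implementation; `sketch_cost_usd` is a hypothetical helper and the token counts are made-up inputs, while the prices are copied from the list above.

```ruby
# Per-1K-token prices in USD, copied from the supported models list.
PRICES_PER_1K = {
  "gpt-4o" => { input: 0.0025, output: 0.01 },
  "gpt-4o-mini" => { input: 0.00015, output: 0.0006 },
  "claude-3-opus-20240229" => { input: 0.015, output: 0.075 }
}.freeze

# Hypothetical helper:
# cost = input_tokens / 1000 * input_price + output_tokens / 1000 * output_price
def sketch_cost_usd(model:, input_tokens:, output_tokens:)
  price = PRICES_PER_1K.fetch(model)
  (input_tokens / 1000.0) * price[:input] + (output_tokens / 1000.0) * price[:output]
end

# 1,000 input tokens and 150 output tokens on gpt-4o:
# 1.0 * 0.0025 + 0.15 * 0.01 = 0.004
cost = sketch_cost_usd(model: "gpt-4o", input_tokens: 1_000, output_tokens: 150)
puts format("$%.4f", cost) # prints "$0.0040"
```

Output tokens cost four to five times as much as input tokens for the models listed, which is why passing the actual response token count changes the result noticeably.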
Supported SDKs
- OpenAI: openai gem
- Anthropic: anthropic gem
- Langchain: langchain gem
Gem vs SaaS
PromptWarden Gem (this repository):
- Local policy enforcement
- Event capture and buffering
- Enhanced cost calculation
- Asynchronous uploads to SaaS
- CLI monitoring tool
- Disk-retry for failed uploads
PromptWarden SaaS (separate application):
- Data storage and retention
- Analytics dashboards
- Advanced alerting (Slack, email)
- User and project management
- Cost tracking and reporting
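The gem-side upload path (batched, gzipped events with a disk-retry fallback) can be pictured roughly as follows. This is a sketch under stated assumptions, not the gem's uploader: `flush_batch`, the `sender` callable, the newline-delimited JSON payload, and the spool file naming are all hypothetical.

```ruby
require "json"
require "zlib"
require "tmpdir"
require "fileutils"

# Hypothetical uploader. `sender` is any callable that raises on failure;
# the real gem's transport and file layout are not documented here.
def flush_batch(events, sender:, retry_dir:)
  # One JSON document per line, compressed as a single gzip payload.
  payload = Zlib.gzip(events.map { |event| JSON.generate(event) }.join("\n"))
  begin
    sender.call(payload)
    :uploaded
  rescue StandardError
    # Fallback: park the payload on disk so a later run can retry it.
    path = File.join(retry_dir, "batch-#{Time.now.to_i}-#{rand(1_000_000)}.ndjson.gz")
    File.binwrite(path, payload)
    :spooled
  end
end

events = [{ "id" => "1", "model" => "gpt-4o", "cost_usd" => 0.005 }]
failing_sender = ->(_payload) { raise IOError, "network down" }

retry_dir = Dir.mktmpdir
result = flush_batch(events, sender: failing_sender, retry_dir: retry_dir)

# Read the spooled file back to confirm nothing was lost.
spooled_file = Dir.glob(File.join(retry_dir, "*.gz")).first
restored = Zlib.gunzip(File.binread(spooled_file)).lines.map { |line| JSON.parse(line) }
FileUtils.remove_entry(retry_dir)
```

Keeping the failure path inside the flush means the calling thread never sees a network error, which matches the non-blocking behavior described for uploads.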
Event Structure
Events are automatically captured and include:
{
"id": "uuid",
"prompt": "What is the ETA?",
"response": "The ETA is 2 weeks",
"model": "gpt-4o",
"latency_ms": 1250,
"cost_usd": 0.005,
"status": "ok",
"timestamp": "2024-01-15T10:30:00Z",
"alerts": [
{
"type": "regex",
"rule": "/ETA/i",
"level": "warn"
}
]
}
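Since events are plain JSON, they are easy to filter in a script. The sketch below parses the event shown above and applies filters similar in spirit to the pw_tail flags `--alerts`, `--model`, `--cost`, and `--status`. `keep_event?` is a hypothetical helper, and treating `--cost` as a strict "greater than" is an assumption.

```ruby
require "json"

event = JSON.parse(<<~JSON)
  {
    "id": "uuid",
    "prompt": "What is the ETA?",
    "response": "The ETA is 2 weeks",
    "model": "gpt-4o",
    "latency_ms": 1250,
    "cost_usd": 0.005,
    "status": "ok",
    "timestamp": "2024-01-15T10:30:00Z",
    "alerts": [
      { "type": "regex", "rule": "/ETA/i", "level": "warn" }
    ]
  }
JSON

# Hypothetical filter; every criterion is optional and all must pass.
def keep_event?(event, alerts_only: false, model: nil, min_cost: nil, status: nil)
  return false if alerts_only && Array(event["alerts"]).empty?
  return false if model && event["model"] != model
  return false if min_cost && event["cost_usd"].to_f <= min_cost
  return false if status && event["status"] != status

  true
end

with_alerts = keep_event?(event, alerts_only: true)   # the event carries one alert
above_cent  = keep_event?(event, min_cost: 0.01)      # 0.005 is not above 0.01
failed_only = keep_event?(event, status: "failed")    # status is "ok"
```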
Development
# Install dependencies
bundle install
# Run tests
bundle exec rspec
# Run CLI tests
bundle exec rspec spec/cli_spec.rb
# Test cost calculation
ruby test_cost_calculation.rb
License
MIT License - see LICENSE file for details.