Class: RubyLLM::SemanticCache::Middleware

Inherits:
Object
Defined in:
lib/ruby_llm/semantic_cache/middleware.rb

Overview

Middleware wrapper for RubyLLM::Chat that automatically caches responses

Examples:

Basic usage

chat = RubyLLM.chat(model: "gpt-5.2")
cached_chat = RubyLLM::SemanticCache.wrap(chat)
cached_chat.ask("What is 2+2?")  # First call - executes LLM

With custom threshold

cached_chat = RubyLLM::SemanticCache.wrap(chat, threshold: 0.95)

Direct Known Subclasses

ScopedMiddleware

Constant Summary

DELEGATED_METHODS =

Methods to delegate directly to the wrapped chat (no caching)

%i[
  model messages tools params headers schema
  with_instructions with_tool with_tools with_model
  with_temperature with_context with_params with_headers with_schema
  on_new_message on_end_message on_tool_call on_tool_result
  each reset_messages!
].freeze
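
A brief sketch of what delegation means in practice (all method names come from the constant above; what these calls return is whatever the wrapped chat returns, which is not shown on this page):

chat = RubyLLM.chat(model: "gpt-5.2")
cached_chat = RubyLLM::SemanticCache.wrap(chat)

# These calls bypass the cache entirely and go straight to the wrapped chat
cached_chat.with_instructions("Answer in one sentence")
cached_chat.with_temperature(0.0)
cached_chat.messages  # the underlying chat's message history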

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(chat, threshold: nil, ttl: nil, on_cache_hit: nil, max_messages: nil) ⇒ Middleware

Returns a new instance of Middleware.

Parameters:

  • chat (RubyLLM::Chat)

    the chat instance to wrap

  • threshold (Float, nil) (defaults to: nil)

    similarity threshold override

  • ttl (Integer, nil) (defaults to: nil)

    TTL override in seconds

  • on_cache_hit (Proc, nil) (defaults to: nil)

    callback when cache hit occurs, receives (chat, user_message, cached_response)

  • max_messages (Integer, :unlimited, false, nil) (defaults to: nil)

    max conversation messages before skipping cache

    • Integer: skip cache after N messages (default: 1, only first message cached)

    • :unlimited or false: cache all messages regardless of conversation length

    • nil: use config default



# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 37

def initialize(chat, threshold: nil, ttl: nil, on_cache_hit: nil, max_messages: nil)
  @chat = chat
  @threshold = threshold
  @ttl = ttl
  @on_cache_hit = on_cache_hit
  @max_messages = max_messages
end
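
As an illustration, the middleware can also be constructed directly with these keyword options; most callers go through RubyLLM::SemanticCache.wrap, and whether wrap forwards every keyword shown here is an assumption. The specific values (0.9, 3600, the callback body) are illustrative only.

chat = RubyLLM.chat(model: "gpt-5.2")

cached_chat = RubyLLM::SemanticCache::Middleware.new(
  chat,
  threshold: 0.9,            # similarity threshold override
  ttl: 3600,                 # cache entries expire after an hour
  max_messages: :unlimited,  # cache every turn, not just the first message
  on_cache_hit: ->(chat, user_message, cached_response) {
    puts "cache hit for #{user_message.inspect}"
  }
)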

Instance Attribute Details

#chat ⇒ Object (readonly)

Returns the value of attribute chat.



# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 27

def chat
  @chat
end

Instance Method Details

#ask(message = nil, with: nil, &block) ⇒ RubyLLM::Message Also known as: say

Ask a question with automatic caching

Parameters:

  • message (String) (defaults to: nil)

    the message to send

  • with (Object) (defaults to: nil)

    attachments to include (when attachments are present, caching is skipped)

  • block (Proc)

    optional streaming block; when a block is given, caching is skipped and the call is passed through to the wrapped chat

Returns:

  • (RubyLLM::Message)

    the response message
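
A short usage sketch: the return type comes from the Returns section above, and #say from the "Also known as" note.

response = cached_chat.ask("What is 2+2?")
response.class                   # => RubyLLM::Message
cached_chat.say("What is 2+2?")  # #say is an alias for #ask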



# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 49

def ask(message = nil, with: nil, &block)
  # Skip caching if message has attachments
  return @chat.ask(message, with: with, &block) if with

  # Skip caching for tool-enabled chats (responses may vary)
  return @chat.ask(message, with: with, &block) if @chat.tools.any?

  # Skip caching if conversation exceeds max_messages (excluding system messages)
  return @chat.ask(message, with: with, &block) if conversation_too_long?

  # Skip caching for streaming (too complex to handle correctly)
  return @chat.ask(message, with: with, &block) if block_given?

  # Use cache for non-streaming
  cache_key = build_cache_key(message)

  cached = cache_lookup(cache_key)
  if cached
    handle_cache_hit(message, cached)
    return cached
  end

  # Execute the actual LLM call
  response = @chat.ask(message)

  # Cache the response
  store_in_cache(cache_key, response)
  RubyLLM::SemanticCache.record_miss!

  response
end
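
To make the branches above concrete, a sketch of how calls are routed; whether the second question actually hits depends on the embedding similarity clearing the configured threshold, and the attachment path is illustrative.

cached_chat.ask("What is the capital of France?")           # miss - calls the LLM, response stored
cached_chat.ask("What's France's capital city?")            # semantically similar - may be served from cache
cached_chat.ask("Describe this", with: "diagram.png")       # attachments - cache bypassed
cached_chat.ask("Tell me a story") { |chunk| print chunk }  # streaming block - cache bypassed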