Class: RubyLLM::SemanticCache::Middleware
- Inherits: Object
- Defined in: lib/ruby_llm/semantic_cache/middleware.rb
Overview
Middleware wrapper for RubyLLM::Chat that automatically caches responses
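A minimal usage sketch, assuming a chat built with the standard RubyLLM.chat helper; the questions shown are illustrative:

chat = RubyLLM.chat                                  # any RubyLLM::Chat instance
cached = RubyLLM::SemanticCache::Middleware.new(chat)

cached.ask("What is Ruby?")   # first call goes to the LLM and the response is cached
cached.ask("What's Ruby?")    # a semantically similar question may be answered from the cache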
Constant Summary collapse
- DELEGATED_METHODS =
  Methods to delegate directly to the wrapped chat (no caching); see the delegation sketch below.
  %i[model tools params headers schema with_instructions with_tool with_tools with_model with_temperature with_context with_params with_headers with_schema on_tool_call on_tool_result each].freeze
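Delegated methods simply forward to the wrapped chat, so the middleware can be configured like a plain chat. A small sketch (return values of the forwarded calls depend on the wrapped chat, so chaining them on the middleware is not assumed here):

cached = RubyLLM::SemanticCache::Middleware.new(RubyLLM.chat)
cached.with_instructions("Answer briefly.")  # forwarded to the wrapped chat, no caching involved
cached.with_temperature(0.2)                 # forwarded as well
cached.model                                 # reads the wrapped chat's current model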
Instance Attribute Summary collapse
- #chat ⇒ Object (readonly)
  Returns the value of attribute chat.
Instance Method Summary collapse
- #ask(message = nil, with: nil, &block) ⇒ RubyLLM::Message (also: #say)
  Ask a question with automatic caching.
- #initialize(chat, threshold: nil, ttl: nil, on_cache_hit: nil, max_messages: nil) ⇒ Middleware (constructor)
  A new instance of Middleware.
Constructor Details
#initialize(chat, threshold: nil, ttl: nil, on_cache_hit: nil, max_messages: nil) ⇒ Middleware
Returns a new instance of Middleware.
# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 37

def initialize(chat, threshold: nil, ttl: nil, on_cache_hit: nil, max_messages: nil)
  @chat = chat
  @threshold = threshold
  @ttl = ttl
  @on_cache_hit = on_cache_hit
  @max_messages = max_messages
end
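A sketch of constructing the middleware with explicit options; the meanings given for threshold (similarity score), ttl (seconds), and max_messages are assumptions based on the option names, and nil values presumably fall back to library-wide defaults:

cached = RubyLLM::SemanticCache::Middleware.new(
  chat,
  threshold: 0.92,                        # assumed: minimum similarity for a cache hit
  ttl: 3600,                              # assumed: cache entry lifetime in seconds
  max_messages: 4,                        # assumed: skip caching once the conversation grows past this
  on_cache_hit: ->(*args) { puts "semantic cache hit" }  # callback arity is an assumption
)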
Instance Attribute Details
#chat ⇒ Object (readonly)
Returns the value of attribute chat.
# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 27

def chat
  @chat
end
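The reader exposes the wrapped chat, which is handy for inspecting state the middleware does not duplicate, for example:

cached.chat.messages   # conversation history lives on the underlying RubyLLM::Chat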
Instance Method Details
#ask(message = nil, with: nil, &block) ⇒ RubyLLM::Message Also known as: say
Ask a question with automatic caching.

# File 'lib/ruby_llm/semantic_cache/middleware.rb', line 49

def ask(message = nil, with: nil, &block)
  # Skip caching if message has attachments
  return @chat.ask(message, with: with, &block) if with

  # Skip caching for tool-enabled chats (responses may vary)
  return @chat.ask(message, with: with, &block) if @chat.tools.any?

  # Skip caching if conversation exceeds max_messages (excluding system messages)
  return @chat.ask(message, with: with, &block) if conversation_too_long?

  # Skip caching for streaming (too complex to handle correctly)
  return @chat.ask(message, with: with, &block) if block_given?

  # Use cache for non-streaming
  cache_key = build_cache_key(message)
  cached = cache_lookup(cache_key)

  if cached
    handle_cache_hit(message, cached)
    return cached
  end

  # Execute the actual LLM call
  response = @chat.ask(message)

  # Cache the response
  store_in_cache(cache_key, response)
  RubyLLM::SemanticCache.record_miss!

  response
end
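Given the guards above, some calls bypass the cache entirely and go straight to the wrapped chat. A sketch of the bypass cases (the attachment file name is illustrative):

cached.ask("Summarize this file", with: "report.pdf")   # attachment present: not cached
cached.ask("Tell me a story") { |chunk| print chunk }   # streaming block given: not cached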