Module: RubyLLM::RedCandle::Chat
Included in: Provider
Defined in: lib/ruby_llm/red_candle/chat.rb
Overview
Chat implementation for the Red Candle provider.
Instance Method Summary
- #complete(messages, tools:, temperature:, model:, params: {}, headers: {}, schema: nil, &block) ⇒ Object
  Override the base complete method to handle local execution.
- #perform_completion!(payload) ⇒ Object
- #perform_streaming_completion!(payload, &block) ⇒ Object
- #render_payload(messages, tools:, temperature:, model:, stream:, schema:) ⇒ Object
Instance Method Details
#complete(messages, tools:, temperature:, model:, params: {}, headers: {}, schema: nil, &block) ⇒ Object
Override the base complete method to handle local execution.
# File 'lib/ruby_llm/red_candle/chat.rb', line 8

def complete(messages, tools:, temperature:, model:, params: {}, headers: {}, schema: nil, &block)
  _ = headers # Interface compatibility

  payload = RubyLLM::Utils.deep_merge(
    render_payload(
      messages,
      tools: tools,
      temperature: temperature,
      model: model,
      stream: block_given?,
      schema: schema
    ),
    params
  )

  if block_given?
    perform_streaming_completion!(payload, &block)
  else
    result = perform_completion!(payload)

    # Convert to Message object for compatibility
    # Red Candle doesn't provide token counts by default, but we can estimate them
    content = result[:content]
    # Rough estimation: ~4 characters per token
    estimated_output_tokens = (content.length / 4.0).round
    estimated_input_tokens = estimate_input_tokens(payload[:messages])

    RubyLLM::Message.new(
      role: result[:role].to_sym,
      content: content,
      model_id: model.id,
      input_tokens: estimated_input_tokens,
      output_tokens: estimated_output_tokens
    )
  end
end
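A minimal usage sketch through the high-level RubyLLM API. The :red_candle provider key and the model id are assumptions for illustration; the behavior follows the block_given? branch above (no block: #perform_completion!, block given: #perform_streaming_completion!).

# Assumed: the provider is registered as :red_candle and the model id points to a
# model Red Candle can load locally (both hypothetical here).
chat = RubyLLM.chat(model: "TinyLlama/TinyLlama-1.1B-Chat-v1.0", provider: :red_candle)

# Without a block, #complete returns a RubyLLM::Message whose token counts are
# estimated at roughly one token per four characters of content.
response = chat.ask("Summarize Hamlet in two sentences.")
puts response.content
puts response.output_tokens

# With a block, the call is routed to #perform_streaming_completion!.
chat.ask("Now Macbeth.") { |chunk| print chunk.content }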
#perform_completion!(payload) ⇒ Object
# File 'lib/ruby_llm/red_candle/chat.rb', line 58

def perform_completion!(payload)
  model = ensure_model_loaded!(payload[:model])
  messages = format_messages(payload[:messages]) # helper name assumed; the identifier was lost in extraction

  # Handle structured generation differently - we need to build the prompt
  # with JSON instructions BEFORE applying the chat template
  response = if payload[:schema]
               generate_with_schema(model, messages, payload[:schema], payload)
             else
               prompt = build_prompt(model, messages)
               validate_context_length!(prompt, payload[:model])
               config = build_generation_config(payload)
               generate_with_error_handling(model, prompt, config, payload[:model])
             end

  format_response(response, payload[:schema])
end
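A sketch of calling #perform_completion! directly. The payload keys match what #render_payload produces; the message hashes, the model id, and the `provider` receiver are illustrative assumptions, and the result shape is implied by how #complete consumes it.

# `provider` stands in for an object that includes RubyLLM::RedCandle::Chat (hypothetical setup).
payload = {
  messages: [
    { role: "system", content: "You are a terse assistant." },
    { role: "user", content: "Name three prime numbers." }
  ],
  temperature: 0.7,
  model: "TinyLlama/TinyLlama-1.1B-Chat-v1.0", # hypothetical local model id
  stream: false,
  schema: nil
}

result = provider.perform_completion!(payload)
result[:role]    # => "assistant" (shape implied by #complete, which reads result[:role] and result[:content])
result[:content] # => the generated text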
#perform_streaming_completion!(payload, &block) ⇒ Object
# File 'lib/ruby_llm/red_candle/chat.rb', line 76

def perform_streaming_completion!(payload, &block)
  model = ensure_model_loaded!(payload[:model])
  messages = format_messages(payload[:messages]) # helper name assumed; the identifier was lost in extraction
  prompt = build_prompt(model, messages)
  validate_context_length!(prompt, payload[:model])

  config = build_generation_config(payload)

  # Collect all streamed content
  full_content = ""

  # Stream tokens with error handling
  stream_with_error_handling(model, prompt, config, payload[:model]) do |token|
    full_content += token
    chunk = format_stream_chunk(token)
    block.call(chunk)
  end

  # Send final chunk with empty content (indicates completion)
  final_chunk = format_stream_chunk("")
  block.call(final_chunk)

  # Return a Message object with the complete response
  estimated_output_tokens = (full_content.length / 4.0).round
  estimated_input_tokens = estimate_input_tokens(payload[:messages])

  RubyLLM::Message.new(
    role: :assistant,
    content: full_content,
    model_id: payload[:model],
    input_tokens: estimated_input_tokens,
    output_tokens: estimated_output_tokens
  )
end
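A sketch of consuming the stream directly, assuming `provider` includes this module, `payload` is shaped as in the previous example, and format_stream_chunk yields objects exposing #content. Note the trailing empty chunk that signals completion.

buffer = +""
message = provider.perform_streaming_completion!(payload) do |chunk|
  buffer << chunk.content.to_s # the final chunk carries empty content to mark completion
end

message.content == buffer # => true; the returned RubyLLM::Message holds the full streamed text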
#render_payload(messages, tools:, temperature:, model:, stream:, schema:) ⇒ Object
# File 'lib/ruby_llm/red_candle/chat.rb', line 43

def render_payload(messages, tools:, temperature:, model:, stream:, schema:)
  # Red Candle doesn't support tools
  if tools && !tools.empty?
    raise RubyLLM::Error.new(nil, "Red Candle provider does not support tool calling")
  end

  {
    messages: messages,
    temperature: temperature,
    model: model.id,
    stream: stream,
    schema: schema
  }
end
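Because Red Candle runs models locally and does not implement tool calling, any non-empty tools collection raises. A sketch, with `provider`, the tool object, and `model` (anything responding to #id) as illustrative stand-ins:

begin
  provider.render_payload(
    [{ role: "user", content: "What's the weather in Oslo?" }],
    tools: { weather: weather_tool }, # any non-empty collection triggers the error
    temperature: 0.5,
    model: model, # must respond to #id, per `model.id` in the method body
    stream: false,
    schema: nil
  )
rescue RubyLLM::Error => e
  e.message # => "Red Candle provider does not support tool calling"
end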