Class: RelevantChunks::Processor

Inherits:
Object
Defined in:
lib/relevant_chunks/processor.rb

Overview

Handles text chunking and relevance scoring using Claude via the Anthropic API.

The Processor class splits input text into overlapping chunks and scores each chunk against a query using the Anthropic API.

Examples:

Basic usage

processor = RelevantChunks::Processor.new(api_key: "your_key")
results = processor.process("Long text here", "What is this about?")

Advanced configuration

processor = RelevantChunks::Processor.new(
  api_key: "your_key",
  model: "claude-3-5-sonnet-latest",  # Claude model to use (this is the default)
  temperature: 0.1,                 # Add slight variation to scores
  system_prompt: "Custom scoring system prompt...",
  max_score: 10                     # Use 0-10 scoring range
)

Scoring text relevance with different queries

processor = RelevantChunks::Processor.new(api_key: "your_key")
text = "The solar system consists of the Sun and everything that orbits around it. " \
       "This includes eight planets, numerous moons, asteroids, comets, and other celestial objects. " \
       "Earth is the third planet from the Sun and the only known planet to harbor life. " \
       "Mars, often called the Red Planet, has been the subject of numerous exploration missions."

# Query about Mars
results = processor.process(text, "Tell me about Mars")
# Returns chunks with scores like:
# - "Mars, often called the Red Planet..." (Score: 60)
# - "...numerous exploration missions." (Score: 35)
# - General solar system info (Score: 15)

# Query about life on planets
results = processor.process(text, "What planets are known to have life?")
# Returns chunks with scores like:
# - "Earth is the third planet...only known planet to harbor life" (Score: 65)
# - Chunks mentioning planets (Score: 35)
# - Other chunks (Score: 5-15)


Constructor Details

#initialize(api_key:, max_tokens: 1000, overlap_size: 100, model: "claude-3-5-sonnet-latest", temperature: 0.0, system_prompt: nil, max_score: 100) ⇒ Processor

Initialize a new Processor instance

Parameters:

  • api_key (String)

    Anthropic API key

  • max_tokens (Integer) (defaults to: 1000)

    Maximum tokens per chunk

  • overlap_size (Integer) (defaults to: 100)

    Overlap size between chunks

  • model (String) (defaults to: "claude-3-5-sonnet-latest")

    Claude model to use (default: “claude-3-5-sonnet-latest”)

  • temperature (Float) (defaults to: 0.0)

    Temperature for scoring (0.0-1.0, default: 0.0)

  • system_prompt (String, nil) (defaults to: nil)

    Custom system prompt for scoring (default: nil)

  • max_score (Integer) (defaults to: 100)

    Maximum score in range (default: 100)
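
How max_tokens and overlap_size interact is easiest to see in a small sketch. The following is not the gem's Chunker: it assumes that whitespace-separated words stand in for tokens and that each chunk starts overlap_size words before the previous chunk ends. The method name sliding_chunks is hypothetical.

```ruby
# Hypothetical sketch (not the gem's Chunker): split text into windows of
# max_tokens words, where consecutive windows share overlap_size words.
def sliding_chunks(text, max_tokens:, overlap_size:)
  unless overlap_size >= 0 && overlap_size < max_tokens
    raise ArgumentError, "overlap_size must be in 0...max_tokens"
  end

  words = text.split
  return [] if words.empty?

  step = max_tokens - overlap_size # how far each window advances
  chunks = []
  start = 0
  loop do
    chunks << words[start, max_tokens].join(" ")
    break if start + max_tokens >= words.length # last window reached the end

    start += step
  end
  chunks
end

chunks = sliding_chunks("a b c d e f g h", max_tokens: 4, overlap_size: 2)
puts chunks.inspect
# prints ["a b c d", "c d e f", "e f g h"]
```

A larger overlap_size lowers the chance that a relevant sentence is split across a chunk boundary, at the cost of more chunks and therefore more API calls.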



# File 'lib/relevant_chunks/processor.rb', line 77

def initialize(api_key:, max_tokens: 1000, overlap_size: 100,
               model: "claude-3-5-sonnet-latest", temperature: 0.0,
               system_prompt: nil, max_score: 100)
  @api_key = api_key
  @chunker = Chunker.new(max_tokens: max_tokens, overlap_size: overlap_size)
  @model = model
  @temperature = temperature
  @max_score = max_score
  @system_prompt = system_prompt || default_system_prompt
  @conn = Faraday.new(url: "https://api.anthropic.com") do |f|
    f.request :json
    f.response :json
    f.adapter :net_http
    f.headers = {
      "accept" => "application/json",
      "anthropic-version" => "2023-06-01",
      "content-type" => "application/json",
      "x-api-key" => api_key
    }
  end
end
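
The constructor only configures the connection; the scoring request itself is built elsewhere (in score_chunk, whose source is not shown here). As a rough sketch of what such a request body could look like, assuming the Anthropic Messages API field names (model, max_tokens, temperature, system, messages): the helper name build_scoring_payload, the prompt wording, and the reply limit of 10 tokens are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical helper: builds a Messages API request body for scoring one chunk.
# The prompt wording and the max_tokens value are assumptions, not the gem's own.
def build_scoring_payload(chunk:, query:, model:, temperature:, system_prompt:)
  {
    model: model,
    max_tokens: 10, # assumption: the reply is only a short number
    temperature: temperature,
    system: system_prompt,
    messages: [
      { role: "user", content: "Query: #{query}\n\nText chunk: #{chunk}" }
    ]
  }
end

payload = build_scoring_payload(
  chunk: "Mars, often called the Red Planet...",
  query: "Tell me about Mars",
  model: "claude-3-5-sonnet-latest",
  temperature: 0.0,
  system_prompt: "Score the relevance of the chunk to the query."
)
puts JSON.pretty_generate(payload)
```

With the json request middleware configured above, a hash like this could presumably be passed directly as the body of a Faraday post call to "/v1/messages", which would serialize it to JSON.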

Class Attribute Details

.configuration ⇒ Object

Returns the value of attribute configuration.



# File 'lib/relevant_chunks/processor.rb', line 46

def configuration
  @configuration
end

Instance Attribute Details

#api_key ⇒ String (readonly)

Returns the Anthropic API key.

Returns:

  • (String)

    Anthropic API key



# File 'lib/relevant_chunks/processor.rb', line 50

def api_key
  @api_key
end

#chunker ⇒ Chunker (readonly)

Returns the text chunker instance.

Returns:

  • (Chunker)

    Text chunker instance



# File 'lib/relevant_chunks/processor.rb', line 53

def chunker
  @chunker
end

#max_score ⇒ Integer (readonly)

Returns the maximum score in the scoring range.

Returns:

  • (Integer)

    Maximum score in the scoring range



# File 'lib/relevant_chunks/processor.rb', line 65

def max_score
  @max_score
end
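
Because max_score changes the scale, scores produced by processors configured with different ranges are not directly comparable. One way to compare them is to normalize each score to a fraction of its range. This is a minimal sketch; normalized_score is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: map a score in 0..max_score onto 0.0..1.0.
def normalized_score(score, max_score)
  raise ArgumentError, "max_score must be positive" unless max_score.positive?

  # Clamp first so out-of-range values cannot escape the 0.0..1.0 interval.
  score.clamp(0, max_score).fdiv(max_score)
end

puts normalized_score(60, 100) # a score of 60 on the default 0-100 scale
puts normalized_score(6, 10)   # a score of 6 on a 0-10 scale
```

Both calls yield the same fraction, so a threshold chosen on the normalized scale applies regardless of how max_score was configured.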

#model ⇒ String (readonly)

Returns the Claude model to use.

Returns:

  • (String)

    Claude model to use



# File 'lib/relevant_chunks/processor.rb', line 56

def model
  @model
end

#system_prompt ⇒ String (readonly)

Returns the system prompt used for scoring.

Returns:

  • (String)

    System prompt for scoring



# File 'lib/relevant_chunks/processor.rb', line 62

def system_prompt
  @system_prompt
end

#temperature ⇒ Float (readonly)

Returns the temperature used for scoring (0.0-1.0).

Returns:

  • (Float)

    Temperature for scoring (0.0-1.0)



# File 'lib/relevant_chunks/processor.rb', line 59

def temperature
  @temperature
end

Instance Method Details

#process(text, query) ⇒ Array<Hash>

Process text and score chunks against a query

Examples:

processor = RelevantChunks::Processor.new(api_key: "your_key")
results = processor.process("Long text here", "What is this about?")
results.each do |result|
  puts "Chunk: #{result[:chunk]}"
  puts "Score: #{result[:score]}"
  puts "Raw response: #{result[:response].inspect}"
end

Parameters:

  • text (String)

    The text to process

  • query (String)

    The query to score chunks against

Returns:

  • (Array<Hash>)

    Array of chunks with their scores and API responses. Each hash contains:

    • :chunk [String] The text chunk that was scored

    • :score [Integer] The relevance score (0 to max_score; 0-100 by default)

    • :response [Hash] The complete raw response from the Anthropic API
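
How :score is derived from :response is not shown on this page. As an illustration only, the sketch below assumes the raw response follows the Anthropic Messages API shape (a "content" array of text blocks) and that the model replies with a number somewhere in its text. The helper extract_score and the sample response are hypothetical.

```ruby
# Hypothetical helper: pull the first integer out of the reply text and
# clamp it to the configured range. Returns nil when no number is found.
def extract_score(response, max_score: 100)
  text = response.dig("content", 0, "text").to_s
  match = text.match(/\d+/)
  return nil unless match

  match[0].to_i.clamp(0, max_score)
end

# Made-up response for illustration, not real API output.
sample = { "content" => [{ "type" => "text", "text" => "Score: 72" }] }
puts extract_score(sample) # prints 72
```

Keeping the raw :response alongside the parsed :score makes it possible to inspect replies that did not parse cleanly.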



# File 'lib/relevant_chunks/processor.rb', line 115

def process(text, query)
  chunks = chunker.chunk_text(text)
  chunks.map { |chunk| score_chunk(chunk, query) }
end
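
Since process returns one hash per chunk in document order, callers will often want to rank and filter the results. The sketch below works on the documented result shape (:chunk, :score, :response). The helper top_chunks and its defaults are hypothetical, and the sample data mirrors the illustrative scores from the examples at the top of this page rather than real API output.

```ruby
# Hypothetical helper: keep chunks scoring at least min_score,
# highest score first, and return at most limit of them.
def top_chunks(results, min_score: 50, limit: 3)
  results
    .select { |result| result[:score] >= min_score }
    .sort_by { |result| -result[:score] }
    .first(limit)
end

# Sample results shaped like the return value of #process (made-up data).
results = [
  { chunk: "General solar system info", score: 15, response: {} },
  { chunk: "Mars, often called the Red Planet...", score: 60, response: {} },
  { chunk: "...numerous exploration missions.", score: 35, response: {} }
]

top_chunks(results, min_score: 30).each do |result|
  puts "#{result[:score]}: #{result[:chunk]}"
end
# prints:
# 60: Mars, often called the Red Planet...
# 35: ...numerous exploration missions.
```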