Class: EvalRuby::Evaluator
- Inherits:
-
Object
- Object
- EvalRuby::Evaluator
- Defined in:
- lib/eval_ruby/evaluator.rb
Overview
Runs all configured metrics on a given question/answer/context tuple.
Instance Method Summary collapse
-
#evaluate(question:, answer:, context: [], ground_truth: nil) ⇒ Result
Evaluates an LLM response across quality metrics.
-
#evaluate_retrieval(question:, retrieved:, relevant:) ⇒ RetrievalResult
Evaluates retrieval quality using IR metrics.
-
#initialize(config = EvalRuby.configuration) ⇒ Evaluator
constructor
A new instance of Evaluator.
Constructor Details
#initialize(config = EvalRuby.configuration) ⇒ Evaluator
Returns a new instance of Evaluator.
11 12 13 14 |
# File 'lib/eval_ruby/evaluator.rb', line 11 def initialize(config = EvalRuby.configuration) @config = config @judge = build_judge(config) end |
Instance Method Details
#evaluate(question:, answer:, context: [], ground_truth: nil) ⇒ Result
Evaluates an LLM response across quality metrics.
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
# File 'lib/eval_ruby/evaluator.rb', line 23 def evaluate(question:, answer:, context: [], ground_truth: nil) scores = {} details = {} # LLM-as-judge metrics faith = Metrics::Faithfulness.new(judge: @judge).call(answer: answer, context: context) scores[:faithfulness] = faith[:score] details[:faithfulness] = faith[:details] rel = Metrics::Relevance.new(judge: @judge).call(question: question, answer: answer) scores[:relevance] = rel[:score] details[:relevance] = rel[:details] cp = Metrics::ContextPrecision.new(judge: @judge).call(question: question, context: context) scores[:context_precision] = cp[:score] details[:context_precision] = cp[:details] if ground_truth corr = Metrics::Correctness.new(judge: @judge).call(answer: answer, ground_truth: ground_truth) scores[:correctness] = corr[:score] details[:correctness] = corr[:details] cr = Metrics::ContextRecall.new(judge: @judge).call(context: context, ground_truth: ground_truth) scores[:context_recall] = cr[:score] details[:context_recall] = cr[:details] end Result.new(scores: scores, details: details) end |
#evaluate_retrieval(question:, retrieved:, relevant:) ⇒ RetrievalResult
Evaluates retrieval quality using IR metrics.
59 60 61 |
# File 'lib/eval_ruby/evaluator.rb', line 59 def evaluate_retrieval(question:, retrieved:, relevant:) RetrievalResult.new(retrieved: retrieved, relevant: relevant) end |