Class: EvalRuby::Metrics::Relevance
- Defined in:
- lib/eval_ruby/metrics/relevance.rb
Overview
Measures whether an answer is relevant to the question. Uses an LLM judge to evaluate relevance on a 0.0-1.0 scale.
Constant Summary
- PROMPT_TEMPLATE =
    <<~PROMPT
      Given the following question and answer, evaluate whether the answer is relevant to and addresses the question.
      Question: %{question}
      Answer: %{answer}
      Evaluate relevance on a scale from 0.0 to 1.0 where:
      - 1.0 = the answer fully and directly addresses the question
      - 0.5 = the answer partially addresses the question
      - 0.0 = the answer is completely irrelevant to the question
      Respond in JSON: {"reasoning": "...", "score": 0.0}
    PROMPT
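The `%{question}` and `%{answer}` placeholders in the template are filled in with Ruby's `Kernel#format` and keyword arguments. A minimal sketch (the question and answer strings here are made up for illustration):

```ruby
template = "Question: %{question}\nAnswer: %{answer}"

# Kernel#format substitutes %{name} placeholders from keyword arguments.
prompt = format(template, question: "What is 2 + 2?", answer: "4")
puts prompt
# Question: What is 2 + 2?
# Answer: 4
```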
Instance Attribute Summary
Attributes inherited from Base
Instance Method Summary collapse
- #call(question:, answer:, **_kwargs) ⇒ Hash
  Returns a Hash with :score (Float 0.0-1.0) and :details (a Hash with a :reasoning String).
Methods inherited from Base
Constructor Details
This class inherits a constructor from EvalRuby::Metrics::Base
Instance Method Details
#call(question:, answer:, **_kwargs) ⇒ Hash
Returns a Hash with :score (a Float clamped to 0.0-1.0) and :details (a Hash with a :reasoning String).
# File 'lib/eval_ruby/metrics/relevance.rb', line 34

def call(question:, answer:, **_kwargs)
  prompt = format(PROMPT_TEMPLATE, question: question, answer: answer)
  result = judge.call(prompt)

  raise Error, "Judge returned invalid response for relevance" unless result&.key?("score")

  {
    score: result["score"].to_f.clamp(0.0, 1.0),
    details: {reasoning: result["reasoning"]}
  }
end
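To illustrate the validation and clamping behavior end to end, the sketch below re-creates the method's logic with a stubbed judge. `FakeJudge` is a hypothetical stand-in for the real LLM judge, which would send the prompt to a model and parse its JSON reply into a Hash:

```ruby
# Hypothetical stand-in for the LLM judge; returns the Hash shape the
# metric expects ("score" and "reasoning" keys).
class FakeJudge
  def call(_prompt)
    # Simulate a judge that returns an out-of-range score.
    {"reasoning" => "Fully addresses the question.", "score" => 1.4}
  end
end

result = FakeJudge.new.call("ignored prompt")

# Mirrors the validation and clamping in Relevance#call.
raise "invalid judge response" unless result&.key?("score")

outcome = {
  score: result["score"].to_f.clamp(0.0, 1.0),
  details: {reasoning: result["reasoning"]}
}
puts outcome[:score]  # 1.0
```

Note that out-of-range judge scores are clamped rather than rejected, so a malformed but numeric "score" still yields a value in 0.0-1.0.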