Class: Ferret::Search::DisjunctionSumScorer

Inherits:
Scorer
  • Object
Defined in:
lib/ferret/search/disjunction_sum_scorer.rb

Overview

A Scorer for OR-like queries, the counterpart of Lucene’s ConjunctionScorer. This Scorer implements Scorer#skip_to(target) and uses skip_to() on the given sub-scorers.

Defined Under Namespace

Classes: ScorerQueue

Constant Summary

Constants inherited from Scorer

Scorer::MAX_DOCS

Instance Attribute Summary

Attributes inherited from Scorer

#similarity

Instance Method Summary

Methods inherited from Scorer

#each_hit, #each_hit_up_to

Constructor Details

#initialize(sub_scorers, minimum_nr_matchers = 1) ⇒ DisjunctionSumScorer

Construct a DisjunctionSumScorer.

sub_scorers

A collection of at least two subscorers.

minimum_nr_matchers

The positive minimum number of subscorers that should match to match this query.

When @minimum_nr_matchers is bigger than the number of sub_scorers, no matches will be produced.

When @minimum_nr_matchers equals the number of sub_scorers, it is more efficient to use ConjunctionScorer.
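
The effect of minimum_nr_matchers can be illustrated without Ferret. The following is a minimal sketch, assuming each sub-scorer is modeled as a sorted array of matching document numbers; docs_matching_at_least is a hypothetical helper, not part of the library, and it only mirrors the argument checks and matching rule described above.

```ruby
# Hypothetical helper, not part of Ferret: each "sub-scorer" is modeled as a
# sorted array of the document numbers it matches.
def docs_matching_at_least(postings, minimum_nr_matchers = 1)
  # Same argument checks as the constructor documented here.
  raise ArgumentError, "Minimum nr of matchers must be positive" if minimum_nr_matchers <= 0
  raise ArgumentError, "There must be at least 2 sub_scorers" if postings.size <= 1

  # Count how many sub-scorers match each document.
  counts = Hash.new(0)
  postings.each { |docs| docs.uniq.each { |doc| counts[doc] += 1 } }

  # Keep the documents matched by enough sub-scorers.
  counts.select { |_doc, n| n >= minimum_nr_matchers }.keys.sort
end

postings = [[1, 3, 5], [3, 4, 5], [5, 9]]
docs_matching_at_least(postings, 1) # plain OR: [1, 3, 4, 5, 9]
docs_matching_at_least(postings, 2) # [3, 5]
docs_matching_at_least(postings, 3) # equals the number of sub-scorers, acts like AND: [5]
docs_matching_at_least(postings, 4) # more than the number of sub-scorers: []
```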



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 20

def initialize(sub_scorers, minimum_nr_matchers = 1) 
  super(nil)
  
  # The number of subscorers.  
  @nr_scorers = sub_scorers.size

  # The document number of the current match. 
  @current_doc = -1
  @current_score = nil
  # The number of subscorers that provide the current match. 
  @nr_matchers = -1

  if (minimum_nr_matchers <= 0) 
    raise ArgumentError, "Minimum nr of matchers must be positive"
  end
  if (@nr_scorers <= 1) 
    raise ArgumentError, "There must be at least 2 sub_scorers"
  end

  @minimum_nr_matchers = minimum_nr_matchers
  @sub_scorers = sub_scorers

  # The @scorer_queue contains all subscorers ordered by their current
  # doc, with the minimum at the top.
  # 
  # The @scorer_queue is initialized the first time next? or skip_to() is
  # called.
  # 
  # An exhausted scorer is immediately removed from the @scorer_queue.
  # 
  # If fewer than @minimum_nr_matchers scorers remain in the
  # @scorer_queue, next? and skip_to() return false.
  # 
  # After each call to next? or skip_to(),
  # +@current_score+ is the total score of the current matching doc,
  # +@nr_matchers+ is the number of matching scorers,
  # and all scorers are after the matching doc, or are exhausted.
  @scorer_queue = nil
end

Instance Attribute Details

#sub_scorers ⇒ Object

the sub-scorers



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 6

def sub_scorers
  @sub_scorers
end

Instance Method Details

#advance_after_current ⇒ Object

Advance all subscorers after the current document determined by the top of the @scorer_queue. Repeat until at least the minimum number of subscorers match on the same document and all subscorers are after that document or are exhausted.

On entry the @scorer_queue has at least @minimum_nr_matchers available. At least the scorer with the minimum document number will be advanced.

returns

true iff there is a match.

In case there is a match, @current_doc, @current_score, and @nr_matchers describe the match.

TODO Investigate whether it is possible to use skip_to() when the minimum number of matchers is bigger than one, i.e. begin to use the character of ConjunctionScorer for the minimum number of matchers.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 106

def advance_after_current()
  begin # repeat until minimum nr of matchers
    top = @scorer_queue.top
    @current_doc = top.doc
    @current_score = top.score
    @nr_matchers = 1
    begin # Until all subscorers are after @current_doc
      if top.next?
        @scorer_queue.adjust_top()
      else 
        @scorer_queue.pop()
        if (@scorer_queue.size < (@minimum_nr_matchers - @nr_matchers)) 
          # Not enough subscorers left for a match on this document,
          # and also no more chance of any further match.
          return false
        end
        if (@scorer_queue.size == 0) 
          break # nothing more to advance, check for last match.
        end
      end
      top = @scorer_queue.top
      if top.doc != @current_doc
        break # All remaining subscorers are after @current_doc.
      else 
        @current_score += top.score
        @nr_matchers += 1
      end
    end while (true)
    
    if (@nr_matchers >= @minimum_nr_matchers) 
      return true
    elsif (@scorer_queue.size < @minimum_nr_matchers) 
      return false
    end
  end while (true)
end
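
The advance loop above can be followed on a small self-contained model. This is only a sketch under stated assumptions: ToyScorer and disjunction_matches are hypothetical names, a plain array filtered on each pass stands in for ScorerQueue, and min_matchers is assumed to be positive. It collects every match at once instead of returning one match per call.

```ruby
# Toy stand-in for a sub-scorer (an assumption, not Ferret's Scorer class):
# it walks a list of [doc, score] pairs sorted by doc.
class ToyScorer
  attr_reader :doc, :score

  def initialize(hits)
    @hits = hits
    @pos = -1
  end

  # Move to the next hit; returns false once the scorer is exhausted.
  def next?
    @pos += 1
    return false if @pos >= @hits.size
    @doc, @score = @hits[@pos]
    true
  end
end

# Returns [doc, summed score, nr_matchers] for every document matched by at
# least min_matchers scorers.
def disjunction_matches(scorers, min_matchers = 1)
  # Like init_scorer_queue: advance each scorer once, keep the non-exhausted.
  queue = scorers.select(&:next?)
  results = []
  while !queue.empty? && queue.size >= min_matchers
    current_doc = queue.min_by(&:doc).doc
    sum = 0.0
    matchers = 0
    # Advance every scorer positioned on current_doc; drop exhausted ones.
    queue = queue.select do |scorer|
      next true unless scorer.doc == current_doc
      sum += scorer.score
      matchers += 1
      scorer.next?
    end
    results << [current_doc, sum, matchers] if matchers >= min_matchers
  end
  results
end

a = ToyScorer.new([[1, 1.0], [3, 1.0], [5, 1.0]])
b = ToyScorer.new([[3, 0.5], [4, 0.5], [5, 0.5]])
c = ToyScorer.new([[5, 0.25], [9, 0.25]])
disjunction_matches([a, b, c], 2) # [[3, 1.5, 2], [5, 1.75, 3]]
```

A real priority queue avoids rescanning all scorers on every step; the array is used here only to keep the sketch short.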

#doc ⇒ Object

Returns the document number of the current document matching the query. Initially invalid, until #next? is called the first time.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 151

def doc()
  return @current_doc
end

#explain(doc) ⇒ Object

Gives an explanation for the score of a given document. TODO Show the resulting score. See BooleanScorer.explain() on how to do this.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 196

def explain(doc)
  e = Explanation.new()
  e.description = "At least #{@minimum_nr_matchers} of"
  @sub_scorers.each do |sub_scorer|
    e.details << sub_scorer.explain(doc)
  end
  return e
end
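
The description combines text with an Integer, which Ruby does not convert implicitly under String#+. A minimal sketch of building such an explanation follows; ToyExplanation and explain_disjunction are hypothetical stand-ins used for illustration only and are not Ferret's Explanation API.

```ruby
# Stand-in for an explanation object (an assumption for this sketch only).
ToyExplanation = Struct.new(:description, :details)

def explain_disjunction(minimum_nr_matchers, sub_explanations)
  # Interpolation calls to_s on the Integer; "At least " + 2 would raise
  # a TypeError instead.
  ToyExplanation.new("At least #{minimum_nr_matchers} of", sub_explanations.dup)
end

e = explain_disjunction(2, ["term a", "term b"])
e.description # "At least 2 of"
e.details     # ["term a", "term b"]
```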

#init_scorer_queue ⇒ Object

Called the first time next? or skip_to() is called to initialize @scorer_queue.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 62

def init_scorer_queue()
  @scorer_queue = ScorerQueue.new(@nr_scorers)
  @sub_scorers.each do |sub_scorer|
    if (sub_scorer.next?) # doc() method will be used in @scorer_queue.
      @scorer_queue.insert(sub_scorer)
    end
  end
end

#next? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 78

def next?
  if (@scorer_queue == nil) 
    init_scorer_queue()
  end

  if (@scorer_queue.size < @minimum_nr_matchers) 
    return false
  else 
    return advance_after_current()
  end
end

#number_of_matchers ⇒ Object

Returns the number of subscorers matching the current document. Initially invalid, until #next? is called the first time.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 157

def number_of_matchers()
  return @nr_matchers
end

#score ⇒ Object

Returns the score of the current document matching the query. Initially invalid, until #next? is called the first time.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 145

def score()
  return @current_score
end

#skip_to(target) ⇒ Object

Skips to the first match beyond the current one whose document number is greater than or equal to the given target.

When this method is used, the #explain(doc) method should not be used.

The implementation uses the skip_to() method on the subscorers.

target

The target document number.

returns

true iff there is such a match.



# File 'lib/ferret/search/disjunction_sum_scorer.rb', line 169

def skip_to(target)
  if @scorer_queue.nil?
    init_scorer_queue()
  end
  if @scorer_queue.size < @minimum_nr_matchers
    return false
  end
  if target <= @current_doc
    target = @current_doc + 1
  end
  begin 
    top = @scorer_queue.top
    if top.doc >= target 
      return advance_after_current()
    elsif top.skip_to(target) 
      @scorer_queue.adjust_top()
    else 
      @scorer_queue.pop()
      if (@scorer_queue.size < @minimum_nr_matchers) 
        return false
      end
    end
  end while (true)
end
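
The target handling in skip_to can be summarized with a small model. This is a sketch only: skip_to_model is a hypothetical helper that takes the already-computed sorted list of matching documents, whereas the real method discovers matches lazily from the sub-scorers, and it returns the document number (or nil) rather than a Boolean.

```ruby
# Hypothetical model, not Ferret code: `matches` is the sorted list of
# documents the disjunction would match, `current_doc` is the last document
# returned (-1 before the first call).
def skip_to_model(matches, current_doc, target)
  # A target at or before the current document is moved just past it, so
  # the result is always strictly beyond the current match.
  target = current_doc + 1 if target <= current_doc
  matches.find { |doc| doc >= target }
end

skip_to_model([3, 5, 9], -1, 4) # 5
skip_to_model([3, 5, 9], 5, 2)  # 9, because the target is raised to 6
skip_to_model([3, 5, 9], 9, 9)  # nil, no match beyond the current one
```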