Class: Ferret::Analysis::StopFilter

Inherits:
TokenFilter
Defined in:
lib/ferret/analysis/token_filters.rb

Overview

Removes stop words from a token stream. You will need to pass your own set of stop words to use this filter. If you wish to use the default list of stop words, use the StopAnalyzer instead.

Class Method Summary

Instance Method Summary

Methods inherited from TokenFilter

#close

Methods inherited from TokenStream

#close, #each

Constructor Details

#initialize(input, stop_set) ⇒ StopFilter

Constructs a filter that removes from the input TokenStream every word named in stop_set, the array of stop words.



# File 'lib/ferret/analysis/token_filters.rb', line 39

def initialize(input, stop_set)
  super(input)          # hand the wrapped input stream to TokenFilter
  @stop_set = stop_set  # words to drop; #next only calls include? on it
end
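
As a rough illustration of how the constructor's two arguments fit together, the following self-contained sketch wires a stop set into a filter and drains it. It does not load Ferret: SketchToken, SketchArrayStream, and SketchStopFilter are hypothetical stand-ins written for this example, with SketchStopFilter modeled on the listings on this page. The stop set is a Set here, on the assumption that any object responding to include? will do, since that is the only method the filter calls on it.

```ruby
require 'set'

# Hypothetical stand-in for a token; only term_text is modeled.
SketchToken = Struct.new(:term_text)

# Hypothetical stand-in for an input TokenStream backed by an array of words.
class SketchArrayStream
  def initialize(words)
    @tokens = words.map { |word| SketchToken.new(word) }
  end

  # Returns the next token, or nil once the stream is exhausted.
  def next
    @tokens.shift
  end
end

# Stand-in modeled on the StopFilter listing; not the Ferret class itself.
class SketchStopFilter
  def initialize(input, stop_set)
    @input = input
    @stop_set = stop_set
  end

  # Returns the next token whose term_text is not in the stop set.
  def next
    while (token = @input.next)
      return token unless @stop_set.include?(token.term_text)
    end
    nil
  end
end

stop_set = Set.new(%w[the a of])
input = SketchArrayStream.new(%w[the history of a ferret])
filter = SketchStopFilter.new(input, stop_set)

# Drain the filter: next returns nil when no tokens remain.
kept = []
while (token = filter.next)
  kept << token.term_text
end
puts kept.join(' ')
```

With the real classes, the same shape would presumably apply: pass any TokenStream as input and your own collection of stop words as stop_set.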

Class Method Details

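.new_with_file(input, path) ⇒ Object

Constructs a StopFilter over the input TokenStream, using the word set that WordListLoader reads from the file at path.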



# File 'lib/ferret/analysis/token_filters.rb', line 44

def StopFilter.new_with_file(input, path)
  # Load the stop words from the file, then build the filter around them.
  ws = WordListLoader.word_set_from_file(path)
  StopFilter.new(input, ws)
end
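
The word-list file format is not described on this page. As a minimal sketch, assuming a plain-text file with one stop word per line, a word set could be built as shown here. load_stop_words is a hypothetical helper written for this example; it is not WordListLoader.word_set_from_file, whose actual behavior may differ.

```ruby
require 'set'
require 'tempfile'

# Hypothetical helper: reads one stop word per line into a Set.
# Skipping blank lines and stripping whitespace are assumptions of this sketch.
def load_stop_words(path)
  words = Set.new
  File.foreach(path) do |line|
    word = line.strip
    words << word unless word.empty?
  end
  words
end

# Write a small word list to a temporary file and load it back.
stop_words = nil
Tempfile.create('stopwords') do |file|
  file.write("the\na\n\n  of  \n")
  file.flush
  stop_words = load_stop_words(file.path)
end

puts stop_words.to_a.sort.inspect
```

A set built this way could then be handed to the constructor as stop_set, since the filter only asks it whether it includes a given term.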

Instance Method Details

#next ⇒ Object

Returns the next input Token whose term_text is not a stop word, or nil when the input is exhausted.



# File 'lib/ferret/analysis/token_filters.rb', line 50

def next
  # Return the first token that is not a stop word.
  while (token = @input.next)
    return token unless @stop_set.include?(token.term_text)
  end
  nil # input exhausted
end
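
Because the loop tests token.term_text against the stop set with include?, matching is exact. One consequence, sketched here with plain strings standing in for tokens, is that a capitalized term passes through unless terms are lowercased before the check. filter_terms is a hypothetical helper for this example and is not part of Ferret.

```ruby
require 'set'

# Hypothetical helper mirroring the filter's test on plain strings:
# keep every term that the stop set does not include.
def filter_terms(terms, stop_set)
  terms.reject { |term| stop_set.include?(term) }
end

stop_set = Set.new(%w[the of])
terms = %w[The history of ferrets]

# "The" does not equal "the", so it survives the exact-match test.
as_is = filter_terms(terms, stop_set)

# Lowercasing first lets the same stop set catch it.
lowercased = filter_terms(terms.map(&:downcase), stop_set)

puts as_is.inspect
puts lowercased.inspect
```

In practice this suggests either lowercasing tokens upstream of the filter or including every case variant you care about in the stop set.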