Class: Ferret::Analysis::PorterStemFilter

Inherits:
TokenFilter
Defined in:
lib/ferret/analysis/token_filters.rb

Overview

Transforms the token stream according to the Porter stemming algorithm. Note: the input to the stemming filter must already be in lower case, so a LowerCaseFilter or LowerCaseTokenizer must sit upstream of this filter in the TokenStream chain (that is, be wrapped by it) for stemming to work properly.

To use this filter with other analyzers, you’ll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you’d write an analyzer like this:

class MyAnalyzer < Analyzer
  def token_stream(field, reader)
    return PorterStemFilter.new(LowerCaseTokenizer.new(reader))
  end
end

Instance Method Summary

Methods inherited from TokenFilter

#close

Methods inherited from TokenStream

#close, #each

Instance Method Details

#next ⇒ Object

Returns the next input Token with its term text stemmed, or nil once the input stream is exhausted.



# File 'lib/ferret/analysis/token_filters.rb', line 76

def next()
  token = @input.next()
  if (token == nil)
    return nil
  else
    token.term_text = Stemmable.stem_porter(token.term_text)
  end
  token
end
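
The wrap-and-delegate pattern used by #next can be tried without Ferret installed. The following is a minimal sketch under stated assumptions: Token, ToyLowerCaseTokenizer, ToyStemFilter, and drain are hypothetical stand-ins written for this example, not Ferret classes, and the suffix rule is a toy substitute for the Porter algorithm. It only illustrates how a filter pulls a token from its input, rewrites the term text, and passes nil through at the end of the stream.

```ruby
# Self-contained sketch; none of these names come from Ferret.
Token = Struct.new(:term_text)

# Hypothetical stand-in for a lower-casing tokenizer: splits on whitespace.
class ToyLowerCaseTokenizer
  def initialize(text)
    @words = text.downcase.split
  end

  # Returns the next Token, or nil when the input is exhausted.
  def next
    word = @words.shift
    word && Token.new(word)
  end
end

# Hypothetical stand-in for a stemming filter. The suffix stripping here is
# a toy rule for illustration, NOT the Porter stemming algorithm.
class ToyStemFilter
  def initialize(input)
    @input = input
  end

  # Same shape as PorterStemFilter#next: delegate, return nil at the end of
  # the stream, otherwise rewrite the term text in place.
  def next
    token = @input.next
    return nil if token.nil?

    token.term_text = token.term_text.sub(/(ing|s)\z/, "")
    token
  end
end

# Collects term texts until the stream returns nil.
def drain(stream)
  terms = []
  while (token = stream.next)
    terms << token.term_text
  end
  terms
end

stream = ToyStemFilter.new(ToyLowerCaseTokenizer.new("Walking DOGS bark"))
p drain(stream) # prints ["walk", "dog", "bark"]
```

Because the filter only sees what its input produces, lower-casing has to happen in the wrapped stream, which mirrors the ordering requirement described in the overview.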