Class: Ferret::Analysis::Analyzer
- Inherits: Object
- Defined in: lib/ferret/analysis/analyzers.rb
Overview
An Analyzer builds TokenStreams, which analyze text. It thus represents a policy for extracting index terms from text.
Typical implementations first build a Tokenizer, which breaks the stream of characters from the Reader into raw Tokens. One or more TokenFilters may then be applied to the output of the Tokenizer.
The default Analyzer just creates a LowerCaseTokenizer which converts all text to lowercase tokens. See LowerCaseTokenizer for more details.
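The tokenizer-then-filter pipeline described above can be sketched in plain Ruby. This is an illustration only: `SketchLowerCaseTokenizer`, `SketchStopFilter`, `SketchAnalyzer`, and `drain` are hypothetical stand-ins rather than Ferret's own classes, and the sketch assumes that tokens are runs of letters and that a stream signals exhaustion by returning `nil` from `next`.

```ruby
# Hypothetical stand-ins for illustration; these are NOT Ferret's classes.

# Tokenizer stage: splits text into runs of letters and lowercases each one.
class SketchLowerCaseTokenizer
  def initialize(text)
    @terms = text.scan(/[[:alpha:]]+/).map(&:downcase)
    @pos = 0
  end

  # Returns the next term, or nil once the stream is exhausted.
  def next
    term = @terms[@pos]
    @pos += 1 if term
    term
  end
end

# Filter stage: wraps another stream and drops terms found in a stop list.
class SketchStopFilter
  def initialize(input, stop_words)
    @input = input
    @stop_words = stop_words
  end

  # Pulls from the wrapped stream until a non-stop-word term appears.
  def next
    while (term = @input.next)
      return term unless @stop_words.include?(term)
    end
    nil
  end
end

# Analyzer: the policy that wires a tokenizer and its filters together.
class SketchAnalyzer
  STOP_WORDS = %w[a an the of].freeze

  # field is accepted but unused here, as in the default Analyzer.
  def token_stream(field, string)
    SketchStopFilter.new(SketchLowerCaseTokenizer.new(string), STOP_WORDS)
  end
end

# Helper that drains a stream into an array of terms.
def drain(stream)
  terms = []
  while (term = stream.next)
    terms << term
  end
  terms
end
```

Calling `drain(SketchAnalyzer.new.token_stream(:title, "The Art of War"))` walks the whole chain: the tokenizer emits lowercase terms and the filter removes the stop words before they reach the caller.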
Instance Method Summary
- #token_stream(field, string) ⇒ Object
Creates a TokenStream which tokenizes all the text in the provided string.
Instance Method Details
#token_stream(field, string) ⇒ Object
Creates a TokenStream which tokenizes all the text in the provided string. Override this method to let an Analyzer choose its strategy based on the document and/or field.
- string: the string representing the text in the field
- field: name of the field. Not required.
# File 'lib/ferret/analysis/analyzers.rb', line 17
def token_stream(field, string)
  return LowerCaseTokenizer.new(string)
end
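As a rough illustration of overriding `token_stream` to pick a strategy per field, the sketch below uses plain Ruby arrays in place of real token streams. `SketchBaseAnalyzer` and `SketchPerFieldAnalyzer` are hypothetical names, the `:id` field is an assumed example, and the letter-run splitting in the base class is an assumption about the default behavior rather than Ferret's actual implementation.

```ruby
# Hypothetical stand-ins for illustration; a "token stream" here is just an
# array of terms, not a Ferret TokenStream.
class SketchBaseAnalyzer
  # Default policy: lowercase runs of letters, regardless of field.
  def token_stream(field, string)
    string.scan(/[[:alpha:]]+/).map(&:downcase)
  end
end

class SketchPerFieldAnalyzer < SketchBaseAnalyzer
  # Chooses a strategy based on the field name.
  def token_stream(field, string)
    # Identifier-like fields are kept as one exact, untouched term.
    return [string] if field == :id

    # Every other field falls back to the default policy.
    super
  end
end
```

With this subclass, `token_stream(:id, "AB-12")` keeps the value intact as a single term, while `token_stream(:body, "Hello, World")` goes through the default lowercasing policy.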