Module: Ferret::Analysis::WordListLoader
- Defined in:
- lib/ferret/analysis/word_list_loader.rb
Overview
Loader for text files that represent a list of stopwords.
Class Method Summary
- .word_set_from_array(word_array) ⇒ Object
- .word_set_from_file(path) ⇒ Object
Loads a text file and adds every line as an entry to a Set (omitting leading and trailing whitespace).
Class Method Details
.word_set_from_array(word_array) ⇒ Object
# File 'lib/ferret/analysis/word_list_loader.rb', line 21

def WordListLoader.word_set_from_array(word_array)
  result = Set.new()
  word_array.each {|word| result << word }
  return result
end
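A minimal sketch of how the resulting set behaves. So that it runs without Ferret installed, it uses a top-level stand-in method that mirrors the body shown above; that stand-in and the sample words are assumptions made for illustration.

```ruby
require 'set'

# Stand-in that mirrors the documented method body so the example runs
# without Ferret installed (an assumption made for illustration only).
def word_set_from_array(word_array)
  result = Set.new
  word_array.each { |word| result << word }
  result
end

stopwords = word_set_from_array(%w[the a an the])

# Duplicates collapse because the result is a Set.
stopwords.size             # 3
stopwords.include?('the')  # true
stopwords.include?('dog')  # false

# Set.new accepts any enumerable, so this builds an equal set.
stopwords == Set.new(%w[the a an the])  # true
```

Since `Set.new` takes an enumerable directly, the explicit loop is likely equivalent to `Set.new(word_array)` for ordinary arrays.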
.word_set_from_file(path) ⇒ Object
Loads a text file and adds every line as an entry to a Set (omitting leading and trailing whitespace). Each line of the file should contain exactly one word. The words must be lowercase if you use an Analyzer that applies LowerCaseFilter (such as GermanAnalyzer).
- path: path to the file containing the word list
- return: a Set with the file’s words
# File 'lib/ferret/analysis/word_list_loader.rb', line 12

def WordListLoader.word_set_from_file(path)
  result = Set.new()
  File.open(path) do |word_file|
    # strip the end-of-line characters and any surrounding whitespace
    word_file.each {|line| result << line.strip }
  end
  return result
end
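The documented contract (one word per line, surrounding whitespace removed, lowercase entries) can be sketched as follows. The top-level `word_set_from_file` here is a hypothetical stand-in that follows the description above rather than Ferret's own source, and the temporary file and its German words are invented for the example.

```ruby
require 'set'
require 'tempfile'

# Hypothetical stand-in following the documented contract (one word per
# line, surrounding whitespace removed); it is not Ferret's own source.
def word_set_from_file(path)
  result = Set.new
  File.foreach(path) { |line| result << line.strip }
  result
end

words = nil
Tempfile.create('stopwords') do |file|
  # Note the stray indentation and the missing newline after the last word.
  file.write("der\n  die \ndas")
  file.flush
  words = word_set_from_file(file.path)
end

words.include?('die')  # true, surrounding whitespace was stripped
words.include?('das')  # true, even without a trailing newline
words.include?('Der')  # false, lookups are case-sensitive
```

The last lookup shows why the word list should be lowercase when the analyzer lowercases its tokens: set membership is an exact string match, so a capitalized entry would never match a lowercased token.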