Module: Ferret::Analysis::WordListLoader

Defined in:
lib/ferret/analysis/word_list_loader.rb

Overview

Loader for text files that represent a list of stopwords.

Class Method Summary

Class Method Details

.word_set_from_array(word_array) ⇒ Object



# File 'lib/ferret/analysis/word_list_loader.rb', line 21

def WordListLoader.word_set_from_array(word_array)
  result = Set.new()
  word_array.each {|word| result << word }
  return result
end
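
As a usage illustration, the sketch below copies the method body from the listing above into a stand-in module. `WordListDemo` is a hypothetical name, used only so the snippet runs without the ferret gem installed. It shows that duplicate words in the array end up as a single entry in the resulting Set.

```ruby
require 'set'

# Hypothetical stand-in for Ferret::Analysis::WordListLoader, so this
# snippet runs without the ferret gem. The method body mirrors the listing above.
module WordListDemo
  def self.word_set_from_array(word_array)
    result = Set.new
    word_array.each { |word| result << word }
    result
  end
end

stop_words = WordListDemo.word_set_from_array(%w[the a an the])
puts stop_words.size            # duplicates are stored once
puts stop_words.include?('an')  # membership test on the Set
```

For a plain array of strings, `Set.new(word_array)` would build an equal Set.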

.word_set_from_file(path) ⇒ Object

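Loads a text file and adds every line as an entry to a Set. Every line of the file should contain exactly one word. The implementation shown below removes only the final character of each line (the end-of-line character), so lines should carry no other leading or trailing whitespace and the file should end with a newline. The words need to be lowercase if you use an Analyzer that applies LowerCaseFilter (such as GermanAnalyzer).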

path

path to file containing the wordlist

return

A Set containing the file’s words



# File 'lib/ferret/analysis/word_list_loader.rb', line 12

def WordListLoader.word_set_from_file(path)
  result = Set.new()
  File.open(path) do |word_file|
    # we have to strip the end of line characters
    word_file.each {|line| result << line[0..-2] }
  end
  return result
end
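
The listing above removes only the final character of each line (`line[0..-2]`). Leading whitespace, a `\r` left over from Windows line endings, and a last line without a trailing newline are therefore not handled. The following is a minimal sketch of a more tolerant variant, not Ferret's implementation: `word_set_from_file_stripped` is a hypothetical name, and skipping blank lines is an assumption about the desired behavior.

```ruby
require 'set'
require 'tempfile'

# Hypothetical variant, not part of Ferret: strips surrounding whitespace
# from each line and skips lines that are empty after stripping.
def word_set_from_file_stripped(path)
  result = Set.new
  File.foreach(path) do |line|
    word = line.strip
    result << word unless word.empty?
  end
  result
end

# Demo input: a Windows line ending, an indented word, a blank line,
# and a last line with no trailing newline.
words = nil
Tempfile.create('stopwords') do |file|
  file.write("der\r\n  die\n\ndas")
  file.flush
  words = word_set_from_file_stripped(file.path)
end

puts words.to_a.sort.inspect
```

With the original `line[0..-2]` approach, the same input would keep the `\r`, keep the indentation, add an empty entry for the blank line, and drop the final letter of the last word.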