Class: TfIdfSimilarity::Tokenizer

Inherits:
Object
Defined in:
lib/tf-idf-similarity/tokenizer.rb

Instance Method Summary

Instance Method Details

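#tokenize(text) ⇒ Array<Token>

Tokenizes a text by splitting it into words with UnicodeUtils.each_word and wrapping each word in a Token.

Parameters:

  • text (String)

Returns:

  • (Array<Token>)

    an array of Token objects, one per word (calling map with a block on the word enumerator builds an Array)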



# File 'lib/tf-idf-similarity/tokenizer.rb', line 13

def tokenize(text)
  UnicodeUtils.each_word(text).map do |word|
    Token.new(word)
  end
end
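
The following is a minimal, self-contained sketch of the same pattern, not the library's implementation. It assumes a hypothetical `TokenizerSketch` module, a `Token` struct standing in for the real Token class, and a regex-based `each_word` standing in for `UnicodeUtils.each_word`; the regex is a simplification and may split text differently from the Unicode word-boundary rules the real method follows.

```ruby
# Hypothetical module for illustration; none of these names come from the gem.
module TokenizerSketch
  # Stand-in for the library's Token class; the real class may hold more state.
  Token = Struct.new(:text)

  # Stand-in for UnicodeUtils.each_word. Returns an Enumerator when called
  # without a block, otherwise yields each alphanumeric run in the text.
  def self.each_word(text)
    return enum_for(:each_word, text) unless block_given?
    text.scan(/[[:alnum:]]+/) { |word| yield word }
  end

  # Same shape as Tokenizer#tokenize: map every word to a Token.
  def self.tokenize(text)
    each_word(text).map { |word| Token.new(word) }
  end
end

tokens = TokenizerSketch.tokenize("The quick brown fox")
puts tokens.map(&:text).inspect
puts tokens.class
```

In this sketch, `each_word` without a block returns an Enumerator, and calling `map` with a block on that Enumerator collects the results eagerly, so the value returned by `tokenize` can be indexed and counted directly.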