Class: Tokeneyes::WordReader

Inherits:
Object
Defined in:
lib/tokeneyes/word_reader.rb

Overview

The WordReader class reads a single word from a StringIO, advancing the stream until a word and its subsequent boundary have been read (or the stream runs out). It returns a Word object describing the word and its ending; the object receiving this data is responsible for filling in any information about the previous state.
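The behavior described above can be sketched without the gem. The following is a simplified illustration, not the library's implementation: `SketchWord`, `SketchWordReader`, and the `WORD_CHAR` boundary rule (anything other than a letter, digit, or apostrophe ends a word) are hypothetical stand-ins for the real Word object and WordBuilder logic, whose rules are not shown on this page.

```ruby
require "stringio"

# Hypothetical stand-in for the gem's Word object: the text plus the
# boundary character that ended it (nil if the stream ran out).
SketchWord = Struct.new(:text, :ending)

# Simplified reader. The boundary rule below is an assumption made for
# illustration only; Tokeneyes::WordBuilder may decide differently.
class SketchWordReader
  WORD_CHAR = /[[:alnum:]']/

  attr_reader :text_stream

  def initialize(text_stream)
    @text_stream = text_stream
  end

  def read_word(word = "")
    # Stream exhausted: return whatever has been collected so far.
    return SketchWord.new(word, nil) if text_stream.eof?

    current_char = text_stream.readchar
    if current_char.match?(WORD_CHAR)
      read_word(word + current_char)
    elsif word.empty?
      # Boundary before any word has started: discard it and keep reading.
      read_word(word)
    else
      SketchWord.new(word, current_char)
    end
  end
end

reader = SketchWordReader.new(StringIO.new(",,hello, world"))
first = reader.read_word   # text "hello", ending ","
second = reader.read_word  # text "world", ending nil (stream ran out)
```

Each call consumes one word and its trailing boundary, so repeated calls walk through the stream word by word.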

Constructor Details

#initialize(text_stream) ⇒ WordReader

Returns a new instance of WordReader.



# File 'lib/tokeneyes/word_reader.rb', line 11

def initialize(text_stream)
  @text_stream = text_stream
end

Instance Attribute Details

#text_stream ⇒ Object (readonly)

Returns the value of attribute text_stream.



# File 'lib/tokeneyes/word_reader.rb', line 10

def text_stream
  @text_stream
end

Instance Method Details

#read_word(previous_char = "", word = "") ⇒ Object

Reads characters recursively from text_stream until a word boundary follows a non-empty word or the stream ends, then returns the result of build_word.



# File 'lib/tokeneyes/word_reader.rb', line 15

def read_word(previous_char = "", word = "")
  current_char = text_stream.readchar
  word_builder = WordBuilder.new(previous_char, current_char, word)
  word += word_builder.character_to_add_to_word

  # if we detect a word boundary but don't actually have a word yet, keep going -- that is,
  # discard leading punctuation not attached to a word (e.g. x,,y or ^,y)
  if text_stream.eof? || (word_builder.word_finished? && word.length > 0)
    build_word(word, word_builder)
  else
    read_word(current_char, word)
  end
end
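
Note that read_word calls readchar before it checks eof?. Ruby's StringIO#readchar raises EOFError once the stream is exhausted, so a caller looping over read_word would presumably need to check eof? first. The sketch below shows that behavior with a plain StringIO; it does not use the gem, and the loop is only an assumption about how a caller might guard its reads.

```ruby
require "stringio"

stream = StringIO.new("hi")
chars = []

# Guard each read with eof?, as a caller looping over read_word might.
chars << stream.readchar until stream.eof?

# One more read on the exhausted stream raises EOFError.
raised =
  begin
    stream.readchar
    false
  rescue EOFError
    true
  end
```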