Class: RTesseract::Box::BoxParser

Inherits:
Object
Defined in:
lib/rtesseract/box.rb

Overview

Parses a word's text and position from an HTML element, reading the coordinates from the element's title attribute.

Instance Method Summary

Constructor Details

#initialize(word_html) ⇒ BoxParser

Returns a new instance of BoxParser.



# File 'lib/rtesseract/box.rb', line 61

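def initialize(word_html)
  @word = word_html
  # Position data is read from the element's title attribute.
  title = @word.attributes['title'].value.to_s
  # Drop semicolons, then split on spaces into individual tokens.
  @attributes = title.gsub(';', '').split(' ')
end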
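
The tokenization done by the constructor can be sketched on a plain string. The title value below is a hypothetical hOCR-style example, not taken from the source; real values depend on what Tesseract emits.

```ruby
# Hypothetical title value (an assumption for illustration only).
title = 'bbox 36 92 96 116; x_wconf 95'

# Same transformation as the constructor: drop semicolons, split on spaces.
attributes = title.gsub(';', '').split(' ')

p attributes
# => ["bbox", "36", "92", "96", "116", "x_wconf", "95"]
```

With input shaped like this, token 0 is a label and tokens 1 to 4 are the four coordinates, still as strings at this point.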

Instance Method Details

#to_h ⇒ Object

Returns a hash containing the word text and its position.



# File 'lib/rtesseract/box.rb', line 68

def to_h
  # Coordinates are read from tokens 1 to 4; token 0 is not used.
  {
    word: @word.text,
    x_start: @attributes[1].to_i,
    y_start: @attributes[2].to_i,
    x_end: @attributes[3].to_i,
    y_end: @attributes[4].to_i
  }
end
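
A minimal usage sketch follows, under stated assumptions. `FakeAttr` and `FakeWord` are hypothetical stand-ins for the parsed HTML node, which only needs to respond to `text` and `attributes['title'].value`. The class body is copied from the listings above so the example runs on its own, and the title string is invented for illustration.

```ruby
# Hypothetical stand-ins for a parsed HTML node; not part of rtesseract.
FakeAttr = Struct.new(:value)
FakeWord = Struct.new(:text, :attributes)

# Copy of the documented class so this sketch is self-contained.
class BoxParser
  def initialize(word_html)
    @word = word_html
    title = @word.attributes['title'].value.to_s
    @attributes = title.gsub(';', '').split(' ')
  end

  def to_h
    {
      word: @word.text,
      x_start: @attributes[1].to_i,
      y_start: @attributes[2].to_i,
      x_end: @attributes[3].to_i,
      y_end: @attributes[4].to_i
    }
  end
end

# Invented title value in an hOCR-like shape (an assumption).
node = FakeWord.new('Hello', { 'title' => FakeAttr.new('bbox 36 92 96 116; x_wconf 95') })
box = BoxParser.new(node).to_h

p box
```

Because each value goes through `to_i`, a title with fewer than five tokens would yield `0` for the missing coordinates rather than raising, since `nil.to_i` is `0` in Ruby.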