Class: Autotag::Extractor::Textblock
- Inherits: Object
  - Object
  - Autotag::Extractor::Textblock
- Defined in: lib/autotag/extractor/document/textblock.rb
Instance Attribute Summary
- #size ⇒ Object (readonly)
  Returns the number of words in the block.
- #words ⇒ Object (readonly)
  Returns the array of words split from the input string.
Instance Method Summary
- #[](index) ⇒ Object
- #initialize(str, charsize, wordsize) ⇒ Textblock (constructor)
  Splits str into words and records charsize and wordsize, which #ratio uses.
- #plaintext ⇒ Object
- #ratio ⇒ Object
Constructor Details
#initialize(str, charsize, wordsize) ⇒ Textblock
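Splits str into words on runs of Unicode separator characters and records charsize and wordsize, which #ratio uses.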
    # File 'lib/autotag/extractor/document/textblock.rb', line 6
    def initialize(str, charsize, wordsize)
      # count the number of blocks of non-whitespace characters
      @charsize = charsize
      @wordsize = wordsize
      @words = str.split(/\p{Z}+/).reject { |f| f.empty? }
      @size = @words.size
    end
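The splitting behaviour of the constructor can be tried in isolation. The sketch below uses a stand-in class, `TextblockSketch` (a hypothetical name, not part of the gem), that copies the constructor body listed above; the `charsize` and `wordsize` arguments are arbitrary illustrative values. It shows that `\p{Z}` covers Unicode separators such as the no-break space, while tabs and newlines are control characters and so do not split words.

```ruby
# Stand-in that mirrors the constructor shown above; not the gem itself.
class TextblockSketch
  attr_reader :size, :words

  def initialize(str, charsize, wordsize)
    @charsize = charsize
    @wordsize = wordsize
    # Split on runs of Unicode separator characters (category Z),
    # then drop the empty string produced by leading separators.
    @words = str.split(/\p{Z}+/).reject { |f| f.empty? }
    @size = @words.size
  end
end

# Illustrative input: leading, repeated, and no-break (U+00A0) spaces.
block = TextblockSketch.new("  Hello   wide\u00A0world ", 24, 3)
p block.words  # => ["Hello", "wide", "world"]
p block.size   # => 3

# A tab is a control character, not a \p{Z} separator, so it stays in the word.
tabbed = TextblockSketch.new("a\tb c", 5, 3)
p tabbed.words # => ["a\tb", "c"]
```

If input may contain tabs or newlines, callers would presumably need to normalise them before constructing the block; that is an observation about the regular expression, not documented behaviour of the gem.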
Instance Attribute Details
#size ⇒ Object (readonly)
Returns the number of words in the block, as counted by the constructor.
    # File 'lib/autotag/extractor/document/textblock.rb', line 3
    def size
      @size
    end
#words ⇒ Object (readonly)
Returns the array of words split from the input string.
    # File 'lib/autotag/extractor/document/textblock.rb', line 19
    def words
      @words
    end
Instance Method Details
#[](index) ⇒ Object
    # File 'lib/autotag/extractor/document/textblock.rb', line 29
    def [](index)
      @words[index]
    end
#plaintext ⇒ Object
    # File 'lib/autotag/extractor/document/textblock.rb', line 25
    def plaintext
      @words.join(' ')
    end
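Since #[] delegates to Array#[] and #plaintext joins the words with single spaces, both can be illustrated with a small stand-in. `WordsSketch` below is a hypothetical class that copies only the relevant method bodies from the listings above; the one-argument constructor is a simplification for the example.

```ruby
# Stand-in copying #[] and #plaintext from the listings above; not the gem itself.
class WordsSketch
  def initialize(str)
    @words = str.split(/\p{Z}+/).reject { |f| f.empty? }
  end

  # Delegates to Array#[], so negative and out-of-range indexes behave as in Array.
  def [](index)
    @words[index]
  end

  # Rejoins the words with single ASCII spaces.
  def plaintext
    @words.join(' ')
  end
end

block = WordsSketch.new("  one   two\u00A0three ")
p block[0]        # => "one"
p block[-1]       # => "three"
p block[10]       # => nil
p block.plaintext # => "one two three"
```

Note that #plaintext does not round-trip the original string: repeated or non-ASCII separators collapse to one space each.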
#ratio ⇒ Object
    # File 'lib/autotag/extractor/document/textblock.rb', line 14
    def ratio
      return @wordsize.to_f / @charsize.to_f
    end
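Because #ratio divides two floats, a zero `charsize` does not raise an error. The sketch below uses a hypothetical stand-in, `RatioSketch`, that copies the method body above with made-up sizes, to show the ordinary case and the two zero-denominator cases.

```ruby
# Stand-in copying #ratio from the listing above; not the gem itself.
class RatioSketch
  def initialize(charsize, wordsize)
    @charsize = charsize
    @wordsize = wordsize
  end

  # Float division: never raises ZeroDivisionError.
  def ratio
    @wordsize.to_f / @charsize.to_f
  end
end

p RatioSketch.new(40, 10).ratio      # => 0.25
p RatioSketch.new(0, 10).ratio       # => Infinity
p RatioSketch.new(0, 0).ratio.nan?   # => true
```

Callers that compare ratios may want to check `finite?` on the result first; whether the gem itself guards against a zero `charsize` is not shown in this listing.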