Class: VectorEmbed::Maker::Ngram
- Inherits: VectorEmbed::Maker
  - Object
  - VectorEmbed::Maker
  - VectorEmbed::Maker::Ngram
- Defined in: lib/vector_embed/maker/ngram.rb
Instance Attribute Summary
- #delim ⇒ Object (readonly)
  Returns the value of attribute delim.
- #len ⇒ Object (readonly)
  Returns the value of attribute len.
Attributes inherited from VectorEmbed::Maker
Class Method Summary
- .want?(v, parent) ⇒ Boolean
Instance Method Summary
- #initialize(k, parent) ⇒ Ngram (constructor)
  A new instance of Ngram.
- #pairs(v) ⇒ Object
Methods inherited from VectorEmbed::Maker
Constructor Details
#initialize(k, parent) ⇒ Ngram
Returns a new instance of Ngram.
# File 'lib/vector_embed/maker/ngram.rb', line 15
def initialize(k, parent)
  super
  @len = parent.options[:ngram_len].to_i
  raise ArgumentError, ":ngram_len must be > 0" unless @len > 0
  @delim = parent.options[:ngram_delim]
end
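The length check in the constructor relies on `to_i`, which turns `nil` and non-numeric strings into 0, so those values are rejected along with explicit zeros and negatives. The following is a minimal standalone sketch of that behavior; `validate_ngram_len` is a hypothetical helper written for illustration and is not part of VectorEmbed.

```ruby
# Hypothetical stand-in for the :ngram_len check performed in #initialize.
def validate_ngram_len(raw)
  len = raw.to_i # nil.to_i and "abc".to_i both evaluate to 0
  raise ArgumentError, ":ngram_len must be > 0" unless len > 0
  len
end

validate_ngram_len(2)   # => 2
validate_ngram_len("3") # => 3

begin
  validate_ngram_len(nil)
rescue ArgumentError => e
  e.message             # => ":ngram_len must be > 0"
end
```

One consequence of this design is that a string such as "3" is accepted as a length, while a typo such as "three" fails with the same error as a zero.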
Instance Attribute Details
#delim ⇒ Object (readonly)
Returns the value of attribute delim.
# File 'lib/vector_embed/maker/ngram.rb', line 13
def delim
  @delim
end
#len ⇒ Object (readonly)
Returns the value of attribute len.
# File 'lib/vector_embed/maker/ngram.rb', line 12
def len
  @len
end
Class Method Details
.want?(v, parent) ⇒ Boolean
# File 'lib/vector_embed/maker/ngram.rb', line 7
def want?(v, parent)
  parent.options[:ngram_len]
end
Instance Method Details
#pairs(v) ⇒ Object
# File 'lib/vector_embed/maker/ngram.rb', line 22
def pairs(v)
  raise "Ngram can't handle #{v.inspect}, only a single string for now" unless v.is_a?(String)
  v = parent.preprocess v.to_s
  if len == 1
    # word mode
    v.split delim
  elsif delim == ''
    # byte mode
    (0..v.length-len).map { |i| v[i,len] }
  else
    raise "Word n-gram not supported yet"
  end.map do |ngram|
    [ [ parent.index([k, 'ngram', ngram]), 1 ] ]
  end
end
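To make the branching in #pairs easier to follow, here is a standalone sketch of the same tokenizing logic. It runs without VectorEmbed: `ngram_tokens` and `ngram_pairs` are hypothetical helpers, and `INDEX` is an assumed stand-in for `parent.index` that hands out sequential integers. The real index function may behave differently. Preprocessing is omitted.

```ruby
# Hypothetical helper mirroring the if/elsif/else branch in #pairs.
def ngram_tokens(str, len, delim)
  if len == 1
    str.split(delim) # "word mode": split on the delimiter
  elsif delim == ''
    # "byte mode": sliding window of width len over the string
    (0..str.length - len).map { |i| str[i, len] }
  else
    raise "Word n-gram not supported yet"
  end
end

# Assumed stand-in for parent.index: each distinct key gets the next integer.
INDEX = Hash.new { |h, key| h[key] = h.size }

# Each n-gram becomes a one-element list holding an [index, 1] pair.
def ngram_pairs(k, str, len, delim)
  ngram_tokens(str, len, delim).map { |ngram| [[INDEX[[k, 'ngram', ngram]], 1]] }
end

ngram_tokens("hello", 2, '')     # => ["he", "el", "ll", "lo"]
ngram_tokens("a b c", 1, ' ')    # => ["a", "b", "c"]
ngram_tokens("hi", 3, '')        # => []
ngram_pairs("text", "abab", 2, '') # => [[[0, 1]], [[1, 1]], [[0, 1]]]
```

Two details worth noting: a string shorter than `len` yields no n-grams because the range is empty, and repeated n-grams ("ab" above) map to the same index, each carrying a value of 1. In this sketch the slices are character-based, since `String#[]` with a start and length counts characters.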