Class: Lda::Document

Inherits:
Object
Defined in:
lib/lda.rb

Overview

A single document.

Constructor Details

#initialize(doc_line = nil) ⇒ Document

Create the Document using the svmlight-style text line:

num_words w1:freq1 w2:freq2 ... w_n:freq_n

Example:

5 1:2 3:1 4:2 7:3 12:1

The leading num_words value should equal the number of word:freq pairs that follow it, but this is not enforced: the constructor skips that value and sets length from the pairs it actually parses. The pairs may appear in any order.



# File 'lib/lda.rb', line 73

def initialize(doc_line=nil)
  if doc_line.is_a?(String)
    tmp = doc_line.split
    @words = Array.new
    @counts = Array.new
    @total = 0
    tmp.slice(1,tmp.size).each do |pair|
      tmp2 = pair.split(":")
      @words << tmp2[0].to_i
      @counts << tmp2[1].to_i
    end
    @length = @words.size
    @total = @counts.inject(0) {|sum, i| sum + i}
  else    # doc_line is not a String (nil by default)
    @words = Array.new
    @counts = Array.new
    @total = 0
    @length = 0
  end
end
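
To make the line format concrete, here is a minimal standalone sketch that parses a well-formed line the way the constructor listing above does, without requiring the gem. The helper name parse_doc_line and the hash it returns are assumptions made for illustration; they are not part of the library.

```ruby
# Hypothetical helper (not part of the lda gem): parses one svmlight-style
# line in the same spirit as the constructor listing.
def parse_doc_line(doc_line)
  # Skip the leading num_words token; only the word:freq pairs are used.
  pairs = doc_line.split.drop(1)

  # Each pair is "word_index:frequency"; both halves are read as integers.
  words  = pairs.map { |pair| pair.split(":")[0].to_i }
  counts = pairs.map { |pair| pair.split(":")[1].to_i }

  {
    words:  words,
    counts: counts,
    length: words.size,             # number of distinct word entries
    total:  counts.inject(0, :+)    # sum of all frequencies
  }
end

doc = parse_doc_line("5 1:2 3:1 4:2 7:3 12:1")
p doc[:words]   # [1, 3, 4, 7, 12]
p doc[:counts]  # [2, 1, 2, 3, 1]
p doc[:length]  # 5
p doc[:total]   # 9
```

Note that length counts the pairs while total sums their frequencies, so the two differ whenever any word occurs more than once.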

Instance Attribute Details

#counts ⇒ Object

Returns the value of attribute counts.



# File 'lib/lda.rb', line 59

def counts
  @counts
end

#length ⇒ Object (readonly)

Returns the value of attribute length.



# File 'lib/lda.rb', line 60

def length
  @length
end

#total ⇒ Object (readonly)

Returns the value of attribute total.



# File 'lib/lda.rb', line 60

def total
  @total
end

#words ⇒ Object

Returns the value of attribute words.



# File 'lib/lda.rb', line 59

def words
  @words
end

Instance Method Details

#recompute ⇒ Object

Recompute the total and length values if the document has been altered externally. This probably won’t happen, but might be useful if you want to subclass Document.



# File 'lib/lda.rb', line 100

def recompute
  @total = @counts.inject(0) {|sum, i| sum + i}
  @length = @words.size
end
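
The following sketch illustrates why recompute matters after an external change. It uses a stand-in class, SketchDocument, reduced to what the listings on this page show so that it runs without the gem installed; the class name is an assumption, and the writable words and counts accessors are inferred from those attributes not being marked readonly.

```ruby
# Stand-in for Lda::Document (hypothetical, for illustration only).
class SketchDocument
  attr_accessor :words, :counts   # writable, like #words and #counts
  attr_reader :length, :total     # readonly, like #length and #total

  def initialize
    @words = []
    @counts = []
    @total = 0
    @length = 0
  end

  # Same body as the recompute listing above.
  def recompute
    @total = @counts.inject(0) { |sum, i| sum + i }
    @length = @words.size
  end
end

doc = SketchDocument.new

# Alter the document externally by appending word indices and frequencies.
doc.words.push(1, 3, 4)
doc.counts.push(2, 1, 2)

stale = [doc.length, doc.total]   # cached values have not been updated yet
doc.recompute
fresh = [doc.length, doc.total]   # now reflects the appended entries

p stale   # [0, 0]
p fresh   # [3, 5]
```

Because length and total are cached at construction time, any code that mutates words or counts directly is responsible for calling recompute afterwards.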