Class: Chomchom::Topic

Inherits:
Object
Defined in:
lib/chomchom/topic.rb

Constant Summary

MAX = 8

Instance Method Summary

Constructor Details

#initialize(text, title = '', title_weight = 1) ⇒ Topic

Returns a new instance of Topic.



# File 'lib/chomchom/topic.rb', line 8

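def initialize(text, title='', title_weight=1)
    # Support Unicode (requires Ruby 1.9.x or later).
    # dup first: force_encoding mutates its receiver and raises on frozen strings.
    text = text.dup.force_encoding("UTF-8")
    title = title.dup.force_encoding("UTF-8")
    # Repeat the title title_weight times so its words count more heavily,
    # and collapse runs of newlines in the body. No separator is inserted
    # between the title repetitions or before the body text.
    @content = title * title_weight + text.gsub(/\n+/,"\n")
    @content = @content.force_encoding("UTF-8").downcase
end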
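
To illustrate what the constructor stores in @content, here is a minimal standalone sketch of the same normalization. The name build_content is hypothetical and is not part of Chomchom; it only mirrors the steps in the listing above.

```ruby
# Hypothetical helper (not part of Chomchom) mirroring Topic#initialize.
def build_content(text, title = '', title_weight = 1)
  # Work on copies so the caller's strings are not re-tagged in place.
  text = text.dup.force_encoding('UTF-8')
  title = title.dup.force_encoding('UTF-8')
  # Weight the title by repetition, collapse blank lines, then lowercase.
  (title * title_weight + text.gsub(/\n+/, "\n")).downcase
end

p build_content("Hello\n\n\nWorld", "Hi ", 2)
# prints "hi hi hello\nworld"
```

Because the title is repeated with no separator, a title passed here would need its own trailing whitespace (as in "Hi ") to keep its last word from fusing with the next repetition or with the body.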

Instance Method Details

#multiples ⇒ Object

Not yet implemented. This method is not for the benefit of the summary; it is for database storage, so it should move into the topic method in chomchom.rb.

Implementation notes:

  • Merge words before sorting, which keeps the words in the order they appear.
  • Look at each word in single_groups and merge it with the others. This is O(n^2), which is inefficient.
  • Just go through the list in order; for each pair, combine the words, then switch their order, and take whichever ordering generates more counts.
  • Merge for 2-word groups first, then 3-word groups. For triples, build only from the doubles, then combine with non-overlapping singles.
  • Subtract from the count every time words are legally taken away. A merge is legal when the combined count is more than 2 and the remainder is more than 2.



# File 'lib/chomchom/topic.rb', line 30

def multiples
  
end
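
The method body is empty, so the following is only a sketch of how the 2-word merge described in the notes might look. Everything in it is an assumption: the names count_phrase, merge_doubles and MIN_COUNT are hypothetical, phrases are counted on word boundaries, and "legal" is read as requiring both the phrase count and each word's remaining count to exceed 2.

```ruby
# Hypothetical sketch, not part of Chomchom.
MIN_COUNT = 2 # a count must be "more than 2" per the notes

# Count whole-phrase occurrences on word boundaries (assumed counting rule).
def count_phrase(content, phrase)
  content.scan(/\b#{Regexp.escape(phrase)}\b/).length
end

# singles is an array of [word, count] pairs, as returned by Topic#singles.
# Returns [doubles, remaining_singles].
def merge_doubles(content, singles)
  remaining = singles.to_h
  doubles = []
  # combination(2) visits pairs in list order; O(n^2), but n is small here.
  remaining.keys.combination(2) do |a, b|
    forward  = count_phrase(content, "#{a} #{b}")
    backward = count_phrase(content, "#{b} #{a}")
    # Keep whichever ordering occurs more often.
    phrase, n = forward >= backward ? ["#{a} #{b}", forward] : ["#{b} #{a}", backward]
    next unless n > MIN_COUNT
    next unless remaining[a] - n > MIN_COUNT && remaining[b] - n > MIN_COUNT
    # Legal merge: take the phrase's occurrences away from both words.
    remaining[a] -= n
    remaining[b] -= n
    doubles << [phrase, n]
  end
  [doubles, remaining.to_a]
end
```

Triples could be built the same way by pairing each double with a non-overlapping single, as the notes suggest; that step is left out here.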

#singles ⇒ Object



# File 'lib/chomchom/topic.rb', line 16

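def singles
  # Strip non-word characters, then drop duplicates, one-character tokens and common words.
  words = @content.split(' ').map { |w| w.downcase.gsub(/[^\p{Word}]/, '') }.uniq
  words = words.reject { |w| w.length < 2 or w.is_common? }
  # Pair each word with its frequency, keep words seen at least 3 times, most frequent first.
  @singles = words.map { |w| [w, frequency(w)] }
  @singles = @singles.reject { |g| g[1] < 3 }.sort { |a,b| b[1] <=> a[1] }
  # Return at most MAX entries (0..MAX would return MAX + 1).
  @singles.first(MAX)
end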
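
The listing relies on frequency and String#is_common?, neither of which is shown on this page. As a rough, self-contained sketch of the same pipeline, the example below substitutes hypothetical stand-ins: a tiny COMMON_WORDS list, a word-boundary frequency count, and a LIMIT of 8 in place of MAX. The real helpers may behave differently.

```ruby
# Hypothetical stand-ins for helpers this page does not show.
COMMON_WORDS = %w[the and is of to in it].freeze
LIMIT = 8 # assumed to play the role of MAX

# Assumed counting rule: whole-word matches in the content.
def frequency(content, word)
  content.scan(/\b#{Regexp.escape(word)}\b/).length
end

def top_singles(content)
  content = content.downcase
  # Tokenize on whitespace and strip punctuation from each token.
  words = content.split(' ').map { |w| w.gsub(/[^\p{Word}]/, '') }.uniq
  words = words.reject { |w| w.length < 2 || COMMON_WORDS.include?(w) }
  words.map { |w| [w, frequency(content, w)] }
       .reject { |_, n| n < 3 }   # keep words seen at least 3 times
       .sort_by { |_, n| -n }     # most frequent first
       .first(LIMIT)
end

p top_singles("Ruby is fun. Ruby is fast. Ruby, ruby! The gem, the gem and the gem.")
# prints [["ruby", 4], ["gem", 3]]
```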