Class: Statlysis::HotestItems

Inherits:
SingleKv
Defined in:
lib/statlysis/cron/top/hotest_items.rb

Overview

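A "recently hot" list is usually built by simply counting visits in a single field, but that approach is vulnerable to problems such as artificially inflated traffic.

The fix is to derive the ranking from a combined analysis of user behavior. The flow is: extract item_id from the URI; extract de-duplicated IPs and user_ids from the access log; fetch deeper user behavior from the like, fav, and comment tables; then add the first two together in a certain ratio to produce the ranking. Finally, time cooling is applied to avoid the Matthew effect, and the ratio can be raised dynamically so that items that have recently become mildly popular displace items that were previously very popular.

The computation is linear, so it is fast.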
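
As a rough illustration of the flow above, the sketch below adds two signals in a fixed ratio and then applies time cooling. The weights, the method names `raw_score` and `cooled_score`, and the sample data are assumptions made for illustration only; the source does not state the actual ratio, and reading the two signals as "log-derived visitors" and "like/fav/comment actions" is an interpretation. The square-root cooling mirrors the formula in the `#output` source on this page.

```ruby
# Hypothetical weighted sum of two signals. The default weights are
# assumptions; the source only says "a certain ratio".
def raw_score(views, actions, view_weight: 1.0, action_weight: 5.0)
  view_weight * views + action_weight * actions
end

# Time cooling: divide the score by the square root of the item's age in
# days. The age is rounded and incremented by one so it is never zero.
def cooled_score(score, created_at, now = Time.now)
  age_days = ((now - created_at) / (3600 * 24)).round + 1
  score / Math.sqrt(age_days)
end

now = Time.utc(2024, 1, 31)
items = {
  "old_hit"  => { views: 900, actions: 20, created_at: Time.utc(2024, 1, 1) },
  "new_item" => { views: 300, actions: 10, created_at: Time.utc(2024, 1, 30) }
}

# Sort by cooled score, highest first.
ranking = items.sort_by do |_id, item|
  -cooled_score(raw_score(item[:views], item[:actions]), item[:created_at], now)
end.map(&:first)

puts ranking.inspect
```

With these sample numbers the older item has the higher raw score (1000 versus 350), but after cooling the newer item ranks first, which is the effect the overview describes.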

Instance Attribute Summary

Attributes inherited from SingleKv

#stat_column_name, #time_ago

Attributes inherited from Top

#logs, #pattern_proc, #result_limit, #stat_model, #user_id_proc, #user_info_proc

Attributes inherited from Cron

#clock, #multiple_dataset, #source_type, #time_column, #time_unit, #time_zone

Instance Method Summary

Methods inherited from Top

#default_assign_attr, #run

Methods inherited from Cron

#_source, #group_by_columns?, #is_activerecord?, #is_mongoid?, #is_orm?, #is_time_column_integer?, #reoutput, #run, #setup_stat_model, #source_where_array, #time_column?, #time_range

Methods included from Common

#cron

Constructor Details

#initialize(key, id_to_score_and_time_hash_proc) ⇒ HotestItems

Returns a new instance of HotestItems.



# File 'lib/statlysis/cron/top/hotest_items.rb', line 16

def initialize key, id_to_score_and_time_hash_proc
  cron.key = key
  cron.id_to_score_and_time_hash_proc = id_to_score_and_time_hash_proc
  cron.limit = 20
  super
  cron
end

Instance Attribute Details

#id_to_score_and_time_hash_proc ⇒ Object

Returns the value of attribute id_to_score_and_time_hash_proc.



# File 'lib/statlysis/cron/top/hotest_items.rb', line 13

def id_to_score_and_time_hash_proc
  @id_to_score_and_time_hash_proc
end

#key ⇒ Object

Returns the value of attribute key.



# File 'lib/statlysis/cron/top/hotest_items.rb', line 13

def key
  @key
end

#limit ⇒ Object

Returns the value of attribute limit.



# File 'lib/statlysis/cron/top/hotest_items.rb', line 14

def limit
  @limit
end

Instance Method Details

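#output ⇒ Object

Resolves id_to_score_and_time_hash_proc (calling it repeatedly while the result is still a Proc) to a hash of id => [score, time], divides each score by the square root of the item's age in days (rounded, plus one), and returns a hash mapping key to the ids sorted by cooled score in descending order.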



# File 'lib/statlysis/cron/top/hotest_items.rb', line 24

def output
  t = cron.id_to_score_and_time_hash_proc
  while t.is_a?(Proc) do
    t = t.call
  end
  @id_to_score_and_time_hash = t
  @id_to_day_hash = @id_to_score_and_time_hash.inject({}) {|h, ab| h[ab[0]] = (((Time.now - ab[1][1]) / (3600*24)).round + 1); h }

  @id_to_timecooldown_hash = @id_to_score_and_time_hash.inject({}) {|h, kv| h[kv[0]] = (kv[1][0] / Math.sqrt(@id_to_day_hash[kv[0]])); h }
  array = @id_to_timecooldown_hash.sort {|a, b| b[1] <=> a[1] }.map(&:first)
  {cron.key => array}
end
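
To make the expected input shape concrete, here is a standalone restatement of the same ranking that runs without Statlysis. `rank_by_cooldown` is a hypothetical helper name and the sample ids, scores, and dates are invented; only the unwrap loop and the `score / sqrt(days)` formula are taken from the source above.

```ruby
# Hypothetical helper, not part of Statlysis: ranks ids from a hash of
# id => [score, time], or from a Proc (possibly nested) returning one.
def rank_by_cooldown(source, now = Time.now)
  # Unwrap nested procs until a plain hash is reached, as #output does.
  source = source.call while source.is_a?(Proc)

  cooled = source.each_with_object({}) do |(id, (score, time)), h|
    days = ((now - time) / (3600 * 24)).round + 1
    h[id] = score / Math.sqrt(days)
  end

  # Highest cooled score first; keep only the ids.
  cooled.sort_by { |_id, value| -value }.map(&:first)
end

now = Time.utc(2024, 3, 10)
source = proc do
  proc do
    {
      101 => [400, Time.utc(2024, 3, 1)],  # days = 10, cooled = 400 / sqrt(10)
      102 => [150, Time.utc(2024, 3, 10)], # days = 1,  cooled = 150
      103 => [90,  Time.utc(2024, 3, 7)]   # days = 4,  cooled = 45
    }
  end
end

ranked = rank_by_cooldown(source, now)
puts ranked.inspect
```

Item 101 has the largest raw score, but at ten days old it cools to roughly 126.5 and falls behind the same-day item 102.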

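#write ⇒ Object

Serializes the first 141 ids (array[0..140]) of each output entry to JSON and stores the result in StSingleKv under the key, plus a dated snapshot in StSingleKvHistory under key_YYYYMMDD.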



# File 'lib/statlysis/cron/top/hotest_items.rb', line 37

def write
  cron.output.each do |key, array|
    json = array[0..140].to_json
    StSingleKv.find_or_create(:pattern => key).update :result => json
    StSingleKvHistory.find_or_create(:pattern => "#{key}_#{Time.now.strftime('%Y%m%d')}").update :result => json
  end
end
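
The truncation and the history key format in `#write` can be checked in isolation. The sketch below uses no Statlysis models; the key `"hot_posts"`, the id list, and the fixed date are assumptions for illustration.

```ruby
require "json"

ids = (1..200).to_a

# An inclusive range 0..140 keeps indexes 0 through 140.
kept = ids[0..140]
json = kept.to_json

# Same pattern as the StSingleKvHistory key: "<key>_<YYYYMMDD>".
key = "hot_posts" # hypothetical key
history_key = "#{key}_#{Time.utc(2024, 3, 10).strftime('%Y%m%d')}"

puts kept.size
puts history_key
```

Note that, in the code shown on this page, the `limit` of 20 set in the constructor is not consulted by `#write`; the slice bound is hard-coded.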