Class: Ai4r::Clusterers::Diana
Defined in: lib/ai4r/clusterers/diana.rb
Overview
DIANA (Divisive ANAlysis) (Kaufman and Rousseeuw, 1990; Macnaughton-Smith et al., 1964) is a divisive hierarchical clusterer. It starts with a single cluster containing all data items and repeatedly splits clusters until the desired number of clusters is reached.
Instance Attribute Summary
- #clusters ⇒ Object (readonly): Returns the value of attribute clusters.
- #data_set ⇒ Object (readonly): Returns the value of attribute data_set.
- #number_of_clusters ⇒ Object (readonly): Returns the value of attribute number_of_clusters.
Instance Method Summary
- #build(data_set, number_of_clusters) ⇒ Object: Build a new clusterer, using divisive analysis (DIANA algorithm).
- #eval(data_item) ⇒ Object: Classifies the given data item, returning the cluster index it belongs to (0-based).
- #initialize ⇒ Diana (constructor): A new instance of Diana.
Methods included from Data::Parameterizable
#get_parameters, included, #set_parameters
Constructor Details
#initialize ⇒ Diana
Returns a new instance of Diana.
# File 'lib/ai4r/clusterers/diana.rb', line 31
def initialize
  @distance_function = lambda do |a,b|
    Ai4r::Data::Proximity.squared_euclidean_distance(
      a.select {|att_a| att_a.is_a? Numeric} ,
      b.select {|att_b| att_b.is_a? Numeric})
  end
end
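The default distance function filters each item down to its numeric attributes before measuring distance. The following is a minimal plain-Ruby sketch of that behaviour. It does not use ai4r: `squared_euclidean_distance` and `numeric_distance` are hypothetical stand-ins, and the sketch assumes the library helper sums the squared differences of matching coordinates.

```ruby
# Hypothetical stand-in for Ai4r::Data::Proximity.squared_euclidean_distance.
# Assumption: it sums the squared differences of matching coordinates.
def squared_euclidean_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Mirrors the constructor's lambda: non-numeric attributes are dropped
# from both items before the distance is computed.
def numeric_distance(a, b)
  squared_euclidean_distance(
    a.select { |att| att.is_a?(Numeric) },
    b.select { |att| att.is_a?(Numeric) }
  )
end

item_a = [1.0, 2.0, 'red']
item_b = [4.0, 6.0, 'blue']
puts numeric_distance(item_a, item_b) # => 25.0
```

Because the string attributes are filtered out, only the two numeric coordinates contribute: (1 - 4)^2 + (2 - 6)^2 = 25.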
Instance Attribute Details
#clusters ⇒ Object (readonly)
Returns the value of attribute clusters.
# File 'lib/ai4r/clusterers/diana.rb', line 23
def clusters
  @clusters
end
#data_set ⇒ Object (readonly)
Returns the value of attribute data_set.
# File 'lib/ai4r/clusterers/diana.rb', line 23
def data_set
  @data_set
end
#number_of_clusters ⇒ Object (readonly)
Returns the value of attribute number_of_clusters.
# File 'lib/ai4r/clusterers/diana.rb', line 23
def number_of_clusters
  @number_of_clusters
end
Instance Method Details
#build(data_set, number_of_clusters) ⇒ Object
Build a new clusterer, using divisive analysis (DIANA algorithm).
# File 'lib/ai4r/clusterers/diana.rb', line 40
def build(data_set, number_of_clusters)
  @data_set = data_set
  @number_of_clusters = number_of_clusters
  @clusters = [@data_set[0..-1]]

  while(@clusters.length < @number_of_clusters)
    cluster_index_to_split = max_diameter_cluster(@clusters)
    cluster_to_split = @clusters[cluster_index_to_split]
    splinter_cluster = init_splinter_cluster(cluster_to_split)
    while true
      dist_diff, index = max_distance_difference(cluster_to_split, splinter_cluster)
      break if dist_diff < 0
      splinter_cluster << cluster_to_split.data_items[index]
      cluster_to_split.data_items.delete_at(index)
    end
    @clusters << splinter_cluster
  end

  return self
end
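The helpers called by #build (max_diameter_cluster, init_splinter_cluster, max_distance_difference) are not shown on this page. The plain-Ruby sketch below illustrates one way the divisive loop can work, assuming the usual DIANA definitions: the cluster with the largest pairwise distance is split, the splinter group is seeded with the item farthest on average from the rest, and items keep moving while they are closer on average to the splinter group. All method names are hypothetical, clusters are plain arrays of numeric points rather than ai4r data sets, and the guard against emptying a cluster is an addition of this sketch.

```ruby
# Squared Euclidean distance between two numeric points.
def sq_dist(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Average distance from item to every point in group (0.0 for an empty group).
def avg_dist(item, group)
  return 0.0 if group.empty?
  group.sum { |other| sq_dist(item, other) } / group.length.to_f
end

# All members of cluster except the one at index.
def others(cluster, index)
  cluster.reject.with_index { |_, i| i == index }
end

# Diameter: the largest pairwise distance inside a cluster.
def diameter(cluster)
  cluster.combination(2).map { |a, b| sq_dist(a, b) }.max || 0.0
end

def diana(points, number_of_clusters)
  clusters = [points.dup]
  while clusters.length < number_of_clusters
    to_split = clusters.max_by { |c| diameter(c) }
    break if to_split.length < 2 # nothing left to split

    # Seed the splinter group with the item farthest, on average, from the rest.
    seed = (0...to_split.length).max_by do |i|
      avg_dist(to_split[i], others(to_split, i))
    end
    splinter = [to_split.delete_at(seed)]

    # Move items while some item is closer to the splinter group than to
    # the rest of its own cluster.
    while to_split.length > 1
      diffs = to_split.each_index.map do |i|
        avg_dist(to_split[i], others(to_split, i)) - avg_dist(to_split[i], splinter)
      end
      best = diffs.each_index.max_by { |i| diffs[i] }
      break if diffs[best] < 0
      splinter << to_split.delete_at(best)
    end
    clusters << splinter
  end
  clusters
end

points = [[1.0, 1.0], [1.5, 1.0], [1.0, 1.5], [8.0, 8.0], [8.5, 8.0]]
diana(points, 2).each { |cluster| p cluster }
```

With these five points, the three points near (1, 1) end up in one cluster and the two points near (8, 8) in the other.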
#eval(data_item) ⇒ Object
Classifies the given data item, returning the cluster index it belongs to (0-based).
# File 'lib/ai4r/clusterers/diana.rb', line 63
def eval(data_item)
  get_min_index(@clusters.collect do |cluster|
    distance_sum(data_item, cluster) / cluster.data_items.length
  end)
end
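As the method body shows, #eval assigns an item to the cluster with the smallest average distance to its members. The plain-Ruby sketch below illustrates that rule without ai4r. `sq_dist` and `closest_cluster_index` are hypothetical names, clusters are plain arrays of numeric points, and the division uses a Float so that integer coordinates are averaged without truncation (whether the library does the same is not shown here).

```ruby
# Squared Euclidean distance between two numeric points.
def sq_dist(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Returns the 0-based index of the cluster whose members are closest,
# on average, to item. Assumes every cluster is non-empty.
def closest_cluster_index(item, clusters)
  averages = clusters.map do |cluster|
    cluster.sum { |member| sq_dist(item, member) } / cluster.length.to_f
  end
  averages.index(averages.min)
end

clusters = [
  [[1.0, 1.0], [1.5, 1.0]],
  [[8.0, 8.0], [8.5, 8.0]]
]
puts closest_cluster_index([7.0, 7.5], clusters) # => 1
puts closest_cluster_index([0.0, 0.0], clusters) # => 0
```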