Class: Ai4r::Clusterers::SingleLinkage

Inherits:
Clusterer
  • Object
Defined in:
lib/ai4r/clusterers/single_linkage.rb

Overview

Implementation of a hierarchical clusterer with single linkage (Everitt et al., 2001; Johnson, 1967; Jain and Dubes, 1988; Sneath, 1957). A hierarchical clusterer creates one cluster per element and then progressively merges the two closest clusters until the required number of clusters is reached. With single linkage, the distance between two clusters is the distance between their two closest elements, one taken from each cluster. When clusters ci and cj are merged, the distance from any other cluster cx to the merged cluster is therefore:

D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj))
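
The update rule above can be checked on a small example. The following is a minimal sketch, not code from ai4r: the helper single_linkage_distance and the 1-D absolute-difference distance are assumptions chosen to keep the arithmetic easy to follow.

```ruby
# Illustrative sketch only; these helpers are hypothetical and not part of ai4r.

# Single-linkage distance between two clusters: the distance between their
# two closest elements. `dist` is any callable taking two elements.
def single_linkage_distance(cluster_a, cluster_b, dist)
  cluster_a.product(cluster_b).map { |a, b| dist.call(a, b) }.min
end

# 1-D points with absolute difference as the distance (an assumption).
dist = ->(a, b) { (a - b).abs }

cx = [0]
ci = [7]
cj = [3, 20]

# Left-hand side: compare cx against the merged cluster directly.
direct = single_linkage_distance(cx, ci + cj, dist)

# Right-hand side: reuse the two distances known before the merge.
updated = [single_linkage_distance(cx, ci, dist),
           single_linkage_distance(cx, cj, dist)].min

puts direct   # 3
puts updated  # 3
```

Because the merged distance depends only on the two pre-merge distances, a clusterer can update a distance matrix in place rather than revisiting the individual elements.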

Instance Attribute Summary

Instance Method Summary

Methods included from Data::Parameterizable

#get_parameters, included, #set_parameters

Constructor Details

#initialize ⇒ SingleLinkage

Returns a new instance of SingleLinkage.



# File 'lib/ai4r/clusterers/single_linkage.rb', line 36

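def initialize
  # Default distance: squared Euclidean distance over the numeric
  # attributes only; non-numeric attributes are ignored.
  @distance_function = lambda do |a, b|
    Ai4r::Data::Proximity.squared_euclidean_distance(
      a.select { |att_a| att_a.is_a? Numeric },
      b.select { |att_b| att_b.is_a? Numeric })
  end
end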
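
To illustrate what the default distance function does with mixed attributes, here is a self-contained sketch. The local squared_euclidean_distance method is a stand-in written for this example and is assumed to behave like Ai4r::Data::Proximity.squared_euclidean_distance (the sum of squared attribute differences); it is not the library's code.

```ruby
# Stand-in for Ai4r::Data::Proximity.squared_euclidean_distance
# (assumption: sum of squared differences, attribute by attribute).
def squared_euclidean_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Same shape as the lambda built in the constructor: non-numeric
# attributes are filtered out before the distance is computed.
distance_function = lambda do |a, b|
  squared_euclidean_distance(
    a.select { |att| att.is_a?(Numeric) },
    b.select { |att| att.is_a?(Numeric) }
  )
end

# The string attributes are dropped, leaving [1, 2] and [4, 6].
puts distance_function.call([1, 2, 'red'], [4, 6, 'blue'])  # (1-4)^2 + (2-6)^2 = 25
```

Note that the filter assumes both items keep their numeric attributes in the same positions; items with differently placed non-numeric attributes would be compared column against the wrong column.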

Instance Attribute Details

#clusters ⇒ Object (readonly)

Returns the value of attribute clusters.



# File 'lib/ai4r/clusterers/single_linkage.rb', line 28

def clusters
  @clusters
end

#data_set ⇒ Object (readonly)

Returns the value of attribute data_set.



# File 'lib/ai4r/clusterers/single_linkage.rb', line 28

def data_set
  @data_set
end

#number_of_clusters ⇒ Object (readonly)

Returns the value of attribute number_of_clusters.



# File 'lib/ai4r/clusterers/single_linkage.rb', line 28

def number_of_clusters
  @number_of_clusters
end

Instance Method Details

#build(data_set, number_of_clusters) ⇒ Object

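Builds the clusterer from the data examples found in data_set, grouping the items into number_of_clusters clusters. Every item starts in its own cluster, and the two closest clusters are merged repeatedly until only number_of_clusters remain. Returns self.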



# File 'lib/ai4r/clusterers/single_linkage.rb', line 47

def build(data_set, number_of_clusters)
  @data_set = data_set
  @number_of_clusters = number_of_clusters
  
  @index_clusters = create_initial_index_clusters
  create_distance_matrix(data_set)
  while @index_clusters.length > @number_of_clusters
    ci, cj = get_closest_clusters(@index_clusters)
    update_distance_matrix(ci, cj)
    merge_clusters(ci, cj, @index_clusters)
  end
  @clusters = build_clusters_from_index_clusters @index_clusters
  
  return self
end
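
The merge loop in build can be sketched without the library. The version below is a deliberately naive illustration under stated assumptions: it works on plain arrays of numeric points, recomputes cluster distances on every pass instead of maintaining a distance matrix, and the names naive_single_linkage, single_link and squared_euclidean are hypothetical, not ai4r methods.

```ruby
# Naive illustration only; not the ai4r implementation.
def squared_euclidean(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Single-linkage distance: closest pair of points across the two clusters.
def single_link(cluster_a, cluster_b)
  cluster_a.product(cluster_b).map { |a, b| squared_euclidean(a, b) }.min
end

def naive_single_linkage(items, number_of_clusters)
  clusters = items.map { |item| [item] } # one cluster per element

  while clusters.length > number_of_clusters
    # Pick the pair of cluster indices (i < j) with the smallest distance.
    i, j = (0...clusters.length).to_a.combination(2).min_by do |a, b|
      single_link(clusters[a], clusters[b])
    end

    merged = clusters[i] + clusters[j]
    clusters.delete_at(j) # delete the larger index first so i stays valid
    clusters.delete_at(i)
    clusters << merged
  end

  clusters
end

items = [[0, 0], [0, 1], [10, 10], [10, 11], [5, 5]]
result = naive_single_linkage(items, 2)
result.each { |cluster| p cluster.sort }
```

With these five points the sketch ends with [5, 5] grouped alongside [0, 0] and [0, 1], because its closest neighbour [0, 1] is nearer than any point of the other group.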

#eval(data_item) ⇒ Object

Classifies the given data item, returning the cluster index it belongs to (0-based).



# File 'lib/ai4r/clusterers/single_linkage.rb', line 65

def eval(data_item)
  get_min_index(@clusters.collect {|cluster| 
      distance_between_item_and_cluster(data_item, cluster)})
end
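
As a rough illustration of this lookup, the sketch below assigns an item to the cluster whose closest member is nearest. It assumes that the distance between an item and a cluster is the minimum distance to any member, in keeping with single linkage; the helper nearest_cluster_index and the plain-array clusters are hypothetical and not part of ai4r.

```ruby
# Illustrative sketch only; not the ai4r implementation.
def squared_euclidean(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Returns the 0-based index of the cluster closest to data_item, where the
# item-to-cluster distance is the distance to the nearest member (assumption).
def nearest_cluster_index(data_item, clusters)
  distances = clusters.map do |cluster|
    cluster.map { |member| squared_euclidean(data_item, member) }.min
  end
  distances.index(distances.min)
end

clusters = [
  [[0, 0], [0, 1]],
  [[10, 10], [10, 11]]
]

puts nearest_cluster_index([1, 1], clusters)  # 0
puts nearest_cluster_index([9, 9], clusters)  # 1
```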