Class: Ai4r::Clusterers::BisectingKMeans

Inherits: KMeans

Inheritance chain (root first): Object, Clusterer, KMeans, Ai4r::Clusterers::BisectingKMeans

Defined in: lib/ai4r/clusterers/bisecting_k_means.rb

Overview

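The bisecting k-means algorithm is a variation of the k-means algorithm that is somewhat less sensitive to the initial choice of centroids than the original. It starts with a single cluster containing every item and repeatedly splits the biggest cluster in two with plain k-means until the requested number of clusters is reached.

More about the k-means algorithm: en.wikipedia.org/wiki/K-means_algorithm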

Instance Attribute Summary

#centroids ⇒ Object readonly

Returns the value of attribute centroids.

#clusters ⇒ Object readonly

Returns the value of attribute clusters.

#data_set ⇒ Object readonly

Returns the value of attribute data_set.

#number_of_clusters ⇒ Object readonly

Returns the value of attribute number_of_clusters.

Attributes inherited from KMeans

#history, #iterations

Instance Method Summary

#build(data_set, number_of_clusters) ⇒ Object

Build a new clusterer, using data examples found in data_set.

#initialize ⇒ Object constructor

Constructor Details

#initialize ⇒ `Object`

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 43

def initialize
  super
  @refine = true
end

Instance Attribute Details

#centroids ⇒ `Object` (readonly)

Returns the value of attribute centroids.

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 24

def centroids
  @centroids
end

#clusters ⇒ `Object` (readonly)

Returns the value of attribute clusters.

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 24

def clusters
  @clusters
end

#data_set ⇒ `Object` (readonly)

Returns the value of attribute data_set.

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 24

def data_set
  @data_set
end

#number_of_clusters ⇒ `Object` (readonly)

Returns the value of attribute number_of_clusters.

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 24

def number_of_clusters
  @number_of_clusters
end

Instance Method Details

#build(data_set, number_of_clusters) ⇒ `Object`

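Builds the clusterer from the data examples found in data_set. Items are grouped into number_of_clusters clusters, and the method returns self so calls can be chained. When @refine is true, which the constructor sets by default, the inherited KMeans#build is called afterwards via super.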

Parameters:

data_set (Object)
number_of_clusters (Object)

Returns:

(Object)

# File 'lib/ai4r/clusterers/bisecting_k_means.rb', line 54

def build(data_set, number_of_clusters)
  @data_set = data_set
  @number_of_clusters = number_of_clusters

  @clusters = [@data_set]
  @centroids = [@data_set.get_mean_or_mode]
  while @clusters.length < @number_of_clusters
    biggest_cluster_index = find_biggest_cluster_index(@clusters)
    clusterer = KMeans.new
                      .set_parameters(get_parameters)
                      .build(@clusters[biggest_cluster_index], 2)
    @clusters.delete_at(biggest_cluster_index)
    @centroids.delete_at(biggest_cluster_index)
    @clusters.concat(clusterer.clusters)
    @centroids.concat(clusterer.centroids)
  end

  super if @refine

  self
end
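
The splitting loop above can be illustrated with a minimal plain-Ruby sketch that does not depend on the library. This is not ai4r's implementation: the helper names mean_point, squared_distance, split_in_two, and bisecting_k_means are hypothetical, points are assumed to be arrays of numbers, the 2-means seeding is a deterministic choice made for this example (first point plus the point farthest from it), and the final refinement pass (super if @refine) is left out.

```ruby
# Hypothetical helpers for illustration only; none of these names come from ai4r.

# Component-wise mean of a non-empty list of numeric points.
def mean_point(points)
  points.transpose.map { |column| column.sum.to_f / column.size }
end

# Squared Euclidean distance between two points of equal dimension.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Plain 2-means on a single cluster. Seeds are the first point and the point
# farthest from it (an assumption of this sketch, chosen to be deterministic).
# Degenerate input, such as all-identical points, can yield an empty half.
def split_in_two(points, max_iterations = 100)
  centroids = [points.first,
               points.max_by { |p| squared_distance(p, points.first) }]
  halves = nil
  max_iterations.times do
    new_halves = points.partition do |p|
      squared_distance(p, centroids[0]) <= squared_distance(p, centroids[1])
    end
    break if new_halves == halves # assignments stopped changing

    halves = new_halves
    break if halves.any?(&:empty?) # cannot recompute a centroid for an empty half

    centroids = halves.map { |half| mean_point(half) }
  end
  halves
end

# Start with one cluster holding everything, then keep replacing the biggest
# cluster with its two halves until there are k clusters.
def bisecting_k_means(points, k)
  clusters = [points]
  while clusters.length < k
    biggest = clusters.each_index.max_by { |i| clusters[i].length }
    clusters.concat(split_in_two(clusters.delete_at(biggest)))
  end
  clusters
end

points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [20, 0], [21, 0], [20, 1]]
clusters = bisecting_k_means(points, 3)
clusters.each { |cluster| p cluster }
```

With the sample data, the three well-separated groups of points each end up in their own cluster. As in the library's build method, the split cluster is removed and its two halves are appended, so the order of the returned clusters reflects the order of the splits rather than the order of the input.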
