Class: Rumale::Clustering::DBSCAN

Inherits:
Object
Includes:
Base::BaseEstimator, Base::ClusterAnalyzer
Defined in:
lib/rumale/clustering/dbscan.rb

Overview

DBSCAN is a class that implements DBSCAN cluster analysis. The current implementation uses the Euclidean distance for analyzing the clusters.

Reference

    M. Ester, H-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” Proc. KDD '96, pp. 226–231, 1996.

Examples:

analyzer = Rumale::Clustering::DBSCAN.new(eps: 0.5, min_samples: 5)
cluster_labels = analyzer.fit_predict(samples)

Instance Attribute Summary

Attributes included from Base::BaseEstimator

#params

Instance Method Summary

Methods included from Base::ClusterAnalyzer

#score

Constructor Details

#initialize(eps: 0.5, min_samples: 5) ⇒ DBSCAN

Create a new cluster analyzer with DBSCAN method.

Parameters:

  • eps (Float) (defaults to: 0.5)

    The radius of neighborhood.

  • min_samples (Integer) (defaults to: 5)

    The minimum number of neighboring samples within eps required for a point to be treated as a core point.



# File 'lib/rumale/clustering/dbscan.rb', line 34

def initialize(eps: 0.5, min_samples: 5)
  check_params_float(eps: eps)
  check_params_integer(min_samples: min_samples)
  @params = {}
  @params[:eps] = eps
  @params[:min_samples] = min_samples
  @core_sample_ids = nil
  @labels = nil
end
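
The role of eps and min_samples can be illustrated with a small plain-Ruby sketch. It is not Rumale's code: the helper names euclidean_distance and core_sample_indices are hypothetical, and it assumes that a point counts as its own neighbor and that the eps boundary is inclusive.

```ruby
# Hypothetical helper (not part of Rumale): Euclidean distance between two points.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 })
end

# Returns the indices of samples that have at least min_samples points
# within radius eps. Assumption: each point counts as its own neighbor.
def core_sample_indices(samples, eps: 0.5, min_samples: 5)
  samples.each_index.select do |i|
    n_neighbors = samples.count { |other| euclidean_distance(samples[i], other) <= eps }
    n_neighbors >= min_samples
  end
end

# Four tightly packed points and one isolated point.
samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
core_ids = core_sample_indices(samples, eps: 0.5, min_samples: 4)
```

With these made-up samples, the four packed points each have four neighbors within eps and qualify as core points, while the isolated point does not.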

Instance Attribute Details

#core_sample_ids ⇒ Numo::Int32 (readonly)

Return the core sample indices.

Returns:

  • (Numo::Int32)

    (shape: [n_core_samples])



# File 'lib/rumale/clustering/dbscan.rb', line 24

def core_sample_ids
  @core_sample_ids
end

#labels ⇒ Numo::Int32 (readonly)

Return the cluster labels. A negative cluster label indicates that the point is noise.

Returns:

  • (Numo::Int32)

    (shape: [n_samples])



# File 'lib/rumale/clustering/dbscan.rb', line 28

def labels
  @labels
end
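
Since negative labels mark noise, downstream code usually separates noise from clustered samples. A minimal sketch, assuming the labels have already been converted to a plain Ruby array (for instance with to_a); the label values below are made up for illustration.

```ruby
# Stand-in for the labels of a fitted analyzer, as a plain Ruby array.
# These values are illustrative, not real output.
labels = [0, 0, 1, -1, 1, 0, -1]

# Split sample indices into noise (negative label) and clustered points.
noise_ids, clustered_ids = labels.each_index.partition { |i| labels[i].negative? }

# Group the clustered sample indices by their cluster label.
clusters = clustered_ids.group_by { |i| labels[i] }
```

Here noise_ids holds the indices of the noise points and clusters maps each cluster label to the indices of its members.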

Instance Method Details

#fit(x) ⇒ DBSCAN

Analyze clusters with the given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for cluster analysis.

Returns:

  • (DBSCAN)

    The learned cluster analyzer itself.



# File 'lib/rumale/clustering/dbscan.rb', line 50

def fit(x, _y = nil)
  check_sample_array(x)
  partial_fit(x)
  self
end

#fit_predict(x) ⇒ Numo::Int32

Analyze clusters and assign samples to clusters.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for cluster analysis.

Returns:

  • (Numo::Int32)

    (shape: [n_samples]) Predicted cluster label per sample.



# File 'lib/rumale/clustering/dbscan.rb', line 60

def fit_predict(x)
  check_sample_array(x)
  partial_fit(x)
  labels
end
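
The source of partial_fit is not shown on this page. As a rough illustration of what a DBSCAN labeling pass does, here is a minimal plain-Ruby sketch of the textbook algorithm. It is not Rumale's implementation: dbscan_labels, region_query and euclidean_distance are hypothetical names, plain arrays stand in for Numo arrays, and the eps boundary is assumed to be inclusive.

```ruby
# Hypothetical helper: Euclidean distance between two points.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 })
end

# Indices of all samples within eps of sample i (including i itself).
def region_query(samples, i, eps)
  samples.each_index.select { |j| euclidean_distance(samples[i], samples[j]) <= eps }
end

# Textbook DBSCAN: returns one label per sample, with -1 marking noise.
def dbscan_labels(samples, eps: 0.5, min_samples: 5)
  labels = Array.new(samples.size) # nil means "not visited yet"
  cluster_id = 0
  samples.each_index do |i|
    next unless labels[i].nil?

    neighbors = region_query(samples, i, eps)
    if neighbors.size < min_samples
      labels[i] = -1 # provisional noise; may become a border point later
      next
    end

    # Sample i is a core point: start a new cluster and expand it.
    labels[i] = cluster_id
    queue = neighbors - [i]
    until queue.empty?
      j = queue.shift
      labels[j] = cluster_id if labels[j] == -1 # noise reached from a core point becomes a border point
      next unless labels[j].nil?

      labels[j] = cluster_id
      j_neighbors = region_query(samples, j, eps)
      # Only core points propagate the cluster further.
      queue.concat(j_neighbors) if j_neighbors.size >= min_samples
    end
    cluster_id += 1
  end
  labels
end

samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
           [3.0, 3.0], [3.1, 3.0], [3.0, 3.1], [9.0, 9.0]]
labels = dbscan_labels(samples, eps: 0.5, min_samples: 3)
```

On this made-up data the two dense groups receive labels 0 and 1, and the isolated point is labeled -1.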

#marshal_dump ⇒ Hash

Dump marshal data.

Returns:

  • (Hash)

    The marshal data.



# File 'lib/rumale/clustering/dbscan.rb', line 68

def marshal_dump
  { params: @params,
    core_sample_ids: @core_sample_ids,
    labels: @labels }
end

#marshal_load(obj) ⇒ nil

Load marshal data.

Returns:

  • (nil)


# File 'lib/rumale/clustering/dbscan.rb', line 76

def marshal_load(obj)
  @params = obj[:params]
  @core_sample_ids = obj[:core_sample_ids]
  @labels = obj[:labels]
  nil
end
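
Ruby's Marshal module calls marshal_dump when serializing an object and marshal_load when restoring it, so these two methods are what allow a fitted analyzer to be saved and reloaded. The round trip can be sketched with a hypothetical stand-in class, TinyAnalyzer, which mirrors the pattern above without depending on Rumale.

```ruby
# Hypothetical stand-in class (not part of Rumale) that follows the same
# marshal_dump / marshal_load pattern as the listing above.
class TinyAnalyzer
  attr_reader :params, :labels

  def initialize(eps: 0.5, min_samples: 5)
    @params = { eps: eps, min_samples: min_samples }
    @labels = nil
  end

  # Pretend to fit by storing precomputed labels.
  def fit(labels)
    @labels = labels
    self
  end

  # Called by Marshal.dump: return the state to serialize.
  def marshal_dump
    { params: @params, labels: @labels }
  end

  # Called by Marshal.load: restore the state from the dumped hash.
  def marshal_load(obj)
    @params = obj[:params]
    @labels = obj[:labels]
    nil
  end
end

original = TinyAnalyzer.new(eps: 0.3, min_samples: 4).fit([0, 0, -1])
restored = Marshal.load(Marshal.dump(original))
```

After the round trip, restored carries the same params and labels as original. Marshal.load should only be used on data from a trusted source.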