Class: Rumale::Decomposition::PCA

Inherits:
Object
Includes:
Base::BaseEstimator, Base::Transformer
Defined in:
lib/rumale/decomposition/pca.rb

Overview

PCA is a class that implements Principal Component Analysis.

Reference

    A. Sharma and K. K. Paliwal, “Fast principal component analysis using fixed-point algorithm,” Pattern Recognition Letters, 28, pp. 1151–1155, 2007.

Examples:

decomposer = Rumale::Decomposition::PCA.new(n_components: 2)
representation = decomposer.fit_transform(samples)

# If Numo::Linalg is installed, you can specify 'evd' for the solver option.
require 'numo/linalg/autoloader'
decomposer = Rumale::Decomposition::PCA.new(n_components: 2, solver: 'evd')
representation = decomposer.fit_transform(samples)

Instance Attribute Summary

Attributes included from Base::BaseEstimator

#params

Constructor Details

#initialize(n_components: 2, solver: 'fpt', max_iter: 100, tol: 1.0e-4, random_seed: nil) ⇒ PCA

Create a new transformer with PCA.

Parameters:

  • n_components (Integer) (defaults to: 2)

    The number of principal components.

  • solver (String) (defaults to: 'fpt')

    The algorithm used for the optimization (‘fpt’ or ‘evd’). ‘fpt’ uses the fixed-point algorithm. ‘evd’ performs eigenvalue decomposition of the covariance matrix of the samples. Any value other than ‘evd’ is treated as ‘fpt’.

  • max_iter (Integer) (defaults to: 100)

    The maximum number of iterations. If solver = ‘evd’, this parameter is ignored.

  • tol (Float) (defaults to: 1.0e-4)

    The tolerance of termination criterion. If solver = ‘evd’, this parameter is ignored.

  • random_seed (Integer) (defaults to: nil)

    The seed value used to initialize the random generator.



# File 'lib/rumale/decomposition/pca.rb', line 46

def initialize(n_components: 2, solver: 'fpt', max_iter: 100, tol: 1.0e-4, random_seed: nil)
  check_params_integer(n_components: n_components, max_iter: max_iter)
  check_params_string(solver: solver)
  check_params_float(tol: tol)
  check_params_type_or_nil(Integer, random_seed: random_seed)
  check_params_positive(n_components: n_components, max_iter: max_iter, tol: tol)
  @params = {}
  @params[:solver] = solver != 'evd' ? 'fpt' : 'evd'
  @params[:n_components] = n_components
  @params[:max_iter] = max_iter
  @params[:tol] = tol
  @params[:random_seed] = random_seed
  @params[:random_seed] ||= srand
  @components = nil
  @mean = nil
  @rng = Random.new(@params[:random_seed])
end
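
The solver and seed handling in the constructor can be illustrated in isolation. The following is a minimal sketch in plain Ruby, not part of Rumale: `normalize_solver` is a hypothetical helper that copies the solver expression from the listing above, and the seed value 42 is arbitrary.

```ruby
# Hypothetical helper that mirrors the solver line in the constructor listing.
def normalize_solver(solver)
  solver != 'evd' ? 'fpt' : 'evd'
end

# Only the exact string 'evd' is kept; every other value falls back to 'fpt'.
solvers = %w[evd fpt EVD svd].map { |s| [s, normalize_solver(s)] }.to_h

# Two generators built from the same seed produce the same stream, which is
# why passing random_seed makes the 'fpt' initialization repeatable.
first = Random.new(42)
second = Random.new(42)
same_stream = Array.new(3) { first.rand } == Array.new(3) { second.rand }
```

Note that the comparison is case-sensitive, so a value such as 'EVD' silently selects the fixed-point solver.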

Instance Attribute Details

#components ⇒ Numo::DFloat (readonly)

Returns the principal components.

Returns:

  • (Numo::DFloat)

    (shape: [n_components, n_features])



# File 'lib/rumale/decomposition/pca.rb', line 28

def components
  @components
end

#mean ⇒ Numo::DFloat (readonly)

Returns the mean vector.

Returns:

  • (Numo::DFloat)

    (shape: [n_features])



# File 'lib/rumale/decomposition/pca.rb', line 32

def mean
  @mean
end

#rng ⇒ Random (readonly)

Returns the random generator.

Returns:

  • (Random)


# File 'lib/rumale/decomposition/pca.rb', line 36

def rng
  @rng
end

Instance Method Details

#fit(x) ⇒ PCA

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

Returns:

  • (PCA)

    The learned transformer itself.



# File 'lib/rumale/decomposition/pca.rb', line 70

def fit(x, _y = nil)
  check_sample_array(x)
  # initialize some variables.
  @components = nil
  n_samples, n_features = x.shape
  sub_rng = @rng.dup
  # centering.
  @mean = x.mean(0)
  centered_x = x - @mean
  # optimization.
  covariance_mat = centered_x.transpose.dot(centered_x) / (n_samples - 1)
  if @params[:solver] == 'evd' && enable_linalg?
    _, evecs = Numo::Linalg.eigh(covariance_mat, vals_range: (n_features - @params[:n_components])...n_features)
    comps = evecs.reverse(1).transpose
    @components = @params[:n_components] == 1 ? comps[0, true].dup : comps.dup
  else
    @params[:n_components].times do
      comp_vec = Rumale::Utils.rand_uniform(n_features, sub_rng)
      @params[:max_iter].times do
        updated = orthogonalize(covariance_mat.dot(comp_vec))
        break if (updated.dot(comp_vec) - 1).abs < @params[:tol]
        comp_vec = updated
      end
      @components = @components.nil? ? comp_vec : Numo::NArray.vstack([@components, comp_vec])
    end
  end
  self
end
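
The 'fpt' branch above can be followed more easily without Numo. The sketch below is an illustration under stated assumptions, not the library's implementation: plain Ruby Arrays stand in for Numo::DFloat, the names `dot`, `mat_vec`, `covariance` and `fixed_point_pca` are hypothetical, and `orthogonalize` is assumed to subtract the projections onto the components already found and then normalize to unit length, since that helper is not shown in the listing.

```ruby
# Vectors are Arrays of Floats; matrices are Arrays of row Arrays.
def dot(a, b)
  a.zip(b).sum { |p, q| p * q }
end

def mat_vec(m, v)
  m.map { |row| dot(row, v) }
end

# Sample covariance of the centered data, divided by (n_samples - 1).
def covariance(x)
  n = x.size
  mean = x.transpose.map { |col| col.sum / n }
  cols = x.map { |row| row.zip(mean).map { |v, m| v - m } }.transpose
  cols.map { |ci| cols.map { |cj| dot(ci, cj) / (n - 1) } }
end

# ASSUMPTION: stands in for the unlisted private helper. Removes the part of
# vec lying along each found component, then scales vec to unit length.
def orthogonalize(vec, found)
  found.each do |c|
    proj = dot(c, vec)
    vec = vec.zip(c).map { |v, ci| v - proj * ci }
  end
  norm = Math.sqrt(dot(vec, vec))
  vec.map { |v| v / norm }
end

def fixed_point_pca(cov, n_components, max_iter: 100, tol: 1.0e-4, rng: Random.new(1))
  found = []
  n_components.times do
    comp = Array.new(cov.size) { rng.rand } # random start, as in the listing
    max_iter.times do
      updated = orthogonalize(mat_vec(cov, comp), found)
      # Stop once two successive unit vectors are nearly parallel.
      break if (dot(updated, comp) - 1).abs < tol
      comp = updated
    end
    found << comp
  end
  found
end

# Toy data with most of its variance along the first axis.
samples = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
components = fixed_point_pca(covariance(samples), 2)
```

For this toy data the first component ends up close to the first axis and the second close to the second axis, up to sign and within the tolerance `tol`.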

#fit_transform(x) ⇒ Numo::DFloat

Fit the model with training data, and then transform them with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The transformed data.



# File 'lib/rumale/decomposition/pca.rb', line 105

def fit_transform(x, _y = nil)
  check_sample_array(x)
  fit(x).transform(x)
end

#inverse_transform(z) ⇒ Numo::DFloat

Inverse transform the given transformed data with the learned model.

Parameters:

  • z (Numo::DFloat)

    (shape: [n_samples, n_components]) The data to be restored into original space with the learned model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_features]) The restored data.



# File 'lib/rumale/decomposition/pca.rb', line 123

def inverse_transform(z)
  check_sample_array(z)
  c = @components.shape[1].nil? ? @components.expand_dims(0) : @components
  z.dot(c) + @mean
end
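
The relationship between projecting and restoring can be shown with a small round trip. This is a sketch in plain Ruby Arrays rather than Numo::DFloat; the `mean` and `components` values are made up for illustration (a single unit-length component), and the two top-level methods only mimic the arithmetic of the listings on this page.

```ruby
# Hypothetical fitted state: a mean vector and one unit-length component.
mean = [1.0, 2.0]
components = [[0.6, 0.8]]

# Project: subtract the mean, then take the dot product with each component.
def transform(x, mean, components)
  x.map do |row|
    centered = row.zip(mean).map { |v, m| v - m }
    components.map { |c| c.zip(centered).sum { |ci, v| ci * v } }
  end
end

# Restore: weighted sum of the components, then add the mean back.
def inverse_transform(z, mean, components)
  z.map do |row|
    mean.each_index.map do |j|
      mean[j] + row.each_index.sum { |k| row[k] * components[k][j] }
    end
  end
end

x = [[1.6, 2.8],  # lies on the component direction through the mean
     [1.8, 1.4]]  # lies perpendicular to it
z = transform(x, mean, components)
restored = inverse_transform(z, mean, components)
```

The first sample is recovered up to rounding, while the second collapses onto the mean, because only the part of a sample that lies along the retained components survives the round trip.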

#marshal_dump ⇒ Hash

Dump marshal data.

Returns:

  • (Hash)

    The marshal data.



# File 'lib/rumale/decomposition/pca.rb', line 131

def marshal_dump
  { params: @params,
    components: @components,
    mean: @mean,
    rng: @rng }
end

#marshal_load(obj) ⇒ nil

Load marshal data.

Returns:

  • (nil)


# File 'lib/rumale/decomposition/pca.rb', line 140

def marshal_load(obj)
  @params = obj[:params]
  @components = obj[:components]
  @mean = obj[:mean]
  @rng = obj[:rng]
  nil
end
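
Ruby's `Marshal` module calls `marshal_dump` when serializing an object and `marshal_load` when restoring it, which is how a fitted model can be saved and reloaded. The sketch below uses a hypothetical `TinyModel` class in place of PCA so that it runs without Rumale or Numo; the hooks follow the same hash-based pattern as the listings above.

```ruby
# Hypothetical stand-in for a fitted estimator.
class TinyModel
  attr_reader :params, :mean

  def initialize(mean)
    @params = { n_components: 1 }
    @mean = mean
  end

  # Called by Marshal.dump: return the state to be serialized.
  def marshal_dump
    { params: @params, mean: @mean }
  end

  # Called by Marshal.load on a freshly allocated object (initialize is skipped).
  def marshal_load(obj)
    @params = obj[:params]
    @mean = obj[:mean]
    nil
  end
end

model = TinyModel.new([1.0, 2.0])
bytes = Marshal.dump(model)      # a binary String; could be written to a file
restored = Marshal.load(bytes)   # a new object with the same state
```

As with any use of `Marshal.load`, the serialized data should come from a trusted source.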

#transform(x) ⇒ Numo::DFloat

Transform the given data with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The data to be transformed with the learned model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The transformed data.



# File 'lib/rumale/decomposition/pca.rb', line 114

def transform(x)
  check_sample_array(x)
  (x - @mean).dot(@components.transpose)
end