Class: OpenTox::Transform::SVD

Inherits:
Object
Defined in:
lib/transform.rb

Overview

Singular Value Decomposition

Constructor Details

#initialize(data_matrix, compression = 0.05) ⇒ GSL::Matrix

Computes the SVD of the data matrix and stores the transformed dataset as a GSL::Matrix (see data_transformed_matrix).

Parameters:

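  • data_matrix (GSL::Matrix)

    Data matrix.

  • compression (Float) (defaults to: 0.05)

    Compression ratio in [0,1]: the share of the total squared singular values that may be discarded.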



# File 'lib/transform.rb', line 227

def initialize data_matrix, compression=0.05
  begin
    @data_matrix = data_matrix.clone
    @compression = compression

    # Compute the singular value decomposition X = U * S * V^T
    # vt is *not* the transpose of V here, but V itself (see http://goo.gl/mm2xz)!
    u, vt, s = data_matrix.SV_decomp 
    
    # Determine cutoff index
    s2 = s.mul(s) ; s2_sum = s2.sum
    s2_run = 0
    k = s2.size - 1
    s2.to_a.reverse.each { |v| 
      s2_run += v
      frac = s2_run / s2_sum
      break if frac > compression
      k -= 1
    }
    k += 1 if k == 0 # avoid uni-dimensional (always cos sim of 1)
    
    # Take the low-rank approximation of the matrix (indices 0..k, i.e. k+1 components)
    #   - Take columns 0..k of u
    #   - Take columns 0..k of vt
    #   - Take singular values 0..k (stored as eigk below)
    @uk = u.submatrix(nil, (0..k)) # used to transform column format data
    @vk = vt.submatrix(nil, (0..k)) # used to transform row format data
    s = GSL::Matrix.diagonal(s)
    @eigk = s.submatrix((0..k), (0..k))
    @eigk_inv = @eigk.inv

    # Transform data
    @data_transformed_matrix = @uk # = u for all SVs
    # NOTE: @data_transformed_matrix is also equal to
    # @data_matrix * @vk * @eigk_inv

  rescue Exception => e
    LOGGER.debug "#{e.class}: #{e.message}"
    LOGGER.debug "Backtrace:\n\t#{e.backtrace.join("\n\t")}"
  end
end
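
The cutoff search in the constructor can be read in isolation. The following is a minimal sketch that uses plain Ruby arrays instead of GSL vectors; `cutoff_index` is a hypothetical helper name, not part of OpenTox. It mirrors the loop above: squared singular values are accumulated from the smallest upwards, and the index stops decreasing as soon as the accumulated share exceeds the compression ratio.

```ruby
# Hypothetical helper (not part of OpenTox): returns the index k of the last
# singular value to keep, following the loop in SVD#initialize.
def cutoff_index(singular_values, compression = 0.05)
  s2 = singular_values.map { |v| v * v }
  s2_sum = s2.sum
  s2_run = 0.0
  k = s2.size - 1
  # Walk from the smallest singular value upwards.
  s2.reverse.each do |v|
    s2_run += v
    break if s2_run / s2_sum > compression
    k -= 1
  end
  k += 1 if k == 0 # keep at least two dimensions, as the constructor does
  k
end

k = cutoff_index([3.0, 2.0, 1.0, 0.1], 0.05)
puts k
```

With these values the smallest squared singular value contributes about 0.07% of the total and is dropped; adding the next one would bring the discarded share to about 7.2%, above the 5% limit, so k is 2 and three components (indices 0..2) are kept.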

Instance Attribute Details

#compression ⇒ Object

Returns the value of attribute compression.



# File 'lib/transform.rb', line 219

def compression
  @compression
end

#data_matrix ⇒ Object

Returns the value of attribute data_matrix.



# File 'lib/transform.rb', line 219

def data_matrix
  @data_matrix
end

#data_transformed_matrix ⇒ Object

Returns the value of attribute data_transformed_matrix.



# File 'lib/transform.rb', line 219

def data_transformed_matrix
  @data_transformed_matrix
end

#eigk ⇒ Object

Returns the value of attribute eigk.



# File 'lib/transform.rb', line 219

def eigk
  @eigk
end

#eigk_inv ⇒ Object

Returns the value of attribute eigk_inv.



# File 'lib/transform.rb', line 219

def eigk_inv
  @eigk_inv
end

#uk ⇒ Object

Returns the value of attribute uk.



# File 'lib/transform.rb', line 219

def uk
  @uk
end

#vk ⇒ Object

Returns the value of attribute vk.



# File 'lib/transform.rb', line 219

def vk
  @vk
end

Instance Method Details

#restore ⇒ GSL::Matrix

Restores data in the original feature space (possibly with compression loss).

Takes no arguments; operates on the stored transformed data matrix (data_transformed_matrix, a GSL::Matrix).

Returns:

  • (GSL::Matrix)

    Data matrix.



# File 'lib/transform.rb', line 302

def restore
  begin 
    @data_transformed_matrix * @eigk * @vk.transpose  # reverse svd
  rescue Exception => e
    LOGGER.debug "#{e.class}: #{e.message}"
    LOGGER.debug "Backtrace:\n\t#{e.backtrace.join("\n\t")}"
  end
end
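
To illustrate what "possibly with compression loss" means, here is a minimal sketch with nested Ruby arrays standing in for GSL::Matrix. The `matmul` helper is hypothetical, and the factors are written by hand for a matrix whose decomposition is obvious; they are not computed by GSL.

```ruby
# Hypothetical helper (not part of OpenTox): matrix product on nested arrays.
def matmul(a, b)
  cols = b.transpose
  a.map { |row| cols.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

# Hand-written factors of X = U * S * V^T for X = [[4, 0], [0, 2], [0, 0]].
u = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
s = [[4.0, 0.0], [0.0, 2.0]]
v = [[1.0, 0.0], [0.0, 1.0]]

# Keeping every singular value reproduces X exactly.
full = matmul(matmul(u, s), v.transpose)

# Keeping only the first singular value loses the second column's content.
u1 = u.map { |row| row.first(1) }
s1 = [[4.0]]
v1 = v.map { |row| row.first(1) }
lossy = matmul(matmul(u1, s1), v1.transpose)

p full
p lossy
```

The product has the same shape in both cases; only the truncated factors differ, which is the same form as the expression in restore.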

#transform_feature(values) ⇒ GSL::Matrix

Transforms data feature (1 column) to feature space found by SVD.

Parameters:

  • values (GSL::Matrix)

    Data matrix (1 x n).

Returns:

  • (GSL::Matrix)

    Transformed data matrix.



# File 'lib/transform.rb', line 288

def transform_feature values
  begin
    values * @uk * @eigk_inv
  rescue Exception => e
    LOGGER.debug "#{e.class}: #{e.message}"
    LOGGER.debug "Backtrace:\n\t#{e.backtrace.join("\n\t")}"
  end
end
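
As a small check of the column-format projection, the sketch below passes one column of a hand-picked matrix, written as a 1 x n row, through the same product as transform_feature. Nested arrays stand in for GSL::Matrix, `matmul` is a hypothetical helper, and the factors are assumed rather than computed.

```ruby
# Hypothetical helper (not part of OpenTox): matrix product on nested arrays.
def matmul(a, b)
  cols = b.transpose
  a.map { |row| cols.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

# Hand-written factors for X = [[4, 0], [0, 2], [0, 0]] (3 instances, 2 features).
u     = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
s_inv = [[0.25, 0.0], [0.0, 0.5]]

# First feature of X (its first column), laid out as a 1 x 3 matrix.
values = [[4.0, 0.0, 0.0]]

# Same product as transform_feature: values * uk * eigk_inv.
feature = matmul(matmul(values, u), s_inv)
p feature
```

For this matrix the result is the first row of V, the counterpart of the row-format case, where a projected instance yields a row of U.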

#transform_instance(values) ⇒ GSL::Matrix Also known as: transform

Transforms data instance (1 row) to feature space found by SVD.

Parameters:

  • values (GSL::Matrix)

    Data matrix (1 x m).

Returns:

  • (GSL::Matrix)

    Transformed data matrix.



# File 'lib/transform.rb', line 274

def transform_instance values
  begin
    values * @vk * @eigk_inv
  rescue Exception => e
    LOGGER.debug "#{e.class}: #{e.message}"
    LOGGER.debug "Backtrace:\n\t#{e.backtrace.join("\n\t")}"
  end
end
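
The constructor notes that data_transformed_matrix equals data_matrix * vk * eigk_inv. The sketch below checks that identity by hand on a matrix whose factors are obvious. Nested arrays stand in for GSL::Matrix, `matmul` is a hypothetical helper, and the factors are assumed rather than computed.

```ruby
# Hypothetical helper (not part of OpenTox): matrix product on nested arrays.
def matmul(a, b)
  cols = b.transpose
  a.map { |row| cols.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

# Hand-written factors of X = U * S * V^T.
x     = [[4.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
u     = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
v     = [[1.0, 0.0], [0.0, 1.0]]
s_inv = [[0.25, 0.0], [0.0, 0.5]]

# Project all rows at once; transform_instance applies this to a single row.
projected = matmul(matmul(x, v), s_inv)

# One instance (1 x m), as transform_instance expects.
single = matmul(matmul([x[0]], v), s_inv)

p projected
p single
```

Here `projected` equals `u`, and `single` equals its first row, so a new instance lands in the same space as the stored transformed data.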