Class: MachineLearningWorkbench::Compressor::VectorQuantization
- Inherits: Object
  - Object
    - MachineLearningWorkbench::Compressor::VectorQuantization
- Defined in: lib/machine_learning_workbench/compressor/vector_quantization.rb
Overview
Standard Vector Quantization
Direct Known Subclasses
Constant Summary
- Verification = MachineLearningWorkbench::Tools::Verification
Instance Attribute Summary
- #centrs ⇒ Object (readonly): Returns the value of attribute centrs.
- #dims ⇒ Object (readonly): Returns the value of attribute dims.
- #dtype ⇒ Object (readonly): Returns the value of attribute dtype.
- #lrate ⇒ Object (readonly): Returns the value of attribute lrate.
- #ncentrs ⇒ Object (readonly): Returns the value of attribute ncentrs.
- #ntrains ⇒ Object (readonly): Returns the value of attribute ntrains.
- #rng ⇒ Object (readonly): Returns the value of attribute rng.
- #vrange ⇒ Object (readonly): Returns the value of attribute vrange.
Instance Method Summary
- #check_lrate(lrate) ⇒ Object: Verifies that lrate is present and within unit bounds. Kept as a separate method only so that it can be overridden in online_vq.
- #encode(vec, type: :most_similar) ⇒ Object: Encodes a vector.
- #initialize(ncentrs:, dims:, vrange:, dtype:, lrate:, rseed: Random.new_seed) ⇒ VectorQuantization (constructor): Returns a new instance of VectorQuantization.
- #most_similar_centr(vec) ⇒ Array<Integer, Float>: Returns the index and similarity of the centroid most similar to the vector.
- #new_centr ⇒ Object: Creates a new (random) centroid.
- #reconstr_error(vec) ⇒ NMatrix: Per-pixel errors in reconstructing the vector.
- #reconstruction(code, type: :most_similar) ⇒ Object: Reconstructs a vector from its code (encoding).
- #similarities(vec) ⇒ Object: Computes the similarities between the vector and all centroids.
- #train(vec_lst, debug: false) ⇒ Object: Trains on a list of vectors.
- #train_one(vec) ⇒ Integer: Trains on one vector.
Constructor Details
#initialize(ncentrs:, dims:, vrange:, dtype:, lrate:, rseed: Random.new_seed) ⇒ VectorQuantization
Returns a new instance of VectorQuantization.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 8
    def initialize ncentrs:, dims:, vrange:, dtype:, lrate:, rseed: Random.new_seed
      @rng = Random.new rseed
      @ncentrs = ncentrs
      @dtype = dtype
      @dims = Array(dims)
      check_lrate lrate # hack: so that we can overload it in online_vq
      @lrate = lrate
      @vrange = case vrange
        when Array
          raise ArgumentError, "vrange size not 2: #{vrange}" unless vrange.size == 2
          vrange.map &method(:Float)
        when Range
          [vrange.first, vrange.last].map &method(:Float)
        else
          raise ArgumentError, "vrange: unrecognized type: #{vrange.class}"
      end
      @centrs = ncentrs.times.map { new_centr }
      @ntrains = [0]*ncentrs # useful to understand what happens
    end
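The constructor accepts `vrange` either as a two-element Array or as a Range and stores it as a pair of Floats. The sketch below isolates that normalization so the accepted inputs can be tried without NMatrix; `normalize_vrange` is a hypothetical name used only for illustration and is not part of the library.

```ruby
# Hypothetical stand-alone helper mirroring the constructor's `vrange` handling.
def normalize_vrange(vrange)
  case vrange
  when Array
    raise ArgumentError, "vrange size not 2: #{vrange}" unless vrange.size == 2
    vrange.map { |v| Float(v) }
  when Range
    [vrange.first, vrange.last].map { |v| Float(v) }
  else
    raise ArgumentError, "vrange: unrecognized type: #{vrange.class}"
  end
end

bounds_from_array = normalize_vrange([-1, 1]) # two explicit bounds
bounds_from_range = normalize_vrange(0..255)  # a Range works as well
```

Any other input type, or an Array that does not hold exactly two elements, raises ArgumentError.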
Instance Attribute Details
#centrs ⇒ Object (readonly)
Returns the value of attribute centrs.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def centrs
      @centrs
    end
#dims ⇒ Object (readonly)
Returns the value of attribute dims.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def dims
      @dims
    end
#dtype ⇒ Object (readonly)
Returns the value of attribute dtype.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def dtype
      @dtype
    end
#lrate ⇒ Object (readonly)
Returns the value of attribute lrate.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def lrate
      @lrate
    end
#ncentrs ⇒ Object (readonly)
Returns the value of attribute ncentrs.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def ncentrs
      @ncentrs
    end
#ntrains ⇒ Object (readonly)
Returns the value of attribute ntrains.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def ntrains
      @ntrains
    end
#rng ⇒ Object (readonly)
Returns the value of attribute rng.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def rng
      @rng
    end
#vrange ⇒ Object (readonly)
Returns the value of attribute vrange.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 5
    def vrange
      @vrange
    end
Instance Method Details
#check_lrate(lrate) ⇒ Object
Verifies that lrate is present and within unit bounds. Kept as a separate method only so that it can be overridden in online_vq.
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 29
    def check_lrate lrate
      raise ArgumentError, "Pass a `lrate` between 0 and 1" unless lrate&.between?(0,1)
    end
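As a small illustration of which values pass this check, the predicate below repeats the same `lrate&.between?(0,1)` test outside the class. `valid_lrate?` is a hypothetical name introduced here, not a library method.

```ruby
# Hypothetical predicate repeating the check from `check_lrate`.
def valid_lrate?(lrate)
  # `&.` short-circuits to nil when lrate is nil, so a missing rate fails too.
  # `between?` is inclusive, so both 0 and 1 are accepted.
  !!lrate&.between?(0, 1)
end

results = [nil, -0.1, 0, 0.3, 1, 1.5].map { |l| [l, valid_lrate?(l)] }
```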
#encode(vec, type: :most_similar) ⇒ Object
Encode a vector
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 49
    def encode vec, type: :most_similar
      simils = similarities vec
      case type
      when :most_similar
        simils.index simils.max
      when :ensemble
        simils
      when :ensemble_norm
        tot = simils.reduce(:+)
        simils.map { |s| s/tot }
      else raise ArgumentError, "unrecognized encode type: #{type}"
      end
    end
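To make the three encoding types concrete, here is a minimal sketch that applies the same branching to a plain Array of precomputed similarities. `encode_from_simils` is a hypothetical helper; in the class itself the similarities come from `similarities(vec)`.

```ruby
# Hypothetical helper: same branching as `encode`, on a plain Array of similarities.
def encode_from_simils(simils, type: :most_similar)
  case type
  when :most_similar
    simils.index(simils.max) # a single Integer: index of the best centroid
  when :ensemble
    simils                   # the raw similarities, one per centroid
  when :ensemble_norm
    tot = simils.sum
    simils.map { |s| s / tot } # similarities rescaled to sum to 1
  else
    raise ArgumentError, "unrecognized encode type: #{type}"
  end
end

simils = [1.0, 2.0, 1.0]
best_idx = encode_from_simils(simils)
raw_code = encode_from_simils(simils, type: :ensemble)
norm_code = encode_from_simils(simils, type: :ensemble_norm)
```

With these inputs the `:most_similar` code is the index of the second centroid, and the `:ensemble_norm` code sums to 1.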
#most_similar_centr(vec) ⇒ Array<Integer, Float>
Returns the index and similarity of the centroid most similar to the vector
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 80
    def most_similar_centr vec
      simils = similarities vec
      max_simil = simils.max
      max_idx = simils.index max_simil
      [max_idx, max_simil]
    end
#new_centr ⇒ Object
Creates a new (random) centroid
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 34
    def new_centr
      # TODO: this is too slow, find another way to use the rng
      # NMatrix.new(dims, dtype: dtype) { rng.rand Range.new *vrange }
      NMatrix.random dims, dtype: dtype
    end
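The commented-out line in the listing hints at the intended behavior: draw each component from the seeded `rng` within `vrange`. A minimal sketch of that idea follows, assuming a plain Array as a stand-in for an NMatrix centroid; `random_centr` is a hypothetical name and this is not the library's code path.

```ruby
# Hypothetical plain-Array stand-in for a centroid drawn from the seeded rng.
def random_centr(size, vrange, rng)
  range = Range.new(*vrange)          # e.g. [-1.0, 1.0] becomes -1.0..1.0
  Array.new(size) { rng.rand(range) } # one draw per component
end

centr_a = random_centr(4, [-1.0, 1.0], Random.new(42))
centr_b = random_centr(4, [-1.0, 1.0], Random.new(42))
```

Because both generators share a seed, `centr_a` and `centr_b` are identical, which is the reproducibility the `rseed` argument is there to provide.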
#reconstr_error(vec) ⇒ NMatrix
Per-pixel errors in reconstructing vector
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 89
    def reconstr_error vec
      # encode first: `reconstruction` expects a code, not the raw vector
      reconstruction(encode(vec)) - vec
    end
#reconstruction(code, type: :most_similar) ⇒ Object
Reconstruct vector from its code (encoding)
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 64
    def reconstruction code, type: :most_similar
      case type
      when :most_similar
        centrs[code]
      when :ensemble
        tot = code.reduce :+
        centrs.zip(code).map { |centr, contr| centr*contr/tot }.reduce :+
      when :ensemble_norm
        centrs.zip(code).map { |centr, contr| centr*contr }.reduce :+
      else raise ArgumentError, "unrecognized reconstruction type: #{type}"
      end
    end
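The reconstruction types mirror the encoding types: `:most_similar` looks up one centroid, while the ensemble variants return a weighted sum of all centroids. The sketch below reproduces this with plain Arrays standing in for NMatrix centroids. `weighted_sum` and `reconstruct` are hypothetical names, and the `:ensemble` branch normalizes the weights before summing, which is arithmetically equivalent to dividing each term by the total.

```ruby
# Hypothetical helper: sum of centroids, each scaled by its weight.
def weighted_sum(centrs, weights)
  centrs.zip(weights)
        .map { |centr, w| centr.map { |x| x * w } }
        .reduce { |a, b| a.zip(b).map { |x, y| x + y } }
end

# Hypothetical plain-Array version of `reconstruction`.
def reconstruct(centrs, code, type: :most_similar)
  case type
  when :most_similar
    centrs[code]
  when :ensemble
    tot = code.sum
    weighted_sum(centrs, code.map { |c| c / tot })
  when :ensemble_norm
    weighted_sum(centrs, code) # code is assumed to sum to 1 already
  else
    raise ArgumentError, "unrecognized reconstruction type: #{type}"
  end
end

centrs = [[0.0, 0.0], [4.0, 8.0]]
from_index = reconstruct(centrs, 1)
from_ensemble = reconstruct(centrs, [1.0, 3.0], type: :ensemble)
from_norm = reconstruct(centrs, [0.25, 0.75], type: :ensemble_norm)
```

Here the two ensemble calls describe the same mixture, so they reconstruct the same vector.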
#similarities(vec) ⇒ Object
Computes similarities between vector and all centroids
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 41
    def similarities vec
      raise NotImplementedError if vec.shape.size > 1
      centrs.map { |c| c.dot(vec).first }
      # require 'parallel'
      # Parallel.map(centrs) { |c| c.dot(vec).first }
    end
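The similarity used here is the dot product between each centroid and the input vector. As a rough illustration with plain Arrays in place of NMatrix objects (`dot` is a hypothetical helper, not the NMatrix method):

```ruby
# Hypothetical plain-Array dot product.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

centrs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
vec = [2.0, 4.0]
simils = centrs.map { |c| dot(c, vec) } # one similarity per centroid
```

Note that a dot product is a similarity rather than a distance: larger values mean a closer match, which is why `encode` and `most_similar_centr` take the maximum.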
#train(vec_lst, debug: false) ⇒ Object
Train on vector list
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 106
    def train vec_lst, debug: false
      # Two ways here:
      # - Batch: canonical, centrs updated with each vec
      # - Parallel: could be parallel either on simils or on training (?)
      # Unsure on the correctness of either Parallel, let's stick with Batch
      vec_lst.each_with_index do |vec, i|
        trained_idx = train_one vec
        print '.' if debug
        ntrains[trained_idx] += 1
      end
    end
#train_one(vec) ⇒ Integer
Train on one vector
    # File 'lib/machine_learning_workbench/compressor/vector_quantization.rb', line 95
    def train_one vec
      trg_idx, _simil = most_similar_centr(vec)
      # note: uhm that actually looks like a dot product... optimizable?
      # `[c[i], vec].dot([1-lrate, lrate])`
      centrs[trg_idx] = centrs[trg_idx] * (1-lrate) + vec * lrate
      # Verification.in_range! centrs[trg_idx], vrange
      # I verified it's not needed
      trg_idx
    end
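The update moves only the winning centroid toward the input, as a convex combination weighted by `lrate`. A minimal sketch of one such step, assuming plain Arrays instead of NMatrix and a hypothetical free function `train_one_step` that takes the centroids and learning rate as arguments:

```ruby
# Hypothetical plain-Array version of a single training step.
def train_one_step(centrs, vec, lrate)
  # Similarity of each centroid to the input (dot product).
  simils = centrs.map { |c| c.zip(vec).sum { |x, y| x * y } }
  idx = simils.index(simils.max)
  # Move only the winner: new = old * (1 - lrate) + vec * lrate.
  centrs[idx] = centrs[idx].zip(vec).map { |c, v| c * (1 - lrate) + v * lrate }
  idx
end

centrs = [[1.0, 0.0], [0.0, 2.0]]
trained_idx = train_one_step(centrs, [1.0, 4.0], 0.5)
```

With `lrate` at 0.5 the winner lands halfway between its old position and the input; smaller rates give smaller steps, and the other centroids are left untouched.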