Module: ClusterKit::Preprocessing
Defined in: lib/clusterkit/preprocessing.rb
Overview
Data preprocessing utilities
Class Method Summary
- .normalize(data, method: :standard) ⇒ Array
  Normalize data using the specified method.
- .pca_reduce(data, n_components) ⇒ Array
  Reduce dimensionality using PCA before embedding.
Class Method Details
.normalize(data, method: :standard) ⇒ Array
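Normalize data using the specified method: :standard, :minmax, or :l2. Raises ArgumentError if data is not an Array or if the method is not one of these three.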
# File 'lib/clusterkit/preprocessing.rb', line 13

def normalize(data, method: :standard)
  raise ArgumentError, "Unsupported data type: #{data.class}" unless data.is_a?(Array)

  case method
  when :standard
    standard_normalize(data)
  when :minmax
    minmax_normalize(data)
  when :l2
    l2_normalize(data)
  else
    raise ArgumentError, "Unknown normalization method: #{method}"
  end
end
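The helpers that normalize dispatches to (standard_normalize, minmax_normalize, l2_normalize) are not shown on this page. As a rough illustration of what the three method names conventionally mean, here is a minimal pure-Ruby sketch. The NormalizeSketch module and its method names are hypothetical, and it assumes column-wise scaling for :standard and :minmax and row-wise scaling for :l2; the actual ClusterKit helpers may differ.

```ruby
# Hypothetical stand-ins for the private helpers that normalize dispatches to.
# The real ClusterKit implementations are not shown here and may differ.
module NormalizeSketch
  module_function

  # Column-wise z-score: subtract the mean, divide by the population std dev.
  def standard(data)
    cols = data.transpose.map do |col|
      mean = col.sum.to_f / col.size
      std = Math.sqrt(col.sum { |x| (x - mean)**2 } / col.size)
      std = 1.0 if std.zero? # avoid dividing by zero on constant columns
      col.map { |x| (x - mean) / std }
    end
    cols.transpose
  end

  # Column-wise rescaling into the range [0, 1].
  def minmax(data)
    cols = data.transpose.map do |col|
      min, max = col.minmax
      range = (max - min).to_f
      range = 1.0 if range.zero? # constant column maps to all zeros
      col.map { |x| (x - min) / range }
    end
    cols.transpose
  end

  # Row-wise scaling so that each row has unit Euclidean length.
  def l2(data)
    data.map do |row|
      norm = Math.sqrt(row.sum { |x| x.to_f**2 })
      norm = 1.0 if norm.zero? # leave all-zero rows unchanged
      row.map { |x| x / norm }
    end
  end
end

data = [[1.0, 10.0], [3.0, 30.0]]
standardized = NormalizeSketch.standard(data)
scaled = NormalizeSketch.minmax(data)
unit_rows = NormalizeSketch.l2([[3.0, 4.0]])
```

Whichever method is chosen, the input must be an Array of rows; per the source above, anything else raises ArgumentError before any scaling happens.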
.pca_reduce(data, n_components) ⇒ Array
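Reduce dimensionality using PCA before embedding. Not yet implemented: the method currently always raises NotImplementedError and directs callers to the Rust-based SVD module.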
# File 'lib/clusterkit/preprocessing.rb', line 32

def pca_reduce(data, n_components)
  # Note: This would require SVD implementation in pure Ruby
  # For now, raise an error suggesting to use the Rust-based SVD module
  raise NotImplementedError, "PCA reduction requires the SVD module which needs to be called directly"
end
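Because pca_reduce always raises, callers may want to guard the call. The sketch below uses a hypothetical PreprocessingStub module that mirrors the behavior shown above, so it runs without the gem installed. One detail worth noting: in Ruby, NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue clause does not catch it and the class has to be named explicitly.

```ruby
# Hypothetical stand-in that mirrors the documented behavior of pca_reduce,
# so the example is runnable on its own.
module PreprocessingStub
  def self.pca_reduce(data, n_components)
    raise NotImplementedError,
          "PCA reduction requires the SVD module which needs to be called directly"
  end
end

data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

result =
  begin
    PreprocessingStub.pca_reduce(data, 2)
  rescue NotImplementedError => e
    # A bare `rescue` would miss this error: NotImplementedError is a
    # ScriptError, not a StandardError, so it must be named explicitly.
    warn "Skipping PCA: #{e.message}"
    data # fall back to the unreduced data
  end
```

Falling back to the unreduced data is only one possible policy; a caller that depends on the reduced dimensionality would more likely let the error propagate.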