Module: ClusterKit::Utils
- Defined in: lib/clusterkit/utils.rb
Overview
Utility functions for data analysis: hubness estimation, intrinsic dimension estimation, and neighborhood stability.
Class Method Summary
- .estimate_hubness(data) ⇒ Hash
  Estimate hubness in the data.
- .estimate_intrinsic_dimension(data, k_neighbors: 10) ⇒ Float
  Estimate the intrinsic dimension of data.
- .neighborhood_stability(original_data, embedded_data, k: 15) ⇒ Float
  Measure neighborhood stability through embedding (not yet implemented; currently raises NotImplementedError).
Class Method Details
.estimate_hubness(data) ⇒ Hash
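Estimate hubness in the data. Raises ArgumentError unless data is an Array. The result of the native call is returned as a Hash with symbolized keys.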
# File 'lib/clusterkit/utils.rb', line 22

def estimate_hubness(data)
  raise ArgumentError, "Unsupported data type: #{data.class}" unless data.is_a?(Array)

  result = estimate_hubness_rust(data)
  symbolize_keys(result)
end
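The listing above passes the native result through symbolize_keys, a helper whose body is not shown on this page. As a minimal sketch, assuming it only converts top-level String keys to Symbols, it could look like the following. The key names in the sample hash are made up for illustration; the keys actually returned by estimate_hubness_rust are not documented here.

```ruby
# Hypothetical stand-in for the helper called in the listing above.
# Assumption: it converts top-level String keys to Symbols and leaves
# values untouched. The real helper may behave differently.
def symbolize_keys(hash)
  hash.transform_keys(&:to_sym)
end

# Made-up result shape, used only to exercise the helper.
raw = { "example_metric" => 0.5, "example_count" => 3 }
result = symbolize_keys(raw)

puts result.keys.inspect
```

Under this assumption, callers would read values with Symbol keys (for example result[:example_metric]) rather than String keys.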
.estimate_intrinsic_dimension(data, k_neighbors: 10) ⇒ Float
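Estimate the intrinsic dimension of data. Raises ArgumentError unless data is an Array. k_neighbors (default 10) is passed through to the native implementation.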
# File 'lib/clusterkit/utils.rb', line 13

def estimate_intrinsic_dimension(data, k_neighbors: 10)
  raise ArgumentError, "Unsupported data type: #{data.class}" unless data.is_a?(Array)

  estimate_intrinsic_dimension_rust(data, k_neighbors)
end
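To illustrate the calling convention and the guard clause without loading the native extension, here is a small stand-in. The module name UtilsSketch is hypothetical, and the native call is replaced by a placeholder that simply echoes what would be forwarded, so the return value is not a real dimension estimate.

```ruby
# Hypothetical stand-in that mirrors the wrapper's signature and guard clause.
module UtilsSketch
  def self.estimate_intrinsic_dimension(data, k_neighbors: 10)
    raise ArgumentError, "Unsupported data type: #{data.class}" unless data.is_a?(Array)

    # Placeholder for estimate_intrinsic_dimension_rust(data, k_neighbors):
    # returns the number of rows and the k that would be forwarded.
    [data.length, k_neighbors]
  end
end

points = [[1.0, 2.0], [3.0, 4.0]]

default_call = UtilsSketch.estimate_intrinsic_dimension(points)
custom_call  = UtilsSketch.estimate_intrinsic_dimension(points, k_neighbors: 5)

# A non-Array argument is rejected before anything reaches native code.
error_message =
  begin
    UtilsSketch.estimate_intrinsic_dimension("1.0,2.0")
  rescue ArgumentError => e
    e.message
  end

puts error_message
```

Note that the Ruby-level guard shown in the listing only checks that the outermost object is an Array; whether the native layer validates the nested rows further is not visible from this page.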
.neighborhood_stability(original_data, embedded_data, k: 15) ⇒ Float
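Measure neighborhood stability through embedding. Not yet implemented: after checking that both original_data and embedded_data are Arrays, the method raises NotImplementedError.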
# File 'lib/clusterkit/utils.rb', line 34

def neighborhood_stability(original_data, embedded_data, k: 15)
  raise ArgumentError, "Unsupported data type: #{original_data.class}" unless original_data.is_a?(Array)
  raise ArgumentError, "Unsupported data type: #{embedded_data.class}" unless embedded_data.is_a?(Array)

  # TODO: Implement neighborhood stability calculation
  raise NotImplementedError, "Neighborhood stability not implemented yet"
end
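Because the method currently raises NotImplementedError, callers that want to degrade gracefully need to rescue that class by name. In Ruby, NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue does not catch it. The sketch below uses a hypothetical stand-in (neighborhood_stability_stub) that copies the body from the listing above, so it runs without the gem.

```ruby
# Hypothetical stand-in copying the body of the listing above.
def neighborhood_stability_stub(original_data, embedded_data, k: 15)
  raise ArgumentError, "Unsupported data type: #{original_data.class}" unless original_data.is_a?(Array)
  raise ArgumentError, "Unsupported data type: #{embedded_data.class}" unless embedded_data.is_a?(Array)

  raise NotImplementedError, "Neighborhood stability not implemented yet"
end

# NotImplementedError is not a StandardError, so it must be rescued explicitly.
is_standard_error = NotImplementedError.ancestors.include?(StandardError)

outcome =
  begin
    neighborhood_stability_stub([[0.0, 1.0]], [[0.5]])
  rescue NotImplementedError => e
    "unavailable: #{e.message}"
  end

puts outcome
```

Argument validation still runs first, so passing a non-Array raises ArgumentError before the NotImplementedError is reached.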