Class: DaruLite::Vector
- Extended by:
- Gem::Deprecate
- Includes:
- Maths::Arithmetic::Vector, Maths::Statistics::Vector, Aggregatable, Calculatable, Convertible, Duplicatable, Fetchable, Filterable, Indexable, Iterable, Joinable, Missable, Queryable, Setable, Sortable, Enumerable
- Defined in:
- lib/daru_lite/vector.rb,
lib/daru_lite/vector/setable.rb,
lib/daru_lite/vector/iterable.rb,
lib/daru_lite/vector/joinable.rb,
lib/daru_lite/vector/missable.rb,
lib/daru_lite/vector/sortable.rb,
lib/daru_lite/vector/fetchable.rb,
lib/daru_lite/vector/indexable.rb,
lib/daru_lite/vector/queryable.rb,
lib/daru_lite/vector/filterable.rb,
lib/daru_lite/vector/convertible.rb,
lib/daru_lite/vector/aggregatable.rb,
lib/daru_lite/vector/calculatable.rb,
lib/daru_lite/vector/duplicatable.rb
Defined Under Namespace
Modules: Aggregatable, Calculatable, Convertible, Duplicatable, Fetchable, Filterable, Indexable, Iterable, Joinable, Missable, Queryable, Setable, Sortable
Constant Summary
- DATE_REGEXP =
/^(\d{2}-\d{2}-\d{4}|\d{4}-\d{2}-\d{2})$/
Constants included from Sortable
Instance Attribute Summary
-
#data ⇒ Object
readonly
Store vector data in an array.
-
#dtype ⇒ Object
readonly
The underlying dtype of the Vector.
-
#index ⇒ Object
readonly
The row index.
-
#labels ⇒ Object
Store a hash of labels for values.
-
#missing_positions ⇒ Object
readonly
An Array of the positions in the vector that are being treated as ‘missing’.
-
#name ⇒ Object
readonly
The name of the DaruLite::Vector.
-
#nm_dtype ⇒ Object
readonly
Returns the value of attribute nm_dtype.
Class Method Summary
-
.[](*indexes) ⇒ Object
Create a vector using (almost) any object: an Array (flattened), a Range (transformed using to_a), a DaruLite::Vector, or Numeric and String values.
-
._load(data) ⇒ Object
:nodoc:.
- .coerce(data, options = {}) ⇒ Object
-
.new_with_size(n, opts = {}, &block) ⇒ Object
Create a new vector by specifying the size and an optional value and block to generate values.
Instance Method Summary
-
#==(other) ⇒ Object
Two vectors are equal if they have the exact same index values corresponding with the exact same elements.
-
#_dump ⇒ Object
:nodoc:.
-
#bootstrap(estimators, nr, s = nil) ⇒ Object
Bootstrap: generate nr resamples (with replacement) of size s from the vector, computing each estimate from estimators over each resample.
-
#cast(opts = {}) ⇒ Object
Cast a vector to a new data type.
-
#category? ⇒ true, false
Tells if vector is categorical or not.
-
#daru_lite_vector ⇒ Object
(also: #dv)
:nocov:.
-
#db_type ⇒ Object
Returns the database type for the vector, according to its content.
-
#delete(element) ⇒ Object
Delete an element by value.
-
#delete_at(index) ⇒ Object
Delete element by index.
-
#delete_at_position(position) ⇒ Object
Delete element by position.
-
#in(other) ⇒ Object
Comparator for checking if any of the elements in other exist in self.
-
#initialize(source, opts = {}) ⇒ Vector
constructor
Create a Vector object.
-
#inspect(spacing = 20, threshold = 15) ⇒ Object
Overrides the original inspect for pretty printing in irb.
-
#is_values(*values) ⇒ DaruLite::Vector
Return a vector of booleans in which the value at position i is true if the element at position i equals any of the values passed as arguments, and false otherwise.
-
#jackknife(estimators, k = 1) ⇒ Object
Jackknife: returns a dataset with jackknife delete-k estimators. estimators could be: a) a Hash with variable names as keys and lambdas as values, e.g. a.jackknife(:log_s2=>lambda {|v| Math.log(v.variance)}); b) an Array with method names to jackknife, e.g. a.jackknife([:mean, :sd]); c) a single method to jackknife, e.g. a.jackknife(:mean). k represents the block size for block jackknife.
-
#lag(k = 1) ⇒ DaruLite::Vector
Lags the series by `k` periods.
- #method_missing(name, *args) ⇒ Object
- #numeric? ⇒ Boolean
- #object? ⇒ Boolean
-
#rename(new_name) ⇒ Object
(also: #name=)
Give the vector a new name.
- #respond_to_missing?(name, include_private = false) ⇒ Boolean
-
#save(filename) ⇒ Object
Save the vector to a file.
- #size ⇒ Object
-
#splitted(sep = ',') ⇒ Object
Return an Array with the data split by a separator.
-
#to_category(opts = {}) ⇒ DaruLite::Vector
Converts a non category type vector to category type vector.
-
#type ⇒ Object
The type of data contained in the vector.
Methods included from Queryable
#all?, #any?, #empty?, #include_values?, #match
Methods included from Sortable
#reorder, #reorder!, #sort, #sort_by_index, #sorted_data
Methods included from Setable
Methods included from Missable
#has_missing_data?, #n_valid, #only_missing, #only_valid, #replace_nils, #replace_nils!, #rolling_fillna, #rolling_fillna!
Methods included from Joinable
Methods included from Iterable
#apply_method, #each, #each_index, #each_with_index, #map!, #recode, #recode!, #replace_values, #verify
Methods included from Indexable
#detach_index, #has_index?, #index=, #index_of, #indexes, #reindex, #reindex!, #reset_index!
Methods included from Filterable
#apply_where, #delete_if, #keep_if, #only_numerics, #reject_values, #uniq, #where
Methods included from Fetchable
#[], #at, #cut, #get_sub_vector, #head, #last, #positions, #split_by_separator, #split_by_separator_freq, #tail
Methods included from Duplicatable
Methods included from Convertible
#to_a, #to_df, #to_h, #to_html, #to_html_tbody, #to_html_thead, #to_json, #to_matrix, #to_s
Methods included from Calculatable
#count_values, #numeric_summary, #object_summary, #summary
Methods included from Aggregatable
Methods included from Maths::Statistics::Vector
#acf, #acvf, #average_deviation_population, #box_cox_transformation, #center, #coefficient_of_variation, #count, #covariance_population, #covariance_sample, #cumsum, #describe, #dichotomize, #diff, #ema, #emsd, #emv, #factors, #frequencies, #index_of_max, #index_of_max_by, #index_of_min, #index_of_min_by, #kurtosis, #macd, #max, #max_by, #max_index, #mean, #median, #median_absolute_deviation, #min, #min_by, #mode, #percent_change, #percentile, #product, #proportion, #proportions, #range, #ranked, #rolling, #rolling_count, #rolling_max, #rolling_mean, #rolling_median, #rolling_min, #rolling_std, #rolling_sum, #rolling_variance, #sample_with_replacement, #sample_without_replacement, #skew, #standard_deviation_population, #standard_deviation_sample, #standard_error, #standardize, #sum, #sum_of_squared_deviation, #sum_of_squares, #value_counts, #variance_population, #variance_sample, #vector_centered_compute, #vector_percentile, #vector_standardized_compute
Methods included from Maths::Arithmetic::Vector
#%, #*, #**, #+, #-, #/, #abs, #add, #exp, #round, #sqrt
Constructor Details
#initialize(source, opts = {}) ⇒ Vector
Create a Vector object.
Arguments
source - The elements of the vector, given as an Array or a Hash. If an Array, a numeric index will be created if one is not supplied in the options. Specifying more index elements than actual values in source will insert nil into the surplus index elements. When a Hash is specified, the keys of the Hash are taken as the index elements and the corresponding values as the values that populate the vector.
Options
-
:name - Name of the vector
-
:index - Index of the vector
-
:dtype - The underlying data type. Can be :array. Default :array.
-
:missing_values - An Array of the values that are to be treated as ‘missing’. nil is the default missing value.
Usage
vecarr = DaruLite::Vector.new [1,2,3,4], index: [:a, :e, :i, :o]
vechsh = DaruLite::Vector.new({a: 1, e: 2, i: 3, o: 4})
# File 'lib/daru_lite/vector.rb', line 163
def initialize(source, opts = {})
  if opts[:type] == :category
    # Initialize category type vector
    extend DaruLite::Category
    initialize_category source, opts
  else
    # Initialize non-category type vector
    initialize_vector source, opts
  end
end
Dynamic Method Handling
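This class handles dynamic methods through the method_missing method: a call ending in = assigns to the element at that index label, and a call whose name matches an existing index label reads that element.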
#method_missing(name, *args) ⇒ Object
# File 'lib/daru_lite/vector.rb', line 566
def method_missing(name, *args, &)
  if name =~ /^([^=]+)=/
    self[Regexp.last_match(1).to_sym] = args[0]
  elsif has_index?(name)
    self[name]
  else
    super
  end
end
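The dispatch above can be tried outside the library with a small stand-in. This is only a sketch: LabelledValues is a hypothetical class backed by a Hash, not DaruLite::Vector, and it mirrors just the method_missing and respond_to_missing? logic shown on this page.

```ruby
# Hypothetical stand-in for a labelled vector; NOT the DaruLite class.
class LabelledValues
  def initialize(values, index)
    @store = index.zip(values).to_h
  end

  def [](label)
    @store.fetch(label)
  end

  def []=(label, value)
    @store[label] = value
  end

  def has_index?(label)
    @store.key?(label)
  end

  # Route `obj.a = 1` to `obj[:a] = 1` and `obj.a` to `obj[:a]`.
  def method_missing(name, *args)
    if (m = name.to_s.match(/^([^=]+)=/))
      self[m[1].to_sym] = args[0]
    elsif has_index?(name)
      self[name]
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?('=') || has_index?(name) || super
  end
end

v = LabelledValues.new([1, 2, 3], %i[a e i])
v.a        # reads the value stored under :a
v.e = 20   # writes through to v[:e]
```

Defining respond_to_missing? alongside method_missing keeps respond_to? consistent with what the object actually answers to.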
Instance Attribute Details
#data ⇒ Object (readonly)
Store vector data in an array
# File 'lib/daru_lite/vector.rb', line 134
def data
  @data
end
#dtype ⇒ Object (readonly)
The underlying dtype of the Vector. Can be :array.
# File 'lib/daru_lite/vector.rb', line 124
def dtype
  @dtype
end
#index ⇒ Object (readonly)
The row index. Can be either DaruLite::Index or DaruLite::MultiIndex.
# File 'lib/daru_lite/vector.rb', line 122
def index
  @index
end
#labels ⇒ Object
Store a hash of labels for values. Supplementary only. Recommend using index for proper usage.
# File 'lib/daru_lite/vector.rb', line 132
def labels
  @labels
end
#missing_positions ⇒ Object (readonly)
An Array of the positions in the vector that are being treated as ‘missing’.
# File 'lib/daru_lite/vector.rb', line 127
def missing_positions
  @missing_positions
end
#name ⇒ Object (readonly)
The name of the DaruLite::Vector. String.
# File 'lib/daru_lite/vector.rb', line 120
def name
  @name
end
#nm_dtype ⇒ Object (readonly)
Returns the value of attribute nm_dtype.
# File 'lib/daru_lite/vector.rb', line 125
def nm_dtype
  @nm_dtype
end
Class Method Details
.[](*indexes) ⇒ Object
Create a vector using (almost) any object
-
Array: flattened
-
Range: transformed using to_a
-
DaruLite::Vector
-
Numeric and string values
Description
The `Vector.[]` class method creates a vector from almost any object that has a `#to_a` method defined on it. It is similar to R's `c` function.
Usage
a = DaruLite::Vector[1,2,3,4,6..10]
#=>
# <DaruLite::Vector:99448510 @name = nil @size = 9 >
# nil
# 0 1
# 1 2
# 2 3
# 3 4
# 4 6
# 5 7
# 6 8
# 7 9
# 8 10
# File 'lib/daru_lite/vector.rb', line 88
def [](*indexes)
  values = indexes.map do |a|
    a.respond_to?(:to_a) ? a.to_a : a
  end.flatten
  DaruLite::Vector.new(values)
end
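The argument handling can be checked in isolation with plain Ruby. The helper name flatten_sources below is hypothetical; it reproduces only the map-and-flatten step and returns an Array rather than a vector.

```ruby
# Mirror of the argument handling in Vector.[]: expand anything that responds
# to #to_a, then flatten the result into a single list of values.
def flatten_sources(*sources)
  sources.map { |s| s.respond_to?(:to_a) ? s.to_a : s }.flatten
end

flatten_sources(1, 2, 3, 4, 6..10)   # => [1, 2, 3, 4, 6, 7, 8, 9, 10]
flatten_sources([1, [2, 3]], 'x')    # => [1, 2, 3, "x"]
```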
._load(data) ⇒ Object
:nodoc:
# File 'lib/daru_lite/vector.rb', line 95
def _load(data) # :nodoc:
  h = Marshal.load(data)
  DaruLite::Vector.new(h[:data],
                       index: h[:index],
                       name: h[:name],
                       dtype: h[:dtype], missing_values: h[:missing_values])
end
.coerce(data, options = {}) ⇒ Object
# File 'lib/daru_lite/vector.rb', line 103
def coerce(data, options = {})
  case data
  when DaruLite::Vector
    data
  when Array, Hash
    new(data, options)
  else
    raise ArgumentError, "Can't coerce #{data.class} to #{self}"
  end
end
.new_with_size(n, opts = {}, &block) ⇒ Object
Create a new vector by specifying the size and an optional value and block to generate values.
Description
The new_with_size class method lets you create a DaruLite::Vector by specifying the size as the argument. The optional block, if supplied, is run once for populating each element in the Vector.
The result of each run of the block is the value that is ultimately assigned to that position in the Vector.
Options
:value - The value assigned to every element when no block is given. All other options are the same as for .new.
# File 'lib/daru_lite/vector.rb', line 55
def new_with_size(n, opts = {}, &block)
  value = opts.delete :value
  block ||= ->(_) { value }
  DaruLite::Vector.new Array.new(n, &block), opts
end
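The value-or-block fallback can be seen with a plain Array. This is a sketch under the assumption that only the element generation matters; values_with_size is a hypothetical helper and takes :value as a keyword rather than through an options hash.

```ruby
# When no block is given, fall back to a lambda that returns the fixed value.
def values_with_size(n, value: nil, &block)
  block ||= ->(_) { value }
  Array.new(n, &block)
end

values_with_size(3, value: 0)        # => [0, 0, 0]
values_with_size(4) { |i| i * i }    # => [0, 1, 4, 9]
```

The block receives the zero-based position, so it can generate position-dependent values.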
Instance Method Details
#==(other) ⇒ Object
Two vectors are equal if they have the exact same index values corresponding with the exact same elements. Name is ignored.
# File 'lib/daru_lite/vector.rb', line 176
def ==(other)
  case other
  when DaruLite::Vector
    @index == other.index && size == other.size &&
      each_with_index.with_index.all? do |(e, index), position|
        e == other.at(position) && index == other.index.to_a[position]
      end
  else
    super
  end
end
#_dump ⇒ Object
:nodoc:
# File 'lib/daru_lite/vector.rb', line 536
def _dump(*) # :nodoc:
  Marshal.dump(
    data: @data.to_a,
    dtype: @dtype,
    name: @name,
    index: @index
  )
end
#bootstrap(estimators, nr, s = nil) ⇒ Object
Bootstrap
Generate nr resamples (with replacement) of size s from the vector, computing each estimate from estimators over each resample. estimators could be:
a) Hash with variable names as keys and lambdas as values
a.bootstrap(:log_s2=>lambda {|v| Math.log(v.variance)},1000)
b) Array with names of methods to bootstrap
a.bootstrap([:mean, :sd],1000)
c) A single method to bootstrap
a.bootstrap(:mean, 1000)
If s is nil, it defaults to the vector size.
Returns a DataFrame where each vector is a vector of length nr containing the computed resample estimates.
# File 'lib/daru_lite/vector.rb', line 449
def bootstrap(estimators, nr, s = nil)
  s ||= size
  h_est, es, bss = prepare_bootstrap(estimators)

  nr.times do
    bs = sample_with_replacement(s)
    es.each do |estimator|
      bss[estimator].push(h_est[estimator].call(bs))
    end
  end

  es.each do |est|
    bss[est] = DaruLite::Vector.new bss[est]
  end

  DaruLite::DataFrame.new bss
end
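The resampling loop can be illustrated on a plain Array. The sketch below is not the library's implementation: bootstrap_estimates and its rng keyword are hypothetical, it handles a single lambda estimator only, and it returns an Array instead of a DataFrame.

```ruby
# Draw nr resamples of size s (with replacement) and apply the estimator to each.
def bootstrap_estimates(data, estimator, nr, s = nil, rng: Random.new)
  s ||= data.size
  Array.new(nr) do
    resample = Array.new(s) { data[rng.rand(data.size)] } # sample with replacement
    estimator.call(resample)
  end
end

mean = ->(xs) { xs.sum.to_f / xs.size }
estimates = bootstrap_estimates([2, 4, 4, 5, 7, 9], mean, 1000, rng: Random.new(42))
estimates.size # => 1000
```

Passing a seeded Random makes the resamples reproducible, which is convenient when comparing runs.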
#cast(opts = {}) ⇒ Object
Cast a vector to a new data type.
Options
-
:dtype - :array for Ruby Array.
# File 'lib/daru_lite/vector.rb', line 297
def cast(opts = {})
  dt = opts[:dtype]
  raise ArgumentError, "Unsupported dtype #{opts[:dtype]}" unless dt == :array

  @data = cast_vector_to dt unless @dtype == dt
end
#category? ⇒ true, false
Tells if vector is categorical or not.
# File 'lib/daru_lite/vector.rb', line 350
def category?
  type == :category
end
#daru_lite_vector ⇒ Object Also known as: dv
:nocov:
# File 'lib/daru_lite/vector.rb', line 546
def daru_lite_vector(*)
  self
end
#db_type ⇒ Object
Returns the database type for the vector, according to its content
# File 'lib/daru_lite/vector.rb', line 514
def db_type
  # first, detect any character not number
  if @data.any? { |v| v.to_s =~ DATE_REGEXP }
    'DATE'
  elsif @data.any? { |v| v.to_s =~ /[^0-9e.-]/ }
    'VARCHAR (255)'
  elsif @data.any? { |v| v.to_s.include?('.') }
    'DOUBLE'
  else
    'INTEGER'
  end
end
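The detection order can be exercised on plain Arrays. In this sketch db_type_for is a hypothetical free-standing helper that copies the branches above, with the DATE_REGEXP pattern from the constant summary inlined as a local.

```ruby
# Same branch order as #db_type: date first, then any non-numeric character,
# then a decimal point, and INTEGER as the fallback.
def db_type_for(values)
  date_regexp = /^(\d{2}-\d{2}-\d{4}|\d{4}-\d{2}-\d{2})$/
  if values.any? { |v| v.to_s =~ date_regexp }
    'DATE'
  elsif values.any? { |v| v.to_s =~ /[^0-9e.-]/ }
    'VARCHAR (255)'
  elsif values.any? { |v| v.to_s.include?('.') }
    'DOUBLE'
  else
    'INTEGER'
  end
end

db_type_for([1, 2, 3])               # => "INTEGER"
db_type_for([1, 2.5])                # => "DOUBLE"
db_type_for(%w[alice bob])           # => "VARCHAR (255)"
db_type_for(['2024-01-31', 'n/a'])   # => "DATE"
```

Because each branch uses any?, a single date-shaped value is enough to classify the whole collection as DATE, as the last call shows.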
#delete(element) ⇒ Object
Delete an element by value
# File 'lib/daru_lite/vector.rb', line 305
def delete(element)
  delete_at index_of(element)
end
#delete_at(index) ⇒ Object
Delete element by index
# File 'lib/daru_lite/vector.rb', line 310
def delete_at(index)
  @data.delete_at @index[index]
  @index = DaruLite::Index.new(@index.to_a - [index])

  update_position_cache
end
#delete_at_position(position) ⇒ Object
Delete element by position
# File 'lib/daru_lite/vector.rb', line 318
def delete_at_position(position)
  @data.delete_at(position)
  @index = @index.delete_at(position)

  update_position_cache
end
#in(other) ⇒ Object
Comparator that checks, element by element, whether each value in self is present in other. Returns a boolean array with one entry per element of self.
# File 'lib/daru_lite/vector.rb', line 255
def in(other)
  other = other.zip(Array.new(other.size, 0)).to_h
  DaruLite::Core::Query::BoolArray.new(
    @data.each_with_object([]) do |d, memo|
      memo << (other.key?(d))
    end
  )
end
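The membership test builds a Hash from other so that each lookup is constant time. A plain-Ruby sketch of that idea follows; in_mask is a hypothetical helper and returns an ordinary Array of booleans rather than a BoolArray.

```ruby
# Build a lookup Hash once, then test every element of data against it.
def in_mask(data, other)
  lookup = other.to_h { |v| [v, true] }
  data.map { |d| lookup.key?(d) }
end

in_mask([1, 2, 3, 4, 5], [2, 4, 9])   # => [false, true, false, true, false]
```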
#inspect(spacing = 20, threshold = 15) ⇒ Object
Overrides the original inspect for pretty printing in irb
# File 'lib/daru_lite/vector.rb', line 411
def inspect(spacing = 20, threshold = 15)
  row_headers = index.is_a?(MultiIndex) ? index.sparse_tuples : index.to_a

  "#<#{self.class}(#{size})#{':category' if category?}>\n" +
    Formatters::Table.format(
      to_a.lazy.zip,
      headers: @name && [@name],
      row_headers: row_headers,
      threshold: threshold,
      spacing: spacing
    )
end
#is_values(*values) ⇒ DaruLite::Vector
Return a vector of booleans in which the value at position i is true if the element at position i equals any of the values passed as arguments, and false otherwise.
Do not use it to check for Float::NAN, as Float::NAN == Float::NAN is false.
# File 'lib/daru_lite/vector.rb', line 288
def is_values(*values)
  DaruLite::Vector.new values.map { |v| eq(v) }.inject(:|)
end
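The semantics, including the NaN caveat, can be reproduced with plain Arrays. This is only a sketch: is_values_mask is a hypothetical helper that compares with == directly instead of combining boolean arrays as the library does.

```ruby
# true at position i when data[i] == v for at least one of the given values.
def is_values_mask(data, *values)
  data.map { |d| values.any? { |v| d == v } }
end

is_values_mask([1, 2, 3, 2], 2, 3)              # => [false, true, true, true]
is_values_mask([Float::NAN, 1.0], Float::NAN)   # => [false, false]
```

The second call shows why NaN cannot be matched this way: the comparison Float::NAN == Float::NAN never succeeds.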
#jackknife(estimators, k = 1) ⇒ Object
Jackknife
Returns a dataset with jackknife delete-k estimators. estimators could be:
a) Hash with variable names as keys and lambdas as values
a.jackknife(:log_s2=>lambda {|v| Math.log(v.variance)})
b) Array with method names to jackknife
a.jackknife([:mean, :sd])
c) A single method to jackknife
a.jackknife(:mean)
k represents the block size for block jackknife. By default it is set to 1, for classic delete-one jackknife.
Returns a dataset where each vector is a vector of length cases/k containing the computed jackknife estimates.
Reference:
-
Sawyer, S. (2005). Resampling Data: Using a Statistical Jacknife.
# File 'lib/daru_lite/vector.rb', line 484
def jackknife(estimators, k = 1) # rubocop:disable Metrics/MethodLength
  raise "n should be divisible by k:#{k}" unless (size % k).zero?

  nb = (size / k).to_i
  h_est, es, ps = prepare_bootstrap(estimators)
  est_n = es.to_h { |v| [v, h_est[v].call(self)] }

  nb.times do |i|
    other = @data.dup
    other.slice!(i * k, k)
    other = DaruLite::Vector.new other

    es.each do |estimator|
      # Add pseudovalue
      ps[estimator].push(
        (nb * est_n[estimator]) - ((nb - 1) * h_est[estimator].call(other))
      )
    end
  end

  es.each do |est|
    ps[est] = DaruLite::Vector.new ps[est]
  end

  DaruLite::DataFrame.new ps
end
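The pseudovalue formula in the loop, nb * estimate(all) - (nb - 1) * estimate(without block i), can be checked on a plain Array. The helper jackknife_pseudovalues below is a hypothetical sketch for a single lambda estimator, not the library method.

```ruby
# Delete-k jackknife pseudovalues for one estimator over a plain Array.
def jackknife_pseudovalues(data, estimator, k = 1)
  raise ArgumentError, "size should be divisible by k: #{k}" unless (data.size % k).zero?

  nb = data.size / k
  full = estimator.call(data)
  Array.new(nb) do |i|
    rest = data.dup
    rest.slice!(i * k, k) # drop the i-th block of k values
    (nb * full) - ((nb - 1) * estimator.call(rest))
  end
end

mean = ->(xs) { xs.sum.to_f / xs.size }
jackknife_pseudovalues([1, 2, 3, 4], mean)      # one pseudovalue per observation
jackknife_pseudovalues([1, 2, 3, 4], mean, 2)   # one pseudovalue per block of 2
```

For the mean with k = 1, the pseudovalues reproduce the original observations up to floating-point rounding, which makes a convenient sanity check.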
#lag(k = 1) ⇒ DaruLite::Vector
Lags the series by `k` periods, "shifting" data and inserting `nil`s at the beginning or end of the vector, while preserving the original vector's size.
`k` can be a positive or negative integer. If `k` is positive, `nil`s are inserted at the beginning of the vector; otherwise they are inserted at the end.
# File 'lib/daru_lite/vector.rb', line 398
def lag(k = 1)
  case k
  when 0 then dup
  when 1...size
    copy(([nil] * k) + data.to_a)
  when -size..-1
    copy(data.to_a[k.abs...size])
  else
    copy([])
  end
end
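The documented shifting behaviour can be sketched on a plain Array. This is an illustration of the description, not of the library's internals: lag_values is a hypothetical helper that pads explicitly, and returning all nils when the shift is at least as large as the data is an assumption.

```ruby
# Shift values by k positions, padding with nil so the size is preserved.
def lag_values(data, k = 1)
  n = data.size
  return Array.new(n) if k.abs >= n # assumption: everything is shifted out
  return data.dup if k.zero?

  if k.positive?
    Array.new(k) + data[0...(n - k)]
  else
    data[k.abs..] + Array.new(k.abs)
  end
end

lag_values([1, 2, 3, 4], 1)    # => [nil, 1, 2, 3]
lag_values([1, 2, 3, 4], -2)   # => [3, 4, nil, nil]
```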
#numeric? ⇒ Boolean
# File 'lib/daru_lite/vector.rb', line 264
def numeric?
  type == :numeric
end
#object? ⇒ Boolean
# File 'lib/daru_lite/vector.rb', line 268
def object?
  type == :object
end
#rename(new_name) ⇒ Object Also known as: name=
Give the vector a new name
# File 'lib/daru_lite/vector.rb', line 427
def rename(new_name)
  @name = new_name
  self
end
#respond_to_missing?(name, include_private = false) ⇒ Boolean
# File 'lib/daru_lite/vector.rb', line 576
def respond_to_missing?(name, include_private = false)
  name.to_s.end_with?('=') || has_index?(name) || super
end
#save(filename) ⇒ Object
Save the vector to a file
Arguments
-
filename - Path of file where the vector is to be saved
# File 'lib/daru_lite/vector.rb', line 532
def save(filename)
  DaruLite::IO.save self, filename
end
#size ⇒ Object
# File 'lib/daru_lite/vector.rb', line 115
def size
  @data.size
end
#splitted(sep = ',') ⇒ Object
Return an Array with the data split by a separator. nil elements stay nil, and elements that do not respond to split are wrapped in a one-element Array.
a=DaruLite::Vector.new(["a,b","c,d","a,b","d"])
a.splitted
=>
[["a","b"],["c","d"],["a","b"],["d"]]
# File 'lib/daru_lite/vector.rb', line 359
def splitted(sep = ',')
  @data.map do |s|
    if s.nil?
      nil
    elsif s.respond_to? :split
      s.split sep
    else
      [s]
    end
  end
end
#to_category(opts = {}) ⇒ DaruLite::Vector
Converts a non category type vector to category type vector.
# File 'lib/daru_lite/vector.rb', line 559
def to_category(opts = {})
  dv = DaruLite::Vector.new to_a, type: :category, name: @name, index: @index
  dv.ordered = opts[:ordered] || false
  dv.categories = opts[:categories] if opts[:categories]
  dv
end
#type ⇒ Object
The type of data contained in the vector. Can be :numeric or :object: the type is :numeric when every element is nil or a Numeric, and :object otherwise.
Running through the data to figure out the kind of data is delayed to the last possible moment.
# File 'lib/daru_lite/vector.rb', line 329
def type
  if @type.nil? || @possibly_changed_type
    @type = :numeric
    each do |e|
      next if e.nil? || e.is_a?(Numeric)

      @type = :object
      break
    end
    @possibly_changed_type = false
  end

  @type
end
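The inference rule itself is easy to try on plain Arrays. The sketch below leaves out the caching through @type and @possibly_changed_type; infer_type is a hypothetical helper name.

```ruby
# :numeric when every element is nil or Numeric, :object otherwise.
def infer_type(values)
  values.all? { |e| e.nil? || e.is_a?(Numeric) } ? :numeric : :object
end

infer_type([1, 2.5, nil])   # => :numeric
infer_type([1, 'two', 3])   # => :object
infer_type([])              # => :numeric
```

An empty collection counts as :numeric, matching the default assigned before the loop in the method above.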