Class: Statsample::Vector

Inherits: Object

Includes: Enumerable, Summarizable, VectorShorthands, Writable

Defined in: lib/statsample/vector.rb,
lib/statsample/vector/gsl.rb,
lib/statsample/rserve_extension.rb

Overview

A one-dimensional collection of values. It works like a column in a spreadsheet.

Usage

The quickest way to create a vector is Array#to_vector or Array#to_numeric.

v=[1,2,3,4].to_vector(:numeric)
v=[1,2,3,4].to_numeric

Defined Under Namespace

Modules: GSL_

Instance Attribute Summary

#data ⇒ Object readonly

Original data.
#data_with_nils ⇒ Object readonly

Original data, with all missing values replaced by nils.
#date_data_with_nils ⇒ Object readonly

Date data, with all missing values replaced by nils.
#labels ⇒ Object

Change label for specific values.
#missing_data ⇒ Object readonly

Missing values array.
#missing_values ⇒ Object

Array of values considered as missing.
#name ⇒ Object

Name of vector.
#today_values ⇒ Object

Array of values considered as “Today”, with date type.
#type ⇒ Object

Level of measurement.
#valid_data ⇒ Object readonly

Valid data.

Class Method Summary

.[](*args) ⇒ Object

Create a vector using (almost) any object:
* Array: flattened
* Range: transformed using to_a
* Statsample::Vector
* Numeric and string values
._load(data) ⇒ Object

:nodoc:.
.new_numeric(n, val = nil, &block) ⇒ Object

Creates a new numeric vector. Parameters:
[n] Size
[val] Value of each element
[&block] If a block is provided, it is used to set the values of the vector.
.new_scale(n, val = nil, &block) ⇒ Object

Deprecated.

Instance Method Summary

#*(v) ⇒ Object
#+(v) ⇒ Object

Vector sum.
#-(v) ⇒ Object

Vector subtraction.
#==(v2) ⇒ Object

Vector equality.
#[](i) ⇒ Object

Retrieves the i-th element of data.
#[]=(i, v) ⇒ Object

Sets the i-th element of data.
#_check_type(t) ⇒ Object

:nodoc:.
#_dump(i) ⇒ Object

:nodoc:.
#_frequencies ⇒ Object

:nodoc:.
#_set_valid_data_intern ⇒ Object

:nodoc:.
#_vector_ari(method, v) ⇒ Object

:nodoc:.
#add(v, update_valid = true) ⇒ Object

Add a value at the end of the vector.
#average_deviation_population(m = nil) ⇒ Object (also: #adp)

Population average deviation (denominator N). Author: Al Chou.
#bootstrap(estimators, nr, s = nil) ⇒ Object

Bootstrap. Generates nr resamples (with replacement) of size s from the vector, computing each estimate from estimators over each resample.
#box_cox_transformation(lambda) ⇒ Object

:nodoc:.
#can_be_date? ⇒ Boolean

Return true if all data is Date, “today” values or nil.
#can_be_numeric? ⇒ Boolean

Return true if all data is Numeric or nil.
#check_type(t) ⇒ Object

:nodoc:.
#coefficient_of_variation ⇒ Object (also: #cov)

Coefficient of variation. Calculated with the sample standard deviation.
#count(x = false) ⇒ Object

Returns the number of cases that satisfy a condition.
#db_type(dbs = 'mysql') ⇒ Object

Returns the database type for the vector, according to its content.
#dichotomize(low = nil) ⇒ Object

Dichotomizes the vector into 0 and 1, based on the lowest value. If the parameter is defined, this value and lower will be 0, and higher values will be 1.
#dup ⇒ Object

Creates a duplicate of the Vector.
#dup_empty ⇒ Object

Returns an empty duplicate of the vector.
#each ⇒ Object

Iterate on each item.
#each_index ⇒ Object

Iterate on each item, retrieving index.
#factors ⇒ Object

Retrieves the unique values of data.
#frequencies ⇒ Object

:nodoc:.
#has_missing_data? ⇒ Boolean (also: #flawed?)

Returns true if data has one or more missing values.
#histogram(bins = 10) ⇒ Object

With a Fixnum, creates that number of bins within the range of data. With an Array, each value will be a cut point.
#initialize(data = [], type = :object, opts = Hash.new) ⇒ Vector constructor

Creates a new Vector object.
#inspect ⇒ Object
#is_valid?(x) ⇒ Boolean

Return true if a value is valid (not nil and not included on missing values).
#jacknife(estimators, k = 1) ⇒ Object

Jacknife. Returns a dataset with jacknife delete-k estimators. estimators could be: a) a Hash with variable names as keys and lambdas as values: a.jacknife(:log_s2=>lambda {|v| Math.log(v.variance)}); b) an Array with method names to jacknife: a.jacknife([:mean, :sd]); c) a single method to jacknife: a.jacknife(:mean). k represents the block size for block jacknife.
#kurtosis(m = nil) ⇒ Object

Kurtosis of the sample.
#labeling(x) ⇒ Object (also: #label)

Retrieves label for value x.
#max ⇒ Object

Maximum value.
#mean ⇒ Object

The arithmetical mean of data.
#median ⇒ Object

Returns the median (50th percentile).
#median_absolute_deviation ⇒ Object (also: #mad)
#min ⇒ Object

Minimum value.
#mode ⇒ Object

Returns the most frequent item.
#n_valid ⇒ Object

The number of items with valid data.
#percentil(q, strategy = :midpoint) ⇒ Object

Returns the value of the percentile q.
#product ⇒ Object

Product of all values on the sample.
#proportion(v = 1) ⇒ Object

Proportion of a given value.
#proportion_confidence_interval_t(n_poblation, margin = 0.95, v = 1) ⇒ Object
#proportion_confidence_interval_z(n_poblation, margin = 0.95, v = 1) ⇒ Object
#proportions ⇒ Object

Returns a hash with the distribution of proportions of the sample.
#push(v) ⇒ Object
#range ⇒ Object

The range of the data (max - min).
#ranked(type = :numeric) ⇒ Object

Returns a ranked vector.
#recode(type = nil) ⇒ Object

Returns a new vector, with data modified by block.
#recode! ⇒ Object

Modifies current vector, with data modified by block.
#report_building(b) ⇒ Object
#sample_with_replacement(sample = 1) ⇒ Object

Returns a random sample of the given size, with replacement, drawn only from valid data.
#sample_without_replacement(sample = 1) ⇒ Object

Returns a random sample of the given size, without replacement, drawn only from valid data.
#set_valid_data ⇒ Object

Update valid_data, missing_data, data_with_nils and gsl at the end of an insertion.
#set_valid_data_intern ⇒ Object

:nodoc:.
#size ⇒ Object (also: #n)

Size of total data.
#skew(m = nil) ⇒ Object

Skewness of the sample.
#split_by_separator(sep = Statsample::SPLIT_TOKEN) ⇒ Object

Returns a hash of Vectors, one for each distinct value found in the fields.
#split_by_separator_freq(sep = Statsample::SPLIT_TOKEN) ⇒ Object
#splitted(sep = Statsample::SPLIT_TOKEN) ⇒ Object

Returns an array with the data split by a separator.
#standard_deviation_population(m = nil) ⇒ Object (also: #sdp)

Population Standard deviation (denominator N).
#standard_deviation_sample(m = nil) ⇒ Object (also: #sds, #sd)

Sample Standard deviation (denominator n-1).
#standard_error ⇒ Object (also: #se)

Standard error of the mean. Calculated using sd/sqrt(n).
#sum ⇒ Object

The sum of values for the data.
#sum_of_squared_deviation ⇒ Object

Sum of squared deviation.
#sum_of_squares(m = nil) ⇒ Object (also: #ss)

Sum of squares for the data around a value.
#to_a ⇒ Object (also: #to_ary)
#to_matrix(dir = :horizontal) ⇒ Object

Ugly name.
#to_REXP ⇒ Object
#to_s ⇒ Object
#variance_population(m = nil) ⇒ Object

Population variance (denominator N).
#variance_proportion(n_poblation, v = 1) ⇒ Object

Variance of p, according to population size.
#variance_sample(m = nil) ⇒ Object (also: #variance)

Sample Variance (denominator n-1).
#variance_total(n_poblation, v = 1) ⇒ Object

Variance of p, according to population size.
#vector_centered ⇒ Object (also: #centered)

Return a centered vector.
#vector_centered_compute(m) ⇒ Object

:nodoc:.
#vector_labeled ⇒ Object

Returns a Vector in which each value that has a label is replaced by that label.
#vector_percentil ⇒ Object

Returns a vector with each value replaced by its percentile.
#vector_standarized(use_population = false) ⇒ Object (also: #standarized)

Returns a vector with the standardized values of data, using the sd with denominator n-1.
#vector_standarized_compute(m, sd) ⇒ Object

:nodoc:.
#verify ⇒ Object

Reports all values that do not comply with a condition.

Constructor Details

#initialize(data = [], type = :object, opts = Hash.new) ⇒ `Vector`

Creates a new Vector object.

data Any data that can be converted to an Array
type Level of measurement. See Vector#type
opts Hash of options
- :missing_values Array of missing values. See Vector#missing_values
- :today_values Array of ‘today’ values. See Vector#today_values
- :labels Labels for data values
- :name Name of vector

# File 'lib/statsample/vector.rb', line 80

def initialize(data=[], type=:object, opts=Hash.new)
  if type == :ordinal or type == :scale
    $stderr.puts "WARNING: #{type} has been deprecated. Use :numeric instead."
    type = :numeric
  end

  if type == :nominal
    $stderr.puts "WARNING: nominal has been deprecated. Use :object instead."
    type = :object
  end

  @data=data.is_a?(Array) ? data : data.to_a
  @type=type
  opts_default={
    :missing_values=>[],
    :today_values=>['NOW','TODAY', :NOW, :TODAY],
    :labels=>{},
    :name=>nil
  }
  @opts=opts_default.merge(opts)
  if  @opts[:name].nil?
    @@n_table||=0
    @@n_table+=1
    @opts[:name]="Vector #{@@n_table}"
  end
  @missing_values=@opts[:missing_values]
  @labels=@opts[:labels]
  @today_values=@opts[:today_values]
  @name=@opts[:name]
  @valid_data=[]
  @data_with_nils=[]
  @date_data_with_nils=[]
  @missing_data=[]
  @has_missing_data=nil
  @numeric_data=nil
  set_valid_data
  self.type=type
end

Instance Attribute Details

#data ⇒ `Object` (readonly)

Original data.



# File 'lib/statsample/vector.rb', line 54

def data
  @data
end

#data_with_nils ⇒ `Object` (readonly)

Original data, with all missing values replaced by nils



# File 'lib/statsample/vector.rb', line 64

def data_with_nils
  @data_with_nils
end

#date_data_with_nils ⇒ `Object` (readonly)

Date data, with all missing values replaced by nils



# File 'lib/statsample/vector.rb', line 66

def date_data_with_nils
  @date_data_with_nils
end

#labels ⇒ `Object`

Change label for specific values



# File 'lib/statsample/vector.rb', line 68

def labels
  @labels
end

#missing_data ⇒ `Object` (readonly)

Missing values array



# File 'lib/statsample/vector.rb', line 62

def missing_data
  @missing_data
end

#missing_values ⇒ `Object`

Array of values considered as missing. Nil is a missing value, by default



# File 'lib/statsample/vector.rb', line 58

def missing_values
  @missing_values
end

#name ⇒ `Object`

Name of vector. Should be used for output by many classes



# File 'lib/statsample/vector.rb', line 70

def name
  @name
end

#today_values ⇒ `Object`

Array of values considered as “Today”, with date type. “NOW”, “TODAY”, :NOW and :TODAY are ‘today’ values, by default



# File 'lib/statsample/vector.rb', line 60

def today_values
  @today_values
end

#type ⇒ `Object`

Level of measurement. Could be :object, :numeric or :date



# File 'lib/statsample/vector.rb', line 52

def type
  @type
end

#valid_data ⇒ `Object` (readonly)

Valid data. Equal to data, minus values assigned as missing values



# File 'lib/statsample/vector.rb', line 56

def valid_data
  @valid_data
end

Class Method Details

.[](*args) ⇒ `Object`

Create a vector using (almost) any object

Array: flattened
Range: transformed using to_a
Statsample::Vector
Numeric and string values

# File 'lib/statsample/vector.rb', line 123

def self.[](*args)
  values=[]
  args.each do |a|
    case a
    when Array
      values.concat a.flatten
    when Statsample::Vector
      values.concat a.to_a
    when Range
      values.concat  a.to_a
    else
      values << a
    end
  end
  vector=new(values)
  vector.type=:numeric if vector.can_be_numeric?
  vector
end
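
The argument handling above can be tried without loading the gem. The following is a minimal plain-Ruby sketch: `collect_values` is a hypothetical helper name, it returns a plain Array rather than a Vector, and the Statsample::Vector branch is omitted because that class is not loaded here.

```ruby
# Hypothetical helper: gathers arguments into one flat Array,
# following the same rules as the listing above.
def collect_values(*args)
  values = []
  args.each do |a|
    case a
    when Array then values.concat(a.flatten) # nested arrays are flattened
    when Range then values.concat(a.to_a)    # ranges are expanded
    else values << a                         # scalars are appended as-is
    end
  end
  values
end

puts collect_values(1, [2, [3, 4]], 5..7, "x").inspect
# => [1, 2, 3, 4, 5, 6, 7, "x"]
```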

._load(data) ⇒ `Object`

:nodoc:

# File 'lib/statsample/vector.rb', line 256

def self._load(data) # :nodoc:
  h=Marshal.load(data)
  Vector.new(h['data'], h['type'], :missing_values=> h['missing_values'], :labels=>h['labels'], :name=>h['name'])
end

.new_numeric(n, val = nil, &block) ⇒ `Object`

Creates a new numeric vector. Parameters:

n: Size
val: Value of each element
&block: If a block is provided, it is used to set the values of the vector

# File 'lib/statsample/vector.rb', line 146

def self.new_numeric(n,val=nil, &block)
  if block
    vector=n.times.map {|i| block.call(i)}.to_numeric
  else
    vector=n.times.map { val}.to_numeric
  end
  vector.type=:numeric
  vector
end

.new_scale(n, val = nil, &block) ⇒ `Object`

Deprecated. Use new_numeric instead.

# File 'lib/statsample/vector.rb', line 157

def self.new_scale(n, val=nil,&block)
  $stderr.puts "WARNING: .new_scale has been deprecated. Use .new_numeric instead."
  new_numeric n, val, &block
end

Instance Method Details

#*(v) ⇒ `Object`



# File 'lib/statsample/vector.rb', line 451

def *(v)
  _vector_ari("*",v)
end

#+(v) ⇒ `Object`

Vector sum.

If v is a scalar, this value is added to all elements.
If v is an Array or a Vector, it should be the same size as this vector; each item of this vector is added to the item at the same position in the other vector.



# File 'lib/statsample/vector.rb', line 437

def +(v)
  _vector_ari("+",v)
end

#-(v) ⇒ `Object`

Vector subtraction.

If v is a scalar, this value is subtracted from all elements.
If v is an Array or a Vector, it should be the same size as this vector; the item at the same position in the other vector is subtracted from each item of this vector.



# File 'lib/statsample/vector.rb', line 447

def -(v)
  _vector_ari("-",v)
end

#==(v2) ⇒ `Object`

Vector equality. Two vectors are equal if their data, missing values, type and labels are equal.

# File 'lib/statsample/vector.rb', line 247

def ==(v2)
  return false unless v2.instance_of? Statsample::Vector
  @data==v2.data and @missing_values==v2.missing_values and @type==v2.type and @labels==v2.labels
end

#[](i) ⇒ `Object`

Retrieves the i-th element of data



# File 'lib/statsample/vector.rb', line 394

def [](i)
  @data[i]
end

#[]=(i, v) ⇒ `Object`

Sets the i-th element of data. Note: call set_valid_data afterwards if you insert missing values



# File 'lib/statsample/vector.rb', line 399

def []=(i,v)
  @data[i]=v
end

#_check_type(t) ⇒ `Object`

:nodoc:

Raises:

(NoMethodError)

# File 'lib/statsample/vector.rb', line 185

def _check_type(t) #:nodoc:
  raise NoMethodError if (t == :numeric and @type == :object) or 
                         (t == :date)   or (:date == @type)
end

#_dump(i) ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 252

def _dump(i) # :nodoc:
  Marshal.dump({'data'=>@data,'missing_values'=>@missing_values, 'labels'=>@labels, 'type'=>@type,'name'=>@name})
end

#_frequencies ⇒ `Object`

:nodoc:

# File 'lib/statsample/vector.rb', line 775

def _frequencies #:nodoc:
  @valid_data.inject(Hash.new) {|a,x|
    a[x]||=0
    a[x]=a[x]+1
    a
  }
end

#_set_valid_data_intern ⇒ `Object`

:nodoc:

# File 'lib/statsample/vector.rb', line 351

def _set_valid_data_intern #:nodoc:
  @data.each do |n|
    if is_valid? n
      @valid_data.push(n)
      @data_with_nils.push(n)
    else
      @data_with_nils.push(nil)
      @missing_data.push(n)
    end
  end
  @has_missing_data=@missing_data.size>0
end
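
The bookkeeping in this listing can be illustrated with a standalone sketch. `partition_valid` is a hypothetical name, and the method returns a Hash instead of filling instance variables; the validity test mirrors #is_valid? (not nil and not listed as a missing value).

```ruby
# Hypothetical helper: splits raw data into the three arrays the Vector keeps.
def partition_valid(data, missing_values = [])
  valid = []
  with_nils = []
  missing = []
  data.each do |x|
    if x.nil? || missing_values.include?(x)
      with_nils << nil # the position is kept, the value is blanked
      missing << x
    else
      valid << x
      with_nils << x
    end
  end
  { valid: valid, with_nils: with_nils, missing: missing }
end

parts = partition_valid([1, nil, 99, 4], [99])
puts parts[:valid].inspect     # => [1, 4]
puts parts[:with_nils].inspect # => [1, nil, nil, 4]
puts parts[:missing].inspect   # => [nil, 99]
```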

#_vector_ari(method, v) ⇒ `Object`

:nodoc:

# File 'lib/statsample/vector.rb', line 465

def _vector_ari(method,v) # :nodoc:
  if(v.is_a? Vector or v.is_a? Array)
    raise ArgumentError, "The array/vector parameter (#{v.size}) should be of the same size of the original vector (#{@data.size})" unless v.size==@data.size
    sum=[]
    v.size.times {|i|
      if((v.is_a? Vector and v.is_valid?(v[i]) and is_valid?(@data[i])) or (v.is_a? Array and !v[i].nil? and !data[i].nil?))
        sum.push(@data[i].send(method,v[i]))
      else
        sum.push(nil)
      end
    }
    Statsample::Vector.new(sum, :numeric)
  elsif(v.respond_to? method )
    Statsample::Vector.new(
      @data.collect  {|x|
        if(!x.nil?)
          x.send(method,v)
        else
          nil
        end
      } , :numeric)
  else
    raise TypeError,"You should pass a scalar or a array/vector"
  end

end
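
A simplified sketch of the same idea, working on plain Arrays: any position where either operand is nil yields nil, and a scalar is applied to every element. `elementwise` is a hypothetical helper, and the check against a Vector's missing values is left out.

```ruby
# Hypothetical helper: element-wise arithmetic with nil propagation.
def elementwise(op, left, right)
  if right.is_a?(Array)
    raise ArgumentError, "sizes differ" unless right.size == left.size
    left.zip(right).map { |a, b| a.nil? || b.nil? ? nil : a.send(op, b) }
  else
    # scalar case: apply the same operand to every non-nil element
    left.map { |a| a.nil? ? nil : a.send(op, right) }
  end
end

puts elementwise(:+, [1, 2, nil], [10, 20, 30]).inspect # => [11, 22, nil]
puts elementwise(:*, [1, nil, 3], 2).inspect            # => [2, nil, 6]
```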

#add(v, update_valid = true) ⇒ `Object`

Adds a value at the end of the vector. If the second argument is set to false, you should update the Vector using Vector#set_valid_data at the end of your insertion cycle

# File 'lib/statsample/vector.rb', line 314

def add(v,update_valid=true)
  @data.push(v)
  set_valid_data if update_valid
end

#average_deviation_population(m = nil) ⇒ `Object` Also known as: adp

Population average deviation (denominator N). Author: Al Chou

# File 'lib/statsample/vector.rb', line 1003

def average_deviation_population( m = nil )
  check_type :numeric
  m ||= mean
  ( @numeric_data.inject( 0 ) { |a, x| ( x - m ).abs + a } ).quo( n_valid )
end
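
As a worked illustration, the same quantity (the mean of absolute deviations from m, with denominator N) can be computed on a plain Array. `average_deviation` is a hypothetical name, and the sketch returns a Float where the listing's `quo` may return a Rational.

```ruby
# Hypothetical helper: mean absolute deviation around m (defaults to the mean).
def average_deviation(values, m = nil)
  m ||= values.sum.to_f / values.size
  values.sum { |x| (x - m).abs } / values.size.to_f
end

puts average_deviation([1, 2, 3, 4, 5]) # deviations 2, 1, 0, 1, 2 average to 1.2
```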

#bootstrap(estimators, nr, s = nil) ⇒ `Object`

Bootstrap

Generates nr resamples (with replacement) of size s from the vector, computing each estimate from estimators over each resample. estimators could be:

a) Hash with variable names as keys and lambdas as values

a.bootstrap({:log_s2=>lambda {|v| Math.log(v.variance)}}, 1000)

b) Array with names of methods to bootstrap

a.bootstrap([:mean, :sd],1000)

c) A single method to bootstrap

a.bootstrap(:mean, 1000)

If s is nil, it defaults to the vector size.

Returns a dataset where each vector is a vector of length nr containing the computed resample estimates.

# File 'lib/statsample/vector.rb', line 565

def bootstrap(estimators, nr, s=nil)
  s||=n

  h_est, es, bss= prepare_bootstrap(estimators)


  nr.times do |i|
    bs=sample_with_replacement(s)
    es.each do |estimator|
      # Add bootstrap
      bss[estimator].push(h_est[estimator].call(bs))
    end
  end

  es.each do |est|
    bss[est]=bss[est].to_numeric
    bss[est].type=:numeric
  end
  bss.to_dataset

end
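
The resampling loop can be sketched on a plain Array. This is an illustration under assumptions, not the gem's code: `bootstrap_estimates` is a hypothetical helper that takes a single estimator as a block and returns an Array of estimates rather than a dataset; the `rng:` keyword is added only to make runs repeatable.

```ruby
# Hypothetical helper: bootstrap distribution of one estimator.
def bootstrap_estimates(data, nr, s = data.size, rng: Random.new)
  Array.new(nr) do
    # draw s items with replacement, then evaluate the estimator on them
    resample = Array.new(s) { data[rng.rand(data.size)] }
    yield resample
  end
end

means = bootstrap_estimates([1, 2, 3, 4, 5], 200, rng: Random.new(42)) do |r|
  r.sum.to_f / r.size
end
puts means.size # => 200
```

Every resampled mean necessarily lies between the smallest and largest observation; the spread of `means` is what approximates the sampling variability of the estimator.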

#box_cox_transformation(lambda) ⇒ `Object`

:nodoc:

# File 'lib/statsample/vector.rb', line 230

def box_cox_transformation(lambda) # :nodoc:
  raise "Should be a numeric" unless @type==:numeric
  @data_with_nils.collect{|x|
  if !x.nil?
    if(lambda==0)
      Math.log(x)
    else
      (x**lambda-1).quo(lambda)
    end
  else
    nil
  end
  }.to_vector(:numeric)
end
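
The transformation in this listing is (x^lambda - 1) / lambda for a non-zero lambda and log(x) when lambda is 0, with nils passed through. A plain-Array sketch, where `box_cox` is a hypothetical standalone function returning Floats:

```ruby
# Hypothetical helper: Box-Cox transformation of each non-nil value.
def box_cox(values, lambda_)
  values.map do |x|
    next nil if x.nil? # missing values stay missing
    lambda_.zero? ? Math.log(x) : (x**lambda_ - 1) / lambda_.to_f
  end
end

puts box_cox([1, 4, nil], 0.5).inspect # => [0.0, 2.0, nil]
puts box_cox([1], 0).inspect           # => [0.0]
```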

#can_be_date? ⇒ `Boolean`

Return true if all data is Date, “today” values or nil

Returns:

(Boolean)

# File 'lib/statsample/vector.rb', line 719

def can_be_date?
  if @data.find {|v|
    !v.nil? and !v.is_a? Date and !v.is_a? Time and (v.is_a? String and !@today_values.include? v) and (v.is_a? String and !(v=~/\d{4,4}[-\/]\d{1,2}[-\/]\d{1,2}/))}
    false
  else
    true
  end
end

#can_be_numeric? ⇒ `Boolean`

Return true if all data is Numeric or nil

Returns:

(Boolean)

# File 'lib/statsample/vector.rb', line 728

def can_be_numeric?
  if @data.find {|v| !v.nil? and !v.is_a? Numeric and !@missing_values.include? v}
    false
  else
    true
  end
end

#check_type(t) ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 175

def check_type(t)
  Statsample::STATSAMPLE__.check_type(self,t)
end

#coefficient_of_variation ⇒ `Object` Also known as: cov

Coefficient of variation. Calculated with the sample standard deviation

# File 'lib/statsample/vector.rb', line 1075

def coefficient_of_variation
    check_type :numeric
    standard_deviation_sample.quo(mean)
end

#count(x = false) ⇒ `Object`

Returns the number of cases that satisfy a condition. If a block is given, returns the number of items for which the block returns true. If a value is given instead, returns the frequency of that value.

# File 'lib/statsample/vector.rb', line 692

def count(x=false)
  if block_given?
    r=@data.inject(0) {|s, i|
      r=yield i
      s+(r ? 1 : 0)
    }
    r.nil? ? 0 : r
  else
    frequencies[x].nil? ? 0 : frequencies[x]
  end
end

#db_type(dbs = 'mysql') ⇒ `Object`

Returns the database type for the vector, according to its content

# File 'lib/statsample/vector.rb', line 706

def db_type(dbs='mysql')
  # first, detect any character not number
  if @data.find {|v|  v.to_s=~/\d{2,2}-\d{2,2}-\d{4,4}/} or @data.find {|v|  v.to_s=~/\d{4,4}-\d{2,2}-\d{2,2}/}
    return "DATE"
  elsif @data.find {|v|  v.to_s=~/[^0-9e.-]/ }
    return "VARCHAR (255)"
  elsif @data.find {|v| v.to_s=~/\./}
    return "DOUBLE"
  else
    return "INTEGER"
  end
end
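
The order of the checks matters: dates are detected first, then any non-numeric character, then a decimal point. A standalone sketch of that cascade, where `guess_db_type` is a hypothetical name and the regular expressions are copied from the listing:

```ruby
# Hypothetical helper: picks a column type from the string form of the values.
def guess_db_type(data)
  strings = data.map(&:to_s)
  if strings.any? { |s| s =~ /\d{2}-\d{2}-\d{4}/ || s =~ /\d{4}-\d{2}-\d{2}/ }
    "DATE"
  elsif strings.any? { |s| s =~ /[^0-9e.-]/ } # anything that is not number-like
    "VARCHAR (255)"
  elsif strings.any? { |s| s =~ /\./ }        # a decimal point means a float
    "DOUBLE"
  else
    "INTEGER"
  end
end

puts guess_db_type([1, 2, 3])      # => INTEGER
puts guess_db_type([1.5, 2])       # => DOUBLE
puts guess_db_type(["a", 1])       # => VARCHAR (255)
puts guess_db_type(["2020-01-02"]) # => DATE
```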

#dichotomize(low = nil) ⇒ `Object`

Dichotomizes the vector into 0 and 1, based on the lowest value. If the parameter is defined, this value and lower will be 0, and higher values will be 1

# File 'lib/statsample/vector.rb', line 284

def dichotomize(low = nil)
  low ||= factors.min

  @data_with_nils.collect do |x|
    if x.nil?
      nil
    elsif x > low
      1
    else
      0
    end
  end.to_numeric
end
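
A plain-Array sketch of the same rule; `dichotomize_values` is a hypothetical name, and nils are kept as nils.

```ruby
# Hypothetical helper: values above the cut point become 1, the rest 0.
def dichotomize_values(values, low = nil)
  low ||= values.compact.min # default cut point is the lowest valid value
  values.map { |x| x.nil? ? nil : (x > low ? 1 : 0) }
end

puts dichotomize_values([1, 2, 3, nil]).inspect # => [0, 1, 1, nil]
puts dichotomize_values([1, 2, 3, 4], 2).inspect # => [0, 0, 1, 1]
```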

#dup ⇒ `Object`

Creates a duplicate of the Vector. Note: data, missing_values and labels are duplicated, so changes to the original vector do not propagate to copies.



# File 'lib/statsample/vector.rb', line 164

def dup
  Vector.new(@data.dup,@type, :missing_values => @missing_values.dup, :labels => @labels.dup, :name=>@name)
end

#dup_empty ⇒ `Object`

Returns an empty duplicate of the vector. Maintains the type, missing values and labels.



# File 'lib/statsample/vector.rb', line 169

def dup_empty
  Vector.new([],@type, :missing_values => @missing_values.dup, :labels => @labels.dup, :name=> @name)
end

#each ⇒ `Object`

Iterate on each item. Equivalent to

@data.each{|x| yield x}



# File 'lib/statsample/vector.rb', line 300

def each
  @data.each{|x| yield(x) }
end

#each_index ⇒ `Object`

Iterate on each item, retrieving index

# File 'lib/statsample/vector.rb', line 305

def each_index
  (0...@data.size).each {|i|
    yield(i)
  }
end

#factors ⇒ `Object`

Retrieves the unique values of data.

# File 'lib/statsample/vector.rb', line 753

def factors
  if @type==:numeric
    @numeric_data.uniq.sort
  elsif @type==:date
    @date_data_with_nils.uniq.sort
  else
    @valid_data.uniq.sort
  end
end

#frequencies ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 765

def frequencies
  Statsample::STATSAMPLE__.frequencies(@valid_data)
end

#has_missing_data? ⇒ `Boolean` Also known as: flawed?

Returns true if data has one or more missing values

Returns:

(Boolean)



# File 'lib/statsample/vector.rb', line 365

def has_missing_data?
  @has_missing_data
end

#histogram(bins = 10) ⇒ `Object`

With a Fixnum, creates that number of bins within the range of data. With an Array, each value will be a cut point

# File 'lib/statsample/vector.rb', line 1050

def histogram(bins=10)
  check_type :numeric

  if bins.is_a? Array
    #h=Statsample::Histogram.new(self, bins)
    h=Statsample::Histogram.alloc(bins)
  else
    # ugly patch. The upper limit for a bin has the form
    # x < range
    #h=Statsample::Histogram.new(self, bins)
    min,max=Statsample::Util.nice(@valid_data.min,@valid_data.max)
    # fix last data
    if max==@valid_data.max
      max+=1e-10
    end
    h=Statsample::Histogram.alloc(bins,[min,max])
    # Fix last bin

  end
  h.increment(@valid_data)
  h
end

#inspect ⇒ `Object`



# File 'lib/statsample/vector.rb', line 749

def inspect
  self.to_s
end

#is_valid?(x) ⇒ `Boolean`

Return true if a value is valid (not nil and not included on missing values)

Returns:

(Boolean)



# File 'lib/statsample/vector.rb', line 403

def is_valid?(x)
  !(x.nil? or @missing_values.include? x)
end

#jacknife(estimators, k = 1) ⇒ `Object`

Jacknife

Returns a dataset with jacknife delete-k estimators. estimators could be:

a) Hash with variable names as keys and lambdas as values

a.jacknife(:log_s2=>lambda {|v| Math.log(v.variance)})

b) Array with method names to jacknife

a.jacknife([:mean, :sd])

c) A single method to jacknife

a.jacknife(:mean)

k represents the block size for block jacknife. By default it is set to 1, for classic delete-one jacknife.

Returns a dataset where each vector is a vector of length cases/k containing the computed jacknife estimates.

Reference:

Sawyer, S. (2005). Resampling Data: Using a Statistical Jacknife.

# File 'lib/statsample/vector.rb', line 604

def jacknife(estimators, k=1)
  raise "n should be divisible by k:#{k}" unless n%k==0

  nb=(n / k).to_i


  h_est, es, ps= prepare_bootstrap(estimators)

  est_n=es.inject({}) {|h,v|
    h[v]=h_est[v].call(self)
    h
  }


  nb.times do |i|
    other=@data_with_nils.dup
    other.slice!(i*k,k)
    other=other.to_numeric
    es.each do |estimator|
      # Add pseudovalue
      ps[estimator].push( nb * est_n[estimator] - (nb-1) * h_est[estimator].call(other))
    end
  end


  es.each do |est|
    ps[est]=ps[est].to_numeric
    ps[est].type=:numeric
  end
  ps.to_dataset
end
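
The pseudovalue formula used in the listing, nb * theta_full - (nb - 1) * theta_without_block, can be sketched on a plain Array. `jackknife_pseudovalues` is a hypothetical helper that takes a single estimator as a block and returns an Array rather than a dataset.

```ruby
# Hypothetical helper: delete-k jackknife pseudovalues for one estimator.
def jackknife_pseudovalues(data, k = 1)
  raise ArgumentError, "n should be divisible by k" unless (data.size % k).zero?
  nb = data.size / k
  full = yield(data) # estimate on the complete data
  Array.new(nb) do |i|
    rest = data.dup
    rest.slice!(i * k, k) # delete the i-th block of k cases
    nb * full - (nb - 1) * yield(rest)
  end
end

ps = jackknife_pseudovalues([1, 2, 3, 4]) { |a| a.sum.to_f / a.size }
puts ps.inspect
```

For the mean with k = 1, each pseudovalue equals the deleted observation (up to floating-point rounding), which makes a convenient sanity check.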

#kurtosis(m = nil) ⇒ `Object`

Kurtosis of the sample

# File 'lib/statsample/vector.rb', line 1034

def kurtosis(m=nil)
    check_type :numeric
    m||=mean
    fo=@numeric_data.inject(0){|a,x| a+((x-m)**4)}
    fo.quo((@numeric_data.size)*sd(m)**4)-3

end
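
Read as a formula, the listing computes sum((x - m)^4) / (n * sd^4) - 3, where sd is the sample standard deviation (denominator n-1). A plain-Array sketch, with `sample_sd` and `excess_kurtosis` as hypothetical helper names:

```ruby
# Hypothetical helper: sample standard deviation around m (denominator n-1).
def sample_sd(values, m)
  Math.sqrt(values.sum { |x| (x - m)**2 } / (values.size - 1).to_f)
end

# Hypothetical helper: fourth central moment scaled by n * sd^4, minus 3.
def excess_kurtosis(values)
  m = values.sum.to_f / values.size
  fourth = values.sum { |x| (x - m)**4 }
  fourth / (values.size * sample_sd(values, m)**4) - 3
end

# For [1, 2, 3, 4, 5]: fourth = 34, sd^4 = 6.25, so 34 / 31.25 - 3 = -1.912
puts excess_kurtosis([1, 2, 3, 4, 5]).round(3)
```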

#labeling(x) ⇒ `Object` Also known as: label

Retrieves label for value x. Retrieves x if no label defined.



# File 'lib/statsample/vector.rb', line 372

def labeling(x)
  @labels.has_key?(x) ? @labels[x].to_s : x.to_s
end

#max ⇒ `Object`

Maximum value

# File 'lib/statsample/vector.rb', line 920

def max
  check_type :numeric
  @valid_data.max
end

#mean ⇒ `Object`

The arithmetical mean of data

# File 'lib/statsample/vector.rb', line 966

def mean
  check_type :numeric
  sum.to_f.quo(n_valid)
end

#median ⇒ `Object`

Returns the median (50th percentile)

# File 'lib/statsample/vector.rb', line 910

def median
  check_type :numeric
  percentil(50)
end

#median_absolute_deviation ⇒ `Object` Also known as: mad

# File 'lib/statsample/vector.rb', line 1008

def median_absolute_deviation
  med=median
  recode {|x| (x-med).abs}.median
end

#min ⇒ `Object`

Minimum value

# File 'lib/statsample/vector.rb', line 915

def min
  check_type :numeric
  @valid_data.min
end

#mode ⇒ `Object`

Returns the most frequent item.



# File 'lib/statsample/vector.rb', line 784

def mode
  frequencies.max{|a,b| a[1]<=>b[1]}.first
end

#n_valid ⇒ `Object`

The number of items with valid data.



# File 'lib/statsample/vector.rb', line 788

def n_valid
  @valid_data.size
end

#percentil(q, strategy = :midpoint) ⇒ `Object`

Returns the value of the percentile q.

Accepts an optional second argument specifying the strategy to interpolate when the requested percentile lies between two data points a and b. Valid strategies are:

:midpoint (default): (a + b) / 2
:linear: a + (b - a) * d, where d is the decimal part of the index between a and b. This is the NIST recommended method (en.wikipedia.org/wiki/Percentile#NIST_method)

# File 'lib/statsample/vector.rb', line 868

def percentil(q, strategy = :midpoint)
  check_type :numeric
  sorted=@valid_data.sort

  case strategy
  when :midpoint
    v = (n_valid * q).quo(100)
    if(v.to_i!=v)
      sorted[v.to_i]
    else
      (sorted[(v-0.5).to_i].to_f + sorted[(v+0.5).to_i]).quo(2)
    end
  when :linear
    index = (q / 100.0) * (n_valid + 1)

    k = index.truncate
    d = index % 1

    if k == 0
      sorted[0]
    elsif k >= sorted.size
      sorted[-1]
    else
      sorted[k - 1] + d * (sorted[k] - sorted[k - 1])
    end
  else
    raise NotImplementedError.new "Unknown strategy #{strategy.to_s}"
  end
end
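
To compare the two strategies on concrete numbers, the listing can be mirrored on a plain Array. `percentile_of` is a hypothetical standalone function; it uses Float division where the listing uses `quo`, and it drops nils instead of consulting the vector's valid data.

```ruby
# Hypothetical helper: percentile q of values using either strategy.
def percentile_of(values, q, strategy = :midpoint)
  sorted = values.compact.sort
  n = sorted.size
  case strategy
  when :midpoint
    v = (n * q) / 100.0
    if v.to_i != v
      sorted[v.to_i] # position falls inside one item
    else
      (sorted[(v - 0.5).to_i] + sorted[(v + 0.5).to_i]) / 2.0 # average neighbours
    end
  when :linear
    index = (q / 100.0) * (n + 1)
    k = index.truncate
    d = index % 1 # fractional part used for interpolation
    if k.zero?
      sorted[0]
    elsif k >= n
      sorted[-1]
    else
      sorted[k - 1] + d * (sorted[k] - sorted[k - 1])
    end
  else
    raise NotImplementedError, "Unknown strategy #{strategy}"
  end
end

puts percentile_of([1, 2, 3, 4], 50)          # => 2.5
puts percentile_of([1, 2, 3, 4, 5], 50)       # => 3
puts percentile_of([1, 2, 3, 4], 25, :linear) # => 1.25
```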

#product ⇒ `Object`

Product of all values on the sample

# File 'lib/statsample/vector.rb', line 1043

def product
    check_type :numeric
    @numeric_data.inject(1){|a,x| a*x }
end

#proportion(v = 1) ⇒ `Object`

Proportion of a given value.



# File 'lib/statsample/vector.rb', line 800

def proportion(v=1)
    frequencies[v].quo(@valid_data.size)
end

#proportion_confidence_interval_t(n_poblation, margin = 0.95, v = 1) ⇒ `Object`



# File 'lib/statsample/vector.rb', line 840

def proportion_confidence_interval_t(n_poblation,margin=0.95,v=1)
  Statsample::proportion_confidence_interval_t(proportion(v), @valid_data.size, n_poblation, margin)
end

#proportion_confidence_interval_z(n_poblation, margin = 0.95, v = 1) ⇒ `Object`



# File 'lib/statsample/vector.rb', line 843

def proportion_confidence_interval_z(n_poblation,margin=0.95,v=1)
  Statsample::proportion_confidence_interval_z(proportion(v), @valid_data.size, n_poblation, margin)
end

#proportions ⇒ `Object`

Returns a hash with the distribution of proportions of the sample.

# File 'lib/statsample/vector.rb', line 793

def proportions
    frequencies.inject({}){|a,v|
        a[v[0]] = v[1].quo(n_valid)
        a
    }
end
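
A plain-Array sketch of the same computation. `proportions_of` is a hypothetical name; Rational is used to mirror what Integer#quo returns for integer counts, and nils stand in for missing data.

```ruby
# Hypothetical helper: relative frequency of each valid value.
def proportions_of(values)
  valid = values.compact
  counts = valid.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
  counts.transform_values { |c| Rational(c, valid.size) }
end

puts proportions_of([1, 1, 0, nil]).inspect
# => {1=>(2/3), 0=>(1/3)}
```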

#push(v) ⇒ `Object`

# File 'lib/statsample/vector.rb', line 276

def push(v)
  @data.push(v)
  set_valid_data
end

#range ⇒ `Object`

The range of the data (max - min)

# File 'lib/statsample/vector.rb', line 956

def range;
  check_type :numeric
  @numeric_data.max - @numeric_data.min
end

#ranked(type = :numeric) ⇒ `Object`

Returns a ranked vector.

# File 'lib/statsample/vector.rb', line 899

def ranked(type=:numeric)
  check_type :numeric
  i=0
  r=frequencies.sort.inject({}){|a,v|
    a[v[0]]=(i+1 + i+v[1]).quo(2)
    i+=v[1]
    a
  }
  @data.collect {|c| r[c] }.to_vector(type)
end
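
The listing assigns tied values the average of the positions they occupy: a value with frequency f that starts after i earlier items gets rank (i + 1 + i + f) / 2. A plain-Array sketch, with `average_ranks` as a hypothetical name and Float ranks in place of Rational ones:

```ruby
# Hypothetical helper: ranks with ties sharing the mean of their positions.
def average_ranks(values)
  freq = values.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
  i = 0
  rank_of = {}
  freq.sort.each do |value, f|
    rank_of[value] = (i + 1 + i + f) / 2.0 # mean of positions i+1 .. i+f
    i += f
  end
  values.map { |v| rank_of[v] }
end

puts average_ranks([10, 20, 20, 30]).inspect # => [1.0, 2.5, 2.5, 4.0]
```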

#recode(type = nil) ⇒ `Object`

Returns a new vector, with data modified by block. Equivalent to creating a Vector after calling #collect on data

# File 'lib/statsample/vector.rb', line 262

def recode(type=nil)
  type||=@type
  @data.collect{|x|
    yield x
  }.to_vector(type)
end

#recode! ⇒ `Object`

Modifies current vector, with data modified by block. Equivalent to #collect! on @data

# File 'lib/statsample/vector.rb', line 270

def recode!
  @data.collect!{|x|
    yield x
  }
  set_valid_data
end

#report_building(b) ⇒ `Object`

# File 'lib/statsample/vector.rb', line 803

def report_building(b)
  b.section(:name=>name) do |s|
    s.text _("n :%d") % n
    s.text _("n valid:%d") % n_valid
    if @type==:object
      s.text  _("factors:%s") % factors.join(",")
      s.text   _("mode: %s") % mode

      s.table(:name=>_("Distribution")) do |t|
        frequencies.sort.each do |k,v|
          key=labels.has_key?(k) ? labels[k]:k
          t.row [key, v , ("%0.2f%%" % (v.quo(n_valid)*100))]
        end
      end
    end

    s.text _("median: %s") % median.to_s if(@type==:numeric or @type==:numeric)
    if(@type==:numeric)
      s.text _("mean: %0.4f") % mean
      if sd
        s.text _("std.dev.: %0.4f") % sd
        s.text _("std.err.: %0.4f") % se
        s.text _("skew: %0.4f") % skew
        s.text _("kurtosis: %0.4f") % kurtosis
      end
    end
  end
end

#sample_with_replacement(sample = 1) ⇒ `Object`

Returns a random sample of the given size, with replacement, drawn only from valid data.

In every trial, each item has the same probability of being selected.

# File 'lib/statsample/vector.rb', line 666

def sample_with_replacement(sample=1)
  vds=@valid_data.size
  (0...sample).collect{ @valid_data[rand(vds)] }
end

#sample_without_replacement(sample = 1) ⇒ `Object`

Returns a random sample of the given size, without replacement, drawn only from valid data.

Each element can be selected only once.

A sample of the same size as the vector is the vector itself.

Raises:

(ArgumentError)

# File 'lib/statsample/vector.rb', line 677

def sample_without_replacement(sample=1)
  raise ArgumentError, "Sample size couldn't be greater than n" if sample>@valid_data.size
  out=[]
  size=@valid_data.size
  while out.size<sample
    value=rand(size)
    out.push(value) if !out.include?value
  end
  out.collect{|i| @data[i]}
end
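
The rejection loop can be sketched on a plain Array. `sample_unique` is a hypothetical name and `rng:` is added only for repeatability. Note that the listing above indexes @data with positions drawn from the size of @valid_data; this sketch indexes the valid values themselves, which is an assumption about the intended behaviour.

```ruby
# Hypothetical helper: draws distinct positions until enough are collected.
def sample_unique(valid, size, rng: Random.new)
  raise ArgumentError, "Sample size couldn't be greater than n" if size > valid.size
  picked = []
  while picked.size < size
    i = rng.rand(valid.size)
    picked << i unless picked.include?(i) # reject positions already drawn
  end
  picked.map { |i| valid[i] }
end

s = sample_unique([1, 2, 3, 4, 5], 5, rng: Random.new(1))
puts s.sort.inspect # => [1, 2, 3, 4, 5]
```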

#set_valid_data ⇒ `Object`

Update valid_data, missing_data, data_with_nils and gsl at the end of an insertion.

Use after Vector#add(v, false). Usage:

v=Statsample::Vector.new
v.add(2,false)
v.add(4,false)
v.data
=> [2,4]
v.valid_data
=> []
v.set_valid_data
v.valid_data
=> [2,4]

# File 'lib/statsample/vector.rb', line 333

def set_valid_data
  @valid_data.clear
  @missing_data.clear
  @data_with_nils.clear
  @date_data_with_nils.clear
  set_valid_data_intern
  set_numeric_data if(@type==:numeric)
  set_date_data if(@type==:date)
end

#set_valid_data_intern ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 343

def set_valid_data_intern #:nodoc:
  Statsample::STATSAMPLE__.set_valid_data_intern(self)
end

#size ⇒ `Object` Also known as: n

Size of total data



# File 'lib/statsample/vector.rb', line 388

def size
  @data.size
end

#skew(m = nil) ⇒ `Object`

Skewness of the sample

# File 'lib/statsample/vector.rb', line 1027

def skew(m=nil)
  check_type :numeric
  m||=mean
  th=@numeric_data.inject(0){|a,x| a+((x-m)**3)}
  th.quo((@numeric_data.size)*sd(m)**3)
end
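
The listing computes the sum of cubed deviations divided by n times the cube of the sample standard deviation. A small sketch of that formula on plain arrays follows; the free-standing `skew` method is an illustration only, not the Statsample API.

```ruby
# Skewness as in the listing: sum((x - m)^3) / (n * sd^3),
# where sd is the sample standard deviation (denominator n-1).
def skew(values)
  n    = values.size
  mean = values.sum.to_f / n
  sd   = Math.sqrt(values.sum { |x| (x - mean)**2 } / (n - 1))
  values.sum { |x| (x - mean)**3 } / (n * sd**3)
end

symmetric  = skew([1, 2, 3, 4, 5])    # deviations cancel out
right_tail = skew([1, 2, 3, 4, 20])   # one large value pulls the tail right

puts symmetric
puts right_tail
```

Symmetric data gives a skewness of zero, while a long right tail gives a positive value.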

#split_by_separator(sep = Statsample::SPLIT_TOKEN) ⇒ `Object`

Returns a hash of Vectors, one for each distinct value found after splitting the fields. Each Vector holds 1 where the value is present in the row, 0 where it is absent, and nil where the row is nil. Example (output abbreviated):

a=Vector.new(["a,b","c,d","a,b"])
a.split_by_separator
=>  {"a"=>#<Statsample::Vector @data=[1, 0, 1]>,
     "b"=>#<Statsample::Vector @data=[1, 0, 1]>,
     "c"=>#<Statsample::Vector @data=[0, 1, 0]>,
     "d"=>#<Statsample::Vector @data=[0, 1, 0]>}

# File 'lib/statsample/vector.rb', line 520

def split_by_separator(sep=Statsample::SPLIT_TOKEN)
  split_data=splitted(sep)
  factors=split_data.flatten.uniq.compact
  out=factors.inject({}) {|a,x|
    a[x]=[]
    a
  }
  split_data.each do |r|
    if r.nil?
      factors.each do |f|
        out[f].push(nil)
      end
    else
      factors.each do |f|
        out[f].push(r.include?(f) ? 1:0)
      end
    end
  end
  out.inject({}){|s,v|
    s[v[0]]=Vector.new(v[1],:object)
    s
  }
end
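
The steps in the listing can be traced on plain arrays. This sketch assumes a comma separator (the actual value of Statsample::SPLIT_TOKEN is not shown here) and returns plain arrays rather than Vectors.

```ruby
rows = ["a,b", "c,d", "a,b", nil]

# Step 1: split every non-nil row on the assumed separator.
split = rows.map { |x| x.nil? ? nil : x.split(",") }

# Step 2: collect the distinct values ("factors"), ignoring nils.
factors = split.flatten.uniq.compact

# Step 3: one indicator column per factor; nil rows stay nil.
indicators = factors.each_with_object({}) do |f, out|
  out[f] = split.map { |r| r.nil? ? nil : (r.include?(f) ? 1 : 0) }
end

indicators.each { |factor, column| puts "#{factor}: #{column.inspect}" }
```

Each factor gets a column as long as the input, so the columns can be lined up row by row.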

#split_by_separator_freq(sep = Statsample::SPLIT_TOKEN) ⇒ `Object`

# File 'lib/statsample/vector.rb', line 543

def split_by_separator_freq(sep=Statsample::SPLIT_TOKEN)
  split_by_separator(sep).inject({}) {|a,v|
    # Start from 0 so a leading nil is counted as 0 instead of raising
    a[v[0]]=v[1].inject(0) {|s,x| s+x.to_i}
    a
  }
end
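
A short sketch of the frequency step, starting from indicator columns like those produced by split_by_separator. The `indicators` hash is hypothetical sample data; plain arrays stand in for Vectors.

```ruby
# Hypothetical indicator columns (1 = present, 0 = absent, nil = missing row).
indicators = {
  "a" => [1, 0, 1, nil],
  "c" => [0, 1, 0, nil]
}

# Sum each column; nil.to_i is 0, so missing rows do not count.
freq = indicators.transform_values { |col| col.sum { |x| x.to_i } }

puts freq.inspect
```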

#splitted(sep = Statsample::SPLIT_TOKEN) ⇒ `Object`

Returns an array with the data split by a separator.

a=Vector.new(["a,b","c,d","a,b","d"])
a.splitted
=> [["a","b"],["c","d"],["a","b"],["d"]]

# File 'lib/statsample/vector.rb', line 496

def splitted(sep=Statsample::SPLIT_TOKEN)
  @data.collect{|x|
    if x.nil?
      nil
    elsif (x.respond_to? :split)
      x.split(sep)
    else
      [x]
    end
  }
end

#standard_deviation_population(m = nil) ⇒ `Object` Also known as: sdp

Population Standard deviation (denominator N)

# File 'lib/statsample/vector.rb', line 995

def standard_deviation_population(m=nil)
  check_type :numeric
  Math::sqrt( variance_population(m) )
end

#standard_deviation_sample(m = nil) ⇒ `Object` Also known as: sds, sd

Sample Standard deviation (denominator n-1)

# File 'lib/statsample/vector.rb', line 1021

def standard_deviation_sample(m=nil)
  check_type :numeric
  m||=mean
  Math::sqrt(variance_sample(m))
end

#standard_error ⇒ `Object` Also known as: se

Standard error of the mean, calculated as sd/sqrt(n), where sd is the sample standard deviation and n is the number of valid values.



# File 'lib/statsample/vector.rb', line 1081

def standard_error
  standard_deviation_sample.quo(Math.sqrt(valid_data.size))
end
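
A worked sketch of sd/sqrt(n) on a plain array. The free-standing `standard_error` method is an illustration, not the Statsample API.

```ruby
def standard_error(values)
  n    = values.size
  mean = values.sum.to_f / n
  # Sample standard deviation (denominator n-1).
  sd   = Math.sqrt(values.sum { |x| (x - mean)**2 } / (n - 1))
  sd / Math.sqrt(n)
end

# Mean is 5 and the squared deviations sum to 32,
# so se^2 = (32 / 7) / 8 = 32 / 56.
se = standard_error([2, 4, 4, 4, 5, 5, 7, 9])
puts se
```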

#sum ⇒ `Object`

The sum of values for the data

# File 'lib/statsample/vector.rb', line 961

def sum
  check_type :numeric
  @numeric_data.inject(0){|a,x|x+a}
end

#sum_of_squared_deviation ⇒ `Object`

Sum of squared deviation

# File 'lib/statsample/vector.rb', line 980

def sum_of_squared_deviation
  check_type :numeric
  @numeric_data.inject(0) {|a,x| x.square+a} - (sum.square.quo(n_valid))
end

#sum_of_squares(m = nil) ⇒ `Object` Also known as: ss

Sum of squares for the data around a value. By default, this value is the mean

ss= sum{(xi-m)^2}

# File 'lib/statsample/vector.rb', line 974

def sum_of_squares(m=nil)
  check_type :numeric
  m||=mean
  @numeric_data.inject(0){|a,x| a+(x-m).square}
end
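
When m is the mean, sum_of_squares and sum_of_squared_deviation compute the same quantity in two ways: directly as the sum of (x - mean)^2, or through the shortcut sum(x^2) - (sum x)^2 / n. A quick check on a plain array (illustrative data, not the Statsample API):

```ruby
values = [2, 4, 4, 4, 5, 5, 7, 9]
n      = values.size
mean   = values.sum.to_f / n

# Direct form, as in sum_of_squares with m = mean.
around_mean = values.sum { |x| (x - mean)**2 }

# Shortcut form, as in sum_of_squared_deviation.
shortcut = values.sum { |x| x**2 } - (values.sum**2).to_f / n

puts around_mean
puts shortcut
```

Both forms give 32 for this data: the squares sum to 232 and (40^2)/8 is 200.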

#to_a ⇒ `Object` Also known as: to_ary

# File 'lib/statsample/vector.rb', line 423

def to_a
  if @data.is_a? Array
    @data.dup
  else
    @data.to_a
  end
end

#to_matrix(dir = :horizontal) ⇒ `Object`

Despite the name, this creates a Matrix from Ruby's standard 'matrix' library. dir can be :horizontal (a single-row matrix) or :vertical (a single-column matrix).

# File 'lib/statsample/vector.rb', line 741

def to_matrix(dir=:horizontal)
  case dir
  when :horizontal
    Matrix[@data]
  when :vertical
    Matrix.columns([@data])
  end
end
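
The two branches can be tried directly with Ruby's 'matrix' library. In this sketch the `data` array is a hypothetical stand-in for the vector's @data.

```ruby
require 'matrix'

data = [1, 2, 3]

horizontal = Matrix[data]            # one row, three columns
vertical   = Matrix.columns([data])  # three rows, one column

puts "#{horizontal.row_count} x #{horizontal.column_count}"
puts "#{vertical.row_count} x #{vertical.column_count}"
```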

#to_REXP ⇒ `Object`



# File 'lib/statsample/rserve_extension.rb', line 6

def to_REXP
  Rserve::REXP::Wrapper.wrap(data_with_nils)
end

#to_s ⇒ `Object`



# File 'lib/statsample/vector.rb', line 736

def to_s
  sprintf("Vector(type:%s, n:%d)[%s]",@type.to_s,@data.size, @data.collect{|d| d.nil? ? "nil":d}.join(","))
end

#variance_population(m = nil) ⇒ `Object`

Population variance (denominator N)

# File 'lib/statsample/vector.rb', line 986

def variance_population(m=nil)
  check_type :numeric
  m||=mean
  squares=@numeric_data.inject(0){|a,x| x.square+a}
  squares.quo(n_valid) - m.square
end

#variance_proportion(n_poblation, v = 1) ⇒ `Object`

Variance of p, according to population size



# File 'lib/statsample/vector.rb', line 833

def variance_proportion(n_poblation, v=1)
  Statsample::proportion_variance_sample(self.proportion(v), @valid_data.size, n_poblation)
end

#variance_sample(m = nil) ⇒ `Object` Also known as: variance

Sample Variance (denominator n-1)

# File 'lib/statsample/vector.rb', line 1014

def variance_sample(m=nil)
  check_type :numeric
  m||=mean
  sum_of_squares(m).quo(n_valid - 1)
end
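
The population and sample variances differ only in the denominator. A brief sketch on a plain array (illustrative data, not the Statsample API) also checks the mean-of-squares form used by variance_population:

```ruby
values = [2, 4, 4, 4, 5, 5, 7, 9]
n      = values.size
mean   = values.sum.to_f / n
ss     = values.sum { |x| (x - mean)**2 }   # sum of squares around the mean

variance_population = ss / n         # denominator N
variance_sample     = ss / (n - 1)   # denominator n-1

# Equivalent population form: mean of squares minus squared mean.
mean_of_squares_form = values.sum { |x| x**2 }.to_f / n - mean**2

puts variance_population
puts variance_sample
puts mean_of_squares_form
```

The sample variance is always the larger of the two, since it divides the same sum by a smaller number.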

#variance_total(n_poblation, v = 1) ⇒ `Object`

Variance of p, according to population size



# File 'lib/statsample/vector.rb', line 837

def variance_total(n_poblation, v=1)
  Statsample::total_variance_sample(self.proportion(v), @valid_data.size, n_poblation)
end

#vector_centered ⇒ `Object` Also known as: centered

Return a centered vector

# File 'lib/statsample/vector.rb', line 210

def vector_centered
  check_type :numeric
  m=mean
  return ([nil]*size).to_numeric if mean.nil?
  vector=vector_centered_compute(m)
  vector.name=_("%s(centered)") % @name
  vector
end

#vector_centered_compute(m) ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 206

def vector_centered_compute(m) #:nodoc:
  @data_with_nils.collect {|x| x.nil? ? nil : x.to_f-m }.to_numeric
end

#vector_labeled ⇒ `Object`

Returns a Vector in which every value that has a label is replaced by that label; values without a label are kept as they are.

# File 'lib/statsample/vector.rb', line 377

def vector_labeled
  d=@data.collect{|x|
    if @labels.has_key? x
      @labels[x]
    else
      x
    end
  }
  Vector.new(d,@type)
end

#vector_percentil ⇒ `Object`

Returns a vector in which each value is replaced by its percentile.

# File 'lib/statsample/vector.rb', line 223

def vector_percentil
  check_type :numeric
  c=@valid_data.size
  vector=ranked.map {|i| i.nil? ? nil : (i.quo(c)*100).to_f }.to_vector(@type)
  vector.name=_("%s(percentil)")  % @name
  vector
end

#vector_standarized(use_population = false) ⇒ `Object` Also known as: standarized

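Returns a vector of standardized values (z-scores). By default the sample standard deviation (denominator n-1) is used; pass use_population=true to use the population standard deviation instead. If the standard deviation is 0 or the mean is nil, returns a vector of the same size filled with nils.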

# File 'lib/statsample/vector.rb', line 197

def vector_standarized(use_population=false)
  check_type :numeric
  m=mean
  sd=use_population ? sdp : sds
  return ([nil]*size).to_numeric if mean.nil? or sd==0.0
  vector=vector_standarized_compute(m,sd)
  vector.name=_("%s(standarized)")  % @name
  vector
end
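
A minimal sketch of the standardization logic on a plain array with nils. The `standardize` method and its keyword argument are hypothetical names for this illustration, not the Statsample API.

```ruby
def standardize(data_with_nils, use_population: false)
  all_nil = Array.new(data_with_nils.size)   # same size, filled with nils
  valid   = data_with_nils.compact
  return all_nil if valid.empty?             # no mean available

  n    = valid.size
  mean = valid.sum.to_f / n
  ss   = valid.sum { |x| (x - mean)**2 }
  sd   = Math.sqrt(ss / (use_population ? n : n - 1))
  return all_nil if sd.zero? || sd.nan?      # no spread to scale by

  # Missing entries stay nil; the rest become (x - mean) / sd.
  data_with_nils.map { |x| x.nil? ? nil : (x - mean) / sd }
end

puts standardize([1, nil, 3], use_population: true).inspect
puts standardize([1, nil, 3]).inspect
puts standardize([5, 5, 5]).inspect
```

Missing values keep their positions, so the result lines up with the original data.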

#vector_standarized_compute(m, sd) ⇒ `Object`

:nodoc:



# File 'lib/statsample/vector.rb', line 190

def vector_standarized_compute(m,sd) # :nodoc:
  @data_with_nils.collect{|x| x.nil? ? nil : (x.to_f - m).quo(sd) }.to_vector(:numeric)
end

#verify ⇒ `Object`

Reports all values that do not comply with a condition given as a block. Returns a hash that maps the index of each invalid value to the value itself.

# File 'lib/statsample/vector.rb', line 456

def verify
  h={}
  (0...@data.size).to_a.each{|i|
    if !(yield @data[i])
      h[i]=@data[i]
    end
  }
  h
end
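
A small usage sketch of the same pattern on a plain array. The free-standing `verify` method takes the data as an argument and is an illustration only, not the Statsample API.

```ruby
# Returns { index => value } for every value that fails the block's condition.
def verify(data)
  invalid = {}
  data.each_with_index do |value, i|
    invalid[i] = value unless yield(value)
  end
  invalid
end

bad = verify([1, -2, 3, -4]) { |x| x > 0 }
puts bad.inspect
```

Here the condition is "value is positive", so the result lists the positions and values of the two negative entries.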

