Class: FatTable::Column

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/fat_table/column.rb

Overview

Column objects are a thin wrapper around an Array that allows columns to be summed and to have other aggregate operations performed on them, with nils compacted out before proceeding. A Column is characterized by a header, which gives the Column a name; a type, which limits the kinds of items that can be stored in the Column; and the items themselves, each of which must be either nil or an object compatible with the Column's type. The valid types are Boolean, DateTime, Numeric, String, and NilClass, the last of which is the initial type until items added to the Column fix its type as one of the others.

Constant Summary

TYPES =

Valid Column types as strings.

%w[NilClass Boolean DateTime Numeric String].freeze
VALID_AGGREGATES =

The names of the known aggregate operations that can be performed on a Column.

%s(first last range
sum count min max
avg var pvar dev pdev
any? all? none? one?)


Constructor Details

#initialize(header:, items: [], type: 'NilClass', tolerant: false) ⇒ Column

Create a new Column with the given +header+, initialized with the given +items+, an array of Ruby objects of one of the permissible types or of strings parseable as one of those types. If no +items+ are passed, return an empty Column to which items may be added with the Column#<< method. Each item must be one of the following types or a string parseable as one of them:

Boolean:: an object of type TrueClass or FalseClass, or a string that is 't', 'true', 'y', 'yes', 'f', 'false', 'n', or 'no', matched without regard to case.

DateTime:: an object of class Date or DateTime, or a string that matches +/\d\d\d\d[-\/]\d\d?[-\/]\d\d?/+ and is parseable by DateTime.parse.

Numeric:: an object that is of class Numeric, or a string that looks like a number after removing '+$+', '+,+', and '+_+', as well as Rationals in the form <number>:<number> or <number>/<number>, where <number> is an integer.

String:: if the object is a non-blank string that does not parse as any of the foregoing, it is treated as a String type, and once a column is typed as such, blank strings represent blank strings rather than nil values.

NilClass:: until a Column sees an item that qualifies as one of the foregoing, it is typed as NilClass, meaning that the type is undetermined. Until a column obtains a type, blank strings are treated as nils and do not affect the type of the column. After a column acquires a type, blank strings are treated as nil values except in the case of String columns, which retain them as blank strings.
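
The classification rules above can be sketched with the standard library alone. This is only an illustration of the documented rules, not FatTable's implementation: +guess_type+ is a hypothetical helper, and its numeric patterns are an assumption about what "looks like a number" means.

```ruby
require 'date'

# Hypothetical helper, not part of FatTable: guess which Column type a raw
# value would fall under, following the rules described above.
def guess_type(val)
  return 'Boolean' if val == true || val == false
  return 'DateTime' if val.is_a?(Date) # DateTime is a subclass of Date
  return 'Numeric' if val.is_a?(Numeric)
  return 'NilClass' if val.nil?

  str = val.to_s.strip
  return 'NilClass' if str.empty? # blank strings do not fix a type
  return 'Boolean' if str.match?(/\A(t|true|y|yes|f|false|n|no)\z/i)

  if str.match?(/\d\d\d\d[-\/]\d\d?[-\/]\d\d?/)
    begin
      DateTime.parse(str)
      return 'DateTime'
    rescue ArgumentError
      # Matched the pattern but is not a real date; keep checking.
    end
  end

  # Assumed numeric syntax: optional sign, digits, optional decimal part.
  cleaned = str.delete('$,_')
  return 'Numeric' if cleaned.match?(/\A[-+]?(\d+\.?\d*|\.\d+)\z/)
  # Rationals written as <integer>:<integer> or <integer>/<integer>.
  return 'Numeric' if cleaned.match?(%r{\A[-+]?\d+\s*[:/]\s*[-+]?\d+\z})

  'String'
end

guess_type('Yes')        # Boolean
guess_type('2017-05-04') # DateTime
guess_type('$18_321')    # Numeric
guess_type('35:14')      # Numeric
guess_type('hello')      # String
```

The order of the checks matters in this sketch: booleans and dates are tested before numbers so that a string such as '2017/05/04' is not mistaken for a malformed rational.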

Examples:


require 'fat_table'
col = FatTable::Column.new(header: 'date')
col << Date.today - 30
col << '2017-05-04'
col.type #=> 'DateTime'
col.header #=> :date
nums = [35.25, 18, '35:14', '$18_321']
col = FatTable::Column.new(header: :prices, items: nums)
col.type #=> 'Numeric'
col.header #=> :prices
col.sum #=> 18376.75

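Parameters:

  • header
  • items (defaults to: [])
  • type (defaults to: 'NilClass')
  • tolerant (defaults to: false)

Raises:

  • (UserError)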



# File 'lib/fat_table/column.rb', line 90

def initialize(header:, items: [], type: 'NilClass', tolerant: false)
  @raw_header = header
  @header =
    if @raw_header.is_a?(Symbol)
      @raw_header
    else
      @raw_header.to_s.as_sym
    end
  @type = type
  @tolerant = tolerant
  msg = "unknown column type '#{type}'"
  raise UserError, msg unless TYPES.include?(@type.to_s)

  @items = []
  items.each { |i| self << i }
end

Instance Attribute Details

#header ⇒ Object (readonly)

The symbol representing this Column.



# File 'lib/fat_table/column.rb', line 15

def header
  @header
end

#items ⇒ Object (readonly)

An Array of the items of this Column, all of which must be values of the Column's type or a nil. This Array contains the value of the item after conversion to a native Ruby type, such as TrueClass, Date, DateTime, Integer, String, etc. Thus, you can perform operations on the items, perhaps after removing nils with +.items.compact+.



# File 'lib/fat_table/column.rb', line 30

def items
  @items
end

#raw_header ⇒ Object (readonly)

The header as provided by the caller before its conversion to a symbol. You can use this to recover the original string form of the header.



# File 'lib/fat_table/column.rb', line 19

def raw_header
  @raw_header
end

#tolerant ⇒ Object

Returns the value of attribute tolerant.



# File 'lib/fat_table/column.rb', line 32

def tolerant
  @tolerant
end

#type ⇒ Object (readonly)

A string representing the deduced type of this Column. One of Column::TYPES.



# File 'lib/fat_table/column.rb', line 23

def type
  @type
end

Instance Method Details

#+(other) ⇒ Object

Return a new Column appending the items of other to this Column's items, checking for type compatibility. Use the header of this Column as the header of the new Column.

Raises:

  • (UserError)



# File 'lib/fat_table/column.rb', line 471

def +(other)
  msg = 'cannot combine columns with different types'
  raise UserError, msg unless type == other.type
  Column.new(header: header, items: items + other.items)
end

#<<(itm) ⇒ Object

Append +itm+ to the end of the Column after converting it to the Column's type. If the Column's type is still open, i.e., NilClass, attempt to fix the Column's type based on the type of +itm+, as with Column.new. If it is a tolerant column, respond to type errors by converting the column to a String type.



# File 'lib/fat_table/column.rb', line 456

def <<(itm)
  items << convert_and_set_type(itm)
rescue IncompatibleTypeError => ex
  if tolerant?
    items << Convert.convert_to_string(itm)
  else
    raise ex
  end
end
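
The rescue-and-fall-back pattern used by #<< can be tried in isolation. The following is a minimal sketch using only core Ruby; +TinyColumn+, +TypeMismatch+, and the integer-only conversion are hypothetical stand-ins for illustration, not FatTable classes.

```ruby
# Hypothetical stand-in for FatTable's IncompatibleTypeError.
class TypeMismatch < StandardError; end

# Hypothetical, integer-only column used to illustrate tolerant appends.
class TinyColumn
  attr_reader :items

  def initialize(tolerant: false)
    @items = []
    @tolerant = tolerant
  end

  # Append the converted item; on a type error, either re-raise (strict)
  # or keep the offending value in string form (tolerant).
  def <<(itm)
    items << convert(itm)
    self
  rescue TypeMismatch
    raise unless @tolerant

    items << itm.to_s
    self
  end

  private

  # Accept nil and anything Integer() can parse; reject everything else.
  def convert(itm)
    return nil if itm.nil?

    Integer(itm)
  rescue ArgumentError, TypeError
    raise TypeMismatch, "cannot convert #{itm.inspect} to Integer"
  end
end

strict = TinyColumn.new
strict << '42'

tolerant = TinyColumn.new(tolerant: true)
tolerant << '42' << 'forty-two'
tolerant.items # [42, "forty-two"]
```

Appending 'forty-two' to the strict column in this sketch raises +TypeMismatch+, while the tolerant column stores the string unchanged.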

#[](idx) ⇒ Object

Return the item of the Column at the given index.



# File 'lib/fat_table/column.rb', line 114

def [](idx)
  items[idx]
end

#all? ⇒ Boolean

Return true if all of the items in the Column are true; otherwise return false. Also return false if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 404

def all?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('all?', 'Boolean')
  items.filter_to_type(type).all?
end

#any? ⇒ Boolean

Return true if any of the items in the Column are true; otherwise return false. Also return false if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 393

def any?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('any?', 'Boolean')
  items.filter_to_type(type).any?
end

#avg ⇒ Object

Return the average value of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number, computes the average, and then converts the average back to a DateTime.



# File 'lib/fat_table/column.rb', line 300

def avg
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('avg', 'DateTime', 'Numeric')
  itms = items.filter_to_type(type)
  size = itms.size.to_d
  if type == 'DateTime'
    avg_jd = itms.map(&:jd).sum / size
    DateTime.jd(avg_jd)
  else
    itms.sum / size
  end
end
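
The Julian-day approach to averaging dates can be sketched with the standard library alone. +average_date+ is a hypothetical helper, not a FatTable method, and it differs from the listing above in two labeled ways: it rounds to the nearest whole day and returns a Date, and it returns nil rather than 0 when there are no non-nil items.

```ruby
require 'date'

# Hypothetical helper: average a list of dates by way of their Julian day
# numbers, skipping nils.
def average_date(dates)
  present = dates.compact
  return nil if present.empty?

  # Rational keeps the mean exact until the final rounding step.
  avg_jd = Rational(present.sum(&:jd), present.size)
  Date.jd(avg_jd.round)
end

average_date([Date.new(2020, 1, 1), nil, Date.new(2020, 1, 3)])
# the midpoint of the two dates, 2020-01-02
```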

#count ⇒ Object

Return a count of the non-nil items in the Column, or the size of the column if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 230

def count
  return items.size if items.all?(&:nil?)

  if type == 'String'
    items.reject(&:blank?).count.to_d
  else
    items.filter_to_type(type).count.to_d
  end
end

#dev ⇒ Object

Return the sample standard deviation of the non-nil items in the Column, or 0 if all items are nil. It is computed as the square root of the sample variance, which uses a divisor of N-1 and is the unbiased estimator of the population variance. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the standard deviation of those numbers.



# File 'lib/fat_table/column.rb', line 367

def dev
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('dev', 'DateTime', 'Numeric')
  var.sqrt(20)
end
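
The call +var.sqrt(20)+ relies on BigDecimal#sqrt, which takes the number of significant digits wanted and returns a BigDecimal. A short sketch contrasting it with Math.sqrt, which works in Float, follows; the value 2 is an arbitrary example input.

```ruby
require 'bigdecimal'

variance = BigDecimal('2')

# BigDecimal#sqrt returns a BigDecimal with at least the requested number
# of significant digits.
precise = variance.sqrt(20)

# Math.sqrt converts its argument to a Float and returns a Float, so the
# result carries ordinary double precision.
approx = Math.sqrt(variance)

precise.class # BigDecimal
approx.class  # Float
```

Which of the two is appropriate depends on whether the caller needs decimal precision beyond what a Float carries.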

#each ⇒ Object

Yield each item in the Column in the order in which they appear in the Column. This makes Columns Enumerable, so all the Enumerable methods are available on a Column.



# File 'lib/fat_table/column.rb', line 177

def each
  if block_given?
    items.each { |itm| yield itm }
    self
  else
    to_enum(:each)
  end
end
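
Defining +each+ this way is the usual Ruby idiom for making a wrapper Enumerable. A minimal sketch with a hypothetical +Wrapper+ class (not FatTable::Column itself) shows what the mixin then provides:

```ruby
# Hypothetical wrapper class used only to illustrate the idiom.
class Wrapper
  include Enumerable

  def initialize(items)
    @items = items
  end

  # Yield each item in order; return an Enumerator when no block is given.
  def each
    return to_enum(:each) unless block_given?

    @items.each { |itm| yield itm }
    self
  end
end

w = Wrapper.new([3, nil, 1, 2])
sorted = w.reject(&:nil?).sort     # [1, 2, 3]
indexed = w.each.with_index.to_a   # [[3, 0], [nil, 1], [1, 2], [2, 3]]
has_nil = w.include?(nil)          # true
```

Note that methods a class defines itself take precedence over the Enumerable versions of the same name, so a Column's own aggregates such as #min, #sum, or #any? are the ones called on a Column.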

#empty? ⇒ Boolean

Return true if there are no items in the Column.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 136

def empty?
  items.empty?
end

#first ⇒ Object

Return the first non-nil item in the Column, or nil if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 203

def first
  return nil if items.all?(&:nil?)

  if type == 'String'
    items.reject(&:blank?).first
  else
    items.filter_to_type(type).first
  end
end

#force_string! ⇒ Object

Force the column to have String type and then convert all items to strings.



# File 'lib/fat_table/column.rb', line 159

def force_string!
  @type = 'String'
  unless empty?
    @items = items.map(&:to_s)
  end
end

#last ⇒ Object

Return the last non-nil item in the Column, or nil if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 216

def last
  return nil if items.all?(&:nil?)

  if type == 'String'
    items.reject(&:blank?).last
  else
    items.filter_to_type(type).last
  end
end

#last_i ⇒ Object

Return the index of the last item in the Column.



# File 'lib/fat_table/column.rb', line 143

def last_i
  size - 1
end

#max ⇒ Object

Return the largest non-nil, non-blank item in the Column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 257

def max
  only_with('max', 'NilClass', 'Numeric', 'String', 'DateTime')
  if type == 'String'
    items.reject(&:blank?).max
  else
    items.filter_to_type(type).max
  end
end

#min ⇒ Object

Return the smallest non-nil, non-blank item in the Column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 244

def min
  only_with('min', 'NilClass', 'Numeric', 'String', 'DateTime')
  if type == 'String'
    items.reject(&:blank?).min
  else
    items.filter_to_type(type).min
  end
end

#none? ⇒ Boolean

Return true if none of the items in the Column are true; otherwise return false. Also return true if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 416

def none?
  return true if type == 'NilClass' || items.all?(&:nil?)

  only_with('none?', 'Boolean')
  items.filter_to_type(type).none?
end

#one? ⇒ Boolean

Return true if precisely one of the items in the Column is true; otherwise return false. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 427

def one?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('one?', 'Boolean')
  items.filter_to_type(type).one?
end
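
The four boolean aggregates share the same treatment of nils, which differs from plain Ruby on one point: an empty Array answers true to +all?+, whereas an all-nil Column is documented to answer false. A small sketch of that behavior follows; +bool_aggregates+ is a hypothetical helper, not FatTable's implementation.

```ruby
# Hypothetical helper mirroring the documented nil handling of the boolean
# aggregates any?, all?, none?, and one?.
def bool_aggregates(items)
  vals = items.compact
  if vals.empty?
    # Nothing but nils: only none? is true.
    return { 'any?' => false, 'all?' => false, 'none?' => true, 'one?' => false }
  end

  {
    'any?' => vals.any?,
    'all?' => vals.all?,
    'none?' => vals.none?,
    'one?' => vals.one?
  }
end

mixed = bool_aggregates([true, nil, false])
blank = bool_aggregates([nil, nil])
mixed['one?'] # true, since exactly one non-nil item is true
blank['all?'] # false, unlike [].all? in plain Ruby
```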

#pdev ⇒ Object

Return the population standard deviation of the non-nil items in the Column, or 0 if all items are nil. It is computed as the square root of the population variance, which uses a divisor of N. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the standard deviation of those numbers.



# File 'lib/fat_table/column.rb', line 382

def pdev
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('pdev', 'DateTime', 'Numeric')
  Math.sqrt(pvar)
end

#pvar ⇒ Object

Return the population variance of the non-nil items in the Column, or 0 if all items are nil. This is the sum of squared deviations from the mean divided by N, which is a biased estimator when the items are only a sample of a larger population. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the variance of those numbers.



# File 'lib/fat_table/column.rb', line 350

def pvar
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('pvar', 'DateTime', 'Numeric')
  n = items.filter_to_type(type).size.to_d
  return BigDecimal('0.0') if n <= 1
  var * ((n - 1) / n)
end
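
The listing derives the population variance from the sample variance instead of recomputing the squared deviations. With n non-nil items x_i and mean \bar{x} (symbols chosen here for illustration), the two quantities are related as follows, where s^2 corresponds to #var and \sigma^2 to #pvar:

```latex
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2,
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2
         = s^2\cdot\frac{n-1}{n}
```

The final factor is the +((n - 1) / n)+ multiplier in the code.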

#range ⇒ Object

Return a Range object for the smallest to largest value in the column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 271

def range
  only_with('range', 'NilClass', 'Numeric', 'String', 'DateTime')
  return nil if items.all?(&:nil?)

  Range.new(min, max)
end

#size ⇒ Object

Return the size of the Column, including any nils.



# File 'lib/fat_table/column.rb', line 129

def size
  items.size
end

#sum ⇒ Object

Return the sum of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and string Columns. For a string Column, it returns the non-blank items joined with single spaces.



# File 'lib/fat_table/column.rb', line 283

def sum
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('sum', 'Numeric', 'String')
  if type == 'String'
    items.reject(&:blank?).join(' ')
  else
    items.filter_to_type(type).sum
  end
end

#to_a ⇒ Object

Return a duplicated copy of this Column's items as an Array. To get the original items without copying, use the .items accessor.



# File 'lib/fat_table/column.rb', line 122

def to_a
  items.deep_dup
end
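
The practical difference between a copy and the accessor's own Array shows up when the result is mutated. +deep_dup+ is not a core Ruby method, so the sketch below assumes it comes from a support library and approximates it one level deep with +map(&:dup)+; the variable names are illustrative only.

```ruby
items = ['alpha', 'beta']

shared = items             # the same Array object, as an accessor returns
copied = items.map(&:dup)  # a new Array holding duplicated elements

# Mutating the copy, or the strings inside it, leaves the original alone.
copied << 'gamma'
copied[0] << '!'

items  # ["alpha", "beta"]
copied # ["alpha!", "beta", "gamma"]
```

Mutating +shared+ instead would change +items+ as well, since both names refer to one Array.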

#tolerant? ⇒ Boolean

Is this column tolerant of type incompatibilities? If so, the Column type will be forced to String if an incompatible type is found.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 151

def tolerant?
  @tolerant
end

#var ⇒ Object

Return the sample variance of the non-nil items in the Column, or 0 if all items are nil. This is the sum of squared deviations from the mean divided by N-1, the unbiased estimator of the population variance. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the variance of those numbers.



# File 'lib/fat_table/column.rb', line 322

def var
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('var', 'DateTime', 'Numeric')
  all_items =
    if type == 'DateTime'
      items.filter_to_type(type).map(&:jd)
    else
      items.filter_to_type(type)
    end
  n = count
  return BigDecimal('0.0') if n <= 1
  mu = Column.new(header: :mu, items: all_items).avg
  sq_dev = BigDecimal('0.0')
  all_items.each do |itm|
    sq_dev += (itm - mu) * (itm - mu)
  end
  sq_dev / (n - 1)
end
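
The same computation can be checked by hand on a small data set. The sketch below uses hypothetical helpers (+sample_variance+ and +population_variance+ are not FatTable methods) and Rational arithmetic, rather than BigDecimal as in the listing, so that the results are exact.

```ruby
# Hypothetical helper: sum of squared deviations divided by n - 1,
# skipping nils. Returns 0 when there are fewer than two values.
def sample_variance(values)
  xs = values.compact
  n = xs.size
  return Rational(0) if n <= 1

  mean = Rational(xs.sum, n)
  xs.sum { |x| (x - mean)**2 } / (n - 1)
end

# Hypothetical helper: rescale the sample variance to a divisor of n.
def population_variance(values)
  n = values.compact.size
  return Rational(0) if n <= 1

  sample_variance(values) * Rational(n - 1, n)
end

data = [2, 4, 4, nil, 4, 5, 5, 7, 9]
s2 = sample_variance(data)     # (32/7): squared deviations sum to 32, n = 8
p2 = population_variance(data) # (4/1)
```

For this data the mean is 5, so the population standard deviation is exactly 2.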