Class: FatTable::Column

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/fat_table/column.rb

Overview

Column objects are a thin wrapper around an Array that allows columns to be summed and to have other aggregate operations performed on them, with nils compacted out before the operation proceeds. Each Column is characterized by a header, which gives the Column a name; a type, which limits the kinds of items that can be stored in the Column; and the items themselves, each of which must be either nil or an object compatible with the Column's type. The valid types are Boolean, DateTime, Numeric, String, and NilClass, the last of which is used as the initial type until items added to the Column fix its type as one of the others.

Constant Summary

TYPES =

Valid Column types as strings.

%w[NilClass Boolean DateTime Numeric String].freeze
VALID_AGGREGATES =

The names of the known aggregate operations that can be performed on a Column.

%s(first last range
sum count min max
avg var pvar dev pdev
any? all? none? one?)

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(header:, items: [], type: 'NilClass', tolerant: false) ⇒ Column

Create a new Column with the given +header+, initialized with the given +items+, an array of Ruby objects of one of the permissible types or of strings parseable as one of those types. If no +items+ are passed, returns an empty Column to which items may be added with the Column#<< method. Each item must be one of the following types or a string parseable as one of them:

Boolean:: an object of type TrueClass or FalseClass or a string that is either 't', 'true', 'y', 'yes', 'f', 'false', 'n', or 'no', in each case, regardless of case.

DateTime:: an object of class Date, DateTime, or a string that matches +/\d\d\d\d[-\/]\d\d?[-\/]\d\d?/+ and is parseable by DateTime.parse.

Numeric:: an object that is of class Numeric, or a string that looks like a number after removing '+$+', '+,+', and '+_+', as well as Rationals in the form N:M or N/M, where N and M are integers.

String:: if the object is a non-blank string that does not parse as any of the foregoing, it is treated as a String type, and once a column is typed as such, blank strings represent blank strings rather than nil values.

NilClass:: until a Column sees an item that qualifies as one of the foregoing, it is typed as NilClass, meaning that the type is undetermined. Until a column obtains a type, blank strings are treated as nils and do not affect the type of the column. After a column acquires a type, blank strings are treated as nil values except in the case of String columns, which retain them as blank strings.
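
The rules above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not FatTable's own conversion code: the helper name guess_type, its regular expressions for numbers, and the order of the checks are hypothetical, and it classifies a single item rather than tracking the type of a whole Column.

```ruby
require 'date'

# Hypothetical helper, not part of FatTable: classify one item using the
# rules described above.
def guess_type(item)
  return 'Boolean' if item == true || item == false
  return 'DateTime' if item.is_a?(Date) # DateTime is a subclass of Date
  return 'Numeric' if item.is_a?(Numeric)
  return 'NilClass' if item.nil?

  str = item.to_s.strip
  return 'NilClass' if str.empty? # blank strings do not fix a type

  return 'Boolean' if str.match?(/\A(t|true|y|yes|f|false|n|no)\z/i)

  if str.match?(/\d\d\d\d[-\/]\d\d?[-\/]\d\d?/)
    begin
      DateTime.parse(str)
      return 'DateTime'
    rescue ArgumentError
      # Looked like a date but did not parse; fall through to later checks.
    end
  end

  cleaned = str.delete('$,_')
  return 'Numeric' if cleaned.match?(/\A[-+]?\d+(\.\d+)?\z/)        # plain number
  return 'Numeric' if cleaned.match?(%r{\A[-+]?\d+\s*[:/]\s*\d+\z}) # rational

  'String'
end
```

Under these assumptions, guess_type('$18_321') and guess_type('35:14') both classify as 'Numeric', while a blank string stays 'NilClass' and so would leave a Column's type open.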

Examples:


require 'fat_table'
col = FatTable::Column.new(header: 'date')
col << Date.today - 30
col << '2017-05-04'
col.type #=> 'DateTime'
col.header #=> :date
nums = [35.25, 18, '35:14', '$18_321']
col = FatTable::Column.new(header: :prices, items: nums)
col.type #=> 'Numeric'
col.header #=> :prices
col.sum #=> 18376.75

Parameters:

  • header (String, Symbol)

    the name of the column header

  • items (Array<String>, Array<DateTime>, Array<Numeric>, Array<Boolean>) (defaults to: [])

    the initial data items in column

  • type (String) (defaults to: 'NilClass')

    the column type: 'String', 'Numeric', 'DateTime', 'Boolean', or 'NilClass'

  • tolerant (Boolean) (defaults to: false)

    whether the column accepts unconvertible items not of its type as Strings

Raises:

  • (UserError)


# File 'lib/fat_table/column.rb', line 94

def initialize(header:, items: [], type: 'NilClass', tolerant: false)
  @raw_header = header
  @header =
    if @raw_header.is_a?(Symbol)
      @raw_header
    else
      @raw_header.to_s.as_sym
    end
  @type = type
  @tolerant = tolerant
  msg = "unknown column type '#{type}'"
  raise UserError, msg unless TYPES.include?(@type.to_s)

  @items = []
  items.each { |i| self << i }
end

Instance Attribute Details

#header ⇒ Object (readonly)

The symbol representing this Column.



# File 'lib/fat_table/column.rb', line 15

def header
  @header
end

#items ⇒ Object (readonly)

An Array of the items of this Column, all of which must be values of the Column's type or a nil. This Array contains the value of the item after conversion to a native Ruby type, such as TrueClass, Date, DateTime, Integer, String, etc. Thus, you can perform operations on the items, perhaps after removing nils with +.items.compact+.



# File 'lib/fat_table/column.rb', line 30

def items
  @items
end

#raw_header ⇒ Object (readonly)

The header as provided by the caller before its conversion to a symbol. You can use this to recover the original string form of the header.



# File 'lib/fat_table/column.rb', line 19

def raw_header
  @raw_header
end

#tolerant ⇒ Object

Returns the value of attribute tolerant.



# File 'lib/fat_table/column.rb', line 32

def tolerant
  @tolerant
end

#type ⇒ Object (readonly)

A string representing the deduced type of this Column. One of Column::TYPES.



# File 'lib/fat_table/column.rb', line 23

def type
  @type
end

Instance Method Details

#+(other) ⇒ Object

Return a new Column appending the items of other to this Column's items, checking for type compatibility. Use the header of this Column as the header of the new Column.

Raises:

  • (UserError)


# File 'lib/fat_table/column.rb', line 477

def +(other)
  msg = 'cannot combine columns with different types'
  raise UserError, msg unless type == other.type

  Column.new(header: header, items: items + other.items)
end

#<<(itm) ⇒ Object

Append +itm+ to the end of the Column after converting it to the Column's type. If the Column's type is still open, i.e. NilClass, attempt to fix the Column's type based on the type of +itm+, as with Column.new. If it's a tolerant column, respond to type errors by converting the column to a String type.



# File 'lib/fat_table/column.rb', line 462

def <<(itm)
  items << convert_and_set_type(itm)
rescue IncompatibleTypeError => ex
  if tolerant?
    items << Convert.convert_to_string(itm)
  else
    raise ex
  end
end
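
The rescue-and-fall-back pattern in the listing can be tried in isolation. The following is a minimal sketch under stated assumptions: TinyColumn and IncompatibleItem are hypothetical stand-ins, not FatTable classes, and the column accepts only integers so that the conversion step stays short.

```ruby
# Hypothetical stand-ins, not FatTable classes: a column that only accepts
# integers unless it was created as tolerant.
class IncompatibleItem < StandardError; end

class TinyColumn
  attr_reader :items

  def initialize(tolerant: false)
    @tolerant = tolerant
    @items = []
  end

  # Append the converted item; a tolerant column stores the string form of
  # an item that cannot be converted, a strict one re-raises the error.
  def <<(itm)
    items << convert(itm)
  rescue IncompatibleItem
    raise unless @tolerant

    items << itm.to_s
  end

  private

  # Convert to Integer or signal an incompatible item.
  def convert(itm)
    Integer(itm)
  rescue ArgumentError, TypeError
    raise IncompatibleItem, "cannot convert #{itm.inspect}"
  end
end
```

With this sketch, a strict TinyColumn raises on 'abc', while TinyColumn.new(tolerant: true) keeps it as the string 'abc' alongside the converted integers.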

#[](idx) ⇒ Object

Return the item of the Column at the given index.



# File 'lib/fat_table/column.rb', line 118

def [](idx)
  items[idx]
end

#all? ⇒ Boolean

Return true if all of the items in the Column are true; otherwise return false. Also return false if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 410

def all?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('all?', 'Boolean')
  items.filter_to_type(type).all?
end
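
The early return matters because Ruby's Array#all? is true for an empty array, so compacting an all-nil list and calling all? would report true. A small sketch, using a hypothetical helper all_true? on a plain Array rather than a Column:

```ruby
# Hypothetical helper mirroring the guard-then-filter shape of the listing.
def all_true?(items)
  return false if items.all?(&:nil?) # also covers an empty list

  items.compact.all?
end
```

Without the guard, [nil, nil].compact.all? evaluates to true, which is the result the documented behavior avoids.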

#any? ⇒ Boolean

Return true if any of the items in the Column are true; otherwise return false. Also return false if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 399

def any?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('any?', 'Boolean')
  items.filter_to_type(type).any?
end

#avg ⇒ Object

Return the average value of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number, computes the average, and then converts the average back to a DateTime.



# File 'lib/fat_table/column.rb', line 304

def avg
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('avg', 'DateTime', 'Numeric')
  itms = items.filter_to_type(type)
  size = itms.size.to_d
  if type == 'DateTime'
    avg_jd = itms.sum(&:jd) / size
    DateTime.jd(avg_jd)
  else
    itms.sum / size
  end
end
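
The Julian-day round trip for datetime Columns can be illustrated with the standard date library alone. This is a sketch, not the library's code: average_date is a hypothetical helper, it divides by a Rational where the listing uses a BigDecimal size, and it returns nil rather than 0 when there is nothing to average.

```ruby
require 'date'

# Hypothetical helper: average dates by way of their Julian day numbers.
def average_date(dates)
  present = dates.compact
  return nil if present.empty? # the listing returns 0 in this case

  avg_jd = present.sum(&:jd) / present.size.to_r
  DateTime.jd(avg_jd)
end
```

For example, averaging 2017-05-01 and 2017-05-03 (with a nil in between) yields a DateTime falling on 2017-05-02.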

#count ⇒ Object

Return a count of the non-nil items in the Column, or the size of the column if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 234

def count
  return items.size if items.all?(&:nil?)

  if type == 'String'
    items.count { |i| !i.blank? }
  else
    items.filter_to_type(type).count.to_d
  end
end

#dev ⇒ Object

Return the sample standard deviation (the unbiased estimator of the population standard deviation using a divisor of N-1) as the square root of the sample variance, of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the standard deviation of those numbers.



# File 'lib/fat_table/column.rb', line 373

def dev
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('dev', 'DateTime', 'Numeric')
  var.sqrt(20)
end

#each(&block) ⇒ Object

Yield each item in the Column in the order in which they appear in the Column. This makes Columns Enumerable, so all the Enumerable methods are available on a Column.



# File 'lib/fat_table/column.rb', line 181

def each(&block)
  if block
    items.each(&block)
    self
  else
    to_enum(:each)
  end
end
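
The same pattern works for any class: define #each, include Enumerable, and the Enumerable methods follow. A minimal sketch with a hypothetical ItemList class (not FatTable::Column) that mirrors the listing's block and no-block branches:

```ruby
# Hypothetical class, not FatTable::Column: defining #each and including
# Enumerable is enough to gain map, reject, sort, and the rest.
class ItemList
  include Enumerable

  def initialize(items)
    @items = items
  end

  # Yield each item in order; return an Enumerator when no block is given.
  def each(&block)
    return to_enum(:each) unless block

    @items.each(&block)
    self
  end
end
```

Note that Column defines its own first, min, max, sum, and count, so on a Column those names refer to the nil-skipping versions documented here rather than to Enumerable's.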

#empty? ⇒ Boolean

Return true if there are no items in the Column.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 140

def empty?
  items.empty?
end

#first ⇒ Object

Return the first non-nil item in the Column, or nil if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 207

def first
  return if items.all?(&:nil?)

  if type == 'String'
    items.reject(&:blank?).first
  else
    items.filter_to_type(type).first
  end
end

#force_string! ⇒ Object

Force the column to have String type and then convert all items to strings.



# File 'lib/fat_table/column.rb', line 163

def force_string!
  @type = 'String'
  unless empty?
    @items = items.map(&:to_s)
  end
end

#last ⇒ Object

Return the last non-nil item in the Column, or nil if all items are nil. Works with any Column type.



# File 'lib/fat_table/column.rb', line 220

def last
  return if items.all?(&:nil?)

  if type == 'String'
    items.reject(&:blank?).last
  else
    items.filter_to_type(type).last
  end
end

#last_i ⇒ Object

Return the index of the last item in the Column.



# File 'lib/fat_table/column.rb', line 147

def last_i
  size - 1
end

#max ⇒ Object

Return the largest non-nil, non-blank item in the Column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 261

def max
  only_with('max', 'NilClass', 'Numeric', 'String', 'DateTime')
  if type == 'String'
    items.reject(&:blank?).max
  else
    items.filter_to_type(type).max
  end
end

#min ⇒ Object

Return the smallest non-nil, non-blank item in the Column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 248

def min
  only_with('min', 'NilClass', 'Numeric', 'String', 'DateTime')
  if type == 'String'
    items.reject(&:blank?).min
  else
    items.filter_to_type(type).min
  end
end

#none? ⇒ Boolean

Return true if none of the items in the Column are true; otherwise return false. Also return true if all items are nil. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 422

def none?
  return true if type == 'NilClass' || items.all?(&:nil?)

  only_with('none?', 'Boolean')
  items.filter_to_type(type).none?
end

#one? ⇒ Boolean

Return true if precisely one of the items in the Column is true; otherwise return false. Works only with boolean Columns.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 433

def one?
  return false if type == 'NilClass' || items.all?(&:nil?)

  only_with('one?', 'Boolean')
  items.filter_to_type(type).one?
end

#pdev ⇒ Object

Return the population standard deviation (the biased estimator of the population standard deviation using a divisor of N) as the square root of the population variance, of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the standard deviation of those numbers.



# File 'lib/fat_table/column.rb', line 388

def pdev
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('pdev', 'DateTime', 'Numeric')
  Math.sqrt(pvar)
end

#pvar ⇒ Object

Return the population variance (the biased estimator of the population variance using a divisor of N) as the average squared deviation from the mean, of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the variance of those numbers.



# File 'lib/fat_table/column.rb', line 355

def pvar
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('pvar', 'DateTime', 'Numeric')
  n = items.filter_to_type(type).size.to_d
  return BigDecimal('0.0') if n <= 1

  var * ((n - 1) / n)
end
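
The final line of the listing rescales the sample variance into the population variance. Written out, with x_1, ..., x_N the non-nil items (Julian day numbers for datetime Columns) and x-bar their mean, the relationship is as follows; the symbols are notation introduced here for illustration and do not appear in the source.

```latex
s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2
         = s^2\cdot\frac{N-1}{N}
```

Here s^2 corresponds to var and sigma^2 to pvar, so multiplying var by (n - 1) / n gives the divisor-N result without recomputing the squared deviations.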

#range ⇒ Object

Return a Range object for the smallest to largest value in the column, or nil if all items are nil. Works with numeric, string, and datetime Columns.



# File 'lib/fat_table/column.rb', line 275

def range
  only_with('range', 'NilClass', 'Numeric', 'String', 'DateTime')
  return if items.all?(&:nil?)

  Range.new(min, max)
end

#size ⇒ Object

Return the size of the Column, including any nils.



# File 'lib/fat_table/column.rb', line 133

def size
  items.size
end

#sum ⇒ Object

Return the sum of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and string Columns. For a string Column, it returns the non-blank items joined with single spaces.



# File 'lib/fat_table/column.rb', line 287

def sum
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('sum', 'Numeric', 'String')
  if type == 'String'
    items.reject(&:blank?).join(' ')
  else
    items.filter_to_type(type).sum
  end
end
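
The two branches of the listing behave quite differently, which a plain-Array sketch makes visible. The helper sum_items and its string: flag are hypothetical; the flag stands in for the check on the Column's type.

```ruby
# Hypothetical helper following the listing's two branches on plain Arrays.
def sum_items(items, string: false)
  return 0 if items.all?(&:nil?)

  if string
    # Drop nils and blank strings, then join with single spaces.
    items.reject { |i| i.nil? || i.strip.empty? }.join(' ')
  else
    items.compact.sum
  end
end
```

So a numeric list such as [1, nil, 2.5] sums to 3.5, while a string list such as ['a', '', nil, 'b'] comes back as 'a b'.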

#to_a ⇒ Object

Return a dupped Array of this Column's items. To get the non-dupped items, just use the .items accessor.



# File 'lib/fat_table/column.rb', line 126

def to_a
  items.deep_dup
end

#tolerant? ⇒ Boolean

Is this column tolerant of type incompatibilities? If so, the Column type will be forced to String if an incompatible type is found.

Returns:

  • (Boolean)


# File 'lib/fat_table/column.rb', line 155

def tolerant?
  @tolerant
end

#var ⇒ Object

Return the sample variance (the unbiased estimator of the population variance using a divisor of N-1) as the average squared deviation from the mean, of the non-nil items in the Column, or 0 if all items are nil. Works with numeric and datetime Columns. For datetime Columns, it converts each date to its Julian day number and computes the variance of those numbers.



# File 'lib/fat_table/column.rb', line 326

def var
  return 0 if type == 'NilClass' || items.all?(&:nil?)

  only_with('var', 'DateTime', 'Numeric')
  all_items =
    if type == 'DateTime'
      items.filter_to_type(type).map(&:jd)
    else
      items.filter_to_type(type)
    end
  n = count
  return BigDecimal('0.0') if n <= 1

  mu = Column.new(header: :mu, items: all_items).avg
  sq_dev = BigDecimal('0.0')
  all_items.each do |itm|
    sq_dev += (itm - mu) * (itm - mu)
  end
  sq_dev / (n - 1)
end
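
The computation in the listing can be reproduced on a plain Array to check results by hand. This is a sketch under stated assumptions: sample_variance and population_variance are hypothetical helpers, and they use Rational arithmetic for exact results where the listing uses BigDecimal.

```ruby
# Hypothetical helpers, not FatTable methods: variance of the non-nil values,
# computed exactly with Rationals.
def sample_variance(values)
  xs = values.compact.map(&:to_r)
  n = xs.size
  return 0 if n <= 1 # too few items for a divisor of n - 1

  mu = xs.sum / n
  xs.sum { |x| (x - mu)**2 } / (n - 1)
end

def population_variance(values)
  n = values.compact.size
  return 0 if n <= 1

  # Rescale the divisor from n - 1 to n, as the pvar listing does.
  sample_variance(values) * (n - 1) / n
end
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations total 32, so the sample variance is 32/7 and the population variance is 4.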