Class: Table

Inherits:
Object
Defined in:
lib/tablestakes.rb

Overview

This class is a Ruby representation of a table. All data is captured as type String by default. Columns are referred to by their String headers, which are taken from the first row of the input file. By default, output is written to tab-delimited files with the first row holding the header names.
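
As a rough illustration of the model described above, the sketch below stores a table as an Array of headers plus a Hash of column Arrays, converting every value to a String. The helper name `rows_to_columns` is hypothetical and is not part of tablestakes; the library's internal layout may differ in detail.

```ruby
# Hypothetical helper (not part of tablestakes): turn an array of rows,
# whose first row holds the headers, into column-oriented storage.
def rows_to_columns(rows)
  headers = rows.first.map(&:to_s)
  pairs = headers.each_with_index.map do |header, i|
    # collect the i-th value of every data row, stored as a String
    [header, rows.drop(1).map { |row| row[i].to_s }]
  end
  [headers, pairs.to_h]
end

headers, columns = rows_to_columns([["Name", "Age"], ["Ann", 34], ["Bob", 27]])
puts headers.inspect          # ["Name", "Age"]
puts columns["Age"].inspect   # ["34", "27"]
```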

Constructor Details

#initialize(input = nil) ⇒ Table

Instantiate a Table object from an Array of rows, a tab-delimited file, or an object that responds to headers. With no input, an empty Table is created.

input

OPTIONAL Array of rows, String naming the tab-delimited file to read, or an object that responds to headers (such as another Table)



# File 'lib/tablestakes.rb', line 32

def initialize(input=nil)
  @headers = []
  @table = {}
  @indices = {}
  
  if input.respond_to?(:fetch)
    if input[0].respond_to?(:fetch)
      #create +Table+ from rows

      add_rows(input)
    end
  elsif input.respond_to?(:upcase)
    # a string, then read_file

    read_file(input)
  elsif input.respond_to?(:headers)
    init(input)
  end
  # else create empty +Table+

end
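
The constructor chooses its behaviour by duck typing rather than by class checks. The sketch below mirrors the `respond_to?` tests from the listing in a standalone function; `input_kind` and its return symbols are hypothetical names used only for illustration.

```ruby
# Hypothetical helper mirroring the respond_to? checks in the listing above.
def input_kind(input)
  if input.respond_to?(:fetch)
    # Array-like input is only loaded when its first element is array-like too
    input[0].respond_to?(:fetch) ? :rows : :ignored
  elsif input.respond_to?(:upcase)
    :filename      # String-like: treated as the name of a file to read
  elsif input.respond_to?(:headers)
    :table_like    # anything exposing headers, such as another Table
  else
    :empty         # nil or anything else: start with an empty table
  end
end

puts input_kind([["Name"], ["Ann"]])   # rows
puts input_kind("people.txt")          # filename
puts input_kind(nil)                   # empty
```

Note that a flat Array such as `["a", "b"]` falls into the `:ignored` branch: in the listing, such input produces an empty Table without raising.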

Instance Attribute Details

#headers ⇒ Object (readonly)

The headers attribute contains the table headers used to reference columns in the Table. All headers are represented as String types.



# File 'lib/tablestakes.rb', line 21

def headers
  @headers
end

Instance Method Details

#add_column(*args) ⇒ Object

Add a column to the Table. Raises ArgumentError if the column name is already taken or there are not the correct number of values.

colname

String to identify the name of the column

column_vals

Array to hold the column values

Examples:

add_column("Header", [e1, e2, e3])
add_column(array_including_header)
add_column("Header", "e1", "e2", ...)

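Raises:

ArgumentError if the column name is already taken or the number of values does not match the existing table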



# File 'lib/tablestakes.rb', line 85

def add_column(*args)
  if args.kind_of? Array
    args = args.flatten
    colname = args.shift
    column_vals = args
  else
    raise ArgumentError, "Invalid Arguments to add_column"
  end
  # check arguments

  raise ArgumentError, "Duplicate Column Name!" if @table.has_key?(colname)
  unless self.empty?
    if column_vals.length != @table[@headers.first].length
      raise ArgumentError, "Number of elements in column does not match existing table"
    end
  end
  append_col(colname, column_vals)    
end
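
The three call forms shown in the examples all reduce to the same name and values because the arguments are flattened before the first element is shifted off. A minimal sketch of that step, using a hypothetical helper name `split_column_args`:

```ruby
# Hypothetical helper: normalise the argument list the way the listing does.
def split_column_args(*args)
  args = args.flatten      # merges a nested Array of values into one list
  colname = args.shift     # first element is the column name
  [colname, args]          # the rest are the column values
end

puts split_column_args("Header", ["e1", "e2"]).inspect   # ["Header", ["e1", "e2"]]
puts split_column_args(["Header", "e1", "e2"]).inspect   # ["Header", ["e1", "e2"]]
puts split_column_args("Header", "e1", "e2").inspect     # ["Header", ["e1", "e2"]]
```

One consequence of flattening is that values which are themselves Arrays are merged into the column rather than stored as nested elements.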

#add_row(*row) ⇒ Object

Add a row to the Table, appending it to the end. If no headers are defined yet, the row becomes the table headers. Raises ArgumentError if the number of values does not match the number of headers.

row

Array to hold the row values

Examples:

add_row([e1, e2, e3])
add_row("e1", "e2", "e3", ...)


# File 'lib/tablestakes.rb', line 124

def add_row(*row)
  if row.kind_of? Array
    row = row.flatten
  else
    raise ArgumentError, "Invalid Arguments to add_row"
  end
  if @headers.empty?
      @headers = row
  else
    unless row.length == @headers.length
      raise ArgumentError, "Wrong number of fields in Table input"
    end
    append_row(row)
  end
  return self
end

#add_rows(array_of_rows) ⇒ Object

Add one or more rows to the Table, appending them to the end. Raises ArgumentError if a row does not have the correct number of values. The first row becomes the table headers if they are currently undefined.

array_of_rows

Array of Arrays to hold the row values

Examples:

add_rows([ [e1, e2, e3], [e1, e2, e3] ])


# File 'lib/tablestakes.rb', line 110

def add_rows(array_of_rows)
  array_of_rows.each do |r|
    add_row(r.clone)
  end
  return self
end

#bottom(colname, num = 1) ⇒ Object

Returns a Table of the least frequent values in a column, with their counts in a second column named "Count". The result is built from tally(colname) and sorted by ascending count.

colname

String to identify the column to rank

num

OPTIONAL Integer number of values to return (default 1)



# File 'lib/tablestakes.rb', line 250

def bottom(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }
  return Table.new(freq[0..num-1].unshift([colname,"Count"]))
end

#column(colname) ⇒ Object

Return a copy of a column from the table, identified by column name. Returns nil if column name not found.

colname

String to identify the name of the column



# File 'lib/tablestakes.rb', line 55

def column(colname)
  # check arguments

  return nil unless @table.has_key?(colname)

  Array(@table[colname])
end

#count(colname = nil, value = nil) ⇒ Object Also known as: size, length

Counts the number of instances of a particular string in a given column and returns an integer >= 0. Returns nil if the column is not found. If either parameter is omitted, returns the number of rows in the table, or nil if the table has no columns.

colname

OPTIONAL String to identify the column to count

value

OPTIONAL String value to count



# File 'lib/tablestakes.rb', line 209

def count(colname=nil, value=nil)
  if colname.nil? || value.nil?
    if @table.size > 0
      @table.each_key {|e| return @table.fetch(e).length }
    else
      return nil
    end
  end
  
  if @table[colname]
    result = 0
    @table[colname].each do |val|
      val == value.to_s ? result += 1 : nil 
    end
    result
  else
    nil 
  end
end
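
Because the listing compares each cell against `value.to_s`, a non-String argument such as an Integer still matches String cells. A minimal sketch of that comparison, with `count_matches` as a hypothetical helper name:

```ruby
# Hypothetical helper: count cells equal to the String form of value.
def count_matches(column, value)
  column.count { |cell| cell == value.to_s }
end

puts count_matches(["5", "7", "5"], 5)     # 2
puts count_matches(["5", "7", "5"], "7")   # 1
```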

#del_column(colname) ⇒ Object

Delete a column from the Table. Raises ArgumentError if the column name does not exist.

colname

String to identify the name of the column

Raises:

ArgumentError if the column name does not exist



# File 'lib/tablestakes.rb', line 144

def del_column(colname)
  # check arguments

  raise ArgumentError, "Column name does not exist!" unless @table.has_key?(colname)
  
  @headers.delete(colname)
  @table.delete(colname)
  return self
end

#del_row(rownum) ⇒ Object

Delete a row from the Table. Raises ArgumentError if the row number is not found.

rownum

Integer to hold the row number (zero-based)

Raises:

ArgumentError if the row number does not exist



# File 'lib/tablestakes.rb', line 157

def del_row(rownum)
  # check arguments

  # valid row numbers run from 0 to length - 1
  raise ArgumentError, "Row number does not exist!" unless rownum < @table[@headers.first].length

  @headers.each do |col|
    @table[col].delete_at(rownum)
  end
  return self
end
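
Deleting a row from column-oriented storage means removing the same index from every column. The sketch below shows this on a plain Hash of Arrays, assuming at least one column exists and using a strict bounds check, since valid indices run from 0 to length - 1. The name `delete_row!` is hypothetical.

```ruby
# Hypothetical helper: remove row rownum from every column in place.
# Assumes columns is a non-empty Hash of equally long Arrays.
def delete_row!(columns, rownum)
  length = columns.values.first.length
  raise ArgumentError, "Row number does not exist!" unless rownum < length

  columns.each_value { |col| col.delete_at(rownum) }
  columns
end

cols = { "Name" => ["Ann", "Bob", "Cy"], "Age" => ["34", "27", "41"] }
delete_row!(cols, 1)
puts cols["Name"].inspect   # ["Ann", "Cy"]
puts cols["Age"].inspect    # ["34", "41"]
```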

#empty? ⇒ Boolean

Return true if the Table is empty, false otherwise.

Returns:

Boolean



# File 'lib/tablestakes.rb', line 72

def empty?
  @headers.length == 0 && @table.length == 0
end

#intersect(table2, colname, col2name = nil) ⇒ Object

Return the intersection of columns from different tables, eliminating duplicates. Return nil if a column is not found.

table2

Table to identify the secondary table in the intersection

colname

String to identify the column to intersect

col2name

OPTIONAL String to identify the column in the second table to intersect

Raises:

ArgumentError if table2 is not a Table



# File 'lib/tablestakes.rb', line 414

def intersect(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) & table2.column(col2name)
end

#join(table2, colname, col2name = nil) ⇒ Object

Given a second table to join against, and a field/column, return a Table which contains a join of the two tables. Join only lists the common column once, under the column name of the first table (if different from the name of the second). All columns from both tables are returned. Returns nil if the column is not found or no rows match.

table2

Table to identify the secondary table in the join

colname

String to identify the column to join on

col2name

OPTIONAL String to identify the column in the second table to join on

Raises:

ArgumentError if table2 is not a Table



# File 'lib/tablestakes.rb', line 335

def join(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  t2_col_index = table2.headers.index(col2name)
  return nil unless t2_col_index # is not nil


  
  # ensure no duplication of header values

  table2.headers.each do |h|
    if @headers.include?(h)
      update_header(h, '_' << h )
      if h == colname
        colname = '_' << colname
      end
    end
  end

  result = [ Array(@headers) + Array(table2.headers) ]
  @table[colname].each_index do |index|
    t2_index = table2.column(col2name).find_index(@table[colname][index])
    unless t2_index.nil?
      result << self.row(index) + table2.row(t2_index)
    end
  end
  if result.length == 1 #no rows selected

    return nil
  else
    return Table.new(result) 
  end
end
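
The row-matching loop in the listing can be sketched on plain Arrays: for each row of the first table, the first row of the second table with the same key is found and the two rows are concatenated. The helper `join_rows` and its positional key arguments are assumptions made for this example; header handling is left out.

```ruby
# Hypothetical helper: rows are Arrays; key_a and key_b are column positions.
def join_rows(rows_a, key_a, rows_b, key_b)
  keys_b = rows_b.map { |row| row[key_b] }
  rows_a.each_with_object([]) do |row, result|
    match = keys_b.find_index(row[key_a])   # first matching row only
    result << row + rows_b[match] unless match.nil?
  end
end

people = [["1", "Ann"], ["2", "Bob"], ["3", "Cy"]]
cities = [["2", "NY"], ["1", "LA"]]
puts join_rows(people, 0, cities, 0).inspect
# [["1", "Ann", "1", "LA"], ["2", "Bob", "2", "NY"]]
```

Rows without a match (here the row with key "3") are dropped, which makes this an inner join.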

#row(index) ⇒ Object

Return a copy of a row from the table as an Array, given an index (i.e. row number). Returns empty Array if the index is out of bounds.

index

Integer indicating index of the row.



# File 'lib/tablestakes.rb', line 66

def row(index)    
  Array(get_row(index))
end

#select(*columns) ⇒ Object Also known as: get_columns

Select columns from the table, given one or more column names. Returns an instance of Table with the results. Returns nil if any column is not valid.

columns

Variable String arguments to identify the columns to select



# File 'lib/tablestakes.rb', line 276

def select(*columns)
  # check arguments

  columns.each do |c|
    return nil unless @table.has_key?(c)
  end

  result = []
  result_headers = []
  columns.each { |col| @headers.include?(col) ? result_headers << col : nil }
  result << result_headers
  @table[@headers.first].length.times do |row|
    this_row = []
    result_headers.each do |col|
      this_row << @table[col][row]
    end
    result << this_row
  end
  unless result_headers.empty?
    return Table.new(result)
  else
    return nil
  end
end
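
Selecting columns amounts to a projection: validate the requested names, then rebuild each row from only those columns, in the order requested. A minimal sketch on a Hash of column Arrays, with the hypothetical helper name `project`:

```ruby
# Hypothetical helper: return header row plus data rows for the wanted columns,
# or nil if no columns are requested or any requested column is unknown.
def project(headers, columns, *wanted)
  return nil if wanted.empty?
  return nil unless wanted.all? { |name| columns.key?(name) }

  row_count = columns[headers.first].length
  [wanted] + Array.new(row_count) { |i| wanted.map { |name| columns[name][i] } }
end

headers = ["A", "B", "C"]
columns = { "A" => ["1", "4"], "B" => ["2", "5"], "C" => ["3", "6"] }
puts project(headers, columns, "C", "A").inspect
# [["C", "A"], ["3", "1"], ["6", "4"]]
```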

#sub(colname, re, replace) ⇒ Object Also known as: sub!

Given a field/column, a regular expression to match against, and a replacement string, update the table in place so that the first match in each value of the column is replaced with the replacement string. Returns nil if the column is not found.

colname

String to identify the column to update

re

Regexp to match the value in the selected column

replace

String to specify the replacement text for the given Regexp

Raises:

ArgumentError if the regular expression or the replacement string is missing



# File 'lib/tablestakes.rb', line 378

def sub(colname, re, replace)
  # check arguments

  raise ArgumentError, "No regular expression to match against" unless re
  raise ArgumentError, "No replacement string specified" unless replace
  return nil unless @table.has_key?(colname)
  
  @table[colname].each do |item|
    item.sub!(re, replace)
  end
  return self
end
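
The listing relies on `String#sub!`, which modifies each String in place and replaces only the first match. The sketch below shows that behaviour on a plain Array of Strings; `substitute_first!` is a hypothetical name.

```ruby
# Hypothetical helper: replace the first match of re in every item, in place.
def substitute_first!(column, re, replace)
  column.each { |item| item.sub!(re, replace) }
  column
end

# .dup guarantees mutable Strings regardless of frozen-literal settings
names = ["a-b-c".dup, "x-y".dup]
substitute_first!(names, /-/, "+")
puts names.inspect   # ["a+b-c", "x+y"]
```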

#tally(colname) ⇒ Object

Count instances in a particular field/column and return a Table of the results. Returns nil if the column is not found.

colname

String to identify the column to tally



# File 'lib/tablestakes.rb', line 261

def tally(colname)
  # check arguments

  return nil unless @table.has_key?(colname)

  result = {}
  @table[colname].each do |val|
    result.has_key?(val) ? result[val] += 1 : result[val] = 1
  end
  return Table.new([[colname,"Count"]] + result.to_a)
end
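
The tally is an ordinary hash count followed by a conversion to rows. A minimal sketch of the same idea on a plain Array, with `tally_rows` as a hypothetical helper name:

```ruby
# Hypothetical helper: count each distinct value and return header + count rows.
def tally_rows(colname, values)
  counts = Hash.new(0)                 # missing keys start at zero
  values.each { |val| counts[val] += 1 }
  [[colname, "Count"]] + counts.to_a   # rows keep first-seen order
end

puts tally_rows("Color", ["red", "blue", "red"]).inspect
# [["Color", "Count"], ["red", 2], ["blue", 1]]
```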

#to_a ⇒ Object

Converts a Table object to an Array of Arrays, one per row, with the headers as the first row.

none



# File 'lib/tablestakes.rb', line 190

def to_a
  result = [ Array(@headers) ]
  
  @table[@headers.first].length.times do |row|
    items = []
    @headers.each do |col|
      items << @table[col][row]
    end
    result << items
  end
  result
end
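
Converting column-oriented storage to rows is a transposition. As an alternative to the nested loops in the listing, the sketch below uses `Array#transpose` on a Hash of column Arrays; `rows_from_columns` is a hypothetical name and assumes all columns have the same length.

```ruby
# Hypothetical helper: header row followed by one Array per data row.
def rows_from_columns(headers, columns)
  [headers] + columns.values_at(*headers).transpose
end

columns = { "A" => ["1", "3"], "B" => ["2", "4"] }
puts rows_from_columns(["A", "B"], columns).inspect
# [["A", "B"], ["1", "2"], ["3", "4"]]
```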

#to_s ⇒ Object

Converts a Table object to a tab-delimited string.

none



# File 'lib/tablestakes.rb', line 171

def to_s
  result = @headers.join("\t") << "\n"
  
  @table[@headers.first].length.times do |row|
    @headers.each do |col|
      result << @table[col][row].to_s
      unless col == @headers.last
        result << "\t"
      else
        result << "\n"
      end
    end
  end
  result
end
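
The output format is one line per row, with values separated by tabs and each line ended by a newline. A compact sketch of that serialization on an Array of rows, using the hypothetical name `to_tsv`:

```ruby
# Hypothetical helper: serialise rows (header row first) as tab-delimited text.
def to_tsv(rows)
  rows.map { |row| row.join("\t") + "\n" }.join
end

print to_tsv([["Name", "Age"], ["Ann", "34"]])
```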

#top(colname, num = 1) ⇒ Object

Returns a Table of the most frequent values in a column, with their counts in a second column named "Count". The result is built from tally(colname) and sorted by descending count.

colname

String to identify the column to rank

num

OPTIONAL Integer number of values to return (default 1)



# File 'lib/tablestakes.rb', line 238

def top(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }.reverse
  return Table.new(freq[0..num-1].unshift([colname,"Count"]))
end
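
Both top and bottom sort the tallied value/count pairs by count and keep the first num entries; top reverses the order first. A sketch on plain pairs, with `top_counts` and `bottom_counts` as hypothetical names (the order of values with equal counts is not guaranteed by `sort_by`):

```ruby
# Hypothetical helpers operating on [value, count] pairs.
def bottom_counts(pairs, num = 1)
  pairs.sort_by { |_value, count| count }.first(num)
end

def top_counts(pairs, num = 1)
  pairs.sort_by { |_value, count| count }.reverse.first(num)
end

pairs = [["red", 3], ["blue", 1], ["green", 2]]
puts top_counts(pairs, 2).inspect      # [["red", 3], ["green", 2]]
puts bottom_counts(pairs).inspect      # [["blue", 1]]
```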

#union(table2, colname, col2name = nil) ⇒ Object

Return the union of columns from different tables, eliminating duplicates. Return nil if a column is not found.

table2

Table to identify the secondary table in the union

colname

String to identify the column to union

col2name

OPTIONAL String to identify the column in the second table to union

Raises:

ArgumentError if table2 is not a Table



# File 'lib/tablestakes.rb', line 396

def union(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) | table2.column(col2name)
end
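
union and intersect both rest on Ruby's Array set operators, which remove duplicates and keep the order of first appearance. A short sketch on plain Arrays, with hypothetical helper names:

```ruby
# Hypothetical helpers wrapping the Array set operators used in the listings.
def column_union(col_a, col_b)
  col_a | col_b   # every distinct value from either column
end

def column_intersection(col_a, col_b)
  col_a & col_b   # distinct values present in both columns
end

a = ["x", "y", "y", "z"]
b = ["y", "z", "w"]
puts column_union(a, b).inspect          # ["x", "y", "z", "w"]
puts column_intersection(a, b).inspect   # ["y", "z"]
```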

#where(colname, condition = nil) ⇒ Object Also known as: get_rows

Given a condition for a column, return a subtable of the rows that match the condition. If no condition is given, a new Table is returned with all records. Returns nil if no row meets the condition or the column is not found. The condition is appended to each quoted cell value and evaluated as Ruby code with eval, so it should come from trusted input only.

colname

String to identify the column to test

condition

OPTIONAL String containing a ruby condition to evaluate



# File 'lib/tablestakes.rb', line 309

def where(colname, condition=nil)
  # check arguments

  return nil unless @table.has_key?(colname)

  result = []
  result << @headers
  @table[colname].each_index do |index|
    if condition
      eval(%q["#{@table[colname][index]}"] << "#{condition}") ? result << get_row(index) : nil
    else
      result << get_row(index)
    end
  end
  result.length > 1 ? Table.new(result) : nil
end
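
The condition is a fragment of Ruby source that gets appended to a quoted cell value before evaluation, so `"> '5'"`, `"== 'red'"` or `".to_i > 5"` are all usable forms. The sketch below reproduces that mechanism on a plain Array; `matching_indices` is a hypothetical name. Because the fragment is run with `eval`, it should never be built from untrusted input.

```ruby
# Hypothetical helper: indices of cells for which the condition holds.
def matching_indices(cells, condition)
  cells.each_index.select do |index|
    # %q keeps the interpolation literal, so the cell value is only
    # substituted when eval runs the combined expression
    eval(%q["#{cells[index]}"] + condition)
  end
end

puts matching_indices(["10", "2", "33"], ".to_i > 5").inspect   # [0, 2]
puts matching_indices(["red", "blue"], "== 'red'").inspect      # [0]
```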

#write_file(filename) ⇒ Object

Write a representation of the Table object to a file (tab delimited).

filename

String to identify the name of the file to write



# File 'lib/tablestakes.rb', line 431

def write_file(filename)
  # block form closes (and flushes) the file when the block exits
  File.open(filename, "w") do |file|
    file.print self.to_s
  end
end
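
A quick way to check file output is to write to a temporary file and read it back. The sketch below does this with tab-delimited text and the block form of `File.open`, which closes the file automatically; `write_tsv` is a hypothetical stand-in for the method above and takes the text as an argument.

```ruby
require "tempfile"

# Hypothetical helper: write already-serialised text to filename.
def write_tsv(filename, text)
  File.open(filename, "w") { |file| file.print text }
end

Tempfile.create("table") do |tmp|
  write_tsv(tmp.path, "Name\tAge\nAnn\t34\n")
  print File.read(tmp.path)   # prints the two lines that were written
end
```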