Class: Table

Inherits:
Object
Defined in:
lib/tablestakes.rb

Overview

This class is a Ruby representation of a table. All data is captured as type String by default. Columns are referred to by their String headers which are assumed to be identified in the first row of the input file. Output is written by default to tab-delimited files with the first row serving as the header names.
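As a rough illustration of the layout described above, the following sketch parses tab-delimited text into a header list plus a Hash of String columns, similar in shape to the @headers and @table instance variables seen in the constructor listing. The helper name parse_tab_delimited is hypothetical and is not part of the library.

```ruby
# Hypothetical helper (not part of tablestakes): split tab-delimited text
# into headers and a column-keyed Hash, keeping every value as a String.
def parse_tab_delimited(text)
  lines = text.lines.map(&:chomp).reject(&:empty?)
  headers = lines.first.to_s.split("\t")
  columns = headers.to_h { |h| [h, []] }
  lines.drop(1).each do |line|
    values = line.split("\t", -1)          # -1 keeps trailing empty fields
    headers.each_with_index { |h, i| columns[h] << values[i].to_s }
  end
  [headers, columns]
end

headers, columns = parse_tab_delimited("Name\tState\nAnn\tTX\nBob\tCA\n")
puts headers.inspect
puts columns["State"].inspect
```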


Constructor Details

#initialize(input = nil) ⇒ Table

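Instantiate a Table object from a tab-delimited file, an Array of rows, or another Table. With no argument, an empty Table is created.

input

OPTIONAL Array of rows, String naming the tab-delimited file to read, or an object that responds to headers (such as another Table)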



# File 'lib/tablestakes.rb', line 32

def initialize(input=nil)
  @headers = []
  @table = {}
  @indices = {}
  
  if input.respond_to?(:fetch)
    if input[0].respond_to?(:fetch)
      # create +Table+ from rows
      add_rows(input)
    end
  elsif input.respond_to?(:upcase)
    # a string, then read_file
    read_file(input)
  elsif input.respond_to?(:headers)
    init(input)
  end
  # else create empty +Table+
end
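The constructor chooses a branch by duck typing rather than by class. The sketch below is a minimal stand-in, with the hypothetical name classify_input, that mirrors the respond_to? checks in the listing and reports which branch an input would take. Note that a flat Array (whose first element is not itself an Array) falls through without adding rows.

```ruby
# Hypothetical stand-in that mirrors the constructor's respond_to? checks
# and reports which branch a given input would take.
def classify_input(input)
  if input.respond_to?(:fetch)
    input[0].respond_to?(:fetch) ? :rows : :empty
  elsif input.respond_to?(:upcase)
    :filename
  elsif input.respond_to?(:headers)
    :table
  else
    :empty
  end
end

puts classify_input([["Name"], ["Ann"]])
puts classify_input("data.txt")          # hypothetical file name, never opened
puts classify_input(nil)
puts classify_input(["Name", "Ann"])     # flat Array: no rows are added
```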

Instance Attribute Details

#headers ⇒ Object (readonly)

The headers attribute contains the table headers used to reference columns in the Table. All headers are represented as String types.



# File 'lib/tablestakes.rb', line 21

def headers
  @headers
end

Instance Method Details

#bottom(colname, num = 1) ⇒ Object

Returns a Table of the num least frequent values in a column together with their counts, as computed by tally. In the listing below, the first header of the result is the literal "State" rather than colname.

colname

String to identify the column to count

num

OPTIONAL Integer number of values to return (default 1)



# File 'lib/tablestakes.rb', line 152

def bottom(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }
  return Table.new(freq[0..num-1].unshift(["State","Count"]))
end
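The listing above sorts tallied values by count and keeps the first num entries; top does the same after reversing the order. A minimal sketch of that idea on a plain Array follows. The helper name frequency_slice and its descending keyword are hypothetical, and Enumerable#tally (Ruby 2.7 or later) stands in for the library's own tally method.

```ruby
# Hypothetical helper: count each value, sort by count, and keep num pairs.
# Ties come back in no guaranteed order because sort_by is not stable.
def frequency_slice(values, num = 1, descending: false)
  freq = values.tally.sort_by { |_value, count| count }
  freq.reverse! if descending
  freq.first(num)
end

states = %w[TX CA TX NY TX CA]
puts frequency_slice(states).inspect                       # least frequent
puts frequency_slice(states, 2, descending: true).inspect  # two most frequent
```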

#column(colname) ⇒ Object

Return the values of a column as an Array, identified by column name. Returns nil if the column name is not found.

colname

String to identify the name of the column



# File 'lib/tablestakes.rb', line 55

def column(colname)
  # check arguments

  return nil unless @table.has_key?(colname)

  Array(@table[colname])
end
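The listing above passes the stored column through Kernel#Array, which returns an Array argument itself rather than a duplicate. The sketch below, using a hypothetical helper name, shows the difference from dup; callers that need an independent Array may want to call dup on the result.

```ruby
# Hypothetical helper: true when Kernel#Array hands back the very same
# object it was given, which is the case for an Array argument.
def same_object_after_wrap?(values)
  Array(values).equal?(values)
end

stored = ["TX", "CA"]
puts same_object_after_wrap?(stored)   # identity check on Array(stored)
puts stored.dup.equal?(stored)         # dup builds a new Array object
```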

#count(colname = nil, value = nil) ⇒ Object Also known as: size, length

Counts the rows in which the given column equals the given value (compared as a String) and returns an integer >= 0. Returns nil if the column is not found. If colname or value is omitted, returns the number of rows in the table, or nil if the table has no columns.

colname

OPTIONAL String to identify the column to count

value

OPTIONAL String value to count



# File 'lib/tablestakes.rb', line 111

def count(colname=nil, value=nil)
  if colname.nil? || value.nil?
    if @table.size > 0
      @table.each_key {|e| return @table.fetch(e).length }
    else
      return nil
    end
  end
  
  if @table[colname]
    result = 0
    @table[colname].each do |val|
      val == value.to_s ? result += 1 : nil 
    end
    result
  else
    nil 
  end
end
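Because each stored value is compared with value.to_s, a non-String argument such as an Integer can still match String data. A minimal sketch of that comparison on a plain Array, with the hypothetical helper name count_matches:

```ruby
# Hypothetical helper mirroring the comparison in the listing: each stored
# String is compared against value.to_s.
def count_matches(column, value)
  column.count { |val| val == value.to_s }
end

puts count_matches(["1", "2", "1"], 1)    # Integer 1 is compared as "1"
puts count_matches(["TX", "CA"], "NY")
```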

#intersect(table2, colname, col2name = nil) ⇒ Object

Return the intersection of columns from different tables as an Array, eliminating duplicates. Returns nil if a column is not found.

table2

Table to identify the secondary table in the intersection

colname

String to identify the column to intersect

col2name

OPTIONAL String to identify the column in the second table to intersect; defaults to colname

Raises:

  • (ArgumentError)


# File 'lib/tablestakes.rb', line 316

def intersect(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) & table2.column(col2name)
end
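The result comes from Array#&, which keeps the elements found in both arrays, drops duplicates, and preserves the order of the receiver. A small sketch on plain Arrays, with a hypothetical helper name:

```ruby
# Array#& keeps the elements common to both arrays, without duplicates,
# in the order they appear in the receiver.
def common_values(col_a, col_b)
  col_a & col_b
end

puts common_values(["TX", "CA", "TX", "NY"], ["NY", "TX", "WA"]).inspect
```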

#join(table2, colname, col2name = nil) ⇒ Object

Given a second table to join against, and a field/column, return a Table which contains a join of the two tables. Join only lists the common column once, under the column name of the first table (if different from the name of the second). All columns from both tables are returned. Returns nil if the column is not found or if no rows match.

table2

Table to identify the secondary table in the join

colname

String to identify the column to join on

col2name

OPTIONAL String to identify the column in the second table to join on

Raises:

  • (ArgumentError)


# File 'lib/tablestakes.rb', line 237

def join(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  t2_col_index = table2.headers.index(col2name)
  return nil unless t2_col_index # is not nil

  # ensure no duplication of header values
  table2.headers.each do |h|
    if @headers.include?(h)
      update_header(h, '_' << h )
      if h == colname
        colname = '_' << colname
      end
    end
  end

  result = [ Array(@headers) + Array(table2.headers) ]
  @table[colname].each_index do |index|
    t2_index = table2.column(col2name).find_index(@table[colname][index])
    unless t2_index.nil?
      result << self.row(index) + table2.row(t2_index)
    end
  end
  if result.length == 1 #no rows selected

    return nil
  else
    return Table.new(result) 
  end
end
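The matching loop in the listing uses find_index, so each row of the first table is paired with at most one row of the second table (the first match), and unmatched rows are dropped. The sketch below reproduces only that loop on plain Arrays of rows; the helper name first_match_join and the sample data are hypothetical, and header renaming is left out.

```ruby
# Hypothetical sketch of the matching loop: for every left row, look up the
# first right row whose key equals the left key, and concatenate the two.
def first_match_join(left_rows, left_key, right_rows, right_key)
  right_keys = right_rows.map { |r| r[right_key] }
  left_rows.each_with_object([]) do |row, out|
    match = right_keys.find_index(row[left_key])
    out << row + right_rows[match] unless match.nil?
  end
end

people = [["Ann", "TX"], ["Bob", "CA"], ["Cy", "NY"]]
states = [["TX", "Austin"], ["CA", "Sacramento"], ["TX", "Houston"]]
first_match_join(people, 1, states, 0).each { |r| puts r.join("\t") }
```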

#row(index) ⇒ Object

Return a copy of a row from the table as an Array, given an index (i.e. row number). Returns empty Array if the index is out of bounds.

index

Integer indicating index of the row.



# File 'lib/tablestakes.rb', line 66

def row(index)    
  Array(get_row(index))
end

#select(*columns) ⇒ Object Also known as: get_columns

Select columns from the table, given one or more column names. Returns an instance of Table with the results. Returns nil if any column is not valid.

columns

Variable String arguments to identify the columns to select



# File 'lib/tablestakes.rb', line 178

def select(*columns)
  # check arguments

  columns.each do |c|
    return nil unless @table.has_key?(c)
  end

  result = []
  result_headers = []
  columns.each { |col| @headers.include?(col) ? result_headers << col : nil }
  result << result_headers
  @table[@headers.first].length.times do |row|
    this_row = []
    result_headers.each do |col|
      this_row << @table[col][row]
    end
    result << this_row
  end
  unless result_headers.empty?
    return Table.new(result)
  else
    return nil
  end
end
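A minimal sketch of the same projection on a column-keyed Hash, assuming the column layout shown in the listings. The helper name select_columns is hypothetical; like the method above, it returns the header row first and gives nil when a requested column is missing.

```ruby
# Hypothetical helper: build a header row plus data rows for the requested
# columns from a column-keyed Hash; nil if any requested column is missing.
def select_columns(columns, *names)
  return nil if names.empty? || names.any? { |n| !columns.key?(n) }
  row_count = columns[names.first].length
  rows = Array.new(row_count) { |i| names.map { |n| columns[n][i] } }
  [names] + rows
end

data = { "Name" => ["Ann", "Bob"], "State" => ["TX", "CA"], "Age" => ["30", "41"] }
select_columns(data, "State", "Name").each { |r| puts r.join("\t") }
```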

#sub(colname, re, replace) ⇒ Object Also known as: sub!

Given a field/column, a regular expression to match against, and a replacement string, replace the first match of the expression in each value of that column with the replacement string. The table is modified in place and returned. Returns nil if the column is not found.

colname

String to identify the column to update

re

Regexp to match the value in the selected column

replace

String to specify the replacement text for the given Regexp

Raises:

  • (ArgumentError)


# File 'lib/tablestakes.rb', line 280

def sub(colname, re, replace)
  # check arguments

  raise ArgumentError, "No regular expression to match against" unless re
  raise ArgumentError, "No replacement string specified" unless replace
  return nil unless @table.has_key?(colname)
  
  @table[colname].each do |item|
    item.sub!(re, replace)
  end
  return self
end
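The listing relies on String#sub!, which changes each String in place and replaces only its first match. A short sketch on a plain Array, with a hypothetical helper name and sample data:

```ruby
# String#sub! replaces only the first match in each String and mutates it
# in place, so the stored column changes without being reassigned.
def substitute_first!(column, re, replacement)
  column.each { |item| item.sub!(re, replacement) }
  column
end

phones = [+"555-123-4567", +"555-987-6543"]   # unary + gives unfrozen Strings
substitute_first!(phones, /-/, ".")
puts phones.inspect
```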

#tally(colname) ⇒ Object

Count instances in a particular field/column and return a Table of the results. Returns nil if the column is not found.

colname

String to identify the column to tally



# File 'lib/tablestakes.rb', line 163

def tally(colname)
  # check arguments

  return nil unless @table.has_key?(colname)

  result = {}
  @table[colname].each do |val|
    result.has_key?(val) ? result[val] += 1 : result[val] = 1
  end
  return Table.new([[colname,"Count"]] + result.to_a)
end
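The counting step can be sketched on a plain Array as follows. The helper name tally_rows is hypothetical; it returns the same row layout that the listing passes to Table.new, with the header row first and values in order of first appearance.

```ruby
# Hypothetical helper: count occurrences of each value and return rows in
# the shape [[colname, "Count"], [value, count], ...].
def tally_rows(colname, values)
  counts = Hash.new(0)
  values.each { |val| counts[val] += 1 }
  [[colname, "Count"]] + counts.to_a
end

tally_rows("State", ["TX", "CA", "TX"]).each { |r| puts r.join("\t") }
```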

#to_a ⇒ Object

Converts a Table object to an Array of Arrays, one per row, with the headers as the first row.

none



# File 'lib/tablestakes.rb', line 92

def to_a
  result = [ Array(@headers) ]
  
  @table[@headers.first].length.times do |row|
    items = []
    @headers.each do |col|
      items << @table[col][row]
    end
    result << items
  end
  result
end

#to_s ⇒ Object

Converts a Table object to a tab-delimited string.

none



# File 'lib/tablestakes.rb', line 73

def to_s
  result = @headers.join("\t") << "\n"
  
  @table[@headers.first].length.times do |row|
    @headers.each do |col|
      result << @table[col][row].to_s
      unless col == @headers.last
        result << "\t"
      else
        result << "\n"
      end
    end
  end
  result
end
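Since the data is held by column, producing text means walking the rows and picking one value from each column in header order. A minimal sketch of that traversal, assuming a header Array and a column-keyed Hash; the helper name rows_to_text is hypothetical.

```ruby
# Hypothetical helper: emit a header line, then one tab-delimited line per
# row, reading the i-th value of every column in header order.
def rows_to_text(headers, columns)
  row_count = headers.empty? ? 0 : columns[headers.first].length
  lines = [headers.join("\t")]
  row_count.times do |i|
    lines << headers.map { |h| columns[h][i].to_s }.join("\t")
  end
  lines.join("\n") + "\n"
end

print rows_to_text(["Name", "State"], "Name" => ["Ann", "Bob"], "State" => ["TX", "CA"])
```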

#top(colname, num = 1) ⇒ Object

Returns a Table of the num most frequent values in a column together with their counts, as computed by tally. In the listing below, the first header of the result is the literal "State" rather than colname.

colname

String to identify the column to count

num

OPTIONAL Integer number of values to return (default 1)



# File 'lib/tablestakes.rb', line 140

def top(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }.reverse
  return Table.new(freq[0..num-1].unshift(["State","Count"]))
end

#union(table2, colname, col2name = nil) ⇒ Object

Return the union of columns from different tables as an Array, eliminating duplicates. Returns nil if a column is not found.

table2

Table to identify the secondary table in the union

colname

String to identify the column to union

col2name

OPTIONAL String to identify the column in the second table to union

Raises:

  • (ArgumentError)


# File 'lib/tablestakes.rb', line 298

def union(table2, colname, col2name=nil)
  # check arguments

  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables

    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) | table2.column(col2name)
end
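The result comes from Array#|, which concatenates the two arrays and removes duplicates while keeping the order of first appearance. A small sketch on plain Arrays, with a hypothetical helper name:

```ruby
# Array#| returns every distinct element of both arrays, in order of first
# appearance.
def combined_values(col_a, col_b)
  col_a | col_b
end

puts combined_values(["TX", "CA", "TX"], ["NY", "TX"]).inspect
```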

#where(colname, condition = nil) ⇒ Object Also known as: get_rows

Given a condition for a field/column, return a subtable of the rows that match the condition. If no condition is given, a new Table is returned with all records. Returns nil if no row meets the condition or the column is not found.

colname

String to identify the column to test

condition

OPTIONAL String containing a Ruby expression fragment that is appended to each quoted value and evaluated, for example "== 'TX'"



# File 'lib/tablestakes.rb', line 211

def where(colname, condition=nil)
  # check arguments

  return nil unless @table.has_key?(colname)

  result = []
  result << @headers
  @table[colname].each_index do |index|
    if condition
      eval("'#{@table[colname][index]}' #{condition}") ? result << get_row(index) : nil
    else
      result << get_row(index)
    end
  end
  result.length > 1 ? Table.new(result) : nil
end
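The listing builds a small Ruby expression for each value by wrapping it in single quotes and appending the condition text, then evaluates it. The sketch below mirrors that step with a hypothetical helper name. Because values are Strings, comparisons are String comparisons, and because the text goes through eval, the sketch assumes trusted condition strings and values that contain no single quotes.

```ruby
# Hypothetical helper mirroring the listing: the stored value is wrapped in
# single quotes and the condition text is appended before evaluation.
def matches_condition?(value, condition)
  eval("'#{value}' #{condition}") ? true : false
end

puts matches_condition?("Texas", "== 'Texas'")
puts matches_condition?("Texas", "=~ /^C/")
puts matches_condition?("9", "> '10'")   # String comparison, not numeric
```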

#write_file(filename) ⇒ Object

Write a representation of the Table object to a file (tab delimited).

filename

String to identify the name of the file to write
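The listing for this method opens the file without an explicit close. As a minimal sketch of an alternative, the block form of File.open closes the file when the block exits. The helper name write_text is hypothetical, and a temporary file stands in for a real output path.

```ruby
require "tempfile"

# Hypothetical helper: the block form of File.open closes the file when the
# block exits, so the contents are flushed before the method returns.
def write_text(filename, text)
  File.open(filename, "w") { |file| file.print(text) }
end

Tempfile.create("table") do |tmp|
  write_text(tmp.path, "Name\tState\nAnn\tTX\n")
  puts File.read(tmp.path)
end
```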



# File 'lib/tablestakes.rb', line 333

def write_file(filename)
  file = File.open(filename, "w")
  file.print self.to_s
end