Class: Table
- Inherits: Object
- Hierarchy: Object > Table
- Defined in: lib/tablestakes.rb
Overview
This class is a Ruby representation of a table. All data is captured as type String by default. Columns are referred to by their String headers which are assumed to be identified in the first row of the input file. Output is written by default to tab-delimited files with the first row serving as the header names.
Instance Attribute Summary
- #headers ⇒ Object (readonly)
  The headers attribute contains the table headers used to reference columns in the Table.
Instance Method Summary
- #add_column(*args) ⇒ Object
  Add a column to the Table.
- #add_row(*row) ⇒ Object (also: #<<)
  Add a row to the Table, appending it to the end.
- #add_rows(array_of_rows) ⇒ Object
  Add one or more rows to the Table, appending them to the end.
- #append(a_table) ⇒ Object
  Append one Table object to another.
- #bottom(colname, num = 1) ⇒ Object
  Returns counts of the least frequent values found in a given column in the form of a Table.
- #column(colname) ⇒ Object
  Return a copy of a column from the table, identified by column name.
- #count(colname = nil, value = nil) ⇒ Object (also: #size, #length)
  Counts the number of instances of a particular string, given a column name, and returns an integer >= 0.
- #del_column(colname) ⇒ Object
  Delete a column from the Table.
- #del_row(rownum) ⇒ Object
  Delete a row from the Table.
- #each ⇒ Object
  Defines an iterator for Table which produces rows of data (headers omitted) for its calling block.
- #empty? ⇒ Boolean
  Return true if the Table is empty, false otherwise.
- #initialize(input = nil) ⇒ Table (constructor)
  Instantiate a Table object using a tab-delimited file.
- #intersect(table2, colname, col2name = colname) ⇒ Object
  Return an Array with the intersection of columns from different tables, eliminating duplicates.
- #join(table2, colname, col2name = colname) ⇒ Object
  Given a second table to join against, and a field/column, return a Table which contains a join of the two tables.
- #rename_header(orig_name, new_name) ⇒ Object
  Rename a header value for this Table object.
- #row(index) ⇒ Object
  Return a copy of a row from the table as an Array, given an index (i.e. row number).
- #select(*columns) ⇒ Object (also: #get_columns)
  Select columns from the table, given one or more column names.
- #sort(column = nil, &block) ⇒ Object (also: #sort!)
  Sort the table based on given column.
- #sub(colname, match = nil, replace = nil, &block) ⇒ Object
  Given a field/column, and a regular expression to match against, and a replacement string, create a new table which performs a substitute operation on column data.
- #tally(colname) ⇒ Object
  Count instances in a particular field/column and return a Table of the results.
- #to_a ⇒ Object
  Converts a Table object to an array of arrays (each row).
- #to_s ⇒ Object
  Converts a Table object to a tab-delimited string.
- #top(colname, num = 1) ⇒ Object
  Returns counts of the most frequent values found in a given column in the form of a Table.
- #union(table2, colname, col2name = colname) ⇒ Object
  Return an Array with the union of the elements of columns in the given tables, eliminating duplicates.
- #where(colname, condition = nil) ⇒ Object (also: #get_rows)
  Given a particular condition for a given field/column, return a subtable that matches the condition.
- #write_file(filename) ⇒ Object
  Write a representation of the Table object to a file (tab delimited).
Constructor Details
#initialize(input = nil) ⇒ Table
Instantiate a Table object from a tab-delimited file, an Array of rows, or another Table. With no argument, an empty Table is created.
Attributes
input: OPTIONAL Array of rows or String to identify the name of the tab-delimited file to read
Examples
cities = Table.new() # empty table
cities = Table.new([ ["City", "State"], ["New York", "NY"], ["Dallas", "TX"] ]) # create from Array of rows
cities = Table.new("cities.txt") # read from file
cities = Table.new(capitals) # create from table
# File 'lib/tablestakes.rb', line 42
def initialize(input=nil)
  @headers = []
  @table = {}
  @indices = {}

  if input.respond_to?(:fetch)
    if input[0].respond_to?(:fetch)
      #create Table from rows
      add_rows(input)
    end
  elsif input.respond_to?(:upcase)
    # a string, then read_file
    read_file(input)
  elsif input.respond_to?(:headers)
    @headers = input.headers.dup
    input.each {|row| add_row(row) }
  end
  # else create empty +Table+
end
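The constructor picks its behaviour by duck typing rather than by class checks. The sketch below mirrors those checks in the same order so the dispatch is easier to follow. `classify_input` and the symbols it returns are hypothetical names used only for illustration; they are not part of tablestakes.

```ruby
# Hypothetical helper (not part of tablestakes) that mirrors the
# respond_to? checks in Table#initialize, in the same order.
def classify_input(input)
  if input.respond_to?(:fetch)
    # Array-like input is only used when its first element is Array-like too.
    input[0].respond_to?(:fetch) ? :rows : :ignored
  elsif input.respond_to?(:upcase)
    :filename   # String-like: treated as a file name
  elsif input.respond_to?(:headers)
    :table      # Table-like: headers and rows are copied
  else
    :empty      # anything else, including nil
  end
end

classify_input([["City", "State"], ["Dallas", "TX"]])  # => :rows
classify_input("cities.txt")                            # => :filename
classify_input(nil)                                     # => :empty
classify_input(["City", "State"])                       # => :ignored
```

Note the last case: going by the source above, a flat Array of Strings appears to produce an empty Table, because its first element does not respond to `fetch`.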
Instance Attribute Details
#headers ⇒ Object (readonly)
The headers attribute contains the table headers used to reference columns in the Table. All headers are represented as String types.
# File 'lib/tablestakes.rb', line 23
def headers
  @headers
end
Instance Method Details
#add_column(*args) ⇒ Object
Add a column to the Table. Raises ArgumentError if the column name is already taken or the number of values does not match the number of rows already in the Table.
Attributes
args: Array of String giving the column name followed by the column values (see examples)
Examples
cities.add_column("City", ["New York", "Dallas", "San Francisco"])
cities.add_column(["City","New York", "Dallas", "San Francisco"])
cities.add_column("City", "New York", "Dallas", "San Francisco")
# File 'lib/tablestakes.rb', line 119
def add_column(*args)
  if args.kind_of? Array
    args.flatten!
    colname = args.shift
    column_vals = args
  end
  # check arguments
  raise ArgumentError, "Duplicate Column Name!" if @table.has_key?(colname)
  unless self.empty?
    if column_vals.length != @table[@headers.first].length
      raise ArgumentError, "Number of elements in column does not match existing table"
    end
  end
  append_col(colname, column_vals)
end
#add_row(*row) ⇒ Object Also known as: <<
Add a row to the Table, appending it to the end. If the Table has no headers yet, the row becomes the headers. Raises ArgumentError if the number of values does not match the number of headers.
Attributes
row: Array to hold the row values
Examples
cities = Table.new.add_row( ["City", "State"] ) # create new Table with headers
cities.add_row("New York", "NY") # add data row to Table
# File 'lib/tablestakes.rb', line 189
def add_row(*row)
  if row.kind_of? Array
    row = row.flatten
  end
  if @headers.empty?
    @headers = row
  else
    unless row.length == @headers.length
      raise ArgumentError, "Wrong number of fields in Table input"
    end
    append_row(row)
  end
  return self
end
#add_rows(array_of_rows) ⇒ Object
Add one or more rows to the Table, appending them to the end. Raises ArgumentError if a row does not have the correct number of values. The first row becomes the table headers if they are currently undefined.
Attributes
array_of_rows: Array of Arrays to hold the row values
Examples
cities.add_rows([ ["New York", "NY"], ["Austin", "TX"] ])
# File 'lib/tablestakes.rb', line 144
def add_rows(array_of_rows)
  array_of_rows.each do |r|
    add_row(r.clone)
  end
  return self
end
#append(a_table) ⇒ Object
Append one Table object to another. Raises ArgumentError if the header values and order do not align with the destination Table. Return self if appending an empty table. Return given table if appending to an empty table.
Attributes
a_table: Table to be added
Examples
cities.append(more_cities)
# File 'lib/tablestakes.rb', line 160
def append(a_table)
  if !a_table.kind_of? Table
    raise ArgumentError, "Argument to append is not a Table"
  end
  if self.empty?
    return a_table
  elsif a_table.empty?
    return self
  end
  if a_table.headers != @headers
    raise ArgumentError, "Argument to append does not have matching headers"
  end

  a_table.each do |r|
    add_row(r.clone)
  end
  return self
end
#bottom(colname, num = 1) ⇒ Object
Returns counts of the least frequent values found in a given column in the form of a Table. Raises ArgumentError if the column is not found. If no limit is given to the number of values, only the least frequent value will be returned.
Attributes
colname: String to identify the column to count
num: OPTIONAL Integer number of values to return
Examples
cities.bottom("State") # returns a Table with the least frequent state in the cities Table
cities.bottom("State", 10) # returns a Table with the 10 least frequent states in the cities Table
# File 'lib/tablestakes.rb', line 363
def bottom(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }
  return Table.new(freq[0..num-1].unshift([colname,"Count"]))
end
#column(colname) ⇒ Object
Return a copy of a column from the table, identified by column name. Returns empty Array if column name not found.
Attributes
colname: String to identify the name of the column
# File 'lib/tablestakes.rb', line 90
def column(colname)
  Array(get_col(colname))
end
#count(colname = nil, value = nil) ⇒ Object Also known as: size, length
Counts the number of instances of a particular string, given a column name, and returns an integer >= 0. Raises ArgumentError if the column name is not among the headers. If either parameter is omitted, returns the number of rows in the table.
Attributes
colname: OPTIONAL String to identify the column to count
value: OPTIONAL String value to count
Examples
cities.count # returns number of rows in cities Table
cities.size # same as cities.count
cities.length # same as cities.count
cities.count("State", "NY") # returns the number of rows with State == "NY"
# File 'lib/tablestakes.rb', line 309
def count(colname=nil, value=nil)
  if colname.nil? || value.nil?
    if @table.size > 0
      @table.each_key {|e| return @table.fetch(e).length }
    else
      return 0
    end
  end
  raise ArgumentError, "Invalid column name" unless @headers.include?(colname)

  if @table[colname]
    result = 0
    @table[colname].each do |val|
      val == value.to_s ? result += 1 : nil
    end
    result
  else
    nil
  end
end
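The comparison in the source converts the requested value with `to_s` before matching, so a non-String argument is compared against the stored Strings. A minimal plain-Ruby sketch of that comparison follows; `count_matches` is a hypothetical helper, and the column values are made-up sample data.

```ruby
# Hypothetical stand-in for the counting loop in Table#count.
# column_values plays the role of @table[colname].
def count_matches(column_values, value)
  column_values.count { |val| val == value.to_s }
end

count_matches(["NY", "TX", "NY"], "NY")             # => 2
count_matches(["10001", "75201", "10001"], 10001)   # => 2 (Integer matched via to_s)
count_matches(["NY"], "CA")                         # => 0
```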
#del_column(colname) ⇒ Object
Delete a column from the Table. Raises ArgumentError if the column name does not exist.
Attributes
colname: String to identify the name of the column
Examples
cities.del_column("State") # returns table without "State" column
# File 'lib/tablestakes.rb', line 213
def del_column(colname)
  # check arguments
  raise ArgumentError, "Column name does not exist!" unless @table.has_key?(colname)
  @headers.delete(colname)
  @table.delete(colname)
  return self
end
#del_row(rownum) ⇒ Object
Delete a row from the Table. Raises ArgumentError if the row number is not found.
Attributes
rownum: Integer (Fixnum in Ruby versions before 2.4) to hold the row number
Examples
cities.del_row(3) # deletes row with index 3 (4th row)
cities.del_row(-1) # deletes last row (per Ruby convention)
# File 'lib/tablestakes.rb', line 231
def del_row(rownum)
  # check arguments
  if self.empty? || rownum >= @table[@headers.first].length
    raise ArgumentError, "Row number does not exist!"
  end
  @headers.each do |col|
    @table[col].delete_at(rownum)
  end
  return self
end
#each ⇒ Object
Defines an iterator for Table which produces rows of data (headers omitted) for its calling block.
# File 'lib/tablestakes.rb', line 65
def each
  if block_given?
    @table[@headers.first].each_index do |index|
      nextrow = []
      @headers.each do |col|
        begin
          nextrow << @table[col][index].clone
        rescue
          nextrow << @table[col][index]
        end
      end
      yield nextrow
    end
  else
    self.to_enum(:each)
  end
end
#empty? ⇒ Boolean
Return true if the Table is empty, false otherwise.
# File 'lib/tablestakes.rb', line 105
def empty?
  @headers.length == 0 && @table.length == 0
end
#intersect(table2, colname, col2name = colname) ⇒ Object
Return an Array with the intersection of columns from different tables, eliminating duplicates. Raises ArgumentError if a column is not found.
Attributes
table2: Table to identify the secondary table in the intersection
colname: String to identify the column to intersect
col2name: OPTIONAL String to identify the column in the second table to intersect
Examples
cities.intersect(capitals, "City", "Capital") # returns Array with all capitals that are also in the cities table
# File 'lib/tablestakes.rb', line 573
def intersect(table2, colname, col2name=colname)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  raise ArgumentError, "Invalid column name" unless @table.has_key?(colname)
  raise ArgumentError, "Invalid column name" unless table2.headers.include?(col2name)

  return self.column(colname) & table2.column(col2name)
end
#join(table2, colname, col2name = colname) ⇒ Object
Given a second table to join against, and a field/column, return a Table which contains a join of the two tables. Join only lists the common column once, under the column name of the first table (if different from the name of the second). All columns from both tables are returned. Raises ArgumentError if either column is not found. Returns nil if no rows match.
Attributes
table2: Table to identify the secondary table in the join
colname: String to identify the column to join on
col2name: OPTIONAL String to identify the column in the second table to join on
Examples
cities.join(capitals, "City", "Capital") # returns a Table of cities that are also state capitals
capitals.join(cities, "State") # returns a Table of capital cities with population info from the cities table
# File 'lib/tablestakes.rb', line 470
def join(table2, colname, col2name=colname)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  raise ArgumentError, "Invalid column name" unless @table.has_key?(colname)
  raise ArgumentError, "Invalid column name" unless table2.headers.include?(col2name)

  dedupe_headers(table2, colname)

  result = [ Array(@headers) + Array(table2.headers) ]
  @table[colname].each_index do |index|
    t2_index = table2.column(col2name).find_index(@table[colname][index])
    unless t2_index.nil?
      result << self.row(index) + table2.row(t2_index)
    end
  end
  if result.length == 1 #no rows selected
    return nil
  else
    return Table.new(result)
  end
end
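The matching loop in the source uses `find_index`, so each row of the first table is paired with at most one row of the second table: the first one whose key matches. The plain-Ruby sketch below reproduces only that loop on arrays of rows. `first_match_join` is a hypothetical helper, the data is made up, and unlike the documented method it keeps both key columns, because the `dedupe_headers` step is not shown in the source.

```ruby
# Hypothetical sketch of the matching loop in Table#join, on plain arrays.
# rows1 and rows2 are arrays of rows whose first row holds the headers.
def first_match_join(rows1, rows2, col1, col2)
  head1, *data1 = rows1
  head2, *data2 = rows2
  i1 = head1.index(col1)
  i2 = head2.index(col2)
  keys2 = data2.map { |r| r[i2] }

  result = [head1 + head2]
  data1.each do |r|
    j = keys2.find_index(r[i1])          # first matching row only
    result << r + data2[j] unless j.nil?
  end
  result.length == 1 ? nil : result      # nil when no rows matched
end

cities   = [["City", "State"], ["Albany", "NY"], ["Dallas", "TX"], ["Austin", "TX"]]
capitals = [["Capital", "Region"], ["Albany", "East"], ["Austin", "South"]]

first_match_join(cities, capitals, "City", "Capital")
# => [["City", "State", "Capital", "Region"],
#     ["Albany", "NY", "Albany", "East"],
#     ["Austin", "TX", "Austin", "South"]]
```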
#rename_header(orig_name, new_name) ⇒ Object
Rename a header value for this Table object.
Attributes
orig_name: String current header name
new_name: String indicating new header name
# File 'lib/tablestakes.rb', line 248
def rename_header(orig_name, new_name)
  raise ArgumentError, "Original Column name type invalid" unless orig_name.kind_of? String
  raise ArgumentError, "New Column name type invalid" unless new_name.kind_of? String
  raise ArgumentError, "Column Name does not exist!" unless @headers.include? orig_name

  update_header(orig_name, new_name)
  return self
end
#row(index) ⇒ Object
Return a copy of a row from the table as an Array, given an index (i.e. row number). Returns empty Array if the index is out of bounds.
Attributes
index: Integer (Fixnum in Ruby versions before 2.4) indicating index of the row.
# File 'lib/tablestakes.rb', line 99
def row(index)
  Array(get_row(index))
end
#select(*columns) ⇒ Object Also known as: get_columns
Select columns from the table, given one or more column names. Returns an instance of Table with the results. Raises ArgumentError if any column is not valid.
Attributes
columns: Variable String arguments to identify the columns to select
Examples
cities.select("City", "State") # returns a Table of "City" and "State" columns
cities.select(cities.headers) # returns a new Table that is a duplicate of cities
# File 'lib/tablestakes.rb', line 400
def select(*columns)
  # check arguments
  raise ArgumentError, "Invalid column name(s)" unless columns
  columns.kind_of?(Array) ? columns.flatten! : nil
  columns.each do |c|
    raise ArgumentError, "Invalid column name" unless @table.has_key?(c)
  end

  result = []
  result_headers = []
  columns.each { |col| @headers.include?(col) ? result_headers << col : nil }
  result << result_headers
  @table[@headers.first].each_index do |index|
    this_row = []
    result_headers.each do |col|
      this_row << @table[col][index]
    end
    result << this_row
  end
  result_headers.empty? ? Table.new() : Table.new(result)
end
#sort(column = nil, &block) ⇒ Object Also known as: sort!
Sort the table based on given column. Uses precedence as defined in the column. By default will sort by the value in the first column.
Attributes
column: OPTIONAL String to identify the column on which to sort, or Integer column index
Options
datatype => :Fixnum
datatype => :Float
datatype => :Date
Examples
cities.sort("State") # Re-orders the cities table based on State name
cities.sort { |a,b| b<=>a } # Reverse the order of the cities table
cities.sort("State") { |a,b| b<=>a } # Sort by State in reverse alpha order
# File 'lib/tablestakes.rb', line 598
def sort(column=nil, &block)
  col_index = 0
  if column.kind_of? String
    col_index = @headers.index(column)
  elsif column.kind_of? Fixnum
    col_index = column
  end
  # return empty Table if empty
  if self.empty?
    return Table.new()
  end

  neworder = []
  self.each { |row| neworder << OrderedRow.new(row,col_index) }

  result = [neworder.shift.data] # take off headers
  block_given? ? neworder.sort!(&block) : neworder.sort!
  neworder.each { |row| result << row.data }

  return Table.new(result)
end
#sub(colname, match = nil, replace = nil, &block) ⇒ Object
Given a field/column, and a regular expression to match against, and a replacement string, create a new table which performs a substitute operation on column data. In the case that the given replacement is a String, a direct substitute is performed. In the case that it is a Hash and the matched text is one of its keys, the corresponding Hash value will be substituted.
Optionally takes a block containing an operation to perform on all matching data elements in the given column. Raises ArgumentError if the column is not found.
Attributes
colname: String to identify the column to substitute on
match: OPTIONAL String or Regexp to match the value in the selected column
replace: OPTIONAL String or Hash to specify the replacement text for the given match value
&block: OPTIONAL block to execute against matching values
Examples
cities.sub("Population", /(.*?),(.*?)/, '\1\2') # remove the first comma (String#sub replaces only the first match)
capitals.sub("State", /NY/, "New York") # replace acronym with full name
capitals.sub("State", /North|South/, {"North" => "South", "South" => "North"}) # Northern states for Southern and vice-versa
capitals.sub("State") { |state| state.downcase } # Lowercase for all values
# File 'lib/tablestakes.rb', line 513
def sub(colname, match=nil, replace=nil, &block)
  # check arguments
  raise ArgumentError, "No regular expression to match against" unless match || block_given?
  raise ArgumentError, "Invalid column name" unless @table.has_key?(colname)
  if ! block_given?
    if ! (String.try_convert(match) || Regexp.try_convert(match))
      raise ArgumentError, "Match expression must be String or Regexp"
    elsif ! (replace.respond_to?(:fetch) || replace.respond_to?(:to_str))
      raise ArgumentError, "Replacement must be String or Hash"
    end
  end

  result = Table.new([@headers])
  col_index = @headers.index(colname)

  self.each do |row|
    if block_given?
      row[col_index] = block.call row[col_index]
    else
      row[col_index] = row[col_index].sub(match, replace)
    end
    result.add_row(row)
  end
  return result
end
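Since the source delegates to `String#sub`, the per-cell behaviour can be checked with plain Strings. The sketch below applies the same call to a list of values. `sub_column` is a hypothetical helper and the values are made-up sample data. It also shows that `sub` replaces only the first match in each value, so a value with several commas keeps all but the first.

```ruby
# Hypothetical helper: applies String#sub to every value, as the
# non-block branch of Table#sub does for one column.
def sub_column(values, match, replace)
  values.map { |v| v.sub(match, replace) }
end

sub_column(["1,234,567", "8,000"], /(.*?),(.*?)/, '\1\2')
# => ["1234,567", "8000"]   only the first comma in each value is removed

sub_column(["North Dakota", "South Carolina"], /North|South/,
           { "North" => "South", "South" => "North" })
# => ["South Dakota", "North Carolina"]

# For comparison (not what Table#sub does): gsub replaces every match.
"1,234,567".gsub(",", "")   # => "1234567"
```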
#tally(colname) ⇒ Object
Count instances in a particular field/column and return a Table of the results. Raises ArgumentError if the column is not found.
Attributes
colname: String to identify the column to tally
Examples
cities.tally("State") # returns each State in the cities Table with number of occurrences
# File 'lib/tablestakes.rb', line 379
def tally(colname)
  # check arguments
  raise ArgumentError, "Invalid column name" unless @table.has_key?(colname)

  result = {}
  @table[colname].each do |val|
    result.has_key?(val) ? result[val] += 1 : result[val] = 1
  end
  return Table.new([[colname,"Count"]] + result.to_a)
end
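The tally is a Hash of counts turned into rows under a two-column header. A plain-Ruby sketch of that step, without the Table class, might look as follows. `tally_rows` is a hypothetical helper and the column values are made up.

```ruby
# Hypothetical sketch of the counting step in Table#tally.
# Returns the rows that the method passes to Table.new.
def tally_rows(colname, values)
  counts = Hash.new(0)                 # missing keys start at 0
  values.each { |val| counts[val] += 1 }
  [[colname, "Count"]] + counts.to_a   # header row, then [value, count] pairs
end

tally_rows("State", ["NY", "TX", "NY", "CA", "TX", "NY"])
# => [["State", "Count"], ["NY", 3], ["TX", 2], ["CA", 1]]
```

Values appear in order of first occurrence, because Ruby Hashes preserve insertion order.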
#to_a ⇒ Object
Converts a Table object to an array of arrays (each row). The first entry is the Array of table headers.
Attributes
none
# File 'lib/tablestakes.rb', line 282
def to_a
  result = [ Array(@headers) ]

  @table[@headers.first].each_index do |index|
    items = []
    @headers.each do |col|
      items << @table[col][index]
    end
    result << items
  end
  result
end
#to_s ⇒ Object
Converts a Table object to a tab-delimited string.
Attributes
none
# File 'lib/tablestakes.rb', line 261
def to_s
  result = @headers.join("\t") << "\n"

  @table[@headers.first].each_index do |index|
    @headers.each do |col|
      result << @table[col][index].to_s
      unless col == @headers.last
        result << "\t"
      else
        result << "\n"
      end
    end
  end
  result
end
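Both conversions walk a column-oriented store (a Hash of header to Array of values) row by row. The sketch below shows that traversal on plain Ruby objects. `columns_to_rows` and `rows_to_tsv` are hypothetical helpers, and the Hash layout is inferred from the `@table[col][index]` accesses in the source.

```ruby
# Hypothetical helpers mirroring the traversal in Table#to_a and Table#to_s.
# headers: Array of column names; table: Hash of column name => Array of values.
def columns_to_rows(headers, table)
  data = table[headers.first].each_index.map do |i|
    headers.map { |col| table[col][i] }   # pick the i-th value of each column
  end
  [headers] + data
end

def rows_to_tsv(rows)
  # One line per row, tab between fields, newline after every row.
  rows.map { |r| r.join("\t") }.join("\n") + "\n"
end

headers = ["City", "State"]
table   = { "City" => ["New York", "Dallas"], "State" => ["NY", "TX"] }

rows = columns_to_rows(headers, table)
# => [["City", "State"], ["New York", "NY"], ["Dallas", "TX"]]
rows_to_tsv(rows)
# => "City\tState\nNew York\tNY\nDallas\tTX\n"
```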
#top(colname, num = 1) ⇒ Object
Returns counts of the most frequent values found in a given column in the form of a Table. Raises ArgumentError if the column is not found. If no limit is given to the number of values, only the top value will be returned.
Attributes
colname: String to identify the column to count
num: OPTIONAL Integer number of values to return
Examples
cities.top("State") # returns a Table with the most frequent state in the cities Table
cities.top("State", 10) # returns a Table with the 10 most frequent states in the cities Table
# File 'lib/tablestakes.rb', line 345
def top(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }.reverse
  return Table.new(freq[0..num-1].unshift([colname,"Count"]))
end
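The ranking is a tally sorted by count and then sliced with `freq[0..num-1]`. The plain-Ruby sketch below follows the same steps for both the most frequent and least frequent cases. `frequency_rows` and its `most_frequent:` keyword are hypothetical names, and the data is made up.

```ruby
# Hypothetical sketch of the ranking used by Table#top and Table#bottom.
def frequency_rows(values, num = 1, most_frequent: true)
  counts = Hash.new(0)
  values.each { |v| counts[v] += 1 }
  freq = counts.to_a.sort_by { |_value, count| count }  # ascending by count
  freq = freq.reverse if most_frequent                   # descending for "top"
  freq[0..num - 1]                                       # same slice as the source
end

states = %w[NY TX NY CA TX NY]
frequency_rows(states)                           # => [["NY", 3]]
frequency_rows(states, 2)                        # => [["NY", 3], ["TX", 2]]
frequency_rows(states, 1, most_frequent: false)  # => [["CA", 1]]
```

Two details follow from the slice itself: values with equal counts come back in no guaranteed order, since `sort_by` is not stable, and a `num` of 0 yields the range `0..-1`, which selects every entry rather than none.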
#union(table2, colname, col2name = colname) ⇒ Object
Return an Array with the union of the elements of columns in the given tables, eliminating duplicates. Raises ArgumentError if a column is not found.
Attributes
table2: Table to identify the secondary table in the union
colname: String to identify the column to union
col2name: OPTIONAL String to identify the column in the second table to union
Examples
cities.union(capitals, "City", "Capital") # returns Array with all cities in both tables
# File 'lib/tablestakes.rb', line 553
def union(table2, colname, col2name=colname)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  raise ArgumentError, "Invalid column name" unless @table.has_key?(colname)
  raise ArgumentError, "Invalid column name" unless table2.headers.include?(col2name)

  return self.column(colname) | table2.column(col2name)
end
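The source shows `#union` relying on `Array#|`, and `#intersect` relying on `Array#&`, so the set behaviour can be tried on plain Arrays. In the sketch below, `column_union` and `column_intersection` are hypothetical wrappers and the values are made up.

```ruby
# Hypothetical wrappers around the Array operators used in the source.
def column_union(col1, col2)
  col1 | col2   # all distinct values, first array's order first
end

def column_intersection(col1, col2)
  col1 & col2   # distinct values present in both, in first array's order
end

cities   = ["Albany", "Dallas", "Austin", "Dallas"]
capitals = ["Austin", "Albany", "Boston"]

column_union(cities, capitals)         # => ["Albany", "Dallas", "Austin", "Boston"]
column_intersection(cities, capitals)  # => ["Albany", "Austin"]
```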
#where(colname, condition = nil) ⇒ Object Also known as: get_rows
Given a condition for a field/column, return a subtable containing the rows that match the condition. If no condition is given, a new Table is returned with all records. Returns an empty Table if no rows meet the condition. Raises ArgumentError if the column is not found.
Attributes
colname: String to identify the column to test
condition: OPTIONAL String containing a Ruby condition to evaluate
Examples
cities.where("State", "=='NY'") # returns a Table of cities in New York state
cities.where("State", "=~ /New.*/") # returns a Table of cities in states that start with "New"
cities.where("Population", ".to_i > 1000000") # returns a Table of cities with population over 1 million
# File 'lib/tablestakes.rb', line 438
def where(colname, condition=nil)
  # check arguments
  raise ArgumentError, "Invalid Column Name" unless @headers.include?(colname)

  result = []
  result << @headers
  self.each do |row|
    if condition
      eval(%q["#{row[headers.index(colname)]}"] << "#{condition}") ? result << row : nil
    else
      result << row
    end
  end
  result.length > 1 ? Table.new(result) : Table.new()
end
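The condition String is appended to an expression for the cell value and the result is passed to `eval`. The sketch below isolates one iteration of that loop to show what gets evaluated. `row_matches?` is a hypothetical helper and the row is made-up sample data.

```ruby
# Hypothetical stand-in for one iteration of the loop in Table#where.
def row_matches?(row, headers, colname, condition)
  # %q[...] does not interpolate, so the #{...} below is expanded later,
  # inside eval, where row, headers and colname are in scope.
  code = %q["#{row[headers.index(colname)]}"] + condition
  eval(code) ? true : false
end

headers = ["City", "State", "Population"]
row     = ["Springfield", "NY", "1500000"]

row_matches?(row, headers, "State", "=='NY'")                # => true
row_matches?(row, headers, "State", "=~ /New.*/")            # => false
row_matches?(row, headers, "Population", ".to_i > 1000000")  # => true
```

Because the condition is executed as Ruby code, it should only ever come from trusted input.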
#write_file(filename) ⇒ Object
Write a representation of the Table object to a file (tab delimited).
Attributes
filename: String to identify the name of the file to write
# File 'lib/tablestakes.rb', line 626
def write_file(filename)
  file = File.open(filename, "w")
  file.print self.to_s
end