Class: MineShaft::HTMLTable
- Inherits:
- Object (hierarchy: Object, MineShaft::HTMLTable)
- Defined in:
- lib/mine_shaft/html_table.rb
Overview
Provides several convenience methods for translating a (machinist-) parsed HTML table into standard Ruby data structures. Every table is assumed to have a “heading” row as its first row, and that row is assumed to use <td> elements (instead of <th>). Because only <td> cells are read, heading cells marked up as <th> would not be picked up.
Instance Method Summary
- #content_rows ⇒ Object
  Public: Retrieve the content of all the <td> elements from the table, except for the first row.
- #deserialize ⇒ Object
  Public: Converts HTML table to an Array of Hash objects, using the column headings as keys for each Hash element.
- #headings ⇒ Object (also: #headers)
  Public: Retrieves the content from the <td> elements of the first row of the table.
- #initialize(parsed_table) ⇒ HTMLTable (constructor)
  Public: Initialize a new HTMLTable with the specified table data as parsed by machinist (or Nokogiri).
- #symbolized_headings ⇒ Object
  Public: Converts the return value of #headings to an Array of lower-cased Symbol elements.
- #td_elements ⇒ Object
  Public: Retrieves the content from all <td> elements in the table.
Constructor Details
#initialize(parsed_table) ⇒ HTMLTable
Public: Initialize a new HTMLTable with the specified table data as parsed
by machinist (or Nokogiri).
parsed_table - A Nokogiri::HTML::Document or Nokogiri::XML::Element scoped
to only the HTML table you are interested in. Technically
speaking, you could pass in more content than just the
<table> element and it would likely work fine, but a lone
<table> element is the anticipated content structure.
Returns an instance of HTMLTable.
# File 'lib/mine_shaft/html_table.rb', line 17
def initialize(parsed_table)
  @table = parsed_table
end
Instance Method Details
#content_rows ⇒ Object
Public: Retrieve the content of all the <td> elements from the table,
except for the first row.
Returns an Array of Array elements, each one being the content from one
row of the table. The returned content does NOT include the first row,
as it is assumed to be the heading of the table.
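# File 'lib/mine_shaft/html_table.rb', line 27
def content_rows
  # Skip the heading row, then group the remaining cells into rows.
  table_content = td_elements[column_count, td_elements.size]
  # each_slice replaces enum_slice, which was removed in Ruby 1.9.
  table_content.each_slice(column_count).to_a
end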
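The slicing in #content_rows is easier to follow on plain Arrays. The following is a minimal sketch, not the gem's code: `rows_without_heading`, `cells`, and the explicit `column_count` argument are hypothetical stand-ins for #td_elements and the `column_count` helper, whose definition is not shown on this page.

```ruby
# Sketch of the slicing done by #content_rows, using plain Arrays.
# `cells` and `column_count` are illustrative stand-ins, not the gem's API.
def rows_without_heading(cells, column_count)
  # Drop the first `column_count` cells (the heading row)...
  body = cells.drop(column_count)
  # ...then group the remainder into rows of `column_count` cells each.
  body.each_slice(column_count).to_a
end

cells = ["Name", "Number", "John", "123-456-7890", "Jane", "555-000-1111"]
rows = rows_without_heading(cells, 2)
# rows => [["John", "123-456-7890"], ["Jane", "555-000-1111"]]

# A table containing only the heading row yields no content rows.
empty = rows_without_heading(["Name", "Number"], 2)
# empty => []
```

Because the cells are handled as one flat list, a row with fewer <td> cells than the heading row (for example, one using colspan) would likely shift every later cell into the wrong column.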
#deserialize ⇒ Object
Public: Converts HTML table to an Array of Hash objects, using the column
headings as keys for each Hash element.
Examples
Given 'names' was initialized with the following table:
---------------------
|Name  |Number      |
---------------------
|John  |123-456-7890|
---------------------
names.deserialize
# => [{:name => "John", :number => "123-456-7890"}]
Returns an Array of Hash objects. Each Hash element is a
key-value mapping of "table header"-"row content". (Note that the
key is a downcased Symbol of the heading value.)
# File 'lib/mine_shaft/html_table.rb', line 51
def deserialize
  content_rows.map do |row_cells|
    symbolized_headings.inject({}) do |all_attributes, current_attribute|
      index_of_header = symbolized_headings.index(current_attribute)
      value = row_cells[index_of_header]
      all_attributes.merge({current_attribute.to_sym => value})
    end
  end
end
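The heading-to-cell pairing in #deserialize can also be pictured with Array#zip. This is only a sketch on sample data, not the gem's implementation: `headings` and `rows` stand in for the return values of #symbolized_headings and #content_rows.

```ruby
# Sketch: pair each row's cells with the symbolized headings.
# `headings` and `rows` are sample data standing in for
# #symbolized_headings and #content_rows.
headings = [:name, :number]
rows = [["John", "123-456-7890"], ["Jane", "555-000-1111"]]

# zip lines up heading i with cell i; to_h turns the pairs into a Hash.
records = rows.map { |row_cells| headings.zip(row_cells).to_h }
# records.first => {:name => "John", :number => "123-456-7890"}
```

The two approaches agree when every heading is unique. With duplicate headings they differ: the index lookup in #deserialize always finds the first matching column, whereas `zip(...).to_h` keeps the value from the last one.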
#headings ⇒ Object (also: #headers)
Public: Retrieves the content from the <td> elements of the first row of
the table.
Returns an Array of the content contained in each <td> element of the
first row.
# File 'lib/mine_shaft/html_table.rb', line 73
def headings
  td_elements.slice(0, column_count)
end
#symbolized_headings ⇒ Object
Public: Converts the return value of #headings to an Array of
lower-cased Symbol elements.
Returns an Array of Symbol elements.
# File 'lib/mine_shaft/html_table.rb', line 82
def symbolized_headings
  headings.map { |header| header.downcase.to_sym }
end
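It helps to see what `downcase.to_sym` produces for headings that contain spaces or punctuation. The sketch below uses made-up heading text; the `snake_keys` variant is a hypothetical alternative and not something the gem provides.

```ruby
# Sketch of the downcase-then-symbolize step on sample heading text.
headings = ["Name", "Phone Number", "E-mail"]
keys = headings.map { |heading| heading.downcase.to_sym }
# keys => [:name, :"phone number", :"e-mail"]

# Hypothetical variant (not part of the gem): collapse every run of
# non-alphanumeric characters into "_" to get snake_case symbols.
snake_keys = headings.map do |heading|
  heading.strip.downcase.gsub(/[^a-z0-9]+/, "_").to_sym
end
# snake_keys => [:name, :phone_number, :e_mail]
```

Keys such as `:"phone number"` are valid Symbols, but they have to be written with quotes when used to look up values in the Hashes returned by #deserialize.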
#td_elements ⇒ Object
Public: Retrieves the content from all <td> elements in the table.
Returns an Array of the content contained in each <td> element.
# File 'lib/mine_shaft/html_table.rb', line 64
def td_elements
  @table.search("td").map(&:content)
end
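As the listing shows, #td_elements only calls `search("td")` on the wrapped object and `content` on each result. A minimal sketch of an object satisfying that contract follows; `FakeCell` and `FakeTable` are hypothetical stand-ins for Nokogiri nodes and are not part of MineShaft or Nokogiri.

```ruby
# Sketch: the smallest object #td_elements could work with.
# FakeCell and FakeTable are hypothetical stand-ins for Nokogiri nodes.
FakeCell = Struct.new(:content)

class FakeTable
  def initialize(texts)
    @cells = texts.map { |text| FakeCell.new(text) }
  end

  # Mirrors the one call #td_elements makes; only "td" is supported here.
  def search(selector)
    selector == "td" ? @cells : []
  end
end

table = FakeTable.new(["Name", "Number", "John", "123-456-7890"])
# Same expression as the body of #td_elements, applied to the stand-in.
contents = table.search("td").map(&:content)
# contents => ["Name", "Number", "John", "123-456-7890"]
```

Whether such a stand-in is enough for the other methods depends on the `column_count` helper, which is not shown on this page and may make further calls on the parsed table.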