Class: IOStreams::Tabular

Inherits:
Object
Defined in:
lib/io_streams/tabular.rb,
lib/io_streams/tabular/header.rb,
lib/io_streams/tabular/parser/csv.rb,
lib/io_streams/tabular/parser/psv.rb,
lib/io_streams/tabular/parser/base.rb,
lib/io_streams/tabular/parser/hash.rb,
lib/io_streams/tabular/parser/json.rb,
lib/io_streams/tabular/parser/array.rb,
lib/io_streams/tabular/parser/fixed.rb,
lib/io_streams/tabular/utility/csv_row.rb

Overview

Common handling for efficiently processing tabular data such as CSV, spreadsheet or other tabular files on a line by line basis.

Tabular consists of a table of data where the first row is usually the header, and subsequent rows are the data elements.

Tabular applies the header information to every row of data when #record_parse is called.

Example using the default CSV parser:

tabular = Tabular.new
tabular.parse_header("first field,Second,thirD")
# => ["first field", "Second", "thirD"]

tabular.cleanse_header!
# => ["first_field", "second", "third"]

tabular.record_parse("1,2,3")
# => {"first_field"=>"1", "second"=>"2", "third"=>"3"}

tabular.record_parse([1,2,3])
# => {"first_field"=>1, "second"=>2, "third"=>3}

tabular.render([5,6,9])
# => "5,6,9"

tabular.render({"third"=>"3", "first_field"=>"1" })
# => "1,,3"
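
The header-to-row mapping in the example above can be sketched without the library. This is a minimal standalone illustration, not the gem's implementation: the helper names `cleanse` and `to_record` are hypothetical, the cleansing rule is only inferred from the example output, and the naive comma split ignores quoted fields.

```ruby
# Standalone sketch; does not use the iostreams gem.
# Assumption: cleansing strips, downcases and replaces whitespace with "_",
# which is enough to reproduce the example output above.
def cleanse(columns)
  columns.map { |column| column.strip.downcase.gsub(/\s+/, "_") }
end

# Pairs each column name with the value in the same position.
# Accepts either a comma-delimited String or an Array of values.
def to_record(columns, line)
  values = line.is_a?(Array) ? line : line.split(",", -1)
  columns.zip(values).to_h
end

columns = cleanse("first field,Second,thirD".split(","))
record  = to_record(columns, "1,2,3")
```

Passing an Array skips the split step, so the values keep their original types, as in the `record_parse([1,2,3])` line of the example.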

Defined Under Namespace

Modules: Parser, Utility
Classes: Header

Constructor Details

#initialize(format: nil, file_name: nil, format_options: nil, **args) ⇒ Tabular

Parse a delimited data source.

Parameters

format: [Symbol]
  :csv, :hash, :array, :json, :psv, :fixed

For all other parameters, see Tabular::Header.new


# File 'lib/io_streams/tabular.rb', line 56

def initialize(format: nil, file_name: nil, format_options: nil, **args)
  @header = Header.new(**args)
  klass   =
    if file_name && format.nil?
      self.class.parser_class_for_file_name(file_name)
    else
      self.class.parser_class(format)
    end
  @parser = format_options ? klass.new(format_options) : klass.new
end
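
The constructor consults `file_name` only when no explicit `format` is given. A minimal sketch of that precedence follows, assuming a hypothetical `PARSERS` table and a hypothetical `parser_for` helper. The extension lookup is an assumed rule, since `parser_class_for_file_name` is not shown here, and strings stand in for parser classes.

```ruby
# Hypothetical registry: format name => parser (strings stand in for classes).
# The nil key plays the role of the default parser.
PARSERS = { nil => "Csv", csv: "Csv", psv: "Psv", json: "Json" }.freeze

def parser_for(format: nil, file_name: nil)
  key =
    if file_name && format.nil?
      # Assumed rule: derive the format from the file extension.
      File.extname(file_name).delete_prefix(".").downcase.to_sym
    else
      # An explicit format always wins over the file name.
      format
    end
  PARSERS.fetch(key) { raise ArgumentError, "Unknown format: #{key.inspect}" }
end

from_name     = parser_for(file_name: "people.psv")
from_explicit = parser_for(format: :json, file_name: "people.csv")
default       = parser_for()
```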

Instance Attribute Details

#format ⇒ Object (readonly)

Returns the value of attribute format.



# File 'lib/io_streams/tabular.rb', line 47

def format
  @format
end

#header ⇒ Object (readonly)

Returns the value of attribute header.



# File 'lib/io_streams/tabular.rb', line 47

def header
  @header
end

#parser ⇒ Object (readonly)

Returns the value of attribute parser.



# File 'lib/io_streams/tabular.rb', line 47

def parser
  @parser
end

Class Method Details

.deregister_format(format) ⇒ Object

Deregister a file format.

Returns the parser that was registered for the format, or nil if the format was not registered

Example:

deregister_format(:xls)

Raises:

  • (ArgumentError)


# File 'lib/io_streams/tabular.rb', line 144

def self.deregister_format(format)
  raise(ArgumentError, "Invalid format #{format.inspect}") unless format.to_s =~ /\A\w+\Z/
  @formats.delete(format.to_sym)
end

.register_format(format, parser) ⇒ Object

Register a format and the parser class for it.

Example:

register_format(:csv, IOStreams::Tabular::Parser::Csv)

Raises:

  • (ArgumentError)


# File 'lib/io_streams/tabular.rb', line 133

def self.register_format(format, parser)
  raise(ArgumentError, "Invalid format #{format.inspect}") unless format.nil? || format.to_s =~ /\A\w+\Z/
  @formats[format.nil? ? nil : format.to_sym] = parser
end

.registered_formats ⇒ Object

Returns [Array<Symbol>] the list of registered formats



# File 'lib/io_streams/tabular.rb', line 150

def self.registered_formats
  @formats.keys
end
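
The three class methods above form a small registry keyed by format name. The sketch below copies their bodies into a hypothetical `FormatRegistry` module so the behaviour can be tried in isolation. `XlsParser` is a placeholder class, and the `@formats` store is assumed to be a plain Hash.

```ruby
# Hypothetical stand-in for the class-level registry; not the gem itself.
module FormatRegistry
  @formats = {}

  def self.register_format(format, parser)
    raise(ArgumentError, "Invalid format #{format.inspect}") unless format.nil? || format.to_s =~ /\A\w+\Z/
    @formats[format.nil? ? nil : format.to_sym] = parser
  end

  def self.deregister_format(format)
    raise(ArgumentError, "Invalid format #{format.inspect}") unless format.to_s =~ /\A\w+\Z/
    # Hash#delete returns the stored value, or nil if the key was absent.
    @formats.delete(format.to_sym)
  end

  def self.registered_formats
    @formats.keys
  end
end

XlsParser = Class.new # placeholder parser class

FormatRegistry.register_format(:xls, XlsParser)
registered = FormatRegistry.registered_formats
removed    = FormatRegistry.deregister_format("xls") # Strings are converted to Symbols
missing    = FormatRegistry.deregister_format(:xls)  # already removed
```

Format names must consist of word characters only, so a name such as "bad name" raises ArgumentError in both register_format and deregister_format.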

Instance Method Details

#cleanse_header! ⇒ Object

Returns [Array<String>] the cleansed columns



# File 'lib/io_streams/tabular.rb', line 124

def cleanse_header!
  header.cleanse!
  header.columns
end

#header? ⇒ Boolean

Returns [true|false] whether a header is still required in order to parse or render the current format.

Returns:

  • (Boolean)


# File 'lib/io_streams/tabular.rb', line 68

def header?
  parser.requires_header? && IOStreams::Utils.blank?(header.columns)
end

#parse_header(line) ⇒ Object

Returns [Array] the header row/line after parsing. Returns `nil` if the row/line is blank, or a header is not required for the supplied format (:json, :hash).

Notes:

  • Call `header?` first to determine if the header should be parsed first.

  • The header columns are set after parsing the row, but the header is not cleansed.



# File 'lib/io_streams/tabular.rb', line 83

def parse_header(line)
  return if IOStreams::Utils.blank?(line) || !parser.requires_header?

  header.columns = parser.parse(line)
end
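
The note about calling `header?` first suggests a reading loop along the following lines. `MiniTabular` is a hypothetical stand-in that mimics only the three methods involved, using a naive comma split instead of a real parser. With the gem, the same loop shape would presumably apply to an IOStreams::Tabular instance.

```ruby
# Hypothetical stand-in for the header-related behaviour; not the gem itself.
class MiniTabular
  attr_reader :columns

  def initialize(requires_header: true)
    @requires_header = requires_header
    @columns         = nil
  end

  # True while a header is needed and none has been parsed yet.
  def header?
    @requires_header && (@columns.nil? || @columns.empty?)
  end

  def parse_header(line)
    return if line.nil? || line.strip.empty? || !@requires_header

    @columns = line.split(",", -1)
  end

  def record_parse(line)
    return if line.nil? || line.strip.empty?

    @columns.zip(line.split(",", -1)).to_h
  end
end

tabular = MiniTabular.new
records = []
["name,age", "", "Ann,31", "Bob,27"].each do |line|
  if tabular.header?
    tabular.parse_header(line)           # first non-blank line becomes the header
  else
    record = tabular.record_parse(line)  # nil for blank lines
    records << record if record
  end
end
```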

#record_parse(line) ⇒ Object

Returns [Hash<String,Object>] the line as a hash. Returns nil if the line is blank.



# File 'lib/io_streams/tabular.rb', line 91

def record_parse(line)
  line = row_parse(line)
  header.to_hash(line) if line
end

#render(row) ⇒ Object

Renders the output row



# File 'lib/io_streams/tabular.rb', line 105

def render(row)
  return if IOStreams::Utils.blank?(row)

  parser.render(row, header)
end

#render_header ⇒ Object

Returns [String] the header rendered for the output format. Returns nil if no header is required.



# File 'lib/io_streams/tabular.rb', line 113

def render_header
  return unless requires_header?

  if IOStreams::Utils.blank?(header.columns)
    raise(Errors::MissingHeader, "Header columns must be set before attempting to render a header for format: #{format.inspect}")
  end

  parser.render(header.columns, header)
end
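
On the output side, the header is rendered once and each row after it. The sketch below imitates that flow with hypothetical top-level helpers. `MissingHeader` here is a local placeholder for the gem's error class, and joining with commas is a simplification of what a real CSV renderer does.

```ruby
# Hypothetical stand-ins for the rendering flow; not the gem itself.
class MissingHeader < StandardError; end

def render_header(columns, requires_header: true)
  return unless requires_header

  if columns.nil? || columns.empty?
    raise(MissingHeader, "Header columns must be set before rendering a header")
  end

  columns.join(",")
end

def render(columns, row)
  return if row.nil? || row.empty?

  # A Hash is reordered to match the header; an Array is used as given.
  values = row.is_a?(Hash) ? columns.map { |column| row[column] } : row
  values.join(",")
end

columns = %w[name age]
output  = [
  render_header(columns),
  render(columns, { "age" => 31, "name" => "Ann" }),
  render(columns, ["Bob", 27])
].compact
```

Formats that need no header return nil from `render_header`, which `compact` drops from the output here.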

#requires_header? ⇒ Boolean

Returns [true|false] whether a header row should be rendered on output.

Returns:

  • (Boolean)


# File 'lib/io_streams/tabular.rb', line 73

def requires_header?
  parser.requires_header?
end

#row_parse(line) ⇒ Object

Returns [Array] the row/line as a parsed Array of values. Returns nil if the row/line is blank.



# File 'lib/io_streams/tabular.rb', line 98

def row_parse(line)
  return if IOStreams::Utils.blank?(line)

  parser.parse(line)
end