Class: CSVH::Reader

Inherits:
Object
Extended by:
Forwardable
Defined in:
lib/csvh/reader.rb

Overview

Sequentially and lazily reads CSV-formatted data that has a header row. The headers can be accessed before any data rows are read, and even when the data contains no rows other than the header row.

Constant Summary

DEFAULT_CSV_OPTS =
{
  headers: :first_row,
  return_headers: true
}.freeze
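
To illustrate what these defaults imply, here is a minimal sketch that uses only Ruby's standard `csv` library rather than `CSVH::Reader` itself. The sample data and variable names are assumptions made for illustration. With `return_headers: true` the header row comes back as the first item; with `:headers` alone it is consumed silently.

```ruby
require "csv"

data = "name,qty\napple,3\n" # hypothetical sample data

# With both options set, the first item returned is the header row itself.
with_header_row = CSV.new(data, headers: :first_row, return_headers: true)
first = with_header_row.shift
first_is_header = first.header_row?

# With only :headers set, the header row is consumed silently and the first
# item returned is the first data row.
without_header_row = CSV.new(data, headers: :first_row)
first_data = without_header_row.shift
first_data_is_header = first_data.header_row?
```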

Constructor Details

#initialize(csv) ⇒ Reader

Returns a new reader based on the given CSV object. The CSV object must be configured to return a header row (a `CSV::Row` that returns true from its `#header_row?` method) as its first item. The header row must also not have been read yet.
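
The two requirements above can be checked on a plain `CSV` object. The following is a sketch using only the standard `csv` library; the sample data and variable names are hypothetical.

```ruby
require "csv"

data = "id,email\n1,a@example.com\n" # hypothetical sample data

# Configured to return the header row: this satisfies the constructor's check.
suitable = CSV.new(data, headers: :first_row, return_headers: true)
suitable_ok = suitable.return_headers?

# Headers are parsed but not returned: the constructor would reject this.
unsuitable = CSV.new(data, headers: :first_row)
unsuitable_ok = unsuitable.return_headers?

# Once the header row has been shifted off, the next row is a data row,
# so a reader can no longer find the header row where it expects it.
suitable.shift
next_row_is_header = suitable.shift.header_row?
```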

Parameters:

  • csv (CSV)

    A Ruby `::CSV` object.



# File 'lib/csvh/reader.rb', line 116

def initialize(csv)
  # Reject CSV objects that will not hand back the header row as an item.
  unless csv.return_headers?
    raise \
      InappropreateCsvInstanceError,
      "#{self.class} requires a CSV instance that returns headers." \
      " It needs to have been initialized with non-false/nil values" \
      " for :headers and :return_headers options."
  end
  @csv = csv
end

Class Method Details

.from_file(file_path, **opts) {|reader| ... } ⇒ Reader, object Also known as: foreach

When called without a block argument, returns an open reader for data from the file at the given file_path.

When called with a block argument, passes an open reader for data from the file to the given block, closes the reader (and its underlying file IO channel) before returning, and then returns the value that was returned by the block.

By default, the underlying CSV object is initialized with default options for data with a header row and to return the header row. Any additional options you supply will be added to those defaults or override them.
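
The add-or-override behavior follows from `Hash#merge`, which the source listing for this method uses (`default_csv_opts.merge(opts)`). A small sketch, with a local `defaults` hash standing in for the library's defaults and with hypothetical caller options:

```ruby
# Stand-in for the library's default options (same keys and values as
# DEFAULT_CSV_OPTS above).
defaults = { headers: :first_row, return_headers: true }.freeze

# A key that is not in the defaults is added alongside them.
added = defaults.merge(col_sep: ";")

# A key that is already present is overridden by the caller's value.
overridden = defaults.merge(headers: true)
```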

A Reader created using this method delegates the same IO methods that a `CSV` created using `CSV.open` does, except `close_write`, `flush`, `fsync`, `sync`, `sync=`, and `truncate`. You may call:

  • binmode()

  • binmode?()

  • close()

  • close_read()

  • closed?()

  • eof()

  • eof?()

  • external_encoding()

  • fcntl()

  • fileno()

  • flock()

  • flush()

  • internal_encoding()

  • ioctl()

  • isatty()

  • path()

  • pid()

  • pos()

  • pos=()

  • reopen()

  • seek()

  • fstat()

  • tell()

  • to_i()

  • to_io()

  • tty?()

Parameters:

  • file_path (String)

    the path of the file to read.

  • opts

    options for `CSV.new`.

Yield Parameters:

  • reader (Reader)

    the new reader.

Returns:

  • (Reader, object)

    the new reader or the value returned from the given block.



# File 'lib/csvh/reader.rb', line 71

def from_file(file_path, **opts)
  opts = default_csv_opts.merge(opts)
  io = File.open(file_path, 'r')
  csv = CSV.new(io, **opts)
  instance = new(csv)

  if block_given?
    begin
      yield instance
    ensure
      instance.close unless instance.closed?
    end
  else
    instance
  end
end

.from_string_or_io(data, **opts) ⇒ Reader Also known as: parse

Returns an open reader for data from the given string or readable IO stream.

Parameters:

  • data (String, IO)

    the source of the data to read.

  • opts

    options for `CSV.new`.

Returns:

  • (Reader)

    the new reader.



# File 'lib/csvh/reader.rb', line 94

def from_string_or_io(data, **opts)
  opts = default_csv_opts.merge(opts)
  csv = CSV.new(data, **opts)
  new(csv)
end

Instance Method Details

#each {|row| ... } ⇒ Object

When given a block, yields each remaining data row of the data source in turn as a `CSV::Row` instance. When called without a block, returns an Enumerator over those rows.

Never yields the header row. The headers are still available via the #headers method of either the reader or the row object.
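
The block and no-block forms mirror `CSV#each` on the underlying object once the header row has been consumed. The sketch below reproduces that with the standard `csv` library only; the sample data and variable names are assumptions.

```ruby
require "csv"

data = "name,qty\napple,3\npear,5\n" # hypothetical sample data
opts = { headers: :first_row, return_headers: true }

# Block form: consume the header row first, then iterate the data rows.
csv = CSV.new(data, **opts)
headers = csv.shift.headers
names = []
csv.each { |row| names << row["name"] }

# No-block form: an Enumerator over the remaining rows is returned.
other = CSV.new(data, **opts)
other.shift
enum = other.each
rows_as_hashes = enum.map(&:to_h)
```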

Yield Parameters:

  • (CSV::Row)


# File 'lib/csvh/reader.rb', line 210

def each
  headers
  if block_given?
    @csv.each { |row| yield row }
  else
    @csv.each
  end
end

#headers ⇒ Array<String>

Returns the list of column header values from the CSV data.

If any rows have already been read, then the result is immediately returned, having been recorded when the header row was initially encountered.

If no rows have been read yet, then the first row is read from the data in order to return the result.
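
Because the header row is read as an item in its own right, the headers can be obtained even when the data has no other rows. A sketch with the standard `csv` library, using hypothetical header-only data:

```ruby
require "csv"

header_only = "id,email\n" # hypothetical data: a header row and nothing else
csv = CSV.new(header_only, headers: :first_row, return_headers: true)

# The first item is the header row, from which the header names are taken.
row = csv.shift
headers = row.headers if row.header_row?

# Nothing is left to read after the header row.
no_more_rows = csv.shift.nil?
```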

Returns:

  • (Array<String>)

    the column header names.



# File 'lib/csvh/reader.rb', line 142

def headers
  @headers ||= begin
    row = @csv.readline
    unless row.header_row?
      raise \
        CsvPrematurelyShiftedError,
        "the header row was prematurely read from the underlying CSV object."
    end
    row.headers
  end
end

#read ⇒ CSV::Table Also known as: readlines

Slurps the remaining data rows and returns a `CSV::Table`.

This is essentially the same behavior as `CSV#read`, but ensures that the header info has been fetched first, and the resulting table will never include the header row.

Note that the Ruby documentation for `CSV#read` (at least as of 2.2.2) is incomplete and simply says that it returns “an Array of Arrays”, but it actually returns a table if a truthy `:headers` option was used when creating the `CSV` object.
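
The difference in return type can be seen directly with the standard `csv` library. This is a sketch with hypothetical sample data; the header row is shifted off by hand to stand in for the reader fetching the headers first.

```ruby
require "csv"

data = "a,b\n1,2\n" # hypothetical sample data

# Without :headers, CSV#read returns an Array of Arrays.
plain = CSV.new(data).read

# With a truthy :headers option, it returns a CSV::Table instead.
csv = CSV.new(data, headers: :first_row, return_headers: true)
csv.shift # consume the header row so that it is not part of the table
table = csv.read
column_a = table["a"]
```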

Returns:

  • (CSV::Table)

    a table of remaining unread rows



# File 'lib/csvh/reader.rb', line 254

def read
  headers
  @csv.read
end

#shift ⇒ CSV::Row Also known as: gets, readline

A single data row is pulled from the data source, parsed and returned as a CSV::Row.

This is essentially the same behavior as `CSV#shift`, but ensures that the header info has been fetched first, and #shift will never return the header row.
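
A typical way to pull rows one at a time is a loop that stops when `CSV#shift` returns nil at the end of the data. The sketch below uses the standard `csv` library with hypothetical sample data, shifting the header row off by hand first.

```ruby
require "csv"

data = "name,qty\napple,3\npear,5\n" # hypothetical sample data
csv = CSV.new(data, headers: :first_row, return_headers: true)
csv.shift # header row

# Pull data rows one at a time until the source is exhausted.
quantities = []
while (row = csv.shift)
  quantities << Integer(row["qty"])
end
```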

Returns:

  • (CSV::Row)

    the next previously unread row



# File 'lib/csvh/reader.rb', line 269

def shift
  headers
  @csv.shift
end

#to_csvh_reader ⇒ Reader

Returns the target of the method call.

Returns:

  • (Reader)

    the target of the method call.



# File 'lib/csvh/reader.rb', line 128

def to_csvh_reader
  self
end