Class: Libis::Tools::Spreadsheet

Inherits:
Object
  • Object
show all
Defined in:
lib/libis/tools/spreadsheet.rb

Class Method Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(file_name, opts = {}) ⇒ Spreadsheet

Spreadsheet reader.

This class supports CSV, Excel 2007-2016, Excel (pre-2007) and LibreOffice/OpenOffice Calc thanks to the Roo (github.com/roo-rb/roo) project.

The first argument is the file name to read. For spreadsheets, append ‘|’ and the sheet name to specify the sheet to read.

The second argument is a Hash with options. The options can be:

  • required: a list of headers that need to be present. The list can be an Array containing the litteral header values expected. Alternatively, a Hash is also allowed with alternative header names as keys and litteral names as values. If a :headers keys is present in the Hash with a value of true or :first, whatever is on the first row, will be used as header values, ignoring the rest of the Hash. A key of :header_search with an array of strings as value will search for a row that contains each of the strings in the given array. Each string is searched by regular expression, so strings may contain wildcards. Default is empty array, meaning to use whatever is on the first row as header.

  • optional: a list of headers that may be present, but are not required. Similar format as above. Default is empty array.

  • noheader: a list of headers to force upon the sheet if no headers are present.

  • extension: :csv, :xlsx, :xlsm, :ods, :xls, :google to help the library in deciding what format the file is in.

The following options are only applicable to CSV input files and are ignored otherwise.

  • encoding: the encoding of the CSV file. e.g. ‘windows-1252:UTF-8’ to convert the input from windows code page 1252 to UTF-8 during file reading

  • col_sep: column separator. Default is ‘,’, but can be set to “t” for TSV files.

  • quote_char: character for quoting.

Parameters:

  • file_name (String)
  • opts (Hash) (defaults to: {})


39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
# File 'lib/libis/tools/spreadsheet.rb', line 39

def initialize(file_name, opts = {})
  options = {
      csv_options: {
          encoding: 'UTF-8',
          col_sep: ',',
          quote_char: '"',
      }.merge([:encoding, :col_sep, :quote_char].inject({}) do |h, k|
        h[k] = opts.delete(k) if opts[k]
        h
      end)
  }.merge(opts)

  required_headers = options.delete(:required) || []
  optional_headers = options.delete(:optional) || []
  noheader_headers = options.delete(:noheader) || []

  file, sheet = file_name.split('|')
  @ss = ::Roo::Spreadsheet.open(file, options)
  @ss.default_sheet = sheet if sheet

  @header_options = {}

  check_headers(required: required_headers, optional: optional_headers, noheader: noheader_headers)

end

Class Method Details

.foreach(file_name, opts = {}, &block) ⇒ Object

Open and iterate over sheet content.

Parameters:

  • @see

    #initialize



115
116
117
# File 'lib/libis/tools/spreadsheet.rb', line 115

def self.foreach(file_name, opts = {}, &block)
  Libis::Tools::Spreadsheet.new(file_name, opts).each(&block)
end

Instance Method Details

#each(options = {}, &block) ⇒ Object

Iterate over sheet content.

The options Hash can contain the following keys:

  • :sheet - overwrites default sheet name

  • :required - Array or Hash of required headers

  • :optional - Array or Hash of optional headers

  • :noheader - Array of noheader headers

Each iteration, a Hash will be passed with the key names as specified in the header options and the corresponding cell values.

Parameters:

  • options (Hash) (defaults to: {})


77
78
79
80
# File 'lib/libis/tools/spreadsheet.rb', line 77

def each(options = {}, &block)
  @ss.default_sheet = options[:sheet] if options[:sheet]
  @ss.each(check_headers(options), &block)
end

#headersObject



119
120
121
# File 'lib/libis/tools/spreadsheet.rb', line 119

def headers
  (@ss.headers || {}).keys
end

#parse(options = {}) ⇒ Array<Hash>

Parse sheet content.

The options Hash can contain the following keys:

  • :sheet - overwrites default sheet name

  • :required - Array or Hash of required headers

  • :optional - Array or Hash of optional headers

  • :noheader - Array of noheader headers

An Array will be returned with for each row a Hash with the key names as specified in the header options and the corresponding cell values.

Parameters:

  • options (Hash) (defaults to: {})

Returns:



95
96
97
98
# File 'lib/libis/tools/spreadsheet.rb', line 95

def parse(options = {})
  @ss.default_sheet = options.delete(:sheet) if options.has_key?(:sheet)
  @ss.parse(check_headers(options))
end

#restartObject

Set the current_row pointer back to the start



108
109
110
# File 'lib/libis/tools/spreadsheet.rb', line 108

def restart
  @current_row = @ss.header_line
end

#shiftObject

Return the current row and increment the current_row pointer.



101
102
103
104
105
# File 'lib/libis/tools/spreadsheet.rb', line 101

def shift
  return nil unless @current_row < @ss.last_row
  @current_row += 1
  Hash[@ss.row(@current_row).map.with_index { |v, i| [headers[i], v] }]
end