Module: IMW::Formats::Excel

Includes: Enumerable
Defined in: lib/imw/formats/excel.rb

Overview

Defines methods for reading and writing Microsoft Excel data.

Instance Method Summary

Instance Method Details

#each {|Spreadsheet::Excel::Row| ... } ⇒ Object

Yield each row of this Excel document.

Will loop from one worksheet to the next.

Yields:

  • (Spreadsheet::Excel::Row)


# File 'lib/imw/formats/excel.rb', line 38

def each(&block)
  require 'spreadsheet'
  Spreadsheet.open(path).worksheets.each do |worksheet|
    worksheet.each(&block)
  end
end
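
Because the module includes Enumerable and defines #each, methods such as #first, #map, and #count become available on anything that mixes it in. The following is a minimal sketch of that pattern, not IMW code: FakeWorkbook is a hypothetical stand-in that uses plain arrays in place of Spreadsheet worksheets, so it runs without the spreadsheet gem.

```ruby
# Hypothetical stand-in: each "worksheet" is just an array of rows.
class FakeWorkbook
  include Enumerable

  def initialize(worksheets)
    @worksheets = worksheets
  end

  # Yield every row, moving from one worksheet to the next.
  def each(&block)
    return enum_for(:each) unless block_given?
    @worksheets.each { |worksheet| worksheet.each(&block) }
  end
end

workbook = FakeWorkbook.new([
  [["id", "name"], [1, "ada"]],   # first worksheet
  [[2, "grace"]]                  # second worksheet
])

first_row = workbook.first                 # provided by Enumerable
names     = workbook.map { |row| row[1] }  # provided by Enumerable
row_total = workbook.count                 # provided by Enumerable
```

Rows from the second worksheet follow directly after those of the first, with no marker at the worksheet boundary.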

#load ⇒ Array<Array>

Return the data in this Excel document as an array of arrays.

Data from consecutive worksheets will be concatenated into a single outer array.

Returns:

  • (Array<Array>)


# File 'lib/imw/formats/excel.rb', line 19

def load
  require 'spreadsheet'
  data = []
  Spreadsheet.open(path).worksheets.each do |worksheet|
    data += worksheet.map do |row|
      row.to_a
    end
  end
  data
end
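
The concatenation across worksheets can be illustrated with plain arrays. This sketch assumes stand-in data (nested arrays instead of real Spreadsheet worksheets) and also shows a flat_map form that should produce the same result in one pass; the flat_map variant is an alternative offered here, not what the library does.

```ruby
# Stand-in worksheets (assumption): plain arrays of rows.
worksheets = [
  [["a", 1], ["b", 2]],   # first worksheet
  [["c", 3]]              # second worksheet
]

# Same shape as #load: append each worksheet's rows to one outer array.
data = []
worksheets.each do |worksheet|
  data += worksheet.map { |row| row.to_a }
end

# Alternative: flat_map flattens one level, so rows stay intact.
flattened = worksheets.flat_map { |worksheet| worksheet.map(&:to_a) }
```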

#num_lines ⇒ Integer

Return the number of lines in this Excel document.

Measured across worksheets.

Returns:

  • (Integer)


# File 'lib/imw/formats/excel.rb', line 50

def num_lines
  require 'spreadsheet'
  Spreadsheet.open(path).worksheets.inject(0) do |sum, worksheet|
    sum + worksheet.row_count
  end
end
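
The count is a fold over the worksheets, where each step returns the running total. A minimal sketch, assuming a hypothetical FakeSheet struct that exposes only row_count:

```ruby
# Hypothetical stand-in exposing only the row_count reader.
FakeSheet = Struct.new(:row_count)
sheets = [FakeSheet.new(3), FakeSheet.new(0), FakeSheet.new(5)]

# The block's return value becomes the next value of `sum`.
total = sheets.inject(0) { |sum, sheet| sum + sheet.row_count }

# Equivalent shorthand on Ruby 2.4 and later.
same_total = sheets.sum(&:row_count)
```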

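#snippet ⇒ Object

Return a sample of rows from this Excel document as an array of arrays.

Only rows in which more than half of the cells are non-blank are sampled, and rows that raise an error are skipped. Sampling within a worksheet stops once more than 100 rows have been collected; later worksheets are not read once more than 10 rows have been collected.
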
# File 'lib/imw/formats/excel.rb', line 76

def snippet
  require 'spreadsheet'
  [].tap do |snip|
    rows_sampled = 0
    Spreadsheet.open(path).worksheets.each do |worksheet|
      worksheet.each do |row|
        begin
          break if rows_sampled > 100
          row_size = row.size.to_f
          if (row.reject(&:blank?).size.to_f / row_size) > 0.5
            snip << row.to_a
            rows_sampled += 1
          end
        rescue => e
          next
        end
      end
      break if rows_sampled > 10
    end
  end
end
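
The row-density test at the heart of #snippet can be sketched on plain arrays. Everything named here is an assumption for illustration: blank_cell? stands in for blank? (which is not core Ruby), dense_rows is a hypothetical helper, and the limit is applied with >= rather than the source's > 100 comparison.

```ruby
# Stand-in for blank? (assumption: nil and whitespace-only strings are blank).
def blank_cell?(cell)
  cell.nil? || (cell.respond_to?(:strip) && cell.strip.empty?)
end

# Keep rows in which more than half of the cells are non-blank,
# stopping once `limit` rows have been collected.
def dense_rows(rows, limit: 100)
  sample = []
  rows.each do |row|
    break if sample.size >= limit
    next if row.empty?  # nothing to measure in an empty row
    filled = row.reject { |cell| blank_cell?(cell) }.size
    sample << row.to_a if filled.to_f / row.size > 0.5
  end
  sample
end

rows = [
  ["id", "name", "email"],   # 3 of 3 filled: kept
  [nil, "", "  "],           # 0 of 3 filled: dropped
  [1, nil, nil],             # 1 of 3 filled: dropped
  [2, "ada", nil],           # 2 of 3 filled: kept
  []                         # empty row: skipped
]
sample = dense_rows(rows)
```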

#validate_schema! ⇒ Object

Ensure that this Excel resource is described by an ordered collection of flat fields.

Raises:

  • (IMW::SchemaError) if any field in the schema is nested

# File 'lib/imw/formats/excel.rb', line 9

def validate_schema!
  raise IMW::SchemaError.new("#{self.class} resources must be described by an ordered set of flat fields") if schema.any?(&:nested?)
end
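
The check amounts to rejecting any schema that contains a nested field. The sketch below shows that guard in isolation; SketchSchemaError, FlatField, and validate_flat! are hypothetical stand-ins, and IMW's real schema and field classes may behave differently.

```ruby
# Hypothetical stand-ins for IMW::SchemaError and a schema field.
class SketchSchemaError < StandardError; end

FlatField = Struct.new(:name, :nested) do
  def nested?
    nested
  end
end

# Raise if any field is nested; otherwise return the schema unchanged.
def validate_flat!(schema)
  if schema.any?(&:nested?)
    raise SketchSchemaError, "resources must be described by an ordered set of flat fields"
  end
  schema
end

flat_schema   = [FlatField.new("id", false), FlatField.new("name", false)]
nested_schema = [FlatField.new("id", false), FlatField.new("address", true)]

validate_flat!(flat_schema)  # passes silently

begin
  validate_flat!(nested_schema)
  rejected = false
rescue SketchSchemaError
  rejected = true
end
```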