Module: WorkerTools::XlsxInput

Defined in:
lib/worker_tools/xlsx_input.rb

Defined Under Namespace

Classes: XlsxInputForeach

Instance Method Summary collapse

Instance Method Details

#xlsx_input_columnsObject

If an array is provided, the names will be used as the row keys, the row values will be assign according to the columns order.

Ex: %w(tenant segment area) row =>

tenant: _value_at_first_column_,
segment: _value_at_second_column_,
area: _value_at_third_column_

If a hash if provided, the keys will turn into the row keys, the values will be used to find the corresponding columns (the order in the spreadsheet won’t affect the import)

Ex: { tenant: ‘Mandant’, segment: ‘Segment’, area: ‘Bereich’) row =>

tenant: _value_at_column_Mandant,
segment: _value_at_column_Segment,
area: _value_at_column_Bereich

The name of the column is filtered using the xlsx_input_header_normalized method, which takes care of extra spaces and looks for a case insentive match (so ‘Bereich’ matches ‘ Bereich’, ‘bereich’, etc.). You can override that method as well.

Besides matching the columns using strings, it is possible to use a regular expression or a proc: {

tenant: 'Mandant',
segment: /Segment/i,
area: ->(name) { name.downcase == 'area' }

}



38
39
40
# File 'lib/worker_tools/xlsx_input.rb', line 38

def xlsx_input_columns
  raise "xlsx_input_columns has to be defined in #{self}"
end

#xlsx_input_columns_array_check(xlsx_rows_enum) ⇒ Object



65
66
67
68
69
70
# File 'lib/worker_tools/xlsx_input.rb', line 65

def xlsx_input_columns_array_check(xlsx_rows_enum)
  expected_columns_length = xlsx_input_columns.length
  actual_columns_length = xlsx_rows_enum.first.length
  return if expected_columns_length == actual_columns_length
  raise "The number of columns (#{actual_columns_length}) is not the expected (#{expected_columns_length})"
end

#xlsx_input_columns_check(xlsx_rows_enum) ⇒ Object



59
60
61
62
63
# File 'lib/worker_tools/xlsx_input.rb', line 59

def xlsx_input_columns_check(xlsx_rows_enum)
  # override and return true if you do not want this check to be performed
  return xlsx_input_columns_array_check(xlsx_rows_enum) if xlsx_input_columns.is_a?(Array)
  xlsx_input_columns_hash_check(xlsx_rows_enum)
end

#xlsx_input_columns_hash_check(xlsx_rows_enum) ⇒ Object



72
73
74
75
76
77
# File 'lib/worker_tools/xlsx_input.rb', line 72

def xlsx_input_columns_hash_check(xlsx_rows_enum)
  expected_names = xlsx_input_columns.values
  filtered_actual_names = xlsx_rows_enum.first.map { |n| xlsx_input_header_normalized(n) }
  xlsx_input_columns_hash_check_duplicates(filtered_actual_names)
  xlsx_input_columns_hash_check_missing(filtered_actual_names, expected_names)
end

#xlsx_input_columns_hash_check_duplicates(names) ⇒ Object



79
80
81
82
# File 'lib/worker_tools/xlsx_input.rb', line 79

def xlsx_input_columns_hash_check_duplicates(names)
  dups = names.group_by(&:itself).select { |_, v| v.count > 1 }.keys
  raise "The file contains duplicated columns: #{dups}" if dups.present?
end

#xlsx_input_columns_hash_check_missing(actual_names, expected_names) ⇒ Object



84
85
86
87
88
89
90
# File 'lib/worker_tools/xlsx_input.rb', line 84

def xlsx_input_columns_hash_check_missing(actual_names, expected_names)
  missing = expected_names.reject do |name|
    matchable = name.is_a?(String) ? xlsx_input_header_normalized(name) : name
    actual_names.any? { |n| case n when matchable then true end } # rubocop does not like ===
  end
  raise "Some columns are missing: #{missing}" unless missing.empty?
end

#xlsx_input_file_pathObject



121
122
123
# File 'lib/worker_tools/xlsx_input.rb', line 121

def xlsx_input_file_path
  model.attachment.path.to_s
end

#xlsx_input_foreachObject



132
133
134
135
136
137
138
139
140
141
142
# File 'lib/worker_tools/xlsx_input.rb', line 132

def xlsx_input_foreach
  @xlsx_input_foreach ||= begin
    xlsx_input_columns_check(xlsx_rows_enum)

    XlsxInputForeach.new(
      rows_enum: xlsx_rows_enum,
      mapping_order: xlsx_input_mapping_order(xlsx_rows_enum.first),
      cleanup_method: method(:xlsx_input_value_cleanup)
    )
  end
end

#xlsx_input_header_normalized(name) ⇒ Object



49
50
51
# File 'lib/worker_tools/xlsx_input.rb', line 49

def xlsx_input_header_normalized(name)
  name.to_s.strip.downcase
end

#xlsx_input_include_other_columnsObject

If true, the rows will append those columns that don’t belong to the xlsx_input_columns list. Useful when the spreadsheet contains some fixed columns and a number of variable ones.



45
46
47
# File 'lib/worker_tools/xlsx_input.rb', line 45

def xlsx_input_include_other_columns
  false
end

#xlsx_input_mapping_order(header_names) ⇒ Object

Compares the first row (header names) with the xlsx_input_columns hash to find the corresponding positions.

Ex: xlsx_input_columns: ‘Mandant’, area: ‘Bereich’

headers: ['Bereich', 'Mandant']
=>  { tenant: 1, area: 0}


98
99
100
101
# File 'lib/worker_tools/xlsx_input.rb', line 98

def xlsx_input_mapping_order(header_names)
  return xlsx_input_columns.map.with_index { |n, i| [n, i] }.to_h if xlsx_input_columns.is_a?(Array)
  xlsx_input_mapping_order_for_hash(header_names)
end

#xlsx_input_mapping_order_for_hash(header_names) ⇒ Object



103
104
105
106
107
108
109
110
111
# File 'lib/worker_tools/xlsx_input.rb', line 103

def xlsx_input_mapping_order_for_hash(header_names)
  filtered_column_names = header_names.map { |n| xlsx_input_header_normalized(n) }
  mapping = xlsx_input_columns.each_with_object({}) do |(k, v), h|
    matchable = v.is_a?(String) ? xlsx_input_header_normalized(v) : v
    h[k] = filtered_column_names.index { |n| case n when matchable then true end }
  end
  return mapping unless xlsx_input_include_other_columns
  xlsx_input_mapping_order_with_other_columns(mapping, filtered_column_names)
end

#xlsx_input_mapping_order_with_other_columns(mapping, filtered_column_names) ⇒ Object



113
114
115
116
117
118
119
# File 'lib/worker_tools/xlsx_input.rb', line 113

def xlsx_input_mapping_order_with_other_columns(mapping, filtered_column_names)
  positions_taken = mapping.values
  filtered_column_names.each_with_index do |header, index|
    mapping[header] = index unless positions_taken.include?(index)
  end
  mapping
end

#xlsx_input_value_cleanup(value) ⇒ Object

Allows for some basic cleanup of the values, such as applying strip to the strings.



55
56
57
# File 'lib/worker_tools/xlsx_input.rb', line 55

def xlsx_input_value_cleanup(value)
  value.is_a?(String) ? value.strip : value
end

#xlsx_rows_enumObject



125
126
127
128
129
130
# File 'lib/worker_tools/xlsx_input.rb', line 125

def xlsx_rows_enum
  @xlsx_rows_enum ||= begin
    spreadsheet = Roo::Excelx.new(xlsx_input_file_path)
    spreadsheet.each_row_streaming(sheet: spreadsheet.sheets.first, pad_cells: true)
  end
end