Class: ETL::Processor::CheckUniqueProcessor

Inherits:
RowProcessor
Defined in:
lib/etl/processor/check_unique_processor.rb

Overview

Row processor that checks whether a row with the same key values has already passed through the ETL processor, using the configured key fields to identify duplicates.

Methods inherited from RowProcessor

#ensure_columns_available_in_row!

Constructor Details

#initialize(control, configuration) ⇒ CheckUniqueProcessor

Initialize the processor. Configuration options:

  • :keys: An array of keys to check against



# File 'lib/etl/processor/check_unique_processor.rb', line 14

def initialize(control, configuration)
  super
  @keys = configuration[:keys]
end

Instance Attribute Details

#keys ⇒ Object

The keys to check



# File 'lib/etl/processor/check_unique_processor.rb', line 9

def keys
  @keys
end

Instance Method Details

#compound_key_constraints ⇒ Object

A Hash of keys that have already been processed.



# File 'lib/etl/processor/check_unique_processor.rb', line 20

def compound_key_constraints
  @compound_key_constraints ||= {}
end

#process(row) ⇒ Object

Process the row. This implementation returns the row only if its key combination has not already been seen; otherwise it returns nil.

An error will be raised if the row doesn’t include the keys.



# File 'lib/etl/processor/check_unique_processor.rb', line 28

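def process(row)
  # Raise if any of the configured key columns is missing from the row
  ensure_columns_available_in_row!(row, keys, 'for unicity check')

  # Build a compound key from the values of the key columns
  key = (keys.collect { |k| row[k] }).join('|')
  unless compound_key_constraints[key]
    # First occurrence: record the key and pass the row through
    compound_key_constraints[key] = 1
    return row
  end
  # Falls through and returns nil when the key was already seen
end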
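
The de-duplication behavior described above can be tried outside the ETL framework with a minimal standalone sketch. The class name UniqueRowFilter, the seen method, and the use of ArgumentError for missing keys are hypothetical stand-ins chosen for illustration; they are not part of the ETL library, and the real processor delegates the column check to ensure_columns_available_in_row!.

```ruby
# Standalone sketch of the same technique; UniqueRowFilter is a
# hypothetical name and does not exist in the ETL library.
class UniqueRowFilter
  attr_reader :keys

  def initialize(keys)
    @keys = keys
  end

  # Compound keys seen so far, created on first use.
  def seen
    @seen ||= {}
  end

  # Returns the row the first time its key combination appears,
  # and nil for every later row with the same combination.
  def process(row)
    # Assumption: ArgumentError stands in for the library's own error.
    missing = keys.reject { |k| row.key?(k) }
    raise ArgumentError, "row is missing keys: #{missing.inspect}" unless missing.empty?

    key = keys.map { |k| row[k] }.join('|')
    return nil if seen[key]

    seen[key] = true
    row
  end
end

filter = UniqueRowFilter.new([:first_name, :last_name])
rows = [
  { first_name: 'Ada',  last_name: 'Lovelace', id: 1 },
  { first_name: 'Alan', last_name: 'Turing',   id: 2 },
  { first_name: 'Ada',  last_name: 'Lovelace', id: 3 }
]

# Duplicates come back as nil, so compact drops them.
unique_rows = rows.map { |row| filter.process(row) }.compact
```

Because the compound key is built by joining values with '|', two different value combinations can produce the same string when a value itself contains '|' (for example ['a|b', 'c'] and ['a', 'b|c'] both join to 'a|b|c'). If that matters for a given data set, one option is to use the array of values itself as the hash key.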