Class: CSVDiff::CSVSource
- Inherits: Object
- Defined in: lib/csv-diff/csv_source.rb
Overview
Represents a CSV input (i.e. the left/from or right/to input) to the diff process.
Instance Attribute Summary
-
#case_sensitive ⇒ Boolean
(also: #case_sensitive?)
readonly
True if the source has been indexed with case-sensitive keys, or false if it has been indexed using upper-case key values.
-
#child_field_indexes ⇒ Array<Fixnum>
readonly
The indexes of the child fields in the source file.
-
#child_fields ⇒ Array<String>
readonly
The names of the field(s) that distinguish a child of a parent record.
-
#field_names ⇒ Array<String>
readonly
The names of the fields in the source file.
-
#index ⇒ Hash<String,Array<String>>
readonly
A hash containing each parent key, and an Array of the child keys it is a parent of.
-
#key_field_indexes ⇒ Array<Fixnum>
readonly
The indexes of the key fields in the source file.
-
#key_fields ⇒ Array<String>
readonly
The names of the field(s) that uniquely identify each row.
-
#line_count ⇒ Fixnum
readonly
A count of the lines processed from this source.
-
#lines ⇒ Hash<String,Hash>
readonly
A hash containing each line of the source, keyed on the values of the key_fields.
-
#parent_field_indexes ⇒ Array<Fixnum>
readonly
The indexes of the parent fields in the source file.
-
#parent_fields ⇒ Array<String>
readonly
The names of the field(s) that identify a common parent of child records.
-
#path ⇒ String
The path to the source file.
-
#skip_count ⇒ Fixnum
readonly
A count of the lines from this source that were skipped, due either to duplicate keys or filter conditions.
-
#trim_whitespace ⇒ Boolean
readonly
True if leading/trailing whitespace should be stripped from fields.
-
#warnings ⇒ Array<String>
readonly
An array of any warnings encountered while processing the source.
Instance Method Summary
-
#[](key) ⇒ Hash
Returns the row in the CSV source corresponding to the supplied key.
-
#initialize(source, options = {}) ⇒ CSVSource
constructor
Creates a new diff source.
Constructor Details
#initialize(source, options = {}) ⇒ CSVSource
Creates a new diff source.
A diff source must contain at least one field that will be used as the key to identify the same record in a different version of this file. If not specified via one of the options, the first field is assumed to be the unique key.
If multiple fields combine to form a unique key, the parent is assumed to be identified by all but the last field of the unique key. If finer control is required, use a combination of the :parent_fields and :child_fields options.
All key options can be specified either by field name or by field index (0-based).
# File 'lib/csv-diff/csv_source.rb', line 101
def initialize(source, options = {})
  if source.is_a?(String)
    require 'csv'
    mode_string = options[:encoding] ? "r:#{options[:encoding]}" : 'r'
    csv_options = options.fetch(:csv_options, {})
    @path = source
    source = CSV.open(@path, mode_string, csv_options).readlines
  elsif !source.is_a?(Enumerable) ||
        (source.is_a?(Enumerable) && source.size > 0 && !source.first.is_a?(Enumerable))
    raise ArgumentError, "source must be a path to a file or an Enumerable<Enumerable>"
  end
  if (options.keys & [:parent_field, :parent_fields, :child_field, :child_fields]).empty? &&
     (kf = options.fetch(:key_field, options[:key_fields]))
    @key_fields = [kf].flatten
    @parent_fields = @key_fields[0...-1]
    @child_fields = @key_fields[-1..-1]
  else
    @parent_fields = [options.fetch(:parent_field, options[:parent_fields]) || []].flatten
    @child_fields = [options.fetch(:child_field, options[:child_fields]) || [0]].flatten
    @key_fields = @parent_fields + @child_fields
  end
  @field_names = options[:field_names]
  @warnings = []
  index_source(source, options)
end
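The way the key options resolve into parent and child fields can be easier to follow in isolation. The following is a minimal sketch that mirrors the key-resolution branch of the constructor; `resolve_key_fields` is a hypothetical helper written for illustration and is not part of csv-diff, and the field names used are made up.

```ruby
# Hypothetical helper (not part of csv-diff): reproduces the constructor's
# key-resolution logic so the three option styles can be compared.
def resolve_key_fields(options = {})
  hierarchy_opts = [:parent_field, :parent_fields, :child_field, :child_fields]
  kf = options.fetch(:key_field, options[:key_fields])
  if (options.keys & hierarchy_opts).empty? && kf
    # Only key fields given: all but the last identify the parent.
    key_fields = [kf].flatten
    parent_fields = key_fields[0...-1]
    child_fields = key_fields[-1..-1]
  else
    # Explicit parent/child options, or nothing at all (defaults to field 0).
    parent_fields = [options.fetch(:parent_field, options[:parent_fields]) || []].flatten
    child_fields = [options.fetch(:child_field, options[:child_fields]) || [0]].flatten
    key_fields = parent_fields + child_fields
  end
  { key: key_fields, parent: parent_fields, child: child_fields }
end

p resolve_key_fields
p resolve_key_fields(key_fields: ['region', 'id'])
p resolve_key_fields(parent_field: 'region', child_fields: ['id', 'sku'])
```

With no options the first field (index 0) becomes the sole child and key field; with a multi-field key the last field is treated as the child and the rest as the parent.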
Instance Attribute Details
#case_sensitive ⇒ Boolean (readonly) Also known as: case_sensitive?
Returns true if the source has been indexed with case-sensitive keys, or false if it has been indexed using upper-case key values.
# File 'lib/csv-diff/csv_source.rb', line 35
def case_sensitive
  @case_sensitive
end
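As a rough illustration of what indexing on upper-case key values implies, the sketch below builds a small index both ways. `normalize_key` and `build_index` are hypothetical helpers, the sample rows are invented, and keeping the first row for a duplicate key is an assumption of this sketch rather than documented csv-diff behaviour.

```ruby
# Hypothetical helper: upper-case the key unless the index is case-sensitive.
def normalize_key(value, case_sensitive)
  text = value.to_s
  case_sensitive ? text : text.upcase
end

# Hypothetical helper: index rows on their first field.
def build_index(rows, case_sensitive)
  rows.each_with_object({}) do |row, idx|
    key = normalize_key(row.first, case_sensitive)
    idx[key] ||= row # assumption: the first row with a given key is kept
  end
end

rows = [['ab1', 'Widget'], ['AB1', 'Gadget']]
p build_index(rows, true).keys
p build_index(rows, false).keys
```

In the case-insensitive index the two rows collide on the same key, which is the kind of duplicate that a source would count as skipped.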
#child_field_indexes ⇒ Array<Fixnum> (readonly)
Returns the indexes of the child fields in the source file.
# File 'lib/csv-diff/csv_source.rb', line 30
def child_field_indexes
  @child_field_indexes
end
#child_fields ⇒ Array<String> (readonly)
Returns the names of the field(s) that distinguish a child of a parent record.
# File 'lib/csv-diff/csv_source.rb', line 20
def child_fields
  @child_fields
end
#field_names ⇒ Array<String> (readonly)
Returns the names of the fields in the source file.
# File 'lib/csv-diff/csv_source.rb', line 11
def field_names
  @field_names
end
#index ⇒ Hash<String,Array<String>> (readonly)
Returns a hash containing each parent key, and an Array of the child keys it is a parent of.
# File 'lib/csv-diff/csv_source.rb', line 45
def index
  @index
end
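The relationship between the parent index and the keyed lines can be sketched with plain hashes. This is an illustration only: `build_parent_index` is a hypothetical helper, the rows are invented, and joining key values with `~` is an assumption made for the example, not a documented key format.

```ruby
# Hypothetical helper: builds a parent-key => [keys] index and a key => row
# lookup from rows, given the positions of the parent and child fields.
def build_parent_index(rows, parent_idx, child_idx)
  index = Hash.new { |hash, k| hash[k] = [] }
  lines = {}
  rows.each do |row|
    parent_key = row.values_at(*parent_idx).join('~') # assumed separator
    key = row.values_at(*(parent_idx + child_idx)).join('~')
    index[parent_key] << key
    lines[key] = row
  end
  [index, lines]
end

rows = [
  ['EMEA', '1', 'Widget'],
  ['EMEA', '2', 'Gadget'],
  ['APAC', '1', 'Widget']
]
index, lines = build_parent_index(rows, [0], [1])
p index
p lines['EMEA~2']
```

Each parent key maps to the keys of the rows beneath it, and each of those keys can then be used to look the row up again.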
#key_field_indexes ⇒ Array<Fixnum> (readonly)
Returns the indexes of the key fields in the source file.
# File 'lib/csv-diff/csv_source.rb', line 24
def key_field_indexes
  @key_field_indexes
end
#key_fields ⇒ Array<String> (readonly)
Returns the names of the field(s) that uniquely identify each row.
# File 'lib/csv-diff/csv_source.rb', line 14
def key_fields
  @key_fields
end
#line_count ⇒ Fixnum (readonly)
Returns a count of the lines processed from this source. Excludes any header and duplicate records identified during indexing.
# File 'lib/csv-diff/csv_source.rb', line 51
def line_count
  @line_count
end
#lines ⇒ Hash<String,Hash> (readonly)
Returns a hash containing each line of the source, keyed on the values of the key_fields.
# File 'lib/csv-diff/csv_source.rb', line 42
def lines
  @lines
end
#parent_field_indexes ⇒ Array<Fixnum> (readonly)
Returns the indexes of the parent fields in the source file.
# File 'lib/csv-diff/csv_source.rb', line 27
def parent_field_indexes
  @parent_field_indexes
end
#parent_fields ⇒ Array<String> (readonly)
Returns the names of the field(s) that identify a common parent of child records.
# File 'lib/csv-diff/csv_source.rb', line 17
def parent_fields
  @parent_fields
end
#path ⇒ String
Returns the path to the source file.
# File 'lib/csv-diff/csv_source.rb', line 8
def path
  @path
end
#skip_count ⇒ Fixnum (readonly)
Returns a count of the lines from this source that were skipped, due either to duplicate keys or filter conditions.
# File 'lib/csv-diff/csv_source.rb', line 54
def skip_count
  @skip_count
end
#trim_whitespace ⇒ Boolean (readonly)
Returns true if leading/trailing whitespace should be stripped from fields.
# File 'lib/csv-diff/csv_source.rb', line 39
def trim_whitespace
  @trim_whitespace
end
#warnings ⇒ Array<String> (readonly)
Returns an array of any warnings encountered while processing the source.
# File 'lib/csv-diff/csv_source.rb', line 48
def warnings
  @warnings
end
Instance Method Details
#[](key) ⇒ Hash
Returns the row in the CSV source corresponding to the supplied key.
# File 'lib/csv-diff/csv_source.rb', line 133
def [](key)
  @lines[key]
end