Class: CSVDiff
- Inherits:
- Object
- Object → CSVDiff
- Includes:
- Algorithm
- Defined in:
- lib/csv-diff/csv_diff.rb,
lib/csv-diff/source.rb,
lib/csv-diff/algorithm.rb,
lib/csv-diff/csv_source.rb,
lib/csv-diff/xml_source.rb
Overview
This library performs diffs of flat file content that contains structured data in fields, with rows provided in a parent-child format.
Parent-child data does not lend itself well to standard text diffs, as small changes in the organisation of the tree at an upper level (e.g. re-ordering of two ancestor nodes) can lead to big movements in the position of descendant records - particularly when the parent-child data is generated by a hierarchy traversal.
Additionally, simple line-based diffs can identify that a line has changed, but not which field(s) in the line have changed.
Data may be supplied in the form of CSV files, or as an array of arrays. The diff process provides a fine level of control over what to diff, and can optionally ignore certain types of changes (e.g. changes in order).
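To illustrate why a keyed, field-level comparison is more informative than a line-based diff, here is a minimal sketch in plain Ruby. It is not the library's implementation: `index_rows`, `keyed_diff`, the sample data, and the 'Add'/'Delete'/'Update' action labels are assumptions made for the example, and it handles neither parent-child keys nor moves.

```ruby
# Hypothetical helper, not part of csv-diff: turns an array of arrays
# (first row = field names) into a hash of records keyed on key_field.
def index_rows(rows, key_field)
  names, *data = rows
  data.each_with_object({}) do |row, index|
    record = names.zip(row).to_h
    index[record[key_field]] = record
  end
end

# Hypothetical helper: reports adds, deletes, and per-field updates,
# ignoring the order in which rows appear.
def keyed_diff(left, right, key_field)
  from = index_rows(left, key_field)
  to = index_rows(right, key_field)
  diffs = {}
  (to.keys - from.keys).each { |k| diffs[k] = { action: 'Add' } }
  (from.keys - to.keys).each { |k| diffs[k] = { action: 'Delete' } }
  (from.keys & to.keys).each do |k|
    changed = from[k].keys.reject { |f| from[k][f] == to[k][f] }
    diffs[k] = { action: 'Update', fields: changed } unless changed.empty?
  end
  diffs
end

left  = [%w[id name role], %w[1 Ann Dev], %w[2 Bob Ops]]
right = [%w[id name role], %w[2 Bob Admin], %w[1 Ann Dev], %w[3 Cy QA]]
p keyed_diff(left, right, 'id')
```

Rows 1 and 2 swap position between the two inputs, yet only the changed `role` field of row 2 and the new row 3 are reported; a line-based diff would flag the re-ordering as well.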
Defined Under Namespace
Modules: Algorithm
Classes: CSVSource, Source, XMLSource
Instance Attribute Summary
-
#child_fields ⇒ Array<String>
readonly
An array of field names for the child field(s).
-
#diff_fields ⇒ Array<String>
readonly
An array of field names that are compared in the diff process.
-
#diffs ⇒ Array<Hash>
readonly
An array of differences.
-
#key_fields ⇒ Array<String>
readonly
An array of field names of the key fields that uniquely identify each row.
-
#left ⇒ CSVSource
(also: #from)
readonly
CSVSource object containing details of the left/from input.
-
#options ⇒ Hash
readonly
The options hash used for the diff.
-
#parent_fields ⇒ Array<String>
readonly
An array of field names for the parent field(s).
-
#right ⇒ CSVSource
(also: #to)
readonly
CSVSource object containing details of the right/to input.
Instance Method Summary
-
#diff(options = {}) ⇒ Object
Performs a diff with the specified options.
-
#diff_warnings ⇒ Array<String>
An array of warning messages from the diff process.
-
#initialize(left, right, options = {}) ⇒ CSVDiff
constructor
Generates a diff between two hierarchical tree structures, provided as left and right, each of which consists of an array of lines in CSV format.
-
#summary ⇒ Object
Returns a summary of the number of adds, deletes, moves, and updates.
-
#warnings ⇒ Array<String>
An array of warning messages generated from the sources and the diff process.
Methods included from Algorithm
Constructor Details
#initialize(left, right, options = {}) ⇒ CSVDiff
Generates a diff between two hierarchical tree structures, provided as left and right, each of which consists of an array of lines in CSV format. An array of field indexes can also be specified as key_fields; a minimum of one field index must be specified; the last index is the child id, and the remaining fields (if any) are the parent field(s) that uniquely qualify the child instance.
# File 'lib/csv-diff/csv_diff.rb', line 83
def initialize(left, right, options = {})
  @left = left.is_a?(Source) ? left : CSVSource.new(left, options)
  @left.index_source if @left.lines.nil?
  raise "No field names found in left (from) source" unless @left.field_names && @left.field_names.size > 0
  @right = right.is_a?(Source) ? right : CSVSource.new(right, options)
  @right.index_source if @right.lines.nil?
  raise "No field names found in right (to) source" unless @right.field_names && @right.field_names.size > 0
  @warnings = []
  @diff_fields = get_diff_fields(@left.field_names, @right.field_names, options)
  @key_fields = @left.key_fields
  diff(options)
end
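The rule for key_fields described above (last index is the child id, any preceding indexes are parent fields) can be sketched as follows. `split_key_fields` is a hypothetical helper written for illustration; it is not a method of CSVDiff, and the library may store these values differently.

```ruby
# Hypothetical helper, not part of csv-diff: splits a list of key field
# indexes into parent field(s) and the child id field.
def split_key_fields(key_fields)
  raise ArgumentError, 'at least one key field is required' if key_fields.empty?

  # Everything before the last index qualifies the child; the last is the child id.
  *parent_fields, child_field = key_fields
  { parent_fields: parent_fields, child_field: child_field }
end

p split_key_fields([0, 1, 2])
p split_key_fields([3])
```

With a single index there are no parent fields, so the child id alone must uniquely identify each row.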
Instance Attribute Details
#child_fields ⇒ Array<String> (readonly)
Returns an array of field names for the child field(s).
# File 'lib/csv-diff/csv_diff.rb', line 37
def child_fields
  @child_fields
end
#diff_fields ⇒ Array<String> (readonly)
Returns an array of field names that are compared in the diff process.
# File 'lib/csv-diff/csv_diff.rb', line 30
def diff_fields
  @diff_fields
end
#diffs ⇒ Array<Hash> (readonly)
Returns an array of differences.
# File 'lib/csv-diff/csv_diff.rb', line 27
def diffs
  @diffs
end
#key_fields ⇒ Array<String> (readonly)
Returns an array of field names of the key fields that uniquely identify each row.
# File 'lib/csv-diff/csv_diff.rb', line 33
def key_fields
  @key_fields
end
#left ⇒ CSVSource (readonly) Also known as: from
Returns CSVSource object containing details of the left/from input.
# File 'lib/csv-diff/csv_diff.rb', line 20
def left
  @left
end
#options ⇒ Hash (readonly)
Returns the options hash used for the diff.
# File 'lib/csv-diff/csv_diff.rb', line 39
def options
  @options
end
#parent_fields ⇒ Array<String> (readonly)
Returns an array of field names for the parent field(s).
# File 'lib/csv-diff/csv_diff.rb', line 35
def parent_fields
  @parent_fields
end
#right ⇒ CSVSource (readonly) Also known as: to
Returns CSVSource object containing details of the right/to input.
# File 'lib/csv-diff/csv_diff.rb', line 24
def right
  @right
end
Instance Method Details
#diff(options = {}) ⇒ Object
Performs a diff with the specified options.
# File 'lib/csv-diff/csv_diff.rb', line 98
def diff(options = {})
  @summary = nil
  @options = options
  @diffs = diff_sources(@left, @right, @key_fields, @diff_fields, options)
end
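Note that #diff clears the cached summary before storing new results, so a later call to #summary is recomputed rather than served stale. The pattern can be sketched in isolation as below; `TallyCache` is a hypothetical stand-in class, not part of the library, and its `computations` counter exists only to make the caching visible.

```ruby
# Hypothetical stand-in, not the library class: shows the
# memoise-and-reset pattern used by #diff and #summary.
class TallyCache
  attr_reader :computations

  def initialize
    @computations = 0
    run([])
  end

  # Mirrors #diff: store new results and invalidate the cached summary.
  def run(actions)
    @summary = nil
    @actions = actions
  end

  # Mirrors #summary: compute once, then reuse until the next run.
  def summary
    unless @summary
      @computations += 1
      @summary = @actions.each_with_object(Hash.new(0)) { |a, h| h[a] += 1 }
    end
    @summary
  end
end

cache = TallyCache.new
cache.run(%w[Add Add Update])
cache.summary                # computed
cache.summary                # served from the cache
cache.run(%w[Delete])
cache.summary                # recomputed after the reset
puts cache.computations
```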
#diff_warnings ⇒ Array<String>
Returns an array of warning messages from the diff process.
# File 'lib/csv-diff/csv_diff.rb', line 132
def diff_warnings
  @warnings
end
#summary ⇒ Object
Returns a summary of the number of adds, deletes, moves, and updates.
# File 'lib/csv-diff/csv_diff.rb', line 106
def summary
  unless @summary
    @summary = Hash.new{ |h, k| h[k] = 0 }
    @diffs.each{ |k, v| @summary[v[:action]] += 1 }
    @summary['Warning'] = warnings.size if warnings.size > 0
  end
  @summary
end
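The tallying logic in #summary can be tried on its own. The sketch below assumes the diffs collection behaves like a hash of key to diff details, as the `|k, v|` block in the listing suggests; `summarise`, the sample keys, and the action labels are illustrative assumptions rather than library output.

```ruby
# Hypothetical helper reproducing the counting step of #summary.
def summarise(diffs, warnings)
  # The default block stores 0 for any action that is looked up but absent.
  summary = Hash.new { |h, k| h[k] = 0 }
  diffs.each { |_key, diff| summary[diff[:action]] += 1 }
  # A 'Warning' entry is added only when there is something to report.
  summary['Warning'] = warnings.size if warnings.size > 0
  summary
end

diffs = {
  'A' => { action: 'Add' },
  'B' => { action: 'Update' },
  'C' => { action: 'Add' }
}
p summarise(diffs, ['duplicate key in left source'])
```

Because of the default block, asking for an action that never occurred returns 0 instead of nil, and also adds that key to the hash.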
#warnings ⇒ Array<String>
Returns an array of warning messages generated from the sources and the diff process.
# File 'lib/csv-diff/csv_diff.rb', line 126
def warnings
  @left.warnings + @right.warnings + @warnings
end
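The difference between #warnings and #diff_warnings is only which lists are combined. A small sketch, using a hypothetical `FakeSource` struct in place of real source objects and invented warning messages:

```ruby
# Hypothetical stand-in for a source; only #warnings is modelled.
FakeSource = Struct.new(:warnings)

# Mirrors #warnings: source warnings first, then diff-process warnings.
def combined_warnings(left, right, diff_warnings)
  left.warnings + right.warnings + diff_warnings
end

left = FakeSource.new(['left: duplicate key 1'])
right = FakeSource.new([])
diff_only = ['diff: field missing from right source']

p combined_warnings(left, right, diff_only)
```

Here `diff_only` corresponds to what #diff_warnings would return, while the combined list corresponds to #warnings.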