Module: Hashdiff

Defined in:
lib/hashdiff/patch.rb,
lib/hashdiff/lcs.rb,
lib/hashdiff/diff.rb,
lib/hashdiff/util.rb,
lib/hashdiff/version.rb,
lib/hashdiff/compare_hashes.rb,
lib/hashdiff/lcs_compare_arrays.rb,
lib/hashdiff/linear_compare_array.rb

Overview

This module provides methods to diff two hashes or arrays, and to patch and unpatch them.

Constant Summary

VERSION = '1.1.0'.freeze

Class Method Summary

Class Method Details

.best_diff(obj1, obj2, options = {}) {|path, value1, value2| ... } ⇒ Array

Computes the best diff of two objects by trying several similarity values and keeping the smallest change set.

Hashdiff.best_diff is useful when comparing two objects that contain similar hashes inside arrays.

Examples:

a = {'x' => [{'a' => 1, 'c' => 3, 'e' => 5}, {'y' => 3}]}
b = {'x' => [{'a' => 1, 'b' => 2, 'e' => 5}] }
diff = Hashdiff.best_diff(a, b)
diff.should == [['-', 'x[0].c', 3], ['+', 'x[0].b', 2], ['-', 'x[1].y', 3], ['-', 'x[1]', {}]]

Parameters:

  • obj1 (Array, Hash)
  • obj2 (Array, Hash)
  • options (Hash) (defaults to: {})

    the options to use when comparing

    • :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other

    • :ignore_keys (Symbol, String or Array) [[]] a list of keys to ignore. No comparison is made for the specified key(s)

    • :indifferent (Boolean) [false] whether to treat hash keys indifferently. Set to true to ignore differences between symbol and string keys (i.e. {a: 1} ~= {'a' => 1})

    • :delimiter (String) ['.'] the delimiter used when returning nested key references

    • :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are reported as different only when their difference is greater than the given value.

    • :strip (Boolean) [false] whether or not to call #strip on strings before comparing

    • :array_path (Boolean) [false] whether to return path references for nested values as an array; can be used for patch compatibility with non-string keys.

    • :use_lcs (Boolean) [true] whether or not to use an implementation of the longest common subsequence algorithm for comparing arrays. It produces better diffs but is slower.

Yields:

  • (path, value1, value2)

    Optional block used to compare each value instead of the default #==. If the block returns a value other than true or false, the other specified comparison options are used to do the comparison.

Returns:

  • (Array)

    an array of changes, e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

Since:

  • 0.0.1



# File 'lib/hashdiff/diff.rb', line 32

def self.best_diff(obj1, obj2, options = {}, &block)
  options[:comparison] = block if block_given?

  opts = { similarity: 0.3 }.merge!(options)
  diffs1 = diff(obj1, obj2, opts)
  count1 = count_diff diffs1

  opts = { similarity: 0.5 }.merge!(options)
  diffs2 = diff(obj1, obj2, opts)
  count2 = count_diff diffs2

  opts = { similarity: 0.8 }.merge!(options)
  diffs3 = diff(obj1, obj2, opts)
  count3 = count_diff diffs3

  count, diffs = count1 < count2 ? [count1, diffs1] : [count2, diffs2]
  count < count3 ? diffs : diffs3
end
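
The selection step at the end of best_diff can be read in isolation. The following is a minimal sketch, assuming three precomputed [count, diffs] pairs ordered by similarity (0.3, 0.5, 0.8). The name pick_smallest is hypothetical, and the counts are supplied by hand rather than by count_diff, whose implementation is not shown on this page.

```ruby
# Hypothetical stand-in for the last two lines of best_diff.
# `candidates` holds three [count, diffs] pairs, ordered by the
# similarity value that produced them (0.3, 0.5, 0.8).
def pick_smallest(candidates)
  (count1, diffs1), (count2, diffs2), (count3, diffs3) = candidates

  # Same comparisons as in best_diff: strict `<` on both steps.
  count, diffs = count1 < count2 ? [count1, diffs1] : [count2, diffs2]
  count < count3 ? diffs : diffs3
end

p pick_smallest([[1, :low], [2, :mid], [3, :high]])
p pick_smallest([[2, :low], [2, :mid], [2, :high]])
```

Because both comparisons use a strict <, a tie goes to the candidate computed with the higher similarity value.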

.diff(obj1, obj2, options = {}) {|path, value1, value2| ... } ⇒ Array

Compute the diff of two hashes or arrays

Examples:

a = {"a" => 1, "b" => {"b1" => 1, "b2" => 2}}
b = {"a" => 1, "b" => {}}

diff = Hashdiff.diff(a, b)
diff.should == [['-', 'b.b1', 1], ['-', 'b.b2', 2]]

Parameters:

  • obj1 (Array, Hash)
  • obj2 (Array, Hash)
  • options (Hash) (defaults to: {})

    the options to use when comparing

    • :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other

    • :ignore_keys (Symbol, String or Array) [[]] a list of keys to ignore. No comparison is made for the specified key(s)

    • :indifferent (Boolean) [false] whether to treat hash keys indifferently. Set to true to ignore differences between symbol and string keys (i.e. {a: 1} ~= {'a' => 1})

    • :similarity (Numeric) [0.8] should be in the range (0, 1]. Meaningful if there are similar hashes in arrays. See best_diff.

    • :delimiter (String) ['.'] the delimiter used when returning nested key references

    • :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are reported as different only when their difference is greater than the given value.

    • :strip (Boolean) [false] whether or not to call #strip on strings before comparing

    • :array_path (Boolean) [false] whether to return path references for nested values as an array; can be used for patch compatibility with non-string keys.

    • :use_lcs (Boolean) [true] whether or not to use an implementation of the longest common subsequence algorithm for comparing arrays. It produces better diffs but is slower.

Yields:

  • (path, value1, value2)

    Optional block used to compare each value instead of the default #==. If the block returns a value other than true or false, the other specified comparison options are used to do the comparison.
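
The three possible outcomes of the block can be sketched as follows. This is an illustration of the contract described above, not the library's own helper: custom_compare_sketch and the ignore_case lambda are hypothetical names, and the return shapes are assumed from the documented change format.

```ruby
# Hypothetical stand-in for the block-handling step.
# Returns [] when the block says "equal", a single '~' change when it says
# "different", and nil when the block declines to decide.
def custom_compare_sketch(comparison, path, value1, value2)
  return nil unless comparison

  case comparison.call(path, value1, value2)
  when true  then []
  when false then [['~', path, value1, value2]]
  end # any other result falls through and yields nil
end

# Example block: decide only for strings, ignoring case.
ignore_case = lambda do |_path, a, b|
  a.casecmp?(b) if a.is_a?(String) && b.is_a?(String)
end

p custom_compare_sketch(ignore_case, 'name', 'Bob', 'bob')
p custom_compare_sketch(ignore_case, 'name', 'Bob', 'Al')
p custom_compare_sketch(ignore_case, 'age', 1, 2)
```

A nil result is what lets the regular options (:strict, :numeric_tolerance and so on) take over for values the block does not handle.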

Returns:

  • (Array)

    an array of changes, e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

Since:

  • 0.0.1



# File 'lib/hashdiff/diff.rb', line 80

def self.diff(obj1, obj2, options = {}, &block)
  opts = {
    prefix: '',
    similarity: 0.8,
    delimiter: '.',
    strict: true,
    ignore_keys: [],
    indifferent: false,
    strip: false,
    numeric_tolerance: 0,
    array_path: false,
    use_lcs: true
  }.merge!(options)

  opts[:prefix] = [] if opts[:array_path] && opts[:prefix] == ''

  opts[:ignore_keys] = [*opts[:ignore_keys]] # splat covers single sym/string case

  opts[:comparison] = block if block_given?

  # prefer to compare with provided block
  result = custom_compare(opts[:comparison], opts[:prefix], obj1, obj2)
  return result if result

  return [] if obj1.nil? && obj2.nil?

  return [['~', opts[:prefix], obj1, obj2]] if obj1.nil? || obj2.nil?

  return [['~', opts[:prefix], obj1, obj2]] unless comparable?(obj1, obj2, opts[:strict])

  return LcsCompareArrays.call(obj1, obj2, opts) if obj1.is_a?(Array) && opts[:use_lcs]

  return LinearCompareArray.call(obj1, obj2, opts) if obj1.is_a?(Array) && !opts[:use_lcs]

  return CompareHashes.call(obj1, obj2, opts) if obj1.is_a?(Hash)

  return [] if compare_values(obj1, obj2, opts)

  [['~', opts[:prefix], obj1, obj2]]
end
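
Two details of the method above are easy to try in plain Ruby: the splat that normalizes :ignore_keys, and the effect of :numeric_tolerance and :strip on a leaf comparison. The sketch below is an assumption-laden illustration. values_equal? is a hypothetical helper written from the option descriptions, not the library's compare_values, and it deliberately ignores :strict.

```ruby
# The splat wraps a single key in an array and leaves an array unchanged.
single_key = :id
key_list   = %w[id updated_at]
p [*single_key]
p [*key_list]

# Hypothetical leaf comparison based on the documented options.
# Type strictness (:strict) is out of scope for this sketch.
def values_equal?(a, b, numeric_tolerance: 0, strip: false)
  if a.is_a?(Numeric) && b.is_a?(Numeric)
    # Numbers differ only when the gap is greater than the tolerance.
    return (a - b).abs <= numeric_tolerance
  end

  if strip
    a = a.strip if a.respond_to?(:strip)
    b = b.strip if b.respond_to?(:strip)
  end

  a == b
end

p values_equal?(1.0, 1.05, numeric_tolerance: 0.1)
p values_equal?(1.0, 1.2, numeric_tolerance: 0.1)
p values_equal?(' a ', 'a', strip: true)
```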

.patch!(obj, changes, options = {}) ⇒ Object

Apply patch to object

Parameters:

  • obj (Hash, Array)

    the object to be patched, can be an Array or a Hash

  • changes (Array)

    e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

  • options (Hash) (defaults to: {})

    supports the following keys:

    • :delimiter (String) ['.'] delimiter string for representing nested keys in the changes array

Returns:

  • the object after patch

Since:

  • 0.0.1



# File 'lib/hashdiff/patch.rb', line 17

def self.patch!(obj, changes, options = {})
  delimiter = options[:delimiter] || '.'

  changes.each do |change|
    parts = change[1]
    parts = decode_property_path(parts, delimiter) unless parts.is_a?(Array)

    last_part = parts.last

    parent_node = node(obj, parts[0, parts.size - 1])

    if change[0] == '+'
      if parent_node.is_a?(Array)
        parent_node.insert(last_part, change[2])
      else
        parent_node[last_part] = change[2]
      end
    elsif change[0] == '-'
      if parent_node.is_a?(Array)
        parent_node.delete_at(last_part)
      else
        parent_node.delete(last_part)
      end
    elsif change[0] == '~'
      parent_node[last_part] = change[3]
    end
  end

  obj
end
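
String paths such as 'x[0].c' have to be split into keys and array indices before the parent node can be located. The decode_property_path helper used above is not shown on this page, so the following is only a guess at what such a decoder might look like; decode_path_sketch is a hypothetical name and its handling of edge cases is an assumption.

```ruby
# Hypothetical path decoder: splits on the delimiter, then separates a
# leading key from any trailing [n] indices in each part.
def decode_path_sketch(path, delimiter = '.')
  path.split(delimiter).flat_map do |part|
    key     = part[/\A[^\[]*/]                         # text before the first '['
    indices = part.scan(/\[(\d+)\]/).flatten.map(&:to_i)
    key.empty? ? indices : [key, *indices]
  end
end

p decode_path_sketch('x[0].c')
p decode_path_sketch('a.b')
p decode_path_sketch('[1]')
```

Keys decoded this way are always strings, which is one reason an array path (see the :array_path option of diff) is the safer input when the hash uses symbol or other non-string keys.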

.prefix_append_array_index(prefix, array_index, opts) ⇒ Object



# File 'lib/hashdiff/util.rb', line 137

def self.prefix_append_array_index(prefix, array_index, opts)
  if opts[:array_path]
    prefix + [array_index]
  else
    "#{prefix}[#{array_index}]"
  end
end

.prefix_append_key(prefix, key, opts) ⇒ Object



# File 'lib/hashdiff/util.rb', line 129

def self.prefix_append_key(prefix, key, opts)
  if opts[:array_path]
    prefix + [key]
  else
    prefix.empty? ? key.to_s : "#{prefix}#{opts[:delimiter]}#{key}"
  end
end
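
To see how the two prefix helpers build a path step by step, the sketch below copies their logic into a standalone module so it runs without the gem. PathSketch, append_key and append_index are hypothetical names used only for this illustration.

```ruby
# Standalone copy of the two helpers' logic, for illustration only.
module PathSketch
  def self.append_key(prefix, key, opts)
    if opts[:array_path]
      prefix + [key]
    else
      prefix.empty? ? key.to_s : "#{prefix}#{opts[:delimiter]}#{key}"
    end
  end

  def self.append_index(prefix, index, opts)
    opts[:array_path] ? prefix + [index] : "#{prefix}[#{index}]"
  end
end

string_opts = { array_path: false, delimiter: '.' }
array_opts  = { array_path: true }

# String mode starts from '' and stringifies every key.
string_path = PathSketch.append_key('', :x, string_opts)
string_path = PathSketch.append_index(string_path, 0, string_opts)
string_path = PathSketch.append_key(string_path, 'c', string_opts)

# Array mode starts from [] and keeps each key's original type.
array_path = PathSketch.append_key([], :x, array_opts)
array_path = PathSketch.append_index(array_path, 0, array_opts)
array_path = PathSketch.append_key(array_path, 'c', array_opts)

puts string_path
p array_path
```

In string mode the symbol :x is flattened to text, while array mode keeps it as a symbol, which is what makes array paths usable for patching hashes with non-string keys.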

.unpatch!(obj, changes, options = {}) ⇒ Object

Unpatch an object

Parameters:

  • obj (Hash, Array)

    the object to be unpatched, can be an Array or a Hash

  • changes (Array)

    e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

  • options (Hash) (defaults to: {})

    supports the following keys:

    • :delimiter (String) ['.'] delimiter string for representing nested keys in the changes array

Returns:

  • the object after unpatch

Since:

  • 0.0.1



# File 'lib/hashdiff/patch.rb', line 58

def self.unpatch!(obj, changes, options = {})
  delimiter = options[:delimiter] || '.'

  changes.reverse_each do |change|
    parts = change[1]
    parts = decode_property_path(parts, delimiter) unless parts.is_a?(Array)

    last_part = parts.last

    parent_node = node(obj, parts[0, parts.size - 1])

    if change[0] == '+'
      if parent_node.is_a?(Array)
        parent_node.delete_at(last_part)
      else
        parent_node.delete(last_part)
      end
    elsif change[0] == '-'
      if parent_node.is_a?(Array)
        parent_node.insert(last_part, change[2])
      else
        parent_node[last_part] = change[2]
      end
    elsif change[0] == '~'
      parent_node[last_part] = change[2]
    end
  end

  obj
end
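
The relationship between patching and unpatching can be sketched without the gem. The code below is a simplified re-implementation that only accepts array paths, so no string decoding is needed; walk, apply_changes and revert_changes are hypothetical names, and walk merely stands in for the node helper, which is not shown on this page.

```ruby
# Follow `parts` down from `obj`; an empty list returns `obj` itself.
def walk(obj, parts)
  parts.reduce(obj) { |node, part| node[part] }
end

# Apply changes in order, as patch! does.
def apply_changes(obj, changes)
  changes.each do |op, path, value, new_value|
    parent = walk(obj, path[0...-1])
    key = path.last
    case op
    when '+' then parent.is_a?(Array) ? parent.insert(key, value) : parent.store(key, value)
    when '-' then parent.is_a?(Array) ? parent.delete_at(key) : parent.delete(key)
    when '~' then parent[key] = new_value
    end
  end
  obj
end

# Undo changes in reverse order with '+' and '-' swapped, as unpatch! does.
def revert_changes(obj, changes)
  changes.reverse_each do |op, path, value, _new_value|
    parent = walk(obj, path[0...-1])
    key = path.last
    case op
    when '+' then parent.is_a?(Array) ? parent.delete_at(key) : parent.delete(key)
    when '-' then parent.is_a?(Array) ? parent.insert(key, value) : parent.store(key, value)
    when '~' then parent[key] = value
    end
  end
  obj
end

original = { 'x' => [{ 'a' => 1, 'c' => 3 }, { 'y' => 3 }] }
changes = [
  ['-', ['x', 0, 'c'], 3],
  ['+', ['x', 0, 'b'], 2],
  ['-', ['x', 1], { 'y' => 3 }]
]

patched = apply_changes(Marshal.load(Marshal.dump(original)), changes)
p patched
restored = revert_changes(patched, changes) # mutates `patched` in place
p restored == original
```

Reversing the order matters because each change was recorded against the state left by the changes before it. Array indices in particular are only valid if later edits are undone first.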