Mingle

Mingle is an ActiveRecord extension for merging database records. It provides completely generic logic for removing duplicate entries from tables, including merging associations.

Examples

Objects are merged using the syntax target.merge victim. The victim's attributes are used to fill in any blanks in the target. If the target is valid after this update, it is saved and the victim is destroyed.

lennon = Factory :artist, :first_name => 'John'
starr  = Factory :artist, :last_name  => 'Starr'

Artist.count.should == 2

lennon.merge starr

lennon.first_name.should == 'John'
lennon.last_name.should == 'Starr'

Artist.count.should == 1

You can use the :keep and :overwrite options to get better control over which fields are changed by the merge. :keep stops fields being replaced:

lennon = Factory :artist, :first_name => 'John'
starr  = Factory :artist, :first_name => 'Ringo', :last_name  => 'Starr'

lennon.merge starr, :keep => :last_name

lennon.first_name.should == 'John'
lennon.last_name.should be_nil

:overwrite forces fields to be replaced:

lennon = Factory :artist, :first_name => 'John',  :last_name  => 'Lennon'
starr  = Factory :artist, :first_name => 'Ringo', :last_name  => 'Starr'

lennon.merge starr, :overwrite => :first_name

lennon.first_name.should == 'Ringo'
lennon.last_name.should == 'Lennon'

You can also pass a block to decide which attributes to keep by inspecting the value of each attribute on both objects. For each key, the value returned by the block is the one kept:

lennon = Factory :artist, :first_name => 'John',  :last_name  => 'Lennon'
starr  = Factory :artist, :first_name => 'Ringo', :last_name  => 'Starr'

lennon.merge starr do |key, my_value, their_value|
  key == :first_name ? my_value : their_value
end

lennon.first_name.should == 'John'
lennon.last_name.should == 'Starr'

:keep, :overwrite and decision blocks are supported for basic attributes and belongs_to associations. Decision blocks can be set per-class to apply to all merges for that class:

class Artist
  merge_strategy do |key, my_value, their_value|
    key == :first_name ? my_value : their_value
  end
end

One-to-many and many-to-many associations are (by default) handled by simply concatenating the two associated collections. This is done efficiently using a single query rather than loading and updating all the associated objects. Duplicate links in HABTM join tables are deleted, as are duplicate links in join models used in has_many :through associations.

You can specify how to look for duplicates in merged collections using the merge_if macro. e.g. say I want to make sure that merged cities do not contain venues with the same name:

class City
  has_many :venues
end

class Venue
  belongs_to :city

  merge_if { |me, them| me.name == them.name }
end

london = Factory :city, :venues => %w[Koko Scala Astoria].map { |n| Factory :venue, :name => n }
new_york = Factory :city, :venues => %w[Scala Factory].map { |n| Factory :venue, :name => n }

Venue.count.should == 5

london.merge new_york

london.venues.map(&:name).should == %w[Koko Scala Astoria Factory]
Venue.count.should == 4

The duplicate objects from the victim's collection are merged into the object in the target's collection using the merge method, so attributes from the victim objects are not lost.

TODO

  • Handle recursive merges and join models more intelligently

  • Allow pruning of collections during the merge process

  • Mark victims as duplicates rather than deleting

  • Forward changes to marked dupes onto their target object

  • Automatically redirect when accessing pages for victim objects

License

(The MIT License)

Copyright © 2009 Songkick.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the 'Software'), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.