ZD

ZD is a zero-downtime data migration framework that sits on top of the data store (or stores) of your choice. It achieves zero downtime by moving your data through a series of states:

  1. Unrun: The migration is implemented, but has not been run yet.
  2. Prepared: Any necessary creation of a "space" for the migrated data has been done. Typically not needed for schema-less stores. Your application code now writes to both the old and new data structures.
  3. Migrated: All pre-existing data has been copied and/or mutated into the new locations while continuing to exist as-is in the old locations. The code base is still operating off of the old locations (and new data continues to be written to both locations).
  4. Switched: The code base now uses the new locations and ignores the old locations. Data is still written to both old and new locations. This is the point at which you would want to test the system to make sure the migration worked as expected. If something isn't right, the migration can still be rolled all the way back to the unrun state.
  5. Completed: The migration has been verified to be working, and new data is no longer written to the old locations. Migration-specific code can be stripped out of the code base now. Once a migration is completed, it cannot be rolled back. Any references to the migration in the codebase will generate a warning.
  6. Destroyed: The old locations for data have been removed from the data store. Any references to the migration in the codebase will raise an error.
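
The read and write behavior implied by these states can be summarized in a small lookup. This is only an illustrative sketch: the STATE_ORDER constant and the data_paths and rollback_possible? helpers are hypothetical names, not part of ZD.

```ruby
# Hypothetical summary of the six states described above; not ZD's real API.
STATE_ORDER = %i[unrun prepared migrated switched completed destroyed].freeze

# Returns which locations the application reads from and writes to in a state.
def data_paths(state)
  position = STATE_ORDER.index(state) or raise ArgumentError, "unknown state: #{state}"

  # Reads move to the new locations once the migration is switched.
  read = position >= STATE_ORDER.index(:switched) ? :new : :old

  # Writes go to the new locations from prepared onward, and keep going to
  # the old locations until the migration is completed.
  write = []
  write << :old if position < STATE_ORDER.index(:completed)
  write << :new if position >= STATE_ORDER.index(:prepared)

  { read: read, write: write }
end

# A migration can be rolled back until it is completed.
def rollback_possible?(state)
  STATE_ORDER.index(state) < STATE_ORDER.index(:completed)
end

STATE_ORDER.each do |state|
  puts format("%-10s %s", state, data_paths(state))
end
```

The window between prepared and completed is where both locations receive writes, which is what keeps a rollback cheap.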

Installation

Just add zd to your Gemfile:

gem 'zd'

Then run bundle install.

Usage

To show how ZD works, let's walk through a simple example. Let's say you have a Person class, which used to store separate first names and last names. You've since expanded internationally and realized what a bad idea this is in general, and so now you need to fix your mistake down to the data level without taking your application down (though rolling restarts are OK). Your Person class looks like this initially:

class Person
  include AwesomeDB

  def first_name
    read(:first_name)
  end

  def last_name
    read(:last_name)
  end

  def first_name=(value)
    write(:first_name, value)
  end

  def last_name=(value)
    write(:last_name, value)
  end

  def name
    [first_name, last_name].compact.join(" ")
  end

  def name=(value)
    first_name, *rest = value.split(/\s+/)
    write(:first_name, first_name)
    write(:last_name, rest.join(" "))
  end
end

(The read and write methods are made-up access methods for the made-up AwesomeDB data store.)
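
If you want to run the examples locally, an in-memory stand-in is enough. The sketch below is an assumption made for experimentation: AwesomeDB is not a real library, and the fields helper is a hypothetical name. The Person class is a trimmed copy of the one above.

```ruby
# Hypothetical in-memory stand-in for the made-up AwesomeDB store.
module AwesomeDB
  # Returns the stored value for a field, or nil if it was never written.
  def read(field)
    fields[field]
  end

  # Stores a value for a field and returns it.
  def write(field, value)
    fields[field] = value
  end

  private

  # Per-instance hash that plays the role of the record's storage.
  def fields
    @fields ||= {}
  end
end

# Trimmed copy of the Person class above, enough to exercise the stand-in.
class Person
  include AwesomeDB

  def first_name
    read(:first_name)
  end

  def last_name
    read(:last_name)
  end

  def name
    [first_name, last_name].compact.join(" ")
  end

  def name=(value)
    first_name, *rest = value.split(/\s+/)
    write(:first_name, first_name)
    write(:last_name, rest.join(" "))
  end
end

person = Person.new
person.name = "Ada Lovelace"
puts person.first_name
puts person.last_name
puts person.name
```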

Currently you still have code using both name and first_name/last_name, but you're slowly cleaning it up. The key thing is that all the methods on the class continue to obey their contract throughout the data migration.

To get started you'll want to generate a new migration with zd new <name>. Migrations go in the db/migrate folder in your project, and use a timestamped filename (similar to ActiveRecord migrations). A fresh migration looks something like this:
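
For a sense of what a timestamped filename might look like, here is a rough sketch. The exact format zd uses is not stated here, so this simply follows the ActiveRecord convention; migration_filename is a hypothetical helper, not a zd function.

```ruby
# Hypothetical sketch of a timestamped migration filename. The format is an
# assumption borrowed from ActiveRecord, not confirmed zd behavior.
def migration_filename(class_name, time = Time.now.utc)
  timestamp = time.strftime("%Y%m%d%H%M%S")
  # Convert CamelCase to snake_case, e.g. "MergeNames" -> "merge_names".
  snake = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  File.join("db", "migrate", "#{timestamp}_#{snake}.rb")
end

puts migration_filename("MergeFirstAndLastName", Time.utc(2024, 1, 2, 3, 4, 5))
```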

class Migrations::MergeFirstAndLastName < ZD::Migration
  register! depends_on: :nothing

  def prepare
  end

  def migrate
  end

  def destroy
  end
end

And here is what it might look like after the migration is filled out:

class Migrations::MergeFirstAndLastName < ZD::Migration
  register! depends_on: :nothing

  def prepare
    Person.add_field :name
  end

  def migrate
    Person.each do |person|
      person.name = [person.first_name, person.last_name].compact.join(" ")
    end
  end

  def destroy
    Person.remove_field :first_name
    Person.remove_field :last_name
  end
end

The Person.add_field and Person.remove_field methods are made up; you would use whatever your data store provides (if anything is needed at all; many schema-less data stores won't need the prepare step).

This is all well and good, but how does the model handle the fact that the data format is shifting around underneath it? ZD provides state-based methods that mark which code should run in which migration state:

class Person
  include AwesomeDB

  def first_name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED{read(:first_name)}
      m.ONCE_SWITCHED{@first_name ||= name.split(/\s+/).first}
    end
  end

  def last_name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED{read(:last_name)}
      m.ONCE_SWITCHED{@last_name ||= name.split(/\s+/).drop(1).join(" ")}
    end
  end

  def first_name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED{write(:name, [value, last_name].compact.join(" "))}
      m.UNTIL_COMPLETED{write(:first_name, value)}
    end
  end

  def last_name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED{write(:name, [first_name, value].compact.join(" "))}
      m.UNTIL_COMPLETED{write(:last_name, value)}
    end
  end

  def name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED{[first_name, last_name].compact.join(" ")}
      m.ONCE_SWITCHED{read(:name)}
    end
  end

  def name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED{write(:name, value)}
      m.UNTIL_COMPLETED do
        first_name, *rest = value.split(/\s+/)
        write(:first_name, first_name)
        write(:last_name, rest.join(" "))
      end
    end
  end
end

The first thing you're probably thinking after seeing that is, "Who hit my code with the ugly stick!?!" But that's actually a feature of ZD: migration-specific code sticks out like a sore thumb so that there will be lots of motivation to strip it out once the migration is complete. Migration code should be robust but temporary.
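
To make the state-based methods less mysterious, here is a minimal sketch of how such a handler could work. It is not ZD's implementation: the SketchMigrationHandle class and its ORDER constant are hypothetical, and it assumes that HANDLE returns the value of the last block that actually ran.

```ruby
# Hypothetical sketch of a state-aware handler; the class and constant names
# are assumptions, not ZD's real internals.
class SketchMigrationHandle
  ORDER = %i[unrun prepared migrated switched completed destroyed].freeze

  def initialize(name, state)
    raise ArgumentError, "unknown state: #{state}" unless ORDER.include?(state)
    @name = name
    @position = ORDER.index(state)
    @result = nil
  end

  # Yields self so the caller can declare state-scoped blocks, then returns
  # the value of the last block that actually ran.
  def HANDLE
    raise "migration #{@name} has been destroyed" if @position == ORDER.index(:destroyed)
    warn "migration #{@name} is completed; remove this code" if @position == ORDER.index(:completed)
    @result = nil
    yield self
    @result
  end

  # Defines ONCE_PREPARED, UNTIL_PREPARED, ONCE_MIGRATED, and so on.
  ORDER.drop(1).each do |state|
    threshold = ORDER.index(state)
    define_method("ONCE_#{state.upcase}") { |&block| run_if(@position >= threshold, &block) }
    define_method("UNTIL_#{state.upcase}") { |&block| run_if(@position < threshold, &block) }
  end

  private

  # Runs the block and records its value only when the condition holds.
  def run_if(active)
    @result = yield if active
  end
end

%i[migrated switched].each do |state|
  handle = SketchMigrationHandle.new(:merge_first_and_last_name, state)
  path = handle.HANDLE do |m|
    m.UNTIL_SWITCHED { :old_path }
    m.ONCE_SWITCHED { :new_path }
  end
  puts "#{state}: #{path}"
end
```

Under this assumption, a reader pairs an UNTIL block with a ONCE block for the same state so that exactly one runs, while a writer pairs ONCE_PREPARED with UNTIL_COMPLETED so that both run during the dual-write window.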

Once your migration and migration-specific code are in place, you can start walking your data through the migration states using the zd command:

$ zd prepare

All migrations in the unrun state will be transitioned to the prepared state via the prepare action.

$ zd migrate

All migrations in the prepared state will be transitioned to the migrated state via the migrate action.

$ zd switch

All migrations in the migrated state will flip over to switched. This triggers all code to start using the new code paths.

This is the point at which you should verify that your migrations have been successful and all the new code is working as expected in production. Getting back to the old state is as easy as zd switchoff [name].
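
One way to approach that verification is to compare the old and new fields record by record. The sketch below is purely illustrative: plain hashes stand in for stored records, and mismatched is a hypothetical helper rather than anything ZD provides.

```ruby
# Hypothetical verification pass; plain hashes stand in for stored records.
records = [
  { first_name: "Ada",  last_name: "Lovelace", name: "Ada Lovelace" },
  { first_name: "Cher", last_name: nil,        name: "Cher" },
  { first_name: "Alan", last_name: "Turing",   name: "Alan" } # stale copy
]

# Returns the records whose new name field disagrees with the old fields.
def mismatched(records)
  records.reject do |record|
    expected = [record[:first_name], record[:last_name]].compact.join(" ")
    record[:name] == expected
  end
end

bad = mismatched(records)
puts "#{bad.size} of #{records.size} records need attention"
bad.each { |record| puts record.inspect }
```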

$ zd complete

All migrations in the switched state will flip over to completed. This triggers all code to stop writing to old locations, and puts you past the point of no return for an easy rollback. Once you get here, it's time to go through your codebase and rip out the migration-specific code blocks, just leaving the code that deals with the new data structure.

$ zd destroy

All migrations in the completed state will be transitioned to the destroyed state via the destroy action. Typically this is the point at which old data gets cleaned up. Note that once your migration gets to this state, continued references to it in your code will raise an error.
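
The five commands above can be pictured as a table of transitions. The following is a sketch under stated assumptions, not zd's actual runner: TRANSITIONS, FakeMigration, and run_command are hypothetical names, and switch and complete are assumed to change state without calling a method on the migration class.

```ruby
# Hypothetical sketch of how each zd command could advance migrations.
# command => [state it picks up, state it leaves behind, action it calls]
TRANSITIONS = {
  prepare:  [:unrun,     :prepared,  :prepare],
  migrate:  [:prepared,  :migrated,  :migrate],
  switch:   [:migrated,  :switched,  nil],
  complete: [:switched,  :completed, nil],
  destroy:  [:completed, :destroyed, :destroy]
}.freeze

# Minimal stand-in for a migration that records which actions were called.
class FakeMigration
  attr_reader :calls
  attr_accessor :state

  def initialize
    @state = :unrun
    @calls = []
  end

  %i[prepare migrate destroy].each do |action|
    define_method(action) { @calls << action }
  end
end

# Advances every migration currently in the command's starting state.
def run_command(command, migrations)
  from, to, action = TRANSITIONS.fetch(command)
  migrations.select { |m| m.state == from }.each do |migration|
    migration.public_send(action) if action
    migration.state = to
  end
end

migration = FakeMigration.new
%i[prepare migrate switch complete destroy].each do |command|
  run_command(command, [migration])
  puts "zd #{command} -> #{migration.state}"
end
```

Because each command only picks up migrations in its starting state, running a command out of order leaves other migrations untouched.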

And that's all there is to it! You can either leave old migration files in db/migrate, or delete them once you're done with them; the overhead for each one is very small. Oh, and here's what the Person class looks like once you're done:

class Person
  include AwesomeDB

  def first_name
    @first_name ||= name.split(/\s+/).first
  end

  def last_name
    @last_name ||= name.split(/\s+/).drop(1).join(" ")
  end

  def first_name=(value)
    write(:name, [value, last_name].compact.join(" "))
  end

  def last_name=(value)
    write(:name, [first_name, value].compact.join(" "))
  end

  def name
    read(:name)
  end

  def name=(value)
    write(:name, value)
  end
end

No more ugly!

State Tracking

Dependencies

Philosophy