data-import is a data-migration framework. The goal of the project is to provide a simple API to migrate data from a legacy schema into a new one. It's based on jeremyevans/sequel.


gem 'data-import'

You can put your migration configuration in any file you like. We suggest something like mapping.rb:

source :sequel, 'sqlite:/'
target :sequel, 'sqlite:/'

import 'Animals' do
  from 'tblAnimal', :primary_key => 'sAnimalID'
  to 'animals'

  mapping 'sAnimalID' => 'id'
  mapping 'strAnimalTitleText' => 'name'
  mapping 'sAnimalAge' => 'age'
  mapping 'strThreat' do |context, threat|
    rating = ['none', 'medium', 'big'].index(threat) + 1
    {:danger_rating => rating}
  end
end

To run the import, just execute:

  mapping_path = Rails.root + 'mapping.rb'
  DataImport.run_config! mapping_path

If you execute the import frequently, you can create a Rake task:

desc "Imports the data from the source database"
task :import do
  mapping_path = Rails.root + 'mapping.rb'
  options = {}
  options[:only] = ENV['RUN_ONLY'].split(',') if ENV['RUN_ONLY'].present?

  DataImport.run_config! mapping_path, options
end
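The RUN_ONLY handling in the task above relies on present?, which comes from ActiveSupport. As a minimal sketch in plain Ruby, the same option parsing could look like the following. The helper name import_options is hypothetical and not part of data-import; it only builds the options hash that is later passed to DataImport.run_config!.

```ruby
# Hypothetical helper (not part of data-import): turns a RUN_ONLY-style
# string such as "Users,Roles" into the options hash for run_config!.
def import_options(run_only)
  options = {}
  # to_s makes nil safe; strip and reject tolerate "Users, Roles" and ",".
  names = run_only.to_s.split(',').map(&:strip).reject(&:empty?)
  options[:only] = names unless names.empty?
  options
end

p import_options('Users, Roles') # :only holds both definition names
p import_options(nil)            # empty hash, so everything would be imported
```

With a task like the one above, the variable would typically be set on the command line, for example rake import RUN_ONLY=Users,Roles.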


data-import provides a clean DSL to define your mappings from the legacy schema to the new one.

Before Filter

data-import allows you to define a global filter. This filter can be used for global transformations such as encoding fixes. For example, a filter that downcases every string looks like this:

before_filter do |row|
  row.each do |k, v|
    row[k] = v.downcase if v.respond_to?(:downcase)
  end
end
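To see what such a filter does to a single row, here is a small sketch that runs the same block body against a plain Hash. It assumes a row behaves like a Hash of column name to value; the constant DOWNCASE_FILTER and the sample columns are made up for illustration.

```ruby
# Stand-in for the block passed to before_filter. The row is assumed
# to behave like a Hash of column name => value.
DOWNCASE_FILTER = lambda do |row|
  row.each do |k, v|
    # Only values that can be downcased (strings) are touched;
    # integers and nil pass through unchanged.
    row[k] = v.downcase if v.respond_to?(:downcase)
  end
end

sample_row = { 'strEmail' => 'Jane@Example.COM', 'sUserID' => 42, 'strNote' => nil }
DOWNCASE_FILTER.call(sample_row)
p sample_row
```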

Simple Mappings

You've already seen a very basic example of the DSL in the Installation section. This part shows off the features of the mapping DSL.


Every mapping starts with a call to import followed by the name of the mapping. You can name mappings however you like. The block passed to import contains the mapping itself. You supply the source table with from and the target table with to. Make sure that you set the primary key on the source table; otherwise pagination will not work properly and the migration will fill up your RAM.

import 'Users' do
  from 'tblUser', :primary_key => 'sUserID'
  to 'users'
end
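Why a primary key helps with memory can be illustrated with key-based batching: rows are read in key order, one batch at a time, and the last key seen marks where the next batch starts. The sketch below works on an in-memory array and is not data-import's actual pagination code; each_batch and LEGACY_USERS are hypothetical names.

```ruby
# Hypothetical in-memory stand-in for the legacy table.
LEGACY_USERS = (1..10).map { |id| { 'sUserID' => id, 'strUsername' => "user#{id}" } }

# Yields rows in batches ordered by primary key. Only the last key of
# the previous batch is remembered, so one batch is processed at a time.
def each_batch(rows, primary_key, batch_size)
  last_key = nil
  loop do
    batch = rows
      .select { |row| last_key.nil? || row[primary_key] > last_key }
      .sort_by { |row| row[primary_key] }
      .first(batch_size)
    break if batch.empty?
    yield batch
    last_key = batch.last[primary_key]
  end
end

each_batch(LEGACY_USERS, 'sUserID', 4) do |batch|
  p batch.map { |row| row['sUserID'] }
end
```

Without a stable key to order by, there is no reliable way to say where one batch ends and the next begins, which is presumably why the primary key is required.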


You can create simple name-mappings with a call to mapping:

mapping 'sUserID' => 'id'
mapping 'strEmail' => 'email'
mapping 'strUsername' => 'username'
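Conceptually, these name mappings rename columns: each source column's value ends up under the target column's name, and unmapped columns are left out. A rough sketch in plain Ruby follows. It is an illustration only, not the library's implementation; apply_name_mappings is a hypothetical helper, and string keys are assumed for simplicity.

```ruby
# The three mappings from above, as source column => target column.
NAME_MAPPINGS = {
  'sUserID'     => 'id',
  'strEmail'    => 'email',
  'strUsername' => 'username'
}

# Hypothetical helper: builds the target row by copying each mapped
# source column to its target column name.
def apply_name_mappings(source_row, mappings)
  mappings.each_with_object({}) do |(source_column, target_column), target_row|
    target_row[target_column] = source_row[source_column]
  end
end

legacy_row = {
  'sUserID'     => 7,
  'strEmail'    => 'jane@example.com',
  'strUsername' => 'jane',
  'strUnused'   => 'not mapped, so it is dropped'
}
p apply_name_mappings(legacy_row, NAME_MAPPINGS)
```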

If you need to process a column, you can add a block. The block receives the values of the columns you specified after mapping (in the example below they follow a leading context argument). The return value of the block should be a hash or nil. Nil means no mapping at all; a hash must use the column names of the target table as keys.

mapping 'strThreat' do |context, threat|
  rating = ['none', 'medium', 'big'].index(threat) + 1
  {:danger_rating => rating}
end
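Note that index returns nil for a value that is not in the list, so the block above would raise on an unexpected threat. A possible variation, sketched here as a plain method, returns nil in that case, which according to the rule above means no mapping at all. THREAT_LEVELS and danger_rating_for are hypothetical names used only for this illustration.

```ruby
THREAT_LEVELS = ['none', 'medium', 'big']

# Same rating logic as the block above, written as a plain method.
# Variation: returns nil (no mapping) for unknown values instead of
# raising NoMethodError on nil + 1.
def danger_rating_for(threat)
  position = THREAT_LEVELS.index(threat)
  return nil if position.nil?
  { :danger_rating => position + 1 }
end

p danger_rating_for('none')    # rating 1
p danger_rating_for('big')     # rating 3
p danger_rating_for('unknown') # nil, so nothing is mapped
```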


You can specify dependencies between definitions. Dependencies always run before the definition that declares them. Declaring all necessary dependencies also allows you to run a subset of definitions instead of everything.

import 'Roles' do
  from 'tblRole', :primary_key => 'sRoleID'
  to 'roles'
end

import 'SubscriptionPlans' do
  from 'tblSubcriptionCat', :primary_key => 'sSubscriptionCatID'
  to 'subscription_plans'
end

import 'Users' do
  from 'tblUser', :primary_key => 'sUserID'
  to 'users'
  dependencies 'SubscriptionPlans'
end

import 'Permissions' do
  from 'tblUserRoles'
  to 'permissions'
  dependencies 'Users', 'Roles'
end

You can now run parts of your mappings using the :only option:

DataImport.run_config! 'mappings.rb', :only => ['Users'] # => imports SubscriptionPlans then Users
DataImport.run_config! 'mappings.rb', :only => ['Roles'] # => imports Roles only
DataImport.run_config! 'mappings.rb', :only => ['Permissions'] # => imports Roles, SubscriptionPlans, Users and then Permissions
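How :only expands into a full run can be sketched as a depth-first walk over the declared dependencies. This is an illustration under assumptions, not data-import's resolver: execution_order is a hypothetical helper, it assumes there are no circular dependencies, and the order it produces among independent definitions may differ from the library's.

```ruby
# Dependencies as declared in the definitions above.
DEPENDENCIES = {
  'Roles'             => [],
  'SubscriptionPlans' => [],
  'Users'             => ['SubscriptionPlans'],
  'Permissions'       => ['Users', 'Roles']
}

# Hypothetical helper: returns the requested definitions plus everything
# they depend on, with each dependency placed before its dependent.
# Assumes the dependency graph has no cycles.
def execution_order(only, dependencies, ordered = [])
  only.each do |name|
    next if ordered.include?(name)
    execution_order(dependencies.fetch(name), dependencies, ordered)
    ordered << name
  end
  ordered
end

p execution_order(['Users'], DEPENDENCIES)
p execution_order(['Roles'], DEPENDENCIES)
p execution_order(['Permissions'], DEPENDENCIES)
```

Any order that puts SubscriptionPlans before Users, and both Users and Roles before Permissions, satisfies the declared dependencies.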


You can learn a lot from the integration specs.


Got a question?

Just send me a message and I'll try to get back to you as soon as possible.

Found a bug?

Please submit a new issue.

Fixed something?

  1. Fork data-import
  2. Create a topic branch - git checkout -b my_branch
  3. Make your changes and update the History.txt file
  4. Push to your branch - git push origin my_branch
  5. Send me a pull-request for your topic branch
  6. That's it!