Mongify

<img src=“https://badges.gitter.im/Join%20Chat.svg” alt=“Join the chat at https://gitter.im/anlek/mongify”>

mongify.com

A data translator from sql database to mongoDB.

Supports MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2 (Basically anything ActiveRecord has built-in). However, I’ve only tested it with MySql and SQLite

Supports any version of MongoDB

Learn more about MongoDB at: www.mongodb.org

Install

gem install mongify

NOTICE

You might have to install active record database gems (you'll know if you get an error message while running mongify command)

Usage

Creating a configuration file

In order for Mongify to do its job, it needs to know where your databases are located.

Building a config file is as simple as this:

sql_connection do
  adapter     "mysql"
  host        "localhost"
  username    "root"
  password    "passw0rd"
  database    "my_database"
  batch_size  10000           # This is defaulted to 10000 but in case you want to make that smaller (on lower RAM machines)
end

mongodb_connection do
  host        "localhost"
  database    "my_database"
end

You can check your configuration by running

mongify check database.config

Options

Currently the only supported option is for the mongodb_connection.

mongodb_connection :force => true do                # Forcing a mongodb_connection will drop the database before processing
  # ...
end

You can omit the mongodb connection until you’re ready to process your translation

Generating or creating a translation

Generating a translation

If your database is large and complex, it might be a bit too much work to write the translation file by hand. Mongify’s translate command can help with this:

mongify translation database.config

Or pipe it right into a file by running

mongify translation database.config > translation_file.rb

Creating a translation

Creating a translation is pretty straightforward. It looks something like this:

table "users" do
  column "id", :key
  column "first_name", :string
  column "last_name", :string
  column "created_at", :datetime
  column "updated_at", :datetime
end

table "posts" do
  column "id", :key
  column "title", :string
  column "owner_id", :integer, :references => :users
  column "body", :text
  column "published_at", :datetime
  column "created_at", :datetime
  column "updated_at", :datetime
end

table "comments", :embed_in => :posts, :on => :post_id do
  column "id", :key
  column "body", :text
  column "post_id", :integer, :references => :posts
  column "user_id", :integer, :references => :users
  column "created_at", :datetime
  column "updated_at", :datetime
end

table "preferences", :embed_in => :users, :as => :object do
  column "id", :key, :as => :string
  column "user_id", :integer, :references => "users"
  column "notify_by_email", :boolean
end

table "notes", :embed_in => true, :polymorphic => 'notable' do
  column "id", :key
  column "user_id", :integer, :references => "users"
  column "notable_id", :integer
  column "notable_type", :string
  column "body", :text
  column "created_at", :datetime
  column "updated_at", :datetime
end

Save the file as "translation_file.rb" and run the command:

mongify process database.config translation_file.rb

Commands

Usage: mongify command database_config [database_translation.rb]

Commands:
  "check" or "ck"           >> Checks connection for sql and no_sql databases [configuration_file]
  "process" or "pr"         >> Takes a translation and process it to mongodb [configuration_file, translation_file]
  "sync" or "sy"            >> Takes a translation and process it to mongodb, only syncs (insert/update) new or updated records based on the updated_at column [configuration_file, translation_file]
  "translation" or "tr"     >> Outputs a translation file from a sql connection [configuration_file]

Examples:

  mongify translation datbase.config
  mongify tr database.config
  mongify check database.config
  mongify process database.config database_translation.rb
  mongify sync database.config database_translation.rb

Common options:
    -h, --help                       Show this message
    -v, --version                    Show version

Translation Layout and Options

When dealing with a translation, there are a few options you can set

Table

Structure

Structure for defining a table is as follows:

table "table_name", {options} do
  # columns go here...
end

Options

Table Options are as follow:

table "table_name"                                        # Does a straight copy of the table
table "table_name", :embed_in => 'users'                  # Embeds table_name into users, assuming a user_id is present in table_name.
                                                          # This will also assume you want the table embedded as an array.

table "table_name",                                       # Embeds table_name into users, linking it via a owner_id
      :embed_in => 'users',                               # This will also assume you want the table embedded as an array.
      :on => 'owner_id'

table "table_name",                                       # Embeds table_name into users as a one to one relationship
      :embed_in => 'users',                               # This also assumes you have a user_id present in table_name
      :on => 'owner_id',                                  # You can also specify both :on and :as options when embedding
      :as => 'object'                                     # NOTE: If you rename the owner_id column, make sure you
                                                          # update the :on to the new column name

table "table_name", :rename_to => 'my_table'              # This will allow you to rename the table as it's getting process
                                                          # Just remember that columns that use :reference need to
                                                          # reference the new name.

table "table_name", :ignore => true                       # This will ignore the whole table (like it doesn't exist)
                                                          # This option is good for tables like: schema_migrations

table "table_name",                                       # This allows you to specify the table as being polymorphic
      :polymorphic => 'notable',                          # and provide the name of the polymorphic relationship.
      :embed_in => true                                   # Setting embed_in => true allows the relationship to be
                                                          # embedded directly into the parent class.
                                                          # If you do not embed it, the polymorphic table will be copied in to
                                                          # MongoDB and the notable_id will be updated to the new BSON::ObjectID

table "table_name" do                                     # A table can take a before_save block that will be called just
  before_save do |row|                                    # before the row is saved to the no sql database.
    row.admin = row.delete('permission').to_i > 50        # This gives you the ability to do very powerful things like:
  end                                                     # Moving records around, renaming records, changing values in row based on
end                                                       # some values! Checkout Mongify::Database::DataRow to learn more

table "users" do                                          # Here is how to set new ID using the old id
  before_save do |row|
    row._id = row.delete('pre_mongified_id')
  end
end

table "preferences", :embed_in => "users" do              # As of version 0.2, embedded tables with a before_save will take an
  before_save do |pref_row, user_row, unset_user_row|                     # extra argument which is the parent row of the embedded table.
    user_row.email_me = pref_row.delete('email_me')       # This gives you the ability to move things from an embedded table row
                                                          # to the parent row.
    if pref_row['one_name']
      unset_user_row['last_name'] = true                  # This will delete/unset the last name for the parent row
    end
  end
end

More documentation can be found at Mongify::Database::Table

Columns

Structure

Structure for defining a column is as follows:

column "name", :type, {options}

Columns with no type given will be set to :string

Notes

as of version 0.2: Leaving a column out when defining a table will result in the column being ignored

Types

Before we cover the options, you need to know what types of columns are supported:

:key                  # Columns that are primary keys need to be marked as :key type. You can provide an :as if your :key is not an integer column
:integer              # Will be converted to a integer
:float                # Will be converted to a float
:decimal              # Will be converted to a string (due to MongoDB Ruby Drivers not supporting BigDecimal, read details in Mongify::Database::Column under Decimal Storage)
:string               # Will be converted to a string
:text                 # Will be converted to a string
:datetime             # Will be converted to a Time format (DateTime is not currently supported in the Mongo ruby driver)
:date                 # Will be converted to a Time format (Date is not currently supported in the Mongo ruby driver)
:timestamps           # Will be converted to a Time format
:time                 # Will be converted to a Time format (the date portion of the Time object will be 2000-01-01)
:binary               # Will be converted to a string
:boolean              # Will be converted to a true or false values

Options

column "post_id", :integer, :references => :posts   # Referenced columns need to be marked as such, this will mean that they will be updated
                                                    # with the new BSON::ObjectID.
                                                    # NOTE: if you rename the table 'posts', you should set the :references to the new name

column "name", :string, :ignore => true             # Ignoring a column will make the column NOT copy over to the new database

column "surname",
       :string,
       :rename_to => 'last_name'                    # Rename_to allows you to rename the column

<em>For decimal columns you can specify a few options:</em>
column "total",                                     # This is a default conversion setting.
       :decimal,
       :as => 'string'

column "total",                                     # You can specify to convert your decimal to integer
        :decimal,                                   # specifying scale will define how many decimal places to keep
        :as => 'integer',                           # Example: :scale => 2 will convert 123.4567 to 12346 before saving
        :scale => 2

More documentation can be found at Mongify::Database::Column

Notes

Mongify has been tested on Ruby 1.8.7-p330, 1.9.2-p136, 1.9.3-p327, 2.0.0-p353, 2.1.1

If you have any issues, please feel free to report them here: issue tracker

Sync Note

If you want to continually sync your sql database with MongoDB using Mongify, you MUST run the sync feature from the start. Using the process command will not permit future syncing. The sync feature leaves the ‘pre_mongify_id` in your documents as well as it creates it’s own collection called ‘mongify_sync_helper` to keep dates and times of your last sync.

Additional Tools

As of May 2013, we’ve released an add-on tool to Mongify called Mongify-Mongoid which lets you use the translation.rb file to generate Mongoid models. It generates the model fields, relations (most of the way) and timestamps. This should save you some time from having to re-type all the models by hand. Check out the gem at: rubygems.org/gems/mongify-mongoid

TODO

  • Allow deeper embedding

  • Test in different databases

  • Give an ability to mark source DB rows as imported (allowing Mongify to run as a on going converter)

Known Issues

  • Can’t do anything to an embedded table

Development

Requirements

You just need bundler >= 1.0.8

gem install bundler
bundle install
copy over spec/support/database.example to spec/support/database.yml and fill in required info
rake test:setup:mysql
rake test:setup:postgresql
rake test

Special Thanks

Just want to thank my wife (Alessia) who not only puts up with me working on my free time but sometimes helps out listening to my craziness or helping me name classes or functions.

I would also like to thank Mon_Ouie on the Ruby IRC channel for helping me figure out how to setup the internal configuration file reading.

Another thanks goes out to eimermusic for his feedback on the gem, and a few bugs that he helps flush out. And to Pranas Kiziela who committed some bug and typo fixes that make Mongify even better!

About

This gem was made by Andrew Kalek from Anlek Consulting

Reach me at:

The sync functionality was made by Hossam Hammady from Qatar Computing Research Institute

Reach me at:

License

Copyright © 2011 - 2013 Andrew Kalek, Anlek Consulting

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.