ConnectionManager

Multi-Database, Replication and Sharding for ActiveRecord.

Background

ActiveRecord, for quite some time now, has supported multiple database connections through the use of #establish_connection and connection classes more info Multiple databases, replication and shards can be implemented directly in rails without patching, but a gem helps to reduce redundant code and ensure consistency. ConnectionManager replaces all the connection classes and subclasses required for multiple database support in Rails with a few class methods and simple database.yml configuration. Since ConnectionManager does not alter ActiveRecord's connection pool, thread safety is not a concern.

Upgrading to 0.3

0.3 is a complete overhaul and will cause compatibility issues for folks who upgrade using the previous replication setup. Fortunately, for most folks the only change they have to do is specify the their slaves and masters in the database.yml and set build_connection_class to true to have ActiveRecord build their connection classes. See the example database.yml below.

Installation

ConnectionManager is available through Rubygems and can be installed via:

$ gem install connection_manager

Rails 3/4 setup (No Rails 2 at this time)

Add connection_manager to you gemfile:

gem 'connection_manager'

Run bundle install:

bundle install

Example database.yml

common: &common
adapter: mysql2
username: root
password: *****
pool: 20
connect_timeout: 20
timeout: 900
socket: /tmp/mysql.sock
build_connection_class: true

development:
  <<: *common
  database: test_app
  slaves: [slave_1_test_app_development, slave_2_test_app_development]

slave_1_test_app_development:
  <<: *common
  database: test_app
  readonly: true

slave_2_test_app_development:
  <<: *common
  database: test_app
  readonly: true

user_data_development
  <<: *common
  database: user_data
  slaves: [slave_1_user_data_development, slave_2_user_data_development]

slave_1_user_data_development
  <<: *common
  database: user_data
  readonly: true

slave_2_user_data_development
  <<: *common
  database: user_data
  readonly: true

In the above database.yml the Master databases are listed as "development" and "user_data_development". Replication databases are defined as normally connections and are added to the 'replications:' option for their master. The readonly option ensures all ActiveRecord objects returned from this connection are ALWAYS readonly.

Building Connection Classes

Manually

ConnectionManager provides establish_managed_connection for build connection classes and connection to multiple databases.

class MyConnection < ActiveRecord::Base
  establish_managed_connection("my_database_#{Rails.env}", :readonly => true)
end

class User < MyConnection
end

MyConnection    => MyConnection(abstract)
@user = User.first
@user.readonly? => true

The establish_managed_connection method, runs establish_connection with the supplied database.yml key, sets abstract_class to true, and (since :readonly is set to true) ensures all ActiveRecord objects build using this connection class are readonly. If readonly is set to true in the database.yml, passing the readonly option is not necessary.

Automatically

ActiveRecord can build all your connection classes for you. The connection class names will be based on the database.yml keys.ActiveRecord will build connection classes for all the entries in the database.yml where "build_connection_class" is true, and match the current environment settings

Using

The using method allows you specify the connection class to use for query. The return objects will have the correct model name, but the instance's class's superclass will be the connection class and all database actions performed on the instance will use the connection class's connection.

User.using("Slave1Connection").first

search = User.where(disabled => true)
@legacy_users = search.using("Shard1Connection").all #=> [<User::Shard1ConnectionDup...>,<User::Shard1ConnectionDup..]
@legacy_users.first.save #=> uses the Shard1Connection connection

@new_users = search.page(params[:page]).all => [<User...>,<User...>]

Replication

Simply add 'replicated' to your model.

class User < UserDataConnection
    has_one :job
    has_many :teams
    replicated # implement replication        
    # model code ...
end

The replicated method builds models who inherit from the main model. User::Slave1UserDataConnectionDup.superclass => Slave1UserDataConnection(abstract) User::Slave1UserDataDup.first => returns results from slave_1_user_data_development User::Slave2UserDataDup.where(['created_at BETWEEN ? and ?',Time.now - 3.hours, Time.now]).all => returns results from slave_2_user_data_development

Finally, ConnectionManager creates an additional class method that shifts through your available slave connections each time it is called using a different connection on each action.

User.slaves.first  => returns results from slave_1_use_data_development 
User.slaves.last =>  => returns results from slave_2_use_data_development 
User.slaves.where(['created_at BETWEEN ? and ?',Time.now - 3.hours, Time.now]).all  => returns results from slave_1_user_data_development 
User.slaves.where(['created_at BETWEEN ? and ?',Time.now - 5.days, Time.now]).all  => returns results from slave_2_user_data_development 

Replicated defaults to the slaves replication type,so if you have only masters and a combination of masters and slaves for replication, you have set the replication type to masters

class User < UserDataConnection
    replicated #slaves replication
    replicated :type => :masters, :name => 'masters' # masters replication
end

Sharding

After tinkering with some solutions for shards, I've come to a similar conclusion as DataFabric: "Sharding should be implemented at the application level". The #shards method is very basic and while it may be useful to most folks, it should really serve as an example of a possible solutions to your shard requirements.

class LegacyUser < UserShardConnection
end

class User < ActiveRecord::Base
    self.shard_class_names = ["LegacyUser"]
end

# Calls the supplied block on all the shards available to User, including the User model itself.
User.shards{ |shard| shard.where(:user_name => "some_user").all} => [<User ...>,<LegacyUser ...>]

Caching

ActiveRecord only caches queries for the ActiveRecord::Base connection. Inorder to cache queries that originate from classes that used establish_connection you must surround your code with a cache block:

MyOtherConnectionClass.cache {
  Some queries...
}

In Rails for less complicated schemas you could simply create an around filter for your controllers

class ApplicationController < ActionController::Base
  around_filter :cache_slaves
  private
  def cache_slaves
    MyOnlySlaveConnection.cache { yield }
  end

Migrations

Nothing implement now to help but there are lots of potential solutions here

TODOs

  • Maybe add migration support for Rails AR implementations.

Other ActiveRecord Connection gems

Contributing to ConnectionManager

  • Check out the latest master to make sure the feature has not been implemented or the bug hasn't been fixed yet
  • Check out the issue tracker to make sure someone already has not requested it and/or contributed it
  • Fork the project
  • Start a feature/bugfix branch
  • Commit and push until you are happy with your contribution
  • Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
  • Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.