SlavePools

Easy Single Master / Multiple Slave setup for use in Ruby/Rails projects.

SlavePools builds a base layer of master/slave query splitting by overwriting ActiveRecord’s connection with a connection_proxy. With this in place, you can add a second layer of traffic splitting by wrapping requests in the provided helper methods (examples below), which gives you a manageable master/slave solution for a standard Rails application.

Overview

Sends only whitelisted SELECT-type queries to the Slaves
Sends all other queries to the Master
Works with query caching and transactions
Easy to separate types of read traffic into different collections of slaves (e.g. separating admin and user traffic)
Minimalist approach
- doesn’t include sharding
- doesn’t create a new ActiveRecord adapter
- doesn’t weight slave DBs
- builds onto a standard database.yml file (the gem doesn’t initialize if no slaves are specified)
- doesn’t switch slaves on its own (the user specifies when to switch in their code)

The SlavePools gem started as a fork of Maximilian Schöfmann’s multi_db (github.com/schoefmax/multi_db). The multi_db gem was inspired by Rick Olson’s “masochism” plugin.

Usage

Toggle to next slave:

SlavePools.next_slave!

Specify a different slave pool than the default:

SlavePools.with_pool('other_pool') do
  # do stuff
end

Specifically use the master for a call:

SlavePools.with_master do
  # do stuff
end

Determine if there are slaves:

SlavePools.active?
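
The block helpers above change the connection target only for the duration of the block. A minimal sketch of that pattern follows. The `Router` class and its symbols are hypothetical stand-ins for illustration, not the gem's implementation:

```ruby
# Hypothetical stand-in for the gem's proxy; names here are illustrative only.
class Router
  attr_reader :current

  def initialize
    @current = :slave
  end

  # Run the block against the master, then restore the previous target,
  # even if the block raises.
  def with_master
    previous = @current
    @current = :master
    yield
  ensure
    @current = previous
  end
end

router = Router.new
inside = router.with_master { router.current }
puts inside          # target seen inside the block
puts router.current  # target after the block
```

Restoring the previous target in an `ensure` clause means nested calls and exceptions leave the router in the state it had before the block.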

By default, the gem sends writes to the master and reads to the slave databases. But if your app writes to the master during a request, you will probably want to read from the master for the rest of that request as well, in case of replication lag. You will also probably want to read from the master on the next request (after a write to the master) to cover redirects.

In a standard Rails application, you can achieve this by adding these example methods to your application controller (some of these may be folded into the gem, but they are left out for now):

class ApplicationController < ActionController::Base

  around_filter   :stick_to_master_for_updates
  around_filter   :use_master_for_redirect #goes with above
  after_filter    :switch_to_next_slave

  def switch_to_next_slave
    SlavePools.next_slave! if slaves?
  end

  def use_admin_slave_pool
    return yield unless slaves? # still run the action when no slaves are configured
    SlavePools.with_pool('admin') { yield }
  end

  def stick_to_master_for_updates
    if slaves? && (request.post? || request.put? || request.delete?)
      SlavePools.with_master { yield }
      session[:stick_to_master] = 1
    else
      yield
    end
  end

  def use_master_for_redirect
    if slaves? && session[:stick_to_master]
      session[:stick_to_master] = nil
      SlavePools.with_master { yield }
    else
      yield
    end
  end

  def use_master
    if slaves?
      SlavePools.with_master { yield }
      session[:stick_to_master] = 1
    else
      yield
    end
  end

  def slaves?
    SlavePools.active?
  end
end
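
To see how the two around filters interact across a redirect, the sketch below replays the same decision logic outside Rails. It assumes a plain Hash in place of the Rails session and a hypothetical `route_request` helper that returns where reads would go; it is not part of the gem:

```ruby
# Plain-Ruby replay of the filter logic above. `session` is a Hash and the
# returned symbol says where reads in that request would go.
def route_request(session, verb)
  if %w[POST PUT DELETE].include?(verb)
    session[:stick_to_master] = 1 # remember the write for the next request
    :master
  elsif session.delete(:stick_to_master)
    :master                       # first request after a write (e.g. a redirect)
  else
    :slave
  end
end

session = {}
flow = [
  route_request(session, 'POST'), # the write itself
  route_request(session, 'GET'),  # redirect target, still on the master
  route_request(session, 'GET')   # back to the slaves
]
p flow
```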

For other cases where you write to the master, wrap the request in a ‘use_master’ block:

class PostsController < ApplicationController
  around_filter :use_master, :only=>:index

  def index
    Activity.create()
    # index is a GET call, but we've decided to record something, so we want to wrap it in a use_master block
  end
end

Works with ActiveRecord 3.2.12 (not tested with Rails 2).

Install

Add to your Gemfile

gem 'slave_pools'

Setup

slave_pools identifies slave databases by looking for entries of the form “<environment>_pool_<pool_name>_name_<db_name>”.

In your database.yml, add sections for the slaves, e.g.:

development: # that would be the master
  adapter: mysql
  database: myapp_production
  username: root
  password:
  host: localhost

development_pool_default_name_slave1: # that would be a slave named 'slave1' in the 'default' pool
  adapter: mysql
  database: slave_db1
  username: root
  password:
  host: 10.0.0.2

development_pool_default_name_slave2: # that would be a slave named 'slave2' in the 'default' pool
  ...
development_pool_admin_name_slave1: # that would be a slave named 'slave1' in the 'admin' pool (db names can be reused across pools)
  ...
development_pool_admin_name_another_slave: # that would be a slave named 'another_slave' in the 'admin' pool

This also creates an abstract class for each DB, named in the form SlavePools::<PoolName><DbName> (e.g. SlavePools::DefaultDb1). If no slaves are specified, the SlavePools setup does not run, and the development DB is used as normal.
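
The naming convention can be illustrated with a small parser. This is a sketch only: `ENTRY_PATTERN` and `parse_slave_entry` are hypothetical names, and the class-name derivation is an assumption based on the SlavePools::DefaultDb1 example above. The gem's own parsing may differ.

```ruby
# Illustrative parser for the naming convention; the gem's own parsing may differ.
ENTRY_PATTERN = /\A(?<env>.+?)_pool_(?<pool>.+?)_name_(?<db>.+)\z/

def parse_slave_entry(key)
  match = ENTRY_PATTERN.match(key)
  return nil unless match

  # "default" + "db1" -> "DefaultDb1"; "another_slave" -> "AnotherSlave"
  camelize = ->(s) { s.split('_').map(&:capitalize).join }
  {
    env: match[:env],
    pool: match[:pool],
    db: match[:db],
    class_name: "SlavePools::#{camelize.call(match[:pool])}#{camelize.call(match[:db])}"
  }
end

p parse_slave_entry('development_pool_admin_name_another_slave')
p parse_slave_entry('development') # master entry, no match
```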

For development testing, I recommend creating a read-only MySQL user and pointing all of your slave DBs to your development DB using that read-only user.

The default slave pool is used for all requests, so you should name one of the pools ‘default’. If there is no ‘default’ pool, the first pool specified becomes the default.
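
The fallback rule can be sketched as follows. `default_pool_name` is a hypothetical helper, and pools are modelled as a Hash in declaration order (Ruby hashes preserve insertion order); this is not the gem's code:

```ruby
# Sketch of the default-pool rule: prefer a pool named 'default',
# otherwise fall back to the first pool that was declared.
def default_pool_name(pools)
  return nil if pools.empty?

  pools.key?('default') ? 'default' : pools.keys.first
end

p default_pool_name({ 'admin' => %w[slave1], 'default' => %w[slave1 slave2] })
p default_pool_name({ 'admin' => %w[slave1], 'reports' => %w[slave1] })
```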

To enable the proxy globally, add this to a file in config/initializers:

SlavePools.setup!

If you only want to enable it for specific environments, add this to the corresponding file in config/environments:

config.after_initialize do
  SlavePools.setup!
end

Using with Phusion Passenger

(this is a note from MultiDB gem and has not been verified)

With Passenger’s smart spawning method, child processes forked by the ApplicationSpawner won’t have the connection proxy set up properly.

To make it work, add this to your environment.rb or an initializer script (e.g. config/initializers/connection_proxy.rb):

if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # ... set configuration options, if any ...
      SlavePools::ConnectionProxy.setup!
    end
  end
else # not using passenger (e.g. development/testing)
  # ... set configuration options, if any ...
  SlavePools::ConnectionProxy.setup!
end

Using with ThinkingSphinx

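ThinkingSphinx looks for an adapter type and, with the connection proxy in place, may not detect it. One workaround is to return the adapter explicitly after calling setup!: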

SlavePools::ConnectionProxy.setup!

if ActiveRecord::Base.respond_to?('connection_proxy')
  ThinkingSphinx::AbstractAdapter.class_eval do
    def self.standard_adapter_for_model(model)
      :mysql
    end
  end
end

Forcing the master for certain actions

Just add this to your controller:

around_filter(:only => :foo_action) { |c,a| ActiveRecord::Base.connection_proxy.with_master { a.call } }

Forcing the master for certain models

In your environment.rb or an initializer, add this before the call to setup!:

SlavePoolsModule::ConnectionProxy.master_models = ['CGI::Session::ActiveRecordStore::Session', 'PaymentTransaction', ...]
SlavePoolsModule::ConnectionProxy.setup!

NOTE: You cannot safely add more master_models after calling setup!.

Features

Minimalist implementation: doesn’t include sharding, doesn’t create a new adapter (so if you don’t specify slaves for an environment, the connection is not overwritten and the DB works as normal), and doesn’t blacklist/remove slaves.
It sends everything except “select …” queries to the master, instead of sending only specific things to the master and anything “else” to the slave. This avoids accidental writes to a slave when there are API changes in ActiveRecord that the gem hasn’t picked up yet. Note that this behavior also always sends helper methods like “quote” or “add_limit!” to the master connection object. That adds no load on the master, as these methods don’t communicate with the DB server itself.
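
The whitelist approach can be sketched with `method_missing`. `SketchProxy`, `FakeConnection`, and the contents of `SAFE_METHODS` are assumptions for illustration; the gem's actual whitelist and proxy class may differ:

```ruby
# Illustrative proxy, not the gem's code: only whitelisted read methods go to
# the slave; anything unknown falls through to the master.
class SketchProxy
  SAFE_METHODS = %i[select_all select_one select_value select_values select_rows].freeze

  def initialize(master, slave)
    @master = master
    @slave = slave
  end

  def method_missing(name, *args, &block)
    target = SAFE_METHODS.include?(name) ? @slave : @master
    target.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @master.respond_to?(name, include_private) || super
  end
end

# Fake connection that reports which server handled each call.
FakeConnection = Struct.new(:label) do
  def select_all(sql)
    "#{label}: #{sql}"
  end

  def execute(sql)
    "#{label}: #{sql}"
  end

  def quote(value)
    "#{label}: '#{value}'"
  end
end

proxy = SketchProxy.new(FakeConnection.new('master'), FakeConnection.new('slave'))
puts proxy.select_all('SELECT 1')      # whitelisted read
puts proxy.execute('DELETE FROM logs') # not whitelisted
puts proxy.quote('x')                  # helper method, also not whitelisted
```

Because the default branch is the master, a method that is new or renamed in ActiveRecord ends up on the master rather than on a read-only slave.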

Differences to “multi_db”:

Supports multiple separate pools of slave databases
Query caching is fixed
Tries a slave once and immediately reverts to the master afterwards (does not cycle through slaves)
Stays with the same slave DB until explicitly told to change. In practice, it didn’t make sense to us to cycle through slaves within the same web request, so the ‘sticky slave’ feature is now permanent
Removed weighted slave rotation for now (didn’t need it)
Currently not using threaded variables (left commented out in the code for now, may revisit)
Added the with_pool method
Does not blacklist slaves for timing out (we want other, more robust monitoring software to take care of this)
Better default case handling: if no slave DBs are specified, the regular environment database is used and the gem is not initialized
Added a wrapper class for shorter calls
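
The sticky-slave and fall-back-to-master behaviors listed above can be sketched as follows. `StickyPool` is a hypothetical class that uses lambdas in place of real connections; it illustrates the idea rather than the gem's implementation:

```ruby
# Illustrative only: a pool that stays on one slave until next_slave! is called,
# and retries a failed read on the master instead of trying other slaves.
class StickyPool
  attr_reader :current_index

  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @current_index = 0
  end

  def current
    @slaves[@current_index]
  end

  # Explicit rotation, e.g. once per web request.
  def next_slave!
    @current_index = (@current_index + 1) % @slaves.size
    current
  end

  # Try the current slave once; on error, fall back to the master.
  def read(query)
    current.call(query)
  rescue StandardError
    @master.call(query)
  end
end

master  = ->(q) { "master: #{q}" }
healthy = ->(q) { "slave1: #{q}" }
broken  = ->(_q) { raise IOError, 'connection lost' }

pool = StickyPool.new(master, [healthy, broken])
puts pool.read('SELECT 1')
puts pool.read('SELECT 1') # same slave, no automatic rotation
pool.next_slave!
puts pool.read('SELECT 1') # slave fails, master answers
```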

Running specs

If you haven’t already, install the rspec gem, then set up your database with a test database and a read_only user.

To match spec/config/database.yml, you can:

mysql>
  create database test_db;
  create user 'read_only'@'localhost' identified by 'readme';
  grant select on test_db.* to 'read_only'@'localhost';

From the plugin directory, run:

rspec spec