Spawn

This plugin provides a 'spawn' method to easily fork OR thread long-running sections of code so that your application can return results to your users more quickly. This plugin works by creating new database connections in ActiveRecord::Base for the spawned block.

The plugin also patches ActiveRecord::Base to handle some known bugs when using threads (see lib/patches.rb).

Installation

Use

gem "spawn", :git => 'git://github.com/rfc2822/spawn'

in your Gemfile and use bundler to manage it (bundle install, bundle update).

Make sure that ActiveRecord reconnects to your database automatically when needed. For instance, put

production/development:
  ...
  reconnect: true

into your config/database.yml.

Usage

Here's a simple example that demonstrates the spawn plugin. In one of your controllers, insert this code (after installing the plugin, of course):

spawn_block do
  logger.info("I feel sleepy...")
  sleep 11
  logger.info("Time to wake up!")
end

If everything is working correctly, your controller should finish quickly, and you'll see the last log message several seconds later.

If you need to wait for the spawned processes/threads, then pass the objects returned by spawn to Spawn::wait(), like this:

spawn_ids = []
N.times do |i|
  # spawn N blocks of code
  spawn_ids[i] = spawn_block do
    something(i)
  end
end
# wait for all N blocks of code to finish running
wait(spawn_ids)
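
To illustrate what "spawn, then wait" means at the operating-system level, here is a minimal sketch using only Ruby's built-in fork and Process.wait2. It is not the plugin's implementation: fork_all and wait_all are hypothetical helper names introduced for this example, and no ActiveRecord connection handling is shown.

```ruby
# Hypothetical helper: fork one child per index and return the child PIDs.
def fork_all(count)
  Array.new(count) do |i|
    fork do
      # Child process: exit with a status equal to its index so the
      # parent can tell the children apart.
      exit!(i)
    end
  end
end

# Hypothetical helper: block until every child has exited and
# return the exit codes in the same order as the PIDs.
def wait_all(pids)
  pids.map do |pid|
    _, status = Process.wait2(pid)
    status.exitstatus
  end
end

pids = fork_all(3)
puts wait_all(pids).inspect
```

The parent only continues once every child has been reaped, which is the same guarantee the wait call above is meant to give you.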

Options

The options you can pass to spawn_block are:

Option   Values
:method  :fork, :thread, :yield
:nice    integer value 0-19, 19 = really nice
:kill    boolean value indicating whether the parent should kill the spawned process when it exits (only valid when :method => :fork)
:argv    string to override the process name

Any option to spawn_block can be set as a default so that you don't have to pass them in to every call of spawn_block. To configure the spawn default options, add a line to your configuration file(s) like this:

Spawn::default_options :method => :thread

If you don't set any default options, the :method will default to :fork. To specify different values for different environments, add the default_options call to the appropriate environment file (development.rb, test.rb). For testing you can set the default :method to :yield so that the code is run inline.

# in environment.rb
Spawn::default_options :method => :fork, :nice => 7
# in test.rb, will override the environment.rb setting
Spawn::default_options :method => :yield

This allows you to set your production and development environments to use different methods according to your needs.
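
The interplay between configured defaults and per-call options can be sketched as a pair of hash merges. The OptionDefaults module below is a hypothetical stand-in written for this example, not the plugin's code. In particular, it assumes that a later default_options call is merged over earlier defaults rather than replacing them.

```ruby
# Hypothetical stand-in for the plugin's option handling.
module OptionDefaults
  # With no configuration, :method falls back to :fork.
  @defaults = { :method => :fork }

  # Merge new defaults over the existing ones (assumption: merge, not replace).
  def self.default_options(options = {})
    @defaults = @defaults.merge(options)
  end

  # Options passed on an individual call win over the configured defaults.
  def self.resolve(call_options = {})
    @defaults.merge(call_options)
  end
end

OptionDefaults.default_options(:method => :fork, :nice => 7) # as in environment.rb
OptionDefaults.default_options(:method => :yield)            # as in test.rb

p OptionDefaults.resolve
p OptionDefaults.resolve(:method => :thread)
```

Under these assumptions, the test environment runs blocks inline while an explicit :method on a single call still takes precedence.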

be nice

If you want your forked child to run at a lower priority than the parent process, pass in the :nice option like this:

spawn_block(:nice => 7) do
  do_something_nicely
end
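
For readers curious what lowering a child's priority looks like in plain Ruby, the sketch below forks a child, sets its niceness with Process.setpriority, and reports the value back through a pipe. The helper name niceness_in_child is hypothetical, and this is an illustration of the mechanism rather than the plugin's own code.

```ruby
# Hypothetical helper: fork a child, apply the given nice value to it,
# and return the niceness the child observes for itself.
def niceness_in_child(nice)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # Only the child is affected; the parent keeps its own priority.
    Process.setpriority(Process::PRIO_PROCESS, 0, nice)
    writer.puts Process.getpriority(Process::PRIO_PROCESS, 0)
    writer.close
    exit!(0)
  end
  writer.close
  observed = reader.read.to_i
  reader.close
  Process.wait(pid)
  observed
end

puts niceness_in_child(19)
```

Raising the nice value (lowering priority) is normally allowed for any user, while lowering it again usually requires elevated privileges.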

fork me

By default, spawn will use fork to spawn child processes. You can configure it to use threading instead, either by telling the spawn method when you call it or by configuring your environment. For example, this is how you can tell spawn to use threading on the call:

spawn_block(:method => :thread) do
  something
end

When you use threaded spawning, make sure that your application is thread-safe. Rails can be switched to thread-safe mode with

# Enable threaded mode
config.threadsafe!

in environments/your_environment.rb
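
As a reminder of what thread safety means in practice, the sketch below guards a shared counter with a Mutex so that concurrent threads cannot interleave their updates. It uses only Ruby's standard Thread and Mutex classes. The helper name count_with_threads is hypothetical and the example is independent of the plugin.

```ruby
# Hypothetical helper: increment a shared counter from several threads,
# using a Mutex so each read-modify-write happens atomically.
def count_with_threads(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        lock.synchronize { counter += 1 }
      end
    end
  end
  threads.each(&:join) # wait for every thread to finish
  counter
end

puts count_with_threads(4, 1000)
```

Any state shared between a threaded block and the rest of your application needs the same kind of protection.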

kill or be killed

Depending on your application, you may want the child processes to go away when the parent process exits. By default, spawn lets the children live after the parent dies, but you can tell it to kill the children by setting the :kill option to true.
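
One way to get this behaviour in plain Ruby is to send the child a TERM signal from an at_exit hook in the parent. The sketch below shows that mechanism under stated assumptions: the helper names spawn_sleeper, kill_and_reap, and kill_on_exit are hypothetical, and the plugin may implement :kill differently.

```ruby
# Hypothetical helper: fork a long-running child and return its PID
# once the child has signalled that it is up and running.
def spawn_sleeper
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts "ready"
    writer.close
    sleep 60
  end
  writer.close
  reader.gets # block until the child reports in
  reader.close
  pid
end

# Hypothetical helper: terminate the child and collect its exit status.
def kill_and_reap(pid)
  Process.kill("TERM", pid)
  _, status = Process.wait2(pid)
  status
end

# Hypothetical helper: arrange for the child to be killed when the parent exits.
def kill_on_exit(pid)
  at_exit do
    begin
      Process.kill("TERM", pid)
    rescue Errno::ESRCH
      # The child has already gone away; nothing to do.
    end
  end
end

pid = spawn_sleeper
status = kill_and_reap(pid)
puts "child #{pid} terminated: #{status.inspect}"
```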

a process by any other name

If you'd like to be able to identify which processes are spawned by looking at the output of ps, then set the :argv option to a string of your choice. You should then be able to see this string as the process name when listing the running processes (ps).

For example, if you do something like this,

3.times do |i|
  spawn_block(:argv => "spawn -#{i}-") do
    something(i)
  end
end

then in the shell,

$ ps -ef | grep spawn
502  2645  2642   0   0:00.01 ttys002    0:00.02 spawn -0-
502  2646  2642   0   0:00.02 ttys002    0:00.02 spawn -1-
502  2647  2642   0   0:00.02 ttys002    0:00.03 spawn -2-

The length of the process name may be limited by your OS so you might want to experiment to see how long it can be (it may be limited by the length of the original process name).
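
In plain Ruby, a process can rename itself by assigning to $0. The sketch below does this inside a forked child and sends the resulting name back to the parent. It is an illustration of the mechanism, not a statement about how the plugin applies :argv, and child_process_name is a hypothetical helper.

```ruby
# Hypothetical helper: fork a child, rename it, and return the name
# the child reports for itself.
def child_process_name(name)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    $0 = name # changes the name shown by tools such as ps
    writer.write($0)
    writer.close
    exit!(0)
  end
  writer.close
  reported = reader.read
  reader.close
  Process.wait(pid)
  reported
end

puts child_process_name("spawn -0-")
```

What ps actually displays can still be truncated by the operating system, as noted above.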

Forking vs. Threading

There are several tradeoffs for using threading vs. forking. Forking was chosen as the default primarily because it requires no configuration to get it working out of the box.

Forking advantages:

  • more reliable? - the ActiveRecord code is generally not deemed to be thread-safe. Even though spawn attempts to patch known problems with the threaded implementation, there are no guarantees. Forking is heavier but should be fairly reliable.
  • keep running - this could also be a disadvantage, but you may find you want to fork off a process that could have a life longer than its parent. For example, maybe you want to restart your server without killing the spawned processes. We don't necessarily condone this (i.e. haven't tried it) but it's technically possible.
  • easier - forking works out of the box with spawn, threading requires you set allow_concurrency=true (for older versions of Rails). Also, beware of automatic reloading of classes in development mode (config.cache_classes = false).

Threading advantages:

  • less filling - threads take less resources... how much less? it depends. Some flavors of Unix are pretty efficient at forking so the threading advantage may not be as big as you think... but then again, maybe it's more than you think. ;-)
  • debugging - you can set breakpoints in your threads
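
The most visible practical difference between the two approaches is memory sharing, which the following sketch demonstrates with plain Ruby. A thread mutates the same objects as its creator, whereas a forked child works on its own copy. The helper names mutate_in_thread and mutate_in_fork are hypothetical.

```ruby
# Hypothetical helper: a thread shares memory with its creator,
# so the appended element is visible after the thread finishes.
def mutate_in_thread
  data = []
  Thread.new { data << :from_thread }.join
  data
end

# Hypothetical helper: a forked child gets a copy of the parent's memory,
# so its changes never reach the parent.
def mutate_in_fork
  data = []
  pid = fork do
    data << :from_child
    exit!(0)
  end
  Process.wait(pid)
  data
end

p mutate_in_thread
p mutate_in_fork
```

This isolation is part of why forking tends to be the safer default, and also why results from a forked block have to travel back through the database, a pipe, or a file rather than through shared variables.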

Acknowledgements

This plugin was initially inspired by Scott Persinger's blog post on how to use fork in Rails for background processing. http://geekblog.vodpod.com/?p=26

Further inspiration for the threading implementation came from Jonathon Rochkind's blog post on threading in Rails. http://bibwild.wordpress.com/2007/08/28/threading-in-rails/

Also thanks to all who have helped debug problems and suggest improvements including:

  • Ahmed Adam, Tristan Schneiter, Scott Haug, Andrew Garfield, Eugene Otto, Dan Sharp, Olivier Ruffin, Adrian Duyzer, Cyrille Labesse

  • Garry Tan, Matt Jankowski (Rails 2.2.x fixes), Mina Naguib (Rails 2.3.6 fix)

  • Tim Kadom, Mauricio Marcon Zaffari, Danial Pearce, Hongli Lai, Scott Wadden (passenger fixes)

  • <your name here>

Copyright (c) 2007-present Tom Anderson ([email protected]), see LICENSE