Qu::Scheduler

Description

qu-scheduler is an extension to Qu that adds support for queueing jobs in the future.

Currently qu-scheduler only works with qu-redis and requires Redis 2.0 or newer.

Job scheduling is supported in two different ways: Recurring (scheduled) and Delayed.

Scheduled jobs are like cron jobs, recurring on a regular basis. Delayed jobs are Qu jobs that you want to run at some point in the future. The syntax is pretty self-explanatory:

Qu.enqueue_in(5.days, SendFollowUpEmail) # run a job in 5 days
# or
Qu.enqueue_at(5.days.from_now, SomeJob) # run SomeJob at a specific time

Installation

# Rails 3.x: add it to your Gemfile
gem 'qu-scheduler'

There is just a single thing qu-scheduler needs to know about in order to do its job: the schedule. The easiest way to configure it is via the rake task. By default, qu-scheduler depends on the "qu:setup" rake task. Since you probably already have this task, let's just put our configuration there. qu-scheduler pretty much inherits everything else from Qu.

# Qu tasks
require 'qu/tasks'
require 'qu-scheduler/tasks'

namespace :qu do
  task :setup do
    require 'qu'
    require 'qu-scheduler'

    # If you want to be able to dynamically change the schedule,
    # uncomment this line.  A dynamic schedule can be updated via the
    # Qu::Scheduler.set_schedule (and remove_schedule) methods.
    # When dynamic is set to true, the scheduler process looks for
    # schedule changes and applies them on the fly.
    # Note: This feature is still under development
    # Qu::Scheduler.dynamic = true

    # The schedule doesn't need to be stored in a YAML file, it just needs to
    # be a hash.  YAML is usually the easiest.
    Qu.schedule = YAML.load_file(Rails.root.join('config', 'your_resque_schedule.yml'))

    # If you don't depend on your application environment you need to
    # require your jobs here. Qu determines the queue exclusively from the
    # class, so we need to have access to them.
    require 'jobs'
  end
end

The scheduler process is just a rake task which is responsible for both queueing jobs from the schedule and polling the delayed queue for jobs ready to be pushed on to the work queues. For obvious reasons, this process never exits.

$ bundle exec rake qu:scheduler

NOTE: You DO NOT want to run more than one instance of the scheduler. Doing so will result in the same job being queued multiple times. You only need one instance of the scheduler running per application, regardless of the number of servers.

If the scheduler process goes down for whatever reason, the delayed items that should have fired during the outage will fire once the scheduler process is started back up again (even if it is on a new machine). Missed scheduled jobs, however, will not fire upon recovery of the scheduler process.

Delayed jobs

Delayed jobs are one-off jobs that you want to be put into a queue at some point in the future. The classic example is sending email:

Qu.enqueue_in(5.days, SendFollowUpEmail, current_user.id)

This will store the job in the Qu delayed queue for 5 days, at which time the scheduler process will pull it from the delayed queue and put it in the appropriate work queue for the given job. It will then be processed as soon as a worker is available (just like any other Qu job).

NOTE: The job does not fire exactly at the time supplied. Rather, once that time is in the past, the job moves from the delayed queue to the actual work queue and will be completed as workers are free to process it.

Also supported is Qu.enqueue_at which takes a timestamp to queue the job.

The delayed queue is stored in Redis and is persisted in the same way the standard Qu jobs are persisted (Redis writing to disk). Delayed jobs differ from scheduled jobs in that if your scheduler process is down or workers are down when a particular job is supposed to be processed, they will simply "catch up" once they are started again. Jobs are guaranteed to run (provided they make it into the delayed queue) after their given queue_at time has passed.

Your jobs can specify one or more before_schedule and after_schedule hooks, to be run before or after scheduling. If any of your before_schedule hooks returns false, the job will not be scheduled and your after_schedule hooks will not be run.
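The hook contract described above can be sketched in plain Ruby. This is only an illustration of the semantics, not qu-scheduler's implementation: the module name ScheduleHooksSketch, the method schedule_with_hooks, and the use of lambdas as hooks are all assumptions made for the example. The sketch runs every before hook and then checks for a false result; whether qu-scheduler itself short-circuits on the first false is not stated here.

```ruby
# Hypothetical illustration of the before/after schedule hook contract.
# None of these names come from qu-scheduler itself.
module ScheduleHooksSketch
  # Runs the before hooks, then the scheduling block, then the after hooks.
  # Returns true if the job was scheduled, false if a before hook vetoed it.
  def self.schedule_with_hooks(before_hooks, after_hooks, *args)
    # Every before hook runs; a single `false` result vetoes scheduling.
    results = before_hooks.map { |hook| hook.call(*args) }
    return false if results.any? { |result| result == false }

    yield(*args)                                  # the actual scheduling step
    after_hooks.each { |hook| hook.call(*args) }  # skipped when vetoed
    true
  end
end

log = []
skip_user_13 = ->(user_id) { user_id != 13 }            # before hook
record       = ->(user_id) { log << "after #{user_id}" } # after hook

ScheduleHooksSketch.schedule_with_hooks([skip_user_13], [record], 42) do |user_id|
  log << "enqueue #{user_id}"
end
ScheduleHooksSketch.schedule_with_hooks([skip_user_13], [record], 13) do |user_id|
  log << "enqueue #{user_id}"
end
# Only the call for user 42 reaches the scheduling step and the after hook.
```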

One other thing to note is that insertion into the delayed queue is O(log(n)) since the jobs are stored in a Redis sorted set (zset). I can't imagine this being an issue for anyone since Redis is stupidly fast even at log(n), but full disclosure is always best.
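To make the "catch up" behaviour concrete, here is a minimal in-memory sketch of a delayed queue ordered by timestamp. It is not qu-scheduler's code: the class DelayedQueueSketch and its methods are hypothetical, and a plain sorted Array stands in for the Redis sorted set (so insertion here is not O(log(n)); the sketch only illustrates ordering and release of due jobs).

```ruby
# In-memory stand-in for the delayed queue; NOT qu-scheduler's real code.
# A real deployment keeps these entries in a Redis sorted set.
class DelayedQueueSketch
  Entry = Struct.new(:run_at, :klass, :args)

  def initialize
    @entries = [] # always kept sorted by run_at, oldest first
  end

  # Store a job to be released once `run_at` has passed.
  def enqueue_at(run_at, klass, *args)
    entry = Entry.new(run_at, klass, args)
    # Binary search for the first entry that runs later than the new one.
    index = @entries.bsearch_index { |e| e.run_at > run_at } || @entries.size
    @entries.insert(index, entry)
    entry
  end

  # Remove and return every job whose time is in the past, oldest first.
  # Jobs that came due while nothing was polling are returned as well.
  def pop_due(now = Time.now)
    due = @entries.take_while { |e| e.run_at <= now }
    @entries.shift(due.size)
    due
  end

  def size
    @entries.size
  end
end

queue = DelayedQueueSketch.new
now   = Time.at(1_000)
queue.enqueue_at(now + 60, :SendFollowUpEmail, 42) # not due yet
queue.enqueue_at(now - 60, :SomeJob)               # came due during an "outage"
due = queue.pop_due(now)                           # releases only :SomeJob
```

A scheduler loop would call something like pop_due on every poll and push each returned entry onto its work queue.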

Removing Delayed jobs

If you have the need to cancel a delayed job, you can do it like this:

# after you've enqueued a job like:
Qu.enqueue_at(5.days.from_now, SendFollowUpEmail, current_user.id)
# remove the job with exactly the same parameters:
Qu.remove_delayed(SendFollowUpEmail, current_user.id)
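The "exactly the same parameters" requirement can be illustrated with a small stand-alone sketch. The struct DelayedJobSketch and the method remove_delayed_sketch are hypothetical and operate on a plain Array rather than Redis; the return value (number of jobs removed) is an assumption of this example, not a documented property of Qu.remove_delayed.

```ruby
# Hypothetical in-memory model of delayed jobs; a real delayed queue lives in Redis.
DelayedJobSketch = Struct.new(:run_at, :klass, :args)

# Removes every job whose class and argument list match exactly.
# Returns how many jobs were removed.
def remove_delayed_sketch(jobs, klass, *args)
  before = jobs.size
  jobs.reject! { |job| job.klass == klass && job.args == args }
  before - jobs.size
end

jobs = [
  DelayedJobSketch.new(Time.at(0), :SendFollowUpEmail, [42]),
  DelayedJobSketch.new(Time.at(0), :SendFollowUpEmail, [7])
]

remove_delayed_sketch(jobs, :SendFollowUpEmail, "42") # no match: "42" is not 42
remove_delayed_sketch(jobs, :SendFollowUpEmail, 42)   # removes the first job
```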

Scheduled Jobs (Recurring Jobs)

Scheduled (or recurring) jobs are logically no different than a standard cron job. They are jobs that run based on a fixed schedule which is set at startup.

The schedule is a list of job classes with arguments and a schedule frequency (in crontab syntax). The schedule is just a hash, but is most likely stored in a YAML file like this:

queue_documents_for_indexing:
  cron: "0 0 * * *"
  # you can use rufus-scheduler "every" syntax in place of cron if you prefer
  # every: 1hr
  klass: QueueDocuments
  args:
  description: "This job queues all content for indexing in solr"

clear_leaderboards_contributors:
  cron: "30 6 * * 1"
  klass: ClearLeaderboards
  args: contributors
  description: "This job resets the weekly leaderboard for contributions"
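Once parsed, the YAML above is an ordinary Ruby hash keyed by job name. The following sketch uses only Ruby's standard YAML library to show the resulting structure; the constant SCHEDULE_YAML_SKETCH and the helper load_schedule_sketch are names made up for this example, and how qu-scheduler interprets each key is not shown.

```ruby
require 'yaml'

# The example schedule from above, inlined so the sketch is self-contained.
SCHEDULE_YAML_SKETCH = <<~SCHEDULE
  queue_documents_for_indexing:
    cron: "0 0 * * *"
    klass: QueueDocuments
    args:
    description: "This job queues all content for indexing in solr"

  clear_leaderboards_contributors:
    cron: "30 6 * * 1"
    klass: ClearLeaderboards
    args: contributors
    description: "This job resets the weekly leaderboard for contributions"
SCHEDULE

# Hypothetical helper: parse schedule text into a plain hash with string keys.
def load_schedule_sketch(yaml_text)
  YAML.safe_load(yaml_text)
end

schedule = load_schedule_sketch(SCHEDULE_YAML_SKETCH)

schedule.each do |name, config|
  # An empty `args:` entry parses as nil.
  puts "#{name}: #{config['klass']} (cron #{config['cron']}, args #{config['args'].inspect})"
end
```

Because the result is just a hash, the same structure could be built directly in Ruby instead of being loaded from a file.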

NOTE: Six-parameter crons are also supported (as they are supported by rufus-scheduler, which powers the qu-scheduler process). This allows you to schedule jobs with per-second precision (e.g. "30 * * * * *" would fire a job at 30 seconds past every minute).

A big shout out to rufus-scheduler for handling the heavy lifting of the actual scheduling engine.

Running in the background

(Only supported with Ruby >= 1.9). There are scenarios where it's helpful for the scheduler process to run itself in the background (usually in combination with PIDFILE). Use the BACKGROUND option so that rake will return as soon as the scheduler is started.

$ PIDFILE=./qu-scheduler.pid BACKGROUND=yes bundle exec rake qu:scheduler

Note on Patches / Pull Requests

Fork the project.
Make your feature addition or bug fix.
Add tests for it. This is important so I don't break it in a future version unintentionally.
Commit, do not mess with rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself so I can ignore it when I pull.)
Send me a pull request. Bonus points for topic branches.

Credits

This work is a port of resque-scheduler by Ben VandenBos.
Modified to work with the Qu queueing library by Morton Jonuschat.