periodically

A Redis-backed Ruby library for tasks that need to run once in a while. "Once in a while" is intentionally vague: the library user defines it precisely with a custom lambda.

Since tasks are executed in a single thread and triggered by imprecise, non-time-based conditions, periodically is best suited to infrequent and noncritical jobs, such as weekly syncs.

Example use cases:

  • Sync a Rails model's data once per week from an external source. The external source can be unstable, so you'd like to retry failing jobs, but some marginally outdated data won't be a problem.
    • Great fit for periodically! This can be achieved, for example, with a last_synced value in the database and a condition based on it
    • Additionally, periodically supports deferring jobs (meaning one-line rate limiting!)
  • Apply asynchronous corrections to data added to the database. For example, automatically fetch the title for user-submitted links.
    • Achievable by adding a condition against the correctable column (where(title: nil))
    • However, a background job processor like Sidekiq would likely be more efficient

Getting started with Rails

Add the gem to your Gemfile and run bundle install

gem 'periodically'

Add an initializer (e.g. config/initializers/periodically.rb)

require "periodically"

# Launches a background thread. For production usage you may want to do this in another process
Periodically.start

In Rails, Periodically jobs are only registered when the class is loaded. In production mode Rails (by default) eagerly loads all classes, so everything works out of the box. In development mode, however, classes are loaded lazily by default, so you might want to enable eager loading with config.eager_load = true

Use Periodically in e.g. a model

# app/models/item.rb

class Item < ApplicationRecord
  include Periodically::Model

  periodically :refresh_price,
    on: -> { Item.where("last_synced < ?", 7.days.ago) }

  private

  def refresh_price
    self.price = PriceFetcher.fetch(item_id)
    self.last_synced = Time.now # Remember to update the column used by the condition yourself!
    save!
  end
end

Execution model

Periodically launches a single background thread, which executes registered queries every x seconds. If a pending query is found, the registered callback method is called in the same thread. Hence, a blocking callback method will also block execution of other pending queries.
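
The loop described above can be pictured with a minimal plain-Ruby sketch. It is not Periodically's actual implementation: MiniScheduler, register, tick and start are hypothetical names, and plain hashes stand in for model instances. It only illustrates that each registered condition is evaluated in turn and that every callback runs in the polling thread itself.

```ruby
# Hypothetical illustration only; not part of the periodically gem.
class MiniScheduler
  Job = Struct.new(:name, :condition, :callback)

  def initialize
    @jobs = []
  end

  # condition: a lambda returning the records that are currently pending
  # callback:  a lambda called once for each pending record
  def register(name, condition:, callback:)
    @jobs << Job.new(name, condition, callback)
  end

  # One polling pass: evaluate each condition and run callbacks one by one.
  # A slow callback delays everything that comes after it in the same pass.
  def tick
    executed = []
    @jobs.each do |job|
      job.condition.call.each do |record|
        job.callback.call(record)
        executed << [job.name, record[:id]]
      end
    end
    executed
  end

  # Run polling passes forever in a single background thread.
  def start(interval_seconds)
    Thread.new do
      loop do
        tick
        sleep interval_seconds
      end
    end
  end
end

items = [
  { id: 1, synced: false },
  { id: 2, synced: true },
  { id: 3, synced: false }
]

scheduler = MiniScheduler.new
scheduler.register(
  :refresh,
  condition: -> { items.reject { |item| item[:synced] } },
  callback: ->(item) { item[:synced] = true }
)

first_pass = scheduler.tick   # runs :refresh for items 1 and 3
second_pass = scheduler.tick  # nothing is pending any more
```

Because the callback updates the value the condition checks, the second pass finds nothing to do. A callback that never updates it would be picked up again on every pass.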

By default everything happens in the same process as the main Rails web server. To parallelize processing a bit, you can run bundle exec periodically to start a separate process for jobs. (Remember to remove Periodically.start from the initializer!)

API

Terminology

  • Job: something enqueued to be called using the periodically method
  • Instance job: a single Job execution concerning a specific instance of the class

Inside a Model

Definitions

# Add Periodically context to this class
include Periodically::Model

# Enqueue a Periodically job
periodically :update_method, # call instance method "update_method" for found instances
  on: -> { Item.where("last_synced < ?", 7.days.ago) }, # (Optional) Iterator/Array of instances to update. Empty array to skip update
  min_class_interval: 5.minutes, # (Optional) The minimum interval between calls to this specific class (TODO not implemented)
  max_retries: 25, # (Optional) Maximum number of retries. Periodically uses exponential backoff (TODO not implemented)
  instance_id: -> { cache_key_with_version } # (Optional) Returns this instance's unique identifying key. Used for e.g. deferring jobs and marking them as erroring (TODO not implemented)

Update method return values

The job method's return value or raised exception determines further executions of that specific instance job.

# As referred to by a previous `periodically` call
def update_method
  # Let's retrieve a normal value from the model instance
  status = my_column_status

  # No-op
  #   Since we don't update `last_synced`, this method will get called again without much delay!
  return if status == "pending"

  # Log error and defer execution
  #   This unique instance will be deferred for later execution (using exponential backoff) and the error is logged
  raise "something went wrong" if status == "error"

  # Defer any further calls to :update_method (on any instance)
  #   This is perfect for short-term rate limiting, but highly discouraged as a method of timing updates!
  #   You should use database columns (e.g. last_synced, as this method does) instead, which puts less load on Redis
  return Periodically::Defer.job_by(60.minutes) if status == "rate_limited"

  # Update checked delay
  #   Updates the property we check against, thus making this instance not pass the Periodically condition
  #   Note that this line is normal Rails code: Periodically conditions are database/anything-agnostic
  update(last_synced: Time.now)
end

The job method's return value can be used to defer further execution at one of three scopes: this model instance, this job on any instance of the model, or any job on the model:

  • Periodically::Defer.instance_by(60.minutes) # calling :update_method on this instance will be delayed
  • Periodically::Defer.job_by(60.minutes) # calling :update_method on any instance of this class will be delayed
  • Periodically::Defer.class_by(60.minutes) # calling anything on any instance of this class will be delayed
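
To make the three scopes concrete, here is a small in-memory sketch of how deferrals could be looked up before an instance job runs. It reflects an assumption about the semantics listed above, not the gem's code: DeferBook and its methods are hypothetical, a Hash stands in for Redis, and the clock is passed in explicitly so the example is deterministic.

```ruby
# Hypothetical illustration only; not part of the periodically gem.
class DeferBook
  def initialize
    @until = {} # scope key => time before which execution is skipped
  end

  def defer_instance(klass, job, instance_id, seconds, now:)
    @until[[klass, job, instance_id]] = now + seconds
  end

  def defer_job(klass, job, seconds, now:)
    @until[[klass, job]] = now + seconds
  end

  def defer_class(klass, seconds, now:)
    @until[[klass]] = now + seconds
  end

  # An instance job is deferred if any enclosing scope is still deferred.
  def deferred?(klass, job, instance_id, now:)
    keys = [[klass], [klass, job], [klass, job, instance_id]]
    keys.any? { |key| @until.key?(key) && @until[key] > now }
  end
end

book = DeferBook.new
now = Time.at(0)

book.defer_instance("Item", :update_method, 1, 3600, now: now)
instance_1 = book.deferred?("Item", :update_method, 1, now: now)    # true
instance_2 = book.deferred?("Item", :update_method, 2, now: now)    # false

book.defer_job("Item", :update_method, 3600, now: now)
same_job = book.deferred?("Item", :update_method, 2, now: now)      # true
other_job = book.deferred?("Item", :other_method, 2, now: now)      # false

book.defer_class("Item", 3600, now: now)
any_job = book.deferred?("Item", :other_method, 2, now: now)        # true
expired = book.deferred?("Item", :other_method, 2, now: now + 3601) # false
```

Each wider scope covers the narrower ones, which is why a class-level deferral also pauses jobs that were never deferred themselves.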

Testing

What do you want to test?

Behavior of my update callbacks

Just call them yourself in your tests.
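
As a rough sketch of that approach: the job method in the model example is private, so a test has to invoke it with send. FakeItem below is a hypothetical plain-Ruby stand-in for the Item model, with the price fetcher injected as a lambda. A real test would use the actual model and your usual test framework; plain raise statements are used here only to keep the example dependency-free.

```ruby
# Hypothetical stand-in for the Item model; not a real ActiveRecord class.
class FakeItem
  attr_reader :price, :last_synced

  def initialize(fetcher)
    @fetcher = fetcher
    @price = nil
    @last_synced = nil
  end

  private

  # Same shape as the job method in the model example
  def refresh_price
    @price = @fetcher.call
    @last_synced = Time.now
  end
end

# Stub the external source so the test does not touch the network
item = FakeItem.new(-> { 42 })

# The job method is private, so call it with `send`
item.send(:refresh_price)

raise "price was not refreshed" unless item.price == 42
raise "last_synced was not updated" if item.last_synced.nil?
```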

The periodically :on condition

You can call Periodically.would_execute?(MyModel, :object) to statically check the condition. (TODO not implemented)

Debugging

As part of your application you might want to be able to get a quick snapshot of how periodically is doing. You can do that by calling Periodically::Debug.total_debug_dump, which returns a hash containing a bunch of debug information.

Dashboard

The dashboard contains recently succeeded executions, failed executions (with stack traces) and deferred executions. (TODO not implemented)

Why not Sidekiq?

With Sidekiq you can achieve something similar by using a scheduled job that enqueues further unique jobs based on the condition. (see https://github.com/mperham/sidekiq/wiki/Ent-Periodic-Jobs#dynamic-jobs)

However, periodically has a few advantages for the specific use case of per-instance non-critical jobs:

  • Improved backpressure handling. Since we know the conditions for executing jobs, we are in better control of the job producer and able to balance between different jobs. This enables, for instance, early warnings for developers in case of job buildup. (TODO not implemented)
  • Closer to the source. Periodic callbacks are defined inside the models, so it is always easy to find which jobs are affecting which models.
  • Cleaner per-instance retrying. If we start executing a job, but suddenly want to defer execution by some time in Sidekiq, it is definitely doable with scheduled jobs. However, this may entrap you in a "scheduled unique job" hell: if some job keeps getting mistakenly deferred, it might be hard to find out about the repeated erroneous behavior without some complex job tracking logic. In contrast, periodically delivers this functionality for free due to more explicit control over job scheduling and rescheduling. (TODO not implemented)
  • More flexible delaying. In periodically, you can delay job executions per-instance, per-job or per-class.
  • More clever polling. Since we know the exact condition for new periodic jobs, we can deduce the next execution time and sleep accordingly. (TODO not implemented)
  • Easier priority escalation. Periodically selects jobs in order of the given condition and maintains no queue of its own; therefore prioritizing certain jobs by adding a new query condition comes by design.

Importantly, Sidekiq and periodically aim to solve different problems. Nothing prevents one from using both at the same time.