Progressrus

Progressrus provides progress status for long-running jobs. The progress is stored in a persistence layer. Progressrus currently ships with a Redis adapter, but is written in an adapter-agnostic way, so that multiple layers can be used at the same time. For example, a message queue adapter could be used alongside Redis for real-time updates.

Think of Progressrus as a progress bar where instead of flushing the progress to stdout, it's pushed to one or more data stores. It can be used with a background job engine or just about anything where you need to show the progress in a different location than the long-running operation.

It works by telling Progressrus the finishing point (total). Each time the job makes progress towards the total, the job calls tick. The progress is written to the data store(s) only when a configurable number of seconds has passed since the last write. This prevents a job processing e.g. 100 records from hitting the data store 100 times. Combining adapters is planned, so you can publish to some kind of real-time data source, Redis and even stdout at the same time.
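The throttling described above can be sketched in a few lines of plain Ruby. This is an illustration only, not the Progressrus implementation: the ThrottledProgress and MemoryStore classes, the persist method, and the injectable clock are all hypothetical names made up for this example.

```ruby
# Hypothetical in-memory store; stands in for Redis or any other adapter.
class MemoryStore
  attr_reader :writes

  def initialize
    @writes = []
  end

  def persist(progress)
    @writes << { count: progress.count, total: progress.total }
  end
end

# Illustrative only: not the Progressrus implementation.
class ThrottledProgress
  attr_reader :count, :total

  def initialize(total:, stores:, interval: 2, clock: -> { Time.now })
    @total = total
    @stores = stores
    @interval = interval
    @clock = clock
    @count = 0
    @persisted_at = nil
  end

  # Count one unit of work; write to the stores only on the first tick
  # or once the interval has elapsed since the last write.
  def tick
    @count += 1
    persist if @persisted_at.nil? || @clock.call - @persisted_at >= @interval
  end

  # Force a final write so the stores always see the finished count.
  def complete
    persist
  end

  private

  def persist
    @stores.each { |store| store.persist(self) }
    @persisted_at = @clock.call
  end
end

# Simulate 100 ticks arriving one second apart with a fake integer clock.
now = 0
store = MemoryStore.new
progress = ThrottledProgress.new(total: 100, stores: [store], interval: 10, clock: -> { now })
100.times do
  progress.tick
  now += 1
end
progress.complete
puts "ticks: #{progress.count}, writes: #{store.writes.size}"
```

With a 10 second interval and ticks one second apart, the store receives a small fraction of the 100 ticks as writes, plus the forced write from complete. Passing several stores in the array shows how the same update could be fanned out to more than one adapter.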

Progressrus keeps track of the jobs in some scope. This could be a user_id. This makes it easy to find the jobs and their progress for a specific user, without worrying about keeping e.g. the Resque job ids around.

Progressrus will update the data store with the progress of the job. The key for a user with user_id 3421 would be: progressrus:user:3421. For the Redis data store, the key is a Redis hash where the job id is the field and the value is a JSON object with information about the progress, e.g.:

redis> HGETALL progressrus:user:3421
1) "4bacc11a-dda3-405e-b0aa-be8678d16037"
2) "{\"count\":94,\"total\":100,\"started_at\":\"2013-12-08 10:53:41 -0500\"}"
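To make the layout above concrete, here is a small sketch of how such a key and JSON value could be assembled. The progress_key and progress_value helpers are hypothetical and exist only for this example; how Progressrus builds them internally is not shown here.

```ruby
require "json"

# Hypothetical helper: joins a prefix and the scope parts with ":".
def progress_key(scope, prefix: "progressrus")
  [prefix, *scope].join(":")
end

# Hypothetical helper: serializes the fields shown in the HGETALL output above.
def progress_value(count:, total:, started_at:)
  JSON.generate(count: count, total: total, started_at: started_at.to_s)
end

key = progress_key([:user, 3421])
value = progress_value(
  count: 94,
  total: 100,
  started_at: Time.new(2013, 12, 8, 10, 53, 41, "-05:00")
)

puts key   # progressrus:user:3421
puts value # {"count":94,"total":100,"started_at":"2013-12-08 10:53:41 -0500"}
```

Each job in the scope would then occupy one field of the hash stored at that key, with the job id as the field name.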

Usage

Instrument by creating a Progressrus object with the scope and the total number of records to be processed:

class MaintenanceProcessRecords
  def self.perform(record_ids, user_id)
    # The total has to be supplied explicitly unless you're on Rails 4.1;
    # here the number of ids passed in is used.
    Record.where(id: record_ids)
      .enum_for(:find_each)
      .with_progress(scope: [:user, user_id], total: record_ids.count) do |record|
      record.do_expensive_things
    end
  end
end

You can also use the slightly more flexible lower-level API, which is useful in some cases:

class MaintenanceProcessRecords
  def self.perform(record_ids, user_id)
    # Construct the progress object.
    progress = Progressrus.new(scope: [:user, user_id], total: record_ids.count)

    # Start processing the records!
    Record.where(id: record_ids).find_each do |record|
      begin
        record.do_expensive_things

        # Records a single tick. The data store is only updated if x seconds
        # have passed since the last update.
        progress.tick
      rescue
        # Increments the error count if the above failed
        progress.error
      end
    end

    # Force an update to the data store and set :completed_at to Time.now
    progress.complete
  end
end

Querying Progress by scope

To query for the progress of jobs for a specific scope:

> Progressrus.all(["walrus", '1234'])
#=> [
  #<Progressrus::Progress:0x007f0fdc8ab888 @scope=["walrus", "1234"], @total=50, @id="narwhal", @interval=2, @params={:count=>0, :started_at=>"2013-12-12 18:09:44 +0000", :completed_at=>nil, :name=>"oemg-test-2"}, @count=0, @error_count=0, @started_at=2013-12-12 18:09:44 +0000, @persisted_at=2013-12-12 18:09:41 +0000, @store=#<Progressrus::Store::Redis:0x007f0fdc894c28 @redis=#<Redis client v3.0.6 for redis://127.0.0.1:6379/0>, @options={:expire=>1800, :prefix=>"progressrus"}>, @completed_at=nil>,
  #<Progressrus::Progress:0x007f0fdc8ab4a0 @scope=["walrus", "1234"], @total=100, @id="oemg", @interval=2, @params={:count=>0, :started_at=>"2013-12-12 18:09:44 +0000", :completed_at=>nil, :name=>"oemg-test"}, @count=0, @error_count=0, @started_at=2013-12-12 18:09:44 +0000, @persisted_at=2013-12-12 18:09:41 +0000, @store=#<Progressrus::Store::Redis:0x007f0fdc894c28 @redis=#<Redis client v3.0.6 for redis://127.0.0.1:6379/0>, @options={:expire=>1800, :prefix=>"progressrus"}>, @completed_at=nil>
]

The Progressrus objects provide useful methods such as #percentage, which returns how far along the job is, and #eta, which returns a Time object estimating when the job will be complete. The scope is completely independent of the job itself, which means you can have jobs from multiple sources in the same scope.
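As a rough illustration of what #percentage and #eta could compute, the sketch below uses a hypothetical ProgressSnapshot struct. The field names mirror the inspected objects above, but the formulas are assumptions (a completion fraction and a linear extrapolation), not the library's code.

```ruby
# Illustrative struct; the formulas below are assumptions, not library code.
ProgressSnapshot = Struct.new(:count, :total, :started_at, keyword_init: true) do
  # Fraction of the work done, between 0.0 and 1.0 (assumed scale).
  def percentage
    count.to_f / total
  end

  # Linear extrapolation: assume the remaining records take as long per
  # record as the ones already processed. Returns nil before any progress.
  def eta(now: Time.now)
    return nil if count.zero?

    elapsed = now - started_at
    now + (elapsed / count) * (total - count)
  end
end

started = Time.at(1_000)
snapshot = ProgressSnapshot.new(count: 25, total: 100, started_at: started)
now = started + 50 # 25 records took 50 seconds

puts snapshot.percentage          # 0.25
puts snapshot.eta(now: now) - now # 150.0 (seconds remaining)
```

A job with no ticks yet has no rate to extrapolate from, which is why this sketch returns nil for the estimate in that case.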

Querying Progress by scope and id

To query for the progress of a specific job:

> Progressrus.find(["walrus", '1234'], 'narwhal')
#=> #<Progressrus::Progress:0x007f0fdc8ab888 @scope=["walrus", "1234"], @total=50, @id="narwhal", @interval=2, @params={:count=>0, :started_at=>"2013-12-12 18:09:44 +0000", :completed_at=>nil, :name=>"oemg-test-2"}, @count=0, @error_count=0, @started_at=2013-12-12 18:09:44 +0000, @persisted_at=2013-12-12 18:09:41 +0000, @store=#<Progressrus::Store::Redis:0x007f0fdc894c28 @redis=#<Redis client v3.0.6 for redis://127.0.0.1:6379/0>, @options={:expire=>1800, :prefix=>"progressrus"}>, @completed_at=nil>

Todo

  • Tighter Resque/Sidekiq/DJ integration
  • Rack interface
  • SQL adapter
  • Document adapter-specific options
  • Enumerable integration for higher-level API
  • Documentation on how to do sharding