CanvasSync

This gem is intended to facilitate fast and easy syncing of Canvas data.

Installation

Add this line to your application's Gemfile:

gem 'canvas_sync'

Models and migrations can be installed using the following generator:

bin/rails generate canvas_sync:install --models users,terms,courses

Use the --models option to specify which models you would like installed. This will add both the model files and their corresponding migrations. If you'd like to install all the models that CanvasSync supports, specify --models all.

Then run the migrations:

bundle exec rake db:migrate

For a list of currently supported models, see CanvasSync::SUPPORTED_MODELS.

Additionally, your Canvas instance must have the "Proserv Provisioning Report" enabled.

The following custom reports are required for the specified models:

  • assignments = "Assignments Report" (proserv_assignment_export_csv)
  • submissions = "Student Submissions" (proserv_student_submissions_csv)
  • assignment_groups = "Assignment Group Export" (proserv_assignment_group_export_csv)
  • context_modules = "Professional Services Context Modules Report" (proserv_context_modules_csv)
  • context_module_items = "Professional Services Context Module Items Report" (proserv_context_module_items_csv)

Prerequisites

Postgres

Bulk inserts rely on a Postgres upsert. Because of this, you need to be using Postgres 9.5 or above.

Sidekiq

Make sure you've set up Sidekiq to work properly with ActiveJob as outlined here.

Apartment

If you are using Apartment and Sidekiq, make sure you include the apartment-sidekiq gem so that jobs run in the correct tenant.

Basic Usage

Your tool must have an ActiveJob compatible job queue adapter configured, such as DelayedJob or Sidekiq. Additionally, you must have a method called canvas_sync_client defined in an initializer that returns a Bearcat client for the Canvas instance you are syncing against. Example:

# config/initializers/canvas_sync.rb
def canvas_sync_client
  Bearcat::Client.new(token: current_organization.settings[:api_token], prefix: current_organization.settings[:base_url])
end

(Having the client defined here means the sensitive API token doesn't have to be passed in plain text between jobs.)

Once that's done and you've used the generator to create your models and migrations, you can run the standard provisioning sync:

CanvasSync.provisioning_sync(<array of models to sync>, term_scope: <optional term scope>)

Note: add 'xlist' to your array of models if you would like sections to include cross-listing information.

Example:

CanvasSync.provisioning_sync(['users', 'courses'], term_scope: :active)

This will kick off a string of jobs to sync your specified models.

If you pass in the optional term_scope the provisioning reports will be run for only the terms returned by that scope. The scope must be defined on your Term model. (A sample one is provided in the generated Term.)

Imports are inserted in bulk with activerecord-import so they should be very fast.

Advanced Usage

This gem also helps with syncing and processing other reports if needed. In order to do so, you must:

  • Define a Processor class that implements a process method for handling the results of the report
  • Integrate your reports with the ReportStarter
  • Tell the gem what jobs to run

Processor

Your processor class must implement a process class method that receives a report_file_path and a hash of options. (See the CanvasSync::Processors::ProvisioningReportProcessor for an example.) The gem handles the work of enqueueing and downloading the report and then passes the file path to your class to process as needed. A simple example might be:

class MyCoolProcessor
  def self.process(report_file_path, options)
    puts "I downloaded a report to #{report_file_path}! Isn't that neat!"
  end
end
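
A processor will usually parse the downloaded file rather than just print its path. The following is a minimal sketch, assuming the report is a CSV file with a header row. The StatusCountingProcessor class name and the status column are hypothetical and chosen only for illustration:

```ruby
require "csv"

# Hypothetical processor: tallies report rows by their "status" column.
class StatusCountingProcessor
  # report_file_path - path to the downloaded report (assumed to be a CSV with headers)
  # options          - hash of options passed along by the job (unused here)
  def self.process(report_file_path, options)
    counts = Hash.new(0)
    CSV.foreach(report_file_path, headers: true) do |row|
      counts[row["status"]] += 1
    end
    counts
  end
end
```

In a real tool you would most likely persist the rows instead of counting them. The class-method signature is the only part the gem depends on.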

Report starter

You must implement a job that will enqueue a report starter for your report. (TODO: would be nice to make some sort of builder for this, so you just define the report and its params and then the gem runs it in a pre-defined job.)

Let's say we have a custom Canvas report called "my_really_cool_report_csv". First, we would need to create a job class that will enqueue a report starter. To work with the CanvasSync interface, your class must accept two parameters: job_chain and options.

class MyReallyCoolReportJob < CanvasSync::Jobs::ReportStarter
  def perform(job_chain, options)
    super(
      job_chain,
      'my_really_cool_report_csv', # Report name
      { "parameters[param1]" => true }, # Report parameters
      MyCoolProcessor.to_s, # Your processor class as a string
      options
    )
  end
end

You can also see examples in lib/canvas_sync/jobs/sync_users_job.rb and lib/canvas_sync/jobs/sync_provisioning_report.rb.

Start the jobs

The CanvasSync.process_jobs method allows you to pass in a chain of jobs to run. The job chain must be formatted like:

{
  jobs: [
    { job: JobClass, options: {} },
    { job: JobClass2, options: {} }
  ],
  global_options: {}
}

Here is an example that runs our new report job first, followed by the built-in provisioning job:

job_chain = {
  jobs: [
    { job: MyReallyCoolReportJob, options: {} },
    { job: CanvasSync::Jobs::SyncProvisioningReportJob, options: { models: ['users', 'courses'] } }
  ],
  global_options: {}
}

CanvasSync.process_jobs(job_chain)

What if you've got some other job that you want to run that doesn't deal with a report? No problem! Just make sure you call CanvasSync.invoke_next at the end of your job. Example:

class SomeRandomJob < CanvasSync::Job
  def perform(job_chain, options)
    i_dunno_do_something!

    CanvasSync.invoke_next(job_chain)
  end
end

job_chain = {
  jobs: [
    { job: SomeRandomJob, options: {} },
    { job: CanvasSync::Jobs::SyncProvisioningReportJob, options: { models: ['users', 'courses'] } }
  ],
  global_options: {}
}

CanvasSync.process_jobs(job_chain)
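
To see why the invoke_next call matters, here is a toy, in-process model of a job chain. It only illustrates the chaining idea and is not the gem's implementation: ToyChain, FirstJob, and SecondJob are hypothetical names, and the real gem enqueues jobs through ActiveJob rather than calling them directly.

```ruby
# Toy model only: runs each job in the chain synchronously.
module ToyChain
  RAN = [] # names of the jobs that have run, in order

  # Removes the next entry from the chain and runs it, handing it the
  # remaining chain so it can continue. Does nothing once the chain is empty.
  def self.invoke_next(job_chain)
    next_job = job_chain[:jobs].shift
    return if next_job.nil?

    next_job[:job].new.perform(job_chain, next_job[:options])
  end
end

class FirstJob
  def perform(job_chain, options)
    ToyChain::RAN << "first"
    ToyChain.invoke_next(job_chain) # hands control to the next job
  end
end

class SecondJob
  def perform(job_chain, options)
    ToyChain::RAN << "second"
    ToyChain.invoke_next(job_chain) # chain is empty now, so this is a no-op
  end
end

job_chain = {
  jobs: [
    { job: FirstJob, options: {} },
    { job: SecondJob, options: {} }
  ],
  global_options: {}
}

ToyChain.invoke_next(job_chain)
```

In this model, a job that returns without calling invoke_next leaves the rest of the chain unrun, which is why the call belongs at the end of every custom job.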

Batching

The provisioning report uses the CanvasSync::Importers::BulkImporter class to bulk import rows with the activerecord-import gem. It inserts rows in batches of 10,000 by default. This can be customized by setting the BULK_IMPORTER_BATCH_SIZE environment variable.
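
The effect of the batch size can be sketched in plain Ruby. This illustrates the batching idea and is not the BulkImporter source: the each_batch helper is hypothetical, and only the BULK_IMPORTER_BATCH_SIZE variable and the 10,000 default come from the description above.

```ruby
# Hypothetical helper: yields rows in slices sized by the environment
# variable, falling back to 10,000 when it is unset.
def each_batch(rows, env = ENV)
  batch_size = Integer(env.fetch("BULK_IMPORTER_BATCH_SIZE", "10000"))
  rows.each_slice(batch_size) { |batch| yield batch }
end

rows = (1..25_000).to_a
each_batch(rows) { |batch| puts "would import #{batch.size} rows" }
```

A smaller batch size trades more round trips to the database for lower memory use per insert.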

Mapping Overrides

Overrides are useful for two scenarios:

  • You have an existing application where the column names do not match up with what CanvasSync expects
  • You want to sync some other column in the report that CanvasSync is not configured to sync

In order to create an override, place a file called canvas_sync_provisioning_mapping.yml in your Rails config directory. Define the tables and columns you want to override using the following format:

users:
  conflict_target: canvas_user_id # This must be a unique field that is present in the report and the database
  report_columns: # The keys specified here are the column names in the report CSV
    canvas_user_id_column_name_in_report:
      database_column_name: canvas_user_id_name_in_your_db # Sometimes the database column name might not match the report column name
      type: integer
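
To make the format concrete, the sketch below parses the example mapping with Ruby's standard YAML library and applies it to one report row. The apply_mapping helper is hypothetical and exists only to show how the keys relate to each other; it is not how the gem itself reads the file.

```ruby
require "yaml"

mapping = YAML.safe_load(<<~YML)
  users:
    conflict_target: canvas_user_id
    report_columns:
      canvas_user_id_column_name_in_report:
        database_column_name: canvas_user_id_name_in_your_db
        type: integer
YML

# Hypothetical helper: renames report columns to database columns and
# casts values whose configured type is "integer".
def apply_mapping(row, report_columns)
  report_columns.each_with_object({}) do |(report_name, config), attrs|
    value = row[report_name]
    value = Integer(value) if config["type"] == "integer" && !value.nil?
    attrs[config["database_column_name"]] = value
  end
end

row = { "canvas_user_id_column_name_in_report" => "42" }
attributes = apply_mapping(row, mapping["users"]["report_columns"])
puts attributes.inspect
```

Each key under report_columns names a CSV header, and database_column_name says where that value should land in your table.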

Legacy Support

If you have an old style tool that needs to sync data on a row by row basis, you can pass in the legacy_support: true option. In order for this to work, your models must have a create_or_update_from_csv class method defined that accepts a row argument. This method will get passed each row from the CSV, and it's up to you to persist it.

Example:

CanvasSync.provisioning_sync(['users', 'courses'], term_scope: :active, legacy_support: true)
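
A model's create_or_update_from_csv might look roughly like the sketch below. To keep the example runnable on its own, it uses a plain Ruby class with an in-memory store in place of an ActiveRecord model. The LegacyUser class and the canvas_user_id and login_id column names are assumptions made for illustration.

```ruby
# Stand-in for an ActiveRecord model (hypothetical).
class LegacyUser
  RECORDS = {} # canvas_user_id => attributes

  # Receives one CSV row (anything that responds to #[] with header names)
  # and creates or updates the matching record.
  def self.create_or_update_from_csv(row)
    id = Integer(row["canvas_user_id"])
    record = (RECORDS[id] ||= {})
    record[:login_id] = row["login_id"]
    record
  end
end

# The second call updates the record created by the first.
LegacyUser.create_or_update_from_csv({ "canvas_user_id" => "1", "login_id" => "jdoe" })
LegacyUser.create_or_update_from_csv({ "canvas_user_id" => "1", "login_id" => "jane.doe" })
```

In a Rails app the body would more likely look up the record with something like find_or_initialize_by and then save it, but the contract is the same: one row in, one persisted record out.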

CanvasSync::JobLog

Running the migrations will create a canvas_sync_job_logs table. All the jobs written in this gem will create a CanvasSync::JobLog and store data about their arguments, job class, any exceptions, and start/completion time. This will work regardless of your queue adapter.

If you want your own jobs to also log to the table all you have to do is have your job class inherit from CanvasSync::Job. You can also persist extra data you might need later by saving to the metadata column:

@job_log.metadata = "This job ran really well!"
@job_log.save!

If you want to be able to utilize the CanvasSync::JobLog without ActiveJob (so you can get access to Sidekiq features that ActiveJob doesn't support), then add the following to an initializer in your Rails app:

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add CanvasSync::Sidekiq::Middleware
  end
end

Configuration

You can configure CanvasSync settings by doing the following:

CanvasSync.configure do |config|
  config.classes_to_only_log_errors_on << "ClassToOnlyLogErrorsOn"
end

Available config options (if you add more, please update this!):

  • config.classes_to_only_log_errors_on - use this if you are utilizing the CanvasSync::JobLog table, but want certain classes to only persist in the job_logs table if an error is encountered. This is useful if you've got a very frequently used job that's filling up your database, and only really care about tracking failures.

Upgrading

Re-running the generator when there's been a gem change will give you several choices if it detects conflicts between your local files and the updated generators. You can either view a diff or allow the generator to overwrite your local file. In most cases you may just want to add the code from the diff yourself so as not to break any of your customizations.

Additionally, if there have been schema changes to an existing model you may have to run your own migration to bring it up to speed.

If you make updates to the gem please add any upgrade instructions here.

Integrating with existing applications

In order for this to work properly, your database tables will need to have at least the columns defined in this gem. (Adding additional columns is fine.) As such, you may need to run some migrations to rename existing columns or add missing ones. The generator only works well when the table does not already exist. Take a look at the migration templates in lib/canvas_sync/generators/templates to see what you need.

Development

When adding to or updating this gem, make sure you do the following:

  • Update the yardoc comments where necessary, and confirm the changes by running yardoc --server
  • Write specs
  • If you modify the model or migration templates, run bundle exec rake update_test_schema to update them in the Rails Dummy application (and commit those changes)

Docs

Docs can be generated using yard. To view the docs:

  • Clone this gem's repository
  • bundle install
  • yard server --reload

The yard server will give you a URL you can visit to view the docs.