Module: OnlineMigrations::BackgroundDataMigrations::MigrationHelpers

Included in:
SchemaStatements
Defined in:
lib/online_migrations/background_data_migrations/migration_helpers.rb

Instance Method Details

#backfill_column_for_type_change_in_background(table_name, column_name, model_name: nil, type_cast_function: nil, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `backfill_column_for_type_change`.

Backfills data from the old column to the new column using background migrations.

Examples:

backfill_column_for_type_change_in_background(:files, :size)

With type casting

backfill_column_for_type_change_in_background(:users, :settings, type_cast_function: "jsonb")

Additional background migration options

backfill_column_for_type_change_in_background(:files, :size, batch_size: 10_000)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • type_cast_function (String, Symbol) (defaults to: nil)

    Some type changes require casting data to a new type, for example when changing from `text` to `jsonb`. In this case, use the `type_cast_function` option. You need to make sure there is no bad data and that the cast will always succeed.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 86

def backfill_column_for_type_change_in_background(table_name, column_name, model_name: nil,
                                                  type_cast_function: nil, **options)
  backfill_columns_for_type_change_in_background(
    table_name,
    column_name,
    model_name: model_name,
    type_cast_functions: { column_name => type_cast_function },
    **options
  )
end
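
To make the delegation above concrete, here is a minimal sketch in plain Ruby. `RecordingHelpers` and its recording version of `enqueue_background_data_migration` are hypothetical stand-ins, not part of the gem. The two helper bodies follow the source listed in this module, with the multiple-databases handling left out, so the arguments that would reach the `CopyColumn` job can be inspected without a database.

```ruby
# Hypothetical stand-in for the real helpers: instead of persisting a
# migration record, it records the arguments that would be enqueued.
class RecordingHelpers
  attr_reader :enqueued

  def initialize
    @enqueued = []
  end

  # Stub (assumption): the real method creates a Migration record.
  def enqueue_background_data_migration(migration_name, *arguments, **options)
    @enqueued << { name: migration_name, arguments: arguments, options: options }
  end

  # Follows the multi-column helper, minus the multiple-databases handling.
  def backfill_columns_for_type_change_in_background(table_name, *column_names, model_name: nil,
                                                     type_cast_functions: {}, **options)
    tmp_columns = column_names.map { |column_name| "#{column_name}_for_type_change" }
    enqueue_background_data_migration("CopyColumn", table_name, column_names, tmp_columns,
                                      model_name, type_cast_functions, **options)
  end

  # Follows the single-column helper: wraps its inputs and delegates.
  def backfill_column_for_type_change_in_background(table_name, column_name, model_name: nil,
                                                    type_cast_function: nil, **options)
    backfill_columns_for_type_change_in_background(
      table_name, column_name,
      model_name: model_name,
      type_cast_functions: { column_name => type_cast_function },
      **options
    )
  end
end

helpers = RecordingHelpers.new
helpers.backfill_column_for_type_change_in_background(:users, :settings, type_cast_function: "jsonb")
call = helpers.enqueued.last
p call[:arguments]
```

The recorded argument list shows the single column wrapped in an array, the derived temporary column name `settings_for_type_change`, and the cast function keyed by column name.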

#backfill_column_in_background(table_name, column_name, value, model_name: nil, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `update_column_in_batches`.

Note:

Consider `backfill_columns_in_background` when backfilling multiple columns to avoid rewriting the table multiple times.

Backfills column data using background migrations.

Examples:

backfill_column_in_background(:users, :admin, false)

Additional background migration options

backfill_column_in_background(:users, :admin, false, batch_size: 10_000)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • value
  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 30

def backfill_column_in_background(table_name, column_name, value, model_name: nil, **options)
  backfill_columns_in_background(table_name, { column_name => value },
                                 model_name: model_name, **options)
end

#backfill_columns_for_type_change_in_background(table_name, *column_names, model_name: nil, type_cast_functions: {}, **options) ⇒ Object

Same as `backfill_column_for_type_change_in_background` but for multiple columns.

Parameters:

  • type_cast_functions (Hash) (defaults to: {})

    If not empty, keys are column names and values are the corresponding type cast functions.

See Also:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 104

def backfill_columns_for_type_change_in_background(table_name, *column_names, model_name: nil,
                                                   type_cast_functions: {}, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  tmp_columns = column_names.map { |column_name| "#{column_name}_for_type_change" }

  if model_name
    model_name = model_name.name if model_name.is_a?(Class)
    connection_class_name = Utils.find_connection_class(model_name.constantize).name
  end

  # model_name = model_name.name if model_name.is_a?(Class)
  # connection_class = Utils.find_connection_class(model_name.constantize) if model_name

  enqueue_background_data_migration(
    "CopyColumn",
    table_name,
    column_names,
    tmp_columns,
    model_name,
    type_cast_functions,
    connection_class_name: connection_class_name,
    **options
  )
end

#backfill_columns_in_background(table_name, updates, model_name: nil, **options) ⇒ Object

Same as `backfill_column_in_background` but for multiple columns.

Examples:

backfill_columns_in_background(:users, { admin: false, status: "active" })

Parameters:

  • updates (Hash)

    Keys are column names and values are the corresponding values to set.

See Also:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 44

def backfill_columns_in_background(table_name, updates, model_name: nil, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "BackfillColumn",
    table_name,
    updates,
    model_name,
    **options
  )
end
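
The note about preferring `backfill_columns_in_background` for several columns can be illustrated with a small plain-Ruby sketch. `Recorder` and `SketchHelpers` are hypothetical names introduced here; the helper bodies follow the source listed in this module, without the multiple-databases check, and the recorder only counts what would be enqueued.

```ruby
# Hypothetical recorder standing in for the gem's enqueue method.
module Recorder
  def enqueued
    @enqueued ||= []
  end

  # Stub (assumption): the real method creates a Migration record.
  def enqueue_background_data_migration(migration_name, *arguments, **options)
    enqueued << [migration_name, arguments, options]
  end
end

class SketchHelpers
  include Recorder

  # Follows the multi-column helper, without the multiple-databases check.
  def backfill_columns_in_background(table_name, updates, model_name: nil, **options)
    model_name = model_name.name if model_name.is_a?(Class)
    enqueue_background_data_migration("BackfillColumn", table_name, updates, model_name, **options)
  end

  # Follows the single-column helper: wraps column and value into a hash.
  def backfill_column_in_background(table_name, column_name, value, model_name: nil, **options)
    backfill_columns_in_background(table_name, { column_name => value },
                                   model_name: model_name, **options)
  end
end

# Two single-column calls enqueue two separate migrations...
separate = SketchHelpers.new
separate.backfill_column_in_background(:users, :admin, false)
separate.backfill_column_in_background(:users, :status, "active")

# ...while one multi-column call enqueues a single migration.
combined = SketchHelpers.new
combined.backfill_columns_in_background(:users, { admin: false, status: "active" })

puts separate.enqueued.size # prints 2
puts combined.enqueued.size # prints 1
```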

#copy_column_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_function: nil, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `update_column_in_batches`.

Copies data from the old column to the new column using background migrations.

Examples:

copy_column_in_background(:users, :id, :id_for_type_change)

Parameters:

  • table_name (String, Symbol)
  • copy_from (String, Symbol)

    source column name

  • copy_to (String, Symbol)

    destination column name

  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • type_cast_function (String, Symbol) (defaults to: nil)

    Some type changes require casting data to a new type, for example when changing from `text` to `jsonb`. In this case, use the `type_cast_function` option. You need to make sure there is no bad data and that the cast will always succeed.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 153

def copy_column_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_function: nil, **options)
  copy_columns_in_background(
    table_name,
    [copy_from],
    [copy_to],
    model_name: model_name,
    type_cast_functions: { copy_from => type_cast_function },
    **options
  )
end

#copy_columns_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_functions: {}, **options) ⇒ Object

Same as `copy_column_in_background` but for multiple columns.

Parameters:

  • type_cast_functions (Hash) (defaults to: {})

    If not empty, keys are column names and values are the corresponding type cast functions.

See Also:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 171

def copy_columns_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_functions: {}, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "CopyColumn",
    table_name,
    copy_from,
    copy_to,
    model_name,
    type_cast_functions,
    **options
  )
end

#delete_associated_records_in_background(model_name, record_id, association, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to delete the associated records directly.

Deletes associated records for a specific parent record using background migrations. This is useful when you are planning to remove a parent object (user, account, etc.) and need to remove lots of its associated objects.

Examples:

delete_associated_records_in_background("Link", 1, :clicks)

Parameters:

  • model_name (String)
  • record_id (Integer, String)

    parent record primary key’s value

  • association (String, Symbol)

    association name for which records will be removed

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 274

def delete_associated_records_in_background(model_name, record_id, association, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "DeleteAssociatedRecords",
    model_name,
    record_id,
    association,
    **options
  )
end

#delete_orphaned_records_in_background(model_name, *associations, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to find and delete orphaned records directly.

Deletes records with one or more missing relations using background migrations. This is useful when some referential integrity in the database is broken and you want to delete orphaned records.

Examples:

delete_orphaned_records_in_background("Post", :author)

Parameters:

  • model_name (String)
  • associations (Array)
  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 245

def delete_orphaned_records_in_background(model_name, *associations, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "DeleteOrphanedRecords",
    model_name,
    associations,
    **options
  )
end

#enqueue_background_data_migration(migration_name, *arguments, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration Also known as: enqueue_background_migration

Note:

For convenience, the enqueued background data migration is run inline in development and test environments.

Creates a background migration for the given job class name.

A background migration runs one job at a time, computing the bounds of the next batch based on the current migration settings and the previous batch bounds. Each job’s execution status is tracked in the database as the migration runs.

Examples:

enqueue_background_data_migration("BackfillProjectIssuesCount")

# Given the background migration exists:

class BackfillProjectIssuesCount < OnlineMigrations::DataMigration
  def collection
    Project.in_batches(of: 100)
  end

  def process(projects)
    projects.update_all(
      "issues_count = (SELECT COUNT(*) FROM issues WHERE issues.project_id = projects.id)"
    )
  end

  # To be able to track progress, you need to define this method.
  def count
    Project.maximum(:id)
  end
end

Parameters:

  • migration_name (String, Class)

    Background migration class name

  • arguments (Array)

    Extra arguments to pass to the migration instance when the migration runs

  • options (Hash)

    a customizable set of options

Options Hash (**options):

  • :max_attempts (Integer) — default: 5

    Maximum number of batch run attempts

  • :connection_class_name (String, nil)

    Class name to use to get connections

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 371

def enqueue_background_data_migration(migration_name, *arguments, **options)
  options.assert_valid_keys(:max_attempts, :iteration_pause, :connection_class_name)

  migration_name = migration_name.name if migration_name.is_a?(Class)
  options[:connection_class_name] ||= compute_connection_class_name(migration_name, arguments)

  if Utils.multiple_databases? && !options[:connection_class_name]
    raise ArgumentError, "You must pass a :connection_class_name when using multiple databases."
  end

  connection_class = options[:connection_class_name].constantize
  shards = Utils.shard_names(connection_class)
  shards = [nil] if shards.size == 1

  shards.each do |shard|
    # Can't use `find_or_create_by` or hash syntax here, because it does not correctly work with json `arguments`.
    migration = Migration.where(migration_name: migration_name, shard: shard).where("arguments = ?", arguments.to_json).first
    migration ||= Migration.create!(**options, migration_name: migration_name, arguments: arguments, shard: shard)

    if Utils.run_background_migrations_inline? && !migration.succeeded?
      job = OnlineMigrations.config.background_data_migrations.job
      job.constantize.perform_inline(migration.id)
    end
  end

  true
end
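
The lookup-then-create step in the source above makes enqueueing idempotent: a second call with the same name, shard and arguments finds the existing record instead of creating a new one. A minimal sketch of that idea follows, using an in-memory array in place of the migrations table. `Record` and `InMemoryStore` are hypothetical names for illustration only, and the real method persists Active Record models.

```ruby
require "json"

# In-memory stand-in (assumption) for a row in the migrations table.
Record = Struct.new(:migration_name, :arguments_json, :shard)

class InMemoryStore
  attr_reader :records

  def initialize
    @records = []
  end

  # Match on name, shard and the JSON-encoded arguments; create only
  # when no matching record exists yet.
  def find_or_create(migration_name, arguments, shard: nil)
    json = arguments.to_json
    existing = @records.find do |r|
      r.migration_name == migration_name && r.shard == shard && r.arguments_json == json
    end
    return existing if existing

    record = Record.new(migration_name, json, shard)
    @records << record
    record
  end
end

store = InMemoryStore.new
first  = store.find_or_create("CopyColumn", ["users", ["id"], ["id_for_type_change"]])
second = store.find_or_create("CopyColumn", ["users", ["id"], ["id_for_type_change"]])
other  = store.find_or_create("CopyColumn", ["posts", ["id"], ["id_for_type_change"]])

puts first.equal?(second) # prints true: the second call reused the record
puts store.records.size   # prints 2: different arguments create a new record
```

Comparing the serialized arguments reflects the comment in the source, which notes that `find_or_create_by` and hash syntax do not work correctly with JSON `arguments`.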

#ensure_background_data_migration_succeeded(migration_name, arguments: nil) ⇒ Object Also known as: ensure_background_migration_succeeded

Ensures that the background data migration with the provided configuration succeeded.

If the enqueued migration was not found in development (probably when resetting a dev environment followed by `db:migrate`), a log warning is printed. If the enqueued migration was not found in production, an error is raised. If the enqueued migration was found but has not succeeded, an error is raised.

Examples:

Without arguments

ensure_background_data_migration_succeeded("BackfillProjectIssuesCount")

With arguments

ensure_background_data_migration_succeeded("CopyColumn", arguments: ["users", "id", "id_for_type_change"])

Parameters:

  • migration_name (String, Class)

    Background migration job class name

  • arguments (Array, nil) (defaults to: nil)

    Arguments with which background migration was enqueued



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 430

def ensure_background_data_migration_succeeded(migration_name, arguments: nil)
  migration_name = migration_name.name if migration_name.is_a?(Class)

  configuration = { migration_name: migration_name }

  if arguments
    arguments = Array(arguments)
    migrations = Migration.for_configuration(migration_name, arguments).to_a
    configuration[:arguments] = arguments.to_json
  else
    migrations = Migration.for_migration_name(migration_name).to_a
  end

  if migrations.empty?
    Utils.raise_in_prod_or_say_in_dev("Could not find background data migration(s) for the given configuration: #{configuration}.")
  elsif !migrations.all?(&:succeeded?)
    raise "Expected background data migration(s) for the given configuration to be marked as 'succeeded': #{configuration}."
  end
end
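
The three outcomes described for this method (not found in development, not found in production, found but not succeeded) can be sketched as a small decision function. Everything here is a hypothetical stand-in: `FakeMigration`, `check_succeeded`, the `production:` flag that replaces `Utils.raise_in_prod_or_say_in_dev`, the status strings, and the symbol return values, which the real method does not have.

```ruby
# Hypothetical minimal record: only the predicate used by the check.
FakeMigration = Struct.new(:status) do
  def succeeded?
    status == "succeeded"
  end
end

# Sketch of the decision logic. The `production` flag is an assumption
# standing in for the gem's environment-dependent helper.
def check_succeeded(migrations, production:)
  if migrations.empty?
    message = "Could not find background data migration(s)."
    raise message if production

    warn message # development: log a warning and carry on
    :warned
  elsif !migrations.all?(&:succeeded?)
    raise "Expected background data migration(s) to be marked as 'succeeded'."
  else
    :ok
  end
end

ok_result   = check_succeeded([FakeMigration.new("succeeded")], production: true)
dev_missing = check_succeeded([], production: false)

begin
  check_succeeded([FakeMigration.new("running")], production: false)
rescue RuntimeError => e
  unfinished_error = e.message
end

p ok_result
p dev_missing
puts unfinished_error
```

Note that an unfinished migration raises in every environment; only the "not found" case is softened to a warning outside production.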

#perform_action_on_relation_in_background(model_name, conditions, action, updates: nil, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to perform the action on the records directly.

Performs a specific action on a relation or on individual records. This is useful when you want to delete, destroy, update, etc. records based on some conditions.

Examples:

Delete records

perform_action_on_relation_in_background("User", { banned: true }, :delete_all)

Destroy records

perform_action_on_relation_in_background("User", { banned: true }, :destroy_all)

Update records

perform_action_on_relation_in_background("User", { banned: nil }, :update_all, updates: { banned: false })

Perform custom method on individual records

class User < ApplicationRecord
  def generate_invite_token
    self.invite_token = # some complex logic
  end
end

perform_action_on_relation_in_background("User", { invite_token: nil }, :generate_invite_token)

Parameters:

  • model_name (String)
  • conditions (Array, Hash, String)

    conditions to filter the relation

  • action (String, Symbol)

    Action to perform on the relation or on individual records. Available relation-wide actions: `:delete_all`, `:destroy_all`, and `:update_all`.

  • updates (Hash) (defaults to: nil)

    Updates to perform when `action` is set to `:update_all`.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 320

def perform_action_on_relation_in_background(model_name, conditions, action, updates: nil, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "PerformActionOnRelation",
    model_name,
    conditions,
    action,
    { updates: updates },
    **options
  )
end

#remove_background_data_migration(migration_name, *arguments) ⇒ Object Also known as: remove_background_migration

Removes the background migration for the given class name and arguments, if it exists.

Examples:

remove_background_data_migration("BackfillProjectIssuesCount")

Parameters:

  • migration_name (String, Class)

    Background migration job class name

  • arguments (Array)

    Extra arguments the migration was originally created with



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 408

def remove_background_data_migration(migration_name, *arguments)
  migration_name = migration_name.name if migration_name.is_a?(Class)
  Migration.for_configuration(migration_name, arguments).delete_all
end
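
Because the method takes `*arguments`, a call without arguments passes an empty list to the lookup. The sketch below illustrates what that implies, under the assumption that `Migration.for_configuration` matches on both the name and the exact argument list (its implementation is not shown in this document). `FakeMigrationTable` is a hypothetical in-memory stand-in.

```ruby
require "json"

# Hypothetical in-memory table of [migration_name, arguments_json] rows.
class FakeMigrationTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  def enqueue(migration_name, *arguments)
    row = [migration_name, arguments.to_json]
    @rows << row unless @rows.include?(row)
  end

  # Assumption: removal matches on name AND the exact argument list.
  # Returns the number of removed rows.
  def remove(migration_name, *arguments)
    before = @rows.size
    @rows.reject! { |name, json| name == migration_name && json == arguments.to_json }
    before - @rows.size
  end
end

table = FakeMigrationTable.new
table.enqueue("BackfillProjectIssuesCount")
table.enqueue("CopyColumn", "users", "id", "id_for_type_change")

removed_without_args = table.remove("CopyColumn")
removed_with_args    = table.remove("CopyColumn", "users", "id", "id_for_type_change")
removed_plain        = table.remove("BackfillProjectIssuesCount")

puts removed_without_args # prints 0: no arguments given, so nothing matched
puts removed_with_args    # prints 1
puts removed_plain        # prints 1
```

If that assumption holds, a migration that was enqueued with arguments needs the same arguments passed again on removal.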

#reset_counters_in_background(model_name, *counters, touch: nil, **options) ⇒ OnlineMigrations::BackgroundDataMigrations::Migration

Note:

This method is better suited to large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use `reset_counters` from Active Record.

Resets one or more counter caches to their correct value using background migrations. This is useful when adding new counter caches, or if the counter has been corrupted or modified directly by SQL.

Examples:

reset_counters_in_background("User", :projects, :friends, touch: true)

Touch specific column

reset_counters_in_background("User", :projects, touch: :touched_at)

Touch with specific time value

reset_counters_in_background("User", :projects, touch: [time: 2.days.ago])

Parameters:

  • model_name (String)
  • counters (Array)
  • touch (Boolean, Symbol, Array) (defaults to: nil)

    Touch timestamp columns when updating.

    • when `true` - will touch `updated_at` and/or `updated_on`

    • when `Symbol` or `Array` - will touch specific column(s)

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

See Also:



# File 'lib/online_migrations/background_data_migrations/migration_helpers.rb', line 216

def reset_counters_in_background(model_name, *counters, touch: nil, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "ResetCounters",
    model_name,
    counters,
    { touch: touch },
    **options
  )
end