Mclone

Mclone is a utility for offline file synchronization which uses the Rclone as a backend for the actual file transfer.

Purpose

Suppose you have a large amount of data which needs to be either distributed across several storages or simply backed up. For example, consider a terabyte-sized private media archive one cannot afford to lose.

As the data gets periodically updated, there is a need for regular synchronization. When online cloud storage is not an option due to storage space or security reasons, the good ol' offline backup comes back into play.

A sane backup strategy mandates that the data copies be physically separated - be it a computer in the next room (building, city or planet) or just an external drive. Or, even better, the storages of two computers - a primary, where all activity takes place, and a mirror, which holds the backup - plus a portable storage (USB flash drive, external HDD or SSD - whatever) which serves as both an intermediate storage and a means of propagating the changes between the primary and the mirror.

In a more complex scenario there may be multiple one-way or two-way point-to-point data transfer routes between the storages, employing portable storage as a "shuttle" or a "ferry".

All in all, the synchronization task boils down to copying or synchronizing the contents of two local directories. However, since portable storage is involved, the actual file paths may change between synchronizations, as a storage device can be mounted under different mount points on *NIX systems or get a different drive letter on Windows systems.

While the Rclone itself is a great tool for local file synchronization, typing its command line by hand for every run is tedious and error-prone, and the possible cost of an error is backup corruption due to wrong paths or misspelled flags.

This is where the Mclone comes in. It is designed to automate the Rclone synchronization process by memorizing the command line options and detecting the proper source and destination locations wherever they are.

Installation

Mclone is written in the Ruby language and is distributed as a Ruby gem.

Once the Ruby runtime is properly set up, the Mclone itself is installed with

$ gem install mclone

Obviously, an Rclone installation is also required. The Mclone will use the contents of the RCLONE environment variable if it is set, or otherwise look through the PATH environment variable to locate the rclone executable.
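The lookup order described above can be sketched in a few lines of Ruby. This is only an illustration of the idea, not the Mclone's actual code: the locate_rclone helper and its env parameter are hypothetical names, and the PATHEXT handling for Windows is an assumption.

```ruby
# Hypothetical helper: prefer the RCLONE variable, then fall back to a PATH search.
# `env` is injectable so the lookup can be tried without touching the real ENV.
def locate_rclone(env = ENV)
  explicit = env['RCLONE']
  return explicit unless explicit.nil? || explicit.empty?

  # Assumption: on Windows the executable carries an extension listed in PATHEXT.
  extensions = env['PATHEXT'] ? env['PATHEXT'].split(';') : ['']
  env.fetch('PATH', '').split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, "rclone#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil # nothing found
end

puts locate_rclone || 'rclone not found'
```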

Once properly installed, the Mclone provides the mclone command line utility.

$ mclone -h

Basic use case

Let's start with the simplest case.

Suppose you have a data directory /data and you'd want to set up the backup of the /data/files subdirectory into a backup directory /mnt/backup. The latter may be an ordinary directory or a mounted portable storage, or whatever.

1. Create volumes

Mclone has a notion of a volume - a file system directory containing the .mclone file, which is used as a root directory for all Mclone operations.

By default, in order to detect the currently available volumes the Mclone scans all mount points on *NIX systems and all available disk drives on Windows systems. Additionally, a static list of volume directories to consider can be specified in the MCLONE_PATH environment variable, which is a PATH-like list of directories separated by the colon : on *NIX systems or the semicolon ; on Windows systems.

If the /data is a regular directory, it won't be picked up by the Mclone automatically, so it needs to be put into the environment for later reuse

$ export MCLONE_PATH=/data

On the other hand, if the /mnt/backup is a mount point for a portable storage, it will be autodetected, therefore there is no need to put it there.
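As a rough illustration of how a PATH-like MCLONE_PATH value could be turned into a list of volumes, consider the sketch below. The static_volumes function is a hypothetical name, and the check is reduced to "the directory contains a .mclone file"; the real detection logic may well do more.

```ruby
# Hypothetical sketch: keep only the listed directories that hold a .mclone file.
# File::PATH_SEPARATOR is ":" on *NIX systems and ";" on Windows systems.
def static_volumes(mclone_path, separator = File::PATH_SEPARATOR)
  mclone_path.to_s.split(separator)
             .reject(&:empty?)
             .select { |dir| File.file?(File.join(dir, '.mclone')) }
end

puts static_volumes(ENV['MCLONE_PATH'])
```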

Both the source and destination endpoints have to be "formatted" in order to be recognized as the Mclone volumes

$ mclone volume create /data
$ mclone volume create /mnt/backup

After that, mclone info can be used to review the recognized volumes

$ mclone info

# Mclone version 0.1.0

## Volumes

* [6bfa4a2d] :: (/data)
* [7443e311] :: (/mnt/backup)

Each volume is identified by a randomly generated tag shown within the square brackets [...]. Obviously, the tags will be different in your case.
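For a feel of what such a tag is, here is one way to produce a value of the same shape. That a tag is exactly 8 random hexadecimal digits is an assumption drawn from the sample output above, and random_tag is a hypothetical name.

```ruby
require 'securerandom'

# Assumption: a tag is 8 random hexadecimal digits, like 6bfa4a2d above.
def random_tag
  SecureRandom.hex(4) # 4 random bytes rendered as 8 hex characters
end

puts random_tag
```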

2. Create a task

A Mclone task corresponds to a single Rclone command. It contains the source and destination volume identifiers, the source and destination subdirectories relative to the respective volumes, as well as additional Rclone command line arguments to be used.

There can be multiple tasks linking different source and destination volumes as well as their respective subdirectories.

A task with all defaults is created with

$ mclone task create /data/files /mnt/backup/files

Note that at this point there is no need to use the above volume tags as they will be auto-determined during task creation.

Again, use mclone info to review the changes

$ mclone info

# Mclone version 0.1.0

## Volumes

* [6bfa4a2d] :: (/data)
* [7443e311] :: (/mnt/backup)

## Intact tasks

* [cef63f5e] :: update [6bfa4a2d](files) -> [7443e311](files) :: include **

The output literally means: the update task cef63f5e, which is ready to process (intact), copies from the files source subdirectory of the 6bfa4a2d volume to the files destination subdirectory of the 7443e311 volume, including all files and subdirectories (**).

Again, the task's tag is randomly generated and will be different in your case.

There are two kinds of tasks to encounter - intact and stale.

An intact task is a task which is fully ready for processing with the Rclone. As with the volumes, its tag is shown in the square brackets [...].

Conversely, a stale task is not ready for processing because its source or destination volume is currently missing. A stale task's tag is shown in the angle brackets <...>. The tag of a stale task's missing volume is shown in the angle brackets as well.

Thanks to the indirection in the source and destination directories, this task will be handled properly regardless of the directory the portable storage is mounted in next time, provided that it is detectable by the Mclone.

The same applies to Windows systems, where the portable storage can appear under different drive letters and yet be detectable by the Mclone.
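The intact/stale distinction and the bracket notation can be modelled as follows. This is a minimal sketch: TaskInfo, intact?, decorate and describe are hypothetical names, and the printed line only approximates the mclone info layout.

```ruby
TaskInfo = Struct.new(:tag, :mode, :src_volume, :src_dir, :dst_volume, :dst_dir)

# A task is intact only when both of its volumes are currently available.
def intact?(task, available)
  available.include?(task.src_volume) && available.include?(task.dst_volume)
end

# Square brackets mark what is present, angle brackets mark what is missing.
def decorate(tag, present)
  present ? "[#{tag}]" : "<#{tag}>"
end

def describe(task, available)
  src = decorate(task.src_volume, available.include?(task.src_volume))
  dst = decorate(task.dst_volume, available.include?(task.dst_volume))
  "* #{decorate(task.tag, intact?(task, available))} :: #{task.mode} " \
    "#{src}(#{task.src_dir}) -> #{dst}(#{task.dst_dir})"
end

task = TaskInfo.new('cef63f5e', 'update', '6bfa4a2d', 'files', '7443e311', 'files')
puts describe(task, %w[6bfa4a2d 7443e311]) # both volumes present: intact
puts describe(task, %w[6bfa4a2d])          # destination unplugged: stale
```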

3. Modify the task

Once a task is created, its source and destination volumes and directories are fixed and cannot be changed. Therefore the only way to modify them is to delete the task with the mclone task delete command and create it again from scratch.

A task's optional parameters, however, can be modified afterwards with the mclone task modify command.

Suppose you want to change the operation mode from the default update to synchronize and to exclude .bak files.

$ mclone task modify -m sync -x '*.bak' cef

This time the task is identified by its tag instead of a directory.

Note the abbreviations of the mode and the task's tag: synchronize is reduced to sync (it can be cut down further to sy) and the tag is reduced from the full cef63f5e to cef for convenience and to save typing. Any part of the full word can be used as an abbreviation provided it is unique among all other full words of the same kind; otherwise the Mclone will bail out with an error.

Abbreviations are supported for operation modes, volume tags and task tags.
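The uniqueness rule can be sketched like this. The resolve function is a hypothetical name, and "any part of the full word" is read literally as a substring match; the real tool may be stricter (for instance, prefix-only). The second tag in the example is made up for illustration.

```ruby
# Hypothetical resolver: return the single full word containing `abbrev`, or raise.
def resolve(abbrev, words)
  matches = words.select { |word| word.include?(abbrev) }
  case matches.size
  when 1 then matches.first
  when 0 then raise ArgumentError, "no match for '#{abbrev}'"
  else raise ArgumentError, "ambiguous '#{abbrev}': #{matches.join(', ')}"
  end
end

MODES = %w[update synchronize copy move].freeze

puts resolve('sy', MODES)                   # prints "synchronize"
puts resolve('cef', %w[cef63f5e 0a1b2c3d])  # prints "cef63f5e"
```

With this reading, an abbreviation such as o would be rejected for the modes, since it occurs in synchronize, copy and move alike.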

Behold the changes

$ mclone info

# Mclone version 0.1.0

## Volumes

* [6bfa4a2d] :: (/data)
* [7443e311] :: (/mnt/backup)

## Intact tasks

* [cef63f5e] :: synchronize [6bfa4a2d](files) -> [7443e311](files) :: include ** :: exclude *.bak

4. Process the tasks

Once created, all intact tasks can be processed (sequentially) with the mclone task process command.

$ mclone task process

If specific tasks need to be processed, their (possibly abbreviated) tags are specified as command line arguments

$ mclone task process cef

Technically, to process a task the Mclone renders the full source and destination path names from the respective volume locations and relative paths, and passes them along with the other options to the Rclone, which does the actual processing.

That's it. No more need to determine (and type in) the current location of the backup directory and retype all those Rclone arguments on every occasion.
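The path rendering step might look roughly like the sketch below. The render_command function, the task hash layout and the mounts mapping are hypothetical; only the copy --update arguments come from the description of the update mode in this document. The command is built but not executed.

```ruby
# Hypothetical sketch of turning a task into an Rclone argument list.
# `mounts` maps volume tags to wherever the volumes are found right now.
def render_command(task, mounts, rclone = 'rclone')
  source = File.join(mounts.fetch(task[:src_volume]), task[:src_dir])
  destination = File.join(mounts.fetch(task[:dst_volume]), task[:dst_dir])
  [rclone, *task[:rclone_args], source, destination]
end

task = { src_volume: '6bfa4a2d', src_dir: 'files',
         dst_volume: '7443e311', dst_dir: 'files',
         rclone_args: %w[copy --update] }

# The same task yields different paths depending on where the storage is mounted.
puts render_command(task, { '6bfa4a2d' => '/data', '7443e311' => '/mnt/backup' }).join(' ')
puts render_command(task, { '6bfa4a2d' => '/data', '7443e311' => '/media/usb0' }).join(' ')
```

Since the task stores volume tags rather than absolute paths, nothing in it has to change when the portable storage shows up somewhere else.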

Advanced use case

Now back to the triple storage scenario outlined above.

Let S be a source storage from where the data needs to be backed up, D be a destination storage where the data is to be mirrored and P be a portable storage which serves as both an intermediate storage and a means of the S->D data propagation.

In this case the full data propagation graph is S->P->D.

1. Set up the S->P route

1.1. Plug the portable storage P into S's computer and mount it.

1.2. As shown in the basic use case, create S's and P's volumes, then create an S->P task.

1.3. Unplug P.

At this point S and P are separated and each carries its own copy of the S->P task.

2. Set up the P->D route

2.1. Plug the portable storage P into D's computer and mount it.

Note that at this point S->P is a stale task, as D's computer knows nothing about the S storage.

2.2. Create D's volume, then create a P->D task. Note that P at this point already contains a volume and therefore must not be formatted.

2.3. Unplug P.

Now S and D are formatted and carry the respective tasks. P contains its own copies of both S->P and P->D tasks.

3. Process the S->P->D route

3.1. Plug P into S's computer and mount it.

3.2. Process the intact tasks. In this case it is the S->P task (P->D is stale at this point).

3.3. Unplug P.

P now carries its own copy of the S's data.

3.4. Plug P into D's computer and mount it.

3.5. Process the intact tasks. In this case it is the P->D task (S->P is stale at this point).

3.6. Unplug P.

Voilà! Both P and D now carry a copy of the S's data.
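The whole S->P->D round trip can be simulated with plain hashes standing in for the storages. This is a toy model under stated assumptions: Route and process are hypothetical names, a single shared route list replaces the per-volume task copies, and "processing" is reduced to merging file tables.

```ruby
Route = Struct.new(:from, :to)

# Process every route whose two endpoints are attached; stale routes are skipped.
def process(routes, attached)
  routes.each do |route|
    next unless attached.key?(route.from) && attached.key?(route.to)

    attached[route.to].merge!(attached[route.from])
  end
end

source   = { 'photo.jpg' => 'v1' } # S
portable = {}                      # P
mirror   = {}                      # D
routes   = [Route.new(:S, :P), Route.new(:P, :D)]

process(routes, { S: source, P: portable }) # P at S's computer: only S->P is intact
process(routes, { P: portable, D: mirror }) # P at D's computer: only P->D is intact

puts mirror == source # prints "true"
```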

There may be more complex data propagation scenarios with multiple source and destination storages utilizing the portable storage in the above way.

Consider a two-way synchronization between two storages with a portable ferry which carries and propagates data in both directions.

What's next

On-screen help

Every mclone (sub)command has its own help page which can be shown with the --help option

$ mclone task create --help

Usage:
    mclone task create [OPTIONS] SOURCE DESTINATION

Parameters:
    SOURCE                   Source path
    DESTINATION              Destination path

Options:
    -m, --mode MODE          Operation mode (update | synchronize | copy | move) (default: "update")
    -i, --include PATTERN    Include paths pattern (default: "**")
    -x, --exclude PATTERN    Exclude paths pattern
    -f, --force              Insist on potentially dangerous activities (default: false)
    -n, --dry-run            Simulation mode with no on-disk modifications (default: false)
    -v, --verbose            Verbose operation (default: false)
    -h, --help               print help

File filtering

The Mclone passes its include and exclude options to the Rclone. The pattern format is an extended glob format (e.g. *.dat) described in detail in the corresponding Rclone documentation section.

Dry run

The Mclone respects the Rclone's dry run mode, which is activated with the --dry-run command line option. In this mode no volume (.mclone) files are ever touched (created or overwritten) during any operation. The Rclone is still run during task processing, but it is in turn supplied with this option.

Force mode

The Mclone will refuse to automatically perform certain actions which are considered dangerous, such as deleting a volume or overwriting an existing task. In this case the --force command line option should be used to proceed.

Task operation modes

Update

  • Copy source files which are newer than the destination's or have different size or checksum.

  • Do not delete destination files which are nonexistent in the source.

  • Do not copy source files which are older than the destination's.

This is the default refreshing mode, which is considered the least harmful with respect to unintentional data overwriting.

Rclone command: copy --update.

Synchronize

  • Copy source files which are newer than the destination's or have different size or checksum.

  • Delete destination files which are nonexistent in the source.

  • Copy source files which are older than the destination's.

This is the mirroring mode which makes destination completely identical to the source.

Rclone command: sync.

Copy

  • Copy source files which are newer than the destination's or have different size or checksum.

  • Do not delete destination files which are nonexistent in the source.

  • Copy source files which are older than the destination's.

This mode is much like synchronize, with the only difference that it does not delete files.

Rclone command: copy.

Move

  • Copy source files which are newer than the destination's or have different size or checksum.

  • Do not delete destination files which are nonexistent in the source.

  • Copy source files which are older than the destination's.

  • Delete source files after successful copy to the destination.

Rclone command: move.
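The four "Rclone command" lines above amount to a small lookup table, sketched here. The RCLONE_COMMANDS constant and the rclone_command function are hypothetical names; the values are taken verbatim from this section.

```ruby
# Mode-to-command table assembled from the "Rclone command" lines above.
RCLONE_COMMANDS = {
  'update'      => %w[copy --update],
  'synchronize' => %w[sync],
  'copy'        => %w[copy],
  'move'        => %w[move]
}.freeze

def rclone_command(mode)
  RCLONE_COMMANDS.fetch(mode) { raise ArgumentError, "unknown mode: #{mode}" }
end

puts rclone_command('update').join(' ') # prints "copy --update"
```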

The end

Cheers,

Oleg A. Khlybov