Pwrake

Parallel Workflow extension for Rake, runs on multicores, clusters, clouds.

  • Author: Masahiro Tanaka

README in Japanese, GitHub Repository, RubyGems

Features

  • Pwrake executes a workflow written in a Rakefile in parallel.
    • The Rakefile specification is the same as in Rake.
    • Tasks that do not depend on each other are automatically executed in parallel.
    • Rake's multitask, its parallel task definition, is no longer necessary.
  • Parallel and distributed execution is possible on a computer cluster consisting of multiple compute nodes.
    • Cluster requirements: SSH login (or MPI), and a directory shared through a shared file system, e.g., NFS or Gfarm.
    • Pwrake automatically connects to remote hosts using SSH. You do not need to start a daemon.
    • Remote host names and the number of cores to use are provided in a hostfile.
  • The Gfarm file system utilizes the storage of compute nodes and provides high-performance parallel I/O.
    • Parallel I/O access to the local storage of compute nodes enables I/O performance to scale.
    • Gfarm selects the compute node whose local storage will hold each output file.
    • Pwrake schedules each task on a compute node where its input files are stored.
    • Other Gfarm support: automatic mounting of the Gfarm file system, etc.
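
To illustrate what "tasks that do not depend on each other" means, here is a minimal Ruby sketch that groups a small workflow into waves of tasks that could run at the same time. The task names, the DEPENDENCIES table and the parallel_waves helper are hypothetical and are not part of Pwrake; Pwrake's own scheduler is more elaborate and is not reproduced here.

```ruby
# Hypothetical workflow: task name => names of the tasks it depends on.
DEPENDENCIES = {
  "a.o"  => ["a.c"],
  "b.o"  => ["b.c"],
  "prog" => ["a.o", "b.o"],
  "a.c"  => [],
  "b.c"  => [],
}.freeze

# Group tasks into "waves": every task in a wave depends only on tasks
# from earlier waves, so the tasks inside one wave could run concurrently.
def parallel_waves(deps)
  remaining = deps.keys
  done = []
  waves = []
  until remaining.empty?
    ready = remaining.select { |task| (deps[task] - done).empty? }
    raise ArgumentError, "circular or unknown dependency" if ready.empty?
    waves << ready.sort
    done.concat(ready)
    remaining -= ready
  end
  waves
end

parallel_waves(DEPENDENCIES).each_with_index do |wave, i|
  puts "wave #{i + 1}: #{wave.join(', ')}"
end
```

In this sketch a.o and b.o land in the same wave because neither depends on the other, which is the situation where no multitask declaration is needed.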

Installation

Install with RubyGems:

$ gem install pwrake

Or download the source tgz/zip, expand it, cd to the subdirectory, and install:

$ ruby setup.rb

In the latter case, you need to install the Parallel gem manually. Pwrake requires it to obtain the processor count.

If you use rbenv, your system may fail to find the pwrake command after installation:

-bash: pwrake: command not found

In this case, you need to rehash the command paths:

$ rbenv rehash

Usage

Parallel execution using 4 cores on localhost:

$ pwrake -j 4

Parallel execution using all cores on localhost:

$ pwrake -j

Parallel execution using a total of 2*2 cores on 2 remote hosts:

  1. Share your directory among the remote hosts via a distributed file system such as NFS or Gfarm.
  2. Enable SSH access without a passphrase prompt in one of two ways:
    • Add a passphrase-less key generated by ssh-keygen. (Be careful)
    • Register your key's passphrase with ssh-agent using ssh-add.
  3. Create a hosts file that lists the remote host names and the number of cores to use on each:

    $ cat hosts
    host1 2
    host2 2
    
  4. Run pwrake with the --hostfile or -F option:

    $ pwrake -F hosts
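
The hosts file format above ("hostname number-of-cores", one host per line) can be read with a few lines of Ruby. This is only a sketch of the format, not Pwrake's own parser: the parse_hostfile helper is hypothetical, and treating a missing core count as 1 is an assumption.

```ruby
# Parse hostfile text of the form "hostname ncore", one host per line.
# Assumption: blank lines are skipped and a missing core count means 1;
# Pwrake's real parser may accept more syntax than this.
def parse_hostfile(text)
  text.each_line.with_object({}) do |line, hosts|
    name, ncore = line.split
    next if name.nil?
    hosts[name] = ncore ? Integer(ncore) : 1
  end
end

HOSTFILE_TEXT = <<~HOSTS
  host1 2
  host2 2
HOSTS

hosts = parse_hostfile(HOSTFILE_TEXT)
puts "#{hosts.size} hosts, #{hosts.values.sum} cores in total"
```

For the example file this reports 2 hosts and 4 cores, matching the "2*2 cores" in the description.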
    

Use MPI to start remote workers

  1. Set up MPI on your cluster.
  2. Install the MPipe gem. (requires mpicc)
  3. Run the pwrake-mpi command.

    $ pwrake-mpi -F hosts
    

Options

Pwrake command line options (in addition to Rake options)

-F, --hostfile FILE              [Pw] Read hostnames from FILE
-j, --jobs [N]                   [Pw] Number of threads at localhost (default: # of processors)
-L, --log, --log-dir [DIRECTORY] [Pw] Write log to DIRECTORY
    --ssh-opt, --ssh-option OPTION
                                 [Pw] Option passed to SSH
    --filesystem FILESYSTEM      [Pw] Specify FILESYSTEM (nfs|gfarm)
    --gfarm                      [Pw] FILESYSTEM=gfarm
-A, --disable-affinity           [Pw] Turn OFF affinity (AFFINITY=off)
-S, --disable-steal              [Pw] Turn OFF task steal
-d, --debug                      [Pw] Output Debug messages
    --pwrake-conf [FILE]         [Pw] Pwrake configuration file in YAML
    --show-conf, --show-config   [Pw] Show Pwrake configuration options
    --report LOGDIR              [Pw] Generate `report.html' (Report of workflow statistics) in LOGDIR and exit.
    --report-image IMAGE_TYPE    [Pw] Gnuplot output format (png,jpg,svg etc.) in report.html.
    --clear-gfarm2fs             [Pw] Clear gfarm2fs mountpoints left after failure.

pwrake_conf.yaml

  • If pwrake_conf.yaml exists in the current directory, Pwrake reads options from it.
  • Example (in YAML form):

    HOSTFILE: hosts
    LOG_DIR: true
    DISABLE_AFFINITY: true
    DISABLE_STEAL: true
    FAILED_TARGET: delete
    PASS_ENV:
     - ENV1
     - ENV2
    
  • Option list:

    HOSTFILE, HOSTS   nil(default, localhost)|filename
    LOG_DIR, LOG      nil(default, No log output)|true(dirname="Pwrake%Y%m%d-%H%M%S")|dirname
    LOG_FILE          default="pwrake.log"
    TASK_CSV_FILE     default="task.csv"
    COMMAND_CSV_FILE  default="command.csv"
    GC_LOG_FILE       default="gc.log"
    WORK_DIR          default=$PWD
    FILESYSTEM        default(autodetect)|gfarm
    SSH_OPTION        SSH option
    PASS_ENV          (Array) Environment variables passed to SSH
    HEARTBEAT         default=240 - Heartbeat interval in seconds
    RETRY             default=1 - The number of retries
    FAILED_TARGET     rename(default)|delete|leave - Treatment of failed target files
    FAILURE_TERMINATION wait(default)|kill|continue - Behavior of other tasks when a task fails
    QUEUE_PRIORITY          LIHR(default)|FIFO|LIFO|RANK
    NOACTION_QUEUE_PRIORITY FIFO(default)|LIFO|RAND
    SHELL_START_INTERVAL    default=0.012 (sec)
    GRAPH_PARTITION         false(default)|true
    REPORT_IMAGE            default=png
    
  • Options for Gfarm system:

    DISABLE_AFFINITY    default=false
    DISABLE_STEAL       default=false
    GFARM_BASEDIR       default="/tmp"
    GFARM_PREFIX        default="pwrake_$USER"
    GFARM_SUBDIR        default='/'
    MAX_GFWHERE_WORKER  default=8
    GFARM2FS_OPTION     default=""
    GFARM2FS_DEBUG      default=false
    GFARM2FS_DEBUG_WAIT default=1
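
As a rough illustration of how the example file and the option list fit together, the following Ruby sketch loads the example YAML, lays it over a handful of the documented defaults, and expands LOG_DIR: true into the documented "Pwrake%Y%m%d-%H%M%S" directory name. The resolve_conf helper and the DEFAULTS table are hypothetical; this is not how Pwrake itself is implemented, only a reading of the tables above.

```ruby
require "yaml"

# The example configuration, written as it would appear in pwrake_conf.yaml.
EXAMPLE_CONF = <<~YAML
  HOSTFILE: hosts
  LOG_DIR: true
  DISABLE_AFFINITY: true
  DISABLE_STEAL: true
  FAILED_TARGET: delete
  PASS_ENV:
   - ENV1
   - ENV2
YAML

# A few defaults copied from the option list; not the complete set.
DEFAULTS = {
  "LOG_FILE"      => "pwrake.log",
  "HEARTBEAT"     => 240,
  "RETRY"         => 1,
  "FAILED_TARGET" => "rename",
}.freeze

# Hypothetical helper: merge file settings over the defaults and expand
# LOG_DIR=true into the timestamped directory name from the option list.
def resolve_conf(yaml_text, now = Time.now)
  conf = DEFAULTS.merge(YAML.safe_load(yaml_text))
  conf["LOG_DIR"] = now.strftime("Pwrake%Y%m%d-%H%M%S") if conf["LOG_DIR"] == true
  conf
end

conf = resolve_conf(EXAMPLE_CONF, Time.new(2017, 1, 2, 3, 4, 5))
puts conf["LOG_DIR"]        # Pwrake20170102-030405
puts conf["FAILED_TARGET"]  # delete (overrides the default "rename")
puts conf["RETRY"]          # 1 (default, not set in the file)
```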
    

Task Properties

  • Task properties are specified in desc strings placed above the task definition in the Rakefile.

Example of Rakefile:

desc "ncore=4 allow=ourhost*" # In original Rake, desc has no effect on a rule; Pwrake uses it for task properties.
rule ".o" => ".c" do
  sh "..."
end

(1..n).each do |i|
  desc "ncore=2 steal=no" # desc must be inside the loop because it applies only to the next task.
  file "task#{i}" do
    sh "..."
  end
end

Properties (the leftmost value is the default):

ncore=integer     - The number of cores used by this task.
exclusive=no|yes  - Execute this task exclusively on a single node.
allow=hostname    - Allow this host to execute this task. (accepts wildcards)
deny=hostname     - Deny this host from executing this task. (accepts wildcards)
order=deny,allow|allow,deny - The order of evaluation.
steal=yes|no      - Allow task stealing for this task.
retry=integer     - The number of retries for this task.
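
The desc strings are plain "key=value" words, so their structure is easy to see in code. The sketch below splits such a string into a hash and checks a host name against allow/deny wildcards using Ruby's File.fnmatch. Both helpers are hypothetical and do not come from Pwrake; the matching rule (deny wins, then allow must match if present) is an assumption, and the order property is not modeled.

```ruby
# Hypothetical parser: split a desc string such as "ncore=4 allow=ourhost*"
# into a { "key" => "value" } hash. Words without "=" are ignored.
def parse_properties(desc)
  desc.split.each_with_object({}) do |word, props|
    key, value = word.split("=", 2)
    props[key] = value if value
  end
end

# Assumed semantics: allow/deny patterns are shell-style wildcards, matched
# here with File.fnmatch. A host is rejected if it matches deny, and
# otherwise accepted when there is no allow pattern or allow matches.
def host_allowed?(props, host)
  return false if props["deny"] && File.fnmatch(props["deny"], host)
  return true unless props["allow"]
  File.fnmatch(props["allow"], host)
end

props = parse_properties("ncore=4 allow=ourhost*")
puts props["ncore"]                      # 4
puts host_allowed?(props, "ourhost01")   # true
puts host_allowed?(props, "otherhost")   # false
```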

Note for Gfarm

  • The gfwhere-pipe script (included in Pwrake) is used for file-affinity scheduling. This script requires Ruby/FFI (https://github.com/ffi/ffi). Install FFI with:

    gem install ffi
    

Scheduling with Graph Partitioning

Current version

  • Pwrake version 2.2.0

Tested Platform

  • Ruby 2.4.0
  • Rake 12.0.0
  • CentOS 7.3

Acknowledgment

This work is supported by: