Pmux: pipeline multiplexer

Pmux is a lightweight file-based MapReduce system, written in Ruby. Applying the philosophy of Unix pipeline processing to distributed computing on a GlusterFS cluster, pmux provides a tool capable of handling large amounts of data stored in files.

Requirements

  • ruby 1.8.7, 1.9.2 or higher
  • msgpack-rpc
  • net-ssh, net-scp
  • gflocator
  • GlusterFS 3.3.0 or higher, native client (FUSE)

Install

on all GlusterFS server nodes:

gem install pmux

on the GlusterFS client node

gem install pmux
gem install gflocator
sudo gflocator

Each of GlusterFS server nodes, SSH with authentication key is required.

Usage

show status

$ pmux --status
host0.example.com: pmux 0.1.0, num_cpu=8, ruby 1.9.3

show status of remote machine

$ pmux --status -h host1.example.com
host1.example.com: pmux 0.1.0, num_cpu=2, ruby 1.9.3

distributed grep

$ pmux --mapper="grep PATTERN" /glusterfs/xxx/*.log