OSwitch

One-line access to other operating systems.

Background

Genomic analyses rely on many bioinformatics tools, from assessing the quality of sequence data and assemblies to annotation, comparison and downstream analysis. The data types are young, and so are the tools: they are updated frequently yet are often challenging to install. Furthermore, software updates often involve changes in algorithms or input/output formats, making analyses difficult to reproduce. To make matters worse, many genomicists lack the computational training needed to set up complex bioinformatics software, and systems administrators can be overwhelmed by large numbers of software installation requests and the challenges of managing multiple versions.

Aim & Features

We are developing oswitch to enable seamless switching from one operating system to another, providing access to a diverse range of tools. This project grew from our own need to rapidly access software distributed as part of BioLinux on our MacBooks and our university HPC system.

For this we take advantage of docker. Docker builds an image "to the specification" from a Dockerfile; the image is then run in an isolated "container". Dockerfiles and the resulting images can be kept indefinitely, are easily shared or published, and make it possible for anybody to recreate the exact same setup at any point in the future. This is similar to using virtual machine images, but much more flexible and lightweight.

oswitch is thus a wrapper facilitating access to docker images (without the need for ssh-ing). Importantly, when switching operating systems inside a shell, most things remain unchanged:

  • Current working directory is maintained
  • User name, uid and gid are maintained
  • Login shell (bash/zsh/fish) is maintained
  • Home directory is maintained (thus all .dotfiles and config files are maintained).
  • Read/write permissions are maintained
  • Paths are maintained whenever possible. Thus volumes (external drives, NAS) mounted on the host are available in the container at the same path.
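
The behaviour listed above can be approximated with plain docker run flags. The sketch below is not oswitch's implementation: it only prints a docker run command that would carry the caller's uid, gid, home directory and working directory into a container. The function name build_docker_cmd is hypothetical, and the sketch assumes the working directory lies under the home directory. A real container would also need a matching user entry and login shell, which are left out here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a `docker run` command that keeps
# the caller's uid/gid, home directory and current working directory.
# Assumption: $PWD is inside $HOME, so mounting $HOME is enough.
# Paths containing spaces would need extra quoting before the command is run.
build_docker_cmd() {
  local image="$1"
  printf 'docker run --rm -it -u %s:%s -e HOME=%s -v %s:%s -w %s %s\n' \
    "$(id -u)" "$(id -g)" "$HOME" "$HOME" "$HOME" "$PWD" "$image"
}

build_docker_cmd ubuntu:14.04
```

Here -u sets the user and group ids, -v mounts the home directory at the same path, and -w sets the working directory inside the container.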

Example Usage

There are two broad usage scenarios: interactive use & non-interactive use.

Use a package interactively in a normal command-line

Minimalist example:

Yannick@n56-169 ~/g/oswitch> uname -a
Darwin n56-169.sbcs.qmul.ac.uk 14.0.0 Darwin Kernel Version 14.0.0: Fri Sep 19 00:26:44 PDT 2014; root:xnu-2782.1.97~2/RELEASE_X86_64 x86_64
Yannick@n56-169 ~/g/oswitch> oswitch yeban/biolinux:8
### You are now running: biolinux_8, in container: biolinux_8-27182. ###
Yannick@biolinux_8-27182 ~/g/oswitch> uname -a
Linux biolinux_8-27182 3.16.4-tinycore64 #1 SMP Thu Oct 23 16:14:24 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Biologically relevant example:

# Trying to run blast.
pixel:~/test/ $ ls 
mygene.fa
pixel:~/test/ $ cat mygene.fa
>myfavoritegene isthisone
MNTLWLSLWDYPGKLPLNFMVFDTKDDLQAAYWRDPYSIPLAVIFEDPQPISQRLIYEIR
TNPSYTLPPPPTKLYSAPISCRKNKTGHWMDDILSIKTGESCPVNNYLHSGFLALQMITD
ITKIKLENSDVTIPDIKLIMFPKEPYTADWMLAFRVVIPLYMVLALSQFITYLLILIVGE
KENKIKEGMKMMGLNDSVF
pixel:~/test/ $ blastp -query mygene.fa -remote -db nr -outfmt 7 > mygene_blastp_nr.tab
zsh: command not found: blastp
# Indeed... blastp is not installed on my MacBook. 

# Switch to BioLinux and run blastp.
pixel:~/test/ $ oswitch biolinux
###### You are now running: biolinux in container biolinux-7187. ######
biolinux-7187:~/test/ $ blastp -query mygene.fa -remote -db nr -outfmt 7 >  mygene_blastp_nr.tab
# BioLinux includes blastp, thus the command ran smoothly.

# View the result.
biolinux-7187:~/test/ $ head mygene_blastp_nr.tab
# BLASTP 2.2.28+
# Query: myfavoritegene isthisone
# RID: BJAHAHU9015
# Database: nr
# Fields: query id, subject id, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score
# 501 hits found
myfavoritegene  gi|322796550|gb|EFZ19024.1| 100.00  199 0   0   1   199 1   199 2e-142   407
myfavoritegene  gi|307183032|gb|EFN69988.1| 86.07   201 25  2   1   199 80  279 6e-115   361
myfavoritegene  gi|572260155|ref|XP_006608402.1|    80.60   201 36  2   1   199 95  294 4e-108   350
myfavoritegene  gi|328778864|ref|XP_397465.4|   80.60   201 36  2   1   199 95  294 5e-108   350


# [... potentially run other analyses that require biolinux things...]

# Return to normal operating system
biolinux-7187:~/test/ $ exit
pixel:~/test/ $ ls 
mygene.fa mygene_blastp_nr.tab

Use a package non-interactively

Alternatively, single commands can be run directly in a container (e.g. BioLinux) without entering it interactively. This can be useful to test new tools, or to run a single piece of software that is not installed locally as part of a single command. The container terminates automatically once the command has been executed, output is printed to the terminal and can be redirected, and the exit status of the command run within the container is returned.
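
Because the exit status is returned, the non-interactive form can be used in ordinary shell conditionals. A minimal sketch: the function name run_or_report is hypothetical, and it wraps whatever command it is given rather than anything specific to oswitch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the given command and, if it fails, report its
# exit status on stderr. The original exit status is passed on to the caller.
run_or_report() {
  "$@"
  local rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "command failed with exit status $rc: $*" >&2
  fi
  return "$rc"
}

# Example use (requires oswitch and docker, so it is left commented out):
# run_or_report oswitch biolinux blastp -remote -query mygene.fa -db nr > mygene_blastp_nr.txt
```

Only standard output is redirected in the example, so the failure message still reaches the terminal.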

# Run command directly in BioLinux and view results if success.
pixel:~/test/ $ oswitch biolinux blastp -remote -query mygene.fa -db nr > mygene_blastp_nr.txt

Listing available operating system containers

OSwitch can pull any image from docker hub. To list the images you have pulled from docker hub using oswitch:

pixel:~ $ oswitch -l
yeban/biolinux:8
ubuntu:14.04

Availability

OSwitch has been tested on:

  • Mac OS X Yosemite
  • Ubuntu 14.04.1
  • CentOS 7

Caveats

  • Works only for Debian-, Ubuntu- and CentOS-based docker images.
  • Host directories/volumes with paths conflicting with container paths are skipped.
  • SELinux must be disabled on CentOS for mounting volumes to work.

Installation

OSwitch first requires a working docker install.

Install and setup docker

Mac OS X

Installing docker - https://docs.docker.com/installation/mac/

Ubuntu

Installing docker - https://docs.docker.com/installation/ubuntulinux/

Add yourself to the docker group so you can run the docker client without sudo:

    $ sudo usermod -aG docker `whoami`

    # then log out and log in again for the above command to take effect

CentOS

Installing docker - https://docs.docker.com/installation/centos/

Add yourself to the docker group so you can run the docker client without sudo:

    $ sudo usermod -aG docker `whoami`

    # then log out and log in again for the above command to take effect

Disable SELinux as it gets in the way of mounting volumes within the container:

    $ sudo sed -i.bak 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

    # then reboot your system

The above command backs up the original file to /etc/selinux/config.bak. If you are concerned about disabling SELinux, do note that we are trying to work out a better solution.
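
To confirm the setting after the reboot, the configured mode can be read back from the config file. This is a minimal sketch: the function name selinux_config_mode is hypothetical, and getenforce is used only if it happens to be installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SELINUX= mode set in a config file
# (defaults to /etc/selinux/config). Returns non-zero if the file is
# unreadable; prints nothing if the file has no SELINUX= line.
selinux_config_mode() {
  local config="${1:-/etc/selinux/config}"
  [ -r "$config" ] || return 1
  # Match only "SELINUX=...", not "SELINUXTYPE=..." or commented-out lines.
  sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$config" | tail -n 1
}

# The mode of the running system, where getenforce is available.
if command -v getenforce >/dev/null 2>&1; then
  echo "current mode: $(getenforce)"
fi
echo "configured mode: $(selinux_config_mode || echo unknown)"
```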

Test that docker is correctly installed

The following should give an encouraging message:

$ docker run hello-world
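
If this fails with a permission error on Linux, one thing worth checking is whether the current login session has picked up the docker group membership added earlier. A minimal sketch, where the function name in_group is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if group $1 appears in the space-separated
# group list $2 (defaults to the current user's groups from `id -nG`).
in_group() {
  local wanted="$1"
  local list="${2:-$(id -nG)}"
  case " $list " in
    *" $wanted "*) return 0 ;;
    *) return 1 ;;
  esac
}

# The docker group only matters on Linux hosts.
if [ "$(uname -s)" = "Linux" ]; then
  if in_group docker; then
    echo "current session is in the docker group"
  else
    echo "not in the docker group yet; log out and log in again after usermod" >&2
  fi
fi
```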

Install oswitch

Requirements: Ruby 2.0 or higher.
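
The Ruby requirement can be checked before installing. The sketch below is one way to do it: the function name version_at_least is hypothetical, and it compares only the major and minor components of versions written as X.Y or X.Y.Z.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 is at least version $2.
# Both arguments must contain at least "major.minor" (e.g. 2.0 or 2.1.5).
version_at_least() {
  local have_major have_minor want_major want_minor
  have_major="${1%%.*}"; have_minor="${1#*.}"; have_minor="${have_minor%%.*}"
  want_major="${2%%.*}"; want_minor="${2#*.}"; want_minor="${want_minor%%.*}"
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

if command -v ruby >/dev/null 2>&1; then
  ruby_version="$(ruby -e 'print RUBY_VERSION')"
  if version_at_least "$ruby_version" 2.0; then
    echo "Ruby $ruby_version is recent enough"
  else
    echo "Ruby $ruby_version is older than 2.0" >&2
  fi
else
  echo "ruby not found in PATH" >&2
fi
```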

$ gem install oswitch

Testing oswitch

$ oswitch ubuntu:14.04

FAQ

Q. Directories mounted within container on Mac host are empty.

On a Mac, the real docker host is the boot2docker VM, not OS X, so oswitch can mount only what boot2docker itself can see. Take /Applications as an example.

Run boot2docker ssh ls /Applications and you will find it empty as well.

The workaround is to correctly mount the directories you want in boot2docker first.

boot2docker down
VBoxManage sharedfolder remove boot2docker-vm --name Applications
VBoxManage sharedfolder add boot2docker-vm --name Applications --hostpath /Applications
boot2docker up
boot2docker ssh "sudo mkdir -p /Applications && sudo mount -t vboxsf -o uid=1000,gid=50 Applications /Applications"

Q. cwd is empty in the container

This means the directory was not mounted by oswitch, or was mounted incorrectly. On a Linux host, directories whose paths conflict with paths inside the container are not mounted. On a Mac, boot2docker can get in the way.

Please report this on our issue tracker. To help us debug, please include:

  1. the directory in question
  2. the operating system you are running

Q. oswitch does not work with my docker image

Please report this on our issue tracker with oswitch's output. If the image you are using is not available via docker hub or another public repository, please include the Dockerfile as well.

Roadmap

  1. ~~make it possible to use docker containers without inheriting our current baseimage~~
  2. ~~gem distribution for easier installation~~
  3. brew recipe for Mac
  4. test on QMUL's compute cluster
  5. create an SELinux policy to run oswitch on CentOS without having to disable SELinux entirely
  6. rpm and deb packages
  7. make available images for common bioinformatics software
  8. deploy at RAL/JASMIN

Contribute

$ git clone https://github.com/yeban/oswitch
$ cd oswitch
$ gem install bundler && bundle
$ bundle exec bin/oswitch biolinux

Contributors & Funding


Development is funded as part of NERC EOS Cloud at the Wurm Lab, Queen Mary University of London.