Pileup format files are a representation of an alignment/mapping of reads to a reference. This biogem builds on the bio-samtools biogem to create enable developers to iterate through columns of a pileup format file, and interrogate possible polymorphisms for e.g. SNP detection. Say we have the pile lines like so

contig00001 199 A   4   .$...$  >a^>
contig00001 200 T   2   ..+1A   aR


line = "contig00001\t199\tA\t4\t.$...$\t>a^>\ncontig00001\t200\tT\t2\t..+1A\taR"


piles =
piles[0].reads #=> An array of 4 pileup reads (Bio::DB::PileupIterator::PileupRead objects)

The first reads ends at the first position

piles[0].reads[0].sequence #=> 'A'

The second read covers both positions:

piles[0].reads[1].sequence #=> 'AT'

Note that when you don't use to_a, instead using Bio::DB::PileupIterator#each, there is no "lookahead" (yet), so it doesn't find the T before it has iterated over it:{|pile| puts pile.reads[1].sequence if pile.pos==199} #=> "A"


piles[0].reads[1].direction #=> '+'


piles[1].reads[1].insertions #=> {200=>"A"}

Apologies in advance for any missing features (e.g. currently it does handle deletions) and slowness (it wasn't really written with speed in mind).


    gem install bio-pileup_iterator


To use the library

    require 'bio-pileup_iterator'

The API doc is online. For more code examples see also the test files in the source tree.

Project home page

Information on the source tree, documentation, issues and how to contribute, see


If you use this software, please cite

This Biogem is published at

Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.