bio-gag is a biogem for detecting and correcting a particular type of error that occurs in particular versions of the IonTorrent sequencing kits:
Ion Xpress Template 100 Kit
Ion Xpress Template 200 Kit
Ion Sequencing 100 Kit
Ion Sequencing 200 Kit
Newer versions of these kits, starting with the “Ion PGM 200 Sequencing Kit”, do not appear to be affected by this error.
*gag error* is the term I've coined for an error that various people have observed with certain IonTorrent sequencing kits, though as far as I know it has not previously been characterised very well. The name comes from our observation that the errors seemed to occur at GAG positions in reads that were supposed to be GAAG. This biogem tries to find and fix these errors.
Errors that appear to be of this type were recently referred to in a benchtop sequencing platform comparison (Supplementary Figure 4):
There are also some more in-depth discussions about this on the (closed access) Ion Torrent forum:
To search for these errors, a pileup format file of aligned sequences is required. Such a file can be generated either from an assembly or by aligning reads to a reference, although bio-gag has only been tested on de novo assemblies produced with newbler. Note that it has not been fully optimised, owing to time constraints and to the fact that the errors appear to have been fixed in newer kits.
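As an illustration of the input, the sketch below parses one line of pileup text in Ruby and lists the insertions recorded in its read-bases column. It assumes the six-column samtools pileup layout (sequence name, 1-based position, reference base, coverage, read bases, base qualities). The names PileupColumn, parse_pileup_line and insertions are hypothetical and not part of bio-gag's API, and this is not necessarily how bio-gag itself reads the file.

```ruby
# Hypothetical helpers for illustration only; not part of bio-gag.
# Assumes the six-column samtools pileup layout.
PileupColumn = Struct.new(:ref_name, :position, :ref_base,
                          :coverage, :read_bases, :qualities)

# Split one tab-separated pileup line into its named fields.
def parse_pileup_line(line)
  fields = line.chomp.split("\t")
  PileupColumn.new(fields[0], fields[1].to_i, fields[2],
                   fields[3].to_i, fields[4].to_s, fields[5].to_s)
end

# In the read-bases column an insertion is written as +<length><bases>,
# e.g. "+1A". Return the inserted sequences, upper-cased.
# Simplified: deletions and read start/end markers are ignored.
def insertions(read_bases)
  found = []
  read_bases.scan(/\+(\d+)([ACGTNacgtn]+)/) do |len, bases|
    # The base run may include following read bases; keep only <length>.
    found << bases[0, len.to_i].upcase
  end
  found
end

column = parse_pileup_line("contig1\t42\tG\t4\t.,.+1A,+1a\tIIII")
inserted = insertions(column.read_bases)
puts "#{column.ref_name}:#{column.position} insertions: #{inserted.join(',')}"
```

A position where a noticeable share of reads carry an inserted A is the kind of signal one might look at when hunting for GAG sites that should read GAAG, though the thresholds and logic bio-gag actually applies are best taken from its source.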
gem install bio-gag
To use the script, the important options are these:
gag [options] <pileup_output>
At first, you probably want to just run it without any options. The output is a list of predicted sites at which the error occurs.
    --lookahead                  Work out whether gag predictions are supported by ORF predictions being extended [default is just to print out the gag errors found]. Usage is modified in this mode too, so it is probably best to look at the code if you use this option
    --fix CONSENSUS_FASTA_FILE   Find gag errors in the pileup file, correct them in CONSENSUS_FASTA_FILE, and print the fixed consensus to STDOUT
    -g, --gags GAG_FILE          Specify a list of gag errors to be fixed, in tab-separated form (use with --fix; the tab-separated list comes from the regular output or from --lookahead)
And some options for logging:
    --logger filename   Log to file (default STDERR)
    --trace options     Set log level (default INFO, see the bio-logger documentation at https://github.com/pjotrp/bioruby-logger-plugin)
    -q, --quiet         Run quietly
    -v, --verbose       Run verbosely
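The kind of correction that --fix is described as making (GAG restored to GAAG in the consensus) can be pictured with a toy Ruby sketch. This is not bio-gag's implementation: the method name fix_gags and the convention that each site is the 1-based position of the first G of a GAG are assumptions made purely for illustration.

```ruby
# Toy illustration only; not bio-gag's implementation.
# gag_starts: hypothetical 1-based positions of the first G of each
# GAG in the consensus that should really read GAAG.
def fix_gags(consensus, gag_starts)
  fixed = consensus.dup
  # Work from right to left so that an insertion never shifts the
  # coordinates of sites still to be processed.
  gag_starts.sort.reverse_each do |start|
    idx = start - 1
    unless fixed[idx, 3] == 'GAG'
      raise ArgumentError, "no GAG at position #{start}"
    end
    fixed.insert(idx + 2, 'A') # GAG -> GAAG
  end
  fixed
end

puts fix_gags('TTGAGCCGAGT', [3, 8])
```

Processing the sites from the highest position downwards avoids having to track how much each earlier insertion has lengthened the sequence.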
To use the library
The API doc is online. For more code examples see also the test files in the source tree.
Project home page
For information on the source tree, documentation, issues and how to contribute, see
This biogem has not yet been described in a publication, but a relevant manuscript is in the works.
This Biogem is published at biogems.info/index.html#bio-gag
Copyright © 2012 Ben J Woodcroft. See LICENSE.txt for further details.