Module: Bio
- Defined in:
- lib/bio/db/vcf.rb,
 lib/bio/db/sam.rb,
 lib/bio/db/pileup.rb,
 lib/bio/db/alignment.rb,
 lib/bio/db/sam/library.rb
Overview
Bio::DB::Pileup
A class representing information in SAMTools pileup format
- Author
- 
Dan MacLean
The pileup format is described at sourceforge.net/apps/mediawiki/samtools/index.php?title=SAM_FAQ#I_do_not_understand_the_columns_in_the_pileup_output. Briefly, when pileup is invoked with the -c option, each line has the following columns:
- (1) reference sequence name
- (2) reference coordinate
- (3) reference base, or ‘*’ for an indel line
- (4) genotype, where heterozygotes are encoded in the IUB code: M=A/C, R=A/G, W=A/T, S=C/G, Y=C/T and K=G/T; indels are indicated by, for example, */+A, -A/* or +CC/-C. There is no difference between */+A and +A/*.
- (5) Phred-scaled likelihood that the genotype is wrong, also called the ‘consensus quality’.
- (6) Phred-scaled likelihood that the genotype is identical to the reference, also called the ‘SNP quality’. Suppose the reference base is A and the alignment shows 17 G and 3 A. The consensus quality will be low, because it is difficult to distinguish an A/G heterozygote from a G/G homozygote. The SNP quality will be high, though, because the evidence for a SNP is very strong.
- (7) root mean square (RMS) mapping quality
- (8) number of reads covering the position
- (9) read bases at a SNP line (check the manual page for more information); the 1st indel allele otherwise
- (10) base quality at a SNP line; the 2nd indel allele otherwise
- (11) indel line only: number of reads directly supporting the 1st indel allele
- (12) indel line only: number of reads directly supporting the 2nd indel allele
- (13) indel line only: number of reads supporting a third indel allele
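As an illustration of columns 4 to 6, the sketch below decodes an IUB heterozygote code into its two alleles and converts a Phred-scaled value into a probability. The constant and method names (IUB_HETEROZYGOTES, genotype_alleles, phred_to_probability) are hypothetical helpers written for this example and are not part of Bio::DB::Pileup. It also assumes the usual Phred definition, P = 10^(-Q/10).

```ruby
# Hypothetical helpers for illustration; not part of Bio::DB::Pileup.

# IUB codes for heterozygous genotypes, as listed for column 4.
IUB_HETEROZYGOTES = {
  'M' => %w[A C],
  'R' => %w[A G],
  'W' => %w[A T],
  'S' => %w[C G],
  'Y' => %w[C T],
  'K' => %w[G T]
}.freeze

# Return the pair of alleles implied by a column-4 genotype code.
# Any code not in the table (for example a plain base such as "G")
# is treated as a homozygote. Indel genotypes are not handled here.
def genotype_alleles(code)
  base = code.upcase
  IUB_HETEROZYGOTES.fetch(base) { [base, base] }
end

# Convert a Phred-scaled quality Q into a probability,
# assuming the usual definition P = 10^(-Q/10).
def phred_to_probability(quality)
  10.0**(-quality / 10.0)
end

p genotype_alleles('R')
p phred_to_probability(30)
```

Under this definition, a consensus quality of 30 corresponds to roughly a 1 in 1000 chance that the called genotype is wrong.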
If pileup is invoked without ‘-c’, indel lines and columns 3 to 7 inclusive are not output.
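NB: mpileup uses the 6-column output format. An example line, with tab-separated fields, is “seq2\t151\tG\tG\t36\t0\t99\t12\t...........A\t:9<;;7=<<<<<” (this particular line has ten fields, matching the -c layout above). Pileup provides accessors for all columns (6- or 10-column format) and a few other useful methods.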
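To make the two layouts concrete, here is a minimal plain-Ruby sketch that splits a pileup line on tabs and names the fields according to how many columns it finds. It does not use Bio::DB::Pileup. The field names and the parse_pileup_line method are assumptions made for this example, and the class's real accessors may be named differently. The 6-column layout used here (name, coordinate, reference base, coverage, read bases, base qualities) is also an assumption, based on typical mpileup output. Indel lines with 13 columns are not handled.

```ruby
# Hypothetical field names; Bio::DB::Pileup's real accessors may differ.
TEN_COLUMN_FIELDS = %i[
  ref_name pos ref_base consensus consensus_quality snp_quality
  rms_mapq coverage read_bases read_quals
].freeze

# Assumed layout of a 6-column (mpileup-style) line.
SIX_COLUMN_FIELDS = %i[
  ref_name pos ref_base coverage read_bases read_quals
].freeze

# Fields that hold whole numbers rather than text.
INTEGER_FIELDS = %i[pos consensus_quality snp_quality rms_mapq coverage].freeze

# Split one pileup line on tabs and return a Hash of named fields.
# Raises ArgumentError unless the line has exactly 6 or 10 columns.
def parse_pileup_line(line)
  values = line.chomp.split("\t", -1)
  fields = case values.length
           when 10 then TEN_COLUMN_FIELDS
           when 6  then SIX_COLUMN_FIELDS
           else
             raise ArgumentError,
                   "expected 6 or 10 columns, got #{values.length}"
           end
  record = fields.zip(values).to_h
  # Convert numeric columns, forcing base 10 so "036" is not read as octal.
  INTEGER_FIELDS.each do |field|
    record[field] = Integer(record[field], 10) if record.key?(field)
  end
  record
end

line = "seq2\t151\tG\tG\t36\t0\t99\t12\t...........A\t:9<;;7=<<<<<"
record = parse_pileup_line(line)
puts "#{record[:ref_name]}:#{record[:pos]} coverage=#{record[:coverage]}"
```

Choosing the layout from the column count keeps a single entry point for both formats. A line with any other number of columns is rejected rather than guessed at.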
Defined Under Namespace
Classes: DB, NucleicAcid, Sequence