Class: Bio::DB::Pileup

Inherits:
Object
Defined in:
lib/bio/util/bio-gngm.rb

Overview

Extends the Bio::DB::Pileup class from bio-samtools with additional methods. A pileup object represents a single position (one line) in the SAMtools pileup format described at samtools.sourceforge.net/pileup.shtml. These extension methods are used internally by Bio::Util::Gngm and are not exposed through the Bio::Util::Gngm interface.

Instance Attribute Details

#second_non_ref_count ⇒ Object

Count of the second most frequent non-reference base at this position. Set by a call to Bio::DB::Pileup#discordant_chastity.



# File 'lib/bio/util/bio-gngm.rb', line 46

def second_non_ref_count
  @second_non_ref_count
end

#third_non_ref_count ⇒ Object

Count of the third most frequent non-reference base at this position. Set by a call to Bio::DB::Pileup#discordant_chastity.



# File 'lib/bio/util/bio-gngm.rb', line 46

def third_non_ref_count
  @third_non_ref_count
end

#top_non_ref_count ⇒ Object

Count of the most frequent non-reference base at this position. Set by a call to Bio::DB::Pileup#discordant_chastity.



# File 'lib/bio/util/bio-gngm.rb', line 46

def top_non_ref_count
  @top_non_ref_count
end

Instance Method Details

#discordant_chastity ⇒ Object

Calculates the discordant chastity statistic as defined in Austin et al. (2011); see bar.utoronto.ca/ngm/description.html and onlinelibrary.wiley.com/doi/10.1111/j.1365-313X.2011.04619.x/abstract

Austin et al. (2011) Next-generation mapping of Arabidopsis genes. Plant Journal 67(4):715-725.

Briefly, the statistic measures the degree of difference between the SNP and the expected reference base. Using the mapping information comprising a SNP, the most frequent base that is not the reference base is compared to the next most common base after it. (from bar.utoronto.ca/ngm/description.html)

Returns 0.0 when the position has no non-reference bases and 1.0 when the most frequent non-reference base accounts for the full coverage. As a side effect, sets #top_non_ref_count, #second_non_ref_count and #third_non_ref_count.



# File 'lib/bio/util/bio-gngm.rb', line 56

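def discordant_chastity
  # Sort the non-reference bases by count, most frequent first
  arr = self.non_refs.to_a.sort {|a,b| b.last <=> a.last }
  @top_non_ref_count, @second_non_ref_count, @third_non_ref_count = arr.collect {|c| c.last}
  case
  # no non-reference bases at this position
  when self.non_ref_count == 0 then 0.0
  # the top non-reference base accounts for the full coverage
  when @top_non_ref_count == @coverage then 1.0
  # otherwise compare the top base with the runner-up, or with coverage if there is no runner-up
  when @second_non_ref_count > 0 then @top_non_ref_count.to_f / (@top_non_ref_count + @second_non_ref_count).to_f
  else @top_non_ref_count.to_f / @coverage.to_f
  end
end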
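
To make the branches of the calculation concrete, here is a minimal standalone sketch that mirrors the listing using plain Ruby values. The helper name discordant_chastity_for and its arguments are hypothetical and not part of bio-gngm; it also assumes that non_ref_count is the sum of all non-reference base counts.

```ruby
# Hypothetical helper for illustration only; not part of bio-gngm or bio-samtools.
# non_refs: Hash of non-reference base => count, coverage: read depth at the position.
def discordant_chastity_for(non_refs, coverage)
  # Take the two largest non-reference counts, defaulting to 0 if absent.
  top, second = non_refs.values.sort.reverse
  top ||= 0
  second ||= 0
  # Assumption: non_ref_count is the total of all non-reference counts.
  non_ref_total = non_refs.values.sum

  if non_ref_total.zero?
    0.0                              # nothing differs from the reference
  elsif top == coverage
    1.0                              # the top base accounts for the full coverage
  elsif second > 0
    top.to_f / (top + second)        # top base versus the runner-up
  else
    top.to_f / coverage              # single non-reference base versus coverage
  end
end

no_alt  = discordant_chastity_for({ :C => 0,  :G => 0, :T => 0 }, 10)  # 0.0
all_alt = discordant_chastity_for({ :C => 10, :G => 0, :T => 0 }, 10)  # 1.0
mixed   = discordant_chastity_for({ :C => 6,  :G => 2, :T => 0 }, 10)  # 6 / (6 + 2) = 0.75
single  = discordant_chastity_for({ :C => 4,  :G => 0, :T => 0 }, 10)  # 4 / 10 = 0.4
```

Note that when a second non-reference base is present the denominator is the sum of the top two counts rather than the coverage, so reads matching the reference do not lower the score in that branch.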

#is_snp?(opts) ⇒ Boolean

Returns true if self is a SNP with a minimum coverage depth of :min_depth and a minimum number of non-reference bases of :min_non_ref_count. Returns false for every position where the reference base is N or n if :ignore_reference_n is set to true.

Options and Defaults:

  • :min_depth => 2

  • :min_non_ref_count => 2

  • :ignore_reference_n => false

Example

  pileup.is_snp?(:min_depth => 5, :min_non_ref_count => 2)
  pileup.is_snp?(:min_depth => 5, :min_non_ref_count => 1, :ignore_reference_n => true)

Returns:

  • (Boolean)


# File 'lib/bio/util/bio-gngm.rb', line 78

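def is_snp?(opts)
  # indel lines in the pileup format use '*' as the reference base
  return false if self.ref_base == '*'
  #return false unless is_ct
  # parenthesised so that the N/n test only applies when :ignore_reference_n is set
  return false if opts[:ignore_reference_n] and (self.ref_base == "N" or self.ref_base == "n")
  return true if self.coverage >= opts[:min_depth] and self.non_ref_count >= opts[:min_non_ref_count]
  false
end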
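
The listing above takes opts without a default value and reads the options directly, so the documented defaults presumably have to be supplied by the caller. The following minimal sketch shows one way the defaults could be merged in before the checks run. PileupLike, SNP_DEFAULTS and snp_like? are hypothetical names used only for illustration; they are not part of bio-gngm or bio-samtools.

```ruby
# Hypothetical stand-in for Bio::DB::Pileup, exposing only the fields the check needs.
PileupLike = Struct.new(:ref_base, :coverage, :non_ref_count)

# The defaults listed in the documentation above.
SNP_DEFAULTS = {
  :min_depth          => 2,
  :min_non_ref_count  => 2,
  :ignore_reference_n => false
}.freeze

# Hypothetical helper: the same checks as the listing, with defaults merged in.
def snp_like?(pileup, opts = {})
  opts = SNP_DEFAULTS.merge(opts)
  # '*' marks an indel line, which is never reported as a SNP.
  return false if pileup.ref_base == '*'
  # Skip positions with an N/n reference base only when asked to.
  return false if opts[:ignore_reference_n] && pileup.ref_base.upcase == 'N'
  pileup.coverage >= opts[:min_depth] &&
    pileup.non_ref_count >= opts[:min_non_ref_count]
end

position = PileupLike.new('n', 5, 2)
with_defaults = snp_like?(position)                               # N check is off by default
ignoring_n    = snp_like?(position, :ignore_reference_n => true)  # N check is on
```

Merging the defaults first means a call with no options, or with only some of them, never compares a count against nil.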