Class: Bio::Sequence::Kmer

Inherits:
Object
Defined in:
lib/bio-kmer_counter/kmer_counter.rb

Class Method Summary

Class Method Details

.empty_full_kmer_hash(k = 4) ⇒ Object

Return a hash that maps every kmer String of length k to 0. For instance, empty_full_kmer_hash(1) #=> {'A'=>0, 'T'=>0, 'C'=>0, 'G'=>0}



# File 'lib/bio-kmer_counter/kmer_counter.rb', line 17

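def self.empty_full_kmer_hash(k=4)
  # Cache one template hash per k; callers always receive a copy.
  @empty_full_hashes ||= {}
  return @empty_full_hashes[k].dup if @empty_full_hashes.key?(k)

  ordered_possibilities = %w(A T C G)
  keys = ordered_possibilities
  # Append each base to every key until the keys are k characters long.
  (k-1).times do
    keys = keys.collect{|prefix| ordered_possibilities.collect{|n| "#{prefix}#{n}"}}.flatten
  end

  counts = {}
  keys.each do |key|
    counts[key] = 0
  end
  @empty_full_hashes[k] = counts
  counts.dup
end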
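
As a rough illustration of the key set this method builds, the sketch below enumerates every kmer with Ruby's built-in Array#repeated_permutation. The helper name all_kmer_counts is hypothetical and is not part of Bio::Sequence::Kmer; it only assumes the same A, T, C, G base ordering used above.

```ruby
# Hypothetical standalone helper, not part of Bio::Sequence::Kmer.
def all_kmer_counts(k)
  bases = %w(A T C G)
  counts = {}
  # repeated_permutation yields every length-k arrangement of the bases
  # (repeats allowed), varying the last position fastest.
  bases.repeated_permutation(k) { |combo| counts[combo.join] = 0 }
  counts
end

counts = all_kmer_counts(2)
puts counts.size                 # 16, i.e. 4**2
puts counts.keys.first(5).join(' ')  # AA AT AC AG TA
```

The number of keys grows as 4**k, so the default k = 4 gives 256 entries.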

.merge_down_to_lowest_lexigraphical_form(hash) ⇒ Object

Take a kmer hash and merge its keys down to their lowest lexicographical form (see Bio::Sequence::NA#lowest_lexigraphical_form for what this means). When two keys are reverse complements of each other, they are merged into one hash entry whose key is the lowest_lexigraphical_form of the two and whose value is the sum of the two original values.

For instance, {'A'=>2, 'T'=>5} #=> {'A'=>7}



# File 'lib/bio-kmer_counter/kmer_counter.rb', line 41

def self.merge_down_to_lowest_lexigraphical_form(hash)
  new_hash = {}
  hash.each do |kmer, count|
    # Reverse-complement kmers collapse onto one shared, upper-case key.
    key = Bio::Sequence::NA.new(kmer).lowest_lexigraphical_form.to_s.upcase
    new_hash[key] ||= 0
    new_hash[key] += count
  end
  new_hash
end
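
The merge can be tried without BioRuby using the minimal sketch below. Both helper names are hypothetical, and canonical_kmer is only a stand-in for Bio::Sequence::NA#lowest_lexigraphical_form: it assumes that method returns the alphabetically smaller of a kmer and its reverse complement, which matches the 'A'/'T' example above but is not confirmed here for every input.

```ruby
# Hypothetical stand-in for Bio::Sequence::NA#lowest_lexigraphical_form.
# Assumption: the canonical form is the alphabetically smaller of the
# kmer and its reverse complement.
def canonical_kmer(kmer)
  forward = kmer.upcase
  reverse_complement = forward.reverse.tr('ATCG', 'TAGC')
  [forward, reverse_complement].min
end

# Hypothetical equivalent of merge_down_to_lowest_lexigraphical_form.
def merge_reverse_complements(counts)
  merged = Hash.new(0)
  counts.each { |kmer, count| merged[canonical_kmer(kmer)] += count }
  merged
end

merged = merge_reverse_complements({'A' => 2, 'T' => 5, 'AC' => 1, 'GT' => 3, 'CG' => 4})
merged.each { |kmer, count| puts "#{kmer} #{count}" }
# A 7
# AC 4
# CG 4
```

'AC' and 'GT' are reverse complements, so their counts are summed under 'AC'. 'CG' is its own reverse complement, so its count is unchanged.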