Class: Bio::AssemblyGraphAlgorithms::KmerCoverageBasedPathFilter
- Inherits: Object
  - Object
  - Bio::AssemblyGraphAlgorithms::KmerCoverageBasedPathFilter
- Includes: FinishM::Logging
- Defined in: lib/assembly/kmer_coverage_based_path_filter.rb
Instance Method Summary
- #filter(paths, kmer_hash, thresholds, options = {}) ⇒ Object
  Remove all paths where the kmer coverage is below the threshold at any point along the path.
- #write_depths(io, trail_name, trail_sequence, kmer_hash) ⇒ Object
  Write coverages to the given IO object as a tab-separated file similar to the output of “samtools depth”.
Methods included from FinishM::Logging
Instance Method Details
#filter(paths, kmer_hash, thresholds, options = {}) ⇒ Object
Remove all paths where the kmer coverage is below the threshold at any point along the path.
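:paths: an iterable collection of paths; each path must respond to #sequence
:kmer_hash: KmerMultipleAbundanceHash
:thresholds: minimum coverages (minimum numbers of full kmers) required at each point along the path, one threshold per abundance column in kmer_hash; an exception is raised if the two counts differ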
Options (all optional):
- :exclude_ending_length => do not filter out paths based on kmers close to either end, e.g. :exclude_ending_length => 2 means the first 2 and last 2 kmers are not considered
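The effect of :exclude_ending_length can be illustrated without BioRuby. The sketch below is not FinishM code: `kmers` is a hypothetical helper that stands in for a step-1 sliding window, and the sequence is made up. It applies the same slice that #filter uses and shows that trimming `ex` bases from each end of the sequence drops exactly the first `ex` and last `ex` kmers.

```ruby
# Hypothetical helper (not part of FinishM): list every k-mer of a sequence
# using a sliding window with a step of 1.
def kmers(seq, k)
  (0..seq.length - k).map { |i| seq[i, k] }
end

seq = 'ATGCATGCAT' # made-up 10-base sequence
k   = 4
ex  = 2            # plays the role of :exclude_ending_length

all_kmers     = kmers(seq, k)
trimmed       = seq[ex...(seq.length - ex)] # same slice as in #filter
checked_kmers = kmers(trimmed, k)

puts "all kmers:     #{all_kmers.join(' ')}"
puts "checked kmers: #{checked_kmers.join(' ')}"
```

Only the kmers listed on the second line would be compared against the thresholds.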
    # File 'lib/assembly/kmer_coverage_based_path_filter.rb', line 16
    def filter(paths, kmer_hash, thresholds, options={})
      # sanity check
      unless kmer_hash.number_of_abundances == thresholds.length
        raise "Unexpectedly found a different number of thresholds and kmer abundance columns"
      end

      passable_paths = []
      paths.each do |path|
        passable = true
        seq = path.sequence
        # remove ends of sequence if kmers don't count at the ends
        if options[:exclude_ending_length]
          ex = options[:exclude_ending_length]
          seq = seq[ex...(seq.length-ex)]
        end
        Bio::Sequence::NA.new(seq).window_search(kmer_hash.kmer_length,1) do |kmer|
          kmer_hash[kmer].each_with_index do |abundance, i|
            if abundance < thresholds[i]
              passable = false
              log.debug "Failing trail #{path.sequence} due to insufficient abundance (#{abundance} from #{kmer_hash[kmer]}) for #{kmer}" if log.debug?
              break
            end
          end
          break if !passable
        end
        passable_paths.push path if passable
      end
      return passable_paths
    end
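The following is a minimal, self-contained sketch of the same filtering idea, written so it runs without BioRuby or FinishM. Everything in it is an assumption made for illustration: `Path`, `StubKmerHash` and `sketch_filter` are hypothetical names, the stub returns zero abundance for unseen kmers (the real KmerMultipleAbundanceHash may behave differently), and the counts are invented.

```ruby
# Hypothetical stand-in for a path object; only #sequence is needed.
Path = Struct.new(:sequence)

# Hypothetical stand-in for KmerMultipleAbundanceHash.
class StubKmerHash
  attr_reader :kmer_length, :number_of_abundances

  def initialize(kmer_length, number_of_abundances, counts)
    @kmer_length = kmer_length
    @number_of_abundances = number_of_abundances
    @counts = counts
  end

  # Assumption: unseen kmers have zero abundance in every column.
  def [](kmer)
    @counts.fetch(kmer) { Array.new(@number_of_abundances, 0) }
  end
end

# Keep only paths whose every kmer meets every per-column threshold.
def sketch_filter(paths, kmer_hash, thresholds, options = {})
  unless kmer_hash.number_of_abundances == thresholds.length
    raise ArgumentError, 'number of thresholds and abundance columns differ'
  end

  ex = options[:exclude_ending_length] || 0
  k  = kmer_hash.kmer_length
  paths.select do |path|
    seq = path.sequence
    seq = seq[ex...(seq.length - ex)] || '' if ex > 0
    seq.chars.each_cons(k).all? do |window|
      kmer_hash[window.join].each_with_index.all? do |abundance, i|
        abundance >= thresholds[i]
      end
    end
  end
end

counts = {
  'ATG' => [5, 3], 'TGC' => [4, 4], 'GCA' => [1, 6], 'CAT' => [5, 5]
}
kmer_hash = StubKmerHash.new(3, 2, counts)
paths = [Path.new('ATGC'), Path.new('GCATG')]

strict  = sketch_filter(paths, kmer_hash, [2, 2])
lenient = sketch_filter(paths, kmer_hash, [2, 2], exclude_ending_length: 1)

puts "strict:  #{strict.map(&:sequence).inspect}"
puts "lenient: #{lenient.map(&:sequence).inspect}"
```

In this made-up data, 'GCATG' fails the strict call because its first kmer, 'GCA', has abundance 1 in the first column. With exclude_ending_length set to 1 that kmer is no longer checked, so the path is kept.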
#write_depths(io, trail_name, trail_sequence, kmer_hash) ⇒ Object
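Write coverages to the given IO object as a tab-separated file similar to the output of “samtools depth”. A header line is written first, followed by one row per kmer of trail_sequence containing the trail name, the 1-based kmer position and one coverage value per abundance column.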
    # File 'lib/assembly/kmer_coverage_based_path_filter.rb', line 50
    def write_depths(io, trail_name, trail_sequence, kmer_hash)
      #write header
      io.print %w(trail position).join("\t")
      # NOTE: each coverage label below also starts with "\t", so this line
      # results in two tabs between "position" and "coverage1"
      io.print "\t"
      kmer_hash.number_of_abundances.times{|i| io.print "\tcoverage#{i+1}"}
      io.puts

      #write data
      pos = 1
      Bio::Sequence::NA.new(trail_sequence).window_search(kmer_hash.kmer_length,1) do |kmer|
        io.puts [trail_name, pos, kmer_hash[kmer]].flatten.join("\t")
        pos += 1
      end
    end
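To show the shape of the output, here is a hedged sketch that writes the same kind of table to a StringIO without requiring BioRuby. `DepthStubHash` and `sketch_write_depths` are hypothetical names, the counts are invented, and unseen kmers are assumed to have zero coverage. Unlike the listing above, which prints one additional tab after `position`, this sketch puts a single tab between all header columns so that the header lines up with the data rows.

```ruby
require 'stringio'

# Hypothetical stand-in for KmerMultipleAbundanceHash.
class DepthStubHash
  attr_reader :kmer_length, :number_of_abundances

  def initialize(kmer_length, number_of_abundances, counts)
    @kmer_length = kmer_length
    @number_of_abundances = number_of_abundances
    @counts = counts
  end

  # Assumption: unseen kmers have zero coverage in every column.
  def [](kmer)
    @counts.fetch(kmer) { Array.new(@number_of_abundances, 0) }
  end
end

# Write a header line, then one tab-separated row per kmer (1-based position).
def sketch_write_depths(io, trail_name, trail_sequence, kmer_hash)
  header = %w[trail position]
  kmer_hash.number_of_abundances.times { |i| header << "coverage#{i + 1}" }
  io.puts header.join("\t")

  k = kmer_hash.kmer_length
  trail_sequence.chars.each_cons(k).with_index(1) do |window, pos|
    io.puts [trail_name, pos, kmer_hash[window.join]].flatten.join("\t")
  end
end

io = StringIO.new
depth_hash = DepthStubHash.new(3, 2, { 'ATG' => [5, 3], 'TGC' => [4, 4] })
sketch_write_depths(io, 'trail1', 'ATGC', depth_hash)
puts io.string
```

Any object that responds to #puts, such as an open File or $stdout, could be passed in place of the StringIO.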