Class: BioDSL::FilterRrna

Inherits:
Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/filter_rrna.rb

Overview

Filter rRNA sequences from the stream.

Description

filter_rrna utilizes sortmerna to identify and filter ribosomal RNA sequences from the stream. The sortmerna and indexdb_rna executables must be installed for filter_rrna to work.

Indexed reference files are produced using indexdb_rna.

For more information about sortmerna, see:

bioinfo.lifl.fr/RNA/sortmerna/

Usage

filter_rrna(ref_fasta: <file(s)>, ref_index: <file(s)>)

Options

  • ref_fasta <file(s)> - One or more reference FASTA files.

  • ref_index <file(s)> - One or more index reference files.
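
Both options accept either a single file or several. The sketch below shows one way a caller might normalize the two values and pair each reference FASTA file with its index before passing them on. `pair_references` is a hypothetical helper and not part of BioDSL, and the rule that every FASTA file has exactly one matching index entry is an assumption made for illustration.

```ruby
# Hypothetical helper (not part of BioDSL): normalize the two option values
# and pair each reference FASTA file with its index entry.
def pair_references(ref_fasta, ref_index)
  # Array() wraps a single String and leaves an Array untouched.
  fastas  = Array(ref_fasta)
  indexes = Array(ref_index)

  # Assumption: one index entry per FASTA file.
  unless fastas.size == indexes.size
    raise ArgumentError,
          "got #{fastas.size} FASTA file(s) but #{indexes.size} index file(s)"
  end

  fastas.zip(indexes)
end

pairs = pair_references('silva-arc-23s-id98.fasta',
                        'silva-arc-23s-id98.fasta.idx*')
pairs.each { |fasta, index| puts "#{fasta} <-> #{index}" }
```

Checking the pairing up front surfaces a mismatched option list before any external tool is started.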

Examples

To filter all reads matching the SILVA archaea 23S rRNA, do:

BD.new.
read_fastq(input: "reads.fq").
filter_rrna(ref_fasta: ["silva-arc-23s-id98.fasta"],
            ref_index: ["silva-arc-23s-id98.fasta.idx*"]).
write_fastq(output: "clean.fq").
run


Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
residues_out)
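
The STATS symbols name the counters that the command reports. As a rough sketch of how such a list could seed a status hash, the snippet below sets every counter to zero. `init_status` is a hypothetical stand-in written for this illustration; it is not the `status_init` helper that BioDSL itself calls, whose implementation is not shown here.

```ruby
STATS = %i(records_in records_out sequences_in sequences_out residues_in
           residues_out)

# Hypothetical stand-in for illustration only: give every counter named
# in keys an initial value of zero and return the status hash.
def init_status(status, keys)
  keys.each { |key| status[key] = 0 }
  status
end

status = init_status({}, STATS)
status[:records_in] += 1 # counters can then be incremented per record
```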

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ FilterRrna

Constructor for the FilterRrna class.

Parameters:

  • options (Hash)

    Options hash.

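Options Hash (options):

  • :ref_fasta - One or more reference FASTA files.

  • :ref_index - One or more index reference files.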



# File 'lib/BioDSL/commands/filter_rrna.rb', line 79

def initialize(options)
  @options = options
  @filter  = Set.new

  aux_exist('sortmerna')
  check_options
end

Instance Method Details

#lmb ⇒ Proc

Return the command lambda for filter_rrna.

Returns:

  • (Proc)

    Command lambda.



# File 'lib/BioDSL/commands/filter_rrna.rb', line 90

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    TmpDir.create('tmp', 'seq', 'out') do |tmp_file, seq_file, out_file|
      ref_files = process_ref_files
      process_input(input, tmp_file, seq_file)
      execute_sortmerna(ref_files, seq_file, out_file)
      parse_sortme_output(out_file)
      process_output(output, tmp_file)
    end
  end
end
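
The listing above returns a lambda that takes an input stream, an output stream and a status hash. To make that shape concrete, here is a simplified, self-contained sketch of the same pattern. It does not call sortmerna. `build_filter_lambda`, the use of plain arrays as streams, and matching records on the :SEQ_NAME key are all assumptions made for illustration, not the actual BioDSL implementation.

```ruby
require 'set'

# Toy version of a command lambda: records whose name is in filter_names
# are dropped, all others are passed on to the output.
def build_filter_lambda(filter_names)
  lambda do |input, output, status|
    input.each do |record|
      status[:records_in] += 1
      next if filter_names.include?(record[:SEQ_NAME]) # filtered out

      output << record
      status[:records_out] += 1
    end
  end
end

# Pretend that an external classifier flagged 'read2' as rRNA.
lmb    = build_filter_lambda(Set['read2'])
input  = [{ SEQ_NAME: 'read1', SEQ: 'ACGT' },
          { SEQ_NAME: 'read2', SEQ: 'GGCC' }]
output = []
status = Hash.new(0) # missing counters start at zero

lmb.call(input, output, status)
```

Holding the flagged names in a Set keeps each membership test cheap, which matters when the input contains many reads.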