Class: BioDSL::ClassifySeqMothur

Inherits:
Object
  • Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/classify_seq_mothur.rb

Overview

Run classify_seq_mothur on sequences in the stream.

This is a wrapper for the mothur command classify.seqs(). It classifies the sequences in the stream against a database file and a taxonomy file, both of which can be downloaded here:

www.mothur.org/w/images/5/59/Trainset9_032012.pds.zip

Please refer to the manual:

www.mothur.org/wiki/Classify.seqs

Mothur must be installed for classify_seq_mothur to work. Read more here:

www.mothur.org/

Usage

classify_seq_mothur(<database: <file>>, <taxonomy: <file>>
                    [, confidence: <uint>[, cpus: <uint>]])

Options

  • database: <file> - Database to search.

  • taxonomy: <file> - Taxonomy file for mapping names.

  • confidence: <uint> - Confidence threshold (default=80).

  • cpus: <uint> - Number of CPU cores to use (default=1).
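
The defaults listed above can be pictured as a merge over the options the caller supplies. The following is a minimal sketch of that idea, not BioDSL's own code: the helper name with_defaults is hypothetical, and only the default values (80 and 1) come from the option list.

```ruby
# Hypothetical illustration: with_defaults is not part of BioDSL.
# The default values are the ones documented in the option list above.
def with_defaults(options)
  defaults = { confidence: 80, cpus: 1 }

  # Keep the caller's value unless it is nil, in which case use the default.
  defaults.merge(options) { |_key, default, given| given.nil? ? default : given }
end

opts = with_defaults(database: 'trainset9_032012.pds.fasta',
                     taxonomy: 'trainset9_032012.pds.tax',
                     cpus: 4)

puts opts[:confidence] # prints 80 (default applied)
puts opts[:cpus]       # prints 4 (caller's value kept)
```

With a merge like this, the two required options pass through unchanged and only the missing optional ones are filled in.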

Examples

To classify the OTU sequences in the file otus.fna:

database = "trainset9_032012.pds.fasta"
taxonomy = "trainset9_032012.pds.tax"

BD.new.
read_fasta(input: "otus.fna").
classify_seq_mothur(database: database, taxonomy: taxonomy).
grab(exact: true, keys: :RECORD_TYPE, select: "taxonomy").
write_table(output: "classified.tab", header: true, force: true,
            skip: [:RECORD_TYPE]).
run

Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out
residues_in residues_out)

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ ClassifySeqMothur

Constructor for ClassifySeqMothur.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :database (String)

    Path to database file.

  • :taxonomy (String)

    Path to taxonomy file.

  • :confidence (Integer)

    Confidence cutoff.

  • :cpus (Integer)

    Number of CPUs to use.



# File 'lib/BioDSL/commands/classify_seq_mothur.rb', line 89

def initialize(options)
  @options = options

  aux_exist('mothur')
  check_options
  defaults
end
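
The constructor calls aux_exist('mothur') before checking options, which suggests a guard that the mothur executable is available. How AuxHelper implements this is not shown on this page. The sketch below is one common way to perform such a check with Ruby's standard library; the helper name executable_on_path? is hypothetical and is not BioDSL's API.

```ruby
# Hypothetical helper, not AuxHelper#aux_exist: returns true when an
# executable file called +name+ exists in one of the PATH directories.
def executable_on_path?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

warn 'mothur not found on PATH' unless executable_on_path?('mothur')
```

Running a check like this in the constructor means a missing dependency is reported when the command is set up rather than partway through a run.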

Instance Method Details

#lmb ⇒ Proc

Command lambda for ClassifySeqMothur.

Returns:

  • (Proc)

    Lambda for the command.



# File 'lib/BioDSL/commands/classify_seq_mothur.rb', line 100

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    TmpDir.create('input.fasta') do |tmp_in, tmp_dir|
      process_input(input, output, tmp_in)
      run_mothur(tmp_dir, tmp_in)
      tmp_out = Dir.glob("#{tmp_dir}/input.*.taxonomy").first
      process_output(output, tmp_out)
    end
  end
end
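
The lambda above works inside a temporary directory and locates the result by globbing for input.*.taxonomy and taking the first match. The sketch below reproduces only that file-discovery pattern with Ruby's standard library. It does not call mothur or BioDSL's TmpDir, and the helper first_taxonomy_file and the file name input.example.taxonomy are made up for illustration.

```ruby
require 'tmpdir'

# Hypothetical helper: return the first file in +dir+ matching the glob
# pattern used in the lambda above, or nil when nothing matches.
def first_taxonomy_file(dir)
  Dir.glob(File.join(dir, 'input.*.taxonomy')).first
end

Dir.mktmpdir do |tmp_dir|
  # Stand-in for the file an external tool would write; name and content are made up.
  File.write(File.join(tmp_dir, 'input.example.taxonomy'), "seq1\tplaceholder\n")

  found = first_taxonomy_file(tmp_dir)
  puts File.basename(found) # prints "input.example.taxonomy"
end
```

Because Dir.glob(...).first returns nil when no file matches, a caller may want to check for nil before reading the result.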