Class: BioDSL::Genecall

Inherits:
Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/genecall.rb

Overview

Genecall sequences in the stream.

Genecall predicts genes in single prokaryotic genomes or metagenomes using Prodigal 2.6, which must be installed:

prodigal.ornl.gov/

The records produced are of the type:

{:RECORD_TYPE=>"gene",
 :S_BEG=>2, :S_END=>109,
 :S_LEN=>108,
 :STRAND=>"-",
 :SEQ_NAME=>"contig1",
 :SEQ=>"MGKVIGIDLGTTNSCVAVMDGKTAKVIENAEGMRTT",
 :SEQ_LEN=>36}
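
The fields of the example record are internally consistent, which can be checked before relying on them downstream. The sketch below copies the example into a plain Ruby Hash. The helper name span_length is hypothetical, and reading S_BEG and S_END as inclusive coordinates is an assumption inferred from the example values, not from the BioDSL source.

```ruby
# Example record copied from the documentation above.
record = {
  RECORD_TYPE: 'gene',
  S_BEG: 2,
  S_END: 109,
  S_LEN: 108,
  STRAND: '-',
  SEQ_NAME: 'contig1',
  SEQ: 'MGKVIGIDLGTTNSCVAVMDGKTAKVIENAEGMRTT',
  SEQ_LEN: 36
}

# Hypothetical helper: span length, assuming inclusive begin/end coordinates.
def span_length(record)
  record[:S_END] - record[:S_BEG] + 1
end

# S_LEN matches the span covered by S_BEG..S_END.
puts span_length(record) == record[:S_LEN]
# SEQ_LEN matches the length of the SEQ string.
puts record[:SEQ].length == record[:SEQ_LEN]
# In this example the protein has one residue per codon of the span.
puts record[:SEQ_LEN] * 3 == record[:S_LEN]
```

All three checks hold for the example record; whether the last one holds for every predicted gene depends on how stop codons and truncated genes are reported, which this page does not specify.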

Usage

genecall([type: <string>[, procedure: <string>[, closed_ends: <bool>
         [, masked: <bool>]]]])

Options

  • type: <string> - Output dna or protein sequence (default: dna).

  • procedure: <string> - Single or meta (default: single).

  • closed_ends: <bool> - Don't allow truncated genes at sequence ends.

  • masked: <bool> - Ignore stretches of Ns.
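
The options above correspond closely to Prodigal's own command-line flags. As a rough illustration, the sketch below builds a Prodigal argument list from an options hash. The helper prodigal_args is hypothetical and is not BioDSL's run_prodigal implementation; it only assumes the standard Prodigal flags -p, -c, -m, -i, -d, -a and -q.

```ruby
# Hypothetical helper: translate genecall-style options into a Prodigal
# argument list. This is an illustration, not BioDSL's actual code.
def prodigal_args(options, input, fna_out, faa_out)
  args = ['prodigal', '-q']                               # -q: run quietly
  args += ['-p', (options[:procedure] || 'single').to_s]  # single or meta
  args << '-c' if options[:closed_ends]                   # no genes running off edges
  args << '-m' if options[:masked]                        # treat runs of Ns as masked
  args += ['-i', input]                                   # input FASTA file
  args += ['-d', fna_out]                                 # nucleotide sequences of genes
  args += ['-a', faa_out]                                 # protein translations
  args
end

args = prodigal_args({ procedure: 'meta', closed_ends: true },
                     'i.fa', 'o.fna', 'o.faa')
puts args.join(' ')
```

Returning an array rather than a single string lets the caller pass it to system(*args) without shell quoting concerns.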

Examples

To genecall a genome do:

BD.new.
read_fasta(input: "contigs.fna").
genecall.
grab(select: "gene", keys: :RECORD_TYPE, exact: true).
write_fasta(output: "genes.fna").
run

To add genecall data to the sequence name use merge_values:

BD.new.
read_fasta(input: "contigs.fna").
genecall(type: "protein").
grab(select: "gene", keys: :RECORD_TYPE, exact: true).
merge_values(keys: [:SEQ_NAME, :S_BEG, :S_END, :S_LEN, :STRAND]).
write_fasta(output: "genes.faa").
run
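
To make the effect of merge_values on the sequence name concrete, the sketch below mimics it on a plain Hash. The helper merge_record_values is hypothetical; joining with an underscore and storing the result under the first key are assumptions about merge_values, not confirmed by this page.

```ruby
# Hypothetical stand-in for merge_values: join the values of the given keys
# with a delimiter and store the result under the first key.
def merge_record_values(record, keys, delimiter = '_')
  merged = record.dup
  merged[keys.first] = keys.map { |key| record[key] }.join(delimiter)
  merged
end

record = { SEQ_NAME: 'contig1', S_BEG: 2, S_END: 109, S_LEN: 108, STRAND: '-' }
merged = merge_record_values(record, %i[SEQ_NAME S_BEG S_END S_LEN STRAND])

puts merged[:SEQ_NAME] # prints "contig1_2_109_108_-"
```

With a name of this shape, each FASTA header written by write_fasta would carry the gene's coordinates and strand alongside the contig name.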

Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
   residues_out)

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ Genecall

Constructor for the Genecall class.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :type (Symbol)

    Type of output sequence: dna or protein.

  • :procedure (Symbol)

    Procedure used for genecalling: single or meta.

  • :closed_ends (Boolean)

    Disallow truncated genes at sequence ends.

  • :masked (Boolean)

    Ignore stretches of Ns.



# File 'lib/BioDSL/commands/genecall.rb', line 96

def initialize(options)
  @options = options
  @names   = {}

  aux_exist('prodigal')
  defaults
  check_options

  @type = @options[:type].to_sym
end

Instance Method Details

#lmb ⇒ Proc

Return a lambda for the genecall command.

Returns:

  • (Proc)

    Returns the command lambda.



# File 'lib/BioDSL/commands/genecall.rb', line 110

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    TmpDir.create('i.fa', 'o.fna', 'o.faa') do |tmp_in, tmp_fna, tmp_faa|
      process_input(input, output, tmp_in)
      run_prodigal(tmp_in, tmp_fna, tmp_faa)
      process_output(output, tmp_fna, tmp_faa)
    end
  end
end
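
The lambda above follows a common pattern for wrapping an external tool: write the stream to a temporary file, run the tool, and read its output files back. The sketch below shows that pattern with Ruby's standard library only. The helper with_tmp_files is a hypothetical stand-in for TmpDir.create, and fake_tool is a stub that replaces Prodigal so the example runs without it installed.

```ruby
require 'tmpdir'

# Hypothetical stand-in for TmpDir.create: yield one path per requested file
# name inside a temporary directory that is removed when the block returns.
def with_tmp_files(*names)
  Dir.mktmpdir do |dir|
    yield(*names.map { |name| File.join(dir, name) })
  end
end

# Stub in place of the external tool: copies the input file to the output.
fake_tool = lambda do |in_path, out_path|
  File.write(out_path, File.read(in_path))
end

result = with_tmp_files('i.fa', 'o.fna') do |tmp_in, tmp_out|
  File.write(tmp_in, ">contig1\nATGAAATAA\n") # step 1: dump input to a file
  fake_tool.call(tmp_in, tmp_out)              # step 2: run the external tool
  File.read(tmp_out)                           # step 3: read the results back
end

puts result
```

Because Dir.mktmpdir returns the value of its block and cleans up the directory afterwards, the caller gets the parsed output without having to manage temporary files.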