Class: BioDSL::AssembleSeqSpades

Inherits:
Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/assemble_seq_spades.rb

Overview

Assemble sequences in the stream using SPAdes.

assemble_seq_spades is a wrapper around the single prokaryotic genome assembler SPAdes:

bioinf.spbau.ru/spades

Any records containing sequence information will be included in the assembly, but only the assembled contig sequences will be output to the stream.

The sequence records may contain quality scores, and if the sequence names indicate that the sequences are ordered as interleaved pairs, a paired-end assembly will be performed.
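The exact rule used to recognise interleaved pairs from the sequence names is not shown on this page. As a rough sketch, assuming the common convention where both mates share a name and end in /1 and /2, a hypothetical helper (interleaved_pairs? is not part of BioDSL) could look like this:

```ruby
# Hypothetical helper, not part of BioDSL: guess whether a list of sequence
# names describes interleaved paired-end reads.
# Assumption: mates are named "<name>/1" and "<name>/2" and appear back to back.
def interleaved_pairs?(names)
  return false if names.empty? || names.size.odd?

  names.each_slice(2).all? do |first, second|
    m1 = first.match(%r{\A(.+)/1\z})
    m2 = second.match(%r{\A(.+)/2\z})
    # Both mates must match the pattern and share the same base name.
    m1 && m2 && m1[1] == m2[1]
  end
end
```

Other naming schemes, such as Illumina headers that carry the mate number after a space, would need a different pattern.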

Usage

assemble_seq_spades([careful: <bool>[, cpus: <uint>[, kmers: <list>]]])

Options

  • careful: <bool> - Run SPAdes with the careful flag set.

  • cpus: <uint> - Number of CPUs to use (default: 1).

  • kmers: <list> - List of kmers to use (default: auto).
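To make the options more concrete, the sketch below shows one way they might be translated into a spades.py argument list. The method name spades_args and its parameters are assumptions made for illustration; the command line that the wrapper actually builds is not shown on this page. The flags themselves (--careful, -t, -k, --12, -s, -o) are standard SPAdes options.

```ruby
# Hypothetical sketch, not the BioDSL implementation: build an argument list
# for spades.py from the documented options.
#   input_file - reads file to assemble
#   out_dir    - directory where SPAdes writes its results
#   paired     - true if input_file holds interleaved paired-end reads
#   options    - hash with the optional keys :careful, :cpus and :kmers
def spades_args(input_file, out_dir, paired, options = {})
  args = ['spades.py']
  args << '--careful' if options[:careful]
  args.push('-t', (options[:cpus] || 1).to_s)                # default: 1 CPU
  args.push('-k', options[:kmers].join(',')) if options[:kmers]
  args.push(paired ? '--12' : '-s', input_file)              # interleaved vs single
  args.push('-o', out_dir)
end
```

When kmers is omitted, no -k flag is passed, which leaves the choice of k-mer sizes to SPAdes (the "auto" default above).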

Examples

If you have two paired-end sequence files with Illumina data, you can assemble them using assemble_seq_spades like this:

BD.new.
read_fastq(input: "file1.fq", input2: "file2.fq").
assemble_seq_spades(kmers: [55,77,99,127]).
write_fasta(output: "contigs.fna").
run


Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
residues_out records_out assembled)

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ AssembleSeqSpades

Constructor for the AssembleSeqSpades class.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :careful (Boolean)

    Flag indicating use of careful assembly.

  • :kmers (Array)

    List of kmers to use.

  • :cpus (Integer)

    CPUs to use.



# File 'lib/BioDSL/commands/assemble_seq_spades.rb', line 88

def initialize(options)
  @options = options
  @lengths = []
  @type    = nil

  aux_exist('spades.py')
  check_options
  defaults
end
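The constructor calls aux_exist('spades.py') before validating options, presumably to fail early when SPAdes is not installed. The body of AuxHelper#aux_exist is not shown here; the following is a minimal sketch of what such a check could look like, using a hypothetical method name executable_on_path?:

```ruby
# Hypothetical sketch, not the AuxHelper implementation: report whether an
# executable with the given name can be found in any directory on PATH.
def executable_on_path?(name, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).any? do |dir|
    file = File.join(dir, name)
    File.file?(file) && File.executable?(file)
  end
end
```

A caller could raise an error when executable_on_path?('spades.py') returns false, so that a missing dependency is reported before any reads are processed.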

Instance Method Details

#lmb ⇒ Proc

Return a lambda for the AssembleSeqSpades command.

Returns:

  • (Proc)

    Returns the command lambda.



# File 'lib/BioDSL/commands/assemble_seq_spades.rb', line 101

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    TmpDir.create('reads.fq', 'reads.fa') do |in_fq, in_fa, tmp_dir|
      process_input(in_fq, in_fa, input, output)
      input_file = (@type == :fastq) ? in_fq : in_fa
      execute_spades(input_file, tmp_dir)
      process_output(output, File.join(tmp_dir, 'scaffolds.fasta'))
    end

    calc_n50(status)
  end
end
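The lambda finishes by calling calc_n50(status), whose body is not shown here. Assuming it applies the usual definition of N50 to the collected contig lengths (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly), a standalone sketch with a hypothetical n50 method might be:

```ruby
# Hypothetical sketch, not the BioDSL implementation: compute N50 from a
# list of contig lengths. Returns nil for an empty list.
def n50(lengths)
  return nil if lengths.empty?

  half    = lengths.sum / 2.0
  running = 0

  # Walk the contigs from longest to shortest until half the total is covered.
  lengths.sort.reverse_each do |len|
    running += len
    return len if running >= half
  end
end
```

For contig lengths 80, 70, 50, 40, 30 and 20 the total is 290, and the two longest contigs already cover 150 of it, so N50 is 70.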