Class: BioDSL::AssembleSeqIdba

Inherits:
Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/assemble_seq_idba.rb

Overview

Assemble sequences in the stream using IDBA_UD.

assemble_seq_idba is a wrapper around the prokaryotic metagenome assembler IDBA_UD:

i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/

Any records containing sequence information will be included in the assembly, but only the assembled contig sequences will be output to the stream.

The sequence records may contain quality scores. If the sequence names indicate that the records are interleaved paired-end reads, paired-end assembly is performed.

Usage

assemble_seq_idba([kmer_min: <uint>[, kmer_max: <uint>[, cpus: <uint>]]])

Options

  • kmer_min: <uint> - Minimum k-mer value (default: 24).

  • kmer_max: <uint> - Maximum k-mer value (default: 128).

  • cpus: <uint> - Number of CPUs to use (default: 1).
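The way these options and their defaults combine can be sketched in plain Ruby. This is a minimal illustration only: `IDBA_DEFAULTS` and `idba_options` are hypothetical names that are not part of BioDSL, the default values are the ones listed above, and the check that kmer_min does not exceed kmer_max is an assumption rather than documented behaviour.

```ruby
# Hypothetical helper, not part of BioDSL: merges caller-supplied options
# over the documented defaults and validates them.
IDBA_DEFAULTS = { kmer_min: 24, kmer_max: 128, cpus: 1 }.freeze

def idba_options(options = {})
  # Reject keys that are not among the three documented options.
  unknown = options.keys - IDBA_DEFAULTS.keys
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?

  # Caller-supplied values win over the defaults.
  merged = IDBA_DEFAULTS.merge(options)

  # <uint> in the usage line means a non-negative integer.
  merged.each do |key, value|
    unless value.is_a?(Integer) && value >= 0
      raise ArgumentError, "#{key} must be a non-negative integer"
    end
  end

  # Assumed sanity check: the k-mer range must not be inverted.
  if merged[:kmer_min] > merged[:kmer_max]
    raise ArgumentError, 'kmer_min must not exceed kmer_max'
  end

  merged
end
```

Calling `idba_options(cpus: 4)` would keep both k-mer defaults and only change the CPU count, which mirrors calling assemble_seq_idba(cpus: 4) in a pipeline.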

Examples

If you have two paired-end sequence files with Illumina data, you can assemble them with assemble_seq_idba like this:

BD.new.
read_fastq(input: "file1.fq", input2: "file2.fq").
assemble_seq_idba.
write_fasta(output: "contigs.fna").
run

Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
residues_out)

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ AssembleSeqIdba

Constructor for the AssembleSeqIdba class.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :kmer_min (Integer)

    Minimum kmer value.

  • :kmer_max (Integer)

    Maximum kmer value.

  • :cpus (Integer)

    CPUs to use.



# File 'lib/BioDSL/commands/assemble_seq_idba.rb', line 81

def initialize(options)
  @options = options
  @lengths = []

  aux_exist('idba_ud')
  check_options
  defaults
end

Instance Method Details

#lmb ⇒ Proc

Return a lambda for the AssembleSeqIdba command.

Returns:

  • (Proc)

    Returns the command lambda.



# File 'lib/BioDSL/commands/assemble_seq_idba.rb', line 93

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    TmpDir.create('reads.fna', 'contig.fa') do |fa_in, fa_out, tmp_dir|
      process_input(input, output, fa_in)
      execute_idba(fa_in, tmp_dir)
      process_output(output, fa_out)
    end

    calc_n50(status)
  end
end
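The lambda finishes by calling calc_n50(status), whose body is not shown on this page. As a rough illustration of what an N50 calculation involves, here is a minimal sketch of the standard definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembled length. The method name `n50` and its signature are hypothetical, and it is an assumption that @lengths collects the contig lengths used for this statistic.

```ruby
# Hypothetical stand-in for illustration; not the BioDSL implementation.
# Returns the N50 of an array of positive contig lengths, or 0 if empty.
def n50(lengths)
  return 0 if lengths.empty?

  # Half of the total number of assembled residues.
  half = lengths.sum / 2.0
  running = 0

  # Walk contigs from longest to shortest, accumulating their lengths.
  lengths.sort.reverse_each do |len|
    running += len
    # The first contig that brings the running total to half is the N50.
    return len if running >= half
  end
end
```

A single long contig dominates the statistic: for lengths of 1, 1, 1, 1 and 10 the total is 14, and the longest contig alone already covers half of it.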