Class: BioDSL::SliceSeq

Inherits:
Object
Defined in:
lib/BioDSL/commands/slice_seq.rb

Overview

Slice sequences in the stream and obtain subsequences.

Slice subsequences from sequences using either index positions, which select single residues, or ranges, which select stretches of residues.

All positions are 0-based. Negative positions count from the end of the sequence, so -1 is the last residue.

If a record also contains quality SCORES, these are sliced at the same positions.

Usage

slice_seq(<slice: <index>|<range>>)

Options

  • slice: <index> - Slice a single-residue subsequence at the given index.

  • slice: <range> - Slice a stretch of residues covered by the given range.

Examples

Consider the following FASTQ entry in the file test.fq:

@HWI-EAS157_20FFGAAXX:2:1:888:434
TTGGTCGCTCGCTCCGCGACCTCAGATCAGACGTGGGCGAT
+
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHI

To slice the third residue from the beginning (index 2, since positions are 0-based) do:

BD.new.read_fastq(input: "test.fq").slice_seq(slice: 2).dump.run

{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434",
 :SEQ=>"G",
 :SEQ_LEN=>1,
 :SCORES=>"#"}

To slice the last residue do:

BD.new.read_fastq(input: "test.fq").slice_seq(slice: -1).dump.run

{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434",
 :SEQ=>"T",
 :SEQ_LEN=>1,
 :SCORES=>"I"}

To slice the first 5 residues do:

BD.new.read_fastq(input: "test.fq").slice_seq(slice: 0 ... 5).dump.run

{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434",
 :SEQ=>"TTGGT",
 :SEQ_LEN=>5,
 :SCORES=>"!\"\#$%"}

To slice the last 5 residues do:

BD.new.read_fastq(input: "test.fq").slice_seq(slice: -5 .. -1).dump.run

{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434",
 :SEQ=>"GCGAT",
 :SEQ_LEN=>5,
 :SCORES=>"EFGHI"}
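
The four examples above can be mimicked in plain Ruby without BioDSL installed. The following is a minimal sketch that assumes slicing behaves like Ruby's String#[] applied to both SEQ and SCORES, which matches the outputs shown but is not confirmed by the source here. The helper name slice_record is hypothetical and not part of BioDSL.

```ruby
# Hypothetical helper (not BioDSL's implementation): slice SEQ and, if
# present, SCORES with the same Integer index or Range.
def slice_record(record, slice)
  sliced = record.dup
  sliced[:SEQ] = record[:SEQ][slice]
  sliced[:SCORES] = record[:SCORES][slice] if record.key?(:SCORES)
  sliced[:SEQ_LEN] = sliced[:SEQ].length
  sliced
end

record = {
  SEQ_NAME: "HWI-EAS157_20FFGAAXX:2:1:888:434",
  SEQ: "TTGGTCGCTCGCTCCGCGACCTCAGATCAGACGTGGGCGAT",
  SEQ_LEN: 41,
  # Quality characters '!' (ASCII 33) through 'I' (ASCII 73), one per residue.
  SCORES: (33..73).map(&:chr).join
}

single = slice_record(record, 2)       # Integer index: one residue
first5 = slice_record(record, 0...5)   # exclusive range: indices 0 to 4
last5  = slice_record(record, -5..-1)  # negative indices count from the end

puts single[:SEQ], first5[:SEQ], last5[:SEQ]
```

Note the two range forms: 0...5 excludes its end point, while -5..-1 includes it. The sketch does no bounds checking, so an out-of-range index makes String#[] return nil and the length call fail.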

Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
residues_out)

Instance Method Summary

Constructor Details

#initialize(options) ⇒ SliceSeq

Constructor for SliceSeq.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :slice (Range, Integer)


# File 'lib/BioDSL/commands/slice_seq.rb', line 101

def initialize(options)
  @options = options

  check_options
end

Instance Method Details

#lmb ⇒ Proc

Return lambda for command.

Returns:

  • (Proc)

    Command lambda.



# File 'lib/BioDSL/commands/slice_seq.rb', line 110

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    input.each do |record|
      @status[:records_in] += 1

      slice_seq(record) if record.key? :SEQ

      output << record

      @status[:records_out] += 1
    end
  end
end
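
The lambda above follows a simple streaming pattern: initialise counters, read each record from the input, modify it only if it carries a :SEQ key, and pass it on to the output. The sketch below reproduces that pattern with plain Ruby arrays and a hash standing in for the stream and status objects. Everything in it is an assumption made for illustration: upcase_seq_lmb is a hypothetical command, the counter setup only approximates what status_init is presumed to do, and upcasing replaces the real slicing step.

```ruby
# Hypothetical stand-in for a BioDSL command; only the lambda's
# (input, output, status) shape is taken from the listing above.
def upcase_seq_lmb
  lambda do |input, output, status|
    # Start every counter at zero (roughly what status_init is assumed to do).
    %i[records_in records_out].each { |key| status[key] = 0 }

    input.each do |record|
      status[:records_in] += 1
      # Only records carrying a sequence are modified; others pass through.
      record[:SEQ] = record[:SEQ].upcase if record.key?(:SEQ)
      output << record
      status[:records_out] += 1
    end
  end
end

input  = [{ SEQ: "atcg" }, { NOTE: "no sequence here" }]
output = []
status = {}

upcase_seq_lmb.call(input, output, status)
puts status.inspect
```

Because records without :SEQ are still written to the output, the command never drops records and records_in always equals records_out.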