Class: BioDSL::TrimPrimer

Inherits:
Object
Defined in:
lib/BioDSL/commands/trim_primer.rb

Overview

Trim the ends of sequences in the stream that match a specified primer.

trim_primer can trim full or partial primer sequence from sequence ends. This is done by matching the primer at the end specified by the direction option:

Forward clip:

sequence       ATCGACTGCATCACGACG
primer    CATGAATCGA
result              CTGCATCACGACG

Reverse clip:

sequence  ATCGACTGCATCACGACG
primer                  GACGATAGCA
result    ATCGACTGCATCAC
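
The two diagrams above can be reproduced with a short sketch in plain Ruby. It handles exact matches only and is not the BioDSL implementation: the method names clip_forward and clip_reverse are hypothetical, and the overlap_min keyword stands in for the option of the same name described below.

```ruby
# Sketch only: exact matching, no mismatches, insertions or deletions.
# clip_forward and clip_reverse are hypothetical names, not BioDSL methods.

# Remove the longest primer suffix that matches the start of seq.
def clip_forward(seq, primer, overlap_min: 1)
  max_len = [seq.length, primer.length].min
  max_len.downto(overlap_min) do |len|
    return seq[len..-1] if seq.start_with?(primer[-len, len])
  end
  seq # no overlap of at least overlap_min bases: leave seq untouched
end

# Remove the longest primer prefix that matches the end of seq.
def clip_reverse(seq, primer, overlap_min: 1)
  max_len = [seq.length, primer.length].min
  max_len.downto(overlap_min) do |len|
    return seq[0, seq.length - len] if seq.end_with?(primer[0, len])
  end
  seq # no overlap of at least overlap_min bases: leave seq untouched
end

puts clip_forward('ATCGACTGCATCACGACG', 'CATGAATCGA') # CTGCATCACGACG
puts clip_reverse('ATCGACTGCATCACGACG', 'GACGATAGCA') # ATCGACTGCATCAC
```

Trying the longest overlap first means a full primer match wins over a partial one. Raising overlap_min above the overlap shown in the diagram (for example overlap_min: 6 in the forward case) leaves the sequence unchanged in this sketch.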

The primer sequence can be reverse complemented using the reverse_complement option. Also, a minimum overlap for trimming can be specified using the overlap_min option (default=1).
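
For readers unfamiliar with the term, reverse complementing a DNA primer means reversing it and swapping A with T and C with G. A minimal sketch in plain Ruby follows; reverse_complement_dna is a hypothetical helper, not a BioDSL method, and IUPAC ambiguity codes are not handled.

```ruby
# Sketch only: reverse complement of an unambiguous DNA primer.
# reverse_complement_dna is a hypothetical helper, not a BioDSL method.
def reverse_complement_dna(primer)
  primer.reverse.tr('ACGTacgt', 'TGCAtgca')
end

puts reverse_complement_dna('GACGATAGCA') # TGCTATCGTC
```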

Imperfect matching can be allowed by setting the mismatch_percent, insertion_percent and deletion_percent options (each defaults to 0).
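
This page does not say how a percentage is converted into a number of allowed edits. One plausible reading, which is an assumption and not confirmed here, is that each percentage is applied to the length of the primer portion being matched and rounded to the nearest whole number. The hypothetical helper allowed_edits below sketches that reading.

```ruby
# Sketch only. ASSUMPTION: each percentage is applied to the length of the
# primer portion being matched and rounded to the nearest whole edit.
# allowed_edits is a hypothetical helper, not a BioDSL method.
def allowed_edits(match_length, percent)
  (match_length * percent / 100.0).round
end

puts allowed_edits(20, 10) # 2
puts allowed_edits(4, 10)  # 0
```

Under this assumption, short partial overlaps tolerate no edits at all unless the percentage is large.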

The following keys are added to clipped records:

  • TRIM_PRIMER_DIR - Direction of clip.

  • TRIM_PRIMER_POS - Sequence position of clip (0-based).

  • TRIM_PRIMER_LEN - Length of clip match.

  • TRIM_PRIMER_PAT - Clip match pattern.
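
Because these keys are only added to clipped records, downstream code can use their presence to tell clipped from unclipped records. The sketch below uses plain hashes in place of BioDSL records; the first hash is the forward example output from this page and the second is a hypothetical unclipped record.

```ruby
# Sketch only: plain hashes stand in for records leaving trim_primer.
records = [
  { SEQ_NAME: 'test', SEQ: 'TGATGACTACGACTACGACTACTACTACGT', SEQ_LEN: 30,
    TRIM_PRIMER_DIR: 'FORWARD', TRIM_PRIMER_POS: 0, TRIM_PRIMER_LEN: 6,
    TRIM_PRIMER_PAT: 'ACTGAC' },
  { SEQ_NAME: 'untrimmed', SEQ: 'GGGGGGGG', SEQ_LEN: 8 } # hypothetical record
]

# Records without the TRIM_PRIMER_* keys were not clipped.
clipped, unclipped = records.partition { |r| r.key?(:TRIM_PRIMER_LEN) }

# For a forward clip with an exact match, pattern + sequence gives back
# the untrimmed sequence.
original = clipped.first[:TRIM_PRIMER_PAT] + clipped.first[:SEQ]

puts clipped.size   # 1
puts unclipped.size # 1
puts original       # ACTGACTGATGACTACGACTACGACTACTACTACGT
```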

Usage

trim_primer(<primer: <string>>, <direction: <:forward|:reverse>>
            [, reverse_complement: <bool>[, overlap_min: <uint>
            [, mismatch_percent: <uint>
            [, insertion_percent: <uint>
            [, deletion_percent: <uint>]]]]])

Options

  • primer: <string> - Primer sequence to search for.

  • direction: <:forward|:reverse> - Clip direction.

  • reverse_complement: <bool> - Reverse complement primer (default=false)

  • overlap_min: <uint> - Minimum primer overlap required for trimming (default=1)

  • mismatch_percent: <uint> - Allowed percent mismatches (default=0)

  • insertion_percent: <uint> - Allowed percent insertions (default=0)

  • deletion_percent: <uint> - Allowed percent deletions (default=0)

Examples

Consider the following FASTA entry in the file test.fna:

>test
ACTGACTGATGACTACGACTACGACTACTACTACGT

The forward end can be trimmed like this:

BD.new.
read_fasta(input: "test.fna").
trim_primer(primer: "ATAGAACTGAC", direction: :forward).
dump.
run

{:SEQ_NAME=>"test",
 :SEQ=>"TGATGACTACGACTACGACTACTACTACGT",
 :SEQ_LEN=>30,
 :TRIM_PRIMER_DIR=>"FORWARD",
 :TRIM_PRIMER_POS=>0,
 :TRIM_PRIMER_LEN=>6,
 :TRIM_PRIMER_PAT=>"ACTGAC"}

And trimming a reverse primer:

BD.new.
read_fasta(input: "test.fna").
trim_primer(primer: "ACTACGTGCGGAT", direction: :reverse).
dump.
run

{:SEQ_NAME=>"test",
 :SEQ=>"ACTGACTGATGACTACGACTACGACTACT",
 :SEQ_LEN=>29,
 :TRIM_PRIMER_DIR=>"REVERSE",
 :TRIM_PRIMER_POS=>29,
 :TRIM_PRIMER_LEN=>7,
 :TRIM_PRIMER_PAT=>"ACTACGT"}


Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out pattern_hits
pattern_misses residues_in residues_out)

Instance Method Summary

Constructor Details

#initialize(options) ⇒ TrimPrimer

Constructor for TrimPrimer.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :primer (String)
  • :direction (Symbol)
  • :overlap_min (Integer)
  • :reverse_complement (Boolean)
  • :mismatch_percent (Integer)
  • :insertion_percent (Integer)
  • :deletion_percent (Integer)


# File 'lib/BioDSL/commands/trim_primer.rb', line 132

def initialize(options)
  @options = options
  @options[:overlap_min] ||= 1
  @options[:mismatch_percent] ||= 0
  @options[:insertion_percent] ||= 0
  @options[:deletion_percent] ||= 0
  @pattern = pattern
  @hit     = false

  check_options
end
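
The constructor fills in defaults with `||=`, which assigns only when the current value is nil or false, and it writes them into the hash that was passed in. The sketch below repeats that pattern on a plain hash outside BioDSL; apply_defaults is a hypothetical name used for illustration.

```ruby
# Sketch only: the same ||= pattern on a plain hash, outside BioDSL.
# apply_defaults is a hypothetical name, not a BioDSL method.
def apply_defaults(options)
  options[:overlap_min]       ||= 1
  options[:mismatch_percent]  ||= 0
  options[:insertion_percent] ||= 0
  options[:deletion_percent]  ||= 0
  options
end

opts = { primer: 'ATAGAACTGAC', direction: :forward,
         overlap_min: 5, mismatch_percent: nil }
apply_defaults(opts)

p opts[:overlap_min]      # 5, the caller's value is kept
p opts[:mismatch_percent] # 0, nil is replaced by the default
p opts.key?(:deletion_percent) # true, the caller's hash was modified
```

A caller that wants to reuse an options hash unchanged can pass a copy (for example opts.dup) instead.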

Instance Method Details

#lmb ⇒ Proc

Return command lambda for trim_primer.

Returns:

  • (Proc)

    Command lambda.



# File 'lib/BioDSL/commands/trim_primer.rb', line 147

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    input.each do |record|
      @status[:records_in] += 1

      if record[:SEQ] && record[:SEQ].length > 0
        @status[:sequences_in] += 1
        @status[:sequences_out] += 1

        case @options[:direction]
        when :forward then trim_forward(record)
        when :reverse then trim_reverse(record)
        end
      end

      output << record

      @status[:records_out] += 1
    end
  end
end
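
The shape of this lambda can be tried outside BioDSL with plain Ruby objects. In the sketch below, arrays stand in for the input and output streams, a hash stands in for the status object, and a loop that zeroes each counter stands in for status_init, whose real behavior is not shown on this page. build_lambda is a hypothetical name and the trimming step is left as a comment.

```ruby
# Sketch only: arrays and a plain hash stand in for BioDSL's input stream,
# output stream and status object. build_lambda is a hypothetical name.
def build_lambda
  stat_keys = %i[records_in records_out sequences_in sequences_out]

  lambda do |input, output, status|
    stat_keys.each { |key| status[key] = 0 } # assumed stand-in for status_init

    input.each do |record|
      status[:records_in] += 1

      if record[:SEQ] && !record[:SEQ].empty?
        status[:sequences_in]  += 1
        status[:sequences_out] += 1
        # trimming of record[:SEQ] would happen here
      end

      output << record # every record is passed on, trimmed or not
      status[:records_out] += 1
    end
  end
end

input  = [{ SEQ_NAME: 'test', SEQ: 'ACTG' }, { COUNT: 3 }]
output = []
status = {}
build_lambda.call(input, output, status)

p status
p output.size
```

Records without a :SEQ key, such as the second hash, are counted in records_in and records_out but not in the sequence counters, and they reach the output untouched.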