Class: Mascot::DAT::Peptides
- Inherits: Object
  - Object
  - Mascot::DAT::Peptides
- Includes: Enumerable
- Defined in: lib/mascot/dat/peptides.rb
Overview
An iterator for the peptide-spectrum match results of a Mascot DAT file. Unlike the other sections of a DAT file, this section should not be loaded into memory all at once. It is often quite large and is meant to be accessed through the provided Enumerable or random-access methods.
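The streaming access pattern described above can be illustrated without the library itself. The following is a minimal sketch, not the library's implementation: `PeptideLineStream` is a hypothetical class, and the `q1_p1=...` lines are invented placeholder data. It shows how defining `each` over an IO object and including Enumerable gives callers `select`, `first`, `count`, and similar methods while holding only one line in memory at a time.

```ruby
require 'stringio'

# Hypothetical stand-in for a section reader; not part of the Mascot::DAT API.
class PeptideLineStream
  include Enumerable

  def initialize(io)
    @io = io
  end

  # Yield one line at a time so that only the current line is held in memory.
  def each
    return enum_for(:each) unless block_given?
    @io.rewind
    @io.each_line { |line| yield line.chomp }
  end
end

# Invented sample data; the values after "=" are placeholders.
sample = StringIO.new(<<~DAT)
  q1_p1=alpha
  q1_p2=beta
  q2_p1=gamma
DAT

stream = PeptideLineStream.new(sample)

# Enumerable methods come for free once #each is defined.
first_rank_hits = stream.select { |line| line =~ /\Aq\d+_p1=/ }
```

Because `each` rewinds before iterating, every Enumerable call starts from the first line, which mirrors the `rewind` call at the top of `#each` in this class.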
Instance Attribute Summary
- #psmidx ⇒ Hash{ Fixnum => Hash{ Fixnum => Fixnum }} (readonly)
  A nested Hash index of the byte offset positions for the peptide-spectrum-match entries.
Instance Method Summary
- #each {|Mascot::DAT::PSM| ... } ⇒ Object
  Iterate through all of the PSM entries in the DAT file.
- #initialize(dat, section_label, cache_psm_index = true) ⇒ Peptides (constructor)
  A new instance of Peptides.
- #next_psm ⇒ Mascot::DAT::PSM, NilClass
  Returns the next PSM from the DAT file.
- #psm(query_number, rank) ⇒ Mascot::DAT::PSM
  Return a specific PSM identified for query q and peptide number p.
- #rewind ⇒ Object
  Rewind the cursor to the start of the peptides section (e.g. q1_p1=…).
Constructor Details
#initialize(dat, section_label, cache_psm_index = true) ⇒ Peptides
Returns a new instance of Peptides.
# File 'lib/mascot/dat/peptides.rb', line 20
def initialize(dat, section_label, cache_psm_index=true)
  # create our own filehandle, since other operations may interfere with the
  @dat = Mascot::DAT.open(dat.dat_file.path)
  @filehandle = @dat.dat_file
  @section_label = section_label
  self.rewind
  @curr_psm = [1,1]
  @psmidx = {}
  @endbytepos = Float::INFINITY
  if cache_psm_index
    index_psm_positions()
  end
end
Instance Attribute Details
#psmidx ⇒ Hash{ Fixnum => Hash{ Fixnum => Fixnum }} (readonly)
A nested Hash index of the byte offset positions for the peptide-spectrum-match entries. The keys are the query number and the peptide rank (both Fixnum), and the structure is:
{ query_number => { peptide_rank => byte_position } }
To access a particular entry, it is better to use the #psm method.
# File 'lib/mascot/dat/peptides.rb', line 15
def psmidx
  @psmidx
end
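To make the index layout concrete, here is a minimal sketch of how a nested offset index of this shape could be built and then used for random access. It is an illustration under stated assumptions, not the library's own indexing routine (whose source is not shown on this page): `build_psm_index` is a hypothetical helper, the data is invented, and Integer keys are assumed to match the documented key type.

```ruby
require 'stringio'

# Hypothetical helper: record the byte offset of the first line of each
# (query, rank) pair as { query_number => { peptide_rank => byte_position } }.
def build_psm_index(io)
  index = Hash.new { |hash, query| hash[query] = {} }
  io.rewind
  until io.eof?
    offset = io.pos                    # position before the line is read
    line = io.readline
    next unless line =~ /\Aq(\d+)_p(\d+)/
    query, rank = $1.to_i, $2.to_i
    index[query][rank] ||= offset      # keep only the first line's offset
  end
  index
end

# Invented sample data; the second line shares the q1_p1 prefix.
io = StringIO.new("q1_p1=alpha\nq1_p1_terms=x\nq1_p2=beta\nq2_p1=gamma\n")
psmidx = build_psm_index(io)

# Random access: jump straight to query 2, rank 1 and read its first line.
io.pos = psmidx[2][1]
entry = io.readline.chomp
```

Storing offsets rather than parsed entries keeps the index small even when the section itself is large, which is the trade-off the `cache_psm_index` constructor argument appears to control.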
Instance Method Details
#each {|Mascot::DAT::PSM| ... } ⇒ Object
Iterate through all of the Mascot::DAT::PSM entries in the DAT file.
# File 'lib/mascot/dat/peptides.rb', line 87
def each
  self.rewind
  while psm = self.next_psm
    yield psm
  end
end
#next_psm ⇒ Mascot::DAT::PSM, NilClass
Returns the next Mascot::DAT::PSM from the DAT file. If there is no other PSM, then it returns nil.
# File 'lib/mascot/dat/peptides.rb', line 57
def next_psm
  if @filehandle.pos >= @endbytepos
    return nil
  end
  # get the initial values for query & rank
  buffer = [@filehandle.readline.chomp]
  buffer[0] =~ /q(\d+)_p(\d+)/
  q,p = $1, $2
  @curr_psm = [q,p]
  prev_pos = @filehandle.pos
  @filehandle.each do |l|
    l.chomp!
    # break if we have reached the boundary
    if l =~ @boundary
      @endbytepos = @filehandle.pos - @dat.boundary_string.length
      break
    end
    # break if we are on another PSM
    break unless l =~ /^q#{q}_p#{p}/
    buffer << l
    prev_pos = @filehandle.pos
  end
  # rewind the cursor to the last hit
  @filehandle.pos = prev_pos
  # return the new PSM
  Mascot::DAT::PSM.new(buffer)
end
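The read-ahead-then-rewind technique used by next_psm can be isolated in a small sketch. This is an illustration only: `read_entry` is a hypothetical function that works on a StringIO with invented data, it returns the raw lines rather than a Mascot::DAT::PSM, and it ignores the section boundary handling shown above.

```ruby
require 'stringio'

# Hypothetical sketch: collect all consecutive lines that belong to the same
# (query, rank) pair, then leave the cursor at the start of the next entry.
def read_entry(io)
  return nil if io.eof?
  first = io.readline.chomp
  match = first.match(/\Aq(\d+)_p(\d+)/)
  return nil unless match
  # (?!\d) stops the prefix for rank 1 from also matching rank 10, 11, ...
  prefix = /\Aq#{match[1]}_p#{match[2]}(?!\d)/
  buffer = [first]
  resume_at = io.pos
  io.each_line do |line|
    break unless line =~ prefix        # this line belongs to the next entry
    buffer << line.chomp
    resume_at = io.pos
  end
  io.pos = resume_at                   # undo the one-line read-ahead
  buffer
end

# Invented sample data: two lines for q1_p1, one line for q1_p2.
io = StringIO.new("q1_p1=alpha\nq1_p1_terms=x\nq1_p2=beta\n")
first_entry  = read_entry(io)
second_entry = read_entry(io)
third_entry  = read_entry(io)
```

The cursor has to be restored because the only way to detect the end of an entry is to read the first line of the next one; without the reset, every second entry would lose its first line.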
#psm(query_number, rank) ⇒ Mascot::DAT::PSM
Return the specific Mascot::DAT::PSM for query number query_number and peptide rank rank. Raises an Exception if the index has no entry for that pair.
# File 'lib/mascot/dat/peptides.rb', line 46
def psm query_number,rank
  if @psmidx[query_number] and @psmidx[query_number][rank]
    @filehandle.pos = @psmidx[query_number][rank]
    next_psm
  else
    # report the arguments that were actually passed in
    raise Exception.new "Invalid PSM specification (#{query_number},#{rank})"
  end
end
#rewind ⇒ Object
Rewind the cursor to the start of the peptides section (e.g. q1_p1=…).
# File 'lib/mascot/dat/peptides.rb', line 35
def rewind
  @dat.goto(@section_label)
  1.upto(2) { @filehandle.readline }
end