A library for working with LC/MS runs.


The following example works on ALL versions of mzXML and mzML (including support for compressed peak data).

require "ms/msrun" 

file = "file.mzXML"   # works identically for "file.mzML"
Ms::Msrun.open(file) do |ms|

  # Run level information:
  ms.start_time       # in seconds; returns a date for mzML
  ms.end_time         # in seconds; returns nil for mzML

  ms.scan_count       # number of scans
  ms.scan_count(1)    # number of MS scans
  ms.scan_count(2)    # number of MS/MS scans, etc.

  ms.parent_basename_noext   # "file" (as recorded _in the xml_)
  ms.filename                # "file.mzXML"

  # Random scan access (blazing fast)
  ms.scan(22)         # a scan object

  # Complete scan access
  ms.each do |scan|
    scan.num          # scan number
    scan.ms_level     # ms_level
    scan.time         # retention time in seconds
    scan.start_mz     # the first m/z value, returns nil in mzML
    scan.end_mz       # the last m/z value, returns nil in mzML

    # Precursor information
    pr = scan.precursor  # an Ms::Precursor object
    pr.intensity      # does fast binary search if info not already given
    pr.parent         # the parent scan
    pr.charge_states  # Array of possible charge states

    # Spectral information
    spectrum = scan.spectrum
    spectrum.mzs          # Array of m/z values
    spectrum.intensities  # Array of intensities
    spectrum.peaks do |mz, inten|
      puts "#{mz} #{inten}"   # print each peak on its own line
    end
  end

  # supports pre-filtering for faster access

  ## get just precursor info:
  ms.each(:ms_level => 2, :spectrum => false) {|scan| scan.precursor }

  ## get just level one spectra:
  ms.each(:ms_level => 1, :precursor => false) {|scan| scan.spectrum }
end

# Quicker way to get at the scans:
Ms::Msrun.foreach("file.mzXML") {|scan| puts scan.num }   # or any other per-scan work
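
The precursor comment in the example above mentions a binary search for the intensity. The following is a minimal sketch of that idea, not the library's actual code: `nearest_intensity` is a hypothetical helper, and it assumes the m/z array is sorted in ascending order and that the wanted peak is simply the one closest to the target m/z.

```ruby
# Hypothetical helper (not part of ms-msrun): returns the intensity of the
# peak whose m/z is closest to target_mz.
# Assumes mzs is sorted ascending and parallel to intensities.
def nearest_intensity(mzs, intensities, target_mz)
  return nil if mzs.empty?

  # Binary search for the first m/z that is >= target_mz.
  right = mzs.bsearch_index { |mz| mz >= target_mz } || mzs.size

  # The closest peak is either that one or its left neighbour.
  candidates = [right - 1, right].select { |i| i.between?(0, mzs.size - 1) }
  best = candidates.min_by { |i| (mzs[i] - target_mz).abs }
  intensities[best]
end

mzs         = [100.0, 200.0, 300.0]
intensities = [1.0, 2.0, 3.0]
nearest_intensity(mzs, intensities, 210.0)   # picks the peak at 200.0
```

Because the search is O(log n), looking up a precursor peak in a parent spectrum stays cheap even for dense spectra.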


Can convert mzXML or mzML to mgf or ms2:

mzmlFile = "file.mzML"
Ms::Msrun.open(mzmlFile) do |ms|
  ms2File = mzmlFile.chomp(".mzML") + ".ms2"   # "file.ms2"
  ms.to_ms2(:output => ms2File)
end

Or it can be done with the command-line program ms_to_search.rb:

"usage:  <file>.mz[XML | ML] ... <type>"

Other output formats can be included in future versions.
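
For readers unfamiliar with the target formats, here is a rough sketch of what an MGF entry looks like. It does not use ms-msrun: `Ms2Scan` and `to_mgf` are hypothetical names made up for illustration, and the converter in this library may write additional fields.

```ruby
# Hypothetical stand-in for an MS/MS scan; not the ms-msrun API.
Ms2Scan = Struct.new(:num, :precursor_mz, :charge, :peaks)

# Renders scans as MGF text: one BEGIN IONS / END IONS block per scan,
# with "m/z intensity" peak lines between the header fields and END IONS.
def to_mgf(scans)
  blocks = scans.map do |scan|
    lines = ["BEGIN IONS",
             "TITLE=scan #{scan.num}",
             "PEPMASS=#{scan.precursor_mz}",
             "CHARGE=#{scan.charge}+"]
    lines.concat(scan.peaks.map { |mz, inten| "#{mz} #{inten}" })
    lines << "END IONS"
    lines.join("\n")
  end
  blocks.join("\n\n") + "\n"
end

scan = Ms2Scan.new(7, 445.12, 2, [[100.1, 50.0], [200.2, 75.5]])
mgf_text = to_mgf([scan])
```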



Uses Nokogiri and a dash of regular expressions to achieve very fast random access of scans (also supports accessing all scans or subsets of scans).
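
The general idea behind fast random access can be sketched with a byte-offset index: scan the file once with a regular expression, remember where each scan starts, then seek straight to the scan that is wanted. This is only an illustration under simplifying assumptions (mzXML-style `<scan num="...">` tags, at most one per line, no nested scans); `build_scan_index` and `read_scan` are hypothetical names, and the library's real implementation may differ.

```ruby
require 'stringio'

# Hypothetical helper: maps scan number => byte offset of its <scan> tag.
def build_scan_index(io)
  index = {}
  offset = 0
  io.rewind
  io.each_line do |line|
    bytes = line.b   # binary copy, so match positions are byte offsets
    if (m = bytes.match(/<scan num="(\d+)"/))
      index[m[1].to_i] = offset + m.begin(0)
    end
    offset += bytes.bytesize
  end
  index
end

# Hypothetical helper: seeks to one scan and reads through its closing tag.
def read_scan(io, index, num)
  io.seek(index.fetch(num))
  io.gets("</scan>")
end

xml = <<~XML
  <msRun>
    <scan num="1" msLevel="1"><peaks>AAAA</peaks></scan>
    <scan num="2" msLevel="1"><peaks>BBBB</peaks></scan>
  </msRun>
XML

io = StringIO.new(xml)
index = build_scan_index(io)
scan_two = read_scan(io, index, 2)   # XML text of scan 2 only
```

Only the requested fragment then needs to be handed to an XML parser, which is what keeps single-scan lookups cheap on large files.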


One interface for all formats.

Lazy evaluation at scan and spectrum level

Scans are only read from IO when requested. Spectra are also decoded only when explicitly accessed.
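
A minimal sketch of lazy spectrum decoding, assuming mzXML-style peak data (base64 text holding big-endian 32-bit floats as interleaved m/z and intensity values, optionally zlib-compressed). `LazySpectrum` is a hypothetical class written for this illustration and is not the library's spectrum class.

```ruby
require 'zlib'

# Hypothetical class: keeps the encoded peak string and decodes it only
# the first time peak data is requested.
class LazySpectrum
  attr_reader :decode_count

  def initialize(base64_peaks, compressed: false)
    @base64_peaks = base64_peaks
    @compressed = compressed
    @decode_count = 0
  end

  def peaks
    @peaks ||= decode   # decoded once, then cached
  end

  def mzs
    peaks.map(&:first)
  end

  def intensities
    peaks.map(&:last)
  end

  private

  def decode
    @decode_count += 1
    bytes = @base64_peaks.unpack1('m')               # base64 -> bytes
    bytes = Zlib::Inflate.inflate(bytes) if @compressed
    bytes.unpack('g*').each_slice(2).to_a            # [[mz, intensity], ...]
  end
end

raw = [100.0, 10.0, 200.5, 20.0].pack('g*')
encoded = [Zlib::Deflate.deflate(raw)].pack('m0')
spectrum = LazySpectrum.new(encoded, compressed: true)   # nothing decoded yet
```

Creating the object costs almost nothing; the base64 and zlib work happens on the first call to `peaks`, `mzs`, or `intensities`.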

Extensively tested

Before each release, the parser must pass an extensive specification for each file version (~1500 tests in total).

Long-term support

We will continue to support newer versions and fix any bugs or edge cases that are found. Please alert us to any mzXML or mzML file that is not parsed correctly.


gem install ms-msrun



See also

mzml and the TPP