ParseFasta

So you want to parse a fasta file...

Installation

Add this line to your application's Gemfile:

gem 'parse_fasta'

And then execute:

$ bundle

Or install it yourself as:

$ gem install parse_fasta

Overview

Provides nice, programmatic access to fasta and fastq files. It's faster and more lightweight than BioRuby. And more fun!

Documentation

Check out the parse_fasta docs for the full API documentation.

Usage

Here are some examples of using ParseFasta. Don't forget to require "parse_fasta" at the top of your program!

Print header and length of each record.

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  puts [rec.header, rec.seq.length].join "\t"
end
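If you want more than a printout, you can collect the same header and length information as you go. Here is a minimal sketch of that idea. Note that DemoRecord and LengthSummary are hypothetical names made up for this example (they are not part of ParseFasta), and the stand-in records only mimic the #header and #seq readers shown above so the snippet runs without an input file.

```ruby
# Stand-in for a parsed record (an assumption for this demo). Real records
# come from ParseFasta::SeqFile#each_record and are not built like this.
DemoRecord = Struct.new(:header, :seq)

# Hypothetical helper: keeps [header, length] rows and a running total.
class LengthSummary
  attr_reader :rows, :total

  def initialize
    @rows = []
    @total = 0
  end

  # Call once per record, e.g. from inside an each_record block.
  def add(rec)
    len = rec.seq.length
    @rows << [rec.header, len]
    @total += len
    self
  end
end

summary = LengthSummary.new
demo_records = [DemoRecord.new("seq1", "ACTG"), DemoRecord.new("seq2", "GGCCAA")]
demo_records.each { |rec| summary.add(rec) }

# Same tab-separated layout as the example above, plus a total line.
summary.rows.each { |row| puts row.join("\t") }
puts ["total", summary.total].join("\t")
```

With a real file, you would call summary.add(rec) inside the each_record block instead of looping over the stand-in array.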

You can parse fastQ files in exactly the same way as fastA files.

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  printf "Header: %s, Sequence: %s, Description: %s, Quality: %s\n",
         rec.header,
         rec.seq,
         rec.desc,
         rec.qual
end

Record#desc and Record#qual will be nil if the file you are parsing is a fastA file.

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  if rec.qual
    # it's a fastQ record
  else
    # it's a fastA record
  end
end

You can also check this with the Record#fastq? method.

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  if rec.fastq?
    # it's a fastQ record
  else
    # it's a fastA record
  end
end
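As one possible use of that check, here is a rough sketch that tallies how many records of each kind you have seen. DemoRec and count_formats are hypothetical names invented for this example. The stand-in's fastq? simply tests whether qual is nil, which matches the behavior described above but is an assumption about how the real method works.

```ruby
# Stand-in record for illustration only (not a ParseFasta class).
# Assumption: fastq? is true exactly when qual is not nil.
DemoRec = Struct.new(:header, :seq, :desc, :qual) do
  def fastq?
    !qual.nil?
  end
end

# Hypothetical helper: counts records by format.
def count_formats(records)
  counts = { fasta: 0, fastq: 0 }
  records.each do |rec|
    counts[rec.fastq? ? :fastq : :fasta] += 1
  end
  counts
end

demo_records = [
  DemoRec.new("read1", "ACTG", "", "IIII"),  # fastQ-like: has a quality string
  DemoRec.new("contig1", "GGCC", nil, nil)   # fastA-like: desc and qual are nil
]

p count_formats(demo_records)
```

With a real file, you could bump the same counters from inside the each_record block.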

There is also a nice #to_s method that does what it should whether the record is fastA-like or fastQ-like. Check out the docs for info on the fancy #to_fasta and #to_fastq methods!

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  puts rec.to_s
end

But of course, since it is a #to_s override...you don't even have to call it directly!

ParseFasta::SeqFile.open(ARGV[0]).each_record do |rec|
  puts rec
end
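This works because Ruby's puts calls #to_s on whatever you hand it, and string interpolation does the same. A small sketch with a stand-in class shows the mechanism. DemoSeq is a hypothetical class for this example, and the layout its to_s returns is an assumption, not taken from the ParseFasta source.

```ruby
# Stand-in class for illustration only. The to_s layout here is an
# assumption and may not match what ParseFasta records produce.
class DemoSeq
  def initialize(header, seq)
    @header = header
    @seq = seq
  end

  def to_s
    ">#{@header}\n#{@seq}"
  end
end

rec = DemoSeq.new("seq1", "ACTG")

puts rec                      # puts calls to_s on its argument
labelled = "Record:\n#{rec}"  # string interpolation calls to_s as well
puts labelled
```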