This library provides support for BGZF (Blocked GZip Format) in Ruby. BGZF, originally defined as part of the SAM/BAM specification, is used to compress record-oriented bioinformatics data in a way that, unlike plain gzip, facilitates random access. A BGZF file consists of concatenated blocks of at most 64 KB each, and every block is an independent gzip stream. The whole file can therefore be decompressed with gzip, but this library also enables random access using 'virtual offsets' as defined in SAM/BAM.
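The "concatenated independent gzip streams" layout can be illustrated with Ruby's standard `zlib` alone. The sketch below is not this library's code: `gunzip_all_members` is a hypothetical helper, and the two members are plain gzip streams standing in for real BGZF blocks (which additionally carry a BGZF-specific extra field in the gzip header).

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of bio-bgzf): decompress every gzip
# member found in a string, one after another.
def gunzip_all_members(data)
  io = StringIO.new(data)
  out = String.new
  until io.eof?
    gz = Zlib::GzipReader.new(io)
    out << gz.read
    # GzipReader may have read past the end of this member; `unused`
    # holds those extra bytes, so rewind to the start of the next member.
    unused = gz.unused
    gz.finish # closes the reader but leaves the underlying IO open
    io.pos -= unused.bytesize if unused
  end
  out
end

# Two independent gzip streams, concatenated as in a blocked file.
blocked = Zlib.gzip('first block;') + Zlib.gzip('second block')
puts gunzip_all_members(blocked)
```

Because each member is self-contained, a reader that knows where a member starts can decompress it without touching the ones before it, which is what makes random access possible.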
A virtual offset is a 64-bit quantity: a 48-bit block offset, giving the position in the file of the start of the block, followed by a 16-bit data offset, giving a position within the uncompressed data of that block.
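The bit layout described above can be sketched in a few lines of Ruby. The helper names `pack_virtual_offset` and `unpack_virtual_offset` are hypothetical and chosen for illustration; they are not claimed to be part of this library's API.

```ruby
# Hypothetical helpers illustrating the virtual offset layout:
# upper 48 bits = block offset in the file,
# lower 16 bits = offset within the block's uncompressed data.

def pack_virtual_offset(block_offset, data_offset)
  unless (0...(1 << 48)).cover?(block_offset)
    raise ArgumentError, 'block offset must fit in 48 bits'
  end
  unless (0...(1 << 16)).cover?(data_offset)
    raise ArgumentError, 'data offset must fit in 16 bits'
  end
  (block_offset << 16) | data_offset
end

def unpack_virtual_offset(virtual_offset)
  [virtual_offset >> 16, virtual_offset & 0xFFFF]
end

# A block starting at byte 4096 of the file, position 42 inside it.
vo = pack_virtual_offset(4096, 42)
block_offset, data_offset = unpack_virtual_offset(vo)
puts "vo=#{vo} block_offset=#{block_offset} data_offset=#{data_offset}"
```

Since the block offset occupies the high bits, virtual offsets compare in the same order as the positions they refer to, so they can be sorted and compared as plain integers.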
gem install bio-bgzf
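    require 'bio-bgzf'

    File.open('example.gz') do |f|
      r = Bio::BGZF::Reader.new(f)
      block_vo = nil
      while true do
        # virtual offset of the block about to be read
        block_vo = r.tell
        block = r.read_block
        # read_block returns nil once no blocks remain
        break unless block
      end
      # seek back to a previously recorded virtual offset
      block = r.read_block_at(block_vo)
    end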
The API doc is online. For more code examples see the test files in the source tree.
Project home page
For information on the source tree, documentation, examples, issues and how to contribute, see the project home page.
The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
If you use this software, please cite one of
- BioRuby: bioinformatics software for the Ruby programming language
- Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics
This Biogem is published at #bio-bgzf
Copyright (c) 2012 Artem Tarasov and Clayton Wheeler. See LICENSE.txt for further details.