# bio-cd-hit-report

Clustering sequences with CD-HIT produces a cluster file (`.clstr`) that lists each cluster together with the names of the sequences assigned to it. This plugin provides methods for parsing that file.
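To give a feel for what a `.clstr` file contains, here is a minimal stand-alone sketch in plain Ruby. It is not this plugin's implementation: the `Member` struct, the `parse_clstr` helper and the sample data are all hypothetical, and it assumes the usual CD-HIT layout where each cluster starts with a `>Cluster N` line and the representative sequence is marked with `*`.

```ruby
# Hypothetical sketch, independent of the bio-cd-hit-report gem.
# Sample data is made up; "\t" is the tab CD-HIT writes after the member index.
SAMPLE_CLSTR = <<~CLSTR
  >Cluster 0
  0\t2799aa, >seq1... *
  1\t2214aa, >seq2... at 80.00%
  >Cluster 1
  0\t1500aa, >seq3... *
CLSTR

# One entry in a cluster: sequence name, length, and whether it is the representative.
Member = Struct.new(:name, :length, :representative)

# Parse .clstr text into a hash of cluster id => array of Member.
def parse_clstr(text)
  clusters = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?

    if line =~ /\A>Cluster\s+(\d+)/
      # Header line opens a new cluster.
      current = Regexp.last_match(1).to_i
      clusters[current] = []
    elsif current && line =~ /\A\d+\s+(\d+)(?:aa|nt),\s+>(.+?)\.\.\.\s+(.*)\z/
      # Member line: index, length, name, then "*" or an identity string.
      m = Regexp.last_match
      clusters[current] << Member.new(m[2], m[1].to_i, m[3] == "*")
    end
  end
  clusters
end

clusters = parse_clstr(SAMPLE_CLSTR)
clusters.each do |id, members|
  rep = members.find(&:representative)
  puts "Cluster #{id}: #{members.size} member(s), representative #{rep&.name}"
end
```

The plugin wraps this kind of parsing behind a report object, so in practice you would not need to handle the file format yourself.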

Note: this plugin is under active development!


    gem install bio-cd-hit-report


    require 'bio-cd-hit-report'

    cluster_file = "cluster95.clstr"
    report = Bio::CdHitReport.new(cluster_file)

    # print the total number of clusters in the report
    puts report.total_clusters

    # print the members of the cluster with id 1
    puts report.get_cluster(1)

    # print information for each cluster
    report.each_cluster do |c|
      puts "#{c.name} - #{c.members}" # cluster name/id with the sequences in the cluster
      puts c.size                     # total number of entries in the cluster
    end

    # print the representative sequence for each cluster
    report.each_cluster do |c|
      puts c.rep_seq
    end

Project home page

For information on the source tree, documentation, examples, issues, and how to contribute, see the project home page.


The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.


If you use this software, please cite one of


This Biogem is published at #bio-cd-hit-report

Copyright (c) 2013 George Githinji. See LICENSE.txt for further details.