bio-dbla-classifier

DBL-alpha tags can be classified into six expression groups depending on the number of cysteines and the presence of certain sequence motifs within the tag region (Bull et al. 2007). DBLa adds methods for grouping DBL-alpha amino acid sequence tags. The DBLa class is a subclass of Bio::Sequence::AA.

If you use this method, please cite: Bull et al. (2007). “An approach to classifying sequence tags sampled from Plasmodium falciparum var genes.” Molecular and Biochemical Parasitology 154 (1): 98–102. doi:10.1016/j.molbiopara.2007.03.011.

Installation

gem install bio-dbla-classifier

Uninstall

gem uninstall bio-dbla-classifier

Usage

require 'bio-dbla-classifier'

# Create a new Bio::Sequence::AA instance. This gem simply extends the
# Bio::Sequence::AA class with methods to classify and describe DBL-alpha tags.
seq1 = 'DIGDIIRGRDLYSGNNKEKEQRKKLEKNGKTIVGKIYNEATNGQALQARYKGDDNNNYSKLREDRWTANRATIWEAITCDDDNKLSNASYVRPTSTDGQSGAQGKDKCRSANKTTGNTGDVNIVPTYFDYVPQYLR'
seq = Bio::Sequence::AA.new(seq1)

# Get the positions of limited variability
puts seq.polv1
puts seq.polv2
puts seq.polv3
puts seq.polv4

# Get the number of cysteines in the tag
puts seq.cys_count

# Get the distinct sequence identifier
puts seq.dsid

# Get the cyspolv group for this tag
puts seq.cyspolv_group

# Get the block sharing group for this tag (to be implemented)
# puts seq.bs_group

# Get the length of the tag
puts seq.size
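
The cysteine count that the grouping depends on can be illustrated without the gem. The following is a minimal sketch in plain Ruby; count_cysteines is a hypothetical helper written for this example and is not part of bio-dbla-classifier, whose own cys_count method may be implemented differently.

```ruby
# Hypothetical helper (not part of bio-dbla-classifier): counts cysteine
# residues, one-letter code "C", in an amino acid string.
def count_cysteines(tag)
  tag.upcase.count("C")
end

tag = "DIGDIIRGRDLYSGNNKEKEQRKKLEKNGKTIVGKIYNEATNGQALQARYKGDDNNNYSKLREDRWTANRATIWEAITCDDDNKLSNASYVRPTSTDGQSGAQGKDKCRSANKTTGNTGDVNIVPTYFDYVPQYLR"
puts count_cysteines(tag)
```

The sketch covers only the cysteine count; the motif checks that complete the classification are described in Bull et al. (2007) and are not reproduced here.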

# If the input is a FASTA file
seq_file = "#{ENV['HOME']}/sequences/878_kilifi_sequences.fasta"

# Read the file and print one comma-separated line per tag
Bio::FlatFile.open(seq_file).each do |entry|
  tag = Bio::Sequence::AA.new(entry.seq)
  puts "#{entry.definition},#{tag.dsid},#{tag.cys_count},#{tag.cyspolv_group}"
end
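
To show the shape of the per-entry output without BioRuby installed, here is a standard-library-only sketch. It assumes a simple FASTA layout (a ">" header line followed by one or more sequence lines). parse_fasta and cysteine_rows are hypothetical helpers, not part of the gem, and only the cysteine count is reported because the dsid and cyspolv rules are not given in this README.

```ruby
# Stdlib-only stand-in for the Bio::FlatFile loop. Both helpers are
# hypothetical and exist only for this illustration.

# Parses FASTA text into [definition, sequence] pairs.
def parse_fasta(text)
  entries = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?(">")
      entries << [line[1..-1], "".dup]   # new entry with a mutable sequence buffer
    elsif !entries.empty?
      entries.last[1] << line            # sequence lines may wrap; join them
    end
  end
  entries
end

# Builds one "definition,cysteine_count" row per FASTA entry.
def cysteine_rows(text)
  parse_fasta(text).map do |definition, seq|
    "#{definition},#{seq.upcase.count('C')}"
  end
end

fasta = <<~FASTA
  >tag_a
  DIGDCIRG
  RDLYCC
  >tag_b
  DIGDIIRG
FASTA

puts cysteine_rows(fasta)
```

In real use, Bio::FlatFile handles the parsing and the gem's methods supply the remaining columns.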

Copyright

See LICENSE.txt for further details.