Class: Bio::GenBank

Inherits:
NCBIDB
Includes:
NCBIDB::Common
Defined in:
lib/bio/db/genbank/genbank.rb

Overview

Description

Parses a GenBank-formatted database entry.

Example

# entry is a string containing the contents of exactly one entry
gb = Bio::GenBank.new(entry)

Direct Known Subclasses

DDBJ, RefSeq

Defined Under Namespace

Classes: Locus

Constant Summary

Constants included from NCBIDB::Common

NCBIDB::Common::DELIMITER, NCBIDB::Common::TAGSIZE

Instance Method Summary

Methods included from NCBIDB::Common

#acc_version, #accession, #accessions, #comment, #common_name, #definition, #features, #gi, #initialize, #keywords, #nid, #organism, #origin, #references, #segment, #source, #taxonomy, #version, #versions

Methods inherited from NCBIDB

#initialize

Methods inherited from DB

#exists?, #fetch, #get, open, #tags

Instance Method Details

#basecount(base = nil) ⇒ Object

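BASE COUNT (this field became obsolete after GenBank release 138.0) – Returns the BASE COUNT as a Hash. When a base is specified, returns the count for that base as an Integer (a Fixnum on Ruby versions before 2.4). The base can be one of ‘a’, ‘t’, ‘g’, ‘c’, or ‘o’ (others). The method calls base.downcase!, so the argument is modified in place and must not be a frozen string.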



# File 'lib/bio/db/genbank/genbank.rb', line 99

def basecount(base = nil)
  unless @data['BASE COUNT']
    hash = Hash.new(0)
    get('BASE COUNT').scan(/(\d+) (\w)/).each do |c, b|
      hash[b] = c.to_i
    end
    @data['BASE COUNT'] = hash
  end

  if base
    base.downcase!
    @data['BASE COUNT'][base]
  else
    @data['BASE COUNT']
  end
end
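
The parsing step can be tried in isolation. The following is a minimal sketch, not the library's code path: `parse_base_count` and `base_count_for` are hypothetical helpers, and the sample BASE COUNT text is made up for illustration. It applies the same regular expression to a raw field string and, unlike the method above, downcases a copy of the argument instead of modifying it.

```ruby
# Hypothetical helper: applies the same scan as #basecount to a raw field string.
def parse_base_count(field)
  counts = Hash.new(0) # bases that do not appear in the field report 0
  field.scan(/(\d+) (\w)/).each do |count, base|
    counts[base] = count.to_i
  end
  counts
end

# Hypothetical helper: looks up one base without mutating the caller's string,
# so frozen strings are accepted.
def base_count_for(counts, base)
  counts[base.downcase]
end

# Made-up sample field, for illustration only.
sample = 'BASE COUNT     1510 a   1074 c    835 g   1609 t'
counts = parse_base_count(sample)
puts counts.inspect
puts base_count_for(counts, 'A'.freeze)
```

Because the Hash is created with a default of 0, asking for a base that is not listed (for example ‘o’) returns 0 rather than nil.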

#circular ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 68

def circular;  locus.circular;  end

#classification ⇒ Object

Taxonomy classification. Returns an array of strings.



# File 'lib/bio/db/genbank/genbank.rb', line 142

def classification
  self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/)
end
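
As an illustration of the transformation, here is a small sketch that applies the same `sub` and `split` calls to a plain string. `split_taxonomy` is a hypothetical helper and the lineage string is an invented sample, not output taken from a real entry.

```ruby
# Hypothetical helper: same transformation as #classification, on a plain string.
def split_taxonomy(taxonomy)
  # Drop a single trailing period, then split on semicolons,
  # discarding any whitespace around each separator.
  taxonomy.to_s.sub(/\.\z/, '').split(/\s*;\s*/)
end

# Invented sample lineage, for illustration only.
lineage = 'Eukaryota; Metazoa; Chordata; Mammalia.'
p split_taxonomy(lineage)
p split_taxonomy(nil) # to_s turns nil into "", which splits to an empty array
```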

#date ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 70

def date;      locus.date;      end

#date_modified ⇒ Object

Modification date. Returns a Date object when the LOCUS date can be parsed; otherwise returns the unparsed value, which is a String or nil.



# File 'lib/bio/db/genbank/genbank.rb', line 133

def date_modified
  begin
    Date.parse(self.date)
  rescue ArgumentError, TypeError, NoMethodError, NameError
    self.date
  end
end
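
The three possible return types follow from the rescue clause. The sketch below reproduces that parse-or-fall-back pattern outside the class; `parse_or_passthrough` is a hypothetical helper and the input values are invented samples.

```ruby
require 'date'

# Hypothetical helper: try to parse a date, and hand back the input
# unchanged when parsing is not possible.
def parse_or_passthrough(value)
  Date.parse(value)
rescue ArgumentError, TypeError
  # ArgumentError: the string is not a recognizable date.
  # TypeError: the value is not a String at all (for example nil).
  value
end

p parse_or_passthrough('21-JUN-1999').class # a Date
p parse_or_passthrough('unknown')           # the original String
p parse_or_passthrough(nil)                 # nil
```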

#division ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 69

def division;  locus.division;  end

#each_cds ⇒ Object

FEATURES – Iterates over only the ‘CDS’ features in Bio::Features, yielding each one to the given block.



# File 'lib/bio/db/genbank/genbank.rb', line 77

def each_cds
  features.each do |feature|
    if feature.feature == 'CDS'
      yield(feature)
    end
  end
end

#each_gene ⇒ Object

FEATURES – Iterates over only the ‘gene’ features in Bio::Features, yielding each one to the given block.



# File 'lib/bio/db/genbank/genbank.rb', line 86

def each_gene
  features.each do |feature|
    if feature.feature == 'gene'
      yield(feature)
    end
  end
end
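
Both iterators follow the same filter-and-yield pattern and differ only in the feature key they compare against. The sketch below shows that pattern with stand-in objects so it runs without BioRuby; `SketchFeature` and `each_feature_of` are hypothetical names, and the feature list is invented.

```ruby
# Stand-in for a feature record; only the feature key matters for filtering.
SketchFeature = Struct.new(:feature, :position)

# Hypothetical helper: yields only the features whose key equals +key+.
def each_feature_of(features, key)
  features.each do |f|
    yield f if f.feature == key
  end
end

# Invented feature table, for illustration only.
features = [
  SketchFeature.new('source', '1..300'),
  SketchFeature.new('gene',   '10..200'),
  SketchFeature.new('CDS',    '20..190'),
  SketchFeature.new('CDS',    '210..290')
]

each_feature_of(features, 'CDS')  { |f| puts "CDS at #{f.position}" }
each_feature_of(features, 'gene') { |f| puts "gene at #{f.position}" }
```

The comparison is an exact, case-sensitive string match, so ‘cds’ would not select the ‘CDS’ features.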

#entry_id ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 66

def entry_id;  locus.entry_id;  end

#length ⇒ Object (also known as: nalen)



# File 'lib/bio/db/genbank/genbank.rb', line 67

def length;    locus.length;    end

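#locus ⇒ Object

Returns the parsed LOCUS record as a Locus object, created on first access and cached. The accessor methods for the contents of the LOCUS record delegate to it.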



# File 'lib/bio/db/genbank/genbank.rb', line 62

def locus
  @data['LOCUS'] ||= Locus.new(get('LOCUS'))
end

#natype ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 73

def natype;    locus.natype;    end

#seq ⇒ Object (also known as: naseq)

ORIGIN – Returns the DNA sequence in the ORIGIN record as a Bio::Sequence::NA object.



# File 'lib/bio/db/genbank/genbank.rb', line 118

def seq
  unless @data['SEQUENCE']
    origin
  end
  Bio::Sequence::NA.new(@data['SEQUENCE'])
end

#seq_len ⇒ Object

Returns the length of the sequence (seq.length). The source comment marks this method as possibly obsolete.



# File 'lib/bio/db/genbank/genbank.rb', line 128

def seq_len
  seq.length
end

#strand ⇒ Object



# File 'lib/bio/db/genbank/genbank.rb', line 72

def strand;    locus.strand;    end

#strandedness ⇒ Object

Strandedness. Returns one of ‘single’, ‘double’, ‘mixed’, or nil.



# File 'lib/bio/db/genbank/genbank.rb', line 147

def strandedness
  case self.strand.to_s.downcase
  when 'ss-'; 'single'
  when 'ds-'; 'double'
  when 'ms-'; 'mixed'
  else nil; end
end
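
The mapping can be exercised on its own. This is a sketch of the same case expression as a free-standing function; `strandedness_for` is a hypothetical name, and the inputs are sample strand prefixes rather than values read from an entry.

```ruby
# Hypothetical helper: maps a LOCUS strand prefix to a descriptive word.
def strandedness_for(strand)
  # to_s makes nil safe; downcase makes the match case-insensitive.
  case strand.to_s.downcase
  when 'ss-' then 'single'
  when 'ds-' then 'double'
  when 'ms-' then 'mixed'
  end # no matching branch: the case expression evaluates to nil
end

%w[ss- DS- ms-].each { |s| puts "#{s} -> #{strandedness_for(s)}" }
p strandedness_for(nil)
```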

#to_biosequence ⇒ Object

Converts the Bio::GenBank object to a Bio::Sequence object.

Arguments: none

Returns: Bio::Sequence object



# File 'lib/bio/db/genbank/genbank.rb', line 159

def to_biosequence
  Bio::Sequence.adapter(self, Bio::Sequence::Adapter::GenBank)
end