Class: Bio::Fetch

Inherits:
Object
Defined in:
lib/bio/io/fetch.rb

Overview

DESCRIPTION

The Bio::Fetch class provides an interface to dbfetch servers. Given a database name and an accession number, these servers return the associated record. For example, for the embl database on the EBI, that would be a nucleic or amino acid sequence.

Possible dbfetch servers include:

Note that the old URL www.ebi.ac.uk/cgi-bin/dbfetch is still available, probably for compatibility, but using the new URL is recommended.

Historically, there were other dbfetch servers including:

But they are unavailable now.

If you’re behind a proxy server, be sure to set your HTTP_PROXY environment variable accordingly.

USAGE

require 'bio'

# Retrieve the sequence of accession number M33388 from the EMBL
# database.
server = Bio::Fetch::EBI.new  #uses EBI server
puts server.fetch('ena_sequence','M33388')

# database name "embl" can also be used though it is not officially listed
puts server.fetch('embl','M33388')

# Do the same thing with explicitly giving the URL.
server = Bio::Fetch.new(Bio::Fetch::EBI::URL)  #uses EBI server
puts server.fetch('ena_sequence','M33388')

# Do the same thing without creating a Bio::Fetch::EBI object.
puts Bio::Fetch::EBI.query('ena_sequence','M33388')

# To know what databases are available on the dbfetch server:
server = Bio::Fetch::EBI.new
puts server.databases

# Some databases provide their data in different formats (e.g. 'fasta',
# 'genbank' or 'embl'). To check which formats are supported by a given
# database:
puts server.formats('embl')

Direct Known Subclasses

EBI

Defined Under Namespace

Classes: EBI

Constructor Details

#initialize(url = nil) ⇒ Fetch

Create a new Bio::Fetch server object that can subsequently be queried using the Bio::Fetch#fetch method.

You must specify the URL of a server. The preset default server is deprecated.

To use a server without explicitly specifying the URL, use Bio::Fetch::EBI.new, which uses the EBI Dbfetch server.


Arguments:

  • url: URL of the dbfetch server (required; omitting it raises ArgumentError)

Returns

Bio::Fetch object



# File 'lib/bio/io/fetch.rb', line 140

def initialize(url = nil)
  unless url then
    raise ArgumentError, "No server URL is given in Bio::Fetch.new. The default server URL value have been deprecated. You must explicitly specify the url or use Bio::Fetch::EBI for using EBI Dbfetch."
  end
  @url = url
end
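
The constructor's behavior can be illustrated without the bio gem or a network connection. The following is a minimal sketch, assuming a stand-in class: StubFetch, its shortened error message, and the example.org URL are hypothetical and only mirror the guard shown in the listing above.

```ruby
# Hypothetical stand-in for Bio::Fetch; it copies only the URL guard
# from the constructor shown above.
class StubFetch
  attr_reader :url

  def initialize(url = nil)
    # Same check as the listing: a missing URL is rejected outright.
    raise ArgumentError, 'No server URL is given' unless url
    @url = url
  end
end

begin
  StubFetch.new
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end

# Placeholder URL, not a real dbfetch server.
server = StubFetch.new('http://example.org/dbfetch')
puts server.url
```

With the real class, calling Bio::Fetch::EBI.new avoids having to pass a URL at all, as described above.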

Instance Attribute Details

#database ⇒ Object

The default database to query. This will be used by the get_by_id method.



# File 'lib/bio/io/fetch.rb', line 151

def database
  @database
end

Instance Method Details

#databases ⇒ Object

Using this method, the user can ask a dbfetch server what databases it supports. This would normally be the first step you’d take when you use a dbfetch server for the first time. Example:

server = Bio::Fetch::EBI.new
puts server.databases # returns an array of database names

This method works for the EBI Dbfetch server (and for the bioruby dbfetch server). Not all servers support this method.


Returns

array of database names



# File 'lib/bio/io/fetch.rb', line 192

def databases
  _get_single('info', 'dbs').strip.split(/\s+/)
end
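
The parsing step in the listing above (strip, then split on whitespace) can be tried in isolation. This is a minimal sketch: parse_info_list is a hypothetical helper, and the sample response body is made up for illustration rather than taken from a real server.

```ruby
# Hypothetical helper mirroring the parsing step in #databases:
# trim the response, then split it on any run of whitespace.
def parse_info_list(body)
  body.strip.split(/\s+/)
end

# Made-up response body; a real server's list will differ.
sample_body = "  embl\nena_sequence refseq\n"
p parse_info_list(sample_body)
```

Because the split is on whitespace of any kind, it does not matter whether the server separates names with spaces or newlines.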

#fetch(db, id, style = 'raw', format = nil) ⇒ Object

Fetch a database entry as specified by database (db), entry id (id), ‘raw’ text or ‘html’ (style), and format.

Examples:

server = Bio::Fetch.new(Bio::Fetch::EBI::URL)
puts server.fetch('embl','M33388','raw','fasta')
puts server.fetch('refseq','NM_12345','html','embl')

Arguments:

  • db: name of database to query (see Bio::Fetch#databases to get list of supported databases)

  • id: single ID or ID list separated by commas or white space

  • style: [raw|html] (default = ‘raw’)

  • format: name of output format (see Bio::Fetch#formats)



# File 'lib/bio/io/fetch.rb', line 172

def fetch(db, id, style = 'raw', format = nil)
  query = [ [ 'db',    db ],
            [ 'id',    id ],
            [ 'style', style ] ]
  query.push([ 'format', format ]) if format
  
  _get(query)
end
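
To show what the query pairs assembled above might look like on the wire, here is a minimal sketch using Ruby's standard library. The helper build_dbfetch_query is hypothetical, and the private _get method may encode the request differently; only the construction of the pairs is taken from the listing.

```ruby
require 'uri'

# Hypothetical helper (not part of Bio::Fetch): builds the same list of
# pairs as #fetch, then encodes it as a URL query string.
def build_dbfetch_query(db, id, style = 'raw', format = nil)
  query = [['db', db], ['id', id], ['style', style]]
  # The format pair is added only when a format is given, as in #fetch.
  query.push(['format', format]) if format
  URI.encode_www_form(query)
end

puts build_dbfetch_query('embl', 'M33388')
puts build_dbfetch_query('embl', 'M33388', 'raw', 'fasta')
```

The first call produces no format parameter, which leaves the choice of output format to the server.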

#formats(database = @database) ⇒ Object

Lists the formats that are available for a given database. Like the Bio::Fetch#databases method, not all servers support this method. This method is available on the EBI Dbfetch server (and on the bioruby dbfetch server).

Example:

server = Bio::Fetch::EBI.new()
puts server.formats('embl') # returns [ "default", "annot", ... ]

Arguments:

  • database: name of the database you want the supported formats for

Returns

array of formats (nil if no database is given and no default database is set)



# File 'lib/bio/io/fetch.rb', line 208

def formats(database = @database)
  if database
    query = [ [ 'info', 'formats' ],
              [ 'db',   database  ] ]
    _get(query).strip.split(/\s+/)
  end
end

#get_by_id(id) ⇒ Object

Get raw database entry by id. This method lets the Bio::Registry class use Bio::Fetch objects.



# File 'lib/bio/io/fetch.rb', line 155

def get_by_id(id)
  fetch(@database, id)
end
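
The relationship between the database attribute and get_by_id can be sketched with a stand-in class. FakeFetch below is hypothetical: it copies only the get_by_id logic shown above, assumes the database attribute is writable, and returns a description of the call instead of contacting a server.

```ruby
# Hypothetical stand-in for Bio::Fetch; no network access is involved.
class FakeFetch
  # Assumption: the default database can be assigned by the caller.
  attr_accessor :database

  # Describes the request instead of performing it.
  def fetch(db, id, style = 'raw', format = nil)
    "fetch(db=#{db.inspect}, id=#{id.inspect}, style=#{style.inspect})"
  end

  # Same body as the listing above: delegates to fetch with @database.
  def get_by_id(id)
    fetch(@database, id)
  end
end

server = FakeFetch.new
server.database = 'embl'
puts server.get_by_id('M33388')
```

If no default database has been set, the db argument passed to fetch is nil, so setting the attribute first is the safer pattern.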

#maxids ⇒ Object

A dbfetch server will only return entries up to a given maximum number. This method retrieves that number from the server. As with the databases and formats methods, not all servers support the maxids method. This method is available on the EBI Dbfetch server (and on the bioruby dbfetch server).

Example:

server = Bio::Fetch::EBI.new
puts server.maxids # currently returns 200

Arguments: none

Returns

integer



# File 'lib/bio/io/fetch.rb', line 228

def maxids
  _get_single('info', 'maxids').to_i
end
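
One possible use of the maxids value is to split a long ID list into batches that stay within the server's limit. This is a minimal sketch: batch_ids is a hypothetical helper, the IDs are placeholders, and the limit is hard-coded rather than queried from a server. Each batch is joined with commas, which is one of the separators the fetch method accepts for its id argument.

```ruby
# Hypothetical helper: groups IDs into comma-separated strings, each
# holding at most maxids entries.
def batch_ids(ids, maxids)
  ids.each_slice(maxids).map { |batch| batch.join(',') }
end

# Placeholder IDs and a small limit, chosen only to keep the output short.
ids = %w[ID1 ID2 ID3 ID4 ID5]
batch_ids(ids, 2).each { |id_list| puts id_list }
```

In real use, each resulting string could be passed as the id argument of a separate fetch call, with the limit taken from server.maxids.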