Class: Bio::KEGG::API

Inherits:
SOAPWSDL
Defined in:
lib/bio/io/keggapi.rb

Overview

Description

KEGG API is a web service for accessing the KEGG system via SOAP/WSDL.

References

For more information on the KEGG API, see the following site and read the reference manual.

List of methods

As of KEGG API v5.0

  • list_databases

  • list_organisms

  • list_pathways(org)

  • binfo(string)

  • bget(string)

  • bfind(string)

  • btit(string)

  • get_linkdb_by_entry(entry_id, db, start, max_results)

  • get_best_best_neighbors_by_gene(genes_id, start, max_results)

  • get_best_neighbors_by_gene(genes_id, start, max_results)

  • get_reverse_best_neighbors_by_gene(genes_id, start, max_results)

  • get_paralogs_by_gene(genes_id, start, max_results)

  • get_similarity_between_genes(genes_id1, genes_id2)

  • get_motifs_by_gene(genes_id, db)

  • get_genes_by_motifs(motif_id_list, start, max_results)

  • get_ko_by_gene(genes_id)

  • get_ko_members(ko_id)

  • get_oc_members_by_gene(genes_id, start, max_results)

  • get_pc_members_by_gene(genes_id, start, max_results)

  • mark_pathway_by_objects(pathway_id, object_id_list)

  • color_pathway_by_objects(pathway_id, object_id_list, fg_color_list, bg_color_list)

  • get_genes_by_pathway(pathway_id)

  • get_enzymes_by_pathway(pathway_id)

  • get_compounds_by_pathway(pathway_id)

  • get_reactions_by_pathway(pathway_id)

  • get_pathways_by_genes(genes_id_list)

  • get_pathways_by_enzymes(enzyme_id_list)

  • get_pathways_by_compounds(compound_id_list)

  • get_pathways_by_reactions(reaction_id_list)

  • get_linked_pathways(pathway_id)

  • get_genes_by_enzyme(enzyme_id, org)

  • get_enzymes_by_gene(genes_id)

  • get_enzymes_by_compound(compound_id)

  • get_enzymes_by_reaction(reaction_id)

  • get_compounds_by_enzyme(enzyme_id)

  • get_compounds_by_reaction(reaction_id)

  • get_reactions_by_enzyme(enzyme_id)

  • get_reactions_by_compound(compound_id)

  • get_genes_by_organism(org, start, max_results)

  • get_number_of_genes_by_organism(org)

KEGG API methods implemented only in BioRuby

In BioRuby, a filter method is added to the returned values so that selected fields of a complex data type can be picked out as an array.

#!/usr/bin/env ruby

require 'bio'

serv = Bio::KEGG::API.new
# arguments follow the v5.0 signature listed above: (genes_id, start, max_results)
results = serv.get_best_neighbors_by_gene("eco:b0002", 1, 5)

# case 0 : without filter
results.each do |hit|
  print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
end

# case 1 : select gene names and SW score only
fields = [:genes_id1, :genes_id2, :sw_score]
results.each do |hit|
  puts hit.filter(fields).join("\t")
end

# case 2 : also uses aligned position in each amino acid sequence etc.
fields1 = [:genes_id1, :start_position1, :end_position1, :best_flag_1to2]
fields2 = [:genes_id2, :start_position2, :end_position2, :best_flag_2to1]
results.each do |hit|
  print "> score: ", hit.sw_score, ", identity: ", hit.identity, "\n"
  print "1:\t", hit.filter(fields1).join("\t"), "\n"
  print "2:\t", hit.filter(fields2).join("\t"), "\n"
end

Using the filter method makes it easy to change which fields are selected and keeps the script clean.
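The idea behind filter can be tried without a server. The following is only a minimal sketch, assuming each hit behaves like a record with named fields; the Hit struct, its filter method, and the sample values are hypothetical stand-ins and not BioRuby's own implementation.

```ruby
# Hypothetical stand-in for one result record (not BioRuby's own class).
Hit = Struct.new(:genes_id1, :genes_id2, :sw_score, :identity) do
  # Return the values of the requested fields, in the requested order.
  def filter(fields)
    fields.map { |name| self[name] }
  end
end

# Made-up sample values, for illustration only.
hit = Hit.new("eco:b0002", "bsu:EXAMPLE", 1234, 0.45)

fields = [:genes_id1, :genes_id2, :sw_score]
puts hit.filter(fields).join("\t")
```

Because the field list is plain data, switching the output columns only means editing the array, which is the point the text above makes.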

  • Bio::KEGG::API#get_all_neighbors_by_gene(genes_id, org)

  • Bio::KEGG::API#get_all_best_best_neighbors_by_gene(genes_id)

  • Bio::KEGG::API#get_all_best_neighbors_by_gene(genes_id)

  • Bio::KEGG::API#get_all_reverse_best_neighbors_by_gene(genes_id)

  • Bio::KEGG::API#get_all_paralogs_by_gene(genes_id)

  • Bio::KEGG::API#get_all_genes_by_motifs(motif_id_list)

  • Bio::KEGG::API#get_all_oc_members_by_gene(genes_id)

  • Bio::KEGG::API#get_all_pc_members_by_gene(genes_id)

  • Bio::KEGG::API#get_all_genes_by_organism(org)

These methods are wrappers for the methods without all in their names; internally they iterate to retrieve all the results using the start/max_results value pairs described above. For example,

#!/usr/bin/env ruby

require 'soap/wsdlDriver'

wsdl = "http://soap.genome.jp/KEGG.wsdl"
serv = SOAP::WSDLDriverFactory.new(wsdl).create_driver
serv.generate_explicit_type = true

start = 1
max_results = 100

loop do
  results = serv.get_best_neighbors_by_gene('eco:b0002', start, max_results)
  break unless results	# when no more results returned
  results.each do |hit|
    print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
  end
  start += max_results
end

can be written as

#!/usr/bin/env ruby

require 'bio'

serv = Bio::KEGG::API.new

results = serv.get_all_best_neighbors_by_gene('eco:b0002')
results.each do |hit|
  print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
end
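
The helper that the get_all_* wrappers call is not shown on this page. The following is a hedged sketch of what such a paging loop might look like, using a hypothetical FakeDriver in place of the SOAP driver; fetch_all, get_items, and the 250 fake records are assumptions made for illustration.

```ruby
# Hypothetical driver serving 250 fake records in pages (stands in for the SOAP driver).
class FakeDriver
  RECORDS = (1..250).to_a

  # start is 1-based, as in the start/max_results examples above.
  def get_items(start, max_results)
    page = RECORDS[start - 1, max_results]
    page.nil? || page.empty? ? nil : page
  end
end

# Request page after page until the driver returns nothing, collecting all results.
def fetch_all(driver, method_name, *args, max_results: 100)
  start = 1
  all = []
  loop do
    page = driver.send(method_name, *args, start, max_results)
    break if page.nil? || page.empty?
    all.concat(page)
    start += max_results
  end
  all
end

results = fetch_all(FakeDriver.new, :get_items)
puts results.length
```

A smaller max_results means more round trips but smaller responses, which matters when requests time out.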
  • Bio::KEGG::API#save_image(url, filename = nil)

Some KEGG API methods return the URL of a generated image. This method saves the image specified by the URL. The filename can be given as the second argument; otherwise the basename of the URL is used.

#!/usr/bin/env ruby

require 'bio'

serv = Bio::KEGG::API.new("http://soap.genome.jp/v3.0/KEGG.wsdl")

list = ["eco:b1002", "eco:b2388"]
url = serv.mark_pathway_by_objects("path:eco00010", list)

# Save with the original filename (eco00010.gif in this case)
serv.save_image(url)

# or save as "save_image.gif"
serv.save_image(url, "save_image.gif")
  • Bio::KEGG::API#get_entries(entry_id_list)

  • Bio::KEGG::API#get_aaseqs(entry_id_list)

  • Bio::KEGG::API#get_naseqs(entry_id_list)

  • Bio::KEGG::API#get_definitions(entry_id_list)

These methods are provided as shortcuts and for backward compatibility (they existed in older versions of the KEGG API).

Constant Summary

SERVER_URI =
"http://soap.genome.jp/KEGG.wsdl"

Instance Attribute Summary

Attributes inherited from SOAPWSDL

#log, #wsdl

Instance Method Summary

Methods inherited from SOAPWSDL

#list_methods

Constructor Details

#initialize(wsdl = nil) ⇒ API

Connects to the KEGG API’s SOAP server. A WSDL file is automatically downloaded and parsed to generate the SOAP client driver. The default URL for the WSDL is http://soap.genome.jp/KEGG.wsdl, but it can be changed by the argument or by the wsdl= method.



# File 'lib/bio/io/keggapi.rb', line 196

def initialize(wsdl = nil)
  @wsdl = wsdl || SERVER_URI
  @log = nil
  @start = 1
  @max_results = 100
  create_driver
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(*arg) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 215

def method_missing(*arg)
  begin
    results = @driver.send(*arg)
  rescue Timeout::Error
    retry
  end
  results = add_filter(results)
  return results
end
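
The delegation pattern above can be exercised in isolation. This is a sketch under stated assumptions: FlakyDriver and Proxy are hypothetical classes, the returned database names are made up, and the retry count is capped at MAX_RETRIES, which differs from the listing above, where retry has no limit.

```ruby
require 'timeout'

# Hypothetical driver that times out twice before answering.
class FlakyDriver
  def initialize
    @calls = 0
  end

  def list_databases
    @calls += 1
    raise Timeout::Error, "simulated timeout" if @calls < 3
    ["genes", "pathway"] # made-up return value
  end
end

# Forwards unknown methods to the wrapped driver, retrying on timeout.
class Proxy
  MAX_RETRIES = 5 # assumption: a cap added for this sketch

  def initialize(driver)
    @driver = driver
  end

  def method_missing(name, *args)
    return super unless @driver.respond_to?(name)
    attempts = 0
    begin
      @driver.send(name, *args)
    rescue Timeout::Error
      attempts += 1
      retry if attempts < MAX_RETRIES
      raise
    end
  end

  def respond_to_missing?(name, include_private = false)
    @driver.respond_to?(name, include_private) || super
  end
end

puts Proxy.new(FlakyDriver.new).list_databases.inspect
```

Capping the retries is a design choice for the sketch: a server that never answers would otherwise keep the caller looping indefinitely.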

Instance Attribute Details

#max_results ⇒ Object

Returns the current value of ‘max_results’ for the methods that take start/max_results argument pairs, or changes its default value. If your request times out, try a smaller value for max_results.



# File 'lib/bio/io/keggapi.rb', line 213

def max_results
  @max_results
end

#start ⇒ Object

Returns the current value of the ‘start’ count for the methods that take start/max_results argument pairs, or changes its default value.



# File 'lib/bio/io/keggapi.rb', line 207

def start
  @start
end

Instance Method Details

#get_aaseqs(ary = []) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 292

def get_aaseqs(ary = [])
  result = ''
  step = [@max_results, 50].min
  0.step(ary.length, step) do |i|
    str = "-f -n a " + ary[i, step].join(" ")
    if entry = @driver.send(:bget, str)
      result << entry.to_s
    end
  end
  return result
end
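
The batching in get_aaseqs can be traced without a server. The sketch below assumes max_results is 100 and uses a hypothetical RecordingDriver that only stores the strings it is given; the 120 entry IDs are generated for illustration and are not taken from KEGG.

```ruby
# Hypothetical recorder standing in for the SOAP driver: it stores each bget argument.
class RecordingDriver
  attr_reader :requests

  def initialize
    @requests = []
  end

  def bget(str)
    @requests << str
    nil # pretend the server returned nothing
  end
end

ids = (1..120).map { |n| format("eco:b%04d", n) } # generated sample IDs
step = [100, 50].min                              # same rule as above, with max_results = 100
driver = RecordingDriver.new

# Same stepping as in get_aaseqs: one request per slice of at most `step` IDs.
0.step(ids.length, step) do |i|
  driver.bget("-f -n a " + ids[i, step].join(" "))
end

puts driver.requests.length
puts driver.requests.last
```

With 120 IDs the loop visits offsets 0, 50 and 100, so three requests are made and the last one carries the remaining 20 IDs. When the list length is an exact multiple of the step, the final offset equals the length and the slice there is empty, so the last request string contains only the options.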

#get_all_best_best_neighbors_by_gene(genes_id) ⇒ Object

def get_all_neighbors_by_gene(genes_id, org)
  get_all(:get_neighbors_by_gene, genes_id, org)
end



# File 'lib/bio/io/keggapi.rb', line 230

def get_all_best_best_neighbors_by_gene(genes_id)
  get_all(:get_best_best_neighbors_by_gene, genes_id)
end

#get_all_best_neighbors_by_gene(genes_id) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 234

def get_all_best_neighbors_by_gene(genes_id)
  get_all(:get_best_neighbors_by_gene, genes_id)
end

#get_all_genes_by_motifs(motif_id_list) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 246

def get_all_genes_by_motifs(motif_id_list)
  get_all(:get_genes_by_motifs, motif_id_list)
end

#get_all_genes_by_organism(org) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 258

def get_all_genes_by_organism(org)
  get_all(:get_genes_by_organism, org)
end

#get_all_linkdb_by_entry(entry_id, db) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 262

def get_all_linkdb_by_entry(entry_id, db)
  get_all(:get_linkdb_by_entry, entry_id, db)
end

#get_all_oc_members_by_gene(genes_id) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 250

def get_all_oc_members_by_gene(genes_id)
  get_all(:get_oc_members_by_gene, genes_id)
end

#get_all_paralogs_by_gene(genes_id) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 242

def get_all_paralogs_by_gene(genes_id)
  get_all(:get_paralogs_by_gene, genes_id)
end

#get_all_pc_members_by_gene(genes_id) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 254

def get_all_pc_members_by_gene(genes_id)
  get_all(:get_pc_members_by_gene, genes_id)
end

#get_all_reverse_best_neighbors_by_gene(genes_id) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 238

def get_all_reverse_best_neighbors_by_gene(genes_id)
  get_all(:get_reverse_best_neighbors_by_gene, genes_id)
end

#get_definitions(ary = []) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 316

def get_definitions(ary = [])
  result = ''
  step = [@max_results, 50].min
  0.step(ary.length, step) do |i|
    str = ary[i, step].join(" ")
    if entry = @driver.send(:btit, str)
      result << entry.to_s
    end
  end
  return result
end

#get_entries(ary = []) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 280

def get_entries(ary = [])
  result = ''
  step = [@max_results, 50].min
  0.step(ary.length, step) do |i|
    str = ary[i, step].join(" ")
    if entry = @driver.send(:bget, str)
      result << entry.to_s
    end
  end
  return result
end

#get_naseqs(ary = []) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 304

def get_naseqs(ary = [])
  result = ''
  step = [@max_results, 50].min
  0.step(ary.length, step) do |i|
    str = "-f -n n " + ary[i, step].join(" ")
    if entry = @driver.send(:bget, str)
      result << entry.to_s
    end
  end
  return result
end

#save_image(url, filename = nil) ⇒ Object



# File 'lib/bio/io/keggapi.rb', line 267

def save_image(url, filename = nil)
  schema, user, host, port, reg, path, = URI.split(url)
  filename ||= File.basename(path)

  http = Bio::Command.new_http(host, port)
  response = http.get(path)
  File.open(filename, "w+") do |f|
    f.print response.body
  end
  return filename
end
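
How save_image derives its default filename can be checked offline. This is a small sketch using only the standard uri library; the URL is a hypothetical example and not one returned by the KEGG server.

```ruby
require 'uri'

# Hypothetical image URL of the kind a pathway-marking call might return.
url = "http://example.org/tmp/mark_pathway/eco00010.gif"

# URI.split returns [scheme, userinfo, host, port, registry, path, opaque, query, fragment].
_scheme, _user, host, _port, _reg, path, = URI.split(url)

# With no filename argument, the basename of the URL path is used.
filename = File.basename(path)

puts host
puts filename
```

Passing an explicit second argument to save_image bypasses this step, which is useful when several URLs share the same basename.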