Class: Bio::PubMed

Inherits:
NCBI::REST
Defined in:
lib/bio/io/pubmed.rb

Overview

Description

The Bio::PubMed class provides several ways to retrieve bibliographic information from the PubMed database at NCBI.

Basically, two types of queries are possible:

  • searching for PubMed IDs given a query string:

    • Bio::PubMed#esearch (recommended)

    • Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)

  • retrieving the MEDLINE text (i.e. authors, journal, abstract, …) given a PubMed ID

    • Bio::PubMed#efetch (recommended)

    • Bio::PubMed#query (will be deprecated)

    • Bio::PubMed#pmfetch (will be deprecated)

Since BioRuby 1.5, all implementations use NCBI E-Utilities services. The different methods within each group remain because their argument specifications and/or return values differ. The search, query, and pmfetch methods will be made obsolete in the future.

Additional information about the MEDLINE format and PubMed programmable APIs can be found on the following websites:

Usage

require 'bio'

# If you don't know the PubMed ID:
Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics").each do |x|
  p x
end

Bio::PubMed.search("(genome AND analysis) OR bioinformatics").each do |x|
  p x
end

# To retrieve the MEDLINE entry for a given PubMed ID:
Bio::PubMed.efetch("10592173").each { |x| puts x }
puts Bio::PubMed.query("10592173")
puts Bio::PubMed.pmfetch("10592173")

# To retrieve MEDLINE entries for given PubMed IDs:
Bio::PubMed.efetch([ "10592173", "14693808" ]).each { |x| puts x }
puts Bio::PubMed.query("10592173", "14693808") # returns a String

# This can be converted into a Bio::MEDLINE object:
manuscript = Bio::PubMed.query("10592173")
medline = Bio::MEDLINE.new(manuscript)

Constant Summary

Constants inherited from NCBI::REST

NCBI::REST::NCBI_INTERVAL

Class Method Summary

Instance Method Summary

Methods inherited from NCBI::REST

#einfo, einfo, #esearch_count, esearch_count

Class Method Details

.efetch(*args) ⇒ Object

The same as Bio::PubMed.new.efetch(*args).



# File 'lib/bio/io/pubmed.rb', line 190

def self.efetch(*args)
  self.new.efetch(*args)
end

.esearch(*args) ⇒ Object

The same as Bio::PubMed.new.esearch(*args).



# File 'lib/bio/io/pubmed.rb', line 185

def self.esearch(*args)
  self.new.esearch(*args)
end

.pmfetch(*args) ⇒ Object

This method will be DEPRECATED. Use the efetch method instead.

The same as Bio::PubMed.new.pmfetch(*args).



# File 'lib/bio/io/pubmed.rb', line 211

def self.pmfetch(*args)
  self.new.pmfetch(*args)
end

.query(*args) ⇒ Object

This method will be DEPRECATED. Use the efetch method instead.

The same as Bio::PubMed.new.query(*args).



# File 'lib/bio/io/pubmed.rb', line 204

def self.query(*args)
  self.new.query(*args)
end

.search(*args) ⇒ Object

This method will be DEPRECATED. Use the esearch method instead.

The same as Bio::PubMed.new.search(*args).



# File 'lib/bio/io/pubmed.rb', line 197

def self.search(*args)
  self.new.search(*args)
end
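
The class methods above all follow one pattern: create a fresh instance and forward every argument to the instance method of the same name. A minimal sketch of that pattern, using a hypothetical stand-in class (`FakeClient` and its `fetch` method are illustrative names, not part of BioRuby) so that it runs without network access:

```ruby
# Hypothetical stand-in class; it only demonstrates the delegation pattern
# and performs no network access.
class FakeClient
  # Instance method: accepts IDs either as separate arguments or as an array.
  def fetch(*ids)
    ids.flatten.map { |id| "record for #{id}" }
  end

  # Same shape as Bio::PubMed.efetch: build a new instance, forward all args.
  def self.fetch(*args)
    new.fetch(*args)
  end
end

p FakeClient.fetch(1, 2)
p FakeClient.new.fetch(1, 2)
```

Both calls produce the same result, which is why the class-level form is convenient for one-off queries.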

Instance Method Details

#efetch(ids, hash = {}) ⇒ Object

Retrieves PubMed entries by PMID using Entrez EFetch and returns them as MEDLINE-formatted strings. Multiple PubMed IDs can be provided:

Bio::PubMed.efetch(123)
Bio::PubMed.efetch([123,456,789])

Arguments:

  • ids: list of PubMed IDs (required)

  • hash: hash of E-Utils options

    • "retmode": "xml", "html", …

    • "rettype": "medline", …

    • "retmax": integer (default 100)

    • "retstart": integer

    • "field"

    • "reldate"

    • "mindate"

    • "maxdate"

    • "datetype"

Returns

Array of MEDLINE-formatted Strings when "retmode" is unset or "text"; otherwise the result is returned unsplit.



# File 'lib/bio/io/pubmed.rb', line 116

def efetch(ids, hash = {})
  opts = { "db" => "pubmed", "rettype"  => "medline" }
  opts.update(hash)
  result = super(ids, opts)
  if !opts["retmode"] or opts["retmode"] == "text"
    result = result.split(/\n\n+/)
  end
  result
end
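
The blank-line split in the listing above can be tried offline. The following is a minimal sketch, assuming a hand-written two-record sample in place of a real E-Utilities response; `SAMPLE_MEDLINE` and `split_medline_records` are hypothetical names, not part of BioRuby, and the titles are placeholders:

```ruby
# Hypothetical sample: two abbreviated MEDLINE-style records separated by a
# blank line. Real responses carry many more fields.
SAMPLE_MEDLINE = <<~TEXT
  PMID- 10592173
  TI  - First sample title

  PMID- 14693808
  TI  - Second sample title
TEXT

# Mirrors the split used in Bio::PubMed#efetch: one or more blank lines
# separate consecutive records.
def split_medline_records(text)
  text.split(/\n\n+/)
end

records = split_medline_records(SAMPLE_MEDLINE)
records.each_with_index do |record, i|
  puts "record #{i}: #{record.lines.first.chomp}"
end
```

Each element of `records` holds the text of one entry, which is the shape that Bio::MEDLINE.new accepts in the Usage section.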

#esearch(str, hash = {}) ⇒ Object

Searches the PubMed database for the given keywords using E-Utils and returns an array of PubMed IDs.

For information on the possible arguments, see eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed


Arguments:

  • str: query string (required)

  • hash: hash of E-Utils options

    • "retmode": "xml", "html", …

    • "rettype": "medline", …

    • "retmax": integer (default 100)

    • "retstart": integer

    • "field"

    • "reldate"

    • "mindate"

    • "maxdate"

    • "datetype"

Returns

array of PubMed IDs or a number of results



# File 'lib/bio/io/pubmed.rb', line 92

def esearch(str, hash = {})
  opts = { "db" => "pubmed" }
  opts.update(hash)
  super(str, opts)
end
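
The option handling in the listing above starts from a default hash and lets the caller's hash overwrite matching keys. A minimal sketch of just that merge step, with no network access; `build_esearch_options` is a hypothetical helper name used only for illustration:

```ruby
# Stand-in for the option handling in Bio::PubMed#esearch: defaults first,
# then caller-supplied options overwrite keys that match.
def build_esearch_options(user_options = {})
  opts = { "db" => "pubmed" }
  opts.update(user_options)
  opts
end

# With no options, only the default database is set.
p build_esearch_options

# Paging options are added alongside the default.
p build_esearch_options({ "retmax" => 5, "retstart" => 10 })
```

Because the default key is the string "db", options are expected as string keys here; a caller-supplied "db" entry would replace the default in the same way.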

#pmfetch(id) ⇒ Object

This method will be DEPRECATED in the future.

Retrieves a PubMed entry by PMID and returns it as a MEDLINE-formatted string.


Arguments:

  • id: PubMed ID (required)

Returns

MEDLINE formatted String



# File 'lib/bio/io/pubmed.rb', line 173

def pmfetch(id)
  warn "Bio::PubMed#pmfetch internally use Bio::PubMed#efetch. Using Bio::PubMed#efetch is recommended." if $VERBOSE

  ret = efetch(id)
  if ret && ret.size > 0 then
    ret.join("\n\n") + "\n"
  else
    ""
  end
end

#query(*ids) ⇒ Object

This method will be DEPRECATED in the future.

Retrieves PubMed entries by PMID and returns them as a single MEDLINE-formatted string. Since BioRuby 1.5, this is done through Bio::PubMed#efetch.


Arguments:

  • ids: one or more PubMed IDs (required)

Returns

MEDLINE formatted String



# File 'lib/bio/io/pubmed.rb', line 155

def query(*ids)
  warn "Bio::PubMed#query internally uses Bio::PubMed#efetch. Using Bio::PubMed#efetch is recommended." if $VERBOSE
  ret = efetch(ids)
  if ret && ret.size > 0 then
    ret.join("\n\n") + "\n"
  else
    ""
  end
end
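
The return value of #query (and of #pmfetch) is the array from #efetch joined back into one string. A minimal sketch of that joining step in isolation; `join_medline_records` is a hypothetical helper name, and the one-line records are placeholders for full MEDLINE entries:

```ruby
# Mirrors the tail of Bio::PubMed#query and #pmfetch: records are joined with
# a blank line and terminated with a newline; nil or empty input gives "".
def join_medline_records(records)
  if records && records.size > 0
    records.join("\n\n") + "\n"
  else
    ""
  end
end

joined = join_medline_records(["PMID- 10592173", "PMID- 14693808"])
puts joined
puts join_medline_records([]).empty?
```

This is why #query returns a String even for several IDs, while #efetch returns an Array for the same input.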

#search(str) ⇒ Object

This method will be DEPRECATED in the future.

Searches the PubMed database for the given keywords and returns an array of PubMed IDs.

Caution: this method returns only the first 20 hits.

Using the ‘esearch’ method instead is strongly recommended.

Implementation details: Since BioRuby 1.5, this method internally uses NCBI E-Utilities with retmax=20 through the Bio::PubMed#esearch method.


Arguments:

  • str: query string (required)

Returns

array of PubMed IDs



# File 'lib/bio/io/pubmed.rb', line 142

def search(str)
  warn "Bio::PubMed#search is now a subset of Bio::PubMed#esearch. Using Bio::PubMed#esearch is recommended." if $VERBOSE
  esearch(str, { "retmax" => 20 })
end