Class: Bio::PubMed
Overview
Description
The Bio::PubMed class provides several ways to retrieve bibliographic information from the PubMed database at www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed. Two types of queries are possible:
- searching for PubMed IDs given a query string:
  - Bio::PubMed.search
  - Bio::PubMed.esearch
- retrieving the MEDLINE text (i.e. authors, journal, abstract, …) given a PubMed ID:
  - Bio::PubMed.query
  - Bio::PubMed.pmfetch
  - Bio::PubMed.efetch
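The methods within each group are interchangeable and should return equivalent results. Note that efetch accepts several PubMed IDs at once and returns an array of entries, whereas query and pmfetch return a single string.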
Additional information about the MEDLINE format and PubMed programmable APIs can be found on the following websites:
- Overview: www.ncbi.nlm.nih.gov/entrez/query/static/overview.html
- How to link: www.ncbi.nlm.nih.gov/entrez/query/static/linking.html
- MEDLINE format: www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#MEDLINEDisplayFormat
- Search field descriptions and tags: www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#SearchFieldDescriptionsandTags
- Entrez utilities index: www.ncbi.nlm.nih.gov/entrez/utils/utils_index.html
- PmFetch CGI help: www.ncbi.nlm.nih.gov/entrez/utils/pmfetch_help.html
- E-Utilities CGI help: eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Usage
require 'bio'
# If you don't know the PubMed ID:
Bio::PubMed.search("(genome AND analysis) OR bioinformatics").each do |x|
  p x
end
Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics").each do |x|
  p x
end
# To retrieve the MEDLINE entry for a given PubMed ID:
puts Bio::PubMed.query("10592173")
puts Bio::PubMed.pmfetch("10592173")
puts Bio::PubMed.efetch("10592173", "14693808")
# This can be converted into a Bio::MEDLINE object:
manuscript = Bio::PubMed.query("10592173")
medline = Bio::MEDLINE.new(manuscript)
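For readers unfamiliar with the MEDLINE display format, the following sketch shows roughly what such text looks like and how it breaks down into fields: each field starts with a short upper-case tag followed by "- ", and wrapped values continue on indented lines. The helper `parse_medline_fields` and the sample record are hypothetical and invented for illustration; they are not part of BioRuby, and Bio::MEDLINE remains the intended way to parse real entries.

```ruby
# Hypothetical helper, not part of BioRuby: groups MEDLINE lines by tag.
def parse_medline_fields(text)
  fields = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  text.each_line do |line|
    line = line.chomp
    if line =~ /\A([A-Z]{2,4})\s*- (.*)\z/
      # A new field: remember the tag and store its value.
      current = $1
      fields[current] << $2
    elsif current && line =~ /\A\s+(\S.*)\z/
      # Continuation line: append to the most recent value of the current tag.
      fields[current][-1] = fields[current][-1] + " " + $1
    end
  end
  fields
end

# Invented sample record for illustration only.
SAMPLE_MEDLINE = <<~TEXT
  PMID- 12345678
  TI  - An example title that is long enough to wrap onto
        a second line.
  AU  - Doe J
  AU  - Roe R
TEXT

fields = parse_medline_fields(SAMPLE_MEDLINE)
p fields["PMID"]
p fields["AU"]
p fields["TI"]
```

Repeated tags such as AU accumulate in an array, which is why every value in the sketch is stored as a list.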
Class Method Summary
- .efetch(*ids) ⇒ Object
  Retrieves PubMed entries by PMID using Entrez efetch and returns an array of MEDLINE-formatted strings.
- .esearch(str, hash = {}) ⇒ Object
  Searches the PubMed database for the given keywords using E-Utilities and returns an array of PubMed IDs.
- .pmfetch(id) ⇒ Object
  Retrieves a PubMed entry by PMID using Entrez pmfetch and returns a MEDLINE-formatted string.
- .query(id) ⇒ Object
  Retrieves a PubMed entry by PMID using Entrez query and returns a MEDLINE-formatted string.
- .search(str) ⇒ Object
  Searches the PubMed database for the given keywords using Entrez query and returns an array of PubMed IDs.
Class Method Details
.efetch(*ids) ⇒ Object
# File 'lib/bio/io/pubmed.rb', line 175
def self.efetch(*ids)
  return [] if ids.empty?
  host = "eutils.ncbi.nlm.nih.gov"
  path = "/entrez/eutils/efetch.fcgi?tool=bioruby&db=pubmed&retmode=text&rettype=medline&id="
  ids = ids.join(",")
  http = Bio::Command.new_http(host)
  response, = http.get(path + ids)
  result = response.body
  result = result.split(/\n\n+/)
  return result
end
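The last step of efetch, splitting the response body on blank lines, can be tried offline. The sketch below is a minimal illustration; `split_medline_records` is a hypothetical helper name and the response body is invented, not real PubMed output.

```ruby
# Hypothetical helper, not part of BioRuby: mirrors the split used by efetch.
def split_medline_records(body)
  # One or more blank lines separate consecutive records.
  body.split(/\n\n+/)
end

# Invented response body containing two abbreviated records.
body = "PMID- 11111111\nTI  - First example.\n\nPMID- 22222222\nTI  - Second example.\n"

records = split_medline_records(body)
puts records.length
records.each { |record| puts record.lines.first }
```

If a body begins with blank lines, the split yields an empty string as its first element, so callers may want to discard empty entries before further processing.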
.esearch(str, hash = {}) ⇒ Object
Searches the PubMed database for the given keywords using E-Utilities and returns an array of PubMed IDs.
For information on the possible arguments, see eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed
Arguments:
- str: query string (required)
- hash: optional E-Utilities parameters, given as a hash with any of the following keys:
  - field
  - reldate
  - mindate
  - maxdate
  - datetype
  - retstart
  - retmax (default 100)
  - retmode
  - rettype
Returns:
- array of PubMed IDs
# File 'lib/bio/io/pubmed.rb', line 106
def self.esearch(str, hash = {})
  hash['retmax'] = 100 unless hash['retmax']
  opts = []
  hash.each do |k, v|
    opts << "#{k}=#{v}"
  end
  host = "eutils.ncbi.nlm.nih.gov"
  path = "/entrez/eutils/esearch.fcgi?tool=bioruby&db=pubmed&#{opts.join('&')}&term="
  http = Bio::Command.new_http(host)
  response, = http.get(path + CGI.escape(str))
  result = response.body
  result = result.scan(/<Id>(.*?)<\/Id>/m).flatten
  return result
end
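To see how the options hash ends up in the request and how IDs are pulled from the reply, the two steps can be sketched without any network access. The helpers `build_esearch_path` and `extract_ids` are hypothetical names, the XML fragment is invented, and the sketch departs from the source in two labeled ways: it merges the default instead of modifying the caller's hash, and it uses `URI.encode_www_form_component` in place of `CGI.escape` (both encode a space as `+`).

```ruby
require 'uri'

# Hypothetical helper, not part of BioRuby: builds the request path in the
# same shape as esearch, but leaves the caller's hash untouched.
def build_esearch_path(term, options = {})
  options = { 'retmax' => 100 }.merge(options)
  query = options.map { |key, value| "#{key}=#{value}" }.join('&')
  # Assumption: URI.encode_www_form_component stands in for CGI.escape here.
  "/entrez/eutils/esearch.fcgi?tool=bioruby&db=pubmed&#{query}&term=" +
    URI.encode_www_form_component(term)
end

# Hypothetical helper: collects the text of every <Id> element in a reply.
def extract_ids(xml)
  xml.scan(/<Id>(.*?)<\/Id>/m).flatten
end

puts build_esearch_path("genome AND analysis", { 'retmax' => 5 })

# Invented reply fragment for illustration only.
sample_xml = "<IdList>\n<Id>11111111</Id>\n<Id>22222222</Id>\n</IdList>"
p extract_ids(sample_xml)
```

Because the default is stored under the string key 'retmax', options are best passed with string keys as well; a symbol key would not override the default in either the source or this sketch.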
.pmfetch(id) ⇒ Object
Retrieves a PubMed entry by PMID using Entrez pmfetch and returns a MEDLINE-formatted string.
Arguments:
- id: PubMed ID (required)
Returns:
- MEDLINE-formatted String
# File 'lib/bio/io/pubmed.rb', line 151
def self.pmfetch(id)
  host = "www.ncbi.nlm.nih.gov"
  path = "/entrez/utils/pmfetch.fcgi?tool=bioruby&mode=text&report=medline&db=PubMed&id="
  http = Bio::Command.new_http(host)
  response, = http.get(path + id.to_s)
  result = response.body
  if result =~ /#{id}\s+Error/
    raise( result )
  else
    result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
    return result
  end
end
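The clean-up chain applied to a successful response is easier to follow on a small input. In this sketch `normalize_medline` is a hypothetical helper name and the raw body is invented; only the three string operations are taken from the source.

```ruby
# Hypothetical helper, not part of BioRuby: applies the same clean-up chain
# that pmfetch uses on the raw response body.
def normalize_medline(raw)
  raw.gsub("\r", "\n")        # carriage returns become newlines
     .squeeze("\n")           # runs of newlines collapse to a single one
     .gsub(/<\/?pre>/, '')    # the <pre> and </pre> wrapper tags are dropped
end

# Invented raw body for illustration only.
raw = "<pre>\r\nPMID- 12345678\r\nTI  - An example title.\r\n\r\n</pre>"
puts normalize_medline(raw)
```

The tags are removed after the newlines are squeezed, so a newline that directly followed the opening tag survives at the start of the result.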
.query(id) ⇒ Object
Retrieves a PubMed entry by PMID using Entrez query and returns a MEDLINE-formatted string.
Arguments:
- id: PubMed ID (required)
Returns:
- MEDLINE-formatted String
# File 'lib/bio/io/pubmed.rb', line 130
def self.query(id)
  host = "www.ncbi.nlm.nih.gov"
  path = "/entrez/query.fcgi?tool=bioruby&cmd=Text&dopt=MEDLINE&db=PubMed&uid="
  http = Bio::Command.new_http(host)
  response, = http.get(path + id.to_s)
  result = response.body
  if result =~ /#{id}\s+Error/
    raise( result )
  else
    result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
    return result
  end
end
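Since a failed lookup is reported by raising the response body as a plain string, callers see a RuntimeError. The sketch below reproduces only that check, offline; `check_entrez_body` is a hypothetical helper name and both response bodies, including the wording of the error message, are invented.

```ruby
# Hypothetical helper, not part of BioRuby: mirrors the error check in query.
def check_entrez_body(id, body)
  # A body containing "<id> Error" (with whitespace between) is treated as failed.
  raise body if body =~ /#{id}\s+Error/
  body
end

# Invented response bodies for illustration only.
good = "PMID- 12345678\nTI  - An example title.\n"
bad  = "99999999 Error occurred: example message"

puts check_entrez_body("12345678", good)

begin
  check_entrez_body("99999999", bad)
rescue RuntimeError => e
  # raise with a String produces a RuntimeError carrying that string.
  puts "lookup failed: #{e.message}"
end
```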
.search(str) ⇒ Object
Searches the PubMed database for the given keywords using Entrez query and returns an array of PubMed IDs.
Arguments:
- str: query string (required)
Returns:
- array of PubMed IDs
# File 'lib/bio/io/pubmed.rb', line 76
def self.search(str)
  host = "www.ncbi.nlm.nih.gov"
  path = "/entrez/query.fcgi?tool=bioruby&cmd=Search&doptcmdl=MEDLINE&db=PubMed&term="
  http = Bio::Command.new_http(host)
  response, = http.get(path + CGI.escape(str))
  result = response.body
  result = result.gsub("\r", "\n").squeeze("\n")
  result = result.scan(/<pre>(.*?)<\/pre>/m).flatten
  return result
end
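The extraction step in search relies on each result being wrapped in its own `<pre>` element. A minimal offline sketch of that step follows; `extract_pre_blocks` is a hypothetical helper name and the page fragment is invented rather than a real Entrez reply.

```ruby
# Hypothetical helper, not part of BioRuby: mirrors the extraction in search.
def extract_pre_blocks(html)
  html.gsub("\r", "\n")                  # normalize line endings
      .squeeze("\n")                     # collapse repeated newlines
      .scan(/<pre>(.*?)<\/pre>/m)        # capture the text inside each <pre>
      .flatten                           # one string per match
end

# Invented page fragment for illustration only.
html = "<html><pre>11111111</pre>\r\n<p>ignored</p>\r\n<pre>22222222</pre></html>"
p extract_pre_blocks(html)
```

The captured text is returned as is, so any whitespace inside a `<pre>` element would be part of the corresponding array entry.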