Class: Scopus

Inherits:
Service
Includes:
ActionView::Helpers::SanitizeHelper, MetadataHelper, UmlautHttp
Defined in:
app/service_adaptors/scopus.rb

Overview

Service adapter plug-in.

NOTE: This is based on deprecated Scopus APIs, which Scopus will retire on 31 December 2014. Please see scopus2.rb instead, which uses the new Scopus APIs.

PURPOSE: Includes “cited by”, “similar articles” and “more by these authors” links from Scopus. Will also include an abstract from Scopus if one is found.

LIMITATIONS: You must be a Scopus customer for the generated links to work for your users at all! Off-campus users should go through EZProxy; see the EZProxy plug-in. A match must be found in Scopus, naturally. “Cited by” is only included if Scopus reports a non-zero “cited by” count. There is no good way to pre-check the similar/more-by links, however, so they are provided blind and may lead to 0 hits. You can turn them off with @include_similar and @include_more_by_authors.

REGISTERING: This plug-in has to use two separate Scopus APIs. For the first, the Scopus JSON search API, you must register and get an API key from Scopus, which you can do at searchapi.scopus.com. Then include it as @json_api_key in the service config.

For the second Scopus API, you theoretically need a Scopus “PartnerID” and corresponding “release number”, set in @partner_id and @scopus_release. There is no easy way to get one. Scopus says:

"To obtain a partner ID or release number, contact your nearest regional
Scopus office. A list of Scopus contacts is available at
http://www.info.scopus.com/contactus/index.shtml"

Bah! But fortunately, using the “partnerID” assigned to the Scopus JSON API, 65, seems to work, and is coded here as the default. You could try going with that. When you register a partnerID, you also get a ‘salt key’, which this code does not currently use; @link_salt_key is reserved for it in case later functionality needs it.
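
As a rough illustration of the settings described above, they could be gathered into a hash like the one below. The key names mirror the instance variables named in this overview; the API key value is a placeholder, and ConfigDemo is a hypothetical stand-in showing one way config values might override defaults. The real Service base class, and the file format Umlaut reads its service configuration from, may differ.

```ruby
# Hypothetical configuration for the Scopus adapter, written as a plain Hash.
# Every value here is a placeholder, not a real credential.
scopus_config = {
  "json_api_key"            => "YOUR_JSON_API_KEY", # issued when you register for the JSON search API
  "partner_id"              => 65,                  # default noted above; replace if you have your own
  "scopus_release"          => "R6.0.0",
  "include_similar"         => false,               # suppress the unchecked "similar articles" link
  "include_more_by_authors" => false                # suppress the unchecked "more by these authors" link
}

# Sketch only: set a default, then let each config key overwrite the
# instance variable of the same name.
class ConfigDemo
  def initialize(config)
    @include_similar = true
    config.each { |key, value| instance_variable_set("@#{key}", value) }
  end
end

demo = ConfigDemo.new(scopus_config)
```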

Constant Summary

Constants inherited from Service

Service::LinkOutFilterTask, Service::StandardTask

Instance Attribute Summary

Attributes inherited from Service

#group, #name, #priority, #request, #service_id, #status, #task, #url

Instance Method Summary

Methods included from UmlautHttp

#http_fetch, #proxy_like_headers

Methods included from MetadataHelper

#get_doi, #get_epage, #get_gpo_item_nums, #get_identifier, #get_isbn, #get_issn, #get_lccn, #get_month, #get_oclcnum, #get_pmid, #get_search_creator, #get_search_terms, #get_search_title, #get_spage, #get_sudoc, #get_top_level_creator, #get_year, #normalize_lccn, #normalize_title, #raw_search_title, title_is_serial?

Methods included from MarcHelper

#add_856_links, #edition_statement, #get_title, #get_years, #gmd_values, #service_type_for_856, #should_skip_856_link?, #strip_gmd

Methods inherited from Service

#credits, #display_name, #handle_wrapper, #link_out_filter, #preempted_by, required_config_params, #response_url, #translate

Constructor Details

#initialize(config) ⇒ Scopus

Returns a new instance of Scopus.



# File 'app/service_adaptors/scopus.rb', line 59

def initialize(config)
  # Defaults
  @display_name = "Scopus"
  @registered_referer = nil
  @scopus_search_base = 'http://www.scopus.com/scsearchapi/search.url'
  
  @include_abstract = true
  @include_cited_by = true
  @include_similar = true
  @include_more_by_authors = true
  @more_by_authors_type = "similar"

  @inward_cited_by_url = "http://www.scopus.com/scopus/inward/citedby.url"
  #@partner_id = "E5wmcMWC"
  @partner_id = 65
  @link_salt_key = nil
  @scopus_release = "R6.0.0"

  # Scopus offers two algorithms for finding similar items.
  # This variable can be:
  # "key" => keyword-based similarity
  # "ref" => reference-based similarity (cites similar refs?). Seems to offer 0 hits quite often, so we use keyword instead.
  # "aut" => author. More docs by the same authors. Usually incorporated as a separate link.
  @more_like_this_type = "key"
  @inward_more_like_url = "http://www.scopus.com/scopus/inward/mlt.url"
  
  @credits = {
    @display_name => "http://www.scopus.com/home.url"
  }
  
  super(config)
end

Instance Method Details

#add_abstract(first_hit, request) ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 240

def add_abstract(first_hit, request)

  return if first_hit["abstract"].blank?
  
  request.add_service_response( 
    :service=>self, 
    :display_text => "Abstract from #{@display_name}", 
    :content => sanitize(first_hit["abstract"]), 
    :content_html_safe => true,
    :url => detail_url(first_hit), 
    :service_type_value => :abstract)
end

#add_cited_by_response(result, request) ⇒ Object

Input is a Ruby hash that came from the Scopus JSON, representing a single hit. We’re going to add this as a result.



# File 'app/service_adaptors/scopus.rb', line 219

def add_cited_by_response(result, request)
  # While Scopus provides an "inwardurl" in the results, this just takes
  # us to the record detail page. We actually want to go RIGHT to the
  # list of cited-by items. So we create our own, based on Scopus's
  # reverse-engineered, predictable URLs.

  count = result["citedbycount"]
  label = ServiceTypeValue[:cited_by].display_name_pluralize.downcase.capitalize
  # The JSON value may be a String, so compare numerically.
  if count.to_i == 1
    label = ServiceTypeValue[:cited_by].display_name.downcase.capitalize
  end
  cited_by_url = cited_by_url( result )
  
  request.add_service_response(:service=>self, 
    :display_text => "#{count} #{label}", 
    :count=> count, 
    :url => cited_by_url, 
    :service_type_value => :cited_by)

end

#check_for_hits(url) ⇒ Object

NOT currently working. Scopus doesn’t make this easy. Takes a scopus direct url for which we’re not sure if there will be results or not, and requests it and html screen-scrapes to get hit count. (We can conveniently find this just in the html <title> at least). Works for cited_by and more_like_this searches at present. May break if Scopus changes their html title!



# File 'app/service_adaptors/scopus.rb', line 283

def check_for_hits(url)

  response = http_fetch(url).body

  response_html = Nokogiri::HTML(response)

  title = response_html.at('title').inner_text
  # title is "X documents" (or 'Documents') if there are hits.
  # It's annoyingly "Search Error" if there are either 0 hits, or
  # if there was an actual error. So we can't easily log actual
  # errors, sorry.
  title.downcase =~ /^\s*(\d+)?\s+document/
  if ( hits = $1)
    return hits.to_i
  else
    return 0
  end    
end
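
The title-parsing step can be tried in isolation, without fetching anything from Scopus. The sketch below is a minimal standalone version; hits_from_title is a hypothetical helper name, and the sample titles are made up to match the formats described in the comments above.

```ruby
# Extract a hit count from a page title such as "12 documents".
# Returns 0 when the title carries no leading number, for example
# "Search Error", which covers both zero hits and real errors.
def hits_from_title(title)
  match = title.to_s.downcase.match(/^\s*(\d+)\s+document/)
  match ? match[1].to_i : 0
end

with_hits = hits_from_title("12 Documents")
no_hits   = hits_from_title("Search Error")
```

Because zero hits and genuine errors share the same title, a caller of such a helper cannot tell the two apart.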

#cited_by_url(result) ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 262

def cited_by_url(result)
  eid = CGI.escape(result["eid"])    
  #return "#{@scopus_cited_by_base}?eid=#{eid}&src=s&origin=recordpage"
  # Use the new scopus direct link format!
  return "#{@inward_cited_by_url}?partnerID=#{@partner_id}&rel=#{@scopus_release}&eid=#{eid}"
end
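
To see what the generated link looks like, here is a minimal standalone sketch of the same URL construction. build_cited_by_url is a hypothetical name, its defaults are copied from the constructor listing, and the EID is a made-up value rather than a real Scopus record.

```ruby
require "cgi"

# Hypothetical standalone link builder. The EID is URL-escaped because it
# is placed directly into the query string.
def build_cited_by_url(eid, partner_id: 65, release: "R6.0.0",
                       base: "http://www.scopus.com/scopus/inward/citedby.url")
  "#{base}?partnerID=#{partner_id}&rel=#{release}&eid=#{CGI.escape(eid)}"
end

url = build_cited_by_url("2-s2.0-0000000000")
```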

#detail_url(hash) ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 253

def detail_url(hash)
  url = hash["inwardurl"]
  # For some reason, ampersands in the query string have wound up
  # double-escaped and need to be fixed.
  url = url.gsub(/\&amp\;/, '&')

  return url
end

#handle(request) ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 92

def handle(request)
  scopus_search = scopus_search(request)

  # We can't make a good query; never mind.
  return request.dispatched(self, true) if scopus_search.blank? 

  
  # The default fields returned don't include the eid (Scopus unique id) that we need, so we'll supply our own exhaustive list of &fields=
  url = 
  "#{@scopus_search_base}?devId=#{@json_api_key}&search=#{scopus_search}&callback=findit_callback&fields=title,doctype,citedbycount,inwardurl,sourcetitle,issn,vol,issue,page,pubdate,eid,scp,doi,firstAuth,authlist,affiliations,abstract";
  
  # Make the call.
  headers = {}
  headers["Referer"] = @registered_referer if @registered_referer 

  response = open(url, headers).read    
  
  # Okay, Scopus insists on using a JSONP technique to embed the JSON array in
  # a procedure call. We don't want that, so take the actual content out of its
  # JSONP wrapper.
  response =~ /^\w*findit_callback\((.*)\);?$/
  response = $1;
  
  # Take the first hit from Scopus's results, and hope they relevancy-ranked it
  # well. For DOI/PMID search, there should ordinarily be only one hit!
  results = MultiJson.load(response)

  if ( results["ERROR"])
    Rails.logger.error("Error from Scopus API: #{results["ERROR"].inspect}   openurl: ?#{request.referent.to_context_object.kev}")
    return request.dispatched(self, false)
  end

  # For reasons not clear to me, the JSON data structures vary.
  first_hit = nil
  if ( results["PartOK"])
    first_hit = results["PartOK"]["Results"][0]
  elsif ( results["OK"] )
    first_hit = results["OK"]["Results"][0]
  else
    # error. 
  end

  if ( first_hit )
  
    if (@include_cited_by && first_hit["citedbycount"].to_i > 0)
      add_cited_by_response(first_hit, request)
    end

    if (@include_abstract && first_hit["abstract"])
      add_abstract(first_hit, request)
    end

    if (@include_similar)
      url = more_like_this_url(first_hit)
      # Pre-checking for actual hits not currently working, disabled.
      if (true || ( hits = check_for_hits(url) ) > 0 )
        request.add_service_response( 
          :service=>self, 
          :display_text => "#{hits} #{ServiceTypeValue[:similar].display_name_pluralize.downcase.capitalize}", 
          :url => url, 
          :service_type_value => :similar)          
      end                
    end

    if ( @include_more_by_authors)
      url = more_like_this_url(first_hit, :type => "aut")
      # Pre-checking for actual hits not currently working, disabled. 
      if (true || ( hits = check_for_hits(url) ) > 0 )
        request.add_service_response( 
          :service=>self, 
          :display_text => "#{hits} More from these authors", 
          :url => url, 
          :service_type_value => :similar)          
      end        
    end

  end

  return request.dispatched(self, true)
end
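
The JSONP unwrapping and first-hit selection in #handle can be exercised without a network call. The sketch below uses Ruby's standard json library in place of MultiJson, a hypothetical helper name (unwrap_jsonp), and a made-up response body shaped like the "OK" case handled above; it is an illustration, not the adapter's own code.

```ruby
require "json"

# Strip a JSONP wrapper such as `findit_callback({...});` and parse the payload.
# Returns nil when the text is not wrapped in the expected callback.
def unwrap_jsonp(text, callback = "findit_callback")
  match = text.match(/\A\s*#{Regexp.escape(callback)}\((.*)\);?\s*\z/m)
  match && JSON.parse(match[1])
end

# Made-up response body; real responses carry many more fields.
sample = 'findit_callback({"OK":{"Results":[{"eid":"2-s2.0-0000000000","citedbycount":"3"}]}});'

results   = unwrap_jsonp(sample)
container = results["PartOK"] || results["OK"]   # the structure varies between responses
first_hit = container && container["Results"].first
```

Returning nil for an unwrapped body lets a caller treat a malformed response as "no result" instead of raising inside the JSON parser.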

#more_like_this_url(result, options = {}) ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 270

def more_like_this_url(result, options = {})
  options[:type] ||= @more_like_this_type
  
  eid = CGI.escape(result["eid"])
  return "#{@inward_more_like_url}?partnerID=#{@partner_id}&rel=#{@scopus_release}&eid=#{eid}&mltType=#{options[:type]}"
end

#phrase(str) ⇒ Object

Backslash-escapes any double quotes and wraps the string in Scopus phrase-search double quotes. Does NOT URI-escape.



# File 'app/service_adaptors/scopus.rb', line 213

def phrase(str)
  '"' + str.gsub('"', '\\"') + '"'
end
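
A short usage sketch may help show why the URI-escaping is left to the caller. The helper below restates the quoting logic using gsub's block form, so the replacement text is taken literally; the DOI is only an example value.

```ruby
require "cgi"

# Escape embedded double quotes with a backslash, then wrap the whole
# string in double quotes.
def phrase(str)
  '"' + str.gsub('"') { '\\"' } + '"'
end

quoted  = phrase('say "hi"')              # quotes inside the phrase are backslash-escaped
query   = "DOI(#{phrase('10.1000/182')})" # still a plain, unescaped query string
escaped = CGI.escape(query)               # escaping happens once, on the finished query
```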

#scopus_search(request) ⇒ Object

Builds a Scopus advanced search query intended to find the exact known item identified by this citation.

Tries DOI, PMID, or ISBN first, if available. Otherwise uses ISSN (or, failing that, journal title) together with volume, issue, and start page, when all of those are available. Returns nil if no adequate query can be built.



# File 'app/service_adaptors/scopus.rb', line 179

def scopus_search(request)
  
  if (doi = get_doi(request.referent))
    return CGI.escape( "DOI(#{phrase(doi)})" )
  elsif (pmid = get_pmid(request.referent))
    return CGI.escape( "PMID(#{phrase(pmid)})" )
  elsif (isbn = get_isbn(request.referent))
    # I don't think scopus has a lot of ISBN-holding citations, but
    # it allows search so we might as well try. 
    return CGI.escape( "ISBN(#{phrase(isbn)})" )
  else            
    # Okay, we're going to try to do it on issn/vol/issue/page.
    # If we don't have issn, we'll reluctantly use journal title
    # (damn you google scholar).
    metadata = request.referent.metadata
    issn = request.referent.issn
    if ( (issn || ! metadata['jtitle'].blank? ) &&
         ! metadata['volume'].blank? &&
         ! metadata['issue'].blank? &&
         ! metadata['spage'].blank? )
      query = "VOLUME(#{phrase(metadata['volume'])}) AND ISSUE(#{phrase(metadata['issue'])}) AND PAGEFIRST(#{phrase(metadata['spage'])}) "
      if ( issn )
        query += " AND (ISSN(#{phrase(issn)}) OR EISSN(#{phrase(issn)}))"
      else
        query += " AND EXACTSRCTITLE(#{phrase(metadata['jtitle'])})"
      end
      end
      return CGI.escape(query)
    end
    
  end
end
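
The fallback order can be sketched against a plain Hash, which makes the resulting query strings easy to inspect. Everything below is illustrative: build_query and quote are hypothetical names, the Hash keys stand in for the referent's fields, and the ISSN is a made-up value.

```ruby
require "cgi"

# Quote a value for a Scopus phrase search (same idea as #phrase).
def quote(str)
  '"' + str.to_s.gsub('"') { '\\"' } + '"'
end

# Build an unescaped advanced-search query from a Hash of citation fields.
# Returns nil when there is not enough information for a reliable match.
def build_query(citation)
  return "DOI(#{quote(citation['doi'])})"   if citation["doi"]
  return "PMID(#{quote(citation['pmid'])})" if citation["pmid"]
  return "ISBN(#{quote(citation['isbn'])})" if citation["isbn"]

  source = citation["issn"] || citation["jtitle"]
  needed = [source, citation["volume"], citation["issue"], citation["spage"]]
  return nil if needed.any? { |v| v.nil? || v.to_s.strip.empty? }

  query = "VOLUME(#{quote(citation['volume'])}) AND ISSUE(#{quote(citation['issue'])}) " \
          "AND PAGEFIRST(#{quote(citation['spage'])})"
  if citation["issn"]
    query + " AND (ISSN(#{quote(citation['issn'])}) OR EISSN(#{quote(citation['issn'])}))"
  else
    query + " AND EXACTSRCTITLE(#{quote(citation['jtitle'])})"
  end
end

query   = build_query({ "issn" => "1234-5678", "volume" => "12", "issue" => "3", "spage" => "45" })
escaped = CGI.escape(query)
```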

#service_types_generated ⇒ Object



# File 'app/service_adaptors/scopus.rb', line 48

def service_types_generated
  types = []
  types.push( ServiceTypeValue[:abstract] ) if @include_abstract
  types.push( ServiceTypeValue[:cited_by] ) if @include_cited_by
  types.push( ServiceTypeValue[:similar] ) if @include_similar
  types.push( ServiceTypeValue[@more_by_authors_type] ) if @include_more_by_authors

  return types
end