Class: Krikri::Harvesters::OAIHarvester
- Inherits: Object
- Includes: Krikri::Harvester
- Defined in: lib/krikri/harvesters/oai_harvester.rb
Overview
A harvester implementation for OAI-PMH
Constant Summary
Constants included from Krikri::Harvester
Constants included from SoftwareAgent
Instance Attribute Summary
-
#client ⇒ Object
Returns the value of attribute client.
Attributes included from Krikri::Harvester
Class Method Summary
-
.expected_opts ⇒ Object
Instance Method Summary
-
#count ⇒ Object
Not implemented; raises NotImplementedError. Calling count on record_ids would request all ids and load them into memory. TODO: an efficient implementation of count for OAI.
-
#get_record(identifier, opts = {}) ⇒ Object
TODO: normalize records; there will be differences in XML for different requests.
-
#initialize(opts = {}) ⇒ OAIHarvester
constructor
A new instance of OAIHarvester.
-
#record_ids(opts = {}) ⇒ Object
Sends ListIdentifiers requests lazily.
-
#records(opts = {}) ⇒ Object
Sends ListRecords requests lazily.
Methods included from Krikri::Harvester
Methods included from SoftwareAgent
Constructor Details
#initialize(opts = {}) ⇒ OAIHarvester
Returns a new instance of OAIHarvester.
# File 'lib/krikri/harvesters/oai_harvester.rb', line 14

def initialize(opts = {})
  super
  @opts = opts.fetch(:oai, {})

  http_conn = Faraday.new do |conn|
    conn.request :retry, :max => 3
    conn.response :follow_redirects, :limit => 5
    conn.adapter :net_http
  end

  @client = OAI::Client.new(uri, :http => http_conn)
end
Instance Attribute Details
#client ⇒ Object
Returns the value of attribute client.
# File 'lib/krikri/harvesters/oai_harvester.rb', line 6

def client
  @client
end
Class Method Details
.expected_opts ⇒ Object
# File 'lib/krikri/harvesters/oai_harvester.rb', line 74

def self.expected_opts
  {
    key: :oai,
    opts: {
      set: {type: :string, required: false, multiple_ok: true},
      metadata_prefix: {type: :string, required: true}
    }
  }
end
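The spec returned by expected_opts reads as a small schema: options live under the :oai key, metadata_prefix is required, and set is optional and may hold several values. As a rough illustration of how such a spec could be checked against a caller's options, here is a minimal sketch. The helper validation_errors and the SPEC constant are hypothetical and are not part of Krikri; how Krikri itself consumes expected_opts is not shown in this page.

```ruby
# Copy of the hash returned by OAIHarvester.expected_opts.
SPEC = {
  key: :oai,
  opts: {
    set: { type: :string, required: false, multiple_ok: true },
    metadata_prefix: { type: :string, required: true }
  }
}.freeze

# Hypothetical helper (not a Krikri API): returns a list of human-readable
# problems found when checking harvester options against a spec.
def validation_errors(spec, harvester_opts)
  given = harvester_opts.fetch(spec[:key], {})
  errors = []

  spec[:opts].each do |name, rule|
    unless given.key?(name)
      errors << "#{name} is required" if rule[:required]
      next
    end

    value = given[name]
    if value.is_a?(Array) && !rule[:multiple_ok]
      errors << "#{name} does not accept multiple values"
      next
    end

    # Only the :string type appears in the spec, so only that type is handled.
    unless Array(value).all? { |v| v.is_a?(String) }
      errors << "#{name} must be a string"
    end
  end

  errors
end

puts validation_errors(SPEC, { oai: { metadata_prefix: 'oai_dc', set: %w[a b] } }).inspect
puts validation_errors(SPEC, { oai: { set: 'a' } }).inspect
```

The first call reports no problems; the second reports that metadata_prefix is missing.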
Instance Method Details
#count ⇒ Object
Not implemented; raises NotImplementedError. Calling count on record_ids would request all ids and load them into memory.
TODO: an efficient implementation of count for OAI
# File 'lib/krikri/harvesters/oai_harvester.rb', line 42

def count
  raise NotImplementedError
end
#get_record(identifier, opts = {}) ⇒ Object
TODO: normalize records; there will be differences in XML for different requests
# File 'lib/krikri/harvesters/oai_harvester.rb', line 65

def get_record(identifier, opts = {})
  opts[:identifier] = identifier
  opts = opts.merge(@opts)
  @record_class.build(mint_id(identifier),
                      record_xml(client.get_record(opts).record))
end
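One detail of get_record that is easy to miss is the direction of the merge: opts.merge(@opts) lets the harvester-level configuration override anything passed to the individual call. The sketch below reproduces only that hash handling with plain Ruby; the option values and the identifier are made up for illustration.

```ruby
# Stands in for @opts, the harvester-level :oai configuration.
configured = { metadata_prefix: 'mods', set: 'photos' }

# Options passed to a single call; the values here are invented.
per_call = { metadata_prefix: 'oai_dc' }

# Mirrors `opts[:identifier] = identifier`, which writes into the hash
# the caller passed in.
per_call[:identifier] = 'oai:example.org:1'

# Same merge direction as get_record: on conflict, `configured` wins.
merged = per_call.merge(configured)

puts merged[:metadata_prefix]
puts merged[:identifier]
puts merged[:set]
```

Here metadata_prefix ends up as the configured 'mods', not the per-call 'oai_dc'. The same merge is used by record_ids and records.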
#record_ids(opts = {}) ⇒ Object
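Sends ListIdentifiers requests lazily.
The following will only send requests to the endpoint until it has 1000 record ids:
record_ids.take(1000)
Note that take on a lazy enumerator returns another lazy enumerator; call to_a or first(1000) to obtain the ids.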
# File 'lib/krikri/harvesters/oai_harvester.rb', line 35

def record_ids(opts = {})
  opts = opts.merge(@opts)
  client.list_identifiers(opts).full.lazy.flat_map(&:identifier)
end
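To see why the lazy.flat_map chain stops requesting pages early, it can help to run the same pattern against a stand-in for the endpoint. The sketch below uses only the Ruby standard library. FakeEndpoint is a hypothetical class that counts simulated requests; it is not part of Krikri or the oai gem, and real paging goes through OAI-PMH resumption tokens.

```ruby
# Hypothetical stand-in for a paged endpoint: every page yielded counts as
# one simulated request.
class FakeEndpoint
  attr_reader :requests

  def initialize(pages:, page_size:)
    @pages = pages
    @page_size = page_size
    @requests = 0
  end

  # Yields one page (an Array of identifier strings) per simulated request.
  def each_page
    return enum_for(:each_page) unless block_given?

    @pages.times do |page|
      @requests += 1
      yield Array.new(@page_size) { |i| "oai:example.org:#{page * @page_size + i}" }
    end
  end
end

endpoint = FakeEndpoint.new(pages: 50, page_size: 100)

# Same shape as record_ids: a lazy enumerator flattened to single ids.
ids = endpoint.each_page.lazy.flat_map { |page| page }

# first(1000) forces the lazy chain and stops after the 1000th id.
first_ids = ids.first(1000)

puts first_ids.size     # 1000
puts endpoint.requests  # 10 of the 50 available pages were fetched
```

An eager call such as ids.to_a or ids.count would instead walk all 50 pages, which is the cost the #count documentation warns about.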
#records(opts = {}) ⇒ Object
Sends ListRecords requests lazily.
The following will only send requests to the endpoint until it has 1000 records:
records.take(1000)
# File 'lib/krikri/harvesters/oai_harvester.rb', line 54

def records(opts = {})
  opts = opts.merge(@opts)
  client.list_records(opts).full.lazy.flat_map do |rec|
    @record_class.build(mint_id(rec.header.identifier),
                        record_xml(rec))
  end
end