Purée
Metadata extraction from the Pure Research Information System.
Installation
Add this line to your application's Gemfile:
gem 'puree'
And then execute:
$ bundle
Or install it yourself as:
$ gem install puree
Configuration
# For Extractor and REST modules.
config = {
url: 'https://YOUR_HOST/ws/api/59',
username: 'YOUR_USERNAME',
password: 'YOUR_PASSWORD',
api_key: 'YOUR_API_KEY'
}
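To keep credentials out of source control, the same hash could be built from environment variables. This is a minimal sketch, not part of the gem: the helper name `puree_config_from_env` and the variable names (`PURE_URL`, `PURE_USERNAME`, `PURE_PASSWORD`, `PURE_API_KEY`) are assumptions made for illustration.

```ruby
# Hypothetical helper (not provided by Purée): build the config hash
# from environment variables. The variable names are assumptions.
# `fetch` raises KeyError when a variable is missing, so a
# misconfigured environment fails early.
def puree_config_from_env(env = ENV)
  {
    url:      env.fetch('PURE_URL'),
    username: env.fetch('PURE_USERNAME'),
    password: env.fetch('PURE_PASSWORD'),
    api_key:  env.fetch('PURE_API_KEY')
  }
end
```

Calling `config = puree_config_from_env` would then yield a hash with the same keys as the one above.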
Extractor module
Find a resource by identifier and get Ruby objects.
# Configure an extractor
extractor = Puree::Extractor::Dataset.new config
# Fetch the metadata for a resource with a particular identifier
dataset = extractor.find 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
#=> #<Puree::Model::Dataset:0x00c0ffee>
# Access specific metadata e.g. an internal person's name
dataset.persons_internal[0].name
#=> #<Puree::Model::PersonName:0x00c0ffee @first="Foo", @last="Bar">
# Select a formatting style for a person's name
dataset.persons_internal[0].name.last_initial
#=> "Bar, F."
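To illustrate what a formatting style does, here is a self-contained stand-in for a name object. `DemoName` is hypothetical and is not the gem's `Puree::Model::PersonName`. It only mimics the `last_initial` style shown above and adds a `first_last` style purely for comparison.

```ruby
# Stand-in for a person's name, for illustration only.
# This is NOT Puree::Model::PersonName.
DemoName = Struct.new(:first, :last) do
  # Last name, then the first initial, e.g. "Bar, F."
  def last_initial
    "#{last}, #{first[0]}."
  end

  # First name followed by last name, e.g. "Foo Bar"
  def first_last
    "#{first} #{last}"
  end
end

name = DemoName.new('Foo', 'Bar')
name.last_initial
#=> "Bar, F."
name.first_last
#=> "Foo Bar"
```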
XMLExtractor module
Get Ruby objects from Pure XML.
Single resource
xml = '<project> ... </project>'
# Configure an XML extractor
xml_extractor = Puree::XMLExtractor::Project.new xml
# Get a single piece of metadata
xml_extractor.title
#=> "An interesting project title"
# Get all the metadata together
xml_extractor.model
#=> #<Puree::Model::Project:0x00c0ffee>
Homogeneous resource collection
xml = '<result>
<dataSet> ... </dataSet>
<dataSet> ... </dataSet>
...
</result>'
# Get an array of datasets
Puree::XMLExtractor::Collection.datasets xml
#=> [#<Puree::Model::Dataset:0x00c0ffee>, ...]
Heterogeneous resource collection
xml = '<result>
<contributionToJournal> ... </contributionToJournal>
<contributionToConference> ... </contributionToConference>
...
</result>'
# Get a hash of research outputs
Puree::XMLExtractor::Collection.research_outputs xml
#=> {
# journal_articles: [#<Puree::Model::JournalArticle:0x00c0ffee>, ...],
# conference_papers: [#<Puree::Model::ConferencePaper:0x00c0ffee>, ...],
# theses: [#<Puree::Model::Thesis:0x00c0ffee>, ...],
# other: [#<Puree::Model::ResearchOutput:0x00c0ffee>, ...]
# }
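A hash shaped like the one above can be summarised with plain Ruby. The sketch below uses a stand-in hash so that it runs without a Pure server. `DemoOutput` and its `title` attribute are assumptions for illustration, and the real values would be `Puree::Model` objects.

```ruby
# Stand-in for the hash returned by Collection.research_outputs.
DemoOutput = Struct.new(:title)

research_outputs = {
  journal_articles:  [DemoOutput.new('A'), DemoOutput.new('B')],
  conference_papers: [DemoOutput.new('C')],
  theses:            [],
  other:             []
}

# Count the research outputs in each category
counts = research_outputs.transform_values(&:size)
#=> {journal_articles: 2, conference_papers: 1, theses: 0, other: 0}

# Collect every title into one flat list, regardless of category
titles = research_outputs.values.flat_map { |list| list.map(&:title) }
#=> ["A", "B", "C"]
```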
REST module
Query the Pure REST API.
Client
# Configure a client
client = Puree::REST::Client.new config
# Find a person
client.persons.find id: 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
#=> #<HTTP::Response:0x00c0ffee>
# Find a person, limit the metadata to ORCID and employee start date
client.persons.find id: 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx',
params: {fields: ['orcid', 'employeeStartDate']}
#=> #<HTTP::Response:0x00c0ffee>
# Find five people and request the response body as JSON
client.persons.all params: {size: 5}, accept: :json
#=> #<HTTP::Response:0x00c0ffee>
# Find research outputs for a person
client.persons.research_outputs id: 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
#=> #<HTTP::Response:0x00c0ffee>
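The `size` parameter limits how many records one request returns, so fetching a larger set means requesting it page by page. The sketch below shows one way to drive such a loop. It assumes the Pure API accepts an `offset` parameter alongside `size`, which this README does not confirm, and `each_page` and `fetch_page` are hypothetical names. A lambda over an in-memory array stands in for the real request.

```ruby
# Hypothetical pager: calls fetch_page.(offset, size) repeatedly and
# yields each non-empty page. Stops at the first page that has fewer
# than `size` records.
def each_page(size:, fetch_page:)
  offset = 0
  loop do
    page = fetch_page.call(offset, size)
    yield page unless page.empty?
    break if page.size < size
    offset += size
  end
end

# Stand-in data source. In real use the lambda might wrap a call such as
# client.persons.all params: {size: size, offset: offset} (assumption).
records = (1..12).to_a
fetch = ->(offset, size) { records[offset, size] || [] }

pages = []
each_page(size: 5, fetch_page: fetch) { |page| pages << page }
pages
#=> [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12]]
```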
Resource
# Configure a resource
persons = Puree::REST::Person.new config
# Find a person
persons.find id: 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
#=> #<HTTP::Response:0x00c0ffee>
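Because the REST module returns the raw HTTP response, it is worth checking the status before using the body. The helper below is a minimal sketch: `body_or_raise` is a hypothetical name, and it relies only on the response object answering `code` (an integer status) and `to_s` (the body). `FakeResponse` is a stand-in so the example runs without a network connection.

```ruby
# Hypothetical helper: return the body of a successful response,
# raise an error for any non-2xx status.
def body_or_raise(response)
  code = response.code
  raise "Pure request failed with HTTP #{code}" unless (200..299).cover?(code)

  response.to_s
end

# Stand-in response object for illustration only
FakeResponse = Struct.new(:code, :body) do
  def to_s
    body
  end
end

body_or_raise FakeResponse.new(200, '<result> ... </result>')
#=> "<result> ... </result>"
```

With a real response, the returned string could be passed straight to an XML extractor.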
REST module with XMLExtractor module
Query the Pure REST API and get Ruby objects from Pure XML.
# Configure a client
client = Puree::REST::Client.new config
# Find projects for a person
response = client.persons.projects id: 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
# Extract metadata from XML
Puree::XMLExtractor::Collection.projects response.to_s
#=> [#<Puree::Model::Project:0x00c0ffee>, ...]