Class: Krikri::Enrichments::LanguageToLexvo
- Inherits:
- Object
- Includes:
- Audumbla::FieldEnrichment
- Defined in:
- lib/krikri/enrichments/language_to_lexvo.rb
Overview
Converts text fields and/or providedLabels to ISO 639-3 URIs (Lexvo)
Transforms text values matching either (English) labels or language codes from Lexvo into DPLA::MAP::Controlled::Language resources with skos:exactMatch of the appropriate Lexvo URIs. Original string values are retained as dpla:providedLabel.
Currently supports language codes in ISO 639-3, but may be extended to match other two- and three-letter codes in Lexvo with ISO 639-3 URIs.
If no matches are found, returns a bnode with the input value as providedLabel.
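The matching order described above (language code first, then English label, then a fallback node that keeps only the provided label) can be sketched in plain Ruby. This is an illustrative sketch, not the library's implementation: `LANGUAGES`, `LEXVO_BASE`, `LanguageSketch`, and `enrich_language` are hypothetical stand-ins for the Lexvo vocabulary and the `DPLA::MAP::Controlled::Language` resource, and the three-entry table is an assumed sample.

```ruby
# Hypothetical stand-in for the Lexvo ISO 639-3 vocabulary: code => English labels.
LANGUAGES = {
  eng: ['English'],
  fra: ['French'],
  deu: ['German']
}.freeze

LEXVO_BASE = 'http://lexvo.org/id/iso639-3/'

# Minimal stand-in for a language resource: original value, matched URI, label.
LanguageSketch = Struct.new(:provided_label, :exact_match, :pref_label)

# Try a code match first, then a label match. If neither succeeds,
# return a node that carries only the original value.
def enrich_language(value)
  text = value.to_s
  code = LANGUAGES.keys.find { |c| c == text.downcase.to_sym }
  code ||= LANGUAGES.keys.find do |c|
    LANGUAGES[c].map(&:downcase).include?(text.downcase)
  end
  return LanguageSketch.new(value, nil, nil) if code.nil?

  LanguageSketch.new(value, LEXVO_BASE + code.to_s, LANGUAGES[code].last)
end

enrich_language('ENG').exact_match     # => "http://lexvo.org/id/iso639-3/eng"
enrich_language('french').pref_label   # => "French"
enrich_language('Klingon').exact_match # => nil
```

In every case the original input is retained as the provided label, mirroring the behavior described for `dpla:providedLabel`.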
If passed an ActiveTriples::Resource, the enrichment will:
- Perform the above text matching on any present `providedLabel`s,
returning the original node if no results are found. If multiple
values are provided and multiple matches found, they will be
deduplicated.
- Leave DPLA::MAP::Controlled::Language objects that are not bnodes
unaltered.
- Remove any values which are not either bnodes or members of
DPLA::MAP::Controlled::Language.
Label matches are cached within the enrichment instance.
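The handling of the different input kinds listed above can be summarized as a dispatch. The sketch below only names the action taken for each kind of value; `ResourceSketch`, `ControlledLanguageSketch`, and `dispatch` are hypothetical stand-ins for `ActiveTriples::Resource`, `DPLA::MAP::Controlled::Language`, and the enrichment's entry point, modelling only the predicates the dispatch needs.

```ruby
# Hypothetical stand-in for ActiveTriples::Resource.
class ResourceSketch
  attr_reader :provided_labels

  def initialize(blank:, provided_labels: [])
    @blank = blank
    @provided_labels = provided_labels
  end

  # True when the resource is a blank node (bnode).
  def node?
    @blank
  end
end

# Hypothetical stand-in for DPLA::MAP::Controlled::Language.
class ControlledLanguageSketch < ResourceSketch; end

# Returns a symbol naming the action for each kind of input.
def dispatch(value)
  if value.is_a?(ResourceSketch) && value.node?
    return value.provided_labels.empty? ? :return_original : :enrich_labels
  end
  return :keep_unaltered if value.is_a?(ControlledLanguageSketch)
  return :remove if value.is_a?(ResourceSketch)

  :enrich_literal
end

dispatch(ResourceSketch.new(blank: true, provided_labels: ['eng'])) # => :enrich_labels
dispatch(ControlledLanguageSketch.new(blank: false))                # => :keep_unaltered
dispatch(ResourceSketch.new(blank: false))                          # => :remove
dispatch('English')                                                 # => :enrich_literal
```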
Constant Summary
- TERMS =
RDF::ISO_639_3.to_a
- QNAMES =
TERMS.map { |t| t.qname[1] }.freeze
Instance Method Summary
-
#enrich_literal(label) ⇒ ActiveTriples::Resource
Runs the enrichment over a string.
-
#enrich_node(value) ⇒ Array<ActiveTriples::Resource>, ActiveTriples::Resource
Runs the enrichment over a specific node, accepting an `ActiveTriples::Resource` with a provided label and returning a new node with a Lexvo match.
-
#enrich_value(value) ⇒ DPLA::MAP::Controlled::Language?
Runs the enrichment against a node.
-
#match_iso(code) ⇒ DPLA::MAP::Controlled::Language
Converts a string or symbol for a three-letter language code to an `ActiveTriples::Resource`.
-
#match_label(label) ⇒ DPLA::MAP::Controlled::Language
Converts a string or symbol for a language label to an `ActiveTriples::Resource`.
Instance Method Details
#enrich_literal(label) ⇒ ActiveTriples::Resource
Runs the enrichment over a string.
# File 'lib/krikri/enrichments/language_to_lexvo.rb', line 106

def enrich_literal(label)
  node = DPLA::MAP::Controlled::Language.new()
  node.providedLabel = label

  match = match_iso(label.to_s)
  match = match_label(label.to_s) if match.node?

  # if match is still a node, we didn't find anything
  return node if match.node?

  node.exactMatch = match
  node.prefLabel = RDF::ISO_639_3[match.rdf_subject.qname[1]].label.last
  node
end
#enrich_node(value) ⇒ Array<ActiveTriples::Resource>, ActiveTriples::Resource
Runs the enrichment over a specific node, accepting an `ActiveTriples::Resource` with a provided label and returning a new node with a Lexvo match.
# File 'lib/krikri/enrichments/language_to_lexvo.rb', line 93

def enrich_node(value)
  labels = value.get_values(RDF::DPLA.providedLabel)
  return value if labels.empty?
  labels.map { |label| enrich_literal(label) }
end
#enrich_value(value) ⇒ DPLA::MAP::Controlled::Language?
Runs the enrichment against a node. Can match literal values, and Language values with a provided label.
# File 'lib/krikri/enrichments/language_to_lexvo.rb', line 77

def enrich_value(value)
  return enrich_node(value) if value.is_a?(ActiveTriples::Resource) &&
    value.node?
  return value if value.is_a?(DPLA::MAP::Controlled::Language)
  return nil if value.is_a?(ActiveTriples::Resource)
  enrich_literal(value)
end
#match_iso(code) ⇒ DPLA::MAP::Controlled::Language
Converts a string or symbol for a three-letter language code to an `ActiveTriples::Resource`.
# File 'lib/krikri/enrichments/language_to_lexvo.rb', line 127

def match_iso(code)
  match = QNAMES.find { |c| c == code.downcase.to_sym }
  from_sym(match)
end
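The lookup in `match_iso` is case-insensitive and accepts either a string or a symbol, because `downcase` and `to_sym` are defined on both. A minimal sketch of just that comparison, assuming a hypothetical three-entry `CODES_SKETCH` list in place of `QNAMES` and a hypothetical `find_code` helper that omits the `from_sym` conversion:

```ruby
# Hypothetical subset of the ISO 639-3 codes held in QNAMES.
CODES_SKETCH = %i[eng fra deu].freeze

# Normalizes the input to a lowercase symbol and looks it up.
def find_code(code)
  CODES_SKETCH.find { |c| c == code.downcase.to_sym }
end

find_code('ENG') # => :eng
find_code(:Fra)  # => :fra
find_code('en')  # => nil
```

The last call returns `nil` because only three-letter ISO 639-3 codes are in the list, so a two-letter code finds no match.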
#match_label(label) ⇒ DPLA::MAP::Controlled::Language
Converts a string or symbol for a language label to an `ActiveTriples::Resource`.
Matched values are cached in an instance variable `@lang_cache` to avoid multiple traversals through the vocabulary term labels.
# File 'lib/krikri/enrichments/language_to_lexvo.rb', line 141

def match_label(label)
  @lang_cache ||= {}
  return @lang_cache[label] if @lang_cache.keys.include? label

  match = TERMS.find do |t|
    Array(t.label).map(&:downcase).include? label.downcase
  end

  # Caches and returns the label match
  @lang_cache[label] = from_sym(match)
end
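The effect of the per-instance cache can be shown with a small counter. This is a sketch under stated assumptions: `LabelMatcherSketch`, its `LABELS` table, and the `traversals` counter are hypothetical, and misses are stored as `nil` here, whereas the original stores whatever `from_sym` returns.

```ruby
class LabelMatcherSketch
  # Hypothetical vocabulary: lowercase label => code.
  LABELS = { 'english' => :eng, 'french' => :fra }.freeze

  # Number of times the vocabulary was actually searched.
  attr_reader :traversals

  def initialize
    @cache = {}
    @traversals = 0
  end

  def match_label(label)
    # Checking for the key (not the value) means cached misses are reused too.
    return @cache[label] if @cache.key?(label)

    @traversals += 1
    @cache[label] = LABELS[label.downcase]
  end
end

matcher = LabelMatcherSketch.new
2.times { matcher.match_label('English') }
2.times { matcher.match_label('Klingon') }
matcher.traversals # => 2
```

As in the original, the cache is keyed by the label exactly as passed in, so `'English'` and `'english'` occupy separate entries even though they match the same term.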