Class: Stanford::Mods::Record

Inherits: Mods::Record

Inheritance hierarchy: Object < Mods::Record < Stanford::Mods::Record

Defined in: lib/stanford-mods/geo_spatial.rb,
lib/stanford-mods.rb,
lib/stanford-mods/name.rb,
lib/stanford-mods/origin_info.rb,
lib/stanford-mods/searchworks.rb,
lib/stanford-mods/physical_location.rb,
lib/stanford-mods/searchworks_subjects.rb

Overview

Parses MODS //location/physicalLocation for series, box, and folder for Special Collections. This is not used by Searchworks; otherwise it would have been in the searchworks.rb file. Note: mods_ng_xml._location.physicalLocation should find both top level and relatedItem elements. Each method here expects to find at most ONE matching element; subsequent potential matches are ignored.

Constant Summary

COLLECTOR_ROLE_URI =

'http://id.loc.gov/vocabulary/relators/col'.freeze

GMLNS =

'http://www.opengis.net/gml/3.2/'.freeze

Instance Attribute Summary

#druid ⇒ Object
#logger ⇒ Object (also: #sw_logger)

Class Method Summary

.date_is_approximate?(date_element) ⇒ Boolean

NOTE: legal values for MODS date elements with attribute qualifier are ‘approximate’, ‘inferred’ or ‘questionable’.
.earliest_year_int(date_el_array) ⇒ Object

get earliest parseable year (as an Integer) from the passed date elements.
.earliest_year_str(date_el_array) ⇒ Object

get earliest parseable year (as a String) from the passed date elements.
.keyDate(elements) ⇒ Nokogiri::XML::Element, nil

given a set of date elements, return the single element with attribute keyDate=“yes” or return nil if no elements have attribute keyDate=“yes”, or if multiple elements have keyDate=“yes”.
.remove_approximate(nodeset) ⇒ Array<Nokogiri::XML::Element>

remove Elements from NodeSet if they have a qualifier attribute of ‘approximate’ or ‘questionable’.

Instance Method Summary

#additional_authors_w_dates ⇒ Object

All names, in display form, except the main_author. Names will be in the display_value_w_date form; see Mods::Record.name in nom_terminology for details on the display_value algorithm.
#box ⇒ String

The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.
#catkey ⇒ String

Value with the numeric catkey in it, or nil if none exists.
#collectors_w_dates ⇒ Object

Array of Strings, each containing the computed display value of a personal name with the role of Collector (see mods gem nom_terminology for display value algorithm).
#coordinates ⇒ Array{String}

Subject cartographic coordinates values.
#coordinates_as_bbox ⇒ Array{String} (also: #point_bbox)

With 4-part space-delimited strings, like “-16.0 -15.0 28.0 13.0”.
#coordinates_as_envelope ⇒ Array{String}

Values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”.
#coordinates_objects ⇒ Array{Stanford::Mods::Coordinate}

Valid coordinates as objects.
#date_created_elements(ignore_approximate = false) ⇒ Array<Nokogiri::XML::Element>

return /originInfo/dateCreated elements in MODS records.
#date_issued_elements(ignore_approximate = false) ⇒ Array<Nokogiri::XML::Element>

return /originInfo/dateIssued elements in MODS records.
#era_facet ⇒ Array<String>

subject/temporal values with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#first_title_info_node ⇒ Nokogiri::XML::Node

The first titleInfo node if present, else nil.
#folder ⇒ String

The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.
#format ⇒ Array[String] deprecated

Deprecated. Kept for backwards compatibility but not part of SW UI redesign work Summer 2014.
#format_main ⇒ Array[String]

Select one or more format values from the controlled vocabulary per JVine Summer 2014 (searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format_main_ssim&rows=0&facet.sort=index). Per github.com/sul-dlss/stanford-mods/issues/66, for geodata the resource type should be only Map and not include Software, multimedia.
#geo_extensions_as_envelope ⇒ Array{String}

Values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”.
#geo_extensions_point_data ⇒ Array{String}

Values suitable for solr SRPT fields, like “-16.0 28.0”.
#geographic_facet ⇒ Array<String>

geographic_search values with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#geographic_search ⇒ Array<String>

Values are the contents of: subject/geographic, subject/hierarchicalGeographic, subject/geographicCode (only include the translated value if it isn’t already present from other mods geo fields).
#imprint_display_str ⇒ String

Single String containing imprint information for display.
#includes_marc_relator_collector_role?(role_node) ⇒ Boolean

True if there is a MARC relator collector role assigned.
#main_author_w_date ⇒ String

the first encountered <mods><name> element with marcrelator flavor role of ‘Creator’ or ‘Author’.
#main_author_w_date_test ⇒ Object
#non_collector_person_authors ⇒ Object

FIXME: this is broken if there are multiple role codes and some of them are not marcrelator.
#nonSort_title ⇒ String

The nonSort text portion of the titleInfo node as a string (if non-empty, else nil).
#physical_location_str ⇒ String

Entire contents of physicalLocation as a string, but only if it has series, accession, box or folder data. The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.
#place ⇒ Object

Old date parsing methods used downstream of gem; will be deprecated/replaced with new date parsing methods.
#present_title_info_nodes ⇒ Nokogiri::XML::NodeSet

Title_info nodes, rejecting ones that just have blank text values.
#pub_date_display ⇒ String deprecated Deprecated.

DO NOT USE: this is no longer used in SW, Revs or Spotlight Jan 2016
#pub_date_facet ⇒ String

Values for the pub date facet.
#pub_date_sort ⇒ Object deprecated Deprecated.

use pub_year_int, or pub_year_sort_str if you must have a string (why?)
#pub_year_display_str(ignore_approximate = false) ⇒ Object

Return a single string intended for display of pub year. For 0 < year < 1000: add A.D.
#pub_year_int(ignore_approximate = false) ⇒ Integer

Return pub year as an Integer. Prefer dateIssued (any) before dateCreated (any) before dateCaptured (any). Look for a keyDate and use it if there is one; otherwise pick earliest date.
#pub_year_sort_str(ignore_approximate = false) ⇒ String deprecated Deprecated.

use pub_year_int
#series ⇒ String

The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.
#subject_all_search ⇒ Array<String>

Values are the contents of: all subject subelements except subject/cartographic, plus the genre top level element.
#subject_other_search ⇒ Array<String>

Values are the contents of: subject/name, subject/occupation (no subelements), subject/titleInfo.
#subject_other_subvy_search ⇒ Array<String>

Values are the contents of: subject/temporal, subject/genre.
#sw_addl_authors ⇒ Array<String>

Values for author_7xx_search field.
#sw_addl_titles ⇒ Array<String>

this includes all titles except.
#sw_corporate_authors ⇒ Array<String>

Values for author_corp_display.
#sw_full_title ⇒ String

Value for title_245_search, title_full_display.
#sw_full_title_without_commas ⇒ Object deprecated Deprecated.

in favor of sw_title_display
#sw_genre ⇒ Array[String]

Per github.com/sul-dlss/stanford-mods/issues/66, limit genre values to Government document, Conference proceedings, Technical report and Thesis/Dissertation.
#sw_geographic_search(sep = ' ') ⇒ Array<String>

Values are the contents of: subject/geographic, subject/hierarchicalGeographic, subject/geographicCode (only include the translated value if it isn’t already present from other mods geo fields).
#sw_impersonal_authors ⇒ Array<String>

return the display_value_w_date for all <mods><name> elements that do not have type=‘personal’.
#sw_language_facet ⇒ Object

include languages known to SearchWorks; try to error correct when possible (e.g. when ISO-639 disagrees with MARC standard).
#sw_main_author ⇒ String

Value for author_1xx_search field.
#sw_meeting_authors ⇒ Array<String>

Values for author_meeting_display.
#sw_person_authors ⇒ Array<String>

Values for author_person_facet, author_person_display.
#sw_short_title ⇒ String

Value for title_245a_search field.
#sw_sort_author ⇒ String

Returns a sortable version of the main_author: main_author + sorting title which is the mods approximation of the value created for a marc record.
#sw_sort_title ⇒ String

Returns a sortable version of the main title.
#sw_subject_names(sep = ', ') ⇒ Array<String>

Values are the contents of: subject/name/namePart. “Values from namePart subelements should be concatenated in the order they appear (e.g. ‘Shakespeare, William, 1564-1616’)”.
#sw_subject_titles(sep = ' ') ⇒ Array<String>

Values are the contents of: subject/titleInfo/(subelements).
#sw_title_display ⇒ String

like sw_full_title without trailing ,/;:.
#title ⇒ String

The text of the titleInfo node as a string (if non-empty, else nil).
#topic_facet ⇒ Array<String>

Values are the contents of: subject/topic, subject/name, subject/title, subject/occupation, with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#topic_search ⇒ Array<String>

Values are the contents of: mods/genre, mods/subject/topic.
#year_display_str(date_el_array) ⇒ String

given the passed date elements, look for a single keyDate and use it if there is one; otherwise pick earliest parseable date.
#year_int(date_el_array) ⇒ Integer

given the passed date elements, look for a single keyDate and use it if there is one; otherwise pick earliest parseable date.
#year_sort_str(date_el_array) ⇒ String

given the passed date elements, look for a single keyDate and use it if there is one; otherwise pick earliest parseable date.

Instance Attribute Details

#druid ⇒ `Object`



# File 'lib/stanford-mods/searchworks.rb', line 14

def druid
  @druid || 'Unknown item'
end

#logger ⇒ `Object` Also known as: sw_logger



# File 'lib/stanford-mods/searchworks.rb', line 18

def logger
  @logger ||= Logger.new(STDOUT)
end

Class Method Details

.date_is_approximate?(date_element) ⇒ `Boolean`

NOTE: legal values for MODS date elements with attribute qualifier are

'approximate', 'inferred' or 'questionable'

Parameters:

date_element (Nokogiri::XML::Element) —

MODS date element

Returns:

(Boolean) —

true if date_element has a qualifier attribute of “approximate” or “questionable”, false if no qualifier attribute, or if attribute is ‘inferred’ or some other value

# File 'lib/stanford-mods/origin_info.rb', line 153

def self.date_is_approximate?(date_element)
  qualifier = date_element["qualifier"] if date_element.respond_to?('[]')
  qualifier == 'approximate' || qualifier == 'questionable'
end
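
The qualifier check above can be tried without Nokogiri. The following is a minimal sketch, assuming a plain Hash is an acceptable stand-in for a Nokogiri::XML::Element (both respond to `[]` for attribute lookup); the sample hashes are hypothetical and not taken from any real record.

```ruby
# Sketch only: the method body mirrors the source listing above, but it is
# defined at top level here instead of on Stanford::Mods::Record.
def date_is_approximate?(date_element)
  qualifier = date_element['qualifier'] if date_element.respond_to?('[]')
  qualifier == 'approximate' || qualifier == 'questionable'
end

# Hypothetical stand-ins for <dateIssued qualifier="..."> elements.
p date_is_approximate?({ 'qualifier' => 'approximate' })  # true
p date_is_approximate?({ 'qualifier' => 'questionable' }) # true
p date_is_approximate?({ 'qualifier' => 'inferred' })     # false
p date_is_approximate?({})                                # false
p date_is_approximate?(nil)                               # false, nil has no [] method
```

Note that 'inferred' is a legal qualifier but is deliberately not treated as approximate.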

.earliest_year_int(date_el_array) ⇒ `Object`

get earliest parseable year (as an Integer) from the passed date elements

Parameters:

date_el_array (Array<Nokogiri::XML::Element>) —

the elements from which to select a pub date

Returns:

two values: the first is the Integer value of the earliest year; the second is the original String value of the chosen element



# File 'lib/stanford-mods/origin_info.rb', line 163

def self.earliest_year_int(date_el_array)
  earliest_year(date_el_array, :year_int_from_date_str)
end

.earliest_year_str(date_el_array) ⇒ `Object`

get earliest parseable year (as a String) from the passed date elements

Parameters:

date_el_array (Array<Nokogiri::XML::Element>) —

the elements from which to select a pub date

Returns:

two String values: the first is the lexically sortable String value of the earliest year; the second is the original String value of the chosen element



# File 'lib/stanford-mods/origin_info.rb', line 172

def self.earliest_year_str(date_el_array)
  earliest_year(date_el_array, :sortable_year_string_from_date_str)
end

.keyDate(elements) ⇒ `Nokogiri::XML::Element`, `nil`

given a set of date elements, return the single element with attribute keyDate=“yes”

or return nil if no elements have attribute keyDate="yes", or if multiple elements have keyDate="yes"

Parameters:

elements (Array<Nokogiri::XML::Element>) —

Array of date elements

Returns:

(Nokogiri::XML::Element, nil) —

single date element with attribute keyDate=“yes”, or nil

# File 'lib/stanford-mods/origin_info.rb', line 135

def self.keyDate(elements)
  keyDates = elements.select { |node| node["keyDate"] == 'yes' }
  keyDates.first if keyDates.size == 1
end
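
The "exactly one keyDate" rule can be illustrated as follows. This is a sketch under the assumption that Hashes can stand in for Nokogiri date elements; `key_date`, `ONE_KEY` and `TWO_KEYS` are hypothetical names used only for this example.

```ruby
# Sketch only: a snake_case copy of the keyDate selection logic shown above.
def key_date(elements)
  key_dates = elements.select { |node| node['keyDate'] == 'yes' }
  key_dates.first if key_dates.size == 1
end

# Hypothetical date elements; 'text' holds the element content.
ONE_KEY = [
  { 'text' => '1850' },
  { 'text' => '1860', 'keyDate' => 'yes' }
].freeze
TWO_KEYS = [
  { 'text' => '1850', 'keyDate' => 'yes' },
  { 'text' => '1860', 'keyDate' => 'yes' }
].freeze

p key_date(ONE_KEY)  # the 1860 element
p key_date(TWO_KEYS) # nil, because the key date is ambiguous
p key_date([])       # nil
```

Returning nil for multiple key dates lets callers fall back to the earliest parseable date instead of picking one arbitrarily.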

.remove_approximate(nodeset) ⇒ `Array<Nokogiri::XML::Element>`

remove Elements from NodeSet if they have a qualifier attribute of ‘approximate’ or ‘questionable’

Parameters:

nodeset (Nokogiri::XML::NodeSet<Nokogiri::XML::Element>) —

set of date elements

Returns:

(Array<Nokogiri::XML::Element>) —

the set of date elements minus any that had a qualifier attribute of ‘approximate’ or ‘questionable’



# File 'lib/stanford-mods/origin_info.rb', line 144

def self.remove_approximate(nodeset)
  nodeset.select { |node| node unless date_is_approximate?(node) }
end
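
As a rough illustration of the filtering, the sketch below uses `reject`, which gives the same result as the `select { |node| node unless ... }` form above as long as every node is truthy. Hashes stand in for Nokogiri elements, and `approximate?`, `without_approximate` and `SAMPLE_DATES` are hypothetical names, not part of the gem.

```ruby
# Sketch only: same qualifier rule as date_is_approximate?, on Hash stand-ins.
def approximate?(element)
  %w[approximate questionable].include?(element['qualifier'])
end

# Returns a plain Array with approximate and questionable dates dropped.
def without_approximate(elements)
  elements.reject { |element| approximate?(element) }
end

SAMPLE_DATES = [
  { 'text' => '1900', 'qualifier' => 'approximate' },
  { 'text' => '1905', 'qualifier' => 'inferred' },
  { 'text' => '1910' }
].freeze

p without_approximate(SAMPLE_DATES).map { |el| el['text'] } # ["1905", "1910"]
```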

Instance Method Details

#additional_authors_w_dates ⇒ `Object`

all names, in display form, except the main_author

names will be the display_value_w_date form
see Mods::Record.name  in nom_terminology for details on the display_value algorithm

# File 'lib/stanford-mods/name.rb', line 32

def additional_authors_w_dates
  results = []
  mods_ng_xml.plain_name.each { |n|
    results << n.display_value_w_date
  }
  results.delete(main_author_w_date)
  results
end

#box ⇒ `String`

TODO:

should it be hierarchical series/box/folder?

The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.

Returns:

(String) —

box number (note: single valued and might be something like 35A)

# File 'lib/stanford-mods/physical_location.rb', line 15

def box
  mods_ng_xml._location.physicalLocation.each do |node|
    match_data = node.text.match(/Box ?:? ?([^,|(Folder)]+)/i) # note that this will also find Flatbox or Flat-box
    return match_data[1].strip if match_data.present?
  end
  nil
end
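
To see what the box regex captures, here is a small sketch that applies it to single strings. `box_from` and `BOX_REGEX` are hypothetical names, the sample physicalLocation strings are invented for illustration, and plain truthiness replaces the `present?` call (which comes from ActiveSupport).

```ruby
# Sketch only: the pattern is copied from #box above.
BOX_REGEX = /Box ?:? ?([^,|(Folder)]+)/i

# Returns the captured box value from one physicalLocation string, or nil.
def box_from(text)
  match_data = text.match(BOX_REGEX)
  match_data[1].strip if match_data
end

p box_from('Call Number: SC0340, Accession 2005-101, Box: 51, Folder: 3') # "51"
p box_from('Series Biographical | Box 35A | Folder 2')                    # "35A"
p box_from('Stanford University Archives')                                # nil
```

The capture stops at the first comma or pipe, which is why both delimiter styles work. The `(Folder)` inside the brackets is part of a character class rather than a group, so it excludes those individual characters; this sketch does not try to change that behavior.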

#catkey ⇒ `String`

Returns value with the numeric catkey in it, or nil if none exists.

Returns:

(String) —

value with the numeric catkey in it, or nil if none exists

# File 'lib/stanford-mods/searchworks.rb', line 366

def catkey
  catkey = term_values([:record_info, :recordIdentifier])
  return nil unless catkey && !catkey.empty?
  catkey.first.tr('a', '') # ensure catkey is numeric only
end

#collectors_w_dates ⇒ `Object`

Returns Array of Strings, each containing the computed display value of a personal name with the role of Collector (see mods gem nom_terminology for display value algorithm).

Returns:

Array of Strings, each containing the computed display value of a personal name with the role of Collector (see mods gem nom_terminology for display value algorithm)

# File 'lib/stanford-mods/name.rb', line 57

def collectors_w_dates
  result = []
  mods_ng_xml.personal_name.each do |n|
    next if n.role.size.zero?
    n.role.each { |r|
      result << n.display_value_w_date if includes_marc_relator_collector_role?(r)
    }
  end
  result unless result.empty?
end

#coordinates ⇒ `Array{String}`

Returns subject cartographic coordinates values.

Returns:

(Array{String}) —

subject cartographic coordinates values



# File 'lib/stanford-mods/geo_spatial.rb', line 11

def coordinates
  Array(mods_ng_xml.subject.cartographics.coordinates).map(&:text)
end

#coordinates_as_bbox ⇒ `Array{String}` Also known as: point_bbox

Returns with 4-part space-delimited strings, like “-16.0 -15.0 28.0 13.0”.

Returns:

(Array{String}) —

with 4-part space-delimited strings, like “-16.0 -15.0 28.0 13.0”



# File 'lib/stanford-mods/geo_spatial.rb', line 62

def coordinates_as_bbox
  coordinates_objects.map(&:as_bbox).compact
end

#coordinates_as_envelope ⇒ `Array{String}`

Returns values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”.

Returns:

(Array{String}) —

values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”



# File 'lib/stanford-mods/geo_spatial.rb', line 57

def coordinates_as_envelope
  coordinates_objects.map(&:as_envelope).compact
end

#coordinates_objects ⇒ `Array{Stanford::Mods::Coordinate}`

Returns valid coordinates as objects.

Returns:

(Array{Stanford::Mods::Coordinate}) —

valid coordinates as objects



# File 'lib/stanford-mods/geo_spatial.rb', line 52

def coordinates_objects
  coordinates.map { |n| Stanford::Mods::Coordinate.new(n) }.select(&:valid?)
end

#date_created_elements(ignore_approximate = false) ⇒ `Array<Nokogiri::XML::Element>`

return /originInfo/dateCreated elements in MODS records

Parameters:

ignore_approximate (Boolean) (defaults to: false) —

true if approximate dates (per qualifier attribute) should be excluded; false if approximate dates should be included

Returns:

(Array<Nokogiri::XML::Element>)

# File 'lib/stanford-mods/origin_info.rb', line 115

def date_created_elements(ignore_approximate = false)
  date_created_nodeset = mods_ng_xml.origin_info.dateCreated
  return self.class.remove_approximate(date_created_nodeset) if ignore_approximate
  date_created_nodeset.to_a
end

#date_issued_elements(ignore_approximate = false) ⇒ `Array<Nokogiri::XML::Element>`

return /originInfo/dateIssued elements in MODS records

Parameters:

ignore_approximate (Boolean) (defaults to: false) —

true if approximate dates (per qualifier attribute) should be excluded; false if approximate dates should be included

Returns:

(Array<Nokogiri::XML::Element>)

# File 'lib/stanford-mods/origin_info.rb', line 125

def date_issued_elements(ignore_approximate = false)
  date_issued_nodeset = mods_ng_xml.origin_info.dateIssued
  return self.class.remove_approximate(date_issued_nodeset) if ignore_approximate
  date_issued_nodeset.to_a
end

#era_facet ⇒ `Array<String>`

subject/temporal values with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the era_facet Solr field for this document or nil if none



# File 'lib/stanford-mods/searchworks_subjects.rb', line 93

def era_facet
  subject_temporal.map { |val| val.sub(/[\\,;]$/, '').strip } if subject_temporal
end
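
The cleanup applied to each value can be tried in isolation. The sketch below assumes nothing beyond the `sub(...).strip` expression shown above; `strip_trailing_punct` is a hypothetical helper name and the sample values are invented.

```ruby
# Sketch only: removes one trailing backslash, comma or semicolon, then
# strips surrounding whitespace, in that order.
def strip_trailing_punct(val)
  val.sub(/[\\,;]$/, '').strip
end

p strip_trailing_punct('17th century,')  # "17th century"
p strip_trailing_punct('Ming dynasty ;') # "Ming dynasty"
p strip_trailing_punct('1990s\\')        # "1990s"
p strip_trailing_punct('1990s, ')        # "1990s,"
```

The last call shows an edge case: because the substitution runs before `strip`, punctuation followed by trailing whitespace is left in place. The same expression is used by #geographic_facet.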

#first_title_info_node ⇒ `Nokogiri::XML::Node`

Returns the first titleInfo node if present, else nil.

Returns:

(Nokogiri::XML::Node) —

the first titleInfo node if present, else nil



# File 'lib/stanford-mods/searchworks.rb', line 138

def first_title_info_node
  present_title_info_nodes ? present_title_info_nodes.first : nil
end

#folder ⇒ `String`

TODO:

should it be hierarchical series/box/folder?

The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.

Returns:

(String) —

folder number (note: single valued)

# File 'lib/stanford-mods/physical_location.rb', line 27

def folder
  mods_ng_xml._location.physicalLocation.each do |node|
    val = node.text
    match_data = val =~ /\|/ ?
                 val.match(/Folder ?:? ?([^|]+)/) : # expect pipe-delimited, may contain commas within values
                 val.match(/Folder ?:? ?([^,]+)/)   # expect comma-delimited, may NOT contain commas within values
    return match_data[1].strip if match_data.present?
  end
  nil
end
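
The choice between the two folder patterns depends on whether the value contains a pipe. The following sketch shows the difference on invented sample strings; `folder_from` is a hypothetical helper, and plain truthiness replaces ActiveSupport's `present?`.

```ruby
# Sketch only: both patterns are copied from #folder above.
def folder_from(text)
  match_data = if text.include?('|')
                 text.match(/Folder ?:? ?([^|]+)/) # pipe-delimited: commas allowed in the value
               else
                 text.match(/Folder ?:? ?([^,]+)/) # comma-delimited: value ends at first comma
               end
  match_data[1].strip if match_data
end

p folder_from('Series 1 | Box 10 | Folder 8, Part 2') # "8, Part 2"
p folder_from('Box 10, Folder 8, Part 2')             # "8"
p folder_from('Box 10')                               # nil
```

Unlike the box pattern, these patterns are case-sensitive, so only a capitalized "Folder" is matched.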

#format ⇒ `Array[String]`

Deprecated.

kept for backwards compatibility but not part of SW UI redesign work Summer 2014

select one or more format values from the controlled vocabulary here:

http://searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format&rows=0&facet.sort=index

Deprecated: this is no longer used in SW, Revs or Spotlight Jan 2016

Returns:

(Array[String]) —

value in the SearchWorks controlled vocabulary

# File 'lib/stanford-mods/searchworks.rb', line 235

def format
  types = term_values(:typeOfResource)
  return [] unless types
  genres = term_values(:genre)
  issuance = term_values([:origin_info, :issuance])
  val = []
  types.each do |type|
    case type
      when 'cartographic'
        val << 'Map/Globe'
      when 'mixed material'
        val << 'Manuscript/Archive'
      when 'moving image'
        val << 'Video'
      when 'notated music'
        val << 'Music - Score'
      when 'software, multimedia'
        val << 'Computer File'
      when 'sound recording-musical'
        val << 'Music - Recording'
      when 'sound recording-nonmusical', 'sound recording'
        val << 'Sound Recording'
      when 'still image'
        val << 'Image'
      when 'text'
        val << 'Book' if issuance && issuance.include?('monographic')
        book_genres = ['book chapter', 'Book chapter', 'Book Chapter',
          'issue brief', 'Issue brief', 'Issue Brief',
          'librettos', 'Librettos',
          'project report', 'Project report', 'Project Report',
          'technical report', 'Technical report', 'Technical Report',
          'working paper', 'Working paper', 'Working Paper']
        val << 'Book' if genres && !(genres & book_genres).empty?
        conf_pub = ['conference publication', 'Conference publication', 'Conference Publication']
        val << 'Conference Proceedings' if genres && !(genres & conf_pub).empty?
        val << 'Journal/Periodical' if issuance && issuance.include?('continuing')
        article = ['article', 'Article']
        val << 'Journal/Periodical' if genres && !(genres & article).empty?
        stu_proj_rpt = ['student project report', 'Student project report', 'Student Project report', 'Student Project Report']
        val << 'Other' if genres && !(genres & stu_proj_rpt).empty?
        thesis = ['thesis', 'Thesis']
        val << 'Thesis' if genres && !(genres & thesis).empty?
      when 'three dimensional object'
        val << 'Other'
    end
  end
  val.uniq
end

#format_main ⇒ `Array[String]`

select one or more format values from the controlled vocabulary per JVine Summer 2014

http://searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format_main_ssim&rows=0&facet.sort=index

github.com/sul-dlss/stanford-mods/issues/66 - For geodata, the resource type should be only Map and not include Software, multimedia.

Returns:

(Array[String]) —

value in the SearchWorks controlled vocabulary

# File 'lib/stanford-mods/searchworks.rb', line 289

def format_main
  types = term_values(:typeOfResource)
  return [] unless types
  article_genres = ['article', 'Article',
    'book chapter', 'Book chapter', 'Book Chapter',
    'issue brief', 'Issue brief', 'Issue Brief',
    'project report', 'Project report', 'Project Report',
    'student project report', 'Student project report', 'Student Project report', 'Student Project Report',
    'technical report', 'Technical report', 'Technical Report',
    'working paper', 'Working paper', 'Working Paper'
  ]
  book_genres = ['conference publication', 'Conference publication', 'Conference Publication',
    'instruction', 'Instruction',
    'librettos', 'Librettos',
    'thesis', 'Thesis'
  ]
  val = []
  genres = term_values(:genre)
  issuance = term_values([:origin_info, :issuance])
  types.each do |type|
    case type
      when 'cartographic'
        val << 'Map'
        val.delete 'Software/Multimedia'
      when 'mixed material'
        val << 'Archive/Manuscript'
      when 'moving image'
        val << 'Video'
      when 'notated music'
        val << 'Music score'
      when 'software, multimedia'
        if genres && (genres.include?('dataset') || genres.include?('Dataset'))
          val << 'Dataset'
        elsif !val.include?('Map')
          val << 'Software/Multimedia'
        end
      when 'sound recording-musical'
        val << 'Music recording'
      when 'sound recording-nonmusical', 'sound recording'
        val << 'Sound recording'
      when 'still image'
        val << 'Image'
      when 'text'
        val << 'Book' if genres && !(genres & article_genres).empty?
        val << 'Book' if issuance && issuance.include?('monographic')
        val << 'Book' if genres && !(genres & book_genres).empty?
        val << 'Journal/Periodical' if issuance && issuance.include?('continuing')
        val << 'Archived website' if genres && genres.include?('archived website')
      when 'three dimensional object'
        val << 'Object'
    end
  end
  val.uniq
end
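
The interaction between the 'cartographic' and 'software, multimedia' branches is order-independent, which may not be obvious from the listing. The sketch below reproduces only those two branches; `map_or_software` is a hypothetical reduced helper and is not part of the gem.

```ruby
# Sketch only: a cut-down copy of the two geodata-related branches of
# #format_main, taking types and genres as plain Arrays of Strings.
def map_or_software(types, genres = [])
  val = []
  types.each do |type|
    case type
    when 'cartographic'
      val << 'Map'
      val.delete 'Software/Multimedia' # drop it if it was added earlier
    when 'software, multimedia'
      if genres.include?('dataset') || genres.include?('Dataset')
        val << 'Dataset'
      elsif !val.include?('Map')
        val << 'Software/Multimedia' # only when no Map has been seen yet
      end
    end
  end
  val.uniq
end

p map_or_software(['software, multimedia', 'cartographic']) # ["Map"]
p map_or_software(['cartographic', 'software, multimedia']) # ["Map"]
p map_or_software(['software, multimedia'])                 # ["Software/Multimedia"]
p map_or_software(['software, multimedia'], ['Dataset'])    # ["Dataset"]
```

Whichever typeOfResource comes first, geodata without a dataset genre ends up as Map only, which matches the intent stated in issue 66.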

#geo_extensions_as_envelope ⇒ `Array{String}`

Note:

example xml leaf nodes <gml:lowerCorner>-122.191292 37.4063388</gml:lowerCorner> <gml:upperCorner>-122.149475 37.4435369</gml:upperCorner>

Returns values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”.

Returns:

(Array{String}) —

values suitable for solr SRPT fields, like “ENVELOPE(-16.0, 28.0, 13.0, -15.0)”

# File 'lib/stanford-mods/geo_spatial.rb', line 19

def geo_extensions_as_envelope
  mods_ng_xml.extension
             .xpath(
               '//rdf:RDF/rdf:Description/gml:boundedBy/gml:Envelope',
               'gml' => GMLNS,
               'rdf' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
             ).map do |v|
               uppers = v.xpath('gml:upperCorner', 'gml' => GMLNS).text.split
               lowers = v.xpath('gml:lowerCorner', 'gml' => GMLNS).text.split
               "ENVELOPE(#{lowers[0]}, #{uppers[0]}, #{uppers[1]}, #{lowers[1]})"
             end
rescue RuntimeError => e
  logger.warn "failure parsing <extension> element: #{e.message}"
  []
end
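
The string assembly in the block above can be checked with the corner values from the note. This is a minimal sketch; `envelope_from` is a hypothetical helper that takes the text of gml:lowerCorner and gml:upperCorner directly, so no XML parsing is involved.

```ruby
# Sketch only: builds the ENVELOPE string the same way as the map block above.
# Each corner is an "x y" pair separated by whitespace.
def envelope_from(lower_corner, upper_corner)
  lowers = lower_corner.split
  uppers = upper_corner.split
  # Order is min x, max x, max y, min y.
  "ENVELOPE(#{lowers[0]}, #{uppers[0]}, #{uppers[1]}, #{lowers[1]})"
end

puts envelope_from('-122.191292 37.4063388', '-122.149475 37.4435369')
# ENVELOPE(-122.191292, -122.149475, 37.4435369, 37.4063388)
```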

#geo_extensions_point_data ⇒ `Array{String}`

Note:

example xml leaf nodes <gml:pos>-122.191292 37.4063388</gml:pos>

Returns values suitable for solr SRPT fields, like “-16.0 28.0”.

Returns:

(Array{String}) —

values suitable for solr SRPT fields, like “-16.0 28.0”

# File 'lib/stanford-mods/geo_spatial.rb', line 38

def geo_extensions_point_data
  mods_ng_xml.extension
             .xpath(
               '//rdf:RDF/rdf:Description/gmd:centerPoint/gml:Point[gml:pos]',
               'gml' => GMLNS,
               'gmd' => 'http://www.isotc211.org/2005/gmd',
               'rdf' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
             ).map do |v|
               lat, long = v.xpath('gml:pos', 'gml' => GMLNS).text.split
               "#{long} #{lat}"
             end
end

#geographic_facet ⇒ `Array<String>`

geographic_search values with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the geographic_facet Solr field for this document or nil if none



# File 'lib/stanford-mods/searchworks_subjects.rb', line 87

def geographic_facet
  geographic_search.map { |val| val.sub(/[\\,;]$/, '').strip } if geographic_search
end

#geographic_search ⇒ `Array<String>`

Values are the contents of:

subject/geographic
subject/hierarchicalGeographic
subject/geographicCode  (only include the translated value if it isn't already present from other mods geo fields)

Returns:

(Array<String>) —

values for the geographic_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 102

def geographic_search
  @geographic_search ||= begin
    result = sw_geographic_search

    # TODO:  this should go into stanford-mods ... but then we have to set that gem up with a Logger
    # print a message for any unrecognized encodings
    xvals = subject.geographicCode.translated_value
    codes = term_values([:subject, :geographicCode])
    if codes && codes.size > xvals.size
      subject.geographicCode.each { |n|
        next unless n.authority != 'marcgac' && n.authority != 'marccountry'
        sw_logger.info("#{druid} has subject geographicCode element with untranslated encoding (#{n.authority}): #{n.to_xml}")
      }
    end

    # FIXME:  stanford-mods should be returning [], not nil ...
    return nil if !result || result.empty?
    result
  end
end

#imprint_display_str ⇒ `String`

Returns single String containing imprint information for display.

Returns:

(String) —

single String containing imprint information for display

# File 'lib/stanford-mods/origin_info.rb', line 73

def imprint_display_str
  imp = Stanford::Mods::Imprint.new(origin_info)
  imp.display_str
end

#includes_marc_relator_collector_role?(role_node) ⇒ `Boolean`

Returns true if there is a MARC relator collector role assigned.

Parameters:

role_node (Nokogiri::XML::Node) —

the role node from a parent name node

Returns:

(Boolean) —

true if there is a MARC relator collector role assigned

# File 'lib/stanford-mods/name.rb', line 72

def includes_marc_relator_collector_role?(role_node)
  (role_node.authority.include?('marcrelator') && role_node.value.include?('Collector')) ||
  role_node.roleTerm.valueURI.first == COLLECTOR_ROLE_URI
end

#main_author_w_date ⇒ `String`

The first encountered <mods><name> element with marcrelator flavor role of ‘Creator’ or ‘Author’. If there is no marcrelator ‘Creator’ or ‘Author’, the first name without a role. If there is no name without a role, then nil. See Mods::Record.name in nom_terminology for details on the display_value algorithm.

Returns:

(String) —

a name in the display_value_w_date form

# File 'lib/stanford-mods/name.rb', line 13

def main_author_w_date
  result = nil
  first_wo_role = nil
  mods_ng_xml.plain_name.each { |n|
    first_wo_role ||= n if n.role.empty?
    n.role.each { |r|
      if r.authority.include?('marcrelator') &&
            (r.value.include?('Creator') || r.value.include?('Author'))
        result ||= n.display_value_w_date
      end
    }
  }
  result = first_wo_role.display_value_w_date if !result && first_wo_role
  result
end
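
The selection order (marcrelator Creator or Author first, otherwise the first name without a role) can be sketched with plain Structs. `SketchName`, `SketchRole`, `main_author_from` and the sample names are hypothetical; the sketch also assumes that `authority` and `value` behave like Arrays of Strings, which is an assumption about the mods gem and not something stated above.

```ruby
# Sketch only: Struct stand-ins for the name and role nodes of the mods gem.
SketchRole = Struct.new(:authority, :value)
SketchName = Struct.new(:display_value_w_date, :role)

# Same control flow as #main_author_w_date, applied to an Array of names.
def main_author_from(names)
  result = nil
  first_wo_role = nil
  names.each do |n|
    first_wo_role ||= n if n.role.empty?
    n.role.each do |r|
      if r.authority.include?('marcrelator') &&
         (r.value.include?('Creator') || r.value.include?('Author'))
        result ||= n.display_value_w_date
      end
    end
  end
  result = first_wo_role.display_value_w_date if !result && first_wo_role
  result
end

SKETCH_NAMES = [
  SketchName.new('Doe, Jane, 1900-1980', []),
  SketchName.new('Smith, John', [SketchRole.new(['marcrelator'], ['Author'])])
].freeze

p main_author_from(SKETCH_NAMES)          # "Smith, John"
p main_author_from(SKETCH_NAMES.first(1)) # "Doe, Jane, 1900-1980"
p main_author_from([])                    # nil
```

A role-less name that appears earlier in the record does not win over a later name with an Author or Creator role; it is only a fallback.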

#main_author_w_date_test ⇒ `Object`

# File 'lib/stanford-mods/searchworks.rb', line 107

def main_author_w_date_test
  result = nil
  first_wo_role = nil
  plain_name.each { |n|
    first_wo_role ||= n if n.role.empty?
    n.role.each { |r|
      if r.authority.include?('marcrelator') &&
        (r.value.include?('Creator') || r.value.include?('Author'))
        result ||= n.display_value_w_date
      end
    }
  }
  result = first_wo_role.display_value_w_date if !result && first_wo_role
  result
end

#non_collector_person_authors ⇒ `Object`

FIXME: this is broken if there are multiple role codes and some of them are not marcrelator

Returns:

Array of Strings, each containing the computed display value of a personal name except for the collector role (see mods gem nom_terminology for display value algorithm)

# File 'lib/stanford-mods/name.rb', line 44

def non_collector_person_authors
  result = []
  mods_ng_xml.personal_name.map do |n|
    next if n.role.size.zero?
    n.role.each { |r|
      result << n.display_value_w_date unless includes_marc_relator_collector_role?(r)
    }
  end
  result unless result.empty?
end

#nonSort_title ⇒ `String`

Returns the nonSort text portion of the titleInfo node as a string (if non-empty, else nil).

Returns:

(String) —

the nonSort text portion of the titleInfo node as a string (if non-empty, else nil)

# File 'lib/stanford-mods/searchworks.rb', line 143

def nonSort_title
  return unless first_title_info_node && first_title_info_node.nonSort

  first_title_info_node.nonSort.text.strip.empty? ? nil : first_title_info_node.nonSort.text.strip
end

#physical_location_str ⇒ `String`

TODO:

should it be hierarchical series/box/folder?

Note:

there is a “physicalLocation” and a “location” method defined in the mods gem, so we cannot use these names without causing conflicts

Returns the contents of physicalLocation, but only if it has series, accession, box or folder data. The data may be in location/physicalLocation or in relatedItem/location/physicalLocation, so _location is used to get the data from either one.

Returns:

(String) —

entire contents of physicalLocation as a string (note: single valued)

# File 'lib/stanford-mods/physical_location.rb', line 44

def physical_location_str
  mods_ng_xml._location.physicalLocation.map(&:text).find do |text|
    text =~ /.*(Series)|(Accession)|(Folder)|(Box).*/i
  end
end
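
The match in the listing above can be tried outside the gem. The following is a minimal sketch in plain Ruby, not the gem's code: `first_location_with_container_info` and the sample strings are hypothetical. It uses the shorter pattern `/Series|Accession|Folder|Box/i`, on the assumption that the leading and trailing `.*` in the original do not change which strings match an unanchored `=~`.

```ruby
# Returns the first text that mentions series, accession, folder or box
# (case-insensitive), or nil when none does.
def first_location_with_container_info(texts)
  pattern = /Series|Accession|Folder|Box/i
  texts.find { |text| text =~ pattern }
end

# Hypothetical physicalLocation text values for one record.
samples = ['Stanford University. Libraries.', 'Series 1, Box 10, Folder 8']
puts first_location_with_container_info(samples).inspect
```

Because the pattern is unanchored and unbounded, any text containing one of the four words anywhere will match.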

#place ⇒ `Object`

Old date parsing method used downstream of the gem; will be deprecated/replaced with the new date parsing methods.



# File 'lib/stanford-mods/origin_info.rb', line 226

def place
  term_values([:origin_info, :place, :placeTerm])
end

#present_title_info_nodes ⇒ `Nokogiri::XML::NodeSet`

Returns title_info nodes, rejecting ones that just have blank text values.

Returns:

(Nokogiri::XML::NodeSet) —

title_info nodes, rejecting ones that just have blank text values



# File 'lib/stanford-mods/searchworks.rb', line 133

def present_title_info_nodes
  mods_ng_xml.title_info.reject {|node| node.text.strip.empty?}
end

#pub_date_display ⇒ `String`

Deprecated.

DO NOT USE: this is no longer used in SW, Revs or Spotlight Jan 2016

For date display only: look first in the array of dates without encoding=marc. If there are no such dates, select the first date in the dates_marc_encoding array. Otherwise return nil.

Returns:

(String) —

value for the pub_date_display Solr field for this document or nil if none

# File 'lib/stanford-mods/origin_info.rb', line 261

def pub_date_display
  return dates_no_marc_encoding.first unless dates_no_marc_encoding.empty?
  return dates_marc_encoding.first unless dates_marc_encoding.empty?
  nil
end

#pub_date_facet ⇒ `String`

Values for the pub date facet. This is less strict than the 4-digit year requirement for pub_date.

Jan 2016: used to populate the Solr pub_date field for Spotlight and SearchWorks.

Spotlight:  pub_date field should be replaced by pub_year_w_approx_isi and pub_year_no_approx_isi
SearchWorks:  pub_date field used for display in search results and show view; for sorting nearby-on-shelf
   these could be done with more appropriate fields/methods (pub_year_int for sorting;  new pub year methods to populate field)

TODO: should probably deprecate this in favor of pub_year_display_str; needs head-to-head testing with pub_year_display_str

Returns:

(String) —

value for the pub date facet

# File 'lib/stanford-mods/origin_info.rb', line 238

def pub_date_facet
  return nil unless pub_date
  return "#{pub_date.to_i + 1000} B.C." if pub_date.start_with?('-')
  return pub_date unless pub_date.include? '--'
  "#{pub_date[0, 2].to_i + 1}th century"
end
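
To see what each branch of the listing produces, here is a standalone copy of its logic that takes the pub_date value as an argument. The name `facet_for` and the sample inputs are hypothetical; in particular, treating `'-995'` as a typical B.C. input is an assumption based on the sort-string note for pub_year_sort_str.

```ruby
# Same branching as pub_date_facet, with pub_date passed in as a String or nil.
def facet_for(pub_date)
  return nil unless pub_date
  # Leading '-' is treated as B.C.; the value is offset by 1000.
  return "#{pub_date.to_i + 1000} B.C." if pub_date.start_with?('-')
  # Plain years pass through unchanged.
  return pub_date unless pub_date.include?('--')
  # 'NN--' becomes a century label; the suffix is always 'th'.
  "#{pub_date[0, 2].to_i + 1}th century"
end

['2010', '19--', '-995', nil].each { |d| puts facet_for(d).inspect }
```

As written, the century branch always appends `th`, so an input of `'20--'` would give `21th century`.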

#pub_date_sort ⇒ `Object`

Deprecated.

use pub_year_int, or pub_year_sort_str if you must have a string (why?)

Creates a date suitable for sorting. Guaranteed to be 4 digits or nil.

# File 'lib/stanford-mods/origin_info.rb', line 247

def pub_date_sort
  if pub_date
    pd = pub_date
    pd = '0' + pd if pd.length == 3
    pd = pd.gsub('--', '00')
  end
  fail "pub_date_sort was about to return a non 4 digit value #{pd}!" if pd && pd.length != 4
  pd
end
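
The padding and substitution steps can be exercised in isolation. This is a sketch under the assumption that pub_date arrives as a String such as `'966'` or `'19--'`; `sortable_pub_date` is a hypothetical name, not part of the gem.

```ruby
# Mirrors pub_date_sort: left-pad 3-digit years, turn '--' into '00',
# and refuse anything that is not exactly 4 characters long.
def sortable_pub_date(pub_date)
  return nil unless pub_date
  pd = pub_date.length == 3 ? "0#{pub_date}" : pub_date
  pd = pd.gsub('--', '00')
  raise "pub_date_sort was about to return a non 4 digit value #{pd}!" unless pd.length == 4
  pd
end

puts sortable_pub_date('966')
puts sortable_pub_date('19--')
```

Note that the length check only counts characters; it does not verify that they are digits.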

#pub_year_display_str(ignore_approximate = false) ⇒ `Object`

Returns a single string intended for display of the pub year:

0 < year < 1000: add A.D. suffix
year < 0: add B.C. suffix
195u => 195x
19uu => 19xx

'-5'  =>  '5 B.C.'
'700 B.C.'  => '700 B.C.'
'7th century' => '7th century'

date ranges?

Prefers dateIssued (any) before dateCreated (any) before dateCaptured (any).

Looks for a keyDate and uses it if there is one; otherwise picks the earliest date.

Parameters:

ignore_approximate (Boolean) (defaults to: false) —

true if approximate dates (per qualifier attribute) should be ignored; false if approximate dates should be included

# File 'lib/stanford-mods/origin_info.rb', line 47

def pub_year_display_str(ignore_approximate = false)
  single_pub_year(ignore_approximate, :year_display_str)

  # TODO: want range displayed when start and end points
  # TODO: also want best year in year_isi fields
  # get_main_title_date
  # https://github.com/sul-dlss/SearchWorks/blob/7d4d870a9d450fed8b081c38dc3dbd590f0b706e/app/helpers/results_document_helper.rb#L8-L46

  # "publication_year_isi"   => "Publication date",  <--  do it already
  # "beginning_year_isi"     => "Beginning date",
  # "earliest_year_isi"      => "Earliest date",
  # "earliest_poss_year_isi" => "Earliest possible date",
  # "ending_year_isi"        => "Ending date",
  # "latest_year_isi"        => "Latest date",
  # "latest_poss_year_isi"   => "Latest possible date",
  # "production_year_isi"    => "Production date",
  # "original_year_isi"      => "Original date",
  # "copyright_year_isi"     => "Copyright date"} %>

  # "creation_year_isi"      => "Creation date",  <--  do it already
  # {}"release_year_isi"       => "Release date",
  # {}"reprint_year_isi"       => "Reprint/reissue date",
  # {}"other_year_isi"         => "Date",
end
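
The display rules listed in the description can be illustrated with a small formatter. This is only a sketch of those rules, not the gem's DateParsing code: `display_year` is a hypothetical helper, and the decision to leave any unrecognised string untouched is an assumption.

```ruby
# Applies the documented display rules to a single date string.
def display_year(str)
  # Negative years get a B.C. suffix.
  return "#{Regexp.last_match(1)} B.C." if str =~ /\A-(\d+)\z/
  # Unknown digits written as 'u' are shown as 'x'.
  return str.tr('u', 'x') if str =~ /\A\d+u+\z/
  # Years between 1 and 999 get an A.D. suffix.
  return "#{str.to_i} A.D." if str =~ /\A\d{1,3}\z/ && str.to_i.positive?
  # Everything else ('700 B.C.', '7th century', 4-digit years) passes through.
  str
end

['-5', '700', '195u', '19uu', '700 B.C.', '7th century', '1950'].each do |s|
  puts "#{s} => #{display_year(s)}"
end
```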

#pub_year_int(ignore_approximate = false) ⇒ `Integer`

Note:

for sorting: 5 B.C. => -5; 666 B.C. => -666

Returns the pub year as an Integer. Prefers dateIssued (any) before dateCreated (any) before dateCaptured (any).

Looks for a keyDate and uses it if there is one; otherwise picks the earliest date.

Parameters:

ignore_approximate (Boolean) (defaults to: false) —

true if approximate dates (per qualifier attribute) should be ignored; false if approximate dates should be included

Returns:

(Integer) —

publication year as an Integer



# File 'lib/stanford-mods/origin_info.rb', line 19

def pub_year_int(ignore_approximate = false)
  single_pub_year(ignore_approximate, :year_int)
end

#pub_year_sort_str(ignore_approximate = false) ⇒ `String`

Deprecated.

use pub_year_int

Note:

for string sorting 5 B.C. = -5 => -995; 6 B.C. => -994, so 6 B.C. sorts before 5 B.C.

Returns a single string intended for lexical sorting by pub date. Prefers dateIssued (any) before dateCreated (any) before dateCaptured (any).

Looks for a keyDate and uses it if there is one; otherwise picks the earliest date.

Parameters:

ignore_approximate (Boolean) (defaults to: false) —

true if approximate dates (per qualifier attribute) should be ignored; false if approximate dates should be included

Returns:

(String) —

single String containing publication year for lexical sorting



# File 'lib/stanford-mods/origin_info.rb', line 30

def pub_year_sort_str(ignore_approximate = false)
  single_pub_year(ignore_approximate, :year_sort_str)
end
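
The note about B.C. years is easier to follow with a worked example. The encoder below is hypothetical: it reproduces the mapping from the note (5 B.C. becomes `-995`, 6 B.C. becomes `-994`), and the zero padding on both branches is an assumption added so that plain string comparison stays chronological for years from -999 through 9999.

```ruby
# Maps an Integer year to a string whose lexical order matches chronological order.
def year_sort_key(year)
  if year.negative?
    # -5 => "-995", -6 => "-994": earlier B.C. years get smaller digit strings.
    format('-%03d', 1000 + year)
  else
    format('%04d', year)
  end
end

years = [1950, -5, 700, -6]
puts years.sort_by { |y| year_sort_key(y) }.inspect
```

Sorting on the string keys puts 6 B.C. before 5 B.C., and both before any A.D. year, because `-` sorts ahead of the digits.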

#series ⇒ `String`

TODO:

should it be hierarchical series/box/folder?

data in location/physicalLocation or in relatedItem/location/physicalLocation so use _location to get the data from either one of them

Returns:

(String) —

series/accession ‘number’ (note: single valued)

# File 'lib/stanford-mods/physical_location.rb', line 54

def series
  mods_ng_xml._location.physicalLocation.each do |node|
    # feigenbaum uses 'Accession'
    match_data = node.text.match(/(?:(?:Series)|(?:Accession)):? ([^,|]+)/i)
    return match_data[1].strip if match_data.present?
  end
  nil
end
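
The extraction pattern can be checked against plain strings. The sketch below reuses the regular expression from the listing; `series_from` and the sample location strings are hypothetical, and a plain truthiness check stands in for `present?`, which comes from ActiveSupport.

```ruby
# Returns the value following 'Series' or 'Accession' in the first matching text.
def series_from(texts)
  texts.each do |text|
    # Capture everything after the label up to the next comma or pipe.
    match_data = text.match(/(?:(?:Series)|(?:Accession)):? ([^,|]+)/i)
    return match_data[1].strip if match_data
  end
  nil
end

puts series_from(['Call Number: SC0340, Accession 2005-101, Box : 39, Folder: 9']).inspect
puts series_from(['Series: Papers | Box 10']).inspect
```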

#subject_all_search ⇒ `Array<String>`

Values are the contents of:

all subject subelements except subject/cartographic, plus the top-level genre element

Returns:

(Array<String>) —

values for the subject_all_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 159

def subject_all_search
  vals = topic_search ? Array.new(topic_search) : []
  vals.concat(geographic_search) if geographic_search
  vals.concat(subject_other_search) if subject_other_search
  vals.concat(subject_other_subvy_search) if subject_other_subvy_search
  vals.empty? ? nil : vals
end

#subject_other_search ⇒ `Array<String>`

Values are the contents of:

subject/name
subject/occupation  - no subelements
subject/titleInfo

Returns:

(Array<String>) —

values for the subject_other_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 128

def subject_other_search
  @subject_other_search ||= begin
    vals = subject_occupations ? Array.new(subject_occupations) : []
    vals.concat(subject_names) if subject_names
    vals.concat(subject_titles) if subject_titles
    vals.empty? ? nil : vals
  end
end

#subject_other_subvy_search ⇒ `Array<String>`

Values are the contents of:

subject/temporal
subject/genre

Returns:

(Array<String>) —

values for the subject_other_subvy_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 141

def subject_other_subvy_search
  @subject_other_subvy_search ||= begin
    vals = subject_temporal ? Array.new(subject_temporal) : []
    gvals = term_values([:subject, :genre])
    vals.concat(gvals) if gvals

    # print a message for any temporal encodings
    subject.temporal.each { |n|
      sw_logger.info("#{druid} has subject temporal element with untranslated encoding: #{n.to_xml}") unless n.encoding.empty?
    }

    vals.empty? ? nil : vals
  end
end

#sw_addl_authors ⇒ `Array<String>`

Returns values for author_7xx_search field.

Returns:

(Array<String>) —

values for author_7xx_search field



# File 'lib/stanford-mods/searchworks.rb', line 72

def sw_addl_authors
  additional_authors_w_dates
end

#sw_addl_titles ⇒ `Array<String>`

Includes all full titles except those that contain the short title (sw_short_title).

Returns:

(Array<String>) —

values for title_variant_search



# File 'lib/stanford-mods/searchworks.rb', line 199

def sw_addl_titles
  full_titles.select { |s| s !~ Regexp.new(Regexp.escape(sw_short_title)) }
end
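
The filter in the listing is an unanchored, literal match on the short title. A minimal sketch with the title lists passed in as arguments (`additional_titles` and the sample titles are hypothetical):

```ruby
# Keeps only the titles that do not contain the short title as a literal substring.
def additional_titles(full_titles, short_title)
  # Regexp.escape makes characters such as '+' or '(' match literally.
  pattern = Regexp.new(Regexp.escape(short_title))
  full_titles.reject { |title| title =~ pattern }
end

puts additional_titles(['The Jerk', 'Joke', 'The Jerk : a comedy'], 'The Jerk').inspect
```

Because the match is not anchored, a longer title that merely contains the short title is dropped as well.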

#sw_corporate_authors ⇒ `Array<String>`

Returns values for author_corp_display.

Returns:

(Array<String>) —

values for author_corp_display



# File 'lib/stanford-mods/searchworks.rb', line 88

def sw_corporate_authors
  mods_ng_xml.plain_name.select { |n| n.type_at == 'corporate' }.map { |n| n.display_value_w_date }
end

#sw_full_title ⇒ `String`

Returns value for title_245_search, title_full_display.

Returns:

(String) —

value for title_245_search, title_full_display

# File 'lib/stanford-mods/searchworks.rb', line 157

def sw_full_title
  
  return nil unless first_title_info_node        
  preSubTitle = nonSort_title ? [nonSort_title, title].compact.join(" ") : title
  preSubTitle.sub!(/:$/, '') if preSubTitle # remove trailing colon

  subTitle = first_title_info_node.subTitle.text.strip
  preParts = subTitle.empty? ? preSubTitle : preSubTitle + " : " + subTitle
  preParts.sub!(/\.$/, '') if preParts # remove trailing period

  partName   = first_title_info_node.partName.text.strip   unless first_title_info_node.partName.text.strip.empty?
  partNumber = first_title_info_node.partNumber.text.strip unless first_title_info_node.partNumber.text.strip.empty?
  partNumber.sub!(/,$/, '') if partNumber # remove trailing comma
  if partNumber && partName
    parts = partNumber + ", " + partName
  elsif partNumber
    parts = partNumber
  elsif partName
    parts = partName
  end
  parts.sub!(/\.$/, '') if parts

  result = parts ? preParts + ". " + parts : preParts
  return nil unless result
  result += "." unless result =~ /[[:punct:]]$/
  result.strip!
  result = nil if result.empty?
  result
end
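
The assembly order is easier to follow when the titleInfo parts are passed in directly. The following is a simplified sketch of the same steps, assuming the parts are already stripped Strings or nil; `compose_full_title` and its keyword names are hypothetical and it does not cover every edge case of the original.

```ruby
# Builds "nonSort title : subTitle. partNumber, partName." from its pieces.
def compose_full_title(title:, non_sort: nil, sub_title: nil, part_number: nil, part_name: nil)
  pre_sub = [non_sort, title].compact.join(' ').sub(/:$/, '')   # remove trailing colon
  pre_parts = sub_title.to_s.empty? ? pre_sub : "#{pre_sub} : #{sub_title}"
  pre_parts = pre_parts.sub(/\.$/, '')                          # remove trailing period
  number = part_number&.sub(/,$/, '')                           # remove trailing comma
  parts = [number, part_name].compact.join(', ').sub(/\.$/, '')
  result = parts.empty? ? pre_parts : "#{pre_parts}. #{parts}"
  return nil if result.strip.empty?
  # Add a final period only when the title does not already end in punctuation.
  result += '.' unless result =~ /[[:punct:]]$/
  result.strip
end

puts compose_full_title(non_sort: 'The', title: 'Olympics', sub_title: 'a history',
                        part_number: 'Part 1,', part_name: 'Athens')
```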

#sw_full_title_without_commas ⇒ `Object`

Deprecated.

in favor of sw_title_display

remove trailing commas

# File 'lib/stanford-mods/searchworks.rb', line 214

def sw_full_title_without_commas
  result = sw_full_title
  result.sub!(/,$/, '') if result
  result
end

#sw_genre ⇒ `Array[String]`

Limits genre values to Government document, Conference proceedings, Technical report and Thesis/Dissertation (see github.com/sul-dlss/stanford-mods/issues/66).

Returns:

(Array[String]) —

values for the genre facet in SearchWorks

# File 'lib/stanford-mods/searchworks.rb', line 348

def sw_genre
  genres = term_values(:genre)
  return [] unless genres
  types = term_values(:typeOfResource)
  val = []
  val << 'Thesis/Dissertation' if genres.include?('thesis') || genres.include?('Thesis')
  if genres && types && types.include?('text')
    conf_pub = ['conference publication', 'Conference publication', 'Conference Publication']
    gov_pub  = ['government publication', 'Government publication', 'Government Publication']
    tech_rpt = ['technical report', 'Technical report', 'Technical Report']
    val << 'Conference proceedings' unless (genres & conf_pub).empty?
    val << 'Government document' unless (genres & gov_pub).empty?
    val << 'Technical report' unless (genres & tech_rpt).empty?
  end
  val.uniq
end
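
The mapping depends on both the genre values and the typeOfResource values, which a standalone copy makes easy to test. In this sketch the two arrays are passed in instead of being read through term_values; `genre_facet` and the sample inputs are hypothetical.

```ruby
# Maps raw MODS genre values to the limited set of facet values.
def genre_facet(genres, types)
  return [] unless genres
  val = []
  # Thesis is recognised regardless of typeOfResource.
  val << 'Thesis/Dissertation' if genres.include?('thesis') || genres.include?('Thesis')
  # The other three values require typeOfResource 'text'.
  if types && types.include?('text')
    conf_pub = ['conference publication', 'Conference publication', 'Conference Publication']
    gov_pub  = ['government publication', 'Government publication', 'Government Publication']
    tech_rpt = ['technical report', 'Technical report', 'Technical Report']
    val << 'Conference proceedings' unless (genres & conf_pub).empty?
    val << 'Government document' unless (genres & gov_pub).empty?
    val << 'Technical report' unless (genres & tech_rpt).empty?
  end
  val.uniq
end

puts genre_facet(['Conference publication', 'thesis'], ['text']).inspect
```

Only the listed capitalisations are recognised; a value such as `CONFERENCE PUBLICATION` would not match.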

#sw_geographic_search(sep = ' ') ⇒ `Array<String>`

Values are the contents of:

subject/geographic
subject/hierarchicalGeographic
subject/geographicCode  (only include the translated value if it isn't already present from other mods geo fields)

Parameters:

sep (String) (defaults to: ' ') —
- the separator string for joining hierarchicalGeographic sub elements

Returns:

(Array<String>) —

values for geographic_search Solr field for this document or [] if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 15

def sw_geographic_search(sep = ' ')
  result = term_values([:subject, :geographic]) || []

  # hierarchicalGeographic has sub elements
  mods_ng_xml.subject.hierarchicalGeographic.each { |hg_node|
    hg_vals = hg_node.element_children.map(&:text).reject(&:empty?)
    result << hg_vals.join(sep) unless hg_vals.empty?
  }

  trans_code_vals = mods_ng_xml.subject.geographicCode.translated_value || []
  trans_code_vals.each { |val|
    result << val unless result.include?(val)
  }
  result
end

#sw_impersonal_authors ⇒ `Array<String>`

return the display_value_w_date for all <mods><name> elements that do not have type=‘personal’

Returns:

(Array<String>) —

values for author_other_facet



# File 'lib/stanford-mods/searchworks.rb', line 83

def sw_impersonal_authors
  mods_ng_xml.plain_name.select { |n| n.type_at != 'personal' }.map { |n| n.display_value_w_date }
end

#sw_language_facet ⇒ `Object`

Includes languages known to SearchWorks; tries to error-correct when possible (e.g. when ISO-639 disagrees with the MARC standard).

# File 'lib/stanford-mods/searchworks.rb', line 24

def sw_language_facet
  result = []
  mods_ng_xml.language.each { |n|
    # get languageTerm codes and add their translations to the result
    n.code_term.each { |ct|
      if ct.authority =~ /^iso639/
        vals = ct.text.split(/[,|\ ]/).reject { |x| x.strip.empty? }
        vals.each do |v|
          if ISO_639.find(v.strip)
            iso639_val = ISO_639.find(v.strip).english_name
            if SEARCHWORKS_LANGUAGES.has_value?(iso639_val)
              result << iso639_val
            else
              result << SEARCHWORKS_LANGUAGES[v.strip]
            end
          else
            logger.warn "Couldn't find english name for #{ct.text}"
          end
        end
      else
        vals = ct.text.split(/[,|\ ]/).reject { |x| x.strip.empty? }
        vals.each do |v|
          result << SEARCHWORKS_LANGUAGES[v.strip]
        end
      end
    }
    # add languageTerm text values
    n.text_term.each { |tt|
      val = tt.text.strip
      result << val if !val.empty? && SEARCHWORKS_LANGUAGES.has_value?(val)
    }

    # add language values that aren't in languageTerm subelement
    if n.languageTerm.empty?
      result << n.text if SEARCHWORKS_LANGUAGES.has_value?(n.text)
    end
  }
  result.uniq
end
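
A single languageTerm can carry several codes separated by commas, pipes or spaces, which is why the listing splits on `/[,|\ ]/`. The sketch below shows only that splitting and lookup step. The names are hypothetical, the three-entry table is a stand-in for SEARCHWORKS_LANGUAGES, and dropping unknown codes with `compact` is an assumption that differs from the listing, which appends the lookup result as is.

```ruby
# Splits a code value on commas, pipes and spaces, dropping blank pieces.
def language_codes(text)
  text.split(/[,|\ ]/).map(&:strip).reject(&:empty?)
end

# Translates codes to names using a small hypothetical lookup table.
def language_names(text, table = { 'eng' => 'English', 'fre' => 'French', 'ger' => 'German' })
  language_codes(text).map { |code| table[code] }.compact.uniq
end

puts language_codes('eng, fre|ger').inspect
puts language_names('eng, fre|ger').inspect
```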

#sw_main_author ⇒ `String`

Returns value for author_1xx_search field.

Returns:

(String) —

value for author_1xx_search field



# File 'lib/stanford-mods/searchworks.rb', line 67

def sw_main_author
  main_author_w_date
end

#sw_meeting_authors ⇒ `Array<String>`

Returns values for author_meeting_display.

Returns:

(Array<String>) —

values for author_meeting_display



# File 'lib/stanford-mods/searchworks.rb', line 93

def sw_meeting_authors
  mods_ng_xml.plain_name.select { |n| n.type_at == 'conference' }.map { |n| n.display_value_w_date }
end

#sw_person_authors ⇒ `Array<String>`

Returns values for author_person_facet, author_person_display.

Returns:

(Array<String>) —

values for author_person_facet, author_person_display



# File 'lib/stanford-mods/searchworks.rb', line 77

def sw_person_authors
  personal_names_w_dates
end

#sw_short_title ⇒ `String`

Returns value for title_245a_search field.

Returns:

(String) —

value for title_245a_search field



# File 'lib/stanford-mods/searchworks.rb', line 128

def sw_short_title
  short_titles ? short_titles.compact.reject(&:empty?).first : nil
end

#sw_sort_author ⇒ `String`

Returns a sortable version of the main_author:

main_author + sorting title

which is the mods approximation of the value created for a marc record

Returns:

(String) —

value for author_sort field

# File 'lib/stanford-mods/searchworks.rb', line 101

def sw_sort_author
  #  substitute java Character.MAX_CODE_POINT for nil main_author so missing main authors sort last
  val = '' + (main_author_w_date ? main_author_w_date : "\u{10FFFF} ") + (sort_title ? sort_title : '')
  val.gsub(/[[:punct:]]*/, '').strip
end
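
The placeholder for a missing author is the highest Unicode code point, so records without a main author sort after all others. A standalone sketch of the same expression, with the author and sort title passed in (`author_sort_key` and the sample values are hypothetical):

```ruby
# Concatenates author and sort title, then strips punctuation.
def author_sort_key(main_author, sort_title)
  # U+10FFFF stands in for a missing author so that such keys sort last.
  val = (main_author || "\u{10FFFF} ") + (sort_title || '')
  val.gsub(/[[:punct:]]*/, '').strip
end

keys = [author_sort_key(nil, 'Apple'), author_sort_key('Zola, Emile', 'Germinal')]
puts keys.sort.inspect
```

As in the listing, no separator is inserted between a real author and the title, so `'Zola, Emile'` plus `'Germinal'` becomes `Zola EmileGerminal`.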

#sw_sort_title ⇒ `String`

Returns a sortable version of the main title

Returns:

(String) —

value for title_sort field

# File 'lib/stanford-mods/searchworks.rb', line 205

def sw_sort_title
  val = '' + (sw_full_title ? sw_full_title : '')
  val.sub!(Regexp.new("^" + Regexp.escape(nonSort_title)), '') if nonSort_title
  val.gsub!(/[[:punct:]]*/, '').strip
  val.squeeze(" ").strip
end
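
The three clean-up steps (drop the leading nonSort text, remove punctuation, collapse spaces) can be seen on a plain string. This is a sketch with the full title and nonSort text passed in; `title_sort_key` is a hypothetical name.

```ruby
# Produces a sortable title from a full title and an optional nonSort prefix.
def title_sort_key(full_title, non_sort = nil)
  val = full_title.to_s
  # Remove the nonSort text only when it appears at the start.
  val = val.sub(Regexp.new('^' + Regexp.escape(non_sort)), '') if non_sort
  # Strip punctuation, then collapse the double spaces that leaves behind.
  val.gsub(/[[:punct:]]*/, '').squeeze(' ').strip
end

puts title_sort_key('The Olympics : a history.', 'The')
```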

#sw_subject_names(sep = ', ') ⇒ `Array<String>`

Values are the contents of:

subject/name/namePart

Values from namePart subelements should be concatenated in the order they appear (e.g. "Shakespeare, William, 1564-1616").

Parameters:

sep (String) (defaults to: ', ') —
- the separator string for joining namePart sub elements

Returns:

(Array<String>) —

values for names inside subject elements or [] if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 36

def sw_subject_names(sep = ', ')
  mods_ng_xml.subject.name_el
             .select { |n_el| n_el.namePart }
             .map { |name_el_w_np| name_el_w_np.namePart.map(&:text).reject(&:empty?) }
             .reject(&:empty?)
             .map { |parts| parts.join(sep).strip }
end
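
The join step can be illustrated with nested arrays in place of XML nodes. In this sketch each inner array holds the namePart texts of one subject/name element; `joined_subject_names` and the sample data are hypothetical.

```ruby
# Joins the non-empty name parts of each name, skipping names with no parts left.
def joined_subject_names(name_parts_per_name, sep = ', ')
  name_parts_per_name
    .map { |parts| parts.reject(&:empty?) }
    .reject(&:empty?)
    .map { |parts| parts.join(sep).strip }
end

names = [['Shakespeare, William', '1564-1616'], ['', ''], ['Plato']]
puts joined_subject_names(names).inspect
```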

#sw_subject_titles(sep = ' ') ⇒ `Array<String>`

Values are the contents of:

subject/titleInfo/(subelements)

Parameters:

sep (String) (defaults to: ' ') —
- the separator string for joining titleInfo sub elements

Returns:

(Array<String>) —

values for titles inside subject elements or [] if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 48

def sw_subject_titles(sep = ' ')
  result = []
  mods_ng_xml.subject.titleInfo.each { |ti_el|
    parts = ti_el.element_children.map(&:text).reject(&:empty?)
    result << parts.join(sep).strip unless parts.empty?
  }
  result
end

#sw_title_display ⇒ `String`

Like sw_full_title, but without trailing , / ; : . characters. Spec from solrmarc-sw sw_index.properties:

title_display = custom, removeTrailingPunct(245abdefghijklmnopqrstuvwxyz, [\\\\,/;:], ([A-Za-z]{4}|[0-9]{3}|\\)|\\,))

Returns:

(String) —

value for title_display (like title_full_display without trailing punctuation)

# File 'lib/stanford-mods/searchworks.rb', line 191

def sw_title_display
  result = sw_full_title
  return nil unless result
  result.sub(/[\.,;:\/\\]+$/, '').strip
end
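
The trailing-punctuation rule is a single substitution and can be tried on plain strings. A sketch using the same regular expression as the listing (`display_title` and the samples are hypothetical):

```ruby
# Removes a trailing run of . , ; : / or \ and then trims whitespace.
def display_title(full_title)
  return nil unless full_title
  full_title.sub(/[\.,;:\/\\]+$/, '').strip
end

['The Olympics : a history.', 'Title /', 'Who?'].each { |t| puts display_title(t).inspect }
```

Other closing punctuation, such as a question mark, is left in place.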

#title ⇒ `String`

Returns the text of the titleInfo node as a string (if non-empty, else nil).

Returns:

(String) —

the text of the titleInfo node as a string (if non-empty, else nil)

# File 'lib/stanford-mods/searchworks.rb', line 150

def title
  return unless first_title_info_node && first_title_info_node.title

  first_title_info_node.title.text.strip.empty?   ? nil : first_title_info_node.title.text.strip
end

#topic_facet ⇒ `Array<String>`

Values are the contents of:

 subject/topic
 subject/name
 subject/title
 subject/occupation
with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the topic_facet Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 76

def topic_facet
  vals = subject_topics ? Array.new(subject_topics) : []
  vals.concat(subject_names) if subject_names
  vals.concat(subject_titles) if subject_titles
  vals.concat(subject_occupations) if subject_occupations
  vals.map! { |val| val.sub(/[\\,;]$/, '').strip }
  vals.empty? ? nil : vals
end
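
The per-value clean-up in the `map!` call can be checked on its own. A sketch using the same substitution (`clean_topic` and the sample values are hypothetical):

```ruby
# Removes one trailing backslash, comma or semicolon, then trims whitespace.
def clean_topic(val)
  val.sub(/[\\,;]$/, '').strip
end

['History;', 'Art ,', 'Science\\'].each { |v| puts clean_topic(v).inspect }
```

The character class has no quantifier, so only a single trailing character is removed; `'Music;;'` would become `'Music;'`.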

#topic_search ⇒ `Array<String>`

Values are the contents of:

mods/genre
mods/subject/topic

Returns:

(Array<String>) —

values for the topic_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks_subjects.rb', line 61

def topic_search
  @topic_search ||= begin
    vals = term_values(:genre) || []
    vals.concat(subject_topics) if subject_topics
    vals.empty? ? nil : vals
  end
end

#year_display_str(date_el_array) ⇒ `String`

given the passed date elements, look for a single keyDate and use it if there is one;

otherwise pick earliest parseable date

Parameters:

date_el_array (Array<Nokogiri::XML::Element>) —

the elements from which to select a pub date

Returns:

(String) —

single String containing publication year for display

# File 'lib/stanford-mods/origin_info.rb', line 82

def year_display_str(date_el_array)
  result = date_parsing_result(date_el_array, :date_str_for_display)
  return result if result
  _ignore, orig_str_to_parse = self.class.earliest_year_str(date_el_array)
  DateParsing.date_str_for_display(orig_str_to_parse) if orig_str_to_parse
end

#year_int(date_el_array) ⇒ `Integer`

given the passed date elements, look for a single keyDate and use it if there is one;

otherwise pick earliest parseable date

Parameters:

date_el_array (Array<Nokogiri::XML::Element>) —

the elements from which to select a pub date

Returns:

(Integer) —

publication year as an Integer

# File 'lib/stanford-mods/origin_info.rb', line 93

def year_int(date_el_array)
  result = date_parsing_result(date_el_array, :year_int_from_date_str)
  return result if result
  year_int, _ignore = self.class.earliest_year_int(date_el_array)
  year_int if year_int
end

#year_sort_str(date_el_array) ⇒ `String`

given the passed date elements, look for a single keyDate and use it if there is one;

otherwise pick earliest parseable date

Parameters:

date_el_array (Array<Nokogiri::XML::Element>) —

the elements from which to select a pub date

Returns:

(String) —

single String containing publication year for lexical sorting

# File 'lib/stanford-mods/origin_info.rb', line 104

def year_sort_str(date_el_array)
  result = date_parsing_result(date_el_array, :sortable_year_string_from_date_str)
  return result if result
  sortable_str, _ignore = self.class.earliest_year_str(date_el_array)
  sortable_str if sortable_str
end
