Class: Stanford::Mods::Record

Inherits: Mods::Record

Object < Mods::Record < Stanford::Mods::Record

Defined in: lib/stanford-mods.rb,
lib/stanford-mods/searchworks.rb

Instance Method Summary

#additional_authors_w_dates ⇒ Object

All names in display form, except the main author. Names are in the display_value_w_date form; see Mods::Record.name in nom_terminology for details on the display_value algorithm.
#catkey ⇒ String

Value with the numeric catkey in it, or nil if none exists.
#dates_marc_encoding ⇒ Array<String>

Dates from dateIssued and dateCreated tags from origin_info with encoding=“marc”.
#dates_no_marc_encoding ⇒ Array<String>

Dates from dateIssued and dateCreated tags from origin_info with encoding not “marc”.
#druid ⇒ Object
#druid=(new_druid) ⇒ Object
#era_facet ⇒ Array<String>

subject/temporal values with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#format ⇒ Array[String]

Deprecated: kept for backwards compatibility but not part of the SW UI redesign work of Summer 2014.
#format_main ⇒ Array[String]

Selects one or more format values from the controlled vocabulary (per JVine, Summer 2014); see searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format_main_ssim&rows=0&facet.sort=index.
#geographic_facet ⇒ Array<String>

geographic_search values with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#geographic_search ⇒ Array<String>

Values are the contents of: subject/geographic subject/hierarchicalGeographic subject/geographicCode (only include the translated value if it isn’t already present from other mods geo fields).
#get_bc_year(dates) ⇒ Object

Get the 3 digit B.C. year and return it as a negative number offset by 1000, so 300 B.C. becomes -700.
#get_double_digit_century(dates) ⇒ Object

get a double digit century like ‘12th century’ from the date array.
#get_plain_four_digit_year(dates) ⇒ Object

get a 4 digit year like 1865 from the date array.
#get_single_digit_century(dates) ⇒ Object

get a single digit century like ‘9th century’ from the date array.
#get_three_digit_year(dates) ⇒ Object

get a 3 digit year like 965 from the date array.
#get_u_year(dates) ⇒ Object

If a year has a “u” in it, a single trailing u is replaced with 0 (198u becomes 1980) and a double trailing uu is replaced with -- (19uu becomes 19--).
#is_date?(object) ⇒ Boolean
#is_number?(object) ⇒ Boolean
#main_author_w_date ⇒ String

the first encountered <mods><name> element with marcrelator flavor role of ‘Creator’ or ‘Author’.
#main_author_w_date_test ⇒ Object
#parse_dates_from_originInfo ⇒ Object

Populate @dates_marc_encoding and @dates_no_marc_encoding from dateIssued and dateCreated tags from origin_info with and without encoding=marc.
#place ⇒ Object

Values of the origin_info/place/placeTerm elements (first method of the PUBLICATION (place, year) group).
#pub_date ⇒ String

The year the object was published, filtered based on max_pub_date and min_pub_date from the config file.
#pub_date_display ⇒ String

For the date display only, the first place to look is in the dates without encoding=marc array.
#pub_date_facet ⇒ Array[String]

Values for the pub date facet.
#pub_date_sort ⇒ Object

creates a date suitable for sorting.
#pub_dates ⇒ Array<String>

For the date indexing, sorting and faceting, the first place to look is in the dates with encoding=marc array.
#pub_year ⇒ String

Get the publish year from mods.
#subject_all_search ⇒ Array<String>

Values are the contents of: all subject subelements except subject/cartographic plus genre top level element.
#subject_names ⇒ Object

convenience method for subject/name/namePart values (to avoid parsing the mods for the same thing multiple times).
#subject_occupations ⇒ Object

convenience method for subject/occupation values (to avoid parsing the mods for the same thing multiple times).
#subject_other_search ⇒ Array<String>

Values are the contents of: subject/name subject/occupation - no subelements subject/titleInfo.
#subject_other_subvy_search ⇒ Array<String>

Values are the contents of: subject/temporal subject/genre.
#subject_temporal ⇒ Object

convenience method for subject/temporal values (to avoid parsing the mods for the same thing multiple times).
#subject_titles ⇒ Object

convenience method for subject/titleInfo values (to avoid parsing the mods for the same thing multiple times).
#subject_topics ⇒ Object

convenience method for subject/topic values (to avoid parsing the mods for the same thing multiple times).
#sw_addl_authors ⇒ Array<String>

Values for author_7xx_search field.
#sw_addl_titles ⇒ Array<String>

All titles except those that contain the short title (sw_short_title).
#sw_corporate_authors ⇒ Array<String>

Values for author_corp_display.
#sw_full_title ⇒ String

Value for title_245_search, title_full_display.
#sw_full_title_without_commas ⇒ Object

Deprecated: use sw_title_display instead.
#sw_genre ⇒ Array[String]

return values for the genre facet in SearchWorks.
#sw_geographic_search(sep = ' ') ⇒ Array<String>

Values are the contents of: subject/geographic subject/hierarchicalGeographic subject/geographicCode (only include the translated value if it isn’t already present from other mods geo fields).
#sw_impersonal_authors ⇒ Array<String>

return the display_value_w_date for all <mods><name> elements that do not have type=‘personal’.
#sw_language_facet ⇒ Object

include languages known to SearchWorks; try to error correct when possible (e.g. when ISO-639 disagrees with MARC standard).
#sw_logger ⇒ Object

—- end PUBLICATION (place, year) —-.
#sw_main_author ⇒ String

Value for author_1xx_search field.
#sw_meeting_authors ⇒ Array<String>

Values for author_meeting_display.
#sw_person_authors ⇒ Array<String>

Values for author_person_facet, author_person_display.
#sw_short_title ⇒ String

Value for title_245a_search field.
#sw_sort_author ⇒ String

Returns a sortable version of the main_author: main_author + sorting title which is the mods approximation of the value created for a marc record.
#sw_sort_title ⇒ String

Returns a sortable version of the main title.
#sw_subject_names(sep = ', ') ⇒ Array<String>

Values are the contents of: subject/name/namePart. Values from namePart subelements should be concatenated in the order they appear (e.g. “Shakespeare, William, 1564-1616”).
#sw_subject_titles(sep = ' ') ⇒ Array<String>

Values are the contents of: subject/titleInfo/(subelements).
#sw_title_display ⇒ String

like sw_full_title without trailing ,/;:.
#topic_facet ⇒ Array<String>

Values are the contents of: subject/topic subject/name subject/title subject/occupation with trailing comma, semicolon, and backslash (and any preceding spaces) removed.
#topic_search ⇒ Array<String>

Values are the contents of: mods/genre mods/subject/topic.

Instance Method Details

#additional_authors_w_dates ⇒ `Object`

All names in display form, except the main author.

Names are in the display_value_w_date form; see Mods::Record.name in nom_terminology for details on the display_value algorithm.

# File 'lib/stanford-mods.rb', line 39

def additional_authors_w_dates
  results = []
  @mods_ng_xml.plain_name.each { |n|
    results << n.display_value_w_date
  }
  results.delete(main_author_w_date)
  results
end

#catkey ⇒ `String`

Returns value with the numeric catkey in it, or nil if none exists.

Returns:

(String) —

value with the numeric catkey in it, or nil if none exists

# File 'lib/stanford-mods/searchworks.rb', line 632

def catkey
  catkey=self.term_values([:record_info,:recordIdentifier])
  if catkey and catkey.length>0
    return catkey.first.gsub('a','') #need to ensure catkey is numeric only
  end
  nil
end
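
As a rough illustration of the normalization step, the sketch below applies the same `gsub` to a plain array of identifier strings. `numeric_catkey` is a hypothetical helper, not part of the gem; the real method reads its values through `term_values`.

```ruby
# Hypothetical helper mirroring the catkey logic on a plain array of
# recordIdentifier strings (an assumption made for illustration).
def numeric_catkey(record_identifiers)
  return nil if record_identifiers.nil? || record_identifiers.empty?
  # Only the letter 'a' is removed; any other non-digit would survive.
  record_identifiers.first.gsub('a', '')
end

numeric_catkey(['a12345'])  # => "12345"
numeric_catkey([])          # => nil
```

Because only the letter a is stripped, an identifier containing other letters would not come out purely numeric.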

#dates_marc_encoding ⇒ `Array<String>`

Returns dates from dateIssued and dateCreated tags from origin_info with encoding=“marc”.

Returns:

(Array<String>) —

dates from dateIssued and dateCreated tags from origin_info with encoding=“marc”

# File 'lib/stanford-mods/searchworks.rb', line 787

def dates_marc_encoding
  @dates_marc_encoding ||= begin
    parse_dates_from_originInfo
    @dates_marc_encoding
  end
end

#dates_no_marc_encoding ⇒ `Array<String>`

Returns dates from dateIssued and dateCreated tags from origin_info with encoding not “marc”.

Returns:

(Array<String>) —

dates from dateIssued and dateCreated tags from origin_info with encoding not “marc”

# File 'lib/stanford-mods/searchworks.rb', line 795

def dates_no_marc_encoding
  @dates_no_marc_encoding ||= begin
    parse_dates_from_originInfo
    @dates_no_marc_encoding
  end
end

#druid ⇒ `Object`




# File 'lib/stanford-mods/searchworks.rb', line 642

def druid
  @druid ? @druid : 'Unknown item'
end

#druid=(new_druid) ⇒ `Object`




# File 'lib/stanford-mods/searchworks.rb', line 639

def druid= new_druid
  @druid=new_druid
end

#era_facet ⇒ `Array<String>`

subject/temporal values with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the era_facet Solr field for this document or nil if none




# File 'lib/stanford-mods/searchworks.rb', line 305

def era_facet
  subject_temporal.map { |val| val.sub(/[\\,;]$/, '').strip } unless !subject_temporal
end
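
The trailing-punctuation cleanup used by era_facet (and by geographic_facet) can be tried in isolation. This is a minimal sketch with a hypothetical helper name, `strip_trailing_punct`, operating on a plain array rather than on MODS values.

```ruby
# Hypothetical helper applying the same regex and strip as era_facet.
def strip_trailing_punct(values)
  return nil unless values
  # Remove one trailing backslash, comma, or semicolon, then trim whitespace.
  values.map { |val| val.sub(/[\\,;]$/, '').strip }
end

strip_trailing_punct(['Middle Ages,', 'Bronze Age ;', '20th century'])
# => ["Middle Ages", "Bronze Age", "20th century"]
```

Only a single trailing character is removed, so a value ending in two punctuation marks keeps the first of them.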

#format ⇒ `Array[String]`

Deprecated: kept for backwards compatibility but not part of the SW UI redesign work of Summer 2014.

Selects one or more format values from the controlled vocabulary here:

http://searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format&rows=0&facet.sort=index

Returns:

(Array[String]) —

value in the SearchWorks controlled vocabulary

# File 'lib/stanford-mods/searchworks.rb', line 500

def format
  val = []
  types = self.term_values(:typeOfResource)
  if types
    genres = self.term_values(:genre)
    issuance = self.term_values([:origin_info,:issuance])
    types.each do |type|
      case type
        when 'cartographic'
          val << 'Map/Globe'
        when 'mixed material'
          val << 'Manuscript/Archive'
        when 'moving image'
          val << 'Video'
        when 'notated music'
          val << 'Music - Score'
        when 'software, multimedia'
          val << 'Computer File'
        when 'sound recording-musical'
          val << 'Music - Recording'
        when 'sound recording-nonmusical', 'sound recording'
          val << 'Sound Recording'
        when 'still image'
          val << 'Image'
        when 'text'
          val << 'Book' if issuance and issuance.include? 'monographic'
          book_genres = ['book chapter', 'Book chapter', 'Book Chapter',
            'issue brief', 'Issue brief', 'Issue Brief',
            'librettos', 'Librettos',
            'project report', 'Project report', 'Project Report',
            'technical report', 'Technical report', 'Technical Report',
            'working paper', 'Working paper', 'Working Paper']
          val << 'Book' if genres and !(genres & book_genres).empty?
          conf_pub = ['conference publication', 'Conference publication', 'Conference Publication']
          val << 'Conference Proceedings' if genres and !(genres & conf_pub).empty?
          val << 'Journal/Periodical' if issuance and issuance.include? 'continuing'
          article = ['article', 'Article']
          val << 'Journal/Periodical' if genres and !(genres & article).empty?
          stu_proj_rpt = ['student project report', 'Student project report', 'Student Project report', 'Student Project Report']
          val << 'Other' if genres and !(genres & stu_proj_rpt).empty?
          thesis = ['thesis', 'Thesis']
          val << 'Thesis' if genres and !(genres & thesis).empty?
        when 'three dimensional object'
          val << 'Other'
      end
    end
  end
  val.uniq
end

#format_main ⇒ `Array[String]`

select one or more format values from the controlled vocabulary per JVine Summer 2014

http://searchworks-solr-lb.stanford.edu:8983/solr/select?facet.field=format_main_ssim&rows=0&facet.sort=index

Returns:

(Array[String]) —

value in the SearchWorks controlled vocabulary

# File 'lib/stanford-mods/searchworks.rb', line 553

def format_main
  val = []
  types = self.term_values(:typeOfResource)
  article_genres = ['article', 'Article',
    'book chapter', 'Book chapter', 'Book Chapter',
    'issue brief', 'Issue brief', 'Issue Brief',
    'project report', 'Project report', 'Project Report',
    'student project report', 'Student project report', 'Student Project report', 'Student Project Report',
    'technical report', 'Technical report', 'Technical Report',
    'working paper', 'Working paper', 'Working Paper'
  ]
  book_genres = ['conference publication', 'Conference publication', 'Conference Publication',
    'instruction', 'Instruction',
    'librettos', 'Librettos',
    'thesis', 'Thesis'
  ]
  if types
    genres = self.term_values(:genre)
    issuance = self.term_values([:origin_info,:issuance])
    types.each do |type|
      case type
        when 'cartographic'
          val << 'Map'
        when 'mixed material'
          val << 'Archive/Manuscript'
        when 'moving image'
          val << 'Video'
        when 'notated music'
          val << 'Music score'
        when 'software, multimedia'
          if genres and (genres.include?('dataset') || genres.include?('Dataset'))
            val << 'Dataset'
          else
            val << 'Software/Multimedia'
          end
        when 'sound recording-musical'
          val << 'Music recording'
        when 'sound recording-nonmusical', 'sound recording'
          val << 'Sound recording'
        when 'still image'
          val << 'Image'
        when 'text'
          val << 'Book' if genres   and !(genres & article_genres).empty?
          val << 'Book' if issuance and issuance.include? 'monographic'
          val << 'Book' if genres   and !(genres & book_genres).empty?
          val << 'Journal/Periodical' if issuance and issuance.include? 'continuing'
        when 'three dimensional object'
          val << 'Object'
      end
    end
  end
  val.uniq
end

#geographic_facet ⇒ `Array<String>`

geographic_search values with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the geographic_facet Solr field for this document or nil if none




# File 'lib/stanford-mods/searchworks.rb', line 299

def geographic_facet
  geographic_search.map { |val| val.sub(/[\\,;]$/, '').strip } unless !geographic_search
end

#geographic_search ⇒ `Array<String>`

Values are the contents of:

subject/geographic
subject/hierarchicalGeographic
subject/geographicCode  (only include the translated value if it isn't already present from other mods geo fields)

Returns:

(Array<String>) —

values for the geographic_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 314

def geographic_search
  @geographic_search ||= begin
    result = self.sw_geographic_search

    # TODO:  this should go into stanford-mods ... but then we have to set that gem up with a Logger
    # print a message for any unrecognized encodings
    xvals = self.subject.geographicCode.translated_value
    codes = self.term_values([:subject, :geographicCode])
    if codes && codes.size > xvals.size
      self.subject.geographicCode.each { |n|
        if n.authority != 'marcgac' && n.authority != 'marccountry'
          sw_logger.info("#{druid} has subject geographicCode element with untranslated encoding (#{n.authority}): #{n.to_xml}")
        end
      }
    end

    # FIXME:  stanford-mods should be returning [], not nil ...
    return nil if !result || result.empty?
    result
  end
end

#get_bc_year(dates) ⇒ `Object`

Get the 3 digit B.C. year and return it as a negative number offset by 1000, so 300 B.C. becomes -700 (300 - 1000). The result is intended for sorting; other methods translate it back for display.

# File 'lib/stanford-mods/searchworks.rb', line 751

def get_bc_year dates
  dates.each do |f_date|
    matches=f_date.scan(/\d{3} B.C./)
    if matches.length > 0
      bc_year=matches.first[0..2]
      return (bc_year.to_i-1000).to_s
    end
  end
  return nil
end

#get_double_digit_century(dates) ⇒ `Object`

get a double digit century like ‘12th century’ from the date array

# File 'lib/stanford-mods/searchworks.rb', line 717

def get_double_digit_century dates
  dates.each do |f_date|
    matches=f_date.scan(/\d{2}th/)
    if matches.length == 1
      @pub_year=((matches.first[0,2].to_i)-1).to_s+'--'
      return @pub_year
    end
    #if there are multiples, check for ones with CE after them
    if matches.length > 0
      matches.each do |match|
        pos = f_date.index(Regexp.new(match+'...CE'))
        pos = pos ? pos.to_i : f_date.index(Regexp.new(match+' century CE'))
        pos = pos ? pos.to_i : 0
        if f_date.include?(match+' CE') or pos > 0
          @pub_year=((match[0,2].to_i) - 1).to_s+'--'
          return @pub_year
        end
      end
    end
  end
  return nil
end
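
A reduced sketch of the single-match branch, using a hypothetical helper `century_to_year_prefix` on one date string, shows how a century is turned into a year prefix with two placeholder dashes.

```ruby
# Hypothetical reduction of the single-match branch: the century number
# minus one becomes the leading digits, followed by "--".
def century_to_year_prefix(date)
  matches = date.scan(/\d{2}th/)
  return nil unless matches.length == 1
  (matches.first[0, 2].to_i - 1).to_s + '--'
end

century_to_year_prefix('12th century')  # => "11--"
century_to_year_prefix('20th century')  # => "19--"
```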

#get_plain_four_digit_year(dates) ⇒ `Object`

get a 4 digit year like 1865 from the date array

# File 'lib/stanford-mods/searchworks.rb', line 674

def get_plain_four_digit_year dates
  dates.each do |f_date|
    matches=f_date.scan(/\d{4}/)
    if matches.length == 1
      @pub_year=matches.first
      return matches.first
    else
      #if there are multiples, check for ones with CE after them
      matches.each do |match|
        #look for things like '1865-6 CE'
        pos = f_date.index(Regexp.new(match+'...CE'))
        pos = pos ? pos.to_i : 0
        if f_date.include?(match+' CE') or pos > 0
          @pub_year=match
          return match
        end
      end
      return matches.first
    end
  end
  return nil
end
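
The CE disambiguation is easier to follow on a single string. The sketch below is an illustration under that simplification: `pick_four_digit_year` is a hypothetical helper that takes one date string instead of an array and does not touch any instance state.

```ruby
# Hypothetical single-string version of the year selection.
def pick_four_digit_year(date)
  matches = date.scan(/\d{4}/)
  return matches.first if matches.length == 1
  matches.each do |match|
    # A year counts as CE-marked if " CE" follows it directly, or if "CE"
    # follows after exactly three characters (e.g. "1865-6 CE").
    pos = date.index(Regexp.new(match + '...CE')) || 0
    return match if date.include?(match + ' CE') || pos > 0
  end
  # No CE marker found: fall back to the first 4 digit run (nil if none).
  matches.first
end

pick_four_digit_year('1500 or 1865 CE')            # => "1865"
pick_four_digit_year('1500, reprinted 1865-6 CE')  # => "1865"
pick_four_digit_year('1500 and 1865')              # => "1500"
```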

#get_single_digit_century(dates) ⇒ `Object`

get a single digit century like ‘9th century’ from the date array

# File 'lib/stanford-mods/searchworks.rb', line 763

def get_single_digit_century dates
  dates.each do |f_date|
    matches=f_date.scan(/\d{1}th/)
    if matches.length == 1
      @pub_year=((matches.first[0,2].to_i)-1).to_s+'--'
      return @pub_year
    end
    #if there are multiples, check for ones with CE after them
    if matches.length > 0
      matches.each do |match|
        pos = f_date.index(Regexp.new(match+'...CE'))
        pos = pos ? pos.to_i : f_date.index(Regexp.new(match+' century CE'))
        pos = pos ? pos.to_i : 0
        if f_date.include?(match+' CE') or pos > 0
          @pub_year=((match[0,1].to_i) - 1).to_s+'--'
          return @pub_year
        end
      end
    end
  end
  return nil
end

#get_three_digit_year(dates) ⇒ `Object`

get a 3 digit year like 965 from the date array

# File 'lib/stanford-mods/searchworks.rb', line 741

def get_three_digit_year dates
  dates.each do |f_date|
    matches=f_date.scan(/\d{3}/)
    if matches.length > 0
      return matches.first
    end
  end
  return nil
end

#get_u_year(dates) ⇒ `Object`

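If a year has a “u” in it, resolve the unknown digits: a single trailing u is replaced with 0 (198u becomes 1980), and a double trailing uu is replaced with -- (19uu becomes 19--).

Parameters:

dates (Array<String>)

Returns:

String, or nil if no such year is found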

# File 'lib/stanford-mods/searchworks.rb', line 700

def get_u_year dates
  dates.each do |f_date|
    # Single digit u notation
    matches = f_date.scan(/\d{3}u/)
    if matches.length == 1
      return matches.first.gsub('u','0')
    end
    # Double digit u notation
    matches = f_date.scan(/\d{2}u{2}/)
    if matches.length == 1
      return matches.first.gsub('u','-')
    end
  end
  return nil
end
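
The two branches behave differently, which a short standalone sketch makes visible. `resolve_u_year` is a hypothetical name for a copy of the logic that can be run without a MODS record.

```ruby
# Hypothetical standalone copy of the u-notation handling.
def resolve_u_year(dates)
  dates.each do |date|
    # One unknown digit: substitute 0.
    single = date.scan(/\d{3}u/)
    return single.first.gsub('u', '0') if single.length == 1
    # Two unknown digits: substitute dashes, giving a century-style value.
    double = date.scan(/\d{2}u{2}/)
    return double.first.gsub('u', '-') if double.length == 1
  end
  nil
end

resolve_u_year(['198u'])  # => "1980"
resolve_u_year(['19uu'])  # => "19--"
resolve_u_year(['1984'])  # => nil
```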

#is_date?(object) ⇒ `Boolean`

Returns:

(Boolean)




# File 'lib/stanford-mods/searchworks.rb', line 409

def is_date?(object)
  true if Date.parse(object) rescue false
end

#is_number?(object) ⇒ `Boolean`

Returns:

(Boolean)




# File 'lib/stanford-mods/searchworks.rb', line 406

def is_number?(object)
  true if Integer(object) rescue false
end

#main_author_w_date ⇒ `String`

The first encountered <mods><name> element with a marcrelator flavor role of ‘Creator’ or ‘Author’. If there is no marcrelator ‘Creator’ or ‘Author’, the first name without a role. If there is no name without a role, nil. See Mods::Record.name in nom_terminology for details on the display_value algorithm.

Returns:

(String) —

a name in the display_value_w_date form

# File 'lib/stanford-mods.rb', line 16

def main_author_w_date
  result = nil
  first_wo_role = nil
  @mods_ng_xml.plain_name.each { |n|
    if n.role.size == 0
      first_wo_role ||= n
    end
    n.role.each { |r|
      if r.authority.include?('marcrelator') &&
            (r.value.include?('Creator') || r.value.include?('Author'))
        result ||= n.display_value_w_date
      end
    }
  }
  if !result && first_wo_role
    result = first_wo_role.display_value_w_date
  end
  result
end
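
The selection order can be sketched with plain structs standing in for the name nodes. `AuthorRole`, `AuthorName`, and `pick_main_author` are hypothetical, and roles are simplified to single strings, so this approximates the precedence rather than reproducing the gem's behavior.

```ruby
# Hypothetical stand-ins for MODS name and role nodes.
AuthorRole = Struct.new(:authority, :value)
AuthorName = Struct.new(:display_value_w_date, :roles)

def pick_main_author(names)
  # 1. First name carrying a marcrelator Creator or Author role.
  with_role = names.find do |n|
    n.roles.any? do |r|
      r.authority == 'marcrelator' && %w[Creator Author].include?(r.value)
    end
  end
  # 2. Otherwise the first name with no role at all; 3. otherwise nil.
  chosen = with_role || names.find { |n| n.roles.empty? }
  chosen && chosen.display_value_w_date
end

names = [
  AuthorName.new('Smith, Jane', []),
  AuthorName.new('Doe, John, 1900-1980', [AuthorRole.new('marcrelator', 'Author')])
]
pick_main_author(names)  # => "Doe, John, 1900-1980"
```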

#main_author_w_date_test ⇒ `Object`

# File 'lib/stanford-mods/searchworks.rb', line 99

def main_author_w_date_test
  result = nil
  first_wo_role = nil
  self.plain_name.each { |n|
    if n.role.size == 0
      first_wo_role ||= n
    end
    n.role.each { |r|
      if r.authority.include?('marcrelator') &&
        (r.value.include?('Creator') || r.value.include?('Author'))
        result ||= n.display_value_w_date
      end
    }
  }
  if !result && first_wo_role
    result = first_wo_role.display_value_w_date
  end
  result
end

#parse_dates_from_originInfo ⇒ `Object`

Populate @dates_marc_encoding and @dates_no_marc_encoding from dateIssued and dateCreated tags from origin_info with and without encoding=marc

# File 'lib/stanford-mods/searchworks.rb', line 804

def parse_dates_from_originInfo
  @dates_marc_encoding = []
  @dates_no_marc_encoding = []
  self.origin_info.dateIssued.each { |di|
    if di.encoding == "marc"
      @dates_marc_encoding << di.text
    else
      @dates_no_marc_encoding << di.text
    end
  }
  self.origin_info.dateCreated.each { |dc|
    if dc.encoding == "marc"
      @dates_marc_encoding << dc.text
    else
      @dates_no_marc_encoding << dc.text
    end
  }
end
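
The split by encoding amounts to a partition. The sketch below assumes a hypothetical `OriginDate` struct in place of the dateIssued and dateCreated nodes and returns the two arrays instead of setting instance variables.

```ruby
# Hypothetical stand-in for a dateIssued or dateCreated node.
OriginDate = Struct.new(:text, :encoding)

# Returns [marc_encoded_texts, other_texts], preserving input order.
def split_by_marc_encoding(nodes)
  marc, other = nodes.partition { |n| n.encoding == 'marc' }
  [marc.map(&:text), other.map(&:text)]
end

nodes = [
  OriginDate.new('1865', 'marc'),
  OriginDate.new('circa 1865', ''),
  OriginDate.new('1866', 'w3cdtf')
]
split_by_marc_encoding(nodes)
# => [["1865"], ["circa 1865", "1866"]]
```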

#place ⇒ `Object`

Values of the origin_info/place/placeTerm elements (first method of the PUBLICATION (place, year) group).

# File 'lib/stanford-mods/searchworks.rb', line 383

def place
  vals = self.term_values([:origin_info,:place,:placeTerm])
  vals
end

#pub_date ⇒ `String`

The year the object was published, filtered based on max_pub_date and min_pub_date from the config file

Returns:

(String) —

4 character year or nil




# File 'lib/stanford-mods/searchworks.rb', line 466

def pub_date
  pub_year || nil
end

#pub_date_display ⇒ `String`

For the date display only, the first place to look is in the dates without encoding=marc array. If no such dates, select the first date in the dates_marc_encoding array. Otherwise return nil

Returns:

(String) —

value for the pub_date_display Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 391

def pub_date_display
    return dates_no_marc_encoding.first unless dates_no_marc_encoding.empty?
    return dates_marc_encoding.first    unless dates_marc_encoding.empty?
    return nil
end

#pub_date_facet ⇒ `Array[String]`

Values for the pub date facet. This is less strict than the 4 digit year requirement for pub_date

Returns:

(Array[String]) —

with values for the pub date facet

# File 'lib/stanford-mods/searchworks.rb', line 472

def pub_date_facet
  if pub_date
    if pub_date.start_with?('-')
      return (pub_date.to_i + 1000).to_s + ' B.C.'
    end
    if pub_date.include? '--'
      cent=pub_date[0,2].to_i
      cent+=1
      cent=cent.to_s+'th century'
      return cent
    else
      return pub_date
    end
  else
    nil
  end
end
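
The three output shapes can be reproduced with a small sketch that takes the encoded year as an argument. `facet_label` is a hypothetical name; the branches follow the listing above.

```ruby
# Hypothetical standalone version of the facet labelling.
def facet_label(pub_date)
  return nil unless pub_date
  # Negative values are B.C. years stored with a 1000 offset.
  return (pub_date.to_i + 1000).to_s + ' B.C.' if pub_date.start_with?('-')
  # "19--" style values are centuries: leading digits plus one.
  return (pub_date[0, 2].to_i + 1).to_s + 'th century' if pub_date.include?('--')
  pub_date
end

facet_label('-700')  # => "300 B.C."
facet_label('19--')  # => "20th century"
facet_label('1865')  # => "1865"
```

The suffix is always "th", so an input of "20--" produces "21th century".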

#pub_date_sort ⇒ `Object`

Creates a date suitable for sorting. Guaranteed to be 4 characters long or nil

# File 'lib/stanford-mods/searchworks.rb', line 451

def pub_date_sort
  pd=nil
  if pub_date
    pd=pub_date
    if pd.length == 3
      pd='0'+pd
    end
    pd=pd.gsub('--','00')
  end
  raise "pub_date_sort was about to return a non 4 digit value #{pd}!" if pd and pd.length !=4
  pd
end
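
A standalone sketch of the padding rules follows. `sortable_year` is a hypothetical helper, and it raises ArgumentError where the listing above raises a plain string message.

```ruby
# Hypothetical standalone version of the sort-key normalization.
def sortable_year(pub_date)
  return nil unless pub_date
  # Left-pad 3 character values, then turn placeholder dashes into zeros.
  padded = pub_date.length == 3 ? '0' + pub_date : pub_date
  padded = padded.gsub('--', '00')
  raise ArgumentError, "not 4 characters: #{padded}" unless padded.length == 4
  padded
end

sortable_year('965')   # => "0965"
sortable_year('19--')  # => "1900"
sortable_year('-700')  # => "-700"
```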

#pub_dates ⇒ `Array<String>`

For the date indexing, sorting and faceting, the first place to look is in the dates with encoding=marc array. If that doesn’t exist, look in the dates without encoding=marc array. Otherwise return nil

Returns:

(Array<String>) —

values for the date Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 400

def pub_dates
  return dates_marc_encoding    unless dates_marc_encoding.empty?
  return dates_no_marc_encoding unless dates_no_marc_encoding.empty?
  return nil
end
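
pub_dates and pub_date_display consult the same two arrays in opposite order. The sketch below shows that contrast with hypothetical helpers that take the arrays as arguments.

```ruby
# Indexing, sorting, faceting: marc-encoded dates win.
def dates_for_indexing(marc, non_marc)
  return marc unless marc.empty?
  return non_marc unless non_marc.empty?
  nil
end

# Display: the first non-marc date wins, then the first marc date.
def date_for_display(marc, non_marc)
  non_marc.first || marc.first
end

marc     = ['1865']
non_marc = ['circa 1865']
dates_for_indexing(marc, non_marc)  # => ["1865"]
date_for_display(marc, non_marc)    # => "circa 1865"
```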

#pub_year ⇒ `String`

Get the publish year from mods

Returns:

(String) —

4 character year or nil if no valid date was found

# File 'lib/stanford-mods/searchworks.rb', line 415

def pub_year
  #use the cached year if there is one
  if @pub_year
    if @pub_year == ''
      return nil
    end
    return @pub_year
  end
  dates = pub_dates
  if dates
    year = []
    pruned_dates = []
    dates.each do |f_date|
      #remove ? and []
      pruned_dates << f_date.gsub('?','').gsub('[','').gsub(']','')
    end
    #try to find a date starting with the most normal date formats and progressing to more wonky ones
    @pub_year = get_plain_four_digit_year pruned_dates
    return @pub_year if @pub_year
    # Check for years in u notation, e.g., 198u
    @pub_year = get_u_year pruned_dates
    return @pub_year if @pub_year
    @pub_year = get_double_digit_century pruned_dates
    return @pub_year if @pub_year
    @pub_year = get_bc_year pruned_dates
    return @pub_year if @pub_year
    @pub_year = get_three_digit_year pruned_dates
    return @pub_year if @pub_year
    @pub_year = get_single_digit_century pruned_dates
    return @pub_year if @pub_year
  end
  @pub_year=''
  return nil
end

#subject_all_search ⇒ `Array<String>`

Values are the contents of:

all subject subelements except subject/cartographic plus  genre top level element

Returns:

(Array<String>) —

values for the subject_all_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 372

def subject_all_search
  vals = topic_search ? Array.new(topic_search) : []
  vals.concat(geographic_search) if geographic_search
  vals.concat(subject_other_search) if subject_other_search
  vals.concat(subject_other_subvy_search) if subject_other_subvy_search
  vals.empty? ? nil : vals
end

#subject_names ⇒ `Object`

convenience method for subject/name/namePart values (to avoid parsing the mods for the same thing multiple times)




# File 'lib/stanford-mods/searchworks.rb', line 649

def subject_names
  @subject_names ||= self.sw_subject_names
end

#subject_occupations ⇒ `Object`

convenience method for subject/occupation values (to avoid parsing the mods for the same thing multiple times)




# File 'lib/stanford-mods/searchworks.rb', line 654

def subject_occupations
  @subject_occupations ||= self.term_values([:subject, :occupation])
end

#subject_other_search ⇒ `Array<String>`

Values are the contents of:

subject/name
subject/occupation  - no subelements
subject/titleInfo

Returns:

(Array<String>) —

values for the subject_other_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 341

def subject_other_search
  @subject_other_search ||= begin
    vals = subject_occupations ? Array.new(subject_occupations) : []
    vals.concat(subject_names) if subject_names
    vals.concat(subject_titles) if subject_titles
    vals.empty? ? nil : vals
  end
end

#subject_other_subvy_search ⇒ `Array<String>`

Values are the contents of:

subject/temporal
subject/genre

Returns:

(Array<String>) —

values for the subject_other_subvy_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 354

def subject_other_subvy_search
  @subject_other_subvy_search ||= begin
    vals = subject_temporal ? Array.new(subject_temporal) : []
    gvals = self.term_values([:subject, :genre])
    vals.concat(gvals) if gvals

    # print a message for any temporal encodings
    self.subject.temporal.each { |n|
      sw_logger.info("#{druid} has subject temporal element with untranslated encoding: #{n.to_xml}") if !n.encoding.empty?
    }

    vals.empty? ? nil : vals
  end
end

#subject_temporal ⇒ `Object`

convenience method for subject/temporal values (to avoid parsing the mods for the same thing multiple times)




# File 'lib/stanford-mods/searchworks.rb', line 659

def subject_temporal
  @subject_temporal ||= self.term_values([:subject, :temporal])
end

#subject_titles ⇒ `Object`

convenience method for subject/titleInfo values (to avoid parsing the mods for the same thing multiple times)




# File 'lib/stanford-mods/searchworks.rb', line 664

def subject_titles
  @subject_titles ||= self.sw_subject_titles
end

#subject_topics ⇒ `Object`

convenience method for subject/topic values (to avoid parsing the mods for the same thing multiple times)




# File 'lib/stanford-mods/searchworks.rb', line 669

def subject_topics
  @subject_topics ||= self.term_values([:subject, :topic])
end

#sw_addl_authors ⇒ `Array<String>`

Returns values for author_7xx_search field.

Returns:

(Array<String>) —

values for author_7xx_search field




# File 'lib/stanford-mods/searchworks.rb', line 63

def sw_addl_authors
  additional_authors_w_dates
end

#sw_addl_titles ⇒ `Array<String>`

All titles except those that contain the short title (sw_short_title)

Returns:

(Array<String>) —

values for title_variant_search




# File 'lib/stanford-mods/searchworks.rb', line 179

def sw_addl_titles
  full_titles.select { |s| s !~ Regexp.new(Regexp.escape(sw_short_title)) }
end
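
The filter can be tried on plain strings. `additional_titles` is a hypothetical helper that takes the title list and short title as arguments instead of reading them from the record.

```ruby
# Hypothetical standalone version: drop every title that contains the
# short title as a literal substring.
def additional_titles(full_titles, short_title)
  pattern = Regexp.new(Regexp.escape(short_title))
  full_titles.reject { |title| title =~ pattern }
end

additional_titles(['The Olympics : a history.', 'Games of 1896'], 'The Olympics')
# => ["Games of 1896"]
```

`Regexp.escape` keeps characters such as parentheses or question marks in the short title from being read as regex syntax.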

#sw_corporate_authors ⇒ `Array<String>`

Returns values for author_corp_display.

Returns:

(Array<String>) —

values for author_corp_display

# File 'lib/stanford-mods/searchworks.rb', line 79

def sw_corporate_authors
  val = @mods_ng_xml.plain_name.select {|n| n.type_at == 'corporate'}.map { |n| n.display_value_w_date }
  val
end

#sw_full_title ⇒ `String`

Returns value for title_245_search, title_full_display.

Returns:

(String) —

value for title_245_search, title_full_display

# File 'lib/stanford-mods/searchworks.rb', line 129

def sw_full_title
  outer_nodes = @mods_ng_xml.title_info
  outer_node = outer_nodes ? outer_nodes.first : nil
  if outer_node
    nonSort = outer_node.nonSort.text.strip.empty? ? nil : outer_node.nonSort.text.strip
    title   = outer_node.title.text.strip.empty?   ? nil : outer_node.title.text.strip
    preSubTitle = nonSort ? [nonSort, title].compact.join(" ") : title
    preSubTitle.sub!(/:$/, '') if preSubTitle # remove trailing colon

    subTitle = outer_node.subTitle.text.strip
    preParts = subTitle.empty? ? preSubTitle : preSubTitle + " : " + subTitle
    preParts.sub!(/\.$/, '') if preParts # remove trailing period

    partName   = outer_node.partName.text.strip   unless outer_node.partName.text.strip.empty?
    partNumber = outer_node.partNumber.text.strip unless outer_node.partNumber.text.strip.empty?
    partNumber.sub!(/,$/, '') if partNumber # remove trailing comma
    if partNumber && partName
      parts = partNumber + ", " + partName
    elsif partNumber
      parts = partNumber
    elsif partName
      parts = partName
    end
    parts.sub!(/\.$/, '') if parts

    result = parts ? preParts + ". " + parts : preParts
    result += "." if !result.match(/[[:punct:]]$/)
    result.strip!
    result = nil if result.empty?
    result
  else
    nil
  end
end
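
The punctuation rules are easier to see on plain strings. The sketch below is an approximation under stated assumptions: `assemble_title` is a hypothetical helper, and its inputs are taken to be already stripped, with nil standing for an absent element.

```ruby
# Hypothetical approximation of the title assembly on plain strings.
def assemble_title(title:, non_sort: nil, sub_title: nil, part_number: nil, part_name: nil)
  # nonSort + title, without a trailing colon.
  main = [non_sort, title].compact.join(' ').sub(/:$/, '')
  main = "#{main} : #{sub_title}" if sub_title
  main = main.sub(/\.$/, '')
  # partNumber (minus trailing comma) and partName, joined by ", ".
  number = part_number && part_number.sub(/,$/, '')
  parts = [number, part_name].compact.join(', ').sub(/\.$/, '')
  result = parts.empty? ? main : "#{main}. #{parts}"
  # Close with a period unless the title already ends in punctuation.
  result += '.' unless result.match?(/[[:punct:]]$/)
  result
end

assemble_title(non_sort: 'The', title: 'Olympics', sub_title: 'a history')
# => "The Olympics : a history."
assemble_title(title: 'Collected works', part_number: 'Volume 2,', part_name: 'Letters')
# => "Collected works. Volume 2, Letters."
```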

#sw_full_title_without_commas ⇒ `Object`

Deprecated: use sw_title_display instead.

Removes a trailing comma from sw_full_title.

# File 'lib/stanford-mods/searchworks.rb', line 201

def sw_full_title_without_commas
  result = self.sw_full_title
  result.sub!(/,$/, '') if result
  result
end

#sw_genre ⇒ `Array[String]`

return values for the genre facet in SearchWorks

Returns:

(Array[String])

# File 'lib/stanford-mods/searchworks.rb', line 609

def sw_genre
  val = []
  genres = self.term_values(:genre)
  if genres
    val << genres.map(&:capitalize)
    val.flatten! if !val.empty?
    if genres.include?('thesis') || genres.include?('Thesis')
      val << 'Thesis/Dissertation'
      val.delete 'Thesis'
    end
    conf_pub = ['conference publication', 'Conference publication', 'Conference Publication']
    if !(genres & conf_pub).empty?
      types = self.term_values(:typeOfResource)
      if types && types.include?('text')
        val << 'Conference proceedings'
        val.delete 'Conference publication'
      end
    end
  end
  val.uniq
end
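
The mapping can be tried out on plain arrays. The sketch below is an illustration only: `normalize_genres` is a hypothetical name, and it tests the capitalized values, so it also matches spellings such as `THESIS` that the method above does not.

```ruby
# Hypothetical stand-in for sw_genre: takes plain arrays instead of
# reading genre and typeOfResource values from a MODS record.
def normalize_genres(genres, types = [])
  vals = genres.map(&:capitalize)
  if vals.include?('Thesis')
    vals << 'Thesis/Dissertation'
    vals.delete('Thesis')
  end
  # conference publications become proceedings only for text resources
  if vals.include?('Conference publication') && types.include?('text')
    vals << 'Conference proceedings'
    vals.delete('Conference publication')
  end
  vals.uniq
end

normalize_genres(['thesis', 'Article'])
# => ["Article", "Thesis/Dissertation"]
normalize_genres(['conference publication'], ['text'])
# => ["Conference proceedings"]
```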

#sw_geographic_search(sep = ' ') ⇒ `Array<String>`

Values are the contents of:

subject/geographic
subject/hierarchicalGeographic
subject/geographicCode  (only include the translated value if it isn't already present from other mods geo fields)

Parameters:

sep (String) (defaults to: ' ') —
- the separator string for joining hierarchicalGeographic sub elements

Returns:

(Array<String>) —

values for geographic_search Solr field for this document or [] if none

# File 'lib/stanford-mods/searchworks.rb', line 217

def sw_geographic_search(sep = ' ')
  result = term_values([:subject, :geographic]) || []

  # hierarchicalGeographic has sub elements
  @mods_ng_xml.subject.hierarchicalGeographic.each { |hg_node|
    hg_vals = []
    hg_node.element_children.each { |e|
      hg_vals << e.text unless e.text.empty?
    }
    result << hg_vals.join(sep) unless hg_vals.empty?
  }

  trans_code_vals = @mods_ng_xml.subject.geographicCode.translated_value
  if trans_code_vals
    trans_code_vals.each { |val|
      result << val if !result.include?(val)
    }
  end

  result
end
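
A rough sketch of the same merge on plain arrays, assuming the values have already been pulled out of the XML. `geographic_values` and its parameters are hypothetical and only illustrate the join and the duplicate check.

```ruby
# Hypothetical stand-in: plain arrays replace the MODS node sets.
#   geographic   - values of subject/geographic
#   hierarchical - one array of sub element values per hierarchicalGeographic
#   translated   - translated subject/geographicCode values
def geographic_values(geographic, hierarchical, translated, sep = ' ')
  result = geographic.dup
  hierarchical.each do |sub_values|
    present = sub_values.reject(&:empty?)
    result << present.join(sep) unless present.empty?
  end
  # only add translated codes that are not already present
  translated.each { |val| result << val unless result.include?(val) }
  result
end

geographic_values(['Oregon'], [['United States', '', 'California']],
                  ['Oregon', 'Nevada'], ', ')
# => ["Oregon", "United States, California", "Nevada"]
```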

#sw_impersonal_authors ⇒ `Array<String>`

Returns the display_value_w_date for all <mods><name> elements that do not have type="personal".

Returns:

(Array<String>) —

values for author_other_facet



# File 'lib/stanford-mods/searchworks.rb', line 74

def sw_impersonal_authors
  @mods_ng_xml.plain_name.select {|n| n.type_at != 'personal'}.map { |n| n.display_value_w_date }
end

#sw_language_facet ⇒ `Object`

Includes languages known to SearchWorks; tries to error-correct when possible (e.g. when ISO-639 disagrees with the MARC standard).

# File 'lib/stanford-mods/searchworks.rb', line 13

def sw_language_facet
  result = []
  @mods_ng_xml.language.each { |n|
    # get languageTerm codes and add their translations to the result
    n.code_term.each { |ct|
      if ct.authority.match(/^iso639/)
        begin
          vals = ct.text.split(/[,|\ ]/).reject {|x| x.strip.length == 0 }
          vals.each do |v|
            iso639_val = ISO_639.find(v.strip).english_name
            if SEARCHWORKS_LANGUAGES.has_value?(iso639_val)
              result << iso639_val
            else
              result << SEARCHWORKS_LANGUAGES[v.strip]
            end
          end
        rescue => e
          # TODO:  this should be written to a logger
          p "Couldn't find english name for #{ct.text}"
        end
      else
        vals = ct.text.split(/[,|\ ]/).reject {|x| x.strip.length == 0 }
        vals.each do |v|
          result << SEARCHWORKS_LANGUAGES[v.strip]
        end
      end
    }
    # add languageTerm text values
    n.text_term.each { |tt|
      val = tt.text.strip
      result << val if val.length > 0 && SEARCHWORKS_LANGUAGES.has_value?(val)
    }

    # add language values that aren't in languageTerm subelement
    if n.languageTerm.size == 0
      result << n.text if SEARCHWORKS_LANGUAGES.has_value?(n.text)
    end
  }
  result.uniq
end
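
The code-splitting step can be illustrated in isolation. In this sketch the lookup table is a tiny hypothetical stand-in for SEARCHWORKS_LANGUAGES, `language_names` is an invented name, and unknown codes are dropped with `compact`, which the method above does not do.

```ruby
# Hypothetical helper: splits a languageTerm code string on commas, pipes
# and spaces, then looks each code up in the given table.
def language_names(code_text, table)
  codes = code_text.split(/[,| ]/).reject { |c| c.strip.empty? }
  codes.map { |c| table[c.strip] }.compact.uniq
end

# Tiny stand-in table; the real SEARCHWORKS_LANGUAGES is much larger.
table = { 'eng' => 'English', 'fre' => 'French', 'ger' => 'German' }
language_names('eng, fre|ger', table)
# => ["English", "French", "German"]
```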

#sw_logger ⇒ `Object`

Returns a Logger that writes to STDOUT, memoized in @logger.



# File 'lib/stanford-mods/searchworks.rb', line 492

def sw_logger
  @logger ||= Logger.new(STDOUT)
end

#sw_main_author ⇒ `String`

Returns value for author_1xx_search field.

Returns:

(String) —

value for author_1xx_search field



# File 'lib/stanford-mods/searchworks.rb', line 58

def sw_main_author
  main_author_w_date
end

#sw_meeting_authors ⇒ `Array<String>`

Returns values for author_meeting_display.

Returns:

(Array<String>) —

values for author_meeting_display



# File 'lib/stanford-mods/searchworks.rb', line 85

def sw_meeting_authors
  @mods_ng_xml.plain_name.select {|n| n.type_at == 'conference'}.map { |n| n.display_value_w_date }
end

#sw_person_authors ⇒ `Array<String>`

Returns values for author_person_facet, author_person_display.

Returns:

(Array<String>) —

values for author_person_facet, author_person_display



# File 'lib/stanford-mods/searchworks.rb', line 68

def sw_person_authors
  personal_names_w_dates
end

#sw_short_title ⇒ `String`

Returns value for title_245a_search field.

Returns:

(String) —

value for title_245a_search field



# File 'lib/stanford-mods/searchworks.rb', line 124

def sw_short_title
  short_titles ? short_titles.first : nil
end

#sw_sort_author ⇒ `String`

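Returns a sortable version of the main author:

main_author + sort title

This is the MODS approximation of the value created for a MARC record. When there is no main author, the highest Unicode code point is substituted so that such records sort last.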

Returns:

(String) —

value for author_sort field

# File 'lib/stanford-mods/searchworks.rb', line 93

def sw_sort_author
  #  substitute java Character.MAX_CODE_POINT for nil main_author so missing main authors sort last
  val = '' + (main_author_w_date ? main_author_w_date : "\u{10FFFF} ") + ( sort_title ? sort_title : '')
  val.gsub(/[[:punct:]]*/, '').strip
end
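
To see why the placeholder pushes author-less records to the end, here is a small sketch on plain strings. `sort_author_key` is a hypothetical name that mirrors the concatenation above.

```ruby
# Hypothetical stand-in for sw_sort_author using plain strings.
def sort_author_key(main_author, sort_title)
  # U+10FFFF sorts after every other code point, so keys that start
  # with it end up last
  key = (main_author || "\u{10FFFF} ") + (sort_title || '')
  key.gsub(/[[:punct:]]/, '').strip
end

keys = [sort_author_key(nil, 'Zebra'),
        sort_author_key('Smith, John', 'Apples')]
keys.sort.first
# => "Smith JohnApples"
```

As in the method above, no separator is inserted between a real author and the title; only the placeholder carries a trailing space.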

#sw_sort_title ⇒ `String`

Returns a sortable version of the main title: the full title with the leading nonSort text and all punctuation removed.

Returns:

(String) —

value for title_sort field

# File 'lib/stanford-mods/searchworks.rb', line 185

def sw_sort_title
  # get nonSort piece
  outer_nodes = @mods_ng_xml.title_info
  outer_node = outer_nodes ? outer_nodes.first : nil
  if outer_node
    nonSort = outer_node.nonSort.text.strip.empty? ? nil : outer_node.nonSort.text.strip
  end

  val = '' + ( sw_full_title ? sw_full_title : '')
  val.sub!(Regexp.new("^" + nonSort), '') if nonSort
  val.gsub!(/[[:punct:]]*/, '').strip
  val.squeeze(" ").strip
end
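
A sketch of the same normalization on plain strings. `sort_title_key` is a hypothetical name; wrapping the prefix in `Regexp.escape` is an assumption of this sketch, not something the method above does, and it matters when the nonSort text contains regex metacharacters.

```ruby
# Hypothetical stand-in for sw_sort_title using plain strings.
def sort_title_key(full_title, non_sort = nil)
  val = full_title.to_s.dup
  # escape the prefix so characters like "[" or "." are matched literally
  val = val.sub(/\A#{Regexp.escape(non_sort)}/, '') if non_sort
  val.gsub(/[[:punct:]]/, '').squeeze(' ').strip
end

sort_title_key('The Olympics : a history.', 'The')
# => "Olympics a history"
sort_title_key('[The] title', '[The]')
# => "title"
```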

#sw_subject_names(sep = ', ') ⇒ `Array<String>`

Values are the contents of:

subject/name/namePart

Values from namePart subelements are concatenated in the order they appear (e.g. "Shakespeare, William, 1564-1616").

Parameters:

sep (String) (defaults to: ', ') —
- the separator string for joining namePart sub elements

Returns:

(Array<String>) —

values for names inside subject elements or [] if none

# File 'lib/stanford-mods/searchworks.rb', line 244

def sw_subject_names(sep = ', ')
  result = []
  @mods_ng_xml.subject.name_el.select { |n_el| n_el.namePart }.each { |name_el_w_np|
    parts = name_el_w_np.namePart.map { |npn| npn.text unless npn.text.empty? }.compact
    result << parts.join(sep).strip unless parts.empty?
  }
  result
end
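
The concatenation can be illustrated without any XML. `join_name_parts` below is a hypothetical helper that takes the namePart texts of one subject name as an array.

```ruby
# Hypothetical helper: joins the non-empty namePart texts of one name.
def join_name_parts(name_parts, sep = ', ')
  parts = name_parts.reject(&:empty?)
  parts.empty? ? nil : parts.join(sep).strip
end

join_name_parts(['Shakespeare, William', '', '1564-1616'])
# => "Shakespeare, William, 1564-1616"
```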

#sw_subject_titles(sep = ' ') ⇒ `Array<String>`

Values are the contents of:

subject/titleInfo/(subelements)

Parameters:

sep (String) (defaults to: ' ') —
- the separator string for joining titleInfo sub elements

Returns:

(Array<String>) —

values for titles inside subject elements or [] if none

# File 'lib/stanford-mods/searchworks.rb', line 257

def sw_subject_titles(sep = ' ')
  result = []
  @mods_ng_xml.subject.titleInfo.each { |ti_el|
    parts = ti_el.element_children.map { |el| el.text unless el.text.empty? }.compact
    result << parts.join(sep).strip unless parts.empty?
  }
  result
end

#sw_title_display ⇒ `String`

Like sw_full_title, but without trailing punctuation (comma, slash, semicolon, colon, period, backslash). The spec comes from solrmarc-sw sw_index.properties:

title_display = custom, removeTrailingPunct(245abdefghijklmnopqrstuvwxyz, [\\\\,/;:], ([A-Za-z]{4}|[0-9]{3}|\\)|\\,))

Returns:

(String) —

value for title_display (like title_full_display without trailing punctuation)

# File 'lib/stanford-mods/searchworks.rb', line 168

def sw_title_display
  result = sw_full_title ? sw_full_title : nil
  if result
    result.sub!(/[\.,;:\/\\]+$/, '')
    result.strip!
  end
  result
end
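
The trailing-punctuation rule is easy to check on plain strings. `strip_trailing_punct` is a hypothetical name; unlike the method above, it returns a new string instead of modifying its input in place.

```ruby
# Hypothetical helper: removes a run of trailing . , ; : / \ characters.
def strip_trailing_punct(title)
  return nil if title.nil?

  title.sub(/[.,;:\/\\]+\z/, '').strip
end

strip_trailing_punct('The Olympics : a history.')
# => "The Olympics : a history"
strip_trailing_punct('Title /')
# => "Title"
```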

#topic_facet ⇒ `Array<String>`

Values are the contents of:

subject/topic
subject/name
subject/title
subject/occupation

with trailing comma, semicolon, and backslash (and any preceding spaces) removed

Returns:

(Array<String>) —

values for the topic_facet Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 285

def topic_facet
  vals = subject_topics ? Array.new(subject_topics) : []
  vals.concat(subject_names) if subject_names
  vals.concat(subject_titles) if subject_titles
  vals.concat(subject_occupations) if subject_occupations
  vals.map! { |val|
    v = val.sub(/[\\,;]$/, '')
    v.strip
  }
  vals.empty? ? nil : vals
end
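
Note that the cleanup here is narrower than in sw_title_display: only one trailing comma, semicolon, or backslash is removed, and periods are kept. A minimal sketch, with `clean_facet_value` as a hypothetical name:

```ruby
# Hypothetical helper: same per-value cleanup as the map! block above.
def clean_facet_value(val)
  val.sub(/[\\,;]$/, '').strip
end

clean_facet_value('Pyramids ;')
# => "Pyramids"
clean_facet_value('Pyramids.')
# => "Pyramids."
```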

#topic_search ⇒ `Array<String>`

Values are the contents of:

mods/genre
mods/subject/topic

Returns:

(Array<String>) —

values for the topic_search Solr field for this document or nil if none

# File 'lib/stanford-mods/searchworks.rb', line 270

def topic_search
  @topic_search ||= begin
    vals = self.term_values(:genre) || []
    vals.concat(subject_topics) if subject_topics
    vals.empty? ? nil : vals
  end
end
