Class: Traject::Macros::MarcFormatClassifier

Inherits:
Object
Defined in:
lib/traject/macros/marc_format_classifier.rb

Overview

Not actually a macro, but kept here for now: a class for classifying MARC records according to format/genre/type.

VERY opinionated.

Constructor Details

#initialize(marc_record) ⇒ MarcFormatClassifier

Returns a new instance of MarcFormatClassifier.



# File 'lib/traject/macros/marc_format_classifier.rb', line 22

def initialize(marc_record)
  @record = marc_record
end

Instance Attribute Details

#record ⇒ Object (readonly)

Returns the value of attribute record.



# File 'lib/traject/macros/marc_format_classifier.rb', line 20

def record
  @record
end

Instance Method Details

#formats(options = {}) ⇒ Object

A very opinionated method that combines all applicable format/genre/type labels into a single array of 1 to N elements.

If nothing else applies, the array contains the default label “Other”, which can be overridden with the :default option.
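The combining rules can be sketched without a MARC record by passing the predicate results in directly. This is a minimal sketch: `combine_formats` and its keyword arguments are hypothetical names for illustration, not part of the class.

```ruby
# Hypothetical stand-in for #formats: predicate results are passed in as
# plain values instead of being computed from a MARC record.
def combine_formats(genres:, manuscript_archive: false, microform: false,
                    online: false, print: false, thesis: false,
                    proceeding: false, default: "Other")
  formats = genres.dup

  formats << "Manuscript/Archive" if manuscript_archive
  formats << "Microform"          if microform
  formats << "Online"             if online

  # Recordings are never labelled "Print", even when the print check passes.
  recording = formats.include?("Non-musical Recording") ||
              formats.include?("Musical Recording")
  formats << "Print" if print && !recording

  # A thesis replaces "Book" rather than being added alongside it.
  if thesis
    formats.delete("Book")
    formats << "Dissertation/Thesis"
  end

  formats << "Conference" if proceeding
  formats << default if formats.empty?
  formats
end

p combine_formats(genres: ["Book"], print: true, thesis: true)
p combine_formats(genres: ["Musical Recording"], print: true)
p combine_formats(genres: [], default: "Unknown")
```

The first call drops “Book” in favour of “Dissertation/Thesis”, the second suppresses “Print” for a recording, and the third falls back to the supplied default.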



# File 'lib/traject/macros/marc_format_classifier.rb', line 30

def formats(options = {})
  options = {:default => "Other"}.merge(options)

  formats = []

  formats.concat genre
  
  formats << "Manuscript/Archive" if manuscript_archive?
  formats << "Microform" if microform?
  formats << "Online"    if online?

  # In our own data, if it's an audio recording, it might show up
  # as print, but it's probably not. 
  formats << "Print"     if print? && ! (formats.include?("Non-musical Recording") || formats.include?("Musical Recording"))

  # If it's a Dissertation, we decide it's NOT a book
  if thesis?
    formats.delete("Book")
    formats << "Dissertation/Thesis"
  end

  if proceeding?
    formats <<  "Conference"
  end

  if formats.empty?
    formats << options[:default]
  end

  return formats
end

#genre ⇒ Object

Returns an array of one or more values from: Book; Journal/Newspaper; Musical Score; Map/Globe; Non-musical Recording; Musical Recording; Image; Software/Data; Video/Film.

Uses leader byte 6, leader byte 7, and 007 byte 0.

Gets actual labels from marc_genre_leader and marc_genre_007 translation maps, so you can customize labels if you want.
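The lookup order (leader bytes 6 and 7 together, then leader byte 6 alone, then byte 0 of each 007) can be sketched with plain hashes. The hashes below are illustrative stand-ins for the translation maps, and their keys and labels are assumptions for this example rather than the contents of the maps shipped with traject. `genre_sketch` is likewise a hypothetical name.

```ruby
# Illustrative stand-ins for the marc_genre_leader and marc_genre_007
# translation maps. Keys and labels are assumed for the example only.
GENRE_LEADER = { "am" => "Book", "as" => "Journal/Newspaper", "e" => "Map/Globe" }
GENRE_007    = { "v" => "Video/Film" }

# leader: the 24-character MARC leader as a String.
# values_007: the values of all 007 fields, as Strings.
def genre_sketch(leader, values_007)
  results = GENRE_LEADER[leader[6, 2]] ||            # leader bytes 6-7 together
            GENRE_LEADER[leader[6]] ||               # leader byte 6 alone
            values_007.map { |v| GENRE_007[v[0]] }   # byte 0 of each 007
  [results].flatten
end

p genre_sketch("00000cam a2200000 a 4500", [])            # matched on "am"
p genre_sketch("00000cem a2200000 a 4500", [])            # matched on "e"
p genre_sketch("00000cgm a2200000 a 4500", ["vd cvaizq"]) # falls through to 007
```

Wrapping the result in an array and flattening it means callers always receive an array, whether the match came from a single leader lookup or from several 007 fields.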



# File 'lib/traject/macros/marc_format_classifier.rb', line 72

def genre
  marc_genre_leader = Traject::TranslationMap.new("marc_genre_leader")
  marc_genre_007    = Traject::TranslationMap.new("marc_genre_007")

  results = marc_genre_leader[ record.leader.slice(6,2) ] ||
    marc_genre_leader[ record.leader.slice(6)] ||
    record.find_all {|f| f.tag == "007"}.collect {|f| marc_genre_007[f.value.slice(0)]}

  [results].flatten
end

#manuscript_archive? ⇒ Boolean

Marked as manuscript OR archive.

Returns:

  • (Boolean)
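The check depends only on two leader bytes, so it can be tried on a bare leader string. This is a minimal sketch: `manuscript_archive_leader?` is a hypothetical helper and the leader values are made up for illustration.

```ruby
# Hypothetical helper: takes the leader as a plain String rather than a
# MARC record.
def manuscript_archive_leader?(leader)
  type_of_record  = leader[6]   # t, d, f = manuscript material; b = obsolete archival code
  type_of_control = leader[8]   # a = archival control
  %w[t d f b].include?(type_of_record) || type_of_control == "a"
end

p manuscript_archive_leader?("00000ctm a2200000 a 4500")  # manuscript language material
p manuscript_archive_leader?("00000camaa2200000 a 4500")  # byte 8 is "a"
p manuscript_archive_leader?("00000cam a2200000 a 4500")  # neither condition holds
```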


# File 'lib/traject/macros/marc_format_classifier.rb', line 157

def manuscript_archive?
  leader06 = record.leader.slice(6)
  leader08 = record.leader.slice(8)

  # leader 06: t=Manuscript Language Material, d=Manuscript Music,
  # f=Manuscript Cartographic
  #
  # leader 06 = 'b' is obsolete, but if it exists it means archival control
  #
  # leader 08 'a'='archival control'
  %w{t d f b}.include?(leader06) || leader08 == "a"
end

#microform? ⇒ Boolean

If field 007 byte 0 is ‘h’, that’s microform, but many of our microform records don’t have that. If leader byte 6 is ‘h’, that’s an obsolete way of saying microform. And finally, if the GMD (245$h) starts with “[microform]”, that counts as microform too.

Returns:

  • (Boolean)


# File 'lib/traject/macros/marc_format_classifier.rb', line 150

def microform?
  # Byte positions are read with Integer indexes.
  normalized_gmd.start_with?("[microform]") ||
  record.leader[6] == "h" ||
  record.find {|f| (f.tag == "007") && (f.value[0] == "h")}
end
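
A byte position in the leader or in a control field is read with an Integer index. String#[] with a String argument performs a substring search instead, so the two forms are not interchangeable. A minimal sketch of the difference, using a made-up leader value:

```ruby
leader = "00000chm a2200000 a 4500"   # illustrative leader; byte 6 is "h"

by_position  = leader[6]     # the character at index 6
by_substring = leader["6"]   # substring search; nil here, no "6" in the leader

p by_position
p by_substring
p by_position == "h"
```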

#normalized_gmd ⇒ Object

Returns the downcased GMD (245$h), or an empty string if the record has none.



# File 'lib/traject/macros/marc_format_classifier.rb', line 171

def normalized_gmd
  @gmd ||= begin
    ((a245 = record['245']) && a245['h'] && a245['h'].downcase) || ""
  end
end

#online? ⇒ Boolean

Uses MARC 007 to determine whether the record represents an online resource, falling back to the 245$h GMD when no 007 identifies the item as electronic.

Returns:

  • (Boolean)
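The two-step decision can be sketched on plain strings. `online_sketch?` is a hypothetical helper that takes the 007 values and the downcased GMD directly, and the sample values are made up for illustration.

```ruby
# Hypothetical stand-in for #online?: takes 007 values and the downcased
# 245$h GMD as plain Strings instead of reading them from a MARC record.
def online_sketch?(values_007, gmd)
  # 007 byte 0 "c" (electronic) plus byte 1 "r" (remote) is decisive.
  return true if values_007.any? { |v| v[0] == "c" && v[1] == "r" }

  # Otherwise the GMD counts only when no 007 marks the item as electronic:
  # an electronic 007 that is not remote means a physical carrier.
  electronic_007 = values_007.any? { |v| v[0] == "c" }
  gmd.start_with?("[electronic resource]") && !electronic_007
end

p online_sketch?(["cr |||"], "")                      # remote electronic 007
p online_sketch?([], "[electronic resource] /")       # GMD only, no 007
p online_sketch?(["co |||"], "[electronic resource]") # electronic but not remote
```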


# File 'lib/traject/macros/marc_format_classifier.rb', line 132

def online?
  # field 007, byte 0 c="electronic" byte 1 r="remote" ==> sure Online
  found_007 = record.find do |field|
    field.tag == "007" && field.value.slice(0) == "c" && field.value.slice(1) == "r"
  end

  return true if found_007

  # Otherwise, if it has a GMD ["electronic resource"], we count it
  # as online only if NO 007[0] == 'c' exists, cause if it does we already
  # know it's electronic but not remote, otherwise first try would
  # have found it. 
  return (normalized_gmd.start_with? "[electronic resource]") && ! record.find {|f| f.tag == '007' && f.value.slice(0) == "c"}        
end

#print? ⇒ Boolean

Algorithm with help from Chris Case.

  • If it has any RDA 338 (with $2 rdacarrier), then it’s print if a $a has the value volume, sheet, or card, or a $b has the corresponding code nc, nb, or no.

  • If it does not have an RDA 338, it’s print if and only if it has NO 245$h GMD.

  • Here at JH, for legacy reasons we also choose to not call it print if it’s already been marked audio, but we do that in a different method.

With real-world data, this algorithm will definitely get some things wrong in both directions, but it seems to be good enough.

Returns:

  • (Boolean)
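The branching can be sketched with each 338 field reduced to a Hash of subfield code to value. That representation is an assumption for brevity (it cannot hold repeated subfields), and `print_sketch?` is a hypothetical name.

```ruby
# Hypothetical stand-in for #print?: each 338 field is a Hash of
# subfield code => value; gmd is the downcased 245$h ("" when absent).
PRINT_CARRIER_TERMS = %w[volume card sheet]
PRINT_CARRIER_CODES = %w[nc no nb]

def print_sketch?(fields_338, gmd)
  rda = fields_338.select { |f| f["2"] == "rdacarrier" }
  if rda.any?
    # With RDA carrier data present, only the carrier decides.
    rda.any? do |f|
      PRINT_CARRIER_TERMS.include?(f["a"]) || PRINT_CARRIER_CODES.include?(f["b"])
    end
  else
    # Without it, print means "no GMD at all".
    gmd.empty?
  end
end

p print_sketch?([{ "a" => "volume", "2" => "rdacarrier" }], "")
p print_sketch?([{ "a" => "online resource", "b" => "cr", "2" => "rdacarrier" }], "")
p print_sketch?([], "[sound recording]")
```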


# File 'lib/traject/macros/marc_format_classifier.rb', line 111

def print?


  rda338 = record.find_all do |field|
    field.tag == "338" && field['2'] == "rdacarrier"
  end

  if rda338.length > 0
    rda338.find do |field| 
      field.subfields.find do |sf|
        (sf.code == "a" && %w{volume card sheet}.include?(sf.value)) ||
        (sf.code == "b" && %w{nc no nb}.include?(sf.value))
      end
    end
  else
    normalized_gmd.length == 0
  end
end

#proceeding? ⇒ Boolean

Checks every 6xx field for a $v subfield of “Congresses”.

Returns:

  • (Boolean)
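The $v test is a regular expression match against the whole subfield value. The pattern below is the one used in this method's source; the sample values are made up to show what it accepts and rejects.

```ruby
CONGRESSES = /^\s*(C|c)ongresses\.?\s*$/   # pattern from the method's source

samples = ["Congresses.", " congresses ", "Congresses and conventions", "CONGRESSES"]
samples.each do |value|
  puts "#{value.inspect}: #{value.match?(CONGRESSES)}"
end
```

Surrounding whitespace and one trailing period are tolerated, and only the first letter may vary in case. Longer subdivisions that merely begin with “Congresses” do not match.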


# File 'lib/traject/macros/marc_format_classifier.rb', line 91

def proceeding?
  @proceeding_q ||= begin
    ! record.find do |field|
      field.tag.slice(0) == '6' && field.subfields.find {|sf| sf.code == "v" && sf.value =~ /^\s*(C|c)ongresses\.?\s*$/}
    end.nil?
  end
end

#thesis? ⇒ Boolean

Checks whether the record has a 502 field; if it does, the record is considered a thesis.

Returns:

  • (Boolean)


# File 'lib/traject/macros/marc_format_classifier.rb', line 84

def thesis?
  @thesis_q ||= begin
    ! record.find {|a| a.tag == "502"}.nil?
  end
end