Module: MARC2Solr::Custom
- Defined in:
- lib/marc2solr/marc2solr_custom.rb
Constant Summary
- LOG = JLogger::RootLogger.new
Class Method Summary
- .as_marc_in_json(doc, r) ⇒ Object
  Return the record as marc-in-json.
- .asMARC(doc, r) ⇒ Object
  Return the record as binary MARC.
- .asXML(doc, r) ⇒ String
  The simplest possible example: just call a method on the underlying MARC4J4R record. Note that even though we don’t use the arguments, the method signature has to support them.
- .fieldWithoutIndexingChars(doc, r, tag) ⇒ Object
  A simple function to pull the non-indexing characters off the front of a field based on the second indicator.
- .getAllSearchableFields(doc, r, lower, upper) ⇒ String
  Get all the text from fields whose tags fall between (inclusive) the two given tag strings.
- .getDate(doc, r) ⇒ String
  An example of a DateOfPublication implementation.
- .getDateRange(date, r) ⇒ Object
  A helper function: take in a year and return a date category.
- .getISBNS(doc, r, codes = ['a', 'z']) ⇒ Object
  Extract an ISBN from the given subfields of the 020 and provide both 10-character and 13-digit versions for each.
- .pubDateAndRange(doc, r) ⇒ Object
- .pubDateRange(doc, r, wherePubdateIsStored) ⇒ Object
  Get the date range, based on the previously-computed pubdate.
- .valsByPattern(doc, r, tag, codes, pattern, matchindex = 0) ⇒ Array<String>
  How about one to sort out, say, the 035s? We’ll make a generic routine that looks for specified values in specified subfields of variable fields, and then make sure they match before returning them.
Class Method Details
.as_marc_in_json(doc, r) ⇒ Object
Return the record as marc-in-json.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 38
    def self.as_marc_in_json doc, r
      return r.to_marc_in_json
    end
.asMARC(doc, r) ⇒ Object
Return the record as binary MARC.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 31
    def self.asMARC doc, r
      return r.to_marc
    end
.asXML(doc, r) ⇒ String
The simplest possible example: just call a method on the underlying MARC4J4R record. Note that even though we don’t use the arguments, the method signature has to support them.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 26
    def self.asXML doc, r
      # Remember, module function! Define with "def self.methodName"
      return r.to_xml
    end
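The calling convention can be tried without a real MARC record. The following is a minimal sketch, not the library's own code: `FakeRecord` and `MyCustom` are hypothetical names introduced only for illustration, and the stubbed `to_xml` stands in for the real MARC4J4R method.

```ruby
# Hypothetical stand-in for a MARC4J4R record; only #to_xml is stubbed.
FakeRecord = Struct.new(:xml) do
  def to_xml
    xml
  end
end

# Hypothetical module following the same pattern as MARC2Solr::Custom.
module MyCustom
  # Module function: defined on self, and it accepts (doc, r)
  # even though doc is never used in the body.
  def self.asXML(doc, r)
    r.to_xml
  end
end

record = FakeRecord.new('<record/>')
puts MyCustom.asXML({}, record)
```

The point of the sketch is only that the function is called on the module itself and always receives both the document and the record.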
.fieldWithoutIndexingChars(doc, r, tag) ⇒ Object
A simple function to pull the non-indexing characters off the front of a field based on the second indicator. Only fields whose second indicator is greater than zero contribute a value.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 143
    def self.fieldWithoutIndexingChars doc, r, tag
      vals = []
      r.find_by_tag(tag).each do |df|
        ind2 = df.ind2.to_i
        if ind2 > 0
          vals << df.value[ind2..-1]
        end
      end
      return vals
    end
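To see what the second-indicator slicing does, here is a small sketch that runs without MARC4J4R. `IndField` and `strip_nonfiling` are hypothetical names; the struct only mimics the `ind2` and `value` accessors used above.

```ruby
# Hypothetical stand-in for a data field with a second indicator and a value.
IndField = Struct.new(:ind2, :value)

# Same slicing logic as above, applied to a plain array of fields.
def strip_nonfiling(fields)
  fields.each_with_object([]) do |df, vals|
    skip = df.ind2.to_i
    # Fields with an indicator of 0 (or a blank one) are skipped entirely.
    vals << df.value[skip..-1] if skip > 0
  end
end

fields = [
  IndField.new('4', 'The Hobbit'), # skip "The " (4 characters)
  IndField.new('0', 'Dune')        # nothing to skip, so not returned
]
p strip_nonfiling(fields) # => ["Hobbit"]
```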
.getAllSearchableFields(doc, r, lower, upper) ⇒ String
Get all the text from fields whose tags fall between (inclusive) the two given tag strings: lower is the lowest tag to include and upper is the highest. Tags are compared as strings, and the values are joined with single spaces.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 49
    def self.getAllSearchableFields(doc, r, lower, upper)
      data = []
      r.each do |field|
        next unless field.tag <= upper and field.tag >= lower
        data << field.value
      end
      return data.join(' ')
    end
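The inclusive tag-range test relies on plain string comparison, which works because MARC tags are fixed-width three-character strings. A minimal sketch, assuming a hypothetical `TagField` struct and helper name in place of real record fields:

```ruby
# Hypothetical stand-in for a MARC field: a three-character tag plus its text.
TagField = Struct.new(:tag, :value)

# Keep fields whose tag lies in [lower, upper] and join their text.
def searchable_text(fields, lower, upper)
  fields.select { |f| f.tag >= lower && f.tag <= upper }
        .map(&:value)
        .join(' ')
end

fields = [
  TagField.new('001', 'id123'),
  TagField.new('100', 'Tolkien, J. R. R.'),
  TagField.new('245', 'The Hobbit'),
  TagField.new('650', 'Fantasy')
]
puts searchable_text(fields, '100', '245') # prints "Tolkien, J. R. R. The Hobbit"
```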
.getDate(doc, r) ⇒ String
An example of a DateOfPublication implementation

    # File 'lib/marc2solr/marc2solr_custom.rb', line 109
    def self.getDate doc, r
      begin
        ohoh8 = r['008'].value
        date1 = ohoh8[7..10].downcase
        datetype = ohoh8[6..6]
        if ['n','u','b'].include? datetype
          date1 = ""
        else
          date1 = date1.gsub('u', '0').gsub('|', ' ')
          date1 = '' if date1 == '0000'
        end

        if m = /^\d\d\d\d$/.match(date1)
          return m[0]
        end
      rescue
        # do nothing ... go on to the 260c
      end

      # No good? Fall back on the 260c
      begin
        d = r['260']['c']
        if m = /\d\d\d\d/.match(d)
          return m[0]
        end
      rescue
        LOG.debug "Record #{r['001']} has no valid date"
        return nil
      end
    end
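The 008 handling can be isolated from the record object. The sketch below is an illustration only: `date_from_008` is a hypothetical helper that takes the raw 008 string, reads the date type at position 6 and Date 1 at positions 7 to 10, and applies the same substitutions as the listing above. The sample 008 strings are made up for the example.

```ruby
# Hypothetical helper: extract a four-digit year from a raw 008 string,
# or return nil when the field carries no usable Date 1.
def date_from_008(ohoh8)
  return nil if ohoh8.nil? || ohoh8.length < 11

  datetype = ohoh8[6]
  return nil if %w[n u b].include?(datetype) # date types treated as "no date"

  # Unknown digits ('u') become zeros; fill characters ('|') become spaces.
  date1 = ohoh8[7..10].downcase.gsub('u', '0').gsub('|', ' ')
  return nil if date1 == '0000'

  date1 =~ /\A\d{4}\z/ ? date1 : nil
end

p date_from_008('850101s1985    nyu           000 0 eng d') # => "1985"
p date_from_008('850101s19uu    nyu           000 0 eng d') # => "1900"
p date_from_008('850101n        nyu           000 0 eng d') # => nil
```

A caller would fall back on the 260c only when this returns nil, as the method above does.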
.getDateRange(date, r) ⇒ Object
A helper function: take in a year (as a four-digit string) and return a date category. Years before 1500 map to "Pre-1500", years from 1500 to 1800 map to a century range, years from 1801 to 2100 map to a decade range, and anything else returns nil.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 156
    def self.getDateRange(date, r)
      if date < "1500"
        return "Pre-1500"
      end

      case date.to_i
      when 1500..1800 then
        century = date[0..1]
        # Century range, e.g. "1600-1699"
        return century + '00-' + century + '99'
      when 1801..2100 then
        decade = date[0..2]
        # Decade range, e.g. "1980-1989"
        return decade + "0-" + decade + "9"
      else
        # puts "getDateRange: #{r['001'].value} invalid date #{date}"
      end
    end
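A standalone sketch of the bucketing makes the category boundaries easier to check. It is not the module's code: `date_range` is a hypothetical name, it assumes hyphenated labels for both century and decade buckets, and it compares integers rather than strings for the pre-1500 test.

```ruby
# Hypothetical bucketing helper: year is a four-digit string.
def date_range(year)
  y = year.to_i
  return 'Pre-1500' if y < 1500

  case y
  when 1500..1800
    century = year[0..1]
    "#{century}00-#{century}99"
  when 1801..2100
    decade = year[0..2]
    "#{decade}0-#{decade}9"
  end # any other year falls through and returns nil
end

p date_range('1492') # => "Pre-1500"
p date_range('1650') # => "1600-1699"
p date_range('1987') # => "1980-1989"
p date_range('2500') # => nil
```

Note that 1800 itself lands in the century bucket "1800-1899", while 1801 lands in the decade bucket "1800-1809"; that boundary comes straight from the two ranges in the `case`.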
.getISBNS(doc, r, codes = ['a', 'z']) ⇒ Object
Extract an ISBN from the given subfields of the 020 and provide both 10-character and 13-digit versions for each. If they appear to not be ISBNs, just return the original value.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 89
    def self.getISBNS doc, r, codes=['a', 'z']
      rv = []
      r.find_by_tag('020').each do |f|
        f.sub_values(codes).each do |v|
          std = StdNum::ISBN.allNormalizedValues(v)
          if std.size > 0
            rv.concat std
          else
            rv << v
          end
        end
      end
      return rv
    end
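For readers unfamiliar with how a 13-digit form is derived from a 10-character ISBN, here is a rough sketch of the standard conversion: prefix 978, drop the old check character, and recompute the check digit with alternating weights of 1 and 3. This is an illustration written for this page, not the StdNum implementation; `isbn10_to_13` is a hypothetical name, and it does not verify the incoming ISBN-10 check character.

```ruby
# Hypothetical helper: convert an ISBN-10 string to its ISBN-13 form.
# Returns nil if the input does not look like an ISBN-10.
def isbn10_to_13(isbn10)
  digits = isbn10.delete('-').upcase
  return nil unless digits =~ /\A\d{9}[\dX]\z/

  core = '978' + digits[0, 9] # drop the ISBN-10 check character
  # Weights alternate 1, 3, 1, 3, ... across the twelve digits.
  sum = core.chars.each_with_index.sum { |ch, i| ch.to_i * (i.even? ? 1 : 3) }
  check = (10 - sum % 10) % 10
  core + check.to_s
end

p isbn10_to_13('0-306-40615-2') # => "9780306406157"
p isbn10_to_13('not an isbn')   # => nil
```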
.pubDateAndRange(doc, r) ⇒ Object
Compute the publication date and its date range in one call. Returns [nil, nil] when no date can be found.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 203
    def self.pubDateAndRange(doc, r)
      date = self.getDate(doc, r)
      return [nil, nil] unless date
      range = self.getDateRange(date, r)
      return [date, range]
    end
.pubDateRange(doc, r, wherePubdateIsStored) ⇒ Object
Get the date range, based on the previously-computed pubdate. wherePubdateIsStored is the name of the document field that already holds the publication date.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 175
    def self.pubDateRange(doc, r, wherePubdateIsStored)
      previouslyComputedPubdate = doc[wherePubdateIsStored][0]
      # getDateRange takes (date, r), so the record has to be passed along
      return [self.getDateRange(previouslyComputedPubdate, r)]
    end
.valsByPattern(doc, r, tag, codes, pattern, matchindex = 0) ⇒ Array<String>
How about one to sort out, say, the 035s? We’ll make a generic routine that looks for specified values in specified subfields of variable fields, and then make sure they match before returning them.
See the use of this in the simple_sample/simple_index.rb file for field ‘oclc’.
matchindex selects which part of the match to return. The default is zero, which means “the whole matched string”; a positive value returns that capture group instead.

    # File 'lib/marc2solr/marc2solr_custom.rb', line 72
    def self.valsByPattern(doc, r, tag, codes, pattern, matchindex=0)
      data = []
      r.find_by_tag(tag).each do |f|
        f.sub_values(codes).each do |v|
          if m = pattern.match(v)
            data << m[matchindex]
          end
        end
      end
      data.uniq!
      return data
    end
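The effect of matchindex is easiest to see on plain strings. The sketch below is an illustration under assumptions: `vals_by_pattern` is a hypothetical helper that takes an array of subfield values directly, and the sample values and OCLC-style pattern are made up for the example rather than taken from simple_index.rb.

```ruby
# Hypothetical helper: keep the values that match, returning either the
# whole match (index 0) or one capture group, with duplicates removed.
def vals_by_pattern(values, pattern, matchindex = 0)
  values.each_with_object([]) do |v, data|
    m = pattern.match(v)
    data << m[matchindex] if m
  end.uniq
end

values = ['(OCoLC)12345', '(OCoLC)12345', '(DLC)99-1234']
pattern = /\(OCoLC\)(\d+)/

p vals_by_pattern(values, pattern, 1) # => ["12345"]
p vals_by_pattern(values, pattern)    # => ["(OCoLC)12345"]
```

Non-matching values are dropped rather than passed through, so the pattern doubles as a filter.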