Class: Dor::IndexingService

Inherits:
Object
Defined in:
lib/dor/services/indexing_service.rb

Defined Under Namespace

Classes: ReindexError

Constant Summary

@@loggers = { default: nil }

Memoizes the loggers we create in a hash, initialized with a nil default logger.

Class Method Details

.default_index_logger ⇒ Object

Returns the default index logger, creating it on first use and memoizing it in @@loggers.

# File 'lib/dor/services/indexing_service.rb', line 30

def self.default_index_logger
  @@loggers[:default] ||= generate_index_logger
end
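
The memoization above can be sketched in isolation. This is a minimal illustration, not the library's code: `MemoizedLoggerDemo`, `build_count`, and the in-memory `StringIO` target are hypothetical stand-ins so the example runs without Dor's `Config`.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for Dor::IndexingService, reduced to the memoization logic.
class MemoizedLoggerDemo
  @@loggers = { default: nil }
  @@build_count = 0

  # Number of times a logger has actually been constructed.
  def self.build_count
    @@build_count
  end

  # Stand-in for generate_index_logger: logs to an in-memory buffer, not a file.
  def self.generate_index_logger
    @@build_count += 1
    Logger.new(StringIO.new)
  end

  # Same shape as default_index_logger: build once, then reuse.
  def self.default_index_logger
    @@loggers[:default] ||= generate_index_logger
  end
end

first = MemoizedLoggerDemo.default_index_logger
second = MemoizedLoggerDemo.default_index_logger
puts first.equal?(second)            # both calls return the same Logger instance
puts MemoizedLoggerDemo.build_count  # the logger was only built once
```

Because `||=` only assigns when the stored value is nil or false, the second call skips construction entirely.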

.generate_index_logger { ... } ⇒ Object

Returns a Logger instance for recording info about indexing attempts

Yields:

  • The formatter attempts to call ‘entry_id_block’ and uses the result as an extra identifier for each log entry. If no block is given, or the block raises a StandardError, the placeholder ‘---’ is used instead. ‘request.uuid’ might be useful in a Rails app.



# File 'lib/dor/services/indexing_service.rb', line 13

def self.generate_index_logger(&entry_id_block)
  index_logger = Logger.new(Config.indexing_svc.log, Config.indexing_svc.log_rotation_interval)
  index_logger.formatter = proc do |_severity, datetime, _progname, msg|
    date_format_str = Config.indexing_svc.log_date_format_str
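    # Use the caller-supplied block for the entry id; fall back to a placeholder
    # when no block was given (nil.call raises NoMethodError) or the block fails.
    entry_id = begin
                 entry_id_block.call
               rescue StandardError
                 '---'
               end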
    "[#{entry_id}] [#{datetime.utc.strftime(date_format_str)}] #{msg}\n"
  end
  index_logger
end
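
To see how the entry id block affects each log line, here is a self-contained sketch of the same formatter pattern. `build_index_logger`, the `StringIO` targets, the fixed date format, and the `'req-42'` id are all assumptions made for illustration; the real method reads its log target and date format from `Config.indexing_svc`.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper mirroring generate_index_logger, but writing to any IO
# and taking the date format as an argument instead of reading Dor's Config.
def build_index_logger(io, date_format = '%Y-%m-%d %H:%M:%S', &entry_id_block)
  logger = Logger.new(io)
  logger.formatter = proc do |_severity, datetime, _progname, msg|
    entry_id = begin
                 entry_id_block.call
               rescue StandardError
                 '---' # no block given, or the block raised
               end
    "[#{entry_id}] [#{datetime.utc.strftime(date_format)}] #{msg}\n"
  end
  logger
end

# With a block: its return value prefixes every entry.
with_id = StringIO.new
build_index_logger(with_id) { 'req-42' }.info('indexed druid:ab123cd4567')

# Without a block: the placeholder is used.
without_id = StringIO.new
build_index_logger(without_id).info('indexed druid:ab123cd4567')

puts with_id.string
puts without_id.string
```

The block is called each time an entry is formatted, so in a Rails app a block returning the current request's uuid would tag entries per request rather than per logger.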

.reindex_object(obj, options = {}) ⇒ Object

Takes a Dor object and indexes it to Solr. Does not commit automatically. Returns the Solr document that was added.



# File 'lib/dor/services/indexing_service.rb', line 35

def self.reindex_object(obj, options = {})
  solr_doc = obj.to_solr
  Dor::SearchService.solr.add(solr_doc, options)
  solr_doc
end
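
Since the method adds without committing, the caller decides when documents become searchable. The sketch below illustrates that split with hypothetical stand-ins: `FakeSolr` and `FakeItem` are not part of dor-services, and the Solr connection is passed in explicitly here rather than fetched from `Dor::SearchService.solr`.

```ruby
# Hypothetical in-memory stand-in for the Solr connection: documents passed to
# `add` stay pending until `commit` is called.
class FakeSolr
  attr_reader :pending, :searchable

  def initialize
    @pending = []
    @searchable = []
  end

  def add(doc, _options = {})
    @pending << doc
  end

  def commit
    @searchable.concat(@pending)
    @pending.clear
  end
end

# Hypothetical stand-in for a Dor object; only `to_solr` matters here.
FakeItem = Struct.new(:pid) do
  def to_solr
    { id: pid }
  end
end

# Same shape as reindex_object, with the connection injected for the demo.
def reindex_object(obj, solr, options = {})
  solr_doc = obj.to_solr
  solr.add(solr_doc, options)
  solr_doc
end

solr = FakeSolr.new
doc = reindex_object(FakeItem.new('druid:ab123cd4567'), solr)
before_commit = solr.searchable.dup # nothing visible yet
solr.commit                         # the caller commits explicitly
after_commit = solr.searchable.dup  # now the document is visible
```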

.reindex_pid(pid, index_logger, options = {}) ⇒ Object
.reindex_pid(pid, index_logger, should_raise_errors, options = {}) ⇒ Object
.reindex_pid(pid, options = {}) ⇒ Object

Retrieves a single Dor object by pid, indexes it to Solr, and logs the outcome (using a default logger if none is provided). Does not commit automatically.

WARNING/TODO: the tests indicate that the “rescue Exception” block at the end is skipped, so a thrown exception that is not a StandardError (e.g. SystemStackError) will not be logged. Since that is the only consequence, and the exception bubbles up as intended anyway, it does not seem worth blocking refactoring; see github.com/sul-dlss/dor-services/issues/156. Extra logging in this case would be nice, but centralized indexing that is otherwise fully functional is nicer.



# File 'lib/dor/services/indexing_service.rb', line 74

def self.reindex_pid(pid, *args)
  options = {}
  options = args.pop if args.last.is_a? Hash

  if args.length > 0
    warn 'Dor::IndexingService.reindex_pid with primitive arguments is deprecated; pass e.g. { logger: logger, raise_errors: bool } instead'
    index_logger, should_raise_errors = args
    index_logger ||= default_index_logger
    should_raise_errors = true if should_raise_errors.nil?
  else
    index_logger = options.fetch(:logger, default_index_logger)
    should_raise_errors = options.fetch(:raise_errors, true)
  end

  obj = nil
  solr_doc = nil

  # benchmark how long it takes to load the object
  load_stats = Benchmark.measure('load_instance') do
    obj = Dor.load_instance pid
  end.format('%n realtime %rs total CPU %ts').gsub(/[\(\)]/, '')

  # benchmark how long it takes to convert the object to a Solr document
  to_solr_stats = Benchmark.measure('to_solr') do
    solr_doc = reindex_object obj, options
  end.format('%n realtime %rs total CPU %ts').gsub(/[\(\)]/, '')

  index_logger.info "successfully updated index for #{pid} (metrics: #{load_stats}; #{to_solr_stats})"

  solr_doc
rescue StandardError => se
  if se.is_a? ActiveFedora::ObjectNotFoundError
    index_logger.warn "failed to update index for #{pid}, object not found in Fedora"
  else
    index_logger.warn "failed to update index for #{pid}, unexpected StandardError, see main app log: #{se.backtrace}"
  end
  raise se if should_raise_errors
rescue Exception => ex
  index_logger.error "failed to update index for #{pid}, unexpected Exception, see main app log: #{ex.backtrace}"
  raise ex # don't swallow anything worse than StandardError
end
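
The three call signatures come down to how the trailing arguments are interpreted. The sketch below isolates just that step; `parse_reindex_args` and `DEFAULT_LOGGER` are hypothetical names introduced for illustration, and the deprecation warning is omitted.

```ruby
# Placeholder standing in for the value of default_index_logger.
DEFAULT_LOGGER = :default_logger

# Hypothetical helper isolating how reindex_pid reads the arguments after `pid`.
def parse_reindex_args(*args)
  options = args.last.is_a?(Hash) ? args.pop : {}

  if !args.empty?
    # Deprecated positional form: (index_logger, should_raise_errors)
    logger, raise_errors = args
    logger ||= DEFAULT_LOGGER
    raise_errors = true if raise_errors.nil?
  else
    # Preferred form: everything comes from the options hash
    logger = options.fetch(:logger, DEFAULT_LOGGER)
    raise_errors = options.fetch(:raise_errors, true)
  end

  { logger: logger, raise_errors: raise_errors, options: options }
end

defaults   = parse_reindex_args
by_options = parse_reindex_args(logger: :custom, raise_errors: false)
positional = parse_reindex_args(:custom, false)
nil_logger = parse_reindex_args(nil, nil)
```

In every form, a missing logger falls back to the default and a missing `raise_errors` falls back to `true`, so errors propagate unless the caller opts out.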

.reindex_pid_list(pid_list, should_commit = false) ⇒ Object

Given a list of pids, retrieves those objects from Fedora, indexes each to Solr, and optionally commits.



# File 'lib/dor/services/indexing_service.rb', line 117

def self.reindex_pid_list(pid_list, should_commit = false)
  pid_list.each { |pid| reindex_pid pid, raise_errors: false } # use the default logger, don't let individual errors nuke the rest of the batch
  ActiveFedora.solr.conn.commit if should_commit
end
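
The effect of passing `raise_errors: false` for each pid can be sketched without Fedora or Solr. Everything here is a stand-in: `reindex_one` is a hypothetical worker that fails for one made-up pid, and `map` is used instead of `each` only so the outcome of the batch is observable.

```ruby
# Hypothetical per-item worker: fails for one pid, much as a missing object would.
def reindex_one(pid, raise_errors: true)
  raise ArgumentError, "cannot load #{pid}" if pid == 'druid:missing'
  { id: pid }
rescue StandardError => e
  raise e if raise_errors
  nil # error swallowed; the batch keeps going
end

pids = ['druid:aa111bb2222', 'druid:missing', 'druid:cc333dd4444']

# With raise_errors: false, the failing pid does not stop the later ones.
results = pids.map { |pid| reindex_one(pid, raise_errors: false) }
indexed = results.compact.map { |doc| doc[:id] }

# With the default (raise_errors: true), the first failure aborts the batch.
aborted = begin
  pids.each { |pid| reindex_one(pid) }
  false
rescue ArgumentError
  true
end
```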

.reindex_pid_remotely(pid) ⇒ Object

Use the dor-indexing-app service to reindex a pid

Parameters:

  • pid (String)

    the druid

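Raises:

  • (ReindexError)

    if the request to dor-indexing-app still fails with a RestClient::Exception or Errno::ECONNREFUSED after retries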



# File 'lib/dor/services/indexing_service.rb', line 44

def self.reindex_pid_remotely(pid)
  pid = "druid:#{pid}" unless pid =~ /^druid:/
  realtime = Benchmark.realtime do
    with_retries(max_tries: 3, rescue: [RestClient::Exception, Errno::ECONNREFUSED]) do
      RestClient.post("#{Config.dor_indexing_app.url}/reindex/#{pid}", '')
    end
  end
  default_index_logger.info "successfully updated index for #{pid} in #{'%.3f' % realtime}s"
rescue RestClient::Exception, Errno::ECONNREFUSED => e
  msg = "failed to reindex #{pid}: #{e}"
  default_index_logger.error msg
  raise ReindexError.new(msg)
rescue StandardError => e
  default_index_logger.error "failed to reindex #{pid}: #{e}"
  raise
end
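
The control flow above (normalize the pid, retry a bounded number of times, then wrap the failure) can be sketched in plain Ruby. This is an illustration under stated assumptions: `normalize_druid` and `post_with_retries` are hypothetical names, the hand-written `retry` loop stands in for the `with_retries` helper, the block stands in for the `RestClient.post` call, and the `ReindexError` defined here is a local placeholder whose superclass is a guess.

```ruby
# Local placeholder for Dor::IndexingService::ReindexError.
class ReindexError < RuntimeError; end

# Prefix bare identifiers with "druid:", leaving already-prefixed ones alone.
def normalize_druid(pid)
  pid =~ /^druid:/ ? pid : "druid:#{pid}"
end

# Hypothetical retry wrapper: yields the normalized druid up to max_tries times,
# then converts a persistent connection failure into a ReindexError.
def post_with_retries(pid, max_tries: 3)
  druid = normalize_druid(pid)
  tries = 0
  begin
    tries += 1
    yield druid
  rescue Errno::ECONNREFUSED => e
    retry if tries < max_tries
    raise ReindexError, "failed to reindex #{druid}: #{e}"
  end
end

# Succeeds on the third attempt.
attempts = 0
result = post_with_retries('ab123cd4567') do |druid|
  attempts += 1
  raise Errno::ECONNREFUSED if attempts < 3
  "reindexed #{druid}"
end

# Never succeeds: the connection error surfaces as ReindexError.
failure = begin
  post_with_retries('druid:ab123cd4567') { raise Errno::ECONNREFUSED }
rescue ReindexError => e
  e
end
```

Wrapping the transport-level exceptions in one error class lets callers rescue a single type without depending on the HTTP client in use.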