Class: DiscoveryIndexer::Writer::SolrClient

Inherits:
Object
Includes:
Logging
Defined in:
lib/writer/solr_client.rb

Overview

Processes adds and deletes to the solr core

Class Method Details

.add(id, solr_doc, solr_connector, max_retries = 10) ⇒ Object

Add the document to solr, retry if an error occurs. See github.com/ooyala/retries for docs on with_retries.

Parameters:

  • id (String)

    the document id, usually a druid

  • solr_doc (Hash)

    a Hash representation of the solr document

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

  • max_retries (Integer) (defaults to: 10)

    the maximum number of attempts before failing



# File 'lib/writer/solr_client.rb', line 16

def self.add(id, solr_doc, solr_connector, max_retries = 10)
  process(id, solr_doc, solr_connector, max_retries, false)
end

.allow_update?(solr_connector) ⇒ Boolean

Returns true if the solr core allows the update feature.

Parameters:

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

Returns:

  • (Boolean)

    true if the solr core allows the update feature



# File 'lib/writer/solr_client.rb', line 61

def self.allow_update?(solr_connector)
  solr_connector.options.include?(:allow_update) ? solr_connector.options[:allow_update] : false
end
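
The check above reads an optional key and falls back to false when it is absent. A minimal sketch of the same decision, assuming the connector's options behave like a plain Hash; `FakeConnector` and `allow_update_sketch?` are hypothetical names used only so the snippet runs without a solr server.

```ruby
# Hypothetical stand-in for RSolr::Client: it only exposes an options Hash.
FakeConnector = Struct.new(:options)

# Same decision as allow_update?, written with Hash#fetch and a default.
def allow_update_sketch?(connector)
  connector.options.fetch(:allow_update, false)
end

with_flag    = FakeConnector.new({ url: 'http://localhost:8983/solr/core', allow_update: true })
without_flag = FakeConnector.new({ url: 'http://localhost:8983/solr/core' })

puts allow_update_sketch?(with_flag)     # true
puts allow_update_sketch?(without_flag)  # false
```

The update path is therefore opt-in: a connector created without an :allow_update option always takes the plain add path.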

.commit(solr_connector) ⇒ Object

Sends a hard commit to solr.

Parameters:

  • solr_connector (RSolr::Client)

    is an open connection with the solr core



# File 'lib/writer/solr_client.rb', line 75

def self.commit(solr_connector)
  RestClient.post self.solr_url(solr_connector), {}, :content_type => :json, :accept => :json
end

.delete(id, solr_connector, max_retries = 10) ⇒ Object

Delete the document from solr, retry if an error occurs. See github.com/ooyala/retries for docs on with_retries.

Parameters:

  • id (String)

    the document id, usually a druid

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

  • max_retries (Integer) (defaults to: 10)

    the maximum number of attempts before failing



# File 'lib/writer/solr_client.rb', line 25

def self.delete(id, solr_connector, max_retries = 10)
  process(id, {}, solr_connector, max_retries, true)
end

.doc_exists?(id, solr_connector) ⇒ Boolean

Returns true if the solr doc defined by this id exists.

Parameters:

  • id (String)

    the document id, usually a druid

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

Returns:

  • (Boolean)

    true if the solr doc defined by this id exists



# File 'lib/writer/solr_client.rb', line 68

def self.doc_exists?(id, solr_connector)
  response = solr_connector.get 'select', params: { q: 'id:"' + id + '"' }
  response['response']['numFound'] == 1
end
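
The existence check relies only on the `numFound` counter of the select response. A minimal sketch of that logic, assuming a response shaped like the Hash used above; `StubConnector` and `doc_exists_sketch?` are hypothetical and stand in for a real RSolr connection.

```ruby
# Hypothetical stub that mimics the part of a solr select response used here.
class StubConnector
  def initialize(known_ids)
    @known_ids = known_ids
  end

  # Returns a Hash shaped like a solr select response for an id:"..." query.
  def get(_path, params:)
    id = params[:q][/\Aid:"(.*)"\z/, 1]
    { 'response' => { 'numFound' => @known_ids.count(id) } }
  end
end

# Same test as doc_exists?: exactly one matching document must be found.
def doc_exists_sketch?(id, connector)
  response = connector.get('select', params: { q: %(id:"#{id}") })
  response['response']['numFound'] == 1
end

connector = StubConnector.new(%w[ab123cd4567])
puts doc_exists_sketch?('ab123cd4567', connector)  # true
puts doc_exists_sketch?('zz999zz9999', connector)  # false
```

Because the comparison is `== 1` rather than `> 0`, an id that matches more than one document is reported as not existing.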

.process(id, solr_doc, solr_connector, max_retries, is_delete = false) ⇒ Object

An internal method that receives all requests and talks to the solr core. Depending on its arguments it deletes, updates, or adds the document.

Parameters:

  • id (String)

    the document id, usually a druid

  • solr_doc (Hash)

    is the solr doc in hash format

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

  • max_retries (Integer)

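    the maximum number of attempts before failing

  • is_delete (Boolean) (defaults to: false)

    true to delete the document by id; false to add or update it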



# File 'lib/writer/solr_client.rb', line 36

def self.process(id, solr_doc, solr_connector, max_retries, is_delete = false)
  handler = proc do |exception, attempt_number, _total_delay|
    DiscoveryIndexer::Logging.logger.debug "#{exception.class} on attempt #{attempt_number} for #{id}"
  end

  with_retries(max_tries: max_retries, handler: handler, base_sleep_seconds: 1, max_sleep_seconds: 5) do |attempt|
    DiscoveryIndexer::Logging.logger.debug "Attempt #{attempt} for #{id}"

    if is_delete
      DiscoveryIndexer::Logging.logger.info "Deleting #{id} on attempt #{attempt}"
      solr_connector.delete_by_id(id, :add_attributes => {:commitWithin => 10000})
    elsif allow_update?(solr_connector) && doc_exists?(id, solr_connector)
      DiscoveryIndexer::Logging.logger.info "Updating #{id} on attempt #{attempt}"
      update_solr_doc(id, solr_doc, solr_connector)
    else
      DiscoveryIndexer::Logging.logger.info "Indexing #{id} on attempt #{attempt}"
      solr_connector.add(solr_doc, :add_attributes => {:commitWithin => 10000})
    end
    #solr_connector.commit
    DiscoveryIndexer::Logging.logger.info "Completing #{id} successfully on attempt #{attempt}"
  end
end
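
The retry loop above comes from the retries gem. To illustrate the general pattern only, here is a simplified, self-contained sketch: `simple_with_retries` is a hypothetical stand-in, not the gem's implementation, and it assumes plain exponential backoff without jitter.

```ruby
# Hypothetical, simplified stand-in for with_retries from the retries gem.
# Yields the attempt number, retries on StandardError, and re-raises the
# last error once max_tries attempts have failed.
def simple_with_retries(max_tries:, base_sleep_seconds: 0, max_sleep_seconds: 0, handler: nil)
  attempt = 0
  total_delay = 0.0
  begin
    attempt += 1
    yield attempt
  rescue StandardError => e
    raise if attempt >= max_tries
    handler&.call(e, attempt, total_delay)
    # Exponential backoff, capped at max_sleep_seconds.
    delay = [base_sleep_seconds * (2**(attempt - 1)), max_sleep_seconds].min
    sleep(delay)
    total_delay += delay
    retry
  end
end

failures = []
handler = proc { |exception, attempt_number, _total_delay| failures << [exception.class, attempt_number] }

# A block that fails twice and then succeeds, like a flaky solr request.
result = simple_with_retries(max_tries: 5, handler: handler) do |attempt|
  raise IOError, 'solr unavailable' if attempt < 3
  "indexed on attempt #{attempt}"
end

puts result           # indexed on attempt 3
puts failures.length  # 2
```

In this sketch the handler receives the same three arguments as the proc in `process` (exception, attempt number, total delay), which is why logging the exception class per attempt fits naturally there.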

.solr_url(solr_connector) ⇒ String

Builds the solr update URL (with commit=true) so that it works whether or not the configured URL has a trailing /.

Parameters:

  • solr_connector (RSolr::Client)

    is an open connection with the solr core

Returns:

  • (String)

    the solr URL



# File 'lib/writer/solr_client.rb', line 102

def self.solr_url(solr_connector)
  solr_url = solr_connector.options[:url]
  if solr_url.end_with?('/')
    "#{solr_url}update?commit=true"
  else
    "#{solr_url}/update?commit=true"
  end
end
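
The effect of the trailing-slash handling can be seen on bare URL strings. A minimal sketch, assuming a local solr URL chosen purely for illustration; `update_url_for` is a hypothetical helper, not part of the class.

```ruby
# Hypothetical helper mirroring solr_url on a plain String:
# drop one trailing slash if present, then append the update path.
def update_url_for(base_url)
  "#{base_url.chomp('/')}/update?commit=true"
end

puts update_url_for('http://localhost:8983/solr/core')   # http://localhost:8983/solr/core/update?commit=true
puts update_url_for('http://localhost:8983/solr/core/')  # http://localhost:8983/solr/core/update?commit=true
```

Both spellings of the base URL produce the same endpoint, so callers do not need to normalize the configured URL themselves.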

.update_solr_doc(id, solr_doc, solr_connector) ⇒ Object

It is an internal method that updates the solr doc instead of adding a new one.

Parameters:

  • id (String)

    the document id, usually a druid

  • solr_doc (Hash)

    is the solr doc in hash format

  • solr_connector (RSolr::Client)

    is an open connection with the solr core



# File 'lib/writer/solr_client.rb', line 83

def self.update_solr_doc(id, solr_doc, solr_connector)
  # update_solr_doc can't use RSolr because updating hash doc is not supported
  #  so we need to build the json input manually
  params = "[{\"id\":\"#{id}\","
  solr_doc.each do |field_name, new_values|
    next if field_name == :id
    params += "\"#{field_name}\":"
    new_values = [new_values] unless new_values.class == Array
    new_values = new_values.map { |s| s.to_s.gsub('\\', '\\\\\\').gsub('"', '\"').strip } # strip leading/trailing spaces and escape quotes for each value
    params += "{\"set\":[\"#{new_values.join('","')}\"]},"
  end
  params.chomp!(',')
  params += '}]'
  RestClient.post self.solr_url(solr_connector), params, content_type: :json, accept: :json
end
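
The method above assembles an atomic-update payload in which every field except id is wrapped in a "set" operation. As an alternative sketch, not the library's code, the same payload shape can be produced with Ruby's standard json library, which takes care of escaping. `atomic_update_payload` and the sample field names are hypothetical.

```ruby
require 'json'

# Hypothetical helper: builds the same payload shape as update_solr_doc,
# but lets JSON.generate handle quoting and escaping.
def atomic_update_payload(id, solr_doc)
  doc = { 'id' => id }
  solr_doc.each do |field_name, new_values|
    next if field_name == :id
    values = new_values.is_a?(Array) ? new_values : [new_values]
    # Stringify and strip each value, then wrap the list in a "set" operation.
    doc[field_name.to_s] = { 'set' => values.map { |value| value.to_s.strip } }
  end
  JSON.generate([doc])
end

sample_doc = { id: 'ab123cd4567', title_ssi: ' A "quoted" title ', format: %w[Book Map] }
puts atomic_update_payload('ab123cd4567', sample_doc)
```

The printed string is a one-element JSON array whose object carries the id plus one `{"set": [...]}` entry per remaining field, which is the body that gets posted to the update URL.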