Class: SolrjWrapper

Inherits:
Object
Defined in:
lib/solrj_wrapper.rb,
lib/solrj_wrapper/version.rb

Overview

Methods required to interact with SolrJ objects, such as org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer

Constant Summary

VERSION = "0.0.2"


Constructor Details

#initialize(solrj_jar_dir, solr_url, queue_size, num_threads, log_level = Logger::INFO, log_file = STDERR) ⇒ SolrjWrapper

Returns a new instance of SolrjWrapper.

Parameters:

  • solrj_jar_dir

    directory containing the SolrJ jars to load

  • solr_url

    base URL of the Solr instance

  • queue_size

    the number of Solr documents to buffer before writing to Solr

  • num_threads

    the number of threads to use when writing to Solr (should not exceed the number of available CPU cores)

  • log_level (defaults to: Logger::INFO)

    level of Logger messages to output

  • log_file (defaults to: STDERR)

    file to receive Logger output
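
The num_threads guidance (no more threads than available CPU cores) can be applied before calling the constructor. The following is a minimal sketch and not part of SolrjWrapper: the helper name capped_num_threads is hypothetical, and it assumes Ruby's standard Etc.nprocessors is an acceptable way to count cores.

```ruby
require 'etc'

# Hypothetical helper (not part of SolrjWrapper): clamp a requested thread
# count to the range 1..cores so it never exceeds the available CPU cores.
def capped_num_threads(requested, cores = Etc.nprocessors)
  [[requested, cores].min, 1].max
end

# The result could then be passed as the num_threads constructor argument.
num_threads = capped_num_threads(16)
puts "using #{num_threads} thread(s)"
```

Passing the core count as an optional second argument keeps the helper easy to check on any machine.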



# File 'lib/solrj_wrapper.rb', line 18

def initialize(solrj_jar_dir, solr_url, queue_size, num_threads, log_level=Logger::INFO, log_file=STDERR)
  if not defined? JRUBY_VERSION
    raise "SolrjWrapper only runs under jruby"
  end
  @logger = Logger.new(log_file)
  @logger.level = log_level
  load_solrj(solrj_jar_dir)
  @query_server = org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.new(solr_url)
  @streaming_update_server = org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer.new(solr_url, queue_size, num_threads)
  useJavabin!
end
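
As the constructor source shows, log_level and log_file are handed to Ruby's standard Logger. The sketch below illustrates the effect of the default Logger::INFO level. It uses a StringIO in place of STDERR so the output can be inspected, and the messages and variable names are illustrative only.

```ruby
require 'logger'
require 'stringio'

log_file = StringIO.new      # stands in for STDERR or a log file
logger = Logger.new(log_file)
logger.level = Logger::INFO  # same default as the constructor

logger.debug('below INFO, so not written')
logger.info('updating Solr document doc1')
logger.error('SolrException while indexing document doc2')

puts log_file.string
```

Passing Logger::DEBUG as log_level would also let the first message through.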

Instance Attribute Details

#query ⇒ Object

Returns the value of attribute query.



# File 'lib/solrj_wrapper.rb', line 10

def query
  @query
end

#query_server ⇒ Object (readonly)

Returns the value of attribute query_server.



# File 'lib/solrj_wrapper.rb', line 9

def query_server
  @query_server
end

#streaming_update_server ⇒ Object (readonly)

Returns the value of attribute streaming_update_server.



# File 'lib/solrj_wrapper.rb', line 9

def streaming_update_server
  @streaming_update_server
end

Instance Method Details

#add_doc_to_ix(solr_input_doc, id) ⇒ Object

Adds the doc to Solr by calling add on the SolrJ StreamingUpdateSolrServer object. A SolrException is logged rather than re-raised.

Parameters:

  • solr_input_doc
    • the SolrInputDocument to be added to the Solr index

  • id
    • the id of the Solr document, used for log messages



# File 'lib/solrj_wrapper.rb', line 86

def add_doc_to_ix(solr_input_doc, id)
  unless solr_input_doc.nil?
    begin
      @streaming_update_server.add(solr_input_doc)
      @logger.info("updating Solr document #{id}")        
    rescue org.apache.solr.common.SolrException => e 
      @logger.error("SolrException while indexing document #{id}")
      @logger.error("#{e.message}")
      @logger.error("#{e.backtrace}")
    end
  end
end
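
add_doc_to_ix logs a SolrException and returns rather than re-raising, so one failing document does not stop an indexing run. The sketch below reproduces that log-and-continue pattern in plain Ruby. FlakyServer and IndexingError are hypothetical stand-ins for the SolrJ server and org.apache.solr.common.SolrException, which are only available under JRuby.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for org.apache.solr.common.SolrException.
class IndexingError < StandardError; end

# Hypothetical stand-in for the SolrJ server: rejects documents lacking an id.
class FlakyServer
  attr_reader :added

  def initialize
    @added = []
  end

  def add(doc)
    raise IndexingError, 'missing id field' unless doc.key?('id')

    @added << doc
  end
end

# Same shape as add_doc_to_ix: skip nil docs, log failures, never re-raise.
def add_doc(server, logger, doc, id)
  return if doc.nil?

  server.add(doc)
  logger.info("updating Solr document #{id}")
rescue IndexingError => e
  logger.error("error while indexing document #{id}: #{e.message}")
end

log = StringIO.new
logger = Logger.new(log)
server = FlakyServer.new
add_doc(server, logger, { 'id' => 'a1' }, 'a1')
add_doc(server, logger, { 'title' => 'no id' }, 'a2') # logged, not raised
add_doc(server, logger, nil, 'a3')                    # silently skipped
puts log.string
```

Because failures surface only in the log, callers that need to know which documents failed would have to inspect the Logger output.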

#add_val_to_fld(solr_input_doc, fld_name, value) ⇒ Object

Given a SolrInputDocument, adds the value to the named field, creating the field if it does not yet exist. Empty values and values already present in the field are not added.

Parameters:

  • solr_input_doc
    • the SolrInputDocument object receiving a new field value

  • fld_name
    • the name of the Solr field

  • value
    • the value to add to the Solr field



# File 'lib/solrj_wrapper.rb', line 61

def add_val_to_fld(solr_input_doc, fld_name, value)
  if !solr_input_doc.nil? && !fld_name.nil? && fld_name.size > 0 && !value.nil? && value.size > 0
    # the outer condition already guarantees solr_input_doc is non-nil
    if !solr_input_doc[fld_name].nil?
      existing_vals = solr_input_doc[fld_name].getValues
    end
    if existing_vals.nil? || !existing_vals.contains(value)
      solr_input_doc.addField(fld_name, value, 1.0)
    end
  end
end
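
The skip-empty and skip-duplicate rules can be illustrated without JRuby or a Solr instance. The sketch below rests on stated assumptions: FakeDoc is a hypothetical plain-Ruby class that only mimics a mapping from field name to values, and add_val mirrors the guard logic of add_val_to_fld rather than calling SolrJ.

```ruby
# Hypothetical stand-in for a SolrInputDocument: field name => array of values.
# It is NOT the SolrJ class; it exists only to make the rules runnable anywhere.
class FakeDoc
  attr_reader :fields

  def initialize
    @fields = {}
  end
end

# Mirrors the guards in add_val_to_fld: ignore nil/empty input and duplicates.
def add_val(doc, fld_name, value)
  return if doc.nil? || fld_name.nil? || fld_name.empty?
  return if value.nil? || value.empty?
  return if doc.fields.fetch(fld_name, []).include?(value)

  (doc.fields[fld_name] ||= []) << value
end

doc = FakeDoc.new
add_val(doc, 'title', 'Moby Dick')
add_val(doc, 'title', 'Moby Dick') # duplicate, ignored
add_val(doc, 'title', '')          # empty, ignored
add_val(doc, 'title', 'The Whale')
p doc.fields['title']
```

Only the two distinct, non-empty titles remain in the field after the four calls.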

#add_vals_to_fld(solr_input_doc, fld_name, val_array) ⇒ Object

Given a SolrInputDocument, adds each value in the array to the named field, creating the field if it does not yet exist. Empty values and values already present in the field are not added.

Parameters:

  • solr_input_doc
    • the SolrInputDocument object receiving a new field value

  • fld_name
    • the name of the Solr field

  • val_array
    • an array of values for the Solr field



# File 'lib/solrj_wrapper.rb', line 49

def add_vals_to_fld(solr_input_doc, fld_name, val_array)
  unless val_array.nil? || solr_input_doc.nil? || fld_name.nil?
    val_array.each { |value|  
      add_val_to_fld(solr_input_doc, fld_name, value)
    }
  end
end

#commit ⇒ Object

Sends a commit to the SolrJ StreamingUpdateSolrServer object. A SolrException is logged rather than re-raised.



# File 'lib/solrj_wrapper.rb', line 100

def commit
  begin
    update_response = @streaming_update_server.commit
  rescue org.apache.solr.common.SolrException => e
    @logger.error("SolrException while committing updates")
    @logger.error("#{e.message}")
    @logger.error("#{e.backtrace}")
  end
end

#empty_ix ⇒ Object

Removes all docs from the Solr index, then commits. Assumes the default request handler has type dismax.



# File 'lib/solrj_wrapper.rb', line 111

def empty_ix
  delete_response = @streaming_update_server.deleteByQuery("*:*")
  commit
end

#get_query_result_docs(query_obj) ⇒ Object

Sends the query to Solr and returns the SolrDocumentList from the response.

Parameters:

  • query_obj (org.apache.solr.client.solrj.SolrQuery)

    object populated with query information to send to Solr

Returns:

  • Java::OrgApacheSolrCommon::SolrDocumentList per the query. The list size will be the number of rows in the Solr response



# File 'lib/solrj_wrapper.rb', line 33

def get_query_result_docs(query_obj)
  response = @query_server.query(query_obj)
  response.getResults
end

#replace_field_values(solr_input_doc, fld_name, val_array) ⇒ Object

Given a SolrInputDocument, replaces all the values of the field with the new values.

If the values to be added are an empty array, the field will be removed.
If the field doesn't exist in the document, it will be created (if the value array isn't empty).

Parameters:

  • solr_input_doc
    • the SolrInputDocument object receiving a new field value

  • fld_name
    • the name of the Solr field

  • val_array
    • an array of values for the Solr field



# File 'lib/solrj_wrapper.rb', line 78

def replace_field_values(solr_input_doc, fld_name, val_array)
  solr_input_doc.removeField(fld_name)
  add_vals_to_fld(solr_input_doc, fld_name, val_array)
end
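
The remove-then-add behaviour described above can be sketched in plain Ruby. In this illustration a Hash (field name => array of values) stands in for the SolrInputDocument, and replace_values is a hypothetical helper that imitates the documented outcome rather than calling SolrJ.

```ruby
# A plain Hash (field name => array of values) stands in for a
# SolrInputDocument here; this is an illustration, not the SolrJ API.
def replace_values(doc, fld_name, val_array)
  doc.delete(fld_name) # corresponds to removeField
  return doc if val_array.nil?

  # Re-add non-empty values, dropping duplicates while keeping order.
  kept = val_array.reject { |v| v.nil? || v.empty? }.uniq
  doc[fld_name] = kept unless kept.empty?
  doc
end

doc = { 'author' => ['Melville, H.'], 'title' => ['Moby Dick'] }
replace_values(doc, 'author', ['Melville, Herman', 'Melville, Herman'])
replace_values(doc, 'title', [])            # empty array removes the field
replace_values(doc, 'subject', ['Whaling']) # absent field is created
p doc
```

The three calls cover the documented cases: values replaced, field removed by an empty array, and field created when absent.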

#useJavabin! ⇒ Object

Sends requests using the Javabin binary format instead of serializing to XML. Requires /update/javabin to be defined in solrconfig.xml as <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />



# File 'lib/solrj_wrapper.rb', line 41

def useJavabin!
  @streaming_update_server.setRequestWriter Java::org.apache.solr.client.solrj.impl.BinaryRequestWriter.new
end