Solrb

Object-Oriented approach to Solr in Ruby.

Installation

Add solrb to your Gemfile:

gem 'solrb'

If you are going to use solrb with Solr Cloud, add the zk gem as well:

gem 'zk' # required for solrb solr-cloud integration
gem 'solrb'

Configuration

Setting Solr URL via environment variable

The simplest way to use Solrb is to set the SOLR_URL environment variable. The URL must include the core name:

  ENV['SOLR_URL'] = 'http://localhost:8983/solr/demo'

You can also use Solr.configure to specify the solr URL explicitly:

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr/demo'
end
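
Since the URL in this setup is expected to end with the core name, it can be handy to check what a given value resolves to before handing it to Solrb. The following is a minimal plain-Ruby sketch, not part of Solrb: the helper name `split_solr_url` is hypothetical, and it assumes the core name is the last path segment of the URL.

```ruby
require 'uri'

# Hypothetical helper (not part of Solrb): splits a Solr URL into the
# base URL and the core name, assuming the core is the last path segment.
def split_solr_url(url)
  uri = URI.parse(url)
  segments = uri.path.split('/').reject(&:empty?)
  core = segments.pop
  base_uri = uri.dup
  base_uri.path = '/' + segments.join('/')
  { base_url: base_uri.to_s, core: core }
end

parts = split_solr_url('http://localhost:8983/solr/demo')
puts parts[:base_url] # http://localhost:8983/solr
puts parts[:core]     # demo
```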

Note that fields that are not configured are passed to Solr as-is, so you only need to define a field in the configuration if you want Solrb to modify it at runtime.

Single core configuration

Use Solr.configure for additional configuration:

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr/demo'

  # This gem uses faraday to make requests to Solr. You can specify additional faraday
  # options here.
  config.faraday_options = {}

  # Core's URL is 'http://localhost:8983/solr/demo'
  # Adding fields to work with
  config.define_core do |f|
    f.field :title, dynamic_field: :text
    f.dynamic_field :text, solr_name: '*_text'
  end
end

Multiple core configuration

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr'

  # Define a core with fields that will be used with Solr.
  # Core URL is 'http://localhost:8983/solr/listings'
  config.define_core(name: :listings) do |f|
    # When a dynamic_field is present, the field name will be mapped to match the dynamic field.
    # Here, "title" will be mapped to "title_text"
    # You must define a dynamic field to be able to use the dynamic_field option
    f.field :title, dynamic_field: :text

    # When solr_name is present, the field name will be mapped to the solr_name at runtime
    f.field :tags, solr_name: :tags_array

    # define a dynamic field
    f.dynamic_field :text, solr_name: '*_text'
  end

  # Pass `default: true` to use one core as a default.
  # Core's URL is 'http://localhost:8983/solr/cars'
  config.define_core(name: :cars, default: true) do |f|
    f.field :manufacturer, solr_name: :manuf_s
    f.field :model, solr_name: :model_s
  end
end
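
The name mapping described in the comments above can be modelled in a few lines of plain Ruby. This is only an illustration of the rules (unconfigured fields pass through, solr_name replaces the name, dynamic_field substitutes the name into the pattern). The hashes and the `solr_field_name` method are hypothetical and do not reflect how Solrb is implemented internally.

```ruby
# Illustration only: a plain-Ruby model of the field name mapping.
# `dynamic_fields`, `fields` and `solr_field_name` are hypothetical names.
def solr_field_name(name, fields:, dynamic_fields:)
  config = fields[name]
  return name.to_s if config.nil?                      # unconfigured: passed as-is
  return config[:solr_name].to_s if config[:solr_name] # explicit solr_name
  pattern = dynamic_fields.fetch(config[:dynamic_field])
  pattern.sub('*', name.to_s)                          # '*_text' becomes 'title_text'
end

dynamic_fields = { text: '*_text' }
fields = {
  title: { dynamic_field: :text },
  tags:  { solr_name: :tags_array }
}

puts solr_field_name(:title, fields: fields, dynamic_fields: dynamic_fields) # title_text
puts solr_field_name(:tags, fields: fields, dynamic_fields: dynamic_fields)  # tags_array
puts solr_field_name(:price, fields: fields, dynamic_fields: dynamic_fields) # price
```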

Warning: Solrb doesn't support fields with the same name. If you have two fields with the same name mapping to a single solr field, you'll have to rename one of the fields.

...
config.define_core do |f|
  ...
  # Not allowed: Two fields with same name 'title'
  f.field :title, solr_name: :article_title
  f.field :title, solr_name: :page_title
end
...

Solr Cloud

To enable Solr Cloud mode, you must define the ZooKeeper URLs in the Solr config block. In Solr Cloud mode you don't need to provide a Solr URL (config.url or ENV['SOLR_URL']). Solrb watches the ZooKeeper state to receive up-to-date information about active Solr nodes, including their URLs.

You can also specify the ACL credentials for ZooKeeper.

Solr.configure do |config|
  config.zookeeper_urls = ['localhost:2181', 'localhost:2182', 'localhost:2183']
  config.zookeeper_auth_user = 'zk_acl_user'
  config.zookeeper_auth_password = 'zk_acl_password'
end

If you are using the Puma web server in clustered mode, you must call enable_solr_cloud! in the on_worker_boot callback so that each Puma worker connects to ZooKeeper.

on_worker_boot do
  Solr.enable_solr_cloud!
end

Basic Authentication

Solrb supports HTTP Basic authentication. Enable it by providing auth_user and auth_password in the config block.

Solr.configure do |config|
  config.auth_user = 'user'
  config.auth_password = 'password'
end
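
For reference, this is what Basic authentication credentials look like on the wire. The sketch below is plain Ruby and independent of Solrb (the `basic_auth_header` helper is hypothetical). It only shows the header format defined by RFC 7617, which the HTTP client would normally build for you.

```ruby
# Builds the value of the HTTP Authorization header for Basic authentication
# (RFC 7617): "Basic " followed by base64("user:password").
# Hypothetical helper, shown for illustration only.
def basic_auth_header(user, password)
  token = ["#{user}:#{password}"].pack('m0') # 'm0' is Base64 without line breaks
  "Basic #{token}"
end

puts basic_auth_header('user', 'password')
# prints: Basic dXNlcjpwYXNzd29yZA==
```

The credentials are only encoded, not encrypted, so Basic authentication should be combined with HTTPS.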

Indexing

# creates a single document and commits it to index
doc = Solr::Indexing::Document.new
doc.add_field(:id, 1)
doc.add_field(:name, 'Solrb!!!')

request = Solr::Indexing::Request.new(documents: [doc])
request.run(commit: true)

You can also create an indexing document directly from attributes:

doc = Solr::Indexing::Document.new(id: 5, name: 'John')

Querying

Simple Query

  query_field = Solr::Query::Request::FieldWithBoost.new(field: :name)

  request = Solr::Query::Request.new(search_term: 'term', query_fields: [query_field])
  request.run(page: 1, page_size: 10)
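
Solr itself paginates with the `start` (zero-based offset) and `rows` (page length) parameters. The sketch below shows the usual translation from a 1-based page number. It assumes that this is how `page` and `page_size` are interpreted. The `pagination_params` helper is hypothetical and not part of Solrb.

```ruby
# Assumption: page numbers are 1-based and are translated to Solr's
# zero-based `start` offset and `rows` page length.
def pagination_params(page:, page_size:)
  raise ArgumentError, 'page must be >= 1' if page < 1
  raise ArgumentError, 'page_size must be >= 0' if page_size.negative?

  { start: (page - 1) * page_size, rows: page_size }
end

p pagination_params(page: 1, page_size: 10) # first page starts at offset 0
p pagination_params(page: 3, page_size: 10) # third page starts at offset 20
```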

Querying multiple cores

For multi-core configuration use Solr.with_core block:

Solr.with_core(:models) do
  Solr.delete_by_id(3242343)
  Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  Solr::Indexing::Request.new(documents: [doc])
end
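
A block-scoped switch like this is commonly built by storing the active core for the duration of the block and restoring the previous value afterwards. The sketch below illustrates that pattern in plain Ruby. The `current_core` and `with_core` methods here are hypothetical stand-ins and are not Solrb's implementation.

```ruby
# Hypothetical helpers (not Solrb's implementation): keep the active core in
# thread-local storage and restore the previous value when the block exits.
def current_core
  Thread.current[:sketch_core] || :default
end

def with_core(name)
  previous = Thread.current[:sketch_core]
  Thread.current[:sketch_core] = name
  yield
ensure
  # Runs even if the block raises, so the outer core is always restored.
  Thread.current[:sketch_core] = previous
end

with_core(:listings) do
  puts current_core                      # listings
  with_core(:cars) { puts current_core } # cars
  puts current_core                      # listings
end
puts current_core                        # default
```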

Query with field boost

  query_fields = [
    # Use boost_magnitude argument to apply boost to a specific field that you query
    Solr::Query::Request::FieldWithBoost.new(field: :name, boost_magnitude: 16),
    Solr::Query::Request::FieldWithBoost.new(field: :title)
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.run(page: 1, page_size: 10)
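
In Solr's DisMax and eDisMax query parsers, per-field boosts are expressed in the `qf` parameter as `field^boost`. The sketch below shows how a list of fields could be rendered into that syntax. It is an illustration only: the `qf_param` helper is hypothetical, and whether Solrb produces exactly this string is an assumption.

```ruby
# Hypothetical helper: renders fields into Solr's `qf` syntax, where a boost
# is appended as "^<magnitude>" and unboosted fields are listed by name only.
def qf_param(fields)
  fields.map do |field|
    field[:boost] ? "#{field[:name]}^#{field[:boost]}" : field[:name].to_s
  end.join(' ')
end

puts qf_param([{ name: :name, boost: 16 }, { name: :title }])
# prints: name^16 title
```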

Query with filtering

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :title)
  ]
  filters = [Solr::Query::Request::Filter.new(type: :equal, field: :title, value: 'A title')]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields, filters: filters)
  request.run(page: 1, page_size: 10)

AND and OR filters

  usa_filter =
    Solr::Query::Request::AndFilter.new(
      Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'USA'),
      Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Idaho')
    )
  canada_filter =
    Solr::Query::Request::AndFilter.new(
      Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'Canada'),
      Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Alberta')
    )

  location_filters = Solr::Query::Request::OrFilter.new(usa_filter, canada_filter)
  request = Solr::Query::Request.new(search_term: 'term', filters: location_filters)
  request.run(page: 1, page_size: 10)
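
Nested AND/OR filters like these correspond to a boolean expression in Solr's query syntax. The plain-Ruby sketch below renders a nested filter description into such an expression. The hash layout and the `render_filter` method are hypothetical and only meant to show the shape of the resulting filter query, not Solrb's actual output.

```ruby
# Hypothetical renderer: turns a nested filter description into a boolean
# expression in Solr query syntax. Not part of Solrb.
def render_filter(filter)
  case filter[:type]
  when :equal
    %(#{filter[:field]}:"#{filter[:value]}")
  when :and, :or
    joiner = filter[:type] == :and ? ' AND ' : ' OR '
    "(#{filter[:filters].map { |f| render_filter(f) }.join(joiner)})"
  else
    raise ArgumentError, "unknown filter type: #{filter[:type].inspect}"
  end
end

usa = { type: :and, filters: [
  { type: :equal, field: :country, value: 'USA' },
  { type: :equal, field: :region, value: 'Idaho' }
] }
canada = { type: :and, filters: [
  { type: :equal, field: :country, value: 'Canada' },
  { type: :equal, field: :region, value: 'Alberta' }
] }

puts render_filter({ type: :or, filters: [usa, canada] })
# prints: ((country:"USA" AND region:"Idaho") OR (country:"Canada" AND region:"Alberta"))
```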

Query with sorting

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :title)
  ]
  sort_fields = [Solr::Query::Request::Sorting::Field.new(name: :name, direction: :asc)]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
  request.run(page: 1, page_size: 10)

The default sorting logic is: nulls last, non-nulls first.

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name)
  ]
  sort_fields = [
    Solr::Query::Request::Sorting::Field.new(name: :is_featured, direction: :desc),
    Solr::Query::Request::Sorting::Function.new(function: "score desc")
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
  request.run(page: 1, page_size: 10)

Query with grouping

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :category)
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.grouping = Solr::Query::Request::Grouping.new(field: :category, limit: 10)
  request.run(page: 1, page_size: 10)

Query with facets

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :category)
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.facets = [Solr::Query::Request::Facet.new(type: :terms, field: :category, options: { limit: 10 })]
  request.run(page: 1, page_size: 10)
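
A terms facet with a limit resembles a request to Solr's JSON Facet API. The sketch below builds such a facet definition in plain Ruby. Whether Solrb sends exactly this structure is an assumption, and the `terms_facet` helper is hypothetical.

```ruby
require 'json'

# Hypothetical helper: builds a terms facet in the shape used by Solr's
# JSON Facet API (the value of the `json.facet` parameter).
def terms_facet(field, limit:)
  { field => { type: 'terms', field: field.to_s, limit: limit } }
end

puts JSON.generate(terms_facet(:category, limit: 10))
# prints: {"category":{"type":"terms","field":"category","limit":10}}
```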

Query with boosting functions

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :category)
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  request.boosting = Solr::Query::Request::Boosting.new(
    multiplicative_boost_functions: [Solr::Query::Request::Boosting::RankingFieldBoostFunction.new(field: :name)],
    phrase_boosts: [Solr::Query::Request::Boosting::PhraseProximityBoost.new(field: :category, boost_magnitude: 4)]
  )
  request.run(page: 1, page_size: 10)

Dictionary boosting function

Sometimes you want dictionary-style boosting. For example, given a hash (dictionary)

{3025 => 2.0, 3024 => 1.5, 3023 => 1.2}

and the field category_id, the resulting boosting function will be:

if(eq(category_id_it, 3025), 2.0, if(eq(category_id_it, 3024), 1.5, if(eq(category_id_it, 3023), 1.2, 1)))

Note that spaces were added here for readability. Real Solr query functions must always be written without spaces.

Example of usage:

  category_id_boosts = {3025 => 2.0, 3024 => 1.5, 3023 => 1.2}
  request.boosting = Solr::Query::Request::Boosting.new(
    multiplicative_boost_functions: [
      Solr::Query::Request::Boosting::DictionaryBoostFunction.new(field: :category_id,
        dictionary: category_id_boosts)
    ]
  )
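
To make the shape of the generated function concrete, the nested `if(eq(...))` expression shown earlier can be built from a hash with a simple fold. This is a plain-Ruby sketch with a hypothetical method name, not the code of DictionaryBoostFunction. It takes the Solr field name directly and uses 1 as the fallback boost, matching the expression above.

```ruby
# Hypothetical sketch: folds a dictionary into a nested Solr function query
# of the form if(eq(field,key),boost,<fallback>), without spaces.
def dictionary_boost_function(solr_field, dictionary, default: 1)
  # Fold from the last entry inwards so the first key ends up outermost.
  dictionary.to_a.reverse.reduce(default.to_s) do |fallback, (key, boost)|
    "if(eq(#{solr_field},#{key}),#{boost},#{fallback})"
  end
end

puts dictionary_boost_function('category_id_it', { 3025 => 2.0, 3024 => 1.5, 3023 => 1.2 })
# prints: if(eq(category_id_it,3025),2.0,if(eq(category_id_it,3024),1.5,if(eq(category_id_it,3023),1.2,1)))
```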

Field list

  query_fields = [
    Solr::Query::Request::FieldWithBoost.new(field: :name),
    Solr::Query::Request::FieldWithBoost.new(field: :category)
  ]
  request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  # Solr::Query::Request will return only :id field by default.
  # Specify additional return fields (fl param) by setting the request field_list
  request.field_list = [:name, :category]
  request.run(page: 1, page_size: 10)

Deleting documents

Solr.delete_by_id(3242343)
Solr.delete_by_id(3242343, commit: true)
Solr.delete_by_query('*:*')
Solr.delete_by_query('*:*', commit: true)

Active Support instrumentation

This gem publishes events via Active Support Instrumentation.

To subscribe to solrb events, add this code to an initializer:

ActiveSupport::Notifications.subscribe('request.solrb') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  if Logger::INFO == Rails.logger.level
    Rails.logger.info("Solrb #{event.duration.round(1)}ms")
  elsif Logger::DEBUG == Rails.logger.level && Rails.env.development?
    Pry::ColorPrinter.pp(event.payload)
  end
end

Testing

You can inspect the parameters of each Solr query request made through Solrb by requiring the solr/testing file in your test suite. After each request, the query parameters are available via Solr::Testing.last_solr_request_params.

require 'solr/testing'

RSpec.describe MyTest do
  let(:query) { Solr::Query::Request.new(search_term: 'Solrb') }
  it 'returns the last solr request params' do
    query.run(page: 1, page_size: 10)
    expect(Solr::Testing.last_solr_request_params).to eq({ ... })
  end
end

Running specs

This project is set up to use CI to run all specs against a real Solr instance.

If you want to run the specs locally, you can either use the CircleCI CLI or do a completely manual setup (for up-to-date steps, see the circleci config):

docker pull solr:7.7.1
docker run -it --name test-solr -p 8983:8983/tcp -t solr:7.7.1
# create a core
curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&configSet=_default'
# disable field guessing
curl http://localhost:8983/solr/test-core/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
SOLR_URL=http://localhost:8983/solr/test-core rspec