Solrb
Object-Oriented approach to Solr in Ruby.
Table of contents
- Installation
- Configuration
- Indexing
- Querying
- Deleting documents
- Active Support instrumentation
- Testing
- Running specs
Installation
Add solrb to your Gemfile:
gem 'solrb'
If you are going to use solrb with Solr Cloud, add the zk gem as well:
gem 'zk' # required for solrb solr-cloud integration
gem 'solrb'
Configuration
Setting Solr URL via environment variable
The simplest way to use Solrb is to set the SOLR_URL environment variable (the URL must include the core name):
ENV['SOLR_URL'] = 'http://localhost:8983/solr/demo'
You can also use Solr.configure to specify the Solr URL explicitly:
Solr.configure do |config|
config.url = 'http://localhost:8983/solr/demo'
end
Note that fields that are not configured are passed to Solr as-is. You only need to specify a field in the configuration if you want Solrb to modify it at runtime.
Single core configuration
Use Solr.configure for additional configuration:
Solr.configure do |config|
config.url = 'http://localhost:8983/solr/demo'
# This gem uses faraday to make requests to Solr. You can specify additional faraday
# options here.
config. = {}
# Core's URL is 'http://localhost:8983/solr/demo'
# Adding fields to work with
config.define_core do |f|
f.field :title, dynamic_field: :text
f.dynamic_field :text, solr_name: '*_text'
end
end
Multiple core configuration
Solr.configure do |config|
config.url = 'http://localhost:8983/solr'
# Define a core with fields that will be used with Solr.
# Core URL is 'http://localhost:8983/solr/listings'
config.define_core(name: :listings) do |f|
# When a dynamic_field is present, the field name will be mapped to match the dynamic field.
# Here, "title" will be mapped to "title_text"
# You must define a dynamic field to be able to use the dynamic_field option
f.field :title, dynamic_field: :text
# When solr_name is present, the field name will be mapped to the solr_name at runtime
f.field :tags, solr_name: :tags_array
# define a dynamic field
f.dynamic_field :text, solr_name: '*_text'
end
# Pass `default: true` to use one core as a default.
# Core's URL is 'http://localhost:8983/solr/cars'
config.define_core(name: :cars, default: true) do |f|
f.field :manufacturer, solr_name: :manuf_s
f.field :model, solr_name: :model_s
end
end
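To make the mapping rules in the comments above concrete, here is a minimal plain-Ruby sketch of how a field name could be resolved to its Solr name. It does not use Solrb: the `resolve_solr_name` helper and the hash layout are hypothetical and only mirror the rules described above (an explicit `solr_name`, a dynamic field pattern, or pass-through for unconfigured fields).

```ruby
# Hypothetical illustration of the field-name mapping rules; not Solrb's internals.
FIELDS = {
  title: { dynamic_field: :text },
  tags:  { solr_name: :tags_array }
}.freeze

DYNAMIC_FIELDS = { text: '*_text' }.freeze

def resolve_solr_name(name, fields: FIELDS, dynamic_fields: DYNAMIC_FIELDS)
  config = fields[name]
  return name.to_s if config.nil?                          # unconfigured: passed as-is
  return config[:solr_name].to_s if config[:solr_name]     # explicit solr_name wins

  if config[:dynamic_field]
    # Substitute the field name for the '*' in the dynamic field pattern.
    pattern = dynamic_fields.fetch(config[:dynamic_field])
    return pattern.sub('*', name.to_s)
  end

  name.to_s
end

puts resolve_solr_name(:title) # title_text
puts resolve_solr_name(:tags)  # tags_array
puts resolve_solr_name(:price) # price
```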
Warning: Solrb doesn't support fields with the same name. If you have two fields with the same name mapping to a single solr field, you'll have to rename one of the fields.
...
config.define_core do |f|
...
# Not allowed: Two fields with same name 'title'
f.field :title, solr_name: :article_title
f.field :title, solr_name: :page_title
end
...
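If field definitions are generated programmatically, a small check before calling `f.field` can catch duplicates early. The sketch below is plain Ruby; `duplicate_field_names` and the array-of-hashes layout are hypothetical and not taken from Solrb.

```ruby
# Hypothetical pre-flight check for duplicate field names; not part of Solrb.
def duplicate_field_names(definitions)
  definitions
    .group_by { |definition| definition[:name] }
    .select { |_name, group| group.size > 1 }
    .keys
end

definitions = [
  { name: :title, solr_name: :article_title },
  { name: :title, solr_name: :page_title },
  { name: :tags,  solr_name: :tags_array }
]

p duplicate_field_names(definitions) # prints [:title]
```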
Solr Cloud
To enable Solr Cloud mode, you must define the ZooKeeper URLs in the Solr config block.
In Solr Cloud mode you don't need to provide a Solr URL (config.url or ENV['SOLR_URL']).
Solrb will watch the ZooKeeper state to receive up-to-date information about active Solr nodes, including their URLs.
You can also specify ACL credentials for ZooKeeper.
Solr.configure do |config|
config.zookeeper_urls = ['localhost:2181', 'localhost:2182', 'localhost:2183']
config.zookeeper_auth_user = 'zk_acl_user'
config.zookeeper_auth_password = 'zk_acl_password'
end
If you are using the Puma web server in clustered mode, you must call enable_solr_cloud! in the on_worker_boot callback so that each Puma worker connects to ZooKeeper.
on_worker_boot do
Solr.enable_solr_cloud!
end
Basic Authentication
Solrb supports basic authentication. You can enable it by providing auth_user and auth_password in the config block.
Solr.configure do |config|
config.auth_user = 'user'
config.auth_password = 'password'
end
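For reference, HTTP Basic authentication sends the credentials as a Base64-encoded `user:password` pair in the `Authorization` header. The sketch below shows how that header value is formed in plain Ruby; the `basic_auth_header` helper is hypothetical, and how Solrb passes the credentials to its HTTP client is not shown here.

```ruby
# Builds a standard HTTP Basic Authorization header value.
# Hypothetical helper for illustration; not part of Solrb.
def basic_auth_header(user, password)
  token = ["#{user}:#{password}"].pack('m0') # strict Base64, no trailing newline
  "Basic #{token}"
end

puts basic_auth_header('user', 'password')
```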
Indexing
# creates a single document and commits it to index
doc = Solr::Indexing::Document.new
doc.add_field(:id, 1)
doc.add_field(:name, 'Solrb!!!')
request = Solr::Indexing::Request.new(documents: [doc])
request.run(commit: true)
You can also create an indexing document directly from attributes:
doc = Solr::Indexing::Document.new(id: 5, name: 'John')
Querying
Simple Query
query_field = Solr::Query::Request::FieldWithBoost.new(field: :name)
request = Solr::Query::Request.new(search_term: 'term', query_fields: [query_field])
request.run(page: 1, page_size: 10)
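Solr itself paginates with the `start` (offset) and `rows` (page size) request parameters. A minimal sketch of how `page` and `page_size` could translate into those parameters, assuming 1-based pages; the `pagination_params` helper is hypothetical and Solrb's own implementation may differ:

```ruby
# Hypothetical helper: converts a 1-based page and a page size
# into Solr's start/rows parameters.
def pagination_params(page:, page_size:)
  raise ArgumentError, 'page must be >= 1' if page < 1

  { start: (page - 1) * page_size, rows: page_size }
end

p pagination_params(page: 1, page_size: 10)
p pagination_params(page: 3, page_size: 10)
```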
Querying multiple cores
For a multi-core configuration, wrap your calls in a Solr.with_core block:
Solr.with_core(:models) do
Solr.delete_by_id(3242343)
Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
Solr::Indexing::Request.new(documents: [doc])
end
Query with field boost
query_fields = [
# Use boost_magnitude argument to apply boost to a specific field that you query
Solr::Query::Request::FieldWithBoost.new(field: :name, boost_magnitude: 16),
Solr::Query::Request::FieldWithBoost.new(field: :title)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.run(page: 1, page_size: 10)
Query with filtering
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :title)
]
filters = [Solr::Query::Request::Filter.new(type: :equal, field: :title, value: 'A title')]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields, filters: filters)
request.run(page: 1, page_size: 10)
AND and OR filters
usa_filter =
Solr::Query::Request::AndFilter.new(
Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'USA'),
Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Idaho')
)
canada_filter =
Solr::Query::Request::AndFilter.new(
Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'Canada'),
Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Alberta')
)
location_filters = Solr::Query::Request::OrFilter.new(usa_filter, canada_filter)
request = Solr::Query::Request.new(search_term: 'term', filters: location_filters)
request.run(page: 1, page_size: 10)
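Conceptually, nested AND and OR filters like the ones above compose into a single parenthesised boolean expression in Solr's standard query syntax. The plain-Ruby sketch below illustrates that composition. `EqualFilter` and `CompositeFilter` are hypothetical stand-ins, not Solrb's classes, and the exact string Solrb generates may differ.

```ruby
# Illustrative only: hypothetical stand-ins, not Solrb's filter classes.
# Values are not escaped here; real code must escape quotes and special characters.
EqualFilter = Struct.new(:field, :value) do
  def to_solr
    %(#{field}:"#{value}")
  end
end

class CompositeFilter
  def initialize(operator, *filters)
    @operator = operator
    @filters = filters
  end

  # Joins the child filters with the operator and wraps them in parentheses.
  def to_solr
    "(#{@filters.map(&:to_solr).join(" #{@operator} ")})"
  end
end

usa = CompositeFilter.new('AND', EqualFilter.new(:country, 'USA'), EqualFilter.new(:region, 'Idaho'))
canada = CompositeFilter.new('AND', EqualFilter.new(:country, 'Canada'), EqualFilter.new(:region, 'Alberta'))
location = CompositeFilter.new('OR', usa, canada)

puts location.to_solr
# ((country:"USA" AND region:"Idaho") OR (country:"Canada" AND region:"Alberta"))
```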
Query with sorting
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :title)
]
sort_fields = [Solr::Query::Request::Sorting::Field.new(name: :name, direction: :asc)]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
request.run(page: 1, page_size: 10)
The default sorting logic is: non-null values first, nulls last.
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name)
]
sort_fields = [
Solr::Query::Request::Sorting::Field.new(name: :is_featured, direction: :desc),
Solr::Query::Request::Sorting::Function.new(function: "score desc")
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
request.run(page: 1, page_size: 10)
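The "non-null values first, nulls last" ordering can be illustrated in plain Ruby on an in-memory list. This is only a sketch of the resulting order, with a hypothetical `sort_nulls_last` helper; it says nothing about how Solrb or Solr implement it.

```ruby
# Hypothetical helper: sorts hashes by a field, always placing nil values last,
# regardless of the sort direction.
def sort_nulls_last(docs, field, direction: :asc)
  present, missing = docs.partition { |doc| !doc[field].nil? }
  sorted = present.sort_by { |doc| doc[field] }
  sorted.reverse! if direction == :desc
  sorted + missing
end

docs = [
  { id: 1, name: 'beta' },
  { id: 2, name: nil },
  { id: 3, name: 'alpha' }
]

p sort_nulls_last(docs, :name).map { |doc| doc[:id] }                   # prints [3, 1, 2]
p sort_nulls_last(docs, :name, direction: :desc).map { |doc| doc[:id] } # prints [1, 3, 2]
```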
Query with grouping
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.grouping = Solr::Query::Request::Grouping.new(field: :category, limit: 10)
request.run(page: 1, page_size: 10)
Query with facets
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.facets = [Solr::Query::Request::Facet.new(type: :terms, field: :category, options: { limit: 10 })]
request.run(page: 1, page_size: 10)
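For orientation, a terms facet in Solr's JSON Facet API is described by a small JSON object passed in the `json.facet` parameter. The sketch below builds such an object in plain Ruby. The `terms_facet_param` helper is hypothetical, and it is an assumption that Solrb's terms facet corresponds to this exact payload.

```ruby
require 'json'

# Hypothetical helper: builds a json.facet value for a terms facet,
# keyed by the field name.
def terms_facet_param(field:, limit:)
  JSON.generate({ field.to_s => { type: 'terms', field: field.to_s, limit: limit } })
end

puts terms_facet_param(field: :category, limit: 10)
# {"category":{"type":"terms","field":"category","limit":10}}
```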
Query with boosting functions
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.boosting = Solr::Query::Request::Boosting.new(
multiplicative_boost_functions: [Solr::Query::Request::Boosting::RankingFieldBoostFunction.new(field: :name)],
phrase_boosts: [Solr::Query::Request::Boosting::PhraseProximityBoost.new(field: :category, boost_magnitude: 4)]
)
request.run(page: 1, page_size: 10)
Dictionary boosting function
Sometimes you want dictionary-style boosting. For example, given a hash (dictionary)
{3025 => 2.0, 3024 => 1.5, 3023 => 1.2}
and a field named category_id,
the resulting boosting function will be:
if(eq(category_id_it, 3025), 2.0, if(eq(category_id_it, 3024), 1.5, if(eq(category_id_it, 3023), 1.2, 1)))
Note: the spaces above were added for readability; real Solr query functions must always be written without spaces.
Example of usage:
category_id_boosts = {3025 => 2.0, 3024 => 1.5, 3023 => 1.2}
request.boosting = Solr::Query::Request::Boosting.new(
multiplicative_boost_functions: [
Solr::Query::Request::Boosting::DictionaryBoostFunction.new(field: :category_id,
dictionary: category_id_boosts)
]
)
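The nested `if(eq(...))` expression described above can be built from the hash with a simple fold. The sketch below is plain Ruby and does not use Solrb: `dictionary_boost_function` is a hypothetical helper, and it takes the already-mapped Solr field name (`category_id_it`) as input. It emits the function without spaces.

```ruby
# Hypothetical helper: builds a nested if/eq Solr function from a dictionary.
# Entries are folded from last to first so the first key ends up outermost;
# `default` is the boost used when no key matches.
def dictionary_boost_function(solr_field, dictionary, default: 1)
  dictionary.to_a.reverse.reduce(default.to_s) do |fallback, (key, boost)|
    "if(eq(#{solr_field},#{key}),#{boost},#{fallback})"
  end
end

category_id_boosts = { 3025 => 2.0, 3024 => 1.5, 3023 => 1.2 }
puts dictionary_boost_function('category_id_it', category_id_boosts)
# if(eq(category_id_it,3025),2.0,if(eq(category_id_it,3024),1.5,if(eq(category_id_it,3023),1.2,1)))
```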
Field list
query_fields = [
Solr::Query::Request::FieldWithBoost.new(field: :name),
Solr::Query::Request::FieldWithBoost.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
# Solr::Query::Request will return only :id field by default.
# Specify additional return fields (fl param) by setting the request field_list
request.field_list = [:name, :category]
request.run(page: 1, page_size: 10)
Deleting documents
Solr.delete_by_id(3242343)
Solr.delete_by_id(3242343, commit: true)
Solr.delete_by_query('*:*')
Solr.delete_by_query('*:*', commit: true)
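For context, Solr's JSON update format expresses deletes as a `delete` object containing either an `id` or a `query`. The sketch below builds those request bodies in plain Ruby. The helper names are hypothetical, and the exact requests Solrb sends (including how `commit: true` is passed) may differ.

```ruby
require 'json'

# Hypothetical helpers: build Solr JSON update bodies for deletes.
def delete_by_id_payload(id)
  JSON.generate({ delete: { id: id.to_s } })
end

def delete_by_query_payload(query)
  JSON.generate({ delete: { query: query } })
end

puts delete_by_id_payload(3242343)   # {"delete":{"id":"3242343"}}
puts delete_by_query_payload('*:*')  # {"delete":{"query":"*:*"}}
```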
Active Support instrumentation
This gem publishes events via Active Support Instrumentation.
To subscribe to solrb events, add this code to an initializer:
ActiveSupport::Notifications.subscribe('request.solrb') do |*args|
event = ActiveSupport::Notifications::Event.new(*args)
if Logger::INFO == Rails.logger.level
Rails.logger.info("Solrb #{event.duration.round(1)}ms")
elsif Logger::DEBUG == Rails.logger.level && Rails.env.development?
Pry::ColorPrinter.pp(event.payload)
end
end
Testing
You can inspect the parameters of each Solr query request made through Solrb by requiring the solr/testing file in your test suite. The query parameters are then accessible via Solr::Testing.last_solr_request_params after each request.
require 'solr/testing'
RSpec.describe MyTest do
let(:query) { Solr::Query::Request.new(search_term: 'Solrb') }
it 'returns the last solr request params' do
query.run(page: 1, page_size: 10)
expect(Solr::Testing.last_solr_request_params).to eq({ ... })
end
end
Running specs
This project is set up to use CI to run all specs against a real Solr instance.
If you want to run the specs locally, you can either use the CircleCI CLI or do a completely manual setup (for up-to-date steps, see the circleci config):
docker pull solr:7.7.1
docker run -it --name test-solr -p 8983:8983/tcp -t solr:7.7.1
# create a core
curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&configSet=_default'
# disable field guessing
curl http://localhost:8983/solr/test-core/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
SOLR_URL=http://localhost:8983/solr/test-core rspec