Stretchy

Build Status

Stretchy is a query builder for Elasticsearch. It helps you quickly construct the JSON to send to Elastic, which can get rather complicated.

Stretchy is modeled after ActiveRecord's interface and architecture - query objects are immutable and chainable, which makes quickly building the right query and caching the results easy.

Stretchy is not:

  1. an integration with ActiveModel to help you index your data
  2. a way to manage Elasticsearch configuration
  3. a general-purpose Elasticsearch API client

The first two are very application-specific. For any non-trivial app, the level of customization necessary will have you writing almost everything yourself. The last one is better handled by the elasticsearch gem.

Installation

Add this line to your application's Gemfile:

gem 'stretchy'

And then execute:

$ bundle

Or install it yourself as:

$ gem install stretchy

Usage

Stretchy is still in early development, so it does not yet support the full feature set of the Elasticsearch API. It does support fairly basic queries in an ActiveRecord-ish style.

Configuration

Stretchy.configure do |c|
  c.index_name = 'my_index'                       # REQUIRED
  c.client     = $my_client                       # ignore below, use a custom client
  c.url        = 'https://user:[email protected]' # default is ENV['ELASTICSEARCH_URL']
  c.adapter    = :patron                          # default is :excon

  c.logger     = Logger.new(STDOUT)               # passed to elasticsearch-api gem
                                                  # Stretchy will also log, with the params
                                                  # specified below
  c.log_level  = :debug                           # default is :silence
  c.log_color  = :green                           # default is :blue 
end

Base

query = Stretchy.query(type: 'model_name')

From here, you can chain the following query methods:

Match

query = query.match('welcome to my web site').match(title: 'welcome to my web site')

Performs a full-text search for the given string. If given a hash, it will use a match query on the specified fields, otherwise it will default to '_all'.

Where

query = query.where(
  name: 'alice',
  email: [
    '[email protected]',
    'beatrice.christine@other_company.com'
  ],
  commit_count: 27..33,
  is_robot: nil
)

Allows passing a hash of matchable options similar to ActiveRecord's where method. To be returned, the document must match each of the parameters. If you pass an array of parameters for a field, the document must match at least one of those parameters.

Gotcha

If you pass a string or symbol for a field, it will be converted to a Match Query for the specified field. Since Elastic analyzes terms by default, string or symbol terms will be looked for by an analyzed query.

Range

query = query.range(:rating, min: 3, max: 5)
             .range(:released, min: Time.now - 60*60*24*100)
             .range(:quantity, max: 100, exclusive: true)

Only documents with the specified field, and within the specified range match. You can also pass in dates and times as ranges. While you could pass a normal ruby Range object to .where, this allows you to specify only a minimum or only a maximum. Range filters are inclusive by default, but you can also pass :exclusive, :exclusive_min, or :exclusive_max.

Geo Distance

query = query.geo(field: 'coords', distance: '20mi', lat: 35.0117, lng: 135.7683)

Filters for documents where the specified geo_point field is within the given range.

Gotcha

The field must be mapped as a geo_point field. See Elasticsearch types for more info.

Not

query = query.where.not(rating: 0)
             .match.not('angry')
             .where.not.geo(field: 'coords', distance: '20mi', lat: 35.0117, lng: 135.7683)

Called after where or match will let you apply inverted filters. Any documents that match those filters will be excluded.

Should

query = query.should(name: 'Sarah', awesomeness: 1000).should.not(awesomeness: 0)

Should filters work similarly to .where. Documents that do not match are still returned, but they have a lower relevancy score and will appear after documents that do match in the results. See Elastic's documentation for BoolQuery and BoolFilter for more info.

Boost

query = query.boost.where(category: 3, weight: 100)
             .boost.range(:awesomeness, min: 10, weight: 10)
             .boost.match.not('sucks')

Boosts use a Function Score Query with filters to allow you to affect the score for the document. Each condition will be applied as a filter with an optional weight.

Near

query = query.boost.near(field: :published_at, origin: Time.now, scale: '5d')
             .boost.near(field: :coords, lat: 35.0117, lng: 135.7683, scale: '10mi', decay: 0.33, weight: 1000)

Boosts a document by how close a given field is to a given :origin . Accepts dates, times, numbers, and geographical points. Unlike .where.range or .boost.geo, .boost.near is not a binary operation. All documents get a score for that field, which decays the further it is away from the origin point.

The :scale param determines how quickly the value falls off. In the example above, if a document's :coords field is 10 miles away from the starting point, its score is about 1/3 that of a document at the origin point.

See the Function Score Query section on Decay Functions for more info.

Random

query = query.boost.random(user.id, 50)

Gives each document a randomized boost with a given seed and optional weight. This allows you to show slightly different result sets to different users, but show the same result set to that user every time.

Limit and Offset

query = query.limit(20).offset(1000)

Works the same way as ActiveRecord's limit and offset methods - analogous to Elasticsearch's from and size parameters.

Response

query.response

Executes the query, returns the raw JSON response from Elasticsearch and caches it. Use this to get at search API data not in the source documents.

Results

query.results

Executes the query and provides the parsed json for each hit returned by Elasticsearch, along with _index, _type, _id, and _score fields.

Ids

query.ids

Provides only the ids for each hit. If your document ids are numeric (as is the case for many ActiveRecord integrations), they will be converted to integers.

Total

query.total

Returns the total number of matches returned by the query - not just the current page. Makes plugging into Kaminari a snap.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release to create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

  1. Fork it ( https://github.com/[my-github-username]/stretchy/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request