Chewie

A declarative interface for building Elasticsearch queries.

Building valid Elasticsearch queries by hand is difficult, especially as search criteria and logic become more complex.

Chewie aims to reduce the cognitive complexity of building queries, so you can focus on the search experience instead of grappling Elasticsearch syntax.

NOTE: Chewie currently supports Elasticsearch 7.x.

Contents

Installation

Add this line to your application's Gemfile:

gem 'chewie'

And then execute:

$ bundle

Or install it yourself as:

$ gem install chewie

Usage

Define a Chewie class:

# app/chewies/school_chewie.rb

class SchoolChewie
    extend Chewie

    term :name
    range :age
    match :description

    filter_by :governances, with: :terms 
end

Pass filter parameters to the #build method:

# app/**/*.rb

params = {
    query: "Park School"
    filters: {
        age: { 'gte': 20, 'lte': 10 },
        governances: ['Charter', 'Alop']
    }
}

query = params[:query]
filters = params[:filters]

query = SchoolChewie.build(query: query, filters: filters)

puts query
# =>
# {
#   query: {
#       term: { 
#           name: { value: 'Park School' }
#       },
#       range: { 
#           age: { 'gte': 20, 'lte': 10 }
#       },
#       match: {
#           message: { query: 'Park School' }
#       },
#       bool: {
#           filter: {
#               terms: {
#                   governances: [ 'Charter', 'Alop' ]
#               }
#           }
#       }
#   }
# }

Chewie expects incoming parameter attributes to match the attributes defined in your Chewie class, in order to pull the correct value and build the query

# definition
filter_by :governances, with: :terms

# parameters
{ governances: ['ALOP'] }

# output
{ filter: { terms: { governances: ['ALOP'] } }

Some queries simply take a string value, which is pulled from :query.

:query is typically a user search value (search bar).

# definition
term :name

# parameters
{ query: 'A search value' }

# output
{ query: { term: { name: { value: 'A search value' } } } }

Filtering by Associations

Depending on how you build your index, some fields might store values from multiple attributes.

A simple case is if you'd like to filter records through an association.

class School
    has_many :school_disciplines
    has_many :disciplines, through: :school_disciplines
end

class Discipline
    has_many :school_disciplines
    has_many :schools, through: :school_disciplines
end

class SchoolDiscipline
    belongs_to :school
    belongs_to :discipline
end

We might imagine a search engine that helps users find schools in their area and allow them to filter schools by various criteria. Some schools might offer discipline specific programs, therefore a school will have many disciplines. Disciplines is a standard collection that schools can associate with in our application.

In our search UI, we might provide a disciplines filter and allow users to filter by disciplines via dropdown.

We provide the search UI with ids of disciplines we'd like to filter by.

{
    filters: {
        disciplines: [1, 2, 3, 4]
    }
}

Our idex consists of school records, therefore we won't have access to every discipline each school is associated to by default.

Instead, we need to define custom index attributes for our school records to capture those relationships.

We can do that by defining model methods on School that collects associated id values and returns a collection of strings to be indexed.

class School
    def disciplines_index
        discipline_ids = disciplines.pluck(:id)
        discipline_ids.map do |discipline_id|
            "discipline_#{discipline_id}"
        end
    end

    # Method Elasticsearch can use to populate the index
    def search_data
        {
            name: name,
            disciplines: disciplines_index
        }
    end
end

When Elasticsearch indexes School records, each record will now have knowledge of which disciplines it is associated to.

    { 
        name: 'Park School',
        disciplines: [
            "discipline_1",
            "discipline_2",
            "discipline_3"
        ]
    }

Format

At this point, our index is ready to return associated School records when given a collection of Discipline ids.

The caveat is the stored values of :disciplines is in a format that contains both the School and Discipline id.

We'll need to do a little extra work at search time to ensure our id filter values are transformed into the appropriate string format.

To address this, bool query methods have a :format option that takes a lambda and exposes attribute values given.

class SchoolChewie
    disciplines_format = lambda do |id|
        "discipline_#{id}"
    end

    filter_by :disciplines, with: :terms, format: disciplines_format
end

params = {
    query: '',
    filters: {
        disciplines: [1, 4]
    }
}

result = SchoolChewie.build(query: params[:query], filters: params[:filters])

puts result
# =>
# {
#   query: { 
#       bool: { 
#           filter: { 
#               terms: { 
#                   disciplines: [
#                       "discipline_1",
#                       "discipline_4",
#                   ]
#               }
#           }
#       }
#   }
# }

Now that our query output for disciplines matches values stored in the index, Elasticsearch will find School records where disciplines match to either "discipline_1" or "discipline_4"; allowing us to find schools by their associated disciplines.

Combine

Sometimes there are additional criteria we'd like to leverage when filtering against associated records.

Continuing with the previous example, let's say we want to filter schools by disciplines where the discipline programs are "active".

"active" might be a boolean attribute found on SchoolDiscipline.

We can re-write our discipline_index method to pull the discipline id and active attributes from SchoolDiscipline join records.

class School
    def disciplines_index
        school_disciplines.map do |school_discipline|
            discipline_id = school_discipline.id
            active = school_discipline.active

            "discipline_#{discipline_id}_active_#{active}"
        end
    end

    # Method Elasticsearch can use to populate the index
    def search_data
        {
            name: name,
            disciplines: disciplines_index
        }
    end
end

Which changes our index to:

    { 
        name: 'Park School',
        disciplines: [
            "discipline_1_active_true",
            "discipline_2_active_false",
            "discipline_3_active_false"
        ]
    }

We can now imagine there is a active toggle in the search UI, which expands our filter parameters.

params = {
    query: '',
    filters: {
        disciplines: [1, 4],
        active: true
    }
}

Now, at search time we not only need to format with the disciplines collection, but combine those values with the active attribute.

Let's update our Chewie to take this new criteria into account.

class SchoolChewie
    disciplines_format = lambda do |id, combine|
        "discipline_#{id}_active_#{combine.first}"
    end

    filter_by :disciplines, with: :terms, combine: [:active], format: disciplines_format
end

:combine takes a collection of attribute symbols, which Chewie uses to access and pass parameter values to the format lambda at search time; the value collection is exposed as the second argument in the lambda block.

The order of the values matches the order defined in the method call.

combine: [:active, :governances, :age]

lambda do |id, combine|
    combine[0] #=> :active value
    combine[1] #=> :governances value
    combine[2] #=> :age value
end

The output becomes:

result = SchoolChewie.build(query: params[:query], filters: params[:filters])

puts result
# =>
# {
#   query: { 
#       bool: { 
#           filter: { 
#               terms: { 
#                   disciplines: [
#                       "discipline_1_active_true",
#                       "discipline_4_active_true",
#                   ]
#               }
#           }
#       }
#   }
# }

Supported Queries

Compound Queries

Bool

  • filter (#filter_by)
  • should (#should_include)
  • must (#must_include)
  • must_not (#must_not_include)

Term Level Queries

  • term (#term)
  • terms (#terms)
  • range (#range)
  • fuzzy (#fuzzy)

Full Text Queries

  • match (#match)
  • multi-match (#multimatch)

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/chewie. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Chewie project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.