Module: Elasticsearch::Persistence::Model::Find::ClassMethods

Defined in:
lib/elasticsearch/persistence/model/find.rb

Instance Method Details

#all(query = { query: { match_all: {} } }, options = {}) ⇒ Object

Returns all models, up to 10,000 unless the query sets its own `size`

Examples:

Retrieve all people


Person.all
# => [#<Person:0x007ff1d8fb04b0 ... ]

Retrieve all people matching a query


Person.all query: { match: { last_name: 'Smith'  } }
# => [#<Person:0x007ff1d8fb04b0 ... ]


# File 'lib/elasticsearch/persistence/model/find.rb', line 30

def all(query={ query: { match_all: {} } }, options={})
  query[:size] ||= 10_000
  search(query, options)
end
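
The effect of `query[:size] ||= 10_000` can be checked without a running cluster. The sketch below is only an illustration: `FakePerson` is a hypothetical stand-in whose `search` returns the definition it receives instead of calling Elasticsearch.

```ruby
# Hypothetical stand-in for a model class (not part of the library).
# `search` just hands back the query definition it was given.
class FakePerson
  def self.search(query, options = {})
    query
  end

  # Same body as the `all` method listed above.
  def self.all(query = { query: { match_all: {} } }, options = {})
    query[:size] ||= 10_000
    search(query, options)
  end
end

FakePerson.all[:size]
# => 10000

# A size supplied by the caller is left alone.
FakePerson.all(query: { match: { last_name: 'Smith' } }, size: 50)[:size]
# => 50

# The hash passed in is modified in place.
definition = { query: { match_all: {} } }
FakePerson.all(definition)
definition[:size]
# => 10000
```

Because the definition hash is modified in place, a caller that reuses it for another request will send `size: 10_000` there as well.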

#count(query_or_definition = nil, options = {}) ⇒ Integer

Returns the number of models, optionally restricted to those matching a query

Examples:

Return the count of all models


Person.count
# => 2

Return the count of models matching a simple query


Person.count('fox or dog')
# => 1

Return the count of models matching a query in the Elasticsearch DSL


Person.count(query: { match: { title: 'fox dog' } })
# => 1

Returns:

  • (Integer)


# File 'lib/elasticsearch/persistence/model/find.rb', line 54

def count(query_or_definition=nil, options={})
  gateway.count( query_or_definition, options )
end
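
The examples above pass either a simple query string or a DSL hash. One way the two forms might be told apart is sketched below. `count_params` is a hypothetical helper, and the mapping (String to a `q` parameter, Hash to a request body) is an assumption about how the gateway could dispatch, not a copy of its code.

```ruby
# Hypothetical helper, not part of the library: picks request parameters
# based on the kind of argument. With no argument it falls back to a
# match_all query, mirroring the "count all models" example.
def count_params(query_or_definition = nil)
  query_or_definition ||= { query: { match_all: {} } }

  if query_or_definition.respond_to?(:to_hash)
    # Elasticsearch DSL definition: sent as the request body (assumption).
    { body: query_or_definition.to_hash }
  elsif query_or_definition.is_a?(String)
    # Simple query string: sent as the `q` URL parameter (assumption).
    { q: query_or_definition }
  else
    raise ArgumentError, "expected a String or a Hash, got #{query_or_definition.class}"
  end
end

count_params('fox or dog')[:q]
# => "fox or dog"

count_params(query: { match: { title: 'fox dog' } }).key?(:body)
# => true
```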

#find_each(options = {}) ⇒ String, Enumerator

Iterates efficiently over models using the `find_in_batches` method.

All options are passed to `find_in_batches`, and each result is yielded to the passed block.

Examples:

Print out the people’s names by scrolling through the index


Person.find_each { |person| puts person.name }

# # GET http://localhost:9200/people/person/_search?scroll=5m&search_type=scan&size=20
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 0
# Test 1
# Test 2
# ...
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 20
# Test 21
# Test 22

Leave out the block to return an Enumerator instance


Person.find_each.select { |person| person.name =~ /John/ }
# => [#<Person {id: "NkltJP5vRxqk9_RMP7SU8Q", name: "John Smith",  ...}>]

Returns:

  • (String, Enumerator)

    The `scroll_id` for the request, or an Enumerator when no block is passed



# File 'lib/elasticsearch/persistence/model/find.rb', line 170

def find_each(options = {})
  return to_enum(:find_each, options) unless block_given?

  find_in_batches(options) do |batch|
    batch.each { |result| yield result }
  end
end
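
The block and Enumerator behaviours of `find_each` can be tried in isolation. The sketch below is an illustration only: `FakeModel` and its canned batches are hypothetical, and its `find_in_batches` yields fixed arrays rather than scrolling through an index.

```ruby
# Hypothetical stand-in for a model class (not part of the library).
class FakeModel
  BATCHES = [%w[Test0 Test1], %w[Test2]].freeze

  # Yields two fixed batches and returns a made-up scroll id.
  def self.find_in_batches(options = {})
    BATCHES.each { |batch| yield batch }
    'fake-scroll-id'
  end

  # Same body as the `find_each` method listed above.
  def self.find_each(options = {})
    return to_enum(:find_each, options) unless block_given?

    find_in_batches(options) do |batch|
      batch.each { |result| yield result }
    end
  end
end

# Without a block, an Enumerator flattens the batches into single results.
FakeModel.find_each.to_a
# => ["Test0", "Test1", "Test2"]

# With a block, the return value is whatever find_in_batches returns.
FakeModel.find_each { |name| name }
# => "fake-scroll-id"
```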

#find_in_batches(options = {}, &block) ⇒ String, Enumerator

Returns all models efficiently via Elasticsearch's scan/scroll API

You can restrict the models being returned with a query.

Search API options are passed to the search method as parameters; all remaining options are passed as the `:body` parameter.

The full Repository::Response::Results instance is yielded to the passed block for each batch, so you can access any of its properties; calling `to_a` converts the object to an Array of model instances.
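
The split between Search API parameters and the request body can be sketched in plain Ruby. This is a rough illustration only: `SEARCH_PARAM_KEYS` and `split_options` are hypothetical names, the key list is a subset of the one in the method source, and unlike `extract!` the sketch leaves the hash it receives unchanged.

```ruby
# Hypothetical subset of the keys that the method treats as Search API
# parameters rather than as part of the request body.
SEARCH_PARAM_KEYS = %i[
  index type scroll size q routing _source _source_include _source_exclude timeout
].freeze

# Hypothetical helper: returns [search_params, body] without mutating `options`.
def split_options(options)
  options.partition { |key, _value| SEARCH_PARAM_KEYS.include?(key) }.map(&:to_h)
end

search_params, body = split_options(
  size: 100,
  _source_include: 'name',
  query: { match: { name: 'test' } }
)

search_params.keys
# => [:size, :_source_include]

body.keys
# => [:query]
```

Here `size` and `_source_include` would travel as request parameters, while the `query` key would end up in the request body.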

Examples:

Return all models in batches of 20 x number of primary shards


Person.find_in_batches { |batch| puts batch.map(&:name) }

Return all models in batches of 100 x number of primary shards


Person.find_in_batches(size: 100) { |batch| puts batch.map(&:name) }

Return all models matching a specific query


Person.find_in_batches(query: { match: { name: 'test' } }) { |batch| puts batch.map(&:name) }

Return all models, fetching only the `name` attribute from Elasticsearch


Person.find_in_batches(_source_include: 'name') { |batch| puts batch.response.hits.hits.map(&:to_hash) }

Leave out the block to return an Enumerator instance


Person.find_in_batches(size: 100).map { |batch| batch.size }
# => [100, 100, 100, ... ]

Returns:

  • (String, Enumerator)

    The `scroll_id` for the request, or an Enumerator when no block is passed



# File 'lib/elasticsearch/persistence/model/find.rb', line 93

def find_in_batches(options={}, &block)
  return to_enum(:find_in_batches, options) unless block_given?

  search_params = options.extract!(
    :index,
    :type,
    :scroll,
    :size,
    :explain,
    :ignore_indices,
    :ignore_unavailable,
    :allow_no_indices,
    :expand_wildcards,
    :preference,
    :q,
    :routing,
    :source,
    :_source,
    :_source_include,
    :_source_exclude,
    :stats,
    :timeout)

  scroll = search_params.delete(:scroll) || '5m'

  body = options

  # Get the initial scroll_id
  #
  response = gateway.client.search( { index: gateway.index_name,
                               type:  gateway.document_type,
                               search_type: 'scan',
                               scroll:      scroll,
                               size:        20,
                               body:        body }.merge(search_params) )

  # Get the initial batch of documents
  #
  response = gateway.client.scroll( { scroll_id: response['_scroll_id'], scroll: scroll } )

  # Break when receiving an empty array of hits
  #
  while response['hits']['hits'].any? do
    yield Repository::Response::Results.new(gateway, response)

    response = gateway.client.scroll( { scroll_id: response['_scroll_id'], scroll: scroll } )
  end

  return response['_scroll_id']
end
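
The scroll loop at the end of the method can be exercised without a cluster. The sketch below is a simplified illustration under stated assumptions: `FakeScrollClient` and `each_page` are hypothetical, the client serves canned pages followed by an empty one, and the responses carry only the two keys the loop reads.

```ruby
# Hypothetical client (not part of the library): serves canned pages of hits
# in order, then empty pages, using the response keys the loop relies on.
class FakeScrollClient
  def initialize(pages)
    @pages = pages.dup
  end

  def scroll(_params)
    { '_scroll_id' => 'abc', 'hits' => { 'hits' => @pages.shift || [] } }
  end
end

# Hypothetical helper mirroring the loop: fetch a page, yield it while it has
# hits, and return the last scroll id once an empty page arrives.
def each_page(client, scroll_id, scroll = '5m')
  response = client.scroll(scroll_id: scroll_id, scroll: scroll)

  while response['hits']['hits'].any?
    yield response['hits']['hits']
    response = client.scroll(scroll_id: response['_scroll_id'], scroll: scroll)
  end

  response['_scroll_id']
end

pages = []
last_id = each_page(FakeScrollClient.new([[1, 2], [3]]), 'initial') { |hits| pages << hits }

pages
# => [[1, 2], [3]]

last_id
# => "abc"
```

Each request passes along the `_scroll_id` from the previous response, so the loop always continues from the most recent id it has seen.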

#search(query_or_definition, options = {}) ⇒ Object



# File 'lib/elasticsearch/persistence/model/find.rb', line 14

def search(query_or_definition, options={})
  SearchRequest.new(self, query_or_definition, options).execute!
end