Module: Elasticsearch::Persistence::Model::Find::ClassMethods

Defined in:
lib/elasticsearch/persistence/model/find.rb

Instance Method Details

#all(options = {}) ⇒ Object

Returns all models (up to 10,000 by default; the limit comes from the default `size` option).

Examples:

Retrieve all people


Person.all
# => [#<Person:0x007ff1d8fb04b0 ... ]

Retrieve all people matching a query


Person.all query: { match: { last_name: 'Smith'  } }
# => [#<Person:0x007ff1d8fb04b0 ... ]


# File 'lib/elasticsearch/persistence/model/find.rb', line 20

def all(options={})
  gateway.search( { query: { match_all: {} }, size: 10_000 }.merge(options) )
end
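
Because the defaults are combined with the passed options through `Hash#merge`, any top-level key you supply replaces the corresponding default as a whole. The following is a minimal sketch in plain Ruby with no Elasticsearch connection; `DEFAULTS` and `build_search_definition` are hypothetical names introduced only for illustration and are not part of the library.

```ruby
# Hypothetical stand-in for the hash literal built inside `all`.
DEFAULTS = { query: { match_all: {} }, size: 10_000 }.freeze

# Mirrors the merge performed in the method body above (illustration only).
def build_search_definition(options = {})
  DEFAULTS.merge(options)
end

unfiltered = build_search_definition
filtered   = build_search_definition(query: { match: { last_name: 'Smith' } })
limited    = build_search_definition(size: 50)

p unfiltered # default match_all query with the default size
p filtered   # custom query replaces match_all, size is unchanged
p limited    # match_all query with a lowered size
```

Note that the merge is shallow: passing `:query` swaps out the whole `match_all` clause rather than combining with it.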

#count(query_or_definition = nil, options = {}) ⇒ Integer

Returns the number of models

Examples:

Return the count of all models


Person.count
# => 2

Return the count of models matching a simple query


Person.count('fox or dog')
# => 1

Return the count of models matching a query in the Elasticsearch DSL


Person.count(query: { match: { title: 'fox dog' } })
# => 1

Returns:

  • (Integer)


# File 'lib/elasticsearch/persistence/model/find.rb', line 43

def count(query_or_definition=nil, options={})
  gateway.count( query_or_definition, options )
end

#find_each(options = {}) ⇒ String, Enumerator

Iterates over models efficiently using the `find_in_batches` method.

All options are passed to `find_in_batches`, and each result is yielded to the passed block.

Examples:

Print out the people’s names by scrolling through the index


Person.find_each { |person| puts person.name }

# # GET http://localhost:9200/people/person/_search?scroll=5m&search_type=scan&size=20
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 0
# Test 1
# Test 2
# ...
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 20
# Test 21
# Test 22

Leave out the block to return an Enumerator instance


Person.find_each.select { |person| person.name =~ /John/ }
# => [#<Person {id: "NkltJP5vRxqk9_RMP7SU8Q", name: "John Smith",  ...}>]

Returns:

  • (String, Enumerator)

    The `scroll_id` for the request, or an Enumerator when no block is passed



# File 'lib/elasticsearch/persistence/model/find.rb', line 159

def find_each(options = {})
  return to_enum(:find_each, options) unless block_given?

  find_in_batches(options) do |batch|
    batch.each { |result| yield result }
  end
end
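
The block-or-Enumerator pattern used here can be tried without a running cluster. This is a rough sketch under stated assumptions: `FakeModel` and its `BATCHES` constant are hypothetical stand-ins, and the batches are plain arrays of strings rather than results fetched from Elasticsearch.

```ruby
# Hypothetical stand-in for a model class (not part of the library).
class FakeModel
  BATCHES = [%w[Test0 Test1], %w[Test2 Test3], %w[Test4]].freeze

  # Yields one batch at a time, in the spirit of `find_in_batches`.
  def self.find_in_batches(options = {})
    return to_enum(:find_in_batches, options) unless block_given?

    BATCHES.each { |batch| yield batch }
    nil
  end

  # Same shape as the method body above: unwrap each batch into single records.
  def self.find_each(options = {})
    return to_enum(:find_each, options) unless block_given?

    find_in_batches(options) do |batch|
      batch.each { |result| yield result }
    end
  end
end

# With a block, every record is yielded in turn.
FakeModel.find_each { |name| puts name }

# Without a block, an Enumerator is returned and can be chained.
selected = FakeModel.find_each.select { |name| name.end_with?('3', '4') }
p selected
```

Returning `to_enum` when no block is given is what makes chained calls such as `select` or `map` possible on the result.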

#find_in_batches(options = {}, &block) ⇒ String, Enumerator

Returns all models efficiently via Elasticsearch's scan/scroll API.

You can restrict the models being returned with a query.

The Search API options are passed to the search method as parameters; all remaining options are passed as the `:body` parameter.

The full Repository::Response::Results instance is yielded to the passed block in each batch, so you can access any of its properties; calling `to_a` will convert the object to an Array of model instances.
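
The split between Search API parameters and the request body described above can be approximated in plain Ruby. This is only a sketch: the library itself relies on ActiveSupport's `Hash#extract!`, whereas the hypothetical `split_options` helper below uses `Hash#partition`, and `SEARCH_PARAM_KEYS` lists just a subset of the recognized keys.

```ruby
# Subset of the Search API keys recognized by `find_in_batches`
# (hypothetical constant, shortened for illustration).
SEARCH_PARAM_KEYS = %i[index type scroll size routing _source_include].freeze

# Plain-Ruby approximation of the split between URL parameters and body.
# Unlike Hash#extract!, this does not mutate the hash it is given.
def split_options(options)
  search_params, body = options.partition { |key, _| SEARCH_PARAM_KEYS.include?(key) }
  [search_params.to_h, body.to_h]
end

search_params, body = split_options(size: 100, query: { match: { name: 'test' } })

p search_params # recognized Search API keys only
p body          # everything else, sent as the request body
```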

Examples:

Return all models in batches of 20 x number of primary shards


Person.find_in_batches { |batch| puts batch.map(&:name) }

Return all models in batches of 100 x number of primary shards


Person.find_in_batches(size: 100) { |batch| puts batch.map(&:name) }

Return all models matching a specific query


Person.find_in_batches(query: { match: { name: 'test' } }) { |batch| puts batch.map(&:name) }

Return all models, fetching only the `name` attribute from Elasticsearch


Person.find_in_batches( _source_include: 'name') { |_| puts _.response.hits.hits.map(&:to_hash) }

Leave out the block to return an Enumerator instance


Person.find_in_batches(size: 100).map { |batch| batch.size }
# => [100, 100, 100, ... ]

Returns:

  • (String, Enumerator)

    The `scroll_id` for the request, or an Enumerator when no block is passed



# File 'lib/elasticsearch/persistence/model/find.rb', line 82

def find_in_batches(options={}, &block)
  return to_enum(:find_in_batches, options) unless block_given?

  search_params = options.extract!(
    :index,
    :type,
    :scroll,
    :size,
    :explain,
    :ignore_indices,
    :ignore_unavailable,
    :allow_no_indices,
    :expand_wildcards,
    :preference,
    :q,
    :routing,
    :source,
    :_source,
    :_source_include,
    :_source_exclude,
    :stats,
    :timeout)

  scroll = search_params.delete(:scroll) || '5m'

  body = options

  # Get the initial scroll_id
  #
  response = gateway.client.search( { index: gateway.index_name,
                               type:  gateway.document_type,
                               search_type: 'scan',
                               scroll:      scroll,
                               size:        20,
                               body:        body }.merge(search_params) )

  # Get the initial batch of documents
  #
  response = gateway.client.scroll( { scroll_id: response['_scroll_id'], scroll: scroll } )

  # Break when receiving an empty array of hits
  #
  while response['hits']['hits'].any? do
    yield Repository::Response::Results.new(gateway, response)

    response = gateway.client.scroll( { scroll_id: response['_scroll_id'], scroll: scroll } )
  end

  return response['_scroll_id']
end
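
The scroll loop in the method body above (fetch a page, yield it, fetch the next one, stop on the first empty page) can be exercised against an in-memory stand-in. This is a sketch under stated assumptions: `FakeScrollClient` and `each_scroll_page` are hypothetical names, the fake client only mimics the shape of a scroll response, and it takes keyword arguments for brevity where the real client is called with a hash.

```ruby
# Hypothetical in-memory client mimicking the shape of scroll responses.
class FakeScrollClient
  def initialize(pages)
    @pages = pages
    @cursor = 0
  end

  # Returns the next page of hits, then empty pages once exhausted.
  # The `scroll` keep-alive value is accepted but ignored by this fake.
  def scroll(scroll_id:, scroll:)
    hits = @pages[@cursor] || []
    @cursor += 1
    { '_scroll_id' => "#{scroll_id}+", 'hits' => { 'hits' => hits } }
  end
end

# Same loop shape as `find_in_batches`: yield pages until one comes back empty,
# then return the last scroll id.
def each_scroll_page(client, initial_scroll_id, scroll: '5m')
  response = client.scroll(scroll_id: initial_scroll_id, scroll: scroll)

  while response['hits']['hits'].any?
    yield response['hits']['hits']
    response = client.scroll(scroll_id: response['_scroll_id'], scroll: scroll)
  end

  response['_scroll_id']
end

client = FakeScrollClient.new([[{ '_id' => '1' }, { '_id' => '2' }], [{ '_id' => '3' }]])
last_id = each_scroll_page(client, 'abc') do |hits|
  puts hits.map { |hit| hit['_id'] }.join(', ')
end
puts last_id
```

Each request passes along the `_scroll_id` from the previous response, which is why the loop reassigns `response` after every yield.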