Class: Elastomer::Client::Scroller

Inherits:
Object
Defined in:
lib/elastomer/client/scroller.rb

Constructor Details

#initialize(client, query, opts = {}) ⇒ Scroller

Create a new scroller that can be used to iterate over all the documents returned by the `query`. The Scroller supports both the 'scan' and the 'scroll' search types.

See www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html and www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html#scan

client - Elastomer::Client used for HTTP requests to the server
query  - The query to scroll as a Hash or a JSON encoded String
opts   - Options Hash

:index       - the name of the index to search
:type        - the document type to search
:scroll      - the keep alive time of the scrolling request (5 minutes by default)
:size        - the number of documents per shard to fetch per scroll
:search_type - set to 'scan' for scan semantics  # DEPRECATED in ES 2.1.0 - use a Scroll query sorted by _doc: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/search-request-search-type.html#scan

Examples

scan = Scroller.new(client, {query: {match_all: {}}}, index: 'test-1')
scan.each_document { |doc|
  doc['_id']
  doc['_source']
}


# File 'lib/elastomer/client/scroller.rb', line 170

def initialize( client, query, opts = {} )
  @client = client

  @opts = DEFAULT_OPTS.merge({ body: query }).merge(opts)

  @scroll_id = nil
  @offset = 0
end
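
The merge order in the constructor decides which value wins when the same key appears more than once. The following is a minimal sketch of that precedence, not the library's code: the contents of `DEFAULT_OPTS` are not shown in this section, so the value below is an assumption based on the documented 5 minute default, and `build_opts` is a hypothetical helper name.

```ruby
# Assumed defaults; the real DEFAULT_OPTS constant is not shown in this section.
DEFAULT_OPTS = { scroll: "5m" }.freeze

# Hypothetical helper that repeats the constructor's merge order:
# defaults first, then the query body, then the caller's options.
def build_opts(query, opts = {})
  DEFAULT_OPTS.merge({ body: query }).merge(opts)
end

query = { query: { match_all: {} } }

defaults_only = build_opts(query)
merged        = build_opts(query, index: "test-1", scroll: "1m")

puts defaults_only[:scroll]  # default keep alive is used
puts merged[:scroll]         # caller's :scroll replaces the default
puts merged[:index]
```

Because the caller's options are merged last, a `:body` key passed in `opts` would also replace the `query` argument.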

Instance Attribute Details

#client ⇒ Object (readonly)

Returns the value of attribute client.



# File 'lib/elastomer/client/scroller.rb', line 179

def client
  @client
end

#query ⇒ Object (readonly)

Returns the value of attribute query.



# File 'lib/elastomer/client/scroller.rb', line 179

def query
  @query
end

#scroll_id ⇒ Object (readonly)

Returns the value of attribute scroll_id.



# File 'lib/elastomer/client/scroller.rb', line 179

def scroll_id
  @scroll_id
end

Instance Method Details

#clear! ⇒ Object

Terminate the scroll query. This will remove the search context from the cluster and no further documents can be returned by this Scroller instance.

Returns nil if the `scroll_id` is not valid; returns the response body if the `scroll_id` was cleared.



# File 'lib/elastomer/client/scroller.rb', line 243

def clear!
  return if scroll_id.nil?
  client.clear_scroll(scroll_id)
rescue ::Elastomer::Client::IllegalArgument
  nil
end
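
The three outcomes of `clear!` (no scroll started, scroll cleared, scroll id rejected) can be exercised without a cluster. This is a sketch under stated assumptions: `FakeIllegalArgument`, `FakeClient`, and `SketchScroller` are hypothetical stand-ins, and the Hash returned by `clear_scroll` is a made-up placeholder rather than a real Elasticsearch response.

```ruby
# Hypothetical stand-in for ::Elastomer::Client::IllegalArgument.
class FakeIllegalArgument < StandardError; end

# Hypothetical client that only knows the scroll ids it was given.
class FakeClient
  def initialize(known_ids)
    @known_ids = known_ids
  end

  def clear_scroll(scroll_id)
    # Array#delete returns nil when the id is unknown.
    raise FakeIllegalArgument, "unknown scroll id" unless @known_ids.delete(scroll_id)
    { "cleared" => scroll_id } # placeholder body, not a real response
  end
end

# Minimal object carrying the same clear! logic as the listing above.
class SketchScroller
  attr_reader :client, :scroll_id

  def initialize(client, scroll_id)
    @client = client
    @scroll_id = scroll_id
  end

  def clear!
    return if scroll_id.nil?
    client.clear_scroll(scroll_id)
  rescue FakeIllegalArgument
    nil
  end
end

client = FakeClient.new(["abc"])
not_started = SketchScroller.new(client, nil).clear!    # nothing to clear
cleared     = SketchScroller.new(client, "abc").clear!  # id is known
rejected    = SketchScroller.new(client, "abc").clear!  # id already removed

p not_started, cleared, rejected
```

Only the first successful call returns a body; the other two paths return nil, so callers cannot tell "never started" apart from "already expired" by the return value alone.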

#do_scroll ⇒ Object

Internal: Perform the actual scroll requests. This method will call out to the `Client#start_scroll` and `Client#continue_scroll` methods while keeping track of the `scroll_id` internally.

Returns the response body as a Hash.



# File 'lib/elastomer/client/scroller.rb', line 255

def do_scroll
  if scroll_id.nil?
    body = client.start_scroll(@opts)
    if body["hits"]["hits"].empty?
      @scroll_id = body["_scroll_id"]
      return do_scroll
    end
  else
    body = client.continue_scroll(scroll_id, @opts[:scroll])
  end

  @scroll_id = body["_scroll_id"]
  body
end
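
The control flow of `do_scroll` is easiest to follow with a stub in place of the real client. In the sketch below, `FakeScrollClient` and `SketchScroller` are hypothetical, and the scroll ids are invented strings. The first page is deliberately empty to exercise the retry branch, which presumably exists for the 'scan' search type, where the initial response carries a scroll id but no hits.

```ruby
# Hypothetical client that serves canned pages and records which call was made.
class FakeScrollClient
  attr_reader :calls

  def initialize(pages)
    @pages = pages.dup
    @calls = []
  end

  def start_scroll(_opts)
    @calls << :start
    next_page
  end

  def continue_scroll(_scroll_id, _scroll)
    @calls << :continue
    next_page
  end

  private

  def next_page
    hits = @pages.shift || []
    { "_scroll_id" => "id-#{@calls.length}", "hits" => { "hits" => hits } }
  end
end

# Minimal object carrying the same do_scroll logic as the listing above.
class SketchScroller
  attr_reader :client, :scroll_id

  def initialize(client, opts = {})
    @client = client
    @opts = opts
    @scroll_id = nil
  end

  def do_scroll
    if scroll_id.nil?
      body = client.start_scroll(@opts)
      if body["hits"]["hits"].empty?
        # Empty first page: remember the id and fetch the next page right away.
        @scroll_id = body["_scroll_id"]
        return do_scroll
      end
    else
      body = client.continue_scroll(scroll_id, @opts[:scroll])
    end

    @scroll_id = body["_scroll_id"]
    body
  end
end

client   = FakeScrollClient.new([[], [{ "_id" => "1" }], [{ "_id" => "2" }]])
scroller = SketchScroller.new(client, scroll: "1m")
first    = scroller.do_scroll
second   = scroller.do_scroll

p client.calls
p first["hits"]["hits"], second["hits"]["hits"]
```

The first `do_scroll` call issues two requests (start, then continue) because the start response was empty; every later call issues exactly one continue request.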

#each ⇒ Object

Iterate over all the search results from the scan query.

block - The block will be called for each set of matching documents returned from executing the scan query.

Yields a hits Hash containing the 'total' number of hits, the current 'offset' into that total, and the Array of 'hits' document Hashes.

Examples

scan.each do |hits|
  hits['total']
  hits['offset']
  hits['hits'].each { |document| ... }
end

Returns this Scroller instance.



# File 'lib/elastomer/client/scroller.rb', line 198

def each
  loop do
    body = do_scroll

    hits = body["hits"]
    break if hits["hits"].empty?

    hits["offset"] = @offset
    @offset += hits["hits"].length

    yield hits
  end

  self
ensure
  clear!
end
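
The 'offset' value that `each` adds to every hits Hash is a running count of the documents yielded so far. Here is a small sketch of that bookkeeping over canned pages, with no Elasticsearch involved; `each_page` is a hypothetical helper, and the 'total' value is computed locally only to make the Hash look complete.

```ruby
# Hypothetical helper that mirrors the offset bookkeeping of Scroller#each.
def each_page(pages)
  offset = 0
  total = pages.flatten(1).length

  pages.each do |page|
    hits = { "total" => total, "hits" => page }
    break if hits["hits"].empty? # an empty page ends the iteration

    hits["offset"] = offset        # documents already yielded before this page
    offset += hits["hits"].length

    yield hits
  end
end

pages = [[{ "_id" => "a" }, { "_id" => "b" }], [{ "_id" => "c" }], []]

seen = []
each_page(pages) do |hits|
  seen << [hits["offset"], hits["hits"].map { |doc| doc["_id"] }]
end

p seen
```

The first page reports offset 0 and the second reports offset 2, so `hits['offset'] + index` gives a document's position in the overall result set.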

#each_document(&block) ⇒ Object

Iterate over each document from the scan query. This method is just a convenience wrapper around the `each` method; it iterates the Array of documents and passes them one by one to the block.

block - The block will be called for each document returned from executing the scan query.

Yields a document Hash.

Examples

scan.each_document do |document|
  document['_id']
  document['_source']
end

Returns this Scroller instance.



# File 'lib/elastomer/client/scroller.rb', line 233

def each_document( &block )
  each { |hits| hits["hits"].each(&block) }
end
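
Since `each_document` only forwards its block to every page's 'hits' Array, the same pattern works on any object with a page-wise `each`. The sketch below uses a hypothetical `PagedSource` class with canned pages to show the flattening and the return value.

```ruby
# Hypothetical page-wise source; each yields one hits Hash per page and returns self.
class PagedSource
  def initialize(pages)
    @pages = pages
  end

  def each
    @pages.each { |page| yield({ "hits" => page }) }
    self
  end

  # Same wrapper shape as Scroller#each_document.
  def each_document(&block)
    each { |hits| hits["hits"].each(&block) }
  end
end

source = PagedSource.new([[{ "_id" => "1" }, { "_id" => "2" }], [{ "_id" => "3" }]])

ids = []
result = source.each_document { |doc| ids << doc["_id"] }

p ids
p result.equal?(source)
```

The wrapper returns whatever `each` returns, which is why both methods are documented as returning the Scroller instance.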