Class: Elastomer::Client::Scroller

Inherits:
Object
Defined in:
lib/elastomer/client/scroller.rb

Constructor Details

#initialize(client, query, opts = {}) ⇒ Scroller

Create a new scroller that can be used to iterate over all the documents returned by the `query`. The Scroller supports both the 'scan' and the 'scroll' search types.

See www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html and www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html#scan

client - Elastomer::Client used for HTTP requests to the server
query  - The query to scan as a Hash or a JSON encoded String
opts   - Options Hash

:index       - the name of the index to search
:type        - the document type to search
:scroll      - the keep alive time of the scrolling request (5 minutes by default)
:size        - the number of documents per shard to fetch per scroll
:search_type - set to 'scan' for scan semantics (DEPRECATED in ES 2.1.0; use a scroll query sorted by _doc instead: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/search-request-search-type.html#scan)

Examples

scan = Scroller.new(client, {:query => {:match_all => {}}}, :index => 'test-1')
scan.each_document { |doc|
  doc['_id']
  doc['_source']
}
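
Since the 'scan' search type is marked deprecated above, the following is a minimal sketch of the arguments for the suggested replacement, a scroll query sorted by _doc. The index name, keep-alive value, and size are illustrative assumptions, and the constructor call is left as a comment because it needs a live Elastomer::Client.

```ruby
# Hypothetical query body: a match_all query sorted by _doc, the replacement
# that the deprecation note on :search_type points to.
query = {
  :query => { :match_all => {} },
  :sort  => ["_doc"]
}

# Options as documented above; these particular values are assumptions.
opts = { :index => "test-1", :scroll => "1m", :size => 100 }

# With a real client this would be passed to the constructor:
#   scroller = Elastomer::Client::Scroller.new(client, query, opts)
#   scroller.each_document { |doc| doc["_id"] }
```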


# File 'lib/elastomer/client/scroller.rb', line 156

def initialize( client, query, opts = {} )
  @client = client

  @opts = DEFAULT_OPTS.merge({ :body => query }).merge(opts)

  @scroll_id = nil
  @offset = 0
end
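
The constructor builds its options with two chained merges, so values in `opts` win over both the defaults and the `:body` entry. A small sketch of that precedence follows; ASSUMED_DEFAULT_OPTS is a stand-in with made-up values, since the real DEFAULT_OPTS constant is not shown on this page.

```ruby
# Assumed defaults for illustration only; the real DEFAULT_OPTS lives in
# lib/elastomer/client/scroller.rb and may differ.
ASSUMED_DEFAULT_OPTS = { :scroll => "5m", :size => 50 }.freeze

query = { :query => { :match_all => {} } }
opts  = { :index => "test-1", :size => 200 }

# Same merge chain as the constructor: defaults, then the query as :body,
# then the caller's options, which take precedence on key collisions.
merged = ASSUMED_DEFAULT_OPTS.merge({ :body => query }).merge(opts)

merged[:scroll]  # default kept, since opts did not set it
merged[:size]    # overridden by opts
merged[:body]    # the query passed as the second argument
```

One consequence of this ordering is that a `:body` key supplied in `opts` would replace the `query` argument.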

Instance Attribute Details

#client ⇒ Object (readonly)

Returns the value of attribute client.



# File 'lib/elastomer/client/scroller.rb', line 165

def client
  @client
end

#query ⇒ Object (readonly)

Returns the value of attribute query.



# File 'lib/elastomer/client/scroller.rb', line 165

def query
  @query
end

#scroll_id ⇒ Object (readonly)

Returns the value of attribute scroll_id.



# File 'lib/elastomer/client/scroller.rb', line 165

def scroll_id
  @scroll_id
end

Instance Method Details

#do_scroll ⇒ Object

Internal: Perform the actual scroll requests. This method will call out to the `Client#start_scroll` and `Client#continue_scroll` methods while keeping track of the `scroll_id` internally.

Returns the response body as a Hash.



# File 'lib/elastomer/client/scroller.rb', line 226

def do_scroll
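  if scroll_id.nil?
    # First call: open the scroll with the merged options.
    body = client.start_scroll(@opts)
    if body["hits"]["hits"].empty?
      # An empty first page (as with the 'scan' search type) carries only a
      # scroll ID, so store it and immediately fetch the next page.
      @scroll_id = body["_scroll_id"]
      return do_scroll
    end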
  else
    body = client.continue_scroll(scroll_id, @opts[:scroll])
  end

  @scroll_id = body["_scroll_id"]
  body
end
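
To make the call sequence concrete, here is a self-contained sketch that replays the same logic against a stub. FakeClient and SketchScroller are hypothetical stand-ins written for this example; only the `start_scroll` and `continue_scroll` call shapes are taken from the listing above, and the canned responses are made up.

```ruby
# Hypothetical stand-in for Elastomer::Client: records calls and replays
# canned response bodies in order.
class FakeClient
  attr_reader :calls

  def initialize(pages)
    @pages = pages.dup
    @calls = []
  end

  def start_scroll(opts)
    @calls << [:start_scroll, opts]
    next_page
  end

  def continue_scroll(scroll_id, scroll)
    @calls << [:continue_scroll, scroll_id, scroll]
    next_page
  end

  private

  # Returns the next canned page, or an empty page once they run out.
  def next_page
    @pages.shift || { "_scroll_id" => "done", "hits" => { "hits" => [] } }
  end
end

# Sketch class mirroring the do_scroll logic shown above.
class SketchScroller
  attr_reader :client, :scroll_id

  def initialize(client, opts)
    @client = client
    @opts = opts
    @scroll_id = nil
  end

  def do_scroll
    if scroll_id.nil?
      body = client.start_scroll(@opts)
      if body["hits"]["hits"].empty?
        @scroll_id = body["_scroll_id"]
        return do_scroll
      end
    else
      body = client.continue_scroll(scroll_id, @opts[:scroll])
    end

    @scroll_id = body["_scroll_id"]
    body
  end
end

pages = [
  { "_scroll_id" => "s1", "hits" => { "hits" => [] } },                 # empty first page
  { "_scroll_id" => "s2", "hits" => { "hits" => [{ "_id" => "1" }] } }  # first real batch
]

client   = FakeClient.new(pages)
scroller = SketchScroller.new(client, { :scroll => "5m" })
first    = scroller.do_scroll

client.calls.map(&:first)  # => [:start_scroll, :continue_scroll]
scroller.scroll_id         # => "s2"
```

With an empty first page, a single `do_scroll` call issues two requests: the stored ID "s1" is passed to `continue_scroll`, and the ID from that second response replaces it.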

#each ⇒ Object

Iterate over all the search results from the scan query.

block - The block will be called for each set of matching documents returned from executing the scan query.

Yields a hits Hash containing the 'total' number of hits, current 'offset' into that total, and the Array of 'hits' document Hashes.

Examples

scan.each do |hits|
  hits['total']
  hits['offset']
  hits['hits'].each { |document| ... }
end

Returns this Scroller instance.



# File 'lib/elastomer/client/scroller.rb', line 184

def each
  loop do
    body = do_scroll

    hits = body["hits"]
    break if hits["hits"].empty?

    hits["offset"] = @offset
    @offset += hits["hits"].length

    yield hits
  end

  self
end
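
The 'offset' bookkeeping can be checked in isolation. The sketch below copies the loop into a hypothetical OffsetSketch class whose `do_scroll` returns canned pages instead of calling a server; the page contents are invented for the example.

```ruby
# Hypothetical class that reuses the each loop with a stubbed do_scroll.
class OffsetSketch
  def initialize(pages)
    @pages = pages.dup
    @offset = 0
  end

  # Stand-in for do_scroll: returns the next canned response body, then
  # an empty page once the canned pages are exhausted.
  def do_scroll
    @pages.shift || { "hits" => { "total" => 3, "hits" => [] } }
  end

  def each
    loop do
      hits = do_scroll["hits"]
      break if hits["hits"].empty?

      # Offset is the count of documents yielded before this batch.
      hits["offset"] = @offset
      @offset += hits["hits"].length

      yield hits
    end

    self
  end
end

pages = [
  { "hits" => { "total" => 3, "hits" => [{ "_id" => "a" }, { "_id" => "b" }] } },
  { "hits" => { "total" => 3, "hits" => [{ "_id" => "c" }] } }
]

seen = []
OffsetSketch.new(pages).each { |hits| seen << [hits["offset"], hits["hits"].length] }
seen  # => [[0, 2], [2, 1]]
```

Iteration stops at the first empty batch, so the block is never called with an empty 'hits' Array.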

#each_document(&block) ⇒ Object

Iterate over each document from the scan query. This method is just a convenience wrapper around the `each` method; it iterates the Array of documents and passes them one by one to the block.

block - The block will be called for each document returned from executing the scan query.

Yields a document Hash.

Examples

scan.each_document do |document|
  document['_id']
  document['_source']
end

Returns this Scroller instance.



# File 'lib/elastomer/client/scroller.rb', line 217

def each_document( &block )
  each { |hits| hits["hits"].each(&block) }
end