Class: Elastomer::Client::Docs

Inherits:
Object
Defined in:
lib/elastomer/client/docs.rb

Constant Summary

SPECIAL_KEYS =
%w[index type id version version_type op_type routing parent timestamp ttl consistency replication refresh].freeze
SPECIAL_KEYS_HASH =
SPECIAL_KEYS.inject({}) { |h, k| h[k.to_sym] = "_#{k}"; h }.freeze

Constructor Details

#initialize(client, name, type = nil) ⇒ Docs

Create a new document client for making API requests that pertain to the indexing and searching of documents in a search index.

client - Elastomer::Client used for HTTP requests to the server
name   - The name of the index as a String
type   - The document type as a String



# File 'lib/elastomer/client/docs.rb', line 27

def initialize( client, name, type = nil )
  @client = client
  @name   = @client.assert_param_presence(name, 'index name') unless name.nil?
  @type   = @client.assert_param_presence(type, 'document type') unless type.nil?
end

Instance Attribute Details

#client ⇒ Object (readonly)

Returns the value of attribute client.



# File 'lib/elastomer/client/docs.rb', line 33

def client
  @client
end

#name ⇒ Object (readonly)

Returns the value of attribute name.



# File 'lib/elastomer/client/docs.rb', line 33

def name
  @name
end

#type ⇒ Object (readonly)

Returns the value of attribute type.



# File 'lib/elastomer/client/docs.rb', line 33

def type
  @type
end

Instance Method Details

#bulk(params = {}, &block) ⇒ Object

Perform bulk indexing and/or delete operations. The current index name and document type will be passed to the bulk API call as part of the request parameters.

params - Parameters Hash that will be passed to the bulk API call.
block  - Required block that is used to accumulate bulk API operations.
         All the operations will be passed to the search cluster via a
         single API request.

Yields a Bulk instance for building bulk API call bodies.

Examples

docs.bulk do |b|
  b.index( document1 )
  b.index( document2 )
  b.delete( document3 )
  ...
end

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 411

def bulk( params = {}, &block )
  raise 'a block is required' if block.nil?

  params = {:index => self.name, :type => self.type}.merge params
  client.bulk params, &block
end
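
The merge order in the listing above decides who wins when the caller passes its own :index or :type. A minimal sketch of just that merge, using a hypothetical `bulk_params` helper (not part of Elastomer) so it runs without a client or server:

```ruby
# Hypothetical helper that reproduces only the params merge from #bulk.
# name and type stand in for Docs#name and Docs#type.
def bulk_params(name, type, params = {})
  {:index => name, :type => type}.merge(params)
end

# No caller params: the Docs index name and type are used as-is.
scoped = bulk_params("tweets", "tweet")

# Caller params win over the Docs defaults; extra keys pass through.
overridden = bulk_params("tweets", "tweet", :type => "retweet", :refresh => true)
```

Because the defaults hash is the receiver of `merge`, anything supplied by the caller takes precedence. The same pattern appears in #multi_search, #scan, and #scroll.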

#count(query, params = nil) ⇒ Object

Executes a search query, but instead of returning results, returns the number of documents matched. This method supports both the “request body” query and the “URI request” query. When using the request body semantics, the query hash must contain the :query key. Otherwise we assume a URI request is being made.

query  - The query body as a Hash
params - Parameters Hash

Examples

# request body query
count({:match_all => {}}, :type => 'tweet')

# same thing but using the URI request method
count(:q => '*:*', :type => 'tweet')

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-count.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 242

def count( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.get '/{index}{/type}/_count', update_params(params, :body => query, :action => 'docs.count')
  response.body
end

#defaults ⇒ Object

Internal: Returns a Hash containing default parameters.



# File 'lib/elastomer/client/docs.rb', line 543

def defaults
  { :index => name, :type => type }
end

#delete(params = {}) ⇒ Object

Delete a document from the index based on the document ID. The :id is provided as part of the params hash.

params - Parameters Hash

:id - the ID of the document to delete

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 94

def delete( params = {} )
  response = client.delete '/{index}/{type}/{id}', update_params(params, :action => 'docs.delete')
  response.body
end

#delete_by_query(query, params = nil) ⇒ Object

Delete documents from one or more indices and one or more types based on a query. This method supports both the “request body” query and the “URI request” query. When using the request body semantics, the query hash must contain the :query key. Otherwise we assume a URI request is being made.

query  - The query body as a Hash
params - Parameters Hash

Examples

# request body query
delete_by_query({:query => {:match_all => {}}}, :type => 'tweet')

# same thing but using the URI request method
delete_by_query(:q => '*:*', :type => 'tweet')

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 269

def delete_by_query( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.delete '/{index}{/type}/_query', update_params(params, :body => query, :action => 'docs.delete_by_query')
  response.body
end

#exists?(params = {}) ⇒ Boolean Also known as: exist?

Check to see if a document exists. The :id is provided as part of the params hash.

params - Parameters Hash

:id - the ID of the document to check

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html#docs-get

Returns true if the document exists

Returns:

  • (Boolean)


# File 'lib/elastomer/client/docs.rb', line 122

def exists?( params = {} )
  response = client.head '/{index}/{type}/{id}', update_params(params, :action => 'docs.exists')
  response.success?
end

#explain(query, params = nil) ⇒ Object

Compute a score explanation for a query and a specific document. This can give useful feedback about why a document matched or didn’t match a query. The document :id is provided as part of the params hash.

query  - The query body as a Hash
params - Parameters Hash

:id - the ID of the document

Examples

explain({:query => {:term => {"message" => "search"}}}, :id => 1)

explain(:q => "message:search", :id => 1)

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-explain.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 358

def explain( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.get '/{index}/{type}/{id}/_explain', update_params(params, :body => query, :action => 'docs.explain')
  response.body
end

#extract_params(query, params = nil) ⇒ Object

Internal: Allow params to be passed as the first argument to methods that take both an optional query hash and params.

query  - query hash OR params hash
params - params hash OR nil if no query

Returns an array of the query (possibly nil) and params Hash.



# File 'lib/elastomer/client/docs.rb', line 554

def extract_params( query, params = nil )
  if params.nil?
    if query.key? :query
      params = {}
    else
      params, query = query, nil
    end
  end
  [query, params]
end
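
The argument shuffling is easier to follow in isolation. The sketch below is a standalone copy of the logic above, lifted out of the class so it runs without a client; the sample hashes are illustrative only.

```ruby
# Standalone copy of the logic shown above, outside the Docs class.
def extract_params(query, params = nil)
  if params.nil?
    if query.key?(:query)
      params = {}
    else
      params, query = query, nil
    end
  end
  [query, params]
end

# Request body form: the hash has a :query key, so it stays the query.
body_query, body_params = extract_params({:query => {:match_all => {}}})

# URI request form: no :query key, so the hash is treated as params.
uri_query, uri_params = extract_params({:q => "*:*", :type => "tweet"})

# Both arguments given: nothing is rearranged.
both_query, both_params = extract_params({:match_all => {}}, {:type => "tweet"})
```

Only the Symbol key :query is checked, so a single hash keyed with the String "query" would be treated as params rather than as a request body.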

#from_document(document) ⇒ Object

Internal: Given a `document`, generate an options hash that will override parameters based on the content of the document. The document will be returned as the value of the :body key.

We only extract information from the document if it is given as a Hash. We do not parse JSON encoded Strings.

document - A document Hash or JSON encoded String.

Returns an options Hash extracted from the document.



# File 'lib/elastomer/client/docs.rb', line 515

def from_document( document )
  opts = {:body => document}

  if document.is_a? Hash
    SPECIAL_KEYS_HASH.each do |key, field|
      opts[key] = document.delete field if document.key? field
      opts[key] = document.delete field.to_sym if document.key? field.to_sym
    end
  end

  opts
end
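
To see what ends up in the options hash and what is left in the document, here is a standalone sketch that copies the constants and method from this page so it can run outside the class. The sample document is made up for illustration.

```ruby
# Copied from the constants and method shown on this page.
SPECIAL_KEYS = %w[index type id version version_type op_type routing parent timestamp ttl consistency replication refresh].freeze
SPECIAL_KEYS_HASH = SPECIAL_KEYS.inject({}) { |h, k| h[k.to_sym] = "_#{k}"; h }.freeze

def from_document(document)
  opts = {:body => document}

  if document.is_a? Hash
    SPECIAL_KEYS_HASH.each do |key, field|
      opts[key] = document.delete field if document.key? field
      opts[key] = document.delete field.to_sym if document.key? field.to_sym
    end
  end

  opts
end

# Special keys may be Symbols or Strings; both are extracted.
document = {:_id => 42, "_routing" => "user-7", :title => "hello"}
opts = from_document(document)

# A JSON encoded String is passed through untouched.
json_opts = from_document('{"_id": 42}')
```

After the call, `opts[:id]` and `opts[:routing]` hold the extracted values, and `document` itself has been stripped of those keys because `opts[:body]` is the same object, not a copy. Callers that need the original hash intact should pass a duplicate.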

#get(params = {}) ⇒ Object

Retrieve a document from the index based on its ID. The :id is provided as part of the params hash.

params - Parameters Hash

:id - the ID of the document to get

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html#docs-get

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 108

def get( params = {} )
  response = client.get '/{index}/{type}/{id}', update_params(params, :action => 'docs.get')
  response.body
end

#index(document, params = {}) ⇒ Object

Adds or updates a document in the index, making it searchable. If the document contains an `:_id` attribute then PUT semantics will be used to create (or update) a document with that ID. If no ID is provided then a new document will be created using POST semantics.

There are several other document attributes that control how Elasticsearch will index the document. They are listed below. Please refer to the Elasticsearch documentation for a full explanation of each and how it affects the indexing process.

:_id
:_type
:_version
:_version_type
:_op_type
:_routing
:_parent
:_timestamp
:_ttl
:_consistency
:_replication
:_refresh

If any of these attributes are present in the document they will be removed from the document before it is indexed. This means that the document will be modified by this method.

document - The document (as a Hash or JSON encoded String) to add to the index
params   - Parameters Hash

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 68

def index( document, params = {} )
  overrides = from_document document
  params = update_params(params, overrides)
  params[:action] = 'docs.index'

  params.delete(:id) if params[:id].nil? || params[:id].to_s =~ /\A\s*\z/

  response =
      if params[:id]
        client.put '/{index}/{type}/{id}', params
      else
        client.post '/{index}/{type}', params
      end

  response.body
end
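
The choice between PUT and POST hinges on the blank-ID check in the listing above. The sketch below isolates that check in a hypothetical `index_request` helper (not part of Elastomer) that returns the verb and path template instead of making a request.

```ruby
# Hypothetical helper mirroring the ID check inside #index.
# It works on a copy so the caller's params are left alone.
def index_request(params)
  params = params.dup
  params.delete(:id) if params[:id].nil? || params[:id].to_s =~ /\A\s*\z/

  if params[:id]
    [:put, "/{index}/{type}/{id}"]
  else
    [:post, "/{index}/{type}"]
  end
end

with_id    = index_request(:id => 42)    # a usable ID selects PUT
blank_id   = index_request(:id => "  ")  # whitespace-only counts as no ID
missing_id = index_request({})           # no ID at all selects POST
```

Treating a whitespace-only ID as missing means such documents get a server-generated ID rather than being stored under a blank one.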

#more_like_this(query, params = nil) ⇒ Object

Search for documents similar to a specific document. The document :id is provided as part of the params hash. If the _all field is not enabled, :mlt_fields must be passed. A query cannot be present in the query body, but other fields like :size and :facets are allowed.

query  - The request body as a Hash (must not contain a :query key)
params - Parameters Hash

:id - the ID of the document

Examples

more_like_this(:mlt_fields => "title", :min_term_freq => 1, :type => "doc1", :id => 1)

# with query hash
more_like_this({:from => 5, :size => 10}, :mlt_fields => "title",
                :min_term_freq => 1, :type => "doc1", :id => 1)

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-more-like-this.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 334

def more_like_this( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.get '/{index}/{type}/{id}/_mlt', update_params(params, :body => query, :action => 'docs.more_like_this')
  response.body
end

#multi_get(body, params = {}) ⇒ Object Also known as: mget

Retrieve multiple documents based on an index, type, and id (and possibly routing).

body   - The request body as a Hash or a JSON encoded String
params - Parameters Hash

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-multi-get.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 150

def multi_get( body, params = {} )
  overrides = from_document body
  overrides[:action] = 'docs.multi_get'

  response = client.get '{/index}{/type}/_mget', update_params(params, overrides)
  response.body
end

#multi_search(params = {}, &block) ⇒ Object

Execute an array of searches in bulk. Results are returned in an array in the order the queries were sent. The current index name and document type will be passed to the multi_search API call as part of the request parameters.

params - Parameters Hash that will be passed to the API call.
block  - Required block that is used to accumulate searches.
         All the operations will be passed to the search cluster
         via a single API request.

Yields a MultiSearch instance for building multi_search API call bodies.

Examples

docs.multi_search do |m|
  m.search({:query => {:match_all => {}}}, :search_type => :count)
  m.search({:query => {:field => {"foo" => "bar"}}})
  ...
end

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-multi-search.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 495

def multi_search( params = {}, &block )
  raise 'a block is required' if block.nil?

  params = {:index => self.name, :type => self.type}.merge params
  client.multi_search params, &block
end

#multi_termvectors(body, params = {}) ⇒ Object Also known as: multi_term_vectors

Multi termvectors API allows you to get multiple termvectors based on an index, type and id. The response includes a docs array with all the fetched termvectors, each element having the structure provided by the `termvector` API.

body   - The request body
params - Parameters Hash

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-multi-termvectors.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 304

def multi_termvectors( body, params = {} )
  response = client.get '{/index}{/type}/_mtermvectors', update_params(params, :body => body, :action => 'docs.multi_termvectors')
  response.body
end

#scan(query, opts = {}) ⇒ Object

Create a new Scroller instance for scanning all results from a `query`. The Scroller will be scoped to the current index and document type. The Scroller is configured to use `scan` semantics which are more efficient than a standard scroll query; the caveat is that the returned documents cannot be sorted.

query - The query to scan as a Hash or a JSON encoded String
opts  - Options Hash

:index  - the name of the index to search
:type   - the document type to search
:scroll - the keep alive time of the scrolling request (5 minutes by default)
:size   - the number of documents per shard to fetch per scroll

Examples

scan = docs.scan('{"query":{"match_all":{}}}')
scan.each_document do |document|
  document['_id']
  document['_source']
end

Returns a new Scroller instance



# File 'lib/elastomer/client/docs.rb', line 466

def scan( query, opts = {} )
  opts = {:index => name, :type => type}.merge opts
  client.scan query, opts
end

#scroll(query, opts = {}) ⇒ Object

Create a new Scroller instance for scrolling all results from a `query`. The Scroller will be scoped to the current index and document type.

query - The query to scroll as a Hash or a JSON encoded String
opts  - Options Hash

:index  - the name of the index to search
:type   - the document type to search
:scroll - the keep alive time of the scrolling request (5 minutes by default)
:size   - the number of documents per shard to fetch per scroll

Examples

scroll = docs.scroll('{"query":{"match_all":{}},"sort":{"date":"desc"}}')
scroll.each_document do |document|
  document['_id']
  document['_source']
end

See www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html

Returns a new Scroller instance



# File 'lib/elastomer/client/docs.rb', line 439

def scroll( query, opts = {} )
  opts = {:index => name, :type => type}.merge opts
  client.scroll query, opts
end

#search(query, params = nil) ⇒ Object

Allows you to execute a search query and get back search hits that match the query. This method supports both the “request body” query and the “URI request” query. When using the request body semantics, the query hash must contain the :query key. Otherwise we assume a URI request is being made.

query  - The query body as a Hash
params - Parameters Hash

Examples

# request body query
search({:query => {:match_all => {}}}, :type => 'tweet')

# same thing but using the URI request method
search(:q => '*:*', :type => 'tweet')

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-search.html
See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-uri-request.html
See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-body.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 197

def search( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.get '/{index}{/type}/_search', update_params(params, :body => query, :action => 'docs.search')
  response.body
end
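
The path strings used throughout this class mix two placeholder forms: `{index}` and `{/type}`. The sketch below assumes these follow RFC 6570 style URI template semantics, which the `{/type}` syntax suggests but this page does not state. The `expand` helper is hypothetical and hand-rolled for illustration; it is not the client's actual expansion code and it does no URL encoding.

```ruby
# Hypothetical mini expander for the two placeholder forms seen on this page:
#   {name}  - replaced by the value
#   {/name} - replaced by "/" plus the value, or dropped entirely when nil
def expand(template, vars)
  template.gsub(/\{(\/?)(\w+)\}/) do
    prefix = Regexp.last_match(1)
    value  = vars[Regexp.last_match(2).to_sym]
    value.nil? ? "" : "#{prefix}#{value}"
  end
end

with_type    = expand("/{index}{/type}/_search", :index => "tweets", :type => "tweet")
without_type = expand("/{index}{/type}/_search", :index => "tweets", :type => nil)
no_scope     = expand("{/index}{/type}/_mget", {})
```

Under that assumption, a Docs instance created without a type searches the whole index, and methods such as #multi_get, whose template begins with `{/index}`, can run with no index scope at all.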

#search_shards(params = {}) ⇒ Object

The search shards API returns the indices and shards that a search request would be executed against. This can give useful feedback for working out issues or planning optimizations with routing and shard preferences.

params - Parameters Hash

:routing    - routing values
:preference - which shard replicas to execute the search request on
:local      - boolean value to use local cluster state

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-shards.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 217

def search_shards( params = {} )
  response = client.get '/{index}{/type}/_search_shards', update_params(params, :action => 'docs.search_shards')
  response.body
end

#source(params = {}) ⇒ Object

Retrieve the document source from the index based on the ID and type. The :id is provided as part of the params hash.

params - Parameters Hash

:id - the ID of the document

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html#_source

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 137

def source( params = {} )
  response = client.get '/{index}/{type}/{id}/_source', update_params(params, :action => 'docs.source')
  response.body
end

#termvector(params = {}) ⇒ Object Also known as: termvectors, term_vector, term_vectors

Returns information and statistics on terms in the fields of a particular document as stored in the index. The :id is provided as part of the params hash.

params - Parameters Hash

:id - the ID of the document to get

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-termvectors.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 286

def termvector( params = {} )
  response = client.get '/{index}/{type}/{id}/_termvector', update_params(params, :action => 'docs.termvector')
  response.body
end

#update(script, params = {}) ⇒ Object

Update a document based on a script provided.

script - The script (as a Hash) used to update the document in place
params - Parameters Hash

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html

Returns the response body as a Hash



# File 'lib/elastomer/client/docs.rb', line 167

def update( script, params = {} )
  overrides = from_document script
  overrides[:action] = 'docs.update'

  response = client.post '/{index}/{type}/{id}/_update', update_params(params, overrides)
  response.body
end

#update_params(params, overrides = nil) ⇒ Object

Internal: Add default parameters to the `params` Hash and then apply `overrides` to the params if any are given.

params    - Parameters Hash
overrides - Optional parameter overrides as a Hash

Returns a new params Hash.



# File 'lib/elastomer/client/docs.rb', line 535

def update_params( params, overrides = nil )
  h = defaults.update params
  h.update overrides unless overrides.nil?
  h[:routing] = h[:routing].join(',') if Array === h[:routing]
  h
end
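
The precedence here is defaults first, then params, then overrides, with an Array of routing values flattened at the end. A standalone sketch, using a hypothetical `defaults` with a fixed index and type in place of the real Docs#defaults:

```ruby
# Hypothetical stand-in for Docs#defaults with a fixed index and type.
def defaults
  {:index => "tweets", :type => "tweet"}
end

# Copy of the method shown above.
def update_params(params, overrides = nil)
  h = defaults.update params
  h.update overrides unless overrides.nil?
  h[:routing] = h[:routing].join(',') if Array === h[:routing]
  h
end

params = {:type => "retweet", :routing => ["a", "b"]}
merged = update_params(params, {:action => "docs.search"})
```

Here `merged` keeps the default :index, takes :type from the caller, gains :action from the overrides, and carries :routing as the String "a,b". The caller's `params` hash is not modified, because `update` is called on the fresh hash returned by `defaults`.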

#validate(query, params = nil) ⇒ Object

Validate a potentially expensive query before running it. The :explain parameter can be used to get detailed information about why a query failed.

query  - The query body as a Hash
params - Parameters Hash

Examples

# request body query
validate({:query => {:query_string => {:query => "*:*"}}}, :explain => true)

# same thing but using the URI query parameter
validate(:q => "post_date:foo", :explain => true)

See www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-validate.html

Returns the response body as a hash



# File 'lib/elastomer/client/docs.rb', line 383

def validate( query, params = nil )
  query, params = extract_params(query) if params.nil?

  response = client.get '/{index}{/type}/_validate/query', update_params(params, :body => query, :action => 'docs.validate')
  response.body
end