Module: JayAPI::Elasticsearch::Indexable

Included in:
Index, Indexes
Defined in:
lib/jay_api/elasticsearch/indexable.rb

Overview

This module houses the Elasticsearch methods that can be used with a single index or with multiple indexes. Its main purpose is to avoid code repetition between classes.

Constant Summary

DEFAULT_DOC_TYPE = 'nested'

Default type for documents indexed with the #index method.

SUPPORTED_TYPES = [DEFAULT_DOC_TYPE, nil].freeze

Supported document types (for the #index method).

Instance Attribute Details

#batch_size ⇒ Object (readonly)

Returns the value of attribute batch_size.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 28

def batch_size
  @batch_size
end

#client ⇒ Object (readonly)

Returns the value of attribute client.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 28

def client
  @client
end

Instance Method Details

#delete_by_query(query, slices: nil, wait_for_completion: true) ⇒ Hash

Deletes the documents matching the given query from the Index. For more information on how to build the query, please refer to the Elasticsearch documentation on the Query DSL and on the Delete by query API:

www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html

Examples:

Returned Hash (with `wait_for_completion: true`):

{
  took: 103,
  timed_out: false,
  total: 76,
  deleted: 76,
  batches: 1,
  version_conflicts: 0,
  noops: 0,
  retries: { bulk: 0, search: 0 },
  throttled_millis: 0,
  requests_per_second: 1.0,
  throttled_until_millis: 0,
  failures: []
}

Returned Hash (with `wait_for_completion: false`):

{
  task: "B5oDyEsHQu2Q-wpbaMSMTg:577388264"
}

Parameters:

  • query (Hash)

    The delete query

  • slices (Integer) (defaults to: nil)

    Number of slices to cut the operation into for faster processing (i.e., run the operation in parallel)

  • wait_for_completion (Boolean) (defaults to: true)

    Whether Elasticsearch should wait for the operation to complete. True runs the operation synchronously; false makes Elasticsearch perform the request asynchronously.

Returns:

  • (Hash)

    A Hash that details the results of the operation

Raises:

  • (Elasticsearch::Transport::Transport::ServerError)

    If the query fails.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 153

def delete_by_query(query, slices: nil, wait_for_completion: true)
  request_params = { index: index_names, body: query }.tap do |params|
    params.merge!(slices: slices) if slices
    params.merge!(wait_for_completion: false) unless wait_for_completion
  end

  client.delete_by_query(**request_params).deep_symbolize_keys
end
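
How the optional arguments shape the request can be illustrated in isolation. The following is a minimal sketch, not the library's code path: `build_delete_request` is a hypothetical helper that mirrors the parameter-building logic shown above, and the index name and query are made up for illustration.

```ruby
# Hypothetical helper mirroring how #delete_by_query assembles its request.
# It only builds the parameters; it does not talk to Elasticsearch.
def build_delete_request(index_names, query, slices: nil, wait_for_completion: true)
  params = { index: index_names, body: query }
  params[:slices] = slices if slices # only sent when given
  params[:wait_for_completion] = false unless wait_for_completion # only sent when false
  params
end

# Illustrative query: the field name and value are assumptions.
query = { query: { match: { status: 'obsolete' } } }

sync_request  = build_delete_request(['xyz01_unit_test'], query)
async_request = build_delete_request(['xyz01_unit_test'], query,
                                     slices: 5, wait_for_completion: false)

puts sync_request.keys.inspect
puts async_request.keys.inspect
```

With the defaults, neither `slices` nor `wait_for_completion` is included in the request, so Elasticsearch applies its own defaults for both.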

#delete_by_query_async(query, slices: nil) ⇒ Concurrent::Promise

Asynchronously deletes the documents matching the given query from the Index. For more information on how to build the query, please refer to the Elasticsearch documentation on the Query DSL and on the Delete by query API:

www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html

Parameters:

  • query (Hash)

    The delete query

  • slices (Integer, String) (defaults to: nil)

    Number of slices to cut the operation into for faster processing (i.e., run the operation in parallel). Use “auto” to make Elasticsearch decide how many slices to divide into

Returns:

  • (Concurrent::Promise)

    The eventual value returned from the single completion of the delete operation



# File 'lib/jay_api/elasticsearch/indexable.rb', line 174

def delete_by_query_async(query, slices: nil)
  async.delete_by_query(query, slices: slices)
end

#flush ⇒ Object

Sends whatever is currently in the send queue to the Elasticsearch instance and clears the queue.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 108

def flush
  return unless @batch.any?

  flush!
end

#index(data, type: DEFAULT_DOC_TYPE) ⇒ Array<Hash>

Sends a record to the Elasticsearch instance right away.

Parameters:

  • data (Hash)

    The data to be sent.

  • type (String, nil) (defaults to: DEFAULT_DOC_TYPE)

    The type of the document. When set to nil, the decision is left to Elasticsearch’s API, which will normally default to _doc.

Returns:

  • (Array<Hash>)

    An array of Hashes, one per index, containing information about the created documents. An example of such a Hash is:

    {
      "_index" => "xyz01_unit_test",
      "_type" => "nested",
      "_id" => "SVY1mJEBQ5CNFZM8Lodt",
      "_version" => 1,
      "result" => "created",
      "_shards" => { "total" => 2, "successful" => 1, "failed" => 0 },
      "_seq_no" => 0,
      "_primary_term" => 1
    }

    For information on the contents of this Hash please see: www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#docs-index-api-response-body

Raises:

  • (ArgumentError)

    If the given type is not one of the SUPPORTED_TYPES.


# File 'lib/jay_api/elasticsearch/indexable.rb', line 80

def index(data, type: DEFAULT_DOC_TYPE)
  raise ArgumentError, "Unsupported type: '#{type}'" unless SUPPORTED_TYPES.include?(type)

  index_names.map { |index_name| client.index index: index_name, type: type, body: data }
end
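
The type check and the one-request-per-index behaviour can be tried without an Elasticsearch instance. This is a sketch under stated assumptions: `RecordingClient` and `index_document` are hypothetical stand-ins written for this example (they are not part of JayAPI), and the index names and response Hash are invented.

```ruby
DEFAULT_DOC_TYPE = 'nested'
SUPPORTED_TYPES = [DEFAULT_DOC_TYPE, nil].freeze

# Stand-in for the Elasticsearch client (an assumption for this sketch):
# it records each call instead of sending anything.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def index(index:, type:, body:)
    @calls << { index: index, type: type, body: body }
    { '_index' => index, 'result' => 'created' } # invented, shortened response
  end
end

# Same shape as #index above, with the client and index names passed in.
def index_document(client, index_names, data, type: DEFAULT_DOC_TYPE)
  raise ArgumentError, "Unsupported type: '#{type}'" unless SUPPORTED_TYPES.include?(type)

  index_names.map { |index_name| client.index(index: index_name, type: type, body: data) }
end

client = RecordingClient.new
responses = index_document(client, %w[index_a index_b], { name: 'example' })
puts responses.size # one response per index name

begin
  index_document(client, ['index_a'], { name: 'example' }, type: 'other')
rescue ArgumentError => e
  puts e.message # the unsupported type is rejected before any request is made
end
```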

#initialize(client:, index_names:, batch_size: 100, logger: nil) ⇒ Object

:reek:ControlParameter (want to avoid creating the logger in the method definition)

Parameters:

  • client (JayAPI::Elasticsearch::Client)

    The Elasticsearch Client object.

  • index_names (Array<String>)

    The names of the Elasticsearch indexes.

  • batch_size (Integer) (defaults to: 100)

    The size of the batch. When this many items are pushed into the index they are flushed to the Elasticsearch instance.

  • logger (Logging::Logger, nil) (defaults to: nil)

    The logger object to use. If none is given, a new one will be created.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 38

def initialize(client:, index_names:, batch_size: 100, logger: nil)
  @logger = logger || Logging.logger[self]

  @client = client
  @index_names = index_names
  @batch_size = batch_size

  @batch = []
end

#push(data) ⇒ Object

Pushes a record into the index. (This does not send the record to the Elasticsearch instance, only puts it into the send queue).

Parameters:

  • data (Hash)

    The data to be pushed to the index.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 51

def push(data)
  index_names.each do |index_name|
    batch << { index: { _index: index_name, _type: 'nested', data: data } }
  end

  flush! if batch.size >= batch_size
end
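
The interplay between the send queue, batch_size and flushing can be sketched with a small stand-in. `BatchQueue` is a hypothetical class written for this example: it hands flushed batches to a block instead of an Elasticsearch client, it ignores the per-index fan-out shown above, and it assumes that `flush!` sends the whole batch and then empties the queue.

```ruby
# Hypothetical stand-in for the batching behaviour of #push and #flush.
class BatchQueue
  def initialize(batch_size:, &sender)
    @batch_size = batch_size
    @sender = sender
    @batch = []
  end

  # Queues a record and flushes once the batch is full.
  def push(data)
    @batch << data
    flush! if @batch.size >= @batch_size
  end

  # Sends pending records, if any, and clears the queue.
  def flush
    return unless @batch.any?

    flush!
  end

  def queue_size
    @batch.size
  end

  private

  # Assumed behaviour: send everything in one go, then empty the queue.
  def flush!
    @sender.call(@batch.dup)
    @batch.clear
  end
end

sent = []
queue = BatchQueue.new(batch_size: 2) { |records| sent << records }

queue.push({ id: 1 })
queue.push({ id: 2 }) # reaches batch_size, so the batch is sent
queue.push({ id: 3 }) # stays queued
pending_before_flush = queue.queue_size
queue.flush # sends the remaining record
```

Records pushed after the last automatic flush stay in the queue, so callers of the real module would typically call #flush once they are done pushing.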

#queue_size ⇒ Integer

Returns the number of elements currently on the send queue.

Returns:

  • (Integer)

    The number of items in the send queue.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 116

def queue_size
  batch.size
end

#search(query, batch_counter: nil, type: nil) ⇒ JayAPI::Elasticsearch::QueryResults

Performs a query on the index. For more information on how to build the query, please refer to the Elasticsearch Query DSL documentation: www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

Parameters:

  • query (Hash)

    The query to perform.

  • batch_counter (JayAPI::Elasticsearch::BatchCounter, nil) (defaults to: nil)

    Object keeping track of batches.

  • type (Symbol, nil) (defaults to: nil)

    Type of query, at the moment either nil or :search_after.

Returns:

  • (JayAPI::Elasticsearch::QueryResults)

    The results of the query.

Raises:

  • (Elasticsearch::Transport::Transport::ServerError)

    If the query fails.



# File 'lib/jay_api/elasticsearch/indexable.rb', line 96

def search(query, batch_counter: nil, type: nil)
  begin
    response = Response.new(client.search(index: index_names, body: query))
  rescue ::Elasticsearch::Transport::Transport::Errors::BadRequest
    logger.error "The 'search' query is invalid: #{JSON.pretty_generate(query)}"
    raise
  end
  query_results(query, response, batch_counter, type)
end
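
A query is a plain Ruby Hash that follows the Elasticsearch Query DSL. The sketch below shows one such Hash and how the method above would format it when logging an invalid query; the field name, value and size are made up for illustration.

```ruby
require 'json'

# Illustrative query Hash; the field name and value are assumptions.
query = {
  query: { match: { 'test_case.name' => 'login' } },
  size: 10
}

# The error message logged by #search embeds the query in this form.
message = "The 'search' query is invalid: #{JSON.pretty_generate(query)}"
puts message
```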