Module: ElasticSearch::Index::ClassMethods

Extended by:
Forwardable
Defined in:
lib/elastic_search/index.rb

Instance Method Details

#base_url ⇒ String

Returns the ElasticSearch base URL, ie protocol and host with port. Override to specify an index-specific ElasticSearch cluster.

Returns:

  • (String)

    The ElasticSearch base URL



# File 'lib/elastic_search/index.rb', line 539

def base_url
  ElasticSearch::Config[:base_url]
end
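
As a rough illustration of the override described above, the following sketch uses stand-ins rather than the gem itself. The names SketchConfig, BaseIndexSketch and AnalyticsIndexSketch, as well as both URLs, are assumptions made up for this example. It only shows that an index class overriding base_url no longer follows the global setting.

```ruby
# Stand-in for ElasticSearch::Config; the URL is a made-up value.
SketchConfig = { base_url: "http://127.0.0.1:9200" }

# Minimal stand-in for an index class with the default behaviour.
class BaseIndexSketch
  def self.base_url
    SketchConfig[:base_url]
  end
end

# Hypothetical index that lives on a dedicated cluster.
class AnalyticsIndexSketch < BaseIndexSketch
  def self.base_url
    "http://analytics-es.internal:9200"
  end
end

default_url = BaseIndexSketch.base_url
custom_url = AnalyticsIndexSketch.base_url
```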

#bulk(options = {}) ⇒ Object

Initiates and yields the bulk object, such that index, import, create, update and delete requests can be appended to the bulk request. Sends a refresh request afterwards if auto_refresh is enabled.

Examples:

CommentIndex.bulk ignore_errors: [409] do |bulk|
  bulk.create comment.id, JSON.generate(CommentIndex.serialize(comment)),
    version: comment.version, version_type: "external_gte"

  bulk.delete comment.id, routing: comment.user_id

  # ...
end

Parameters:

  • options (Hash) (defaults to: {})

    Specifies options regarding the bulk indexing

Options Hash (options):

  • ignore_errors (Array)

    Specifies an array of HTTP status codes that shouldn't raise any exceptions, eg 409 for conflicts, ie when optimistic concurrency control is used.

  • raise (Boolean)

    Set to false to prevent exceptions from being raised. Please note that this only applies to the bulk response, not to the request in general, so connection errors, etc will still raise.

# File 'lib/elastic_search/index.rb', line 508

def bulk(options = {})
  ElasticSearch::Bulk.new("#{type_url}/_bulk", ElasticSearch::Config[:bulk_limit], options) do |indexer|
    yield indexer
  end

  refresh if ElasticSearch::Config[:auto_refresh]
end

#create(scope, options = {}, _index_options = {}) ⇒ Object

Indexes the given record set, array of records or individual record using ElasticSearch's create operation via the Bulk API, such that the request will fail if a record with a particular primary key already exists in ElasticSearch.



# File 'lib/elastic_search/index.rb', line 439

def create(scope, options = {}, _index_options = {})
  bulk options do |indexer|
    each_record(scope, index_scope: true) do |object|
      indexer.create record_id(object), JSON.generate(serialize(object)), index_options(object).merge(_index_options)
    end
  end

  scope
end

#create_index ⇒ Object

Creates the index within ElasticSearch and applies index settings, if specified. Raises ElasticSearch::ResponseError in case any errors occur.



# File 'lib/elastic_search/index.rb', line 300

def create_index
  ElasticSearch::HTTPClient.put(index_url, json: index_settings)

  true
end

#delete(scope, options = {}, _index_options = {}) ⇒ Object

Deletes the given record set, array of records or individual record from ElasticSearch using the Bulk API.



# File 'lib/elastic_search/index.rb', line 473

def delete(scope, options = {}, _index_options = {})
  bulk options do |indexer|
    each_record(scope) do |object|
      indexer.delete record_id(object), index_options(object).merge(_index_options)
    end
  end

  scope
end

#delete_index ⇒ Object

Deletes the index from ElasticSearch. Raises ElasticSearch::ResponseError in case any errors occur.



# File 'lib/elastic_search/index.rb', line 319

def delete_index
  ElasticSearch::HTTPClient.delete(index_url)

  true
end

#each_record(scope, index_scope: false) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Used to iterate a record set. Here, a record set may be a) an ActiveRecord::Relation or anything responding to #find_each, b) an Array of records or anything responding to #each or c) a single record.

Parameters:

  • scope

    The record set that gets iterated

  • index_scope (Boolean)

    Set to true if you want the index scope to be applied to the scope



# File 'lib/elastic_search/index.rb', line 124

def each_record(scope, index_scope: false)
  return enum_for(:each_record, scope) unless block_given?

  if scope.respond_to?(:find_each)
    (index_scope ? self.index_scope(scope) : scope).find_each do |record|
      yield record
    end
  else
    (scope.respond_to?(:each) ? scope : Array(scope)).each do |record|
      yield record
    end
  end
end
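
The three kinds of record sets described above can be tried out with a standalone sketch. This is not the gem's code: EachRecordSketch and FakeRelation are hypothetical names, FakeRelation merely imitates the #find_each interface of an ActiveRecord::Relation, and the index_scope handling is left out for brevity.

```ruby
# Standalone copy of the dispatch logic, without the index_scope handling.
module EachRecordSketch
  def self.each_record(scope)
    return enum_for(:each_record, scope) unless block_given?

    if scope.respond_to?(:find_each)
      # a) relation-like objects are iterated in batches via #find_each
      scope.find_each { |record| yield record }
    else
      # b) anything enumerable is iterated as is, c) a single record is wrapped
      (scope.respond_to?(:each) ? scope : Array(scope)).each { |record| yield record }
    end
  end
end

# Hypothetical stand-in for an ActiveRecord::Relation.
class FakeRelation
  def initialize(records)
    @records = records
  end

  def find_each(&block)
    @records.each(&block)
  end
end

from_relation = EachRecordSketch.each_record(FakeRelation.new([1, 2])).to_a
from_array = EachRecordSketch.each_record([3, 4]).to_a
from_single = EachRecordSketch.each_record(:single_record).to_a
```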

#fetch_records(ids) ⇒ Object

Returns a record set, usually an ActiveRecord::Relation, for the specified ids, ie primary keys. Override this method for custom primary keys and/or ORMs.

Parameters:

  • ids (Array)

    The array of ids to fetch the records for

Returns:

  • The record set or an array of records



# File 'lib/elastic_search/index.rb', line 166

def fetch_records(ids)
  model.where(id: ids)
end

#get(id, params = {}) ⇒ Hash

Retrieves the document specified by id from ElasticSearch. Raises ElasticSearch::ResponseError-specific exceptions in case any errors occur.

Returns:

  • (Hash)

    The specified document



# File 'lib/elastic_search/index.rb', line 370

def get(id, params = {})
  ElasticSearch::HTTPClient.headers(accept: "application/json").get("#{type_url}/#{id}", params: params).parse
end

#get_index_settings ⇒ Hash

Fetches the index settings from ElasticSearch. Sends a GET request to index_url/_settings. Raises ElasticSearch::ResponseError in case any errors occur.

Returns:

  • (Hash)

    The index settings



# File 'lib/elastic_search/index.rb', line 292

def get_index_settings
  ElasticSearch::HTTPClient.headers(accept: "application/json").get("#{index_url}/_settings").parse
end

#get_mapping ⇒ Hash

Retrieves the current type mapping from ElasticSearch. Raises ElasticSearch::ResponseError in case any errors occur.

Returns:

  • (Hash)

    The current type mapping



# File 'lib/elastic_search/index.rb', line 360

def get_mapping
  ElasticSearch::HTTPClient.headers(accept: "application/json").get("#{type_url}/_mapping").parse
end

#import(*args) ⇒ Object

Indexes the given record set, array of records or individual record. Alias for #index.



# File 'lib/elastic_search/index.rb', line 388

def import(*args)
  index(*args)
end

#index(scope, options = {}, _index_options = {}) ⇒ Object

Indexes the given record set, array of records or individual record. A record set is usually an ActiveRecord::Relation, but can come from any other ORM as well. Uses the ElasticSearch bulk API no matter what is provided. Refreshes the index if auto_refresh is enabled. Raises ElasticSearch::ResponseError in case any errors occur.

Examples:

CommentIndex.import Comment.all
CommentIndex.import [comment1, comment2]
CommentIndex.import Comment.first
CommentIndex.import Comment.all, ignore_errors: [409]
CommentIndex.import Comment.all, raise: false

Parameters:

  • scope

    A record set, array of records or individual record to index

  • options (Hash) (defaults to: {})

    Specifies options regarding the bulk indexing

  • _index_options (Hash) (defaults to: {})

    Provides custom index options for eg routing, versioning, etc

Options Hash (options):

  • ignore_errors (Array)

    Specifies an array of HTTP status codes that shouldn't raise any exceptions, eg 409 for conflicts, ie when optimistic concurrency control is used.

  • raise (Boolean)

    Set to false to prevent exceptions from being raised. Please note that this only applies to the bulk response, not to the request in general, so connection errors, etc will still raise.

# File 'lib/elastic_search/index.rb', line 421

def index(scope, options = {}, _index_options = {})
  bulk options do |indexer|
    each_record(scope, index_scope: true) do |object|
      indexer.index record_id(object), JSON.generate(serialize(object)), index_options(object).merge(_index_options)
    end
  end

  scope
end
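
The listing above merges the per-record index options with the options passed as the third argument. A small sketch of that merge, using made-up values for both hashes, shows which side wins on a key collision.

```ruby
# Hypothetical per-record defaults, as an overridden index_options might return.
record_options = { routing: 42, version: 3, version_type: "external_gte" }

# Hypothetical per-call options passed as the third argument to #index.
call_options = { version: 4 }

# Hash#merge lets the argument win on key collisions, so per-call options
# take precedence over the per-record defaults.
effective = record_options.merge(call_options)
```

The same merge is used by #create, #update and #delete in the listings on this page.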

#index_exists? ⇒ Boolean

Returns whether or not the associated ElasticSearch index already exists.

Returns:

  • (Boolean)

    Whether or not the index exists



# File 'lib/elastic_search/index.rb', line 276

def index_exists?
  ElasticSearch::HTTPClient.headers(accept: "application/json").head(index_url)

  true
rescue ElasticSearch::ResponseError => e
  return false if e.code == 404

  raise e
end
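
The rescue logic above treats a 404 as "index does not exist" and re-raises every other error. The following sketch isolates that pattern. SketchResponseError and exists_sketch? are hypothetical stand-ins; only a code reader on the error class is assumed, and the block takes the place of the HEAD request.

```ruby
# Stand-in for ElasticSearch::ResponseError; only the `code` reader is assumed.
class SketchResponseError < StandardError
  attr_reader :code

  def initialize(code)
    @code = code
    super("unexpected status #{code}")
  end
end

# Maps the outcome of the block to a boolean, re-raising anything but a 404.
def exists_sketch?
  yield
  true
rescue SketchResponseError => e
  return false if e.code == 404

  raise e
end

found = exists_sketch? { :ok }
missing = exists_sketch? { raise SketchResponseError.new(404) }

# Any other status code is passed on to the caller.
failure_code = begin
  exists_sketch? { raise SketchResponseError.new(500) }
rescue SketchResponseError => e
  e.code
end
```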

#index_name ⇒ String

Returns the base name of the index within ElasticSearch, ie the index name without prefix. Equals #type_name by default.

Returns:

  • (String)

    The base name of the index, ie without prefix



# File 'lib/elastic_search/index.rb', line 237

def index_name
  type_name
end

#index_name_with_prefix ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns the full name of the index within ElasticSearch, ie with prefix specified via ElasticSearch::Config.

Returns:

  • (String)

    The full index name



# File 'lib/elastic_search/index.rb', line 248

def index_name_with_prefix
  "#{ElasticSearch::Config[:index_prefix]}#{index_name}"
end

#index_options(record) ⇒ Hash

Override this method to automatically pass index options for a record at index-time, like routing or versioning.

Examples:

def self.index_options(comment)
  {
    routing: comment.user_id,
    version: comment.version,
    version_type: "external_gte"
  }
end

Parameters:

  • record

    The record that gets indexed

Returns:

  • (Hash)

    The index options



# File 'lib/elastic_search/index.rb', line 67

def index_options(record)
  {}
end

#index_scope(scope) ⇒ Object

Override this method to specify an index scope, which will automatically be applied to scopes, eg. ActiveRecord::Relation objects, passed to #import or #index. This can be used to preload associations that are used when serializing records or to restrict the records you want to index.

Examples:

Preloading an association

class CommentIndex
  # ...

  def self.index_scope(scope)
    scope.preload(:user)
  end
end

CommentIndex.import(Comment.all) # => CommentIndex.import(Comment.preload(:user))

Restricting records

class CommentIndex
  # ...

  def self.index_scope(scope)
    scope.where(public: true)
  end
end

CommentIndex.import(Comment.all) # => CommentIndex.import(Comment.where(public: true))

Parameters:

  • scope

    The supplied scope to extend

Returns:

  • The extended scope



# File 'lib/elastic_search/index.rb', line 202

def index_scope(scope)
  scope
end

#index_settings ⇒ Hash

Override to specify index settings like number of shards, analyzers, refresh interval, etc.

Examples:

def self.index_settings
  {
    settings: {
      number_of_shards: 10,
      number_of_replicas: 2
    }
  }
end

Returns:

  • (Hash)

    The index settings



# File 'lib/elastic_search/index.rb', line 267

def index_settings
  {}
end

#index_url(base_url: self.base_url) ⇒ String

Returns the ElasticSearch index URL, ie base URL and index name with prefix.

Returns:

  • (String)

    The ElasticSearch index URL



# File 'lib/elastic_search/index.rb', line 530

def index_url(base_url: self.base_url)
  "#{base_url}/#{index_name_with_prefix}"
end

#mapping ⇒ Object

Specifies a type mapping. Override to specify a custom mapping.

Examples:

def self.mapping
  {
    comments: {
      _all: {
        enabled: false
      },
      properties: {
        email: { type: "string", analyzer: "custom_analyzer" }
      }
    }
  }
end


# File 'lib/elastic_search/index.rb', line 341

def mapping
  { type_name => {} }
end

#record_id(record) ⇒ String, Fixnum

Returns the record's id, ie the unique identifier or primary key of a record. Override this method for custom primary keys, but return a String or Fixnum.

Examples:

Default implementation

def self.record_id(record)
  record.id
end

Custom primary key

def self.record_id(user)
  user.username
end

Parameters:

  • record

    The record to get the primary key for

Returns:

  • (String, Fixnum)

    The record's primary key



# File 'lib/elastic_search/index.rb', line 155

def record_id(record)
  record.id
end

#refresh ⇒ Object

Sends an index refresh request to ElasticSearch. Raises ElasticSearch::ResponseError in case any errors occur.



# File 'lib/elastic_search/index.rb', line 377

def refresh
  ElasticSearch::HTTPClient.post("#{index_url}/_refresh", json: {})

  true
end

#relation ⇒ ElasticSearch::Relation

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Creates an ElasticSearch::Relation for the current index, which is used as a base for chaining relation methods.

Returns:

  • (ElasticSearch::Relation)



# File 'lib/elastic_search/index.rb', line 213

def relation
  ElasticSearch::Relation.new(target: self)
end

#scope(name, &block) ⇒ Object

Adds a named scope to the index.

Examples:

scope(:active) { where(active: true) }

UserIndex.active

scope(:active) { |value| where(active: value) }

UserIndex.active(true)
UserIndex.active(false)

Parameters:

  • name (Symbol)

    The name of the scope

  • block

    The scope definition. Add filters, etc.



# File 'lib/elastic_search/index.rb', line 110

def scope(name, &block)
  define_singleton_method(name, &block)
end
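
Since the listing above relies on define_singleton_method, a scope is simply a class-level method whose body is the given block. The sketch below demonstrates this with a stand-in class. UserIndexSketch and its where method are assumptions for illustration; where just returns its arguments instead of building a relation.

```ruby
# Minimal stand-in for an index class; `where` just returns its arguments
# so the effect of a scope is visible without a real relation.
class UserIndexSketch
  def self.scope(name, &block)
    define_singleton_method(name, &block)
  end

  def self.where(conditions)
    [:where, conditions]
  end

  # A scope without arguments and one taking a value.
  scope(:active) { where(active: true) }
  scope(:with_state) { |value| where(state: value) }
end

active = UserIndexSketch.active
pending = UserIndexSketch.with_state("pending")
```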

#serialize(record) ⇒ Hash

This method is abstract.

Override this method to generate a hash representation of a record, used to generate the JSON representation of it.

Examples:

def self.serialize(comment)
  {
    id: comment.id,
    user_id: comment.user_id,
    message: comment.message,
    created_at: comment.created_at,
    updated_at: comment.updated_at
  }
end

Parameters:

  • record

    The record that gets serialized

Returns:

  • (Hash)

    The hash-representation of the record

Raises:

  • (NotImplementedError)


# File 'lib/elastic_search/index.rb', line 90

def serialize(record)
  raise NotImplementedError
end
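
The abstract-method pattern above can be sketched without the gem as follows. AbstractIndexSketch and NoteIndexSketch are hypothetical names, and records are plain hashes here. One Ruby detail worth knowing: NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue will not catch it.

```ruby
# Stand-in for the abstract base behaviour.
class AbstractIndexSketch
  def self.serialize(record)
    raise NotImplementedError
  end
end

# Hypothetical concrete index; records are plain hashes in this sketch.
class NoteIndexSketch < AbstractIndexSketch
  def self.serialize(note)
    { id: note[:id], message: note[:message] }
  end
end

# Only the whitelisted attributes end up in the hash representation.
serialized = NoteIndexSketch.serialize({ id: 7, message: "hi", secret: "x" })

# The error has to be rescued explicitly, as it is not a StandardError.
unimplemented = begin
  AbstractIndexSketch.serialize(nil)
  false
rescue NotImplementedError
  true
end
```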

#type_name ⇒ String

Override to specify the type name used within ElasticSearch. Note that this gem uses an individual index for each index class, because ElasticSearch requires the same mapping for the same field name, even if the field lives in different types of the same index.

Returns:

  • (String)

    The name used for the type within the index

Raises:

  • (NotImplementedError)


# File 'lib/elastic_search/index.rb', line 228

def type_name
  raise NotImplementedError
end

#type_url(base_url: self.base_url) ⇒ String

Returns the full ElasticSearch type URL, ie base URL, index name with prefix and type name.

Returns:

  • (String)

    The ElasticSearch type URL



# File 'lib/elastic_search/index.rb', line 521

def type_url(base_url: self.base_url)
  "#{index_url(base_url: base_url)}/#{type_name}"
end
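
To see how base URL, prefix, index name and type name combine, the naming and URL helpers can be reproduced in a standalone sketch. URL_SKETCH_CONFIG, CommentIndexSketch, the prefix "test_" and both hosts are assumptions chosen for this example; the method bodies mirror the listings on this page.

```ruby
# Stand-in configuration; both values are made up for the example.
URL_SKETCH_CONFIG = { base_url: "http://127.0.0.1:9200", index_prefix: "test_" }

# Minimal stand-in reproducing only the naming and URL helpers.
class CommentIndexSketch
  def self.base_url
    URL_SKETCH_CONFIG[:base_url]
  end

  def self.type_name
    "comments"
  end

  # Equals the type name by default.
  def self.index_name
    type_name
  end

  def self.index_name_with_prefix
    "#{URL_SKETCH_CONFIG[:index_prefix]}#{index_name}"
  end

  def self.index_url(base_url: self.base_url)
    "#{base_url}/#{index_name_with_prefix}"
  end

  def self.type_url(base_url: self.base_url)
    "#{index_url(base_url: base_url)}/#{type_name}"
  end
end

index_url = CommentIndexSketch.index_url
type_url = CommentIndexSketch.type_url

# The base_url keyword allows targeting another host for a single call.
other_url = CommentIndexSketch.type_url(base_url: "http://other-host:9200")
```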

#update(scope, options = {}, _index_options = {}) ⇒ Object

Indexes the given record set, array of records or individual record using ElasticSearch's update operation via the Bulk API, such that the request will fail if a record you want to update does not already exist in ElasticSearch.



# File 'lib/elastic_search/index.rb', line 457

def update(scope, options = {}, _index_options = {})
  bulk options do |indexer|
    each_record(scope, index_scope: true) do |object|
      indexer.update record_id(object), JSON.generate(:doc => serialize(object)), index_options(object).merge(_index_options)
    end
  end

  scope
end
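
Unlike #index and #create, the listing above wraps the serialized record in a doc key before generating the JSON. The difference between the two payloads can be seen in this small sketch, in which the serialized hash is a made-up example of what #serialize might return.

```ruby
require "json"

# Hypothetical serialized record, as #serialize might return it.
serialized = { id: 1, message: "hello" }

# Payload built by #index and #create: the document itself.
index_payload = JSON.generate(serialized)

# Payload built by #update: the document wrapped in a "doc" key.
update_payload = JSON.generate({ doc: serialized })
```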

#update_index_settings ⇒ Object

Updates the index settings within ElasticSearch according to the index settings specified. Raises ElasticSearch::ResponseError in case any errors occur.



# File 'lib/elastic_search/index.rb', line 310

def update_index_settings
  ElasticSearch::HTTPClient.put("#{index_url}/_settings", json: index_settings)

  true
end

#update_mapping ⇒ Object

Updates the type mapping within ElasticSearch according to the mapping currently specified. Raises ElasticSearch::ResponseError in case any errors occur.



# File 'lib/elastic_search/index.rb', line 349

def update_mapping
  ElasticSearch::HTTPClient.put("#{type_url}/_mapping", json: mapping)

  true
end