Module: ElasticGraph::DatastoreCore::IndexDefinition::Base

Included in: Index, RolloverIndexTemplate

Defined in: lib/elastic_graph/datastore_core/index_definition/base.rb

Overview

This module contains the implementation logic shared by the rollover and non-rollover implementations of the IndexDefinition type.

Instance Method Summary

#accessible_cluster_names_to_index_into ⇒ Object
#accessible_from_queries? ⇒ Boolean

Indicates whether the index is accessible from GraphQL queries, based on whether `cluster_to_query` is a defined cluster.
#all_accessible_cluster_names ⇒ Object

Returns a list of all defined datastore clusters this index resides within.
#cluster_to_query ⇒ Object
#clusters_to_index_into ⇒ Object
#flattened_env_setting_overrides ⇒ Object

Returns any setting overrides for this index from the environment-specific config file, after flattening it so that it can be directly used in a create index request.
#has_custom_routing? ⇒ Boolean
#ignored_values_for_routing ⇒ Object
#known_related_query_rollover_indices ⇒ Object

Returns a list of indices related to this template in the datastore cluster this index definition is configured to query.
#list_counts_field_paths_for_source(source) ⇒ Object

Returns a set of all of the field paths to subfields of the special `LIST_COUNTS_FIELD` that contains the element counts of all list fields.
#routing_value_for_prepared_record(prepared_record, route_with_path: route_with, id_path: "id") ⇒ Object

Gets the routing value for the given `prepared_record`.
#searches_could_hit_incomplete_docs? ⇒ Boolean

Indicates if a search on this index definition may hit incomplete documents.
#to_s ⇒ Object (also: #inspect)
#use_updates_for_indexing? ⇒ Boolean

Instance Method Details

#accessible_cluster_names_to_index_into ⇒ `Object`

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 105

def accessible_cluster_names_to_index_into
  @accessible_cluster_names_to_index_into ||= clusters_to_index_into.select do |name|
    defined_clusters.include?(name)
  end
end

#accessible_from_queries? ⇒ `Boolean`

Indicates whether the index is accessible from GraphQL queries, based on whether `cluster_to_query` is a defined cluster. This is used to hide GraphQL schema elements that can’t be queried when our config omits the means to query an index (e.g. due to lacking a configured URL).

Returns:

(Boolean)

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 115

def accessible_from_queries?
  return false unless (cluster = cluster_to_query)
  defined_clusters.include?(cluster)
end
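
The accessibility check above can be tried in isolation. The following is a minimal sketch, not the module itself: `IndexDefSketch` and the cluster names are hypothetical stand-ins, and `defined_clusters` is assumed to be a plain collection of cluster names.

```ruby
# Hypothetical stand-in for an index definition, holding only the two
# values that the accessibility check depends on.
IndexDefSketch = Struct.new(:cluster_to_query, :defined_clusters) do
  def accessible_from_queries?
    # No query cluster configured at all: not accessible.
    return false unless (cluster = cluster_to_query)
    # Configured, but only accessible if that cluster is actually defined.
    defined_clusters.include?(cluster)
  end
end

defined = ["main"]
queryable    = IndexDefSketch.new("main", defined)
undefined    = IndexDefSketch.new("other", defined)
unconfigured = IndexDefSketch.new(nil, defined)

results = [queryable, undefined, unconfigured].map(&:accessible_from_queries?)
puts results.inspect
```

Only the first definition names a cluster that is both configured and defined, so it is the only one reported as accessible.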

#all_accessible_cluster_names ⇒ `Object`

Returns a list of all defined datastore clusters this index resides within.

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 97

def all_accessible_cluster_names
  @all_accessible_cluster_names ||=
    # Using `_` because steep doesn't understand that `compact` removes nils.
    (clusters_to_index_into + [_ = cluster_to_query]).compact.uniq.select do |name|
      defined_clusters.include?(name)
    end
end

#cluster_to_query ⇒ `Object`



# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 78

def cluster_to_query
  env_index_config.query_cluster
end

#clusters_to_index_into ⇒ `Object`

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 82

def clusters_to_index_into
  env_index_config.index_into_clusters.tap do |clusters_to_index_into|
    raise ConfigError, "No `index_into_clusters` defined for #{self} in env_index_config" unless clusters_to_index_into
  end
end

#flattened_env_setting_overrides ⇒ `Object`

Returns any setting overrides for this index from the environment-specific config file, after flattening it so that it can be directly used in a create index request.

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 21

def flattened_env_setting_overrides
  @flattened_env_setting_overrides ||= Support::HashUtil.flatten_and_stringify_keys(
    env_index_config.setting_overrides,
    prefix: "index"
  )
end
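
To illustrate what flattening means here, the sketch below uses a hypothetical local helper that joins nested keys with dots. It is an assumption about the shape of the result, not the actual `Support::HashUtil.flatten_and_stringify_keys` implementation, and the example overrides are made up.

```ruby
# Hypothetical helper: flattens a nested hash into a single-level hash whose
# string keys are the dot-joined paths, optionally starting from a prefix.
def flatten_and_stringify_keys_sketch(hash, prefix: nil)
  hash.each_with_object({}) do |(key, value), result|
    full_key = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      # Recurse, carrying the path built so far as the new prefix.
      result.merge!(flatten_and_stringify_keys_sketch(value, prefix: full_key))
    else
      result[full_key] = value
    end
  end
end

# Made-up overrides, as they might appear in an environment-specific config.
overrides = {number_of_shards: 3, mapping: {total_fields: {limit: 2000}}}
flattened = flatten_and_stringify_keys_sketch(overrides, prefix: "index")
puts flattened.inspect
```

Under these assumptions the result has the keys `"index.number_of_shards"` and `"index.mapping.total_fields.limit"`, which is the flat settings form a create index request accepts.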

#has_custom_routing? ⇒ `Boolean`

Returns:

(Boolean)



# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 45

def has_custom_routing?
  route_with != "id"
end

#ignored_values_for_routing ⇒ `Object`



# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 92

def ignored_values_for_routing
  env_index_config.ignore_routing_values
end

#known_related_query_rollover_indices ⇒ `Object`

Returns a list of indices related to this template in the datastore cluster this index definition is configured to query. Note that for performance reasons, this method memoizes the result of querying the datastore for its current list of indices, and as a result the return value may be out of date. If it is absolutely essential that you get an up-to-date list of related indices, use `related_rollover_indices(datastore_client)` instead of this method.

Note, however, that indices generally change very rarely (say, monthly or yearly) and as such this will very rarely be out of date, even with the memoization.

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 129

def known_related_query_rollover_indices
  @known_related_query_rollover_indices ||= cluster_to_query&.then do |name|
    # For query purposes, we only want indices that exist. If we return an index that is defined in our configuration
    # but does not exist, and that gets used in a search index expression (even for the purposes of excluding it!),
    # the datastore will return an error.
    related_rollover_indices(datastore_clients_by_name.fetch(name), only_if_exists: true)
  end || []
end
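
The staleness trade-off described above can be shown with a small sketch. Everything here is hypothetical: `RolloverTemplateSketch` stands in for the including class, a lambda stands in for the datastore client, and the index names are made up. The real `related_rollover_indices` takes a datastore client and an `only_if_exists:` option, which this sketch does not model.

```ruby
# Hypothetical stand-in showing a memoized lookup next to a fresh one.
class RolloverTemplateSketch
  def initialize(client)
    @client = client
  end

  # Always asks the (fake) datastore for the current list.
  def related_rollover_indices(client)
    client.call
  end

  # Asks once, then keeps returning the first answer.
  def known_related_query_rollover_indices
    @known_related_query_rollover_indices ||= related_rollover_indices(@client)
  end
end

indices_in_datastore = ["widgets_rollover__2023"]
fake_client = -> { indices_in_datastore.dup }
template = RolloverTemplateSketch.new(fake_client)

first_call = template.known_related_query_rollover_indices
indices_in_datastore << "widgets_rollover__2024" # a new index appears later
cached = template.known_related_query_rollover_indices
fresh = template.related_rollover_indices(fake_client)

puts "cached: #{cached.size}, fresh: #{fresh.size}"
```

The memoized call keeps reporting one index after the second is created, while the direct call sees both.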

#list_counts_field_paths_for_source(source) ⇒ `Object`

Returns a set of all of the field paths to subfields of the special `LIST_COUNTS_FIELD` that contains the element counts of all list fields. The returned set is filtered based on the provided `source` to only contain the paths of fields that are populated by the given source.

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 142

def list_counts_field_paths_for_source(source)
  @list_counts_field_paths_for_source ||= {} # : ::Hash[::String, ::Set[::String]]
  @list_counts_field_paths_for_source[source] ||= identify_list_counts_field_paths_for_source(source)
end
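
The method above memoizes per argument by keeping a hash keyed on `source`. A minimal sketch of that pattern follows. The class, the source names, and the field paths are hypothetical, and the `identify` method is a placeholder for whatever `identify_list_counts_field_paths_for_source` actually computes.

```ruby
require "set"

# Hypothetical stand-in demonstrating per-argument memoization.
class ListCountsSketch
  attr_reader :computations

  def initialize(paths_by_source)
    @paths_by_source = paths_by_source
    @computations = 0
  end

  def list_counts_field_paths_for_source(source)
    @list_counts_field_paths_for_source ||= {}
    # One cache entry per source; computed only on the first request.
    @list_counts_field_paths_for_source[source] ||= identify(source)
  end

  private

  # Placeholder for the real (more expensive) path identification.
  def identify(source)
    @computations += 1
    Set.new(@paths_by_source.fetch(source, []))
  end
end

sketch = ListCountsSketch.new(
  "widget" => ["counts.tags", "counts.parts"],
  "payment" => ["counts.refunds"]
)

widget_paths = sketch.list_counts_field_paths_for_source("widget")
sketch.list_counts_field_paths_for_source("widget")  # served from the cache
payment_paths = sketch.list_counts_field_paths_for_source("payment")

puts "computations: #{sketch.computations}"
```

Because a `Set` is always truthy in Ruby, even an empty result is cached by `||=` and is not recomputed.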

#routing_value_for_prepared_record(prepared_record, route_with_path: route_with, id_path: "id") ⇒ `Object`

Gets the routing value for the given `prepared_record`. Notably, `prepared_record` must be previously prepared with an `Indexer::RecordPreparer` in order to ensure that it uses internal index field names (to align with `route_with_path`/`route_with` which also use the internal name) rather than the public field name (which can differ).

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 32

def routing_value_for_prepared_record(prepared_record, route_with_path: route_with, id_path: "id")
  return nil unless has_custom_routing?

  unless route_with_path
    raise ConfigError, "`#{self}` uses custom routing, but `route_with_path` is misconfigured (was `nil`)"
  end

  config_routing_value = Support::HashUtil.fetch_value_at_path(prepared_record, route_with_path).to_s
  return config_routing_value unless ignored_values_for_routing.include?(config_routing_value)

  Support::HashUtil.fetch_value_at_path(prepared_record, id_path).to_s
end
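
The fallback behaviour in the listing above can be sketched with plain hashes. The helper names, field names, and ignored value below are hypothetical, and `fetch_value_at_path_sketch` assumes a dot-separated path, which may differ from how `Support::HashUtil.fetch_value_at_path` treats paths and missing keys.

```ruby
require "set"

# Hypothetical helper: looks up a value at a dot-separated path.
def fetch_value_at_path_sketch(hash, path)
  hash.dig(*path.split("."))
end

# Mirrors the fallback logic: use the configured routing field, unless its
# value is one of the ignored values, in which case route by id instead.
def routing_value_sketch(record, route_with_path:, ignored_values:, id_path: "id")
  value = fetch_value_at_path_sketch(record, route_with_path).to_s
  return value unless ignored_values.include?(value)

  fetch_value_at_path_sketch(record, id_path).to_s
end

ignored = Set["unknown"]
normal_record  = {"id" => "w1", "owner" => {"workspace_id" => "ws-7"}}
ignored_record = {"id" => "w2", "owner" => {"workspace_id" => "unknown"}}

normal_routing = routing_value_sketch(
  normal_record, route_with_path: "owner.workspace_id", ignored_values: ignored
)
fallback_routing = routing_value_sketch(
  ignored_record, route_with_path: "owner.workspace_id", ignored_values: ignored
)

puts [normal_routing, fallback_routing].inspect
```

The first record routes by its workspace value, while the second, whose value is in the ignored set, falls back to its `id`.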

#searches_could_hit_incomplete_docs? ⇒ `Boolean`

Indicates if a search on this index definition may hit incomplete documents. An incomplete document can occur when multiple event types flow into the same index. An index that has only one source type can never have incomplete documents, but an index that has 2 or more sources can have incomplete documents when the “primary” event type hasn’t yet been received for a document.

This case is notable because we need to apply automatic filtering in order to hide documents that are not yet complete.

Note: determining this value sometimes requires that we query the datastore for the record of all sources that an index has ever had. This value changes very, very rarely, and we don’t want to slow down every GraphQL query by adding the extra query against the datastore, so we cache the value here.

Returns:

(Boolean)

# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 60

def searches_could_hit_incomplete_docs?
  return @searches_could_hit_incomplete_docs if defined?(@searches_could_hit_incomplete_docs)

  if current_sources.size > 1
    # We know that incomplete docs are possible, without needing to check sources recorded in `_meta`.
    @searches_could_hit_incomplete_docs = true
  else
    # While our current configuration can't produce incomplete documents, some may already exist in the index
    # if we previously had some `sourced_from` fields (but no longer have them). Here we check for the sources
    # we've recorded in `_meta` to account for that.
    client = datastore_clients_by_name.fetch(cluster_to_query)
    recorded_sources = mappings_in_datastore(client).dig("_meta", "ElasticGraph", "sources") || []
    sources = recorded_sources.union(current_sources.to_a)

    @searches_could_hit_incomplete_docs = sources.size > 1
  end
end
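
The listing above caches with `defined?` rather than `||=`. The sketch below shows why that matters when the cached value can be `false`: `||=` would repeat the datastore lookup on every call. The class and source names are hypothetical, and an in-memory array stands in for the sources recorded in `_meta`.

```ruby
# Hypothetical stand-in that counts how often the "datastore" is consulted.
class IncompleteDocsSketch
  attr_reader :lookups

  def initialize(current_sources, recorded_sources)
    @current_sources = current_sources
    @recorded_sources = recorded_sources
    @lookups = 0
  end

  def searches_could_hit_incomplete_docs?
    # `defined?` is truthy once the variable is assigned, even to `false`.
    return @searches_could_hit_incomplete_docs if defined?(@searches_could_hit_incomplete_docs)

    @searches_could_hit_incomplete_docs =
      if @current_sources.size > 1
        true
      else
        @lookups += 1 # stands in for the query against the datastore
        (@recorded_sources | @current_sources).size > 1
      end
  end
end

single = IncompleteDocsSketch.new(["widget"], ["widget"])
single_results = [single.searches_could_hit_incomplete_docs?, single.searches_could_hit_incomplete_docs?]

formerly_multi = IncompleteDocsSketch.new(["widget"], ["widget", "payment"])
multi = IncompleteDocsSketch.new(["widget", "payment"], [])

puts single_results.inspect
puts "lookups for single-source index: #{single.lookups}"
```

The single-source index answers `false` twice but consults the fake datastore only once. The index that previously had a second source still reports `true`, and the index with two current sources never needs the lookup.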

#to_s ⇒ `Object` Also known as: inspect



# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 147

def to_s
  "#<#{self.class.name} #{name}>"
end

#use_updates_for_indexing? ⇒ `Boolean`

Returns:

(Boolean)



# File 'lib/elastic_graph/datastore_core/index_definition/base.rb', line 88

def use_updates_for_indexing?
  env_index_config.use_updates_for_indexing
end
