Module: ElasticGraph::SchemaDefinition::Indexing::ListCountsMapping

Defined in:
lib/elastic_graph/schema_definition/indexing/list_counts_mapping.rb

Overview

To support filtering on the count of a list field, we need to index the counts as we ingest events. This module defines the mapping for the special __counts field in which we store the list counts.

Class Method Summary

Class Method Details

.merged_into(mapping_hash, for_type:) ⇒ Object

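Builds the __counts field mapping for the given for_type. Returns a new hash with the extra __counts field merged into mapping_hash, or returns mapping_hash itself, unchanged, when for_type has no list fields that need counts indexed.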

# File 'lib/elastic_graph/schema_definition/indexing/list_counts_mapping.rb', line 23

def self.merged_into(mapping_hash, for_type:)
  counts_properties = for_type.indexing_fields_by_name_in_index.values.flat_map do |field|
    field.paths_to_lists_for_count_indexing.map do |path|
      # We chose the `integer` type here because:
      #
      # - While we expect datasets with more documents than the max integer value (~2B), we don't expect
      #   individual documents to have any list fields with more elements than can fit in an integer.
      # - Using `long` would allow for much larger counts, but we don't want to take up double the
      #   storage space for this.
      #
      # Note that `new_list_filter_input_type` (in `schema_definition/factory.rb`) relies on this, and
      # has chosen to use `IntFilterInput` (rather than `JsonSafeLongFilterInput`) for filtering these count values.
      # If we change the mapping type here, we should re-evaluate the filter used there.
      [path, {"type" => "integer"}]
    end
  end.to_h

  return mapping_hash if counts_properties.empty?

  Support::HashUtil.deep_merge(mapping_hash, {
    "properties" => {
      LIST_COUNTS_FIELD => {
        "properties" => counts_properties
      }
    }
  })
end
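
To make the shape of the result concrete, here is a minimal, self-contained sketch of the same merge. It is not ElasticGraph's code: `deep_merge` is a simplified stand-in for `Support::HashUtil.deep_merge`, `merge_counts` is a hypothetical helper that takes the list paths directly instead of deriving them from `for_type`, and the `"tags"` path and sample mapping are made up for illustration. The value `"__counts"` for `LIST_COUNTS_FIELD` is assumed from the field name given in the overview.

```ruby
# Assumed value, based on the `__counts` field name mentioned in the overview.
LIST_COUNTS_FIELD = "__counts"

# Simplified stand-in for Support::HashUtil.deep_merge: merges nested hashes
# recursively and lets the right-hand value win for anything else.
def deep_merge(left, right)
  left.merge(right) do |_key, left_value, right_value|
    if left_value.is_a?(Hash) && right_value.is_a?(Hash)
      deep_merge(left_value, right_value)
    else
      right_value
    end
  end
end

# Hypothetical helper mirroring the core of `merged_into`, but taking the
# list paths as an argument rather than reading them from `for_type`.
def merge_counts(mapping_hash, list_paths)
  counts_properties = list_paths.to_h { |path| [path, {"type" => "integer"}] }

  # Nothing to count: hand back the original mapping untouched.
  return mapping_hash if counts_properties.empty?

  deep_merge(mapping_hash, {
    "properties" => {
      LIST_COUNTS_FIELD => {"properties" => counts_properties}
    }
  })
end

# Made-up mapping for a type with one scalar field and one list field.
mapping = {
  "properties" => {
    "id" => {"type" => "keyword"},
    "tags" => {"type" => "keyword"}
  }
}

merged = merge_counts(mapping, ["tags"])
unchanged = merge_counts(mapping, [])
```

In this sketch, `merged` keeps the original `id` and `tags` properties and gains a `__counts` object whose `tags` sub-field is an `integer`, while `unchanged` is the very same object as `mapping` because there were no list paths to index.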