Class: ElasticGraph::GraphQL::DatastoreQuery
- Inherits: Object
  - Object
  - ElasticGraph::GraphQL::DatastoreQuery
- Defined in:
  - lib/elastic_graph/graphql/datastore_query.rb
  - lib/elastic_graph/graphql/datastore_query/paginator.rb
  - lib/elastic_graph/graphql/datastore_query/routing_picker.rb
  - lib/elastic_graph/graphql/datastore_query/document_paginator.rb
  - lib/elastic_graph/graphql/datastore_query/index_expression_builder.rb
Overview
An immutable class that represents a datastore query. Since this represents a datastore query, and not a GraphQL query, all the data in it is modeled in datastore terms, not GraphQL terms. For example, any field names in a `Query` should be references to index fields, not GraphQL fields.
Filters are modeled as a `Set` of filtering hashes. While we usually expect only a single `filter` hash, modeling it as a set makes it easy for us to support merging queries. The datastore knows how to apply multiple `must` clauses that apply to the same field, giving us the exact semantics we want in such a situation with minimal effort.
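The following is a minimal sketch of why a set of filter hashes merges cleanly. It uses only Ruby's standard `Set`; the filter hash shape and the `to_bool_query` helper are hypothetical illustrations, not ElasticGraph's actual internal representation.

```ruby
require "set"

# Hypothetical filter hashes keyed by index field name; the real filter shape may differ.
filters_a = Set[{"age" => {"gt" => 18}}]
filters_b = Set[{"age" => {"lt" => 65}}, {"age" => {"gt" => 18}}]

# Merging the filters of two queries is a set union; identical filters collapse into one.
merged = filters_a + filters_b

# Hypothetical helper: each filter becomes its own clause under a bool query's `must`.
# This sketch only handles range-style operators.
def to_bool_query(filters)
  clauses = filters.flat_map do |filter|
    filter.map { |field, ops| {range: {field => ops}} }
  end
  {bool: {must: clauses}}
end

query = to_bool_query(merged)
```

Both `age` clauses end up side by side under `must`, so the datastore applies them together without any bespoke merge logic on the Ruby side.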
Defined Under Namespace
Classes: Builder, IndexExpression, Paginator
Class Method Summary
- .perform(queries) ⇒ Object
  Performs a list of queries by building a hash of datastore msearch header/body tuples (keyed by query), yielding them to the caller, and then post-processing the results.
Instance Method Summary
- #cluster_name ⇒ Object
  Returns the name of the datastore cluster (as a String) to which this query should be sent.
- #document_paginator ⇒ Object
- #empty? ⇒ Boolean
  Indicates if the query does not need any results from the datastore.
- #hash ⇒ Object
  `DatastoreQuery` objects are used as keys in a hash.
- #inspect ⇒ Object
- #merge(other_query) ⇒ Object
  Merges the provided query, returning a new combined query object.
- #merge_with(**query_options) ⇒ Object
  Convenience method for merging when you do not have access to a `DatastoreQuery::Builder`.
- #route_with_field_paths ⇒ Object
  Returns a list of unique field paths that should be used for shard routing during searches.
- #search_index_expression ⇒ Object
  Returns an index_definition expression string to use for searches.
- #shard_routing_values ⇒ Object
  The shard routing values used for this search.
- #to_datastore_msearch_header ⇒ Object
- #to_datastore_msearch_header_and_body ⇒ Object
  Pairs the multi-search headers and body into a tuple, as per the format required by the datastore: www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html#search-multi-search-api-desc.
Class Method Details
.perform(queries) ⇒ Object
Performs a list of queries by building a hash of datastore msearch header/body tuples (keyed by query), yielding them to the caller, and then post-processing the results. The caller is responsible for returning a hash of responses by query from its block.
Note that some of the passed queries may not be yielded to the caller; when we can tell that a query does not have to be sent to the datastore we avoid yielding it from here. Therefore, the caller should not assume that all queries passed to this method will be yielded back.
The return value is a hash of `DatastoreResponse::SearchResponse` objects by query.
Note: this method uses `send` to work around Ruby visibility rules. We do not want `#decoded_cursor_factory` to be public, as we only need it here, but we cannot access it from a class method without using `send`.
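The block contract described above can be sketched with stand-ins. `FakeQuery` and `perform_sketch` are hypothetical and omit the query optimizer and the `SearchResponse` wrapping; the sketch only shows that empty queries are never yielded, while the result still contains one entry per query.

```ruby
# Hypothetical stand-in for a query; only `empty?` and a header/body tuple are modeled.
FakeQuery = Struct.new(:name, :empty) do
  def empty?
    empty
  end

  def to_header_and_body
    [{index: name}, {size: 10}]
  end
end

# Simplified sketch of the contract (no optimizer, no response post-processing).
def perform_sketch(queries)
  empty_queries, present_queries = queries.partition(&:empty?)

  tuples_by_query = present_queries.to_h { |query| [query, query.to_header_and_body] }
  responses_by_query = yield(tuples_by_query)

  # Empty queries get a canned response instead of a datastore round trip.
  empty_responses = empty_queries.to_h { |query| [query, {"hits" => []}] }
  empty_responses.merge(responses_by_query)
end

queries = [FakeQuery.new("widgets", false), FakeQuery.new("parts", true)]
yielded = nil

responses = perform_sketch(queries) do |tuples_by_query|
  yielded = tuples_by_query.keys
  # The caller must return a hash of responses keyed by the same query objects.
  tuples_by_query.to_h { |query, (_header, _body)| [query, {"hits" => ["doc-for-#{query.name}"]}] }
end
```

A caller that iterated over its own query list instead of the yielded hash would send the "parts" query unnecessarily, which is why the yielded keys are the ones to trust.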
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 93
def self.perform(queries)
  empty_queries, present_queries = queries.partition(&:empty?)

  responses_by_query = Aggregation::QueryOptimizer.optimize_queries(present_queries) do |optimized_queries|
    header_body_tuples_by_query = optimized_queries.each_with_object({}) do |query, hash|
      hash[query] = query.to_datastore_msearch_header_and_body
    end

    yield(header_body_tuples_by_query)
  end

  empty_responses = empty_queries.each_with_object({}) do |query, hash|
    hash[query] = DatastoreResponse::SearchResponse::RAW_EMPTY
  end

  empty_responses.merge(responses_by_query).each_with_object({}) do |(query, response), hash|
    hash[query] = DatastoreResponse::SearchResponse.build(response, decoded_cursor_factory: query.send(:decoded_cursor_factory))
  end.tap do |responses_hash|
    # Callers expect this `perform` method to provide an invariant: the returned hash MUST contain one entry
    # for each of the `queries` passed in the args. In practice, violating this invariant primarily causes a
    # problem when the caller uses the `GraphQL::Dataloader` (which happens for every GraphQL request in production...).
    # However, our tests do not always run queries end-to-end, so this is an added check we want to do, so that
    # anytime our logic here fails to include a query in the response in any test, we'll be notified of the
    # problem.
    expected_queries = queries.to_set
    actual_queries = responses_hash.keys.to_set

    if expected_queries != actual_queries
      missing_queries = expected_queries - actual_queries
      extra_queries = actual_queries - expected_queries

      raise SearchFailedError, "The `responses_hash` does not have the expected set of queries as keys. " \
        "This can cause problems for the `GraphQL::Dataloader` and suggests a bug in the logic that should be fixed.\n\n" \
        "Missing queries (#{missing_queries.size}):\n#{missing_queries.map(&:inspect).join("\n")}.\n\n" \
        "Extra queries (#{extra_queries.size}): #{extra_queries.map(&:inspect).join("\n")}"
    end
  end
end
Instance Method Details
#cluster_name ⇒ Object
Returns the name of the datastore cluster (as a String) to which this query should be sent. Raises a ConfigError unless exactly one cluster name is found.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 181
def cluster_name
  cluster_name = search_index_definitions.map(&:cluster_to_query).uniq
  return cluster_name.first if cluster_name.size == 1
  raise ConfigError, "Found different datastore clusters (#{cluster_name}) to query " \
    "for query targeting indices: #{search_index_definitions}"
end
#document_paginator ⇒ Object
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 252
def document_paginator
  @document_paginator ||= DocumentPaginator.new(
    sort_clauses: sort_with_tiebreaker,
    individual_docs_needed: individual_docs_needed,
    total_document_count_needed: total_document_count_needed,
    decoded_cursor_factory: decoded_cursor_factory,
    schema_element_names: schema_element_names,
    paginator: Paginator.new(
      default_page_size: default_page_size,
      max_page_size: max_page_size,
      first: document_pagination[:first],
      after: document_pagination[:after],
      last: document_pagination[:last],
      before: document_pagination[:before],
      schema_element_names: schema_element_names
    )
  )
end
#empty? ⇒ Boolean
Indicates if the query does not need any results from the datastore. As an optimization, we can reply with a default “empty” response for an empty query.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 224
def empty?
  # If we are searching no indices or routing to an empty set of shards, there is no need to query the datastore at all.
  # This only happens when our filter processing has deduced that the query will match no results.
  return true if search_index_expression.empty? || shard_routing_values&.empty?

  datastore_body = to_datastore_body
  datastore_body.fetch(:size) == 0 && !datastore_body.fetch(:track_total_hits) && aggregations_datastore_body.empty?
end
#hash ⇒ Object
`DatastoreQuery` objects are used as keys in a hash. Computing `#hash` can be expensive (given how many fields a `DatastoreQuery` has), and caching it is safe because `DatastoreQuery` instances are immutable, so we memoize it here. We've observed this making a very noticeable difference in our test suite runtime.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 248
def hash
  @hash ||= super
end
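As a rough illustration of the memoization above, here is a sketch using a hypothetical value object. `ImmutableQuery` and its fields are assumptions for the example, not ElasticGraph's class; it is immutable by convention only, since memoizing assigns an instance variable after construction.

```ruby
# Hypothetical value object; immutable by convention (no setters), but not frozen,
# because `hash` below lazily assigns `@hash`.
class ImmutableQuery
  attr_reader :filters, :sort, :requested_fields

  def initialize(filters:, sort:, requested_fields:)
    @filters = filters
    @sort = sort
    @requested_fields = requested_fields
  end

  def ==(other)
    other.is_a?(ImmutableQuery) && state == other.state
  end
  alias_method :eql?, :==

  # Memoized: safe only because the fields never change after construction.
  def hash
    @hash ||= state.hash
  end

  protected

  def state
    [filters, sort, requested_fields]
  end
end

a = ImmutableQuery.new(filters: [{"age" => 1}], sort: ["id"], requested_fields: ["name"])
b = ImmutableQuery.new(filters: [{"age" => 1}], sort: ["id"], requested_fields: ["name"])

# Equal queries hash identically, so either one can look up the cached entry.
cache = {a => :cached_response}
lookup = cache[b]
```

If a field could change after `#hash` was first called, the memoized value would go stale and hash lookups would silently miss, which is why this optimization depends on immutability.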
#inspect ⇒ Object
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 233
def inspect
  description = to_datastore_msearch_header.merge(to_datastore_body).map do |key, value|
    "#{key}=#{(key == :query) ? "<REDACTED>" : value.inspect}"
  end.join(" ")

  "#<#{self.class.name} #{description}>"
end
#merge(other_query) ⇒ Object
Merges the provided query, returning a new combined query object. Both query objects are left unchanged.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 134
def merge(other_query)
  if search_index_definitions != other_query.search_index_definitions
    raise ElasticGraph::InvalidMergeError, "`search_index_definitions` conflict while merging between " \
      "#{search_index_definitions} and #{other_query.search_index_definitions}"
  end

  with(
    individual_docs_needed: individual_docs_needed || other_query.individual_docs_needed,
    total_document_count_needed: total_document_count_needed || other_query.total_document_count_needed,
    filters: filters + other_query.filters,
    sort: merge_attribute(other_query, :sort),
    requested_fields: requested_fields + other_query.requested_fields,
    document_pagination: merge_attribute(other_query, :document_pagination),
    monotonic_clock_deadline: [monotonic_clock_deadline, other_query.monotonic_clock_deadline].compact.min,
    aggregations: aggregations.merge(other_query.aggregations)
  )
end
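Two of the merge rules in the listing are easy to misread, so here is a small sketch of them in isolation. The helper name `merged_deadline` and the sample values are hypothetical; only the expressions themselves (`[a, b].compact.min` and `Hash#merge`) come from the code above.

```ruby
# Hypothetical helper mirroring the deadline rule: keep the earliest non-nil deadline.
def merged_deadline(deadline_a, deadline_b)
  [deadline_a, deadline_b].compact.min
end

one_sided = merged_deadline(nil, 1500)   # only one query has a deadline, so it wins
earliest = merged_deadline(2000, 1500)   # both have deadlines, so the earlier one wins
neither = merged_deadline(nil, nil)      # `[].min` is nil, so no deadline is imposed

# Boolean "needed" flags are OR-ed: if either query needs documents, the merged query does.
docs_needed = false || true

# Aggregations are merged by key with `Hash#merge`; on a key collision the other query's entry wins.
aggregations = {"by_color" => :agg_1}.merge({"by_color" => :agg_2, "by_size" => :agg_3})
```

Taking the minimum deadline means the merged query is never allowed to run longer than the strictest of its inputs.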
#merge_with(**query_options) ⇒ Object
Convenience method for merging when you do not have access to a `DatastoreQuery::Builder`. Allows you to pass the query options you would like to merge. As with `#merge`, leaves the original query unchanged and returns a combined query object.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 156
def merge_with(**)
  merge(with(**))
end
#route_with_field_paths ⇒ Object
Returns a list of unique field paths that should be used for shard routing during searches.
If a search is filtering on one of these fields, we can optimize the search by routing it to only the shards containing documents for that routing value.
Note that this returns a list due to our support for type unions. A unioned type can be composed of subtypes that use different shard routing; this will return the set union of them all.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 196
def route_with_field_paths
  search_index_definitions.map(&:route_with).uniq
end
#search_index_expression ⇒ Object
Returns an index_definition expression string to use for searches. This string can specify multiple indices, use wildcards, etc. For info about what is supported, see: www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 169
def search_index_expression
  @search_index_expression ||= index_expression_builder.determine_search_index_expression(
    filters,
    search_index_definitions,
    # When we have aggregations, we must require indices to search. When we search no indices, the datastore does not return
    # the standard aggregations response structure, which causes problems.
    require_indices: !aggregations_datastore_body.empty?
  ).to_s
end
#shard_routing_values ⇒ Object
The shard routing values used for this search. Can be `nil` if the query will hit all shards. `[]` means that we are routing to no shards.
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 202
def shard_routing_values
  return @shard_routing_values if defined?(@shard_routing_values)
  routing_values = routing_picker.extract_eligible_routing_values(filters, route_with_field_paths)

  @shard_routing_values ||=
    if routing_values&.empty? && !aggregations_datastore_body.empty?
      # If we return an empty array of routing values, no shards will get searched, which causes a problem for aggregations.
      # When a query includes aggregations, there are normally aggregation structures on the response (even when there are no
      # search hits to aggregate over!) but if there are no routing values, those aggregation structures will be missing from
      # the response. It's complex to handle that in our downstream response handling code, so we prefer to force a "fallback"
      # routing value here to ensure that at least one shard gets searched. Which shard gets searched doesn't matter; the search
      # filter that led to an empty set of routing values will match on documents on any shard.
      ["fallback_shard_routing_value"]
    elsif contains_ignored_values_for_routing?(routing_values)
      nil
    else
      routing_values&.sort # order doesn't matter, but sorting it makes it easier to assert on in our tests.
    end
end
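Because `nil` is a meaningful result here (it means "search all shards"), the listing guards with `defined?` rather than relying on `||=` alone. The sketch below shows that idiom in isolation; `RoutingSketch` and its call counter are hypothetical and exist only to make the behavior observable.

```ruby
# Hypothetical class showing why `defined?` is used to memoize a value that may be nil.
class RoutingSketch
  attr_reader :computations

  def initialize(result)
    @result = result
    @computations = 0
  end

  # `@values ||= compute` alone would recompute on every call whenever the result is nil,
  # since `||=` treats a stored nil the same as "never computed".
  def values
    return @values if defined?(@values)
    @values = compute
  end

  private

  def compute
    @computations += 1
    @result
  end
end

all_shards = RoutingSketch.new(nil)       # nil: no routing restriction
3.times { all_shards.values }

some_shards = RoutingSketch.new(["a", "b"])
3.times { some_shards.values }
```

In both cases the computation runs once; without the `defined?` guard, the `nil` case would rerun it on every call.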
#to_datastore_msearch_header ⇒ Object
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 241
def to_datastore_msearch_header
  @to_datastore_msearch_header ||= {index: search_index_expression, routing: shard_routing_values&.join(",")}.compact
end
#to_datastore_msearch_header_and_body ⇒ Object
Pairs the multi-search headers and body into a tuple, as per the format required by the datastore: www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html#search-multi-search-api-desc
# File 'lib/elastic_graph/graphql/datastore_query.rb', line 162
def to_datastore_msearch_header_and_body
  @to_datastore_msearch_header_and_body ||= [to_datastore_msearch_header, to_datastore_body]
end
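For context on why the header and body are paired, the multi-search API takes newline-delimited JSON in which each search contributes a header line followed by a body line. The sketch below serializes hypothetical tuples in that shape using Ruby's standard `json` library. The tuple contents are made up, and how ElasticGraph's datastore client actually performs the serialization is not shown in this page.

```ruby
require "json"

# Hypothetical header/body tuples shaped like the ones this method returns.
# The second header has no `routing` key, as happens when nil routing is compacted away.
tuples = [
  [{index: "widgets", routing: "a,b"}, {size: 10, query: {match_all: {}}}],
  [{index: "parts"}, {size: 0, track_total_hits: true}]
]

# One JSON document per line: header, body, header, body, then a trailing newline.
ndjson = tuples.flatten(1).map { |line| JSON.generate(line) }.join("\n") + "\n"

first_header = JSON.parse(ndjson.lines.first)
```

Keeping each header adjacent to its body in a two-element array makes this flattening step trivial and keeps the pairing from drifting out of order.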