Class: Gcloud::Search::Index
- Inherits: Object
- Inheritance chain: Object, Gcloud::Search::Index
- Defined in: lib/gcloud/search/index.rb, lib/gcloud/search/index/list.rb
Overview
Index
An index manages Document instances for retrieval. Indexes cannot be created, updated, or deleted directly on the server: They are derived from the documents that reference them. You can manage groups of documents by putting them into separate indexes.
With an index, you can retrieve documents with #find and #documents; manage them with #document, #save, and #remove; and perform searches over their fields with #search.
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
results = index.search "dark stormy"
results.each do |result|
  puts result.doc_id
end
For more information, see Documents and Indexes.
Defined Under Namespace
Classes: List
Instance Attribute Summary
-
#connection ⇒ Object
The Connection object.
-
#raw ⇒ Object
The raw data object.
Class Method Summary
-
.from_raw(raw, conn) ⇒ Object
New Index from a raw data object.
Instance Method Summary
-
#atom_fields ⇒ Object
The names of fields in which ATOM values are stored.
-
#datetime_fields ⇒ Object
The names of fields in which DATE values are stored.
-
#delete(force: false) ⇒ Object
Permanently deletes the index by deleting its documents.
-
#document(doc_id = nil, rank = nil) ⇒ Object
Helper for creating a new Document instance.
-
#documents(token: nil, max: nil, view: nil) ⇒ Object
Retrieves the list of documents belonging to the index.
-
#field_names ⇒ Object
The names of all the fields that are stored on the index.
-
#field_types_for(name) ⇒ Object
The field value types that are stored on the field name.
-
#find(doc_id) ⇒ Object
(also: #get)
Retrieves an existing document by id.
-
#geo_fields ⇒ Object
The names of fields in which GEO values are stored.
-
#html_fields ⇒ Object
The names of fields in which HTML values are stored.
-
#index_id ⇒ Object
The index identifier.
-
#initialize ⇒ Index
constructor
Creates a new Index instance.
-
#number_fields ⇒ Object
The names of fields in which NUMBER values are stored.
-
#remove(doc_id) ⇒ Object
Permanently deletes the document from the index.
-
#save(document) ⇒ Object
Saves a new or existing document to the index.
-
#search(query, expressions: nil, matched_count_accuracy: nil, offset: nil, order: nil, fields: nil, scorer: nil, scorer_size: nil, token: nil, max: nil) ⇒ Object
Runs a search against the documents in the index using the provided query.
-
#text_fields ⇒ Object
The names of fields in which TEXT values are stored.
Constructor Details
#initialize ⇒ Index
Creates a new Index instance.
# File 'lib/gcloud/search/index.rb', line 60
def initialize #:nodoc:
  @connection = nil
  @raw = nil
end
Instance Attribute Details
#connection ⇒ Object
The Connection object.
# File 'lib/gcloud/search/index.rb', line 51
def connection
  @connection
end
#raw ⇒ Object
The raw data object.
# File 'lib/gcloud/search/index.rb', line 55
def raw
  @raw
end
Class Method Details
.from_raw(raw, conn) ⇒ Object
New Index from a raw data object.
# File 'lib/gcloud/search/index.rb', line 421
def self.from_raw raw, conn #:nodoc:
  new.tap do |f|
    f.raw = raw
    f.connection = conn
  end
end
Instance Method Details
#atom_fields ⇒ Object
The names of fields in which ATOM values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 95
def atom_fields
  return @raw["indexedField"]["atomFields"] if @raw["indexedField"]
  []
end
#datetime_fields ⇒ Object
The names of fields in which DATE values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 103
def datetime_fields
  return @raw["indexedField"]["dateFields"] if @raw["indexedField"]
  []
end
#delete(force: false) ⇒ Object
Permanently deletes the index by deleting its documents. (Indexes cannot be created, updated, or deleted directly on the server: They are derived from the documents that reference them.)
Parameters
force-
If true, ensures the deletion of the index by first deleting all documents. If false and the index contains documents, the request will fail. Default is false. (Boolean)
Examples
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
index.delete
An index containing documents can be forcefully deleted with the force option:
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
index.delete force: true
# File 'lib/gcloud/search/index.rb', line 401
def delete force: false
  ensure_connection!
  docs_to_be_removed = documents view: "ID_ONLY"
  return if docs_to_be_removed.empty?
  unless force
    fail "Unable to delete because documents exist. Use force option."
  end
  while docs_to_be_removed
    docs_to_be_removed.each { |d| remove d }
    if docs_to_be_removed.next?
      docs_to_be_removed = documents token: docs_to_be_removed.token,
                                     view: "ID_ONLY"
    else
      docs_to_be_removed = nil
    end
  end
end
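The force logic above can be tried without a network connection. The following is a minimal sketch, assuming an in-memory stand-in for the service: Page and FakeIndex are hypothetical names invented for illustration and are not part of gcloud. It shows the same three outcomes: an empty index returns immediately, a non-empty index fails without force, and with force every page of documents is removed in turn.

```ruby
# Hypothetical in-memory stand-ins; none of these names come from gcloud.
Page = Struct.new(:ids, :token) do
  # True when a further page can be requested with the token.
  def next?
    !token.nil?
  end
end

class FakeIndex
  attr_reader :removed

  def initialize ids, page_size
    @ids = ids.dup
    @page_size = page_size
    @removed = []
  end

  # Returns one page of ids, plus a token when more ids remain.
  def documents token: nil
    start = token.to_i
    slice = @ids[start, @page_size] || []
    more = start + @page_size < @ids.size
    Page.new(slice, more ? (start + @page_size).to_s : nil)
  end

  # Records the removal instead of calling a service.
  def remove id
    @removed << id
  end

  # Same control flow as the listing above: check, then walk the pages.
  def delete force: false
    page = documents
    return if page.ids.empty?
    fail "Unable to delete because documents exist. Use force option." unless force
    while page
      page.ids.each { |id| remove id }
      page = page.next? ? documents(token: page.token) : nil
    end
  end
end
```

Calling FakeIndex.new(%w[a b c], 2).delete raises a RuntimeError, while the same call with force: true records all three ids as removed.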
#document(doc_id = nil, rank = nil) ⇒ Object
Helper for creating a new Document instance. The returned instance is local: It is either not yet saved to the service (see #save), or if it has been given the id of an existing document, it is not yet populated with the document’s data (see #find).
Parameters
doc_id-
The unique identifier of the new document. This is optional. When the document is saved, this value must contain only visible, printable ASCII characters (ASCII codes 33 through 126 inclusive) and be no longer than 500 characters. It cannot begin with an exclamation point (!), and it cannot begin and end with double underscores (__). (String)
rank-
The rank of the new document. This is optional. A positive integer which determines the default ordering of documents returned from a search. It is a bad idea to assign the same rank to many documents, and the same rank should never be assigned to more than 10,000 documents. By default (when it is not specified or set to 0), it is set at the time the document is saved to the number of seconds since January 1, 2011. The rank can be used in the expressions, order, and fields options in #search, where it should be referenced as rank. (Integer)
Returns
Gcloud::Search::Document
Example
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
document = index.document "product-sku-000001"
document.doc_id #=> "product-sku-000001"
document.rank #=> nil
To check if an index already contains a document with the same id, pass the instance to #find:
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
document = index.document "product-sku-000001"
document = index.find document # returns nil if not present
# File 'lib/gcloud/search/index.rb', line 234
def document doc_id = nil, rank = nil
  Document.new.tap do |d|
    d.doc_id = doc_id
    d.rank = rank
  end
end
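Because #document only builds a local instance, an invalid doc_id is not rejected until the document is saved. The sketch below is a client-side check of the doc_id rules listed under Parameters. The helper name valid_doc_id? is hypothetical and is not part of gcloud; the service remains the authority on what it accepts.

```ruby
# Hypothetical helper; gcloud does not expose a method with this name.
def valid_doc_id? doc_id
  return false unless doc_id.is_a?(String)
  # No longer than 500 characters.
  return false if doc_id.length > 500
  # Only visible, printable ASCII (codes 33 through 126 inclusive).
  return false unless doc_id.bytes.all? { |b| b.between?(33, 126) }
  # Cannot begin with an exclamation point.
  return false if doc_id.start_with?("!")
  # Cannot both begin and end with double underscores.
  return false if doc_id.start_with?("__") && doc_id.end_with?("__")
  true
end
```

For example, valid_doc_id?("product-sku-000001") is true, while "!sku", "__sku__", and "product sku" (which contains a space, ASCII 32) are all rejected.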
#documents(token: nil, max: nil, view: nil) ⇒ Object
Retrieves the list of documents belonging to the index.
Parameters
token-
A previously-returned page token representing part of the larger set of results to view. (String)
max-
Maximum number of documents to return. The default is 100. (Integer)
Returns
Array of Gcloud::Search::Document (See Gcloud::Search::Document::List)
Examples
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
documents = index.documents
documents.each do |document|
  puts document.doc_id
end
If you have a significant number of documents, you may need to paginate through them: (See Gcloud::Search::Document::List)
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
documents = index.documents
loop do
  documents.each do |document|
    puts document.doc_id
  end
  break unless documents.next?
  documents = documents.next
end
# File 'lib/gcloud/search/index.rb', line 288
def documents token: nil, max: nil, view: nil
  ensure_connection!
  options = { token: token, max: max, view: view }
  resp = connection.list_docs index_id, options
  return Document::List.from_response(resp, self) if resp.success?
  fail ApiError.from_response(resp)
end
#field_names ⇒ Object
The names of all the fields that are stored on the index.
# File 'lib/gcloud/search/index.rb', line 126
def field_names
  (text_fields + html_fields + atom_fields + datetime_fields +
    number_fields + geo_fields).uniq
end
#field_types_for(name) ⇒ Object
The field value types that are stored on the field name.
# File 'lib/gcloud/search/index.rb', line 133
def field_types_for name
  {
    text: text_fields.include?(name),
    html: html_fields.include?(name),
    atom: atom_fields.include?(name),
    datetime: datetime_fields.include?(name),
    number: number_fields.include?(name),
    geo: geo_fields.include?(name)
  }.delete_if { |_k, v| !v }.keys
end
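To see what #field_types_for returns, it helps to run the same lookup against a plain hash shaped like the raw data the field accessors read. This is a sketch only: the RAW payload, its field names, and the helper fields_of are invented for illustration. Unlike the accessors above, the helper also tolerates a missing field category, on the assumption that the service may omit keys for value types that no document uses.

```ruby
# A raw index payload shaped like the hash the accessors above read from.
# The index id and field names are invented for illustration.
RAW = {
  "indexId" => "products",
  "indexedField" => {
    "textFields" => ["description", "name"],
    "htmlFields" => ["description"],
    "numberFields" => ["price"]
  }
}.freeze

# Returns the field names stored under one category, or [] when absent.
def fields_of raw, key
  (raw["indexedField"] || {})[key] || []
end

# Returns the value types (as symbols) stored under the given field name.
def field_types_for raw, name
  {
    text: "textFields", html: "htmlFields", atom: "atomFields",
    datetime: "dateFields", number: "numberFields", geo: "geoFields"
  }.select { |_type, key| fields_of(raw, key).include?(name) }.keys
end
```

With this payload, field_types_for(RAW, "description") returns [:text, :html], because a single field name can hold values of more than one type.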
#find(doc_id) ⇒ Object Also known as: get
Retrieves an existing document by id.
Parameters
doc_id-
The id of a document or a Document instance. (String or Document)
Returns
Gcloud::Search::Document or nil if the document does not exist
Example
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
document = index.find "product-sku-000001"
puts document.doc_id
# File 'lib/gcloud/search/index.rb', line 167
def find doc_id
  # Get the id if passed a Document object
  doc_id = doc_id.doc_id if doc_id.respond_to? :doc_id
  ensure_connection!
  resp = connection.get_doc index_id, doc_id
  return Document.from_hash(JSON.parse(resp.body)) if resp.success?
  return nil if resp.status == 404
  fail ApiError.from_response(resp)
rescue JSON::ParserError
  raise ApiError.from_response(resp)
end
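The listing above distinguishes three outcomes: a successful response is parsed, a 404 becomes nil, and anything else (including an unparseable body) becomes an error. The sketch below reproduces that branching in isolation. FakeResponse and document_hash_from are hypothetical names, and plain RuntimeError stands in for ApiError.

```ruby
require "json"

# Hypothetical stand-in for an HTTP response; not a gcloud class.
FakeResponse = Struct.new(:status, :body) do
  def success?
    (200..299).cover?(status)
  end
end

# Maps a response to a parsed hash, nil for 404, or raises otherwise.
def document_hash_from resp
  return JSON.parse(resp.body) if resp.success?
  return nil if resp.status == 404
  fail "API error (status #{resp.status})"
rescue JSON::ParserError
  # A success status with a malformed body is still reported as an error.
  raise "API error (unparseable body)"
end
```

Callers of #find therefore only need a nil check for the "not found" case; every other failure surfaces as an exception.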
#geo_fields ⇒ Object
The names of fields in which GEO values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 119
def geo_fields
  return @raw["indexedField"]["geoFields"] if @raw["indexedField"]
  []
end
#html_fields ⇒ Object
The names of fields in which HTML values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 87
def html_fields
  return @raw["indexedField"]["htmlFields"] if @raw["indexedField"]
  []
end
#index_id ⇒ Object
The index identifier. May be defined by the server or by the client. Must be unique within the project. It cannot be an empty string. It must contain only visible, printable ASCII characters (ASCII codes 33 through 126 inclusive) and be no longer than 100 characters. It cannot begin with an exclamation point (!), and it cannot begin and end with double underscores (__).
# File 'lib/gcloud/search/index.rb', line 72
def index_id
  @raw["indexId"]
end
#number_fields ⇒ Object
The names of fields in which NUMBER values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 111
def number_fields
  return @raw["indexedField"]["numberFields"] if @raw["indexedField"]
  []
end
#remove(doc_id) ⇒ Object
Permanently deletes the document from the index.
# File 'lib/gcloud/search/index.rb', line 362
def remove doc_id
  # Get the id if passed a Document object
  doc_id = doc_id.doc_id if doc_id.respond_to? :doc_id
  ensure_connection!
  resp = connection.delete_doc index_id, doc_id
  return true if resp.success?
  fail ApiError.from_response(resp)
end
#save(document) ⇒ Object
Saves a new or existing document to the index. If the document instance is new and has been given an id (see #document), it will replace an existing document in the index that has the same unique id.
Parameters
document-
A Document instance, either new (see #document) or existing (see #find).
Returns
Gcloud::Search::Document
Example
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
document = index.document
document.doc_id #=> nil
document.rank #=> nil
document = index.save document
document.doc_id #=> "-2486020449015432113"
document.rank #=> 154223228
# File 'lib/gcloud/search/index.rb', line 327
def save document
  ensure_connection!
  resp = connection.create_doc index_id, document.to_hash
  if resp.success?
    raw = document.instance_variable_get "@raw"
    raw.merge! JSON.parse(resp.body)
    return document
  end
  fail ApiError.from_response(resp)
rescue JSON::ParserError
  raise ApiError.from_response(resp)
end
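The rank filled in on save follows the rule given in the #document parameter description: when no rank is set, it defaults to the number of seconds since January 1, 2011. The sketch below shows that arithmetic. The names RANK_EPOCH and default_rank_at are hypothetical, and the use of UTC for the epoch is an assumption; the real value is assigned by the service, not by the client.

```ruby
# Epoch named in the rank description; the constant name is hypothetical
# and the UTC time zone is an assumption.
RANK_EPOCH = Time.utc(2011, 1, 1)

# Seconds elapsed between the rank epoch and the given moment.
def default_rank_at time
  time.to_i - RANK_EPOCH.to_i
end
```

For instance, default_rank_at(Time.utc(2011, 1, 2)) is 86400, one day's worth of seconds. Because later saves produce larger default ranks and results sort by descending rank, more recently saved documents appear first by default.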
#search(query, expressions: nil, matched_count_accuracy: nil, offset: nil, order: nil, fields: nil, scorer: nil, scorer_size: nil, token: nil, max: nil) ⇒ Object
Runs a search against the documents in the index using the provided query. For more information see the REST API documentation for indexes.search.
Parameters
query-
The query string in search query syntax. If the query is nil or empty, all documents are returned. For more information see Query Strings. (String)
expressions-
Customized expressions used in order or fields. The expression can contain fields in Document, the built-in fields (rank, the document rank, and score if scoring is enabled) and fields defined in expressions. All field expressions are expressed as a Hash with the keys as the name and the values as the expression. The expression value can be a combination of supported functions encoded in the string. Expressions involving number fields can use the arithmetical operators (+, -, *, /) and the built-in numeric functions (max, min, pow, count, log, abs). Expressions involving geopoint fields can use the geopoint and distance functions. Expressions for text and html fields can use the snippet function. (Hash)
matched_count_accuracy-
Minimum accuracy requirement for Result::List#matched_count. If specified, matched_count will be accurate to at least that number. For example, when set to 100, any matched_count <= 100 is accurate. This option may add considerable latency/expense. By default (when it is not specified or set to 0), the accuracy is the same as max. (Integer)
offset-
Used to advance pagination to an arbitrary result, independent of the previous results. Offsets are an inefficient alternative to using token. (offset and token cannot both be set.) The default is 0. (Integer)
order-
A comma-separated list of fields for sorting on the search result, including fields from Document, the built-in fields (rank and score), and fields defined in expressions. The default sorting order is ascending. To specify descending order for a field, a suffix " desc" should be appended to the field name. For example: order: "foo desc,bar". The default value for text sort is the empty string, and the default value for numeric sort is 0. If not specified, the search results are automatically sorted by descending rank. Sorting by ascending rank is not allowed. (String)
fields-
The fields to return in the Search::Result objects. These can be fields from Document, the built-in fields rank and score, and fields defined in expressions. The default is to return all fields. (String or Array of String)
scorer-
The scoring function to invoke on a search result for this query. If scorer is not set, scoring is disabled and score is 0 for all documents in the search result. To enable document relevancy score based on term frequency, set scorer to :generic. (String or Symbol)
scorer_size-
Maximum number of top retrieved results to score. It is valid only when scorer is set. The default is 100. (Integer)
token-
A previously-returned page token representing part of the larger set of results to view. (String)
max-
Maximum number of results to return per page. (Integer)
Returns
Array of Gcloud::Search::Result (See Gcloud::Search::Result::List)
Examples
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
results = index.search "dark stormy"
results.each do |result|
  puts result.doc_id
end
If you have a significant number of search results, you may need to paginate through them: (See Gcloud::Search::Result::List)
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
results = index.search "dark stormy"
loop do
  results.each do |result|
    puts result.doc_id
  end
  break unless results.next?
  results = results.next
end
By default, Result objects are sorted by document rank. For more information see the REST API documentation for Document.rank.
You can specify how to sort results with the order option. In the example below, the desc suffix after avg_review means that results will be sorted in ascending order by published and then in descending order by avg_review.
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
results = index.search "dark stormy", order: "published, avg_review desc"
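When sort fields are chosen at run time, the order string can be assembled from a list instead of being written by hand. The helper below is a sketch only: order_clause is a hypothetical name and is not part of gcloud, which accepts the finished string directly.

```ruby
# Hypothetical helper; gcloud accepts the finished order string directly.
# Each argument is either a field name or a [field, direction] pair.
def order_clause *sorts
  sorts.map do |field, direction|
    # Ascending is the default, so only "desc" needs a suffix.
    direction.to_s == "desc" ? "#{field} desc" : field.to_s
  end.join(", ")
end
```

Here order_clause(:published, [:avg_review, :desc]) produces "published, avg_review desc", the same string passed to the order option above.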
You can add computed fields with the expressions option, and limit the fields that are returned with the fields option:
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
results = index.search "cotton T-shirt",
expressions: { total_price: "(price + tax)" },
fields: ["name", "total_price", "highlight"]
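The expressions option is a Hash whose keys are names and whose values are expression strings, and the #search source passes it through a private format_expressions helper that is not shown on this page. The sketch below shows one plausible transformation, under the assumption that the request wants one name/expression pair per entry. The method name expression_pairs and the output shape are illustrative guesses, not the library's confirmed behavior.

```ruby
# Assumed output shape: one { name:, expression: } pair per Hash entry.
# Both this method name and the shape are hypothetical.
def expression_pairs expressions
  return nil if expressions.nil?
  expressions.map do |name, expression|
    { name: name.to_s, expression: expression }
  end
end
```

Under that assumption, expression_pairs(total_price: "(price + tax)") yields a one-element array holding the name "total_price" and its expression string, which can then be referenced by name in fields or order.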
Just as in documents, Result data is accessible via Fields methods:
require "gcloud"
gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
document = index.find "product-sku-000001"
results = index.search "cotton T-shirt"
values = results[0]["description"]
values[0] #=> "100% organic cotton ruby gem T-shirt"
values[0].type #=> :text
values[0].lang #=> "en"
values[1] #=> "<p>100% organic cotton ruby gem T-shirt</p>"
values[1].type #=> :html
values[1].lang #=> "en"
# File 'lib/gcloud/search/index.rb', line 580
def search query, expressions: nil, matched_count_accuracy: nil,
           offset: nil, order: nil, fields: nil, scorer: nil,
           scorer_size: nil, token: nil, max: nil
  ensure_connection!
  options = { expressions: format_expressions(expressions),
              matched_count_accuracy: matched_count_accuracy,
              offset: offset, order: order, fields: fields,
              scorer: scorer, scorer_size: scorer_size, token: token,
              max: max }
  resp = connection.search index_id, query, options
  if resp.success?
    Result::List.from_response resp, self, query, options
  else
    fail ApiError.from_response(resp)
  end
end
#text_fields ⇒ Object
The names of fields in which TEXT values are stored. See Index schemas.
# File 'lib/gcloud/search/index.rb', line 79
def text_fields
  return @raw["indexedField"]["textFields"] if @raw["indexedField"]
  []
end