Class: Gcloud::Search::Index

Inherits:
Object
Defined in:
lib/gcloud/search/index.rb,
lib/gcloud/search/index/list.rb

Overview

# Index

An index manages Document instances for retrieval. Indexes cannot be created, updated, or deleted directly on the server: They are derived from the documents that reference them. You can manage groups of documents by putting them into separate indexes.

With an index, you can retrieve documents with #find and #documents; manage them with #document, #save, and #remove; and perform searches over their fields with #search.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"

results = index.search "dark stormy"
results.each do |result|
  puts result.doc_id
end

Defined Under Namespace

Classes: List

Constructor Details

#initialize ⇒ Index

Returns a new instance of Index.



# File 'lib/gcloud/search/index.rb', line 63

def initialize
  @connection = nil
  @raw = nil
end

Instance Attribute Details

#connection ⇒ Object



# File 'lib/gcloud/search/index.rb', line 54

def connection
  @connection
end

#raw ⇒ Object



# File 'lib/gcloud/search/index.rb', line 58

def raw
  @raw
end

Class Method Details

.from_raw(raw, conn) ⇒ Object



# File 'lib/gcloud/search/index.rb', line 392

def self.from_raw raw, conn
  new.tap do |f|
    f.raw = raw
    f.connection = conn
  end
end

Instance Method Details

#atom_fields ⇒ Object

The names of fields in which ATOM values are stored.

# File 'lib/gcloud/search/index.rb', line 101

def atom_fields
  return @raw["indexedField"]["atomFields"] if @raw["indexedField"]
  []
end

#datetime_fields ⇒ Object

The names of fields in which DATE values are stored.

# File 'lib/gcloud/search/index.rb', line 110

def datetime_fields
  return @raw["indexedField"]["dateFields"] if @raw["indexedField"]
  []
end

#delete(force: false) ⇒ Object

Permanently deletes the index by deleting its documents. (Indexes cannot be created, updated, or deleted directly on the server: They are derived from the documents that reference them.)

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
index.delete

Deleting an index containing documents with the `force` option:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"
index.delete force: true

Parameters:

  • force (Boolean) (defaults to: false)

    If `true`, ensures the deletion of the index by first deleting all documents. If `false` and the index contains documents, the request will fail. Default is `false`.



# File 'lib/gcloud/search/index.rb', line 372

def delete force: false
  ensure_connection!
  docs_to_be_removed = documents view: "ID_ONLY"
  return if docs_to_be_removed.empty?
  unless force
    fail "Unable to delete because documents exist. Use force option."
  end
  while docs_to_be_removed
    docs_to_be_removed.each { |d| remove d }
    if docs_to_be_removed.next?
      docs_to_be_removed = documents token: docs_to_be_removed.token,
                                     view: "ID_ONLY"
    else
      docs_to_be_removed = nil
    end
  end
end
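
The loop in #delete follows page tokens until no further page remains. A minimal, self-contained sketch of that pattern is shown here. The in-memory `pages` hash and the `fetch_page` lambda are hypothetical stand-ins for the service and are not part of the gem.

```ruby
# Fake backend: pages keyed by page token (nil is the first page). Each page
# holds its ids and the token of the following page, if any. Hypothetical data.
pages = {
  nil  => { ids: ["doc-1", "doc-2"], token: "t1" },
  "t1" => { ids: ["doc-3"],          token: "t2" },
  "t2" => { ids: ["doc-4"],          token: nil }
}
fetch_page = ->(token) { pages.fetch token }

removed = []
page = fetch_page.call nil
while page
  page[:ids].each { |id| removed << id } # stands in for `remove id`
  # Follow the page token until the backend reports no further page.
  page = page[:token] ? fetch_page.call(page[:token]) : nil
end

p removed #=> ["doc-1", "doc-2", "doc-3", "doc-4"]
```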

#document(doc_id = nil, rank = nil) ⇒ Gcloud::Search::Document

Helper for creating a new Document instance. The returned instance is local: It is either not yet saved to the service (see #save), or if it has been given the id of an existing document, it is not yet populated with the document’s data (see #find).

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

document = index.document "product-sku-000001"
document.doc_id #=> "product-sku-000001"
document.rank #=> nil

To check if an index already contains a document:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

document = index.document "product-sku-000001"
document = index.find document # returns nil if not present

Parameters:

  • doc_id (String, nil) (defaults to: nil)

    An optional unique ID for the new document. When the document is saved, this value must contain only visible, printable ASCII characters (ASCII codes 33 through 126 inclusive) and be no longer than 500 characters. It cannot begin with an exclamation point (!), and it cannot begin and end with double underscores (__).

  • rank (Integer, nil) (defaults to: nil)

    An optional rank for the new document: an integer that determines the default ordering of documents returned from a search. Avoid assigning the same rank to many documents, and never assign the same rank to more than 10,000 documents. By default (when it is not specified or set to 0), it is set at the time the document is saved to the number of seconds since January 1, 2011. The rank can be used in the `expressions`, `order`, and `fields` options in #search, where it should be referenced as `rank`.

Returns:

  • (Gcloud::Search::Document)



# File 'lib/gcloud/search/index.rb', line 230

def document doc_id = nil, rank = nil
  Document.new.tap do |d|
    d.doc_id = doc_id
    d.rank = rank
  end
end
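
The default rank described above (seconds since January 1, 2011) can be worked out by hand. This is only a sketch: the `rank_epoch` and `default_rank` names are hypothetical, and the choice of midnight UTC is an assumption, since the documentation does not state a time zone.

```ruby
# Assumption: the epoch is midnight UTC on January 1, 2011.
rank_epoch = Time.utc 2011, 1, 1

# Hypothetical helper: the default rank for a document saved at `saved_at`.
default_rank = ->(saved_at) { (saved_at - rank_epoch).to_i }

p default_rank.call(Time.utc(2011, 1, 2)) #=> 86400

# Going the other way: the approximate save time implied by a given rank.
saved_at = rank_epoch + 154_223_228
puts saved_at.year
```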

#documents(token: nil, max: nil, view: nil) ⇒ Array<Gcloud::Search::Document>

Retrieves the list of documents belonging to the index.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

documents = index.documents
documents.each do |document|
  puts document.doc_id
end

With pagination: (See Document::List)

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

documents = index.documents
loop do
  documents.each do |document|
    puts document.doc_id
  end
  break unless documents.next?
  documents = documents.next
end

Parameters:

  • token (String) (defaults to: nil)

    A previously-returned page token representing part of the larger set of results to view.

  • max (Integer) (defaults to: nil)

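    Maximum number of documents to return. The default is `100`.

  • view (String) (defaults to: nil)

    Which parts of each document to return. #delete passes `"ID_ONLY"` so that only document ids are retrieved.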

Returns:

  • (Array<Gcloud::Search::Document>)

    See Document::List



# File 'lib/gcloud/search/index.rb', line 275

def documents token: nil, max: nil, view: nil
  ensure_connection!
  options = { token: token, max: max, view: view }
  resp = connection.list_docs index_id, options
  return Document::List.from_response(resp, self) if resp.success?
  fail ApiError.from_response(resp)
end

#field_names ⇒ Object

The names of all the fields that are stored on the index.



# File 'lib/gcloud/search/index.rb', line 135

def field_names
  (text_fields + html_fields + atom_fields + datetime_fields +
    number_fields + geo_fields).uniq
end
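
As a rough illustration of how the per-type accessors and #field_names relate to the raw index resource, the sketch below works on a plain Hash. The sample `raw` data and the `fields_of` lambda are hypothetical. Unlike the accessors above, the sketch also falls back to an empty array when a single type key is absent.

```ruby
# Hypothetical raw index resource, shaped like the hash the accessors read.
raw = {
  "indexId" => "books",
  "indexedField" => {
    "textFields" => ["title", "description"],
    "atomFields" => ["isbn", "title"],
    "dateFields" => ["published"]
  }
}

# Stand-in for the per-type accessors; returns [] when nothing is indexed.
fields_of = lambda do |key|
  indexed = raw["indexedField"]
  indexed ? (indexed[key] || []) : []
end

# Same idea as #field_names: concatenate every type's names, then de-duplicate.
keys = %w(textFields htmlFields atomFields dateFields numberFields geoFields)
field_names = keys.flat_map { |key| fields_of.call key }.uniq

p field_names #=> ["title", "description", "isbn", "published"]
```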

#field_types_for(name) ⇒ Object

The value types stored under the given field name, returned as an array of symbols such as `:text` or `:atom`.



# File 'lib/gcloud/search/index.rb', line 142

def field_types_for name
  {
    text: text_fields.include?(name),
    html: html_fields.include?(name),
    atom: atom_fields.include?(name),
    datetime: datetime_fields.include?(name),
    number: number_fields.include?(name),
    geo: geo_fields.include?(name)
  }.delete_if { |_k, v| !v }.keys
end
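
The filtering in #field_types_for can be tried in isolation. The sketch below uses a hypothetical `fields_by_type` hash in place of the accessors, and a `types_for` lambda that is not part of the gem.

```ruby
# Hypothetical per-type field lists, as the accessors might return them.
fields_by_type = {
  text:     ["title", "description"],
  html:     ["description"],
  atom:     ["isbn"],
  datetime: ["published"],
  number:   [],
  geo:      []
}

# Keep only the types whose list includes the name, then return the type keys.
types_for = lambda do |name|
  fields_by_type.select { |_type, names| names.include? name }.keys
end

p types_for.call("description") #=> [:text, :html]
p types_for.call("missing")     #=> []
```

A field name can map to more than one type because the same name may hold values of several types.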

#find(doc_id) ⇒ Gcloud::Search::Document? Also known as: get

Retrieves an existing document by id.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

document = index.find "product-sku-000001"
puts document.doc_id

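Parameters:

  • doc_id (String, Gcloud::Search::Document)

    The id of the document, or a Document instance whose id is used.

Returns:

  • (Gcloud::Search::Document, nil)

    The document, or `nil` if the index does not contain it.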



# File 'lib/gcloud/search/index.rb', line 171

def find doc_id
  # Get the id if passed a Document object
  doc_id = doc_id.doc_id if doc_id.respond_to? :doc_id
  ensure_connection!
  resp = connection.get_doc index_id, doc_id
  return Document.from_hash(JSON.parse(resp.body)) if resp.success?
  return nil if resp.status == 404
  fail ApiError.from_response(resp)
rescue JSON::ParserError
  raise ApiError.from_response(resp)
end
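
Both #find and #remove accept either an id string or a Document, using `respond_to?` to tell them apart. A small sketch of that normalization follows. The anonymous Struct stands in for a document and `to_doc_id` is a hypothetical name.

```ruby
# Hypothetical stand-in for a document; only #doc_id matters here.
doc = Struct.new(:doc_id).new "product-sku-000001"

# Same normalization as the first step of #find and #remove.
to_doc_id = lambda do |id_or_doc|
  id_or_doc.respond_to?(:doc_id) ? id_or_doc.doc_id : id_or_doc
end

p to_doc_id.call("product-sku-000001") #=> "product-sku-000001"
p to_doc_id.call(doc)                  #=> "product-sku-000001"
```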

#geo_fields ⇒ Object

The names of fields in which GEO values are stored.

# File 'lib/gcloud/search/index.rb', line 128

def geo_fields
  return @raw["indexedField"]["geoFields"] if @raw["indexedField"]
  []
end

#html_fields ⇒ Object

The names of fields in which HTML values are stored.

# File 'lib/gcloud/search/index.rb', line 92

def html_fields
  return @raw["indexedField"]["htmlFields"] if @raw["indexedField"]
  []
end

#index_id ⇒ Object

The index identifier. May be defined by the server or by the client. Must be unique within the project. It cannot be an empty string. It must contain only visible, printable ASCII characters (ASCII codes 33 through 126 inclusive) and be no longer than 100 characters. It cannot begin with an exclamation point (!), and it cannot begin and end with double underscores (__).



# File 'lib/gcloud/search/index.rb', line 75

def index_id
  @raw["indexId"]
end
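
The identifier rules above can be checked on the client before making a request. The `valid_index_id` lambda below is a hypothetical helper, not something the gem provides, and the service remains the authority on what it accepts.

```ruby
# Hypothetical client-side check of the documented index id rules.
valid_index_id = lambda do |id|
  return false unless id.is_a?(String) && !id.empty?          # not empty
  return false if id.length > 100                             # at most 100 characters
  return false unless id.bytes.all? { |b| b.between?(33, 126) } # visible ASCII only
  return false if id.start_with?("!")                         # no leading "!"
  return false if id.start_with?("__") && id.end_with?("__")  # not __wrapped__
  true
end

p valid_index_id.call("books")     #=> true
p valid_index_id.call("my index")  #=> false (space is ASCII code 32)
p valid_index_id.call("__books__") #=> false
```

The same shape of check would apply to document ids, with the limit raised to 500 characters.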

#number_fields ⇒ Object

The names of fields in which NUMBER values are stored.

# File 'lib/gcloud/search/index.rb', line 119

def number_fields
  return @raw["indexedField"]["numberFields"] if @raw["indexedField"]
  []
end

#remove(doc_id) ⇒ Boolean

Permanently deletes the document from the index.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

index.remove "product-sku-000001"

Parameters:

  • doc_id (String)

    The id of the document.

Returns:

  • (Boolean)

    `true` if successful



# File 'lib/gcloud/search/index.rb', line 338

def remove doc_id
  # Get the id if passed a Document object
  doc_id = doc_id.doc_id if doc_id.respond_to? :doc_id
  ensure_connection!
  resp = connection.delete_doc index_id, doc_id
  return true if resp.success?
  fail ApiError.from_response(resp)
end

#save(document) ⇒ Gcloud::Search::Document

Saves a new or existing document to the index. If the document instance is new and has been given an id (see #document), it will replace an existing document in the index that has the same unique id.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

document = index.document # no id given
document.doc_id #=> nil
document.rank #=> nil

document = index.save document
document.doc_id #=> "-2486020449015432113"
document.rank #=> 154223228

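Parameters:

  • document (Gcloud::Search::Document)

    The document to save.

Returns:

  • (Gcloud::Search::Document)

    The same document instance, updated with the values returned by the service.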



# File 'lib/gcloud/search/index.rb', line 310

def save document
  ensure_connection!
  resp = connection.create_doc index_id, document.to_hash
  if resp.success?
    raw = document.instance_variable_get "@raw"
    raw.merge! JSON.parse(resp.body)
    return document
  end
  fail ApiError.from_response(resp)
rescue JSON::ParserError
  raise ApiError.from_response(resp)
end

#search(query, expressions: nil, matched_count_accuracy: nil, offset: nil, order: nil, fields: nil, scorer: nil, scorer_size: nil, token: nil, max: nil) ⇒ Array<Gcloud::Search::Result>

Runs a search against the documents in the index using the provided query.

By default, Result objects are sorted by document rank. For more information see the [REST API documentation for Document.rank](cloud.google.com/search/reference/rest/v1/projects/indexes/documents#resource_representation.google.cloudsearch.v1.Document.rank).

You can specify how to sort results with the `order` option. In the example below, the ` desc` suffix after `avg_review` means that results will be sorted in ascending order by `published` and then in descending order by `avg_review`. You can add computed fields with the `expressions` option, and limit the fields that are returned with the `fields` option.

Examples:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"

results = index.search "dark stormy"
results.each do |result|
  puts result.doc_id
end

With pagination: (See Result::List)

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"

results = index.search "dark stormy"
loop do
  results.each do |result|
    puts result.doc_id
  end
  break unless results.next?
  results = results.next
end

With the `order` option:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "books"

results = index.search "dark stormy", order: "published, avg_review desc"

With the `expressions` and `fields` options:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"

results = index.search "cotton T-shirt",
                       expressions: { total_price: "(price + tax)" },
                       fields: ["name", "total_price", "highlight"]

Just as in documents, data is accessible via Fields methods:

require "gcloud"

gcloud = Gcloud.new
search = gcloud.search
index = search.index "products"
results = index.search "cotton T-shirt"
values = results[0]["description"]

values[0] #=> "100% organic cotton ruby gem T-shirt"
values[0].type #=> :text
values[0].lang #=> "en"
values[1] #=> "<p>100% organic cotton ruby gem T-shirt</p>"
values[1].type #=> :html
values[1].lang #=> "en"

Parameters:

  • query (String)

    The query string in search query syntax. If the query is `nil` or empty, all documents are returned. For more information see [Query Strings](cloud.google.com/search/query).

  • expressions (Hash) (defaults to: nil)

    Customized expressions used in `order` or `fields`. The expression can contain fields in Document, the built-in fields (`rank`, the document rank, and `score`, if scoring is enabled), and fields defined in `expressions`. All field expressions are expressed as a `Hash` with the keys as the `name` and the values as the `expression`. The expression value can be a combination of supported functions encoded in the string. Expressions involving number fields can use the arithmetical operators (+, -, *, /) and the built-in numeric functions (`max`, `min`, `pow`, `count`, `log`, `abs`). Expressions involving geopoint fields can use the `geopoint` and `distance` functions. Expressions for text and html fields can use the `snippet` function.

  • matched_count_accuracy (Integer) (defaults to: nil)

    Minimum accuracy requirement for Result::List#matched_count. If specified, `matched_count` will be accurate to at least that number. For example, when set to 100, any `matched_count <= 100` is accurate. This option may add considerable latency/expense. By default (when it is not specified or set to 0), the accuracy is the same as `max`.

  • offset (Integer) (defaults to: nil)

    Used to advance pagination to an arbitrary result, independent of the previous results. Offsets are an inefficient alternative to using `token`. (The two cannot both be set.) The default is 0.

  • order (String) (defaults to: nil)

    A comma-separated list of fields for sorting on the search result, including fields from Document, the built-in fields (`rank` and `score`), and fields defined in expressions. The default sorting order is ascending. To specify descending order for a field, append the suffix ` desc` to the field name, for example `order: "foo desc,bar"`. The default value for text sort is the empty string, and the default value for numeric sort is 0. If not specified, the search results are automatically sorted by descending `rank`. Sorting by ascending `rank` is not allowed.

  • fields (String, Array<String>) (defaults to: nil)

    The fields to return in the Result objects. These can be fields from Document, the built-in fields `rank` and `score`, and fields defined in expressions. The default is to return all fields.

  • scorer (String, Symbol) (defaults to: nil)

    The scoring function to invoke on a search result for this query. If scorer is not set, scoring is disabled and `score` is 0 for all documents in the search result. To enable document relevancy score based on term frequency, set `scorer` to `:generic`.

  • scorer_size (Integer) (defaults to: nil)

    Maximum number of top retrieved results to score. It is valid only when `scorer` is set. The default is 100.

  • token (String) (defaults to: nil)

    A previously-returned page token representing part of the larger set of results to view.

  • max (Integer) (defaults to: nil)

    Maximum number of results to return per page.

Returns:

  • (Array<Gcloud::Search::Result>)

    See Result::List

# File 'lib/gcloud/search/index.rb', line 539

def search query, expressions: nil, matched_count_accuracy: nil,
           offset: nil, order: nil, fields: nil, scorer: nil,
           scorer_size: nil, token: nil, max: nil
  ensure_connection!
  options = { expressions: format_expressions(expressions),
              matched_count_accuracy: matched_count_accuracy,
              offset: offset, order: order, fields: fields,
              scorer: scorer, scorer_size: scorer_size, token: token,
              max: max }
  resp = connection.search index_id, query, options
  if resp.success?
    Result::List.from_response resp, self, query, options
  else
    fail ApiError.from_response(resp)
  end
end
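
The `order` option is a plain string, so it can be assembled from structured data when the sort fields are chosen at runtime. The `build_order` lambda below is a hypothetical helper, not part of the gem. It only illustrates the documented format of comma-separated fields with a ` desc` suffix for descending order.

```ruby
# Hypothetical helper: builds an `order` string from [field, direction] pairs.
build_order = lambda do |pairs|
  pairs.map do |field, direction|
    # Ascending is the default, so only descending fields get a suffix.
    direction == :desc ? "#{field} desc" : field.to_s
  end.join(", ")
end

order = build_order.call [[:published, :asc], [:avg_review, :desc]]
p order #=> "published, avg_review desc"
```

The resulting string could then be passed along as `index.search "dark stormy", order: order`.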

#text_fields ⇒ Object

The names of fields in which TEXT values are stored.

# File 'lib/gcloud/search/index.rb', line 83

def text_fields
  return @raw["indexedField"]["textFields"] if @raw["indexedField"]
  []
end