Class: Google::Cloud::Bigquery::Data

Inherits:
Array
  • Object
show all
Defined in:
lib/google/cloud/bigquery/data.rb

Overview

Data

Represents a page of results (rows) as an array of hashes. Because Data delegates to Array, methods such as Array#count represent the number of rows in the page. In addition, methods of this class include result set metadata such as total and provide access to the schema of the query or table. See Project#query, Google::Cloud::Bigquery::Dataset#query and Table#data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word FROM `bigquery-public-data.samples.shakespeare`"
job = bigquery.query_job sql

job.wait_until_done!
data = job.data

data.count # 100000
data.total # 164656

# Iterate over the first page of results
data.each do |row|
  puts row[:word]
end
# Retrieve the next page of results
data = data.next if data.next?

Instance Method Summary collapse

Instance Method Details

#all(request_limit: nil) {|row| ... } ⇒ Enumerator

Retrieves all rows by repeatedly loading #next until #next? returns false. Calls the given block once for each row, which is passed as the parameter.

An enumerator is returned if no block is given.

This method may make several API calls until all rows are retrieved. Be sure to use as narrow a search criteria as possible. Please use with caution.

Examples:

Iterating each rows by passing a block:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.data.all do |row|
  puts row[:word]
end

Using the enumerator by not passing a block:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

words = table.data.all.map do |row|
  row[:word]
end

Limit the number of API calls made:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.data.all(request_limit: 10) do |row|
  puts row[:word]
end

Parameters:

  • request_limit (Integer) (defaults to: nil)

    The upper limit of API requests to make to load all data. Default is no limit.

Yields:

  • (row)

    The block for accessing each row of data.

Yield Parameters:

  • row (Hash)

    The row object.

Returns:

  • (Enumerator)

    An enumerator providing access to all of the data.



533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
# File 'lib/google/cloud/bigquery/data.rb', line 533

def all request_limit: nil, &block
  request_limit = request_limit.to_i if request_limit

  return enum_for :all, request_limit: request_limit unless block_given?

  results = self
  loop do
    results.each(&block)
    if request_limit
      request_limit -= 1
      break if request_limit.negative?
    end
    break unless results.next?
    results = results.next
  end
end

#ddl?Boolean

Whether the query that created this data was a DDL statement.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
data = bigquery.query "CREATE TABLE my_table (x INT64)"

data.statement_type #=> "CREATE_TABLE"
data.ddl? #=> true

Returns:

  • (Boolean)

See Also:



280
281
282
283
284
285
286
287
288
289
290
291
# File 'lib/google/cloud/bigquery/data.rb', line 280

def ddl?
  [
    "ALTER_TABLE",
    "CREATE_MODEL",
    "CREATE_TABLE",
    "CREATE_TABLE_AS_SELECT",
    "CREATE_VIEW",
    "DROP_MODEL",
    "DROP_TABLE",
    "DROP_VIEW"
  ].include? statement_type
end

#ddl_operation_performedString?

The DDL operation performed, possibly dependent on the pre-existence of the DDL target. (See #ddl_target_table.) Possible values (new values might be added in the future):

  • "CREATE": The query created the DDL target.
  • "SKIP": No-op. Example cases: the query is CREATE TABLE IF NOT EXISTS while the table already exists, or the query is DROP TABLE IF EXISTS while the table does not exist.
  • "REPLACE": The query replaced the DDL target. Example case: the query is CREATE OR REPLACE TABLE, and the table already exists.
  • "DROP": The query deleted the DDL target.

Returns:

  • (String, nil)

    The DDL operation performed.



336
337
338
# File 'lib/google/cloud/bigquery/data.rb', line 336

def ddl_operation_performed
  job_gapi&.statistics&.query&.ddl_operation_performed
end

#ddl_target_routineGoogle::Cloud::Bigquery::Routine?

The DDL target routine, in reference state. (See Routine#reference?.) Present only for CREATE/DROP FUNCTION/PROCEDURE queries. (See #statement_type.)

Returns:



348
349
350
351
352
353
# File 'lib/google/cloud/bigquery/data.rb', line 348

def ddl_target_routine
  ensure_service!
  routine = job_gapi&.statistics&.query&.ddl_target_routine
  return nil if routine.nil?
  Google::Cloud::Bigquery::Routine.new_reference_from_gapi routine, service
end

#ddl_target_tableGoogle::Cloud::Bigquery::Table?

The DDL target table, in reference state. (See Table#reference?.) Present only for CREATE/DROP TABLE/VIEW queries. (See #statement_type.)

Returns:



363
364
365
366
367
368
# File 'lib/google/cloud/bigquery/data.rb', line 363

def ddl_target_table
  ensure_service!
  table = job_gapi&.statistics&.query&.ddl_target_table
  return nil if table.nil?
  Google::Cloud::Bigquery::Table.new_reference_from_gapi table, service
end

#deleted_row_countInteger?

The number of deleted rows. Present only for DML statements DELETE, MERGE and TRUNCATE. (See #statement_type.)

Returns:

  • (Integer, nil)

    The number of deleted rows, or nil if not applicable.



388
389
390
# File 'lib/google/cloud/bigquery/data.rb', line 388

def deleted_row_count
  job_gapi&.statistics&.query&.dml_stats&.deleted_row_count
end

#dml?Boolean

Whether the query that created this data was a DML statement.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
data = bigquery.query "UPDATE my_table " \
                      "SET x = x + 1 " \
                      "WHERE x IS NOT NULL"

data.statement_type #=> "UPDATE"
data.dml? #=> true

Returns:

  • (Boolean)

See Also:



312
313
314
315
316
317
318
319
# File 'lib/google/cloud/bigquery/data.rb', line 312

def dml?
  [
    "INSERT",
    "UPDATE",
    "MERGE",
    "DELETE"
  ].include? statement_type
end

#etagString

An ETag hash for the page of results represented by the data instance.

Returns:

  • (String)

    The ETag hash.



91
92
93
# File 'lib/google/cloud/bigquery/data.rb', line 91

def etag
  @gapi_json[:etag]
end

#fieldsArray<Schema::Field>

The fields of the data, obtained from the schema of the table from which the data was read.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

data.fields.each do |field|
  puts field.name
end

Returns:



182
183
184
# File 'lib/google/cloud/bigquery/data.rb', line 182

def fields
  schema.fields
end

#headersArray<Symbol>

The names of the columns in the data, obtained from the schema of the table from which the data was read.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

data.headers.each do |header|
  puts header
end

Returns:

  • (Array<Symbol>)

    An array of column names.



205
206
207
# File 'lib/google/cloud/bigquery/data.rb', line 205

def headers
  schema.headers
end

#inserted_row_countInteger?

The number of inserted rows. Present only for DML statements INSERT and MERGE. (See #statement_type.)

Returns:

  • (Integer, nil)

    The number of inserted rows, or nil if not applicable.



399
400
401
# File 'lib/google/cloud/bigquery/data.rb', line 399

def inserted_row_count
  job_gapi&.statistics&.query&.dml_stats&.inserted_row_count
end

#kindString

The resource type of the API response.

Returns:

  • (String)

    The resource type.



82
83
84
# File 'lib/google/cloud/bigquery/data.rb', line 82

def kind
  @gapi_json[:kind]
end

#nextData

Retrieves the next page of data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word FROM `bigquery-public-data.samples.shakespeare`"
job = bigquery.query_job sql

job.wait_until_done!
data = job.data

data.count # 100000
data.total # 164656

# Iterate over the first page of results
data.each do |row|
  puts row[:word]
end
# Retrieve the next page of results
data = data.next if data.next?

Returns:

  • (Data)

    A new instance providing the next page of data.



470
471
472
473
474
475
476
477
478
# File 'lib/google/cloud/bigquery/data.rb', line 470

def next
  return nil unless next?
  ensure_service!
  data_json = service.list_tabledata \
    @table_gapi.table_reference.dataset_id,
    @table_gapi.table_reference.table_id,
    token: token
  self.class.from_gapi_json data_json, @table_gapi, job_gapi, @service
end

#next?Boolean

Whether there is a next page of data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word FROM `bigquery-public-data.samples.shakespeare`"
job = bigquery.query_job sql

job.wait_until_done!
data = job.data

data.count # 100000
data.total # 164656

# Iterate over the first page of results
data.each do |row|
  puts row[:word]
end
# Retrieve the next page of results
data = data.next if data.next?

Returns:

  • (Boolean)

    true when there is a next page, false otherwise.



440
441
442
# File 'lib/google/cloud/bigquery/data.rb', line 440

def next?
  !token.nil?
end

#num_dml_affected_rowsInteger?

The number of rows affected by a DML statement. Present only for DML statements INSERT, UPDATE or DELETE. (See #statement_type.)

Returns:

  • (Integer, nil)

    The number of rows affected by a DML statement, or nil if the query is not a DML statement.



377
378
379
# File 'lib/google/cloud/bigquery/data.rb', line 377

def num_dml_affected_rows
  job_gapi&.statistics&.query&.num_dml_affected_rows
end

#param_typesHash

The types of the fields in the data, obtained from the schema of the table from which the data was read. Types use the same format as the optional query parameter types.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

data.param_types

Returns:

  • (Hash)

    A hash with field names as keys, and types as values.



227
228
229
# File 'lib/google/cloud/bigquery/data.rb', line 227

def param_types
  schema.param_types
end

#schemaSchema

The schema of the table from which the data was read.

The returned object is frozen and changes are not allowed. Use Table#schema to update the schema.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

schema = data.schema
field = schema.field "name"
field.required? #=> true

Returns:

  • (Schema)

    A schema object.



158
159
160
161
# File 'lib/google/cloud/bigquery/data.rb', line 158

def schema
  return nil unless @table_gapi
  Schema.from_gapi(@table_gapi.schema).freeze
end

#statement_typeString?

The type of query statement, if valid. Possible values (new values might be added in the future):

Returns:

  • (String, nil)

    The type of query statement.



259
260
261
# File 'lib/google/cloud/bigquery/data.rb', line 259

def statement_type
  job_gapi&.statistics&.query&.statement_type
end

#tokenString

A token used for paging results. Used by the data instance to retrieve subsequent pages. See #next.

Returns:

  • (String)

    The pagination token.



101
102
103
# File 'lib/google/cloud/bigquery/data.rb', line 101

def token
  @gapi_json[:pageToken]
end

#totalInteger

The total number of rows in the complete table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word FROM `bigquery-public-data.samples.shakespeare`"
job = bigquery.query_job sql

job.wait_until_done!
data = job.data

data.count # 100000
data.total # 164656

# Iterate over the first page of results
data.each do |row|
  puts row[:word]
end
# Retrieve the next page of results
data = data.next if data.next?

Returns:

  • (Integer)

    The number of rows.



131
132
133
134
135
# File 'lib/google/cloud/bigquery/data.rb', line 131

def total
  Integer @gapi_json[:totalRows]
rescue StandardError
  nil
end

#updated_row_countInteger?

The number of updated rows. Present only for DML statements UPDATE and MERGE. (See #statement_type.)

Returns:

  • (Integer, nil)

    The number of updated rows, or nil if not applicable.



410
411
412
# File 'lib/google/cloud/bigquery/data.rb', line 410

def updated_row_count
  job_gapi&.statistics&.query&.dml_stats&.updated_row_count
end