Class: Gcloud::Bigquery::Table

Inherits:
Object
Defined in:
lib/gcloud/bigquery/table.rb,
lib/gcloud/bigquery/table/list.rb,
lib/gcloud/bigquery/table/schema.rb

Overview

Table

A named resource representing a BigQuery table that holds zero or more records. Every table is defined by a schema that may contain nested and repeated fields. (For more information about nested and repeated fields, see Preparing Data for BigQuery.)

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

row = {
  "first_name" => "Alice",
  "cities_lived" => [
    {
      "place" => "Seattle",
      "number_of_years" => 5
    },
    {
      "place" => "Stockholm",
      "number_of_years" => 6
    }
  ]
}
table.insert row

Defined Under Namespace

Classes: List, Schema

Constructor Details

#initialize ⇒ Table

Create an empty Table object.



# File 'lib/gcloud/bigquery/table.rb', line 75

def initialize #:nodoc:
  @connection = nil
  @gapi = {}
end

Instance Attribute Details

#connection ⇒ Object

The Connection object.



# File 'lib/gcloud/bigquery/table.rb', line 67

def connection
  @connection
end

#gapi ⇒ Object

The Google API Client object.



# File 'lib/gcloud/bigquery/table.rb', line 71

def gapi
  @gapi
end

Class Method Details

.from_gapi(gapi, conn) ⇒ Object

New Table from a Google API Client object.



# File 'lib/gcloud/bigquery/table.rb', line 833

def self.from_gapi gapi, conn #:nodoc:
  klass = class_for gapi
  klass.new.tap do |f|
    f.gapi = gapi
    f.connection = conn
  end
end

Instance Method Details

#bytes_count ⇒ Object

The number of bytes in the table.

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 216

def bytes_count
  ensure_full_data!
  @gapi["numBytes"]
end

#copy(destination_table, options = {}) ⇒ Object

Copies the data from the table to another table.

Parameters

destination_table

The destination for the copied data. (Table or String)

options

An optional Hash for controlling additional behavior. (Hash)

options[:create]

Specifies whether the job is allowed to create new tables. (String)

The following values are supported:

  • needed - Create the table if it does not exist.

  • never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.

options[:write]

Specifies how to handle data already present in the destination table. The default value is empty. (String)

The following values are supported:

  • truncate - BigQuery overwrites the table data.

  • append - BigQuery appends the data to the table.

  • empty - An error will be returned if the destination table already contains data.

Returns

Gcloud::Bigquery::CopyJob

Examples

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

copy_job = table.copy destination_table

The destination table argument can also be a string identifier as specified by the Query Reference: project_name:datasetId.tableId. This is useful for referencing tables in other projects and datasets.

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

copy_job = table.copy "other-project:other_dataset.other_table"
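
To make the string form concrete, here is a minimal sketch of how such an identifier could be split into its three parts. It is not the library's own parser: parse_table_id is a hypothetical helper, the camel-cased keys are assumed to mirror the table reference hash described under #table_ref, and the project name is assumed to contain no colon.

```ruby
# Hypothetical helper (not part of gcloud): splits an identifier of the
# form "project_name:datasetId.tableId" into a camel-cased reference hash.
def parse_table_id(str)
  project, rest = str.split(":", 2)
  dataset, table = rest.to_s.split(".", 2)
  if [project, dataset, table].any? { |part| part.nil? || part.empty? }
    raise ArgumentError, "expected project:dataset.table, got #{str.inspect}"
  end
  { "projectId" => project, "datasetId" => dataset, "tableId" => table }
end

ref = parse_table_id("other-project:other_dataset.other_table")
puts ref["projectId"] # other-project
puts ref["datasetId"] # other_dataset
puts ref["tableId"]   # other_table
```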

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 512

def copy destination_table, options = {}
  ensure_connection!
  resp = connection.copy_table table_ref,
                               get_table_ref(destination_table),
                               options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end

#created_at ⇒ Object

The time when this table was created.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 236

def created_at
  ensure_full_data!
  Time.at(@gapi["creationTime"] / 1000.0)
end
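
The conversion above treats the stored value as milliseconds since the Unix epoch. The following standalone sketch shows why the divisor is the Float 1000.0 rather than the Integer 1000; the sample timestamp is made up for illustration and stands in for the "creationTime" value.

```ruby
# Stand-in for @gapi["creationTime"]: milliseconds since the Unix epoch.
creation_time_ms = 1_433_116_800_123

# Dividing by 1000.0 (a Float) keeps the sub-second part.
created_at = Time.at(creation_time_ms / 1000.0).utc

# Integer division would silently drop the milliseconds.
truncated = Time.at(creation_time_ms / 1000).utc

puts created_at.strftime("%Y-%m-%d %H:%M:%S") # 2015-06-01 00:00:00
puts truncated.usec                            # 0
```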

#data(options = {}) ⇒ Object

Retrieves data from the table.

Parameters

options

An optional Hash for controlling additional behavior. (Hash)

options[:token]

Page token, returned by a previous call, identifying the result set. (String)

options[:max]

Maximum number of results to return. (Integer)

options[:start]

Zero-based index of the starting row to read. (Integer)

Returns

Gcloud::Bigquery::Data

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data
data.each do |row|
  puts row["first_name"]
end
more_data = table.data token: data.token

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 443

def data options = {}
  ensure_connection!
  resp = connection.list_tabledata dataset_id, table_id, options
  if resp.success?
    Data.from_response resp, self
  else
    fail ApiError.from_response(resp)
  end
end
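
The token option lends itself to a paging loop. The sketch below runs without a BigQuery connection by using stand-ins: Page and FakeTable are hypothetical and are not part of gcloud, and the loop assumes that the token is nil once the last page has been read.

```ruby
# Stand-in for Gcloud::Bigquery::Data: a page of rows plus the token for
# the next page (assumed nil on the last page). Hypothetical.
Page = Struct.new(:rows, :token)

# Stand-in for a table whose #data accepts a :token option.
class FakeTable
  PAGES = {
    nil  => Page.new([{ "first_name" => "Alice" }], "t1"),
    "t1" => Page.new([{ "first_name" => "Bob" }], nil)
  }.freeze

  def data(options = {})
    PAGES.fetch(options[:token])
  end
end

table = FakeTable.new
names = []
page = table.data
loop do
  page.rows.each { |row| names << row["first_name"] }
  break if page.token.nil?
  # Pass the previous page's token to request the next page.
  page = table.data token: page.token
end
puts names.inspect # ["Alice", "Bob"]
```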

#dataset_id ⇒ Object

The ID of the Dataset containing this table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 96

def dataset_id
  @gapi["tableReference"]["datasetId"]
end

#delete ⇒ Object

Permanently deletes the table.

Returns

true if the table was deleted.

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.delete

:category: Lifecycle



# File 'lib/gcloud/bigquery/table.rb', line 805

def delete
  ensure_connection!
  resp = connection.delete_table dataset_id, table_id
  if resp.success?
    true
  else
    fail ApiError.from_response(resp)
  end
end

#description ⇒ Object

The description of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 197

def description
  ensure_full_data!
  @gapi["description"]
end

#description=(new_description) ⇒ Object

Updates the description of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 207

def description= new_description
  patch_gapi! description: new_description
end

#etag ⇒ Object

A string hash of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 177

def etag
  ensure_full_data!
  @gapi["etag"]
end

#expires_at ⇒ Object

The time when this table expires. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 248

def expires_at
  ensure_full_data!
  return nil if @gapi["expirationTime"].nil?
  Time.at(@gapi["expirationTime"] / 1000.0)
end

#extract(extract_url, options = {}) ⇒ Object

Extracts the data from the table to a Google Cloud Storage file. For more information, see Exporting Data From BigQuery.

Parameters

extract_url

The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data. (Gcloud::Storage::File or String or Array)

options

An optional Hash for controlling additional behavior. (Hash)

options[:format]

The exported file format. The default value is csv. (String)

The following values are supported:

Returns

Gcloud::Bigquery::ExtractJob

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract "gs://my-bucket/file-name.json",
                            format: "json"

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 605

def extract extract_url, options = {}
  ensure_connection!
  resp = connection.extract_table table_ref, extract_url, options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end

#fields ⇒ Object

The fields of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 391

def fields
  f = schema["fields"]
  f = f.to_hash if f.respond_to? :to_hash
  f = [] if f.nil?
  f
end

#headers ⇒ Object

The names of the columns in the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 403

def headers
  fields.map { |f| f["name"] }
end
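
As a standalone illustration of how #fields and #headers relate, the sketch below applies the same steps to a plain hash. The sample schema is made up and follows the shape shown in the #schema= example; no gcloud objects are involved.

```ruby
# Sample schema hash in the shape shown for #schema=.
schema = {
  "fields" => [
    { "name" => "first_name", "type" => "STRING", "mode" => "REQUIRED" },
    { "name" => "age", "type" => "INTEGER", "mode" => "REQUIRED" }
  ]
}

# Mirrors #fields: fall back to an empty list when no fields are defined.
fields = schema["fields"] || []

# Mirrors #headers: collect each field's name.
headers = fields.map { |f| f["name"] }
puts headers.inspect # ["first_name", "age"]

# A schema without fields yields no headers.
empty_headers = ({}["fields"] || []).map { |f| f["name"] }
puts empty_headers.inspect # []
```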

#id ⇒ Object

The combined Project ID, Dataset ID, and Table ID for this table, in the format specified by the Query Reference: project_name:datasetId.tableId. To use this value in queries see #query_id.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 127

def id
  @gapi["id"]
end

#insert(rows, options = {}) ⇒ Object

Inserts data into the table for near-immediate querying, without the need to complete a #load operation before the data can appear in query results. See Streaming Data Into BigQuery.

Parameters

rows

A hash object or array of hash objects containing the data. (Array or Hash)

options

An optional Hash for controlling additional behavior. (Hash)

options[:skip_invalid]

Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. (Boolean)

options[:ignore_unknown]

Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors. (Boolean)

Returns

Gcloud::Bigquery::InsertResponse

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 774

def insert rows, options = {}
  rows = [rows] if rows.is_a? Hash
  ensure_connection!
  resp = connection.insert_tabledata dataset_id, table_id, rows, options
  if resp.success?
    InsertResponse.from_gapi rows, resp.data
  else
    fail ApiError.from_response(resp)
  end
end
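
The first line of the method is what lets callers pass either one row or many. A minimal sketch of that normalization follows; normalize_rows is a hypothetical helper written for illustration and does not exist in gcloud.

```ruby
# Hypothetical helper mirroring the first line of #insert: a lone Hash
# is wrapped in an Array so later code always sees a list of rows.
def normalize_rows(rows)
  rows.is_a?(Hash) ? [rows] : rows
end

single = normalize_rows({ "first_name" => "Alice", "age" => 21 })
many = normalize_rows([
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
])

puts single.length # 1
puts many.length   # 2
```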

#link(source_url, options = {}) ⇒ Object

Links the table to a source table identified by a URI.

Parameters

source_url

The URI of the source table to link. (String)

options

An optional Hash for controlling additional behavior. (Hash)

options[:create]

Specifies whether the job is allowed to create new tables. (String)

The following values are supported:

  • needed - Create the table if it does not exist.

  • never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.

options[:write]

Specifies how to handle data already present in the table. The default value is empty. (String)

The following values are supported:

  • truncate - BigQuery overwrites the table data.

  • append - BigQuery appends the data to the table.

  • empty - An error will be returned if the table already contains data.

Returns

Gcloud::Bigquery::Job

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 556

def link source_url, options = {} #:nodoc:
  ensure_connection!
  resp = connection.link_table table_ref, source_url, options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end

#load(file, options = {}) ⇒ Object

Loads data into the table.

Parameters

file

A file or the URI of a Google Cloud Storage file containing data to load into the table. (File or Gcloud::Storage::File or String)

options

An optional Hash for controlling additional behavior. (Hash)

options[:format]

The format of the file to load. The default value is csv. (String)

The following values are supported:

options[:create]

Specifies whether the job is allowed to create new tables. (String)

The following values are supported:

  • needed - Create the table if it does not exist.

  • never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.

options[:write]

Specifies how to handle data already present in the table. The default value is empty. (String)

The following values are supported:

  • truncate - BigQuery overwrites the table data.

  • append - BigQuery appends the data to the table.

  • empty - An error will be returned if the table already contains data.

options[:projection_fields]

If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn’t found in the Cloud Datastore backup, an invalid error is returned. (Array)

Returns

Gcloud::Bigquery::LoadJob

Examples

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

load_job = table.load "gs://my-bucket/file-name.csv"

You can also pass a Gcloud::Storage::File instance.

require "gcloud"
require "gcloud/storage"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = gcloud.storage
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
load_job = table.load file

Or, you can upload a file directly. See Loading Data with a POST Request (cloud.google.com/bigquery/loading-data-post-request#multipart).

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

file = File.open "my_data.csv"
load_job = table.load file

A note about large direct uploads

You may encounter a broken pipe error while attempting to upload large files. To avoid this problem, add httpclient as a dependency to your project, and configure Faraday to use it, after requiring Gcloud, but before initiating your Gcloud connection.

require "gcloud"

Faraday.default_adapter = :httpclient

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 720

def load file, options = {}
  ensure_connection!
  if storage_url? file
    load_storage file, options
  elsif local_file? file
    load_local file, options
  else
    fail Gcloud::Bigquery::Error, "Don't know how to load #{file}"
  end
end
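
The method dispatches on the kind of argument it receives. The sketch below imitates that dispatch without a connection. The two predicate bodies are assumptions made for illustration, since the private helpers storage_url? and local_file? are not shown on this page and may work differently; the Gcloud::Storage::File case is left out, and load_kind is a hypothetical name.

```ruby
require "tempfile"

# Hypothetical stand-ins for the private helpers used by #load; the real
# checks in gcloud may differ.
def storage_url?(file)
  file.respond_to?(:to_str) && file.to_str.start_with?("gs://")
end

def local_file?(file)
  file.respond_to?(:path) && ::File.file?(file.path)
end

# Reports which branch of #load the argument would take.
def load_kind(file)
  if storage_url?(file)
    :storage
  elsif local_file?(file)
    :local
  else
    raise ArgumentError, "Don't know how to load #{file}"
  end
end

puts load_kind("gs://my-bucket/file-name.csv") # storage
Tempfile.create("my_data") do |f|
  puts load_kind(f)                             # local
end
```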

#location ⇒ Object

The geographic location where the table should reside. Possible values include EU and US. The default value is US.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 288

def location
  ensure_full_data!
  @gapi["location"]
end

#modified_at ⇒ Object

The time when this table was last modified.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 259

def modified_at
  ensure_full_data!
  Time.at(@gapi["lastModifiedTime"] / 1000.0)
end

#name ⇒ Object

The name of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 159

def name
  @gapi["friendlyName"]
end

#name=(new_name) ⇒ Object

Updates the name of the table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 168

def name= new_name
  patch_gapi! name: new_name
end

#project_id ⇒ Object

The ID of the Project containing this table.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 105

def project_id
  @gapi["tableReference"]["projectId"]
end

#query_id ⇒ Object

The value returned by #id, wrapped in square brackets if the Project ID contains dashes, as specified by the Query Reference. Useful in queries.

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = bigquery.query "SELECT name FROM #{table.query_id}"

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 150

def query_id
  project_id["-"] ? "[#{id}]" : id
end
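
The one-line body relies on String#[] with a String argument, which returns the matching substring or nil. The standalone sketch below reproduces the same rule; query_id_for is a hypothetical helper and the project names are made up.

```ruby
# Hypothetical helper applying the same rule as #query_id: wrap the full
# ID in square brackets only when the project ID contains a dash.
def query_id_for(project_id, id)
  project_id["-"] ? "[#{id}]" : id
end

puts query_id_for("my-project", "my-project:my_dataset.my_table")
# [my-project:my_dataset.my_table]
puts query_id_for("myproject", "myproject:my_dataset.my_table")
# myproject:my_dataset.my_table
```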

#reload! ⇒ Object Also known as: refresh!

Reloads the table with current data from the BigQuery service.

:category: Lifecycle



# File 'lib/gcloud/bigquery/table.rb', line 820

def reload!
  ensure_connection!
  resp = connection.get_table dataset_id, table_id
  if resp.success?
    @gapi = resp.data
  else
    fail ApiError.from_response(resp)
  end
end

#rows_count ⇒ Object

The number of rows in the table.

:category: Data



# File 'lib/gcloud/bigquery/table.rb', line 226

def rows_count
  ensure_full_data!
  @gapi["numRows"]
end

#schema(options = {}) {|schema_builder| ... } ⇒ Object

Returns the table’s schema as a hash containing the keys and values returned by the Google Cloud BigQuery REST API. This method can also be used to set, replace, or add to the schema by passing a block. See Table::Schema for available methods. To set the schema by passing a hash instead, use #schema=.

Parameters

options

An optional Hash for controlling additional behavior. (Hash)

options[:replace]

Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. When a table already contains data, schema changes must be additive. Thus, the default value is false. (Boolean)

Examples

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

:category: Attributes

Yields:

  • (schema_builder)


# File 'lib/gcloud/bigquery/table.rb', line 331

def schema options = {}
  ensure_full_data!
  g = @gapi
  g = g.to_hash if g.respond_to? :to_hash
  s = g["schema"] ||= {}
  return s unless block_given?
  s = nil if options[:replace]
  schema_builder = Schema.new s
  yield schema_builder
  self.schema = schema_builder.schema if schema_builder.changed?
end

#schema=(new_schema) ⇒ Object

Updates the schema of the table. To update the schema using a block instead, use #schema.

Parameters

new_schema

A hash containing keys and values as specified by the Google Cloud BigQuery REST API. (Hash)

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

schema = {
  "fields" => [
    {
      "name" => "first_name",
      "type" => "STRING",
      "mode" => "REQUIRED"
    },
    {
      "name" => "age",
      "type" => "INTEGER",
      "mode" => "REQUIRED"
    }
  ]
}
table.schema = schema

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 382

def schema= new_schema
  patch_gapi! schema: new_schema
end
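
Nested and repeated fields can be written in the same hash form. The sketch below writes out by hand the schema that the block example under #schema describes, assuming the REST API spells a nested record as type "RECORD" with its own "fields" list and a repeated field as mode "REPEATED". The field_paths helper is hypothetical and only walks the hash.

```ruby
# Hand-written hash form of the first_name / cities_lived schema.
schema = {
  "fields" => [
    { "name" => "first_name", "type" => "STRING", "mode" => "REQUIRED" },
    {
      "name" => "cities_lived",
      "type" => "RECORD",
      "mode" => "REPEATED",
      "fields" => [
        { "name" => "place", "type" => "STRING", "mode" => "REQUIRED" },
        { "name" => "number_of_years", "type" => "INTEGER", "mode" => "REQUIRED" }
      ]
    }
  ]
}

# Hypothetical helper: list every field path, descending into records.
def field_paths(fields, prefix = nil)
  fields.flat_map do |f|
    path = [prefix, f["name"]].compact.join(".")
    [path] + field_paths(f["fields"] || [], path)
  end
end

puts field_paths(schema["fields"]).inspect
# ["first_name", "cities_lived", "cities_lived.place", "cities_lived.number_of_years"]
```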

#table? ⇒ Boolean

Checks if the table’s type is “TABLE”.

:category: Attributes

Returns:

  • (Boolean)


# File 'lib/gcloud/bigquery/table.rb', line 269

def table?
  @gapi["type"] == "TABLE"
end

#table_id ⇒ Object

A unique ID for this table. The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 87

def table_id
  @gapi["tableReference"]["tableId"]
end

#table_ref ⇒ Object

The gapi fragment containing the Project ID, Dataset ID, and Table ID as a camel-cased hash.



# File 'lib/gcloud/bigquery/table.rb', line 112

def table_ref #:nodoc:
  table_ref = @gapi["tableReference"]
  table_ref = table_ref.to_hash if table_ref.respond_to? :to_hash
  table_ref
end

#url ⇒ Object

A URL that can be used to access the table using the REST API.

:category: Attributes



# File 'lib/gcloud/bigquery/table.rb', line 187

def url
  ensure_full_data!
  @gapi["selfLink"]
end

#view? ⇒ Boolean

Checks if the table’s type is “VIEW”.

:category: Attributes

Returns:

  • (Boolean)


# File 'lib/gcloud/bigquery/table.rb', line 278

def view?
  @gapi["type"] == "VIEW"
end