Class: Gcloud::Bigquery::Table
- Inherits: Object
  - Object
  - Gcloud::Bigquery::Table
- Defined in:
  - lib/gcloud/bigquery/table.rb
  - lib/gcloud/bigquery/table/list.rb
  - lib/gcloud/bigquery/table/schema.rb
Overview
Table
A named resource representing a BigQuery table that holds zero or more records. Every table is defined by a schema that may contain nested and repeated fields. (For more information about nested and repeated fields, see Preparing Data for BigQuery.)
require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

row = {
  "first_name" => "Alice",
  "cities_lived" => [
    {
      "place" => "Seattle",
      "number_of_years" => 5
    },
    {
      "place" => "Stockholm",
      "number_of_years" => 6
    }
  ]
}
table.insert row
Defined Under Namespace
Classes: List, Schema
Instance Attribute Summary
-
#connection ⇒ Object
The Connection object.
-
#gapi ⇒ Object
The Google API Client object.
Class Method Summary
-
.from_gapi(gapi, conn) ⇒ Object
New Table from a Google API Client object.
Instance Method Summary
-
#bytes_count ⇒ Object
The number of bytes in the table.
-
#copy(destination_table, options = {}) ⇒ Object
Copies the data from the table to another table.
-
#created_at ⇒ Object
The time when this table was created.
-
#data(options = {}) ⇒ Object
Retrieves data from the table.
-
#dataset_id ⇒ Object
The ID of the Dataset containing this table.
-
#delete ⇒ Object
Permanently deletes the table.
-
#description ⇒ Object
The description of the table.
-
#description=(new_description) ⇒ Object
Updates the description of the table.
-
#etag ⇒ Object
A string hash of the table.
-
#expires_at ⇒ Object
The time when this table expires.
-
#extract(extract_url, options = {}) ⇒ Object
Extracts the data from the table to a Google Cloud Storage file.
-
#fields ⇒ Object
The fields of the table.
-
#headers ⇒ Object
The names of the columns in the table.
-
#id ⇒ Object
The combined Project ID, Dataset ID, and Table ID for this table, in the format specified by the Query Reference: project_name:datasetId.tableId.
-
#initialize ⇒ Table
constructor
Create an empty Table object.
-
#insert(rows, options = {}) ⇒ Object
Inserts data into the table for near-immediate querying, without the need to complete a #load operation before the data can appear in query results.
-
#link(source_url, options = {}) ⇒ Object
Links the table to a source table identified by a URI.
-
#load(file, options = {}) ⇒ Object
Loads data into the table.
-
#location ⇒ Object
The geographic location where the table should reside.
-
#modified_at ⇒ Object
The date when this table was last modified.
-
#name ⇒ Object
The name of the table.
-
#name=(new_name) ⇒ Object
Updates the name of the table.
-
#project_id ⇒ Object
The ID of the Project containing this table.
-
#query_id ⇒ Object
The value returned by #id, wrapped in square brackets if the Project ID contains dashes, as specified by the Query Reference.
-
#reload! ⇒ Object
(also: #refresh!)
Reloads the table with current data from the BigQuery service.
-
#rows_count ⇒ Object
The number of rows in the table.
-
#schema(options = {}) {|schema_builder| ... } ⇒ Object
Returns the table’s schema as a hash containing the keys and values returned by the Google Cloud BigQuery REST API.
-
#schema=(new_schema) ⇒ Object
Updates the schema of the table.
-
#table? ⇒ Boolean
Checks if the table’s type is “TABLE”.
-
#table_id ⇒ Object
A unique ID for this table.
-
#table_ref ⇒ Object
The gapi fragment containing the Project ID, Dataset ID, and Table ID as a camel-cased hash.
-
#url ⇒ Object
A URL that can be used to access the table using the REST API.
-
#view? ⇒ Boolean
Checks if the table’s type is “VIEW”.
Constructor Details
#initialize ⇒ Table
Create an empty Table object.
# File 'lib/gcloud/bigquery/table.rb', line 75

def initialize #:nodoc:
  @connection = nil
  @gapi = {}
end
Instance Attribute Details
#connection ⇒ Object
The Connection object.
# File 'lib/gcloud/bigquery/table.rb', line 67

def connection
  @connection
end
#gapi ⇒ Object
The Google API Client object.
# File 'lib/gcloud/bigquery/table.rb', line 71

def gapi
  @gapi
end
Class Method Details
.from_gapi(gapi, conn) ⇒ Object
New Table from a Google API Client object.
# File 'lib/gcloud/bigquery/table.rb', line 833

def self.from_gapi gapi, conn #:nodoc:
  klass = class_for gapi
  klass.new.tap do |f|
    f.gapi = gapi
    f.connection = conn
  end
end
Instance Method Details
#bytes_count ⇒ Object
The number of bytes in the table.
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 216

def bytes_count
  ensure_full_data!
  @gapi["numBytes"]
end
#copy(destination_table, options = {}) ⇒ Object
Copies the data from the table to another table.
Parameters
destination_table -
The destination for the copied data. (Table or String)
options -
An optional Hash for controlling additional behavior. (Hash)
options[:create] -
Specifies whether the job is allowed to create new tables. (String)
The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.
options[:write] -
Specifies how to handle data already present in the destination table. The default value is empty. (String)
The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the destination table already contains data.
Returns
Gcloud::Bigquery::CopyJob
Examples
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"
copy_job = table.copy destination_table
The destination table argument can also be a string identifier as specified by the Query Reference: project_name:datasetId.tableId. This is useful for referencing tables in other projects and datasets.
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
copy_job = table.copy "other-project:other_dataset.other_table"
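The project_name:datasetId.tableId format splits apart with plain string operations. This hypothetical helper (not part of the gem; its name and return shape are assumptions for illustration) sketches how such an identifier maps onto the Project ID, Dataset ID, and Table ID fields:

```ruby
# Hypothetical helper illustrating the "project_name:datasetId.tableId"
# identifier format described above. Not part of the gcloud gem.
def split_table_id id
  project, rest = id.split ":", 2     # project IDs may contain dashes
  dataset, table = rest.split ".", 2  # dataset and table IDs may not
  { "projectId" => project, "datasetId" => dataset, "tableId" => table }
end

split_table_id "other-project:other_dataset.other_table"
# => { "projectId"=>"other-project",
#      "datasetId"=>"other_dataset",
#      "tableId"=>"other_table" }
```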
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 512

def copy destination_table, options = {}
  ensure_connection!
  resp = connection.copy_table table_ref,
                               get_table_ref(destination_table),
                               options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end
#created_at ⇒ Object
The time when this table was created.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 236

def created_at
  ensure_full_data!
  Time.at(@gapi["creationTime"] / 1000.0)
end
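The creationTime value is a millisecond Unix epoch, which is why the implementation divides by 1000.0 before handing it to Time.at. A standalone sketch of the conversion, using a made-up timestamp:

```ruby
# "creationTime" arrives as milliseconds since the Unix epoch;
# Time.at expects seconds, so divide by 1000.0 (float division
# preserves sub-second precision). The value below is made up.
creation_time_ms = 1_432_000_000_000
created_at = Time.at(creation_time_ms / 1000.0)
created_at.utc  # a Time in mid-May 2015
```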
#data(options = {}) ⇒ Object
Retrieves data from the table.
Parameters
options -
An optional Hash for controlling additional behavior. (Hash)
options[:token] -
Page token, returned by a previous call, identifying the result set. (String)
options[:max] -
Maximum number of results to return. (Integer)
options[:start] -
Zero-based index of the starting row to read. (Integer)
Returns
Gcloud::Bigquery::Data
Example
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
data = table.data
data.each do |row|
  puts row["first_name"]
end
more_data = table.data token: data.token
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 443

def data options = {}
  ensure_connection!
  resp = connection.list_tabledata dataset_id, table_id, options
  if resp.success?
    Data.from_response resp, self
  else
    fail ApiError.from_response(resp)
  end
end
#dataset_id ⇒ Object
The ID of the Dataset containing this table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 96

def dataset_id
  @gapi["tableReference"]["datasetId"]
end
#delete ⇒ Object
Permanently deletes the table.
:category: Lifecycle
# File 'lib/gcloud/bigquery/table.rb', line 805

def delete
  ensure_connection!
  resp = connection.delete_table dataset_id, table_id
  if resp.success?
    true
  else
    fail ApiError.from_response(resp)
  end
end
#description ⇒ Object
The description of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 197

def description
  ensure_full_data!
  @gapi["description"]
end
#description=(new_description) ⇒ Object
Updates the description of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 207

def description= new_description
  patch_gapi! description: new_description
end
#etag ⇒ Object
A string hash of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 177

def etag
  ensure_full_data!
  @gapi["etag"]
end
#expires_at ⇒ Object
The time when this table expires. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 248

def expires_at
  ensure_full_data!
  return nil if @gapi["expirationTime"].nil?
  Time.at(@gapi["expirationTime"] / 1000.0)
end
#extract(extract_url, options = {}) ⇒ Object
Extracts the data from the table to a Google Cloud Storage file. For more information, see Exporting Data From BigQuery.
Parameters
extract_url -
The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data. (Gcloud::Storage::File or String or Array)
options -
An optional Hash for controlling additional behavior. (Hash)
options[:format] -
The exported file format. The default value is csv. (String)
The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
Returns
Gcloud::Bigquery::ExtractJob
Example
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
extract_job = table.extract "gs://my-bucket/file-name.json",
                            format: "json"
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 605

def extract extract_url, options = {}
  ensure_connection!
  resp = connection.extract_table table_ref, extract_url, options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end
#fields ⇒ Object
The fields of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 391

def fields
  f = schema["fields"]
  f = f.to_hash if f.respond_to? :to_hash
  f = [] if f.nil?
  f
end
#headers ⇒ Object
The names of the columns in the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 403

def headers
  fields.map { |f| f["name"] }
end
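Since #fields returns the schema's "fields" array of plain hashes, the mapping above can be tried without touching the service. A sketch with made-up field data:

```ruby
# A plain-Ruby sketch of the #headers logic: map each schema field
# hash to its "name" value. The field data here is made up.
fields = [
  { "name" => "first_name", "type" => "STRING",  "mode" => "REQUIRED" },
  { "name" => "age",        "type" => "INTEGER", "mode" => "REQUIRED" }
]
headers = fields.map { |f| f["name"] }
# => ["first_name", "age"]
```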
#id ⇒ Object
The combined Project ID, Dataset ID, and Table ID for this table, in the format specified by the Query Reference: project_name:datasetId.tableId. To use this value in queries, see #query_id.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 127

def id
  @gapi["id"]
end
#insert(rows, options = {}) ⇒ Object
Inserts data into the table for near-immediate querying, without the need to complete a #load operation before the data can appear in query results. See Streaming Data Into BigQuery.
Parameters
rows -
A hash object or array of hash objects containing the data. (Array or Hash)
options -
An optional Hash for controlling additional behavior. (Hash)
options[:skip_invalid] -
Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. (Boolean)
options[:ignore_unknown] -
Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors. (Boolean)
Returns
Gcloud::Bigquery::InsertResponse
Example
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 774

def insert rows, options = {}
  rows = [rows] if rows.is_a? Hash
  ensure_connection!
  resp = connection.insert_tabledata dataset_id, table_id, rows, options
  if resp.success?
    InsertResponse.from_gapi rows, resp.data
  else
    fail ApiError.from_response(resp)
  end
end
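Note that the implementation wraps a single Hash in an Array before sending it, which is why both a hash and an array of hashes are accepted. A standalone sketch of that normalization (the helper name is made up):

```ruby
# Sketch of the rows normalization performed by #insert: a single Hash
# becomes a one-element Array; an Array passes through unchanged.
def normalize_rows rows
  rows.is_a?(Hash) ? [rows] : rows
end

normalize_rows("first_name" => "Alice")     # => [{"first_name"=>"Alice"}]
normalize_rows [{ "first_name" => "Bob" }]  # => [{"first_name"=>"Bob"}]
```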
#link(source_url, options = {}) ⇒ Object
Links the table to a source table identified by a URI.
Parameters
source_url -
The URI of the source table to link. (String)
options -
An optional Hash for controlling additional behavior. (Hash)
options[:create] -
Specifies whether the job is allowed to create new tables. (String)
The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.
options[:write] -
Specifies how to handle data already present in the table. The default value is empty. (String)
The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the table already contains data.
Returns
Gcloud::Bigquery::Job
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 556

def link source_url, options = {} #:nodoc:
  ensure_connection!
  resp = connection.link_table table_ref, source_url, options
  if resp.success?
    Job.from_gapi resp.data, connection
  else
    fail ApiError.from_response(resp)
  end
end
#load(file, options = {}) ⇒ Object
Loads data into the table.
Parameters
file -
A file or the URI of a Google Cloud Storage file containing data to load into the table. (File or Gcloud::Storage::File or String)
options -
An optional Hash for controlling additional behavior. (Hash)
options[:format] -
The format of the source file. The default value is csv. (String)
The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
- datastore_backup - Cloud Datastore backup
options[:create] -
Specifies whether the job is allowed to create new tables. (String)
The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A ‘notFound’ error is raised if the table does not exist.
options[:write] -
Specifies how to handle data already present in the table. The default value is empty. (String)
The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the table already contains data.
options[:projection_fields] -
If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn’t found in the Cloud Datastore backup, an invalid error is returned. (Array)
Returns
Gcloud::Bigquery::LoadJob
Examples
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
load_job = table.load "gs://my-bucket/file-name.csv"
You can also pass a gcloud storage file instance.
require "gcloud"
require "gcloud/storage"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
storage = gcloud.storage
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
load_job = table.load file
Or, you can upload a file directly. See Loading Data with a POST Request (cloud.google.com/bigquery/loading-data-post-request#multipart).
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
file = File.open "my_data.csv"
load_job = table.load file
A note about large direct uploads
You may encounter a broken pipe error while attempting to upload large files. To avoid this problem, add httpclient as a dependency to your project and configure Faraday to use it after requiring Gcloud, but before initiating your Gcloud connection.
require "gcloud"
Faraday.default_adapter = :httpclient
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 720

def load file, options = {}
  ensure_connection!
  if storage_url? file
    load_storage file, options
  elsif local_file? file
    load_local file, options
  else
    fail Gcloud::Bigquery::Error, "Don't know how to load #{file}"
  end
end
#location ⇒ Object
The geographic location where the table should reside. Possible values include EU and US. The default value is US.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 288

def location
  ensure_full_data!
  @gapi["location"]
end
#modified_at ⇒ Object
The date when this table was last modified.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 259

def modified_at
  ensure_full_data!
  Time.at(@gapi["lastModifiedTime"] / 1000.0)
end
#name ⇒ Object
The name of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 159

def name
  @gapi["friendlyName"]
end
#name=(new_name) ⇒ Object
Updates the name of the table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 168

def name= new_name
  patch_gapi! name: new_name
end
#project_id ⇒ Object
The ID of the Project containing this table.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 105

def project_id
  @gapi["tableReference"]["projectId"]
end
#query_id ⇒ Object
The value returned by #id, wrapped in square brackets if the Project ID contains dashes, as specified by the Query Reference. Useful in queries.
Example
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
data = bigquery.query "SELECT name FROM #{table.query_id}"
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 150

def query_id
  project_id["-"] ? "[#{id}]" : id
end
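The dash check above relies on String#[] returning nil when the substring is absent. This hypothetical standalone version (not part of the gem) reproduces the rule so it can be exercised locally:

```ruby
# Standalone sketch of the #query_id rule: bracket the full table ID
# only when the project ID contains a dash. Not part of the gem.
def query_id_for project_id, full_id
  project_id["-"] ? "[#{full_id}]" : full_id
end

query_id_for "my-project", "my-project:my_dataset.my_table"
# => "[my-project:my_dataset.my_table]"
query_id_for "myproject", "myproject:my_dataset.my_table"
# => "myproject:my_dataset.my_table"
```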
#reload! ⇒ Object Also known as: refresh!
Reloads the table with current data from the BigQuery service.
:category: Lifecycle
# File 'lib/gcloud/bigquery/table.rb', line 820

def reload!
  ensure_connection!
  resp = connection.get_table dataset_id, table_id
  if resp.success?
    @gapi = resp.data
  else
    fail ApiError.from_response(resp)
  end
end
#rows_count ⇒ Object
The number of rows in the table.
:category: Data
# File 'lib/gcloud/bigquery/table.rb', line 226

def rows_count
  ensure_full_data!
  @gapi["numRows"]
end
#schema(options = {}) {|schema_builder| ... } ⇒ Object
Returns the table’s schema as a hash containing the keys and values returned by the Google Cloud BigQuery REST API. This method can also be used to set, replace, or add to the schema by passing a block. See Table::Schema for available methods. To set the schema by passing a hash instead, use #schema=.
Parameters
options -
An optional Hash for controlling additional behavior. (Hash)
options[:replace] -
Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. When a table already contains data, schema changes must be additive. Thus, the default value is false. (Boolean)
Examples
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"
table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 331

def schema options = {}
  ensure_full_data!
  g = @gapi
  g = g.to_hash if g.respond_to? :to_hash
  s = g["schema"] ||= {}
  return s unless block_given?
  s = nil if options[:replace]
  schema_builder = Schema.new s
  yield schema_builder
  self.schema = schema_builder.schema if schema_builder.changed?
end
#schema=(new_schema) ⇒ Object
Updates the schema of the table. To update the schema using a block instead, use #schema.
Parameters
schema -
A hash containing keys and values as specified by the Google Cloud BigQuery REST API. (Hash)
Example
require "gcloud"
gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"
schema = {
  "fields" => [
    {
      "name" => "first_name",
      "type" => "STRING",
      "mode" => "REQUIRED"
    },
    {
      "name" => "age",
      "type" => "INTEGER",
      "mode" => "REQUIRED"
    }
  ]
}
table.schema = schema
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 382

def schema= new_schema
  patch_gapi! schema: new_schema
end
#table? ⇒ Boolean
Checks if the table’s type is “TABLE”.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 269

def table?
  @gapi["type"] == "TABLE"
end
#table_id ⇒ Object
A unique ID for this table. The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 87

def table_id
  @gapi["tableReference"]["tableId"]
end
#table_ref ⇒ Object
The gapi fragment containing the Project ID, Dataset ID, and Table ID as a camel-cased hash.
# File 'lib/gcloud/bigquery/table.rb', line 112

def table_ref #:nodoc:
  table_ref = @gapi["tableReference"]
  table_ref = table_ref.to_hash if table_ref.respond_to? :to_hash
  table_ref
end
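The camel-cased hash returned by #table_ref mirrors the tableReference object in the REST API. A made-up example of its shape:

```ruby
# Shape of the tableReference fragment returned by #table_ref,
# with made-up IDs. Keys are camelCase, as in the REST API.
table_ref = {
  "projectId" => "my-project",
  "datasetId" => "my_dataset",
  "tableId"   => "my_table"
}
table_ref.keys  # => ["projectId", "datasetId", "tableId"]
```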
#url ⇒ Object
A URL that can be used to access the table using the REST API.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 187

def url
  ensure_full_data!
  @gapi["selfLink"]
end
#view? ⇒ Boolean
Checks if the table’s type is “VIEW”.
:category: Attributes
# File 'lib/gcloud/bigquery/table.rb', line 278

def view?
  @gapi["type"] == "VIEW"
end