Class: Google::Cloud::Bigquery::External::CsvSource
- Inherits: DataSource
  - Object
  - DataSource
  - Google::Cloud::Bigquery::External::CsvSource
- Defined in: lib/google/cloud/bigquery/external/csv_source.rb
Overview
CsvSource
CsvSource is a subclass of DataSource and represents a CSV external data source, such as a file in Google Cloud Storage or Google Drive, that can be queried directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
Instance Method Summary
-
#delimiter ⇒ String
The separator for fields in a CSV file.
-
#delimiter=(new_delimiter) ⇒ Object
Set the separator for fields in a CSV file.
-
#encoding ⇒ String
The character encoding of the data.
-
#encoding=(new_encoding) ⇒ Object
Set the character encoding of the data.
-
#fields ⇒ Array<Schema::Field>
The fields of the schema.
-
#headers ⇒ Array<Symbol>
The names of the columns in the schema.
-
#iso8859_1? ⇒ Boolean
Checks if the character encoding of the data is "ISO-8859-1".
-
#jagged_rows ⇒ Boolean
Indicates if BigQuery should accept rows that are missing trailing optional columns.
-
#jagged_rows=(new_jagged_rows) ⇒ Object
Set whether BigQuery should accept rows that are missing trailing optional columns.
-
#null_marker ⇒ String?
Specifies a string that represents a null value in a CSV file.
-
#null_marker=(null_marker) ⇒ Object
Sets a string that represents a null value in a CSV file.
-
#null_markers ⇒ Array<String>
The list of strings interpreted as SQL NULL values in a CSV file.
-
#null_markers=(null_markers) ⇒ Object
Sets the list of strings interpreted as SQL NULL values in a CSV file.
-
#param_types ⇒ Hash
The types of the fields in the data in the schema, using the same format as the optional query parameter types.
-
#preserve_ascii_control_characters ⇒ Boolean?
Indicates if the embedded ASCII control characters (the first 32 characters in the ASCII-table, from \x00 to \x1F) are preserved.
-
#preserve_ascii_control_characters=(val) ⇒ Object
Sets whether the embedded ASCII control characters (the first 32 characters in the ASCII-table, from \x00 to \x1F) are preserved.
-
#quote ⇒ String
The value that is used to quote data sections in a CSV file.
-
#quote=(new_quote) ⇒ Object
Set the value that is used to quote data sections in a CSV file.
-
#quoted_newlines ⇒ Boolean
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file.
-
#quoted_newlines=(new_quoted_newlines) ⇒ Object
Set whether BigQuery should allow quoted data sections that contain newline characters in a CSV file.
-
#schema(replace: false) {|schema| ... } ⇒ Google::Cloud::Bigquery::Schema
The schema for the data.
-
#schema=(new_schema) ⇒ Object
Set the schema for the data.
-
#skip_leading_rows ⇒ Integer
The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
-
#skip_leading_rows=(row_count) ⇒ Object
Set the number of rows at the top of a CSV file that BigQuery will skip when reading the data.
-
#source_column_match ⇒ String?
Controls the strategy used to match loaded columns to the schema.
-
#source_column_match=(source_column_match) ⇒ Object
Sets the strategy used to match loaded columns to the schema.
-
#utf8? ⇒ Boolean
Checks if the character encoding of the data is "UTF-8".
Methods inherited from DataSource
#autodetect, #autodetect=, #avro?, #backup?, #bigtable?, #compression, #compression=, #csv?, #date_format, #date_format=, #datetime_format, #datetime_format=, #format, #hive_partitioning?, #hive_partitioning_mode, #hive_partitioning_mode=, #hive_partitioning_require_partition_filter=, #hive_partitioning_require_partition_filter?, #hive_partitioning_source_uri_prefix, #hive_partitioning_source_uri_prefix=, #ignore_unknown, #ignore_unknown=, #json?, #max_bad_records, #max_bad_records=, #orc?, #parquet?, #reference_file_schema_uri, #reference_file_schema_uri=, #sheets?, #time_format, #time_format=, #time_zone, #time_zone=, #timestamp_format, #timestamp_format=, #urls
Instance Method Details
#delimiter ⇒ String
The separator for fields in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 256
    def delimiter
      @gapi.csv_options.field_delimiter
    end
#delimiter=(new_delimiter) ⇒ Object
Set the separator for fields in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 277
    def delimiter= new_delimiter
      frozen_check!
      @gapi.csv_options.field_delimiter = new_delimiter
    end
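What a non-default separator means for parsing can be seen locally with Ruby's standard `csv` library. This is only an illustration: Ruby's parser stands in for BigQuery's, `col_sep` plays the role of `delimiter`, and the sample line is made up.

```ruby
require "csv"

# Local illustration only: Ruby's CSV parser stands in for BigQuery's.
# `col_sep` is the stdlib analogue of the `delimiter` setting.
line = "alice|30|NYC"

# With the matching separator the line splits into three fields.
fields_with_pipe = CSV.parse_line(line, col_sep: "|")

# With the default separator (",") the whole line is read as one field.
fields_with_comma = CSV.parse_line(line)
```

A separator mismatch does not necessarily fail; it can silently yield one wide column, which is why the setting should match the file.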
#encoding ⇒ String
The character encoding of the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 167
    def encoding
      @gapi.csv_options.encoding
    end
#encoding=(new_encoding) ⇒ Object
Set the character encoding of the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 188
    def encoding= new_encoding
      frozen_check!
      @gapi.csv_options.encoding = new_encoding
    end
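Why the encoding setting matters can be sketched with plain Ruby strings. The example below is a local illustration with made-up bytes; it does not call BigQuery and only shows how the same byte reads under the two encodings this class names.

```ruby
# Local illustration only: the same byte means different things under
# the two encodings named in this class ("UTF-8" and "ISO-8859-1").
raw = "caf\xE9".b # bytes as they might appear in a Latin-1 CSV file

as_latin1 = raw.dup.force_encoding("ISO-8859-1")
as_utf8   = raw.dup.force_encoding("UTF-8")

latin1_ok  = as_latin1.valid_encoding? # 0xE9 is "é" in ISO-8859-1
utf8_ok    = as_utf8.valid_encoding?   # a lone 0xE9 is not valid UTF-8
transcoded = as_latin1.encode("UTF-8") # readable once decoded correctly
```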
#fields ⇒ Array<Schema::Field>
The fields of the schema.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 657
    def fields
      schema.fields
    end
#headers ⇒ Array<Symbol>
The names of the columns in the schema.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 666
    def headers
      schema.headers
    end
#iso8859_1? ⇒ Boolean
Checks if the character encoding of the data is "ISO-8859-1".
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 235
    def iso8859_1?
      encoding == "ISO-8859-1"
    end
#jagged_rows ⇒ Boolean
Indicates if BigQuery should accept rows that are missing trailing optional columns.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 78
    def jagged_rows
      @gapi.csv_options.allow_jagged_rows
    end
#jagged_rows=(new_jagged_rows) ⇒ Object
Set whether BigQuery should accept rows that are missing trailing optional columns.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 100
    def jagged_rows= new_jagged_rows
      frozen_check!
      @gapi.csv_options.allow_jagged_rows = new_jagged_rows
    end
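A "jagged" row is one that stops before the last schema column. The sketch below shows one way to picture accepting such a row, assuming the missing trailing values are treated as nulls. `pad_jagged_row` is a hypothetical helper and not part of google-cloud-bigquery.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery: pads a short
# row with nil so it lines up with the schema's column count.
def pad_jagged_row(row, column_count)
  return row if row.size >= column_count
  row + Array.new(column_count - row.size)
end

columns = %w[name age city]
full    = pad_jagged_row(%w[alice 30 NYC], columns.size) # already complete
jagged  = pad_jagged_row(%w[bob], columns.size)          # two trailing gaps
```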
#null_marker ⇒ String?
Specifies a string that represents a null value in a CSV file. For
example, if you specify \N, BigQuery interprets \N as a null value when
querying a CSV file. The default value is the empty string. If you set this
property to a custom value, BigQuery throws an error when an empty string
appears in a column of any data type other than STRING and BYTE. For STRING
and BYTE columns, BigQuery interprets the empty string as an empty value.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 392
    def null_marker
      @gapi.csv_options.null_marker
    end
#null_marker=(null_marker) ⇒ Object
Sets a string that represents a null value in a CSV file. For
example, if you specify \N, BigQuery interprets \N as a null value when
querying a CSV file. The default value is the empty string. If you set this
property to a custom value, BigQuery throws an error when an empty string
appears in a column of any data type other than STRING and BYTE. For STRING
and BYTE columns, BigQuery interprets the empty string as an empty value.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 418
    def null_marker= null_marker
      frozen_check!
      @gapi.csv_options.null_marker = null_marker
    end
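The effect of a null marker on a single row can be sketched in plain Ruby. `apply_null_marker` is a hypothetical helper; it ignores column types, so it does not model the error BigQuery raises for empty strings in non-STRING, non-BYTE columns when a custom marker is set.

```ruby
# Hypothetical helper: maps fields equal to the marker to nil.
# Column types are ignored, unlike in BigQuery itself.
def apply_null_marker(row, marker)
  row.map { |field| field == marker ? nil : field }
end

row = ["alice", "\\N", ""] # "\\N" is the two-character string \N

with_marker  = apply_null_marker(row, "\\N") # custom marker
with_default = apply_null_marker(row, "")    # default: the empty string
```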
#null_markers ⇒ Array<String>
The list of strings interpreted as SQL NULL values in a CSV file. null_marker and null_markers cannot both be set: setting one requires leaving the other unset, and setting both raises a user error. Every string listed in null_markers, including the empty string, is interpreted as SQL NULL. This applies to all column types.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 446
    def null_markers
      @gapi.csv_options.null_markers || []
    end
#null_markers=(null_markers) ⇒ Object
Sets the list of strings interpreted as SQL NULL values in a CSV file. null_marker and null_markers cannot both be set: setting one requires leaving the other unset, and setting both raises a user error. Every string listed in null_markers, including the empty string, is interpreted as SQL NULL. This applies to all column types.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 474
    def null_markers= null_markers
      frozen_check!
      @gapi.csv_options.null_markers = null_markers
    end
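Because the two settings are mutually exclusive, it can help to check for a conflict before submitting a query. The sketch below is a hypothetical client-side pre-check plus a helper that applies a marker list to one row; neither is part of the library, and the setters shown above perform no such validation themselves.

```ruby
# Hypothetical client-side pre-check; the documentation only says that
# a user error is raised when both settings are present.
def null_settings_conflict?(null_marker, null_markers)
  !null_marker.nil? && !Array(null_markers).empty?
end

# Hypothetical helper: every listed string, including "", becomes nil.
def apply_null_markers(row, markers)
  row.map { |field| markers.include?(field) ? nil : field }
end

conflict = null_settings_conflict?("\\N", ["NULL", ""]) # both set
ok       = null_settings_conflict?(nil, ["NULL", ""])   # only the list set
cleaned  = apply_null_markers(["alice", "NULL", "", "x"], ["NULL", ""])
```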
#param_types ⇒ Hash
The types of the fields in the data in the schema, using the same format as the optional query parameter types.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 676
    def param_types
      schema.param_types
    end
#preserve_ascii_control_characters ⇒ Boolean?
Indicates if the embedded ASCII control characters (the first 32
characters in the ASCII-table, from \x00 to \x1F) are preserved.
By default, ASCII control characters are not preserved.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 560
    def preserve_ascii_control_characters
      @gapi.csv_options.preserve_ascii_control_characters
    end
#preserve_ascii_control_characters=(val) ⇒ Object
Sets whether the embedded ASCII control characters (the first 32
characters in the ASCII-table, from \x00 to \x1F) are preserved.
By default, ASCII control characters are not preserved.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 583
    def preserve_ascii_control_characters= val
      frozen_check!
      @gapi.csv_options.preserve_ascii_control_characters = val
    end
#quote ⇒ String
The value that is used to quote data sections in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 299
    def quote
      @gapi.csv_options.quote
    end
#quote=(new_quote) ⇒ Object
Set the value that is used to quote data sections in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 320
    def quote= new_quote
      frozen_check!
      @gapi.csv_options.quote = new_quote
    end
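The role of the quote character can be shown locally with Ruby's standard `csv` library, where `quote_char` is the analogue of `quote`. This is an illustration with a made-up line, not BigQuery's parser.

```ruby
require "csv"

# Local illustration only: `quote_char` is the stdlib analogue of `quote`.
line = "'Smith, Jane',42"

# Treating ' as the quote character keeps the embedded comma in one field.
single_quoted = CSV.parse_line(line, quote_char: "'")

# With the default quote character (") the apostrophes are ordinary text,
# so the comma inside the name splits the field.
default_quoted = CSV.parse_line(line)
```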
#quoted_newlines ⇒ Boolean
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 123
    def quoted_newlines
      @gapi.csv_options.allow_quoted_newlines
    end
#quoted_newlines=(new_quoted_newlines) ⇒ Object
Set whether BigQuery should allow quoted data sections that contain newline characters in a CSV file.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 145
    def quoted_newlines= new_quoted_newlines
      frozen_check!
      @gapi.csv_options.allow_quoted_newlines = new_quoted_newlines
    end
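A quoted data section that contains a newline makes the number of physical lines differ from the number of records. The local sketch below uses Ruby's standard `csv` library and made-up data to show the difference; it says nothing about how BigQuery implements the option.

```ruby
require "csv"

# One header row and two records; the first record's note spans two lines.
data = "id,note\n1,\"first line\nsecond line\"\n2,plain\n"

rows             = CSV.parse(data) # quote-aware: newline stays in the field
naive_line_count = data.lines.size # line-based counting sees an extra row
```

Here `rows` holds three entries while the file has four physical lines, which is the situation the `quoted_newlines` setting is meant to permit.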
#schema(replace: false) {|schema| ... } ⇒ Google::Cloud::Bigquery::Schema
The schema for the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 615
    def schema replace: false
      @schema ||= Schema.from_gapi @gapi.schema
      if replace
        frozen_check!
        @schema = Schema.from_gapi
      end
      @schema.freeze if frozen?
      yield @schema if block_given?
      @schema
    end
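The control flow of this method (memoize, optionally replace, then yield) can be tried out with stand-in classes. `SketchSchema` and `SketchSource` are hypothetical names that only mirror the shape of the method shown above; the `frozen_check!` call is left out to keep the sketch short.

```ruby
# Stand-in classes for illustration; the names are hypothetical and only
# mirror the control flow of the method shown above.
class SketchSchema
  attr_reader :columns

  def initialize
    @columns = []
  end

  # Records a STRING column, loosely imitating a schema builder method.
  def string(name)
    @columns << [name, :STRING]
  end
end

class SketchSource
  def schema(replace: false)
    @schema ||= SketchSchema.new          # built once, then memoized
    @schema = SketchSchema.new if replace # start over from an empty schema
    @schema.freeze if frozen?
    yield @schema if block_given?
    @schema
  end
end

source = SketchSource.new
source.schema { |s| s.string "name" }
source.schema { |s| s.string "city" } # appends to the same schema
appended = source.schema.columns.map(&:first)

source.schema(replace: true) { |s| s.string "id" } # discards earlier fields
replaced = source.schema.columns.map(&:first)
```

Without `replace: true`, repeated calls keep adding to the memoized schema; with it, the block starts from an empty one.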
#schema=(new_schema) ⇒ Object
Set the schema for the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 647
    def schema= new_schema
      frozen_check!
      @schema = new_schema
    end
#skip_leading_rows ⇒ Integer
The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 343
    def skip_leading_rows
      @gapi.csv_options.skip_leading_rows
    end
#skip_leading_rows=(row_count) ⇒ Object
Set the number of rows at the top of a CSV file that BigQuery will skip when reading the data.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 365
    def skip_leading_rows= row_count
      frozen_check!
      @gapi.csv_options.skip_leading_rows = row_count
    end
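Skipping leading rows amounts to dropping the first N lines before the data is read. A minimal local sketch with made-up lines, assuming two preamble rows (a comment line and a header):

```ruby
# Made-up file contents: one comment line, one header line, two data rows.
lines = ["# exported report", "name,age", "alice,30", "bob,25"]

skip_leading_rows = 2 # hypothetical setting value for this sample

# Drop the preamble, then split the remaining lines into fields.
data_rows = lines.drop(skip_leading_rows).map { |line| line.split(",") }
```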
#source_column_match ⇒ String?
Controls the strategy used to match loaded columns to the schema. If not set, a sensible default is chosen based on how the schema is provided. If autodetect is used, then columns are matched by name. Otherwise, columns are matched by position. This is done to keep the behavior backward-compatible.
Acceptable values are:
- POSITION: matches by position. Assumes columns are ordered the same way as the schema.
- NAME: matches by name. Reads the header row as column names and reorders columns to match the schema.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 505
    def source_column_match
      @gapi.csv_options.source_column_match
    end
#source_column_match=(source_column_match) ⇒ Object
Sets the strategy used to match loaded columns to the schema. If not set, a sensible default is chosen based on how the schema is provided. If autodetect is used, then columns are matched by name. Otherwise, columns are matched by position. This is done to keep the behavior backward-compatible. Optional.
Acceptable values are:
- POSITION: matches by position. Assumes columns are ordered the same way as the schema.
- NAME: matches by name. Reads the header row as column names and reorders columns to match the schema.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 536
    def source_column_match= source_column_match
      frozen_check!
      @gapi.csv_options.source_column_match = source_column_match
    end
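The difference between the two strategies shows up when the file's column order differs from the schema's. The sketch below is a local illustration; `match_columns` is a hypothetical helper and the data is made up.

```ruby
# Hypothetical helper illustrating the two strategies described above.
def match_columns(header, row, schema_names, strategy)
  case strategy
  when "POSITION"
    # Trust the file order: the i-th value belongs to the i-th schema column.
    schema_names.zip(row).to_h
  when "NAME"
    # Use the header row to find each schema column, whatever its position.
    values_by_header = header.zip(row).to_h
    schema_names.to_h { |name| [name, values_by_header[name]] }
  else
    raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

schema_names = %w[name age]
header       = %w[age name] # file columns are in the opposite order
row          = %w[30 alice]

by_position = match_columns(header, row, schema_names, "POSITION")
by_name     = match_columns(header, row, schema_names, "NAME")
```

With `POSITION` the values land in the wrong columns for this file, while `NAME` uses the header row to put them back in schema order.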
#utf8? ⇒ Boolean
Checks if the character encoding of the data is "UTF-8". This is the default.
    # File 'lib/google/cloud/bigquery/external/csv_source.rb', line 212
    def utf8?
      return true if encoding.nil?
      encoding == "UTF-8"
    end