Google Cloud Storage file input plugin for Embulk
Overview
- Plugin type: file input
- Resume supported: yes
- Cleanup supported: yes
Usage
Install plugin
embulk gem install embulk-input-gcs
Google Service Account Settings
Create a project in the Google Developers Console.
Create a "Service Account" for that project.
A Service Account has two specific scopes: read-only and read-write.
embulk-input-gcs can run with the "read-only" scope.
Generate a private key in P12 (PKCS12) format and upload it to the machine where Embulk runs.
Write the "EMAIL_ADDRESS" and the full path of the PKCS12 private key in the YAML configuration.
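Before running Embulk, it can help to confirm that the two credential values are usable. The following is a minimal sketch of such a pre-flight check, not part of the plugin; the function name `check_credentials` and its messages are hypothetical, and it only assumes that "fullpath" means an absolute path to an existing file.

```python
import os
import tempfile


def check_credentials(email, keyfile_path):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    problems = []
    if not email:
        problems.append("service account email is empty")
    if not os.path.isabs(keyfile_path):
        problems.append("key file path is not a full (absolute) path")
    elif not os.path.isfile(keyfile_path):
        problems.append("key file does not exist")
    return problems


# Usage with a temporary stand-in for the real .p12 file (hypothetical values).
with tempfile.NamedTemporaryFile(suffix=".p12") as f:
    print(check_credentials("EMAIL_ADDRESS", f.name))
print(check_credentials("EMAIL_ADDRESS", "keys/my.p12"))
```

The check does not open or validate the key itself; it only catches a relative or mistyped path early.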
Run
embulk run /path/to/config.yml
Configuration
- bucket Google Cloud Storage bucket name (string, required)
- path_prefix prefix of target keys (string, required)
- service_account_email Google Cloud Storage service account email address (string, required)
- p12_keyfile_fullpath fullpath of p12 key (string, required)
- application_name application name, anything you like (string, optional)
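To illustrate what "prefix of target keys" means, here is a small sketch of prefix matching over object names. It is not the plugin's code: the plugin lists objects through the Google Cloud Storage API, while this example filters a hypothetical in-memory list. The function name `select_keys` and the object names are assumptions made for illustration.

```python
def select_keys(keys, path_prefix):
    """Return the object keys that start with path_prefix, sorted for stable output."""
    return sorted(k for k in keys if k.startswith(path_prefix))


# Hypothetical object names in a bucket.
keys = [
    "logs/csv-2015-01-01.csv",
    "logs/csv-2015-01-02.csv",
    "logs/json-2015-01-01.json",
    "sample_01.csv.gz",
]

print(select_keys(keys, "logs/csv-"))
print(select_keys(keys, "sample_"))
```

Note that the prefix is compared against the whole key, so "logs/csv-" selects only keys under "logs/" whose file name begins with "csv-".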
Example
in:
  type: gcs
  bucket: my-gcs-bucket
  path_prefix: logs/csv-
  service_account_email: ABCXYZ123ABCXYZ123.gserviceaccount.com
  p12_keyfile_fullpath: /path/to/p12_keyfile.p12
  application_name: Anything you like
Example for "sample_01.csv.gz", generated by embulk example
in:
  type: gcs
  bucket: my-gcs-bucket
  path_prefix: sample_
  service_account_email: ABCXYZ123ABCXYZ123.gserviceaccount.com
  p12_keyfile_fullpath: /path/to/p12_keyfile.p12
  application_name: Anything you like
  decoders:
  - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    header_line: true
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}
out: {type: stdout}
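The decoders and parser settings above describe a pipeline: gunzip the file, decode it as UTF-8, read CRLF-terminated CSV, skip the header line, and convert each column to its declared type. The sketch below imitates those steps locally with the Python standard library so the effect of each setting is easier to see. It is an illustration only, not how Embulk implements them; the function name `read_rows` and the sample rows are made up, and timestamps are parsed as naive datetimes without time zone handling.

```python
import csv
import gzip
import io
from datetime import datetime

# Hypothetical file content matching the column layout in the config.
raw = (
    "id,account,time,purchase,comment\r\n"
    "1,32864,2015-01-27 19:23:49,20150127,embulk\r\n"
    "2,14824,2015-01-27 19:01:23,20150127,\"hello, world\"\r\n"
)
compressed = gzip.compress(raw.encode("utf-8"))


def read_rows(data):
    """Decode gzip-compressed CSV bytes into a list of typed row dicts."""
    # decoders: gzip, then charset: UTF-8
    text = gzip.decompress(data).decode("utf-8")
    # newline="" leaves CRLF handling to the csv module
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=",", quotechar='"')
    next(reader)  # header_line: true, so the first line is skipped
    rows = []
    for id_, account, time_, purchase, comment in reader:
        rows.append({
            "id": int(id_),                                           # type: long
            "account": int(account),                                  # type: long
            "time": datetime.strptime(time_, "%Y-%m-%d %H:%M:%S"),    # timestamp
            "purchase": datetime.strptime(purchase, "%Y%m%d"),        # timestamp
            "comment": comment,                                       # type: string
        })
    return rows


for row in read_rows(compressed):
    print(row)
```

The quoted field in the second sample row shows why the quote setting matters: the comma inside the quotes stays part of the comment instead of starting a new column.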
Build
./gradlew gem