PostgreSQL output plugins for Embulk
PostgreSQL output plugins for Embulk loads records to PostgreSQL.
Overview
- Plugin type: output
- Load all or nothing: depnds on the mode. see bellow.
- Resume supported: depnds on the mode. see bellow.
Configuration
- host: database host name (string, required)
- port: database port number (integer, default: 5432)
- user: database login user name (string, required)
- password: database login password (string, default: "")
- database: destination database name (string, required)
- schema: destination schema name (string, default: "public")
- table: destination table name (string, required)
- options: extra connection properties (hash, default: {})
- mode: "replace", "merge" or "insert" (string, required)
- ssl: enables SSL. data will be encrypted but CA or certification will not be verified (boolean, default: false)
- batch_size: size of a single batch insert (integer, default: 16777216)
- default_timezone: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp into a SQL string. This default_timezone option is used to control the timezone. You can overwrite timezone for each columns using column_options option. (string, default:
UTC) - column_options: advanced: a key-value pairs where key is a column name and value is options for the column.
- type: type of a column when this plugin creates new tables (e.g.
VARCHAR(255),INTEGER NOT NULL UNIQUE). This used when this plugin creates intermediate tables (insert, truncate_insert and merge modes), when it creates the target table (insert_direct and replace modes), and when it creates nonexistent target table automatically. (string, default: depends on input column type.BIGINTif input column type is long,BOOLEANif boolean,DOUBLE PRECISIONif double,CLOBif string,TIMESTAMP WITH TIME ZONEif timestamp) - value_type: This plugin converts input column type (embulk type) into a database type to build a INSERT statement. This value_type option controls the type of the value in a INSERT statement. (string, default: depends on input column type. Available values options are:
byte,short,int,long,double,float,boolean,string,nstring,date,time,timestamp,decimal,null,pass) - timestamp_format: If input column type (embulk type) is timestamp and value_type is
stringornstring, this plugin needs to format the timestamp value into a string. This timestamp_format option is used to control the format of the timestamp. (string, default:%Y-%m-%d %H:%M:%S.%6N) - timezone: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp value into a SQL string. In this cases, this timezone option is used to control the timezone. (string, value of default_timezone option is used by default)
- type: type of a column when this plugin creates new tables (e.g.
Modes
- insert:
- Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs
INSERT INTO <target_table> SELECT * FROM <intermediate_table_1> UNION ALL SELECT * FROM <intermediate_table_2> UNION ALL ...query. - Transactional: Yes. This mode successfully writes all rows, or fails with writing zero rows.
- Resumable: Yes.
- Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs
- insert_direct:
- Behavior: This mode inserts rows to the target table directly.
- Transactional: No. If fails, the target table could have some rows inserted.
- Resumable: No.
- truncate_insert:
- Behavior: Same with
insertmode excepting that it truncates the target table right before the lastINSERT ...query. - Transactional: Yes.
- Resumable: Yes.
- Behavior: Same with
- merge:
- Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs
with updated AS (UPDATE .... RETURNING ...) INSERT INTO ....query. - Transactional: Yes.
- Resumable: Yes.
- Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs
- replace:
- Behavior: Same with
insertmode excepting that it truncates the target table right before the lastINSERT ...query. - Transactional: Yes.
- Resumable: No.
- Behavior: Same with
Example
out:
type: postgresql
host: localhost
user: pg
password: ""
database: my_database
table: my_table
mode: insert
Advanced configuration:
out:
type: postgresql
host: localhost
user: pg
password: ""
database: my_database
table: my_table
options: {loglevel: 2}
mode: insert_direct
column_options:
my_col_1: {type: 'BIGSERIAL'}
my_col_3: {type: 'INT NOT NULL'}
my_col_4: {value_type: string, timestamp_format: `%Y-%m-%d %H:%M:%S %z`, timezone: '-0700'}
my_col_5: {type: 'DECIMAL(18,9)', value_type: pass}
Build
$ ./gradlew gem