S3 Per Record output plugin for Embulk
This plugin uploads a column's value to S3 as one S3 object per row. S3 object key can be composed of another column.
Breaking Changes from 0.3.x
This plugin serialize data columns by itself.
at present, Supported formats are msgpack
and json
.
Because of it, config parameters are changed.
- Rename
data_column
todata_columns
, and change type fromstring
toarray
- Add
mode
- Add
serializer
- Add
column_options
.
Please update your configs.
Overview
- Plugin type: output
- Load all or nothing: no
- Resume supported: no
Configuration
- bucket: S3 bucket name.
- key: S3 object key.
${column}
is replaced by the column's value. - data_columns: Columns for object's body.
- serializer: Serializer format. Supported formats are
msgpack
andjson
. default ismsgpack
. - mode: Set mode. Supported modes are
multi_column
andsingle_column
. default ismulti_column
. - aws_access_key_id: (optional) AWS access key id. If not given, DefaultAWSCredentialsProviderChain is used to get credentials.
- aws_secret_access_key: (optional) AWS secret access key. Required if
aws_access_key_id
is given. - retry_limit: (default 2) On connection errors,this plugin automatically retry up to this times.
- column_options: Timestamp formatting option for columns.
Example
multi_column mode
out:
type: s3_per_record
bucket: your-bucket-name
key: "sample/${id}.txt"
mode: multi_column
serializer: msgpack
data_columns: [id, payload]
id | payload (json type)
------------
1 | hello
5 | world
12 | embulk
This generates s3://your-bucket-name/sample/1.txt
with its content {"id": 1, "payload": "hello"}
that is formatted with msgpack format,
s3://your-bucket-name/sample/5.txt
with its content {"id": 5, "payload": "world"}
, and so on.
single_column mode
out:
type: s3_per_record
bucket: your-bucket-name
key: "sample/${id}.txt"
mode: single_column
serializer: msgpack
data_columns: [payload]
id | payload (json type)
------------
1 | hello
5 | world
12 | embulk
This generates s3://your-bucket-name/sample/1.txt
with its content "hello"
that is formatted with msgpack format,
s3://your-bucket-name/sample/5.txt
with its content "world"
, and so on.
Build
$ ./gradlew gem # -t to watch change of files and rebuild continuously