Orc output plugin for Embulk
Overview
- Plugin type: output
- Load all or nothing: no
- Resume supported: no
- Cleanup supported: yes
Configuration
- path_prefix: A prefix of output path. (string, required)
- support:
file
,s3n
ands3a
.
- support:
- file_ext: An extension of output file. (string, default:
.orc
) - sequence_format: (string, default:
.%03d
) - buffer_size: Set the ORC buffer size (integer, default:
10000
) - strip_size: Set the ORC strip size (integer, default:
100000
) - compression_kind: description (string, default:
'ZLIB'
)NONE
,ZLIB
,SNAPPY
- overwrite: (LocalFileSystem only) Overwrite if output files already exist. (boolean, default:
false
) default_from_timezone Time zone of timestamp columns. This can be overwritten for each column using column_options (DateTimeZone, default:
UTC
)auth_method: name of mechanism to authenticate requests (basic, env, instance, profile, properties, anonymous, or session. default: basic)
see: https://github.com/embulk/embulk-input-s3#configurationenv
,basic
,profile
,default
,session
,anonymous
,properties
Example
out:
type: orc
path_prefix: "/tmp/output"
buffer_size: 8000
strip_size: 90000
compression_kind: ZLIB
overwrite: true
ChangeLog
ver 0.2.0
support: output to s3
s3n
,s3a
protocol
ver 0.1.0
- initial release
Build
$ ./gradlew gem # -t to watch change of files and rebuild continuously