SFTP file output plugin for Embulk

Stores files on an SFTP server.

Overview

  • Plugin type: file output
  • Load all or nothing: no
  • Resume supported: no
  • Cleanup supported: no

Configuration

  • host: (string, required)
  • port: (int, default: 22)
  • user: (string, required)
  • password: (string, default: null)
  • secret_key_file: (string, default: null) see below
  • secret_key_passphrase: (string, default: "")
  • user_directory_is_root: (boolean, default: true)
  • timeout: SFTP connection timeout in seconds (integer, default: 600)
  • path_prefix: Prefix of output paths (string, required)
  • file_ext: Extension of output files (string, required)
  • sequence_format: Format for sequence part of output files (string, default: ".%03d.%02d")
  • rename_file_after_upload: Upload to a file named with file_ext + ".tmp" first, then rename it after the upload finishes (boolean, default: false)
  • local_buffering: Buffer records in a local temp file. If false, the plugin buffers records directly to the remote file, using ".tmp" as the filename suffix (boolean, default: true)
  • temp_file_threshold: Maximum size of the local temp file. The plugin flushes (appends) to the remote file when the local temp file reaches this threshold (long, default: 5368709120, i.e. 5GiB, min: 50MiB, max: 10GiB)
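
As a rough illustration of how path_prefix, sequence_format and file_ext might combine into a remote file name, here is a minimal Python sketch. It assumes the three values are simply concatenated and that the two placeholders in sequence_format are filled with a task index and a file index. Neither point is stated in this README, and output_path is a hypothetical helper, not part of the plugin.

```python
def output_path(path_prefix, sequence_format, file_ext, task_index, file_index):
    """Build one output path by plain concatenation.

    Assumption: the two %d placeholders receive (task_index, file_index).
    This mirrors common Embulk file output behaviour but is not confirmed here.
    """
    return path_prefix + (sequence_format % (task_index, file_index)) + file_ext


# Default sequence_format with an illustrative ".csv" extension.
default_name = output_path("/data/sftp", ".%03d.%02d", ".csv", 0, 1)

# Values taken from the Example section of this README.
example_name = output_path("/data/sftp", ".%01d%01d", "_20151020.tsv", 2, 0)

print(default_name)   # /data/sftp.000.01.csv
print(example_name)   # /data/sftp.20_20151020.tsv
```

Under these assumptions, file_ext must carry its own separator (a leading "." or "_"), since nothing is inserted between the sequence part and the extension.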

Proxy configuration

  • proxy:
    • type: (string(http | socks | stream), required, default: null)
      • http: use an HTTP proxy
      • socks: use a SOCKS proxy
      • stream: connect to the SFTP server through a remote host reached over SSH
    • host: (string, required)
    • port: (int, default: 22)
    • user: (string, optional)
    • password: (string, optional, default: null)
    • command: (string, optional)

Example

out:
  type: sftp
  host: 127.0.0.1
  port: 22
  user: civitaspo
  secret_key_file: /Users/civitaspo/.ssh/id_rsa
  secret_key_passphrase: secret_pass
  user_directory_is_root: false
  timeout: 600
  path_prefix: /data/sftp
  file_ext: _20151020.tsv
  sequence_format: ".%01d%01d"
  temp_file_threshold: 10737418240 # 10GiB
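
temp_file_threshold is given in bytes, so the documented limits are easier to check with a little arithmetic. The Python sketch below derives the default, minimum and maximum from the sizes listed under Configuration. check_threshold is a hypothetical helper, and treating both bounds as inclusive is an assumption (the example above uses exactly 10GiB, which suggests the maximum itself is accepted).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

DEFAULT_THRESHOLD = 5 * GIB   # documented default: 5GiB
MIN_THRESHOLD = 50 * MIB      # documented minimum: 50MiB
MAX_THRESHOLD = 10 * GIB      # documented maximum: 10GiB


def check_threshold(value):
    """Return value if it lies within the documented range, else raise.

    Assumption: both bounds are inclusive. The plugin's own validation
    may differ in detail.
    """
    if not MIN_THRESHOLD <= value <= MAX_THRESHOLD:
        raise ValueError(
            "temp_file_threshold must be between %d and %d bytes, got %d"
            % (MIN_THRESHOLD, MAX_THRESHOLD, value)
        )
    return value


print(DEFAULT_THRESHOLD)          # 5368709120
print(check_threshold(10 * GIB))  # 10737418240
```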

With proxy

out:
  type: sftp
  host: 127.0.0.1
  port: 22
  user: embulk
  secret_key_file: /Users/embulk/.ssh/id_rsa
  secret_key_passphrase: secret_pass
  user_directory_is_root: false
  timeout: 600
  path_prefix: /data/sftp
  file_ext: _20151020.tsv
  proxy:
    type: http
    host: proxy_host
    port: 8080
    user: proxy_user
    password: proxy_secret_pass
    command:

Secret Keyfile configuration

Set secret_key_file to the path of your private key file, as follows.

out:
  type: sftp
  ...
  secret_key_file: /path/to/id_rsa
  ...

You can also embed the contents of the key file directly in config.yml.

out:
  type: sftp
  ...
  secret_key_file:
    content: |
      -----BEGIN RSA PRIVATE KEY-----
      ABCDEFG...
      HIJKLMN...
      OPQRSTU...
      -----END RSA PRIVATE KEY-----
  ...
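
Indentation matters when a key is embedded this way, because every line of the key must sit inside the YAML literal block. The following Python sketch shows one way to render such a fragment from key text. embed_key_fragment is a hypothetical helper, the indentation widths are just the ones used in the example above, and the key body is the same placeholder text, not a real key.

```python
import textwrap


def embed_key_fragment(key_text):
    """Render a secret_key_file block with the key as a YAML literal scalar.

    "content: |" opens the literal block; every key line is indented
    six spaces so that it belongs to that block.
    """
    body = textwrap.indent(key_text.strip("\n"), " " * 6)
    return "  secret_key_file:\n    content: |\n" + body + "\n"


# Placeholder key text, copied from the example in this README.
sample_key = (
    "-----BEGIN RSA PRIVATE KEY-----\n"
    "ABCDEFG...\n"
    "-----END RSA PRIVATE KEY-----\n"
)

fragment = embed_key_fragment(sample_key)
print(fragment)
```

The printed fragment can be pasted under the out: section of config.yml. Storing a private key in a config file has obvious security implications, so the file's permissions deserve the same care as the key file itself.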

Run Example

$ ./gradlew classpath

Use Vagrant to start a remote sshd server:

$ vagrant up

Run:

$ embulk run -Ilib example/sample.yml

Build

$ ./gradlew gem  # -t to watch change of files and rebuild continuously
$ ./gradlew bintrayUpload # release embulk-output-sftp to Bintray maven repo

Note

This plugin uses "org.apache.commons:commons-vfs", which logs through "org.apache.commons.logging.Log". The plugin therefore suppresses that logger's messages unless the Embulk log level is debug.

Contributors

  • Satoshi Akama (@sakama)
  • Rudolph Miller (@Rudolph-Miller)
  • Naotoshi Seo (@sonots)