Remote file input plugin for Embulk

This plugin loads data from remote hosts via SCP.

Overview

  • Plugin type: file input
  • Resume supported: yes
  • Cleanup supported: yes

Configuration

  • hosts: Target hosts, each in the form host or host:port; a port given here overrides default_port (list, default: [])
  • hosts_command: Command that outputs the target hosts (not supported on Windows). If this option is given, "hosts" is overwritten with its result. (string, default: null)
  • hosts_separator: Separator used to split the "hosts_command" result (string, default: " ")
  • default_port: Default SSH port number (integer, default: 22)
  • path: Path on the remote host (file or directory) (string, default: "")
  • path_command: Command that outputs the path (not supported on Windows). If this option is given, "path" is overwritten with its result. (string, default: null)
  • ignore_not_found_hosts: If true, hosts that meet any of the following conditions are skipped, meaning they are not included in the resume target. (boolean, default: false)
    • The target file (or directory) is not found
    • An SSH error occurred
  • auth: SSH authentication settings (hash, default: {})
    • user: SSH username (string, default: the executing user)
    • type: public_key or password (string, default: public_key)
    • key_path: Path to the private key (used when type is "public_key") (string, default: "~/.ssh/id_rsa or id_dsa")
    • password: SSH password (used when type is "password") (string)
    • skip_host_key_verification: If true, host key verification is skipped (boolean, default: false)
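
To illustrate how hosts, hosts_command, hosts_separator and default_port relate, here is a minimal shell sketch that splits a command's output on a separator and fills in the default port for entries without one. The function names resolve_host and list_targets are hypothetical and exist only for this illustration; this is not the plugin's own code, and it assumes a single-character separator and no IPv6 literals.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the documented option semantics.
# They are NOT part of the plugin.

# Print "host port" for an entry of the form "host" or "host:port".
# An explicit port wins over the default (second argument, 22 if omitted).
resolve_host() {
  entry=$1
  default_port=${2:-22}
  case "$entry" in
    *:*) printf '%s %s\n' "${entry%%:*}" "${entry##*:}" ;;
    *)   printf '%s %s\n' "$entry" "$default_port" ;;
  esac
}

# Split the output of a hosts command on a single-character separator
# and resolve each non-empty entry.
list_targets() {
  hosts_output=$1
  separator=$2
  default_port=${3:-22}
  printf '%s\n' "$hosts_output" | tr "$separator" '\n' |
    while IFS= read -r entry; do
      if [ -n "$entry" ]; then
        resolve_host "$entry" "$default_port"
      fi
    done
}
```

For example, list_targets 'host1,host2:10022' ',' prints "host1 22" followed by "host2 10022", which mirrors the rule that a port written in the host entry overrides default_port.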

Example

in:
  type: remote
  hosts:
    - host1
    - host2:10022
#  hosts_command: echo 'host1,host2'
#  hosts_separator: ','
  path: /some/path/20150414125923
#  path_command: echo /some/path/`date "+%Y%m%d%H%M%S"`
  ignore_not_found_hosts: true
  auth:
    user: {username}
    type: public_key
    key_path: /usr/home/.ssh/id_rsa
#    type: password
#    password: {password}

Note

When this plugin runs on Linux, a task might block.
The cause is java.security.SecureRandom. Please try one of the following.

Set the JVM option "-Djava.security.egd"

$ export JAVA_TOOL_OPTIONS="-Djava.security.egd=file:/dev/./urandom"
$ embulk run config.yml

Edit $JAVA_HOME/jre/lib/security/java.security

# before:
# securerandom.source=file:/dev/random
# after (keep the comment on its own line; text after the value would become part of it):
securerandom.source=file:/dev/./urandom

See also

http://stackoverflow.com/questions/137212/how-to-solve-performance-problem-with-java-securerandom
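
As a rough diagnostic for the blocking described in this note, the kernel's entropy estimate can be inspected on Linux. The sketch below assumes the standard /proc/sys/kernel/random/entropy_avail file; the helper name entropy_avail is hypothetical, and how the value should be interpreted depends on the kernel version (on some kernels /dev/random can block when the estimate is low, while newer ones behave differently).

```shell
#!/bin/sh
# Hypothetical diagnostic helper; not part of the plugin.
# Prints the kernel's entropy estimate on Linux, or "unavailable"
# when the proc file cannot be read (for example on other systems).
entropy_avail() {
  entropy_file=/proc/sys/kernel/random/entropy_avail
  if [ -r "$entropy_file" ]; then
    cat "$entropy_file"
  else
    echo "unavailable"
  fi
}

echo "entropy_avail: $(entropy_avail)"
```

If a task appears stuck and the reported value is very low, switching the entropy source as described above is worth trying first.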

Development on local machine

  • Install Docker, and then we can create SSH-able containers:

    $ ssh-keygen -t ecdsa -f ./id_rsa_test -N ''
    $ docker-compose up -d
    $ docker-compose ps
              Name                Command       State           Ports
    --------------------------------------------------------------------------
    embulkinputremote_host1_1   /entrypoint.sh   Up      0.0.0.0:10022->22/tcp
    embulkinputremote_host2_1   /entrypoint.sh   Up      0.0.0.0:10023->22/tcp

Build

$ ./gradlew gem