Watch directories for files whose names match a time-based pattern, tail them, and push their lines to Kafka.


Install the libsnappy development libraries if you want to use compression:

apt-get install libsnappy-dev

Add this line to your application's Gemfile:

gem 'tailf2kafka'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install tailf2kafka


$ tailf2kafka -h
Usage: tailf2kafka [options]
        --config PATH                Path to settings config
    -h, --help                       Display this screen


tailf:
  files:
    - topic: haproxy
      prefix: /var/log/haproxy/haproxy
      suffix: ''
      time_pattern: ".%Y-%m-%d.%H"
  position_file: "/var/lib/haproxy/tail2kafka.offsets"
  flush_interval: 1
  max_batch_lines: 1024
  from_begining: true
  delete_old_tailed_files: true
kafka:
  brokers: ["broker1:9092", "broker2:9092", "broker3:9092"]
  producer_type: sync
  produce: true

  • kafka.brokers - array of Kafka brokers to connect to
  • kafka.producer_type - producer type, either sync or async
  • kafka.produce - if false, do not connect to Kafka and do not produce any messages
  • tailf.position_file - file in which to save the offsets of tailed files that have been pushed to Kafka
  • tailf.flush_interval - how often, in seconds, to save the offsets to the position file
  • tailf.max_batch_lines - maximum number of lines to batch in each send request
  • tailf.from_begining - when a new file is added to tailing, whether to start tailing from the beginning of the file (true) or from its end (false)
  • tailf.delete_old_tailed_files - whether to delete files once their time_pattern no longer matches the current time window and they have been fully produced to Kafka
  • tailf.files - array of file configs to tail; each file config consists of:
    • topic - Kafka topic to produce the messages to
    • prefix - prefix of the files to watch for
    • time_pattern - Ruby time pattern (strftime format) of the files to tail
    • suffix - optional suffix of the files to watch for; the tool watches for files that match prefix + time_pattern + suffix
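
As an illustration of how prefix + time_pattern + suffix combine into a file name, here is a minimal sketch. The helper name expected_path is hypothetical and is not part of tailf2kafka; it only assumes that time_pattern is expanded with Ruby's Time#strftime.

```ruby
# Hypothetical helper, not part of tailf2kafka: builds the file name that a
# `files` entry would point at for a given moment in time.
def expected_path(prefix, time_pattern, suffix = '', at = Time.now)
  # time_pattern is assumed to be a strftime format string.
  "#{prefix}#{at.strftime(time_pattern)}#{suffix}"
end

at = Time.utc(2015, 3, 7, 14, 5, 0)
path = expected_path('/var/log/haproxy/haproxy', '.%Y-%m-%d.%H', '', at)
puts path # prints /var/log/haproxy/haproxy.2015-03-07.14
```

With the example config above, a new file name would therefore be expected every hour.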


  • The config is validated by the schash gem
  • Tailed files are watched for changes by the rb-inotify gem
  • The directories of all file prefixes are watched for files being created in them or moved into them; such files are automatically added to tailing.
  • The same directories are also watched for files being deleted or moved out; such files are removed from the list of files watched for changes.
  • Based on time_pattern, old files are periodically deleted automatically (see tailf.delete_old_tailed_files), which avoids the need for log rotation tools.
  • Files are matched by converting time_pattern to a regexp
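
The conversion of time_pattern to a regexp could look roughly like the sketch below. This is not the gem's actual code: the DIRECTIVE_PATTERNS table and the time_pattern_to_regexp helper are hypothetical, and only a handful of strftime directives are covered.

```ruby
# Hypothetical mapping from strftime directives to regexp fragments; the real
# conversion in tailf2kafka may differ and may cover other directives.
DIRECTIVE_PATTERNS = {
  '%Y' => '\d{4}',
  '%m' => '\d{2}',
  '%d' => '\d{2}',
  '%H' => '\d{2}',
  '%M' => '\d{2}'
}.freeze

def time_pattern_to_regexp(prefix, time_pattern, suffix = '')
  # Split on the known directives, keeping them in the result, then replace
  # each directive with its fragment and escape everything else literally.
  body = time_pattern.split(/(%[YmdHM])/).map do |part|
    DIRECTIVE_PATTERNS.fetch(part) { Regexp.escape(part) }
  end.join
  Regexp.new("\\A#{Regexp.escape(prefix)}#{body}#{Regexp.escape(suffix)}\\z")
end

regexp = time_pattern_to_regexp('/var/log/haproxy/haproxy', '.%Y-%m-%d.%H')
puts regexp.match?('/var/log/haproxy/haproxy.2015-03-07.14') # prints true
puts regexp.match?('/var/log/haproxy/haproxy.log')           # prints false
```

Anchoring the regexp at both ends keeps unrelated files in the same directory, such as haproxy.log, from being picked up.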


