Spanx


Spank down IP spam: IP-based rate limiting for web applications behind an HTTP server such as nginx or Apache.

Spanx integrates into any web application by monitoring one or more HTTP server access log files in real time (think Apache/nginx access.log). Spanx is built on top of the Pause gem, which is a simple Redis-based rate limiter.

Basic flow is as follows:

  • Spanx tails the access.log file(s)
  • parses out IP addresses of each request
  • maintains a tally of request counts per IP and per time slice
  • Spanx is then able to detect when one or more IPs exceed the configured rate-limit thresholds (multiple thresholds are supported).
    • When such an IP is detected, Spanx immediately writes it to a block-list file (suitable for consumption by nginx or Apache, in a format such as "deny 127.0.0.1;"), and then
    • executes a pre-configured command, presumed to reload the HTTP server configuration (for example, sending HUP to nginx) and activate the new blocking rules.
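
The parse, tally, and detect steps above can be sketched in plain Ruby. This is an illustration only, not Spanx's implementation: it keeps counts in memory instead of Redis, assumes the client IP is the first field of each log line, and uses made-up values for the slice length and threshold (SLICE_SECONDS and MAX_ALLOWED are hypothetical names).

```ruby
# Assumption: the client IP is the first whitespace-delimited field of a line.
IP_PATTERN = /\A(\d{1,3}(?:\.\d{1,3}){3})\s/

# Hypothetical threshold: more than MAX_ALLOWED requests within one slice.
SLICE_SECONDS = 60
MAX_ALLOWED   = 3

# Returns the IP at the start of a log line, or nil if none is found.
def extract_ip(line)
  match = IP_PATTERN.match(line)
  match && match[1]
end

# Builds a { [ip, slice_start] => count } tally from [timestamp, line] pairs.
def tally(entries, slice_seconds = SLICE_SECONDS)
  counts = Hash.new(0)
  entries.each do |timestamp, line|
    ip = extract_ip(line)
    next unless ip
    slice_start = timestamp - (timestamp % slice_seconds)
    counts[[ip, slice_start]] += 1
  end
  counts
end

# Returns the sorted, unique IPs whose count in any slice exceeds max_allowed.
def offenders(counts, max_allowed = MAX_ALLOWED)
  counts.select { |_, count| count > max_allowed }.keys.map(&:first).uniq.sort
end

entries = [
  [960,  '10.0.0.1 - - "GET / HTTP/1.1" 200 512'],
  [970,  '10.0.0.1 - - "GET /a HTTP/1.1" 200 512'],
  [980,  '10.0.0.1 - - "GET /b HTTP/1.1" 200 512'],
  [990,  '10.0.0.1 - - "GET /c HTTP/1.1" 200 512'],
  [960,  '10.0.0.2 - - "GET / HTTP/1.1" 200 512'],
  [1020, '10.0.0.2 - - "GET / HTTP/1.1" 200 512']
]

puts offenders(tally(entries)).inspect # prints ["10.0.0.1"]
```

Here 10.0.0.1 makes four requests inside the slice starting at 960 and is flagged, while 10.0.0.2 makes one request in each of two slices and is not.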

Spanx additionally supports a regular-expression-based whitelist file that can be used to exclude certain log lines from consideration (for example, requests from Googlebot, matched by User-Agent).
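
As a rough sketch of how such a whitelist could be applied (not Spanx's actual code), the snippet below compiles one regular expression per non-blank line and skips any log line that matches. The helper names load_whitelist and whitelisted? are hypothetical.

```ruby
# Compiles newline-separated regular expressions, skipping blank lines.
def load_whitelist(text)
  text.each_line.map(&:strip).reject(&:empty?).map { |source| Regexp.new(source) }
end

# True when any whitelist pattern matches the log line.
def whitelisted?(line, patterns)
  patterns.any? { |pattern| line =~ pattern }
end

patterns = load_whitelist("Googlebot\n\n^192\\.168\\.\n")

lines = [
  '66.249.66.1 - - "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
  '192.168.0.7 - - "GET /health HTTP/1.1" 200 2 "-" "curl"',
  '11.100.193.12 - - "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"'
]

# Only lines that match no pattern remain candidates for rate limiting.
candidates = lines.reject { |line| whitelisted?(line, patterns) }
puts candidates.length
```

Matching against the whole log line, rather than only the IP, is what makes User-Agent-based exclusions possible.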

Design

Spanx can be integrated into your application, or can run as a standalone Ruby app. Spanx requires Ruby 1.9.3, and it uses Ruby threads to work on a few things in parallel.

Spanx has two main components:

  1. watcher is a process that monitors HTTP server log files and periodically updates Redis with the most recent counts. The watcher also writes out the blocked IP file if blocked IPs are found in the Redis database.

  2. analyzer is a process that reads up-to-date information on IP addresses from Redis and analyzes it. If any IPs exceeding a rate limit are found, it writes them to Redis with an expiration TTL set.

If you have only one web server, you can run both watcher and analyzer as a single ruby process.

If you have multiple web servers, you need to run watcher on each server, and analyzer only once (somewhere).

Alerts

Besides writing IPs to a block-list file, Spanx supports notifiers that are called when a new IP is blocked. Currently supported are an audit log notifier (which writes that information to a log file), Campfire and Slack chat notifiers (which post IP blocking information to the respective chat room), and an Email notifier. It is very easy to write additional notifiers.
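
To give a feel for what a notifier might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the BlockedIp record, the IoNotifier class, and the publish method name are hypothetical and are not taken from Spanx's notifier interface, so check the gem's source before writing a real one.

```ruby
# Hypothetical record describing a block decision; not Spanx's own type.
BlockedIp = Struct.new(:ip, :count, :period_seconds, :blocked_at)

# Hypothetical notifier that writes one line per block decision to an IO-like object.
class IoNotifier
  def initialize(io)
    @io = io
  end

  # Called once for each newly blocked IP (method name is an assumption).
  def publish(blocked)
    @io.puts format('%s blocked %s: %d requests in %d seconds',
                    blocked.blocked_at, blocked.ip, blocked.count, blocked.period_seconds)
  end
end

notifier = IoNotifier.new($stdout)
notifier.publish(BlockedIp.new('11.100.193.12', 120, 60, '2012-01-01T00:00:00Z'))
```

Taking the IO object as a constructor argument keeps the sketch easy to test and lets the same class write to a log file, a socket, or standard output.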

Installation

Add this line to your application's Gemfile:

gem 'spanx'

And then execute:

$ bundle

Or install it yourself as:

$ gem install spanx

Dependencies

Spanx uses the Pause gem to persist state. Pause in turn depends on Redis to save state and perform set logic on the information it finds.

Usage

Spanx has a single executable with several sub-commands. In practice, multiple commands will be run concurrently to do all of the necessary calculations.

Configuration can be provided via a YAML file (see example) and/or via command-line options. Not all configuration can be set via the command line. If an option is provided in both the YAML file and on the command line, the command-line value takes precedence.
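
The precedence rule can be illustrated with a small merge, shown here as a sketch rather than Spanx's own option handling. The key names access_log and block_file are assumptions chosen to mirror the command-line flags, and merge_config is a hypothetical helper.

```ruby
require 'yaml'

# Merges options parsed from YAML with command-line overrides.
# Options not given on the command line (nil) leave the file value untouched.
def merge_config(yaml_text, cli_options)
  file_options = YAML.safe_load(yaml_text) || {}
  overrides = cli_options.reject { |_, value| value.nil? }
  file_options.merge(overrides)
end

yaml_text = <<-YAML
access_log: /var/log/nginx/access.log
block_file: /etc/nginx/blocked.conf
YAML

# Only block_file was passed on the command line in this example.
config = merge_config(yaml_text, 'access_log' => nil, 'block_file' => '/tmp/blocked.conf')
puts config.inspect
```

After the merge, access_log keeps the value from the file while block_file takes the command-line value.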

watch

This command watches an HTTP server log file and writes blocked IPs to the specified file.

Usage: [bundle exec] spanx watch [options]
    -f, --file ACCESS_LOG            Apache/nginx access log file to scan continuously
    -z, --analyze                    Analyze IPs also (as opposed to running `spanx analyze` in another process)
    -b, --block_file BLOCK_FILE      Output file to store NGINX block list
    -c, --config CONFIG              Path to config file (YML) (required)
    -d, --daemonize                  Detach from TTY and run as a daemon
    -g, --debug                      Log to STDOUT status of execution and some time metrics
    -r, --run <shell command>        Shell command to run anytime blocked ip file changes, for example "sudo pkill -HUP nginx"
    -w, --whitelist WHITELIST        File with newline separated reg exps, to exclude lines from access log
    -h, --help                       Show this message
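
The relationship between the block file and the --run command can be sketched as follows. This is an assumption-laden illustration, not the watcher's real code: it writes nginx-style "deny" lines and reports whether the file changed, so that a caller would only run the reload command after a change. The helper names are hypothetical.

```ruby
# Renders a list of IPs as nginx-style deny directives, one per line.
def block_file_contents(ips)
  ips.uniq.sort.map { |ip| "deny #{ip};\n" }.join
end

# Writes the block list only when it differs from what is already on disk.
# Returns true if the file was (re)written, false if it was left untouched.
def write_block_file(path, ips)
  contents = block_file_contents(ips)
  return false if File.exist?(path) && File.read(path) == contents
  File.write(path, contents)
  true
end

puts block_file_contents(['127.0.0.1', '11.100.193.12'])

# A caller could then reload the server only after a change, for example:
#   system('sudo pkill -HUP nginx') if write_block_file(path, blocked_ips)
```

Skipping the write when nothing changed avoids reloading the HTTP server on every polling cycle.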

analyze

Analyzes IPs found by the watch command. If an IP exceeds its maximum count for a time period check (as set in the config file), the IP is written into Redis with a TTL defined by the period check.

Usage: [bundle exec] spanx analyze [options]
    -a, --audit AUDIT_FILE           Historical record of IP blocking decisions
    -c, --config CONFIG              Path to config file (YML) (required)
    -d, --daemonize                  Detach from TTY and run as a daemon
    -g, --debug                      Log status to STDOUT
    -h, --help                       Show this message
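
The idea of several period checks, each with its own block TTL, can be sketched like this. It is an in-memory illustration under stated assumptions, not the analyzer itself: the field names period_seconds, max_allowed, and block_ttl are assumed here, and the violated_check helper is hypothetical.

```ruby
# One rate-limit rule: more than max_allowed requests within period_seconds
# results in a block lasting block_ttl seconds. Field names are assumptions.
PeriodCheck = Struct.new(:period_seconds, :max_allowed, :block_ttl)

# Returns the first violated check (shortest period first) for one IP's
# request timestamps at time `now`, or nil when no check is violated.
def violated_check(timestamps, checks, now)
  checks.sort_by(&:period_seconds).find do |check|
    window_start = now - check.period_seconds
    recent = timestamps.count { |t| t > window_start && t <= now }
    recent > check.max_allowed
  end
end

checks = [
  PeriodCheck.new(10, 3, 60),  # burst: more than 3 requests in 10 seconds
  PeriodCheck.new(60, 5, 600)  # sustained: more than 5 requests in 60 seconds
]

now = 100
check = violated_check([95, 96, 97, 98, 99, 100], checks, now)
puts "blocked until #{now + check.block_ttl}" if check
```

In this sample the burst rule fires, so the IP would stay blocked for that rule's 60-second TTL.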

disable

Disables IP blocking. Note that this only affects the actual writing of block files, not IP tracking or analysis. It also requires a connection to Redis, and thus the same config file used by analyze and watch.

Usage: [bundle exec] spanx disable [options]
    -c, --config CONFIG              Path to config file (YML) (required)
    -g, --debug                      Log status to STDOUT
    -h, --help                       Show this message

enable

Re-enables IP blocking if disabled. As with disable, the config file is required to connect to Redis.

Usage: [bundle exec] spanx enable [options]
    -c, --config CONFIG              Path to config file (YML) (required)
    -g, --debug                      Log status to STDOUT
    -h, --help                       Show this message

flush

This removes the persisted data about current IP blocks. Use this when you want to clear all data about current blocks without (or in addition to) disabling the blocker.

Usage: [bundle exec] spanx flush [options]
    -c, --config CONFIG              Path to config file (YML) (required)
    -g, --debug                      Log status to STDOUT
    -h, --help                       Show this message

api

This starts an HTTP server with endpoints for managing blocked IPs. Your application (or an admin interface) can connect to it, for example.

Usage: [bundle exec] spanx api [options]
    -c, --config CONFIG              Path to config file (YML) (required)
    -g, --debug                      Log status to STDOUT
    -h, --help                       Show this message
    -h, --host                       Host for the HTTP server to listen on
    -p, --port                       Port for the HTTP server to listen on

Endpoints:

To retrieve a list of currently blocked IPs:

GET /ips/blocked
[
  "127.0.0.1",
  "11.100.193.12"
]

To unblock a specific IP:

This removes the IP from Redis; shortly afterwards it is also removed from the nginx block files.

DELETE /ips/blocked/11.100.193.12
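
A client for these two endpoints could look roughly like the sketch below, which uses Ruby's standard Net::HTTP library. The base address (including port 6060) is a made-up placeholder for whatever --host and --port the api command was started with, and the helper names are hypothetical. Only the two paths and the JSON array response come from the endpoint descriptions above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical address of a running `spanx api` process; adjust as needed.
API_BASE = URI('http://127.0.0.1:6060')

# Builds the request that lists currently blocked IPs.
def blocked_ips_request
  Net::HTTP::Get.new('/ips/blocked')
end

# Builds the request that unblocks a single IP.
def unblock_request(ip)
  Net::HTTP::Delete.new("/ips/blocked/#{ip}")
end

# Parses the JSON array returned by GET /ips/blocked.
def parse_blocked_ips(body)
  JSON.parse(body)
end

# Sends a request to the API and returns the Net::HTTPResponse.
def send_request(request, base = API_BASE)
  Net::HTTP.start(base.host, base.port) { |http| http.request(request) }
end

# Example usage against a live server (not executed here):
#   ips = parse_blocked_ips(send_request(blocked_ips_request).body)
#   send_request(unblock_request(ips.first)) unless ips.empty?
```

Building the request objects separately from sending them makes the paths easy to check without a running server.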

Examples

If you have only one load balancer, you may want to centralize all work into a single process, like so:

 $ spanx watch -w /path/to/whitelist -c /path/to/spanx.conf.yml -z -d

With multiple load balancers, this may not be desirable. Every host needs to process its own access log, but only a minimal number of hosts (ideally one) should analyze the IP traffic.

 lb1 $ spanx watch -c spanx.conf.yml -r "sudo pkill -HUP nginx" --debug >> /var/log/spanx.watch.log 2>&1 &
 lb2 $ spanx watch -c spanx.conf.yml -r "sudo pkill -HUP nginx" --debug >> /var/log/spanx.watch.log 2>&1 &

 lb2 $ spanx analyze -c spanx.conf.yml -a spanx.audit.log --debug >> /var/log/spanx.analyze.log 2>&1 &

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Maintainers

Konstantin Gredeskoul (@kigster) and Eric Saxby (@sax) at Wanelo, Inc (http://github.com/wanelo)

(c) 2012, All rights reserved.