MTR Monitor

In December 2017, Hetzner, our hosting provider for the Build Platform, had a major network incident that lasted for almost a whole week. Our users were rightly frustrated.

To prevent and monitor such situations in the future, we have set up a transatlantic monitoring system based on MTR reports and on curl requests to vendors that are important for our platform, such as GitHub and DockerHub. This system should report any issues in the network between Germany (Hetzner) and the US (GitHub, DockerHub).

This project is part of the effort to have MTR reports readily available before, during, and after incidents, so that we can send them to Hetzner.

The MTR monitor is an application that generates MTR reports every 5 minutes and uploads them to an S3 Bucket. It is available as a standalone Docker container, and as a Ruby gem that can be injected into other Ruby applications.

Currently, we have the following routes covered:

  • Germany (Hetzner) -> AWS US East 1 (part of Job Runner)
  • Germany (Hetzner) -> AWS US West 1 (part of Job Runner)
  • Germany (Hetzner) -> AWS US West 2 (part of Job Runner)
  • Germany (Hetzner) -> GitHub (part of Job Runner)
  • Germany (Hetzner) -> DockerHub (part of Job Runner)
  • Germany (Hetzner) -> Stripe (part of Job Runner)
  • Germany (Hetzner) -> SemaphoreCI (part of Job Runner)
  • AWS US East 1 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)
  • AWS US West 1 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)
  • AWS US West 2 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)

Dashboards for the MTR monitor can be found on the Platform — Network dashboard on Grafana.

Using MTR Monitor as a gem

The MTR monitor can be used as a gem and injected into existing Ruby applications. Currently, we inject the MTR monitor into Job Runner.

First, add the mtr_monitor gem to your Gemfile:

gem 'mtr_monitor'

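Second, use the MtrMonitor::Report class to generate a report:

require "mtr_monitor" # not needed if Bundler already requires your gems

name   = "google"     # label for this report
domain = "google.com" # host to trace with MTR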

s3_bucket             = "my-private-bucket-name" # change this
aws_access_key_id     = "<KEY>"
aws_secret_access_key = "<KEY>"

report = MtrMonitor::Report.new(name,
                                domain,
                                s3_bucket,
                                aws_access_key_id,
                                aws_secret_access_key)

report.generate

The snippet above will:

  • generate an MTR report on your local system under the /var/log/mtr directory
  • upload the report to the provided S3 bucket
  • submit metrics via Watchman and generate a metric "pulse"

If you want to generate reports continuously, create a cron task that calls the above code. To check that the cron task is running as expected, set up an alert on Grafana based on the "pulse" metric.
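
One way to wire this up is a small script that cron runs on a schedule. The sketch below is only an illustration: the helper name report_args is hypothetical, and reading the settings from environment variables named like the ones the Docker image uses is an assumption, not something the gem requires.

```ruby
# Hypothetical cron entry point for the report snippet above.
# Assumption: settings come from environment variables named like the
# ones passed to the Docker container.
REQUIRED_KEYS = %w[NAME DOMAIN S3_BUCKET AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY].freeze

# Returns the positional arguments for MtrMonitor::Report.new, in the
# order used in the snippet above. Raises KeyError if a variable is missing,
# so a misconfigured cron job fails loudly instead of tracing nothing.
def report_args(env = ENV)
  REQUIRED_KEYS.map { |key| env.fetch(key) }
end

# In the real script (requires the mtr_monitor gem):
#   require "mtr_monitor"
#   MtrMonitor::Report.new(*report_args).generate
```

A crontab line such as `*/5 * * * * ruby /path/to/mtr_report.rb` (the path is a placeholder) would then run the script every 5 minutes.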

The pulse metric has the format network.mtr.pulse and is tagged with the hostname of the server where the MTR monitor is running and with the name of the metric.

MTR hops are also submitted to Grafana. Based on these metrics, you can observe the packet loss and the average, best, and worst latency for each hop on the network. For more information, read the code in lib/mtr_monitor/metrics.rb.
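
To make the per-hop values concrete, here is a minimal sketch of how they could be pulled out of the table that `mtr --report` prints. It is not the code in lib/mtr_monitor/metrics.rb; the Hop struct, the HOP_LINE pattern, and parse_hops are hypothetical names, and the sample report is made up.

```ruby
# One row of an `mtr --report` table (illustrative struct, not the gem's).
Hop = Struct.new(:index, :host, :loss, :avg, :best, :worst)

# Columns in a report row: Loss%  Snt  Last  Avg  Best  Wrst  StDev.
# The "%" is optional because mtr can drop it when the loss is 100.0.
HOP_LINE = /\A\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+\d+\s+[\d.]+\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)/

# Returns one Hop per table row; the header and other lines are skipped.
def parse_hops(report)
  report.each_line.map do |line|
    match = HOP_LINE.match(line)
    next unless match

    Hop.new(match[1].to_i, match[2], *match[3..6].map(&:to_f))
  end.compact
end

sample = <<~REPORT
  HOST: builder                     Loss%   Snt   Last   Avg  Best  Wrst StDev
    1.|-- 10.0.0.1                   0.0%    10    0.4   0.5   0.3   0.9   0.2
    2.|-- example.net               10.0%    10   90.1  92.3  89.7 101.2   3.4
REPORT

parse_hops(sample).each do |hop|
  puts "hop #{hop.index} (#{hop.host}): loss=#{hop.loss}% avg=#{hop.avg}ms"
end
```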

Using MTR Monitor as a standalone Docker container

docker run -d -v /var/log/mtr:/var/log/mtr -e NAME=<> -e DOMAIN=<> -e MTR_OPTIONS=<> -e S3_BUCKET=<> -e AWS_ACCESS_KEY_ID=<> -e AWS_SECRET_ACCESS_KEY=<> -e SLEEP_TIME=<> renderedtext/mtr_monitor
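
If the container is started from a Ruby deployment script, the command above can be assembled from a hash of settings. This is a minimal sketch: the helper name mtr_monitor_docker_command and the example values are placeholders, and only the flags, variable names, and image name are taken from the command above.

```ruby
MTR_MONITOR_IMAGE = "renderedtext/mtr_monitor"

# Builds the `docker run` argument list for the monitor container.
# `settings` maps environment variable names (NAME, DOMAIN, ...) to values.
def mtr_monitor_docker_command(settings)
  env_flags = settings.flat_map { |key, value| ["-e", "#{key}=#{value}"] }
  ["docker", "run", "-d", "-v", "/var/log/mtr:/var/log/mtr", *env_flags, MTR_MONITOR_IMAGE]
end

# Placeholder values; add the remaining variables from the command above.
settings = {
  "NAME" => "github",
  "DOMAIN" => "github.com",
  "S3_BUCKET" => "my-private-bucket-name"
}
command = mtr_monitor_docker_command(settings)

# Passing the list to `system(*command)` avoids shell quoting issues.
```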

Generate an MTR report from Ruby

Invoke mtr generation: