MTR Monitor
In December 2017, Hetzner, our hosting provider for the Build Platform, had a major network incident that lasted for almost a whole week. Our users were rightly frustrated.
To prevent and monitor these situations in the future, we have set up a transatlantic monitoring system based on MTR reports and on curl-ing important vendors for our platform, such as GitHub and DockerHub. This system should report any issues in the network between Germany (Hetzner) and the US (GitHub, DockerHub).
This project is part of the effort to have readily available MTR reports before, during, and after incidents that we can send to Hetzner.
The MTR monitor is an application that generates MTR reports every 5 minutes and uploads them to an S3 Bucket. It is available as a standalone Docker container, and as a Ruby gem that can be injected into other Ruby applications.
Currently, we have the following routes covered:
- Germany (Hetzner) -> AWS US East 1 (part of Job Runner)
- Germany (Hetzner) -> AWS US West 1 (part of Job Runner)
- Germany (Hetzner) -> AWS US West 2 (part of Job Runner)
- Germany (Hetzner) -> GitHub (part of Job Runner)
- Germany (Hetzner) -> DockerHub (part of Job Runner)
- Germany (Hetzner) -> Stripe (part of Job Runner)
- Germany (Hetzner) -> SemaphoreCI (part of Job Runner)
- AWS US East 1 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)
- AWS US West 1 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)
- AWS US West 2 -> Builder sb1 in Hetzner (standalone AWS instance with Docker container)
Dashboards for the MTR monitor can be found on the Platform — Network dashboard on Grafana.
The US-based MTR monitors have the following DNS addresses:
mtr-monitor.us-east-1.semaphoreci.com
mtr-monitor.us-west-1.semaphoreci.com
mtr-monitor.us-west-2.semaphoreci.com
To SSH into them, run ssh ubuntu@<address>
Location of the generated MTR reports
The MTR monitor generates MTR reports, stores them on the local machine, and uploads them to S3.
Local reports are located in the /var/log/mtr directory and use the following naming structure:
/var/log/mtr/<name>-<YYYY-MM-DD>-<host-ip-address>-<HH-MM>.log
For example, if you call your report hetzner-to-us-east-1 and run it at 2017-12-18 12:33:06, the log will be generated in:
/var/log/mtr/hetzner-to-us-east-1-2017-12-18-142-21-43-11-12-33.log
On S3, the path will follow the same convention, but will use a nested directory structure:
s3://<bucket-name>/<name>/<YYYY-MM-DD>/<host-ip-address>/<HH-MM>.log
For example:
s3://<bucket-name>/hetzner-to-us-east-1/2017-12-18/142-21-43-11/12-33.log
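The two path conventions above can be sketched in a few lines of Ruby. This is only an illustration: mtr_report_paths is a hypothetical helper, not part of the mtr_monitor gem, and it assumes that the dashed IP segment is the dotted host IP with dots replaced by dashes.

```ruby
# Hypothetical helper (not the gem's API): builds the local and S3 paths
# for one report, following the conventions described above.
def mtr_report_paths(name:, host_ip:, time:, bucket:, log_dir: "/var/log/mtr")
  date = time.strftime("%Y-%m-%d") # year-month-day, as in 2017-12-18
  hhmm = time.strftime("%H-%M")    # hours-minutes, as in 12-33
  ip   = host_ip.tr(".", "-")      # assumption: dots become dashes

  {
    local: File.join(log_dir, "#{name}-#{date}-#{ip}-#{hhmm}.log"),
    s3:    "s3://#{bucket}/#{name}/#{date}/#{ip}/#{hhmm}.log"
  }
end

paths = mtr_report_paths(
  name:    "hetzner-to-us-east-1",
  host_ip: "142.21.43.11",
  time:    Time.new(2017, 12, 18, 12, 33, 6),
  bucket:  "my-private-bucket-name"
)
puts paths[:local] # /var/log/mtr/hetzner-to-us-east-1-2017-12-18-142-21-43-11-12-33.log
puts paths[:s3]    # s3://my-private-bucket-name/hetzner-to-us-east-1/2017-12-18/142-21-43-11/12-33.log
```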
Report Name
The name of the report is used to group reports with the same purpose on S3 and on the local file system.
We use the following naming convention:
<from>-to-<destination>
Examples:
hetzner-to-github
us-east-1-to-hetzner-sb1
hetzner-to-us-west-2
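If report names are built in code, a small helper can keep them consistent with this convention. The sketch below is hypothetical (report_name and REPORT_NAME_PATTERN are not part of the gem), and the restriction to lowercase letters, digits, and hyphens is an assumption based on the examples above.

```ruby
# Hypothetical helper: builds names of the form <from>-to-<destination>.
# Assumption: segments contain only lowercase letters, digits, and hyphens.
REPORT_NAME_PATTERN = /\A[a-z0-9]+(?:-[a-z0-9]+)*-to-[a-z0-9]+(?:-[a-z0-9]+)*\z/

def report_name(from, destination)
  name = "#{from}-to-#{destination}"
  raise ArgumentError, "invalid report name: #{name}" unless name.match?(REPORT_NAME_PATTERN)
  name
end

puts report_name("hetzner", "github")        # hetzner-to-github
puts report_name("us-east-1", "hetzner-sb1") # us-east-1-to-hetzner-sb1
```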
Using MTR Monitor as a gem
The MTR monitor can be used as a gem and injected into existing Ruby applications. Currently, we inject the MTR monitor into Job Runner.
First, add the mtr_monitor gem to your Gemfile:
gem 'mtr_monitor'
Second, use the MtrMonitor::Report class to generate a report:
# Not needed if Bundler.require already loads the gem.
require "mtr_monitor"

name = "google"                          # report name, used to group reports
domain = "google.com"                    # host to trace
s3_bucket = "my-private-bucket-name"     # change this
aws_access_key_id = "<KEY>"
aws_secret_access_key = "<KEY>"

report = MtrMonitor::Report.new(name,
                                domain,
                                s3_bucket,
                                aws_access_key_id,
                                aws_secret_access_key)
report.generate
The above snippet will:
- generate an MTR report on your local system under the /var/log/mtr directory
- upload the report to the provided S3 bucket
- submit metrics via Watchman and generate a metric "pulse"
If you want to generate reports continuously, create a CRON task that calls the above code. To monitor whether the CRON task is running as expected, set up an alert on Grafana based on the "pulse" metric.
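A script invoked by such a CRON task could look roughly like the sketch below. Everything named here is illustrative: TARGETS, generate_all, and the listed domains are assumptions, and the report factory is injected so that the loop does not depend on the gem being loaded.

```ruby
# Hypothetical CRON entry point. TARGETS and generate_all are illustrative
# names, and the domains are assumptions, not values taken from the gem.
TARGETS = {
  "hetzner-to-github"    => "github.com",
  "hetzner-to-dockerhub" => "hub.docker.com"
}.freeze

# report_factory responds to call(name, domain) and returns an object with
# a #generate method. In a real application it could be:
#   ->(name, domain) { MtrMonitor::Report.new(name, domain, s3_bucket, key_id, secret) }
def generate_all(targets, report_factory)
  targets.each_with_object({}) do |(name, domain), results|
    begin
      report_factory.call(name, domain).generate
      results[name] = :ok
    rescue StandardError => e
      # One failing target should not stop the remaining reports.
      warn "MTR report #{name} failed: #{e.message}"
      results[name] = :failed
    end
  end
end
```

Rescuing per target is a design choice for this sketch: a single unreachable destination still lets the other reports be generated during the same run.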
The pulse metric has the format network.mtr.pulse and is tagged with the hostname of the server where the MTR monitor is running and with the name of the metric.
MTR hops are also submitted to Grafana. Based on these metrics, you can observe the packet loss and the average, best, and worst latency on the network. For more information, read the code in lib/mtr_monitor/metrics.rb.
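To give a sense of where the per-hop values come from, here is a minimal sketch that parses one hop line of mtr report output. It is not the code in lib/mtr_monitor/metrics.rb: parse_hop and HOP_LINE are hypothetical, the column order (Loss%, Snt, Last, Avg, Best, Wrst, StDev) is assumed to be the default report layout, and the sample line is made up for illustration.

```ruby
# Hypothetical parser for one hop line of `mtr --report` output.
# Assumed columns: hop, host, Loss%, Snt, Last, Avg, Best, Wrst, StDev.
HOP_LINE = /\A\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*\z/

def parse_hop(line)
  m = HOP_LINE.match(line)
  return nil unless m # header and other non-hop lines are skipped

  {
    hop:          m[1].to_i,
    host:         m[2],
    loss_percent: m[3].to_f,
    sent:         m[4].to_i,
    avg:          m[6].to_f,
    best:         m[7].to_f,
    worst:        m[8].to_f
  }
end

# Made-up sample line, using an address from the documentation range.
sample = "  3.|-- 192.0.2.1   10.0%    10   24.1  25.3  23.9  30.2   1.8"
p parse_hop(sample)
```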
Using MTR Monitor as a standalone Docker container
The MTR monitor can be used as a standalone Docker container. This is our current approach for monitors that are hitting Germany from the United States.
To run a standalone MTR monitor, run the following command:
docker run --name mtr-monitor -d -v /var/log/mtr:/var/log/mtr -e NAME=<> -e DOMAIN=<> -e MTR_OPTIONS=<> -e S3_BUCKET=<> -e AWS_ACCESS_KEY_ID=<> -e AWS_SECRET_ACCESS_KEY=<> -e SLEEP_TIME=<> renderedtext/mtr_monitor
By default, the containers running on us-east-1, us-west-1, and us-west-2 are automatically deployed on every merge into master in this repository.
The new container on the machine will trigger an MTR report generation every 5 minutes. Every time a report is generated, the following steps are executed:
- a new MTR report is generated on your local system under the /var/log/mtr directory
- the report is uploaded to the provided S3 bucket
- metrics are submitted via Watchman and a pulse is generated
- the MTR cleaner is initiated, which removes all reports older than 2 weeks from the local system
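The cleanup step can be pictured with the sketch below. It is an assumption-laden illustration rather than the gem's cleaner: clean_old_reports is a hypothetical name, and report age is judged by file modification time, which may differ from what the real implementation does.

```ruby
TWO_WEEKS = 14 * 24 * 60 * 60 # seconds

# Hypothetical cleaner: deletes *.log files directly under log_dir whose
# modification time is more than max_age seconds before `now`.
# Returns the list of deleted paths.
def clean_old_reports(log_dir = "/var/log/mtr", max_age: TWO_WEEKS, now: Time.now)
  cutoff = now - max_age
  old_reports = Dir.glob(File.join(log_dir, "*.log")).select do |path|
    File.mtime(path) < cutoff
  end
  old_reports.each { |path| File.delete(path) }
end

# Example call (commented out so that loading this file deletes nothing):
# clean_old_reports("/var/log/mtr")
```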
To monitor whether the container is running as expected, set up an alert on Grafana based on the "pulse" metric.
The pulse metric has the format network.mtr.pulse and is tagged with the hostname of the server where the MTR monitor is running and with the name of the metric.
MTR hops are also submitted to Grafana. Based on these metrics, you can observe the packet loss and the average, best, and worst latency on the network. For more information, read the code in lib/mtr_monitor/metrics.rb.
Setting up a new EC2 machine for an MTR monitor
- Launch a new EC2 machine on AWS. Choose a t2.nano instance type with the Ubuntu 14.04 operating system.
- SSH into the machine with the newly generated SSH keypair.
- Add RT developers to the authorized keys file. For a list of public keys, refer to s3://renderedtext-secrets/stg1-semaphore/authorized-keys.
- Install Docker. Run curl https://get.docker.com | sh
- Add the ubuntu user to the docker group. Run sudo usermod -aG docker ubuntu
- Log out and back in to the SSH session so that the group change takes effect.
- Pull and run the MTR monitor:
docker run --name mtr-monitor -d -v /var/log/mtr:/var/log/mtr -e NAME=<> -e DOMAIN=<> -e MTR_OPTIONS=<> -e S3_BUCKET=<> -e AWS_ACCESS_KEY_ID=<> -e AWS_SECRET_ACCESS_KEY=<> -e SLEEP_TIME=<> renderedtext/mtr_monitor
- If you want to keep this machine permanently, add it to the list of continuously deployed servers.
Continuously deploying MTR monitor to an EC2 machine
TODO @bmarkons
Set up Alerts and Monitoring for an MTR monitor
TODO @bmarkons