Linkey


Link checker for BBC News & World Services sites.

The idea is to quickly check a page for broken links by doing a status check on all the relative URLs on the page.

The tool takes four arguments: the URL, the base URL, the regex and the filename.

  • URL is the page that you want to check for broken links, e.g. www.bbc.co.uk/news/uk-29928282
  • Base URL is joined with each relative URL matched by the regex to create a full URL, e.g. www.bbc.co.uk
  • Regex is the part of the URL that you want to match and keep. For example, for bbc.co.uk/news/uk, specifying /news would give /news/uk.
  • Filename is the Markdown (.md) file where all the page links are stored. This can be useful for manual checks, e.g. file.md
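
As a rough illustration of how the base URL and the regex work together, the sketch below pulls relative links out of some HTML, keeps the ones matching a path prefix, and joins them to the base. This is not Linkey's actual implementation: build_full_urls is a hypothetical helper, the HTML is made up, and the regex is treated as a simple path prefix for the sake of the example.

```ruby
require 'uri'

# Hypothetical helper, not part of Linkey's API.
# Extracts href values from the HTML, keeps those that start with the given
# prefix, removes duplicates, and joins each one to the base URL.
def build_full_urls(html, base, prefix)
  hrefs = html.scan(/href="([^"]+)"/).flatten
  hrefs.select { |href| href.start_with?(prefix) }
       .uniq
       .map { |href| URI.join(base, href).to_s }
end

# Made-up page content for illustration.
html = <<~HTML
  <a href="/news/uk">UK</a>
  <a href="/sport">Sport</a>
  <a href="/news/world">World</a>
HTML

# Only the two /news links survive the prefix filter.
puts build_full_urls(html, 'http://www.bbc.co.uk', '/news')
```

Here /sport is dropped because it does not match /news, and the remaining relative paths become full URLs that can be status checked.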

Installation

gem install linkey

Usage

Command Line

linkey check <url> <base_url> <regex> <filename>

Examples

linkey check http://www.bbc.co.uk/arabic http://www.bbc.co.uk /arabic arabic.md
linkey check http://www.theguardian.com/technology/2014/feb/15/year-of-code-needs-reboot-teachers http://theguardian.com /technology news.md

Output

Once running, each link is reported as either Status is 200 for <URL> or Status is NOT GOOD for <URL>.
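
The two kinds of output line can be sketched as follows. This is a minimal sketch, not Linkey's own code: status_line and fetch_status are hypothetical helpers, and treating every non-200 response (including redirects and network errors) as NOT GOOD is an assumption.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: format a result line in the style described above.
# Anything other than 200 (including nil) is assumed to be NOT GOOD.
def status_line(url, code)
  code.to_i == 200 ? "Status is 200 for #{url}" : "Status is NOT GOOD for #{url}"
end

# Hypothetical helper: fetch the status code for a URL without following
# redirects. Any error (network or malformed URL) is returned as nil, which
# status_line reports as NOT GOOD.
def fetch_status(url)
  Net::HTTP.get_response(URI(url)).code
rescue StandardError
  nil
end

# Literal codes are used here so the example runs without network access.
# A live check would be: status_line(url, fetch_status(url))
puts status_line('http://www.bbc.co.uk/arabic', '200')
puts status_line('http://www.bbc.co.uk/arabic/missing', '404')
```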

Script It

require 'linkey'

url = 'http://www.live.bbc.co.uk/arabic'
base = 'http://www.live.bbc.co.uk'
reg = '/arabic'
filename = 'arabic.md'

page = Linkey::SaveLinks.new(url, filename)
status = Linkey::CheckResponse.new(url, base, reg, filename)

page.capture_links
status.check_links

From a File

If you have a lot of URLs that you want to check regularly, checking them from a file is an alternative option. Use the smoke command and point it at a YAML file, including the extension. In some situations we deploy applications that should not be public facing, so ensuring they return a 404 is essential. The status code option sets the status code expected from a group of URLs, so that builds fail when a URL does not return it.

linkey smoke test.yaml

Example YAML Config:

base: 'http://www.bbc.co.uk'

concurrency: 100

headers:
 -
   X-content-override: 'https://example.com'

status_code: 200

paths:
  - /news
  - /news/uk

Via a Ruby script:

require 'linkey'

tests = Linkey::Checker.new("path/to.yaml")
tests.smoke
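
To make the config fields concrete, here is a sketch of how such a file could be expanded into a list of checks: each path is appended to base and paired with the expected status_code. It only illustrates the file layout. expected_checks is a hypothetical helper and says nothing about how Linkey::Checker works internally.

```ruby
require 'yaml'

# The example config from above, inlined so the sketch is self-contained.
config = YAML.safe_load(<<~YAML)
  base: 'http://www.bbc.co.uk'
  concurrency: 100
  headers:
    -
      X-content-override: 'https://example.com'
  status_code: 200
  paths:
    - /news
    - /news/uk
YAML

# Hypothetical helper, not part of Linkey's API.
# Pairs every full URL (base + path) with the status code it should return.
def expected_checks(config)
  config['paths'].map { |path| [config['base'] + path, config['status_code']] }
end

expected_checks(config).each do |url, code|
  puts "#{url} should return #{code}"
end
```

For applications that must not be public facing, setting status_code: 404 in the same layout would express that every listed path is expected to return a 404.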