robots.txt is an extremely simple format, but it still needs some parser love. This class will normalise robots.txt entries for you.

USAGE


With the following robots.txt:

User-agent: *
Disallow: /logs

User-agent: Google
Disallow: /admin

Use it like this:

require 'robotstxtparser'

# Also accepts a local file
rp = RobotsTxtParser.new()
rp.read("something.com/robots.txt")

rp.user_agents('Google') # returns ["/logs", "/admin"]
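
Note that the Google result includes both its own rule and the wildcard (*) rule: normalising means agent-specific rules are merged with the wildcard rules. For illustration, here is a minimal sketch of that merging logic. The class and method names below (MiniRobotsParser, rules_for) are hypothetical and not part of this library:

# Hypothetical sketch of wildcard merging -- not this library's implementation.
class MiniRobotsParser
  def initialize(robots_txt)
    @rules = Hash.new { |h, k| h[k] = [] }
    agent = nil
    robots_txt.each_line do |line|
      line = line.sub(/#.*/, '').strip   # drop comments and whitespace
      case line
      when /\AUser-agent:\s*(.+)\z/i then agent = $1.downcase
      when /\ADisallow:\s*(\S+)\z/i  then @rules[agent] << $1 if agent
      end
    end
  end

  # Merge the wildcard (*) rules into the agent's own rules,
  # mirroring the normalisation described above.
  def rules_for(agent)
    (@rules['*'] + @rules[agent.downcase]).uniq
  end
end

robots = <<~TXT
  User-agent: *
  Disallow: /logs

  User-agent: Google
  Disallow: /admin
TXT

MiniRobotsParser.new(robots).rules_for('Google') # => ["/logs", "/admin"]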