robots.txt is an extremely simple format, but it still benefits from careful parsing. This class normalises robots.txt entries for you.
USAGE
With the following robots.txt:
User-agent: *
Disallow: /logs

User-agent: Google
Disallow: /admin
Use it like this:
require 'robotstxtparser'

# Also accepts a local file
rp = RobotsTxtParser.new
rp.read("something.com/robots.txt")

rp.user_agents('Google') # returns ["/logs", "/admin"]
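For readers curious what normalising robots.txt entries involves, here is a minimal sketch of the idea (not this gem's internals): walk the file line by line, track the current User-agent record, and group each Disallow path under it. The `parse_robots_txt` helper below is hypothetical, purely for illustration.

```ruby
# Hypothetical sketch: group Disallow rules by User-agent record.
def parse_robots_txt(text)
  rules = Hash.new { |h, k| h[k] = [] }  # agent => list of disallowed paths
  current = nil
  text.each_line do |line|
    line = line.sub(/#.*/, '').strip     # drop comments and surrounding whitespace
    next if line.empty?
    field, _, value = line.partition(':')
    field = field.strip.downcase
    value = value.strip
    case field
    when 'user-agent'
      current = value                    # start (or continue) a record for this agent
    when 'disallow'
      # An empty Disallow value means "allow everything", so record only real paths.
      rules[current] << value unless current.nil? || value.empty?
    end
  end
  rules
end

rules = parse_robots_txt("User-agent: *\nDisallow: /logs\n\nUser-agent: Google\nDisallow: /admin\n")
rules["*"]      # => ["/logs"]
rules["Google"] # => ["/admin"]
```

A real parser also has to merge the wildcard `*` record into each specific agent's rules, which is the kind of normalisation this class handles for you.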