EmailCrawler

Email crawler: crawls the top N Google search results looking for email addresses and exports them to CSV.

Installation

$ gem install email_crawler

Usage

  • Ask for help
email-crawler --help
  • Simplest Google search
email-crawler --query "berlin walks"
  • Select which Google website to use (defaults to google.com.br)
email-crawler --query "berlin walks" --google-website google.de
  • Specify how many search results URLs to collect (defaults to 100)
email-crawler --query "berlin walks" --max-results 250
  • Specify how many internal links to scan for email addresses (defaults to 100)
email-crawler --query "berlin walks" --max-links 250
  • Specify how many threads to use when searching for links and email addresses (defaults to 50)
email-crawler --query "berlin walks" --concurrency 25
  • Exclude certain domains from pages scanned for email addresses
email-crawler --query "berlin walks" --blacklist berlin.de --blacklist berlin.com
  • Redirect output to a file
email-crawler --query "berlin walks" > ~/Desktop/berlin-walks-emails.csv
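
The core technique behind the crawler — scanning fetched page text for email-like strings and emitting them as CSV — can be sketched in plain Ruby. This is an illustrative sketch only, not the gem's internal API; the method names and the (deliberately loose) regex are assumptions for demonstration.

```ruby
require "csv"

# A simple email pattern; real-world address matching is messier than this.
EMAIL_REGEXP = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

# Scan a page's text for email-like strings, deduplicated.
def extract_emails(html)
  html.scan(EMAIL_REGEXP).uniq
end

# Render the collected addresses as CSV with a header row.
def emails_to_csv(emails)
  CSV.generate do |csv|
    csv << ["Email"]
    emails.each { |email| csv << [email] }
  end
end

page = 'Contact <a href="mailto:info@example.com">info@example.com</a> or press@example.com'
puts emails_to_csv(extract_emails(page))
```

The real gem adds the Google search step, concurrent fetching of internal links, and domain blacklisting on top of this basic extract-and-export loop.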

Contributing

  1. Fork it ( http://github.com/wecodeio/email_crawler/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request