Please report dependency issues in the Issues section of this repo. Include:
1. Your operating system and architecture (example: "Ubuntu, 32-bit").
2. The full error backtrace.
3. Your Ruby version (you can see it by typing "ruby -v" in your command prompt).

Janis

Janis helps you find proxy servers quickly by grabbing them from a list of many (hopefully available and up-to-date) proxy listing websites. You can also tell Janis to parse a specific website, and it will do so if it knows how. If it doesn't, you can improve it by adding new parsers (see the Extending Janis section).

Installation

Add this line to your application's Gemfile:

gem 'janis'

And then execute:

$ bundle

Or install it yourself as:

$ gem install janis

Then download the latest version of PhantomJS for your platform from http://phantomjs.org/download.html.

Place the PhantomJS executable somewhere in your PATH.

On Unix, you can see your PATH from your shell by typing 'echo $PATH'. Common folders to place the phantomjs binary in are /usr/bin and /usr/local/bin.

On Windows, you can check your PATH in the "Environment Variables" section of your system settings. C:\windows\system32\ is a common location to place phantomjs.exe in.
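
If you are not sure whether the executable ended up on your PATH, a quick check from Ruby can confirm it. This is only a minimal sketch using the standard library; the helper name find_executable_on_path is hypothetical and is not part of Janis.

```ruby
# Hypothetical helper (not part of Janis): returns the full path of an
# executable found in one of the PATH directories, or nil if there is none.
def find_executable_on_path(name, path_var = ENV.fetch("PATH", ""))
  # On Windows, executables carry one of the extensions listed in PATHEXT.
  extensions = ENV.fetch("PATHEXT", "").split(";") + [""]
  path_var.split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, "#{name}#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

location = find_executable_on_path("phantomjs")
puts location ? "phantomjs found at #{location}" : "phantomjs is not on your PATH"
```

If the script reports that phantomjs is not on your PATH, move the binary into one of the listed folders or add its folder to PATH.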

Usage

From your own script/app or from irb, require the gem with:

require 'janis'

And then do:

Janis.find(max_amount_of_results)

That will gather proxy server info from all URLs (and local files) included in the default source list, returning at most the number of results given in the argument. Note: Entries in the default source list can be disabled by commenting them out with a # at the beginning of the line.
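
As a rough illustration of what you might do with the results, the sketch below routes an HTTP client through one entry. It assumes that Janis.find returns the same "IP:PORT_NUMBER" strings that the parsers produce, which is not confirmed here; split_proxy and http_via_proxy are hypothetical helper names, and the entry is hard-coded so the snippet runs without Janis.

```ruby
require "net/http"

# Splits an "IP:PORT_NUMBER" string into a host string and an integer port.
def split_proxy(entry)
  host, port = entry.strip.split(":", 2)
  [host, Integer(port, 10)]
end

# Builds (but does not open) an HTTP client that routes through the proxy.
def http_via_proxy(target_host, proxy_entry)
  proxy_host, proxy_port = split_proxy(proxy_entry)
  Net::HTTP.new(target_host, 80, proxy_host, proxy_port)
end

# Assumption: Janis.find(5) would return entries shaped like this one.
example_entry = "1.1.1.1:3434"
http = http_via_proxy("example.com", example_entry)
puts "Requests to example.com would go through #{http.proxy_address}:#{http.proxy_port}"
```

No connection is opened until you actually send a request with the client, so the snippet is safe to run offline.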

Extending Janis

If there's a proxy listing website you consider reliable and up-to-date and you'd like to add it to the list:

  1. Fork the Janis repository.
  2. Define a parser file following the format shown in /specific_parsers/template.rb. There, subclass ProxyWebsiteParser and override the #parse method. Example:

    class MyAwesomeProxyListParser < Janis::Parsing::WebSpecificParsers::ProxyWebsiteParser
    
      include CapybaraWithPanthomJs # optional - only if you use capybara-poltergeist for parsing
    
      def self.url
        # url to the proxy list website you will be parsing in the #parse method
      end
    
      def configure_capybara # optional - only if you use capybara-poltergeist for parsing
        Capybara.configure { |c| c.app_host = url }
      end
    
      def initialize
        super
        configure_capybara # optional - only if you use capybara-poltergeist for parsing
        @session = new_session # optional - only if you use capybara-poltergeist for parsing
        @session.visit(url) # optional - only if you use capybara-poltergeist for parsing
        obtain_html_doc
      end
    
      def parse
        # Your code to parse the page's content and deliver an array of strings
        # Those strings must have the format "IP:PORT_NUMBER"
      end
    
      private
    
      def obtain_html_doc # optional - Redefine the way the html document to parse is obtained if you use capybara/poltergeist
        @html_doc = Nokogiri.HTML(@session.html)
      end
    
    end
    
  3. Implement the #parse method so that it returns an array of strings in the "IP:PORT_NUMBER" format. Example output: ["1.1.1.1:3434", "2.2.2.2:3333", "255.3.1.4:8787"]. The implementation must use Nokogiri, our parser dependency.

  4. Run the tests.

  5. If all tests pass, create a pull request.

  6. Wait for the applause to come!
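
Before opening a pull request, it can help to sanity-check what your #parse method returns. The sketch below is one possible check for the "IP:PORT_NUMBER" format using only the standard library; valid_proxy_entry? is a hypothetical helper, not part of Janis or its test suite, and it assumes IPv4 addresses only.

```ruby
require "ipaddr"

# Hypothetical helper (not part of Janis): true when the entry looks like
# "IP:PORT_NUMBER" with a valid IPv4 address and a port between 1 and 65535.
def valid_proxy_entry?(entry)
  return false unless entry.is_a?(String)

  match = /\A(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\z/.match(entry)
  return false unless match

  port = match[2].to_i
  return false unless (1..65_535).cover?(port)

  # IPAddr raises on out-of-range octets such as "999.1.1.1".
  IPAddr.new(match[1]).ipv4?
rescue IPAddr::InvalidAddressError
  false
end

# Sample data standing in for the return value of a #parse implementation.
entries = ["1.1.1.1:3434", "2.2.2.2:3333", "255.3.1.4: 8787"]
invalid = entries.reject { |entry| valid_proxy_entry?(entry) }
puts "Malformed entries: #{invalid.inspect}"
```

The last sample entry is rejected because of the stray space after the colon, a kind of slip that is easy to miss when scraping table cells.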

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/mgiagante/janis.