Ruby lib for managing proxies

This gem helps your Ruby application make HTTP(S) requests through a proxy by fetching and validating up-to-date proxy lists from different providers, such as HideMyName.

It gives you a Manager class that can load a proxy list, validate it, and return a random or specific proxy entry. The documentation below covers all of the gem's features.

Installation

If using bundler, first add 'proxy_fetcher' to your Gemfile:

gem 'proxy_fetcher', '~> 0.2'

or if you want to use the latest version (from master branch), then:

gem 'proxy_fetcher', git: 'https://github.com/nbulaj/proxy_fetcher.git'

And run:

bundle install

Otherwise simply install the gem:

gem install proxy_fetcher -v '0.2'

Example of usage

Get current proxy list:

manager = ProxyFetcher::Manager.new # will immediately load proxy list from the server
manager.proxies

 #=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA", 
 #     @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]

You can initialize the proxy manager without loading the proxy list from the remote server by passing refresh: false on initialization:

manager = ProxyFetcher::Manager.new(refresh: false) # just initialize class instance
manager.proxies

 #=> []

Get raw proxy URLs:

manager = ProxyFetcher::Manager.new
manager.raw_proxies

 #=> ["http://97.77.104.22:3128", "http://94.23.205.32:3128", "http://209.79.65.140:8080",
 #     "http://91.217.42.2:8080", "http://97.77.104.22:80", "http://165.234.102.177:8080", ...]

If ProxyFetcher::Manager was already initialized somewhere, you can refresh the proxy list by calling the #refresh_list! method:

manager.refresh_list! # or manager.fetch!

 #=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA", 
 #     @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]

Every proxy is a ProxyFetcher::Proxy object that has the following readers (instance variables):

  • addr (IP address)
  • port
  • country (USA or Brazil for example)
  • response_time (5217 for example)
  • speed (:slow, :medium or :fast. Note: depends on the proxy provider and can be nil)
  • type (URI schema, HTTP or HTTPS)
  • anonymity (Low or High +KA for example)

You can also call the following instance methods on every Proxy object:

  • connectable? (whether the proxy server is available)
  • http? (whether the proxy server uses the HTTP protocol)
  • https? (whether the proxy server uses the HTTPS protocol)
  • uri (returns a URI::Generic object)
  • url (returns a formatted URL like "http://IP:PORT")
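As an illustration of what the url value can be used for, here is a minimal sketch that routes a net/http request through a proxy. It relies only on the standard library; the helper name http_via_proxy and its timeout keyword are hypothetical, and the proxy URL string is assumed to have the "http://IP:PORT" shape described above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of proxy_fetcher): builds a Net::HTTP client
# for target_url that sends its traffic through proxy_url.
def http_via_proxy(target_url, proxy_url, timeout: 3)
  target = URI.parse(target_url)
  proxy  = URI.parse(proxy_url)

  # The third and fourth arguments of Net::HTTP.new are the proxy host and port.
  http = Net::HTTP.new(target.host, target.port, proxy.host, proxy.port)
  http.use_ssl = (target.scheme == 'https')
  http.open_timeout = timeout
  http.read_timeout = timeout
  http
end

http = http_via_proxy('http://example.com/', 'http://97.77.104.22:3128')
# http.get('/') would now perform the request through the proxy.
```

With a real manager, the second argument would be the string returned by a proxy's url method.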

You can use two methods to get the first proxy from the list:

  • get, aliased as pop (returns the first proxy and moves it to the end of the list)
  • get!, aliased as pop! (returns the first connectable proxy and moves it to the end of the list; all proxies before the working one are removed)
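The difference between the two methods can be modelled on a plain Ruby array. This is only a sketch of the behaviour described above, not the gem's implementation: FakeProxy, rotate_first and rotate_first_connectable are hypothetical names, and clearing the list when no proxy is connectable is an assumption.

```ruby
# Stand-in for ProxyFetcher::Proxy with just enough behaviour for the example.
FakeProxy = Struct.new(:addr, :alive) do
  def connectable?
    alive
  end
end

# Models get / pop: take the first entry and move it to the end of the list.
def rotate_first(list)
  return nil if list.empty?

  proxy = list.shift
  list << proxy
  proxy
end

# Models get! / pop!: drop entries until a connectable one is found,
# then rotate that entry to the end of the list.
def rotate_first_connectable(list)
  index = list.find_index(&:connectable?)
  if index.nil?
    list.clear # assumption: nothing survives when every proxy is dead
    return nil
  end

  list.shift(index) # discard the dead proxies in front of the working one
  rotate_first(list)
end

proxies = [
  FakeProxy.new('10.0.0.1', false),
  FakeProxy.new('10.0.0.2', true),
  FakeProxy.new('10.0.0.3', true)
]

working = rotate_first_connectable(proxies)
# working is the 10.0.0.2 entry; the dead 10.0.0.1 entry is gone from proxies.
```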

If you want to remove dead servers from the current proxy manager list, call the cleanup! method:

manager.cleanup! # or manager.validate!

You can sort or find any proxy by speed using the following three instance methods:

  • fast?
  • medium?
  • slow?

Configuration

To change the open/read timeout for the cleanup! and connectable? methods, change ProxyFetcher.config:

ProxyFetcher.configure do |config|
  config.read_timeout = 1 # default is 3
  config.open_timeout = 1 # default is 3
end

manager = ProxyFetcher::Manager.new
manager.cleanup!
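To give a feel for what an open timeout bounds, here is a minimal TCP-level sketch using only the standard library. It is not how the gem implements connectable?; the reachable? helper and its open_timeout keyword are hypothetical names chosen for illustration.

```ruby
require 'socket'

# Hypothetical helper: returns true if a TCP connection to host:port can be
# opened within open_timeout seconds, false otherwise.
def reachable?(host, port, open_timeout: 3)
  Socket.tcp(host, port, connect_timeout: open_timeout) { true }
rescue SystemCallError, SocketError
  # Covers refused connections, timeouts and name resolution failures.
  false
end

# Example call; the result depends on the network and on the proxy itself.
# reachable?('97.77.104.22', 3128, open_timeout: 1)
```

A lower timeout makes validating a long list faster, at the cost of discarding slow but working proxies.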

ProxyFetcher uses a simple Ruby solution for HTTP requests: the net/http library. If you want to add, for example, a custom provider whose site was built as a Single Page Application (SPA) with JavaScript, you will need something like selenium-webdriver (https://github.com/SeleniumHQ/selenium/tree/master/rb) to load the content of the website properly. For these and other cases you can write your own class that fetches HTML content by URL and set it in the ProxyFetcher config:

class MyHTTPClient
  class << self
    # [IMPORTANT]: self.fetch method is required!
    def fetch(url)
      # ... some magic to return proper HTML ...
    end
  end
end

ProxyFetcher.config.http_client = MyHTTPClient

manager = ProxyFetcher::Manager.new
manager.proxies

 #=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA", 
 #     @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
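As a concrete but hedged example of such a class, the sketch below fills in fetch with net/http, adding explicit timeouts and a User-Agent header. The class name SimpleHTTPClient, the keyword arguments and the header value are assumptions made for illustration; only the fetch(url) class method is required by the text above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical custom client: fetches a page body with explicit timeouts.
class SimpleHTTPClient
  class << self
    # Takes a URL string and returns the response body as a String.
    def fetch(url, open_timeout: 3, read_timeout: 3)
      uri = URI.parse(url)

      Net::HTTP.start(uri.host, uri.port,
                      use_ssl: uri.scheme == 'https',
                      open_timeout: open_timeout,
                      read_timeout: read_timeout) do |http|
        # Some sites reject requests without a browser-like User-Agent.
        response = http.get(uri.request_uri, 'User-Agent' => 'Mozilla/5.0')
        response.body
      end
    end
  end
end

# It would then be registered the same way as MyHTTPClient:
# ProxyFetcher.config.http_client = SimpleHTTPClient
```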

Providers

Currently ProxyFetcher supports the following proxy providers (services):

  • Hide My Name (default one)
  • Free Proxy List
  • SSL Proxies
  • Proxy Docker
  • XRoxy

To use one of them, set the required provider in the config:

ProxyFetcher.config.provider = :free_proxy_list

manager = ProxyFetcher::Manager.new
manager.proxies
 #=> ...

You can also write your own provider. Create a class that inherits from ProxyFetcher::Providers::Base and register it like this:

ProxyFetcher::Configuration.register_provider(:your_provider, YourProviderClass)

The provider class must implement the self.load_proxy_list and #parse!(html_entry) methods, which load and parse the provider's HTML page with the proxy list. Take a look at the samples in the proxy_fetcher/providers directory.
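The rough shape of those two methods can be sketched in plain Ruby. Everything here is hypothetical: the class does not inherit from ProxyFetcher::Providers::Base (so that it runs without the gem), the HTML layout is invented, and regular expressions stand in for a real HTML parser. Treat it as an outline of the responsibilities, not as the gem's API.

```ruby
# Hypothetical provider outline; a real one would inherit from
# ProxyFetcher::Providers::Base and fetch its HTML from the provider's site.
class SampleProvider
  attr_reader :addr, :port

  # Invented HTML standing in for a provider page.
  SAMPLE_HTML = <<~HTML
    <table>
      <tr><td>97.77.104.22</td><td>3128</td></tr>
      <tr><td>94.23.205.32</td><td>8080</td></tr>
    </table>
  HTML

  # Class-level loader: split the page into rows and parse each one.
  def self.load_proxy_list(html = SAMPLE_HTML)
    html.scan(%r{<tr>.*?</tr>}m).map do |row|
      new.tap { |entry| entry.parse!(row) }
    end
  end

  # Instance-level parser: fill the readers from a single table row.
  def parse!(html_entry)
    cells = html_entry.scan(%r{<td>(.*?)</td>}).flatten
    @addr = cells[0]
    @port = Integer(cells[1])
    self
  end
end

entries = SampleProvider.load_proxy_list
```

The split keeps page-level concerns (finding rows) in the class method and row-level concerns (extracting fields) in parse!.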

TODO

  • Add proxy filters
  • Code refactoring
  • Rewrite specs

Contributing

You are very welcome to help improve ProxyFetcher if you have suggestions for features that other people can use.

To contribute:

  1. Fork the project.
  2. Create your feature branch (git checkout -b my-new-feature).
  3. Implement your feature or bug fix.
  4. Add documentation for your feature or bug fix.
  5. Run rake doc:yard. If your changes are not 100% documented, go back to step 4.
  6. Add tests for your feature or bug fix.
  7. Run rake spec to make sure all tests pass.
  8. Commit your changes (git commit -am 'Add new feature').
  9. Push to the branch (git push origin my-new-feature).
  10. Create a new pull request.

Thanks.

License

The proxy_fetcher gem is released under the MIT License.

Copyright (c) 2017 Nikita Bulai ([email protected]).

Some parser code (c) pifleo