Ruby lib for managing proxies
This gem helps your Ruby application make HTTP(S) requests through a proxy by fetching and validating up-to-date proxy lists from providers such as HideMyName.
It gives you a Manager class that can load a proxy list, validate it, and return a random or specific proxy entry. See
the documentation below for all of the gem's features.
Installation
If using bundler, first add 'proxy_fetcher' to your Gemfile:
gem 'proxy_fetcher', '~> 0.2'
or if you want to use the latest version (from master branch), then:
gem 'proxy_fetcher', git: 'https://github.com/nbulaj/proxy_fetcher.git'
And run:
bundle install
Otherwise simply install the gem:
gem install proxy_fetcher -v '0.2'
Example of usage
Get current proxy list:
manager = ProxyFetcher::Manager.new # will immediately load proxy list from the server
manager.proxies
#=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA",
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
You can initialize the proxy manager without loading the proxy list from the remote server by passing refresh: false on initialization:
manager = ProxyFetcher::Manager.new(refresh: false) # just initialize class instance
manager.proxies
#=> []
Get raw proxy URLs:
manager = ProxyFetcher::Manager.new
manager.raw_proxies
# => ["http://97.77.104.22:3128", "http://94.23.205.32:3128", "http://209.79.65.140:8080",
# "http://91.217.42.2:8080", "http://97.77.104.22:80", "http://165.234.102.177:8080", ...]
If ProxyFetcher::Manager was already initialized somewhere, you can refresh the proxy list by calling the #refresh_list! method:
manager.refresh_list! # or manager.fetch!
#=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA",
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
Every proxy is a ProxyFetcher::Proxy object that has the following readers (instance variables):
- addr (IP address)
- port
- country (USA or Brazil, for example)
- response_time (5217, for example)
- speed (:slow, :medium or :fast. Note: depends on the proxy provider and can be nil)
- type (URI scheme, HTTP or HTTPS)
- anonymity (Low or High +KA, for example)
You can also call the following instance methods on every Proxy object:
- connectable? (whether the proxy server is available)
- http? (whether the proxy server uses the HTTP protocol)
- https? (whether the proxy server uses the HTTPS protocol)
- uri (returns a URI::Generic object)
- url (returns a formatted URL like "http://IP:PORT")
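As a rough illustration of how these readers might be consumed, the sketch below configures a net/http connection to go through a proxy. ProxyEntry is a hypothetical stand-in for ProxyFetcher::Proxy (so the snippet runs without the gem or network access), http_via_proxy is an illustrative helper name, and example.com is a placeholder target. Only the addr and port readers listed above are assumed.

```ruby
require 'net/http'

# Hypothetical stand-in for ProxyFetcher::Proxy exposing the same addr/port readers.
ProxyEntry = Struct.new(:addr, :port)

# Builds (but does not open) a Net::HTTP connection routed through the proxy.
def http_via_proxy(proxy, host, port = 80)
  Net::HTTP.new(host, port, proxy.addr, proxy.port)
end

proxy = ProxyEntry.new('97.77.104.22', 3128)
http = http_via_proxy(proxy, 'example.com')

http.proxy?        # true, because a proxy address was passed explicitly
http.proxy_address # the proxy's addr
http.proxy_port    # the proxy's port
```

With a real manager, the same helper should accept any entry from manager.proxies, since it only relies on addr and port.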
You can use two methods to get the first proxy from the list:
- get or aliased pop (will return the first proxy and move it to the end of the list)
- get! or aliased pop! (will return the first connectable proxy and move it to the end of the list; all the proxies before the working one will be removed)
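The rotation behaviour described above can be pictured with a small self-contained model. This is not the gem's implementation: FakeProxy and RotatingList are hypothetical names, the alive flag simulates what connectable? would report, and returning nil without touching the list when nothing is connectable is an assumption of this sketch.

```ruby
# Hypothetical proxy entry; `alive` simulates the result of connectable?.
FakeProxy = Struct.new(:addr, :alive) do
  def connectable?
    alive
  end
end

# Hypothetical model of the manager's list handling.
class RotatingList
  attr_reader :proxies

  def initialize(proxies)
    @proxies = proxies.dup
  end

  # Like get/pop: return the first entry and move it to the end of the list.
  def get
    return nil if @proxies.empty?

    first = @proxies.shift
    @proxies << first
    first
  end

  # Like get!/pop!: drop every entry in front of the first connectable one,
  # then return that entry and move it to the end of the list.
  def get!
    index = @proxies.find_index(&:connectable?)
    return nil if index.nil? # assumption: leave the list alone when nothing works

    @proxies.shift(index)
    get
  end
end

list = RotatingList.new([FakeProxy.new('a', false),
                         FakeProxy.new('b', true),
                         FakeProxy.new('c', true)])
list.get!.addr            # => "b"
list.proxies.map(&:addr)  # => ["c", "b"]
```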
If you want to clear dead servers from the current proxy manager list, just call the cleanup! method:
manager.cleanup! # or manager.validate!
You can sort or find any proxy by speed using the following 3 instance methods:
- fast?
- medium?
- slow?
Configuration
To change the open/read timeout for the cleanup! and connectable? methods, change ProxyFetcher.config:
ProxyFetcher.configure do |config|
  config.read_timeout = 1 # default is 3
  config.open_timeout = 1 # default is 3
end
manager = ProxyFetcher::Manager.new
manager.cleanup!
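To make the role of the two timeouts concrete, here is a minimal sketch of a reachability check built on net/http. It is not how the gem implements connectable?. The method name reachable_through?, the target_host parameter and the example.com target are assumptions chosen for illustration; the defaults of 3 seconds mirror the defaults stated above.

```ruby
require 'net/http'

# Illustrative only: returns true when a HEAD request sent through the proxy
# yields a response within the given timeouts, and false on any connection,
# timeout or protocol error.
def reachable_through?(proxy_addr, proxy_port, open_timeout: 3, read_timeout: 3,
                       target_host: 'example.com')
  http = Net::HTTP.new(target_host, 80, proxy_addr, proxy_port)
  http.open_timeout = open_timeout # seconds to wait for the connection to open
  http.read_timeout = read_timeout # seconds to wait for each read of the response
  http.start { |conn| conn.head('/').is_a?(Net::HTTPResponse) }
rescue StandardError
  false
end

# Nothing is expected to act as a proxy on this local port, so the check fails fast.
reachable_through?('127.0.0.1', 1, open_timeout: 1, read_timeout: 1)
```

Lower timeouts make validating a long list faster, at the cost of discarding proxies that are slow but working.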
ProxyFetcher uses a simple Ruby solution for dealing with HTTP requests: the net/http library. If you want to add, for example, a custom provider that
was developed as a Single Page Application (SPA) with some JavaScript, you will need something like [selenium-webdriver](https://github.com/SeleniumHQ/selenium/tree/master/rb)
to properly load the content of the website. For those and other cases you can write your own class for fetching HTML content by URL and set it
in the ProxyFetcher config:
class MyHTTPClient
  class << self
    # [IMPORTANT]: self.fetch method is required!
    def fetch(url)
      # ... some magic to return proper HTML ...
    end
  end
end
ProxyFetcher.config.http_client = MyHTTPClient
manager = ProxyFetcher::Manager.new
manager.proxies
#=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA",
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
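For reference, a client of the required shape could look like the sketch below. It uses plain net/http rather than a browser driver, so it only demonstrates the fetch(url) contract; a Selenium-backed client would replace the method body. The class name NetHTTPClient, the 5-second timeouts and the ArgumentError for non-HTTP URLs are choices made for this example, not part of the gem.

```ruby
require 'net/http'
require 'uri'

# Hypothetical custom client: any class with a self.fetch(url) method will do.
class NetHTTPClient
  class << self
    # Returns the response body for an http(s) URL as a String.
    def fetch(url)
      uri = URI.parse(url)
      raise ArgumentError, "unsupported URL: #{url}" unless uri.is_a?(URI::HTTP)

      Net::HTTP.start(uri.host, uri.port,
                      use_ssl: uri.scheme == 'https',
                      open_timeout: 5, read_timeout: 5) do |http|
        http.get(uri.request_uri).body
      end
    end
  end
end

# Registered the same way as in the sample above:
# ProxyFetcher.config.http_client = NetHTTPClient
```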
Providers
Currently ProxyFetcher can deal with the following proxy providers (services):
- Hide My Name (default one)
- Free Proxy List
- SSL Proxies
- Proxy Docker
- XRoxy
If you want to use one of them, just set it in the config:
ProxyFetcher.config.provider = :free_proxy_list
manager = ProxyFetcher::Manager.new
manager.proxies
#=> ...
You can also write your own provider. All you need to do is create a class that inherits from the
ProxyFetcher::Providers::Base class and register your provider like this:
ProxyFetcher::Configuration.register_provider(:your_provider, YourProviderClass)
The provider class must implement the self.load_proxy_list and #parse!(html_entry) methods, which load and parse
the provider's HTML page with the proxy list. Take a look at the samples in the proxy_fetcher/providers directory.
TODO
- Add proxy filters
- Code refactoring
- Rewrite specs
Contributing
You are very welcome to help improve ProxyFetcher if you have suggestions for features that other people can use.
To contribute:
- Fork the project.
- Create your feature branch (git checkout -b my-new-feature).
- Implement your feature or bug fix.
- Add documentation for your feature or bug fix.
- Run rake doc:yard. If your changes are not 100% documented, go back to step 4.
- Add tests for your feature or bug fix.
- Run rake spec to make sure all tests pass.
- Commit your changes (git commit -am 'Add new feature').
- Push to the branch (git push origin my-new-feature).
- Create a new pull request.
Thanks.
License
proxy_fetcher gem is released under the MIT License.
Copyright (c) 2017 Nikita Bulai ([email protected]).
Some parser code (c) pifleo