Webarchive
This is a command-line (CUI) tool for submitting URIs to public web archiving services such as web.archive.org and archive.today.
Requests are throttled: by default, the tool waits 5 seconds between requests.
Rationale
The motivation for this tool is simple: increased availability through redundancy. Your favorite web archiving service might be down at some point, or blocked by certain websites. Archive a page with two or more services, and your archive stays reachable as long as at least one of them remains available.
Browser extensions with similar functionality may exist, but this tool may suit you better if you need to archive a large number of URIs, or if you simply prefer a CUI.
Installation
Install it with:
$ gem install webarchive
Usage
Launch it with
$ webarchive
and enter URIs on standard input.
If you have a list of URIs in a file, pipe it in:
$ cat list.txt | webarchive
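Before piping a long list, it can help to drop blank lines, duplicates, and malformed entries. The following is a minimal sketch, not part of the gem: the helper names clean_uri_list and http_uri? are hypothetical, and it assumes the list holds one URI per line and that only http and https URIs are of interest.

```ruby
require "uri"

# Hypothetical helper (not part of webarchive): keep only well-formed
# http(s) URIs, dropping blank lines, surrounding whitespace, and duplicates.
# Assumes one URI per line.
def clean_uri_list(lines)
  lines
    .map(&:strip)
    .reject(&:empty?)
    .select { |line| http_uri?(line) }
    .uniq
end

# True when the text parses as an http or https URI with a host.
# URI::HTTPS is a subclass of URI::HTTP, so one check covers both schemes.
def http_uri?(text)
  uri = URI.parse(text)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end
```

One way to use it would be to save it in a script that ends with puts clean_uri_list(ARGF.each_line), run that script on list.txt, and pipe its output into webarchive.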
Note that, by default, this program logs every URI you enter to ~/.webarchive_history. Pass --no-history to turn this off.
It has optional command-line parameters:
$ webarchive -h
Usage: webarchive [options]
    -w, --wait=SECONDS       wait for SECONDS between requests [default: 5.0]
    -r, --retry=N            retry for N times when failed [default: 5]
    -t, --timeout=SECONDS    timeout after SECONDS [default: 60.0]
        --[no-]history       record history [default: enabled]
    -d, --debug              add debug output, implies verbose
        --verbose
    -h, --help               show help
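To make the interplay of --wait, --retry, and --timeout concrete, here is a rough sketch of one way throttled submission with retries could work. It is not the gem's actual implementation. The names submit_all and with_retries, the injectable sleeper, and the choice to also wait before each retry are all assumptions made for illustration; the defaults mirror the help text above.

```ruby
require "timeout"

# Hypothetical sketch, not webarchive's real code.
# Calls the given block once per URI, pausing `wait` seconds between URIs.
# Returns a hash mapping each URI to :ok or :failed.
def submit_all(uris, wait: 5.0, retries: 5, timeout: 60.0,
               sleeper: Kernel.method(:sleep))
  results = {}
  uris.each_with_index do |uri, index|
    sleeper.call(wait) if index.positive? # throttle between requests
    results[uri] = with_retries(retries, wait, timeout, sleeper) { yield uri }
  end
  results
end

# Runs the block up to 1 + retries times. Any StandardError, including
# Timeout::Error, counts as a failed attempt.
def with_retries(retries, wait, timeout, sleeper)
  attempts = 0
  begin
    attempts += 1
    Timeout.timeout(timeout) { yield }
    :ok
  rescue StandardError
    return :failed if attempts > retries
    sleeper.call(wait) # assumption: the same wait applies before a retry
    retry
  end
end
```

The block passed to submit_all stands in for the actual request to an archiving service. Passing a custom sleeper, such as a lambda that records its argument, makes the timing behavior easy to check without real delays.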
Development
After checking out the repo, run bundle install to install dependencies. Then, run rake spec to run the tests.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome at https://gitlab.com/yusuke.matsubara/webarchive.
License
The gem is available as open source under the terms of the MIT License.