Webarchive

This is a command-line (CUI) tool for submitting URIs to public web archiving services such as web.archive.org and archive.today.

Requests are throttled: by default, the tool waits 5 seconds between requests (see the --wait option under Usage).

Rationale

The motivation for this tool is simple: increased availability through redundancy. Your favorite web archiving service might be down at some point, or blocked by certain websites. If you archive a page with two or more services, the archived copy stays reachable as long as at least one of them remains available.

Browser extensions with similar functionality may exist, but this tool may suit you better if you need to archive a large number of URLs, or if you simply prefer a CUI.

Installation

Install it with RubyGems:

$ gem install webarchive

Usage

Launch it with

$ webarchive

and enter URIs on standard input.

If you have a list of URIs in a file, pipe it in:

$ cat list.txt | webarchive
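
Before piping a large list, it can help to tidy it up first. The following is a minimal Ruby sketch, not part of the webarchive gem, that drops blank lines, duplicates, and entries that are not http(s) URIs. The helper name clean_uri_list is hypothetical, and whether webarchive itself performs similar filtering is not assumed here.

```ruby
require "uri"

# Hypothetical helper for illustration; not part of the webarchive gem.
# Takes an enumerable of raw lines and returns unique http(s) URIs as strings.
def clean_uri_list(lines)
  lines.map(&:strip)       # remove surrounding whitespace and newlines
       .reject(&:empty?)   # skip blank lines
       .uniq               # submit each URI only once
       .select do |line|
         begin
           # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
           URI.parse(line).is_a?(URI::HTTP)
         rescue URI::InvalidURIError
           false           # not parseable as a URI at all
         end
       end
end
```

Saved in a script that ends with puts clean_uri_list($stdin.each_line), for example under the hypothetical name clean_uris.rb, it could sit in front of the tool:

$ ruby clean_uris.rb < list.txt | webarchive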

Note that, by default, this program logs every URI you enter to ~/.webarchive_history. Pass --no-history to disable this.

It accepts the following optional command-line parameters:

$ webarchive -h

Usage: webarchive [options]
    -w, --wait=SECONDS               wait for SECONDS between requests [default: 5.0]
    -r, --retry=N                    retry for N times when failed [default: 5]
    -t, --timeout=SECONDS            timeout after SECONDS [default: 60.0]
        --[no-]history               record history [default: enabled]
    -d, --debug                      add debug output, implies verbose
        --verbose
    -h, --help                       show help
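
To illustrate how the --wait, --retry and --timeout options relate to each other, here is a minimal Ruby sketch of a throttled submit loop with retries. It is not the gem's actual implementation: the method name submit_with_retry, the sleeper parameter, and the choice to wait before every attempt except the very first are assumptions made for illustration. The defaults mirror the ones in the help output above.

```ruby
require "timeout"

# Hypothetical sketch; the webarchive gem may implement this differently.
# Yields each URI to the given block, which is expected to perform the request
# and raise an exception on failure. Returns a hash of URI => :ok or :failed.
def submit_with_retry(uris, wait: 5.0, retry_count: 5, timeout: 60.0,
                      sleeper: method(:sleep))
  results = {}
  first_request = true
  uris.each do |uri|
    attempts = 0
    begin
      attempts += 1
      # Throttle: pause before every request except the very first one.
      sleeper.call(wait) unless first_request
      first_request = false
      # Abort a single attempt that takes longer than `timeout` seconds.
      Timeout.timeout(timeout) { yield uri }
      results[uri] = :ok
    rescue StandardError
      # One initial attempt plus up to `retry_count` retries.
      retry if attempts <= retry_count
      results[uri] = :failed
    end
  end
  results
end
```

Under these assumptions, a URI that keeps failing is attempted retry_count + 1 times, each attempt separated by the configured wait, before it is marked as failed. The sleeper parameter exists only so the pause can be replaced in tests.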

Development

After checking out the repo, run bundle install to install dependencies. Then, run rake spec to run the tests.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome at https://gitlab.com/yusuke.matsubara/webarchive.

License

The gem is available as open source under the terms of the MIT License.