Snapcrawl - crawl a website and take screenshots

Snapcrawl is a command line utility for crawling a website and saving screenshots.

Features

Using Docker

You can run Snapcrawl using this Docker image, which contains all the necessary prerequisites:

```shell
$ alias snapcrawl='docker run --rm -it --volume $PWD:/app dannyben/snapcrawl'
```

For more information on the Docker image, refer to the docker-snapcrawl repository.
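Shell aliases are generally not expanded inside non-interactive scripts, so a small wrapper function may be more convenient there. The following is a sketch rather than part of the Snapcrawl documentation: it wraps the same `docker run` command as the alias and forwards all arguments to the container.

```shell
# Remove a previously defined alias of the same name, if any,
# so that the function definition below is not alias-expanded.
unalias snapcrawl 2>/dev/null || true

# Wrapper function equivalent to the alias; usable inside scripts.
# "$@" forwards the URL and any settings to the container unchanged.
snapcrawl() {
  docker run --rm -it --volume "$PWD:/app" dannyben/snapcrawl "$@"
}

# Example call (requires Docker):
# snapcrawl example.com depth=2
```

The `-it` flags are kept from the alias. They expect a terminal, so they may need to be dropped in environments without one, such as CI jobs.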

Using Ruby

```shell
$ gem install snapcrawl
```

Note that Snapcrawl requires PhantomJS and ImageMagick.

Snapcrawl can be configured either through a YAML configuration file or by specifying options on the command line.

```shell
$ snapcrawl
Usage:
  snapcrawl URL [--config FILE] [SETTINGS...]
  snapcrawl -h | --help
  snapcrawl -v | --version
```

The default configuration filename is snapcrawl.yml.

If the file passed to the --config flag does not exist, Snapcrawl creates a template configuration file:

```shell
$ snapcrawl example.com --config snapcrawl
```

All configuration options can be specified on the command line as key=value pairs:

```shell
$ snapcrawl example.com log_level=0 depth=2 width=1024
```
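The same three settings could also live in a configuration file. The sketch below writes a `snapcrawl.yml` that mirrors the key=value example; it assumes that keys omitted from the file fall back to their defaults, which this README does not state explicitly. It overwrites any existing `snapcrawl.yml`, so try it in a scratch directory.

```shell
# Write a config file equivalent to:
#   snapcrawl example.com log_level=0 depth=2 width=1024
# Caution: this overwrites any existing snapcrawl.yml in the current directory.
cat > snapcrawl.yml <<'EOF'
log_level: 0
depth: 2
width: 1024
EOF

# Then crawl using the file (assumes snapcrawl is installed or aliased):
# snapcrawl example.com --config snapcrawl
```

Keeping settings in a file may be preferable when the same crawl is repeated, while key=value pairs suit one-off overrides.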

```yaml
# All values below are the default values

log_level: 1

# yes = always show log color
# no = never use colors
# auto = only use colors when running in an interactive terminal
log_color: auto

depth: 1

width: 1280

height: 0

cache_life: 86400

cache_dir: cache

snaps_dir: snaps

# slug version of the URL (no need to include the .png extension)
name_template: '%url'

url_whitelist:

url_blacklist:

css_selector:
```
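Because every setting can be passed as a key=value pair, a script can vary settings per site. The sketch below only prints the commands it would run (a dry run). The function name `plan_crawls` and the second site `example.org` are hypothetical, and it assumes `snaps_dir` accepts a nested path such as `snaps/example.com`, which this README does not confirm.

```shell
# Print one snapcrawl command per site (dry run; nothing is executed).
# depth and snaps_dir are settings from the configuration listing.
plan_crawls() {
  for site in "$@"; do
    echo "snapcrawl $site depth=2 snaps_dir=snaps/$site"
  done
}

plan_crawls example.com example.org
```

Once the printed commands look right, they could be run by piping the output to `sh`, provided `snapcrawl` is available as a command in that shell.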

If you experience any issue, have a question or a suggestion, or if you wish to contribute, feel free to open an issue.