Down

Down is a wrapper around open-uri standard library for safe downloading of remote files.

Installation

gem 'down'

Usage

require "down"
tempfile = Down.download("http://example.com/nature.jpg")
tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>

Features

If you're downloading files from URLs that come from you, then it's probably enough to just use open-uri. However, if you're accepting URLs from your users (e.g. through remote_<avatar>_url in CarrierWave), then downloading is suddenly not as simple as it appears to be.

StringIO

Firstly, you may think that open-uri always downloads a file to disk, but that's not true. If the downloaded file has 10 KB or less, open-uri actually returns a StringIO. In my application I needed that the file is always downloaded to disk. This was obviously a wrong design decision from the MRI team, so Down patches this behaviour and always returns a Tempfile.

File extension

When using open-uri directly, the extension of the remote file is not preserved. Down patches that behaviour and preserves the file extension.

Metadata

open-uri adds some metadata to the returned file, like #content_type. Down adds #original_filename as well, which is extracted from the URL.

require "down"
tempfile = Down.download("http://example.com/nature.jpg")

tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>
tempfile.content_type #=> "image/jpeg"
tempfile.original_filename #=> "nature.jpg"

Maximum size

When you're accepting URLs from an outside source, it's a good idea to limit the filesize (because attackers want to give a lot of work to your servers). Down allows you to pass a :max_size option:

Down.download("http://example.com/image.jpg", max_size: 5 * 1024 * 1024) # 5 MB
# raises Down::TooLarge

What is the advantage over simply checking size after downloading? Well, Down terminates the download very early, as soon as it gets the Content-Length header. And if the Content-Length header is missing, Down will terminate the download as soon as it receives a chunk which surpasses the maximum size.

Redirects

By default open-uri's redirects are turned off, since open-uri doesn't have a way to limit maximum number of redirects. Instead Down itself implements following redirects, by default allowing maximum of 2 redirects.

Down.download("http://example.com/image.jpg")                   # 2 redirects allowed
Down.download("http://example.com/image.jpg", max_redirects: 5) # 5 redirects allowed
Down.download("http://example.com/image.jpg", max_redirects: 0) # 0 redirects allowed

Download errors

There are a lot of ways in which a download can fail:

  • URL is really invalid (URI::InvalidURIError)
  • URL is a little bit invalid, e.g. "http:/example.com" (Errno::ECONNREFUSED)
  • Domain wasn't not found (SocketError)
  • Domain was found, but status is 4xx or 5xx (OpenURI::HTTPError)
  • Request timeout out (Timeout::Error)

Down unifies all of these errors into one Down::NotFound error (because this is what actually happened from the outside perspective). If you want to get the actual error raised by open-uri, in Ruby 2.1 or later you can use Exception#cause:

begin
  Down.download("http://example.com")
rescue Down::Error => error
  error.cause #=> #<RuntimeError: HTTP redirection loop: http://example.com>
end

Additional options

Any additional options will be forwarded to open-uri, so you can for example add basic authentication or a timeout:

Down.download "http://example.com/image.jpg",
  http_basic_authentication: ['john', 'secret'],
  read_timeout: 5

Streaming

Down has the ability to stream remote files, yielding chunks when they're received:

Down.stream("http://example.com/image.jpg") { |chunk, content_length| ... }

The content_length argument is set from the Content-Length response header if it's present.

Copying to tempfile

Down has another "hidden" utility method, #copy_to_tempfile, which creates a Tempfile out of the given file. The #download method uses it internally, but it's also publicly available for direct use:

io # IO object that you want to copy to tempfile
tempfile = Down.copy_to_tempfile "basename.jpg", io
tempfile.path #=> "/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/down20151116-77262-jgcx65.jpg"

Supported Ruby versions

  • MRI 1.9.3
  • MRI 2.0
  • MRI 2.1
  • MRI 2.2
  • JRuby
  • Rubinius

Development

$ rake test

If you want to test across Ruby versions and you're using rbenv, run

$ bin/test-versions

License

MIT