Class: Githubgrab::Downloader

Inherits:
Object
Defined in:
lib/githubgrab/downloader.rb

Constant Summary

TIME_VALIDITY =

Images are considered valid if they were downloaded within the last hour (3600 seconds).

60*60.to_f

THREAD_LIMIT =

Maximum number of concurrent downloads. Tune this to your disk I/O capacity and network bandwidth.

10.to_i
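
The listing does not show how TIME_VALIDITY is applied, so the following is only a minimal sketch of one way such a freshness check could look. It assumes validity is judged by the file's modification time; the helper name `fresh?` and the file name are hypothetical and not part of Githubgrab.

```ruby
require "tmpdir"

# Assumed to mirror the documented constant: one hour, in seconds.
TIME_VALIDITY = 60 * 60.to_f

# Hypothetical helper, not part of Githubgrab: true when the file exists
# and was last modified less than TIME_VALIDITY seconds before `now`.
def fresh?(path, now = Time.now)
  File.exist?(path) && (now - File.mtime(path)) < TIME_VALIDITY
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "avatar.png")
  File.write(path, "fake image bytes")

  puts fresh?(path)                   # just written, so within the hour
  puts fresh?(path, Time.now + 7200)  # two hours later, treated as stale
end
```

A check like this would let a downloader skip files that were fetched recently instead of requesting them again.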

Instance Method Summary

Constructor Details

#initialize(urls, destination) ⇒ Downloader

Returns a new instance of Downloader.



# File 'lib/githubgrab/downloader.rb', line 11

def initialize(urls, destination)
  @urls = urls                # collection of URLs, iterated in #save_all
  @destination = destination  # target location for the saved files
end

Instance Method Details

#save_all ⇒ Object

Downloads and saves the files, running at most THREAD_LIMIT downloads at a time. Because there is no queue system to retry failed requests, it is better to be cautious and stay within I/O and bandwidth limits. TODO: add something like Sidekiq to retry failed requests.



# File 'lib/githubgrab/downloader.rb', line 19

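def save_all
  # Thread.pool is not core Ruby; it is presumably provided by the `thread` gem.
  pool = Thread.pool(THREAD_LIMIT)
  create_folder
  @urls.each do |url|
    # Queue one save per URL; the pool runs at most THREAD_LIMIT at once.
    pool.process { save url }
  end
  # Let the queued work finish, then stop the pool.
  pool.shutdown
end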
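
For readers without a thread-pool library, the same bounded-concurrency idea can be sketched with Ruby's standard `Queue` and `Thread` classes. This is an illustrative sketch, not the gem's implementation: `WORKERS` stands in for THREAD_LIMIT, `fake_save` is a hypothetical stand-in for the `save` method (which is not shown in this listing), and the example.com URLs are placeholders.

```ruby
# Stand-in for THREAD_LIMIT; a small value keeps the demo quick.
WORKERS = 3

# Hypothetical stand-in for the real save step: it only records the URL.
def fake_save(url, results)
  sleep 0.01      # simulate network and disk latency
  results << url  # Queue#<< is thread-safe
end

urls = (1..10).map { |i| "https://example.com/image#{i}.png" }

# Fill a job queue, then close it so that pop returns nil once it is drained.
jobs = Queue.new
urls.each { |url| jobs << url }
jobs.close

results = Queue.new
workers = Array.new(WORKERS) do
  Thread.new do
    # Each worker pulls URLs until the closed queue is empty.
    while (url = jobs.pop)
      fake_save(url, results)
    end
  end
end

# Block until every worker has finished its share of the queue.
workers.each(&:join)

puts "saved #{results.size} of #{urls.size} URLs"
```

A fixed set of workers pulling from one queue caps the number of simultaneous downloads at `WORKERS`, which is the same limit THREAD_LIMIT places on the pool. A failed download could also be pushed back onto the queue, which is one lightweight alternative to the retry system mentioned in the TODO.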