Module: Zorki

Extended by:
Configuration
Defined in:
lib/zorki/scrapers/post_scraper.rb,
lib/zorki.rb,
lib/zorki/post.rb,
lib/zorki/user.rb,
lib/zorki/version.rb,
lib/zorki/scrapers/scraper.rb,
lib/zorki/scrapers/user_scraper.rb

Defined Under Namespace

Classes: ContentUnavailableError, Error, ImageRequestFailedError, ImageRequestTimedOutError, ImageRequestZeroSize, Post, PostScraper, RetryableError, Scraper, User, UserScraper, UserScrapingError

Constant Summary

VERSION =
"0.2.8"

Class Method Summary

Methods included from Configuration

configuration, define_setting

Class Method Details

.attempt_retrieve_media(url) ⇒ Object



# File 'lib/zorki.rb', line 85

def self.attempt_retrieve_media(url)
  response = Typhoeus.get(url)

  # Get the file extension if the URL has one
  stripped_url = url.split("?").first  # remove URL query params
  extension = stripped_url.split(".").last

  # Basic sanity check: discard the extension unless it is purely alphanumeric,
  # so nothing harmful ends up in the temp file name.
  if extension.length.positive?
    extension = nil unless /^[a-zA-Z0-9]+$/.match?(extension)
    extension = ".#{extension}" unless extension.nil?
  end

  temp_file_name = "#{Zorki.temp_storage_location}/instagram_media_#{SecureRandom.uuid}#{extension}"

  # Create the temp folder first, in case it doesn't exist yet
  self.create_temp_storage_location
  File.binwrite(temp_file_name, response.body)

  temp_file_name
end
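
The extension handling above can be tried in isolation. The following is a minimal sketch, not part of Zorki: `safe_extension` is a hypothetical helper name, and it swaps the `^`/`$` anchors for `\A`/`\z`, since in Ruby `^` and `$` match at line boundaries rather than at the ends of the whole string.

```ruby
# Hypothetical helper (not part of Zorki) mirroring the extension check
# in attempt_retrieve_media.
def safe_extension(url)
  stripped_url = url.split("?").first.to_s       # remove URL query params
  candidate = stripped_url.split(".").last.to_s  # text after the final dot

  # \A and \z anchor to the whole string, so a candidate containing a
  # newline is rejected as well.
  return nil unless /\A[a-zA-Z0-9]+\z/.match?(candidate)

  ".#{candidate}"
end

safe_extension("https://example.com/media/photo.jpg?size=large") # => ".jpg"
safe_extension("https://example.com/media/photo")                # => nil
```

A URL without an extension yields `nil`, so the temp file name simply ends after the UUID, as in the method above.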

.retrieve_media(url) ⇒ Object

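Retries the download up to five times, because images sometimes come back with a size of zero. Returns the temp file path as soon as a file larger than 100 bytes has been written; raises ImageRequestZeroSize if every attempt fails.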



# File 'lib/zorki.rb', line 68

def self.retrieve_media(url)
  count = 0

  until count == 5
    temp_file_name = attempt_retrieve_media(url)

    # If the file is larger than 100 bytes, treat the download as successful
    return temp_file_name if File.size(temp_file_name) > 100

    # Delete the file since we want to retry
    File.delete(temp_file_name)
    count += 1
  end

  raise(ImageRequestZeroSize)
end
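
The retry pattern above can be exercised without network access. This is a rough sketch under stated assumptions: `fetch_with_retries` and `ZeroSizeError` are hypothetical stand-ins (for `retrieve_media` and `ImageRequestZeroSize` respectively), and the block passed in replaces the HTTP download.

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical stand-in for Zorki::ImageRequestZeroSize.
class ZeroSizeError < StandardError; end

# Calls the block (a stand-in for the download) up to max_attempts times
# and returns the path of the first file larger than min_bytes.
def fetch_with_retries(max_attempts: 5, min_bytes: 100)
  max_attempts.times do
    path = File.join(Dir.tmpdir, "media_sketch_#{SecureRandom.uuid}")
    File.binwrite(path, yield)
    return path if File.size(path) > min_bytes

    File.delete(path) # discard the undersized file before retrying
  end

  raise ZeroSizeError
end

# Simulated downloads: two empty responses, then a real payload.
responses = ["", "", "x" * 2048]
path = fetch_with_retries { responses.shift }
puts File.size(path) # prints 2048
File.delete(path)
```

Deleting the undersized file before each retry keeps the temp folder from filling up with empty files, which matches what the method above does.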