Class: RightScraper::Retrievers::Base

Inherits:
Object
  • Object
Defined in:
lib/right_scraper/retrievers/base.rb

Overview

Base class for all retrievers.

Retrievers fetch remote repositories into a given path. They attempt to fetch incrementally when possible (e.g., by leveraging the incremental capabilities of the underlying source control management system).

Direct Known Subclasses

CheckoutBase, Download

Defined Under Namespace

Classes: RetrieverError

Constant Summary

@@types = {}

(Hash) Lookup table from textual description of scraper type (‘cookbook’ or ‘workflow’ currently) to the class that represents that scraper.


Constructor Details

#initialize(repository, options = {}) ⇒ Base

Create a new retriever for the given repository. This class recognizes several options, and subclasses may recognize additional options. The :basedir and :logger options are required; all others are optional.

Options

:basedir

Required. Base directory into which all files are retrieved

:max_bytes

Maximum number of bytes to read

:max_seconds

Maximum number of seconds to spend reading

:logger

Required. Logger to use

Parameters

repository(RightScraper::Repositories::Base)

repository to scrape

options(Hash)

retriever options

Raise

‘Missing base directory’

if :basedir option is missing

::ArgumentError

if :logger option is missing



# File 'lib/right_scraper/retrievers/base.rb', line 61

def initialize(repository, options={})
  raise 'Missing base directory' unless options[:basedir]
  @repository = repository
  @max_bytes = options[:max_bytes] || nil
  @max_seconds = options[:max_seconds] || nil
  @basedir = options[:basedir]
  @repo_dir = RightScraper::Retrievers::Base.repo_dir(@basedir, repository)
  unless @logger = options[:logger]
    raise ::ArgumentError, ':logger is required'
  end
  @logger.operation(:initialize, "setting up in #{@repo_dir}") do
    ::FileUtils.mkdir_p(@repo_dir)
  end
end
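
The option checks in the constructor can be exercised in isolation. The following is a minimal sketch, not the library's code: SketchRetriever and construction_error are hypothetical names introduced here, and the sketch copies only the two validation steps and the option defaults from the listing above (it does not create any directory or call the logger).

```ruby
# Hypothetical stand-in for the constructor's option handling.
# SketchRetriever is NOT part of right_scraper.
class SketchRetriever
  attr_reader :basedir, :max_bytes, :max_seconds, :logger

  def initialize(options = {})
    # Same order as the listing: :basedir is checked before :logger.
    raise 'Missing base directory' unless options[:basedir]
    raise ::ArgumentError, ':logger is required' unless options[:logger]

    @basedir     = options[:basedir]
    @max_bytes   = options[:max_bytes]    # nil when not given
    @max_seconds = options[:max_seconds]  # nil when not given
    @logger      = options[:logger]
  end
end

# Hypothetical helper: returns the exception raised during construction,
# or nil when construction succeeds.
def construction_error(options)
  SketchRetriever.new(options)
  nil
rescue StandardError => e
  e
end

missing_basedir = construction_error(logger: Object.new)
missing_logger  = construction_error(basedir: '/tmp/scrape')
ok = SketchRetriever.new(basedir: '/tmp/scrape', logger: Object.new, max_bytes: 1024)

puts missing_basedir.class  # RuntimeError
puts missing_logger.class   # ArgumentError
puts ok.max_seconds.inspect # nil
```

A bare raise with a string produces a RuntimeError, so callers that want to distinguish the two failures would need to rescue RuntimeError and ::ArgumentError separately.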

Instance Attribute Details

#logger ⇒ Object (readonly)

Returns the value of attribute logger.



# File 'lib/right_scraper/retrievers/base.rb', line 40

def logger
  @logger
end

#max_bytes ⇒ Object

Returns the value of attribute max_bytes.



# File 'lib/right_scraper/retrievers/base.rb', line 38

def max_bytes
  @max_bytes
end

#max_seconds ⇒ Object

Returns the value of attribute max_seconds.



# File 'lib/right_scraper/retrievers/base.rb', line 38

def max_seconds
  @max_seconds
end

#repo_dir ⇒ Object (readonly)

Returns the value of attribute repo_dir.



# File 'lib/right_scraper/retrievers/base.rb', line 40

def repo_dir
  @repo_dir
end

#repository ⇒ Object (readonly)

Returns the value of attribute repository.



# File 'lib/right_scraper/retrievers/base.rb', line 40

def repository
  @repository
end

Class Method Details

.repo_dir(root_dir, repo) ⇒ Object

Path to the directory where the given repo should be, or already was, downloaded

Parameters

root_dir(String)

Path to directory containing all scraped repositories

repo(Hash|RightScraper::Repositories::Base)

Remote repository corresponding to local directory

Return

String

Path to local directory that corresponds to given repository



# File 'lib/right_scraper/retrievers/base.rb', line 103

def self.repo_dir(root_dir, repo)
  repo = ::RightScraper::Repositories::Base.from_hash(repo) if repo.is_a?(Hash)
  dir_name  = repo.repository_hash
  dir_path  = ::File.join(root_dir, dir_name)
  "#{dir_path}/repo"
end
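
To make the resulting directory layout concrete, here is a small sketch of the same path computation. FakeRepo and sketch_repo_dir are hypothetical names used only for illustration, and the hash value 'abc123' is made up; the sketch assumes only that the repository object responds to repository_hash, as in the listing above, and it skips the Hash conversion step.

```ruby
# Hypothetical stand-in for a repository object; the path computation
# only needs #repository_hash.
FakeRepo = Struct.new(:repository_hash)

# Mirrors the last three lines of the listing above:
# <root_dir>/<repository_hash>/repo
def sketch_repo_dir(root_dir, repo)
  dir_path = ::File.join(root_dir, repo.repository_hash)
  "#{dir_path}/repo"
end

path = sketch_repo_dir('/var/scrape', FakeRepo.new('abc123'))
puts path  # /var/scrape/abc123/repo
```

Each repository therefore gets its own subdirectory under the root, keyed by its hash, with the checkout itself living in a nested repo directory.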

Instance Method Details

#available? ⇒ Boolean

Determines if retriever is available (has required CLI tools, etc.)

Returns:

  • (Boolean)

Raises:

  • (::NotImplementedError)


# File 'lib/right_scraper/retrievers/base.rb', line 77

def available?
  raise ::NotImplementedError
end

#ignorable_paths ⇒ Object

Paths to ignore when traversing the filesystem. Mostly used for things like Git and Subversion version control directories.

Return

list(Array)

list of filenames to ignore.



# File 'lib/right_scraper/retrievers/base.rb', line 86

def ignorable_paths
  []
end

#retrieve ⇒ Object

Retrieves the repository. Subclasses must override this method.

Raises:

  • (::NotImplementedError)


# File 'lib/right_scraper/retrievers/base.rb', line 91

def retrieve
  raise ::NotImplementedError
end
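
The three instance methods above form the contract that a concrete retriever fills in. The sketch below shows one way a subclass might do so. SketchBase and SketchGitRetriever are hypothetical classes written for this example (they are not right_scraper classes), and the return values of the overrides are placeholders rather than real retrieval logic.

```ruby
# Hypothetical stand-in that mirrors the three methods documented above.
class SketchBase
  def available?
    raise ::NotImplementedError
  end

  def ignorable_paths
    []
  end

  def retrieve
    raise ::NotImplementedError
  end
end

# Hypothetical subclass; a real retriever would shell out to a tool
# or download files instead of returning placeholder values.
class SketchGitRetriever < SketchBase
  def available?
    true
  end

  def ignorable_paths
    ['.git']
  end

  def retrieve
    :retrieved
  end
end

retriever = SketchGitRetriever.new
result = retriever.available? ? retriever.retrieve : nil

# Drop version-control directories from a directory listing.
entries = ['.git', 'metadata.rb', 'recipes']
visible = entries - retriever.ignorable_paths

# Calling an abstract method on the base class raises.
begin
  SketchBase.new.retrieve
rescue ::NotImplementedError => e
  base_error = e
end

puts result.inspect   # :retrieved
puts visible.inspect  # ["metadata.rb", "recipes"]
puts base_error.class # NotImplementedError
```

Note that in Ruby NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue clause does not catch it; callers probing an abstract method need to name the exception explicitly, as done here.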