Module: DaimonSkycrawlers

Defined in:
lib/daimon_skycrawlers.rb,
lib/daimon_skycrawlers/cli.rb,
lib/daimon_skycrawlers/queue.rb,
lib/daimon_skycrawlers/timer.rb,
lib/daimon_skycrawlers/config.rb,
lib/daimon_skycrawlers/filter.rb,
lib/daimon_skycrawlers/logger.rb,
lib/daimon_skycrawlers/crawler.rb,
lib/daimon_skycrawlers/storage.rb,
lib/daimon_skycrawlers/version.rb,
lib/daimon_skycrawlers/consumer.rb,
lib/daimon_skycrawlers/callbacks.rb,
lib/daimon_skycrawlers/processor.rb,
lib/daimon_skycrawlers/filter/base.rb,
lib/daimon_skycrawlers/storage/rdb.rb,
lib/daimon_skycrawlers/configurable.rb,
lib/daimon_skycrawlers/consumer/url.rb,
lib/daimon_skycrawlers/crawler/base.rb,
lib/daimon_skycrawlers/storage/base.rb,
lib/daimon_skycrawlers/storage/file.rb,
lib/daimon_skycrawlers/storage/null.rb,
lib/daimon_skycrawlers/consumer/base.rb,
lib/daimon_skycrawlers/generator/new.rb,
lib/daimon_skycrawlers/processor/base.rb,
lib/daimon_skycrawlers/processor/proc.rb,
lib/daimon_skycrawlers/sitemap_parser.rb,
lib/daimon_skycrawlers/commands/runner.rb,
lib/daimon_skycrawlers/crawler/default.rb,
lib/daimon_skycrawlers/commands/enqueue.rb,
lib/daimon_skycrawlers/processor/spider.rb,
lib/daimon_skycrawlers/generator/crawler.rb,
lib/daimon_skycrawlers/processor/default.rb,
lib/daimon_skycrawlers/generator/generate.rb,
lib/daimon_skycrawlers/generator/processor.rb,
lib/daimon_skycrawlers/filter/update_checker.rb,
lib/daimon_skycrawlers/consumer/http_response.rb,
lib/daimon_skycrawlers/filter/duplicate_checker.rb,
lib/daimon_skycrawlers/filter/robots_txt_checker.rb

Defined Under Namespace

Modules: Callbacks, Commands, ConfigMixin, Configurable, Consumer, Crawler, Filter, Generator, LoggerMixin, Processor, Storage, Timer

Classes: CLI, Configuration, Logger, Queue, SitemapParser

Constant Summary

VERSION = "0.11.3"

Class Method Details

.configuration ⇒ DaimonSkycrawlers::Configuration

Retrieve the configuration object. It is created with default settings on the first call and reused afterwards.



# File 'lib/daimon_skycrawlers.rb', line 46

def configuration
  @configuration ||= DaimonSkycrawlers::Configuration.new.tap do |config|
    config.logger = DaimonSkycrawlers::Logger.default
    config.queue_name_prefix = "daimon-skycrawlers"
    config.crawler_interval = 1
    config.shutdown_interval = 10
  end
end

.configure {|configuration| ... } ⇒ void

This method returns an undefined value.

Configure DaimonSkycrawlers

Yields:

  • (configuration)

Yield Parameters:

  • configuration (DaimonSkycrawlers::Configuration)

Yield Returns:

  • (void)


# File 'lib/daimon_skycrawlers.rb', line 62

def configure
  yield configuration
end
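
The memoize-then-yield pattern used by `.configuration` and `.configure` can be sketched in isolation. The module `SketchCrawlers` and its `Struct`-based `Configuration` below are stand-ins invented for illustration, not the gem's own classes, and the logger default is left out.

```ruby
# Stand-in for illustration only; not the gem's own classes.
module SketchCrawlers
  # Hypothetical configuration holder with the settings named above.
  Configuration = Struct.new(:logger, :queue_name_prefix,
                             :crawler_interval, :shutdown_interval)

  class << self
    # Build the configuration once, then return the same object on every call.
    def configuration
      @configuration ||= Configuration.new.tap do |config|
        config.queue_name_prefix = "daimon-skycrawlers"
        config.crawler_interval = 1
        config.shutdown_interval = 10
      end
    end

    # Yield the shared configuration so callers can override defaults.
    def configure
      yield configuration
    end
  end
end

SketchCrawlers.configure do |config|
  config.crawler_interval = 5 # override one default, keep the rest
end

SketchCrawlers.configuration.crawler_interval  # 5
SketchCrawlers.configuration.shutdown_interval # 10
```

Because the object is memoized, settings changed inside the block remain visible to every later call of `configuration`.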

.env ⇒ Object

Return the current environment name: the value of SKYCRAWLERS_ENV, or "development" when that variable is not set.



# File 'lib/daimon_skycrawlers.rb', line 81

def env
  ENV["SKYCRAWLERS_ENV"] || "development"
end
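
The fallback behaviour can be checked with a standalone copy of the lookup. The method name `current_env` and its `vars` parameter are hypothetical, introduced here so that the example does not depend on the real process environment.

```ruby
# Standalone copy of the lookup shown above; `vars` defaults to the real ENV.
def current_env(vars = ENV)
  vars["SKYCRAWLERS_ENV"] || "development"
end

current_env({})                                     # "development"
current_env({ "SKYCRAWLERS_ENV" => "production" })  # "production"
current_env({ "SKYCRAWLERS_ENV" => "" })            # "" (an empty string is truthy in Ruby)
```

The last call shows one consequence of using `||`: a variable that is set but empty does not fall back to "development".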

.load_init ⇒ void

This method returns an undefined value.

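Load “config/init.rb” from the current working directory. If the file cannot be loaded, print the error message and exit with a failure status.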



# File 'lib/daimon_skycrawlers.rb', line 71

def load_init
  require(File.expand_path("config/init.rb", Dir.pwd))
rescue LoadError => ex
  puts ex.message
  exit(false)
end
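
The `require`/`rescue LoadError` flow can be exercised without terminating the process. The sketch below uses a hypothetical variant, `try_load_init`, which takes the base directory as an argument and returns `false` where the method above calls `exit(false)`; the temporary directory and the `INIT_LOADED` constant are also assumptions made for the demonstration.

```ruby
require "tmpdir"

# Hypothetical variant of load_init: takes the base directory as an argument
# and returns false on failure instead of calling exit(false).
def try_load_init(base_dir)
  require(File.expand_path("config/init.rb", base_dir))
  true
rescue LoadError => ex
  warn ex.message
  false
end

outcomes = []
Dir.mktmpdir do |dir|
  outcomes << try_load_init(dir) # no config/init.rb yet
  Dir.mkdir(File.join(dir, "config"))
  File.write(File.join(dir, "config", "init.rb"), "INIT_LOADED = true\n")
  outcomes << try_load_init(dir) # the file now exists
end
# outcomes == [false, true]
```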

.register_crawler(crawler) ⇒ void

This method returns an undefined value.

Register a crawler

Parameters:

  • crawler (Crawler)

    An instance that implements the `fetch` method



# File 'lib/daimon_skycrawlers.rb', line 37

def register_crawler(crawler)
  DaimonSkycrawlers::Consumer::URL.register(crawler)
end

.register_processor(processor) ⇒ void
.register_processor {|message| ... } ⇒ void

Register a processor

Overloads:

  • .register_processor(processor) ⇒ void

    This method returns an undefined value.

    Parameters:

    • processor (Processor)

      An instance that implements the `call` method

  • .register_processor {|message| ... } ⇒ void

    This method returns an undefined value.

    Yields:

    • (message)

      Register given block as a processor.

    Yield Parameters:

    • message (Hash)

      A message from queue

    Yield Returns:

    • (void)


# File 'lib/daimon_skycrawlers.rb', line 27

def register_processor(processor = nil, &block)
  DaimonSkycrawlers::Consumer::HTTPResponse.register(processor, &block)
end
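
The two overloads can be illustrated with a small stand-in registry. Everything below is an assumption made for the sketch: `SketchRegistry` replaces the gem's `Consumer::HTTPResponse`, the argument check is added for the example, and the `:url` key is only a guess at what a queue message might contain.

```ruby
# Hypothetical stand-in registry; the gem's Consumer::HTTPResponse is not used here.
module SketchRegistry
  @processors = []

  class << self
    attr_reader :processors

    # Accept either an object responding to #call or a block.
    def register_processor(processor = nil, &block)
      callable = block || processor
      unless callable.respond_to?(:call)
        raise ArgumentError, "processor or block required"
      end
      @processors << callable
    end
  end
end

# Overload 1: an object that implements `call`.
class UrlLabeller
  def call(message)
    "processed #{message[:url]}"
  end
end

SketchRegistry.register_processor(UrlLabeller.new)

# Overload 2: a block that receives the message.
SketchRegistry.register_processor { |message| message[:url].length }

message = { url: "http://example.com/" } # assumed message shape
results = SketchRegistry.processors.map { |processor| processor.call(message) }
# results == ["processed http://example.com/", 19]
```

Both forms end up as objects that respond to `call`, so whatever dispatches messages can treat them uniformly.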