Class: DaimonSkycrawlers::Filter::DuplicateChecker

Inherits:
Base
  • Object
Defined in:
lib/daimon_skycrawlers/filter/duplicate_checker.rb

Overview

This filter checks whether a given URL has already been seen.

Use it to skip processing of duplicated URLs.

Instance Method Summary

Methods inherited from Base

#storage

Constructor Details

#initialize(base_url: nil) ⇒ DuplicateChecker

Returns a new instance of DuplicateChecker.



# File 'lib/daimon_skycrawlers/filter/duplicate_checker.rb', line 12

def initialize(base_url: nil)
  @base_url = nil
  @base_url = URI(base_url) if base_url
  @urls = Set.new
end
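
The constructor stores base_url as a URI object so that relative URLs can later be resolved against it. The following is a minimal sketch of how Ruby's standard URI library resolves references with +. The example URLs are assumptions chosen for illustration, and the snippet does not use the gem itself.

```ruby
require "uri"

# Illustrative base URL (an assumption, not taken from the library).
base = URI("http://example.com/blog/")

# URI#+ resolves a reference against the base URI.
relative      = base + "post/1"                     # path relative to /blog/
root_relative = base + "/about"                     # path relative to the host root
absolute      = base + "http://other.example.org/x" # already absolute, returned unchanged

puts relative       # http://example.com/blog/post/1
puts root_relative  # http://example.com/about
puts absolute       # http://other.example.org/x
```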

Instance Method Details

#call(message) ⇒ true|false

Returns false when the URL has already been seen; otherwise records the URL and returns true.

Parameters:

  • message (Hash)

    The message whose :url is checked for duplication. If the given URL is relative, @base_url + url is used as the absolute URL.

Returns:

  • (true|false)

    Returns false when duplicated, otherwise true.



# File 'lib/daimon_skycrawlers/filter/duplicate_checker.rb', line 23

def call(message)
  url = normalize_url(message[:url])
  return false if @urls.include?(url)
  @urls << url
  true
end
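
To make the behavior of #call concrete, here is a self-contained sketch that mirrors the listing above without requiring the gem. SketchDuplicateChecker is a hypothetical stand-in class, and its normalize_url is an assumption: the real private method is not shown on this page, so the sketch simply resolves the URL against the base URL as the parameter description suggests.

```ruby
require "set"
require "uri"

# Hypothetical stand-in for illustration only; not the gem's class.
class SketchDuplicateChecker
  def initialize(base_url: nil)
    @base_url = base_url ? URI(base_url) : nil
    @urls = Set.new
  end

  # Returns true the first time a URL is seen, false on later sightings.
  def call(message)
    url = normalize_url(message[:url])
    return false if @urls.include?(url)
    @urls << url
    true
  end

  private

  # Assumed normalization: resolve against the base URL when one is set.
  def normalize_url(url)
    return url.to_s unless @base_url
    (@base_url + url).to_s
  end
end

checker = SketchDuplicateChecker.new(base_url: "http://example.com/")
first  = checker.call(url: "/page")                    # new URL
second = checker.call(url: "http://example.com/page")  # same page in absolute form
third  = checker.call(url: "/other")                   # new URL
p [first, second, third] # => [true, false, true]
```

Because the relative and absolute forms normalize to the same string in this sketch, the second call is reported as a duplicate.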

#duplicated?(message) ⇒ true|false

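Returns true when the URL is duplicated, otherwise false. Because this method delegates to #call, a URL that has not been seen before is recorded as a side effect.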

Parameters:

  • message (Hash)

    The message whose :url is checked for duplication. If the given URL is relative, @base_url + url is used as the absolute URL.

Returns:

  • (true|false)

    Returns true when duplicated, otherwise false.



# File 'lib/daimon_skycrawlers/filter/duplicate_checker.rb', line 35

def duplicated?(message)
  !call(message)
end
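
Since #duplicated? is the negation of #call, asking the question also records the URL. The sketch below illustrates that consequence with a hypothetical stand-in class (SketchChecker, not part of the gem) that omits URL normalization for brevity.

```ruby
require "set"

# Hypothetical stand-in for illustration only; URL normalization is omitted.
class SketchChecker
  def initialize
    @urls = Set.new
  end

  # Records unseen URLs and returns true; returns false for seen URLs.
  def call(message)
    url = message[:url]
    return false if @urls.include?(url)
    @urls << url
    true
  end

  # Negation of #call, so an unseen URL is recorded here as well.
  def duplicated?(message)
    !call(message)
  end
end

checker = SketchChecker.new
message = { url: "http://example.com/a" }

results = [
  checker.duplicated?(message), # first sighting: not a duplicate, but now recorded
  checker.duplicated?(message), # second sighting: duplicate
  checker.call(message)         # already recorded by the first duplicated? call
]
p results # => [false, true, false]
```

In practice this means a caller would typically use either #call or #duplicated? for a given message, not both.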