Class: Pokeedex::Pokemon::Scrapper::Fetchers::Base

Inherits:
Object
Defined in:
lib/pokeedex/pokemon/scrapper/fetchers/base.rb

Overview

Holds the Playwright instance and the methods that fetch the content of a URL. To reduce the chance of bot detection, it fakes mouse movements, scrolls the page, and generates a random viewport size. This implementation is based on the Playwright gem (

Constructor Details

#initialize(url:) ⇒ Base

Returns a new instance of Base.



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 26

def initialize(url:)
  @url = url
  @playwright_exec = Playwright.create(playwright_cli_executable_path: 'npx playwright')
  @playwright = playwright_exec.playwright
end

Instance Attribute Details

#playwright ⇒ Object (readonly)

The Playwright instance used to drive the browser and the page when fetching the content of the URL.



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 24

def playwright
  @playwright
end

#playwright_exec ⇒ Object (readonly)

The Playwright execution object returned by Playwright.create. The playwright attribute is read from it.



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 20

def playwright_exec
  @playwright_exec
end

#url ⇒ Object (readonly)

The URL to fetch the content from



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 16

def url
  @url
end

Instance Method Details

#browser(&block) ⇒ Object

Opens a browser instance, passes it to the block, and closes the browser once the block has finished, even if the block raises.



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 51

def browser(&block)
  block.call(chromium)
ensure
  chromium.close
end
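
The open, yield, close pattern described above can be illustrated without Playwright. The following is a minimal sketch, assuming a hypothetical FakeBrowser stand-in and a hypothetical with_browser helper (neither is part of the gem). It shows that an ensure clause closes the resource even when the block raises, and that the block's value is still returned.

```ruby
# Hypothetical stand-in for a browser object; not part of Playwright or Pokeedex.
class FakeBrowser
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Hypothetical helper mirroring the shape of #browser: yield the resource,
# then close it in an ensure clause so cleanup runs even if the block raises.
def with_browser(browser)
  yield browser
ensure
  browser.close
end

# Normal path: the block's value is returned and the browser is closed.
ok_browser = FakeBrowser.new
result = with_browser(ok_browser) { |b| "page html from #{b.class}" }

# Error path: the exception propagates, but the browser is still closed.
failing_browser = FakeBrowser.new
begin
  with_browser(failing_browser) { |_b| raise 'navigation failed' }
rescue RuntimeError => e
  error_message = e.message
end
```

Note that this sketch closes the same object it yielded. Whether chromium in the class returns the same instance on every call is not shown on this page.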

#content ⇒ Object

Fetches the page at the URL and returns its content as an HTML string. Raises Exceptions::ScrapperError if Playwright reports an error while fetching.



# File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 34

def content
  browser do |context|
    page = context.new_page(viewport: generate_random_viewport)
    page.goto(url)
    page.wait_for_load_state

    fake_mouse_movements(page, steps: 6)
    fake_scroll_page_down_and_up(page)

    page.content
  end
rescue Playwright::Error
  raise Exceptions::ScrapperError
end
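
#content calls generate_random_viewport, fake_mouse_movements and fake_scroll_page_down_and_up, whose definitions are not shown on this page. As an illustration only, the sketch below shows one way a random viewport and a list of mouse waypoints could be produced in plain Ruby. The module name, the size ranges and the return shapes are all assumptions, not the gem's actual implementation.

```ruby
# Hypothetical helpers; names, ranges and return shapes are assumptions.
module ViewportSketch
  WIDTHS  = (1024..1920).freeze
  HEIGHTS = (720..1080).freeze

  # Returns a hash with :width and :height, the shape assumed here for
  # the viewport option passed to new_page in #content.
  def self.random_viewport(rng: Random.new)
    { width: rng.rand(WIDTHS), height: rng.rand(HEIGHTS) }
  end

  # Returns `steps` random [x, y] points inside the viewport, which a
  # caller could feed to a mouse-move call one at a time.
  def self.mouse_waypoints(viewport, steps:, rng: Random.new)
    Array.new(steps) do
      [rng.rand(0...viewport[:width]), rng.rand(0...viewport[:height])]
    end
  end
end

rng = Random.new(42) # seeded so repeated runs give the same values
viewport = ViewportSketch.random_viewport(rng: rng)
waypoints = ViewportSketch.mouse_waypoints(viewport, steps: 6, rng: rng)
```

Passing the random number generator as an argument keeps the helpers deterministic under test while still allowing an unseeded generator in normal use.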