Class: Pokeedex::Pokemon::Scrapper::Fetchers::Base
- Inherits: Object
  - Object
    - Pokeedex::Pokemon::Scrapper::Fetchers::Base
- Defined in: lib/pokeedex/pokemon/scrapper/fetchers/base.rb
Overview
Holds the Playwright instance and the methods used to fetch the content of a URL. To avoid detection, it fakes mouse movements, scrolls the page, and generates a random viewport size. This implementation is based on the Playwright gem.
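The random viewport mentioned above can be sketched roughly as follows. The class's own viewport helper is not shown on this page, so `random_viewport`, `VIEWPORT_WIDTHS`, and `VIEWPORT_HEIGHTS` are hypothetical names and the ranges are assumptions. The only grounded part is that Playwright's `viewport` option takes a hash with `width` and `height`.

```ruby
# Hypothetical helper: the real viewport method in Base is not shown here.
# The ranges below are assumed, chosen to resemble common desktop sizes.
VIEWPORT_WIDTHS  = (1280..1920).freeze
VIEWPORT_HEIGHTS = (720..1080).freeze

# Returns a hash in the shape Playwright expects for its `viewport` option.
# An injectable RNG keeps the helper deterministic in tests.
def random_viewport(rng: Random.new)
  { width: rng.rand(VIEWPORT_WIDTHS), height: rng.rand(VIEWPORT_HEIGHTS) }
end

viewport = random_viewport
puts "viewport: #{viewport[:width]}x#{viewport[:height]}"
```

Varying the size per page load means successive requests do not all share one fixed window size.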
Instance Attribute Summary
-
#playwright ⇒ Object
readonly
The Playwright instance used to interact with the browser and the page when fetching content from the URL.
-
#playwright_exec ⇒ Object
readonly
The Playwright executable instance to use.
-
#url ⇒ Object
readonly
The URL to fetch the content from.
Instance Method Summary
-
#browser(&block) ⇒ Object
Opens a browser instance, yields it to the block, and closes it once the block has run, even if the block raises.
-
#content ⇒ Object
Fetches the content of the URL and returns it as an HTML string, or raises an exception if the content could not be fetched.
-
#initialize(url:) ⇒ Base
constructor
A new instance of Base.
Constructor Details
#initialize(url:) ⇒ Base
Returns a new instance of Base.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 26
    def initialize(url:)
      @url = url
      @playwright_exec = Playwright.create(playwright_cli_executable_path: 'npx playwright')
      @playwright = playwright_exec.playwright
    end
Instance Attribute Details
#playwright ⇒ Object (readonly)
The Playwright instance used to interact with the browser and the page when fetching content from the URL.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 24
    def playwright
      @playwright
    end
#playwright_exec ⇒ Object (readonly)
The Playwright executable instance to use.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 20
    def playwright_exec
      @playwright_exec
    end
#url ⇒ Object (readonly)
The URL to fetch the content from.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 16
    def url
      @url
    end
Instance Method Details
#browser(&block) ⇒ Object
Opens a browser instance, yields it to the block, and closes it once the block has run, even if the block raises.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 51
    def browser(&block)
      block.call(chromium)
    ensure
      chromium.close
    end
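The open, yield, and always-close pattern used by `#browser` can be illustrated without a real browser. This is a minimal sketch: `FakeBrowser` and `with_browser` are hypothetical stand-ins, not part of the gem, and the actual `chromium` helper is not shown on this page.

```ruby
# Stand-in for a browser handle; it only records whether it was closed.
class FakeBrowser
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Yields the resource to the block and closes it afterwards.
# The `ensure` clause runs whether the block returns or raises,
# and it does not change the method's return value.
def with_browser(browser)
  yield browser
ensure
  browser.close
end

fake = FakeBrowser.new
result = with_browser(fake) { |b| "used #{b.class}" }
puts result       # prints "used FakeBrowser"
puts fake.closed  # prints "true"
```

Because the cleanup sits in `ensure`, callers never need to close the browser themselves, and an exception inside the block still propagates after the close.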
#content ⇒ Object
Fetches the content of the URL and returns it as an HTML string. Raises Exceptions::ScrapperError if Playwright fails to fetch the content.
    # File 'lib/pokeedex/pokemon/scrapper/fetchers/base.rb', line 34
    def content
      browser do |context|
        page = context.new_page(viewport:)
        page.goto(url)
        page.wait_for_load_state

        fake_mouse_movements(page, steps: 6)
        fake_scroll_page_down_and_up(page)

        page.content
      end
    rescue Playwright::Error
      raise Exceptions::ScrapperError
    end
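The bodies of `fake_mouse_movements` and `fake_scroll_page_down_and_up` are not shown on this page. The sketch below is one plausible shape for them, not the gem's code. It assumes that `steps:` means the number of pointer moves and that the page area is 800x600. `RecordingMouse` and `FakePage` are hypothetical stand-ins that record calls, so the example runs without Playwright. Their `move(x, y, steps:)` and `wheel(delta_x, delta_y)` methods mirror the Playwright mouse API.

```ruby
# Records calls instead of driving a real browser (stand-in for page.mouse).
class RecordingMouse
  attr_reader :calls

  def initialize
    @calls = []
  end

  def move(x, y, steps: 1)
    @calls << [:move, x, y, steps]
  end

  def wheel(delta_x, delta_y)
    @calls << [:wheel, delta_x, delta_y]
  end
end

# Stand-in for a Playwright page; it exposes only the mouse.
FakePage = Struct.new(:mouse)

# Assumed behaviour: move the pointer to `steps` random points
# inside an 800x600 area, interpolating each move over 5 events.
def fake_mouse_movements(page, steps:, rng: Random.new)
  steps.times do
    page.mouse.move(rng.rand(0..800), rng.rand(0..600), steps: 5)
  end
end

# Assumed behaviour: scroll down by `distance` pixels, then back up.
def fake_scroll_page_down_and_up(page, distance: 600)
  page.mouse.wheel(0, distance)
  page.mouse.wheel(0, -distance)
end

page = FakePage.new(RecordingMouse.new)
fake_mouse_movements(page, steps: 6)
fake_scroll_page_down_and_up(page)
puts page.mouse.calls.length # prints 8 (6 moves + 2 wheel events)
```

Against a real Playwright page, the same two helpers would receive the page object created inside the `browser` block.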