Class: Ietf::Data::Importer::Scrapers::BaseScraper

Inherits:
Object
Defined in:
lib/ietf/data/importer/scrapers/base_scraper.rb

Overview

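Base class for web scrapers. It provides two shared helpers: #fetch_html for fetching and parsing a page, and #log for indented console output.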

Direct Known Subclasses

IetfScraper, IrtfScraper

Instance Method Summary

Instance Method Details

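#fetch_html(url) ⇒ Nokogiri::HTML::Document?

Fetch the HTML content at a URL and parse it with Nokogiri. If fetching or parsing raises an error, the error message is printed and nil is returned.

Parameters:

  • url (String)

    The URL to fetch

Returns:

  • (Nokogiri::HTML::Document, nil)

    The parsed HTML document, or nil if an error occurred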



# File 'lib/ietf/data/importer/scrapers/base_scraper.rb', line 15

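def fetch_html(url)
  # URI.open is provided by open-uri; Nokogiri::HTML parses the returned IO stream.
  Nokogiri::HTML(URI.open(url))
rescue => e
  # Any StandardError (network failure, HTTP error, malformed URL) is reported and swallowed.
  puts "  Error fetching URL #{url}: #{e.message}"
  nil
end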
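
Because fetch_html returns nil instead of raising, callers need to check the result before querying it. The following is a minimal sketch of that guard, not code from this library. OfflineScraper is a hypothetical stand-in for BaseScraper whose fetch always fails, so the example runs without the gem or network access. LinkScraper, its links method, and the URL are also assumptions made for illustration.

```ruby
# Hypothetical stand-in for BaseScraper. Its fetch_html always fails,
# which mimics the documented print-and-return-nil behavior on error.
class OfflineScraper
  def fetch_html(url)
    raise IOError, "network disabled in this sketch"
  rescue => e
    puts "  Error fetching URL #{url}: #{e.message}"
    nil
  end
end

# Hypothetical subclass: collects link targets from a page,
# or returns an empty array when the fetch failed.
class LinkScraper < OfflineScraper
  def links(url)
    doc = fetch_html(url)
    return [] if doc.nil? # fetch_html has already printed the error

    # Reached only with a real Nokogiri document: select anchors, read href.
    doc.css("a").map { |a| a["href"] }.compact
  end
end

found = LinkScraper.new.links("https://example.invalid/groups")
puts "links found: #{found.length}"
```

Returning an empty collection on failure lets a subclass keep processing its remaining pages. A subclass that must distinguish "no links" from "fetch failed" would need to check for nil itself.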

#log(message, level = 0) ⇒ Object

Print a message to standard output, indented by two spaces per level

Parameters:

  • message (String)

    The message to log

  • level (Integer) (defaults to: 0)

    The indentation level; each level adds two spaces



# File 'lib/ietf/data/importer/scrapers/base_scraper.rb', line 25

def log(message, level = 0)
  indent = "  " * level
  puts "#{indent}#{message}"
end
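
As an illustration of how the level argument shapes the output, here is a small sketch. DemoScraper is a hypothetical stand-in that copies the log method shown above so the example runs on its own. The run method and the messages are invented for the example.

```ruby
# Hypothetical stand-in carrying the same log method as documented above.
class DemoScraper
  def log(message, level = 0)
    indent = "  " * level
    puts "#{indent}#{message}"
  end

  # Illustrative only: nest messages by passing a deeper level.
  def run
    log("Fetching groups")
    log("area: sec", 1)
    log("group: tls", 2)
  end
end

DemoScraper.new.run
# Prints:
# Fetching groups
#   area: sec
#     group: tls
```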