Module: RelatonIetf::Scrapper

Extended by:
RelatonBib::BibXMLParser, Scrapper
Included in:
Scrapper
Defined in:
lib/relaton_ietf/scrapper.rb

Overview

Scrapper module

Constant Summary

FLAVOR =
"IETF"
GH_URL =
"https://raw.githubusercontent.com/relaton/relaton-data-ietf/master/data/reference."

Instance Method Summary

Instance Method Details

#scrape_page(text, is_relation: false) ⇒ RelatonIetf::IetfBibliographicItem

Parameters:

  • text (String)
  • is_relation (TrueClass, FalseClass) (defaults to: false)

Returns:

  • (RelatonIetf::IetfBibliographicItem)

# File 'lib/relaton_ietf/scrapper.rb', line 20

def scrape_page(text, is_relation: false)
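  # Remove initial "IETF " string if specified
  ref = text.gsub(/^IETF /, "")
  # Capture the document number of an RFC, BCP, FYI or STD reference, if any
  /^(?:RFC|BCP|FYI|STD)\s(?<num>\d+)/ =~ ref
  # Left-pad the number with zeros to four digits ("RFC 791" becomes "RFC 0791")
  ref.sub!(/(?<=^(?:RFC|BCP|FYI|STD)\s)(\d+)/, num.rjust(4, "0")) if num
  rfc_item ref, is_relation
rescue Timeout::Error, Errno::EINVAL, Errno::ECONNRESET, EOFError,
        Net::HTTPBadResponse, Net::HTTPHeaderSyntaxError,
        Net::ProtocolError, SocketError
  # Report any network or protocol failure as a single request error
  raise RelatonBib::RequestError, "No document found for #{ref} reference"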
end
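
The reference normalization at the top of scrape_page can be tried in isolation. The following is a minimal sketch, not part of RelatonIetf: normalize_ref is a hypothetical helper name, and it reproduces only the prefix stripping and zero-padding steps, without calling rfc_item or making any network request.

```ruby
# Hypothetical helper (not part of RelatonIetf): mirrors the reference
# normalization that scrape_page performs before looking up the document.
def normalize_ref(text)
  # Drop a leading "IETF " prefix, if present
  ref = text.gsub(/^IETF /, "")
  # For RFC/BCP/FYI/STD references, left-pad the number to four digits
  if (m = /^(?:RFC|BCP|FYI|STD)\s(?<num>\d+)/.match(ref))
    ref = ref.sub(/(?<=^(?:RFC|BCP|FYI|STD)\s)\d+/, m[:num].rjust(4, "0"))
  end
  ref
end

puts normalize_ref("IETF RFC 791")
puts normalize_ref("BCP 14")
puts normalize_ref("RFC 8341")
```

Numbers that already have four or more digits are left unchanged, as are references that do not start with RFC, BCP, FYI or STD.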