Module: RelatonItu::Scrapper
- Defined in:
- lib/relaton_itu/scrapper.rb
Overview
Scrapper. Fetches an ITU document page and parses it into a bibliographic item.
Constant Summary
- DOMAIN =
    'https://www.itu.int'
- TYPES =
    {
      'ISO' => 'international-standard',
      'TS' => 'technicalSpecification',
      'TR' => 'technicalReport',
      'PAS' => 'publiclyAvailableSpecification',
      'AWI' => 'appruvedWorkItem',
      'CD' => 'committeeDraft',
      'FDIS' => 'finalDraftInternationalStandard',
      'NP' => 'newProposal',
      'DIS' => 'draftInternationalStandard',
      'WD' => 'workingDraft',
      'R' => 'recommendation',
      'Guide' => 'guide'
    }.freeze
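This page lists TYPES but does not show how it is consulted. The following is a minimal sketch, assuming a plain lookup by document-type abbreviation. The module `TypeLookupSketch` and the method `type_for` are hypothetical names, not part of RelatonItu::Scrapper, and the hash is only a subset of the constant above.

```ruby
module TypeLookupSketch
  # Subset of the TYPES constant listed above, copied so the sketch runs on its own.
  TYPES = {
    'TS' => 'technicalSpecification',
    'TR' => 'technicalReport',
    'R' => 'recommendation',
    'Guide' => 'guide'
  }.freeze

  # Hypothetical helper: returns the type name for an abbreviation,
  # or nil when the abbreviation is not in the table.
  def self.type_for(abbreviation)
    TYPES[abbreviation]
  end
end

puts TypeLookupSketch.type_for('R')           # prints "recommendation"
puts TypeLookupSketch.type_for('XYZ').inspect # prints "nil"
```

Returning nil for an unknown abbreviation leaves the fallback decision to the caller; `Hash#fetch` would raise a KeyError instead.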
Class Method Summary
- .parse_page(hit_data) ⇒ Hash
    Parse page.
Class Method Details
.parse_page(hit_data) ⇒ Hash
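Parse page. Fetches the page at hit_data[:url] and builds a bibliographic item from it. Note that the signature declares a Hash return value, while the source below returns an IsoBibItem::IsoBibliographicItem.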
    # File 'lib/relaton_itu/scrapper.rb', line 52

    def parse_page(hit_data)
      doc = get_page hit_data[:url]

      # Fetch edition.
      edition = doc.at("//table/tr/td/span[contains(@id, 'Label8')]/b").text

      IsoBibItem::IsoBibliographicItem.new(
        docid: fetch_docid(hit_data[:code]),
        edition: edition,
        language: ['en'],
        script: ['Latn'],
        titles: fetch_titles(hit_data),
        type: fetch_type(doc),
        docstatus: fetch_status(doc),
        ics: [], # fetch_ics(doc),
        dates: fetch_dates(doc),
        contributors: fetch_contributors(hit_data[:code]),
        workgroup: fetch_workgroup(doc),
        abstract: fetch_abstract(doc),
        copyright: fetch_copyright(hit_data[:code], doc),
        link: fetch_link(doc, hit_data[:url]),
        relations: fetch_relations(doc)
      )
    end
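The source above reads hit_data[:url] and hit_data[:code] directly; fetch_titles(hit_data) may read further keys that this page does not show. The following is a minimal sketch of a check that could run before calling parse_page, assuming hit_data is a Hash with symbol keys. `HitDataCheck` and `missing_keys` are hypothetical names, not part of the library, and the sample values are placeholders.

```ruby
module HitDataCheck
  # Keys that parse_page reads directly in the source above.
  REQUIRED_KEYS = %i[code url].freeze

  # Returns the required keys that are absent or nil in hit_data.
  def self.missing_keys(hit_data)
    REQUIRED_KEYS.select { |key| hit_data[key].nil? }
  end
end

complete = { code: 'EXAMPLE-CODE', url: 'https://www.itu.int' } # placeholder values
partial  = { code: 'EXAMPLE-CODE' }

p HitDataCheck.missing_keys(complete) # prints []
p HitDataCheck.missing_keys(partial)  # prints [:url]
```

Checking up front gives a clearer failure than the nil error that would otherwise surface inside get_page or fetch_docid.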