Module: ProjectEulerCli::Scraper
Included in: ArchiveController, ArchiveSearcher, ArchiveViewer
Defined in: lib/project_euler_cli/concerns/scraper.rb
Overview
Holds all of the methods that fetch and parse pages from the Project Euler site (projecteuler.net).
Instance Method Summary
- #load_page(page, problems) ⇒ Object
  Loads the problem numbers and titles for an individual page of the archive.
- #load_problem_details(id, problems) ⇒ Object
  Loads the details of an individual problem.
- #load_recent(problems) ⇒ Object
  Loads in all of the problem numbers and titles from the recent page.
- #lookup_totals ⇒ Object
  Pulls information from the recent page to determine the total number of problems and pages.
Instance Method Details
#load_page(page, problems) ⇒ Object
Loads the problem numbers and titles for an individual page of the archive.
    # File 'lib/project_euler_cli/concerns/scraper.rb', line 51

    def load_page(page, problems)
      return if Page.visited.include?(page)

      html = open("https://projecteuler.net/archives;page=#{page}")
      fragment = Nokogiri::HTML(html)

      problem_links = fragment.css('#problems_table td a')

      i = (page - 1) * Page::LENGTH + 1
      problem_links.each do |link|
        problems[i].title = link.text
        i += 1
      end

      Page.visited << page
    end
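The starting index in load_page comes from the expression `(page - 1) * Page::LENGTH + 1`. The sketch below works through that arithmetic on its own. The value of `Page::LENGTH` is not shown in this section, so `PAGE_LENGTH = 50` is an assumption, and the helper names `first_problem_id` and `problem_ids_on` are hypothetical.

```ruby
# Assumed stand-in for Page::LENGTH; the real value is defined in the gem's
# Page class, which is not part of this section.
PAGE_LENGTH = 50

# Returns the ID of the first problem listed on an archive page,
# using the same expression as load_page.
def first_problem_id(page, page_length = PAGE_LENGTH)
  (page - 1) * page_length + 1
end

# Returns the range of problem IDs covered by one full archive page.
def problem_ids_on(page, page_length = PAGE_LENGTH)
  first = first_problem_id(page, page_length)
  first..(first + page_length - 1)
end

puts first_problem_id(1) # page 1 starts at problem 1
puts first_problem_id(3) # page 3 starts at problem 101 when a page holds 50
puts problem_ids_on(2)   # the IDs a full second page would cover
```

Because the index is derived from the page number alone, each page can be loaded independently and in any order, which fits the `Page.visited` guard at the top of the method.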
#load_problem_details(id, problems) ⇒ Object
Loads the details of an individual problem.
    # File 'lib/project_euler_cli/concerns/scraper.rb', line 69

    def load_problem_details(id, problems)
      return unless problems[id].published.nil?

      html = open("https://projecteuler.net/problem=#{id}")
      fragment = Nokogiri::HTML(html)

      problem_info = fragment.css('div#problem_info span span')

      details = problem_info.text.split(';')
      problems[id].published = details[0].strip
      problems[id].solved_by = details[1].strip

      # recent problems do not have a difficulty rating
      problems[id].difficulty = details[2].strip if id < Problem.total - 9
    end
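The parsing step splits one semicolon-separated string into the published, solved-by, and difficulty fields. The sketch below isolates that step on a plain string so it runs without network access. The sample text is a hypothetical stand-in for the content of `div#problem_info span span`, and `parse_problem_info` is a hypothetical helper. Unlike the method above, which guards the third field with an ID comparison, this sketch simply leaves the difficulty as `nil` when the field is missing.

```ruby
# Hypothetical sample of the info text; the wording on the live site may differ.
SAMPLE_INFO = "Published on Friday, 5th October 2001, 06:00 pm; " \
              "Solved by 12345; Difficulty rating: 5%"

# Splits the semicolon-separated info string and strips each part.
# When only two parts are present (as assumed for a recent problem),
# difficulty is nil.
def parse_problem_info(text)
  published, solved_by, difficulty = text.split(';').map(&:strip)
  { published: published, solved_by: solved_by, difficulty: difficulty }
end

info = parse_problem_info(SAMPLE_INFO)
puts info[:published]
puts info[:solved_by]
puts info[:difficulty]

# A string without the third field yields a nil difficulty.
recent = parse_problem_info("Published on Saturday, 1st January 2022, 01:00 pm; Solved by 100")
puts recent[:difficulty].inspect
```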
#load_recent(problems) ⇒ Object
Loads in all of the problem numbers and titles from the recent page.
    # File 'lib/project_euler_cli/concerns/scraper.rb', line 33

    def load_recent(problems)
      return if Page.visited.include?(0)

      html = open("https://projecteuler.net/recent")
      fragment = Nokogiri::HTML(html)

      problem_links = fragment.css('#problems_table td a')

      i = Problem.total
      problem_links.each do |link|
        problems[i].title = link.text
        i -= 1
      end

      Page.visited << 0
    end
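load_recent walks the links from newest to oldest, so the index starts at `Problem.total` and counts down. The sketch below shows that descending assignment with plain data in place of scraped links. `ProblemStub`, `assign_recent_titles`, the titles, and the total of 5 are all hypothetical, and the unused index 0 is an assumption made so that IDs match array indices.

```ruby
# Hypothetical minimal problem record; the gem's Problem class is not shown here.
ProblemStub = Struct.new(:title)

# Assigns titles in descending ID order, starting from the newest problem,
# in the same direction that load_recent walks the scraped links.
def assign_recent_titles(problems, titles, total)
  titles.each_with_index do |title, offset|
    problems[total - offset].title = title
  end
  problems
end

total = 5 # assumed total number of problems, for illustration only
problems = Array.new(total + 1) { ProblemStub.new } # index 0 left unused
assign_recent_titles(problems, ["Newest", "Second newest", "Third newest"], total)

puts problems[5].title
puts problems[4].title
puts problems[3].title
```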
#lookup_totals ⇒ Object
Pulls information from the recent page to determine the total number of problems and pages.
    # File 'lib/project_euler_cli/concerns/scraper.rb', line 8

    def lookup_totals
      begin
        Timeout.timeout(4) do
          html = open("https://projecteuler.net/recent")
          fragment = Nokogiri::HTML(html)

          id_col = fragment.css('#problems_table td.id_column')

          # The newest problem is the first one listed on the recent page. The ID
          # of this problem will always equal the total number of problems.
          Problem.total = id_col.first.text.to_i

          # There are ten problems on the recent page, so the last archive problem
          # can be found by subtracting 10 from the total number of problems. This
          # is used to calculate the total number of pages.
          last_archive_id = Problem.total - 10
          Page.total = (last_archive_id - 1) / Page::LENGTH + 1
        end
      rescue Timeout::Error
        puts "Project Euler is not responding."
        exit(true)
      end
    end
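The page count follows from two steps: subtract the ten recent problems from the total, then apply an integer ceiling division by the page length. The sketch below reproduces only that arithmetic. `page_total` is a hypothetical helper, and the default `page_length` of 50 is an assumption, since `Page::LENGTH` is not shown in this section.

```ruby
# Computes the number of archive pages from the total number of problems,
# using the same arithmetic as lookup_totals.
#
# page_length:  assumed stand-in for Page::LENGTH
# recent_count: the ten problems listed on the recent page
def page_total(problem_total, page_length: 50, recent_count: 10)
  last_archive_id = problem_total - recent_count
  # Integer division rounds down, so subtracting 1 first and adding 1 after
  # gives the ceiling of last_archive_id / page_length.
  (last_archive_id - 1) / page_length + 1
end

puts page_total(600) # 590 archive problems fit on 12 pages of 50
puts page_total(611) # 601 archive problems need a 13th page
```

Subtracting 1 before dividing keeps an exact multiple of the page length (for example 600 archive problems) from spilling onto an extra, empty page.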