Module: PageParser

Defined in:
lib/get_tapas/page_parser.rb

Constant Summary

RUBY_TAPAS_URL_TO_FILENAME =

Example Input: "rubytapas-media.s3.amazonaws.com/298-file-find.mp4?response-content-disposition=…"
Example Return: "298-file-find.mp4"

->(url) { url.split('?').first.split('/').last }
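The lambda above drops everything from the first '?' onward and then keeps the last '/'-separated segment. A minimal sketch of that behavior follows; the constant names SAMPLE_URL_TO_FILENAME and SAMPLE_URL, and the example.com URL, are hypothetical and chosen only so the snippet runs on its own.

```ruby
# Same body as RUBY_TAPAS_URL_TO_FILENAME, redefined under a hypothetical
# name so this sketch runs without loading the gem.
SAMPLE_URL_TO_FILENAME = ->(url) { url.split('?').first.split('/').last }

# Hypothetical URL, for illustration only.
SAMPLE_URL = 'https://example.com/media/298-file-find.mp4?token=abc'

# Step 1: split('?').first keeps the part before the query string.
# Step 2: split('/').last keeps the final path segment.
puts SAMPLE_URL_TO_FILENAME.(SAMPLE_URL)
```

A URL without a query string passes through the first step unchanged, so the same lambda also handles plain paths.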

Class Method Summary

Class Method Details

.parse(html_string, fn_url_to_filename = RUBY_TAPAS_URL_TO_FILENAME) ⇒ Object

Returns an array of DownloadLink instances, one for each element in the HTML whose class attribute contains 'video-download-link'.

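Parameters:

  • html_string: the HTML source of the page to parse.
  • fn_url_to_filename: a callable that maps a download URL to a filename (defaults to RUBY_TAPAS_URL_TO_FILENAME).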

Returns:

  • an array of DownloadLink instances.



# File 'lib/get_tapas/page_parser.rb', line 15

def self.parse(html_string, fn_url_to_filename = RUBY_TAPAS_URL_TO_FILENAME)
  html_doc = Nokogiri::HTML(html_string)
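  # Select every element whose class attribute contains 'video-download-link'.
  html_links = html_doc.xpath("//*[contains(@class, 'video-download-link')]")

  html_links.map do |link|
    # The first child of each match is expected to be the anchor carrying the href.
    url         = link.children.first.attributes['href'].value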
    description = link.children.first.text.strip
    filename    = fn_url_to_filename.(url)
    DownloadLink.new(url, filename, description)
  end
end
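Because parse calls its second argument with .(url), any object that responds to call can replace the default. The sketch below shows one possible alternative strategy built on Ruby's standard URI library; the names PATH_BASENAME_FILENAME and EXAMPLE_DOWNLOAD_URL and the example.com URL are hypothetical and not part of the gem.

```ruby
require 'uri'

# Hypothetical alternative to RUBY_TAPAS_URL_TO_FILENAME: parse the URL and
# take the basename of its path component, which excludes the query string.
PATH_BASENAME_FILENAME = ->(url) { File.basename(URI.parse(url).path) }

# Hypothetical URL, for illustration only.
EXAMPLE_DOWNLOAD_URL = 'https://example.com/media/298-file-find.mp4?token=abc'

puts PATH_BASENAME_FILENAME.(EXAMPLE_DOWNLOAD_URL)

# With the gem loaded, the lambda would be passed as the second argument:
#   links = PageParser.parse(html_string, PATH_BASENAME_FILENAME)
```

Unlike the default, this version assumes the input is an absolute URL or path that URI.parse accepts, so it may raise on malformed input where the string-splitting default would not.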