Class: Youtube::SearchResultScraper
- Inherits: Object (Object > Youtube::SearchResultScraper)
- Defined in: lib/youtube/searchresultscraper.rb
Overview
Introduction
Youtube::SearchResultScraper scrapes video information from a search result page on www.youtube.com.
You can get the results as an array or as XML.
The XML format is the same as that of the YouTube Developer API (www.youtube.com/dev_api_ref?m=youtube.videos.list_by_tag).
Example
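require "rubygems"
require "youtube/searchresultscraper"

keyword = "ruby"  # example search keyword, UTF-8 encoded
page = 1          # example page number (optional)
scraper = Youtube::SearchResultScraper.new(keyword, page)
scraper.open      # fetch the search result page
scraper.scrape    # extract the video entries
puts scraper.get_xml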
More Information
www.ark-web.jp/sandbox/wiki/184.html (Japanese only)
- Author: Yuki SHIDA <[email protected]>
- Author: Konuma Akio <[email protected]>
- Version: 0.0.3
- License: MIT license
Constant Summary
- @@youtube_search_base_url = "http://www.youtube.com/results?search_query="
Instance Attribute Summary
- #keyword ⇒ Object
  Returns the value of attribute keyword.
- #page ⇒ Object
  Returns the value of attribute page.
- #video_count ⇒ Object (readonly)
  Returns the value of attribute video_count.
- #video_from ⇒ Object (readonly)
  Returns the value of attribute video_from.
- #video_to ⇒ Object (readonly)
  Returns the value of attribute video_to.
Instance Method Summary
- #each ⇒ Object
  Iterator for scraped videos.
- #get_xml ⇒ Object
  Returns video information in XML format.
- #initialize(keyword, page = nil) ⇒ SearchResultScraper (constructor)
  Creates a Youtube::SearchResultScraper object for the given keyword and page number.
- #open ⇒ Object
  Fetches the search results from YouTube for the specified keyword.
- #scrape ⇒ Object
  Scrapes video information from the search result HTML.
Constructor Details
#initialize(keyword, page = nil) ⇒ SearchResultScraper
Creates a Youtube::SearchResultScraper object for the given keyword and page number.
You cannot specify the number of videos per page; it is always 20.
- keyword - the keyword to search for on YouTube. It must be encoded in UTF-8.
- page - the page number of the search results (optional).
# File 'lib/youtube/searchresultscraper.rb', line 79
def initialize keyword, page=nil
  @keyword = keyword
  @page = page if not page == nil
end
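Because the page size is fixed at 20, the page argument can be derived from a result position. The following is a minimal sketch of that arithmetic; `page_for` and `positions_on` are hypothetical helpers, not part of this class, and the mapping to `video_from` and `video_to` is an assumption rather than something this page confirms.

```ruby
# Hypothetical helpers, not part of Youtube::SearchResultScraper.
# per_page defaults to 20, the fixed page size stated above.

# Returns the 1-based page number that holds the given 1-based result position.
def page_for(position, per_page = 20)
  raise ArgumentError, "position must be >= 1" if position < 1
  ((position - 1) / per_page) + 1
end

# Returns the range of 1-based result positions expected on a page.
def positions_on(page, per_page = 20)
  first = (page - 1) * per_page + 1
  first..(first + per_page - 1)
end

puts page_for(45)      # page that should contain the 45th result
puts positions_on(3)   # positions that page 3 should cover
```

The value returned by `page_for` is what you would pass as the second argument to `Youtube::SearchResultScraper.new`.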
Instance Attribute Details
#keyword ⇒ Object
Returns the value of attribute keyword.
# File 'lib/youtube/searchresultscraper.rb', line 62
def keyword
  @keyword
end
#page ⇒ Object
Returns the value of attribute page.
# File 'lib/youtube/searchresultscraper.rb', line 63
def page
  @page
end
#video_count ⇒ Object (readonly)
Returns the value of attribute video_count.
# File 'lib/youtube/searchresultscraper.rb', line 64
def video_count
  @video_count
end
#video_from ⇒ Object (readonly)
Returns the value of attribute video_from.
# File 'lib/youtube/searchresultscraper.rb', line 65
def video_from
  @video_from
end
#video_to ⇒ Object (readonly)
Returns the value of attribute video_to.
# File 'lib/youtube/searchresultscraper.rb', line 66
def video_to
  @video_to
end
Instance Method Details
#each ⇒ Object
Iterator for scraped videos.
# File 'lib/youtube/searchresultscraper.rb', line 126
def each
  @videos.each do |video|
    yield video
  end
end
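The listing above yields each scraped video to a block. The sketch below shows that iteration pattern on a hypothetical stand-in class (`StubScraper`, with plain hashes in place of Youtube::Video objects), since the real class needs network access. Judging by the listings, `@videos` is only assigned in #scrape, so #each presumably requires #scrape to have run first.

```ruby
# Hypothetical stand-in with the same #each shape as the scraper.
class StubScraper
  def initialize(videos)
    @videos = videos
  end

  # Same body as the documented #each: yield every stored video.
  def each
    @videos.each do |video|
      yield video
    end
  end
end

scraper = StubScraper.new([{ title: "First" }, { title: "Second" }])

# Block form, as #each is documented.
scraper.each { |video| puts video[:title] }

# to_enum wraps #each so Enumerable methods such as map become available.
titles = scraper.to_enum(:each).map { |video| video[:title] }
puts titles.inspect
```

Using `to_enum(:each)` is one way to get `map` or `select` without assuming the class includes Enumerable, which this page does not state.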
#get_xml ⇒ Object
Returns video information in XML format.
# File 'lib/youtube/searchresultscraper.rb', line 133
def get_xml
  xml = "<ut_response status=\"ok\">" +
        "<video_count>" + @video_count.to_s + "</video_count>" +
        "<video_list>\n"
  each do |video|
    xml += video.to_xml
  end
  xml += "</video_list></ut_response>"
end
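To make the shape of the returned document concrete, here is a minimal sketch that reproduces the envelope built by #get_xml. `StubVideo` and `build_xml` are hypothetical; in particular, the per-video markup emitted by the real Youtube::Video#to_xml is not shown on this page, so the `<video>` element below is only an assumption for illustration.

```ruby
# Hypothetical stand-in for Youtube::Video; the real to_xml output may differ.
StubVideo = Struct.new(:id, :title) do
  def to_xml
    "<video><id>#{id}</id><title>#{title}</title></video>\n"
  end
end

# Mirrors the string concatenation in #get_xml for a given list of videos.
def build_xml(videos, video_count)
  xml = "<ut_response status=\"ok\">" \
        "<video_count>#{video_count}</video_count>" \
        "<video_list>\n"
  videos.each { |video| xml += video.to_xml }
  xml + "</video_list></ut_response>"
end

videos = [StubVideo.new("abc123", "First"), StubVideo.new("def456", "Second")]
puts build_xml(videos, videos.size)
```

Note that the envelope is assembled by plain concatenation, so any escaping of special characters would have to happen inside each video's `to_xml`.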
#open ⇒ Object
Fetches the search results from YouTube for the specified keyword.
# File 'lib/youtube/searchresultscraper.rb', line 85
def open
  @url = @@youtube_search_base_url + CGI.escape(@keyword)
  @url += "&page=#{@page}" if not @page == nil
  @html = Kernel.open(@url).read
  replace_document_write_javascript
  @search_result = Hpricot.parse(@html)
end
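The URL construction in #open can be sketched in isolation, without any network access. `build_search_url` is a hypothetical helper, and it uses `URI.encode_www_form_component` from the standard library as a stand-in for the `CGI.escape` call in the source; both turn spaces into "+" and percent-encode UTF-8 bytes, though they differ on a few punctuation characters.

```ruby
require "uri"

# Hypothetical stand-alone helper mirroring the URL logic in #open.
def build_search_url(keyword, page = nil)
  # Base URL copied from @@youtube_search_base_url.
  base = "http://www.youtube.com/results?search_query="
  # Stand-in for CGI.escape: percent-encodes the keyword, spaces become "+".
  url = base + URI.encode_www_form_component(keyword)
  # The page parameter is appended only when a page was given.
  url += "&page=#{page}" unless page.nil?
  url
end

puts build_search_url("ruby on rails")
puts build_search_url("ruby on rails", 2)
```

This also shows why the keyword must be UTF-8: the bytes of the string are what get percent-encoded into the query.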
#scrape ⇒ Object
Scrapes video information from the search result HTML.
# File 'lib/youtube/searchresultscraper.rb', line 94
def scrape
  @videos = []

  @search_result.search("//div[@class='vEntry']").each do |video_html|
    video = Youtube::Video.new
    video.id = scrape_id(video_html)
    video. = (video_html)
    video.title = scrape_title(video_html)
    video.length_seconds = scrape_length_seconds(video_html)
    video. = (video_html)
    video. = (video_html)
    video.description = scrape_description(video_html)
    video.view_count = scrape_view_count(video_html)
    video.thumbnail_url = scrape_thumbnail_url(video_html)
    video. = (video_html)
    video.url = scrape_url(video_html)

    check_video video
    @videos << video
  end

  @video_count = scrape_video_count
  @video_from = scrape_video_from
  @video_to = scrape_video_to

  raise "scraping error" if (is_no_result != @videos.empty?)

  @videos
end
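The final check in #scrape raises when the page's "no result" state disagrees with whether any videos were scraped. The sketch below isolates that check and shows how a caller might rescue it; `check_consistency` is a hypothetical helper, and its first argument stands in for the `is_no_result` value used in the source.

```ruby
# Hypothetical helper reproducing the final consistency check in #scrape.
# no_result_page: whether the page reports zero results (is_no_result in the source).
def check_consistency(no_result_page, videos)
  raise "scraping error" if no_result_page != videos.empty?
  videos
end

check_consistency(true, [])          # consistent: no results, nothing scraped
check_consistency(false, [:video])   # consistent: results present and scraped

begin
  check_consistency(false, [])       # page has results but none were scraped
rescue RuntimeError => e
  puts "caught: #{e.message}"
end
```

Since `raise` with a string produces a RuntimeError, callers of #scrape can presumably rescue RuntimeError to detect a page layout the scraper no longer understands.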