Class: WikitravelTasks
- Inherits: Object
- Defined in: lib/wikitravel_tasks.rb
Class Method Summary
- .cleanup_report(report) ⇒ Object
- .find_or_create_wiki_user ⇒ Object
  All the wikitravel.org stuff should belong to this user.
- .parse_list_of_pages(arg = {}) ⇒ Object
  Takes a manually precompiled list of pages from wikitravel.org and creates a WikitravelPage for each one that does not already exist.
Instance Method Summary
- #all_pages_to_report_and_newsitems ⇒ Object
- #initialize(args = {}) ⇒ WikitravelTasks
  constructor: a new instance of WikitravelTasks.
- #one_page_to_report_and_newsitems(random_page) ⇒ Object
- #random_page_to_newsitem ⇒ Object
- #travel_tag ⇒ Object
Constructor Details
#initialize(args = {}) ⇒ WikitravelTasks
Returns a new instance of WikitravelTasks.
# File 'lib/wikitravel_tasks.rb', line 8

def initialize args = {}
  args[:lang] ||= 'en'
  args[:domain] ||= 'travel-guide.mobi'

  @site = Site.where( :domain => args[:domain], :lang => args[:lang] ).first
  @user = WikitravelTasks.find_or_create_wiki_user
end
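The constructor fills in its defaults with `||=`, which writes them back into the hash the caller passed in. A minimal sketch of the same defaulting without that side effect follows; `resolve_site_args` is a hypothetical helper introduced only for illustration and is not part of `WikitravelTasks`.

```ruby
# Hypothetical helper, not part of WikitravelTasks: resolves the same
# defaults as #initialize without modifying the caller's hash.
def resolve_site_args(args = {})
  defaults = { :lang => 'en', :domain => 'travel-guide.mobi' }
  # Hash#merge returns a new hash; keys present in args win over defaults.
  defaults.merge(args)
end

caller_args = { :lang => 'ru' }
resolved = resolve_site_args(caller_args)

puts resolved[:lang]            # the caller's value
puts resolved[:domain]          # the default
puts caller_args.key?(:domain)  # whether the caller's hash gained a key
```

One behavioural difference to keep in mind: `||=` also replaces an explicit `nil` or `false`, whereas `merge` keeps whatever value the caller passed.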
Class Method Details
.cleanup_report(report) ⇒ Object
# File 'lib/wikitravel_tasks.rb', line 80

def self.cleanup_report report
  text = report.descr
  text = Nokogiri::HTML( report.descr )
  text.search('.//script').remove
  text.search('.//noscript').remove
  text.search(".//span[contains(@class,'editsection')]").remove
  text.search(".//table[contains(@class,'toc')]").remove
  text.search(".//ul[contains(@class,'wt-toc')]").remove
  text.search(".//div[@id='toctitle']").remove
  report.descr = text
  return report
end
.find_or_create_wiki_user ⇒ Object
All the wikitravel.org stuff should belong to this user.
# File 'lib/wikitravel_tasks.rb', line 41

def self.find_or_create_wiki_user
  u = User.where( :username => 'wikitraveler' ).first
  if u.blank?
    u = User.create({ :username => 'wikitraveler', :name => 'Wikitraveler', :email => '[email protected]',
                      :password => 'omg such password you will never guess wow' })
  end
  return u
end
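This method is a find-or-create lookup: return the matching record if there is one, otherwise create it. The sketch below shows the pattern against an in-memory store. `FakeUser` and `FakeUserStore` are hypothetical stand-ins for the database-backed `User` model, which this page does not show.

```ruby
# In-memory stand-in for the User model (hypothetical, for illustration only).
FakeUser = Struct.new(:username, :name)

class FakeUserStore
  def initialize
    @users = []
  end

  # Plays the role of User.where(:username => ...).first
  def find_by_username(username)
    @users.find { |u| u.username == username }
  end

  # Return the existing record, or create and store one if none matches.
  def find_or_create(username, name)
    find_by_username(username) || FakeUser.new(username, name).tap { |u| @users << u }
  end

  def count
    @users.length
  end
end

store = FakeUserStore.new
first  = store.find_or_create('wikitraveler', 'Wikitraveler')
second = store.find_or_create('wikitraveler', 'Wikitraveler')

puts first.equal?(second)  # both calls hand back the same object
puts store.count           # only one record was created
```

Calling the lookup twice leaves a single record, which is the property the rest of the class relies on when it assigns every report to the same user.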
.parse_list_of_pages(arg = {}) ⇒ Object
Takes a manually precompiled list of pages from wikitravel.org and creates a WikitravelPage for each one that does not already exist.
# File 'lib/wikitravel_tasks.rb', line 53

def self.parse_list_of_pages arg = {}
  arg[:filename] ||= 'wikitravel.org-popular-pages.htm'
  puts "Attn. parsing wikitravel.org list of pages with filename '#{arg[:filename]}'" unless Rails.env.test?

  index_html_path = Rails.root.join( 'data', arg[:filename] )
  page = Nokogiri::HTML(open(index_html_path))
  links = page.css( "ol.special li > a" )

  links.each do |link|
    unless link['href'].include?(':')
      page = WikitravelPage.new :url => link['href'], :title => link['title']
      if page.save
        puts "Saving page '#{page.title}'" unless Rails.env.test?
      else
        puts "Maybe the page '#{page.title}' already exists." unless Rails.env.test?
      end
    end
  end
end
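The `include?(':')` check skips every link whose href contains a colon, which is presumably meant to drop namespaced wiki pages. A small sketch of that filter on plain hashes follows; the sample hrefs and the `content_links` helper are hypothetical and are not taken from the saved list or from the class.

```ruby
# Keep only links whose href has no colon, as parse_list_of_pages does.
# Hypothetical helper operating on plain hashes instead of Nokogiri nodes.
def content_links(links)
  links.reject { |link| link['href'].include?(':') }
end

# Illustrative sample data, not taken from wikitravel.org.
sample_links = [
  { 'href' => '/en/Paris',                 'title' => 'Paris' },
  { 'href' => '/en/Special:RecentChanges', 'title' => 'Recent changes' },
  { 'href' => '/en/Talk:Paris',            'title' => 'Talk:Paris' },
  { 'href' => '/en/New_York_City',         'title' => 'New York City' }
]

content_links(sample_links).each { |link| puts link['title'] }
```

The same check would also reject any absolute URL, since `http://` contains a colon, so the filter only works as intended when the list holds relative hrefs.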
Instance Method Details
#all_pages_to_report_and_newsitems ⇒ Object
# File 'lib/wikitravel_tasks.rb', line 16

def all_pages_to_report_and_newsitems
  WikitravelPage.all.each do |page|
    report = Report.where( :name => page[:title] ).first
    if report.blank?
      one_page_to_report_and_newsitems( page )
    else
      puts "Report already exists: #{report.name}"
    end
  end
end
#one_page_to_report_and_newsitems(random_page) ⇒ Object
# File 'lib/wikitravel_tasks.rb', line 95

def one_page_to_report_and_newsitems random_page
  urll = "#{WikitravelPage::DOMAIN}#{random_page.url}"
  remote_page = Nokogiri::HTML( open( urll ) )
  text = remote_page.css("#mw-content-text tr > td")
  begin
    subhead = remote_page.css("#mw-content-text tr > td p")[0].text
  rescue
    subhead = ''
  end

  # create the report
  r = Report.new
  r.name = random_page.title
  r.name_seo = random_page.title.to_simple_string
  r.subhead = subhead
  r.descr = text
  r.site = @site
  r.user = @user
  r.tag = travel_tag
  r = WikitravelTasks.cleanup_report r
  r.save || puts!(r.errors)

  # create newsitem for the city
  nn = Newsitem.new
  nn.report = r
  city = City.where( :name => /#{r.name}/i ).first
  unless city.blank?
    city.newsitems << nn
    city.save
  end
end
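The city lookup interpolates the report name straight into a regular expression, so the pattern is unanchored and any regex metacharacters in the title keep their special meaning. The sketch below compares that with an escaped, anchored pattern in plain Ruby. The city names and both helper methods are hypothetical, and it assumes the database applies the pattern the same way Ruby does.

```ruby
# Unescaped, unanchored pattern, built the same way as in the listing.
def loose_matches(name, candidates)
  candidates.select { |c| c =~ /#{name}/i }
end

# Escaped and anchored pattern: matches the whole name only, ignoring case.
def exact_matches(name, candidates)
  pattern = /\A#{Regexp.escape(name)}\z/i
  candidates.select { |c| c =~ pattern }
end

# Hypothetical city names used only for illustration.
city_names = ['Paris', 'Parish Hill', 'Georgia (state)']

puts loose_matches('Paris', city_names).inspect            # also picks up 'Parish Hill'
puts exact_matches('Paris', city_names).inspect            # only 'Paris'
puts loose_matches('Georgia (state)', city_names).inspect  # parentheses act as a group, so nothing matches
puts exact_matches('Georgia (state)', city_names).inspect  # the literal title matches
```

`Regexp.escape` matters most for titles with parentheses or dots, while the `\A` and `\z` anchors stop a short name from matching inside a longer one.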
#random_page_to_newsitem ⇒ Object
# File 'lib/wikitravel_tasks.rb', line 27

def random_page_to_newsitem
  # select a random page
  begin
    n_pages = WikitravelPage.all.length
    random_page = WikitravelPage.all[rand(n_pages-1)]
    existing_report = Report.where( :name => random_page.title ).first
  end while !existing_report.blank?
  one_page_to_report_and_newsitems( random_page )
end
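Two properties of this loop are worth noting. `rand(n_pages-1)` yields indexes from 0 to `n_pages - 2`, so the last page is never drawn when there are two or more pages, and the loop never ends if every page already has a report. A minimal sketch of an alternative that filters first and then samples follows; `SketchPage` and `pick_unreported_page` are hypothetical names, and the array of report names stands in for the `Report.where` lookup.

```ruby
# Stand-in for WikitravelPage (hypothetical; the real model is database-backed).
SketchPage = Struct.new(:title)

# Pick a random page that has no report yet, or nil when every page is covered.
# existing_report_names stands in for the Report.where(:name => ...) lookup.
def pick_unreported_page(pages, existing_report_names)
  candidates = pages.reject { |page| existing_report_names.include?(page.title) }
  candidates.sample # Array#sample returns nil for an empty array
end

pages = %w[Paris Tokyo Lima].map { |title| SketchPage.new(title) }

picked = pick_unreported_page(pages, %w[Paris Tokyo])
puts picked.title                                              # the only page left without a report
puts pick_unreported_page(pages, %w[Paris Tokyo Lima]).inspect # nothing left to pick
```

Returning `nil` when nothing is left gives the caller a chance to stop, where the retry loop would keep drawing forever.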