Module: NewsCrawler::CrawlerModule
- Included in: LinkSelector::SameDomainSelector
- Defined in: lib/news_crawler/crawler_module.rb
Overview
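Include this module in a crawler module class to get the basic URL-state helper methods. Each method delegates to URLQueue and passes the including class's name (self.class.name) as the module name, so process state is tracked per module.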
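As the method sources on this page show, every helper passes self.class.name to URLQueue, so each including class sees only its own process state. The following is a minimal sketch of that pattern, not the gem's code: DemoQueue, DemoCrawlerModule, TitleExtractor, and LinkExtractor are hypothetical names, the state constants are assumed values, and the in-memory hash stands in for whatever storage the real URLQueue uses.

```ruby
# Hypothetical in-memory stand-in for URLQueue (assumption: the real queue
# is backed by persistent storage that this sketch does not model).
module DemoQueue
  UNPROCESSED = :unprocessed
  PROCESSED   = :processed

  # Module name => { url => state }
  @states = Hash.new { |hash, mod| hash[mod] = {} }

  class << self
    # Record the state of url for the module named mod.
    def mark(mod, url, state)
      @states[mod][url] = state
    end

    # Return every URL that the module named mod has in the given state.
    def find_all(mod, state)
      @states[mod].select { |_url, s| s == state }.keys
    end
  end
end

# Mirrors the delegation style of CrawlerModule: the including class's
# name is used as the key under which state is stored.
module DemoCrawlerModule
  def mark_processed(url)
    DemoQueue.mark(self.class.name, url, DemoQueue::PROCESSED)
  end

  def find_all(state)
    DemoQueue.find_all(self.class.name, state)
  end
end

class TitleExtractor
  include DemoCrawlerModule
end

class LinkExtractor
  include DemoCrawlerModule
end

TitleExtractor.new.mark_processed('http://example.com/a')

title_done = TitleExtractor.new.find_all(DemoQueue::PROCESSED)
link_done  = LinkExtractor.new.find_all(DemoQueue::PROCESSED)
```

In this sketch, marking a URL from TitleExtractor leaves LinkExtractor's view of the same URL untouched, because the two classes have different names.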
Instance Method Summary
- #find_all(state, max_depth = -1) ⇒ Array
  Find all visited URLs whose process state for the current module matches the given state.
- #find_one(state, max_depth = -1) ⇒ String?
  Find one visited URL whose process state for the current module matches the given state.
- #find_unprocessed(max_depth = -1) ⇒ Array
  Find all visited URLs that the current module has not yet processed.
- #mark_processed(url) ⇒ Object
  Mark the given URL as processed for the current module.
- #mark_unprocessed(url) ⇒ Object
  Mark the given URL as unprocessed for the current module.
- #next_unprocessed(max_depth = -1) ⇒ String?
  Atomically get the next unprocessed URL and mark it as processing.
Instance Method Details
#find_all(state, max_depth = -1) ⇒ Array
Find all visited URLs whose process state for the current module matches the given state.
    # File 'lib/news_crawler/crawler_module.rb', line 51
    def find_all(state, max_depth = -1)
      URLQueue.find_all(self.class.name, state, max_depth)
    end
#find_one(state, max_depth = -1) ⇒ String?
Find one visited URL whose process state for the current module matches the given state.
    # File 'lib/news_crawler/crawler_module.rb', line 59
    def find_one(state, max_depth = -1)
      URLQueue.find_one(self.class.name, state, max_depth)
    end
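The two finders differ mainly in return shape: find_all is documented as returning an Array, while find_one returns String?, which suggests nil when nothing matches. A rough sketch of that contrast follows, assuming exactly that nil and empty-array behavior. DEMO_STATES, demo_find_all, and demo_find_one are hypothetical names and do not reproduce URLQueue's actual lookup.

```ruby
# Hypothetical records for a single module: URL => state.
DEMO_STATES = {
  'http://example.com/a' => :processed,
  'http://example.com/b' => :unprocessed,
  'http://example.com/c' => :processed
}.freeze

# Every URL in the given state; an empty Array when nothing matches.
def demo_find_all(state)
  DEMO_STATES.select { |_url, s| s == state }.keys
end

# A single URL in the given state; nil when nothing matches (assumed).
def demo_find_one(state)
  url, _state = DEMO_STATES.find { |_url, s| s == state }
  url
end

all_processed = demo_find_all(:processed)
one_processed = demo_find_one(:processed)
none_found    = demo_find_one(:processing)
```

Callers of the single-result form would therefore need a nil check before using the URL, whereas the Array form can be iterated directly.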
#find_unprocessed(max_depth = -1) ⇒ Array
Find all visited URLs that the current module has not yet processed.
    # File 'lib/news_crawler/crawler_module.rb', line 43
    def find_unprocessed(max_depth = -1)
      URLQueue.find_all(self.class.name, URLQueue::UNPROCESSED, max_depth)
    end
#mark_processed(url) ⇒ Object
Mark the given URL as processed for the current module.
    # File 'lib/news_crawler/crawler_module.rb', line 30
    def mark_processed(url)
      URLQueue.mark(self.class.name, url, URLQueue::PROCESSED)
    end
#mark_unprocessed(url) ⇒ Object
Mark the given URL as unprocessed for the current module.
    # File 'lib/news_crawler/crawler_module.rb', line 36
    def mark_unprocessed(url)
      URLQueue.mark(self.class.name, url, URLQueue::UNPROCESSED)
    end
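One plausible use of the two mark methods together is a retry pattern: mark a URL processed when handling succeeds, and put it back into the unprocessed state when handling fails. The sketch below illustrates that idea under stated assumptions. RetryQueue, RetryCrawlerModule, Extractor, and Extractor#handle are hypothetical, and the in-memory hash replaces the real URLQueue.

```ruby
# Hypothetical in-memory stand-in for URLQueue.
module RetryQueue
  UNPROCESSED = :unprocessed
  PROCESSED   = :processed
  STATES = Hash.new { |hash, mod| hash[mod] = {} }

  def self.mark(mod, url, state)
    STATES[mod][url] = state
  end

  def self.find_all(mod, state)
    STATES[mod].select { |_url, s| s == state }.keys
  end
end

# Same delegation style as CrawlerModule, pointed at the stand-in queue.
module RetryCrawlerModule
  def mark_processed(url)
    RetryQueue.mark(self.class.name, url, RetryQueue::PROCESSED)
  end

  def mark_unprocessed(url)
    RetryQueue.mark(self.class.name, url, RetryQueue::UNPROCESSED)
  end

  def find_unprocessed
    RetryQueue.find_all(self.class.name, RetryQueue::UNPROCESSED)
  end
end

class Extractor
  include RetryCrawlerModule

  # Hypothetical handler: run the block, mark the URL processed on success,
  # and return it to the unprocessed state if the block raises.
  def handle(url)
    yield url
    mark_processed(url)
  rescue StandardError
    mark_unprocessed(url)
  end
end

extractor = Extractor.new
extractor.mark_unprocessed('http://example.com/ok')
extractor.mark_unprocessed('http://example.com/bad')

extractor.handle('http://example.com/ok')  { |url| url.length }
extractor.handle('http://example.com/bad') { |_url| raise 'parse error' }

remaining = extractor.find_unprocessed
```

After both calls, only the URL whose handler raised is still reported as unprocessed, so a later pass can pick it up again.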
#next_unprocessed(max_depth = -1) ⇒ String?
Atomically get the next unprocessed URL and mark it as processing.
    # File 'lib/news_crawler/crawler_module.rb', line 66
    def next_unprocessed(max_depth = -1)
      URLQueue.next_unprocessed(self.class.name, max_depth)
    end
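The reason for combining the lookup and the state change into one atomic step is that two workers must never claim the same URL. This page does not show how URLQueue achieves that, so the sketch below swaps in a Mutex around an in-memory hash purely for illustration. ClaimQueue and its state constants are hypothetical, and the max_depth argument is omitted.

```ruby
# Hypothetical stand-in queue that claims URLs inside a critical section.
class ClaimQueue
  UNPROCESSED = :unprocessed
  PROCESSING  = :processing
  PROCESSED   = :processed

  def initialize(urls)
    @lock   = Mutex.new
    @states = urls.map { |url| [url, UNPROCESSED] }.to_h
  end

  # Find one unprocessed URL and flip it to PROCESSING while holding the
  # lock, so no other thread can observe it as unprocessed in between.
  # Returns nil when nothing is left to claim.
  def next_unprocessed
    @lock.synchronize do
      url, _state = @states.find { |_u, s| s == UNPROCESSED }
      @states[url] = PROCESSING if url
      url
    end
  end

  def mark(url, state)
    @lock.synchronize { @states[url] = state }
  end
end

urls    = %w[http://example.com/1 http://example.com/2 http://example.com/3]
queue   = ClaimQueue.new(urls)
claimed = Queue.new # thread-safe collector from Ruby's core library

# Two workers drain the queue; each loop ends when next_unprocessed is nil.
workers = 2.times.map do
  Thread.new do
    while (url = queue.next_unprocessed)
      claimed << url
      queue.mark(url, ClaimQueue::PROCESSED)
    end
  end
end
workers.each(&:join)

claimed_urls = Array.new(claimed.size) { claimed.pop }
```

The `while (url = ...)` loop relies on the String? return type: it stops as soon as the claim returns nil. Which worker claims which URL depends on thread scheduling, but in this sketch every URL is claimed exactly once.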