Class: Scruber::QueueAdapters::Memory
- Inherits: AbstractAdapter
  - Object
  - AbstractAdapter
  - Scruber::QueueAdapters::Memory
- Defined in: lib/scruber/queue_adapters/memory.rb
Overview
Memory Queue Adapter
Simple queue adapter that stores pages in memory. A good fit for small scrapes: it is easy to use and needs no database setup, but pages cannot be reparsed if something goes wrong.
Defined Under Namespace
Classes: Page
Instance Attribute Summary
- #error_pages ⇒ Object (readonly)
  Returns the value of attribute error_pages.
Instance Method Summary
- #add(url_or_page, options = {}) ⇒ void (also: #push)
  Add page to queue.
- #add_downloaded(page) ⇒ void
  Internal method to add page to downloaded queue.
- #add_error_page(page) ⇒ void
  Internal method to add page to error queue.
- #add_processed_page(page) ⇒ void
  Saves the processed page id to prevent identical pages from being added to the queue again.
- #delete(page) ⇒ void
  Delete page from all internal queues.
- #downloaded_count ⇒ Integer
  Count of downloaded pages. Used to show downloading progress.
- #fetch_downloaded(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
  Fetch downloaded pages that have not been processed yet.
- #fetch_error(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
  Fetch error page.
- #fetch_pending(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
  Fetch pending page for fetching.
- #find(id) ⇒ Page
  Search page by id.
- #has_work? ⇒ Boolean
  Used by Core.
- #initialize(options = {}) ⇒ Scruber::QueueAdapters::Memory (constructor)
  Queue initializer.
- #initialized? ⇒ Boolean
  Check if queue was initialized.
- #size ⇒ Integer
  Size of queue.
Constructor Details
#initialize(options = {}) ⇒ Scruber::QueueAdapters::Memory
Queue initializer
# File 'lib/scruber/queue_adapters/memory.rb', line 58
def initialize(options = {})
  super()
  @processed_ids = []
  @queue = []
  @downloaded_pages = []
  @error_pages = []
end
Instance Attribute Details
#error_pages ⇒ Object (readonly)
Returns the value of attribute error_pages.
# File 'lib/scruber/queue_adapters/memory.rb', line 14
def error_pages
  @error_pages
end
Instance Method Details
#add(url_or_page, options = {}) ⇒ void Also known as: push
This method returns an undefined value.
Add page to queue
# File 'lib/scruber/queue_adapters/memory.rb', line 72
def add(url_or_page, options = {})
  unless url_or_page.is_a?(Page)
    url_or_page = Page.new(self, options.merge(url: url_or_page))
  end
  @queue.push(url_or_page) unless @processed_ids.include?(url_or_page.id) || find(url_or_page.id)
end
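The guard at the end of #add means a page is silently skipped when its id is already recorded as processed or is still sitting in one of the queues. The following is a minimal sketch of that behavior only, not the adapter itself: SketchPage and SketchQueue are hypothetical stand-ins, and using the URL as the page id is an assumption made for illustration (how the real Page derives its id is not shown here).

```ruby
# Hypothetical stand-in for the adapter's Page class.
# Assumption: the id is simply the URL.
SketchPage = Struct.new(:url) do
  def id
    url
  end
end

# Hypothetical stand-in that keeps only the pending queue and processed ids.
class SketchQueue
  def initialize
    @processed_ids = []
    @queue = []
  end

  # Mirrors the guard in #add: skip pages already processed or already queued.
  def add(url_or_page)
    page = url_or_page.is_a?(SketchPage) ? url_or_page : SketchPage.new(url_or_page)
    return if @processed_ids.include?(page.id) || @queue.any? { |queued| queued.id == page.id }

    @queue.push(page)
    nil
  end

  def add_processed_page(page)
    @processed_ids.push(page.id)
  end

  def fetch_pending
    @queue.shift
  end

  def size
    @queue.count
  end
end

queue = SketchQueue.new
queue.add('http://example.com/a')
queue.add('http://example.com/a') # duplicate of a queued page, ignored
queue.add('http://example.com/b')
puts queue.size # prints 2
```

Once a page has been fetched and its id passed to add_processed_page, adding the same URL again is also ignored, which is what keeps a crawl from looping over pages it has already handled.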
#add_downloaded(page) ⇒ void
This method returns an undefined value.
Internal method to add page to downloaded queue
# File 'lib/scruber/queue_adapters/memory.rb', line 156
def add_downloaded(page)
  @downloaded_pages.push page
end
#add_error_page(page) ⇒ void
This method returns an undefined value.
Internal method to add page to error queue
# File 'lib/scruber/queue_adapters/memory.rb', line 166
def add_error_page(page)
  @error_pages.push page
end
#add_processed_page(page) ⇒ void
This method returns an undefined value.
Saves the processed page id to prevent identical pages from being added to the queue again
# File 'lib/scruber/queue_adapters/memory.rb', line 177
def add_processed_page(page)
  @processed_ids.push page.id
end
#delete(page) ⇒ void
This method returns an undefined value.
Delete page from all internal queues
# File 'lib/scruber/queue_adapters/memory.rb', line 196
def delete(page)
  @queue -= [page]
  @downloaded_pages -= [page]
  @error_pages -= [page]
end
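#delete relies on array difference, which builds a new array and reassigns it rather than mutating the existing one. A small sketch of what that implies, using plain strings in place of Page objects (how Page defines equality is not shown here, so that part is an assumption):

```ruby
queue = %w[a b a c]
target = 'a'

# Array#- returns a new array without any element equal to the target,
# so every occurrence is removed, not only the first one.
queue -= [target]
p queue # prints ["b", "c"]

# Removing something that is not present leaves the contents unchanged.
queue -= ['missing']
p queue # prints ["b", "c"]
```

Because a missing element is simply ignored, the same statement can be applied to all three internal queues without first checking where the page currently lives.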
#downloaded_count ⇒ Integer
Count of downloaded pages. Used to show downloading progress.
# File 'lib/scruber/queue_adapters/memory.rb', line 107
def downloaded_count
  @downloaded_pages.count
end
#fetch_downloaded(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
Fetch downloaded pages that have not been processed yet
# File 'lib/scruber/queue_adapters/memory.rb', line 116
def fetch_downloaded(count = nil)
  if count.nil?
    @downloaded_pages.shift
  else
    @downloaded_pages.shift(count)
  end
end
#fetch_error(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
Fetch error page
# File 'lib/scruber/queue_adapters/memory.rb', line 129
def fetch_error(count = nil)
  if count.nil?
    @error_pages.shift
  else
    @error_pages.shift(count)
  end
end
#fetch_pending(count = nil) ⇒ Scruber::QueueAdapters::AbstractAdapter::Page|Array<Scruber::QueueAdapters::AbstractAdapter::Page>
Fetch pending page for fetching
# File 'lib/scruber/queue_adapters/memory.rb', line 142
def fetch_pending(count = nil)
  if count.nil?
    @queue.shift
  else
    @queue.shift(count)
  end
end
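The three fetch methods share one shape: without a count they return a single page (or nil when the queue is empty), and with a count they return an array (possibly empty). The sketch below shows those four cases on a plain array; fetch_from is a hypothetical helper written for this illustration and is not part of the adapter.

```ruby
# Hypothetical helper mirroring the shape of fetch_pending, fetch_downloaded
# and fetch_error: shift one item, or shift up to `count` items.
def fetch_from(queue, count = nil)
  count.nil? ? queue.shift : queue.shift(count)
end

pending = %w[page1 page2 page3]

single = fetch_from(pending)    # "page1"
batch  = fetch_from(pending, 5) # ["page2", "page3"], fewer than requested
none   = fetch_from(pending)    # nil, the queue is now empty
empty  = fetch_from(pending, 2) # [], never nil when a count is given

p single, batch, none, empty
```

Callers that pass a count can iterate over the result directly, while callers that omit it need a nil check before using the page.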
#find(id) ⇒ Page
Search page by id
# File 'lib/scruber/queue_adapters/memory.rb', line 85
def find(id)
  [@queue, @downloaded_pages, @error_pages].each do |q|
    q.each do |i|
      return i if i.id == id
    end
  end
  nil
end
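#find scans the pending, downloaded, and error queues in that order and returns the first match, or nil if no queue holds the id. The following sketch reproduces that search order on plain arrays; FoundItem and find_in are hypothetical names used only for this example.

```ruby
# Hypothetical record with an id and a label saying which queue it came from.
FoundItem = Struct.new(:id, :state)

pending    = [FoundItem.new(1, :pending)]
downloaded = [FoundItem.new(2, :downloaded), FoundItem.new(1, :downloaded_copy)]
errors     = [FoundItem.new(3, :error)]

# Search the queues in the given order and return the first item with the id.
def find_in(queues, id)
  queues.each do |queue|
    match = queue.find { |item| item.id == id }
    return match if match
  end
  nil
end

queues = [pending, downloaded, errors]
p find_in(queues, 1).state # prints :pending (earlier queues win)
p find_in(queues, 3).state # prints :error
p find_in(queues, 99)      # prints nil
```

The lookup is a linear scan over everything held in memory, which is in keeping with the adapter's stated focus on small scrapes.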
#has_work? ⇒ Boolean
Used by Core. It checks for pages that are not downloaded or not parsed yet.
# File 'lib/scruber/queue_adapters/memory.rb', line 186
def has_work?
  @queue.count > 0 || @downloaded_pages.count > 0
end
#initialized? ⇒ Boolean
Check if queue was initialized. Used by the `seed` method: if the queue has already been initialized, there is no need to run the seed block.
# File 'lib/scruber/queue_adapters/memory.rb', line 208
def initialized?
  @queue.present? || @downloaded_pages.present? || @error_pages.present?
end
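#has_work? and #initialized? look at different sets of queues: error pages do not count as remaining work, but they do mark the queue as initialized. A minimal sketch of that difference follows. WorkSketch is a hypothetical stand-in, and it assumes that present? comes from ActiveSupport, where for an array it is equivalent to !empty?; plain Ruby is used here so the example runs without extra gems.

```ruby
# Hypothetical stand-in holding the same three queues as the adapter.
class WorkSketch
  def initialize(queue: [], downloaded: [], errors: [])
    @queue = queue
    @downloaded_pages = downloaded
    @error_pages = errors
  end

  # Same condition as the adapter: only pending and downloaded pages count.
  def has_work?
    @queue.count > 0 || @downloaded_pages.count > 0
  end

  # Assumption: Array#present? (ActiveSupport) behaves like !empty?.
  def initialized?
    !@queue.empty? || !@downloaded_pages.empty? || !@error_pages.empty?
  end
end

only_errors = WorkSketch.new(errors: ['failed page'])
p only_errors.has_work?    # prints false
p only_errors.initialized? # prints true

fresh = WorkSketch.new
p fresh.has_work?    # prints false
p fresh.initialized? # prints false
```

In this sketch a queue holding nothing but error pages reports no work left while still counting as initialized, so by the description above its seed block would not need to run again.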
#size ⇒ Integer
Size of queue
# File 'lib/scruber/queue_adapters/memory.rb', line 98
def size
  @queue.count
end