Class: Grell::PageCollection
- Inherits: Object
- Defined in: lib/grell/page_collection.rb
Overview
Keeps a record of all the pages crawled. When a new URL is found, it is added to this collection, which makes sure it is unique. The new page then counts as one of the discovered pages. Once the crawler navigates to it, the page counts as one of the visited pages.
Instance Attribute Summary
- #collection ⇒ Object (readonly): Returns the value of attribute collection.
Instance Method Summary
- #create_page(url, parent_id) ⇒ Object
- #discovered_pages ⇒ Object
- #initialize(add_match_block) ⇒ PageCollection (constructor): Takes a block, passed as the add_match_block argument, that decides whether a new URL is already present in the collection or should be added to it.
- #next_page ⇒ Object
- #visited_pages ⇒ Object
Constructor Details
#initialize(add_match_block) ⇒ PageCollection
The initializer takes a block, passed as the add_match_block argument, that decides whether a new URL is already present in the collection or should be added to it. When the argument is nil, a default matcher is used.
    # File 'lib/grell/page_collection.rb', line 11

    def initialize(add_match_block)
      @collection = []
      @add_match_block = add_match_block || default_add_match
    end
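This section does not show how the match block is invoked, so the following is only a sketch of what a custom matcher might look like. It assumes the block is called with two page-like objects (one already in the collection, one candidate) and returns true when they should be treated as the same page. FakePage and SAME_PATH are hypothetical names, not part of Grell.

```ruby
require 'uri'

# Hypothetical stand-in for a page object; only its url is used here.
FakePage = Struct.new(:url)

# Assumed convention: called with (page already collected, candidate page),
# returns true when the candidate should count as a duplicate.
# This variant compares host and path only, ignoring case and query string.
SAME_PATH = proc do |collection_page, page|
  known = URI.parse(collection_page.url)
  fresh = URI.parse(page.url)
  known.host.to_s.casecmp?(fresh.host.to_s) && known.path.casecmp?(fresh.path)
end

existing  = FakePage.new('http://example.com/list?page=1')
candidate = FakePage.new('http://EXAMPLE.com/list?page=2')
unrelated = FakePage.new('http://example.com/about')

SAME_PATH.call(existing, candidate) # => true
SAME_PATH.call(existing, unrelated) # => false
```

A proc like this would be handed to the constructor as Grell::PageCollection.new(SAME_PATH). Passing nil selects the default matcher, as the constructor source shows.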
Instance Attribute Details
#collection ⇒ Object (readonly)
Returns the value of attribute collection.
    # File 'lib/grell/page_collection.rb', line 7

    def collection
      @collection
    end
Instance Method Details
#create_page(url, parent_id) ⇒ Object
    # File 'lib/grell/page_collection.rb', line 16

    def create_page(url, parent_id)
      page_id = next_id
      page = Page.new(url, page_id, parent_id)
      add(page)
      page
    end
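The helpers next_id and add are not documented on this page, so the sketch below fills them in with guesses purely to illustrate the shape of create_page: build a page, attempt to store it, and return it either way. SketchPage and SketchCollection are hypothetical stand-ins, and both helper bodies are assumptions rather than Grell's code.

```ruby
# Hypothetical stand-in for Grell::Page with just the constructor arguments.
SketchPage = Struct.new(:url, :id, :parent_id)

class SketchCollection
  attr_reader :collection

  def initialize
    @collection = []
  end

  # Same shape as create_page above: build, try to add, return the page.
  def create_page(url, parent_id)
    page = SketchPage.new(url, next_id, parent_id)
    add(page)
    page
  end

  private

  # Assumption: ids are taken from the current collection size.
  def next_id
    @collection.size
  end

  # Assumption: a page is stored only if no stored page has the same URL,
  # compared without regard to case.
  def add(page)
    duplicate = @collection.any? { |existing| existing.url.casecmp?(page.url) }
    @collection << page unless duplicate
  end
end

pages = SketchCollection.new
pages.create_page('http://example.com/', nil)
pages.create_page('http://example.com/a', 0)
pages.create_page('http://EXAMPLE.com/a', 0) # returned to the caller, but not stored
pages.collection.size                        # => 2
```

Under these assumptions, the return value of create_page does not tell the caller whether the page was actually stored, because the method returns the new page whether or not it was added.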
#discovered_pages ⇒ Object
    # File 'lib/grell/page_collection.rb', line 27

    def discovered_pages
      @collection - visited_pages
    end
#next_page ⇒ Object
    # File 'lib/grell/page_collection.rb', line 31

    def next_page
      discovered_pages.sort_by{|page| page.parent_id}.first
    end
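To make the ordering concrete, here is a small sketch that applies the same sort_by expression to an explicit list of pending pages. QueuedPage, pick_next and PENDING are hypothetical names used only for this illustration. Every page here has an integer parent_id, since sort_by cannot compare nil with an integer.

```ruby
# Hypothetical stand-in carrying only the fields this example needs.
QueuedPage = Struct.new(:url, :parent_id)

# Same expression as next_page, applied to an explicit list of pending pages.
def pick_next(pending)
  pending.sort_by { |page| page.parent_id }.first
end

PENDING = [
  QueuedPage.new('http://example.com/deep', 2),
  QueuedPage.new('http://example.com/top', 0),
  QueuedPage.new('http://example.com/mid', 1)
].freeze

pick_next(PENDING).url # => "http://example.com/top"
pick_next([])          # => nil
```

Picking the lowest parent_id means that links found on earlier pages are followed before links found on later ones. When nothing is pending the result is nil.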
#visited_pages ⇒ Object
    # File 'lib/grell/page_collection.rb', line 23

    def visited_pages
      @collection.select {|page| page.visited?}
    end
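Together, visited_pages and discovered_pages split the collection into two groups: pages whose visited? returns true, and everything else. The sketch below reproduces that split on a plain array. StubPage and its mark_visited method are hypothetical and stand in for whatever the real Page class does when it is navigated to.

```ruby
# Hypothetical page stand-in with only what the two methods rely on.
class StubPage
  attr_reader :url

  def initialize(url)
    @url = url
    @visited = false
  end

  def visited?
    @visited
  end

  # Assumed helper for this sketch; flips the visited flag.
  def mark_visited
    @visited = true
  end
end

# Same expressions as the two methods above, taking the array as an argument.
def visited_pages(collection)
  collection.select { |page| page.visited? }
end

def discovered_pages(collection)
  collection - visited_pages(collection)
end

home  = StubPage.new('http://example.com/')
about = StubPage.new('http://example.com/about')
collection = [home, about]

home.mark_visited
visited_pages(collection).map(&:url)    # => ["http://example.com/"]
discovered_pages(collection).map(&:url) # => ["http://example.com/about"]
```

Because discovered_pages is computed by array difference, a page moves from one group to the other as soon as its visited? result changes, and no separate bookkeeping is needed.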