Class: Hpricot::Text
- Inherits: Object (Object → Hpricot::Text)
- Defined in: lib/crawler.rb
Overview
Extends Hpricot::Text with the functionality to extract semantic information.
Class Method Summary
- .store_semantics(tags) ⇒ Object
  Stores an array of meaningful tags with their rank value.
Instance Method Summary
- #semantic_value ⇒ Object
  Extracts the semantic value of a text block.
Class Method Details
.store_semantics(tags) ⇒ Object
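Stores the array of meaningful tag names in the class variable @@semantic_tags. The rank values are not stored here; #semantic_value reads them from $config['crawler']['tags'].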
# File 'lib/crawler.rb', line 218
def self.store_semantics(tags)
  @@semantic_tags = tags
end
Instance Method Details
#semantic_value ⇒ Object
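Extracts the semantic value of a text block. The rank starts at 1 and grows by the configured rank of each enclosing meaningful tag, walking up from the parent until an ancestor that is not a meaningful tag is reached. Returns nil if a script element is encountered on the way.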
# File 'lib/crawler.rb', line 223
def semantic_value
  Hpricot::Text.store_semantics($config['crawler']['tags'].keys) unless defined?(@@semantic_tags)
  rank = 1
  node = parent
  return nil if node.name == 'script'
  while @@semantic_tags.include?(node.name)
    rank += $config['crawler']['tags'][node.name]
    node = node.parent
    return nil if node.name == 'script'
  end
  rank
end
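Example (illustrative sketch, not part of lib/crawler.rb)
The ranking loop of #semantic_value can be tried without Hpricot by modelling only the two node attributes it uses, name and parent. SketchNode, TAG_RANKS and semantic_value_for are hypothetical names introduced for this sketch, and the tag ranks are made-up values standing in for whatever $config['crawler']['tags'] holds in a real setup.

```ruby
# Stand-in for an Hpricot node (assumption): only `name` and `parent` are modelled.
SketchNode = Struct.new(:name, :parent)

# Assumed shape of $config['crawler']['tags']: tag name => rank added per enclosing tag.
TAG_RANKS = { 'h1' => 5, 'b' => 2, 'a' => 1 }.freeze

# Mirrors the loop in #semantic_value, starting from the text node's parent element.
def semantic_value_for(parent, ranks = TAG_RANKS)
  rank = 1
  node = parent
  return nil if node.name == 'script'
  while ranks.key?(node.name)
    rank += ranks[node.name]          # add the rank of this meaningful ancestor
    node = node.parent                # then move one level up
    return nil if node.name == 'script'
  end
  rank
end

body   = SketchNode.new('body', nil)
h1     = SketchNode.new('h1', body)
bold   = SketchNode.new('b', h1)
script = SketchNode.new('script', body)

p semantic_value_for(bold)    # text inside <h1><b>: 1 + 2 + 5 = 8
p semantic_value_for(body)    # text directly in <body>: 1
p semantic_value_for(script)  # text inside <script>: nil
```

The walk stops at the first ancestor that is not a meaningful tag, so a meaningful tag further up the tree (above body in this sketch) would not contribute to the rank.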