Class: REXML::Text
- Inherits:
- Object
- Object
- REXML::Text
- Defined in:
- lib/crawler.rb
Overview
Extends REXML::Text with the functionality to extract semantic information.
Class Method Summary
- .store_semantics(tags) ⇒ Object
  Stores an array of meaningful tags with their rank value.
Instance Method Summary
- #semantic_value ⇒ Object
  Extracts the semantic value of a text block.
Class Method Details
.store_semantics(tags) ⇒ Object
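Stores the array of meaningful tag names in the class variable @@semantic_tags. #semantic_value passes in the keys of $config['crawler']['tags']; the values of that hash are the rank values.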
# File 'lib/crawler.rb', line 157

def self.store_semantics(tags)
  @@semantic_tags = tags
end
Instance Method Details
#semantic_value ⇒ Object
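Extracts the semantic value of a text block. The rank starts at 1; the method then walks up the ancestor elements for as long as their names are in the stored tag list, adding each tag's rank from $config['crawler']['tags'], and returns the total. Returns nil if a script element is encountered on the way up.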
# File 'lib/crawler.rb', line 162

def semantic_value
  REXML::Text.store_semantics($config['crawler']['tags'].keys) unless defined?(@@semantic_tags)
  rank = 1
  node = parent
  return nil if node.name == 'script'
  while @@semantic_tags.include?(node.name)
    rank += $config['crawler']['tags'][node.name]
    node = node.parent
    return nil if node.name == 'script'
  end
  rank
end
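
The ancestor walk in #semantic_value can be tried in isolation with the sketch below. It is not the code from lib/crawler.rb: the helper semantic_value_of and the TAG_RANKS hash are hypothetical names, and the rank values are made up to stand in for whatever $config['crawler']['tags'] holds in a real setup. The ranks are passed as an argument so that the example runs without the global config or the REXML::Text extension.

```ruby
require 'rexml/document'

# Hypothetical ranks; assumed to mirror the shape of $config['crawler']['tags']
# (tag name => numeric rank).
TAG_RANKS = { 'p' => 1, 'h1' => 5, 'div' => 0 }.freeze

# Standalone restatement of the walk: start at 1, add the rank of every
# consecutive ancestor whose name is a key in ranks, give up on <script>.
def semantic_value_of(text_node, ranks)
  rank = 1
  node = text_node.parent
  return nil if node.name == 'script'
  while ranks.key?(node.name)
    rank += ranks[node.name]
    node = node.parent
    return nil if node.name == 'script'
  end
  rank
end

doc = REXML::Document.new(
  '<html><body><div><h1>Title</h1><script>var x;</script></div><p>Body</p></body></html>'
)

heading = REXML::XPath.first(doc, '//h1').get_text
script  = REXML::XPath.first(doc, '//script').get_text
para    = REXML::XPath.first(doc, '//p').get_text

p semantic_value_of(heading, TAG_RANKS) # => 6   (1 + h1 + div, stops at body)
p semantic_value_of(script, TAG_RANKS)  # => nil (direct parent is script)
p semantic_value_of(para, TAG_RANKS)    # => 2   (1 + p, stops at body)
```

The walk stops at the first ancestor that is not in the tag list, so a ranked tag further up the tree does not contribute once an unranked tag (here body) sits in between.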