Class: Hpricot::Text

Inherits:
Object
Defined in:
lib/crawler.rb

Overview

Extends Hpricot::Text with the functionality to extract semantic information.

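Class Method Summary

  • .store_semantics(tags) ⇒ Object
    Stores an array of meaningful tags.

Instance Method Summary

  • #semantic_value ⇒ Object
    Extracts the semantic value of a text block.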

Class Method Details

.store_semantics(tags) ⇒ Object

Stores the array of meaningful tag names in a class variable. Their rank values are looked up in $config['crawler']['tags'] when #semantic_value runs.



# File 'lib/crawler.rb', line 218

def self.store_semantics(tags)
  # Shared by all Hpricot::Text instances via a class variable.
  @@semantic_tags = tags
end

Instance Method Details

#semantic_value ⇒ Object

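Extracts the semantic value of a text block: 1 plus the rank values of the consecutive ranked tags that enclose it. Returns nil if the text's parent, or a node reached while climbing, is a script element.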



# File 'lib/crawler.rb', line 223

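def semantic_value
  # Lazily initialise the tag list from the crawler configuration.
  Hpricot::Text.store_semantics($config['crawler']['tags'].keys) unless defined?(@@semantic_tags)
  rank = 1
  node = parent
  # Text directly inside a script element has no semantic value.
  return nil if node.name == 'script'
  # Climb through the enclosing ranked tags, adding each tag's rank.
  while @@semantic_tags.include?(node.name)
    rank += $config['crawler']['tags'][node.name]
    node = node.parent
    return nil if node.name == 'script'
  end
  rank
end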
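
The ranking walk above can be tried without Hpricot. The following is a minimal sketch, not the crawler's own code: `Node`, `TAG_RANKS`, and `semantic_value_for` are hypothetical stand-ins, and the tag weights are invented for illustration. The real values come from `$config['crawler']['tags']`, which this page does not show.

```ruby
# Hypothetical stand-in for an Hpricot node: a tag name and a parent link.
Node = Struct.new(:name, :parent)

# Assumed tag weights, mirroring the shape of $config['crawler']['tags'].
TAG_RANKS = { 'h1' => 5, 'b' => 2, 'a' => 1 }.freeze

# Starts at the text's parent and climbs while the current node is a
# ranked tag, summing the weights. Returns nil as soon as a script
# element is reached.
def semantic_value_for(parent)
  rank = 1
  node = parent
  return nil if node.name == 'script'
  while TAG_RANKS.key?(node.name)
    rank += TAG_RANKS[node.name]
    node = node.parent
    return nil if node.name == 'script'
  end
  rank
end

body   = Node.new('body', nil)
h1     = Node.new('h1', body)
bold   = Node.new('b', h1)
script = Node.new('script', body)

heading_rank = semantic_value_for(bold)    # text inside <h1><b>...</b></h1>
plain_rank   = semantic_value_for(body)    # text directly inside <body>
script_rank  = semantic_value_for(script)  # text inside <script>
```

With these assumed weights, text in `<h1><b>` scores 1 + 2 + 5 = 8, text directly in `<body>` keeps the base rank of 1, and text in `<script>` yields nil. The climb stops at the first unranked ancestor, so a script element further up the tree is only detected if every tag between it and the text is ranked.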