Class: Boilerpipe::Extractors::ArticleSentenceExtractor

Inherits:
Object
Defined in:
lib/boilerpipe/extractors/article_sentence_extractor.rb

Class Method Summary

.process(doc) ⇒ Object
.text(contents) ⇒ Object

Class Method Details

.process(doc) ⇒ Object



# File 'lib/boilerpipe/extractors/article_sentence_extractor.rb', line 11

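def self.process(doc)
  # Run the general article-extraction pipeline first.
  ::Boilerpipe::Extractors::ArticleExtractor.process doc
  # Then split text blocks at paragraph boundaries.
  ::Boilerpipe::Filters::SplitParagraphBlocksFilter.process doc
  # Finally apply the minimum-clause-words filter; its result is the return value.
  ::Boilerpipe::Filters::MinClauseWordsFilter.process doc
end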

.text(contents) ⇒ Object



# File 'lib/boilerpipe/extractors/article_sentence_extractor.rb', line 5

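def self.text(contents)
  # Parse the HTML input into a Boilerpipe document.
  doc = ::Boilerpipe::SAX::BoilerpipeHTMLParser.parse(contents)
  # Run the extractor; the document is modified in place.
  ::Boilerpipe::Extractors::ArticleSentenceExtractor.process(doc)
  # Return the text of whatever the filters left as content.
  doc.content
end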
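
The two methods above follow a simple pattern: `.text` builds a document, `.process` passes that same document through a fixed sequence of stages that modify it in place, and the caller then reads `doc.content`. The sketch below imitates that pattern with stand-in classes so it can run without the gem. `Block`, `SketchDocument`, `SplitAtNewlines`, `MinWords`, and `SketchExtractor` are all hypothetical names, not part of Boilerpipe, and the word-count rule is a deliberate simplification that may not match what `MinClauseWordsFilter` actually checks.

```ruby
# Stand-in text block: some text plus a keep/drop flag (hypothetical).
Block = Struct.new(:text, :is_content)

# Stand-in document holding a mutable list of blocks (hypothetical).
class SketchDocument
  attr_accessor :blocks

  def initialize(blocks)
    @blocks = blocks
  end

  # Joined text of the blocks still flagged as content.
  def content
    blocks.select(&:is_content).map(&:text).join("\n")
  end
end

# Hypothetical stage: split every block at newlines, keeping its flag.
module SplitAtNewlines
  def self.process(doc)
    doc.blocks = doc.blocks.flat_map do |b|
      b.text.split("\n").map { |line| Block.new(line, b.is_content) }
    end
    doc
  end
end

# Hypothetical stage: unflag blocks with fewer than MIN_WORDS words.
module MinWords
  MIN_WORDS = 5

  def self.process(doc)
    doc.blocks.each do |b|
      b.is_content = false if b.text.split.size < MIN_WORDS
    end
    doc
  end
end

# Mirrors the shape of the extractor: stages run in order on one document.
module SketchExtractor
  def self.process(doc)
    SplitAtNewlines.process doc
    MinWords.process doc
  end

  def self.text(blocks)
    doc = SketchDocument.new(blocks)
    process(doc)
    doc.content
  end
end

blocks = [
  Block.new("Share this\nThe council approved the new budget on Tuesday evening.", true),
  Block.new("Copyright notice", false)
]
puts SketchExtractor.text(blocks)
```

With the real gem, the equivalent call is `Boilerpipe::Extractors::ArticleSentenceExtractor.text(contents)`, which, per the source shown above, parses `contents`, runs `.process` on the resulting document, and returns `doc.content`.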