Module: Boilerpipe::Extractors

Defined in:
lib/boilerpipe/extractors/keep_everything_extractor.rb,
lib/boilerpipe/extractors/canola_extractor.rb,
lib/boilerpipe/extractors/article_extractor.rb,
lib/boilerpipe/extractors/default_extractor.rb,
lib/boilerpipe/extractors/largest_content_extractor.rb,
lib/boilerpipe/extractors/num_words_rules_extractor.rb,
lib/boilerpipe/extractors/article_sentence_extractor.rb,
lib/boilerpipe/extractors/keep_everything_with_k_min_words_extractor.rb

Overview

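Namespace for the full-text extractors listed below. The description that follows applies to LargestContentExtractor: a full-text extractor that extracts the largest text component of a page. For news articles, it may perform better than DefaultExtractor, but it usually performs worse than ArticleExtractor.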
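
To make "largest text component" concrete, here is a minimal sketch of the idea in plain Ruby. It is not the gem's implementation: the helper name largest_text_block is hypothetical, and it assumes plain text whose blocks are separated by blank lines, whereas the extractors work on HTML and may use a different pipeline.

```ruby
# Illustrative sketch only, not the gem's code.
# Hypothetical helper: split plain text into blocks on blank lines and
# return the block with the most whitespace-separated words.
# Returns nil when the input contains no non-empty block.
def largest_text_block(text)
  blocks = text.split(/\n\s*\n/).map(&:strip).reject(&:empty?)
  blocks.max_by { |block| block.split.size }
end

# Assumed sample input: navigation, one article paragraph, a footer.
page = <<~TEXT
  Home | About | Contact

  The city council voted on Tuesday to approve the new transit plan,
  which adds three bus routes and extends evening service.

  Copyright Example News
TEXT

puts largest_text_block(page)
```

Picking the block with the most words keeps the article paragraph and drops the short navigation and footer lines, which is the intuition behind preferring the largest component on article-like pages.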

Defined Under Namespace

Classes: ArticleExtractor, ArticleSentenceExtractor, CanolaExtractor, DefaultExtractor, KeepEverythingExtractor, KeepEverythingWithKMinWordsExtractor, LargestContentExtractor, NumWordsRulesExtractor