Class: PragmaticSegmenter::Languages::English::Cleaner

Inherits:
Cleaner
  • Object
Defined in:
lib/pragmatic_segmenter/languages/english.rb

Constant Summary

Constants included from Rules

Rules::AbbreviationsWithMultiplePeriodsAndEmailRule, Rules::ConsecutiveForwardSlashRule, Rules::ConsecutivePeriodsRule, Rules::DoubleNewLineRule, Rules::DoubleNewLineWithSpaceRule, Rules::EscapedCarriageReturnRule, Rules::EscapedNewLineRule, Rules::ExtraWhiteSpaceRule, Rules::GeoLocationRule, Rules::InlineFormattingRule, Rules::NEWLINE_IN_MIDDLE_OF_SENTENCE_REGEX, Rules::NO_SPACE_BETWEEN_SENTENCES_DIGIT_REGEX, Rules::NO_SPACE_BETWEEN_SENTENCES_REGEX, Rules::NewLineFollowedByBulletRule, Rules::NewLineFollowedByPeriodRule, Rules::NewLineInMiddleOfWordRule, Rules::NoSpaceBetweenSentencesDigitRule, Rules::NoSpaceBetweenSentencesRule, Rules::PDF_NewLineInMiddleOfSentenceNoSpacesRule, Rules::PDF_NewLineInMiddleOfSentenceRule, Rules::QuestionMarkInQuotationRule, Rules::QuotationsFirstRule, Rules::QuotationsSecondRule, Rules::ReplaceNewlineWithCarriageReturnRule, Rules::SingleNewLineRule, Rules::SubSingleQuoteRule, Rules::TableOfContentsRule, Rules::TypoEscapedCarriageReturnRule, Rules::TypoEscapedNewLineRule, Rules::URL_EMAIL_KEYWORDS

Instance Attribute Summary

Attributes inherited from Cleaner

#doc_type, #text

Instance Method Summary

Methods inherited from Cleaner

#initialize

Constructor Details

This class inherits a constructor from PragmaticSegmenter::Cleaner

Instance Method Details

#clean ⇒ Object

# File 'lib/pragmatic_segmenter/languages/english.rb', line 7

def clean
  super
  clean_quotations(@clean_text)
end
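
The override above follows a common pattern: run the inherited cleanup through super, then apply a language-specific pass to the intermediate result held in @clean_text. The following is a minimal, self-contained sketch of that pattern, not the gem's actual code. The class names SketchCleaner and SketchEnglishCleaner, the whitespace rule in the base class, and the body of clean_quotations (replacing backticks with apostrophes) are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for the parent cleaner; NOT PragmaticSegmenter::Cleaner.
class SketchCleaner
  attr_reader :text, :doc_type

  def initialize(text:, doc_type: nil)
    @text = text.dup
    @doc_type = doc_type
  end

  # Assumed base behaviour: collapse runs of spaces and store the
  # result in @clean_text so that subclasses can post-process it.
  def clean
    @clean_text = @text.gsub(/ {2,}/, ' ')
  end
end

# Hypothetical stand-in for the English subclass shown above.
class SketchEnglishCleaner < SketchCleaner
  def clean
    super                          # run the generic cleanup first
    clean_quotations(@clean_text)  # then the language-specific pass
  end

  private

  # Assumed behaviour: normalise backticks to apostrophes.
  def clean_quotations(txt)
    txt.gsub(/`/, "'")
  end
end

cleaner = SketchEnglishCleaner.new(text: "It`s  a  test.")
puts cleaner.clean
```

In this sketch the return value of clean is whatever clean_quotations returns, which mirrors the shape of the method above: the last expression in the overriding method determines what the caller receives.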