Class: PragmaticSegmenter::AbbreviationReplacer
- Inherits: Object
  - Object
  - PragmaticSegmenter::AbbreviationReplacer
- Defined in: lib/pragmatic_segmenter/abbreviation_replacer.rb
Overview
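This class finds the periods inside abbreviations and replaces them with a placeholder character ('∯' in the rules listed under Constant Summary), so that those periods are not mistaken for sentence boundaries.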
Direct Known Subclasses
Languages::Arabic::AbbreviationReplacer, Languages::Deutsch::AbbreviationReplacer, Languages::Dutch::AbbreviationReplacer, Languages::English::AbbreviationReplacer, Languages::French::AbbreviationReplacer, Languages::Italian::AbbreviationReplacer, Languages::Persian::AbbreviationReplacer, Languages::Polish::AbbreviationReplacer, Languages::Russian::AbbreviationReplacer, Languages::Spanish::AbbreviationReplacer
Defined Under Namespace
Modules: AmPmRules
Constant Summary
- PossessiveAbbreviationRule =
Rubular: rubular.com/r/yqa4Rit8EY
Rule.new(/\.(?='s\s)|\.(?='s$)|\.(?='s\z)/, '∯')
- KommanditgesellschaftRule =
Rubular: rubular.com/r/NEv265G2X2
Rule.new(/(?<=Co)\.(?=\sKG)/, '∯')
- MULTI_PERIOD_ABBREVIATION_REGEX =
Rubular: rubular.com/r/xDkpFZ0EgH
/\b[a-z](?:\.[a-z])+[.]/i
- SENTENCE_STARTERS =
%w(A Being Did For He How However I In It Millions More She That The There They We What When Where Who Why)
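To see what the patterns in the constant summary match, the following minimal sketch runs them with plain `String#gsub` and `String#scan`. It does not use the library's `Rule` or `Text` classes; the local variable names and the sample sentences are assumptions made for illustration only.

```ruby
# Patterns copied from the constant summary; variable names are illustrative.
mask = '∯'
possessive_pattern   = /\.(?='s\s)|\.(?='s$)|\.(?='s\z)/
kommandit_pattern    = /(?<=Co)\.(?=\sKG)/
multi_period_pattern = /\b[a-z](?:\.[a-z])+[.]/i

# Only the period directly before a possessive 's is masked.
possessive = "The U.S.'s position".gsub(possessive_pattern, mask)

# Only the period between "Co" and " KG" is masked.
kommandit = "Muster GmbH & Co. KG".gsub(kommandit_pattern, mask)

# scan returns every multi-period abbreviation found in the string.
multi = "Meet at 5 p.m. or e.g. later".scan(multi_period_pattern)

puts possessive    # The U.S∯'s position
puts kommandit     # Muster GmbH & Co∯ KG
puts multi.inspect # ["p.m.", "e.g."]
```

The lookahead and lookbehind assertions keep the surrounding characters out of the match, so only the period itself is replaced.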
Instance Attribute Summary
-
#text ⇒ Object
readonly
Returns the value of attribute text.
Instance Method Summary
-
#initialize(text:) ⇒ AbbreviationReplacer
constructor
A new instance of AbbreviationReplacer.
- #replace ⇒ Object
Constructor Details
#initialize(text:) ⇒ AbbreviationReplacer
Returns a new instance of AbbreviationReplacer.
# File 'lib/pragmatic_segmenter/abbreviation_replacer.rb', line 37
def initialize(text:)
  @text = Text.new(text)
end
Instance Attribute Details
#text ⇒ Object (readonly)
Returns the value of attribute text.
# File 'lib/pragmatic_segmenter/abbreviation_replacer.rb', line 36
def text
  @text
end
Instance Method Details
#replace ⇒ Object
# File 'lib/pragmatic_segmenter/abbreviation_replacer.rb', line 41
def replace
  @reformatted_text = text.apply(PossessiveAbbreviationRule)
  @reformatted_text = text.apply(KommanditgesellschaftRule)
  @reformatted_text = PragmaticSegmenter::SingleLetterAbbreviation.new(text: @reformatted_text).replace
  @reformatted_text = search_for_abbreviations_in_string(@reformatted_text, abbreviations)
  @reformatted_text = replace_multi_period_abbreviations(@reformatted_text)
  @reformatted_text = @reformatted_text.apply(AmPmRules::All)
  replace_abbreviation_as_sentence_boundary(@reformatted_text)
end
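The method above runs a fixed sequence of replacement steps, each one working on the text produced by the previous step. The sketch below illustrates that chaining idea for the first two rules only. The `rule` struct and the `reduce` loop are hypothetical stand-ins written for this example; they are not the library's `Rule` class or `Text#apply`, whose behaviour is not shown on this page.

```ruby
# Hypothetical stand-in for a rule: a pattern plus its replacement string.
rule = Struct.new(:pattern, :replacement)

# The two patterns are copied from the constant summary.
rules = [
  rule.new(/\.(?='s\s)|\.(?='s$)|\.(?='s\z)/, '∯'), # possessive abbreviations
  rule.new(/(?<=Co)\.(?=\sKG)/, '∯')                # "Co. KG"
]

text = "Smith & Co. KG hired the U.S.'s best."

# Apply the rules in order; each step receives the previous step's output.
masked = rules.reduce(text) do |current, r|
  current.gsub(r.pattern, r.replacement)
end

puts masked # Smith & Co∯ KG hired the U.S∯'s best.
```

Because `gsub` returns a new string, the input `text` is left unchanged in this sketch; only `masked` carries the placeholder characters.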