Class: PragmaticTokenizer::Languages::French::SingleQuotes

Inherits:
Object
Defined in:
lib/pragmatic_tokenizer/languages/french.rb

Instance Method Summary

Instance Method Details

#handle_single_quotes(text) ⇒ Object



# File 'lib/pragmatic_tokenizer/languages/french.rb', line 10

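def handle_single_quotes(text)
  # Apostrophe followed by a non-word character or end of line: pad the placeholder with spaces.
  text.gsub!(/(\w|\D)'(?!')(?=\W|$)/o) { Regexp.last_match(1) + ' ' + PragmaticTokenizer::Languages::Common::PUNCTUATION_MAP["'"] + ' ' } || text
  # Apostrophe after a non-word character or line start, with a word character later on the line.
  text.gsub!(/(\W|^)'(?=.*\w)/o, ' ' + PragmaticTokenizer::Languages::Common::PUNCTUATION_MAP["'"]) || text
  # Elided article l'/L'. These patterns have no capture groups, so the \1 and \2
  # in the original replacement strings expanded to empty strings and are omitted here.
  text.gsub!(/l\'/, ' l☮ ') || text
  text.gsub!(/L\'/, ' L☮ ') || text
end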
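
The four substitutions above can be tried in isolation. What follows is a minimal sketch, not the library's code: `handle_single_quotes_sketch` and `APOSTROPHE_PLACEHOLDER` are hypothetical names, and the placeholder value "☮" is assumed from the literals in the method body rather than read from `PUNCTUATION_MAP`. Unlike the method shown, the sketch works on a copy instead of mutating its argument.

```ruby
# Assumed stand-in for PragmaticTokenizer::Languages::Common::PUNCTUATION_MAP["'"].
APOSTROPHE_PLACEHOLDER = '☮'

# Hypothetical standalone version of the four substitutions, applied in the same order.
def handle_single_quotes_sketch(text)
  text = text.dup # leave the caller's string untouched
  # 1. Apostrophe followed by a non-word character or end of line.
  text.gsub!(/(\w|\D)'(?!')(?=\W|$)/) { Regexp.last_match(1) + ' ' + APOSTROPHE_PLACEHOLDER + ' ' }
  # 2. Apostrophe after a non-word character or line start, with a word character later on.
  text.gsub!(/(\W|^)'(?=.*\w)/, ' ' + APOSTROPHE_PLACEHOLDER)
  # 3. and 4. Elided article, lower and upper case.
  text.gsub!(/l'/, ' l' + APOSTROPHE_PLACEHOLDER + ' ')
  text.gsub!(/L'/, ' L' + APOSTROPHE_PLACEHOLDER + ' ')
  text
end

puts handle_single_quotes_sketch("l'avion").inspect # => " l☮ avion"
```

Note that the `l'` and `L'` patterns are not anchored to a word boundary, so they match any `l` or `L` directly followed by an apostrophe, including one at the end of a longer word.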