Class: Treat::Workers::Processors::Segmenters::Scalpel

Inherits:
Object
  • Object
show all
Defined in:
lib/treat/workers/processors/segmenters/scalpel.rb

Overview

Sentence segmentation based on a set of predefined rules that handle a large number of usage cases of sentence enders. The idea is to remove all cases of .!? being used for other purposes than marking a full stop before naively segmenting the text.

Class Method Summary collapse

Class Method Details

.segment(entity, options = {}) ⇒ Object

Segment a text using the Scalpel algorithm.



11
12
13
14
15
16
17
18
# File 'lib/treat/workers/processors/segmenters/scalpel.rb', line 11

def self.segment(entity, options = {})
  sentences = Scalpel.cut(entity.to_s)
  sentences.each do |sentence|
    entity << Treat::Entities::Phrase.
    from_string(sentence.strip)
  end
  entity
end