Class: StanfordParser::DocumentPreprocessor

Inherits:
Rjb::JavaObjectWrapper
Defined in:
lib/stanfordparser.rb

Overview

Tokenizes documents into words and sentences.

This is a wrapper for the edu.stanford.nlp.process.DocumentPreprocessor Java class.

Direct Known Subclasses

StandoffDocumentPreprocessor

Instance Attribute Summary

Attributes inherited from Rjb::JavaObjectWrapper

#java_object

Instance Method Summary

Methods inherited from Rjb::JavaObjectWrapper

#each, #method_missing

Constructor Details

#initialize(suppressEscaping = false) ⇒ DocumentPreprocessor

Returns a new instance of DocumentPreprocessor.



# File 'lib/stanfordparser.rb', line 238

def initialize(suppressEscaping = false)
  super("edu.stanford.nlp.process.DocumentPreprocessor", suppressEscaping)
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method of the class Rjb::JavaObjectWrapper.
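The general idea of handling dynamic methods through method_missing can be sketched in plain Ruby. This is only an illustration of the forwarding pattern, not the Rjb::JavaObjectWrapper source: the ForwardingWrapper class and its wrapped_object attribute are hypothetical names, and an ordinary Ruby String stands in for the wrapped Java object.

```ruby
# Illustrative stand-in only: this is NOT the Rjb::JavaObjectWrapper source.
# ForwardingWrapper and wrapped_object are hypothetical names.
class ForwardingWrapper
  attr_reader :wrapped_object

  def initialize(wrapped_object)
    @wrapped_object = wrapped_object
  end

  # Forward any method this class does not define to the wrapped object.
  def method_missing(name, *args, &block)
    if @wrapped_object.respond_to?(name)
      @wrapped_object.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    @wrapped_object.respond_to?(name) || super
  end
end

# A plain String plays the role of the wrapped object here.
wrapper = ForwardingWrapper.new("one two three")
words = wrapper.split(" ") # String#split, reached through method_missing
```

Calls that the wrapper class defines itself, such as wrapped_object, are dispatched normally; only undefined methods reach method_missing.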

Instance Method Details

#getSentencesFromString(s) ⇒ Object

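Returns a list of the sentences in the string s. The string is wrapped in a java.io.StringReader and passed to the Java method getSentencesFromText.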



# File 'lib/stanfordparser.rb', line 243

def getSentencesFromString(s)
  s = Rjb::JavaObjectWrapper.new("java.io.StringReader", s)
  _invoke(:getSentencesFromText, "Ljava.io.Reader;", s.java_object)
end

#inspect ⇒ Object



# File 'lib/stanfordparser.rb', line 248

def inspect
  "<#{self.class.to_s.split('::').last}>"
end

#to_s ⇒ Object



# File 'lib/stanfordparser.rb', line 252

def to_s
  inspect
end
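
To see what inspect and to_s produce, the two method bodies above can be copied into a stand-in class that does not need a Java runtime. This is a minimal sketch: DemoNamespace and DemoPreprocessor are hypothetical names used only for illustration and are not part of the library.

```ruby
# DemoNamespace and DemoPreprocessor are hypothetical names for illustration.
module DemoNamespace
  class DemoPreprocessor
    # Same body as the inspect method documented above:
    # keep only the last component of the fully qualified class name.
    def inspect
      "<#{self.class.to_s.split('::').last}>"
    end

    # to_s delegates to inspect, as in the documented class.
    def to_s
      inspect
    end
  end
end

demo = DemoNamespace::DemoPreprocessor.new
label = demo.inspect # "<DemoPreprocessor>"
text = demo.to_s     # same string as label
```

The namespace prefix is dropped, so by the same logic a StanfordParser::DocumentPreprocessor instance would be displayed as <DocumentPreprocessor>.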