Class: TwitterKorean::Processor
- Inherits: Object
  - Object
    - TwitterKorean::Processor
- Defined in: lib/twitter_korean/processor.rb
Overview
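Ruby interface to the Scala TwitterKoreanProcessor. It wraps the JVM-side processor and exposes normalize, tokenize, stem, and extract_phrases as Ruby methods.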
Instance Attribute Summary
- #java_convertor ⇒ Object (readonly)
  Returns the value of attribute java_convertor.
- #jvm_processor ⇒ Object (readonly)
  Returns the value of attribute jvm_processor.
Instance Method Summary
- #extract_phrases(text, options = {}) ⇒ Object
- #initialize(*jvmargs) ⇒ Processor (constructor)
  A new instance of Processor.
- #normalize(text) ⇒ Object
- #stem(text) ⇒ Object
- #tokenize(text) ⇒ Object
Constructor Details
#initialize(*jvmargs) ⇒ Processor
Returns a new instance of Processor.
# File 'lib/twitter_korean/processor.rb', line 8
def initialize(*jvmargs)
  bridge = TwitterKorean::JvmBridge.new(jvmargs)
  @jvm_processor = bridge.scala_twitter_korean_processor
end
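The constructor gathers its arguments into `jvmargs` with a splat and passes the resulting array to `TwitterKorean::JvmBridge`. The following is a minimal sketch of that pattern only. `SketchBridge` and `SketchProcessor` are hypothetical stand-ins so the example runs without the gem or a JVM, and the flags `-Xmx512m` and `-Xss2m` are ordinary JVM options used for illustration; how the real bridge interprets its arguments is not shown on this page.

```ruby
# Hypothetical stand-in for TwitterKorean::JvmBridge. It only records the
# arguments it receives and returns a placeholder instead of a JVM object.
class SketchBridge
  attr_reader :jvmargs

  def initialize(jvmargs)
    @jvmargs = jvmargs
  end

  def scala_twitter_korean_processor
    :placeholder_processor
  end
end

# Mirrors the constructor shape shown above: *jvmargs collects every
# positional argument into one array before it reaches the bridge.
class SketchProcessor
  attr_reader :jvm_processor, :bridge

  def initialize(*jvmargs)
    @bridge = SketchBridge.new(jvmargs)
    @jvm_processor = @bridge.scala_twitter_korean_processor
  end
end

no_args = SketchProcessor.new
two_args = SketchProcessor.new('-Xmx512m', '-Xss2m')

p no_args.bridge.jvmargs  # => []
p two_args.bridge.jvmargs # => ["-Xmx512m", "-Xss2m"]
```

Calling the constructor with no arguments yields an empty array, so the bridge always receives an Array rather than nil.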
Instance Attribute Details
#java_convertor ⇒ Object (readonly)
Returns the value of attribute java_convertor.
# File 'lib/twitter_korean/processor.rb', line 6
def java_convertor
  @java_convertor
end
#jvm_processor ⇒ Object (readonly)
Returns the value of attribute jvm_processor.
# File 'lib/twitter_korean/processor.rb', line 6
def jvm_processor
  @jvm_processor
end
Instance Method Details
#extract_phrases(text, options = {}) ⇒ Object
# File 'lib/twitter_korean/processor.rb', line 32
def extract_phrases(text, options = {})
  return unless text
  filter_spam = options[:filter_spam] || false
  including_hashtags = options[:including_hashtags] || true
  converto_to_korean_tokens do
    jvm_processor.extractPhrases(jvm_processor.tokenize(text), filter_spam, including_hashtags)
  end
end
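The listing above applies option defaults with `||`. For a default of `false` this behaves as expected, but for a default of `true` the expression `value || true` evaluates to `true` even when the caller passes `false`, because `false || true` is `true` in Ruby. The sketch below contrasts that with `Hash#fetch`, which falls back only when the key is absent. Both helper names are hypothetical and neither is part of the gem.

```ruby
# Hypothetical helpers for illustration only; neither is part of the gem.

# Default applied with ||, as in the listing above.
def hashtags_with_or(options = {})
  options[:including_hashtags] || true
end

# Default applied with Hash#fetch, which falls back only when the key is absent.
def hashtags_with_fetch(options = {})
  options.fetch(:including_hashtags, true)
end

p hashtags_with_or({})                               # => true
p hashtags_with_or({ including_hashtags: false })    # => true
p hashtags_with_fetch({})                            # => true
p hashtags_with_fetch({ including_hashtags: false }) # => false
```

Going by the listing alone, passing `including_hashtags: false` to `extract_phrases` would appear to have no effect; callers who need hashtags excluded may want to verify this against the installed version.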
#normalize(text) ⇒ Object
# File 'lib/twitter_korean/processor.rb', line 13
def normalize(text)
  return unless text
  jvm_processor.normalize(text).toString
end
#stem(text) ⇒ Object
# File 'lib/twitter_korean/processor.rb', line 25
def stem(text)
  return unless text
  converto_to_korean_tokens do
    jvm_processor.stem(jvm_processor.tokenize(text))
  end
end
#tokenize(text) ⇒ Object
# File 'lib/twitter_korean/processor.rb', line 18
def tokenize(text)
  return unless text
  converto_to_korean_tokens do
    jvm_processor.tokenize(text)
  end
end
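The tokenize and stem methods above share one shape: return nil for nil input, otherwise call the JVM processor inside `converto_to_korean_tokens`, with stem tokenizing the text first. The sketch below illustrates that delegation pattern under stated assumptions. `StubJvmProcessor`, `SketchTokenizer`, and `wrap_tokens` are hypothetical; the stub splits on whitespace and downcases, which is not Korean analysis, and `wrap_tokens` merely stands in for `converto_to_korean_tokens`, whose body does not appear on this page.

```ruby
# Hypothetical stub standing in for the Scala processor. Its "tokenize" is a
# plain whitespace split and its "stem" downcases each token.
class StubJvmProcessor
  def tokenize(text)
    text.split
  end

  def stem(tokens)
    tokens.map(&:downcase)
  end
end

class SketchTokenizer
  def initialize(jvm_processor)
    @jvm_processor = jvm_processor
  end

  def tokenize(text)
    return unless text # nil in, nil out

    wrap_tokens { @jvm_processor.tokenize(text) }
  end

  def stem(text)
    return unless text

    # Stemming operates on tokens, so the text is tokenized first.
    wrap_tokens { @jvm_processor.stem(@jvm_processor.tokenize(text)) }
  end

  private

  # Hypothetical stand-in for converto_to_korean_tokens; here it only
  # coerces the block result to a Ruby Array.
  def wrap_tokens
    Array(yield)
  end
end

sketch = SketchTokenizer.new(StubJvmProcessor.new)
p sketch.tokenize('Hello World') # => ["Hello", "World"]
p sketch.stem('Hello World')     # => ["hello", "world"]
p sketch.tokenize(nil)           # => nil
```

Injecting the processor through the constructor, as the sketch does, is one way to exercise the guard and delegation logic in tests without starting a JVM.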