Module: TexterraNLP
Constant Summary
Constants included from TexterraNLPSpecs
Instance Method Summary
-
#disambiguation_annotate(text) ⇒ Hash
Detects the most appropriate meanings (concepts) for terms occurring in a given text.
-
#domain_detection_annotate(text) ⇒ Hash
Detects the most appropriate domain for the given text.
-
#domain_polarity_detection_annotate(text, domain = '') ⇒ Hash
Detects whether the given text has positive, negative, or no sentiment, with respect to domain.
-
#key_concepts_annotate(text) ⇒ Hash
Key concepts are the concepts that provide a short (conceptual) and informative description of a text.
-
#language_detection_annotate(text) ⇒ Hash
Detects the language of a given text.
-
#lemmatization_annotate(text) ⇒ Hash
Detects the lemma of each word in a given text.
-
#named_entities_annotate(text) ⇒ Hash
Finds all named entity occurrences in a given text.
-
#polarity_detection_annotate(text) ⇒ Hash
Detects whether the given text has positive, negative or no sentiment.
-
#pos_tagging_annotate(text) ⇒ Hash
Detects the part-of-speech tag for each word in a given text.
-
#sentence_detection_annotate(text) ⇒ Hash
Detects boundaries of sentences in a given text.
-
#spelling_correction_annotate(text) ⇒ Hash
Tries to correct misprints and other spelling errors in a given text.
-
#subjectivity_detection_annotate(text) ⇒ Hash
Detects whether the given text is subjective or not.
-
#syntax_detection(text) ⇒ Hash
Detects syntax relations in a text.
-
#term_detection_annotate(text) ⇒ Hash
Extracts non-overlapping terms within a given text; a term is a textual representation of some real-world concept.
-
#tokenization_annotate(text) ⇒ Hash
Detects all tokens (minimal significant text parts) in a given text.
-
#tweet_normalization(text) ⇒ Hash
Detects Twitter-specific entities: Hashtags, User names, Emoticons, URLs.
Instance Method Details
#disambiguation_annotate(text) ⇒ Hash
Detects the most appropriate meanings (concepts) for terms occurring in a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 73
def disambiguation_annotate(text)
  preset_nlp(:disambiguation, text)
end
#domain_detection_annotate(text) ⇒ Hash
Detects the most appropriate domain for the given text. Currently only two specific domains are supported: ‘movie’ and ‘politics’. If no domain from this list is detected, the text is treated as having no domain (the general domain).
# File 'lib/ispras-api/texterra/nlp.rb', line 92
def domain_detection_annotate(text)
  preset_nlp(:domainDetection, text)
end
#domain_polarity_detection_annotate(text, domain = '') ⇒ Hash
Detects whether the given text has positive, negative, or no sentiment, with respect to domain. If no domain is provided, domain detection is applied first so that the method can achieve the best results. If no domain is detected, the general-domain algorithm is applied.
# File 'lib/ispras-api/texterra/nlp.rb', line 119
def domain_polarity_detection_annotate(text, domain = '')
  specs = NLP_SPECS[:domainPolarityDetection]
  domain = "(#{domain})" unless domain.empty?
  result = POST(specs[:path] % domain, specs[:params], {text: text}, :json)
  result[:annotations].each do |key, value|
    value.map! { |an| assign_text(an, text) }
  end
  result
end
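The optional domain is wrapped in parentheses and substituted into the request path with `%`. A minimal sketch of just that step follows. The template `SKETCH_PATH` and the helper name `sketch_polarity_path` are hypothetical; the real template is stored in `NLP_SPECS[:domainPolarityDetection][:path]`, which is not shown on this page.

```ruby
# Hypothetical path template; the real one lives in NLP_SPECS and is not shown here.
SKETCH_PATH = 'nlp/polarity%s'

# Mirrors the domain handling in the method above: an empty domain adds
# nothing to the path, a non-empty one is appended in parentheses.
def sketch_polarity_path(domain = '')
  suffix = domain.empty? ? '' : "(#{domain})"
  SKETCH_PATH % suffix
end

puts sketch_polarity_path('movie')
puts sketch_polarity_path
```

With an empty domain the placeholder collapses to nothing, which is presumably how the service decides to run domain detection itself.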
#key_concepts_annotate(text) ⇒ Hash
Key concepts are the concepts that provide a short (conceptual) and informative description of a text. This service extracts a set of key concepts for a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 82
def key_concepts_annotate(text)
  preset_nlp(:keyConcepts, text)
end
#language_detection_annotate(text) ⇒ Hash
Detects the language of a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 9
def language_detection_annotate(text)
  preset_nlp(:languageDetection, text)
end
#lemmatization_annotate(text) ⇒ Hash
Detects the lemma of each word in a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 33
def lemmatization_annotate(text)
  preset_nlp(:lemmatization, text)
end
#named_entities_annotate(text) ⇒ Hash
Finds all named entity occurrences in a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 57
def named_entities_annotate(text)
  preset_nlp(:namedEntities, text)
end
#polarity_detection_annotate(text) ⇒ Hash
Detects whether the given text has positive, negative or no sentiment
# File 'lib/ispras-api/texterra/nlp.rb', line 108
def polarity_detection_annotate(text)
  preset_nlp(:polarityDetection, text)
end
#pos_tagging_annotate(text) ⇒ Hash
Detects the part-of-speech tag for each word in a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 41
def pos_tagging_annotate(text)
  preset_nlp(:posTagging, text)
end
#sentence_detection_annotate(text) ⇒ Hash
Detects boundaries of sentences in a given text
# File 'lib/ispras-api/texterra/nlp.rb', line 17
def sentence_detection_annotate(text)
  preset_nlp(:sentenceDetection, text)
end
#spelling_correction_annotate(text) ⇒ Hash
Tries to correct misprints and other spelling errors in a given text.
# File 'lib/ispras-api/texterra/nlp.rb', line 49
def spelling_correction_annotate(text)
  preset_nlp(:spellingCorrection, text)
end
#subjectivity_detection_annotate(text) ⇒ Hash
Detects whether the given text is subjective or not
# File 'lib/ispras-api/texterra/nlp.rb', line 100
def subjectivity_detection_annotate(text)
  preset_nlp(:subjectivityDetection, text)
end
#syntax_detection(text) ⇒ Hash
Detects syntax relations in a text. Only works for Russian texts.
# File 'lib/ispras-api/texterra/nlp.rb', line 142
def syntax_detection(text)
  result = preset_nlp(:syntaxDetection, text)
  result[:annotations][:'syntax-relation'].each do |an|
    an[:value][:parent] = assign_text(an[:value][:parent], text) if an[:value] && an[:value][:parent]
  end
  result
end
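The post-processing step in this method can be tried on a canned response. The sketch below makes two assumptions: that `assign_text` (not shown on this page) copies the substring covered by `:start` and `:end` into a `:text` field, and that annotations have the shape the method body implies. The sample offsets are invented for illustration.

```ruby
# Assumed behaviour of assign_text, which is not shown on this page:
# return the annotation with the covered substring stored under :text.
def assign_text(annotation, text)
  annotation.merge(text: text[annotation[:start]...annotation[:end]])
end

text = 'Мама мыла раму'

# Canned response; the offsets are invented, not real service output.
result = {
  annotations: {
    'syntax-relation': [
      { start: 0, end: 4, value: { parent: { start: 5, end: 9 } } },
      { start: 5, end: 9, value: nil } # a root word has no parent to resolve
    ]
  }
}

# Same loop as in syntax_detection: only annotations with a parent are touched.
result[:annotations][:'syntax-relation'].each do |an|
  next unless an[:value] && an[:value][:parent]

  an[:value][:parent] = assign_text(an[:value][:parent], text)
end

puts result[:annotations][:'syntax-relation'].first[:value][:parent][:text]
```

The guard matters because an annotation without a value, or without a parent, would otherwise raise `NoMethodError` on `nil`.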
#term_detection_annotate(text) ⇒ Hash
Extracts non-overlapping terms within a given text; a term is a textual representation of some real-world concept.
# File 'lib/ispras-api/texterra/nlp.rb', line 65
def term_detection_annotate(text)
  preset_nlp(:termDetection, text)
end
#tokenization_annotate(text) ⇒ Hash
Detects all tokens (minimal significant text parts) in a given text
# File 'lib/ispras-api/texterra/nlp.rb', line 25
def tokenization_annotate(text)
  preset_nlp(:tokenization, text)
end
#tweet_normalization(text) ⇒ Hash
Detects Twitter-specific entities: hashtags, user names, emoticons, URLs. It also detects stop-words, misspellings, spelling suggestions, and spelling corrections.
# File 'lib/ispras-api/texterra/nlp.rb', line 134
def tweet_normalization(text)
  preset_nlp(:tweetNormalization, text)
end
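Most methods on this page are one-line wrappers around `preset_nlp`, whose body is not shown. The sketch below illustrates one way such a helper might be structured; it is not the library's implementation. `SketchClient`, `SKETCH_SPECS`, the lowercase `post` stub, and the assumed behaviour of `assign_text` are all hypothetical, and the stub returns a canned response instead of calling the service.

```ruby
# Hypothetical spec table; the real NLP_SPECS entries are not shown on this page.
SKETCH_SPECS = {
  tokenization: { path: 'nlp/tokenization', params: { class: 'token' } }
}.freeze

class SketchClient
  # Stand-in for the HTTP layer: ignores the request and returns a canned response.
  def post(_path, _params, body)
    { text: body[:text], annotations: { token: [{ start: 0, end: 5 }, { start: 6, end: 11 }] } }
  end

  # Assumed behaviour of assign_text: store the covered substring under :text.
  def assign_text(annotation, text)
    annotation.merge(text: text[annotation[:start]...annotation[:end]])
  end

  # Looks up the spec by name, sends the text, and attaches text to each annotation.
  def preset_nlp(name, text)
    specs = SKETCH_SPECS.fetch(name)
    result = post(specs[:path], specs[:params], { text: text })
    result[:annotations].each_value do |list|
      list.map! { |an| assign_text(an, text) }
    end
    result
  end

  # Thin wrapper in the same style as the methods documented above.
  def tokenization_annotate(text)
    preset_nlp(:tokenization, text)
  end
end

tokens = SketchClient.new.tokenization_annotate('hello world')[:annotations][:token]
puts tokens.map { |an| an[:text] }.inspect
```

Keeping the path and parameters in one table means each public method only has to name its spec, which matches the uniform one-line bodies listed above.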