Module: TexterraNLP

Includes: TexterraNLPSpecs

Included in: TexterraAPI

Defined in: lib/ispras-api/texterra/nlp.rb

Constant Summary

Constants included from TexterraNLPSpecs

TexterraNLPSpecs::NLP_SPECS

Instance Method Summary

#disambiguation_annotate(text) ⇒ Hash

Detects the most appropriate meanings (concepts) for terms occurring in a given text.
#domain_detection_annotate(text) ⇒ Hash

Detects the most appropriate domain for the given text.
#domain_polarity_detection_annotate(text, domain = '') ⇒ Hash

Detects whether the given text has positive, negative, or no sentiment, with respect to domain.
#key_concepts_annotate(text) ⇒ Hash

Key concepts are the concepts that provide a short (conceptual) and informative description of a text.
#language_detection_annotate(text) ⇒ Hash

Detects the language of a given text.
#lemmatization_annotate(text) ⇒ Hash

Detects the lemma of each word in a given text.
#named_entities_annotate(text) ⇒ Hash

Finds all named entity occurrences in a given text.
#polarity_detection_annotate(text) ⇒ Hash

Detects whether the given text has positive, negative or no sentiment.
#pos_tagging_annotate(text) ⇒ Hash

Detects the part-of-speech tag for each word in a given text.
#sentence_detection_annotate(text) ⇒ Hash

Detects boundaries of sentences in a given text.
#spelling_correction_annotate(text) ⇒ Hash

Tries to correct misprints and other spelling errors in a given text.
#subjectivity_detection_annotate(text) ⇒ Hash

Detects whether the given text is subjective or not.
#syntax_detection(text) ⇒ Hash

Detects syntax relations in a text.
#term_detection_annotate(text) ⇒ Hash

Extracts non-overlapping terms within a given text; a term is a textual representation of some real-world concept.
#tokenization_annotate(text) ⇒ Hash

Detects all tokens (minimal significant text parts) in a given text.
#tweet_normalization(text) ⇒ Hash

Detects Twitter-specific entities: hashtags, user names, emoticons, URLs.

Instance Method Details

#disambiguation_annotate(text) ⇒ `Hash`

Detects the most appropriate meanings (concepts) for terms occurring in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 73

def disambiguation_annotate(text)
  preset_nlp(:disambiguation, text)
end
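
Most methods in this module are one-line wrappers of this shape. The sketch below illustrates only the delegation pattern: `DelegationSketch`, `FakeClient`, and the hash returned by the stand-in `preset_nlp` are assumptions made for illustration, since the real `preset_nlp` is not shown on this page and presumably contacts the Texterra service.

```ruby
# Stand-in for the real module: only the delegation pattern is illustrated.
module DelegationSketch
  def disambiguation_annotate(text)
    preset_nlp(:disambiguation, text)
  end
end

# Hypothetical client used in place of TexterraAPI.
class FakeClient
  include DelegationSketch

  private

  # Assumed behavior: echo the preset name and the text instead of
  # sending a request. The real return layout may differ.
  def preset_nlp(preset, text)
    { preset: preset, text: text, annotations: {} }
  end
end

doc = FakeClient.new.disambiguation_annotate('Apple released a new phone.')
puts doc[:preset]
```

Because every wrapper passes a fixed preset symbol plus the caller's text, swapping `preset_nlp` for a recording stub like this one is a simple way to test calling code without network access.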

#domain_detection_annotate(text) ⇒ `Hash`

Detects the most appropriate domain for the given text. Currently only two specific domains are supported: ‘movie’ and ‘politics’. If neither of them is detected, the text is treated as having no domain (the general domain).

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 92

def domain_detection_annotate(text)
  preset_nlp(:domainDetection, text)
end

#domain_polarity_detection_annotate(text, domain = '') ⇒ `Hash`

Detects whether the given text has positive, negative, or no sentiment with respect to a domain. If no domain is provided, domain detection is applied first so that the method can use the best-suited algorithm. If no domain is detected, the general-domain algorithm is applied.

Parameters:

text (String) —

Text to process
domain (String) (defaults to: '') —

Domain for polarity detection

Returns:

(Hash) —

Texterra document

# File 'lib/ispras-api/texterra/nlp.rb', line 119

def domain_polarity_detection_annotate(text, domain = '')
  specs = NLP_SPECS[:domainPolarityDetection]
  domain = "(#{domain})" unless domain.empty?
  result = POST(specs[:path] % domain, specs[:params], {text: text}, :json)
  result[:annotations].each do |key, value|
    value.map! { |an| assign_text(an, text) }
  end
  result
end
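
The domain handling above can be tried in isolation. In this sketch, `PATH_TEMPLATE` and `polarity_path` are hypothetical names, and the template string is an assumption: the real value comes from `NLP_SPECS[:domainPolarityDetection][:path]`, which is not shown on this page.

```ruby
# Hypothetical template; the real one is defined in TexterraNLPSpecs::NLP_SPECS.
PATH_TEMPLATE = 'nlp/polarity%s'

# Mirrors the domain handling in domain_polarity_detection_annotate:
# a non-empty domain is wrapped in parentheses before being substituted
# into the template, and an empty domain leaves the placeholder blank.
def polarity_path(domain = '')
  domain = "(#{domain})" unless domain.empty?
  PATH_TEMPLATE % domain
end

puts polarity_path          # no domain given
puts polarity_path('movie') # explicit domain
```

Note that `domain.empty?` raises `NoMethodError` for `nil`, so callers should pass `''` (the default) rather than `nil` when no domain is known.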

#key_concepts_annotate(text) ⇒ `Hash`

Key concepts are the concepts that provide a short (conceptual) and informative description of a text. This service extracts a set of key concepts for a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 82

def key_concepts_annotate(text)
  preset_nlp(:keyConcepts, text)
end

#language_detection_annotate(text) ⇒ `Hash`

Detects the language of a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 9

def language_detection_annotate(text)
  preset_nlp(:languageDetection, text)
end

#lemmatization_annotate(text) ⇒ `Hash`

Detects the lemma of each word in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 33

def lemmatization_annotate(text)
  preset_nlp(:lemmatization, text)
end

#named_entities_annotate(text) ⇒ `Hash`

Finds all named entity occurrences in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 57

def named_entities_annotate(text)
  preset_nlp(:namedEntities, text)
end

#polarity_detection_annotate(text) ⇒ `Hash`

Detects whether the given text has positive, negative, or no sentiment.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 108

def polarity_detection_annotate(text)
  preset_nlp(:polarityDetection, text)
end

#pos_tagging_annotate(text) ⇒ `Hash`

Detects the part-of-speech tag for each word in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 41

def pos_tagging_annotate(text)
  preset_nlp(:posTagging, text)
end

#sentence_detection_annotate(text) ⇒ `Hash`

Detects the boundaries of sentences in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 17

def sentence_detection_annotate(text)
  preset_nlp(:sentenceDetection, text)
end

#spelling_correction_annotate(text) ⇒ `Hash`

Tries to correct misprints and other spelling errors in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 49

def spelling_correction_annotate(text)
  preset_nlp(:spellingCorrection, text)
end

#subjectivity_detection_annotate(text) ⇒ `Hash`

Detects whether the given text is subjective or not.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 100

def subjectivity_detection_annotate(text)
  preset_nlp(:subjectivityDetection, text)
end

#syntax_detection(text) ⇒ `Hash`

Detects syntax relations in a text. Only works for Russian texts.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document

# File 'lib/ispras-api/texterra/nlp.rb', line 142

def syntax_detection(text)
  result = preset_nlp(:syntaxDetection, text)
  result[:annotations][:'syntax-relation'].each do |an|
    an[:value][:parent] = assign_text(an[:value][:parent], text) if an[:value] && an[:value][:parent]
  end
  result
end
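
The post-processing loop above can be exercised on a hand-built result. Everything in this sketch is an assumption made for illustration: `assign_text_sketch` is a hypothetical stand-in for `assign_text` (whose source is not shown here) that assumes annotations carry `:start` and `:end` character offsets, and the result layout is modelled only on the keys the loop accesses.

```ruby
# Hypothetical stand-in for assign_text: copies the annotation and adds
# the substring of the text covered by its assumed :start/:end offsets.
def assign_text_sketch(annotation, text)
  annotation.merge(text: text[annotation[:start]...annotation[:end]])
end

text = 'Мама мыла раму'

# Assumed result layout; the :type value is illustrative only.
result = {
  annotations: {
    'syntax-relation': [
      { start: 0, end: 4, value: { type: 'subj', parent: { start: 5, end: 9 } } },
      { start: 5, end: 9, value: nil } # no parent to resolve
    ]
  }
}

# Same guard as in syntax_detection: skip annotations without a parent.
result[:annotations][:'syntax-relation'].each do |an|
  next unless an[:value] && an[:value][:parent]

  an[:value][:parent] = assign_text_sketch(an[:value][:parent], text)
end

puts result[:annotations][:'syntax-relation'][0][:value][:parent][:text]
```

The guard matters because an annotation may have no `:value` or no `:parent`; without it, indexing into `nil` would raise `NoMethodError`.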

#term_detection_annotate(text) ⇒ `Hash`

Extracts non-overlapping terms within a given text; a term is a textual representation of some real-world concept.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 65

def term_detection_annotate(text)
  preset_nlp(:termDetection, text)
end

#tokenization_annotate(text) ⇒ `Hash`

Detects all tokens (minimal significant text parts) in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 25

def tokenization_annotate(text)
  preset_nlp(:tokenization, text)
end

#tweet_normalization(text) ⇒ `Hash`

Detects Twitter-specific entities (hashtags, user names, emoticons, URLs) as well as stop words, misspellings, spelling suggestions, and spelling corrections.

Parameters:

text (String) —

Text to process

Returns:

(Hash) —

Texterra document



# File 'lib/ispras-api/texterra/nlp.rb', line 134

def tweet_normalization(text)
  preset_nlp(:tweetNormalization, text)
end
