Class: WordNet::Lemma
- Inherits:
- Object
- Defined in:
- lib/rwordnet/lemma.rb
Overview
Represents a single word in the WordNet lexicon, which can be used to look up a set of synsets.
Constant Summary
- SPACE =
' '
- POS_SHORTHAND =
{:v => :verb, :n => :noun, :a => :adj, :r => :adv}
- @@cache =
{}
Instance Attribute Summary
-
#id ⇒ Object
A unique integer id that references this lemma.
-
#pointer_symbols ⇒ Object
An array of valid pointer symbols for this lemma.
-
#pos ⇒ Object
The part of speech (noun, verb, adjective, or adverb) of this lemma.
-
#synset_offsets ⇒ Object
The offsets, in bytes, at which the synsets contained in this lemma are stored in WordNet’s internal database.
-
#tagsense_count ⇒ Object
The number of times the sense is tagged in various semantic concordance texts.
-
#word ⇒ Object
The word this lemma represents.
Class Method Summary
-
.find(word, pos) ⇒ Object
Find a lemma for a given word and pos.
-
.find_all(word) ⇒ Object
Find all lemmas for this word across all known parts of speech.
Instance Method Summary
-
#initialize(lexicon_line, id) ⇒ Lemma
constructor
Create a lemma from a line in a lexicon file.
-
#synsets ⇒ Object
Return a list of synsets for this Lemma.
-
#to_s ⇒ Object
Returns a compact string representation of this lemma, e.g. “fall,v” for the verb form of the word “fall”.
Constructor Details
#initialize(lexicon_line, id) ⇒ Lemma
Create a lemma from a line in a lexicon file. You should not create Lemmas by hand; instead, use the WordNet::Lemma.find and WordNet::Lemma.find_all methods to find the Lemma for a word.
    # File 'lib/rwordnet/lemma.rb', line 28
    def initialize(lexicon_line, id)
      @id = id
      line = lexicon_line.split(" ")
      @word = line.shift
      @pos = line.shift
      synset_count = line.shift.to_i
      @pointer_symbols = line.slice!(0, line.shift.to_i)
      line.shift # Throw away redundant sense_cnt
      @tagsense_count = line.shift.to_i
      @synset_offsets = line.slice!(0, synset_count).map(&:to_i)
    end
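To make the field order concrete, here is a minimal standalone sketch of the same parsing steps. It assumes a line laid out as word, pos, synset count, pointer count, pointer symbols, sense count, tagsense count, and synset offsets. The helper name `parse_index_line` and the sample line are assumptions made up for illustration; they are not part of rwordnet and are not real WordNet data.

```ruby
# Hypothetical helper mirroring the steps in Lemma#initialize; not part of rwordnet.
def parse_index_line(lexicon_line)
  fields = lexicon_line.split(" ")
  word = fields.shift
  pos = fields.shift
  synset_count = fields.shift.to_i
  pointer_count = fields.shift.to_i
  pointer_symbols = fields.slice!(0, pointer_count)
  fields.shift # sense_cnt is redundant here, so it is discarded
  tagsense_count = fields.shift.to_i
  synset_offsets = fields.slice!(0, synset_count).map(&:to_i)
  { word: word, pos: pos, pointer_symbols: pointer_symbols,
    tagsense_count: tagsense_count, synset_offsets: synset_offsets }
end

# Made-up sample line (illustrative values only, not taken from WordNet).
SAMPLE_LINE = "example n 2 3 @ ~ + 2 1 00001234 00005678"
parsed = parse_index_line(SAMPLE_LINE)
puts parsed.inspect
```

Because the offsets are converted with `to_i`, any zero padding in the file is dropped in the resulting integers.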
Instance Attribute Details
#id ⇒ Object
A unique integer id that references this lemma. Used internally within WordNet’s database.
    # File 'lib/rwordnet/lemma.rb', line 20
    def id
      @id
    end
#pointer_symbols ⇒ Object
An array of valid pointer symbols for this lemma. The list of all valid pointer symbols is defined in pointers.rb.
    # File 'lib/rwordnet/lemma.rb', line 24
    def pointer_symbols
      @pointer_symbols
    end
#pos ⇒ Object
The part of speech (noun, verb, adjective, or adverb) of this lemma. One of ‘n’, ‘v’, ‘a’ (adjective), or ‘r’ (adverb).
    # File 'lib/rwordnet/lemma.rb', line 11
    def pos
      @pos
    end
#synset_offsets ⇒ Object
The offsets, in bytes, at which the synsets contained in this lemma are stored in WordNet’s internal database.
    # File 'lib/rwordnet/lemma.rb', line 17
    def synset_offsets
      @synset_offsets
    end
#tagsense_count ⇒ Object
The number of times the sense is tagged in various semantic concordance texts. A tagsense_count of 0 indicates that the sense has not been semantically tagged.
    # File 'lib/rwordnet/lemma.rb', line 14
    def tagsense_count
      @tagsense_count
    end
#word ⇒ Object
The word this lemma represents.
    # File 'lib/rwordnet/lemma.rb', line 8
    def word
      @word
    end
Class Method Details
.find(word, pos) ⇒ Object
Find a lemma for a given word and pos. Valid parts of speech are: ‘adj’, ‘adv’, ‘noun’, ‘verb’. Additionally, you can use the shorthand forms of each of these (‘a’, ‘r’, ‘n’, ‘v’). Returns nil if the word is not found for that part of speech.
    # File 'lib/rwordnet/lemma.rb', line 65
    def find(word, pos)
      # Map shorthand POS to full forms
      pos = POS_SHORTHAND[pos] || pos
      cache = @@cache[pos] ||= build_cache(pos)
      if found = cache[word]
        Lemma.new(*found)
      end
    end
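The lookup flow above (normalize the shorthand, build the per-POS cache at most once, then index by word) can be sketched without loading rwordnet. Everything except `POS_SHORTHAND` is an assumption here: `FindSketch`, `FAKE_INDEX`, the `builds` counter, and this `build_cache` body are hypothetical stand-ins, since the page does not show how the real `build_cache` reads the index files. Unlike `Lemma.find`, the sketch returns the raw cached entry rather than a `Lemma`.

```ruby
# Standalone sketch of the lookup flow in Lemma.find; it does not load rwordnet.
module FindSketch
  # Shorthand table as listed in the Constant Summary.
  POS_SHORTHAND = { v: :verb, n: :noun, a: :adj, r: :adv }.freeze

  # Hypothetical stand-in for the index files: pos => { word => [line, id] }.
  FAKE_INDEX = {
    noun: { "fall" => ["fall n 1 0 1 0 00000001", 1] },
    verb: { "fall" => ["fall v 1 0 1 0 00000002", 2] }
  }.freeze

  @cache = {}
  @builds = Hash.new(0) # counts how often each per-POS cache is built

  class << self
    attr_reader :builds

    # Hypothetical replacement for build_cache, which this page does not show.
    def build_cache(pos)
      @builds[pos] += 1
      FAKE_INDEX.fetch(pos, {})
    end

    def find(word, pos)
      pos = POS_SHORTHAND[pos] || pos          # :v becomes :verb; :verb is kept
      cache = @cache[pos] ||= build_cache(pos) # built at most once per part of speech
      cache[word]                              # nil when the word is unknown
    end
  end
end

puts FindSketch.find("fall", :v).inspect
puts FindSketch.find("fall", :verb).inspect
puts FindSketch.find("missing", :n).inspect
```

In this sketch, the shorthand and the full form resolve to the same cache key, so repeated lookups for one part of speech reuse the same hash.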
.find_all(word) ⇒ Object
Find all lemmas for this word across all known parts of speech.
    # File 'lib/rwordnet/lemma.rb', line 56
    def find_all(word)
      [:noun, :verb, :adj, :adv].flat_map do |pos|
        find(word, pos) || []
      end
    end
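The `find(word, pos) || []` idiom is what keeps misses out of the result: a nil lookup becomes an empty array, and `flat_map` flattens that away. A minimal sketch of the same pattern follows, where `sketch_find` and its tiny lookup table are hypothetical stand-ins for `Lemma.find` and return plain strings instead of Lemma objects.

```ruby
# Hypothetical stand-in for Lemma.find: a string for known pairs, nil otherwise.
def sketch_find(word, pos)
  known = { ["fall", :noun] => "fall,n", ["fall", :verb] => "fall,v" }
  known[[word, pos]]
end

# Same shape as Lemma.find_all, using the stand-in lookup above.
def sketch_find_all(word)
  [:noun, :verb, :adj, :adv].flat_map do |pos|
    sketch_find(word, pos) || [] # nil becomes [], which flat_map flattens away
  end
end

puts sketch_find_all("fall").inspect
puts sketch_find_all("zzz").inspect
```

With this pattern the result is always an array, and it is empty when no part of speech matches.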
Instance Method Details
#synsets ⇒ Object
Return a list of synsets for this Lemma. Each synset represents a different sense, or meaning, of the word.
    # File 'lib/rwordnet/lemma.rb', line 42
    def synsets
      @synset_offsets.map { |offset| Synset.new(@pos, offset) }
    end
#to_s ⇒ Object
Returns a compact string representation of this lemma, e.g. “fall,v” for the verb form of the word “fall”.
    # File 'lib/rwordnet/lemma.rb', line 48
    def to_s
      [@word, @pos].join(",")
    end