Class: WordNet::Lemma

Inherits:
Object
Defined in:
lib/wordnet/lemma.rb

Overview

Represents a single word in the WordNet lexicon, which can be used to look up a set of synsets.

Constant Summary

SPACE = ' '
@@cache = {}

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(lexicon_line, id) ⇒ Lemma

Create a lemma from a line in a lexicon file. You should not create Lemmas by hand; instead, use the WordNet::Lemma.find and WordNet::Lemma.find_all methods to find the Lemma for a word.



# File 'lib/wordnet/lemma.rb', line 27

def initialize(lexicon_line, id)
  @id = id
  line = lexicon_line.split(" ")

  @word = line.shift
  @pos = line.shift
  synset_count = line.shift.to_i
  @pointer_symbols = line.slice!(0, line.shift.to_i)
  line.shift # Throw away redundant sense_cnt
  @tagsense_count = line.shift.to_i
  @synset_offsets = line.slice!(0, synset_count).map(&:to_i)
end
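
The field order the constructor consumes can be easier to follow on a concrete line. The following is a minimal sketch, not part of the library: `parse_lexicon_line` is a hypothetical helper that repeats the same steps outside the class, and `SAMPLE_LINE` is a made-up line whose values are illustrative rather than real WordNet data.

```ruby
# Hypothetical helper that mirrors the parsing steps in Lemma#initialize.
# It returns a Hash instead of setting instance variables.
def parse_lexicon_line(lexicon_line)
  fields = lexicon_line.split(" ")

  word = fields.shift                 # the lemma text
  pos = fields.shift                  # part-of-speech code, e.g. "n"
  synset_count = fields.shift.to_i    # how many offsets close the line
  pointer_count = fields.shift.to_i   # how many pointer symbols follow
  pointer_symbols = fields.slice!(0, pointer_count)
  fields.shift                        # sense count, discarded as redundant
  tagsense_count = fields.shift.to_i
  synset_offsets = fields.slice!(0, synset_count).map(&:to_i)

  {
    word: word,
    pos: pos,
    pointer_symbols: pointer_symbols,
    tagsense_count: tagsense_count,
    synset_offsets: synset_offsets
  }
end

# Made-up sample line (assumption: values are illustrative only).
SAMPLE_LINE = "fruit n 3 2 @ ~ 3 1 00100001 00100002 00100003"

parsed = parse_lexicon_line(SAMPLE_LINE)
puts parsed[:pointer_symbols].inspect # => ["@", "~"]
puts parsed[:synset_offsets].inspect  # => [100001, 100002, 100003]
```

Note that `to_i` drops the zero padding on the offsets, so they are held as plain integers.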

Instance Attribute Details

#id ⇒ Object

A unique integer id that references this lemma. Used internally within WordNet’s database.



# File 'lib/wordnet/lemma.rb', line 19

def id
  @id
end

#pointer_symbols ⇒ Object

An array of valid pointer symbols for this lemma. The list of all valid pointer symbols is defined in pointers.rb.



# File 'lib/wordnet/lemma.rb', line 23

def pointer_symbols
  @pointer_symbols
end

#pos ⇒ Object

The part of speech of this lemma. One of ‘n’ (noun), ‘v’ (verb), ‘a’ (adjective), or ‘r’ (adverb).



# File 'lib/wordnet/lemma.rb', line 10

def pos
  @pos
end

#synset_offsets ⇒ Object

An array of offsets, in bytes, at which the synsets for this lemma are stored in WordNet’s internal database.



# File 'lib/wordnet/lemma.rb', line 16

def synset_offsets
  @synset_offsets
end

#tagsense_count ⇒ Object

The number of senses of this lemma that are ranked according to their frequency of occurrence in semantic concordance texts. A tagsense_count of 0 indicates that none of the lemma’s senses has been semantically tagged.



# File 'lib/wordnet/lemma.rb', line 13

def tagsense_count
  @tagsense_count
end

#word ⇒ Object

The word this lemma represents.



# File 'lib/wordnet/lemma.rb', line 7

def word
  @word
end

Class Method Details

.find(word, pos) ⇒ Object

Find the lemma for a given word and part of speech (pos). Returns nil if no matching lemma exists.



# File 'lib/wordnet/lemma.rb', line 62

def find(word, pos)
  cache = @@cache[pos] ||= build_cache(pos)
  if found = cache[word]
    Lemma.new(*found)
  end
end
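
The `||=` in `find` builds the lookup table for a part of speech only on the first request and reuses it afterwards. The sketch below illustrates that pattern in isolation. Everything in it is hypothetical: `ToyLemmaIndex`, `TOY_LEXICON`, and the made-up lines stand in for the real lexicon files, and the shape of each cache entry (a `[line, id]` pair with a 1-based id) is an assumption inferred from the `Lemma.new(*found)` call, not the library's confirmed `build_cache` behavior.

```ruby
# Made-up lexicon lines keyed by part of speech (illustrative data only).
TOY_LEXICON = {
  noun: ["fall n 1 0 1 0 00000010", "fruit n 1 0 1 0 00000020"],
  verb: ["fall v 1 0 1 0 00000030"]
}.freeze

# Hypothetical stand-in for the class-level cache used by Lemma.find.
class ToyLemmaIndex
  @cache = {}
  @build_counts = Hash.new(0)

  class << self
    # Exposes how often each per-pos table was built, for demonstration.
    attr_reader :build_counts

    # Returns the cached [line, id] pair for the word, or nil if absent.
    def find(word, pos)
      cache = @cache[pos] ||= build_cache(pos)
      cache[word]
    end

    private

    # Builds a word => [line, id] table for one part of speech.
    # Assumption: the id is the 1-based position of the line.
    def build_cache(pos)
      @build_counts[pos] += 1
      table = {}
      TOY_LEXICON.fetch(pos, []).each_with_index do |line, index|
        table[line.split(" ").first] = [line, index + 1]
      end
      table
    end
  end
end

ToyLemmaIndex.find("fall", :verb)   # builds the :verb table
ToyLemmaIndex.find("fall", :verb)   # served from the cache
puts ToyLemmaIndex.build_counts[:verb]            # => 1
puts ToyLemmaIndex.find("missing", :verb).inspect # => nil
```

Because the table is keyed by part of speech, looking up a word under one pos never pays for parsing the others.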

.find_all(word) ⇒ Object

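Find all lemmas for this word across all known parts of speech (noun, verb, adjective, and adverb). Returns an empty array if the word is not found under any of them.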



# File 'lib/wordnet/lemma.rb', line 55

def find_all(word)
  [:noun, :verb, :adj, :adv].flat_map do |pos|
    find(word, pos) || []
  end
end
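
The `find(word, pos) || []` idiom works because `flat_map` splices arrays into the result: a miss contributes an empty array and so disappears, while a hit is kept as a single element. A minimal sketch of that behavior, using hypothetical `toy_find` and `toy_find_all` helpers and a made-up lookup table in place of the real lexicon:

```ruby
# Made-up lookup table: pos => { word => result } (illustrative data only).
TOY_RESULTS = {
  noun: { "fall" => "fall,n" },
  verb: { "fall" => "fall,v" },
  adj: {},
  adv: {}
}.freeze

# Hypothetical single-pos lookup; returns nil on a miss, like Lemma.find.
def toy_find(word, pos)
  TOY_RESULTS.fetch(pos)[word]
end

# Same shape as Lemma.find_all: misses become [] and vanish in flat_map.
def toy_find_all(word)
  [:noun, :verb, :adj, :adv].flat_map do |pos|
    toy_find(word, pos) || []
  end
end

puts toy_find_all("fall").inspect    # => ["fall,n", "fall,v"]
puts toy_find_all("unknown").inspect # => []
```

With a plain `map` the result would instead contain one slot per part of speech, and the caller would have to drop the misses by hand.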

Instance Method Details

#synsets ⇒ Object

Return a list of synsets for this Lemma. Each synset represents a different sense, or meaning, of the word.



# File 'lib/wordnet/lemma.rb', line 41

def synsets
  @synset_offsets.map { |offset| Synset.new(@pos, offset) }
end

#to_s ⇒ Object

Returns a compact string representation of this lemma, e.g. “fall,v” for the verb form of the word “fall”.



# File 'lib/wordnet/lemma.rb', line 47

def to_s
  [@word, @pos].join(",")
end