Class: WordNet::Synset

Inherits:
Object
  • Object
Includes:
Constants
Defined in:
lib/wordnet/synset.rb

Overview

WordNet synonym-set object class

Instances of this class encapsulate the data for a synonym set (‘synset’) in a WordNet lexical database. A synonym set is a set of words that are interchangeable in some context.

You can either fetch a synset from a connected Lexicon:

lexicon = WordNet::Lexicon.new( 'postgres://localhost/wordnet31' )
ss = lexicon[ :first, 'time' ]
# => #<WordNet::Synset:0x7ffbf2643bb0 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

or if you’ve already created a Lexicon, use its connection indirectly to look up a Synset by its ID:

ss = WordNet::Synset[ 115265518 ]
# => #<WordNet::Synset:0x7ffbf257e928 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

You can fetch a list of the lemmas (base forms) of the words included in the synset:

ss.words.map( &:lemma )
# => ["commencement", "first", "get-go", "offset", "outset", "start",
#     "starting time", "beginning", "kickoff", "showtime"]

But the primary reason for a synset is its lexical and semantic links to other words and synsets. For instance, its hypernym is the equivalent of its superclass: it’s the class of things of which the receiving synset is a member.

ss.hypernyms
# => [#<WordNet::Synset:0x7ffbf25c76c8 {115180528} 'point, point in
#        time' (noun): [noun.time] an instant of time>]

The synset’s hyponyms, on the other hand, are kind of like its subclasses:

ss.hyponyms
# => [#<WordNet::Synset:0x7ffbf25d83b0 {115142167} 'birth' (noun):
#       [noun.time] the time when something begins (especially life)>,
#     #<WordNet::Synset:0x7ffbf25d8298 {115268993} 'threshold' (noun):
#       [noun.time] the starting point for a new state or experience>,
#     #<WordNet::Synset:0x7ffbf25d8180 {115143012} 'incipiency,
#       incipience' (noun): [noun.time] beginning to exist or to be
#       apparent>,
#     #<WordNet::Synset:0x7ffbf25d8068 {115266164} 'starting point,
#       terminus a quo' (noun): [noun.time] earliest limiting point>]

Traversal

Synset also provides a few ‘traversal’ methods for recursively searching a Synset’s semantic links:

# Recursively search for more-general terms for the synset, and print out
# each one with indentation according to how distantly it's related.
lexicon[ :fencing, 'sword' ].
    traverse(:hypernyms).with_depth.
    each {|ss, depth| puts "%s%s [%d]" % ['  ' * (depth-1), ss.words.first, ss.synsetid] }
# (outputs:)
play [100041468]
  action [100037396]
    act [100030358]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
combat [101170962]
  battle [100958896]
    group action [101080366]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
      act [100030358]
        event [100029378]
          psychological feature [100023100]
            abstract entity [100002137]
              entity [100001740]

See the Traversal Methods section for more details.

Low-Level API

This library is implemented using Sequel::Model, an ORM layer on top of the excellent Sequel database toolkit. This means that in addition to the high-level methods above, you can also make use of a database-oriented API if you need to do something not provided by a high-level method.

In order to make use of this API, you’ll need to be familiar with Sequel, especially Datasets and Model Associations. Most of Ruby-WordNet’s functionality is implemented in terms of one or both of these.

Datasets

The main dataset is available from WordNet::Synset.dataset:

WordNet::Synset.dataset
# => #<Sequel::SQLite::Dataset: "SELECT * FROM `synsets`">

In addition to this, Synset also defines a few other canned datasets. To facilitate searching by part of speech on the Synset class:

  • WordNet::Synset.nouns

  • WordNet::Synset.verbs

  • WordNet::Synset.adjectives

  • WordNet::Synset.adverbs

  • WordNet::Synset.adjective_satellites

or by the semantic links for a particular Synset:

  • WordNet::Synset#also_see_dataset

  • WordNet::Synset#attributes_dataset

  • WordNet::Synset#causes_dataset

  • WordNet::Synset#domain_categories_dataset

  • WordNet::Synset#domain_member_categories_dataset

  • WordNet::Synset#domain_member_regions_dataset

  • WordNet::Synset#domain_member_usages_dataset

  • WordNet::Synset#domain_regions_dataset

  • WordNet::Synset#domain_usages_dataset

  • WordNet::Synset#entailments_dataset

  • WordNet::Synset#hypernyms_dataset

  • WordNet::Synset#hyponyms_dataset

  • WordNet::Synset#instance_hypernyms_dataset

  • WordNet::Synset#instance_hyponyms_dataset

  • WordNet::Synset#member_holonyms_dataset

  • WordNet::Synset#member_meronyms_dataset

  • WordNet::Synset#part_holonyms_dataset

  • WordNet::Synset#part_meronyms_dataset

  • WordNet::Synset#semlinks_dataset

  • WordNet::Synset#semlinks_to_dataset

  • WordNet::Synset#senses_dataset

  • WordNet::Synset#similar_words_dataset

  • WordNet::Synset#substance_holonyms_dataset

  • WordNet::Synset#substance_meronyms_dataset

  • WordNet::Synset#sumo_terms_dataset

  • WordNet::Synset#verb_groups_dataset

  • WordNet::Synset#words_dataset

Constant Summary

SEMANTIC_TYPEKEYS =

Semantic link type keys; maps what the API calls them to what they are in the DB.

Hash.new {|h,type| h[type] = type.to_s.chomp('s').to_sym }
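
The default block can be tried in isolation. The following is a minimal sketch that uses a local `typekeys` hash (a name chosen here for illustration) instead of the library constant; it shows the cached lookup and the limit of stripping a single trailing "s":

```ruby
# Stand-in for SEMANTIC_TYPEKEYS; `typekeys` is an illustrative local name.
typekeys = Hash.new { |h, type| h[type] = type.to_s.chomp('s').to_sym }

typekeys[:hypernyms]          # => :hypernym
typekeys[:also_see]           # => :also_see (no trailing "s" to strip)
typekeys[:domain_categories]  # => :domain_categorie (irregular plural)

# The default block stores each computed key, so later lookups are plain reads.
typekeys.keys                 # => [:hypernyms, :also_see, :domain_categories]
```

Because `chomp('s')` only removes one trailing character, irregular plurals such as `domain_categories` would need explicit entries to resolve to a singular key; whether the library adds such entries is not visible in this listing.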

Constants included from Constants

Constants::DEFAULT_DB_OPTIONS, Constants::DELIM, Constants::DELIM_RE, Constants::DOMAIN_TYPES, Constants::DomainSymbols, Constants::HOLONYM_SYMBOLS, Constants::HOLONYM_TYPES, Constants::HYPERNYM_SYMBOLS, Constants::HYPERNYM_TYPES, Constants::HYPONYM_SYMBOLS, Constants::HYPONYM_TYPES, Constants::LEXFILES, Constants::MEMBER_SYMBOLS, Constants::MEMBER_TYPES, Constants::MERONYM_SYMBOLS, Constants::MERONYM_TYPES, Constants::POINTER_SUBTYPES, Constants::POINTER_SYMBOLS, Constants::POINTER_TYPES, Constants::SUB_DELIM, Constants::SUB_DELIM_RE, Constants::SYNTACTIC_CATEGORIES, Constants::SYNTACTIC_SYMBOLS, Constants::VERB_SENTS

Class Attribute Details

.semantic_link_methods ⇒ Object

Returns the value of attribute semantic_link_methods.

# File 'lib/wordnet/synset.rb', line 371

def semantic_link_methods
  @semantic_link_methods
end

Class Method Details

.db=(newdb) ⇒ Object

Overridden to reset any lookup tables that may have been loaded from the previous database.

# File 'lib/wordnet/synset.rb', line 298

def self::db=( newdb )
	self.reset_lookup_tables
	super
end

.lexdomain_table ⇒ Object

Return the table of lexical domains, keyed by id.

# File 'lib/wordnet/synset.rb', line 316

def self::lexdomain_table
	@lexdomain_table ||= self.db[:lexdomains].to_hash( :lexdomainid )
end

.lexdomains ⇒ Object

Lexical domains, keyed by name as a String (e.g., “verb.cognition”).

# File 'lib/wordnet/synset.rb', line 322

def self::lexdomains
	@lexdomains ||= self.lexdomain_table.inject({}) do |hash,(id,domain)|
		hash[ domain[:lexdomainname] ] = domain
		hash
	end
end

.linktype_table ⇒ Object

Return the table of link types, keyed by linkid.

# File 'lib/wordnet/synset.rb', line 331

def self::linktype_table
	@linktype_table ||= self.db[:linktypes].inject({}) do |hash,row|
		hash[ row[:linkid] ] = {
			id: row[:linkid],
			typename: row[:link],
			type: row[:link].gsub( /\s+/, '_' ).to_sym,
			recurses: row[:recurses] && row[:recurses] != 0,
		}
		hash
	end
end
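
To see the shape of the resulting table without a database, here is a small sketch that runs the same transformation over hand-written rows. The `rows` array and its ids are hypothetical stand-ins for the contents of the `linktypes` table, and the second step mirrors the re-keying by type name that `.linktypes` performs.

```ruby
# Hypothetical rows standing in for the result of db[:linktypes].
rows = [
  { linkid: 1,  link: 'hypernym', recurses: 1 },
  { linkid: 2,  link: 'hyponym',  recurses: 1 },
  { linkid: 40, link: 'also see', recurses: 0 },
]

# Build the id-keyed table, normalising the name into a Symbol.
linktype_table = rows.each_with_object({}) do |row, hash|
  hash[row[:linkid]] = {
    id: row[:linkid],
    typename: row[:link],
    type: row[:link].gsub(/\s+/, '_').to_sym,
    recurses: row[:recurses] && row[:recurses] != 0,
  }
end

# Re-key the same entries by their symbolic type.
linktypes = linktype_table.values.to_h { |link| [link[:type], link] }

linktypes[:also_see][:typename]  # => "also see"
linktypes[:hypernym][:recurses]  # => true
linktypes[:also_see][:recurses]  # => false
```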

.linktypes ⇒ Object

Return the table of link types, keyed by name.

# File 'lib/wordnet/synset.rb', line 345

def self::linktypes
	@linktypes ||= self.linktype_table.inject({}) do |hash,(id,link)|
		hash[ link[:type] ] = link
		hash
	end
end

.postype_table ⇒ Object

Return the table of part-of-speech types, keyed by letter identifier.

# File 'lib/wordnet/synset.rb', line 354

def self::postype_table
	@postype_table ||= self.db[:postypes].inject({}) do |hash, row|
		hash[ row[:pos].untaint.to_sym ] = row[:posname]
		hash
	end
end

.postypes ⇒ Object

Return the table of part-of-speech names to letter identifiers (both Symbols).

# File 'lib/wordnet/synset.rb', line 363

def self::postypes
	@postypes ||= self.postype_table.invert
end

.reset_lookup_tables ⇒ Object

Unload all of the cached lookup tables that have been loaded.

# File 'lib/wordnet/synset.rb', line 305

def self::reset_lookup_tables
	@lexdomain_table = nil
	@lexdomains      = nil
	@linktype_table  = nil
	@linktypes       = nil
	@postype_table   = nil
	@postypes        = nil
end
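
The lookup tables above are cached with `||=`, so setting the instance variables back to nil forces a reload on the next call. A minimal sketch of that pattern follows; the `LookupCache` class and its loader block are illustrative and not part of the library.

```ruby
# Toy class showing a ||= cache and its reset.
class LookupCache
  attr_reader :loads

  def initialize(&loader)
    @loader = loader
    @loads  = 0
    @table  = nil
  end

  # Load the table once, then serve the cached copy.
  def table
    @table ||= begin
      @loads += 1
      @loader.call
    end
  end

  # Drop the cached copy so the next #table call reloads it.
  def reset
    @table = nil
  end
end

cache = LookupCache.new { { n: 'noun', v: 'verb' } }
cache.table
cache.table
cache.loads  # => 1
cache.reset
cache.table
cache.loads  # => 2
```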

.semantic_link(type) ⇒ Object

Generate methods that will return Synsets related by the given semantic pointer type.

# File 'lib/wordnet/synset.rb', line 377

def self::semantic_link( type )
	self.log.debug "Generating a %p method" % [ type ]

	ds_method_body = Proc.new do
		self.semanticlink_dataset( type )
	end
	define_method( "#{type}_dataset", &ds_method_body )

	ss_method_body = Proc.new do
		self.semanticlink_dataset( type ).all
	end
	define_method( type, &ss_method_body )

	self.semantic_link_methods << type.to_sym
end
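
The same macro pattern can be sketched without a database. In the toy class below, `ToySynset` and its `LINKS` table are hypothetical, and a lazy enumerator stands in for the Sequel dataset; the point is only how one declaration generates both a `*_dataset` method and an eager reader.

```ruby
class ToySynset
  # Hypothetical link data standing in for database rows.
  LINKS = {
    hypernyms: %w[point],
    hyponyms:  %w[birth threshold],
  }.freeze

  class << self
    attr_reader :semantic_link_methods
  end
  @semantic_link_methods = []

  # Define "<type>_dataset" (lazy) and "<type>" (materialised) readers.
  def self.semantic_link(type)
    define_method("#{type}_dataset") { LINKS.fetch(type).lazy }
    define_method(type) { LINKS.fetch(type).lazy.to_a }
    semantic_link_methods << type.to_sym
  end

  semantic_link :hypernyms
  semantic_link :hyponyms
end

ss = ToySynset.new
ss.hyponyms                      # => ["birth", "threshold"]
ss.hypernyms_dataset.first       # => "point"
ToySynset.semantic_link_methods  # => [:hypernyms, :hyponyms]
```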

Instance Method Details

#also_see ⇒ Object

“See Also” synsets

# File 'lib/wordnet/synset.rb', line 468

semantic_link :also_see

#attributes ⇒ Object

Attribute synsets

# File 'lib/wordnet/synset.rb', line 472

semantic_link :attributes

#causes ⇒ Object

Cause synsets

# File 'lib/wordnet/synset.rb', line 476

semantic_link :causes

#domain_categories ⇒ Object

Domain category synsets

# File 'lib/wordnet/synset.rb', line 480

semantic_link :domain_categories

#domain_member_categories ⇒ Object

Domain member category synsets

# File 'lib/wordnet/synset.rb', line 484

semantic_link :domain_member_categories

#domain_member_regions ⇒ Object

Domain member region synsets

# File 'lib/wordnet/synset.rb', line 488

semantic_link :domain_member_regions

#domain_member_usages ⇒ Object

Domain member usage synsets

# File 'lib/wordnet/synset.rb', line 492

semantic_link :domain_member_usages

#domain_regions ⇒ Object

Domain region synsets

# File 'lib/wordnet/synset.rb', line 496

semantic_link :domain_regions

#domain_usages ⇒ Object

Domain usage synsets

# File 'lib/wordnet/synset.rb', line 500

semantic_link :domain_usages

#entailments ⇒ Object

Verb entailment synsets

# File 'lib/wordnet/synset.rb', line 504

semantic_link :entailments

#hypernyms ⇒ Object

Hypernym synsets

# File 'lib/wordnet/synset.rb', line 508

semantic_link :hypernyms

#hyponyms ⇒ Object

Hyponym synsets

# File 'lib/wordnet/synset.rb', line 512

semantic_link :hyponyms

#inspect ⇒ Object

Return a human-readable representation of the object, suitable for debugging.

# File 'lib/wordnet/synset.rb', line 690

def inspect
	return "#<%p:%0#x {%d} '%s' (%s): [%s] %s>" % [
		self.class,
		self.object_id * 2,
		self.synsetid,
		self.wordlist.join(', '),
		self.part_of_speech,
		self.lexical_domain,
		self.definition,
	]
end

#instance_hypernyms ⇒ Object

Instance hypernym synsets

# File 'lib/wordnet/synset.rb', line 516

semantic_link :instance_hypernyms

#instance_hyponyms ⇒ Object

Instance hyponym synsets

# File 'lib/wordnet/synset.rb', line 520

semantic_link :instance_hyponyms

#lexical_domain ⇒ Object

Return the name of the lexical domain the synset belongs to; this also corresponds to the lexicographer’s file the synset was originally loaded from.

# File 'lib/wordnet/synset.rb', line 448

def lexical_domain
	return self.class.lexdomain_table[ self.lexdomainid ][ :lexdomainname ]
end

#member_holonyms ⇒ Object

Member holonym synsets

# File 'lib/wordnet/synset.rb', line 524

semantic_link :member_holonyms

#member_meronyms ⇒ Object

Member meronym synsets

# File 'lib/wordnet/synset.rb', line 528

semantic_link :member_meronyms

#part_holonyms ⇒ Object

Part holonym synsets

# File 'lib/wordnet/synset.rb', line 532

semantic_link :part_holonyms

#part_meronyms ⇒ Object

Part meronym synsets

# File 'lib/wordnet/synset.rb', line 536

semantic_link :part_meronyms

#part_of_speech ⇒ Object

Return the name of the Synset’s part of speech (#pos).

# File 'lib/wordnet/synset.rb', line 418

def part_of_speech
	return self.class.postype_table[ self.pos.to_sym ]
end

#samples ⇒ Object

Return any sample sentences.

# File 'lib/wordnet/synset.rb', line 454

def samples
	return self.db[:samples].
		filter( synsetid: self.synsetid ).
		order( :sampleid ).
		map( :sample )
end

#search(type, synset) ⇒ Object

Search the receiver’s semantic links of the given type for the specified synset. Returns the depth at which it was found, or nil if it wasn’t found.

# File 'lib/wordnet/synset.rb', line 673

def search( type, synset )
	found, depth = self.traverse( type ).with_depth.find {|ss,depth| synset == ss }
	return depth
end
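
The nil result for a miss falls out of Ruby’s destructuring assignment: `find` returns nil when nothing matches, and assigning nil to two variables sets both to nil. A short sketch, with a hand-written `pairs` array standing in for the (synset, depth) pairs that a depth-aware traversal would yield:

```ruby
# Illustrative (name, depth) pairs; Symbols stand in for synsets.
pairs = [[:play, 1], [:action, 2], [:act, 3]]

found, depth = pairs.find { |ss, _depth| ss == :act }
hit_depth = depth    # => 3

found, depth = pairs.find { |ss, _depth| ss == :sword }
miss_depth = depth   # => nil
```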

#semanticlink_dataset(type) ⇒ Object

Return a Sequel::Dataset for synsets related to the receiver via the semantic link of the specified type.

# File 'lib/wordnet/synset.rb', line 400

def semanticlink_dataset( type )
	typekey  = SEMANTIC_TYPEKEYS[ type ]
	linkinfo = self.class.linktypes[ typekey ] or
		raise ArgumentError, "no such link type %p" % [ typekey ]
	ssids    = self.semlinks_dataset.filter( linkid: linkinfo[:id] ).select( :synset2id )

	return self.class.filter( synsetid: ssids )
end

#semanticlink_enum(linktype) ⇒ Object

Return an Enumerator that will iterate over the Synsets related to the receiver via the semantic links of the specified linktype.

# File 'lib/wordnet/synset.rb', line 412

def semanticlink_enum( linktype )
	return self.semanticlink_dataset( linktype ).to_enum
end

#semlinks ⇒ Object

The WordNet::SemanticLinks indicating a relationship with other WordNet::Synsets

# File 'lib/wordnet/synset.rb', line 197

one_to_many :semlinks,
class: 'WordNet::SemanticLink',
key: :synset1id,
primary_key: :synsetid,
eager: :target

#semlinks_to ⇒ Object

The WordNet::SemanticLinks pointing to this Synset

# File 'lib/wordnet/synset.rb', line 206

many_to_one :semlinks_to,
class: 'WordNet::SemanticLink',
key: :synsetid,
primary_key: :synset2id

#senses ⇒ Object

The WordNet::Senses associated with the receiver

# File 'lib/wordnet/synset.rb', line 189

one_to_many :senses,
key: :synsetid,
primary_key: :synsetid

#similar_words ⇒ Object

Similar word synsets

# File 'lib/wordnet/synset.rb', line 540

semantic_link :similar_words

#substance_holonyms ⇒ Object

Substance holonym synsets

# File 'lib/wordnet/synset.rb', line 544

semantic_link :substance_holonyms

#substance_meronyms ⇒ Object

Substance meronym synsets

# File 'lib/wordnet/synset.rb', line 548

semantic_link :substance_meronyms

#sumo_terms ⇒ Object

Terms from the Suggested Upper Merged Ontology

# File 'lib/wordnet/synset.rb', line 214

many_to_many :sumo_terms,
join_table: :sumomaps,
left_key: :synsetid,
right_key: :sumoid

#to_s ⇒ Object

Stringify the synset.

# File 'lib/wordnet/synset.rb', line 424

def to_s

	# Make a sorted list of the semantic link types from this synset
	semlink_list = self.semlinks_dataset.
		group_and_count( :linkid ).
		to_hash( :linkid, :count ).
		collect do |linkid, count|
			'%s: %d' % [ self.class.linktype_table[linkid][:typename], count ]
		end.
		sort.
		join( ', ' )

	return "%s (%s): [%s] %s (%s)" % [
		self.words.map( &:to_s ).join(', '),
		self.part_of_speech,
		self.lexical_domain,
		self.definition,
		semlink_list
	]
end
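
The link summary at the end of the string is a count per link type, sorted by name. The sketch below reproduces that formatting in plain Ruby, using `Array#tally` in place of the grouped database query; the `link_names` array is hypothetical data for one synset.

```ruby
# Hypothetical link type names for one synset's outgoing links.
link_names = ['hyponym', 'hypernym', 'hyponym', 'hyponym', 'hyponym']

# Count each type, format as "name: count", then sort and join.
semlink_list = link_names.tally.
  map { |name, count| '%s: %d' % [name, count] }.
  sort.
  join(', ')

semlink_list  # => "hypernym: 1, hyponym: 4"
```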

#traverse(type, &block) ⇒ Object

With a block, yield a WordNet::Synset related to the receiver via a link of the specified type, recursing depth first into each of its links if the link type is recursive. To exit from the traversal at any depth, throw :stop_traversal.

If no block is given, return an Enumerator that will do the same thing instead.

# Print all the parts of a boot
puts lexicon[:boot].traverse( :part_meronyms ).to_a

You can also traverse with an additional argument that indicates the depth of recursion by calling #with_depth on the Enumerator:

lexicon[:fencing].traverse( :hypernyms ).with_depth.each {|ss,d| puts "%02d: %s" % [d,ss] }
# (outputs:)

01: play, swordplay (noun): [noun.act] the act using a sword (or other weapon) vigorously
  and skillfully (hypernym: 1, hyponym: 1)
02: action (noun): [noun.act] something done (usually as opposed to something said)
  (hypernym: 1, hyponym: 33)
03: act, deed, human action, human activity (noun): [noun.tops] something that people do
  or cause to happen (hypernym: 1, hyponym: 40)
...


# File 'lib/wordnet/synset.rb', line 626

def traverse( type, &block )
	enum = Enumerator.new do |yielder|
		traversals = [ self.semanticlink_enum(type) ]
		syn        = nil
		typekey    = SEMANTIC_TYPEKEYS[ type ]
		recurses   = self.class.linktypes[ typekey ][:recurses]

		self.log.debug "Traversing %s semlinks%s" % [ type, recurses ? " (recursive)" : ''  ]

		catch( :stop_traversal ) do
			until traversals.empty?
				begin
					self.log.debug "  %d traversal/s left" % [ traversals.length ]
					syn = traversals.last.next

					if enum.with_depth?
						yielder.yield( syn, traversals.length )
					else
						yielder.yield( syn )
					end

					traversals << syn.semanticlink_enum( type ) if recurses
				rescue StopIteration
					traversals.pop
				end
			end
		end
	end

	def enum.with_depth?
		@with_depth = false if !defined?( @with_depth )
		return @with_depth
	end

	def enum.with_depth
		@with_depth = true
		self
	end

	return enum.each( &block ) if block
	return enum
end
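
The core of this method is a depth-first walk driven by a stack of external enumerators, where the stack height doubles as the depth. The following self-contained sketch applies the same idea to a plain Hash; `GRAPH`, its node names, and `walk_with_depth` are illustrative assumptions, and the sketch always recurses and always yields the depth.

```ruby
# Toy link graph; the node names and GRAPH itself are illustrative only.
GRAPH = {
  fencing: [:play, :combat],
  play:    [:action],
  action:  [],
  combat:  [:battle],
  battle:  [],
}.freeze

# Depth-first walk using a stack of external enumerators, one per level.
def walk_with_depth(graph, start)
  Enumerator.new do |yielder|
    stack = [graph.fetch(start).each]
    until stack.empty?
      begin
        node = stack.last.next            # raises StopIteration when exhausted
        yielder.yield(node, stack.length) # stack height is the current depth
        stack << graph.fetch(node).each   # descend into this node's links
      rescue StopIteration
        stack.pop                         # this level is done; back up one
      end
    end
  end
end

walk_with_depth(GRAPH, :fencing).each do |node, depth|
  puts "#{'  ' * (depth - 1)}#{node}"
end
# (outputs:)
# play
#   action
# combat
#   battle
```

Keeping one enumerator per level means the walk can resume a partly consumed sibling list after finishing a subtree, without recursion in the Ruby call stack.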

#verb_groups ⇒ Object

Verb group synsets

# File 'lib/wordnet/synset.rb', line 552

semantic_link :verb_groups

#wordlist ⇒ Object

Return the Synset’s Words as an Array of Strings.

# File 'lib/wordnet/synset.rb', line 684

def wordlist
	return self.words.map( &:to_s )
end

#words ⇒ Object

The WordNet::Words associated with the receiver

# File 'lib/wordnet/synset.rb', line 181

many_to_many :words,
join_table: :senses,
left_key: :synsetid,
right_key: :wordid

#|(othersyn) ⇒ Object

Union: Return the least general synset that the receiver and othersyn have in common as a hypernym, or nil if they share none.

# File 'lib/wordnet/synset.rb', line 584

def |( othersyn )

	# Find all of this syn's hypernyms
	hypersyns = self.traverse( :hypernyms ).to_a
	commonsyn = nil

	# Now traverse the other synset's hypernyms looking for one of our
	# own hypernyms.
	othersyn.traverse( :hypernyms ) do |syn|
		if hypersyns.include?( syn )
			commonsyn = syn
			throw :stop_traversal
		end
	end

	return commonsyn
end
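
The early exit relies on `catch`/`throw`: the traversal stops at the first shared ancestor it meets. A minimal sketch of the same approach over a plain Hash follows; `PARENTS`, `each_ancestor`, and `common_ancestor` are hypothetical names used only for this illustration.

```ruby
# Toy hypernym graph: each node maps to its direct parents.
PARENTS = {
  dog:    [:canine],
  canine: [:mammal],
  cat:    [:feline],
  feline: [:mammal],
  mammal: [:animal],
  animal: [],
  rock:   [],
}.freeze

# Yield every ancestor of node, depth first.
def each_ancestor(parents, node, &block)
  parents.fetch(node).each do |parent|
    block.call(parent)
    each_ancestor(parents, parent, &block)
  end
end

# Return the first ancestor of b that is also an ancestor of a, or nil.
def common_ancestor(parents, a, b)
  mine = []
  each_ancestor(parents, a) { |n| mine << n }

  catch(:stop_traversal) do
    each_ancestor(parents, b) do |n|
      throw :stop_traversal, n if mine.include?(n)
    end
    nil # reached only when nothing was thrown
  end
end

common_ancestor(PARENTS, :dog, :cat)   # => :mammal
common_ancestor(PARENTS, :dog, :rock)  # => nil
```

As in the method above, neither starting node counts as its own ancestor, so only strictly more general nodes can be returned.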