Class: WordNet::Synset

Inherits:
Object
Includes:
Constants
Defined in:
lib/wordnet/synset.rb

Overview

WordNet synonym-set object class

Instances of this class encapsulate the data for a synonym set (‘synset’) in a WordNet lexical database. A synonym set is a set of words that are interchangeable in some context.

You can either fetch the synset from a connected Lexicon:

lexicon = WordNet::Lexicon.new( 'postgres://localhost/wordnet31' )
ss = lexicon[ :first, 'time' ]
# => #<WordNet::Synset:0x7ffbf2643bb0 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

or, if you’ve already created a Lexicon, look up a Synset by its ID, which uses the Lexicon’s connection indirectly:

ss = WordNet::Synset[ 115265518 ]
# => #<WordNet::Synset:0x7ffbf257e928 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

You can fetch a list of the lemmas (base forms) of the words included in the synset:

ss.words.map( &:lemma )
# => ["commencement", "first", "get-go", "offset", "outset", "start",
#     "starting time", "beginning", "kickoff", "showtime"]

But the primary reason for a synset is its lexical and semantic links to other words and synsets. For instance, its hypernym is the equivalent of its superclass: it’s the class of things of which the receiving synset is a member.

ss.hypernyms
# => [#<WordNet::Synset:0x7ffbf25c76c8 {115180528} 'point, point in
#        time' (noun): [noun.time] an instant of time>]

The synset’s hyponyms, on the other hand, are kind of like its subclasses:

ss.hyponyms
# => [#<WordNet::Synset:0x7ffbf25d83b0 {115142167} 'birth' (noun):
#       [noun.time] the time when something begins (especially life)>,
#     #<WordNet::Synset:0x7ffbf25d8298 {115268993} 'threshold' (noun):
#       [noun.time] the starting point for a new state or experience>,
#     #<WordNet::Synset:0x7ffbf25d8180 {115143012} 'incipiency,
#       incipience' (noun): [noun.time] beginning to exist or to be
#       apparent>,
#     #<WordNet::Synset:0x7ffbf25d8068 {115266164} 'starting point,
#       terminus a quo' (noun): [noun.time] earliest limiting point>]

Traversal

Synset also provides a few ‘traversal’ methods that recursively search a Synset’s semantic links:

# Recursively search for more-general terms for the synset, and print out
# each one with indentation according to how distantly it's related.
lexicon[ :fencing, 'sword' ].
    traverse(:hypernyms).with_depth.
    each {|ss, depth| puts "%s%s [%d]" % ['  ' * (depth-1), ss.words.first, ss.synsetid] }
# (outputs:)
play [100041468]
  action [100037396]
    act [100030358]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
combat [101170962]
  battle [100958896]
    group action [101080366]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
      act [100030358]
        event [100029378]
          psychological feature [100023100]
            abstract entity [100002137]
              entity [100001740]

See the Traversal Methods section for more details.

Low-Level API

This library is implemented using Sequel::Model, an ORM layer on top of the excellent Sequel database toolkit. This means that in addition to the high-level methods above, you can also make use of a database-oriented API if you need to do something not provided by a high-level method.

In order to make use of this API, you’ll need to be familiar with Sequel, especially Datasets and Model Associations. Most of Ruby-WordNet’s functionality is implemented in terms of one or both of these.

Datasets

The main dataset is available from WordNet::Synset.dataset:

WordNet::Synset.dataset
# => #<Sequel::SQLite::Dataset: "SELECT * FROM `synsets`">

Synset also defines a few other canned datasets. Some restrict the search by part of speech and are called on the Synset class:

  • WordNet::Synset.nouns

  • WordNet::Synset.verbs

  • WordNet::Synset.adjectives

  • WordNet::Synset.adverbs

  • WordNet::Synset.adjective_satellites

or by the semantic links for a particular Synset:

  • WordNet::Synset#also_see_dataset

  • WordNet::Synset#attributes_dataset

  • WordNet::Synset#causes_dataset

  • WordNet::Synset#domain_categories_dataset

  • WordNet::Synset#domain_member_categories_dataset

  • WordNet::Synset#domain_member_regions_dataset

  • WordNet::Synset#domain_member_usages_dataset

  • WordNet::Synset#domain_regions_dataset

  • WordNet::Synset#domain_usages_dataset

  • WordNet::Synset#entailments_dataset

  • WordNet::Synset#hypernyms_dataset

  • WordNet::Synset#hyponyms_dataset

  • WordNet::Synset#instance_hypernyms_dataset

  • WordNet::Synset#instance_hyponyms_dataset

  • WordNet::Synset#member_holonyms_dataset

  • WordNet::Synset#member_meronyms_dataset

  • WordNet::Synset#part_holonyms_dataset

  • WordNet::Synset#part_meronyms_dataset

  • WordNet::Synset#semlinks_dataset

  • WordNet::Synset#semlinks_to_dataset

  • WordNet::Synset#senses_dataset

  • WordNet::Synset#similar_words_dataset

  • WordNet::Synset#substance_holonyms_dataset

  • WordNet::Synset#substance_meronyms_dataset

  • WordNet::Synset#sumo_terms_dataset

  • WordNet::Synset#verb_groups_dataset

  • WordNet::Synset#words_dataset

Constant Summary

Semantic link type keys; maps what the API calls them to what they are in the DB.

SEMANTIC_TYPEKEYS = Hash.new {|h,type| h[type] = type.to_s.chomp('s').to_sym }
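
The default block can be tried in isolation. The following is a minimal sketch in plain Ruby with no database involved; `typekeys` is a local stand-in for the constant, and the link names are used only as sample input.

```ruby
# Local stand-in for SEMANTIC_TYPEKEYS, built with the same default block.
typekeys = Hash.new {|h, type| h[type] = type.to_s.chomp('s').to_sym }

typekeys[ :hypernyms ]   # => :hypernym
typekeys[ :also_see ]    # => :also_see  (no trailing 's', so unchanged)

# The computed key is stored, so later lookups are plain Hash hits.
typekeys.key?( :hypernyms )   # => true

# Only a single trailing 's' is removed, so irregular plurals are not
# turned into their singular form.
typekeys[ :domain_categories ]   # => :domain_categorie
```

A name such as :domain_categories would therefore need an explicit entry to map onto a singular key; whether the library registers such entries is not shown in this section.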

Constants included from Constants

Constants::DEFAULT_DB_OPTIONS, Constants::DELIM, Constants::DELIM_RE, Constants::DOMAIN_TYPES, Constants::DomainSymbols, Constants::HOLONYM_SYMBOLS, Constants::HOLONYM_TYPES, Constants::HYPERNYM_SYMBOLS, Constants::HYPERNYM_TYPES, Constants::HYPONYM_SYMBOLS, Constants::HYPONYM_TYPES, Constants::LEXFILES, Constants::MEMBER_SYMBOLS, Constants::MEMBER_TYPES, Constants::MERONYM_SYMBOLS, Constants::MERONYM_TYPES, Constants::POINTER_SUBTYPES, Constants::POINTER_SYMBOLS, Constants::POINTER_TYPES, Constants::SUB_DELIM, Constants::SUB_DELIM_RE, Constants::SYNTACTIC_CATEGORIES, Constants::SYNTACTIC_SYMBOLS, Constants::VERB_SENTS


Class Attribute Details

.semantic_link_methods ⇒ Object

Returns the value of attribute semantic_link_methods.

# File 'lib/wordnet/synset.rb', line 370

def semantic_link_methods
  @semantic_link_methods
end

Class Method Details

.db=(newdb) ⇒ Object

Overridden to reset any lookup tables that may have been loaded from the previous database.

# File 'lib/wordnet/synset.rb', line 297

def self::db=( newdb )
	self.reset_lookup_tables
	super
end

.lexdomain_table ⇒ Object

Return the table of lexical domains, keyed by id.

# File 'lib/wordnet/synset.rb', line 315

def self::lexdomain_table
	@lexdomain_table ||= self.db[:lexdomains].to_hash( :lexdomainid )
end

.lexdomains ⇒ Object

Lexical domains, keyed by name as a String (e.g., “verb.cognition”).

# File 'lib/wordnet/synset.rb', line 321

def self::lexdomains
	@lexdomains ||= self.lexdomain_table.inject({}) do |hash,(id,domain)|
		hash[ domain[:lexdomainname] ] = domain
		hash
	end
end

.linktype_table ⇒ Object

Return the table of link types, keyed by linkid.

# File 'lib/wordnet/synset.rb', line 330

def self::linktype_table
	@linktype_table ||= self.db[:linktypes].inject({}) do |hash,row|
		hash[ row[:linkid] ] = {
			id: row[:linkid],
			typename: row[:link],
			type: row[:link].gsub( /\s+/, '_' ).to_sym,
			recurses: row[:recurses] && row[:recurses] != 0,
		}
		hash
	end
end

.linktypes ⇒ Object

Return the table of link types, keyed by name.

# File 'lib/wordnet/synset.rb', line 344

def self::linktypes
	@linktypes ||= self.linktype_table.inject({}) do |hash,(id,link)|
		hash[ link[:type] ] = link
		hash
	end
end
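
The two link-type tables above hold the same records under different keys. Here is a minimal sketch of that re-keying in plain Ruby; the rows and their link IDs are hypothetical stand-ins for what the :linktypes table might contain.

```ruby
# Hypothetical rows, shaped like the rows read from the :linktypes table.
rows = [
  { linkid: 1,  link: 'hypernym', recurses: 1 },
  { linkid: 40, link: 'also see', recurses: 0 },
]

# Keyed by link ID, as in .linktype_table.
by_id = rows.each_with_object( {} ) do |row, hash|
  hash[ row[:linkid] ] = {
    id:       row[:linkid],
    typename: row[:link],
    type:     row[:link].gsub( /\s+/, '_' ).to_sym,
    recurses: row[:recurses] && row[:recurses] != 0,
  }
end

# Re-keyed by the symbolized type name, as in .linktypes.
by_type = by_id.values.to_h {|link| [ link[:type], link ] }

by_id[ 40 ][ :type ]         # => :also_see
by_type[ :hypernym ][ :id ]  # => 1
```

Both Hashes point at the same record objects, so the second table costs only the extra keys.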

.postype_table ⇒ Object

Return the table of part-of-speech types, keyed by letter identifier.

# File 'lib/wordnet/synset.rb', line 353

def self::postype_table
	@postype_table ||= self.db[:postypes].inject({}) do |hash, row|
		hash[ row[:pos].to_sym ] = row[:posname]
		hash
	end
end

.postypes ⇒ Object

Return the table of part-of-speech names to letter identifiers (both Symbols).

# File 'lib/wordnet/synset.rb', line 362

def self::postypes
	@postypes ||= self.postype_table.invert
end
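
As a rough illustration of how the two part-of-speech tables relate, the sketch below builds one from hypothetical rows and inverts it. The row contents are assumptions; only the shape follows the code above.

```ruby
# Hypothetical rows shaped like the :postypes table.
rows = [
  { pos: 'n', posname: 'noun' },
  { pos: 'v', posname: 'verb' },
]

# Letter identifier (Symbol) => name, as in .postype_table.
postype_table = rows.each_with_object( {} ) do |row, hash|
  hash[ row[:pos].to_sym ] = row[:posname]
end

# Name => letter identifier, as in .postypes.
postypes = postype_table.invert

postype_table[ :n ]   # => "noun"
postypes[ 'verb' ]    # => :v
```

In this sketch the names stay Strings because the sample rows hold Strings; the value types returned by a real database may differ.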

.reset_lookup_tables ⇒ Object

Unload all of the cached lookup tables that have been loaded.

# File 'lib/wordnet/synset.rb', line 304

def self::reset_lookup_tables
	@lexdomain_table = nil
	@lexdomains      = nil
	@linktype_table  = nil
	@linktypes       = nil
	@postype_table   = nil
	@postypes        = nil
end
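
The lookup tables are memoized with `||=`, and resetting them to nil forces a reload on the next access, which is why .db= calls this method. A minimal sketch of that pattern follows; the anonymous class and its loader block are hypothetical and stand in for a database query.

```ruby
# A stand-in for a class with one memoized lookup table. The loader block
# plays the role of the database query; everything here is hypothetical.
table_cache = Class.new do
  def initialize( &loader )
    @loader = loader
    @table  = nil
  end

  # Load on first use, then reuse the cached value.
  def table
    @table ||= @loader.call
  end

  # Drop the cached value so the next call reloads it.
  def reset
    @table = nil
  end
end

loads = 0
cache = table_cache.new { loads += 1; { 1 => 'noun.time' } }

cache.table
cache.table   # served from the cache; the loader has run once
cache.reset
cache.table   # the loader runs again
loads         # => 2
```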

.semantic_link(type) ⇒ Object

Generate methods that will return Synsets related by the given semantic pointer type.

# File 'lib/wordnet/synset.rb', line 376

def self::semantic_link( type )
	self.log.debug "Generating a %p method" % [ type ]

	ds_method_body = Proc.new do
		self.semanticlink_dataset( type )
	end
	define_method( "#{type}_dataset", &ds_method_body )

	ss_method_body = Proc.new do
		self.semanticlink_dataset( type ).all
	end
	define_method( type, &ss_method_body )

	self.semantic_link_methods << type.to_sym
end
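
The generator above defines two readers per link type: one returning a dataset and one returning the loaded results. The sketch below imitates that with a toy class whose links live in a plain Hash; `toy_synset` and `links_for` are hypothetical names, and a lazy enumerator stands in for the dataset.

```ruby
# A toy class, not the library's: links are kept in a plain Hash.
toy_synset = Class.new do
  @semantic_link_methods = []

  class << self
    attr_reader :semantic_link_methods
  end

  # Define `<type>_dataset` (lazy) and `<type>` (materialized) readers.
  def self.semantic_link( type )
    define_method( "#{type}_dataset" ) { links_for( type ) }
    define_method( type ) { links_for( type ).to_a }
    semantic_link_methods << type.to_sym
  end

  def initialize( links )
    @links = links
  end

  # Stand-in for #semanticlink_dataset: returns a lazy enumerator.
  def links_for( type )
    @links.fetch( type, [] ).lazy
  end

  semantic_link :hypernyms
  semantic_link :hyponyms
end

links = { hypernyms: ['point in time'], hyponyms: ['birth', 'threshold'] }
ss = toy_synset.new( links )

ss.hyponyms                       # => ["birth", "threshold"]
ss.hypernyms_dataset.first        # => "point in time"
toy_synset.semantic_link_methods  # => [:hypernyms, :hyponyms]
```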

Instance Method Details

#also_see ⇒ Object

“See Also” synsets

# File 'lib/wordnet/synset.rb', line 467

semantic_link :also_see

#attributes ⇒ Object

Attribute synsets

# File 'lib/wordnet/synset.rb', line 471

semantic_link :attributes

#causes ⇒ Object

Cause synsets

# File 'lib/wordnet/synset.rb', line 475

semantic_link :causes

#domain_categories ⇒ Object

Domain category synsets

# File 'lib/wordnet/synset.rb', line 479

semantic_link :domain_categories

#domain_member_categories ⇒ Object

Domain member category synsets

# File 'lib/wordnet/synset.rb', line 483

semantic_link :domain_member_categories

#domain_member_regions ⇒ Object

Domain member region synsets

# File 'lib/wordnet/synset.rb', line 487

semantic_link :domain_member_regions

#domain_member_usages ⇒ Object

Domain member usage synsets

# File 'lib/wordnet/synset.rb', line 491

semantic_link :domain_member_usages

#domain_regions ⇒ Object

Domain region synsets

# File 'lib/wordnet/synset.rb', line 495

semantic_link :domain_regions

#domain_usages ⇒ Object

Domain usage synsets

# File 'lib/wordnet/synset.rb', line 499

semantic_link :domain_usages

#entailments ⇒ Object

Verb entailment synsets

# File 'lib/wordnet/synset.rb', line 503

semantic_link :entailments

#hypernyms ⇒ Object

Hypernym synsets

# File 'lib/wordnet/synset.rb', line 507

semantic_link :hypernyms

#hyponyms ⇒ Object

Hyponym synsets

# File 'lib/wordnet/synset.rb', line 511

semantic_link :hyponyms

#inspect ⇒ Object

Return a human-readable representation of the object, suitable for debugging.

# File 'lib/wordnet/synset.rb', line 689

def inspect
	return "#<%p:%0#x {%d} '%s' (%s): [%s] %s>" % [
		self.class,
		self.object_id * 2,
		self.synsetid,
		self.wordlist.join(', '),
		self.part_of_speech,
		self.lexical_domain,
		self.definition,
	]
end

#instance_hypernyms ⇒ Object

Instance hypernym synsets

# File 'lib/wordnet/synset.rb', line 515

semantic_link :instance_hypernyms

#instance_hyponyms ⇒ Object

Instance hyponym synsets

# File 'lib/wordnet/synset.rb', line 519

semantic_link :instance_hyponyms

#lexical_domain ⇒ Object

Return the name of the lexical domain the synset belongs to; this also corresponds to the lexicographer’s file the synset was originally loaded from.

# File 'lib/wordnet/synset.rb', line 447

def lexical_domain
	return self.class.lexdomain_table[ self.lexdomainid ][ :lexdomainname ]
end

#member_holonyms ⇒ Object

Member holonym synsets

# File 'lib/wordnet/synset.rb', line 523

semantic_link :member_holonyms

#member_meronyms ⇒ Object

Member meronym synsets

# File 'lib/wordnet/synset.rb', line 527

semantic_link :member_meronyms

#part_holonyms ⇒ Object

Part holonym synsets

# File 'lib/wordnet/synset.rb', line 531

semantic_link :part_holonyms

#part_meronyms ⇒ Object

Part meronym synsets

# File 'lib/wordnet/synset.rb', line 535

semantic_link :part_meronyms

#part_of_speech ⇒ Object

Return the name of the Synset’s part of speech (#pos).

# File 'lib/wordnet/synset.rb', line 417

def part_of_speech
	return self.class.postype_table[ self.pos.to_sym ]
end

#samples ⇒ Object

Return any sample sentences.

# File 'lib/wordnet/synset.rb', line 453

def samples
	return self.db[:samples].
		filter( synsetid: self.synsetid ).
		order( :sampleid ).
		map( :sample )
end

#search(type, synset) ⇒ Object

Search for the specified synset in the semantic links of the given type of the receiver, returning the depth it was found at if it’s found, or nil if it wasn’t found.

# File 'lib/wordnet/synset.rb', line 672

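def search( type, synset )
	# Each traversal step yields [synset, depth]; #find returns the first
	# matching pair, or nil if nothing matches.
	_found, depth = self.traverse( type ).with_depth.find {|ss, _depth| synset == ss }
	return depth
end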
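
The depth lookup relies on #find returning either a [synset, depth] pair or nil. A minimal sketch of that destructuring, using hypothetical pairs in place of a real traversal:

```ruby
# Hypothetical [synset, depth] pairs, in the order a traversal might yield them.
pairs = [ [:play, 1], [:action, 2], [:act, 3], [:combat, 1] ]

def depth_of( target, pairs )
  # #find returns the first matching pair, or nil when nothing matches;
  # destructuring nil leaves both variables nil.
  _found, depth = pairs.find {|syn, _depth| syn == target }
  depth
end

depth_of( :act, pairs )    # => 3
depth_of( :sword, pairs )  # => nil
```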

#semanticlink_dataset(type) ⇒ Object

Return a Sequel::Dataset for synsets related to the receiver via the semantic link of the specified type.

# File 'lib/wordnet/synset.rb', line 399

def semanticlink_dataset( type )
	typekey  = SEMANTIC_TYPEKEYS[ type ]
	linkinfo = self.class.linktypes[ typekey ] or
		raise ArgumentError, "no such link type %p" % [ typekey ]
	ssids    = self.semlinks_dataset.filter( linkid: linkinfo[:id] ).select( :synset2id )

	return self.class.filter( synsetid: ssids )
end
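
The method above works in two steps: select the target IDs of the receiver’s links of one type, then filter synsets by those IDs. The sketch below mimics the same two steps over in-memory Arrays; all rows and IDs are hypothetical, and the real method returns a Sequel dataset rather than an Array.

```ruby
# Hypothetical rows standing in for the semlinks and synsets tables.
semlinks = [
  { synset1id: 10, synset2id: 20, linkid: 1 },
  { synset1id: 10, synset2id: 30, linkid: 2 },
  { synset1id: 11, synset2id: 40, linkid: 1 },
]
synsets = [
  { synsetid: 20, definition: 'an instant of time' },
  { synsetid: 30, definition: 'the time when something begins' },
  { synsetid: 40, definition: 'something done' },
]

def related_synsets( synsetid, linkid, semlinks, synsets )
  # Step 1: IDs of the targets of this synset's links of the wanted type.
  target_ids = semlinks.
    select {|link| link[:synset1id] == synsetid && link[:linkid] == linkid }.
    map {|link| link[:synset2id] }

  # Step 2: the synsets whose ID is among those targets.
  synsets.select {|row| target_ids.include?( row[:synsetid] ) }
end

related_synsets( 10, 1, semlinks, synsets ).map {|row| row[:synsetid] }   # => [20]
```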

#semanticlink_enum(linktype) ⇒ Object

Return an Enumerator that will iterate over the Synsets related to the receiver via the semantic links of the specified linktype.

# File 'lib/wordnet/synset.rb', line 411

def semanticlink_enum( linktype )
	return self.semanticlink_dataset( linktype ).to_enum
end

#semlinks ⇒ Object

The WordNet::SemanticLinks indicating a relationship with other WordNet::Synsets

# File 'lib/wordnet/synset.rb', line 196

one_to_many :semlinks,
	class: 'WordNet::SemanticLink',
	key: :synset1id,
	primary_key: :synsetid,
	eager: :target

#semlinks_to ⇒ Object

The WordNet::SemanticLinks pointing to this Synset

# File 'lib/wordnet/synset.rb', line 205

many_to_one :semlinks_to,
	class: 'WordNet::SemanticLink',
	key: :synsetid,
	primary_key: :synset2id

#senses ⇒ Object

The WordNet::Senses associated with the receiver

# File 'lib/wordnet/synset.rb', line 188

one_to_many :senses,
	key: :synsetid,
	primary_key: :synsetid

#similar_words ⇒ Object

Similar word synsets

# File 'lib/wordnet/synset.rb', line 539

semantic_link :similar_words

#substance_holonyms ⇒ Object

Substance holonym synsets

# File 'lib/wordnet/synset.rb', line 543

semantic_link :substance_holonyms

#substance_meronyms ⇒ Object

Substance meronym synsets

# File 'lib/wordnet/synset.rb', line 547

semantic_link :substance_meronyms

#sumo_terms ⇒ Object

Terms from the Suggested Upper Merged Ontology

# File 'lib/wordnet/synset.rb', line 213

many_to_many :sumo_terms,
	join_table: :sumomaps,
	left_key: :synsetid,
	right_key: :sumoid

#to_s ⇒ Object

Stringify the synset.

# File 'lib/wordnet/synset.rb', line 423

def to_s

	# Make a sorted list of the semantic link types from this synset
	semlink_list = self.semlinks_dataset.
		group_and_count( :linkid ).
		to_hash( :linkid, :count ).
		collect do |linkid, count|
			'%s: %d' % [ self.class.linktype_table[linkid][:typename], count ]
		end.
		sort.
		join( ', ' )

	return "%s (%s): [%s] %s (%s)" % [
		self.words.map( &:to_s ).join(', '),
		self.part_of_speech,
		self.lexical_domain,
		self.definition,
		semlink_list
	]
end
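
The trailing parenthesized part of the string is a sorted per-type count of the synset’s outgoing links. A minimal in-memory sketch of building that summary follows; the list of link names is hypothetical and replaces the grouped database query.

```ruby
# Hypothetical link type names, one per outgoing link.
link_names = [ 'hyponym', 'hypernym', 'hyponym', 'hyponym', 'hyponym' ]

# Count per type (the in-memory equivalent of grouping and counting by linkid).
counts = link_names.each_with_object( Hash.new(0) ) {|name, tally| tally[name] += 1 }

semlink_list = counts.
  collect {|name, count| '%s: %d' % [ name, count ] }.
  sort.
  join( ', ' )
# => "hypernym: 1, hyponym: 4"
```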

#traverse(type, &block) ⇒ Object

With a block, yield a WordNet::Synset related to the receiver via a link of the specified type, recursing depth first into each of its links if the link type is recursive. To exit from the traversal at any depth, throw :stop_traversal.

If no block is given, return an Enumerator that will do the same thing instead.

# Print all the parts of a boot
puts lexicon[:boot].traverse( :part_meronyms ).to_a

You can also have the depth of recursion passed as an additional block argument by calling #with_depth on the Enumerator:

lexicon[ :fencing ].traverse( :hypernyms ).with_depth.each {|ss,d| puts "%02d: %s" % [d,ss] }
# (outputs:)

01: play, swordplay (noun): [noun.act] the act using a sword (or other weapon) vigorously
  and skillfully (hypernym: 1, hyponym: 1)
02: action (noun): [noun.act] something done (usually as opposed to something said)
  (hypernym: 1, hyponym: 33)
03: act, deed, human action, human activity (noun): [noun.tops] something that people do
  or cause to happen (hypernym: 1, hyponym: 40)
...


# File 'lib/wordnet/synset.rb', line 625

def traverse( type, &block )
	enum = Enumerator.new do |yielder|
		traversals = [ self.semanticlink_enum(type) ]
		syn        = nil
		typekey    = SEMANTIC_TYPEKEYS[ type ]
		recurses   = self.class.linktypes[ typekey ][:recurses]

		self.log.debug "Traversing %s semlinks%s" % [ type, recurses ? " (recursive)" : ''  ]

		catch( :stop_traversal ) do
			until traversals.empty?
				begin
					self.log.debug "  %d traversal/s left" % [ traversals.length ]
					syn = traversals.last.next

					if enum.with_depth?
						yielder.yield( syn, traversals.length )
					else
						yielder.yield( syn )
					end

					traversals << syn.semanticlink_enum( type ) if recurses
				rescue StopIteration
					traversals.pop
				end
			end
		end
	end

	def enum.with_depth?
		@with_depth = false if !defined?( @with_depth )
		return @with_depth
	end

	def enum.with_depth
		@with_depth = true
		self
	end

	return enum.each( &block ) if block
	return enum
end
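
The traversal keeps a stack of enumerators: the top one supplies the next synset, a new enumerator is pushed for each synset visited, and an exhausted one is popped. The sketch below applies the same technique to a toy graph; the graph data and the `traverse_with_depth` helper are hypothetical, it always recurses and always yields the depth, and it does not guard against cycles.

```ruby
# Toy hypernym graph (hypothetical data, not taken from WordNet).
graph = {
  fencing: [:play, :combat],
  play:    [:action],
  combat:  [:battle],
  action:  [],
  battle:  [],
}

def traverse_with_depth( start, graph )
  Enumerator.new do |yielder|
    # Stack of external enumerators; its length is the current depth.
    stack = [ graph.fetch( start ).each ]

    catch( :stop_traversal ) do
      until stack.empty?
        begin
          node = stack.last.next
          yielder.yield( node, stack.length )
          stack << graph.fetch( node ).each   # descend before visiting siblings
        rescue StopIteration
          stack.pop                           # this level is exhausted
        end
      end
    end
  end
end

pairs = traverse_with_depth( :fencing, graph ).to_a
# => [[:play, 1], [:action, 2], [:combat, 1], [:battle, 2]]

# Throwing :stop_traversal from the block ends the walk early.
seen = []
traverse_with_depth( :fencing, graph ).each do |node, _depth|
  seen << node
  throw :stop_traversal if node == :action
end
seen   # => [:play, :action]
```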

#verb_groups ⇒ Object

Verb group synsets

# File 'lib/wordnet/synset.rb', line 551

semantic_link :verb_groups

#wordlist ⇒ Object

Return the Synset’s Words as an Array of Strings.

# File 'lib/wordnet/synset.rb', line 683

def wordlist
	return self.words.map( &:to_s )
end

#words ⇒ Object

The WordNet::Words associated with the receiver

# File 'lib/wordnet/synset.rb', line 180

many_to_many :words,
	join_table: :senses,
	left_key: :synsetid,
	right_key: :wordid

#|(othersyn) ⇒ Object

Union: Return the least general synset that the receiver and othersyn have in common as a hypernym, or nil if they share none.

# File 'lib/wordnet/synset.rb', line 583

def |( othersyn )

	# Find all of this syn's hypernyms
	hypersyns = self.traverse( :hypernyms ).to_a
	commonsyn = nil

	# Now traverse the other synset's hypernyms looking for one of our
	# own hypernyms.
	othersyn.traverse( :hypernyms ) do |syn|
		if hypersyns.include?( syn )
			commonsyn = syn
			throw :stop_traversal
		end
	end

	return commonsyn
end
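
The union operator collects the receiver’s hypernyms, then walks the other synset’s hypernyms and stops at the first one that appears in that collection. A minimal sketch of the same idea on a toy graph follows; the graph and both helper names are hypothetical.

```ruby
# Toy hypernym graph (hypothetical data): each key maps to its direct hypernyms.
graph = {
  dog:       [:canine],
  canine:    [:carnivore],
  cat:       [:feline],
  feline:    [:carnivore],
  carnivore: [:animal],
  animal:    [],
  rock:      [],
}

# All hypernyms of a node, depth first, nearest first.
def hypernyms_of( node, graph )
  graph.fetch( node, [] ).flat_map {|parent| [parent] + hypernyms_of( parent, graph ) }
end

# First hypernym of `other` that is also a hypernym of `node`, or nil.
def common_hypernym( node, other, graph )
  mine = hypernyms_of( node, graph )
  hypernyms_of( other, graph ).find {|syn| mine.include?( syn ) }
end

common_hypernym( :dog, :cat, graph )    # => :carnivore
common_hypernym( :dog, :rock, graph )   # => nil
```

Because the walk over the second synset’s hypernyms starts with the nearest ones, the first match is the least general shared hypernym along that path.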