Class: Myasorubka::MSD
- Inherits:
- Object
- Defined in:
- lib/myasorubka/msd.rb
Overview
MSD is a morphosyntactic descriptor model.
This representation, with the concrete applications which display and exemplify the attributes and values and provide their internal constraints and relationships, makes the proposal self-explanatory. Other groups can easily test the specifications on their language, simply by following the method of the applications. The possibility of incorporating idiosyncratic classes and distinctions after the common core features makes the proposal relatively adaptable and flexible, without compromising compatibility.
MSD implementation and documentation are based on MULTEXT-East Morphosyntactic Specifications, Version 4: nl.ijs.si/ME/V4/msd/html/msd.html
You may use Myasorubka::MSD either as a generator or as a parser.
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian)
msd[:pos] = :noun
msd[:type] = :common
msd[:number] = :plural
msd[:case] = :locative
msd.to_s # => "Nc-pl"
```
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian, 'Vmps-snpfel')
msd[:pos] # => :verb
msd[:tense] # => :past
msd[:person] # => nil
msd.grammemes # => {:vform=>:participle, …}
```
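The parsing step itself is not listed on this page (the constructor delegates to `parse!`). As a rough, self-contained sketch of how a positional descriptor can be decoded, the code below assumes a table shaped the way `#to_s` reads `CATEGORIES`: a `:code` plus an ordered `:attrs` list of name and value-map pairs. `TOY_CATEGORIES` and `decode` are hypothetical names, and the table is a small illustrative subset, not the library's Russian specification.

```ruby
# Hypothetical, heavily reduced category table. The shape (:code and an
# ordered :attrs list) is inferred from how #to_s reads CATEGORIES.
TOY_CATEGORIES = {
  noun: {
    code: 'N',
    attrs: [
      [:type,   { common: 'c', proper: 'p' }],
      [:gender, { masculine: 'm', feminine: 'f', neuter: 'n' }],
      [:number, { singular: 's', plural: 'p' }],
      [:case,   { nominative: 'n', genitive: 'g', locative: 'l' }]
    ]
  }
}.freeze

# Decode a positional string into [pos, grammemes]. This is an
# illustration only, not the library's parse! method.
def decode(string, categories = TOY_CATEGORIES)
  code, *rest = string.chars
  pos, category = categories.find { |_, c| c[:code] == code }
  raise ArgumentError, "unknown category code #{code.inspect}" unless category

  grammemes = {}
  rest.each_with_index do |char, index|
    next if char == '-' # position left unspecified

    name, values = category[:attrs][index]
    raise ArgumentError, "too many positions in #{string.inspect}" unless name

    value = values.key(char)
    raise ArgumentError, "unknown #{name} code #{char.inspect}" unless value

    grammemes[name] = value
  end

  [pos, grammemes]
end

pos, grammemes = decode('Nc-pl')
pos                                                               # => :noun
grammemes == { type: :common, number: :plural, case: :locative } # => true
```

Each character after the category code is looked up by its position, which is why an unset attribute still needs the `-` placeholder to keep later positions aligned.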
Defined Under Namespace
Modules: English, Russian
Classes: InvalidDescriptor
Constant Summary
- EMPTY_DESCRIPTOR = '-'
Empty descriptor character.
Instance Attribute Summary
-
#grammemes ⇒ Object
readonly
Returns the value of attribute grammemes.
-
#language ⇒ Object
readonly
Returns the value of attribute language.
-
#pos ⇒ Object
Returns the value of attribute pos.
Instance Method Summary
- #<=>(other) ⇒ Object
- #==(other) ⇒ Object
-
#[](key) ⇒ Symbol
Retrieves the morphosyntactic descriptor corresponding to the `key` object.
-
#[]=(key, value) ⇒ Symbol
Assigns the morphosyntactic descriptor given by `value` to the key given by `key`.
-
#initialize(language, msd = '') ⇒ MSD
constructor
Creates a new morphosyntactic descriptor model instance.
- #inspect ⇒ Object
-
#merge!(hash) ⇒ MSD
Merges grammemes that are stored in `hash` into the MSD grammemes.
-
#prune! ⇒ MSD
Drops every attribute that does not appear in the category.
-
#to_regexp ⇒ Regexp
Generates a Regexp from the MSD, which is useful for database queries.
- #to_s ⇒ Object
-
#valid? ⇒ true, false
Validates the MSD instance.
Constructor Details
#initialize(language, msd = '') ⇒ MSD
Creates a new morphosyntactic descriptor model instance. The `language` argument must be a module that defines `CATEGORIES`; otherwise an ArgumentError is raised.
Optionally, an MSD string passed as the `msd` argument is parsed on construction.
```ruby
# File 'lib/myasorubka/msd.rb', line 63
def initialize(language, msd = '')
  @language, @pos, @grammemes = language, nil, {}

  unless language.const_defined? 'CATEGORIES'
    raise ArgumentError,
      'given language has no morphosyntactic descriptions'
  end

  parse! msd if msd && !msd.empty?
end
```
Instance Attribute Details
#grammemes ⇒ Object (readonly)
Returns the value of attribute grammemes.
```ruby
# File 'lib/myasorubka/msd.rb', line 50
def grammemes
  @grammemes
end
```
#language ⇒ Object (readonly)
Returns the value of attribute language.
```ruby
# File 'lib/myasorubka/msd.rb', line 50
def language
  @language
end
```
#pos ⇒ Object
Returns the value of attribute pos.
```ruby
# File 'lib/myasorubka/msd.rb', line 51
def pos
  @pos
end
```
Instance Method Details
#<=>(other) ⇒ Object
```ruby
# File 'lib/myasorubka/msd.rb', line 104
def <=> other
  to_s <=> other.to_s
end
```
#==(other) ⇒ Object
```ruby
# File 'lib/myasorubka/msd.rb', line 109
def == other
  to_s == other.to_s
end
```
#[](key) ⇒ Symbol
Retrieves the morphosyntactic descriptor stored under `key`; the `:pos` key returns the category itself. Returns `nil` if no value is set.
```ruby
# File 'lib/myasorubka/msd.rb', line 80
def [] key
  return pos if :pos == key
  grammemes[key]
end
```
#[]=(key, value) ⇒ Symbol
Assigns the morphosyntactic descriptor given by `value` to the key given by `key`. Assigning `:pos` sets the category; any other key raises InvalidDescriptor until a category has been set.
```ruby
# File 'lib/myasorubka/msd.rb', line 92
def []= key, value
  return @pos = value if :pos == key
  raise InvalidDescriptor, 'category is not set yet' unless pos
  grammemes[key] = value
end
```
#inspect ⇒ Object
```ruby
# File 'lib/myasorubka/msd.rb', line 99
def inspect
  '#<%s msd=%s>' % [language.name, to_s.inspect]
end
```
#merge!(hash) ⇒ MSD
Merges grammemes that are stored in `hash` into the MSD grammemes.
```ruby
# File 'lib/myasorubka/msd.rb', line 140
def merge! hash
  hash.each do |key, value|
    self[key.to_sym] = value.to_sym
  end

  self
end
```
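Because `merge!` goes through `#[]=`, which raises while the category is unset, the order of keys in `hash` matters unless `pos` was assigned beforehand. The sketch below uses a hypothetical stand-in class, `TinyMSD`, that copies only the assignment and merge logic quoted on this page; it is not the real class.

```ruby
# Stand-in reproducing only the []= and merge! logic quoted above;
# this is not the real Myasorubka::MSD class.
class TinyMSD
  class InvalidDescriptor < StandardError; end

  attr_reader :pos, :grammemes

  def initialize
    @pos = nil
    @grammemes = {}
  end

  def []=(key, value)
    return @pos = value if :pos == key
    raise InvalidDescriptor, 'category is not set yet' unless pos
    grammemes[key] = value
  end

  def merge!(hash)
    hash.each { |key, value| self[key.to_sym] = value.to_sym }
    self
  end
end

# String keys and values are converted to symbols on the way in.
msd = TinyMSD.new.merge!('pos' => 'noun', 'number' => 'plural')
msd.pos                             # => :noun
msd.grammemes == { number: :plural } # => true

# With 'pos' listed last, the first assignment fails.
begin
  TinyMSD.new.merge!('number' => 'plural', 'pos' => 'noun')
rescue TinyMSD::InvalidDescriptor => e
  e.message # => "category is not set yet"
end
```

Ruby hashes iterate in insertion order, so placing the `pos` entry first is enough to avoid the error.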
#prune! ⇒ MSD
Drops every attribute that is not defined for the current category, along with attributes whose value is not valid for it. If the category itself is unknown, `pos` and all grammemes are cleared.
```ruby
# File 'lib/myasorubka/msd.rb', line 192
def prune!
  unless category = language::CATEGORIES[pos]
    self.pos = nil
    grammemes.clear
    return self
  end

  attributes = category[:attrs]

  grammemes.reject! do |attribute, value|
    if index = attributes.index { |name, _| name == attribute }
      _, values = attributes[index]
      !values[value]
    else
      true
    end
  end

  self
end
```
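The filtering rule can be tried in isolation. The sketch below applies the same `reject!` condition to a plain hash, using a hypothetical `prune` helper and an `attrs` list in the shape the method expects (ordered pairs of attribute name and value map). The attribute names and codes are illustrative only.

```ruby
# Apply the pruning rule to a grammeme hash, given one category's
# attribute list of the form [[name, { value => code }], ...].
def prune(grammemes, attrs)
  grammemes.reject do |attribute, value|
    pair = attrs.find { |name, _| name == attribute }
    pair.nil? || !pair.last[value]
  end
end

attrs = [
  [:type,   { common: 'c', proper: 'p' }],
  [:number, { singular: 's', plural: 'p' }]
]

grammemes = {
  type:   :common, # known attribute, known value: kept
  number: :dual,   # known attribute, unknown value: dropped
  tense:  :past    # attribute not defined for the category: dropped
}

prune(grammemes, attrs) == { type: :common } # => true
```

Unlike `prune!`, this helper returns a new hash and leaves its argument untouched, which keeps the example free of side effects.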
#to_regexp ⇒ Regexp
Generates a Regexp from the MSD, which is useful for database queries. Unset positions match any single character, and any trailing positions are accepted.
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian, 'Vm')
r = msd.to_regexp # => /^Vm.*$/
'Vmp' =~ r # 0
'Nc-pl' =~ r # nil
```
```ruby
# File 'lib/myasorubka/msd.rb', line 125
def to_regexp
  Regexp.new([
    '^',
    self.to_s.gsub(EMPTY_DESCRIPTOR, '.'),
    '.*',
    '$'
  ].join)
end
```
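To see what the generated pattern accepts without loading the library, the same construction can be applied to a plain string. `msd_pattern` is a hypothetical helper that mirrors the method body above, taking the MSD string as an argument.

```ruby
EMPTY_DESCRIPTOR = '-'

# Build the pattern the same way #to_regexp does, from an MSD string.
def msd_pattern(string)
  Regexp.new(['^', string.gsub(EMPTY_DESCRIPTOR, '.'), '.*', '$'].join)
end

pattern = msd_pattern('Nc-pl')
pattern.source            # => "^Nc.pl.*$"
pattern.match?('Ncmpl')   # => true, the unset position matches any code
pattern.match?('Nc-plny') # => true, trailing positions are allowed
pattern.match?('Ncmsl')   # => false, the number position differs
```

In Ruby, `^` and `$` anchor at line boundaries, so this pattern is intended for single-line descriptor strings.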
#to_s ⇒ Object
```ruby
# File 'lib/myasorubka/msd.rb', line 149
def to_s
  return '' unless pos

  unless category = language::CATEGORIES[pos]
    raise InvalidDescriptor, "category is nil"
  end

  attributes = category[:attrs]
  msd = [category[:code]]

  grammemes.each do |attribute, value|
    next unless value

    unless index = attributes.index { |name, _| name == attribute }
      raise InvalidDescriptor, 'no such attribute "%s" of category "%s"' %
        [attribute, pos]
    end

    _, values = attributes[index]

    unless attribute_value = values[value]
      raise InvalidDescriptor, 'no such value "%s" for attribute "%s" of category "%s"' %
        [value, attribute, pos]
    end

    msd[index + 1] = attribute_value
  end

  msd.map { |e| e || EMPTY_DESCRIPTOR }.join
end
```
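The placeholder characters come from how Ruby arrays handle assignment past the current end: skipped slots are filled with `nil`, and the final `map` turns each `nil` into `EMPTY_DESCRIPTOR`. A minimal sketch of that step, with a hypothetical `join_codes` helper and hand-picked positions:

```ruby
EMPTY_DESCRIPTOR = '-'

# Join positional codes, writing the placeholder for every unset slot.
def join_codes(codes)
  codes.map { |e| e || EMPTY_DESCRIPTOR }.join
end

codes = ['N']  # the category code always comes first
codes[1] = 'c' # first attribute position
codes[3] = 'p' # third position; index 2 is filled with nil
codes[4] = 'l' # fourth position

codes             # => ["N", "c", nil, "p", "l"]
join_codes(codes) # => "Nc-pl"
```

Unset positions after the last assigned attribute produce no placeholder at all, since the array simply ends there.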
#valid? ⇒ true, false
Validates the MSD instance.
```ruby
# File 'lib/myasorubka/msd.rb', line 182
def valid?
  !!to_s
rescue InvalidDescriptor
  false
end
```
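One consequence follows from the listings on this page: `#to_s` returns an empty string when no category is set, and an empty string is truthy in Ruby, so `valid?` reports `true` for such an instance. The sketch below uses hypothetical stand-ins (`render` in place of `#to_s`, and a `valid?` that takes the category as an argument) to show the method-level `rescue` and that edge case.

```ruby
class InvalidDescriptor < StandardError; end

# Hypothetical stand-in for #to_s: '' without a category, an error for
# an unknown one, and a descriptor string otherwise.
def render(pos)
  return '' unless pos
  raise InvalidDescriptor, 'category is nil' unless pos == :noun
  'N'
end

# Same structure as the method above: any InvalidDescriptor raised while
# rendering is converted into false.
def valid?(pos)
  !!render(pos)
rescue InvalidDescriptor
  false
end

valid?(:noun)  # => true
valid?(:bogus) # => false
valid?(nil)    # => true, because !!'' is true
```

Callers that need to reject empty descriptors may want to check `pos` separately.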