Class: Sequitur::Production

Inherits:
Object
  • Object
show all
Defined in:
lib/sequitur/production.rb

Overview

In a context-free grammar, a production is a rule in which its left-hand side (LHS) consists solely of a non-terminal symbol and the right-hand side (RHS) consists of a sequence of symbols. The symbols in RHS can be either terminal or non-terminal symbols. The rule stipulates that the LHS is equivalent to the RHS, in other words every occurrence of the LHS can be substituted to corresponding RHS. The object id of the production is taken as its LHS.

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initializeProduction

Constructor. Build a production with an empty RHS.



25
26
27
28
# File 'lib/sequitur/production.rb', line 25

def initialize()
  clear_rhs
  @backrefs = {}
end

Instance Attribute Details

#backrefsObject (readonly)

A Hash with pairs of the form: production id => reference count Where the reference count is the number of times this production appears in the rhs of the production with given id.



22
23
24
# File 'lib/sequitur/production.rb', line 22

def backrefs
  @backrefs
end

#rhsObject (readonly)

The right-hand side (rhs) consists of a sequence of grammar symbols



16
17
18
# File 'lib/sequitur/production.rb', line 16

def rhs
  @rhs
end

Instance Method Details

#add_backref(aProduction) ⇒ Object

Add a back reference to the given production.



83
84
85
86
87
88
89
# File 'lib/sequitur/production.rb', line 83

def add_backref(aProduction)
  prod_id = aProduction.object_id

  count = backrefs.fetch(prod_id, 0)
  backrefs[prod_id] = count + 1
  return count
end

#append_symbol(aSymbol) ⇒ Object



131
132
133
134
# File 'lib/sequitur/production.rb', line 131

def append_symbol(aSymbol)
  aSymbol.add_backref(self) if aSymbol.kind_of?(Production)
  rhs << aSymbol
end

#calc_append_symbol(aSymbol) ⇒ Object

Return the digrams for this production as if the given symbol is appended at the end of the rhs



125
126
127
128
129
# File 'lib/sequitur/production.rb', line 125

def calc_append_symbol(aSymbol)
  return [] if empty?

  return digrams + [ Digram.new(rhs.last, aSymbol, self) ]
end

#clear_rhsObject

Clear the right-hand side. Any referenced production has its back reference counter decremented



138
139
140
141
142
143
144
# File 'lib/sequitur/production.rb', line 138

def clear_rhs()
  if rhs
    refs = references
    refs.each { |a_ref| a_ref.remove_backref(self) }
  end
  @rhs = []
end

#digramsObject

Return the list digrams found in rhs of this production.



147
148
149
150
151
152
153
154
# File 'lib/sequitur/production.rb', line 147

def digrams()
  return [] if rhs.size < 2

  result = []
  rhs.each_cons(2) { |couple| result << Digram.new(*couple, self) }

  return result
end

#empty?Boolean

Is the rhs empty?

Returns:

  • (Boolean)


33
34
35
# File 'lib/sequitur/production.rb', line 33

def empty?
  return rhs.empty?
end

#last_digramObject

Return the last digram appearing in the RHS.



64
65
66
67
68
# File 'lib/sequitur/production.rb', line 64

def last_digram()
  return nil if rhs.size < 2
  
  return Digram.new(rhs[-2], rhs[-1], self)
end

#refcountObject

The back reference count is the number of times this production appears in the rhs of all the productions of the grammar



74
75
76
77
78
79
80
# File 'lib/sequitur/production.rb', line 74

def refcount()
  total = backrefs.values.reduce(0) do |sub_result, count|
    sub_result += count
  end

  return total
end

#referencesObject

Return the set of productions appearing in the rhs



39
40
41
# File 'lib/sequitur/production.rb', line 39

def references()
  return rhs.select { |symb| symb.kind_of?(Production) }
end

#remove_backref(aProduction) ⇒ Object

Decrement the reference count for the given production. If result is zero, then the entry is removed from the Hash.



93
94
95
96
97
98
99
100
101
102
103
104
105
106
# File 'lib/sequitur/production.rb', line 93

def remove_backref(aProduction)
  prod_id = aProduction.object_id

  count = backrefs.fetch(prod_id)
  fail StandardError if count < 1

  if count > 1
    backrefs[prod_id] = count - 1
  else
    backrefs.delete(prod_id)
  end

  return count
end

#repeated_digram?Boolean

Detect whether the last digram occurs twice Assumption: when a digram occurs twice in a production then it must occur at the end of the rhs

Returns:

  • (Boolean)


53
54
55
56
57
58
59
60
61
# File 'lib/sequitur/production.rb', line 53

def repeated_digram?
  return false if rhs.size < 3

  my_digrams = digrams
  all_keys = my_digrams.map(&:key)
  last_key = all_keys.pop
  same_key_found = all_keys.index(last_key)
  return !same_key_found.nil?
end

#replace_digram(another) ⇒ Object

Substitute in self all occurence of the digram that appears in the rhs of the other production Pre-condition: another has a rhs with exactly one digram (= a two-symbol sequence).



160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
# File 'lib/sequitur/production.rb', line 160

def replace_digram(another)
  # Find the positions where the digram occur in rhs
  (symb1, symb2) = another.rhs
  indices = [ -2 ] # Dummy index!

  (0...rhs.size).each do |i|
    next if i == indices.last + 1
    indices << i if (rhs[i] == symb1) && (rhs[i + 1] == symb2)
  end
  indices.shift

  pos = indices.reverse

  # Replace the two symbol sequence by the production
  pos.each do |index|
    rhs[index].remove_backref(self) if rhs[index].kind_of?(Production)
    rhs[index] = another
    index1 = index + 1
    rhs[index1].remove_backref(self) if rhs[index1].kind_of?(Production)
    rhs.delete_at(index1)
    another.add_backref(self)
  end
end

#replace_production(another) ⇒ Object

Replace every occurrence of 'another' production in rhs by the rhs of 'another'.



186
187
188
189
190
191
192
193
194
195
196
# File 'lib/sequitur/production.rb', line 186

def replace_production(another)
  (0...rhs.size).to_a.reverse.each do |index|
    next unless rhs[index] == another
    rhs.insert(index + 1, *another.rhs)
    another.rhs.each do |new_symb|
      new_symb.add_backref(self) if new_symb.kind_of?(Production)
    end
    another.remove_backref(self)
    rhs.delete_at(index)
  end
end

#single_digram?Boolean

Does the rhs have exactly one digram only (= 2 symbols)?

Returns:

  • (Boolean)


45
46
47
# File 'lib/sequitur/production.rb', line 45

def single_digram?
  return rhs.size == 2
end

#to_stringObject

Emit a text representation of the production rule. Text is of the form: object id of production : rhs as space-separated sequence of symbols.



111
112
113
114
115
116
117
118
119
120
121
# File 'lib/sequitur/production.rb', line 111

def to_string()
  rhs_text = rhs.map do |elem|
    case elem
      when String then "'#{elem}'"
      when Production then "#{elem.object_id}"
    else "#{elem}"
    end
  end

  return "#{object_id} : #{rhs_text.join(' ')}."
end