Class: Sequitur::Production

Inherits:
Object
  • Object
show all
Defined in:
lib/sequitur/production.rb

Overview

In a context-free grammar, a production is a rule in which its left-hand side (LHS) consists solely of a non-terminal symbol and the right-hand side (RHS) consists of a sequence of symbols. The symbols in RHS can be either terminal or non-terminal symbols. The rule stipulates that the LHS is equivalent to the RHS, in other words every occurrence of the LHS can be substituted to corresponding RHS. The object id of the production is taken as its LHS.

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initializeProduction

Constructor. Build a production with an empty RHS.



25
26
27
28
# File 'lib/sequitur/production.rb', line 25

def initialize()
  clear_rhs
  @backrefs = {}
end

Instance Attribute Details

#backrefsObject (readonly)

A Hash with pairs of the form: production id => reference count Where the reference count is the number of times this production appears in the rhs of the production with given id.



22
23
24
# File 'lib/sequitur/production.rb', line 22

def backrefs
  @backrefs
end

#rhsObject (readonly)

The right-hand side (rhs) consists of a sequence of grammar symbols



16
17
18
# File 'lib/sequitur/production.rb', line 16

def rhs
  @rhs
end

Instance Method Details

#add_backref(aProduction) ⇒ Object

Add a back reference to the given production.

Parameters:

  • aProduction (Production)

    Assume that production P appears in the RHS of production Q, then a reference count of P is incremented in Q.



85
86
87
88
89
90
91
# File 'lib/sequitur/production.rb', line 85

def add_backref(aProduction)
  prod_id = aProduction.object_id

  count = backrefs.fetch(prod_id, 0)
  backrefs[prod_id] = count + 1
  return count
end

#append_symbol(aSymbol) ⇒ Object



133
134
135
136
# File 'lib/sequitur/production.rb', line 133

def append_symbol(aSymbol)
  aSymbol.add_backref(self) if aSymbol.kind_of?(Production)
  rhs << aSymbol
end

#calc_append_symbol(aSymbol) ⇒ Object

Return the digrams for this production as if the given symbol is appended at the end of the rhs



127
128
129
130
131
# File 'lib/sequitur/production.rb', line 127

def calc_append_symbol(aSymbol)
  return [] if empty?

  return digrams + [ Digram.new(rhs.last, aSymbol, self) ]
end

#clear_rhsObject

Clear the right-hand side. Any referenced production has its back reference counter decremented



140
141
142
143
144
145
146
# File 'lib/sequitur/production.rb', line 140

def clear_rhs()
  if rhs
    refs = references
    refs.each { |a_ref| a_ref.remove_backref(self) }
  end
  @rhs = []
end

#digramsObject

Return the list digrams found in rhs of this production.



149
150
151
152
153
154
155
156
# File 'lib/sequitur/production.rb', line 149

def digrams()
  return [] if rhs.size < 2

  result = []
  rhs.each_cons(2) { |couple| result << Digram.new(*couple, self) }

  return result
end

#empty?Boolean

Is the rhs empty?

Returns:

  • (Boolean)


33
34
35
# File 'lib/sequitur/production.rb', line 33

def empty?
  return rhs.empty?
end

#last_digramObject

Return the last digram appearing in the RHS.



64
65
66
67
68
# File 'lib/sequitur/production.rb', line 64

def last_digram()
  return nil if rhs.size < 2
  
  return Digram.new(rhs[-2], rhs[-1], self)
end

#refcountObject

The back reference count is the number of times this production appears in the rhs of all the productions of the grammar



74
75
76
77
78
79
80
# File 'lib/sequitur/production.rb', line 74

def refcount()
  total = backrefs.values.reduce(0) do |sub_result, count|
    sub_result += count
  end

  return total
end

#referencesObject

Return the set of productions appearing in the rhs.



39
40
41
# File 'lib/sequitur/production.rb', line 39

def references()
  return rhs.select { |symb| symb.kind_of?(Production) }
end

#remove_backref(aProduction) ⇒ Object

Decrement the reference count for the given production. If result is zero, then the entry is removed from the Hash.



95
96
97
98
99
100
101
102
103
104
105
106
107
108
# File 'lib/sequitur/production.rb', line 95

def remove_backref(aProduction)
  prod_id = aProduction.object_id

  count = backrefs.fetch(prod_id)
  fail StandardError if count < 1

  if count > 1
    backrefs[prod_id] = count - 1
  else
    backrefs.delete(prod_id)
  end

  return count
end

#repeated_digram?Boolean

Detect whether the last digram occurs twice Assumption: when a digram occurs twice in a production then it must occur at the end of the rhs

Returns:

  • (Boolean)


53
54
55
56
57
58
59
60
61
# File 'lib/sequitur/production.rb', line 53

def repeated_digram?
  return false if rhs.size < 3

  my_digrams = digrams
  all_keys = my_digrams.map(&:key)
  last_key = all_keys.pop
  same_key_found = all_keys.index(last_key)
  return !same_key_found.nil?
end

#replace_digram(another) ⇒ Object

Substitute in self all occurence of the digram that appears in the rhs of the other production Pre-condition: another has a rhs with exactly one digram (= a two-symbol sequence).



162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
# File 'lib/sequitur/production.rb', line 162

def replace_digram(another)
  # Find the positions where the digram occur in rhs

  (symb1, symb2) = another.rhs
  indices = [ -2 ] # Dummy index!


  (0...rhs.size).each do |i|
    next if i == indices.last + 1
    indices << i if (rhs[i] == symb1) && (rhs[i + 1] == symb2)
  end
  indices.shift

  pos = indices.reverse

  # Replace the two symbol sequence by the production

  pos.each do |index|
    rhs[index].remove_backref(self) if rhs[index].kind_of?(Production)
    rhs[index] = another
    index1 = index + 1
    rhs[index1].remove_backref(self) if rhs[index1].kind_of?(Production)
    rhs.delete_at(index1)
    another.add_backref(self)
  end
end

#replace_production(another) ⇒ Object

Replace every occurrence of 'another' production in rhs by the rhs of 'another'.



188
189
190
191
192
193
194
195
196
197
198
# File 'lib/sequitur/production.rb', line 188

def replace_production(another)
  (0...rhs.size).to_a.reverse.each do |index|
    next unless rhs[index] == another
    rhs.insert(index + 1, *another.rhs)
    another.rhs.each do |new_symb|
      new_symb.add_backref(self) if new_symb.kind_of?(Production)
    end
    another.remove_backref(self)
    rhs.delete_at(index)
  end
end

#single_digram?Boolean

Does the rhs have exactly one digram only (= 2 symbols)?

Returns:

  • (Boolean)


45
46
47
# File 'lib/sequitur/production.rb', line 45

def single_digram?
  return rhs.size == 2
end

#to_stringObject

Emit a text representation of the production rule. Text is of the form: object id of production : rhs as space-separated sequence of symbols.



113
114
115
116
117
118
119
120
121
122
123
# File 'lib/sequitur/production.rb', line 113

def to_string()
  rhs_text = rhs.map do |elem|
    case elem
      when String then "'#{elem}'"
      when Production then "#{elem.object_id}"
    else elem.to_s
    end
  end

  return "#{object_id} : #{rhs_text.join(' ')}."
end