Class: EBNF::Rule

Inherits: Object

Defined in: lib/ebnf/rule.rb

Overview

Represents an individual parsed rule.

Constant Summary

BNF_OPS = Operations which are flattened to separate rules in to_bnf.

%w{
  alt diff not opt plus rept seq star
}.map(&:to_sym).freeze

TERM_OPS =

%w{
  hex istr range
}.map(&:to_sym).freeze

OP_ARGN = The number of arguments expected per operator. `nil` for unspecified.

{
  alt: nil,
  diff: 2,
  hex: 1,
  istr: 1,
  not: 1,
  opt: 1,
  plus: 1,
  range: 1,
  rept: 3,
  seq: nil,
  star: 1
}
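
As a rough illustration of how an arity table like this can be consulted, the sketch below checks operand counts against a local copy of the table. `SKETCH_ARGN` and `arity_ok?` are hypothetical names invented for this example; they are not part of the library.

```ruby
# Local copy of the operator arity table shown above; nil means "any number".
SKETCH_ARGN = {
  alt: nil, diff: 2, hex: 1, istr: 1, not: 1, opt: 1,
  plus: 1, range: 1, rept: 3, seq: nil, star: 1
}.freeze

# Hypothetical helper: true when expr uses a known operator
# with the expected number of operands.
def arity_ok?(expr)
  op, *operands = expr
  return false unless SKETCH_ARGN.key?(op)
  SKETCH_ARGN[op].nil? || SKETCH_ARGN[op] == operands.length
end

p arity_ok?([:diff, :CHAR, "]"])  # => true  (two operands)
p arity_ok?([:opt, :a, :b])       # => false (one operand too many)
p arity_ok?([:seq])               # => true  (arity unspecified)
```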

Instance Attribute Summary

#cleanup ⇒ Object

Determines preparation and cleanup rules for reconstituting EBNF ? * + from BNF.
#comp ⇒ Rule

A comprehension is a sequence which contains all elements but the first of the original rule.
#expr ⇒ Array

Rule expression.
#first ⇒ Array<Rule> readonly

Terminals that immediately precede this rule.
#follow ⇒ Array<Rule> readonly

Terminals that immediately follow this rule.
#id ⇒ String

ID of rule.
#kind ⇒ :rule, ...

Kind of rule.
#orig ⇒ String

Original EBNF.
#start ⇒ Boolean

Indicates that this is a starting rule.
#sym ⇒ Symbol

Symbol of rule.

Class Method Summary

.from_sxp(sxp) ⇒ Rule

Return a rule from its SXP representation.

Instance Method Summary

#<=>(other) ⇒ Object

Rules compare using their ids.
#==(other) ⇒ Boolean

Two rules are equal if they have the same #sym, #kind and #expr.
#add_first(terminals) ⇒ Integer

Add terminal as preceding this rule.
#add_follow(terminals) ⇒ Integer

Add terminal as following this rule.
#alt? ⇒ Boolean

Is this rule of the form (alt …)?
#build(expr, kind: nil, cleanup: nil, **options) ⇒ Object

Build a new rule, creating a symbol and numbering from the current rule. Symbol and number creation is handled by the top-most rule in such a chain.
#eql?(other) ⇒ Boolean

Two rules are equivalent if they have the same #expr.
#first_includes_eps? ⇒ Boolean

Do the firsts of this rule include the empty string?
#for_sxp ⇒ Array

Return representation for building S-Expressions.
#initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil) ⇒ Rule constructor

A new instance of Rule.
#inspect ⇒ Object
#non_terminals(ast, expr = @expr) ⇒ Array<Rule>

Return the non-terminals for this rule.
#pass? ⇒ Boolean

Is this a pass?
#rule? ⇒ Boolean

Is this a rule?
#seq? ⇒ Boolean

Is this rule of the form (seq …)?
#starts_with?(sym) ⇒ Array<Symbol, String>

Does this rule start with `sym`? It does if expr is that sym, expr starts with alt and contains that sym, or expr starts with seq and the next element is that sym.
#symbols(expr = @expr) ⇒ Array<Symbol>

Return the symbols used in the rule.
#terminal? ⇒ Boolean

Is this a terminal?
#terminals(ast, expr = @expr) ⇒ Array<Rule>

Return the terminals for this rule.
#to_bnf ⇒ Array<Rule>

Transform EBNF rule to BNF rules.
#to_peg ⇒ Array<Rule>

Transform EBNF rule for PEG.
#to_regexp ⇒ Regexp

For :hex, :istr, or :range, create a regular expression.
#to_ruby ⇒ String

Return a Ruby representation of this rule.
#to_sxp(**options) ⇒ String (also: #to_s)

Return SXP representation of this rule.
#to_ttl ⇒ String

Serializes this rule to Turtle.
#translate_codepoints(str) ⇒ Object

Utility function to translate code points of the form ‘#xN’ into Ruby Unicode characters.
#valid?(ast) ⇒ Boolean

Validate the rule, with respect to an AST.
#validate!(ast, expr = @expr) ⇒ Object

Validate the rule, with respect to an AST.

Constructor Details

#initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil) ⇒ `Rule`

Returns a new instance of Rule.

Parameters:

sym (Symbol, nil) —

`nil` is allowed only for @pass or @terminals
id (String, nil)
expr (Array) —
The expression is an internal representation of an S-Expression with one of the following operators:
- `alt` – A list of alternative rules, which are attempted in order. It terminates with the first matching rule, or is terminated as unmatched if no such rule is found.
- `diff` – Matches any string that matches `A` but does not match `B`.
- `hex` – A single character represented using the hexadecimal notation `#xnn`.
- `istr` – A string which matches in a case-insensitive manner, so that `(istr "fOo")` will match either of the strings `"foo"`, `"FOO"` or any other combination.
- `opt` – An optional rule or terminal. It either results in the matching rule or returns `nil`.
- `plus` – A sequence of one or more of the matching rule. If there is no such rule, it is terminated as unmatched; otherwise, the result is an array containing all matched input.
- `range` – A range of characters, possibly repeated, of the form `(range "a-z")`. May also use hexadecimal notation.
- `rept m n` – A sequence of at least `m` and at most `n` of the matching rule. It will always return an array.
- `seq` – A sequence of rules or terminals. If any (other than `opt` or `star`) do not parse, the rule is terminated as unmatched.
- `star` – A sequence of zero or more of the matching rule. It will always return an array.
kind (:rule, :terminal, :terminals, :pass) (defaults to: nil) —

(nil)
ebnf (String) (defaults to: nil) —

(nil) When parsing, records the EBNF string used to create the rule.
first (Array) (defaults to: nil) —

(nil) Recorded set of terminals that can precede this rule (LL(1))
follow (Array) (defaults to: nil) —

(nil) Recorded set of terminals that can follow this rule (LL(1))
start (Boolean) (defaults to: nil) —

(nil) Is this the starting rule for the grammar?
top_rule (Rule) (defaults to: nil) —

(nil) The top-most rule. All expressed rules are top-rules, derived rules have the original rule as their top-rule.
cleanup (Boolean) (defaults to: nil) —

(nil) Records information useful for cleaning up converted :plus, and :star expansions (LL(1)).

Raises:

(ArgumentError)

# File 'lib/ebnf/rule.rb', line 107

def initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil)
  @sym, @id = sym, id
  @expr = expr.is_a?(Array) ? expr : [:seq, expr].compact
  @ebnf, @kind, @first, @follow, @start, @cleanup, @top_rule = ebnf, kind, first, follow, start, cleanup, top_rule
  @top_rule ||= self
  @kind ||= case
  when sym.to_s == sym.to_s.upcase then :terminal
  when !BNF_OPS.include?(@expr.first) then :terminal
  else :rule
  end

  # Allow @pass and @terminals to not be named
  @sym ||= :_pass if @kind == :pass
  @sym ||= :_terminals if @kind == :terminals

  raise ArgumentError, "Rule sym must be a symbol, was #{@sym.inspect}" unless @sym.is_a?(Symbol)
  raise ArgumentError, "Rule id must be a string or nil, was #{@id.inspect}" unless (@id || "").is_a?(String)
  raise ArgumentError, "Rule kind must be one of :rule, :terminal, :terminals, or :pass, was #{@kind.inspect}" unless
    @kind.is_a?(Symbol) && %w(rule terminal terminals pass).map(&:to_sym).include?(@kind)

  case @expr.first
  when :alt
    raise ArgumentError, "#{@expr.first} operation must have at least one operand, had #{@expr.length - 1}" unless @expr.length > 1
  when :diff
    raise ArgumentError, "#{@expr.first} operation must have exactly two operands, had #{@expr.length - 1}" unless @expr.length == 3
  when :hex, :istr, :not, :opt, :plus, :range, :star
    raise ArgumentError, "#{@expr.first} operation must have exactly one operand, had #{@expr.length - 1}" unless @expr.length == 2
  when :rept
    raise ArgumentError, "#{@expr.first} operation must have exactly three, had #{@expr.length - 1}" unless @expr.length == 4
    raise ArgumentError, "#{@expr.first} operation must an non-negative integer minimum, was #{@expr[1]}" unless
      @expr[1].is_a?(Integer) && @expr[1] >= 0
    raise ArgumentError, "#{@expr.first} operation must an non-negative integer maximum or '*', was #{@expr[2]}" unless
      @expr[2] == '*' || @expr[2].is_a?(Integer) && @expr[2] >= 0
  when :seq
    # It's legal to have a zero-length sequence
  else
    raise ArgumentError, "Rule expression must be an array using a known operator, was #{@expr.first}"
  end
end
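
When `kind` is not given, the constructor above infers it from the symbol's case and from the leading operator. The standalone sketch below mirrors only that inference step on plain values; `SKETCH_BNF_OPS` and `infer_kind` are hypothetical names used for illustration and do not exist in the library.

```ruby
# Operators that are flattened to separate rules (copied from BNF_OPS above).
SKETCH_BNF_OPS = %i[alt diff not opt plus rept seq star].freeze

# Hypothetical helper: an all-upper-case symbol, or an expression led by a
# terminal-only operator, is treated as a terminal; anything else is a rule.
def infer_kind(sym, expr)
  if sym.to_s == sym.to_s.upcase
    :terminal
  elsif !SKETCH_BNF_OPS.include?(expr.first)
    :terminal
  else
    :rule
  end
end

p infer_kind(:ebnf, [:star, [:alt, :declaration, :rule]])  # => :rule
p infer_kind(:R_CHAR, [:diff, :CHAR, "]"])                 # => :terminal
p infer_kind(:hexchar, [:hex, "#x20"])                     # => :terminal
```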

Instance Attribute Details

#cleanup ⇒ `Object`

Determines preparation and cleanup rules for reconstituting EBNF ? * + from BNF



# File 'lib/ebnf/rule.rb', line 76

def cleanup
  @cleanup
end

#comp ⇒ `Rule`

A comprehension is a sequence which contains all elements but the first of the original rule.

Returns:

(Rule)



# File 'lib/ebnf/rule.rb', line 43

def comp
  @comp
end

#expr ⇒ `Array`

Rule expression

Returns:

(Array)



# File 'lib/ebnf/rule.rb', line 53

def expr
  @expr
end

#first ⇒ `Array<Rule>` (readonly)

Terminals that immediately precede this rule

Returns:

(Array<Rule>)



# File 'lib/ebnf/rule.rb', line 63

def first
  @first
end

#follow ⇒ `Array<Rule>` (readonly)

Terminals that immediately follow this rule

Returns:

(Array<Rule>)



# File 'lib/ebnf/rule.rb', line 68

def follow
  @follow
end

#id ⇒ `String`

ID of rule

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 38

def id
  @id
end

#kind ⇒ `:rule`, ...

Kind of rule

Returns:

(:rule, :terminal, :terminals, or :pass)



# File 'lib/ebnf/rule.rb', line 48

def kind
  @kind
end

#orig ⇒ `String`

Original EBNF

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 58

def orig
  @orig
end

#start ⇒ `Boolean`

Indicates that this is a starting rule

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 73

def start
  @start
end

#sym ⇒ `Symbol`

Symbol of rule

Returns:

(Symbol)



# File 'lib/ebnf/rule.rb', line 34

def sym
  @sym
end

Class Method Details

.from_sxp(sxp) ⇒ `Rule`

Return a rule from its SXP representation:

Also may have `(first …)`, `(follow …)`, or `(start #t)`.

Examples:

inputs

(pass _pass (plus (range "#x20\\t\\r\\n")))
(rule ebnf "1" (star (alt declaration rule)))
(terminal R_CHAR "19" (diff CHAR (alt "]" "-")))

Parameters:

sxp (String, Array)

Returns:

(Rule)

# File 'lib/ebnf/rule.rb', line 159

def self.from_sxp(sxp)
  if sxp.is_a?(String)
    require 'sxp' unless defined?(SXP)
    sxp = SXP.parse(sxp)
  end
  expr = sxp.detect {|e| e.is_a?(Array) && ![:first, :follow, :start].include?(e.first.to_sym)}
  first = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :first}
  first = first[1..-1] if first
  follow = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :follow}
  follow = follow[1..-1] if follow
  cleanup = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :cleanup}
  cleanup = cleanup[1..-1] if cleanup
  start = sxp.any? {|e| e.is_a?(Array) && e.first.to_sym == :start}
  sym = sxp[1] if sxp[1].is_a?(Symbol)
  id = sxp[2] if sxp[2].is_a?(String)
  self.new(sym, id, expr, kind: sxp.first, first: first, follow: follow, cleanup: cleanup, start: start)
end
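
The method above picks the rule's parts out of an already parsed S-Expression. The sketch below shows the same picking logic on a hand-written array, so it runs without the `sxp` gem. The array is an assumption about what parsed input might look like, and `SKETCH_PARSED` and `split_parsed_rule` are hypothetical names.

```ruby
# Hand-written stand-in for parser output (an assumption, not real SXP output).
SKETCH_PARSED = [
  :rule, :ebnf, "1",
  [:start, true],
  [:first, "a"],
  [:star, [:alt, :declaration, :rule]]
].freeze

# Hypothetical helper: separate the metadata lists from the rule expression.
def split_parsed_rule(sxp)
  lists  = sxp.select { |e| e.is_a?(Array) }
  first  = lists.find { |e| e.first == :first }
  follow = lists.find { |e| e.first == :follow }
  {
    kind:   sxp[0],
    sym:    (sxp[1] if sxp[1].is_a?(Symbol)),
    id:     (sxp[2] if sxp[2].is_a?(String)),
    expr:   lists.find { |e| !%i[first follow start].include?(e.first) },
    first:  first && first[1..-1],
    follow: follow && follow[1..-1],
    start:  lists.any? { |e| e.first == :start }
  }
end

p split_parsed_rule(SKETCH_PARSED)
```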

Instance Method Details

#<=>(other) ⇒ `Object`

Rules compare using their ids, or their symbols when either id is missing.

# File 'lib/ebnf/rule.rb', line 435

def <=>(other)
  if id && other.id
    if id == other.id
      id.to_s <=> other.id.to_s
    else
      id.to_f <=> other.id.to_f
    end
  else
    sym.to_s <=> other.sym.to_s
  end
end
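
Because differing ids are compared through `to_f`, ids sort numerically rather than as strings. The sketch below reproduces just that comparison on plain strings; `sketch_compare_ids` is a hypothetical name for this example.

```ruby
# Hypothetical helper mirroring the id branch of #<=>:
# equal ids compare as 0, otherwise their float values are compared.
def sketch_compare_ids(a, b)
  a == b ? 0 : a.to_f <=> b.to_f
end

p %w[10 2 1].sort                                       # => ["1", "10", "2"]
p %w[10 2 1].sort { |a, b| sketch_compare_ids(a, b) }   # => ["1", "2", "10"]
p sketch_compare_ids("1.10", "1.9")                     # => -1
```

One consequence worth noting: a dotted id such as "1.10" is read as the float 1.1, so it orders before "1.9".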

#==(other) ⇒ `Boolean`

Two rules are equal if they have the same #sym, #kind and #expr.

Parameters:

other (Rule)

Returns:

(Boolean)

# File 'lib/ebnf/rule.rb', line 419

def ==(other)
  other.is_a?(Rule) &&
  sym   == other.sym &&
  kind  == other.kind &&
  expr  == other.expr
end

#add_first(terminals) ⇒ `Integer`

Add terminal as preceding this rule.

Parameters:

terminals (Array<Rule, Symbol, String>)

Returns:

(Integer) —

the number of terminals added

# File 'lib/ebnf/rule.rb', line 658

def add_first(terminals)
  @first ||= []
  terminals = terminals.map {|t| t.is_a?(Rule) ? t.sym : t} - @first
  @first += terminals
  terminals.length
end

#add_follow(terminals) ⇒ `Integer`

Add terminal as following this rule. Does not add `_eps` as a follow.

Parameters:

terminals (Array<Rule, Symbol, String>)

Returns:

(Integer) —

the number of terminals added

# File 'lib/ebnf/rule.rb', line 669

def add_follow(terminals)
  # Remove terminals already in follows, and empty string
  terminals = terminals.map {|t| t.is_a?(Rule) ? t.sym : t} - (@follow || []) - [:_eps]
  unless terminals.empty?
    @follow ||= []
    @follow += terminals
  end
  terminals.length
end
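
Both `#add_first` and `#add_follow` rely on array difference to add only terminals that are not yet recorded, and return how many were added. A minimal standalone sketch of the follow-set variant, using a hypothetical `sketch_add_follow` helper on plain arrays:

```ruby
# Hypothetical helper mirroring the set arithmetic in #add_follow:
# returns the updated follow set and the number of terminals added.
def sketch_add_follow(follow, terminals)
  added = terminals - follow - [:_eps]  # drop known terminals and the empty string
  [follow + added, added.length]
end

updated, count = sketch_add_follow([:A], [:A, :B, :_eps])
p updated  # => [:A, :B]
p count    # => 1
```

Returning the count lets a caller repeat the computation until nothing new is added.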

#alt? ⇒ `Boolean`

Is this rule of the form (alt …)?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 400

def alt?
  expr.is_a?(Array) && expr.first == :alt
end

#build(expr, kind: nil, cleanup: nil, **options) ⇒ `Object`

Build a new rule, creating a symbol and numbering from the current rule. Symbol and number creation is handled by the top-most rule in such a chain.

Parameters:

expr (Array)
kind (Symbol) (defaults to: nil) —

(nil)
cleanup (Hash{Symbol => Symbol}) (defaults to: nil) —

(nil)
options (Hash{Symbol => Object})

# File 'lib/ebnf/rule.rb', line 184

def build(expr, kind: nil, cleanup: nil, **options)
  new_sym, new_id = @top_rule.send(:make_sym_id)
  self.class.new(new_sym, new_id, expr,
                 kind: kind,
                 ebnf: @ebnf,
                 top_rule: @top_rule,
                 cleanup: cleanup,
                 **options)
end

#eql?(other) ⇒ `Boolean`

Two rules are equivalent if they have the same #expr.

Parameters:

other (Rule)

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 430

def eql?(other)
  expr == other.expr
end

#first_includes_eps? ⇒ `Boolean`

Do the firsts of this rule include the empty string?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 650

def first_includes_eps?
  @first && @first.include?(:_eps)
end

#for_sxp ⇒ `Array`

Return representation for building S-Expressions.

Returns:

(Array)

# File 'lib/ebnf/rule.rb', line 197

def for_sxp
  elements = [kind, sym]
  elements << id if id
  elements << [:start, true] if start
  elements << first.sort_by(&:to_s).unshift(:first) if first
  elements << follow.sort_by(&:to_s).unshift(:follow) if follow
  elements << [:cleanup, cleanup] if cleanup
  elements << expr
  elements
end

#inspect ⇒ `Object`

# File 'lib/ebnf/rule.rb', line 409

def inspect
  "#<EBNF::Rule:#{object_id} " +
  {sym: sym, id: id, kind: kind, expr: expr}.inspect +
  ">"
end

#non_terminals(ast, expr = @expr) ⇒ `Array<Rule>`

Note:

This is used for LL(1) transformation, so rule types are limited.

Return the non-terminals for this rule.

`alt` => this is every non-terminal.
`diff` => this is every non-terminal.
`hex` => nil
`istr` => nil
`not` => this is the last expression, if any.
`opt` => this is the last expression, if any.
`plus` => this is the last expression, if any.
`range` => nil
`rept` => this is the last expression, if any.
`seq` => this is the first expression in the sequence, if any.
`star` => this is the last expression, if any.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 474

def non_terminals(ast, expr = @expr)
  ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).map do |sym|
    case sym
    when Symbol
      r = ast.detect {|r| r.sym == sym}
      r if r && r.rule?
    when Array
      non_terminals(ast, sym)
    else
      nil
    end
  end.flatten.compact.uniq
end
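
The code above inspects every operand of `alt` and `diff`, but only the first operand of any other operator. The sketch below reproduces that selection on plain data. `SketchRule`, `SKETCH_AST`, `operands_to_inspect` and `sketch_non_terminals` are hypothetical stand-ins, not library names.

```ruby
# Hypothetical stand-in for a rule: just a symbol and a kind.
SketchRule = Struct.new(:sym, :kind)

SKETCH_AST = [
  SketchRule.new(:a, :rule),
  SketchRule.new(:b, :rule),
  SketchRule.new(:T, :terminal)
].freeze

# All operands for alt/diff, only the first operand otherwise.
def operands_to_inspect(expr)
  %i[alt diff].include?(expr.first) ? expr[1..-1] : expr[1, 1]
end

# Resolve inspected symbols to rules of kind :rule, recursing into sub-expressions.
def sketch_non_terminals(ast, expr)
  operands_to_inspect(expr).map do |e|
    case e
    when Symbol
      r = ast.find { |rule| rule.sym == e }
      r if r && r.kind == :rule
    when Array
      sketch_non_terminals(ast, e)
    end
  end.flatten.compact.uniq
end

p sketch_non_terminals(SKETCH_AST, [:alt, :a, :T, [:seq, :b, :a]]).map(&:sym)  # => [:a, :b]
p sketch_non_terminals(SKETCH_AST, [:seq, :T, :a]).map(&:sym)                  # => []
```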

#pass? ⇒ `Boolean`

Is this a pass?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 389

def pass?
  kind == :pass
end

#rule? ⇒ `Boolean`

Is this a rule?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 395

def rule?
  kind == :rule
end

#seq? ⇒ `Boolean`

Is this rule of the form (seq …)?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 405

def seq?
  expr.is_a?(Array) && expr.first == :seq
end

#starts_with?(sym) ⇒ `Array<Symbol, String>`

Does this rule start with `sym`? It does if expr is that sym, expr starts with alt and contains that sym, or expr starts with seq and the next element is that sym.

Parameters:

sym (Symbol, class) —

Symbol matching any start element, or if it is String, any start element which is a String

Returns:

(Array<Symbol, String>) —

a list of the symbols or strings that are start symbols, or nil if there are none

# File 'lib/ebnf/rule.rb', line 551

def starts_with?(sym)
  if seq? && sym === (v = expr.fetch(1, nil))
    [v]
  elsif alt? && expr.any? {|e| sym === e}
    expr.select {|e| sym === e}
  else
    nil
  end
end
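
Since the test uses `===`, the argument can be either a specific symbol or a class such as `String`. The sketch below applies the same matching to bare expression arrays; `sketch_starts_with` is a hypothetical helper that takes the expression explicitly instead of reading it from a rule.

```ruby
# Hypothetical helper mirroring #starts_with? on a bare expression array.
def sketch_starts_with(expr, matcher)
  if expr.first == :seq && matcher === (v = expr[1])
    [v]
  elsif expr.first == :alt && expr.any? { |e| matcher === e }
    expr.select { |e| matcher === e }
  end
end

p sketch_starts_with([:seq, :a, :b], :a)           # => [:a]
p sketch_starts_with([:seq, :b, :a], :a)           # => nil
p sketch_starts_with([:alt, :a, "x", "y"], String) # => ["x", "y"]
```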

#symbols(expr = @expr) ⇒ `Array<Symbol>`

Return the symbols used in the rule.

Parameters:

expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Symbol>)

# File 'lib/ebnf/rule.rb', line 529

def symbols(expr = @expr)
  expr[1..-1].map do |sym|
    case sym
    when Symbol
      sym
    when Array
      symbols(sym)
    end
  end.flatten.compact.uniq
end
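
As a small worked example of the recursion above, the sketch below collects symbols from a nested expression given as a plain array. `sketch_symbols` is a hypothetical name, and the sample expression is made up for illustration.

```ruby
# Hypothetical helper mirroring #symbols: skip the operator, keep symbols,
# recurse into nested expressions, and ignore strings.
def sketch_symbols(expr)
  expr[1..-1].map do |e|
    case e
    when Symbol then e
    when Array  then sketch_symbols(e)
    end
  end.flatten.compact.uniq
end

p sketch_symbols([:seq, :a, [:alt, :b, "c", :a], [:star, :d]])  # => [:a, :b, :d]
```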

#terminal? ⇒ `Boolean`

Is this a terminal?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 383

def terminal?
  kind == :terminal
end

#terminals(ast, expr = @expr) ⇒ `Array<Rule>`

Note:

This is used for LL(1) transformation, so rule types are limited.

Return the terminals for this rule.

`alt` => this is every terminal.
`diff` => this is every terminal.
`hex` => nil
`istr` => nil
`not` => this is the last expression, if any.
`opt` => this is the last expression, if any.
`plus` => this is the last expression, if any.
`range` => nil
`rept` => this is the last expression, if any.
`seq` => this is the first expression in the sequence, if any.
`star` => this is the last expression, if any.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 509

def terminals(ast, expr = @expr)
  ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).map do |sym|
    case sym
    when Symbol
      r = ast.detect {|r| r.sym == sym}
      r if r && r.terminal?
    when String
      sym
    when Array
      terminals(ast, sym)
    end
  end.flatten.compact.uniq
end

#to_bnf ⇒ `Array<Rule>`

Transform EBNF rule to BNF rules:

* Transform `(rule a "n" (op1 (op2)))` into two rules:

      (rule a "n" (op1 _a_1))
      (rule _a_1 "n.1" (op2))
* Transform `(rule a (opt b))` into `(rule a (alt _empty b))`
* Transform `(rule a (star b))` into `(rule a (alt _empty (seq b a)))`
* Transform `(rule a (plus b))` into `(rule a (seq b (star b)))`

Transformation includes information used to reconstruct the non-transformed AST representation.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 259

def to_bnf
  return [self] unless rule?
  new_rules = []

  # Look for rules containing recursive definition and rewrite to multiple rules. If `expr` contains elements which are in array form, where the first element of that array is a symbol, create a new rule for it.
  if expr.any? {|e| e.is_a?(Array) && (BNF_OPS + TERM_OPS).include?(e.first)}
    #   * Transform (a [n] rule (op1 (op2))) into two rules:
    #     (a.1 [n.1] rule (op1 a.2))
    #     (a.2 [n.2] rule (op2))
    # duplicate ourselves for rewriting
    this = dup
    new_rules << this

    expr.each_with_index do |e, index|
      next unless e.is_a?(Array) && e.first.is_a?(Symbol)
      new_rule = build(e)
      this.expr[index] = new_rule.sym
      new_rules << new_rule
    end

    # Return new rules after recursively applying #to_bnf
    new_rules = new_rules.map {|r| r.to_bnf}.flatten
  elsif expr.first == :opt
    this = dup
    #   * Transform (rule a (opt b)) into (rule a (alt _empty b))
    this.expr = [:alt, :_empty, expr.last]
    this.cleanup = :opt
    new_rules = this.to_bnf
  elsif expr.first == :star
    #   * Transform (rule a (star b)) into (rule a (alt _empty (seq b a)))
    this = dup
    this.cleanup = :star
    new_rule = this.build([:seq, expr.last, this.sym], cleanup: :merge)
    this.expr = [:alt, :_empty, new_rule.sym]
    new_rules = [this] + new_rule.to_bnf
  elsif expr.first == :plus
    #   * Transform (rule a (plus b)) into (rule a (seq b (star b)))
    this = dup
    this.cleanup = :plus
    this.expr = [:seq, expr.last, [:star, expr.last]]
    new_rules = this.to_bnf
  elsif [:alt, :seq].include?(expr.first)
    # Otherwise, no further transformation necessary
    new_rules << self
  elsif [:diff, :hex, :range].include?(expr.first)
    # These rules are fine, they just need to be terminals
    raise "Encountered #{expr.first.inspect}, which is a #{self.kind}, not :terminal" unless self.terminal?
    new_rules << self
  else
    # Some case we didn't think of
    raise "Error trying to transform #{expr.inspect} to BNF"
  end
  
  return new_rules
end
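
The `opt`, `star` and `plus` rewrites can be followed on plain arrays. The sketch below performs a single rewriting step and returns a symbol-to-expression hash. It is a simplification: the method above also recurses and records cleanup information. The helper name `sketch_desugar` and the generated symbol `_a_1` (borrowed from the naming in the description) are assumptions.

```ruby
# Hypothetical one-step rewrite of opt/star/plus into alt/seq form.
def sketch_desugar(sym, expr)
  op, operand = expr
  case op
  when :opt
    { sym => [:alt, :_empty, operand] }
  when :star
    helper = :"_#{sym}_1"  # assumed name for the generated rule
    { sym => [:alt, :_empty, helper], helper => [:seq, operand, sym] }
  when :plus
    { sym => [:seq, operand, [:star, operand]] }
  else
    { sym => expr }
  end
end

p sketch_desugar(:a, [:opt, :b])   # => {:a=>[:alt, :_empty, :b]} (formatting varies by Ruby version)
p sketch_desugar(:a, [:star, :b])
p sketch_desugar(:a, [:plus, :b])
```

In the `star` case the generated rule refers back to `a`, which is how the repetition is expressed recursively.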

#to_peg ⇒ `Array<Rule>`

Transform EBNF rule for PEG:

* Transform `(rule a "n" (op1 ... (op2 y) ...z))` into two rules:

      (rule a "n" (op1 ... _a_1 ... z))
      (rule _a_1 "n.1" (op2 y))
* Transform `(rule a "n" (diff op1 op2))` into two rules:

      (rule a "n" (seq _a_1 op1))
      (rule _a_1 "n.1" (not op2))

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 328

def to_peg
  new_rules = []

  # Look for rules containing sub-sequences
  if expr.any? {|e| e.is_a?(Array) && e.first.is_a?(Symbol)}
    # duplicate ourselves for rewriting
    this = dup
    new_rules << this

    expr.each_with_index do |e, index|
      next unless e.is_a?(Array) && e.first.is_a?(Symbol)
      new_rule = build(e)
      this.expr[index] = new_rule.sym
      new_rules << new_rule
    end

    # Return new rules after recursively applying #to_peg
    new_rules = new_rules.map {|r| r.to_peg}.flatten
  elsif expr.first == :diff && !terminal?
    this = dup
    new_rule = build([:not, expr[2]])
    this.expr = [:seq, new_rule.sym, expr[1]]
    new_rules << this
    new_rules << new_rule
  elsif [:hex, :istr, :range].include?(expr.first)
    # These rules are fine, they just need to be terminals
    raise "Encountered #{expr.first.inspect}, which is a #{self.kind}, not :terminal" unless self.terminal?
    new_rules << self
  else
    new_rules << self
  end
  
  return new_rules.map {|r| r.extend(EBNF::PEG::Rule)}
end
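
The `diff` rewrite in the code above turns "A but not B" into "not B, then A". The sketch below shows that single step on plain arrays, followed by a regular-expression analogue using negative lookahead. `sketch_peg_diff`, the generated symbol name and `NOT_BRACKET_OR_DASH` are hypothetical, and the regexp is only an analogy rather than what the library generates.

```ruby
# Hypothetical one-step rewrite of (diff a b) into (seq helper a) plus (not b).
def sketch_peg_diff(sym, expr)
  _op, a, b = expr
  helper = :"_#{sym}_1"  # assumed name for the generated rule
  { sym => [:seq, helper, a], helper => [:not, b] }
end

p sketch_peg_diff(:R_CHAR, [:diff, :CHAR, [:alt, "]", "-"]])

# Analogy only: "any character, provided it is not ']' or '-'".
NOT_BRACKET_OR_DASH = /\A(?![\]\-])./

p NOT_BRACKET_OR_DASH.match?("a")  # => true
p NOT_BRACKET_OR_DASH.match?("]")  # => false
```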

#to_regexp ⇒ `Regexp`

For :hex, :istr, or :range, create a regular expression.

Returns:

(Regexp)

# File 'lib/ebnf/rule.rb', line 367

def to_regexp
  case expr.first
  when :hex
    Regexp.new(translate_codepoints(expr[1]))
  when :istr
    /#{expr.last}/ui
  when :range
    Regexp.new("[#{translate_codepoints(expr[1])}]")
  else
    raise "Can't turn #{expr.inspect} into a regexp"
  end
end
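
To see what the resulting expressions match, the sketch below builds regexps for the `istr` and `range` cases from literal operands that contain no `#xN` code points. `sketch_regexp` is a hypothetical helper that takes the expression array directly.

```ruby
# Hypothetical helper covering the :istr and :range branches for literal operands.
def sketch_regexp(expr)
  case expr.first
  when :istr  then /#{expr.last}/ui            # case-insensitive match
  when :range then Regexp.new("[#{expr.last}]") # character class
  else raise ArgumentError, "unsupported operator #{expr.first.inspect}"
  end
end

p sketch_regexp([:istr, "foo"]).match?("FoO")   # => true
p sketch_regexp([:range, "a-z"]).match?("q")    # => true
p sketch_regexp([:range, "a-z"]).match?("Q")    # => false
```

Note that the operand is interpolated as-is, so any regexp metacharacters in it keep their special meaning.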

#to_ruby ⇒ `String`

Return a Ruby representation of this rule

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 240

def to_ruby
  "EBNF::Rule.new(#{sym.inspect}, #{id.inspect}, #{expr.inspect}#{', kind: ' + kind.inspect unless kind == :rule})"
end

#to_sxp(**options) ⇒ `String` Also known as: to_s

Return SXP representation of this rule

Returns:

(String)

# File 'lib/ebnf/rule.rb', line 211

def to_sxp(**options)
  require 'sxp' unless defined?(SXP)
  for_sxp.to_sxp(**options)
end

#to_ttl ⇒ `String`

Serializes this rule to Turtle.

Returns:

(String)

# File 'lib/ebnf/rule.rb', line 221

def to_ttl
  @ebnf.debug("to_ttl") {inspect} if @ebnf
  statements = [%{:#{sym} rdfs:label "#{sym}";}]
  if orig
    comment = orig.to_s.strip.
      gsub(/"""/, '\"\"\"').
      gsub("\\", "\\\\").
      sub(/^\"/, '\"').
      sub(/\"$/m, '\"')
    statements << %{  rdfs:comment #{comment.inspect};}
  end
  statements << %{  dc:identifier "#{id}";} if id
  
  statements += ttl_expr(expr, terminal? ? "re" : "g", 1, false)
  "\n" + statements.join("\n")
end

#translate_codepoints(str) ⇒ `Object`

Utility function to translate code points of the form ‘#xN’ into Ruby Unicode characters



# File 'lib/ebnf/rule.rb', line 449

def translate_codepoints(str)
  str.gsub(/#x\h+/) {|c| c[2..-1].scanf("%x").first.chr(Encoding::UTF_8)}
end
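
The same translation can be sketched without `scanf` by using `String#hex`, which is a deliberate substitution here so that the example depends only on core Ruby. `sketch_translate` and `sketch_range_regexp` are hypothetical names.

```ruby
# Hypothetical variant of the translation above, using String#hex
# in place of scanf to decode each #xN code point.
def sketch_translate(str)
  str.gsub(/#x\h+/) { |c| c[2..-1].hex.chr(Encoding::UTF_8) }
end

# Build a character class from a range written with code points.
def sketch_range_regexp(range)
  Regexp.new("[#{sketch_translate(range)}]")
end

p sketch_translate("#x41#x42")                   # => "AB"
p sketch_range_regexp("#x30-#x39").match?("5")   # => true
p sketch_range_regexp("#x30-#x39").match?("a")   # => false
```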

#valid?(ast) ⇒ `Boolean`

Validate the rule, with respect to an AST.

Uses `#validate!` and catches `SyntaxError`.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules

Returns:

(Boolean)

# File 'lib/ebnf/rule.rb', line 640

def valid?(ast)
  validate!(ast)
  true
rescue SyntaxError
  false
end

#validate!(ast, expr = @expr) ⇒ `Object`

Validate the rule, with respect to an AST.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Raises:

(SyntaxError)

# File 'lib/ebnf/rule.rb', line 570

def validate!(ast, expr = @expr)
  op = expr.first
  raise SyntaxError, "Unknown operator: #{op}" unless OP_ARGN.key?(op)
  raise SyntaxError, "Argument count missmatch on operator #{op}, had #{expr.length - 1} expected #{OP_ARGN[op]}" if
    OP_ARGN[op] && OP_ARGN[op] != expr.length - 1

  # rept operator needs min and max
  if op == :alt
    raise SyntaxError, "alt operation must have at least one operand, had #{expr.length - 1}" unless expr.length > 1
  elsif op == :rept
    raise SyntaxError, "rept operation must an non-negative integer minimum, was #{expr[1]}" unless
      expr[1].is_a?(Integer) && expr[1] >= 0
    raise SyntaxError, "rept operation must an non-negative integer maximum or '*', was #{expr[2]}" unless
      expr[2] == '*' || expr[2].is_a?(Integer) && expr[2] >= 0
  end

  case op
  when :hex
    raise SyntaxError, "Hex operand must be of form '#xN+': #{sym}" unless expr.last.match?(/^#x\h+$/)
  when :range
    str = expr.last.dup
    str = str[1..-1] if str.start_with?('^')
    str = str[0..-2] if str.end_with?('-')  # Allowed at end of range
    scanner = StringScanner.new(str)
    hex = rchar = in_range = false
    while !scanner.eos?
      begin
        if scanner.scan(Terminals::HEX)
          raise SyntaxError if in_range && rchar
          rchar = in_range = false
          hex = true
        elsif scanner.scan(Terminals::R_CHAR)
          raise SyntaxError if in_range && hex
          hex = in_range = false
          rchar = true
        else
          raise(SyntaxError, "Range contains illegal components at offset #{scanner.pos}: was #{expr.last}")
        end

        if scanner.scan(/\-/)
          raise SyntaxError if in_range
          in_range = true
        end
      rescue SyntaxError
        raise(SyntaxError, "Range contains illegal components at offset #{scanner.pos}: was #{expr.last}")
      end
    end
  else
    ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).each do |sym|
      case sym
      when Symbol
        r = ast.detect {|r| r.sym == sym}
        raise SyntaxError, "No rule found for #{sym}" unless r
      when Array
        validate!(ast, sym)
      when String
        raise SyntaxError, "String must be of the form CHAR*" unless sym.match?(/^#{Terminals::CHAR}*$/)
      end
    end
  end
end
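
The pairing of a raising validator with a boolean wrapper, as in `#validate!` and `#valid?`, can be illustrated with the `hex` check alone. This is a standalone sketch; `sketch_check_hex!` and `sketch_hex_ok?` are hypothetical names that do not exist in the library.

```ruby
# Hypothetical raising check, using the same pattern as the :hex branch above.
def sketch_check_hex!(operand)
  unless operand.match?(/^#x\h+$/)
    raise SyntaxError, "Hex operand must be of form '#xN+': #{operand}"
  end
  operand
end

# Hypothetical boolean wrapper: true if the check passes, false if it raises.
def sketch_hex_ok?(operand)
  sketch_check_hex!(operand)
  true
rescue SyntaxError
  false
end

p sketch_hex_ok?("#x20")  # => true
p sketch_hex_ok?("x20")   # => false
```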
