Summary
Babel Bridge let’s you generate parsers 100% in Ruby code. It is a memoizing Parsing Expression Grammar (PEG) generator like Treetop, but it doesn’t require special file-types or new syntax. Overall focus is on simplicity and usability over performance.
Example
require “babel_bridge” class MyParser < BabelBridge::Parser
rule :foo, "foo", :bar? # match "foo" optionally followed by the :bar
rule :bar, "bar" # match "bar"
end
MyParser.new.parse(“foo”) # matches “foo” MyParser.new.parse(“foobar”) # matches “foobar”
Babel Bridge is a parser-generator for Parsing Expression Grammars
Goals
Allow expression 100% in ruby
Productivity through Simplicity and Understandability first
Performance second
Features
rule=MyParser[:foo] # returns the BabelBridge::Rule instance for that rule
rule.to_s
nice human-readable view of the rule with extra info
rule.inspect
returns the code necessary for generating the rule and all its variants
(minus any class_eval code)
MyParser.node_class(rule)
returns the Node class for a rule
MyParser.node_class(rule) do
# class_eval inside the rule's Node-class
end
MyParser.new.parse(text)
# parses Text starting with the MyParser.root_rule
# The root_rule is defined automatically by the first rule defined, but can be set by:
# MyParser.root_rule=v # where v is the symbol name of the rule or the actual rule object from MyParser[rule]
MyParser.new.parse(text,offset,rule) # only has to match the rule - it's ok if there is input left
parser.parse uses the root_rule
detailed parser_failure_info report
Defining Rules
Inside the parser class, a rule is defined as follows:
class MyParser < BabelBridge::Parser
rule :rule_name, pattern
end
Where:
:rule_name is a symbol
pattern see Patterns below
You can also add new rules outside the class definition by:
MyParser.rule :rule_name, pattern
Patterns
Patterns are an Array of pattern elements, matched in order:
Ex (both are equivelent):
rule :my_rule, "match", "this", "in", "order" # matches "matchthisinorder"
rule :my_rule, ["match", "this", "in", "order"] # matches "matchthisinorder"
Pattern Elements
Pattern elements are basic-pattern-element or extended-pattern-element ( expressed as a hash). Internally, they are “compiled” into instances of PatternElement with optimized lambda functions for parsing.
basic-pattern-element:
:my_rule matches the Rule named :my_rule
:my_rule? optional: optionally matches Rule :my_rule
:my_rule! negative: success only if it DOESN'T match Rule :my_rule
"string" matches the string exactly
/regex/ matches the regex exactly
true always matches the empty string (useful as a no-op if you don't want to change the length of your pattern)
extended-pattern-element:
A Hash with :match or :parser set and zero or more additional options:
:match => basic_element
provide one of the basic elements above
NOTE: Optional and Negative options are preserved, but they are overridden by any such directives in the Hash-Element
:parser => lambda {|parent_node| ... }
Custom lambda function for parsing the input.
Return "nil" if could not find a parse, otherwise return a new Node, typically the TerminalNode
Make sure the returned node.next value is the index where you wish parsing to resume
:as => :my_name
Assign a name to an element for later programatic reference:
rule_variant_node_class_instance.my_name
:optionally => true
PEG equivelent: term?
turn this into an optional-match element
optional elements cannot be negative
:dont => true
PEG equivalent: !term
turn this into a Negative-match element
negative elements cannot be optional
:could => true
PEG equivalent: &term
:many => PatternElement
PEG equivalent: term+ (for "term*", use optionally + many)
accept 1 or more reptitions of this element delimited by PatternElement
NOTE: PatternElement can be "true" for no delimiter (since "true" matches the empty string)
:delimiter => PatternElement
pattern to match between the :many patterns
:post_delimiter => true # use the :delimiter PatternElement for final match
:post_delimiter => PatternElement # use custom post_delimiter PatternElement for final match
if true, then poly will match a delimiter after the last poly-match
Structure
Each Rule defines a subclass of Node
Each RuleVariant defines a subclass of the parent Rule's node-class
Therefor you can easily define code to be shared across all variants as well
as define code specific to one variant.