Module: EBNF

Defined in:


Extended Bakus-Nour Form (EBNF), being the W3C variation is originaly defined in the [W3C XML 1.0 Spec](

This version attempts to be less strict than the strict definition to allow for coloquial variations (such as in the Turtle syntax).

A rule takes the following form:

\[1\]  symbol ::= expression

Comments include the content between '/*' and '*/'

Based on bnf2turtle by Dan Connolly.


Many specifications include grammars that look formal but are not actually checked, by machine, against test data sets. Debugging the grammar in the XML specification has been a long, tedious manual process. Only when the loop is closed between a fully formal grammar and a large test data set can we be confident that we have an accurate specification of a language (and even then, only the syntax of the language).

The grammar in the [N3 design note][] has evolved based on the original manual transcription into a python recursive-descent parser and subsequent development of test cases. Rather than maintain the grammar and the parser independently, our [goal] is to formalize the language syntax sufficiently to replace the manual implementation with one derived mechanically from the specification.

[N3 design note]:

Related Work

Sean Palmer's [n3p announcement][] demonstrated the feasibility of the approach, though that work did not cover some aspects of N3.

In development of the [SPARQL specification][], Eric Prud'hommeaux developed [Yacker][], which converts EBNF syntax to perl and C and C++ yacc grammars. It includes an interactive facility for checking strings against the resulting grammars. Yosi Scharf used it in [cwm Release 1.1.0rc1][], which includes a SPAQRL parser that is almost completely mechanically generated.

The N3/turtle output from yacker is lower level than the EBNF notation from the XML specification; it has the ?, +, and * operators compiled down to pure context-free rules, obscuring the grammar structure. Since that transformation is straightforwardly expressed in semantic web rules (see [bnf-rules.n3][]), it seems best to keep the RDF expression of the grammar in terms of the higher level EBNF constructs.

[goal]:[email protected];list=public-cwm-bugs [n3p announcement]: [Yacker]: [SPARQL specification]: [Cwm Release 1.1.0rc1]: [bnf-rules.n3]:

Open Issues and Future Work

The yacker output also has the terminals compiled to elaborate regular expressions. The best strategy for dealing with lexical tokens is not yet clear. Many tokens in SPARQL are case insensitive; this is not yet captured formally.

The schema for the EBNF vocabulary used here (“g:seq“, “g:alt“, …) is not yet published; it should be aligned with [swap/grammar/bnf][] and the [bnf2html.n3][] rules (and/or the style of linked XHTML grammar in the SPARQL and XML specificiations).

It would be interesting to corroborate the claim in the SPARQL spec that the grammar is LL(1) with a mechanical proof based on N3 rules.

[swap/grammar/bnf]: [bnf2html.n3]:


The [N3 Primer] by Tim Berners-Lee introduces RDF and the Semantic web using N3, a teaching and scribbling language. Turtle is a subset of N3 that maps directly to (and from) the standard XML syntax for RDF.

[N3 Primer]:

Defined Under Namespace

Modules: BNF, LL1, Parser, VERSION Classes: Base, Rule

Class Method Summary collapse

Class Method Details

.parse(input, options = {}) ⇒ EBNF::Base

Parse the given EBNF `query` input.


ebnf = EBNF.parse(input)


  • input (#read, String, #to_s)
  • options (Hash{Symbol => Object}) (defaults to: {})



  • (Exception)

    on invalid input

# File 'lib/ebnf.rb', line 19

def self.parse(input, options = {})
  query =, options)