D★Parse
D★Parse is a parser combinator library for Ruby.
STATUS: Experimental. Pre-alpha. Use at your own risk.
Example
Here is a parser for a comma-separated series of numbers:
require 'd-parse'

module Grammar
  extend DParse::DSL

  DIGIT = char_in('0'..'9')

  ROOT =
    seq(
      intersperse(
        repeat(DIGIT).capture.map { |d| d.to_i(10) },
        char(',').ignore,
      ).compact,
      eof,
    ).first
end

res = Grammar::ROOT.apply("1,2,100,582048,07,09")
case res
when DParse::Success
  p res.data
when DParse::Failure
  $stderr.puts res.
  exit 1
end
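To make the roles of ignore, compact and first in this grammar easier to follow, here is a plain-Ruby walk-through of the data shapes involved. It does not use D★Parse at all. The helper data_stages is hypothetical, and it assumes that intersperse leaves a nil wherever an ignored comma was matched and that seq yields a two-element array whose second element (from eof) is nil.

```ruby
# Hypothetical walk-through of the data shapes; this is NOT output captured
# from D★Parse, only a model of what each step is meant to do.
def data_stages(input)
  # repeat(DIGIT).capture.map { ... }: each run of digits becomes an Integer.
  numbers = input.split(',').map { |digits| digits.to_i(10) }
  # intersperse with char(',').ignore: assumed to leave nil where a comma was.
  interspersed = numbers.flat_map { |n| [n, nil] }[0...-1]
  # .compact drops those nils.
  compacted = interspersed.compact
  # seq(list, eof): assumed to yield [list, nil]; .first keeps only the list.
  sequence = [compacted, nil]
  { interspersed: interspersed, compacted: compacted, first: sequence.first }
end

data_stages("1,2,100").each { |stage, data| puts "#{stage}: #{data.inspect}" }
# interspersed: [1, nil, 2, nil, 100]
# compacted: [1, 2, 100]
# first: [1, 2, 100]
```

Under these assumptions, the input from the example above would come out as [1, 2, 100, 582048, 7, 9].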
Parsers
- alt(p1, p2, …): attempts to apply any of the given parsers.
- any: parses any character.
- char_in(cs): parses a character that is in the cs collection.
- char_not_in(cs): parses a character that is not in the cs collection.
- char_not(c): parses a character that is not c.
- char(c): parses the single character c.
- eof: parses the end of file.
- fail: always fails.
- describe(p, name): sets the name of the parser, so that parsing failures of p return a failure with message “expected name”.
- repeat(p): tries to apply p as many times as possible, and never fails.
- seq(p1, p2, …): tries to apply the given parsers in sequence.
- succeed: always succeeds, without advancing the position.
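For readers new to parser combinators, the following toy model shows how a few of the parsers listed above could fit together. It is a sketch in plain Ruby, not D★Parse's implementation: the MiniParse module, its Success and Failure structs, the failure messages and the choice to make alt pick the first alternative that succeeds are all assumptions made for illustration.

```ruby
# A toy model of parser combinators. Every name in MiniParse is hypothetical.
# Each parser is a lambda: (input, pos) -> Success or Failure.
module MiniParse
  Success = Struct.new(:data, :pos)
  Failure = Struct.new(:message, :pos)

  module_function

  # Parses the single character c.
  def char(c)
    lambda do |input, pos|
      input[pos] == c ? Success.new(c, pos + 1) : Failure.new("expected #{c.inspect}", pos)
    end
  end

  # Parses a character that is in the cs collection.
  def char_in(cs)
    lambda do |input, pos|
      ch = input[pos]
      ch && cs.include?(ch) ? Success.new(ch, pos + 1) : Failure.new('unexpected character', pos)
    end
  end

  # Succeeds only when the whole input has been consumed.
  def eof
    lambda do |input, pos|
      pos == input.length ? Success.new(nil, pos) : Failure.new('expected end of input', pos)
    end
  end

  # Tries each parser at the same position; the first success wins (assumption).
  def alt(*parsers)
    lambda do |input, pos|
      hit = parsers.lazy.map { |parser| parser.call(input, pos) }.find { |res| res.is_a?(Success) }
      hit || Failure.new('no alternative matched', pos)
    end
  end

  # Applies the parsers one after another and collects their data in an array.
  def seq(*parsers)
    lambda do |input, pos|
      parsers.reduce(Success.new([], pos)) do |acc, parser|
        res = parser.call(input, acc.pos)
        break res if res.is_a?(Failure)
        Success.new(acc.data + [res.data], res.pos)
      end
    end
  end

  # Applies the parser as many times as possible; zero matches is still a success.
  def repeat(parser)
    lambda do |input, pos|
      data = []
      loop do
        res = parser.call(input, pos)
        # Stop on failure, or if nothing was consumed (avoids looping forever).
        break if res.is_a?(Failure) || res.pos == pos
        data << res.data
        pos = res.pos
      end
      Success.new(data, pos)
    end
  end
end

SIGNED_NUMBER = MiniParse.seq(
  MiniParse.alt(MiniParse.char('+'), MiniParse.char('-')),
  MiniParse.repeat(MiniParse.char_in('0'..'9')),
  MiniParse.eof
)

p SIGNED_NUMBER.call('-42', 0).data    # ["-", ["4", "2"], nil]
p SIGNED_NUMBER.call('x42', 0).message # "no alternative matched"
```

Modelling parsers as plain lambdas keeps the sketch short; a real library would more likely use parser objects so that modifiers can be chained as methods.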
Special modifiers:
- lazy { p }: references the parser p, which might not be defined yet. This is useful for recursive definitions.
- p.capture: sets the data of the parsing result of p, if successful, to the data between the start and the end of the match.
- p.ignore: sets the data of the parsing result of p, if successful, to nil. This is particularly useful in combination with p.compact.
- p.map { |data| … }: sets the data of the parsing result of p, if successful, to the return value of the block. The block gets the data of the success as an argument.
- p.first: sets the data of the parsing result of p, if successful, to the first element of the data of the success. This only works if the success data is an array.
- p.second: sets the data of the parsing result of p, if successful, to the second element of the data of the success. This only works if the success data is an array.
- p.select_odd: sets the data of the parsing result of p, if successful, to each odd element of the data of the success. This only works if the success data is an array.
- p.select_even: sets the data of the parsing result of p, if successful, to each even element of the data of the success. This only works if the success data is an array.
- p.compact: sets the data of the parsing result of p, if successful, to each non-nil element of the data of the success. This only works if the success data is an array. This is particularly useful in combination with p.ignore.
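The reason lazy exists may be easier to see in code. The sketch below is plain Ruby and does not use D★Parse; LazySketch and all of its methods are hypothetical stand-ins. It assumes a parser can be modelled as a lambda that returns the new position, or nil on failure. The point is that a recursive grammar has to mention a constant that does not exist yet, and wrapping the reference in a block postpones the lookup until the parser is actually used.

```ruby
# Hypothetical stand-ins, not D★Parse code.
# A parser here is a lambda: (input, pos) -> new position, or nil on failure.
module LazySketch
  def self.char(c)
    ->(input, pos) { input[pos] == c ? pos + 1 : nil }
  end

  # Runs the parsers in order; once one fails, the result stays nil.
  def self.seq(*parsers)
    ->(input, pos) { parsers.reduce(pos) { |at, parser| at && parser.call(input, at) } }
  end

  # Returns the result of the first parser that succeeds at pos.
  def self.alt(*parsers)
    ->(input, pos) { parsers.reduce(nil) { |at, parser| at || parser.call(input, pos) } }
  end

  # Always succeeds without advancing.
  def self.succeed
    ->(_input, pos) { pos }
  end

  # Stores the block and only evaluates it the first time the parser runs.
  def self.lazy(&block)
    parser = nil
    ->(input, pos) { (parser ||= block.call).call(input, pos) }
  end

  # NESTED refers to itself. Writing NESTED directly on the right-hand side
  # would raise NameError, because the constant is not assigned yet.
  NESTED = alt(seq(char('('), lazy { NESTED }, char(')')), succeed)

  # True if the whole string is a run of properly nested parentheses.
  def self.nested?(input)
    NESTED.call(input, 0) == input.length
  end
end

p LazySketch.nested?('((()))') # true
p LazySketch.nested?('(()')    # false
```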
To do
As mentioned above, this software is in an early state, and still lacks many features. It is not yet a fully functional parser combinator library, but it’ll hopefully get there.
- Add more combinators (e.g. repeat1).
- Add support for backtracking, so that seq(repeat(any), string('donkey')) can parse superdonkey.
- Add failure descriptions to all parsers.
- Allow renaming failures, so that errors can be easier to understand for hoominz.
- Add tests for everything.
- Add documentation.
- Add support for parsing generic token streams, rather than just characters.
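The backtracking item can be illustrated with a small plain-Ruby model. BacktrackSketch and its methods are hypothetical and do not use D★Parse; they only mimic what seq(repeat(any), string('donkey')) does when repeat is greedy, and what it could do if repeat were allowed to give characters back.

```ruby
# Hypothetical model of the behaviour described in the to-do item.
module BacktrackSketch
  module_function

  # Without backtracking: repeat(any) consumes every remaining character,
  # so string(word) is attempted at the end of the input and fails.
  def greedy_match?(input, word)
    pos = input.length
    input[pos, word.length] == word
  end

  # With backtracking: when the rest of the sequence fails, repeat(any)
  # gives back one character at a time until string(word) matches.
  # Returns [prefix, word] on success, or nil if no split works.
  def backtracking_match(input, word)
    input.length.downto(0) do |consumed|
      return [input[0, consumed], word] if input[consumed, word.length] == word
    end
    nil
  end
end

p BacktrackSketch.greedy_match?('superdonkey', 'donkey')      # false
p BacktrackSketch.backtracking_match('superdonkey', 'donkey') # ["super", "donkey"]
```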
Commit message conventions
As an experiment, I’m going to use commit message conventions slightly adapted from Angular.js’s.
<type>(<scope>): <subject>
<BLANK LINE>
<body>
The following types are supported:
- feat (new feature)
- fix (bug fix)
- docs (documentation)
- style (formatting, …)
- refactor
- test (adding tests)
- chore (maintenance, such as build infrastructure changes)
The following scopes are supported:
- core
- parsers
- samples
The following rules apply to the subject:
- Use the imperative, present tense.
- Do not capitalize the first letter.
- Do not end the subject with a period.
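These conventions are mechanical enough to check with a short script. The sketch below is one possible checker; CommitCheck is a hypothetical helper that is not part of this repository, and the rule about imperative, present tense is left out because a simple pattern cannot verify it.

```ruby
# Hypothetical checker for the conventions above.
module CommitCheck
  TYPES  = %w[feat fix docs style refactor test chore].freeze
  SCOPES = %w[core parsers samples].freeze
  HEADER = /\A(?<type>\w+)\((?<scope>\w+)\): (?<subject>.+)\z/

  module_function

  # Returns a list of problems found in a commit message (empty if none).
  def problems(message)
    lines = message.lines.map(&:chomp)
    match = HEADER.match(lines.first.to_s)
    return ['header must look like <type>(<scope>): <subject>'] unless match

    errors = []
    errors << "unknown type #{match[:type]}" unless TYPES.include?(match[:type])
    errors << "unknown scope #{match[:scope]}" unless SCOPES.include?(match[:scope])
    errors << 'subject must not start with a capital letter' if match[:subject].match?(/\A[A-Z]/)
    errors << 'subject must not end with a period' if match[:subject].end_with?('.')
    # A body, if present, has to be separated from the header by a blank line.
    errors << 'second line must be blank' if lines.length > 1 && !lines[1].strip.empty?
    errors
  end
end

p CommitCheck.problems("feat(parsers): add repeat1 combinator")
# []
p CommitCheck.problems("Fix(core): Fixed a bug.")
# ["unknown type Fix", "subject must not start with a capital letter", "subject must not end with a period"]
```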