D★Parse
D★Parse is a parser combinator library for Ruby.
STATUS: Experimental. Pre-alpha. Use at your own risk.
Example
Here is a parser for a series of numbers:
require 'd-parse'
module Grammar
  extend DParse::DSL

  DIGIT = char_in('0'..'9')

  ROOT =
    seq(
      intersperse(
        repeat(DIGIT).capture.map { |d| d.to_i(10) },
        char(',').ignore,
      ).compact,
      eof,
    ).first
end
res = Grammar::ROOT.apply("1,2,100,582048,07,09")
case res
when DParse::Success
  p res.data
when DParse::Failure
  $stderr.puts res.
  exit 1
end
Parsers
- `alt(p1, p2, …)` attempts to apply any of the given parsers.
- `any` parses any character.
- `char_in(cs)` parses a character that is in the `cs` collection.
- `char_not_in(cs)` parses a character that is not in the `cs` collection.
- `char_not(c)` parses a character that is not `c`.
- `char(c)` parses the single character `c`.
- `eof` parses the end of file.
- `fail` always fails.
- `describe(p, name)` sets the name of the parser, so that parsing failures of `p` return a failure with message “expected name”.
- `repeat(p)` tries to apply `p` as many times as possible, and never fails.
- `seq(p1, p2, …)` tries to apply the given parsers in sequence.
- `succeed` always succeeds, without advancing the position.
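To make the descriptions above concrete, here is a minimal toy sketch of a few of these combinators in plain Ruby. It is not D★Parse's implementation: the `ToyParsers` module and its representation of a parser (a lambda that takes an input and a position and returns `[new_position, data]`, or `nil` on failure) are assumptions made purely for illustration.

```ruby
# Toy combinators, NOT the D★Parse internals. A parser here is a lambda
# (input, pos) -> [new_pos, data] on success, or nil on failure.
module ToyParsers
  module_function

  # Parses the single character c.
  def char(c)
    ->(input, pos) { input[pos] == c ? [pos + 1, c] : nil }
  end

  # Parses a character that is in the cs collection.
  def char_in(cs)
    lambda do |input, pos|
      pos < input.length && cs.include?(input[pos]) ? [pos + 1, input[pos]] : nil
    end
  end

  # Succeeds only at the end of the input, without advancing.
  def eof
    ->(input, pos) { pos == input.length ? [pos, nil] : nil }
  end

  # Returns the result of the first parser that succeeds, or nil.
  def alt(*ps)
    ->(input, pos) { ps.lazy.map { |p| p.call(input, pos) }.find { |r| r } }
  end

  # Applies the parsers in sequence; fails as soon as one of them fails.
  def seq(*ps)
    lambda do |input, pos|
      ps.reduce([pos, []]) do |(cur, data), p|
        res = p.call(input, cur)
        break nil unless res
        [res[0], data + [res[1]]]
      end
    end
  end

  # Applies p as many times as possible; never fails.
  def repeat(p)
    lambda do |input, pos|
      data = []
      while (res = p.call(input, pos))
        break if res[0] == pos # stop if p succeeded without consuming input
        pos = res[0]
        data << res[1]
      end
      [pos, data]
    end
  end
end

digits = ToyParsers.repeat(ToyParsers.char_in('0'..'9'))
p digits.call("42x", 0) # prints [2, ["4", "2"]]
p digits.call("x", 0)   # prints [0, []]
```

Note that `repeat` succeeds with an empty array when nothing matches, which is what "never fails" means in the list above.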
Special modifiers:
- `lazy { p }` references the parser `p`, which might not be defined yet. This is useful for recursive definitions.
- `p.capture` sets the data of the parsing result of `p`, if successful, to the data between the start and the end of the match.
- `p.ignore` sets the data of the parsing result of `p`, if successful, to `nil`. This is particularly useful in combination with `p.compact`.
- `p.map { |data| … }` sets the data of the parsing result of `p`, if successful, to the return value of the block. The block gets the data of the success as an argument.
- `p.first` sets the data of the parsing result of `p`, if successful, to the first element of the data of the success. This only works if the success data is an array.
- `p.second` sets the data of the parsing result of `p`, if successful, to the second element of the data of the success. This only works if the success data is an array.
- `p.select_odd` sets the data of the parsing result of `p`, if successful, to each odd element of the data of the success. This only works if the success data is an array.
- `p.select_even` sets the data of the parsing result of `p`, if successful, to each even element of the data of the success. This only works if the success data is an array.
- `p.compact` sets the data of the parsing result of `p`, if successful, to each non-nil element of the data of the success. This only works if the success data is an array. This is particularly useful in combination with `p.ignore`.
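The array-based modifiers are easiest to picture as plain transformations of the success data. The sketch below mimics them on an ordinary Ruby array and does not use D★Parse itself. The helper names are hypothetical, and it assumes that "odd" and "even" refer to zero-based positions in the array, which the descriptions above do not spell out.

```ruby
# Plain-array stand-ins for the data modifiers (hypothetical helper names).
# Assumption: "even"/"odd" mean zero-based positions 0, 2, 4, … and 1, 3, 5, …

def pick_even(data)
  data.select.with_index { |_, i| i.even? }
end

def pick_odd(data)
  data.select.with_index { |_, i| i.odd? }
end

# Shape of the data one might get from interspersed items and separators,
# after the separators have been turned into nil by something like ignore.
raw = ["1", nil, "2", nil, "100"]

p raw.first                           # prints "1"
p raw[1]                              # prints nil
p pick_even(raw)                      # prints ["1", "2", "100"]
p pick_odd(raw)                       # prints [nil, nil]
p raw.compact                         # prints ["1", "2", "100"]
p raw.compact.map { |d| d.to_i(10) }  # prints [1, 2, 100]
```

This also shows why `ignore` and `compact` pair well: ignored matches become `nil`, and `compact` then drops them from the resulting array.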
To do
As mentioned above, this software is in an early state, and still lacks many features. It is not yet a fully functional parser combinator library, but it’ll hopefully get there.
- Add more combinators (e.g. `repeat1`).
- Add support for backtracking, so that `seq(repeat(any), string('donkey'))` can parse `superdonkey`.
- Add failure descriptions to all parsers.
- Allow renaming failures, so that errors can be easier to understand for hoominz.
- Add tests for everything.
- Add documentation.
- Add support for parsing generic token streams, rather than just characters.
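The backtracking item can be illustrated without the library. The following sketch, using hypothetical helper names and plain string operations, contrasts a greedy "repeat any character" that consumes the whole input with a version that gives characters back until the following word matches. It only demonstrates the idea and makes no claim about how D★Parse would implement it.

```ruby
# Greedy: "repeat any" swallows the whole input, so nothing is left
# for the word that should follow. Returns true only if the word
# matches at the position where the greedy repeat stopped.
def greedy_any_then_word?(input, word)
  pos = input.length
  input[pos, word.length] == word
end

# Backtracking: try the longest "repeat any" match first, then give
# back one character at a time until the word matches. Returns the
# position where the word starts, or nil if it never matches.
def backtracking_any_then_word(input, word)
  input.length.downto(0).find { |pos| input[pos, word.length] == word }
end

p greedy_any_then_word?("superdonkey", "donkey")      # prints false
p backtracking_any_then_word("superdonkey", "donkey") # prints 5
p backtracking_any_then_word("superhorse", "donkey")  # prints nil
```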
Commit message conventions
As an experiment, I’m going to use commit message conventions slightly adapted from Angular.js’s.
<type>(<scope>): <subject>
<BLANK LINE>
<body>
The following types are supported:
- `feat` (new feature)
- `fix` (bug fix)
- `docs` (documentation)
- `style` (formatting, …)
- `refactor`
- `test` (adding tests)
- `chore` (maintenance, such as build infrastructure changes)
The following scopes are supported:
- `core`
- `parsers`
- `samples`
The following rules apply to the subject:
- Use the imperative, present tense.
- Do not capitalize the first letter.
- Do not end the subject with a period.
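As a rough illustration, the mechanical parts of these conventions could be checked with a small script. The sketch below is not part of the project. The constant and method names are hypothetical, it assumes the scope is mandatory, and it cannot check whether the subject is in the imperative.

```ruby
# Hypothetical checker for the first line of a commit message.
# Assumption: the "(<scope>)" part is required.
COMMIT_TYPES  = %w[feat fix docs style refactor test chore].freeze
COMMIT_SCOPES = %w[core parsers samples].freeze
COMMIT_HEADER = /\A(?<type>\w+)\((?<scope>\w+)\): (?<subject>.+)\z/

# Returns an array of problems found; an empty array means the header passes.
def commit_header_errors(header)
  m = COMMIT_HEADER.match(header)
  return ["header does not match <type>(<scope>): <subject>"] unless m

  errors = []
  errors << "unknown type: #{m[:type]}" unless COMMIT_TYPES.include?(m[:type])
  errors << "unknown scope: #{m[:scope]}" unless COMMIT_SCOPES.include?(m[:scope])
  errors << "subject starts with a capital letter" if m[:subject].match?(/\A[A-Z]/)
  errors << "subject ends with a period" if m[:subject].end_with?(".")
  errors
end

p commit_header_errors("feat(parsers): add repeat1 combinator") # prints []
p commit_header_errors("fix(core): Handle empty input.").length # prints 2
```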