Class: ToknInternal::RegParse
Inherits: Object
- Object
- ToknInternal::RegParse
Defined in: lib/tokn/reg_parse.rb
Overview
Parses a single regular expression from a string. Produces an NFA with distinguished start and end states (none of these states are marked as final states).
Here is the grammar for regular expressions. Spaces are ignored, and can be liberally sprinkled within the regular expressions to aid readability. To represent a space, the \s escape sequence must be used. See the file ‘sampletokens.txt’ for some examples.
Expressions have one of these types:
E : base class
J : a Join expression, formed by concatenating one or more Q expressions together
Q : a Quantified expression; followed optionally by '*', '+', or '?'
P : a Parenthesized expression, which is optionally surrounded with (), {}, []
E -> J '|' E
   | J
J -> Q J
   | Q
Q -> P '*'
   | P '+'
   | P '?'
   | P
P -> '(' E ')'
   | '{' TOKENNAME '}'
   | '[^' SETSEQ ']'     A code not appearing in the set
   | '[' SETSEQ ']'
   | CHARCODE
SETSEQ -> SET SETSEQ
        | SET
SET -> CHARCODE
     | CHARCODE '-' CHARCODE
CHARCODE ->
     a | b | c ...       any printable except {,},[, etc.
   | \xhh                hex value from 00...ff
   | \uhhhh              hex value from 0000...ffff (e.g., unicode)
   | \f | \n | \r | \t   formfeed, linefeed, return, tab
   | \s                  a space (' ')
   | \*                  where * is some other non-alphabetic
                         character that needs to be escaped
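The CHARCODE alternatives above can be illustrated with a small decoder. This is only a sketch under stated assumptions: `decode_charcode`, `hex_value`, and `NAMED_ESCAPES` are hypothetical names, not part of tokn, and rejecting unknown alphabetic escapes is a guess at reasonable behavior rather than something the grammar specifies. The sketch also does not reject unescaped metacharacters such as `{` or `[`.

```ruby
# Hypothetical helper, not part of tokn: decode the CHARCODE at the start of
# text and return [character_code, characters_consumed].
NAMED_ESCAPES = { 'f' => 0x0c, 'n' => 0x0a, 'r' => 0x0d, 't' => 0x09, 's' => 0x20 }.freeze

# Read `count` hex digits that follow the two-character prefix (\x or \u).
def hex_value(text, count)
  digits = text[2, count].to_s
  raise ArgumentError, "expected #{count} hex digits" unless digits.match?(/\A\h{#{count}}\z/)
  digits.to_i(16)
end

def decode_charcode(text)
  raise ArgumentError, 'empty input' if text.empty?
  return [text[0].ord, 1] unless text[0] == '\\'

  esc = text[1]
  raise ArgumentError, 'dangling backslash' if esc.nil?

  case esc
  when 'x' then [hex_value(text, 2), 4]                   # \xhh
  when 'u' then [hex_value(text, 4), 6]                   # \uhhhh
  when *NAMED_ESCAPES.keys then [NAMED_ESCAPES[esc], 2]   # \f \n \r \t \s
  when /[A-Za-z]/ then raise ArgumentError, "unknown escape \\#{esc}" # assumption
  else [esc.ord, 2]                                       # \* form: escaped non-alphabetic
  end
end
```

Under these assumptions, `decode_charcode('\\s')` yields the code for a space together with a consumed length of 2, which is how a pattern would express a literal space given that plain spaces are ignored.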
The parser performs recursive descent parsing; each method returns an NFA represented by a pair of states: the start and end states.
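To make the "each method returns a pair of states" idea concrete, here is a simplified recursive-descent sketch. It is not the tokn implementation: `MiniRegParse`, `closure`, and `accepts?` are hypothetical names, states are plain integer ids, and only `'(' E ')'`, literal characters, `|`, `*`, `+`, and `?` are handled (no `{TOKENNAME}`, sets, or escapes). The epsilon-edge wiring is one common way to combine fragments and is assumed here, not taken from the source.

```ruby
# Simplified sketch with hypothetical names; states are integer ids.
class MiniRegParse
  attr_reader :start_state, :end_state, :edges

  def initialize(script)
    @chars = script.delete(' ').chars # the grammar ignores spaces
    @pos = 0
    @edges = [] # edges[id] lists [label, target] pairs; nil label = epsilon
    @start_state, @end_state = parse_e
    raise ArgumentError, "unexpected #{peek.inspect}" if peek
  end

  private

  def peek
    @chars[@pos]
  end

  def new_state
    (@edges << []).size - 1
  end

  def link(from, to, label = nil)
    @edges[from] << [label, to]
  end

  # E -> J '|' E | J
  def parse_e
    first = parse_j
    return first unless peek == '|'
    @pos += 1
    second = parse_e
    start, fin = new_state, new_state
    [first, second].each { |s, e| link(start, s); link(e, fin) }
    [start, fin]
  end

  # J -> Q J | Q
  def parse_j
    s, e = parse_q
    return [s, e] if peek.nil? || peek == '|' || peek == ')'
    s2, e2 = parse_j
    link(e, s2) # epsilon edge joins the two fragments
    [s, e2]
  end

  # Q -> P '*' | P '+' | P '?' | P
  def parse_q
    s, e = parse_p
    op = peek
    return [s, e] unless ['*', '+', '?'].include?(op)
    @pos += 1
    start, fin = new_state, new_state
    link(start, s)
    link(e, fin)
    link(e, s) unless op == '?'       # loop back for '*' and '+'
    link(start, fin) unless op == '+' # allow skipping for '*' and '?'
    [start, fin]
  end

  # P -> '(' E ')' | CHARCODE (literal characters only here)
  def parse_p
    c = peek
    raise ArgumentError, "unexpected #{c.inspect}" if c.nil? || '|)*+?'.include?(c)
    @pos += 1
    if c == '('
      frag = parse_e
      raise ArgumentError, "expected ')'" unless peek == ')'
      @pos += 1
      return frag
    end
    s, e = new_state, new_state
    link(s, e, c.ord)
    [s, e]
  end
end

# Expand a set of state ids along epsilon edges.
def closure(nfa, ids)
  seen = ids.uniq
  stack = seen.dup
  while (id = stack.pop)
    fresh = nfa.edges[id].select { |label, _| label.nil? }.map(&:last).uniq - seen
    seen.concat(fresh)
    stack.concat(fresh)
  end
  seen
end

# Run the NFA on text, treating end_state as accepting.
def accepts?(nfa, text)
  current = closure(nfa, [nfa.start_state])
  text.each_char do |ch|
    moved = current.flat_map { |id| nfa.edges[id].select { |l, _| l == ch.ord }.map(&:last) }
    current = closure(nfa, moved)
  end
  current.include?(nfa.end_state)
end
```

With this sketch, `accepts?(MiniRegParse.new('a (b | c)* d'), 'abcbd')` returns `true` and the same pattern rejects `'a'`. Note that the end state is only treated as accepting by the `accepts?` helper; the parser itself marks no state as final, matching the overview above.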
Instance Attribute Summary
- #endState ⇒ Object (readonly)
  Returns the value of attribute endState.
- #startState ⇒ Object (readonly)
  Returns the value of attribute startState.
Instance Method Summary
- #initialize(script, tokenDefMap = nil) ⇒ RegParse (constructor)
  Construct a parser and perform the parsing.
- #inspect ⇒ Object
Constructor Details
#initialize(script, tokenDefMap = nil) ⇒ RegParse
Construct a parser and perform the parsing.
# File 'lib/tokn/reg_parse.rb', line 73
def initialize(script, tokenDefMap = nil)
  @script = script.strip
  @nextStateId = 0
  @tokenDefMap = tokenDefMap
  parseScript
end
Instance Attribute Details
#endState ⇒ Object (readonly)
Returns the value of attribute endState.
# File 'lib/tokn/reg_parse.rb', line 65
def endState
  @endState
end
#startState ⇒ Object (readonly)
Returns the value of attribute startState.
# File 'lib/tokn/reg_parse.rb', line 65
def startState
  @startState
end
Instance Method Details
#inspect ⇒ Object
# File 'lib/tokn/reg_parse.rb', line 81
def inspect
  s = "RegParse: #{@script}"
  s += " start:" + d(@startState) + " end:" + d(@endState)
  return s
end