Class: Mani::Tokenizer
- Inherits:
- Object
- Defined in:
- lib/mani/tokenizer.rb
Overview
This class provides class methods that tokenize strings into static text and delimited sequences.
Constant Summary
- ESCAPE_CHARACTER =
The escape character
'%'
- SEQUENCE_OPEN_DELIMITER =
The delimiter signifying the start of a sequence
'{{'
- SEQUENCE_CLOSE_DELIMITER =
The delimiter signifying the end of a sequence
'}}'
- LITERAL_OPEN_DELIMITER =
The delimiter signifying an “open sequence” escape sequence
ESCAPE_CHARACTER + SEQUENCE_OPEN_DELIMITER
- LITERAL_CLOSE_DELIMITER =
The delimiter signifying a “close sequence” escape sequence
ESCAPE_CHARACTER + SEQUENCE_CLOSE_DELIMITER
- SEQUENCE_OPEN =
The pattern to match the start of a sequence
/
  # find opening delimiter at beginning of string...
  ^#{SEQUENCE_OPEN_DELIMITER}
  # ...or elsewhere in the string, provided it's not preceded by
  # ESCAPE_CHARACTER
  |[^#{ESCAPE_CHARACTER}]#{SEQUENCE_OPEN_DELIMITER}
/x
- SEQUENCE_CLOSE =
The pattern to match the end of a sequence
/
  # find closing delimiter at beginning of string...
  ^#{SEQUENCE_CLOSE_DELIMITER}
  # ...or elsewhere in the string, provided it's not preceded by
  # ESCAPE_CHARACTER
  |[^#{ESCAPE_CHARACTER}]#{SEQUENCE_CLOSE_DELIMITER}
/x
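The two patterns differ only in the delimiter they look for. The following is a minimal sketch of how they behave, using standalone copies of the constants above inside a hypothetical `PatternSketch` module (the module name is an assumption for illustration, so the snippet runs without the gem). Note that when the delimiter is not at the start of the string, the match also includes the character that precedes it.

```ruby
# Hypothetical standalone module; the constants are copied from the listing above.
module PatternSketch
  ESCAPE_CHARACTER = '%'
  SEQUENCE_OPEN_DELIMITER = '{{'
  SEQUENCE_CLOSE_DELIMITER = '}}'

  SEQUENCE_OPEN = /
    # opening delimiter at the beginning of the string...
    ^#{SEQUENCE_OPEN_DELIMITER}
    # ...or elsewhere, provided it is not preceded by the escape character
    |[^#{ESCAPE_CHARACTER}]#{SEQUENCE_OPEN_DELIMITER}
  /x

  SEQUENCE_CLOSE = /
    ^#{SEQUENCE_CLOSE_DELIMITER}
    |[^#{ESCAPE_CHARACTER}]#{SEQUENCE_CLOSE_DELIMITER}
  /x
end

# An unescaped opening delimiter is found, at the start or elsewhere.
at_start  = PatternSketch::SEQUENCE_OPEN.match?('{{TAB}}')          # => true
unescaped = PatternSketch::SEQUENCE_OPEN.match?('press {{ENTER}}')  # => true

# An opening delimiter preceded by '%' is not treated as a sequence start.
escaped = PatternSketch::SEQUENCE_OPEN.match?('a literal %{{ brace pair')  # => false

# The matched text includes the preceding character.
open_text  = 'press {{ENTER}}'[PatternSketch::SEQUENCE_OPEN]   # => " {{"
close_text = 'ENTER}} x'[PatternSketch::SEQUENCE_CLOSE]        # => "R}}"
```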
Class Method Summary
- .get_tokens(text) ⇒ Array
Retrieves the tokens comprising the supplied text.
- .strip_comment_delimiters(text) ⇒ String
Strips the escape character from escaped (literal) delimiters in the supplied text.
- .tokenize(scanner, tokens) ⇒ Array
Recursively scans the string within the supplied scanner to produce a list of tokens.
Class Method Details
.get_tokens(text) ⇒ Array
Retrieves the tokens comprising the supplied text.
# File 'lib/mani/tokenizer.rb', line 43
def self.get_tokens(text)
  tokenize StringScanner.new(text), []
end
.strip_comment_delimiters(text) ⇒ String
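Strips the escape character from escaped (literal) delimiters in the supplied text, so that each LITERAL_OPEN_DELIMITER becomes SEQUENCE_OPEN_DELIMITER and each LITERAL_CLOSE_DELIMITER becomes SEQUENCE_CLOSE_DELIMITER.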
# File 'lib/mani/tokenizer.rb', line 51
def self.strip_comment_delimiters(text)
  text
    .gsub(LITERAL_OPEN_DELIMITER, SEQUENCE_OPEN_DELIMITER)
    .gsub(LITERAL_CLOSE_DELIMITER, SEQUENCE_CLOSE_DELIMITER)
end
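To illustrate the effect of the two `gsub` calls, here is a small sketch that copies the method into a hypothetical `EscapeSketch` module (the module name is an assumption; the body follows the listing above) and runs it on two sample strings.

```ruby
# Hypothetical standalone copy of the method shown above, so it runs without the gem.
module EscapeSketch
  SEQUENCE_OPEN_DELIMITER = '{{'
  SEQUENCE_CLOSE_DELIMITER = '}}'
  LITERAL_OPEN_DELIMITER = '%' + SEQUENCE_OPEN_DELIMITER
  LITERAL_CLOSE_DELIMITER = '%' + SEQUENCE_CLOSE_DELIMITER

  def self.strip_comment_delimiters(text)
    text
      .gsub(LITERAL_OPEN_DELIMITER, SEQUENCE_OPEN_DELIMITER)
      .gsub(LITERAL_CLOSE_DELIMITER, SEQUENCE_CLOSE_DELIMITER)
  end
end

# Escaped delimiters lose their leading '%'.
unescaped = EscapeSketch.strip_comment_delimiters('type %{{name%}} here')
# => "type {{name}} here"

# Plain delimiters are left alone.
untouched = EscapeSketch.strip_comment_delimiters('{{ENTER}}')
# => "{{ENTER}}"
```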
.tokenize(scanner, tokens) ⇒ Array
Recursively scans the string within the supplied scanner to produce a list of tokens.
# File 'lib/mani/tokenizer.rb', line 63
def self.tokenize(scanner, tokens)
  match = scanner.scan_until SEQUENCE_OPEN

  unless match
    static = strip_comment_delimiters scanner.rest
    tokens.concat [[:static, static]] unless static.empty?

    return tokens
  end

  if scanner.check_until SEQUENCE_CLOSE
    static = strip_comment_delimiters match.chomp(SEQUENCE_OPEN_DELIMITER)
    tokens.concat [[:static, static]] unless static.empty?

    match = scanner.scan_until SEQUENCE_CLOSE
    match.chomp! SEQUENCE_CLOSE_DELIMITER

    sequence = strip_comment_delimiters match
    tokens.concat [[:sequence, sequence]] unless sequence.empty?

    tokenize scanner, tokens
  else
    static = strip_comment_delimiters(match + scanner.rest)
    tokens.concat [[:static, static]]
  end
end
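The recursion above can be exercised end to end with a standalone sketch. The `TokenizerSketch` module below is hypothetical: it copies the constants and methods from this page so that the example runs without the gem, and the comments are added here for explanation. It shows the three paths through `tokenize`: a closed sequence, text with only escaped delimiters, and an opening delimiter that is never closed.

```ruby
require 'strscan'

# Hypothetical standalone module mirroring the listings on this page;
# it is a runnable copy for illustration, not the gem itself.
module TokenizerSketch
  ESCAPE_CHARACTER = '%'
  SEQUENCE_OPEN_DELIMITER = '{{'
  SEQUENCE_CLOSE_DELIMITER = '}}'
  LITERAL_OPEN_DELIMITER = ESCAPE_CHARACTER + SEQUENCE_OPEN_DELIMITER
  LITERAL_CLOSE_DELIMITER = ESCAPE_CHARACTER + SEQUENCE_CLOSE_DELIMITER
  SEQUENCE_OPEN =
    /^#{SEQUENCE_OPEN_DELIMITER}|[^#{ESCAPE_CHARACTER}]#{SEQUENCE_OPEN_DELIMITER}/
  SEQUENCE_CLOSE =
    /^#{SEQUENCE_CLOSE_DELIMITER}|[^#{ESCAPE_CHARACTER}]#{SEQUENCE_CLOSE_DELIMITER}/

  def self.get_tokens(text)
    tokenize StringScanner.new(text), []
  end

  def self.strip_comment_delimiters(text)
    text
      .gsub(LITERAL_OPEN_DELIMITER, SEQUENCE_OPEN_DELIMITER)
      .gsub(LITERAL_CLOSE_DELIMITER, SEQUENCE_CLOSE_DELIMITER)
  end

  def self.tokenize(scanner, tokens)
    # Consume everything up to and including the next unescaped "{{".
    match = scanner.scan_until SEQUENCE_OPEN

    unless match
      # No further sequences: whatever remains is static text.
      static = strip_comment_delimiters scanner.rest
      tokens.concat [[:static, static]] unless static.empty?
      return tokens
    end

    if scanner.check_until SEQUENCE_CLOSE
      # Text before the opening delimiter is static.
      static = strip_comment_delimiters match.chomp(SEQUENCE_OPEN_DELIMITER)
      tokens.concat [[:static, static]] unless static.empty?

      # Text between the delimiters is a sequence.
      match = scanner.scan_until SEQUENCE_CLOSE
      match.chomp! SEQUENCE_CLOSE_DELIMITER
      sequence = strip_comment_delimiters match
      tokens.concat [[:sequence, sequence]] unless sequence.empty?

      tokenize scanner, tokens
    else
      # Opening delimiter without a closing one: keep everything as static text.
      static = strip_comment_delimiters(match + scanner.rest)
      tokens.concat [[:static, static]]
    end
  end
end

mixed = TokenizerSketch.get_tokens('Hello {{ENTER}} world')
# => [[:static, "Hello "], [:sequence, "ENTER"], [:static, " world"]]

escaped = TokenizerSketch.get_tokens('use %{{ and %}} literally')
# => [[:static, "use {{ and }} literally"]]

unclosed = TokenizerSketch.get_tokens('a {{b')
# => [[:static, "a {{b"]]
```

Because the opening pattern also matches the character before the delimiter, `chomp` removes only the delimiter itself and that character stays in the static token.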