Class: SrlRuby::Tokenizer

Inherits:
Object
Defined in:
lib/srl_ruby/tokenizer.rb

Overview

A tokenizer for the Simple Regex Language (SRL). Responsibility: break input SRL into a sequence of token objects. The tokenizer should recognize:

  • Keywords: as, capture, letter
  • Integer literals, including single digits
  • String literals (quote delimited)
  • Single-character literals
  • Delimiters: parentheses '(' and ')'
  • Separators: comma (optional)
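To make the categories listed in the overview concrete, here is a minimal sketch of how a StringScanner could pick them out. The PATTERNS table and the scan_lexeme helper are hypothetical names introduced for illustration; the actual patterns in lib/srl_ruby/tokenizer.rb are not shown on this page and may differ.

```ruby
require 'strscan'

# Hypothetical patterns, one per category named in the overview.
# Order matters: the first pattern that matches wins.
PATTERNS = {
  delimiter: /[()]/,
  separator: /,/,
  integer: /\d+/,
  string: /"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/,
  char: /[A-Za-z](?![A-Za-z])/, # a lone letter, as in "from a to z"
  word: /[A-Za-z]+/             # keyword candidates
}.freeze

# Return [category, lexeme] for the text at the scan position,
# or nil when nothing recognizable is left.
def scan_lexeme(scanner)
  scanner.skip(/\s+/)
  PATTERNS.each do |category, pattern|
    lexeme = scanner.scan(pattern)
    return [category, lexeme] if lexeme
  end
  nil
end

scanner = StringScanner.new('capture (letter, "ab") as 3')
lexemes = []
while (pair = scan_lexeme(scanner))
  lexemes << pair
end
```

With this input, lexemes holds eight pairs, starting with [:word, "capture"] and ending with [:integer, "3"]. A real tokenizer would go on to turn each pair into a token object carrying position information.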

Defined Under Namespace

Classes: ScanError

Constant Summary

@@lexeme2name =
{
  '(' => 'LPAREN',
  ')' => 'RPAREN',
  ',' => 'COMMA'
}.freeze
# All the SRL keywords (in uppercase)
@@keywords =
%w[
  ALL
  ALREADY
  AND
  ANY
  ANYTHING
  AS
  AT
  BACKSLASH
  BEGIN
  BETWEEN
  BY
  CAPTURE
  CARRIAGE
  CASE
  CHARACTER
  DIGIT
  EITHER
  END
  EXACTLY
  FOLLOWED
  FROM
  HAD
  IF
  INSENSITIVE
  LAZY
  LEAST
  LETTER
  LINE
  LITERALLY
  MORE
  MULTI
  MUST
  NEVER
  NEW
  NO
  NONE
  NOT
  NUMBER
  OF
  ONCE
  ONE
  OPTIONAL
  OR
  RAW
  RETURN
  STARTS
  TAB
  TIMES
  TO
  TWICE
  UNTIL
  UPPERCASE
  VERTICAL
  WHITESPACE
  WITH
  WORD
].map { |x| [x, x] }.to_h
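The two tables above are plain lookup hashes: one maps punctuation to a token name, the other maps each keyword to itself. The sketch below shows one way such tables could be consulted. The constants KEYWORDS and LEXEME2NAME and the helper token_name are hypothetical stand-ins, the keyword list is shortened, and upcasing the lexeme before the lookup is an assumption, not something this page confirms.

```ruby
# Assumption: a shortened keyword table built the same way as @@keywords.
KEYWORDS = %w[AS CAPTURE LETTER].map { |x| [x, x] }.to_h.freeze

# Same content as @@lexeme2name.
LEXEME2NAME = { '(' => 'LPAREN', ')' => 'RPAREN', ',' => 'COMMA' }.freeze

# Hypothetical helper: map a lexeme to a token name, or nil if unknown.
# Punctuation is looked up as-is; words are upcased first so that
# "capture" and "CAPTURE" resolve to the same keyword.
def token_name(lexeme)
  LEXEME2NAME[lexeme] || KEYWORDS[lexeme.upcase]
end

token_name('(')        # "LPAREN"
token_name('capture')  # "CAPTURE"
token_name('foo')      # nil
```

Using a hash keyed by the uppercase keyword keeps the membership test to a single lookup, instead of a scan through the word list.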

Constructor Details

#initialize(source) ⇒ Tokenizer

Constructor. Initializes a tokenizer over the given SRL source text.

# File 'lib/srl_ruby/tokenizer.rb', line 99

def initialize(source)
  @scanner = StringScanner.new(source)
  @lineno = 1
  @line_start = 0
end
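The constructor sets up three pieces of state: the scanner, the current line number, and the offset at which the current line starts. A plausible use of the last two is to derive a line and column for each token. The sketch below illustrates that idea under that assumption. The position helper is hypothetical, and the way the real tokenizer updates @lineno and @line_start is not shown on this page.

```ruby
require 'strscan'

# Hypothetical helper: return [line, column] (both 1-based) for the
# current scan position. StringScanner#pos is a byte offset, so the
# column is only exact for single-byte (ASCII) input.
def position(scanner, lineno, line_start)
  [lineno, scanner.pos - line_start + 1]
end

scanner = StringScanner.new("begin with\nletter twice")
lineno = 1       # same initial values as in the constructor
line_start = 0

scanner.scan(/begin with/)
if scanner.scan(/\n/)
  # A newline was consumed: bump the line counter and remember
  # where the new line begins.
  lineno += 1
  line_start = scanner.pos
end
scanner.scan(/letter /)

line, column = position(scanner, lineno, line_start)
# line is 2 and column is 8: the scanner sits on the "t" of "twice".
```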

Instance Attribute Details

#line_start ⇒ Integer (readonly)

# File 'lib/srl_ruby/tokenizer.rb', line 27

def line_start
  @line_start
end

#lineno ⇒ Integer (readonly)

# File 'lib/srl_ruby/tokenizer.rb', line 24

def lineno
  @lineno
end

#scanner ⇒ StringScanner (readonly)

# File 'lib/srl_ruby/tokenizer.rb', line 21

def scanner
  @scanner
end

Instance Method Details

#tokens ⇒ Array

Scans the whole source and returns the resulting sequence of tokens. Scanning steps that yield nil are skipped.

# File 'lib/srl_ruby/tokenizer.rb', line 105

def tokens
  tok_sequence = []
  until @scanner.eos?
    token = _next_token
    tok_sequence << token unless token.nil?
  end

  return tok_sequence
end
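
The loop in #tokens keeps asking for the next token until the scanner reaches the end of the input, and drops any nil result. The private _next_token method is not shown on this page, so the sketch below substitutes a hypothetical next_lexeme helper that returns plain strings and yields nil when only whitespace is left.

```ruby
require 'strscan'

# Hypothetical stand-in for _next_token: skip whitespace, then return
# the next run of non-blank characters, or nil if only whitespace remained.
def next_lexeme(scanner)
  scanner.skip(/\s+/)
  scanner.scan(/\S+/)
end

# Same loop shape as #tokens: iterate until end of input, keep non-nil results.
def lexemes(source)
  scanner = StringScanner.new(source)
  sequence = []
  until scanner.eos?
    lexeme = next_lexeme(scanner)
    sequence << lexeme unless lexeme.nil?
  end
  sequence
end

lexemes("digit twice  \n") # the trailing whitespace produces a nil, which is dropped
```

The nil check matters for input that ends in whitespace: the last call consumes the blanks, finds nothing to return, and the loop then stops because the scanner is at the end of the string.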