Class: Skeem::Tokenizer

Inherits:
Object
Defined in:
lib/skeem/tokenizer.rb

Overview

A tokenizer for the Skeem dialect. Its responsibility is to break Skeem input into a sequence of token objects. The tokenizer should recognize:

  • Identifiers
  • Integer literals, including single digits
  • String literals (quote-delimited)
  • Single-character literals
  • Delimiters: parentheses '(' and ')'
  • Separators: comma

Defined Under Namespace

Classes: ScanError

Constant Summary

Lexeme2name =
{
  "'" => 'APOSTROPHE',
  '=>' => 'ARROW',
  '`' => 'GRAVE_ACCENT',
  '(' => 'LPAREN',
  ')' => 'RPAREN',
  '.' => 'PERIOD',
  '...' => 'ELLIPSIS',
  ',' => 'COMMA',
  ',@' =>  'COMMA_AT_SIGN',
  '#(' => 'VECTOR_BEGIN',
  '_' => 'UNDERSCORE'
}.freeze
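
As an illustration of how a table like Lexeme2name can drive delimiter recognition, here is a minimal sketch. It assumes (this is not confirmed by the source) that the scanner matches the longest lexeme first, so that ',@' is not read as ',' followed by '@'. The names LEXEME_TO_NAME and DELIMITER are hypothetical and the table is only a subset of the one above.

```ruby
require 'strscan'

# Hypothetical subset of the Lexeme2name table shown above.
LEXEME_TO_NAME = {
  '(' => 'LPAREN',
  ')' => 'RPAREN',
  '.' => 'PERIOD',
  '...' => 'ELLIPSIS',
  ',' => 'COMMA',
  ',@' => 'COMMA_AT_SIGN'
}.freeze

# Assumption: try longer lexemes first, because a regexp alternation
# takes the first alternative that matches.
DELIMITER = Regexp.union(LEXEME_TO_NAME.keys.sort_by { |k| -k.length })

scanner = StringScanner.new('(,@)')
names = []
while (lexeme = scanner.scan(DELIMITER))
  names << LEXEME_TO_NAME[lexeme]
end

puts names.join(' ')
```

Without the length-based ordering, the ',' alternative could match before ',@' and split one lexeme into two.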
Keywords =

All implemented Scheme keywords, in uppercase. Each keyword maps to its token name; a trailing '*' is replaced by '_STAR'.

%w[
  BEGIN
  COND
  DEFINE
  DEFINE-SYNTAX
  DO
  ELSE
  IF
  INCLUDE
  LAMBDA
  LET
  LET*
  QUASIQUOTE
  QUOTE
  SET!
  SYNTAX-RULES
  UNQUOTE
  UNQUOTE-SPLICING
].to_h { |x| [x, x.sub(/\*$/, '_STAR')] }
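
To make the effect of the to_h block concrete, the sketch below applies the same transformation to a shorter, hand-picked list. The variable names are illustrative only. The final lookup assumes that an identifier is upcased before it is checked against the table, which the source does not show.

```ruby
# Same block as in the Keywords constant, applied to a shorter list.
sample = %w[IF LET LET* SET!]
keyword_names = sample.to_h { |x| [x, x.sub(/\*$/, '_STAR')] }

keyword_names.each { |lexeme, name| puts "#{lexeme} -> #{name}" }

# Assumption: lookups are done on the upcased lexeme.
token_name = keyword_names['let*'.upcase]
puts token_name
```

Only a trailing asterisk is rewritten, so 'LET*' becomes 'LET_STAR' while 'SET!' keeps its name unchanged.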

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(source) ⇒ Tokenizer

Constructor. Initialize a tokenizer for Skeem.

Parameters:

  • source (String)

    Skeem text to tokenize.



# File 'lib/skeem/tokenizer.rb', line 67

def initialize(source)
  @scanner = StringScanner.new('')
  reset(source)
end

Instance Attribute Details

#line_start ⇒ Integer (readonly)

Returns the offset of the start of the current line.

Returns:

  • (Integer)

    Offset of start of current line



# File 'lib/skeem/tokenizer.rb', line 26

def line_start
  @line_start
end

#lineno ⇒ Integer (readonly)

Returns the current line number.

Returns:

  • (Integer)

    Current line number



# File 'lib/skeem/tokenizer.rb', line 23

def lineno
  @lineno
end

#scanner ⇒ StringScanner (readonly)

Returns:

  • (StringScanner)


# File 'lib/skeem/tokenizer.rb', line 20

def scanner
  @scanner
end

Instance Method Details

#reset(source) ⇒ Object

Parameters:

  • source (String)

    Skeem text to tokenize.



# File 'lib/skeem/tokenizer.rb', line 73

def reset(source)
  @scanner.string = source
  @lineno = 1
  @line_start = 0
end
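
The reset method relies on StringScanner#string= to start over on new input. The short sketch below, which uses only the standard strscan library and made-up sample strings, shows that assigning a new string moves the scan position back to 0. That is why reset only needs to reinitialize the line bookkeeping itself.

```ruby
require 'strscan'

scanner = StringScanner.new('')
scanner.string = '(define x 1)'

scanner.scan(/\(/)            # consume the opening parenthesis
pos_before = scanner.pos      # position has advanced past '('

scanner.string = '(+ 1 2)'    # same call as in reset
pos_after = scanner.pos       # position is back at the start

puts "before: #{pos_before}, after: #{pos_after}"
```

Reusing one scanner object this way lets the same tokenizer instance process several source texts in turn.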

#tokens ⇒ Array<SkmToken>

Returns a sequence of tokens.

Returns:

  • (Array<SkmToken>)

    A sequence of tokens



# File 'lib/skeem/tokenizer.rb', line 80

def tokens
  tok_sequence = []
  until @scanner.eos?
    token = _next_token
    tok_sequence << token unless token.nil?
  end

  tok_sequence
end
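
The loop in tokens depends on the private _next_token method, which is not listed on this page. The sketch below is a hypothetical stand-in, not the Skeem implementation: MiniTokenizer and its next_token method are invented for illustration and return plain [type, lexeme] pairs rather than SkmToken objects. It shows why the nil check matters when the scanner consumes input that produces no token, such as whitespace.

```ruby
require 'strscan'

# Hypothetical stand-in for illustration; not the Skeem tokenizer.
class MiniTokenizer
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Same loop shape as Skeem::Tokenizer#tokens.
  def tokens
    tok_sequence = []
    until @scanner.eos?
      token = next_token
      tok_sequence << token unless token.nil?
    end
    tok_sequence
  end

  private

  # Returns nil when only whitespace was consumed, so the caller skips it.
  def next_token
    if @scanner.skip(/\s+/)
      nil
    elsif (lexeme = @scanner.scan(/[()]/))
      [lexeme == '(' ? :LPAREN : :RPAREN, lexeme]
    elsif (lexeme = @scanner.scan(/\d+/))
      [:INTEGER, lexeme]
    elsif (lexeme = @scanner.scan(/[^\s()]+/))
      [:IDENTIFIER, lexeme]
    end
  end
end

result = MiniTokenizer.new('(add 1 23) ').tokens
result.each { |type, lexeme| puts "#{type} #{lexeme}" }
```

Every branch of next_token consumes at least one character, so the until loop always makes progress and terminates at the end of input.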