Class: Ast::Tokeniser

Inherits:
  Object
Defined in:
lib/ast_ast/tokeniser.rb

Defined Under Namespace

Classes: Rule

Class Method Summary

Class Method Details

.rule(name, regex, &block) ⇒ Object

Creates a new Rule and adds it to the @rules list.

Parameters:

  • name (Symbol)
  • regex (Regexp)


# File 'lib/ast_ast/tokeniser.rb', line 69

def self.rule(name, regex, &block)
  @rules ||= []
  # make rules with same name overwrite first rule
  @rules.delete_if {|i| i.name == name}
  
  # Create default block which just returns a value
  block ||= Proc.new {|i| i}
  # Make sure to return a token
  proc = Proc.new {|_i| 
    block_result = block.call(_i)
    if block_result.is_a? Array
      r = []
      block_result.each do |j|
        r << Ast::Token.new(name, j)
      end
      r
    else
      Ast::Token.new(name, block_result) 
    end
  }
  @rules << Rule.new(name, regex, proc)
end
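The wrapping behaviour above (a default pass-through block, and arrays being split into one token per element) can be sketched standalone. This is a minimal, hypothetical illustration: `Token` here is a plain Struct stand-in for Ast::Token, and `wrap` is not part of the gem's API.

```ruby
# Stand-in for Ast::Token (hypothetical, for illustration only)
Token = Struct.new(:name, :value)

def wrap(name, &block)
  block ||= proc { |i| i }            # default block just returns the value
  proc do |input|
    result = block.call(input)
    if result.is_a?(Array)            # an array becomes several tokens...
      result.map { |v| Token.new(name, v) }
    else                              # ...anything else becomes one token
      Token.new(name, result)
    end
  end
end

scalar = wrap(:word)                  # no block: value passes through
p scalar.call("abc")                  # a single :word Token

splitter = wrap(:char) { |i| i.chars }
p splitter.call("ab")                 # two :char Tokens, one per element
```

This is why a rule block may return either a single value or an array: the outer proc normalises both cases into Ast::Token instances.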

.rules ⇒ Array

Returns Rules that have been defined.

Returns:

  • (Array)

    Rules that have been defined.



# File 'lib/ast_ast/tokeniser.rb', line 95

def self.rules; @rules; end

.token(regex, &block) ⇒ Object

Creates a new token rule. Unlike .rule, the block itself is expected to return an Ast::Token instance.

Examples:


keywords = ['def', 'next', 'while', 'end']

token /[a-z]+/ do |i|
  if keywords.include?(i)
    Ast::Token.new(:keyword, i)
  else
    Ast::Token.new(:word, i)
  end
end

Parameters:

  • regex (Regexp)


# File 'lib/ast_ast/tokeniser.rb', line 112

def self.token(regex, &block)
  @rules ||= []
  @rules << Rule.new(nil, regex, block)
end

.tokenise(input) ⇒ Tokens

Takes the input and uses the rules that were created to scan it.

Parameters:

  • input (String)

    the string to scan.

Returns:

  • (Tokens)

# File 'lib/ast_ast/tokeniser.rb', line 124

def self.tokenise(input)
  @scanner = StringScanner.new(input)

  result = Tokens.new
  until @scanner.eos?
    matched = false # keep track of matches
    @rules.each do |rule|
      scanned = @scanner.scan(rule.regex)
      unless scanned.nil?
        matched = true # match happened
        ran = rule.run(scanned)
        # split an array into separate tokens, *not* values
        if ran.is_a? Array
          ran.each {|tok| result << tok }
        else
          result << ran
        end
      end
    end
    unless matched
      # no rule matches the current character, so skip it
      # could add verbose mode?
      @scanner.pos += 1 unless @scanner.eos?
    end
  end
  result
end
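The scanning loop above can be sketched self-contained with Ruby's stdlib StringScanner. This is a simplified, hypothetical version: rules are plain `[regex, name]` pairs rather than Rule objects, and tokens are `[name, text]` pairs rather than Ast::Token instances, but the control flow (try each rule at the current position, skip one character when nothing matches) is the same.

```ruby
require 'strscan'

# Simplified sketch of the tokenise loop; not part of the gem's API.
def scan(input, rules)
  scanner = StringScanner.new(input)
  tokens  = []
  until scanner.eos?
    matched = false
    rules.each do |regex, name|
      text = scanner.scan(regex)    # advances the scanner on a match
      next if text.nil?
      matched = true
      tokens << [name, text]
    end
    # no rule matched at this position: skip one character, as tokenise does
    scanner.pos += 1 unless matched || scanner.eos?
  end
  tokens
end

p scan("ab 12", [[/[a-z]+/, :word], [/\d+/, :number]])
# => [[:word, "ab"], [:number, "12"]]
```

Note that unmatched characters (the space here) are silently dropped, which is exactly the behaviour the "could add verbose mode?" comment in the source refers to.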