Class: Lexer

Inherits:
Object
Defined in:
lib/linmeric/Lexer.rb

Overview

This simple lexer tokenizes the input stream of commands for the syntax analyzer.

Author

Massimiliano Dal Mas ([email protected])

License

Distributed under MIT license

Instance Method Summary

Instance Method Details

#tokenize(expr) ⇒ Object

Tokenizes the input stream according to a set of tokenizer symbols (the operators from `Tool.operators` plus space, `(`, `)`, `:`, `"` and `~`), each of which marks the end of an element. E.g. in `a+3`, the `+` marks the end of the variable `a` and the beginning of another element (`3`).

  • argument: string to tokenize

  • returns: array of tokens (see: Token)
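
The delimiter-driven splitting described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `SketchToken`, `SKETCH_DELIMITERS` and `sketch_tokenize` are hypothetical names introduced only for this example, the delimiter list is an assumed subset, and the handling of `:`, `"` and `~` is left out.

```ruby
# Hypothetical stand-in for the library's Token class: a value and its position.
SketchToken = Struct.new(:value, :position)

# Assumed delimiter set for illustration only; the real method builds its
# list from Tool.operators plus a few extra symbols.
SKETCH_DELIMITERS = ["+", "-", "*", "/", "=", " ", "(", ")"].freeze

def sketch_tokenize(expr)
  tokens = []
  buffer = ""   # characters of the element currently being collected
  start = 0     # index at which the buffered element began
  expr.each_char.with_index do |ch, i|
    if SKETCH_DELIMITERS.include?(ch)
      # A delimiter closes the element collected so far...
      tokens << SketchToken.new(buffer, start) unless buffer.empty?
      buffer = ""
      # ...and becomes a token itself, unless it is a space.
      tokens << SketchToken.new(ch, i) unless ch == " "
      start = i + 1
    else
      buffer += ch
    end
  end
  # Flush whatever is left after the last delimiter.
  tokens << SketchToken.new(buffer, start) unless buffer.empty?
  tokens
end

sketch_tokenize("a+3").map(&:value)  # => ["a", "+", "3"]
```

Each token keeps the index at which it starts, which is presumably what lets later stages report where in the input a problem occurred.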



# File 'lib/linmeric/Lexer.rb', line 21

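def tokenize(expr)
  token = []        # tokens collected so far
  temp = ""         # characters of the element currently being read
  ignore = false    # not used in this method
  pos = 0           # start position of the element held in temp
  i = 0             # index of the character being examined
  gen_exp = []      # result of extract_gen_exp, empty when no string was read
  tokenizers = Tool.operators + [" ","(",")",":",'"',"~"]
  while i < expr.size
    if  (tokenizers.include? expr[i]) then
      # A colon is kept attached to the element that precedes it
      temp += ':' if expr[i] == ':'
      # The tokenizer symbol closes the pending element
      token << Token.new(temp, pos) unless temp == ""
      temp = ""
      # The symbol itself becomes a token, except for spaces and colons
      token << Token.new(expr[i],i) unless " :".include? expr[i]
      # After an opening quote the rest of the input is handed to extract_gen_exp;
      # judging from its use here, index 0 holds the string, index 1 the offset
      # to skip and index 2 is truthy when a closing quote must be emitted
      gen_exp = extract_gen_exp(expr[(i+1)...expr.size]) if expr[i] == '"'
      token << Token.new(gen_exp[0],pos+1,"GENERAL_STRING") unless gen_exp == []
      i += gen_exp[1] unless gen_exp == []
      pos = i + ((gen_exp == []) ? 1 : 0)
      token << Token.new('"',pos) && pos += 1 if gen_exp[2]
      gen_exp = []
    else
      temp += expr[i]
    end
    i += 1
  end
  # Flush the last element, if any
  token << Token.new(temp,pos) unless temp == ""
  return token
end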