Class: AsciiMath::Tokenizer
- Inherits: Object
  - Object
  - AsciiMath::Tokenizer
- Defined in: lib/asciimath/parser.rb
Overview
Internal: Splits an ASCIIMath expression into a sequence of tokens. Each token is represented as a Hash containing the keys :value and :type. The :value key stores the text associated with the token. The :type key indicates the semantics of the token and is one of the following symbols:
- :symbol - a symbolic name or a bit of text without any further semantics
- :text - a bit of arbitrary text
- :number - a number
- :operator - a mathematical operator symbol
- :unary - a unary operator (e.g., sqrt, text, …)
- :infix - an infix operator (e.g., /, _, ^, …)
- :binary - a binary operator (e.g., frac, root, …)
- :eof - indicates no more tokens are available
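The token shape described above can be illustrated with a small check. This is only a sketch: TOKEN_TYPES and token? are hypothetical names that are not part of AsciiMath, and the '42' value is an illustrative assumption. Only the :eof token with a nil :value is taken from the source shown in this page.

```ruby
# Hypothetical list of the documented :type values (not defined by AsciiMath).
TOKEN_TYPES = [:symbol, :text, :number, :operator,
               :unary, :infix, :binary, :eof].freeze

# Hypothetical helper: true when obj has the documented token shape,
# i.e. a Hash with a :value key and a known :type.
def token?(obj)
  obj.is_a?(Hash) && obj.key?(:value) && TOKEN_TYPES.include?(obj[:type])
end

eof_token    = { :value => nil,  :type => :eof }     # as returned at end of input
number_token = { :value => '42', :type => :number }  # illustrative value only

token?(eof_token)
token?(number_token)
token?({ :value => 'x', :type => :bogus })
```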
Constant Summary
- WHITESPACE = /\s+/
- NUMBER = /[0-9]+(?:\.[0-9]+)?/
- QUOTED_TEXT = /"[^"]*"/
- TEX_TEXT = /text\([^)]*\)/
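To make the constants concrete, the sketch below applies each pattern at the start of a few sample inputs using StringScanner from the Ruby standard library. The scan_prefix helper is a hypothetical name introduced for illustration. How the tokenizer's private reader methods actually use these patterns is not shown in this page.

```ruby
require 'strscan'

# Patterns copied from the constant summary.
WHITESPACE  = /\s+/
NUMBER      = /[0-9]+(?:\.[0-9]+)?/
QUOTED_TEXT = /"[^"]*"/
TEX_TEXT    = /text\([^)]*\)/

# Hypothetical helper: returns the prefix of `input` matched by `pattern`,
# or nil when the pattern does not match at the very start of the input.
def scan_prefix(input, pattern)
  StringScanner.new(input).scan(pattern)
end

scan_prefix('3.14+x', NUMBER)        # matches "3.14"
scan_prefix('3.x', NUMBER)           # matches "3"; a fraction needs a digit after the dot
scan_prefix('-3', NUMBER)            # nil; the pattern has no sign
scan_prefix('"a b" c', QUOTED_TEXT)  # matches the quoted part, quotes included
scan_prefix('text(f(x))', TEX_TEXT)  # matches "text(f(x)"; stops at the first ")"
```

Note that TEX_TEXT cannot span nested parentheses, because `[^)]*` stops at the first closing parenthesis.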
Instance Method Summary
- #initialize(string, symbols) ⇒ Tokenizer (constructor)
  Public: Initializes an ASCIIMath tokenizer.
- #next_token ⇒ Object
  Public: Read the next token from the ASCIIMath expression and move the tokenizer ahead by one token.
- #push_back(token) ⇒ Object
  Public: Pushes the given token back to the tokenizer.
Constructor Details
#initialize(string, symbols) ⇒ Tokenizer
Public: Initializes an ASCIIMath tokenizer.
string - The ASCIIMath expression to tokenize
symbols - The symbol table to use while tokenizing

    # File 'lib/asciimath/parser.rb', line 58
    def initialize(string, symbols)
      @string = StringScanner.new(string)
      @symbols = symbols
      lookahead = @symbols.keys.map { |k| k.length }.max
      @symbol_regexp = /((?:\\[\s0-9]|[^\s0-9]){1,#{lookahead}})/
      @push_back = nil
    end
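The constructor derives the symbol pattern from the longest key in the symbol table. The sketch below reproduces that computation in isolation. The symbol table contents and the candidate lambda are hypothetical and chosen only for illustration; the real table is supplied by the caller.

```ruby
require 'strscan'

# Hypothetical symbol table; only the key lengths matter for this sketch.
symbols = { '+' => :plus, 'sqrt' => :sqrt, 'alpha' => :alpha }

# Same computation as in the constructor: the longest key bounds the match.
lookahead = symbols.keys.map { |k| k.length }.max
symbol_regexp = /((?:\\[\s0-9]|[^\s0-9]){1,#{lookahead}})/

# Hypothetical helper: the candidate text the pattern grabs at the start of input.
candidate = ->(input) { StringScanner.new(input).scan(symbol_regexp) }

candidate.call('alphabet')  # at most `lookahead` characters are consumed
candidate.call('x2')        # an unescaped digit ends the candidate
candidate.call('42')        # nil: digits cannot start a symbol candidate
```

The pattern only bounds the candidate text. How that candidate is then matched against the symbol table keys happens in code not shown in this page.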
Instance Method Details
#next_token ⇒ Object
Public: Read the next token from the ASCIIMath expression and move the tokenizer ahead by one token.
Returns the next token as a Hash
    # File 'lib/asciimath/parser.rb', line 70
    def next_token
      if @push_back
        t = @push_back
        @push_back = nil
        return t
      end

      @string.scan(WHITESPACE)

      return {:value => nil, :type => :eof} if @string.eos?

      case @string.peek(1)
        when '"'
          read_quoted_text
        when 't'
          case @string.peek(5)
            when 'text('
              read_tex_text
            else
              read_symbol
          end
        when '-', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
          read_number || read_symbol
        else
          read_symbol
      end
    end
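The dispatch on the first character can be traced with a small standalone function. This is a sketch, not the library's code: reader_for is a hypothetical name, it ignores the pushed-back token, and it only reports which reader next_token would select instead of calling the private readers, whose bodies are not shown here.

```ruby
require 'strscan'

# Hypothetical helper mirroring the case statement in next_token.
# Returns the name of the reader that would be chosen for `input`.
def reader_for(input)
  scanner = StringScanner.new(input)
  scanner.scan(/\s+/)            # skip leading whitespace, as WHITESPACE does
  return :eof if scanner.eos?    # nothing left: the :eof token is returned

  case scanner.peek(1)
  when '"'
    :read_quoted_text
  when 't'
    # Only the literal prefix "text(" selects the TeX-style text reader.
    scanner.peek(5) == 'text(' ? :read_tex_text : :read_symbol
  when '-', '0'..'9'
    # read_number is tried first; read_symbol is the fallback.
    :read_number_or_symbol
  else
    :read_symbol
  end
end

reader_for('"abc"')
reader_for('  text(abc)')
reader_for('theta')
reader_for('-x')
reader_for('   ')
```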
#push_back(token) ⇒ Object
Public: Pushes the given token back to the tokenizer. A subsequent call to next_token will return the given token rather than generating a new one. At most one token can be pushed back.
token - The token to push back
    # File 'lib/asciimath/parser.rb', line 103
    def push_back(token)
      @push_back = token unless token[:type] == :eof
    end
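A common use of push_back is one-token lookahead. The sketch below shows that pattern against a stand-in object. StubTokenizer and peek_token are hypothetical names; the stub copies only the push-back logic shown above and serves pre-built tokens instead of scanning text.

```ruby
# Hypothetical stand-in for AsciiMath::Tokenizer that serves pre-built tokens.
class StubTokenizer
  def initialize(tokens)
    @tokens = tokens.dup
    @push_back = nil
  end

  # Same push-back handling as the documented next_token.
  def next_token
    if @push_back
      t = @push_back
      @push_back = nil
      return t
    end
    @tokens.shift || { :value => nil, :type => :eof }
  end

  # Same body as the documented push_back: :eof tokens are never stored.
  def push_back(token)
    @push_back = token unless token[:type] == :eof
  end
end

# Hypothetical helper: look at the next token without consuming it.
def peek_token(tokenizer)
  token = tokenizer.next_token
  tokenizer.push_back(token)
  token
end

tokenizer = StubTokenizer.new([
  { :value => '1', :type => :number },
  { :value => '+', :type => :operator }
])
first = peek_token(tokenizer)   # the number token, still available afterwards
again = tokenizer.next_token    # the same number token
```

Because a single slot holds the pushed-back token, a second push_back before the next call to next_token replaces the first token.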