Class: RubyTokenParser

Inherits:
Object
  • Object
Includes:
Expressions
Defined in:
lib/ruby_token_parser/ruby_token_parser.rb,
lib/ruby_token_parser/parse.rb,
lib/ruby_token_parser/ruby_token_constants.rb

Overview

Note:

Limitations

  • RubyTokenParser does not support ruby 1.9's 'value` syntax.

  • RubyTokenParser does not currently support all of Ruby's escape sequences in strings and symbols.

  • Trailing commas in Array and Hash are not supported.

Note:

BigDecimals

With the :use_big_decimal option, RubyTokenParser parses "12.5" as a BigDecimal. Write "12.5e" to have the value parsed as a Float instead; it is short for "12.5e0" and equivalent to "1.25e1".
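The distinction can be sketched without the library. The helper `parse_decimal` below is hypothetical, and its two patterns are assumptions rather than the library's `RBigDecimal` and `RFloat` expressions; it only illustrates that a plain decimal may become a BigDecimal while an exponent form always becomes a Float.

```ruby
require 'bigdecimal'

# Hypothetical patterns; the library's own RFloat and RBigDecimal may differ.
EXPONENT_FORM = /\A(-?\d+(?:\.\d+)?)[eE](-?\d+)?\z/
DECIMAL_FORM  = /\A-?\d+\.\d+\z/

# Converts a decimal literal to a BigDecimal or a Float.
def parse_decimal(text, use_big_decimal: false)
  if (match = EXPONENT_FORM.match(text))
    exponent = match[2] || '0' # "12.5e" is treated as short for "12.5e0"
    Float("#{match[1]}e#{exponent}")
  elsif DECIMAL_FORM.match?(text)
    use_big_decimal ? BigDecimal(text) : Float(text)
  else
    raise ArgumentError, "not a decimal literal: #{text.inspect}"
  end
end

plain    = parse_decimal('12.5')                         # Float
precise  = parse_decimal('12.5', use_big_decimal: true)  # BigDecimal
exponent = parse_decimal('12.5e', use_big_decimal: true) # still a Float
```

In this sketch the exponent form is checked first, so the option never affects it.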

Note:

Date & Time

RubyTokenParser supports a subset of ISO 8601 for dates and times, although these forms are not valid Ruby literals. The form YYYY-MM-DD (e.g. 2012-05-20) is translated to a Date object, and YYYY-MM-DD"T"HH:MM:SS (e.g. 2012-05-20T18:29:52) is translated to a Time object.
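As a rough illustration of that translation, the sketch below maps the two forms to Date and Time objects. The helper `parse_iso` and both patterns are hypothetical; the library's `RDate` and `RDateTime` expressions may accept more or less than these do.

```ruby
require 'date'

# Hypothetical patterns, assumed for illustration only.
ISO_DATE_TIME = /\A(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\z/
ISO_DATE      = /\A(\d{4})-(\d{2})-(\d{2})\z/

# Returns a Time, a Date, or nil when the text matches neither form.
def parse_iso(text)
  if (match = ISO_DATE_TIME.match(text))
    Time.mktime(*match.captures.map(&:to_i)) # interpreted as local time
  elsif (match = ISO_DATE.match(text))
    Date.civil(*match.captures.map(&:to_i))
  end
end

date = parse_iso('2012-05-20')
time = parse_iso('2012-05-20T18:29:52')
```

The longer pattern is tried first so that a timestamp is never mistaken for a bare date.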


RubyTokenParser

Parse Strings containing ruby literals.

RubyTokenParser recognizes constants and the following literals:

nil                    # nil
true                   # true
false                  # false
-123                   # Fixnum/Bignum (decimal)
0b1011                 # Fixnum/Bignum (binary)
0755                   # Fixnum/Bignum (octal)
0xff                   # Fixnum/Bignum (hexadecimal)
120.30                 # Float (optional: BigDecimal)
1e0                    # Float
"foo"                  # String, no interpolation, but \t etc. work
'foo'                  # String, only \\ and \' are escaped
/foo/                  # Regexp
:foo                   # Symbol
:"foo"                 # Symbol
2012-05-20             # Date
2012-05-20T18:29:52    # Time
[Any, Literals, Here]  # Array
{Any => Literals}      # Hash
(1..20)                # Range

Defined Under Namespace

Modules: Expressions

Classes: SyntaxError

Constant Summary

Constants included from Expressions

Expressions::DoubleQuotedStringEscapes, Expressions::RArrayBegin, Expressions::RArrayEnd, Expressions::RArraySeparator, Expressions::RArrayVoid, Expressions::RBigDecimal, Expressions::RBinaryInteger, Expressions::RConstant, Expressions::RDString, Expressions::RDate, Expressions::RDateTime, Expressions::RFalse, Expressions::RFloat, Expressions::RHashArrow, Expressions::RHashBegin, Expressions::RHashEnd, Expressions::RHashKeySymbol, Expressions::RHashSeparator, Expressions::RHashVoid, Expressions::RHexInteger, Expressions::RInteger, Expressions::RNil, Expressions::ROctalInteger, Expressions::RRange, Expressions::RRegexp, Expressions::RSString, Expressions::RSymbol, Expressions::RTime, Expressions::RTimeZone, Expressions::RTrue


Constructor Details

#initialize(string, options = nil) ⇒ RubyTokenParser

initialize

Creates a parser for the given String. The String is not scanned until scan_value is called.

Parameters:

  • string (String)

    The string which should be parsed

  • options (nil, Hash) (defaults to: nil)

    An options-hash

Options Hash (options):

  • :use_big_decimal (Boolean)

    Whether to use BigDecimal instead of Float for objects like "1.23". Defaults to false.

  • :constant_base (Module, nil)

    The constant from which other constants are looked up.

    Defaults to Object (nil is treated as Object too; Object is the top-level namespace).



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 122

def initialize(string, options = nil)
  @string          = string
  options          = options ? options.dup : {}
  @constant_base   = options[:constant_base] # nil means toplevel
  @use_big_decimal = options.delete(:use_big_decimal) { false }
  @scanner         = StringScanner.new(string)
end
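The option handling in the constructor relies on two small idioms: duplicating the Hash before changing it, and passing a block to `Hash#delete` to supply a default. The sketch below isolates them in a hypothetical helper, `read_options`, which is not part of the library.

```ruby
# Hypothetical stand-in for the option handling in the constructor.
def read_options(options = nil)
  options = options ? options.dup : {} # work on a copy, never the caller's Hash
  # The block is only evaluated when the key is missing.
  use_big_decimal = options.delete(:use_big_decimal) { false }
  { constant_base: options[:constant_base], use_big_decimal: use_big_decimal }
end

caller_options = { use_big_decimal: true }
settings       = read_options(caller_options)
defaults       = read_options
```

Because of the `dup`, `caller_options` still contains `:use_big_decimal` after the call.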

Class Method Details

.parse(string, options = nil, do_raise_an_exception = RubyTokenParser.raise_exception?) ⇒ Object

RubyTokenParser.parse

Parses a String and returns the Ruby object it contains.

The do_raise_an_exception argument determines whether a SyntaxError is raised when unparsed data remains after the value. A SyntaxError raised while scanning is rescued, and the RubyTokenParser::SyntaxError class is then used as the value.

Usage example:

x = RubyTokenParser.parse("[1,2,3]")

Parameters:

  • string (String)

    The (input) String which should be parsed.

  • options (nil, Hash) (defaults to: nil)

    An options-hash

  • do_raise_an_exception (Boolean) (defaults to: RubyTokenParser.raise_exception?)

    Whether to raise a SyntaxError when unparsed data remains after the value.

Options Hash (options):

  • :use_big_decimal (Boolean)

    Whether to use BigDecimal instead of Float for objects like “1.23”. Defaults to false.

  • :constant_base (Module, nil)

    The constant from which other constants are looked up. Defaults to Object (nil is treated as Object too; Object is the top-level namespace).

Returns:

  • (Object)

    The object in the string.

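Raises:

  • (RubyTokenParser::SyntaxError)

    If do_raise_an_exception is true and unparsed data remains after the value. Range values are exempt from this check.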



# File 'lib/ruby_token_parser/parse.rb', line 47

def self.parse(
    string,
    options = nil,
    do_raise_an_exception = RubyTokenParser.raise_exception?
  )
  # ======================================================================= #
  # === Instantiate a new parser
  # ======================================================================= #
  parser  = new(string, options)
  begin
    value = parser.scan_value
  rescue RubyTokenParser::SyntaxError # Must rescue things such as: @foo = foo
    value = RubyTokenParser::SyntaxError
  end
  if do_raise_an_exception
    unless parser.end_of_string? or
           value.nil?
      # =================================================================== #
      # Raise the Syntax Error.
      # =================================================================== #
      raise SyntaxError,
            "Unexpected superfluous data: #{parser.rest.inspect}"
    end unless value.is_a? Range # Make an exception for Range objects.
  end
  value
end
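The overall flow (scan one value, then complain about leftovers) can be reduced to a few lines. The sketch below is a hypothetical miniature that recognizes integers only and raises ArgumentError instead of the library's SyntaxError; it is meant to show the shape of the check, not the library's behavior in every case.

```ruby
require 'strscan'

# Hypothetical miniature of the parse flow, recognizing integers only.
def parse_single_integer(text, raise_on_leftover = true)
  scanner = StringScanner.new(text)
  digits  = scanner.scan(/-?\d+/)
  value   = digits && digits.to_i # nil when nothing matched
  # Mirror the guard above: complain only if a value was found
  # and the scanner has not reached the end of the string.
  if raise_on_leftover && !value.nil? && !scanner.eos?
    raise ArgumentError, "Unexpected superfluous data: #{scanner.rest.inspect}"
  end
  value
end

clean   = parse_single_integer('42')
lenient = parse_single_integer('42 abc', false)
```

Calling `parse_single_integer('42 abc')` with the default second argument raises, because `" abc"` is left over.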

.raise_exception? ⇒ Boolean

RubyTokenParser.raise_exception?

Returns:

  • (Boolean)


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 89

def self.raise_exception?
  @do_raise_exception
end

.set_do_raise_exception(i = true) ⇒ Object

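RubyTokenParser.set_do_raise_exception

Sets the value that RubyTokenParser.raise_exception? returns, and therefore the default for the third argument of RubyTokenParser.parse.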


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 82

def self.set_do_raise_exception(i = true)
  @do_raise_exception = i
end
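The two class methods above store the flag in an instance variable of the class object. The sketch below reproduces that pattern in a hypothetical class, `ToggleDemo`; whether the library assigns an initial value elsewhere is not shown in this page, so the `nil` starting state here is an assumption about the sketch only.

```ruby
# Hypothetical class mirroring the setter and reader shown above.
class ToggleDemo
  def self.set_do_raise_exception(i = true)
    @do_raise_exception = i
  end

  def self.raise_exception?
    @do_raise_exception
  end

  # The default is evaluated on every call, so later changes to the flag apply.
  def self.current_default(flag = raise_exception?)
    flag
  end
end

before = ToggleDemo.raise_exception? # unassigned instance variables read as nil
ToggleDemo.set_do_raise_exception
after = ToggleDemo.current_default
```

Since Ruby evaluates default arguments at call time, a parse-style method with such a default picks up the flag as it is when the method is called.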

Instance Method Details

#constant_base? ⇒ Module? Also known as: constant_base

constant_base?

Returns:

  • (Module, nil)

    Where to look up constants. nil means the top level (equivalent to Object).



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 209

def constant_base?
  @constant_base
end
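Looking up a constant relative to a base module does not strictly require eval. The sketch below shows one alternative built on `Module#const_get`; the helper `deep_const_get` and the `Outer` module are hypothetical, and the lookup rules of `const_get` are not identical to those of an evaluated `Base::Name` expression.

```ruby
# Hypothetical helper: resolves "A::B" relative to a base module without eval.
def deep_const_get(name, base = nil)
  name.split('::').inject(base || Object) do |scope, part|
    scope.const_get(part)
  end
end

# Example namespace, defined only for this illustration.
module Outer
  module Inner
    VALUE = 42
  end
end

from_top  = deep_const_get('Outer::Inner::VALUE')   # nil base means Object
from_base = deep_const_get('Inner::VALUE', Outer)   # lookup starts at Outer
```

As with the option described above, a nil base is treated as Object.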

#content? ⇒ String

content?

Reader for the complete string held by the scanner.

Returns:

  • (String)


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 179

def content?
  @scanner.string
end

#end_of_string? ⇒ Boolean

end_of_string?

Whether the scanner reached the end of the string.

Returns:

  • (Boolean)


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 161

def end_of_string?
  @scanner.eos?
end

#inspect? ⇒ String

inspect?

Returns the inspected form of the unprocessed rest of the string.

Returns:

  • (String)


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 198

def inspect?
  @scanner.rest.inspect
end

#position=(i) ⇒ Object

position=

Moves the scanner's position to the given index. StringScanner#pos= counts bytes, so the index is a byte offset.

Parameters:

  • i (Integer)

    The new position of the scanner



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 150

def position=(i)
  @scanner.pos = i
end

#position? ⇒ Integer Also known as: position

position?

The position of the scanner in the string.

Returns:

  • (Integer)


# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 137

def position?
  @scanner.pos
end

#rest ⇒ String

rest

Returns:

  • (String)

    The currently unprocessed rest of the string.



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 170

def rest
  @scanner.rest
end
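The reader methods above are thin wrappers around StringScanner. A short session with the standard `strscan` library shows what each underlying call returns; the input string is arbitrary and chosen only for illustration.

```ruby
require 'strscan'

scanner = StringScanner.new('[1, 2]')

scanner.scan(/\[/)                    # consume the opening bracket
position_after_bracket = scanner.pos  # wrapped by position?
rest_after_bracket     = scanner.rest # wrapped by rest

scanner.pos = 4                       # wrapped by position=
rest_after_jump = scanner.rest

whole_string = scanner.string         # wrapped by content?
scanner.terminate                     # move to the end of the string
at_end = scanner.eos?                 # wrapped by end_of_string?
```

Here `position_after_bracket` is 1, `rest_after_bracket` is `"1, 2]"`, and `rest_after_jump` is `"2]"`.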

#scan_value ⇒ Object

scan_value

Scans the string for a single value and advances the parser's position.

Returns:

  • (Object)

    the scanned value

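Raises:

  • (RubyTokenParser::SyntaxError)

    If the input matches none of the supported literals, or if an expected ], } or => is missing.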



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 229

def scan_value
  case
  # ======================================================================= #
  # === Handle Ranges                               (range tag, ranges tag)
  # ======================================================================= #
  when (!content?.scan(RRange).empty?)
    _ = content?.delete('(').delete(')').squeeze('.').split('.').map(&:to_i)
    min = _.first
    max = _.last
    Range.new(min, max)
  # ======================================================================= #
  # === Handle Arrays                               (arrays tag, array tag)
  # ======================================================================= #
  when @scanner.scan(RArrayBegin)
    value = []
    @scanner.scan(RArrayVoid)
    if @scanner.scan(RArrayEnd)
      value
    else
      value << scan_value
      while @scanner.scan(RArraySeparator)
        value << scan_value
      end
      unless @scanner.scan(RArrayVoid) && @scanner.scan(RArrayEnd)
        raise SyntaxError, 'Expected ]'
      end
      value
    end
  # ======================================================================= #
  # === Handle Hashes
  #
  # This is quite complicated. We have to scan whether we may find
  # the {} syntax or the end of a hash.
  # ======================================================================= #
  when @scanner.scan(RHashBegin)
    value = {}
    @scanner.scan(RHashVoid)
    if @scanner.scan(RHashEnd)
      value
    else
      if @scanner.scan(RHashKeySymbol)
        key = @scanner[1].to_sym
        @scanner.scan(RHashVoid)
      else
        key = scan_value
        unless @scanner.scan(RHashArrow)
          raise SyntaxError, 'Expected =>'
        end
      end
      val = scan_value
      value[key] = val
      while @scanner.scan(RHashSeparator)
        if @scanner.scan(RHashKeySymbol)
          key = @scanner[1].to_sym
          @scanner.scan(RHashVoid)
        else
          key = scan_value
          raise SyntaxError, 'Expected =>' unless @scanner.scan(RHashArrow)
        end
        val = scan_value
        value[key] = val
      end
      unless @scanner.scan(RHashVoid) && @scanner.scan(RHashEnd)
        raise SyntaxError, 'Expected }'
      end
      value
    end
  # ======================================================================= #
  # === Handle Constants
  #
  # eval() is evil but it may be sane due to the regex, also
  # it's less annoying than deep_const_get.
  #
  # @constant_base can be set via the Hash options[:constant_base].
  # ======================================================================= #
  when @scanner.scan(RConstant)
    eval("#{@constant_base}::#{@scanner.matched}") # StringScanner has no #first
  # ======================================================================= #
  # === Handle Nil values
  # ======================================================================= #
  when @scanner.scan(RNil)
    nil
  # ======================================================================= #
  # === Handle True values
  # ======================================================================= #
  when @scanner.scan(RTrue) # true tag
    true
  # ======================================================================= #
  # === Handle False values
  # ======================================================================= #
  when @scanner.scan(RFalse) # false tag
    false
  # ======================================================================= #
  # === Handle DateTime values
  # ======================================================================= #
  when @scanner.scan(RDateTime)
    Time.mktime( # Tap into the regex pattern next.
      @scanner[1], @scanner[2],
      @scanner[3], @scanner[4],
      @scanner[5], @scanner[6]
    )
  # ======================================================================= #
  # === Handle Date values
  # ======================================================================= #
  when @scanner.scan(RDate)
    date = @scanner[1].to_i, @scanner[2].to_i, @scanner[3].to_i
    Date.civil(*date)
  # ======================================================================= #
  # === Handle RTime values
  # ======================================================================= #
  when @scanner.scan(RTime)
    now = Time.now
    Time.mktime(
      now.year, now.month, now.day,
      @scanner[1].to_i, @scanner[2].to_i, @scanner[3].to_i
    )
  # ======================================================================= #
  # === Handle Float values
  # ======================================================================= #
  when @scanner.scan(RFloat)
    Float(@scanner.matched.delete('^0-9.e-'))
  # ======================================================================= #
  # === Handle BigDecimal values
  # ======================================================================= #
  when @scanner.scan(RBigDecimal)
    data = @scanner.matched.delete('^0-9.-')
    @use_big_decimal ? BigDecimal(data) : Float(data)
  # ======================================================================= #
  # === Handle OctalInteger values
  # ======================================================================= #
  when @scanner.scan(ROctalInteger)
    # ===================================================================== #
    # We can make use of Integer to turn them into valid ruby objects.
    # ===================================================================== #
    Integer(@scanner.matched.delete('^0-9-'))
  # ======================================================================= #
  # === Handle HexInteger values
  # ======================================================================= #
  when @scanner.scan(RHexInteger)
    Integer(@scanner.matched.delete('^xX0-9A-Fa-f-'))
  # ======================================================================= #
  # === Handle BinaryInteger values
  # ======================================================================= #
  when @scanner.scan(RBinaryInteger)
    Integer(@scanner.matched.delete('^bB01-'))
  # ======================================================================= #
  # === Handle Integer values
  # ======================================================================= #
  when @scanner.scan(RInteger)
    @scanner.matched.delete('^0-9-').to_i
  # ======================================================================= #
  # === Handle Regexp values
  # ======================================================================= #
  when @scanner.scan(RRegexp)
    source = @scanner[1]
    flags  = 0
    lang   = nil
    if @scanner[2]
      flags |= Regexp::IGNORECASE if @scanner[2].include?('i') # Value of 1
      flags |= Regexp::MULTILINE  if @scanner[2].include?('m') # Value of 4
      flags |= Regexp::EXTENDED   if @scanner[2].include?('x') # Value of 2
      lang   = @scanner[2].delete('^nNeEsSuU')[-1,1]
    end
    Regexp.new(source, flags, lang)
  # ======================================================================= #
  # === Handle double-quoted string values
  # ======================================================================= #
  when @scanner.scan(RDString)
    @scanner.matched[1..-2].gsub(/\\(?:[0-3]?\d\d?|x[A-Fa-f\d]{2}|.)/) { |m|
      DoubleQuotedStringEscapes[m] # the name listed under Expressions
    }
  # ======================================================================= #
  # === Handle Symbol values                      (symbol tag, symbols tag)
  # ======================================================================= #
  when @scanner.scan(RSymbol)
    # ===================================================================== #
    # Next, check the first character matched.
    # ===================================================================== #
    case @scanner.matched[1,1] # Might be "f".
    # ===================================================================== #
    # If it is a '"' quote, enter here.
    # ===================================================================== #
    when '"'
      @scanner.matched[2..-2].gsub(/\\(?:[0-3]?\d\d?|x[A-Fa-f\d]{2}|.)/) { |m|
        DoubleQuotedStringEscapes[m] # the name listed under Expressions
      }.to_sym
    # ===================================================================== #
    # If it is a "'" quote, enter here.
    # ===================================================================== #
    when "'"
      @scanner.matched[2..-2].gsub(/\\'/, "'").gsub(/\\\\/, "\\").to_sym
    else # Default here. Match all but the leading ':'
      @scanner.matched[1..-1].to_sym
    end
  # ======================================================================= #
  # === Handle single-quoted string values
  # ======================================================================= #
  when @scanner.scan(RSString)
    @scanner.matched[1..-2].gsub(/\\'/, "'").gsub(/\\\\/, "\\")
  # ======================================================================= #
  # === Handle everything else
  #
  # This can lead to a runtime error, so we must raise a SyntaxError.
  # ======================================================================= #
  else # else tag
    raise SyntaxError, "Unrecognized pattern: #{inspect?}"
  end
end
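The array branch above is a small recursive-descent routine: it consumes the opening bracket, calls scan_value for each element, and expects a separator or the closing bracket afterwards. The sketch below isolates that idea in a hypothetical class, `TinyArrayParser`, which handles only integers and nested arrays, uses its own simplified patterns instead of the `RArray*` expressions, and raises ArgumentError rather than the library's SyntaxError.

```ruby
require 'strscan'

# Hypothetical miniature of scan_value: integers and nested arrays only.
class TinyArrayParser
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  def scan_value
    if @scanner.scan(/\[\s*/)
      value = []
      return value if @scanner.scan(/\]/) # empty array
      value << scan_value                 # first element
      value << scan_value while @scanner.scan(/\s*,\s*/)
      raise ArgumentError, 'Expected ]' unless @scanner.scan(/\s*\]/)
      value
    elsif @scanner.scan(/-?\d+/)
      @scanner.matched.to_i
    else
      raise ArgumentError, "Unrecognized pattern: #{@scanner.rest.inspect}"
    end
  end
end

nested = TinyArrayParser.new('[1, [2, 3], []]').scan_value
```

In this sketch a trailing comma such as `[1,]` fails, because the separator commits the parser to reading another element.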

#use_big_decimal? ⇒ Boolean Also known as: use_big_decimal

use_big_decimal?

Returns:

  • (Boolean)

    True if "1.25" should be parsed into a big-decimal, false if it should be parsed as Float.



# File 'lib/ruby_token_parser/ruby_token_parser.rb', line 191

def use_big_decimal?
  @use_big_decimal
end