Class: Sexp::Matcher::Parser

Inherits:
Object
Defined in:
lib/sexp_matcher.rb

Overview

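Converts a lispy pattern string into Sexp matchers safely: the string is tokenized and parsed, never evaluated, and only whitelisted commands (see ALLOWED) are dispatched.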

"(a 42 _ (c) [t x] ___)" => s{ s(:a, 42, _, s(:c), t(:x), ___) }

Constant Summary

ALLOWED =
[:t, :m, :k, :atom, :not?, :-, :any, :child, :include].freeze

The whitelist of command names that #parse_cmd may convert into matchers. Any other command name raises SyntaxError.

Constructor Details

#initialize(s) ⇒ Parser

Creates a new Parser instance for the string s and lexes it into tokens.



# File 'lib/sexp_matcher.rb', line 409

def initialize s
  self.tokens = lex s
end

Instance Attribute Details

#tokens ⇒ Object

The stream of tokens to parse. See #lex.



# File 'lib/sexp_matcher.rb', line 404

def tokens
  @tokens
end

Instance Method Details

#lex(s) ⇒ Object

Converts s into an array of tokens and returns it. #initialize assigns the result to tokens.



# File 'lib/sexp_matcher.rb', line 416

def lex s
  s.scan %r%[()\[\]]|\"[^"]*\"|/[^/]*/|:?[\w?!=~-]+%
end
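
As a rough illustration of what this pattern produces, the sketch below applies the same regular expression with String#scan outside the parser. The names TOKEN_RE and lex_tokens are hypothetical and not part of the library.

```ruby
# Token pattern copied from Parser#lex above.
TOKEN_RE = %r%[()\[\]]|\"[^"]*\"|/[^/]*/|:?[\w?!=~-]+%

# Hypothetical helper (not part of Sexp::Matcher::Parser).
def lex_tokens(str)
  str.scan(TOKEN_RE)
end

# Brackets, atoms, and wildcards each become one string token.
tokens = lex_tokens('(a 42 _ (c) [t x] ___)')

# A quoted string or a /regexp/ stays a single token, delimiters included.
quoted = lex_tokens('(a "hi there" /x+/)')

# String#scan skips text that no alternative matches, so a stray
# character such as "{" is dropped silently rather than reported.
skipped = lex_tokens('a { b')
```

Because unmatched characters are skipped at this stage, malformed input is only detected later, if at all, when the parser finds unbalanced brackets or an unhandled token.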

#next_token ⇒ Object

Removes and returns the next token from the stream. Raises SyntaxError if the stream is empty.

Raises:

  • (SyntaxError)


# File 'lib/sexp_matcher.rb', line 423

def next_token
  raise SyntaxError, "unbalanced input" if tokens.empty?
  tokens.shift
end
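
The sketch below mirrors the token-stream behavior in a small stand-in class, mainly to show how the "unbalanced input" error surfaces to a caller. TokenStream is a hypothetical name used only for illustration; it is not part of the library.

```ruby
# Hypothetical stand-in that mirrors #peek_token and #next_token.
class TokenStream
  attr_accessor :tokens

  def initialize(tokens)
    self.tokens = tokens.dup
  end

  # Look at the next token without consuming it.
  def peek_token
    tokens.first
  end

  # Consume the next token, or raise if none are left.
  def next_token
    raise SyntaxError, "unbalanced input" if tokens.empty?
    tokens.shift
  end
end

stream = TokenStream.new(["(", ")"])
peeked = stream.peek_token # "(" is still in the stream
stream.next_token          # consumes "("
stream.next_token          # consumes ")"

# SyntaxError descends from ScriptError, not StandardError, so a bare
# `rescue` clause would not catch it; name the class explicitly.
message =
  begin
    stream.next_token
  rescue SyntaxError => e
    e.message
  end
```

Callers that parse untrusted pattern strings may want to rescue SyntaxError by name for this reason.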

#parse ⇒ Object

Parses all tokens and returns the result of the last top-level expression, normally a Matcher instance. Returns nil if there are no tokens.



# File 'lib/sexp_matcher.rb', line 438

def parse
  result = parse_sexp until tokens.empty?
  result
end

#parse_cmd ⇒ Object

Parses a balanced command. A command is written in square brackets, and its name must appear in the whitelist of allowed commands (see ALLOWED).

Raises:

  • (SyntaxError)


# File 'lib/sexp_matcher.rb', line 515

def parse_cmd
  args = []
  args << parse_sexp while peek_token && peek_token != "]"
  next_token # pop off "]"

  cmd = args.shift
  args = Sexp.q(*args)

  raise SyntaxError, "bad cmd: %p" % [cmd] unless ALLOWED.include? cmd

  result = Sexp.send cmd, *args

  result
end
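
The whitelist check matters because send ignores method visibility, so an unchecked command name could reach private methods such as Kernel#system. The sketch below isolates that check-then-dispatch step. FakeMatchers, ALLOWED_CMDS, and dispatch_cmd are hypothetical names standing in for Sexp, ALLOWED, and the body of #parse_cmd.

```ruby
# Same command names as ALLOWED above, under a hypothetical constant name.
ALLOWED_CMDS = [:t, :m, :k, :atom, :not?, :-, :any, :child, :include].freeze

# Hypothetical receiver standing in for Sexp; only :t is defined here.
module FakeMatchers
  def self.t(name)
    [:type, name]
  end
end

# Check the whitelist first, then dispatch with send.
def dispatch_cmd(cmd, *args)
  raise SyntaxError, "bad cmd: %p" % [cmd] unless ALLOWED_CMDS.include? cmd
  FakeMatchers.send cmd, *args
end

accepted = dispatch_cmd(:t, :x) # dispatches to FakeMatchers.t

# :system is not whitelisted, so it is rejected before send is reached.
rejected =
  begin
    dispatch_cmd(:system, "ls")
  rescue SyntaxError => e
    e.message
  end
```

Checking membership before calling send keeps the set of reachable methods fixed, whatever the input string contains.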

#parse_list ⇒ Object

Parses a balanced list of expressions and returns the equivalent matcher.



# File 'lib/sexp_matcher.rb', line 496

def parse_list
  result = []

  result << parse_sexp while peek_token && peek_token != ")"
  next_token # pop off ")"

  Sexp.q(*result)
end
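
The recursion between #parse_sexp and #parse_list is a small recursive-descent scheme. The following self-contained sketch shows the same shape using plain nested arrays instead of matchers; parse_nested is a hypothetical function, and leaving atoms as strings is a simplification made for the example.

```ruby
# Hypothetical recursive-descent sketch: "(" opens a list, ")" closes it,
# and any other token is returned unchanged as an atom.
def parse_nested(tokens)
  token = tokens.shift
  raise SyntaxError, "unbalanced input" if token.nil?
  return token unless token == "("

  list = []
  # Collect sub-expressions until the matching ")" is next in line.
  list << parse_nested(tokens) while tokens.first && tokens.first != ")"
  raise SyntaxError, "unbalanced input" if tokens.empty?
  tokens.shift # pop off ")"
  list
end

nested = parse_nested(%w[( a ( b c ) d )])

# A missing ")" is reported instead of being ignored.
unbalanced =
  begin
    parse_nested(%w[( a])
  rescue SyntaxError => e
    e.message
  end
```

In the library the closing bracket is popped with #next_token, which performs the same emptiness check.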

#parse_sexp ⇒ Object

Parses the next expression from the token stream into a sexp matcher, following this grammar:

SEXP : "(" SEXP:args* ")"          => Sexp.q(*args)
     | "[" CMD:cmd SEXP:args* "]"  => Sexp.cmd(*args)
     | "nil"                       => nil
     | /\d+/:n                     => n.to_i
     | "___"                       => Sexp.___
     | "_"                         => Sexp._
     | /^\/(.*)\/$/:re             => Regexp.new re[0]
     | /^"(.*)"$/:s                => String.new s[0]
     | UP_NAME:name                => Object.const_get name
     | NAME:name                   => name.to_sym

UP_NAME: /[A-Z]\w*/
   NAME: /:?[\w?!=~-]+/
    CMD: t | k | m | atom | not? | - | any | child | include
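
The order of the alternatives matters: integers and capitalized names must be tested before the general NAME rule, which would otherwise match them too. The sketch below covers only the atom rules that do not depend on Sexp; convert_atom is a hypothetical helper, and the bracket, wildcard, and regexp rules are left out on purpose.

```ruby
# Hypothetical helper covering a subset of the grammar's atom rules.
def convert_atom(token)
  case token
  when "nil" then
    nil
  when /^\d+$/ then
    token.to_i
  when /^"(.*)"$/ then
    $1
  when /^([A-Z]\w*)$/ then
    Object.const_get $1 # raises NameError for an unknown constant
  when /^:?([\w?!=~-]+)$/ then
    $1.to_sym # a leading colon is optional and is dropped
  else
    raise SyntaxError, "unhandled token: %p" % [token]
  end
end

atoms = ["nil", "42", '"hi"', "String", ":foo", "foo", "not?"].map { |t|
  convert_atom(t)
}
```

Note that a capitalized name is resolved with Object.const_get, so a pattern can only refer to constants that already exist in the running process.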


# File 'lib/sexp_matcher.rb', line 460

def parse_sexp
  token = next_token

  case token
  when "(" then
    parse_list
  when "[" then
    parse_cmd
  when "nil" then
    nil
  when /^\d+$/ then
    token.to_i
  when "___" then
    Sexp.___
  when "_" then
    Sexp._
  when %r%^/(.*)/$% then
    re = $1
    raise SyntaxError, "Not allowed: /%p/" % [re] unless
      re =~ /\A([\w()|.*+^$]+)\z/
    Regexp.new re
  when /^"(.*)"$/ then
    $1
  when /^([A-Z]\w*)$/ then
    Object.const_get $1
  when /^:?([\w?!=~-]+)$/ then
    $1.to_sym
  else
    raise SyntaxError, "unhandled token: %p" % [token]
  end
end
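
The regexp branch only accepts bodies made of word characters and a handful of metacharacters, so constructs such as character classes and repetition braces are rejected. The sketch below extracts that check into a standalone function for illustration. SAFE_RE_BODY and safe_regexp are hypothetical names; the character whitelist itself is copied from the listing above.

```ruby
# Character whitelist copied from the regexp branch of #parse_sexp.
SAFE_RE_BODY = /\A([\w()|.*+^$]+)\z/

# Hypothetical helper: build a Regexp from a "/.../" token if its body
# contains only whitelisted characters.
def safe_regexp(token)
  body = token[%r{^/(.*)/$}, 1]
  raise SyntaxError, "Not allowed: /%p/" % [body] unless
    SAFE_RE_BODY.match?(body.to_s)
  Regexp.new body
end

ok = safe_regexp("/^foo(bar|baz)$/")

# "[" and "{" are not in the whitelist, so these tokens are refused.
refused = ["/[a-z]/", "/a{2}/"].map { |t|
  begin
    safe_regexp(t)
    false
  rescue SyntaxError
    true
  end
}
```

An empty body is refused as well, because the whitelist pattern requires at least one character.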

#peek_token ⇒ Object

Returns the next token without removing it from the stream.



# File 'lib/sexp_matcher.rb', line 431

def peek_token
  tokens.first
end