Class: EBNF::LL1::Scanner

Inherits:
StringScanner
  • Object
Defined in:
lib/ebnf/ll1/scanner.rb

Overview

Extends StringScanner with file operations:

  • Reloads scanner as required until EOF.

  • Loads up to a high-water mark and reloads when the remaining size reaches a low-water mark.

FIXME: Only implements the subset required by the Lexer for now.
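The buffering behaviour described above can be illustrated with a small standalone sketch. This is not the library's implementation: the class name `RefillingScanner`, the private `refill` helper, and the tiny water marks are hypothetical and chosen only so the refills are easy to follow.

```ruby
require 'strscan'
require 'stringio'

# Hypothetical illustration of high/low-water buffering; not the gem's code.
class RefillingScanner < StringScanner
  def initialize(io, high_water: 16, low_water: 8)
    @io = io
    @high_water = high_water
    @low_water = low_water
    super(+"")   # start with an empty, unfrozen buffer
    refill
  end

  # Top up the buffer before each operation that inspects it.
  def scan(pattern)
    refill
    super
  end

  def eos?
    refill
    super
  end

  private

  # Read more input only when the unscanned remainder has dropped to the
  # low-water mark, and then only enough to reach the high-water mark.
  def refill
    return if @io.eof? || rest_size > @low_water
    chunk = @io.read(@high_water - rest_size)
    self << chunk if chunk
  end
end

scanner = RefillingScanner.new(StringIO.new("alpha beta gamma delta epsilon"))
words = []
until scanner.eos?
  word = scanner.scan(/\w+/)
  words << word if word
  scanner.scan(/\s+/)
end
```

In this sketch a token is only matched reliably while it is no longer than the low-water mark, since a longer token could straddle a refill boundary.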

Constant Summary

HIGH_WATER = 512 * 1024

Hopefully large enough to deal with long multi-line comments.

LOW_WATER = 4 * 1024


Constructor Details

#initialize(input, options = {}) ⇒ Scanner

Creates a scanner from an IO.

Parameters:

  • input (String, IO, #read)
  • options (Hash{Symbol => Object}) (defaults to: {})

    a customizable set of options. The constructor shown below sets the Integer-valued keys :high_water and :low_water.



# File 'lib/ebnf/ll1/scanner.rb', line 45

def initialize(input, options = {})
  @options = options.merge(high_water: HIGH_WATER, low_water: LOW_WATER)

  @input = input
  super("")
  feed_me
  self
end
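One detail of the constructor as listed is worth checking against the gem source: `Hash#merge` gives precedence to its argument, so `options.merge(high_water: ..., low_water: ...)` replaces any caller-supplied values for those keys with the defaults. The sketch below only demonstrates that merge order; the local variable names are illustrative and not part of the library.

```ruby
defaults       = {high_water: 512 * 1024, low_water: 4 * 1024}
caller_options = {high_water: 1024}

# Same order as the constructor listing: the defaults are the argument, so they win.
as_listed = caller_options.merge(defaults)

# Reversed order: caller-supplied values override the defaults.
defaults_first = defaults.merge(caller_options)

puts as_listed[:high_water]
puts defaults_first[:high_water]
```

If the intent is for callers to tune the water marks, the second form would be the one to use; whether that is the intent is an assumption based on the "customizable set of options" wording.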

Instance Attribute Details

#input ⇒ IO, StringIO (readonly)

Returns:

  • (IO, StringIO)


# File 'lib/ebnf/ll1/scanner.rb', line 18

def input
  @input
end

Class Method Details

.new(input, options = {}) ⇒ Object

If the input is not an IO (that is, it does not respond to #read), returns a plain StringScanner over the string, forced to UTF-8, instead of a Scanner.



# File 'lib/ebnf/ll1/scanner.rb', line 23

def self.new(input, options = {})
  input ||= ""
  if input.respond_to?(:read)
    scanner = self.allocate
    scanner.send(:initialize, input, options)
  else
    if input.encoding != Encoding::UTF_8
      input = input.dup if input.frozen?
      input.force_encoding(Encoding::UTF_8)
    end
    StringScanner.new(input)
  end
end
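The two branches above hinge on a duck-typing check and on relabelling a string's encoding without mutating a frozen original. A minimal sketch of both, using a hypothetical helper named `utf8_copy` that mirrors the `else` branch:

```ruby
require 'stringio'

# Hypothetical helper mirroring the string branch of .new:
# relabel the bytes as UTF-8, duplicating first if the string is frozen.
def utf8_copy(input)
  return input if input.encoding == Encoding::UTF_8
  input = input.dup if input.frozen?
  input.force_encoding(Encoding::UTF_8)
end

# IO-like objects respond to #read; plain strings do not.
io_like     = StringIO.new("x").respond_to?(:read)
string_like = "x".respond_to?(:read)

# A frozen binary string holding the UTF-8 bytes for "café".
binary = "caf\xC3\xA9".b.freeze
text   = utf8_copy(binary)
```

Note that `force_encoding` only changes the encoding label; it does not transcode or validate the bytes, so input in another encoding would still need `String#encode`.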

Instance Method Details

#ensure_buffer_full ⇒ Object

Ensures that the input buffer is full to the high-water mark, or to end of file. Useful when matching tokens that may be longer than the low-water mark.



# File 'lib/ebnf/ll1/scanner.rb', line 116

def ensure_buffer_full
  # Read up to high-water mark ensuring we're at an end of line
  if @input && !@input.eof?
    diff = @options[:high_water] - rest_size
    string = encode_utf8(@input.read(diff))
    string << encode_utf8(@input.gets) unless @input.eof?
    self << string if string
  end
end
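The read-then-gets pattern in this method is what keeps the buffer ending on a line boundary. A standalone sketch with `StringIO` follows; the 15-byte read size is arbitrary, and `force_encoding` stands in for the `encode_utf8` helper, whose body is not shown on this page.

```ruby
require 'stringio'

io = StringIO.new("first line\nsecond line\nthird line\n")

# A sized read may stop in the middle of a line, and it returns a
# binary-encoded string, so the bytes are relabelled as UTF-8.
chunk = io.read(15).force_encoding(Encoding::UTF_8)

# Reading one more line completes the partial line.
chunk << io.gets unless io.eof?

remaining = io.read
```

After these steps `chunk` ends with a newline and the next read starts at the beginning of the third line.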

#eos? ⇒ Boolean

Returns true if the scan pointer is at the end of the string

Returns:

  • (Boolean)


# File 'lib/ebnf/ll1/scanner.rb', line 80

def eos?
  feed_me
  super
end

#rest ⇒ String

Returns the “rest” of the line, or the next line if at EOL (i.e. everything after the scan pointer). If there is no more data (eos? = true), it returns “”.

Returns:

  • (String)


# File 'lib/ebnf/ll1/scanner.rb', line 59

def rest
  feed_me
  encode_utf8 super
end

#scan(pattern) ⇒ String

Tries to match with `pattern` at the current position.

If there is a match, the scanner advances the “scan pointer” and returns the matched string. Otherwise, the scanner returns nil.

If the scanner begins with the multi-line start expression

Examples:

s = StringScanner.new('test string')
p s.scan(/\w+/)   # -> "test"
p s.scan(/\w+/)   # -> nil
p s.scan(/\s+/)   # -> " "
p s.scan(/\w+/)   # -> "string"
p s.scan(/./)     # -> nil

Parameters:

  • pattern (Regexp)

Returns:

  • (String)


# File 'lib/ebnf/ll1/scanner.rb', line 109

def scan(pattern)
  feed_me
  encode_utf8 super
end

#skip(pattern) ⇒ Object

Attempts to skip over the given `pattern` beginning with the scan pointer. If it matches, the scan pointer is advanced to the end of the match, and the length of the match is returned. Otherwise, `nil` is returned.

Similar to `scan`, but without returning the matched string.

Parameters:

  • pattern (Regexp)


# File 'lib/ebnf/ll1/scanner.rb', line 71

def skip(pattern)
  feed_me
  super
end
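As a short illustration of the difference from `scan`, the following uses the standard library's StringScanner directly, in the same way as the `scan` example:

```ruby
require 'strscan'

s = StringScanner.new('test string')
first  = s.skip(/\w+/)   # length of "test"
pos    = s.pos           # scan pointer now sits after "test"
missed = s.skip(/\w+/)   # no word characters at the pointer
space  = s.skip(/\s+/)   # length of the single space
word   = s.scan(/\w+/)   # scan returns the matched text instead of a length
```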

#terminate ⇒ Object

Sets the scan pointer to the end of the string and clears matching data.



# File 'lib/ebnf/ll1/scanner.rb', line 87

def terminate
  feed_me
  super
end