Module: Yoga::Scanner

Defined in:
lib/yoga/scanner.rb

Overview

A scanner. It scans source text into a series of tokens. It is built to scan lazily, producing each token only when it is required instead of scanning everything at once. This integrates nicely with the parser.

Constant Summary

LINE_MATCHER =

A regular expression that matches every kind of line ending: \r\n, \n\r, \n, and \r.

Returns:

  • (::Regexp)

/\r\n|\n\r|\n|\r/
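
As a quick illustration of what the pattern accepts, the sketch below applies the same regular expression to a string that mixes all four line endings. The local variable names are illustrative only; the pattern itself is copied from the constant above.

```ruby
# Same pattern as LINE_MATCHER, held in a local variable for this sketch.
line_matcher = /\r\n|\n\r|\n|\r/

text = "a\r\nb\nc\rd\n\re"

# Two-character endings are listed first in the alternation, so "\r\n" and
# "\n\r" are each consumed as a single line ending.
endings = text.scan(line_matcher)
lines   = text.split(line_matcher)
```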

Instance Attribute Details

#file ⇒ ::String (readonly)

The file of the scanner. This can be overwritten to provide a descriptor for the file.

Returns:

  • (::String)


13
14
15
# File 'lib/yoga/scanner.rb', line 13

def file
  @file
end

Instance Method Details

#call {|token| ... } ⇒ self
#call ⇒ ::Enumerable<Scanner::Token>

Overloads:

  • #call {|token| ... } ⇒ self

    For every token that is scanned, the block is yielded to.

    Yield Parameters:

    • token (Scanner::Token)

    Returns:

    • (self)
  • #call ⇒ ::Enumerable<Scanner::Token>

    Returns an enumerable over the tokens in the scanner.

    Returns:

    • (::Enumerable<Scanner::Token>)

# File 'lib/yoga/scanner.rb', line 36

def call
  return to_enum(:call) unless block_given?
  @scanner = StringScanner.new(@source)
  @line = 1

  until @scanner.eos?
    value = scan
    yield value unless value == true || !value
  end

  yield eof_token
  self
end
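
To show how the two overloads behave, here is a minimal standalone sketch that mirrors the shape of #call using only StringScanner from the standard library. MiniScanner, its scan rules, and the array-based tokens are hypothetical stand-ins, not part of Yoga.

```ruby
require "strscan"

# Hypothetical stand-in for a scanner class; tokens are plain [kind, source] arrays.
class MiniScanner
  def initialize(source)
    @source = source
  end

  # Same structure as #call: returns an enumerator without a block,
  # yields each token and returns self with one.
  def call
    return to_enum(:call) unless block_given?
    @scanner = StringScanner.new(@source)

    until @scanner.eos?
      value = scan
      # true means "matched, but emit nothing" (whitespace here).
      yield value unless value == true || !value
    end

    yield [:EOF, ""]
    self
  end

  private

  # Returns one token per call, or true for skipped input.
  def scan
    if @scanner.scan(/\s+/) then true
    elsif @scanner.scan(/\d+/) then [:NUM, @scanner[0]]
    elsif @scanner.scan(/[a-z]+/) then [:IDENT, @scanner[0]]
    else raise "unexpected character at #{@scanner.charpos}"
    end
  end
end

# Without a block, tokens are produced lazily, one per #next.
tokens = MiniScanner.new("foo 42").call
first  = tokens.next

# The enumerator can also be drained in one go.
all = MiniScanner.new("foo 42").call.to_a
```

Because the enumerator only advances when asked, a parser can pull one token at a time and stop early without scanning the rest of the input.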

#current_line ⇒ ::Numeric (protected)

Returns the current line number, starting at 1. The count is cached in the @line instance variable and incremented whenever a match contains a line ending, instead of being recomputed from the source on each call.

Returns:

  • (::Numeric)


# File 'lib/yoga/scanner.rb', line 141

def current_line
  # @scanner.string[0..@scanner.charpos].scan(/\A|\r\n|\n\r|\n|\r/).size
  @line
end

#emit(kind, source = @scanner[0]) ⇒ Yoga::Token (protected)

Creates a scanner token with the given name and source. This grabs the location using #location, setting the size to the size of the source text. The source is frozen before initializing the token.

Examples:

emit(:<, "<") # => #<Yoga::Token kind=:< source="<">

Returns:

  • (Yoga::Token)

# File 'lib/yoga/scanner.rb', line 90

def emit(kind, source = @scanner[0])
  Token.new(kind.freeze, source.freeze, location(source.length))
end

#eof_token ⇒ Yoga::Token (protected)

Returns a token that denotes that the scanner is done scanning.

Returns:

  • (Yoga::Token)

# File 'lib/yoga/scanner.rb', line 160

def eof_token
  emit(:EOF, "")
end

#initialize(source, file = "<anon>") ⇒ Object

Initializes the scanner with the given source. Once the source is set, it shouldn't be changed.

Parameters:

  • source (::String)

    The source.

  • file (::String) (defaults to: "<anon>")

    The file the scanner comes from.



# File 'lib/yoga/scanner.rb', line 20

def initialize(source, file = "<anon>")
  @source = source
  @file = file
  @line = 1
  @last_line_at = 0
end

#location(size = 0) ⇒ Yoga::Location (protected)

Returns a location at the scanner's current position. If a size is given, the column range starts that many columns before the current position and ends at the current position.

Examples:

@scanner.string # => "hello"
@line # => 1
@scanner.charpos # => 5
location # => #<Yoga::Location <anon>:1.6>
location(5) # => #<Yoga::Location <anon>:1.1-6>

Parameters:

  • size (::Numeric) (defaults to: 0)

    The size of the token.

Returns:

  • (Yoga::Location)

# File 'lib/yoga/scanner.rb', line 77

def location(size = 0)
  start = (@scanner.charpos - @last_line_at) + 1
  column = (start - size)..start
  Location.new(file, current_line, column)
end
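
The column arithmetic can be checked in isolation. The sketch below reproduces the numbers from the example above using a bare StringScanner; the local variables stand in for the scanner's @last_line_at state and are assumptions made for illustration.

```ruby
require "strscan"

scanner = StringScanner.new("hello")
scanner.scan(/hello/)     # charpos is now 5

last_line_at = 0          # assumed: no line ending has been seen yet

# Columns are 1-based, so the position just past "hello" is column 6.
start = (scanner.charpos - last_line_at) + 1

point = (start - 0)..start   # location     -> a single column
span  = (start - 5)..start   # location(5)  -> the whole token
```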

#match(matcher, kind = :"#{matcher}") ⇒ Yoga::Token? (protected)

Attempts to match the given matcher at the current position. The matcher can be a symbol, a regular expression, or any object that responds to #to_s:

  • A symbol is converted to a regular expression that matches its escaped text, followed by a negative lookahead for letters, so that a keyword does not match the start of a longer identifier (see #symbol_negative_assertion).
  • A regular expression is used as is.
  • Anything else is converted with #to_s and passed to Regexp.escape.

If the text matches at the current position, a token of the given kind is returned (or true if kind is falsy); otherwise, nil is returned. If the matched text contains a line ending, the scanner automatically updates its line and column information.

Parameters:

  • matcher (::Symbol, ::Regexp, #to_s)
  • kind (::Symbol) (defaults to: :"#{matcher}")

    The kind of token to emit. This defaults to a symbol version of the matcher.

Returns:

  • (Yoga::Token, true, nil)

# File 'lib/yoga/scanner.rb', line 114

def match(matcher, kind = :"#{matcher}")
  matcher = case matcher
            when ::Symbol then /#{::Regexp.escape(matcher.to_s)}#{symbol_negative_assertion}/
            when ::Regexp then matcher
            else /#{::Regexp.escape(matcher.to_s)}/
            end

  return unless @scanner.scan(matcher)

  update_line_information
  ((kind && emit(kind)) || true)
end
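
The coercion step can be tried on its own. The following sketch lifts the case expression out of #match into a hypothetical helper, coerce_matcher, with the default assertion passed as a parameter; it is an illustration, not part of the Yoga API.

```ruby
# Hypothetical helper that repeats the case expression from #match.
def coerce_matcher(matcher, assertion = "(?![a-zA-Z])")
  case matcher
  when ::Symbol then /#{::Regexp.escape(matcher.to_s)}#{assertion}/
  when ::Regexp then matcher
  else /#{::Regexp.escape(matcher.to_s)}/
  end
end

keyword = coerce_matcher(:module)  # symbol: escaped text plus lookahead
literal = coerce_matcher("a+b")    # string: escaped, so "+" is literal
pattern = coerce_matcher(/x+/)     # regexp: returned unchanged
```

Inspecting keyword.source and literal.source shows the generated patterns, which is a convenient way to debug a matcher that is not behaving as expected.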

#match_line(kind = false) ⇒ Boolean (protected)

Matches a line. This is separate in order to allow internal logic, such as line counting and caching, to be performed.

Returns:

  • (Boolean)

    If the line was matched.



# File 'lib/yoga/scanner.rb', line 131

def match_line(kind = false)
  match(LINE_MATCHER, kind)
end

#scan ⇒ Yoga::Token, true

This method is abstract.

Please implement this method in order to make the class a scanner.

The scanning method. This should return one of two values: a Token, or true. nil should never be returned. This performs an incremental scan of the document; it returns one token at a time. If something matched, but should not emit a token, true should be returned. The implementing class should mark this as private or protected.

Returns:

  • (Yoga::Token, true)

# File 'lib/yoga/scanner.rb', line 60

def scan
  fail NotImplementedError, "Please implement #{self.class}#scan"
end

#symbol_negative_assertion ⇒ #to_s (protected)

The negative assertion appended when a symbol matcher is converted to a regular expression. It prevents a keyword from matching the start of a longer identifier. For example, if module is a keyword and moduleA is an identifier, this assertion allows the following expression to match each one correctly: match(:module) || match(/[a-zA-Z]+/, :IDENT).

Returns:

  • (#to_s)


# File 'lib/yoga/scanner.rb', line 153

def symbol_negative_assertion
  "(?![a-zA-Z])"
end
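
The effect of the assertion is easiest to see with the keyword and identifier case described above. This standalone sketch uses StringScanner directly; the classify lambda and its array results are hypothetical and only stand in for the two match calls.

```ruby
require "strscan"

assertion = "(?![a-zA-Z])"          # same string the method returns
keyword   = /module#{assertion}/    # what match(:module) would build
ident     = /[a-zA-Z]+/

# Hypothetical helper: try the keyword first, then fall back to an identifier.
classify = lambda do |text|
  scanner = StringScanner.new(text)
  if scanner.scan(keyword) then [:module, scanner[0]]
  elsif scanner.scan(ident) then [:IDENT, scanner[0]]
  end
end

as_keyword = classify.call("module Foo")  # followed by a space: keyword
as_ident   = classify.call("moduleA")     # followed by a letter: identifier
```

Note that the default assertion only rejects letters. A grammar whose identifiers may contain digits or underscores would presumably need to override the method with a wider character class.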

#update_line_information ⇒ void (protected)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

This method returns an undefined value.

Updates the line information for the scanner. This is called for any successful matches.



# File 'lib/yoga/scanner.rb', line 169

def update_line_information
  return unless (lines = @scanner[0].scan(LINE_MATCHER)).any?
  @line += lines.size
  @last_line_at =
    @scanner.string.rindex(LINE_MATCHER, @scanner.charpos) + 1
end