Class: Mustermann::StringScanner

Inherits:
Object
Defined in:
lib/mustermann/string_scanner.rb

Overview

Note:

This structure is not thread-safe; do not scan on the same StringScanner instance concurrently. Even if it were thread-safe, scanning concurrently would probably lead to unwanted behaviour.

Class inspired by Ruby’s StringScanner to scan an input string using multiple patterns.

Examples:

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new("here is our example string")

scanner.scan("here") # => "here"
scanner.getch        # => " "

if scanner.scan(":verb our")
  scanner.scan(:noun, capture: :word)
  scanner[:verb]  # => "is"
  scanner[:noun]  # => "example"
end

scanner.rest # => "string"
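
For orientation, the standard-library StringScanner that inspired this class can be exercised as below. This is a sketch of the stdlib class only (loaded with `require 'strscan'`), not of Mustermann::StringScanner: the stdlib version takes Regexp objects and returns plain strings, whereas the class documented here accepts Mustermann patterns and returns ScanResult objects. The helper name `stdlib_walk` is made up for this illustration.

```ruby
require 'strscan'

# Walks a string with the standard-library StringScanner (not Mustermann's)
# and collects what each call returned.
def stdlib_walk(input)
  scanner = StringScanner.new(input)
  first   = scanner.scan(/\w+/)   # matched text, or nil if nothing matches at the cursor
  space   = scanner.getch         # the next single character
  peeked  = scanner.check(/\w+/)  # looks ahead without moving the cursor
  { first: first, space: space, peeked: peeked, rest: scanner.rest }
end

walk = stdlib_walk("here is our example string")
walk[:first]  # => "here"
walk[:space]  # => " "
walk[:peeked] # => "is"
walk[:rest]   # => "is our example string"
```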

Defined Under Namespace

Classes: ScanResult

Constant Summary

ScanError = Class.new(::ScanError)

Exception raised if a scan/unscan operation cannot be performed.

Constructor Details

#initialize(string = "", **pattern_options) ⇒ StringScanner

Returns a new instance of StringScanner.

Examples:

with different default type

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new("foo/bar/baz", type: :shell)
scanner.scan('*')     # => "foo"
scanner.scan('**/*')  # => "/bar/baz"

Parameters:

  • string (String) (defaults to: "")

    the string to scan

  • pattern_options (Hash)

    default options used for #scan



# File 'lib/mustermann/string_scanner.rb', line 133

def initialize(string = "", **pattern_options)
  @pattern_options = pattern_options
  @string          = String(string).dup
  reset
end
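
The constructor stores `String(string).dup` rather than the argument itself. The sketch below isolates that one step in a hypothetical helper, `defensive_copy`, to show what it implies in plain Ruby: non-string input is converted, a frozen input yields an appendable buffer, and later appends do not touch the caller's string.

```ruby
# Hypothetical stand-in for the copy step in the constructor shown above.
def defensive_copy(input)
  String(input).dup
end

original = "foo".freeze
buffer   = defensive_copy(original)
buffer << "bar"      # works: dup returns an unfrozen copy
buffer               # => "foobar"
original             # => "foo", the caller's string is untouched

defensive_copy(nil)  # => "" (Kernel#String converts nil to an empty string)
defensive_copy(:sym) # => "sym"
```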

Instance Attribute Details

#params ⇒ Hash (readonly)

Params from all previous matches made with #scan and #scan_until, but not with #check and #check_until. Changes can be reverted with #unscan, and the params can be cleared completely via #reset.

Returns:

  • (Hash)

    current params



# File 'lib/mustermann/string_scanner.rb', line 118

def params
  @params
end

#pattern_options ⇒ Hash (readonly)

Returns default pattern options used for #scan and similar methods.

Returns:

  • (Hash)

    default pattern options used for #scan and similar methods

# File 'lib/mustermann/string_scanner.rb', line 111

def pattern_options
  @pattern_options
end

#position ⇒ Integer Also known as: pos

Returns current scan position on the input string.

Returns:

  • (Integer)

    current scan position on the input string



# File 'lib/mustermann/string_scanner.rb', line 121

def position
  @position
end

Class Method Details

.cache_size ⇒ Integer

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns number of cached patterns.

Returns:

  • (Integer)

    number of cached patterns

# File 'lib/mustermann/string_scanner.rb', line 45

def self.cache_size
  PATTERN_CACHE.size
end

.clear_cache ⇒ Object

Patterns created by #scan will be globally cached, since we assume that there is a finite number of different patterns used and that they are more likely to be reused than not. This method allows clearing the cache.

See Also:

  • PatternCache


# File 'lib/mustermann/string_scanner.rb', line 38

def self.clear_cache
  PATTERN_CACHE.clear
end
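
The idea of a global pattern cache can be sketched with a plain Hash. This is an illustration only and makes no claim about how PatternCache is implemented: `DEMO_CACHE` and `demo_pattern` are hypothetical names, and a Regexp stands in for a compiled Mustermann pattern.

```ruby
# Illustrative only: a process-wide memo keyed by pattern source and options.
DEMO_CACHE = {}

def demo_pattern(source, **options)
  # Compile once per distinct (source, options) pair, then reuse the object.
  DEMO_CACHE[[source, options]] ||= Regexp.new(Regexp.escape(source))
end

first  = demo_pattern("here", type: :shell)
second = demo_pattern("here", type: :shell)
first.equal?(second) # => true, the second call reuses the cached object
DEMO_CACHE.size      # => 1
DEMO_CACHE.clear     # analogous to clear_cache: the next call compiles again
```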

Instance Method Details

#<<(string) ⇒ Mustermann::StringScanner

Appends the given string to the string being scanned.

Examples:

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new
scanner << "foo"
scanner.scan(/.+/) # => "foo"

Parameters:

  • string (String)

    will be appended

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 236

def <<(string)
  @string << string
  self
end

#[](key) ⇒ Object

Shorthand for accessing #params. Accepts symbols as keys.



# File 'lib/mustermann/string_scanner.rb', line 270

def [](key)
  params[key.to_s]
end
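
Because the lookup calls `key.to_s`, the params hash is evidently keyed by strings, and `#[]` lets callers use symbols anyway. A minimal sketch of that lookup on a plain Hash, with `lookup` as a hypothetical stand-in for the method:

```ruby
# Hypothetical stand-in for #[]: normalise the key to a String before the lookup.
def lookup(params, key)
  params[key.to_s]
end

params = { "verb" => "is" }
lookup(params, :verb)  # => "is"
lookup(params, "verb") # => "is"
params[:verb]          # => nil, the raw hash has no symbol keys
```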

#beginning_of_line? ⇒ true, false

Returns whether or not the current position is at the start of a line.

Returns:

  • (true, false)

    whether or not the current position is at the start of a line



# File 'lib/mustermann/string_scanner.rb', line 247

def beginning_of_line?
  @position == 0 or @string[@position - 1] == "\n"
end
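
The same test can be tried on a bare string and index. `line_start?` is a hypothetical free-standing version of the check above, written with `||` instead of `or`:

```ruby
# True at index 0 or directly after a newline character.
def line_start?(string, position)
  position == 0 || string[position - 1] == "\n"
end

text = "ab\ncd"
line_start?(text, 0) # => true  (start of the string)
line_start?(text, 3) # => true  (the character before index 3 is "\n")
line_start?(text, 1) # => false
```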

#check(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at the current position.

Does not affect #position or #params.

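Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the result of the match, or nil if the pattern does not match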



# File 'lib/mustermann/string_scanner.rb', line 196

def check(pattern, **options)
  params, length = create_pattern(pattern, **options).peek_params(rest)
  ScanResult.new(self, @position, length, params) if params
end

#check_until(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at any position after the current position.

Does not affect #position or #params.

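Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the result of the match, or nil if the pattern does not match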



# File 'lib/mustermann/string_scanner.rb', line 207

def check_until(pattern, **options)
  check_until_with_prefix(pattern, **options).first
end

#eos? ⇒ true, false

Returns whether or not the end of the string has been reached.

Returns:

  • (true, false)

    whether or not the end of the string has been reached



# File 'lib/mustermann/string_scanner.rb', line 242

def eos?
  @position >= @string.size
end

#getch ⇒ Mustermann::StringScanner::ScanResult?

Reads a single character and advances the #position by one.

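Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the character read, or nil if the end of the string has been reached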



# File 'lib/mustermann/string_scanner.rb', line 222

def getch
  track_result ScanResult.new(self, @position, 1) unless eos?
end

#peek(length = 1) ⇒ String

Peeks at a number of not yet scanned characters without advancing the #position.

Parameters:

  • length (Integer) (defaults to: 1)

    how many characters to look at

Returns:

  • (String)

    the substring



# File 'lib/mustermann/string_scanner.rb', line 265

def peek(length = 1)
  @string[@position, length]
end
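
Since the method is a plain `String#[]` slice with a start and a length, its edge cases follow Ruby's slicing rules. The sketch below shows them on a bare string; `peek_at` is a hypothetical helper, not part of the class.

```ruby
# Same slice as in #peek, with the string and position passed in explicitly.
def peek_at(string, position, length = 1)
  string[position, length]
end

peek_at("hello", 1, 3)  # => "ell"
peek_at("hello", 3, 10) # => "lo" (the slice stops at the end of the string)
peek_at("hello", 5)     # => ""  (position exactly at the end)
peek_at("hello", 6)     # => nil (position beyond the end)
```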

#reset ⇒ Mustermann::StringScanner

Resets the #position to the start and clears all #params.

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 141

def reset
  @position = 0
  @params   = {}
  @history  = []
  self
end

#rest ⇒ String

Returns outstanding string not yet matched, empty string at end of input string.

Returns:

  • (String)

    outstanding string not yet matched, empty string at end of input string



# File 'lib/mustermann/string_scanner.rb', line 252

def rest
  @string[@position..-1] || ""
end
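
The `|| ""` fallback matters only when the position lies beyond the end of the string, where a range slice returns nil instead of an empty string. A small sketch of that behaviour on a plain string, using a hypothetical `remaining` helper:

```ruby
# Same expression as in #rest, with the string and position passed in explicitly.
def remaining(string, position)
  string[position..-1] || ""
end

remaining("abc", 1) # => "bc"
remaining("abc", 3) # => ""  (the slice itself already returns "")
remaining("abc", 4) # => ""  (the raw slice would be nil)
"abc"[4..-1]        # => nil
```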

#rest_size ⇒ Integer

Returns number of characters remaining to be scanned.

Returns:

  • (Integer)

    number of characters remaining to be scanned



# File 'lib/mustermann/string_scanner.rb', line 257

def rest_size
  @position > size ? 0 : size - @position
end

#scan(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at the current position.

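If it does, it advances the current #position to the end of the substring and merges any params parsed from the substring into #params.

Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the result of the match, or nil if the pattern does not match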



# File 'lib/mustermann/string_scanner.rb', line 162

def scan(pattern, **options)
  track_result check(pattern, **options)
end

#scan_until(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at any position after the current position.

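If it does, it advances the current #position to the end of the substring and merges any params parsed from the substring into #params.

Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the result of the match, or nil if the pattern does not match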



# File 'lib/mustermann/string_scanner.rb', line 173

def scan_until(pattern, **options)
  result, prefix = check_until_with_prefix(pattern, **options)
  track_result(prefix, result)
end
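
The difference between checking and scanning "until" a match can be seen with the standard-library StringScanner, which offers methods of the same names. This sketch uses the stdlib class and Regexp patterns only; what the Mustermann ScanResult contains is not shown in this section, so nothing here should be read as its exact return value. `until_demo` is a hypothetical helper.

```ruby
require 'strscan'

# Runs check_until and scan_until on a fresh stdlib scanner and records
# the return values together with the cursor position after each call.
def until_demo(input, regexp)
  scanner     = StringScanner.new(input)
  peeked      = scanner.check_until(regexp) # looks ahead, cursor stays put
  after_check = scanner.pos
  scanned     = scanner.scan_until(regexp)  # same match, cursor moves past it
  [peeked, after_check, scanned, scanner.pos]
end

peeked, after_check, scanned, after_scan = until_demo("foo bar baz", /bar/)
peeked      # => "foo bar" (skipped prefix plus the match)
after_check # => 0
scanned     # => "foo bar"
after_scan  # => 7
```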

#size ⇒ Integer

Returns size of the input string.

Returns:

  • (Integer)

    size of the input string



# File 'lib/mustermann/string_scanner.rb', line 287

def size
  @string.size
end

#terminate ⇒ Mustermann::StringScanner

Moves the position to the end of the input string.

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 150

def terminate
  track_result ScanResult.new(self, @position, size - @position)
  self
end

#to_h ⇒ Hash

Params from all previous matches made with #scan and #scan_until, but not with #check and #check_until. Changes can be reverted with #unscan, and the params can be cleared completely via #reset.

Returns:

  • (Hash)

    current params



# File 'lib/mustermann/string_scanner.rb', line 275

def to_h
  params.dup
end
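
Returning `params.dup` means callers receive a copy they can modify without changing the scanner's own hash. The sketch below shows this on a plain Hash, with `snapshot` as a hypothetical stand-in for the method; note that `dup` is a shallow copy, so the values themselves are shared.

```ruby
# Hypothetical stand-in for #to_h: hand out a shallow copy of the hash.
def snapshot(params)
  params.dup
end

params = { "verb" => "is" }
copy   = snapshot(params)
copy["noun"] = "example"

params.key?("noun")                 # => false, the original hash is unchanged
copy["verb"].equal?(params["verb"]) # => true, values are shared (shallow copy)
```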

#to_s ⇒ String

Returns the input string.

Returns:

  • (String)

    the input string

# File 'lib/mustermann/string_scanner.rb', line 282

def to_s
  @string.dup
end

#unscan ⇒ Mustermann::StringScanner

Reverts the last operation that advanced the position.

Operations advancing the position: #terminate, #scan, #scan_until, #getch.

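Returns:

  • (Mustermann::StringScanner)

    the scanner itself

Raises:

  • (ScanError)

    if there is no previous operation to revert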



# File 'lib/mustermann/string_scanner.rb', line 182

def unscan
  raise ScanError, 'unscan failed: previous match record not exist' if @history.empty?
  previous = @history[0..-2]
  reset
  previous.each { |r| track_result(*r) }
  self
end
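
The method reverts a step by resetting the scanner and replaying every recorded step except the last. The sketch below reproduces that replay strategy in a stripped-down, hypothetical class, `ReplayScanner`, which only tracks a position and step lengths. The real `track_result` is not shown in this section; presumably it also rebuilds #params, which this sketch leaves out.

```ruby
# Minimal illustration of "reset, then replay all but the last step".
class ReplayScanner
  ReplayError = Class.new(StandardError) # hypothetical error class

  attr_reader :position

  def initialize(string)
    @string = string
    reset
  end

  def reset
    @position = 0
    @history  = []
    self
  end

  # Moves forward by +length+ characters and records the step.
  def advance(length)
    @history << length
    @position += length
    self
  end

  def unscan
    raise ReplayError, "nothing to unscan" if @history.empty?
    previous = @history[0..-2]                 # every step except the last one
    reset                                      # back to position 0, empty history
    previous.each { |length| advance(length) } # replay the remaining steps
    self
  end
end

scanner = ReplayScanner.new("abcdef")
scanner.advance(2).advance(3)
scanner.position        # => 5
scanner.unscan.position # => 2
scanner.unscan.position # => 0
```

Replaying from the start is simple and keeps derived state consistent, at the cost of work proportional to the number of recorded steps on every call.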