Class: ANTLR3::StringStream
- Inherits:
- Object
- Object
- ANTLR3::StringStream
- Includes:
- CharacterStream
- Defined in:
- lib/antlr3/streams.rb
Overview
A StringStream’s purpose is to wrap the basic, naked text input of a recognition system. Like all other stream types, it provides serial navigation of the input; a recognizer can arbitrarily step forward and backward through the stream’s symbols as it requires. StringStream and its subclasses are the main way to feed text input into an ANTLR Lexer for token processing.
The stream’s symbols of interest, of course, are character values. Thus, the #peek method returns the integer character value at look-ahead position k, and the #look method returns the character value as a String. Streams also track various pieces of information such as the line and column numbers at the current position.
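As a rough illustration of the two views described above, the sketch below uses plain Ruby string operations rather than the library itself. The variable names and the sample text are made up for this example.

```ruby
# Plain-Ruby illustration; no ANTLR3 classes are involved.
text = "abc"
data = text.codepoints   # integer view, the kind of value #peek is described as returning
position = 0             # hypothetical cursor at the start of the input

p data[position]   # 97
p text[position]   # "a"
```

The integer form is convenient for comparisons inside a lexer, while the String form is easier to read when debugging.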
Note About Text Encoding
This version of the runtime library primarily targets Ruby 1.8, which does not have strong built-in support for multi-byte character encodings. Thus, characters are assumed to be represented by a single byte – an integer between 0 and 255. Ruby 1.9 does provide built-in encoding support for multi-byte characters, but this library does not currently provide any streams that handle non-ASCII encodings. However, encoding-savvy recognition code is a future development goal for this project.
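To make the single-byte assumption concrete, this small sketch (plain Ruby 1.9 or later, independent of the library) contrasts the byte view of a string with its code point view. The sample string is made up for illustration.

```ruby
# "n" followed by U+00E9 (e with acute accent), written with an escape so the
# example does not depend on the encoding of the source file.
text = "n\u00e9"

p text.bytes       # [110, 195, 169]  (the accented letter takes two bytes in UTF-8)
p text.codepoints  # [110, 233]       (one integer per character)
```

A stream that indexes bytes and a stream that indexes code points would disagree about positions after the first non-ASCII character, which is why the distinction matters for recognizers.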
Direct Known Subclasses
Constant Summary
- NEWLINE = ?\n.ord
Constants included from Constants
Constants::BUILT_IN_TOKEN_NAMES, Constants::DEFAULT, Constants::DOWN, Constants::EOF, Constants::EOF_TOKEN, Constants::EOR_TOKEN_TYPE, Constants::HIDDEN, Constants::INVALID, Constants::INVALID_NODE, Constants::INVALID_TOKEN, Constants::MEMO_RULE_FAILED, Constants::MEMO_RULE_UNKNOWN, Constants::MIN_TOKEN_TYPE, Constants::SKIP_TOKEN, Constants::UP
Instance Attribute Summary
-
#column ⇒ Object
readonly
the current character position within the current line, indexed upward from 0.
-
#data ⇒ Object
readonly
the entire string that is wrapped by the stream.
-
#line ⇒ Object
readonly
the current line number of the input, indexed upward from 1.
-
#name ⇒ Object
(also: #source_name)
the name associated with the stream, usually a file name; defaults to "(string)".
-
#position ⇒ Object
(also: #index, #character_index)
readonly
current integer character index of the stream.
-
#string ⇒ Object
readonly
Returns the value of attribute string.
Instance Method Summary
-
#<<(k) ⇒ Object
operator style look-behind.
-
#[](start, *args) ⇒ Object
identical to String#[].
-
#beginning_of_line? ⇒ Boolean
Returns true if the stream appears to be at the beginning of a new line.
-
#beginning_of_string? ⇒ Boolean
(also: #bof?)
Returns true if the stream appears to be at the beginning of a stream (position = 0).
-
#consume ⇒ Object
advance the stream by one character; returns the character consumed.
-
#end_of_line? ⇒ Boolean
Returns true if the stream appears to be at the end of a line (the current character is a newline).
-
#end_of_string? ⇒ Boolean
(also: #eof?)
Returns true if the stream has been exhausted.
-
#initialize(data, options = {}) ⇒ StringStream
constructor
creates a new StringStream object where data is the string data to stream.
-
#inspect(before_chars = 6, after_chars = 10) ⇒ Object
customized object inspection that shows: the stream class; the stream’s location in index / line:column format; before_chars characters before the cursor (6 characters by default); after_chars characters after the cursor (10 characters by default).
-
#last_marker ⇒ Object
the last marker value created by a call to #mark.
-
#look(k = 1) ⇒ Object
(also: #>>)
identical to #peek, except it returns the character value as a String.
-
#mark ⇒ Object
record the current stream location parameters in the stream’s marker table and return an integer-valued bookmark that may be used to restore the stream’s position with the #rewind method.
-
#mark_depth ⇒ Object
the total number of markers currently in existence.
-
#peek(k = 1) ⇒ Object
return the character at look-ahead distance k as an integer.
-
#release(marker = @markers.length - 1) ⇒ Object
let go of the bookmark data for the marker and all marker values created after the marker.
-
#reset ⇒ Object
rewinds the stream back to the start and clears out any existing marker entries.
-
#rewind(marker = @markers.length - 1, release = true) ⇒ Object
restore the stream to an earlier location recorded by #mark.
-
#seek(index) ⇒ Object
jump to the absolute position value given by index.
-
#size ⇒ Object
(also: #length)
-
#substring(start, stop) ⇒ Object
return the string slice between position start and stop.
-
#through(k) ⇒ Object
return a substring around the stream cursor at a distance k: if k >= 0, return the next k characters; if k < 0, return the previous |k| characters.
Constructor Details
#initialize(data, options = {}) ⇒ StringStream
creates a new StringStream object where data is the string data to stream. accepts the following options in a symbol-to-value hash:
- :file or :name
  the (file) name to associate with the stream; default: '(string)'
- :line
  the initial line number; default: 1
- :column
  the initial column number; default: 0
# File 'lib/antlr3/streams.rb', line 397
def initialize( data, options = {} ) # for 1.9
  @string = data.to_s.encode( Encoding::UTF_8 ).freeze
  @data = @string.codepoints.to_a.freeze
  @position = options.fetch :position, 0
  @line = options.fetch :line, 1
  @column = options.fetch :column, 0
  @markers = []
  @name ||= options[ :file ] || options[ :name ] # || '(string)'
  mark
end
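The option handling above can be sketched in isolation. The helper below, `stream_settings`, is a hypothetical name invented for this example and is not part of the library. It also applies the documented '(string)' default directly, whereas the listing above leaves that fallback commented out, so treat that part as an assumption.

```ruby
# Hypothetical helper for illustration; not part of ANTLR3.
# Resolves constructor-style options to their documented defaults.
def stream_settings(options = {})
  {
    position: options.fetch(:position, 0),
    line:     options.fetch(:line, 1),
    column:   options.fetch(:column, 0),
    # :file takes precedence over :name; the final fallback is an assumption
    name:     options[:file] || options[:name] || '(string)'
  }
end

p stream_settings
p stream_settings(file: 'input.txt', line: 10)
```

Using `fetch` with a default means an explicitly supplied value always wins, while a missing key falls back quietly.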
Instance Attribute Details
#column ⇒ Object (readonly)
the current character position within the current line, indexed upward from 0
# File 'lib/antlr3/streams.rb', line 378
def column
  @column
end
#data ⇒ Object (readonly)
the entire string that is wrapped by the stream
# File 'lib/antlr3/streams.rb', line 385
def data
  @data
end
#line ⇒ Object (readonly)
the current line number of the input, indexed upward from 1
# File 'lib/antlr3/streams.rb', line 375
def line
  @line
end
#name ⇒ Object Also known as: source_name
the name associated with the stream, usually a file name; defaults to "(string)"
# File 'lib/antlr3/streams.rb', line 382
def name
  @name
end
#position ⇒ Object (readonly) Also known as: index, character_index
current integer character index of the stream
# File 'lib/antlr3/streams.rb', line 372
def position
  @position
end
#string ⇒ Object (readonly)
Returns the value of attribute string.
# File 'lib/antlr3/streams.rb', line 386
def string
  @string
end
Instance Method Details
#<<(k) ⇒ Object
operator style look-behind
# File 'lib/antlr3/streams.rb', line 521
def <<( k )
  self >> -k # delegate to #look (aliased as #>>); calling self << -k here would recurse without end
end
#[](start, *args) ⇒ Object
identical to String#[]
# File 'lib/antlr3/streams.rb', line 659
def []( start, *args )
  @string[ start, *args ]
end
#beginning_of_line? ⇒ Boolean
Returns true if the stream appears to be at the beginning of a new line. This is an extra utility method for use inside lexer actions if needed.
# File 'lib/antlr3/streams.rb', line 534
def beginning_of_line?
  @position.zero? or @data[ @position - 1 ] == NEWLINE
end
#beginning_of_string? ⇒ Boolean Also known as: bof?
Returns true if the stream appears to be at the beginning of a stream (position = 0). This is an extra utility method for use inside lexer actions if needed.
# File 'lib/antlr3/streams.rb', line 558
def beginning_of_string?
  @position == 0
end
#consume ⇒ Object
advance the stream by one character; returns the character consumed
# File 'lib/antlr3/streams.rb', line 478
def consume
  c = @data[ @position ] || EOF
  if @position < @data.length
    @column += 1
    if c == NEWLINE
      @line += 1
      @column = 0
    end
    @position += 1
  end
  return( c )
end
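The line and column bookkeeping performed by #consume can be sketched on its own. The function below, `track`, and the constant `NEWLINE_CODE` are hypothetical names made up for this example. The sketch walks a whole string and reports where the cursor would end up, assuming the stream starts at line 1, column 0.

```ruby
# Hypothetical standalone sketch; mirrors the counting rules of the listing above.
NEWLINE_CODE = "\n".ord

def track(text)
  line, column = 1, 0
  text.each_codepoint do |c|
    column += 1
    if c == NEWLINE_CODE
      # consuming a newline moves to the start of the next line
      line += 1
      column = 0
    end
  end
  [line, column]
end

p track("ab\ncd")  # [2, 2]
```

Note that the newline itself is counted as the last character of the line it ends, so the column resets only after it has been consumed.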
#end_of_line? ⇒ Boolean
Returns true if the stream appears to be at the end of a line (the current character is a newline). This is an extra utility method for use inside lexer actions if needed.
# File 'lib/antlr3/streams.rb', line 542
def end_of_line?
  @data[ @position ] == NEWLINE #if @position < @data.length
end
#end_of_string? ⇒ Boolean Also known as: eof?
Returns true if the stream has been exhausted. This is an extra utility method for use inside lexer actions if needed.
# File 'lib/antlr3/streams.rb', line 550
def end_of_string?
  @position >= @data.length
end
#inspect(before_chars = 6, after_chars = 10) ⇒ Object
customized object inspection that shows:
- the stream class
- the stream’s location in index / line:column format
- before_chars characters before the cursor (6 characters by default)
- after_chars characters after the cursor (10 characters by default)
# File 'lib/antlr3/streams.rb', line 638
def inspect( before_chars = 6, after_chars = 10 )
  before = through( -before_chars ).inspect
  @position - before_chars > 0 and before.insert( 0, '... ' )
  after = through( after_chars ).inspect
  @position + after_chars + 1 < @data.length and after << ' ...'
  location = "#@position / line #@line:#@column"
  "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
end
#last_marker ⇒ Object
the last marker value created by a call to #mark
# File 'lib/antlr3/streams.rb', line 597
def last_marker
  @markers.length - 1
end
#look(k = 1) ⇒ Object Also known as: >>
identical to #peek, except it returns the character value as a String
# File 'lib/antlr3/streams.rb', line 411
def look( k = 1 ) # for 1.9
  k == 0 and return nil
  k += 1 if k < 0
  index = @position + k - 1
  index < 0 and return nil
  @string[ index ]
end
#mark ⇒ Object
record the current stream location parameters in the stream’s marker table and return an integer-valued bookmark that may be used to restore the stream’s position with the #rewind method. This method is used to implement backtracking.
# File 'lib/antlr3/streams.rb', line 570
def mark
  state = [ @position, @line, @column ].freeze
  @markers << state
  return @markers.length - 1
end
#mark_depth ⇒ Object
the total number of markers currently in existence
# File 'lib/antlr3/streams.rb', line 590
def mark_depth
  @markers.length
end
#peek(k = 1) ⇒ Object
return the character at look-ahead distance k as an integer. k = 1 represents the current character. k greater than 1 represents upcoming characters. A negative value of k returns previous characters consumed, where k = -1 is the last character consumed. k = 0 is not a meaningful distance and returns nil
# File 'lib/antlr3/streams.rb', line 497
def peek( k = 1 )
  k == 0 and return nil
  k += 1 if k < 0
  index = @position + k - 1
  index < 0 and return nil
  @data[ index ] or EOF
end
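The index arithmetic for positive and negative k is easy to misread, so here is a standalone sketch of it. `peek_at` is a hypothetical function written for this example, and `EOF_VALUE = -1` is an assumption standing in for the library's EOF constant.

```ruby
# Hypothetical standalone sketch of the look-ahead arithmetic; not the library method.
EOF_VALUE = -1  # assumed end-of-input sentinel

def peek_at(data, position, k = 1)
  return nil if k == 0          # no character sits at distance 0
  k += 1 if k < 0               # shift so that k = -1 lands on position - 1
  index = position + k - 1
  return nil if index < 0       # looked behind the start of the input
  data[index] || EOF_VALUE      # past the end of the input
end

data = "abc".codepoints
p peek_at(data, 2, 1)    # 99  (current character, "c")
p peek_at(data, 2, -1)   # 98  (last character consumed, "b")
p peek_at(data, 2, -3)   # nil (before the start)
p peek_at(data, 2, 2)    # -1  (past the end)
```

The two failure cases are deliberately different: looking behind the start yields nil, while looking past the end yields the end-of-input value.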
#release(marker = @markers.length - 1) ⇒ Object
let go of the bookmark data for the marker and all marker values created after the marker.
# File 'lib/antlr3/streams.rb', line 605
def release( marker = @markers.length - 1 )
  marker.between?( 1, @markers.length - 1 ) or return
  @markers.pop( @markers.length - marker )
  return self
end
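To show which entries a release removes, the sketch below applies the same bounds check and `pop` arithmetic to a plain array. `release_from` is a hypothetical helper made up for this example, and the marker table contents are invented sample data.

```ruby
# Hypothetical standalone sketch; operates on a plain array of marker entries.
def release_from(markers, marker)
  # entry 0 (the initial location) and out-of-range values are left alone
  return markers unless marker.between?(1, markers.length - 1)
  markers.pop(markers.length - marker)  # drops the marker and everything after it
  markers
end

table = [[0, 1, 0], [3, 1, 3], [7, 2, 1], [9, 2, 3]]
p release_from(table.dup, 2).length  # 2
p release_from(table.dup, 0).length  # 4
```

Releasing marker 2 leaves entries 0 and 1, so any bookmark numbered 2 or higher is no longer valid afterwards.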
#reset ⇒ Object
rewinds the stream back to the start and clears out any existing marker entries
# File 'lib/antlr3/streams.rb', line 467
def reset
  initial_location = @markers.first
  @position, @line, @column = initial_location
  @markers.clear
  @markers << initial_location
  return self
end
#rewind(marker = @markers.length - 1, release = true) ⇒ Object
restore the stream to an earlier location recorded by #mark. If no marker value is provided, the last marker generated by #mark will be used.
# File 'lib/antlr3/streams.rb', line 580
def rewind( marker = @markers.length - 1, release = true )
  ( marker >= 0 and location = @markers[ marker ] ) or return( self )
  @position, @line, @column = location
  release( marker ) if release
  return self
end
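The mark and rewind pair is what makes backtracking possible, and the pattern can be sketched with a small stand-in class. `MarkerSketch` and its `advance` method are hypothetical and exist only for this example; the sketch tracks a location without holding any text, and it always releases on rewind.

```ruby
# Hypothetical stand-in for illustration only; not ANTLR3::StringStream.
class MarkerSketch
  attr_reader :position, :line, :column

  def initialize
    @position, @line, @column = 0, 1, 0
    @markers = []
    mark  # entry 0 records the initial location
  end

  # pretend to consume n characters on the current line
  def advance(n)
    @position += n
    @column += n
  end

  def mark
    @markers << [@position, @line, @column].freeze
    @markers.length - 1
  end

  def rewind(marker = @markers.length - 1)
    location = marker >= 0 ? @markers[marker] : nil
    return self unless location
    @position, @line, @column = location
    # drop this marker and any later ones, but never entry 0
    @markers.pop(@markers.length - marker) if marker.between?(1, @markers.length - 1)
    self
  end

  def mark_depth
    @markers.length
  end
end

stream = MarkerSketch.new
stream.advance(3)
bookmark = stream.mark   # 1
stream.advance(4)        # speculative work, position is now 7
stream.rewind(bookmark)
p stream.position        # 3
p stream.mark_depth      # 1
```

A recognizer would typically mark before trying an alternative, then rewind if the alternative fails to match.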
#seek(index) ⇒ Object
jump to the absolute position value given by index. note: if index is before the current position, the line and column attributes of the stream will probably be incorrect
# File 'lib/antlr3/streams.rb', line 616
def seek( index )
  index = index.bound( 0, @data.length ) # ensures index is within the stream's range
  if index > @position
    skipped = through( index - @position )
    if lc = skipped.count( "\n" ) and lc.zero?
      @column += skipped.length
    else
      @line += lc
      @column = skipped.length - skipped.rindex( "\n" ) - 1
    end
  end
  @position = index
  return nil
end
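The note about backward seeks is easier to see with numbers. The function below, `seek_location`, is a hypothetical standalone sketch written for this example. It uses Ruby's built-in `clamp` in place of the `bound` helper that appears in the listing, on the assumption that the two serve the same purpose.

```ruby
# Hypothetical standalone sketch; returns the [position, line, column] after a seek.
def seek_location(text, position, line, column, index)
  index = index.clamp(0, text.length)  # keep the target inside the input
  if index > position
    skipped = text[position, index - position]
    newlines = skipped.count("\n")
    if newlines.zero?
      column += skipped.length
    else
      line += newlines
      # column counts the characters after the last newline skipped
      column = skipped.length - skipped.rindex("\n") - 1
    end
  end
  [index, line, column]
end

p seek_location("ab\ncd", 0, 1, 0, 4)  # [4, 2, 1]
p seek_location("ab\ncd", 4, 2, 1, 1)  # [1, 2, 1]
```

In the second call the position moves back to 1, but line and column keep their old values, which is the inaccuracy the note warns about. Using mark and rewind avoids it, because markers store all three values together.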
#size ⇒ Object Also known as: length
# File 'lib/antlr3/streams.rb', line 458
def size
  @data.length
end
#substring(start, stop) ⇒ Object
return the string slice between position start and stop
# File 'lib/antlr3/streams.rb', line 652
def substring( start, stop )
  @string[ start, stop - start + 1 ]
end
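The `stop - start + 1` length in the listing means that both end positions are included in the slice. The sketch below shows this on a plain string; `inclusive_slice` is a hypothetical name used only for this example.

```ruby
# Hypothetical standalone sketch of an inclusive slice on a plain String.
def inclusive_slice(text, start, stop)
  text[start, stop - start + 1]
end

text = "hello world"
p inclusive_slice(text, 6, 10)  # "world"
p text[6..10]                   # "world" (the equivalent inclusive range)
```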
#through(k) ⇒ Object
return a substring around the stream cursor at a distance k: if k >= 0, return the next k characters; if k < 0, return the previous |k| characters
# File 'lib/antlr3/streams.rb', line 510
def through( k )
  if k >= 0 then @string[ @position, k ] else
    start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
    @string[ start ... @position ]
  end
end
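A standalone sketch of the same slicing may help when reading the two branches. `slice_through` is a hypothetical function for this example, and `[value, 0].max` stands in for the `at_least` helper used in the listing, on the assumption that it enforces a lower bound.

```ruby
# Hypothetical standalone sketch; takes the text and cursor position explicitly.
def slice_through(text, position, k)
  if k >= 0
    text[position, k]                  # the next k characters
  else
    start = [position + k, 0].max      # never start before the beginning
    text[start...position]             # the previous |k| characters
  end
end

text = "hello world"
p slice_through(text, 5, 3)    # " wo"
p slice_through(text, 5, -3)   # "llo"
p slice_through(text, 2, -5)   # "he"
```

The lower bound matters because a negative start index in Ruby counts from the end of the string, which would silently return the wrong text.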