Module: ANTLR3::Stream

Extended by:
Included in: AST::TreeNodeStream, CharacterStream, TokenStream
Defined in: lib/antlr3/streams.rb


ANTLR3 Streams

This documentation first covers the general concept of streams as used by ANTLR recognizers, and then discusses the specific ANTLR3::Stream module.

ANTLR Stream Classes

ANTLR recognizers need a way to walk through input data in a serialized, IO-style fashion. They also need some book-keeping about the input to provide useful information to developers, such as the current line number and column. Furthermore, to implement backtracking and various error recovery techniques, recognizers need a way to record locations in the input at various points in the recognition process so that the input can be restored to a prior state.

ANTLR bundles all of this functionality into a number of Stream classes, each designed to be used by recognizers for a specific recognition task. Most of the Stream hierarchy is implemented in antlr3/streams.rb, which is loaded by default when 'antlr3' is required.

Here's a brief overview of the various stream classes and their respective purposes:


StringStream

Similar to StringIO from the standard Ruby library, StringStream wraps raw String data in a Stream interface for use by ANTLR lexers.

FileStream

A subclass of StringStream, FileStream simply wraps data read from an IO or File object for use by lexers.

CommonTokenStream

The job of a TokenStream is to read lexer output and then provide ANTLR parsers with the means to sequentially walk through a series of tokens. CommonTokenStream is the default TokenStream implementation.

TokenRewriteStream

A subclass of CommonTokenStream, TokenRewriteStream gives rewriting parsers the ability to produce new output text from an input token sequence by managing rewrite “programs” on top of the stream.

CommonTreeNodeStream

In a similar fashion to CommonTokenStream, CommonTreeNodeStream feeds tokens to recognizers sequentially. The stream serializes an Abstract Syntax Tree into a flat, one-dimensional sequence while preserving the two-dimensional shape of the tree with special UP and DOWN tokens. The sequence is primarily used by ANTLR Tree Parsers. Note: this class is defined in antlr3/tree.rb, not in antlr3/streams.rb.
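To make the UP / DOWN idea concrete, here is a minimal sketch of flattening a tree into a one-dimensional sequence. Node and flatten_tree are hypothetical names invented for this illustration, and the symbols :DOWN and :UP stand in for the runtime's special tokens; the real CommonTreeNodeStream may differ in its details.

```ruby
# Hypothetical stand-in for a tree node; not the runtime's node class.
Node = Struct.new(:text, :children)

# Flatten a tree depth-first. A node with children is followed by :DOWN,
# its serialized children, and then :UP. A leaf is emitted on its own.
def flatten_tree(node, out = [])
  out << node.text
  unless node.children.empty?
    out << :DOWN
    node.children.each { |child| flatten_tree(child, out) }
    out << :UP
  end
  out
end

# The tree for (+ 1 (* 2 3))
tree = Node.new('+', [
  Node.new('1', []),
  Node.new('*', [Node.new('2', []), Node.new('3', [])])
])

p flatten_tree(tree)
# prints ["+", :DOWN, "1", "*", :DOWN, "2", "3", :UP, :UP]
```

Because every :DOWN is matched by an :UP, a recognizer reading the flat sequence can still tell which nodes are children of which.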

The next few sections cover the most significant methods of all stream classes.

consume / look / peek

stream.consume is used to advance a stream one unit. StringStreams are advanced by one character and TokenStreams are advanced by one token.

stream.peek(k = 1) is used to quickly retrieve the object of interest to a recognizer at look-ahead position specified by k. For StringStreams, this is the integer value of the character k characters ahead of the stream cursor. For TokenStreams, this is the integer token type of the token k tokens ahead of the stream cursor.

stream.look(k = 1) is used to retrieve the full object of interest at look-ahead position specified by k. While peek provides the bare-minimum lightweight information that the recognizer needs, look provides the full object of concern in the stream. For StringStreams, this is a string object containing the single character k characters ahead of the stream cursor. For TokenStreams, this is the full token structure k tokens ahead of the stream cursor.

Note: in most ANTLR runtime APIs for other languages, peek is implemented by some method with a name like LA(k) and look is implemented by some method with a name like LT(k). When writing this Ruby runtime API, I found this naming practice confusing, ambiguous, and un-Ruby-like. Thus, I chose peek and look to represent a quick look (peek) and a full-fledged look-ahead operation (look). If this causes confusion or any sort of compatibility strife for developers using this implementation, all apologies.
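The difference between consume, peek, and look on a character stream can be sketched with a toy class. ToyCharStream is a hypothetical stand-in written for this illustration, not the runtime's StringStream, and the -1 end-of-input value is an assumption of the sketch.

```ruby
# Hypothetical toy stream; it only mimics the interface described above.
class ToyCharStream
  EOF = -1 # assumed end-of-input sentinel for this sketch

  def initialize(text)
    @text = text
    @position = 0
  end

  # Advance the cursor by one character.
  def consume
    @position += 1 if @position < @text.length
  end

  # Integer value of the character k positions ahead (k >= 1).
  def peek(k = 1)
    char = look(k)
    char ? char.ord : EOF
  end

  # One-character String k positions ahead (k >= 1), or nil past the end.
  def look(k = 1)
    @text[@position + k - 1]
  end
end

stream = ToyCharStream.new('ab')
first_code = stream.peek    # 97, the integer value of "a"
first_char = stream.look    # "a"
stream.consume
second_char = stream.look   # "b"
stream.consume
at_end = stream.peek        # -1, the sketch's EOF value
```

Note that neither peek nor look moves the cursor; only consume does.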

mark / rewind / release

marker = stream.mark causes the stream to record important information about the current stream state, place the data in an internal memory table, and return a memento, marker. The marker object is typically an integer key to the stream's internal memory table.

Used in tandem with stream.rewind(marker = last_marker), the marker can be used to restore the stream to an earlier state. This is used by recognizers to perform tasks such as backtracking and error recovery.

stream.release(marker = last_marker) can be used to release an existing state marker from the memory table.

stream.seek(position) moves the stream cursor to an absolute position within the stream, much like Ruby's IO#seek. However, unlike IO#seek, which can also seek relative to the current position or the end of the input, ANTLR streams currently always use absolute position seeking.
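The mark / rewind / release cycle and seek can be sketched as follows. ToyMarkableStream is a hypothetical class for illustration only: it saves nothing but the cursor position, whereas a real character stream would presumably also save book-keeping such as line and column.

```ruby
# Hypothetical toy stream showing the memento-style marker table.
class ToyMarkableStream
  attr_reader :index

  def initialize(symbols)
    @symbols = symbols
    @index = 0
    @markers = {}     # integer marker => saved cursor position
    @last_marker = 0
  end

  def consume
    @index += 1 if @index < @symbols.length
  end

  # Save the current state and return an integer key for it.
  def mark
    @last_marker += 1
    @markers[@last_marker] = @index
    @last_marker
  end

  # Restore the state saved under the given marker.
  def rewind(marker = @last_marker)
    @index = @markers.fetch(marker)
  end

  # Forget the state saved under the given marker.
  def release(marker = @last_marker)
    @markers.delete(marker)
  end

  # Jump to an absolute position.
  def seek(position)
    @index = position
  end
end

stream = ToyMarkableStream.new(%w[a b c d])
stream.consume
marker = stream.mark     # remember position 1
stream.consume
stream.consume           # speculative work moves the cursor to 3
stream.rewind(marker)    # back to position 1
stream.release(marker)   # the saved state is no longer needed
stream.seek(3)           # absolute jump, independent of any marker
```

A backtracking recognizer would follow the same pattern: mark, attempt a rule, and rewind if the attempt fails.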

The Stream Module

ANTLR3::Stream is an abstract-ish base mixin for all IO-like stream classes used by ANTLR recognizers.

The module doesn't do much on its own besides define arguably annoying “abstract” pseudo-methods that demand implementation when it is mixed in to a class that wants to be a Stream. Right now this exists as an artifact of porting the ANTLR Java/Python runtime library to Ruby. In Java, of course, this is represented as an interface. In Ruby, however, objects are duck-typed and interfaces aren't that useful as programmatic entities; in fact, it's mildly wasteful to have a module like this hanging out. Thus, I may axe it.
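As a rough idea of how such “abstract” pseudo-methods can work in Ruby, here is a minimal sketch. AbstractMethods, ToyStream, and HalfDoneStream are hypothetical names, and this abstract macro is an assumption about the general technique, not a copy of the runtime's own definition.

```ruby
# Hypothetical macro module; the runtime's own `abstract` may differ.
module AbstractMethods
  # Define a placeholder method that fails until a class overrides it.
  def abstract(name)
    define_method(name) do |*_args|
      raise NotImplementedError, "#{self.class}##{name} is not implemented"
    end
  end
end

# Hypothetical interface module declaring two abstract methods.
module ToyStream
  extend AbstractMethods
  abstract :consume
  abstract :peek
end

# Implements consume but forgets peek.
class HalfDoneStream
  include ToyStream

  def consume
    :consumed
  end
end

stream = HalfDoneStream.new
stream.consume # the class's own method wins over the placeholder

begin
  stream.peek # falls through to the placeholder and raises
rescue NotImplementedError => e
  message = e.message # "HalfDoneStream#peek is not implemented"
end
```

The placeholders only fail at call time, which is why they give far weaker guarantees than a Java interface.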

When mixed in, it does give the class #size and #source_name attribute methods.

Except in a small handful of places, most of the ANTLR runtime library uses duck-typing and not type checking on objects. This means that the methods which manipulate stream objects don't usually bother checking that the object is a Stream and assume that the object implements the proper stream interface. Thus, it is not strictly necessary that custom stream objects include ANTLR3::Stream, though it isn't a bad idea.
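A small sketch of what that duck-typing means in practice: the hypothetical TokenArrayStream below includes no Stream module at all, yet any code that only calls peek and consume can use it. The drain helper and the -1 end-of-input value are assumptions made for this example.

```ruby
# Hypothetical stream over an array of integer token types.
# It deliberately does not include any Stream module.
class TokenArrayStream
  EOF = -1 # assumed end-of-input sentinel for this sketch

  def initialize(types)
    @types = types
    @position = 0
  end

  # Token type k positions ahead (k >= 1), or EOF past the end.
  def peek(k = 1)
    @types.fetch(@position + k - 1, EOF)
  end

  def consume
    @position += 1 if @position < @types.length
  end
end

# Works with any object that responds to peek and consume.
def drain(stream, eof = -1)
  seen = []
  until (type = stream.peek) == eof
    seen << type
    stream.consume
  end
  seen
end

p drain(TokenArrayStream.new([4, 5, 6]))
# prints [4, 5, 6]
```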

Constant Summary


Constants included from Constants

Constants::BUILT_IN_TOKEN_NAMES, Constants::DEFAULT, Constants::DOWN, Constants::EOF, Constants::EOF_TOKEN, Constants::EOR_TOKEN_TYPE, Constants::HIDDEN, Constants::INVALID, Constants::INVALID_TOKEN, Constants::MEMO_RULE_FAILED, Constants::MEMO_RULE_UNKNOWN, Constants::MIN_TOKEN_TYPE, Constants::SKIP_TOKEN, Constants::UP


Instance Attribute Details

#size ⇒ Object (readonly)

the total number of symbols in the stream

# File 'lib/antlr3/streams.rb', line 217

def size


#source_name ⇒ Object

indicates an identifying name for the stream, usually the file path of the input

# File 'lib/antlr3/streams.rb', line 221

def source_name

Instance Method Details


#consume

advances the stream by one unit (such as a character or token)

# File 'lib/antlr3/streams.rb', line 173

abstract :consume


#index

returns the current position of the stream

# File 'lib/antlr3/streams.rb', line 197

abstract :index


#look(k = 1)

retrieves the full object of interest at the look-ahead position specified by k (such as a character string or a token structure)

# File 'lib/antlr3/streams.rb', line 186

abstract :look


#mark

saves the current position for the purposes of backtracking and returns a value to pass to #rewind at a later time

# File 'lib/antlr3/streams.rb', line 192

abstract :mark


#peek(k = 1)

quickly retrieves the object of interest to a recognizer at the look-ahead position specified by k (such as the integer value of a character or an integer token type)

# File 'lib/antlr3/streams.rb', line 180

abstract :peek


#release(marker = last_marker)

clears the saved state information associated with the given marker value

# File 'lib/antlr3/streams.rb', line 208

abstract :release


#rewind(marker = last_marker)

restores the stream position using the state information previously saved under the given marker

# File 'lib/antlr3/streams.rb', line 203

abstract :rewind


#seek(position)

moves the stream to the absolute index given by position

# File 'lib/antlr3/streams.rb', line 213

abstract :seek