Class: ANTLR3::CommonTokenStream
- Inherits: Object (ANTLR3::CommonTokenStream < Object)
- Includes: TokenStream, Enumerable
- Defined in: lib/antlr3/streams.rb
Overview
CommonTokenStream serves as the primary token stream implementation for feeding sequential token input into parsers.
Using some TokenSource (such as a lexer), the stream collects a sequence of tokens and sets each token’s index attribute to its position within the stream. The stream may be tuned to a channel value; off-channel tokens are then skipped by the #peek, #look, and #consume methods.
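The two steps described above (indexing the collected tokens, then filtering by channel) can be pictured with a small stand-alone sketch. It does not use the antlr3 gem: `SketchToken`, `DEFAULT_CH`, and `HIDDEN_CH` are hypothetical stand-ins for the library's token class and channel constants.

```ruby
# Hypothetical stand-in for a token; the real library has its own token class.
SketchToken = Struct.new(:type, :text, :channel, :index)

DEFAULT_CH = :default # assumed label standing in for ANTLR3::DEFAULT
HIDDEN_CH  = :hidden  # assumed label standing in for ANTLR3::HIDDEN

# Tokens as a lexer might emit them for the input "35 * 4".
buffer = [
  SketchToken.new(:INT,  "35", DEFAULT_CH),
  SketchToken.new(:WS,   " ",  HIDDEN_CH),
  SketchToken.new(:MULT, "*",  DEFAULT_CH),
  SketchToken.new(:WS,   " ",  HIDDEN_CH),
  SketchToken.new(:INT,  "4",  DEFAULT_CH)
]

# Step 1: record each token's position within the buffer.
buffer.each_with_index { |token, i| token.index = i }

# Step 2: a stream tuned to a channel only "sees" tokens on that channel.
on_channel = buffer.select { |token| token.channel == DEFAULT_CH }

puts on_channel.map { |t| "#{t.index}:#{t.type}" }.join(" ")
# prints "0:INT 2:MULT 4:INT"
```

Note that the on-channel tokens keep their original buffer indexes, so gaps in the index sequence mark where off-channel tokens sit.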
Sample Usage
source_input = ANTLR3::StringStream.new("35 * 4 - 1")
lexer = Calculator::Lexer.new(source_input)
tokens = ANTLR3::CommonTokenStream.new(lexer)
# assume this grammar defines whitespace as tokens on channel HIDDEN
# and numbers and operations as tokens on channel DEFAULT
tokens.look # => 0 INT["35"] @ line 1 col 0 (0..1)
tokens.look(2) # => 2 MULT["*"] @ line 1 col 3 (3..3)
tokens.tokens(0, 2)
# => [0 INT["35"] @ line 1 col 0 (0..1),
#     1 WS[" "] @ line 1 col 2 (2..2),
#     2 MULT["*"] @ line 1 col 3 (3..3)]
# notice the #tokens method does not filter off-channel tokens
lexer.reset
hidden_tokens =
  ANTLR3::CommonTokenStream.new(lexer, :channel => ANTLR3::HIDDEN)
hidden_tokens.look # => 1 WS[" "] @ line 1 col 2 (2..2)
Direct Known Subclasses
Constant Summary
Constants included from Constants
ANTLR3::Constants::BUILT_IN_TOKEN_NAMES, ANTLR3::Constants::DEFAULT, ANTLR3::Constants::DOWN, ANTLR3::Constants::EOF, ANTLR3::Constants::EOF_TOKEN, ANTLR3::Constants::EOR_TOKEN_TYPE, ANTLR3::Constants::HIDDEN, ANTLR3::Constants::INVALID, ANTLR3::Constants::INVALID_NODE, ANTLR3::Constants::INVALID_TOKEN, ANTLR3::Constants::MEMO_RULE_FAILED, ANTLR3::Constants::MEMO_RULE_UNKNOWN, ANTLR3::Constants::MIN_TOKEN_TYPE, ANTLR3::Constants::SKIP_TOKEN, ANTLR3::Constants::UP
Instance Attribute Summary
Attributes included from TokenStream
#channel, #last_marker, #position, #token_source
Attributes included from Stream
Instance Method Summary
- #<<(k) ⇒ Object
- #[](i, *args) ⇒ Object
  identical to Array#[], as applied to the stream’s token buffer.
- #at(i) ⇒ Object
- #consume ⇒ Object
  advance the stream one step to the next on-channel token.
- #each(*args) ⇒ Object
  yields each token in the stream (including off-channel tokens). If no block is provided, the method returns an Enumerator object.
- #each_on_channel(channel = @channel) ⇒ Object
  yields each token in the stream with the given channel value. If no channel value is given, the stream’s tuned channel value will be used.
- #extract_text(start = 0, stop = @tokens.length - 1) ⇒ Object (also: #to_s)
  fetches the text content of all tokens between start and stop and joins the chunks into a single string.
- #future?(k = 1) ⇒ Boolean
  returns the index of the on-channel token at look-ahead position k or nil if no other on-channel tokens exist.
- #hold(pos = @position) ⇒ Object
  saves the current stream position, yields to the block, and then ensures the stream’s position is restored before returning the value of the block.
- #initialize(token_source, options = {}) ⇒ CommonTokenStream (constructor)
  constructs a new token stream using the token_source provided.
- #inspect ⇒ Object
  returns a summary string showing the token source class, the current position, and the neighboring on-channel tokens.
- #look(k = 1) ⇒ Object (also: #>>)
  operates similarly to #peek, but returns the full token object at look-ahead position k.
- #mark ⇒ Object
  bookmark the current position of the input stream.
- #past?(k = 1) ⇒ Boolean
  returns the index of the on-channel token at look-behind position k or nil if no other on-channel tokens exist before the current token.
- #peek(k = 1) ⇒ Object
  return the type of the on-channel token at look-ahead distance k.
- #rebuild(token_source = nil) ⇒ Object
  resets the token stream and rebuilds it with a potentially new token source.
- #release(marker = nil) ⇒ Object
- #reset ⇒ Object
  rewind the stream to its initial state.
- #rewind(marker = @last_marker, release = true) ⇒ Object
- #seek(index) ⇒ Object
  jump to the stream position specified by index. Note: seek does not check whether or not the token at the specified position is on-channel.
- #size ⇒ Object (also: #length)
- #token_class ⇒ Object
- #tokens(start = nil, stop = nil) ⇒ Object
  returns a copy of the token buffer.
- #tune_to(channel) ⇒ Object
  tune the stream to a new channel value.
- #walk ⇒ Object
  iterates through the token stream, yielding each on-channel token along the way.
Constructor Details
#initialize(token_source, options = {}) ⇒ CommonTokenStream
constructs a new token stream using the token_source provided. token_source is usually a lexer, but can be any object that implements next_token and includes ANTLR3::TokenSource. Another CommonTokenStream may also be passed, in which case its tokens are duplicated into the new stream.
If a block is provided, each harvested token is yielded to it (together with the stream itself). If the block returns nil or false, the token is discarded instead of being added to the stream.
Options
- :channel: The channel value the stream should be tuned to initially
- :source_name: The source name (file name) attribute of the stream
Example
# create a new token stream that is tuned to channel :comment, and
# discard all WHITE_SPACE tokens
ANTLR3::CommonTokenStream.new(lexer, :channel => :comment) do |token|
  token.name != 'WHITE_SPACE'
end
# File 'lib/antlr3/streams.rb', line 780
def initialize( token_source, options = {} )
  case token_source
  when CommonTokenStream
    # this is useful in cases where you want to convert a CommonTokenStream
    # to a RewriteTokenStream or other variation of the standard token stream
    stream = token_source
    @token_source = stream.token_source
    @channel = options.fetch( :channel ) { stream.channel or DEFAULT_CHANNEL }
    @source_name = options.fetch( :source_name ) { stream.source_name }
    tokens = stream.tokens.map { | t | t.dup }
  else
    @token_source = token_source
    @channel = options.fetch( :channel, DEFAULT_CHANNEL )
    @source_name = options.fetch( :source_name ) { @token_source.source_name rescue nil }
    tokens = @token_source.to_a
  end
  @last_marker = nil
  @tokens = block_given? ? tokens.select { | t | yield( t, self ) } : tokens
  @tokens.each_with_index { |t, i| t.index = i }
  @position =
    if first_token = @tokens.find { |t| t.channel == @channel }
      @tokens.index( first_token )
    else @tokens.length
    end
end
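The constructor's last three steps (filter with the block, renumber the survivors, find the first on-channel token) can be tried in isolation. The following is a minimal sketch that does not depend on the antlr3 gem; `Tok` and `build_buffer` are hypothetical names introduced only for illustration.

```ruby
# Hypothetical stand-in for a token with just the fields this sketch needs.
Tok = Struct.new(:name, :channel, :index)

# Hypothetical helper mirroring the steps described above:
# 1. keep only tokens the block accepts (all tokens when no block is given),
# 2. renumber the surviving tokens,
# 3. start at the first on-channel token, or at the end if there is none.
def build_buffer(raw_tokens, channel)
  kept = block_given? ? raw_tokens.select { |t| yield(t) } : raw_tokens.dup
  kept.each_with_index { |t, i| t.index = i }
  start = kept.index { |t| t.channel == channel } || kept.length
  [kept, start]
end

raw = [
  Tok.new("WHITE_SPACE", :hidden),
  Tok.new("COMMENT",     :comment),
  Tok.new("ID",          :default)
]

buffer, position = build_buffer(raw, :comment) { |t| t.name != "WHITE_SPACE" }
p buffer.map(&:name) # the WHITE_SPACE token has been discarded
p position           # index of the first token on channel :comment
```

Because renumbering happens after filtering, discarded tokens leave no gaps in the index sequence.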
Instance Method Details
#<<(k) ⇒ Object
# File 'lib/antlr3/streams.rb', line 938
def << k
  self >> -k
end
#[](i, *args) ⇒ Object
identical to Array#[], as applied to the stream’s token buffer
# File 'lib/antlr3/streams.rb', line 1064
def []( i, *args )
  @tokens[ i, *args ]
end
#at(i) ⇒ Object
# File 'lib/antlr3/streams.rb', line 1057
def at( i )
  @tokens.at i
end
#consume ⇒ Object
advance the stream one step to the next on-channel token
# File 'lib/antlr3/streams.rb', line 901
def consume
  token = @tokens[ @position ] || EOF_TOKEN
  if @position < @tokens.length
    @position = future?( 2 ) || @tokens.length
  end
  return( token )
end
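To see how consuming skips off-channel tokens, here is a small stand-alone model. It is a sketch under simplifying assumptions, not the library's implementation: `Tk`, `MiniStream`, its `EOF` marker, and the `next_on_channel` helper are all hypothetical.

```ruby
Tk = Struct.new(:text, :channel)

class MiniStream
  EOF = Tk.new("<EOF>", :default).freeze # hypothetical end-of-stream marker

  attr_reader :position

  def initialize(tokens, channel = :default)
    @tokens = tokens
    @channel = channel
    @position = next_on_channel(0)
  end

  # Return the current token, then move to the next on-channel token.
  # Once the buffer is exhausted, keep returning the EOF marker.
  def consume
    token = @tokens[@position] || EOF
    @position = next_on_channel(@position + 1) if @position < @tokens.length
    token
  end

  private

  # Index of the first on-channel token at or after +from+,
  # or the buffer length when none remains.
  def next_on_channel(from)
    (from...@tokens.length).find { |i| @tokens[i].channel == @channel } ||
      @tokens.length
  end
end

stream = MiniStream.new(
  [Tk.new("35", :default), Tk.new(" ", :hidden), Tk.new("*", :default)]
)
texts = 3.times.map { stream.consume.text }
p texts # => ["35", "*", "<EOF>"]
```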
#each(*args) ⇒ Object
yields each token in the stream (including off-channel tokens). If no block is provided, the method returns an Enumerator object. #each accepts the same arguments as #tokens.
# File 'lib/antlr3/streams.rb', line 996
def each( *args )
  block_given? or return enum_for( :each, *args )
  tokens( *args ).each { |token| yield( token ) }
end
#each_on_channel(channel = @channel) ⇒ Object
yields each token in the stream with the given channel value. If no channel value is given, the stream’s tuned channel value will be used. If no block is given, an enumerator will be returned.
# File 'lib/antlr3/streams.rb', line 1007
def each_on_channel( channel = @channel )
  block_given? or return enum_for( :each_on_channel, channel )
  for token in @tokens
    token.channel == channel and yield( token )
  end
end
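The "return an enumerator when no block is given" idiom used here relies on Ruby's standard `enum_for`. A minimal stand-alone sketch follows; `ChToken` and `ChannelBuffer` are hypothetical names and the class only models the channel filter.

```ruby
ChToken = Struct.new(:text, :channel)

class ChannelBuffer
  def initialize(tokens, channel)
    @tokens = tokens
    @channel = channel
  end

  # Yield tokens on +channel+ (defaults to the tuned channel).
  # Without a block, hand back an Enumerator over the same tokens.
  def each_on_channel(channel = @channel)
    return enum_for(:each_on_channel, channel) unless block_given?

    @tokens.each { |t| yield t if t.channel == channel }
  end
end

buffer = ChannelBuffer.new(
  [ChToken.new("35", :default), ChToken.new(" ", :hidden), ChToken.new("*", :default)],
  :default
)

p buffer.each_on_channel.map(&:text)          # => ["35", "*"]
p buffer.each_on_channel(:hidden).map(&:text) # => [" "]
```

Passing the channel through to `enum_for` matters: without it, the enumerator would fall back to the tuned channel rather than the one requested.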
#extract_text(start = 0, stop = @tokens.length - 1) ⇒ Object Also known as: to_s
fetches the text content of all tokens between start and stop and joins the chunks into a single string.
# File 'lib/antlr3/streams.rb', line 1081
def extract_text( start = 0, stop = @tokens.length - 1 )
  start = start.to_i.at_least( 0 )
  stop = stop.to_i.at_most( @tokens.length )
  @tokens[ start..stop ].map! { |t| t.text }.join( '' )
end
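The source above uses `at_least` and `at_most`, which appear to be helper methods supplied by the library rather than core Ruby (an assumption based on this listing). A plain-Ruby sketch of the same idea, with a hypothetical `TxToken` struct and a free-standing `extract_text` function, might look like this:

```ruby
TxToken = Struct.new(:text)

# Join the text of tokens from +start+ through +stop+ (inclusive),
# clamping both ends to the bounds of the buffer.
def extract_text(tokens, start = 0, stop = tokens.length - 1)
  start = [start.to_i, 0].max
  stop  = [stop.to_i, tokens.length - 1].min
  (tokens[start..stop] || []).map(&:text).join
end

tokens = ["35", " ", "*", " ", "4"].map { |s| TxToken.new(s) }

p extract_text(tokens)        # => "35 * 4"
p extract_text(tokens, 2, 4)  # => "* 4"
p extract_text(tokens, -5, 0) # => "35"
```

Since off-channel tokens are included, the result reproduces the original input text, whitespace and all.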
#future?(k = 1) ⇒ Boolean
returns the index of the on-channel token at look-ahead position k or nil if no other on-channel tokens exist.
# File 'lib/antlr3/streams.rb', line 946
def future?( k = 1 )
  @position == -1 and fill_buffer

  case
  when k == 0 then nil
  when k < 0 then past?( -k )
  when k == 1 then @position
  else
    # since the stream only yields on-channel
    # tokens, the stream can't just go to the
    # next position, but rather must skip
    # over off-channel tokens
    ( k - 1 ).times.inject( @position ) do |cursor, |
      begin
        tk = @tokens.at( cursor += 1 ) or return( cursor )
        # ^- if tk is nil (i.e. i is outside array limits)
      end until tk.channel == @channel
      cursor
    end
  end
end
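The forward scan can be easier to follow as an explicit loop. The sketch below covers only k >= 1 and is not the library's code: `LaToken` and `lookahead_index` are hypothetical, and the function takes the buffer, position, and channel as arguments instead of reading instance variables.

```ruby
LaToken = Struct.new(:text, :channel)

# Index of the k-th on-channel token counting from +position+
# (k = 1 is the current token). When the scan runs off the end of
# the buffer, the buffer length is returned. Only k >= 1 is handled.
def lookahead_index(tokens, position, channel, k)
  cursor = position
  (k - 1).times do
    loop do
      cursor += 1
      return tokens.length if cursor >= tokens.length
      break if tokens[cursor].channel == channel
    end
  end
  cursor
end

tokens = [
  LaToken.new("35", :default), LaToken.new(" ", :hidden),
  LaToken.new("*",  :default), LaToken.new(" ", :hidden),
  LaToken.new("4",  :default)
]

p lookahead_index(tokens, 0, :default, 1) # => 0
p lookahead_index(tokens, 0, :default, 2) # => 2
p lookahead_index(tokens, 0, :default, 3) # => 4
p lookahead_index(tokens, 0, :default, 4) # => 5
```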
#hold(pos = @position) ⇒ Object
saves the current stream position, yields to the block, and then ensures the stream’s position is restored before returning the value of the block
# File 'lib/antlr3/streams.rb', line 887
def hold( pos = @position )
  block_given? or return enum_for( :hold, pos )
  begin
    yield
  ensure
    seek( pos )
  end
end
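The save-and-restore behavior comes from Ruby's `ensure` clause, which runs whether the block finishes normally or raises. A stand-alone sketch with a hypothetical `Cursor` class (not part of the library) shows both cases:

```ruby
class Cursor
  attr_accessor :position

  def initialize
    @position = 0
  end

  # Run the block, then restore the saved position even if the block raises.
  # The method's return value is the block's value, not the ensure clause's.
  def hold(pos = @position)
    yield
  ensure
    @position = pos
  end
end

cursor = Cursor.new

result = cursor.hold do
  cursor.position = 7 # speculative look-ahead moves the cursor
  :matched
end
p result          # => :matched
p cursor.position # => 0

begin
  cursor.hold do
    cursor.position = 3
    raise "no match"
  end
rescue RuntimeError
  # the error still propagates, but the position was restored first
end
p cursor.position # => 0
```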
#inspect ⇒ Object
returns a summary string containing the stream’s class, the class of its token source, the current position out of the buffer length, and the on-channel tokens immediately before and at the current position.
# File 'lib/antlr3/streams.rb', line 1069
def inspect
  string = "#<%p: @token_source=%p @ %p/%p" %
    [ self.class, @token_source.class, @position, @tokens.length ]
  tk = look( -1 ) and string << " #{ tk.inspect } <--"
  tk = look( 1 ) and string << " --> #{ tk.inspect }"
  string << '>'
end
#look(k = 1) ⇒ Object Also known as: >>
operates similarly to #peek, but returns the full token object at look-ahead position k.
# File 'lib/antlr3/streams.rb', line 932
def look( k = 1 )
  index = future?( k ) or return nil
  @tokens.fetch( index, EOF_TOKEN )
end
#mark ⇒ Object
bookmark the current position of the input stream
# File 'lib/antlr3/streams.rb', line 869
def mark
  @last_marker = @position
end
#past?(k = 1) ⇒ Boolean
returns the index of the on-channel token at look-behind position k or nil if no other on-channel tokens exist before the current token.
# File 'lib/antlr3/streams.rb', line 972
def past?( k = 1 )
  @position == -1 and fill_buffer

  case
  when k == 0 then nil
  when @position - k < 0 then nil
  else

    k.times.inject( @position ) do |cursor, |
      begin
        cursor <= 0 and return( nil )
        tk = @tokens.at( cursor -= 1 ) or return( nil )
      end until tk.channel == @channel
      cursor
    end

  end
end
#peek(k = 1) ⇒ Object
return the type of the on-channel token at look-ahead distance k. k = 1 represents the current token. k greater than 1 represents upcoming on-channel tokens. A negative value of k returns previous on-channel tokens consumed, where k = -1 is the last on-channel token consumed. k = 0 is not a meaningful distance and returns nil.
# File 'lib/antlr3/streams.rb', line 925
def peek( k = 1 )
  tk = look( k ) and return( tk.type )
end
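The meaning of each k value is easiest to check against a concrete buffer. The following sketch models the documented semantics only; it is not how the library computes the result. `PkToken` and `peek_type` are hypothetical, the cursor is represented by a count of on-channel tokens already consumed, and looking past the end simply yields nil here.

```ruby
PkToken = Struct.new(:type, :channel)

# +consumed+ is the number of on-channel tokens already behind the cursor.
# k = 1 is the current token, k > 1 looks ahead, k < 0 looks behind,
# and k = 0 yields nil.
def peek_type(tokens, channel, consumed, k)
  return nil if k == 0

  visible = tokens.select { |t| t.channel == channel }
  offset = k > 0 ? consumed + k - 1 : consumed + k
  return nil if offset < 0

  token = visible[offset]
  token && token.type
end

tokens = [
  PkToken.new(:INT,  :default), PkToken.new(:WS,  :hidden),
  PkToken.new(:MULT, :default), PkToken.new(:INT, :default)
]

# One on-channel token (the first INT) has been consumed.
p peek_type(tokens, :default, 1, 1)  # => :MULT (current token)
p peek_type(tokens, :default, 1, 2)  # => :INT  (next on-channel token)
p peek_type(tokens, :default, 1, -1) # => :INT  (last token consumed)
p peek_type(tokens, :default, 1, 0)  # => nil
```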
#rebuild(token_source = nil) ⇒ Object
resets the token stream and rebuilds it with a potentially new token source. If no token_source value is provided, the stream will attempt to reset the current token_source by calling reset on the object. The stream will then clear the token buffer and attempt to harvest new tokens. Identical in behavior to CommonTokenStream.new, if a block is provided, tokens will be yielded and discarded if the block returns a false or nil value.
# File 'lib/antlr3/streams.rb', line 814
def rebuild( token_source = nil )
  if token_source.nil?
    @token_source.reset rescue nil
  else @token_source = token_source
  end
  @tokens = block_given? ? @token_source.select { |token| yield( token ) } :
                           @token_source.to_a
  @tokens.each_with_index { |t, i| t.index = i }
  @last_marker = nil
  @position =
    if first_token = @tokens.find { |t| t.channel == @channel }
      @tokens.index( first_token )
    else @tokens.length
    end
  return self
end
#release(marker = nil) ⇒ Object
# File 'lib/antlr3/streams.rb', line 873
def release( marker = nil )
  # do nothing
end
#reset ⇒ Object
rewind the stream to its initial state.
# File 'lib/antlr3/streams.rb', line 858
def reset
  @position = 0
  @position += 1 while token = @tokens[ @position ] and
                       token.channel != @channel
  @last_marker = nil
  return self
end
#rewind(marker = @last_marker, release = true) ⇒ Object
# File 'lib/antlr3/streams.rb', line 878
def rewind( marker = @last_marker, release = true )
  seek( marker )
end
#seek(index) ⇒ Object
jump to the stream position specified by index.
Note: seek does not check whether or not the token at the specified position is on-channel.
# File 'lib/antlr3/streams.rb', line 914
def seek( index )
  @position = index.to_i.bound( 0, @tokens.length )
  return self
end
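The `bound` call in the source appears to be a helper supplied by the library rather than core Ruby (an assumption based on this listing). The same clamping can be sketched with the standard `Comparable#clamp`, available since Ruby 2.4; `clamp_position` is a hypothetical name.

```ruby
# Convert +index+ to an integer and keep it within 0..buffer_length.
# The upper bound is the buffer length itself, which represents
# "past the last token" (end of stream).
def clamp_position(index, buffer_length)
  index.to_i.clamp(0, buffer_length)
end

p clamp_position(-3, 5)  # => 0
p clamp_position(99, 5)  # => 5
p clamp_position("2", 5) # => 2
```

As the note above says, the resulting position may land on an off-channel token; nothing in this step inspects the token's channel.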
#size ⇒ Object Also known as: length
# File 'lib/antlr3/streams.rb', line 847
def size
  @tokens.length
end
#token_class ⇒ Object
# File 'lib/antlr3/streams.rb', line 838
def token_class
  @token_source.token_class
rescue NoMethodError
  @position == -1 and fill_buffer
  @tokens.empty? ? CommonToken : @tokens.first.class
end
#tokens(start = nil, stop = nil) ⇒ Object
returns a copy of the token buffer. If start and stop are provided, tokens returns a slice of the token buffer from start..stop. The parameters are converted to integers with their to_i methods, and thus tokens can be provided to specify start and stop. If a block is provided, tokens are yielded and filtered out of the return array if the block returns a false or nil value.
# File 'lib/antlr3/streams.rb', line 1044
def tokens( start = nil, stop = nil )
  stop.nil? || stop >= @tokens.length and stop = @tokens.length - 1
  start.nil? || stop < 0 and start = 0
  tokens = @tokens[ start..stop ]

  if block_given?
    tokens.delete_if { |t| not yield( t ) }
  end

  return( tokens )
end
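The slice-then-filter behavior can be tried on a plain array. This is a sketch, not the library's code: `slice_tokens` is a hypothetical function, plain strings stand in for tokens, and it resets a negative start to 0, which is an assumption about the intended behavior rather than something the listing above confirms.

```ruby
# Return a copy of +buffer+ limited to start..stop, optionally keeping
# only the elements for which the block returns a truthy value.
def slice_tokens(buffer, start = nil, stop = nil)
  stop = buffer.length - 1 if stop.nil? || stop >= buffer.length
  start = 0 if start.nil? || start < 0 # assumption: negative start means 0
  slice = buffer[start..stop] || []
  block_given? ? slice.select { |t| yield(t) } : slice
end

buffer = ["35", " ", "*", " ", "4"]

p slice_tokens(buffer, 0, 2)               # => ["35", " ", "*"]
p slice_tokens(buffer) { |t| t != " " }    # => ["35", "*", "4"]
p slice_tokens(buffer).equal?(buffer)      # => false
```

The last line shows that callers get a new array, so modifying the result leaves the original buffer untouched.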
#tune_to(channel) ⇒ Object
tune the stream to a new channel value.
# File 'lib/antlr3/streams.rb', line 834
def tune_to( channel )
  @channel = channel
end
#walk ⇒ Object
iterates through the token stream, yielding each on-channel token along the way. After iteration has completed, the stream’s position will be restored to where it was before #walk was called. While #each and #each_on_channel do not change the stream’s position during iteration, #walk advances through the stream. This makes it possible to look ahead and behind the current token during iteration. If no block is given, an enumerator will be returned.
# File 'lib/antlr3/streams.rb', line 1022
def walk
  block_given? or return enum_for( :walk )
  initial_position = @position
  begin
    while token = look and token.type != EOF
      consume
      yield( token )
    end
    return self
  ensure
    @position = initial_position
  end
end
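The difference from #each is that the position really moves while the block runs and is put back afterwards. A stand-alone sketch makes this visible; `WalkSketch` is a hypothetical class that iterates over plain strings and ignores channels entirely.

```ruby
class WalkSketch
  attr_reader :position

  def initialize(items)
    @items = items
    @position = 0
  end

  # Advance through the items, yielding each one, and restore the
  # starting position afterwards (even if the block raises).
  def walk
    return enum_for(:walk) unless block_given?

    initial = @position
    begin
      while @position < @items.length
        item = @items[@position]
        @position += 1
        yield item
      end
      self
    ensure
      @position = initial
    end
  end
end

stream = WalkSketch.new(%w[a b c])
seen = []
positions = []
stream.walk do |item|
  seen << item
  positions << stream.position # the position advances during the walk
end

p seen            # => ["a", "b", "c"]
p positions       # => [1, 2, 3]
p stream.position # => 0
```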