Class: SyntaxTree::Parser
- Inherits: Ripper (SyntaxTree::Parser < Ripper < Object)
- Defined in: lib/syntax_tree/parser.rb
Overview
Parser is a subclass of Ripper. It subscribes to the stream of tokens and nodes that Ripper emits while parsing and builds up a syntax tree from them.
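To illustrate what subscribing to Ripper's event stream looks like, here is a minimal sketch that uses only the standard library. The `CommentCollector` class is hypothetical and is not part of SyntaxTree; it overrides a single scanner event handler (`on_comment`) and records what it sees, whereas the real Parser handles many more events and builds node objects.

```ruby
require "ripper"

# Hypothetical example class (not part of SyntaxTree): a Ripper subclass
# that listens for comment tokens and records them with their line number.
class CommentCollector < Ripper
  attr_reader :comments

  def initialize(source, *)
    super
    @comments = []
  end

  # Ripper calls this handler once for every comment token it scans.
  # `lineno` is provided by Ripper and is only valid inside handlers.
  def on_comment(value)
    @comments << [lineno, value.chomp]
    value
  end
end

collector = CommentCollector.new("# first\nfoo = 1 # second\n")
collector.parse
collector.comments.each { |line, text| puts "#{line}: #{text}" }
```

The same pattern (define `on_<event>` methods, then call `parse`) applies to parser events such as `on_def`, which is how a subclass can assemble a tree rather than a flat list.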
Defined Under Namespace
Classes: MultiByteString, ParseError, SingleByteString
Instance Attribute Summary
- #comments ⇒ Object (readonly): Array[ Comment | EmbDoc ], the list of comments that have been found while parsing the source.
- #line_counts ⇒ Object (readonly): Array[ SingleByteString | MultiByteString ], the list of objects that represent the start of each line in character offsets.
- #lines ⇒ Object (readonly): Array[ String ], the list of lines in the source.
- #source ⇒ Object (readonly): String, the source being parsed.
- #tokens ⇒ Object (readonly): Array[ untyped ], a running list of tokens that have been found in the source.
Instance Method Summary
- #initialize(source) ⇒ Parser (constructor): returns a new instance of Parser.
Constructor Details
#initialize(source) ⇒ Parser
Returns a new instance of Parser.

    # File 'lib/syntax_tree/parser.rb', line 79

    def initialize(source, *)
      super

      # We keep the source around so that we can refer back to it when we're
      # generating the AST. Sometimes it's easier to just reference the source
      # string when you want to check if it contains a certain character, for
      # example.
      @source = source

      # Similarly, we keep the lines of the source string around to be able to
      # check if certain lines contain certain characters. For example, we'll
      # use this to generate the content that goes after the __END__ keyword.
      # Or we'll use this to check if a comment has other content on its line.
      @lines = source.split(/\r?\n/)

      # This is the full set of comments that have been found by the parser.
      # It's a running list. At the end of every block of statements, they will
      # go in and attempt to grab any comments that are on their own line and
      # turn them into regular statements. So at the end of parsing the only
      # comments left in here will be comments on lines that also contain code.
      @comments = []

      # This is the current embdoc (comments that start with =begin and end with
      # =end). Since they can't be nested, there's no need for a stack here, as
      # there can only be one active. These end up getting dumped into the
      # comments list before getting picked up by the statements that surround
      # them.
      @embdoc = nil

      # This is an optional node that can be present if the __END__ keyword is
      # used in the file. In that case, this will represent the content after
      # that keyword.
      @__end__ = nil

      # Heredocs can actually be nested together if you're using interpolation,
      # so this is a stack of heredoc nodes that are currently being created.
      # When we get to the token that finishes off a heredoc node, we pop the
      # top one off. If there are others surrounding it, then the body events
      # will now be added to the correct nodes.
      @heredocs = []

      # This is a running list of tokens that have fired. It's useful mostly for
      # maintaining location information. For example, if you're inside the
      # handle of a def event, then in order to determine where the AST node
      # started, you need to look backward in the tokens to find a def keyword.
      # Most of the time, when a parser event consumes one of these events, it
      # will be deleted from the list. So ideally, this list stays pretty short
      # over the course of parsing a source string.
      @tokens = []

      # Here we're going to build up a list of SingleByteString or
      # MultiByteString objects. They're each going to represent a string in the
      # source. They are used by the `char_pos` method to determine where we are
      # in the source string.
      @line_counts = []
      last_index = 0

      @source.lines.each do |line|
        @line_counts << if line.size == line.bytesize
          SingleByteString.new(last_index)
        else
          MultiByteString.new(last_index, line)
        end

        last_index += line.size
      end

      # Make sure line counts is filled out with the first and last line at
      # minimum so that it has something to compare against if the parser is in
      # a lineno=2 state for an empty file.
      @line_counts << SingleByteString.new(0) if @line_counts.empty?
      @line_counts << SingleByteString.new(last_index)
    end

Instance Attribute Details
#comments ⇒ Object (readonly)
- Array[ Comment | EmbDoc ]: the list of comments that have been found while parsing the source.

    # File 'lib/syntax_tree/parser.rb', line 77

    def comments
      @comments
    end

#line_counts ⇒ Object (readonly)
- Array[ SingleByteString | MultiByteString ]: the list of objects that represent the start of each line in character offsets.

    # File 'lib/syntax_tree/parser.rb', line 68

    def line_counts
      @line_counts
    end

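The idea behind these per-line offsets can be sketched in plain Ruby. The two helpers below (`line_start_offsets` and `char_offset`) are hypothetical names for illustration only; they are not the SingleByteString or MultiByteString classes, whose internals may differ. The sketch assumes the parser reports a line number plus a column measured in bytes, and shows why lines containing multi-byte characters need separate handling.

```ruby
# Hypothetical helper: character offset at which each line starts, plus a
# final entry for the end of the source (mirrors the constructor's loop).
def line_start_offsets(source)
  offsets = []
  last_index = 0

  source.lines.each do |line|
    offsets << last_index
    last_index += line.size # size counts characters, not bytes
  end

  offsets << 0 if offsets.empty?
  offsets << last_index
  offsets
end

# Hypothetical helper: convert a 1-based line number and a byte column
# into a character offset into the whole source.
def char_offset(source, offsets, lineno, byte_column)
  line = source.lines[lineno - 1] || ""

  if line.size == line.bytesize
    # Single-byte line: byte column and character column coincide.
    offsets[lineno - 1] + byte_column
  else
    # Multi-byte line: count the characters that precede the byte column.
    offsets[lineno - 1] + line.byteslice(0, byte_column).size
  end
end

source = "a = 1\n\u00e9 = 2\n" # second line starts with a 2-byte character
offsets = line_start_offsets(source)
position = char_offset(source, offsets, 2, 3)
puts source[position] # the "=" on the second line
```

The single-byte branch avoids any per-character work, which is presumably why the constructor checks `line.size == line.bytesize` once per line.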
#lines ⇒ Object (readonly)
- Array[ String ]: the list of lines in the source.

    # File 'lib/syntax_tree/parser.rb', line 64

    def lines
      @lines
    end

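As a small illustration of how this list is shaped, the snippet below compares the `split(/\r?\n/)` call used by the constructor with `String#lines`. The `comment_only_line?` helper is hypothetical and only sketches the kind of check the constructor's comments describe (whether a comment has other content on its line).

```ruby
source = "foo # trailing\r\n# standalone\n\nbar\n"

# Line terminators (LF or CRLF) are dropped and interior blank lines kept.
split_lines = source.split(/\r?\n/)

# String#lines keeps the terminators, which is why the constructor uses it
# separately when counting characters per line.
raw_lines = source.lines

# Hypothetical helper: true when the 1-based line holds nothing but a comment.
def comment_only_line?(lines, lineno)
  lines[lineno - 1].to_s.match?(/\A\s*#/)
end

p split_lines
p raw_lines
p comment_only_line?(split_lines, 1)
p comment_only_line?(split_lines, 2)
```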
#source ⇒ Object (readonly)
- String: the source being parsed.

    # File 'lib/syntax_tree/parser.rb', line 61

    def source
      @source
    end

#tokens ⇒ Object (readonly)
- Array[ untyped ]: a running list of tokens that have been found in the source. This list changes a lot as certain nodes will “consume” these tokens to determine their bounds.

    # File 'lib/syntax_tree/parser.rb', line 73

    def tokens
      @tokens
    end

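The "consume" behavior can be pictured with a short sketch. Everything here is hypothetical: the `Token` struct and the `consume_keyword` helper are stand-ins invented for illustration and do not reflect SyntaxTree's actual node classes or method names. The sketch only shows the general idea of searching backward through the running list for the token that marks where a node starts, then removing it so the list stays short.

```ruby
# Hypothetical token representation, for illustration only.
Token = Struct.new(:type, :value, :line)

# Hypothetical helper: find the most recent keyword token with the given
# name, remove it from the running list, and return it.
def consume_keyword(tokens, name)
  index = tokens.rindex { |token| token.type == :kw && token.value == name }
  raise ArgumentError, "no #{name} keyword found" unless index

  tokens.delete_at(index)
end

tokens = [
  Token.new(:kw, "def", 1),
  Token.new(:ident, "foo", 1),
  Token.new(:kw, "end", 3)
]

# A handler for a method definition would look backward for its `def`.
start_token = consume_keyword(tokens, "def")
puts start_token.line
puts tokens.size
```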