Class: GeneralizedLrParser

Inherits:
Object
Includes:
SourceCodeDumpable
Defined in:
lib/rpdf2txt-rockit/glr_parser.rb

Overview

Generalized LR Parsing class

This is a modification of Jan Rekers' and Eelco Visser's Generalized LR parsers, which in turn are derived from the Tomita parsing algorithm. The main feature of these kinds of parsers is that arbitrarily long lookahead is used (when needed), since a parser is forked off every time there is an ambiguity.
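The forking idea can be illustrated with a small sketch. This is not Rockit's implementation and uses no parse table: it assumes a hypothetical toy grammar, E -> E + E | n, and a hypothetical helper `fork_parse` that keeps a list of parser configurations and, whenever more than one action is possible, continues with a copy for each action instead of choosing between them.

```ruby
# Toy grammar (hypothetical, not taken from Rockit): E -> E + E | n
RULES = [[:E, [:E, :+, :E]], [:E, [:n]]].freeze

# Return every parse tree for +tokens+ by forking one configuration per
# applicable action. A tree is an array: [symbol, *children].
def fork_parse(tokens)
  accepted = []
  active = [[[], 0]] # each configuration is [stack of trees, input position]
  until active.empty?
    stack, pos = active.shift
    # Accept when all input is consumed and the stack holds a single E.
    if pos == tokens.length && stack.length == 1 && stack[0][0] == :E
      accepted << stack[0]
    end
    # Fork once for every rule whose right-hand side is on top of the stack.
    RULES.each do |lhs, rhs|
      next if stack.length < rhs.length
      top = stack.last(rhs.length)
      next unless top.map(&:first) == rhs
      active << [stack[0...-rhs.length] + [[lhs, *top]], pos]
    end
    # Fork once more to shift the next token, if there is one.
    active << [stack + [[tokens[pos]]], pos + 1] if pos < tokens.length
  end
  accepted
end
```

For the token list `[:n, :+, :n, :+, :n]` this sketch finds two trees, one for each grouping of the additions. Copying whole stacks like this grows exponentially; GLR parsers avoid that by sharing common stack parts in a graph-structured stack.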

This implementation assumes that the ambiguities (arising from lack of lookahead) are resolved later; it does not handle ambiguities arising from the grammar. However, it can easily be extended to return a parse tree forest with all possible parse trees if there is a need for that. Alternatively, the user can resolve ambiguities in the grammar by specifying production priorities.

My modification allows multiple token streams from the lexer to be handled. This permits simpler lexer specifications while still leading to valid parses, as long as the grammar is unambiguous.
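What "multiple token streams" can mean in practice is sketched below. The token names, the patterns and the helper `tokens_at` are all hypothetical and do not reflect the actual `ForkingRegexpLexer` interface: the point is only that the lexer reports every token that matches at a position and leaves the choice to the parser, which follows each alternative in its own fork.

```ruby
# Hypothetical token definitions; names and patterns are illustrative only.
TOKEN_PATTERNS = {
  KEYWORD_IF: /if/,
  IDENT: /[a-z]+/,
  INT: /\d+/
}.freeze

# Return every [type, lexeme, next_position] that matches at +pos+,
# instead of committing to the first or the longest match.
def tokens_at(string, pos)
  TOKEN_PATTERNS.map do |type, pattern|
    # \G anchors the match at +pos+ rather than searching forward.
    m = /\G(?:#{pattern.source})/.match(string, pos)
    [type, m[0], pos + m[0].length] if m
  end.compact
end
```

With these definitions, `tokens_at("iffy", 0)` yields both a `KEYWORD_IF` token ending at position 2 and an `IDENT` token ending at position 4. A conventional lexer would need a longest-match or priority rule here; in the forking approach, the fork whose token stream cannot be parsed simply dies.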

The algorithm used is copyright © 2001 Robert Feldt.

Instance Method Summary

Methods included from SourceCodeDumpable

as_code, as_method_named, as_module_method_named, #create_new, indent_lines, name_hash, #new_of_my_type, #parameter_named, #to_compact_src, #to_src_in_module, #type_to_src

Constructor Details

#initialize(aParseTable, aLexer = nil) ⇒ GeneralizedLrParser

Returns a new instance of GeneralizedLrParser.



# File 'lib/rpdf2txt-rockit/glr_parser.rb', line 32

def initialize(aParseTable, aLexer = nil)
  @parse_table = aParseTable
  # puts @parse_table.inspect
  if aLexer
    @lexer = aLexer
  else
    tokens = @parse_table.tokens.clone
    tokens.delete(:EOF)
    @lexer = ForkingRegexpLexer.new(tokens)
  end
end
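The `clone` before `delete` in the fallback branch is worth noting: it keeps `:EOF` in the parse table's own token list while the default lexer receives a list without it. A minimal sketch of that idiom, assuming for illustration that tokens are plain symbols (the real token objects may differ) and using a hypothetical helper name:

```ruby
# Hypothetical helper mirroring the constructor's fallback branch.
# +table_tokens+ stands in for @parse_table.tokens.
def default_lexer_tokens(table_tokens)
  tokens = table_tokens.clone # shallow copy, so the table's list is untouched
  tokens.delete(:EOF)         # remove the end-of-input marker from the copy
  tokens
end
```

Calling `delete` directly on `@parse_table.tokens` would instead remove `:EOF` from the table itself.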

Instance Method Details

#parse(aString) ⇒ Object



# File 'lib/rpdf2txt-rockit/glr_parser.rb', line 59

def parse(aString)
  @string_being_parsed = aString
  @stacks_to_act_on, @accepted_stacks, @stacks_to_shift = [], [], []
  @lexer.init(aString)
  start_state = @parse_table.start_state
  @active_stacks = [ParseStack.new(start_state, @lexer)]
  @cnt, @reducer_cnt = -1, 0
  while @active_stacks.length > 0
    # File.open("as#{@cnt+=1}.graph", "w") {|f| f.write parsestacks_as_dot_digraph(@active_stacks)}
    @stacks_to_shift.clear
    @stacks_to_act_on = @active_stacks.clone
    actor(@stacks_to_act_on.shift) while @stacks_to_act_on.length > 0
    shifter
  end
  if @accepted_stacks.length > 0
    tree = @accepted_stacks.first.links_to_stack_in_state?(start_state).tree
    check_and_report_ambiguity tree
    return tree
  else
    handle_parse_error
  end
end

#parser_src_header ⇒ Object



# File 'lib/rpdf2txt-rockit/glr_parser.rb', line 44

def parser_src_header
  "# Parser for #{@parse_table.language}\n" +
    "# created by Rockit version #{rockit_version} on #{Time.new.inspect}\n" +
    "# Rockit is copyright (c) 2001 Robert Feldt, [email protected]\n" +
    "# and licensed under GPL\n" +
    "# but this parser is under LGPL\n"
end

#to_src(assignToName = nil, nameHash = {}) ⇒ Object



# File 'lib/rpdf2txt-rockit/glr_parser.rb', line 52

def to_src(assignToName = nil, nameHash = {})
  ptname = "@@parse_table" + self.object_id.inspect.gsub('-', '_')
  parser_src_header + @parse_table.to_src(ptname) + "\n" +
    assign_to(assignToName,
              new_of_my_type(as_code(ptname)))
end