Class: PDF::Reader::Parser
- Inherits:
- Object
- Object
- PDF::Reader::Parser
- Defined in:
- lib/pdf/reader/parser.rb
Overview
An internal PDF::Reader class that reads objects from the PDF file and converts them into usable Ruby objects (hashes, arrays, true, false, etc.).
Constant Summary
- TOKEN_STRATEGY =
: Proc
proc { |parser, token| Token.new(token) }
- STRATEGIES =
  {
    "/" => proc { |parser, token| parser.send(:pdf_name) },
    "<<" => proc { |parser, token| parser.send(:dictionary) },
    "[" => proc { |parser, token| parser.send(:array) },
    "(" => proc { |parser, token| parser.send(:string) },
    "<" => proc { |parser, token| parser.send(:hex_string) },
    nil => proc { nil },
    "true" => proc { true },
    "false" => proc { false },
    "null" => proc { nil },
    "obj" => TOKEN_STRATEGY,
    "endobj" => TOKEN_STRATEGY,
    "stream" => TOKEN_STRATEGY,
    "endstream" => TOKEN_STRATEGY,
    ">>" => TOKEN_STRATEGY,
    "]" => TOKEN_STRATEGY,
    ">" => TOKEN_STRATEGY,
    ")" => TOKEN_STRATEGY
  }
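The STRATEGIES table maps a raw token to a proc that produces the parsed value. The following is a minimal, self-contained sketch of that lookup-and-dispatch idea. `MiniParser`, `MINI_STRATEGIES`, `next_value`, and the simplified `pdf_name` are hypothetical names invented for illustration; they are not part of PDF::Reader, and the real private methods may do more work than shown here.

```ruby
# Hypothetical miniature of a token => proc dispatch table.
# None of these names exist in PDF::Reader; they only illustrate the pattern.
class MiniParser
  MINI_STRATEGIES = {
    "true"  => proc { true },
    "false" => proc { false },
    "null"  => proc { nil },
    "/"     => proc { |parser, _token| parser.send(:pdf_name) }
  }.freeze

  def initialize(tokens)
    @tokens = tokens.dup
  end

  # Look up a handler for the next token; return the raw token if none matches.
  def next_value
    token = @tokens.shift
    handler = MINI_STRATEGIES[token]
    handler ? handler.call(self, token) : token
  end

  private

  # Assumption for this sketch: a name is simply the token after "/",
  # returned as a Symbol.
  def pdf_name
    @tokens.shift.to_sym
  end
end

parser = MiniParser.new(["/", "Type", "true", "null", "42"])
values = Array.new(4) { parser.next_value }
```

Because the handlers are procs rather than lambdas, entries such as `proc { true }` can be called with two arguments without raising, which is why the table can mix zero-argument and two-argument blocks.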
Instance Method Summary
-
#initialize(buffer, objects = nil) ⇒ Parser
constructor
Create a new parser around a PDF::Reader::Buffer object.
-
#object(id, gen) ⇒ Object
Reads an entire PDF object from the buffer and returns it as a Ruby object.
-
#parse_token(operators = {}) ⇒ Object
Reads the next token from the underlying buffer and converts it to an appropriate object.
Constructor Details
#initialize(buffer, objects = nil) ⇒ Parser
Create a new parser around a PDF::Reader::Buffer object.
buffer - a PDF::Reader::Buffer object that contains PDF data
objects - a PDF::Reader::ObjectHash object that can return objects from the PDF file
: (PDF::Reader::Buffer, ?PDF::Reader::ObjectHash?) -> void
# File 'lib/pdf/reader/parser.rb', line 66
def initialize(buffer, objects=nil)
  @buffer = buffer
  @objects = objects
end
Instance Method Details
#object(id, gen) ⇒ Object
Reads an entire PDF object from the buffer and returns it as a Ruby object. If the object is a content stream, returns both the stream and the dictionary that describes it.
id - the object ID to return
gen - the object revision number to return
: (Integer, Integer) -> (PDF::Reader::Reference | PDF::Reader::Token | PDF::Reader::Stream | Numeric | String | Symbol | Array | Hash[untyped, untyped] | nil)
# File 'lib/pdf/reader/parser.rb', line 123
def object(id, gen)
  idCheck = parse_token

  # Sometimes the xref table is corrupt and points to an offset slightly too early in the file.
  # check the next token, maybe we can find the start of the object we're looking for
  if idCheck != id
    Error.assert_equal(parse_token, id)
  end
  Error.assert_equal(parse_token, gen)

  Error.str_assert(parse_token, "obj")

  obj = parse_token
  post_obj = parse_token

  if obj.is_a?(Hash) && post_obj == "stream"
    stream(obj)
  else
    obj
  end
end
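The header check in #object tolerates one stray token before the object ID, which covers an xref offset that lands slightly too early. The sketch below reproduces only that control flow over a plain array of pre-split tokens. `read_object_header` is a hypothetical helper, and `ArgumentError` stands in for whatever `Error.assert_equal` and `Error.str_assert` raise; neither is taken from the library.

```ruby
# Hypothetical stand-in for the "id gen obj" header check.
# It works on an Array of tokens, not on a PDF::Reader::Buffer.
def read_object_header(tokens, id, gen)
  tokens = tokens.dup
  first = tokens.shift

  # If the first token is not the expected id, allow exactly one retry.
  if first != id
    second = tokens.shift
    raise ArgumentError, "expected object id #{id}, got #{second.inspect}" unless second == id
  end

  found_gen = tokens.shift
  raise ArgumentError, "expected generation #{gen}, got #{found_gen.inspect}" unless found_gen == gen

  keyword = tokens.shift
  raise ArgumentError, "expected 'obj', got #{keyword.inspect}" unless keyword == "obj"

  tokens.shift # the object body that follows the header
end

body = { Type: :Page }

# Offset points exactly at the object header.
clean = read_object_header([7, 0, "obj", body, "endobj"], 7, 0)

# Offset points one token too early; the stray token is skipped.
early = read_object_header(["endobj", 7, 0, "obj", body, "endobj"], 7, 0)

# Two wrong tokens in a row exceed the tolerance and raise.
rejected = begin
  read_object_header([8, 9, 0, "obj", body], 7, 0)
  false
rescue ArgumentError
  true
end
```

Only a single leading token is forgiven, so this is a narrow recovery path rather than a general resynchronisation scan.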
#parse_token(operators = {}) ⇒ Object
Reads the next token from the underlying buffer and converts it to an appropriate object.
operators - a hash of supported operators to read from the underlying buffer.
: (?Hash[String | PDF::Reader::Token, Symbol]) -> (PDF::Reader::Reference | PDF::Reader::Token | Numeric | String | Symbol | Array | Hash[untyped, untyped] | nil)
# File 'lib/pdf/reader/parser.rb', line 85
def parse_token(operators={})
  token = @buffer.token

  if token.nil?
    nil
  elsif token.is_a?(String) && STRATEGIES.has_key?(token)
    proc = STRATEGIES[token]
    proc.call(self, token) if proc
  elsif token.is_a? PDF::Reader::Reference
    token
  elsif operators.has_key? token
    Token.new(token)
  elsif token.frozen?
    token
  elsif token =~ /\d*\.\d/
    token.to_f
  else
    token.to_i
  end
end
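The last two branches of #parse_token decide between a Float and an Integer using the pattern `/\d*\.\d/`. The sketch below isolates that decision so its edge cases are easier to see. `numeric_from_token` is a hypothetical helper written for this illustration and is not a method of PDF::Reader::Parser.

```ruby
# Hypothetical helper mirroring only the final numeric fallback.
def numeric_from_token(token)
  if token =~ /\d*\.\d/
    token.to_f # a dot followed by at least one digit: treat as a real number
  else
    token.to_i # everything else falls through to integer conversion
  end
end

real          = numeric_from_token("3.14")
integer       = numeric_from_token("42")
# "5." has no digit after the dot, so the pattern does not match.
trailing_dot  = numeric_from_token("5.")
# String#to_i returns 0 when the string has no leading digits.
non_numeric   = numeric_from_token("abc")
```

In this sketch, any string that does not match the pattern goes through `String#to_i`, so a non-numeric string yields 0 instead of raising.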