Class: PDF::Reader::Parser

Inherits:
Object
Defined in:
lib/pdf/reader/parser.rb

Overview

An internal PDF::Reader class that reads objects from the PDF file and converts them into usable Ruby objects (hashes, arrays, true, false, etc.).

Instance Method Summary

Constructor Details

#initialize(buffer, ohash = nil) ⇒ Parser

Create a new parser around a PDF::Reader::Buffer object

buffer - a PDF::Reader::Buffer object that contains PDF data
ohash - a PDF::Reader::ObjectHash object that can return objects from the PDF file



# File 'lib/pdf/reader/parser.rb', line 36

def initialize (buffer, ohash=nil)
  @buffer = buffer
  @ohash  = ohash
end

Instance Method Details

#object(id, gen) ⇒ Object

Reads an entire PDF object from the buffer and returns it as a Ruby object. If the object is a content stream, returns both the stream and the dictionary that describes it.

id - the object ID to return
gen - the object revision number to return



# File 'lib/pdf/reader/parser.rb', line 75

def object (id, gen)
  Error.assert_equal(parse_token, id)
  Error.assert_equal(parse_token, gen)
  Error.str_assert(parse_token, "obj")

  obj = parse_token
  post_obj = parse_token
  if post_obj == "stream"
    stream(obj)
  else
    obj
  end
end
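
The control flow of #object can be hard to picture without a buffer to hand, so here is a minimal, self-contained sketch of it. SketchParser is a hypothetical stand-in, not part of PDF::Reader: it pulls pre-parsed tokens from an array instead of a PDF::Reader::Buffer, raises ArgumentError where the real method uses the Error helpers, and returns the dictionary and stream data as a two-element array, which is an assumption about what stream(obj) produces.

```ruby
# Hypothetical stand-in for the parser: tokens are already parsed and queued.
class SketchParser
  def initialize(tokens)
    @tokens = tokens.dup
  end

  # Returns the next queued token, or nil when the queue is empty.
  def parse_token
    @tokens.shift
  end

  # Mirrors the shape of Parser#object: check the "id gen obj" header,
  # read the object body, then look one token ahead for a stream.
  def object(id, gen)
    raise ArgumentError, "expected object id #{id}" unless parse_token == id
    raise ArgumentError, "expected generation #{gen}" unless parse_token == gen
    raise ArgumentError, "expected 'obj' keyword" unless parse_token == "obj"

    obj = parse_token
    post_obj = parse_token
    # Assumption: the stream case is modelled as [dictionary, data].
    post_obj == "stream" ? [obj, parse_token] : obj
  end
end

plain = SketchParser.new([7, 0, "obj", { Type: :Catalog }, "endobj"])
plain.object(7, 0)    # the dictionary on its own

streamed = SketchParser.new([8, 0, "obj", { Length: 5 }, "stream", "hello", "endstream"])
streamed.object(8, 0) # the dictionary paired with the stream data
```

In both cases the caller supplies the id and generation it expects to find, and a mismatch is reported before any of the object body is read.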

#parse_token(operators = {}) ⇒ Object

Reads the next token from the underlying buffer and converts it to an appropriate object.

operators - a hash of supported operators to read from the underlying buffer.



# File 'lib/pdf/reader/parser.rb', line 45

def parse_token (operators={})
  token = @buffer.token

  case token
  when PDF::Reader::Reference, nil then return token
  when "/"                         then return pdf_name()
  when "<<"                        then return dictionary()
  when "["                         then return array()
  when "("                         then return string()
  when "<"                         then return hex_string()
  when "true"                      then return true
  when "false"                     then return false
  when "null"                      then return nil
  when "obj", "endobj", "stream", "endstream" then return Token.new(token)
  when ">>", "]", ">", ")"         then return Token.new(token)
  else
    if operators.has_key?(token)   then return Token.new(token)
    elsif token =~ /\d*\.\d/       then return token.to_f
    else                           return token.to_i
    end
  end
end
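
The final else branch decides how any token not matched earlier is interpreted, and its behaviour depends on the operators hash. The sketch below isolates that branch in a hypothetical helper, fallback_token, which is not part of PDF::Reader. For simplicity it returns the raw string for a known operator, where the real method wraps it in a Token.

```ruby
# Hypothetical helper mirroring the else branch of parse_token.
def fallback_token(token, operators = {})
  # Known operators are passed through (the real code returns Token.new(token)).
  return token if operators.has_key?(token)

  if token =~ /\d*\.\d/
    token.to_f # anything containing ".<digit>" is read as a Float
  else
    token.to_i # everything else goes through String#to_i
  end
end

fallback_token("42")                          # => 42
fallback_token("-3.14")                       # => -3.14
fallback_token("Tj")                          # => 0
fallback_token("Tj", { "Tj" => :show_text })  # => "Tj"
```

The third call shows why the operators argument matters: String#to_i returns 0 for text with no leading digits, so an operator that is missing from the hash is silently read as the integer 0.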