Class: PDF::Reader::XRef
- Inherits:
- Object
- Defined in:
- lib/pdf/reader/xref.rb
Overview
An internal PDF::Reader class that represents the Xref table in a PDF file. An Xref table is a map of object identifiers to byte offsets. Any time a particular object needs to be found, the Xref table is used to find where it is stored in the file.
Instance Method Summary
-
#initialize(buffer) ⇒ XRef
constructor
Creates a new Xref table based on the contents of the supplied PDF::Reader::Buffer object.
-
#load(offset = nil) ⇒ Object
Read the xref table from the underlying buffer.
-
#load_xref_table ⇒ Object
Assumes the underlying buffer is positioned at the start of an Xref table and processes it into memory.
-
#obj_type(ref) ⇒ Object
Returns the type of object a ref points to.
-
#object(ref, save_pos = true) ⇒ Object
Return a string containing the contents of an entire PDF object.
-
#offset_for(ref) ⇒ Object
Returns the byte offset for the specified PDF object.
-
#pdf_version ⇒ Object
Returns the PDF version of the current document.
-
#store(id, gen, offset) ⇒ Object
Stores an offset value for a particular PDF object ID and revision number.
-
#stream?(ref) ⇒ Boolean
Returns true if the supplied reference points to an object with a stream.
Constructor Details
#initialize(buffer) ⇒ XRef
Creates a new Xref table based on the contents of the supplied PDF::Reader::Buffer object.
# File 'lib/pdf/reader/xref.rb', line 35
def initialize (buffer)
  @buffer = buffer
  @xref = {}
end
Instance Method Details
#load(offset = nil) ⇒ Object
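Read the xref table from the underlying buffer. If offset is specified, the table is loaded from there; otherwise the default offset is located and used.
Raises PDF::Reader::MalformedPDFError if there is no xref table at the requested offset, and PDF::Reader::UnsupportedFeatureError if the offset points at an XRef stream, which is not yet supported.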
# File 'lib/pdf/reader/xref.rb', line 54
def load (offset = nil)
  offset ||= @buffer.find_first_xref_offset
  @buffer.seek(offset)
  token = @buffer.token

  if token == "xref" || token == "ref"
    load_xref_table
  elsif token.to_i >= 0 && @buffer.token.to_i >= 0 && @buffer.token == "obj"
    raise PDF::Reader::UnsupportedFeatureError, "XRef streams are not supported in PDF::Reader yet"
  else
    raise PDF::Reader::MalformedPDFError, "xref table not found at offset #{offset} (#{token} != xref)"
  end
end
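The three-way token check in load can be looked at in isolation. The following is a minimal sketch, not PDF::Reader code: classify_xref_start is a hypothetical helper that takes an array of already-split tokens (an assumption standing in for successive @buffer.token calls) and reports which branch would be taken.

```ruby
# Hypothetical helper (not part of PDF::Reader): mirrors the three-way
# branch in #load, using a plain array of tokens instead of a Buffer.
def classify_xref_start(tokens)
  first, second, third = tokens

  if first == "xref" || first == "ref"
    :table        # a classic cross reference table follows
  elsif first.to_i >= 0 && second.to_i >= 0 && third == "obj"
    :xref_stream  # "<id> <gen> obj" suggests an XRef stream object
  else
    :not_found    # anything else is treated as a malformed file
  end
end

p classify_xref_start(%w[xref 0 3])     # :table
p classify_xref_start(%w[12 0 obj])     # :xref_stream
p classify_xref_start(%w[12 0 endobj])  # :not_found
```

In the real method the second and third outcomes raise UnsupportedFeatureError and MalformedPDFError respectively, rather than returning a symbol.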
#load_xref_table ⇒ Object
Assumes the underlying buffer is positioned at the start of an Xref table and processes it into memory.
# File 'lib/pdf/reader/xref.rb', line 83
def load_xref_table
  tok_one = tok_two = nil

  begin
    # loop over all subsections of the xref table
    # In a well formed PDF, the 'trailer' token will indicate
    # the end of the table. However we need to be careful in case
    # we're processing a malformed pdf that is missing the trailer.
    loop do
      tok_one, tok_two = @buffer.token, @buffer.token
      if tok_one != "trailer" && !tok_one.match(/\d+/)
        raise MalformedPDFError, "PDF malformed, missing trailer after cross reference"
      end
      break if tok_one == "trailer" or tok_one.nil?
      objid, count = tok_one.to_i, tok_two.to_i

      count.times do
        offset = @buffer.token.to_i
        generation = @buffer.token.to_i
        state = @buffer.token

        store(objid, generation, offset) if state == "n"
        objid += 1
      end
    end
  rescue EOFError => e
    raise MalformedPDFError, "PDF malformed, missing trailer after cross reference"
  end

  raise MalformedPDFError, "PDF malformed, trailer should be a dictionary" unless tok_two == "<<"

  trailer = Parser.new(@buffer, self).dictionary
  load(trailer[:Prev].to_i) if trailer.has_key?(:Prev)

  trailer
end
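To make the subsection loop easier to follow, here is a simplified, stand-alone sketch of the same idea. It is not the library's implementation: parse_xref_text is a hypothetical function, a whitespace split of a String stands in for PDF::Reader::Buffer, and the trailer dictionary and :Prev chaining are left out.

```ruby
# Illustrative only: parses the text of a cross reference table into a
# nested hash of {object_id => {generation => offset}}.
def parse_xref_text(text)
  tokens = text.split
  raise ArgumentError, "expected 'xref' keyword" unless tokens.shift == "xref"

  table = {}
  until tokens.empty? || tokens.first == "trailer"
    # Each subsection starts with "<first object id> <entry count>".
    objid = Integer(tokens.shift, 10)
    count = Integer(tokens.shift, 10)

    count.times do
      offset     = tokens.shift.to_i
      generation = tokens.shift.to_i
      state      = tokens.shift
      # Only in-use ("n") entries are recorded; free ("f") entries are skipped.
      (table[objid] ||= {})[generation] ||= offset if state == "n"
      objid += 1
    end
  end
  table
end

sample = <<~XREF
  xref
  0 3
  0000000000 65535 f
  0000000017 00000 n
  0000000081 00000 n
  trailer
  << /Size 3 >>
XREF

p parse_xref_text(sample)
```

For the sample above, object 0 is a free entry and is dropped, while objects 1 and 2 map generation 0 to offsets 17 and 81.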
#obj_type(ref) ⇒ Object
Returns the type of object a ref points to.
# File 'lib/pdf/reader/xref.rb', line 120
def obj_type(ref)
  obj = object(ref)
  obj.class.to_s.to_sym
end
#object(ref, save_pos = true) ⇒ Object
Return a string containing the contents of an entire PDF object. The object is requested by specifying a PDF::Reader::Reference object that contains the object's ID and revision number.
If the object is a stream, that is returned as well.
# File 'lib/pdf/reader/xref.rb', line 73
def object (ref, save_pos = true)
  return ref unless ref.kind_of?(Reference)
  pos = @buffer.pos if save_pos
  obj = Parser.new(@buffer.seek(offset_for(ref)), self).object(ref.id, ref.gen)
  @buffer.seek(pos) if save_pos
  return obj
end
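The save_pos flag follows a common save, seek, read, restore pattern. The sketch below shows that pattern with a StringIO in place of the buffer; read_at is a hypothetical helper and the sample bytes are invented for illustration.

```ruby
require "stringio"

# Hypothetical helper (not part of PDF::Reader): reads `length` bytes at
# `offset`, optionally restoring the stream position afterwards.
def read_at(io, offset, length, save_pos = true)
  pos = io.pos if save_pos
  io.seek(offset)
  data = io.read(length)
  io.seek(pos) if save_pos
  data
end

io = StringIO.new("1 0 obj (first) endobj 2 0 obj (second) endobj")
io.seek(4)

chunk = read_at(io, 23, 7)
puts chunk   # prints "2 0 obj"
puts io.pos  # prints 4, because the position was restored
```

Passing false for save_pos leaves the stream positioned just after the bytes that were read, which suits a caller that intends to keep reading from there.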
#offset_for(ref) ⇒ Object
Returns the byte offset for the specified PDF object.
ref - a PDF::Reader::Reference object containing an object ID and revision number
# File 'lib/pdf/reader/xref.rb', line 133
def offset_for (ref)
  @xref[ref.id][ref.gen]
rescue
  raise InvalidObjectError, "Object #{ref.id}, Generation #{ref.gen} is invalid"
end
#pdf_version ⇒ Object
Returns the PDF version of the current document. Technically this isn’t part of the XRef table, but it is one of the lowest-level data items in the file, so we’ve lumped it in with the cross reference code.
# File 'lib/pdf/reader/xref.rb', line 43
def pdf_version
  @buffer.seek(0)
  m, version = *@buffer.read(8).match(/%PDF-(\d.\d)/)
  raise MalformedPDFError, 'invalid PDF version' if version.nil?
  return version.to_f
end
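The version lookup only inspects the first eight bytes of the file. A stand-alone sketch of the same check follows; sniff_pdf_version is a hypothetical name, the io argument is assumed to respond to seek and read (as StringIO and File do), and unlike the listing above the pattern escapes the dot so that only a literal "." is accepted.

```ruby
require "stringio"

# Hypothetical stand-alone version check (not PDF::Reader's own code).
def sniff_pdf_version(io)
  io.seek(0)
  header = io.read(8).to_s          # "%PDF-x.y" is exactly 8 bytes
  match = header.match(/%PDF-(\d\.\d)/)
  raise ArgumentError, "invalid PDF version" if match.nil?
  match[1].to_f
end

p sniff_pdf_version(StringIO.new("%PDF-1.4\n%more bytes"))  # 1.4
```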
#store(id, gen, offset) ⇒ Object
Stores an offset value for a particular PDF object ID and revision number.
# File 'lib/pdf/reader/xref.rb', line 140
def store (id, gen, offset)
  (@xref[id] ||= {})[gen] ||= offset
end
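Because store uses `||=`, the first offset recorded for a given ID and generation is kept and later ones are ignored. Assuming tables are loaded newest first (as the :Prev handling in load_xref_table suggests), this would mean entries from an older table never override newer ones. The sketch below demonstrates the behaviour with a hypothetical TinyXref class that is not part of PDF::Reader and whose offset_for, unlike the real one, returns nil instead of raising.

```ruby
# Stand-alone illustration of the nested-hash layout and the `||=`
# behaviour; TinyXref is a hypothetical name, not a PDF::Reader class.
class TinyXref
  def initialize
    @xref = {}
  end

  # First write wins: an existing offset for (id, gen) is never replaced.
  def store(id, gen, offset)
    (@xref[id] ||= {})[gen] ||= offset
  end

  # Returns nil rather than raising when the object is unknown.
  def offset_for(id, gen)
    @xref.fetch(id, {})[gen]
  end
end

xref = TinyXref.new
xref.store(7, 0, 2048)   # entry stored first
xref.store(7, 0, 512)    # second entry for the same object is ignored
p xref.offset_for(7, 0)  # 2048
p xref.offset_for(9, 0)  # nil
```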
#stream?(ref) ⇒ Boolean
Returns true if the supplied reference points to an object with a stream.
# File 'lib/pdf/reader/xref.rb', line 125
def stream?(ref)
  obj, stream = @xref.object(ref)
  stream ? true : false
end