Class: PDF::Reader::XRef

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/pdf/reader/xref.rb

Overview

An internal PDF::Reader class that represents the XRef table in a PDF file as a hash-like object.

An Xref table maps object identifiers to byte offsets. Whenever a particular object needs to be found, the Xref table is consulted to find where it is stored in the file.

Hash keys are object ids; values are either:

  • a byte offset where the object starts (regular PDF objects)

  • a PDF::Reader::Reference instance that points to a stream that contains the desired object (PDF objects embedded in an object stream)

The class behaves much like a standard Ruby hash, including the use of the Enumerable mixin. The key difference is that there is no []= method: the hash is read-only.
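As a rough illustration of the shape described above, the table can be pictured as a nested hash keyed by object id and then by generation number. This is only a sketch under that assumption: `OverviewRef` is a hypothetical stand-in for PDF::Reader::Reference, and the ids and offsets are made up.

```ruby
# Hypothetical stand-in for PDF::Reader::Reference (illustration only).
OverviewRef = Struct.new(:id, :gen)

# object id => { generation => location }
# A location is either an Integer byte offset or a reference to the
# object stream that contains the object.
overview_table = {
  1 => { 0 => 15 },                     # regular object at byte offset 15
  2 => { 0 => 120, 1 => 480 },          # two generations of the same id
  3 => { 0 => OverviewRef.new(7, 0) }   # embedded in object stream 7
}.freeze

# Classify a location by its type.
location = overview_table.fetch(3).fetch(0)
kind = location.is_a?(Integer) ? :byte_offset : :object_stream
```

Freezing the outer hash here only mimics the read-only behaviour; the real class achieves it by not defining []=.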

Constructor Details

#initialize(io) ⇒ XRef

Creates a new Xref table based on the contents of the supplied io object.

io - must be an IO object, generally either a File or a StringIO



# File 'lib/pdf/reader/xref.rb', line 58

def initialize(io)
  @io = io
  @junk_offset = calc_junk_offset(io) || 0
  @xref = {}
  @trailer = load_offsets
end

Instance Attribute Details

#trailer ⇒ Object (readonly)

Returns the value of attribute trailer.



# File 'lib/pdf/reader/xref.rb', line 51

def trailer
  @trailer
end

Instance Method Details

#[](ref) ⇒ Object

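Returns the table entry for the specified PDF object: a byte offset for a regular object, or a PDF::Reader::Reference for an object embedded in an object stream. Raises InvalidObjectError if the table has no entry for the given id and generation.

ref - a PDF::Reader::Reference object containing an object id and generation number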



# File 'lib/pdf/reader/xref.rb', line 75

def [](ref)
  @xref.fetch(ref.id, {}).fetch(ref.gen)
rescue
  raise InvalidObjectError, "Object #{ref.id}, Generation #{ref.gen} is invalid"
end
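The lookup pattern above can be tried in isolation. The following is a minimal sketch, assuming the table is a nested hash of id => { generation => offset }; `LookupRef`, `LookupError` and `lookup` are hypothetical names used only for this example, not part of PDF::Reader.

```ruby
# Hypothetical stand-ins for PDF::Reader::Reference and InvalidObjectError.
LookupRef = Struct.new(:id, :gen)
class LookupError < StandardError; end

# Two-level fetch: an unknown id falls back to an empty hash, so the
# second fetch raises KeyError for both unknown ids and unknown generations.
def lookup(table, ref)
  table.fetch(ref.id, {}).fetch(ref.gen)
rescue KeyError
  raise LookupError, "Object #{ref.id}, Generation #{ref.gen} is invalid"
end

lookup_table = { 1 => { 0 => 15 } }

found = lookup(lookup_table, LookupRef.new(1, 0))

missing_message =
  begin
    lookup(lookup_table, LookupRef.new(9, 0))
  rescue LookupError => e
    e.message
  end
```

Unlike the original method, which uses a bare rescue, this sketch rescues only KeyError so that unrelated errors are not relabelled.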

#each(&block) ⇒ Object

Iterates over each object in the xref table in ascending id order, yielding a PDF::Reader::Reference for the highest generation of each id.



# File 'lib/pdf/reader/xref.rb', line 82

def each(&block)
  ids = @xref.keys.sort
  ids.each do |id|
    gen = @xref.fetch(id, {}).keys.sort[-1]
    yield PDF::Reader::Reference.new(id, gen.to_i)
  end
end
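To see which references such an iteration yields, here is a small self-contained sketch over a made-up table. `EachRef` and `each_latest` are hypothetical names for this example, and the nested-hash layout is assumed from the code above.

```ruby
# Hypothetical stand-in for PDF::Reader::Reference.
EachRef = Struct.new(:id, :gen)

# Yield one reference per object id, in ascending id order, using the
# highest generation number recorded for that id.
def each_latest(table)
  table.keys.sort.each do |id|
    gen = table.fetch(id, {}).keys.sort.last
    yield EachRef.new(id, gen.to_i)
  end
end

each_table = {
  2 => { 0 => 120, 1 => 480 },
  1 => { 0 => 15 }
}

yielded = []
each_latest(each_table) { |ref| yielded << [ref.id, ref.gen] }
# yielded is [[1, 0], [2, 1]]
```

Because the class defines each and includes Enumerable, methods such as map and select become available on the real object in the same way.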

#size ⇒ Object

Returns the number of objects in this file. Objects with multiple generations are only counted once.



# File 'lib/pdf/reader/xref.rb', line 68

def size
  @xref.size
end
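The difference between counting objects and counting generations can be shown with a plain nested hash. This is a sketch with made-up data, assuming the id => { generation => offset } layout used in the code above.

```ruby
size_table = {
  1 => { 0 => 15 },
  2 => { 0 => 120, 1 => 480 }   # one object, two generations
}

# Counting top-level keys counts each object id once.
object_count = size_table.size

# Counting every generation entry gives a larger number.
entry_count = size_table.values.sum(&:size)
```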