Class: PDF::Reader::XRef

Inherits:
Object
Defined in:
lib/pdf/reader/xref.rb

Overview

An internal PDF::Reader class that represents the xref table in a PDF file. An xref table is a map from object identifiers to byte offsets. Whenever a particular object needs to be found, the xref table is used to look up where it is stored in the file.
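
As a rough illustration of that idea (not PDF::Reader's own code), the sketch below models the table as a nested Hash keyed by object ID and then revision number, mirroring the shape of the `@xref` hash used by the methods on this page. The sample data and the `read_object_at` helper are hypothetical.

```ruby
require "stringio"

# A toy "file" containing two indirect objects.
data = "%PDF-1.4\n1 0 obj\n(hello)\nendobj\n2 0 obj\n42\nendobj\n"
io   = StringIO.new(data)

# id => { revision => byte offset }
xref = {
  1 => { 0 => data.index("1 0 obj") },
  2 => { 0 => data.index("2 0 obj") }
}

# Hypothetical helper: seek to the recorded offset and read up to "endobj".
def read_object_at(io, offset)
  io.seek(offset)
  io.gets("endobj")
end

puts read_object_at(io, xref[2][0])
```

The point of the table is that the reader can jump straight to an object with a single seek instead of scanning the file.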


Constructor Details

#initialize(io) ⇒ XRef

Creates a new xref table based on the contents of the supplied PDF::Reader::Buffer object. Nothing is read from the buffer until #load is called.



# File 'lib/pdf/reader/xref.rb', line 35

def initialize(io)
  @io = io
  @xref = {}
end

Instance Method Details

#each(&block) ⇒ Object

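Iterates over each object in the xref table, yielding a reference and the object it points to. IDs are visited in ascending order, and only the highest revision of each ID is yielded.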



# File 'lib/pdf/reader/xref.rb', line 104

def each(&block)
  ids = @xref.keys.sort
  ids.each do |id|
    gen = @xref[id].keys.sort[-1]
    ref = PDF::Reader::Reference.new(id, gen)
    yield ref, object(ref)
  end
end
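
The selection logic in #each can be tried on its own with a plain Hash. This is only a sketch: the offsets are made up, and `latest_entries` is a hypothetical helper that returns the entries instead of yielding parsed objects.

```ruby
# Hypothetical helper: for every ID (ascending), pick the highest revision.
def latest_entries(xref)
  xref.keys.sort.map do |id|
    gen = xref[id].keys.max            # same result as keys.sort[-1]
    [id, gen, xref[id][gen]]
  end
end

# id => { revision => byte offset }; the values are invented for the demo.
sample = {
  3 => { 0 => 120 },
  1 => { 0 => 15, 1 => 300 }
}

latest_entries(sample).each do |id, gen, offset|
  puts "object #{id}, revision #{gen}, offset #{offset}"
end
```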

#load(offset = nil) ⇒ Object

Reads the xref table from the underlying buffer. If offset is specified, the table is loaded from there; otherwise the default offset is located and used.

Raises PDF::Reader::MalformedPDFError if there is no xref table at the requested offset, and PDF::Reader::UnsupportedFeatureError if the offset points at an XRef stream.



# File 'lib/pdf/reader/xref.rb', line 57

def load(offset = nil)
  offset ||= new_buffer.find_first_xref_offset

  buf = new_buffer(offset)
  token = buf.token

  if token == "xref" || token == "ref"
    load_xref_table(buf)
  elsif token.to_i >= 0 && buf.token.to_i >= 0 && buf.token == "obj"
    raise PDF::Reader::UnsupportedFeatureError, "XRef streams are not supported in PDF::Reader yet"
  else
    raise PDF::Reader::MalformedPDFError, "xref table not found at offset #{offset} (#{token} != xref)"
  end
end
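
The source of `load_xref_table` is not shown on this page, so the following is only a sketch of what reading a classic cross-reference section involves, assuming the usual layout: the `xref` keyword, a subsection header of "first ID, entry count", then one "offset revision flag" entry per object. `parse_xref_section` is a hypothetical helper that works on a String rather than a PDF::Reader::Buffer and does no recovery for truncated input.

```ruby
# Hypothetical helper: parse a classic "xref" section into
# id => { revision => offset }. Free ("f") entries are skipped.
def parse_xref_section(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  raise ArgumentError, "xref keyword not found" unless lines.shift == "xref"

  table = {}
  until lines.empty? || lines.first == "trailer"
    first_id, count = lines.shift.split.map(&:to_i)
    count.times do |i|
      offset, gen, flag = lines.shift.split
      next unless flag == "n"          # "n" = in use, "f" = free
      (table[first_id + i] ||= {})[gen.to_i] ||= offset.to_i
    end
  end
  table
end

section = <<~XREF
  xref
  0 3
  0000000000 65535 f
  0000000009 00000 n
  0000000032 00000 n
  trailer
XREF

p parse_xref_section(section)
```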

#obj_type(ref) ⇒ Object

Returns the type of the object a ref points to, as a Symbol built from the object's Ruby class name.



# File 'lib/pdf/reader/xref.rb', line 84

def obj_type(ref)
  obj = object(ref)
  obj.class.to_s.to_sym
end
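
To see what kind of Symbol this conversion produces, here is a small sketch using ordinary Ruby values as stand-ins for parsed PDF objects. `type_name` and the `Demo` module are hypothetical and exist only for this example.

```ruby
# Hypothetical helper performing the same conversion as #obj_type.
def type_name(obj)
  obj.class.to_s.to_sym
end

module Demo
  class Stream; end
end

p type_name({ "Type" => "Page" })
p type_name([1, 2, 3])
p type_name(Demo::Stream.new)          # a namespaced class keeps its "::"
```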

#object(ref) ⇒ Object

Returns the contents of an entire PDF object. The object is requested by passing a PDF::Reader::Reference object that contains the object's ID and revision number. An argument that is not a Reference is returned unchanged.

If the object is a stream, the stream is returned as well.



# File 'lib/pdf/reader/xref.rb', line 77

def object(ref)
  return ref unless ref.kind_of?(Reference)
  buf = new_buffer(offset_for(ref))
  Parser.new(buf, self).object(ref.id, ref.gen)
end

#offset_for(ref) ⇒ Object

Returns the byte offset for the specified PDF object. Raises InvalidObjectError if the lookup fails.

ref - a PDF::Reader::Reference object containing an object ID and revision number



# File 'lib/pdf/reader/xref.rb', line 97

def offset_for(ref)
  @xref[ref.id][ref.gen]
rescue
  raise InvalidObjectError, "Object #{ref.id}, Generation #{ref.gen} is invalid"
end
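
The nested lookup behaves differently for an unknown ID and for an unknown revision of a known ID. The sketch below reproduces the same lookup outside PDF::Reader to show both cases; `DemoRef`, `DemoInvalidObjectError`, `DEMO_XREF` and `demo_offset_for` are stand-ins invented for this example.

```ruby
# Stand-ins for PDF::Reader::Reference and the library's error class.
DemoRef = Struct.new(:id, :gen)
class DemoInvalidObjectError < StandardError; end

DEMO_XREF = { 1 => { 0 => 15 } }

def demo_offset_for(ref)
  DEMO_XREF[ref.id][ref.gen]
rescue
  raise DemoInvalidObjectError, "Object #{ref.id}, Generation #{ref.gen} is invalid"
end

p demo_offset_for(DemoRef.new(1, 0))   # known object
p demo_offset_for(DemoRef.new(1, 7))   # known ID, unknown revision: no exception

begin
  demo_offset_for(DemoRef.new(9, 0))   # unknown ID: indexing nil fails and is rescued
rescue DemoInvalidObjectError => e
  puts e.message
end
```

Callers that need a guaranteed offset may therefore want to check for nil as well as rescue the error.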

#pdf_version ⇒ Object

Returns the PDF version of the current document as a Float. Technically this isn’t part of the XRef table, but it is one of the lowest level data items in the file, so we’ve lumped it in with the cross reference code.

Raises:

  • (MalformedPDFError)



# File 'lib/pdf/reader/xref.rb', line 46

def pdf_version
  @io.seek(0)
  # the dot is escaped so that only a literal "." separates the two digits
  m, version = *@io.read(8).match(/%PDF-(\d\.\d)/)
  raise MalformedPDFError, 'invalid PDF version' if version.nil?
  return version.to_f
end
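
The header check is easy to try in isolation with a StringIO. This is a minimal sketch rather than the library method: `header_version` is a hypothetical name, it raises ArgumentError instead of MalformedPDFError, and it also guards against an empty stream.

```ruby
require "stringio"

# Hypothetical stand-alone version of the header check.
def header_version(io)
  io.seek(0)
  header = io.read(8).to_s             # read returns nil at EOF; to_s makes that ""
  match  = header.match(/%PDF-(\d\.\d)/)
  raise ArgumentError, "invalid PDF version" if match.nil?
  match[1].to_f
end

p header_version(StringIO.new("%PDF-1.4\n1 0 obj\n"))
```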

#size ⇒ Object

Returns the number of distinct object IDs stored in the xref table.



# File 'lib/pdf/reader/xref.rb', line 39

def size
  @xref.size
end

#store(id, gen, offset) ⇒ Object

Stores an offset value for a particular PDF object ID and revision number. An offset that has already been stored for the same ID and revision is left unchanged.



# File 'lib/pdf/reader/xref.rb', line 114

def store(id, gen, offset)
  (@xref[id] ||= {})[gen] ||= offset
end
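
Because of the two `||=` operators, the first offset stored for a given ID and revision wins. A plausible reason, stated here as an assumption rather than something this page confirms, is that newer xref sections are loaded before older ones, so later calls must not overwrite newer offsets. The sketch uses a hypothetical `store_offset` helper that takes the table as an argument.

```ruby
# Hypothetical helper with the same first-write-wins behaviour as #store.
def store_offset(table, id, gen, offset)
  (table[id] ||= {})[gen] ||= offset
end

table = {}
store_offset(table, 5, 0, 900)
store_offset(table, 5, 0, 120)         # ignored: 5/0 already has an offset
store_offset(table, 5, 1, 450)

p table
```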

#stream?(ref) ⇒ Boolean

Returns true if the supplied reference points to an object with a stream.

Returns:

  • (Boolean)


# File 'lib/pdf/reader/xref.rb', line 89

def stream?(ref)
  # @xref is a plain Hash with no #object method, so call XRef#object directly
  obj, stream = object(ref)
  stream ? true : false
end