Class: PDF::Hash
Overview
Provides low level access to the objects in a PDF file via a hash-like object.
A PDF file can be viewed as a large hash map. It is a series of objects stored at exact byte offsets, plus a table that maps object IDs to those offsets. Given an object ID, looking up an object is an O(1) operation.
Each PDF object can be mapped to a Ruby object, so passing an object ID to the [] method retrieves a Ruby representation of that object.
The class behaves much like a standard Ruby hash, including the use of the Enumerable mixin. The key difference is that there is no []= method: the hash is read-only.
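To make the "large hash map" picture concrete, here is a minimal sketch that does not depend on the library. The record layout, the offsets table and the read_object helper are all invented for illustration; a real PDF cross-reference table is considerably more involved.

```ruby
require "stringio"

# A toy "file": three records stored back to back. This is not real PDF syntax.
records = { 1 => "first object\n", 2 => "second object\n", 3 => "third object\n" }

data    = String.new
offsets = {}                    # object ID => byte offset (the toy "xref table")
records.each do |id, body|
  offsets[id] = data.bytesize   # remember where this record starts
  data << body
end

# Hypothetical helper: jump straight to the recorded offset and read one record.
def read_object(io, offsets, id)
  offset = offsets[id]
  return nil if offset.nil?     # unknown ID
  io.seek(offset)
  io.gets.chomp
end

io = StringIO.new(data)
read_object(io, offsets, 2)     # => "second object"
```

The lookup cost does not depend on how many records precede the requested one, which is the property the overview describes.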
Basic Usage
h = PDF::Hash.new("somefile.pdf")
h[1]
=> 3469
h[PDF::Reader::Reference.new(1,0)]
=> 3469
Instance Attribute Summary
- #default ⇒ Object
  Returns the value of attribute default.
- #trailer ⇒ Object (readonly)
  Returns the value of attribute trailer.
- #version ⇒ Object (readonly)
  Returns the value of attribute version.
Instance Method Summary
- #[](key) ⇒ Object
  Access an object from the PDF.
- #each(&block) ⇒ Object (also: #each_pair)
  iterate over each key, value.
- #each_key(&block) ⇒ Object
  iterate over each key.
- #each_value(&block) ⇒ Object
  iterate over each value.
- #empty? ⇒ Boolean
  return true if there are no objects in this file.
- #fetch(key, local_default = nil) ⇒ Object
  Access an object from the PDF.
- #has_key?(check_key) ⇒ Boolean (also: #include?, #key?, #member?, #value?)
  return true if the specified key exists in the file.
- #has_value?(value) ⇒ Boolean
  return true if the specified value exists in the file.
- #initialize(input) ⇒ Hash (constructor)
  Creates a new PDF::Hash object.
- #keys ⇒ Object
  return an array of all keys in the file.
- #size ⇒ Object (also: #length)
  return the number of objects in the file.
- #to_a ⇒ Object
  return an array of arrays.
- #to_s ⇒ Object
- #values ⇒ Object
  return an array of all values in the file.
- #values_at(*ids) ⇒ Object
  return an array of all values from the specified keys.
Constructor Details
#initialize(input) ⇒ Hash
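Creates a new PDF::Hash object. input can be a string naming an existing file, or an IO-like object (IO or StringIO). Judging by the source listing, any other string, including one that holds the raw contents of a PDF, raises ArgumentError, so such data needs to be wrapped in a StringIO first.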
# File 'lib/pdf/hash.rb', line 35

def initialize(input)
  if input.kind_of?(IO) || input.kind_of?(StringIO)
    io = input
  elsif File.file?(input.to_s)
    if File.respond_to?(:binread)
      input = File.binread(input.to_s)
    else
      input = File.read(input.to_s)
    end
    io = StringIO.new(input)
  else
    raise ArgumentError, "input must be an IO-like object or a filename"
  end
  @version = read_version(io)
  @xref = PDF::Reader::XRef.new(io)
  @trailer = @xref.load
end
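The branching in the constructor can be tried in isolation. The sketch below is a stand-alone approximation: open_pdf_input is a hypothetical helper, not part of the library, and it only reproduces the input handling (it assumes File.binread is available and does not parse anything).

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper mirroring the constructor's input handling:
# IO-like objects pass through, filenames are read in binary mode,
# anything else is rejected.
def open_pdf_input(input)
  if input.kind_of?(IO) || input.kind_of?(StringIO)
    input
  elsif File.file?(input.to_s)
    StringIO.new(File.binread(input.to_s))
  else
    raise ArgumentError, "input must be an IO-like object or a filename"
  end
end

# Raw bytes held in a String are not a filename, so wrap them in a StringIO.
raw = "%PDF-1.4\n".b
open_pdf_input(StringIO.new(raw)).read(8)   # => "%PDF-1.4"

# A path to an existing file is read into memory and wrapped the same way.
Tempfile.create("sketch") do |f|
  f.binmode
  f.write(raw)
  f.flush
  open_pdf_input(f.path).read(8)            # => "%PDF-1.4"
end
```

Reading the whole file into a StringIO, as the listing does, keeps later seeks cheap at the cost of holding the file in memory.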
Instance Attribute Details
#default ⇒ Object
Returns the value of attribute default.
# File 'lib/pdf/hash.rb', line 29

def default
  @default
end
#trailer ⇒ Object (readonly)
Returns the value of attribute trailer.
# File 'lib/pdf/hash.rb', line 30

def trailer
  @trailer
end
#version ⇒ Object (readonly)
Returns the value of attribute version.
# File 'lib/pdf/hash.rb', line 30

def version
  @version
end
Instance Method Details
#[](key) ⇒ Object
Access an object from the PDF. key can be an int or a PDF::Reader::Reference object.
If an int is used, the object with that ID and a generation number of 0 will be returned.
If a PDF::Reader::Reference object is used the exact ID and generation number can be specified.
# File 'lib/pdf/hash.rb', line 62

def [](key)
  return default if key.to_i <= 0
  begin
    unless key.kind_of?(PDF::Reader::Reference)
      key = PDF::Reader::Reference.new(key.to_i, 0)
    end
    @xref.object(key)
  rescue
    return default
  end
end
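The key handling above (an integer becomes a reference with generation 0, and non-positive keys fall back to the default) can be sketched without the library. Ref is a made-up stand-in for PDF::Reader::Reference, and lookup and the store hash are hypothetical. The sketch uses Hash#fetch for the fallback where the listing uses rescue.

```ruby
# Stand-in for PDF::Reader::Reference, invented so the sketch runs on its own.
Ref = Struct.new(:id, :gen) do
  def to_i
    id
  end
end

# Hypothetical lookup that normalises the key the way the listing does.
def lookup(store, key, default = nil)
  return default if key.to_i <= 0                       # nil, 0 and negatives
  key = Ref.new(key.to_i, 0) unless key.kind_of?(Ref)   # bare ID => generation 0
  store.fetch(key, default)
end

store = { Ref.new(1, 0) => 3469, Ref.new(2, 1) => "second generation" }

lookup(store, 1)               # => 3469
lookup(store, Ref.new(1, 0))   # => 3469
lookup(store, Ref.new(2, 1))   # => "second generation"
lookup(store, 2)               # => nil, because only generation 1 of object 2 exists
lookup(store, 0, :none)        # => :none
```

An object stored under a non-zero generation is therefore only reachable through an explicit reference, not through its bare integer ID.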
#each(&block) ⇒ Object Also known as: each_pair
iterate over each key, value. Just like a Ruby hash.
# File 'lib/pdf/hash.rb', line 100

def each(&block)
  @xref.each do |ref, obj|
    yield ref, obj
  end
end
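Because the class mixes in Enumerable, defining each is enough to get methods such as select, map and count. The following is a minimal sketch with a toy class; ToyObjects and its contents are invented and stand in for a real PDF::Hash.

```ruby
# Toy container that yields (id, object) pairs, the same shape as #each above.
class ToyObjects
  include Enumerable

  def initialize(pairs)
    @pairs = pairs              # Integer ID => object
  end

  def each
    return enum_for(:each) unless block_given?
    @pairs.each { |id, obj| yield id, obj }
  end
end

objs = ToyObjects.new(1 => "catalog", 2 => 3469, 3 => "pages")

# Enumerable methods come for free once #each exists.
objs.select { |id, obj| obj.is_a?(String) }.map { |id, _obj| id }   # => [1, 3]
objs.to_a     # => [[1, "catalog"], [2, 3469], [3, "pages"]]
objs.count    # => 3
```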
#each_key(&block) ⇒ Object
iterate over each key. Just like a Ruby hash.
# File 'lib/pdf/hash.rb', line 109

def each_key(&block)
  each do |id, obj|
    yield id
  end
end
#each_value(&block) ⇒ Object
iterate over each value. Just like a Ruby hash.
# File 'lib/pdf/hash.rb', line 117

def each_value(&block)
  each do |id, obj|
    yield obj
  end
end
#empty? ⇒ Boolean
return true if there are no objects in this file
# File 'lib/pdf/hash.rb', line 132

def empty?
  size == 0 ? true : false
end
#fetch(key, local_default = nil) ⇒ Object
Access an object from the PDF. key can be an int or a PDF::Reader::Reference object.
If an int is used, the object with that ID and a generation number of 0 will be returned.
If a PDF::Reader::Reference object is used the exact ID and generation number can be specified.
local_default is the object that will be returned if the requested key doesn’t exist.
# File 'lib/pdf/hash.rb', line 87

def fetch(key, local_default = nil)
  obj = self[key]
  if obj
    return obj
  elsif local_default
    return local_default
  else
    raise IndexError, "#{key} is invalid" if key.to_i <= 0
  end
end
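Read closely, the listing raises IndexError only when the key is not a positive integer. A positive ID that is simply absent appears to yield nil when no default is given, and a nil or false local_default is treated as "no default". The sketch below reproduces that control flow with a toy class; ToyHash and its data are invented for illustration and are not the library's class.

```ruby
# Toy stand-in whose #fetch has the same control flow as the listing above.
class ToyHash
  def initialize(objects)
    @objects = objects          # Integer ID => object
  end

  def [](key)
    @objects[key.to_i]
  end

  def fetch(key, local_default = nil)
    obj = self[key]
    if obj
      obj
    elsif local_default
      local_default
    else
      # Evaluates to nil when the condition is false, so nothing is raised.
      raise IndexError, "#{key} is invalid" if key.to_i <= 0
    end
  end
end

h = ToyHash.new(1 => "catalog")
h.fetch(1)              # => "catalog"
h.fetch(7, :fallback)   # => :fallback
h.fetch(7)              # => nil (positive ID, no default, no exception)

begin
  h.fetch(0)
rescue IndexError => e
  e.message             # => "0 is invalid"
end
```

This differs from Ruby's own Hash#fetch, which raises KeyError for any missing key when neither a default nor a block is supplied.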
#has_key?(check_key) ⇒ Boolean Also known as: include?, key?, member?, value?
return true if the specified key exists in the file. check_key can be an int or a PDF::Reader::Reference
# File 'lib/pdf/hash.rb', line 139

def has_key?(check_key)
  # TODO update from O(n) to O(1)
  each_key do |key|
    if check_key.kind_of?(PDF::Reader::Reference)
      return true if check_key == key
    else
      return true if check_key.to_i == key.id
    end
  end
  return false
end
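The TODO in the listing notes that the check is currently a linear scan. One possible way to get constant-time membership, shown purely as a sketch and not as the library's approach, is to build set-based indexes up front. ObjRef and KeyIndex are hypothetical names; ObjRef stands in for PDF::Reader::Reference.

```ruby
require "set"

# Stand-in for PDF::Reader::Reference, invented for this sketch.
ObjRef = Struct.new(:id, :gen)

# Hypothetical index built once from the full list of keys.
class KeyIndex
  def initialize(keys)
    @refs = Set.new(keys)               # exact (id, generation) matches
    @ids  = Set.new(keys.map(&:id))     # bare IDs, any generation
  end

  # Same matching rules as the listing: a reference must match exactly,
  # anything else is compared by integer ID only.
  def has_key?(check_key)
    if check_key.kind_of?(ObjRef)
      @refs.include?(check_key)
    else
      @ids.include?(check_key.to_i)
    end
  end
end

index = KeyIndex.new([ObjRef.new(1, 0), ObjRef.new(5, 2)])
index.has_key?(5)                  # => true
index.has_key?(ObjRef.new(5, 2))   # => true
index.has_key?(ObjRef.new(5, 0))   # => false
index.has_key?(9)                  # => false
```

The trade-off is extra memory and the need to keep the index in sync, which is straightforward here because the hash is read-only.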
#has_value?(value) ⇒ Boolean
return true if the specified value exists in the file
# File 'lib/pdf/hash.rb', line 156

def has_value?(value)
  # TODO update from O(n) to O(1)
  each_value do |obj|
    return true if obj == value
  end
  return false
end
#keys ⇒ Object
return an array of all keys in the file
# File 'lib/pdf/hash.rb', line 171

def keys
  ret = []
  each_key { |k| ret << k }
  ret
end
#size ⇒ Object Also known as: length
return the number of objects in the file. An object with multiple generations is counted once.
# File 'lib/pdf/hash.rb', line 125

def size
  @xref.size
end
#to_a ⇒ Object
return an array of arrays. Each sub array contains a key/value pair.
# File 'lib/pdf/hash.rb', line 193

def to_a
  ret = []
  each do |id, obj|
    ret << [id, obj]
  end
  ret
end
#to_s ⇒ Object
# File 'lib/pdf/hash.rb', line 165

def to_s
  "<PDF::Hash size: #{self.size}>"
end
#values ⇒ Object
return an array of all values in the file
# File 'lib/pdf/hash.rb', line 179

def values
  ret = []
  each_value { |v| ret << v }
  ret
end
#values_at(*ids) ⇒ Object
return an array of all values from the specified keys
# File 'lib/pdf/hash.rb', line 187

def values_at(*ids)
  ids.map { |id| self[id] }
end