Class: HexaPDF::Serializer
- Inherits:
- Object
- Defined in:
- lib/hexapdf/serializer.rb
Overview
Knows how to serialize Ruby objects for a PDF file.
For normal serialization purposes, the #serialize or #serialize_to_io methods should be used. However, if the type of the object to be serialized is known, a specialized serialization method like #serialize_float can be used.
Additionally, an object for encrypting strings and streams while serializing can be set via the #encrypter= method. The assigned object has to respond to #encrypt_string(str, ind_obj) (where the string is part of the indirect object; returns the encrypted string) and #encrypt_stream(stream) (returns a fiber that represents the encrypted stream).
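The interface described above can be sketched with a pass-through object. NullEncrypter is a hypothetical name that is not part of HexaPDF, and converting the stream argument with #to_s is an assumption made purely for illustration; a real encrypter would work with the stream object the serializer hands over.

```ruby
# Hypothetical pass-through encrypter; NOT part of HexaPDF.
class NullEncrypter
  # Returns the "encrypted" string. This sketch returns an unmodified copy;
  # the indirect object that contains the string is ignored.
  def encrypt_string(str, _ind_obj)
    str.dup
  end

  # Returns a fiber that represents the "encrypted" stream. Assumption: the
  # stream stand-in is simply converted with #to_s.
  def encrypt_stream(stream)
    Fiber.new { stream.to_s }
  end
end
```

An instance of such a class would then be assigned via #encrypter=.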
How This Class Works
The main public interface consists of the #serialize and #serialize_to_io methods, which accept an object and either return its serialized form or write it to an IO. While this object is being serialized, individual serialization methods can access it via the @object instance variable (useful if the object is a composed object).
Internally, the #__serialize method is used for invoking the correct serialization method based on the class of a given object. It is also used for serializing individual parts of a composed object.
Therefore the serializer contains one serialization method for each class it needs to serialize. The naming scheme of these methods is based on the class name: The full class name is converted to lowercase, the namespace separator ‘::’ is replaced with a single underscore and the string “serialize_” is then prepended.
Examples:
NilClass => serialize_nilclass
TrueClass => serialize_trueclass
HexaPDF::Object => serialize_hexapdf_object
If no serialization method for a specific class is found, the ancestor classes are tried.
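The naming scheme and the ancestor fallback can be illustrated with a small stand-alone sketch. TinySerializer, Demo::Point and Demo::Point3D are hypothetical names introduced only for this example; the code mirrors the described rules but is not HexaPDF's implementation.

```ruby
# Hypothetical namespaced classes used only for this illustration.
module Demo
  class Point; end
  class Point3D < Point; end
end

# Illustrative re-implementation of the naming scheme; NOT HexaPDF's code.
class TinySerializer
  # Lowercases the full class name, replaces '::' with '_' and prepends
  # 'serialize_'.
  def method_name_for(klass)
    "serialize_#{klass.name.to_s.downcase.gsub('::', '_')}"
  end

  # Walks the ancestor chain until a matching serialization method exists.
  # Returns nil if no ancestor has one.
  def lookup(klass)
    klass.ancestors.each do |ancestor|
      name = method_name_for(ancestor)
      return name if respond_to?(name, true)
    end
    nil
  end

  private

  def serialize_demo_point(_obj)
    "point"
  end
end

s = TinySerializer.new
s.method_name_for(NilClass)  # => "serialize_nilclass"
s.lookup(Demo::Point3D)      # => "serialize_demo_point" (found via the ancestor)
```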
See: PDF1.7 s7.3
Constant Summary
- NAME_SUBSTS =
The regexp matches all characters that need to be escaped and the substs hash contains the mapping from these characters to their escaped form.
See PDF1.7 s7.3.5
{}
- NAME_REGEXP =
/[^!-~&&[^##{Regexp.escape(Tokenizer::DELIMITER)}#{Regexp.escape(Tokenizer::WHITESPACE)}]]/
- NAME_CACHE =
Utils::LRUCache.new(1000)
- BYTE_IS_DELIMITER =
{40 => true, 47 => true, 60 => true, 91 => true, 41 => true, 62 => true, 93 => true}.freeze
- STRING_ESCAPE_MAP =
{"(" => "\\(", ")" => "\\)", "\\" => "\\\\", "\r" => "\\r"}.freeze
Instance Attribute Summary
-
#encrypter ⇒ Object
The encrypter to use for encrypting strings and streams.
Instance Method Summary
-
#initialize ⇒ Serializer
constructor
Creates a new Serializer object.
-
#serialize(obj) ⇒ Object
Returns the serialized form of the given object.
-
#serialize_array(obj) ⇒ Object
Serializes an Array object.
-
#serialize_date(obj) ⇒ Object
See: #serialize_time.
-
#serialize_datetime(obj) ⇒ Object
See: #serialize_time.
-
#serialize_falseclass(_obj) ⇒ Object
Serializes the false value.
-
#serialize_float(obj) ⇒ Object
Serializes a Float object.
-
#serialize_hash(obj) ⇒ Object
Serializes a Hash object (i.e. a PDF dictionary object).
-
#serialize_integer(obj) ⇒ Object
Serializes an Integer object.
-
#serialize_nilclass(_obj) ⇒ Object
Serializes the nil value.
-
#serialize_numeric(obj) ⇒ Object
Serializes a Numeric object (either Integer or Float).
-
#serialize_string(obj) ⇒ Object
Serializes a String object.
-
#serialize_symbol(obj) ⇒ Object
Serializes a Symbol object (i.e. a PDF name object).
-
#serialize_time(obj) ⇒ Object
Serializes a Time object as a PDF date string in a format suitable for both the ISO and the Adobe PDF specification.
-
#serialize_to_io(obj, io) ⇒ Object
Serializes the given object and writes it to the IO.
-
#serialize_trueclass(_obj) ⇒ Object
Serializes the true value.
Constructor Details
#initialize ⇒ Serializer
Creates a new Serializer object.
# File 'lib/hexapdf/serializer.rb', line 90
def initialize
  @dispatcher = {
    Hash => 'serialize_hash',
    Array => 'serialize_array',
    Symbol => 'serialize_symbol',
    String => 'serialize_string',
    Integer => 'serialize_integer',
    Float => 'serialize_float',
    Time => 'serialize_time',
    TrueClass => 'serialize_trueclass',
    FalseClass => 'serialize_falseclass',
    NilClass => 'serialize_nilclass',
    HexaPDF::Reference => 'serialize_hexapdf_reference',
    HexaPDF::Object => 'serialize_hexapdf_object',
    HexaPDF::Stream => 'serialize_hexapdf_stream',
    HexaPDF::Dictionary => 'serialize_hexapdf_object',
    HexaPDF::PDFArray => 'serialize_hexapdf_object',
    HexaPDF::Rectangle => 'serialize_hexapdf_object',
  }
  @dispatcher.default_proc = lambda do |h, klass|
    h[klass] = if klass <= HexaPDF::Stream
                 "serialize_hexapdf_stream"
               elsif klass <= HexaPDF::Object
                 "serialize_hexapdf_object"
               else
                 method = nil
                 klass.ancestors.each do |ancestor_klass|
                   name = ancestor_klass.name.to_s.downcase
                   name.gsub!(/::/, '_')
                   method = "serialize_#{name}"
                   break if respond_to?(method, true)
                 end
                 method
               end
  end
  @encrypter = false
  @io = nil
  @object = nil
  @in_object = false
end
Instance Attribute Details
#encrypter ⇒ Object
The encrypter to use for encrypting strings and streams. If nil, strings and streams are not encrypted.
Default: nil
# File 'lib/hexapdf/serializer.rb', line 87
def encrypter
  @encrypter
end
Instance Method Details
#serialize(obj) ⇒ Object
Returns the serialized form of the given object.
For developers: While the object is serialized, methods can use the instance variable @object to access it (useful if the object is a composed object).
# File 'lib/hexapdf/serializer.rb', line 135
def serialize(obj)
  @object = obj
  __serialize(obj)
ensure
  @object = nil
end
#serialize_array(obj) ⇒ Object
Serializes an Array object.
See: PDF1.7 s7.3.6
# File 'lib/hexapdf/serializer.rb', line 226
def serialize_array(obj)
  str = +"["
  index = 0
  while index < obj.size
    tmp = __serialize(obj[index])
    str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
      BYTE_IS_DELIMITER[str.getbyte(-1)]
    str << tmp
    index += 1
  end
  str << "]"
end
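The space-insertion rule used when joining array elements can be tried in isolation. The sketch below works on already serialized tokens; DEMO_DELIMITERS and join_tokens are hypothetical names, and the table simply copies the byte values of BYTE_IS_DELIMITER.

```ruby
# Byte values of "(", "/", "<", "[", ")", ">" and "]", copied from
# BYTE_IS_DELIMITER. The constant name is hypothetical.
DEMO_DELIMITERS = {40 => true, 47 => true, 60 => true, 91 => true,
                   41 => true, 62 => true, 93 => true}.freeze

# Joins already serialized tokens into a PDF array. A space is inserted only
# if neither the first byte of the next token nor the last byte written so
# far is a delimiter.
def join_tokens(tokens)
  str = +"["
  tokens.each do |tmp|
    str << " " unless DEMO_DELIMITERS[tmp.getbyte(0)] ||
      DEMO_DELIMITERS[str.getbyte(-1)]
    str << tmp
  end
  str << "]"
end

join_tokens(["1", "/Name", "(str)", "2", "3"])  # => "[1/Name(str)2 3]"
```

Only the two adjacent numbers need a separating space; every other boundary already has a delimiter on one side.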
#serialize_date(obj) ⇒ Object
See: #serialize_time
# File 'lib/hexapdf/serializer.rb', line 291
def serialize_date(obj)
  serialize_time(obj.to_time)
end
#serialize_datetime(obj) ⇒ Object
See: #serialize_time
# File 'lib/hexapdf/serializer.rb', line 296
def serialize_datetime(obj)
  serialize_time(obj.to_time)
end
#serialize_falseclass(_obj) ⇒ Object
Serializes the false value.
See: PDF1.7 s7.3.2
# File 'lib/hexapdf/serializer.rb', line 169
def serialize_falseclass(_obj)
  "false"
end
#serialize_float(obj) ⇒ Object
Serializes a Float object.
See: PDF1.7 s7.3.3
# File 'lib/hexapdf/serializer.rb', line 193
def serialize_float(obj)
  -0.0001 < obj && obj < 0.0001 && obj != 0 ? sprintf("%.6f", obj) : obj.round(6).to_s
end
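The two branches exist because Ruby prints very small floats in exponent notation (for example, 0.00001.to_s yields "1.0e-05"), which is not valid PDF number syntax. A stand-alone sketch of the same logic, with format_float as a hypothetical name:

```ruby
# Same formatting logic as #serialize_float, extracted into a hypothetical
# stand-alone method for experimentation.
def format_float(value)
  if -0.0001 < value && value < 0.0001 && value != 0
    # Tiny non-zero values: force fixed-point output with six decimals.
    sprintf("%.6f", value)
  else
    # Everything else: round to six decimals and use the default output.
    value.round(6).to_s
  end
end

format_float(1.5)         # => "1.5"
format_float(1.23456789)  # => "1.234568"
format_float(0.00001)     # => "0.000010"
format_float(0.0)         # => "0.0"
```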
#serialize_hash(obj) ⇒ Object
Serializes a Hash object (i.e. a PDF dictionary object).
See: PDF1.7 s7.3.7
# File 'lib/hexapdf/serializer.rb', line 242
def serialize_hash(obj)
  str = +"<<"
  obj.each do |k, v|
    next if v.nil? || (v.respond_to?(:null?) && v.null?)
    str << serialize_symbol(k)
    tmp = __serialize(v)
    str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
      BYTE_IS_DELIMITER[str.getbyte(-1)]
    str << tmp
  end
  str << ">>"
end
#serialize_integer(obj) ⇒ Object
Serializes an Integer object.
See: PDF1.7 s7.3.3
# File 'lib/hexapdf/serializer.rb', line 186
def serialize_integer(obj)
  obj.to_s
end
#serialize_nilclass(_obj) ⇒ Object
Serializes the nil value.
See: PDF1.7 s7.3.9
# File 'lib/hexapdf/serializer.rb', line 155
def serialize_nilclass(_obj)
  "null"
end
#serialize_numeric(obj) ⇒ Object
Serializes a Numeric object (either Integer or Float).
This method should be used for cases where it is known that the object is either an Integer or a Float.
See: PDF1.7 s7.3.3
# File 'lib/hexapdf/serializer.rb', line 179
def serialize_numeric(obj)
  obj.kind_of?(Integer) ? obj.to_s : serialize_float(obj)
end
#serialize_string(obj) ⇒ Object
Serializes a String object.
See: PDF1.7 s7.3.4
# File 'lib/hexapdf/serializer.rb', line 260
def serialize_string(obj)
  obj = if @encrypter && @object.kind_of?(HexaPDF::Object) && @object.indirect?
          encrypter.encrypt_string(obj, @object)
        elsif obj.encoding != Encoding::BINARY
          if obj.match?(/[^ -~\t\r\n]/)
            "\xFE\xFF".b << obj.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
          else
            obj.b
          end
        else
          obj.dup
        end
  obj.gsub!(/[()\\\r]/n, STRING_ESCAPE_MAP)
  "(#{obj})"
end
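The unencrypted path of this method can be reproduced without HexaPDF. In the sketch below, pdf_literal_string and DEMO_ESCAPES are hypothetical names; the escape map copies STRING_ESCAPE_MAP, and the encryption branch is left out on purpose.

```ruby
# Copy of STRING_ESCAPE_MAP under a hypothetical name.
DEMO_ESCAPES = {"(" => "\\(", ")" => "\\)", "\\" => "\\\\", "\r" => "\\r"}.freeze

# Builds a PDF literal string (without encryption):
# * binary strings are used as-is,
# * strings with characters outside printable ASCII, tab, CR and LF are
#   converted to UTF-16BE with a leading byte order mark,
# * all other strings are just re-tagged as binary.
def pdf_literal_string(str)
  bytes = if str.encoding == Encoding::BINARY
            str.dup
          elsif str.match?(/[^ -~\t\r\n]/)
            "\xFE\xFF".b << str.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
          else
            str.b
          end
  # Escape parentheses, backslashes and carriage returns.
  bytes.gsub!(/[()\\\r]/n, DEMO_ESCAPES)
  "(#{bytes})"
end

pdf_literal_string("a(b)")          # => "(a\\(b\\))"
pdf_literal_string("\u00e9").bytes  # => [40, 254, 255, 0, 233, 41]
```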
#serialize_symbol(obj) ⇒ Object
Serializes a Symbol object (i.e. a PDF name object).
See: PDF1.7 s7.3.5
# File 'lib/hexapdf/serializer.rb', line 211
def serialize_symbol(obj)
  NAME_CACHE[obj] ||=
    begin
      str = obj.to_s.dup.force_encoding(Encoding::BINARY)
      str.gsub!(NAME_REGEXP, NAME_SUBSTS)
      str.empty? ? "/ " : "/#{str}"
    end
end
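Since the contents of NAME_SUBSTS are not shown on this page, the following sketch only approximates the escaping based on the PDF name syntax, where a character that cannot appear literally is written as "#" followed by its two-digit hexadecimal code. The method name pdf_name, the character class, and the use of uppercase hex digits are assumptions and may differ from what HexaPDF emits.

```ruby
# Approximation of PDF name escaping; NOT HexaPDF's actual regexp or
# substitution table.
def pdf_name(sym)
  str = sym.to_s.b
  # Assumption: escape every byte outside "!".."~" as well as the PDF
  # delimiter characters and the number sign itself.
  str.gsub!(/[^!-~]|[()<>\[\]{}\/%#]/n) { |c| format("#%02X", c.ord) }
  # An empty name is written as a slash followed by a space, as in
  # #serialize_symbol.
  str.empty? ? "/ " : "/#{str}"
end

pdf_name(:Type)    # => "/Type"
pdf_name(:"A B")   # => "/A#20B"
```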
#serialize_time(obj) ⇒ Object
Serializes a Time object. The ISO PDF specification differs from Adobe's PDF specification with respect to the supported date format. When converting to a date string, a format suitable for both is output.
See: PDF1.7 s7.9.4, ADB1.7 3.8.3
# File 'lib/hexapdf/serializer.rb', line 280
def serialize_time(obj)
  zone = obj.strftime("%z'")
  if zone == "+0000'"
    zone = ''
  else
    zone[3, 0] = "'"
  end
  serialize_string(obj.strftime("D:%Y%m%d%H%M%S#{zone}"))
end
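The time zone handling is easiest to follow with concrete values. This sketch repeats the logic without the final #serialize_string call, so the result lacks the surrounding parentheses; pdf_date is a hypothetical name.

```ruby
# Builds the PDF date string for a Time object (hypothetical helper).
def pdf_date(time)
  zone = time.strftime("%z'")   # e.g. "+0200'"
  if zone == "+0000'"
    zone = ""                   # UTC: the offset is omitted entirely
  else
    zone[3, 0] = "'"            # insert an apostrophe after the hours: "+02'00'"
  end
  time.strftime("D:%Y%m%d%H%M%S") + zone
end

pdf_date(Time.new(2024, 3, 5, 14, 30, 0, "+02:00"))  # => "D:20240305143000+02'00'"
pdf_date(Time.utc(2024, 3, 5, 14, 30, 0))            # => "D:20240305143000"
```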
#serialize_to_io(obj, io) ⇒ Object
Serializes the given object and writes it to the IO.
Also see: #serialize
# File 'lib/hexapdf/serializer.rb', line 145
def serialize_to_io(obj, io)
  @io = io
  @io << serialize(obj).freeze
ensure
  @io = nil
end
#serialize_trueclass(_obj) ⇒ Object
Serializes the true value.
See: PDF1.7 s7.3.2
# File 'lib/hexapdf/serializer.rb', line 162
def serialize_trueclass(_obj)
  "true"
end