Method: PDF::Reader::Encoding#to_utf8
Defined in: lib/pdf/reader/encoding.rb
#to_utf8(str) ⇒ Object
convert the specified string to UTF-8
- unpack raw bytes into codepoints
- replace any that have entries in the differences table with a glyph name
- convert codepoints from source encoding to Unicode codepoints
- convert any glyph names to Unicode codepoints
- replace characters that didn’t convert to Unicode nicely with something valid
- pack the final array of Unicode codepoints into a UTF-8 string
- mark the string as UTF-8 if we’re running on an M17N-aware VM
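The steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the library's implementation: `sketch_to_utf8`, `MAPPING`, `GLYPHS`, and `REPLACEMENT` are hypothetical names, and the two lookup tables hold made-up single entries standing in for a real encoding table and glyph list.

```ruby
# Hypothetical, simplified illustration of the documented steps.
# None of these names come from PDF::Reader; the tables are invented.
MAPPING     = { 0x80 => 0x20AC }.freeze   # source byte => Unicode codepoint (assumed)
GLYPHS      = { bullet: 0x2022 }.freeze   # glyph name  => Unicode codepoint (assumed)
REPLACEMENT = 0x25AF                      # placeholder codepoint for failures (assumed)

def sketch_to_utf8(str, differences = {})
  str.unpack("C*")                                            # raw bytes -> codepoints
     .map { |c| differences.fetch(c, c) }                     # differences table -> glyph name
     .map { |c| c.is_a?(Integer) ? MAPPING.fetch(c, c) : c }  # source encoding -> Unicode
     .map { |c| c.is_a?(Symbol) ? GLYPHS[c] : c }             # glyph names -> Unicode (nil if unknown)
     .map { |c| c.is_a?(Integer) ? c : REPLACEMENT }          # anything unconverted -> placeholder
     .pack("U*")                                              # codepoints -> UTF-8 bytes
     .force_encoding(Encoding::UTF_8)                         # mark the result as UTF-8
end
```

For example, `sketch_to_utf8("A\x80".b, { 0x41 => :bullet })` maps the first byte through the differences table to a glyph name and the second through the encoding table, so both end up as Unicode codepoints before packing.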
# File 'lib/pdf/reader/encoding.rb', line 103
def to_utf8(str)
  if utf8_conversion_impossible?
    little_boxes(str.unpack(unpack).size)
  else
    convert_to_utf8(str)
  end
end
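When conversion is impossible, the method only needs to know how many characters the input holds, which is why it unpacks the string and takes the size. A rough sketch of that fallback follows; `sketch_little_boxes`, `sketch_fallback`, and `PLACEHOLDER` are hypothetical stand-ins, and the placeholder codepoint is an assumption rather than something this page confirms.

```ruby
# Hypothetical stand-ins; not PDF::Reader's own little_boxes.
PLACEHOLDER = 0x25AF  # assumed placeholder codepoint

# Build a UTF-8 string made of `count` placeholder characters.
def sketch_little_boxes(count)
  ([PLACEHOLDER] * count).pack("U*")
end

# Count characters by unpacking with the given directive,
# then emit one placeholder per character.
def sketch_fallback(str, directive)
  sketch_little_boxes(str.unpack(directive).size)
end
```

The unpack directive decides the count: with `"C*"` every byte is one character, while `"n*"` treats each pair of bytes as one, so the same four input bytes produce four or two placeholders respectively.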