Class: HexaPDF::Font::CMap
- Inherits: Object
  - Object
    - HexaPDF::Font::CMap
- Defined in: lib/hexapdf/font/cmap.rb, lib/hexapdf/font/cmap/parser.rb, lib/hexapdf/font/cmap/writer.rb
Overview
Represents a CMap, a mapping from character codes to CIDs (character IDs) or to their Unicode value.
See: PDF1.7 s9.7.5, s9.10.3; Adobe Technical Notes #5014 and #5411
Instance Attribute Summary
- #name ⇒ Object
  The name of the CMap.
- #ordering ⇒ Object
  The ordering part of the CMap version.
- #registry ⇒ Object
  The registry part of the CMap version.
- #supplement ⇒ Object
  The supplement part of the CMap version.
- #wmode ⇒ Object
  The writing mode of the CMap: 0 for horizontal, 1 for vertical writing.
Class Method Summary
- .create_to_unicode_cmap(mapping) ⇒ Object
  Returns a string containing a ToUnicode CMap that represents the given code to Unicode codepoint mapping.
- .for_name(name) ⇒ Object
  Creates a new CMap object by parsing a predefined CMap with the given name.
- .parse(string) ⇒ Object
  Creates a new CMap object from the given string which needs to contain a valid CMap file.
- .predefined?(name) ⇒ Boolean
  Returns true if the given name specifies a predefined CMap.
Instance Method Summary
- #add_cid_mapping(code, cid) ⇒ Object
  Adds an individual mapping from character code to CID.
- #add_cid_range(start_code, end_code, start_cid) ⇒ Object
  Adds a CID range, mapping character codes from start_code to end_code to CIDs starting with start_cid.
- #add_codespace_range(first, *rest) ⇒ Object
  Add a codespace range using an array of ranges for the individual bytes.
- #add_unicode_mapping(code, string) ⇒ Object
  Adds a mapping from character code to Unicode string in UTF-8 encoding.
- #initialize ⇒ CMap (constructor)
  Creates a new CMap object.
- #read_codes(string) ⇒ Object
  Parses the string and returns all character codes.
- #to_cid(code) ⇒ Object
  Returns the CID for the given character code, or 0 if no mapping was found.
- #to_unicode(code) ⇒ Object
  Returns the Unicode string in UTF-8 encoding for the given character code, or nil if no mapping was found.
- #use_cmap(cmap) ⇒ Object
  Add all mappings from the given CMap to this CMap.
Constructor Details
#initialize ⇒ CMap
Creates a new CMap object.
    # File 'lib/hexapdf/font/cmap.rb', line 107
    def initialize
      @codespace_ranges = []
      @cid_mapping = {}
      @cid_range_mappings = []
      @unicode_mapping = {}
    end
Instance Attribute Details
#name ⇒ Object
The name of the CMap.
    # File 'lib/hexapdf/font/cmap.rb', line 98
    def name
      @name
    end
#ordering ⇒ Object
The ordering part of the CMap version.
    # File 'lib/hexapdf/font/cmap.rb', line 92
    def ordering
      @ordering
    end
#registry ⇒ Object
The registry part of the CMap version.
    # File 'lib/hexapdf/font/cmap.rb', line 89
    def registry
      @registry
    end
#supplement ⇒ Object
The supplement part of the CMap version.
    # File 'lib/hexapdf/font/cmap.rb', line 95
    def supplement
      @supplement
    end
#wmode ⇒ Object
The writing mode of the CMap: 0 for horizontal, 1 for vertical writing.
    # File 'lib/hexapdf/font/cmap.rb', line 101
    def wmode
      @wmode
    end
Class Method Details
.create_to_unicode_cmap(mapping) ⇒ Object
Returns a string containing a ToUnicode CMap that represents the given code to Unicode codepoint mapping.
See: Writer#create_to_unicode_cmap
    # File 'lib/hexapdf/font/cmap.rb', line 84
    def self.create_to_unicode_cmap(mapping)
      Writer.new.create_to_unicode_cmap(mapping)
    end
.for_name(name) ⇒ Object
Creates a new CMap object by parsing a predefined CMap with the given name.
Raises an error if the given CMap is not found.
    # File 'lib/hexapdf/font/cmap.rb', line 64
    def self.for_name(name)
      return @cmap_cache[name] if @cmap_cache.key?(name)

      file = File.join(CMAP_DIR, name)
      if File.exist?(file)
        @cmap_cache[name] = parse(File.read(file, encoding: ::Encoding::UTF_8))
      else
        raise HexaPDF::Error, "No CMap named '#{name}' found"
      end
    end
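The lookup-then-cache pattern used by .for_name and .predefined? can be tried in isolation. The following is a minimal standalone sketch, not HexaPDF code: the class CachedLoaderSketch, its NotFound error, the temporary directory and the file name "Sketch-H" are all hypothetical, and the sketch caches the raw file content instead of a parsed CMap.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the class-level cache shown in the listing above.
class CachedLoaderSketch
  class NotFound < StandardError; end

  def initialize(dir)
    @dir = dir      # plays the role of CMAP_DIR
    @cache = {}     # plays the role of @cmap_cache
  end

  # True if a file with the given name exists in the directory.
  def predefined?(name)
    File.exist?(File.join(@dir, name))
  end

  # Returns the cached entry, or reads the file once and caches the result.
  def for_name(name)
    return @cache[name] if @cache.key?(name)

    file = File.join(@dir, name)
    raise NotFound, "No CMap named '#{name}' found" unless File.exist?(file)
    @cache[name] = File.read(file, encoding: ::Encoding::UTF_8)
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "Sketch-H"), "placeholder content")
  loader = CachedLoaderSketch.new(dir)

  first = loader.for_name("Sketch-H")
  second = loader.for_name("Sketch-H")
  puts first.equal?(second)           # the second call is served from the cache
  puts loader.predefined?("Missing")  # no such file in the directory
end
```

Because the cache is keyed by name, repeated lookups of the same name return the same object, so callers that mutate the result would affect every later lookup.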
.parse(string) ⇒ Object
Creates a new CMap object from the given string which needs to contain a valid CMap file.
    # File 'lib/hexapdf/font/cmap.rb', line 76
    def self.parse(string)
      Parser.new.parse(string)
    end
.predefined?(name) ⇒ Boolean
Returns true if the given name specifies a predefined CMap.
    # File 'lib/hexapdf/font/cmap.rb', line 57
    def self.predefined?(name)
      File.exist?(File.join(CMAP_DIR, name))
    end
Instance Method Details
#add_cid_mapping(code, cid) ⇒ Object
Adds an individual mapping from character code to CID.
    # File 'lib/hexapdf/font/cmap.rb', line 167
    def add_cid_mapping(code, cid)
      @cid_mapping[code] = cid
    end
#add_cid_range(start_code, end_code, start_cid) ⇒ Object
Adds a CID range, mapping character codes from start_code to end_code to CIDs starting with start_cid.
    # File 'lib/hexapdf/font/cmap.rb', line 173
    def add_cid_range(start_code, end_code, start_cid)
      @cid_range_mappings << [start_code..end_code, start_cid]
    end
#add_codespace_range(first, *rest) ⇒ Object
Add a codespace range using an array of ranges for the individual bytes.
This means that the first range is checked against the first byte, the second range against the second byte and so on.
    # File 'lib/hexapdf/font/cmap.rb', line 126
    def add_codespace_range(first, *rest)
      @codespace_ranges << [first, rest]
    end
#add_unicode_mapping(code, string) ⇒ Object
Adds a mapping from character code to Unicode string in UTF-8 encoding.
    # File 'lib/hexapdf/font/cmap.rb', line 192
    def add_unicode_mapping(code, string)
      @unicode_mapping[code] = string
    end
#read_codes(string) ⇒ Object
Parses the string and returns all character codes.
An error is raised if the string contains invalid bytes.
    # File 'lib/hexapdf/font/cmap.rb', line 133
    def read_codes(string)
      codes = []
      bytes = string.each_byte

      loop do
        byte = bytes.next
        code = 0

        found = @codespace_ranges.any? do |first_byte_range, rest_ranges|
          next unless first_byte_range.cover?(byte)

          code = (code << 8) + byte
          valid = rest_ranges.all? do |range|
            begin
              byte = bytes.next
            rescue StopIteration
              raise HexaPDF::Error, "Missing bytes while reading codes via CMap"
            end
            code = (code << 8) + byte
            range.cover?(byte)
          end

          codes << code if valid
        end

        unless found
          raise HexaPDF::Error, "Invalid byte while reading codes via CMap: #{byte}"
        end
      end

      codes
    end
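To see how codespace ranges drive the splitting of a byte string into codes, the logic can be run outside the library. This is a standalone sketch that copies the two relevant methods from the listings above into a hypothetical class CodeReaderSketch; it raises ArgumentError where the original raises HexaPDF::Error, and the codespace ranges used are illustrative values only.

```ruby
# Hypothetical standalone copy of the codespace logic, for experimentation only.
class CodeReaderSketch
  def initialize
    @codespace_ranges = []
  end

  # first is the range for the first byte, rest holds one range per following byte.
  def add_codespace_range(first, *rest)
    @codespace_ranges << [first, rest]
  end

  def read_codes(string)
    codes = []
    bytes = string.each_byte
    loop do                      # ends when bytes.next raises StopIteration here
      byte = bytes.next
      code = 0
      found = @codespace_ranges.any? do |first_byte_range, rest_ranges|
        next unless first_byte_range.cover?(byte)
        code = (code << 8) + byte
        valid = rest_ranges.all? do |range|
          begin
            byte = bytes.next
          rescue StopIteration   # string ended in the middle of a multi-byte code
            raise ArgumentError, "Missing bytes while reading codes"
          end
          code = (code << 8) + byte
          range.cover?(byte)
        end
        codes << code if valid
      end
      raise ArgumentError, "Invalid byte while reading codes: #{byte}" unless found
    end
    codes
  end
end

reader = CodeReaderSketch.new
reader.add_codespace_range(0x00..0x80)              # one-byte codes
reader.add_codespace_range(0x81..0x9F, 0x40..0xFC)  # two-byte codes

p reader.read_codes("\x41\x81\x40".b)  # prints [65, 33088], i.e. 0x41 and 0x8140
```

The first byte selects the codespace range, which in turn fixes how many further bytes belong to the code. A string that stops in the middle of a two-byte code, such as "\x81".b, raises an error instead of returning a partial code.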
#to_cid(code) ⇒ Object
Returns the CID for the given character code, or 0 if no mapping was found.
    # File 'lib/hexapdf/font/cmap.rb', line 178
    def to_cid(code)
      cid = @cid_mapping.fetch(code, -1)
      if cid == -1
        @cid_range_mappings.reverse_each do |range, start_cid|
          if range.cover?(code)
            cid = start_cid + code - range.first
            break
          end
        end
      end
      (cid == -1 ? 0 : cid)
    end
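The lookup order implied by the listing (individual mappings first, then ranges from the most recently added to the oldest) is easiest to see with concrete numbers. The sketch below copies the three CID methods into a hypothetical standalone class CidLookupSketch; the codes and CIDs are made-up values.

```ruby
# Hypothetical standalone copy of the CID lookup logic, for experimentation only.
class CidLookupSketch
  def initialize
    @cid_mapping = {}
    @cid_range_mappings = []
  end

  def add_cid_mapping(code, cid)
    @cid_mapping[code] = cid
  end

  def add_cid_range(start_code, end_code, start_cid)
    @cid_range_mappings << [start_code..end_code, start_cid]
  end

  def to_cid(code)
    cid = @cid_mapping.fetch(code, -1)
    if cid == -1
      # Later ranges are checked first, so they take precedence over earlier ones.
      @cid_range_mappings.reverse_each do |range, start_cid|
        if range.cover?(code)
          cid = start_cid + code - range.first
          break
        end
      end
    end
    (cid == -1 ? 0 : cid)
  end
end

cmap = CidLookupSketch.new
cmap.add_cid_range(0x20, 0x7E, 1)    # codes 0x20..0x7E map to CIDs 1..95
cmap.add_cid_range(0x41, 0x5A, 500)  # added later, so it wins for 0x41..0x5A
cmap.add_cid_mapping(0x41, 9000)     # individual mappings win over all ranges

p cmap.to_cid(0x61)  # prints 66   (1 + 0x61 - 0x20)
p cmap.to_cid(0x42)  # prints 501  (500 + 0x42 - 0x41)
p cmap.to_cid(0x41)  # prints 9000
p cmap.to_cid(0xFF)  # prints 0    (no mapping found)
```

Note that a return value of 0 is ambiguous in this scheme: it is returned both when nothing matches and when a code is explicitly mapped to CID 0.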
#to_unicode(code) ⇒ Object
Returns the Unicode string in UTF-8 encoding for the given character code, or nil if no mapping was found.
    # File 'lib/hexapdf/font/cmap.rb', line 198
    def to_unicode(code)
      unicode_mapping[code]
    end
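Since the mapped value is a string rather than a single codepoint, one code can stand for several characters, for example a ligature glyph. A minimal standalone sketch follows, using a hypothetical class UnicodeLookupSketch and made-up codes:

```ruby
# Hypothetical standalone copy of the Unicode mapping methods.
class UnicodeLookupSketch
  def initialize
    @unicode_mapping = {}
  end

  def add_unicode_mapping(code, string)
    @unicode_mapping[code] = string
  end

  def to_unicode(code)
    @unicode_mapping[code]
  end
end

cmap = UnicodeLookupSketch.new
cmap.add_unicode_mapping(0x41, "A")
cmap.add_unicode_mapping(0x0C, "fi")  # one code mapped to two characters

p cmap.to_unicode(0x0C)  # prints "fi"
p cmap.to_unicode(0x99)  # prints nil, so callers need a fallback
```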
#use_cmap(cmap) ⇒ Object
Add all mappings from the given CMap to this CMap.
    # File 'lib/hexapdf/font/cmap.rb', line 115
    def use_cmap(cmap)
      @codespace_ranges.concat(cmap.codespace_ranges)
      @cid_mapping.merge!(cmap.cid_mapping)
      @cid_range_mappings.concat(cmap.cid_range_mappings)
      @unicode_mapping.merge!(cmap.unicode_mapping)
    end
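Because the hashes are combined with merge!, entries from the CMap passed to use_cmap replace existing entries that share the same code. The sketch below illustrates this with a hypothetical standalone class MergeSketch that keeps only the two hash mappings; the protected readers and the simplified to_cid are assumptions made for the sketch.

```ruby
# Hypothetical standalone class showing the merge behaviour of use_cmap.
class MergeSketch
  def initialize
    @cid_mapping = {}
    @unicode_mapping = {}
  end

  def add_cid_mapping(code, cid)
    @cid_mapping[code] = cid
  end

  def add_unicode_mapping(code, string)
    @unicode_mapping[code] = string
  end

  # Entries of the other CMap overwrite entries with the same code.
  def use_cmap(cmap)
    @cid_mapping.merge!(cmap.cid_mapping)
    @unicode_mapping.merge!(cmap.unicode_mapping)
  end

  # Simplified lookup without range support.
  def to_cid(code)
    @cid_mapping.fetch(code, 0)
  end

  def to_unicode(code)
    @unicode_mapping[code]
  end

  # Readers are protected so that only other instances can access them.
  attr_reader :cid_mapping, :unicode_mapping
  protected :cid_mapping, :unicode_mapping
end

base = MergeSketch.new
base.add_cid_mapping(0x41, 1)
base.add_unicode_mapping(0x41, "A")

derived = MergeSketch.new
derived.add_cid_mapping(0x41, 99)
derived.add_cid_mapping(0x42, 2)
derived.use_cmap(base)

p derived.to_cid(0x41)      # prints 1, the value from base replaced 99
p derived.to_cid(0x42)      # prints 2, untouched by the merge
p derived.to_unicode(0x41)  # prints "A"
```

With this behaviour, mappings that should override the used CMap need to be added after the use_cmap call rather than before it.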