Method: UnicodeUtils.compatibility_decomposition
Defined in: lib/unicode_utils/compatibility_decomposition.rb
.compatibility_decomposition(str) ⇒ Object
Get the compatibility decomposition of the given string, also called Normalization Form KD, or NFKD for short.
Compatibility decomposition decomposes more code points than canonical decomposition. Unlike Normalization Forms D and C, this normalization can alter how a string is displayed.
Example:
require "unicode_utils/compatibility_decomposition"
# LATIN SMALL LIGATURE FI => LATIN SMALL LETTER F, LATIN SMALL LETTER I
UnicodeUtils.compatibility_decomposition("\uFB01") => "fi"
See also: UnicodeUtils.nfkd
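To make the contrast with canonical decomposition concrete, here is a minimal sketch. It does not call UnicodeUtils; it assumes Ruby's built-in String#unicode_normalize (Ruby 2.2+) produces the same NFD and NFKD results for these inputs. The helper name decomposition_forms is hypothetical and exists only for illustration.

```ruby
# Illustration only: uses Ruby's built-in String#unicode_normalize,
# not UnicodeUtils. `decomposition_forms` is a hypothetical helper.
def decomposition_forms(str)
  {
    nfd:  str.unicode_normalize(:nfd),  # canonical decomposition
    nfkd: str.unicode_normalize(:nfkd)  # compatibility decomposition
  }
end

# Format a string's code points as U+XXXX for easy comparison.
def code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

ligature = decomposition_forms("\uFB01") # LATIN SMALL LIGATURE FI
p code_points(ligature[:nfd])   # the ligature has no canonical decomposition
p code_points(ligature[:nfkd])  # the compatibility mapping splits it into f, i

accented = decomposition_forms("\u00E9") # LATIN SMALL LETTER E WITH ACUTE
p code_points(accented[:nfd])   # e followed by COMBINING ACUTE ACCENT
p code_points(accented[:nfkd])  # same result: NFKD includes canonical mappings
```

The ligature is left untouched by NFD but split by NFKD, which is why NFKD can change how a string is displayed. The accented letter decomposes identically under both forms.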
# File 'lib/unicode_utils/compatibility_decomposition.rb', line 26
def compatibility_decomposition(str)
  res = String.new.force_encoding(str.encoding)
  str.each_codepoint { |cp|
    if cp >= 0xAC00 && cp <= 0xD7A3 # hangul syllable
      Impl.append_hangul_syllable_decomposition(res, cp)
    else
      Impl.append_recursive_compatibility_decomposition_mapping(res, cp)
    end
  }
  Impl.put_into_canonical_order(res)
end
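The source handles Hangul syllables (U+AC00 to U+D7A3) in a separate branch because their decomposition is computed arithmetically rather than looked up in a mapping table. The sketch below shows that arithmetic using the constants from the Unicode Standard's Hangul syllable decomposition algorithm. It is not the body of Impl.append_hangul_syllable_decomposition, which is not shown here; the module HangulSketch and the method syllable_to_jamo are hypothetical names.

```ruby
# Hypothetical sketch of arithmetic Hangul syllable decomposition.
# Constants follow the Unicode Standard; this is not UnicodeUtils' Impl code.
module HangulSketch
  S_BASE  = 0xAC00            # first precomposed Hangul syllable
  S_LAST  = 0xD7A3            # last precomposed Hangul syllable
  L_BASE  = 0x1100            # first leading consonant jamo
  V_BASE  = 0x1161            # first vowel jamo
  T_BASE  = 0x11A7            # one before the first trailing consonant jamo
  V_COUNT = 21
  T_COUNT = 28
  N_COUNT = V_COUNT * T_COUNT # 588 syllables per leading consonant

  module_function

  # Returns the jamo code points (2 or 3) for a precomposed syllable.
  def syllable_to_jamo(cp)
    unless cp.between?(S_BASE, S_LAST)
      raise ArgumentError, "not a Hangul syllable: U+#{cp.to_s(16).upcase}"
    end

    s_index = cp - S_BASE
    jamo = [
      L_BASE + s_index / N_COUNT,             # leading consonant
      V_BASE + (s_index % N_COUNT) / T_COUNT  # vowel
    ]
    t_index = s_index % T_COUNT
    jamo << T_BASE + t_index unless t_index.zero? # trailing consonant, if any
    jamo
  end
end

p HangulSketch.syllable_to_jamo(0xD55C).map { |cp| format("U+%04X", cp) }
```

Computing the jamo this way avoids storing a table entry for each of the 11,172 precomposed syllables.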