Module: Stringex::Unidecoder
- Defined in:
- lib/stringex/unidecoder.rb
Constant Summary
- CODEPOINTS =
Contains Unicode codepoints, loaded on demand from YAML files.
    Hash.new { |h, k| h[k] = ::YAML.load_file(File.join(File.expand_path(File.dirname(__FILE__)), "unidecoder_data", "#{k}.yml")) }
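The lazy-loading idea behind CODEPOINTS can be sketched in isolation. The following is a minimal illustration, not Stringex's own code: the temporary directory, the `x00.yml` contents, and the names `DATA_DIR`, `LOAD_COUNT`, and `LAZY_TABLE` are all hypothetical. It shows how a Hash default block reads a YAML file on the first lookup of a key and serves later lookups from memory.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical data directory standing in for "unidecoder_data".
DATA_DIR = Dir.mktmpdir("unidecoder_sketch")
File.write(File.join(DATA_DIR, "x00.yml"), ["a", "b", "c"].to_yaml)

# Counts file reads per key so the caching behavior is observable.
LOAD_COUNT = Hash.new(0)

# The default block runs only when a key is missing; storing the result
# in the hash means each file is parsed at most once.
LAZY_TABLE = Hash.new do |hash, key|
  LOAD_COUNT[key] += 1
  hash[key] = YAML.load_file(File.join(DATA_DIR, "#{key}.yml"))
end

first  = LAZY_TABLE["x00"] # triggers the file read
second = LAZY_TABLE["x00"] # served from the hash, no second read
```

Storing the parsed file back into the hash is what keeps repeated lookups cheap; without the assignment the block would re-read the file on every access.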
Class Method Summary
- .decode(string) ⇒ Object
  Returns string with its UTF-8 characters transliterated to ASCII ones.
- .encode(codepoint) ⇒ Object
  Returns character for the given Unicode codepoint.
- .get_codepoint(character) ⇒ Object
  Returns Unicode codepoint for the given character.
- .in_yaml_file(character) ⇒ Object
  Returns string indicating which file (and line) contains the transliteration value for the character.
Class Method Details
.decode(string) ⇒ Object
Returns string with its UTF-8 characters transliterated to ASCII ones.
In most cases you're better off using the String#to_ascii method that Stringex adds.
    # File 'lib/stringex/unidecoder.rb', line 16
    def decode(string)
      # Only non-ASCII characters are passed to the block
      string.gsub(/[^\x00-\x7f]/u) do |codepoint|
        if localized = translate(codepoint)
          localized
        else
          begin
            unpacked = codepoint.unpack("U")[0]
            CODEPOINTS[code_group(unpacked)][grouped_point(unpacked)]
          rescue
            # Hopefully this won't come up much
            # TODO: Make this note something to the user that is reportable to me perhaps
            "?"
          end
        end
      end
    end
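The table-lookup flow of decode can be illustrated with a self-contained sketch. It is not the library's implementation: the helpers `sketch_code_group` and `sketch_grouped_point` are assumptions about how a codepoint might be split into a file name and an index (their real definitions are not shown on this page), `SAMPLE_TABLE` is a tiny hypothetical stand-in for the YAML data, and the localization step (`translate`) is omitted.

```ruby
# Hypothetical stand-in for the YAML-backed table: group name => { index => ASCII }.
SAMPLE_TABLE = {
  "x00" => { 0xE9 => "e", 0xFC => "u" }
}.freeze

# Assumed grouping: the high byte of the codepoint names the group...
def sketch_code_group(codepoint)
  format("x%02x", codepoint >> 8)
end

# ...and the low byte is the position inside that group.
def sketch_grouped_point(codepoint)
  codepoint & 0xFF
end

def sketch_decode(string)
  # ASCII characters are left alone; everything else goes through the table.
  string.gsub(/[^\x00-\x7f]/u) do |char|
    codepoint = char.unpack("U")[0]
    group = sketch_code_group(codepoint)
    # Fall back to "?" when the table has no entry, as decode does on error.
    SAMPLE_TABLE.dig(group, sketch_grouped_point(codepoint)) || "?"
  end
end

ascii   = sketch_decode("caf\u00e9") # known character
unknown = sketch_decode("\u4e2d")    # character missing from the sample table
```

With the real library installed, the equivalent call would be `Stringex::Unidecoder.decode(string)` or, as noted above, `String#to_ascii`.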
.encode(codepoint) ⇒ Object
Returns character for the given Unicode codepoint.
    # File 'lib/stringex/unidecoder.rb', line 34
    def encode(codepoint)
      ["0x#{codepoint}".to_i(16)].pack("U")
    end
.get_codepoint(character) ⇒ Object
Returns Unicode codepoint for the given character.
    # File 'lib/stringex/unidecoder.rb', line 39
    def get_codepoint(character)
      "%04x" % character.unpack("U")[0]
    end
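encode and get_codepoint are inverses of each other: one turns a hexadecimal codepoint string into a character, the other turns a character into a zero-padded hexadecimal string. The sketch below copies the two method bodies into standalone functions (the `sketch_` names are hypothetical) so the round trip can be tried without the gem.

```ruby
# Standalone copy of the encode body: hex string -> character.
def sketch_encode(codepoint)
  ["0x#{codepoint}".to_i(16)].pack("U")
end

# Standalone copy of the get_codepoint body: character -> 4-digit hex string.
def sketch_get_codepoint(character)
  "%04x" % character.unpack("U")[0]
end

hex  = sketch_get_codepoint("\u00e9") # codepoint of "é" as lowercase hex
char = sketch_encode(hex)             # back to the original character
```

Note that the codepoint is exchanged as a hex string without a prefix; `sketch_encode` adds the `0x` itself before parsing.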
.in_yaml_file(character) ⇒ Object
Returns string indicating which file (and line) contains the transliteration value for the character.
    # File 'lib/stringex/unidecoder.rb', line 45
    def in_yaml_file(character)
      unpacked = character.unpack("U")[0]
      "#{code_group(unpacked)}.yml (line #{grouped_point(unpacked) + 2})"
    end
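The `+ 2` offset is plausible if each data file is a YAML sequence: line 1 holds the `---` document marker, so the entry at zero-based index `i` sits on 1-based line `i + 2`. The sketch below checks that arithmetic against a generated 256-entry array. The file layout, the `vN` placeholder entries, and the helper definitions (`sketch_code_group`, `sketch_grouped_point`) are assumptions for illustration, not taken from the library.

```ruby
require "yaml"

# Hypothetical 256-entry group, dumped the way a YAML sequence is written.
entries = Array.new(256) { |i| "v#{i}" }
lines = entries.to_yaml.lines.map(&:chomp)

# Assumed helpers: high byte selects the file, low byte the entry.
def sketch_code_group(codepoint)
  format("x%02x", codepoint >> 8)
end

def sketch_grouped_point(codepoint)
  codepoint & 0xFF
end

def sketch_in_yaml_file(character)
  unpacked = character.unpack("U")[0]
  "#{sketch_code_group(unpacked)}.yml (line #{sketch_grouped_point(unpacked) + 2})"
end

location  = sketch_in_yaml_file("\u00e9")
index     = sketch_grouped_point("\u00e9".unpack("U")[0])
line_text = lines[index + 2 - 1] # convert the 1-based line number to an array index
```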