Module: Lightstreamer::UTF16

Defined in:
lib/lightstreamer/utf16.rb

Overview

This module supports the decoding of UTF-16 escape sequences.

Class Method Summary

Class Method Details

.decode_escape_sequences(string) ⇒ Object

Decodes any UTF-16 escape sequences in the form '\uXXXX' into a new string. Invalid escape sequences are removed.



# File 'lib/lightstreamer/utf16.rb', line 7

def decode_escape_sequences(string)
  string = decode_surrogate_pairs_escape_sequences string

  # Match all remaining escape sequences
  string.gsub(/\\u[A-F\d]{4}/i) do |escape_sequence|
    codepoint = escape_sequence[2..-1].hex

    # Remaining values of 0xD800 and above are treated as invalid and removed
    codepoint < 0xD800 ? [codepoint].pack('U') : ''
  end
end
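
The second step of the method above can be tried in isolation. The following is a minimal sketch, not part of the library: `decode_remaining` is a hypothetical stand-in lambda that copies the `gsub` block from the listing so the example runs without the gem, and the sample input is made up for illustration.

```ruby
# Hypothetical stand-in that mirrors the gsub block of decode_escape_sequences,
# so the example runs without requiring the lightstreamer gem.
decode_remaining = lambda do |string|
  string.gsub(/\\u[A-F\d]{4}/i) do |escape_sequence|
    # Drop the leading "\u" and parse the four hex digits
    codepoint = escape_sequence[2..-1].hex

    # Values of 0xD800 and above are removed, as in the listing above
    codepoint < 0xD800 ? [codepoint].pack('U') : ''
  end
end

# Single quotes keep the backslashes literal, so the string contains
# the six characters \u00E9 rather than an already-decoded character.
input = 'caf\u00E9 \uD800!'
output = decode_remaining.call(input)

puts output # prints "café !"
```

The valid sequence `\u00E9` is replaced by the character it encodes, while the unpaired surrogate `\uD800` is removed from the result.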

.decode_surrogate_pairs_escape_sequences(string) ⇒ Object

Converts any UTF-16 surrogate pair escape sequences in the form '\uXXXX\uYYYY' into UTF-8.



# File 'lib/lightstreamer/utf16.rb', line 20

def decode_surrogate_pairs_escape_sequences(string)
  string.gsub(/\\uD[89AB][A-F\d]{2}\\uD[C-F][A-F\d]{2}/i) do |escape_sequence|
    high_surrogate = escape_sequence[2...6].hex
    low_surrogate = escape_sequence[8...12].hex

    codepoint = 0x10000 + ((high_surrogate - 0xD800) << 10) + (low_surrogate - 0xDC00)

    [codepoint].pack 'U'
  end
end
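
The surrogate pair arithmetic in the method above can be checked by hand with a worked example. This is a small sketch using the pair D83D/DE00, chosen here only for illustration; the variable names follow the listing.

```ruby
# Example surrogate pair (chosen for illustration): \uD83D\uDE00
high_surrogate = 0xD83D
low_surrogate = 0xDE00

# Same formula as in decode_surrogate_pairs_escape_sequences:
# the high surrogate supplies the upper 10 bits, the low surrogate the lower 10 bits,
# and 0x10000 is added because surrogate pairs start above the Basic Multilingual Plane.
codepoint = 0x10000 + ((high_surrogate - 0xD800) << 10) + (low_surrogate - 0xDC00)

# pack('U') encodes the codepoint as a UTF-8 string
character = [codepoint].pack('U')

puts format('U+%X', codepoint) # prints "U+1F600"
puts character.bytes.map { |byte| format('%02X', byte) }.join(' ') # prints "F0 9F 98 80"
```

Step by step: 0xD83D - 0xD800 = 0x3D, shifted left by 10 bits gives 0xF400; 0xDE00 - 0xDC00 = 0x200; and 0x10000 + 0xF400 + 0x200 = 0x1F600.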