Module: SmsTools::UnicodeEncoding

Extended by:
UnicodeEncoding
Included in:
UnicodeEncoding
Defined in:
lib/sms_tools/unicode_encoding.rb

Constant Summary

BASIC_PLANE =
0x0000..0xFFFF

Instance Method Summary

Instance Method Details

#character_count(char) ⇒ Object

UCS-2/UTF-16 is used for Unicode text messaging. UTF-16 encodes every codepoint in the Basic Plane (U+0000 to U+FFFF) as a single 2-byte code unit, so each such codepoint counts as one character. A codepoint outside the Basic Plane is encoded as a surrogate pair (4 bytes) and therefore counts as two characters in a text message.



# File 'lib/sms_tools/unicode_encoding.rb', line 11

def character_count(char)
  # One unit per Basic Plane codepoint, two per codepoint outside it.
  char.each_codepoint.sum { |codepoint| BASIC_PLANE.include?(codepoint) ? 1 : 2 }
end
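
The following is a minimal usage sketch, not part of the library. It copies the method body into a hypothetical standalone module, `UnicodeCountSketch`, so that it runs without the gem installed; the sample strings are also illustrative assumptions. It cross-checks each result against the number of 2-byte code units Ruby produces when encoding the same string as UTF-16.

```ruby
# Hypothetical stand-in for SmsTools::UnicodeEncoding, reproduced here
# only so the example is self-contained.
module UnicodeCountSketch
  BASIC_PLANE = 0x0000..0xFFFF

  def self.character_count(text)
    text.each_codepoint.sum { |codepoint| BASIC_PLANE.include?(codepoint) ? 1 : 2 }
  end
end

bmp_only   = "caf\u00E9"     # four codepoints, all inside the Basic Plane
with_emoji = "hi \u{1F600}"  # U+1F600 lies outside the Basic Plane

[bmp_only, with_emoji].each do |text|
  counted    = UnicodeCountSketch.character_count(text)
  # Each UTF-16 code unit is 2 bytes, so bytesize / 2 gives the unit count.
  code_units = text.encode("UTF-16LE").bytesize / 2
  puts "#{text.inspect}: counted=#{counted}, utf16_code_units=#{code_units}"
end
```

For any valid string the two numbers should agree, since a codepoint outside the Basic Plane always becomes two UTF-16 code units.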