Module: ICU::Util::String

Defined in:
lib/icu_name/util.rb

Overview

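Helper methods for detecting and converting string encodings, changing case, and transliterating strings that may contain accented Latin-1 characters.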

Constant Summary

LOWER_CHARS =
"àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
UPPER_CHARS =
"ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
ACCENTED_CHARS =
"ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüý"
UNACCENTED_CHARS =
"AAAAAAEEEEIIIINOOOOOUUUUYaaaaaaeeeeiiiinooooouuuuy"
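The constants come in position-aligned pairs: the character at index i of one string corresponds to the character at index i of its partner, which is what lets the methods below hand each pair straight to String#tr. The following is a minimal sketch of that relationship, using local copies of the constants so that it runs without the gem; the variable names are illustrative only.

```ruby
# Local copies of the constants listed above (assumption: copied verbatim from this page).
LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"
ACCENTED_CHARS = "ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüý"
UNACCENTED_CHARS = "AAAAAAEEEEIIIINOOOOOUUUUYaaaaaaeeeeiiiinooooouuuuy"

# String#tr maps position by position, so each pair needs equal length.
pairs_aligned = LOWER_CHARS.length == UPPER_CHARS.length &&
                ACCENTED_CHARS.length == UNACCENTED_CHARS.length

# Look up a character's partner through its index in the table.
upper_of_e_acute = UPPER_CHARS[LOWER_CHARS.index("é")]
plain_of_n_tilde = UNACCENTED_CHARS[ACCENTED_CHARS.index("ñ")]
```

Note that LOWER_CHARS and UPPER_CHARS include æ, ç, ø and þ, while ACCENTED_CHARS does not, so those characters are case-mapped but not transliterated.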

Class Method Summary

Class Method Details

.capitalize(str) ⇒ Object

Capitalize a UTF-8 string that might contain accented characters.



# File 'lib/icu_name/util.rb', line 44

def self.capitalize(str)
  return str.capitalize if str.ascii_only? || !str.match(/\A(.)(.*)\z/)
  upcase($1) + downcase($2)
end
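As a rough illustration of the behaviour, the sketch below copies the three case methods into a hypothetical stand-in module (CaseSketch is not part of the gem) so that it runs on its own. Non-ASCII input is split into its first character and the rest; ASCII-only input falls through to String#capitalize.

```ruby
# Hypothetical stand-in for ICU::Util::String, copied from the listings on
# this page so the example runs without the gem installed.
module CaseSketch
  LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
  UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"

  def self.upcase(str)
    str = str.upcase
    return str if str.ascii_only?
    str.tr(LOWER_CHARS, UPPER_CHARS)
  end

  def self.downcase(str)
    str = str.downcase
    return str if str.ascii_only?
    str.tr(UPPER_CHARS, LOWER_CHARS)
  end

  def self.capitalize(str)
    return str.capitalize if str.ascii_only? || !str.match(/\A(.)(.*)\z/)
    upcase($1) + downcase($2)
  end
end

# First character goes through upcase, the remainder through downcase.
accented = CaseSketch.capitalize("éMILE")
# ASCII-only input is handled by String#capitalize directly.
ascii = CaseSketch.capitalize("o'BRIEN")
```

Because `.` in the pattern does not match a newline, a non-ASCII string containing a line break fails the match and is also passed to String#capitalize.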

.downcase(str) ⇒ Object

Downcase a UTF-8 string that might contain accented characters.



# File 'lib/icu_name/util.rb', line 37

def self.downcase(str)
  str = str.downcase
  return str if str.ascii_only?
  str.tr(UPPER_CHARS, LOWER_CHARS)
end
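A small sketch of the method in use, again with a hypothetical stand-in module (DowncaseSketch) holding a copy of the listing. It also shows that the argument is left untouched, since both String#downcase and String#tr return new strings.

```ruby
# Hypothetical stand-in for ICU::Util::String.downcase, copied from the listing above.
module DowncaseSketch
  LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
  UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"

  def self.downcase(str)
    str = str.downcase                 # handles A-Z (and more on newer Rubies)
    return str if str.ascii_only?
    str.tr(UPPER_CHARS, LOWER_CHARS)   # maps any remaining accented capitals
  end
end

original = "ÅSE ØSTBY"
lowered  = DowncaseSketch.downcase(original)
```

On Ruby 2.4 and later, String#downcase already performs Unicode case mapping, so for the characters in these tables the tr step should find nothing left to change; on older versions it does the accented part of the work.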

.is_utf8(str) ⇒ Object

Decide if a string is valid UTF-8 or not, returning true or false.



# File 'lib/icu_name/util.rb', line 14

def self.is_utf8(str)
  dup = str.dup
  dup.force_encoding("UTF-8")
  dup.valid_encoding?
end
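The check works on a copy, so the encoding tag of the argument is never changed. A minimal sketch, assuming a hypothetical stand-in module (EncodingSketch) that holds a copy of the listing:

```ruby
# Hypothetical stand-in for ICU::Util::String.is_utf8, copied from the listing above.
module EncodingSketch
  def self.is_utf8(str)
    dup = str.dup
    dup.force_encoding("UTF-8")   # re-tag the copy; the bytes are not changed
    dup.valid_encoding?
  end
end

utf8_bytes   = "caf\u00E9".b   # é as the two UTF-8 bytes C3 A9, tagged as binary
latin1_bytes = "caf\xE9".b     # é as the single Latin-1 byte E9, tagged as binary

valid   = EncodingSketch.is_utf8(utf8_bytes)
invalid = EncodingSketch.is_utf8(latin1_bytes)
```

Here `valid` is true because the bytes form well-formed UTF-8 regardless of the tag, while `invalid` is false because a lone E9 byte is not a complete UTF-8 sequence.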

.to_utf8(str) ⇒ Object

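Try to convert any string to UTF-8. Strings whose bytes are already valid UTF-8 are only re-tagged; otherwise, strings tagged as binary or as UTF-8 are assumed to be Windows-1252 before being transcoded.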



# File 'lib/icu_name/util.rb', line 21

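def self.to_utf8(str)
  utf8 = is_utf8(str)
  dup = str.dup
  # Bytes that are already valid UTF-8 only need the right encoding tag.
  return dup.force_encoding("UTF-8") if utf8
  # Binary or mis-tagged UTF-8 bytes are assumed to be Windows-1252.
  dup.force_encoding("Windows-1252") if dup.encoding.name.match(/^(ASCII-8BIT|UTF-8)$/)
  # Transcode from the (possibly reassigned) source encoding to UTF-8.
  dup.encode("UTF-8")
end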
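The sketch below walks through the three paths of the method using a hypothetical stand-in module (ConvertSketch) with copies of the two listings; the sample strings are made up for illustration.

```ruby
# Hypothetical stand-in for ICU::Util::String, copied from the listings above.
module ConvertSketch
  def self.is_utf8(str)
    dup = str.dup
    dup.force_encoding("UTF-8")
    dup.valid_encoding?
  end

  def self.to_utf8(str)
    utf8 = is_utf8(str)
    dup = str.dup
    return dup.force_encoding("UTF-8") if utf8
    dup.force_encoding("Windows-1252") if dup.encoding.name.match(/^(ASCII-8BIT|UTF-8)$/)
    dup.encode("UTF-8")
  end
end

# 1. Valid UTF-8 bytes with the wrong tag: only the tag is changed.
retagged = ConvertSketch.to_utf8("Ren\u00E9".b)

# 2. Tagged UTF-8 but invalid (a lone E9 byte): reinterpreted as Windows-1252.
repaired = ConvertSketch.to_utf8("Ren\xE9")

# 3. Correctly tagged as another encoding: transcoded from that encoding.
latin1     = "Ren\xE9".dup.force_encoding("ISO-8859-1")
transcoded = ConvertSketch.to_utf8(latin1)

# Byte 0x80 is the euro sign in Windows-1252, unlike in ISO-8859-1.
euro = ConvertSketch.to_utf8("\x80")
```

All three inputs end up as the UTF-8 string "René". The Windows-1252 guess is a heuristic: bytes that were really written in some other single-byte encoding will be converted without error but to the wrong characters.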

.transliterate(str) ⇒ Object

Transliterate Latin-1 accented characters to ASCII.



# File 'lib/icu_name/util.rb', line 50

def self.transliterate(str)
  return str.dup if str.ascii_only?
  str.tr(ACCENTED_CHARS, UNACCENTED_CHARS)
end
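A short sketch of what is and is not transliterated, using a hypothetical stand-in module (TranslitSketch) with a copy of the listing and its two constants. Only characters present in ACCENTED_CHARS are replaced; anything else, such as ç, passes through unchanged.

```ruby
# Hypothetical stand-in for ICU::Util::String.transliterate, copied from the listing above.
module TranslitSketch
  ACCENTED_CHARS = "ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüý"
  UNACCENTED_CHARS = "AAAAAAEEEEIIIINOOOOOUUUUYaaaaaaeeeeiiiinooooouuuuy"

  def self.transliterate(str)
    return str.dup if str.ascii_only?
    str.tr(ACCENTED_CHARS, UNACCENTED_CHARS)
  end
end

plain     = TranslitSketch.transliterate("Åsa Ñuñez")   # every accent is in the table
untouched = TranslitSketch.transliterate("François")    # ç is not in ACCENTED_CHARS

# ASCII-only input comes back as an equal but distinct copy.
ascii_in  = "Smith"
ascii_out = TranslitSketch.transliterate(ascii_in)
```

The result is therefore guaranteed to be ASCII only when every non-ASCII character in the input appears in ACCENTED_CHARS.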

.upcase(str) ⇒ Object

Upcase a UTF-8 string that might contain accented characters.



# File 'lib/icu_name/util.rb', line 30

def self.upcase(str)
  str = str.upcase
  return str if str.ascii_only?
  str.tr(LOWER_CHARS, UPPER_CHARS)
end
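One way to sanity-check the method is to feed it the whole lower-case table and confirm that the upper-case table comes back. The sketch below does that with a hypothetical stand-in module (UpcaseSketch) holding a copy of the listing.

```ruby
# Hypothetical stand-in for ICU::Util::String.upcase, copied from the listing above.
module UpcaseSketch
  LOWER_CHARS = "àáâãäåæçèéêëìíîïñòóôõöøùúûüýþ"
  UPPER_CHARS = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝÞ"

  def self.upcase(str)
    str = str.upcase
    return str if str.ascii_only?
    str.tr(LOWER_CHARS, UPPER_CHARS)
  end
end

# Every character in the lower-case table should map to its partner.
whole_table = UpcaseSketch.upcase(UpcaseSketch::LOWER_CHARS)

# A mixed ASCII and accented word.
shouted = UpcaseSketch.upcase("señor")
```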