Module: InjalidDejice

Defined in:
lib/injalid_dejice.rb,
lib/injalid_dejice/decoder.rb,
lib/injalid_dejice/encoder.rb,
lib/injalid_dejice/version.rb,
lib/injalid_dejice/injalid_dejice.rb,
lib/injalid_dejice/locale_resolver.rb

Overview

UTF-8 <-> KOI-7 encoder/decoder.

Defined Under Namespace

Classes: Decoder, Encoder, Error, LocaleResolver

Constant Summary

VERSION =
"1.0.0"
CYR_TO_LAT_DICT =

KOI-7 conversion dictionary that maps Cyrillic characters to Latin characters.

{
  "ю" => "@",
  "а" => "A",
  "б" => "B",
  "ц" => "C",
  "д" => "D",
  "е" => "E",
  "ф" => "F",
  "г" => "G",
  "х" => "H",
  "и" => "I",
  "й" => "J",
  "к" => "K",
  "л" => "L",
  "м" => "M",
  "н" => "N",
  "о" => "O",
  "п" => "P",
  "я" => "Q",
  "р" => "R",
  "с" => "S",
  "т" => "T",
  "у" => "U",
  "ж" => "V",
  "в" => "W",
  "ь" => "X",
  "ы" => "Y",
  "з" => "Z",
  "ш" => "[",
  "э" => "\\",
  "щ" => "]",
  "ч" => "^",
  "ъ" => "_",
  "Ю" => "`",
  "А" => "a",
  "Б" => "b",
  "Ц" => "c",
  "Д" => "d",
  "Е" => "e",
  "Ф" => "f",
  "Г" => "g",
  "Х" => "h",
  "И" => "i",
  "Й" => "j",
  "К" => "k",
  "Л" => "l",
  "М" => "m",
  "Н" => "n",
  "О" => "o",
  "П" => "p",
  "Я" => "q",
  "Р" => "r",
  "С" => "s",
  "Т" => "t",
  "У" => "u",
  "Ж" => "v",
  "В" => "w",
  "Ь" => "x",
  "Ы" => "y",
  "З" => "z",
  "Ш" => "{",
  "Э" => "|",
  "Щ" => "}",
  "Ч" => "~"
}.freeze
LAT_TO_CYR_DICT =

KOI-7 conversion dictionary that maps Latin characters to Cyrillic characters.

CYR_TO_LAT_DICT.invert.freeze
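
The reverse dictionary is derived with Hash#invert, which swaps keys and values. The following is a minimal sketch on an illustrative subset of the table above; the local variable names are hypothetical and not part of the module. Inversion is lossless only when every value is unique, which the sketch checks for the subset.

```ruby
# Illustrative subset of the Cyrillic => Latin table (not the full constant).
cyr_to_lat = { "ю" => "@", "а" => "A", "б" => "B", "Ю" => "`", "А" => "a" }.freeze

# Hash#invert swaps keys and values, giving a Latin => Cyrillic lookup.
lat_to_cyr = cyr_to_lat.invert.freeze

# Inversion loses entries if two keys share a value, so check uniqueness.
invert_is_lossless = cyr_to_lat.values.uniq.size == cyr_to_lat.size

puts lat_to_cyr["@"]      # Cyrillic lowercase "ю"
puts invert_is_lossless
```

Note that the case is swapped by design: lowercase Cyrillic letters map to uppercase Latin letters and vice versa.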
SHARED_CHARS_LIMIT =

ASCII characters with codes below 64 are shared by the Latin (KOI-7 N0) and Cyrillic (KOI-7 N1) code pages.

64
SHARED_CHARS =
" !\"#$%&'()*+,-./0123456789:;<=>?"
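
As a quick check of the shared range, the sketch below confirms that the characters listed in SHARED_CHARS are exactly the printable ASCII codes 32 to 63, and that "@" (the Latin counterpart of "ю" in CYR_TO_LAT_DICT) has code 64. The local variable names are hypothetical; how the module itself uses SHARED_CHARS_LIMIT is not shown in this page.

```ruby
# Same characters as SHARED_CHARS, single-quoted to avoid any interpolation.
shared_chars = ' !"#$%&\'()*+,-./0123456789:;<=>?'

# Code points of the listed characters.
codes = shared_chars.codepoints

# Printable ASCII from 0x20 (space) through 0x3F ("?").
printable_shared = (0x20..0x3F).map(&:chr).join

puts codes.minmax.inspect             # [32, 63]
puts shared_chars == printable_shared # true
puts "@".ord                          # 64
```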
ASCII_CHARS_LIMIT =

Highest possible ASCII character code.

127
DEF_UNKNOWN_CHAR_REP =

Character to use if no replacement was found in the dictionary.

"?"
CYR_CHAR =

Switches encoding to the Cyrillic KOI-7 code page. Also known as the “Shift Out” (SO) character.

0x0E.chr
LATIN_CHAR =

Switches encoding to the Latin KOI-7 code page. Also known as the “Shift In” (SI) character.

0x0F.chr
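
To illustrate how the two shift characters select a code page, here is a minimal sketch of a stateful decoder. It is not the library's Decoder: the module name Koi7DecodeSketch, the method, and the three-entry sample table are assumptions made for illustration only.

```ruby
# Hypothetical sketch; not the library's Decoder implementation.
module Koi7DecodeSketch
  SO = 0x0E.chr # Shift Out: switch to the Cyrillic page
  SI = 0x0F.chr # Shift In: switch back to the Latin page

  # Tiny illustrative subset of the Latin => Cyrillic table.
  LAT_TO_CYR_SAMPLE = { "m" => "М", "I" => "и", "R" => "р" }.freeze

  # Walks the string once, tracking which code page is active.
  def self.decode(koi_string)
    cyrillic = false
    koi_string.each_char.with_object(+"") do |char, out|
      case char
      when SO then cyrillic = true
      when SI then cyrillic = false
      else
        # Characters without a Cyrillic counterpart pass through unchanged.
        out << (cyrillic ? LAT_TO_CYR_SAMPLE.fetch(char, char) : char)
      end
    end
  end
end

puts Koi7DecodeSketch.decode("Hi, \x0EmIR\x0F!") # Hi, Мир!
```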

Class Method Summary

  • .koi_to_utf(koi_string, **kwargs) ⇒ String

  • .utf_to_koi(utf_string, **kwargs) ⇒ String

Class Method Details

.koi_to_utf(koi_string, **kwargs) ⇒ String

Decodes ‘koi_string’ from KOI-7 to UTF-8.

Parameters:

  • koi_string (String)

    A KOI-7-compatible string.

  • kwargs (Hash{Symbol => Object})

    Options.

Options Hash (**kwargs):

  • :unknown_char_rep (String) — default: DEF_UNKNOWN_CHAR_REP

    Replacement character for unsupported characters.

  • :strict_mode (Boolean) — default: false

    When ‘true’, every opening SO (0x0E) character must have a closing SI (0x0F) counterpart; otherwise an ArgumentError is raised. When ‘false’, no error is raised.

Returns:

  • (String)


# File 'lib/injalid_dejice.rb', line 56

def self.koi_to_utf(koi_string, **kwargs)
  Decoder.new(**kwargs).call(koi_string)
end
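
The :strict_mode option described above can be pictured as a balance check on the shift characters. The sketch below is one possible reading of that rule, not the library's code: the module Koi7StrictSketch and the method check_balanced! are hypothetical, and details such as how a stray SI or a repeated SO is treated are assumptions.

```ruby
# Hypothetical sketch of a strict-mode style check; not the library's code.
module Koi7StrictSketch
  SO = 0x0E.chr
  SI = 0x0F.chr

  # Returns the string unchanged when every SO is later closed by an SI,
  # and raises ArgumentError when an SO is still open at the end.
  # Assumption: a stray SI and a repeated SO are tolerated.
  def self.check_balanced!(koi_string)
    open = false
    koi_string.each_char do |char|
      open = true if char == SO
      open = false if char == SI
    end
    raise ArgumentError, "SO (0x0E) without a closing SI (0x0F)" if open

    koi_string
  end
end

Koi7StrictSketch.check_balanced!("\x0EmIR\x0F") # passes

begin
  Koi7StrictSketch.check_balanced!("\x0EmIR")
rescue ArgumentError => e
  puts e.message
end
```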

.utf_to_koi(utf_string, **kwargs) ⇒ String

Encodes ‘utf_string’ from UTF-8 to KOI-7.

Parameters:

  • utf_string (String)

    A UTF-8 string.

  • kwargs (Hash{Symbol => Object})

    Options.

Options Hash (**kwargs):

  • :forced_latin (Array<String>) — default: []

    Characters that are forced to be treated as Latin.

  • :unknown_char_rep (String) — default: DEF_UNKNOWN_CHAR_REP

    Replacement character for unsupported characters.

Returns:

  • (String)

    A US-ASCII string, KOI-7-compatible.



# File 'lib/injalid_dejice.rb', line 33

def self.utf_to_koi(utf_string, **kwargs)
  Encoder.new(**kwargs).call(utf_string)
end
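
For orientation, the encoding direction can be sketched as follows. This is an illustration under stated assumptions, not the library's Encoder: the module Koi7EncodeSketch and its sample table are hypothetical, the way SO and SI are emitted around Cyrillic runs is an assumption, and the :forced_latin option is not modeled.

```ruby
# Hypothetical sketch; not the library's Encoder implementation.
module Koi7EncodeSketch
  SO = 0x0E.chr # switch to the Cyrillic page
  SI = 0x0F.chr # switch back to the Latin page

  # Tiny illustrative subset of the Cyrillic => Latin table.
  CYR_TO_LAT_SAMPLE = { "М" => "m", "и" => "I", "р" => "R" }.freeze

  def self.encode(utf_string, unknown_char_rep: "?")
    cyrillic = false
    out = +""
    utf_string.each_char do |char|
      if (latin = CYR_TO_LAT_SAMPLE[char])
        out << SO unless cyrillic # open a Cyrillic run
        cyrillic = true
        out << latin
      elsif char.ord < 64
        out << char # shared by both code pages, no switch needed
      else
        out << SI if cyrillic # close the Cyrillic run
        cyrillic = false
        out << (char.ord <= 127 ? char : unknown_char_rep)
      end
    end
    out << SI if cyrillic # never leave a run open
    out.force_encoding(Encoding::US_ASCII)
  end
end

p Koi7EncodeSketch.encode("Hi, Мир!") # "Hi, \u000EmIR!\u000F"
```

Passing shared characters through without a page switch keeps the output shorter, since punctuation and digits inside Cyrillic text do not force an SI/SO pair.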