Module: TwitterCldr::Collation::ImplicitCollationElements

Defined in:
lib/twitter_cldr/collation/implicit_collation_elements.rb

Overview

ImplicitCollationElements generates implicit collation elements for code points (including some CJK characters) that are not explicitly mentioned in the collation elements table.

This module was ported from the ICU4J library (ImplicitCEGenerator class). See NOTICE file for license information.

Constant Summary

DEFAULT_SECONDARY_AND_TERTIARY =
5
MIN_PRIMARY =

primary value

0xE0
MAX_PRIMARY =
0xe4
MIN_TRAIL =

final byte

0x04
MAX_TRAIL =
0xFE
GAP_3 =

gap for tailoring of 3-byte forms

1
PRIMARIES_3_COUNT =

number of 3-byte primaries that can be used

1
MAX_INPUT =

2 * [Unicode range] + 2

0x220001
MEDIAL_COUNT =

medials can use full range

MAX_TRAIL - MIN_TRAIL + 1
FINAL_3_MULTIPLIER =

number of values we can use in trailing bytes
leave room for empty values between AND above, e.g., if gap = 2
(+ marks a value that is used, - a value that is left empty)

range 3..7 => +3 -4 -5 -6 -7: so 1 value
range 3..8 => +3 -4 -5 +6 -7 -8: so 2 values
range 3..9 => +3 -4 -5 +6 -7 -8 -9: so 2 values

GAP_3 + 1
FINAL_3_COUNT =
MEDIAL_COUNT / FINAL_3_MULTIPLIER
THREE_BYTE_COUNT =

find out how many values fit in each form

MEDIAL_COUNT * FINAL_3_COUNT
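
As a rough check of the 3-byte arithmetic above, the following sketch recomputes the constants from the listed values and reproduces the gap example from the FINAL_3_MULTIPLIER comment. The helper usable_values is hypothetical and is not part of the module; it only assumes that a range holds size / (gap + 1) usable values, which is what the three example ranges suggest.

```ruby
# Values copied from the constant summary.
MIN_TRAIL = 0x04
MAX_TRAIL = 0xFE
GAP_3     = 1

MEDIAL_COUNT       = MAX_TRAIL - MIN_TRAIL + 1          # 251
FINAL_3_MULTIPLIER = GAP_3 + 1                           # 2
FINAL_3_COUNT      = MEDIAL_COUNT / FINAL_3_MULTIPLIER   # 125 (integer division)
THREE_BYTE_COUNT   = MEDIAL_COUNT * FINAL_3_COUNT        # 31375

# Hypothetical helper: lists the values in `range` that stay usable when
# `gap` empty values must follow each used one (between AND above).
def usable_values(range, gap)
  step  = gap + 1
  count = range.size / step
  Array.new(count) { |i| range.first + i * step }
end

p usable_values(3..7, 2) # [3]
p usable_values(3..8, 2) # [3, 6]
p usable_values(3..9, 2) # [3, 6]
```

With GAP_3 = 1, every second trailing byte value is used, which is why FINAL_3_COUNT is roughly half of MEDIAL_COUNT.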
PRIMARIES_AVAILABLE =

now determine where the 3/4 boundary is: we use 3 bytes below the boundary, and 4 above

MAX_PRIMARY - MIN_PRIMARY + 1
PRIMARIES_4_COUNT =
PRIMARIES_AVAILABLE - PRIMARIES_3_COUNT
MIN_4_PRIMARY =
MIN_PRIMARY + PRIMARIES_3_COUNT
MIN_4_BOUNDARY =
PRIMARIES_3_COUNT * THREE_BYTE_COUNT
TOTAL_NEEDED =
MAX_INPUT - MIN_4_BOUNDARY
NEEDED_PER_PRIMARY_BYTE =
(TOTAL_NEEDED - 1) / PRIMARIES_4_COUNT + 1
NEEDED_PER_FINAL_BYTE =
(NEEDED_PER_PRIMARY_BYTE - 1) / (MEDIAL_COUNT * MEDIAL_COUNT) + 1
GAP_4 =
(MAX_TRAIL - MIN_TRAIL - 1) / NEEDED_PER_FINAL_BYTE
FINAL_4_MULTIPLIER =
GAP_4 + 1
FINAL_4_COUNT =
NEEDED_PER_FINAL_BYTE
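
The 4-byte constants are all derived from the values listed earlier, so they can be recomputed by hand. The sketch below does that with plain integer arithmetic and then checks that the resulting 4-byte space is large enough for the inputs left over after the 3-byte form. The capacity variable and the final check are additions for illustration, not constants of the module.

```ruby
# Values copied from the constant summary.
MIN_PRIMARY       = 0xE0
MAX_PRIMARY       = 0xE4
MIN_TRAIL         = 0x04
MAX_TRAIL         = 0xFE
PRIMARIES_3_COUNT = 1
MAX_INPUT         = 0x220001
MEDIAL_COUNT      = MAX_TRAIL - MIN_TRAIL + 1            # 251
THREE_BYTE_COUNT  = MEDIAL_COUNT * (MEDIAL_COUNT / 2)    # 31375, since GAP_3 + 1 == 2

PRIMARIES_AVAILABLE = MAX_PRIMARY - MIN_PRIMARY + 1              # 5
PRIMARIES_4_COUNT   = PRIMARIES_AVAILABLE - PRIMARIES_3_COUNT    # 4
MIN_4_PRIMARY       = MIN_PRIMARY + PRIMARIES_3_COUNT            # 0xE1
MIN_4_BOUNDARY      = PRIMARIES_3_COUNT * THREE_BYTE_COUNT       # 31375
TOTAL_NEEDED        = MAX_INPUT - MIN_4_BOUNDARY                 # 2196850

# (n - 1) / d + 1 is ceiling division for positive integers.
NEEDED_PER_PRIMARY_BYTE = (TOTAL_NEEDED - 1) / PRIMARIES_4_COUNT + 1                           # 549213
NEEDED_PER_FINAL_BYTE   = (NEEDED_PER_PRIMARY_BYTE - 1) / (MEDIAL_COUNT * MEDIAL_COUNT) + 1    # 9

GAP_4              = (MAX_TRAIL - MIN_TRAIL - 1) / NEEDED_PER_FINAL_BYTE   # 27
FINAL_4_MULTIPLIER = GAP_4 + 1                                             # 28
FINAL_4_COUNT      = NEEDED_PER_FINAL_BYTE                                 # 9

# Illustrative check (not a module constant): number of distinct 4-byte weights.
capacity = PRIMARIES_4_COUNT * MEDIAL_COUNT * MEDIAL_COUNT * FINAL_4_COUNT
puts capacity >= TOTAL_NEEDED # true
```

The 4-byte form has four lead bytes (0xE1 to 0xE4), two full-range medial bytes, and nine final byte values spaced 28 apart, which covers every input above MIN_4_BOUNDARY.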
NON_CJK_OFFSET =

CJK constants

0x110000
CJK_COMPAT_USED_BASE =
0xFA0E
CJK_COMPAT_USED_LIMIT =
0xFA2F + 1
CJK_BASE =

4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;

0x4E00
CJK_LIMIT =

9FCC;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;

0x9FCC + 1
CJK_A_BASE =

3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;

0x3400
CJK_A_LIMIT =

4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;

0x4DB5 + 1
CJK_B_BASE =

20000;<CJK Ideograph Extension B, First>;Lo;0;L;;;;;N;;;;;

0x20000
CJK_B_LIMIT =

2A6D6;<CJK Ideograph Extension B, Last>;Lo;0;L;;;;;N;;;;;

0x2A6D6 + 1
CJK_C_BASE =

2A700;<CJK Ideograph Extension C, First>;Lo;0;L;;;;;N;;;;;

0x2A700
CJK_C_LIMIT =

2B734;<CJK Ideograph Extension C, Last>;Lo;0;L;;;;;N;;;;;

0x2B734 + 1
CJK_D_BASE =

2B740;<CJK Ideograph Extension D, First>;Lo;0;L;;;;;N;;;;;

0x2B740
CJK_D_LIMIT =

2B81D;<CJK Ideograph Extension D, Last>;Lo;0;L;;;;;N;;;;;

0x2B81D + 1
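
Each CJK block above is given as a BASE and a LIMIT, where LIMIT is the last code point plus one, so the pair describes a half-open interval. The sketch below shows one way to test membership with exclusive Ruby ranges. The CJK_RANGES hash, its keys, and the cjk_block helper are hypothetical names introduced for illustration; this is not the module's swapCJK logic, which is not shown on this page.

```ruby
# Half-open [BASE, LIMIT) intervals built from the constants listed above.
# Hash name and keys are illustrative only.
CJK_RANGES = {
  unified:     (0x4E00...(0x9FCC + 1)),
  compat_used: (0xFA0E...(0xFA2F + 1)),
  extension_a: (0x3400...(0x4DB5 + 1)),
  extension_b: (0x20000...(0x2A6D6 + 1)),
  extension_c: (0x2A700...(0x2B734 + 1)),
  extension_d: (0x2B740...(0x2B81D + 1))
}.freeze

# Hypothetical helper: returns the block name for a code point,
# or nil when the code point is outside every listed range.
def cjk_block(code_point)
  match = CJK_RANGES.find { |_name, range| range.cover?(code_point) }
  match && match.first
end

p cjk_block(0x4E00)  # :unified
p cjk_block(0x9FCC)  # :unified (last code point, LIMIT is exclusive)
p cjk_block(0x9FCD)  # nil
p cjk_block(0x41)    # nil
```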

Class Method Summary

Class Method Details

.for_code_point(code_point) ⇒ Object



# File 'lib/twitter_cldr/collation/implicit_collation_elements.rb', line 20

def for_code_point(code_point)
  [[primary_weight(swapCJK(code_point) + 1), DEFAULT_SECONDARY_AND_TERTIARY, DEFAULT_SECONDARY_AND_TERTIARY]]
end