Class: CBETA::Gaiji

Inherits:
Object
  • Object
Defined in:
lib/cbeta/gaiji.rb

Overview

Provides access to the CBETA gaiji (missing-character) database.

Instance Method Summary

Constructor Details

#initialize ⇒ Gaiji

Loads the CBETA gaiji database.



# File 'lib/cbeta/gaiji.rb', line 8

def initialize
  @us = CBETA::UnicodeService.new
  folder = File.join(File.dirname(__FILE__), '../data')
  fn = File.join(folder, 'cbeta_gaiji.json')
  @gaijis = JSON.parse(File.read(fn))
  
  fn = File.join(folder, 'cbeta_sanskrit.json')
  h = JSON.parse(File.read(fn))
  @gaijis.merge!(h)
  
  @zzs = {}
  @uni2cb = {}
  @gaijis.each do |k,v|
    if v.key? 'composition'
      zzs = v['composition']
      @zzs[zzs] = k
    end
    
    if v.key? 'uni_char'
      c = v['uni_char']
      @uni2cb[c] = k
    end
  end
end
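
The two reverse lookup tables built by the constructor can be sketched in isolation. This is a minimal illustration, assuming a small in-memory hash in place of the JSON data files; the `build_indexes` helper and the `CB99999` entry are hypothetical and not part of the library.

```ruby
# Hypothetical helper mirroring the indexing loop in #initialize:
# map each composition expression and each Unicode character back to its code.
def build_indexes(gaijis)
  zzs = {}
  uni2cb = {}
  gaijis.each do |code, info|
    zzs[info['composition']] = code if info.key?('composition')
    uni2cb[info['uni_char']] = code if info.key?('uni_char')
  end
  [zzs, uni2cb]
end

# Sample data: the first entry follows the CB01002 example on this page,
# the second is made up to show an entry without a Unicode character.
sample = {
  'CB01002' => { 'composition' => '[得-彳]', 'unicode' => '3775', 'uni_char' => "\u3775" },
  'CB99999' => { 'composition' => '[甲*乙]' }
}

zzs_index, uni_index = build_indexes(sample)
puts zzs_index['[得-彳]']      # code registered for this composition
puts uni_index.key?("\u3775")  # whether the character has a reverse entry
```

Entries without a `uni_char` key simply get no reverse entry, so a lookup for them returns nil.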

Instance Method Details

#[](cb) ⇒ Hash{String => String, Array<String>}?

Returns the information recorded for a gaiji.

Return:

{
  "composition": "[得-彳]",
  "unicode": "3775",
  "uni_char": "㝵",
  "zhuyin": [ "ㄉㄜˊ", "ㄞˋ" ]
}

Examples:

g = CBETA::Gaiji.new
g["CB01002"]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Hash{String => String, Array<String>})

    gaiji information

  • (nil)

    if the CB code does not exist in the CBETA gaiji database



# File 'lib/cbeta/gaiji.rb', line 50

def [](cb)
  @gaijis[cb]
end

#key?(cb) ⇒ Boolean

Checks whether a gaiji code exists.

Returns:

  • (Boolean)


# File 'lib/cbeta/gaiji.rb', line 55

def key?(cb)
  @gaijis.key? cb
end

#to_s(gid, cb_priority: nil, skt_priority: nil) ⇒ String

Renders a gaiji according to a priority order.

Parameters:

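  • gid (String)

    gaiji code; codes starting with "CB" are resolved with cb_priority, all others with skt_priority

  • cb_priority (Array<String>) (defaults to: nil)

    priority order for CB gaiji. When nil, the default order is:

    * uni_2: use the Unicode character if it is a Level 2 character (within Unicode 10.0)
    * norm_uni_2: use the normalized Unicode character if it is a Level 2 character (within Unicode 10.0)
    * norm_big5_char: use the normalized Big5 character if there is one
    * uni_char: use the Unicode character if there is one
    * norm_uni_char: use the normalized Unicode character if there is one
    * composition: use the composition expression if there is one

  • skt_priority (Array<String>) (defaults to: nil)

    priority order for Sanskrit characters. When nil, the default order is symbol, romanized, PUA.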
    

Returns:

  • (String)

    may be nil (unknown gid, or none of the listed forms is available)
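
The priority fallback can be sketched as a walk over the list that returns the first non-empty field. This is a simplified illustration only: the `first_available` helper is hypothetical, and the Level 2 checks behind `uni_2` and `norm_uni_2` are left out because they depend on `CBETA::UnicodeService`.

```ruby
# Hypothetical helper: return the first non-empty field named in +priority+,
# or nil when none of them is present.
def first_available(entry, priority)
  priority.each do |key|
    value = entry[key]
    return value unless value.nil? || value.empty?
  end
  nil
end

# Sample entry based on the CB01002 example on this page.
entry = { 'composition' => '[得-彳]', 'uni_char' => "\u3775" }

puts first_available(entry, %w[norm_big5_char uni_char composition]) # falls through to uni_char
puts first_available(entry, %w[composition uni_char])                # composition wins
p first_available(entry, %w[norm_big5_char])                         # no match
```

Reordering the list is the only thing a caller needs to do to prefer, say, composition expressions over Unicode characters.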



# File 'lib/cbeta/gaiji.rb', line 71

def to_s(gid, cb_priority: nil, skt_priority: nil)
  if cb_priority.nil?
    cb_priority = %w(uni_2 norm_uni_2 norm_big5_char uni_char norm_uni_char composition)
  end

  if skt_priority.nil?
    skt_priority = %w(symbol romanized PUA)
  end
  
  g = @gaijis[gid]
  return nil if g.nil?
  
  if gid.start_with? 'CB'
    cb_priority.each do |k|
      case k
      when 'PUA'
        return CBETA.pua(gid)
      when 'uni_2'
        k = 'uni_char'
        return g[k] if @us.level2?(g[k])
      when 'norm_uni_2'
        k = 'norm_uni_char'
        return g[k] if @us.level2?(g[k])
      else
        return g[k] if g.key?(k) and not g[k].empty?
      end
    end
  else
    skt_priority.each do |k|
      if k == 'PUA'
        s = g['pua'].sub(/^U\+(.*)$/, '\1')
        i = s.to_i(16)
        return [i].pack("U")
      else
        if g.key? k
          return g[k] unless g[k].empty?
        end
      end
    end
  end
  nil
end
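
The `PUA` branch for non-CB entries turns a `U+XXXX` string into the character at that code point. A minimal sketch of that conversion on its own, assuming the `pua` field has the `U+` form the regular expression expects; the `pua_to_char` helper and the value `U+F0001` are illustrative and not taken from the database.

```ruby
# Hypothetical helper mirroring the 'PUA' branch of #to_s.
def pua_to_char(pua)
  hex = pua.sub(/^U\+(.*)$/, '\1') # strip the "U+" prefix
  [hex.to_i(16)].pack('U')         # encode the code point as a UTF-8 string
end

char = pua_to_char('U+F0001') # illustrative value only
puts format('U+%04X', char.ord) # round-trips back to the code point
```

`Array#pack` with the `U` directive produces a UTF-8 encoded string, so the result can be concatenated with ordinary text.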

#unicode_to_cb(unicode_char) ⇒ Object



# File 'lib/cbeta/gaiji.rb', line 114

def unicode_to_cb(unicode_char)
  @uni2cb[unicode_char]
end

#zhuyin(cb) ⇒ Array<String>

Given a gaiji CB code, returns an array of its zhuyin (bopomofo) readings.

Data source: the MS Access gaiji database provided by CBETA on 2015.5.15.

Examples:

g = CBETA::Gaiji.new
g.zhuyin("CB00023") # return [ "ㄍㄢˇ", "ㄍㄢ", "ㄧㄤˊ", "ㄇㄧˇ", "ㄇㄧㄝ", "ㄒㄧㄤˊ" ]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Array<String>)

    zhuyin readings

  • (nil)

    if the CB code is not in the gaiji database


# File 'lib/cbeta/gaiji.rb', line 128

def zhuyin(cb)
  return nil unless @gaijis.key? cb
  @gaijis[cb]['zhuyin']
end
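
Because the method returns nil for an unknown code, callers that iterate over the result may want to normalize it to an array first. A minimal sketch, assuming a plain hash stands in for the loaded database; the `zhuyin_or_empty` helper is hypothetical and not part of the library.

```ruby
# Hypothetical wrapper: always return an array, even for unknown codes
# or entries without a 'zhuyin' field.
def zhuyin_or_empty(gaijis, cb)
  Array(gaijis.dig(cb, 'zhuyin'))
end

# Sample data based on the CB01002 example on this page.
sample = { 'CB01002' => { 'zhuyin' => ['ㄉㄜˊ', 'ㄞˋ'] } }

zhuyin_or_empty(sample, 'CB01002').each { |reading| puts reading }
p zhuyin_or_empty(sample, 'CB00000') # unknown code
```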

#zzs2pua(zzs) ⇒ Object

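Given a composition expression, returns the corresponding PUA (Private Use Area) value; returns nil if the expression is not in the database.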



# File 'lib/cbeta/gaiji.rb', line 134

def zzs2pua(zzs)
  return nil unless @zzs.key? zzs
  gid = @zzs[zzs]
  CBETA.pua(gid)
end