Class: CBETA::Gaiji

Inherits:
Object
Defined in:
lib/cbeta/gaiji.rb

Overview

Accesses the CBETA gaiji (missing-character) database.

Instance Method Summary

Constructor Details

#initialize(gaiji_base) ⇒ Gaiji

Loads the CBETA gaiji database. gaiji_base is the path to a local clone of github.com/cbeta-org/cbeta_gaiji.



# File 'lib/cbeta/gaiji.rb', line 7

def initialize(gaiji_base)
  fn = File.join(gaiji_base, 'cbeta_gaiji.json')
  @gaijis = JSON.parse(File.read(fn))
  
  fn = File.join(gaiji_base, 'cbeta_sanskrit.json')
  h = JSON.parse(File.read(fn))
  @gaijis.merge!(h)
  
  @zzs = {}
  @uni2cb = {}
  @gaijis.each do |k,v|
    if v.key? 'composition'
      zzs = v['composition']
      @zzs[zzs] = k
    end
    
    if v.key? 'uni_char'
      c = v['uni_char']
      @uni2cb[c] = k
    end
  end
end
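
The two reverse indexes built by the constructor can be illustrated on in-memory data. This is a minimal sketch, not the library's own code path: the sample entries and the codes CB90001 and CB90002 are invented for illustration, and only the key names composition and uni_char are taken from the constructor above.

```ruby
require 'json'

# Hypothetical sample shaped like cbeta_gaiji.json; the CB codes are invented.
sample = JSON.parse(<<~'JSON')
  {
    "CB90001": { "composition": "[得-彳]", "uni_char": "\u3775" },
    "CB90002": { "composition": "[金*本]" }
  }
JSON

zzs = {}     # composition expression => CB code
uni2cb = {}  # Unicode character     => CB code

sample.each do |code, info|
  zzs[info['composition']] = code if info.key?('composition')
  uni2cb[info['uni_char']] = code if info.key?('uni_char')
end

p zzs
p uni2cb
```

Entries without a uni_char key, like the second one here, simply never appear in the Unicode index, which is why a lookup through that index can return nil.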

Instance Method Details

#[](cb) ⇒ Hash{String => String, Array<String>}?

Returns the information for a gaiji.

Return:

{
  "composition": "[得-彳]",
  "unicode": "3775",
  "uni_char": "㝵",
  "zhuyin": [ "ㄉㄜˊ", "ㄞˋ" ]
}

Examples:

g = CBETA::Gaiji.new(gaiji_base) # gaiji_base: path to a local clone of cbeta_gaiji
g["CB01002"]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Hash{String => String, Array<String>})

    gaiji information

  • (nil)

    if the CB code does not exist in the CBETA gaiji database



# File 'lib/cbeta/gaiji.rb', line 47

def [](cb)
  @gaijis[cb]
end

#key?(cb) ⇒ Boolean

Checks whether a gaiji code exists.

Returns:

  • (Boolean)


# File 'lib/cbeta/gaiji.rb', line 52

def key?(cb)
  @gaijis.key? cb
end

#to_s(gid, cb_priority = nil, skt_priority = nil) ⇒ Object

Renders a gaiji according to a priority order.



# File 'lib/cbeta/gaiji.rb', line 57

def to_s(gid, cb_priority=nil, skt_priority=nil)
  if cb_priority.nil?
    cb_priority = %w(uni_char norm_uni_char norm_big5_char composition)
  end
  
  if skt_priority.nil?
    skt_priority = %w(symbol romanized PUA)
  end
  
  g = @gaijis[gid]
  if gid.start_with? 'CB'
    cb_priority.each do |k|
      if k == 'PUA'
        return CBETA.pua(gid)
      elsif g.key? k
        return g[k] unless g[k].empty?
      end
    end
  else
    skt_priority.each do |k|
      if k == 'PUA'
        s = g['pua'].sub(/^U\+(.*)$/, '\1')
        i = s.to_i(16)
        return [i].pack("U")
      else
        if g.key? k
          return g[k] unless g[k].empty?
        end
      end
    end
  end
  nil
end
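
The fallback logic in to_s can be sketched in isolation: walk the priority list and return the first field that is present and non-empty. In the sketch below the helper names first_available and pua_to_char, the sample entry, and the sample value 'U+E000' are all hypothetical; the field names, the default order, and the "U+" conversion steps are copied from the method above. The 'PUA' branch for CB codes is left out because it relies on CBETA.pua, which is not shown on this page.

```ruby
# Default order to_s uses for CB codes, copied from the method above.
cb_priority = %w[uni_char norm_uni_char norm_big5_char composition]

# Hypothetical helper: first non-empty field named in `priority`, or nil.
def first_available(entry, priority)
  priority.each do |key|
    value = entry[key]
    return value unless value.nil? || value.empty?
  end
  nil
end

# Hypothetical helper mirroring the Sanskrit 'PUA' branch:
# strip the "U+" prefix, parse the hex digits, encode the code point as UTF-8.
def pua_to_char(pua)
  hex = pua.sub(/^U\+(.*)$/, '\1')
  [hex.to_i(16)].pack('U')
end

# Invented entry with an empty uni_char, so rendering falls through to later fields.
entry = { 'uni_char' => '', 'norm_big5_char' => '得', 'composition' => '[得-彳]' }

puts first_available(entry, cb_priority)
puts first_available(entry, %w[composition])
puts pua_to_char('U+E000').ord.to_s(16)
```

Passing a shorter priority list, as in the second call, is how a caller would force the composition expression even when a normalized character is available.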

#unicode_to_cb(unicode_char) ⇒ Object



# File 'lib/cbeta/gaiji.rb', line 91

def unicode_to_cb(unicode_char)
  @uni2cb[unicode_char]
end

#zhuyin(cb) ⇒ Array<String>

Given a gaiji CB code, returns an array of its zhuyin (bopomofo) readings.

Data source: the MS Access gaiji database provided by CBETA on 2015-05-15.

Examples:

g = CBETA::Gaiji.new(gaiji_base) # gaiji_base: path to a local clone of cbeta_gaiji
g.zhuyin("CB00023") # returns [ "ㄍㄢˇ", "ㄍㄢ", "ㄧㄤˊ", "ㄇㄧˇ", "ㄇㄧㄝ", "ㄒㄧㄤˊ" ]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Array<String>)


# File 'lib/cbeta/gaiji.rb', line 105

def zhuyin(cb)
  return nil unless @gaijis.key? cb
  @gaijis[cb]['zhuyin']
end

#zzs2pua(zzs) ⇒ Object

Given a composition expression (組字式), returns the corresponding PUA character.



# File 'lib/cbeta/gaiji.rb', line 111

def zzs2pua(zzs)
  return nil unless @zzs.key? zzs
  gid = @zzs[zzs]
  CBETA.pua(gid)
end