Module: KLookup

Defined in:
lib/klookup.rb

Overview

Contains Lookup and Database.

Defined Under Namespace

Modules: Lookup

Classes: Database

Class Method Summary

Class Method Details

.cp_to_str(val) ⇒ Object

Returns a string containing the UTF-8 encoded character for the code point val.

Uses RUnicode’s Integer#chr method.



# File 'lib/klookup.rb', line 33

def self.cp_to_str(val)
  return val.chr
end
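
The method above relies on RUnicode's patched Integer#chr. As a minimal sketch of the same conversion without RUnicode, assuming Ruby 1.9 or later, the encoding can be passed to Integer#chr explicitly. The name codepoint_to_utf8 is hypothetical and not part of KLookup.

```ruby
# Hypothetical stand-alone helper; not part of KLookup.
def codepoint_to_utf8(val)
  # Passing the encoding explicitly avoids a RangeError for code points
  # above 0xFF when no default internal encoding is set.
  val.chr(Encoding::UTF_8)
end

puts codepoint_to_utf8(0x3042)  # hiragana "あ"
puts codepoint_to_utf8(0x30A2)  # katakana "ア"
```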

.include_kana?(str) ⇒ Boolean

Returns true if str contains at least one character in the range U+3040 to U+30FF (the hiragana and katakana blocks).

Returns:

  • (Boolean)


# File 'lib/klookup.rb', line 25

def self.include_kana?(str)
  return (not (str =~ /[#{0x3040.chr}-#{0x30FF.chr}]/).nil?)
end
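
For comparison, here is a hedged sketch of the same check without RUnicode, assuming Ruby 2.4 or later (for String#match?) and UTF-8 input. The names KANA_RANGE and kana? are hypothetical. The range is copied from the source above.

```ruby
# Hypothetical stand-alone version; not part of KLookup.
# U+3040..U+30FF covers the hiragana and katakana blocks.
KANA_RANGE = /[\u3040-\u30FF]/

def kana?(str)
  # match? returns a plain boolean and does not set $~.
  str.match?(KANA_RANGE)
end

puts kana?("カタカナ")    # true
puts kana?("kanji 漢字")  # false
```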

.norm_kana(str) ⇒ Object

Returns a regular expression that matches str regardless of whether each kana character is written in hiragana or katakana.



# File 'lib/klookup.rb', line 39

def self.norm_kana(str)
  # Relevant codepoints:
  #    ひらがな == カタカナ
  # 3041 - 3096 == 30A1 - 30F6  -  ァ-ヶ
  # 309D - 309E == 30FD - 30FE  -  ヽ-ヾ
  hiragana = (0x3041..0x3096).to_a + (0x309D..0x309E).to_a
  katakana = (0x30A1..0x30F6).to_a + (0x30FD..0x30FE).to_a
  hkhash = {}
  khhash = {}
  i=0
  hiragana.each {|c|
    hkhash[c] = katakana[i]
    khhash[katakana[i]] = c
    i+=1
  }
  re=''
  str.each_char {|c|
    if hiragana.include?(c.chars.first)
      re << "[#{c}#{cp_to_str(hkhash[c.chars.first])}]"
    elsif katakana.include?(c.chars.first)
      re << "[#{c}#{cp_to_str(khhash[c.chars.first])}]"
    else
      re << c
    end
  }
  Regexp.new("#{re}")
end
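
The approach above can be sketched without RUnicode, assuming Ruby 2.4 or later and UTF-8 input. This is an illustration only, not the library's code: the names HIRAGANA, KATAKANA, COUNTERPART and kana_insensitive_regexp are hypothetical. One deliberate difference from the source is that non-kana characters are passed through Regexp.escape, so regexp metacharacters in str are matched literally.

```ruby
# Hypothetical stand-alone sketch; not part of KLookup.
# Code point ranges are taken from the comments in the source above.
HIRAGANA = [*0x3041..0x3096, *0x309D..0x309E].freeze
KATAKANA = [*0x30A1..0x30F6, *0x30FD..0x30FE].freeze

# Map every kana code point to its counterpart in the other script.
COUNTERPART = HIRAGANA.zip(KATAKANA)
                      .flat_map { |h, k| [[h, k], [k, h]] }
                      .to_h
                      .freeze

def kana_insensitive_regexp(str)
  pattern = str.each_char.map do |c|
    other = COUNTERPART[c.ord]
    if other
      # Kana: accept either the hiragana or the katakana form.
      "[#{c}#{other.chr(Encoding::UTF_8)}]"
    else
      # Assumption: treat everything else literally (differs from the source).
      Regexp.escape(c)
    end
  end.join
  Regexp.new(pattern)
end

re = kana_insensitive_regexp("ひらがな")
puts re.match?("ヒラガナ")  # true
puts re.match?("ひらガナ")  # true
puts re.match?("かたかな")  # false
```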