Module: KLookup
Defined in: lib/klookup.rb
Overview
Contains Lookup and Database.
Defined Under Namespace
Modules: Lookup
Classes: Database
Class Method Summary
- .cp_to_str(val) ⇒ Object
  Returns a string containing the UTF-8 encoded character for the code point val.
- .include_kana?(str) ⇒ Boolean
  Returns true if str contains at least one kana character.
- .norm_kana(str) ⇒ Object
  Returns a regular expression built from str that matches it regardless of whether each kana is written as hiragana or katakana.
Class Method Details
.cp_to_str(val) ⇒ Object
Returns a string containing the UTF-8 encoded character for the code point val.
Uses RUnicode’s Integer#chr method.
# File 'lib/klookup.rb', line 33

def self.cp_to_str(val)
  return val.chr
end
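The listing above depends on RUnicode's version of Integer#chr. As a rough sketch only, assuming Ruby 1.9 or later without RUnicode, the same conversion can be written with the built-in Integer#chr and an explicit encoding. The method name cp_to_str_sketch is hypothetical and not part of KLookup.

```ruby
# Hypothetical stand-alone equivalent of KLookup.cp_to_str (not part of the
# library). Assumes Ruby 1.9+, where Integer#chr accepts an encoding argument
# and raises RangeError for code points above 255 when none is given.
def cp_to_str_sketch(val)
  val.chr(Encoding::UTF_8)
end

puts cp_to_str_sketch(0x3042)  # U+3042, HIRAGANA LETTER A
```

Passing the encoding explicitly keeps the result independent of the process's default internal encoding.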
.include_kana?(str) ⇒ Boolean
Returns true if there is kana in the string.
# File 'lib/klookup.rb', line 25

def self.include_kana?(str)
  return (not (str =~ /[#{0x3040.chr}-#{0x30FF.chr}]/).nil?)
end
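The check above tests for any character in the range U+3040 to U+30FF, which spans the Unicode Hiragana and Katakana blocks. A minimal sketch of the same test, assuming Ruby 2.4 or later and UTF-8 input, might look like this. The name include_kana_sketch? is hypothetical.

```ruby
# Hypothetical stand-alone equivalent of KLookup.include_kana? (not part of
# the library). Assumes Ruby 2.4+ for String#match? and a UTF-8 string.
def include_kana_sketch?(str)
  # \u escapes avoid building the character class through Integer#chr.
  str.match?(/[\u3040-\u30FF]/)
end

puts include_kana_sketch?("\u304B\u306A")  # hiragana "kana"
puts include_kana_sketch?("\u6F22\u5B57")  # kanji only
```

Kanji fall outside the tested range, so a kanji-only string is reported as containing no kana.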
.norm_kana(str) ⇒ Object
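Returns a regular expression built from str that matches it regardless of whether each kana is written as hiragana or katakana. Each kana character in str becomes a two-character class containing its hiragana and katakana forms; all other characters are copied into the pattern unchanged.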
# File 'lib/klookup.rb', line 39

def self.norm_kana(str)
  # Relevant codepoints:
  #   ひらがな == カタカナ
  #   3041 - 3096 == 30A1 - 30F6 - ァ-ヶ
  #   309D - 309E == 30FD - 30FE - ヽ-ヾ
  hiragana = (0x3041..0x3096).to_a + (0x309D..0x309E).to_a
  katakana = (0x30A1..0x30F6).to_a + (0x30FD..0x30FE).to_a
  hkhash = {}
  khhash = {}
  i=0
  hiragana.each {|c|
    hkhash[c] = katakana[i]
    khhash[katakana[i]] = c
    i+=1
  }
  re=''
  str.each_char {|c|
    if hiragana.include?(c.chars.first)
      re << "[#{c}#{cp_to_str(hkhash[c.chars.first])}]"
    elsif katakana.include?(c.chars.first)
      re << "[#{c}#{cp_to_str(khhash[c.chars.first])}]"
    else
      re << c
    end
  }
  Regexp.new("#{re}")
end
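The listing above pairs each hiragana code point with its katakana counterpart and emits a character class for every kana in the input. The following is a sketch of the same idea, assuming Ruby 2.1 or later without RUnicode. The name norm_kana_sketch is hypothetical, and the sketch deviates from the original in one respect: non-kana characters are passed through Regexp.escape, whereas the original appends them unescaped.

```ruby
# Hypothetical stand-alone variant of KLookup.norm_kana (not part of the
# library). Assumes Ruby 2.1+ (Array#to_h) and UTF-8 input.
def norm_kana_sketch(str)
  # Same code point ranges as the original listing; the tables are rebuilt
  # on every call to keep the sketch self-contained.
  hiragana = [*0x3041..0x3096, *0x309D..0x309E]
  katakana = [*0x30A1..0x30F6, *0x30FD..0x30FE]
  h_to_k = hiragana.zip(katakana).to_h
  k_to_h = h_to_k.invert

  pattern = str.each_char.map do |c|
    other = h_to_k[c.ord] || k_to_h[c.ord]
    if other
      # Kana: match either script form.
      "[#{c}#{other.chr(Encoding::UTF_8)}]"
    else
      # Deviation from the original: escape regexp metacharacters.
      Regexp.escape(c)
    end
  end.join

  Regexp.new(pattern)
end

re = norm_kana_sketch("\u304B\u306A")   # hiragana "kana"
puts re.match?("\u30AB\u30CA")          # katakana "kana"
```

Looking characters up by String#ord keeps the comparison between integers, which matches how the code point tables are built.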