Zhongwen Tools:

Tools and methods for dealing with Chinese.


Installation

Install as a gem

$ [sudo] gem install zhongwen_tools

Usage

Add the ZhongwenTools component you need to your classes as a module.

require 'zhongwen_tools/romanization'

class String
  include ZhongwenTools::Romanization
end

str = "ni3 hao3"  #pinyin with numbers
str.to_pinyin     #=> "nǐ hǎo"
str.to_zhuyin_fuhao  #=> "ㄋㄧ3 ㄏㄠ3"

mzd = "Mao Tse-tung"
mzd.to_pinyin   #=> "Mao Zedong"

Or require only the components you want and call their methods directly:

require 'zhongwen_tools/numbers'
ZhongwenTools::Numbers.to_pyn '一百二十' #=> 'yi1-bai3-er4-shi2'

ZhongwenTools includes the following modules:

  1. ZhongwenTools::String - methods for dealing with strings with Chinese and pinyin.
  2. ZhongwenTools::Numbers - methods for identifying Chinese numbers and for converting to and from Chinese.
  3. ZhongwenTools::Integer - methods for converting integers into Chinese or pinyin.
  4. ZhongwenTools::Romanization - methods for converting between Chinese romanization systems.
  5. ZhongwenTools::Conversion - methods for converting between Chinese scripts.

Using ZhongwenTools::String

require 'zhongwen_tools/string'
ZhongwenTools::String.ascii? 'hello'               #=> true #non-multibyte strings
ZhongwenTools::String.multibyte? '中文'            #=> true #multibyte strings
ZhongwenTools::String.halfwidth? 'hello'           #=> true
ZhongwenTools::String.fullwidth? 'ｈｅｌｌｏ'       #=> true
ZhongwenTools::String.to_halfwidth 'ｈｅｌｌｏ'     #=> 'hello'

ZhongwenTools::String.uri_encode '我太懒'             #=> '%E6%88%91%E5%A4%AA%E6%87%92'
ZhongwenTools::String.to_codepoint '中文'            #=> '\u4e2d\u6587'
ZhongwenTools::String.from_codepoint '\u4e2d\u6587'   #=> '中文' #converts a string of Unicode codepoint escapes back to characters.

ZhongwenTools::String.has_zh? '1月'     #=> true
ZhongwenTools::String.zh? '1月'         #=> false #(The string can't be mixed.)

ZhongwenTools::String.has_zh_punctuation? '你在哪里？'    #=> true
ZhongwenTools::String.strip_zh_punctuation '你在哪里？'  #=> '你在哪里'

require 'zhongwen_tools/conversion'
ZhongwenTools::String.zhs? '中国'    #=> true
ZhongwenTools::String.zht? '中国'    #=> false
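
The difference between has_zh? and zh? can be illustrated with a minimal sketch. This is not the gem's implementation; the module name ZhDetectSketch is hypothetical, and the sketch assumes that "Chinese" means characters in the Unicode Han script, which Ruby (1.9 and later) exposes as \p{Han}.

```ruby
# Hypothetical helpers for illustration only, not part of zhongwen_tools.
module ZhDetectSketch
  # Matches a single character in the Unicode Han script.
  ANY_HAN = /\p{Han}/
  # Matches only when the whole string consists of Han characters.
  ALL_HAN = /\A\p{Han}+\z/

  # True when at least one Han character is present.
  def self.has_zh?(str)
    !!(str =~ ANY_HAN)
  end

  # True only when every character is a Han character (no mixing).
  def self.zh?(str)
    !!(str =~ ALL_HAN)
  end
end

ZhDetectSketch.has_zh?('1月')  # true, one Han character is enough
ZhDetectSketch.zh?('1月')      # false, the digit makes the string mixed
ZhDetectSketch.zh?('中文')     # true
```

The real methods may treat punctuation and other scripts differently; the sketch only shows why a mixed string passes one check and fails the other.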

The following capitalization methods work for pinyin.

require 'zhongwen_tools/string'
ZhongwenTools::String.downcase 'Àomén'  #=> 'àomén' #lowercases pinyin, including accented vowels
ZhongwenTools::String.upcase 'àomén'    #=> 'ÀOMÉN'
ZhongwenTools::String.capitalize 'àomén'  #=> 'Àomén'

Ruby 1.8 safe methods

Zhongwen Tools is tested on every Ruby version since 1.8.7 and lets you deal with multibyte strings in a simple way.

require 'zhongwen_tools/string'
ZhongwenTools::String.chars '中文' #=> ['中','文']
ZhongwenTools::String.size '中文'  #=> 2
ZhongwenTools::String.reverse '中文' #=> '文中'
ZhongwenTools::String.to_utf8 '\x{D6D0}\x{CEC4}' #=> '中文' #GB2312-encoded bytes converted to UTF-8
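
For readers wondering how character-level operations can work without Ruby 1.9's encoding-aware String, here is a minimal sketch based on unpack('U*'), which decodes a UTF-8 string into an array of codepoints. It is an assumption about one possible approach, not the gem's actual code, and the module name MultibyteSketch is hypothetical.

```ruby
# Hypothetical helpers for illustration only, not part of zhongwen_tools.
module MultibyteSketch
  # Split a UTF-8 string into one-character strings via its codepoints.
  def self.chars(str)
    str.unpack('U*').map { |codepoint| [codepoint].pack('U') }
  end

  # Count characters rather than bytes.
  def self.size(str)
    str.unpack('U*').size
  end

  # Reverse by codepoint so multibyte characters stay intact.
  def self.reverse(str)
    str.unpack('U*').reverse.pack('U*')
  end
end

MultibyteSketch.chars('中文')    # ['中', '文']
MultibyteSketch.size('中文')     # 2
MultibyteSketch.reverse('中文')  # '文中'
```

Working on codepoints avoids the byte-level String#size and String#reverse of Ruby 1.8, which would count six bytes and scramble the characters.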

Numbers

Functions for converting to and from Chinese numbers.

ZhongwenTools::Numbers.number_to_zht :num, 12000        #=> '一萬二千'
ZhongwenTools::Numbers.number_to_zhs :num, 42           #=> '四十二'
ZhongwenTools::Numbers.number_to_pyn :num, 42        #=> 'si4-shi2-er4'
ZhongwenTools::Numbers.zh_number_to_number '四十二'  #=> 42
ZhongwenTools::Numbers.number? '四十二'        #=> true
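
As a rough illustration of what converting a Chinese number to an integer involves, the sketch below accumulates digits, small units (十, 百, 千) and large units (万/萬, 亿/億). It is a simplified sketch, not the gem's algorithm: the module name ZhNumberSketch is hypothetical, and it does not handle stacked large units such as 万亿, decimals, or financial numerals.

```ruby
# Hypothetical helper for illustration only, not part of zhongwen_tools.
module ZhNumberSketch
  DIGITS = {
    '零' => 0, '〇' => 0, '一' => 1, '二' => 2, '两' => 2, '兩' => 2,
    '三' => 3, '四' => 4, '五' => 5, '六' => 6, '七' => 7, '八' => 8, '九' => 9
  }.freeze
  SMALL_UNITS = { '十' => 10, '百' => 100, '千' => 1000 }.freeze
  BIG_UNITS = {
    '万' => 10_000, '萬' => 10_000,
    '亿' => 100_000_000, '億' => 100_000_000
  }.freeze

  def self.to_i(str)
    total = 0    # sum of sections already closed by a large unit
    section = 0  # value of the current section (below 10,000)
    digit = 0    # digit waiting for its unit
    str.each_char do |ch|
      if DIGITS.key?(ch)
        digit = DIGITS[ch]
      elsif SMALL_UNITS.key?(ch)
        digit = 1 if digit.zero? # a bare 十 means 10
        section += digit * SMALL_UNITS[ch]
        digit = 0
      elsif BIG_UNITS.key?(ch)
        total += (section + digit) * BIG_UNITS[ch]
        section = 0
        digit = 0
      else
        raise ArgumentError, "not a Chinese numeral: #{ch}"
      end
    end
    total + section + digit
  end
end

ZhNumberSketch.to_i('四十二')    # 42
ZhNumberSketch.to_i('一萬二千')  # 12000
```

Accepting both simplified and traditional unit characters in the same tables keeps the sketch independent of the script of the input.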

Integers

Monkey-patch your integers for Chinese.

class Integer
  include ZhongwenTools::Integer
end

12.to_pinyin #=> 'shi2-er4'
12.to_zht    #=> '十二'
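
The reverse direction, integer to Chinese numerals, can be sketched as follows. This is a simplified illustration under stated assumptions, not the gem's implementation: the module name IntegerToZhSketch is hypothetical, only 0 through 9999 is covered, and it follows the common convention of writing 10 to 19 with a bare 十 and marking skipped places with a single 零.

```ruby
# Hypothetical helper for illustration only, not part of zhongwen_tools.
module IntegerToZhSketch
  DIGITS = %w[零 一 二 三 四 五 六 七 八 九].freeze
  UNITS = ['', '十', '百', '千'].freeze

  def self.to_zh(number)
    unless number.is_a?(Integer) && (0..9999).cover?(number)
      raise ArgumentError, 'only integers from 0 to 9999 are supported'
    end
    return DIGITS[0] if number.zero?

    digits = number.to_s.chars.map(&:to_i)
    parts = []
    pending_zero = false
    digits.each_with_index do |digit, index|
      unit = UNITS[digits.size - 1 - index]
      if digit.zero?
        # Remember the gap; it is only written if a later digit follows.
        pending_zero = true unless parts.empty?
      else
        parts << DIGITS[0] if pending_zero
        pending_zero = false
        parts << (DIGITS[digit] + unit)
      end
    end
    # 10..19 are conventionally written 十, 十一, ... rather than 一十.
    parts.join.sub(/\A一十/, '十')
  end
end

IntegerToZhSketch.to_zh(12)   # '十二'
IntegerToZhSketch.to_zh(105)  # '一百零五'
```

Numerals in this range are identical in simplified and traditional script; the two only diverge at larger units such as 万 and 萬.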

Romanization

ZhongwenTools::Romanization has tools for converting between Chinese language romanization systems and scripts. It does not convert Chinese characters to pinyin (see ZhongwenTools::Conversion). Romanization methods must be required explicitly.

gem 'zhongwen_tools'
require 'zhongwen_tools/romanization'

class String
  include ZhongwenTools::Romanization
end


str = "ni3 hao3"

str.to_pinyin     #=> "nǐ hǎo"
str.to_py         #=> "nǐ hǎo"
str.to_pyn       #=> "ni3 hao3"

str.to_wg       #=> "ni3 hao3"    #Wade-Giles
str.to_bpmf     #=> "ㄋㄧ3 ㄏㄠ3" #Zhuyin Fuhao, a.k.a. Bopomofo
str.to_yale     #=> "ni3 hau3"
str.to_typy     #Tongyong Pinyin

str.pyn? #=> true
str.wg?  #=> true #(There can be overlap between Wade-Giles and Pinyin)
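
To show what a conversion such as to_pinyin involves, here is a minimal sketch of turning numbered pinyin into tone-marked pinyin. It is not the gem's implementation: the module name PinyinToneSketch is hypothetical, and the sketch assumes lowercase, space-separated syllables. It applies the usual placement rule: the mark goes on a or e if present, on the o of ou, and otherwise on the last vowel.

```ruby
# Hypothetical helper for illustration only, not part of zhongwen_tools.
module PinyinToneSketch
  TONE_MARKS = {
    'a' => %w[ā á ǎ à], 'e' => %w[ē é ě è], 'i' => %w[ī í ǐ ì],
    'o' => %w[ō ó ǒ ò], 'u' => %w[ū ú ǔ ù], 'ü' => %w[ǖ ǘ ǚ ǜ]
  }.freeze

  # Index of the vowel that carries the tone mark, or nil if none.
  def self.vowel_index(letters)
    return letters.index('a') if letters.include?('a')
    return letters.index('e') if letters.include?('e')
    return letters.index('ou') if letters.include?('ou')
    letters.rindex(/[aeiouü]/)
  end

  # Convert one syllable such as "hao3"; anything else is returned as-is.
  def self.mark_syllable(syllable)
    match = syllable.match(/\A([a-zü]+)([1-5])\z/)
    return syllable unless match

    letters = match[1]
    tone = match[2].to_i
    return letters if tone == 5 # neutral tone carries no mark

    index = vowel_index(letters)
    return letters if index.nil?

    marked = TONE_MARKS[letters[index]][tone - 1]
    letters[0...index] + marked + letters[(index + 1)..-1]
  end

  def self.to_pinyin(str)
    str.split(' ').map { |syllable| mark_syllable(syllable) }.join(' ')
  end
end

PinyinToneSketch.to_pinyin('ni3 hao3')  # "nǐ hǎo"
```

Capitalized syllables, syllables written without spaces, and ü spelled as v or u: are left out to keep the sketch short.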

Conversion

Functions for converting between scripts (e.g. traditional Chinese to simplified Chinese) and [TODO] between Chinese and romanization systems (e.g. Chinese to pinyin). Conversion methods must be required explicitly.

gem 'zhongwen_tools'
require 'zhongwen_tools/conversion'

ZhongwenTools::Conversion.to_zhs '華語' #=> '华语'
ZhongwenTools::Conversion.to_zht '华语' #=> '華語'
ZhongwenTools::Conversion.to_zhtw '方便面' #=> '泡麵'
ZhongwenTools::Conversion.to_zhhk '方便面' #=> '即食麵'
ZhongwenTools::Conversion.to_zhcn '即食麵' #=> '方便面'

TODO

  1. A character -> pinyin converter