Module: RMMSeg::Dictionary

Defined in:
lib/rmmseg/dictionary.rb

Class Attribute Details

.dictionaries ⇒ Object

An array of dictionaries used by RMMSeg. Each entry is of the following form:

[type, path]

where type can be either :chars or :words, and path is the path to the dictionary file.

A :chars dictionary is a collection of lines of the following form:

freq char

where freq is the frequency, a number less than 65535, and char is the character. The two fields are separated by exactly one space.

A :words dictionary has a similar format:

length word

except that the first number is not a frequency but the number of characters (not the number of bytes) in the word.

There’s a script (convert.rb) in the tools directory that can be used to convert and normalize dictionaries.
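
The two line formats above can be illustrated with a small parsing sketch. The helpers parse_chars_line and parse_words_line are hypothetical names introduced here for illustration; they are not part of RMMSeg, and the library's own loaders (load_chars and load_words) may work differently. The sample frequency is made up.

```ruby
# Hypothetical helpers, not part of RMMSeg: each parses one dictionary line.

# Parses a :chars line of the form "freq char" into [freq, char].
def parse_chars_line(line)
  freq, char = line.chomp.split(' ', 2)
  [Integer(freq, 10), char]
end

# Parses a :words line of the form "length word" into [length, word].
# The declared length is checked against the number of characters
# (String#length), not the number of bytes (String#bytesize).
def parse_words_line(line)
  length, word = line.chomp.split(' ', 2)
  length = Integer(length, 10)
  unless word.length == length
    raise ArgumentError, "length mismatch in #{line.inspect}"
  end
  [length, word]
end

# A two-character word occupies six bytes in UTF-8, so the length
# field must be 2, not 6.
word = "\u4E2D\u6587"
p word.length    # character count
p word.bytesize  # byte count
p parse_words_line("2 #{word}\n")
p parse_chars_line("1500 \u7684\n") # made-up frequency
```

A line such as "6 中文" would be rejected by this sketch, because 6 is the byte count of the word rather than its character count.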



# File 'lib/rmmseg/dictionary.rb', line 37

def dictionaries
  @dictionaries
end

Class Method Details

.add_dictionary(path, type) ⇒ Object

Adds a user-defined dictionary. type can be :chars or :words; see the documentation of dictionaries for the file formats.



# File 'lib/rmmseg/dictionary.rb', line 41

def add_dictionary(path, type)
  @dictionaries << [type, path]
end
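
Note that add_dictionary takes its arguments as (path, type) but stores each entry as [type, path]. The sketch below shows this with a stand-in module, DictionarySketch, which is a hypothetical name that only mirrors the attribute and method documented here; it is not the real RMMSeg::Dictionary. The initial contents of the real list are not shown on this page, so the stand-in starts empty, and the paths are invented for illustration.

```ruby
# Stand-in mirroring the documented attribute and method.
# This is NOT the real RMMSeg::Dictionary module.
module DictionarySketch
  @dictionaries = [] # assumption: start with an empty list

  class << self
    # Reader for the list of [type, path] entries.
    attr_reader :dictionaries

    # Arguments come in as (path, type) but are stored as [type, path].
    def add_dictionary(path, type)
      @dictionaries << [type, path]
    end
  end
end

# Hypothetical paths, for illustration only.
DictionarySketch.add_dictionary('/tmp/my_chars.dic', :chars)
DictionarySketch.add_dictionary('/tmp/my_words.dic', :words)

p DictionarySketch.dictionaries
# prints [[:chars, "/tmp/my_chars.dic"], [:words, "/tmp/my_words.dic"]]
```

With the real library, the equivalent call would presumably be made on RMMSeg::Dictionary before the dictionaries are loaded, since load_dictionaries only reads entries already in the list.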

.load_dictionaries ⇒ Object



# File 'lib/rmmseg/dictionary.rb', line 45

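def load_dictionaries
  # Load each registered [type, path] entry with the loader for its type.
  @dictionaries.each do |type, path|
    if type == :chars
      load_chars(path)
    elsif type == :words
      load_words(path)
    end
    # Entries of any other type are skipped without an error.
  end
end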