Class: Text::Hyphen::Language

Inherits:
Object
Defined in:
lib/text/hyphen/language.rb

Overview

Language scaffolding support for Text::Hyphen. Language hyphenation patterns are defined as instances of this class, and only this class. This deliberately breaks with Ruby's duck-typing convention: requiring an instance of this class signals that the patterns have been converted from TeX encodings to other encodings (e.g., latin1 or UTF-8) that are better suited to general text manipulation.

Constant Summary

WORD_START_RE = %r{^\.}
WORD_END_RE = %r{\.$}
DIGIT_RE = %r{\d}
NONDIGIT_RE = %r{\D}
DASH_RE = %r{-}
EXCEPTION_DASH0_RE = %r{[^-](?=[^-])}
EXCEPTION_DASH1_RE = %r{[^-]-}
EXCEPTION_NONUM_RE = %r{[^01]}
ZERO_INSERT_RE = %r{(\D)(?=\D)}
ZERO_START_RE = %r{^(?=\D)}
DEFAULT_ENCODING =

if RUBY_VERSION < "1.9.1"
  "latin1"
else
  "utf-8"
end

Constructor Details

#initialize(language = nil) {|_self| ... } ⇒ Language

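Creates a new language implementation. If a language object is provided, the encoding, patterns, exceptions, left, right, and isocode values are copied from it; otherwise the defaults are used. A RuntimeError is raised if language is neither nil nor an instance of Text::Hyphen::Language. If a block is given, the new object is yielded to it after these values have been set.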

Yields:

  • (_self)

Yield Parameters:

  • _self (Text::Hyphen::Language): the newly created language object



# File 'lib/text/hyphen/language.rb', line 143

def initialize(language = nil)
  if language.nil?
    self.encoding DEFAULT_ENCODING
    self.patterns ""
    self.exceptions ""
    self.left = 2
    self.right = 2
    self.isocode = nil
  elsif language.kind_of? Text::Hyphen::Language
    self.encoding language.encoding
    self.patterns language.instance_variable_get(:@pattern_text)
    self.exceptions language.instance_variable_get(:@exception_text)
    self.left = language.left
    self.right = language.right
    self.isocode = language.isocode
  else
    raise "Languages can only be created from descendants of Text::Hyphen::Language."
  end

  yield self if block_given?
end
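
The constructor's three branches (defaults, copy, reject) can be tried without the library installed. The sketch below uses a stand-in class, DemoLanguage, which is hypothetical and carries only the left, right, and isocode attributes; it mirrors the control flow shown above and is not the library's implementation.

```ruby
# Stand-in for illustration only; this is NOT Text::Hyphen::Language.
class DemoLanguage
  attr_accessor :left, :right, :isocode

  def initialize(language = nil)
    if language.nil?
      # No source object: fall back to the defaults.
      self.left = 2
      self.right = 2
      self.isocode = nil
    elsif language.kind_of?(DemoLanguage)
      # Copy the settings of an existing instance.
      self.left = language.left
      self.right = language.right
      self.isocode = language.isocode
    else
      # `raise` with only a String raises RuntimeError.
      raise "Languages can only be created from DemoLanguage instances."
    end

    # Let the caller adjust the new object in a block.
    yield self if block_given?
  end
end

# Build a base object, then derive a variant that changes one setting.
base    = DemoLanguage.new { |lang| lang.isocode = "en_us" }
variant = DemoLanguage.new(base) { |lang| lang.left = 3 }
```

Because the block runs last, values set inside it override both the defaults and anything copied from the source object.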

Instance Attribute Details

#isocode ⇒ Object

The ISO language code for this language. Generally only used when there are multiple hyphenation tables available for a language.



# File 'lib/text/hyphen/language.rb', line 137

def isocode
  @isocode
end

#left ⇒ Object

The minimum number of letters that must appear to the left of a hyphen for this language. The default is 2.



# File 'lib/text/hyphen/language.rb', line 130

def left
  @left
end

#right ⇒ Object

The minimum number of letters that must appear to the right of a hyphen for this language. The default is 2.



# File 'lib/text/hyphen/language.rb', line 133

def right
  @right
end
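
As a rough illustration of what the left and right minimums mean, the sketch below filters a list of candidate break positions. The helper allowed_breaks and the candidate positions are hypothetical; how the hyphenator itself applies these limits is not shown in this class, so treat this as an assumption about the intended effect.

```ruby
# Hypothetical helper, not part of the library. A break at position `pos`
# splits the word into word[0, pos] and word[pos..-1].
def allowed_breaks(word, candidates, left = 2, right = 2)
  candidates.select do |pos|
    # Keep the break only if enough letters remain on both sides.
    pos >= left && (word.length - pos) >= right
  end
end

# Made-up candidate positions for an 11-letter word.
allowed_breaks("hyphenation", [1, 3, 6, 7, 10])  # => [3, 6, 7]
```

Positions 1 and 10 are dropped because they would leave a single letter on one side of the hyphen.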

Class Method Details

.aliases_for(mapping) ⇒ Object

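Creates constant aliases for languages that are already defined. mapping maps the name of each language constant to one alias name or an array of alias names. If the language constant is not defined, a warning is emitted and the entry is skipped. Alias names that are already defined are left untouched.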



# File 'lib/text/hyphen/language.rb', line 166

def self.aliases_for(mapping)
  mapping.each do |language, alias_names|
    unless const_defined? language
      warn "Aliases not created for #{language}; it has not been defined."
      next
    end
    language = const_get(language)

    [ alias_names ].flatten.each do |alias_name|
      next if const_defined? alias_name
      const_set(alias_name, language)
    end
  end
end
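
The aliasing logic relies only on const_defined?, const_get, and const_set, so it can be exercised in isolation. The sketch below repeats it inside a hypothetical module, DemoLanguages, with a placeholder object standing in for a real language; the constant names are invented for illustration.

```ruby
# Hypothetical container; the real method is defined on Text::Hyphen::Language.
module DemoLanguages
  EN_US = Object.new  # placeholder for a language object

  def self.aliases_for(mapping)
    mapping.each do |language, alias_names|
      unless const_defined?(language)
        warn "Aliases not created for #{language}; it has not been defined."
        next
      end
      target = const_get(language)

      # Accept either a single alias name or an array of them.
      Array(alias_names).each do |alias_name|
        next if const_defined?(alias_name)  # never overwrite an existing constant
        const_set(alias_name, target)
      end
    end
  end
end

# :XX_YY is not defined, so it only produces a warning on standard error.
DemoLanguages.aliases_for(:EN_US => [:ENGLISH_US, :AMERICAN], :XX_YY => :NOWHERE)
```

Each alias refers to the same object as the original constant, so no pattern data is copied.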

Instance Method Details

#both ⇒ Object

Patterns anchored at both the beginning and the end of a word, that is, patterns that must match the whole word.



# File 'lib/text/hyphen/language.rb', line 43

def both
  @patterns[:both]
end

#encoding(enc = nil) ⇒ Object

The encoding of the hyphenation definitions. Text to be hyphenated must be in the same encoding. Called with no argument, returns the current encoding; called with an argument, sets it.



# File 'lib/text/hyphen/language.rb', line 37

def encoding(enc = nil)
  return @encoding if enc.nil?
  @encoding = enc
end

#exceptions(exc = nil) ⇒ Object

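Exceptions to the hyphenation patterns. Called with no argument, returns the exception table, a hash that maps each word (with hyphens removed) to an array of 0/1 flags in which 1 marks an allowed break. Called with a string of whitespace-separated words whose break points are marked with hyphens (e.g., ta-ble), rebuilds the table and returns true.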



# File 'lib/text/hyphen/language.rb', line 112

def exceptions(exc = nil)
  return @exceptions if exc.nil?

  @exception_text = exc.dup
  @exceptions = {}

  @exception_text.split.each do |word|
    tag   = word.gsub(DASH_RE,'')
    value = "0" + word.gsub(EXCEPTION_DASH0_RE, '0').gsub(EXCEPTION_DASH1_RE, '1')
    value.gsub!(EXCEPTION_NONUM_RE, '0')
    @exceptions[tag] = value.scan(self.scan_re).map { |c| c.to_i }
  end

  true
end
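
To see what the exception table looks like, the conversion can be run on its own. The sketch below copies the regular expressions from the constants above into a hypothetical standalone method, parse_exceptions, and uses a plain %r{.} where the method calls scan_re.

```ruby
# Standalone sketch; names prefixed EXC_ are local copies, not library constants.
EXC_DASH_RE  = %r{-}
EXC_DASH0_RE = %r{[^-](?=[^-])}
EXC_DASH1_RE = %r{[^-]-}
EXC_NONUM_RE = %r{[^01]}

def parse_exceptions(text)
  text.split.each_with_object({}) do |word, table|
    # The lookup key is the word without its hyphens.
    tag = word.gsub(EXC_DASH_RE, '')
    # A letter followed by another letter becomes 0 (no break after it);
    # a letter followed by a hyphen becomes 1 (break allowed after it).
    value = "0" + word.gsub(EXC_DASH0_RE, '0').gsub(EXC_DASH1_RE, '1')
    # Whatever is left (the final letter) becomes 0.
    value = value.gsub(EXC_NONUM_RE, '0')
    table[tag] = value.scan(%r{.}).map { |c| c.to_i }
  end
end

parse_exceptions("ta-ble")["table"]  # => [0, 0, 1, 0, 0, 0]
```

The array has one more entry than the word has letters: entry i describes the position immediately before letter i, and the last entry describes the end of the word.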

#hyphen ⇒ Object

Patterns that hyphenate mid-word.



# File 'lib/text/hyphen/language.rb', line 58

def hyphen
  @patterns[:hyphen]
end

#patterns(pats = nil) ⇒ Object

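The hyphenation patterns for this language. Called with no argument, returns the pattern table, a hash with the keys :both, :start, :stop, and :hyphen. Called with a string of whitespace-separated TeX-style patterns, parses them into that table and returns true. Text from % to the end of a line is treated as a comment, a leading or trailing period anchors a pattern to the start or end of a word, and the patterns are assumed to be lowercase.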



# File 'lib/text/hyphen/language.rb', line 63

def patterns(pats = nil)
  return @patterns if pats.nil?

  @pattern_text = pats.dup

  @patterns = {
    :both   => {}, 
    :start  => {},
    :stop   => {},
    :hyphen => {}
  }

  plist = @pattern_text.split($/).map { |ln| ln.gsub(%r{%.*$}, '') }
  plist.each do |line|
    line.split.each do |word|
      next if word.empty?

      start = stop = false

      start = true if word.sub!(WORD_START_RE, '')
      stop  = true if word.sub!(WORD_END_RE, '')

      # Insert zeroes and start with some digit
      word.gsub!(ZERO_INSERT_RE) { "#{$1}0" }
      word.gsub!(ZERO_START_RE, "0")

      # This assumes that the pattern lists are already in lowercase
      # form only.
      tag   = word.gsub(DIGIT_RE, '')
      value = word.gsub(NONDIGIT_RE, '')

      if start and stop
        set = :both
      elsif start
        set = :start
      elsif stop
        set = :stop
      else
        set = :hyphen
      end

      @patterns[set][tag] = value
    end
  end

  true
end
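
The parsing steps can be followed on a few sample patterns. The sketch below is a hypothetical standalone method, parse_patterns, that applies the same regular expressions as the method above and returns the resulting table; the sample patterns are made up for illustration.

```ruby
# Standalone sketch of the pattern parser; not the library's own method.
def parse_patterns(text)
  sets = { :both => {}, :start => {}, :stop => {}, :hyphen => {} }

  text.each_line do |line|
    # Drop TeX comments, then handle each whitespace-separated pattern.
    line.gsub(%r{%.*$}, '').split.each do |word|
      # sub! returns nil when nothing was replaced.
      start = !word.sub!(%r{^\.}, '').nil?
      stop  = !word.sub!(%r{\.$}, '').nil?

      # Make the implicit zeros explicit: between adjacent letters,
      # and at the front when the pattern starts with a letter.
      word.gsub!(%r{(\D)(?=\D)}) { "#{$1}0" }
      word.gsub!(%r{^(?=\D)}, "0")

      tag   = word.gsub(%r{\d}, '')  # letters only
      value = word.gsub(%r{\D}, '')  # digits only

      set = if start && stop then :both
            elsif start      then :start
            elsif stop       then :stop
            else                  :hyphen
            end
      sets[set][tag] = value
    end
  end

  sets
end

demo = parse_patterns(".ach4 4m1p % trailing comment\n1ty.")
demo[:start]["ach"]  # => "0004"
demo[:hyphen]["mp"]  # => "41"
demo[:stop]["ty"]    # => "10"
```

Note that a zero is added at the front of a pattern but never at the end, so the digit string can be one character shorter than the number of letters plus one.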

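#scan_re ⇒ Object

The regular expression used to scan text one character at a time. On Ruby versions older than 1.9.1 with a UTF-8 encoding, the expression carries the u flag; in all other cases it is a plain %r{.}.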



# File 'lib/text/hyphen/language.rb', line 28

def scan_re #:nodoc:
  if RUBY_VERSION < '1.9.1'
    return %r{.}u if @encoding =~ /utf-?8/i
  end
  return %r{.}
end

#start ⇒ Object

Patterns that match the beginning of a word.



# File 'lib/text/hyphen/language.rb', line 48

def start
  @patterns[:start]
end

#stop ⇒ Object

Patterns that match the end of a word.



# File 'lib/text/hyphen/language.rb', line 53

def stop
  @patterns[:stop]
end