Class: String
Overview
This is an extension and modification of the standard String class. The parser does a lot of UTF-8 character processing. Ruby 1.8 does not have adequate UTF-8 support, and Ruby 1.9 handles UTF-8 characters only as Strings, which is very inefficient compared to representing them as Integer objects. Some of these hacks can be removed once only Ruby 1.9 and later are supported.
Instance Method Summary
- #<<(obj) ⇒ Object
  Replacement for the existing << operator that also works for characters above Integer 255 (UTF-8 characters).
- #forceUTF8Encoding ⇒ Object
  Ensure the String is really UTF-8 encoded and newlines are only \n.
- #ljust(len, pad = ' ') ⇒ Object
- #old_double_left_angle ⇒ Object
- #old_reverse ⇒ Object
- #reverse ⇒ Object
  UTF-8 aware version of reverse that replaces the built-in one.
- #to_base64 ⇒ Object
- #to_quoted_printable ⇒ Object
- #unix2dos ⇒ Object
Instance Method Details
#<<(obj) ⇒ Object
Replacement for the existing << operator that also works for characters above Integer 255 (UTF-8 characters).
# File 'lib/taskjuggler/UTF8String.rb', line 63

def <<(obj)
  if obj.is_a?(String) || (obj < 256)
    # In this case we can use the built-in concat.
    concat(obj)
  else
    # UTF-8 characters have a maximum length of 4 bytes and no byte is 0.
    mask = 0xFF000000
    pos = 3
    while pos >= 0
      # Use the built-in concat operator for each byte.
      concat((obj & mask) >> (8 * pos)) if (obj & mask) != 0
      # Move mask and position to the next byte.
      mask = mask >> 8
      pos -= 1
    end
  end
end
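The byte-splitting logic above can be tried outside the String class. The following is a minimal sketch, assuming a character is packed into an Integer with its first UTF-8 byte in the most significant position, which is what the mask-and-shift loop implies. The helper names utf8_char_to_packed_int, packed_int_to_bytes and append_packed_char are hypothetical and not part of TaskJuggler.

```ruby
# Hypothetical helper: pack the UTF-8 bytes of one character into a single
# Integer, first byte in the most significant position ("é" = C3 A9 -> 0xC3A9).
def utf8_char_to_packed_int(char)
  char.encode('UTF-8').bytes.inject(0) { |acc, byte| (acc << 8) | byte }
end

# Hypothetical helper: split a packed Integer back into its bytes, highest
# byte first. Zero bytes are skipped, as in the loop shown above.
def packed_int_to_bytes(value)
  bytes = []
  3.downto(0) do |pos|
    byte = (value >> (8 * pos)) & 0xFF
    bytes << byte unless byte.zero?
  end
  bytes
end

# Hypothetical helper: append a packed character to a copy of buffer.
# The copy is tagged as binary so that each Integer is appended as one raw
# byte, then it is tagged as UTF-8 again.
def append_packed_char(buffer, value)
  out = buffer.dup.force_encoding(Encoding::BINARY)
  packed_int_to_bytes(value).each { |byte| out << byte }
  out.force_encoding(Encoding::UTF_8)
end

packed = utf8_char_to_packed_int("é")   # => 0xC3A9
result = append_packed_char("caf", packed)
puts result                             # prints "café"
```

The sketch avoids reopening String, so it can be run without affecting the built-in << operator.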
#forceUTF8Encoding ⇒ Object
Ensure the String is really UTF-8 encoded and newlines are only \n. If that is not possible, an Encoding::UndefinedConversionError is raised.
# File 'lib/taskjuggler/UTF8String.rb', line 123

def forceUTF8Encoding
  if RUBY_VERSION < '1.9.0'
    # Ruby 1.8 really only supports 7 bit ASCII well. Only do the line-end
    # clean-up.
    gsub(/\r\n/, "\n")
  else
    begin
      # Ensure that the text has LF line ends and is UTF-8 encoded.
      encode('UTF-8', :universal_newline => true)
    rescue
      # The encoding of the String is broken. Find the first broken line and
      # report it.
      lineCtr = 1
      each_line do |line|
        begin
          line.encode('UTF-8')
        rescue
          line = line.encode('UTF-8', :invalid => :replace,
                             :undef => :replace, :replace => '<?>')
          raise Encoding::UndefinedConversionError,
                "UTF-8 encoding error in line #{lineCtr}: #{line}"
        end
        lineCtr += 1
      end
    end
  end
end
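The two steps of this method, converting with LF line ends and locating the first line that fails to convert, can be illustrated on their own. This is a minimal sketch for Ruby 1.9 or later, assuming input that is tagged as ISO-8859-1 or as binary. The helper first_unconvertible_line is a hypothetical name and is not part of TaskJuggler.

```ruby
# Hypothetical helper: return the 1-based number of the first line that
# cannot be converted to UTF-8, or nil if every line converts.
def first_unconvertible_line(text)
  text.each_line.with_index(1) do |line, number|
    begin
      line.encode('UTF-8')
    rescue EncodingError
      return number
    end
  end
  nil
end

# Latin-1 input with CRLF line ends: transcoded to UTF-8, CRLF becomes LF.
latin1 = "caf\xE9\r\nbar\r\n".dup.force_encoding('ISO-8859-1')
clean = latin1.encode('UTF-8', :universal_newline => true)

# Binary input with a byte above 127: the conversion of line 2 fails,
# because such a byte has no defined UTF-8 equivalent.
binary = "ok\n\xFF\n".b
bad_line = first_unconvertible_line(binary)
```

Note that Encoding::UndefinedConversionError is a subclass of EncodingError, so rescuing the latter also covers Encoding::InvalidByteSequenceError.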
#ljust(len, pad = ' ') ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 90

def ljust(len, pad = ' ')
  return self + pad * (len - length_utf8) if length_utf8 < len
  self
end
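length_utf8 is defined elsewhere in UTF8String.rb and is not shown here. As a rough illustration of why the padding has to count characters rather than bytes, the sketch below uses each_char.count as a stand-in for it. ljust_by_chars is a hypothetical name, and the sketch assumes Ruby 1.9 or later and a single-character pad.

```ruby
# Hypothetical stand-in for the ljust shown above: pad text on the right
# until it is len characters (not bytes) long.
def ljust_by_chars(text, len, pad = ' ')
  missing = len - text.each_char.count
  missing > 0 ? text + pad * missing : text
end

word = "é"                       # 1 character, 2 bytes in UTF-8
padded = ljust_by_chars(word, 3)
puts padded.inspect              # prints "é  "
puts word.bytesize               # prints 2
```

Padding by byte count would add only one space here, because the two bytes of "é" would be counted as two columns.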
#old_double_left_angle ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 59

alias old_double_left_angle <<
#old_reverse ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 95

alias old_reverse reverse
#reverse ⇒ Object
UTF-8 aware version of reverse that replaces the built-in one.
# File 'lib/taskjuggler/UTF8String.rb', line 98

def reverse
  a = []
  each_utf8_char { |c| a << c }
  a.reverse.join
end
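The reason for a character-aware reverse is easiest to see next to a byte-wise one. The following is a minimal sketch for Ruby 1.9 or later, where each_char stands in for each_utf8_char (which is defined elsewhere in the file and not shown here). Both helper names are hypothetical.

```ruby
# Hypothetical helper: reverse character by character, as the method above
# does with each_utf8_char.
def reverse_utf8_chars(text)
  text.each_char.to_a.reverse.join
end

# Hypothetical helper: reverse byte by byte, which is roughly what a
# reverse without UTF-8 awareness does.
def reverse_bytes(text)
  text.bytes.reverse.pack('C*').force_encoding(Encoding::UTF_8)
end

by_char = reverse_utf8_chars("añb")
by_byte = reverse_bytes("añb")

puts by_char                  # prints "bña"
puts by_byte.valid_encoding?  # prints false
```

Reversing bytes swaps the two bytes of "ñ" (C3 B1), which leaves a sequence that is no longer valid UTF-8.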
#to_base64 ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 113

def to_base64
  Base64.encode64(self)
end
#to_quoted_printable ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 109

def to_quoted_printable
  [self].pack('M').gsub(/\n/, "\r\n")
end
#unix2dos ⇒ Object
# File 'lib/taskjuggler/UTF8String.rb', line 117

def unix2dos
  gsub(/\n/, "\r\n")
end
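The three conversion helpers above carry no description, so a short sketch of what they produce may help. The function names below are hypothetical stand-ins that take the text as an argument instead of reopening String. The Base64 variant uses Array#pack with the 'm' directive, which to my knowledge yields the same result as Base64.encode64 and avoids requiring the base64 library.

```ruby
# Stand-in for to_base64: Base64 with a trailing newline.
def base64_of(text)
  [text].pack('m')
end

# Stand-in for to_quoted_printable: quoted-printable with CRLF line ends.
def quoted_printable_of(text)
  [text].pack('M').gsub(/\n/, "\r\n")
end

# Stand-in for unix2dos, same substitution as above.
def unix2dos_of(text)
  text.gsub(/\n/, "\r\n")
end

# Hypothetical variant (not in the source) that leaves existing CRLF
# pairs untouched.
def unix2dos_safe(text)
  text.gsub(/\r?\n/, "\r\n")
end

puts base64_of("hello").inspect         # prints "aGVsbG8=\n"
puts quoted_printable_of("é").inspect   # prints "=C3=A9=\r\n"
puts unix2dos_of("a\r\n").inspect       # prints "a\r\r\n"
puts unix2dos_safe("a\r\n").inspect     # prints "a\r\n"
```

As the last two lines show, the plain substitution doubles the carriage return when the input already uses CRLF line ends, so it is best applied only to text known to use LF.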