Module: Babosa::UTF8::Proxy
- Included in: DumbProxy, JavaProxy, UnicodeProxy
- Defined in: lib/babosa/utf8/proxy.rb
Overview
A UTF-8 proxy for Babosa can be any object which responds to the methods in this module. The following proxies are provided by Babosa: ActiveSupportProxy, DumbProxy, JavaProxy, and UnicodeProxy.
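Because a proxy is defined only by the methods it responds to, a custom one can be quite small. The following is a minimal sketch, not one of the proxies shipped with Babosa: the module name `SketchProxy` is hypothetical, and it assumes a Ruby version (2.4 or later) where the core `String` case methods are Unicode-aware. Its `tidy_bytes` is deliberately simplified and only substitutes U+FFFD for invalid bytes.

```ruby
# Hypothetical proxy for illustration; the name SketchProxy and the use of
# core String methods are assumptions, not part of Babosa itself.
module SketchProxy
  extend self

  # Unicode-aware downcasing via core String#downcase (Ruby 2.4+).
  def downcase(string)
    string.downcase
  end

  # Unicode-aware upcasing via core String#upcase (Ruby 2.4+).
  def upcase(string)
    string.upcase
  end

  # NFC normalization via String#unicode_normalize.
  def normalize_utf8(string)
    string.unicode_normalize(:nfc)
  end

  # Simplified: replaces each invalid byte sequence with U+FFFD rather than
  # guessing the original encoding.
  def tidy_bytes(string)
    string.scrub
  end
end
```

Any object exposing these four methods, whether a module like this one or an instance of a class, fits the description above.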
Constant Summary
- CP1252 =
{ 128 => [226, 130, 172], 129 => nil, 130 => [226, 128, 154], 131 => [198, 146], 132 => [226, 128, 158], 133 => [226, 128, 166], 134 => [226, 128, 160], 135 => [226, 128, 161], 136 => [203, 134], 137 => [226, 128, 176], 138 => [197, 160], 139 => [226, 128, 185], 140 => [197, 146], 141 => nil, 142 => [197, 189], 143 => nil, 144 => nil, 145 => [226, 128, 152], 146 => [226, 128, 153], 147 => [226, 128, 156], 148 => [226, 128, 157], 149 => [226, 128, 162], 150 => [226, 128, 147], 151 => [226, 128, 148], 152 => [203, 156], 153 => [226, 132, 162], 154 => [197, 161], 155 => [226, 128, 186], 156 => [197, 147], 157 => nil, 158 => [197, 190], 159 => [197, 184] }
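Each key in this table is a single Windows CP-1252 byte and each value is the UTF-8 byte sequence for the same character, with `nil` marking bytes that CP-1252 leaves unassigned. The sketch below copies three entries to show how a lookup could be turned into a string; the constant `CP1252_EXCERPT` and the helper `cp1252_byte_to_utf8` are hypothetical names used only for illustration.

```ruby
# Three entries copied from the table above; the real constant has 32.
CP1252_EXCERPT = {
  128 => [226, 130, 172], # euro sign, U+20AC
  129 => nil,             # unassigned in CP-1252
  147 => [226, 128, 156]  # left double quotation mark, U+201C
}.freeze

# Hypothetical helper: returns the UTF-8 string for one CP-1252 byte,
# or nil when the byte has no mapping (or is not in this excerpt).
def cp1252_byte_to_utf8(byte)
  utf8_bytes = CP1252_EXCERPT[byte]
  return nil if utf8_bytes.nil?

  utf8_bytes.pack('C*').force_encoding(Encoding::UTF_8)
end
```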
Instance Method Summary
- #downcase(string) ⇒ Object
  This is a stub for a method that should return a Unicode-aware downcased version of the given string.
- #normalize_utf8(string) ⇒ Object
  This is a stub for a method that should return the Unicode NFC normalization of the given string.
- #tidy_bytes(string) ⇒ Object
  Attempt to replace invalid UTF-8 bytes with valid ones.
- #upcase(string) ⇒ Object
  This is a stub for a method that should return a Unicode-aware upcased version of the given string.
Instance Method Details
#downcase(string) ⇒ Object
This is a stub for a method that should return a Unicode-aware downcased version of the given string.
    # File 'lib/babosa/utf8/proxy.rb', line 49
    def downcase(string)
      raise NotImplementedError
    end
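To illustrate what "Unicode-aware" means for a method like this, the sketch below contrasts ASCII-only case folding with full Unicode case folding using Ruby's core `String#downcase` (Ruby 2.4 or later is assumed). It is not Babosa code; the variable names are illustrative, and the same distinction applies to upcasing.

```ruby
word = "\u00C9COLE" # "ÉCOLE", starting with U+00C9

# ASCII-only folding leaves the accented capital untouched.
ascii_only = word.downcase(:ascii)

# Unicode-aware folding also lowers U+00C9 to U+00E9.
unicode_aware = word.downcase
```

A proxy implementation is expected to behave like the second call, so that non-ASCII letters are folded along with A-Z.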
#normalize_utf8(string) ⇒ Object
This is a stub for a method that should return the Unicode NFC normalization of the given string.
    # File 'lib/babosa/utf8/proxy.rb', line 61
    def normalize_utf8(string)
      raise NotImplementedError
    end
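As a reminder of what NFC normalization does, the sketch below composes a base letter and a combining mark into a single precomposed code point. It uses Ruby's standard `String#unicode_normalize`, which is one way an implementation could satisfy this stub; that choice is an assumption, not something this page specifies.

```ruby
decomposed = "e\u0301" # "e" followed by U+0301 COMBINING ACUTE ACCENT

# NFC composes the pair into the single code point U+00E9.
composed = decomposed.unicode_normalize(:nfc)

decomposed_length = decomposed.length # two code points
composed_length = composed.length     # one code point
```

Both strings render identically, but only the normalized form compares equal to a precomposed "é".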
#tidy_bytes(string) ⇒ Object
Attempt to replace invalid UTF-8 bytes with valid ones. This method assumes that any invalid UTF-8 bytes are either Windows CP-1252 or ISO-8859-1. In practice this is a reasonable assumption, but it may not always hold.
    # File 'lib/babosa/utf8/proxy.rb', line 70
    def tidy_bytes(string)
      string.scrub do |bad|
        tidy_byte(*bad.bytes).flatten.compact.pack('C*').unpack('U*').pack('U*')
      end
    end
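The `tidy_byte` helper called above is not shown on this page, so the following is only a rough sketch of the same idea under a stated substitution: it reuses `String#scrub` with a block, but reinterprets each invalid chunk with Ruby's built-in Windows-1252 transcoder instead of Babosa's `CP1252` table. The method name `tidy_bytes_sketch` is hypothetical.

```ruby
# Hypothetical stand-in for tidy_bytes; it relies on Ruby's Windows-1252
# transcoder rather than Babosa's CP1252 table and tidy_byte helper.
def tidy_bytes_sketch(string)
  string.scrub do |bad|
    # Treat the invalid bytes as Windows-1252 (which matches ISO-8859-1 for
    # bytes 160-255) and transcode to UTF-8. Bytes the transcoder cannot
    # map are replaced with an empty string, i.e. dropped.
    bad.b
       .force_encoding(Encoding::Windows_1252)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: '')
  end
end

# "caf" followed by a lone 0xE9 byte, which is "é" in ISO-8859-1 and CP-1252.
repaired = tidy_bytes_sketch("caf\xE9")
```

Valid UTF-8 input passes through unchanged, because `scrub` only yields the byte sequences it considers invalid.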
#upcase(string) ⇒ Object
This is a stub for a method that should return a Unicode-aware upcased version of the given string.
    # File 'lib/babosa/utf8/proxy.rb', line 55
    def upcase(string)
      raise NotImplementedError
    end