Module: URI
- Defined in:
- lib/unicorn-cuba-base/uri_ext.rb
Class Method Summary
- .decode(str) ⇒ Object
Default decoder; delegates to .utf_decode.
- .pct_decode ⇒ Object
- .utf_decode(str) ⇒ Object
From en.wikipedia.org/wiki/Percent-encoding: The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values.
Class Method Details
.decode(str) ⇒ Object
Default decoder. It delegates to .utf_decode, so both %XX and %uXXXX sequences are decoded.
# File 'lib/unicorn-cuba-base/uri_ext.rb', line 21
def self.decode(str)
  self.utf_decode(str)
end
.pct_decode ⇒ Object
# File 'lib/unicorn-cuba-base/uri_ext.rb', line 5
alias_method :pct_decode, :decode
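The alias is declared at line 5, before .decode is redefined at line 21, which suggests the usual alias-then-override pattern: the original implementation stays reachable under the new name. The following is a minimal sketch of that pattern, assuming a hypothetical AliasSketch module with placeholder method bodies. It does not touch URI itself, since the standard library no longer defines URI.decode on Ruby 3.0 and later.

```ruby
# Hypothetical module used only to illustrate alias-then-override.
module AliasSketch
  # Stand-in for the original implementation.
  def self.decode(str)
    "original(#{str})"
  end

  class << self
    # Capture the current .decode under a second name.
    alias_method :pct_decode, :decode
  end

  # Redefine .decode; the alias still points at the earlier body.
  def self.decode(str)
    "wrapped(#{pct_decode(str)})"
  end
end
```

Here AliasSketch.pct_decode('x') returns "original(x)", while AliasSketch.decode('x') returns "wrapped(original(x))".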
.utf_decode(str) ⇒ Object
From en.wikipedia.org/wiki/Percent-encoding: The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values. The deprecated JavaScript escape() function (called encode() in the source comment) is also sometimes used; it writes code units above 0xFF as %uXXXX, and this method decodes those sequences as well.
# File 'lib/unicorn-cuba-base/uri_ext.rb', line 11
def self.utf_decode(str)
  pct_decode(str) # decode %XX bits
    .force_encoding('UTF-8') # Make sure the string is interpreting UTF-8 chars
    .tap{|uri| validate_string_encoding(uri)}
    .gsub(/%u([0-9a-z]{4})/) {|s| [$1.to_i(16)].pack("U")} # Decode %uXXXX encoded chars (JavaScript.encode())
    .tap{|uri| validate_string_encoding(uri)}
end
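The two decoding stages can be tried in isolation with the sketch below. It is not the library's code: UriDecodeSketch and check! are hypothetical names, the %XX stage is a hand-written stand-in for the aliased pct_decode, and String#valid_encoding? stands in for validate_string_encoding, whose body is not shown here. The sketch also assumes a case-insensitive hex match, which differs from the [0-9a-z] class in the listing.

```ruby
# Hypothetical stand-alone module; illustrates the two stages only.
module UriDecodeSketch
  # Stage 1: replace each %XX escape with the raw byte it names.
  def self.pct_decode(str)
    str.b.gsub(/%([0-9a-fA-F]{2})/) { $1.hex.chr }
  end

  # Assumed validation step: reject strings that are not valid in
  # their current encoding, otherwise pass them through.
  def self.check!(str)
    raise ArgumentError, 'invalid UTF-8 sequence' unless str.valid_encoding?
    str
  end

  # Stage 2: reinterpret the bytes as UTF-8, validate, expand
  # %uXXXX escapes into code points, and validate again.
  def self.utf_decode(str)
    utf8 = check!(pct_decode(str).force_encoding('UTF-8'))
    check!(utf8.gsub(/%u([0-9a-fA-F]{4})/) { [$1.hex].pack('U') })
  end
end
```

With this sketch, UriDecodeSketch.utf_decode('caf%C3%A9') and UriDecodeSketch.utf_decode('caf%u00e9') both return "café", while an input such as '%FF' raises ArgumentError because the decoded byte is not valid UTF-8.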