Module: URI

Defined in:
lib/unicorn-cuba-base/uri_ext.rb

Class Method Summary

Class Method Details

.decode(str) ⇒ Object

Default decoder. Delegates to utf_decode, so both %XX and %uXXXX escapes are handled.



# File 'lib/unicorn-cuba-base/uri_ext.rb', line 21

def self.decode(str)
  self.utf_decode(str)
end

.pct_decode ⇒ Object



# File 'lib/unicorn-cuba-base/uri_ext.rb', line 5

alias_method :pct_decode, :decode
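
The alias is declared at line 5, before decode is redefined at line 21, so pct_decode presumably keeps pointing at the original percent-decoding implementation while decode is replaced. A minimal sketch of that alias-then-override pattern follows; the Demo module and the string results are hypothetical stand-ins, not part of unicorn-cuba-base.

```ruby
# Hypothetical module standing in for URI; names and return values are illustrative only.
module Demo
  class << self
    # Stand-in for the original decoder.
    def decode(str)
      "original(#{str})"
    end

    # Capture the current implementation under a second name.
    alias_method :pct_decode, :decode

    # Redefining decode afterwards does not change what pct_decode calls.
    def decode(str)
      "overridden(#{pct_decode(str)})"
    end
  end
end

puts Demo.pct_decode('a')
puts Demo.decode('a')
```

Because the alias binds to the method body that existed when alias_method ran, the new decode can call pct_decode without recursing into itself.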

.utf_decode(str) ⇒ Object

From en.wikipedia.org/wiki/Percent-encoding: The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values.

In addition, the deprecated JavaScript escape() function is sometimes used; it writes characters above U+00FF as %uXXXX, where XXXX is the hexadecimal UTF-16 code unit. This method decodes both forms.



# File 'lib/unicorn-cuba-base/uri_ext.rb', line 11

def self.utf_decode(str)
  pct_decode(str) # decode %XX escapes into raw bytes
    .force_encoding('UTF-8') # interpret the decoded bytes as UTF-8
    .tap { |uri| validate_string_encoding(uri) }
    .gsub(/%u([0-9a-f]{4})/i) { [$1.to_i(16)].pack('U') } # decode %uXXXX escapes (JavaScript escape()), either hex case
    .tap { |uri| validate_string_encoding(uri) }
end
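
The decoding stages above can be tried in isolation with the following sketch. It is not the library's implementation: sketch_utf_decode is a hypothetical helper, the %XX step is written by hand instead of calling pct_decode, and an ArgumentError with valid_encoding? stands in for validate_string_encoding, whose behavior is not shown here.

```ruby
# Hypothetical helper, not part of unicorn-cuba-base.
def sketch_utf_decode(str)
  # Stage 1: turn each %XX escape into its raw byte (working on a binary copy).
  bytes = str.b.gsub(/%([0-9a-fA-F]{2})/) { $1.hex.chr }

  # Stage 2: reinterpret the bytes as UTF-8 and reject invalid sequences.
  text = bytes.force_encoding('UTF-8')
  raise ArgumentError, 'invalid UTF-8 sequence' unless text.valid_encoding?

  # Stage 3: expand %uXXXX escapes into the corresponding code point.
  result = text.gsub(/%u([0-9a-fA-F]{4})/) { [$1.hex].pack('U') }

  # Stage 4: validate again, since stage 3 may have introduced bad code points.
  raise ArgumentError, 'invalid UTF-8 sequence' unless result.valid_encoding?
  result
end

puts sketch_utf_decode('caf%C3%A9')
puts sketch_utf_decode('%u20AC')
```

The %XX pattern cannot match %uXXXX because "u" is not a hexadecimal digit, which is why the two stages can run one after the other without interfering.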