Class: Hermeneutics::URLText
- Inherits: Object
- Defined in:
- lib/hermeneutics/escape.rb
Overview
URL-able representation
What’s actually happening
URLs may not contain spaces or several other characters such as slashes and ampersands. These characters are masked by a percent sign followed by two hex digits representing the ASCII code. Eight-bit characters should be masked the same way.
A URL does not store encoding information by itself. A locator may read either of these:
http://www.example.com/subdir/index.html?umlfield=%C3%BCber+alles
http://www.example.com/subdir/index.html?umlfield=%FCber+alles
The CGI program that reads the URL has to decide for itself how to interpret it.
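As a rough illustration of that decision, the following sketch decodes both query values with Ruby's standard `URI.decode_www_form_component`, which takes the assumed encoding as its second argument. It uses the standard library rather than URLText, and the variable names are chosen only for this example.

```ruby
require "uri"

utf8_value   = "%C3%BCber+alles"  # the umlaut sent as two UTF-8 bytes
latin1_value = "%FCber+alles"     # the umlaut sent as one ISO-8859-1 byte

# The receiver supplies the encoding; the URL itself does not name one.
a = URI.decode_www_form_component(utf8_value, Encoding::UTF_8)
b = URI.decode_www_form_component(latin1_value, Encoding::ISO_8859_1)

puts a                   # text tagged as UTF-8
puts b.encode("UTF-8")   # the same text once transcoded

# Guessing wrong yields a string that is invalid in the assumed encoding.
wrong = URI.decode_www_form_component(latin1_value, Encoding::UTF_8)
puts wrong.valid_encoding?   # false
```

Checking `valid_encoding?` after decoding is one inexpensive way to notice a wrong guess before the string is used elsewhere.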
Examples
URLText.encode "'Stop!' said Fred." #=> "%27Stop%21%27+said+Fred."
URLText.decode "%27Stop%21%27+said+Fred%2e"
#=> "'Stop!' said Fred."
Defined Under Namespace
Classes: Dict
Constant Summary
- PAIR_SET = "="
- PAIR_SEP = "&"
Instance Attribute Summary
- #keep_8bit ⇒ Object
Returns the value of attribute keep_8bit.
- #keep_space ⇒ Object
Returns the value of attribute keep_space.
- #mask_space ⇒ Object
Returns the value of attribute mask_space.
Class Method Summary
- .decode(str) ⇒ Object
Call sequence: decode( str) -> str, decode( str, encoding) -> str.
- .decode_hash(qstr) ⇒ Object
Call sequence: decode_hash( str) -> hash, decode_hash( str) { |key,val| … } -> nil or int.
- .encode(str) ⇒ Object
- .encode_hash(hash) ⇒ Object
- .mkurl(path, hash, anchor = nil) ⇒ Object
- .std ⇒ Object
Instance Method Summary
- #decode(str) ⇒ Object
- #decode_hash(qstr, &block) ⇒ Object
- #encode(str) ⇒ Object
Call sequence: encode( str) -> str.
- #encode_hash(hash) ⇒ Object
Call sequence: encode_hash( hash) -> str.
- #initialize(keep_8bit: nil, keep_space: nil, mask_space: nil) ⇒ URLText (constructor)
Call sequence: new( hash) -> urltext.
- #mkurl(path, hash = nil, anchor = nil) ⇒ Object
Call sequence: mkurl( path, hash, anchor = nil) -> str.
Constructor Details
#initialize(keep_8bit: nil, keep_space: nil, mask_space: nil) ⇒ URLText
Call sequence:
new( hash) -> urltext
Creates a URLText converter.
The parameters may be given as keyword arguments or as a hash.
utx = URLText.new keep_8bit: true, keep_space: false
See the encode method for an explanation of these parameters.
# File 'lib/hermeneutics/escape.rb', line 270
def initialize keep_8bit: nil, keep_space: nil, mask_space: nil
  @keep_8bit = keep_8bit
  @keep_space = keep_space
  @mask_space = mask_space
end
Instance Attribute Details
#keep_8bit ⇒ Object
Returns the value of attribute keep_8bit.
# File 'lib/hermeneutics/escape.rb', line 257
def keep_8bit
  @keep_8bit
end
#keep_space ⇒ Object
Returns the value of attribute keep_space.
# File 'lib/hermeneutics/escape.rb', line 257
def keep_space
  @keep_space
end
#mask_space ⇒ Object
Returns the value of attribute mask_space.
# File 'lib/hermeneutics/escape.rb', line 257
def mask_space
  @mask_space
end
Class Method Details
.decode(str) ⇒ Object
Call sequence:
decode( str) -> str
decode( str, encoding) -> str
Decode the %XX-encoded string.
utx = URLText.new
utx.decode "%27Stop%21%27+said+Fred%2e" #=> "'Stop!' said Fred."
The encoding will be kept. That means that an invalidly encoded string could be produced.
a = "bl%F6d"
a.encode! "utf-8"
d = utx.decode a
d =~ /./ #=> "invalid byte sequence in UTF-8 (ArgumentError)"
# File 'lib/hermeneutics/escape.rb', line 460
def decode str
  r = str.new_string
  r.tr! "+", " "
  r.gsub! /(?:%([0-9A-F]{2}))/i do $1.hex.chr end
  r.force_encoding str.encoding
  r
end
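Because the decoded bytes are simply relabelled with the input's encoding, a caller may want to check the result before using it. The following is a minimal standalone sketch of that check, not the library's code: `sketch_decode` is a hypothetical stand-in that follows the listing above, with `String#b` used in place of the `new_string` helper, and the ISO-8859-1 fallback is an assumption about the sender.

```ruby
# Hypothetical stand-in for URLText.decode, for illustration only.
def sketch_decode(str)
  r = str.b                                   # binary copy of the input
  r.tr!("+", " ")                             # "+" stands for a space
  r.gsub!(/%([0-9A-F]{2})/i) { $1.hex.chr }   # %XX -> raw byte
  r.force_encoding(str.encoding)              # keep the caller's encoding
end

d = sketch_decode("bl%F6d")                   # the literal is UTF-8 here
unless d.valid_encoding?
  # Assumption: the sender used ISO-8859-1. Reinterpret, then transcode.
  d = d.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end
puts d
```

Testing `valid_encoding?` first avoids the exception that a later regular expression match would raise on the broken string.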
.decode_hash(qstr) ⇒ Object
Call sequence:
decode_hash( str) -> hash
decode_hash( str) { |key,val| ... } -> nil or int
Decode a URL-style encoded string to a Hash. In case a block is given, each key-value pair is yielded and the number of pairs is returned, or nil if there are none.
str = "a=%3B%3B%3B&x=%26auml%3B%26ouml%3B%26uuml%3B"
URLText.decode_hash str do |k,v|
puts "#{k} = #{v}"
end
Output:
a = ;;;
x = &auml;&ouml;&uuml;
# File 'lib/hermeneutics/escape.rb', line 485
def decode_hash qstr
  if block_given? then
    i = 0
    each_pair qstr do |k,v|
      yield k, v
      i += 1
    end
    i.nonzero?
  else
    Dict.create do |h|
      each_pair qstr do |k,v|
        h.parse k, v
      end
    end
  end
end
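The two return styles can be sketched without the library as follows. This is only an approximation under stated assumptions: `sketch_decode_hash` is a hypothetical name, the pair splitting stands in for the `each_pair` helper that is not shown here, a plain Hash replaces `Dict`, and the standard `URI.decode_www_form_component` does the decoding.

```ruby
require "uri"

# Hypothetical stand-in for URLText.decode_hash, for illustration only.
def sketch_decode_hash(qstr)
  # "&" and "=" correspond to PAIR_SEP and PAIR_SET.
  pairs = qstr.split("&").reject(&:empty?).map do |pair|
    k, v = pair.split("=", 2)
    [URI.decode_www_form_component(k), URI.decode_www_form_component(v.to_s)]
  end
  if block_given?
    pairs.each { |k, v| yield k, v }
    pairs.size.nonzero?          # nil when there were no pairs
  else
    pairs.to_h
  end
end

str = "a=%3B%3B%3B&x=%26auml%3B%26ouml%3B%26uuml%3B"
count = sketch_decode_hash(str) { |k, v| puts "#{k} = #{v}" }
puts count                       # 2
```

Returning nil instead of 0 lets the block form be used directly in a condition such as `if sketch_decode_hash(q) { ... }`.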
.encode(str) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 431
def encode str
  std.encode str
end
.encode_hash(hash) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 435
def encode_hash hash
  std.encode_hash hash
end
.mkurl(path, hash, anchor = nil) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 439
def mkurl path, hash, anchor = nil
  std.mkurl path, hash, anchor
end
.std ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 427
def std
  @std ||= new
end
Instance Method Details
#decode(str) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 417
def decode str
  self.class.decode str
end
#decode_hash(qstr, &block) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 421
def decode_hash qstr, &block
  self.class.decode_hash qstr, &block
end
#encode(str) ⇒ Object
Call sequence:
encode( str) -> str
Create a string that contains %XX-encoded bytes.
utx = URLText.new
utx.encode "'Stop!' said Fred." #=> "%27Stop%21%27+said+Fred."
The result will not contain any 8-bit characters, except when keep_8bit is set. The result will be in the same encoding as the argument, although this normally makes no difference because the result is plain ASCII.
utx = URLText.new keep_8bit: true
s = "< ä >".encode "UTF-8"
utx.encode s #=> "%3C+\u{e4}+%3E" in UTF-8
s = "< ä >".encode "ISO-8859-1"
utx.encode s #=> "%3C+\xe4+%3E" in ISO-8859-1
A space " " will not be replaced by a plus "+" if keep_space is set.
utx = URLText.new keep_space: true
s = "< x >"
utx.encode s #=> "%3C x %3E"
When mask_space is set, a space will be represented as "%20", regardless of keep_space.
# File 'lib/hermeneutics/escape.rb', line 305
def encode str
  r = str.new_string
  r.force_encoding Encoding::ASCII_8BIT unless @keep_8bit
  r.gsub! %r/([^a-zA-Z0-9_.-])/ do |c|
    if c == " " and not @mask_space then
      @keep_space ? c : "+"
    elsif not @keep_8bit or c.ascii_only? then
      "%%%02X" % c.ord
    else
      c
    end
  end
  r.encode! str.encoding
end
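To see how the three options interact, the listing above can be restated as a standalone function. This is a sketch for experimenting, not the library's code: `sketch_encode` is a hypothetical name, `String#dup` replaces the `new_string` helper, and the final step relabels the string with `force_encoding` instead of transcoding it.

```ruby
# Hypothetical stand-in for URLText#encode, for illustration only.
def sketch_encode(str, keep_8bit: false, keep_space: false, mask_space: false)
  r = str.dup
  # Without keep_8bit, work byte by byte so every non-ASCII byte is masked.
  r.force_encoding(Encoding::ASCII_8BIT) unless keep_8bit
  r.gsub!(/[^a-zA-Z0-9_.-]/) do |c|
    if c == " " && !mask_space
      keep_space ? c : "+"
    elsif !keep_8bit || c.ascii_only?
      "%%%02X" % c.ord
    else
      c                          # 8-bit character kept as is
    end
  end
  r.force_encoding(str.encoding)
end

puts sketch_encode("'Stop!' said Fred.")              # %27Stop%21%27+said+Fred.
puts sketch_encode("< x >", keep_space: true)         # %3C x %3E
puts sketch_encode("< x >", mask_space: true)         # %3C%20x%20%3E
puts sketch_encode("< \u00E4 >")                      # %3C+%C3%A4+%3E
```

Since mask_space is tested first, setting both mask_space and keep_space still produces "%20" for a space.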
#encode_hash(hash) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 384
def encode_hash hash
  hash.map { |(k,v)|
    case v
      when nil   then next
      when true  then v = k
      when false then v = ""
    end
    [k, v].map { |x| encode x.to_s }.join PAIR_SET
  }.compact.join PAIR_SEP
end
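The handling of nil, true and false values in the listing above can be tried out with a small standalone sketch. `sketch_encode_hash` is a hypothetical name, and it escapes with the standard `URI.encode_www_form_component`, whose set of unescaped characters is an assumption here and may differ slightly from URLText's own.

```ruby
require "uri"

# Hypothetical stand-in for URLText#encode_hash, for illustration only.
def sketch_encode_hash(hash)
  hash.map { |k, v|
    case v
    when nil   then next       # pair is dropped entirely
    when true  then v = k      # flag: the key doubles as the value
    when false then v = ""     # key with an empty value
    end
    [k, v].map { |x| URI.encode_www_form_component(x.to_s) }.join("=")
  }.compact.join("&")
end

puts sketch_encode_hash(q: "a b", debug: true, empty: false, skip: nil)
# q=a+b&debug=debug&empty=
```

The `next` leaves a nil in the mapped array, which is why `compact` is needed before joining.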
#mkurl(path, hash = nil, anchor = nil) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 405
def mkurl path, hash = nil, anchor = nil
  unless Hash === hash then
    hash, anchor = anchor, hash
  end
  r = "#{path}"
  r << "?#{encode_hash hash}" if hash
  r << "##{anchor}" if anchor
  r
end
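The argument swap at the top of the listing means the anchor can be passed in the second position when there is no hash. A minimal sketch of the two call shapes, assuming the standard `URI.encode_www_form` in place of `encode_hash` and using the hypothetical name `sketch_mkurl`:

```ruby
require "uri"

# Hypothetical stand-in for URLText#mkurl, for illustration only.
def sketch_mkurl(path, hash = nil, anchor = nil)
  # If the second argument is not a Hash, treat it as the anchor.
  hash, anchor = anchor, hash unless Hash === hash
  r = "#{path}"
  r << "?#{URI.encode_www_form(hash)}" if hash
  r << "##{anchor}" if anchor
  r
end

puts sketch_mkurl("/search", { q: "a b" }, "top")   # /search?q=a+b#top
puts sketch_mkurl("/search", "top")                 # /search#top
puts sketch_mkurl("/search")                        # /search
```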