Module: Escape
- Defined in:
- lib/esc.rb
Overview
Escape module provides several escape functions.
- URI
- HTML
- shell command
Defined Under Namespace
Classes: HTMLAttrValue, HTMLEscaped, PercentEncoded, ShellEscaped, StringWrapper
Constant Summary
- HTML_TEXT_ESCAPE_HASH =
{
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
}
- HTML_ATTR_ESCAPE_HASH =
{ #:nodoc:all
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
}
Class Method Summary
- .html_attr_value(str) ⇒ Object
Escape.html_attr_value encodes a string as a double-quoted HTML attribute using character references.
- .html_form(pairs, sep = '&') ⇒ Object
Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string.
- .html_form_fast(pairs, sep = '&') ⇒ Object
- .html_text(str) ⇒ Object
Escape.html_text escapes a string appropriate for HTML text using character references.
- .shell_command(*command) ⇒ Object
Escape.shell_command composes a sequence of words into a single shell command line.
- .shell_single_word(str) ⇒ Object
Escape.shell_single_word quotes shell metacharacters.
- .uri_path(str) ⇒ Object
Escape.uri_path escapes a URI path using percent-encoding.
- .uri_segment(str) ⇒ Object
Escape.uri_segment escapes a URI segment using percent-encoding.
Class Method Details
.html_attr_value(str) ⇒ Object
Escape.html_attr_value encodes a string as a double-quoted HTML attribute using character references. It returns an instance of HTMLAttrValue.
Escape.html_attr_value("abc") #=> #<Escape::HTMLAttrValue: "abc">
Escape.html_attr_value("a&b") #=> #<Escape::HTMLAttrValue: "a&amp;b">
Escape.html_attr_value("ab&<>\"c") #=> #<Escape::HTMLAttrValue: "ab&amp;&lt;&gt;&quot;c">
Escape.html_attr_value("a'c") #=> #<Escape::HTMLAttrValue: "a'c">
It escapes 4 characters:
- ‘&’ to ‘&amp;’
- ‘<’ to ‘&lt;’
- ‘>’ to ‘&gt;’
- ‘"’ to ‘&quot;’
# File 'lib/esc.rb', line 297
def html_attr_value(str) #:nodoc:all
  s = '"' + str.gsub(/[&<>"]/) {|ch| HTML_ATTR_ESCAPE_HASH[ch] } + '"'
  HTMLAttrValue.new_no_dup(s)
end
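The substitution described above can be tried in isolation. The following is only a minimal sketch: `ATTR_ESCAPES` and `quote_attr` are hypothetical names, not part of the Escape module, and the helper returns a plain String rather than an HTMLAttrValue.

```ruby
# Hypothetical stand-in for HTML_ATTR_ESCAPE_HASH.
ATTR_ESCAPES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
}.freeze

# Hypothetical helper: escapes the four characters and adds the
# surrounding double quotes, returning a plain String.
def quote_attr(str)
  '"' + str.gsub(/[&<>"]/) { |ch| ATTR_ESCAPES[ch] } + '"'
end

# The result already carries its own quotes, so it is interpolated as-is.
puts "<a title=#{quote_attr('Tom & "Jerry"')}>"
# prints: <a title="Tom &amp; &quot;Jerry&quot;">
```

Because the value is always wrapped in double quotes, a single quote such as the one in "a'c" can pass through unescaped.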
.html_form(pairs, sep = '&') ⇒ Object
Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string. It returns an instance of PercentEncoded.
Escape.html_form takes an array of pairs of strings, or a hash from string to string.
Escape.html_form([["a","b"], ["c","d"]]) #=> #<Escape::PercentEncoded: a=b&c=d>
Escape.html_form({"a"=>"b", "c"=>"d"}) #=> #<Escape::PercentEncoded: a=b&c=d>
In the array form, the same key can be used more than once. (This is required for an HTML form that contains checkboxes or a select element with the multiple attribute.)
Escape.html_form([["k","1"], ["k","2"]]) #=> #<Escape::PercentEncoded: k=1&k=2>
If the strings contain characters that must be escaped in x-www-form-urlencoded, they are escaped using %-encoding.
Escape.html_form([["k=","&;="]]) #=> #<Escape::PercentEncoded: k%3D=%26%3B%3D>
The separator can be specified by the optional second argument.
Escape.html_form([["a","b"], ["c","d"]], ";") #=> #<Escape::PercentEncoded: a=b;c=d>
See HTML 4.01 for details.
# File 'lib/esc.rb', line 210
def html_form(pairs, sep='&')
  r = ''
  first = true
  pairs.each {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    r << sep if !first
    first = false
    k.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
    r << '='
    v.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
  }
  PercentEncoded.new_no_dup(r)
end
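The encoding rules documented above can be reproduced in a few lines. This is a sketch under stated assumptions, not the module's implementation: `FORM_UNSAFE` and `form_encode` are hypothetical names, the character class is copied from the listing above, and the result is a plain String rather than a PercentEncoded instance.

```ruby
# Every byte outside this set is percent-encoded (class copied from the listing).
FORM_UNSAFE = %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}

# Hypothetical helper: accepts an array of [key, value] pairs or a Hash.
def form_encode(pairs, sep = '&')
  pairs.map { |k, v|
    [k, v].map { |s|
      # String#b gives a binary copy, so the substitution works byte by byte.
      s.b.gsub(FORM_UNSAFE) { |ch| format('%%%02X', ch.ord) }
    }.join('=')
  }.join(sep)
end

puts form_encode([["a", "b"], ["c", "d"]])       # a=b&c=d
puts form_encode({ "a" => "b", "c" => "d" })     # a=b&c=d
puts form_encode([["k", "1"], ["k", "2"]])       # k=1&k=2
puts form_encode([["k=", "&;="]])                # k%3D=%26%3B%3D
puts form_encode([["a", "b"], ["c", "d"]], ";")  # a=b;c=d
```

Note that a space is outside the unescaped set, so with this character class it comes out as %20 rather than "+".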
.html_form_fast(pairs, sep = '&') ⇒ Object
# File 'lib/esc.rb', line 165
def html_form_fast(pairs, sep='&')
  s = pairs.map {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    k = k.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    v = v.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    "#{k}=#{v}"
  }.join(sep)
  PercentEncoded.new_no_dup(s)
end
.html_text(str) ⇒ Object
Escape.html_text escapes a string appropriate for HTML text using character references. It returns an instance of HTMLEscaped.
It escapes 3 characters:
- ‘&’ to ‘&amp;’
- ‘<’ to ‘&lt;’
- ‘>’ to ‘&gt;’
Escape.html_text("abc") #=> #<Escape::HTMLEscaped: abc>
Escape.html_text("a & b < c > d") #=> #<Escape::HTMLEscaped: a &amp; b &lt; c &gt; d>
This function is not appropriate for escaping HTML attribute values because quotes are not escaped.
# File 'lib/esc.rb', line 267
def html_text(str) #:nodoc:all
  s = str.gsub(/[&<>]/) {|ch| HTML_TEXT_ESCAPE_HASH[ch] }
  HTMLEscaped.new_no_dup(s)
end
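To illustrate the warning about attribute values, here is a small sketch. `TEXT_ESCAPES` and `escape_text` are hypothetical names that mirror the three-character rule above and return a plain String; the markup strings are made up for the example.

```ruby
# Hypothetical stand-in for HTML_TEXT_ESCAPE_HASH.
TEXT_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Hypothetical helper: escapes only the characters that matter in HTML text.
def escape_text(str)
  str.gsub(/[&<>]/) { |ch| TEXT_ESCAPES[ch] }
end

# Safe use: element content.
puts "<p>#{escape_text('a & b < c > d')}</p>"
# prints: <p>a &amp; b &lt; c &gt; d</p>

# Unsafe use: the double quote passes through, so the input can close
# the attribute and start a new one.
input = 'x" onclick="alert(1)'
puts %(<input value="#{escape_text(input)}">)
# prints: <input value="x" onclick="alert(1)">
```

The second line of output shows why an attribute-aware escaper, which also handles the double quote, is needed for attribute values.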
.shell_command(*command) ⇒ Object
Escape.shell_command composes a sequence of words into a single shell command line. All shell metacharacters are quoted and the words are joined with single spaces. It returns an instance of ShellEscaped.
Escape.shell_command(["ls", "/"]) #=> #<Escape::ShellEscaped: ls />
Escape.shell_command(["echo", "*"]) #=> #<Escape::ShellEscaped: echo '*'>
Note that system(*command) and system(Escape.shell_command(command)) are roughly the same. There are two exceptions:
- The first is that the latter may invoke /bin/sh.
- The second is the interpretation of an array with only one element: with the former the element is parsed by the shell, but with the latter it is treated as a single word. For example, system(*["echo foo"]) invokes the echo command with the argument "foo", but system(Escape.shell_command(["echo foo"])) invokes a command named "echo foo" with no arguments (and it probably fails).
# File 'lib/esc.rb', line 86
def shell_command(*command)
  command = [command].flatten.compact # Delano
  s = command.map {|word| shell_single_word(word) }.join(' ')
  ShellEscaped.new_no_dup(s)
end
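The one-element case can be illustrated without running any command. The sketch below swaps in Ruby's standard Shellwords library as a stand-in for Escape: Shellwords quotes with backslashes instead of single quotes, so the escaped string looks different, but the word boundaries behave the same way. Nothing here is the module's own code.

```ruby
require 'shellwords'

argv = ['echo foo'] # a command array with exactly one element

# Passing the raw string to a shell: it is split into two words.
raw_words = Shellwords.split(argv.first)

# Escaping first (stdlib analogue of composing an escaped command line):
# the space is quoted, so the shell sees one word.
quoted_line  = Shellwords.join(argv)
quoted_words = Shellwords.split(quoted_line)

p raw_words     # ["echo", "foo"]
p quoted_line   # "echo\\ foo"
p quoted_words  # ["echo foo"]
```

In the escaped form the shell would look for a program literally named "echo foo", which is the failure mode described above.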
.shell_single_word(str) ⇒ Object
Escape.shell_single_word quotes shell metacharacters. It returns an instance of ShellEscaped.
The result string is always a single shell word, even if the argument is "". Escape.shell_single_word("") returns #<Escape::ShellEscaped: ''>.
Escape.shell_single_word("") #=> #<Escape::ShellEscaped: ''>
Escape.shell_single_word("foo") #=> #<Escape::ShellEscaped: foo>
Escape.shell_single_word("*") #=> #<Escape::ShellEscaped: '*'>
# File 'lib/esc.rb', line 102
def shell_single_word(str)
  return unless str
  str &&= str.to_s # Delano fix
  if str.empty?
    ShellEscaped.new_no_dup("''")
  elsif %r{\A[0-9A-Za-z+,./:=@_-]+\z} =~ str
    ShellEscaped.new(str)
  else
    result = ''
    str.scan(/('+)|[^']+/) {
      if $1
        result << %q{\'} * $1.length
      else
        result << "'#{$&}'"
      end
    }
    ShellEscaped.new_no_dup(result)
  end
end
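The quoting rules in the listing above (empty string, safe characters, runs of single quotes, everything else) can be sketched as follows. `quote_word` is a hypothetical helper that returns a plain String, and the standard library's Shellwords.split is used only as an independent check that each result parses back to exactly one word.

```ruby
require 'shellwords'

# Hypothetical helper mirroring the rules in the listing above.
def quote_word(str)
  return "''" if str.empty?                                  # empty word
  return str if str.match?(%r{\A[0-9A-Za-z+,./:=@_-]+\z})    # nothing to quote
  str.scan(/'+|[^']+/).map { |run|
    # Runs of single quotes become \' each; other runs are wrapped in '...'.
    run.start_with?("'") ? "\\'" * run.length : "'#{run}'"
  }.join
end

["", "foo", "*", "it's", "a b"].each do |word|
  quoted = quote_word(word)
  round_trip = Shellwords.split(quoted) == [word]
  puts "#{word.inspect} -> #{quoted} (single word: #{round_trip})"
end
```

For example, "it's" becomes 'it'\''s': two single-quoted runs with an escaped quote between them.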
.uri_path(str) ⇒ Object
Escape.uri_path escapes a URI path using percent-encoding. The given path should be a sequence of (non-escaped) segments separated by "/". The segments cannot contain "/". It returns an instance of PercentEncoded.
Escape.uri_path("a/b/c") #=> #<Escape::PercentEncoded: a/b/c>
Escape.uri_path("a?b/c?d/e?f") #=> #<Escape::PercentEncoded: a%3Fb/c%3Fd/e%3Ff>
The path is the part of a URI after the authority and before the query, as follows.
scheme://authority/path?query#fragment
See RFC 3986 for details of URI.
Note that this function is not appropriate for converting an OS path to a URI.
# File 'lib/esc.rb', line 160
def uri_path(str)
  s = str.gsub(%r{[^/]+}n) { uri_segment($&) }
  PercentEncoded.new_no_dup(s)
end
.uri_segment(str) ⇒ Object
Escape.uri_segment escapes URI segment using percent-encoding. It returns an instance of PercentEncoded.
Escape.uri_segment("a/b") #=> #<Escape::PercentEncoded: a%2Fb>
A segment is one of the "/"-separated elements of a URI after the authority and before the query, as follows.
scheme://authority/segment1/segment2/.../segmentN?query#fragment
See RFC 3986 for details of URI.
# File 'lib/esc.rb', line 135
def uri_segment(str)
  # pchar - pct-encoded = unreserved / sub-delims / ":" / "@"
  # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
  # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  s = str.gsub(%r{[^A-Za-z0-9\-._~!$&'()*+,;=:@]}n) {
    '%' + $&.unpack("H2")[0].upcase
  }
  PercentEncoded.new_no_dup(s)
end
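The relationship between the segment and path escapers can be sketched as below. `SEGMENT_UNSAFE`, `encode_segment`, and `encode_path` are hypothetical names; the character class is copied from the listing above, the helpers return plain Strings, and the byte-wise handling via String#b is an assumption of this sketch for non-ASCII input.

```ruby
# Everything outside unreserved / sub-delims / ":" / "@" is percent-encoded.
SEGMENT_UNSAFE = %r{[^A-Za-z0-9\-._~!\$&'()*+,;=:@]}

# Hypothetical helper: escapes one segment, including any "/" inside it.
def encode_segment(str)
  # String#b yields a binary copy so each byte is encoded separately.
  str.b.gsub(SEGMENT_UNSAFE) { |ch| format('%%%02X', ch.ord) }
end

# Hypothetical helper: treats "/" as the separator and escapes each run between.
def encode_path(str)
  str.gsub(%r{[^/]+}) { |segment| encode_segment(segment) }
end

puts encode_segment("a/b")          # a%2Fb
puts encode_segment("k=v;x:@")      # k=v;x:@  (sub-delims, ":" and "@" are kept)
puts encode_path("a/b/c")           # a/b/c
puts encode_path("a?b/c?d/e?f")     # a%3Fb/c%3Fd/e%3Ff
```

The same input "a/b" therefore means one segment to the segment escaper and two segments to the path escaper, which is why a segment containing "/" has to be escaped on its own before being joined into a path.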