Module: String::StringExtensions
- Included in:
- String
- Defined in:
- lib/ruckus/extensions/string.rb
Instance Method Summary
-
#adler ⇒ Object
A hacked up adler16 checksum, a la Andrew Tridgell.
-
#asciiz ⇒ Object
Sometimes string buffers passed through Win32 interfaces come with garbage after the trailing NUL; this method gets rid of that, like String#trim.
-
#class_name ⇒ Object
convert a string to its idiomatic ruby class name.
-
#dehexify ⇒ Object
Convert a string of raw hex characters (no %'s or anything) into binary.
-
#dehexify! ⇒ Object
Convert a string of raw hex characters (no %'s or anything) into binary in place.
- #ends_with?(x) ⇒ Boolean
-
#entropy ⇒ Object
Cribbed from Ero Carrera’s pefile; a relatively expensive entropy function, gives a float result of random-bits-per-byte.
-
#from_utf16 ⇒ Object
(also: #to_utf8, #to_ascii)
Convert a “Unicode” (Win32-style) string back to native Ruby UTF-8; get rid of any trailing NUL.
-
#from_utf16_buffer ⇒ Object
Convenience for parsing UNICODE strings from a buffer. Assumes last char ends in 00, which is not always true but works in English.
-
#hexdump(capture = false) ⇒ Object
My entry into the hexdump race.
-
#hexify ⇒ Object
Convert a string into hex characters.
-
#hexify! ⇒ Object
convert a string to hex characters in place.
-
#method_name ⇒ Object
oh, it’s exactly what it sounds like.
-
#nextstring(opts = {}) ⇒ Object
The driver function for String#strings below; really, this will run on any Enumerable that contains Fixnums.
-
#or(str) ⇒ Object
OR two strings together.
-
#pad(size, char = "\x00") ⇒ Object
I love you String#ljust.
-
#rotate_bytes(k = 0) ⇒ Object
byte rotation cypher (yes it’s been useful).
-
#shift(count = 1) ⇒ Object
Insanely useful shorthand: pop bytes off the front of a string.
- #shift_b16 ⇒ Object
- #shift_b32 ⇒ Object
- #shift_l16 ⇒ Object
- #shift_l32 ⇒ Object
-
#shift_tok(rx) ⇒ Object
"foo: bar".shift_tok /:\s*/ => "foo" # leaving "bar".
- #shift_u8 ⇒ Object
-
#starts_with?(x) ⇒ Boolean
Insane that this isn’t in the library by default.
-
#strings(opts = {}) ⇒ Object
A la Unix strings(1).
- #to_b16 ⇒ Object
- #to_b32 ⇒ Object
- #to_l16 ⇒ Object
-
#to_l32 ⇒ Object
Convert binary strings back to integers.
- #to_u8 ⇒ Object
-
#to_utf16 ⇒ Object
Convert a string to “Unicode”, ie, the way Win32 expects it, including trailing NUL.
- #underscore ⇒ Object
-
#xor(str) ⇒ Object
XOR two strings.
- #xor!(str) ⇒ Object
Instance Method Details
#adler ⇒ Object
A hacked up adler16 checksum, a la Andrew Tridgell. This is probably even slower than Ruby’s native CRC support. A weak, trivial checksum, part of rsync.
    # File 'lib/ruckus/extensions/string.rb', line 182

    def adler
      a, b = 0, 0
      0.upto(size-1) {|i| a += self[i]}
      a %= 65536
      0.upto(size-1) {|i| b += ((size-i)+1) * self[i]}
      b %= 65536
      return (a|(b<<16))
    end
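The listing treats `self[i]` as an integer, which holds on Ruby 1.8 only; from 1.9 on, `String#[]` returns a one-character string. A minimal sketch of the same arithmetic for current Ruby, assuming a standalone helper named `adler_sketch` (hypothetical, not part of the library):

```ruby
# Hypothetical helper: same sums as the listing, computed over byte values.
def adler_sketch(str)
  bytes = str.bytes
  n = bytes.size
  # a: plain byte sum, reduced modulo 2**16
  a = bytes.sum % 65536
  # b: position-weighted byte sum with the listing's (n - i) + 1 weights
  b = bytes.each_with_index.sum { |byte, i| ((n - i) + 1) * byte } % 65536
  a | (b << 16)
end
```

For example, `adler_sketch("a")` gives 97 in the low half and 194 in the high half, that is 12714081.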
#asciiz ⇒ Object
Sometimes string buffers passed through Win32 interfaces come with garbage after the trailing NUL; this method gets rid of that, like String#trim
    # File 'lib/ruckus/extensions/string.rb', line 31

    def asciiz
      begin
        self[0..self.index("\x00")-1]
      rescue
        self
      end
    end
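A minimal sketch of the same truncation without the `rescue`, assuming a standalone helper named `asciiz_sketch` (hypothetical). One behavioral difference is deliberate: when the buffer starts with NUL, the listing evaluates `self[0..-1]` and returns the whole buffer, while the sketch returns an empty string.

```ruby
# Hypothetical helper: keep everything before the first NUL byte.
def asciiz_sketch(buf)
  nul = buf.index("\x00")
  # No NUL at all: return the buffer unchanged, as the listing does.
  nul ? buf[0, nul] : buf
end
```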
#class_name ⇒ Object
convert a string to its idiomatic ruby class name
    # File 'lib/ruckus/extensions/string.rb', line 81

    def class_name
      r = ""
      up = true
      each_byte do |c|
        if c == 95
          if up
            r << "::"
          else
            up = true
          end
        else
          m = up ? :upcase : :to_s
          r << (c.chr.send(m))
          up = false
        end
      end
      r
    end
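Reading the listing, a single underscore capitalizes the next letter and a doubled underscore becomes `::`. A character-based sketch of that state machine, assuming a standalone helper named `class_name_sketch` (hypothetical):

```ruby
# Hypothetical helper: snake_case to CamelCase, with "__" mapped to "::".
def class_name_sketch(str)
  out = +""
  up = true # true when the next letter should be upcased
  str.each_char do |ch|
    if ch == "_"
      if up
        out << "::" # second underscore in a row (or a leading one)
      else
        up = true
      end
    else
      out << (up ? ch.upcase : ch)
      up = false
    end
  end
  out
end
```

Under these rules `"foo_bar"` becomes `"FooBar"` and `"foo__bar"` becomes `"Foo::Bar"`.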
#dehexify ⇒ Object
Convert a string of raw hex characters (no %'s or anything) into binary
    # File 'lib/ruckus/extensions/string.rb', line 250

    def dehexify
      (ret||="") << (me||=clone).shift(2).to_i(16).chr while not (me||=clone).empty?
      return ret
    end
#dehexify! ⇒ Object
Convert a string of raw hex characters (no %'s or anything) into binary in place
    # File 'lib/ruckus/extensions/string.rb', line 256

    def dehexify!
      (ret||="") << (me||=clone).shift(2).to_i(16).chr while not (me||=clone).empty?
      self.replace ret
    end
#ends_with?(x) ⇒ Boolean
    # File 'lib/ruckus/extensions/string.rb', line 105

    def ends_with? x
      self[-(x.size)..-1] == x
    end
#entropy ⇒ Object
Cribbed from Ero Carrera’s pefile; a relatively expensive entropy function, gives a float result of random-bits-per-byte.
    # File 'lib/ruckus/extensions/string.rb', line 111

    def entropy
      e = 0
      0.upto(255) do |i|
        x = count(i.chr)/size.to_f
        if x > 0
          e += - x * Math.log2(x)
        end
      end

      return e
    end
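The quantity computed above is the Shannon entropy of the byte distribution, between 0.0 and 8.0 bits per byte. A sketch that counts bytes in one pass rather than scanning the string once per byte value, assuming a standalone helper named `entropy_sketch` (hypothetical):

```ruby
# Hypothetical helper: Shannon entropy in bits per byte.
def entropy_sketch(data)
  return 0.0 if data.empty?
  total = data.bytesize.to_f
  counts = Hash.new(0)
  data.each_byte { |b| counts[b] += 1 }
  # Sum -p * log2(p) over the byte values that actually occur.
  counts.values.sum(0.0) do |n|
    prob = n / total
    -prob * Math.log2(prob)
  end
end
```

A string of one repeated byte scores 0.0, two equally frequent bytes score 1.0, and a buffer containing every byte value equally often scores 8.0.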
#from_utf16 ⇒ Object Also known as: to_utf8, to_ascii
Convert a “Unicode” (Win32-style) string back to native Ruby UTF-8; get rid of any trailing NUL.
    # File 'lib/ruckus/extensions/string.rb', line 13

    def from_utf16
      ret = Iconv.iconv("utf-8", "utf-16le", self).first
      if ret[-1] == 0
        ret = ret[0..-2]
      end
    end
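`Iconv` was removed from the standard library in Ruby 2.0, so the listing does not load on current interpreters. A sketch of the same round trip using `String#encode` instead (a swapped-in technique, not the library's code), with hypothetical helper names `to_utf16_sketch` and `from_utf16_sketch`. Unlike the listing, whose trailing `if` yields nil when there is no NUL to strip, the sketch always returns the converted string.

```ruby
# Hypothetical helper: UTF-8 text to raw UTF-16LE bytes plus a 16-bit NUL.
def to_utf16_sketch(str)
  str.encode("UTF-16LE").b + "\x00\x00".b
end

# Hypothetical helper: raw UTF-16LE bytes back to UTF-8, minus one trailing NUL.
def from_utf16_sketch(buf)
  buf.dup.force_encoding("UTF-16LE").encode("UTF-8").chomp("\x00")
end
```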
#from_utf16_buffer ⇒ Object
Convenience for parsing UNICODE strings from a buffer. Assumes last char ends in 00, which is not always true but works in English
    # File 'lib/ruckus/extensions/string.rb', line 24

    def from_utf16_buffer
      self[0..index("\0\0\0")+2].from_utf16
    end
#hexdump(capture = false) ⇒ Object
My entry into the hexdump race. Outputs canonical hexdump, uses StringIO for speed, could be cleaned up with “ljust”, and should probably use table lookup instead of to_s(16) method calls.
    # File 'lib/ruckus/extensions/string.rb', line 42

    def hexdump(capture=false)
      sio = StringIO.new
      rem = size - 1
      off = 0

      while rem > 0
        pbuf = ""
        pad = (15 - rem) if rem < 16
        pad ||= 0

        sio.write(("0" * (8 - (x = off.to_s(16)).size)) + x + " ")

        0.upto(15-pad) do |i|
          c = self[off]
          x = c.to_s(16)
          sio.write(("0" * (2 - x.size)) + x + " ")
          if c.printable?
            pbuf << c
          else
            pbuf << "."
          end
          off += 1
          rem -= 1
          sio.write(" ") if i == 7
        end

        sio.write("-- " * pad) if pad > 0
        sio.write(" |#{ pbuf }|\n")
      end

      sio.rewind()
      if capture
        sio.read()
      else
        puts sio.read()
      end
    end
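The description suggests that padding helpers and fewer `to_s(16)` calls would tidy this up. A compact sketch along those lines using `format`, assuming a standalone helper named `hexdump_sketch` (hypothetical) and assuming "printable" means ASCII 0x20 through 0x7e; `printable?` is defined elsewhere in the library and may differ. The layout is simplified: there is no extra gap after the eighth byte and short rows are padded with spaces rather than `--`.

```ruby
# Hypothetical helper: returns the dump as a string, 16 bytes per row.
def hexdump_sketch(data)
  rows = data.bytes.each_slice(16).with_index.map do |row, idx|
    hex = row.map { |b| format("%02x", b) }.join(" ")
    # Assumed printable range; everything else is shown as "."
    text = row.map { |b| (0x20..0x7e).cover?(b) ? b.chr : "." }.join
    # 16 two-digit bytes plus 15 separators make a 47-column hex field.
    format("%08x  %-47s  |%s|", idx * 16, hex, text)
  end
  rows.join("\n")
end
```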
#hexify ⇒ Object
Convert a string into hex characters
    # File 'lib/ruckus/extensions/string.rb', line 237

    def hexify
      l = []
      each_byte{|b| l << "%02x" % b}
      l.join
    end
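Both directions of the hex conversion can also be written with the `H*` directive of `pack` and `unpack1`. This is a swapped-in technique rather than the library's own loop, shown with hypothetical helper names; it assumes an even number of hex digits on the way back.

```ruby
# Hypothetical helper: bytes to lowercase hex digits.
def hexify_sketch(str)
  str.unpack1("H*")
end

# Hypothetical helper: hex digits to a binary string (even length assumed).
def dehexify_sketch(hex)
  [hex].pack("H*")
end
```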
#hexify! ⇒ Object
convert a string to hex characters in place
    # File 'lib/ruckus/extensions/string.rb', line 244

    def hexify!
      self.replace hexify
    end
#method_name ⇒ Object
oh, it’s exactly what it sounds like.
    # File 'lib/ruckus/extensions/string.rb', line 204

    def method_name
      r = ""
      scoped = false
      each_byte do |c|
        if c == 58
          if not scoped
            r << "_"
            scoped = true
          else
            scoped = false
          end
        else
          if r.size == 0
            r << c.chr.downcase
          else
            if c.upper?
              r << "_"
              r << c.chr.downcase
            else
              r << c.chr
            end
          end
        end
      end
      return r
    end
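Going by the listing, this converts a class-style name to snake_case: each interior capital gets a leading underscore and `::` contributes one more. A character-based sketch, assuming a standalone helper named `method_name_sketch` (hypothetical) and assuming `upper?`, which is defined elsewhere in the library, tests for ASCII A through Z:

```ruby
# Hypothetical helper: CamelCase (optionally with "::") to snake_case.
def method_name_sketch(str)
  out = +""
  scoped = false # true after the first colon of a "::" pair
  str.each_char do |ch|
    if ch == ":"
      out << "_" unless scoped
      scoped = !scoped
    elsif out.empty?
      out << ch.downcase
    elsif ch.match?(/[A-Z]/) # assumption standing in for upper?
      out << "_" << ch.downcase
    else
      out << ch
    end
  end
  out
end
```

Under these rules `"FooBar"` becomes `"foo_bar"` and `"Foo::Bar"` becomes `"foo__bar"`.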
#nextstring(opts = {}) ⇒ Object
The driver function for String#strings below; really, this will run on any Enumerable that contains Fixnums.
    # File 'lib/ruckus/extensions/string.rb', line 125

    def nextstring(opts={})
      off = opts[:offset] || 0
      sz = opts[:minimum] || 7
      u = opts[:unicode] || false
      l = size
      i = off
      while i < l
        if self[i].printable?
          start = i
          cnt = 1
          i += 1
          lastu = false
          while i < l
            if self[i].printable?
              lastu = false
              cnt += 1
              i += 1
            elsif u and self[i] == 0 and not lastu
              lastu = true
              i += 1
            else
              break
            end
          end

          return([start, i - start]) if cnt >= sz
        else
          i += 1
        end
      end

      return false, false
    end
#or(str) ⇒ Object
OR two strings together. Slow. Handles mismatched lengths by zero-extending
    # File 'lib/ruckus/extensions/string.rb', line 262

    def or(str)
      max = size < str.size ? str.size : size
      ret = ""
      0.upto(max-1) do |i|
        x = self[i] || 0
        y = str[i] || 0
        ret << (x | y).chr
      end
      return ret
    end
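Zero-extension means the shorter operand is treated as if it were padded with NUL bytes, so the result is as long as the longer input. A sketch over byte arrays, assuming a standalone helper named `or_sketch` (hypothetical):

```ruby
# Hypothetical helper: bytewise OR, shorter input padded with zero bytes.
def or_sketch(left, right)
  x = left.bytes
  y = right.bytes
  len = [x.size, y.size].max
  # Indexing past the end of an array yields nil, which stands in for 0.
  Array.new(len) { |i| (x[i] || 0) | (y[i] || 0) }.pack("C*")
end
```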
#pad(size, char = "\x00") ⇒ Object
I love you String#ljust
    # File 'lib/ruckus/extensions/string.rb', line 232

    def pad(size, char="\x00")
      ljust(size, char)
    end
#rotate_bytes(k = 0) ⇒ Object
byte rotation cypher (yes it’s been useful)
    # File 'lib/ruckus/extensions/string.rb', line 289

    def rotate_bytes(k=0) # XXX not used
      r = []
      each_byte do |b|
        r << ((b + k) % 256).chr
      end
      return r.join
    end
#shift(count = 1) ⇒ Object
Insanely useful shorthand: pop bytes off the front of a string
    # File 'lib/ruckus/extensions/string.rb', line 299

    def shift(count=1)
      return self if count == 0
      slice! 0..(count-1)
    end
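The effect is destructive: the receiver loses its first `count` bytes and the removed prefix is returned. A sketch of that behavior with `slice!(start, length)`, assuming a standalone helper named `shift_sketch!` (hypothetical) and a binary buffer. A count of zero returns an empty string here, whereas the listing returns the receiver itself in that case.

```ruby
# Hypothetical helper: remove and return the first `count` bytes of `buf`.
# `buf` is assumed to be a mutable binary string (for example from String#b).
def shift_sketch!(buf, count = 1)
  buf.slice!(0, count)
end

buf = "\x01\x02rest".b
head = shift_sketch!(buf, 2) # head holds bytes 1 and 2, buf is now "rest"
```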
#shift_b16 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 200

    def shift_b16; shift(2).to_b16; end
#shift_b32 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 198

    def shift_b32; shift(4).to_b32; end
#shift_l16 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 199

    def shift_l16; shift(2).to_l16; end
#shift_l32 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 197

    def shift_l32; shift(4).to_l32; end
#shift_tok(rx) ⇒ Object
"foo: bar".shift_tok /:\s*/ => "foo" # leaving "bar"
    # File 'lib/ruckus/extensions/string.rb', line 312

    def shift_tok(rx) # XXX not used
      src = rx.source if rx.kind_of? Regexp
      rx = Regexp.new "(#{ src })"
      idx = (self =~ rx)
      if idx
        ret = shift(idx)
        shift($1.size)
        return ret
      else
        shift(self.size)
      end
    end
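In other words, the text before the first separator match is returned, and both that text and the separator are removed from the receiver. A sketch of that using `MatchData` instead of `$1`, assuming a standalone helper named `shift_tok_sketch!` (hypothetical) that takes a mutable string and a Regexp:

```ruby
# Hypothetical helper: pop the token that precedes the first match of `rx`.
def shift_tok_sketch!(buf, rx)
  m = rx.match(buf)
  # No separator: the whole remaining buffer is the token.
  return buf.slice!(0, buf.size) unless m
  token = buf.slice!(0, m.begin(0))
  buf.slice!(0, m[0].size) # drop the separator itself
  token
end

line = +"foo: bar"
key = shift_tok_sketch!(line, /:\s*/) # key is "foo", line is now "bar"
```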
#shift_u8 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 201

    def shift_u8; shift(1).to_u8; end
#starts_with?(x) ⇒ Boolean
Insane that this isn’t in the library by default.
    # File 'lib/ruckus/extensions/string.rb', line 101

    def starts_with? x
      self[0..x.size-1] == x
    end
#strings(opts = {}) ⇒ Object
A la Unix strings(1). With a block, yields offset, string length, and contents. Otherwise returns a list. Accepts options:
- :unicode: superficial but effective Win32 Unicode support, skips NULs
- :minimum: minimum length of returned strings, a la strings -10
    # File 'lib/ruckus/extensions/string.rb', line 163

    def strings(opts={})
      ret = []
      opts[:offset] ||= 0
      while 1
        off, size = nextstring(opts)
        break if not off
        opts[:offset] += (off + size)
        if block_given?
          yield off, size, self[off,size]
        else
          ret << [off, size, self[off,size]]
        end
      end
      ret
    end
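The non-block return shape is a list of `[offset, length, contents]` triples. A sketch that produces the same shape with a regular expression scan (a swapped-in technique, not the library's scanner), assuming a standalone helper named `strings_sketch` (hypothetical), assuming "printable" means ASCII 0x20 through 0x7e, and leaving out the `:unicode` option:

```ruby
# Hypothetical helper: find runs of printable ASCII at least `minimum` long.
def strings_sketch(data, minimum: 7)
  found = []
  # Scan a binary copy so offsets are byte offsets.
  data.b.scan(/[\x20-\x7e]{#{minimum},}/n) do
    m = Regexp.last_match
    found << [m.begin(0), m[0].size, m[0]]
  end
  found
end
```

The default of 7 matches the `:minimum` default in `nextstring`.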
#to_b16 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 195

    def to_b16; unpack("n").first; end
#to_b32 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 193

    def to_b32; unpack("N").first; end
#to_l16 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 194

    def to_l16; unpack("v").first; end
#to_l32 ⇒ Object
Convert binary strings back to integers
    # File 'lib/ruckus/extensions/string.rb', line 192

    def to_l32; unpack("L").first; end
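The `to_*` family maps onto `String#unpack` directives: `n` and `N` read big-endian 16- and 32-bit values, `v` reads little-endian 16-bit. Note that `L` is native-endian, so `to_l32` as listed is little-endian only on little-endian hosts; `V` is the directive that is little-endian everywhere. A short sketch of the four byte orders, with a hypothetical helper name:

```ruby
# Hypothetical helper: decode the leading bytes of `bytes` four ways.
def decode_ints_sketch(bytes)
  {
    b16: bytes[0, 2].unpack1("n"), # big-endian 16-bit
    l16: bytes[0, 2].unpack1("v"), # little-endian 16-bit
    b32: bytes[0, 4].unpack1("N"), # big-endian 32-bit
    l32: bytes[0, 4].unpack1("V")  # little-endian 32-bit on any host
  }
end
```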
#to_u8 ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 196

    def to_u8; self[0]; end
#to_utf16 ⇒ Object
Convert a string to “Unicode”, ie, the way Win32 expects it, including trailing NUL.
    # File 'lib/ruckus/extensions/string.rb', line 7

    def to_utf16
      Iconv.iconv("utf-16LE", "utf-8", self).first + "\x00\x00"
    end
#underscore ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 304

    def underscore
      first = false
      gsub(/[a-z0-9][A-Z]/) do |m|
        "#{ m[0].chr }_#{ m[1].chr.downcase }"
      end
    end
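This method has no description; reading the listing, it inserts an underscore wherever a lowercase letter or digit is followed by a capital, and lowercases that capital. A sketch with capture groups, assuming a standalone helper named `underscore_sketch` (hypothetical):

```ruby
# Hypothetical helper: split camelCase boundaries with underscores.
def underscore_sketch(str)
  str.gsub(/([a-z0-9])([A-Z])/) { "#{$1}_#{$2.downcase}" }
end
```

A leading capital is left alone, so `"FooBar"` becomes `"Foo_bar"` while `"fooBarBaz"` becomes `"foo_bar_baz"`.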
#xor(str) ⇒ Object
XOR two strings, wrapping around if str is shorter than self.
    # File 'lib/ruckus/extensions/string.rb', line 274

    def xor(str)
      r = []
      size.times do |i|
        r << (self[i] ^ str[i % str.size]).chr
      end
      return r.join
    end
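Wrapping around means the key repeats for as long as the data lasts, which makes the operation its own inverse. A sketch over byte arrays, assuming a standalone helper named `xor_sketch` (hypothetical) and a non-empty key:

```ruby
# Hypothetical helper: XOR `data` with `key`, repeating the key as needed.
# An empty key is not handled (the modulo would divide by zero).
def xor_sketch(data, key)
  k = key.bytes
  data.bytes.each_with_index.map { |b, i| b ^ k[i % k.size] }.pack("C*")
end
```

Applying the same key twice returns the original bytes.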
#xor!(str) ⇒ Object
    # File 'lib/ruckus/extensions/string.rb', line 282

    def xor!(str)
      size.times do |i|
        self[i] ^= str[i % str.size]
      end
    end