Module: Ascii85
- Defined in: lib/ascii85.rb, lib/Ascii85/version.rb
Overview
Ascii85 is an implementation of Adobe’s binary-to-text encoding of the same name in pure Ruby.
See en.wikipedia.org/wiki/Ascii85 for more information about the format.
- Author: Johannes Holzfuß ([email protected])
- License: Distributed under the MIT License (see LICENSE file)
Defined Under Namespace
Classes: BufferedReader, BufferedWriter, DecodingError, DummyWrapper, Wrapper
Constant Summary
- VERSION = '2.0.0'
Class Method Summary
- .decode(str, out: nil) ⇒ String, IO
  Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.
- .decode_raw(str_or_io, out: nil) ⇒ String, IO
  Decodes the given raw Ascii85-encoded String or IO-like object.
- .encode(str_or_io, wrap_lines = 80, out: nil) ⇒ String, IO
  Encodes the bytes of the given String or IO-like object as Ascii85.
- .extract(str) ⇒ String
  Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.
Class Method Details
.decode(str, out: nil) ⇒ String, IO
Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.
Note: This method only accepts a String, not an IO-like object, because the entire input must be available to ensure validity.
    # File 'lib/ascii85.rb', line 191

    def decode(str, out: nil)
      decode_raw(extract(str), out: out)
    end
.decode_raw(str_or_io, out: nil) ⇒ String, IO
Decodes the given raw Ascii85-encoded String or IO-like object.
Note: The input must not be enclosed in ‘<~’ and ‘~>’ delimiters.
    # File 'lib/ascii85.rb', line 221

    def decode_raw(str_or_io, out: nil)
      reader = if io_like?(str_or_io)
                 str_or_io
               else
                 StringIO.new(str_or_io.to_s, 'rb')
               end

      # Return an unfrozen String on empty input
      return ''.dup if reader.eof?

      # Setup buffered Reader and Writers
      bufreader = BufferedReader.new(reader, encoded_chunk_size)
      bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), unencoded_chunk_size)

      # Populate the lookup table (caches the exponentiation)
      lut = (0..4).map { |count| 85**(4 - count) }

      # Decode
      word = 0
      count = 0
      wordbuf = "\0\0\0\0".dup

      bufreader.each_chunk do |chunk|
        chunk.each_byte do |c|
          case c.chr
          when ' ', "\t", "\r", "\n", "\f", "\0"
            # Ignore whitespace
            next
          when 'z'
            raise(Ascii85::DecodingError, "Found 'z' inside Ascii85 5-tuple") unless count.zero?

            # Expand z to 0-word
            bufwriter.write("\0\0\0\0")
          when '!'..'u'
            # Decode 5 characters into a 4-byte word
            word += (c - 33) * lut[count]
            count += 1

            if count == 5 && word > 0xffffffff
              raise(Ascii85::DecodingError, "Invalid Ascii85 5-tuple (#{word} >= 2**32)")
            elsif count == 5
              b3 = word & 0xff; word >>= 8
              b2 = word & 0xff; word >>= 8
              b1 = word & 0xff; word >>= 8
              b0 = word

              wordbuf.setbyte(0, b0)
              wordbuf.setbyte(1, b1)
              wordbuf.setbyte(2, b2)
              wordbuf.setbyte(3, b3)
              bufwriter.write(wordbuf)

              word = 0
              count = 0
            end
          else
            raise(Ascii85::DecodingError, "Illegal character inside Ascii85: #{c.chr.dump}")
          end
        end
      end

      # We're done if all 5-tuples have been consumed
      if count.zero?
        bufwriter.flush
        return out || bufwriter.io.string.force_encoding('ASCII-8BIT')
      end

      raise(Ascii85::DecodingError, 'Last 5-tuple consists of single character') if count == 1

      # Finish last, partially decoded 32-bit word
      count -= 1
      word += lut[count]

      bufwriter.write((word >> 24).chr) if count >= 1
      bufwriter.write(((word >> 16) & 0xff).chr) if count >= 2
      bufwriter.write(((word >> 8) & 0xff).chr) if count == 3
      bufwriter.flush

      out || bufwriter.io.string.force_encoding('ASCII-8BIT')
    end
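The per-tuple arithmetic in the listing above can be tried in isolation. The following is a minimal sketch, not the library's API: `decode_tuple` is a hypothetical helper that handles a single group of 2 to 5 characters and raises `ArgumentError` rather than `Ascii85::DecodingError`. Whitespace skipping, the `z` shortcut and buffered I/O are deliberately left out.

```ruby
# Hypothetical helper for illustration only; not part of the Ascii85 module.
# Decodes one group of 2..5 characters in '!'..'u' into 1..4 bytes.
def decode_tuple(tuple)
  count = tuple.bytesize
  raise ArgumentError, 'tuple must have 2..5 characters' unless (2..5).cover?(count)

  # Place values of the five base-85 digits, most significant first.
  lut = (0..4).map { |i| 85**(4 - i) }

  word = 0
  tuple.each_byte.with_index do |c, i|
    raise ArgumentError, "illegal character #{c.chr.dump}" unless (33..117).cover?(c)

    word += (c - 33) * lut[i]
  end

  # For a partial group, add the place value of the last digit present so
  # that the leading bytes round up correctly before truncation.
  word += lut[count - 1] if count < 5
  raise ArgumentError, 'tuple does not fit in 32 bits' if word > 0xffffffff

  # A group of n characters carries n - 1 bytes.
  [word].pack('N')[0, count - 1]
end

decode_tuple('9jqo^') # => "Man "
decode_tuple('/c')    # => "."
```

A single leftover character is rejected because it cannot carry even one full byte, which matches the "Last 5-tuple consists of single character" error in the listing.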
.encode(str_or_io, wrap_lines = 80, out: nil) ⇒ String, IO
Encodes the bytes of the given String or IO-like object as Ascii85.
    # File 'lib/ascii85.rb', line 51

    def encode(str_or_io, wrap_lines = 80, out: nil)
      reader = if io_like?(str_or_io)
                 str_or_io
               else
                 StringIO.new(str_or_io.to_s, 'rb')
               end

      return ''.dup if reader.eof?

      # Setup buffered Reader and Writers
      bufreader = BufferedReader.new(reader, unencoded_chunk_size)
      bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), encoded_chunk_size)
      writer = wrap_lines ? Wrapper.new(bufwriter, wrap_lines) : DummyWrapper.new(bufwriter)

      padding = "\0\0\0\0"
      tuplebuf = '!!!!!'.dup

      bufreader.each_chunk do |chunk|
        chunk.unpack('N*').each do |word|
          # Encode each big-endian 32-bit word into a 5-character tuple (except
          # for 0, which encodes to 'z')
          if word.zero?
            writer.write('z')
          else
            word, b0 = word.divmod(85)
            word, b1 = word.divmod(85)
            word, b2 = word.divmod(85)
            word, b3 = word.divmod(85)
            b4 = word

            tuplebuf.setbyte(0, b4 + 33)
            tuplebuf.setbyte(1, b3 + 33)
            tuplebuf.setbyte(2, b2 + 33)
            tuplebuf.setbyte(3, b1 + 33)
            tuplebuf.setbyte(4, b0 + 33)

            writer.write(tuplebuf)
          end
        end

        next if (chunk.bytesize & 0b11).zero?

        # If we have leftover bytes, we need to zero-pad to a multiple of four
        # before converting to a 32-bit word.
        padding_length = (-chunk.bytesize) % 4
        trailing = chunk[-(4 - padding_length)..]
        word = (trailing + padding[0...padding_length]).unpack1('N')

        # Encode the last word and cut off any padding
        if word.zero?
          writer.write('!!!!!'[0..(4 - padding_length)])
        else
          word, b0 = word.divmod(85)
          word, b1 = word.divmod(85)
          word, b2 = word.divmod(85)
          word, b3 = word.divmod(85)
          b4 = word

          tuplebuf.setbyte(0, b4 + 33)
          tuplebuf.setbyte(1, b3 + 33)
          tuplebuf.setbyte(2, b2 + 33)
          tuplebuf.setbyte(3, b1 + 33)
          tuplebuf.setbyte(4, b0 + 33)

          writer.write(tuplebuf[0..(4 - padding_length)])
        end
      end

      # If no output IO-object was provided, extract the encoded String from the
      # default StringIO writer. We force the encoding to 'ASCII-8BIT' to work
      # around a TruffleRuby bug.
      return writer.finish.io.string.force_encoding('ASCII-8BIT') if out.nil?

      # Otherwise we make sure to flush the output writer, and then return it.
      writer.finish.io
    end
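To make the word-to-tuple conversion in the listing above easier to follow, here is a minimal standalone sketch of the same arithmetic. `encode_word` and `encode_tail` are hypothetical helper names introduced for illustration. They are not part of the Ascii85 module, and they skip line wrapping, chunked reading and the writer classes entirely.

```ruby
# Hypothetical helpers for illustration only; not part of the Ascii85 module.

# Encode one big-endian 32-bit word as five characters in '!'..'u',
# or as the shortcut 'z' when the word is zero.
def encode_word(word)
  return 'z' if word.zero?

  digits = []
  5.times do
    word, digit = word.divmod(85)
    digits.unshift(digit) # the most significant digit ends up first
  end
  digits.map { |d| (d + 33).chr }.join
end

# Encode a trailing group of 1..3 bytes: zero-pad to four bytes, encode
# without the 'z' shortcut, then keep only (number of bytes + 1) characters.
def encode_tail(bytes)
  padding_length = 4 - bytes.bytesize
  word = (bytes + ("\0" * padding_length)).unpack1('N')
  full = word.zero? ? '!!!!!' : encode_word(word)
  full[0..(4 - padding_length)]
end

encode_word('Man '.unpack1('N')) # => "9jqo^"
encode_tail('.')                 # => "/c"
```

The trailing group never uses `z`: a padded word that happens to be zero is written as a truncated run of `!` characters, so the decoder can still tell how many bytes were present.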
.extract(str) ⇒ String
Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.
Note: This method only accepts a String, not an IO-like object, because the entire input must be available to ensure validity.
    # File 'lib/ascii85.rb', line 145

    def extract(str)
      input = str.to_s

      # Make sure the delimiter Strings have the correct encoding.
      opening_delim = '<~'.encode(input.encoding)
      closing_delim = '~>'.encode(input.encoding)

      # Get the positions of the opening/closing delimiters. If there is no pair
      # of opening/closing delimiters, return an unfrozen empty String.
      (start_pos = input.index(opening_delim)) or return ''.dup
      (end_pos = input.index(closing_delim, start_pos + 2)) or return ''.dup

      # Get the String inside the delimiter-pair
      input[(start_pos + 2)...end_pos]
    end
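The edge cases of the delimiter search are easiest to see with a few inputs. The sketch below assumes plain Ruby Strings and uses a hypothetical helper, `extract_delimited`, that mirrors the search in the listing above without requiring the gem. It omits the delimiter re-encoding step, so it is only a rough stand-in for `extract`.

```ruby
# Hypothetical helper for illustration only; not part of the Ascii85 module.
def extract_delimited(text)
  start_pos = text.index('<~')
  return ''.dup if start_pos.nil?

  # The closing delimiter is searched for only after the opening one,
  # so the '~' of '<~' can never double as the start of '~>'.
  end_pos = text.index('~>', start_pos + 2)
  return ''.dup if end_pos.nil?

  text[(start_pos + 2)...end_pos]
end

extract_delimited('see <~9jqo^~> and <~zz~>') # => "9jqo^"
extract_delimited('~> <~abc~>')               # => "abc"
extract_delimited('<~>')                      # => ""
extract_delimited('no delimiters here')       # => ""
```

Only the first delimited block is returned, and a stray `~>` before the opening delimiter is ignored.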