Module: Ascii85

Defined in:
lib/ascii85.rb,
lib/Ascii85/version.rb

Overview

Ascii85 is a pure-Ruby implementation of Adobe’s binary-to-text encoding of the same name.

See en.wikipedia.org/wiki/Ascii85 for more information about the format.

Author

Johannes Holzfuß

License

Distributed under the MIT License (see LICENSE file)

Defined Under Namespace

Classes: BufferedReader, BufferedWriter, DecodingError, DummyWrapper, Wrapper

Constant Summary

VERSION = '2.0.0'

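Class Method Summary

  • .decode(str, out: nil) ⇒ String, IO

    Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.

  • .decode_raw(str_or_io, out: nil) ⇒ String, IO

    Decodes the given raw Ascii85-encoded String or IO-like object.

  • .encode(str_or_io, wrap_lines = 80, out: nil) ⇒ String, IO

    Encodes the bytes of the given String or IO-like object as Ascii85.

  • .extract(str) ⇒ String

    Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.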

Class Method Details

.decode(str, out: nil) ⇒ String, IO

Note:

This method only accepts a String, not an IO-like object, as the entire input needs to be available to ensure validity.

Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.

Examples:

Decoding Ascii85 content

Ascii85.decode("<~;KZGo~>")
# => "Ruby"

Decoding with multiple Ascii85 blocks present (ignores all but the first)

Ascii85.decode("Foo<~;KZGo~>Bar<~87cURDZ~>Baz")
# => "Ruby"

When no delimiters are found

Ascii85.decode("No delimiters")
# => ""

Decoding to an IO object

output = StringIO.new
Ascii85.decode("<~;KZGo~>", out: output)
# => output (with "Ruby" written to it)

Parameters:

  • str (String)

    The String containing Ascii85-encoded content

  • out (IO, nil) (defaults to: nil)

    An optional IO-like object to write the output to

Returns:

  • (String, IO)

    The decoded String (in ASCII-8BIT encoding) or the output IO object (if it was provided)

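Raises:

  • (Ascii85::DecodingError)

    If the content between the delimiters is not valid Ascii85 (raised by decode_raw)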



# File 'lib/ascii85.rb', line 191

def decode(str, out: nil)
  decode_raw(extract(str), out: out)
end

.decode_raw(str_or_io, out: nil) ⇒ String, IO

Note:

The input must not be enclosed in ‘<~’ and ‘~>’ delimiters.

Decodes the given raw Ascii85-encoded String or IO-like object.

Examples:

Decoding a raw Ascii85 String

Ascii85.decode_raw(";KZGo")
# => "Ruby"

Decoding from an IO-like object

input = StringIO.new(";KZGo")
Ascii85.decode_raw(input)
# => "Ruby"

Decoding to an IO object

output = StringIO.new
Ascii85.decode_raw(";KZGo", out: output)
# => output (with "Ruby" written to it)
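The arithmetic behind a single full group can be sketched in isolation. The helper below is hypothetical (decode_tuple is not part of the Ascii85 module) and only mirrors the per-tuple steps of the source listing: each character contributes a base-85 digit (its byte value minus 33), and the resulting 32-bit word is emitted as four big-endian bytes.

```ruby
# Hypothetical helper for illustration only; not part of the Ascii85 API.
# Decodes exactly one full 5-character group into 4 bytes.
def decode_tuple(tuple)
  raise ArgumentError, 'expected exactly 5 characters' unless tuple.bytesize == 5

  # Accumulate the five base-85 digits, most significant first.
  word = tuple.each_byte.reduce(0) do |acc, c|
    raise ArgumentError, "character out of range: #{c.chr.dump}" unless (33..117).cover?(c)

    acc * 85 + (c - 33)
  end

  # Five base-85 digits can exceed 32 bits; such tuples are invalid.
  raise ArgumentError, 'tuple does not fit into 32 bits' if word > 0xffffffff

  # Emit the word as four big-endian bytes (binary String).
  [word].pack('N')
end

decode_tuple(';KZGo') # => "Ruby"
decode_tuple('s8W-!') # => "\xFF\xFF\xFF\xFF"
```

For ";KZGo" the digits are 26, 42, 57, 38 and 78, which sum to 1383424633 (0x52756279), the bytes of "Ruby". The sketch ignores whitespace skipping and the 'z' shortcut that decode_raw handles.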

Parameters:

  • str_or_io (String, IO)

    The Ascii85-encoded input to decode

  • out (IO, nil) (defaults to: nil)

    An optional IO-like object to write the output to

Returns:

  • (String, IO)

    The decoded String (in ASCII-8BIT encoding) or the output IO object (if it was provided)

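Raises:

  • (Ascii85::DecodingError)

    If the input contains an illegal character, a 'z' inside a 5-tuple, a 5-tuple whose value exceeds 32 bits, or a final 5-tuple consisting of a single character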



# File 'lib/ascii85.rb', line 221

def decode_raw(str_or_io, out: nil)
  reader = if io_like?(str_or_io)
             str_or_io
           else
             StringIO.new(str_or_io.to_s, 'rb')
           end

  # Return an unfrozen String on empty input
  return ''.dup if reader.eof?

  # Setup buffered Reader and Writers
  bufreader = BufferedReader.new(reader, encoded_chunk_size)
  bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), unencoded_chunk_size)

  # Populate the lookup table (caches the exponentiation)
  lut = (0..4).map { |count| 85**(4 - count) }

  # Decode
  word   = 0
  count  = 0
  wordbuf = "\0\0\0\0".dup

  bufreader.each_chunk do |chunk|
    chunk.each_byte do |c|
      case c.chr
      when ' ', "\t", "\r", "\n", "\f", "\0"
        # Ignore whitespace
        next

      when 'z'
        raise(Ascii85::DecodingError, "Found 'z' inside Ascii85 5-tuple") unless count.zero?

        # Expand z to 0-word
        bufwriter.write("\0\0\0\0")

      when '!'..'u'
        # Decode 5 characters into a 4-byte word
        word  += (c - 33) * lut[count]
        count += 1

        if count == 5 && word > 0xffffffff
          raise(Ascii85::DecodingError, "Invalid Ascii85 5-tuple (#{word} >= 2**32)")
        elsif count == 5
          b3 = word & 0xff; word >>= 8
          b2 = word & 0xff; word >>= 8
          b1 = word & 0xff; word >>= 8
          b0 = word

          wordbuf.setbyte(0, b0)
          wordbuf.setbyte(1, b1)
          wordbuf.setbyte(2, b2)
          wordbuf.setbyte(3, b3)

          bufwriter.write(wordbuf)

          word  = 0
          count = 0
        end

      else
        raise(Ascii85::DecodingError, "Illegal character inside Ascii85: #{c.chr.dump}")
      end
    end
  end

  # We're done if all 5-tuples have been consumed
  if count.zero?
    bufwriter.flush
    return out || bufwriter.io.string.force_encoding('ASCII-8BIT')
  end

  raise(Ascii85::DecodingError, 'Last 5-tuple consists of single character') if count == 1

  # Finish last, partially decoded 32-bit word
  count -= 1
  word  += lut[count]

  bufwriter.write((word >> 24).chr) if count >= 1
  bufwriter.write(((word >> 16) & 0xff).chr) if count >= 2
  bufwriter.write(((word >> 8) & 0xff).chr) if count == 3
  bufwriter.flush

  out || bufwriter.io.string.force_encoding('ASCII-8BIT')
end
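The handling of a trailing, incomplete group at the end of the listing above can be illustrated on its own. This is a minimal sketch with hypothetical names (POWERS and decode_partial_tuple do not exist in the library): a group of n characters (2 to 4) yields n - 1 bytes, and the word is rounded up by one unit of the last digit's place value before the high bytes are taken, as the listing does with lut[count].

```ruby
# Hypothetical names for illustration only; not part of the Ascii85 API.
# Place values of the five base-85 digits, most significant first.
POWERS = (0..4).map { |i| 85**(4 - i) }.freeze

# Decodes a trailing group of 2 to 4 characters into 1 to 3 bytes.
def decode_partial_tuple(tuple)
  count = tuple.bytesize
  raise ArgumentError, 'expected 2 to 4 characters' unless (2..4).cover?(count)

  word = 0
  tuple.each_byte.with_index { |c, i| word += (c - 33) * POWERS[i] }

  # Round up by the place value of the last digit read, which compensates
  # for the digits the encoder cut off.
  word += POWERS[count - 1]

  # Keep only the count - 1 most significant bytes.
  bytes = [(word >> 24) & 0xff, (word >> 16) & 0xff, (word >> 8) & 0xff]
  bytes.first(count - 1).pack('C*')
end

decode_partial_tuple('DZ')   # => "o"
decode_partial_tuple(';KZF') # => "Rub"
```

The sketch skips input validation; in the library, a single leftover character raises Ascii85::DecodingError, as shown in the listing.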

.encode(str_or_io, wrap_lines = 80, out: nil) ⇒ String, IO

Encodes the bytes of the given String or IO-like object as Ascii85.

Examples:

Encoding a simple String

Ascii85.encode("Ruby")
# => "<~;KZGo~>"

Encoding with line wrapping

Ascii85.encode("Supercalifragilisticexpialidocious", 15)
# => <~;g!%jEarNoBkD
#    BoB5)0rF*),+AU&
#    0.@;KXgDe!L"F`R
#    ~>

Encoding without line wrapping

Ascii85.encode("Supercalifragilisticexpialidocious", false)
# => "<~;g!%jEarNoBkDBoB5)0rF*),+AU&0.@;KXgDe!L\"F`R~>"

Encoding from an IO-like object

input = StringIO.new("Ruby")
Ascii85.encode(input)
# => "<~;KZGo~>"

Encoding to an IO object

output = StringIO.new
Ascii85.encode("Ruby", out: output)
# => output (with "<~;KZGo~>" written to it)
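To see where the characters in these outputs come from, the conversion of one 32-bit word can be sketched separately. The helper name encode_word is hypothetical and not part of the module; it follows the same repeated divmod(85) steps as the source listing, including the 'z' shortcut for an all-zero word.

```ruby
# Hypothetical helper for illustration only; not part of the Ascii85 API.
# Encodes one big-endian 32-bit word as a 5-character group.
def encode_word(word)
  # An all-zero word is abbreviated to a single 'z'.
  return 'z' if word.zero?

  digits = []
  5.times do
    # Peel off base-85 digits, least significant first, and shift each
    # into the printable range starting at '!' (byte value 33).
    word, digit = word.divmod(85)
    digits.unshift(digit + 33)
  end
  digits.pack('C*')
end

encode_word('Ruby'.unpack1('N')) # => ";KZGo"
encode_word(0)                   # => "z"
```

The sketch covers full 4-byte words only; delimiters, line wrapping and trailing bytes are left to the library.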

Parameters:

  • str_or_io (String, IO)

    The input to encode

  • wrap_lines (Integer, false) (defaults to: 80)

    The line length for wrapping, or false for no wrapping

  • out (IO, nil) (defaults to: nil)

    An optional IO-like object to write the output to

Returns:

  • (String, IO)

    The encoded String or the output IO object that was passed in



# File 'lib/ascii85.rb', line 51

def encode(str_or_io, wrap_lines = 80, out: nil)
  reader = if io_like?(str_or_io)
             str_or_io
           else
             StringIO.new(str_or_io.to_s, 'rb')
           end

  return ''.dup if reader.eof?

  # Setup buffered Reader and Writers
  bufreader = BufferedReader.new(reader, unencoded_chunk_size)
  bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), encoded_chunk_size)
  writer = wrap_lines ? Wrapper.new(bufwriter, wrap_lines) : DummyWrapper.new(bufwriter)

  padding = "\0\0\0\0"
  tuplebuf = '!!!!!'.dup

  bufreader.each_chunk do |chunk|
    chunk.unpack('N*').each do |word|
      # Encode each big-endian 32-bit word into a 5-character tuple (except
      # for 0, which encodes to 'z')
      if word.zero?
        writer.write('z')
      else
        word, b0 = word.divmod(85)
        word, b1 = word.divmod(85)
        word, b2 = word.divmod(85)
        word, b3 = word.divmod(85)
        b4 = word

        tuplebuf.setbyte(0, b4 + 33)
        tuplebuf.setbyte(1, b3 + 33)
        tuplebuf.setbyte(2, b2 + 33)
        tuplebuf.setbyte(3, b1 + 33)
        tuplebuf.setbyte(4, b0 + 33)

        writer.write(tuplebuf)
      end
    end

    next if (chunk.bytesize & 0b11).zero?

    # If we have leftover bytes, we need to zero-pad to a multiple of four
    # before converting to a 32-bit word.
    padding_length = (-chunk.bytesize) % 4
    trailing = chunk[-(4 - padding_length)..]
    word = (trailing + padding[0...padding_length]).unpack1('N')

    # Encode the last word and cut off any padding
    if word.zero?
      writer.write('!!!!!'[0..(4 - padding_length)])
    else
      word, b0 = word.divmod(85)
      word, b1 = word.divmod(85)
      word, b2 = word.divmod(85)
      word, b3 = word.divmod(85)
      b4 = word

      tuplebuf.setbyte(0, b4 + 33)
      tuplebuf.setbyte(1, b3 + 33)
      tuplebuf.setbyte(2, b2 + 33)
      tuplebuf.setbyte(3, b1 + 33)
      tuplebuf.setbyte(4, b0 + 33)

      writer.write(tuplebuf[0..(4 - padding_length)])
    end
  end

  # If no output IO-object was provided, extract the encoded String from the
  # default StringIO writer. We force the encoding to 'ASCII-8BIT' to work
  # around a TruffleRuby bug.
  return writer.finish.io.string.force_encoding('ASCII-8BIT') if out.nil?

  # Otherwise we make sure to flush the output writer, and then return it.
  writer.finish.io
end
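The padding step for leftover bytes in the listing above can be tried out with a small standalone sketch. The name encode_tail is hypothetical (it is not a library method): 1 to 3 trailing bytes are zero-padded to a full word, encoded as usual, and then truncated so that n input bytes produce n + 1 characters.

```ruby
# Hypothetical helper for illustration only; not part of the Ascii85 API.
# Encodes 1 to 3 trailing bytes as a shortened group.
def encode_tail(tail)
  raise ArgumentError, 'expected 1 to 3 bytes' unless (1..3).cover?(tail.bytesize)

  # Zero-pad to four bytes and read them as one big-endian 32-bit word.
  padding_length = 4 - tail.bytesize
  word = (tail.b + ("\0" * padding_length)).unpack1('N')

  digits = []
  5.times do
    word, digit = word.divmod(85)
    digits.unshift(digit + 33)
  end

  # Drop as many characters as padding bytes were added.
  digits.pack('C*')[0, 5 - padding_length]
end

encode_tail('o')   # => "DZ"
encode_tail('Rub') # => ";KZF"
encode_tail("\0")  # => "!!"
```

As in the listing, the 'z' abbreviation is not applied to a padded tail, so a trailing zero byte becomes "!!" rather than "z".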

.extract(str) ⇒ String

Note:

This method only accepts a String, not an IO-like object, as the entire input needs to be available to ensure validity.

Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.

Examples:

Extracting Ascii85 content

Ascii85.extract("Foo<~;KZGo~>Bar<~z~>Baz")
# => ";KZGo"

When no delimiters are found

Ascii85.extract("No delimiters")
# => ""

Parameters:

  • str (String)

    The String to search through

Returns:

  • (String)

    The extracted substring, or an empty String if no valid delimiters are found



# File 'lib/ascii85.rb', line 145

def extract(str)
  input = str.to_s

  # Make sure the delimiter Strings have the correct encoding.
  opening_delim = '<~'.encode(input.encoding)
  closing_delim = '~>'.encode(input.encoding)

  # Get the positions of the opening/closing delimiters. If there is no pair
  # of opening/closing delimiters, return an unfrozen empty String.
  (start_pos = input.index(opening_delim))                or return ''.dup
  (end_pos   = input.index(closing_delim, start_pos + 2)) or return ''.dup

  # Get the String inside the delimiter-pair
  input[(start_pos + 2)...end_pos]
end
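The delimiter search has a few edge cases that are easy to miss. The sketch below uses a hypothetical stand-in, first_delimited, that follows the same String#index logic as the listing above (without the encoding conversion of the delimiters) so it can be run without the library.

```ruby
# Hypothetical stand-in for illustration only; mirrors the index-based
# search of the listing above but is not part of the Ascii85 API.
def first_delimited(input)
  start_pos = input.index('<~')
  return '' if start_pos.nil?

  # The closing delimiter is only searched for after the opening one.
  end_pos = input.index('~>', start_pos + 2)
  return '' if end_pos.nil?

  input[(start_pos + 2)...end_pos]
end

first_delimited('Foo<~;KZGo~>Bar<~z~>Baz') # => ";KZGo"
first_delimited('~>x<~abc~>')              # => "abc"
first_delimited('<~unterminated')          # => ""
first_delimited('<~>')                     # => ""
```

In the last case the '~' is consumed by the opening delimiter, so no closing delimiter is found and the result is empty.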