Module: Ascii85

Defined in: lib/ascii85.rb,
lib/Ascii85/version.rb

Overview

Ascii85 is a pure-Ruby implementation of Adobe’s binary-to-text encoding of the same name.

See en.wikipedia.org/wiki/Ascii85 for more information about the format.

Author: Johannes Holzfuß ([email protected])
License: Distributed under the MIT License (see LICENSE file)

Defined Under Namespace

Classes: BufferedReader, BufferedWriter, DecodingError, DummyWrapper, Wrapper

Constant Summary

VERSION = '2.0.0'

Class Method Summary

.decode(str, out: nil) ⇒ String, IO

Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.

.decode_raw(str_or_io, out: nil) ⇒ String, IO

Decodes the given raw Ascii85-encoded String or IO-like object.

.encode(str_or_io, wrap_lines = 80, out: nil) ⇒ String, IO

Encodes the bytes of the given String or IO-like object as Ascii85.

.extract(str) ⇒ String

Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.

Class Method Details

.decode(str, out: nil) ⇒ `String`, `IO`

Note:

This method only accepts a String, not an IO-like object, as the entire input needs to be available to ensure validity.

Searches through a String and decodes the first substring enclosed by ‘<~’ and ‘~>’.

Examples:

Decoding Ascii85 content

Ascii85.decode("<~;KZGo~>")
# => "Ruby"

Decoding with multiple Ascii85 blocks present (ignores all but the first)

Ascii85.decode("Foo<~;KZGo~>Bar<~87cURDZ~>Baz")
# => "Ruby"

When no delimiters are found

Ascii85.decode("No delimiters")
# => ""

Decoding to an IO object

output = StringIO.new
Ascii85.decode("<~;KZGo~>", out: output)
# => output (with "Ruby" written to it)

Parameters:

str (String) —

The String containing Ascii85-encoded content
out (IO, nil) (defaults to: nil) —

An optional IO-like object to write the output to

Returns:

(String, IO) —

The decoded String (in ASCII-8BIT encoding) or the output IO object (if it was provided)

Raises:

(Ascii85::DecodingError) —

When malformed input is encountered



# File 'lib/ascii85.rb', line 191

def decode(str, out: nil)
  decode_raw(extract(str), out: out)
end

.decode_raw(str_or_io, out: nil) ⇒ `String`, `IO`

Note:

The input must not be enclosed in ‘<~’ and ‘~>’ delimiters.

Decodes the given raw Ascii85-encoded String or IO-like object.

Examples:

Decoding a raw Ascii85 String

Ascii85.decode_raw(";KZGo")
# => "Ruby"

Decoding from an IO-like object

input = StringIO.new(";KZGo")
Ascii85.decode_raw(input)
# => "Ruby"

Decoding to an IO object

output = StringIO.new
Ascii85.decode_raw(";KZGo", out: output)
# => output (with "Ruby" written to it)

Parameters:

str_or_io (String, IO) —

The Ascii85-encoded input to decode
out (IO, nil) (defaults to: nil) —

An optional IO-like object to write the output to

Returns:

(String, IO) —

The decoded String (in ASCII-8BIT encoding) or the output IO object (if it was provided)

Raises:

(Ascii85::DecodingError) —

When malformed input is encountered

# File 'lib/ascii85.rb', line 221

def decode_raw(str_or_io, out: nil)
  reader = if io_like?(str_or_io)
             str_or_io
           else
             StringIO.new(str_or_io.to_s, 'rb')
           end

  # Return an unfrozen String on empty input
  return ''.dup if reader.eof?

  # Setup buffered Reader and Writers
  bufreader = BufferedReader.new(reader, encoded_chunk_size)
  bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), unencoded_chunk_size)

  # Populate the lookup table (caches the exponentiation)
  lut = (0..4).map { |count| 85**(4 - count) }

  # Decode
  word   = 0
  count  = 0
  wordbuf = "\0\0\0\0".dup

  bufreader.each_chunk do |chunk|
    chunk.each_byte do |c|
      case c.chr
      when ' ', "\t", "\r", "\n", "\f", "\0"
        # Ignore whitespace
        next

      when 'z'
        raise(Ascii85::DecodingError, "Found 'z' inside Ascii85 5-tuple") unless count.zero?

        # Expand z to 0-word
        bufwriter.write("\0\0\0\0")

      when '!'..'u'
        # Decode 5 characters into a 4-byte word
        word  += (c - 33) * lut[count]
        count += 1

        if count == 5 && word > 0xffffffff
          raise(Ascii85::DecodingError, "Invalid Ascii85 5-tuple (#{word} >= 2**32)")
        elsif count == 5
          b3 = word & 0xff; word >>= 8
          b2 = word & 0xff; word >>= 8
          b1 = word & 0xff; word >>= 8
          b0 = word

          wordbuf.setbyte(0, b0)
          wordbuf.setbyte(1, b1)
          wordbuf.setbyte(2, b2)
          wordbuf.setbyte(3, b3)

          bufwriter.write(wordbuf)

          word  = 0
          count = 0
        end

      else
        raise(Ascii85::DecodingError, "Illegal character inside Ascii85: #{c.chr.dump}")
      end
    end
  end

  # We're done if all 5-tuples have been consumed
  if count.zero?
    bufwriter.flush
    return out || bufwriter.io.string.force_encoding('ASCII-8BIT')
  end

  raise(Ascii85::DecodingError, 'Last 5-tuple consists of single character') if count == 1

  # Finish last, partially decoded 32-bit word
  count -= 1
  word  += lut[count]

  bufwriter.write((word >> 24).chr) if count >= 1
  bufwriter.write(((word >> 16) & 0xff).chr) if count >= 2
  bufwriter.write(((word >> 8) & 0xff).chr) if count == 3
  bufwriter.flush

  out || bufwriter.io.string.force_encoding('ASCII-8BIT')
end
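
The core step of the decoder above, turning one full 5-character group into a 32-bit word, can be illustrated in isolation. The following is a minimal sketch rather than the library's code: `decode_group` is a hypothetical helper, it handles only complete groups, and it raises `ArgumentError` instead of `Ascii85::DecodingError` so that it runs without the gem.

```ruby
# Hypothetical helper, not part of the Ascii85 module: decodes one full
# 5-character Ascii85 group into 4 bytes.
def decode_group(tuple)
  raise ArgumentError, 'expected exactly 5 characters' unless tuple.bytesize == 5

  # Accumulate the five base-85 digits, most significant first.
  # Each character in '!'..'u' (bytes 33..117) stands for a digit 0..84.
  word = tuple.each_byte.reduce(0) do |acc, c|
    raise ArgumentError, "illegal character: #{c.chr.dump}" unless (33..117).cover?(c)

    acc * 85 + (c - 33)
  end

  # Five base-85 digits can express values beyond 32 bits; reject those.
  raise ArgumentError, 'group exceeds 32 bits' if word > 0xffffffff

  # Pack as one big-endian 32-bit word, which yields a binary String.
  [word].pack('N')
end

decode_group(';KZGo') # => "Ruby"
```

The sketch leaves out whitespace skipping, the 'z' shortcut, and the trailing partial group, all of which the method above handles.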

.encode(str_or_io, wrap_lines = 80, out: nil) ⇒ `String`, `IO`

Encodes the bytes of the given String or IO-like object as Ascii85.

Examples:

Encoding a simple String

Ascii85.encode("Ruby")
# => "<~;KZGo~>"

Encoding with line wrapping

Ascii85.encode("Supercalifragilisticexpialidocious", 15)
# => <~;g!%jEarNoBkD
#    BoB5)0rF*),+AU&
#    0.@;KXgDe!L"F`R
#    ~>

Encoding without line wrapping

Ascii85.encode("Supercalifragilisticexpialidocious", false)
# => <~;g!%jEarNoBkDBoB5)0rF*),+AU&0.@;KXgDe!L"F`R~>

Encoding from an IO-like object

input = StringIO.new("Ruby")
Ascii85.encode(input)
# => "<~;KZGo~>"

Encoding to an IO object

output = StringIO.new
Ascii85.encode("Ruby", out: output)
# => output (with "<~;KZGo~>" written to it)

Parameters:

str_or_io (String, IO) —

The input to encode
wrap_lines (Integer, false) (defaults to: 80) —

The line length for wrapping, or false for no wrapping
out (IO, nil) (defaults to: nil) —

An optional IO-like object to write the output to

Returns:

(String, IO) —

The encoded String or the output IO object that was passed in
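
The wrapped output in the examples above can be reproduced by cutting the delimited string into fixed-width lines. This is a rough sketch under stated assumptions, not the library's `Wrapper` class: `wrap_ascii85` is a hypothetical helper, and it uses naive chunking, so it may split the closing `~>` across two lines where the library might not.

```ruby
# Hypothetical helper, not the library's Wrapper class: breaks an already
# delimited Ascii85 string into lines of at most `width` characters.
def wrap_ascii85(delimited, width)
  # A falsy width means "no wrapping", mirroring wrap_lines = false.
  return delimited unless width

  # Take consecutive runs of up to `width` characters. The opening '<~'
  # counts toward the length of the first line.
  delimited.scan(/.{1,#{width}}/m).join("\n")
end

encoded = '<~;g!%jEarNoBkDBoB5)0rF*),+AU&0.@;KXgDe!L"F`R~>'
puts wrap_ascii85(encoded, 15)
```

With a width of 15 this prints the same four lines as the "Encoding with line wrapping" example.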

# File 'lib/ascii85.rb', line 51

def encode(str_or_io, wrap_lines = 80, out: nil)
  reader = if io_like?(str_or_io)
             str_or_io
           else
             StringIO.new(str_or_io.to_s, 'rb')
           end

  return ''.dup if reader.eof?

  # Setup buffered Reader and Writers
  bufreader = BufferedReader.new(reader, unencoded_chunk_size)
  bufwriter = BufferedWriter.new(out || StringIO.new(String.new, 'wb'), encoded_chunk_size)
  writer = wrap_lines ? Wrapper.new(bufwriter, wrap_lines) : DummyWrapper.new(bufwriter)

  padding = "\0\0\0\0"
  tuplebuf = '!!!!!'.dup

  bufreader.each_chunk do |chunk|
    chunk.unpack('N*').each do |word|
      # Encode each big-endian 32-bit word into a 5-character tuple (except
      # for 0, which encodes to 'z')
      if word.zero?
        writer.write('z')
      else
        word, b0 = word.divmod(85)
        word, b1 = word.divmod(85)
        word, b2 = word.divmod(85)
        word, b3 = word.divmod(85)
        b4 = word

        tuplebuf.setbyte(0, b4 + 33)
        tuplebuf.setbyte(1, b3 + 33)
        tuplebuf.setbyte(2, b2 + 33)
        tuplebuf.setbyte(3, b1 + 33)
        tuplebuf.setbyte(4, b0 + 33)

        writer.write(tuplebuf)
      end
    end

    next if (chunk.bytesize & 0b11).zero?

    # If we have leftover bytes, we need to zero-pad to a multiple of four
    # before converting to a 32-bit word.
    padding_length = (-chunk.bytesize) % 4
    trailing = chunk[-(4 - padding_length)..]
    word = (trailing + padding[0...padding_length]).unpack1('N')

    # Encode the last word and cut off any padding
    if word.zero?
      writer.write('!!!!!'[0..(4 - padding_length)])
    else
      word, b0 = word.divmod(85)
      word, b1 = word.divmod(85)
      word, b2 = word.divmod(85)
      word, b3 = word.divmod(85)
      b4 = word

      tuplebuf.setbyte(0, b4 + 33)
      tuplebuf.setbyte(1, b3 + 33)
      tuplebuf.setbyte(2, b2 + 33)
      tuplebuf.setbyte(3, b1 + 33)
      tuplebuf.setbyte(4, b0 + 33)

      writer.write(tuplebuf[0..(4 - padding_length)])
    end
  end

  # If no output IO-object was provided, extract the encoded String from the
  # default StringIO writer. We force the encoding to 'ASCII-8BIT' to work
  # around a TruffleRuby bug.
  return writer.finish.io.string.force_encoding('ASCII-8BIT') if out.nil?

  # Otherwise we make sure to flush the output writer, and then return it.
  writer.finish.io
end
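
The per-word arithmetic in the encoder above, including the zero-padding of a short final group, can be shown on its own. This is a minimal sketch, not the library's implementation: `encode_group` is a hypothetical helper that handles a single group of 1 to 4 bytes and does no buffering, wrapping, or delimiting.

```ruby
# Hypothetical helper, not part of the Ascii85 module: encodes 1 to 4 bytes
# as a single Ascii85 group.
def encode_group(bytes)
  raise ArgumentError, 'expected 1 to 4 bytes' unless (1..4).cover?(bytes.bytesize)

  padding_length = 4 - bytes.bytesize

  # Zero-pad to 4 bytes and read them as one big-endian 32-bit word.
  word = (bytes.b + ("\0" * padding_length)).unpack1('N')

  # Only a full all-zero group is abbreviated to 'z'.
  return 'z' if word.zero? && padding_length.zero?

  # Peel off five base-85 digits, least significant first.
  digits = Array.new(5) do
    word, digit = word.divmod(85)
    digit
  end

  # Most significant digit first, shifted into the '!'..'u' range.
  tuple = digits.reverse.map { |d| (d + 33).chr }.join

  # Drop one output character for every padding byte that was added.
  tuple[0, 5 - padding_length]
end

encode_group('Ruby') # => ";KZGo"
encode_group('us')   # => "F`R"
```

The second call matches the last three characters of the "Supercalifragilisticexpialidocious" example, whose final two bytes form a partial group.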

.extract(str) ⇒ `String`

Note:

This method only accepts a String, not an IO-like object, as the entire input needs to be available to ensure validity.

Searches through a String and extracts the first substring enclosed by ‘<~’ and ‘~>’.

Examples:

Extracting Ascii85 content

Ascii85.extract("Foo<~;KZGo~>Bar<~z~>Baz")
# => ";KZGo"

When no delimiters are found

Ascii85.extract("No delimiters")
# => ""

Parameters:

str (String) —

The String to search through

Returns:

(String) —

The extracted substring, or an empty String if no valid delimiters are found

# File 'lib/ascii85.rb', line 145

def extract(str)
  input = str.to_s

  # Make sure the delimiter Strings have the correct encoding.
  opening_delim = '<~'.encode(input.encoding)
  closing_delim = '~>'.encode(input.encoding)

  # Get the positions of the opening/closing delimiters. If there is no pair
  # of opening/closing delimiters, return an unfrozen empty String.
  (start_pos = input.index(opening_delim))                or return ''.dup
  (end_pos   = input.index(closing_delim, start_pos + 2)) or return ''.dup

  # Get the String inside the delimiter-pair
  input[(start_pos + 2)...end_pos]
end
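
For comparison, the same "first delimited block" search can be written with a regular expression. This is a swapped-in technique rather than what the library does: `extract_first_block` is a hypothetical helper, and unlike the method above it does not re-encode the delimiters, so it assumes an ASCII-compatible input encoding.

```ruby
# Hypothetical helper, not part of the Ascii85 module: returns the content of
# the first '<~' ... '~>' pair, or an empty String if there is none.
def extract_first_block(text)
  # The lazy quantifier stops at the first closing delimiter; the /m flag
  # lets '.' match newlines so that wrapped blocks are found as well.
  text[/<~(.*?)~>/m, 1] || ''
end

extract_first_block('Foo<~;KZGo~>Bar<~z~>Baz') # => ";KZGo"
extract_first_block('No delimiters')           # => ""
extract_first_block('<~>')                     # => ""
```

The last call shows that the two delimiters may not share the '~' character, which corresponds to the `start_pos + 2` offset in the method above.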
