Module: ZipTricks::BlockDeflate

Defined in:
lib/zip_tricks/block_deflate.rb

Overview

Permits Deflate compression in independent blocks. The workflow is as follows:

  • Run every block you want to compress through deflate_chunk, which returns the compressed body with the zlib header, the end marker and the Adler-32 checksum already removed
  • Write out the compressed block bodies (the ones deflate_chunk returns) to your output, in sequence
  • Write out the end marker (\x3\x0)
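
The three steps above can be sketched with nothing but Ruby's standard Zlib. The lambda deflate_chunk_sketch and the variable end_marker are hypothetical stand-ins that mirror what this page documents for deflate_chunk and END_MARKER; they are not the library's own code. The sketch only uses the default compression level.

```ruby
require 'zlib'

# Hypothetical stand-in for deflate_chunk: deflate with a sync flush, finish,
# then drop the 2-byte zlib header and the trailing 6 bytes
# (2-byte end marker + 4-byte Adler-32).
deflate_chunk_sketch = lambda do |bytes|
  z = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION)
  blob = z.deflate(bytes, Zlib::SYNC_FLUSH)
  blob << z.finish
  z.close
  blob[2...-6]
end

end_marker = [3, 0].pack('C*')

chunks = ['first block ' * 100, 'second block ' * 100, 'third block ' * 100]

# Step 1: compress every block independently of the others.
bodies = chunks.map { |chunk| deflate_chunk_sketch.call(chunk) }

# Steps 2 and 3: write the bodies out in sequence, then the end marker.
stream = bodies.join + end_marker

# A raw (headerless) inflater reads the result as a single deflate stream.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored = inflater.inflate(stream)
complete = inflater.finished?
inflater.close
```

Because each body ends on a byte boundary, the bodies could just as well be produced in separate threads or processes and joined afterwards.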

The resulting stream is guaranteed to be handled properly by all zip unarchiving tools, including the BOMArchiveHelper/ArchiveUtility on OSX.

You could also build a compressor for Rubyzip using this module quite easily, even though this is outside the scope of the library.

When you deflate the chunks separately, you need to write the end marker yourself (using write_terminator). If you just want to deflate a large IO's contents, use deflate_in_blocks_and_terminate to have the end marker written out for you.

Constant Summary

DEFAULT_BLOCKSIZE =
1024*1024*5
END_MARKER =
[3, 0].pack("C*")
VALID_COMPRESSIONS =

Zlib::NO_COMPRESSION..

(Zlib::DEFAULT_COMPRESSION..Zlib::BEST_COMPRESSION).to_a.freeze
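
To make the constant values concrete, the following sketch evaluates the same expressions with Ruby's standard Zlib. The local variable names are hypothetical; only the right-hand sides come from the listing above.

```ruby
require 'zlib'

# Right-hand sides copied from the constant listing; variable names are illustrative.
default_blocksize  = 1024 * 1024 * 5
end_marker         = [3, 0].pack('C*')
valid_compressions = (Zlib::DEFAULT_COMPRESSION..Zlib::BEST_COMPRESSION).to_a.freeze

puts default_blocksize            # 5242880 bytes, i.e. 5 MiB
puts end_marker.bytes.inspect     # [3, 0]
puts valid_compressions.inspect   # [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# The range starts at Zlib::DEFAULT_COMPRESSION (-1), so it also covers
# Zlib::NO_COMPRESSION (0).
puts valid_compressions.include?(Zlib::NO_COMPRESSION) # true
```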

Class Method Details

.deflate_chunk(bytes, level: Zlib::DEFAULT_COMPRESSION) ⇒ String

Compress a given binary string and flush the deflate stream at byte boundary. The returned string can be spliced into another deflate stream.

Parameters:

  • bytes (String)

    Bytes to compress

  • level (Fixnum) (defaults to: Zlib::DEFAULT_COMPRESSION)

    Zlib compression level (defaults to Zlib::DEFAULT_COMPRESSION)

Returns:

  • (String)

    compressed bytes



# File 'lib/zip_tricks/block_deflate.rb', line 37

def self.deflate_chunk(bytes, level: Zlib::DEFAULT_COMPRESSION)
  raise "Invalid Zlib compression level #{level}" unless VALID_COMPRESSIONS.include?(level)
  z = Zlib::Deflate.new(level)
  compressed_blob = z.deflate(bytes, Zlib::SYNC_FLUSH)
  compressed_blob << z.finish
  z.close

  # Remove the header (2 bytes), the [3,0] end marker and the adler (4 bytes)
  compressed_blob[2...-6]
end
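
The slice [2...-6] is easier to follow when the pieces of the zlib stream are pulled apart. The sketch below repeats the same Zlib calls at the default compression level and names each piece; the variable names are illustrative and not part of the library.

```ruby
require 'zlib'

bytes = 'hello hello hello hello'

z = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION)
blob = z.deflate(bytes, Zlib::SYNC_FLUSH)
blob << z.finish
z.close

header   = blob[0, 2]                # zlib header (CMF and FLG bytes)
body     = blob[2...-6]              # the part deflate_chunk returns
end_mark = blob[-6, 2]               # final empty block written by finish
adler    = blob[-4, 4].unpack1('N')  # Adler-32 of the input, big-endian

puts header.unpack1('H*')            # 789c
puts end_mark.bytes.inspect          # [3, 0]
puts adler == Zlib.adler32(bytes)    # true

# The sync flush leaves an empty stored block at the end of the body,
# which is what aligns it to a byte boundary.
puts body[-4, 4].unpack1('H*')       # 0000ffff
```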

.deflate_in_blocks(input_io, output_io, level: Zlib::DEFAULT_COMPRESSION, block_size: DEFAULT_BLOCKSIZE) ⇒ Fixnum

Compress the contents of input_io into output_io, in blocks of block_size. Aligns the parts so that they can be concatenated later. Does not write the deflate end marker (\x3\x0), so more parts can be written later and successfully read back, provided the end marker is written after the last part.

output_io can also be a Streamer to expedite ops.

Parameters:

  • input_io (IO)

    the stream to read from (should respond to :read)

  • output_io (IO)

    the stream to write to (should respond to :<<)

  • level (Fixnum) (defaults to: Zlib::DEFAULT_COMPRESSION)

    Zlib compression level (defaults to Zlib::DEFAULT_COMPRESSION)

  • block_size (Fixnum) (defaults to: DEFAULT_BLOCKSIZE)

    The block size to use (defaults to DEFAULT_BLOCKSIZE)

Returns:

  • (Fixnum)

    number of bytes written to output_io



# File 'lib/zip_tricks/block_deflate.rb', line 80

def self.deflate_in_blocks(input_io, output_io, level: Zlib::DEFAULT_COMPRESSION, block_size: DEFAULT_BLOCKSIZE)
  bytes_written = 0
  while block = input_io.read(block_size)
    deflated = deflate_chunk(block, level: level)
    output_io << deflated
    bytes_written += deflated.bytesize
  end
  bytes_written
end
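
As an illustration of writing more parts later, the sketch below appends two inputs to the same output and only then adds the end marker. The lambdas deflate_chunk and deflate_in_blocks are hypothetical stand-ins that mirror the listings on this page so the example runs with the standard library alone; the block size of 4096 is an arbitrary choice for the demo.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical stand-in mirroring the documented deflate_chunk.
deflate_chunk = lambda do |bytes|
  z = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION)
  blob = z.deflate(bytes, Zlib::SYNC_FLUSH)
  blob << z.finish
  z.close
  blob[2...-6]
end

# Hypothetical stand-in mirroring the documented deflate_in_blocks.
deflate_in_blocks = lambda do |input_io, output_io, block_size|
  written = 0
  while (block = input_io.read(block_size))
    deflated = deflate_chunk.call(block)
    output_io << deflated
    written += deflated.bytesize
  end
  written
end

output   = StringIO.new(String.new)
part_one = StringIO.new('a' * 10_000)
part_two = StringIO.new('b' * 10_000)

# Two separate calls append to the same output.
written  = deflate_in_blocks.call(part_one, output, 4096)
written += deflate_in_blocks.call(part_two, output, 4096)

# Without the end marker the data is readable, but the stream is not finished.
inflater   = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored   = inflater.inflate(output.string)
open_ended = !inflater.finished?

# Feeding the end marker completes the stream.
restored << inflater.inflate([3, 0].pack('C*'))
terminated = inflater.finished?
inflater.close
```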

.deflate_in_blocks_and_terminate(input_io, output_io, level: Zlib::DEFAULT_COMPRESSION, block_size: DEFAULT_BLOCKSIZE) ⇒ Fixnum

Compress the contents of input_io into output_io, in blocks of block_size. Aligns the parts so that they can be concatenated later. Writes the deflate end marker (\x3\x0) into output_io as the final step, so the contents of output_io can be spliced verbatim into a ZIP archive.

Once the write completes, no more parts for concatenation should be written to the same stream.

output_io can also be a Streamer to expedite ops.

Parameters:

  • input_io (IO)

    the stream to read from (should respond to :read)

  • output_io (IO)

    the stream to write to (should respond to :<<)

  • level (Fixnum) (defaults to: Zlib::DEFAULT_COMPRESSION)

    Zlib compression level (defaults to Zlib::DEFAULT_COMPRESSION)

  • block_size (Fixnum) (defaults to: DEFAULT_BLOCKSIZE)

    The block size to use (defaults to DEFAULT_BLOCKSIZE)

Returns:

  • (Fixnum)

    number of bytes written to output_io



# File 'lib/zip_tricks/block_deflate.rb', line 63

def self.deflate_in_blocks_and_terminate(input_io, output_io, level: Zlib::DEFAULT_COMPRESSION, block_size: DEFAULT_BLOCKSIZE)
  bytes_written = deflate_in_blocks(input_io, output_io, level: level, block_size: block_size)
  bytes_written + write_terminator(output_io)
end
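
A minimal sketch of how the return value might be used when preparing a ZIP entry: a Deflate-compressed entry records the CRC-32 and size of the uncompressed data alongside the compressed size. The lambda below is a hypothetical stand-in that mirrors the documented method using only the standard library, and the block size of 8192 is arbitrary.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical stand-in mirroring deflate_in_blocks_and_terminate.
deflate_in_blocks_and_terminate = lambda do |input_io, output_io, block_size|
  end_marker = [3, 0].pack('C*')
  written = 0
  while (block = input_io.read(block_size))
    z = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION)
    blob = z.deflate(block, Zlib::SYNC_FLUSH)
    blob << z.finish
    z.close
    body = blob[2...-6] # strip zlib header, end marker and Adler-32
    output_io << body
    written += body.bytesize
  end
  output_io << end_marker
  written + end_marker.bytesize
end

payload    = 'zip me ' * 5_000
compressed = StringIO.new(String.new)

# The return value is the compressed size of the finished stream.
compressed_size   = deflate_in_blocks_and_terminate.call(StringIO.new(payload), compressed, 8192)
uncompressed_size = payload.bytesize
crc32             = Zlib.crc32(payload)

# The output is a complete raw deflate stream.
inflater  = Zlib::Inflate.new(-Zlib::MAX_WBITS)
roundtrip = inflater.inflate(compressed.string)
inflater.close
```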

.write_terminator(output_io) ⇒ Fixnum

Write the end marker (\x3\x0) to the given IO.

output_io can also be a Streamer to expedite ops.

Parameters:

  • output_io (IO)

    the stream to write to (should respond to :<<)

Returns:

  • (Fixnum)

    number of bytes written to output_io



# File 'lib/zip_tricks/block_deflate.rb', line 26

def self.write_terminator(output_io)
  output_io << END_MARKER
  END_MARKER.bytesize
end
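
Since output_io only needs to respond to :<<, any object with that method can act as the destination. The sketch below uses a hypothetical ByteCountingSink class and a write_terminator lambda that mirrors the listing above; neither is part of the library.

```ruby
# Hypothetical sink: it discards the data and only counts the bytes it is given.
class ByteCountingSink
  attr_reader :bytes_seen

  def initialize
    @bytes_seen = 0
  end

  def <<(data)
    @bytes_seen += data.bytesize
    self
  end
end

end_marker = [3, 0].pack('C*')

# Hypothetical stand-in mirroring the documented write_terminator.
write_terminator = lambda do |output_io|
  output_io << end_marker
  end_marker.bytesize
end

sink     = ByteCountingSink.new
returned = write_terminator.call(sink)

puts returned         # 2
puts sink.bytes_seen  # 2
```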