Class: ZipTricks::Streamer
- Inherits: Object
- Defined in: lib/zip_tricks/streamer.rb
Overview
Writes streamed ZIP archives into the provided IO-ish object. The output IO is never rewound or seeked, so the output of this object can be coupled directly to, say, a Rack output.
Allows for splicing raw files (for "stored" entries without compression) and splicing of deflated files (for "deflated" storage mode).
For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront, before the writing of the entry body starts.
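As a minimal sketch of how the CRC32 and the size could be obtained before writing a stored entry, the file can be read once in chunks using Ruby's standard Zlib module. The helper name `crc32_and_size` and the chunk size are assumptions made for illustration; they are not part of ZipTricks.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of ZipTricks): reads an IO in fixed-size
# chunks and returns [crc32, size], so large files never sit in memory whole.
def crc32_and_size(io, chunk_size: 64 * 1024)
  crc = 0
  size = 0
  while (chunk = io.read(chunk_size))
    # Feed the running checksum back in so the CRC covers all chunks so far
    crc = Zlib.crc32(chunk, crc)
    size += chunk.bytesize
  end
  [crc, size]
end

crc, size = crc32_and_size(StringIO.new('hello world'))
puts format('crc32=%d size=%d', crc, size)
```

The two returned values correspond to the `crc32:` and `size:` arguments that `add_stored_entry` expects.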
Any object that responds to << can be used as the Streamer target - you can use
a String, an Array, a Socket or a File, at your leisure.
Using the Streamer with runtime compression
You can use the Streamer with data descriptors (the CRC32 and the sizes will be written after the file data). This allows non-rewinding on-the-fly compression. If you are compressing large files, the Deflater object that the Streamer controls will be regularly flushed to prevent memory inflation.
ZipTricks::Streamer.open(file_socket_or_string) do |zip|
  zip.write_stored_file('mov.mp4') do |sink|
    File.open('mov.mp4', 'rb') { |source| IO.copy_stream(source, sink) }
  end
  zip.write_deflated_file('long-novel.txt') do |sink|
    File.open('novel.txt', 'rb') { |source| IO.copy_stream(source, sink) }
  end
end
The central directory will be written automatically at the end of the block.
Using the Streamer with entries of known size and having a known CRC32 checksum
Streamer allows "IO splicing" - in this mode it will only control the metadata output, but you can write the data to the socket/file outside of the Streamer. For example, when using the sendfile gem:
ZipTricks::Streamer.open(socket) do |zip|
  zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
  socket.sendfile(tempfile1)
  zip.simulate_write(tempfile1.size)
  zip.add_stored_entry(filename: "myfile2.bin", size: 458678, crc32: 89568)
  socket.sendfile(tempfile2)
  zip.simulate_write(tempfile2.size)
end
Note that you need to use simulate_write in this case. This needs to happen since Streamer
writes absolute offsets into the ZIP (local file header offsets and the like),
and it relies on the output object to tell it how many bytes have been written
so far. When using sendfile the Ruby write methods get bypassed entirely, and the
offsets in the IO will not be updated - which will result in an invalid ZIP.
The central directory will be written automatically at the end of the open block.
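The offset bookkeeping described above can be illustrated with a small stand-in wrapper. `PositionTrackingSink` is a hypothetical class written for this sketch; the real ZipTricks::WriteAndTell is not shown on this page and may differ. The sketch only shows why bytes that bypass `<<` have to be reported separately.

```ruby
# Hypothetical stand-in for an offset-tracking output wrapper
# (an assumption for illustration, not the ZipTricks implementation).
class PositionTrackingSink
  def initialize(target)
    @target = target
    @position = 0
  end

  # Forward the bytes to the wrapped target and count them
  def <<(bytes)
    @target << bytes
    @position += bytes.bytesize
    self
  end

  # Account for bytes that reached the destination by another route
  # (for example sendfile), without writing anything
  def advance_position_by(num_bytes)
    @position += num_bytes
  end

  # Number of bytes the archive contains so far
  def tell
    @position
  end
end

out = PositionTrackingSink.new(String.new)
out << 'HEADER'               # 6 bytes written through the wrapper
out.advance_position_by(1000) # 1000 bytes sent outside of the wrapper
puts out.tell                 # prints 1006
```

Without the `advance_position_by` call, `tell` would report 6, and every offset written after that point would be 1000 bytes too small.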
Constant Summary
- EntryBodySizeMismatch = Class.new(StandardError)
- InvalidOutput = Class.new(ArgumentError)
- Overflow = Class.new(StandardError)
- UnknownMode = Class.new(StandardError)
Class Method Summary
- .open(stream, **kwargs_for_new) {|Streamer| ... } ⇒ Object
  Creates a new Streamer on top of the given IO-ish object and yields it.
Instance Method Summary
- #<<(binary_data) ⇒ Object
  Writes a part of a zip entry body (actual binary data of the entry) into the output stream.
- #add_compressed_entry(filename:, compressed_size:, uncompressed_size:, crc32:) ⇒ Fixnum
  Writes out the local header for an entry (file in the ZIP) that is using the deflated storage model (is compressed).
- #add_stored_entry(filename:, size:, crc32:) ⇒ Fixnum
  Writes out the local header for an entry (file in the ZIP) that is using the stored storage model (is stored as-is).
- #close ⇒ Fixnum
  Closes the archive.
- #create_writer ⇒ ZipTricks::ZipWriter
  Sets up the ZipWriter with wrappers if necessary.
- #initialize(stream, writer: create_writer) ⇒ Streamer (constructor)
  Creates a new Streamer on top of the given IO-ish object.
- #simulate_write(num_bytes) ⇒ Numeric
  Advances the internal IO pointer to keep the offsets of the ZIP file in check.
- #write(binary_data) ⇒ Fixnum
  Writes a part of a zip entry body (actual binary data of the entry) into the output stream, and returns the number of bytes written.
- #write_deflated_file(filename) {|#<<, #write| ... } ⇒ Object
  Opens the stream for a deflated file in the archive, and yields a writer for that file to the block.
- #write_stored_file(filename) {|#<<, #write| ... } ⇒ Object
  Opens the stream for a stored file in the archive, and yields a writer for that file to the block.
Constructor Details
#initialize(stream, writer: create_writer) ⇒ Streamer
Creates a new Streamer on top of the given IO-ish object.
# File 'lib/zip_tricks/streamer.rb', line 88
def initialize(stream, writer: create_writer)
  raise InvalidOutput, "The stream must respond to #<<" unless stream.respond_to?(:<<)
  unless stream.respond_to?(:tell) && stream.respond_to?(:advance_position_by)
    stream = ZipTricks::WriteAndTell.new(stream)
  end
  @out = stream
  @files = []
  @local_header_offsets = []
  @writer = writer
end
Class Method Details
.open(stream, **kwargs_for_new) {|Streamer| ... } ⇒ Object
Creates a new Streamer on top of the given IO-ish object and yields it. Once the given block
returns, the Streamer will have its close method called, which will write out the central
directory of the archive to the output.
# File 'lib/zip_tricks/streamer.rb', line 77
def self.open(stream, **kwargs_for_new)
  archive = new(stream, **kwargs_for_new)
  yield(archive)
  archive.close
end
Instance Method Details
#<<(binary_data) ⇒ Object
Writes a part of a zip entry body (actual binary data of the entry) into the output stream.
# File 'lib/zip_tricks/streamer.rb', line 104
def <<(binary_data)
  @out << binary_data
  self
end
#add_compressed_entry(filename:, compressed_size:, uncompressed_size:, crc32:) ⇒ Fixnum
Writes out the local header for an entry (file in the ZIP) that is using the deflated storage model (is compressed).
Once this method is called, the << method has to be called to write the actual contents of the body.
Note that the deflated body that is going to be written into the output has to be precompressed (pre-deflated) before writing it into the Streamer, because otherwise it is impossible to know its size upfront.
# File 'lib/zip_tricks/streamer.rb', line 142
def add_compressed_entry(filename:, compressed_size:, uncompressed_size:, crc32:)
  add_file_and_write_local_header(filename: filename, crc32: crc32, storage_mode: DEFLATED,
    compressed_size: compressed_size, uncompressed_size: uncompressed_size)
  @out.tell
end
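One way to obtain a pre-deflated body together with the three values that add_compressed_entry needs is sketched below, using only Ruby's standard Zlib module. The helper name `predeflate` is hypothetical, and compressing the whole payload in memory is a simplification that suits small inputs only.

```ruby
require 'zlib'

# Hypothetical helper (not part of ZipTricks).
# Returns [deflated_bytes, compressed_size, uncompressed_size, crc32].
def predeflate(data)
  # Negative window bits request a raw deflate stream (no zlib header or
  # trailer), which is the form stored inside ZIP entries.
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  deflated = deflater.deflate(data, Zlib::FINISH)
  deflater.close
  # The CRC32 is always computed over the uncompressed data
  [deflated, deflated.bytesize, data.bytesize, Zlib.crc32(data)]
end

body, compressed_size, uncompressed_size, crc32 = predeflate('compressible words ' * 100)
puts "#{uncompressed_size} bytes deflated to #{compressed_size} bytes (#{body.bytesize} in the body)"

# With ZipTricks loaded, the values would be used along these lines:
#   zip.add_compressed_entry(filename: 'words.txt', compressed_size: compressed_size,
#                            uncompressed_size: uncompressed_size, crc32: crc32)
#   zip << body
```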
#add_stored_entry(filename:, size:, crc32:) ⇒ Fixnum
Writes out the local header for an entry (file in the ZIP) that is using the stored storage model (is stored as-is).
Once this method is called, the << method has to be called one or more times to write the actual contents of the body.
# File 'lib/zip_tricks/streamer.rb', line 155
def add_stored_entry(filename:, size:, crc32:)
  add_file_and_write_local_header(filename: filename, crc32: crc32, storage_mode: STORED,
    compressed_size: size, uncompressed_size: size)
  @out.tell
end
#close ⇒ Fixnum
Closes the archive. Writes the central directory, and switches the writer into a state where it can no longer be written to.
Once this method is called, the Streamer should be discarded (the ZIP archive is complete).
# File 'lib/zip_tricks/streamer.rb', line 212
def close
  # Record the central directory offset, so that it can be written into the EOCD record
  cdir_starts_at = @out.tell

  # Write out the central directory entries, one for each file
  @files.each_with_index do |entry, i|
    header_loc = @local_header_offsets.fetch(i)
    @writer.write_central_directory_file_header(io: @out, local_file_header_location: header_loc,
      gp_flags: entry.gp_flags, storage_mode: entry.storage_mode,
      compressed_size: entry.compressed_size, uncompressed_size: entry.uncompressed_size,
      mtime: entry.mtime, crc32: entry.crc32, filename: entry.filename) #, external_attrs: DEFAULT_EXTERNAL_ATTRS)
  end

  # Record the central directory size, for the EOCDR
  cdir_size = @out.tell - cdir_starts_at

  # Write out the EOCDR
  @writer.write_end_of_central_directory(io: @out, start_of_central_directory_location: cdir_starts_at,
    central_directory_size: cdir_size, num_files_in_archive: @files.length)
  @out.tell
end
#create_writer ⇒ ZipTricks::ZipWriter
Sets up the ZipWriter with wrappers if necessary. The method is called once, when the Streamer gets instantiated - the Writer then gets reused. This method is primarily there so that you can override it.
# File 'lib/zip_tricks/streamer.rb', line 239
def create_writer
  ZipTricks::ZipWriter.new
end
#simulate_write(num_bytes) ⇒ Numeric
Advances the internal IO pointer to keep the offsets of the ZIP file in check. Use this if you are going
to use accelerated writes to the socket (like the sendfile() call) after writing the headers, or if you
just need to figure out the size of the archive.
# File 'lib/zip_tricks/streamer.rb', line 126
def simulate_write(num_bytes)
  @out.advance_position_by(num_bytes)
  @out.tell
end
#write(binary_data) ⇒ Fixnum
Writes a part of a zip entry body (actual binary data of the entry) into the output stream,
and returns the number of bytes written. This method exists so that the Streamer can be used
as the destination of IO.copy_stream(from, to).
# File 'lib/zip_tricks/streamer.rb', line 115
def write(binary_data)
  @out << binary_data
  binary_data.bytesize
end
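The contract that IO.copy_stream relies on can be shown with a small stand-in object. `CountingSink` is a hypothetical class for this sketch, not a ZipTricks class: it pairs `<<` with a `write` method that returns the byte count, in the same way as the method above.

```ruby
require 'stringio'

# Hypothetical destination object (an assumption for illustration)
class CountingSink
  attr_reader :buffer

  def initialize
    @buffer = String.new # empty binary string
  end

  # Append and return self so that calls can be chained
  def <<(bytes)
    @buffer << bytes
    self
  end

  # IO.copy_stream calls #write on non-IO destinations and uses the
  # returned byte count to track its progress
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

sink = CountingSink.new
copied = IO.copy_stream(StringIO.new('some file contents'), sink)
puts "copied #{copied} bytes"
```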
#write_deflated_file(filename) {|#<<, #write| ... } ⇒ Object
Opens the stream for a deflated file in the archive, and yields a writer for that file to the block. Once the write completes, a data descriptor will be written with the actual compressed/uncompressed sizes and the CRC32 checksum.
# File 'lib/zip_tricks/streamer.rb', line 190
def write_deflated_file(filename)
  add_file_and_write_local_header(filename: filename, storage_mode: DEFLATED, use_data_descriptor: true,
    crc32: 0, compressed_size: 0, uncompressed_size: 0)
  w = DeflatedWriter.new(@out)
  yield(Writable.new(w))
  crc, comp, uncomp = w.finish

  # Save the information into the entry for when the time comes to write out the central directory
  last_entry = @files[-1]
  last_entry.crc32 = crc
  last_entry.compressed_size = comp
  last_entry.uncompressed_size = uncomp

  write_data_descriptor_for_last_entry
end
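The listing above expects the writer's finish method to return the CRC32, the compressed size and the uncompressed size. A rough sketch of a writer with that shape follows. `SketchDeflatedWriter` is a hypothetical class built on the standard Zlib module; the real DeflatedWriter is not shown on this page and may work differently, for instance in how often it flushes.

```ruby
require 'zlib'

# Hypothetical writer (an assumption for illustration): compresses on the
# fly and keeps the counters needed for the data descriptor.
class SketchDeflatedWriter
  def initialize(out)
    @out = out
    # Negative window bits select raw deflate, as used inside ZIP entries
    @deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
    @crc = 0
    @uncompressed_size = 0
    @compressed_size = 0
  end

  def <<(data)
    @crc = Zlib.crc32(data, @crc)    # checksum covers the uncompressed bytes
    @uncompressed_size += data.bytesize
    emit(@deflater.deflate(data))    # may return an empty string
    self
  end

  # Flushes what remains and returns [crc32, compressed_size, uncompressed_size]
  def finish
    emit(@deflater.finish)
    @deflater.close
    [@crc, @compressed_size, @uncompressed_size]
  end

  private

  def emit(compressed)
    return if compressed.empty?
    @out << compressed
    @compressed_size += compressed.bytesize
  end
end

out = String.new
writer = SketchDeflatedWriter.new(out)
writer << 'first chunk, ' << 'second chunk'
crc, comp, uncomp = writer.finish
puts "crc32=#{crc} compressed=#{comp} uncompressed=#{uncomp}"
```

Because the sizes are only known once `finish` has run, they cannot go into the local header, which is why a data descriptor follows the entry body.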
#write_stored_file(filename) {|#<<, #write| ... } ⇒ Object
Opens the stream for a stored file in the archive, and yields a writer for that file to the block. Once the write completes, a data descriptor will be written with the actual compressed/uncompressed sizes and the CRC32 checksum.
# File 'lib/zip_tricks/streamer.rb', line 167
def write_stored_file(filename)
  add_file_and_write_local_header(filename: filename, storage_mode: STORED, use_data_descriptor: true,
    crc32: 0, compressed_size: 0, uncompressed_size: 0)
  w = StoredWriter.new(@out)
  yield(Writable.new(w))
  crc, comp, uncomp = w.finish

  # Save the information into the entry for when the time comes to write out the central directory
  last_entry = @files[-1]
  last_entry.crc32 = crc
  last_entry.compressed_size = comp
  last_entry.uncompressed_size = uncomp

  @writer.write_data_descriptor(io: @out, crc32: crc,
    compressed_size: comp, uncompressed_size: uncomp)
end
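For orientation, the ZIP format defines a non-Zip64 data descriptor as four little-endian 32-bit fields: a signature, the CRC32, the compressed size and the uncompressed size. The sketch below packs such a record by hand. `pack_data_descriptor` is a hypothetical helper, and whether ZipTricks::ZipWriter#write_data_descriptor emits exactly these bytes (for example for Zip64 entries) is not shown on this page.

```ruby
require 'zlib'

# Signature that conventionally precedes a data descriptor ("PK\x07\x08")
DATA_DESCRIPTOR_SIGNATURE = 0x08074b50

# Hypothetical helper (not part of ZipTricks): packs the four fields as
# unsigned 32-bit little-endian integers, 16 bytes in total.
def pack_data_descriptor(crc32:, compressed_size:, uncompressed_size:)
  [DATA_DESCRIPTOR_SIGNATURE, crc32, compressed_size, uncompressed_size].pack('V4')
end

body = 'stored as-is'
descriptor = pack_data_descriptor(crc32: Zlib.crc32(body),
                                  compressed_size: body.bytesize,   # equal for stored entries
                                  uncompressed_size: body.bytesize)
puts descriptor.bytesize # prints 16
```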