Class: Bzip2::FFI::Writer

Inherits:
IO
  • Object
Defined in:
lib/bzip2/ffi/writer.rb

Overview

Writer compresses and writes a bzip2 compressed stream or file. The public instance methods of Writer are intended to be equivalent to those of a standard IO object.

Data can be written as a stream using Writer.open and #write. For example, the following compresses lines read from ARGF (the files named on the command line, or standard input if none are given):

Bzip2::FFI::Writer.open(io_or_path) do |writer|
  ARGF.each_line do |line|
    writer.write(line)
  end
end

Alternatively, without passing a block to Writer.open:

writer = Bzip2::FFI::Writer.open(io_or_path)
begin
  ARGF.each_line do |line|
    writer.write(line)
  end
ensure
  writer.close
end

An entire bzip2 structure can be written in a single step using Writer.write:

Bzip2::FFI::Writer.write(io_or_path, 'Hello, World!')

The Writer.open and Writer.write methods accept either an IO-like object or a file path. IO-like objects must have a #write method. Paths can be given as either a String or Pathname.
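
The rules above can be pictured with a small dispatch sketch. The helper name classify_target is hypothetical and is not part of the library; it only mirrors the documented behaviour, where a String or Pathname is treated as a path and anything else must respond to #write.

```ruby
require 'pathname'
require 'stringio'

# Hypothetical helper (not part of bzip2-ffi): reports how an argument
# would be interpreted under the rules described above.
def classify_target(io_or_path)
  if io_or_path.kind_of?(String) || io_or_path.kind_of?(Pathname)
    :path
  elsif io_or_path.respond_to?(:write)
    :io
  else
    raise ArgumentError, 'io_or_path must be an IO-like object or a path'
  end
end

classify_target('data.bz2')               # => :path
classify_target(Pathname.new('data.bz2')) # => :path
classify_target(StringIO.new)             # => :io
```

Any object with a suitable #write method qualifies as IO-like here, so in-memory targets such as StringIO should work as well as File objects.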

No character conversion is performed when writing and compressing. The Writer.write and #write methods compress the raw bytes from the given String (using the encoding of the String).
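
Because the raw bytes are compressed, the same text can produce different input to the compressor depending on its encoding. The sketch below uses only core Ruby to show the byte counts involved; raw_byte_count is a hypothetical helper name introduced for illustration.

```ruby
# Hypothetical helper (not part of bzip2-ffi): the number of raw bytes a
# string would contribute once transcoded to the given encoding.
def raw_byte_count(string, encoding)
  string.encode(encoding).bytesize
end

text = "h\u00E9llo" # five characters, stored as UTF-8

puts text.length                                 # 5
puts raw_byte_count(text, Encoding::UTF_8)       # 6 ("é" takes two bytes)
puts raw_byte_count(text, Encoding::ISO_8859_1)  # 5 ("é" takes one byte)
```

If the compressed data must be in a particular encoding, transcode the String with String#encode before passing it to #write.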

Writer does not support seeking (it's not supported by the underlying libbz2 library). There are no #seek or #pos= methods.

Methods inherited from IO

#autoclose=, #autoclose?, #binmode, #binmode?, #closed?, #external_encoding, #internal_encoding

Constructor Details

#initialize(io, options = {}) ⇒ Writer

Initializes a Bzip2::FFI::Writer to write compressed data to an IO-like object (io). io must have a #write method.

The following options can be specified using the options Hash:

  • :autoclose - When passing an IO-like object, set to true to close it when the Bzip2::FFI::Writer instance is closed.
  • :block_size - Specifies the block size used for compression. It should be set to an integer between 1 and 9 inclusive. The actual block size used is 100 kB times the chosen figure. 9 gives the best compression, but requires most memory. 1 gives the worst compression, but uses least memory. If not specified, :block_size defaults to 9.
  • :work_factor - Controls how the compression algorithm behaves when presented with the worst case, highly repetitive, input data. If compression runs into difficulties caused by repetitive data, the library switches from the standard sorting algorithm to a fallback algorithm. The fallback is slower than the standard algorithm by approximately a factor of three, but always behaves reasonably, no matter how bad the input. Lower values of :work_factor reduce the amount of effort the standard algorithm will expend before resorting to the fallback. Allowable values range from 0 to 250 inclusive. 0 is a special case, equivalent to using the default libbz2 work factor value (30 as of bzip2 v1.0.8). If not specified, :work_factor defaults to 0.
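
As a rough illustration of the :block_size arithmetic, the sketch below converts the option value into a byte count. The helper block_bytes is hypothetical (not part of the library), and it assumes that "100 kB" means 100,000 bytes.

```ruby
# Hypothetical helper (not part of bzip2-ffi): approximate block size in
# bytes for a given :block_size, assuming 100 kB means 100,000 bytes.
def block_bytes(block_size)
  unless (1..9).cover?(block_size)
    raise RangeError, 'block_size must be >= 1 and <= 9'
  end
  block_size * 100_000
end

puts block_bytes(1) # 100000
puts block_bytes(9) # 900000

# An options Hash in the documented form, favouring low memory use.
options = { block_size: 1, work_factor: 0 }
puts options.inspect
```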

#binmode is called on io if io responds to #binmode.

After use, the Bzip2::FFI::Writer instance must be closed using the #close method in order to complete the compression process.

Parameters:

  • io (Object)

    An IO-like object that has a #write method.

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose, :block_size and :work_factor).

Raises:

  • (ArgumentError)

    If io is nil or does not respond to #write.

  • (RangeError)

    If options[:block_size] is less than 1 or greater than 9, or options[:work_factor] is less than 0 or greater than 250.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2.



# File 'lib/bzip2/ffi/writer.rb', line 251

def initialize(io, options = {})
  super
  raise ArgumentError, 'io must respond to write' unless io.respond_to?(:write)

  block_size = options[:block_size] || 9
  work_factor = options[:work_factor] || 0

  raise RangeError, 'block_size must be >= 1 and <= 9' if block_size < 1 || block_size > 9
  raise RangeError, 'work_factor must be >= 0 and <= 250' if work_factor < 0 || work_factor > 250

  check_error(Libbz2::BZ2_bzCompressInit(stream, block_size, 0, work_factor))

  ObjectSpace.define_finalizer(self, self.class.send(:finalize, stream))
end

Class Method Details

.open(io_or_path, options = {}) ⇒ Object

Opens a Bzip2::FFI::Writer to compress and write bzip2 compressed data to either an IO-like object or a file. IO-like objects must have a #write method. Files can be specified using either a String containing the file path or a Pathname.

If no block is given, the opened Bzip2::FFI::Writer instance is returned. After writing data, the instance must be closed using the #close method in order to complete the compression process.

If a block is given, it will be passed the opened Bzip2::FFI::Writer instance as an argument. After the block terminates, the Bzip2::FFI::Writer instance will automatically be closed. open will then return the result of the block.

The following options can be specified using the options Hash:

  • :autoclose - When passing an IO-like object, set to true to close it when the Bzip2::FFI::Writer instance is closed.
  • :block_size - Specifies the block size used for compression. It should be set to an integer between 1 and 9 inclusive. The actual block size used is 100 kB times the chosen figure. 9 gives the best compression, but requires most memory. 1 gives the worst compression, but uses least memory. If not specified, :block_size defaults to 9.
  • :work_factor - Controls how the compression algorithm behaves when presented with the worst case, highly repetitive, input data. If compression runs into difficulties caused by repetitive data, the library switches from the standard sorting algorithm to a fallback algorithm. The fallback is slower than the standard algorithm by approximately a factor of three, but always behaves reasonably, no matter how bad the input. Lower values of :work_factor reduce the amount of effort the standard algorithm will expend before resorting to the fallback. Allowable values range from 0 to 250 inclusive. 0 is a special case, equivalent to using the default libbz2 work factor value (30 as of bzip2 v1.0.8). If not specified, :work_factor defaults to 0.

If an IO-like object that has a #binmode method is passed to open, #binmode will be called on io_or_path before yielding to the block or returning.

If a path to a file that already exists is passed to open, the file will be truncated before writing.
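
The truncation matches ordinary Ruby 'wb' file semantics. The sketch below demonstrates that behaviour with File.open alone; it assumes the library opens paths in the same way (the source listing for open passes 'wb' to open_bzip_file, whose body is not shown on this page). The helper size_after_rewrite is hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the file size after writing `first` and then
# reopening the same path in 'wb' mode to write `second`.
def size_after_rewrite(first, second)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'example.bin')
    File.open(path, 'wb') { |f| f.write(first) }
    File.open(path, 'wb') { |f| f.write(second) } # truncates, does not append
    File.size(path)
  end
end

puts size_after_rewrite('previous contents', 'new') # 3
```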

Parameters:

  • io_or_path (Object)

    Either an IO-like object with a #write method or a file path as a String or Pathname.

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose, :block_size and :work_factor).

Returns:

  • (Object)

    The opened Bzip2::FFI::Writer instance if no block is given, or the result of the block if a block is given.

Raises:

  • (ArgumentError)

    If io_or_path is not a String, Pathname or an IO-like object with a #write method.

  • (Errno::ENOENT)

    If the parent directory of the specified file does not exist.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2.



# File 'lib/bzip2/ffi/writer.rb', line 119

def open(io_or_path, options = {})
  if io_or_path.kind_of?(String) || io_or_path.kind_of?(Pathname)
    options = options.merge(autoclose: true)
    proc = -> { open_bzip_file(io_or_path.to_s, 'wb') }
    super(proc, options)
  elsif !io_or_path.kind_of?(Proc)
    super
  else
    raise ArgumentError, 'io_or_path must be an IO-like object or a path'
  end
end

.write(io_or_path, string, options = {}) ⇒ Integer

Compresses data from a String and writes an entire bzip2 compressed structure to either an IO-like object or a file. IO-like objects must have a #write method. Files can be specified using either a String containing the file path or a Pathname.

The following options can be specified using the options Hash:

  • :autoclose - When passing an IO-like object, set to true to close it when the Bzip2::FFI::Writer instance is closed.
  • :block_size - Specifies the block size used for compression. It should be set to an integer between 1 and 9 inclusive. The actual block size used is 100 kB times the chosen figure. 9 gives the best compression, but requires most memory. 1 gives the worst compression, but uses least memory. If not specified, :block_size defaults to 9.
  • :work_factor - Controls how the compression algorithm behaves when presented with the worst case, highly repetitive, input data. If compression runs into difficulties caused by repetitive data, the library switches from the standard sorting algorithm to a fallback algorithm. The fallback is slower than the standard algorithm by approximately a factor of three, but always behaves reasonably, no matter how bad the input. Lower values of :work_factor reduce the amount of effort the standard algorithm will expend before resorting to the fallback. Allowable values range from 0 to 250 inclusive. 0 is a special case, equivalent to using the default libbz2 work factor value (30 as of bzip2 v1.0.8). If not specified, :work_factor defaults to 0.

No character conversion is performed. The raw bytes from string are compressed (using the encoding of string).

If an IO-like object that has a #binmode method is passed to write, #binmode will be called on io_or_path before any compressed data is written.

The number of uncompressed bytes written is returned.
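
A minimal sketch of compressing into memory and inspecting the return value follows. It assumes the bzip2-ffi gem and libbz2 are available; when they are not, the hypothetical helper compress_to_memory simply returns nil.

```ruby
require 'stringio'

begin
  require 'bzip2/ffi' # from the bzip2-ffi gem; needs libbz2 at runtime
rescue LoadError
  # Gem or libbz2 not available: the helper below returns nil.
end

# Hypothetical helper: compresses text into memory and returns
# [uncompressed_byte_count, compressed_bytes], or nil when bzip2-ffi
# could not be loaded.
def compress_to_memory(text)
  return nil unless defined?(Bzip2::FFI::Writer)

  out = StringIO.new
  written = Bzip2::FFI::Writer.write(out, text)
  [written, out.string]
end

result = compress_to_memory('Hello, World!')
if result
  puts result[0]          # uncompressed bytes passed to the compressor
  puts result[1].bytesize # size of the compressed structure
end
```

The returned count refers to the input String, not to the size of the compressed output, so it should equal the bytesize of the String that was written.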

Parameters:

  • io_or_path (Object)

    Either an IO-like object with a #write method or a file path as a String or Pathname.

  • string (Object)

    The string to write (the result of calling #to_s on string will be written).

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose, :block_size and :work_factor).

Returns:

  • (Integer)

    The number of uncompressed bytes written.

Raises:

  • (ArgumentError)

    If io_or_path is not a String, Pathname or an IO-like object with a #write method.

  • (Errno::ENOENT)

    If the parent directory of the specified file does not exist.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2 or compressing data.



# File 'lib/bzip2/ffi/writer.rb', line 187

def write(io_or_path, string, options = {})
  open(io_or_path, options) do |writer|
    writer.write(string)
  end
end

Instance Method Details

#close ⇒ NilType

Completes compression of data written using #write, writes all remaining compressed bytes to the underlying stream and closes the Bzip2::FFI::Writer.

If the open method is used with a block, it is not necessary to call #close. Otherwise, #close must be called once all the data to be compressed has been passed to #write.

Returns:

  • (NilType)

    nil.

Raises:



# File 'lib/bzip2/ffi/writer.rb', line 276

def close
  s = stream
  flush_buffers(s, Libbz2::BZ_FINISH, Libbz2::BZ_STREAM_END)
  res = Libbz2::BZ2_bzCompressEnd(s)
  ObjectSpace.undefine_finalizer(self)
  check_error(res)
  super
end

#flush ⇒ Writer

Completes compression of data provided via #write, terminates and writes out the current bzip2 compressed block to the underlying compressed stream or file.

It is not usually necessary to call #flush.

Calling #flush may result in a larger compressed output.

Returns:

  • (Writer)

    self.

Raises:



# File 'lib/bzip2/ffi/writer.rb', line 340

def flush
  flush_buffers(stream, Libbz2::BZ_FLUSH, Libbz2::BZ_RUN_OK)
  self
end

#tell ⇒ Integer

Also known as: pos

Returns the number of uncompressed bytes that have been written.

Returns:

  • (Integer)

    The number of uncompressed bytes that have been written.

Raises:



# File 'lib/bzip2/ffi/writer.rb', line 350

def tell
  s = stream
  (s[:total_in_hi32] << 32) | s[:total_in_lo32]
end
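
The expression in the listing above joins two 32-bit halves of a counter into a single Integer. The same arithmetic can be tried in isolation; combine_counter is a hypothetical name used only for this sketch.

```ruby
# Hypothetical helper: combines the high and low 32-bit halves of a
# counter into one Integer, as the tell listing does.
def combine_counter(hi32, lo32)
  (hi32 << 32) | lo32
end

puts combine_counter(0, 13) # 13
puts combine_counter(1, 0)  # 4294967296 (2**32)
puts combine_counter(1, 13) # 4294967309
```

Splitting the total this way lets the count exceed 4 GiB even though each field holds only 32 bits.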

#write(string) ⇒ Integer

Compresses data from a String and writes it to the bzip2 compressed stream or file.

No character conversion is performed. The raw bytes from string are compressed (using the encoding of string).

The number of uncompressed bytes written is returned.

Parameters:

  • string (Object)

    The string to write (the result of calling #to_s on string will be written).

Returns:

  • (Integer)

    The number of uncompressed bytes written.

Raises:



# File 'lib/bzip2/ffi/writer.rb', line 298

def write(string)
  string = string.to_s

  s = stream
  next_in = ::FFI::MemoryPointer.new(1, string.bytesize)
  buffer = ::FFI::MemoryPointer.new(1, OUT_BUFFER_SIZE)
  begin
    next_in.write_bytes(string)
    s[:next_in] = next_in
    s[:avail_in] = next_in.size

    while s[:avail_in] > 0
      s[:next_out] = buffer
      s[:avail_out] = buffer.size

      check_error(Libbz2::BZ2_bzCompress(s, Libbz2::BZ_RUN))

      count = buffer.size - s[:avail_out]
      io.write(buffer.read_string(count))
    end
  ensure
    next_in.free
    buffer.free
    s[:next_in] = nil
    s[:next_out] = nil
  end

  string.bytesize
end