Class: Bzip2::FFI::Reader
Overview
Reader reads and decompresses a bzip2 compressed stream or file. The public instance methods of Reader are intended to be equivalent to those of a standard IO object.
Data can be read as a stream using Reader.open and #read, for example:
    Bzip2::FFI::Reader.open(io_or_path) do |reader|
      while buffer = reader.read(1024) do
        # process uncompressed bytes in buffer
      end
    end
Alternatively, without passing a block to Reader.open:
    reader = Bzip2::FFI::Reader.open(io_or_path)
    begin
      while buffer = reader.read(1024) do
        # process uncompressed bytes in buffer
      end
    ensure
      reader.close
    end
All the available bzipped data can be read in a single step using Reader.read:
    uncompressed = Bzip2::FFI::Reader.read(io_or_path)
The Reader.open and Reader.read methods accept either an IO-like object or a file path. IO-like objects must have a #read method. Paths can be given as either a String or Pathname.
No character conversion is performed on decompressed bytes. The Reader.read and #read methods return instances of String that represent the raw decompressed bytes, with #encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).
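Because the returned String is tagged as binary, text has to be reinterpreted by the caller. The following is a minimal sketch of one way to do that, assuming the decompressed bytes are UTF-8 encoded text. The decode_utf8 helper is hypothetical and not part of bzip2-ffi, and a binary String literal stands in for the result of Reader.read so that the sketch runs without a compressed input.

```ruby
# Stand-in for the String returned by Bzip2::FFI::Reader.read: raw bytes
# tagged as binary. Here the bytes happen to be UTF-8 encoded text.
raw = "caf\xC3\xA9\n".b

# Hypothetical helper (not part of bzip2-ffi): reinterpret the bytes as
# UTF-8 without modifying the original String.
def decode_utf8(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'bytes are not valid UTF-8' unless text.valid_encoding?
  text
end

text = decode_utf8(raw)
puts raw.encoding
puts text.encoding
```

force_encoding only relabels the bytes and does not transcode them, so it is the appropriate call when the original encoding is already known.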
Reader will normally read all consecutive bzip2 compressed structures from the given stream or file (unless the :first_only parameter is specified - see Reader.open). If the stream or file contains additional data beyond the end of the compressed bzip2 data, it may be read during decompression. If such an overread has occurred and the IO-like object being read from has a #seek method, Reader will use it to reposition the stream to the byte immediately following the end of the compressed bzip2 data. If #seek raises an IOError, it will be caught and the stream position will be left unchanged.
Reader does not support seeking (it's not supported by the underlying libbz2 library). There are no #seek or #pos= methods. The only way to advance the position is to call #read. Discard the result if it's not needed.
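Skipping forward by reading and discarding can be sketched as follows. The skip_bytes helper and its chunk_size default are hypothetical and not part of bzip2-ffi. The helper relies only on #read(length) returning nil at the end of the data, so a StringIO stands in for a Reader here to keep the sketch runnable without the gem.

```ruby
require 'stringio'

# Hypothetical helper (not part of bzip2-ffi): advance a reader by `count`
# decompressed bytes by reading and discarding them in bounded chunks.
# Returns the number of bytes actually skipped, which is less than `count`
# if the end of the data is reached first.
def skip_bytes(reader, count, chunk_size = 16 * 1024)
  skipped = 0
  while skipped < count
    chunk = reader.read([chunk_size, count - skipped].min)
    break if chunk.nil? || chunk.empty?
    skipped += chunk.bytesize
  end
  skipped
end

# StringIO stands in for a Bzip2::FFI::Reader in this sketch.
source = StringIO.new('0123456789')
skipped = skip_bytes(source, 4)
rest = source.read
```

Reading in bounded chunks keeps memory use flat even when a large amount of data is skipped.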
Class Method Summary
- .open(io_or_path, options = {}) {|reader| ... } ⇒ Object
  Opens a Reader to read and decompress data from either an IO-like object or a file.
- .read(io_or_path, options = {}) ⇒ String
  Reads and decompresses an entire bzip2 compressed structure from either an IO-like object or a file and returns the decompressed bytes as a String.
Instance Method Summary
- #close ⇒ NilType
  Ends decompression and closes the Reader.
- #eof? ⇒ Boolean (also: #eof)
  Returns true if decompression has completed, otherwise false.
- #initialize(io, options = {}) ⇒ Reader (constructor)
  Initializes a Reader to read compressed data from an IO-like object (io).
- #read(length = nil, buffer = nil) ⇒ String
  Reads and decompresses data from the bzip2 compressed stream or file, returning the uncompressed bytes.
- #tell ⇒ Integer (also: #pos)
  Returns the number of decompressed bytes that have been read.
Methods inherited from IO
#autoclose=, #autoclose?, #binmode, #binmode?, #closed?, #external_encoding, #internal_encoding
Constructor Details
#initialize(io, options = {}) ⇒ Reader
Initializes a Bzip2::FFI::Reader to read compressed data from an IO-like object (io). io must have a #read method.
The following options can be specified using the options Hash:
:autoclose - Set to true to close io when the Bzip2::FFI::Reader instance is closed.
:first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
:small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
#binmode is called on io if io responds to #binmode.
After use, the Bzip2::FFI::Reader instance should be closed using the #close method.
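As an illustration of the IO-like requirement described above, here is a minimal sketch of a wrapper object that provides #read and the optional #binmode. CountingIO is hypothetical and not part of bzip2-ffi, and the exact arguments Reader passes to #read are an assumption (the sketch accepts the same arguments as IO#read). A StringIO holding placeholder text stands in for real compressed data.

```ruby
require 'stringio'

# Hypothetical wrapper (not part of bzip2-ffi) that counts how many bytes
# have been pulled from an underlying IO.
class CountingIO
  attr_reader :bytes_read

  def initialize(io)
    @io = io
    @bytes_read = 0
  end

  # The one method the Reader documentation requires of an IO-like object.
  def read(length = nil, outbuf = nil)
    data = outbuf ? @io.read(length, outbuf) : @io.read(length)
    @bytes_read += data.bytesize if data
    data
  end

  # Optional: the documentation states #binmode is called when present.
  def binmode
    @io.binmode if @io.respond_to?(:binmode)
    self
  end
end

# Placeholder bytes; real use would wrap a source of bzip2 compressed data.
io = CountingIO.new(StringIO.new('compressed bytes would go here'))
first = io.read(10)
```

With the gem installed, such an object could be passed as io to the constructor or to Reader.open. Since this wrapper defines no #seek method, the stream would not be repositioned after an overread.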
    # File 'lib/bzip2/ffi/reader.rb', line 221

    def initialize(io, options = {})
      super
      raise ArgumentError, 'io must respond to read' unless io.respond_to?(:read)

      @first_only = options[:first_only]
      @small = options[:small] ? 1 : 0

      @in_eof = false
      @out_eof = false
      @in_buffer = nil
      @structure_number = 1
      @structure_start_pos = 0
      @in_pos = 0
      @out_pos = 0

      decompress_init(stream)
    end
Class Method Details
.open(io_or_path, options = {}) {|reader| ... } ⇒ Object
Opens a Bzip2::FFI::Reader to read and decompress data from either an IO-like object or a file. IO-like objects must have a #read method. Files can be specified using either a String containing the file path or a Pathname.
If no block is given, the opened Bzip2::FFI::Reader instance is returned. After use, the instance should be closed using the #close method.
If a block is given, it will be passed the opened Bzip2::FFI::Reader instance as an argument. After the block terminates, the Bzip2::FFI::Reader instance will automatically be closed. open will then return the result of the block.
The following options can be specified using the options Hash:
:autoclose - When passing an IO-like object, set to true to close it when the Bzip2::FFI::Reader instance is closed.
:first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
:small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
If an IO-like object that has a #binmode method is passed to open, #binmode will be called on io_or_path before yielding to the block or returning.
    # File 'lib/bzip2/ffi/reader.rb', line 122

    def open(io_or_path, options = {})
      if io_or_path.kind_of?(String) || io_or_path.kind_of?(Pathname)
        options = options.merge(autoclose: true)
        proc = -> { open_bzip_file(io_or_path.to_s, 'rb') }
        super(proc, options)
      elsif !io_or_path.kind_of?(Proc)
        super
      else
        raise ArgumentError, 'io_or_path must be an IO-like object or a path'
      end
    end
.read(io_or_path, options = {}) ⇒ String
Reads and decompresses an entire bzip2 compressed structure from either an IO-like object or a file and returns the decompressed bytes as a String. IO-like objects must have a #read method. Files can be specified using either a String containing the file path or a Pathname.
The following options can be specified using the options Hash:
:autoclose - When passing an IO-like object, set to true to close it when the compressed data has been read.
:first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
:small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
No character conversion is performed on decompressed bytes. read returns a String that represents the raw decompressed bytes, with encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).
If an IO-like object that has a #binmode method is passed to read, #binmode will be called on io_or_path before any compressed data is read.
    # File 'lib/bzip2/ffi/reader.rb', line 174

    def read(io_or_path, options = {})
      open(io_or_path, options) do |reader|
        reader.read
      end
    end
Instance Method Details
#close ⇒ NilType
Ends decompression and closes the Bzip2::FFI::Reader.
If the open method is used with a block, it is not necessary to call #close. Otherwise, #close should be called once the Bzip2::FFI::Reader is no longer needed.
    # File 'lib/bzip2/ffi/reader.rb', line 247

    def close
      s = stream

      unless @out_eof
        decompress_end(s)
      end

      s[:next_in] = nil
      s[:next_out] = nil

      if @in_buffer
        @in_buffer.free
        @in_buffer = nil
      end

      super
    end
#eof? ⇒ Boolean Also known as: eof
Returns true if decompression has completed, otherwise false.
    # File 'lib/bzip2/ffi/reader.rb', line 350

    def eof?
      check_closed
      @out_eof
    end
#read(length = nil, buffer = nil) ⇒ String
Reads and decompresses data from the bzip2 compressed stream or file, returning the uncompressed bytes.
length must be a non-negative integer or nil.
If length is a positive integer, it specifies the maximum number of uncompressed bytes to return. read will return nil or a String with a length of 1 to length bytes containing the decompressed data. A result of nil or a String with a length less than length bytes indicates that the end of the decompressed data has been reached.
If length is nil, #read reads until the end of the decompressed data, returning the uncompressed bytes as a String.
If length is 0, #read returns an empty String.
If the optional buffer argument is present, it must reference a String that will receive the decompressed data. buffer will contain only the decompressed data after the call to #read, even if it is not empty beforehand.
No character conversion is performed on decompressed bytes. #read returns a String that represents the raw decompressed bytes, with encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).
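The length and buffer arguments can be combined to process data in fixed-size chunks while reusing a single String. The following is a minimal sketch of that pattern. The each_chunk helper and its default chunk size are hypothetical and not part of bzip2-ffi. It depends only on #read(length, buffer) returning nil at the end of the data, so a StringIO stands in for a Reader to keep the sketch runnable without the gem.

```ruby
require 'stringio'

# Hypothetical helper (not part of bzip2-ffi): yield the data from `reader`
# in chunks of at most `chunk_size` bytes, reusing one buffer String so that
# a new String is not allocated for every chunk.
def each_chunk(reader, chunk_size = 64 * 1024)
  buffer = String.new
  while reader.read(chunk_size, buffer)
    yield buffer
  end
end

# StringIO stands in for a Bzip2::FFI::Reader in this sketch.
source = StringIO.new('abcdefgh')
sizes = []
each_chunk(source, 3) { |chunk| sizes << chunk.bytesize }
```

The yielded String is overwritten by the next call to #read, so a block that needs to keep a chunk should copy it with dup.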
    # File 'lib/bzip2/ffi/reader.rb', line 304

    def read(length = nil, buffer = nil)
      if buffer
        buffer.clear
        buffer.force_encoding(Encoding::ASCII_8BIT)
      end

      if length
        raise ArgumentError, 'length must be a non-negative integer or nil' if length < 0

        if length == 0
          check_closed
          return buffer || String.new
        end

        decompressed = decompress(length)

        return nil unless decompressed
        buffer ? buffer << decompressed : decompressed
      else
        result = buffer ? StringIO.new(buffer) : StringIO.new

        # StringIO#binmode is a no-op, but call in case it is implemented in
        # future versions.
        result.binmode

        result.set_encoding(Encoding::ASCII_8BIT)

        loop do
          decompressed = decompress(DEFAULT_DECOMPRESS_COUNT)
          break unless decompressed
          result.write(decompressed)
          break if decompressed.bytesize < DEFAULT_DECOMPRESS_COUNT
        end

        result.string
      end
    end
#tell ⇒ Integer Also known as: pos
Returns the number of decompressed bytes that have been read.
    # File 'lib/bzip2/ffi/reader.rb', line 360

    def tell
      check_closed
      @out_pos
    end
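Since #tell reports the number of decompressed bytes read so far, it can be used for simple progress reporting while copying. The following is a minimal sketch. The copy_with_progress helper is hypothetical and not part of bzip2-ffi. It uses only #read(length) and #tell, so StringIO objects stand in for the Reader and the destination to keep the sketch runnable without the gem.

```ruby
require 'stringio'

# Hypothetical helper (not part of bzip2-ffi): copy everything from `reader`
# to `out` in chunks, yielding the reader's position after each chunk.
# Returns the final position.
def copy_with_progress(reader, out, chunk_size = 64 * 1024)
  while (chunk = reader.read(chunk_size))
    out.write(chunk)
    yield reader.tell if block_given?
  end
  reader.tell
end

# StringIO stands in for a Bzip2::FFI::Reader in this sketch.
source = StringIO.new('x' * 10)
sink = StringIO.new
positions = []
total = copy_with_progress(source, sink, 4) { |pos| positions << pos }
```

The listing above shows that #tell calls check_closed first, so with a real Reader the position should be queried before #close is called.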