Class: Bzip2::FFI::Reader
Overview
Reader reads and decompresses a bzip2 compressed stream or file. The
public instance methods of Reader are intended to be equivalent to those
of a standard IO object.
Data can be read as a stream using Reader.open and #read, for example:
  Bzip2::FFI::Reader.open(io_or_path) do |reader|
    while buffer = reader.read(1024) do
      # process uncompressed bytes in buffer
    end
  end
Alternatively, without passing a block to Reader.open:
  reader = Bzip2::FFI::Reader.open(io_or_path)
  begin
    while buffer = reader.read(1024) do
      # process uncompressed bytes in buffer
    end
  ensure
    reader.close
  end
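As a sketch of what the chunked loop above allows, the helper below hashes a stream without holding all of it in memory. The name sha256_of_stream is hypothetical and not part of the library. StringIO stands in for an opened reader, relying on the stated intent that Reader's instance methods behave like those of an IO, so the example runs without libbz2 installed.

```ruby
require 'digest'
require 'stringio'

# Hypothetical helper: feeds every chunk returned by reader.read(chunk_size)
# into a SHA-256 digest and returns the hex digest. `reader` is any object
# whose #read(length) returns nil at the end of the data, such as an opened
# Bzip2::FFI::Reader.
def sha256_of_stream(reader, chunk_size = 1024)
  digest = Digest::SHA256.new
  while (chunk = reader.read(chunk_size))
    digest.update(chunk)
  end
  digest.hexdigest
end

# StringIO is only a stand-in for an opened reader.
data = 'x' * 5000
puts sha256_of_stream(StringIO.new(data)) == Digest::SHA256.hexdigest(data)
```

With the library loaded, the same helper could be called from inside the block passed to Reader.open.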
All the available bzipped data can be read in a single step using Reader.read:
  uncompressed = Bzip2::FFI::Reader.read(io_or_path)
The Reader.open and Reader.read methods accept either an IO-like object or a file
path. IO-like objects must have a #read method. Paths can be given as
either a String or Pathname.
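Because the only stated requirement on an IO-like object is a #read method, a small wrapper can qualify. The sketch below is an illustration only: CountingSource is a hypothetical class, and it forwards whatever arguments it receives because this page does not say how Reader calls #read.

```ruby
require 'stringio'

# Hypothetical wrapper: exposes only #read, forwarding to an underlying IO
# and counting the bytes it hands out.
class CountingSource
  attr_reader :bytes_read

  def initialize(io)
    @io = io
    @bytes_read = 0
  end

  # Forwards all arguments unchanged, so the caller decides how much to
  # request. Returns whatever the underlying IO returns (nil at EOF).
  def read(*args)
    data = @io.read(*args)
    @bytes_read += data.bytesize if data
    data
  end
end

# Placeholder bytes; a real source would supply bzip2 compressed data.
source = CountingSource.new(StringIO.new('compressed bytes would go here'))
source.read(10)
puts source.bytes_read
```

The count reflects bytes requested from the source, which can exceed the length of the compressed data if trailing data is read during decompression.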
No character conversion is performed on decompressed bytes. The Reader.read and
#read methods return instances of String that represent the raw
decompressed bytes, with #encoding set to Encoding::ASCII_8BIT (also
known as Encoding::BINARY).
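If the decompressed bytes are known to be text, the caller has to label them. A minimal sketch, assuming the payload is UTF-8; decode_utf8 is a hypothetical helper, and the sample String stands in for a value returned by Reader.read:

```ruby
# Hypothetical helper: returns a copy of `raw` labelled as UTF-8, raising
# if the bytes are not valid UTF-8. force_encoding changes only the
# encoding label; the bytes themselves are not converted.
def decode_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'decompressed bytes are not valid UTF-8' unless text.valid_encoding?
  text
end

# Stand-in for a String returned by Reader.read: the UTF-8 bytes of
# "café", tagged as binary.
raw = "caf\xC3\xA9".b
text = decode_utf8(raw)
puts raw.length  # length counted in bytes
puts text.length # length counted in characters
```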
Reader will normally read all consecutive bzip2 compressed structures
from the given stream or file (unless the :first_only parameter is
specified; see Reader.open). If the stream or file contains additional data
beyond the end of the compressed bzip2 data, it may be read during
decompression. If such an overread has occurred and the IO-like object
being read from has a #seek method, Reader will use it to reposition
the stream to the byte immediately following the end of the compressed
bzip2 data. If #seek raises an IOError, it will be caught and the
stream position will be left unchanged.
Reader does not support seeking (it's not supported by the underlying
libbz2 library). There are no #seek or #pos= methods. The only way to
advance the position is to call #read. Discard the result if it's not
needed.
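The read-and-discard approach can be sketched as a small helper. skip_bytes is a hypothetical name, not part of the library, and StringIO again stands in for an opened reader so the example runs on its own.

```ruby
require 'stringio'

# Hypothetical helper: advances `reader` by up to `count` decompressed
# bytes by reading and discarding them. Returns the number of bytes
# actually skipped, which is less than `count` if the data ends first.
def skip_bytes(reader, count, chunk_size = 4096)
  skipped = 0
  while skipped < count
    chunk = reader.read([chunk_size, count - skipped].min)
    break if chunk.nil? || chunk.empty? # end of the decompressed data
    skipped += chunk.bytesize
  end
  skipped
end

io = StringIO.new('0123456789') # stand-in for an opened reader
skip_bytes(io, 4)
puts io.read(3)
```

Reading in bounded chunks keeps memory use flat even when a large range is skipped.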
Class Method Summary
- .open(io_or_path, options = {}) {|reader| ... } ⇒ Object
  Opens a Reader to read and decompress data from either an IO-like object or a file.
- .read(io_or_path, options = {}) ⇒ String
  Reads and decompresses an entire bzip2 compressed structure from either an IO-like object or a file and returns the decompressed bytes as a String.
Instance Method Summary
- #close ⇒ NilType
  Ends decompression and closes the Reader.
- #eof? ⇒ Boolean (also: #eof)
  Returns true if decompression has completed, otherwise false.
- #initialize(io, options = {}) ⇒ Reader (constructor)
  Initializes a Reader to read compressed data from an IO-like object (io).
- #read(length = nil, buffer = nil) ⇒ String
  Reads and decompresses data from the bzip2 compressed stream or file, returning the uncompressed bytes.
- #tell ⇒ Integer (also: #pos)
  Returns the number of decompressed bytes that have been read.
Methods inherited from IO
#autoclose=, #autoclose?, #binmode, #binmode?, #closed?, #external_encoding, #internal_encoding
Constructor Details
#initialize(io, options = {}) ⇒ Reader
Initializes a Bzip2::FFI::Reader to read compressed data from an IO-like object
(io). io must have a #read method.
The following options can be specified using the options Hash:
- :autoclose - Set to true to close io when the Bzip2::FFI::Reader instance is closed.
- :first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
- :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
#binmode is called on io if io responds to #binmode.
After use, the Bzip2::FFI::Reader instance should be closed using the #close method.
  # File 'lib/bzip2/ffi/reader.rb', line 221
  def initialize(io, options = {})
    super
    raise ArgumentError, 'io must respond to read' unless io.respond_to?(:read)

    @first_only = options[:first_only]
    @small = options[:small] ? 1 : 0

    @in_eof = false
    @out_eof = false
    @in_buffer = nil
    @structure_number = 1
    @structure_start_pos = 0
    @in_pos = 0
    @out_pos = 0

    decompress_init(stream)
  end
Class Method Details
.open(io_or_path, options = {}) {|reader| ... } ⇒ Object
Opens a Bzip2::FFI::Reader to read and decompress data from either an IO-like
object or a file. IO-like objects must have a #read method. Files
can be specified using either a String containing the file path or a
Pathname.
If no block is given, the opened Bzip2::FFI::Reader instance is returned. After use, the instance should be closed using the #close method.
If a block is given, it will be passed the opened Bzip2::FFI::Reader instance as an argument. After the block terminates, the Bzip2::FFI::Reader instance will automatically be closed. open will then return the result of the block.
The following options can be specified using the options Hash:
- :autoclose - When passing an IO-like object, set to true to close it when the Bzip2::FFI::Reader instance is closed.
- :first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
- :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
If an IO-like object that has a #binmode method is passed to open,
#binmode will be called on io_or_path before yielding to the block
or returning.
  # File 'lib/bzip2/ffi/reader.rb', line 122
  def open(io_or_path, options = {})
    if io_or_path.kind_of?(String) || io_or_path.kind_of?(Pathname)
      options = options.merge(autoclose: true)
      proc = -> { open_bzip_file(io_or_path.to_s, 'rb') }
      super(proc, options)
    elsif !io_or_path.kind_of?(Proc)
      super
    else
      raise ArgumentError, 'io_or_path must be an IO-like object or a path'
    end
  end
.read(io_or_path, options = {}) ⇒ String
Reads and decompresses an entire bzip2 compressed structure from
either an IO-like object or a file and returns the decompressed bytes
as a String. IO-like objects must have a #read method. Files can
be specified using either a String containing the file path or a
Pathname.
The following options can be specified using the options Hash:
- :autoclose - When passing an IO-like object, set to true to close it when the compressed data has been read.
- :first_only - Bzip2 files can contain multiple consecutive compressed structures. Normally all the structures will be decompressed with the decompressed bytes concatenated. Set to true to only read the first structure.
- :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).
No character conversion is performed on decompressed bytes. read
returns a String that represents the raw decompressed bytes, with
encoding set to Encoding::ASCII_8BIT (also known as
Encoding::BINARY).
If an IO-like object that has a #binmode method is passed to read,
#binmode will be called on io_or_path before any compressed data
is read.
  # File 'lib/bzip2/ffi/reader.rb', line 174
  def read(io_or_path, options = {})
    open(io_or_path, options) do |reader|
      reader.read
    end
  end
Instance Method Details
#close ⇒ NilType
Ends decompression and closes the Bzip2::FFI::Reader.
If the open method is used with a block, it is not necessary to call #close. Otherwise, #close should be called once the Bzip2::FFI::Reader is no longer needed.
  # File 'lib/bzip2/ffi/reader.rb', line 247
  def close
    s = stream

    unless @out_eof
      decompress_end(s)
    end

    s[:next_in] = nil
    s[:next_out] = nil

    if @in_buffer
      @in_buffer.free
      @in_buffer = nil
    end

    super
  end
#eof? ⇒ Boolean Also known as: eof
Returns true if decompression has completed, otherwise false.
  # File 'lib/bzip2/ffi/reader.rb', line 350
  def eof?
    check_closed
    @out_eof
  end
#read(length = nil, buffer = nil) ⇒ String
Reads and decompresses data from the bzip2 compressed stream or file, returning the uncompressed bytes.
length must be a non-negative integer or nil.
If length is a positive integer, it specifies the maximum number of
uncompressed bytes to return. read will return nil or a String
with a length of 1 to length bytes containing the decompressed data.
A result of nil or a String with a length less than length bytes
indicates that the end of the decompressed data has been reached.
If length is nil, #read reads until the end of the decompressed
data, returning the uncompressed bytes as a String.
If length is 0, #read returns an empty String.
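Since a short or nil result signals the end of the data, a caller that needs a fixed-size record can check for it explicitly. The sketch below uses a hypothetical helper, read_exactly, and a StringIO as a stand-in for an opened reader.

```ruby
require 'stringio'

# Hypothetical helper: returns exactly `length` bytes from `reader`, or
# raises EOFError if the data ends before that many bytes are available.
def read_exactly(reader, length)
  data = reader.read(length)
  if data.nil? || data.bytesize < length
    got = data ? data.bytesize : 0
    raise EOFError, "expected #{length} bytes, got #{got}"
  end
  data
end

io = StringIO.new('headerpayload') # stand-in for an opened reader
puts read_exactly(io, 6)
begin
  read_exactly(io, 100)
rescue EOFError => e
  puts e.message
end
```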
If the optional buffer argument is present, it must reference a
String that will receive the decompressed data. buffer will
contain only the decompressed data after the call to #read, even if it
is not empty beforehand.
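Passing the same String as buffer on every call avoids allocating a new String per chunk. A minimal sketch of that pattern follows; count_lines is a hypothetical helper, and StringIO (whose #read accepts the same two arguments) stands in for an opened reader.

```ruby
require 'stringio'

# Hypothetical helper: counts newline bytes in the stream, reusing one
# String as the destination for every call to #read.
def count_lines(reader, chunk_size = 1024)
  buffer = String.new # replaced with the latest chunk on each call
  lines = 0
  while reader.read(chunk_size, buffer)
    lines += buffer.count("\n")
  end
  lines
end

puts count_lines(StringIO.new("a\nb\nc\n"), 2)
```

Because the buffer holds only the most recent chunk, anything that must outlive the next call has to be copied out first.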
No character conversion is performed on decompressed bytes. #read
returns a String that represents the raw decompressed bytes, with
encoding set to Encoding::ASCII_8BIT (also known as
Encoding::BINARY).
  # File 'lib/bzip2/ffi/reader.rb', line 304
  def read(length = nil, buffer = nil)
    if buffer
      buffer.clear
      buffer.force_encoding(Encoding::ASCII_8BIT)
    end

    if length
      raise ArgumentError, 'length must be a non-negative integer or nil' if length < 0

      if length == 0
        check_closed
        return buffer || String.new
      end

      decompressed = decompress(length)

      return nil unless decompressed
      buffer ? buffer << decompressed : decompressed
    else
      result = buffer ? StringIO.new(buffer) : StringIO.new

      # StringIO#binmode is a no-op, but call in case it is implemented in
      # future versions.
      result.binmode

      result.set_encoding(Encoding::ASCII_8BIT)

      loop do
        decompressed = decompress(DEFAULT_DECOMPRESS_COUNT)
        break unless decompressed
        result.write(decompressed)
        break if decompressed.bytesize < DEFAULT_DECOMPRESS_COUNT
      end

      result.string
    end
  end
#tell ⇒ Integer Also known as: pos
Returns the number of decompressed bytes that have been read.
  # File 'lib/bzip2/ffi/reader.rb', line 360
  def tell
    check_closed
    @out_pos
  end
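One possible use of #tell is progress reporting while copying decompressed data elsewhere. The sketch below is illustrative only: copy_with_progress is a hypothetical helper, and StringIO objects stand in for both the opened reader and the destination.

```ruby
require 'stringio'

# Hypothetical helper: copies everything from `reader` to `out` in chunks,
# yielding reader.tell after each chunk. Returns the final value of
# reader.tell, which for a Reader is the number of decompressed bytes read.
def copy_with_progress(reader, out, chunk_size: 1024)
  while (chunk = reader.read(chunk_size))
    out.write(chunk)
    yield reader.tell if block_given?
  end
  reader.tell
end

src = StringIO.new('z' * 2500) # stand-in for an opened reader
dst = StringIO.new
copy_with_progress(src, dst) { |bytes| puts "#{bytes} bytes so far" }
```

Note that #tell calls check_closed, so it has to be called before the reader is closed.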