Class: RangesIO
- Inherits: Object (Object > RangesIO)
- Defined in: lib/ole/ranges_io.rb
Overview
Introduction
RangesIO is a basic class that wraps another IO object and lets you arbitrarily reorder slices of the input file by providing a list of ranges. It is intended as an initial measure to curb the inefficiency of the Dirent#data method, which reads all of a file’s data in one hit, with no way to stream it.
This class will encapsulate the ranges (corresponding to big or small blocks) of any ole file and thus allow reading and writing directly to the source bytes in a streamed fashion (so getting just 16 bytes doesn’t read the whole thing).
In the simplest case it can be used with a single range to provide a limited IO view onto a section of a file.
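The idea of presenting reordered slices of an underlying IO can be sketched without the library. The helper below, `read_ranges`, is a hypothetical name and not part of RangesIO; it assumes only a seekable IO (here a StringIO) and a list of [offset, length] pairs.

```ruby
require 'stringio'

# Hypothetical helper (not part of RangesIO): read each [offset, length]
# slice from io in the order given and join the results.
def read_ranges(io, ranges)
  ranges.map do |offset, length|
    io.seek offset
    io.read(length).to_s
  end.join
end

source = StringIO.new('0123456789')

# Slices may be listed in any order, so the output can be reordered.
reordered = read_ranges(source, [[6, 2], [0, 3]])

# A single range gives a limited view of one section of the data.
window = read_ranges(source, [[4, 4]])
```

Only the requested slices are read from the source, which is the point of the streamed approach described above.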
Limitations
- No buffering. This is by design at the moment; the class is intended for large reads.
TODO
On further reflection, this class is something of a joining/optimization of two separate IO classes: a SubfileIO, for providing access to a range within a File as a separate IO object, and a ConcatIO, allowing the presentation of a bunch of IO objects as a single unified whole.
I will need such a ConcatIO if I’m to provide Mime#to_io, a method that will convert a whole mime message into an IO stream that can be read from. It will just be the concatenation of a series of IO objects: StringIOs corresponding to headers and boundaries, and SubfileIO objects coming from the original message proper, or a RangesIO as provided by Attachment#data, which will then get wrapped by Mime in a Base64IO or similar to get encoded on the fly. Thus the attachment, in its plain or encoded form, and the message as a whole never exist as a single string in memory, as they do now. This is a fair bit of work to achieve, but generally useful I believe.
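To make the ConcatIO idea concrete, here is a minimal read-only sketch. `ConcatIOSketch` is a hypothetical name; no such class is confirmed to exist in the library, and this is only one way the concatenation might be done.

```ruby
require 'stringio'

# Hypothetical sketch: presents several IO objects as one read-only stream.
class ConcatIOSketch
  def initialize(*ios)
    @ios = ios
    @index = 0
  end

  # Read up to limit bytes (or everything if limit is nil), moving on to
  # the next underlying IO whenever the current one is exhausted.
  def read(limit = nil)
    data = +''
    while @index < @ios.length && (limit.nil? || data.length < limit)
      want = limit && (limit - data.length)
      chunk = want ? @ios[@index].read(want) : @ios[@index].read
      if chunk.nil? || chunk.empty?
        @index += 1 # current IO is exhausted, continue with the next one
      else
        data << chunk
      end
    end
    data
  end
end

headers = StringIO.new("Subject: hi\r\n\r\n")
body    = StringIO.new('body text')
message = ConcatIOSketch.new(headers, body)

first = message.read(10) # stays within the headers
rest  = message.read     # crosses from headers into the body
```

The whole message is never assembled up front; each call only pulls what it needs from the underlying objects.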
This class isn’t OLE-specific; maybe move it to my general Ruby stream project.
Direct Known Subclasses
Instance Attribute Summary
- #io ⇒ Object (readonly)
  Returns the value of attribute io.
- #mode ⇒ Object (readonly)
  Returns the value of attribute mode.
- #pos ⇒ Object (also: #tell)
  Returns the value of attribute pos.
- #ranges ⇒ Object (readonly)
  Returns the value of attribute ranges.
- #size ⇒ Object
  Returns the value of attribute size.
Class Method Summary
- .open(*args, &block) ⇒ Object
  Adds a block form.
Instance Method Summary
- #close ⇒ Object
- #eof? ⇒ Boolean
- #gets ⇒ Object (also: #readline)
  I can wrap it in a buffered io stream that provides gets, and appropriately handle pos, truncate.
- #initialize(io, mode = 'r', params = {}) ⇒ RangesIO (constructor)
  io: the parent io object that we are wrapping.
- #inspect ⇒ Object
- #offset_and_size(pos) ⇒ Object
  Returns the [offset, size] pair in order to read/write at pos (like a partial range), and its index.
- #read(limit = nil) ⇒ Object
  Read bytes from file, to a maximum of limit, or all available if unspecified.
- #truncate(size) ⇒ Object
  You may override this call to update @ranges and @size, if applicable.
- #write(data) ⇒ Object
Constructor Details
#initialize(io, mode = 'r', params = {}) ⇒ RangesIO
io: the parent io object that we are wrapping.
mode: the mode to use.
params: hash of params.
- :ranges - byte offsets, either:
  - an array of ranges, [1..2, 4..5, 6..8], or
  - an array of arrays, where the second element is the length: [[1, 1], [4, 1], [6, 2]] for the above (think the way String indexing works). Note that the end of each range is therefore treated as exclusive.
- :close_parent - boolean to close parent when this object is closed
NOTE: the ranges can overlap.
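The two accepted forms of :ranges can be compared with a small standalone sketch. `normalize_ranges` is a hypothetical helper name; the conversion it applies is the one used in the constructor source (a Range r becomes [r.begin, r.end - r.begin]).

```ruby
# Hypothetical helper: convert Range entries to [offset, length] pairs and
# leave entries that are already pairs untouched.
def normalize_ranges(ranges)
  ranges.map { |r| Range === r ? [r.begin, r.end - r.begin] : r }
end

pairs = normalize_ranges([1..2, 4..5, 6..8])
mixed = normalize_ranges([0..4, [10, 3]])

# The logical size of the stream is the sum of the lengths.
total = pairs.inject(0) { |sum, (_offset, length)| sum + length }
```

Because the length is computed as end minus begin, 1..2 covers one byte, not two.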
    # File 'lib/ole/ranges_io.rb', line 54
    def initialize io, mode='r', params={}
      mode, params = 'r', mode if Hash === mode
      ranges = params[:ranges]
      @params = {:close_parent => false}.merge params
      @mode = IO::Mode.new mode
      @io = io
      # convert ranges to arrays. check for negative ranges?
      ranges ||= [0, io.size]
      @ranges = ranges.map { |r| Range === r ? [r.begin, r.end - r.begin] : r }
      # calculate size
      @size = @ranges.inject(0) { |total, (pos, len)| total + len }
      # initial position in the file
      @pos = 0

      # handle some mode flags
      truncate 0 if @mode.truncate?
      seek size if @mode.append?
    end
Instance Attribute Details
#io ⇒ Object (readonly)
Returns the value of attribute io.
    # File 'lib/ole/ranges_io.rb', line 43
    def io
      @io
    end
#mode ⇒ Object (readonly)
Returns the value of attribute mode.
    # File 'lib/ole/ranges_io.rb', line 43
    def mode
      @mode
    end
#pos ⇒ Object Also known as: tell
Returns the value of attribute pos.
    # File 'lib/ole/ranges_io.rb', line 43
    def pos
      @pos
    end
#ranges ⇒ Object (readonly)
Returns the value of attribute ranges.
    # File 'lib/ole/ranges_io.rb', line 43
    def ranges
      @ranges
    end
#size ⇒ Object
Returns the value of attribute size.
    # File 'lib/ole/ranges_io.rb', line 43
    def size
      @size
    end
Class Method Details
.open(*args, &block) ⇒ Object
Adds a block form. TODO: add test for this.

    # File 'lib/ole/ranges_io.rb', line 78
    def self.open(*args, &block)
      ranges_io = new(*args)
      if block_given?
        begin; yield ranges_io
        ensure; ranges_io.close
        end
      else
        ranges_io
      end
    end
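The block-form pattern can be seen in isolation with a stand-in class. `ResourceSketch` is hypothetical and exists only for this illustration; it tracks whether close was called so the effect of the ensure clause is visible.

```ruby
# Hypothetical stand-in class, used only to show the block-form pattern.
class ResourceSketch
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  # With a block: yield the object and always close it afterwards.
  # Without a block: return the open object to the caller.
  def self.open(*args)
    resource = new(*args)
    return resource unless block_given?
    begin
      yield resource
    ensure
      resource.close
    end
  end
end

seen = nil
result = ResourceSketch.open { |r| seen = r; :done }
plain  = ResourceSketch.open
```

In the block form the return value is the block's result and the object is closed even if the block raises; in the plain form the caller is responsible for calling close.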
Instance Method Details
#close ⇒ Object
    # File 'lib/ole/ranges_io.rb', line 105
    def close
      @io.close if @params[:close_parent]
    end
#eof? ⇒ Boolean
    # File 'lib/ole/ranges_io.rb', line 124
    def eof?
      @pos == @size
    end
#gets ⇒ Object Also known as: readline
I can wrap it in a buffered io stream that provides gets, and appropriately handle pos, truncate. Mostly added just to pass the tests. FIXME

    # File 'lib/ole/ranges_io.rb', line 200
    def gets
      s = read 1024
      i = s.index "\n"
      @pos -= s.length - (i+1)
      s[0..i]
    end
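The read-a-chunk-then-rewind approach can be tried on a plain StringIO. `gets_sketch` is a hypothetical helper, not the library's method, and it makes two assumptions that go beyond the listing: it returns nil at end of input, and it returns the remaining data when no newline is found.

```ruby
require 'stringio'

# Hypothetical sketch of chunk-based line reading on any IO that supports
# read and pos=. Assumes lines are shorter than chunk_size.
def gets_sketch(io, chunk_size = 1024)
  start = io.pos
  chunk = io.read(chunk_size)
  return nil if chunk.nil? || chunk.empty?

  newline = chunk.index("\n")
  line = newline ? chunk[0..newline] : chunk
  # Rewind so the next read starts right after the returned line.
  io.pos = start + line.bytesize
  line
end

io = StringIO.new("first\nsecond")
a = gets_sketch(io)
b = gets_sketch(io)
c = gets_sketch(io)
```

A buffered wrapper, as suggested in the note above, would avoid re-reading the part of the chunk that follows the newline.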
#inspect ⇒ Object
    # File 'lib/ole/ranges_io.rb', line 208
    def inspect
      # the rescue is for empty files
      pos, len = (@ranges[offset_and_size(@pos).last] rescue [nil, nil])
      range_str = pos ? "#{pos}..#{pos+len}" : 'nil'
      "#<#{self.class} io=#{io.inspect}, size=#@size, pos=#@pos, "\
        "range=#{range_str}>"
    end
#offset_and_size(pos) ⇒ Object
Returns the [offset, size] pair in order to read/write at pos (like a partial range), and its index.

    # File 'lib/ole/ranges_io.rb', line 111
    def offset_and_size pos
      total = 0
      ranges.each_with_index do |(offset, size), i|
        if pos <= total + size
          diff = pos - total
          return [offset + diff, size - diff], i
        end
        total += size
      end
      # should be impossible for any valid pos, (0...size) === pos
      raise ArgumentError, "no range for pos #{pos.inspect}"
    end
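The lookup is easier to follow with concrete numbers. The function below is a standalone copy of the same logic under a hypothetical name, `locate`, taking the ranges as an argument instead of reading them from the object.

```ruby
# Standalone sketch of the lookup: given [offset, length] pairs and a
# logical position, return [[physical_offset, remaining_length], index].
def locate(ranges, pos)
  total = 0
  ranges.each_with_index do |(offset, length), i|
    if pos <= total + length
      diff = pos - total
      return [offset + diff, length - diff], i
    end
    total += length
  end
  raise ArgumentError, "no range for pos #{pos.inspect}"
end

ranges = [[100, 4], [200, 6]] # logical size is 10 bytes

first    = locate(ranges, 0) # start of the first range
inside   = locate(ranges, 7) # three bytes into the second range
boundary = locate(ranges, 4) # exactly at the end of the first range
```

Because the comparison is `<=`, a position that sits exactly on a range boundary resolves to a zero-length remainder of the earlier range, not to the start of the next one.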
#read(limit = nil) ⇒ Object
Read bytes from file, to a maximum of limit, or all available if unspecified.

    # File 'lib/ole/ranges_io.rb', line 129
    def read limit=nil
      data = ''
      return data if eof?
      limit ||= size
      partial_range, i = offset_and_size @pos
      # this may be conceptually nice (create sub-range starting where we are), but
      # for a large range array its pretty wasteful. even the previous way was. but
      # i'm not trying to optimize this atm. it may even go to c later if necessary.
      ([partial_range] + ranges[i+1..-1]).each do |pos, len|
        @io.seek pos
        if limit < len
          # convoluted, to handle read errors. s may be nil
          s = @io.read limit
          @pos += s.length if s
          break data << s
        end
        # convoluted, to handle ranges beyond the size of the file
        s = @io.read len
        @pos += s.length if s
        data << s
        break if s.length != len
        limit -= len
      end
      data
    end
#truncate(size) ⇒ Object
You may override this call to update @ranges and @size, if applicable.

    # File 'lib/ole/ranges_io.rb', line 156
    def truncate size
      raise NotImplementedError, 'truncate not supported'
    end
#write(data) ⇒ Object
    # File 'lib/ole/ranges_io.rb', line 166
    def write data
      # short cut. needed because truncate 0 may return no ranges, instead of empty range,
      # thus offset_and_size fails.
      return 0 if data.empty?
      data_pos = 0
      # if we don't have room, we can use the truncate hook to make more space.
      if data.length > @size - @pos
        begin
          truncate @pos + data.length
        rescue NotImplementedError
          raise IOError, "unable to grow #{inspect} to write #{data.length} bytes"
        end
      end
      partial_range, i = offset_and_size @pos
      ([partial_range] + ranges[i+1..-1]).each do |pos, len|
        @io.seek pos
        if data_pos + len > data.length
          chunk = data[data_pos..-1]
          @io.write chunk
          @pos += chunk.length
          data_pos = data.length
          break
        end
        @io.write data[data_pos, len]
        @pos += len
        data_pos += len
      end
      data_pos
    end
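The way a write is split across ranges can be sketched on a StringIO. `write_ranges` is a hypothetical helper, not the library's method; it always starts at the first range and does not try to grow the target, so it leaves out the position tracking and the truncate hook used above.

```ruby
require 'stringio'

# Hypothetical helper: write data across [offset, length] slices of io,
# in the order given, and return the number of bytes written.
def write_ranges(io, ranges, data)
  written = 0
  ranges.each do |offset, length|
    break if written >= data.length
    chunk = data[written, length] # at most length bytes for this slice
    io.seek offset
    io.write chunk
    written += chunk.length
  end
  written
end

target = StringIO.new(+'..........')
count = write_ranges(target, [[6, 2], [0, 3]], 'ABCDE')
layout = target.string
```

The first two bytes land in the slice at offset 6 and the remaining three in the slice at offset 0, so consecutive bytes in the logical stream need not be adjacent in the underlying data.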