Method: Ole::Storage#load

Defined in:
lib/ole/storage/base.rb

#load ⇒ Object

Loads the document from the file: reads the header, the big and small block allocation tables, and the directory tree.

TODO: implement various allocation table checks, maybe as an AllocationTable#fsck method:

  1. Re-terminate any chain that does not end in EOC. Compare each file's size with the blocks actually allocated to it.

  2. Pass through all chain heads looking for collisions, and make sure that nothing points to them (i.e. that they really are heads). Do this in both the sbat and the mbat.

  3. The locations of the bbat data and the mbat data are known. Ensure that there are placeholder blocks in the bat for them.

  4. Maybe check for excess data. If there is data outside bbat.truncate.length + 1 * block_size (i.e. the value used for truncation in #flush), then maybe log a message about it, since it will be thrown away automatically at close time.
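
Checks 1 and 2 could look roughly like the sketch below. It works on a plain Ruby array in which table[i] holds the index of the block that follows block i, not on AllocationTable itself. The helpers walk_chain and check_heads are hypothetical, and the EOC and AVAIL values are assumed to be the compound file format's end-of-chain and free-block markers.

```ruby
# Hypothetical standalone helpers; they are not part of Ole::Storage.
EOC   = 0xfffffffe # end-of-chain marker (assumed value)
AVAIL = 0xffffffff # free-block marker (assumed value)

# Follow a chain starting at +head+. Returns the visited block indices and
# one of :terminated, :dangling (ran into a free or out-of-range entry)
# or :loop (revisited a block).
def walk_chain(table, head)
  seen = []
  idx = head
  loop do
    return [seen, :terminated] if idx == EOC
    return [seen, :dangling] if idx.nil? || idx == AVAIL || idx >= table.length
    return [seen, :loop] if seen.include?(idx)
    seen << idx
    idx = table[idx]
  end
end

# Report heads that are pointed to by another block, chains that do not end
# in EOC, and blocks claimed by more than one chain.
def check_heads(table, heads)
  problems = []
  owner = {}
  heads.each do |head|
    problems << "head #{head} is pointed to by another block" if table.include?(head)
    blocks, status = walk_chain(table, head)
    problems << "chain #{head} ends #{status}" unless status == :terminated
    blocks.each do |b|
      problems << "block #{b} shared by chains #{owner[b]} and #{head}" if owner.key?(b)
      owner[b] ||= head
    end
  end
  problems
end
```

A chain reported as :dangling or :loop is a candidate for re-termination. The seen.include? test is linear in the chain length, so a real implementation would probably track visited blocks in a Set or a bitmap.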



# File 'lib/ole/storage/base.rb', line 107

def load
  # we always read 512 for the header block. if the block size ends up being different,
  # what happens to the 109 fat entries. are there more/less entries?
  @io.rewind
  header_block = @io.read 512
  @header = Header.new header_block

  # create an empty bbat.
  @bbat = AllocationTable::Big.new self
  bbat_chain = header_block[Header::SIZE..-1].unpack 'V*'
  mbat_block = @header.mbat_start
  @header.num_mbat.times do
    blocks = @bbat.read([mbat_block]).unpack 'V*'
    mbat_block = blocks.pop
    bbat_chain += blocks
  end
  # am i using num_bat in the right way?
  @bbat.load @bbat.read(bbat_chain[0, @header.num_bat])
  
  # get block chain for directories, read it, then split it into chunks and load the
  # directory entries. semantics changed - used to cut at first dir where dir.type == 0
  @dirents = @bbat.read(@header.dirent_start).to_enum(:each_chunk, Dirent::SIZE).
    map { |str| Dirent.new self, str }

  # now reorder from flat into a tree
  # links are stored in some kind of balanced binary tree
  # check that everything is visited at least, and at most once
  # similarly with the blocks of the file.
  # was thinking of moving this to Dirent.to_tree instead.
  class << @dirents
    def to_tree idx=0
      return [] if idx == Dirent::EOT
      d = self[idx]
      to_tree(d.child).each { |child| d << child }
      raise FormatError, "directory #{d.inspect} used twice" if d.idx
      d.idx = idx
      to_tree(d.prev) + [d] + to_tree(d.next)
    end
  end

  @root = @dirents.to_tree.first
  @dirents.reject! { |d| d.type_id == 0 }
  # silence this warning by default, its not really important (issue #5).
  # fairly common one appears to be "R" (from office OS X?) which smells
  # like some kind of UTF16 snafu, but scottwillson also has had some kanji...
  #Log.warn "root name was #{@root.name.inspect}" unless @root.name == 'Root Entry'
  unused = @dirents.reject(&:idx).length
  Log.warn "#{unused} unused directories" if unused > 0

  # FIXME i don't currently use @header.num_sbat which i should
  # hmm. nor do i write it. it means what exactly again?
  # which mode to use here?
  @sb_file = RangesIOResizeable.new @bbat, :first_block => @root.first_block, :size => @root.size
  @sbat = AllocationTable::Small.new self
  @sbat.load @bbat.read(@header.sbat_start)
end
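
The flat-to-tree step in the listing can be tried in isolation with the sketch below. Node is a hypothetical stand-in for Dirent, and EOT is assumed to match Dirent::EOT. Siblings are linked as a binary tree through prev and next, while child points at the root of an entry's own sibling tree. Unlike the listing, this sketch marks an entry as visited before descending into its child, so a cyclic child link raises instead of recursing without bound.

```ruby
# Hypothetical stand-in for Dirent; only the link fields are modelled.
EOT = 0xffffffff # "no entry" marker (assumed to match Dirent::EOT)

Node = Struct.new(:name, :prev, :next, :child, :idx, :children) do
  # Mirror the listing's `d << child` by appending to a children array.
  def <<(node)
    children << node
  end
end

# Return the entries of the sibling tree rooted at +idx+ in in-order
# sequence, attaching each entry's own children along the way.
def to_tree(nodes, idx = 0)
  return [] if idx == EOT
  d = nodes[idx]
  raise ArgumentError, "entry #{idx} used twice" if d.idx
  d.idx = idx
  to_tree(nodes, d.child).each { |c| d << c }
  to_tree(nodes, d.prev) + [d] + to_tree(nodes, d.next)
end

# Usage: entry 0 is the root; its children are stored as a small binary
# tree rooted at entry 2, with entry 1 to the left and entry 3 to the right.
nodes = [
  Node.new('Root', EOT, EOT, 2, nil, []),
  Node.new('a', EOT, EOT, EOT, nil, []),
  Node.new('b', 1, 3, EOT, nil, []),
  Node.new('c', EOT, EOT, EOT, nil, [])
]
root = to_tree(nodes).first
puts root.children.map(&:name).inspect
```

Entries whose idx is still nil after the traversal were never reached, which corresponds to the "unused directories" warning in the listing.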