Module: Archive::Zip::Entry

Included in:
Directory, File, Symlink
Defined in:
lib/archive/zip/entry.rb

Overview

The Archive::Zip::Entry mixin provides classes with methods implementing many of the common features of all entry types. Some of these methods, such as dump_local_file_record and dump_central_file_record, are required by Archive::Zip in order to store the entry in an archive and should not be overridden. Others, such as ftype and mode=, are expected to be overridden to provide sensible information for the new entry type.

A class using this mixin must provide two methods: extract and dump_file_data. extract should be a public method with the following signature:

def extract(options = {})
  ...
end

This method should extract the contents of the entry to the filesystem. options should be an optional Hash containing a mapping of option names to option values. Please refer to Archive::Zip::Entry::File#extract, Archive::Zip::Entry::Symlink#extract, and Archive::Zip::Entry::Directory#extract for examples of the options currently supported.

dump_file_data should be a private method with the following signature:

def dump_file_data(io)
  ...
end

This method should use the write method of io to write all file data. io will be a writable, IO-like object.
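As a rough illustration of the contract described above, here is a minimal sketch of a custom entry type. The class name InMemoryEntry and its constructor are hypothetical, and the :file_path option is assumed here to name the extraction destination. A real implementation would also include the Archive::Zip::Entry mixin; that line is omitted so the sketch runs without the gem installed.

```ruby
require 'stringio'

# Hypothetical entry type that keeps its file data in memory. In real use the
# class would also `include Archive::Zip::Entry`.
class InMemoryEntry
  def initialize(zip_path, data)
    @zip_path = zip_path
    @data = data
  end

  # Public: write the entry's contents to the filesystem.
  # :file_path is an assumed option naming the destination path.
  def extract(options = {})
    file_path = options.fetch(:file_path, @zip_path)
    ::File.binwrite(file_path, @data)
    nil
  end

  private

  # Private: stream all file data to io using only its write method.
  def dump_file_data(io)
    io.write(@data)
  end
end

entry = InMemoryEntry.new('hello.txt', 'hello, zip')
buffer = StringIO.new
entry.send(:dump_file_data, buffer) # private, so invoked via send in this demo
puts buffer.string
```

Keeping dump_file_data private mirrors the requirement above: only the mixin's own record-dumping methods are expected to call it.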

The class methods from_file and parse are factories for creating the 3 kinds of concrete entries currently implemented: File, Directory, and Symlink. While it is possible to create new archives using custom entry implementations, it is not possible to load those same entries from the archive since the parse factory method does not know about them. Patches to support new entry types are welcome.

Defined Under Namespace

Classes: CFHRecord, Directory, File, LFHRecord, Symlink

Constant Summary

FLAG_ENCRYPTED = 0b0001

When this flag is set in the general purpose flags, it indicates that the entry’s file data is encrypted using the original (weak) algorithm.

FLAG_DATA_DESCRIPTOR_FOLLOWS = 0b1000

When this flag is set in the general purpose flags, it indicates that the data descriptor record for a local file record is located after the entry’s file data.
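The two flags are single bits, so they can be tested and combined with bitwise operators. The following sketch redefines the constants inside a hypothetical ZipFlagSketch module (so that it runs without the library) and decodes a general purpose flags value.

```ruby
# Hypothetical module; the constant values are copied from the summary above.
module ZipFlagSketch
  FLAG_ENCRYPTED               = 0b0001
  FLAG_DATA_DESCRIPTOR_FOLLOWS = 0b1000

  # Returns a Hash saying which of the two flags are set.
  def self.describe(general_purpose_flags)
    {
      encrypted: (general_purpose_flags & FLAG_ENCRYPTED) != 0,
      data_descriptor_follows:
        (general_purpose_flags & FLAG_DATA_DESCRIPTOR_FOLLOWS) != 0
    }
  end
end

flags = ZipFlagSketch::FLAG_ENCRYPTED | ZipFlagSketch::FLAG_DATA_DESCRIPTOR_FOLLOWS
p ZipFlagSketch.describe(flags)
```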

Instance Attribute Details

#atime ⇒ Object

The last accessed time.

# File 'lib/archive/zip/entry.rb', line 492

def atime
  @atime
end

#comment ⇒ Object

The comment associated with this entry.

# File 'lib/archive/zip/entry.rb', line 502

def comment
  @comment
end

#compression_codec ⇒ Object

The selected compression codec.

# File 'lib/archive/zip/entry.rb', line 509

def compression_codec
  @compression_codec
end

#encryption_codec ⇒ Object

The selected encryption codec.

# File 'lib/archive/zip/entry.rb', line 511

def encryption_codec
  @encryption_codec
end

#expected_data_descriptor ⇒ Object

An Archive::Zip::DataDescriptor instance which should contain the expected CRC32 checksum, compressed size, and uncompressed size for the file data. When not nil, this is used by #extract to confirm that the data extraction was successful.

# File 'lib/archive/zip/entry.rb', line 507

def expected_data_descriptor
  @expected_data_descriptor
end

#gid ⇒ Object

The group ID of the owner of this entry.

# File 'lib/archive/zip/entry.rb', line 498

def gid
  @gid
end

#mode ⇒ Object

The file mode/permission bits for this entry.

# File 'lib/archive/zip/entry.rb', line 500

def mode
  @mode
end

#mtime ⇒ Object

The last modified time.

# File 'lib/archive/zip/entry.rb', line 494

def mtime
  @mtime
end

#password ⇒ Object

The password used with the encryption codec to encrypt or decrypt the file data for an entry.

# File 'lib/archive/zip/entry.rb', line 514

def password
  @password
end

#raw_data ⇒ Object

The raw, possibly compressed and/or encrypted file data for an entry.

# File 'lib/archive/zip/entry.rb', line 516

def raw_data
  @raw_data
end

#uid ⇒ Object

The user ID of the owner of this entry.

# File 'lib/archive/zip/entry.rb', line 496

def uid
  @uid
end

#zip_path ⇒ Object

The path for this entry in the ZIP archive.

# File 'lib/archive/zip/entry.rb', line 490

def zip_path
  @zip_path
end

Class Method Details

.expand_path(zip_path) ⇒ Object

Cleans up and returns zip_path by eliminating . and .. references, leading and trailing slashes, and runs of consecutive slashes.

# File 'lib/archive/zip/entry.rb', line 94

def self.expand_path(zip_path)
  result = []
  source = zip_path.split('/')

  source.each do |e|
    next if e.empty? || e == '.'

    if e == '..' && ! (result.last.nil? || result.last == '..') then
      result.pop
    else
      result.push(e)
    end
  end
  result.shift while result.first == '..'

  result.join('/')
end
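
To make the normalization rules concrete, the sketch below copies the logic into a standalone method (named expand_zip_path here purely for illustration, so it runs without the library) and applies it to a few sample paths.

```ruby
# Standalone copy of the logic shown above; expand_zip_path is a
# hypothetical name used only for this demonstration.
def expand_zip_path(zip_path)
  result = []
  zip_path.split('/').each do |e|
    # Empty segments come from leading or repeated slashes.
    next if e.empty? || e == '.'

    if e == '..' && !(result.last.nil? || result.last == '..')
      result.pop
    else
      result.push(e)
    end
  end
  # Drop any '..' segments that would escape the archive root.
  result.shift while result.first == '..'
  result.join('/')
end

puts expand_zip_path('/a/./b//../c/') # => "a/c"
puts expand_zip_path('../../x')       # => "x"
puts expand_zip_path('a/../../b')     # => "b"
```

Note that leading .. segments are discarded rather than preserved, so the cleaned path never points above the archive root.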

.from_file(file_path, options = {}) ⇒ Object

Creates a new Entry based upon a file, symlink, or directory. file_path points to the source item. options is a Hash optionally containing the following:

:zip_path

The path for the entry in the archive where ‘/’ is the file separator character. This defaults to the basename of file_path if unspecified.

:follow_symlinks

When set to true (the default), symlinks are treated as the files or directories to which they point.

:compression_codec

Specifies a proc, lambda, or class. If a proc or lambda is used, it must take a single argument containing a zip entry and return a compression codec class to be instantiated and used with the entry. Otherwise, a compression codec class must be specified directly. When unset, the default compression codec for each entry type is used.

:encryption_codec

Specifies a proc, lambda, or class. If a proc or lambda is used, it must take a single argument containing a zip entry and return an encryption codec class to be instantiated and used with the entry. Otherwise, an encryption codec class must be specified directly. When unset, the default encryption codec for each entry type is used.

Raises Archive::Zip::EntryError if processing the given file path results in a file not found error.



# File 'lib/archive/zip/entry.rb', line 136

def self.from_file(file_path, options = {})
  zip_path        = options.has_key?(:zip_path) ?
                    expand_path(options[:zip_path]) :
                    ::File.basename(file_path)
  follow_symlinks = options.has_key?(:follow_symlinks) ?
                    options[:follow_symlinks] :
                    true

  # Avoid repeatedly stat'ing the file by storing the stat structure once.
  begin
    stat = follow_symlinks ?
           ::File.stat(file_path) :
           ::File.lstat(file_path)
  rescue Errno::ENOENT
    if ::File.symlink?(file_path) then
      raise Zip::EntryError,
        "symlink at `#{file_path}' points to a non-existent file `#{::File.readlink(file_path)}'"
    else
      raise Zip::EntryError, "no such file or directory `#{file_path}'"
    end
  end

  # Ensure that zip paths for directories end with '/'.
  if stat.directory? then
    zip_path += '/'
  end

  # Instantiate the entry.
  if stat.symlink? then
    entry = Entry::Symlink.new(zip_path)
    entry.link_target = ::File.readlink(file_path)
  elsif stat.file? then
    entry = Entry::File.new(zip_path)
    entry.file_path = file_path
  elsif stat.directory? then
    entry = Entry::Directory.new(zip_path)
  else
    raise Zip::EntryError,
      "unsupported file type `#{stat.ftype}' for file `#{file_path}'"
  end

  # Set the compression and encryption codecs.
  unless options[:compression_codec].nil? then
    if options[:compression_codec].kind_of?(Proc) then
      entry.compression_codec = options[:compression_codec][entry].new
    else
      entry.compression_codec = options[:compression_codec].new
    end
  end
  unless options[:encryption_codec].nil? then
    if options[:encryption_codec].kind_of?(Proc) then
      entry.encryption_codec = options[:encryption_codec][entry].new
    else
      entry.encryption_codec = options[:encryption_codec].new
    end
  end

  # Set the entry's metadata.
  entry.uid = stat.uid
  entry.gid = stat.gid
  entry.mtime = stat.mtime
  entry.atime = stat.atime
  entry.mode = stat.mode

  entry
end
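
The handling of :compression_codec and :encryption_codec above accepts either a class or a callable that picks a class per entry. The sketch below isolates that pattern. StoreCodecSketch, DeflateCodecSketch, EntrySketch, and resolve_codec are all hypothetical stand-ins, not part of the library, and returning nil here simply represents "keep the entry's default codec".

```ruby
# Hypothetical stand-ins for codec classes and for an entry object.
class StoreCodecSketch; end
class DeflateCodecSketch; end
EntrySketch = Struct.new(:zip_path)

# Resolves a codec option to a codec instance, or nil when the option is unset.
def resolve_codec(option, entry)
  return nil if option.nil?

  # A proc or lambda is called with the entry and must return a codec class.
  codec_class = option.kind_of?(Proc) ? option[entry] : option
  codec_class.new
end

# Example policy: do not recompress files that are already compressed.
chooser = lambda do |entry|
  entry.zip_path.end_with?('.png') ? StoreCodecSketch : DeflateCodecSketch
end

puts resolve_codec(chooser, EntrySketch.new('logo.png')).class
puts resolve_codec(chooser, EntrySketch.new('notes.txt')).class
puts resolve_codec(StoreCodecSketch, EntrySketch.new('notes.txt')).class
```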

.parse(io) ⇒ Object

Creates and returns a new entry object by parsing from the current position of io. io must be a readable, IO-like object which is positioned at the start of a central file record following the signature for that record.

NOTE: For now io MUST be seekable.

Currently, the only entry objects returned are instances of Archive::Zip::Entry::File, Archive::Zip::Entry::Directory, and Archive::Zip::Entry::Symlink. Any other kind of entry will be mapped into an instance of Archive::Zip::Entry::File.

Raises Archive::Zip::EntryError for any errors related to processing the entry.



# File 'lib/archive/zip/entry.rb', line 217

def self.parse(io)
  # Parse the central file record and then use the information found there
  # to locate and parse the corresponding local file record.
  cfr = parse_central_file_record(io)
  next_record_position = io.pos
  io.seek(cfr.local_header_position)
  unless IOExtensions.read_exactly(io, 4) == LFH_SIGNATURE then
    raise Zip::EntryError, 'bad local file header signature'
  end
  lfr = parse_local_file_record(io, cfr.compressed_size)

  # Check to ensure that the contents of the central file record and the
  # local file record which are supposed to be duplicated are in fact the
  # same.
  compare_file_records(lfr, cfr)

  begin
    # Load the correct compression codec.
    compression_codec = Codec.create_compression_codec(
      cfr.compression_method,
      cfr.general_purpose_flags
    )
  rescue Zip::Error => e
    raise Zip::EntryError, "`#{cfr.zip_path}': #{e.message}"
  end

  begin
    # Load the correct encryption codec.
    encryption_codec = Codec.create_encryption_codec(
      cfr.general_purpose_flags
    )
  rescue Zip::Error => e
    raise Zip::EntryError, "`#{cfr.zip_path}': #{e.message}"
  end

  # Set up a data descriptor with expected values for later comparison.
  expected_data_descriptor = DataDescriptor.new(
    cfr.crc32,
    cfr.compressed_size,
    cfr.uncompressed_size
  )

  # Create the entry.
  expanded_path = expand_path(cfr.zip_path)
  io_window = IOWindow.new(io, io.pos, cfr.compressed_size)
  if cfr.zip_path[-1..-1] == '/' then
    # This is a directory entry.
    entry = Entry::Directory.new(expanded_path, io_window)
  elsif (cfr.external_file_attributes >> 16) & 0770000 == 0120000 then
    # This is a symlink entry.
    entry = Entry::Symlink.new(expanded_path, io_window)
  else
    # Anything else is a file entry.
    entry = Entry::File.new(expanded_path, io_window)
  end

  # Set the expected data descriptor so that extraction can be verified.
  entry.expected_data_descriptor = expected_data_descriptor
  # Record the compression codec.
  entry.compression_codec = compression_codec
  # Record the encryption codec.
  entry.encryption_codec = encryption_codec
  # Set some entry metadata.
  entry.mtime = cfr.mtime
  # Only set mode bits for the entry if the external file attributes are
  # Unix-compatible.
  if cfr.made_by_version & 0xFF00 == 0x0300 then
    entry.mode = cfr.external_file_attributes >> 16
  end
  entry.comment = cfr.comment
  cfr.extra_fields.each { |ef| entry.add_extra_field(ef) }
  lfr.extra_fields.each { |ef| entry.add_extra_field(ef) }

  # Return to the beginning of the next central directory record.
  io.seek(next_record_position)

  entry
end
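
The entry type selection in parse depends on only two inputs: the trailing slash of the stored path and the Unix mode bits kept in the upper 16 bits of the external file attributes. The sketch below reproduces just those checks. The method names entry_kind and unix_attributes? are hypothetical, and the masks are copied from the source above.

```ruby
# Classifies an entry the same way the source above does.
def entry_kind(zip_path, external_file_attributes)
  unix_mode = external_file_attributes >> 16
  if zip_path.end_with?('/')
    :directory
  elsif unix_mode & 0770000 == 0120000
    :symlink
  else
    :file
  end
end

# True when the high byte of "version made by" marks a Unix host (3),
# which is the condition under which the mode bits are trusted.
def unix_attributes?(made_by_version)
  made_by_version & 0xFF00 == 0x0300
end

p entry_kind('docs/', 0)
p entry_kind('current', 0120777 << 16)
p entry_kind('README', 0100644 << 16)
p unix_attributes?(0x031E)
```

Because anything that is neither a directory nor a symlink falls through to :file, unknown entry types are read back as regular files, as stated in the description of parse.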

Instance Method Details

#add_extra_field(extra_field) ⇒ Object

Adds extra_field as an extra field specification to both the central file record and the local file record of this entry.

If extra_field is an instance of Archive::Zip::Entry::ExtraField::ExtendedTimestamp, the values of that field are used to set mtime and atime for this entry. If extra_field is an instance of Archive::Zip::Entry::ExtraField::Unix, the values of that field are used to set mtime, atime, uid, and gid for this entry.



# File 'lib/archive/zip/entry.rb', line 558

def add_extra_field(extra_field)
  # Try to find an extra field with the same header ID already in the list
  # and merge the new one with that if one exists; otherwise, add the new
  # one to the list.
  existing_extra_field = @extra_fields.find do |ef|
    ef.header_id == extra_field.header_id
  end
  if existing_extra_field.nil? then
    @extra_fields << extra_field
  else
    extra_field = existing_extra_field.merge(extra_field)
  end

  # Set some attributes of this entry based on the settings in select types
  # of extra fields.
  if extra_field.kind_of?(ExtraField::ExtendedTimestamp) then
    self.mtime = extra_field.mtime unless extra_field.mtime.nil?
    self.atime = extra_field.atime unless extra_field.atime.nil?
  elsif extra_field.kind_of?(ExtraField::Unix) then
    self.mtime = extra_field.mtime unless extra_field.mtime.nil?
    self.atime = extra_field.atime unless extra_field.atime.nil?
    self.uid   = extra_field.uid unless extra_field.uid.nil?
    self.gid   = extra_field.gid unless extra_field.gid.nil?
  end
  self
end
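
The find-or-merge step at the top of add_extra_field can be sketched in isolation. FieldSketch and add_field below are hypothetical, and the behavior of merge (mutate the existing field and return it) is an assumption made for this example rather than a statement about the library's extra field classes. The header ID 0x5455 is used only as a sample value.

```ruby
# Hypothetical extra field: a header ID plus a Hash of values.
FieldSketch = Struct.new(:header_id, :values) do
  # Assumed merge semantics: fold the other field's values into this one.
  def merge(other)
    values.merge!(other.values)
    self
  end
end

# Adds new_field to fields, merging into an existing field with the same
# header ID when there is one. Returns the field that now holds the data.
def add_field(fields, new_field)
  existing = fields.find { |f| f.header_id == new_field.header_id }
  if existing.nil?
    fields << new_field
    new_field
  else
    existing.merge(new_field)
  end
end

fields = []
add_field(fields, FieldSketch.new(0x5455, { mtime: 1 }))
add_field(fields, FieldSketch.new(0x5455, { atime: 2 }))
p fields.size
p fields.first.values
```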

#directory? ⇒ Boolean

Returns false.

Returns:

  • (Boolean)

# File 'lib/archive/zip/entry.rb', line 546

def directory?
  false
end

#dump_central_file_record(io) ⇒ Object

Writes the central file record for this entry to io, a writable, IO-like object which provides a write method. Returns the number of bytes written.

NOTE: This method should only be called by Archive::Zip.



# File 'lib/archive/zip/entry.rb', line 694

def dump_central_file_record(io)
  bytes_written = 0

  # Assume that no trailing data descriptor will be necessary.
  need_trailing_data_descriptor = false
  begin
    io.pos
  rescue Errno::ESPIPE
    # A trailing data descriptor is required for non-seekable IO.
    need_trailing_data_descriptor = true
  end
  if encryption_codec.class == Codec::TraditionalEncryption then
    # HACK:
    # According to the ZIP specification, a trailing data descriptor should
    # only be required when writing to non-seekable IO, but InfoZIP
    # *always* does this when using traditional encryption even though it
    # will also write the data descriptor in the usual place if possible.
    # Failure to emulate InfoZIP in this behavior will prevent InfoZIP
    # compatibility with traditionally encrypted entries.
    need_trailing_data_descriptor = true
  end

  # Set the general purpose flags.
  general_purpose_flags  = compression_codec.general_purpose_flags
  general_purpose_flags |= encryption_codec.general_purpose_flags
  if need_trailing_data_descriptor then
    general_purpose_flags |= FLAG_DATA_DESCRIPTOR_FOLLOWS
  end

  # Select the minimum ZIP specification version needed to extract this
  # entry.
  version_needed_to_extract = compression_codec.version_needed_to_extract
  if encryption_codec.version_needed_to_extract > version_needed_to_extract then
    version_needed_to_extract = encryption_codec.version_needed_to_extract
  end

  # Write the data.
  bytes_written += io.write(CFH_SIGNATURE)
  bytes_written += io.write(
    [
      version_made_by,
      version_needed_to_extract,
      general_purpose_flags,
      compression_codec.compression_method,
      mtime.to_dos_time.to_i
    ].pack('vvvvV')
  )
  bytes_written += @data_descriptor.dump(io)
  extra_field_data = central_extra_field_data
  bytes_written += io.write(
    [
      zip_path.length,
      extra_field_data.length,
      comment.length,
      0,
      internal_file_attributes,
      external_file_attributes,
      @local_file_record_position
    ].pack('vvvvvVV')
  )
  bytes_written += io.write(zip_path)
  bytes_written += io.write(extra_field_data)
  bytes_written += io.write(comment)

  bytes_written
end
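
Both record-dumping methods decide whether a trailing data descriptor is needed by probing io.pos and rescuing Errno::ESPIPE. The sketch below isolates that probe in a hypothetical seekable? helper. It assumes a POSIX-like system, where asking a pipe for its position raises Errno::ESPIPE.

```ruby
require 'stringio'

# Returns true when io can report its position, false when the underlying
# stream (for example a pipe) does not support seeking.
def seekable?(io)
  io.pos
  true
rescue Errno::ESPIPE
  false
end

reader, writer = IO.pipe
puts seekable?(StringIO.new)
puts seekable?(writer)
reader.close
writer.close
```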

#dump_local_file_record(io, local_file_record_position) ⇒ Object

Writes the local file record for this entry to io, a writable, IO-like object which provides a write method. local_file_record_position is the offset within io at which writing will begin. This is used so that when writing to a non-seekable IO object it is possible to avoid calling the pos method of io. Returns the number of bytes written.

NOTE: This method should only be called by Archive::Zip.



# File 'lib/archive/zip/entry.rb', line 592

def dump_local_file_record(io, local_file_record_position)
  @local_file_record_position = local_file_record_position
  bytes_written = 0

  # Assume that no trailing data descriptor will be necessary.
  need_trailing_data_descriptor = false
  begin
    io.pos
  rescue Errno::ESPIPE
    # A trailing data descriptor is required for non-seekable IO.
    need_trailing_data_descriptor = true
  end
  if encryption_codec.class == Codec::TraditionalEncryption then
    # HACK:
    # According to the ZIP specification, a trailing data descriptor should
    # only be required when writing to non-seekable IO, but InfoZIP
    # *always* does this when using traditional encryption even though it
    # will also write the data descriptor in the usual place if possible.
    # Failure to emulate InfoZIP in this behavior will prevent InfoZIP
    # compatibility with traditionally encrypted entries.
    need_trailing_data_descriptor = true
    # HACK:
    # The InfoZIP implementation of traditional encryption requires that the
    # last modified file time be used as part of the encryption header.
    # This is a deviation from the ZIP specification.
    encryption_codec.mtime = mtime
  end

  # Set the general purpose flags.
  general_purpose_flags  = compression_codec.general_purpose_flags
  general_purpose_flags |= encryption_codec.general_purpose_flags
  if need_trailing_data_descriptor then
    general_purpose_flags |= FLAG_DATA_DESCRIPTOR_FOLLOWS
  end

  # Select the minimum ZIP specification version needed to extract this
  # entry.
  version_needed_to_extract = compression_codec.version_needed_to_extract
  if encryption_codec.version_needed_to_extract > version_needed_to_extract then
    version_needed_to_extract = encryption_codec.version_needed_to_extract
  end

  # Write the data.
  bytes_written += io.write(LFH_SIGNATURE)
  extra_field_data = local_extra_field_data
  bytes_written += io.write(
    [
      version_needed_to_extract,
      general_purpose_flags,
      compression_codec.compression_method,
      mtime.to_dos_time.to_i,
      0,
      0,
      0,
      zip_path.length,
      extra_field_data.length
    ].pack('vvvVVVVvv')
  )
  bytes_written += io.write(zip_path)
  bytes_written += io.write(extra_field_data)

  # Pipeline a compressor into an encryptor, write all the file data to the
  # compressor, and get a data descriptor from it.
  encryption_codec.encryptor(io, password) do |e|
    compression_codec.compressor(e) do |c|
      dump_file_data(c)
      c.close(false)
      @data_descriptor = DataDescriptor.new(
        c.data_descriptor.crc32,
        c.data_descriptor.compressed_size + encryption_codec.header_size,
        c.data_descriptor.uncompressed_size
      )
    end
    e.close(false)
  end
  bytes_written += @data_descriptor.compressed_size

  # Write the trailing data descriptor if necessary.
  if need_trailing_data_descriptor then
    bytes_written += io.write(DD_SIGNATURE)
    bytes_written += @data_descriptor.dump(io)
  end

  begin
    # Update the data descriptor located before the compressed data for the
    # entry.
    saved_position = io.pos
    io.pos = @local_file_record_position + 14
    @data_descriptor.dump(io)
    io.pos = saved_position
  rescue Errno::ESPIPE
    # Ignore a failed attempt to update the data descriptor.
  end

  bytes_written
end
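
The offset of 14 used when rewriting the data descriptor follows from the header layout implied by the pack format above. The sketch below works through the arithmetic with String#bytesize. The variable names are hypothetical, and the signature bytes are the standard ZIP local file header signature.

```ruby
# Standard ZIP local file header signature: "PK\x03\x04".
lfh_signature = "PK\x03\x04".b

# Fixed-size fields in the order written above: version needed, flags,
# compression method, DOS time and date, three data descriptor placeholders,
# path length, extra field length. The values are placeholders.
fixed_fields = [20, 0, 0, 0, 0, 0, 0, 9, 0].pack('vvvVVVVvv')

# Bytes preceding the data descriptor placeholders: the signature plus the
# first four fields (three 16-bit values and one 32-bit value).
descriptor_offset = lfh_signature.bytesize + [0, 0, 0, 0].pack('vvvV').bytesize

header_size = lfh_signature.bytesize + fixed_fields.bytesize

puts descriptor_offset # => 14
puts header_size       # => 30
```

The three 32-bit zeros written as placeholders are the slots that are overwritten at that offset once the real CRC32 and sizes are known, provided io is seekable.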

#file? ⇒ Boolean

Returns false.

Returns:

  • (Boolean)

# File 'lib/archive/zip/entry.rb', line 536

def file?
  false
end

#ftype ⇒ Object

Returns the file type of this entry as the symbol :unknown.

Override this in concrete subclasses to return an appropriate symbol.

# File 'lib/archive/zip/entry.rb', line 531

def ftype
  :unknown
end

#initialize(zip_path, raw_data = nil) ⇒ Object

Creates a new, uninitialized Entry instance using the Store compression method. The zip path is initialized to zip_path. raw_data, if specified, must be a readable, IO-like object containing possibly compressed/encrypted file data for the entry. It is intended to be used primarily by the parse class method.



# File 'lib/archive/zip/entry.rb', line 473

def initialize(zip_path, raw_data = nil)
  self.zip_path = zip_path
  self.mtime = Time.now
  self.atime = @mtime
  self.uid = nil
  self.gid = nil
  self.mode = 0777
  self.comment = ''
  self.expected_data_descriptor = nil
  self.compression_codec = Zip::Codec::Store.new
  self.encryption_codec = Zip::Codec::NullEncryption.new
  self.password = nil
  @raw_data = raw_data
  @extra_fields = []
end

#symlink? ⇒ Boolean

Returns false.

Returns:

  • (Boolean)

# File 'lib/archive/zip/entry.rb', line 541

def symlink?
  false
end