Class: IMW::Tools::Archiver

Inherits:
Object
Defined in:
lib/imw/tools/archiver.rb

Overview

Packages an Array of input files into a single output archive. When the archive is extracted, all of the input files will be in a single directory with a chosen name. The path to the output archive determines both the name of the archive and its type (tar, tar.bz2, zip, etc.).

If any of the input files are themselves archives, they will first be extracted, with only their contents winding up in the final directory (the file hierarchy of the archive will be preserved). If any of the input files are compressed, they will first be uncompressed before being added to the directory.

Both local and remote files can be archived. An example:

archiver = IMW::Tools::Archiver.new 'my_archive', '/path/to/my/regular_file.tsv', '/path/to/an/archive.tar.bz2', '/path/to/my_compressed_file.gz', 'http://mywebsite.com/index.html'
archiver.package! '/path/to/my_archive.zip'

This will create a ZIP archive at /path/to/my_archive.zip. When the ZIP archive is extracted, its contents will look like:

my_archive
|-- regular_file.tsv
|-- archive_file1
|-- archive_dir
|   |-- archive_file2
|   `-- archive_file3
|-- my_compressed_file
`-- index.html

Notice that

  • The name of the extracted directory is given by the first argument passed to the Archiver when it was instantiated.

  • All files wind up at the top level of this extracted directory when possible (regular_file.tsv, index.html).

  • /path/to/an/archive.tar.bz2 was not directly included; its contents (archive_file1, archive_dir/archive_file2, archive_dir/archive_file3) were included instead.

  • /path/to/my_compressed_file.gz was uncompressed before being added to the archive.

  • The remote file http://mywebsite.com/index.html was downloaded and included.

This process can take a while when the constituent files are large, because the files go through considerable preparation to produce this tidy structure in the final archive. Further calls to package! on the same Archiver instance skip the preparation step (its intermediate results remain in IMW’s temporary directory) and create the package directly, which saves time when producing several package formats from the same input data.
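The reuse described above can be sketched without IMW. The StagedPackager class below is hypothetical and not part of IMW; it writes a plain manifest instead of a real archive, and only illustrates staging the inputs once and then producing several outputs from the same staged directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for an archiver: stages inputs once, then reuses the
# staged directory for every later package! call.
class StagedPackager
  attr_reader :prepare_count

  def initialize(name, inputs)
    @name          = name
    @inputs        = inputs
    @tmp_dir       = Dir.mktmpdir('packager')
    @prepare_count = 0
  end

  # Directory that holds the staged copies of the inputs.
  def dir
    File.join(@tmp_dir, @name)
  end

  # Copy every input into the staging directory (the slow step).
  def prepare!
    FileUtils.mkdir_p(dir)
    @inputs.each { |path| FileUtils.cp(path, dir) }
    @prepare_count += 1
  end

  # True when every input already has a staged copy.
  def prepared?
    @inputs.all? { |path| File.exist?(File.join(dir, File.basename(path))) }
  end

  # Stand-in for real archiving: writes a manifest of the staged files.
  def package!(output)
    prepare! unless prepared?
    File.write(output, Dir.children(dir).sort.join("\n"))
    output
  end
end
```

Calling package! twice on one instance runs prepare! only on the first call, which is the saving the paragraph above describes.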

Constructor Details

#initialize(name, *raw_inputs) ⇒ Archiver

Returns a new instance of Archiver.



# File 'lib/imw/tools/archiver.rb', line 68

def initialize name, *raw_inputs
  @name   = name
  self.inputs = raw_inputs
end

Instance Attribute Details

#local_inputs ⇒ Object

Returns the value of attribute local_inputs.



# File 'lib/imw/tools/archiver.rb', line 66

def local_inputs
  @local_inputs
end

#name ⇒ Object

Returns the value of attribute name.



# File 'lib/imw/tools/archiver.rb', line 66

def name
  @name
end

#remote_inputs ⇒ Object

Returns the value of attribute remote_inputs.



# File 'lib/imw/tools/archiver.rb', line 66

def remote_inputs
  @remote_inputs
end

Instance Method Details

#clean! ⇒ Object

Remove the tmp_dir entirely, getting rid of all temporary files.



# File 'lib/imw/tools/archiver.rb', line 123

def clean!
  IMW.announce_if_verbose("Cleaning temporary directory #{tmp_dir}...")
  FileUtils.rm_rf(tmp_dir)
end

#dir ⇒ String

A directory which will contain all the content being packaged, including the contents of any archives that were included in the list of files to process.

Returns:

  • (String)



# File 'lib/imw/tools/archiver.rb', line 117

def dir
  @dir ||= File.join(tmp_dir, name.to_s)
end

#errors ⇒ Array

Return a list of error messages for this archiver.

Returns:

  • (Array)

    the error messages



# File 'lib/imw/tools/archiver.rb', line 92

def errors
  @errors ||= []      
end

#inputs=(raw_inputs) ⇒ Object

Set the inputs for this archiver.

Parameters:

  • raw_inputs

    the paths or URIs to archive; nested arrays are flattened



# File 'lib/imw/tools/archiver.rb', line 76

def inputs= raw_inputs
  @local_inputs, @remote_inputs = [], []
  raw_inputs.flatten.each do |raw_input|
    input = IMW.open(raw_input)
    if input.is_local?
      @local_inputs << input
    else
      @remote_inputs << input
    end
  end
  @local_inputs.flatten!
end
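The local/remote split above relies on IMW resources answering is_local?. A minimal sketch of the same partitioning using only Ruby's standard library follows; partition_inputs and REMOTE_SCHEMES are hypothetical names, and treating only http, https, and ftp as remote is an assumption made for this illustration.

```ruby
require 'uri'

# Assumption for this sketch: only these schemes count as remote.
REMOTE_SCHEMES = %w[http https ftp].freeze

# Hypothetical helper: splits raw inputs into [local, remote] lists.
# Nested arrays are flattened first, as in inputs= above.
def partition_inputs(*raw_inputs)
  raw_inputs.flatten.partition do |input|
    scheme = begin
      URI.parse(input.to_s).scheme
    rescue URI::InvalidURIError
      nil # not parseable as a URI, so treat it as a local path
    end
    !REMOTE_SCHEMES.include?(scheme)
  end
end

local, remote = partition_inputs(
  '/path/to/my/regular_file.tsv',
  ['/path/to/my_compressed_file.gz', 'http://mywebsite.com/index.html']
)
```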

#package(output, options = {}) ⇒ StandardError, IMW::Resource

Package the contents of the temporary directory to an archive at output but return exceptions instead of raising them.

Parameters:

  • output

    the path of the archive to create

  • options

    passed through to package!

Returns:

  • (StandardError, IMW::Resource)

    either the completed package or the error which was raised



# File 'lib/imw/tools/archiver.rb', line 187

def package output, options={}
  begin
    package! output, options # pass the caller's options through unchanged
  rescue StandardError => e
    return e
  end
end
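Because package returns the exception instead of raising it, callers have to inspect the return value. The pairing can be sketched with two hypothetical methods, risky! and risky, which are not part of IMW:

```ruby
# Hypothetical bang method: raises on bad input.
def risky!(value)
  raise ArgumentError, 'value must be positive' unless value.positive?
  value * 2
end

# Hypothetical non-bang wrapper: returns the exception object instead of
# letting it propagate, mirroring the package / package! pairing.
def risky(value)
  risky!(value)
rescue StandardError => e
  e
end

result = risky(-1)
warn "failed: #{result.message}" if result.is_a?(StandardError)
```

A caller that forgets the is_a?(StandardError) check will treat the error object as a successful result, so the bang version is the safer default when failure should stop the program.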

#package!(output, options = {}) ⇒ IMW::Resource

Package the contents of the temporary directory to an archive at output. The extension of output determines the kind of archive.

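Parameters:

  • output

    the path of the archive to create; its extension selects the archive type

  • options

Returns:

  • (IMW::Resource)

    the output archive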



# File 'lib/imw/tools/archiver.rb', line 202

def package! output, options={}
  prepare!                          unless prepared?
  output = IMW.open(output)
  FileUtils.mkdir_p(output.dirname) unless File.exist?(output.dirname)        
  output.rm!                        if output.exist?
  FileUtils.cd(tmp_dir) { IMW.open(output.basename).create(name).mv(output.path) }
  add_processing_error "Archiver: couldn't create archive #{output.path}" unless output.exists?
  output
end
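The documentation does not show how IMW maps an output extension to an archive type. A minimal sketch of such a lookup is given below; ARCHIVE_KINDS and archive_kind are hypothetical names, and the extension table is an assumption for illustration rather than IMW's actual list.

```ruby
# Hypothetical table of recognized output extensions.
ARCHIVE_KINDS = {
  '.tar.bz2' => :tarbz2,
  '.tar.gz'  => :targz,
  '.tar'     => :tar,
  '.zip'     => :zip
}.freeze

# Returns the archive kind for an output path, judged by the end of its
# basename, or raises ArgumentError when the extension is not recognized.
def archive_kind(path)
  basename = File.basename(path).downcase
  _ext, kind = ARCHIVE_KINDS.find { |ext, _| basename.end_with?(ext) }
  kind || raise(ArgumentError, "unrecognized archive extension: #{path}")
end

archive_kind('/path/to/my_archive.zip')     # => :zip
archive_kind('/path/to/my_archive.tar.bz2') # => :tarbz2
```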

#prepare! ⇒ Object

Copy, decompress, or extract the input paths to the temporary directory, readying them for packaging.



# File 'lib/imw/tools/archiver.rb', line 130

def prepare!
  FileUtils.mkdir_p dir unless File.exist?(dir)
  
  local_inputs.each do |existing_file|
    new_path      = File.join(dir, existing_file.basename)
    case
    when existing_file.is_archive?
      IMW.announce_if_verbose("Extracting #{existing_file}...")
      FileUtils.cd(dir) do
        existing_file.extract
      end
    when existing_file.is_compressed?
      IMW.announce_if_verbose("Decompressing #{existing_file}...")
      existing_file.cp(new_path).decompress!
    else
      IMW.announce_if_verbose("Copying #{existing_file}...")
      existing_file.cp(new_path)
    end
  end
  
  remote_inputs.each do |remote_input|
    IMW.announce_if_verbose("Downloading #{remote_input}...")
    remote_input.cp(File.join(dir, remote_input.effective_basename))
  end
end
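The compressed and regular branches of prepare! can be approximated with the standard library alone. In the sketch below, stage is a hypothetical helper; it recognizes only gzip files by their .gz extension and leaves archive extraction and remote downloads out entirely.

```ruby
require 'fileutils'
require 'zlib'

# Hypothetical staging helper: gzip files are decompressed into dir, anything
# else is copied unchanged. Returns the path of the staged file.
def stage(path, dir)
  FileUtils.mkdir_p(dir)
  if File.extname(path) == '.gz'
    # Drop the .gz suffix so the staged file carries its uncompressed name.
    target = File.join(dir, File.basename(path, '.gz'))
    Zlib::GzipReader.open(path) do |gz|
      File.open(target, 'wb') { |out| IO.copy_stream(gz, out) }
    end
  else
    target = File.join(dir, File.basename(path))
    FileUtils.cp(path, target)
  end
  target
end
```

Unlike the cp-then-decompress! sequence above, this version streams the decompressed data straight into the staging directory, so no compressed copy is left behind.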

#prepared? ⇒ true, false

Checks to see if all expected files exist in the temporary directory for this packager.

Returns:

  • (true, false)


# File 'lib/imw/tools/archiver.rb', line 160

def prepared?
  local_inputs.each do |existing_file|
    case
    when existing_file.is_archive?
      existing_file.contents.each do |archived_file_path|
        return false unless File.exist?(File.join(dir, archived_file_path))
      end
    when existing_file.is_compressed?
      return false unless File.exist?(File.join(dir, existing_file.decompressed_basename))
    else
      return false unless File.exist?(File.join(dir, existing_file.basename))
    end
  end

  remote_inputs.each do |remote_input|
    return false unless File.exist?(File.join(dir, remote_input.effective_basename))
  end
  
  true
end
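The check above boils down to deriving the name each input should have once staged and testing for that file. A rough standard-library sketch follows, with expected_name and all_staged? as hypothetical helpers that handle only gzip and regular files:

```ruby
# Hypothetical: the name an input is expected to have after staging.
# A .gz file loses its suffix; anything else keeps its basename.
def expected_name(path)
  File.extname(path) == '.gz' ? File.basename(path, '.gz') : File.basename(path)
end

# True only when every input has its expected file beneath dir.
def all_staged?(dir, inputs)
  inputs.all? { |path| File.exist?(File.join(dir, expected_name(path))) }
end
```

As with prepared?, this only confirms that the files exist; it does not verify that their contents are complete or current.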

#success? ⇒ true, false

Was this archiver successful (did it not have any errors)?

Returns:

  • (true, false)


# File 'lib/imw/tools/archiver.rb', line 99

def success?
  errors.empty?
end
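The errors and success? pair follows a common pattern: a lazily created list of messages, with success defined as that list being empty. A self-contained sketch is shown below; the Job class and its run method are hypothetical and exist only to exercise the pattern.

```ruby
# Hypothetical class demonstrating an accumulated error list.
class Job
  # Lazily initialized so the list exists even if nothing has failed yet.
  def errors
    @errors ||= []
  end

  def success?
    errors.empty?
  end

  # Runs each labelled step, recording failures instead of raising them.
  def run(steps)
    steps.each do |label, step|
      begin
        step.call
      rescue StandardError => e
        errors << "#{label}: #{e.message}"
      end
    end
    self
  end
end

job = Job.new.run({ copy: -> { :ok }, extract: -> { raise 'corrupt archive' } })
job.success? # => false
job.errors   # => ["extract: corrupt archive"]
```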

#tmp_dir ⇒ String

A temporary directory to work in. Its contents will ultimately consist of a directory named for the package containing all the input files.

Returns:

  • (String)



# File 'lib/imw/tools/archiver.rb', line 108

def tmp_dir
  @tmp_dir ||= File.join(IMW.path_to(:tmp_root, 'packager'), (Time.now.to_i.to_s + "-" + $$.to_s)) # unique per process within a given second
end
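Time.now.to_i has one-second resolution, so two archivers created by the same process within the same second would compute the same path. One possible variant, sketched here with a hypothetical unique_work_dir helper that is not part of IMW, appends a random suffix to make such a collision very unlikely:

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical variant: timestamp and PID as in tmp_dir above, plus a random
# hex suffix to separate instances created in the same second.
def unique_work_dir(root = Dir.tmpdir)
  File.join(root, 'packager', "#{Time.now.to_i}-#{Process.pid}-#{SecureRandom.hex(4)}")
end

first  = unique_work_dir
second = unique_work_dir
```

Like tmp_dir, this only builds the path; creating the directory is left to the caller. Ruby's Dir.mktmpdir is another option when the directory should be created atomically.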