Class: Metadata

Inherits: Object
Defined in: lib/tpkg/metadata.rb

Overview

This class stores the metadata of a package. The idea is that you can give it metadata text in any supported format, such as YAML or XML, and it provides a uniform interface for reading and modifying that metadata.

Direct Known Subclasses

FileMetadata

Constant Summary

REQUIRED_FIELDS = [:name, :version, :maintainer, :description]


Constructor Details

#initialize(text, format, file = nil, source = nil) ⇒ Metadata

text = Text representation of the metadata
format = yml, xml, json, etc.
file = Path to the metadata file that was the source of this metadata
source = Source, in the tpkg sense, of the package described by this metadata, i.e. the filename of an individual package, or a directory or URL containing multiple packages and a metadata.yml file. Used by tpkg to report on how many packages are available from various sources.



# File 'lib/tpkg/metadata.rb', line 279

def initialize(text, format, file=nil, source=nil)
  @text = text
  # FIXME: should define enum of supported formats and reject others
  @format = format
  @file = file
  @source = source
  @hash = nil
end

Instance Attribute Details

#file ⇒ Object (readonly)

Returns the value of attribute file.



# File 'lib/tpkg/metadata.rb', line 232

def file
  @file
end

#format ⇒ Object (readonly)

Returns the value of attribute format.



# File 'lib/tpkg/metadata.rb', line 232

def format
  @format
end

#source ⇒ Object

Returns the value of attribute source.



# File 'lib/tpkg/metadata.rb', line 233

def source
  @source
end

#text ⇒ Object (readonly)

Returns the value of attribute text.



# File 'lib/tpkg/metadata.rb', line 232

def text
  @text
end

Class Method Details

.clean_for_filename(dirtystring) ⇒ Object

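Cleans up a string to make it suitable for use in a filename: the string is lowercased and every character that is not a letter, digit, or underscore is removed.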



# File 'lib/tpkg/metadata.rb', line 237

def self.clean_for_filename(dirtystring)
  dirtystring.downcase.gsub(/[^\w]/, '')
end
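
To make the effect of the substitution concrete, here is a minimal standalone sketch. The top-level `clean_for_filename` function is a copy of the one-liner above made for illustration; it is not how tpkg exposes the method.

```ruby
# Standalone copy of the cleaning logic, for illustration only.
def clean_for_filename(dirtystring)
  # \w matches letters, digits, and underscore; everything else is dropped.
  dirtystring.downcase.gsub(/[^\w]/, '')
end

puts clean_for_filename('RedHat-5')  # the hyphen is removed
puts clean_for_filename('x86_64')    # the underscore is kept
puts clean_for_filename('Mac OS X')  # the spaces are removed
```

Note that removing separators can make distinct inputs collide, so the result is suited to building filenames rather than to identifying an operating system.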

.get_pkgs_metadata_from_yml_doc(yml_doc, metadata = nil, source = nil) ⇒ Object

Parse a file containing multiple package metadata documents (such as is generated by Tpkg.extract_metadata) into a hash that maps each package name to an array of Metadata objects.



# File 'lib/tpkg/metadata.rb', line 243

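def self.get_pkgs_metadata_from_yml_doc(yml_doc, metadata=nil, source=nil)
  metadata ||= {}
  metadata_lists = yml_doc.split("---")
  metadata_lists.each do |metadata_text|
    if metadata_text =~ /^:?name:(.+)/
      name = $1.strip
      metadata[name] ||= []
      metadata[name] << Metadata.new(metadata_text, 'yml', nil, source)
    end
  end
  return metadata
end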
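
The grouping step can be sketched without the Metadata class. In this simplified version the hypothetical helper `group_yml_docs_by_name` stores the raw document text instead of Metadata objects, and the sample packages are made up.

```ruby
# Hypothetical helper, not part of tpkg: groups raw YAML documents by name.
def group_yml_docs_by_name(yml_doc)
  grouped = {}
  yml_doc.split("---").each do |doc_text|
    # An optional leading colon accepts both "name:" and ":name:" keys.
    next unless doc_text =~ /^:?name:(.+)/
    name = $1.strip
    (grouped[name] ||= []) << doc_text
  end
  grouped
end

sample = <<~YAML
  ---
  name: foo
  version: 1.0
  ---
  name: foo
  version: 2.0
  ---
  name: bar
  version: 0.1
YAML

grouped = group_yml_docs_by_name(sample)
grouped.each { |name, docs| puts "#{name}: #{docs.length} document(s)" }
```

Because the text is split on the literal string "---", a document whose values contain that sequence would be cut in two; the sketch shares that limitation.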

.instantiate_from_dir(dir) ⇒ Object

Given the directory of an unpacked package, returns a Metadata object. The metadata file can be in yml or xml format; tpkg.yml takes precedence over tpkg.xml, and nil is returned if neither file exists.



# File 'lib/tpkg/metadata.rb', line 258

def self.instantiate_from_dir(dir)
  metadata = nil
  if File.exist?(File.join(dir, 'tpkg.yml'))
    metadata = Metadata.new(File.read(File.join(dir, 'tpkg.yml')),
                            'yml',
                            File.join(dir, 'tpkg.yml'))
  # File.exists? was removed in Ruby 3.2; File.exist? works on all versions
  elsif File.exist?(File.join(dir, 'tpkg.xml'))
    metadata = Metadata.new(File.read(File.join(dir, 'tpkg.xml')),
                            'xml',
                            File.join(dir, 'tpkg.xml'))
  end
  return metadata
end
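
The lookup order can be tried in isolation. The sketch below assumes a hypothetical helper, `find_metadata_file`, that returns the path and format instead of building a Metadata object, and it runs against a temporary directory.

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the lookup order above:
# tpkg.yml wins over tpkg.xml, and nil means no metadata file was found.
def find_metadata_file(dir)
  [['tpkg.yml', 'yml'], ['tpkg.xml', 'xml']].each do |filename, format|
    path = File.join(dir, filename)
    return [path, format] if File.exist?(path)
  end
  nil
end

Dir.mktmpdir do |dir|
  p find_metadata_file(dir)                          # nothing there yet
  File.write(File.join(dir, 'tpkg.xml'), '<tpkg/>')
  p find_metadata_file(dir)                          # finds the XML file
  File.write(File.join(dir, 'tpkg.yml'), "name: demo\n")
  p find_metadata_file(dir)                          # YAML now takes precedence
end
```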

Instance Method Details

#[](key) ⇒ Object



# File 'lib/tpkg/metadata.rb', line 288

def [](key)
  return to_hash[key]
end

#[]=(key, value) ⇒ Object



# File 'lib/tpkg/metadata.rb', line 292

def []=(key,value)
  to_hash[key]=value
end

#add_tpkg_version(version) ⇒ Object

Add tpkg_version to the existing tpkg.xml or tpkg.yml file



# File 'lib/tpkg/metadata.rb', line 353

def add_tpkg_version(version)
  if self[:tpkg_version]
    if self[:tpkg_version] != version
      warn "Warning: tpkg_version is specified as #{self[:tpkg_version]}, which doesn't match with the actual tpkg version being used (#{version})."
    end
  else
    # Add to in-memory data
    self[:tpkg_version] = version
    # Update the metadata source file (if known)
    if @file
      if @format == 'yml'
        File.open(@file, 'a') do |file|
          file.puts "tpkg_version: #{version}"
        end 
      elsif @format == 'xml'
        metadata_xml = REXML::Document.new(@text)
        tpkg_version_ele = REXML::Element.new('tpkg_version')
        tpkg_version_ele.text = version
        metadata_xml.root.add_element(tpkg_version_ele)
        File.open(@file, 'w') do |file|
          metadata_xml.write(file)
        end
      else
        raise "Unknown metadata format"
      end
    end
  end
end
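
The yml branch simply appends one line to the file. A minimal sketch of that step follows; the helper name `append_tpkg_version`, the sample metadata, and the version string are all made up for illustration.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper that repeats the yml branch above on any file path.
def append_tpkg_version(path, version)
  File.open(path, 'a') { |file| file.puts "tpkg_version: #{version}" }
end

Tempfile.create(['tpkg', '.yml']) do |tmp|
  tmp.write("name: demo\nversion: 1.0\n")
  tmp.flush
  append_tpkg_version(tmp.path, '2.3.3')
  # Reload the file to confirm the appended key parses as part of the document.
  p YAML.safe_load(File.read(tmp.path))
end
```

Appending only yields valid YAML if the existing file ends with a newline and holds a top-level mapping, which is an assumption the sketch shares with the method.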

#generate_package_filename ⇒ Object



# File 'lib/tpkg/metadata.rb', line 382

def generate_package_filename
  name = to_hash[:name]
  version = to_hash[:version]
  packageversion = nil
  if to_hash[:package_version] && !to_hash[:package_version].to_s.empty?
    packageversion = to_hash[:package_version]
  end
  package_filename = "#{name}-#{version}"
  if packageversion
    package_filename << "-#{packageversion}"
  end


  if to_hash[:operatingsystem] and !to_hash[:operatingsystem].empty?
    if to_hash[:operatingsystem].length == 1
      package_filename << "-#{Metadata::clean_for_filename(to_hash[:operatingsystem].first)}"
    else
      operatingsystems = to_hash[:operatingsystem].dup
      # Genericize any equivalent operating systems
      # FIXME: more generic handling of equivalent OSs is probably called for
      operatingsystems.each do |os|
        os.sub!('CentOS', 'RedHat')
      end
      firstname = operatingsystems.first.split('-').first
      firstversion = operatingsystems.first.split('-').last
      if operatingsystems.all? { |os| os == operatingsystems.first }
        # After genericizing all OSs are the same
        package_filename << "-#{Metadata::clean_for_filename(operatingsystems.first)}"
      elsif operatingsystems.all? { |os| os =~ /#{firstname}-/ }
        # All of the OSs have the same name, just different versions.  It
        # may not be perfect, but name the package after the OS without a
        # version.  I.e. if the package specifies RedHat-4,RedHat-5 then
        # name it "redhat". It might be confusing when it won't install on
        # RedHat-3, but it seems better to me than naming it "multios".
        package_filename << "-#{Metadata::clean_for_filename(firstname)}"
      else
        package_filename << "-multios"
      end
    end
  end
  if to_hash[:architecture] and !to_hash[:architecture].empty?
    if to_hash[:architecture].length == 1
      package_filename << "-#{Metadata::clean_for_filename(to_hash[:architecture].first)}"
    else
      package_filename << "-multiarch"
    end
  end

  return package_filename
end
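
The operating system part of the name is the least obvious rule, so here is a standalone sketch of it. The function `os_suffix` is hypothetical and works on a plain array of strings; unlike the method above it uses `sub` rather than `sub!`, so the caller's strings are left unmodified.

```ruby
# Hypothetical standalone version of the operating system naming rules.
def os_suffix(operatingsystems)
  clean = ->(s) { s.downcase.gsub(/[^\w]/, '') }
  return clean.call(operatingsystems.first) if operatingsystems.length == 1

  # Treat CentOS as equivalent to RedHat before comparing entries.
  generic = operatingsystems.map { |os| os.sub('CentOS', 'RedHat') }
  firstname = generic.first.split('-').first
  if generic.all? { |os| os == generic.first }
    clean.call(generic.first)   # all entries are the same OS and version
  elsif generic.all? { |os| os =~ /#{firstname}-/ }
    clean.call(firstname)       # same OS, different versions
  else
    'multios'
  end
end

puts os_suffix(['RedHat-5'])
puts os_suffix(['RedHat-4', 'CentOS-4'])
puts os_suffix(['RedHat-4', 'RedHat-5'])
puts os_suffix(['RedHat-5', 'Solaris-5.10'])
```

The architecture rule is simpler: a single architecture is cleaned and appended, and more than one yields "multiarch".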

#get_native_deps ⇒ Object



# File 'lib/tpkg/metadata.rb', line 725

def get_native_deps
  native_deps = []
  if self[:dependencies]
    native_deps = self[:dependencies].select{|dep| dep[:type] == :native}
  end
  native_deps
end
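
As a small illustration of the filter, the sketch below applies the same `select` to a plain array of dependency hashes. The function `native_deps` and the sample dependencies are assumptions made for the example.

```ruby
# Hypothetical standalone version: keeps only dependencies marked :native.
def native_deps(dependencies)
  return [] unless dependencies
  dependencies.select { |dep| dep[:type] == :native }
end

deps = [
  { name: 'ruby',    type: :tpkg },
  { name: 'openssl', type: :native },
]
p native_deps(deps).map { |dep| dep[:name] }
p native_deps(nil)
```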

#metadata_xml_to_hash ⇒ Object



# File 'lib/tpkg/metadata.rb', line 491

def metadata_xml_to_hash
  # Don't do anything if metadata is not from xml file
  return if @format != "xml"

  metadata_hash = {}
  metadata_xml = REXML::Document.new(@text)

  if metadata_xml.root.attributes['filename'] # && !metadata_xml.root.attributes['filename'].empty?
    metadata_hash[:filename] = metadata_xml.root.attributes['filename']
  end

  REQUIRED_FIELDS.each do |reqfield|
    if metadata_xml.elements["/tpkg/#{reqfield}"]
      metadata_hash[reqfield] = metadata_xml.elements["/tpkg/#{reqfield}"].text
    end
  end

  [:tpkg_version, :package_version, :description, :bugreporting].each do |optfield|
    if metadata_xml.elements["/tpkg/#{optfield.to_s}"]
      metadata_hash[optfield] =
        metadata_xml.elements["/tpkg/#{optfield.to_s}"].text
    end
  end

  [:operatingsystem, :architecture].each do |arrayfield|
    array = []
    # In the tpkg design docs I wrote that the user would specify
    # multiple OSs or architectures by specifying the associated XML
    # element more than once:
    # <tpkg>
    # <operatingsystem>RedHat-4</operatingsystem>
    # <operatingsystem>CentOS-4</operatingsystem>
    # </tpkg>
    # However, I wrote the initial code and built my initial packages
    # using comma separated values in a single instance of the
    # element:
    # <tpkg>
    # <operatingsystem>RedHat-4,CentOS-4</operatingsystem>
    # </tpkg>
    # So we support both.
    metadata_xml.elements.each("/tpkg/#{arrayfield.to_s}") do |af|
      array.concat(af.text.split(/\s*,\s*/))
    end
    metadata_hash[arrayfield] = array unless array.empty?
  end

  deps = []
  metadata_xml.elements.each('/tpkg/dependencies/dependency') do |depxml|
    dep = {}
    dep[:name] = depxml.elements['name'].text
    [:allowed_versions, :minimum_version, :maximum_version,
     :minimum_package_version, :maximum_package_version].each do |depfield|
      if depxml.elements[depfield.to_s]
        dep[depfield] = depxml.elements[depfield.to_s].text
      end
    end
    if depxml.elements['native']
      dep[:type] = :native
    else
      dep[:type] = :tpkg
    end
    deps << dep
  end
  metadata_hash[:dependencies] = deps unless deps.empty?

  conflicts = []
  metadata_xml.elements.each('/tpkg/conflicts/conflict') do |conflictxml|
    conflict = {}
    conflict[:name] = conflictxml.elements['name'].text
    [:minimum_version, :maximum_version,
     :minimum_package_version, :maximum_package_version].each do |conflictfield|
      if conflictxml.elements[conflictfield.to_s]
        conflict[conflictfield] = conflictxml.elements[conflictfield.to_s].text
      end
    end
    if conflictxml.elements['native']
      conflict[:type] = :native
    else
      conflict[:type] = :tpkg
    end
    conflicts << conflict
  end
  metadata_hash[:conflicts] = conflicts unless conflicts.empty?

  externals = []
  metadata_xml.elements.each('/tpkg/externals/external') do |extxml|
    external = {}
    external[:name] = extxml.elements['name'].text
    if extxml.elements['data']
      # The data element requires special handling.  We want to capture its
      # raw contents, which may be XML.  The "text" method we use for other
      # fields only returns text outside of child XML elements.  That's fine
      # for other fields which we don't expect to contain any child
      # elements.  But here there may well be child elements and we need to
      # capture the raw data.
      # I.e. if we have:
      # <data>
      #   <one>Some text</one>
      #   <two>Other text</two>
      # </data>
      # We want to capture:
      # "\n  <one>Some text</one>\n  <two>Other text</two>\n"
      external[:data] = extxml.elements['data'].children.join('')
    elsif extxml.elements['datafile']
      # We don't have access to the package contents here, so we just save
      # the name of the file and leave it up to others to read the file
      # when the package contents are available.
      external[:datafile] = extxml.elements['datafile'].text
    elsif extxml.elements['datascript']
      # We don't have access to the package contents here, so we just save
      # the name of the script and leave it up to others to run the script
      # when the package contents are available.
      external[:datascript] = extxml.elements['datascript'].text 
    end
    externals << external
  end
  metadata_hash[:externals] = externals unless externals.empty?

  metadata_hash[:files] = {}
  file_defaults = {}
  if metadata_xml.elements['/tpkg/files/file_defaults/posix']
    posix = {}
    if metadata_xml.elements['/tpkg/files/file_defaults/posix/owner']
      owner =
        metadata_xml.elements['/tpkg/files/file_defaults/posix/owner'].text
      posix[:owner] = owner
    end
    gid = nil
    if metadata_xml.elements['/tpkg/files/file_defaults/posix/group']
      group =
        metadata_xml.elements['/tpkg/files/file_defaults/posix/group'].text
      posix[:group] = group
    end
    perms = nil
    if metadata_xml.elements['/tpkg/files/file_defaults/posix/perms']
      perms =
        metadata_xml.elements['/tpkg/files/file_defaults/posix/perms'].text
      posix[:perms] = perms.oct
    end
    file_defaults[:posix] = posix
  end
  metadata_hash[:files][:file_defaults] = file_defaults unless file_defaults.empty?

  dir_defaults = {}
  if metadata_xml.elements['/tpkg/files/dir_defaults/posix']
    posix = {}
    if metadata_xml.elements['/tpkg/files/dir_defaults/posix/owner']
      owner =
        metadata_xml.elements['/tpkg/files/dir_defaults/posix/owner'].text
      posix[:owner] = owner
    end
    gid = nil
    if metadata_xml.elements['/tpkg/files/dir_defaults/posix/group']
      group =
        metadata_xml.elements['/tpkg/files/dir_defaults/posix/group'].text
      posix[:group] = group
    end
    perms = nil
    if metadata_xml.elements['/tpkg/files/dir_defaults/posix/perms']
      perms =
        metadata_xml.elements['/tpkg/files/dir_defaults/posix/perms'].text
      posix[:perms] = perms.oct
    end
    dir_defaults[:posix] = posix
  end
  metadata_hash[:files][:dir_defaults] = dir_defaults unless dir_defaults.empty?

  files = []
  metadata_xml.elements.each('/tpkg/files/file') do |filexml|
    file = {}
    file[:path] = filexml.elements['path'].text
    file[:config] = true if filexml.elements['config']
    if filexml.elements['encrypt']
      encrypt = {}
      if filexml.elements['encrypt'].attribute('precrypt') &&
         filexml.elements['encrypt'].attribute('precrypt').value == 'true'
        encrypt['precrypt'] = true
      end
      if filexml.elements['encrypt'].attribute('algorithm')
        encrypt['algorithm'] = filexml.elements['encrypt'].attribute('algorithm').value
      end
      file[:encrypt] = encrypt
    end
    if filexml.elements['init']
      init = {}
      if filexml.elements['init/start']
        init[:start] = filexml.elements['init/start'].text
      end
      if filexml.elements['init/levels']
        if filexml.elements['init/levels'].text
          # Split '234' into ['2','3','4'], for example
          init[:levels] = filexml.elements['init/levels'].text.split(//)
        else
          # If the element is empty in the XML (<levels/> or
          # <levels></levels>) then we get nil back from the .text
          # call, interpret that as no levels
          init[:levels] = []
        end
      end
      file[:init] = init
    end
    if filexml.elements['crontab']
      crontab = {}
      if filexml.elements['crontab/user']
        crontab[:user] = filexml.elements['crontab/user'].text
      end
      file[:crontab] = crontab
    end
    if filexml.elements['posix']
      posix = {}
      if filexml.elements['posix/owner']
        owner = filexml.elements['posix/owner'].text
        posix[:owner] = owner
      end
      gid = nil
      if filexml.elements['posix/group']
        group = filexml.elements['posix/group'].text
        posix[:group] = group
      end
      perms = nil
      if filexml.elements['posix/perms']
        perms = filexml.elements['posix/perms'].text
        posix[:perms] = perms.oct
      end
      file[:posix] = posix
    end
    files << file
  end
  metadata_hash[:files][:files] = files unless files.empty?

  return metadata_hash
end
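
Three string conversions in the method above are easy to misread, so this short sketch runs them on sample values without any XML parsing. The variable names and sample strings are chosen for illustration only.

```ruby
# Comma-separated values in one element become an array of names;
# whitespace around the commas is discarded.
os_list = 'RedHat-4, CentOS-4'.split(/\s*,\s*/)
p os_list

# Permission strings are interpreted as octal numbers.
perms = '0755'.oct
p perms
p perms.to_s(8)

# Init levels are split into one string per character.
levels = '234'.split(//)
p levels
```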

#to_hash ⇒ Object



# File 'lib/tpkg/metadata.rb', line 296

def to_hash
  if @hash  
    return @hash 
  end
  
  if @format == 'yml'
    hash = YAML::load(@text)
    @hash = hash.with_indifferent_access
    
    # We need this for backward compatibility. With xml, we specify
    # native dependency as type: :native rather than native: true
    @hash[:dependencies].each do | dep |
      if !dep[:type]
        if dep[:native]
          dep[:type] = :native
        else
          dep[:type] = :tpkg
        end
      end
    end if @hash[:dependencies]
    
    @hash[:files][:files].each do |file|
      # We need to do this for backward compatibility. In the old yml schema,
      # the encrypt field can either be "true" or a string value. Now, it is
      # a hash. We need to use a hash because we need to store info like the 
      # encryption algorithm.
      if file[:encrypt] && !file[:encrypt].is_a?(Hash)
        precrypt = true if file[:encrypt] == 'precrypt'
        file[:encrypt] = {:precrypt => precrypt}
      end
      # perms value are octal, but kwalify might treat it as decimal if it's something like 4550
      # the user might also use string instead of number
      if file[:posix] && file[:posix][:perms] && 
        (file[:posix][:perms].is_a?(String) or file[:posix][:perms] >= 1000)
        file[:posix][:perms] = "#{file[:posix][:perms]}".oct
      end
    end if @hash[:files] && @hash[:files][:files]
  elsif @format == 'xml'
    @hash = metadata_xml_to_hash.with_indifferent_access
  else
    raise "Unknown metadata format"
  end
  @hash
end
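
The permissions fix-up in the yml branch can be sketched on its own. The function `normalize_perms` is a hypothetical extraction of that condition: strings, and integers of 1000 or more, are re-read as octal digits, while smaller integers are assumed to be octal values already and are kept.

```ruby
# Hypothetical standalone version of the perms normalization above.
def normalize_perms(perms)
  if perms.is_a?(String) || perms >= 1000
    # Re-read the decimal digits as an octal number, e.g. 4550 becomes 0o4550.
    perms.to_s.oct
  else
    perms
  end
end

puts normalize_perms(4550).to_s(8)    # integer that was read as decimal
puts normalize_perms('644').to_s(8)   # string supplied by the user
puts normalize_perms(0o755).to_s(8)   # already an octal value, left alone
```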

#validate(schema_dir) ⇒ Object

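Validates the metadata and returns an array of errors (empty if there are none). YAML metadata is checked against the schema file named in the metadata, or against the default schema.yml, both looked up in schema_dir. XML metadata is only checked for required fields until DTD validation is implemented. If the schema file does not exist, a warning is printed and nil is returned.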



# File 'lib/tpkg/metadata.rb', line 436

def validate(schema_dir)
  errors = []
  if @format == 'yml'
    if to_hash[:schema_file] 
      schema_file = File.join(schema_dir, to_hash[:schema_file]) 
    else
      schema_file = File.join(schema_dir, "schema.yml") 
    end
    unless File.exist?(schema_file)
      warn "Warning: unable to validate metadata because #{schema_file} does not exist"
      return
    end 
    errors = verify_yaml(schema_file, @text)
  elsif @format == 'xml'
    # TODO: use DTD to validate XML
    errors = verify_required_fields
  end

  # Verify version and package version begin with a digit
  if to_hash[:version].to_s !~ /^\d/
    errors << "Version must begins with a digit"
  end
  if to_hash[:package_version] && to_hash[:package_version].to_s !~ /^\d/
    errors << "Package version must begins with a digit"
  end
  errors
end
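
The version checks at the end of the method do not depend on the metadata format. A minimal sketch of just that part follows; `version_errors` is a hypothetical function name, and its messages are reworded slightly from the ones in the source.

```ruby
# Hypothetical standalone version of the leading-digit checks.
def version_errors(version, package_version = nil)
  errors = []
  errors << 'Version must begin with a digit' if version.to_s !~ /^\d/
  if package_version && package_version.to_s !~ /^\d/
    errors << 'Package version must begin with a digit'
  end
  errors
end

p version_errors('1.0')
p version_errors('v1.0', 'beta')
p version_errors(nil)   # a missing version also fails the check
```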

#verify_required_fields ⇒ Object

Once we implement validating the XML using the DTD, we won’t need this method anymore



# File 'lib/tpkg/metadata.rb', line 479

def verify_required_fields
  errors = []
  REQUIRED_FIELDS.each do |reqfield|
    if to_hash[reqfield].nil?
      errors << "Required field #{reqfield} not found"
    elsif to_hash[reqfield].to_s.empty?
      errors << "Required field #{reqfield} is empty"
    end
  end
  errors
end
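
The same check can be run against a plain hash to see the two kinds of message it produces. In this sketch `missing_field_errors` is a hypothetical function and the sample hash is made up.

```ruby
REQUIRED_FIELDS = [:name, :version, :maintainer, :description]

# Hypothetical standalone version that checks a plain hash.
def missing_field_errors(hash)
  errors = []
  REQUIRED_FIELDS.each do |reqfield|
    if hash[reqfield].nil?
      errors << "Required field #{reqfield} not found"
    elsif hash[reqfield].to_s.empty?
      errors << "Required field #{reqfield} is empty"
    end
  end
  errors
end

puts missing_field_errors({ name: 'demo', version: '1.0', maintainer: '' })
```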

#verify_yaml(schema, yaml_text) ⇒ Object

Verifies the YAML text against the given schema. Returns an array of errors (if there are any).



# File 'lib/tpkg/metadata.rb', line 466

def verify_yaml(schema, yaml_text)
  errors = nil
  # Kwalify generates lots of warnings, silence it
  Silently.silently do
    schema = Kwalify::Yaml.load_file(schema)
    validator = Kwalify::Validator.new(schema.with_indifferent_access)
    errors = validator.validate(YAML::load(yaml_text).with_indifferent_access)
  end
  errors
end

#write(dir) ⇒ Object

Writes the metadata to a file under the specified directory. The file is saved as tpkg.yml, even if the metadata was originally loaded as XML.



# File 'lib/tpkg/metadata.rb', line 343

def write(dir)
  File.open(File.join(dir, "tpkg.yml"), "w") do |file|
    # When we convert xml to hash, we store the key as symbol. So when we
    # write back out to file, we should stringify all the keys for readability.
    data = to_hash.recursively{|h| h.stringify_keys }
    YAML::dump(data, file)
  end
end
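
The reason for stringifying keys before dumping is visible in the YAML output: symbol keys are written with a leading colon. The sketch below uses a hypothetical pure-Ruby helper, `deep_stringify_keys`, in place of the `recursively` and `stringify_keys` helpers the method relies on; the sample data is made up.

```ruby
require 'yaml'

# Hypothetical helper: converts hash keys to strings at every nesting level.
def deep_stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_s] = deep_stringify_keys(v) }
  when Array
    value.map { |item| deep_stringify_keys(item) }
  else
    value
  end
end

data = { name: 'demo', files: { files: [{ path: '/etc/demo.conf', config: true }] } }
puts YAML.dump(data)                        # keys appear as :name, :files, ...
puts YAML.dump(deep_stringify_keys(data))   # keys appear as name, files, ...
```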