Class: Stash::Harvester::OAIPMH::OAISourceConfig

Inherits:
SourceConfig
  • Object
Defined in:
lib/stash/harvester/oaipmh/oai_source_config.rb

Overview

The configuration of an OAI-PMH data source. By default it harvests Dublin Core (oai_dc) at days granularity, across all record sets.

Instance Attribute Summary

Attributes inherited from SourceConfig

#source_uri

Instance Method Summary

Methods inherited from SourceConfig

from_yaml

Constructor Details

#initialize(oai_base_url:, metadata_prefix: DUBLIN_CORE, set: nil, seconds_granularity: false) ⇒ OAISourceConfig

Constructs a new Stash::Harvester::OAIPMH::OAISourceConfig with the specified properties.

Parameters:

  • oai_base_url (URI, String)

    the base URL of the repository. *(Required)*

  • metadata_prefix (String, nil) (defaults to: DUBLIN_CORE)

    the metadata prefix defining the metadata format requested from the repository. If metadata_prefix is omitted, the prefix oai_dc (Dublin Core) will be used.

  • set (String, nil) (defaults to: nil)

    the colon-separated path to the set requested for selective harvesting from the repository. If set is omitted, harvesting will be across all sets.

  • seconds_granularity (Boolean) (defaults to: false)

    whether to include the full time out to the second in the from / until time range. (Defaults to false, i.e., days granularity.)
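The effect of seconds_granularity on the from / until range can be illustrated with a small sketch. It assumes the two datestamp forms defined by OAI-PMH (YYYY-MM-DD for days, YYYY-MM-DDThh:mm:ssZ for seconds); the helper name oai_datestamp is hypothetical and is not part of this class.

```ruby
# Hypothetical helper (not part of OAISourceConfig): formats a harvest
# boundary as an OAI-PMH datestamp at the requested granularity.
def oai_datestamp(time, seconds_granularity: false)
  utc = time.getutc # getutc returns a new Time and leaves the argument untouched
  if seconds_granularity
    utc.strftime('%Y-%m-%dT%H:%M:%SZ')
  else
    utc.strftime('%Y-%m-%d')
  end
end

from_time = Time.utc(2015, 7, 1, 10, 30, 15)
puts oai_datestamp(from_time)                            # 2015-07-01
puts oai_datestamp(from_time, seconds_granularity: true) # 2015-07-01T10:30:15Z
```

With the default of false, two harvests on the same day would use the same from value, so the time of day is not available to narrow the range.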

Raises:

  • (URI::InvalidURIError)

    if oai_base_url is a string that is not a valid URI

  • (ArgumentError)

    if metadata_prefix or any colon-separated element of set contains invalid characters, i.e. anything other than URI unreserved characters per RFC 2396
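The validation rule above can be sketched as a standalone method. This is an illustration under stated assumptions, not the class's own validator: check_set_spec and UNRESERVED are hypothetical names, the character class spells out the RFC 2396 unreserved set (alphanumerics plus - _ . ! ~ * ' ( )) by hand, and the pattern is anchored with \A and \z rather than ^ and $.

```ruby
# Assumption: RFC 2396 "unreserved" = alphanumerics plus the marks - _ . ! ~ * ' ( )
UNRESERVED = /\A[A-Za-z0-9\-_.!~*'()]+\z/

# Hypothetical stand-in for the private validator: checks each
# colon-separated element and rebuilds the path from the valid elements.
def check_set_spec(set_spec)
  return nil unless set_spec
  elements = set_spec.split(':').map do |element|
    unless UNRESERVED.match?(element)
      raise ArgumentError, "setSpec element '#{element}' must consist only of RFC 2396 URI unreserved characters"
    end
    element
  end
  elements.join(':')
end

puts check_set_spec('physics:hep') # physics:hep

begin
  check_set_spec('physics:hep/th')
rescue ArgumentError => e
  puts e.message
end
```

A slash, space, or empty element (as in 'physics::hep') is rejected, while nil passes through unchanged and means "all sets".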


# File 'lib/stash/harvester/oaipmh/oai_source_config.rb', line 51

def initialize(oai_base_url:, metadata_prefix: DUBLIN_CORE, set: nil, seconds_granularity: false)
  super(source_url: oai_base_url)
  @seconds_granularity = seconds_granularity
  @metadata_prefix = valid_prefix(metadata_prefix)
  @set = valid_spec(set)
end

Instance Attribute Details

#metadata_prefix ⇒ String (readonly)

Returns the metadata prefix defining the metadata format requested from the repository.

Returns:

  • (String)

    the metadata prefix defining the metadata format requested from the repository.


# File 'lib/stash/harvester/oaipmh/oai_source_config.rb', line 17

class OAISourceConfig < SourceConfig

  # ------------------------------------------------------------
  # Constants

  DUBLIN_CORE = 'oai_dc'
  private_constant :DUBLIN_CORE

  UNRESERVED_PATTERN = Regexp.new("^[#{URI::RFC2396_REGEXP::PATTERN::UNRESERVED}]+$")
  private_constant :UNRESERVED_PATTERN

  # ------------------------------------------------------------
  # Attributes

  attr_reader :seconds_granularity
  attr_reader :metadata_prefix
  attr_reader :set

  # ------------------------------------------------------------
  # Initializer

  # Constructs a new {OAISourceConfig} with the specified properties.
  #
  # @param oai_base_url [URI, String] the base URL of the repository. *(Required)*
  # @param metadata_prefix [String, nil] the metadata prefix defining the metadata format requested
  #   from the repository. If +metadata_prefix+ is omitted, the prefix +oai_dc+ (Dublin Core)
  #   will be used.
  # @param set [String, nil] the colon-separated path to the set requested for selective harvesting
  #   from the repository. If +set+ is omitted, harvesting will be across all sets.
  # @param seconds_granularity [Boolean] whether to include the full time out to the second in
  #   the from / until time range. (Defaults to +false+, i.e., days granularity.)
  # @raise [URI::InvalidURIError] if +oai_base_url+ is a string that is not a valid URI
  # @raise [ArgumentError] if +metadata_prefix+ or any element of +set+ contains invalid characters,
  #   i.e. anything other than URI unreserved characters per {https://www.ietf.org/rfc/rfc2396.txt RFC 2396}
  def initialize(oai_base_url:, metadata_prefix: DUBLIN_CORE, set: nil, seconds_granularity: false)
    super(source_url: oai_base_url)
    @seconds_granularity = seconds_granularity
    @metadata_prefix = valid_prefix(metadata_prefix)
    @set = valid_spec(set)
  end

  # ------------------------------------------------------------
  # Instance methods

  def to_h
    opts = { metadata_prefix: metadata_prefix }
    (opts[:set] = set) if set
    opts
  end

  # ------------------------------------------------------------
  # Private methods

  private

  # ------------------------------
  # Parameter validators

  def valid_spec(set_spec)
    return nil unless set_spec
    (set_spec.split(':').map do |element|
      if UNRESERVED_PATTERN =~ element
        element
      else
        fail ArgumentError, "setSpec element ''#{element}'' must consist only of RFC 2396 URI unreserved characters"
      end
    end).join(':')
  end

  def valid_prefix(metadata_prefix)
    if UNRESERVED_PATTERN =~ metadata_prefix
      metadata_prefix
    else
      fail ArgumentError, "metadata_prefix ''#{metadata_prefix}'' must consist only of RFC 2396 URI unreserved characters"
    end
  end

end

#seconds_granularity ⇒ Object (readonly)


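Returns whether the from / until time range includes the full time out to the second (false means days granularity).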



#set ⇒ String? (readonly)

Returns the colon-separated path to the set requested for selective harvesting.

Returns:

  • (String, nil)

    the colon-separated path to the set requested for selective harvesting.



Instance Method Details

#to_h ⇒ Object


Returns the configuration as a hash containing :metadata_prefix and, when a set is configured, :set.


# File 'lib/stash/harvester/oaipmh/oai_source_config.rb', line 61

def to_h
  opts = { metadata_prefix: metadata_prefix }
  (opts[:set] = set) if set
  opts
end
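To show the shape of the hash that to_h produces, here is a minimal sketch using a stand-in Struct rather than the real class. ConfigSketch is a hypothetical name that carries only the two attributes to_h reads; the set value 'physics:hep' and the prefix 'oai_datacite' are illustrative.

```ruby
# Hypothetical stand-in for OAISourceConfig with just the attributes to_h uses.
ConfigSketch = Struct.new(:metadata_prefix, :set) do
  # Same logic as the method above: :set is included only when present.
  def to_h
    opts = { metadata_prefix: metadata_prefix }
    opts[:set] = set if set
    opts
  end
end

all_sets = ConfigSketch.new('oai_dc', nil)
one_set  = ConfigSketch.new('oai_datacite', 'physics:hep')

p all_sets.to_h.keys # [:metadata_prefix]
p one_set.to_h.keys  # [:metadata_prefix, :set]
```

Leaving out the :set key entirely, rather than mapping it to nil, keeps "all sets" distinguishable from an explicit set value for whatever consumes the hash.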