Class: RDF::RDFXML::Writer

Inherits:
Writer
  • Object
Defined in:
lib/rdf/rdfxml/writer.rb

Overview

An RDF/XML serialiser in Ruby

Note that the natural interface is to write a whole graph at a time. Writing individual statements or triples adds them to an internal graph, which is serialized once the writer finishes.

The writer collects prefix definitions and uses them to create XML namespace declarations and to mint QNames.
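
The buffering behaviour described above can be illustrated without the library. The following is a minimal sketch, assuming a hypothetical `BufferingWriter` class that stands in for the real writer: statements are only collected by `<<`, and nothing reaches the output until `write_epilogue` runs, which is what lets the writer order and group content.

```ruby
require "stringio"

# Hypothetical stand-in for a buffering writer; not the library's class.
class BufferingWriter
  def initialize(output)
    @output = output
    @statements = []
  end

  # Collect the statement; nothing is written yet.
  def <<(statement)
    @statements << statement
    self
  end

  # Emit everything at once, in sorted order.
  def write_epilogue
    @statements.sort.each { |s| @output.puts(s.join(" ")) }
  end

  # Mirrors the shape of Writer.buffer: yield a writer, return a String.
  def self.buffer
    out = StringIO.new
    writer = new(out)
    yield writer
    writer.write_epilogue
    out.string
  end
end

text = BufferingWriter.buffer do |writer|
  writer << %w[b p o]
  writer << %w[a p o]
end
# text == "a p o\nb p o\n"
```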

Examples:

Obtaining an RDF/XML writer class

RDF::Writer.for(:rdf)         #=> RDF::RDFXML::Writer
RDF::Writer.for("etc/test.rdf")
RDF::Writer.for(:file_name      => "etc/test.rdf")
RDF::Writer.for(:file_extension => "rdf")
RDF::Writer.for(:content_type   => "application/rdf+xml")

Serializing an RDF graph into an RDF/XML file

RDF::RDFXML::Writer.open("etc/test.rdf") do |writer|
  writer << graph
end

Serializing RDF statements into an RDF/XML file

RDF::RDFXML::Writer.open("etc/test.rdf") do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an RDF/XML string

RDF::RDFXML::Writer.buffer do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Creating xml:base and namespace prefix definitions in output

RDF::RDFXML::Writer.buffer(:base_uri => "http://example.com/", :prefixes => {
    nil => "http://example.com/ns#",
    :foaf => "http://xmlns.com/foaf/0.1/"}
) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Author:

Constant Summary

VALID_ATTRIBUTES =
[:none, :untyped, :typed]

Constructor Details

#initialize(output = $stdout, options = {}) {|writer| ... } ⇒ Writer

Initializes the RDF/XML writer instance.

Parameters:

  • output (IO, File) (defaults to: $stdout)

    the output stream

  • options (Hash{Symbol => Object}) (defaults to: {})

    any additional options

Options Hash (options):

  • :canonicalize (Boolean) — default: false

    whether to canonicalize literals when serializing

  • :prefixes (Hash) — default: Hash.new

    the prefix mappings to use (not supported by all writers)

  • :base_uri (#to_s) — default: nil

    the base URI to use when constructing relative URIs

  • :max_depth (Integer) — default: 3

    Maximum depth for recursively defining resources

  • :lang (#to_s) — default: nil

    Output as the root xml:lang attribute, and avoid generating xml:lang on child elements where possible

  • :attributes (Array) — default: nil

    How to use XML attributes when serializing: one of :none, :untyped, or :typed, given as a Symbol. When the option is nil, :none is used.

  • :standard_prefixes (Boolean) — default: false

    Add standard prefixes to prefixes, if necessary.

  • :default_namespace (String) — default: nil

    URI to use as default namespace, same as prefix(nil)

Yields:

  • (writer)

Yield Parameters:

  • writer (RDF::Writer)


# File 'lib/rdf/rdfxml/writer.rb', line 86

def initialize(output = $stdout, options = {}, &block)
  super do
    @graph = RDF::Graph.new
    @uri_to_qname = {}
    @uri_to_prefix = {}
    block.call(self) if block_given?
  end
end

Instance Attribute Details

#base_uri ⇒ URI

Returns Base URI used for relativizing URIs.

Returns:

  • (URI)

    Base URI used for relativizing URIs



# File 'lib/rdf/rdfxml/writer.rb', line 59

def base_uri
  @base_uri
end

#graph ⇒ Graph

Returns Graph of statements serialized.

Returns:

  • (Graph)

    Graph of statements serialized



# File 'lib/rdf/rdfxml/writer.rb', line 57

def graph
  @graph
end

Instance Method Details

#get_qname(resource, options = {}) ⇒ String?

Return a QName for the URI, or nil. Adds namespace of QName to defined prefixes

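Parameters:

  • resource (URI, #to_s)
  • options (Hash<Symbol => Object>) (defaults to: {})

    a customizable set of options

Options Hash (options):

  • :with_default (Boolean)

    if true and the URI starts with the default namespace, return the local name with no prefix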

Returns:

  • (String, nil)

    value to use to identify URI



# File 'lib/rdf/rdfxml/writer.rb', line 178

def get_qname(resource, options = {})
  case resource
  when RDF::Node
    add_debug "qname(#{resource.inspect}): #{resource}"
    return resource.to_s
  when RDF::URI
    uri = resource.to_s
  else
    add_debug "qname(#{resource.inspect}): nil"
    return nil
  end

  qname = case
  when options[:with_default] && prefix(nil) && uri.index(prefix(nil)) == 0
    # Don't cache
    add_debug "qname(#{resource.inspect}): #{uri.sub(prefix(nil), '').inspect} (default)"
    return uri.sub(prefix(nil), '')
  when @uri_to_qname.has_key?(uri)
    add_debug "qname(#{resource.inspect}): #{@uri_to_qname[uri].inspect} (cached)"
    return @uri_to_qname[uri]
  when u = @uri_to_prefix.keys.detect {|u| uri.index(u.to_s) == 0 && NC_REGEXP.match(uri[u.to_s.length..-1])}
    # Use a defined prefix
    prefix = @uri_to_prefix[u]
    prefix(prefix, u)  # Define for output
    uri.sub(u.to_s, "#{prefix}:")
  when @options[:standard_prefixes] && vocab = RDF::Vocabulary.detect {|v| uri.index(v.to_uri.to_s) == 0 && NC_REGEXP.match(uri[v.to_uri.to_s.length..-1])}
    prefix = vocab.__name__.to_s.split('::').last.downcase
    @uri_to_prefix[vocab.to_uri.to_s] = prefix
    prefix(prefix, vocab.to_uri) # Define for output
    uri.sub(vocab.to_uri.to_s, "#{prefix}:")
  else
    
    # No vocabulary found, invent one
    # Add bindings for predicates not already having bindings
    # From RDF/XML Syntax and Processing:
    #   An XML namespace-qualified name (QName) has restrictions on the legal characters such that not all
    #   property URIs can be expressed as these names. It is recommended that implementors of RDF serializers,
    #   in order to break a URI into a namespace name and a local name, split it after the last XML non-NCName
    #   character, ensuring that the first character of the name is a Letter or '_'. If the URI ends in a
    #   non-NCName character then throw a "this graph cannot be serialized in RDF/XML" exception or error.
    separation = uri.rindex(%r{[^a-zA-Z_0-9-][a-zA-Z_][a-z0-9A-Z_-]*$})
    return @uri_to_qname[uri] = nil unless separation
    base_uri = uri.to_s[0..separation]
    suffix = uri.to_s[separation+1..-1]
    @gen_prefix = @gen_prefix ? @gen_prefix.succ : "ns0"
    @uri_to_prefix[base_uri] = @gen_prefix
    prefix(@gen_prefix, base_uri)
    "#{@gen_prefix}:#{suffix}"
  end
  
  add_debug "qname(#{resource.inspect}): #{qname.inspect}"
  @uri_to_qname[uri] = qname
rescue Addressable::URI::InvalidURIError => e
  raise RDF::WriterError, "Invalid URI #{uri.inspect}: #{e.message}"
end
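
The fallback branch above splits a URI after the last character that cannot appear in an NCName, then mints a generated prefix for the namespace part. A minimal standalone sketch of that split follows, reusing the regular expression from the source. The helper name `split_uri` and the example URIs are hypothetical; the real method also caches its result and registers the generated prefix.

```ruby
# Same pattern as the fallback branch of get_qname: one non-NCName
# character followed by a legal local name that runs to the end.
SPLIT_RE = %r{[^a-zA-Z_0-9-][a-zA-Z_][a-z0-9A-Z_-]*$}

# Hypothetical helper, not part of the library's API.
# Returns [namespace, local_name], or nil when no legal local name exists.
def split_uri(uri)
  separation = uri.rindex(SPLIT_RE)
  return nil unless separation
  [uri[0..separation], uri[(separation + 1)..-1]]
end

split_uri("http://xmlns.com/foaf/0.1/name")
# => ["http://xmlns.com/foaf/0.1/", "name"]
split_uri("http://example.com/ns#Person")
# => ["http://example.com/ns#", "Person"]
split_uri("http://example.com/")
# => nil (the URI ends in a non-NCName character)
```

When the split fails, `get_qname` caches and returns nil, so callers have to handle a URI that has no QName.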

#indent(modifier = 0) ⇒ String (protected)

Returns indent string multiplied by the depth

Parameters:

  • modifier (Integer) (defaults to: 0)

    Increase depth by specified amount

Returns:

  • (String)

    A number of spaces, depending on current depth



# File 'lib/rdf/rdfxml/writer.rb', line 308

def indent(modifier = 0)
  " " * (@depth + modifier)
end

#order_subjects ⇒ Array<Resource> (protected)

Order subjects for output. Override this to output subjects in another order.

Uses top_classes

Returns:

  • (Array<Resource>)

    Ordered list of subjects



# File 'lib/rdf/rdfxml/writer.rb', line 256

def order_subjects
  seen = {}
  subjects = []
  
  top_classes.each do |class_uri|
    graph.query(:predicate => RDF.type, :object => class_uri).map {|st| st.subject}.sort.uniq.each do |subject|
      #add_debug "order_subjects: #{subject.inspect}"
      subjects << subject
      seen[subject] = @top_levels[subject] = true
    end
  end
  
  # Sort subjects by resources over bnodes, ref_counts and the subject URI itself
  recursable = @subjects.keys.
    select {|s| !seen.include?(s)}.
    map {|r| [(r.is_a?(RDF::Node) ? 1 : 0) + ref_count(r), r]}.
    sort_by {|l| l.first }
  
  subjects += recursable.map{|r| r.last}
end
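
The second half of the method sorts the remaining subjects by a numeric key: the reference count, plus one if the subject is a blank node. A minimal sketch of that key follows, assuming a hypothetical `Node` struct in place of `RDF::Node` and a plain Hash of reference counts in place of `ref_count`. The source comment mentions the subject URI as a further criterion, but the key shown in the code does not include it, so the order of subjects with equal keys is unspecified.

```ruby
# Hypothetical stand-in for RDF::Node (a blank node).
Node = Struct.new(:id)

# Order subjects by reference count, with blank nodes pushed one step later.
def order(subjects, ref_counts)
  subjects.
    map { |s| [(s.is_a?(Node) ? 1 : 0) + ref_counts.fetch(s, 0), s] }.
    sort_by { |pair| pair.first }.
    map { |pair| pair.last }
end

bnode = Node.new("b1")
counts = { "http://example.com/c" => 2 }

order(["http://example.com/c", bnode, "http://example.com/a"], counts)
# => ["http://example.com/a", bnode, "http://example.com/c"]
```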

#predicate_order ⇒ Array<URI> (protected)

Defines the order of predicates to emit at the beginning of a resource description. Defaults to rdf:type, rdfs:label, dc:title.

Returns:

  • (Array<URI>)



# File 'lib/rdf/rdfxml/writer.rb', line 250

def predicate_order; [RDF.type, RDF::RDFS.label, RDF::DC.title]; end

#preprocess ⇒ Object (protected)

Perform any preprocessing of statements required



# File 'lib/rdf/rdfxml/writer.rb', line 278

def preprocess
  default_namespace = @options[:default_namespace] || prefix(nil)

  # Load defined prefixes
  (@options[:prefixes] || {}).each_pair do |k, v|
    @uri_to_prefix[v.to_s] = k
  end
  @options[:prefixes] = {}  # Will define actual used when matched

  if default_namespace
    add_debug("preprocess: default_namespace: #{default_namespace}")
    prefix(nil, default_namespace) 
  end

  @graph.each {|statement| preprocess_statement(statement)}
end
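
The prefix loading step inverts the user-supplied prefix map so that a namespace URI can be looked up when a QName is needed. A simplified sketch of that inversion and of the lookup it enables follows. The helper names `invert_prefixes` and `qname_for` are hypothetical, and the NCName check that `get_qname` applies to the local name is omitted here.

```ruby
# Turn { prefix => namespace } into { namespace_string => prefix }.
def invert_prefixes(prefixes)
  prefixes.each_with_object({}) { |(prefix, ns), map| map[ns.to_s] = prefix }
end

# Find a namespace the URI starts with and build "prefix:local".
# Returns nil when no known namespace matches.
def qname_for(uri, uri_to_prefix)
  ns = uri_to_prefix.keys.detect { |u| uri.start_with?(u) }
  return nil unless ns
  "#{uri_to_prefix[ns]}:#{uri[ns.length..-1]}"
end

prefixes = { :foaf => "http://xmlns.com/foaf/0.1/", :dc => "http://purl.org/dc/terms/" }
map = invert_prefixes(prefixes)

qname_for("http://xmlns.com/foaf/0.1/name", map)  # => "foaf:name"
qname_for("http://example.org/other", map)        # => nil
```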

#preprocess_statement(statement) ⇒ Object (protected)

Perform any statement preprocessing required. This is used to perform reference counts and determine required prefixes.

Parameters:

  • statement (Statement)


# File 'lib/rdf/rdfxml/writer.rb', line 298

def preprocess_statement(statement)
  #add_debug "preprocess: #{statement.inspect}"
  references = ref_count(statement.object) + 1
  @references[statement.object] = references
  @subjects[statement.subject] = true
end
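
The bookkeeping above amounts to counting how often each term appears as an object and recording every term that appears as a subject. A standalone sketch follows, assuming a hypothetical `Triple` struct with string terms and a zero-default Hash in place of the `ref_count` helper, which is not shown in this section.

```ruby
# Hypothetical stand-in for RDF::Statement.
Triple = Struct.new(:subject, :predicate, :object)

# Returns [references, subjects]: object reference counts and seen subjects.
def count_references(triples)
  references = Hash.new(0)
  subjects = {}
  triples.each do |t|
    references[t.object] += 1
    subjects[t.subject] = true
  end
  [references, subjects]
end

triples = [
  Triple.new("ex:alice", "foaf:knows", "_:b1"),
  Triple.new("ex:bob",   "foaf:knows", "_:b1"),
  Triple.new("_:b1",     "foaf:name",  "Carol")
]

references, subjects = count_references(triples)
references["_:b1"]  # => 2
subjects.keys       # => ["ex:alice", "ex:bob", "_:b1"]
```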

#relativize(uri) ⇒ String (protected)

If @base_uri is defined, use it to try to make uri relative

Parameters:

  • uri (#to_s)

Returns:

  • (String)


# File 'lib/rdf/rdfxml/writer.rb', line 238

def relativize(uri)
  uri = uri.to_s
  @base_uri ? uri.sub(@base_uri.to_s, "") : uri
end
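
Because the method relies on `String#sub`, it removes the first literal occurrence of the base URI string rather than resolving a relative reference. A small sketch of the same logic as a free function follows; the `base_uri` parameter is an illustrative stand-in for the `@base_uri` instance variable.

```ruby
# Same logic as the instance method, with the base URI passed explicitly.
def relativize(uri, base_uri = nil)
  uri = uri.to_s
  base_uri ? uri.sub(base_uri.to_s, "") : uri
end

relativize("http://example.com/people/alice", "http://example.com/")
# => "people/alice"
relativize("http://other.org/x", "http://example.com/")
# => "http://other.org/x" (no match, returned unchanged)
relativize("http://example.com/a")
# => "http://example.com/a" (no base URI set)
```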

#reset ⇒ Object (protected)



# File 'lib/rdf/rdfxml/writer.rb', line 312

def reset
  @depth = 0
  @lists = {}
  prefixes = {}
  @references = {}
  @serialized = {}
  @subjects = {}
  @top_levels = {}
end

#top_classes ⇒ Array<URI> (protected)

Defines the rdf:type values of subjects to be emitted at the beginning of the graph. Defaults to none.

Returns:

  • (Array<URI>)



# File 'lib/rdf/rdfxml/writer.rb', line 245

def top_classes; []; end

#write_epilogue

This method returns an undefined value.

Outputs the RDF/XML representation of all stored triples.

Raises:

  • (RDF::WriterError)

    when attempting to write non-conformant graph

See Also:



# File 'lib/rdf/rdfxml/writer.rb', line 129

def write_epilogue
  @force_RDF_about = {}
  @max_depth = @options[:max_depth] || 3
  @base_uri = @options[:base_uri]
  @lang = @options[:lang]
  @attributes = @options[:attributes] || :none
  @debug = @options[:debug]
  raise RDF::WriterError, "Invalid attribute option '#{@attributes}', should be one of #{VALID_ATTRIBUTES.to_sentence}" unless VALID_ATTRIBUTES.include?(@attributes.to_sym)
  self.reset

  doc = Nokogiri::XML::Document.new

  add_debug "\nserialize: graph of size #{@graph.size}"
  add_debug "options: #{@options.inspect}"

  preprocess

  prefix(:rdf, RDF.to_uri)
  prefix(:xml, RDF::XML) if @base_uri || @lang
  
  add_debug "\nserialize: graph namespaces: #{prefixes.inspect}"
  
  doc.root = Nokogiri::XML::Element.new("rdf:RDF", doc)
  doc.root["xml:lang"] = @lang if @lang
  doc.root["xml:base"] = @base_uri if @base_uri
  
  # Add statements for each subject
  order_subjects.each do |subject|
    #add_debug "subj: #{subject.inspect}"
    subject(subject, doc.root)
  end

  prefixes.each_pair do |p, uri|
    if p == nil
      doc.root.default_namespace = uri.to_s
    else
      doc.root.add_namespace(p.to_s, uri.to_s)
    end
  end

  add_debug "doc:\n #{doc.to_xml(:encoding => "UTF-8", :indent => 2)}"
  doc.write_xml_to(@output, :encoding => "UTF-8", :indent => 2)
end
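
The method starts by defaulting the :attributes option to :none and rejecting anything outside VALID_ATTRIBUTES. A minimal sketch of that check in isolation follows. The helper name `resolve_attributes` is hypothetical, and it raises ArgumentError and joins the list with commas so that it runs without the RDF library or `to_sentence`; the real method raises RDF::WriterError.

```ruby
VALID_ATTRIBUTES = [:none, :untyped, :typed]

# Hypothetical helper: apply the :none default, then validate.
def resolve_attributes(options)
  attributes = (options[:attributes] || :none).to_sym
  unless VALID_ATTRIBUTES.include?(attributes)
    raise ArgumentError,
          "Invalid attribute option '#{attributes}', " \
          "should be one of #{VALID_ATTRIBUTES.join(', ')}"
  end
  attributes
end

resolve_attributes({})                        # => :none
resolve_attributes({ :attributes => "typed" }) # => :typed
```

Passing an unknown value such as `:bogus` raises before any output is produced.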

#write_graph(graph)

This method returns an undefined value.

Write whole graph

Parameters:

  • graph (RDF::Graph)



# File 'lib/rdf/rdfxml/writer.rb', line 100

def write_graph(graph)
  @graph = graph
end

#write_statement(statement)

This method returns an undefined value.

Adds a statement to be serialized

Parameters:

  • statement (RDF::Statement)


# File 'lib/rdf/rdfxml/writer.rb', line 108

def write_statement(statement)
  @graph.insert(statement)
end

#write_triple(subject, predicate, object)

This method is abstract.

This method returns an undefined value.

Adds a triple to be serialized

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)


# File 'lib/rdf/rdfxml/writer.rb', line 119

def write_triple(subject, predicate, object)
  @graph.insert(Statement.new(subject, predicate, object))
end