Class: RdfContext::RdfaParser

Inherits:
Parser
  • Object
Defined in:
lib/rdf_context/rdfaparser.rb

Overview

An RDFa parser in Ruby

Based on processing rules described here:

See Also:

Author:

  • Ben Adida

  • Gregg Kellogg

Constant Summary

SafeCURIEorCURIEorURI =
{
  :rdfa_1_0 => [:term, :safe_curie, :uri, :bnode],
  :rdfa_1_1 => [:safe_curie, :curie, :term, :uri, :bnode],
}
TERMorCURIEorAbsURI =
{
  :rdfa_1_0 => [:term, :curie],
  :rdfa_1_1 => [:term, :curie, :absuri],
}
TERMorCURIEorAbsURIprop =
{
  :rdfa_1_0 => [:curie],
  :rdfa_1_1 => [:term, :curie, :absuri],
}
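
Each constant maps an RDFa version to the list of value forms permitted for an attribute. This page does not show how the parser consumes these tables, so the following is only a minimal sketch: `allowed_forms` and `permitted?` are hypothetical helpers, not part of RdfContext, and the table is a copy of `TERMorCURIEorAbsURIprop` under a different name so the example runs on its own.

```ruby
# Copy of TERMorCURIEorAbsURIprop from above, reproduced for a standalone sketch.
TERM_OR_CURIE_OR_ABSURI_PROP = {
  :rdfa_1_0 => [:curie],
  :rdfa_1_1 => [:term, :curie, :absuri],
}

# Hypothetical helper (not an RdfContext method): forms permitted for a version.
def allowed_forms(table, version)
  table.fetch(version) { raise ArgumentError, "unknown RDFa version: #{version.inspect}" }
end

# Hypothetical helper: is the given value form permitted under this version?
def permitted?(table, version, form)
  allowed_forms(table, version).include?(form)
end

puts permitted?(TERM_OR_CURIE_OR_ABSURI_PROP, :rdfa_1_0, :term)  # false
puts permitted?(TERM_OR_CURIE_OR_ABSURI_PROP, :rdfa_1_1, :term)  # true
```

Keying the table by version keeps the difference between RDFa 1.0 and 1.1 in data rather than in branching code.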

Instance Attribute Summary

Attributes inherited from Parser

#debug, #doc, #graph, #processor_graph, #uri

Instance Method Summary

Methods inherited from Parser

#add_debug, #add_error, #add_info, #add_processor_message, #add_triple, #add_warning, #detect_format, n3_parser, #node_path, parse, rdfa_parser, rdfxml_parser

Constructor Details

#initialize(options = {}) ⇒ RdfaParser

Creates a new parser for RDFa.

Parameters:

  • options (Hash) (defaults to: {})

    a customizable set of options

Options Hash (options):

  • :graph (Graph) — default: nil

    Graph to parse into, otherwise a new RdfContext::Graph instance is created

  • :processor_graph (Graph) — default: nil

    Graph to record information, warnings and errors.

  • :profile_graph (Graph) — default: nil

    Graph to save profile graphs.

  • :debug (Array) — default: nil

    Array to place debug messages

  • :type (:rdfxml, :html, :n3) — default: nil
  • :strict (Boolean) — default: false

    Raise Error if true; otherwise continue with lax parsing



# File 'lib/rdf_context/rdfaparser.rb', line 154

def initialize(options = {})
  super
  @profile_graph = options[:profile_graph]
  @@vocabulary_cache ||= {}
end

Instance Attribute Details

#host_language ⇒ :xhtml (readonly)

Host language

Returns:

  • (:xhtml)


# File 'lib/rdf_context/rdfaparser.rb', line 30

def host_language
  @host_language
end

#profile_graph ⇒ RdfContext::Graph

Graph instance containing parsed profiles

Returns:

  • (RdfContext::Graph)



# File 'lib/rdf_context/rdfaparser.rb', line 38

def profile_graph
  @profile_graph
end

#version ⇒ :rdfa_1_0, :rdfa_1_1 (readonly)

Version

Returns:

  • (:rdfa_1_0, :rdfa_1_1)


# File 'lib/rdf_context/rdfaparser.rb', line 34

def version
  @version
end

Instance Method Details

#parse(stream, uri = nil, options = {}) {|triple| ... } ⇒ Graph

Parse XHTML+RDFa document from a string or input stream to closure or graph.

If the parser is called with a block, triples are passed to the block rather than added to the graph.

Optionally, the stream may be a Nokogiri::HTML::Document or Nokogiri::XML::Document. With a block, yields each statement with URIRef, BNode or Literal elements.

Parameters:

  • stream (Nokogiri::HTML::Document, Nokogiri::XML::Document, #read, #to_s)

    the HTML+RDFa IO stream, string, Nokogiri::HTML::Document or Nokogiri::XML::Document

  • uri (String) (defaults to: nil)

    (nil) the URI of the document

  • options (Hash) (defaults to: {})

    a customizable set of options

Options Hash (options):

  • :graph (Graph) — default: Graph.new

    Graph to parse into, otherwise a new Graph

  • :processor_graph (Graph) — default: nil

    Graph to record information, warnings and errors.

  • :profile_graph (ConjunctiveGraph) — default: nil

    Graph to save profile graphs.

  • :debug (Array) — default: nil

    Array to place debug messages

  • :version (:rdfa_1_0, :rdfa_1_1) — default: :rdfa_1_1

    Parser version information

  • :host_language (:xhtml) — default: :xhtml

    Host Language

  • :strict (Boolean) — default: false

    Raise Error if true; otherwise continue with lax parsing

Yields:

  • (triple)

Yield Parameters:

Returns:

  • (Graph)

    Returns the graph containing parsed triples

Raises:

  • (Error)

    Raises RdfError if strict
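
The block-or-graph behaviour described above can be illustrated without the library. This is a toy sketch, not the RdfContext implementation: `SketchParser` and its `emit` method are hypothetical, a plain Array stands in for the Graph, and triples are plain three-element arrays. Only the idea of keeping the block in `@callback` is taken from the source listing on this page.

```ruby
# Toy stand-in for a parser; NOT the RdfContext implementation.
class SketchParser
  attr_reader :graph

  def initialize
    @graph = []  # a plain Array stands in for a Graph
  end

  # Hand each triple to the block when one is given, otherwise store it.
  def parse(triples, &block)
    @callback = block
    triples.each { |triple| emit(triple) }
    @graph
  end

  private

  # Hypothetical dispatch point: block wins over graph.
  def emit(triple)
    if @callback
      @callback.call(triple)
    else
      @graph << triple
    end
  end
end

TRIPLES = [
  ["http://example.org/a", "http://purl.org/dc/terms/title", "A"],
  ["http://example.org/b", "http://purl.org/dc/terms/title", "B"],
]

collected = []
with_block = SketchParser.new
with_block.parse(TRIPLES) { |t| collected << t }  # triples go to the block

without_block = SketchParser.new.parse(TRIPLES)   # triples go to the graph
```

In the sketch the return value is always the graph, which stays empty when a block consumes the triples.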



# File 'lib/rdf_context/rdfaparser.rb', line 181

def parse(stream, uri = nil, options = {}, &block) # :yields: triple
  super

  @doc = case stream
  when Nokogiri::HTML::Document then stream
  when Nokogiri::XML::Document  then stream
  else                               Nokogiri::XML.parse(stream, uri.to_s)
  end
  
  add_error(nil, "Empty document", RDFA_NS.DocumentError) if @doc.nil?
  add_warning(nil, "Syntax errors:\n#{@doc.errors}", RDFA_NS.DocumentError) unless @doc.errors.empty?
  
  @callback = block

  @version = options[:version] ? options[:version].to_sym : :rdfa_1_1
  @host_language = options[:host_language] || case @doc.root.name.downcase.to_sym
  when :html  then :xhtml
  when :svg   then :svg
  else             :xhtml
  end

  # Section 4.2 RDFa Host Language Conformance
  #
  # The Host Language may define a default RDFa Profile. If it does, the RDFa Profile triples that establish term or
  # URI mappings associated with that profile must not change without changing the profile URI. RDFa Processors may
  # embed, cache, or retrieve the RDFa Profile triples associated with that profile.
  @host_defaults = case @host_language
  when :xhtml
    @graph.bind(XHV_NS)
    {
      :vocabulary => nil,
      :prefix     => XHV_NS,
      :uri_mappings => {"xhv" => XHV_NS}, # RDF::XHTML is wrong
      :term_mappings => %w(
        alternate appendix bookmark cite chapter contents copyright first glossary help icon index
        last license meta next p3pv1 prev role section stylesheet subsection start top up
        ).inject({}) { |hash, term| hash[term] = XHV_NS.send("#{term}_"); hash },
    }
  else
    {}
  end

  @profile_graph ||= options[:profile_graph] if options.has_key?(:profile_graph)
  
  add_debug(@doc.root, "version = #{@version.inspect},  host_language = #{@host_language}")
  # parse
  parse_whole_document(@doc, @uri)

  @graph
end
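
Two steps in the listing above can be tried in isolation: choosing the host language from the root element name, and building the term mappings with `inject`. The sketch below is a standalone approximation under stated assumptions. `detect_host_language` and `term_mappings` are hypothetical helper names, and a plain string is assumed in place of the library's `XHV_NS` namespace object, so URIs are built by concatenation rather than by `XHV_NS.send`.

```ruby
# Assumption: the XHTML vocabulary namespace as a plain string,
# standing in for the library's XHV_NS object.
XHV = "http://www.w3.org/1999/xhtml/vocab#"

# Mirrors the case expression in #parse: an explicit option wins,
# otherwise the root element name decides, defaulting to :xhtml.
def detect_host_language(root_name, options = {})
  options[:host_language] || case root_name.downcase.to_sym
  when :html then :xhtml
  when :svg  then :svg
  else            :xhtml
  end
end

# Builds a term => URI hash; string concatenation replaces
# the XHV_NS.send("#{term}_") call used in the library.
def term_mappings(terms, namespace)
  terms.inject({}) { |hash, term| hash[term] = namespace + term; hash }
end

puts detect_host_language("svg")
puts term_mappings(%w(license next), XHV)["license"]
```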