RDF::Microdata reader/writer

Microdata parser for RDF.rb.


RDF::Microdata is a Microdata reader for Ruby using the RDF.rb library suite.


RDF::Microdata parses Microdata into statements or triples.

  • Microdata parser.
  • Uses Nokogiri for parsing HTML

Install with 'gem install rdf-microdata'


Reading RDF data in the Microdata format

graph = RDF::Graph.load("etc/foaf.html", :format => :microdata)


The Microdata editor has recently dropped support for RDF conversion. As a result, this gem is being used to investigate ways in which Microdata might generate more satisfactory RDF.

Generating RDF friendly URIs from terms

If the @itemprop is included within an item having an @itemtype, the URI of the @itemtype will be used for generating a term URI. The type URI will be trimmed following the last '#' or '/' character, and the term will be appended to the resulting URI. This is in keeping with standard convention for defining properties and classes within an RDFS or OWL vocabulary.

For example:

<div itemscope itemtype="http://schema.org/Person">
  My name is <span itemprop="name">Gregg</span>
</div>

Without the :rdf_terms option, this would create the following statements:

@prefix md: <http://www.w3.org/1999/xhtml/microdata#> .
@prefix schema: <http://schema.org/> .
<> md:item [
  a schema:Person;
  <http://www.w3.org/1999/xhtml/microdata#http://schema.org/Person%23:name> "Gregg"
] .

With the :rdf_terms option, this becomes:

@prefix md: <http://www.w3.org/1999/xhtml/microdata#> .
@prefix schema: <http://schema.org/> .
<> md:item [ a schema:Person; schema:name "Gregg" ] .
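The trimming rule described above can be sketched in plain Ruby. This is only an illustration of the rule as stated in this section, not the gem's implementation: `term_uri` is a hypothetical helper name, and returning nil when the type URI contains no '#' or '/' is an assumption of the sketch.

```ruby
# Hypothetical helper, not part of the rdf-microdata API.
# Builds a property URI from an item's @itemtype URI and an @itemprop term.
def term_uri(itemtype, term)
  # Position of the last '#' or '/' in the type URI.
  cut = itemtype.rindex(/[#\/]/)
  # Assumption of this sketch: without a separator there is nothing to trim.
  return nil if cut.nil?
  # Keep everything up to and including the separator, then append the term.
  itemtype[0..cut] + term
end

puts term_uri("http://schema.org/Person", "name")
# prints "http://schema.org/name"
```

The same rule covers hash-style vocabularies: a type URI ending in `#Thing` yields a term URI that keeps the part up to and including the '#'.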

Improve xsd:date, xsd:time, xsd:dateTime and xsd:duration generation from time element

Use the lexical form of the @datetime attribute of the time element to determine the specific type of the generated literal.
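One way to picture choosing a datatype from the lexical form is a set of patterns tried in order. The sketch below is an assumption-laden illustration and not the gem's code: `datetime_datatype`, `XSD_NS` and `DATETIME_PATTERNS` are hypothetical names, and the regular expressions are deliberately simplified (they allow omitted seconds, for instance, and do not validate value ranges).

```ruby
# Hypothetical sketch, not the gem's implementation.
XSD_NS = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical patterns, tried in insertion order.
DATETIME_PATTERNS = {
  "dateTime" => /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?\z/,
  "date"     => /\A\d{4}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?\z/,
  "time"     => /\A\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?\z/,
  "duration" => /\AP(?=\d|T\d)(\d+Y)?(\d+M)?(\d+D)?(T(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?\z/
}.freeze

# Returns the XSD datatype URI matching the @datetime value,
# or nil when no pattern matches (the caller would then emit a plain literal).
def datetime_datatype(value)
  name, _pattern = DATETIME_PATTERNS.find { |_, pattern| pattern.match?(value) }
  name && XSD_NS + name
end

puts datetime_datatype("2011-06-28")
# prints "http://www.w3.org/2001/XMLSchema#date"
```

Falling back to nil for unrecognized values keeps the decision about untyped literals with the caller, which is a design choice of this sketch only.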

Remove implicit RDF triple generation

The html>head>title element and anchor (a) elements no longer generate triples unless they carry @item* attributes.



Full documentation available on Rubydoc.info

Principal Classes

Additional vocabularies


  • Add support for LibXML and REXML bindings, and use the best available
  • Consider a SAX-based parser for improved performance




  • Do your best to adhere to the existing coding conventions and idioms.
  • Don't use hard tabs, and don't leave trailing whitespace on any line.
  • Do document every method you add using YARD annotations. Read the tutorial or just look at the existing code for examples.
  • Don't touch the .gemspec, VERSION or AUTHORS files. If you need to change them, do so on your private branch only.
  • Do feel free to add yourself to the CREDITS file and the corresponding list in the README. Alphabetical order applies.
  • Do note that in order for us to merge any non-trivial changes (as a rule of thumb, additions larger than about 15 lines of code), we need an explicit public domain dedication on record from you.


This is free and unencumbered public domain software. For more information, see http://unlicense.org/ or the accompanying UNLICENSE file.