Microdata parser for RDF.rb.
RDF::Microdata is a Microdata reader for Ruby using the RDF.rb library suite.
RDF::Microdata parses Microdata into statements or triples.
- Microdata parser.
- Uses Nokogiri for parsing HTML
Install with 'gem install rdf-microdata'
Reading RDF data in the Microdata format
graph = RDF::Graph.load("etc/foaf.html", :format => :microdata)
The Microdata editor has recently dropped support for RDF conversion. As a result, this gem is being used to investigate ways in which Microdata might have more satisfactory RDF generation.
Generating RDF friendly URIs from terms
If the @itemprop is included within an item having an @itemtype, the URI of the
@itemtype will be used for generating a term URI. The type URI will be trimmed following
the last '#' or '/' character, and the term will be appended to the resulting URI. This is in keeping
with the standard convention for defining properties and classes within an RDFS or OWL vocabulary.
<div itemscope itemtype="http://schema.org/Person"> My name is <span itemprop="name">Gregg</span> </div>
Without the :rdf_terms option, this would create the following statements:
@prefix md: <http://www.w3.org/1999/xhtml/microdata#> .
@prefix schema: <http://schema.org/> .

<> md:item [
  a schema:Person;
  <http://www.w3.org/1999/xhtml/microdata#http://schema.org/Person%23:name> "Gregg"
] .
With the :rdf_terms option, this becomes:
@prefix md: <http://www.w3.org/1999/xhtml/microdata#> .
@prefix schema: <http://schema.org/> .

<> md:item [
  a schema:Person;
  schema:name "Gregg"
] .
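The trimming rule for term URIs can be sketched in plain Ruby. This is an illustration only, not the gem's implementation: `term_uri` is a hypothetical helper name, and raising an error when the type URI contains neither '#' nor '/' is an assumption made for this sketch.

```ruby
# Hypothetical helper, not part of the RDF::Microdata API.
# Builds a term URI from an @itemtype URI and an @itemprop name by
# keeping the type URI up to and including its last '#' or '/' and
# appending the term.
def term_uri(type_uri, term)
  cut = type_uri.rindex(%r{[#/]})
  # Assumption: a type URI without '#' or '/' cannot be used as a base.
  raise ArgumentError, "no '#' or '/' in #{type_uri}" if cut.nil?

  type_uri[0..cut] + term
end

puts term_uri("http://schema.org/Person", "name")
puts term_uri("http://example.org/vocab#Person", "name")
```

With the type URI from the example above, the helper yields `http://schema.org/name`, which is the URI that `schema:name` abbreviates.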
Improve xsd:date, xsd:time, xsd:dateTime and xsd:duration generation from time element
Use the lexical form of the @datetime attribute of the time element to determine the specific type of the generated literal.
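As a rough sketch of how a datatype could be chosen from the lexical form, the following plain Ruby matches a @datetime value against simplified patterns. The helper name `datetime_datatype` is hypothetical, the regular expressions only approximate the XSD lexical forms, and returning nil for an unrecognized value is an assumption; the gem itself may decide differently.

```ruby
XSD = "http://www.w3.org/2001/XMLSchema#".freeze

# Optional timezone suffix: "Z" or a +hh:mm / -hh:mm offset.
TZ = /(Z|[+-]\d{2}:\d{2})?/

# Hypothetical helper, not part of the RDF::Microdata API.
# Returns the XSD datatype URI suggested by the lexical form of a
# @datetime value, or nil when no pattern matches (an assumption).
def datetime_datatype(lexical)
  case lexical
  when /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?#{TZ}\z/
    XSD + "dateTime"
  when /\A\d{4}-\d{2}-\d{2}#{TZ}\z/
    XSD + "date"
  when /\A\d{2}:\d{2}:\d{2}(\.\d+)?#{TZ}\z/
    XSD + "time"
  when /\A-?P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?\z/
    XSD + "duration"
  end
end

["2011-06-28", "12:00:00Z", "2011-06-28T12:00:00Z", "PT1H30M"].each do |value|
  puts "#{value} => #{datetime_datatype(value)}"
end
```

The dateTime pattern is tested first because it is the most specific; a value that matches none of the patterns would be left without a datatype in this sketch.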
Remove implicit RDF triple generation
The html>head>title and anchor (a) elements no longer generate triples unless they have @item* attributes.
Full documentation available on Rubydoc.info
- Asserts :html format, text/html mime-type and .html file extension.
- Add support for LibXML and REXML bindings, and use the best available
- Consider a SAX-based parser for improved performance
- Do your best to adhere to the existing coding conventions and idioms.
- Don't use hard tabs, and don't leave trailing whitespace on any line.
- Do document every method you add using YARD annotations. Read the tutorial or just look at the existing code for examples.
- Don't touch the AUTHORS files. If you need to change them, do so on your private branch only.
- Do feel free to add yourself to the CREDITS file and the corresponding list in the README. Alphabetical order applies.
- Do note that in order for us to merge any non-trivial changes (as a rule of thumb, additions larger than about 15 lines of code), we need an explicit public domain dedication on record from you.