PoshRDF

From W3C Wiki

poshRDF (Draft)

poshRDF is an approach (and, as a by-product, an RDF-in-HTML syntax) to efficiently extract RDF from (many) microformats and similar, ad-hoc POSH markup. The generated RDF is a single-namespace (intermediate) triple representation that can be converted to deployed vocabularies such as FOAF, iCal/RDF, vcard/RDF, or RSS 1.0 in a post-processing step.

Because the output is not converted directly to a target RDF term set, it remains possible to generate other representations (e.g. JSON) for a standardized microformats test suite and/or a generic posh formats validator, without having to spend time on the RDF vocabulary market.

poshRDF is valid HTML 4, XHTML 1.0, or HTML 5. It follows the microformats idea of using @class values as structural hooks. All a posh format parser needs to know is:

  • a list of terms,
  • whether a term indicates a triple subject, predicate, or object,
  • the possible containers of a predicate term (default: parent subject or document),
  • whether the object value type is a node, a simple literal, or markup, and
  • a target namespace and prefix (optional; the defaults are "posh" and "http://poshrdf.org/ns/posh/").
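
The five items above could be captured in a small data structure. The following is only a minimal sketch: the class names `Term` and `PoshFormat`, their fields, and the `ADR` definition are hypothetical and not part of the draft; only the default prefix and namespace, and the "mf" namespace from the microformats example, are taken from this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Term:
    """One term of a posh format (hypothetical structure, not from the draft)."""
    name: str                                # the @class value, e.g. "postal-code"
    roles: List[str]                         # any of "s", "p", "o"
    containers: Optional[List[str]] = None   # None = parent subject or document
    value_type: str = "literal"              # "node", "literal", or "markup"


@dataclass
class PoshFormat:
    """A format definition: terms plus an optional target namespace and prefix."""
    terms: Dict[str, Term]
    prefix: str = "posh"
    namespace: str = "http://poshrdf.org/ns/posh/"

    def uri(self, term_name: str) -> str:
        # Terms live in a single namespace, so a URI is a plain concatenation.
        return self.namespace + term_name


# Assumed definition for the adr example; roles mirror the injected classes
# shown in the microformats example on this page.
ADR = PoshFormat(
    terms={
        "adr": Term("adr", roles=["p", "s"], value_type="node"),
        "postal-code": Term("postal-code", roles=["p", "o"], containers=["adr"]),
    },
    prefix="mf",
    namespace="http://poshrdf.org/ns/mf#",
)
```

With a definition like this, the parser needs no format-specific code; everything format-specific sits in the term table.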

Based on these definitions, a poshRDF processor injects additional class values (rdf-s, rdf-p, rdf-o, rdf-o-xml) into the currently processed DOM node. The subsequent parsing step(s) then generate triples based on the simple poshRDF processing instructions (see below). This simplifies single-pass parsing of any number of terms.
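
The injection step might look roughly like the sketch below. It works on a single @class attribute value rather than a real DOM node, and the names `TermDef`, `TERMS`, and `inject_classes` are hypothetical; the draft does not prescribe an API or the order of the injected values.

```python
from typing import Dict, List, NamedTuple


class TermDef(NamedTuple):
    roles: List[str]       # any of "s", "p", "o"
    markup: bool = False   # True if the object value is markup (rdf-o-xml)


# Hypothetical term table for the adr example.
TERMS: Dict[str, TermDef] = {
    "adr": TermDef(["p", "s"]),
    "postal-code": TermDef(["p", "o"]),
}


def inject_classes(class_attr: str, terms: Dict[str, TermDef] = TERMS) -> str:
    """Return the @class value with poshRDF indicator classes added."""
    classes = class_attr.split()
    injected: List[str] = []
    for name in classes:
        term = terms.get(name)
        if term is None:
            continue  # unknown class values are left untouched
        for role in term.roles:
            marker = "rdf-o-xml" if (role == "o" and term.markup) else "rdf-" + role
            if marker not in classes and marker not in injected:
                injected.append(marker)
    return " ".join(injected + classes)


print(inject_classes("adr"))
print(inject_classes("postal-code"))
```

Class values that are not in the term table pass through unchanged, so markup that already carries explicit rdf-s, rdf-p, or rdf-o values is not altered.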

poshRDF as native RDF-in-HTML syntax

poshRDF can also be used to encode (a subset of) RDF in HTML, similar to RDFa (but less expressive) or eRDF, by explicitly adding the term-indicating @class values. poshRDF uses a combination of default prefixes (@@todo prefix registry) for common vocabularies, and optional namespace declarations following the eRDF and RDFa mechanisms.

Syntax (work in progress)

The syntax is based on the experience that RDF newcomers often don't understand the naming conventions in RDF serializations (e.g. "about" vs. "resource" or "nodeID"), but get the graph/triple idea rather quickly (e.g. to write SPARQL queries). XML namespaces are another barrier (even though more a political one, it seems), so poshRDF has a set of "built-in" prefixes. Assuming that the RDF triple model is understandable (subject-predicate-object, where objects can be subjects, too, so that the result is a graph), poshRDF introduces basic subject, predicate, and object indicators: "rdf-s", "rdf-p", and "rdf-o". For markup objects, "rdf-o-xml" can be used. (In a later version, typed objects could perhaps be indicated by additional suffixes, e.g. "rdf-o-integer" or "rdf-o-boolean".) Predicate URIs are either specified as prefixed names following the eRDF syntax (prefix + "-" + local name, e.g. "foaf-name") or as full URIs.
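
Resolving a prefixed name to a predicate URI could be sketched as follows. The prefix table is an assumption, since the prefix registry is still a to-do item; the FOAF, RDFS, and Dublin Core namespace URIs are the published ones. The function name `resolve_predicate` is hypothetical.

```python
from typing import Dict, Optional

# Assumed built-in prefixes; the actual registry is not defined yet.
DEFAULT_PREFIXES: Dict[str, str] = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The indicator classes look like prefixed names but are never predicates.
INDICATORS = {"rdf-s", "rdf-p", "rdf-o", "rdf-o-xml"}


def resolve_predicate(token: str,
                      prefixes: Dict[str, str] = DEFAULT_PREFIXES) -> Optional[str]:
    """Map a @class/@rel token or @href value to a predicate URI, or None."""
    if token in INDICATORS:
        return None
    if "://" in token:
        return token  # already a full URI
    # eRDF style: prefix + "-" + local name, split at the first hyphen only.
    prefix, sep, local = token.partition("-")
    if sep and local and prefix in prefixes:
        return prefixes[prefix] + local
    return None


print(resolve_predicate("foaf-family_name"))
print(resolve_predicate("rdfs-seeAlso"))
```

Splitting at the first hyphen only keeps local names that contain hyphens intact, at the cost of ruling out hyphens in prefixes.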

Rules and Processing

  • document/head-level additions: none required, optional prefix declarations
  • prefixes:
    • set of defaults (a tiny bit of centralization can't hurt, eh?)
    • xmlns mechanism
    • Dublin Core and eRDF mechanism (link-tag, a-tag)
  • subject declaration
    • default: page URL
    • @class="rdf-s" and first of
      • @href
      • @src
      • @title? (to be discussed)
      • @value
      • an auto-generated URI or bnodeID for the given node
  • subjects are set for sub-nodes and following siblings(!)
  • predicate declarations
    • @class="rdf-p" and
      • any @class|rel="[pre-defined POSH term or prefixed name]"
      • or (if no @class|rel) @href
  • predicates are set for sub-nodes and following siblings(!)
  • object declarations
    • @class="rdf-o|rdf-o-xml" and first of
      • an auto-generated URI or bnodeID for the given node (if rdf-o and also rdf-s)
      • @href (if rdf-o)
      • @src (if rdf-o)
      • @title (if rdf-o)
      • @value (if rdf-o)
      • node content (if rdf-o)
      • node content markup (rdf-o-xml)
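
The "first of" rules for subjects and objects can be read as simple fallback chains. The sketch below follows the order of the lists above; the function names, the tuple return shape, and the choice to treat @title and @value as literals while @href and @src become resources are assumptions, as the draft does not settle these points. Scoping of subjects and predicates to sub-nodes and following siblings is not covered.

```python
from itertools import count
from typing import Dict, List, Optional, Tuple

_bnode_ids = count(1)


def new_bnode() -> str:
    """Auto-generate a blank node ID for a DOM node."""
    return "_:bn%d" % next(_bnode_ids)


def resolve_subject(attrs: Dict[str, str]) -> str:
    """Subject of an rdf-s node: first of @href, @src, @title, @value, bnode."""
    for name in ("href", "src", "title", "value"):  # @title is still under discussion
        if attrs.get(name):
            return attrs[name]
    return new_bnode()


def resolve_object(classes: List[str], attrs: Dict[str, str], text: str = "",
                   markup: str = "",
                   node_id: Optional[str] = None) -> Optional[Tuple[str, str]]:
    """Return (kind, value) for an object node, or None if it is not an object."""
    if "rdf-o" in classes:
        if "rdf-s" in classes:
            # The node is both object and subject: use its own identifier.
            return ("node", node_id or new_bnode())
        for name in ("href", "src"):
            if attrs.get(name):
                return ("uri", attrs[name])
        for name in ("title", "value"):   # assumed to yield literals
            if attrs.get(name):
                return ("literal", attrs[name])
        return ("literal", text.strip())  # node content
    if "rdf-o-xml" in classes:
        return ("markup", markup)         # node content markup
    return None
```

For example, `resolve_object(["rdf-p", "rdf-o"], {"href": "http://example.com/"}, "Homepage")` takes the @href branch and never looks at the link text.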

Examples

stand-alone poshRDF

HTML
<a class="rdf-s" href="#me">My</a>
<a class="rdf-p" href="...foaf/0.1/name">name</a> is
<span class="rdf-o">
  <span class="rdf-p rdf-o foaf-givenname">John</span>
  <span class="rdf-p rdf-o foaf-family_name">Doe</span>
</span>
<a class="rdf-p foaf-homepage rdfs-seeAlso rdf-o" href="http://doe.com/">
  Homepage
</a>.
RDF
<#me> foaf:name "John Doe" ;
      foaf:givenname "John" ;
      foaf:family_name "Doe" ;
      foaf:homepage <http://doe.com/> ;
      rdfs:seeAlso <http://doe.com/> .


microformats

Source HTML
<span class="adr">
  Postal Code: <span class="postal-code">40468</span>
</span>
HTML as seen by a poshRDF processor
<span class="rdf-p adr rdf-s">
  <span class="rdf-p rdf-o postal-code">40468</span>
</span>
generated RDF
@prefix mf: <http://poshrdf.org/ns/mf#> .
<> mf:adr _:bn1 .
_:bn1 mf:postal-code "40468" .
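
The post-processing step that turns these intermediate triples into a deployed vocabulary could be as simple as a predicate lookup. The sketch below is illustrative only: the mapping table is an assumption, chosen here to target the 2006 vCard/RDF namespace, and the names `MF_TO_VCARD` and `convert` are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

MF = "http://poshrdf.org/ns/mf#"
VCARD = "http://www.w3.org/2006/vcard/ns#"

# Assumed mapping from intermediate terms to target vocabulary terms.
MF_TO_VCARD: Dict[str, str] = {
    MF + "adr": VCARD + "adr",
    MF + "postal-code": VCARD + "postal-code",
}


def convert(triples: List[Triple], mapping: Dict[str, str]) -> List[Triple]:
    """Rewrite predicates via the mapping; unmapped predicates pass through."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


# The two triples generated from the adr example ("" stands for the document).
intermediate: List[Triple] = [
    ("", MF + "adr", "_:bn1"),
    ("_:bn1", MF + "postal-code", '"40468"'),
]

converted = convert(intermediate, MF_TO_VCARD)
for s, p, o in converted:
    print("<%s> <%s> %s ." % (s, p, o))
```

A real converter would likely need more than one-to-one predicate renaming (e.g. adding rdf:type triples), which this sketch does not attempt.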


Implementations