
RDFa is a syntax for embedding RDF in HTML so that existing data can be reused. GRDDL is a mechanism for extracting RDF from XML dialects. One XML dialect that GRDDL can process is XHTML, which raises the question: what is the relationship between RDFa and GRDDL? This document aims to answer it.

GRDDL as an RDFa Parser

One valid use case is to use GRDDL to extract RDF from an XHTML+RDFa document. Effectively, GRDDL acts as the RDFa parser in this case. The schema document for XHTML+RDFa (under development) can (and likely will) contain a namespace-level GRDDL transformation. Such GRDDL transformations, indicated in the namespace document, are meant to apply to instances of documents that reference the namespace document. Thus, a GRDDL agent will find an XHTML+RDFa document, follow its namespace pointer, find the namespace-level transformation, and apply it to the XHTML+RDFa document to extract RDF.
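
As a rough illustration of the lookup described above, the following Python sketch finds a namespace-level transformation for a source document. It is a minimal sketch, not a conforming GRDDL agent: the namespace URI, the XSLT URL, and the `FAKE_WEB` dictionary (which stands in for dereferencing the namespace over HTTP) are all made up for the example, and the namespace document is assumed to be plain RDF/XML. Only the `data-view` property name `namespaceTransformation` comes from GRDDL itself.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Stand-in for the web: namespace URI -> namespace document.
# Both the URI and the XSLT URL are hypothetical.
FAKE_WEB = {
    "http://example.org/ns/xhtml-rdfa": """<rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dv="http://www.w3.org/2003/g/data-view#">
      <rdf:Description rdf:about="http://example.org/ns/xhtml-rdfa">
        <dv:namespaceTransformation
            rdf:resource="http://example.org/xslt/rdfa2rdfxml.xsl"/>
      </rdf:Description>
    </rdf:RDF>""",
}


def root_namespace(source_xml):
    """Return the namespace URI of the document's root element ('' if none)."""
    root = ET.fromstring(source_xml)
    # ElementTree spells qualified tags as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def namespace_transformations(source_xml, fetch=FAKE_WEB.get):
    """Follow the namespace pointer and list the transformations it declares."""
    ns = root_namespace(source_xml)
    ns_doc = fetch(ns)
    if ns_doc is None:
        return []
    found = []
    for desc in ET.fromstring(ns_doc).iter(f"{{{RDF_NS}}}Description"):
        # Only statements about the namespace itself are relevant.
        if desc.get(f"{{{RDF_NS}}}about") != ns:
            continue
        for el in desc.findall(f"{{{GRDDL_NS}}}namespaceTransformation"):
            resource = el.get(f"{{{RDF_NS}}}resource")
            if resource:
                found.append(resource)
    return found


doc = '<html xmlns="http://example.org/ns/xhtml-rdfa"><head/><body/></html>'
transformations = namespace_transformations(doc)
```

A real agent would then fetch each listed transformation and apply it to the source document; that step is left out here.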

It is important to note that, while such a mechanism extracts RDF correctly from an RDFa document, it may lose some of the features of RDFa. Specifically, the binding of triples to specific rendered HTML regions is lost. In other words, a GRDDL approach to parsing RDFa is quite reasonable when machines, and only machines, will ever deal with the structured data from that point on. If it is desirable for humans to be involved in selecting and accessing this structured data, it may be best to use a native RDFa parser that maintains the DOM-to-RDF correspondence.
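
The difference can be made concrete with a deliberately tiny Python sketch. It is not an RDFa parser: it assumes only the `about` and `property` attributes with literal text content, does not expand prefixes, and the sample markup is invented. The point is only that a native parser can keep a reference to the element each triple came from, while a triples-only output has discarded it.

```python
import xml.etree.ElementTree as ET


def extract(element, subject=None, out=None):
    """Collect (subject, property, literal, source_element) tuples."""
    if out is None:
        out = []
    # An "about" attribute sets the subject for this element and its children.
    subject = element.get("about", subject)
    prop = element.get("property")
    if prop is not None and subject is not None:
        literal = "".join(element.itertext()).strip()
        out.append((subject, prop, literal, element))
    for child in element:
        extract(child, subject, out)
    return out


PAGE = """<div about="http://example.org/photo1">
  <span property="dc:title">Sunset</span> by
  <span property="dc:creator">Alice</span>
</div>"""

root = ET.fromstring(PAGE)

# Native-parser view: every triple still points at the element it came from.
with_dom = extract(root)

# Extraction-to-RDF view: the same triples with the element reference dropped.
triples_only = [(s, p, o) for s, p, o, _element in with_dom]
```

With `with_dom`, a user interface could highlight the `span` that produced a given triple; with `triples_only`, that link can no longer be recovered.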

GRDDL for RDFa on Other XHTML Documents

Note that no existing version of XHTML/HTML currently defines RDFa. A document-level transformation, rather than a namespace-level transformation, can be used to indicate the presence of RDFa statements in such existing XHTML/HTML documents, e.g. an XHTML 1.0 or HTML 4.01 document. Such documents likely will not validate because of the extra RDFa attributes, but they are perfectly processable by GRDDL, using the GRDDL XHTML Profile or the GRDDL HTML Profile.
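
A document-level declaration of this kind can be detected as in the Python sketch below. It assumes the usual GRDDL convention for XHTML, namely a `profile` attribute on `head` naming the GRDDL profile plus `rel="transformation"` links. The XSLT URL and the page content are hypothetical, and because the sketch uses an XML parser it covers only well-formed XHTML, not HTML 4.01.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def document_transformations(xhtml):
    """List transformation URLs declared by the document itself."""
    root = ET.fromstring(xhtml)
    head = root.find(f"{{{XHTML_NS}}}head")
    # Without the GRDDL profile, rel="transformation" carries no GRDDL meaning.
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    link_tags = (f"{{{XHTML_NS}}}link", f"{{{XHTML_NS}}}a")
    found = []
    for el in root.iter():
        if el.tag not in link_tags:
            continue
        if "transformation" in el.get("rel", "").split() and el.get("href"):
            found.append(el.get("href"))
    return found


# Hypothetical XHTML 1.0 page carrying RDFa attributes and a GRDDL declaration.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Photo page</title>
    <link rel="transformation" href="http://example.org/xslt/rdfa2rdfxml.xsl"/>
  </head>
  <body><p about="#me" property="foaf:name">Alice</p></body>
</html>"""

declared = document_transformations(PAGE)
```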

Just like the previous case, this RDF extraction is meant mostly for machine readers: the DOM-RDF tie-ins are lost by GRDDL processing. However, as there is no other way to cleanly include RDFa statements in existing versions of XHTML/HTML, this direction should not be discounted. It is even possible that, by detecting this GRDDL RDFa transformation on a given document, RDFa native parsers would be able to provide the DOM-RDF correspondence features of RDFa on XHTML/HTML documents that contain the appropriate GRDDL/RDFa declaration.

hGRDDL: Using GRDDL to Transform Microformats to RDFa

Besides RDFa, there are other HTML/XHTML approaches to embedding structured data. Microformats are the preeminent example. Unfortunately, microformat syntax varies from one application domain to another: it would be quite useful to transform them into a generic syntax and structural approach, like RDF, while maintaining the DOM-RDF correspondence, like RDFa. GRDDL is already being used to transform HTML+microformats into RDF/XML.

We are currently discussing implementing hGRDDL, a GRDDL-like feature, in RDFa. Ideally, an XHTML+microformat document would contain an hGRDDL profile which would trigger a GRDDL-like transform from XHTML+microformat to XHTML+RDFa. All of the structure *and* DOM-to-data-structure correspondence from microformats will be preserved in the RDFa, allowing RDFa to become a "big umbrella" of structured data in HTML: eRDF, microformats, and custom-designed structures can all feed into the RDFa parser pipeline.

In addition, hGRDDL may be used to detect HTML-specific semantics. For example, HTML's <ul> and <li> have clear semantic significance. Likewise, a number of reserved class attribute values (next, prev, license, etc.) should be picked up. These features should not be built into a native RDFa parser, as they are highly HTML-specific, whereas the RDFa processing model aims to be relatively generic for all XML documents. Instead, each version of HTML should have a default (namespace-level?) hGRDDL transformation that maps its specific semantically-valid elements to generic RDFa statements. Thus, the processing for an XHTML 1.0+hCal document would go as follows:

  1. note the namespace pointer, dereference it, find the hGRDDL transformation for XHTML 1.0, apply it.
  2. note the HTML HEAD LINK or PROFILE to the hGRDDL transformation for hCal, apply it.
  3. either merge the two resulting HTML documents or pipeline the two transformations, first the namespace, then the profile. The latter solution is far more realistic.
  4. the final output is XHTML+RDFa and contains RDFa statements for both built-in HTML features (ul,li,class) and specific hCal features.
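
The pipelined variant of these steps might look like the following Python sketch. Everything in it is an assumption made for illustration: hGRDDL is not specified, so the two transforms are written as plain functions rather than XSLT, the mappings (reserved class values to `xhv:` rel values, `vevent` and `summary` to `cal:` terms) are simplified guesses, and the sample page is invented. Both transforms annotate the tree in place, so the DOM structure is preserved and the output is still XHTML, now carrying RDFa attributes.

```python
import xml.etree.ElementTree as ET


def xhtml_builtin_transform(root):
    """Hypothetical namespace-level step: reserved class values -> rel."""
    reserved = {"next", "prev", "license"}
    for el in root.iter():
        hits = [t for t in el.get("class", "").split() if t in reserved]
        if hits and el.get("href"):
            el.set("rel", " ".join("xhv:" + t for t in hits))
    return root


def hcal_transform(root):
    """Hypothetical profile-level step: two hCal classes -> RDFa attributes."""
    for el in root.iter():
        classes = el.get("class", "").split()
        if "vevent" in classes:
            el.set("typeof", "cal:Vevent")
        if "summary" in classes:
            el.set("property", "cal:summary")
    return root


def pipeline(root, transforms):
    """Apply each transform to the output of the previous one."""
    for transform in transforms:
        root = transform(root)
    return root


PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="vevent"><span class="summary">GRDDL telecon</span></div>
    <a class="license" href="http://example.org/license">License</a>
  </body>
</html>"""

# Namespace-level transform first, then the profile-level one.
result = pipeline(ET.fromstring(PAGE), [xhtml_builtin_transform, hcal_transform])
```

Pipelining keeps each transform independent of the others: a new microformat only needs its own step appended to the list, whereas merging would require reconciling two separately produced documents.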

Currently, we are working to determine whether hGRDDL can be implemented purely according to the existing GRDDL spec. The GRDDL working group has agreed to allow for output formats other than RDF/XML, which means RDFa is a legitimate GRDDL output. This may be enough to implement hGRDDL as GRDDL, though there is a design worry that GRDDL transformations may need to be typed or labeled to distinguish different possible outputs (RDF/XML vs. XHTML+RDFa).