RDFa vs RDFXML

From W3C Wiki


This is a note comparing RDFa and RDF/XML in terms of features, usage, etc. Feel free to comment on it or put questions inline :)

RDFa vs. RDF/XML


Some History

RDF, the Semantic Web data model, needs concrete serialisations in order to be exchanged, stored, etc. The original idea was to layer RDF on top of XML because people thought that (already available and deployed) XML tools could be reused. Interestingly, the latest version of the Semantic Web stack no longer requires RDF to be used on top of XML (cf. Fig. 1).

But RDF/XML (application/rdf+xml) is not the only player in town. There are a number of other formats, such as N3 (text/rdf+n3), Turtle (text/turtle), and TriX. What do all these serialisation formats have in common? They are all standalone formats, which assume that an RDF graph is serialised into a document that exists on its own. In the case of RDF/XML, this document is an XML document. Each of these formats has an Internet content type, sent by the server, so that the client knows how to parse the data.
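To illustrate the point about reusing XML tools, here is a minimal sketch in Python that reads a standalone RDF/XML document with a generic XML parser. The document, the example.org URI, and the function name `simple_triples` are made up for illustration. The function only understands the simplest RDF/XML pattern (an `rdf:Description` with `rdf:about` and literal-valued child elements); it is not a conforming RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical standalone RDF/XML document: one resource, one literal property.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>RDFa vs. RDF/XML</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Return (subject, predicate, literal) triples for the simplest pattern only."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```

The XML parser does the tokenising, but the mapping from elements to triples still has to be written on top of it, and the full RDF/XML syntax has many more patterns than this sketch covers.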

Now the question arises of what happens in the real world, where developers use APIs and environments such as PHP. Typically, the developer just writes some more PHP scripts to deliver RDF/XML at URIs different from those of the HTML data. (If the HTML data is really exactly equivalent to the RDF data, then content negotiation can be used to deliver both at the same URI, but this should be used with caution, typically only when the HTML and RDF are generated from the same data and contain exactly the same information.)
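The content-negotiation option can be sketched as follows, in Python rather than PHP. This is a simplified illustration, not a complete implementation of HTTP content negotiation: the function names `parse_accept` and `negotiate` are hypothetical, partial wildcards such as `text/*` are ignored, and only the `q` parameter is read.

```python
AVAILABLE = ("text/html", "application/rdf+xml")  # representations the server can produce


def parse_accept(header):
    """Return [(media_type, q)] ordered by descending q; ties keep header order."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header, available=AVAILABLE):
    """Pick the media type to serve for one URI, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]  # fall back to the server's default
    return None


print(negotiate("application/rdf+xml"))
print(negotiate("text/html,application/xhtml+xml,*/*;q=0.8"))
```

A Semantic Web agent asking for `application/rdf+xml` and a browser asking for `text/html` would then receive different representations from the same URI, which is only appropriate under the equivalence condition stated above.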

For an existing site that has pages in (X)HTML, how can the publishers integrate or reference RDF-based data if they were to publish it? A number of proposals exist for how to 'marry' RDF and HTML (see EmbeddingRDFinHTML or Sean's note).

Some years ago, W3C started to investigate a more 'natural' way of using RDF and (X)HTML together. The result is RDFa, RDF in attributes.
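To give a rough idea of what "RDF in attributes" means, the following Python sketch pulls literal-valued triples out of a small XHTML fragment. The fragment, the example.org URI, and the function name `rdfa_literals` are invented for illustration. The sketch looks only at the `about`, `property`, and `content` attributes and takes the prefix mapping as an explicit argument; real RDFa processing has many more rules, so this is not a conforming RDFa parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment: the human-readable text doubles as the RDF data.
XHTML = """<div xmlns="http://www.w3.org/1999/xhtml"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/doc">
  <h1 property="dc:title">RDFa vs. RDF/XML</h1>
  <p>Written by <span property="dc:creator">Alice</span>.</p>
</div>"""

# Assumption: prefixes are supplied by the caller, because ElementTree
# does not expose the xmlns:* declarations after parsing.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def rdfa_literals(xhtml, prefixes):
    """Return (subject, predicate, literal) triples from about/property/content."""
    triples = []

    def walk(element, subject):
        subject = element.get("about", subject)  # 'about' sets the current subject
        curie = element.get("property")
        if curie and subject:
            prefix, _, local = curie.partition(":")
            predicate = prefixes[prefix] + local
            value = element.get("content")
            if value is None:
                # No 'content' attribute: the visible text is the literal.
                value = "".join(element.itertext()).strip()
            triples.append((subject, predicate, value))
        for child in element:
            walk(child, subject)

    walk(ET.fromstring(xhtml), None)
    return triples


for triple in rdfa_literals(XHTML, PREFIXES):
    print(triple)
```

Unlike the standalone formats, no second document is involved here: the text a human reads and the triples an agent extracts come from the same markup.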

Further reading

RDF serialisations:

RDFa:

Comparison

In Fig. 2, RDF/XML and RDFa are depicted in terms of their layering on top of other technologies:

[Fig. 2: technology layering; recoverable labels: RDF/XML, URI]

The following table compares RDF/XML and RDFa.

Issue                    RDF/XML
Status                   W3C Recommendation
Unicode                  yes
URIs                     yes
XML Namespaces           yes
QNames                   yes
Spec builds on CURIEs?   no
Datatypes                yes
Standalone               yes
Reification              yes
Named Graphs             currently not, but there are discussions

Usage

Human vs. Machine

In a Semantic Web application there are really two kinds of users: humans and pieces of software (aka Semantic Web agents). As the Semantic Web is an extension of the Web rather than a replacement for it, it is important to distinguish these two kinds of users when talking about certain services, interfaces, etc. Note that human users, given the proper user interfaces, can benefit from RDF data as well as from HTML documents. RDFa produces a hybrid document, whose interface becomes a hybrid of the hypertext and data interfaces.

URI design and more

David Booth has put together a nice piece on URI Declaration Versus Use. There is also the TAG finding regarding the httpRange-14 issue.

Attempting to actually use RDF and (X)HTML representations together, one has to think about who is served (human user or SW agent) and, independently, what is served (e.g. RDF, or XHTML).

The Cool URIs for the Semantic Web Working Draft suggests options for using RDF/XML and (X)HTML in parallel. Some implementations already use one of these designs (cf. DBpedia). However, in the case of RDFa, the Cool URIs WD states:


  The solutions described in the following apply to deployment scenarios in which the RDF data and the HTML data is served  
  separately, such as a standalone RDF/XML document along with an HTML document. The metadata can also be embedded in HTML, using 
  technologies such as RDFa [RDFa Primer], microformats and other documents to which the GRDDL [GRDDL] mechanisms can be applied. 
  In those cases the RDF data is extracted from the returned HTML document.


In the case of using RDFa (or more precisely: XHTML+RDFa) one has to rethink URI design issues. Or, put as a question: Does httpRange-14 apply to the RDFa setup equally?

The httpRange-14 finding that HTTP URIs which return 200 identify documents applies in all these cases. The architectural question which RDFa brings up is about fragment identifiers. RDFa files define both conceptual things (People, Calendar events, etc) as well as traditional hypertext anchors. This requires browsers to be able to display either of these. This raises user interface issues, which (as of early 2008) have not been clearly resolved:

  • If you represent the data as well as the hypertext, will the data be repeated, confusing the user?
  • If you don't represent the data, will a user have access to the full power of data handling?
  • There is a security threat if the user agrees to data that is present in the document but not visible to them.
  • Are there systems which currently assume that a localID within a hypertext document will identify an anchor? How will they cope with RDFa?
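The fragment-identifier question can be made concrete with a small Python sketch. Given a URI with a fragment, it reports whether that fragment names a hypertext anchor (an `id` attribute), an RDFa subject (an `about="#..."` attribute), or both. The page, the example.org URIs, and the function name `classify_fragment` are hypothetical, and the check is deliberately naive.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical XHTML+RDFa page with one anchor and one conceptual thing.
PAGE = """<div xmlns="http://www.w3.org/1999/xhtml">
  <h2 id="contact">Contact</h2>
  <p about="#alice" property="foaf:name">Alice</p>
</div>"""


def classify_fragment(uri, xhtml):
    """Return (document_uri, roles) for the fragment of `uri` within `xhtml`."""
    document_uri, fragment = urldefrag(uri)  # the part actually requested over HTTP
    anchors, things = set(), set()
    for element in ET.fromstring(xhtml).iter():
        if "id" in element.attrib:
            anchors.add(element.attrib["id"])
        about = element.attrib.get("about", "")
        if about.startswith("#"):
            things.add(about[1:])
    roles = []
    if fragment in anchors:
        roles.append("hypertext anchor")
    if fragment in things:
        roles.append("RDFa subject")
    return document_uri, roles


print(classify_fragment("http://example.org/page#contact", PAGE))
print(classify_fragment("http://example.org/page#alice", PAGE))
```

In both cases the client retrieves the same document, http://example.org/page. What differs is how the fragment should be presented: scrolling to an anchor in one case, showing data about a person in the other. A client that assumes every localID is an anchor has nothing to scroll to for `#alice`.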

Further reading

Discussion


Changelog:

  • 2009-02-01: Updated status of RDFa, updated named graphs and reification
  • 2008-02-19: Replying to TimBL's comments
  • 2007-12-23: TimBL comments
  • 2007-12-19: init draft of this page by MichaelHausenblas