RDF in HTML

The TAG meeting of 2002-04-08 concluded with the following reflection - not consensus.

Requirements

  1. There is a requirement for namespace documents to be human-readable. This allows, for example, an engineer to read a namespace document for a new standard and find a link to a specification that explains how to author content.
  2. There is a requirement for semantic web applications to be able to put arbitrary information about things in a namespace (be they rdf:Properties or, for that matter, any other individual thing). Some forms of semantic web processing also need to be able to pick up that data quickly and in real time, without unnecessary indirections.
  3. There is a requirement for a machine application now to be able to find, for example, XML Schema documents by dereferencing a namespace URI (a function which is provided by the RDDL language).

Observations

  1. XHTML satisfies the first requirement as a W3C Recommendation for human-readable documents.
  2. RDF satisfies the second requirement.
  3. RDF-encoded RDDL information (see schema) satisfies the third requirement.

Conclusion

Problems

Therefore, despite widely adopted specifications for XHTML and RDF, there is no specification for the interpretation of the mixture. The TAG felt that this lack, falling between the scopes of two working groups, was within its scope to fill or ask to be filled.
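
To make "the mixture" concrete, here is a minimal Python sketch of a namespace document that carries both XHTML and embedded RDF, together with code that separates the two parts mechanically. The sample document, the vocabulary term "author", and the function name split_mixture are assumptions made for illustration. The sketch shows only that the two parts can be found, not what the combination means, which is exactly the unspecified part.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical namespace document: XHTML for people, RDF for machines.
NAMESPACE_DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <head>
    <title>Example vocabulary</title>
    <rdf:RDF>
      <rdf:Property rdf:ID="author">
        <rdfs:label>author</rdfs:label>
      </rdf:Property>
    </rdf:RDF>
  </head>
  <body>
    <p>Read the <a href="spec.html">specification</a> to author content.</p>
  </body>
</html>
"""


def split_mixture(xml_text):
    """Return (human-readable title, list of embedded rdf:RDF elements)."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace-uri}localname".
    title = root.findtext(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
    rdf_blocks = list(root.iter(f"{{{RDF_NS}}}RDF"))
    return title, rdf_blocks


title, rdf_blocks = split_mixture(NAMESPACE_DOC)

# Collect the rdf:ID of every property declared in the embedded RDF.
property_ids = [
    prop.get(f"{{{RDF_NS}}}ID")
    for block in rdf_blocks
    for prop in block.iter(f"{{{RDF_NS}}}Property")
]
print(title, property_ids)
```

A hypertext browser would use only the first value and an RDF processor only the second; nothing in either specification says how the two relate.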

A further problem is the question of how to define the meaning of a URIref with a fragment id within such a document. This is the subject of a lot of discussion on www-tag.

Possibilities

We either have to:

  1. Specify the architecture for XML so that the thing referred to by a #idvalue reference to an XML document depends not simply on the MIME type but, for a mixed-namespace document, on the namespace of the identified element; or
  2. Change XHTML and its MIME type to know about embedded RDF and specify its meaning; or
  3. Specify the architecture so that the semantic web languages always use #idvalue to refer to the abstract thing described by a bit of XML, while hypertext languages always mean the bit of the document; or
  4. Just don't mix HTML and RDF, as it will always be confusing to have two parts of the meaning of a document.
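
As a rough illustration of possibility 1, the following Python sketch looks up the element identified by a fragment id in a mixed-namespace document and lets the namespace of that element decide what kind of thing the reference denotes. The lookup table REFERENT_BY_NAMESPACE, the sample document, and the function names are hypothetical; treating both the XHTML id attribute and rdf:ID as identifiers is also an assumption of this sketch, not something any specification states.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical table: the kind of referent each namespace assigns to "#id".
REFERENT_BY_NAMESPACE = {
    XHTML_NS: "part of the document",
    RDF_NS: "abstract thing described",
}

# Hypothetical mixed-namespace document with one id in each language.
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <head>
    <title>Example vocabulary</title>
    <rdf:RDF>
      <rdf:Property rdf:ID="author"/>
    </rdf:RDF>
  </head>
  <body>
    <h2 id="intro">Introduction</h2>
  </body>
</html>
"""


def namespace_of(element):
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return ""


def referent_kind(xml_text, fragment_id):
    """Return the kind of thing '#fragment_id' denotes, or None if not found."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        ids = (element.get("id"), element.get(f"{{{RDF_NS}}}ID"))
        if fragment_id in ids:
            return REFERENT_BY_NAMESPACE.get(namespace_of(element), "unspecified")
    return None


print(referent_kind(DOC, "intro"))
print(referent_kind(DOC, "author"))
```

Under this reading, generic XML tooling can no longer treat every fragment id as a pointer to a piece of markup without first consulting such a table.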

My gut feeling is to go for 3. I think option 1 means that you can't use fragids to point to a generic bit of XML when just doing XML text processing. Solution 2 doesn't solve the general problem, and will need n^2 fixes for n languages. Solution 3 has the problem that the same URIref is associated with two different levels of meaning in different contexts, which on the face of it violates the rule that the same URI always refers to the same thing. Actually it doesn't: you just say that both refer to the bit of document, but that there is an implicit dereference operation in every use of a URIref in a semantic web language. This is, I think, normal: for example, a graphic language which refers to a circle by URIref does refer to the circle, not the bit of XML.

This, however, means that, for example, EARL (an RDF language for talking about accessibility tests) cannot use RDF to refer to pieces of an RDF document as XML.

Proposal

While one could imagine more complex specifications for processing the mixture, here is a very straightforward one:

If this seemed on the face of it to be acceptable to the community, the TAG would then encourage:


Last change $Id: htmlrdf.html,v 1.9 2002/04/17 17:05:44 timbl Exp $

Tim Berners-Lee