An RDF-in-XHTML Proposal

NOTE: a new draft is in progress, which is likely to obsolete this one.

Editors:
Dominique Hazaël-Massieux

Abstract

This document presents a proposal for embedding RDF statements in XHTML through the use of XSLT style sheets.

Introduction

Since RDF became a W3C Recommendation, embedding RDF in XHTML has been sought as a way both to ease the adoption of RDF by a large community and to make the current HTML Web part of the upcoming RDF-based Semantic Web. A separate document lists the known use cases for such an approach and the previous attempts at finding a solution.

The proposal developed in this document relies on XSLT to extract RDF statements from a defined XHTML markup.

Scope

This document addresses only one of the possible methods of embedding RDF in XHTML while keeping XHTML-validity; it does not address the generic question of embedding XML in XHTML, nor of embedding RDF in any XML vocabulary.

Classes of Products

This document creates conformance requirements for 3 classes of products:

RDF-in-XHTML processor
An RDF processor@@@ able to read RDF statements from an XHTML document
XHTML document with embedded RDF
A document conforming to XHTML (any version), with RDF statements embedded
XHTML to RDF converter
A set of rules describing how to convert an XHTML document with embedded RDF into a list of RDF statements

Processing model

An RDF processor trying to extract RDF statements from an XHTML document first checks that the document has a well-known URI as part of its profile attribute. Then, for each XSLT style sheet linked from the <head> element with a specific value of the rel attribute, it applies that style sheet to the XHTML document and adds the output to its store of RDF statements.

@@@ illustration

One doesn't need to know XSLT to embed RDF in one's XHTML document. Typically, an XHTML document author wishing to embed RDF statements in a document would only need to know the specific markup rules for that document, matching those expected by the XSLT style sheets. The XSLT style sheets would be developed by knowledgeable people in a community to embed a specific set of RDF statements for a particular application.

XHTML document with embedded RDF

To be identified as embedding RDF statements, an XHTML document MUST include the URI http://www.w3.org/2003/11/rdf-in-xhtml in the white-space-separated list of URIs in the profile attribute of its head or html element (whichever the specification of that document's XHTML version defines).

Example: <html xmlns="http://www.w3.org/1999/xhtml"> <head profile="http://www.w3.org/2003/11/rdf-in-xhtml">
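The profile requirement above could be checked along the following lines. This is only a sketch, not part of the proposal: the function name has_rdf_profile is hypothetical, and it assumes the document is well-formed XML that Python's standard parser can read (for instance, no undeclared entities such as &nbsp;).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDF_IN_XHTML_PROFILE = "http://www.w3.org/2003/11/rdf-in-xhtml"


def has_rdf_profile(xhtml_text: str) -> bool:
    """Return True if the html or head element lists the profile URI.

    Hypothetical helper: both elements are inspected because the draft
    allows either, depending on the XHTML version of the document.
    """
    root = ET.fromstring(xhtml_text)
    if root.tag != f"{{{XHTML_NS}}}html":
        return False
    candidates = [root]
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is not None:
        candidates.append(head)
    for element in candidates:
        # str.split() with no argument splits on any run of white space,
        # which matches a white-space-separated list of URIs.
        if RDF_IN_XHTML_PROFILE in element.get("profile", "").split():
            return True
    return False
```

The URI is compared as an exact string; whether any URI normalization should apply is not stated in this draft.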

@@@ What to put at the end of this URI? RDDL? XHTML? RDF?

To identify each XSLT style sheet defined to extract RDF statements from its content, an XHTML document MUST link to it using the <link> element, with the href attribute pointing to the style sheet and the rel attribute containing the token xslt2rdf.

Example: <link rel="xslt2rdf" href="http://www.example.org/xhtml2rdf.xsl" />
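A processor might discover the linked style sheets roughly as follows. This is a sketch under stated assumptions: the function name find_xslt2rdf_links is hypothetical, the rel token is matched case-sensitively (the draft does not say), and href values are resolved against the document URI while ignoring any <base> element.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML_NS = "http://www.w3.org/1999/xhtml"


def find_xslt2rdf_links(xhtml_text: str, document_uri: str) -> list:
    """Return absolute URIs of style sheets linked with rel containing xslt2rdf."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    stylesheet_uris = []
    for link in head.findall(f"{{{XHTML_NS}}}link"):
        # rel is a white-space-separated list of tokens; xslt2rdf may be
        # one token among several.
        rel_tokens = link.get("rel", "").split()
        href = link.get("href")
        if "xslt2rdf" in rel_tokens and href:
            stylesheet_uris.append(urljoin(document_uri, href))
    return stylesheet_uris
```

Links are returned in document order, so a processor that cares about the order of transformations can rely on it.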

@@@ Should we version the token used in rel? (linked to XSLT 1.0 vs XSLT *)

XHTML to RDF converter

An XHTML to RDF converter MUST comply with the XSLT 1.0 Recommendation.

Should we restrict to XSLT 1.0? Any XSLT version?

An XHTML to RDF converter MUST have its output method set to XML and its output encoding set to UTF-8 or UTF-16 (@@@ check what are the required encodings for RDF processors), MUST use as the top-level element of its result tree an RDF element in the namespace http://www.w3.org/1999/02/22-rdf-syntax-ns#, and MUST NOT use any extension.

For instance, this converter extracts the title of an XHTML document and advertises it as RDF using the Dublin Core schema:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:html="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" encoding="utf-8"/>
<xsl:template match="/html:html">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- Processing the XHTML input -->
  <rdf:Description rdf:about="">
    <dc:title><xsl:value-of select="html:head/html:title"/></dc:title>
  </rdf:Description>
</rdf:RDF>
</xsl:template>
</xsl:stylesheet>
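Some of the converter requirements above lend themselves to a mechanical check. The sketch below is illustrative only and makes several assumptions: the function names are hypothetical, only the first top-level xsl:output element is inspected, a missing declaration is treated as a problem (a strict reading of "set to"), and the ban on extensions is not checked at all.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ALLOWED_ENCODINGS = {"utf-8", "utf-16"}


def check_output_declaration(stylesheet_text: str) -> list:
    """Return a list of problems found in the style sheet's xsl:output."""
    root = ET.fromstring(stylesheet_text)
    output = root.find(f"{{{XSL_NS}}}output")
    if output is None:
        return ["no xsl:output element"]
    problems = []
    if output.get("method") != "xml":
        problems.append("output method is not xml")
    # Encoding names are compared case-insensitively.
    if output.get("encoding", "").lower() not in ALLOWED_ENCODINGS:
        problems.append("output encoding is not UTF-8 or UTF-16")
    return problems


def result_root_is_rdf(result_text: str) -> bool:
    """Return True if the result document's top-level element is rdf:RDF."""
    return ET.fromstring(result_text).tag == f"{{{RDF_NS}}}RDF"
```

The first function looks at the style sheet itself, the second at the serialized result of a transformation; a validator for converters would probably need both.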

Allow non RDF/XML formats?

Allowing extensions?

Limitations on xsl:import? the document() function?

Need to make further requirements? e.g. parameters to the XSLT? minimal content?

RDF-in-XHTML processor

An RDF-in-XHTML processor MUST add to its store all the valid output resulting from processing the given XHTML document through each of the XSLT style sheets linked from the document in the ways described above.

How to integrate trust/security considerations vs all the XSLT?

Do we need to require a processor to record the XSLTs used to extract RDF (with a specific property)?

What to do with regard to the lower protocol stack? E.g., we may want to have Content-Language and Last-Modified transitive through the transformation, but not Content-Length or Content-MD5 (@@@); what about Content-Location (needed for relative URI resolution, but misleading with regard to content)?

An RDF-in-XHTML processor MUST interpret relative URIs as relative to the URI of the XHTML document.
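As an illustration of this rule, the sketch below resolves the rdf:about values found in a transformation result against the URI of the XHTML document. It is a simplification under stated assumptions: the function name resolve_subjects is hypothetical, only rdf:about is handled, and any xml:base attribute in the output is ignored.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def resolve_subjects(rdf_xml_text: str, xhtml_document_uri: str) -> list:
    """Return the absolute subject URIs named by rdf:about, in document order."""
    about = f"{{{RDF_NS}}}about"
    root = ET.fromstring(rdf_xml_text)
    return [
        # An empty reference resolves to the XHTML document itself.
        urljoin(xhtml_document_uri, element.get(about))
        for element in root.iter()
        if element.get(about) is not None
    ]
```

With this reading, rdf:about="" in a converter's output designates the XHTML document rather than the style sheet or the transformation result.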

When processing the XHTML document through a given XSLT style sheet, if the processing is interrupted by an error, or if the resulting tree is not well-formed RDF/XML [@@@ does such a notion exist?], the RDF-in-XHTML processor MUST discard all the RDF statements that could have been extracted through this style sheet. This does not affect the RDF statements extracted through other transformations.
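The per-style-sheet isolation described above might look like the following sketch. Python's standard library has no XSLT engine, so the transformation is passed in as a callable; apply_xslt and collect_rdf are hypothetical names. "Well-formed RDF/XML" is approximated here as well-formed XML whose top-level element is rdf:RDF, which is an assumption given that the draft itself leaves the notion open.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collect_rdf(
    xhtml_text: str,
    stylesheet_uris: Iterable[str],
    apply_xslt: Callable[[str, str], str],
) -> Dict[str, ET.Element]:
    """Map each successful style sheet URI to the rdf:RDF tree it produced."""
    store = {}
    for uri in stylesheet_uris:
        try:
            # apply_xslt(stylesheet_uri, xhtml_text) returns serialized output.
            result_root = ET.fromstring(apply_xslt(uri, xhtml_text))
        except Exception:
            # A transformation error or malformed output discards everything
            # from this style sheet only; the loop moves on to the next one.
            continue
        if result_root.tag != f"{{{RDF_NS}}}RDF":
            continue
        store[uri] = result_root
    return store
```

Keying the store by style sheet URI is a design choice made for this sketch; it also happens to record which style sheet produced which statements.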

Profile of XSLT processors with regard to error handling?

Conformance

XHTML document with embedded RDF

Conformance to this specification is achieved by respecting all the MUST requirements that it defines.

@@@ conformance claim? Validator/pre-parser?

XHTML to RDF converter

Conformance to this specification is achieved by respecting all the MUST requirements that it defines.

@@@ conformance claim? Validator?

RDF-in-XHTML processor

Conformance to this specification is achieved by respecting all the MUST requirements that it defines.

@@@ conformance claim? Test Suite?

A simple RDF-in-XHTML processor

As a demonstration of this proposal, a simple RDF-in-XHTML processor has been developed in XSLT. It is not entirely conformant, owing to the limitations of error handling in XSLT 1.0 and to the non-conformance of the XSLT service it relies on, but in most cases it should give the expected results on input that does not generate errors.

Security considerations

document function, xsl:import, trust management...

Social meaning considerations

RDF statements embedded using this method bear the same social meaning as if they were asserted on their own. Given that these RDF statements are obtained through several resources (the XHTML document and the XSLT style sheets), an XHTML document author relying on this method should trust the entities controlling the XSLT style sheets as much as those controlling the ontologies and vocabularies the author uses.