RDF in XHTML Task Force -- 20 Mar 2008

Action Review

ACTION: Ben to update RDFa schedule to include CR [recorded in http://www.w3.org/2008/03/13-rdfa-minutes.html#action01] [DONE]

ACTION: Manu to enable EARL output in RDFa Test Harness [recorded in http://www.w3.org/2008/03/13-rdfa-minutes.html#action13] [CONTINUES]

ACTION: Michael to create 'RDFa for uF users' on RDFa Wiki [recorded in http://www.w3.org/2008/03/13-rdfa-minutes.html#action12] [CONTINUES]

ACTION: Ben followup with Fabien on getting his RDFa GRDDL transform transferred to W3C [recorded in http://www.w3.org/2007/11/15-rdfa-minutes.html#action01] [CONTINUES]

ACTION: Ben to respond to issue 87 [recorded in http://www.w3.org/2008/02/28-rdfa-minutes.html#action09] [CONTINUES]

ACTION: Manu write a response to Christian Hoertnagl for issue 7 [recorded in http://www.w3.org/2008/02/21-rdfa-minutes.html#action09] [CONTINUES]

Manu: I have a draft response

ACTION: Mark/Shane include issue 89 correction in Changes section [recorded in http://www.w3.org/2008/03/06-rdfa-minutes.html#action11] [CONTINUES]

Shane: I forgot to update the Changes section in the editors' draft after making the other correction

Media Type / Self-Describing Web

-> The Self-Describing Web [Steven 2008-03-19]

Steven: we'd have to re-issue a spec
... the current media type spec says we control it
... normally media type specifications are issued via IETF as an RFC
... so normally a new RFC is issued
... the change would be something like "the document may include RDFa"
... I think this is almost daft; I don't see why it's necessary
... you can scrape loads of metadata out of existing documents

<ShaneM> The current media type document explicitly includes XHTML family markup languages

Ben: we haven't changed anything about the media, have we?
... did we change something when adding GRDDL?

Steven: if someone uses our module in another language, are they also going to have to change their media type registration?

Ben: the main argument seems to be that HTML didn't update when GRDDL became a REC
... the claim will be that GRDDL introduced no new attributes

Shane: our media type application/xhtml+xml is specifically meant to be extended

<Steven> rfc2854.txt

<Steven> rfc3236.txt

Ralph: is there enough indirection from the mime type registration to the XHTML specification at W3C such that when W3C changes XHTML definition we don't have to change the mime registration?

Shane: we don't need to do anything

Ralph: can we update an XML Schema document for XHTML?

Shane: there isn't one
... CR hasn't been approved

Ralph: is that a process issue?

<msporny> +1 - in agreement with being able to create new XML documents using XHTML modules

Ben: given XHTML modularization, one can cobble together a bunch of modules into a new schema

<msporny> +1 for not updating existing media-type registrations.

Ben: that new schema does not require a new media type

Ralph: we're not being asked to change the media type, but to update the registration

Shane: Ben is exactly right

<Steven> "With respect to XHTML Modularization [XHTMLMOD] and the existence

<Steven> of XHTML based languages (referred to as XHTML family members)

<Steven> that are not XHTML 1.0 conformant languages, it is possible that

<Steven> 'application/xhtml+xml' may be used to describe some of these

<Steven> documents. However, it should suffice for now for the purposes of

<Steven> interoperability that user agents accepting

<Steven> 'application/xhtml+xml' content use the user agent conformance

<Steven> rules in [XHTML1]."

Mark: if TAG would be happy with a link to a GRDDL transform
... isn't this apples and pears?

<ShaneM> I think that the issue is that the TAG wants a way to know that a document contains extractable meta data

<ShaneM> @version="XHTML+RDFa 1.0"

<msporny> Shane, is XHTML2 going to do that?

<ShaneM> XHTML2 is @version="XHTML2 1.0"

<ShaneM> XHTML+RDFa is the other.

<msporny> So, are you saying that we do @version="XHTML+RDFa 1.0" in the current RDFa Syntax Document?

<ShaneM> yes

<msporny> are you saying that we do that in addition to @profile and the DTD type?

<msporny> or in place of?

<ShaneM> oh I don't mind. I was just pointing out that we have this declaration mechanism too. See http://www.w3.org/TR/rdfa-syntax/#a_DTD_driver
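
[Illustration: a minimal sketch of how a consumer might read the @version declaration discussed above. The helper name declared_version and the sample document are hypothetical, and the sketch assumes the input is well-formed XML.]

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def declared_version(document):
    """Return the @version value on the root element, or None if absent."""
    root = ET.fromstring(document)
    return root.get("version")


# Hypothetical sample document carrying the declaration quoted in the minutes.
doc = (
    '<html xmlns="%s" version="XHTML+RDFa 1.0">'
    "<head><title>t</title></head><body/></html>" % XHTML_NS
)
print(declared_version(doc))
```

Such a check only reports the declared language; it does not say whether the document actually yields any triples.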

Ben: do we agree the TAG is wrong on the need to update the media type?

<ShaneM> The declaration does not help address the real TAG issue, which is that they want to know when a resource contains triples.

Ralph: I abstain
... what's the status of the XHTML1 Schema document?

Steven: we had to combine several documents and were forced to go through Last Call again
... we're trying to complete that Last Call
... and discussing whether we have to have a CR
... as XHTML modularization is a methodology
... there's no processor to write; the processors are schema or dtd processors

<ShaneM> XHTML Modularization 1.1 is the spec that is held up

Ralph: so the schema document we would update to add a transform URI is held up in this process?

Steven: yes

<ShaneM> Note that I don't think this schema technique is valid. It would mean that EVERY XHTML document generated triples. We don't mean that.

ACTION: Ben to follow up on media type discussion with Steven, Ralph, and TAG [recorded in http://www.w3.org/2008/03/20-rdfa-minutes.html#action08]

Test Cases

Manu: we need to resolve XMLLiteral first; all the new tests depend on that

XML Literal

-> issue 97

Ben: I agree with Mark that the abstract RDF graph contains XML literals in canonical form
... however, in practice a parser always returns some serialization of RDF
... so even if there were a parser that always stored a canonical representation internally, it would still be permissible for that internal representation to be re-serialized in non-canonical form
... so if the output of an RDF parser is a valid serialization, it doesn't matter whether the canonicalization was actually done in an internal step

Mark: yes, and Ivan gave the example of an RDF serialization that uses ' instead of "
... it's valid serialization but is it a true representation of the RDF graph?
... moving the problem elsewhere
... RDFa is defined in terms of RDF, not in terms of a serialization of RDF
... we have to describe the result of parsing and this is clearly defined
... the result is a canonicalized string
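
[Illustration: a small sketch of the quoting point raised above, showing that two serializations that differ only in attribute quoting canonicalize to the same string. It assumes Python 3.8 or later, whose xml.etree.ElementTree.canonicalize implements C14N 2.0 rather than the exclusive canonicalization referenced for XML literals, so it only approximates the behaviour under discussion. The sample markup is invented.]

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same content, differing only in quote style.
single = "<p class='note'>Title</p>"
double = '<p class="note">Title</p>'

# The serialized strings differ...
assert single != double

# ...but their canonical forms are identical.
assert canonicalize(single) == canonicalize(double)

print(canonicalize(single))
```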

Ben: only in the internal store
... when the result is output, the canonicalization does not apply
... the spec can say that in the abstract graph the result is canonicalized

Mark: but Ivan also notes that you need to preserve the namespaces
... and preserving the namespaces is exactly what canonicalization does
... it would be wrong to drop the namespaces from the XML output

Ben: yes, it has to be a serialization, not the canonical one

Mark: at the start of the thread I said we could dump the namespaces on the toplevel element
... and people objected

Ben: maybe there was mis-communication

Mark: the 'apex' node; the [outermost] element in the fragment
... if there's a DIV there, we're all set
... but if there's a text node then you have to scan inside it

Ben: so we agree that the only thing we need to do is put the namespaces somewhere so that the result yields the same canonicalization

Mark: yes
... but then Ivan added that this is more than he wants to have to do in his parser
... there's a lot of processing that has to be done
... so I propose that either we do the analysis on the content or drop the whole idea

Shane: is there a belief that the RDF that is emitted by an RDFa compatible parser needs to be isomorphic to itself in some way?
... do I need to be able to turn the triples back into RDFa in some way?

Ben: not in that way
... if someone goes from RDFa to RDF and then wants to go back to RDFa, though not necessarily to the same markup (presentation information will be lost)
... it should be possible in some way to do this
... if you've parsed triples out of an RDFa document you should be able to reserialize in RDFa

Ralph: I believe Ivan is right, the answer to Johannes is "yes, the xmlns should have been included in that test case."

<ShaneM> introduce a wrapper element with no semantics to carry the namespace declarations

<ShaneM> xh:wrapper xmlns:foo= xmlns:bar= ....

Manu: are we going to say that we do need to preserve the namespaces?

Ralph: I do absolutely think we need to preserve the namespaces

<benadida> <div xmlns:svg="..."><span property="dc:title"><p>Title</p><svg:x>...</svg:x></span></div>

Ben: consider the example above; the namespace is declared outside the title markup
... if we don't follow something like the XML literal rules we might lose the svg namespace
... yes, it's complicated but it's the right thing

Mark: the xml literal starts with the SPAN
... this would make it clear that there's always an apex node
... but as Shane notes, you don't know what's going to receive your RDF data

Mark: the good news is that we do have all the namespaces in our processing

Mark: the RDF Concepts spec could be read as "an XML Literal is something that could be canonicalized", not "something that is canonicalized"
... so you just have to make sure you don't lose anything
... just somehow have to find an apex node

Ben: a wild idea ... suppose there is no apex node; there are a few sibling pseudo-apex nodes

Ben: if it's easy enough to put the namespaces on one apex node, let's just put them on each of the sibling nodes
... that would only leave xml:lang to deal with
... so a warning that XMLLiterals without an apex node might cause xml:lang to be lost

Mark: taking that line of modification, could take the DIV and call it the apex node by dropping @property

Ben: the resulting asymmetry between XML literal and plain literal bothers me
... you couldn't have an apex node that is different from the actual markup

Manu: I don't necessarily agree with the approach of using the apex node
... when we say XML Literal, do we really mean XML Literal?
... this is asking a lot of authors

Ben: it's an XML literal, just if there's no apex node we can't carry xml:lang to the toplevel text nodes. We can carry xml:lang and namespaces to the sibling elements

Mark: this point is also made in the RDF Concepts document
... typed literals do not have a language
... so RDF Concepts notes you have to explicitly add an apex node to preserve xml:lang

Ben: so we just stuff xml:lang into as many toplevel elements as we can

Shane: you also have to push the default xmlns

Mark: you have to push all the currently in-scope namespace mappings, since you don't know which might be used

Manu: we're going to include the xmlns in all top-level elements of the literal?

Ben: yes

Manu: I agree with this. It's just a big pain to implement.

Mark: this can be done with string parsing; it doesn't require an XML parser

[Ralph departs]

Shane: about this solution, let's say a child element already has the xmlns declaration. We have to check whether it's there.

Ben: I don't think this changes the spec

ACTION: Manu to add test cases for xmlliterals with namespace preservation, including one where the xmlliteral re-declares one of the namespaces [recorded in http://www.w3.org/2008/03/20-rdfa-minutes.html#action09]

PROPOSE that we resolve "the RDFa syntax does not need any change with respect to XMLLiterals. We note that the easy way to generate a valid serialization of the XMLLiteral is to dump the namespaces and xml:lang into all top-level elements of the xml literal, including xmlns, watching out for redeclarations."

<Steven> +1

RESOLVED: the RDFa syntax does not need any change with respect to XMLLiterals. We note that the easy way to generate a valid serialization of the XMLLiteral is to dump the namespaces and xml:lang into all top-level elements of the xml literal, including xmlns, watching out for redeclarations.

Note that this RESOLVES ISSUE-97
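
[Illustration: a rough sketch of the string-based approach described in the resolution, pushing the in-scope namespaces and xml:lang onto every top-level element of the literal and skipping any namespace the element re-declares. The function name push_namespaces and the example URIs are hypothetical. The sketch assumes a well-formed fragment with no comments, CDATA sections or processing instructions, no '>' inside attribute values, and namespace URIs that need no escaping.]

```python
import re

# Matches a start tag, end tag, or empty-element tag under the assumptions above.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)([^<>]*?)(/?)>")


def push_namespaces(fragment, in_scope, lang=None):
    """Add in-scope xmlns declarations (and xml:lang) to each top-level element.

    in_scope maps prefix -> URI; the empty prefix "" stands for the default xmlns.
    """
    out = []
    depth = 0
    pos = 0
    for m in TAG.finditer(fragment):
        out.append(fragment[pos:m.start()])  # text between tags is copied as-is
        pos = m.end()
        closing, name, attrs, empty = m.groups()
        if closing:
            depth -= 1
            out.append(m.group(0))
            continue
        if depth == 0:
            extra = []
            for prefix, uri in sorted(in_scope.items()):
                attr = "xmlns:" + prefix if prefix else "xmlns"
                # Watch out for redeclarations: keep the element's own mapping.
                if not re.search(r"\s" + re.escape(attr) + r"\s*=", attrs):
                    extra.append(' %s="%s"' % (attr, uri))
            if lang is not None and not re.search(r"\sxml:lang\s*=", attrs):
                extra.append(' xml:lang="%s"' % lang)
            out.append("<" + name + "".join(extra) + attrs + empty + ">")
        else:
            out.append(m.group(0))
        if not empty:
            depth += 1
    out.append(fragment[pos:])
    return "".join(out)


literal = "<p>Title</p><svg:x>x</svg:x>"
print(push_namespaces(literal, {"svg": "http://example.org/svg"}))
```

Top-level text nodes are left untouched here, which matches the warning in the discussion that xml:lang can be lost when the literal has no apex node.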

<ShaneM> does anyone object to @typeof?

We'll discuss @typeof on the mailing list

ADJOURNED
