Re: ISSUE-77 (xsd-c14n): XSD canonicalization – 1.0 or 1.1? [R2RML]

Hi Ivan,

Ok, that's interesting and I agree with the general approach, although we have to work out the details.

Let's look at what might happen with XSD, and what it would mean for the Edited Recommendations that we would publish for R2RML and DM.

1. XSD 1.1 goes to REC without relevant changes: Then we merely need to remove the caveats from the Status sections and update references to the REC. This seems like the most likely outcome.

2. XSD 1.1 goes to REC with relevant changes, that is, some canonical forms are changed between CR and REC: Then in the DM we merely need to remove the caveat from the Status section. In R2RML, we'd need to remove the caveat, update refs to the REC, and possibly update informative examples in Section 10.

3. XSD 1.1 doesn't go to REC for whatever reasons. This is unlikely but possible according to the process, so I suppose we need to take that case into consideration. In the DM, again we'd just remove the caveat. In R2RML, we'd need to change all XSD 1.1 references back to XSD 1.0. We'd need to change the informative examples in Section 10, and remove the informative note on partial implementations of infinite datatypes.

So in terms of changes to the documents, this doesn't seem like a problem and would only require informative edits, except for the changed normative reference (from XSD 1.1 CR to XSD 1.1 REC or XSD 1.0 REC).

However, there is a problem because cases 2) and 3) change the conformance criteria. Behaviour that would be ok in an implementation of the current R2RML/DM specs would no longer be ok in the updated specs.

As a concrete example, let's consider the case where a TIMESTAMP is used as a primary key. In the DM, assuming a reference to XSD 1.1 for the canonicalization of TIMESTAMPs, this might result in a row IRI such as:

    <LOG/TSTAMP-2011-11-25T13:38:56+01:00>

However, if the reference has to be changed back to XSD 1.0 then implementations would have to be changed to produce:

    <LOG/TSTAMP-2011-11-25T12:38:56Z>

because XSD 1.0 says all values with timezone offsets are canonicalized to UTC, while XSD 1.1 says that the timezone offset is retained. The same story happens if we use an IRI template in R2RML, say, "LOG/TSTAMP-{TSTAMP}".
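To make the difference concrete, here is a minimal Python sketch of the two behaviours as described above. It is not taken from either XSD spec: the function names (canonical_utc, canonical_keep_offset, row_iri) are made up for illustration, fractional seconds and other edge cases are ignored, and the IRI is left unescaped exactly as in the examples above.

```python
from datetime import datetime, timezone

def canonical_utc(lexical: str) -> str:
    """XSD 1.0 behaviour as described in this message: normalize to UTC."""
    dt = datetime.fromisoformat(lexical)
    if dt.tzinfo is None:
        # No offset given, so there is nothing to normalize.
        return dt.isoformat()
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def canonical_keep_offset(lexical: str) -> str:
    """XSD 1.1 behaviour as described in this message: keep the offset."""
    return datetime.fromisoformat(lexical).isoformat()

def row_iri(template: str, value: str, canonicalize) -> str:
    """Fill a hypothetical 'LOG/TSTAMP-{TSTAMP}' style template."""
    return "<" + template.replace("{TSTAMP}", canonicalize(value)) + ">"

value = "2011-11-25T13:38:56+01:00"
print(row_iri("LOG/TSTAMP-{TSTAMP}", value, canonical_keep_offset))
print(row_iri("LOG/TSTAMP-{TSTAMP}", value, canonical_utc))
```

The two printed IRIs differ for the same database value, which is why a change of reference changes what a conforming implementation has to emit.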

In the DM, because canonicalization is a MUST, this change would break existing implementations.

In R2RML canonicalization is a SHOULD, so this wouldn't break implementations, but merely have the consequence that they no longer follow a SHOULD-level requirement.

It seems to me that the OWL2 group was able to simply declare all the XSD 1.1 dependent features as OPTIONAL and thereby get around the problem of breaking existing implementations. We can't do that because all implementations rely on generating IRIs.

I guess there are a number of ways of dealing with the DM issue:

a) Ignore it and move forward. In the unlikely event that case 2) or 3) occurs, implementers conforming to the original DM REC just have to change their implementations to conform with the Edited Recommendation.

b) Find some way of making both behaviours allowable in the DM. “For XSD canonicalization rules, implementations SHOULD use XSD 1.1 REC if that's available, but MAY also use XSD 1.1 CR or XSD 1.0 REC.” Thus, no matter whether 1) 2) or 3) happens, anyone who conforms now will still be conforming, and hopefully everyone will eventually settle on the latest thing.

c) Drop the MUST-level requirement for canonicalization in the DM altogether and turn it into a SHOULD. This would reduce interoperability between DM implementations, making it more likely that you can't just drop in one implementation for another.
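As a rough illustration of what option b) would mean for a conformance check, here is a small Python sketch. Everything in it is an assumption made for illustration: the profile names, the helper functions, and the idea of checking an emitted lexical form against a set of allowed canonicalizations. It only covers the timezone behaviour discussed above.

```python
from datetime import datetime, timezone

def to_utc_form(lexical: str) -> str:
    # XSD 1.0 behaviour as described above: normalize to UTC.
    dt = datetime.fromisoformat(lexical)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def keep_offset_form(lexical: str) -> str:
    # XSD 1.1 behaviour as described above: retain the offset.
    return datetime.fromisoformat(lexical).isoformat()

# Hypothetical set of canonicalization profiles that option b) would allow.
ALLOWED = {
    "XSD 1.0 REC": to_utc_form,
    "XSD 1.1 CR": keep_offset_form,
}

def conforming_profiles(source_value: str, emitted: str) -> list:
    """Return the allowed profiles under which `emitted` is canonical."""
    return [name for name, canon in ALLOWED.items()
            if canon(source_value) == emitted]

source = "2011-11-25T13:38:56+01:00"
print(conforming_profiles(source, "2011-11-25T12:38:56Z"))
print(conforming_profiles(source, "2011-11-25T13:38:56+01:00"))
```

Under a) only one of the two profiles would be accepted at any given time; under b) an implementation passes as long as the returned list is non-empty.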

Best,
Richard


On 25 Nov 2011, at 08:24, Ivan Herman wrote:

> Richard, all,
> 
> We do have a precedent here. Both the OWL and the RIF WGs hit exactly the same issue (I have not checked lately for SPARQL, and I imagine the RDF WG will hit the same problem, too). Here is what happened in the OWL 2 case (the RIF case is fairly identical).
> 
> - OWL 2 has clearly opted for XSD 1.1. That means the reference in the document _is_ to the XSD 1.1 CR document. Unusual, slight breakage of the rules, but it was necessary.
> - The 'Status' section of the Recommendation includes a subsection on this, see the example below
> - Once the Recommendations were published, the OWL 2 WG went into a 'dormant' state. I.e., it is not formally closed and is maintained in the books, but there is no activity (calls, etc.). The only exception is that error reports are stored and maintained in a file; something that this group will have to plan for when the time comes anyway.
> - The agreement is that once XSD 1.1 is published as a Rec, the Group reconvenes and publishes what we call an Edited Recommendation. That is a Rec that has absolutely no difference in technical content vis-à-vis the original one, only editorial changes (misspellings, that sort of thing). In this specific case the ER of OWL 2 will change the formal reference to the XSD 1.1 Rec, remove that status subsection and fold in any editorial errors that the community may have found. I.e., there would be editorial work to be done when the time comes, but editorial work that is clearly quick and can be done by 1-2 persons.
> 
> The OWL 2 adoption has not suffered from this issue at all; nobody has raised any problems since 2009. My advice would be to adopt the same line of action for R2RML and DM. It would be wrong to keep to 1.0 when other SW standards have made the choice of 1.1.
> 
> Cheers
> 
> Ivan
> 
> Here is the status subsection I was referring to:
> 
> [[[
> XML Schema Datatypes Dependency
> 
> OWL 2 is defined to use datatypes defined in the XML Schema Definition Language (XSD). As of this writing, the latest W3C Recommendation for XSD is version 1.0, with version 1.1 progressing toward Recommendation. OWL 2 has been designed to take advantage of the new datatypes and clearer explanations available in XSD 1.1, but for now those advantages are being partially put on hold. Specifically, until XSD 1.1 becomes a W3C Recommendation, the elements of OWL 2 which are based on it should be considered optional, as detailed in Conformance, section 2.3. Upon the publication of XSD 1.1 as a W3C Recommendation, those elements cease to be optional and are to be considered required as otherwise specified.
> 
> We suggest that for now developers and users follow the XSD 1.1 Candidate Recommendation. Based on discussions between the Schema and OWL Working Groups, we do not expect any implementation changes will be necessary as XSD 1.1 advances to Recommendation.
> ]]] 
> 
> See http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/
> 
> 
> On Nov 25, 2011, at 24:30 , RDB2RDF Working Group Issue Tracker wrote:
> 
>> 
>> ISSUE-77 (xsd-c14n): XSD canonicalization – 1.0 or 1.1? [R2RML]
>> 
>> http://www.w3.org/2001/sw/rdb2rdf/track/issues/77
>> 
>> Raised by: Richard Cyganiak
>> On product: R2RML
>> 
>> So it turns out that XSD canonicalization is actually very different between XSD 1.0 and XSD 1.1. Quite a lot has changed – I don't have the full picture but handling of time zone offsets is different, handling of decimals appears to be different, and who knows what else.
>> 
>> Given that XSD 1.1 is in the CR stage, I don't feel very good about writing spec text that asks R2RML/DM implementers to implement XSD 1.0 canonicalization rules that will soon be obsolete.
>> 
>> On the other hand, given that XSD 1.1 is not yet at REC stage, we can't write spec text that normatively prescribes the use of XSD 1.1 canonicalization rules.
>> 
>> 
>> 
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
> 

Received on Friday, 25 November 2011 13:54:19 UTC