Re: Encouraging canonical serializations of datatypes in RDF

On 07/31/2012 03:59 PM, David Booth wrote:
> Hi Peter,
>
> On Tue, 2012-07-31 at 15:36 -0400, Peter F. Patel-Schneider wrote:
>> Hmm.
>>
>> Your two examples have different canonical forms in XML.   I do not believe
>> that going beyond XML canonicalization is a good idea.
> What downside do you see?

If RDF goes beyond XML canonicalization, it is doing something to XML datatypes 
that is not part of the XML specification.   This appears to drive a 
further wedge between RDF and XML data.

[...]

>
>> In any case, I don't see the point here.  If equality-unique canonical forms
>> are only encouraged, then applications will still have to do datatype-aware
>> comparisons.
> Only if they need to handle all possible data serializations.   If 90%
> of the available datasets use the canonical forms then many apps will
> not need to do datatype-aware comparisons, though the ones that need to
> cover 100% will.

If even 99.99% of available datasets use the canonical forms then all apps 
should still be prepared for non-canonical forms.  To do otherwise is to be 
wrong.  That is not to say that being wrong is not useful on occasion, but I 
don't see that there is any good to be had here in the WG suggesting canonical 
forms be used exclusively.
>
> I think it is important to keep the RDF entry barrier as low as possible
> whenever possible, in order to support scruffy apps that are good enough
> for many purposes, even if they don't handle every case.
>
> David
>
It is important that apps should do the right thing.  For example, should apps 
ignore character encoding?  How hard is doing datatype-aware processing of 
literals, compared with all the rest of the stuff that is required to handle RDF?
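
As a rough illustration of how little is involved, here is a minimal Python 
sketch of a datatype-aware comparison for three XSD datatypes.  The names 
literal_value and literals_equal are made up for this example, a literal is 
assumed to be a (lexical form, datatype IRI) pair, and only literals with the 
same datatype IRI are compared, which is a simplification.

```python
import re
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"
XSD_INTEGER = XSD + "integer"
XSD_DECIMAL = XSD + "decimal"
XSD_BOOLEAN = XSD + "boolean"

# Lexical spaces, restricted to ASCII digits as in XML Schema.
INTEGER_RE = re.compile(r"[+-]?[0-9]+")
DECIMAL_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def literal_value(lexical, datatype):
    """Map a lexical form to a value (hypothetical helper for this sketch)."""
    if datatype == XSD_INTEGER:
        if not INTEGER_RE.fullmatch(lexical):
            raise ValueError("ill-formed xsd:integer: %r" % lexical)
        return int(lexical)
    if datatype == XSD_DECIMAL:
        if not DECIMAL_RE.fullmatch(lexical):
            raise ValueError("ill-formed xsd:decimal: %r" % lexical)
        return Decimal(lexical)
    if datatype == XSD_BOOLEAN:
        if lexical in ("true", "1"):
            return True
        if lexical in ("false", "0"):
            return False
        raise ValueError("ill-formed xsd:boolean: %r" % lexical)
    # Datatypes this sketch does not know: fall back to the lexical form.
    return lexical


def literals_equal(a, b):
    """Compare two (lexical, datatype) pairs by value, not by string."""
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    if dt_a != dt_b:
        return False  # simplification: no comparison across datatypes
    return literal_value(lex_a, dt_a) == literal_value(lex_b, dt_b)
```

With this, ("01", XSD_INTEGER) and ("1", XSD_INTEGER) compare equal even 
though their lexical forms differ, as do ("1.0", XSD_DECIMAL) and 
("1.00", XSD_DECIMAL).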

peter

PS:  Yes, I do use text processors to handle RDF, and quite often, even 
analysing the 2011 Billion Triple Challenge triples using sed and grep.   
However, I check to ensure that the right thing happens.
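
One possible form such a check could take, sketched in Python rather than sed 
and grep: scan N-Triples lines for xsd:integer literals whose lexical form is 
not canonical, so that a later plain-text comparison is known to be safe.  
The function name noncanonical_integers is made up for this example, and the 
pattern assumes one triple per line with no escaped quotes inside the literal.

```python
import re

# An xsd:integer typed literal as written in N-Triples.
INTEGER_LITERAL = re.compile(
    r'"([^"\\]*)"\^\^<http://www\.w3\.org/2001/XMLSchema#integer>'
)
# Canonical form: no plus sign, no leading zeros, zero written as "0".
CANONICAL_INTEGER = re.compile(r"0|-?[1-9][0-9]*")


def noncanonical_integers(lines):
    """Yield (line number, lexical form) for each non-canonical integer."""
    for number, line in enumerate(lines, start=1):
        for match in INTEGER_LITERAL.finditer(line):
            lexical = match.group(1)
            if not CANONICAL_INTEGER.fullmatch(lexical):
                yield number, lexical
```

If this yields nothing for a dataset, string matching on its integer literals 
agrees with value comparison; if it yields anything, those lines need 
datatype-aware handling.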

Received on Tuesday, 31 July 2012 20:24:32 UTC