Comments on new datatyping document, part 1

These comments are with reference to 
http://www-nrc.nokia.com/sw/rdf-datatyping.html, as modified by 
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Sep/0040.html, point 
1.  (I can't tell for sure if the other changes proposed there will affect 
my comments below, so I'll make them anyway.)

I've focused my review on sections 2.1, 2.2, 2.3, 3.1, 3.2 and 4, since 
they seem to be the substantive core that we've agreed to adopt.

My most serious comment concerns section 3.1 (and, by association, 3.2).


2.1 rdfs:Datatype - OK


2.2 Datatype Mapping - OK, I think.

Is it necessary to indicate that the XML flag and language tag are a 
significant part of the literal value mapping?  For example consider whether:
   < <xsd:integer>"25"   , 25 >
   < <xsd:integer>"25"-en, 25 >
are distinct members of a datatype mapping.  Similarly, are the following 
distinct?
   < <xsd:integer>"25"   , 25 >
   < <xsd:integer>xml"25", 25 >
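
To make the question concrete, here is a minimal sketch of the two readings.
The Literal class and both helper functions are hypothetical names of mine,
not anything defined in the datatyping document, and int() merely stands in
for the xsd:integer lexical-to-value mapping.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical model of a literal: lexical form, language tag, XML flag."""
    lexical_form: str
    language: Optional[str] = None
    xml: bool = False


def pairs_over_whole_literal(literals):
    # Reading A: the whole literal (form, tag, flag) is the first component.
    return {(lit, int(lit.lexical_form)) for lit in literals}


def pairs_over_lexical_form(literals):
    # Reading B: only the lexical form takes part in the mapping.
    return {(lit.lexical_form, int(lit.lexical_form)) for lit in literals}


literals = [Literal("25"), Literal("25", language="en"), Literal("25", xml=True)]

whole = pairs_over_whole_literal(literals)
lexical_only = pairs_over_lexical_form(literals)

print(len(whole))         # number of distinct pairs under reading A
print(len(lexical_only))  # number of distinct pairs under reading B
```

Under reading A the three literals give three distinct members of the
mapping; under reading B they collapse to one.  The document should say
which is intended.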


2.3 Typed Literal

I think the discussion of "validity" belongs here with the definition of a 
typed literal, rather than in section 3 on designation in RDF.  Rather 
than defining the concept of "validity", it may be simpler to say 
that the lexical form part of a typed literal MUST be in the lexical space 
of the identified datatype.


3. Designation of Typed Literals in RDF

I think this is just about how to represent a typed literal, so I don't 
think the discussion of validity should be part of this (see above).

I think the N-triples syntax examples should include a case corresponding 
to the XML flag being set.


3.1 Global Datatyping Assertions

I have potentially serious concerns with this section.  I think it could be 
dropped without harm to the basic structure of local typing.

Part of my concern is that I understand there to be a desire to deal with 
the issue of literal datatyping separately from the issue of tidiness of 
untyped literals.  I think this section defeats such separation.

The basis of my concern, reinforced by the 1st paragraph of this section, 
is that it *could* lead to non-monotonicity depending upon a decision about 
tidiness of untyped literals.  The 1st para says:
[[
It is often convenient to associate a datatype with a property, so that 
every use of the property can be understood as asserting a particular 
datatype for every value.
]]
which I think is re-introducing the very aspect of datatyping that was 
previously causing us such grief.  The trouble arises if we suppose that 
untyped literals are tidy, so that entailments like [E2] below hold.

The second paragraph of this section reads:
[[
RDF Datatyping employs rdfs:range to associate a datatype class with a 
particular property. The associated datatype may be taken to to constrain 
all values of the property to correspond to members of the value space of 
the designated datatype, and according to the characteristics of RDF 
datatypes thereby also constrain all lexical forms to members of the 
lexical space of the datatype.
]]
The problem, I think, is with the clause beginning "and according to ...".

It's not clear exactly what this means, but the interpretation that I think 
is intended is something like this:

    Jenny age "10" .
    age rdfs:range xsd:integer .
|=
    Jenny age <xsd:integer>"10" .
[E1]

Assuming this, let us now suppose that a tidy interpretation of untyped 
literals is subsequently agreed, then:

    Jenny age "10" .
    AFilm title "10" .
|=
    Jenny age _:x .
    AFilm title _:x .
[E2]

Then we must allow:

    Jenny age "10" .
    AFilm title "10" .
    age rdfs:range xsd:integer .
|=
    Jenny age _:x .
    AFilm title _:x .

hence, from [E1]:

    Jenny age "10" .
    AFilm title "10" .
    age rdfs:range xsd:integer .
|=
    Jenny age _:x .
    AFilm title _:x .
    Jenny age <xsd:integer>"10" .

At this point, my (limited) powers of formal deduction have deserted me, 
but it seems to me that there's a clear problem:  the range constraint on 
age means that there can be no value of Jenny's age that is not a value of 
xsd:integer.  Since tidiness makes both occurrences of "10" denote the same 
thing (the _:x above), there can then be no value of the film's title that 
is not an integer either, which does not seem to be what was intended.
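
The following toy sketch shows the effect I'm worried about.  It is not the
model theory, just an illustration under my own assumptions: read_literals
and its tidy switch are hypothetical, [E1] is modelled by applying the
range datatype's lexical-to-value mapping, and tidiness is modelled by
forcing every occurrence of the same lexical form to share one denotation.

```python
def l2v_integer(lexical_form):
    # Stand-in for the xsd:integer lexical-to-value mapping.
    return int(lexical_form)


L2V = {"xsd:integer": l2v_integer}

triples = [("Jenny", "age", "10"), ("AFilm", "title", "10")]
ranges = {"age": "xsd:integer"}


def read_literals(triples, ranges, tidy):
    """Return the denotation assumed for each (subject, property) literal."""
    # [E1]: a range assertion gives the literal a typed reading.
    typed_reading = {}
    for _, prop, lex in triples:
        if prop in ranges:
            typed_reading[lex] = L2V[ranges[prop]](lex)

    result = {}
    for subj, prop, lex in triples:
        if prop in ranges:
            result[(subj, prop)] = typed_reading[lex]
        elif tidy and lex in typed_reading:
            # Tidy: same lexical form, same denotation everywhere.
            result[(subj, prop)] = typed_reading[lex]
        else:
            # Untidy: this occurrence keeps its own (string) reading.
            result[(subj, prop)] = lex
    return result


untidy = read_literals(triples, ranges, tidy=False)
tidy = read_literals(triples, ranges, tidy=True)

print(untidy[("AFilm", "title")])  # the title under the untidy reading
print(tidy[("AFilm", "title")])    # the title under the tidy reading
```

With tidy=False the film's title stays the string "10"; with tidy=True it
is dragged to the integer 10 by a range assertion on an unrelated property.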

If we allow untidy untyped literals (i.e. don't support entailment [E2] 
above) I think this problem goes away.  I would not object to such an 
outcome, but I don't want us to arrive there in ignorance.

Hence, to keep the issue of tidiness separate from datatyping, I think we 
need to drop section 3.1, or at least be very much clearer about what is 
meant by this with respect to the entailment [E1].


3.2  Datatype Clashes

I think this section is closely related to 3.1, and cannot comment further 
until 3.1 is adequately clarified.  If section 3.1 is dropped, then I think 
this section too should be dropped.


4. RDF Datatyping Model Theory

I think there's a problem with statement (1):

   ICEXT(I(ddd)) = {x : <x,y> in IEXT(I(ddd))}

This condition is expressed in terms of IEXT(I(ddd)), but I don't see the 
earlier sections describing a datatype URI as denoting a datatype mapping 
relation.  I think the intent here can be obtained by saying:

   ICEXT(I(ddd)) = { x : EXISTS(y) and x = L2V(I(ddd))(y) }
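
As a rough illustration of what I mean, the class extension is just the
image of the lexical space under the lexical-to-value mapping.  The names
below are hypothetical, int() stands in for the xsd:integer mapping, and a
small finite sample stands in for the (infinite) lexical space.

```python
def l2v_integer(lexical_form):
    # Stand-in for L2V(I(xsd:integer)).
    return int(lexical_form)


# A finite sample of the lexical space, for illustration only.
lexical_space_sample = ["0", "7", "007", "25"]


def class_extension(l2v, lexical_forms):
    # { x : there is some y with x = L2V(y) }
    return {l2v(y) for y in lexical_forms}


icext_sample = class_extension(l2v_integer, lexical_space_sample)
print(sorted(icext_sample))
```

Note that "7" and "007" contribute the same value, so the class extension
is a set of values rather than of lexical forms.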

...

#g



-------------------
Graham Klyne
<GK@NineByNine.org>

Received on Monday, 9 September 2002 12:12:03 UTC