Clarifying what datatype entailment support means (Re: xsd:float and xsd:decimal)

[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com]


----- Original Message -----
From: "ext Jan Grant" <Jan.Grant@bristol.ac.uk>
To: "Brian McBride" <bwm@hplb.hpl.hp.com>
Cc: "Patrick Stickler" <patrick.stickler@nokia.com>; "ext Jeremy Carroll" <jjc@hpl.hp.com>; "w3c-rdfcore-wg" <w3c-rdfcore-wg@w3.org>
Sent: 28 November, 2002 17:31
Subject: Re: xsd:float and xsd:decimal


> On Thu, 28 Nov 2002, Brian McBride wrote:
>
> > >I'm fine with this as long as it is clear (somewhere) that
> > >datatype entailments involving equality of values between
> > >different datatypes are based on the definitions of the
> > >datatypes themselves, and if the relationships between
> > >the datatypes are not part of the formal definitions of
> > >the datatypes, then the entailments cannot be determined.
> > >
> > >I.e. we need to be clear about the basis for the entailments
> > >and not work solely on the basis of human intuition.
> >
> > Patrick,
> >
> > May I test my understanding of what you mean here.  I offer two datatype
> > definitions and an entailment.
>
> Patrick's concerns come down to this, I think: what _machinery_ is
> mandated for determining a RDF-D entailment?

Yes, this is the crux of my concerns.

> Do we say, we expect
> reasoners to only copy when given subClassOf relationships between
> datatypes?
>
> That is a limited form of reasoning and would not catch other
> entailments such as the one you give in your example.
>
> I think the definition I proposed for the DT entailment test cases, that
>
> "supporting datatypes X, Y, Z, ..."
>
> means
>
> "For two DT literals, each with a type from {X, Y, Z, ...},
> you can determine whether they denote the same value"
>
> (determining the value itself is not required)
>
> is a reasonable one.

OK, if that is explicitly defined for all of the datatype entailment
tests, then I think that is OK insofar as the tests are concerned.

Though there still is the issue of implied expectations from applications
that provide some form of datatyping support, and I'm concerned that the
above definition will be percieved as the expected minimal level of
datatyping support for all RDF applications that "do" datatyping, which
IMO is not a reasonable *minimal* level of expected DT support.

I see three meaningful levels of datatyping support:

Level 1: Minimal (each datatype dealt with in isolation)

   For two DT literals of the same DT, where the DT is supported,
   it can be determined whether they denote the same value, and if
   the DT is ordered, what their ordering relationship is.

Level 2: Segmented (relations between some datatypes supported)

   For two DT literals of different datatypes where both datatypes
   are supported and their is a known intersection of the value
   spaces of the two datatypes, it can be determined whether
   they denote the same value, and if the DTs are like ordered, what
   their ordering relationship is.

Level 3: Full (relations between all supported datatypes supported)

   (same as level 2, but all possible valid relations between all
    supported datatypes are supported)

An example of where one may support individual datatypes but not
(potentially valid) relationships between other datatypes could
be an engine that provides support for both XML Schema datatypes
and UAProf datatypes, where in practice, these are used in a disjunct
fashion, therefore, no relationship is defined between e.g. xsd:integer
and uap:Number (even though there could be).

Now, later, the engine could provide support for comparing
xsd:integer values and uap:Number values, but providing support
for *both* those datatypes does not necessarily mean it provides
support (or should provide support) for comparision *between* those
two datatypes.

I.e., an application providing minimal datatyping support for
two datatypes which theoretically are related is not deviant or
broken, etc. if it does not support comparisons between the
two datatypes.

And one cannot presume some ad-hoc equality between datatype values
just based on the native internal representation used (eg. both are stored
as Java double values) because that is no guaruntee that they are the
same value -- e.g. one can store xsd:boolean values and xsd:integer
values internally as Java double values , but that does not mean
that "1"^^xsd:boolean == "1"^^xsd:integer, eh?

We cannot, and should not, say that applications are required, or
even expected, to support datatype entailments at any level, much
less at the full level (level 3 above).

*BUT* we should be clear about the level of support presumed when
we are defining datatype entailment tests in the test cases, *AND*
be clear that the specified level of support is a requirement for
the test case, but not for every RDF application dealing with
datatypes. I think that the kernel of this is there in Jan's proposed
verbage, but simply needs to finessed a bit.

I suggest we adopt and utilize something akin to the above three
levels of datatyping support, in a non-normative but consistent
fashion, and qualify our datatyping test cases clearly in those
terms.

Regards,

Patrick

Received on Friday, 29 November 2002 04:36:11 UTC