3243 – Definition of anySimpleType

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Summary: Definition of anySimpleType

Status:	RESOLVED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Datatypes: XSD Part 2
Version:	1.1 only
Hardware:	PC Windows XP

Importance:	P1 major
Target Milestone:	---
Assignee:	C. M. Sperberg-McQueen
QA Contact:	XML Schema comments list

Whiteboard:	cluster: effability
Keywords:	resolved

Reported:	2006-05-09 10:40 UTC by Michael Kay
Modified:	2008-05-23 19:43 UTC
CC List:	0 users

Description Michael Kay 2006-05-09 10:40:56 UTC

QT approved comment:

In 3.2.1, the definition of the value space of anySimpleType suffers from the fact that there are values here that have no lexical representation, for example the list consisting of two strings each comprising a single space. In addition, it's not clear whether a list containing a single item is the same value as that item. I don't know if these distinctions have any practical consequences but it's important to make the foundations as solid as possible.
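
Both points can be illustrated with a minimal sketch. It assumes a list-of-strings type whose lexical mapping simply splits the literal on XML white space. The name `list_lexical_mapping` is hypothetical and is not taken from any specification.

```python
import re
from typing import List

# XML white space characters: space, tab, line feed, carriage return.
XML_WS = re.compile(r"[ \t\n\r]+")


def list_lexical_mapping(literal: str) -> List[str]:
    """Hypothetical lexical mapping for a list of strings:
    split the literal on XML white space and drop empty tokens."""
    return [token for token in XML_WS.split(literal) if token]


# No literal can yield an item that contains white space, so the value
# [" ", " "] is never produced, whatever the input literal is.
for literal in ["", " ", "   ", "a b", " a  b ", "\t \t"]:
    items = list_lexical_mapping(literal)
    assert all(not XML_WS.search(item) for item in items)

# A one-item list and the bare item are different objects in this model;
# whether the datatype definition treats them as the same value is
# exactly the question raised above.
singleton = list_lexical_mapping("a")
assert singleton == ["a"] and singleton != "a"
```

Under this assumed mapping, any literal consisting only of spaces maps to the empty list, which is why the two-space-strings list has no lexical representation.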

Comment 1 C. M. Sperberg-McQueen 2006-10-09 18:14:58 UTC

The same error occurs in section 2.1, although not elsewhere, as
far as I can tell by examining sentences containing the word
'mapping'.

A possible solution is: (1) in section 2.1, replace 

    [Definition:] In this specification, a datatype has three
    properties:

      ...

      * A small collection of functions, relations, and
        procedures associated with the datatype.  Included are
        equality and order relations on the ·value space·, and a
        ·lexical mapping·, which is a function on the ·lexical
        space· onto the ·value space·.

with

    [Definition:] In this specification, a datatype has three
    properties:

      ...

      * A small collection of functions, relations, and
        procedures associated with the datatype.  Included are
        equality and order relations on the ·value space·, and a
        ·lexical mapping·, which is a mapping from the ·lexical
        space· onto the ·value space·.

And (2) in section 2.3, replace the first paragraph

    [Definition:] The lexical mapping for a datatype is a
    prescribed function whose domain is a prescribed set of
    character strings (the ·lexical space·) and whose range is
    the ·value space· of that datatype.

with (draft A):

    [Definition:] The lexical mapping for a datatype is a
    prescribed relation whose domain is a prescribed set of
    character strings (the ·lexical space·) and whose range is
    the ·value space· of that datatype.

        Note: For each primitive datatype defined here, the
        lexical mapping is a total function from the lexical
        space onto the value space of the datatype.  

        For unions, the lexical mapping may or may not be a
        function; when a lexical representation maps to more than
        one value in the union, the choice of which value to use
        for further processing is determined by the order of the
        members of the union, or (when performing schema-validity
        assessment as defined in [XML Schema Part 1: Structures])
        an xsi:type attribute in the document instance.

        For the special datatypes, the lexical mapping is not a
        function, since the same character sequence may map to
        several values. When a lexical representation maps to
        more than one value, the choice of which value to use for
        further processing is not determined by this
        specification; when performing schema-validity assessment
        as defined in [XML Schema Part 1: Structures], the value
        to be used may be determined by rules given there.  In
        the case of anySimpleType, alone among datatypes defined
        here, the lexical mapping is also not a function onto the
        value space, because there are values with no lexical
        representation.

or (draft B):

    [Definition:] The lexical mapping for a datatype is a
    prescribed relation whose domain is a prescribed set of
    character strings (the ·lexical space·) and whose range is
    the ·value space· of that datatype.

        Note: For each primitive datatype defined here, the
        lexical mapping is a total function from the lexical
        space onto the value space of the datatype.  

        For unions and special datatypes, the lexical mapping is
        not a function, since the same character sequence may map
        to several values.  When a lexical representation maps to
        more than one value, the choice of which value to use for
        further processing may be determined by rules given
        elsewhere (for unions, by the order of the members of the
        union; when performing schema-validity assessment as
        defined in [XML Schema Part 1: Structures] by rules given
        there; otherwise, by rules given in other
        specifications).

        For every datatype except anySimpleType, the lexical
        mapping maps *onto* the value space of the datatype
        (i.e., every member of the value space is mapped to by
        some lexical representation).  In the case of
        anySimpleType, alone among datatypes defined here, the
        lexical mapping is also not a function onto the value
        space, because there are values with no lexical
        representation.
    
or (draft C):

    [Definition:] The lexical mapping for a datatype is a
    prescribed relation whose domain is a prescribed set of
    character strings (the ·lexical space·) and whose range is
    the ·value space· of that datatype.

        Note: For each primitive datatype defined here, the
        lexical mapping is a total function from the lexical
        space onto the value space of the datatype.  

        For unions and special datatypes, the lexical mapping is
        not a function, since the same character sequence may map
        to several values.  When a lexical representation maps to
        more than one value, the choice of which value to use for
        further processing may be determined by rules given
        elsewhere.

As currently defined, anySimpleType has values without lexical
representations; that fact is noted in drafts A and B, but not C.
It should also be noted in section 3.2.1, but that change is not
part of this proposal; it belongs to bug 3243.
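
The difference between a function and a relation, and the member-order rule for unions, can be sketched as follows. This is only an illustration under stated assumptions: the union has two toy members standing in for an integer type and a string type, and the names `parse_integer`, `parse_string`, `related_values`, and `resolve_by_member_order` are hypothetical.

```python
import re
from typing import Callable, List, Optional, Tuple

Parser = Callable[[str], Optional[object]]


def parse_integer(literal: str) -> Optional[object]:
    """Toy integer member: accepts an optional sign followed by ASCII digits."""
    return int(literal) if re.fullmatch(r"[+-]?[0-9]+", literal) else None


def parse_string(literal: str) -> Optional[object]:
    """Toy string member: every literal maps to itself."""
    return literal


# Member order matters: it is what resolves ambiguous literals.
UNION_MEMBERS: List[Tuple[str, Parser]] = [
    ("integer", parse_integer),
    ("string", parse_string),
]


def related_values(literal: str, members: List[Tuple[str, Parser]]) -> List[Tuple[str, object]]:
    """The lexical mapping viewed as a relation: every (member, value)
    pair that the literal maps to, in member order."""
    pairs = []
    for name, parse in members:
        value = parse(literal)
        if value is not None:
            pairs.append((name, value))
    return pairs


def resolve_by_member_order(literal: str, members: List[Tuple[str, Parser]]) -> Optional[Tuple[str, object]]:
    """Pick the value contributed by the first member that accepts the literal."""
    pairs = related_values(literal, members)
    return pairs[0] if pairs else None


# "1" is related to two distinct values, so the mapping is not a function.
assert related_values("1", UNION_MEMBERS) == [("integer", 1), ("string", "1")]
# Member order selects the integer reading.
assert resolve_by_member_order("1", UNION_MEMBERS) == ("integer", 1)
```

The sketch leaves out the xsi:type override mentioned in draft A; that would amount to choosing a member by name rather than by position.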

Comment 2 C. M. Sperberg-McQueen 2006-10-09 18:17:24 UTC

Apologies; comment #1 was added here by mistake; it belongs in bug 3025.

Comment 3 C. M. Sperberg-McQueen 2007-09-26 02:33:31 UTC

This issue is tied up with bug 3025 and bug 5058 and should
be discussed together with them.

Comment 4 C. M. Sperberg-McQueen 2007-10-14 18:25:47 UTC

The XML Schema WG discussed this issue together with XML Query and XSL
at the face-to-face meetings in October 2007 in Redmond.  The discussion
uncovered some new technical arguments, but we did not reach consensus.

In particular, we noted that QT allows implementations to add new primitive
datatypes, and regards those primitives as restrictions of anySimpleType.
This means (some WG members argued) that the QT specs already entail an
understanding of anySimpleType as including all atomic values, whether
they are values of currently defined primitives or not.  Other WG members
argued that an alternative view would be that QT entails a view in which
it doesn't matter in practice if different implementations have slightly
different value spaces for anySimpleType.  Analogous positions can be
constructed regarding the addition of new primitive datatypes in future
versions of XSDL and/or QT.

Next steps:  the chair does not propose to schedule this issue for further
WG discussion until the editors have prepared one or more wording proposals
for concrete consideration.

Comment 5 Michael Kay 2007-10-14 18:47:29 UTC

I did some musing on this on the flight home.

It seems to me we identified two concepts, which I will call the "potential value space" and the "effable value space". The potential value space contains all values that meet the criteria implicit in the definition of the type. The effable value space contains that subset of the values that have a lexical representation.

Rather than arguing about which of these is the one true meaning of the term "value space", it would seem useful (a) to introduce and name both concepts, (b) to explain that as far as the XSDL 1.1 spec is concerned, there would be no operational differences caused by using one concept rather than the other, but (c) because other specifications may use the same type system and allow means of constructing values other than by mapping from the lexical space, the term "value space" is taken to mean the "potential value space" unless otherwise specified.

There is also another subset, of course, namely the "implemented value space", which takes into account implementation limitations on the range of values.
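
The three notions can be made concrete with a rough sketch. It assumes a toy list-of-strings type with white-space-separated literals, and an implementation limit (`MAX_ITEMS`) that is purely hypothetical. All function names are illustrative only.

```python
import re
from typing import List

XML_WS = re.compile(r"[ \t\n\r]+")
MAX_ITEMS = 3  # hypothetical implementation limit, for illustration only


def split_literal(literal: str) -> List[str]:
    """Assumed lexical mapping: split on XML white space, drop empty tokens."""
    return [token for token in XML_WS.split(literal) if token]


def in_potential_space(value: object) -> bool:
    """Potential value space: any list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def in_effable_space(value: object) -> bool:
    """Effable value space: values that some literal maps to.
    Joining the items with single spaces yields such a literal if one exists."""
    return in_potential_space(value) and split_literal(" ".join(value)) == value


def in_implemented_space(value: object) -> bool:
    """Implemented value space: potential values within the assumed size limit."""
    return in_potential_space(value) and len(value) <= MAX_ITEMS


# An ineffable value: in the potential space, but no literal produces it.
assert in_potential_space([" ", " "]) and not in_effable_space([" ", " "])
# An effable value that exceeds the assumed implementation limit.
assert in_effable_space(["a", "b", "c", "d"]) and not in_implemented_space(["a", "b", "c", "d"])
```

In this model the effable and implemented spaces are both subsets of the potential space, but neither contains the other.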

Comment 6 Dave Peterson 2007-10-26 00:28:15 UTC

(In reply to comment #5)

> There is also another subset, of course, namely the "implemented value space",
> which takes into account implementation limitations on the range of values.

Actually two subset concepts:  One, the "implemented value space", which varies by implementation, and the other, the "minimal implemented value space", which is fixed by the partial-implementation minimum requirements prescribed in the spec.

Comment 7 C. M. Sperberg-McQueen 2007-10-26 22:15:46 UTC

The Working Group discussed the idea in comment #5 during our call of 
26 October 2007.  There was broad agreement that distinguishing between
potential and effable value spaces, and introducing terms for them,
would be useful; also that the special types should be described as
having potential value spaces larger than their effable value spaces
(because they contain ineffable values).

There was disagreement over whether to say that list types had ineffable
values in their potential value spaces or to say that for user-constructable
list types, the potential value space and the effable value space are always
the same.  No consequence of the different formulations was identified that
would be observable by means of schema-validity assessment; the main
consequence would be for the use of list types in other contexts, and
for compatibility of current list types with list types constructed using
some future version of XSDL in which delimiters other than white space
might be allowed.

The status of the issue remains needsDrafting; the proposals prepared by
the editors should be informed by the WG discussion, but no specific
instructions were given.

Comment 8 C. M. Sperberg-McQueen 2008-05-21 12:30:41 UTC

A wording proposal intended to resolve this issue is available at
http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.b3025.html
(member-only link).

Comment 9 C. M. Sperberg-McQueen 2008-05-23 19:43:33 UTC

The wording proposal mentioned in comment #8 was adopted by the XML Schema
WG on its telcon today.  Accordingly, I'm marking this issue resolved.
Michael, as the originator of the issue, I hope you will (a) report back
to the QT groups on its resolution and (b) signal your and their assent to 
this decision by changing the status of the bug to CLOSED, or your dissent 
by reopening it.  If we have not heard from you within the next two weeks or
so, we will assume that silence implies consent.