This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3244 - Sets of lists
Summary: Sets of lists
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: 1.1 only
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: thimble, easy; do-it cluster
Keywords: resolved
Depends on:
Blocks:
 
Reported: 2006-05-09 10:43 UTC by Michael Kay
Modified: 2008-01-30 15:33 UTC

Description Michael Kay 2006-05-09 10:43:36 UTC
QT approved comment:

In 3.2.1, the initial definition correctly refers to "the set of all lists", while section 3.2.1.1 incorrectly refers to "all sets of lists". [The union of all sets of lists is a powerset.] Since 3.2.1.2 restricts the lexical representation to finite-length character strings, it might also be appropriate to restrict the value space to contain only finite-length lists.
Comment 1 Dave Peterson 2006-05-20 01:18:36 UTC
(In reply to comment #0)
> QT approved comment:
> 
> In 3.2.1, the initial definition correctly refers to "the set of all lists",
> while section 3.2.1.1 incorrectly refers to "all sets of lists". [The union of
> all sets of lists is a powerset.] 

The union of all sets of lists is a set of lists, which is what we want.  To
coin some in-line notation (read ';' as "subject to the condition"), in 3.2.1
we specified:

    ( U x ; (x a primitive datatype) (value space of x) ) u { x ; x is a list }

or equivalently:

    ( U x ; (x is a primitive datatype's value space) x ) u { x ; x is a list }

In 3.2.1.1 we probably were thinking of:

    U x ; (x is a primitive datatype's value space or x is a set of lists) x

The comma changed the meaning, and in any case we might make the two parallel
or perhaps remove one.
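
For readers who prefer conventional notation, the two formulations above can be restated roughly as follows. This is only a sketch: the symbols P (the set of primitive datatypes), VS(d) (the value space of d), and L (the set of all lists) are names introduced here for illustration and do not appear in the spec.

```latex
% Assumed symbols (not from the spec):
%   P     = the set of primitive datatypes
%   VS(d) = the value space of datatype d
%   L     = the set of all lists, i.e. { x ; x is a list }
\[
  V_{3.2.1} \;=\; \Bigl(\,\bigcup_{d \in P} \mathrm{VS}(d)\Bigr) \;\cup\; L
\]
\[
  V_{3.2.1.1} \;=\; \Bigl(\,\bigcup_{d \in P} \mathrm{VS}(d)\Bigr) \;\cup\; \bigcup_{S \subseteq L} S
\]
% The two agree because every list l lies in the singleton {l}, which is
% a set of lists, and every set of lists S is a subset of L:
\[
  \bigcup_{S \subseteq L} S \;=\; L
\]
```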
Comment 2 C. M. Sperberg-McQueen 2006-09-09 02:20:01 UTC
I have no problem with the wording you prefer, but I think I take
issue with the claim that the existing wording in 3.2.1.1 is 
incorrect.  The union of any number of sets of X is itself
a set of X; a powerset is not the union of such sets but the
set of all of them.  3.2.1.1 says that the value space of anySimpleType is
the union of 

  - all of the primitives (i.e. all of the sets of
    atomic values), and 
  - all of the sets of lists formed from atomic values

Since the union of all sets of lists and the set of all lists
are (at least in this instance) the same, and the phrase
"the set of all lists" seems not to tempt the reader into
thinking of power sets, I'm happy to adopt the change.

I'm marking this editorial because I don't think there's any real
question in the WG about what is intended.
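
The distinction between "the union of all sets of lists" and "the powerset" can be checked on a toy example. The following is a minimal sketch, assuming a hypothetical universe of two atomic values and lists of length at most 2 (the spec's value spaces are of course not bounded this way); the names ATOMS, MAX_LEN, and powerset are illustrative only.

```python
from itertools import chain, combinations, product

# Hypothetical toy universe: two atomic values, lists of length 0..2 only.
ATOMS = ("a", 1)
MAX_LEN = 2

# The set of all lists (modelled as tuples so that they are hashable).
all_lists = frozenset(
    items for n in range(MAX_LEN + 1) for items in product(ATOMS, repeat=n)
)


def powerset(s):
    """Return the set of all subsets of s, i.e. all 'sets of lists'."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


# Every set of lists over the toy universe.
sets_of_lists = powerset(all_lists)

# The union of all sets of lists: flatten one level of nesting.
union_of_sets = frozenset(chain.from_iterable(sets_of_lists))
```

In this toy model union_of_sets equals all_lists, while sets_of_lists (the powerset) is a different and much larger object whose members are sets rather than lists.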
Comment 3 C. M. Sperberg-McQueen 2007-09-27 04:42:41 UTC
To make the wording proposal easier to review, I repeat it in more
verbose form here.  Delete the current first and only paragraph of
3.2.1.1, which now reads:

    The ·value space· of anySimpleType is the union of the ·value
    spaces· of all the ·primitive· datatypes defined here, and of all
    sets of lists formed from the members of the ·primitive·
    datatypes.

Replace it with this reformulation, which replaces "and of all sets of
lists" with "and of the set of all finite-length lists".

    The ·value space· of anySimpleType is the union of the ·value
    spaces· of all the ·primitive· datatypes defined here, and of the
    set of all finite-length lists formed from the members of the
    ·primitive· datatypes.

Also change the definition of anySimpleType by replacing the phrase
"the set of all lists" with the phrase "the set of all finite-length
lists", the phrase "of all members" with "of members", and the phrase
"of all the primitive datatypes" with "of the primitive datatypes", so
that it reads:

    [Definition:] The definition of anySimpleType is a special
    ·restriction· of anyType.  anySimpleType has an unconstrained
    ·lexical space·, a ·value space· consisting of the union of the
    ·value spaces· of all the ·primitive· datatypes and the set of all
    finite-length lists of members of the ·value spaces· of the
    ·primitive· datatypes.

Note that this wording proposal is intended to resolve bug 3244 only;
it is not intended to resolve any other of the outstanding issues
about the definition of the special types, such as bug 3243, bug 5058,
or bug 3025.  The question of cleaning up the quantifiers, raised
here, is orthogonal to the questions raised in those other bugs.

I'm setting the status of this issue to needsReview; but note that
this wording proposal has not had the normal full editorial review.
Comment 4 Michael Kay 2007-09-27 08:04:47 UTC
Looking good. But one minor quibble (or rather, two). In

the set of all finite-length lists formed from the members of the ·primitive· datatypes.

(a) is "formed from" clear?

(b) is "members" clear?

(and in particular, is there any danger that anyone could misread this as suggesting the items in a list must all belong to the same primitive type?)

(I'll avoid reopening the discussion on "finite-length" as I'm in danger of shooting myself in the foot...)

Would the following be better:

the set of all finite-length lists of ·atomic values·

and then taking care to define atomic value (we almost do so already, but not quite).
Comment 5 Dave Peterson 2007-10-26 01:30:48 UTC
(In reply to comment #3)

> Note that this wording proposal is intended to resolve bug 3244 only;
> it is not intended to resolve any other of the outstanding issues
> about the definition of the special types, such as bug 3243, bug 5058,
> or bug 3025.  The question of cleaning up the quantifiers, raised
> here, is orthogonal to the questions raised in those other bugs.

Since the existing wording is technically correct, even if someone did misread it, and bug 5058 points out that the definition of anySimpleType used here conflicts with the definition of value space, there is no point in trying to improve the wording here until we decide whether it is correct or the definition of value space is correct.  The WG should not spend time on this until 5058 is decided.
Comment 6 C. M. Sperberg-McQueen 2007-10-26 02:47:30 UTC
I suggest we modify the wording proposal of comment #3 by adopting the
changes suggested in comment #4, and also try to recast the
definitions in light of Sandy Gao's comment at the face to face
meeting that it feels odd to define the base type in terms of its
restrictions instead of vice versa.  (There is a limit to how fully I
have been able to address SG's concern, but I've tried.)

Net result:

1) In section 2.4.1, define atomic value and recast the definitions of
atomic and list datatypes to exploit it (and to align with the WG's
decision on bug 3230).  Specifically, for

    First, we distinguish ·atomic·, ·list·, and ·union· datatypes.

      - [Definition:] Atomic datatypes are those having values
        treated by this specification as indivisible.  Atomic
        datatypes are anyAtomicType and all datatypes ·derived· from
        it.

      - [Definition:] List datatypes are those having values each of
        which consists of a finite-length (possibly empty) sequence of
        values of an ·atomic· datatype (or a ·union· of ·atomic·
        datatypes), which is the ·item type· of the list.

read 

    First, we distinguish ·atomic·, ·list·, and ·union· datatypes.

    [Definition:] An atomic value is an elementary value, not
    constructed from simpler values by any means defined by this
    specification.

      - [Definition:] Atomic datatypes are those whose value spaces
        contain only atomic values.  Atomic datatypes are anyAtomicType 
        and all datatypes ·derived· from it.

      - [Definition:] List datatypes are those having values each of
        which consists of a finite-length (possibly empty) sequence of
        atomic values.  The values in a list are drawn from some 
        ·atomic· datatype (or from a ·union· of ·atomic·
        datatypes), which is the ·item type· of the list.

2) In 3.2.1, change the definition of anySimpleType, which currently
reads

    [Definition:] The definition of anySimpleType is a special
    ·restriction· of anyType.  anySimpleType has an unconstrained
    ·lexical space·, a ·value space· consisting of the union of the
    ·value spaces· of all the ·primitive· datatypes and the set of all
    lists of all members of the ·value spaces· of all the ·primitive·
    datatypes.

to read

    [Definition:] The definition of anySimpleType is a special
    ·restriction· of anyType.  Its ·lexical space· is the set of all
    sequences of Unicode characters, and its ·value space· includes all
    ·atomic values· and all finite-length lists of ·atomic values·.

3) In 3.2.1.1, replace the current first and only paragraph, which now
reads:

    The ·value space· of anySimpleType is the union of the ·value
    spaces· of all the ·primitive· datatypes defined here, and of all
    sets of lists formed from the members of the ·primitive·
    datatypes.

with this reformulation:

    The ·value space· of anySimpleType is the set of all atomic values
    and of all finite-length lists of atomic values.


Note that this wording proposal has not had the normal full editorial
review.

Note also that as far as I can tell this proposal is compatible with
any of the proposed resolutions for the cluster of bug 3243, bug 3025,
and bug 5058.  Depending on how we resolve those issues, we should
probably add notes to 3.2.1 and 3.2.2 explaining how the value spaces
of anySimpleType and anyAtomicType relate to those of the primitives,
and how they stand on the question of effability.
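
As a rough illustration of the value space described by the wording above (all atomic values plus all finite-length lists of atomic values), here is a minimal sketch. It assumes a handful of Python types as stand-ins for atomic values and models lists as tuples; ATOMIC_PYTHON_TYPES and the function names are hypothetical and are not part of the spec or of any schema processor.

```python
from decimal import Decimal

# Assumption: a few Python types stand in for "atomic values".
# Real XSD primitive value spaces are far richer than this.
ATOMIC_PYTHON_TYPES = (str, bool, int, Decimal, float)


def is_atomic_value(v):
    """True if v is one of the stand-in atomic values."""
    return isinstance(v, ATOMIC_PYTHON_TYPES)


def is_list_value(v):
    """True if v is a finite-length (possibly empty) sequence of atomic values."""
    return isinstance(v, tuple) and all(is_atomic_value(x) for x in v)


def in_any_simple_type_value_space(v):
    """True if v is an atomic value or a finite-length list of atomic values."""
    return is_atomic_value(v) or is_list_value(v)
```

Under this reading a list such as (1, "x", Decimal("2.5")) is accepted even though its items come from different stand-in types, the empty list () is accepted, and neither a list of lists nor a set of lists is a member.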
Comment 7 Michael Kay 2007-10-26 08:19:22 UTC
The proposal in comment #6 looks good to me (who started all the fuss).

Michael Kay
Comment 8 Dave Peterson 2007-10-26 13:54:49 UTC
(In reply to comment #6)

>     First, we distinguish ·atomic·, ·list·, and ·union· datatypes.
> 
>     [Definition:] An atomic value is an elementary value, not
>     constructed from simpler values by any means defined by this
>     specification.

Since we have been nit-picking the definitions, it seems to me that the prescriptions of built-up primitive datatypes (e.g., date/time datatypes, duration, and precisionDecimal) could be construed as "means defined by this specification".

Is the important point that an atomic *datatype* cannot be constructed from simpler *datatypes* by any means defined in this specification?  (And then, of course, atomic values are values in the value space of an atomic datatype.)

> - [Definition:] Atomic datatypes are those whose value spaces
>         contain only atomic values.  Atomic datatypes are anyAtomicType 
>         and all datatypes ·derived· from it.

    - [Definition:]  Atomic datatypes are those which cannot be constructed
      from simpler datatypes by any construction mechanism defined in
      this specification.

While we construct the built-up datatypes using objects with named properties whose values are generally real numbers (specifically decimals and integers) or strings of characters, these values are not "members of the value space" of the corresponding datatypes.  We refer to them differently; membership in the value spaces is conferred by intension, not extension, in our spec.  Thus the value spaces are not constructed using the value spaces of other datatypes.

Granted, at some point we have to stop nit-picking and credit our readers with intelligence.  ;-)
Comment 9 C. M. Sperberg-McQueen 2007-10-26 22:08:16 UTC
The XML Schema WG discussed this issue ("sets of lists") during our call of
26 October 2007.  We accepted the wording proposal of comment #6 and
instructed the editors to mark the issue decided.  The amendment proposed in 
comment #8 was considered but not accepted; the arguments mentioned were that
it's better to define atomicity with respect to values not value spaces, and
the definition offered in comment #8 seems to be a definition of primitiveness,
not atomicity.  

Speaking personally, I am conscious that the latter argument also applies 
(albeit less strongly, I think) to the wording we accepted, and I continue 
to regard the definition of atomic value as a bit of a soft spot.  If a 
better formulation is found, we can and should consider it in a separate
later editorial proposal.
Comment 10 C. M. Sperberg-McQueen 2008-01-30 15:24:49 UTC
The WG decision mentioned in comment #9 was integrated into the 
status-quo document in October 2007.  (That fact should
have been noted here earlier, but there were distractions.)

Michael, as the individual who entered the issue on behalf of QT, 
could you at some convenient point report the WG's disposition of the 
issue back to QT and let us know in the usual way whether QT is content
with that disposition or not, by changing the status either to CLOSED
or to REOPENED?  Thank you.  
Comment 11 Michael Kay 2008-01-30 15:33:02 UTC
As already indicated in comment #7, this solution looks fine. Thanks.