This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 11335 - Inappropriate use of MAY in 2.4.1.1
Summary: Inappropriate use of MAY in 2.4.1.1
Status: RESOLVED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: 1.1 only
Hardware: PC Windows NT
Importance: P2 minor
Target Milestone: ---
Assignee: David Ezell
QA Contact: XML Schema comments list
Keywords: decided
Reported: 2010-11-17 17:23 UTC by Michael Kay
Modified: 2010-12-17 16:50 UTC

Description Michael Kay 2010-11-17 17:23:58 UTC
In 2.4.1.1 we read "No ·user-defined· datatype MAY have anyAtomicType as its ·base type·"

The RFC2119 "MAY" is always permissive. We should not use RFC markup for an occurrence of "may" that is imposing a constraint. 

Suggest: "A ·user-defined· datatype MUST NOT have anyAtomicType as its ·base type·"
Comment 1 Mukul Gandhi 2010-11-17 23:11:45 UTC
I'm in favor of the use of MAY in the section of the XML Schema 1.1 spec you've cited. It gives implementers of the XML Schema language the benefit of a lax (and possibly implementation-defined) implementation of this behavior.
Comment 2 Michael Kay 2010-11-17 23:56:09 UTC
Mukul's response suggests that he is reading the text differently from the way I read it, which confirms that it needs changing.

My reading of "No X may have a Y" is "There must not be an X that has a Y" - that is, it is a prescriptive statement, not a permissive one.
Comment 3 Dave Peterson 2010-11-18 00:22:42 UTC
(In reply to comment #2)

> My reading of "No X may have a Y" is "There must not be an X that has a Y" -
> that is, it is a prescriptive statement, not a permissive one.

That's the way I read it, too.  And I agree that RFC2119 prefers "MUST NOT" to "MAY not", even if they technically mean the same thing.  (But since they mean the same thing, it's truly a minor editorial fix.)
Comment 4 Michael Kay 2010-11-18 08:51:19 UTC
They may mean the same thing, or they may not.
Comment 5 Mukul Gandhi 2010-11-19 01:00:58 UTC
(In reply to comment #2)
> Mukul's response suggests that he is reading the text differently from the way
> I read it, which confirms that it needs changing.

MAY and "MUST NOT" really mean different things. MAY would mean optional and "MUST NOT" means that the feature is prohibited under any circumstances.

We have to decide whether a user-defined type is prohibited from having anyAtomicType as its base (in which case this would be specified with "MUST NOT"), or whether implementations are allowed to provide this feature if they wish (in which case it needs to be specified with the keyword MAY).

My understanding of MAY and "MUST NOT" is the same as that specified in RFC2119.

And I'm in favor of using MAY in this case (unless decided otherwise by the WG).

Thanks.
Comment 6 Mukul Gandhi 2010-11-19 23:33:29 UTC
(In reply to comment #3)
> That's the way I read it, too.  And I agree that RFC2119 prefers "MUST NOT" to
> "MAY not", even if they technically mean the same thing.  (But since they mean
> the same thing, it's truly a minor editorial fix.)

After thinking this over a bit (these two things probably do mean the same thing, as has been discussed in this thread), I'm inclined to take back my comments. I'll be fine with whatever editorial changes are made (or not made) for this issue.

Thanks.
Comment 7 C. M. Sperberg-McQueen 2010-11-26 19:18:36 UTC
Mukul Gandhi's remarks have been helpful to me in drawing my attention to a flaw in the current formulation of section 2.4.4:  the distinction between built-in and user-defined datatypes assumes that all datatypes are either defined in this spec or defined by the authors of individual schemas, and that the first is what we mean by 'built-in' datatypes.

Even in 1.0, a schema processor is allowed to have automatic knowledge of types other than those defined in the spec; users of such processors may well feel that they are obviously built-in, in any ordinary-language use of the term, but in the XSD spec 'built-in' is a technical term and it does not always have its ordinary-language meaning.

In 1.1, however, implementations are allowed to provide additional primitive types, and ipso facto they are allowed, if they wish, to pass this freedom on to their users.  If a schema processor wishes to define a mechanism for defining new primitive types and allow its users to deploy that mechanism, then it would not be unreasonable to regard those new primitives as user-defined types.  But as primitive types, they will typically need to have anyAtomicType as their base type.  Such user-defined primitives are not meant to be included in the prohibition on user-defined types having anyAtomicType as their base, but it does suggest the need for some thinking about our categories and the terms we use for them.

I think that what is meant by the sentence quoted in the description of the issue is that user-defined ordinary datatypes cannot have anyAtomicType as their base.  This follows, I think, from the fact that no ordinary datatype can have anyAtomicType as its base:  they are either restrictions of primitives or other ordinary datatypes, or they are constructions by list or union, which have anySimpleType as their base type.

The statement seems to take as a premise that users can only define ordinary datatypes.  That's true in XSD as a host language for Datatypes, but nothing in Datatypes requires host languages to use XSD simple type definitions to describe datatypes, and I don't believe anything in Datatypes forbids host languages to allow users to define new 'primitive' types which have anyAtomicType as their base type.
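
The derivation rules described above can be sketched as a small model. This is only an illustration under stated assumptions, not anything taken from the spec or from a real schema processor: the Datatype class, the helper functions (primitive, restrict, list_of, union_of), and the example type names price, priceList, shortPriceList, priceOrList and myPrimitive are all hypothetical. The sketch shows that a type built by restriction, list, or union never ends up with anyAtomicType as its base, while a new primitive (such as one a host language might let users define) does.

```python
from dataclasses import dataclass

ANY_SIMPLE_TYPE = "anySimpleType"
ANY_ATOMIC_TYPE = "anyAtomicType"


@dataclass(frozen=True)
class Datatype:
    """Hypothetical, heavily simplified model of a datatype."""
    name: str
    base: str          # name of the base type
    variety: str       # "atomic", "list", or "union"
    primitive: bool = False


def primitive(name: str) -> Datatype:
    # Primitives sit directly under anyAtomicType.
    return Datatype(name, ANY_ATOMIC_TYPE, "atomic", primitive=True)


def restrict(name: str, base: Datatype) -> Datatype:
    # A restriction takes the restricted type as its base and keeps its variety.
    return Datatype(name, base.name, base.variety)


def list_of(name: str, item: Datatype) -> Datatype:
    # Construction by list: base is anySimpleType (item type not modelled here).
    return Datatype(name, ANY_SIMPLE_TYPE, "list")


def union_of(name: str, *members: Datatype) -> Datatype:
    # Construction by union: base is anySimpleType (members not modelled here).
    return Datatype(name, ANY_SIMPLE_TYPE, "union")


decimal = primitive("decimal")                      # built-in primitive
price = restrict("price", decimal)                  # ordinary, by restriction
price_list = list_of("priceList", price)            # ordinary, by list
short_price_list = restrict("shortPriceList", price_list)
price_or_list = union_of("priceOrList", price, price_list)

# Assumption: a host language that lets users define a new primitive.
my_primitive = primitive("myPrimitive")

ordinary = [price, price_list, short_price_list, price_or_list]
for t in ordinary + [my_primitive]:
    print(f"{t.name}: base={t.base}, variety={t.variety}")
```

In this model the only way to obtain anyAtomicType as a base is through primitive(), which is the case the quoted sentence was presumably not meant to prohibit.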

If the sentence in question is just trying to make explicit something that readers would otherwise have to glean indirectly from the spec, then we should recast it to be clearer and more correct, perhaps in non-normative words.  If it's essential that the statement be normative, we need to figure out just how to phrase it, given that other host languages are allowed to provide new primitives, and may choose to allow user-specified primitives.
Comment 8 David Ezell 2010-12-17 16:50:27 UTC
From 2010-12-03:
close this bug as overtaken, and open a new bug addressing all the issues.

MSM to perform a critical review of "user-defined" terminology and propose revision as needed.