This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 5476 - xsi:schemaLocation should be a hint, should be MAY not SHOULD
Summary: xsi:schemaLocation should be a hint, should be MAY not SHOULD
Status: RESOLVED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.0/1.1 both
Hardware: Macintosh All
Importance: P2 minor
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: composition cluster
Keywords: resolved
Depends on:
Blocks:
 
Reported: 2008-02-13 03:24 UTC by C. M. Sperberg-McQueen
Modified: 2008-10-10 18:25 UTC

Description C. M. Sperberg-McQueen 2008-02-13 03:24:15 UTC
Section 4.3.2 of the spec reads (and in essentials has read since
XSD 1.0):

    Unless directed otherwise, for example by the invoking 
    application or by command line option, processors should
    attempt to dereference each schema document location URI 
    in the ·actual value· of such xsi:schemaLocation and 
    xsi:noNamespaceSchemaLocation [attributes]

We have long accepted the principle that the choice of schema
for validation should lie with the entity requesting the
validation, not with the author (when the two are distinct).  This
wording tends to undercut that principle and should be changed.

For SHOULD, read MAY.  Indeed, so cumbersome have I found this
convention in practice that I'm tempted to say SHOULD NOT.
Comment 1 Michael Kay 2008-02-13 09:20:40 UTC
I hate the word "hint" with its connotation of "do whatever you feel like".

I would prefer this to read:

    If directed to do so, for example by the invoking
    application or by command line option, processors should
    attempt to dereference each schema document location URI
    in the ·actual value· of such xsi:schemaLocation and
    xsi:noNamespaceSchemaLocation [attributes]

It's not really a substantive change because choice of defaults is up to the API designer, but it sets expectations differently.

There are some validators out there that require you to add an xsi:schemaLocation attribute before you can validate the instance. I would love to make that illegal, though on balance I'm happy to let the market decide whether it wants such products or not. But I certainly think we should set the expectations a bit differently.
 
Comment 2 Noah Mendelsohn 2008-09-23 21:29:42 UTC
The original bug report (from MSM, I think) reads:

> We have long accepted the principle that the choice
> of schema for validation should lie with the entity
> requesting the validation, not with the author (when
> the two are distinct).  This wording tends to
> undercut that principle and should be changed.

I tend to agree.

> For SHOULD, read MAY.  

I would prefer that.

> Indeed, so cumbersome have I found this convention in
> practice that I'm tempted to say SHOULD NOT.

My vote is: probably a step too far, but I wouldn't spend time objecting if consensus is to go this way.  I think we want to proceed in the spirit of the first comment quoted above, which suggests we should be neutral.

Michael Kay writes:

> If directed to do so, for example by the invoking
> application or by command line option, processors
> should attempt to dereference each schema document
> location URI in the ·actual value· of such
> xsi:schemaLocation and xsi:noNamespaceSchemaLocation
> [attributes]

I've never liked the "if directed to do so" formulation, which I recall accepting reluctantly as a compromise in the run-up to Schema 1.0. I would prefer to just say that "schemaLocation is a hint provided by the document author to suggest where a suitable schema document might be found. Processors MAY but need not attempt to retrieve and use such schema documents and MAY but need not fail if such a retrieval attempt fails. The choice of whether to use such documents would typically be determined according to the needs of the particular system or application.", or words to that effect.

Noah




Comment 3 Michael Kay 2008-09-23 22:17:49 UTC
Well, I have to say, I hate this term "hint" which suggests that processors can do what they like and users have no influence over the process. It seems to put the control and decision-making in the wrong place. I agree it's a soft distinction, but it does set expectations. Some products indeed appear to have taken this to heart, and give users very little control over choice of a schema for validation, which makes a nonsense of the whole idea of validation - what use is validation if you can't control the choice of rules to be applied?
Comment 4 Noah Mendelsohn 2008-09-25 16:22:31 UTC
> Some products indeed appear to have taken this to heart, and give
> users very little control over choice of a schema for validation,
> which makes a nonsense of the whole idea of validation - what use is
> validation if you can't control the choice of rules to be applied?

I think there are contexts in which that lack of control is exactly what I want to implement. Let's say I'm implementing, as I have done at least in prototype form, a schema validator specifically optimized for and intended for use in a Web Services client. In those scenarios, we very commonly want to force the behavior that no network connections will be made by the validator, and it's also typically the case that we would not trust a schema provided by the document author if it were provided (if we trusted the document author, then why validate in the first place?). So, it's fine if I buy a processor that implements both options, but that will implement a code path I never use. I'll always set it to not honor the "hint". I think it's equally appropriate for me to use a processor that leaves out that unused code, and that just ignores the hint.

As an aside, I've often thought that in such Web services scenarios what I'd really want is to check that the URI provided for a schemaLocation, if any, matches the one I as the receiver was expecting anyway. For a purchase order, for example, that would give one the comfort that you wrote the purchase order with knowledge of the one fixed schema that I am going to use to validate it. This is particularly critical if we're going to depend on things like type assignments, attribute value defaults, etc. I don't think we need to modify our recommendation to provide for this; I think it's a service that particular validation software can provide above and beyond the computation of the PSVI, or as a precondition on whether to attempt validation at all, i.e., if I discover that the author had in mind a different schema than the one I was expecting, then punt, because the communication is known to be unreliable from that one fact alone.
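
The precondition check described in the aside above could look roughly like the following. This is only an illustrative sketch, not part of any processor or of the recommendation: the function names `schema_location_hints` and `hint_matches_expected` are hypothetical, and the sketch assumes that the first hint found for a namespace is the one that counts and that a document with no hint at all is acceptable ("if any"). It relies only on the fact that the value of xsi:schemaLocation is a whitespace-separated list of namespace/location pairs.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_location_hints(root):
    """Collect {namespace: location} from every xsi:schemaLocation in the tree.

    Hypothetical helper: the first hint seen for a namespace wins.
    """
    hints = {}
    for element in root.iter():
        tokens = element.get(f"{{{XSI_NS}}}schemaLocation", "").split()
        # The value is a list of (namespace name, schema document URI) pairs.
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            hints.setdefault(namespace, location)
    return hints


def hint_matches_expected(xml_text, namespace, expected_location):
    """Return True if the author's hint, if any, names the expected schema.

    Assumption: a missing hint is accepted; only a conflicting one is rejected.
    Nothing is dereferenced here; the URIs are compared as plain strings.
    """
    declared = schema_location_hints(ET.fromstring(xml_text)).get(namespace)
    return declared is None or declared == expected_location
```

A receiver could call `hint_matches_expected` before validating against its own fixed schema and decline to proceed when it returns False.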

Comment 5 Henry S. Thompson 2008-10-04 12:08:05 UTC
At its telcon of 2008-10-03, the WG agreed this resolution: 

    Ask the Editors to change the relevant para of 4.3.2 to "Processors may
    attempt to dereference each schema document location URI in the
    ·actual value· of such xsi:schemaLocation and
    xsi:noNamespaceSchemaLocation [attributes]. Schema processors
    SHOULD provide an option to control whether they do so."

http://www.w3.org/2008/10/03-xmlschema-minutes.html
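
As a rough illustration of the behaviour the resolution asks for, the sketch below shows a schema-assembly step with an explicit option controlling whether the location hints are dereferenced. Everything here is an assumption made for illustration: the function name `schema_documents_to_use`, the `follow_hints` option and its default of False, the injected `fetch` callable, and the choice to tolerate a failed retrieval rather than abort (the thread notes that a processor might reasonably do either).

```python
def schema_documents_to_use(hint_locations, caller_schemas,
                            follow_hints=False, fetch=None):
    """Return the schema documents a (hypothetical) processor would assemble.

    hint_locations: URIs taken from xsi:schemaLocation and
                    xsi:noNamespaceSchemaLocation in the instance.
    caller_schemas: schema documents supplied by the party requesting validation.
    follow_hints:   the option the resolution says processors SHOULD provide.
    fetch:          callable mapping a URI to a schema document; injected so
                    the sketch makes no network calls of its own.
    """
    documents = list(caller_schemas)
    if not follow_hints:
        # The hints are ignored entirely; nothing is dereferenced.
        return documents
    if fetch is None:
        raise ValueError("follow_hints=True requires a fetch callable")
    for uri in hint_locations:
        try:
            documents.append(fetch(uri))
        except OSError:
            # Assumption: a failed retrieval is skipped, not treated as fatal.
            continue
    return documents
```

With `follow_hints` left at False the caller's schemas are used alone, which matches the no-network Web Services scenario from Comment 4; setting it to True opts in to the author's hints.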