This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Section 4.3.2 of the spec reads (and in essentials has read since XSD 1.0):

  Unless directed otherwise, for example by the invoking application or by command line option, processors should attempt to dereference each schema document location URI in the ·actual value· of such xsi:schemaLocation and xsi:noNamespaceSchemaLocation [attributes]

We have long accepted the principle that the choice of schema for validation should lie with the entity requesting the validation, not with the author (when the two are distinct). This wording tends to undercut that principle and should be changed.

For SHOULD, read MAY. Indeed, so cumbersome have I found this convention in practice that I'm tempted to say SHOULD NOT.
I hate the word "hint" with its connotation of "do whatever you feel like". I would prefer this to read:

  If directed to do so, for example by the invoking application or by command line option, processors should attempt to dereference each schema document location URI in the ·actual value· of such xsi:schemaLocation and xsi:noNamespaceSchemaLocation [attributes]

It's not really a substantive change because choice of defaults is up to the API designer, but it sets expectations differently.

There are some validators out there that require you to add an xsi:schemaLocation attribute before you can validate the instance. I would love to make that illegal, though on balance I'm happy to let the market decide whether it wants such products or not. But I certainly think we should set the expectations a bit differently.
The original bug report (from MSM, I think) reads:

> We have long accepted the principle that the choice
> of schema for validation should lie with the entity
> requesting the validation, not with the author (when
> the two are distinct). This wording tends to
> undercut that principle and should be changed.

I tend to agree.

> For SHOULD, read MAY.

I would prefer that.

> Indeed, so cumbersome have I found this convention in
> practice that I'm tempted to say SHOULD NOT.

My vote is: probably a step too far, but I wouldn't spend time objecting if consensus is to go this way. I think we want to proceed in the spirit of the first comment quoted above, which suggests we should be neutral.

Michael Kay writes:

> If directed to do so, for example by the invoking
> application or by command line option, processors
> should attempt to dereference each schema document
> location URI in the ·actual value· of such
> xsi:schemaLocation and xsi:noNamespaceSchemaLocation
> [attributes]

I've never liked the "if directed to do so" formulation, which I recall accepting reluctantly as a compromise in the run-up to Schema 1.0. I would prefer to just say that "schemaLocation is a hint provided by the document author to suggest where a suitable schema document might be found. Processors MAY but need not attempt to retrieve and use such schema documents and MAY but need not fail if such a retrieval attempt fails. The choice of whether to use such documents would typically be determined according to the needs of the particular system or application.", or words to that effect.

Noah
Well, I have to say, I hate this term "hint" which suggests that processors can do what they like and users have no influence over the process. It seems to put the control and decision-making in the wrong place. I agree it's a soft distinction, but it does set expectations. Some products indeed appear to have taken this to heart, and give users very little control over choice of a schema for validation, which makes a nonsense of the whole idea of validation - what use is validation if you can't control the choice of rules to be applied?
> Some products indeed appear to have taken this to heart, and give
> users very little control over choice of a schema for validation,
> which makes a nonsense of the whole idea of validation - what use is
> validation if you can't control the choice of rules to be applied?

I think there are contexts in which that lack of control is exactly what I want to implement. Let's say I'm implementing, as I have done at least in prototype form, a schema validator optimized for and intended specifically for use in a Web Services client. In those scenarios, we very commonly want to force the behavior that no network connections will be made by the validator, and it's also typically the case that we would not trust a schema provided by the document author if it were provided (if we trusted the document author, then why validate in the first place?).

So, it's fine if I buy a processor that implements both options, but that will implement a code path I never use. I'll always set it to not honor the "hint". I think it's equally appropriate for me to use a processor that leaves out that unused code, and that just ignores the hint.

As an aside, I've often thought that in such Web services scenarios what I'd really want is to check that the URI provided for a schemaLocation, if any, matches the one I as the receiver was expecting anyway. For a purchase order, for example, that would give one the comfort that: you wrote the purchase order with knowledge of the one fixed schema that I am going to use to validate it. This is particularly critical if we're going to depend on things like type assignments, attribute value defaults, etc. I don't think we need to modify our recommendation to provide for this; I think it's a service that particular validation software can provide above and beyond the computation of the PSVI, or as a precondition on whether to attempt validation at all, i.e. if I discover that the author had in mind a different schema than the one I was expecting, then punt, because the communication is known to be unreliable from that one fact alone.
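The precondition check described in the aside above (compare the author's schemaLocation against the schema the receiver already intends to use, and never dereference it) might be sketched roughly as follows. This is only an illustration, not part of any proposal in this thread: the function names are hypothetical, and for brevity it looks only at the root element, although xsi:schemaLocation may appear on other elements too.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def declared_schema_locations(xml_text):
    """Return {namespace: location} from the root element's xsi:schemaLocation.

    Assumption: only the root element is inspected.
    """
    root = ET.fromstring(xml_text)
    raw = root.get(f"{{{XSI_NS}}}schemaLocation", "")
    tokens = raw.split()
    # The attribute value is a whitespace-separated list of
    # (namespace name, schema document location) pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def author_agrees(xml_text, namespace, expected_location):
    """True unless the author named a different schema document for `namespace`.

    Nothing is fetched; the declared location is only compared as a string.
    A missing hint is treated as "no disagreement".
    """
    declared = declared_schema_locations(xml_text)
    return declared.get(namespace, expected_location) == expected_location
```

A receiver could call `author_agrees(...)` before validating against its own fixed schema and reject the message outright when it returns `False`. Comparing the URIs as plain strings is a simplification; a real check might normalize them first.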
At its telcon of 2008-10-03, the WG agreed this resolution:

Ask the Editors to change the relevant para of 4.3.2 to "Processors may attempt to dereference each schema document location URI in the ·actual value· of such xsi:schemaLocation and xsi:noNamespaceSchemaLocation [attributes]. Schema processors SHOULD provide an option to control whether they do so."

http://www.w3.org/2008/10/03-xmlschema-minutes.html
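For illustration only, the kind of option the resolution asks for might look like the sketch below. All names here are hypothetical and do not come from any real processor; the default of `False` is likewise an assumption, since the resolution says only that an option SHOULD exist, not what its default should be.

```python
from dataclasses import dataclass


@dataclass
class ValidatorOptions:
    # Hypothetical switch: whether to follow the locations named in
    # xsi:schemaLocation / xsi:noNamespaceSchemaLocation.
    # The default value is an assumption of this sketch.
    use_schema_location_hints: bool = False


def locations_to_dereference(declared, options, supplied=()):
    """Decide which schema document locations a processor would try to fetch.

    `supplied` holds locations chosen by whoever requested validation;
    `declared` holds locations found in the instance document.
    """
    locations = list(supplied)
    if options.use_schema_location_hints:
        # Follow the author's locations only when explicitly enabled.
        locations.extend(loc for loc in declared if loc not in locations)
    return locations
```

With the option off, only the locations supplied by the party requesting validation are used, which matches the principle stated in the original report.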