This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 2251 - R-259: Are QNames without prefix bindings type-valid?
Status: CLOSED LATER
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
Keywords: resolved
Reported: 2005-09-14 19:48 UTC by Sandy Gao
Modified: 2009-04-21 19:21 UTC

Description Sandy Gao 2005-09-14 19:48:54 UTC
Discussed on 20 February 2004 and agreed to be a problem. The question arose in 
discussion of a draft comment on the QT serialization spec. The relevant part 
of our minutes reads: 

"HST was unhappy about the mention of QNames being type-valid when unbound; he 
remembered it differently. We discussed this question for a few minutes; some 
WG members (HST) believed having a binding for the prefix is essential to QName 
type validity, while some (MSM, PVB) believed they remembered removing that 
from the definition of type validity at the same time we removed the uniqueness 
constraint from the type-validity conditions for ID and moved it into 
Structures. 

Ultimately, we constructed what amounts to an argument that having a bound 
prefix IS a condition of type validity. At the very least, if you have a type 
derived from QName via enumeration, you'll need a QName value to check 
instances against the enumeration. If you don't have a binding for the prefix, 
you won't have a value. The same goes for any other facet except 'pattern'. 
Therefore, a prefix binding seems essential at least for types derived from 
QName; as regards QName itself, some WG members suggested that the binding 
might not be essential; at least, they could not find any general rule that 
says that an instance of a simple type is type-valid if and only if the lexical 
form successfully maps into a type. 

Other WG members noted that clause 2 of validation rule Datatype Valid 
(http://www.w3.org/XML/Group/2003/09/xmlschema-2/datatypes-with-errata.html#cvc-datatype-valid)
requires that "the value denoted by the literal matched in the
previous step [be] a member of the value space of the datatype, as determined 
by it being Facet Valid (sect. 4.1.4) with respect to each member of {facets} 
(except for pattern)." This seems to entail that there must be such a value, 
and thus constitute a requirement that the LV mapping must succeed for an 
instance to be type-valid."
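
The enumeration argument in the minutes can be illustrated with a minimal Python sketch. It is not taken from the spec or from any schema processor. The names qname_to_value and enumeration_valid, the dictionary of in-scope bindings (with "" standing for the default namespace), and the example namespace URI are all assumptions made for illustration. The sketch assumes the literal has already passed the lexical check.

```python
from typing import Mapping, Optional, Set, Tuple

# A QName value modelled as (namespace URI or None, local name).
ExpandedName = Tuple[Optional[str], str]


def qname_to_value(literal: str, in_scope: Mapping[str, str]) -> Optional[ExpandedName]:
    """Map a QName literal to its value, or return None if the prefix is unbound.

    `in_scope` maps prefixes to namespace URIs; the key "" is this sketch's
    convention for the default namespace.
    """
    prefix, sep, local = literal.partition(":")
    if not sep:
        # No colon: the whole literal is the local name.
        return (in_scope.get(""), literal)
    if prefix not in in_scope:
        # No binding for the prefix, so no value can be computed.
        return None
    return (in_scope[prefix], local)


def enumeration_valid(literal: str,
                      in_scope: Mapping[str, str],
                      allowed: Set[ExpandedName]) -> bool:
    """Check an enumeration facet: the comparison is on values, not on literals."""
    value = qname_to_value(literal, in_scope)
    return value is not None and value in allowed


allowed = {("http://example.com/ns", "something")}  # hypothetical enumeration
bound = enumeration_valid("p:something", {"p": "http://example.com/ns"}, allowed)
unbound = enumeration_valid("unbound:something", {}, allowed)
```

Under these assumptions, bound is True and unbound is False: with no binding there is no value to compare against the enumeration, which is the point made in the minutes for every facet other than 'pattern'.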

Mary Holstege's note reads: 

The question is whether this instance: 

   <tricky qname="unbound:something"/>

would be invalid per this element declaration: 

   <xs:element name="tricky">
     <xs:complexType>
       <xs:attribute name="qname" type="xs:QName"/>
     </xs:complexType>
   </xs:element>

It is valid per the lexical rules in part 2, so the only question comes as to 
where we make an appeal to there being a value and a value that we know. 
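
The two checks being distinguished here can be sketched in Python as follows. This is only an illustration. The NCName pattern is a deliberately simplified ASCII approximation, not the full production from Namespaces in XML, and the function names and the prefix-to-URI dictionary are assumptions.

```python
import re
from typing import Mapping

# Simplified, ASCII-only approximation of NCName (an assumption of this sketch).
_NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
_QNAME = re.compile(rf"(?:({_NCNAME}):)?({_NCNAME})")


def is_lexical_qname(literal: str) -> bool:
    """Lexical check only: does the literal have the shape prefix:local or local?"""
    return _QNAME.fullmatch(literal) is not None


def has_known_value(literal: str, in_scope: Mapping[str, str]) -> bool:
    """Value check: the literal is lexically a QName and any prefix it uses is bound."""
    match = _QNAME.fullmatch(literal)
    if match is None:
        return False
    prefix = match.group(1)
    return prefix is None or prefix in in_scope


lexical_ok = is_lexical_qname("unbound:something")
value_known = has_known_value("unbound:something", {})
```

For the instance above, lexical_ok is True while value_known is False. The open question is whether validity against xs:QName itself depends on the second check.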

The relevant clause in "Validation Rule: Datatype Valid" [3] is: 

"A string is datatype-valid with respect to a datatype definition if: 
... 
2 the value denoted by the literal matched in the previous step is a member of 
the value space of the datatype, as determined by it being Facet Valid (sect. 
4.1.4) with respect to each member of {facets} (except for pattern)." 

This is the only clause that makes any appeal to the value space.

Some read this as implicitly requiring you to have the value in hand in order 
to do the facet check. 

I read this as saying that "Facet Valid" decides whether the value is OK. In 
this case, Facet Valid does not obtain, because there are no facets. (The 
clause "as determined by..." I take to be definitional.) 

Further, even if it did obtain, the difficulty with a QName with an unbound 
prefix isn't that there isn't a value, it is only that you don't _know_ what it 
is. So I would look at this as very much analogous to undischarged component 
references, where it isn't that you know something is wrong, it is that you 
don't know what the state of affairs is. And indeed, if the instance document 
were a schema, and the "qname" attribute above were instead a "type" attribute, 
this would be an entirely consistent view to take. 
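
One way to picture this reading is as a three-way outcome rather than a yes/no answer. The Python sketch below is hypothetical. The Outcome enumeration and the assess_qname function are not part of XML Schema, and the lexical test is reduced to checking the colon structure.

```python
from enum import Enum
from typing import Mapping


class Outcome(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"  # lexically fine, but the value cannot be determined


def assess_qname(literal: str, in_scope: Mapping[str, str]) -> Outcome:
    """Classify a QName literal under the 'unknown, not wrong' reading."""
    prefix, sep, local = literal.partition(":")
    if not sep:
        # No colon: the whole literal is the local name; only emptiness is checked.
        return Outcome.VALID if literal else Outcome.INVALID
    if not prefix or not local or ":" in local:
        # Malformed colon structure is a genuine lexical error.
        return Outcome.INVALID
    if prefix not in in_scope:
        # Analogous to an undischarged component reference.
        return Outcome.UNKNOWN
    return Outcome.VALID


result = assess_qname("unbound:something", {})
```

Here result is Outcome.UNKNOWN. Supplying a binding for "unbound" changes it to Outcome.VALID without any change to the literal itself.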

Similarly, the only place in Structures that makes an appeal to the value is if 
there is some kind of value constraint. In this case, there is no value 
constraint, so again, those constraints don't apply. I don't believe there is 
any disagreement about the interpretation of the applicability of those 
clauses. 

I did not find anywhere that explicitly requires that (a) there be a value and 
(b) you know what it is. 

There is a note in 3.2.18 of Datatypes [4] that says: 

"NOTE: The mapping between literals in the lexical space and values in the 
value space of QName requires a namespace declaration to be in scope for the 
context in which QName is used."

Some read this as noting that you need to have an in-scope namespace definition 
in order to have a good QName. But again, I don't read this as compelling me to 
know what the value happens to be, but only noting that indeed there are some 
cases where things get sticky if you want to get your hands on the right value 
in the value space. 

I am not entirely sure that it is a good idea to require that there both be a 
value and that you know exactly what it is in the case of QName. (And I wonder 
about whether it would be a good one in some of those float/double edge cases, 
too.) I have always been uncomfortable with the pun of using the mechanism for 
declaring namespaces for tags to provide namespace declarations for content 
(i.e. architecturally I would always prefer seeing explicit, distinct kinds of 
declarations). In the context of XQuery, for example, this gets strange, 
because the above instance would be valid if there were the necessary "declare 
namespace..." in the prolog, even though the instance in itself would be 
invalid per the putative XML Schema "must have a known value" rules. I find 
that odd. 

Note that once you put a value constraint on the attribute or derive a type 
that adds some facets (enumerations, for example), you do run afoul of the 
cited clauses, among other things. On this we all agree. 

So the questions are: 

(1) Do we in fact require that an instance of a simple type have a value to be 
valid? And if not, should we? 
(2) Do we in fact require that you know what the (or a?) value of an instance 
of a simple type is in order to be valid? And if not, should we?
Comment 1 C. M. Sperberg-McQueen 2005-11-09 16:38:30 UTC
See also bug 2088 and bug 2075, which are related to this one and
should be considered and resolved at the same time.
Comment 2 C. M. Sperberg-McQueen 2006-09-20 23:30:54 UTC
At the face to face meeting of January 2006 in St. Petersburg,
the Working Group discussed this issue.  While there was some
regret over the decision, in the end the Working Group decided
not to take further action on this issue in XML Schema 1.1.

The rationale for the decision (as I understand it) was roughly
as follows.  This item is similar in some respects to others (bug
2088, bug 2200, bug 2251, bug 2075, bug 2314); all involve
datatypes whose values are in some sense correct only if
appropriate declarations (or other constructs) are in scope.  It
would be good to have a clearer account of such datatypes, but
while the lack of a clear account is highly visible in the spec,
it does not seem to cause serious problems for many people in
practice.  Since we don't seem to have any immediate prospect of
achieving greater clarity, and the problem does not seem acute
for users, it seems unwise to delay Datatypes 1.1 for further
work in this area.

This issue should have been marked as RESOLVED / LATER at that
time, but apparently was not.  I am marking it that way now, to
reduce confusion.