This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 3673 - revisit (and enlarge) XPath subset for assertions?
Summary: revisit (and enlarge) XPath subset for assertions?
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.1 only
Hardware: Macintosh All
Importance: P2 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: important, hard
Keywords: needsDrafting
Depends on:
Blocks:
 
Reported: 2006-09-08 04:09 UTC by C. M. Sperberg-McQueen
Modified: 2007-05-02 00:11 UTC

Description C. M. Sperberg-McQueen 2006-09-08 04:09:24 UTC
Thinking about some of our use cases for check clauses the other day,
I became slightly more concerned than I had been by the restrictions
placed on the XPath expressions of assertions in our current working
draft.

The use cases provide the following examples of XPath expressions for
'high priority' use cases.  Those marked * are not legal according to
the grammar in our most recent public working draft.  I also give
some variations that do not appear in the wiki, marked + (or +* if
the grammar does not accept them).  Some of these are intended to
have the same semantics when used as assertions; others are intended
to illustrate syntactic limits of the existing grammar.

It's possible that my parser is faulty, so first of all I ask those
interested to see if they agree with me about which of these XPath
expressions are accepted, and which are not accepted, by our current
XPath subset.

  Value-equals test required (1)

  *  @type='BridgeEthernet' & @BrEthernetIP = '' 
  +* @type='BridgeEthernet' and @BrEthernetIP = '' 
  +* @type eq 'BridgeEthernet' and @BrEthernetIP eq '' 
  +  @type eq xsd:string('BridgeEthernet') and @BrEthernetIP eq xsd:string('') 

  * ./@type='BridgedEthernet'
  + ./@type eq xsd:string('BridgedEthernet')

  * ./@type='BridgedEthernet' and not ./@BrEthernetIP
  +* ./@type eq xsd:string('BridgedEthernet') and not ./@BrEthernetIP
  +* ./@type eq xsd:string('BridgedEthernet') and not(./@BrEthernetIP)
  +  ./@type eq xsd:string('BridgedEthernet') and ./@BrEthernetIP


  Value arithmetic required - attributes (2)

  *  @min <= @max
  +  @min le @max

  *  . < ../@min
  *  . le ../@min

  *  @max >= @min
  +  @max ge @min


  Constraints on grandchildren (5)

  Simple attribute implication (6)
  *  ./@attrOne or not(./@attrTwo)
  *  ./@attrOne or not ./@attrTwo
  +  ./@attrOne or ./@attrTwo

    ./@attrOne

  *  not(./@attrOne)

  *  ./@attrTwo and not(@attrOne)

  Attribute mutex (7)
  *  (./@dec or ./@hex) and not(./@dec and ./@hex)
  +* (./@dec or ./@hex) and (./@dec and ./@hex)
  +* (./@dec or ./@hex)
  +  ./@dec or ./@hex

     ./@dec and ./@hex
  +* (./@dec and ./@hex)

  *  not(./@dec) and not(./@hex)

  Open content, sort of (9)

  Value arithmetic required - elements (12)
  *  (./a + ./b + ./c) <= 30
  +* (./a + ./b + ./c) le 30
  +* ./a <= 30
  +* ./a le 30
  +* ./a le xsd:int(30)
  + ./a le xsd:int('30')

  * ./a + ./b + ./c > 30.00


  Require somewhere (20)
  *  count(//buyer) > 0
  +* count(//buyer) gt 0
  +* count(//buyer) gt xsd:int('0')
  +* count(//buyer)

  *  count(//buyer-id | //buyer/@id) > 0
  *  count(//seller) > 0
  *  count(//seller-id | //seller/@id) > 0
  *  count(//binding-jurisdiction) = 1
  *  count(//severability | //nonseverability) = 1
  *  count(//start-date) = 1 and count(//end-date) = 1

  Deep inclusions (23)
  *  not(ancestor::html:form)
  *  not(.//html:input[not(./ancestor::html:form)])
  *  count(.//html:input[ancestor::html:form])
                = count(.//html:input)
  *  count(.//html:form//html:input)
                = count(.//html:input)


I think the bottom line is (a) either the grammar or my parser is
having some trouble with not() and count(), and (b) that if my parser
is correct then our subset is too small, because it makes it too hard
to write useful assertions.
Comment 1 Sandy Gao 2006-09-08 20:53:29 UTC
I was also thinking about expanding the subset. My focus has been:
- Allow "quantified" expression (some/every ... satisfies ...) and possibly "if" expressions (if ... then ... else ...) (it's unfortunate there isn't a short form like if ... then ...)
- Allow more than attributes in predicates (hoping that it's still streamable)

Now how are you suggesting we expand it?
- not/count: in XPath 2.0, I think they became fn:not and fn:count, which are allowed by the grammar. Hum... not sure whether it still allows XPath 1.0 functions without the namespace.
- About casting: I think maybe it's OK to omit xs:string(). Treat it as the default. We can also treat integer literals in the same way. Or we can go to the extreme and omit all constructor functions and implicitly cast the string value to the value space of the other operand.
- Comparison: I think we have to use the 2-letter operators to match XPath 2.0 semantics
- About arithmetic and promotion/casting: this is the discussion we had and I'm inclined to say "no" for now.

Also note that for "Require somewhere", we only allow ".//buyer" and not "//buyer". To make sure "buyer" appears somewhere in the tree, you only need
.//buyer
which is equivalent to
fn:count(.//buyer) > xs:int('0')
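
The equivalence rests on the effective boolean value of a node sequence: a non-empty sequence counts as true, so testing for a descendant and testing that the count is positive give the same answer. A minimal sketch in Python, using the standard library's ElementTree as a stand-in for an XPath 2.0 engine; the sample documents and helper names are hypothetical and do not come from the use cases:

```python
import xml.etree.ElementTree as ET


def has_buyer(context):
    # Analogue of the assertion `.//buyer`: true when the sequence of
    # descendant buyer elements is non-empty.
    return bool(context.findall(".//buyer"))


def buyer_count_positive(context):
    # Analogue of `fn:count(.//buyer) > 0`.
    return len(context.findall(".//buyer")) > 0


# Hypothetical sample documents, invented for this sketch.
with_buyer = ET.fromstring("<contract><parties><buyer/></parties></contract>")
without_buyer = ET.fromstring("<contract><parties><seller/></parties></contract>")

for doc in (with_buyer, without_buyer):
    # Both formulations agree on every document.
    assert has_buyer(doc) == buyer_count_positive(doc)
```

This only illustrates the logic; it says nothing about which of the two forms the subset grammar accepts.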
Comment 2 Michael Kay 2006-09-27 18:00:24 UTC
Here are some other things which I would like to say, but can't:

(1) events must be in chronological order

every $x in event, $y in event satisfies if $y >> $x then $y/date >= $x/date

(2) currency must be one of the currencies in http://example.com/currencies

. = doc('http://example.com/currencies')/currencies/currency

(3) events must not be in the future

date <= current-date()

(4) date must not be a Sunday

(5) height must be a multiple of 0.25
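
Items (4) and (5) are stated without an expression. As a rough illustration of what constraints (1), (4) and (5) would have to check, here is a sketch in Python rather than XPath. The function names are hypothetical, heights are assumed to arrive as decimal strings, and reading (1) as a comparison of adjacent events is an assumption of this sketch:

```python
from datetime import date
from decimal import Decimal


def in_chronological_order(dates):
    # (1) Each event's date is no earlier than that of the event before it;
    # comparing adjacent pairs is enough because <= is transitive.
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))


def is_not_sunday(d):
    # (4) date.weekday() numbers Monday as 0, so Sunday is 6.
    return d.weekday() != 6


def is_multiple_of_quarter(height):
    # (5) Decimal arithmetic avoids binary floating-point rounding.
    return Decimal(height) % Decimal("0.25") == 0


print(in_chronological_order([date(2006, 9, 8), date(2006, 9, 27)]))
print(is_not_sunday(date(2006, 9, 8)))
print(is_multiple_of_quarter("1.75"))
```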

I'm even finding it difficult to write basic co-occurrence tests such as

if (@a > 0) then exists(@b)

I think there are workarounds for most of these within the proposed subset, but some of them are unnecessarily tortuous, for example

not(@a gt 0 and not(@b))
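
The workaround is the usual rewriting of an implication "P implies Q" as "not (P and not Q)". A small truth-table check in Python, offered as a sketch of the logic only; the two boolean inputs stand for "@a is greater than 0" (false also when @a is absent) and "@b is present", and the function names are hypothetical:

```python
from itertools import product


def implication(a_positive, b_present):
    # Intended meaning of `if (@a > 0) then exists(@b)`:
    # a positive @a requires @b; otherwise the assertion holds.
    return b_present if a_positive else True


def workaround(a_positive, b_present):
    # Shape of `not(@a gt 0 and not(@b))`.
    return not (a_positive and not b_present)


# The two forms agree on all four combinations of inputs.
for a_positive, b_present in product([False, True], repeat=2):
    assert implication(a_positive, b_present) == workaround(a_positive, b_present)
```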

The restrictions are so arbitrary that it's going to be very hard for users to remember them, let alone to learn how to work around them.

Michael Kay
Comment 3 C. M. Sperberg-McQueen 2007-02-22 20:21:37 UTC
An update on current status.  At the face-to-face meeting at the end of
October and beginning of November, the WG agreed that in principle
a legal schema can use any legal XPath 2.0 expression as an
assertion.  

To avoid requiring all XSD processors to implement all of
XPath 2.0, the subset defined in the spec will be retained, 
and all schema processors are required to support at least 
that subset of XPath 2.0; other processors may choose to 
support more, or all, of XPath 2.0.  Schema authors who care
more about power than interoperability will choose schema
processors accordingly; schema authors who care about
interoperability more than about power will restrict themselves
to expressions in the subset.  (Schema authors who care about
both power and interoperability will presumably just curse
the Working Group.) 

So a technical direction for resolution of this issue
has been set, although no final wording has been adopted
(and thus the decision is not yet part of the status quo
text).  A wording proposal is expected to go to the Working
Group real soon now, possibly today or tomorrow.
Comment 4 C. M. Sperberg-McQueen 2007-03-27 15:00:49 UTC
The wording proposal mentioned in comment 3 has been discussed at length by
the Working Group and most of it was adopted in the WG call of 23 March 2007.

The part not adopted dealt with the typing of the data model instance;
see bug 4416.

The part adopted makes clear that the XPath subset is not a restriction on the
XPath expressions contained in a legal schema, but an implementation minimum.
Accordingly, I'm marking this issue closed.
Comment 5 C. M. Sperberg-McQueen 2007-05-02 00:11:54 UTC
As the originator of this issue, I assent to the WG's resolution of the
question and accordingly close the bug report.  I note in passing
that some other readers of the XML Schema 1.1 spec are still unhappy 
with the definition of the XPath subset as an implementation minimum,
but I will leave it to those readers to make their views heard.