This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 2711 - [xqueryx] #) in pragma content
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQueryX 1.0
Version: Candidate Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2006-01-13 11:56 UTC by David Carlisle
Modified: 2006-01-26 11:00 UTC

Description David Carlisle 2006-01-13 11:56:06 UTC
The XQuery EBNF for pragma contents is
[67]    PragmaContents    ::=    (Char* - (Char* '#)' Char*))

However, the schema simply types xqx:pragmaContents as xs:string, and the
stylesheet doesn't enforce any restriction on #), so the following is a
schema-valid XQueryX file which translates to a valid, executable XQuery
expression:

<xqx:module xmlns:xqx="http://www.w3.org/2005/XQueryX">
   <xqx:mainModule>
      <xqx:queryBody>
         <xqx:extensionExpr>
            <xqx:pragma>
               <xqx:pragmaName>a</xqx:pragmaName>
               <xqx:pragmaContents> #){1},1+2,(#b</xqx:pragmaContents>
            </xqx:pragma>
            <xqx:argExpr>
               <xqx:integerConstantExpr>
                  <xqx:value>1</xqx:value>
               </xqx:integerConstantExpr>
            </xqx:argExpr>
         </xqx:extensionExpr>
      </xqx:queryBody>
   </xqx:mainModule>
</xqx:module>

Its meaning is specified by the result of transforming it with the stylesheet,
which is
(# a  #){1},1+2,(#b #){1}

This evaluates to the sequence 1 3 1 (assuming the pragma QNames a and b are
unknown).
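
To make the injection concrete, here is a minimal Python sketch of an unchecked serialization. The function serialize_extension_expr is a hypothetical stand-in for the stylesheet, not the stylesheet itself; it only assumes plain string concatenation of the pragma name, the pragma contents, and the argument expression, which is consistent with the output shown above.

```python
def serialize_extension_expr(pragma_name: str, pragma_contents: str, arg_expr: str) -> str:
    """Hypothetical stand-in for the XQueryX stylesheet's pragma output.

    Assumption: the parts are concatenated with no check on the contents.
    """
    return "(# " + pragma_name + " " + pragma_contents + " #){" + arg_expr + "}"


# The pragma contents from the report close the first pragma early,
# smuggle in ",1+2," and open a second pragma named b.
query = serialize_extension_expr("a", " #){1},1+2,(#b", "1")
print(query)  # (# a  #){1},1+2,(#b #){1}
```

The resulting text contains two pragma terminators, so an XQuery parser sees three comma-separated expressions rather than one extension expression.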


Of course, the "1+2" above could be any XQuery expression, which means that an
XQueryX engine cannot just use an XML parser but must be able to parse full
XQuery syntax as well.

This could be fixed by adding a pattern facet to the schema, or a check in the
stylesheet that gives a fatal error if #) appears in xqx:pragmaContents.
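
The second option, an explicit check, can be sketched outside XSLT as well. The Python below is only an illustration of the idea, assuming the XQueryX namespace used in the example document; the function name find_bad_pragma_contents and the SAMPLE document are made up for this sketch.

```python
import xml.etree.ElementTree as ET
from typing import List

XQX_NS = "http://www.w3.org/2005/XQueryX"


def find_bad_pragma_contents(xqueryx_source: str) -> List[str]:
    """Return the text of every xqx:pragmaContents element that contains '#)'."""
    root = ET.fromstring(xqueryx_source)
    bad = []
    for element in root.iter("{%s}pragmaContents" % XQX_NS):
        text = "".join(element.itertext())
        if "#)" in text:
            bad.append(text)
    return bad


# Abbreviated version of the document from the report.
SAMPLE = """<xqx:module xmlns:xqx="http://www.w3.org/2005/XQueryX">
  <xqx:mainModule><xqx:queryBody><xqx:extensionExpr>
    <xqx:pragma>
      <xqx:pragmaName>a</xqx:pragmaName>
      <xqx:pragmaContents> #){1},1+2,(#b</xqx:pragmaContents>
    </xqx:pragma>
    <xqx:argExpr><xqx:integerConstantExpr>
      <xqx:value>1</xqx:value>
    </xqx:integerConstantExpr></xqx:argExpr>
  </xqx:extensionExpr></xqx:queryBody></xqx:mainModule>
</xqx:module>"""

offending = find_bad_pragma_contents(SAMPLE)
if offending:
    print("fatal: '#)' found in pragmaContents:", offending)
```

A processor taking this route would reject the document before translating it, rather than relying on schema validation.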

David
Comment 1 Jim Melton 2006-01-25 04:19:25 UTC
Good observation! 

With a little help from my friends (thanks, Liam!), I've modified the XQueryX
schema to use the following pattern, which will exclude (i.e., fail validation)
XQueryX documents containing a pragmaContents that contains the literal
sequence #).

The pattern is:
   <xsd:pattern value="[^#]*([^\)]*[^#]\))*[^\)]*"/>

Since this does exactly what you requested, I assume that you will be willing to
mark this bug CLOSED.  Of course, if you believe that this pattern does not do
the job fully or correctly, please let me know. 
Comment 2 David Carlisle 2006-01-25 12:09:16 UTC
> The pattern is:
>    <xsd:pattern value="[^#]*([^\)]*[^#]\))*[^\)]*"/>
> 

I think that allows (any string containing) #))

#)) matches as shown (using . to denote a null string)

[^#]*([^\)]*[^#]\))*[^\)]*
  .     #     )  )   .
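
The match traced above can be reproduced with a short Python sketch. It assumes that re.fullmatch is an adequate stand-in for the implicit whole-string anchoring of an XSD pattern facet, which holds for a pattern this simple but is not true of the two regex flavors in general.

```python
import re

# First pattern proposed for the schema (Comment 1).
FIRST_PATTERN = re.compile(r"[^#]*([^\)]*[^#]\))*[^\)]*")

# "#))" contains the forbidden sequence "#)" yet matches the whole pattern:
# the group consumes "#" via [^\)]*, ")" via [^#], and ")" via \).
accepts_double_paren = FIRST_PATTERN.fullmatch("#))") is not None

# The bare "#)" is still rejected by this pattern.
accepts_bare = FIRST_PATTERN.fullmatch("#)") is not None

print(accepts_double_paren, accepts_bare)  # True False
```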


I think probably

       <xsd:pattern value="[^#]*([^#]*#+[^\)]+)*#*"/>

does the job, although thinking about it gave me a headache:-)
Comment 3 Jim Melton 2006-01-25 18:12:33 UTC
Sigh...thanks for finding that error.  With Michael Sperberg-McQueen's help,
here's another (simpler, I think) pattern regex that doesn't have the problem
you identified:

(([^#]|#+[^\)#])*#*)

All of the test cases (about 50 or 60) I've tried on this one seem to succeed. 
If you find an error on this, please report it and we'll keep on trying. 
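
For anyone who wants to try more cases, the pattern can be compared against a plain substring test. This is a sketch in Python, again assuming re.fullmatch approximates XSD's implicit anchoring; the helper names and the small alphabet are choices made for this illustration, not part of the schema.

```python
import re
from itertools import product
from typing import List

# Final pattern from this comment.
FINAL_PATTERN = re.compile(r"(([^#]|#+[^\)#])*#*)")


def is_valid_pragma_contents(text: str) -> bool:
    """True if the whole string matches the final pattern."""
    return FINAL_PATTERN.fullmatch(text) is not None


def disagreements(alphabet: str = "#)a", max_length: int = 5) -> List[str]:
    """Strings where the pattern and the test '"#)" not in s' disagree."""
    found = []
    for length in range(max_length + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if is_valid_pragma_contents(candidate) != ("#)" not in candidate):
                found.append(candidate)
    return found


for sample in ["", "a#b", "##", "(#b", "#)", "#))", " #){1},1+2,(#b"]:
    print(repr(sample), is_valid_pragma_contents(sample))

print("disagreements:", disagreements())
```

An empty list of disagreements only covers short strings over a three-character alphabet, so it is supporting evidence rather than a proof.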
Comment 4 David Carlisle 2006-01-26 11:00:09 UTC
> (([^#]|#+[^\)#])*#*)

Looks good to me; closing this report. David