This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
I was reading the latest working draft of "XQuery 1.0 and XPath 2.0 Functions and Operators" at http://www.w3.org/TR/xpath-functions/ and felt that the fn:tokenize function (described in section 7.6.4) could be improved. At present, tokenize breaks the input string into a sequence of strings and discards the delimiters.

I'll illustrate the problem I am facing with an example (tested with Saxon 8.4). I want to tokenize a string on any capital letter, so A, B, C ... Z are the possible delimiters. I can do this with the tokenize function and a regular expression, as below:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text" />
      <xsl:variable name="tempstr" select="'HelloThere'" />
      <xsl:template match="/">
        <xsl:for-each select="tokenize($tempstr, '[A-Z]')">
          <xsl:value-of select="." /><xsl:text> </xsl:text>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

This gives the output

    ello here

That's fine, but I have no access to the current delimiter (it varies from one iteration to the next). I propose a function like "fn:delim() as xs:string" which would return the delimiter in context (conceptually similar to the position() function). For example, I would be able to modify the above example to:

    <xsl:for-each select="tokenize($tempstr, '[A-Z]')">
      <xsl:value-of select="delim()" /><xsl:value-of select="." /><xsl:text> </xsl:text>
    </xsl:for-each>

This would return the output

    Hello There

I guess it would be useful.

Regards,
Mukul
Thanks for the comment, Mukul.

We did try to design a function that provided this capability but found that it was too difficult to do as a pure function because of the complexity of the result. Providing access to a secondary result using an ancillary function delim() might seem natural in an XSLT context, but it doesn't fit the stricter functional style of XPath and XQuery. XQuery avoids such context-dependent functions because they make the job of the optimizer much harder.

So instead we provided this functionality in XSLT through the xsl:analyze-string instruction, which has two sub-instructions, xsl:matching-substring and xsl:non-matching-substring. This is similar to tokenize() except that both the tokens and the separators are returned (and you can also get access to the matched subgroups within the matched pattern using the ancillary regex-group() function).

You might also be interested to know that in Saxon I have provided the functionality of xsl:analyze-string as an extension function, saxon:analyze-string, so that it is available to XQuery users. This exploits Saxon's support for higher-order functions: once XQuery supports higher-order functions in some future release it will be much easier to design functions that do this kind of job.

This is a personal response; you will get an official response from the WGs in due course.

Michael Kay
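As a rough illustration of the xsl:analyze-string approach described above, here is a minimal sketch that reworks the original 'HelloThere' example. It assumes an XSLT 2.0 processor such as Saxon 8.x; the variable name and the regular expression are taken from the original report, while the choice to emit a space after each non-matching substring is only an assumption made to mirror the requested output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:variable name="tempstr" select="'HelloThere'"/>

  <xsl:template match="/">
    <xsl:analyze-string select="$tempstr" regex="[A-Z]">
      <!-- Evaluated once per delimiter; "." is the capital letter that matched -->
      <xsl:matching-substring>
        <xsl:value-of select="."/>
      </xsl:matching-substring>
      <!-- Evaluated once per run of text between delimiters; "." is that text -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
        <!-- Assumption: separate the rebuilt tokens with a single space -->
        <xsl:text> </xsl:text>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should print "Hello There" followed by a trailing space: each delimiter is written out by xsl:matching-substring immediately before the token that follows it, which is the effect the proposed delim() function was meant to achieve.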
Hi Mike,

I am curious to know whether there has been any progress on this suggestion, and how the XSL WG felt about it. I am also keen to understand the implications, i.e. how this may or may not fit the functional programming style of XSLT 2.0, and what benefit the user would get. I am a newbie to the functional programming style, so I'd be grateful if you could elaborate on this topic.

Regards,
Mukul
There was not enough support for this enhancement during the discussion by the joint QT WGs on 5/19/2005. It was suggested that the requested functionality could be achieved via the XSLT xsl:analyze-string instruction, as discussed in the post by Michael Kay.

Ashok Malhotra