<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>10261</bug_id>
          
          <creation_ts>2010-07-29 14:29:50 +0000</creation_ts>
          <short_desc>[F&amp;O] return value of fn:replace and fn:tokenize if $pattern does not match</short_desc>
          <delta_ts>2010-10-20 07:18:18 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Functions and Operators 3.0</component>
          <version>Working drafts</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows XP</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>minor</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Herbert Oppmann">Herbert.Oppmann</reporter>
          <assigned_to name="Michael Kay">mike</assigned_to>
          
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>37164</commentid>
    <comment_count>0</comment_count>
    <who name="Herbert Oppmann">Herbert.Oppmann</who>
    <bug_when>2010-07-29 14:29:50 +0000</bug_when>
    <thetext>Regarding: Recommendation 23 January 2007, chapter 7.6.3 fn:replace and 7.6.4 fn:tokenize

The document does not explicitly describe what fn:replace and fn:tokenize return if $pattern does not match $input.

I tried it with Altova XMLSpy, the only implementation of XSLT 2.0 I have, and it returns $input unchanged.

This makes sense, and should be documented.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>37166</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2010-07-29 15:35:16 +0000</bug_when>
    <thetext>Thanks for the comment.

Concerning fn:replace, the summary says: &quot;The function returns the xs:string that is obtained by replacing each non-overlapping substring of $input that matches the given $pattern with an occurrence of the $replacement string.&quot;

Admittedly this relies on an understanding of the English phrase &quot;replacing a substring of a string&quot;, and in particular the implication that the rest of the string (outside that substring) is left unchanged; but given that expansion, I don&apos;t see how one can interpret this in any way other than meaning if there are no matches, no replacements are made, and therefore the original string is returned unchanged. (OK, perhaps it&apos;s also relying on the mathematician&apos;s understanding of &quot;each&quot; as opposed to the everyday understanding.)
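
As a rough illustration of that reading, here is a minimal sketch using Python's re module as a stand-in. It is not an XPath processor, its regex dialect differs from the one in F&amp;O, and the helper name replace_all is hypothetical. It only shows the general idea that text outside the matches is kept, so no match means the input comes back unchanged.

```python
import re

def replace_all(text, pattern, replacement):
    # Hypothetical helper: substitute every non-overlapping match of
    # pattern; any text outside the matches is left as it was.
    return re.sub(pattern, replacement, text)

print(replace_all("a1b22", r"\d+", "#"))  # matches found: prints a#b#
print(replace_all("abc", r"\d+", "#"))    # no match: prints abc
```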

Similarly the summary of fn:tokenize is &quot;This function breaks the $input string into a sequence of strings, treating any substring that matches $pattern as a separator. The separators themselves are not returned.&quot; 

Again this is informal language, but I think the only possible interpretation is that if there are no matches then there are no separators and therefore the result sequence contains a single string which is the same as the original.
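
The same reading can be sketched with Python's re.split, again only as an analogue and not as fn:tokenize itself. The helper name tokenize is an assumption for illustration, and the pattern is assumed to contain no capturing groups, since re.split would otherwise also return the captured separators.

```python
import re

def tokenize(text, pattern):
    # Hypothetical helper: split text wherever pattern matches and
    # drop the separators (pattern is assumed to have no capturing groups).
    return re.split(pattern, text)

print(tokenize("a, b,c", r",\s*"))  # separators found: prints ['a', 'b', 'c']
print(tokenize("abc", r",\s*"))     # no separator: prints ['abc']
```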

My recommendation to the WG is to treat this as editorial, and to try and improve the text for the next (&quot;1.1&quot;) version by making the language a bit more formal. In particular, I&apos;ve been trying in 1.1 to ensure that the function summaries contain no information that does not appear explicitly in the rules, and these two functions don&apos;t satisfy that rule (in the 1.0/2.0 spec, very few did).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>37187</commentid>
    <comment_count>2</comment_count>
    <who name="Herbert Oppmann">Herbert.Oppmann</who>
    <bug_when>2010-07-30 09:41:33 +0000</bug_when>
    <thetext>Dear Michael,

Thanks for your quick response. Your recommendation to the WG is perfectly fine with me.

I was just not sure whether I could rely on these functions returning the input string unchanged if the pattern does not match, because this was a bit &quot;between the lines&quot;, while all the other aspects of the behaviour of these functions were so clear and formal.

Thanks again.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>37189</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2010-07-30 10:18:06 +0000</bug_when>
    <thetext>An observation: in 1.1/2.1 we could define fn:tokenize(S, P, F) to be the value of fn:analyze-string(S, P, F)/*/fn:match/string().

Defining fn:replace(S, P, R, F) in terms of fn:analyze-string(S, P, F) is rather more challenging because of the complexities of captured groups, but it ought to be possible to do it.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>41565</commentid>
    <comment_count>4</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2010-10-19 16:00:14 +0000</bug_when>
    <thetext>Changed status to editorial, and version to 1.1; closing as FIXED on the basis that the editor has agreed to clarify the text (e.g. by adding examples) without changing the technical nature of the specification.

Originator: please mark as CLOSED to indicate your acceptance of this resolution.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>