<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>24385</bug_id>
          
          <creation_ts>2014-01-24 16:33:18 +0000</creation_ts>
          <short_desc>[FO30] Unclarity in last-line resolution for fn:unparsed-text-lines()</short_desc>
          <delta_ts>2014-02-19 09:17:58 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Functions and Operators 3.0</component>
          <version>Proposed Recommendation</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Abel Braaksma">abel.braaksma</reporter>
          <assigned_to name="Michael Kay">mike</assigned_to>
          <cc>jim.melton</cc>
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>99010</commentid>
    <comment_count>0</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2014-01-24 16:33:18 +0000</bug_when>
    <thetext>The last sentence under fn:unparsed-text-lines() [1] reads:

&quot;but if the external resource ends with a newline sequence, no zero-length string will be returned as the last item in the result.&quot;

and the example given is:

fn:tokenize(fn:unparsed-text($href), &apos;\r\n|\r|\n&apos;)[not(position()=last() and .=&apos;&apos;)]

The prose and the example are slightly ambiguous about multiple empty lines at the end of the input. The prose seems to imply that every trailing empty line is removed, i.e. &quot;no zero-length string will be returned as the last item in the result&quot;. But the normative code snippet strips only the last empty line.

We&apos;ve currently implemented this to strip at most one line from the end, and only if it is empty. I think this is correct, but under another reading of the text above, all empty lines at the end should be removed.
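
As a rough illustration of the strip-at-most-one reading, here is a small Python model of the snippet above. The function name model_unparsed_text_lines is hypothetical, and Python's re.split stands in for fn:tokenize; this is an assumption made for the sketch, not an XPath implementation.

```python
import re


def model_unparsed_text_lines(text):
    """Hypothetical model of:
    tokenize($text, '\r\n|\r|\n')[not(position()=last() and .='')]
    """
    # fn:tokenize returns the empty sequence for a zero-length input,
    # whereas re.split would return [''], so handle that case explicitly.
    if text == "":
        return []
    # \r\n is listed first so a CRLF pair counts as one line ending.
    lines = re.split(r"\r\n|\r|\n", text)
    # The predicate drops only the last item, and only if it is empty.
    if lines[-1] == "":
        lines = lines[:-1]
    return lines


print(model_unparsed_text_lines("a\nb\n"))      # ['a', 'b']
print(model_unparsed_text_lines("a\nb\n\n\n"))  # ['a', 'b', '', '']
print(model_unparsed_text_lines("\n\n\n"))      # ['', '', '']
print(model_unparsed_text_lines(""))            # []
```

Under this model only one trailing empty string is dropped, so an input ending in three newline sequences still yields two empty strings at the end.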

This also raises the question of the corner case where the input consists solely of empty lines. Going by the example code, zero or one empty line returns the empty sequence, while more empty lines return a sequence of empty strings, one fewer than the number of empty lines in the input.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>100859</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2014-02-19 09:17:44 +0000</bug_when>
    <thetext>The WG examined this and agreed that the text as written was capable of being misinterpreted.

The intended meaning (in line with the usual behaviour of regular expressions in other languages) is that at most one trailing newline should be ignored. To clarify, the text in both 3.0 and 3.1 has been changed to read:

If there are two adjacent newline sequences, a zero-length string will be returned to represent the empty line; but if the external resource ends with the sequence x0A, x0D, or x0Dx0A, the result will be as if this final line ending were not present.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>