<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>29496</bug_id>
          
          <creation_ts>2016-02-21 18:39:08 +0000</creation_ts>
          <short_desc>[FO31] parse-ietf-date with military timezones and leniency towards single-digit numbers</short_desc>
          <delta_ts>2016-07-21 15:56:23 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Functions and Operators 3.1</component>
          <version>Candidate Recommendation</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>minor</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Abel Braaksma">abel.braaksma</reporter>
          <assigned_to name="Michael Kay">mike</assigned_to>
          <cc>andrew_coleman</cc>
    
    <cc>debbie</cc>
    
    <cc>liam</cc>
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>125192</commentid>
    <comment_count>0</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2016-02-21 18:39:08 +0000</bug_when>
    <thetext>If I understand the text in the internal draft and CR correctly, the function fn:parse-ietf-date is meant to parse dates in formats approximating RFC-822, RFC-1123, RFC-850, RFC-1036, and POSIX asctime. It is more liberal than the more restrictive grammar in RFC-2616.

I have a few observations:

1) I am missing the military timezones allowed by RFC-822. Since format-dateTime can create them, it seems to make sense to allow them as input as well.

2) In a similar vein, given the note on &quot;be liberal in what to accept&quot;, it seems to make sense to allow unmentioned timezones with an implementation-defined offset. Currently that is an error (but this may well be intentional).

3) The text explains for each absent token or partial token what the default is, but not for fractional seconds. Obviously this must be zero and perhaps it is a bit too pedantic to add it, but nevertheless, all the other optional parts of the grammar have such a mention.

4) The Note on leniency towards single-digit vs double-digit numeric values says &quot;Accepts a single-digit value in place of a two-digit value with a leading zero&quot;. This appears to imply &quot;in a place where two digits can be replaced by a single digit then...&quot;. But the grammar only allows this for the daynum, not for hours. Is &quot;3:45&quot; to be treated as an error or may it be parsed as &quot;03:45&quot;? If the latter was the intent of this Note, I think the grammar should reflect that, or the Note could perhaps give it as an example (or conversely, mention specifically that *only* daynum can be treated this way).

5) Perhaps the 4th paragraph of the Note could be written as follows to reflect point (4) above or, more generally, to remove the impression that the grammar should not be taken too strictly (which I doubt is the intent):

Suggestion to replace: &quot;Reflecting the internet tradition of being liberal in what is accepted, the function also:&quot;

with: &quot;Reflecting the internet tradition of being liberal in what is accepted, the grammar of the function deliberately accepts:&quot;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125441</commentid>
    <comment_count>1</comment_count>
    <who name="Andrew Coleman">andrew_coleman</who>
    <bug_when>2016-03-11 13:25:17 +0000</bug_when>
    <thetext>The WG agreed on 2016-03-01:

DECISION: (bug 29496) accepted one technical change, allowing the hours value to be single digit, plus editorial clarifications as suggested in points (3) and (5).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125577</commentid>
    <comment_count>2</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-03-21 22:24:31 +0000</bug_when>
    <thetext>The changes have been applied.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125581</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-03-22 09:30:26 +0000</bug_when>
    <thetext>Note: I interpreted &quot;the hours component&quot; to include both the hours part of the time, and the hours part of the timezone.

A test case has been added.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125701</commentid>
    <comment_count>4</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2016-04-03 23:21:15 +0000</bug_when>
    <thetext>I took a moment to review the changes in the internal WD as written now, and it looks like the text and the grammar were updated as expected. The suggested text for the &quot;liberal in what to accept&quot; was changed differently than proposed, but I think the revised text is better/clearer. Thanks.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126044</commentid>
    <comment_count>5</comment_count>
    <who name="Debbie Lockett">debbie</who>
    <bug_when>2016-04-22 16:32:54 +0000</bug_when>
    <thetext>Being picky, please can you also update the following sentence of the Rules for parse-ietf-date():

&quot;If a tzoffset is supplied then its first two digits supply the hours part of the timezone offset, and its next two digits, if present, supply the minutes part.&quot;

to say &quot;...its first one or two digits...&quot;

(Note there were a couple of other tests, parse-ietf-date-errs5 and parse-ietf-date-errs28, which expected errors for single digit hours components, which I have modified.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126156</commentid>
    <comment_count>6</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-04-26 14:21:33 +0000</bug_when>
    <thetext>Having slight difficulty working out how best to say this in a way that actually tells people that a tzoffset of 130 means one hour and 30 minutes, without relying on the reader&apos;s common sense, or at the other extreme treating them like idiots...

The production rule is

tzoffset ::= (&quot;+&quot;|&quot;-&quot;) hours &quot;:&quot;? minutes?

where 

hours	::=	digit digit?
minutes	::=	digit digit
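
(Illustration only, not spec text: a minimal Python sketch of one way to read the production above, so that a tzoffset of 130 comes out as one hour and 30 minutes. The name parse_tzoffset is hypothetical, the result is expressed in minutes purely for convenience, and range checks on hours and minutes are deliberately left out.)

```python
DIGITS = "0123456789"


def all_digits(part):
    """True if part is non-empty and made up of ASCII digits only."""
    return len(part) != 0 and all(ch in DIGITS for ch in part)


def parse_tzoffset(text):
    """Return the signed offset in minutes for a tzoffset such as '+130'."""
    if len(text) == 0 or text[0] not in "+-":
        raise ValueError("tzoffset must start with + or -")
    sign = -1 if text[0] == "-" else 1
    body = text[1:]
    if ":" in body:
        # A colon separates hours from the optional minutes.
        hours, _, minutes = body.partition(":")
    elif len(body) in (1, 2):
        # One or two digits: hours only, minutes default to zero.
        hours, minutes = body, ""
    else:
        # Three or four digits: the last two digits are the minutes.
        hours, minutes = body[:-2], body[-2:]
    hours_ok = all_digits(hours) and len(hours) in (1, 2)
    minutes_ok = minutes == "" or (all_digits(minutes) and len(minutes) == 2)
    if not (hours_ok and minutes_ok):
        raise ValueError("not a valid tzoffset: " + text)
    return sign * (int(hours) * 60 + int(minutes or "0"))
```

Under this reading, +130, +0130, +1:30 and +01:30 all denote the same offset, while a five-digit body is rejected.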

So assuming people know how to parse from BNF, I think the best way would be to refer to the parts of the production rule:

&quot;If a @tzoffset@ is supplied then @hours@ supplies the hours part of the timezone offset, and @minutes@, which defaults to zero if absent, supplies the minutes part.&quot;

That&apos;s dangerously close to being tautological but I think the font changes make clear the distinction between syntactic components of the supplied string and semantic components of the resulting value.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126214</commentid>
    <comment_count>7</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2016-04-27 17:49:59 +0000</bug_when>
    <thetext>I&apos;d like to suggest disallowing 130 and allowing 1:30, 01:30 and 0130. I don&apos;t think anyone would expect military time (which I believe this format comes from) to be other than four digits.

That would mean a slight change in the production rules, for instance:

tzoffset ::= (&quot;+&quot;|&quot;-&quot;) (hours (&quot;:&quot; minutes)? | miltime)

miltime		::=	milhours minutes
milhours	::=	digit digit
hours		::=	digit digit?
minutes		::=	digit digit</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126234</commentid>
    <comment_count>8</comment_count>
    <who name="Liam R E Quin">liam</who>
    <bug_when>2016-04-27 21:50:32 +0000</bug_when>
    <thetext>I don&apos;t think it&apos;s worth making the change Abel suggests in comment 7. I&apos;d have to go back and check the RFCs and implementations and data to see if we&apos;d be rejecting in-use values used by automatically-generated datestamps (the primary use case for this function), which I could do but would rather not. People have implemented what they&apos;ve implemented at this point, I expect, and I don&apos;t think rejecting values because they look odd to us is a good approach.

A note that 130 (for example) is short for 0130 or 01:30 might be helpful.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126245</commentid>
    <comment_count>9</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2016-04-28 00:52:52 +0000</bug_when>
    <thetext>(In reply to Liam R E Quin from comment #8)
&gt; I&apos;d have to go back and check the RFCs and implementations and data to see if
&gt; we&apos;d be rejecting in-use values used by automatically-generated datestamps
Probably not needed, as my proposal is a partial reversion of a change following the accepted proposal in comment#1 and the editorial license mentioned in comment#3. In fact, I think it is closer to the original decision of comment#1. 

In other words, we remove something that was accidentally added.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126247</commentid>
    <comment_count>10</comment_count>
    <who name="Liam R E Quin">liam</who>
    <bug_when>2016-04-28 02:56:10 +0000</bug_when>
    <thetext>Thanks, Abel. I agree that a TZ offset of 130 is weird but I don&apos;t see it as a problem. Nonetheless I&apos;m also OK with disallowing it and allowing only (1:30, 01:30, 0130). It doesn&apos;t come up often in practice but it does happen (Newfoundland is an example). So on rereading in that spirit I&apos;m OK with your comment 7, although let&apos;s use tzoffset and not miltime for the name: &quot;Military time&quot; is sometimes used to mean the 24-hour clock system, and I don&apos;t think you&apos;re proposing to disallow 130 as equivalent to 1:30am in the time part.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126363</commentid>
    <comment_count>11</comment_count>
    <who name="Andrew Coleman">andrew_coleman</who>
    <bug_when>2016-05-06 09:56:27 +0000</bug_when>
    <thetext>At the meeting on 2016-05-03, the WG decided to retain the status quo with no further changes, allowing TZ 130.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>127020</commentid>
    <comment_count>12</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-07-21 15:56:23 +0000</bug_when>
    <thetext>Comment #5 had been overlooked. I have amended the relevant paragraph to read:

* If it contains a colon, this separates the hours part from the minutes part.

* Otherwise, the grammar allows a sequence of from one to four digits. These are interpreted as &lt;code&gt;H&lt;/code&gt;, &lt;code&gt;HH&lt;/code&gt;, &lt;code&gt;HMM&lt;/code&gt;, or &lt;code&gt;HHMM&lt;/code&gt; respectively, where &lt;code&gt;H&lt;/code&gt; or &lt;code&gt;HH&lt;/code&gt; is the hours part, and &lt;code&gt;MM&lt;/code&gt; (if present) is the minutes part.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>