<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>21902</bug_id>
          
          <creation_ts>2013-05-02 12:47:50 +0000</creation_ts>
          <short_desc>#url-code-points: Add notes that point out the need/lack of need to escape certain code points</short_desc>
          <delta_ts>2015-08-19 09:09:48 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>URL</component>
          <version>unspecified</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://url.spec.whatwg.org/#url-code-points</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Leif Halvard Silli">xn--mlform-iua</reporter>
          <assigned_to name="Anne">annevk</assigned_to>
          <cc>annevk</cc>
    
    <cc>mathias</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>rubys</cc>
    
    <cc>shadow2531</cc>
    
    <cc>xn--mlform-iua</cc>
          
          <qa_contact>sideshowbarker+urlspec</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>87164</commentid>
    <comment_count>0</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2013-05-02 12:47:50 +0000</bug_when>
    <thetext>The #url-code-points paragraph should be annotated with more notes that point out important implications and details related to the characters included in or excluded from the list of URL code points.

I propose that you consider clarifying the motivation for the current note (1), add two more notes (2), (3), and consider a fourth note (4):

(1) The #url-code-points paragraph is already accompanied by a note about the URL parser’s behavior, whose relevance to URL writing is unexplained. That these code points do not need escaping is obvious from their inclusion in the list of code point ranges, and the fact that they are escaped by the URL parser should have little bearing on authoring, no? Please consider adding a hint about why you wanted to point this out.

(2) Add a note that points out that code points *not* listed amongst the URL code points need to be escaped. Feel free to list all of these code points, but at any rate explicitly mention at least the common code points in need of escaping, such as U+0009, U+000A, and U+000D, and please include their Unicode names as well, to help readers! (The &apos;#&apos; may belong in this category as well, whenever its &apos;fragment semantics&apos; should be escaped.)

(3) Add a note about which of the ‘special characters’ that *are* listed need to be percent-encoded whenever their URL-specific functions need to be escaped. This includes characters such as ?, / etc. (And if escaping is not necessary for some of these special characters, then that is unexpected, and thus ought to be pointed out as well.)

(4) Consider adding a note about format-specific considerations, e.g. the need to escape &lt; and &amp; in XML.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>106632</commentid>
    <comment_count>1</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2014-05-22 10:33:56 +0000</bug_when>
    <thetext>1) I pointed this out because otherwise you might be surprised by what the API returns.

2-4) I think you make some valid points. I&apos;m tempted to wait with fixing this until I use Bikeshed and can include railroad diagrams and such. If you have suggestions for clearer notes, that&apos;d be most welcome.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114567</commentid>
    <comment_count>2</comment_count>
    <who name="Sam Ruby">rubys</who>
    <bug_when>2014-11-05 21:29:28 +0000</bug_when>
    <thetext>Does http://intertwingly.net/projects/pegurl/url.html help?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>122650</commentid>
    <comment_count>3</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2015-08-19 09:09:48 +0000</bug_when>
    <thetext>Railroad diagrams are now tracked in https://github.com/whatwg/url/issues/67 but seem potentially problematic.

I added a note to clarify what percent-encoded bytes are good for:

https://github.com/whatwg/url/commit/fef9bcec9615d92695503107732a9cd5f9d05ab8

I didn&apos;t go into as much detail as requested, since I don&apos;t think that&apos;s warranted here. We just want to describe the data model, the syntax, the parser, and the various operations around them. On top of that, folks can build whatever they want.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>