<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>27257</bug_id>
          
          <creation_ts>2014-11-06 12:19:09 +0000</creation_ts>
          <short_desc>anyURI_b006 seems to be valid</short_desc>
          <delta_ts>2014-11-07 12:16:36 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XML Schema Test Suite</product>
          <component>Microsoft tests</component>
          <version>2006-11-06</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Georgiy Rakov">georgiy.rakov</reporter>
          <assigned_to name="C. M. Sperberg-McQueen">cmsmcq</assigned_to>
          <cc>ht</cc>
          
          <qa_contact name="XML Schema Test Suite mailing list">public-xml-schema-testsuite</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>114600</commentid>
    <comment_count>0</comment_count>
    <who name="Georgiy Rakov">georgiy.rakov</who>
    <bug_when>2014-11-06 12:19:09 +0000</bug_when>
    <thetext>Bug 4048 [1] resulted in the expected result for the anyURI_b006 test being marked as &quot;invalid&quot;, because &quot;//&quot; (double slash) was considered an invalid URI. However, according to the reading of rfc2396 [2] presented below, a double slash should be considered a valid URI.

Section &quot;5. Relative URI References&quot; from rfc2396.txt [2] states that:

   A relative reference beginning with two slash characters is termed a
   network-path reference, as defined by &lt;net_path&gt; in Section 3.  

Section &quot;3. URI Syntactic Components&quot; from rfc2396 [2] states:

      net_path      = &quot;//&quot; authority [ abs_path ]

Section &quot;3.2. Authority Component&quot; from rfc2396 [2] states:

      authority     = server | reg_name

So if the &apos;server&apos; component can be empty, then &apos;//&apos; should be considered a valid URI. According to the following reasoning, the &apos;server&apos; component can be empty.

Section &quot;3.2.2. Server-based Naming Authority&quot; from rfc2396 [2] states:

      server        = [ [ userinfo &quot;@&quot; ] hostport ]

That is, according to the BNF rules above, the &apos;server&apos; component is allowed to be empty, so &apos;//&apos; can be considered a relative network-path reference with an empty authority.
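
As a minimal sketch of the reasoning above (an illustration only, not part of rfc2396 or of the test suite), the quoted productions can be modelled with a Python regular expression. The character classes used for userinfo and hostport, the loose treatment of abs_path, and the name is_net_path are simplifying assumptions; only the optionality of the server component is taken from the quoted BNF.

```python
import re

# Assumption: simplified stand-ins for productions that are not quoted above.
# USERINFO covers only a subset of the characters rfc2396 allows.
USERINFO = r"[A-Za-z0-9._~!*'();:=+$,%-]*"
HOSTPORT = r"[A-Za-z0-9.-]+(?::[0-9]*)?"

# server = [ [ userinfo "@" ] hostport ]
# The outer brackets make the whole production optional.
SERVER = rf"(?:(?:{USERINFO}@)?{HOSTPORT})?"

# net_path = "//" authority [ abs_path ], with authority restricted to server.
# Assumption: abs_path is approximated as a slash followed by anything.
NET_PATH = re.compile(rf"//{SERVER}(?:/.*)?")


def is_net_path(text):
    """Return True if text matches the simplified net_path sketch."""
    return NET_PATH.fullmatch(text) is not None
```

Under these assumptions is_net_path("//") returns True, because both the bracketed server production and the optional abs_path can match the empty string.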

I understand that section 3.2.2 of rfc2396 [2] begins by stating:

   URL schemes that involve the direct use of an IP-based protocol to a
   specified server on the Internet use a common syntax for the server
   component of the URI&apos;s scheme-specific data:

      &lt;userinfo&gt;@&lt;host&gt;:&lt;port&gt;

   where &lt;userinfo&gt; may consist of a user name and, optionally, scheme-
   specific information about how to gain authorization to access the
   server. The parts &quot;&lt;userinfo&gt;@&quot; and &quot;:&lt;port&gt;&quot; may be omitted.

So from:
1. the definition &apos;&lt;userinfo&gt;@&lt;host&gt;:&lt;port&gt;&apos;, and
2. the excerpt above, &apos;The parts &quot;&lt;userinfo&gt;@&quot; and &quot;:&lt;port&gt;&quot; may be omitted&apos;,
it appears to follow that the &apos;&lt;host&gt;&apos; part is mandatory.
However, section &quot;1.6. Syntax Notation and Common Elements&quot; states:

   This document uses two conventions to describe and define the syntax
   for URI.  The first, called the layout form, is a general description
   of the order of components and component separators, as in

      &lt;first&gt;/&lt;second&gt;;&lt;third&gt;?&lt;fourth&gt;

   The component names are enclosed in angle-brackets and any characters
   outside angle-brackets are literal separators.  Whitespace should be
   ignored.  These descriptions are used informally and do not define
   the syntax requirements.

In particular, it says: &quot;These descriptions are used informally and do not define the syntax requirements.&quot; Hence I believe no conclusions about syntax should be drawn from the layout-form definition &apos;&lt;userinfo&gt;@&lt;host&gt;:&lt;port&gt;&apos; of the &apos;server&apos; component.

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=4048
[2] http://www.ietf.org/rfc/rfc2396.txt</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114601</commentid>
    <comment_count>1</comment_count>
    <who name="Henry S. Thompson">ht</who>
    <bug_when>2014-11-06 13:17:09 +0000</bug_when>
    <thetext>2396 was obsoleted by 3986 [3], whose BNF does _not_ allow the
authority to be empty:

  relative-part = &quot;//&quot; authority path-abempty
  authority     = [ userinfo &quot;@&quot; ] host [ &quot;:&quot; port ]

ht

[3] http://tools.ietf.org/html/rfc3986</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114602</commentid>
    <comment_count>2</comment_count>
    <who name="Georgiy Rakov">georgiy.rakov</who>
    <bug_when>2014-11-06 14:03:54 +0000</bug_when>
    <thetext>Yes, but XML Schema Part 2: Datatypes Second Edition [4] references rfc2396 rather than rfc3986.

[4] http://www.w3.org/TR/xmlschema-2/

Georgiy.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114604</commentid>
    <comment_count>3</comment_count>
    <who name="Henry S. Thompson">ht</who>
    <bug_when>2014-11-06 14:26:36 +0000</bug_when>
    <thetext>Indeed it does.  And 2396 says it has been replaced by 3986.  See the recent discussion about &apos;tight binding&apos; vs. &apos;loose binding&apos;:

 http://lists.w3.org/Archives/Public/www-xml-schema-comments/2014OctDec/0004.html</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114605</commentid>
    <comment_count>4</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2014-11-06 14:37:37 +0000</bug_when>
    <thetext>I have some sympathy with Georgiy on this one. XSD 1.0 references RFC 2396. The problem is that RFC 2396 is a mess.

When I raised this as a bug in bug #4048, I was probably influenced by the fact that the java.net.URI class rejects &quot;//&quot;, with the error:

java.net.URISyntaxException: Expected authority at index 2: //

I suspect that the designers of class java.net.URI noted that very often when the RFC mentions the term &quot;authority&quot;, it means a non-empty authority. Examples of this usage are: &quot;A base URI without an authority component&quot;, &quot;some URI schemes do not allow an &lt;authority&gt; component&quot;, &quot;If the authority component is defined&quot;.

The Javadoc comments for java.net.URI say:

&quot;This constructor parses the given string exactly as specified by the grammar in RFC 2396, Appendix A, except for the following deviations:

(1) An empty authority component is permitted as long as it is followed by a non-empty path, a query component, or a fragment component. This allows the parsing of URIs such as &quot;file:///foo/bar&quot;, which seems to be the intent of RFC 2396 although the grammar does not permit it. If the authority component is empty then the user-information, host, and port components are undefined.

(2) ...&quot;

So I think the justification for rejecting &quot;//&quot; is the belief that RFC 2396 doesn&apos;t mean what it says.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>114644</commentid>
    <comment_count>5</comment_count>
    <who name="Georgiy Rakov">georgiy.rakov</who>
    <bug_when>2014-11-07 12:16:36 +0000</bug_when>
    <thetext>If I understand correctly, the intention is to treat the reference to rfc2396 within [4] in a &apos;loose binding&apos; manner (is this correct?). But the W3C spec [4] doesn&apos;t state that it references rfc2396 in a &apos;loose binding&apos; way. BTW: rfc2396 doesn&apos;t contain any reference to rfc3986, but even if such a reference existed, I believe it wouldn&apos;t be obvious that it should have a &apos;superseding&apos; effect when applied to [4].

So, as I see it, there is no normative spec stating that rfc2396 should be superseded by rfc3986 when applied to the W3C spec [4]. I believe &apos;tight binding&apos; is the &apos;default understanding&apos; (it&apos;s closer to a literal interpretation of the text).

Nor are there any comments stating that rfc2396 should be understood with some corrections taken into account (as Michael said, rfc2396 is a mess).</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>