This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 12823 - Host component analyzed in a URI scheme without host
Status: RESOLVED WORKSFORME
Alias: None
Product: HTML Checker
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael[tm] Smith
QA Contact: qa-dev tracking
URL: http://piratery.net/temp/ed2k-link.xhtml
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-05-30 18:28 UTC by J
Modified: 2015-08-23 07:07 UTC

Description J 2011-05-30 18:28:17 UTC
The "ed2k" URI scheme has no host component (which is only natural for resources found in a peer-to-peer network). The HTML5 validator is giving errors for these URIs, which are valid.

Section 1.2.3, "Hierarchical Identifiers", of RFC 3986 "Uniform Resource Identifier (URI): Generic Syntax" specifies:

    For some URI schemes, the visible hierarchy is limited to the scheme itself: everything after the scheme component delimiter (":") is considered opaque to URI processing.

That should be the case for the "ed2k" scheme.

For an example of a page that gives an unexpected error, try validating this document: http://piratery.net/temp/ed2k-link.xhtml
Comment 1 Julian Reschke 2011-05-31 07:56:18 UTC
I believe the warning is correct. Check the RFC 3986 grammar.

In 

  ed2k://|file|empty.txt|0|31D6CFE0D16AE931B73C59D7E0C089C0|/

everything after the scheme parses as "authority" component.

Lesson: don't use "//" at the beginning of the scheme-specific part unless it indeed starts the authority component.
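
The split described above can be reproduced with the component-splitting regular expression given in RFC 3986, Appendix B. The following is only a minimal sketch: the helper name `split_uri` is hypothetical, and the regex separates components without validating them, so it illustrates how the string is divided rather than how the checker is implemented.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri: str) -> dict:
    """Split a URI reference into its five generic components (hypothetical helper)."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("ed2k://|file|empty.txt|0|31D6CFE0D16AE931B73C59D7E0C089C0|/")
print(parts["scheme"])     # ed2k
print(parts["authority"])  # |file|empty.txt|0|31D6CFE0D16AE931B73C59D7E0C089C0|
print(parts["path"])       # /
```

Because the scheme-specific part starts with "//", everything up to the next "/" lands in the authority slot, and only the trailing "/" is left as the path.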
Comment 2 Michael[tm] Smith 2011-05-31 08:15:45 UTC
(Following up on comment #1)
>   ed2k://|file|empty.txt|0|31D6CFE0D16AE931B73C59D7E0C089C0|/
> 
> everything after the scheme parses as "authority" component.
> 
> Lesson: don't use "//" at the beginning of the scheme-specific part unless it
> indeed starts the authority component.

So to be clear, it seems that the URI intentionally doesn't have an authority component. And the prose of RFC 3986 says:

  http://tools.ietf.org/html/rfc3986#section-3
  When authority is not present, the path cannot begin with two slash characters ("//").

So the problem is that the ed2k URI syntax doesn't conform to RFC 3986. 

  http://en.wikipedia.org/wiki/Ed2k_URI_scheme
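
To illustrate why the text after "//" cannot pass as a host, here is a rough sketch that tests a string against the reg-name production of RFC 3986, section 3.2.2. It is an assumption that this mirrors what the checker objects to, since the thread does not quote the exact error message. The helper `is_valid_reg_name` is hypothetical, and the sketch ignores the IP-literal form of host as well as any userinfo or port.

```python
import re

# reg-name = *( unreserved / pct-encoded / sub-delims )   (RFC 3986, section 3.2.2)
REG_NAME = re.compile(r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=])*")


def is_valid_reg_name(host: str) -> bool:
    """Return True if the whole string matches the reg-name production (hypothetical helper)."""
    return REG_NAME.fullmatch(host) is not None


# The would-be authority of the ed2k link contains "|", which is not an
# unreserved or sub-delims character and is not percent-encoded.
print(is_valid_reg_name("|file|empty.txt|0|31D6CFE0D16AE931B73C59D7E0C089C0|"))  # False
print(is_valid_reg_name("piratery.net"))  # True
```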

Given that, moving this bug to resolved=worksforme because the validator is behaving as expected for this case. If you still believe otherwise, please reopen it and add a comment explaining why.
Comment 3 J 2011-05-31 20:46:13 UTC
Yes, you are both right, I misinterpreted RFC 3986. This is a bit annoying, since the ed2k scheme, by its nature, is never going to be fixed, but I guess I'll have to live with it and have non-conforming web pages.