This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 23968 - Reference the URL spec
Summary: Reference the URL spec
Status: NEW
Alias: None
Product: CSS
Classification: Unclassified
Component: Values and Units
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Tab Atkins Jr.
QA Contact: public-css-bugzilla
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 24089
Reported: 2013-12-03 12:10 UTC by Simon Pieters
Modified: 2013-12-20 14:41 UTC
CC List: 3 users

Description Simon Pieters 2013-12-03 12:10:00 UTC
CSS references RFC 3986 for processing of URLs, which leaves questions unanswered such as which character encoding to use for the query part. We should reference http://url.spec.whatwg.org/ . (I think the query encoding should be utf-8 for URLs in CSS.)
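To illustrate why the query encoding matters: a minimal sketch, assuming a query value containing "é" and using Python's standard `urllib.parse.quote` purely for demonstration (it is not the URL spec's parser). The helper name `encode_query_value` and the choice of windows-1252 as the legacy encoding are assumptions made for the example.

```python
from urllib.parse import quote


def encode_query_value(text: str, encoding: str) -> str:
    """Percent-encode a query value after converting it to bytes in `encoding`."""
    # safe="" so that every reserved character is escaped as well.
    return quote(text, safe="", encoding=encoding)


# The same character yields different bytes on the wire depending on the encoding.
utf8_form = encode_query_value("é", "utf-8")           # two bytes: C3 A9
legacy_form = encode_query_value("é", "windows-1252")  # one byte: E9

print(utf8_form, legacy_form)
```

A server that expects one form and receives the other sees a different query string, which is why the encoding has to be specified.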
Comment 1 Anne 2013-12-03 13:34:29 UTC
Yes, you should use the URL parser with just input and a base, not with anything else. You don't need to mention you set it to utf-8 as that will be implied.
Comment 2 Tab Atkins Jr. 2013-12-03 16:07:44 UTC
Ah cool, yeah.  We only referenced the RFC because there was literally nothing better.  I'll update the ref soonish.

By the time URLs need to be derefed, they're already a sequence of codepoints.  Decoding happened back in CSS parsing.  That fine, Anne?
Comment 3 Anne 2013-12-03 20:53:54 UTC
Yes. The encoding override argument is not about that, but you don't need to worry about that argument anyway. The input to the URL parser is a string and a base URL.
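The "string plus base URL" interface described here can be sketched roughly as follows. This is only an approximation: it uses Python's `urllib.parse`, which follows RFC-style resolution rather than the URL spec's parser, and the function name `resolve_css_url` and the example URLs are hypothetical. The point is that the caller passes no encoding; utf-8 is fixed inside.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def resolve_css_url(value: str, base: str) -> str:
    """Resolve a url() value against a base URL; the query is always utf-8."""
    absolute = urljoin(base, value)
    parts = urlsplit(absolute)
    # Keep query delimiters and existing escapes; escape everything else as utf-8.
    query = quote(parts.query, safe="=&%+", encoding="utf-8")
    return urlunsplit(parts._replace(query=query))


# Hypothetical style sheet location and url() value.
resolved = resolve_css_url("img/bg.png?q=é", "http://example.com/css/style.css")
print(resolved)
```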
Comment 4 Simon Pieters 2013-12-13 22:10:19 UTC
Hmm, it seems Blink/Gecko/IE11 use the document's encoding, at least for background-image. Presto uses utf-8.
Comment 5 Simon Pieters 2013-12-13 22:20:14 UTC
(I have only tested <style> so far.)
Comment 6 Simon Pieters 2013-12-16 12:54:35 UTC
Results for <link>

Gecko: document's encoding
Blink: style sheet's encoding
Presto: utf-8
IE: not tested
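One way results like these can be told apart is by looking at the percent-encoded bytes the server receives. The sketch below is an assumption about how such a check could work, not a description of the actual tests; `detect_query_encoding`, the test character "å", and the candidate encodings are all hypothetical.

```python
from typing import Optional, Sequence
from urllib.parse import unquote_to_bytes


def detect_query_encoding(
    raw_query: str,
    char: str,
    candidates: Sequence[str] = ("utf-8", "windows-1252"),
) -> Optional[str]:
    """Return the candidate encoding whose bytes for `char` match the received query."""
    received = unquote_to_bytes(raw_query)
    for encoding in candidates:
        try:
            if received == char.encode(encoding):
                return encoding
        except UnicodeEncodeError:
            # `char` is not representable in this encoding; try the next one.
            continue
    return None


# A browser using utf-8 would send %C3%A5 for "å"; one using windows-1252 would send %E5.
print(detect_query_encoding("%C3%A5", "å"))
print(detect_query_encoding("%E5", "å"))
```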
Comment 7 Anne 2013-12-16 13:14:20 UTC
I think for CSS we could get away with utf-8-only given how rarely query strings occur there. Also given the diversity in implementations. Could you try Safari as well? It does different things for URLs from Blink.
Comment 8 Simon Pieters 2013-12-16 13:38:40 UTC
(In reply to Anne from comment #7)
> I think for CSS we could get away with utf-8-only given how rarely query
> strings occur there.

Do you have data on that?

> Also given the diversity in implementations. Could you
> try Safari as well? It does different things for URLs from Blink.

Safari 6.0.5 (8536.30.1) uses the style sheet's encoding for <link>.
Comment 9 Anne 2013-12-16 13:49:37 UTC
No data. But the reason we have the quirk in HTML is because of <form>.

Are your tests public?
Comment 10 Tab Atkins Jr. 2013-12-16 18:19:34 UTC
(In reply to Simon Pieters from comment #8)
> (In reply to Anne from comment #7)
> > I think for CSS we could get away with utf-8-only given how rarely query
> > strings occur there.
> 
> Do you have data on that?

Given that currently url() is only usable for importing stylesheets and including images, it seems likely that Anne is correct.  Anecdotally, the main use of query strings on stylesheets is to cache-bust (and then the contents are irrelevant), and query strings on images are only for dynamically-generated ones, which are rare, or database keys, which are nearly always ASCII.
Comment 11 Simon Pieters 2013-12-16 22:43:28 UTC
(In reply to Tab Atkins Jr. from comment #10)

> Given that currently url() is only usable for importing stylesheets and
> including images,

and font files...

> it seems likely that Anne is correct.  Anecdotally, the
> main use of query strings on stylesheets is to cache-bust (and then the
> contents are irrelevant), and query strings on images are only for
> dynamically-generated ones, which are rare, or database keys, which are
> nearly always ASCII.

This seems reasonable, but it's not data. :-P

(In reply to Anne from comment #9)
> No data. But the reason we have the quirk in HTML is because of <form>.

I thought it was required for at least <a href> also, but yeah.

> Are your tests public?

https://github.com/w3c/web-platform-tests/pull/444

It turns out that:
in Presto, @import uses the document's encoding.
in Blink, @import uses utf-8.
in Gecko, <style> uses utf-8. (I get a different result now than in comment 4. Not sure why.)