This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This is supported by IE, Blink and WebKit, but not Gecko. Usage in Chrome is around 4%: https://www.chromestatus.com/metrics/feature/timeline/popularity/127 It's not readonly like characterSet, but we can probably remove the setter: https://www.chromestatus.com/metrics/feature/timeline/popularity/427 So, make charset an alias of characterSet? It's very unlikely that it can be removed in Blink, since at this level of usage it's bound to show up on code paths that Gecko doesn't take for some reason or another.
I would add one more thing, since you have already started this bug. Should Document.characterSet return the encoding's name in lowercase (as in the table in the Encoding Standard) or in uppercase? I ask because I noticed different behavior across browsers. https://encoding.spec.whatwg.org/#names-and-labels

Some results returned by various commands:

Document.characterSet
  Firefox  UTF-8
  Chrome   UTF-8
  IE       utf-8

Document.inputEncoding (DOM Level 3)
  Firefox  UTF-8
  Chrome   UTF-8
  IE       UTF-8

Document.charset (not standard)
  Chrome   UTF-8
  IE       utf-8

Document.characterSet (not standard)
  Chrome   ISO-8859-2
  IE       windows-1250

TextEncoder.encoding and TextDecoder.encoding
  Firefox  utf-8
  Chrome   utf-8
> Document.characterSet (not standard)
> Chrome ISO-8859-2
> IE windows-1250

This one should be Document.defaultCharset (not Document.characterSet).
Adding some Mozillians who might have opinions on adding an alias. They should return the names in lowercase per the Encoding Standard. If different casing needs to be considered (note that browsers do not consistently use uppercase or lowercase today) we'd need to address that through a "display name" field in the Encoding Standard or some such.
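Until the casing is consistent across browsers, script that inspects these properties probably has to compare names without regard to ASCII case. A minimal sketch, assuming a hypothetical helper named sameEncodingName (not part of any spec or browser API) and using the UTF-8 values reported earlier in this thread:

```typescript
// Hypothetical helper (not defined by any spec): compares two encoding
// names while ignoring ASCII case only.
function sameEncodingName(a: string, b: string): boolean {
  const asciiLower = (s: string): string =>
    s.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return asciiLower(a) === asciiLower(b);
}

// Document.characterSet values reported in this thread for a UTF-8 document.
const reportedNames: Array<[string, string]> = [
  ["Firefox", "UTF-8"],
  ["Chrome", "UTF-8"],
  ["IE", "utf-8"],
];

// True when every reported value names the same encoding as "utf-8".
const allReportUtf8: boolean = reportedNames.every(
  ([, name]) => sameEncodingName(name, "utf-8")
);
```

In a page, the same helper could be applied to document.characterSet directly; the array above only stands in for the values observed in each browser.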
I agree that we should try to return a lowercase string, but that's orthogonal to this bug. charset is already an alias of characterSet in Blink, and any changes would apply to both.
Well, not completely right, or do both have a setter in Blink? Should we hold off on adding an alias until the setter has been removed?
What does the setter do? Is it known that if the property sniffs as existing, sites won't try to use the setter (i.e. having it as getter-only would be safe)?

(In reply to Anne from comment #3)
> If different casing needs to be considered (note that browsers do not
> consistently use uppercase or lowercase today)

Didn't WebKit make a specific effort to be consistent with Gecko's (rather arbitrary) casing? Have you researched why the WebKit developers made the effort to be case-consistent with Gecko?
About adding a getter alias, I'm not sure what that will buy us for Gecko, since it is clearly not required for web compat for content that we're handling (at least I have never seen anyone ask for it, or any major website being broken in Gecko because we don't support it.) About adding a setter, I'm not sure if I understand what the semantics would be. In fact, I can't think of a use case for dynamically changing the charset of a document.
(In reply to Henri Sivonen from comment #6)
> Didn't WebKit make a specific effort to be consistent with Gecko's (rather
> arbitrary) casing? Have you researched why the WebKit developers made the
> effort to be case-consistent with Gecko?

WebKit did? I'm not aware of that. I remember that what I found was inconsistent across user agents. From https://bugs.webkit.org/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__all__&product=&content=characterset I cannot find anything that supports what you suggest.
(In reply to Anne from comment #5)
> Well not completely right, or do both have a setter in Blink?

Only charset has a setter.

(In reply to Henri Sivonen from comment #6)
> What does the setter do?

It's propagated to a TextResourceDecoder where it looks like it will prevent further checks for <meta charset>, but I've been unable to produce a simple test case where it has any observable effect. I'm betting on removal, in which case it doesn't matter.

> Is it known that if the property sniffs as existing, sites won't try to use
> the setter (i.e. having it as getter-only would be safe)?

All I know is that the usage of the setter is in the range where it's plausible that removal would work, currently ~0.01% of page views. In my experience, only actually attempting removal will tell you if it's safe or not.
(In reply to Anne from comment #8)
> (In reply to Henri Sivonen from comment #6)
> > Didn't WebKit make a specific effort to be consistent with Gecko's (rather
> > arbitrary) casing? Have you researched why the WebKit developers made the
> > effort to be case-consistent with Gecko?
>
> WebKit did? I'm not aware of that. I remember that what I found was
> inconsistent across user agents.

Maybe they didn't. Still, the case is remarkably consistent across WebKit and Gecko. A quick look suggests that WebKit follows IANA casing and Gecko follows IANA casing except for gbk and gb18030 (which are upper case in IANA & WebKit). So maybe WebKit didn't copy Gecko but both WebKit and Gecko used IANA casing, except Gecko somehow failed to do that for gbk and gb18030.
As for incentives, the status quo for many years has been that Gecko has no incentive to add Document.charset, and IE/WebKit/Blink have no incentive to remove it. The result is a small but ever-present opportunity for writing non-portable code... In this case, the quickest path to interop appears to be for Blink to remove the setter and for the spec and Gecko to add the getter. Other ideas welcome :)
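To make the "getter without a setter" idea concrete, here is a rough sketch on a plain object. DocLike and addCharsetAlias are hypothetical names used only for illustration; this is not how any engine or the spec defines the property, and it does not touch a real Document.

```typescript
// A stand-in for Document; only the property under discussion is modeled.
interface DocLike {
  characterSet: string;
  charset?: string;
}

// Hypothetical helper: exposes `charset` as a getter-only alias of
// `characterSet` on the given object.
function addCharsetAlias(doc: DocLike): void {
  Object.defineProperty(doc, "charset", {
    get(): string {
      return doc.characterSet;
    },
    // No `set` is defined: assignments are ignored in sloppy mode and
    // throw a TypeError in strict mode.
    enumerable: true,
    configurable: true,
  });
}

const fakeDoc: DocLike = { characterSet: "UTF-8" };
addCharsetAlias(fakeDoc);
```

Because the getter reads through to characterSet, the two properties cannot drift apart, which is the behavior an alias would need.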
document.charset was once spec'ed then removed. Why is it going to be added once again? Because WebKit refused to remove it? Because everyone except Gecko has the support? (It is basically what I said in Gecko bug 647621 comment #0.)
Masatoshi, do you have another proposal for how to reach agreement between the spec and browsers?
I'm not necessarily opposed to Gecko implementing the getter, but I would like to know what we will gain from that (in addition to comment 11, of course.) Specifically, do we have any data on how this property is used on the 4% of pages viewed in Blink based browsers? If we have a way to obtain more info on the actual usage of this property on the Web, that may help guide us to decide whether it makes more sense for Gecko to implement or for Blink/IE to drop.
The 4% is any access to Document.charset, notably including code like (document.charset || document.characterSet) that would work without it, which is likely a large majority of cases. Answering questions like these using Blink's UseCounter system is difficult, one would have to collect a representative sample of pages that access document.charset and analyze them manually. If someone has access to a large corpus of Web content, a grep for pages that say "document.charset" without "document.characterSet" in the vicinity might be illuminating.
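For illustration, the fallback pattern and the suggested "grep" could be sketched roughly as follows. The names CharsetSource, readDocumentEncoding and hasUnguardedCharsetUse are hypothetical, and the 80-character radius is an arbitrary assumption standing in for "in the vicinity"; a real survey would need something more careful than a text match.

```typescript
// Minimal shape of the two properties; a real Document would also fit.
interface CharsetSource {
  charset?: string;
  characterSet?: string;
}

// The portable read mentioned above: works whether or not `charset` exists.
function readDocumentEncoding(doc: CharsetSource): string | undefined {
  return doc.charset || doc.characterSet;
}

// Hypothetical heuristic: reports whether `source` mentions document.charset
// with no document.characterSet within `radius` characters on either side.
function hasUnguardedCharsetUse(source: string, radius: number = 80): boolean {
  const needle = /document\.charset\b/g;
  let match: RegExpExecArray | null;
  while ((match = needle.exec(source)) !== null) {
    const start = Math.max(0, match.index - radius);
    const end = Math.min(source.length, match.index + match[0].length + radius);
    if (source.slice(start, end).indexOf("document.characterSet") === -1) {
      return true;
    }
  }
  return false;
}
```

Pages flagged by such a check would still need manual review, since a textual match says nothing about whether the code path is actually reached.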
I added compatibility names in https://github.com/whatwg/dom/commit/03e170351f095e4fe749e0259a3aafc0cbb49c91 I want to wait with adding .charset until at least the setter has disappeared. Removing that seems like a win for everyone. Then we can evaluate again.
OK, I'll try to get rid of the Document.charset setter and then report back here.
I've now removed the setter from Blink, let's hope it sticks: https://code.google.com/p/chromium/issues/detail?id=438392#c4
The removal of the setter appears to have worked out. It was gone in M45, which reached Chrome stable on September 1. Now that Document.charset is an alias of Document.characterSet, can we spec it?
https://github.com/whatwg/dom/commit/6941936bd06438f84ad91d131e2e89ab0f1f7a45
https://github.com/w3c/web-platform-tests/pull/2192