This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11540 - The willful violation clause is most unwise. A standard should not violate another standard for any reason. This would lead to 2 things : 1) Correctly encoded content would never be displayed correctly. 2) All future standards would need to includ
Summary: The willful violation clause is most unwise. A standard should not violate an...
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords: a11y
Depends on:
Blocks:
 
Reported: 2010-12-12 20:19 UTC by contributor
Modified: 2011-08-04 05:12 UTC

Description contributor 2010-12-12 20:19:14 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
Section: http://www.whatwg.org/specs/web-apps/current-work/#table-encoding-overrides

Comment:
The willful violation clause is most unwise. A standard should not violate
another standard for any reason. This would lead to 2 things : 1) Correctly
encoded content would never be displayed correctly. 2) All future
standards would need to include the same clause, making them more complex and
weakening the standards defining the charsets.

Posted from: 81.57.229.79
Comment 1 Aryeh Gregor 2010-12-12 23:28:48 UTC
This cannot be avoided without breaking huge amounts of web content, so it's not going to change.  It doesn't matter whether we like it or not; everything has to be subsumed to the goal of writing a standard that browsers are actually willing to implement.  Browsers are not willing to implement anything that breaks more than a small number of web pages.
Comment 2 Julian Reschke 2010-12-13 08:44:06 UTC
Hm. Why is a bug resolved as "WORKSFORME" by someone != the editor?
Comment 3 Laura Carlson 2010-12-13 09:42:27 UTC
Reopening. Would like to hear editor's rationale.

Also applicable to "willful violations" of WCAG and other accessibility guidelines.
Comment 4 Benjamin Hawkes-Lewis 2010-12-13 11:17:13 UTC
(In reply to comment #3)
> Reopening. Would like to hear editor's rationale.

The spec gives the rationale, as it does for every designated willful violation: "motivated by a desire for compatibility with legacy content". Are you disputing that user agents need to follow this behavior in order to enable people to access the current web corpus?

> Also applicable to "willful violations" of WCAG and other accessibility
> guidelines.

It's certainly not, since (a) there are no willful violations of conformance criteria of accessibility-related standards designated in the spec and (b) if they were, it would be off-topic for this bug, which is concerned with the willful violations in:

http://www.whatwg.org/specs/web-apps/current-work/#table-encoding-overrides

(If you think that the spec should designate a willful violation of a conformance criterion of an accessibility-related standard where it currently does not, please file a bug to that effect. Note it's technically impossible for HTML5 to establish conformance criteria that willfully violate WCAG2, since (unlike WAI-ARIA) WCAG2 does not establish any conformance criteria for host languages and (unlike UAAG) WCAG2 does not establish any conformance criteria for user agents. At worst, HTML5 might introduce features that are impossible for authors to use in conformance with WCAG, but I don't think that's been demonstrated to be the case.)
Comment 5 Benjamin Hawkes-Lewis 2010-12-13 11:43:13 UTC
(In reply to comment #0)
> A standard should not violate another standard for any reason.

Why not? What if the violated standard is harmful or erroneous? 

> This would lead to 2 things : 1) Correctly
> encoded content would never be displayed correctly.

Market forces, not HTML5, force browsers to provide access to the current web corpus by violating the HTTP spec. If HTML5 complied with the HTTP spec, market forces would still force them to provide access to the web corpus, so /theoretically/ correctly encoded content would still be displayed incorrectly, as you put it.
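
To make "displayed incorrectly" concrete, here is a minimal sketch of one case discussed later in this bug (ISO-8859-1 treated as windows-1252). The sample bytes are made up for illustration, and Python's `latin-1` and `cp1252` codecs stand in for the two encodings; this is not taken from the spec or from any browser's code.

```python
# Hypothetical page body whose author typed "smart quotes" on Windows
# but declared the page as ISO-8859-1.
raw = b"\x93quoted\x94"

# Strict reading of the declared label: 0x93 and 0x94 are C1 control
# characters in ISO-8859-1, so nothing visible is rendered for them.
as_declared = raw.decode("latin-1")

# What browsers are described as doing: treat the label as windows-1252,
# where the same bytes are curly quotation marks.
as_browsers = raw.decode("cp1252")

print(repr(as_declared))
print(repr(as_browsers))
```

A page that really did intend the C1 controls is the "correctly encoded content" that ends up displayed differently. The two decodings only disagree for bytes in the 0x80 to 0x9F range.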

> 2) All future
> standards would need to include the same clause, making them more complex and
> weakening the standards defining the charsets.

Accurately describing the interoperable behavior required for usefulness (in this case, access to the web corpus) strengthens rather than weakens standards:

"an Internet Standard is a specification that is stable and well-understood, is technically competent, has multiple, independent, and interoperable implementations with substantial operational experience, enjoys significant public support, and is recognizably useful in some or all parts of the Internet."

http://tools.ietf.org/html/rfc2026#section-1.1
Comment 6 Laura Carlson 2010-12-13 12:30:07 UTC
> > Also applicable to "willful violations" of WCAG and other accessibility
> > guidelines.
> 
> It's certainly not, since (a) there are no willful violations of conformance
> criteria of accessibility-related standards designated in the spec and 

* CAPTCHA Bug 9216 on HTML5 [1] and CAPTCHA Bug 9169 on Techniques for
providing useful text alternatives [2].
* Webcam Bug 9215 on HTML5 [3] and Webcam Bug 9174 on Techniques for
providing useful text alternatives [4].

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9216
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9169
[3] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9215
[4] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9174

Both the HTML5 Spec and the Techniques for providing useful text
alternatives are in direct conflict. Techniques for providing useful text alternatives provides normative advice.

> (b) if
> they were, it would be off-topic for this bug, which is concerned with the
> willful violations in:
> 
> http://www.whatwg.org/specs/web-apps/current-work/#table-encoding-overrides
Other bugs have previously been expanded in scope, for instance HTML ISSUE 122.
Comment 7 Benjamin Hawkes-Lewis 2010-12-13 12:52:11 UTC
(In reply to comment #6) 
> * CAPTCHA Bug 9216 on HTML5 [1] and CAPTCHA Bug 9169 on Techniques for
> providing useful text alternatives [2].
> * Webcam Bug 9215 on HTML5 [3] and Webcam Bug 9174 on Techniques for
> providing useful text alternatives [4].
> 
> [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9216
> [2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9169
> [3] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9215
> [4] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9174
> 
> Both the HTML5 Spec and the Techniques for providing useful text
> alternatives are in direct conflict. Techniques for providing useful text
> alternatives provides normative advice.

The Techniques document is not in scope for this bug (see the Component field).

The HTML5 Spec does not designate the conformance criteria in the sections discussed in those tickets as willful violations of WCAG2, and I do not accept that the tickets discuss violations of WCAG2. The tickets ultimately boil down to questions about what attributes should contain what sort of text alternatives. WCAG2 requires text alternatives, but being a technology-independent standard it does not provide conformance criteria mandating what "alt" and "title" should mean or how they should be used.

> > http://www.whatwg.org/specs/web-apps/current-work/#table-encoding-overrides
> Other bugs have previously been expanded in scope, for instance HTML ISSUE 122.

That people have expanded the scope of previous bugs is not a good reason to expand the scope of yet more bugs.

"One bug per report" is a basic principle of bug reporting:

https://developer.mozilla.org/en/bug_writing_guidelines

Trying to argue about "alt" and charsets in the same bug is just confusing, but if you want to open *another* bug expressly concerned with the general principle of not violating other standards, feel free.
Comment 8 Shelley Powers 2010-12-13 20:18:02 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Reopening. Would like to hear editor's rationale.
> 
> The spec gives the rationale, as it does for every designated willful
> violation: "motivated by a desire for compatibility with legacy content". Are
> you disputing that user agents need to follow this behavior in order to enable
> people to access the current web corpus?

That's a generic phrase, but it isn't necessarily specific to this issue. It is fair to ask why, in this particular instance, a willful violation of compatibility is necessary, particularly if doing so increases understanding (and provides a specific point of reference if the question gets asked again).

> 
> > Also applicable to "willful violations" of WCAG and other accessibility
> > guidelines.
> 
> It's certainly not, since (a) there are no willful violations of conformance
> criteria of accessibility-related standards designated in the spec and (b) if
> they were, it would be off-topic for this bug, which is concerned with the
> willful violations in:
> 
> http://www.whatwg.org/specs/web-apps/current-work/#table-encoding-overrides
>

I'm not sure it is appropriate for any of us to tell each other we're off-topic or not. If Laura is concerned about the phrase "willful violation", then hearing more details about what drives the use of this phrase in this bug could then lead her to decide against posting another bug, or to post a bug that is more likely to generate a useful response. 

The original bug is fairly generic. The example seems to be more of an example of one specific mention of the "willful violation". Unfortunately, since this bug came in through the WhatWG document form, and the person is unlikely to know this discussion is going on in the bugs because of the way this system is designed, we can't know for sure if she or he linked the use of the phrase as an example or because she or he had problems with the specific use of the phrase. 
 
So, erring on the side of question, the bug could be broken into two parts:

Is the use of willful violation justified? 

Is this specific use of willful justification justified?

> (If you think that the spec should designate a willful violation of a
> conformance criterion of an accessibility-related standard where it currently
> does not, please file a bug to that effect. Note it's technically impossible
> for HTML5 to establish conformance criteria that willfully violate WCAG2, since
> (unlike WAI-ARIA) WCAG2 does not establish any conformance criteria for host
> languages and (unlike UAAG) WCAG2 does not establish any conformance criteria
> for user agents. At worst, HTML5 might introduce features that are impossible
> for authors to use in conformance with WCAG, but I don't think that's been
> demonstrated to be the case.)


Again, your response is unnecessarily suppressive.
Comment 9 Benjamin Hawkes-Lewis 2010-12-14 02:46:42 UTC
(In reply to comment #8)
> I'm not sure it is appropriate for any of us to tell each other we're off-topic
> or not.

"Only one issue - please use separate bugs for separate issues."

http://dev.w3.org/html5/decision-policy/decision-policy.html

> If Laura is concerned about the phrase "willful violation", then
> hearing more details about what drives the use of this phrase in this
> bug could then lead her to decide against posting another bug, or to
> post a bug that is more likely to generate a useful response.

Optimising some theoretical other bug is a poor rationale for swamping
discussion in _this_ bug.

> The original bug is fairly generic. The example seems to be more of
> an example of one specific mention of the "willful violation".

The rationale might be potentially applicable to other willful
violations, but the report applied it to /a/ clause (singular), not
multiple clauses. It doesn't say anything about it being a mere example.

> So, erring on the side of question, the bug could be broken into two parts:
> 
> Is the use of willful violation justified? 

The bug report posits /a priori/ that willful violations can never be
justified. I think that's an indefensible position since, while it's
reasonable to expect groups working on different standards to try to
work together:

    1. It's ultimately unrealistic to expect a group in charge of
formulating Standard X to be able to force a group in charge of
formulating Standard Y to reformulate Standard Y as required for the
target audience of Standard X.

    2. It's inhuman to expect the group in charge of formulating
Standard X to sacrifice the human needs of its target audience (e.g.
access to information and services over the world wide web,
protection of their privacy and security) on the altar of technical
consistency with Standard Y.

To put it another way: free agents are free agents. :)

Do you have any arguments or information to add on this?

> Is this specific use of willful justification justified?

This is always a good question to ask. :)

I claim no expertise in the subject of character encodings, so take the
following with a pinch of salt.

HTML5 character mappings need to enable access to the deployed web
corpus interoperably with major user agents.

Not least of the advantages of standardizing such mappings is to help
protect users from security problems like:

http://shiflett.org/blog/2005/dec/google-xss-example

http://code.google.com/p/chromium/issues/detail?id=15701

For general background see:

Web encodings page on the WHATWG wiki:

http://wiki.whatwg.org/wiki/Web_Encodings

"Internal character encoding declaration" thread at WHATWG

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html

"Superset encodings" thread at WHATWG

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html

"charset name matching rules" thread at W3C:

http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html

Some test cases:

http://hsivonen.iki.fi/test/wa10/encoding-detection/

http://www.hixie.ch/tests/adhoc/html/parsing/encoding/all.html

http://coq.no/character-tables/en

I've taken the trouble to search the archives for rationales specific
to each violation. I make no guarantee that this information is complete
or accurate; read the links and make up your own minds.

"Popular browsers" here is shorthand for the big four engines (Trident,
Gecko, WebKit, Presto).

   * Popular browsers and Google Web Search map EUC-KR to Windows-949.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://code.google.com/p/chromium/issues/detail?id=15701
     http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp

   * Popular browsers map EUC-JP to CP51932.
     http://www.w3.org/Bugs/Public/show_bug.cgi?id=7444
     http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-September/023208.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html

   * Popular browsers (but not Google Web Search) map GB2312 and GB_2312-80 to
     the superset GBK.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-March/014219.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/020846.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html 

   * Popular browsers and Google Web Search map ISO-8859-1 to the superset
     windows-1252.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007737.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007882.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011650.html
     http://wiki.whatwg.org/wiki/Web_Encodings
     http://mail.apps.ietf.org/ietf/charsets/msg01835.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html

   * WebKit and Google Web Search map ISO-8859-9 to the superset
     windows-1254. Adopting this behavior has support from an Opera rep.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011648.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0047.html
     http://wiki.whatwg.org/wiki/Web_Encodings
     http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0041.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp

   * Popular browsers and Google Web Search map ISO-8859-11 to the
     superset windows-874.
     http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp

   * Popular browsers and Google Web Search map TIS-620 to the
     superset windows-874.
     http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     http://wiki.whatwg.org/wiki/Web_Encodings
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp

   * Popular browsers and Google Web Search map KS_C_5601-1987 to windows-949.
     http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0030.html
     http://lists.w3.org/Archives/Public/www-archive/2008Jun/0155.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/021207.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     
   * Popular browsers and Google Web Search map Shift_JIS to its superset
     Windows-31J.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     Note ongoing discussion at the IETF:
     http://mail.apps.ietf.org/ietf/charsets/msg01942.html
 
   * Popular browsers and Google Web Search map TIS-620 to its superset
     windows-874. WebKit S60 made this change back in 2006 because of a
     bug report.
     http://trac.webkit.org/changeset/15974
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011651.html
     http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
     http://wiki.whatwg.org/wiki/Web_Encodings
     http://mail.apps.ietf.org/ietf/charsets/msg01834.html
     http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
     http://www.opera.com/docs/specs/presto27/encodings/

   * Opera, Firefox, Safari, and Google Web Search map US-ASCII to its
     superset windows-1252, while IE7 drops the high bit. Ian judged the
     latter behavior to be a security risk.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-July/015455.html
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-September/016170.html
     
   * Popular browsers map UTF-16 without BOM to LE. Content found in the wild
     depends on this behavior.
     http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020552.html
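
The label overrides listed above can be summarised as a lookup table. This is only a sketch built from the bullets in this comment, not from the spec text; the names `ENCODING_OVERRIDES` and `effective_encoding` are made up for illustration, and the UTF-16-without-BOM case is left out because it depends on the byte stream rather than on the label.

```python
# Declared label (lowercased) -> encoding that browsers are reported to use.
# Entries mirror the bullets in this comment; they are not copied from the spec.
ENCODING_OVERRIDES = {
    "euc-kr": "windows-949",
    "euc-jp": "CP51932",
    "gb2312": "GBK",
    "gb_2312-80": "GBK",
    "iso-8859-1": "windows-1252",
    "iso-8859-9": "windows-1254",
    "iso-8859-11": "windows-874",
    "tis-620": "windows-874",
    "ks_c_5601-1987": "windows-949",
    "shift_jis": "Windows-31J",
    "us-ascii": "windows-1252",
}


def effective_encoding(declared_label: str) -> str:
    """Return the encoding to decode with for a declared charset label.

    Labels are matched case-insensitively; labels without an override
    are returned unchanged (apart from surrounding whitespace).
    """
    label = declared_label.strip()
    return ENCODING_OVERRIDES.get(label.lower(), label)


print(effective_encoding("ISO-8859-1"))
print(effective_encoding("UTF-8"))
```

A flat table like this keeps the overrides in one auditable place, which is roughly what the spec's table does for implementors.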

Fixing these willful violations by pushing them upstream into the IANA
registry is non-trivial.

Consider the problem of globally mapping Shift_JIS to windows-31J at the
registry level, as expressed by a Microsoft rep:

"Problem is that there are 4+ implementations of shift_jis in 'common'
use, and none of them are likely to change, since it'd break their
customers. :(

"So I don't see a perfect solution here.  HTML5 is fairly clear about
browser behavior, but in other environments, I think the best we can do
is point to the variants and allow the clients to decide which version
they'd like to use."

http://mail.apps.ietf.org/ietf/charsets/msg01966.html

Once the principle of munging encodings is accepted, there's clearly
room for updating the details based on new data. Do you have any
new data to add?

Can you persuade major user agent vendors to commit to a different
implementation strategy than the one described in the spec?
Comment 10 Shelley Powers 2010-12-14 03:55:14 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > I'm not sure it is appropriate for any of us to tell each other we're off-topic
> > or not.
> 
> "Only one issue - please use separate bugs for separate issues."
> 
> http://dev.w3.org/html5/decision-policy/decision-policy.html
> 
> > If Laura is concerned about the phrase "willful violation", then
> > hearing more details about what drives the use of this phrase in this
> > bug could then lead her to decide against posting another bug, or to
> > post a bug that is more likely to generate a useful response.
> 
> Optimising some theoretical other bug is a poor rationale for swamping
> discussion in _this_ bug.
> 
> > The original bug is fairly generic. The example seems to be more of
> > an example of one specific mention of the "willful violation".
> 
> The rationale might be potentially applicable to other willful
> violations, but the report applied it to /a/ clause (singular), not
> multiple clauses. It doesn't say anything about it being a mere example.

Again, we can't tell for sure, since the beginning of the bug implies generality while the rest of the bug provides a specific example. This could be as much a consequence of the type of bug reporting used, which is based on reporting a problem at a specific point in the document. 


> 
> > So, erring on the side of question, the bug could be broken into two parts:
> > 
> > Is the use of willful violation justified? 
> 
> The bug report posits /a priori/ that willful violations can never be
> justified. I think that's an indefensible position since, while it's
> reasonable to expect groups working on different standards to try to
> work together:
>

Confused. You rejected the idea that this bug is generalized around willful violations, but then proceed to defend generalized willful violations as a design and decision paradigm. 


>     1. It's ultimately unrealistic to expect a group in charge of
> formulating Standard X to be able to force a group in charge of
> formulating Standard Y to reformulate Standard Y as required for the
> target audience of Standard X.
>

But it is realistic to expect the group in charge of formulating standard X to cooperate to every extent possible in ensuring there are no incompatibilities between it and Standard Y, because said incompatibilities will eventually, most likely, cause problems. 
 
>     2. It's inhuman to expect the group in charge of formulating
> Standard X to sacrifice the human needs of its target audience (e.g.
> access to information and services over the world wide web,
> protection of their privacy and security) on the altar of technical
> consistency with Standard Y.
>

Inhuman? Odd phrase. Puppy mills are inhuman. 

Technology sometimes requires us to compromise: sometimes you have to adapt in the short term, to benefit in the long run. 

I've found over the years that inconsistencies generate more security problems, and, overall, generate more of every other kind of problem. I don't easily disregard inconsistencies. 

  
> To put it another way: free agents are free agents. :)
> 
> Do you have any arguments or information to add on this?
> 

Free agents are free agents? No, nothing to add to this.

> > Is this specific use of willful justification justified?
> 
> This is always a good question to ask. :)
>

Well, it should have read "willful violation justified"...
 
> I claim no expertise in the subject of character encodings, so take the
> following with a pinch of salt.
>

I'll take only a fraction of your pinch of salt. I also know very little on the topic.
 
> HTML5 character mappings need to enable access to the deployed web
> corpus interoperably with major user agents.
> 
> Not least of the advantages of standardizing such mappings is to help
> protect users from security problems like:
> 
> http://shiflett.org/blog/2005/dec/google-xss-example
> 
> http://code.google.com/p/chromium/issues/detail?id=15701
> 
> For general background see:
> 
> Web encodings page on the WHATWG wiki:
> 
> http://wiki.whatwg.org/wiki/Web_Encodings
> 
> "Internal character encoding declaration" thread at WHATWG
> 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html
> 
> "Superset encodings" thread at WHATWG
> 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
> 
> "charset name matching rules" thread at W3C:
> 
> http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
> 
> Some test cases:
> 
> http://hsivonen.iki.fi/test/wa10/encoding-detection/
> 
> http://www.hixie.ch/tests/adhoc/html/parsing/encoding/all.html
> 
> http://coq.no/character-tables/en
> 
> I've taken the trouble to search the archives for rationales specific
> to each violation. I make no guarantee that this information is complete
> or accurate; read the links and make up your own minds.
> 
> "Popular browsers" here is shorthand for the big four engines (Trident,
> Gecko, WebKit, Presto).
> 
>    * Popular browsers and Google Web Search map EUC-KR to Windows-949.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://code.google.com/p/chromium/issues/detail?id=15701
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers map EUC-JP to CP51932.
>      http://www.w3.org/Bugs/Public/show_bug.cgi?id=7444
>      http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-September/023208.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
> 
>    * Popular browsers (but not Google Web Search) map GB2312 and GB_2312-80 to
>      the superset GBK.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-March/014219.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/020846.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html 
> 
>    * Popular browsers and Google Web Search map ISO-8859-1 to the superset
>      windows-1252.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007737.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007882.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011650.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01835.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
> 
>    * WebKit and Google Web Search map ISO-8859-9 to the superset
>      windows-1254. Adopting this behavior has support from an Opera rep.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011648.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0047.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0041.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map ISO-8859-11 to the
>      superset windows-874.
>      http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map TIS-620 to the
>      superset windows-874.
>      http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map KS_C_5601-1987 to windows-949.
>      http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0030.html
>      http://lists.w3.org/Archives/Public/www-archive/2008Jun/0155.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/021207.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
> 
>    * Popular browsers and Google Web Search map Shift_JIS to its superset
>      Windows-31J.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      Note ongoing discussion at the IETF:
>      http://mail.apps.ietf.org/ietf/charsets/msg01942.html
> 
>    * Popular browsers and Google Web Search map TIS-620 to its superset
>      windows-874. WebKit S60 made this change back in 2006 because of a
>      bug report.
>      http://trac.webkit.org/changeset/15974
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011651.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
>      http://www.opera.com/docs/specs/presto27/encodings/
> 
>    * Opera, Firefox, Safari, and Google Web Search map US-ASCII to its
>      superset windows-1252, while IE7 drops the high bit. Ian judged the
>      latter behavior to be a security risk.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-July/015455.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-September/016170.html
> 
>    * Popular browsers map UTF-16 without BOM to LE. Content found in the wild
>      depends on this behavior.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020552.html
> 
> Fixing these willful violations by pushing them upstream into the IANA
> registry is non-trivial.
> 
> Consider the problem of globally mapping Shift_JIS to windows-31J at the
> registry level, as expressed by a Microsoft rep:
> 
> "Problem is that there are 4+ implementations of shift_jis in 'common'
> use, and none of them are likely to change, since it'd break their
> customers. :(
> 
> "So I don't see a perfect solution here.  HTML5 is fairly clear about
> browser behavior, but in other environments, I think the best we can do
> is point to the variants and allow the clients to decide which version
> they'd like to use."
> 
> http://mail.apps.ietf.org/ietf/charsets/msg01966.html
> 
> Once the principle of munging encodings is accepted, there's clearly
> room for updating the details based on new data. Do you have any
> new data to add?
> 
> Can you persuade major user agent vendors to commit to a different
> implementation strategy than the one described in the spec?

There's a difference between accepting a technical decision because it is the best, and accepting one because some vendors hold us hostage. 

See, now this is why further discussion is good, as you've provided a great deal of information; much more so than the original quick rejection of the bug. 

I have started to access some of the material, but stopped when the Hixie test case link killed my browser. If I have time to go through the information and feel I have anything further to add on this specific instance of willful violation, I will do so.

Whether in general "willful violation" is a good principle on which to build a sound standard, though, is still an issue.
Comment 11 Ian 'Hixie' Hickson 2011-01-11 05:53:41 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: Comment 1 is correct.

> The willful violation clause is most unwise. A standard should not violate
> another standard for any reason.

I disagree. That's a reversal of the priority of constituencies. Users, authors, and implementors are all more important than spec purity.


> This would lead to 2 things : 1) Correctly encoded content would
> never be displayed correctly.

This is already the case.


> 2) All future
> standards would need to include the same clause, making them more complex and
> weakening the standards defining the charsets.

I don't see why this would affect other standards.
Comment 12 Martin Kliehm 2011-01-11 17:23:32 UTC
The bug-triage sub-team doesn't believe this should be W3C Accessibility Task Force priority. Since the two violated standards contradict each other, it's reasonable to fix this in HTML5. We don't see much of an accessibility issue here, and the bug is certainly not about willfully violating WCAG or other W3C standards that should harmonize and be non-redundant.
Comment 13 Michael[tm] Smith 2011-08-04 05:12:55 UTC
mass-move component to LC1