This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 23090 - Fallback encoding implied by the Greek locale missing
Summary: Fallback encoding implied by the Greek locale missing
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
Reported: 2013-08-29 12:39 UTC by Henri Sivonen
Modified: 2014-01-18 00:46 UTC

Description Henri Sivonen 2013-08-29 12:39:29 UTC
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#determining-the-character-encoding doesn't list a special fallback encoding for Greek.

According to StatCounter, Firefox and Chrome are both much more popular than IE in Greece. Both Firefox and Chrome (as tested on a Mac) use ISO-8859-7 as a fallback encoding when the Greek localization is used. (Note: If you search the Firefox code base, you'll notice that the fallback is in principle ISO-8859-1, but the generic fallback is overridden for each platform so that it effectively ends up being ISO-8859-7.)

Please list ISO-8859-7 as the fallback encoding for the Greek locale.
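
To illustrate why the choice of fallback matters for unlabeled legacy pages, here is a minimal Python sketch. It is not part of the original report. The sample word and the function name `decode_unlabeled` are assumptions made for the example; the codec names `iso8859_7` and `cp1252` are Python's names for ISO-8859-7 and windows-1252.

```python
# Sketch: the same unlabeled bytes decoded with two candidate fallback encodings.
# The sample text and the helper name are illustrative assumptions.

def decode_unlabeled(raw: bytes, fallback: str) -> str:
    """Decode bytes that carry no encoding label, using the given fallback."""
    return raw.decode(fallback)

text = "Ελλάδα"                      # a Greek word, as a legacy page might contain
raw = text.encode("iso8859_7")       # bytes as served by an ISO-8859-7 page

as_greek = decode_unlabeled(raw, "iso8859_7")   # Greek fallback
as_western = decode_unlabeled(raw, "cp1252")    # windows-1252 fallback

print(as_greek)    # the original Greek text
print(as_western)  # Latin-script mojibake instead of Greek
```

With a windows-1252 fallback the page still decodes without error, so the user simply sees the wrong letters; this is why a locale-specific fallback can matter.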
Comment 1 Ian 'Hixie' Hickson 2013-08-30 18:03:02 UTC
Spec notes say:

<!-- el, Greek, is not listed here because Windows Vista wanted windows-1253, Chrome wanted ISO-8859-7, and Firefox wanted windows-1252 -->
<!-- el-GR, Greek (Greece), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1253 -->

I guess my Firefox data is out of date. Can you point me to the data so I can check the other rows as well?
Comment 2 Henri Sivonen 2013-09-06 07:45:27 UTC
> I guess my Firefox data is out of date. Can you point me to the data so I can check the other rows as well?

https://mxr.mozilla.org/l10n-mozilla-release/search?string=intl.charset.default&find=intl.properties

Note that you get four results per locale, because there are per-platform override files. Of course, the notion that the fallback encoding could depend on the operating system is fundamentally bogus, since the Web out there does not change its legacy encoding depending on the operating system the browser is running on.

Furthermore, these values go through Encoding Standard-based alias resolution, so ISO-8859-1 becomes windows-1252.

If all four values for a given locale agree, it's pretty safe to assume that that's the value that takes effect. In the case of Greek, you'll find that each platform overrides the platform-independent value.

(However, since Firefox and Chrome disagree with IE, there's a possibility that Greek doesn't actually need a special fallback anymore.)
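
The lookup described in this comment (a platform-independent value, optional per-platform overrides, then alias resolution) can be sketched roughly as follows. This is not Firefox's actual code. The tables, the platform keys (`win`, `mac`, `unix`), and the function names are hypothetical and only mirror what the comment states: the generic Greek value is ISO-8859-1, every platform overrides it with ISO-8859-7, and the label ISO-8859-1 resolves to windows-1252.

```python
# Sketch only: hypothetical data and names, not taken from the Firefox l10n tree.

# Tiny subset of label-to-encoding resolution in the style of the Encoding Standard.
LABEL_TO_ENCODING = {
    "iso-8859-1": "windows-1252",
    "windows-1252": "windows-1252",
    "iso-8859-7": "ISO-8859-7",
}

# Platform-independent value per locale (hypothetical table).
GENERIC = {"el": "ISO-8859-1"}

# Per-platform overrides per locale (hypothetical platform keys).
OVERRIDES = {
    "el": {"win": "ISO-8859-7", "mac": "ISO-8859-7", "unix": "ISO-8859-7"},
}

def resolve_label(label):
    """Map a label to its encoding name, or None if the label is unknown."""
    return LABEL_TO_ENCODING.get(label.strip().lower())

def effective_fallback(locale, platform):
    """Pick the platform override if present, else the generic value, then resolve."""
    label = OVERRIDES.get(locale, {}).get(platform, GENERIC.get(locale))
    return resolve_label(label) if label is not None else None

# Every platform with an override ends up with the same effective value.
results = {p: effective_fallback("el", p) for p in ("win", "mac", "unix")}
print(results)
# Without an override, the generic ISO-8859-1 resolves to windows-1252.
print(effective_fallback("el", "other"))
```

Reading only the generic table would suggest windows-1252 for Greek, which matches the misreading discussed in this bug; the effective value only becomes visible once the overrides are applied.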
Comment 3 Ian 'Hixie' Hickson 2013-11-06 21:41:35 UTC
I've just updated the Greek row.
Comment 4 contributor 2013-11-06 21:42:59 UTC
Checked in as WHATWG revision r8259.
Check-in comment: Add greek to the default encoding logic.
http://html5.org/tools/web-apps-tracker?from=8258&to=8259
Comment 5 Henri Sivonen 2013-11-08 10:33:59 UTC
Thanks.

An incorrect comment remains in the spec source, though. The comment says "Firefox wanted windows-1252". That's not true. Firefox actually used ISO-8859-7 all this time. The way it was accomplished was enough of a mess that it was easy to conclude otherwise from mere reading of the source, which I assume is what you did.

(FWIW, for future reference, the code in Firefox is now less of a mess: http://mxr.mozilla.org/mozilla-central/source/dom/encoding/localesfallbacks.properties and http://mxr.mozilla.org/mozilla-central/source/dom/encoding/FallbackEncoding.cpp#68 .)
Comment 6 contributor 2014-01-18 00:46:14 UTC
Checked in as WHATWG revision r8408.
Check-in comment: update internal comment on encoding to be more accurate, per henri
http://html5.org/tools/web-apps-tracker?from=8407&to=8408