This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11337 - Some ASCII-compatible encodings have harmless substitutions
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-17 19:07 UTC by Yuhong Bao
Modified: 2011-08-07 11:36 UTC
CC List: 7 users

See Also:


Description Yuhong Bao 2010-11-17 19:07:06 UTC
"Encodings in which a series of bytes in the range 0x20 to 0x7E can encode characters other than the corresponding characters in the range U+0020 to U+007E represent a potential security vulnerability:"
What this doesn't mention is that some ASCII-compatible encodings, such as Shift-JIS, have harmless substitutions, for example replacing the backslash with the yen sign. That is OK because the backslash is rarely, if ever, used in HTML.
Comment 1 Anne 2010-11-17 19:14:36 UTC
Per https://bugs.webkit.org/show_bug.cgi?id=24906 that is false.
Comment 2 Yuhong Bao 2010-11-17 19:18:49 UTC
Yes, some platforms do hack fonts so that U+005C has the glyph of the yen sign.
Comment 3 Anne 2010-11-17 19:25:23 UTC
When it happens at the font-level the vulnerability is not the same, because all encodings are similarly affected.
Comment 4 Yuhong Bao 2010-11-17 19:27:24 UTC
(In reply to comment #3)
> When it happens at the font-level the vulnerability is not the same, because
> all encodings are similarly affected.

And my point is that it is not a real vulnerability, which is why I am trying to get the standard changed.
Comment 5 Anne 2010-11-17 19:30:06 UTC
Your point is about the encoding doing a substitution, but as I pointed out the encoding does no such substitution.
Comment 6 Yuhong Bao 2010-11-17 19:35:31 UTC
(In reply to comment #5)
> Your point is about the encoding doing a substitution, but as I pointed out the
> encoding does no such substitution.

On Windows only, where they use hacked fonts instead.
Comment 7 Aryeh Gregor 2010-11-18 22:13:48 UTC
Backslash has special meaning in JS and CSS, which are normally embedded in HTML, so using a character set where that byte has a different meaning could indeed lead to vulnerabilities.
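A minimal sketch of that concern, under stated assumptions: the two decoders below are hypothetical stand-ins, not any browser's or library's actual tables. One reads byte 0x5C as U+005C (ASCII), the other as U+00A5 (the JIS X 0201 Roman assignment). The helper `js_string_closes_early` is also made up for illustration and only models backslash escaping inside a double-quoted JS string.

```python
# Hypothetical single-byte decoders for illustration only.
def decode_ascii(data: bytes) -> str:
    # Every byte below 0x80 maps to the code point of the same value.
    return "".join(chr(b) for b in data)


def decode_jis_roman(data: bytes) -> str:
    # JIS X 0201 Roman differs from ASCII at two positions:
    # 0x5C is YEN SIGN (U+00A5) and 0x7E is OVERLINE (U+203E).
    table = {0x5C: "\u00a5", 0x7E: "\u203e"}
    return "".join(table.get(b, chr(b)) for b in data)


def js_string_closes_early(literal_body: str) -> bool:
    """Return True if an unescaped double quote appears in the body
    of a double-quoted JS string literal (backslash escapes only)."""
    escaped = False
    for ch in literal_body:
        if escaped:
            escaped = False          # this character was escaped; skip it
        elif ch == "\\":
            escaped = True           # next character is escaped
        elif ch == '"':
            return True              # unescaped quote ends the string
    return False


# Bytes a server-side escaper might emit: the quote is preceded by 0x5C.
payload = b'a\\"; alert(1); //'

ascii_view = decode_ascii(payload)
jis_view = decode_jis_roman(payload)

print(js_string_closes_early(ascii_view))  # quote is escaped
print(js_string_closes_early(jis_view))    # quote is no longer escaped
```

Under the first decoder the quote stays escaped; under the second, the character before the quote is no longer a backslash, so code that escaped at the byte level and code that parses the decoded text disagree about where the string ends.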
Comment 8 Ian 'Hixie' Hickson 2010-12-31 04:05:17 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: Anytime this happens it can lead to a vulnerability, because code that expects something to do one thing may find it does another.
Comment 9 Michael[tm] Smith 2011-08-04 05:17:32 UTC
mass-move component to LC1