16690 – euc-kr error handling

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: euc-kr error handling

Status:	RESOLVED DUPLICATE of bug 16691

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	Encoding
Version:	unspecified
Hardware:	PC Windows 3.1

Importance:	P2 normal
Target Milestone:	Unsorted
Assignee:	Anne
QA Contact:	sideshowbarker+encodingspec

Depends on:	16691
Blocks:

Reported:	2012-04-10 18:46 UTC by Anne
Modified:	2013-12-16 18:17 UTC
CC List:	2 users

Description Anne 2012-04-10 18:46:25 UTC

>>> -- 5.4: Error handling in line with EUC-JP and Shift-JIS.  A second  
>>> byte 0x53--0xA0 after a first byte 0xC6 (the undefined area after the  
>>> last UHC hangul) is arguably outside the encoding as well and not just  
>>> accidentally undefined, so such a byte should strictly speaking be  
>>> reprocessed as well.
>>
>> It is already in line. Because pointer (formerly index) would be null.
>
> Hm, I have now attempted to apply the algorithm to the byte sequence
> 0xC6 0x53 and I seem to get the pointer value (26+26+126) x (0xC6-0x81) +
> (0x53-0x41) = 12,300, which is not null.  Am I missing something?
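The arithmetic in the last quoted message can be checked with a short sketch. It reproduces only the formula written out above; the function name, the constant name, and the restriction of the trail byte to 0x41..0x5A (inferred from the first "26" term) are assumptions made for illustration, not text from the specification.

```python
# Width of one lead-byte row as written in the quoted formula: 26 + 26 + 126.
ROW_WIDTH = 26 + 26 + 126  # 178


def pointer_from_comment(lead: int, trail: int) -> int:
    """Evaluate (26+26+126) x (lead - 0x81) + (trail - 0x41).

    Hypothetical helper: it covers only the case shown in the comment,
    assumed here to apply to trail bytes 0x41..0x5A.
    """
    if not 0x41 <= trail <= 0x5A:
        raise ValueError("this sketch only covers trail bytes 0x41..0x5A")
    return ROW_WIDTH * (lead - 0x81) + (trail - 0x41)


print(pointer_from_comment(0xC6, 0x53))  # 12300
```

The result is a non-null pointer, which is the point of the question: whether the byte is rejected then depends on whether the index has an entry for that pointer, not on the pointer computation itself.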

Comment 1 Anne 2012-04-10 19:01:20 UTC

In other words, should bytes in the range 0x53 to 0xA0 following lead byte 0xC6 be eaten or not?

Comment 2 pub-w3 2012-04-25 18:39:54 UTC

Firefox, Safari, Opera and IE6 all handle 0xC6 0x53 in the same way as 0xC7 0x53 (the last UHC code point being 0xC6 0x52); i.e., 0x53 is reprocessed in both cases (Firefox) or in neither (the others). Introducing a difference here seems like a bad idea.

Related:  <https://www.w3.org/Bugs/Public/show_bug.cgi?id=16771> (Big5 reprocessing).
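To make "eaten" versus "reprocessed" concrete, here is a toy sketch of the two policies. It is not any browser's decoder and not the specification's algorithm: the `decode` function, its `lookup` callback and the `reprocess` flag are hypothetical, and the lookup used in the example simply treats every two-byte sequence as undefined.

```python
REPLACEMENT = "\ufffd"


def decode(data: bytes, lookup, reprocess: bool) -> str:
    """Toy two-byte decoder illustrating the two error-handling policies.

    lookup(lead, trail) is a hypothetical callback returning a character,
    or None when the pair is undefined. With reprocess=True the trail byte
    of an undefined pair is put back and decoded again; with
    reprocess=False it is consumed ("eaten") together with the lead byte.
    """
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:  # ASCII passes through unchanged
            out.append(chr(byte))
            i += 1
            continue
        if i + 1 >= len(data):  # lead byte at end of input
            out.append(REPLACEMENT)
            break
        char = lookup(byte, data[i + 1])
        if char is not None:
            out.append(char)
            i += 2
        else:
            out.append(REPLACEMENT)
            i += 1 if reprocess else 2
    return "".join(out)


def undefined(lead: int, trail: int):
    """Stand-in lookup for the example: every pair is undefined."""
    return None


for lead in (0xC6, 0xC7):
    pair = bytes([lead, 0x53])
    print(hex(lead), repr(decode(pair, undefined, True)),
          repr(decode(pair, undefined, False)))
```

Under these assumptions both lead bytes give the same result for a given policy: U+FFFD followed by "S" when the trail byte is reprocessed, and a lone U+FFFD when it is eaten.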

Comment 3 Anne 2013-12-16 18:17:05 UTC


*** This bug has been marked as a duplicate of bug 16691 ***