This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This is related to Bug 27477: Section 4.3e of the serialization spec says that if the JSON output method is selected, replace " with \" and the newline character (U+000A) with \n. It needs to be clarified how the tab (U+0009) and carriage return (U+000D) characters are to be serialized in JSON (I assume as '\t' and '\r'?). If there are cases left in which Unicode characters need to be escaped in the JSON output, it may be worth stating whether an implementation should choose upper or lower case (RFC 7159 allows both variants). For example, the carriage return character can currently be serialized as "\r", "\u000D", or "\u000d".
The joint WGs discussed this bug today. JSON (as defined by RFC 7159) doesn't require escaping of characters other than " and \ but on consideration the WGs agreed that it's probably useful to escape other characters which might otherwise frequently be corrupted in transmission over some channels. The rules in serialization will be aligned with the xml-to-json function of XSLT [1], which says that in addition to escaping \, any occurrence of quotation mark, backspace, form-feed, newline, carriage return, or tab is replaced by \", \b, \f, \n, \r, or \t respectively, and any other codepoint in the range 1-31 or 127-159 is replaced by an escape in the form \uHHHH where HHHH is the hexadecimal representation of the codepoint value. [1] http://www.w3.org/TR/xslt-30/#func-xml-to-json We believe this resolves the issues, so we are marking this Bugzilla entry RESOLVED. Christian, if you would review this resolution and indicate your agreement by changing the bug status to CLOSED (or your dissent by RE-OPENING it), it would be helpful. If we don't hear from you in the next two weeks, we will assume that you are content with the resolution of the issue.
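For illustration only, the escaping rules quoted in the resolution above could be sketched in Python roughly as follows. The function name `escape_json_string` and the `uppercase_hex` switch are hypothetical, introduced here just to make the rules concrete; this is not taken from the serialization spec or from any implementation.

```python
import json

# Short escapes named in the resolution: backslash plus the six listed characters.
SHORT_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\b": "\\b",
    "\f": "\\f",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def escape_json_string(text: str, uppercase_hex: bool = True) -> str:
    """Escape the content of a JSON string (hypothetical helper).

    Assumption: the case of the hex digits is left to the caller, since the
    resolution text itself does not fix it.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in SHORT_ESCAPES:
            out.append(SHORT_ESCAPES[ch])
        elif 1 <= cp <= 31 or 127 <= cp <= 159:
            # Any other codepoint in 1-31 or 127-159 becomes \uHHHH.
            fmt = "\\u{:04X}" if uppercase_hex else "\\u{:04x}"
            out.append(fmt.format(cp))
        else:
            # Everything else (including U+0000, which is outside the
            # quoted range) is passed through unchanged in this sketch.
            out.append(ch)
    return "".join(out)


sample = 'say "hi"\r\n\x7f'
escaped = escape_json_string(sample)
# Round trip: a JSON parser should recover the original string.
assert json.loads('"' + escaped + '"') == sample
```

The short escapes are checked before the numeric ranges, so tab, newline and the like never fall through to the `\uHHHH` form.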
Michael, thanks a lot for the summary. I completely agree with the resolution. Before closing this bug, I have one last question, regarding the hexadecimal representation: Does HHHH mean that the hexadecimal digits A-F should be output in upper case, or is this implementation dependent?
I'm not Mike but I'll take a stab at answering the question anyway. I do NOT believe that the use of "HHHH" (as opposed to "hhhh", "Hhhh", "hHhh", etc.) was intended to imply that the hexadecimal digits must be output in upper case. In fact, it is rather more common for lowercase to be used. I don't believe that it is useful for our specs to prescribe one or the other, so I think implementation-dependent (not -defined!) is most appropriate. And that probably is worth an entry in the appendix. I'm re-marking the bug RESOLVED/FIXED. If you agree that this is the appropriate response, please mark the bug CLOSED.
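As a small check of why the case of the hex digits can reasonably be left open, the following sketch uses Python's standard `json` module (chosen here only as one example of a JSON parser and serializer) to show that the three spellings of carriage return mentioned earlier in this thread all decode to the same character. The variable names are illustrative.

```python
import json

# The three serializations of carriage return discussed in this bug.
variants = ['"\\r"', '"\\u000D"', '"\\u000d"']
decoded = [json.loads(v) for v in variants]

# A conforming parser treats upper- and lower-case hex digits alike.
assert all(d == "\r" for d in decoded)

# Python's own serializer happens to emit lower-case hex digits.
lowercase_example = json.dumps("\x1f")
assert lowercase_example == '"\\u001f"'
```

Since consumers accept either case, the choice only affects byte-for-byte comparison of serialized output, not the data that is read back.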
Jim, thanks for the information. As suggested, I'm closing the bug.