[Last Call] Registration of media type application/sparql-query (fwd)

.. always UTF8 ...

>  Unicode code points may also be expressed using an \uXXXX (U+0 to
>  U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a
>  hexadecimal digit [0-9A-F]

I assume that what is meant here is the use of 7-bit-safe characters to
express Unicode code points. This raises some questions:

->	Can this be mixed with true UTF-8 in the same payload?

	-> My advice would be NOT to allow this; think of
	cross-site scripting as an example of the pain you may
	get into at some point in the future.

->	Is there 'escaping' for the \u and \U sequences themselves?

	And if there is, can this be mixed with UTF-8? And if not,
	how does one know for a fact what mode one is in?

Or in other words:

->	If you really want this, better to define it more narrowly

OR

->	Drop it altogether.

So as to give strict parsers in hostile environments a chance.
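
To make the "define it narrower" option concrete, here is a minimal Python
sketch of what a strict parser might check. It is only an illustration under
assumptions of my own: the function name classify() and the either/or policy
(a payload is pure ASCII, raw UTF-8, or escaped, never a mix) are hypothetical
and not taken from the registration text. The uppercase-only hex digits follow
the quoted [0-9A-F] wording.

```python
import re

# Well-formed escapes as described in the quoted text: \uXXXX or \UXXXXXXXX,
# with uppercase hexadecimal digits only.
ESCAPE = re.compile(r"\\(?:u[0-9A-F]{4}|U[0-9A-F]{8})")
# Anything that merely looks like the start of such an escape.
ESCAPE_START = re.compile(r"\\[uU]")


def classify(payload: bytes) -> str:
    """Return 'ascii', 'utf8' or 'escaped'; raise ValueError otherwise.

    Hypothetical policy: raw non-ASCII characters and \\u/\\U escapes
    must not appear in the same payload.
    """
    # Invalid UTF-8 raises UnicodeDecodeError, a subclass of ValueError.
    text = payload.decode("utf-8")

    has_raw = any(ord(ch) > 0x7F for ch in text)
    escapes = ESCAPE.findall(text)

    # Every \u or \U must begin a complete, well-formed escape.
    # NOTE: this sketch does not handle an escaped backslash in front of
    # 'u' or 'U'; whether such escaping exists is the open question above.
    if len(ESCAPE_START.findall(text)) != len(escapes):
        raise ValueError("malformed \\u or \\U escape")

    if has_raw and escapes:
        raise ValueError("raw UTF-8 mixed with \\u/\\U escapes")
    if escapes:
        return "escaped"
    if has_raw:
        return "utf8"
    return "ascii"
```

With a rule like this, a receiver knows which mode it is in after one pass
over the payload and can reject everything else outright.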

DW

Received on Thursday, 9 March 2006 09:43:54 UTC