See also: IRC log
Norm: Let's add my HTML/encoding question and drop 2.1 because there's nothing new today.
Henry: No, there's one thing we can talk about wrt 2.1
Paul is at risk; he'll dial in if he can.
Some discussion of what the expected processing is for an XHTML document sent as text/html
Alex: If you do this with a Reader in Java, you've already made the encoding choice. On an InputStream, you haven't.
... If the content type isn't specified, what processors do here is sniff: they work out the encoding from the first 200 bytes or so.
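A minimal sketch of the kind of sniffing Alex describes, assuming the processor has raw bytes and no declared content type. The function name sniff_encoding, the 200-byte window, the order of checks, and the utf-8 fallback are illustrative assumptions, not anything a particular processor is known to do.

```python
import codecs
import re

# Byte-order marks, checked longest first so that UTF-32 LE
# (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# encoding="..." inside an XML declaration.
_XML_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)
# charset=... inside an HTML <meta> element (either the short form
# or the http-equiv form).
_META = re.compile(
    rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def sniff_encoding(data: bytes, window: int = 200, default: str = "utf-8") -> str:
    """Guess an encoding name from the first `window` bytes (hypothetical helper)."""
    head = data[:window]
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    for pattern in (_XML_DECL, _META):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    # Nothing found in the window: fall back to the assumed default.
    return default
```

This only works on bytes. Once the input has been turned into decoded text (the Reader case), the encoding choice has already been made and there is nothing left to sniff.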
Henry: I've been looking at RFC 2854, the RFC that currently governs text/html
... oddly, the RFC makes several observations but doesn't actually seem to say what to do.
Spec exploration ensues
Henry: The final note in 22.214.171.124 is clearly wrong: if there's a charset parameter, it is text.
<ht> Content-Type: text/html; charset=utf-8
<scribe> ACTION: Norm to propose an erratum for the note at the end of 126.96.36.199 to add something like "without a charset" [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action01]
<ht> Or you could have said override-content-type="text/html; charset=utf-8"
Norm: Yes, I could. That might be the easiest solution, in fact.
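As an illustration of the distinction being drawn, here is a small sketch that splits a Content-Type value into its media type and charset parameter using Python's standard email package. The helper name split_content_type is hypothetical, and this is not how any XProc processor is known to implement the check; it only shows that "text/html; charset=utf-8" carries an explicit charset while bare "text/html" does not.

```python
from email.message import Message


def split_content_type(header_value: str):
    """Return (media_type, charset or None) for a Content-Type value.

    Hypothetical helper: it leans on the standard library's header
    parsing rather than on anything defined by the spec under discussion.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


with_charset = split_content_type("text/html; charset=utf-8")
without_charset = split_content_type("text/html")
```

The same call applies equally to a value supplied through override-content-type, since that is also just a Content-Type string.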
Some discussion of content transfer encoding.
<ht> For what it's worth, RFC2616 defines 'entity body' as the octets in the message
<ht> "The entity-body is obtained from the message-body by decoding any Transfer-Encoding that might have been applied to ensure safe and proper transfer of the message."
<scribe> ACTION: Norm to propose an erratum for 188.8.131.52 to clarify that "decoded if necessary" applies to Content-Encoding headers. [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action02]
Henry: The .svgz documents should allow us to demonstrate the problem pretty quickly.
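A rough sketch of why .svgz makes the problem visible: an .svgz file is gzip-compressed SVG, so a body served with Content-Encoding: gzip is not XML until that encoding has been undone. The function name decode_entity_body and the plain-dict headers argument are assumptions for illustration. Real header names are case-insensitive, and a real client may already have decoded the body.

```python
import gzip


def decode_entity_body(body: bytes, headers: dict) -> bytes:
    """Undo a gzip Content-Encoding, if one is declared (hypothetical helper).

    `headers` is assumed to be a plain dict keyed by the exact string
    "Content-Encoding"; that is a simplification for this sketch.
    """
    encoding = headers.get("Content-Encoding", "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "identity":
        return body
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


# Simulate a server sending an .svgz document.
svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
svgz_body = gzip.compress(svg)
decoded = decode_entity_body(svgz_body, {"Content-Encoding": "gzip"})
```

Without the decoding step the body starts with the gzip magic bytes 1F 8B rather than "<", which is the failure a test with .svgz documents should surface quickly.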