W3C

- DRAFT -

XML Processing Model WG

Meeting 174, 17 Jun 2010

Agenda

See also: IRC log

Attendees

Present
Paul, Henry, Alex, Norm
Regrets
Vojtech, Mohamed
Chair
Norm
Scribe
Norm

Accept this agenda?

-> http://www.w3.org/XML/XProc/2010/06/17-agenda

Norm: Let's add my HTML/encoding question and drop 2.1 because there's nothing new today.

Henry: No, there's one thing we can talk about wrt 2.1

Accepted.

Accept minutes from the previous meeting?

-> http://www.w3.org/XML/XProc/2010/06/10-minutes

Accepted.

Next meeting: telcon, 24 June 2010?

Paul is at risk, he'll dial in if he can.

Of HTML and encodings

Some discussion of what the expected processing is for an XHTML document sent as text/html

Alex: If you do this with a Reader in Java, you've already made the encoding choice. On an InputStream, you haven't.
... What processors do here is sniff if the content type isn't specified and work out the encoding from the first 200 bytes or so.
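
As an illustration of the sniffing Alex describes, the following is a minimal sketch, not any particular processor's implementation. It checks for a byte order mark, then looks for an XML declaration in roughly the first 200 bytes. The class name `EncodingSniffer`, the `SNIFF_LIMIT` constant, and the UTF-8 fallback are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodingSniffer {
    // Assumption: 200 bytes, following the "first 200 bytes or so" remark.
    private static final int SNIFF_LIMIT = 200;

    // Matches the encoding pseudo-attribute of an XML declaration at the start of the input.
    private static final Pattern XML_DECL_ENCODING =
        Pattern.compile("^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    /** Returns a best-guess encoding name for the raw bytes of a document. */
    public static String sniff(byte[] data) {
        // 1. A byte order mark, if present, decides the encoding.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";

        // 2. Otherwise read the head of the document as ISO-8859-1, which maps
        //    every byte to one character, and look for an XML declaration.
        int n = Math.min(data.length, SNIFF_LIMIT);
        String head = new String(data, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = XML_DECL_ENCODING.matcher(head);
        if (m.find()) return m.group(1);

        // 3. Assumed fallback for this sketch: UTF-8.
        return "UTF-8";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

The encoding choice is only committed once the bytes are wrapped in a Reader, for example with `new InputStreamReader(new ByteArrayInputStream(data), Charset.forName(EncodingSniffer.sniff(data)))`.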

<ht> http://www.rfc-editor.org/rfc/rfc2854.txt

Henry: I've been looking at RFC 2854, the RFC that currently governs text/html
... oddly, the RFC makes several observations but doesn't actually seem to say what to do.

Spec exploration ensues

Henry: The final note in 7.1.10.4 is clearly wrong: if there's a charset parameter, it is text.

<ht> Content-Type: text/html; charset=utf-8

<scribe> ACTION: Norm to propose an erratum for the note at the end of 7.1.10.4 to add something like "without a charset" [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action01]

<ht> Or you could have said override-content-type="text/html; charset=utf-8"

Norm: Yes, I could. That might be the easiest solution, in fact.
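
To make the charset point concrete, here is a minimal sketch of pulling the charset parameter out of a Content-Type value such as the one Henry pasted. The class `ContentTypeCharset` and method `charsetOf` are hypothetical names for this example. The parsing is deliberately simplified: it assumes no semicolons inside quoted parameter values.

```java
import java.util.Locale;
import java.util.Optional;

public class ContentTypeCharset {
    /**
     * Returns the charset parameter of a Content-Type value, if there is one.
     * Simplification (assumption): parameters are split on ';' without
     * honouring semicolons inside quoted strings.
     */
    public static Optional<String> charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) continue;
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = param.substring(eq + 1).trim();
            if (name.equals("charset")) {
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

With this sketch, `charsetOf("text/html; charset=utf-8")` yields `utf-8`, while a bare `text/html` yields an empty result, which is the "without a charset" case the action item refers to.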

Some discussion of content transfer encoding.

<ht> For what it's worth, RFC2616 defines 'entity body' as the octets in the message

<ht> Wrong

<ht> "The entity-body is obtained

<ht> from the message-body by decoding any Transfer-Encoding that might

<ht> have been applied to ensure safe and proper transfer of the message.

<ht> "

<scribe> ACTION: Norm to propose an erratum for 7.1.10.3 to clarify that "decoded if necessary" applies to Content-Encoding headers. [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action02]

Henry: The .svgz documents should allow us to demonstrate the problem pretty quickly.
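
The kind of decoding at issue can be sketched as follows, assuming a response body that arrives with `Content-Encoding: gzip`, as .svgz documents commonly do. The class `ContentDecoding` and its methods are hypothetical names for this example. Only gzip is handled, and the `gzip` helper exists only to build test input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ContentDecoding {
    /** Undoes the Content-Encoding of a body; null or "identity" means no encoding. */
    public static byte[] decode(byte[] body, String contentEncoding) {
        if (contentEncoding == null) return body;
        String enc = contentEncoding.trim();
        if (enc.isEmpty() || enc.equalsIgnoreCase("identity")) return body;
        if (enc.equalsIgnoreCase("gzip") || enc.equalsIgnoreCase("x-gzip")) {
            try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body));
                 ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Other codings (deflate, compress, ...) are out of scope for this sketch.
        throw new IllegalArgumentException("Unsupported Content-Encoding: " + contentEncoding);
    }

    /** Helper for building test input: gzip-compresses the given bytes. */
    public static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

A processor that skips this step would hand the compressed octets to the XML parser, which is presumably the failure an .svgz document would make visible.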

Any other business?

None heard.

Adjourned.

Summary of Action Items

[NEW] ACTION: Norm to propose an erratum for the note at the end of 7.1.10.4 to add something like "without a charset" [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action01]
[NEW] ACTION: Norm to propose an erratum for 7.1.10.3 to clarify that "decoded if necessary" applies to Content-Encoding headers. [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action02]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2010/06/22 14:29:26 $