XHTML-Print Last Call Working Draft review Disposition of Comments

20 January 2004

This version: http://www.w3.org/MarkUp/2004/xhtml-print-lc-doc-20040120
Editor: Jim Bigelow, Hewlett-Packard Co.

Abstract

This document outlines the way in which the HTML Working Group addressed the comments submitted during the XHTML-Print Last Call Working Draft review period.

Status of this document

During the Last Call Working Draft review period for XHTML-Print a number of comments were received from both inside and outside of the W3C. This document summarizes those comments and describes the ways in which the comments were addressed by the HTML Working Group.

Note that the majority of this document is automatically generated from the Working Group's database of comments. As such, it may contain typographical or stylistic errors. If so, these are contained in the original submissions, and the HTML Working Group elected to not change these submissions.

This document is a product of the W3C's HTML Working Group. This document may be updated, replaced or rendered obsolete by other W3C documents at any time. It is inappropriate to use this document as reference material or to cite it as other than "work in progress". This document is work in progress and does not imply endorsement by the W3C membership.

This document has been produced as part of the W3C HTML Activity. The goals of the HTML Working Group (members only) are discussed in the HTML Working Group charter (members only).

Please send detailed comments on this document to www-html-editor@w3.org. We cannot guarantee a personal response, but we will try when it is appropriate. Public discussion on HTML features takes place on the mailing list www-html@w3.org.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

Issue	State	Resolution
6492: Incorrect example in Appendix B.3 of XHTML Print	Closed	Fixed incorrect syntax of example
6782: Minor editorial comments	Closed	Apply changes as noted -- Jim
6869: XHTML-Print: change of url from xhtml-print.org to w3c.org breaks current implementations.	Closed	Equivalent of 6777
6772: Scripts and Events	Closed	Defined ignore as display:none
6773: Document Conformance	Closed	DOCTYPE does not add an extra burden on printers
6774: allow UTF-16 not just UTF-8	Closed	Lexmark dissenting, all others accept
6775: why does object type override content type/HTTP level?	Closed	Agreed changed wording to say resources
6870: XHTML-Print: treating a missing media attribute as media="screen"	Closed	changed to "all"
6776: support for character entities too expensive for low-cost printers	Closed	No response to response to reply 2, assuming agreement.
6777: MIME type Application/Multiplexed not correct	Closed	correct spec as indicated in issue
6871: XHTML-Print: Appendix B.2.1 uses "image header" without defining it.	Closed	defined image header
6778: Required support for script, noscript, and hidden	Closed	same issue as 6772
6779: treatment of attributes	Closed	resolve as comment (albeit a nice one)
6780: change of MIME type to application/xhtml+xml not compatible with UPnP	Closed	Printers must support W3C and PWG MIME Type and DTD. PWG versions deprecated.
6815: Relaxing XHTML-Print's restriction to UTF-8 to include UTF-16	Approved	duplicate of 6774
6781: Change to wording of Section 2.3.1, "Images" section, fourth bullet confusing	Closed	change spec to use wording in followup 1
6783: RFC 2119 keyword in informative section	Closed	Remove RFC 2119 keyword annotations from informative section -- Jim
6784: Diagram 1 height & width not right	Closed	Change height and width -- Jim
6785: Spell out abbreviations at first occurance	Closed	Make changes as noted -- Jim
6786: markup elements and attributes globally	Closed	Make changes as noted. -- Jim

1. XHTML-Print

1.1 Incorrect example in Appendix B.3 of XHTML Print

PROBLEM ID: 6492

STATE: Closed
RESOLUTION: Modify and Accept
USER POSITION: Agree

NOTES:

  Fixed incorrect syntax of example

ORIGINAL MESSAGE:

  From: Jun Fujisawa <fujisawa.jun@canon.co.jp>
  
  From: Jun Fujisawa <fujisawa.jun@canon.co.jp>
  To: www-html-editor@w3.org
  Cc: www-html@w3.org, xp@pwg.org,
  	Jon Ferraiolo <jon.ferraiolo@adobe.com>
  Subject: Incorrect example in Appendix B.3 of XHTML Print
  Date: Fri, 25 Jul 2003 12:48:47 +0900
  Message-Id: <p05111011bb4654080f6f@[172.23.45.13]>
  X-Archived-At: http://www.w3.org/mid/p05111011bb4654080f6f@%5B172.23.45.13%5D
  
  Hello HTML editors,
  
  Here is a comment to the last call draft for XHTML Print.
  
  At 6:28 PM +0200 03.7.24, Steven Pemberton wrote:
  >XHTML-Print
  >http://www.w3.org/MarkUp/Group/2003/WD-xhtml-print-20030723
  
  Jon Ferraiolo of SVG WG found out that the example in Appendix
  B.3 looks strange since the two instances of 'object' element have
  the same value for 'id' attribute in a single XML document.
  
  <object declare="declare"
       height="20 mm" width="20 mm"
       type="image/jpeg"
       id="image_1" >
  </object>
  
  . . . .
  
  <object id="image_1"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
  </object>
  
  I believe the example is not correct. Also, I think the choice of this
  particular example is not appropriate because we don't need to use
  the case for 'object' element with 'declare' attributes in order to
  show how we can include inline image data in XHTML-Print by using
  data URI scheme.
  
  I would like to suggest to replace this example by simpler ones such
  as the following:
  
  <object height="20 mm" width="20 mm" type="image/jpeg"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
       Example Image
  </object>
  
  or
  
  <img height="20 mm" width="20 mm" alt="Example Image"
       src="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . " />
  
  -- 
  Jun Fujisawa
  <mailto:fujisawa.jun@canon.co.jp>

FOLLOWUP 1:


  From: Masayasu Ishikawa <mimasa@w3.org>
  
  So, we are receiving Last Call comments even before publication.  Great.
  
  Jim, do you think this is an easy-to-fix thing that we should just
  do it now (i.e. fix it and publish the Last Call WD, which should
  happen today), or leave it for now and fix later?
  
  -- 
  Masayasu Ishikawa / mimasa@w3.org
  W3C - World Wide Web Consortium
  

FOLLOWUP 2:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  > From: don@lexmark.com [mailto:don@lexmark.com] 
  > Sent: Friday, July 25, 2003 5:15 AM
  > To: Jun Fujisawa
  > Cc: xp@pwg.org; jim.bigelow@hp.com
  > Subject: Re: XP> Incorrect example in Appendix B.3 of XHTML Print
  > 
  > 
  > 
  > Jun:
  > 
  > The intent of this example was to show how an image can be 
  > declared inline with the other XHTML while the actual data 
  > for the image may come later. Neither of your two 
  > alternatives separate the declaration of the image from the 
  > actual data of the image.  If the example provided is 
  > incorrect, can you provide an example that achieves this separation?
  > 
  > **********************************************
  >  Don Wright                 don@lexmark.com
  > 
  >  Chair,  IEEE SA Standards Board
  >  Member, IEEE-ISTO Board of Directors
  >  f.wright@ieee.org / f.wright@computer.org
  > 
  >  Director, Alliances & Standards
  >  Lexmark International
  >  740 New Circle Rd
  >  Lexington, Ky 40550
  >  859-825-4808 (phone) 603-963-8352 (fax)
  > **********************************************
  >

FOLLOWUP 3:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
   From: Jun Fujisawa [mailto:fujisawa.jun@canon.co.jp] 
   Sent: Monday, July 28, 2003 3:44 AM
   To: don@lexmark.com
   Cc: xp@pwg.org; jim.bigelow@hp.com
   Subject: Re: XP> Incorrect example in Appendix B.3 of XHTML Print
   
   
   Hello Don,
   
   At 8:15 AM -0400 03.7.25, don@lexmark.com wrote:
   >The intent of this example was to show how an image can be declared 
   >inline with the other XHTML while the actual data for the image may 
   >come later.
   
   I don't understand the intent. If you want to get actual image 
   data later (not at the declaration), you can just use 'img' 
   or 'object' element without 'declare' attribute.
   
   >If the example provided is incorrect, can
   >you provide an example that achieves this separation?
   
   The following example shows one type of separation, but I 
   don't think that meets your need.
   
   <object id="image_1" declare="declare" type="image/jpeg"
        data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
   </object>
   
   . . . .
   
   <object height="20 mm" width="20 mm"
        data="#image_1" >
   </object>
   
   -- 
   Jun Fujisawa
   <mailto:fujisawa.jun@canon.co.jp>

FOLLOWUP 4:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: ElliottBradshaw@oaktech.com  [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Friday, August 01, 2003 8:07 AM
  To: Jun Fujisawa
  Cc: don@lexmark.com; jim.bigelow@hp.com; owner-xp@pwg.org; xp@pwg.org
  Subject: Re: XP> Incorrect example in Appendix B.3 of XHTML Print

  I see two issues here, perhaps separable.

  1.  Use of inline data.

  This can be accomplished by adding support for the data scheme.

  Examples (from Fujisawa-san):

  <object height="20 mm" width="20 mm" type="image/jpeg"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
       Example Image
  </object>

  or

  <img height="20 mm" width="20 mm" alt="Example Image"
       src="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . " />

  2.  Separation of the data from the reference

  This is where the declare attribute comes in.  I went back and read

    http://www.w3.org/TR/html4/struct/objects.html#h-13.3.4

  It seems to me that the declare facility would let a client supply the
  content for the object before its reference, not after.  If the requirement
  is that the client can send the image data at the end, I'm not sure that
  HTML supports that.

  If there is a requirement that the client can send the data first, then
  refer to it, then an example (again, thanks Fujisawa) is:

  <object id="image_1" declare="declare" type="image/jpeg"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
  </object>

  . . . .

  <object height="20 mm" width="20 mm"
       data="#image_1" >
  </object>

  I think the first requirement is good to have, but we can probably drop the
  second, especially since the ordering is probably not what we want.

  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

FOLLOWUP 5:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: BIGELOW,JIM (HP-Boise,ex1) 
  Sent: Friday, August 01, 2003 8:38 AM
  To: 'ElliottBradshaw@oaktech.com'; Jun Fujisawa
  Cc: don@lexmark.com; BIGELOW,JIM (HP-Boise,ex1); 
  owner-xp@pwg.org; xp@pwg.org
  Subject: RE: XP> Incorrect example in Appendix B.3 of XHTML Print

  Elliott wrote:
  > I see two issues here, perhaps separable.
  > 1.  Use of inline data.
  > 
  > This can be accomplished by adding support for the data scheme. ...
  > 
  > 2.  Separation of the data from the reference
  > 
  > ...
  > 
  > I think the first requirement is good to have, but we can
  > probably drop the second, especially since the ordering is 
  > probably not what we want.
  > 

  I'm not perfectly clear on what you think the requirements should be.  The
  current spec says that printer may support in-line data via the object/img
  elements, but is not required to.  

  Are you calling for a change to this statement?

  Arguments against requiring support for in-line image data have been that:
  1. it requires too much buffering
  2. the image data could overflow the memory used to store element
  attributes.  Alternately, to avoid the possibility of exceeding the memory
  set aside for storing element attributes while processing a job, a printer
  must either reserve large amounts of memory to avoid problems in this one,
  almost unique case, or implement a complex, dynamic memory allocation
  scheme. 

  In any event, supporting in-line data via the object and img elements
  means that the entire image is funneled through the document parser,
  whereas, alternate means of handling image data are possible if the image is
  referenced via the cid or http schemes. 

  There is another method for managing image data buffering, Section B.2.1
  In-line images of the W3C spec provides some informative suggestions about
  ways to stage the delivery of image data using the (required) multiplexed
  document format. This method seeks to reduce the memory needed to store
  images while processing the document, by providing enough of the image
  header to determine the image's size, synchronized with the image's
  reference.  The remainder or bulk of the image is delivered later in the
  document, hopefully, when the printer is ready to commit the image to the
  page.

  Jim
  --

FOLLOWUP 6:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Friday, August 01, 2003 9:46 AM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: don@lexmark.com; Jun Fujisawa; BIGELOW,JIM  (HP-Boise,ex1);
  owner-xp@pwg.org; xp@pwg.org
  Subject: RE: XP> Incorrect example in Appendix B.3 of XHTML Print

  Sorry, I didn't mean to change the actual requirements.  Section B.3 should
  stay informative and just be a discussion of different things a printer may
  choose to implement.

  However, there is at least one case of a conditional requirement elsewhere
  in the document (the Object Module) that refers to this section.

  But, it is confusing what problem this section is trying to solve (in an
  optional way).  And, it looks like the example for use of the declare
  attribute is just plain wrong.

  I propose that we re-write this section to eliminate all discussion of the
  declare attribute, and simply show how to use the data URL scheme to handle
  inline data.

  For example:

  <proposal>

  This section is informative.

  An alternative method to include inline image data in XHTML-Print is via the
  "data" URL scheme (see RFC2397). Because this method normally encodes the
  binary image data using base64 encoding, a significant increase in the size
  of the data transmitted will be experienced. This SHOULD be avoided over low
  speed connections. Printers supporting inline data MAY support base64
  encoding using the img or object element.

  <object height="20 mm" width="20 mm" type="image/jpeg"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
       Example Image
  </object>

  or

  <img height="20 mm" width="20 mm" alt="Example Image"
       src="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . " />

  This method MAY be useful for very simple clients that cannot afford a
  server for image downloading or for some reason cannot utilize the
  Application/Multiplexed MIME type; however, it is not RECOMMENDED for
  general use especially if the size of the printer's buffer is unknown.

  </proposal>

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Fujisawa-san,
  
  Thank you for your comment.  It is recorded as issue 6492 [1] in the HTML Working
  Group's issue tracking system. The working group has elected to accept this
  defect and modify XHTML-Print spec by accepting Elliott Bradshaw's proposal to
  change Appendix B.3 to read as shown below.  If this is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  
  --
  This section is informative.
  
  
  An alternative method to include inline image data in XHTML-Print is via the
  "data" URL scheme (see RFC2397). Because this method normally encodes the
  binary image data using base64 encoding, a significant increase in the size
  of the data transmitted will be experienced. This SHOULD be avoided over low
  speed connections. Printers supporting inline data MAY support base64
  encoding using the img or object element.
  
  <object height="20 mm" width="20 mm" type="image/jpeg"
       data="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . ">
       Example Image
  </object>
  
  or
  
  <img height="20 mm" width="20 mm" alt="Example Image"
       src="data:image/jpeg;base64,aGh67Fghsapa0Hji7dfGSweTa . . . " />
  
  This method MAY be useful for very simple clients that cannot afford a
  server for image downloading or for some reason cannot utilize the
  Application/Multiplexed MIME type; however, it is not RECOMMENDED for
  general use especially if the size of the printer's buffer is unknown.
  
  
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6492;user=guest
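
The size increase mentioned in the accepted text can be made concrete with a small sketch. It assumes nothing beyond base64 itself, which maps every 3 input bytes to 4 output characters, so the payload grows by roughly one third. The function name to_data_uri and the placeholder payload are illustrative only and do not come from the XHTML-Print specification.

```python
import base64


def to_data_uri(payload: bytes, media_type: str = "image/jpeg") -> str:
    """Build an RFC 2397 "data" URL carrying a base64-encoded payload."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


# Placeholder bytes standing in for a real JPEG file (an assumption
# made for illustration; a client would read the actual image data).
payload = bytes(range(256)) * 12  # 3072 bytes

uri = to_data_uri(payload)
prefix = "data:image/jpeg;base64,"
encoded_len = len(uri) - len(prefix)

print(len(payload), encoded_len)                     # raw size vs. base64 size
print(f"growth: {encoded_len / len(payload):.2f}x")  # about one third larger
```

The resulting string is what would appear in the data attribute of object or the src attribute of img in the examples above.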

1.2 Minor editorial comments

PROBLEM ID: 6782

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Apply changes as noted -- Jim

ORIGINAL MESSAGE:

  From: Susan Lesch [mailto:lesch@w3.org] 
  
  These are minor editorial comments for your XHTML-Print Last Call Working
  Draft [1]. Kudos to the editor and your group(s). It looks great.
  
  s/family of XHTML Languages/family of XHTML languages/ s/members/Members/
  s/whitespace/white space/ s/Style Sheet/style sheet/
  s/guillemots/guillemets/ s/ththe/the/ s/, support/. Support/
  
  [extracted from 6899]
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  
  Best wishes for your project,
  -- 
  Susan Lesch           http://www.w3.org/People/Lesch/
  mailto:lesch@w3.org               tel:+1.858.483.4819
  World Wide Web Consortium (W3C)    http://www.w3.org/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6782 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6782;user=guest

1.3 XHTML-Print: change of url from xhtml-print.org to w3c.org breaks current implementations.

PROBLEM ID: 6869

STATE: Closed
RESOLUTION: Modify and Accept
USER POSITION: Agree

NOTES:

  Equivalent of 6777

ORIGINAL MESSAGE:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: www-html-editor@w3.org
  Cc: xp@pwg.org
  Subject: XHTML-Print: change of url from xhtml-print.org to w3c.org breaks current implementations.
  Date: Thu, 4 Sep 2003 11:02:17 -0700 
  Message-ID: <020A3CF87FB5AC47AA67966B33845755050DB585@xboi22.boise.itc.hp.com>
  X-Archived-At: http://www.w3.org/mid/020A3CF87FB5AC47AA67966B33845755050DB585@xboi22.boise.itc.hp.com

  The W3C Last Call Working Draft of XHTML-Print [1] changes the URL in the
  DOCTYPE from 
  "http://www.xhtml-print.org/xhtml-print/xhtml-print10.dtd" to
  "http://www.w3.org/MarkUp/DTD/xhtml-print10.dtd".

  This breaks compatibility with existing implementations. Can this situation
  be handled by redirecting the xhtml-print.org url to the w3.org url?  If so,
  how is this done?

  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/

  Jim Bigelow
  Hewlett-Packard

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Jonny Axelsson wrote:
  
  Just for my curiosity: How does that break backwards compatibility? The 
  old DTD will presumably remain at the www.xhtml-print.org location for at 
  least as long as is needed (for the current implementations), while new or 
  updated XHTML-Print implementations will use the new location. Or?
  
  -- 
  Jonny Axelsson,
  Web Standards,
  Opera Software

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Elliott Bradshaw wrote:
  
  Don is going to remind us (as well he should) that the URL is not used for a
  live retrieval from that server.  So a redirect doesn't work.
  
  So I think this is, technically, an incompatible change.  But I think it's one
  we could live with.
  
  --------------------------------------------------------------------------------
  
  Elliott Bradshaw
  Director, Software Engineering
  Zoran Imaging Group (formerly Oak Technology Imaging Group)
  781 638-7534

REPLY 3:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Jim Bigelow wrote:
  Jonny,
  
  Thanks for the question.  
  
  If a document with the w3c DTD is sent to a printer that shipped with firmware
  written using the spec saying that conforming XHTML-Print documents must have a
  DTD containing a URL to the xhtml-print.org DTD, then it is possible that
  the document wouldn't print correctly, even though the printer
  is not validating.   
  
  In the extreme case, it is possible that the document wouldn't print at all,
  since Section 2.3.1, item 1 says, "A printer MAY ignore or otherwise reject a
  non-conforming XHTML-Print document."
  
  I think we're all better off avoiding  things that could make the user unhappy!
  :-)
  
  
  Jim

REPLY 4:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6869 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group, following the reasoning of issue 6780 [2], decided that both
  the DTD in Appendix C of the W3C spec [3] and the DTD in Appendix C of the PWG
  XHTML-Print [4] must be accepted.  However, the DTD in Appendix C of the PWG
  XHTML-Print [4] is deprecated in favor of the DTD in Appendix C of the W3C
  spec [3]. Future releases of this specification may remove the required
  support for the DTD in Appendix C of the PWG XHTML-Print [4].
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6869;user=guest
  [2] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6780;user=guest
  [3] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  [4] http://www.pwg.org/xhtml-print/HTML-Version/XHTML-Print.html

1.4 Scripts and Events

PROBLEM ID: 6772

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Defined ignore as display:none

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>

  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: Scripts and Events
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi

  1.3.1 Script and Events
  Since the specification requires the documents to conform to 
  restrictions that are not applicable to all XHTML documents, it is 
  unlikely that casually authored XHTML documents would happen to be 
  conforming XHTML-Print documents. Therefore, it is reasonable to expect 
  some preprocessing to take place in the application before sending a 
  document to the printer. That application could be required to discard 
  script elements without burdening the printer with that task.

  Such modification would change the document tree, though, and could 
  change the matching of CSS selectors. If it is important to take into 
  account the special case that someone could use a CSS selector such as 
  "script + p" to style a paragraph, it would be necessary to elaborate 
  on what "discarding" an element on the printer means (that is, is it 
  discarded from the document tree or merely defaulted to display: none;).

  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment.  It is recorded as issue 6772 [1] in the HTML
  Working Group's issue tracking system. The working group has elected to
  accept your comment by clarifying that discarding an element should be
  equivalent to setting its display property to "none".
  
  If this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6772;user=guest
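
The distinction raised in this issue can be sketched as follows. The example is only an illustration under stated assumptions: it models the CSS adjacent-sibling selector "script + p" with a hand-written check over element siblings, and the sample document and the helper name matches_script_plus_p are hypothetical. It is not how any printer is known to implement selector matching.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Hypothetical sample document, not taken from the specification.
DOC = """<body xmlns="http://www.w3.org/1999/xhtml">
  <script type="text/javascript">/* ignored by the printer */</script>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</body>"""


def matches_script_plus_p(parent):
    """Return the text of each p whose immediately preceding sibling element is a script."""
    children = list(parent)
    return [cur.text for prev, cur in zip(children, children[1:])
            if prev.tag == XHTML + "script" and cur.tag == XHTML + "p"]


body = ET.fromstring(DOC)

# Case 1: script stays in the tree and is merely not rendered (display: none).
kept_in_tree = matches_script_plus_p(body)

# Case 2: script is removed from the document tree before styling.
for script in body.findall(XHTML + "script"):
    body.remove(script)
removed_from_tree = matches_script_plus_p(body)

print(kept_in_tree)       # ['First paragraph']
print(removed_from_tree)  # []
```

With the resolution adopted here (ignore means display: none), the first case applies and the selector continues to match.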

1.5 Document Conformance

PROBLEM ID: 6773

STATE: Closed
RESOLUTION: Reject
USER POSITION: No Response

NOTES:

  DOCTYPE does not add an extra burden on printers

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>

  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: Document Conformance
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi

  2.1 Document Conformance
  Considering that printers are allowed to ignore non-conforming 
  documents, requiring a particular doctype declaration and DTD validity 
  looks like a significant burden for applications producing XHTML-Print 
  documents. In particular, DTD validity requires namespaces to be 
  represented in a particular way even though other representations would 
  be semantically equivalent. This means applications producing 
  XHTML-Print documents cannot use any off-the-shelf XML serializer but 
  need a serializer specifically tailored to meet the requirements of 
  XHTML-Print.

  Wouldn't it be enough allow DTDless documents as long as the element 
  structure meets the requirements expressed in the DTD (even though this 
  kind of conformance can't be checked with a [DTD-]validating XML 
  processor)?

  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6773 [1] in the HTML 
  Working Group's issue tracking system. The working group does
  not agree that the inclusion of the required doctype element in 
  XHTML-Print documents would be a burden either to an application
  that produced XHTML-Print documents or a printer that processed 
  them.  Therefore, no change is planned to the specification regarding 
  your issue.
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6773;user=guest
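
The commenter's point about namespace representation can be illustrated with a short sketch. It assumes a namespace-aware parser (here Python's xml.etree.ElementTree) and two hypothetical serializations of the same tree; the helper name expanded_names is illustrative. A DTD matches literal element names such as "html", so only the first form could be DTD-valid, even though both forms yield identical namespace-qualified names.

```python
import xml.etree.ElementTree as ET

# Same element structure, two serializations (both hypothetical samples).
DEFAULT_NS = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>'
PREFIXED = ('<h:html xmlns:h="http://www.w3.org/1999/xhtml">'
            '<h:body><h:p>Hi</h:p></h:body></h:html>')


def expanded_names(xml_text):
    """List the namespace-expanded element names in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


same = expanded_names(DEFAULT_NS) == expanded_names(PREFIXED)
print(expanded_names(DEFAULT_NS))
print(same)
```

This only shows why the two forms are equivalent to a namespace-aware consumer; it does not bear on the working group's decision to keep the DOCTYPE requirement.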

1.6 allow UTF-16 not just UTF-8

PROBLEM ID: 6774

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Lexmark dissenting, all others accept

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>

  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: allow UTF-16 not just UTF-8
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi

  It is said that if a "charset" parameter is present for the 
  application/xhtml+xml MIME type, the only valid value is "utf-8". It 
  would make sense to allow "utf-16" as well. All XML processors are 
  required to support UTF-16 in addition to UTF-8, so allowing UTF-16 for 
  XHTML-Print doesn't cause any additional burden to implementations. 
  Also, the payload of Application/Vnd.pwg-multiplexed  chunks is defined 
  as octets, so UTF-16 strings can be delivered as  
  Application/Vnd.pwg-multiplexed  chunks without any further encoding.

  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

FOLLOWUP 1:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: don@lexmark.com [mailto:don@lexmark.com] 
  Sent: Tuesday, September 02, 2003 6:06 PM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: xp@pwg.org
  Subject: Re: XP> Relaxing XHTML-Print's restriction to UTF-8 to include
  UTF-16

  Jim:

  I would disagree.  I don't believe that all XHTML-Print enabled printers
  will necessarily bite the bullet and include a complete XML parser that
  requires support for UTF-16.  I don't believe we should force that to occur.
  Perhaps you should remind the group that XHTML-Print is targeted at LOW-END
  printers with this embedded.  No 3 gigahertz Pentium 4's with 512 MB of
  memory!!!

  *******************************************
  Don Wright                 don@lexmark.com

  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org

  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************

FOLLOWUP 2:


  From: jim.bigelow@hp.com
  
  I tend to agree with Henri. 
    -- Jim Bigelow

FOLLOWUP 3:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  > From: elliott.bradshaw@zoran.com [mailto:elliott.bradshaw@zoran.com] 
  > Sent: Wednesday, September 03, 2003 7:07 AM
  > To: don@lexmark.com
  > Cc: BIGELOW,JIM (HP-Boise,ex1); owner-xp@pwg.org; xp@pwg.org
  > Subject: Re: XP> Relaxing XHTML-Print's restriction to UTF-8 
  > to include UTF-16
  >
  Or to put it another way, XHTML-Print describes a single way of doing
  something, whereas HTML and its derivatives frequently support multiple
  ways of getting the same effect.
  
  In the past, we have resisted features that appear easy, unless they
  actually extend the capabilities of what can be done.
  
  Since I think a UTF-8 oriented client can get the same work done as a UTF-16
  client, we should not mandate the extension.
  
  IMHO.
  
    E.

FOLLOWUP 4:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  > From: Michael Sweet [mailto:mike@easysw.com]
  > Sent: Wednesday, September 03, 2003 7:26 AM
  > To: don@lexmark.com
  > Cc: BIGELOW,JIM (HP-Boise,ex1); xp@pwg.org
  > Subject: Re: XP> Relaxing XHTML-Print's restriction to UTF-8 
  > to include UTF-16
  
  I'm not so worried about memory usage; converting UTF-16 to UTF-8 on the
  input side is not expensive in terms of memory or processor.
  
  However, reliably detecting UTF-16 and managing the endianness of the words
  is a pain in the ass in the real world.  Assuming that all UTF-16 files
  start with FFFE or FEFF, the XML parser can handle the UTF-16 encoding
  without difficulty; however, certain large convicted software monopolies
  regularly omit this important information, making autodetection unreliable.
  
  Given the limited scope of XHTML-Print and the desire for maximum
  interoperability, I would recommend that we stick with UTF-8 as the only
  requirement so that applications that send XHTML-Print data have to use
  UTF-8 and manage whatever perversion of UTF-16 they use internally
  themselves...
  
  -- 
  ______________________________________________________________________
  Michael Sweet, Easy Software Products           mike at easysw dot com
  Printing Software for UNIX                       http://www.easysw.com

FOLLOWUP 5:


  From: don@lexmark.com
  
  
  I maintain my disagreement with this decision for all the reasons
  previously mentioned including:
  
  1) There are no characters which can be represented in UTF16 that cannot be
  represented in UTF8
  2) Reliable detection of UTF16 has not been proven
  3) High "zoot" clients can much more easily convert any UTF16 to UTF8
  4) Many of the target printers will have no need to deal with generic XML
  and hence no reason to support UTF16
  
  
  
  
  
  
  
  Jim Bigelow <voyager-issues@mn.aptest.com> on 09/26/2003 03:48:41 PM
  
  To:    hsivonen@iki.fi
  cc:    don@lexmark.com, elliott.bradshaw@zoran.com
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  Thank you for your comment on the XHTML-Print Last Call
  Working Draft. It is recorded as issue 6774 [1] in the HTML
  Working Group's issue tracking system.
  
  The working group agrees that since XHTML-Print is a member
  of the family of XHTML 1.0 languages, document encodings cannot
  be restricted to UTF-8 but must also include UTF-16.  The
  specification will be modified to remove the sentence,
  'The only valid value for the "charset" parameter is "utf-8".'
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest

FOLLOWUP 6:


  From: don@lexmark.com
  
  
  Works for me.
  
  **********************************************
   Don Wright                 don@lexmark.com
  
   Chair,  IEEE SA Standards Board
   Member, IEEE-ISTO Board of Directors
   f.wright@ieee.org / f.wright@computer.org
  
   Director, Alliances & Standards
   Lexmark International
   740 New Circle Rd
   Lexington, Ky 40550
   859-825-4808 (phone) 603-963-8352 (fax)
  **********************************************
  
  
  
  
  Jim Bigelow <voyager-issues@mn.aptest.com> on 09/29/2003 05:39:11 PM
  
  To:    don@lexmark.com
  cc:
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  Don,
  
  What do you think of the following compromise?
  1. say nothing about whether a printer supports UTF-8 or UTF-16
  2. require that conforming XHTML-Print documents be encoded in UTF-8 by
  requiring that conforming clients (Section 2.2) create documents that are
  encoded in UTF-8. This means adding the following to item 1 of Section 2.2:
  
  1. Clients SHALL produce a well-formed XHTML-Print document as defined in
  XHTML 1.0 [XHTML1] and in Document Conformance. The document SHALL be
  encoded using UTF-8 [RFC2279].
  
  
   Jim Bigelow

FOLLOWUP 7:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  To the HTML WG:
  
  Hello,
  
  Please help me understand this facet of XHTML-Print as a member of the
  Family of Languages defined by the Modularization of XHTML 1.0 -- must an
  application that processes XHTML-Print documents be a conforming XML
  processor?  
  
  I'm sure that it must be able to process XHTML-Print documents as described
  by the XHTML-Print specification, but are there other constraints?  For
  example, an xml processor is supposed to be able to process documents in
  UTF-8 and UTF-16.  Why does an XHTML-Print processor have to support UTF-16?
  What would be the reasons for not restricting the encoding to UTF-8?  
  
  The potential benefit of only requiring support for UTF-8, rather than both
  UTF-8 and UTF-16, is that more low-cost (in terms of memory and processing
  power) printers could process UTF-8 encoded XHTML-Print documents. Requiring
  support for both UTF-8 and UTF-16 increases the memory and processing
  requirements and thereby reduces the number of devices that could process
  XHTML-Print documents.
  
  One of the goals of XHTML-Print is to provide a document format for printing
  from and to low-cost devices, so keeping requirements to a minimum increases
  the possibilities that low-cost printers will implement support for it.  
  
  Several representatives of printer manufacturers have expressed the opinion
  that support for UTF-8 and not for UTF-16 is preferred.  Can you help me
  understand the technical reasons why UTF-16 support should be required, so
  we can judge the trade-offs in implementation costs versus capabilities?
  
  Jim

FOLLOWUP 8:


  From: elliott.bradshaw@zoran.com
  
  
  Jim,
  
  Um, seems to me like a game of semantics.  Whether we make a statement
  about the language or a statement about how the client generates it,
  seems like it's the same thing.
  
  I think the conflict here is:
  
  1.  PWG wanted a simple way to send print jobs.  No need for multiple
  ways to accomplish the same thing.
  
  2.  But there seem to be W3C rules about how one derives languages
  from XHTML.
  
  
  I do think that #2 is contrary to the purpose of the original
  project. Just as we are able to say that XHTML-Print does not mandate
  certain properties which are too hard for a printer (e.g. the caveats
  on the position property) we ought to be able to exclude something
  that is not appropriate to the problem at hand.
  
  The only justification for this extension is "W3C says so."  In
  principle we shouldn't do it.  But, as a compromise I could live with
  it if I had to.
  
  
  --
  
  Elliott Bradshaw 
  Director, Software Engineering Zoran Imaging Division
  (formerly Oak Technology Imaging Group) 781 638-7534 0

FOLLOWUP 9:


  From: Mail Delivery Subsystem <MAILER-DAEMON@hades.mn.aptest.com>
  
  This is a MIME-encapsulated message
  
  --h91Hhtb18706.1065030235/hades.mn.aptest.com
  
  The original message was received at Wed, 1 Oct 2003 12:43:53 -0500
  from IDENT:i5LhU/0sXY+dwkWULvPvTYjef6dRQYOI@localhost [127.0.0.1]
  
     ----- The following addresses had permanent fatal errors -----
  <don@lexmark>
      (reason: 550 Host unknown)
  
     ----- Transcript of session follows -----
  550 5.1.2 <don@lexmark>... Host unknown (Name server: lexmark: host not found)
  
  --h91Hhtb18706.1065030235/hades.mn.aptest.com
  Content-Type: message/delivery-status
  
  Reporting-MTA: dns; hades.mn.aptest.com
  Received-From-MTA: DNS; localhost
  Arrival-Date: Wed, 1 Oct 2003 12:43:53 -0500
  
  Final-Recipient: RFC822; don@lexmark
  Action: failed
  Status: 5.1.2
  Remote-MTA: DNS; lexmark
  Diagnostic-Code: SMTP; 550 Host unknown
  Last-Attempt-Date: Wed, 1 Oct 2003 12:43:54 -0500
  
  --h91Hhtb18706.1065030235/hades.mn.aptest.com
  Content-Type: message/rfc822
  
  Return-Path: <voyager-issues@mn.aptest.com>
  Received: from localhost (IDENT:i5LhU/0sXY+dwkWULvPvTYjef6dRQYOI@localhost [127.0.0.1])
  	by hades.mn.aptest.com (8.11.6/8.11.6) with ESMTP id h91Hhrb18704;
  	Wed, 1 Oct 2003 12:43:53 -0500
  Date: Wed, 1 Oct 2003 12:43:53 -0500
  Message-Id: <200310011743.h91Hhrb18704@hades.mn.aptest.com>
  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  To: don@lexmark, elliott.bradshaw@zoran.com
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  X-Loop: voyager-issues@mn.aptest.com
  
  Don and Elliott,
  
  The HTML working group discussed my question of why an XHTML-Print processor
  must be a conforming XML processor (in particular, why it must support both
  UTF-8 and UTF-16 encodings) on October 1, 2003.  
  
  The answer is that XHTML-Print must be a conforming XML processor and support
  both UTF-8 and UTF-16 encodings to preserve compatibility between xml-based
  applications.
  
  If XHTML-Print processors only supported UTF-8 then an xml-based application
  could not be reliably depended upon to emit an XHTML-Print document that the
  XHTML-Print application could process.  For example, an xml-based XForms
  application's output of an XHTML-Print document cannot be restricted by the
  XHTML-Print specification to UTF-8 since the application may not be able to
  control the encoding.
  
  Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
  heuristics for determining a document's encoding when the charset parameter of the
  MIME type [4] is absent.
  
  An example UTF-16 decoder is available at [5]; other encodings are at [6].
  
  Jim Bigelow
  
  [1] http://www.w3.org/TR/REC-xml#charencoding
  [2] http://www.w3.org/TR/REC-xml#sec-guessing
  [3] http://www.w3.org/TR/REC-xml
  [4] http://www.ietf.org/rfc/rfc3023.txt
  [5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
  [6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html
  
  --h91Hhtb18706.1065030235/hades.mn.aptest.com--
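
The detection heuristics referenced in the message above might be sketched in
Python as follows. This is a simplified reading of Appendix F of the XML
specification that covers only UTF-8 and UTF-16; the UCS-4 and EBCDIC cases
and the encoding declaration itself are ignored, and the function name
sniff_xml_encoding is hypothetical.

```python
def sniff_xml_encoding(data: bytes) -> str:
    """Guess UTF-8 or UTF-16 for an XML entity from its first bytes.

    Simplification: UCS-4 input is not recognised, so a UCS-4 byte order
    mark such as FF FE 00 00 would be misreported as UTF-16 here.
    """
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"   # byte order mark, big-endian
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"   # byte order mark, little-endian
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"       # UTF-8 signature
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"   # "<?" with no byte order mark, big-endian
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"   # "<?" with no byte order mark, little-endian
    return "utf-8"           # default when nothing else matches
```

The last two UTF-16 branches only help when the document begins with an XML
declaration; a UTF-16 document with neither a byte order mark nor a
declaration would be treated as UTF-8 by this sketch.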

FOLLOWUP 10:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  Here is Don Wright's objection to UTF-16 support. 

  Jim
  http://oz.boi.hp.com/~jhb/ 

  -----Original Message-----
  From: don@lexmark.com [mailto:don@lexmark.com] 
  Sent: Wednesday, October 08, 2003 9:42 AM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: elliott.bradshaw@zoran.com; www-html@w3.org
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)

  Jim:

  So let me understand this....

  Because people have poorly designed and written XML applications running on
  3 GHz Pentium 4s with 512 megabytes of real memory that do not allow the
  control over whether UTF-8 or UTF-16 are emitted, we are expecting to burden
  $49 printers with code to be able to detect and interpret both.

  I maintain my objection and my no vote.

  **********************************************
   Don Wright                 don@lexmark.com

   Chair,  IEEE SA Standards Board
   Member, IEEE-ISTO Board of Directors
   f.wright@ieee.org / f.wright@computer.org

   Director, Alliances & Standards
   Lexmark International
   740 New Circle Rd
   Lexington, Ky 40550
   859-825-4808 (phone) 603-963-8352 (fax)
  **********************************************

  "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com> on 10/08/2003 10:24:45 AM

  To:    don@lexmark.com
  cc:    elliott.bradshaw@zoran.com, www-html@w3.org
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)

  From
  http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest
  - reply #3

  Date: Wed Oct  1 12:43:54 2003

  Don and Elliott,

  The HTML working group discussed my question of why an XHTML-Print
  processor must be a conforming XML processor (in particular, why it must
  support both UTF-8 and UTF-16 encodings) on October 1, 2003.

  The answer is that XHTML-Print must be a conforming XML processor and
  support both UTF-8 and UTF-16 encodings to preserve compatibility between
  xml-based applications.

  If XHTML-Print processors only supported UTF-8 then an xml-based application
  could not be reliably depended upon to emit an XHTML-Print document that the
  XHTML-Print application could process.  For example, an xml-based XForms
  application's output of an XHTML-Print document cannot be restricted by the
  XHTML-Print specification to UTF-8 since the application may not be able to
  control the encoding.

  Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
  heuristics for determining a document's encoding when the charset parameter of
  the MIME type [4] is absent.

  An example UTF-16 decoder is available at [5]; other encodings are at [6].

  Jim Bigelow

  [1] http://www.w3.org/TR/REC-xml#charencoding
  [2] http://www.w3.org/TR/REC-xml#sec-guessing
  [3] http://www.w3.org/TR/REC-xml
  [4] http://www.ietf.org/rfc/rfc3023.txt
  [5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
  [6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html

  Jim
   http://oz.boi.hp.com/~jhb/

FOLLOWUP 11:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  -----Original Message-----
  From: elliott.bradshaw@zoran.com [mailto:elliott.bradshaw@zoran.com] 
  Sent: Thursday, October 09, 2003 2:14 PM
  To: don@lexmark.com
  Cc: BIGELOW,JIM (HP-Boise,ex1)
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)

  Don,

  As you know I have been skeptical of feature creep all along.  But I think
  this one may be different...here's why.

  When we originally conceived XHTML-Print the idea was that the client code
  would be essentially a hand-coded print driver.  But this W3C discussion
  brings up the idea that people could use XML application development tools
  as well.  This could be in our interest if it gives people an easy way to
  write XHTML-Print aware applications.  (And it seems to be pretty
  fundamental to the way they defined XML.)

  It seems that such tools don't like to be constrained to only one of UTF-8
  vs. UTF-16...it would be "unnatural" to limit a developer in this way.  It
  sort of reminds me of 10baseT vs. 100baseT, in which it seems odd to support
  one but not the other.

  How much complexity would this add to the $49 printer?  Once we know whether
  or not we are in UTF-16, it would add very little (if nothing else do a
  brute force conversion from UTF-16 to UTF-8).  Detection of UTF-16 is also
  straightforward, as described in 4.3.3 of http://www.w3.org/TR/REC-xml,
  which says the special Byte Order Mark is required at the beginning of
  UTF-16.  (It also says very clearly that UTF-16 support is required.)

  So I think the cost is low, the benefit of XML-based application tools might
  be significant, and technical alignment with XML makes it worth doing.

    E.

  --------------------------------------------------------------------------------

  Elliott Bradshaw
  Director, Software Engineering
  Zoran Imaging Division (formerly Oak Technology Imaging Group) 781 638-7534

  don@lexmark.com on 10/08/2003 12:41 PM

  To:    "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  cc:    elliott.bradshaw@zoran.com, www-html@w3.org
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)

  Jim:

  So let me understand this....

  Because people have poorly designed and written XML applications running on
  3 GHz Pentium 4s with 512 megabytes of real memory that do not allow the
  control over whether UTF-8 or UTF-16 are emitted, we are expecting to burden
  $49 printers with code to be able to detect and interpret both.

  I maintain my objection and my no vote.

  **********************************************
   Don Wright                 don@lexmark.com

   Chair,  IEEE SA Standards Board
   Member, IEEE-ISTO Board of Directors
   f.wright@ieee.org / f.wright@computer.org

   Director, Alliances & Standards
   Lexmark International
   740 New Circle Rd
   Lexington, Ky 40550
   859-825-4808 (phone) 603-963-8352 (fax)
  **********************************************

  "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com> on 10/08/2003 10:24:45 AM

  To:    don@lexmark.com
  cc:    elliott.bradshaw@zoran.com, www-html@w3.org
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)

  From
  http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest
  - reply #3

  Date: Wed Oct  1 12:43:54 2003

  Don and Elliott,

  The HTML working group discussed my question of why an XHTML-Print
  processor must be a conforming XML processor (in particular, why it must
  support both UTF-8 and UTF-16 encodings) on October 1, 2003.

  The answer is that XHTML-Print must be a conforming XML processor and
  support both UTF-8 and UTF-16 encodings to preserve compatibility between
  xml-based applications.

  If XHTML-Print processors only supported UTF-8 then an xml-based application
  could not be reliably depended upon to emit an XHTML-Print document that the
  XHTML-Print application could process.  For example, an xml-based XForms
  application's output of an XHTML-Print document cannot be restricted by the
  XHTML-Print specification to UTF-8 since the application may not be able to
  control the encoding.

  Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
  heuristics for determining a document's encoding when the charset parameter of
  the MIME type [4] is absent.

  An example UTF-16 decoder is available at [5]; other encodings are at [6].

  Jim Bigelow

  [1] http://www.w3.org/TR/REC-xml#charencoding
  [2] http://www.w3.org/TR/REC-xml#sec-guessing
  [3] http://www.w3.org/TR/REC-xml
  [4] http://www.ietf.org/rfc/rfc3023.txt
  [5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
  [6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html

  Jim
   http://oz.boi.hp.com/~jhb/

FOLLOWUP 12:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  Mike,
  
  I've neglected to update you on the discussions about UTF-8/UTF-16 support
  for XHTML-Print.  Please let us know your thoughts on the matter.
  
  You can see these discussions using the following link to the W3C's HTML
  Working Group issue database: 
  
  http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest
  
  In summary:
  HTML WG: must support UTF-8 & UTF-16 for interoperability with all other xml
  and xml-derived applications and processors.
  
  Lexmark: UTF-16 support is too expensive to support in a low-cost printer,
  and too hard to reliably detect, ...
  
  Oak/Zoran: UTF-16 wouldn't be too expensive to implement and enables a new
  class of XHTML-Print producing devices
  
  HP: UTF-16 allows for more compact representation of Asian character
  documents and would not be too much to implement.
  
  Jim Bigelow, 
  Editor: XHTML-Print & CSS Print Profile
  W3C HTML and CSS Working Groups
  http://www.w3.org/TR/xhtml-print/
  http://www.w3.org/TR/css-print/
  Hewlett-Packard
  208-396-2068
  jim.bigelow@hp.com
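
The size trade-off mentioned in the HP line of the summary above can be
checked with a few lines of Python. This is only an illustration; the helper
name encoded_sizes and the sample strings are made up for the example, and
the UTF-16 figure excludes the two-byte byte order mark.

```python
def encoded_sizes(text: str) -> dict:
    """Byte length of text in UTF-8 and in UTF-16 (no byte order mark)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),
    }


# CJK ideographs take 3 bytes each in UTF-8 but 2 bytes each in UTF-16.
print(encoded_sizes("印刷"))
# ASCII markup takes 1 byte per character in UTF-8 but 2 bytes in UTF-16.
print(encoded_sizes("<p>ok</p>"))
```

Because XHTML markup itself is ASCII, the net saving for a real document
depends on the ratio of markup to Asian text.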

FOLLOWUP 13:


  From: don@lexmark.com
  
  
  Steven, et al:
  
  The real problem is that the entire XML architecture was designed assuming
  high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  have already seen push back in other standards groups that consumer
  electronic devices and other smaller, lighter devices cannot afford all the
  luxuries demanded by an obese XML architecture.  Unless the XML community
  accepts subsetting, we can't expect the broadest support for XML to happen
  at the low end until the price/performance ratios experience another order
  of magnitude or two of improvement.  As recently reported in several of the
  trade magazines focused on IT professionals, the deployment of XML and Web
  Services is having significant negative impacts on the IT infrastructure,
  especially in the area of bandwidth utilization.  This is just another
  symptom of the same problem.
  
  I know I will lose this argument in the W3C but the realities of the
  XHTML-Print implementations will blow off UTF-16 as more fat with no
  benefit and simply not support it, "interoperable" or not.
  
  Sorry I'm not pure but practical.
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  
  To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
         <w3c-html-wg@w3.org>, <don@lexmark.com>
  cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  > From: don@lexmark.com [mailto:don@lexmark.com]
  
  > So let me understand this....
  >
  > Because people have poorly designed and written XML applications running
  > on 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  > the control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden $49 printers with code to be able to detect and interpret both.
  
  No Don. It is about interoperability and conforming to standards. XML allows
  documents to be encoded in either UTF8 or UTF 16: consumers must accept
  both, producers may produce either. An XHTML-Print printer will be just a
  consumer of an XML byte-stream at some IP address; we don't want to burden
  every program in the world that can produce XML with a switch that says
  "this output is going to a poor lowly XHTML Print processor that can't deal
  with UTF-16, so please produce UTF-8", especially since UTF 16 is the easy
  one to implement, and can only cost a few dozen bytes at best.
  
  If we changed this, XHTML Print would have to go back to last call, and you
  can bet your boots that the XML community would rise up against us, as it
  has in the past, and I can tell you we don't want to go there, and we would
  have a hundred people registering objections.
  
  Conforming to XML requirements comes with the territory of being XHTML. The
  XML community will not take lightly to us messing with their standards.
  
  Best wishes,
  
  Steven Pemberton

FOLLOWUP 14:


  From: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  
  > From: don@lexmark.com [mailto:don@lexmark.com]
  
  > So let me understand this....
  >
  > Because people have poorly designed and written XML applications running
  > on 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  > the control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden $49 printers with code to be able to detect and interpret both.
  
  No Don. It is about interoperability and conforming to standards. XML allows
  documents to be encoded in either UTF8 or UTF 16: consumers must accept
  both, producers may produce either. An XHTML-Print printer will be just a
  consumer of an XML byte-stream at some IP address; we don't want to burden
  every program in the world that can produce XML with a switch that says
  "this output is going to a poor lowly XHTML Print processor that can't deal
  with UTF-16, so please produce UTF-8", especially since UTF 16 is the easy
  one to implement, and can only cost a few dozen bytes at best.
  
  If we changed this, XHTML Print would have to go back to last call, and you
  can bet your boots that the XML community would rise up against us, as it
  has in the past, and I can tell you we don't want to go there, and we would
  have a hundred people registering objections.
  
  Conforming to XML requirements comes with the territory of being XHTML. The
  XML community will not take lightly to us messing with their standards.
  
  Best wishes,
  
  Steven Pemberton

FOLLOWUP 15:

  From: Michael Sweet <mike@easysw.com>

  BIGELOW,JIM (HP-Boise,ex1) wrote:
  > Mike,
  > 
  > I've neglected to update you on the discussions about UTF-8/UTF-16
  > support for XHTML-Print.  Please let us know your thoughts on the
  > matter.

  My concerns have always been about distinguishing between UTF-8
  and UTF-16.  After looking through the archive and the current XML
  spec, it does look like the BOM is required at the beginning of any
  UTF-16 XML document, so any autodetection problems can safely be
  blamed on Microsoft or whatever vendor is producing a non-conforming
  document.

  I do like the idea of recommending (a SHOULD, not a MUST) that the
  XHTML-Print client use the UTF-8 encoding, and add a note that the
  typical XHTML-Print device has limited CPU/memory available and
  the use of UTF-8 will potentially provide faster printing, etc.

  -- 
  ______________________________________________________________________
  Michael Sweet, Easy Software Products           mike at easysw dot com
  Printing Software for UNIX                       http://www.easysw.com

FOLLOWUP 16:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: elliott.bradshaw@zoran.com [mailto:elliott.bradshaw@zoran.com] 
  Sent: Thursday, October 09, 2003 2:14 PM
  To: don@lexmark.com
  Cc: BIGELOW,JIM (HP-Boise,ex1)
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)

  Don,

  As you know I have been skeptical of feature creep all along.  But I think
  this one may be different...here's why.

  When we originally conceived XHTML-Print the idea was that the client code
  would be essentially a hand-coded print driver.  But this W3C discussion
  brings up the idea that people could use XML application development tools
  as well.  This could be in our interest if it gives people an easy way to
  write XHTML-Print aware applications.  (And it seems to be pretty
  fundamental to the way they defined XML.)

  It seems that such tools don't like to be constrained to only one of UTF-8
  vs. UTF-16...it would be "unnatural" to limit a developer in this way.  It
  sort of reminds me of 10baseT vs. 100baseT, in which it seems odd to support
  one but not the other.

  How much complexity would this add to the $49 printer?  Once we know whether
  or not we are in UTF-16, it would add very little (if nothing else do a
  brute force conversion from UTF-16 to UTF-8).  Detection of UTF-16 is also
  straightforward, as described in 4.3.3 of http://www.w3.org/TR/REC-xml,
  which says the special Byte Order Mark is required at the beginning of
  UTF-16.  (It also says very clearly that UTF-16 support is required.)

  So I think the cost is low, the benefit of XML-based application tools might
  be significant, and technical alignment with XML makes it worth doing.

    E.

  --------------------------------------------------------------------------------

  Elliott Bradshaw
  Director, Software Engineering
  Zoran Imaging Division (formerly Oak Technology Imaging Group) 781 638-7534
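
The "brute force conversion" from UTF-16 to UTF-8 described above could look
roughly like the following Python sketch. It is not taken from any printer
implementation; the function name utf16_to_utf8 is hypothetical, and the
sketch assumes the input starts with a byte order mark so that the standard
"utf-16" codec can pick the byte order.

```python
import codecs


def utf16_to_utf8(chunks, encoding="utf-16"):
    """Yield UTF-8 bytes for an iterable of UTF-16 byte chunks.

    The incremental decoder buffers partial code units and surrogate pairs,
    so chunk boundaries may fall anywhere in the input stream.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    # Flush; raises UnicodeDecodeError if the input ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")


sample = "Grüße 印刷".encode("utf-16")          # includes a byte order mark
pieces = [sample[i:i + 3] for i in range(0, len(sample), 3)]
print(b"".join(utf16_to_utf8(pieces)).decode("utf-8"))
```

Only one small chunk is held in memory at a time, which is the property that
matters for the low-cost devices discussed in this thread.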

FOLLOWUP 17:


  From: "Steven Pemberton" <steven.pemberton@cwi.nl>
  
  But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  requirements. It is simpler than UTF 8! What's the problem?
  
  Steven
  
  ----- Original Message ----- 
  From: <don@lexmark.com>
  To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>; <w3c-html-wg@w3.org>;
  <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 12:20 AM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  > Steven, et al:
  >
  > The real problem is that the entire XML architecture was designed assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford all the
  > luxuries demanded by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to happen
  > at the low end until the price/performance ratios experience another order
  > of magnitude or two of improvement.  As recently reported in several of the
  > trade magazines focused on IT professionals, the deployment of XML and Web
  > Services is having significant negative impacts on the IT infrastructure,
  > especially in the area of bandwidth utilization.  This is just another
  > symptom of the same problem.
  >
  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.
  >
  > Sorry I'm not pure but practical.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  >
  > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > > From: don@lexmark.com [mailto:don@lexmark.com]
  >
  > > So let me understand this....
  > >
  > > Because people have poorly designed and written XML applications running
  > > on 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  > > the control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > > burden $49 printers with code to be able to detect and interpret both.
  >
  > No Don. It is about interoperability and conforming to standards. XML allows
  > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > both, producers may produce either. An XHTML-Print printer will be just a
  > consumer of an XML byte-stream at some IP address; we don't want to burden
  > every program in the world that can produce XML with a switch that says
  > "this output is going to a poor lowly XHTML Print processor that can't
  deal
  > with UTF-16, so please produce UTF-8", especially since UTF 16 is the easy
  > one to implement, and can only cost a few dozen bytes at best.
  >
  > If we changed this, XHTML Print would have to go back to last call, and
  you
  > can bet your boots that the XML community would rise up against us, as it
  > has in the past, and I can tell you we don't want to go there, and we
  would
  > have a hundred people registering objections.
  >
  > Conforming to XML requirements comes with the territory of being XHTML.
  The
  > XML community will not take lightly to us messing with their standards.
  >
  > Best wishes,
  >
  > Steven Pemberton
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 18:


  From: don@lexmark.com
  
  
  
  One more thing, just one more thing.  Every option or alternative adds one
  more thing.
  
  I think I'll pass on that one more thin mint.
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  
  To:    <don@lexmark.com>
  cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
         <w3c-html-wg@w3.org>, <don@lexmark.com>,
         <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  requirements. It is simpler than UTF 8! What's the problem?
  
  Steven
  
  ----- Original Message -----
  From: <don@lexmark.com>
  To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>;
  <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 12:20 AM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  > Steven, et al:
  >
  > The real problem is that the entire XML architecture was designed
  assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford all
  the
  > luxuries demand by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to
  happen
  > at the low end until the price/performance ratios experience another
  order
  > or two magnitude improvement.  As recently reported in several of the
  trade
  > magazines focused on IT professionals, the deployment of XML and Web
  > Services are have significant negative impacts on the IT infrastructure
  > especially in the area of bandwidth utilization.  This is just another
  > symptom of the same problem.
  >
  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.
  >
  > Sorry I'm not pure but practical.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  >
  > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > > From: don@lexmark.com [mailto:don@lexmark.com]
  >
  > > So let me understand this....
  > >
  > > Because people have poorly designed and written XML applications
  running
  > on
  > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  the
  > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden
  > > $49 printers with code to be able to detect and interpret both.
  >
  > No Don. It is about interoperability and conforming to standards. XML
  > allows
  > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > both, producers may produce either. An XHTML-Print printer will be just a
  > consumer of an XML byte-stream at some IP address; we don't want to
  burden
  > every program in the world that can produce XML with a switch that says
  > "this output is going to a poor lowly XHTML Print processor that can't
  deal
  > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  easy
  > one to implement, and can only cost a few dozen bytes at best.
  >
  > If we changed this, XHTML Print would have to go back to last call, and
  you
  > can bet your boots that the XML community would rise up against us, as it
  > has in the past, and I can tell you we don't want to go there, and we
  would
  > have a hundred people registering objections.
  >
  > Conforming to XML requirements comes with the territory of being XHTML.
  The
  > XML community will not take lightly to us messing with their standards.
  >
  > Best wishes,
  >
  > Steven Pemberton
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 19:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  Don,
  
  Here is a new section in the Design Rationale portion of the spec:
  
  <h3 id="s.1.3.7">1.3.7 Character Model</h3>
  <p>
  The W3C architectural specification <cite>Character Model for the
  World Wide Web 1.0</cite> [<a href="#ref_charmod">CHARMOD</a>] gives
  the <em title="RECOMMENDED in RFC 2119 context"
  class="RFC2119">RECOMMENDED</em> representation of characters in
  XHTML-Print.
  Authors of XHTML-Print producing applications
  <em title="SHOULD in RFC 2119 context" class="RFC2119">SHOULD</em>
  be aware that low-cost printers might be limited in both
  processing power and memory and therefore,
  that fully-normalized ([<a href="#ref_charmod">CHARMOD</a>],
  <a href="http://www.w3.org/TR/charmod/#sec-FullyNormalized">4.2.3</a>)
  utf-8 encoded documents could print more quickly than documents
  in other forms and encodings.
  </p>
  
  I hope that this section will help discourage UTF-16.
  
  Jim

FOLLOWUP 20:

  From: Henri Sivonen <hsivonen@iki.fi>

  On Thursday, Oct 16, 2003, at 01:20 Europe/Helsinki, don@lexmark.com  
  wrote:

  > The real problem is that the entire XML architecture was designed  
  > assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.

  Lesser devices can host expat. However, if a device can't host expat,  
  perhaps it would be better to use something other than XML to  
  communicate with the device.

  > We have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford  
  > all the
  > luxuries demand by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to  
  > happen
  > at the low end until the price/performance ratios experience another  
  > order
  > or two magnitude improvement.

  If you subset XML, is support for the subset support for XML?

  What's the point of building a language on application-specific  
  almost-XML? A language built on such almost-XML breaks expectations  
  (either in software or in the minds of people who need to deal with the  
  language). If you can't use tools that are based on the assumption that  
  the data they process is *exactly* XML and the programmers' knowledge  
  about XML isn't guaranteed to apply, wouldn't it be less confusing to  
  invent another grammar entirely and not call it XML?

  A well-defined extended subset of XML (for example: UTF-8 only,  
  normalization form C only, no doctype, no PIs, no CDATA sections, no  
  epilog, all HTML character entities predefined, namespace processing  
  mandatory) would be more useful than having specs layered on top of XML  
  1.0 trying to readjust what XML 1.0 is.

  XHTML-Print printers get data over HTTP which is over TCP. It would be  
  ludicrous to tweak the TCP header format in the XHTML-Print spec.

  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.

  Converting UTF-16 to UTF-8 really isn't a big deal. It's basically a  
  matter of shifting bits.
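
As a rough illustration of the "shifting bits" remark, here is a minimal Python sketch of the conversion. The function name `utf16_units_to_utf8` is hypothetical, and the input is assumed to be 16-bit code units that have already been read in the correct byte order; it is a sketch, not code from any printer implementation.

```python
def utf16_units_to_utf8(units):
    """Convert an iterable of 16-bit UTF-16 code units to UTF-8 bytes."""
    out = bytearray()
    it = iter(units)
    for u in it:
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: combine it with the low surrogate that follows.
            low = next(it, None)
            if low is None or not 0xDC00 <= low <= 0xDFFF:
                raise ValueError("unpaired high surrogate")
            cp = 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
        elif 0xDC00 <= u <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        else:
            cp = u
        if cp < 0x80:
            # One byte: plain ASCII.
            out.append(cp)
        elif cp < 0x800:
            # Two bytes: 110xxxxx 10xxxxxx
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        elif cp < 0x10000:
            # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(0xE0 | (cp >> 12))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
        else:
            # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.append(0xF0 | (cp >> 18))
            out.append(0x80 | ((cp >> 12) & 0x3F))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
    return bytes(out)
```

The converter keeps no state beyond one pending code unit, so the working memory it needs does not grow with the size of the document.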

  Considering eliminating fat, I'd much rather eliminate character  
  entities[1] and references to the external DTD subset[2]. Character  
  entities are a burden in any case. They require either processing the  
  external DTD subset (bad for execution speed and memory requirements)  
  or implementing an extra feature which doesn't belong in an XML  
  processor (bad for conformance and yet redundant since there are  
  conforming ways of representing characters).

  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6776;user=guest
  [2] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6773;user=guest

  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

FOLLOWUP 21:


  From: don@lexmark.com
  
  
  Steven:
  
  I think your answer proves my point that the XML community did not and
  does not consider the limitations of low cost, constrained embedded
  environments when developing XML.
  
  You make the assertion that no extra memory is required yet the reality is
  quite the opposite.
  
  Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is
  that:
  
  1) Every XHTML tag will require twice as many bytes when represented in
  UTF-16 versus UTF-8
  2) Every English XHTML-Print print job will be twice as big encoded with
  UTF-16 versus UTF-8
  3) Every "Latin 1" print job will be larger approaching 2X in size.
  
  When you double the data's size, buffers have to double to be able to hold
  and manipulate an equivalent amount of print stream content.  There are real
  cost and performance penalties to be paid to deal with UTF-16 encoding
  especially when dealing with western character sets.  When a device is
  designed to deal with the far east "characters" there are other penalties
  to be paid in things like the size of the font load that mitigate the
  UTF-16 versus UTF-8 encoding issue.
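
The three size claims above concern the encoded byte stream, and they can be checked with a short Python sketch. The sample strings and the helper name `encoded_sizes` are assumptions chosen for illustration, not data from any real print job.

```python
# Illustrative samples only: markup, English text, and Latin-1 range text.
samples = {
    "markup": '<p class="note">',
    "english": "The quick brown fox",
    "latin1": "Grüße aus Zürich",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given text."""
    # "utf-16-le" omits the 2-byte byte order mark, so only the payload is counted.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    u8, u16 = encoded_sizes(text)
    print(f"{name}: utf-8={u8} bytes, utf-16={u16} bytes, ratio={u16 / u8:.2f}")
```

The ratio is exactly 2 for pure ASCII input and somewhat below 2 when Latin-1 characters such as "ü" are present, since each of those already takes two bytes in UTF-8.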
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  
  To:    <don@lexmark.com>
  cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
         <w3c-html-wg@w3.org>, <don@lexmark.com>,
         <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  requirements. It is simpler than UTF 8! What's the problem?
  
  Steven
  
  ----- Original Message -----
  From: <don@lexmark.com>
  To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>;
  <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 12:20 AM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  > Steven, et al:
  >
  > The real problem is that the entire XML architecture was designed
  assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford all
  the
  > luxuries demand by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to
  happen
  > at the low end until the price/performance ratios experience another
  order
  > or two magnitude improvement.  As recently reported in several of the
  trade
  > magazines focused on IT professionals, the deployment of XML and Web
  > Services are have significant negative impacts on the IT infrastructure
  > especially in the area of bandwidth utilization.  This is just another
  > symptom of the same problem.
  >
  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.
  >
  > Sorry I'm not pure but practical.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  >
  > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > > From: don@lexmark.com [mailto:don@lexmark.com]
  >
  > > So let me understand this....
  > >
  > > Because people have poorly designed and written XML applications
  running
  > on
  > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  the
  > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden
  > > $49 printers with code to be able to detect and interpret both.
  >
  > No Don. It is about interoperability and conforming to standards. XML
  > allows
  > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > both, producers may produce either. An XHTML-Print printer will be just a
  > consumer of an XML byte-stream at some IP address; we don't want to
  burden
  > every program in the world that can produce XML with a switch that says
  > "this output is going to a poor lowly XHTML Print processor that can't
  deal
  > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  easy
  > one to implement, and can only cost a few dozen bytes at best.
  >
  > If we changed this, XHTML Print would have to go back to last call, and
  you
  > can bet your boots that the XML community would rise up against us, as it
  > has in the past, and I can tell you we don't want to go there, and we
  would
  > have a hundred people registering objections.
  >
  > Conforming to XML requirements comes with the territory of being XHTML.
  The
  > XML community will not take lightly to us messing with their standards.
  >
  > Best wishes,
  >
  > Steven Pemberton
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 22:


  From: "Steven Pemberton" <steven.pemberton@cwi.nl>
  
  Don,
  
  I've been wondering for a long time if that was the misunderstanding, but I
  was assured it wasn't.
  
  UTF 16 and UTF 8 are *external* representations. The internal amount of
  storage needed for them is identical, and completely up to you how you
  store.
  
  The only extra memory needed is the couple of dozen extra bytes of code to
  convert UTF 16 into whatever internal representation you use.
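
A small Python sketch of the external-versus-internal distinction, assuming for illustration that the internal form is a list of code points (the helper name `to_internal` is hypothetical; a real device would pick its own representation):

```python
def to_internal(data, encoding):
    """Decode an external byte stream into a list of Unicode code points."""
    return [ord(ch) for ch in data.decode(encoding)]


text = "Gr\u00fc\u00dfe \u3041"

# Two different external representations of the same document text.
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16")  # includes a byte order mark

# After decoding, the internal form no longer depends on the wire encoding.
same = to_internal(as_utf8, "utf-8") == to_internal(as_utf16, "utf-16")
print(len(as_utf8), len(as_utf16), same)
```

The two byte streams differ in length, but the decoded content the device has to store is the same in both cases.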
  
  Best wishes,
  
  Steven
  
  
  
  ----- Original Message ----- 
  From: <don@lexmark.com>
  To: "Steven Pemberton" <steven.pemberton@cwi.nl>
  Cc: <don@lexmark.com>; "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 2:51 PM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  >
  > Steven:
  >
  > I think your answer proves my point that the XML community did not and
  > does not consider the limitations of low cost, constrained embedded
  > environments when developing XML.
  >
  > You make the assertion that no extra memory is required yet the reality is
  > quite the opposite.
  >
  > Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is
  > that:
  >
  > 1) Every XHTML tag will require twice as many bytes when represented in
  > UTF-16 versus UTF-8
  > 2) Every English XHTML-Print print job will be twice as big encoded with
  > UTF-16 versus UTF-8
  > 3) Every "Latin 1" print job will be larger approaching 2X in size.
  >
  > When you double the data's size, buffers have to double to be able to hold
  > and manipulate an equivalent amount of print stream content.  There is
  real
  > cost and performance costs to be paid to deal with UTF-16 encoding
  > especially when dealing with western character sets.  When a device is
  > designed to deal with the far east "characters" there are other penalties
  > to be paid in things like the size of the font load that mitigate the
  > UTF-16 versus UTF-8 encoding issue.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  >
  > "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  >
  > To:    <don@lexmark.com>
  > cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>,
  >        <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  > requirements. It is simpler than UTF 8! What's the problem?
  >
  > Steven
  >
  > ----- Original Message -----
  > From: <don@lexmark.com>
  > To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  > Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  > <w3c-html-wg@w3.org>;
  > <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  > <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  > Sent: Thursday, October 16, 2003 12:20 AM
  > Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > >
  > > Steven, et al:
  > >
  > > The real problem is that the entire XML architecture was designed
  > assuming
  > > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > > have already seen push back in other standards groups that consumer
  > > electronic devices and other smaller, lighter devices cannot afford all
  > the
  > > luxuries demand by an obese XML architecture.  Unless the XML community
  > > accepts subsetting, we can't expect the broadest support for XML to
  > happen
  > > at the low end until the price/performance ratios experience another
  > order
  > > or two magnitude improvement.  As recently reported in several of the
  > trade
  > > magazines focused on IT professionals, the deployment of XML and Web
  > > Services are have significant negative impacts on the IT infrastructure
  > > especially in the area of bandwidth utilization.  This is just another
  > > symptom of the same problem.
  > >
  > > I know I will lose this argument in the W3C but the realities of the
  > > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > > benefit and simply not support it, "interoperable" or not.
  > >
  > > Sorry I'm not pure but practical.
  > >
  > > *******************************************
  > > Don Wright                 don@lexmark.com
  > >
  > > Chair,  IEEE SA Standards Board
  > > Member, IEEE-ISTO Board of Directors
  > > f.wright@ieee.org / f.wright@computer.org
  > >
  > > Director, Alliances and Standards
  > > Lexmark International
  > > 740 New Circle Rd C14/082-3
  > > Lexington, Ky 40550
  > > 859-825-4808 (phone) 603-963-8352 (fax)
  > > *******************************************
  > >
  > >
  > >
  > >
  > > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  > >
  > > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  > >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  > >        <www-html@w3.org>
  > > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  > >
  > >
  > > > From: don@lexmark.com [mailto:don@lexmark.com]
  > >
  > > > So let me understand this....
  > > >
  > > > Because people have poorly designed and written XML applications
  > running
  > > on
  > > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  > the
  > > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > > burden
  > > > $49 printers with code to be able to detect and interpret both.
  > >
  > > No Don. It is about interoperability and conforming to standards. XML
  > > allows
  > > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > > both, producers may produce either. An XHTML-Print printer will be just
  a
  > > consumer of an XML byte-stream at some IP address; we don't want to
  > burden
  > > every program in the world that can produce XML with a switch that says
  > > "this output is going to a poor lowly XHTML Print processor that can't
  > deal
  > > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  > easy
  > > one to implement, and can only cost a few dozen bytes at best.
  > >
  > > If we changed this, XHTML Print would have to go back to last call, and
  > you
  > > can bet your boots that the XML community would rise up against us, as
  it
  > > has in the past, and I can tell you we don't want to go there, and we
  > would
  > > have a hundred people registering objections.
  > >
  > > Conforming to XML requirements comes with the territory of being XHTML.
  > The
  > > XML community will not take lightly to us messing with their standards.
  > >
  > > Best wishes,
  > >
  > > Steven Pemberton
  > >
  > >
  > >
  > >
  > >
  > >
  > >
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 23:


  From: Rowland Shaw <Rowland.Shaw@crystaldecisions.com>
  
  ...and for every Asian language, each character can take up to three bytes
  (in UTF-8 vs. two in UTF-16)
  
  Taking a completely random Japanese character (Hiragana Letter Small A,
  U+3041): it is encoded in UTF-8 as 0xE3 0x81 0x81. This assumes that you are
  willing to deal with characters as an MBCS, and that you aren't going to
  convert to UCS-2 internally.
  
  English has the biggest saving by saving as UTF-8 (so let it), but for most
  other languages, there is no benefit or, worse, a 50% growth in size (vs.
  UTF-16).
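
The U+3041 example and the 50% figure can be reproduced with a few lines of Python. This is only a check of the arithmetic for that one character; the variable names are illustrative.

```python
ch = "\u3041"  # HIRAGANA LETTER SMALL A

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")  # big-endian, no byte order mark

# Growth of the UTF-8 form relative to the UTF-16 form, in percent.
growth = (len(utf8) - len(utf16)) / len(utf16) * 100

print(utf8.hex(" "), "|", utf16.hex(" "), "|", f"{growth:.0f}%")
```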
  
  If UTF-16 is disallowed, it's no longer an XML application (which may be a
  road to go down) by definition on the minimum bar set for XML (back in the
  days of 486's and 8Mb machines). Thinking about it, my printer nowadays at
  home has more RAM in it than my PC when XML was being created...
  
  
  -----Original Message-----
  From: don@lexmark.com [mailto:don@lexmark.com] 
  Sent: 16 October 2003 14:00
  To: Steven Pemberton
  Cc: don@lexmark.com; BIGELOW,JIM (HP-Boise,ex1); w3c-html-wg@w3.org;
  voyager-issues@mn.aptest.com; elliott.bradshaw@zoran.com; www-html@w3.org
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  
  Steven:
  
  I think your answer proves my point that the XML community did not and
  does not consider the limitations of low cost, constrained embedded
  environments when developing XML.
  
  You make the assertion that no extra memory is required yet the reality is
  quite the opposite.
  
  Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is
  that:
  
  1) Every XHTML tag will require twice as many bytes when represented in
  UTF-16 versus UTF-8
  2) Every English XHTML-Print print job will be twice as big encoded with
  UTF-16 versus UTF-8
  3) Every "Latin 1" print job will be larger approaching 2X in size.
  
  When you double the data's size, buffers have to double to be able to hold
  and manipulate an equivalent amount of print stream content.  There is real
  cost and performance costs to be paid to deal with UTF-16 encoding
  especially when dealing with western character sets.  When a device is
  designed to deal with the far east "characters" there are other penalties
  to be paid in things like the size of the font load that mitigate the
  UTF-16 versus UTF-8 encoding issue.
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  
  To:    <don@lexmark.com>
  cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
         <w3c-html-wg@w3.org>, <don@lexmark.com>,
         <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  requirements. It is simpler than UTF 8! What's the problem?
  
  Steven
  
  ----- Original Message -----
  From: <don@lexmark.com>
  To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>;
  <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 12:20 AM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  > Steven, et al:
  >
  > The real problem is that the entire XML architecture was designed
  assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford all
  the
  > luxuries demand by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to
  happen
  > at the low end until the price/performance ratios experience another
  order
  > or two magnitude improvement.  As recently reported in several of the
  trade
  > magazines focused on IT professionals, the deployment of XML and Web
  > Services are have significant negative impacts on the IT infrastructure
  > especially in the area of bandwidth utilization.  This is just another
  > symptom of the same problem.
  >
  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.
  >
  > Sorry I'm not pure but practical.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  >
  > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > > From: don@lexmark.com [mailto:don@lexmark.com]
  >
  > > So let me understand this....
  > >
  > > Because people have poorly designed and written XML applications
  running
  > on
  > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  the
  > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden
  > > $49 printers with code to be able to detect and interpret both.
  >
  > No Don. It is about interoperability and conforming to standards. XML
  > allows
  > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > both, producers may produce either. An XHTML-Print printer will be just a
  > consumer of an XML byte-stream at some IP address; we don't want to
  burden
  > every program in the world that can produce XML with a switch that says
  > "this output is going to a poor lowly XHTML Print processor that can't
  deal
  > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  easy
  > one to implement, and can only cost a few dozen bytes at best.
  >
  > If we changed this, XHTML Print would have to go back to last call, and
  you
  > can bet your boots that the XML community would rise up against us, as it
  > has in the past, and I can tell you we don't want to go there, and we
  would
  > have a hundred people registering objections.
  >
  > Conforming to XML requirements comes with the territory of being XHTML.
  The
  > XML community will not take lightly to us messing with their standards.
  >
  > Best wishes,
  >
  > Steven Pemberton
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 24:


  From: elliott.bradshaw@zoran.com
  
  
  Don,
  
  I agree with the argument that a front end can convert from UTF-16 to UTF-8
  or whatever internal form is used, and have essentially no impact on memory
  needs.
  
  "A couple of dozen bytes" might be a little optimistic for this logic  :^)
  , but it's pretty straightforward:
    -look at first 16 bits to detect a UTF-16 mark
    -for each double byte emit the UTF-8 (or other) equivalent
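
The two steps above can be sketched in Python as a streaming front end. The function name `transcode_to_utf8` and the `chunk_size` parameter are hypothetical, the sketch assumes that input without a UTF-16 byte order mark is already UTF-8, and it leans on Python's standard `codecs` incremental decoder rather than hand-written conversion logic.

```python
import codecs
import io


def transcode_to_utf8(src, dst, chunk_size=256):
    """Copy src to dst as UTF-8, converting from UTF-16 if a byte order mark is found."""
    # Step 1: look at the first 16 bits to detect a UTF-16 mark.
    head = src.read(2)
    if head == codecs.BOM_UTF16_LE:
        decoder = codecs.getincrementaldecoder("utf-16-le")()
    elif head == codecs.BOM_UTF16_BE:
        decoder = codecs.getincrementaldecoder("utf-16-be")()
    else:
        # No UTF-16 mark: assume UTF-8 and pass the bytes through unchanged.
        dst.write(head)
        while chunk := src.read(chunk_size):
            dst.write(chunk)
        return
    # Step 2: for each chunk of double bytes, emit the UTF-8 equivalent.
    while chunk := src.read(chunk_size):
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # Flush; this raises if the stream ended in the middle of a code unit.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


# Usage: a small UTF-16 document converted through a fixed-size buffer.
source = io.BytesIO("<p>Gr\u00fc\u00dfe \u3041</p>".encode("utf-16"))
sink = io.BytesIO()
transcode_to_utf8(source, sink, chunk_size=8)
print(sink.getvalue())
```

Only one chunk is held at a time, so the buffer size is set by `chunk_size` and not by the length or encoding of the document.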
  
  Of course a printer could choose to store Asian data differently than
  Latin, and save some space compared to native UTF-8.  This decision is
  orthogonal to the form of the input.  But this logic may not be worth it
  and is not needed for compliance.
  
    Frugally,
    Elliott
  
  
  --------------------------------------------------------------------------------
  
  Elliott Bradshaw
  Director, Software Engineering
  Zoran Imaging Division (formerly Oak Technology Imaging Group)
  781 638-7534
  
  
  
                                                                                                            
  Rowland Shaw <Rowland.Shaw@crystaldecisions.com> on 10/16/2003 09:16 AM
  
  To:    "'don@lexmark.com'" <don@lexmark.com>,
         Steven Pemberton <steven.pemberton@cwi.nl>
  cc:    "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>,
         w3c-html-wg@w3.org, voyager-issues@mn.aptest.com,
         elliott.bradshaw@zoran.com, www-html@w3.org
  Subject:    RE: allow UTF-16 not just UTF-8 (PR#6774)
                                                                                                            
  
  
  
  
  ...and for every Asian language, each character can take up to three bytes
  (in UTF-8 vs. two in UTF-16)
  
  Taking a completely random Japanese character (Hiragana Letter Small A,
  U+3041), it is encoded in UTF-8 as 0xE3 0x81 0x81 -- this assumes that you
  are willing to deal with characters as an MBCS, and that you aren't going
  to convert to UCS-2 internally.
  
  English has the biggest saving by saving as UTF-8 (so let it), but for most
  other languages, there is no benefit or worse, a 50% growth in sizes (vs.
  UTF-16).
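
  The byte counts for that character can be checked with a few lines of
  Python. This is just a quick check; the big-endian codec is used so that no
  byte order mark is added to the UTF-16 count.

```python
ch = "\u3041"  # HIRAGANA LETTER SMALL A

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")  # big-endian, no byte order mark

print(utf8.hex(" "), len(utf8))    # e3 81 81 3
print(utf16.hex(" "), len(utf16))  # 30 41 2

# Growth of UTF-8 relative to UTF-16 for this character, in percent.
growth = (len(utf8) - len(utf16)) / len(utf16) * 100
print(growth)  # 50.0
```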
  
  If UTF-16 is disallowed, it's no longer an XML application (which may be a
  road to go down) by definition on the minimum bar set for XML (back in the
  days of 486's and 8Mb machines). Thinking about it, my printer nowadays at
  home has more RAM in it than my PC when XML was being created...
  
  
  -----Original Message-----
  From: don@lexmark.com [mailto:don@lexmark.com]
  Sent: 16 October 2003 14:00
  To: Steven Pemberton
  Cc: don@lexmark.com; BIGELOW,JIM (HP-Boise,ex1); w3c-html-wg@w3.org;
  voyager-issues@mn.aptest.com; elliott.bradshaw@zoran.com; www-html@w3.org
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  
  Steven:
  
  I think your answer proves my point that the XML community did not and
  does not consider the limitations of low cost, constrained embedded
  environments when developing XML.
  
  You make the assertion that no extra memory is required yet the reality is
  quite the opposite.
  
  Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is
  that:
  
  1) Every XHTML tag will require twice as many bytes when represented in
  UTF-16 versus UTF-8
  2) Every English XHTML-Print print job will be twice as big encoded with
  UTF-16 versus UTF-8
  3) Every "Latin 1" print job will be larger approaching 2X in size.
  
  When you double the data's size, buffers have to double to be able to hold
  and manipulate an equivalent amount of print stream content.  There is real
  cost and performance costs to be paid to deal with UTF-16 encoding
  especially when dealing with western character sets.  When a device is
  designed to deal with the far east "characters" there are other penalties
  to be paid in things like the size of the font load that mitigate the
  UTF-16 versus UTF-8 encoding issue.
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  
  To:    <don@lexmark.com>
  cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
         <w3c-html-wg@w3.org>, <don@lexmark.com>,
         <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  But support for UTF 16 adds a few dozen bytes of code, and no extra memory
  requirements. It is simpler than UTF 8! What's the problem?
  
  Steven
  
  ----- Original Message -----
  From: <don@lexmark.com>
  To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>;
  <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 12:20 AM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  > Steven, et al:
  >
  > The real problem is that the entire XML architecture was designed
  assuming
  > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > have already seen push back in other standards groups that consumer
  > electronic devices and other smaller, lighter devices cannot afford all
  the
  > luxuries demanded by an obese XML architecture.  Unless the XML community
  > accepts subsetting, we can't expect the broadest support for XML to
  happen
  > at the low end until the price/performance ratios experience another
  order
  > or two magnitude improvement.  As recently reported in several of the
  trade
  > magazines focused on IT professionals, the deployment of XML and Web
  > Services are having significant negative impacts on the IT infrastructure
  > especially in the area of bandwidth utilization.  This is just another
  > symptom of the same problem.
  >
  > I know I will lose this argument in the W3C but the realities of the
  > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > benefit and simply not support it, "interoperable" or not.
  >
  > Sorry I'm not pure but practical.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  >
  > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > > From: don@lexmark.com [mailto:don@lexmark.com]
  >
  > > So let me understand this....
  > >
  > > Because people have poorly designed and written XML applications
  running
  > on
  > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  the
  > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > burden
  > > $49 printers with code to be able to detect and interpret both.
  >
  > No Don. It is about interoperability and conforming to standards. XML
  > allows
  > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > both, producers may produce either. An XHTML-Print printer will be just a
  > consumer of an XML byte-stream at some IP address; we don't want to
  burden
  > every program in the world that can produce XML with a switch that says
  > "this output is going to a poor lowly XHTML Print processor that can't
  deal
  > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  easy
  > one to implement, and can only cost a few dozen bytes at best.
  >
  > If we changed this, XHTML Print would have to go back to last call, and
  you
  > can bet your boots that the XML community would rise up against us, as it
  > has in the past, and I can tell you we don't want to go there, and we
  would
  > have a hundred people registering objections.
  >
  > Conforming to XML requirements comes with the territory of being XHTML.
  The
  > XML community will not take lightly to us messing with their standards.
  >
  > Best wishes,
  >
  > Steven Pemberton
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 25:


  From: don@lexmark.com
  
  
  Steven:
  
  Of course I knew this was just the external representation.
  
  I'm trying to reduce conversions and reduce the sizes of buffers, etc.
  necessary to do this work.  I have no doubt it can be done, I'm just trying
  to do things with smaller less powerful processors and with less available
  memory than what programmers normally expect to be available in today's
  environment.
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/16/2003 09:10:59 AM
  
  To:    <don@lexmark.com>
  cc:    <don@lexmark.com>, "BIGELOW,JIM \(HP-Boise,ex1\)"
         <jim.bigelow@hp.com>, <w3c-html-wg@w3.org>,
         <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
         <www-html@w3.org>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  Don,
  
  I've been wondering for a long time if that was the misunderstanding, but I
  was assured it wasn't.
  
  UTF 16 and UTF 8 are *external* representations. The internal amount of
  storage needed for them is identical, and completely up to you how you
  store.
  
  The only extra memory needed is the couple of dozen extra bytes of code to
  convert UTF 16 into whatever internal representation you use.
  
  Best wishes,
  
  Steven
  
  
  
  ----- Original Message -----
  From: <don@lexmark.com>
  To: "Steven Pemberton" <steven.pemberton@cwi.nl>
  Cc: <don@lexmark.com>; "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  <w3c-html-wg@w3.org>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  Sent: Thursday, October 16, 2003 2:51 PM
  Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  >
  >
  > Steven:
  >
  > I think your answer proves my point that the XML community did not and
  > does not consider the limitations of low cost, constrained embedded
  > environments when developing XML.
  >
  > You make the assertion that no extra memory is required yet the reality
  is
  > quite the opposite.
  >
  > Please tell me if I'm wrong, but my understanding of UTF-8 and UTF-16 is
  > that:
  >
  > 1) Every XHTML tag will require twice as many bytes when represented in
  > UTF-16 versus UTF-8
  > 2) Every English XHTML-Print print job will be twice as big encoded with
  > UTF-16 versus UTF-8
  > 3) Every "Latin 1" print job will be larger approaching 2X in size.
  >
  > When you double the data's size, buffers have to double to be able to
  hold
  > and manipulate an equivalent amount of print stream content.  There is
  real
  > cost and performance costs to be paid to deal with UTF-16 encoding
  > especially when dealing with western character sets.  When a device is
  > designed to deal with the far east "characters" there are other penalties
  > to be paid in things like the size of the font load that mitigate the
  > UTF-16 versus UTF-8 encoding issue.
  >
  > *******************************************
  > Don Wright                 don@lexmark.com
  >
  > Chair,  IEEE SA Standards Board
  > Member, IEEE-ISTO Board of Directors
  > f.wright@ieee.org / f.wright@computer.org
  >
  > Director, Alliances and Standards
  > Lexmark International
  > 740 New Circle Rd C14/082-3
  > Lexington, Ky 40550
  > 859-825-4808 (phone) 603-963-8352 (fax)
  > *******************************************
  >
  >
  >
  >
  >
  > "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/15/2003 07:26:24 PM
  >
  > To:    <don@lexmark.com>
  > cc:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  >        <w3c-html-wg@w3.org>, <don@lexmark.com>,
  >        <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  >        <www-html@w3.org>
  > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > But support for UTF 16 adds a few dozen bytes of code, and no extra
  memory
  > requirements. It is simpler than UTF 8! What's the problem?
  >
  > Steven
  >
  > ----- Original Message -----
  > From: <don@lexmark.com>
  > To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
  > Cc: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>;
  > <w3c-html-wg@w3.org>;
  > <don@lexmark.com>; <voyager-issues@mn.aptest.com>;
  > <elliott.bradshaw@zoran.com>; <www-html@w3.org>
  > Sent: Thursday, October 16, 2003 12:20 AM
  > Subject: Re: allow UTF-16 not just UTF-8 (PR#6774)
  >
  >
  > >
  > > Steven, et al:
  > >
  > > The real problem is that the entire XML architecture was designed
  > assuming
  > > high end boxes like the 3 GHz Pentium with 512 megabytes of memory.  We
  > > have already seen push back in other standards groups that consumer
  > > electronic devices and other smaller, lighter devices cannot afford all
  > the
  > > luxuries demanded by an obese XML architecture.  Unless the XML community
  > > accepts subsetting, we can't expect the broadest support for XML to
  > happen
  > > at the low end until the price/performance ratios experience another
  > order
  > > or two magnitude improvement.  As recently reported in several of the
  > trade
  > > magazines focused on IT professionals, the deployment of XML and Web
  > > Services are having significant negative impacts on the IT infrastructure
  > > especially in the area of bandwidth utilization.  This is just another
  > > symptom of the same problem.
  > >
  > > I know I will lose this argument in the W3C but the realities of the
  > > XHTML-Print implementations will blow off UTF-16 as more fat with no
  > > benefit and simply not support it, "interoperable" or not.
  > >
  > > Sorry I'm not pure but practical.
  > >
  > > *******************************************
  > > Don Wright                 don@lexmark.com
  > >
  > > Chair,  IEEE SA Standards Board
  > > Member, IEEE-ISTO Board of Directors
  > > f.wright@ieee.org / f.wright@computer.org
  > >
  > > Director, Alliances and Standards
  > > Lexmark International
  > > 740 New Circle Rd C14/082-3
  > > Lexington, Ky 40550
  > > 859-825-4808 (phone) 603-963-8352 (fax)
  > > *******************************************
  > >
  > >
  > >
  > >
  > > "Steven Pemberton" <Steven.Pemberton@cwi.nl> on 10/15/2003 09:18:15 AM
  > >
  > > To:    "BIGELOW,JIM \(HP-Boise,ex1\)" <jim.bigelow@hp.com>,
  > >        <w3c-html-wg@w3.org>, <don@lexmark.com>
  > > cc:    <voyager-issues@mn.aptest.com>, <elliott.bradshaw@zoran.com>,
  > >        <www-html@w3.org>
  > > Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  > >
  > >
  > > > From: don@lexmark.com [mailto:don@lexmark.com]
  > >
  > > > So let me understand this....
  > > >
  > > > Because people have poorly designed and written XML applications
  > running
  > > on
  > > > 3 GHz Pentium 4s with 512 megabytes of real memory that do not allow
  > the
  > > > control over whether UTF-8 or UTF-16 are emitted, we are expecting to
  > > burden
  > > > $49 printers with code to be able to detect and interpret both.
  > >
  > > No Don. It is about interoperability and conforming to standards. XML
  > > allows
  > > documents to be encoded in either UTF8 or UTF 16: consumers must accept
  > > both, producers may produce either. An XHTML-Print printer will be just
  a
  > > consumer of an XML byte-stream at some IP address; we don't want to
  > burden
  > > every program in the world that can produce XML with a switch that says
  > > "this output is going to a poor lowly XHTML Print processor that can't
  > deal
  > > with UTF-16, so please produce UTF-8", especially since UTF 16 is the
  > easy
  > > one to implement, and can only cost a few dozen bytes at best.
  > >
  > > If we changed this, XHTML Print would have to go back to last call, and
  > you
  > > can bet your boots that the XML community would rise up against us, as
  it
  > > has in the past, and I can tell you we don't want to go there, and we
  > would
  > > have a hundred people registering objections.
  > >
  > > Conforming to XML requirements comes with the territory of being XHTML.
  > The
  > > XML community will not take lightly to us messing with their standards.
  > >
  > > Best wishes,
  > >
  > > Steven Pemberton
  > >
  > >
  > >
  > >
  > >
  > >
  > >
  >
  >
  >
  >
  >
  >
  >

FOLLOWUP 26:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  Don and Steven,

  I want to expand on what you have said:
  Don wrote:
  > > 1) Every XHTML tag will require twice as many bytes when 
  > > represented in UTF-16 versus UTF-8
  > > 2) Every English XHTML-Print print job will be twice as 
  > > big encoded with UTF-16 versus UTF-8
  > > 3) Every "Latin 1" print job will be larger approaching 
  > > 2X in size.
  > >
  > > When you double the data's size, buffers have to double to 
  > > be able to hold and manipulate an equivalent amount of print 
  > > stream content.  

  This statement is only true for some print streams. See the discussion below
  in "The problem space".

  Steven wrote:
  > UTF 16 and UTF 8 are *external* representations. The internal 
  > amount of storage needed for them is identical, and 
  > completely up to you how you store.

  If a printer uses 16 bits internally to represent a character, then there
  shouldn't be a difference in buffering requirements between utf-8 and utf-16
  encoded files (see below for a more complete discussion).  However, if a
  printer uses 8 bits per character, then it has restricted itself to only
  handle a subset of possible documents, those with ASCII characters.  This is
  a product-specific decision akin to that of whether to make a device print
  in color or black & white or support landscape as well as portrait printing.
  Therefore, I suggest that the spec say that a printer should support utf-16,
  just as it now says it should support CSS, landscape printing, and color --
  within the limits of the device.  If a user buys a low-cost device that can
  only print ASCII characters in portrait orientation, without color, style
  sheets, or images, hopefully the price was in line with the printer's
  abilities and other, more expensive, more capable devices are available as
  needed.

  Jim

  The problem space
  ----------------------
  There is a document composition continuum from documents with only text,
  through mixed text and images, to documents that contain only images.  At
  the text-only end of the continuum, encoding in UTF-16 rather than UTF-8
  doubles the document size. At the image-only end of the continuum, the
  effect of the encoding on document size is overshadowed by the image data.

  The table below illustrates three points on the document composition
  continuum:
  1. Text-only: a document that prints as one page of ASCII text (times, 10pt,
  8in by 11in paper) [1].  Size, in bytes, is 6,282.

  2. Text & Image: a one page document with one 3in x 5in image (166.7K bytes)
  and the remainder text [2]. Size, in bytes, of document and image is
  171,531.

  3. Image-only: a one page document with eight 2in x 3.25in images (703.2K
  bytes) and no text. [3] Size, in bytes, of document and eight images is
  705,108.

  Size (bytes)   utf-8   %doc   utf-16   %doc
  Text-only      6,282   100    12,566   100
  Text+Image     4,776   3.2     9,554   5.4   (9,554 / (9,554 + 166,675) * 100)
  Image-only     1,916   .27     3,834   .54
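
  Each utf-16 figure in the table is twice the utf-8 figure plus two bytes.
  That matches what one would expect if the markup is pure ASCII and the
  UTF-16 file starts with a 2-byte byte order mark; both points are
  inferences from the numbers, not something the table states. A short
  Python sketch of that check (Python's "utf-16" codec writes a byte order
  mark):

```python
# Markup sizes in bytes, copied from the table above.
utf8_sizes = {"Text-only": 6282, "Text+Image": 4776, "Image-only": 1916}
utf16_sizes = {"Text-only": 12566, "Text+Image": 9554, "Image-only": 3834}


def utf16_size_of_ascii(n_chars: int) -> int:
    """Size of n ASCII characters encoded as UTF-16 with a byte order mark."""
    return len(("x" * n_chars).encode("utf-16"))


for name, n in utf8_sizes.items():
    predicted = utf16_size_of_ascii(n)  # 2 * n + 2
    print(name, predicted, utf16_sizes[name])
```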

  There is another point of variability: the characters in the text portions
  of the document. This is another continuum from ASCII only at one end to
  Japanese, Chinese, Korean, and Hindi at the other.  

  "Table 1: UTF types" of [4] gives the following average bytes per code point

                                     utf-8  utf-16
  English                            1      2
  Latin-1                            1.1    2
  Greek, Russian, Arabic, Hebrew     1.7    2
  Japanese, Chinese, Korean, Hindi   3      2

  As the language/script of the text portion of the document changes from
  English-only toward other scripts and languages, the size difference between
  utf-8 and utf-16 decreases.
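
  The trend can be reproduced on small samples. The strings below are
  arbitrary examples chosen for illustration, not the data behind [4], so the
  Greek figure in particular will not match the published average exactly.

```python
# Hypothetical sample strings, written with escapes to stay ASCII-safe.
samples = {
    "English": "Print this page",
    "Greek": "\u03b1\u03b2\u03b3 \u03b4\u03b5",      # five Greek letters and a space
    "Japanese": "\u5370\u5237\u3059\u308b",          # four BMP characters
}


def bytes_per_code_point(text: str, encoding: str) -> float:
    """Average number of encoded bytes per code point in text."""
    return len(text.encode(encoding)) / len(text)


for language, text in samples.items():
    u8 = bytes_per_code_point(text, "utf-8")
    u16 = bytes_per_code_point(text, "utf-16-be")  # no byte order mark
    print(f"{language:9} {u8:.2f} {u16:.2f}")
```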

  End-to-end solution
  -------------------
  If you look at the end-to-end solution, from the sending application to the
  printer, the stages can be thought of as:
  1. Sending Device: the data as represented in the sending device (a cell
  phone for example)
  2. Transmission: the data combined with markup and style information as an
  XHTML-Print data stream and then encoded in either UTF-8 or UTF-16
  3. Receiving Device: the printer -- breaking this into two parts gives:
  3.a The XHTML-Print data stream as received
  3.b The data without markup and style information and before printing. How
  the data is stored is implementation dependent, and how much memory is used
  depends on how a character is represented (8 or 16 bits) and on how much
  of the document is buffered.  Each printer makes these choices; 8 bits/char
  restricts the documents processed to Latin-1 characters.

  Stage   Size    utf-8   utf-16
  1. app   n       -         -
  2. xmit  n       n-3n*    2n   
  3a. Pr   n       n-3n     2n
  3b. Pr** n       n-2n     n-2n

  * n-3n shows the variable sizing depending on the characters being encoded:
  English only (n), CJK (3n)
  ** at Stage 3b, representing a character with 8 bits restricts the characters
  that can be represented to ASCII or Latin-1; 16 bits can represent all
  characters.

  Internal representation

  If a printer uses 16 bits internally to represent a character, then there
  shouldn't be a difference in buffering requirements between utf-8 and utf-16
  encoded files.  However, if a printer uses 8 bits, then it has restricted
  itself to only handle a subset of documents.  This is a product-specific
  decision akin to that of supporting color or not.  Therefore, I suggest that
  the spec say that a printer should support utf-16 just as it now says it
  should support CSS, landscape printing, and color -- within the limits of
  the device.  If a user buys a low-cost device that can only print ASCII
  characters in portrait orientation, without color, images or style,
  hopefully the price is in line with the printer's abilities and other, more
  expensive, more capable devices are available as needed.

  [1] http://www.pwg.org/xhtml-print/W3C-Version/georgeb.html
  [2] http://www.pwg.org/xhtml-print/W3C-Version/text+image.html
  [3] http://www.pwg.org/xhtml-print/W3C-Version/image-only.html

  [4] http://www-106.ibm.com/developerworks/library/utfencodingforms/

FOLLOWUP 27:

  From: Michael Sweet <mike@easysw.com>

  BIGELOW,JIM (HP-Boise,ex1) wrote:
  > ...
  > If a printer uses 16 bits internally to represent a character, then there
  > shouldn't be a difference in buffering requirements between utf-8 and utf-16
  > encoded files (see below for a more complete discussion).  However, if a
  > printer uses 8 bits per character, then it has restricted itself to only
  > handle a subset of possible documents, those with ASCII characters.  This is
   > ...

  I suggest there is another alternative - the implementation can
  simply convert UTF-16 to UTF-8 as the document is being read, so
  contrary to the previous comments there is no additional buffer
  memory overhead, merely a small amount of code to convert from
  UTF-16 to UTF-8.
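
  A minimal sketch of such read-time conversion, using Python's standard
  incremental decoder. The function name and the 3-byte chunk size are made
  up for the example; a real device would use whatever chunk size its
  transport delivers.

```python
import codecs
from typing import Iterable, Iterator


def transcode_utf16_to_utf8(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield UTF-8 bytes for UTF-16 input that arrives in arbitrary chunks."""
    # The incremental decoder consumes the byte order mark and keeps any
    # bytes of a code unit or surrogate pair that straddle a chunk boundary.
    decoder = codecs.getincrementaldecoder("utf-16")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    # Flush; raises UnicodeDecodeError if the input ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")


document = "<p>\u3041 ok</p>"
wire = document.encode("utf-16")  # UTF-16 with byte order mark
chunks = [wire[i:i + 3] for i in range(0, len(wire), 3)]  # odd-sized chunks
result = b"".join(transcode_utf16_to_utf8(chunks))
```

  At any moment only the current chunk and a few bytes of decoder state are
  held, which is the sense in which no additional document buffer is needed.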

  Whether the implementation chooses to limit support to "latin"
  text or not is another issue, but either way the *internal*
  representation can be controlled by the vendor separate from the
  external UTF-8/UTF-16/whatever representation.

  -- 
  ______________________________________________________________________
  Michael Sweet, Easy Software Products           mike at easysw dot com
  Printing Software for UNIX                       http://www.easysw.com

FOLLOWUP 28:


  From: "Steven Pemberton" <steven.pemberton@cwi.nl>
  
  UTF 8 and UTF 16 are just definitions of how you send a Unicode character
  stream in an interoperable way over the wire. The character set is the same,
  the characters are the same, it is just the encoding that is different.
  
  It is orthogonal to questions of how characters are stored internally. You
  can do what you like internally, it is completely up to you. It has no
  effect on the memory requirements of the receiving device, because you have
  to convert to your internal form anyway.
  
  Steven
  
  ----- Original Message ----- 
  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: "Steven Pemberton" <steven.pemberton@cwi.nl>; <don@lexmark.com>
  Cc: <w3c-html-wg@w3.org>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>; <mike@easysw.com>
  Sent: Friday, October 17, 2003 3:15 AM
  Subject: RE: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  > Don and Steven,
  >
  > I want to expand on what you have said:
  > Don wrote:
  > > > 1) Every XHTML tag will require twice as many bytes when
  > > > represented in UTF-16 versus UTF-8
  > > > 2) Every English XHTML-Print print job will be twice as
  > > > big encoded with UTF-16 versus UTF-8
  > > > 3) Every "Latin 1" print job will be larger approaching
  > > > 2X in size.
  > > >
  > > > When you double the data's size, buffers have to double to
  > > > be able to hold and manipulate an equivalent amount of print
  > > > stream content.
  >
  > This statement is only true for some print streams. See the discussion
  below
  > in "The problem space".
  >
  > Steven wrote:
  > > UTF 16 and UTF 8 are *external* representations. The internal
  > > amount of storage needed for them is identical, and
  > > completely up to you how you store.
  >
  > If a printer uses 16 bits internally to represent a character, then there
  > shouldn't be a difference in buffering requirements between utf-8 and
  utf-16
  > encoded files (see below for a more complete discussion).  However, if a
  > printer uses 8 bits per character, then it has restricted itself to only
  > handle a subset of possible documents, those with ASCII characters.  This
  is
  > a product-specific decision akin to that of whether to make a device print
  > in color or black & white or support landscape as well as portrait
  printing.
  > Therefore, I suggest that the spec say that a printer should support
  utf-16,
  > just as it now says it should support CSS, landscape printing, and
  color --
  > within the limits of the device.  If a user buys a low-cost device that
  can
  > only print ASCII characters in portrait orientation, without color, style
  > sheets, or images, hopefully the price was inline with the printer's
  > abilities and other, more expensive, more capable devices are available as
  > needed.
  >
  > Jim
  >
  >
  > The problem space
  > ----------------------
  > There is a document composition continuum from documents with only text,
  > through mixed text and images, to documents that contain only images.  At
  > the text-only end of the continuum, encoding in UTF-16 rather than UTF-8
  > doubles the document size. At the image-only end of the continuum, the
  > effect of the encoding on document size is overshadowed by the image data.
  >
  > The table below illustrates three points on the document composition
  > continuum:
  > 1. Text-only: a document that prints as one page of ASCII text (times,
  10pt,
  > 8in by 11in paper) [1].  Size, in bytes, is 6,282.
  >
  > 2. Text & Image: a one page document with one 3in x 5in image (166.7K
  bytes)
  > and the remainder text [2]. Size, in bytes, of document and image is
  > 171,531.
  >
  > 3. Image-only: a one page document with eight 2in x 3.25in images (703.2K
  > bytes) and no text. [3] Size, in bytes, of document and eight images is
  > 705,108.
  >
  > Size (bytes): utf-8: %doc : utf-16: %doc
  > Text-only:    6,282: 100  : 12,566: 100
  > Text+Image:   4,776: 3.2  :  9,554: 5.4  (9,554 /(9,554+166,675)* 100)
  > Image-only:   1,916: .27  :  3,834: .54
  >
  > There is another point of variability: the characters in the text portions
  > of the document. This is another continuum from ASCII only at one end to
  > Japanese, Chinese, Korean, and Hindi at the other.
  >
  > "Table 1: UTF types" of [4] gives the following average bytes per code
  point
  >
  >          utf-8  utf-16
  > English  1      2
  > Latin-1  1.1    2
  > Greek,
  > Russian,
  > Arabic,
  > Hebrew   1.7    2
  > Japanese,
  > Chinese
  > Korean
  > Hindi    3      2
  >
  > As the language/script of the text portion of the document changes from
  > English-only toward other scripts and languages, the size difference
  between
  > utf-8 and utf-16 decreases.
  >
  >
  > End-to-end solution
  > -------------------
  > If you look at the end-to-end solution, from the sending application to
  the
  > printer, the stages can be thought of as:
  > 1. Sending Device: the data as represented in the sending device (a cell
  > phone for example)
  > 2. Transmission: the data combined with markup and style information as an
  > XHTML-Print data stream and then encoded in either UTF-8 or UTF-16
  > 3. Receiving Device: the printer -- breaking this into two parts gives:
  > 3.a The XHTML-Print data stream as received
  > 3.b The data without markup and style information and before printing. How
  > the data is stored is implementation dependent, and how much memory is used
  > depends on how a character is represented (8 or 16 bits) and on how much
  > of the document is buffered.  Each printer makes these choices; 8 bits/char
  > restricts the documents processed to Latin-1 characters.
  >
  >
  >
  > Stage   Size    utf-8   utf-16
  > 1. app   n       -         -
  > 2. xmit  n       n-3n*    2n
  > 3a. Pr   n       n-3n     2n
  > 3b. Pr** n       n-2n     n-2n
  >
  > * n-3n shows the variable sizing depending on the characters being encoded:
  > English only (n), CJK (3n)
  > ** at Stage 3b, representing a character with 8 bits restricts the
  > characters that can be represented to ASCII or Latin-1; 16 bits can
  > represent all characters.
  >
  > Internal representation
  >
  > If a printer uses 16 bits internally to represent a character, then there
  > shouldn't be a difference in buffering requirements between utf-8 and
  > utf-16 encoded files.  However, if a printer uses 8 bits, then it has
  > restricted itself to only handle a subset of documents.  This is a
  > product-specific decision akin to that of supporting color or not.
  > Therefore, I suggest that the spec say that a printer should support
  > utf-16 just as it now says it should support CSS, landscape printing, and
  > color -- within the limits of the device.  If a user buys a low-cost device
  > that can only print ASCII characters in portrait orientation, without
  > color, images or style, hopefully the price is in line with the printer's
  > abilities and other, more expensive, more capable devices are available as
  > needed.
  >
  >
  >
  > [1] http://www.pwg.org/xhtml-print/W3C-Version/georgeb.html
  > [2] http://www.pwg.org/xhtml-print/W3C-Version/text+image.html
  > [3] http://www.pwg.org/xhtml-print/W3C-Version/image-only.html
  >
  > [4] http://www-106.ibm.com/developerworks/library/utfencodingforms/
  >
  >

FOLLOWUP 29:


  From: don@lexmark.com
  
  
  Steven:
  
  Your perception of how this works in an embedded device especially in a
  printer that will use this in Bluetooth, UPNP and other environments is
  clearly tainted by your experience of this with the Web and PCs.
  
  0) Of course UTF-8 versus UTF-16 is orthogonal to the internal
  representation of the "printer" but not until it is in the "printer" and
  off the "network"
  
  1)  As defined to be used by Bluetooth and in other environments, the data
  is PUSHed to the device rather than being pulled.  You have less control
  over the amount of data being sent.
  
  2) The network buffers are in the same constrained memory space as the
  processor for XHTML-Print.  Chunks from the network have to be buffered by
  the network process until they can be dealt with by the TCP processes which
  buffers them until they can be dealt with by the XHTML-Print process.  All
  this is done in that same limited, constrained memory space.  If I'm going
  to maintain the performance levels customers expect, I need to be able to
  buffer, in multiple buffers, equivalent amounts of CONTENT, which
  in English encoded as UTF-16 is TWICE as many bytes as in UTF-8.  It is
  unreasonable to expect the network or TCP process within the device to
  convert UTF-16 to the internal format; that happens when it actually hits
  the "printer."  So while it might not take any more memory in the "printer"
  because the content is converted to an internal format, before it reaches
  the "printer" but while it is in the embedded physical device called a
  printer, it does.
  
  Do you get it yet?  In the PC world, the user agent doesn't have to worry
  about all the underlying details necessary to have the content delivered
  from the network.  We don't have that luxury in the embedded space.  All
  that work is done by the same processor and with the same limited memory.
  How else do you think we can sell printers for $29??
  
  *******************************************
  Don Wright                 don@lexmark.com
  
  Chair,  IEEE SA Standards Board
  Member, IEEE-ISTO Board of Directors
  f.wright@ieee.org / f.wright@computer.org
  
  Director, Alliances and Standards
  Lexmark International
  740 New Circle Rd C14/082-3
  Lexington, Ky 40550
  859-825-4808 (phone) 603-963-8352 (fax)
  *******************************************
  
  
  
  
  "Steven Pemberton" <steven.pemberton@cwi.nl> on 10/17/2003 08:55:07 AM
  
  To:    "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>, <don@lexmark.com>
  cc:    <w3c-html-wg@w3.org>, <voyager-issues@mn.aptest.com>,
         <elliott.bradshaw@zoran.com>, <www-html@w3.org>, <mike@easysw.com>
  Subject:    Re: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  UTF 8 and UTF 16 are just definitions of how you send a Unicode character
  stream in an interoperable way over the wire. The character set is the
  same,
  the characters are the same, it is just the encoding that is different.
  
  It is orthogonal to questions of how characters are stored internally. You
  can do what you like internally, it is completely up to you. It has no
  effect on the memory requirements of the receiving device, because you have
  to convert to your internal form anyway.
  
  Steven
  
  ----- Original Message -----
  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: "Steven Pemberton" <steven.pemberton@cwi.nl>; <don@lexmark.com>
  Cc: <w3c-html-wg@w3.org>; <voyager-issues@mn.aptest.com>;
  <elliott.bradshaw@zoran.com>; <www-html@w3.org>; <mike@easysw.com>
  Sent: Friday, October 17, 2003 3:15 AM
  Subject: RE: allow UTF-16 not just UTF-8 (PR#6774)
  
  
  > Don and Steven,
  >
  > I want to expand on what you have said:
  > Don wrote:
  > > > 1) Every XHTML tag will require twice as many bytes when
  > > > represented in UTF-16 versus UTF-8
  > > > 2) Every English XHTML-Print print job will be twice as
  > > > big encoded with UTF-16 versus UTF-8
  > > > 3) Every "Latin 1" print job will be larger approaching
  > > > 2X in size.
  > > >
  > > > When you double the data's size, buffers have to double to
  > > > be able to hold and manipulate an equivalent amount of print
  > > > stream content.
  >
  > This statement is only true for some print streams. See the discussion
  > below in "The problem space".
  >
  > Steven wrote:
  > > UTF 16 and UTF 8 are *external* representations. The internal
  > > amount of storage needed for them is identical, and
  > > completely up to you how you store.
  >
  > If a printer uses 16 bits internally to represent a character, then there
  > shouldn't be a difference in buffering requirements between utf-8 and
  > utf-16 encoded files (see below for a more complete discussion).  However,
  > if a printer uses 8 bits per character, then it has restricted itself to
  > only handle a subset of possible documents, those with ASCII characters.
  > This is a product-specific decision akin to that of whether to make a
  > device print in color or black & white or support landscape as well as
  > portrait printing.  Therefore, I suggest that the spec say that a printer
  > should support utf-16, just as it now says it should support CSS,
  > landscape printing, and color -- within the limits of the device.  If a
  > user buys a low-cost device that can only print ASCII characters in
  > portrait orientation, without color, style sheets, or images, hopefully
  > the price was in line with the printer's abilities and other, more
  > expensive, more capable devices are available as needed.
  >
  > Jim
  >
  >
  > The problem space
  > ----------------------
  > There is a document composition continuum from documents with only text,
  > through mixed text and images, to documents that contain only images.  At
  > the text-only end of the continuum, the effect of UTF-16 vs. UTF-8 is a
  > doubling of document size. At the image-only end of the continuum, the
  > effect on document size of encoding in UTF-16 versus UTF-8 is
  > overshadowed by the image data.
  >
  > The table below illustrates three points on the document composition
  > continuum:
  > 1. Text-only: a document that prints as one page of ASCII text (Times,
  > 10pt, 8in by 11in paper) [1].  Size, in bytes, is 6,282.
  >
  > 2. Text & Image: a one page document with one 3in x 5in image (166.7K
  > bytes) and the remainder text [2]. Size, in bytes, of document and image
  > is 171,531.
  >
  > 3. Image-only: a one page document with eight 2in x 3.25in images (703.2K
  > bytes) and no text [3]. Size, in bytes, of document and eight images is
  > 705,108.
  >
  > Size (bytes)   utf-8   %doc   utf-16   %doc
  > Text-only      6,282   100    12,566   100
  > Text+Image     4,776   3.2     9,554   5.4   (9,554 / (9,554 + 166,675) * 100)
  > Image-only     1,916   .27     3,834   .54
  >
  > There is another point of variability: the characters in the text
  > portions of the document. This is another continuum from ASCII only at
  > one end to Japanese, Chinese, Korean, and Hindi at the other.
  >
  > "Table 1: UTF types" of [4] gives the following average bytes per code
  > point:
  >
  >                                     utf-8  utf-16
  > English                             1      2
  > Latin-1                             1.1    2
  > Greek, Russian, Arabic, Hebrew      1.7    2
  > Japanese, Chinese, Korean, Hindi    3      2
  >
  > As the language/script of the text portion of the document changes from
  > English-only toward other scripts and languages, the size difference
  > between utf-8 and utf-16 decreases.
  >
  >
  > End-to-end solution
  > -------------------
  > If you look at the end-to-end solution, from the sending application to
  > the printer, the stages can be thought of as:
  > 1. Sending Device: the data as represented in the sending device (a cell
  > phone for example)
  > 2. Transmission: the data combined with markup and style information as
  > an XHTML-Print data stream and then encoded in either UTF-8 or UTF-16
  > 3. Receiving Device: the printer -- breaking this into two parts gives:
  > 3.a The XHTML-Print data stream as received
  > 3.b The data without markup and style information and before printing.
  > How the data is stored is implementation dependent, and how much memory
  > is used depends on how a character is represented (8 or 16 bits) and how
  > much of the document is buffered.  Each printer makes these choices;
  > 8 bits/char restricts the documents processed to Latin1 characters.
  >
  >
  >
  > Stage   Size    utf-8   utf-16
  > 1. app   n       -         -
  > 2. xmit  n       n-3n*    2n
  > 3a. Pr   n       n-3n     2n
  > 3b. Pr** n       n-2n     n-2n
  >
  > * n-3n shows the variable sizing depending on the characters being encoded:
  > English only (n), CJK (3n)
  > ** at Stage 3b, representing a character with 8 bits restricts the
  > characters that can be represented to ASCII or Latin 1; 16 bits can
  > represent all characters.
  >
  > Internal representation
  >
  > If a printer uses 16 bits internally to represent a character, then there
  > shouldn't be a difference in buffering requirements between utf-8 and
  > utf-16 encoded files.  However, if a printer uses 8 bits, then it has
  > restricted itself to only handle a subset of documents.  This is a
  > product-specific decision akin to that of supporting color or not.
  > Therefore, I suggest that the spec say that a printer should support
  > utf-16 just as it now says it should support CSS, landscape printing, and
  > color -- within the limits of the device.  If a user buys a low-cost
  > device that can only print ASCII characters in portrait orientation,
  > without color, images or style, hopefully the price is in line with the
  > printer's abilities and other, more expensive, more capable devices are
  > available as needed.
  >
  >
  >
  > [1] http://www.pwg.org/xhtml-print/W3C-Version/georgeb.html
  > [2] http://www.pwg.org/xhtml-print/W3C-Version/text+image.html
  > [3] http://www.pwg.org/xhtml-print/W3C-Version/image-only.html
  >
  > [4] http://www-106.ibm.com/developerworks/library/utfencodingforms/
  >
  >
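
  Illustration (not part of the original correspondence): the size
  comparison argued in this thread can be checked with a short Python
  sketch. The sample strings are assumptions chosen for illustration, and
  the byte counts exclude any byte order mark.

```python
# Illustrative only: the sample strings are assumptions, not data from the thread.
SAMPLES = {
    "English": "Hello, printer",
    "Japanese": "こんにちは",
}


def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count), both without a byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, text in SAMPLES.items():
    utf8_size, utf16_size = encoded_sizes(text)
    print(f"{label:9} chars={len(text):2} utf-8={utf8_size:3} utf-16={utf16_size:3}")
```

  Python's plain "utf-16" codec prepends a two-byte byte order mark. That
  appears consistent with the figures in the quoted table, where each
  UTF-16 size is twice the UTF-8 size plus two bytes.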

FOLLOWUP 30:

  From: Michael Sweet <mike@easysw.com>

  don@lexmark.com wrote:
  > ...
  > 1)  As defined to be used by Bluetooth and in other environments, the
  > data is PUSHed to the device rather than being pulled.  You have less
  > control over the amount of data being sent.
  > ...

  The "push" model is also used for USB, parallel, and serial
  printing, and the current print devices seem to have no problem
  with flow control over these or network interfaces.  It might
  mean that customers will see slower printing with UTF-16 data,
  but between the spec and any documentation you provide to
  developers and customers, it shouldn't surprise anyone...

  -- 
  ______________________________________________________________________
  Michael Sweet, Easy Software Products                  mike@easysw.com
  Printing Software for UNIX                       http://www.easysw.com

FOLLOWUP 31:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  Message from Don Wright of Lexmark:
  Jim:
  
  I noticed after my last message:
  
  http://lists.w3.org/Archives/Member/w3c-html-wg/2003OctDec/0086.html
  
  Pemberton and others in the group ceased the e-mail thread.  Did I convince
  them or have they given up on me?
  
  **********************************************
   Don Wright                 don@lexmark.com
  
   Chair,  IEEE SA Standards Board
   Member, IEEE-ISTO Board of Directors
   f.wright@ieee.org / f.wright@computer.org
  
   Director, Alliances & Standards
   Lexmark International
   740 New Circle Rd
   Lexington, Ky 40550
   859-825-4808 (phone) 603-963-8352 (fax)
  **********************************************

FOLLOWUP 32:


  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  
  My reply to Don's emailed question:
  >Pemberton and others in the group ceased the e-mail thread.  Did 
  >I convince them or have they given up on me?
  
  Don,
  
  I think that the case for and against UTF-16 support in XHTML-Print has been
  made. 
  
  We discussed UTF-8/UTF-16 and the XHTML-Print spec in the 10/22/03 HTML WG
  phone conference. The group has officially voted to ask the Director to make
  XHTML-Print W3C Working Draft 20 October 2003 [1] a Candidate
  Recommendation, noting your dissenting opinion on required UTF-16 support.
  
  Steven Pemberton feels that the director will agree to make the
  specification a Candidate Recommendation. 
  
  You may register a formal objection [2] concerning UTF-16 support in
  XHTML-Print, if you feel that your comments on this issue haven't
  sufficiently represented your position. Please continue to CC:
  voyager-issues@mn.aptest.com on any further discussions, since this provides
  an archive.
  
  The Disposition of Comments for XHTML-Print is at [3].
  
  Jim
  
  [1] http://www.w3.org/MarkUp/Group/2003/WD-xhtml-print-20031020/
  [2] http://www.w3.org/2003/06/Process-20030618/policies.html#WGArchiveMinorityViews
  [3]  http://www.w3.org/MarkUp/Group/2003/xhtml-print-cr-doc-20031017.html

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6774 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group agrees that, since XHTML-Print is a member
  of the family of XHTML 1.0 languages, document encodings cannot
  be restricted to UTF-8 but must also include UTF-16.  The
  specification will be modified to remove the sentence, 
  'The only valid value for the "charset" parameter is "utf-8".'
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6774;user=guest

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Don,
  
  What do you think of the following compromise?
  1. say nothing about whether a printer supports UTF-8 or UTF-16
  2. require that conforming XHTML-Print documents be encoded in UTF-8 by
  requiring that conforming clients (Section 2.2) create documents that are
  encoded in UTF-8. This means adding the following to item 1 of Section 2.2: 
  
  1. Clients SHALL produce a well-formed XHTML-Print document as defined in XHTML
  1.0 [XHTML1] and in Document Conformance. The document SHALL be encoded using
  UTF-8 [RFC2279].
  
  
  Jim Bigelow

REPLY 3:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Don and Elliott,
  
  The HTML working group discussed my question of why an XHTML-Print processor
  must be a conforming XML processor (in particular, why it must support both
  UTF-8 and UTF-16 encodings) on October 1, 2003.  
  
  The answer is that an XHTML-Print processor must be a conforming XML processor
  and support both UTF-8 and UTF-16 encodings to preserve compatibility between
  XML-based applications.
  
  If XHTML-Print processors only supported UTF-8 then an xml-based application
  could not be reliably depended upon to emit an XHTML-Print document that the
  XHTML-print application could process.  For example, an xml-based Xforms
  application's output of an XHTML-Print document cannot be restricted by the
  XHTML-Print specification to UTF-8 since the application may not be able to
  control the encoding.
  
  Section 4.3.3 [1] and Appendix F [2] of the XML specification [3] give
  heuristics for determining a document's encoding when the charset parameter of
  the MIME type [4] is absent.
  
  An example UTF-16 decoder is available at [5]; other encodings are at [6].
  
  Jim Bigelow
  
  [1] http://www.w3.org/TR/REC-xml#charencoding
  [2] http://www.w3.org/TR/REC-xml#sec-guessing
  [3] http://www.w3.org/TR/REC-xml
  [4] http://www.ietf.org/rfc/rfc3023.txt
  [5] http://interscript.sourceforge.net/interscript/doc/en_iscr_0282.html
  [6] http://interscript.sourceforge.net/interscript/doc/en_iscr_0275.html
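
  Illustration (not part of the original correspondence): a minimal Python
  sketch of the kind of byte-level detection that Appendix F of the XML
  specification describes. It is deliberately simplified: the UCS-4 and
  EBCDIC cases are omitted, the encoding declaration itself is not parsed,
  and the function name is hypothetical.

```python
def sniff_xml_encoding(first_bytes):
    """Guess the encoding family of an XML document from its first bytes.

    Simplified sketch: only UTF-8 and UTF-16 are distinguished.
    """
    if first_bytes.startswith(b"\xef\xbb\xbf"):
        return "utf-8"       # UTF-8 byte order mark
    if first_bytes.startswith(b"\xfe\xff"):
        return "utf-16-be"   # big-endian byte order mark
    if first_bytes.startswith(b"\xff\xfe"):
        return "utf-16-le"   # little-endian byte order mark
    if first_bytes.startswith(b"\x00<\x00?"):
        return "utf-16-be"   # "<?" encoded as UTF-16BE, no byte order mark
    if first_bytes.startswith(b"<\x00?\x00"):
        return "utf-16-le"   # "<?" encoded as UTF-16LE, no byte order mark
    return "utf-8"           # XML default when nothing else is signalled


print(sniff_xml_encoding('<?xml version="1.0"?>'.encode("utf-16-be")))
```

  A printer-side implementation would run a check like this only when the
  transport supplies no charset parameter, then hand the bytes to the
  matching decoder.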

1.7 why does object type override content type/HTTP level?

PROBLEM ID: 6775

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Agreed; changed wording to say resources

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>

  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: why does object type override content type/HTTP level?
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi

  3.10 Object Module
  "A printer MUST treat the object as a jpeg image when the value of the 
  object element's type attribute is 'text/jpeg'." Why is the type 
  attribute allowed to override the content type information delivered on 
  the Application/Vnd.pwg-multiplexed  or HTTP level? Previously the type 
  attribute has been considered advisory so that user agents may omit 
  requesting object they know they can't handle. (I assume "text/jpeg" is 
  a mistake and means "image/jpeg").

  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6775 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group agrees with your comments and has modified the 
  text of section 3.10 to read, "A printer must support 
  resources of type 'image/jpeg'."
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6775;user=guest

1.8 XHTML-Print: treating a missing media attribute as media="screen"

PROBLEM ID: 6870

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  changed to "all"

ORIGINAL MESSAGE:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: w3c-html-editor@w3.org
  Cc: xp@pwg.org
  Subject: XHTML-Print: treating a missing media attribute as media="screen"
  	 when printing not user's intent
  Date: Thu, 4 Sep 2003 14:10:55 -0400 
  Message-ID: <020A3CF87FB5AC47AA67966B33845755050DB594@xboi22.boise.itc.hp.com>

  Sections 3.13 and 3.15 of the W3C Last Call Working Draft of XHTML-Print [1]
  state, "The absence of the media attribute MUST be treat[ed] as if the media
  attribute had the value 'screen.'"  

  At the risk of being accused of mind reading, I think that most document
  authors do not write style sheets for printing but would like the styles to
  be applied when printing as well as browsing.  Therefore changing the value
  "screen" in the statement shown above to the value "all" would give more
  consistent results when browsing and printing.

  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/

  Jim Bigelow
  Hewlett-Packard Co.

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Jonny wrote:
  I am starting to believe that this error isn't a bug (yes, the default 
  value *is* "all"), but a virus the way it keeps replicating. Anyone 
  willing to guess which spec it will infect next?
  
  -- 
  Jonny Axelsson,
  Web Standards,
  Opera Software

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Don Wright wrote:
  
  
  Jim:
  
  "All" is consistent with XHTML2.  See
  http://www.w3.org/MarkUp/Group/2003/WD-xhtml2-20030810/abstraction.html#dt_MediaDesc
  
  **********************************************
   Don Wright                 don@lexmark.com

REPLY 3:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6870 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6870;user=guest
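
  Illustration (not part of the original correspondence): a minimal Python
  sketch of the agreed behavior, in which a missing media attribute is
  treated as "all". The function name is hypothetical, and the media value
  is assumed to be a comma-separated list of media types.

```python
def applies_when_printing(media=None):
    """Return True if a style sheet with this media attribute applies to print.

    A missing attribute (None) is treated as "all", per the resolution above.
    """
    if media is None:
        media = "all"
    media_types = {item.strip().lower() for item in media.split(",")}
    return "all" in media_types or "print" in media_types


print(applies_when_printing())            # no media attribute
print(applies_when_printing("screen"))    # screen-only style sheet
```

  Under the earlier "screen" default, the first call would have excluded
  the style sheet from printed output.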

1.9 support for character entities too expensive for low-cost printers

PROBLEM ID: 6776

STATE: Closed
RESOLUTION: Reject
USER POSITION: Agree

NOTES:

  No response to reply 2; assuming agreement.

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>

  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: support for character entities too expensive for low-cost printers
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi

  3.17 Character Entities
  The specification mentions that character entities are defined but 
  doesn't say whether printers should support them.

  I think requiring XHTML-Print implementations to support character 
  entities would be a very bad idea. Support for character entities is 
  the only feature of XHTML-Print that requires the printer to process 
  external entities. The burden of implementing a DTD catalog and parsing 
  the huge (relative to the size of the usual XHTML documents) DTD files 
  is significant compared to using a non-validating XML processor and not 
  processing external entities at all.

  Since XHTML-Print is intended to be used with low-cost printers and the 
  overwhelmingly most likely use case is that the documents are generated 
  by software as opposed to being written by hand by humans, I suggest 
  explicitly stating that printers should not be expected to support 
  character entities (or any other features of XML that depend on the 
  external entities to be processed, such as attribute defaulting).

  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

FOLLOWUP 1:

  From: Henri Sivonen <hsivonen@iki.fi>

  On Saturday, Sep 27, 2003, at 00:26 Europe/Helsinki, Jim Bigelow wrote:

  > The working group does not agree with you concerning
  > requiring support for a set of predefined character entities.
  > The group feels that the set of required character
  > entities has a small memory footprint when implemented as
  > a data set. Furthermore, such a data set does not require
  > that a printer read the DTD.  Therefore, no change to the
  > specification is planned in this regard.

  The problem is that implementing such a data set without reading the DTD 
  would mean that the parser would not be an XML processor as defined in 
  the XML spec. Using a modified parser would break one of XML's 
  benefits: the ability to use a ready-made off-the-shelf parser whose 
  functionality is well defined. Also, having such almost-XML processors 
  around could cause interoperability problems, since different parsers 
  would have different ideas of what the pre-defined entities were and, 
  therefore, what entity references rendered a document not well-formed.

  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6776 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group does not agree with you concerning 
  requiring support for a set of predefined character entities.
  The group feels that the set of required character
  entities has a small memory footprint when implemented as 
  a data set. Furthermore, such a data set does not require
  that a printer read the DTD.  Therefore, no change to the 
  specification is planned in this regard.
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6776;user=guest

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Henri Sivonen wrote:
  > The problem is that implementing such a data set without reading the DTD 
  > would mean that the parser would not be an XML processor as defined in 
  > the XML spec. Using a modified parser would break one of XML's 
  > benefits: the ability to use a ready-made off-the-shelf parser whose 
  > functionality is well defined. 
  
  An XHTML-Print processor is only required to deal with XHTML-Print documents
  
  > Also, having such almost-XML processors 
  > around could cause interoperability problems, since different parsers 
  > would have different ideas of what the pre-defined entities were and, 
  > therefore, what entity references rendered a document not well-formed.
  > 
  The set of pre-defined entities that an XHTML-Print processor must support is
  well-defined. These entities are specified in the XHTML-Print specification in
  [1]. No other entities are part of XHTML-Print and users do not have a means to
  create new entities. Therefore, a conforming printer need only implement a means
  to recognize the set of pre-defined entities and replace them with the required
  Unicode code points. It is then up to the implementation of a conforming printer
  how best to process the pre-defined set of entities.  
  
  Some implementations have done this via a data table that is compiled into the
  code, thereby relieving the printer of the need to redundantly access the same
  information from the DTD for each XHTML-Print document.
  
  However, the specification does not constrain how a conforming printer should
  provide support for the set of pre-defined entities.
  
  
  Jim Bigelow
  Editor
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/#s_charentities
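
  Illustration (not part of the original correspondence): a minimal Python
  sketch of the "data table compiled into the code" approach mentioned in
  the reply. The table below is a hypothetical four-entry excerpt; a real
  printer would carry the full set listed in the XHTML-Print specification,
  and the specification does not prescribe this or any other technique.

```python
import re

# Hypothetical excerpt of a compiled-in table: entity name -> Unicode code point.
ENTITY_TABLE = {
    "nbsp": 0x00A0,
    "copy": 0x00A9,
    "eacute": 0x00E9,
    "euro": 0x20AC,
}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entities(text):
    """Replace named entity references in text with their Unicode characters."""
    def replace(match):
        name = match.group(1)
        if name not in ENTITY_TABLE:
            # The set is closed, so an unknown name is an error, not a pass-through.
            raise ValueError(f"undefined entity: &{name};")
        return chr(ENTITY_TABLE[name])
    return ENTITY_REF.sub(replace, text)


print(expand_entities("caf&eacute; &copy; 2003"))
```

  Because the table is fixed at build time, no DTD has to be fetched or
  parsed per document, which is the memory argument made in the reply.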

1.10 MIME type Application/Multiplexed not correct

PROBLEM ID: 6777

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  correct spec as indicated in issue

ORIGINAL MESSAGE:

  From: Henri Sivonen <hsivonen@iki.fi>
  
  From: Henri Sivonen <hsivonen@iki.fi>
  To: www-html-editor@w3.org
  Subject: MIME type Application/Multiplexed not correct
  Date: Sun, 3 Aug 2003 22:01:47 +0300
  Message-Id: <EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi>
  X-Archived-At: http://www.w3.org/mid/EE667E7F-C5E4-11D7-B77B-003065B8CF0E@iki.fi
  
  
  B.2 MIME type Application/Multiplexed
  The heading and the following reference to RFC3391 should say 
  Application/Vnd.pwg-multiplexed instead of Application/Multiplexed.
      
  [extracted from issue 6548]
  -- 
  Henri Sivonen
  hsivonen@iki.fi
  http://www.iki.fi/hsivonen/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6777 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group made the change you suggested.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6777;user=guest

1.11 XHTML-Print: Appendix B.2.1 uses "image header" without defining it.

PROBLEM ID: 6871

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  defined image header

ORIGINAL MESSAGE:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: www-html-editor@w3.org
  Cc: xp@pwg.org
  Subject: XHTML-Print: Appendix B.2.1 uses "image header" without defining it.
  Date: Thu, 4 Sep 2003 14:20:46 -0400 
  Message-ID: <020A3CF87FB5AC47AA67966B33845755050DB5AA@xboi22.boise.itc.hp.com>
  X-Archived-At: http://www.w3.org/mid/020A3CF87FB5AC47AA67966B33845755050DB5AA@xboi22.boise.itc.hp.com

  Appendix B.2.1 of the W3C Last Call Working Draft of XHTML-Print [1] uses
  the term "image's header" without defining it.  We at Hewlett-Packard
  suggest that the term be defined as everything from the beginning of the
  image up to and including the "start of scan marker."

  Jim Bigelow
  Hewlett-Packard Co.

  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
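
  Illustration (not part of the original correspondence): a minimal Python
  sketch that measures a JPEG "image header" under the proposed definition.
  It assumes the header ends immediately after the two-byte start of scan
  marker (FF DA) and that every earlier segment carries a length field; the
  function name is hypothetical.

```python
def jpeg_header_length(data):
    """Return the number of bytes up to and including the SOS marker (FF DA)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker; not a JPEG stream")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        if data[pos + 1] == 0xDA:      # start of scan
            return pos + 2
        if pos + 4 > len(data):
            break
        # Other header segments carry a big-endian length that counts the
        # two length bytes themselves but not the two marker bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if length < 2:
            raise ValueError(f"invalid segment length at offset {pos}")
        pos += 2 + length
    raise ValueError("no SOS marker found")


# SOI, one APP0 segment with a 2-byte payload, then SOS and scan data.
sample = b"\xff\xd8" + b"\xff\xe0\x00\x04\xaa\xbb" + b"\xff\xda\x00\x02" + b"scan"
print(jpeg_header_length(sample))
```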

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Don Wright wrote:
  
  Makes sense to me.
  
  **********************************************
   Don Wright                 don@lexmark.com

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6871 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6871;user=guest

1.12 Required support for script, noscript, and hidden

PROBLEM ID: 6778

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  same issue as 6772

ORIGINAL MESSAGE:

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Thursday, July 31, 2003 1:29 PM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: xp@pwg.org
  Subject: Required support for script, noscript, and hidden

  2.  Required support for script, noscript, and hidden.  I don't mind this
  change, exactly.  But (at the risk of re-opening a long debate) if the
  assumption is that an XHTML-Print client is generating data specifically in
  this language, then it should never generate these cases.  So mandating
  support seems redundant.  On the other hand, if the intent is to gracefully
  degrade when receiving data from other sources, then there are other issues
  (e.g. frames) that also come up.

  [extracted from issue 6536]

  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  > From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  > Sent: Thursday, July 31, 2003 1:29 PM
  > To: BIGELOW,JIM (HP-Boise,ex1)
  > Cc: xp@pwg.org
  > Subject: Required support for script, noscript, and hidden
  > 
  > 2.  Required support for script, noscript, and hidden.  I don't mind this
  > change, exactly.  But (at the risk of re-opening a long debate) if the
  > assumption is that an XHTML-Print client is generating data specifically in
  > this language, then it should never generate these cases.  So mandating
  > support seems redundant.  On the other hand, if the intent is to gracefully
  > degrade when receiving data from other sources, then there are other issues
  > (e.g. frames) that also come up.
  > 
  Adding support for <noscript> allows a document author to use a single document
  and have the script execute when browsing and the content of the noscript
  element be displayed when printing.   The PWG version of XHTML-Print
  specifically said that the content of the script element should not be printed
  (Section 1.3.1); however, it didn't indicate how a printer was to recognize the
  script element and treat it differently than all other unknown elements.  This
  change indicates how the printer should recognize a script, that the content
  should be discarded, and that the alternate content in the noscript be printed.
  
  So, I think this change cleans up the intent already expressed in previous
  versions and does not open the larger issue of graceful degradation in the face
  of non-XHTML-Print documents.
  
  Jim.
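
  Illustration (not part of the original correspondence): a minimal Python
  sketch of the behavior described above, in which script content is
  discarded and noscript content is kept for printing. It uses the standard
  library's ElementTree parser; the function names and the sample document
  are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def printable_text(element):
    """Collect text for printing: skip script content, keep noscript content."""
    parts = []
    if local_name(element.tag) != "script":
        if element.text:
            parts.append(element.text)
        for child in element:
            parts.append(printable_text(child))
    # Text that follows the element belongs to its parent and is always kept.
    if element.tail:
        parts.append(element.tail)
    return "".join(parts)


doc = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    "<script>drawChart();</script>"
    "<noscript><p>Chart unavailable in print.</p></noscript>"
    "<p>Total: 42</p>"
    "</body>"
)
print(printable_text(doc))
```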

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6778 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group does not agree that support for the script
  element implies support for document types other than XHTML-Print.  
  Therefore, no changes to the specification are planned regarding
  this issue.
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6778;user=guest

1.13 treatment of attributes

PROBLEM ID: 6779

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  resolve as comment (albeit a nice one)

ORIGINAL MESSAGE:

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Thursday, July 31, 2003 1:29 PM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: xp@pwg.org
  Subject:  treatment of attributes
  
  3.  The new treatment for attributes is nice.
  
  
  [extracted from issue 6536]
  
  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

1.14 change of MIME type to application/xhtml+xml not compatible with UPnP

PROBLEM ID: 6780

STATE: Closed
RESOLUTION: Modify and Accept
USER POSITION: Agree

NOTES:

  Printers must support W3C and PWG MIME Type and DTD. PWG versions deprecated.

ORIGINAL MESSAGE:

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Thursday, July 31, 2003 1:29 PM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: xp@pwg.org
  Subject:  change of MIME type to application/xhtml+xml not compatible with UPnP

  4.  Section 2.1, last paragraph.  Changing the MIME type makes sense.  But I
  assume that "application/xhtml+xml" could refer to other kinds of data
  besides XHTML-Print.  In other words, the receiving side can't tell that
  this data is XHTML-Print.  Unless he looks at the DOCTYPE...right?

  I'm wondering if this change will be a problem for protocols such as UPnP
  that use the MIME type to distinguish "document format" (in the Semantic
  Model sense) when advertising capabilities.  For example,
  http://www.upnp.org/download/Service_print_v1_020808.pdf says

    "All UPnP printers MUST support at least the
  'application/vnd.pwg-xhtml-print' document format[XHTML-PRINT] ..."

  This would have to change to something new, in a way that specifically
  refers to XHTML-Print.

  [extracted from issue 6536]

  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

FOLLOWUP 1:


  From: elliott.bradshaw@zoran.com
  
  
  I am not sure that this resolution solves the problem.
  
  Protocols such as UPnP and Bluetooth need a unique MIME type to describe
  support for documents formatted as XHTML-Print.
  
  I agree that the current type application/vnd.pwg-xhtml-print+xml should be
  migrated to something more official, which would require such protocols to
  make revisions that move away from the deprecated name.  But they still
  need a unique way to identify XHTML-Print.
  
  Perhaps those groups have come up with another way to solve this, but to me
  a unique MIME type would be the right way to go.
  
  Can the W3C register a new MIME type for this purpose?
  
    Best regards,
    Elliott
  
  
  --------------------------------------------------------------------------------
  
  Elliott Bradshaw
  Director, Software Engineering
  Zoran Imaging Group (formerly Oak Technology Imaging Group)
  781 638-7534
  
  
  
                                                                                                       
  From:    Jim Bigelow <voyager-issues@mn.aptest.com>
  Date:    09/26/2003 06:24 PM
  To:      ElliottBradshaw@oaktech.com
  cc:
  Subject: Re: change of MIME type to application/xhtml+xml not compatible
           with UPnP (PR#6780)
  
  Thank you for your comment on the XHTML-Print Last Call
  Working Draft. It is recorded as issue 6780 [1] in the HTML
  Working Group's issue tracking system.
  
  The working group decided that the MIME type
  "application/vnd.pwg-xhtml-print+xml" must be recognized as referring to a
  conforming XHTML-Print document, along with the MIME Type
  "application/xhtml+xml".  However, the "application/vnd.pwg-xhtml-print+xml"
  MIME type is deprecated in favor of the MIME Type "application/xhtml+xml".
  Future releases of this specification may remove the required support for the
  MIME type "application/vnd.pwg-xhtml-print+xml".
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6780;user=guest

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  > From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  > Sent: Thursday, July 31, 2003 1:29 PM
  > To: BIGELOW,JIM (HP-Boise,ex1)
  > Cc: xp@pwg.org
  > Subject:  change of MIME type to application/xhtml+xml not compatible with UPnP
  > 
  > 4.  Section 2.1, last paragraph.  Changing the MIME type makes sense.  But I
  > assume that "application/xhtml+xml" could refer to other kinds of data
  > besides XHTML-Print.  In other words, the receiving side can't tell that
  > this data is XHTML-Print.  Unless he looks at the DOCTYPE...right?
  > 
  > I'm wondering if this change will be a problem for protocols such as UPnP
  > that use the MIME type to distinguish "document format" (in the Semantic
  > Model sense) when advertising capabilities.  For example,
  > http://www.upnp.org/download/Service_print_v1_020808.pdf says
  > 
  >   "All UPnP printers MUST support at least the
  > 'application/vnd.pwg-xhtml-print' document format[XHTML-PRINT] ..."
  > 
  > This would have to change to something new, in a way that specifically
  > refers to XHTML-Print.
  > 
  Your point also holds for the Bluetooth Basic Printing Profile (v0.95)
  (http://www.bluetooth.com/pdf/Basic_Printing_Profile_0_95a.pdf).  I think that
  XHTML-Print must continue to support the MIME type of
  'application/vnd.pwg-xhtml-print' and support for "application/xhtml+xml" should
  be optional.  I'll argue for this during the working group review.
  
  -- Jim

REPLY 2:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6780 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group decided that the MIME type
  "application/vnd.pwg-xhtml-print+xml" must be recognized as referring to a
  conforming XHTML-Print document, along with the MIME Type
  "application/xhtml+xml".  However, the "application/vnd.pwg-xhtml-print+xml"
  MIME type is deprecated in favor of the MIME Type "application/xhtml+xml". Future
  releases of this specification may remove the required support for the MIME type
  "application/vnd.pwg-xhtml-print+xml".
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6780;user=guest
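
As an illustration of the resolution in REPLY 2, the following is a minimal sketch of how a receiver might classify an incoming Content-Type value. The function name classify_media_type and its return labels are assumptions made for this example. As the original comment points out, "application/xhtml+xml" does not by itself identify XHTML-Print, so a real receiver would presumably also have to inspect the DOCTYPE.

```python
W3C_TYPE = "application/xhtml+xml"
PWG_TYPE = "application/vnd.pwg-xhtml-print+xml"  # deprecated, still recognized


def classify_media_type(content_type):
    """Classify a Content-Type header value.

    Returns "current" for the W3C type, "deprecated" for the PWG type,
    and None for anything else. The labels are hypothetical.
    """
    # Drop parameters such as charset and normalize case before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == W3C_TYPE:
        return "current"
    if media_type == PWG_TYPE:
        return "deprecated"
    return None


for value in (
    "application/xhtml+xml; charset=utf-8",
    "application/vnd.pwg-xhtml-print+xml",
    "text/html",
):
    print(value, "->", classify_media_type(value))
```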

1.15 Relaxing XHTML-Print's restriction to UTF-8 to include UTF-16

PROBLEM ID: 6815

STATE: Approved
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  duplicate of 6774

ORIGINAL MESSAGE:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: www-html-editor@w3.org
  Cc: xp@pwg.org
  Subject: Relaxing XHTML-Print's restriction to UTF-8 to include UTF-16
  Date: Tue, 2 Sep 2003 20:42:14 -0400 
  Message-ID: <020A3CF87FB5AC47AA67966B3384575504D1D0AD@xboi22.boise.itc.hp.com>
  X-Archived-At: http://www.w3.org/mid/020A3CF87FB5AC47AA67966B3384575504D1D0AD@xboi22.boise.itc.hp.com
  
  > From: Henri Sivonen [mailto:hsivonen@iki.fi] 
  ...
  > It is said that if a "charset" parameter is present for the 
  > application/xhtml+xml MIME type, the only valid value is "utf-8". It 
  > would make sense to allow "utf-16" as well. All XML processors are 
  > required to support UTF-16 in addition to UTF-8, so allowing 
  > UTF-16 for XHTML-Print doesn't cause any additional burden 
  > to implementations. Also, the payload of 
  > Application/Vnd.pwg-multiplexed  chunks is defined 
  > as octets, so UTF-16 strings can be delivered as  
  > Application/Vnd.pwg-multiplexed chunks without any further encoding.
  >
  
  I tend to agree with Henri when he says that supporting UTF-16 would not be
  much more expensive than UTF-8.  Does anyone on this list or the PWG's
  XHTML-Print list disagree?
  
  Jim
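
As an illustration of the proposal above, the following is a minimal sketch of a check on the "charset" parameter, assuming that "utf-16" is accepted alongside "utf-8". The names ALLOWED_CHARSETS and charset_is_acceptable are hypothetical; the header parsing uses the Python standard library.

```python
from email.message import Message

# Assumption: "utf-16" is allowed in addition to "utf-8", as proposed above.
ALLOWED_CHARSETS = {"utf-8", "utf-16"}


def charset_is_acceptable(content_type):
    """Return True if the charset parameter is absent or an allowed value."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased, or None when absent
    return charset is None or charset in ALLOWED_CHARSETS


for value in (
    "application/xhtml+xml; charset=UTF-8",
    'application/xhtml+xml; charset="utf-16"',
    "application/xhtml+xml; charset=iso-8859-1",
):
    print(value, "->", charset_is_acceptable(value))
```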

1.16 Change to wording of Section 2.3.1, "Images" section, fourth bullet confusing

PROBLEM ID: 6781

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  change spec to use wording in followup 1

ORIGINAL MESSAGE:

  From: ElliottBradshaw@oaktech.com [mailto:ElliottBradshaw@oaktech.com] 
  Sent: Thursday, July 31, 2003 1:29 PM
  To: BIGELOW,JIM (HP-Boise,ex1)
  Cc: xp@pwg.org
  Subject: Change to wording of Section 2.3.1, "Images" section, fourth bullet confusing

  5.  Section 2.3.1, "Images" section, fourth bullet.  It used to say "Image
  data within the object element need not be supported." and now it says "A
  printer MAY choose to omit images referenced by a URI [RFC2396] containing a
  scheme name other than cid [RFC2392] and http [RFC2616] ."  I'm confused.

  [extracted from issue 6536]

  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

FOLLOWUP 1:

  From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  To: www-html-editor@w3.org
  Cc: xp@pwg.org
  Subject: RE: XP> FW: Last call announcement for XHTML Print
  Date: Thu, 31 Jul 2003 13:53:00 -0700
  Message-ID: <020A3CF87FB5AC47AA67966B3384575503C7DBE0@xboi22.boise.itc.hp.com>
  X-Archived-At: http://www.w3.org/mid/020A3CF87FB5AC47AA67966B3384575503C7DBE0@xboi22.boise.itc.hp.com

  Elliott,

  You wrote:
  > 
  > I reviewed the public version and here are a few comments.
  >
  ... 
  > 
  > 
  > 5.  Section 2.3.1, "Images" section, fourth bullet.  It used 
  > to say "Image data within the object element need not be 
  > supported." and now it says "A printer MAY choose to omit 
  > images referenced by a URI [RFC2396] containing a scheme name 
  > other than cid [RFC2392] and http [RFC2616] ."  I'm confused.
  > 

  The rewording is an attempt to say, in the positive, what URI types must be
  supported and by implication that support for the data URI is not required.

  Perhaps it should actually say that in the positive :-).  For example,

  A printer must support images referenced by a URI [RFC2396] containing a 
  scheme name cid [RFC2392] and http [RFC2616], support for other scheme names
  is optional. However, support for a URI containing the data scheme name [REF
  NEEDED] is not required unless the printer chooses to implement the method
  for supporting in-line data given in Appendix B.3.

  Jim

FOLLOWUP 2:


  From: ElliottBradshaw@oaktech.com
  To: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow@hp.com>
  Cc: owner-xp@pwg.org, www-html-editor@w3.org, xp@pwg.org
  Subject: RE: XP> FW: Last call announcement for XHTML Print
  Date: Fri, 1 Aug 2003 09:28:23 -0400
  Message-ID: <OF13B3AA0D.ACFCD949-ON85256D75.0049B11B-85256D75.004A382E@ne.oaktech.com>
  X-Archived-At: http://www.w3.org/mid/OF13B3AA0D.ACFCD949-ON85256D75.0049B11B-85256D75.004A382E@ne.oaktech.com
  
  Jim,
  
  I see.  Actually the current draft now makes sense to me, but your revision
  is better.
  
    E.
  
  
  ------------------------------------------
  Elliott Bradshaw
  Director, Software Engineering
  Oak Technology Imaging Group
  781 638-7534

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6781 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group decided to change the wording of section 2.3.1 to, 
  "A printer must support images referenced by a URI [RFC2396] containing a 
  scheme name cid [RFC2392] and http [RFC2616], support for other scheme names
  is optional."
  
  If you feel that this resolution of your comment is not acceptable, please
  respond to this message with your comments.
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6781;user=guest
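
As an illustration of the adopted wording, the following is a minimal sketch that sorts image URIs by whether support for their scheme is required (cid, http) or optional (anything else). The function name scheme_support and its return labels are assumptions made for this example. Relative references carry no scheme of their own, so the sketch reports them separately.

```python
from urllib.parse import urlsplit

REQUIRED_SCHEMES = {"cid", "http"}  # schemes named in the adopted wording


def scheme_support(uri):
    """Return "required", "optional", or "relative" for an image URI."""
    scheme = urlsplit(uri).scheme.lower()
    if not scheme:
        return "relative"  # the scheme depends on the document's base URI
    if scheme in REQUIRED_SCHEMES:
        return "required"
    return "optional"


for uri in (
    "cid:image1@example.com",
    "http://example.com/logo.png",
    "data:image/png;base64,AAAA",
    "logo.png",
):
    print(uri, "->", scheme_support(uri))
```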

1.17 RFC 2119 keyword in informative section

PROBLEM ID: 6783

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Remove RFC 2119 keyword annotations from informative section -- Jim

ORIGINAL MESSAGE:

  From: Susan Lesch [mailto:lesch@w3.org] 
  
  These are minor editorial comments for your XHTML-Print Last Call Working
  Draft [1]. Kudos to the editor and your group(s). It looks great.
  
  In 4.3 I am not sure the RFC 2119 key word MUST makes sense in an
  informative section (it might).
  
  [extracted from 6899]
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  
  Best wishes for your project,
  -- 
  Susan Lesch           http://www.w3.org/People/Lesch/
  mailto:lesch@w3.org               tel:+1.858.483.4819
  World Wide Web Consortium (W3C)    http://www.w3.org/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6783 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6783;user=guest

1.18 Diagram 1 height & width not right

PROBLEM ID: 6784

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Change height and width -- Jim

ORIGINAL MESSAGE:

  From: Susan Lesch [mailto:lesch@w3.org] 
  
  These are minor editorial comments for your XHTML-Print Last Call Working
  Draft [1]. Kudos to the editor and your group(s). It looks great.
  
  
  Diagram 1 is squished to height="303" width="450". The image is really
  height="404" width="600".
  
  [extracted from 6899]
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  
  Best wishes for your project,
  -- 
  Susan Lesch           http://www.w3.org/People/Lesch/
  mailto:lesch@w3.org               tel:+1.858.483.4819
  World Wide Web Consortium (W3C)    http://www.w3.org/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6784 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6784;user=guest

1.19 Spell out abbreviations at first occurrence

PROBLEM ID: 6785

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Make changes as noted -- Jim

ORIGINAL MESSAGE:

  From: Susan Lesch [mailto:lesch@w3.org] 
  
  These are minor editorial comments for your XHTML-Print Last Call Working
  Draft [1]. Kudos to the editor and your group(s). It looks great.
  
  It would help to have these spelled out in their first occurrence: EXIF
  (Exchangeable Image File Format), JFIF (JPEG File Interchange Format), TIFF
  (Tag Image File Format), and IFD (image file directory).
  
  [extracted from 6899]
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  
  Best wishes for your project,
  -- 
  Susan Lesch           http://www.w3.org/People/Lesch/
  mailto:lesch@w3.org               tel:+1.858.483.4819
  World Wide Web Consortium (W3C)    http://www.w3.org/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6785 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6815;user=guest

1.20 markup elements and attributes globally

PROBLEM ID: 6786

STATE: Closed
RESOLUTION: Accept
USER POSITION: Agree

NOTES:

  Make changes as noted. -- Jim

ORIGINAL MESSAGE:

  From: Susan Lesch [mailto:lesch@w3.org] 
  
  These are minor editorial comments for your XHTML-Print Last Call Working
  Draft [1]. Kudos to the editor and your group(s). It looks great.
  
  
  It may make sense to mark up elements and attributes globally
  <code>thus</code>, as they are in 1.3.1 and some other places (that
  eliminates the need for quotes in the 4.1 heading).
  
  [extracted from 6899]
  
  [1] http://www.w3.org/TR/2003/WD-xhtml-print-20030729/
  
  Best wishes for your project,
  -- 
  Susan Lesch           http://www.w3.org/People/Lesch/
  mailto:lesch@w3.org               tel:+1.858.483.4819
  World Wide Web Consortium (W3C)    http://www.w3.org/

REPLY 1:


  From: Jim Bigelow <voyager-issues@mn.aptest.com>
  
  Thank you for your comment on the XHTML-Print Last Call 
  Working Draft. It is recorded as issue 6786 [1] in the HTML 
  Working Group's issue tracking system. 
  
  The working group has elected to implement your suggestions.
  
  
  Jim Bigelow
  Editor
  
  [1] http://hades.mn.aptest.com/cgi-bin/voyager-issues/XHTML-Print?id=6786;user=guest
