Serialization Last Call issues
This document identifies the status of Last Call issues on XSLT 2.0 and XQuery 1.0 Serialization as of February 11, 2005.
The serialization draft has been defined jointly by the XML Query Working Group and the XSL Working Group (both part of the XML Activity).
The February 11, 2005 working draft includes a number of changes made in response to comments received during the Last Call period that ended on February 15, 2004. The working group is continuing to process these comments, and additional changes are expected.
Public comments on this document and its open issues are invited. Comments should be sent to the W3C mailing list public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with “[Serial]” at the beginning of the subject field.
Most issues are classified as either “substantive”, meaning the editor believes technical changes to the document are required to address them, or “editorial”, meaning that the issue is one of specification clarity rather than technical correctness.
An issue transitions through several states. Issue tracking begins when an issue is “raised”. After discussion, the Working Group may have “decided” how to resolve the issue. This decision is “announced” and hopefully “acknowledged” by external commenters. For the most part, once an issue is decided, it is considered closed.
There are 125 issues.
17 raised (9 substantive), 0 proposed, 3 decided, 41 announced and 59 acknowledged.
Id | Title | Type | State |
+qt-2003Nov0050-01 | WD-xslt-xquery-serialization-20030502 omit-xml-declaration | substantive | acknowledged |
+qt-2003Nov0305-01 | [XQuery] SAG-XQ-005 Serializing Arbitrary Sequences | substantive | acknowledged |
+qt-2004Jan0019-04 | [XSLT2.0] Specify normalization form for serialization | substantive | acknowledged |
+qt-2004Jan0029-01 | [Serial] omit-xml-declaration | substantive | acknowledged |
+qt-2004Feb0049-01 | [Serialization] IBM-SE-001: Documentization | substantive | acknowledged |
+qt-2004Feb0053-01 | [Serialization] IBM-SE-004: XML Output Method | substantive | acknowledged |
+qt-2004Feb0055-01 | [Serialization] IBM-SE-006: Schema used in round-tripping | substantive | acknowledged |
+qt-2004Feb0057-01 | [Serialization] IBM-SE-008: Serializing namespace nodes | substantive | acknowledged |
+qt-2004Feb0058-01 | [Serialization] IBM-SE-009: Discarding of type annotations | substantive | acknowledged |
+qt-2004Feb0059-01 | [Serialization] IBM-SE-010: Namespace nodes after round-trip | substantive | acknowledged |
+qt-2004Feb0060-01 | [Serialization] IBM-SE-011: Character expansion | substantive | acknowledged |
+qt-2004Feb0061-01 | [Serialization] IBM-SE-012: Version parameter | substantive | acknowledged |
+qt-2004Feb0062-01 | [Serialization] IBM-SE-013: XML v1.1 vs. Namespaces v1.1 | substantive | acknowledged |
+qt-2004Feb0064-01 | [Serialization] IBM-SE-014: Serializing the "nilled" property | substantive | acknowledged |
+qt-2004Feb0146-01 | [Serial] canonicalization | substantive | announced |
+qt-2004Feb0188-01 | Serialization (sometimes) needs to include type information | substantive | announced |
+qt-2004Feb0261-01 | [Serialization] SCHEMA-A | substantive | acknowledged |
+qt-2004Feb0262-01 | [Serialization] SCHEMA-B | substantive | acknowledged |
+qt-2004Feb0263-01 | [Serialization] SCHEMA-C | substantive | acknowledged |
+qt-2004Feb0264-01 | [Serialization] SCHEMA-D | substantive | announced |
+qt-2004Feb0265-01 | [Serialization] SCHEMA-E | substantive | acknowledged |
+qt-2004Feb0266-01 | [Serialization] SCHEMA-F | substantive | acknowledged |
+qt-2004Feb0267-01 | [Serialization] SCHEMA-G | substantive | raised |
+qt-2004Feb0268-01 | [Serialization] SCHEMA-H | substantive | acknowledged |
+qt-2004Feb0269-01 | [Serialization] SCHEMA-I | substantive | acknowledged |
+qt-2004Feb0271-01 | [Serialization] SCHEMA-K | substantive | acknowledged |
+qt-2004Feb0272-01 | [Serialization] SCHEMA-L | substantive | acknowledged |
+qt-2004Feb0362-01 | [Serial] I18N WG last call comments [4] | substantive | objected |
+qt-2004Feb0362-02 | [Serial] I18N WG last call comments [5] | substantive | objected |
+qt-2004Feb0362-03 | [Serial] I18N WG last call comments [6] | substantive | announced |
+qt-2004Feb0362-04 | [Serial] I18N WG last call comments [7] | substantive | announced |
+qt-2004Feb0362-05 | [Serial] I18N WG last call comments [8] | substantive | announced |
+qt-2004Feb0362-06 | [Serial] I18N WG last call comments [9] | substantive | announced |
+qt-2004Feb0362-07 | [Serial] I18N WG last call comments [first comment 12] | substantive | announced |
+qt-2004Feb0362-08 | [Serial] I18N WG last call comments [11] | substantive | announced |
+qt-2004Feb0362-09 | [Serial] I18N WG last call comments [Second comment 12] | substantive | announced |
+qt-2004Feb0362-10 | [Serial] I18N WG last call comments [13] | substantive | announced |
+qt-2004Feb0362-11 | [Serial] I18N WG last call comments [14] | substantive | announced |
+qt-2004Feb0362-12 | [Serial] I18N WG last call comments [15] | substantive | announced |
+qt-2004Feb0362-13 | [Serial] I18N WG last call comments [16] | substantive | decided |
+qt-2004Feb0362-14 | [Serial] I18N WG last call comments [17] | substantive | acknowledged |
+qt-2004Feb0362-15 | [Serial] I18N WG last call comments [18] | substantive | acknowledged |
+qt-2004Feb0362-16 | [Serial] I18N WG last call comments [19] | substantive | announced |
+qt-2004Feb0362-17 | [Serial] I18N WG last call comments [20] | substantive | announced |
+qt-2004Feb0362-19 | [Serial] I18N WG last call comments [22] | substantive | acknowledged |
+qt-2004Feb0362-20 | [Serial] I18N WG last call comments [23] | substantive | announced |
+qt-2004Feb0362-21 | [Serial] I18N WG last call comments [24] | substantive | announced |
+qt-2004Feb0362-22 | [Serial] I18N WG last call comments [25] | substantive | announced |
+qt-2004Feb0362-24 | [Serial] I18N WG last call comments [31] | substantive | announced |
+qt-2004Feb0362-25 | [Serial] I18N WG last call comments [32] | substantive | announced |
+qt-2004Feb0918-01 | ORA-SE-341-B: serialization of XQuery DataModel instance is inadequate | substantive | acknowledged |
+qt-2004Feb0919-01 | ORA-SE-292-B: Processing of empty sequence is roundabout and confusing | substantive | raised |
+qt-2004Feb0921-01 | ORA-SE-300-B: Implementation-defined output methods need not normalize | substantive | acknowledged |
+qt-2004Feb0922-01 | ORA-SE-302-B: Phase 1, "Markup generation", is poorly specified | substantive | acknowledged |
+qt-2004Feb0923-01 | ORA-SE-304-Q: possible parameter for how to handle elements with no children | substantive | acknowledged |
+qt-2004Feb0924-01 | ORA-SE-308-C: What circumstances are meant by "in all other circumstances"? | substantive | acknowledged |
+qt-2004Feb0926-01 | ORA-SE-312-B: Missing exception for additional whitespace added by indent parameter | substantive | acknowledged |
+qt-2004Feb0927-01 | ORA-SE-315-Q: How can character expansion create new nodes? | substantive | acknowledged |
+qt-2004Feb0928-01 | ORA-SE-326-B: XML declaration is mandatory if the version is not 1.0 | substantive | acknowledged |
+qt-2004Feb0929-01 | ORA-SE-320-B: What does it mean to say two data models (sic) are the same? | substantive | decided |
+qt-2004Feb0930-01 | ORA-SE-301-B: Indent parameter should not apply to (potentially) mixed-mode elements | substantive | acknowledged |
+qt-2004Feb0932-01 | ORA-SE-309-B: Poorly worded constraints on the output | substantive | acknowledged |
+qt-2004Feb0936-01 | ORA-SE-317-B: document-uri property cannot be serialized | substantive | decided |
+qt-2004Feb0976-01 | [Serial] IBM-SE-100: Default parameter values should account for specifics for particular output methods | substantive | acknowledged |
+qt-2004Feb0977-01 | [Serial] IBM-SE-101: Default HTML version | substantive | acknowledged |
+qt-2004Feb0980-01 | [Serial] IBM-SE-103: Treatment of whitespace in XHTML attributes | substantive | acknowledged |
+qt-2004Feb0996-01 | FW: XSLT 2.0: XML Output Method: the omit-xml-declaration Parameter | substantive | announced |
+qt-2004Feb1040-01 | ORA-SE-305-E: Phase 2 should mention generation of character references | substantive | acknowledged |
+qt-2004Feb1042-01 | ORA-SE-298-E: Please clarify that all parameters are optional | substantive | acknowledged |
+qt-2004Feb1195-01 | [Serialization] MS-SER-LC1-001 | substantive | acknowledged |
+qt-2004Feb1197-01 | [Serialization] MS-SER-LC1-002 | substantive | acknowledged |
+qt-2004Feb1198-01 | [Serialization] MS-SER-LC1-005 | substantive | acknowledged |
+qt-2004Feb1204-01 | [Serialization] MS-SER-LC1-009 | substantive | acknowledged |
+qt-2004Feb1205-01 | [Serialization] MS-SER-LC1-012 | substantive | acknowledged |
+qt-2004May0006-01 | [Serial] additional last call comment about xml:lang | substantive | announced |
+qt-2004Sep0022-01 | [Serial] XHTML indentation | substantive | acknowledged |
+qt-2004Nov0025-01 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0025-02 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0025-03 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0025-04 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0025-07 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0025-09 | [Serial] XHTML Serialization | substantive | raised |
+qt-2004Nov0074-01 | [Serial] > in processing instructions | substantive | raised |
+qt-2004Nov0075-01 | [Serial] 2 Sequence Normalization | substantive | raised |
+qt-2004Feb0050-01 | [Serialization] IBM-SE-002: Bugs in example | editorial | announced |
+qt-2004Feb0052-01 | [Serialization] IBM-SE-003: Undeclare-namespaces parameter | editorial | announced |
+qt-2004Feb0054-01 | [Serialization] IBM-SE-005: Definition of serialized output | editorial | announced |
+qt-2004Feb0056-01 | [Serialization] IBM-SE-007: Definition of round-tripping | editorial | announced |
+qt-2004Feb0270-01 | [Serialization] SCHEMA-J | editorial | announced |
+qt-2004Feb0273-01 | [Serialization] SCHEMA-M | editorial | announced |
+qt-2004Feb0274-01 | [Serialization] SCHEMA-N | editorial | raised |
+qt-2004Feb0275-01 | [Serialization] SCHEMA-O | editorial | announced |
+qt-2004Feb0276-01 | [Serialization] SCHEMA-P | editorial | announced |
+qt-2004Feb0278-01 | [Serialization] SCHEMA-Q | editorial | announced |
+qt-2004Feb0362-18 | [Serial] I18N WG last call comments [21] | editorial | announced |
+qt-2004Feb0362-23 | [Serial] I18N WG last call comments [26-30,33-34] | editorial | announced |
+qt-2004Feb0920-01 | ORA-SE-327-B: Surely namespace declaration is part of serializing XML version 1.0 | editorial | acknowledged |
+qt-2004Feb0931-01 | ORA-SE-306-C: Confusing definition of the "version" parameter | editorial | announced |
+qt-2004Feb0933-01 | ORA-SE-310-E: difficult sentence to parse | editorial | acknowledged |
+qt-2004Feb0934-01 | ORA-SE-303-B: undeclare-namespaces parameter is relevant to markup generation | editorial | acknowledged |
+qt-2004Feb0935-01 | ORA-SE-311-C: What is the "processor"? | editorial | announced |
+qt-2004Feb0937-01 | ORA-SE-314-B: Additional namespace nodes may be present if serialization does not undeclare namespaces | editorial | announced |
+qt-2004Feb0978-01 | [Serial] IBM-SE-102: Serialization editorial comments | editorial | announced |
+qt-2004Feb1037-01 | ORA-SE-293-E: Redundant phrase that can be deleted | editorial | acknowledged |
+qt-2004Feb1038-01 | ORA-SE-307-E: "An xml output method" is better than "the xml output method" | editorial | announced |
+qt-2004Feb1039-01 | ORA-SE-328-E: no mention of the standalone property | editorial | acknowledged |
+qt-2004Feb1041-01 | ORA-SE-296-E: Please define "serialization error" | editorial | announced |
+qt-2004Feb1043-01 | ORA-SE-299-E: misplaced comma | editorial | acknowledged |
+qt-2004Feb1044-01 | ORA-SE-297-E: Alphabetization problem | editorial | acknowledged |
+qt-2004Feb1045-01 | ORA-SE-295-E: The Note overflow the right margin when printed | editorial | acknowledged |
+qt-2004Feb1046-01 | ORA-SE-291-E: Term "empty string" is a poor choice of words | editorial | raised |
+qt-2004Feb1047-01 | ORA-SE-290-E: Title misuses the term "data models" | editorial | acknowledged |
+qt-2004Feb1196-01 | [Serialization] MS-SER-LC1-003 | editorial | acknowledged |
+qt-2004Feb1199-01 | [Serialization] MS-SER-LC1-004 | editorial | acknowledged |
+qt-2004Feb1200-01 | [Serialization] MS-SER-LC1-007 | editorial | announced |
+qt-2004Feb1201-01 | [Serialization] MS-SER-LC1-008 | editorial | announced |
+qt-2004Feb1202-01 | [Serialization] MS-SER-LC1-006 | editorial | raised |
+qt-2004Feb1203-01 | [Serialization] MS-SER-LC1-010 | editorial | acknowledged |
+qt-2004Feb1206-01 | [Serialization] MS-SER-LC1-011 | editorial | announced |
+qt-2004Nov0025-05 | [Serial] XHTML Serialization | editorial | raised |
+qt-2004Nov0025-06 | [Serial] XHTML Serialization | editorial | raised |
+qt-2004Nov0025-08 | [Serial] XHTML Serialization | editorial | raised |
+qt-2004Nov0037-01 | [Serial] Normalization and References | editorial | raised |
+qt-2004Nov0037-02 | [Serial] Normalization and References | editorial | raised |
+qt-2004Dec0001-01 | [Serial] serialization of xhtml + omit-xml-declaration | editorial | raised |
According to http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318 The omit-xml-declaration parameter should be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16. There is one other case where it would be very useful to omit the declaration (or at least to use a value of utf-8) namely iso-646 (aka ASCII aka US-ASCII). It may be politically incorrect to say that ascii characters are still more interoperable than non-ascii characters, but in practice this is still the case. Especially in XML which specifies that a charset specified in the mime headers takes precedence it is hard to give (say) a utf8 file to someone to serve from their website without first finding out what http server they use, and how to make sure it won't serve the thing as latin 1 resulting in a non-well formed document. (See current discussion on W3C'S TAG list about this). One style of producing XML files that avoids these problems is to produce files that don't have an xml declaration (or have one that specifies utf-8) but to encode all non-ascii characters as numeric character references. Currently in an XSLT 1 usage in production I use <xsl:output encoding="US-ASCII"/> with saxon and post process with sed to remove the US-ASCII encoding declaration (which stops the file being parsed on several XML systems I have locally) I think that it would be very desirable if <xsl:output encoding="iso-646" omit-xml-declaration="yes"/> was defined to work, and produce files of the form described above. Failing that it would be good if it would be allowed by the specification if the system understood that encoding. David ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. 
For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________
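The output style described in the comment above (ASCII-only bytes, with every non-ASCII character written as a numeric character reference) can be illustrated with a minimal Python sketch. This is only an illustration, not how Saxon or any XSLT processor implements serialization; the helper name `to_ascii_xml` is hypothetical, and the sketch assumes non-ASCII characters occur only in text and attribute values.

```python
def to_ascii_xml(xml_text: str) -> bytes:
    """Encode XML text as pure ASCII bytes (hypothetical helper).

    Every character outside ASCII is replaced by a decimal numeric
    character reference, so the bytes decode identically as US-ASCII,
    UTF-8 or ISO-8859-1.
    """
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return xml_text.encode("ascii", errors="xmlcharrefreplace")


sample = "<p>caf\u00e9</p>"   # U+00E9 is LATIN SMALL LETTER E WITH ACUTE
print(to_ascii_xml(sample))   # b'<p>caf&#233;</p>'
```

Because the result contains only ASCII bytes, it parses the same way whether or not an XML declaration is present. Note that character references are not permitted in element names, comments or processing instructions, so a real serializer would have to treat those cases differently.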
Thank you for raising this issue. The XSL and XQuery working groups discussed the issue and decided not to require processors to support the US-ASCII encoding and its aliases. The working groups decided that the appropriate way of addressing your comment would be to replace the second paragraph of Section 4.5 of the last call working draft of XSLT 2.0 and XQuery 1.0 Serialization [1], which currently reads: << The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16. >> with the following: << The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value that is not UTF-8, UTF-16 or a subset of either of those encodings. An encoding S is a subset of another encoding E if the set of codepoints that can be encoded in S is a subset of those that can be encoded in E, and the encodings of those codepoints in S are the same as the encodings of those same codepoints in encoding E. >> That resolution seems to be in accord with the last sentence of your comment. Please let us know whether you consider this resolution acceptable.
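The "subset" relation between encodings proposed in the response above can be checked mechanically. The following Python sketch is an assumption-laden illustration, not part of the specification: the function name `is_subset_encoding` is hypothetical, it compares one code point at a time (so it does not model stateful encodings), and it relies on Python's codec names rather than IANA charset names.

```python
def is_subset_encoding(s: str, e: str, limit: int = 0x110000) -> bool:
    """Return True if encoding `s` is a subset of encoding `e`.

    Every code point encodable in `s` must also be encodable in `e`,
    and must produce exactly the same bytes in both encodings.
    """
    for cp in range(limit):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not encodable characters
        ch = chr(cp)
        try:
            in_s = ch.encode(s)
        except UnicodeEncodeError:
            continue  # not in S's repertoire, so no constraint on E
        try:
            in_e = ch.encode(e)
        except UnicodeEncodeError:
            return False  # encodable in S but not in E
        if in_s != in_e:
            return False  # same code point, different bytes
    return True


print(is_subset_encoding("ascii", "utf-8"))    # True
print(is_subset_encoding("latin-1", "utf-8"))  # False
```

Under this check US-ASCII qualifies as a subset of UTF-8, while ISO-8859-1 does not, because U+0080 and above are encoded as one byte in ISO-8859-1 but as two bytes in UTF-8.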
Thanks, looks good to me. David
David, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << According to http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318 The omit-xml-declaration parameter should be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16. There is one other case where it would be very useful to omit the declaration (or at least to use a value of utf-8) namely iso-646 (aka ASCII aka US-ASCII). It may be politically incorrect to say that ascii characters are still more interoperable than non-ascii characters, but in practice this is still the case. Especially in XML which specifies that a charset specified in the mime headers takes precedence it is hard to give (say) a utf8 file to someone to serve from their website without first finding out what http server they use, and how to make sure it won't serve the thing as latin 1 resulting in a non-well formed document. (See current discussion on W3C'S TAG list about this). One style of producing XML files that avoids these problems is to produce files that don't have an xml declaration (or have one that specifies utf-8) but to encode all non-ascii characters as numeric character references. Currently in an XSLT 1 usage in production I use <xsl:output encoding="US-ASCII"/> with saxon and post process with sed to remove the US-ASCII encoding declaration (which stops the file being parsed on several XML systems I have locally) I think that it would be very desirable if <xsl:output encoding="iso-646" omit-xml-declaration="yes"/> was defined to work, and produce files of the form described above. Failing that it would be good if it would be allowed by the specification if the system understood that encoding. 
>> The XSL and XML Query Working Groups discussed your comment, and initially responded in [2], indicating that Serialization would respect the setting of the omit-xml-declaration whenever the encoding was UTF-8, UTF-16 or some "subset" encoding of those two encodings. However, subsequent to making that decision, the working groups decided that the setting of the omit-xml-declaration parameter should be respected always, regardless of the setting of the encoding parameter. The 23 July working draft of Serialization [3] reflects that decision. Thank you once again for your comment. May I ask you to confirm that the revised response is acceptable to you? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1235.html [3] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/ ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
> Thank you once again for your comment. May I ask you to confirm that > the revised response is acceptable to you? Yes this is perfectly acceptable, thank you. David
SAG-XQ-005 Serializing Arbitrary Sequences We don't think that the mechanism for serializing an arbitrary sequence described in section 2 of the Serialization specification meets any known user requirement. We do think that there is a requirement for an interoperable serialization format for an arbitrary sequence, and that this should be defined. We think that the requirement is for a format that wraps each item in the sequence in an XML wrapper providing information about the type, the value, and in the case of nodes, the name of the node. For example, an attribute node might be serialized as <res:attribute name="my:att" type="my:shoesize" value="7.3"/> In the case of elements and documents, the tree rooted at that node would be serialized. The format would be extensible to allow implementation-defined attributes that represent the identity of nodes, allowing the information to be used for a subsequent update, or for creating hyperlinks. (Note, technically we are talking here about a representation of an arbitrary sequence in the form of a document. Serializing that document is entirely orthogonal). Michael Kay for Software AG
Walter, In [1] Michael Kay submitted the following comment from Software AG. Michael Kay wrote on 2003-11-27 06:56:34 AM: > SAG-XQ-005 Serializing Arbitrary Sequences > We don't think that the mechanism for serializing an arbitrary > sequence described in section 2 of the Serialization specification > meets any known user requirement. > We do think that there is a requirement for an interoperable > serialization format for an arbitrary sequence, and that this should > be defined. We think that the requirement is for a format that wraps > each item in the sequence in an XML wrapper providing information > about the type, the value, and in the case of nodes, the name of the > node. For example, an attribute node might be serialized as > <res:attribute name="my:att" type="my:shoesize" value="7.3"/> > In the case of elements and documents, the tree rooted at that node > would be serialized. The format would be extensible to allow > implementation-defined attributes that represent the identity of > nodes, allowing the information to be used for a subsequent update, > or for creating hyperlinks. > (Note, technically we are talking here about a representation of an > arbitrary sequence in the form of a document. Serializing that > document is entirely orthogonal). Thanks to Michael and Software AG for raising the comment. The XSL and XQuery working groups considered this comment and related comments. There was general agreement that there is some need for a mechanism for serializing arbitrary sequences that preserves most or all of the properties of the items in an arbitrary sequence that is being serialized. However, the working groups decided that precisely defining all of the requirements for such a mechanism at this stage would be difficult, and would likely lead to a solution that would not satisfy real user requirements. 
Therefore, the working groups decided to consider such a feature for a future revision of the recommendations, and close this comment without any changes to the specifications. May I ask you to confirm that this resolution is acceptable? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0305.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, although I do not understand all the arguments, I can live with that decision; especially as we have a high interest in a finished XQuery recommendation as early as possible. Best regards Walter
SUGGESTION 4: 20 Serialization normalize-unicode? attribute of the xsl:output element Problem: Not clear which normalization forms to use. "NFC"? Why we should have "NFC" only, if we can support others as in fn:normalize-unicode? Solution: normalize-unicode? = string The attribute should follow the rules of the second argument($normalizationForm ) of the fn:normalize-unicode (http://www.w3.org/TR/xquery-operators/#func-normalize-unicode) Igor Hersht XSLT Development IBM Canada Ltd., 8200 Warden Avenue, Markham, Ontario L6G 1C7 Office D2-260, Phone (905)413-3240 ; FAX (905)413-4839
Igor, In [1], you submitted the following comment on the XSLT 2.0 and Serialization last call drafts. Igor Hersht wrote on 2004-01-11 05:01:13 PM: > SUGGESTION 4: > 20 Serialization normalize-unicode? attribute of the xsl:output element > Problem: > Not clear which normalization forms to use. "NFC"? > Why we should have "NFC" only, if we can support others as in > fn:normalize-unicode? > > Solution: > normalize-unicode? = string > The attribute should follow the rules of the second > argument($normalizationForm ) > of the fn:normalize-unicode > (http://www.w3.org/TR/xquery-operators/#func-normalize-unicode) Thank you for your comment. The XSL and XQuery Working Groups discussed your comment, and agreed that the serialization parameter for Unicode normalization should be aligned with the fn:normalize-unicode function and permit additional normalization forms to be specified. The working groups decided to make the following changes to the definition of the normalize-unicode serialization parameter:
1. Rename the parameter to "normalization-form".
2. The possible values of the parameter will be "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" or an implementation-defined normalization form. The default value is "none". We will also add a note advising of the interoperability problems that can arise by using anything other than NFC.
3. All of "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" and any implementation-defined value are permitted for the xml, xhtml and text output methods. The values "NFC", "fully-normalized" and "none" must be supported by an implementation for these output methods.
4. The normalization-form parameter is permitted to have the values "NFC", "NFD", "NFKC", "NFKD", "none" or an implementation-defined value if the output method is "html". The values "NFC" and "none" must be supported for the html output method. The value "fully-normalized" is not permitted if the output method is "html".
5. In the case of "fully-normalized", the normalization is the same as for NFC, but the processor must signal a serialization error if any of the "relevant constructs" of the result would begin with a combining character.
The XSL Working Group will also make the corresponding changes to the xsl:output element in XSLT 2.0, replacing the normalize-unicode attribute with a normalization-form attribute. May I ask you to confirm that this response is acceptable? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0019.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
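The behaviour of the normalization-form values listed in the response above can be sketched in Python with the standard `unicodedata` module. This is a rough illustration under stated assumptions: the function name `apply_normalization_form` is hypothetical, the whole string is treated as a single "relevant construct", a non-zero canonical combining class is used as an approximation of "combining character", and implementation-defined forms are simply rejected.

```python
import unicodedata

# Forms that map directly onto unicodedata.normalize().
UNICODE_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def apply_normalization_form(text: str, form: str = "none") -> str:
    """Normalize `text` according to a normalization-form value (sketch)."""
    if form == "none":
        return text  # the default: leave the text untouched
    if form in UNICODE_FORMS:
        return unicodedata.normalize(form, text)
    if form == "fully-normalized":
        # Same as NFC, plus an error if the construct would begin
        # with a combining character.
        normalized = unicodedata.normalize("NFC", text)
        if normalized and unicodedata.combining(normalized[0]) != 0:
            raise ValueError("serialization error: text begins with a combining character")
        return normalized
    raise ValueError(f"unsupported normalization form: {form!r}")


decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(apply_normalization_form(decomposed, "NFC") == "\u00e9")  # True
```

A real serializer would apply the "fully-normalized" check to each relevant construct separately rather than to the output as a whole.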
Hello, In [1], I asked Igor Hersht to confirm that he found the response to the issue labelled "SUGGESTION 4" in [2] acceptable. I'm responding on Igor's behalf to indicate that the response is acceptable to IBM. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Mar/0276.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0019.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Jonathan said in another thread: > If you do want to enter this as a last call comment, could you please start > a new thread that clearly says that? I made the following comment on the previous draft serialization document: http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html Please take that as a last call comment on this draft (as the comment has not been answered, and the situation is the same in this draft). David
David, In [1], you submitted the following comment against the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Jonathan said in another thread: > If you do want to enter this as a last call comment, could you please start > a new thread that clearly says that? I made the following comment on the previous draft serialization document: http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html Please take that as a last call comment on this draft (as the comment has not been answered, and the situation is the same in this draft). >> The XSL and XQuery Working Groups actually logged your comment twice, in both [1] and [2]. We would like to close the issue raised in [1] as a duplicate of the issue raised in [2]. I trust this will be acceptable. Our apologies for any confusion. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0029.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
> We would like to close the issue raised in [1] as a > duplicate of the issue raised in [2]. I trust this will be acceptable. Yes, That's fine:-) David
Serialization Section 2, "Serializing Arbitrary Data Models": This comment recapitulates a discussion from a recent (21 January) meeting of the Query and XSLT working groups. It is suggested that the serialization document should define two separate and independent processes, possibly called "documentization" and "serialization". The documentization process should be defined to convert any Query Data Model (QDM) instance (which in general may contain zero, one, or many documents, or documents mixed with non-document fragments) into a QDM instance that contains exactly one document. This can be done by replacing each top-level item in the QDM instance by a descriptive "wrapper" element that labels it with its kind: attribute, atomic value, element, document, etc. A new synthetic document element is then inserted as parent of all the wrapper elements. This documentization process (unlike the one currently described in Section 2) should apply successfully to any QDM instance whatsoever. Thus (for example) if the QDM instance contains multiple documents, the boundaries between these documents is preserved. If documentization is invoked on a QDM instance that already contains a single document, that document is nevertheless wrapped in a descriptive element which is placed under a new synthetic parent document node (it is treated simply as a sequence of documents of length one). The serialization process then needs to be defined only for QDM instances that contain exactly one document. A serialization parameter can be defined to control whether documentization is applied before serialization (possibly documentization could be defined to occur by default if the first item in the sequence to be serialized is not a node). --Don Chamberlin
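The "documentization" step proposed in the comment above (wrap each top-level item in an element that labels its kind, then place all wrappers under one synthetic parent) can be sketched in Python. Everything concrete here is an assumption made for illustration: the comment does not define wrapper element names, so `sequence`, `atomic-value`, `attribute`, `element` and `document` are hypothetical, as is the representation of sequence items as `(kind, payload)` tuples.

```python
import xml.etree.ElementTree as ET


def documentize(items):
    """Wrap a sequence of (kind, payload) items in a single tree (sketch).

    Hypothetical item shapes:
      ("atomic-value", (type_name, value))
      ("attribute",    (name, value))
      ("element",      ET.Element)
      ("document",     ET.Element)   # root element of the document
    """
    root = ET.Element("sequence")  # synthetic parent of all wrappers
    for kind, payload in items:
        wrapper = ET.SubElement(root, kind)  # wrapper labels the item kind
        if kind in ("element", "document"):
            wrapper.append(payload)  # serialize the whole subtree
        elif kind == "attribute":
            name, value = payload
            wrapper.set("name", name)
            wrapper.set("value", value)
        elif kind == "atomic-value":
            type_name, value = payload
            wrapper.set("type", type_name)
            wrapper.text = str(value)
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
    return root


items = [
    ("atomic-value", ("xs:integer", 42)),
    ("attribute", ("size", "7.3")),
    ("element", ET.fromstring("<a>x</a>")),
]
print(ET.tostring(documentize(items), encoding="unicode"))
```

Because every item is wrapped, including a lone document, the result always has exactly one root, so an ordinary XML serializer can then be applied to it without special cases.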
Don, In [1] you submitted the following comment on the serialization draft: Don Chamberlin wrote on 2004-02-02 06:37:20 PM: > Serialization Section 2, "Serializing Arbitrary Data Models": This > comment recapitulates a discussion from a recent (21 January) > meeting of the Query and XSLT working groups. It is suggested that > the serialization document should define two separate and > independent processes, possibly called "documentization" and "serialization". > > The documentization process should be defined to convert any Query > Data Model (QDM) instance (which in general may contain zero, one, > or many documents, or documents mixed with non-document fragments) > into a QDM instance that contains exactly one document. This can be > done by replacing each top-level item in the QDM instance by a > descriptive "wrapper" element that labels it with its kind: > attribute, atomic value, element, document, etc. A new synthetic > document element is then inserted as parent of all the wrapper > elements. This documentization process (unlike the one currently > described in Section 2) should apply successfully to any QDM > instance whatsoever. Thus (for example) if the QDM instance contains > multiple documents, the boundaries between these documents is > preserved. If documentization is invoked on a QDM instance that > already contains a single document, that document is nevertheless > wrapped in a descriptive element which is placed under a new > synthetic parent document node (it is treated simply as a sequence > of documents of length one). > > The serialization process then needs to be defined only for QDM > instances that contain exactly one document. A serialization > parameter can be defined to control whether documentization is > applied before serialization (possibly documentization could be > defined to occur by default if the first item in the sequence to be > serialized is not a node). Thank you for submitting this comment. 
The XSL and XQuery working groups considered your comment and related comments. There was general agreement that there is some need for a mechanism for serializing arbitrary sequences that preserves most or all of the properties of the items in an arbitrary sequence that is being serialized. However, the working groups decided that precisely defining all of the requirements for such a mechanism at this stage would be difficult, and would likely lead to a solution that would not satisfy real user requirements. Therefore, the working groups decided to consider such a feature for a future revision of the recommendations, and close this comment without any changes to the specifications. May I ask you to confirm that this resolution is acceptable? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0049.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Henry, Thanks for your response. I understand why the working group has chosen not to provide this feature in XQuery Version 1, and I do not intend to pursue this issue further at this time. Regards, --Don Chamberlin
Serialization Section 4, "XML Output Method": The first paragraph states that serialization produces either an XML document entity or an external general parsed entity. No indication is given about how the serialization process chooses between these alternatives. The normalization rules in Section 2 always reduce the data model instance to exactly one document, so it is not clear how the second alternative is ever invoked. Also in this section, the second paragraph adds nothing that is not already said in the first paragraph. It should be deleted. --Don Chamberlin
The serialization process doesn't choose between these two alternatives (indeed, the set of well-formed XML document entities and the set of well-formed EGPEs have a large overlap). Rather, this sentence is stating a constraint. If the document node contains multiple elements or text nodes among its children then the result cannot be a well-formed document entity, therefore it must be a well-formed EGPE. If a standalone attribute is requested in the serialization parameters, then the result cannot be a well-formed EGPE, therefore it must be a well-formed document entity. If both conditions are true, there is a conflict, and I think there are rules later on about how this conflict should be resolved. Michael Kay
Hi, Don. In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: Don Chamberlin wrote on 2004-02-02 07:14:53 PM: > Serialization Section 4, "XML Output Method": The first paragraph > states that serialization produces either an XML document entity or > an external general parsed entity. No indication is given about how > the serialization process chooses between these alternatives. The > normalization rules in Section 2 always reduce the data model > instance to exactly one document, so it is not clear how the second > alternative is ever invoked. > > Also in this section, the second paragraph adds nothing that is not > already said in the first paragraph. It should be deleted. Thank you for your comment. The XSL and XML Query Working Groups discussed your comment. It was noted that, although there is only ever one document node to process, the document node could have no element node children, more than one element node child or text node children. If any of those conditions holds, the serialization process produces an external general parsed entity; otherwise, it produces a document entity (which might also meet the syntactic criteria of an external general parsed entity). In order to clarify the first and third paragraphs of section 4, the working groups decided to make the following changes: - in the first sentence of the third paragraph, change "and the" to "then", to make it clear the conditions under which a document entity will be the result of the serialization process. - change the wording to make it clear that these rules describe requirements on the processor, rather than on the user. The processor will be required to produce a serialization error if it is unable to produce a well-formed entity of the appropriate kind, unless that is because of the action of the character expansion phase of serialization. 
The working groups further agreed that the second paragraph of section 4 adds no useful information, and decided to delete it. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0053.html
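The rule stated in the response above (a document entity only when the document node has exactly one element child and no text node children) can be sketched in Python. This is a minimal illustration, not text from the specification: the function name `entity_kind` and the encoding of node kinds as strings are assumptions made for the example.

```python
def entity_kind(child_kinds):
    """Classify the serialized result from the kinds of the document node's children.

    child_kinds is a list of strings such as "element", "text", "comment" or
    "processing-instruction" (a hypothetical encoding used only in this sketch).
    """
    element_count = child_kinds.count("element")
    has_text = "text" in child_kinds
    # No element child, more than one element child, or any text node child:
    # the result can only be an external general parsed entity.
    if element_count != 1 or has_text:
        return "external general parsed entity"
    # Exactly one element child and no text node children.
    return "document entity"


print(entity_kind(["comment", "element"]))   # document entity
print(entity_kind(["element", "element"]))   # external general parsed entity
print(entity_kind(["text", "element"]))      # external general parsed entity
print(entity_kind([]))                       # external general parsed entity
```

The sketch deliberately ignores the standalone and doctype-system parameters, which the thread mentions as a possible source of conflict with this classification.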
Serialization Section 4, "XML Output Method": The paragraph before the bullet list says that "if a new tree were constructed by parsing the [serialized] XML document and converting it into a data model as described in [Data Model], then the new data model would be the same as the starting data model." But this conversion process involves validation, which is schema-dependent. We need to specify that the schema used in this round-trip process is an effective schema consisting of the in-scope schema definitions in the static context. --Don Chamberlin
After further discussion, it appears that it is not sufficient to use the in-scope-schema-definitions (ISSD) during round-tripping (serialization and re-parsing) of a data model instance. Round-tripping is used in validation, which in turn is used in every element constructor. It is necessary for round-tripping to preserve the type annotation of an element node, which may not be known in the ISSD. I think the schema used during round-tripping needs to be the union of the ISSD and the schema(s) from which the type annotations of the nodes were originally derived (called the "data model schema" in Section 2.2.5, Consistency Constraints). --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Serialization Section 4, "XML Output Method": The paragraph before the bullet list says that "if a new tree were constructed by parsing the [serialized] XML document and converting it into a data model as described in [Data Model], then the new data model would be the same as the starting data model." But this conversion process involves validation, which is schema-dependent. We need to specify that the schema used in this round-trip process is an effective schema consisting of the in-scope schema definitions in the static context. >> Thank you for your comment. The XSL and XML Query Working Groups discussed your comment and decided that the "round-tripping" description in Serialization was not intended to be part of the definition of validation, but only to define the requirements on the form of a serialized instance of the data model. The issue will be closed without any change to Serialization. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0055.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Serialization Section 4, "XML Output Method": This section should specify the rules for serializing namespace nodes in the form of namespace declaration attributes. Does every namespace node attached to an element node result in an xmlns-attribute in that element's start-tag? Can the xmlns-attribute be omitted if it is present in the start-tag of a parent element? --Don Chamberlin
I don't think it's necessary to specify these rules in detail. Any output that satisfies the round-tripping constraints is acceptable. The philosophy of the serialization spec is to state the basic constraints that the output must satisfy, and beyond that, to be non-prescriptive. Michael Kay
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: > Serialization Section 4, "XML Output Method": This section should > specify the rules for serializing namespace nodes in the form of > namespace declaration attributes. Does every namespace node attached > to an element node result in an xmlns-attribute in that element's > start-tag? Can the xmlns-attribute be omitted if it is present in > the start-tag of a parent element? Thank you for your comment. The XSL and XML Query Working Groups discussed your comment, and concluded that any output that satisfies the round-tripping requirement for the XML output method can be used in serializing namespace nodes. The responses to your specific questions are "No, the serialized start-tag for an element node does not have to have an xmlns-attribute for every namespace node," and "Yes, if an element node has a namespace node, the xmlns-attribute can be omitted if the start-tag for an ancestor element declares the namespace, subject to the usual constraints imposed by namespace undeclaration or changes in binding." No change to the serialization specification is required. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0057.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
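The second answer above can be illustrated with a short Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser as a stand-in for building a data model instance; that parser exposes expanded element names rather than namespace nodes, so the comparison is only an approximation of the round-tripping requirement. The element names and the `urn:example` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

# Serialization that repeats the declaration on the child element.
verbose = '<p:a xmlns:p="urn:example"><p:b xmlns:p="urn:example"/></p:a>'
# Serialization that relies on the declaration in the ancestor's start-tag.
compact = '<p:a xmlns:p="urn:example"><p:b/></p:a>'


def expanded_names(xml_text):
    """Return the expanded ({uri}local) names of all elements in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# Both serializations yield the same expanded names when re-parsed.
print(expanded_names(verbose) == expanded_names(compact))
```

If an intermediate element rebound the prefix `p` to a different namespace, the declaration on the descendant could no longer be omitted, which is the constraint the response refers to.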
Serialization Section 4, "XML Output Method": Bullet 5 says that type annotations are discarded during serialization. The note just below this bullet says that type annotations are optionally preserved during serialization. Which is true? If type annotations are optionally preserved, how is this option controlled? Is it implementation-defined, or controlled by a serialization parameter? It would be very helpful to have a note or example to illustrate how (and when) type annotations are serialized in the form of xsi:type attributes. Also, the note below Bullet 5 is very awkwardly phrased. If retained, this note should be condensed as follows: "In order to preserve type annotations, the serialization process could use mechanisms such as xsi:type and xsi:schemaLocation attributes." --Don Chamberlin
I agree the wording of the note could be improved. The intent is to say that type annotations are not retained through serialization, and if this causes a problem, the user can include xsi:type or xsi:schemaLocation attributes in the result tree, which will cause type annotations to be reconstituted when the serialized document is re-parsed. The serializer will never add these attributes itself. (Well, there's no ban on an implementor adding options to do this of course, but it's not part of serialization as specified). Michael Kay
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: XML Query wrote on 2004-02-02 07:15:26 PM:> > Serialization Section 4, "XML Output Method": Bullet 5 says that > type annotations are discarded during serialization. The note just > below this bullet says that type annotations are optionally > preserved during serialization. Which is true? If type annotations > are optionally preserved, how is this option controlled? Is it > implementation-defined, or controlled by a serialization parameter? > It would be very helpful to have a note or example to illustrate how > (and when) type annotations are serialized in the form of xsi:type attributes. > > Also, the note below Bullet 5 is very awkwardly phrased. If > retained, this note should be condensed as follows: "In order to > preserve type annotations, the serialization process could use > mechanisms such as xsi:type and xsi:schemaLocation attributes." Thank you for your comment. The XSL and XQuery Working Groups discussed your comment and decided that the note was intended to indicate that if the user would like type annotations to be preserved, the user should ensure the data model that the sequence that is input to the serialization process uses xsi:type and xsi:schemaLocation attributes. The note wasn't intended to grant a license to the serialization process to manufacture such attributes. In order to clarify this, the working groups decided to replace the note in the bullet 5 of section 4 with the following: << Note: In order to influence the type annotations in the data model that would result from processing a serialized XML document, the author of the XSLT stylesheet, XQuery expression or other process may wish to create the data model that is input to the serialization process so that it makes use of mechanisms provided by [XML Schema], such as xsi:type and xsi:schemaLocation attributes. 
The serialization process will not automatically create such attributes in the serialized document if those attributes were not part of the result tree that is to be serialized. >> As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0058.html
Serialization Section 4, "XML Output Method": The statement in Bullet 6 is backward. Additional namespace nodes may be generated if the serialization process FAILS to undeclare namespaces. In addition, the namespace nodes may be different after round-tripping because the process of constructing an element node from an infoset may ignore namespaces that are not used in element or attribute names (see Data Model Section 6.2.4, "Element Node Construction from Infoset"). --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Serialization Section 4, "XML Output Method": The statement in Bullet 6 is backward. Additional namespace nodes may be generated if the serialization process FAILS to undeclare namespaces. In addition, the namespace nodes may be different after round-tripping because the process of constructing an element node from an infoset may ignore namespaces that are not used in element or attribute names (see Data Model Section 6.2.4, Element Node Construction from Infoset".) >> Thank you for your comment. The XSL and XML Query Working Groups discussed your comment, and decided to make the corrections that you had recommended, with a small refinement: that namespace nodes must not be ignored if the round-tripped data model instance is constructed from PSVI if the namespace prefix was used in a value of type xs:QName. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0059.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Serialization Section 4, "XML Output Method": Bullet 7 says that "Additional nodes may be present in the new tree" due to character expansion. Please explain how character expansion could result in new nodes and provide an example. Thanks, --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > Serialization Section 4, "XML Output Method": Bullet 7 says that > "Additional nodes may be present in the new tree" due to character > expansion. Please explain how character expansion could result in > new nodes and provide an example. Thank you for your comment. The XSL and XQuery working groups discussed your comment, and decided to add a note to clarify the situation. I would like to add the following note to the final bullet of the bulleted list in section 4. << Note: The use-character-maps parameter can cause arbitrary characters to be inserted into the serialized XML document in an unescaped form, including characters that would be considered part of XML markup. Such characters could result in arbitrary new element nodes, attribute nodes, and so on, in the new tree that results from processing the serialized XML document. >> As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0060.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
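The effect described in the proposed note can be demonstrated with a small Python sketch. The function `serialize_text` and its `character_map` argument are hypothetical stand-ins for the character expansion phase controlled by use-character-maps; they are not an API of any real serializer. Re-parsing uses the standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_text(text, character_map=None):
    """Serialize a text node: mapped characters are emitted unescaped,
    all other characters go through normal XML escaping."""
    character_map = character_map or {}
    pieces = []
    for ch in text:
        if ch in character_map:
            pieces.append(character_map[ch])  # inserted as-is, may be markup
        else:
            pieces.append(escape(ch))
    return "".join(pieces)


text_node = "a\u00abb\u00bbc"                # the single text node "a«b»c"
mapping = {"\u00ab": "<", "\u00bb": "/>"}     # hypothetical character map

plain = "<doc>" + serialize_text(text_node) + "</doc>"
mapped = "<doc>" + serialize_text(text_node, mapping) + "</doc>"

print(len(ET.fromstring(plain)))    # 0: no element children after re-parsing
print(len(ET.fromstring(mapped)))   # 1: a new element <b/> appears
```

The original tree contained only a text node under `doc`; after character mapping, the re-parsed tree contains an additional element node, which is the situation Bullet 7 allows for.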
Serialization Section 4.1, "XML Output Method: the version Parameter": This section contains the words "If the processor does not support this version of XML ...". This seems to imply that support for XML versions is an optional feature. We should clearly specify the requirements in this area. Possibly support for XML 1.0 is required and support for XML 1.1 is an optional feature that should be included on our optional feature list? --Don Chamberlin
I think the view of the XSL working group was that we should leave it to the implementor to decide which versions of XML to support. It would be commercial suicide for a vendor not to support XML 1.0 in a 2004 product, but by 2010 the situation may look different, and we want our spec to be durable. Michael Kay
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Serialization Section 4.1, "XML Output Method: the version Parameter": This section contains the words "If the processor does not support this version of XML ...". This seems to imply that support for XML versions is an optional feature. We should clearly specify the requirements in this area. Possibly support for XML 1.0 is required and support for XML 1.1 is an optional feature that should be included on our optional feature list? >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided that the Serialization specification should be flexible in this regard and not place any requirements on the versions of XML or HTML that must be supported, although a particular host language might impose such requirements. The Serialization draft will be modified to state, for each of the xml and html output methods, that it is a serialization error if the value of the version parameter specifies a version of the XML or the HTML Recommendation that is not supported by the processor. The Serialization draft will not place any requirements on the processor on which versions of XML or HTML must be supported by a processor. The XQuery Working Group further decided that the XML Query language will require the processor to support the value 1.0 in the version parameter if the output method is xml. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0061.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Serialization Section 4.1, "XML Output Method: the version Parameter": This section should explain what it means to serialize a data model using XML Version 1.0 or Version 1.1. These versions are distinguished mainly by the characters they allow in names. Does the data model need to specify which XML version it is using? (Currently the data model does not provide any way to do this.) What happens if serialization is using XML Version 1.0 but it encounters a name that contains a character in the Version 1.1 character set? Also, this section should specify whether XML Version 1.1 is interpreted to include Namespaces Version 1.1 as well. If not, should a separate version parameter be defined for this purpose? --Don Chamberlin
Hi, Don. In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Serialization Section 4.1, "XML Output Method: the version Parameter": This section should explain what it means to serialize a data model using XML Version 1.0 or Version 1.1. These versions are distinguished mainly by the characters they allow in names. Does the data model need to specify which XML version it is using? (Currently the data model does not provide any way to do this.) What happens if serialization is using XML Version 1.0 but it encounters a name that contains a character in the Version 1.1 character set? Also, this section should specify whether XML Version 1.1 is interpreted to include Namespaces Version 1.1 as well. If not, should a separate version parameter be defined for this purpose? >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided that all NCNames must conform to the version of Namespaces in XML specified by the version parameter; if NCNames do not conform to the appropriate version of the Namespaces recommendation, a serialization error results. Similarly, if the instance of the data model contains any characters that are not permitted by the particular version of XML specified by the version parameter, a serialization error results. For instance, if the version parameter has the value 1.0, and a text node contains one of the non-whitespace control characters in the range #x1 to #x1F, a serialization error results, because those characters were not permitted in XML 1.0 documents; if the version parameter has the value 1.1, and a comment node contains one of the control characters in the range #x7F to #x9F, other than NEL, a serialization error results, because those characters are only permitted to appear as character references in XML 1.1 documents. 
Finally, the description of the version parameter should indicate that it controls the version of both XML and Namespaces in XML to which the serialized result should conform. No independent parameter specifying the version of Namespaces in XML is required. I will modify the Serialization specification to reflect these decisions. As you were present when these decisions were made, I will assume the response is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0062.html
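The two character examples in the response can be checked with a Python sketch. The ranges follow the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1. The names `SerializationError` and `check_characters` are hypothetical, and the sketch covers only character content, not the NCName check against Namespaces in XML.

```python
class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def is_xml10_char(cp):
    # XML 1.0 Char production.
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def is_xml11_char(cp):
    # XML 1.1 Char production.
    return (0x1 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_xml11_restricted(cp):
    # XML 1.1 RestrictedChar: allowed only as character references.
    # NEL (#x85) is not in this set.
    return (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)


def check_characters(value, version, in_comment=False):
    """Raise SerializationError if value cannot be serialized for this version.

    in_comment marks content in which character references are not
    recognized, so restricted characters cannot be represented at all.
    """
    for ch in value:
        cp = ord(ch)
        if version == "1.0":
            if not is_xml10_char(cp):
                raise SerializationError("U+%04X not permitted in XML 1.0" % cp)
        elif version == "1.1":
            if not is_xml11_char(cp):
                raise SerializationError("U+%04X not permitted in XML 1.1" % cp)
            if in_comment and is_xml11_restricted(cp):
                raise SerializationError("U+%04X needs a character reference" % cp)
        else:
            raise SerializationError("unsupported version " + version)
```

For example, `check_characters("\x01", "1.0")` raises, while `check_characters("\x85", "1.1", in_comment=True)` passes because NEL is excluded from the restricted set.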
The Serialization document says nothing about how the "nilled" property of an element node is serialized. Does this property always result in an xsi:nil attribute on the generated element? Does this process depend on anything (for example, the type of the element and/or whether it is "nillable")? --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: > The Serialization document says nothing about how the "nilled" > property of an element node is serialized. Does this property always > result in an xsi:nil attribute on the generated element? Does this > process depend on anything (for example, the type of the element > and/or whether it is "nillable")? Thank you for your comment. The XSL and XQuery working groups discussed your comment and decided that it could happen that an element has the nilled property with the value true, but has no xsi:nil attribute. The working groups decided to add a note stating that, in such cases, the serialization process will not create an xsi:nil attribute for the element. As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0064.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
It seems useful for the XML output method to allow a canonical parameter which if true would cause the processor to emit canonical XML. This should not be required of processors, but should be allowed. (i.e. it's optional like indent). The trickiest part is that this could conflict with other properties like indent and omit-xml-declaration. The processor could either signal an error if there was an explicit conflict, or recover by simply outputting canonical XML. I prefer the latter solution. -- Elliotte Rusty Harold elharo@metalab.unc.edu Effective XML (Addison-Wesley, 2003) http://www.cafeconleche.org/books/effectivexml http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
Elliotte, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: > It seems useful for the XML output method to allow a canonical > parameter which if true would cause the processor to emit canonical > XML. This should not be required of processors, but should be > allowed. (i.e. it's optional like indent). > > The trickiest part is that this could conflict with other properties > like indent and omit-xml-declaration. The processor could either > signal an error if there was an explicit conflict, or recover by > simply outputting canonical XML. I prefer the latter solution. Thank you for your comment. The XSL and XML Query Working Groups discussed your comment. The working groups decided that it was too late in the process to add a new feature to serialization to support canonicalization, particularly in light of the fact that a solution to this problem is currently available: serialize using the xml output method, and post-process that serialized result with a processor that converts the XML documents to the appropriate type of canonical XML. In addition, the lack of type-awareness in existing definitions of canonicalization was of concern to the working groups. May I ask you to confirm that this response is acceptable to you? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0146.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Consider the following schema fragment:

  <xs:element name="A">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="C" type="myns:Type1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="B">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="C" type="myns:Type2"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

Now if we consider a document (or any other data source) containing both A and B elements, the following query

  <result> { for $x in doc("myDocument")//C return $x } </result>

returns a result that cannot be strongly typed without losing type information by any valid schema, as the schema spec forbids elements with the same name and a different type in the same content model. It seems to me that the only way of retaining type information would be to annotate produced C elements with xsi:type. This could be a serialization parameter, similar to the cdata-section-elements. However, this would raise another issue, as anonymous type names would then be exposed, and would thus require to be handled in a consistent way by different XQuery and XML Schema processors. This issue is important, especially for tools that perform distributed XQuery processing, and that need to retain consistent type information when moving XML data from one processing node to another.
Hello, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Consider the following schema fragment: <xs:element name="A"> <xs:complexType> <xs:sequence> <xs:element name="C" type="myns:Type1"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="B"> <xs:complexType> <xs:sequence> <xs:element name="C" type="myns:Type2"/> </xs:sequence> </xs:complexType> </xs:element> Now if we consider a document (or any other data source) containing both A and B elements, the following query <result> { for $x in doc("myDocument")//C return $x } </result> returns a result that cannot be strongly typed without losing type information by any valid schema, as the schema spec forbids elements with the same name and a different type in the same content model. It seems to me that the only way of retaining type information would be to annotate produced C elements with xsi:type. This could be a serialization parameter, similar to the cdata-section-elements. However, this would raise another issue, as anonymous type names would then be exposed, and would thus require to be handled in a consistent way by different XQuery and XML Schema processors. This issue is important, especially for tools that perform distributed XQuery processing, and that need to retain consistent type information when moving XML data from one processing node to another. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment and several related comments. There was general agreement that there is some need for a mechanism that preserves most or all of the properties of the items in the sequence that is being serialized. However, the working groups decided that precisely defining all of the requirements for such a mechanism at this stage would be difficult, and would likely lead to a solution that would not satisfy real user requirements. 
Therefore, the working groups decided to consider such a feature for a future revision of the recommendations, and close this comment without any changes to the specifications. May I ask you to confirm that this response is acceptable to you? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0188.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Technical [A] [Section 3: Serialization Parameters] This section outlines 15 serialization parameters. These parameters have informal descriptions - such as 'The value must be yes or no', etc. It is possible to describe all the parameters, except 'use-character-maps', using XML Schema data types. Suggested descriptions are,

    encoding: a new datatype derived from 'xs:string'
    cdata-section-element: a list of 'xs:QName'
    doctype-system: a new datatype derived from 'xs:string'
    doctype-public: a new datatype derived from 'xs:string'
    escape-uri-attributes: 'xs:boolean'
    include-content-type: 'xs:boolean'
    indent: 'xs:boolean'
    media-type: a new datatype derived from 'xs:string'
    normalize-unicode: 'xs:boolean'
    omit-xml-declaration: 'xs:boolean'
    standalone: 'xs:boolean'
    undeclare-namespaces: 'xs:boolean'
    version: a new datatype derived from 'xs:string'
    use-character-maps: NONE (see related issue, [N])
    method: 'xs:QName'

On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Seems a good idea in principle. The types as listed aren't quite right, for example standalone is xs:boolean? rather than xs:boolean, but this only adds to the argument for formalising them. Michael Kay (personal response)
Mary, In [1], you submitted the following comment on the Serialization last call draft on behalf of the XML Schema WG. Mary Holstege wrote on 2004-02-12 04:11:28 PM: > [A] [Section 3: Serialization Parameters] This section outlines 15 > serialization parameters. These parameters have informal descriptions - such > as 'The value must be yes or no', etc. It is possible to describe all the > parameters, except 'use-character-maps', using XML Schema data types. > Suggested descriptions are, > > encoding: a new datatype derived from 'xs:string' > cdata-section-element: a list of 'xs:QName' > doctype-system: a new datatype derived from 'xs:string' > doctype-public: a new datatype derived from 'xs:string' > escape-uri-attributes: 'xs:boolean' > include-content-type: 'xs:boolean' > indent: 'xs:boolean' > media-type: a new datatype derived from 'xs:string' > normalize-unicode: 'xs:boolean' > omit-xml-declaration: 'xs:boolean' > standalone: 'xs:boolean' > undeclare-namespaces: 'xs:boolean' > version: a new datatype derived from 'xs:string' > use-character-maps: NONE (see related issue, [N]) > method: 'xs:QName' Thanks to you and the Schema WG for this comment. The XSL and XQuery working groups considered the comment, and agreed that the definitions of the permissible sets of values need to be specified more clearly. However, the working groups did not feel it was necessary to describe the values with reference to the XML Schema data types, as the serialization parameters are not part of an API, but merely a formalism used between specifications. The working groups would like to replace the descriptions of the values of the parameters that appears in the bulleted list in Section 3, with a table. The following is my proposed replacement. 
<<
+----------------------+------------------------------------------------+
|PARAMETER NAME        |PERMITTED VALUES FOR PARAMETER                  |
+----------------------+------------------------------------------------+
|cdata-section-elements|A list of expanded-QNames, possibly empty.      |
+----------------------+------------------------------------------------+
|doctype-public        |A string of Unicode characters. This parameter  |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|doctype-system        |A string of Unicode characters. This parameter  |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|encoding              |A string of Unicode characters in the range #x21|
|                      |to #x7E (that is, printable ASCII characters);  |
|                      |the value should be a charset registered with   |
|                      |the Internet Assigned Numbers Authority [IANA], |
|                      |[RFC2278] or begin with the characters x- or X-.|
+----------------------+------------------------------------------------+
|escape-uri-attributes |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|include-content-type  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|indent                |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|media-type            |A string of Unicode characters specifying the   |
|                      |media type (MIME content type) [RFC2376]; the   |
|                      |charset parameter of the media type must not be |
|                      |specified explicitly.                           |
+----------------------+------------------------------------------------+
|method                |An expanded-QName with a null namespace URI, and|
|                      |the local part of the name equal to xml, xhtml, |
|                      |html or text, or having a non-null namespace    |
|                      |URI. If the namespace URI is non-null, the      |
|                      |parameter specifies an implementation-defined   |
|                      |output method.                                  |
+----------------------+------------------------------------------------+
|normalize-unicode     |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|omit-xml-declaration  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|standalone            |One of the enumerated values yes, no or none    |
+----------------------+------------------------------------------------+
|undeclare-namespaces  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|use-character-maps    |A list of pairs, possibly empty, with each pair |
|                      |consisting of a single Unicode character and a  |
|                      |string of Unicode characters.                   |
+----------------------+------------------------------------------------+
|version               |A string of Unicode characters.                 |
+----------------------+------------------------------------------------+
>>

May I ask you to confirm that this response is acceptable to the XML Schema WG?

Thanks, Henry
------------------------------------------------------------------
Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
The Schema WG thanks you for your response. We find your response a distinct improvement and are content with it, although certain members of the WG continue to feel that the definitions would be cleaner if you did, in fact, go the final step to using concrete datatypes for the serialization parameter definitions. //Mary
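The permitted values in the proposed table for this issue can be read as a set of simple checks. The following Python sketch is illustrative only and is not part of the Serialization draft: the function name check_parameter, the modelling of an expanded-QName as a (namespace URI, local name) pair with None for a null namespace, and the use of None for an absent optional parameter are all assumptions made for the example. The charset restriction on media-type is deliberately not checked.

```python
YES_NO = {"yes", "no"}

# Enumerated parameters and their permitted values, as listed in the proposed table.
ENUMERATED = {
    "escape-uri-attributes": YES_NO,
    "include-content-type": YES_NO,
    "indent": YES_NO,
    "normalize-unicode": YES_NO,
    "omit-xml-declaration": YES_NO,
    "undeclare-namespaces": YES_NO,
    "standalone": {"yes", "no", "none"},
}

# Local names allowed for the method parameter when the namespace URI is null.
BUILTIN_METHODS = {"xml", "xhtml", "html", "text"}


def check_parameter(name, value):
    """Hypothetical helper: True if value is permitted for the named parameter."""
    if name in ENUMERATED:
        return value in ENUMERATED[name]
    if name == "encoding":
        # Only characters in the range #x21 to #x7E (printable ASCII).
        return isinstance(value, str) and all(0x21 <= ord(c) <= 0x7E for c in value)
    if name == "method":
        # Assumption: an expanded-QName is modelled as (namespace_uri, local_name).
        namespace_uri, local_name = value
        return namespace_uri is not None or local_name in BUILTIN_METHODS
    if name in ("doctype-public", "doctype-system"):
        # Optional parameters: None stands for "not supplied" in this sketch.
        return value is None or isinstance(value, str)
    if name in ("media-type", "version"):
        # Note: the rule that media-type must not carry a charset is not checked here.
        return isinstance(value, str)
    if name == "cdata-section-elements":
        # A possibly empty list of expanded-QNames.
        return all(isinstance(q, tuple) and len(q) == 2 for q in value)
    if name == "use-character-maps":
        # A possibly empty list of (single character, replacement string) pairs.
        return all(
            isinstance(ch, str) and len(ch) == 1 and isinstance(s, str)
            for ch, s in value
        )
    return False  # unknown parameter name
```

Note that standalone takes three values in the table (yes, no or none), which is the point Michael Kay raised about it not being a plain xs:boolean.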
Technical [B] [Section 2: Serializing Arbitrary Data Models] "cast as xs:string" is a key phrase in this section. To improve readability, there should be a pointer to what "cast as xs:string" means. We found 2 locations where how to "cast as xs:string" is indirectly described, [1] http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#ElementNodeAccessors (see dm:string-value) [2] http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#AttributeNodeAccessors (see dm:string-value) We found 1 location where how to "cast to string" is directly described, [3] http://www.w3.org/TR/2003/WD-xpath-functions-20031112/#casting-to-string On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [4], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group:

> [B] [Section 2: Serializing Arbitrary Data Models] "cast as xs:string" is a
> key phrase in this section. To improve readability, there should be a
> pointer to what "cast as xs:string" means.
>
> We found 2 locations where how to "cast as xs:string" is indirectly
> described,
>
> [1]
> http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#ElementNodeAccessors
> (see dm:string-value)
> [2]
> http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#AttributeNodeAccessors
> (see dm:string-value)
>
> We found 1 location where how to "cast to string" is directly described,
>
> [3] http://www.w3.org/TR/2003/WD-xpath-functions-20031112/#casting-to-string

Thanks to you and to the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and agreed that Section 2 of Serialization should refer to a normative definition of casting to string. The working groups decided the normative definition should be that found in the Functions and Operators draft.[3] May I ask you to confirm that the working group finds the response acceptable? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [4] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0262.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
The Schema WG thanks you for your response to our comment, and are happy with it. //Mary
Technical [C] [Section 2: Serializing Arbitrary Data Models] Saying the process fails for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. What happens if I have such a sequence? This appears to be a serialization error because processor is unable to cast an atomic value to string. Suggestion: replace 'process will fail' statement with 'serialization error'. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: > [C] [Section 2: Serializing Arbitrary Data Models] Saying the process fails > for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. What > happens if I have such a sequence? This appears to be a serialization error > because processor is unable to cast an atomic value to string. Suggestion: > replace 'process will fail' statement with 'serialization error'. Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and agreed that the description should indicate that this is a serialization error. In fact, the second and sixth items in the numbered list in Section 2 already normatively indicate that fact. For clarity, the note in Section 2 will be changed to use the term "serialization error" as well. As that is the change the XML Schema Working Group recommended, I trust it will be acceptable. May I ask you to confirm that the working group finds the response acceptable? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0263.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Henry Zongaro writes: > > Mary, > > In [1], you submitted the following comment on the Last Call Working > Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema > Working Group: > > > [C] [Section 2: Serializing Arbitrary Data Models] Saying the process > fails > > for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. > What > > happens if I have such a sequence? This appears to be a serialization > error > > because processor is unable to cast an atomic value to string. > Suggestion: > > replace 'process will fail' statement with 'serialization error'. > > Thanks to you and the working group for this comment. > > The XSL and XML Query Working Groups discussed the comment, and > agreed that the description should indicate that this is a serialization > error. In fact, the second and sixth items in the numbered list in > Section 2 already normatively indicate that fact. For clarity, the note > in Section 2 will be changed to use the term "serialization error" as > well. > > As that is the change the XML Schema Working Group recommended, I > trust it will be acceptable. May I ask you to confirm that the working > group finds the response acceptable? The Schema WG thanks you for this response. We find the clarification as a serialization error an improvement and accept that. We continue to be deeply troubled, however, by the fact that data models with xs:QNames fail to serialize and therefore validate correctly. We are heartened by our knowledge that the Query/XSL Working groups have continued to discuss that matter, and encourage them in that effort. //Mary
Technical [D] [Section 1: Introduction] "Ed. Note: This material has been moved out of the XSLT draft and into a separate document. The Working Groups also considered moving this material directly into the Data Model document, but elected to keep it separate for the moment, principally in order to advance the Data Model to Last Call. In the future, this material may be moved into the Data Model. The Working Groups solicit public opinion about which alternative is superior. " We prefer keeping this material in this separate document. This way, it makes serialization as independent as possible. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema working group: << [D] [Section 1: Introduction] "Ed. Note: This material has been moved out of the XSLT draft and into a separate document. The Working Groups also considered moving this material directly into the Data Model document, but elected to keep it separate for the moment, principally in order to advance the Data Model to Last Call. In the future, this material may be moved into the Data Model. The Working Groups solicit public opinion about which alternative is superior. " We prefer keeping this material in this separate document. This way, it makes serialization as independent as possible. >> Thanks to you and the schema working group for this comment. The XSL and XML Query Working Groups discussed the comment, and agreed that it would be best to specify the serialization process in a separate document. The editorial note will be deleted. I trust the XML Schema Working Group will find that response acceptable, as it is as they suggested. May I ask you to confirm that it is? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0264.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Technical/Editorial [E] [Section 2: Serializing Arbitrary Data Models] This section talks about serialization of a "arbitrary" data model but in fact will fail for some data model instances. In addition, section 4 implicitly adds additional constraints on the data model (by putting constraints on the serialized form) without making it clear what those constraints are. Suggested changes are, - Change title of section 2 to Normalization. - Be clearer about actual goal (in particular: to serialize to well formed XML? to serialize to XML in such a way as to ensure 1:1 roundtrippability? other?). - List explicitly the conditions for successful serialization (and/or successful serialization to Well Formed XML, and/or successful serialization to XML which when schema-validated will produce the same data model instance). Section 4 (XML Output Method) does state: "The xml output method outputs the data model as an XML entity that must satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both, unless the processor is unable to satisfy those rules due to either serialization errors or the requirements of the character expansion phase of serialization, as described in 3 Serialization Parameters." It is not clear what happens when there are problems (when "the processor is unable to satisfy those rules"): failure? output of non-WF XML? On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: << [E] [Section 2: Serializing Arbitrary Data Models] This section talks about serialization of a "arbitrary" data model but in fact will fail for some data model instances. In addition, section 4 implicitly adds additional constraints on the data model (by putting constraints on the serialized form) without making it clear what those constraints are. Suggested changes are, - Change title of section 2 to Normalization. - Be clearer about actual goal (in particular: to serialize to well formed XML? to serialize to XML in such a way as to ensure 1:1 roundtrippability? other?). - List explicitly the conditions for successful serialization (and/or successful serialization to Well Formed XML, and/or successful serialization to XML which when schema-validated will produce the same data model instance). Section 4 (XML Output Method) does state: "The xml output method outputs the data model as an XML entity that must satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both, unless the processor is unable to satisfy those rules due to either serialization errors or the requirements of the character expansion phase of serialization, as described in 3 Serialization Parameters." It is not clear what happens when there are problems (when "the processor is unable to satisfy those rules"): failure? output of non-WF XML? >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed your comment. Regarding the first point, the working groups agreed. The title will be changed to "Sequence Normalization" Regarding the second point, the goal is to serialize well-formed XML that reflects the content of the input sequence to the extent possible. We will add a statement to that effect. 
Regarding the third point, in answer to the question as to what happens when "those rules" can't be satisfied, we will clarify the cited text by changing it to the following: << The xml output method outputs the instance of the data model as an XML entity that must satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both. A serialization error results if the serializer is unable to satisfy those rules, except for contents modified by the character expansion phase of serialization, as described in 4 Phases of Serialization, which may result in the serial output being not well-formed rather than a serialization error. If a serialization error results, the processor must signal the error. >> However, describing the conditions under which serialization to XML will result in the same data model instance when schema-validated would not be practical; because of implementation-defined issues, etc., the list would be open ended. The working groups decided to make no change to Serialization in response to this part of the comment. May I ask you to confirm that this response is acceptable to the Schema Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0265.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XSL and XML Query Working Groups: The XML Schema WG has reviewed your response to this issue during the telcon on October 1, 2004 [1], and we are satisfied with your answer. Thanks for your time and consideration. Sincerely, David Ezell (on behalf of the XML Schema WG) [1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Oct/0000.html
Technical/Editorial [F] [Section 2: Serializing Arbitrary Data Models] The first paragraph of this section states: "An instance of the data model that is input to the serialization process is a sequence. The serialization process must first place that input sequence into a normalized form for serialization; it is the normalized sequence that is actually serialized. The normalized form for serialization is constructed by applying all of the following rules in order, with the initial sequence being input to the first step, and the sequence that results from any step being used as input to the subsequent step." We think wording in this section tends to imply a required implementation, which, given the destructive nature of the implementation described, leads to the conclusion that serialized data models cannot subsequently be used for anything else. We believe what is intended is the description of a mapping between data models and normalized data models, without attempting to constrain implementations. We request that the text in this section be recast in a more declarative fashion to make these intentions clear. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: << [F] [Section 2: Serializing Arbitrary Data Models] The first paragraph of this section states: "An instance of the data model that is input to the serialization process is a sequence. The serialization process must first place that input sequence into a normalized form for serialization; it is the normalized sequence that is actually serialized. The normalized form for serialization is constructed by applying all of the following rules in order, with the initial sequence being input to the first step, and the sequence that results from any step being used as input to the subsequent step." We think wording in this section tends to imply a required implementation, which, given the destructive nature of the implementation described, leads to the conclusion that serialized data models cannot subsequently be used for anything else. We believe what is intended is the description of a mapping between data models and normalized data models, without attempting to constrain implementations. We request that the text in this section be recast in a more declarative fashion to make these intentions clear. >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed your comment, and decided to make the following changes to Section 2 of Serialization. These changes are with respect to the July 23 draft of Serialization.[2] o Change the second sentence of the first paragraph to make it clear that the process is not destructive: << Prior to serializing a sequence using any of the output methods whose behavior is specified by this document (3 Serialization Parameters) the serialization process must first compute a normalized sequence for serialization; it is the normalized sequence that is actually serialized. 
>>

o Reword the items in the numbered list so that it's clear that the result at each step is a new sequence:

<<
1. If the sequence that is input to serialization is empty, create a sequence S1 that consists of a zero-length string. Otherwise, copy each item in the sequence that is input to serialization to create the new sequence S1.

2. For each item in the sequence S1, if the item is atomic, obtain the lexical representation of the item by casting it to an xs:string and copy the string representation to the new sequence; otherwise, copy the item, which must be a node, to the new sequence. It is a serialization error if an atomic value cannot be cast to xs:string. The new sequence is S2.

3. For each subsequence of adjacent strings in S2, copy a single string to the new sequence equal to the values of the strings in the subsequence concatenated in order, each separated by a single space. Copy all other items to the new sequence. The new sequence is S3.

4. For each item in S3, if the item is a string, create a text node in the new sequence whose string value is equal to the string; otherwise, copy the item to the new sequence. The new sequence is S4.

5. For each item in S4, if the item is a document node, copy its children to the new sequence; otherwise, copy the item to the new sequence. The new sequence is S5.

6. It is a serialization error if an item in S5 is an attribute node or a namespace node. Otherwise, construct a new sequence, S6, that consists of a single document node and copy all the items in the sequence, which are all nodes, as children of that document node. S6 is the normalized sequence.
>>

May I ask you to confirm that this response is acceptable to the Schema working group?
Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0266.html [2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serdm ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XSL and XML Query Working Groups: The XML Schema WG has reviewed your response to this issue during the telcon on September 17, 2004 [1], and we are satisfied with your answer. Thanks for your time and consideration. Sincerely, David Ezell (on behalf of the XML Schema WG) [1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
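The six normalization steps proposed in the response to this issue can be sketched as a chain of new sequences S1 through S6. The Python below is a minimal illustration and not the working groups' implementation: the Node class, the SerializationError exception and the cast_to_string helper are hypothetical names, nodes are reused by reference rather than copied, and the cast handles only strings, integers and booleans as a stand-in for the normative casting rules in Functions and Operators.

```python
from dataclasses import dataclass, field


class SerializationError(Exception):
    """Hypothetical exception standing in for a serialization error."""


@dataclass
class Node:
    """Hypothetical, greatly simplified data model node."""
    kind: str                      # "document", "element", "text", "attribute", ...
    value: str = ""
    children: list = field(default_factory=list)


def cast_to_string(atom):
    # Stand-in for "cast as xs:string"; only a few Python types are handled.
    if isinstance(atom, bool):
        return "true" if atom else "false"
    if isinstance(atom, (str, int)):
        return str(atom)
    raise SerializationError(f"cannot cast {atom!r} to xs:string")


def normalize(sequence):
    """Return a new document node; the input list is never modified."""
    # Step 1: an empty input becomes a sequence holding one zero-length string.
    s1 = list(sequence) if sequence else [""]
    # Step 2: atomic values are replaced by their string form; nodes are kept.
    s2 = [i if isinstance(i, Node) else cast_to_string(i) for i in s1]
    # Step 3: each run of adjacent strings is joined with single spaces.
    s3, run = [], []
    for item in s2:
        if isinstance(item, str):
            run.append(item)
            continue
        if run:
            s3.append(" ".join(run))
            run = []
        s3.append(item)
    if run:
        s3.append(" ".join(run))
    # Step 4: strings become text nodes.
    s4 = [Node("text", i) if isinstance(i, str) else i for i in s3]
    # Step 5: document nodes are replaced by their children.
    s5 = []
    for item in s4:
        if item.kind == "document":
            s5.extend(item.children)
        else:
            s5.append(item)
    # Step 6: attribute and namespace nodes are errors; wrap the rest in a document.
    for item in s5:
        if item.kind in ("attribute", "namespace"):
            raise SerializationError(f"{item.kind} node at top level")
    return Node("document", children=s5)
```

Because every step builds a fresh list, the caller's sequence is left untouched, which is the non-destructive reading the Schema WG asked for in comment [F].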
Technical/Editorial [G] [Section 3: Serialization Parameters] Namespace binding generation ought to be explicitly called out either as its own phase or as part of markup generation. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Technical/Editorial [H] [Section 4: XML Output Method] The exception to the round-trippability of the serialization is unclear: "Additional nodes may be present in the new tree, and the values of attribute nodes and text nodes in the new tree may be different from those in the original tree, due to the character expansion phase of serialization." What additional nodes may be present? How may they differ? As written this sentence is ambiguous and may be read as allowing _any_ additional nodes in the tree. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group. > [H] [Section 4: XML Output Method] The exception to the round-trippability > of the serialization is unclear: "Additional nodes may be present in the new > tree, and the values of attribute nodes and text nodes in the new tree may > be different from those in the original tree, due to the character expansion > phase of serialization." > > What additional nodes may be present? How may they differ? As written this > sentence is ambiguous and may be read as allowing _any_ additional nodes in > the tree. Thanks to Mary and the XML Schema Working Group for this comment. The XSL and XQuery Working Groups discussed the comment, and decided to add a note to clarify the situation. I would like to add the following note to the final bullet of the bulleted list in section 4. << Note: The use-character-maps parameter can cause arbitrary characters to be inserted into the serialized XML document in an unescaped form, including characters that would be considered part of XML markup. Such characters could result in arbitrary new element nodes, attribute nodes, and so on, in the new tree that results from processing the serialized XML document. >> May I ask you to confirm that this response is acceptable to the XML Schema Working Group? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0268.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
> The XSL and XQuery Working Groups discussed the comment, and decided > to add a note to clarify the situation. I would like to add the following > note to the final bullet of the bulleted list in section 4. > > << > Note: The use-character-maps parameter can cause arbitrary characters to > be inserted into the serialized XML document in an unescaped form, > including characters that would be considered part of XML markup. Such > characters could result in arbitrary new element nodes, attribute nodes, > and so on, in the new tree that results from processing the serialized XML > document. > >> > > May I ask you to confirm that this response is acceptable to the XML > Schema Working Group? The Schema WG thanks you for your response, and finds it acceptable. -- Mary Holstege@mathling.com
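The effect described in the proposed note can be demonstrated with a small Python sketch. It is an illustration under stated assumptions, not the serializer defined by the draft: serialize_text is a hypothetical helper, the character map is modelled as a plain dictionary, and the guillemet characters chosen as map keys are arbitrary. Mapped characters are written out unescaped, while all other characters go through ordinary XML escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_text(text, character_map):
    """Hypothetical helper: serialize a text node's value with a character map."""
    out = []
    for ch in text:
        if ch in character_map:
            # Replacement strings are emitted as-is, with no escaping.
            out.append(character_map[ch])
        else:
            # Everything else is escaped in the usual way (&, <, >).
            out.append(escape(ch))
    return "".join(out)


# Assumed map: two guillemets are replaced by markup characters.
char_map = {"\u00ab": "<b>", "\u00bb": "</b>"}

# The original tree has a single text node inside <p>.
serialized = "<p>" + serialize_text("a \u00abbold\u00bb & b", char_map) + "</p>"

# Parsing the serialized form yields a tree with an extra element node.
reparsed = ET.fromstring(serialized)
new_element_names = [child.tag for child in reparsed]
```

Here the original text node contained no markup, yet the reparsed tree has a b element child, which is the kind of "arbitrary new element node" the note warns about.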
Technical/Editorial [I] [Section 4.3: XML Output Method: the indent Parameter] licenses the addition of additional whitespace; this is not called out as permitted under the rules in section 4, however. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group. > [I] [Section 4.3: XML Output Method: the indent Parameter] licenses the > addition of additional whitespace; this is not called out as permitted under > the rules in section 4, however. Thanks to you and the XML Schema Working Group for this comment. The XSL and XQuery Working Groups discussed the comment, and agreed. The following item will be added to the bulleted list in section 4 to address this comment: << o Additional text nodes consisting of whitespace characters may be present in the new tree and some text nodes in the new tree may contain additional whitespace characters that were not present in the original tree if the indent parameter has the value yes, as described in 4.3 XML Output Method: the indent Parameter. >> May I ask you to confirm that this response is acceptable to the XML Schema Working Group? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0269.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Henry Zongaro writes: > The XSL and XQuery Working Groups discussed the comment, and agreed. > The following item will be added to the bulleted list in section 4 to > address this comment: > > << > o Additional text nodes consisting of whitespace characters may be present > in the new tree and some text nodes in the new tree may contain additional > whitespace characters that were not present in the original tree if the > indent parameter has the value yes, as described in 4.3 XML Output Method: > the indent Parameter. > >> > > May I ask you to confirm that this response is acceptable to the XML > Schema Working Group? The Schema WG thanks you for this response and finds it acceptable. -- Mary Holstege@mathling.com
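The additional whitespace text nodes mentioned in the new bullet can be observed with any indenting serializer. The Python sketch below uses the standard library's xml.dom.minidom pretty printer purely as a stand-in; it is not the indent parameter as specified by the Serialization draft, and its exact whitespace choices are an implementation detail of that library.

```python
from xml.dom import minidom

# Original tree: <a> has exactly one child, the element <b>.
original = minidom.parseString("<a><b>x</b></a>")
orig_children = original.documentElement.childNodes

# Serialize with indentation, then parse the result into a new tree.
indented_text = original.toprettyxml(indent="  ")
reparsed = minidom.parseString(indented_text)
new_children = reparsed.documentElement.childNodes

# Text nodes that appear among <a>'s children only in the new tree.
extra_text = [n for n in new_children if n.nodeType == n.TEXT_NODE]
```

Comparing orig_children with new_children shows whitespace-only text nodes that were not present in the original tree, so a round trip through an indented serialization does not reproduce the original tree node for node.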
Technical [K] [General] In the absence of 'Conformance' Section, what should a processor do to claim conformance to this specification? On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: << [K] [General] In the absence of 'Conformance' Section, what should a processor do to claim conformance to this specification? >> Thanks to you and the Schema Working Group for this comment. The XSL and XML Query Working Groups discussed your comment, and decided to add the following Conformance section to the Serialization draft: << 10. Conformance Serialization is intended primarily as a component that can be used by other specifications. Therefore, this document relies on specifications that use it to specify conformance criteria for Serialization in their respective environments. Specifications that set conformance criteria for their use of Serialization must not change the semantic definitions of Serialization as given in this specification, except by subsetting and/or compatible extensions. >> May I ask you to confirm that this response is acceptable to the Schema Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0271.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XSL and XML Query Working Groups: The XML Schema WG has reviewed your response to this issue during the telcon on September 17, 2004 [1], and we are satisfied with your answer. Thanks for your time and consideration. Sincerely, David Ezell (on behalf of the XML Schema WG) [1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
Technical [L] [Section 4.1] Given that XML 1.1 is not [Should be "now". HZ] a recommendation, we believe that the serialization specification should give guidance to users and implementers about serializing as 1.0 or 1.1. We believe this section (4.1) is a good start, but needs more details about how serializers should deal with characters in the range x00 to x1F (ref http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-xml11). See our related comment on the data model. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema working group: << Technical [L] [Section 4.1] Given that XML 1.1 is not [Should be "now". HZ] a recommendation, we believe that the serialization specification should give guidance to users and implementers about serializing as 1.0 or 1.1. We believe this section (4.1) is a good start, but needs more details about how serializers should deal with characters in the range x00 to x1F (ref http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-xml11). See our related comment on the data model. >> Thanks to you and the XML Schema Working Group for this comment. The XSL and XML Query Working Groups discussed your comment. In response to another comment on the Last Call Working Draft, the working groups decided to add information on how NEL and LSEP characters must be handled, but neglected to add information on the control characters. The working groups decided to make the following changes based on the July 23 Working Draft of Serialization.[2] o In Section 5, the first paragraph following the bulleted list, change "certain whitespace characters" to "certain characters". (We've not been dealing with whitespace characters alone for some time.) o In Section 5, the first paragraph following the bulleted list, append the following sentence << In addition, the non-whitespace control characters #x1 through #x1F and #x7F through #x9F in text nodes and attribute nodes must be output as character references. >> o In Section 5, in the last note, remove the words "or CDATA sections". o In Section 5, in the last note, append the following paragraph << XML 1.0 permitted control characters in the range #x7F through #x9F to appear as literal characters in an XML document, but XML 1.1 requires such characters to be escaped as character references. 
An external general parsed entity with no text declaration or a text declaration that specifies a version pseudo-attribute with value "1.0" that is invoked by an XML 1.1 document entity must follow the rules of XML 1.1. Therefore, the non-whitespace control characters in the ranges #x1 through #x1F and #x7F through #x9F must always be escaped, regardless of the value of the version parameter. >> May I ask you to confirm that this response is acceptable to the Schema Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0272.html [2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/ ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
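The escaping rule proposed in this response can be sketched roughly as follows. This is only an illustration in Python, not text from the Serialization draft: the function name escape_controls is hypothetical, and treating tab, line feed and carriage return as the whitespace exceptions within #x1 through #x1F is an assumption. The sketch does not attempt the escaping of markup characters such as "<" and "&".

```python
# Whitespace characters in the range #x1-#x1F that are left untouched
# (assumption: tab, line feed, carriage return).
WHITESPACE = {0x9, 0xA, 0xD}

def escape_controls(text: str) -> str:
    """Replace non-whitespace control characters with character references."""
    out = []
    for ch in text:
        cp = ord(ch)
        is_c0 = 0x1 <= cp <= 0x1F and cp not in WHITESPACE
        is_c1 = 0x7F <= cp <= 0x9F
        # Hexadecimal character reference, e.g. &#x7F;
        out.append(f"&#x{cp:X};" if (is_c0 or is_c1) else ch)
    return "".join(out)

print(escape_controls("a\x01b\tc\x7fd"))
```

Because the rule applies regardless of the version parameter, the sketch takes no version argument.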
Dear XSL and XML Query Working Groups: The XML Schema WG has reviewed your response to this issue during the telcon on September 17, 2004 [1], and we are satisfied with your answer. Thanks for your time and consideration. Sincerely, David Ezell (on behalf of the XML Schema WG) [1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [4] This only defines serialization into bytes. In some contexts (e.g. Databases, in-program,...), serialization into a stream of characters is also important. The spec should specify how this is done. Regards, Martin.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. > [4] This only defines serialization into bytes. In some contexts > (e.g. Databases, in-program,...), serialization into a stream > of characters is also important. The spec should specify how > this is done. Thanks to Martin and the I18N Working Group for this comment. The XSL and XQuery Working Groups discussed the comment. The working groups noted that there is an analogy in parsing XML documents. XML 1.0 and XML 1.1 parsed entities are defined as sequences of character code points, each in some encoding. Though it is common practice to parse XML documents that have already been decoded into a sequence of characters, the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an XML processor in those terms. Based on this analogy, the working groups decided that it was not appropriate for Serialization to specify normatively how to serialize into a stream of characters. The working groups did decide to add a note to Section 3 of Serialization indicating that a processor could provide an option that would permit the fourth phase of serialization (Encoding) to be skipped. May I ask the I18N Working Group to confirm that this response is acceptable? Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Henry Zongaro a écrit : > In [1], Martin Duerst submitted the following comment on the Last > Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of > the I18N Working Group. > > >>[4] This only defines serialization into bytes. In some contexts >> (e.g. Databases, in-program,...), serialization into a stream >> of characters is also important. The spec should specify how >> this is done. > > > Thanks to Martin and the I18N Working Group for this comment. > > The XSL and XQuery Working Groups discussed the comment. The working > groups noted that there is an analogy in parsing XML documents. XML 1.0 > and XML 1.1 parsed entities are defined as sequences of character code > points, each in some encoding. Though it is common practice to parse XML > documents that have already been decoded into a sequence of characters, > the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an > XML processor in those terms. > > Based on this analogy, the working groups decided that it was not > appropriate for Serialization to specify normatively how to serialize into > a stream of characters. The working groups did decide to add a note to > Section 3 of Serialization indicating that a processor could provide an > option that would permit the fourth phase of serialization (Encoding) to > be skipped. We are not really satisfied with this resolution and would like to request further clarification. In particular, conformance when one is actually serializing to characters instead of bytes is not clear at all to us. Allowing this but not normatively is very strange, one is left to wonder what would be the conformance status of an implementation that *only* serializes to characters (because that's all that is required in a given context). > [1] > http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html Regards, -- François Yergeau
Hello, François. François Yergeau wrote on 2004-06-14 09:28:11 PM: >Henry Zongaro a écrit : >> In [1], Martin Duerst submitted the following comment on the Last >> Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of >> the I18N Working Group. >> >>>[4] This only defines serialization into bytes. In some contexts >>> (e.g. Databases, in-program,...), serialization into a stream >>> of characters is also important. The spec should specify how >>> this is done. >> The XSL and XQuery Working Groups discussed the comment. The working >> groups noted that there is an analogy in parsing XML documents. XML 1.0 >> and XML 1.1 parsed entities are defined as sequences of character code >> points, each in some encoding. Though it is common practice to parse XML >> documents that have already been decoded into a sequence of characters, >> the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an >> XML processor in those terms. >> >> Based on this analogy, the working groups decided that it was not >> appropriate for Serialization to specify normatively how to serialize into >> a stream of characters. The working groups did decide to add a note to >> Section 3 of Serialization indicating that a processor could provide an >> option that would permit the fourth phase of serialization (Encoding) to >> be skipped. > >We are not really satisfied with this resolution and would like to >request further clarification. In particular, conformance when one is >actually serializing to characters instead of bytes is not clear at all >to us. Allowing this but not normatively is very strange, one is left >to wonder what would be the conformance status of an implementation that >*only* serializes to characters (because that's all that is required in >a given context). > >> [1] >> http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html Thank you for your response. 
The intent of the note was to indicate that an implementer might supply such a feature as an extension, because it is often required, but that such a feature is explicitly beyond the scope of the specification. An implementer might supply anything as an extension, and doesn't require permission to do so - we would just like to mention this one as a useful extension. Here is the text of the note that I'm proposing: << Note: Serialization is only defined in terms of encoding the result as a stream of bytes. However, a processor may provide an option that allows the encoding phase to be skipped, so that the result of serialization is a stream of Unicode characters. The effect of any such option is implementation-defined, and a processor is not required to support such an option. >> I don't believe there is a question of conformance here. Serialization to characters is explicitly a usage that is beyond the specification, and the behaviour of a processor that supplies such a feature is unspecified. Similarly, many XML parsers are able to parse characters in addition to parsing encoded characters, but the conformance of such parsers is not in question in spite of the fact that this feature is an extension that is not described by the XML 1.0 or 1.1 Recommendations. Does the I18N Working Group feel it would be better not to include such a note at all? Thanks, Henry ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
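The option described in the proposed note can be pictured with a small sketch. This is a hypothetical illustration only: the function name finish_serialization and the convention of passing None to skip the encoding phase are assumptions, not anything defined by the Serialization draft, which leaves the effect of such an option implementation-defined.

```python
from typing import Optional, Union

def finish_serialization(chars: str,
                         encoding: Optional[str] = "UTF-8") -> Union[str, bytes]:
    """Run, or skip, the final (encoding) phase of serialization.

    chars is the character stream produced by the earlier phases.
    Passing encoding=None models the optional extension that skips
    the encoding phase and hands back Unicode characters.
    """
    if encoding is None:
        return chars                  # extension: stream of characters
    return chars.encode(encoding)     # specified behaviour: stream of bytes

print(finish_serialization("<doc>é</doc>"))        # bytes
print(finish_serialization("<doc>é</doc>", None))  # str
```

Only the byte-producing branch corresponds to what the specification defines; the other branch is the kind of extension the note mentions.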
Martin, François. In [1] Martin submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: << [4] This only defines serialization into bytes. In some contexts (e.g. Databases, in-program,...), serialization into a stream of characters is also important. The spec should specify how this is done. >> In [2], I announced the following decision on behalf of the XSL and XML Query Working Groups: << The XSL and XQuery Working Groups discussed the comment. The working groups noted that there is an analogy in parsing XML documents. XML 1.0 and XML 1.1 parsed entities are defined as sequences of character code points, each in some encoding. Though it is common practice to parse XML documents that have already been decoded into a sequence of characters, the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an XML processor in those terms. Based on this analogy, the working groups decided that it was not appropriate for Serialization to specify normatively how to serialize into a stream of characters. The working groups did decide to add a note to Section 3 of Serialization indicating that a processor could provide an option that would permit the fourth phase of serialization (Encoding) to be skipped. >> In [3], François raised the following objection on behalf of I18N: << We are not really satisfied with this resolution and would like to request further clarification. In particular, conformance when one is actually serializing to characters instead of bytes is not clear at all to us. Allowing this but not normatively is very strange, one is left to wonder what would be the conformance status of an implementation that *only* serializes to characters (because that's all that is required in a given context). >> The XSL and XML Query Working Groups discussed this comment again, and are unsure what change would resolve this issue. 
There does not appear to be any interoperability problem with not requiring implementations to support skipping the encoding phase. In addition, XSLT 1.0 did not require support for skipping the encoding phase of serialization, and such support has been raised as a requirement for XSLT 2.0. Would it be sufficient to remove the note in the Serialization specification that mentions that processors may implement an option that allows serialization to characters rather than serialization to bytes? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0065.html [3] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0109.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello. In [1], I wrote: << In addition, XSLT 1.0 did not require support for skipping the encoding phase of serialization, and such support has been raised as a requirement for XSLT 2.0. >> I omitted the word "not" from that sentence! It should read as follows: << In addition, XSLT 1.0 did not require support for skipping the encoding phase of serialization, and such support has NOT been raised as a requirement for XSLT 2.0. >> My apologies for any confusion. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Aug/0137.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [5] Section 2, point 3: "each separated by a single space": Inserting a space may not be the right thing, in particular for Chinese, Japanese, Thai,... which don't have spaces between words. This has to be checked very carefully. Regards, Martin.
This isn't trying to achieve linguistic separation, it is trying to achieve separation of tokens that meets the rules defined in XML Schema. XML Schema allows any sequence of whitespace characters between the items in a list, we mandate a single space character because that's the simplest whitespace sequence.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: > [5] Section 2, point 3: "each separated by a single space": > Inserting a space may not be the right thing, in particular for > Chinese, Japanese, Thai,... which don't have spaces between words. > This has to be checked very carefully. Thanks to Martin and the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and decided that no change to the Serialization specification is required. The reason for separating each pair of string values by a single space is not to achieve any kind of linguistic separation of words, but to separate values in a way that would be consistent with the requirements for an XML Schema type derived by list, for instance. May I ask the I18N Working Group to confirm that this response is acceptable? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, Many thanks for your responses. The I18N WG (Core TF) has looked at your response below, and unfortunately, we have to say that we cannot accept it. At the least, we need some more information exchange to make sure we understand each other well. Below, you write that the convention of inserting a space isn't for linguistic separation, but for creating XML Schema lists. This may be the intention of the spec-writers, but who guarantees that this is how this will be used? In cases where it will be used in other ways, there would be serious problems when adapting a query or transformation to a different language (in particular Chinese, Japanese, Thai,...). So in particular, we need to know more about the following questions: - How/when/why would sequences of strings (or other atomic data types) typically be generated? - How would e.g. combinations of data values, strings,... be serialized other than through this mechanism? We think that in many cases, in particular for XML Query, this could be the mechanism of choice to write out texts mixed with e.g. stringified numbers. - What would the effort be to change a script relying on this mechanism so that it works for Chinese/Japanese,...? - How can the distinction between strings and text nodes be used to affect/create the right behavior, and how can we make sure that programmers use the solution that is easily adapted to all kinds of languages? Regards, Martin. At 11:55 04/04/28 -0400, Henry Zongaro wrote: >Hello, > > In [1], Martin Duerst submitted the following comment on the Last >Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of >the I18N Working Group: > > > [5] Section 2, point 3: "each separated by a single space": > > Inserting a space may not be the right thing, in particular for > > Chinese, Japanese, Thai,... which don't have spaces between words. > > This has to be checked very carefully. > > Thanks to Martin and the working group for this comment. 
> > The XSL and XML Query Working Groups discussed the comment, and >decided that no change to the Serialization specification is required. The >reason for separating each pair of string values by a single space is not >to achieve any kind of linguistic separation of words, but to separate >values in a way that would be consistent with the requirements for an XML >Schema type derived by list, for instance. > > May I ask the I18N Working Group to confirm that this response is >acceptable? > >Thanks, > >Henry [On behalf of the XSL and XML Query Working Groups] >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >[2] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
> > Hello Henry, > > Many thanks for your responses. The I18N WG (Core TF) has looked > at your response below, and unfortunately, we have to say that we > cannot accept it. At the least, we need some more information > exchange to make sure we understand each other well. > > Below, you write that the convention of inserting a space isn't > for linguistic separation, but for creating XML Schema lists. > This may be the intention of the spec-writers, but who guarantees > that this is how this will be used? Sorry, Martin, but I think you have completely missed the point here. If an XML Schema declares the colors attribute as having type xs:NMTOKENS, and the typed value is the sequence ("red", "green", "blue"), then the correct lexical representation of this according to the rules in XML Schema is colors="red green blue". If you don't like that, you need to complain to the XML Schema WG. The places where XSLT/XQuery use space as a default separator are all associated with converting a typed value to the string value of a node, and are therefore closely associated with this XML Schema convention for representing lists. Of course we can't totally control how the facility is used, but we do provide a string-join function that allows any separator to be used in the lexical representation of a sequence, so we are not imposing any constraints on users. Michael Kay
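The round trip described here can be sketched outside XSLT. The following Python fragment is only an analogy: the helper names are hypothetical, and str.split() is used as an approximation of how an XML Schema processor tokenizes a list value on whitespace.

```python
def serialize_atomic_sequence(values):
    """Join the string form of each atomic value with a single space."""
    return " ".join(str(v) for v in values)

def read_list(lexical):
    """Approximate reading a list-typed value such as xs:NMTOKENS."""
    return lexical.split()

colors = ["red", "green", "blue"]
lexical = serialize_atomic_sequence(colors)

print(lexical)                        # red green blue
print(read_list(lexical) == colors)   # True: three tokens survive
print(read_list("".join(colors)))     # ['redgreenblue']: one token
```

Without the separator the three tokens collapse into one, which is why a default of no separator would not round-trip through a schema-aware reader.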
Hello Michael, At 17:52 04/05/06 +0100, Michael Kay wrote: > > > > Hello Henry, > > > > Many thanks for your responses. The I18N WG (Core TF) has looked > > at your response below, and unfortunately, we have to say that we > > cannot accept it. At the least, we need some more information > > exchange to make sure we understand each other well. > > > > Below, you write that the convention of inserting a space isn't > > for linguistic separation, but for creating XML Schema lists. > > This may be the intention of the spec-writers, but who guarantees > > that this is how this will be used? > >Sorry, Martin, but I think you have completely missed the point here. I may, or I may not. Given the complexity of the XSLT/XQuery specs, and the fact that I'm dealing with a lot of other things (not to speak about the rest of the I18N WG), it might not necessarily come as a surprise. >If an >XML Schema declares the colors attribute as having type xs:NMTOKENS, and the >typed value is the sequence ("red", "green", "blue"), then the correct >lexical representation of this according to the rules in XML Schema is >colors="red green blue". If you don't like that, you need to complain to the >XML Schema WG. There is no problem with that, if indeed these values are typed as xs:NMTOKENS. But we strongly suspect that there is a problem if there are some values that are just simple strings. The fact that simple strings and text nodes are not treated in the same way, we suspect, will often lead to confusion. >The places where XSLT/XQuery use space as a default separator are all >associated with converting a typed value to the string value of a node, and >are therefore closely associated with this XML Schema convention for >representing lists. Of course we can't totally control how the facility is >used, but we do provide a string-join function that allows any separator to >be used in the lexical representation of a sequence, so we are not imposing >any constraints on users. 
Would it be possible for you to write the following three examples: - An example (such as above with "red", "green", "blue", but with the actual code) where these are e.g. NMTOKENS, and where the serialization with spaces makes sense. - An example with e.g. strings used as intermediate text in a formatting-like operation (a la printf in C), where inserting spaces would happen, but would not be desired. - The previous example with the above 'string-join' function used to avoid the problems with spaces. Regards, Martin.
Hi, Martin. In [1], you wrote: Martin Duerst wrote on 2004-05-24 05:31:53 AM: > At 17:52 04/05/06 +0100, Michael Kay wrote: > >The places where XSLT/XQuery use space as a default separator are all > >associated with converting a typed value to the string value of a node, and > >are therefore closely associated with this XML Schema convention for > >representing lists. Of course we can't totally control how the facility is > >used, but we do provide a string-join function that allows any separator to > >be used in the lexical representation of a sequence, so we are not imposing > >any constraints on users. > > Would it be possible for you to write the following three examples: > > - An example (such as above with "red", "green", "blue", but with the > actual code) where these are e.g. NMTOKENS, and where the serialization > with spaces makes sense. Assume the following input document, where the type of the colors attribute is xs:NMTOKENS. <elem colors="red green blue"/> and the following stylesheet: <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="/"> <xsl:sequence select="data(elem/@colors)"/> </xsl:template> </xsl:stylesheet> The result of serialization will be the following external general parsed entity. <?xml version="1.0" encoding="UTF-8"?>red green blue That entity might be subsequently referenced in the content of an element that has the simple type xs:NMTOKENS. If the PSVI that results is used to construct an instance of the XPath/XQuery Data Model, the typed value of the element would be a sequence of three values of type xs:NMTOKEN; without the spaces, the typed value would be a sequence of a single value of type xs:NMTOKEN: "redgreenblue". 
Compare that with the result of the following stylesheet, where the rules for evaluating an attribute value template (section 5.5 of the last call draft of XSLT 2.0) state that each atomized value in the sequence that results from evaluating each XPath expression will be converted to a string, and separated by a space: <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="/"> <elem colors="{data(elem/@colors)}"/> </xsl:template> </xsl:stylesheet> Result: <?xml version="1.0" encoding="UTF-8"?><elem colors="red green blue"/> Again, if that serialized entity is assessed against a schema in which the colors attribute has type xs:NMTOKENS, the typed value of the attribute will be a sequence of three values of type xs:NMTOKEN. Similarly, the result of the following stylesheet, where the rules for constructing complex content (section 5.6.1 of XSLT 2.0) describe how a text node is created from the sequence of atomic values that results from evaluating the xsl:sequence instruction: <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="/"> <elem><xsl:sequence select="data(elem/@colors)"/></elem> </xsl:template> </xsl:stylesheet> Result: <elem>red green blue</elem> > - An example with e.g. strings used as intermediate text in a formating- > like operation (a la printf in C), where inserting spaces would happen, > but would not be desired. Is this the kind of example you're looking for? I've used an XPath expression to perform a simple date formatting operation, constructing the result as a sequence of strings. 
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:hz="http://www.example.org" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" exclude-result-prefixes="hz xs"> <xsl:function name="hz:format"> <xsl:param name="date" as="xs:date"/> <xsl:param name="format" as="xs:string"/> <xsl:sequence select=" for $c in (for $i in (1 to string-length($format)) return substring($format, $i, 1)) return if ($c = 'y') then get-year-from-date($date) else if ($c = 'd') then get-day-from-date($date) else if ($c = 'm') then get-month-from-date($date) else if ($c = 'M') then ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec') [get-month-from-date($date)] else $c"/> </xsl:function> <xsl:template match="/"> <doc> <v1> <xsl:sequence select="hz:format(xs:date('2004-12-21'), 'y-m-d')"/> </v1> <v2> <xsl:sequence select="hz:format(xs:date('2004-12-31'), 'M d, y')"/> </v2> </doc> </xsl:template> </xsl:stylesheet> This stylesheet will produce the following result, which is probably not what was intended. <doc><v1>2004 - 12 - 21</v1><v2>Dec 31 , 2004</v2></doc> > - The previous example with the above 'string-join' function used to > avoid the problems with spaces. If I change the definition of hz:format to add in a reference to string-join, specifying '' as the separator, <xsl:function name="hz:format"> <xsl:param name="date" as="xs:date"/> <xsl:param name="format" as="xs:string"/> <xsl:sequence select="string-join( for $c in (for $i in (1 to string-length($format)) return substring($format, $i, 1)) return ... , '')"/> </xsl:function> the result will be: <doc><v1>2004-12-21</v1><v2>Dec 31, 2004</v2></doc> Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0053.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
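The difference between the two stylesheets can be mirrored in a short Python sketch. This is an analogy rather than XSLT: the function name format_items is hypothetical, " ".join stands in for the default single-space separator applied to a sequence of atomic values, and "".join stands in for string-join with '' as the separator.

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def format_items(d: date, fmt: str) -> list:
    """Return one string item per character of the format string."""
    items = []
    for c in fmt:
        if c == "y":
            items.append(str(d.year))
        elif c == "d":
            items.append(str(d.day))
        elif c == "m":
            items.append(str(d.month))
        elif c == "M":
            items.append(MONTHS[d.month - 1])
        else:
            items.append(c)   # literal character from the format
    return items

items = format_items(date(2004, 12, 21), "y-m-d")
print(" ".join(items))  # default-style separator: 2004 - 12 - 21
print("".join(items))   # explicit empty separator: 2004-12-21
```

The items are the same in both cases; only the separator chosen when they are joined differs.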
Hello Henry, [this is a reply to http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0038.html] We have looked at your code examples below in detail. The examples you are giving look reasonable, but what we are concerned about is cases where text is not put together programmatically, but just concatenated, e.g. in an example such as <p>Document creation date: <xsl:sequence select="hz:format(xs:date('2004-12-21'), 'y-m-d')"/>.</p> Overall, I think that the convention of using a space between strings, inherited from SGML NMTOKENS and IDREFS, should not be the default in XQuery and XSLT to concatenate strings. Either there should be a function, e.g. called stringify-tokens, to handle cases such as "red green blue", which I guess would make the first of your examples <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="/"> <elem colors="stringify-tokens({data(elem/@colors)})"/> </xsl:template> </xsl:stylesheet> or alternatively, the data model should be able to distinguish, based on schema information, between tokens such as NMTOKENS and IDREFS and plain strings that don't need spaces when concatenated. Simply defining that all strings behave like tokens because some strings are tokens doesn't seem to make sense at all. Regards, Martin.
Martin Duerst wrote: > Overall, I think that the convention of using a space between > strings, inherited from SGML NMTOKENS and IDREFS, should not be the > default in XQuery and XSLT to concatenate strings. Hi Martin, For concatenating strings, which is what the concat() function does, we do not insert anything. I think Henry has shown [1] that our string manipulation library is pretty good at allowing other delimiters to be inserted if needed. Serializing a sequence of atomic values is not the same thing as "concatenating strings". The lexical representation of these atomic values is given by XML Schema, and the delimiters used are the delimiters used by XML Schema. The default for serializing a sequence of tokens defined by XML Schema pretty much has to be the format defined by XML Schema, or else XML processors won't be able to read serialized documents. So for serialization, I think your beef is with XML Schema. Linguistic tokens and delimiters are not the same as computer language tokens and delimiters. In my opinion, the biggest problem occurs not when they differ, but when they are the same. That's why we have to invent conventions like camelCase or hyphenated-names to allow ourselves to create computer language tokens that consist of multiple linguistic tokens. XML Schema could have allowed users to create a sequence of string values that contain spaces, as in: <sequenceOfRoads>Gibson Road, Main Street</sequenceOfRoads> That would require XML Schema to allow an alternate delimiter to be specified. It doesn't. And it shouldn't - in XML, the best way to delimit individual items is to use markup: <roads> <road>Gibson Road</road> <road>Main Street</road> </roads> As a markup language, XML exists for the sole purpose of clearly identifying data. Let's use it! The alternative is to use microparsing. But that's not how XML works, and XQuery is based on XML. We support XML Schema, and that's what our serialization does by default. 
If you want a different serialization, you can use string manipulation to create whatever you want, but an XML Schema processor won't be able to recognize the tokens. Jonathan My opinion only. Not on behalf of anyone.
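The distinction drawn above between concat() and the serialization of a sequence of atomic values can be illustrated with a small Python sketch. The function names here are hypothetical and are not part of XQuery, XSLT, or the Serialization draft; the sketch only assumes the behaviour described in this thread (a single space between serialized atomic values, nothing inserted by concatenation).

```python
def serialize_atomic_sequence(values):
    """Join the string forms of atomic values with single spaces,
    mirroring the XML Schema list delimiter discussed above."""
    return " ".join(str(v) for v in values)


def concat(*strings):
    """Plain concatenation: nothing is inserted between the arguments."""
    return "".join(strings)


colors = serialize_atomic_sequence(["red", "green", "blue"])
roads = serialize_atomic_sequence(["Gibson Road", "Main Street"])

# Tokens without embedded spaces survive a split on whitespace...
color_tokens = colors.split()
# ...but values that contain spaces do not, which is why markup
# (<road>...</road>) is suggested for such data.
road_tokens = roads.split()
```

In this sketch the three colour tokens round-trip, while the two road names come back as four tokens.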
Hi Martin. The joint working groups have discussed your objection to the resolution to qt-2004Feb0362-02. The WG has decided to endorse Jonathan Robie's response [1] to your note. We will leave the issue's status as "objected". -scott [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Oct/0064.html
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [6] Section 3, 'encoding': Given that this is already required for the XML output method, we think it's highly desirable to make the requirement for support for UTF-8 and UTF-16 general (including text). Regards, Martin.
I can't think of any reason not to make this change.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: << [6] Section 3, 'encoding': Given that this is already required for the XML output method, we think it's highly desirable to make the requirement for support for UTF-8 and UTF-16 general (including text). >> Thanks to you and the I18N working group for this comment. The XSL and XML Query Working Groups discussed the working group's comment, and decided to accept the I18N working group's suggestion. The serialization specification will be modified to require support for UTF-8 and UTF-16 encodings for all the output methods defined by the specification - namely, the xml, xhtml, html and text output methods. As this is the change the I18N working group proposed, I believe the response should be acceptable to the working group. May I ask you to confirm that it is? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [7] Section 3, 'encoding': Here or for each individual output method, something should be said about the BOM. We think it should be the following: - XML/XHTML: UTF-16: required; UTF-8: may be used. - HTML/text: UTF-16: recommended; UTF-8: may be used. Regards, Martin.
I agree. In Saxon, I've added an extension attribute to control whether a BOM should be emitted, and I think it would be a good idea to make this a standard feature. The default should be yes for UTF-16, no for UTF-8.
Martin, In [1], you submitted the following comments on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [7] Section 3, 'encoding': Here or for each individual output method, something should be said about the BOM. We think it should be the following: - XML/XHTML: UTF-16: required; UTF-8: may be used. - HTML/text: UTF-16: recommended; UTF-8: may be used. [8] Section 3, 'encoding': This should say that for UTF-16, endianness implementation-dependent (or implementation-defined) >> Thanks to you and the I18N Working Group for these comments. The XSL and XML Query Working Groups discussed the comments, and decided to add a byte-order-mark parameter to the Serialization specification to control whether a Byte Order Mark is written. The actual byte order used is implementation-dependent. If the concept of a Byte Order Mark does not make sense for the particular encoding selected, the byte-order-mark parameter is ignored. May I ask you to confirm that this response is acceptable to the working group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
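A minimal Python sketch of how a byte-order-mark parameter might behave, under the assumptions stated in the response above: the parameter controls whether a BOM is written, the byte order itself is implementation-dependent (this sketch arbitrarily picks little-endian for UTF-16), and the parameter is ignored for encodings where a BOM has no meaning. The function name and signature are illustrative only and are not taken from the specification.

```python
import codecs

# Encodings for which a BOM is meaningful in this sketch, mapped to the
# BOM bytes and the BOM-less Python codec used to encode the payload.
# Little-endian UTF-16 is an arbitrary, implementation-dependent choice.
_BOMS = {
    "utf-8": (codecs.BOM_UTF8, "utf-8"),
    "utf-16": (codecs.BOM_UTF16_LE, "utf-16-le"),
}


def serialize_bytes(text, encoding="utf-8", byte_order_mark=False):
    """Encode text, optionally prefixing a byte order mark."""
    key = encoding.lower()
    if key in _BOMS:
        bom, codec = _BOMS[key]
        data = text.encode(codec)
        return bom + data if byte_order_mark else data
    # A BOM makes no sense for this encoding: the parameter is ignored.
    return text.encode(encoding)
```

For example, `serialize_bytes("a", "UTF-16", byte_order_mark=True)` yields the two BOM bytes followed by the little-endian code unit for "a".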
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [8] Section 3, 'encoding': This should say that for UTF-16, endianness implementation-dependent (or implementation-defined) Regards, Martin.
Agreed.
Martin, In [1], you submitted the following comments on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [7] Section 3, 'encoding': Here or for each individual output method, something should be said about the BOM. We think it should be the following: - XML/XHTML: UTF-16: required; UTF-8: may be used. - HTML/text: UTF-16: recommended; UTF-8: may be used. [8] Section 3, 'encoding': This should say that for UTF-16, endianness implementation-dependent (or implementation-defined) >> Thanks to you and the I18N Working Group for these comments. The XSL and XML Query Working Groups discussed the comments, and decided to add a byte-order-mark parameter to the Serialization specification to control whether a Byte Order Mark is written. The actual byte order used is implementation-dependent. If the concept of a Byte Order Mark does not make sense for the particular encoding selected, the byte-order-mark parameter is ignored. May I ask you to confirm that this response is acceptable to the working group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [9] Section 3, 'encoding': "If this parameter is not specified, and the output method does not specify any additional requirements, the encoding used is implementation defined." This should be more specific. In the absence of an 'encoding' parameter, information e.g. given to an implementation via an option, and specific information for a particular 'host language' (e.g. other than XQuery or XSLT), there should be a default of UTF-8. Regards, Martin.
Off-hand, I don't see any objection to this except that it might give some vendors a backwards compatibility problem.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [9] Section 3, 'encoding': "If this parameter is not specified, and the output method does not specify any additional requirements, the encoding used is implementation defined." This should be more specific. In the absence of an 'encoding' parameter, information e.g. given to an implementation via an option, and specific information for a particular 'host language' (e.g. other than XQuery or XSLT), there should be a default of UTF-8. >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed your comment. In response to another last call issue, the encoding parameter is no longer optional. Therefore, any host specification is obliged to specify how the value of the parameter is determined. This is reflected in the 23 July Working Draft of Serialization.[2] There is no longer a need for any change to Serialization. May I ask you to confirm that this response is acceptable to the I18N Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [12] [Should be "10". HZ] Section 3, 'escape-uri-attributes' (and other places in this spec): RFC 2396, section 2.4.1, only specifies how to escape a string of bytes in an URI, and cannot directly be applied to a string of (Unicode) characters. In accordance with the IRI draft and many other W3C specifications, this must be specified to use UTF-8 first and then use RFC 2396, section 2.4.1 (%-escaping). Regards, Martin.
Agreed.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: << [12] [Should be "10". HZ] Section 3, 'escape-uri-attributes' (and other places in this spec): RFC 2396, section 2.4.1, only specifies how to escape a string of bytes in an URI, and cannot directly be applied to a string of (Unicode) characters. In accordance with the IRI draft and many other W3C specifications, this must be specified to use UTF-8 first and then use RFC 2396, section 2.4.1 (%-escaping). >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and noted that Section 16.1 of XSLT 1.0 [2] relied upon Appendix B.2.1 of HTML 4.0 [3] for the normative definition of URI escaping. The working groups also noted that some specifications have duplicated the description of URI escaping, while still others have relied on diverse references for the normative definition of the URI escaping algorithm. In particular, the working groups noted that Section 3.2.17 of the PER of XML Schema: Datatypes 2nd ed. [4] refers to Section 5.4 of XML Linking Language [5] for the normative definition of URI escaping. The working groups decided to follow the lead of the XML Schema Working Group, and adopted the following changes: . In Section 6 of Serialization, sixth bullet, change << escape non-ASCII characters in URI attribute values using the method recommended in Section 2.4.1 of [RFC2396]. >> to << escape non-ASCII characters in URI attribute values using the method defined by Section 5.4 Locator Attribute of [XML Linking Language], except that relative URIs must not be absolutized. >> . In Section 7.2 of Serialization, third paragraph change << escape non-ASCII characters in URI attribute values using the method recommended in [RFC2396] (section 2.4.1). 
>> to << escape non-ASCII characters in URI attribute values using the method defined by Section 5.4 Locator Attribute of [XML Linking Language], except that relative URIs must not be absolutized. >> May I ask you to confirm that this response is acceptable to the working group? If not, we would ask the I18N working group to suggest the most appropriate normative reference for URI escaping that should be used by all new W3C Recommendations. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://www.w3.org/TR/xslt#section-HTML-Output-Method [3] http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1 [4] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/#anyURI [5] http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
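The "UTF-8 first, then %-escaping" approach requested in the original comment can be sketched in Python as follows. This is only an illustration of escaping non-ASCII characters: the function name is hypothetical, and the sketch leaves every ASCII character untouched, whereas the XLink rule the working groups adopted also covers certain ASCII characters. Relative URIs are left relative, as the resolution requires.

```python
def escape_non_ascii(uri):
    """Percent-escape each non-ASCII character as its UTF-8 octets.

    ASCII characters are copied through unchanged, so an existing
    %-escape or a relative reference is not altered.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then escape every octet.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

For instance, the relative URI `r\u00e9sum\u00e9.html` (with two occurrences of U+00E9) becomes `r%C3%A9sum%C3%A9.html` and remains relative.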
Hi Martin. This is another gentle reminder that we are waiting for a confirmation for the resolution of this issue so that we might close it.
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [11] Section 3, 'include-content-type': Why is this parameter needed? It seems that it may be better to always include a <meta> element. Please remove the parameter or tell us when/why it's necessary to not have a <meta> element. Regards, Martin.
This parameter has been requested by users a number of times, but the situations that justify it are difficult to describe concisely. The simplest case is where the user wants to output the meta element "by hand", to give greater control. The other cases I've seen are where the encoding isn't known until after subsequent stages in the processing pipeline.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [11] Section 3, 'include-content-type': Why is this parameter needed? It seems that it may be better to always include a <meta> element. Please remove the parameter or tell us when/why it's necessary to not have a <meta> element >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and noted that there are many situations in which users have found there to be a need for the include-content-type parameter. A user might not want the serialization process to produce a META element because some post-processing phase will be responsible for creating that element or because the sequence that is input to serialization already contains such a META element that the user would like the serialization process to preserve. Users sometimes find it necessary to do this in order to work around bugs in web server software. The working group decided that no change to the Serialization draft was necessary. May I ask you to confirm that this response is acceptable to the I18N Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, The I18N WG has looked at the response below. We looked at two documents: http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/ http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/ The LC version says 'value is implementation defined'. The new WD doesn't even say that (or we haven't found it). Your email below is written as if it was 'yes' by default. We think that (default 'yes') would be the right thing to do. If the spec indeed uses 'yes' as the default, please send us a pointer to the place where it does. If not, we would not be satisfied with this resolution. Regards, Martin.
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [12] The description of 'media-type' is confusing. Does it change something in the output, or only in the way the output is labelled? Does it affect the <meta>, if output? Can it affect other things, e.g. a Content-Type header in HTTP? This should be clarified. Regards, Martin.
You're not the only one who's confused. It's often used by transformation servlets to set the HTTP headers, but as far as the serializer itself is concerned, it's documentary.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [12] The description of 'media-type' is confusing. Does it change something in the output, or only in the way the output is labelled? Does it affect the <meta>, if output? Can it affect other things, e.g. a Content-Type header in HTTP? This should be clarified. >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed the comment. Yes, the setting of the media-type parameter may affect the sequence of octets that is the result of serialization using the html and xhtml output methods, as is clearly indicated in [2] and [3]. In addition, the serializer may use the parameter to influence things that are outside of the scope of this specification, such as an HTTP header. To make this clear, the following will be added to the description of the media-type parameter in the table in Section 3 of Serialization: << If the destination of the serialized output is annotated with a media type, this parameter may be used to provide such an annotation. For example, it may be used to set the media type in an HTTP header. >> Finally, the reference to RFC 2376 that currently appears in the description of the media-type parameter is not appropriate; that RFC defines XML media types. The reference will be changed to RFC 2046[4] "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types." May I ask you to confirm that this response is acceptable to the I18N Working Group? 
Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#N10F34 [3] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#xhtml-output [4] ftp://ftp.rfc-editor.org/in-notes/rfc2046.txt ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [13] Section 3, 'normalize-unicode': Using Normalization Form C is the right thing, but XML 1.1, in accordance with the Character Model, defines some additional start conditions in some cases. How are these guaranteed (e.g. by adding an initial space if necessary)? If there is no such guarantee, there should at least be a warning, but a guarantee is highly preferable. Regards, Martin.
Martin, In [1], you submitted the following comment on the Serialization last call draft. Martin Duerst wrote on 2004-02-15 12:37:30 PM: > [13] Section 3, 'normalize-unicode': Using Normalization Form C is > the right thing, but XML 1.1, in accordance with the Character > Model, defines some additional start conditions in some cases. > How are these guaranteed (e.g. by adding an initial space if > necessary)? If there is no such guarantee, there should at least > be a warning, but a guarantee is highly preferable. Our thanks to you and the I18N WG for submitting this comment. The XSL and XQuery Working Groups discussed the comment and related comments, and decided to make the following changes to the normalize-unicode parameter: 1. Rename the parameter to "normalization-form". 2. The possible values of the parameter will be "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" or an implementation-defined normalization form. The default value is "none". We will also add a note advising of the interoperability problems that can arise by using anything other than NFC. 3. All of "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" and any implementation-defined value are permitted for the xml, xhtml and text output methods. The values "NFC", "fully-normalized" and "none" must be supported by an implementation for these output methods. 4. The normalization-form parameter is permitted to have the values "NFC", "NFD", "NFKC", "NFKD", "none" or an implementation-defined value if the output method is "html". The values "NFC" and "none" must be supported for the html output method. The value "fully-normalized" is not permitted if the output method is "html". 5. In the case of "fully-normalized", the normalization is the same as for NFC, but the processor must signal a serialization error if any of the "relevant constructs" of the result would begin with a combining character. 
We believe that item 5 on this list addresses the particular concern raised in the comment, that guarantees should be provided that the start conditions of the Character Model are not violated. May I ask you to confirm that this is an acceptable response to the I18N WG's comment? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
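The proposed values of the normalization-form parameter can be sketched with Python's standard unicodedata module. This is a rough illustration, not the specification's algorithm: the function and exception names are hypothetical, the whole input string is treated as a single "relevant construct", and "begins with a combining character" is approximated by a non-zero canonical combining class, which may be narrower than the Character Model's definition.

```python
import unicodedata

_UNICODE_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


class SerializationError(Exception):
    """Raised when the requested normalization cannot be guaranteed."""


def apply_normalization_form(text, form="none"):
    """Apply the normalization requested by a normalization-form value."""
    if form == "none":
        return text  # the proposed default: no normalization at all
    if form in _UNICODE_FORMS:
        return unicodedata.normalize(form, text)
    if form == "fully-normalized":
        result = unicodedata.normalize("NFC", text)
        # Approximation: a leading character with a non-zero combining
        # class is treated as a violation of the start condition.
        if result and unicodedata.combining(result[0]) != 0:
            raise SerializationError("construct begins with a combining character")
        return result
    raise ValueError("unsupported normalization form: " + form)
```

With this sketch, "e" followed by U+0301 composes to U+00E9 under "NFC", while a string that starts with U+0301 is rejected under "fully-normalized".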
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [14] Section 3, four phases of serialization: Character expansion comes before Encoding, but encoding depends on character expansion (using numeric character references for characters that don't exist in a certain encoding). This has to be sorted out very carefully and explained in detail, ideally with examples. There's also an interaction between mapping and normalization. If there's a mapping combining grave->&#x300;, normalization must be aware that &#x300; is not an ASCII string! Regards, Martin.
You are probably right that we need to analyze and explain the interactions between the different options better than we do at the moment.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [14] Section 3, four phases of serialization: Character expansion comes before Encoding, but encoding depends on character expansion (using numeric character references for characters that don't exist in a certain encoding). This has to be sorted out very carefully and explained in detail, ideally with examples. There's also an interaction between mapping and normalization. If there's a mapping combining grave->&#x300;, normalization must be aware that &#x300; is not an ASCII string! >> Thanks to you and the I18N Working Group for this comment. The XSL and XML Query Working Groups discussed the comment, and decided, because of the interactions between Unicode normalization and creation of character references, to fold together character expansion and Unicode normalization. In addition, the working groups decided to add creation of character references to the character expansion phase, because it had not been explicitly mentioned as part of that phase. Specifically, the working groups decided to replace the second and third bullets of Section 4 of Serialization with the following text: << 2. Character expansion is concerned with the representation of characters appearing in text and attribute nodes in the instance of the data model. The substitution processes that may apply are listed below, in priority order: a character that is handled by one process in this list will be unaffected by processes appearing later in the list, except that a character affected by Unicode normalization may be affected by creation of CDATA sections or by character escaping o URI escaping (in the case of URI-valued attributes in the HTML and XHTML output methods), as determined by the escape-uri-attributes parameter o Character mapping, as determined by the use-character-maps parameter. 
Text nodes that are children of elements specified by the cdata-section-elements parameter are not affected by this step. o Unicode Normalization, if requested by the normalization-form parameter. Unicode normalization is applied to the character stream that results after all markup generation and character expansion has taken place. For the definitions of the various normalization forms, see [Character Model for the World Wide Web 1.0] The meanings associated with the possible values of the normalization-form parameter are as follows: o NFC specifies the serialized result should be in Unicode Normalization Form C. o NFD specifies the serialized result should be in Unicode Normalization Form D. o NFKC specifies the serialized result should be in Unicode Normalization Form KC. o NFKD specifies the serialized result should be in Unicode Normalization Form KD. o fully-normalized specifies the serialized result should be in fully normalized form. o none specifies that no Unicode normalization should be applied. o An implementation-defined value has an implementation-defined effect. o Creation of CDATA sections, as determined by the cdata-section-elements parameter. Note that this is also affected by the encoding parameter, in that characters not present in the selected encoding cannot be represented in a CDATA section. o Escaping according to XML or HTML rules of special characters and of characters that cannot be represented in the selected encoding. For example replacing < by &lt;. >> The Unicode Normalization phase becomes the third step of character expansion. Character mapping becomes the second step, with the clarification that it does not affect elements to which cdata-section-elements applies. This was done to make it clear that any characters affected by character mapping are not affected by Unicode Normalization. 
The lead-in to the bulleted list will be modified so that CDATA section creation and escaping still apply to characters affected by Unicode Normalization - this is a consequence of trying to fold the two together. Finally, the last bullet will be modified to make it clear that not only special characters, but characters that can't be represented in the selected encoding are affected by that final step. May I ask you to confirm that this response is acceptable to the I18N Working Group? Thanks, Henry ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
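The priority ordering described in this response (a character handled by character mapping is not touched by later steps, and the final step escapes both special characters and characters the encoding cannot represent) might be sketched in Python as below. The function name and the dictionary-based character map are assumptions made for illustration; URI escaping, Unicode normalization and CDATA sections are deliberately left out of the sketch.

```python
_SPECIAL = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def expand_text(text, encoding, character_map=None):
    """Expand the characters of a text node for the given encoding."""
    character_map = character_map or {}
    out = []
    for ch in text:
        if ch in character_map:
            # Mapped characters are emitted as-is and skip later steps.
            out.append(character_map[ch])
        elif ch in _SPECIAL:
            out.append(_SPECIAL[ch])
        else:
            try:
                ch.encode(encoding)
                out.append(ch)
            except UnicodeEncodeError:
                # Not representable in the encoding: use a character reference.
                out.append("&#x{:X};".format(ord(ch)))
    return "".join(out)
```

For example, with ISO-8859-1 as the encoding the euro sign is written as a character reference, while with UTF-8 it is written directly.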
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [15] Section 4, "To anticipate the proposed changes to end-of-line handling in XML 1.1, implementations may also output the characters x85 and x2028 as character references. This will not affect the way they are interpreted by an XML 1.0 parser.": XML 1.1 is now a REC, so this is no longer anticipated. See http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-line-ends Regards, Martin.
Yes. Now that XML+NS 1.1 is at Rec status, I think the WGs need to take a fresh top-level look at our policy towards them; serialization is just one aspect of this.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: > [15] Section 4, "To anticipate the proposed changes to end-of-line > handling in XML 1.1, implementations may also output the characters > x85 and x2028 as character references. This will not affect the way > they are interpreted by an XML 1.0 parser.": XML 1.1 is now a REC, > so this is no longer anticipated. See > http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-line-ends Thanks to Martin and the working group for this comment. The XSL and XML Query Working Groups discussed the comment, and agreed that the Serialization specification should be amended so that it no longer refers to XML 1.1 as if it were not yet a recommendation. Furthermore, the working groups decided that the handling of x85 and x2028 should be such that they can be successfully processed by either an XML 1.0 or an XML 1.1 processor without being normalized to a line-feed character, even if the value of the version parameter is 1.0. Following are the changes required to implement that change: Replace the paragraph after the bulleted list in Section 4 with the following: << A consequence of this rule is that certain whitespace characters must be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically, CR, NEL and LINE SEPARATOR characters in text nodes must be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents; while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes must be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents >> And replace the note following the bulleted list with the following note: << Note: XML 1.0 did not permit processors to normalize NEL or LINE SEPARATOR characters to a LINE FEED character. However, if a document entity that specifies version 1.1 invokes an external general parsed entity with no TextDecl or a TextDecl that specifies a version of 1.0, the external parsed entity is processed according to the rules of XML 1.1. For this reason, NEL and LINE SEPARATOR characters in text and attribute nodes must always be escaped using character references or CDATA sections, regardless of the value of the version parameter. >> May I ask the working group to confirm that this response is acceptable to it? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
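The escaping rule proposed in the response above can be illustrated with a minimal Python sketch. The names `TEXT_ESCAPES`, `ATTR_ESCAPES` and `escape_whitespace` are hypothetical and are not taken from the Serialization draft; the sketch covers only the whitespace characters named in the proposed text and deliberately ignores the other escaping a serializer must do (for example `&` and `<`).

```python
# Characters that the proposed text says must be written as character
# references in text nodes (CR, NEL, LINE SEPARATOR).
TEXT_ESCAPES = {
    "\r": "&#xD;",
    "\u0085": "&#x85;",
    "\u2028": "&#x2028;",
}

# Attribute nodes additionally escape NL and TAB.
ATTR_ESCAPES = {
    **TEXT_ESCAPES,
    "\n": "&#xA;",
    "\t": "&#x9;",
}


def escape_whitespace(value: str, in_attribute: bool = False) -> str:
    """Replace the listed whitespace characters with character references.

    Illustrative only: markup characters such as '&' and '<' are not handled.
    """
    table = ATTR_ESCAPES if in_attribute else TEXT_ESCAPES
    return "".join(table.get(ch, ch) for ch in value)


if __name__ == "__main__":
    print(escape_whitespace("line one\r\nline two\u0085line three"))
    print(escape_whitespace("a\tb\nc", in_attribute=True))
```

Because NEL and LINE SEPARATOR are escaped regardless of the version parameter, the same output should survive a round trip through either an XML 1.0 or an XML 1.1 parser without those characters being normalized to a line feed.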
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [16] Section 4.2 (XML output method, encoding): "If no encoding parameter is specified, then the processor must use either UTF-8 or UTF-16.": It may be desirable to further narrow this to UTF-8 for higher predictability. On the other hand, this should not say "If no encoding parameter is specified", but "If no encoding is specified (either with an encoding parameter or externally)" to allow e.g. specification of encoding with an option. Regards, Martin.
On the first point: yes, perhaps. On the second, the serializer is driven by a set of parameters. I think that by the time the serializer is invoked, the parameter values have been fully computed, regardless where they came from, so the serialization spec does not need to discuss different ways of supplying the parameters.
Michael Kay a écrit : > On the second, the serializer is driven by a set of parameters. I think > that by the time the serializer is invoked, the parameter values have > been fully computed, regardless where they came from, so the > serialization spec does not need to discuss different ways of supplying > the parameters. If the parameters are the only way to influence serialization behaviour, then this should be clarified. Section 3 now starts "There are a number of parameters that influence...", which doesn't seem to claim to exhaustiveness.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [16] Section 4.2 (XML output method, encoding): "If no encoding parameter is specified, then the processor must use either UTF-8 or UTF-16.": It may be desirable to further narrow this to UTF-8 for higher predictability. On the other hand, this should not say "If no encoding parameter is specified", but "If no encoding is specified (either with an encoding parameter or externally)" to allow e.g. specification of encoding with an option. >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed your comment. Regarding the first point: in response to other Last Call comments, Serialization no longer specifies default values for parameters. This is reflected in the 23 July Working Draft of Serialization.[2] XSLT and XQuery now specify how the value of the encoding parameter is determined in all circumstances, so no change to the Serialization specification is required in response to that part of the comment. Regarding the second point: again, all the serialization parameters are fully determined by whatever mechanisms are provided by the host specification. Beyond that, serialization has implementation-dependent and implementation-defined aspects, so it should be clear that not all of a serializer's behaviour is governed by the settings of the parameters. The working groups feel no change to the Serialization specification is required in this regard. May I ask you to confirm that this response is acceptable to the I18N Working Group? 
Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, The I18N WG (Core TF) has looked at your reply. We are not satisfied with your answer. On a procedural point, we would like to point out that moving the defaults elsewhere makes them very difficult to check, and risks that agreements between WGs are forgotten. In particular, it is not sufficient to just close a comment on the specification it is made; it should be transferred to the other specification(s) where it now applies. On the actual issue, our main concern is to make sure that defaults are actually specified appropriately. In XQuery, the default seems to be 'implementation-defined': http://www.w3.org/TR/2004/WD-xquery-20040723/#id-xq-serialization-parameters We are not at all convinced that this will lead to the necessary degree of interoperability. For XSLT, there is no new public WD. A pointer to or explanation of the current solution for this issue for XSLT would be appreciated. Without having a look at it, we cannot assess whether we are satisfied with the resolution to our comment. We would also like to mention that while there may be specific considerations for each specification, using the same defaults where possible will make things easier for users, and will lead to better overall interoperability. Regards, Martin.
Martin Duerst wrote: > In XQuery, the default seems to be 'implementation-defined': > > http://www.w3.org/TR/2004/WD-xquery-20040723/#id-xq-serialization-parameters > > > We are not at all convinced that this will lead to the necessary > degree of interoperability. He was referring to this earlier comment: >> [16] Section 4.2 (XML output method, encoding): "If no encoding >> parameter is specified, then the processor must use either UTF-8 or >> UTF-16.": It may be desirable to further narrow this to UTF-8 for >> higher predictability. On the other hand, this should not say "If >> no encoding parameter is specified", but "If no encoding is >> specified (either with an encoding parameter or externally)" to >> allow e.g. specification of encoding with an option. Hi Martin, According to the XML Spec: > All XML processors MUST accept the UTF-8 and UTF-16 encodings of > Unicode 3.1 [Unicode3]; the mechanisms for signaling which of the two > is in use, or for bringing other encodings into play, are discussed > later, in 4.3.3 Character Encoding in Entities. Serialization produces XML for XML processors. Since all XML processors are required to accept the encodings that XQuery serialization is allowed to produce, the distinction between the two encodings should not make a difference unless an XML processor fails to implement the XML specification. Are you suggesting that XML processors should not be required to accept both encodings? It's true that supporting both encodings complicates implementations, especially when the various normalizations are taken into account. But in XML, I think that's a done deal, and I think that we incurred this complication largely at the urging of the I18N community. Jonathan Not on behalf of anybody.
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [17] Section 4.2 (XML output method, encoding): "When outputting a newline character in the data model, the implementation is free to represent it using any character sequence that will be normalized to a newline character by an XML parser,...": This should probably say that for interoperability, it is better to avoid x85 and x2028. Regards, Martin.
I don't see a specific need to say that: if you're generating XML 1.0 then you need to avoid these characters and if you're generating XML 1.1 then you don't. This seems to be covered by the statement as written.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. > [17] Section 4.2 (XML output method, encoding): "When outputting a newline > character in the data model, the implementation is free to represent > it using any character sequence that will be normalized to a newline > character by an XML parser,...": This should probably says that > for interoperability, it is better to avoid x85 and x2028. In [2], Michael Kay responded: > I don't see a specific need to say that: if you're generating XML 1.0 > then you need to avoid these characters and if you're generating XML 1.1 > then you don't. This seems to be covered by the statement as written. Thanks to Martin and the I18N Working Group for this comment. The XSL and XQuery Working Groups discussed the comment, and agreed with Michael Kay that the statement regarding the representation of newline characters in the serialized document was correct as written, and that no change is required. May I ask the I18N Working Group to confirm that this response is acceptable? Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, Sorry for the delay in replying to your mails. The I18N WG (Core TF) has looked at your response, and we are glad to tell you that it is acceptable for us. Regards, Martin. At 10:59 04/04/13 -0400, Henry Zongaro wrote: >Hello, > > In [1], Martin Duerst submitted the following comment on the Last >Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of >the I18N Working Group. > > > [17] Section 4.2 (XML output method, encoding): "When outputting a >newline > > character in the data model, the implementation is free to represent > > it using any character sequence that will be normalized to a newline > > character by an XML parser,...": This should probably says that > > for interoperability, it is better to avoid x85 and x2028. > > In [2], Michael Kay responded: > > > I don't see a specific need to say that: if you're generating XML 1.0 > > then you need to avoid these characters and if you're generating XML 1.1 > > then you don't. This seems to be covered by the statement as written. > > Thanks to Martin and the I18N Working Group for this comment. > > The XSL and XQuery Working Groups discussed the comment, and agreed >with Michael Kay that the statement regarding the representation of >newline characters in the serialized document was correct as written, and >that no change is required. > > May I ask the I18N Working Group to confirm that this response is >acceptable? > >Thanks, > >Henry [On behalf of the XSL and XQuery Working Groups.] >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >[2] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [18] Section 4.5 (XML output method, omit-xml-declaration): "The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16.": This disallows producing XML other than UTF-8 or UTF-16 without an xml declaration even though this is legal e.g. if served over HTTP with a corresponding charset parameter. We are not sure this is intended, and we are not sure this is a good thing. On the other hand, omit-xml-declaration must also be ignored if version is not 1.0. Regards, Martin.
This rule overriding omit-xml-declaration has proved controversial with some users, usually because they want to output fragments of XML that they can concatenate into a single file. We should review it. On the other hand, users do complain if the serializer produces output that an XML parser then rejects.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. > [18] Section 4.5 (XML output method, omit-xml-declaration): "The > omit-xml-declaration parameter must be ignored if the standalone > parameter is present, or if the encoding parameter specifies a > value other than UTF-8 or UTF-16.": This disallows producing > XML other than UTF-8 or UTF-16 without an xml declaration even > though this is legal e.g. if served over HTTP with a corresponding > charset parameter. We are not sure this is intended, and we > are not sure this is a good thing. On the other hand, > omit-xml-declaration must also be ignored if version is not 1.0. Thanks to Martin and the I18N Working Group for this comment. The XSL and XQuery Working groups discussed this comment. Regarding the second point, although XML 1.1 requires a document entity to have an XML declaration, it does not require an external general parsed entity to have a text declaration. The setting of the omit-xml-declaration parameter could still be meaningful, even if the version parameter has a value other than 1.0. Regarding the first point, as originally written, XML 1.0 required an XML declaration or a text declaration if the encoding of the document or external general parsed entity was anything other than UTF-8 or UTF-16. XSLT 1.0 enforced that requirement in its serialization mechanism. The draft of Serialization inherited that behaviour from XSLT 1.0. However, an erratum to XML 1.0 removed that requirement. In response to both points, the working groups decided that the Serialization specification should permit an XML declaration or text declaration to be omitted in precisely those circumstances in which it can be omitted according to XML 1.0 and XML 1.1. 
In particular, the working groups decided that if the serialized result could be considered to be the text declaration of an external general parsed entity, the omit-xml-declaration parameter could have the value yes or the value no, and the parameter's setting would take effect. They further decided that if the serialized result could only be considered to be a document entity because o the standalone parameter had the value yes or no; or o the version parameter had a value other than 1.0 and the doctype-system parameter was supplied the omit-xml-declaration parameter must have the value no. Otherwise, a serialization error results. A host language would, of course, have the option of ensuring such conflicts never arise through whatever language-specific mechanism it uses to specify serialization parameters. May I ask the working group to confirm that that response is acceptable? Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, The I18N WG (Core TF) has looked at your response. We can confirm that we are okay with your solution under the assumption that the default is still the same (i.e. omit-xml-declaration='no', i.e. it is the default to omit an XML declaration). Regards, Martin. At 11:17 04/04/13 -0400, Henry Zongaro wrote: >Hello, > > In [1], Martin Duerst submitted the following comment on the Last >Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of >the I18N Working Group. > > > [18] Section 4.5 (XML output method, omit-xml-declaration): "The > > omit-xml-declaration parameter must be ignored if the standalone > > parameter is present, or if the encoding parameter specifies a > > value other than UTF-8 or UTF-16.": This disallows producing > > XML other than UTF-8 or UTF-16 without an xml declaration even > > though this is legal e.g. if served over HTTP with a corresponding > > charset parameter. We are not sure this is intended, and we > > are not sure this is a good thing. On the other hand, > > omit-xml-declaration must also be ignored if version is not 1.0. > > Thanks to Martin and the I18N Working Group for this comment. > > The XSL and XQuery Working groups discussed this comment. > > Regarding the second point, although XML 1.1 requires a document >entity to have an XML declaration, it does not require an external general >parsed entity to have a text declaration. The setting of the >omit-xml-declaration parameter could still be meaningful, even if the >version parameter has a value other than 1.0. > > Regarding the first point, as originally written, XML 1.0 required an >XML declaration or a text declaration if the encoding of the document or >external general parsed entity was anything other than UTF-8 or UTF-16. >XSLT 1.0 enforced that requirement in its serialization mechanism. The >draft of Serialization inherited that behaviour from XSLT 1.0. However, >an erratum to XML 1.0 removed that requirement. 
> > In response to both points, the working groups decided that the >Serialization specification should permit an XML declaration or text >declaration to be omitted in precisely those circumstances in which it can >be omitted according to XML 1.0 and XML 1.1. > > In particular, the working groups decided that if the serialized >result could be considered to be the text declaration of an external >general parsed entity, the omit-xml-declaration parameter could have the >value yes or the value no, and the parameter's setting would take effect. >They further decided that if the serialized result could only be >considered to be a document entity because > > o the standalone parameter had the value yes or no; or > o the version parameter had a value other than 1.0 and the > doctype-system parameter was supplied > >the omit-xml-declaration parameter must have the value no. Otherwise, a >serialization error results. A host language would, of course, have the >option of ensuring such conflicts never arise through whatever >language-specific mechanism it uses to specify serialization parameters. > > May I ask the working group to confirm that that response is >acceptable? > >Thanks, > >Henry [On behalf of the XSL and XQuery Working Groups.] >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
Hello, Martin, Regarding the response to the I18N WG's comment number [18], you wrote: Martin Duerst <duerst@w3.org> wrote on 2004-05-05 04:12:39 AM: > The I18N WG (Core TF) has looked at your response. > We can confirm that we are okay with your solution under the > assumption that the default is still the same (i.e. > omit-xml-declaration='no', i.e. it is the default to omit an > XML declaration). In response to another last call comment, default settings for parameters to serialization will be determined by the process that sets those parameters. The particular default settings specified by XSLT 2.0 and XQuery 1.0 have not changed, however. In particular, the XSLT 2.0 specifies a default value of no for the value of the omit-xml-declaration parameter, while XQuery 1.0 specifies a default value of yes. Thanks, Henry ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, We have looked at the response below, and are fine with it, i.e. we think that it satisfactorily addresses our original comment. Regards, Martin. At 08:50 04/05/05 -0400, Henry Zongaro wrote: >Hello, Martin, > > Regarding the response to the I18N WG's comment number [18], you >wrote: > >Martin Duerst <duerst@w3.org> wrote on 2004-05-05 04:12:39 AM: > > The I18N WG (Core TF) has looked at your response. > > We can confirm that we are okay with your solution under the > > assumption that the default is still the same (i.e. > > omit-xml-declaration='no', i.e. it is the default to omit an > > XML declaration). > > In response to another last call comment, default settings for >parameters to serialization will be determined by the process that sets >those parameters. The particular default settings specified by XSLT 2.0 >and XQuery 1.0 have not changed, however. In particular, the XSLT 2.0 >specifies a default value of no for the value of the omit-xml-declaration >parameter, while XQuery 1.0 specifies a default value of yes. > >Thanks, > >Henry >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [19] 6.4 HTML Output Method: Writing Character Data: "When outputting a sequence of whitespace characters in the data model, within an element where whitespace is treated normally, (but not in elements such as pre and textarea) the html output method may represent it using any character sequence that will be treated as whitespace by an HTML user agent.": @@@ We need to check whether this (which allows replacement of whitespace including linebreaks by whitespace not including linebreaks and vice-versa) is okay for Chinese, Japanese, Thai,... (languages without spaces between words). This has to be checked extremely carefully. Regards, Martin.
I think it's better if we don't try to define the detailed rules here, but just state the constraint: you can replace one whitespace sequence by another if user agents treat them as equivalent. If we try to be more precise than this, we will get it wrong.
The current text does not say that; it says that one sequence of whitespace can be replaced by another if HTML user agents consider the latter as whitespace (presumably in the XML sense). But HTML user agents need to distinguish line breaks from other whitespace, for the reasons hinted at by Martin. See list item 9 in http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/conformance.html#s_conform_user_agent for the gory details.
Thanks, distinction noted.
Hi, Martin. In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: Martin Duerst wrote on 2004-02-15 12:37:30 PM: > [19] 6.4 HTML Output Method: Writing Character Data: "When outputting > a sequence of whitespace characters in the data model, within an > element where whitespace is treated normally, (but not in elements > such as pre and textarea) the html output method may represent it > using any character sequence that will be treated as whitespace > by an HTML user agent.": @@@ We need to check whether this (which > allows replacement of whitespace including linebreaks by whitespace > not including linebreaks and vice-versa) is okay for Chinese, > Japanese, Thai,... (languages without spaces between words). > This has to be checked extremely carefully. In [2], François Yergeau added the following information, in response to a note from Michael Kay on the topic: > > I think it's better if we don't try to define the detailed rules here, > > but just state the constraint: you can replace one whitespace sequence > > by another if user agents treat them as equivalent. > > The current text does not say that, it says that one sequence of white > can be replaced by another if HTML user agents consider the latter as > whitespace (presumably in the XML sense). But HTML user agents need to > distinguish line breaks from other whitespace, for the reasons hinted to > by Martin. See list item 9 in > http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/conformance.html#s_conform_user_agent > for the gory details. Thanks to you and the I18N working group for this comment. The XSL and XML Query Working Groups discussed the comment. The working groups were unable to find any statement in HTML 4.01 that different whitespace characters can be treated differently, ignoring such elements as pre and textarea. 
The reference that François provided was from the XHTML Modularization Recommendation, although the original comment was on the html output method. In discussing the comment, some members of the WGs thought that XHTML Modularization probably better reflected the requirements placed on HTML user agents in order to support languages such as those you mentioned. The WGs decided to add a normative requirement in the description of the html output method stating that whitespace characters can be replaced only with any other sequence of whitespace characters that has the same effect in a user agent. The WGs also decided to add a non-normative reference pointing to bullet 9 of section 3.5 of XHTML Modularization, to provide further information on the issues involved. May I ask you to confirm that this response is acceptable to the I18N Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1025.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [20] 6.4 HTML Output Method: Writing Character Data: "Certain characters, specifically the control characters #x7F-#x9F, are legal in XML but not in HTML. ... The processor may signal the error, but is not required to do so.": Please change this to require the processor to produce an error. Regards, Martin.
I worry that we will get many complaints from users who are misusing these codepoints if we do this. Their code will stop working, and it may be quite difficult for them to fix it. (Though it's a good use case for character maps...)
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. > [20] 6.4 HTML Output Method: Writing Character Data: "Certain characters, > specifically the control characters #x7F-#x9F, are legal in XML but > not in HTML. ... The processor may signal the error, but is not > required to do so.": Please change this to require the processor > to produce an error. In [2], Michael Kay responded: > I worry that we will get many complaints from users who are misusing > these codepoints if we do this. Their code will stop working, and it may > be quite difficult for them to fix it. (Though it's a good use case for > character maps...) Thanks to Martin and the I18N Working Group for this comment. The XSL and XQuery Working Groups discussed the comment, and decided to endorse Michael Kay's response without any change to the Serialization specification. May I ask the I18N Working Group to confirm that this response is acceptable? Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
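For readers weighing the cost of the check under discussion, the following Python sketch shows what detecting the characters #x7F-#x9F in character data might look like. The function name is hypothetical, and the sketch only reports the characters; whether a processor then signals an error is exactly the point the draft leaves optional.

```python
def find_html_illegal_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for characters in U+007F..U+009F.

    These are the control characters the draft describes as legal in XML
    but not in HTML. Illustrative helper, not part of any specification.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if 0x7F <= ord(ch) <= 0x9F
    ]


if __name__ == "__main__":
    # A processor choosing to signal the error could raise here instead.
    for index, code_point in find_html_illegal_controls("ok\u0085x\u007f"):
        print(f"illegal control character {code_point} at index {index}")
```

The scan is a single pass over the character data, so the extra processing is linear in the size of the output.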
Hello Henry, Many thanks for your replies to our comments. The I18N WG (Core TF) has looked at your reply below. We are sorry, but we have to clearly disagree. We think that producing junk is never a good idea. See below for further discussion. At 11:17 04/04/13 -0400, Henry Zongaro wrote: >Hello, > > In [1], Martin Duerst submitted the following comment on the Last >Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of >the I18N Working Group. > > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain >characters, > > specifically the control characters #x7F-#x9F, are legal in XML but > > not in HTML. ... The processor may signal the error, but is not > > required to do so.": Please change this to require the processor > > to produce an error. > > In [2], Michael Kay responded: > > > I worry that we will get many complaints from users who are misusing > > these codepoints if we do this. How are they misusing these code points? The case we know is that bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with the intent of giving them the windows-1252 semantics. If somebody is reading in windows-1252 documents, then it's simple to just declare them that way. Also, if somebody wants windows-1252 as output, they can just say so using XSLT. Neither reading windows-1252 nor writing out windows-1252 is in any way a misuse of XML, HTML, or XSLT. HTML allows using the *bytes* 0x80-0x9F if in the encoding used, they are encoding *characters* that are allowed by HTML. If it is some other misuse that you are speaking about, please inform us about the details. > > Their code will stop working, In some way just a detail, but: There is currently no XSLT 2.0 code that will stop working. XSLT 1.0 doesn't have the XHTML output method. With kind regards, Martin. > > and it may > > be quite difficult for them to fix it. (Though it's a good use case for > > character maps...) > > Thanks to Martin and the I18N Working Group for this comment. 
> > The XSL and XQuery Working Groups discussed the comment, and decided >to endorse Michael Kay's response without any change to the Serialization >specification. > > May I ask the I18N Working Group to confirm that this response is >acceptable? > >Thanks, > >Henry [On behalf of the XSL and XQuery Working Groups.] >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >[2] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
> > > I worry that we will get many complaints from users who > are misusing > > > these codepoints if we do this. > > How are they misusing these code points? The case we know is that > bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with > the intent of giving them the windows-1252 semantics. This was the case I had in mind. People create documents in cp1252 and declare them as iso-8859-1. And it all works, because the errors cancel each other out. If we oblige processors to detect this situation we will be asking users to pay for the extra processing cost, and in return the application that worked before will stop working. Will they thank us? Because if they won't, we shouldn't do it. > > In some way just a detail, but: There is currently no XSLT 2.0 > code that will stop working. XSLT 1.0 doesn't have the XHTML > output method. I may have lost the thread, but I thought we were discussing the HTML output method? > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain >characters, Michael Kay
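The "errors cancel each other out" case discussed above can be illustrated with a short Python sketch. The sample string and all names below are assumptions chosen for illustration only; the codecs used are Python's standard cp1252 and iso-8859-1 codecs.

```python
# Text the author intended: curly quotes and a three-dot ellipsis.
intended = "\u201cHi\u201d\u2026"

# The document is saved as windows-1252 (cp1252) ...
raw_bytes = intended.encode("cp1252")

# ... but declared as iso-8859-1, so a conforming parser sees
# code points in the C1 range instead of the intended characters.
mislabeled = raw_bytes.decode("iso-8859-1")


def c1_code_points(text):
    """Return the code points of text that fall in U+0080..U+009F."""
    return [ord(ch) for ch in text if 0x80 <= ord(ch) <= 0x9F]


# Writing the text back out as iso-8859-1 reproduces the original bytes,
# which a consumer that assumes cp1252 will display as intended.
round_tripped = mislabeled.encode("iso-8859-1")

print([hex(cp) for cp in c1_code_points(mislabeled)])
print(round_tripped == raw_bytes)
```

In this sketch the byte 0x85 is read as U+0085 (NEL) rather than U+2026 (the ellipsis), which is the kind of confusion a mandatory error would bring to light.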
Hello Michael, The I18N WG (Core TF) has discussed your mail, and has asked me to reply. I'm sorry for the delay. At 17:52 04/05/06 +0100, Michael Kay wrote: > > > > I worry that we will get many complaints from users who > > are misusing > > > > these codepoints if we do this. > > > > How are they misusing these code points? The case we know is that > > bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with > > the intent of giving them the windows-1252 semantics. > >This was the case I had in mind. People create documents in cp1252 and >declare them as iso-8859-1. And it all works, because the errors cancel each >other out. If we oblige processors to detect this situation we will be >asking users to pay for the extra processing cost, and in return the >application that worked before will stop working. Will they thank us? >Because if they won't, we shouldn't do it. Some users will be very thankful, others won't. The users that will be thankful will be those that care about data integrity and interoperability worldwide and in the long term. They will be able to fix a problem in their data that they otherwise might not have found. As a result, they will not only produce correct, valid output, but will also make sure that their input data will work well in other circumstances, such as searching, sorting, and any kind of other processing. Not the least, with the introduction of XML 1.1, there are also such issues as the confusion between NEL and the three-dot ellipsis. There was a time when the mentality on the Web was 'everything goes', which led to the slippery slope of bugwards compatibility. We have learned, with great pain, that this is a dead end, and we don't want to go there anymore. XML is the clearest example of how this can be done better. And I sincerely hope that XSLT will not be tempted to go down the bugwards compatibility slope. 
The C1 area is forbidden in HTML exactly because it is a very easy and cheap way to help people check and (if necessary) clean up their data. RFC 2070 (http://www.ietf.org/rfc/rfc2070.txt) was written almost 10 years ago. That C1 is allowed in XML is, according to James Clark, an oversight. XML 1.1 has corrected it. > > In some way just a detail, but: There is currently no XSLT 2.0 > > code that will stop working. XSLT 1.0 doesn't have the XHTML > > output method. > >I may have lost the thread, but I thought we were discussing the HTML output >method? Okay, sorry. There is still no XSLT 2.0 code that will stop working, even for the HTML output method. And because the XHTML output method is supposed to work according to the compatibility guidelines, it of course also should forbid producing C1 character output. Regards, Martin. > > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain > >characters, > >Michael Kay
Thanks. There's no easy right answer on this one. It's similar to the question of whether products should accept "c:\a\b.xml" in places where a URI is required. Some products allow it. I've resisted, and report it as an error. When users find that it works on one product and doesn't work on mine, it's me they complain to. I tell them they are wrong and they should read the specs, but I can afford to do that because they aren't (at present) paying customers. I would be happy with the stricter rule if we had imposed it from the start. I'm not happy with the idea that version 2 should be stricter than version 1. That's in good measure because, for the time being, people's first exposure to XSLT 2.0 is through my product, and when they get compatibility or usability problems, they report it to me as "a Saxon bug". In addition, the XSLT spec has always been pragmatic about the reality of HTML interoperability. If the spec wasn't pragmatic in this way, then I think XSLT implementors would have to be pragmatic, and the weaknesses of HTML conformance would spill over into weaknesses in XSLT conformance. There are many ways that we allow XSLT stylesheets to generate non-conformant HTML, and I don't see that this one is particularly different from the others. Most areas where we have tried to be strict about what we generate (for example, in URI escaping) have led to practical problems for users. Michael Kay
FWIW here's what we do: we tout our products as being standards-based and therefore more interoperable. When a customer complains about something that is, in fact, following the standard, but not doing what they want, we provide a custom solution (for $$$). We also take a look at the standard to make sure that it makes sense, and if it doesn't, and we have the bandwidth, we try to improve the standard. So the question is, will the majority be happy or unhappy with a particular decision on the standard? I am not trying to answer that question, I'm only saying that customers complaining about the standard will always be there. The issue is if lots of customers complain about the same thing, then it's a telling sign that the standard isn't serving the purpose. Andrea (from the cheap seats) -- I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone. -Bjarne Stroustrup, designer of C++ programming language (1950- )
Hello Michael, At 11:04 04/05/21 +0100, Michael Kay wrote: >Thanks. There's no easy right answer on this one. It's similar to the >question of whether products should accept "c:\a\b.xml" in places where a >URI is required. Some products allow it. I've resisted, and report it as an >error. When users find that it works on one product and doesn't work on >mine, it's me they complain to. I tell them they are wrong and they should >read the specs, but I can afford to do that because they aren't (at present) >paying customers. I think rather than waiting for the customers to complain, the best solution may be to produce an instructive error message. In this case, such a message is quite easy to produce. In this way, you can tell them, without having to write emails individually. For example, an error message could read as follows: line x, character y: Illegal C1 codepoint in HTML output. (Hint: This is most probably due to a problem in the input, in particular for example due to input declared to be in the iso-8859-1 character encoding that is actually in the windows-1252 character encoding.) >I would be happy with the stricter rule if we had imposed it from the start. >I'm not happy with the idea that version 2 should be stricter than version >1. That's in good measure because, for the time being, people's first >exposure to XSLT 2.0 is through my product, and when they get compatibility >or usability problems, they report it to me as "a Saxon bug". I have separately complained about the number of XSLT 1.0/2.0 compatibility issues, so I'm definitely not unsympathetic to this point. But looking at the various patterns of compatibility issues, this one is really harmless: The XSLT fails with a very clear error message and a very clear fix. 
Although I haven't done an in-depth analysis (I have suggested such a thing), I strongly suspect that many other incompatibilities are of a much more dangerous nature: The XSLT still works, but the output is a little different in some cases, which may be detected sooner or later, or too late. >In addition, the XSLT spec has always been pragmatic about the reality of >HTML interoperability. Well, if it were only HTML interoperability, that may be another issue. But the fact is that we know that the XML input is garbage. And I don't think XSLT should be lenient with garbage XML input in cases where it is easy to detect that it's garbage. >If the spec wasn't pragmatic in this way, then I >think XSLT implementors would have to be pragmatic, and the weaknesses of >HTML conformance would spill over into weaknesses in XSLT conformance. I'm sure there will be a test suite for XSLT 2.0. Adding the right tests would probably go a long way, and wouldn't be very difficult. (I'd be happy to produce some.) >There >are many ways that we allow XSLT stylesheets to generate non-conformant >HTML, and I don't see that this one is particularly different from the >others. Could you point to a list of these, or list (some of) them here? Regards, Martin. >Most areas where we have tried to be strict about what we generate >(for example, in URI escaping) have led to practical problems for users. > >Michael Kay
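The kind of check and error message proposed in this thread could be sketched in Python as follows. This is only an illustration under stated assumptions: the names `C1OutputError` and `check_html_text` are hypothetical, the range #x7F-#x9F is taken from the quoted specification text, and the message wording loosely follows the suggestion made in the thread rather than any actual processor.

```python
class C1OutputError(ValueError):
    """Hypothetical error for control characters that HTML does not allow."""


def check_html_text(text):
    """Return text unchanged, or raise if it contains #x7F-#x9F."""
    line, column = 1, 0
    for ch in text:
        if ch == "\n":
            # Start counting columns again on the next line.
            line, column = line + 1, 0
            continue
        column += 1
        if 0x7F <= ord(ch) <= 0x9F:
            raise C1OutputError(
                f"line {line}, character {column}: illegal control "
                f"codepoint U+{ord(ch):04X} in HTML output. (Hint: the "
                "input may be declared as iso-8859-1 but actually be "
                "encoded as windows-1252.)"
            )
    return text
```

A single pass over the character data is enough for this check, which is why the thread describes it as cheap compared with validating document structure.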
> >There are many ways that we allow XSLT stylesheets to generate non-conformant HTML, and I don't see that this one is particularly different from the others.
> Could you point to a list of these, or list (some of) them here?
For example:
* you can produce elements and attributes that aren't defined in HTML
* you can nest elements in ways that aren't allowed in HTML
* you can give attributes values that aren't allowed in HTML
* you can use any system ID and public ID that you like in the doctype declaration
* you can use disable-output-escaping (or now character maps) to produce any kind of garbage that takes your fancy
* you can suppress the escaping of URIs in URI-valued attributes
* you can suppress the generation of the META element defining the character encoding or generate your own that contains a value unrelated to the true character encoding
All these features are occasionally useful either to exploit non-standard features in browsers, or to generate output designed for processing by software other than HTML browsers. Michael Kay
Hello Henry, [I have copied the I18N IG as well as a list on your side to reduce the chance that this gets lost. I suggest doing that with all messages related to last-call discussions.] At 16:44 04/08/30 -0400, Henry Zongaro wrote: >Hello, Martin. > > In [1], you submitted the following comment on the Last Call Working >Draft of Serialization on behalf of the I18N Working Group: > ><< >[20] 6.4 HTML Output Method: Writing Character Data: "Certain characters, > specifically the control characters #x7F-#x9F, are legal in XML but > not in HTML. ... The processor may signal the error, but is not > required to do so.": Please change this to require the processor > to produce an error. > >> > > The XSL and XML Query Working Groups made the decision recorded at >[2], but the I18N Working Group raised an objection [3] to that decision. > > A subsequent e-mail exchange [4-9] ensued between yourself, Michael >Kay and Andrea Vine. The final message in the thread came from Michael >Kay. > > As the XSL and XQuery Working Groups have not heard whether the >additional discussion satisfactorily clarified the issue for the I18N >Working Group, we will assume that the issue has been resolved to their >satisfaction. If that is not the case, please advise us of any additional >points requiring clarification. The I18N WG (Core TF) has had a look at this issue (again!). We have decided that we need to object to your current resolution. In particular, we want to point out that while Michael Kay has listed other cases where it is possible to create non-valid HTML with the HTML serialization method (see http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0054.html), all these issues are higher-level than the issue at hand. They are all related to document structure and/or are not used by default. 
Trying to address issues related to document structure would mean that serialization would have to deal with HTML versioning info and configuration options for such versioning, which would considerably complicate the specification. This is not at all the case for disallowing code points in the C1 range, which is independent of HTML versioning and at a much more basic level. Also, as Michael mentioned, character maps can be used to circumvent any kind of output restriction. It is much better to make the production of clean, correct output the default (in particular when this can be easily achieved), and have some mechanism for circumvention, than to tolerate crappy output from the start. The misuse of codepoints in the C1 range has been a long-standing problem, and we greatly hope that XQuery and XSLT can help to solve it rather than contribute to production of more garbage. Regards, Martin. P.S.: For the record, I would also like to point out that the I18N WG has officially disagreed on this issue at http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html. The following discussion has brought up some more details, but there is no indication that the I18N WG would have changed its opinion. I think that it is clearly inappropriate in such cases to say, as you do above "we will assume that the issue has been resolved to their satisfaction". [I think doing such a thing is appropriate when you have fully (or maybe partially) addressed our comment.] 
>Thanks, > >Henry >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >[2] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0068.html >[3] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html >[4] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0012.html >[5] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0040.html >[6] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0041.html >[7] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0049.html >[8] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0052.html >[9] >http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0054.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. >> [20] 6.4 HTML Output Method: Writing Character Data: "Certain characters, specifically the control characters #x7F-#x9F, are legal in XML but not in HTML. ... The processor may signal the error, but is not required to do so.": Please change this to require the processor to produce an error. << The XSL and XQuery working groups gave an initial response [2] to this issue, the I18N WG disagreed with this response [3], and a lengthy email discussion followed. After further discussion, the working groups have decided to accept your comment, and to resolve it by requiring the processor to signal the error. Please let us know if the resolution to this issue is acceptable. Joanne Tong [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0068.html [3] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] General last call comments, not i18n-related: [22] There should be some warning about denormalization when using charmaps Regards, Martin.
I agree.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: << [22] There should be some warning about denormalization when using charmaps >> Thanks to you and the I18N working group for this comment. The XSL and XML Query Working Groups discussed the working group's comment, and decided to add a note indicating that the use of character maps may result in a serialized document that is not normalized. May I ask you to confirm that this response is acceptable to the working group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello Henry, We just decided at the I18N WG Core TF teleconf that we are happy with this resolution. Regards, Martin. At 13:09 04/07/13 -0400, Henry Zongaro wrote: >Martin, > > In [1], you submitted the following comment on the Last Call Working >Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N >working group: > ><< >[22] There should be some warning about denormalization when using > charmaps > >> > > Thanks to you and the I18N working group for this comment. > > The XSL and XML Query Working Groups discussed the working group's >comment, and decided to add a note indicating that the use of character >maps may result in a serialized document that is not normalized. > > May I ask you to confirm that this response is acceptable to the >working group? > >Thanks, > >Henry [On behalf of the XSL and XML Query Working Groups] >[1] >http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html >------------------------------------------------------------------ >Henry Zongaro Xalan development >IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 >mailto:zongaro@ca.ibm.com
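The denormalization risk behind comment [22] can be shown with a small Python sketch. The character map below is a hypothetical one invented for illustration (it maps a private-use character to a combining acute accent), and `apply_character_map` is a simplified stand-in for what a serializer does with character maps, not an implementation of the specification.

```python
import unicodedata


def apply_character_map(text, character_map):
    """Replace each mapped character with its replacement string."""
    return "".join(character_map.get(ch, ch) for ch in text)


def is_nfc(text):
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


# Hypothetical map: U+E000 (private use) becomes U+0301 (combining acute).
character_map = {"\ue000": "\u0301"}

source = "cafe\ue000"
serialized = apply_character_map(source, character_map)

print(is_nfc(source))
print(is_nfc(serialized))
```

Here the input is normalized, but the substituted output contains "e" followed by a combining accent, which NFC would compose into a single character; this is the situation the added note warns about.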
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] General last call comments, not i18n-related: [23] Section 4: "The base URIs of nodes in the two trees may be different." Does this mean that base URIs are not serialized? This should be checked or at least explained. Regards, Martin.
Yes, the base URI typically is supplied at the time a tree is built by a parser, it is not normally explicit in the content of the tree.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. >[23] Section 4: "The base URIs of nodes in the two trees may be different." > Does this mean that base URIs are not serialized? This should be > checked or at least explained. Thanks to Martin and the I18N Working Group for this comment. In [2] Michael Kay responded as follows: >Yes, the base URI typically is supplied at the time a tree is built by a >parser, it is not normally explicit in the content of the tree. The XSL and XQuery Working Groups discussed this comment, and concurred with Michael Kay's response. The working groups did not feel any clarification of the specification was required. May I ask the I18N Working Group to confirm that this response is acceptable? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
General last call comments, not i18n-related: [24] Cases of creation of non-wellformed XML where the processor is not required to signal an error: It would be good to have an option to request well-formedness checking even if Character Maps are used. Regards, Martin.
Hello, In [1], Martin Duerst submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. > [24] Cases of creation of non-wellformed XML where the processor is not > required to signal an error: It would be good to have an option to > request well-formedness checking even if Character Maps are used. Thanks to Martin and the I18N Working Group for this comment. The XSL and XQuery Working Groups discussed the comment, and concluded that, although such a mechanism might be useful, an XML parser would be capable of performing the same well-formedness checking. On those grounds, the working groups decided it was not necessary to duplicate that functionality in Serialization. May I ask the working group to confirm that this response is acceptable? Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Henry Zongaro wrote: > In [1], Martin Duerst submitted the following comment on the Last > Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of > the I18N Working Group. > > >>[24] Cases of creation of non-wellformed XML where the processor is not >> required to signal an error: It would be good to have an option to >> request well-formedness checking even if Character Maps are used. > > > Thanks to Martin and the I18N Working Group for this comment. > > The XSL and XQuery Working Groups discussed the comment, and > concluded that, although such a mechanism might be useful, an XML parser > would be capable of performing the same well-formedness checking. On > those grounds, the working groups decided it was not necessary to > duplicate that functionality in Serialization. We are not satisfied with this resolution. We feel that 1) well-formedness is very important; 2) using a parser to check it is just a possible implementation strategy and 3) that this strategy may not even be available when serializing to other than a local file, e.g. to a network socket. > [1] > http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html Regards, -- François Yergeau
Martin, François. In [1] Martin submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N working group: << [24] Cases of creation of non-wellformed XML where the processor is not required to signal an error: It would be good to have an option to request well-formedness checking even if Character Maps are used. >> In [2], I announced the following decision on behalf of the XSL and XML Query Working Groups: << The XSL and XQuery Working Groups discussed the comment, and concluded that, although such a mechanism might be useful, an XML parser would be capable of performing the same well-formedness checking. On those grounds, the working groups decided it was not necessary to duplicate that functionality in Serialization. >> In [3], François raised the following objection on behalf of I18N: << We are not satisfied with this resolution. We feel that 1) well-formedness is very important; 2) using a parser to check it is just a possible implementation strategy and 3) that this strategy may not even be available when serializing to other than a local file, e.g. to a network socket. >> The XSL and XML Query Working Groups discussed this issue further, and concluded that requiring a serialization component to be capable of detecting XML that was not well-formed in the presence of character maps would be too much of an implementation burden on a serializer. The working groups also noted that there is no interoperability problem with this resolution, and that an implementation could always add an implementation-specific option that would perform the sort of checking that I18N has suggested. Finally, the XSL and XQuery Working Groups noted that the last paragraph of Section 5 of the most recent draft of Serialization [4] indicates that only character maps and the use of user-written extension functions might result in the creation of XML that is not well-formed.
In fact it is only character maps that might result in XML that is not well-formed without being detected. The working groups will correct that misstatement. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0059.html [3] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0108.html [4] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#xml-output ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
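The parser-based check that the working groups refer to can be sketched as follows. This is only an illustration under stated assumptions: `apply_character_map` and `is_well_formed` are hypothetical helpers, the character map is invented for the example, and Python's standard-library parser stands in for "an XML parser". It also assumes the whole serialized result is available as a string, which is exactly the condition the I18N objection says may not hold when writing to a network socket.

```python
import xml.etree.ElementTree as ET


def apply_character_map(text, char_map):
    """Replace each mapped character with its replacement string, unescaped."""
    return "".join(char_map.get(ch, ch) for ch in text)


def is_well_formed(serialized):
    """Return True if the standard-library XML parser accepts the output."""
    try:
        ET.fromstring(serialized)
        return True
    except ET.ParseError:
        return False


# Hypothetical character map: U+00A7 is replaced by an unescaped start tag.
char_map = {"\u00a7": "<b>"}
output = "<p>" + apply_character_map("see \u00a7 here", char_map) + "</p>"

print(output)
print(is_well_formed(output))  # False: <b> is never closed
```

A real serializer applies character maps during its own output phase; the string concatenation above is only a stand-in for that step.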
General last call comments, not i18n-related: [25] 7, Text Output Method: "The media-type parameter is applicable for the text output method.": What does that mean? How is it applied? Regards, Martin.
It means, go and read the general (method-independent) description of this parameter (which in this case, is not very enlightening...)
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << [25] 7, Text Output Method: "The media-type parameter is applicable for the text output method.": What does that mean? How is it applied? >> Thanks to you and the working group for this comment. The XSL and XML Query Working Groups discussed the comment. In [2], I indicated that, in response to the second comment number 12 in [1], the working groups will clarify the description of the media-type parameter that appears in the table in section 3 of Serialization.[3] In response to the comment at hand, the working groups will add a reference to that description to the text you've quoted above, and to similar text that appears in Sections 5.9 and 7.8 of Serialization. The working groups believe that will clarify what it means for the parameter to be applicable. Similar descriptions of the use-character-maps parameter in sections 5.9, 7.8 and 8 of Serialization will be clarified through the addition of references to section 9 of Serialization, which describes character maps in detail. May I ask you to confirm that this response is acceptable to the I18N Working Group? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Sep/0079.html [3] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial: [31] Section 5 and Section 6: "If the data model includes a head element that has a meta element child, the processor should replace any content attribute of the meta element, or add such an attribute, with the value as described above, rather than output a new meta element." This is written as if there would be only one <meta> element. Replacement should only take place if the <meta> element has a http-equiv attribute with value 'Content-Type'. Regards, Martin.
In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. >> [31] Section 5 and Section 6: "If the data model includes a head element that has a meta element child, the processor should replace any content attribute of the meta element, or add such an attribute, with the value as described above, rather than output a new meta element." This is written as if there would be only one <meta> element. Replacement should only take place if the <meta> element has a http-equiv attribute with value 'Content-Type'. << The XSL and XQuery working groups have accepted your comment. It is related to an issue about the serializer having to explore the entire contents of the HEAD element before it decides whether to add a new META element as the first child. See [2]. We intend to resolve this by replacing the offending text with: "If a META element has been added to the HEAD element as described above, then any existing META element child of the HEAD element having an http-equiv attribute with the value "Content-Type" MUST be discarded." Please let us know if the resolution to this issue is acceptable. Joanne Tong [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html [2] http://lists.w3.org/Archives/Member/w3c-xsl-wg/2004Aug/0088.html
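The replacement rule proposed above (add a new META as the first child of HEAD, then discard any existing META child whose http-equiv is "Content-Type") might look like this in outline. The function name `add_content_type_meta` is hypothetical, the lowercase element names are chosen for the example, and comparing the http-equiv value case-insensitively is an assumption that the quoted text does not state.

```python
import xml.etree.ElementTree as ET


def add_content_type_meta(head, media_type, encoding):
    """Add a Content-Type META as first child; discard older Content-Type METAs."""
    for meta in list(head.findall("meta")):
        # Assumption: the http-equiv value is matched case-insensitively.
        if meta.get("http-equiv", "").lower() == "content-type":
            head.remove(meta)
    new_meta = ET.Element(
        "meta",
        {"http-equiv": "Content-Type",
         "content": f"{media_type}; charset={encoding}"},
    )
    head.insert(0, new_meta)
    return head


head = ET.fromstring(
    '<head><title>t</title>'
    '<meta name="author" content="x"/>'
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>'
    '</head>'
)
add_content_type_meta(head, "text/html", "UTF-8")
print(ET.tostring(head, encoding="unicode"))
```

Other META elements, such as the author entry here, are left in place; only those carrying the Content-Type declaration are dropped.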
Editorial: [32] Section 5 and Section 6: Note starting: "This escaping is deliberately confined to non-ASCII characters,": There are certain ASCII characters that are not allowed in URIs. They should be escaped. Regards, Martin.
> [32] Section 5 and Section 6: Note starting: "This escaping > is deliberately > confined to non-ASCII characters,": There are certain > ASCII characters > that are not allowed in URIs. They should be escaped. The decision here is very deliberate, as the text says. Note that appendix B.2.1 of the HTML 4.0 specification also refers to %HH escaping only in connection with non-ASCII characters. Although characters such as spaces are not allowed in URIs, if you escape them in URIs that are interpreted client-side, such as javascript: URIs, the URI stops working in most browsers. Also, you can't escape an id attribute that acts as the target of a link, because % is not valid in an ID attribute. In practice (whatever the spec says) if you escape the URI fragment identifier of a same-page URI reference but don't escape the corresponding ID attribute, the browser doesn't match them up. In fact, the evidence appears to be that browsers don't unescape URIs at all, they leave this to be done at the server. Escaping non-ASCII characters, as we currently specify, appears to work for fragment identifiers referring to a different page, but not for same-page references. It's a mess, which is one reason why we now provide the option to switch off automatic escaping of URIs and allow the user to do it themselves using the escape-uri() function. Regards, Michael Kay
In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. >> [32] Section 5 and Section 6: Note starting: "This escaping is deliberately confined to non-ASCII characters,": There are certain ASCII characters that are not allowed in URIs. They should be escaped. << The XSL and XQuery working groups have discussed your comment, and decline to act on it. We endorse Mike Kay's response [2]: The decision here is very deliberate, as the text says. Note that appendix B.2.1 of the HTML 4.0 specification also refers to %HH escaping only in connection with non-ASCII characters. Although characters such as spaces are not allowed in URIs, if you escape them in URIs that are interpreted client-side, such as javascript: URIs, the URI stops working in most browsers. Also, you can't escape an id attribute that acts as the target of a link, because % is not valid in an ID attribute. In practice (whatever the spec says) if you escape the URI fragment identifier of a same-page URI reference but don't escape the corresponding ID attribute, the browser doesn't match them up. In fact, the evidence appears to be that browsers don't unescape URIs at all, they leave this to be done at the server. Escaping non-ASCII characters, as we currently specify, appears to work for fragment identifiers referring to a different page, but not for same-page references. It's a mess, which is one reason why we now provide the option to switch off automatic escaping of URIs and allow the user to do it themselves using the escape-uri() function. Please let us know if the resolution to this issue is acceptable. Joanne Tong [1] http://www.w3.org/XML/Group/xsl-query-specs/last-call-comments/xquery-serialization/issues.xml#qt-2004Feb0362-25 [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
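For illustration, escaping that is confined to non-ASCII characters can be sketched as below. The helper name `escape_non_ascii` is hypothetical, and the use of UTF-8 bytes for the %HH escapes follows the convention of HTML 4.0 appendix B.2.1; this is not a rendering of the Serialization specification's exact rules.

```python
def escape_non_ascii(uri):
    """Percent-encode only the non-ASCII characters of a URI attribute value."""
    out = []
    for ch in uri:
        if ord(ch) < 128:
            # ASCII characters, including spaces, are deliberately left alone.
            out.append(ch)
        else:
            # Each UTF-8 byte of the character becomes one %HH escape.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(escape_non_ascii("caf\u00e9.html#r\u00e9sum\u00e9"))
# caf%C3%A9.html#r%C3%A9sum%C3%A9
print(escape_non_ascii("javascript:alert('a b')"))
# javascript:alert('a b')
```

The second call shows the case discussed in the response: a javascript: URI containing a space passes through unchanged, so it keeps working in browsers that do not unescape URIs client-side.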
SECTION 2: Serialization of the Data Model is important because we want to be able to share Data Model instances between applications over a network in a standard way. Today the spec does not support this (e.g. document boundaries are blurred, scalar values are converted to text nodes). We should use XML itself to solve the problem. The idea is the same as using XML to represent XML schema and using XML to represent XQuery (XQueryX). First, we can define an XML schema that describes the XQuery data model, then serialize the XQuery data model instance based on this XML schema. The key is to define a comprehensive XML schema to describe an XQuery data model instance. For example, consider an XQuery data model instance that consists of 2 items, an xs:integer of value 1, followed by an xs:string of value 'abc'. It can be serialized as:
<xqdm:seq xmlns:xqdm="http://www.w3.org/2004/xqdm">
  <xqdm:item>
    <xqdm:type>xs_integer</xqdm:type>
    <xqdm:value>1</xqdm:value>
  </xqdm:item>
  <xqdm:item>
    <xqdm:type>xs_string</xqdm:type>
    <xqdm:value>abc</xqdm:value>
  </xqdm:item>
</xqdm:seq>
With this kind of serialization, the Data Model can be serialized in exactly one way. - Steve B.
Steve, In [1] you submitted the following comment on the serialization draft: Steve Buxton wrote on 2004-02-17 06:31:41 AM: > SECTION 2: > > Serialization of the Data Model is important because we want to be > able to share Data Model instances between applications over a > network in a standard way. Today the spec does not support this (e.g. > document boundaries are blurred, scalar values are converted to > text nodes). > > We should use XML itself to solve the problem. The idea > is the same as using XML to represent XML schema and using XML to > represent XQuery (XQueryX). > > First, we can define an XML schema that describes the > XQuery data model, then serialize the XQuery data model instance > based on this XML schema. > The key is to define a comprehensive XML schema to describe an XQuery > data model instance. > > For example, consider an XQuery data model instance that consists of > 2 items, an > xs:integer of value 1, followed > by an xs:string of value 'abc'. It can be serialized as: > > <xqdm:seq xmlns:xqdm="http://www.w3.org/2004/xqdm" > > <xqdm:item> > <xqdm:type>xs_integer</xqdm:type> > <xqdm:value>1</xqdm:value> > </xqdm:item> > <xqdm:item> > <xqdm:type>xs_string</xqdm:type> > <xqdm:value>abc</xqdm:value> > </xqdm:item> > </xqdm:seq> > > With this kind of serialization, the Data Model can be serialized in > exactly one way. Thank you for submitting this comment. The XSL and XQuery working groups considered your comment and related comments. There was general agreement that there is some need for a mechanism for serializing arbitrary sequences that preserves most or all of the properties of the items in an arbitrary sequence that is being serialized. However, the working groups decided that precisely defining all of the requirements for such a mechanism at this stage would be difficult, and would likely lead to a solution that would not satisfy real user requirements.
Therefore, the working groups decided to consider such a feature for a future revision of the recommendations, and close this comment without any changes to the specifications. May I ask you to confirm that this resolution is acceptable? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0918.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
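As an illustration of the commenter's proposal only (the working groups closed the comment without adopting it), the wrapper format could be produced along these lines. The namespace URI and the element names seq, item, type and value are taken from the comment's example; the function name `serialize_sequence` and the representation of items as (type name, value) pairs are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace URI as given in the comment's example; not a published W3C namespace.
XQDM_NS = "http://www.w3.org/2004/xqdm"


def serialize_sequence(items):
    """Wrap (type name, value) pairs in the xqdm:seq structure from the comment."""
    ET.register_namespace("xqdm", XQDM_NS)
    seq = ET.Element(f"{{{XQDM_NS}}}seq")
    for type_name, value in items:
        item = ET.SubElement(seq, f"{{{XQDM_NS}}}item")
        ET.SubElement(item, f"{{{XQDM_NS}}}type").text = type_name
        ET.SubElement(item, f"{{{XQDM_NS}}}value").text = str(value)
    return ET.tostring(seq, encoding="unicode")


xml_text = serialize_sequence([("xs_integer", 1), ("xs_string", "abc")])
print(xml_text)

# Reading the result back recovers both the type names and the values.
parsed = ET.fromstring(xml_text)
print([(item[0].text, item[1].text) for item in parsed])
```

Because each item carries its type name alongside its value, the item boundaries and types survive a round trip, which is the property the comment says the normalization in section 2 does not preserve.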
SECTION 2: Serializing arbitrary data models Step 1 says to replace an empty sequence by a zero-length string, which is presumably an atomic value of type xs:string of length 0. Step 2 casts this xs:string to xs:string, a no-op. Step 3 would add a space if there were more than one atomic value in the sequence, but there is not, so this step is a no-op. Step 4 says to convert this atomic value into a text node with the same string value. But a text node may not have a zero-length string as its content; this is a bug in the specification. Perhaps step 4 should be interpreted as simply deleting any zero-length xs:string values from the sequence. In that case, if we started with an empty sequence, we are back to an empty sequence, and one wonders why step 1 is there. Or perhaps step 4 is intended to raise a serialization error. If that is the intention, please say so. - Steve B.
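The walk-through above can be traced with a small sketch. It models only the four steps as the comment describes them, restricted to sequences of atomic values; the function name, the dictionary used to stand in for a text node, and the use of Python's str() in place of a cast to xs:string are all assumptions for illustration.

```python
def normalize_atomic_sequence(items):
    """Trace steps 1-4 of the comment for a sequence of atomic values."""
    # Step 1: an empty sequence is replaced by a zero-length string.
    if not items:
        items = [""]
    # Step 2: each atomic value is cast to a string (str() is a stand-in).
    strings = [str(value) for value in items]
    # Step 3: adjacent strings are joined, separated by a single space.
    joined = " ".join(strings)
    # Step 4: the string becomes a text node with the same string value.
    return {"kind": "text", "content": joined}


print(normalize_atomic_sequence([1, "abc"]))
# {'kind': 'text', 'content': '1 abc'}
print(normalize_atomic_sequence([]))
# {'kind': 'text', 'content': ''}
```

The second call reproduces the case the comment objects to: following the steps literally yields a text node whose content is a zero-length string.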
SECTION 3: Serialization parameters It says "The method identifies the overall method... If the QName is in a namespace, then it identifies an implementation-defined output method; the behavior in this case is not specified by this document." However, you have specified that normalization (section 2) occurs prior to invoking the method. This implies that the implementation-defined method has no control over the normalization. It would be desirable to reverse this, so that normalization occurs inside the method rather than prior to the method. In that case, normalization would be part of the standard-defined methods but implementation-defined methods might have other algorithms for dealing with values permitted by the data model that do not correspond to well-formed XML. To accomplish this, simply make the current section 2 into the first phase, prior to "Markup generation". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << SECTION 3: Serialization parameters It says "The method identifies the overall method... If the QName is in a namespace, then it identifies an implementation-defined output method; the behavior in this case is not specified by this document." However, you have specified that normalization (section 2) occurs prior to invoking the method. This implies that the implementation-defined method has no control over the normalization. It would be desirable to reverse this, so that normalization occurs inside the method rather than prior to the method. In that case, normalization would be part of the standard-defined methods but implementation-defined methods might have other algorithms for dealing with values permitted by the data model that do not correspond to well-formed XML. To accomplish this, simply make the current section 2 into the first phase, prior to "Markup generation". >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and agreed that implementation-defined output methods should be granted control of whether the normalization of arbitrary sequences that is specified by section 2 occurs. As a representative of Oracle was present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0921.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 3: Serialization parameters It is not clear what the scope of phase 1, "Markup generation", is. It says that this phase "produces the representation of start and end tags for elements, and other constructs such as ...". The use of "such as..." is non-specific. It would be better to provide a complete list of what is included. Even the phrase "representation of start and end tags for elements" is not very clear -- does this include the attributes and namespace declarations within the start tag? - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided to clarify the description of the markup generation phase by replacing the first bullet of Section 4 of Serialization with the following: <<
1. Markup generation produces the character representation of those parts of the serialized result that describe the structure of the normalized instance of the data model. In the cases of the xml, html and xhtml output methods, this phase produces the character representations of the following:
   o the document type declaration;
   o start tags and end tags (except for the attribute values, whose representation is produced by the character expansion phase);
   o processing instructions; and
   o comments.
In the case of the xml and xhtml output methods, this phase also produces the following:
   o the XML or text declaration; and
   o empty element tags (except for the attribute values);
In the case of the text output method, this phase has no effect. >>
In addition, the working groups decided to add a statement to the effect that the phases of serialization apply to the output methods defined by the Serialization specification, and that it is implementation-defined whether any apply for an implementation-defined output method. As a representative of Oracle was present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0922.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 3: Serialization parameters Perhaps there should be a parameter to indicate whether to output elements with no children as start-tag plus end-tag or as empty-element tags. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << SECTION 3: Serialization parameters Perhaps there should be a parameter to indicate whether to output elements with no children as start-tag plus end-tag or as empty-element tags. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided that such a parameter was not necessary, as it would not affect the Infoset of the serialized result. In addition, it was noted that such a parameter might conflict with the requirements of the xhtml and html output methods. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0923.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
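The Infoset argument in the response can be checked informally by parsing both forms and comparing the resulting trees. This is a sketch using Python's standard-library parser as a stand-in for an Infoset comparison; the helper name `shape` and the sample documents are invented for the example.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Reduce an element to a comparable tuple: name, attributes, text, children."""
    return (
        elem.tag,
        dict(elem.attrib),
        elem.text or "",
        [shape(child) for child in elem],
    )


with_empty_tag = ET.fromstring('<doc><br class="x"/></doc>')
with_tag_pair = ET.fromstring('<doc><br class="x"></br></doc>')

# Both spellings of the childless element parse to the same tree.
print(shape(with_empty_tag) == shape(with_tag_pair))  # True
```

Since a consumer working from the parsed tree cannot tell the two forms apart, a parameter choosing between them would affect only the lexical form of the output.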
SECTION 4: XML output method Second para: "In all other circumstances, the serialized form must comply with the requirements described for the xml output method." It is not clear what "all other circumstances" is contrasting itself with. One naturally looks back to the first paragraph to see what conditions it lays out. The only condition in that paragraph appears to be "unless the processor is unable to satisfy those rules...". Thus the logical structure appears to be: test if the processor is able to satisfy such-and-such rules; if yes, the beginning of the first paragraph applies; if no, the second paragraph applies. But this seems unlikely to be your intent. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << SECTION 4: XML output method Second para: "In all other circumstances, the serialized form must comply with the requirements described for the xml output method." It is not clear what "all other circumstances" is contrasting itself with. One naturally looks back to the first paragraph to see what conditions it lays out. The only condition in that paragraph appears to be "unless the processor is unable to satisfy those rules...". Thus the logical structure appears to be: test if the processor is able to satisfy such-and-such rules; if yes, the beginning of the first paragraph applies; if no, the second paragraph applies. But this seems unlikely to be your intent. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment. In response to other comments in Section 4, the working groups decided to reword the first and third paragraphs of that section in order to make it clear that a serialization error results if the serialized result is not a well-formed document entity or external general parsed entity. In addition, the second paragraph of section 4 was deleted. The working groups believe that these changes address your comment as well. I believe that a representative of Oracle was present when this decision was made, so I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0924.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4: XML output method Setting the indent parameter to yes may introduce additional whitespace in the output. Reparsing the output value may retain this additional whitespace, for example, if it is added to an element of mixed content. This exception is not listed. (You have an exception for the character expansion phase, but the indent parameter is processed by the Markup generation phase, so the exception for character expansion does not cover the action of the indent parameter.) - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > SECTION 4: XML output method > > Setting the indent parameter to yes may introduce additional > whitespace in the output. Reparsing the output value may > retain this additional whitespace, for example, if it is added > to an element of mixed content. This exception is not listed. > (You have an exception for the character expansion phase, but > the indent parameter is processed by the Markup generation > phase, so the exception for character expansion does not > cover the action of the indent parameter.) Thank you for your comment. The XSL and XQuery working groups discussed your comment, and agreed with your analysis. The following item will be added to the bulleted list in section 4 to address this comment: << o Additional text nodes consisting of whitespace characters may be present in the new tree and some text nodes in the new tree may contain additional whitespace characters that were not present in the original tree if the indent parameter has the value yes, as described in 4.3 XML Output Method: the indent Parameter. >> As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0926.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
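The effect described in the new bullet can be observed with any pretty-printer that indents mixed content. The sketch below uses Python's `xml.dom.minidom` as a stand-in for a serializer running with indent set to yes; it is not an XSLT or XQuery serializer, and a conforming one may well be more careful about mixed content. The sample document and the helper `text_of` are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

original = "<p>Hello <b>world</b>!</p>"

# minidom's pretty-printer puts every child of a mixed-content element
# on its own indented line.
indented = minidom.parseString(original).documentElement.toprettyxml(indent="  ")


def text_of(xml_text):
    """Concatenate all character data found when the document is reparsed."""
    return "".join(ET.fromstring(xml_text).itertext())


print(repr(text_of(original)))  # 'Hello world!'
print(repr(text_of(indented)))
print(text_of(original) == text_of(indented))  # False
```

The reparsed tree keeps the added line breaks and spaces inside its text nodes, so the character data no longer matches the original tree.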
SECTION 4: XML output method Final bullet says "Additional nodes may be present in the new tree .. due to the character expansion phase of serialization." Could you please give an example of how character expansion can cause new nodes? I don't see how any of the four kinds of character expansion (URI escaping, CDATA sections, character mapping, special character references) can cause a new node. While these character expansions might change the physical presentation of a text node, I don't see how they can cause one text node to become two text nodes, for example. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > SECTION 4: XML output method > > Final bullet says "Additional nodes may be present in the new tree > .. due to the character expansion phase of serialization." > Could you please give an example of how character expansion can > cause new nodes? I don't see how any of the four kinds of > character expansion (URI escaping, CDATA sections, character > mapping, special character references) can cause a new node. > While these character expansions might change the physical > presentation of a text node, I don't see how they can cause one > text node to become two text nodes, for example. Thank you for your comment. The XSL and XQuery working groups discussed your comment, and decided to add a note to clarify the situation. I would like to add the following note to the final bullet of the bulleted list in section 4. << Note: The use-character-maps parameter can cause arbitrary characters to be inserted into the serialized XML document in an unescaped form, including characters that would be considered part of XML markup. Such characters could result in arbitrary new element nodes, attribute nodes, and so on, in the new tree that results from processing the serialized XML document. >> As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0927.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
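The situation in the proposed note can be sketched as follows: a character map emits markup characters unescaped, and reparsing the result yields an element node that the original tree did not contain. The map, the sample text and the helper `apply_character_map` are hypothetical, and the string concatenation only imitates what a serializer would do.

```python
import xml.etree.ElementTree as ET


def apply_character_map(text, char_map):
    """Replace each mapped character with its replacement string, unescaped."""
    return "".join(char_map.get(ch, ch) for ch in text)


# Original tree: a <p> element holding a single text node and no child elements.
original_text = "line one\u2028line two"

# Hypothetical character map: U+2028 is written out as a literal <br/> tag.
char_map = {"\u2028": "<br/>"}
serialized = "<p>" + apply_character_map(original_text, char_map) + "</p>"

reparsed = ET.fromstring(serialized)
print(serialized)                        # <p>line one<br/>line two</p>
print([child.tag for child in reparsed])  # ['br']
```

The original single text node has become two pieces of character data around a new br element, which is the kind of additional node the final bullet refers to.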
SECTION 4.5: XML output method: the omit-xml-declaration parameter The last sentence says "The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16." That is, if standalone is specified, then an XML declaration is mandatory in the output. Isn't an XML declaration also mandatory if the version is not 1.0? That should probably be added to the list in this sentence. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > SECTION 4.5: XML output method: the omit-xml-declaration parameter > > The last sentence says "The omit-xml-declaration parameter must > be ignored if the standalone parameter is present, or if the > encoding parameter specifies a value other than UTF-8 or UTF-16." > That is, if standalone is specified, then an XML declaration is > mandatory in the output. Isn't an XML declaration also mandatory > if the version is not 1.0? That should probably be added to > the list in this sentence. Thank you for your comment. The XSL and XQuery working groups discussed your comment, and concluded that, although XML 1.1 requires a document entity to have an XML declaration, it does not require an external general parsed entity to have a text declaration. However, prompted by your comment, the working groups decided to formulate their requirements for the omit-xml-declaration parameter to fit with the requirements of XML 1.0 and XML 1.1. The specification will require that the setting of the omit-xml-declaration parameter always be obeyed, and that any conflict between the setting of that parameter and the settings of other parameters be treated as a serialization error. A host language would, of course, have the option of ensuring such conflicts never arise through whatever language-specific mechanism it uses to specify serialization parameters. In particular, the working groups decided that if the serialized result could be considered to be an external general parsed entity, the omit-xml-declaration parameter could have the value yes or the value no, and the parameter's setting would take effect. 
They further decided that if the serialized result could only be considered to be a document entity, because either (a) the standalone parameter had the value yes or no, or (b) the version parameter had a value other than 1.0 and the doctype-system parameter was supplied, then the omit-xml-declaration parameter must have the value no. Otherwise, a serialization error results. As you were present when this decision was made, I will take it that the decision is acceptable to you. Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0928.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
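The decision above can be read as a small consistency check over the parameter settings. The following is a minimal sketch of that reading, not text from the specification: the function name check_omit_xml_declaration, the SerializationError class, and the use of None to model an absent standalone or doctype-system setting are assumptions made for illustration.

```python
class SerializationError(Exception):
    """Raised when serialization parameter settings conflict (illustrative)."""


def check_omit_xml_declaration(omit_xml_declaration, standalone=None,
                               version="1.0", doctype_system=None):
    """Return True if an XML declaration is to be emitted.

    None models an absent standalone or doctype-system setting; that
    modelling choice is an assumption of this sketch.
    """
    # The result can only be a document entity in these two cases.
    must_be_document_entity = (
        standalone in ("yes", "no")
        or (version != "1.0" and doctype_system is not None)
    )
    if must_be_document_entity and omit_xml_declaration == "yes":
        raise SerializationError(
            "omit-xml-declaration=yes conflicts with standalone, or with "
            "version/doctype-system"
        )
    # Otherwise the setting is simply obeyed.
    return omit_xml_declaration == "no"
```

Under this reading, omit-xml-declaration=yes combined with a non-UTF-8 encoding alone is accepted, while combining it with standalone=yes raises an error rather than being silently ignored.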
SECTION 4: XML output method Fifth paragraph says "In addition, the output must be such that if a new tree was constructed by parsing the XML document and converting it into a data model as specified in [Data Model], then the new data model would be the same as the starting data model, with the following possible exceptions:...". The word "same" is not defined for data models (sic; you mean "sequence", "item", "node" or "document node"). One cannot apply the word "same" literally to properties that are sequences of nodes, such as parent, children, attributes, and namespaces, since it is impossible to construct nodes with the same node identity as the original value. You may wish to look at how SQL/XML:2003 handled this issue (see Subclause 10.3 "Determination of identical values"). - Steve B.
SECTION 3: serialization parameters It says "indent specifies whether the processor may add additional whitespace when outputting the data model...". It is not clear to what extent this interacts with the properties of nodes. For example, if an element's type permits mixed content, then adding whitespace to that element's content potentially damages that element's semantics. If an element has not been validated, then it is possible that that element is intended to have mixed content and the processor just doesn't know it, so again, the conservative thing to do is to prohibit adding whitespace. If an element has been validated and is known to have only elements in its content model, then it would be permissible to add whitespace to that element's content on output as a pretty-printing option. The conclusion is that this parameter should only govern the output of such elements. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << SECTION 3: serialization parameters It says "indent specifies whether the processor may add additional whitespace when outputting the data model...". It is not clear to what extent this interacts with the properties of nodes. For example, if an element's type permits mixed content, then adding whitespace to that element's content potentially damages that element's semantics. If an element has not been validated, then it is possible that that element is intended to have mixed content and the processor just doesn't know it, so again, the conservative thing to do is to prohibit adding whitespace. If an element has been validated and is known to have only elements in its content model, then it would be permissible to add whitespace to that element's content on output as a pretty-printing option. The conclusion is that this parameter should only govern the output of such elements. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment. Recently, the working groups made a decision in principle to define serialization in terms of the mapping to Infoset defined by Data Model. As such, the concept of the content model of an element will no longer be available when the indent parameter takes effect. In order to give some guidance to processors regarding the effect of indent on elements with mixed content, the working groups decided to add a statement that whitespace should not be added where it might be significant. I would like to add the following item to the bulleted list in Section 5.3 of the 23 July draft of Serialization: << o Whitespace characters should not be added in places where the characters would be significant - for example, in the content of an element whose content model is known to be mixed. 
>> As a representative of Oracle was present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0930.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
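One conservative way a processor might follow this guidance without access to a content model is sketched below. It is a heuristic and an assumption of this sketch, not the rule adopted by the working groups: an element is treated as mixed whenever it already has non-whitespace text alongside child elements, and such elements (with everything inside them) are left untouched. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(elem):
    """Heuristic: non-whitespace text next to child elements means mixed."""
    if elem.text and elem.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in elem)


def indent(elem, level=0, unit="  "):
    """Add indentation only inside elements that look element-only."""
    children = list(elem)
    if not children or has_mixed_content(elem):
        return  # leave text-only and mixed content exactly as it is
    elem.text = "\n" + unit * (level + 1)
    for i, child in enumerate(children):
        indent(child, level + 1, unit)
        is_last = i == len(children) - 1
        child.tail = "\n" + unit * (level if is_last else level + 1)


root = ET.fromstring(
    "<doc><p>some <b>bold</b> text</p><list><item>x</item></list></doc>"
)
indent(root)
result = ET.tostring(root, encoding="unicode")
```

Here the list element gains line breaks and indentation, while the p element, whose text and b child are interleaved, is serialized unchanged.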
SECTION 4: XML output method The constraints expressed in the third paragraph, as currently worded, seem incomplete. The paragraph says "If the document node of the data model has a single element node child and no text node children, and the serialized output is a well-formed XML document entity, the serialized output must conform to the XML Namespaces Recommendation [XML Names]. If the data model does not take this form, and the serialized output is a well-formed XML external general parsed entity, then the serialized output must be an entity which, when referenced within a trivial XML document wrapper like this <!DOCTYPE doc [ <!ENTITY e SYSTEM "entity-URI"> ]> <doc>&e;</doc> where entity-URI is a URI for the entity, produces a document which must itself be a well-formed XML document conforming to the XML Namespaces Recommendation [XML Names]." This language seems to leave open the following possibilities: 1. The document node has a single element node child and no text node child, but the serialized output is not well-formed XML. 2. The document node does not have a single element node child, or has a text node child, but the serialized output is not a well-formed XML external general parsed entity. I think the solution is to reword the paragraph as follows: If the document node of the input value has a single element node child and no text node children, then the serialized output shall be a well-formed XML document entity that conforms to the XML Namespaces Recommendation [XML Names]. Otherwise, the serialized output shall be a well-formed XML external general parsed entity, which, when referenced within a trivial XML document wrapper like this <!DOCTYPE doc [ <!ENTITY e SYSTEM "entity-URI"> ]> <doc>&e;</doc> where entity-URI is a URI for the entity, produces a document which must itself be a well-formed XML document conforming to the XML Namespaces Recommendation [XML Names]. - Steve B.
Hi, Steve. In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: Steve Buxton wrote on 2004-02-17 06:44:15 AM: > SECTION 4: XML output method > > The constraints expressed in the third paragraph, as currently > worded, seem incomplete. The paragraph says > "If the document node of the data model has a single element node > child and no text node children, and the serialized output is a > well-formed XML document entity, the serialized output must conform > to the XML Namespaces Recommendation [XML Names]. If the data model > does not take this form, and the serialized output is a well-formed > XML external general parsed entity, then the serialized output must > be an entity which, when referenced within a trivial XML document > wrapper like this > > <!DOCTYPE doc [ > <!ENTITY e SYSTEM "entity-URI"> > ]> > <doc>&e;</doc> > > where entity-URI is a URI for the entity, produces a document which > must itself be a well-formed XML document conforming to the XML > Namespaces Recommendation [XML Names]." > > This language seems to leave open the following possibilities: > 1. The document node has a single element node child and no text > node child, but the serialized output is not well-formed XML. > 2. The document node does not have a single element node child, > or has a text node child, but the serialized output is not a > well-formed XML external general parsed entity. > > I think the solution is to reword the paragraph as follows: > > If the document node of the input value has a single element node > child and no text node children, then the serialized output shall > be a well-formed XML document entity that conforms to the XML > Namespaces Recommendation [XML Names]. 
Otherwise, the serialized > output shall be a well-formed XML external general parsed entity, > which, when referenced within a trivial XML document wrapper like > this > > <!DOCTYPE doc [ > <!ENTITY e SYSTEM "entity-URI"> > ]> > <doc>&e;</doc> > > where entity-URI is a URI for the entity, produces a document > which must itself be a well-formed XML document conforming to the > XML Namespaces Recommendation [XML Names]. Thank you for your comment. The XSL and XML Query Working Groups discussed your comment. The working groups agreed that the first paragraph of section 4 was intended to place a requirement on the serialization process that it must produce a well-formed entity (a document entity or external general parsed entity, as appropriate), unless it is unable to do so because of the effect of the character expansion phase of serialization. Otherwise, a serialization error results. In response to your comment and a related comment on the first three paragraphs of section 4, the working groups decided to make clear the intent of the first and third paragraphs of section 4 by making the following changes: - in the first sentence of the third paragraph, change "and the" to "then", to make it clear the conditions under which a document entity will be the result of the serialization process. - change the wording to make it clear that these rules describe requirements on the processor, rather than on the user. The processor will be required to produce a serialization error if it is unable to produce a well-formed entity of the appropriate kind, unless that is because of the action of the character expansion phase of serialization. As this seems to be in agreement with your proposed rewording, and a representative of Oracle was present when this decision was made, I will assume the response is acceptable. 
Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0932.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
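The wrapper test quoted in this comment can be approximated in a few lines. This is a sketch under stated assumptions, not a conformance check: instead of resolving an external entity reference, it strips a leading text declaration and inlines the serialized output between <doc> and </doc>, which is roughly what the reference &e; would expand to. The name is_well_formed_entity is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# A leading XML or text declaration is legal in an external parsed entity
# but not in the middle of a document, so it is removed before inlining.
_TEXT_DECL = re.compile(r"^<\?xml\s[^?]*\?>")


def is_well_formed_entity(serialized):
    """Approximate the <doc>&e;</doc> wrapper test by inlining the entity."""
    body = _TEXT_DECL.sub("", serialized, count=1)
    try:
        ET.fromstring("<doc>" + body + "</doc>")
    except ET.ParseError:
        return False
    return True
```

For example, output with several top-level elements and text passes the check, while output with an unclosed start tag does not. A processor following the decision above would signal a serialization error in the second case, unless the problem arose from character expansion.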
SECTION 4: XML output method The third bullet says "The base URIs in the two trees may be different." Document nodes also have a property called document-uri; probably the document-uri is not recoverable by reparsing a serialization either. - Steve B.
[My apologies that these comments are coming in after the end of the Last Call comment period.] Section 6 This section states that the default value for the version method is 4.0, while section 3 states that the default is implementation defined. The two statements need to be reconciled. The same comment probably applies to other parameters. Thanks, Henry [Speaking on behalf of reviewers from IBM.] ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
> > Section 6 > > This section states that the default value for the version > method is 4.0, > while section 3 states that the default is implementation > defined. The > two statements need to be reconciled. The same comment > probably applies > to other parameters. > Actually, I think that the Serialization spec should not define default values for any parameters. This should be up to the client application to specify. Michael Kay
Hello, In [1], I submitted the following comment on the last call draft of Serialization on behalf of IBM: Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:45:45 PM: > Section 6 > > This section states that the default value for the version method is > 4.0, while section 3 states that the default is implementation > defined. The two statements need to be reconciled. The same > comment probably applies to other parameters. In response [2], Michael Kay proposed: Michael Kay wrote on 2004-02-18 03:31:48 AM: > Actually, I think that the Serialization spec should not define default > values for any parameters. This should be up to the client application > to specify. The XSL and XQuery working groups considered this comment, and decided to accept Michael Kay's suggestion. This note announces the decision and signals my acceptance of the response. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0976.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0988.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
[My apologies that these comments are coming in after the end of the Last Call comment period.] Section 6 The default version of HTML should probably be 4.01. That's the default specified by XSLT. Thanks, Henry [Speaking on behalf of reviewers from IBM.] ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello, In [1], I submitted the following comment on the last call draft of Serialization on behalf of IBM: Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:46:35 PM: > Section 6 > > The default version of HTML should probably be 4.01. That's the > default specified by XSLT. The XSL and XQuery working groups considered this comment, and decided to have client specifications of serialization specify all parameter value settings. No defaults will be specified by the serialization specification. This note announces the decision and signals my acceptance of the response. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0977.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
[My apologies that these comments are coming in after the end of the Last Call comment period.] Section 5 The third bullet of this section states, "The serializer should avoid outputting line breaks and multiple whitespace characters within attribute values." It's not clear what a processor should do in such cases. This should state that these characters should be replaced by a single space character. Thanks, Henry [Speaking on behalf of reviewers from IBM.] ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello, In [1], I submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of IBM: Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:53:23 PM: > Section 5 > > The third bullet of this section states, "The serializer should > avoid outputting line breaks and multiple whitespace characters > within attribute values." It's not clear what a processor should do > in such cases. This should state that these characters should be > replaced by a single space character. The XSL and XML Query Working Groups discussed the comment, and decided to remove this rule about whitespace in attributes for the xhtml output method. All the other rules for describing the formatting requirements of the xhtml output method are strictly under the control of the processor, but in this case, the user has control of the content of the data model instance that is to be serialized, so the serialization process should just leave that to the user's control. Instead, the rule will be replaced with a non-normative reference to the compatibility appendix of XHTML as guidance to the user. This note announces and acknowledges that decision. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0980.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
4.5 XML Output Method: the omit-xml-declaration Parameter http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3 This says: "The omit-xml-declaration parameter must be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16." I would like to control the output of the omit-xml-declaration parameter, where the encoding parameter specifies a value other than UTF-8 or UTF-16. I often don't use Unicode. I would like the option to output with non-standard encoding as XHTML. The XHTML standard (http://www.w3.org/TR/xhtml1/) specifies that "an XML declaration is not required in all XML documents"; it is often desirable to omit it, given that it is known that there are unexpected results with some user agents. Thanks Deborah BBCi at http://www.bbc.co.uk/
Deborah, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > 4.5 XML Output Method: the omit-xml-declaration Parameter > http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3 > > This says: "The omit-xml-declaration parameter must be ignored if the > standalone parameter is present, or if the encoding parameter specifies > a value other than UTF-8 or UTF-16." > > I would like to control the output of the omit-xml-declaration > parameter, where the encoding parameter specifies a value other than > UTF-8 or UTF-16. I often don't use Unicode. I would like the option > to output with non-standard encoding as XHTML. The XHTML standard ( > http://www.w3.org/TR/xhtml1/) specifies that "an XML declaration is > not required in all XML documents"; it is often desirable to omit > it, given that it is known that there are unexpected results with > some user agents. Thank you for your comment. The XSL and XQuery Working Groups discussed your comment. As originally written, XML 1.0 required an XML declaration or a text declaration if the encoding of the document or external general parsed entity was anything other than UTF-8 or UTF-16. XSLT 1.0 enforced that requirement in its serialization mechanism. The draft of Serialization inherited that behaviour from XSLT 1.0. However, an erratum to XML 1.0 removed that requirement. In response to your comment, the working groups decided to permit the XML declaration or text declaration to be omitted, regardless of the setting of the encoding parameter. Serialization will permit an XML declaration or text declaration to be omitted in precisely those circumstances in which it can be omitted according to XML 1.0 and XML 1.1. This would affect both the xml and xhtml output methods. As that is the change you requested, I believe that decision will be acceptable to you. May I ask you to confirm that it is? 
Thanks, Henry [On behalf of the XSL and XQuery Working Groups.] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0996.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 3: Serialization parameters Phase 2, "Character markup", fourth bullet, mentions escaping of special characters such as <. You could also mention here the creation of character references for characters that are not representable in the encoding. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << SECTION 3: Serialization parameters Phase 2, "Character markup", fourth bullet, mentions escaping of special characters such as <. You could also mention here the creation of character references for characters that are not representable in the encoding. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided, because of the interactions between Unicode normalization and creation of character references, to fold together character expansion and Unicode normalization, and at the same time, add creation of character references to the character expansion phase. Specifically, the working groups decided to replace the second and third bullets of Section 4 of Serialization with the following text: << 2. Character expansion is concerned with the representation of characters appearing in text and attribute nodes in the instance of the data model. The substitution processes that may apply are listed below, in priority order: a character that is handled by one process in this list will be unaffected by processes appearing later in the list, except that a character affected by Unicode normalization may be affected by creation of CDATA sections and by character escaping o URI escaping (in the case of URI-valued attributes in the HTML and XHTML output methods), as determined by the escape-uri-attributes parameter o Character mapping, as determined by the use-character-maps parameter. Text nodes that are children of elements specified by the cdata-section-elements parameter are not affected by this step. o Unicode Normalization, if requested by the normalization-form parameter. Unicode normalization is applied to the character stream that results after all markup generation and character expansion has taken place. 
For the definitions of the various normalization forms, see [Character Model for the World Wide Web 1.0]. The meanings associated with the possible values of the normalization-form parameter are as follows: o NFC specifies the serialized result should be in Unicode Normalization Form C. o NFD specifies the serialized result should be in Unicode Normalization Form D. o NFKC specifies the serialized result should be in Unicode Normalization Form KC. o NFKD specifies the serialized result should be in Unicode Normalization Form KD. o fully-normalized specifies the serialized result should be in fully normalized form. o none specifies that no Unicode normalization should be applied. o An implementation-defined value has an implementation-defined effect. o Creation of CDATA sections, as determined by the cdata-section-elements parameter. Note that this is also affected by the encoding parameter, in that characters not present in the selected encoding cannot be represented in a CDATA section. o Escaping according to XML or HTML rules of special characters and of characters that cannot be represented in the selected encoding. For example replacing < by &lt;. >> The Unicode Normalization phase becomes the third step of character expansion. Character mapping becomes the second step, with the clarification that it does not affect elements to which cdata-section-elements applies. This was done to make it clear that any characters affected by character mapping are not affected by Unicode Normalization. The lead-in to the bulleted list will be modified so that CDATA section creation and escaping still apply to characters affected by Unicode Normalization - this is a consequence of trying to fold the two together. Finally, the last bullet will be modified to make it clear that not only special characters, but characters that can't be represented in the selected encoding are affected by that final step. 
As a representative of Oracle was present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1040.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
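The priority order in the replacement text can be sketched for a single text node as follows. This is a simplified illustration, not the specified algorithm: URI escaping and CDATA section creation are left out, normalization is applied to each unmapped run separately rather than to the whole character stream, the fully-normalized value is not handled, and the names expand_text and escape_run are hypothetical. What it does show is that mapped characters bypass normalization and escaping, while normalized characters are still subject to escaping.

```python
import unicodedata

SPECIAL = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_run(run, encoding):
    """Escape special characters and characters the encoding cannot hold."""
    out = []
    for ch in run:
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append("&#x%X;" % ord(ch))  # numeric character reference
    return "".join(out)


def expand_text(text, char_map, normalization_form="none", encoding="utf-8"):
    """Character mapping first, then normalization, then escaping."""
    pieces = []
    run = []  # characters not handled by the character map

    def flush():
        if run:
            chunk = "".join(run)
            if normalization_form != "none":
                # Accepts NFC, NFD, NFKC or NFKD.
                chunk = unicodedata.normalize(normalization_form, chunk)
            pieces.append(escape_run(chunk, encoding))
            run.clear()

    for ch in text:
        if ch in char_map:
            flush()
            pieces.append(char_map[ch])  # unaffected by later steps
        else:
            run.append(ch)
    flush()
    return "".join(pieces)


expanded = expand_text(
    "e\u0301 < \u20ac\ue000",
    {"\ue000": "<br/>"},          # hypothetical map entry
    normalization_form="NFC",
    encoding="iso-8859-1",
)
```

In this run the decomposed e-acute is composed by NFC and written directly, the less-than sign is escaped, the euro sign becomes a character reference because ISO-8859-1 cannot represent it, and the mapped character is emitted as unescaped markup.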
SECTION 3: Serialization parameters It says "standalone specifies whether the processor is to emit a standalone document declaration and the value of the declaration; the value must be yes or no." The first sentence implies that standalone is a parameter with three possible values: don't emit a standalone declaration; do emit and its value is yes; do emit and its value is no. Having only two values for the parameter is not adequate for its task. In section 4.5 "XML output method: the omit-xml-declaration parameter", the last paragraph includes the phrase "...if the standalone parameter is present...". This indicates that your model is that parameters are optional. With that model you can indeed get by with two values for the parameter, because the third state would be indicated by the absence of the parameter. If this is your intent, you should preface the list of parameters by saying that they are optional. Also, most of the parameter descriptions include a sentence of the form "If this parameter is not specified..." but a few do not. It would be good to supply this sentence for all parameters. - Steve B.
Steve, In [1], you submitted the following comment on the Serialization last call draft: Steve Buxton wrote on 2004-02-18 05:22:15 PM: > SECTION 3: Serialization parameters > > It says "standalone specifies whether the processor is to emit > a standalone document declaration and the value of the declaration; > the value must be yes or no." The first sentence implies that > standalone is a parameter with three possible values: > don't emit a standalone declaration; do emit and its value is yes; > do emit and its value is no. Having only two values for the > parameter is not adequate for its task. > > In section 4.5 "XML output method: the omit-xml-declaration > parameter", the last paragraph includes the phrase > "...if the standalone parameter is present...". This indicates > that your model is that parameters are optional. With that > model you can indeed get by with two values for the parameter, > because the third state would be indicated by the absence of > the parameter. If this is your intent, you should preface the > list of parameters by saying that they are optional. Also, > most of the parameter descriptions include a sentence of the > form "If this parameter is not specified..." but a few do not. > It would be good to supply this sentence for all parameters. Thank you for submitting your comment. The XSL and XQuery Working Groups discussed your comment and several related comments. In most cases, the serialization draft treated a parameter whose value was not specified by the client specification as if it had been specified with a particular default value that was defined by either the serialization draft or by the implementation. Such parameters, though optional from the point of view of the client specification, always had some value. In a few instances - as with the standalone parameter - the absence of a parameter was treated as if it were a distinct setting for the parameter. 
The working groups decided to place the onus on the client specifications (XSLT and XQuery for now) to specify default values for parameters, if appropriate, rather than defining any in the Serialization specification. With this change, only the doctype-public and doctype-system parameters could be absent. The following table, which will replace the descriptions of the parameter values that currently appear in Section 3, should clarify this. Corresponding changes to the uses of the parameter values in subsequent sections will similarly be made.

<<
+----------------------+------------------------------------------------+
|PARAMETER NAME        |PERMITTED VALUES FOR PARAMETER                  |
+----------------------+------------------------------------------------+
|cdata-section-elements|A list of expanded-QNames, possibly empty.      |
+----------------------+------------------------------------------------+
|doctype-public        |A string of Unicode characters. This parameter  |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|doctype-system        |A string of Unicode characters. This parameter  |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|encoding              |A string of Unicode characters in the range #x21|
|                      |to #x7E (that is, printable ASCII characters);  |
|                      |the value should be a charset registered with   |
|                      |the Internet Assigned Numbers Authority [IANA], |
|                      |[RFC2278] or begin with the characters x- or X-.|
+----------------------+------------------------------------------------+
|escape-uri-attributes |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|include-content-type  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|indent                |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|media-type            |A string of Unicode characters specifying the   |
|                      |media type (MIME content type) [RFC2376]; the   |
|                      |charset parameter of the media type must not be |
|                      |specified explicitly.                           |
+----------------------+------------------------------------------------+
|method                |An expanded-QName with a null namespace URI, and|
|                      |the local part of the name equal to xml, xhtml, |
|                      |html or text, or having a non-null namespace    |
|                      |URI. If the namespace URI is non-null, the      |
|                      |parameter specifies an implementation-defined   |
|                      |output method.                                  |
+----------------------+------------------------------------------------+
|normalize-unicode     |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|omit-xml-declaration  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|standalone            |One of the enumerated values yes, no or none    |
+----------------------+------------------------------------------------+
|undeclare-namespaces  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|use-character-maps    |A list of pairs, possibly empty, with each pair |
|                      |consisting of a single Unicode character and a  |
|                      |string of Unicode characters.                   |
+----------------------+------------------------------------------------+
|version               |A string of Unicode characters.                 |
+----------------------+------------------------------------------------+
>>

May I ask you to confirm that this response to your comment is acceptable?

Thanks, Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1042.html
------------------------------------------------------------------
Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
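The permitted values in the table above can be read as a small set of checks. The following is only an illustrative sketch in Python, not part of the Serialization specification: the function name check_parameters, the dictionary representation of the parameters, and the (namespace URI, local name) tuple used for the method parameter are all assumptions made for this example. It also assumes that an encoding name is non-empty, which the table does not state.

```python
import re

# Parameters that the table restricts to an enumerated set of values.
ENUMERATED = {
    "escape-uri-attributes": {"yes", "no"},
    "include-content-type": {"yes", "no"},
    "indent": {"yes", "no"},
    "normalize-unicode": {"yes", "no"},
    "omit-xml-declaration": {"yes", "no"},
    "standalone": {"yes", "no", "none"},
    "undeclare-namespaces": {"yes", "no"},
}
# Only these two parameters may be absent.
OPTIONAL = {"doctype-public", "doctype-system"}
ALL_PARAMS = set(ENUMERATED) | OPTIONAL | {
    "cdata-section-elements", "encoding", "media-type",
    "method", "use-character-maps", "version",
}
BUILT_IN_METHODS = {"xml", "xhtml", "html", "text"}


def check_parameters(params):
    """Return a list of problems found in a dict of serialization parameters.

    Hypothetical helper: the dict layout is an assumption of this sketch.
    """
    problems = []
    # Every parameter except the two optional ones must be present.
    for name in sorted(ALL_PARAMS - OPTIONAL - set(params)):
        problems.append(f"{name}: missing")
    for name, allowed in ENUMERATED.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}: must be one of {sorted(allowed)}")
    # encoding: characters in the range #x21 to #x7E (assumed non-empty).
    encoding = params.get("encoding")
    if encoding is not None and not re.fullmatch(r"[\x21-\x7e]+", encoding):
        problems.append("encoding: characters outside #x21 to #x7E")
    # method: (namespace_uri, local_name); a null namespace URI requires
    # one of the four built-in output methods.
    method = params.get("method")
    if method is not None:
        namespace_uri, local_name = method
        if not namespace_uri and local_name not in BUILT_IN_METHODS:
            problems.append("method: unknown name in no namespace")
    return problems
```

A client specification that supplies defaults for every required parameter would pass this check with an empty result; dropping indent or setting standalone to an unlisted value would each add one entry to the returned list.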
Section 1 Introduction Editorial We think it may be better, if the serialization document is kept separate, so other serialization formats can be added without impacting the general datamodel document.
Michael, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Section 1 Introduction Editorial We think it may be better, if the serialization document is kept separate, so other serialization formats can be added without impacting the general datamodel document. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and agreed that it would be best to specify the serialization process in a separate document. The editorial note will be deleted. I am unsure whether any representative of Microsoft, apart from the chair of the XML Query WG, was present when this decision was made. May I ask you to confirm that this response is acceptable to you? Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1195.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Section 2 Editorial/Technical Please rewrite "Replace any string in the sequence with a text node whose string value is equal to the string." as "Replace any string with length greater than 0 in the sequence with a text node whose string value is equal to the string. Remove any zero-length string from the sequence."
Michael, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Section 2 Editorial/Technical Please rewrite "Replace any string in the sequence with a text node whose string value is equal to the string." as "Replace any string with length greater than 0 in the sequence with a text node whose string value is equal to the string. Remove any zero-length string from the sequence." >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and noted that the following text was added to Section 7.7.1 of the July 23 draft of the Data Model specification [2]: << When a Document or Element Node is constructed, Text Nodes that would be adjacent are combined into a single Text Node. If the resulting Text Node is empty, it is never placed among the children of its parent, it is simply discarded. >> The working groups decided that this text in Data Model makes the change that you recommended no longer necessary. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1197.html [2] http://www.w3.org/TR/2004/WD-xpath-datamodel-20040723/#TextNodeOverview ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
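To illustrate why the Data Model text quoted above makes a separate rule for zero-length strings unnecessary, here is a minimal sketch in Python. The Text and Element classes and the build_children function are hypothetical stand-ins invented for this example; they cover only the two rules under discussion (strings become text nodes, adjacent text nodes are merged and empty ones discarded), not the full normalization of Section 2.

```python
from dataclasses import dataclass


@dataclass
class Text:
    """Hypothetical stand-in for a text node."""
    value: str


@dataclass
class Element:
    """Hypothetical stand-in for an element node."""
    name: str


def build_children(sequence):
    """Build the child list of a new document or element node."""
    children = []
    for item in sequence:
        # Replace any string in the sequence with a text node.
        node = Text(item) if isinstance(item, str) else item
        if isinstance(node, Text) and children and isinstance(children[-1], Text):
            # Text nodes that would be adjacent are combined into one.
            children[-1] = Text(children[-1].value + node.value)
        else:
            children.append(node)
    # An empty text node is never placed among the children.
    return [c for c in children
            if not (isinstance(c, Text) and c.value == "")]
```

Under these assumptions a zero-length string first becomes an empty text node and is then either absorbed by a neighbouring text node or dropped, which is the effect the proposed rewording asked for.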
Section 4 Editorial/Technical Please rewrite "and this may result in type annotations that are either more or less precise than those in the original result tree." as "and this may result in type annotations that are different from those in the original result tree."
Michael, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Section 4 Editorial/Technical Please rewrite "and this may result in type annotations that are either more or less precise than those in the original result tree." as "and this may result in type annotations that are different from those in the original result tree." >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and agreed to make the change that you suggested. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1198.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
General Editorial Section 2 introduces the term of a normalized sequence. Use this instead of "data model" in the rules that operate on the normalized sequence.
Michael, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization: << Section 2 introduces the term of a normalized sequence. Use this instead of "data model" in the rules that operate on the normalized sequence. >> Thank you for this comment. The XSL and XML Query Working Groups discussed your comment, and decided to change "instance of the data model" throughout Section 2 to "sequence", keeping the distinction between the sequence that is input to serialization and the normalized sequence clear throughout. As you were present when this decision was made, I will assume the response is acceptable to you. Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1204.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
General Technical We see a use for an XML-based markup vocabulary or XQuery expression-based serialization to describe a full data model instance that provides for all items using a fixed schema that exposes the data model's node structure and property information. However, we believe that such a modus should be done in a future version and not delay the current publication cycle.
Michael, In [1], you submitted the following comment on the serialization last call: > General > Technical > > We see a use for an XML-based markup vocabulary or XQuery > expression-based serialization to describe a full data model instance > that provides for all items using a fixed schema that exposes the data > model's node structure and property information. However, we believe > that such a modus should be done in a future version and not delay the > current publication cycle. Thank you for submitting this comment. The XSL and XQuery working groups considered your comment and related comments. There was general agreement that there is some need for a mechanism for serializing arbitrary sequences that preserves most or all of the properties of the items in an arbitrary sequence that is being serialized. The working groups decided that precisely defining all of the requirements for such a mechanism at this stage would be difficult, and would likely lead to a solution that would not satisfy real user requirements. Therefore, the working groups decided to consider such a feature for a future revision of the recommendations, and close this comment without any changes to the specifications. As this seems to be the outcome you proposed, I trust this resolution is acceptable to you. May I ask you to confirm? Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1205.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Confirmed. Thanks Michael
Dear XML Query WG and XSL WG, This is a last call comment on your Serialization document. We are sorry that this last call comment is late, but it is very important. We have earlier sent comments on the Data Model and on XSLT where we have urged better support for the inheritance of xml:lang (or for inherited attributes in general). Without any such support, it is extremely tedious to write a transformation or query that adequately copies xml:lang from the input to the output. In internal discussion, Liam Quin suggested that it might be more appropriate to submit our comment against Serialization. The reason for this is that XPath/XSLT offers reasonable support for extracting xml:lang information from source documents into the data model. However, when serialized, this leads in most cases to a completely unnecessary and undesirable multiplication of xml:lang attributes on virtually every element. Adding some support for reducing unnecessary xml:lang attributes from the output on serialization would be highly desirable. As I think we have written previously, better support for xml:lang (and maybe inherited attributes in general) was something that was left over as 'future work' from XSLT 1.0. Your current work is the best chance to fix this problem. Regards, Martin.
> We have earlier sent comments on the Data Model and on XSLT > where we have urged better support for the inheritance of > xml:lang (or for inherited attributes in general). Without > any such support, it is extremely tedious to write a > transformation or query that adequately copies xml:lang > from the input to the output. I have been monitoring questions and answers on the xsl-list (at one time there were 100 a day) for five years now, and I have not once seen a complaint about this from a user. It might be difficult in theory, but I don't think it is a problem in practice. Michael Kay
Hello Michael, At 17:52 04/05/06 +0100, Michael Kay wrote: > > We have earlier sent comments on the Data Model and on XSLT > > where we have urged better support for the inheritance of > > xml:lang (or for inherited attributes in general). Without > > any such support, it is extremely tedious to write a > > transformation or query that adequately copies xml:lang > > from the input to the output. > >I have been monitoring questions and answers on the xsl-list (at one time >there were 100 a day) for five years now, and I have not once seen a >complaint about this from a user. It might be difficult in theory, but I >don't think it is a problem in practice. If you don't think it's a problem in practice, what about taking the xml-to-xhtml XSLT associated with the xmlspec DTD, and change it so that multilingual input is output with the correct xml:lang attributes (but without having xml:lang on every element if not necessary). Regards, Martin.
> If you don't think it's a problem in practice, what about taking > the xml-to-xhtml XSLT associated with the xmlspec DTD, and change > it so that multilingual input is output with the correct xml:lang > attributes (but without having xml:lang on every element if not > necessary). > There are lots of things in XSLT that aren't easy, such as processing CALS table models. But on the scale of problems this one is by no means difficult: for example it can be done by running a three-phase transformation in which a pre-processing phase adds redundant xml:lang attributes: <xsl:template match="*"> <xsl:copy> <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/> <xsl:apply-templates/> </xsl:copy> </xsl:template> and a post-processing phase removes redundant xml:lang attributes: <xsl:template match="*"> <xsl:copy> <xsl:copy-of select="@* except @xml:lang[. = ancestor::*/@xml:lang[last()]]"/> <xsl:apply-templates/> </xsl:copy> </xsl:template> When I said it doesn't seem to be a problem in practice, I meant that I don't see evidence of lots of users trying to do this and complaining that it's difficult. I see a lot more complaints about the difficulty of handling CALS table models. For a problem that doesn't arise often in practice, a solution in 12 lines of code seems good enough. Inherited attributes are a bit of an oddity. Formally, there is no such thing as an inherited attribute, it's only a design convention, and there are lots of different variations on it - xml:space, xml:base, and xml:lang work quite differently from each other. Without a formalisation of inherited attributes in the data model and in the XML Schema type system, it's not easy to come up with language constructs that would be generic enough to be useful, and an ad-hoc solution for one particular attribute would be really bad design, especially in the absence of any evidence of a pressing user problem. Michael Kay
Hello Michael, Sorry for the delay of my answer. At 10:17 04/05/07 +0100, Michael Kay wrote: > > If you don't think it's a problem in practice, what about taking > > the xml-to-xhtml XSLT associated with the xmlspec DTD, and change > > it so that multilingual input is output with the correct xml:lang > > attributes (but without having xml:lang on every element if not > > necessary). > >There are lots of things in XSLT that aren't easy, such as processing CALS >table models. But on the scale of problems this one is by no means >difficult: for example it can be done by running a three-phase >transformation in which a pre-processing phase adds redundant xml:lang >attributes: > ><xsl:template match="*"> > <xsl:copy> > <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/> > <xsl:apply-templates/> > </xsl:copy> ></xsl:template> It turns out that this first pass can be integrated into the main pass. >and a post-processing phase removes redundant xml:lang attributes: > ><xsl:template match="*"> > <xsl:copy> > <xsl:copy-of select="@* except @xml:lang[. = >ancestor::*/@xml:lang[last()]]"/> > <xsl:apply-templates/> > </xsl:copy> ></xsl:template> > >When I said it doesn't seem to be a problem in practice, I meant that I >don't see evidence of lots of users trying to do this and complaining that >it's difficult. I see a lot more complaints about the difficulty of handling >CALS table models. For a problem that doesn't arise often in practice, a >solution in 12 lines of code seems good enough. Well, if it were only this code. But the overhead of going from a one-pass solution to a two-pass solution is quite heavy in many respects. It is an important barrier. >Inherited attributes are a bit of an oddity. Formally, there is no such >thing as an inherited attribute, it's only a design convention, and there >are lots of different variations on it - xml:space, xml:base, and xml:lang >work quite differently from each other. 
Without a formalisation of inherited >attributes in the data model and in the XML Schema type system, it's not >easy to come up with language constructs that would be generic enough to be >useful, and an ad-hoc solution for one particular attribute would be really >bad design, I agree. But I'm sure there are ways to do this that are not ad-hoc for a single attribute, that can adapt to the different inherited attributes, and that don't necessarily need to be in the data model. I seem to remember that James Clark at one point said that having a feature to recursively invoke XSLT (in this case on its output) would easily solve this problem. Regards, Martin. >especially in the absence of any evidence of a pressing user >problem. > >Michael Kay
> I seem to remember that James Clark at one point said that having > a feature to recursively invoke XSLT (in this case on its output) > would easily solve this problem. You can now indeed invoke one XSLT template to process the output of another. This is the multi-pass solution that I showed you. Michael Kay
At 10:07 04/05/25 +0100, Michael Kay wrote: > > I seem to remember that James Clark at one point said that having > > a feature to recursively invoke XSLT (in this case on its output) > > would easily solve this problem. > >You can now indeed invoke one XSLT template to process the output of >another. This is the multi-pass solution that I showed you. Hello Michael, Are you saying that this can indeed be done with a single invocation of an XSLT implementation, with a single stylesheet? Your use of "pre-processing phase" and so on in your previous mail wasn't totally clear on this, at least not for me. If this is true, it would be very nice, and I would assume that our WG would then be very happy with the result. For our reference, can you please either point to the section in the spec where this multi-pass thing is described, or can you resend the code in your earlier mail with some framework code added that shows how to define the various passes? Regards, Martin.
> At 10:07 04/05/25 +0100, Michael Kay wrote: > > > > I seem to remember that James Clark at one point said that having > > > a feature to recursively invoke XSLT (in this case on its output) > > > would easily solve this problem. > > > >You can now indeed invoke one XSLT template to process the output of > >another. This is the multi-pass solution that I showed you. > > Hello Michael, > > Are you saying that this can indeed be done with a single invocation > of an XSLT implementation, with a single stylesheet? Your use of > "pre-processing phase" and so on in your previous mail wasn't > totally clear on this, at least not for me. Yes, it can all be done within a single transformation in a single stylesheet. > > If this is true, it would be very nice, and I would assume that our > WG would then be very happy with the result. For our reference, can > you please either point to the section in the spec where this > multi-pass thing is described, or can you resend the code in > your earlier mail with some framework code added that shows how > to define the various passes? There's a simple example showing how temporary trees can be used to support multi-phase transformations in section 9.4 of the spec: http://www.w3.org/TR/xslt20/#temporary-trees I'm afraid I'm too busy today to do a worked example for you. Michael Kay
At 08:47 04/05/26 +0100, Michael Kay wrote: > > Hello Michael, > > > > Are you saying that this can indeed be done with a single invocation > > of an XSLT implementation, with a single stylesheet? Your use of > > "pre-processing phase" and so on in your previous mail wasn't > > totally clear on this, at least not for me. > >Yes, it can all be done within a single transformation in a single >stylesheet. This sounds great! > > If this is true, it would be very nice, and I would assume that our > > WG would then be very happy with the result. For our reference, can > > you please either point to the section in the spec where this > > multi-pass thing is described, or can you resend the code in > > your earlier mail with some framework code added that shows how > > to define the various passes? > >There's a simple example showing how temporary trees can be used to support >multi-phase transformations in section 9.4 of the spec: > >http://www.w3.org/TR/xslt20/#temporary-trees > >I'm afraid I'm too busy today to do a worked example for you. Okay, I took this example, and the code fragments that you sent earlier, and put something together below. I'd appreciate it if you could check it. I'm not sure I got everything right, in particular all the modes. How to create a stylesheet that cleanly copies xml:lang: [assuming for simplicity that all xml:lang information is coming from the source, not from the stylesheet, and that only whole elements are transferred, not independent textual pieces] [I'm using a three-pass solution; this could be done in many cases as a two-pass solution] - Start with your stylesheet. - Make sure that on all elements, xml:lang is copied. - Assumes that the main mode for the original stylesheet is the default mode. 
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="*" mode="expandXmlLang">
    <xsl:copy>
      <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/>
      <xsl:apply-templates mode="expandXmlLang"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*" mode="cleanXmlLang">
    <xsl:copy>
      <xsl:copy-of select="@* except @xml:lang[. = ancestor::*/@xml:lang[last()]]"/>
      <xsl:apply-templates mode="cleanXmlLang"/>
    </xsl:copy>
  </xsl:template>

  <!-- rest of your stylesheet here or somewhere -->

  <xsl:variable name="xmlLangExpanded">
    <xsl:apply-templates select="/" mode="expandXmlLang"/>
  </xsl:variable>

  <xsl:variable name="processedMain">
    <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:apply-templates select="$processedMain" mode="cleanXmlLang"/>
  </xsl:template>

</xsl:stylesheet>

Regards, Martin.
Yes, this code looks correct. Michael Kay > -----Original Message----- > From: public-qt-comments-request@w3.org > [mailto:public-qt-comments-request@w3.org] On Behalf Of Martin Duerst > Sent: 27 May 2004 07:26 > To: Michael Kay; public-qt-comments@w3.org > Cc: w3c-i18n-ig@w3.org; 'Liam Quin' > Subject: RE: [Serial] additional last call comment about xml:lang > > > At 08:47 04/05/26 +0100, Michael Kay wrote: > > > > Hello Michael, > > > > > > Are you saying that this can indeed be done with a single > invocation > > > of an XSLT implementation, with a single stylesheet? Your use of > > > "pre-processing phase" and so on in your previous mail wasn't > > > totally clear on this, at least not for me. > > > >Yes, it can all be done within a single transformation in a single > >stylesheet. > > This sounds great! > > > > If this is true, it would be very nice, and I would > assume that our > > > WG would then be very happy with the result. For our > reference, can > > > you please either point to the section in the spec where this > > > multi-pass thing is described, or can you resend the code in > > > your earlier mail with some framework code added that shows how > > > to define the various passes? > > > >There's a simple example showing how temporary trees can be > used to support > >multi-phase transformations in section 9.4 of the spec: > > > >http://www.w3.org/TR/xslt20/#temporary-trees > > > >I'm afraid I'm too busy today to do a worked example for you. > > Okay, I took this example, and the code fragments that you sent > earlier, and put something together below. I'd appreciate if you > could check it. I'm not sure I got everything right, in particular > all the modes. 
> > How to create a stlyesheet that cleanly copies xml:lang: > [assuming for simplicity that all xml:lang information > is comming from the source, not from the stylesheet, and > that only whole elements are transferred, not independent > textual pieces] > [I'm using a tree-pass solution; this could be done in > many cases as a two-pass solution] > > - Start with your stylesheet. > - Make sure that on all elements, xml:lang is copied. > - Assumes that the main mode for the original stylesheet > is the default mode. > > > <xsl:stylesheet > version="2.0" > xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> > > <xsl:template match="*" mode="expandXmlLang"> > <xsl:copy> > <xsl:copy-of select="@* | > ancestor-or-self::*/@xml:lang[last()]"/> > <xsl:apply-templates mode="expandXmlLang"/> > </xsl:copy> > </xsl:template> > > <xsl:template match="*" mode="cleanXmlLang"> > <xsl:copy> > <xsl:copy-of select="@* except @xml:lang[. = > ancestor::*/@xml:lang[last()]]"/> > <xsl:apply-templates mode="cleanXmlLang"/> > </xsl:copy> > </xsl:template> > > <!-- rest of your stylesheet here or somewhere --> > > <xsl:variable name="xmlLangExpanded"> > <xsl:apply-templates select="/" mode="expandXmlLang"/> > </xsl:variable> > > <xsl:variable name="processedMain"> > <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/> > </xsl:variable> > > <xsl:template match="/"> > <xsl:apply-templates select"$processedMain" mode="cleanXmlLang"/> > <xsl:template> > </xsl:stylesheet> > > > Regards, Martin. >
Hello Michael, I'm sorry for the delay of this message. I would like to thank you for your help. The I18N WG is satisfied with the solution to the problem of carrying xml:lang information from a source document to an output document using XSLT. However, we think that it will be difficult for the reader to understand this, and we therefore request that the code below, or something similar, be added to the specification as an example. With kind regards, Martin. At 11:56 04/05/27 +0100, Michael Kay wrote: >Yes, this code looks correct. > >Michael Kay > > > -----Original Message----- > > From: public-qt-comments-request@w3.org > > [mailto:public-qt-comments-request@w3.org] On Behalf Of Martin Duerst > > Sent: 27 May 2004 07:26 > > To: Michael Kay; public-qt-comments@w3.org > > Cc: w3c-i18n-ig@w3.org; 'Liam Quin' > > Subject: RE: [Serial] additional last call comment about xml:lang > > > > > > At 08:47 04/05/26 +0100, Michael Kay wrote: > > > > > > Hello Michael, > > > > > > > > Are you saying that this can indeed be done with a single > > invocation > > > > of an XSLT implementation, with a single stylesheet? Your use of > > > > "pre-processing phase" and so on in your previous mail wasn't > > > > totally clear on this, at least not for me. > > > > > >Yes, it can all be done within a single transformation in a single > > >stylesheet. > > > > This sounds great! > > > > > > If this is true, it would be very nice, and I would > > assume that our > > > > WG would then be very happy with the result. For our > > reference, can > > > > you please either point to the section in the spec where this > > > > multi-pass thing is described, or can you resend the code in > > > > your earlier mail with some framework code added that shows how > > > > to define the various passes? 
> > > > > >There's a simple example showing how temporary trees can be > > used to support > > >multi-phase transformations in section 9.4 of the spec: > > > > > >http://www.w3.org/TR/xslt20/#temporary-trees > > > > > >I'm afraid I'm too busy today to do a worked example for you. > > > > Okay, I took this example, and the code fragments that you sent > > earlier, and put something together below. I'd appreciate if you > > could check it. I'm not sure I got everything right, in particular > > all the modes. > > > > How to create a stlyesheet that cleanly copies xml:lang: > > [assuming for simplicity that all xml:lang information > > is comming from the source, not from the stylesheet, and > > that only whole elements are transferred, not independent > > textual pieces] > > [I'm using a tree-pass solution; this could be done in > > many cases as a two-pass solution] > > > > - Start with your stylesheet. > > - Make sure that on all elements, xml:lang is copied. > > - Assumes that the main mode for the original stylesheet > > is the default mode. > > > > > > <xsl:stylesheet > > version="2.0" > > xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> > > > > <xsl:template match="*" mode="expandXmlLang"> > > <xsl:copy> > > <xsl:copy-of select="@* | > > ancestor-or-self::*/@xml:lang[last()]"/> > > <xsl:apply-templates mode="expandXmlLang"/> > > </xsl:copy> > > </xsl:template> > > > > <xsl:template match="*" mode="cleanXmlLang"> > > <xsl:copy> > > <xsl:copy-of select="@* except @xml:lang[. 
= > > ancestor::*/@xml:lang[last()]]"/> > > <xsl:apply-templates mode="cleanXmlLang"/> > > </xsl:copy> > > </xsl:template> > > > > <!-- rest of your stylesheet here or somewhere --> > > > > <xsl:variable name="xmlLangExpanded"> > > <xsl:apply-templates select="/" mode="expandXmlLang"/> > > </xsl:variable> > > > > <xsl:variable name="processedMain"> > > <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/> > > </xsl:variable> > > > > <xsl:template match="/"> > > <xsl:apply-templates select"$processedMain" mode="cleanXmlLang"/> > > <xsl:template> > > </xsl:stylesheet> > > > > > > Regards, Martin. > >
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group: << This is a last call comment on your Serialization document. We are sorry that this last call comment is late, but it is very important. We have earlier sent comments on the Data Model and on XSLT where we have urged better support for the inheritance of xml:lang (or for inherited attributes in general). Without any such support, it is extremely tedious to write a transformation or query that adequately copies xml:lang from the input to the output. In internal discussion, Liam Quin suggested that it might be more appropriate to submit our comment against Serialization. The reason for this is that XPath/XSLT offers reasonable support for extracting xml:lang information from source documents into the data model. However, when serialized, this leads in most cases to a completely unnecessary and undesirable multiplication of xml:lang attributes on virtually every element. Adding some support for reducing unnecessary xml:lang attributes from the output on serialization would be highly desirable. As I think we have written previously, better support for xml:lang (and maybe inherited attributes in general) was something that was left over as 'future work' from XSLT 1.0. Your current work is the best chance to fix this problem. >> There was much subsequent discussion of the topic between Michael Kay and yourself.[2-9] In [10], you indicated that the I18N Working Group was satisfied with the mechanisms that are available for filtering redundant xml:lang attributes. The XSL and XQuery Working Groups discussed the issue and decided that, in light of the discussion, no change to Serialization is required. The XSL WG will consider adding an example to the XSLT 2.0 specification. May I ask you to confirm that this response is acceptable to the I18N Working Group? 
Thanks, Henry [On behalf of the XSL and XML Query Working Groups] [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0006.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0010.html [3] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0013.html [4] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0014.html [5] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0055.html [6] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0056.html [7] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0067.html [8] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0068.html [9] http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0074.html [10] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jul/0052.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
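For readers who want to see what "filtering redundant xml:lang attributes" amounts to, the cleaning pass discussed in this thread can be sketched outside XSLT. The following Python fragment is an illustration only and is not the stylesheet exchanged above: the function name drop_redundant_lang is an assumption of this example, it uses the standard xml.etree.ElementTree module, and it compares language values as exact strings (it does not treat "en" and "EN" as equal). Each element's xml:lang is compared only with the value in effect on its parent, that is, the value declared by the nearest ancestor that has one.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def drop_redundant_lang(element, inherited=None):
    """Remove xml:lang wherever it repeats the inherited value.

    `inherited` is the language in effect on the parent element,
    or None when no ancestor declares one.
    """
    lang = element.get(XML_LANG)
    if lang is not None and lang == inherited:
        # Same language as the nearest declaring ancestor: redundant.
        del element.attrib[XML_LANG]
    # The value in effect for the children is unchanged either way.
    effective = lang if lang is not None else inherited
    for child in element:
        drop_redundant_lang(child, effective)


# Usage: every element carries xml:lang, as after an "expand" pass.
root = ET.fromstring(
    '<doc xml:lang="en">'
    '<p xml:lang="en">one <q xml:lang="fr">deux <i xml:lang="en">three</i></q></p>'
    '<p xml:lang="en">four</p>'
    '</doc>'
)
drop_redundant_lang(root)
```

After the call, xml:lang remains on doc ("en"), on q ("fr") and on i ("en", because it differs from the "fr" in effect on its parent), and is removed from both p elements.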
The public draft says: "The serialization of the instance of the data model follows the same rules as for the xml output method, with the exceptions noted below." Indentation is not mentioned, which implies that the xml, not the html, indentation rules should be followed. So I did. But then output within a <pre> tag is wrecked. So I've changed my code to follow the html rules for indentation. Surely this is what is intended? -- Colin Paul Adams Preston Lancashire
Colin, Colin Paul Adams wrote on 09/03/2004 08:10:24 AM: > The public draft says: > > "The serialization of the instance of the data model follows the same > rules as for the xml output method, with the exceptions noted below." > > Indentation is not mentioned, which implies that the xml, not the > html, indentation rules should be followed. > > So I did. > > But then output within a <pre> tag is wrecked. > > So I've changed my code to follow the html rules for indentation. > Surely this is what is intended? Speaking for myself, I believe you are correct - that would be the only thing that would make sense in the context of the xhtml output method. Thanks, Henry ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Colin, In [1], you submitted the following comment on the 23 July Working Draft of Serialization: << The public draft says: "The serialization of the instance of the data model follows the same rules as for the xml output method, with the exceptions noted below." Indentation is not mentioned, which implies that the xml, not the html, indentation rules should be followed. So I did. But then output within a <pre> tag is wrecked. So I've changed my code to follow the html rules for indentation. Surely this is what is intended? >> Thank you for this comment. The XSL Working Group discussed the comment and intend to resolve this by adding the following item to the bulleted list in Section 6 of Serialization "XHTML Output Method", modelled upon the corresponding description for the HTML output method. << o If the indent parameter has the value yes, the serializer may add or remove whitespace as it outputs the result tree, so long as it does not change the way that a conforming HTML user agent would render the output. Note: This rule can be satisfied by observing the following constraints: o Whitespace must only be added before or after an element, or adjacent to an existing whitespace character. o Whitespace must not be added or removed adjacent to an inline element. The inline elements are those elements in the XHTML namespace in the %inline category of any of the XHTML 1.0 DTD's, in the %inline.class category of the XHTML 1.1 DTD, and elements in the XHTML namespace with local names ins and del if they are used as inline elements (i.e., if they do not contain element children). o Whitespace must not be added or removed inside a formatted element, the formatted elements being those in the XHTML namespace with local names pre, script, style, and textarea. The HTML definition of whitespace is different from the XML definition: see section 9.1 of the HTML 4.01 specification. >> May I ask you to confirm that this response is acceptable to you? 
Thanks, Joanne Tong [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Sep/0022.html
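The constraints in the proposed note can be read as two simple checks a serializer makes before adding indentation. The sketch below, in Python, is only an illustration under stated assumptions: the function names are hypothetical, and INLINE_SUBSET is a deliberately incomplete stand-in for the %inline category of the XHTML DTDs.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Formatted elements named in the proposed text.
FORMATTED = {"pre", "script", "style", "textarea"}

# Assumption: a small, incomplete sample of inline elements for illustration.
INLINE_SUBSET = {"a", "em", "strong", "span", "code", "b", "i"}


def may_indent_inside(ancestors_and_self):
    """Return False if any enclosing element is a formatted XHTML element.

    `ancestors_and_self` is a list of (namespace, local_name) pairs from the
    root down to the element whose content is about to be written.
    """
    return not any(
        ns == XHTML_NS and name in FORMATTED for ns, name in ancestors_and_self
    )


def may_add_whitespace_adjacent(ns, local_name):
    """Return False if the element is an inline XHTML element."""
    return not (ns == XHTML_NS and local_name in INLINE_SUBSET)


inside_pre = [(XHTML_NS, "html"), (XHTML_NS, "body"), (XHTML_NS, "pre")]
inside_div = [(XHTML_NS, "html"), (XHTML_NS, "body"), (XHTML_NS, "div")]
```

An element named pre in some other namespace is not treated as formatted here, since the proposed text restricts the rule to elements in the XHTML namespace.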
>>>>> "Joanne" == Joanne Tong <joannet@ca.ibm.com> writes: Joanne> May I ask you to confirm that this response is Joanne> acceptable to you? Yes it is. Thank you. -- Colin Paul Adams Preston Lancashire
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: [...] Equally, it is entirely under the control of the person or process that creates the instance of the data model whether the output conforms to XHTML Strict, XHTML Transitional, XHTML Frameset, or XHTML Basic. [...] Please change the enumeration of document types to something more general: there is no "XHTML Strict" document type (it would be "XHTML 1.0 Strict"), and some document types, such as XHTML 1.1, are missing.
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: [...] Given an empty instance of an XHTML element whose content model is not EMPTY (for example, an empty title or paragraph) the serializer MUST NOT use the minimized form. [...] It is not clear to me how it is determined whether an element has such a content model, please specify clearly how this is determined and whether implementations should/may/etc. apply these rules to elements for which the algorithm to determine the content model defines no result, e.g. the algorithm is unlikely to define the rules for the "wbr" element which is a proprietary element with a content model of EMPTY.
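One way to make the rule questioned above deterministic is a fixed table of element names whose content model is EMPTY. The Python sketch below assumes the XHTML 1.0 DTDs as the source of that table and treats any element not listed (such as wbr) as non-EMPTY; both choices are assumptions for illustration, not something the draft specifies, and serialize_empty_element is a hypothetical name.

```python
# Elements declared EMPTY in the XHTML 1.0 DTDs (Strict, plus the
# Transitional and Frameset additions basefont, isindex and frame).
XHTML10_EMPTY = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta",
    "param", "basefont", "isindex", "frame",
}


def serialize_empty_element(local_name, empty_elements=XHTML10_EMPTY):
    """Serialize an element that has no children and no attributes.

    Assumption: names missing from `empty_elements` are written with an
    explicit end tag, i.e. the minimized form is never used for them.
    """
    if local_name in empty_elements:
        return f"<{local_name} />"
    return f"<{local_name}></{local_name}>"


br_markup = serialize_empty_element("br")
title_markup = serialize_empty_element("title")
wbr_markup = serialize_empty_element("wbr")
```

A caller that wants proprietary elements handled differently could pass its own set, for example XHTML10_EMPTY | {"wbr"}.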
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: [...] The serializer SHOULD output namespace declarations in a way that is consistent with the requirements of the XHTML DTD if this is possible. The DTD requires the declaration xmlns="http://www.w3.org/1999/xhtml" to appear on the html element, and only on the html element. [...] This is only true for XHTML 1.0 document types; XHTML 1.1, for example, allows xmlns="http://www.w3.org/1999/xhtml" on all elements.
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: "Note: Where the process used to construct the input instance of the data model does not provide complete control over the prefix used for an element name in the instance of the data model or control of whether the element is in the default namespace (for instance, the XSLT namespace fixup process), implementors are encouraged to provide means or endeavor to preserve the obvious intent of a user to place the html element in the default namespace, wherever possible. For example, implementors of XSLT processors are encouraged to place the html element that results from a literal result element like the following in the default namespace:" It is not clear to me whether the note following the item above is a clarification of the requirement or an additional suggestion. If it is an additional suggestion, please change the document so that implementations SHOULD implement what the note describes, if it is a clarification, please make this more obvious in the document.
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: [...] If the instance of the data model includes a head element that has a meta element child, the serializer SHOULD replace any content attribute of the meta element, or add such an attribute, with the value as described above, rather than output a new meta element. [...] Please change this text to limit the behavior to meta elements with http-equiv="Content-Type" (where the value of the attribute is case- insensitive).
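The restriction requested above can be sketched as follows in Python. This is an illustration of the commenter's proposal, not the behavior of the draft; the function names are hypothetical, and for brevity the elements are assumed to be in no namespace.

```python
import xml.etree.ElementTree as ET


def find_content_type_meta(head):
    """Return the first meta child whose http-equiv is Content-Type, ignoring case."""
    for meta in head.findall("meta"):
        if (meta.get("http-equiv") or "").lower() == "content-type":
            return meta
    return None


def set_content_type(head, media_type, encoding):
    """Update the matching meta element, or insert a new one as first child."""
    value = f"{media_type}; charset={encoding}"
    meta = find_content_type_meta(head)
    if meta is None:
        meta = ET.Element("meta", {"http-equiv": "Content-Type"})
        head.insert(0, meta)
    meta.set("content", value)
    return meta


head = ET.fromstring(
    "<head>"
    '<meta name="description" content="demo"/>'
    '<meta http-equiv="CONTENT-TYPE" content="text/html; charset=old"/>'
    "</head>"
)
set_content_type(head, "text/html", "UTF-8")
```

Only the meta element that declares the content type is rewritten; the description meta element keeps its content attribute.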
Dear XSL Working Group, Dear XML Query Working Group, Comment on section 6 of the Serialization spec: Please add a note that this serialization is insufficient to meet all the requirements to deliver XHTML documents to legacy user agents, for example, if the instance of the data model includes <p xml:lang="en">...</p> the serializer does not serialize it to <p xml:lang="en" lang="en">...</p> which would however be necessary to make this information available to user agents that only look at the lang attribute.
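Because the serializer does not add the lang attribute itself, a process that targets legacy user agents would have to add it to the data model beforehand. A minimal Python sketch of such a step follows; the name add_legacy_lang is hypothetical, and leaving an existing lang attribute untouched is an assumption.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def add_legacy_lang(elem):
    """Copy each xml:lang value to a plain lang attribute, recursively.

    Assumption: an existing lang attribute is kept even if it differs.
    """
    lang = elem.get(XML_LANG)
    if lang is not None and "lang" not in elem.attrib:
        elem.set("lang", lang)
    for child in elem:
        add_legacy_lang(child)
    return elem


para = ET.fromstring('<p xml:lang="en">text <span>more</span></p>')
add_legacy_lang(para)
```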
Dear XSL Working Group, Dear XML Query Working Group, Section 7.1.4 of the latest XSLT 2.0 and XQuery 1.0 Serialization draft notes that "The html output method MUST terminate processing instructions with > rather than ?>" but it does not seem to require that serializers signal a serialization error if the processing instruction data contains a ">", which is legal in XML but not possible in HTML. Please add a serialization error for this case. Regards.
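The check requested above is small. The following Python sketch shows what it could look like; SerializationError and serialize_html_pi are hypothetical names, and the draft under discussion does not define this error.

```python
class SerializationError(Exception):
    """Hypothetical error type raised when output cannot be produced."""


def serialize_html_pi(target, data):
    """Serialize a processing instruction the way the html method terminates it.

    The instruction ends with ">" rather than "?>", so a ">" inside the data
    would end it early; this sketch rejects such data.
    """
    if ">" in data:
        raise SerializationError(
            f"processing instruction {target!r} contains '>' in its data"
        )
    return f"<?{target} {data}>" if data else f"<?{target}>"


ok_markup = serialize_html_pi("php", "echo 1")
```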
Section 2 defines the normalisation step three ways (in English, in XSLT and in XQuery); unfortunately, I don't think they are equivalent. I think the intention is to get the effect of the XSLT/XQuery code, but I don't think the prose does this. The prose could be corrected, but defining things in "equivalent" ways, even if two of them are in a non-normative note, is dangerous, and it might be better _just_ to give an unambiguous definition (in XSLT and/or XQuery) and drop the prose description. The problem with the existing text definition is, I believe, with concatenation of text nodes. <xsl:result-document> <xsl:copy-of select="$seq"/> </xsl:result-document> would merge any adjacent text nodes into a single text node with the concatenated string and drop any empty text nodes (as they would acquire a parent after copying). Adjacent text nodes may arise either because they were in adjacent positions in the original sequence, or may "become adjacent" as a result of taking the children of document nodes, or converting atomic strings to text nodes (steps 4 and 5). So if you want to keep the existing text I think you need to make S6 into S7 and add a new S6 that merges adjacent text nodes and removes text nodes with value the empty string. David
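The additional step proposed above (merge adjacent text nodes, drop empty ones) can be sketched in Python as follows. The representation of a sequence as a list of (kind, value) pairs and the name merge_text_nodes are assumptions made for illustration only.

```python
def merge_text_nodes(nodes):
    """Merge adjacent text nodes and drop text nodes whose value is empty.

    `nodes` is a list of (kind, value) pairs; only pairs whose kind is
    "text" are merged, every other item is passed through unchanged.
    """
    merged = []
    for kind, value in nodes:
        if kind == "text":
            if value == "":
                continue  # empty text nodes disappear
            if merged and merged[-1][0] == "text":
                # Concatenate with the text node immediately before it.
                merged[-1] = ("text", merged[-1][1] + value)
                continue
        merged.append((kind, value))
    return merged


sequence = [
    ("text", "a"),
    ("text", ""),
    ("text", "b"),
    ("element", "x"),
    ("text", "c"),
]
normalized = merge_text_nodes(sequence)
```

Text nodes separated only by an empty text node end up merged, which matches the effect the comment attributes to xsl:copy-of inside xsl:result-document.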
Serialization Section 2, "Serializing Arbitrary Data Models": The note at the end of this section contains code that is too wide to print. Also, the XQuery expression in this note will not parse because it has no parentheses around the conditional part of the if-expression. This note should be edited as necessary because of the "documentization" process described above. The resulting code should be checked for validity and broken into lines of suitable length. --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << Serialization Section 2, "Serializing Arbitrary Data Models": The note at the end of this section contains code that is too wide to print. Also, the XQuery expression in this node will not parse because it has no parentheses around the conditional part of the if-expression. This note should be edited as necessary because of the "documentization" process described above. The resulting code should be checked for validity and broken into lines of suitable length. >> Thank you for your comment, which I have handled editorially. I have applied the editorial changes that you suggested. I would appreciate if you could check the next draft of the specification when it becomes available, and verify that I've correctly applied the changes. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0050.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Serialization Section 3, "Serialization Parameters": The "undeclare-namespaces" parameter needs a better explanation. I am guessing that it means the following: If no namespace node attached to an element E defines the namespace prefix P, but P is defined by a namespace node attached to the parent of element E, then the serialization of element E must contain a namespace declaration attribute that binds the prefix P to an empty URI. --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << Serialization Section 3, "Serialization Parameters": The "undeclare-namespaces" parameter needs a better explanation. I am guessing that it means the following: If no namespace node attached to an element E defines the namespace prefix P, but P is defined by a namespace node attached to the parent of element E, then the serialization of element E must contain a namespace declaration attribute that binds the prefix P to an empty URI. >> Thank you for your comment, which I am handling editorially. The section titled "Serialization Parameters" no longer includes a description of the purpose of the "undeclare-namespaces" parameter. Instead, I have attempted to improve the description of the parameter that appears in the section titled "XML Output Method: the undeclare-namespaces Parameter". I would appreciate it if you could check the next draft of the specification when it becomes available, and verify that the description that appears there is satisfactory. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0052.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
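The reading of undeclare-namespaces that Don guesses at in this thread can be sketched in Python as below. It illustrates the commenter's interpretation only, not the wording eventually adopted; the function names are hypothetical, and namespace nodes are modeled as plain prefix-to-URI dictionaries.

```python
def prefixes_to_undeclare(parent_ns, element_ns):
    """Return the prefixes bound on the parent but not on the element.

    Both arguments map prefix -> namespace URI for the namespace nodes of
    the parent and of element E. The default namespace (empty prefix) is
    left out because the comment speaks only of prefixes.
    """
    return sorted(p for p in parent_ns if p and p not in element_ns)


def undeclaration_attributes(parent_ns, element_ns):
    """Render the declarations that bind each such prefix to an empty URI."""
    return [f'xmlns:{p}=""' for p in prefixes_to_undeclare(parent_ns, element_ns)]


parent = {"a": "urn:example:a", "b": "urn:example:b"}
child = {"a": "urn:example:a"}
attrs = undeclaration_attributes(parent, child)
```

A declaration of the form xmlns:b="" is only permitted by Namespaces in XML 1.1, which is why the thread ties the parameter to the version of XML being output.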
Serialization Section 4, "XML Output Method": The third paragraph contains a circular definition of the serialized output that depends on the serialized output. Presumably it intends to say something like this: "If the document node of the data model has a single element node and no text node children, then the serialized output is a well-formed XML document entity that conforms to the XML Namespaces Recommendation. Otherwise, the serialized output is a well-formed XML external general parsed entity which, when referenced within a trivial XML document wrapper like this ..." (This will clean up the circular logic. But this whole paragraph should be edited based on the "documentization" process described in separate comment IBM-SE-001.) --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << Serialization Section 4, "XML Output Method": The third paragraph contains a circular definition of the serialized output that depends on the serialized output. Presumably it intends to say something like this: "If the document node of the data model has a single element node and no text node children, then the serialized output is a well-formed XML document entity that conforms to the XML Namespaces Recomendation. Otherwise, the serialized output is a well-formed XML external general parsed entity which, when referenced within a trivial XML document wrapper like this ..." (This will clean up the circular logic. But this whole paragraph should be edited based on the "documentization" process described is separate comment IBM-SE-001.) >> Thank you for your comment, which I am handling editorially. I believe that the wording of this section was revised in response to another comment on the last call draft, and that revision has fixed the circular definition you cite. I would appreciate if you could check the next draft of the specification when it becomes available, and verify that it correctly resolves your issue. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0054.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Serialization Section 4, "XML Output Method": The paragraph before the bullet list says that the round-tripped data model must be "the same as the starting data model". This should be clarified to mean the data model after the "normalization" (or "documentization") process described in Section 2. --Don Chamberlin
Don, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << Serialization Section 4, "XML Output Method": The paragraph before the bullet list says that the round-tripped data model must be "the same as the starting data model". This should be clarified to mean the data model after the "normalization" (or "documentization") process described in Section 2. >> Thank you for your comment, which I am handling editorially. I have applied the editorial change that you suggested. I would appreciate if you could check the next draft of the specification when it becomes available to verify that I've correctly applied the change. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0056.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial [J] [Section 6.5 HTML Output Method] Encoding states that [style-1] "then unless the include-content-type parameter is present and has the value "no"" But, for all other parameters, the style is [style-2] "If the xxx parameter has the value yes". We have a preference for [style-2]. For consistency, we request you to recast the sentence in Section 6.5 using [style-2]. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Yes. There are a few phrases like this that predate the separation of the serialization spec from XSLT, and that make assumptions about the default values of parameters: the theory is that defaults should be defined in the XSLT specification, not here, but this has not always been carried through. Michael Kay (speaking personally)
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group. << Editorial [J] [Section 6.5 HTML Output Method] Encoding states that [style-1] "then unless the include-content-type parameter is present and has the value "no"" But, for all other parameters, the style is [style-2] "If the xxx parameter has the value yes". We have a preference for [style-2]. For consistency, we request you to recast the sentence in Section 6.5 using [style-2]. >> Thanks to you and the XML Schema Working Group for this comment, which I am handling editorially. In response to another comment, the XSL and XML Query Working Groups decided to make all serialization parameters mandatory, except for doctype-public and doctype-system, which are still optional. There are no longer instances of the undesirable style you cited. I would appreciate if you could check the next public draft of the specification when it becomes available to verify that the styles used to refer to the values of the two serialization parameters that are still optional and to those that are non-optional are acceptable to the XML Schema Working Group. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0270.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial [M] [Section 1: Introduction] The key words defined in RFC 2119 are in uppercase and should be here too and where used as a term of art. e.g. s/may/MAY s/must/MUST etc. or a note should be added clarifying that the lowercase forms are always used as terms of art. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: << Editorial [M] [Section 1: Introduction] The key words defined in RFC 2119 are in uppercase and should be here too and where used as a term of art. e.g. s/may/MAY s/must/MUST etc. or a note should be added clarifying that the lowercase forms are always used as terms of art. >> Thanks to you and the working group for this comment, which I have handled editorially. I have applied the editorial change that you suggested. I would appreciate if you could check the next public draft of Serialization when it becomes available, and verify that I've addressed the comment to the Schema WG's satisfaction. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0273.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial [N] [Section 2: Serializing Arbitrary Data Models] This section uses terms from 'XQuery 1.0 and XPath 2.0 Data Model' and 'XQuery 1.0 and XPath 2.0 Functions and Operators' - such as 'sequence', 'atomic value', 'text node', 'document node', 'casting to xs:string', etc. For readability, these terms should be cross linked to the extent that it is possible. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Editorial [O] [Section 3: Serialization Parameters] 'use-character-maps' is a misnomer. It is not a boolean parameter as it sounds. It is a list of {character, string} pairs. Similar to cdata-section-elements, it should be 'character-maps'. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group. << Editorial [O] [Section 3: Serialization Parameters] 'use-character-maps' is a misnomer. It is not a boolean parameter as it sounds. It is a list of {character, string} pairs. Similar to cdata-section-elements, it should be 'character-maps'. >> Thanks to you and the XML Schema Working Group for this comment, which I am handling editorially. The name of this parameter derives from the attribute of the same name that is defined for xsl:output and xsl:result-document by XSLT 2.0. Admittedly, the name can be interpreted as implying that it has a boolean value, but it is intended to mean "these are the character maps to use." The name actually has an antecedent in XSLT 1.0: use-attribute-sets. As I'd like to keep the names of the serialization parameters the same as the corresponding attributes defined by XSLT, I'm inclined not to make this editorial change. If this response is not acceptable to the XML Schema Working Group, I would invite you to reopen the issue. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0275.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial [P] [Section 3: Serialization Parameters] Though the title is 'Serialization Parameters', this section also outlines the four phases of serialization. Request to break this section into two: 'Serialization Parameters' and 'Serialization'. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: Mary Holstege wrote on 2004-02-12 04:18:57 PM: > Editorial > > [P] [Section 3: Serialization Parameters] Though the title is 'Serialization > Parameters', this section also outlines the four phases of serialization. > Request to break this section into two: 'Serialization Parameters' and > 'Serialization'. I have applied the editorial change that you suggested, splitting the section into two. I named the second section "Phases of Serialization". Thanks to you and the Schema WG for this comment. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0276.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Editorial [Q] [Section 4.5: XML Output Method: the omit-xml-declaration Parameter] This section jams two parameters, omit-xml-declaration and standalone. Suggestion: split them into 2 sections. On behalf of the XML Schema WG. -- Mary Holstege@mathling.com
Mary, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema Working Group: Mary Holstege wrote on 2004-02-12 04:19:19 PM: > Editorial > > [Q] [Section 4.5: XML Output Method: the omit-xml-declaration Parameter] > This section jams two parameters, omit-xml-declaration and standalone. > Suggestion: split them into 2 sections. Thanks to you and the Schema Working Group for this comment. I felt that the effect of the standalone parameter is too tightly coupled with the effect of the omit-xml-declaration parameter to split this section into two. Rather than make the proposed change, I decided to rename the section "XML Output Method: the omit-xml-declaration and standalone Parameters". I hope that change will be acceptable to the working group. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0278.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WG's comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to these comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] [21] Section 8: There should be a reference to XSLT to show examples of use of character maps. Regards, Martin.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. << [21] Section 8: There should be a reference to XSLT to show examples of use of character maps. >> Thanks to you and the I18N Working Group for this comment, which I'm handling editorially. I have applied the editorial change that you suggested. I would appreciate if you could check the next public draft of the specification when it becomes available, and verify that I've correctly applied the change. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Dear XML Query WG and XSL WG, Below please find the I18N WGs comments on your last call document "XSLT 2.0 and XQuery 1.0 Serialization" (http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/). Please note the following: - Please address all replies to there comments to the I18N IG mailing list (w3c-i18n-ig@w3.org), not just to me. - Our comments are numbered in square brackets [nn]. We look forward to further discussion with you. [this mail is copied to the DOM WG to tell them what we are telling you about UTF-16 and endianness, which they should adopt for the Document Object Model (DOM) Level 3 Load and Save Specification] Editorial: [26] Normalization: This term is used for different things: - Character normalization (Charmod, NFC) - Normalization as described in section 2 of this document. - Normalization as described in the formal semantics document. These should be very clearly distinguished and labeled. [27] Section 3, 'media-type', says "... the charset parameter of the media type must not be specified explicitly". This should be changed to "... the charset parameter of the media type must not be specified explicitly here." to make clear that this is just a statement about this parameter, not in general. [28] Section 3, "omit-xml-declaration specifies whether the serialization process is to output an XML declaration. The value must be yes or no If this parameter is not specified, the value is implementation defined." The wording should be improved to make clear which is yes and which is no. (and please add a period after 'no'). [29] Section 4: "Additional nodes may be present in the new tree, and the values of attribute nodes and text nodes in the new tree may be different from those in the original tree, due to the character expansion phase of serialization.": this should clearly state that this applies only to URI escaping and character mapping, and that CDATA sections and escaping of special characters cannot create differences. 
[30] 4.8: "If the output method is xml and the value of the version parameter is 1.0, namespace >UN<declaration is not performed, and the undeclare-namespace parameter is ignored." [33] Section 7, freestanding paragraph "The default encoding for the text output method is implementation-defined.": this is a repetition from the previous paragraph and should be removed. [34] RFC 2376 is obsoleted by RFC 3023. Regards, Martin.
Martin, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N Working Group. << [26] Normalization: This term is used for different things: - Character normalization (Charmod, NFC) - Normalization as described in section 2 of this document. - Normalization as described in the formal semantics document. These should be very clearly distinguished and labeled. [27] Section 3, 'media-type', says "... the charset parameter of the media type must not be specified explicitly". This should be changed to "... the charset parameter of the media type must not be specified explicitly here." to make clear that this is just a statement about this parameter, not in general. [28] Section 3, "omit-xml-declaration specifies whether the serialization process is to output an XML declaration. The value must be yes or no If this parameter is not specified, the value is implementation defined." The wording should be improved to make clear which is yes and which is no. (and please add a period after 'no'). [29] Section 4: "Additional nodes may be present in the new tree, and the values of attribute nodes and text nodes in the new tree may be different from those in the original tree, due to the character expansion phase of serialization.": this should clearly state that this applies only to URI escaping and character mapping, and that CDATA sections and escaping of special characters cannot create differences. [30] 4.8: "If the output method is xml and the value of the version parameter is 1.0, namespace >UN<declaration is not performed, and the undeclare-namespace parameter is ignored." [Made [31] and [32] into separate, substantive issues. HZ] [33] Section 7, freestanding paragraph "The default encoding for the text output method is implementation-defined.": this is a repetition from the previous paragraph and should be removed. [34] RFC 2376 is obsoleted by RFC 3023. 
>> Thanks to you and the I18N Working Group for these comments, which I am handling editorially. I have applied the following changes to the serialization draft in response to these comments: [26] I've tried to clarify the use of the first two types of normalization by referring to them as sequence normalization and Unicode normalization throughout the document. [27] I made the suggested correction. [28] Most descriptions of parameters have been removed from the "Serialization Parameters" section, including the description of omit-xml-declaration. The description of this parameter that appears in the section on the XML output method should be clear. [29] I have clarified this by referring explicitly to URI escaping, character mapping and Unicode normalization. [30] It is now considered to be a serialization error if undeclare-namespaces has the value yes and the output method is xml, so this sentence no longer appears in the draft. [33] The encoding parameter is no longer optional, so the two pieces of redundant information have been removed. [34] Added a new normative reference to RFC 3023. I would appreciate if you could check the next public draft of the specification when it becomes available, and verify that I've correctly applied all the changes, and that they resolve these issues to the satisfaction of the I18N Working Group. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4.7 : XML output method: the undeclare-namespaces parameter Last sentence: "If the output method is xml and the value of the version parameter is 1.0, namespace declaration is not performed...". This statement looks incorrect. Surely the output of an XML 1.0 method must declare namespaces, otherwise the result will not conform to Namespaces 1.0 Recommendation. I think "declaration" is a typo here; I think you mean "undeclaration". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-17 06:32:42 AM: > SECTION 4.7 : XML output method: the undeclare-namespaces parameter > > Last sentence: "If the output method is xml and the value of the > version parameter is 1.0, namespace declaration is not performed...". > This statement looks incorrect. Surely the output of an XML 1.0 > method must declare namespaces, otherwise the result will not > conform to Namespaces 1.0 Recommendation. I think "declaration" > is a typo here; I think you mean "undeclaration". Thank you for pointing out this typographical error. In response to another issue, this conflict in the settings of parameters has become a serialization error; the subsequent rewording removed the typo. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0920.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
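The resolution above, where a conflicting combination of parameter settings becomes a serialization error, can be illustrated with a minimal sketch. It assumes the conflict is the one quoted in the comment (output method xml, version 1.0, undeclare-namespaces set to yes). The function name, the dictionary representation of the parameters, the defaults, and the exception class are hypothetical and do not come from the specification.

```python
class SerializationError(Exception):
    """Hypothetical exception type used only in this sketch."""


def check_undeclare_namespaces(params):
    """Raise SerializationError for the conflicting parameter settings.

    `params` is an assumed dict of serialization parameter names to
    string values; the defaults below are assumptions for illustration.
    """
    method = params.get("method", "xml")
    version = params.get("version", "1.0")
    undeclare = params.get("undeclare-namespaces", "no")

    # XML 1.0 has no syntax for undeclaring a prefixed namespace, so
    # requesting undeclaration with version 1.0 is treated as an error.
    if method == "xml" and version == "1.0" and undeclare == "yes":
        raise SerializationError(
            "undeclare-namespaces=yes conflicts with version=1.0"
        )


# Usage: the same request is accepted when the version is 1.1.
check_undeclare_namespaces(
    {"method": "xml", "version": "1.1", "undeclare-namespaces": "yes"}
)
```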
SECTION 3: Serialization parameters The parameter "version" is said to be "the version of the output method". Viewing an output method as a deliverable product, or component of a deliverable product, the version of the output method is not subject to change from invocation to invocation, and hence is not a parameter in the usual meaning of the term, being rather a descriptive identifier about the output method, just the same as the name of the vendor selling the output method is not a parameter, it is a descriptive identifier. Perhaps what is meant is the version of XML to be generated. This is corroborated by section 4.1 "XML output method: the version parameter". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << SECTION 3: Serialization parameters The parameter "version" is said to be "the version of the output method". Viewing an output method as a deliverable product, or component of a deliverable product, the version of the output method is not subject to change from invocation to invocation, and hence is not a parameter in the usual meaning of the term, being rather a descriptive identifier about the output method, just the same as the name of the vendor selling the output method is not a parameter, it is a descriptive identifier. Perhaps what is meant is the version of XML to be generated. This is corroborated by section 4.1 "XML output method: the version parameter". >> Thank you for this comment, which I am handling editorially. In response to another last call comment, the description of the version parameter no longer appears in Section 3 "Serialization Parameters". Instead, descriptions of this parameter appear in the definitions of the XML and HTML output methods. I would appreciate if you could check the next draft of the specification when it becomes available, and verify that those definitions of the version parameter are acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0931.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4: XML output method Sixth open circle bullet: "Additional namespace nodes may be present in the new tree if the serialization process undeclared namespaces...". On first reading, this sentence seemed ungrammatical. The problem was that it felt like "the serialization" was the subject, and "process" was the verb, which should either be "processes" or "processed", and then "undeclared namespaces" would be the direct object. Rewording it as "Additional namespace nodes may be present in the new tree if the serialization process undeclared one or more namespaces..." might help. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-17 06:44:21 AM: > SECTION 4: XML output method > > Sixth open circle bullet: "Additional namespace nodes may be > present in the new tree if the serialization process undeclared > namespaces...". On first reading, this sentence seemed > ungrammatical. The problem was that it felt like "the > serialization" was the subject, and "process" was the verb, > which should either be "processes" or "processed", and then > "undeclared namespaces" would be the direct object. > Rewording it as "Additional namespace nodes may be > present in the new tree if the serialization process undeclared > one or more namespaces..." might help. Thank you for pointing out the potential difficulty in parsing this sentence. I have reworded the clause as follows: "Additional namespace nodes may be present in the new tree if the serialization process did not undeclare one or more namespaces. . . ." Note that "undeclared" has become "did not undeclare" in response to a substantive comment. I hope you will find that the new version is not so difficult to parse. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0933.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 3: Serialization parameters Assuming that phase 1, "Markup generation", is responsible for creating namespace declarations, the parameter undeclare-namespaces is relevant to this phase and should be listed here. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-17 06:44:00 AM: > SECTION 3: Serialization parameters > > Assuming that phase 1, "Markup generation", is responsible for > creating namespace declarations, the parameter undeclare-namespaces > is relevant to this phase and should be listed here. Yes, creation of namespace declarations falls under the purview of the markup generation phase. I have added the undeclare-namespaces parameter to the list of parameters that influence that phase of serialization. Thank you for pointing out the omission. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0934.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4: XML output method Last sentence concludes with "...if nodes in the model contain characters that are invalid in XML (introduced, perhaps, by calling a user-written extension function: this is an error but the processor is not required to signal it)." It is not clear what is meant by "processor". Does this refer to the XQuery engine that invoked the user-written extension and thereby obtained a corrupt value, or does it refer to the XML output method? Probably the former, since there does not appear to be any provision for calling user-written functions from the output method. This could be fixed by changing "processor" to "XQuery processor". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << SECTION 4: XML output method Last sentence concludes with "...if nodes in the model contain characters that are invalid in XML (introduced, perhaps, by calling a user-written extension function: this is an error but the processor is not required to signal it)." It is not clear what is meant by "processor". Does this refer to the XQuery engine that invoked the user-written extension and thereby obtained a corrupt value, or does it refer to the XML output method? Probably the former, since there does not appear to be any provision for calling user-written functions from the output method. This could be fixed by changing "processor" to "XQuery processor". >> Thank you for your comment, which I am handling editorially. In response to your comment, and at the suggestion of Michael Kay, I've introduced a new term: serializer. That term is now used throughout the specification. The only two remaining instances of "processor" are qualified and refer to XML processors and XSLT processors. I would appreciate if you could check the next draft of the specification when it becomes available, and verify that this new term and its definition satisfactorily address your comment. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0935.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4: XML output method Sixth bullet, "Additional namespace nodes may be present in the new tree if the serialization process undeclared namespaces." This seems to be a misstatement of what you intend. Given a document node D with an element node E1 with a child E2 with fewer in-scope namespaces than its parent E1, there are four scenarios to consider, forming a two-by-two matrix: The output method may undeclare namespaces, or it may not; and the reparse of the output may use an XML 1.0 parser or an XML 1.1 parser. The analysis of the four cases is: undeclare, reparse with XML 1.0: this will generate an error during the reparse, since undeclaring is not a feature of XML 1.0. undeclare, reparse with XML 1.1: this will restore the original value. no undeclare, reparse with XML 1.0: no error during the reparse step (at least for namespace undeclarations), so the resulting document node will have more namespace nodes in the regenerated E2 than it should have. no undeclare, reparse with XML 1.1: same analysis as preceding case. Thus the correct statement is that additional namespace nodes may be present in the new tree if the serialization process did not undeclare namespaces. That is, replace "undeclared" with "did not undeclare". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << SECTION 4: XML output method Sixth bullet, "Additional namespace nodes may be present in the new tree if the serialization process undeclared namespaces." This seems to be a misstatement of what you intend. Given a document node D with an element node E1 with a child E2 with fewer in-scope namespaces than its parent E1, there are four scenarios to consider, forming a two-by-two matrix: The output method may undeclare namespaces, or it may not; and the reparse of the output may use an XML 1.0 parser or an XML 1.1 parser. The analysis of the four cases is: undeclare, reparse with XML 1.0: this will generate an error during the reparse, since undeclaring is not a feature of XML 1.0. undeclare, reparse with XML 1.1: this will restore the original value. no undeclare, reparse with XML 1.0: no error during the reparse step (at least for namespace undeclarations), so the resulting document node will have more namespace nodes in the regenerated E2 than it should have. no undeclare, reparse with XML 1.1: same analysis as preceding case. Thus the correct statement is that additional namespace nodes may be present in the new tree if the serialization process did not undeclare namespaces. That is, replace "undeclared" with "did not undeclare". >> Thank you for your comment, which I am handling editorially. I have applied the correction you pointed out. I would appreciate if you could check the next draft of the specification when it becomes available, and verify that I've correctly applied the change. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0937.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
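The two-by-two analysis in the comment above can be written out as a small lookup, shown here as a rough sketch. The function name and the outcome strings are hypothetical labels chosen for illustration; only the four outcomes themselves are taken from the comment.

```python
from itertools import product


def reparse_outcome(undeclared, parser_version):
    """Outcome of reparsing serialized output, per the comment's analysis.

    undeclared: True if the serializer emitted namespace undeclarations.
    parser_version: "1.0" or "1.1", the XML version of the reparsing parser.
    """
    if undeclared:
        if parser_version == "1.0":
            # Undeclaring is not a feature of XML 1.0.
            return "error during reparse"
        return "original value restored"
    # Without undeclarations, E2 inherits every namespace in scope on E1,
    # whichever parser is used.
    return "extra namespace nodes on E2"


def case_table():
    """Enumerate all four cases of the matrix."""
    return {
        (undeclared, version): reparse_outcome(undeclared, version)
        for undeclared, version in product((True, False), ("1.0", "1.1"))
    }


for (undeclared, version), outcome in case_table().items():
    print(f"undeclare={undeclared}, XML {version}: {outcome}")
```

Only the two rows without undeclaration yield additional namespace nodes, which is why the corrected wording reads "did not undeclare".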
[My apologies that these comments are coming in after the end of the Last Call comment period.] Hello, Following are comments on Serialization that we believe to be editorial in nature.
------------------------------------------------------------------
Section 2 The last sentence states that xs:NOTATION cannot be converted to xs:string. That's no longer true.
------------------------------------------------------------------
Section 3 In the second bullet (cdata-section-elements), the value should be a list of expanded QNames rather than names.
------------------------------------------------------------------
Section 3 In the second bullet (cdata-section-elements), the clause "no elements will be treated specially" appears. The meaning of "treated specially" is not clear - the statement should be made more clear.
------------------------------------------------------------------
Section 3 In the penultimate bullet (use-character-maps), the word "provides" is vague. This should use the word "specifies", as do other bullets.
------------------------------------------------------------------
Section 3 In the penultimate bullet (use-character-maps), the name of the parameter is not entirely accurate. In fact, there is just one mapping, though it may map many characters to strings.
------------------------------------------------------------------
Section 4.4 The last bullet refers to the xml:space attribute. A reference to the definition of that attribute would be appropriate.
------------------------------------------------------------------
Section 4.4 In the last bullet, the style used to describe the value of the xml:space attribute isn't appropriate. Change 'xml:space="preserve" attribute' to "xml:space attribute with the value 'preserve'".
------------------------------------------------------------------
Section 4.5 The last sentence describes circumstances in which the omit-xml-declaration parameter should be ignored. Rather than saying it's ignored, it might be easier to understand if this indicated it's treated as if the value was "no".
------------------------------------------------------------------
Section 5 In the note in the fifth bullet, "in in" should be "in".
------------------------------------------------------------------
Section 5 Suggest replacing the note in the fifth bullet with the following: << NOTE: Where the process used to construct the input data model does not provide complete control over the prefix (or lack thereof) used for an element name in the data model, implementors are encouraged to produce namespace syntax appropriate to the kind of document being serialized (when possible). For example, when serializing a document as XHTML it is preferable to bind "http://www.w3.org/1999/xhtml" as the default namespace (no prefix), like so: <html xmlns="http://www.w3.org/1999/xhtml"> ... </html> for best compatibility with pre-XHTML applications. >>
------------------------------------------------------------------
Section 5 The sixth bullet states, "The content type should be set to the value given for the media-type parameter; the default value for XHTML is text/html. The value application/xhtml+xml, registered in [RFC3236], may also be used." It is not clear whether this means that a processor has two choices of media-type to use as the default, or "may also be used" refers to what the client of the serialization process may specify as the value of the parameter. That needs to be clearly specified. If the latter, it also needs to be clearly specified whether those are the only two values permitted.
------------------------------------------------------------------
Appendix A Add references to XML 1.1 and Namespaces in XML 1.1.
------------------------------------------------------------------
Thanks, Henry [Speaking on behalf of reviewers from IBM.]
------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Hello. In [1], I submitted various editorial comments on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of reviewers at IBM. I have applied the suggested editorial changes, except those in the description of the XHTML output method. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0978.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 2: Serializing arbitrary data models Step 2 says "If the data model instance contains any atomic values, or sequences that contain atomic values, ...". But the input to the normalization process is a single sequence, and sequences do not nest, so what does it mean to say that a sequence contains a sequence? Perhaps you mean the second sequence to be a subsequence of the input sequence. But there is no need for this case, since if a subsequence contains an atomic value, the input sequence also contains an atomic value. The phrase "or sequences that contain atomic values" can be deleted. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:21:15 PM: > SECTION 2: Serializing arbitrary data models > > Step 2 says "If the data model instance contains any atomic > values, or sequences that contain atomic values, ...". > But the input to the normalization process is a single sequence, > and sequences do not nest, so what does it mean to say that > a sequence contains a sequence? Perhaps you mean the second > sequence to be a subsequence of the input sequence. But there > is no need for this case, since if a subsequence contains an > atomic value, the input sequence also contains an atomic value. > The phrase "or sequences that contain atomic values" can be > deleted. I am not entirely certain why that phrase appeared here, but I agree that it is entirely redundant. I have applied the editorial change that you suggested. Thank you for your comment. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1037.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4: XML output method The first four words of this section are "The xml output method". The specification is non-constructive and non-deterministic. This is a good thing, but it means that there is more than one acceptable algorithm for an xml output method. Consequently calling it "the xml output method" is misleading. It would be better to say explicitly that the specification is non-constructive and non-deterministic, and talk about the requirements on "an xml output method" rather than "the xml output method". Similarly, the title might be "XML output methods" rather than the singular. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << SECTION 4: XML output method The first four words of this section are "The xml output method". The specification is non-constructive and non-deterministic. This is a good thing, but it means that there is more than one acceptable algorithm for an xml output method. Consequently calling it "the xml output method" is misleading. It would be better to say explicitly that the specification is non-constructive and non-deterministic, and talk about the requirements on "an xml output method" rather than "the xml output method". Similarly, the title might be "XML output methods" rather than the singular. >> Thank you for your comment, which I am handling editorially. You make a good point. I have added a paragraph to the end of the section entitled "Serialization Parameters" indicating that some unspecified details of the output methods are implementation-dependent, in an attempt to make it clear that the specifications of the output methods are non-constructive. However, I've decided not to accept your suggestion to refer to "an xml output method" rather than "the xml output method." I'm inclined to regard this as a single method that is parameterized by both serialization parameters and implementation-defined and -dependent behaviours. I think using the indefinite article would be confusing for readers. I would appreciate it if you could check the next draft of serialization to verify whether this resolution is acceptable to you. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1038.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 4.8: XML output method: other parameters There is no section for the standalone parameter, and it is not mentioned here either. In section 3 "Serialization parameters" two paragraphs after the list of parameters, last sentence, it says "If the semantics of a parameter are not described for an output method, then it is not applicable to that output method." This would seem to imply that the standalone parameter is not applicable to the XML output method. But that can't be; see section 4.5 "XML output method: the omit-xml-declaration parameter", which describes interactions between the omit-xml-declaration parameter and the standalone parameter. Arguably these mentions are sufficient, but it would be better if either the standalone property appeared in the title of some section, or was listed in this section as one of the "other parameters". - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:21:28 PM: > SECTION 4.8: XML output method: other parameters > > There is no section for the standalone parameter, and it is not > mentioned here either. In section 3 "Serialization parameters" > two paragraphs after the list of parameters, last sentence, it > says "If the semantics of a parameter are not described for an > output method, then it is not applicable to that output method." > This would seem to imply that the standalone parameter is not > applicable to the XML output method. But that can't be; see > section 4.5 "XML output method: the omit-xml-declaration parameter", > which describes interactions between the omit-xml-declaration > parameter and the standalone parameter. Arguably these mentions > are sufficient, but it would be better if either the standalone > property appeared in the title of some section, or was listed > in this section as one of the "other parameters". In response to your comment, I applied an editorial change to rename the section that was previously entitled "XML output method: the omit-xml-declaration Parameter" to "XML Output Method: the omit-xml-declaration and standalone Parameters". I believe that should make it clear that the standalone parameter is applicable to the xml output method. Thank you for submitting your comment. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1039.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 2: serializing arbitrary data models The term "serialization error" is used in various places but never formally defined and therefore it is not clear what this term encompasses. Step 2 says "It is a serialization error if the value cannot be cast to xs:string." and step 6 says "It is a serialization error if an item in the sequence is an attribute node or a namespace node." So "serialization error" includes at least these two conditions, but it is not clear on reading this section whether there might be others defined later in the specification. The paragraph after the six steps says "If the normalization process results in a serialization error, the processor must signal the error." So it must signal the two just described; are there any others? - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. << SECTION 2: serializing arbitrary data models The term "serialization error" is used in various places but never formally defined and therefore it is not clear what this term encompasses. Step 2 says "It is a serialization error if the value cannot be cast to xs:string." and step 6 says "It is a serialization error if an item in the sequence is an attribute node or a namespace node." So "serialization error" includes at least these two conditions, but it is not clear on reading this section whether there might be others defined later in the specification. The paragraph after the six steps says "If the normalization process results in a serialization error, the processor must signal the error." So it must signal the two just described; are there any others? >> Thank you for your comment, which I am handling editorially. I have added a terminology section that includes a definition of "serialization error", and what it means to signal a serialization error. I would appreciate if you could check the next draft of the specification when it becomes available to verify that you find the definition to be acceptable. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1041.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
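The two serialization-error conditions quoted in the comment above (step 2: an atomic value that cannot be cast to xs:string; step 6: an attribute or namespace node in the sequence) can be sketched as follows. This is an illustration under stated assumptions, not the normalization algorithm of the specification: the `Node` class, the `SerializationError` class, and the function names are hypothetical, Python's `str()` merely stands in for the cast to xs:string, and the remaining normalization steps are omitted.

```python
from dataclasses import dataclass


class SerializationError(Exception):
    """Hypothetical exception type used only in this sketch."""


@dataclass
class Node:
    """Assumed stand-in for a data model node; only its kind is modelled."""
    kind: str  # e.g. "element", "text", "attribute", "namespace"


def check_item(item):
    """Apply the two quoted error conditions to a single sequence item."""
    if isinstance(item, Node):
        # Step 6 (as quoted): attribute and namespace nodes are errors.
        if item.kind in ("attribute", "namespace"):
            raise SerializationError(f"{item.kind} node in sequence")
        return item
    # Step 2 (as quoted): atomic values are cast to a string; a failed
    # cast is a serialization error. str() is an assumption here.
    try:
        return str(item)
    except Exception as exc:
        raise SerializationError("value cannot be cast to string") from exc


def normalize_items(sequence):
    """Check every item of a flat sequence; any error must be signalled."""
    return [check_item(item) for item in sequence]


print(normalize_items([1, "a", Node("element")]))
```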
SECTION 3: Serialization parameters It says "undeclare-namespaces specifies whether namespaces, are..." The comma should be removed. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:22:21 PM: > SECTION 3: Serialization parameters > > It says "undeclare-namespaces specifies whether namespaces, are..." > The comma should be removed. Thank you for submitting your comment. In response to another comment, the description of the meaning of a parameter appears only in the section on the xml output method. I removed the text with the typographical error you pointed out. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1043.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 3: serialization parameters The list of parameters appears to be alphabetized, with the exception of the first parameter, encoding. Perhaps this one should be placed in alphabetic order. Perhaps there should be a prefatory note that the parameters are listed in alphabetic order. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:22:29 PM: > SECTION 3: serialization parameters > > The list of parameters appears to be alphabetized, with the > exception of the first parameter, encoding. Perhaps this one > should be placed in alphabetic order. Perhaps there should be > a prefatory note that the parameters are listed in alphabetic > order. I have applied the first editorial change you suggested, and placed all of the parameters in alphabetical order. I did not feel it was necessary to include a prefatory note. Thank you for submitting your comment. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1044.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 2: Serializing arbitrary data models The Note towards the end of this section overflows the right margin when this specification is printed. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:22:38 PM: > SECTION 2: Serializing arbitrary data models > > The Note towards the end of this section overflows the right > margin when this specification is printed. I believe you were referring specifically to the XQuery expression that appears in the example in that note. I have adjusted the formatting of the example, so that it should fit within a page of reasonable width. Thank you for pointing out the problem. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1045.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
SECTION 2: Serializing arbitrary data models The introductory paragraph says that the result of each step should be another sequence. Step 1 says "Replace an empty sequence with a zero-length string." The term "empty string" is a poor choice of terminology. A "string" might be either a text node or an atomic value of type xs:string. Since text nodes are not allowed to have length zero, you must mean an xs:string value of length 0. You could save your readers the trouble of making these deductions (and possibly missing the fact that a text node cannot be empty) by simply saying "Replace the empty sequence by an atomic value of type xs:string and length 0". - Steve B.
SECTION 2: Serializing arbitrary data models The use of the term "data models" in the title is incorrect. A data model is an abstract specification of what values are permissible within a system. You are not talking about serializing arbitrary abstract specifications. What you mean here is "serializing arbitrary sequences". This abuse of the term "data model" is persistent throughout the entire specification. It would be a good idea to scan the entire specification for "data model" and use the proper terminology for those so-called "data models" that are really values. - Steve B.
Steve, In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. Stephen Buxton wrote on 2004-02-18 05:22:56 PM: > SECTION 2: Serializing arbitrary data models > > The use of the term "data models" in the title is incorrect. > A data model is an abstract specification of what values are > permissible within a system. You are not talking about > serializing arbitrary abstract specifications. > What you mean here is "serializing arbitrary sequences". > This abuse of the term "data model" is persistent throughout > the entire specification. It would be a good idea to scan > the entire specification for "data model" and use the proper > terminology for those so-called "data models" that are > really values. Thank you for pointing out this error. The editors of the various XQuery and XSLT specifications decided that the appropriate term to use when speaking of a value is "instance of the data model", and that the term "data model" should be used only when speaking of that specification. I have applied this editorial change throughout the Serialization specification. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1047.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Section 2 Editorial Please rewrite "whether namespaces, are to be undeclared " as "whether namespaces are to be undeclared ".
Michael, In [2], I sent the following response to one of your comments: Henry Zongaro/Toronto/IBM wrote on 2004-06-13 02:05:48 PM: > In [1], you submitted the following comment on the Last Call > Working Draft of XSLT 2.0 and XQuery 1.0 Serialization. > > Michael Rys wrote on 2004-02-26 04:23:18 PM: > > Section 2 > > Editorial > > > > Please rewrite "whether namespaces, are to be undeclared " as "whether > > namespaces are to be undeclared ". > I have corrected the typographical error that you pointed out. > Thank you for submitting this comment. In fact, in response to another comment, the description of the meaning of the undeclare-namespaces parameter appears only in the section on the xml output method. I removed the text with the typographical error you pointed out. My apologies for any confusion. Thanks, Henry [1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1196.html [2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0062.html ------------------------------------------------------------------ Henry Zongaro Xalan development IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044 mailto:zongaro@ca.ibm.com
Section 2 Editorial "Serialization can be regarded as involving four phases of processing, carried out sequentially as follows:" should add normalization step that is mentioned earlier or make it clear that normalization has already occurred.
Michael,

In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Michael Rys wrote on 2004-02-26 04:23:21 PM:
> Section 2
> Editorial
>
> "Serialization can be regarded as involving four phases of processing,
> carried out sequentially as follows:" should add normalization step that
> is mentioned earlier or make it clear that normalization has already
> occurred.

I have applied an editorial change to indicate that the phases of serialization that this describes follow the normalization step. Thank you for your comment.

Thanks,

Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1199.html
------------------------------------------------------------------
Henry Zongaro Xalan development
IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Section 4 Editorial Please insert a subsection title before current 4.1 to improve structure of section.
Michael,

In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4
Editorial

Please insert a subsection title before current 4.1 to improve structure of section.
>>

Thank you for your comment, which I am handling editorially. I have applied the editorial change that you suggested to both the section entitled "XML Output Method" and the section entitled "HTML Output Method".

I would appreciate it if you could check the next draft of the specification when it becomes available, and verify that I've applied the change to your satisfaction.

Thanks,

Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1200.html
------------------------------------------------------------------
Henry Zongaro Xalan development
IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Section 4.2 Editorial Add some more concrete examples for last paragraph. E.g, U+0007 in XML 1.0 or U+0000 in XML 1.1.
Michael,

In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4.2
Editorial

Add some more concrete examples for last paragraph. E.g, U+0007 in XML 1.0 or U+0000 in XML 1.1.
>>

Thank you for your comment, which I am handling editorially. The section you refer to is entitled "XML Output Method: the encoding Parameter". I decided instead to add an example describing the effect of control characters in the section on the version parameter, and I added an example describing the effect of using characters that cannot be represented in a particular encoding to this section on the encoding parameter.

I would appreciate it if you could check the next draft of the specification when it becomes available, and verify that this is an acceptable response to your comment.

Thanks,

Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1201.html
------------------------------------------------------------------
Henry Zongaro Xalan development
IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
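As a rough illustration of the second example mentioned in this response (characters that cannot be represented in the chosen encoding), the following Python sketch writes such characters as numeric character references. The function name `encode_with_char_refs` is hypothetical, and Python's built-in `xmlcharrefreplace` error handler is only a stand-in for what a serializer might do in text and attribute content; it is not taken from the specification.

```python
def encode_with_char_refs(text, encoding):
    """Encode text, writing characters the encoding cannot represent
    as decimal numeric character references (&#N;)."""
    # Assumption: we are in a context (text or attribute content) where
    # character references are allowed; names and comments are not covered.
    return text.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    sample = "caf\u00e9 \u20ac"  # e-acute and the euro sign
    # US-ASCII can represent neither character.
    print(encode_with_char_refs(sample, "us-ascii"))
    # ISO-8859-1 can represent e-acute but not the euro sign.
    print(encode_with_char_refs(sample, "iso-8859-1"))
```

Under these assumptions, the same instance of the data model yields different byte sequences depending on the encoding parameter, which is the effect the added example in the draft is said to describe.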
Section 4 Editorial Remove note about using XSD type annotation mechanisms. If this is added, it should be added as a separate output method (see comment MS-SER-LC1-012).
Section 4.7 Editorial Please reword "represented most accurately " as "represented accurately ".
Michael,

In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Michael Rys wrote on 2004-02-26 04:23:48 PM:
> Section 4.7
> Editorial
>
> Please reword "represented most accurately " as "represented accurately
> ".

I have applied the editorial change that you recommended.

Thanks,

Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1203.html
------------------------------------------------------------------
Henry Zongaro Xalan development
IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Section 4.7 Editorial Please add the xml namespace node to the example, since that node is always in-scope. Also consider to represent the data model nodes without using an XML serialized form.
Michael,

In [1], you submitted the following comment on the Last Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4.7
Editorial

Please add the xml namespace node to the example, since that node is always in-scope. Also consider to represent the data model nodes without using an XML serialized form.
>>

Thank you for this comment, which I am handling editorially. I have applied the editorial changes that you suggested.

I would appreciate it if you could check the next draft of the specification when it becomes available, and verify that I've correctly applied the change.

Thanks,

Henry

[1] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1206.html
------------------------------------------------------------------
Henry Zongaro Xalan development
IBM SWS Toronto Lab T/L 969-6044; Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Dear XSL Working Group, Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...] If the instance of the data model includes a head element in the XHTML namespace, and the include-content-type parameter has the value yes, the xhtml output method MUST add a meta element immediately after the start-tag of the head element specifying the character encoding actually used. [...]

It is not clear what "immediately after" means here. I would expect the meta element to be the first child of the head element, but in the example it is the second node (preceded by a whitespace text node).

The example is non-conforming, as it lacks the trailing space before />. Please change the example to conform to the specification.
Dear XSL Working Group, Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...] The content type SHOULD be set to the value given for the media-type parameter; the default value for XHTML is text/html. The value application/xhtml+xml, registered in [RFC3236], MAY also be used. [...]

It is not clear to me what you mean here. Only two behaviors make sense to me, either

* the value is always text/html
* the value is always the value given for media-type, defaulting to text/html

It is never acceptable to use application/xhtml+xml unless explicitly requested (among other things, some user agents fail to recognize the charset parameter if the type is not text/html). Please change the text to state that the value must be set to the value given for media-type, or to text/html if no media-type is specified.
Dear XSL Working Group, Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

Please add a note that this process removes possible parameters in the attribute value, i.e. that

<meta http-equiv="Content-Type" content="text/html;version='3.0'" />

in the data model instance would be replaced by e.g.

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
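The behavior discussed in these three comments on section 6 can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration of the commenter's reading, not the specification's wording: the function name `add_content_type_meta` is hypothetical, the meta element is placed as the first child of head, any existing Content-Type meta element is dropped, and the content type falls back to text/html when no media-type is supplied.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def add_content_type_meta(root, media_type=None, encoding="utf-8"):
    """Insert a Content-Type meta element as the first child of head.

    Assumptions (not confirmed by the specification text quoted above):
    existing Content-Type meta elements are discarded, and text/html is
    used when no media type is given.
    """
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return root  # nothing to do without an XHTML head element
    # Drop any existing Content-Type meta, including its parameters.
    for meta in list(head.findall(f"{{{XHTML_NS}}}meta")):
        if meta.get("http-equiv", "").lower() == "content-type":
            head.remove(meta)
    new_meta = ET.Element(
        f"{{{XHTML_NS}}}meta",
        {
            "http-equiv": "Content-Type",
            "content": f"{media_type or 'text/html'};charset={encoding}",
        },
    )
    head.insert(0, new_meta)
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title>'
        '<meta http-equiv="Content-Type" content="text/html;version=\'3.0\'"/>'
        "</head><body/></html>"
    )
    add_content_type_meta(doc)
    print(ET.tostring(doc, encoding="unicode"))
```

In this sketch the `version='3.0'` parameter of the original meta element is lost, which is the side effect the last comment asks the specification to note.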
Dear XSL Working Group, Dear XML Query Working Group,

In Section 4 item 2 "Character expansion" it is not clear in which order these are to be processed: the prose states the list is "in priority order" but the list is unordered. If the list is meant to be in an order, please use an ordered list in the markup.

Specifically, please clarify whether URI escaping is affected by the normalization-form parameter, and further please add a note on how far escape-uri-attributes is consistent with the latest IRI Internet Draft.

I further note that the reference for the normalization forms is http://www.w3.org/TR/2002/WD-charmod-20020430/ which is quite outdated, and the latest version no longer covers this subject.
I also note that the reference [Unicode Normalization] appears in the References list but is not actually used in the document. It would seem you want to reference UAX 15 rather than charmod for the normative definition of the normalization forms. Please review the references section for other outdated and/or unused references and change the section to comply with the recommendations for references sections outlined in <http://www.w3.org/2001/06/manual/>. http://esw.w3.org/topic/SiteTools lists several tools that might help here, for example <http://www.w3.org/2004/07/references-checker-ui>.
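To show why the ordering question about URI escaping and the normalization-form parameter matters, the following Python sketch escapes the same attribute value with and without prior Unicode normalization. It does not state which order the specification requires. The function name `escape_uri` and the example URI are hypothetical, and `urllib.parse.quote` is used only as a stand-in for the escaping that escape-uri-attributes describes.

```python
import unicodedata
from urllib.parse import quote


def escape_uri(value, normalization_form=None):
    """Percent-escape non-ASCII characters, optionally normalizing first.

    Assumption: normalization, when requested, happens before escaping.
    Whether the specification intends this order is the open question.
    """
    if normalization_form is not None:
        value = unicodedata.normalize(normalization_form, value)
    # Keep ':' and '/' so the URI structure stays readable.
    return quote(value, safe=":/")


if __name__ == "__main__":
    # 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
    uri = "http://example.org/cafe\u0301"
    print(escape_uri(uri))         # escapes the combining character
    print(escape_uri(uri, "NFC"))  # escapes the precomposed U+00E9
```

The two results differ (`e%CC%81` versus `%C3%A9`), so a serializer that escapes before normalizing would emit a different URI from one that normalizes first.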
hello. working with saxon, i discovered that the omit-xml-declaration value is ignored if i use any other xml encoding than UTF-8. from the xml point of view, this makes sense. however, when using xhtml documents as fragments that are assembled with a non xml-aware server-side mechanism such as php, this is a problem because unless i use UTF-8, i end up with xml declarations in the middle of my assembled web page, which makes it invalid. http://www.w3.org/TR/xslt-xquery-serialization/#xhtml-output claims that the omit-xml-declaration should be observed, but does so only in a note, which leaves me wondering whether saxon is incorrect or just interpreting the spec differently. i would suggest to clarify this issue, and i would also suggest to require that the xml declaration is not output, regardless of the encoding.
> http://www.w3.org/TR/xslt-xquery-serialization/#N10E63 probably is the
> relevant part. as i understand it, it only specifies when a declaration
> MUST be output, but not when it MUST NOT be output. or am i reading it
> wrong or lacking some context?

I think you're right, the specification forgets to say that an XML declaration must not be output if the omit-xml-declaration parameter has the value "yes". Sometimes one forgets to say the obvious.

Michael Kay
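The rule agreed on in this exchange can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an implementation of the specification: the function name `serialize_xml` is hypothetical, the markup is passed in as an already-serialized string, and the interactions with the standalone and doctype-system parameters are ignored.

```python
def serialize_xml(body, encoding="UTF-8", omit_xml_declaration=False):
    """Return the encoded document, with or without an XML declaration.

    Assumption: when omit_xml_declaration is true, no declaration is
    written, whatever the encoding is.
    """
    if omit_xml_declaration:
        declaration = ""
    else:
        declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'
    return (declaration + body).encode(encoding)


if __name__ == "__main__":
    fragment = "<p>caf\u00e9</p>"
    # The fragment is meant to be spliced into a larger page, so the
    # declaration is suppressed even though the encoding is not UTF-8.
    print(serialize_xml(fragment, "ISO-8859-1", omit_xml_declaration=True))
    print(serialize_xml(fragment, "ISO-8859-1"))
```

Note that a fragment serialized this way in an encoding other than UTF-8 or UTF-16 no longer carries its own encoding information, so the assembling application has to supply it by other means.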