This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The wording of the serialization parameter byte-order-mark allows (and indeed requires) an implementation to omit the byte order mark when byte-order-mark="no" is specified. This is invalid behaviour when encoding="UTF-16" and method="xml" or method="xhtml", as XML requires a byte order mark for UTF-16. Accordingly, I think that sections 5.1.11 and 6.1.11 should state some specific rule for when encoding="UTF-16" (either to ignore the byte-order-mark parameter, or to make it a serialization error).
If the environment into which you serialize the data does guarantee the UTF-16 encoding (such as specific string types in some programming languages and databases), there is no reason to require the BOM. It actually could make further processing more complex. Also note that the XML parsers allow for the encoding to be provided through external means.
While these two points are true, they do not negate the requirement in section 4.3.3 of Extensible Markup Language (XML) 1.0 (Third Edition) and Extensible Markup Language (XML) 1.1. The word MUST is used.
(In reply to comment #2)
> While these two points are true, they do not negate the requirement in
> section 4.3.3 of Extensible Markup Language (XML) 1.0 (Third Edition)
> and Extensible Markup Language (XML) 1.1. The word MUST is used.

This is true of complete documents, but XSLT also has a requirement to support fragments, for example multiple top-level elements. These fragments are often combined by some post-process, and having the BOM there may well be inconvenient. This is essentially the same issue as the XML declaration. An XML declaration is similarly mandatory for an XML document if the encoding is not UTF-8 or UTF-16, and XSLT 1.0 mandated that it be added. This turned out to be too restrictive, and XSLT 2.0 allows people to request that it be omitted, even though this may make the result not well-formed, on the assumption that they know what they are doing...
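The inconvenience of a BOM when fragments are combined can be illustrated with a minimal Python sketch. The `serialize` helper and its `byte_order_mark` argument are hypothetical stand-ins for a serializer, not part of any XSLT or XQuery implementation, and little-endian byte order is assumed purely for illustration.

```python
import codecs


def serialize(fragment: str, byte_order_mark: bool) -> bytes:
    """Hypothetical serializer: UTF-16 (little-endian) bytes, optionally BOM-prefixed."""
    body = fragment.encode("utf-16-le")
    return codecs.BOM_UTF16_LE + body if byte_order_mark else body


fragments = ["<a/>", "<b/>"]

# A post-process that simply concatenates the serialized fragments.
joined_with_bom = b"".join(serialize(f, True) for f in fragments)
joined_without_bom = b"".join(serialize(f, False) for f in fragments)

# The "utf-16" decoder consumes only the leading BOM; the second BOM
# survives as a U+FEFF character in the middle of the combined text.
text_with_bom = joined_with_bom.decode("utf-16")

# Without BOMs the byte order must be known by other means (assumed LE here).
text_without_bom = joined_without_bom.decode("utf-16-le")

print(repr(text_with_bom))
print(repr(text_without_bom))
```

In this sketch the concatenating process would have to strip every BOM except the first, which is the kind of extra work the comment above refers to.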
But what you are describing is an external parsed general entity, and the BOM is still mandatory for such entities when the encoding is UTF-16. If a BOM is inconvenient for a particular application, then you can serialize in UTF-16BE or UTF-16LE, where the BOM is not compulsory (in fact it is forbidden by the Unicode standard).
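The distinction between UTF-16 and the explicitly ordered UTF-16BE and UTF-16LE can be seen in Python's standard codecs, which are used here only as an illustration and are not an XSLT serializer. The `has_bom` helper is a hypothetical name introduced for this sketch.

```python
import codecs

fragment = "<a/>"

# The generic "utf-16" codec prepends a BOM (in the platform's byte order).
with_bom = fragment.encode("utf-16")

# The explicitly ordered codecs write no BOM at all.
little_endian = fragment.encode("utf-16-le")
big_endian = fragment.encode("utf-16-be")


def has_bom(data: bytes) -> bool:
    """Hypothetical helper: True if data starts with either UTF-16 BOM."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


print(has_bom(with_bom), has_bom(little_endian), has_bom(big_endian))
```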
The XSL and XQuery working group discussed this comment on Feb 1, 2006 and decided that no technical change is required. This decision was taken in order to support the use cases where different resulting XML fragments need to be concatenated and the BOM would add additional complexity to that process.

A non-normative note will be added to the specification to explain why the combination of UTF-16 and no BOM is allowed. This note would point out that the serialization specification can output XML fragments that may not be well-formed external general parsed entities.

I am marking this bug as CLOSED. Please reopen this bug within one week if you feel this resolution is unacceptable. Thank you for raising the comment.

Joanne