XML 1.0 Fifth Edition Specification Errata


This document records all known errors in the Fifth Edition of the Extensible Markup Language (XML) 1.0 Specification; for updates see the latest version.

The errata are numbered, classified as Substantive or Editorial, and listed in reverse chronological order of their date of publication in each category. Changes to the text of the spec are indicated thus: deleted text, new text, modified text. Substantive corrections are proposed by the XML Core Working Group, which has consensus that they are appropriate; they are not to be considered normative until approved by a Call for Review of Proposed Corrections or a Call for Review of an Edited Recommendation.

Please email error reports to xml-editor@w3.org.

Substantive errata

Errata as of 2013-01-09


Section 2.3 Common Syntactic Constructs

Delete the following paragraph:

Names beginning with the string "xml", or with any string which would match (('X'|'x') ('M'|'m') ('L'|'l')), are reserved for standardization in this or future versions of this specification.

Section 2.6 Processing Instructions

Make the following change:

PIs are not part of the document's character data, but must be passed through to the application. The PI begins with a target (PITarget) used to identify the application to which the instruction is directed. [deleted: The target names "XML", "xml", and so on are reserved for standardization in this or future versions of this specification.] [new: Target names that begin "xml-" are reserved for standardization in this or future specifications from the XML Core Working Group or its successors. The target name "xml" is forbidden as it introduces the XML declaration.] The XML Notation mechanism may be used for formal declaration of PI targets. Parameter entity references must not be recognized within processing instructions.
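
As an informal illustration of the revised wording, the following Python sketch classifies a candidate PI target. The function name `classify_pi_target` and its three result labels are hypothetical and appear nowhere in the specification. The sketch assumes the PITarget production, `Name - (('X'|'x') ('M'|'m') ('L'|'l'))`, is unchanged by this erratum, so every case variant of "xml" is excluded, and it assumes the "xml-" prefix is matched case-sensitively, as the text is literally written. It does not check that the target is a valid Name.

```python
import re

# Hypothetical helper for illustration only; not defined by the specification.
# Assumption: the PITarget production still excludes every case variant of "xml":
#   PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
_EXCLUDED_TARGET = re.compile(r"[Xx][Mm][Ll]")


def classify_pi_target(target: str) -> str:
    """Return 'forbidden', 'reserved', or 'available' for a PI target name.

    The caller is assumed to have already verified that `target` is a Name.
    """
    if _EXCLUDED_TARGET.fullmatch(target):
        # Excluded by the PITarget production ("xml" introduces the XML declaration).
        return "forbidden"
    if target.startswith("xml-"):
        # Reserved for specifications from the XML Core Working Group or its successors.
        # Assumption: the prefix is compared case-sensitively.
        return "reserved"
    return "available"
```

Under these assumptions a target such as "xml-stylesheet" falls in the reserved class, while a target that merely starts with the letters "xml" (for example "xmlfoo") is no longer reserved by this paragraph.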

Section 3 Logical Structures

Make the following changes:

This specification does not constrain the application semantics, use, or (beyond syntax) names of the element types and attributes, except that names beginning with [deleted: a match to (('X'|'x')('M'|'m')('L'|'l'))] [new: "xml:"] are reserved for standardization in [deleted: this or future versions of this specification] [new: specifications from the XML Core Working Group or its successors].

Editorial errata

Errata as of 2009-09-16


Section 2.2 Characters

Add a Note at the very end of the section as follows:


[Unicode] (conformance clause C06) says that canonically equivalent sequences of characters ought to be treated as identical. However, XML parsed entities (including document entities) that are canonically equivalent according to Unicode but which use distinct code point (character) sequences are considered distinct by XML processors. Therefore, all XML parsed entities SHOULD be created in a "fully normalized" form per [CharMod-Norm]. Otherwise the user might unknowingly create canonically equivalent but unequal sequences that appear identical to the user but which are treated as distinct by XML processors.

A document can still be well-formed, even if it is not in a normalized form. XML processors MAY verify that the document being processed is in a fully-normalized form and report to the application whether it is or not.
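
The effect described in this Note can be observed with a short Python sketch. It is illustrative only: the choice of `xml.etree.ElementTree` as the XML processor and the helper name `is_nfc` are assumptions, not anything the specification prescribes. Unicode Normalization Form C (NFC) is used here as an approximation; "fully normalized" as defined in [CharMod-Norm] imposes further constraints that this check does not cover.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Two well-formed documents that look identical when rendered but use
# different code point sequences for the final character.
composed = "<name>caf\u00e9</name>"      # U+00E9 (precomposed e with acute)
decomposed = "<name>cafe\u0301</name>"   # U+0065 followed by U+0301 (combining acute)

text_a = ET.fromstring(composed).text
text_b = ET.fromstring(decomposed).text


def is_nfc(text: str) -> bool:
    """Hypothetical helper: report whether `text` is already in NFC.

    Assumption: NFC stands in for "fully normalized"; [CharMod-Norm]
    defines a stricter condition.
    """
    return unicodedata.normalize("NFC", text) == text


# The processor hands back distinct strings for canonically equivalent input.
texts_differ = text_a != text_b

# Normalizing both strings makes them compare equal.
equal_after_nfc = (
    unicodedata.normalize("NFC", text_a) == unicodedata.normalize("NFC", text_b)
)
```

Both documents parse without error, which matches the statement that well-formedness does not depend on normalization; a check like `is_nfc` corresponds to the optional verification a processor MAY perform and report to the application.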

Section A.2 Other References

Add a reference to CharMod-Norm:

W3C Working Draft. Character Model for the World Wide Web 1.0: Normalization. François Yergeau, Martin J. Dürst, Richard Ishida, Addison Phillips, Misha Wolf, Tex Texin. (See http://www.w3.org/TR/charmod-norm/.)

The ill effects of (the lack of) Unicode normalization are noteworthy, but advice in a note is the best that can be done as part of an erratum, i.e. without changing the spec normatively.

Last updated $Date: 2013-02-13 15:08:50 $ by $Author: NormanWalsh $