The question of whether to represent parsed entities was the subject of much discussion at the last-call stage. We decided that they were not required by most of the Infoset's client specs, with the DOM being the clear exception.
This decision was indeed counter to design principle 2.2 of the requirements document in that editors may require this information, but it had become apparent that this requirement was beyond what the Infoset could reasonably provide. Editors may also require information such as attribute order, whitespace in tags, and the kind of quotation marks used, all of which we had already judged too low-level for the Infoset, so we did not find their need for entity boundaries compelling.
We do not agree that this decision violates requirements 3.3 and 3.4.
The case of unparsed entities is different; they are part of the logical content of the document and the XML specification requires that they be reported.
The Infoset does allow for processors that do not expand external parsed entities, by means of the Unexpanded Entity Reference information item. Though there is no separate representation of the entity, all the relevant information is provided.
xmlns="" attributes are reflected in the [namespace attributes] property. They do not appear in the [in-scope namespaces] property because they do not correspond to a namespace that is in scope, but rather one which is no longer in scope.
The interpretation of xmlns="" as associating the empty prefix with a "null namespace" would be contrary to the Namespaces rec, which explicitly states that "unprefixed elements in the scope of the declaration are not considered to be in any namespace" (section 5.2).
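As an informal illustration of the point above, the following sketch uses Python's standard xml.etree.ElementTree module, which is a namespace-aware parser API and not an Infoset implementation. The document and the namespace name urn:example:x are made up for the example. It shows that elements in the scope of xmlns="" are reported with no namespace name at all, rather than with some "null namespace".

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <b> undeclares the default namespace with xmlns="".
doc = '<a xmlns="urn:example:x"><b xmlns=""><c/></b></a>'

root = ET.fromstring(doc)
b = root[0]
c = b[0]

# ElementTree writes a namespaced name as "{namespace-name}local-name"
# and a name in no namespace as the bare local name.
print(root.tag)
print(b.tag)
print(c.tag)
```

Here <a> is in the urn:example:x namespace, while <b> and its child <c> are in no namespace, which matches the reading of section 5.2 of the Namespaces recommendation given above.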
Accepted except for two points.
"Attribute defaults are not required reading for non-validating processors" - this is not correct. All processors must default attributes if they have read the relevant declaration.
"The 20th item under Appendix D [...] is in direct conflict with the tenth reporting requirement under Appendix B" - this is a misunderstanding. Item 10 in appendix B says that an XML processor must supply as the attribute's actual value the default value for attributes whose declaration provides a default value but that are unspecified on the start tag in question. Item 20 in appendix D says that information about an attribute's declared default value (irrespective of whether that attribute is defaulted in this case) is not included in the Infoset.
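The distinction between the two items can be illustrated informally with Python's standard xml.etree.ElementTree module, whose underlying parser is non-validating but reads the internal DTD subset. This is only a sketch with a made-up document, not a statement about how an Infoset implementation must behave. The outer element omits the attribute and so receives the declared default as its actual value; the inner element specifies the attribute and keeps its own value.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a default for "kind".
doc = """<!DOCTYPE a [
  <!ATTLIST a kind CDATA "plain">
]>
<a><a kind="fancy"/></a>"""

root = ET.fromstring(doc)
inner = root[0]

# Unspecified on the start tag: the default is supplied as the actual value.
print(root.get("kind"))
# Specified on the start tag: the specified value is reported.
print(inner.get("kind"))
```

In both cases the consumer sees only the attribute's actual value. Nothing in the result says what the declared default was, which is the kind of information item 20 in appendix D excludes.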
We agree that this is a serious problem, but we cannot fully address it in the Infoset spec.
The spec defines the Infoset resulting from parsing an XML 1.0 document. To go beyond this and circumscribe what Infosets can result from other operations is outside the scope of the spec, and more importantly not a problem we know how to solve in general.
There are two well-known cases where inconsistency can arise in the (unextended) Infoset. The case of [in-scope namespaces] and [namespace attributes] seems to be well understood. XSLT requires that namespace attributes be added on output, corresponding to the namespace nodes. In Infoset terms this means that [in-scope namespaces] takes precedence over [namespace attributes]. Similarly we expect that xml:base attributes will be added or changed to match the [base URI] property (though some have suggested that xml:base attributes resulting from schema processing might affect [base URI]).
We will add text to the spec indicating that [in-scope namespaces] and [base URI] should be taken as definitive in preference to [namespace attributes] and xml:base attributes respectively.
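One way a serializer might treat [in-scope namespaces] as definitive is sketched below. The function name and the data model are assumptions made for this example only: each element's in-scope namespaces are represented as a dictionary from prefix to namespace name, with the empty string standing for the default namespace. The declarations to write are derived from the difference between an element's in-scope namespaces and its parent's, and any existing [namespace attributes] are ignored.

```python
def namespace_declarations(parent_in_scope, in_scope):
    """Return the declarations a serializer would need to emit on an element.

    Both arguments are hypothetical dicts mapping prefix -> namespace name,
    with "" as the key for the default namespace. The element's existing
    namespace attributes are deliberately not consulted.
    """
    decls = {}
    for prefix, name in in_scope.items():
        # Declare anything that is new or that differs from the parent.
        if parent_in_scope.get(prefix) != name:
            decls[prefix] = name
    # A default namespace in scope on the parent but not on this element
    # has to be undeclared with xmlns="".
    if "" in parent_in_scope and "" not in in_scope:
        decls[""] = ""
    return decls


parent = {"": "urn:example:a", "p": "urn:example:p"}
child = {"p": "urn:example:p", "q": "urn:example:q"}
print(namespace_declarations(parent, child))
```

For this input the element needs a declaration for the prefix q and an xmlns="" attribute. The sketch does not try to remove prefixed bindings inherited from the parent, since Namespaces in XML 1.0 provides no way to undeclare a prefix.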
We removed comments from the Document Type Declaration Info Item because they are useless out of context. We retained PIs because they are needed for (e.g.) stylesheet processing.
We did not feel it was worth having a separate info item just to contain these PIs.
Though we agree that it might have been better to use a different name instead of "Document Type Declaration information item", we do not consider it important enough to change at this stage.
Unexpanded entity reference items are also used in the case where the processor has not read a declaration for the entity, either because it has not read the whole DTD or because there is no declaration. We consider that it is useful for the Infoset to support these cases since both are allowed by the XML 1.0 specification (the absence of a declaration for an entity may be only a validity error).
Note that the Infoset makes these cases distinguishable - the [system identifier] property of the unexpanded entity item will have a value, be unknown, or have no value depending on whether the processor has read a declaration, not read one (but not read the whole DTD), or not read one despite reading the whole DTD.
The [all declarations processed] property indicates whether the processor has read the whole DTD, and a value of false should be interpreted as meaning that the processor may not have been able to generate the correct Infoset of the document, both in terms of entities and attribute defaulting and normalization.
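The three cases can be summarized in a short sketch. The function, the enum, and the dictionary of declarations are hypothetical names introduced for this example; the Infoset itself defines only the property values, not an API.

```python
from enum import Enum


class Special(Enum):
    """Stand-ins for the two non-string property states."""
    UNKNOWN = "unknown"
    NO_VALUE = "no value"


def system_identifier(entity_name, declarations_read, all_declarations_processed):
    """Sketch of the [system identifier] of an unexpanded entity reference.

    declarations_read is a hypothetical dict mapping the names of the entity
    declarations the processor has read to their system identifiers.
    """
    if entity_name in declarations_read:
        # A declaration was read: the property has a value.
        return declarations_read[entity_name]
    if not all_declarations_processed:
        # No declaration seen, but part of the DTD was not read.
        return Special.UNKNOWN
    # The whole DTD was read and no declaration exists.
    return Special.NO_VALUE


declared = {"chap1": "chap1.xml"}
print(system_identifier("chap1", declared, False))
print(system_identifier("chap2", declared, False))
print(system_identifier("chap2", declared, True))
```

A consumer that receives the unknown state, or that sees [all declarations processed] set to false, should treat the rest of the Infoset with the caution described above.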
Handling attribute values containing (legal) references to externally declared internal entities is a problem for processors in general, not just the Infoset. Erratum E10 to the second edition makes it an error to pass such documents to processors that do not read the whole DTD.