26 January 2001

1. Document Object Model Core

Editors
Arnaud Le Hors, IBM

1.1. DOM Level 3 Core

(ED: Although the following defines a set of new interfaces that extend the DOM Level 2 interfaces, the current plan is not to add any new interfaces but instead to expand the existing ones.)
Issue Level-3-Core-1:
Instead of extending the core interfaces, should we define some kind of optional DOMUtility interface?
Type Definition DOMKey

A DOMKey is a unique key generated by the DOM implementation to uniquely identify DOM nodes.


IDL Definition
typedef Object DOMKey;

Interface Entity3

This interface extends the Entity interface with additional attributes to provide information on the text declaration of external parsed entities.


IDL Definition
interface Entity3 : Entity {
           attribute DOMString        actualEncoding;
           attribute DOMString        encoding;
           attribute DOMString        version;
};

Attributes
actualEncoding of type DOMString
An attribute specifying the actual encoding of this entity, when it is an external parsed entity. This is null otherwise.
encoding of type DOMString
An attribute specifying, as part of the text declaration, the encoding of this entity, when it is an external parsed entity. This is null otherwise.
version of type DOMString
An attribute specifying, as part of the text declaration, the version number of this entity, when it is an external parsed entity. This is null otherwise.
Interface Document3

This interface extends the Document interface with additional attributes and methods.


IDL Definition
interface Document3 : Document {
           attribute DOMString        actualEncoding;
           attribute DOMString        encoding;
           attribute boolean          standalone;
           attribute boolean          strictErrorChecking;
           attribute DOMString        version;
  Node               adoptNode(in Node source)
                                        raises(DOMException);
};

Attributes
actualEncoding of type DOMString
An attribute specifying the actual encoding of this document. This is null when the encoding is not known.
encoding of type DOMString
An attribute specifying, as part of the XML declaration, the encoding of this document. This is null when unspecified.
standalone of type boolean
An attribute specifying, as part of the XML declaration, whether this document is standalone.
strictErrorChecking of type boolean
An attribute specifying whether error checking is enforced or not. When set to false, the implementation is free to not test every possible error case normally defined on DOM operations, and not raise any DOMException. In case of error, the behavior is undefined. This attribute is true by default.
version of type DOMString
An attribute specifying, as part of the XML declaration, the version number of this document. This is null when unspecified.
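As a rough illustration of these attributes, the sketch below reads the XML declaration through the org.w3c.dom interfaces bundled with current JDKs. The accessor names used there (getXmlVersion, getXmlEncoding, getXmlStandalone, getStrictErrorChecking) differ from the attribute names in this draft; treating them as counterparts of version, encoding, standalone and strictErrorChecking is an assumption of this example, and the class name and sample document are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclarationDemo {

    // Parses an in-memory document; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Summarises the XML declaration as exposed by the JDK accessors
    // (assumed here to correspond to the draft's attributes).
    static String describe(Document doc) {
        return "version=" + doc.getXmlVersion()
                + " encoding=" + doc.getXmlEncoding()
                + " standalone=" + doc.getXmlStandalone()
                + " strictErrorChecking=" + doc.getStrictErrorChecking();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><root/>";
        System.out.println(describe(parse(xml)));
    }
}
```

A document parsed without an XML declaration would be expected to report a null encoding, in line with "null when unspecified" above.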
Methods
adoptNode
Changes the ownerDocument of a node, its children, as well as the attached attribute nodes if there are any. If the node has a parent, it is first removed from its parent's child list. This effectively allows moving a subtree from one document to another. The following list describes the specifics for each type of node.
ATTRIBUTE_NODE
The ownerElement attribute is set to null and the specified flag is set to true on the adopted Attr. The descendants of the source Attr are recursively adopted.
DOCUMENT_FRAGMENT_NODE
The descendants of the source node are recursively adopted.
DOCUMENT_NODE
Document nodes cannot be adopted.
DOCUMENT_TYPE_NODE
DocumentType nodes cannot be adopted.
ELEMENT_NODE
Specified attribute nodes of the source element are adopted. Default attributes are discarded, though if the document being adopted into defines default attributes for this element name, those are assigned. The descendants of the source element are recursively adopted.
ENTITY_NODE
Entity nodes cannot be adopted.
ENTITY_REFERENCE_NODE
Only the EntityReference node itself is adopted; the descendants are discarded, since the source and destination documents might have defined the entity differently. If the document being adopted into provides a definition for this entity name, its value is assigned.
NOTATION_NODE
Notation nodes cannot be adopted.
PROCESSING_INSTRUCTION_NODE, TEXT_NODE, CDATA_SECTION_NODE, COMMENT_NODE
These nodes can all be adopted. No specifics.
Issue adoptNode-1:
Should this method simply return null when it fails? How "exceptional" is failure for this method?
Resolution: Stick with raising exceptions only in exceptional circumstances, return null on failure (F2F 19 Jun 2000).
Issue adoptNode-2:
Can an entity node really be adopted?
Resolution: No, neither can Notation nodes (Telcon 13 Dec 2000).
Issue adoptNode-3:
Does this affect the keys and hashCodes of the adopted subtree nodes?
If so, what about the readonly-ness of key and hashCode?
If not, would appendChild affect keys/hashCodes, or would it generate exceptions if keys are duplicated?
Parameters
source of type Node
The node to move into this document.
Return Value

Node

The adopted node, or null if this operation fails, such as when the source node comes from a different implementation.

Exceptions

DOMException

NOT_SUPPORTED_ERR: Raised if the source node is of type DOCUMENT or DOCUMENT_TYPE.

NO_MODIFICATION_ALLOWED_ERR: Raised when the source node is readonly.
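The following sketch shows how moving a subtree between documents might look from Java, using the adoptNode method of the org.w3c.dom.Document interface bundled with current JDKs. The class name, the helper moveInto and the sample element names are hypothetical, and the null check simply follows the return value described above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class AdoptNodeDemo {

    // Creates an empty document with the JDK's default DOM implementation.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: adopts `source` into the document that owns
    // `newParent`, then appends it there. Returns null if adoption fails.
    static Node moveInto(Element newParent, Node source) {
        Node adopted = newParent.getOwnerDocument().adoptNode(source);
        if (adopted != null) {
            newParent.appendChild(adopted);
        }
        return adopted;
    }

    public static void main(String[] args) {
        Document from = newDocument();
        Element oldRoot = from.createElement("old");
        from.appendChild(oldRoot);
        Element item = from.createElement("item");
        item.setAttribute("id", "1");
        item.appendChild(from.createTextNode("hello"));
        oldRoot.appendChild(item);

        Document to = newDocument();
        Element newRoot = to.createElement("new");
        to.appendChild(newRoot);

        Node adopted = moveInto(newRoot, item);
        System.out.println("owner changed: " + (adopted.getOwnerDocument() == to));
        System.out.println("removed from old parent: " + !oldRoot.hasChildNodes());
    }
}
```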

Interface Node3

This interface extends the Node interface with several new methods. One compares a node against another with regard to document order. Another retrieves the content of a node and its descendants as a single DOMString. One tests whether two nodes are the same. Two methods provide for looking up the namespace URI associated with a given prefix and for ensuring the document is "namespace wellformed". Finally, it also provides a new attribute to get the base URI of a node, as defined in the XML Infoset.

Issue namespace-wellformed:
The term namespace wellformed needs to be defined.
Resolution: Define it as "being conformant to the Namespaces in XML spec" in the glossary (Telcon 4 Jul 2000).

IDL Definition
interface Node3 {
  readonly attribute DOMString        baseURI;

  typedef enum _DocumentOrder {
    DOCUMENT_ORDER_PRECEDING,
    DOCUMENT_ORDER_FOLLOWING,
    DOCUMENT_ORDER_SAME,
    DOCUMENT_ORDER_UNORDERED
  } DocumentOrder;
  DocumentOrder      compareDocumentOrder(in Node other)
                                        raises(DOMException);

  typedef enum _TreePosition {
    TREE_POSITION_PRECEDING,
    TREE_POSITION_FOLLOWING,
    TREE_POSITION_ANCESTOR,
    TREE_POSITION_DESCENDANT,
    TREE_POSITION_SAME,
    TREE_POSITION_UNORDERED
  } TreePosition;
  TreePosition       compareTreePosition(in Node other)
                                        raises(DOMException);
           attribute DOMString        textContent;
  boolean            isSameNode(in Node other);
  DOMString          lookupNamespacePrefix(in DOMString namespaceURI);
  DOMString          lookupNamespaceURI(in DOMString prefix);
  void               normalizeNS();
  readonly attribute DOMKey           key;
  boolean            equalsNode(in Node arg, 
                                in boolean deep);
};

Type Definition DocumentOrder

A type to hold the document order of a node relative to another node.

Enumeration _DocumentOrder

An enumeration of the different orders the node can be in.

Enumerator Values
DOCUMENT_ORDER_PRECEDING

The node precedes the reference node in document order.

DOCUMENT_ORDER_FOLLOWING

The node follows the reference node in document order.

DOCUMENT_ORDER_SAME

The two nodes have the same document order.

DOCUMENT_ORDER_UNORDERED

The two nodes are unordered; they do not have any common ancestor.

Type Definition TreePosition

A type to hold the relative tree position of a node with respect to another node.

Enumeration _TreePosition

An enumeration of the different positions the node can be in.

Enumerator Values
TREE_POSITION_PRECEDING

The node precedes the reference node.

TREE_POSITION_FOLLOWING

The node follows the reference node.

TREE_POSITION_ANCESTOR

The node is an ancestor of the reference node.

TREE_POSITION_DESCENDANT

The node is a descendant of the reference node.

TREE_POSITION_SAME

The two nodes have the same position.

TREE_POSITION_UNORDERED

The two nodes are unordered; they do not have any common ancestor.

Attributes
baseURI of type DOMString, readonly
Returns the absolute base URI of this node.
Issue baseURI-1:
How will this be affected by resolution of relative namespace URIs issue?
Issue baseURI-2:
Should this only be on Document, Element, ProcessingInstruction, Entity, and Notation nodes, according to the infoset? If not, what is it equal to on other nodes? Null? An empty string?
Issue baseURI-3:
Should this be read-only and computed, or an actual read-write attribute?
Resolution: The former (F2F 19 Jun 2000).
key of type DOMKey, readonly
This attribute returns a unique key identifying this node.
Issue key-1:
What type should this really be?
Resolution: DOMKey, mapped to Object in Java and Number in ECMAScript (Telcon 13 Dec 2000).
Issue key-2:
In what space is this key unique (Document, DOMImplementation)?
Issue key-3:
What is the lifetime of the uniqueness of this key (Node, Document, ...)?
textContent of type DOMString
This attribute returns the text content of this node and its descendants. When set, any children this node may have are removed and replaced by a single Text node containing the string this attribute is set to. On getting, no serialization is performed; the returned string does not contain any markup. Similarly, on setting, no parsing is performed; the input string is taken as pure textual content.
The string returned is made of the text content of this node depending on its type, as defined below:
Node type | Content
ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, DOCUMENT_NODE, DOCUMENT_FRAGMENT_NODE | concatenation of the textContent attribute value of every child node, excluding COMMENT_NODE and PROCESSING_INSTRUCTION_NODE nodes
ATTRIBUTE_NODE, TEXT_NODE, CDATA_SECTION_NODE, COMMENT_NODE, PROCESSING_INSTRUCTION_NODE | nodeValue
DOCUMENT_TYPE_NODE, NOTATION_NODE | empty string
Issue textContent-1:
Should any whitespace normalization be performed?
Issue textContent-2:
Should this be two methods instead?
Issue textContent-3:
What about the name?
Issue textContent-4:
Should this be optional? If yes, how do we signal it is not supported?
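To make the getting and setting behavior of textContent concrete, here is a small sketch using getTextContent and setTextContent as they appear on org.w3c.dom.Node in current JDKs. The class name and sample markup are hypothetical, and the JDK behavior is assumed to follow the rules described above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TextContentDemo {

    // Builds <p>Hello <!--not included--><b>world</b></p> in memory.
    static Element sample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element p = doc.createElement("p");
        doc.appendChild(p);
        p.appendChild(doc.createTextNode("Hello "));
        p.appendChild(doc.createComment("not included"));
        Element b = doc.createElement("b");
        b.appendChild(doc.createTextNode("world"));
        p.appendChild(b);
        return p;
    }

    public static void main(String[] args) {
        Element p = sample();
        // Getting: child text is concatenated, the comment is skipped.
        System.out.println(p.getTextContent());

        // Setting: all children are replaced by one Text node; the string
        // is not parsed, so the angle brackets stay literal characters.
        p.setTextContent("<i>raw</i>");
        System.out.println(p.getChildNodes().getLength());
        System.out.println(p.getTextContent());
    }
}
```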
Methods
compareDocumentOrder
Compares a node with this node with regard to document order.
Issue compareOrder-1:
Should an exception be raised when comparing attributes? Entities and notations? An element against an attribute? If yes, which one? HIERARCHY_REQUEST_ERR? Should the enum value "unordered" be killed then?
Resolution: No, return unordered for attributes (F2F 19 Jun 2000).
Issue compareOrder-2:
Should this method be moved to Node and take only one node in argument?
Resolution: Yes (F2F 19 Jun 2000).
Issue compareOrder-3:
Should this method be optional?
Parameters
other of type Node
The node to compare against this node.
Return Value

DocumentOrder

Returns how the given node compares with this node in document order.

Exceptions

DOMException

WRONG_DOCUMENT_ERR: Raised if the given node does not belong to the same document as this node.

compareTreePosition
Compares a node with this node with regard to their position in the tree.
Issue compareTreePosition-1:
Should this method be optional?
Parameters
other of type Node
The node to compare against this node.
Return Value

TreePosition

Returns how the given node is positioned relative to this node.

Exceptions

DOMException

WRONG_DOCUMENT_ERR: Raised if the given node does not belong to the same document as this node.
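One way to picture the TreePosition values is the sketch below, which derives them from parent and sibling links alone. It is not taken from any implementation: the class, the shortened enum names and the helper methods are hypothetical, "the node" is read as the argument and "the reference node" as the node the method is called on, and nodes without a common ancestor (including attributes) simply come out as UNORDERED instead of raising an exception.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class TreePositionDemo {

    // Mirrors the draft's _TreePosition enumerators (names shortened).
    enum TreePosition { PRECEDING, FOLLOWING, ANCESTOR, DESCENDANT, SAME, UNORDERED }

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Nodes from the topmost ancestor down to `node`, via getParentNode().
    static List<Node> pathFromRoot(Node node) {
        List<Node> path = new ArrayList<>();
        for (Node n = node; n != null; n = n.getParentNode()) {
            path.add(n);
        }
        Collections.reverse(path);
        return path;
    }

    // Position of `other` relative to `reference`. Assumes the implementation
    // returns the same Java object for the same node, so == can be used.
    static TreePosition compare(Node reference, Node other) {
        if (reference == other) {
            return TreePosition.SAME;
        }
        List<Node> refPath = pathFromRoot(reference);
        List<Node> otherPath = pathFromRoot(other);
        if (refPath.get(0) != otherPath.get(0)) {
            return TreePosition.UNORDERED; // no common ancestor
        }
        int i = 0;
        int shared = Math.min(refPath.size(), otherPath.size());
        while (i < shared && refPath.get(i) == otherPath.get(i)) {
            i++;
        }
        if (i == otherPath.size()) {
            return TreePosition.ANCESTOR;   // other lies on reference's path
        }
        if (i == refPath.size()) {
            return TreePosition.DESCENDANT; // reference lies on other's path
        }
        // The paths diverge below a common parent: compare the two siblings.
        Node otherBranch = otherPath.get(i);
        for (Node s = refPath.get(i).getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s == otherBranch) {
                return TreePosition.PRECEDING;
            }
        }
        return TreePosition.FOLLOWING;
    }
}
```

Because an Attr node has no parent, this sketch reports attributes as UNORDERED relative to any element, which happens to match the resolution recorded for compareDocumentOrder.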

equalsNode
Tests whether two nodes are equal.
This method tests for equality of nodes, not sameness (i.e., whether the two nodes are exactly the same object) which can be tested with Node.isSameNode. All objects that are the same will also be equal, though the reverse may not be true.
Issue equalsNode-1:
Should this be optional?
Parameters
arg of type Node
The node to compare equality with.
deep of type boolean
If true, recursively compare the subtrees; if false, compare only the nodes themselves (and their attributes, if they are Elements).
Return Value

boolean

Returns true if the nodes, and possibly their subtrees, are equal; false otherwise.

No Exceptions
isSameNode
Returns whether this node is the same node as the given one.
Issue isSameNode-1:
Do we really want to make this different from equals?
Resolution: Yes, change name from isIdentical to isSameNode. (Telcon 4 Jul 2000).
Issue isSameNode-2:
Is this really needed if we provide a unique key?
Parameters
other of type Node
The node to test against.
Return Value

boolean

Returns true if the nodes are the same, false otherwise.

No Exceptions
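The difference between equality and sameness can be seen with a deep clone, as in the sketch below. It uses isSameNode and isEqualNode from the org.w3c.dom.Node interface bundled with current JDKs; isEqualNode takes no deep flag and compares subtrees, so treating it as the counterpart of equalsNode(arg, true) is an assumption of this example. The class name and sample element are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeEqualityDemo {

    // Builds <item id="1">text</item> in memory.
    static Element sample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element item = doc.createElement("item");
        item.setAttribute("id", "1");
        item.appendChild(doc.createTextNode("text"));
        doc.appendChild(item);
        return item;
    }

    public static void main(String[] args) {
        Element original = sample();
        Node copy = original.cloneNode(true);

        // A deep clone is a distinct object with identical content.
        System.out.println("equal: " + original.isEqualNode(copy));
        System.out.println("same:  " + original.isSameNode(copy));

        // Changing the copy breaks equality; sameness was never there.
        ((Element) copy).setAttribute("id", "2");
        System.out.println("equal after change: " + original.isEqualNode(copy));
    }
}
```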
lookupNamespacePrefix
Looks up the prefix associated with the given namespace URI, starting from this node.
Issue lookupNamespacePrefix-1:
Should this be optional?
Parameters
namespaceURI of type DOMString
The namespace URI to look for.
Return Value

DOMString

Returns the associated namespace prefix or null if none is found.

No Exceptions
lookupNamespaceURI
Looks up the namespace URI associated with the given prefix, starting from this node.
Issue lookupNamespaceURI-1:
Name? May need to change depending on ending of the relative namespace URI reference nightmare.
Issue lookupNamespaceURI-2:
Should this be optional?
Parameters
prefix of type DOMString
The prefix to look for.
Return Value

DOMString

Returns the associated namespace URI or null if none is found.

No Exceptions
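As an illustration of the two lookup methods, the sketch below resolves a prefix and a namespace URI from a child element whose binding is declared on its parent. It relies on lookupPrefix and lookupNamespaceURI from the org.w3c.dom.Node interface in current JDKs; taking lookupPrefix as the counterpart of lookupNamespacePrefix is an assumption, and the class name, sample document and urn:example:a namespace are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceLookupDemo {

    // Parses a string with a namespace-aware parser.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a:root xmlns:a=\"urn:example:a\"><child/></a:root>");
        Node child = doc.getDocumentElement().getFirstChild();

        // Both lookups start at <child> and find the binding on <a:root>.
        System.out.println(child.lookupNamespaceURI("a"));
        System.out.println(child.lookupPrefix("urn:example:a"));

        // Unknown prefixes and URIs yield null.
        System.out.println(child.lookupNamespaceURI("missing"));
    }
}
```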
normalizeNS
This method walks down the tree, starting from this node, and adds namespace declarations where needed so that every namespace being used is properly declared. It also changes or assigns prefixes when needed. This effectively makes the subtree rooted at this node "namespace wellformed".
What the generated prefixes are and/or how prefixes are changed to achieve this is implementation dependent.
Issue normalizeNS-1:
Any other name?
Issue normalizeNS-2:
How specific should this be? Should we not even specify that this should be done by walking down the tree?
Issue normalizeNS-3:
What does this do on attribute nodes?
Resolution: Doesn't do anything (F2F 1 Aug 2000).
Issue normalizeNS-4:
How does it work with entity reference subtree which may be broken?
Resolution: This doesn't affect entity references which are not visited in this operation (F2F 1 Aug 2000).
Issue normalizeNS-5:
Should this really be on Node?
Resolution: Yes, but this only works on Document, Element, and DocumentFragment. On other types it is a no-op. (F2F 1 Aug 2000).
Issue normalizeNS-6:
What happens with read-only nodes?
Issue normalizeNS-7:
What/how errors should be reported? Are there any?
Issue normalizeNS-8:
Should this be optional?
No Parameters
No Return Value
No Exceptions
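The tree walk described for normalizeNS could look roughly like the sketch below. It is a deliberately simplified, hypothetical helper and not the algorithm of any implementation: it only looks at element names (not attribute names), never renames prefixes, and redeclares a prefix locally whenever the binding in scope does not match. Only elements are visited, so entity reference subtrees are left alone.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

class NamespaceDeclDemo {

    static final String XMLNS_URI = XMLConstants.XMLNS_ATTRIBUTE_NS_URI;

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks down from `element`, adding xmlns declarations for element
    // names whose prefix is not bound to their namespace URI in scope.
    static void declareNamespaces(Element element, Map<String, String> inScope) {
        Map<String, String> scope = new HashMap<>(inScope);

        // Record declarations already present on this element.
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            if (XMLNS_URI.equals(attr.getNamespaceURI())) {
                String prefix = "xmlns".equals(attr.getNodeName()) ? "" : attr.getLocalName();
                scope.put(prefix, attr.getNodeValue());
            }
        }

        // Declare the element's own namespace if it is not in scope.
        String uri = element.getNamespaceURI();
        if (uri != null) {
            String prefix = element.getPrefix() == null ? "" : element.getPrefix();
            if (!uri.equals(scope.get(prefix))) {
                String qname = prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix;
                element.setAttributeNS(XMLNS_URI, qname, uri);
                scope.put(prefix, uri);
            }
        }

        // Recurse into child elements only.
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                declareNamespaces((Element) c, scope);
            }
        }
    }
}
```

In this sketch a nested element that reuses a prefix already declared by an ancestor gets no declaration of its own, which keeps the number of added attributes small.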
Interface Text3

This interface extends the Text interface with a new attribute that allows one to find out whether a Text node only contains whitespace in element content.


IDL Definition
interface Text3 : Text {
  readonly attribute boolean          isWhitespaceInElementContent;
};

Attributes
isWhitespaceInElementContent of type boolean, readonly
Returns whether this text node contains whitespace in element content, often abusively called "ignorable whitespace".

Note: An implementation can only return true if, one way or another, it has access to the relevant information (e.g., the DTD or schema).

1.2. DOM Level 3 Java Binding Extension

Because the DOM only defines interfaces, applications have to rely on some "proprietary" API to start from. Typically, a Java application starts with a line of code such as:

    DOMImplementation impl = org.apache.xerces.dom.DOMImplementationImpl.getDOMImplementation();
   

Since there is no language independent way of "bootstrapping" a DOM implementation, this section describes a solution for Java. Hopefully, similar solutions could be defined for other language bindings.

The following defines a Java class called DOMImplementationFactory, which provides a static method to get a reference to a DOMImplementation. The actual class to be returned by getDOMImplementation can be specified by the application or the implementation, depending on the context, through the Java system property "org.w3c.dom.DOMImplementation". In addition, an implementation can provide its own version of this class, with the same class name and package name, in order to set the default name to the desired value.

    package org.w3c.dom;

    public abstract class DOMImplementationFactory {

        // The system property to specify the DOMImplementation class name.
        private static String property = "org.w3c.dom.DOMImplementation";

        // The default DOMImplementation class name to use.
        private static String defaultImpl = "NO DEFAULT IMPLEMENTATION SET";

        /**
         * Returns a DOMImplementation
         **/
        public static DOMImplementation getDOMImplementation()
            throws ClassNotFoundException, InstantiationException,
                   IllegalAccessException, ClassCastException
        {
            // Retrieve the system property
            String impl;
            try {
                impl = System.getProperty(property, defaultImpl);
            } catch (SecurityException e) {
                // fall back on the default implementation in case of a security problem
                impl = defaultImpl;
            }

            // Attempt to load, instantiate and return the implementation class
            return (DOMImplementation) Class.forName(impl).newInstance();
        }
    }
   

With this, the first line of an application typically becomes something like:

    DOMImplementation impl = DOMImplementationFactory.getDOMImplementation();
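The lookup pattern behind this factory (system property first, default class name second, then reflective instantiation) can be tried out in isolation with the generic sketch below. The class ImplementationLoader, its load method and the property name example.list.impl are hypothetical, and java.util.List stands in for DOMImplementation so that the example runs without a DOM implementation class to instantiate. It uses getDeclaredConstructor().newInstance() because Class.newInstance() is deprecated in recent Java versions.

```java
class ImplementationLoader {

    // Resolves a class name from a system property (falling back to a
    // default), instantiates it reflectively and checks its type.
    static <T> T load(String property, String defaultClassName, Class<T> type) {
        String className;
        try {
            className = System.getProperty(property, defaultClassName);
        } catch (SecurityException e) {
            // Reading the property was not permitted: use the default.
            className = defaultClassName;
        }
        try {
            Object instance = Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            return type.cast(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // Without the property set, the default class is used.
        Object byDefault = load("example.list.impl", "java.util.ArrayList", java.util.List.class);
        System.out.println(byDefault.getClass().getName());

        // With the property set, the named class is used instead.
        System.setProperty("example.list.impl", "java.util.LinkedList");
        Object byProperty = load("example.list.impl", "java.util.ArrayList", java.util.List.class);
        System.out.println(byProperty.getClass().getName());
    }
}
```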
   
Issue Level-3-Java-Bootstrap-1:
Should this provide for handling more than one implementation at a time?
Issue Level-3-Java-Bootstrap-2:
Should this be even simpler and force the implementation to provide this class (and not necessarily rely on any system property)?