Abstract
The Document Object Model (DOM) level one provides a mechanism for software developers and web script authors to access and manipulate parsed HTML and XML content. All markup as well as any document type declarations are made available. Level one also allows creation "from scratch" of entire web documents in memory; saving those documents persistently is left to the programmer. DOM Level one is intentionally limited in scope to content representation and manipulation; rendering, validation, externalization etc. are deferred to higher levels of the DOM.
This document is part of the Document Object Model Specification
Node
 |
 +--Document
 |
 +--Element
 |
 +--Attribute
 |
 +--Text
 |
 +--Comment
 |
 +--PI
 |
 +--Reference
 |   |
 |   +--NamedCharacterReference
 |
 +--NumericCharacterReference
The long datatype represents 32-bit signed integers. In
other language bindings, e.g. Java, this would be mapped to the Java
int datatype.
For more information on OMG's IDL, please visit the OMG home page, or download the CORBA 2.0 (minor version number is expected to change in the near future) specification (it's rather large) which contains the IDL language definition in chapter 3.
Note: The Object Management Group Interface Definition Language (OMG IDL) was chosen as it was designed for specifying language and implementation-neutral interfaces. Various other IDLs could be used; the use of OMG IDL does not imply a requirement to use a specific object binding runtime.
getParentNode() method will return
null.
Once the application has access to the root of the document object hierarchy, it can use the methods defined herein for accessing individual nodes, selection of specific node types such as all images, and so on.
Node documentType
null.
Element documentElement
"HTML"; for XML this is the outermost element, i.e. the
element non-terminal in production [23] in
Section 2.9 of the
XML-lang specification.
For documents which were not retrieved via HTTP, or for those which were created directly in memory, there may be no DocumentContext.
NOTE: The DocumentContext interface described here is expected to be significantly expanded in the level two specification of the Document Object Model.
Document document
NodeType getNodeType()
enum, and it is expected that most
language bindings will represent this runtime-queryable Node type using
an integral data type. The names of the node type enumeration literals
are straightforwardly derived from the names of the actual Node subtypes,
and are fully specified in the IDL definition of Node
in Appendix A.
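As an illustration of how an application might use an integral node type, the following Java sketch dispatches on the value returned by getNodeType(). It is not part of the binding: the class NodeTypeSketch and its describe method are hypothetical, and only the three constant values are copied from the Java definition of Node given later in this document.

```java
// Hypothetical helper; only the constant values come from the Java binding of Node.
class NodeTypeSketch {
    static final int ELEMENT = 3;
    static final int COMMENT = 6;
    static final int TEXT = 10;

    // Maps a node type value, as returned by getNodeType(), to a short description.
    static String describe(int nodeType) {
        switch (nodeType) {
            case ELEMENT: return "element";
            case COMMENT: return "comment";
            case TEXT:    return "text";
            default:      return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(ELEMENT));
        System.out.println(describe(2)); // DOCUMENT is not handled above, so this falls through to "other"
    }
}
```

Because the type is a plain integer in such a binding, a switch statement avoids a chain of runtime type tests.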
Node getParentNode()
null is returned. [Note:
because in ECMAScript get/set method pairs are surfaced as properties, Parent
would conflict with the pre-defined Parent property, so we
disambiguate this with "ParentNode" even though it is inconsistent
with the naming convention of the other methods that do not include
"Node"].
Node getFirstChild()
null
is returned.
NodeList getChildren()
null is returned. The content of the
returned NodeList is "live" in the sense that changes to the children of
the Node object that it was created
from will be immediately reflected in the set of Nodes the NodeList contains;
it is not a static snapshot of the content of the Node. Similarly, changes
made to the NodeList will be immediately reflected in the set of children of
the Node that the NodeList was created from.
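The "live" behaviour described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the binding from Appendix A: SketchNode and LiveChildList are hypothetical stand-ins, and the list is modelled simply as a view that reads and writes the node's own child storage.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DOM node; not the Node interface of this specification.
class SketchNode {
    final String label;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String label) { this.label = label; }

    // Returns a view over the children, not a copy.
    LiveChildList getChildren() { return new LiveChildList(this); }

    void insertChild(int index, SketchNode child) { children.add(index, child); }
}

// A "live" list: every call reads or writes the owner's child storage directly.
class LiveChildList {
    private final SketchNode owner;

    LiveChildList(SketchNode owner) { this.owner = owner; }

    int getLength() { return owner.children.size(); }

    SketchNode item(int index) { return owner.children.get(index); }

    void append(SketchNode node) { owner.children.add(node); }
}

class LiveListDemo {
    public static void main(String[] args) {
        SketchNode parent = new SketchNode("parent");
        LiveChildList list = parent.getChildren();   // obtained while the node has no children

        parent.insertChild(0, new SketchNode("a"));  // change made through the node
        System.out.println(list.getLength());        // the list obtained earlier reflects it

        list.append(new SketchNode("b"));            // change made through the list
        System.out.println(parent.children.size());  // the node reflects it
    }
}
```

A snapshot implementation would instead copy the children when getChildren() is called, and neither change would be visible on the other side.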
boolean hasChildren()
true if the node has any children, false
if the node has no children at all. This method exists both for convenience and
to let implementations avoid the object allocation that
getChildren() may require.
Node getPreviousSibling()
null is returned.
Node getNextSibling()
null is returned.
Node insertChild(in unsigned long index, in Node newChild)
Node replaceChild(in unsigned long index, in Node newChild)
NoSuchNodeException is thrown.
Node removeChild(in unsigned long index)
NoSuchNodeException is thrown.
NodeEnumerator getElementsByTagName(wstring name)
tagName matches the
given name. The iteration order is a depth first enumeration of the
elements as they occurred in the original document.
Note: a later level of the DOM will provide a more generalized querying mechanism for Nodes. One such query involves obtaining all the Elements in a subtree with a given tagName. A convenience method for this query has been included in the core document. This method might be removed at a later date in favor of a more comprehensive querying mechanism.
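The depth-first ordering of such a query can be sketched in Java. The sketch below is illustrative only: SketchElement and TagNameSearch are hypothetical names, the result is returned as a plain list rather than a NodeEnumerator, and it assumes that only descendants of the starting node (not the node itself) are candidates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type used only for this sketch.
class SketchElement {
    final String tagName;
    final List<SketchElement> children = new ArrayList<>();

    SketchElement(String tagName, SketchElement... kids) {
        this.tagName = tagName;
        for (SketchElement k : kids) children.add(k);
    }
}

class TagNameSearch {
    // Collects the descendants of root whose tagName equals name,
    // in depth-first order, i.e. the order in which they occur in the document.
    static List<SketchElement> elementsByTagName(SketchElement root, String name) {
        List<SketchElement> found = new ArrayList<>();
        collect(root, name, found);
        return found;
    }

    private static void collect(SketchElement node, String name, List<SketchElement> out) {
        for (SketchElement child : node.children) {
            if (child.tagName.equals(name)) out.add(child); // record the child first
            collect(child, name, out);                      // then descend into it
        }
    }

    public static void main(String[] args) {
        SketchElement root = new SketchElement("body",
                new SketchElement("div", new SketchElement("p")),
                new SketchElement("p"));
        System.out.println(elementsByTagName(root, "p").size());
    }
}
```

The comparison is case-sensitive here, in line with the case-preserving behaviour of the DOM; an HTML application may prefer a case-insensitive match.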
The items in the NodeList are accessible via an integral index, starting from 0. A NodeEnumerator object may be created to allow simple sequential traversal over the members of the list.
NodeEnumerator getEnumerator()
Node item(in unsigned long index) raises(NoSuchNodeException)
NoSuchNodeException is thrown.
Node replace(in unsigned long index, in Node replacedNode)
raises (NoSuchNodeException)
null is returned if the index is equal to the previous number of
nodes in the list). If index is
greater than the number of nodes in the list, a
NoSuchNodeException is thrown.
void append(in Node newNode)
insert(self.getLength(), newNode).
void prepend(in Node newNode)
insert(0, newNode).
void insert(in unsigned long index, in Node newNode)
raises (NoSuchNodeException)
self.getLength(),
the node is added at the end of the list.
Node remove(in unsigned long index)
raises (NoSuchNodeException)
NoSuchNodeException is
thrown.
unsigned long getLength()
getLength()-1 inclusive.
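The index conventions of the list operations can be summarized with a small Java sketch. It is not the NodeList binding: SketchNodeList and SketchNoSuchNodeException are hypothetical names, plain Objects stand in for nodes, and the rejection of an insert index greater than the length is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checked exception playing the role of NoSuchNodeException.
class SketchNoSuchNodeException extends Exception {}

// Hypothetical list of nodes; plain Objects stand in for Node instances.
class SketchNodeList {
    private final List<Object> nodes = new ArrayList<>();

    int getLength() { return nodes.size(); }

    // Valid indices run from 0 to getLength()-1 inclusive.
    Object item(int index) throws SketchNoSuchNodeException {
        if (index < 0 || index >= nodes.size()) throw new SketchNoSuchNodeException();
        return nodes.get(index);
    }

    // An index equal to getLength() adds at the end; a larger one is rejected (assumption).
    void insert(int index, Object node) throws SketchNoSuchNodeException {
        if (index < 0 || index > nodes.size()) throw new SketchNoSuchNodeException();
        nodes.add(index, node);
    }

    void append(Object node) {
        nodes.add(nodes.size(), node);  // same effect as insert(getLength(), node)
    }

    void prepend(Object node) {
        nodes.add(0, node);             // same effect as insert(0, node)
    }

    // Removes and returns the node at index.
    Object remove(int index) throws SketchNoSuchNodeException {
        if (index < 0 || index >= nodes.size()) throw new SketchNoSuchNodeException();
        return nodes.remove(index);
    }
}

class SketchNodeListDemo {
    public static void main(String[] args) throws SketchNoSuchNodeException {
        SketchNodeList list = new SketchNodeList();
        list.append("b");
        list.prepend("a");
        list.insert(list.getLength(), "c");
        System.out.println(list.item(0) + " " + list.item(1) + " " + list.item(2));
    }
}
```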
Typical usage (in some C++-like language) might look like:
NodeEnumerator nodeEnum = document.getChildren().getEnumerator();
for (Node node = nodeEnum.getFirst(); node != null; node = nodeEnum.getNext()) {
    // ... do some computation on that node
}
Node getFirst()
null is returned.
NOTE: in some implementations this may not be a fast operation; the enumeration may find the requested node on demand, and for very large document objects this may take some time.
Node getNext()
null after the last node in the list has been passed, and
leaves the current pointer at the last node.
Node getPrevious()
null after the first node in the enumeration
has been returned, and leaves the current pointer at the first node.
Node getLast()
null. Doing a getNext()
immediately after this operation will return null.
Node getCurrent()
null if the enumeration
is empty.
boolean atStart()
getCurrent() will return the same
node as getFirst() would return. For empty enumerations,
true is always returned. Does not affect the state of the enumeration in any
way.
boolean atEnd()
getCurrent() will return the same
node as getLast() would return. For empty enumerations,
true is always returned.
Does not affect the state of the enumeration in any way.
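The traversal rules above can be collected into one small Java sketch. It is an illustration rather than the NodeEnumerator binding: SketchEnumerator is a hypothetical generic class over an in-memory list, and it assumes that a newly created enumerator is positioned at the first node.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical enumerator over an in-memory list.
class SketchEnumerator<T> {
    private final List<T> items;
    private int current = 0; // index of the current node; only meaningful when items is non-empty

    SketchEnumerator(List<T> items) { this.items = items; }

    T getFirst() {
        if (items.isEmpty()) return null;
        current = 0;
        return items.get(current);
    }

    T getLast() {
        if (items.isEmpty()) return null;
        current = items.size() - 1;
        return items.get(current);
    }

    // Returns null once the last node has been passed; the pointer stays on the last node.
    T getNext() {
        if (items.isEmpty() || current >= items.size() - 1) return null;
        return items.get(++current);
    }

    // Returns null once the first node has been passed; the pointer stays on the first node.
    T getPrevious() {
        if (items.isEmpty() || current <= 0) return null;
        return items.get(--current);
    }

    T getCurrent() { return items.isEmpty() ? null : items.get(current); }

    // Pure predicates: neither changes the state of the enumeration.
    boolean atStart() { return items.isEmpty() || current == 0; }
    boolean atEnd()   { return items.isEmpty() || current == items.size() - 1; }
}

class SketchEnumeratorDemo {
    public static void main(String[] args) {
        SketchEnumerator<String> e = new SketchEnumerator<>(Arrays.asList("a", "b", "c"));
        for (String s = e.getFirst(); s != null; s = e.getNext()) {
            System.out.println(s);
        }
        System.out.println(e.atEnd()); // the pointer was left on the last node
    }
}
```

Since atStart() and atEnd() do not move the pointer, they can be used as loop predicates where calling getNext() or getPrevious() would disturb the traversal.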
Node getNode(in wstring name)
Node setNode(in wstring name, in Node node)
null is returned, and the named node is added to the end
of the NamedNodeList
object; that is, it is accessible via the item method using the
index one less than the value returned by getLength().
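The relationship between named access and indexed access can be sketched in Java as follows. This is not the NamedNodeList binding: SketchNamedList is a hypothetical class, plain Objects stand in for nodes, and the sketch assumes that replacing an existing name returns the previous node and keeps its position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical named list kept as two parallel lists so that index order is explicit.
class SketchNamedList {
    private final List<String> names = new ArrayList<>();
    private final List<Object> nodes = new ArrayList<>();

    // Returns the node stored under name, or null if there is none (assumption).
    Object getNode(String name) {
        int i = names.indexOf(name);
        return i < 0 ? null : nodes.get(i);
    }

    // Replaces an existing entry and returns the old node (assumption),
    // or appends a new entry at the end and returns null.
    Object setNode(String name, Object node) {
        int i = names.indexOf(name);
        if (i >= 0) {
            return nodes.set(i, node);
        }
        names.add(name);
        nodes.add(node);
        return null;
    }

    Object item(int index) { return nodes.get(index); }

    int getLength() { return nodes.size(); }
}

class SketchNamedListDemo {
    public static void main(String[] args) {
        SketchNamedList attrs = new SketchNamedList();
        attrs.setNode("id", "demo");
        attrs.setNode("class", "note");
        // A newly added node is reachable at index getLength()-1.
        System.out.println(attrs.item(attrs.getLength() - 1));
    }
}
```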
Node remove(in wstring name) raises (NoSuchNodeException)
NoSuchNodeException is
thrown.
Node item(in unsigned long index) raises(NoSuchNodeException)
NoSuchNodeException is thrown.
unsigned long getLength()
NodeEnumerator getEnumerator()
For example (in XML):
<elementExample id="demo">
  <subelement1/>
  <subelement2>
    <subsubelement/>
  </subelement2>
</elementExample>
When represented using DOM, the top node would be "elementExample", which
contains two child Element nodes (and some white space), one for "subelement1"
and one for "subelement2". "subelement1" contains no
child nodes of its own.
wstring tagName
<elementExample id="demo">
...
</elementExample>
This would have the value "elementExample". Note that this is
case-preserving, as are all of the operations of the DOM. See Name case in the DOM for a description of why the DOM
preserves case.
Note: This attribute's name may change in the near future to "elementType", which is a more technically correct term.
NamedNodeList attributes
elementExample
example above, the attributes list would consist of the id
attribute, as well as any attributes which were defined by the document type definition for this
element which have default values.
void setAttribute(in Attribute newAttr)
wstring name
NodeList value
toString() method will return a zero length string (as will
toString() invoked directly on this Attribute instance).
If the attribute has no effective value, then this method will return
null. Note the toString() method on the
Attribute instance can also be used to retrieve the string version of the
attribute's value(s).
Even seemingly simple string-valued attributes will be represented as a set of more than one Node if the value of the attribute includes things like entity references. For example:
<p class="foo&amp;bar">
The nodes would be a Text containing "foo", then a NamedCharacterReference,
and finally a Text instance containing "bar".
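How a multi-node attribute value might be flattened into a string can be sketched in Java. The types below are hypothetical stand-ins, not the Attribute, Text or NamedCharacterReference interfaces, and the sketch assumes that the reference in the middle of the value is the named reference amp, whose replacement text is "&".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical common type for the pieces of an attribute value.
interface SketchPiece {
    String text();
}

// Stand-in for a Text node.
class SketchText implements SketchPiece {
    private final String data;
    SketchText(String data) { this.data = data; }
    public String text() { return data; }
}

// Stand-in for a NamedCharacterReference; contributes its replacement text.
class SketchNamedRef implements SketchPiece {
    final String name;
    private final String replacement;
    SketchNamedRef(String name, String replacement) {
        this.name = name;
        this.replacement = replacement;
    }
    public String text() { return replacement; }
}

// Stand-in for an Attribute whose value is a list of pieces.
class SketchAttribute {
    final String name;
    final List<SketchPiece> value;

    SketchAttribute(String name, SketchPiece... value) {
        this.name = name;
        this.value = Arrays.asList(value);
    }

    // Concatenates the pieces; an empty value yields a zero length string.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (SketchPiece p : value) sb.append(p.text());
        return sb.toString();
    }
}

class SketchAttributeDemo {
    public static void main(String[] args) {
        SketchAttribute cls = new SketchAttribute("class",
                new SketchText("foo"), new SketchNamedRef("amp", "&"), new SketchText("bar"));
        System.out.println(cls.value.size() + " nodes: " + cls);
    }
}
```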
boolean specified
wstring toString()
<!--' and ending '-->'. Note that this is the
definition of a comment in XML, and, in practice, HTML, although some HTML
tools may implement the full SGML comment structure.
wstring data
wstring name
null.
wstring data
<? (after the name in XML) to the
character immediately preceding the ?>.
wstring name
&lt; or
&amp;.
wstring getReplacementText()
&lt;,
the returned string would be "<".
&#60; or &#x48;. Applications may retrieve
both the actual Character object corresponding to this numeric value and
the actual digits (and any associated radix-indicating prefix) of
the original reference.
wchar character
wstring original
&#60; this attribute would be "60" (sans
the '&#' and ';'), and for &#x48;, this attribute would
have "x48" for its value.
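The relationship between the original digits and the character they denote can be sketched in Java. NumericRefSketch and characterFor are hypothetical names; the sketch assumes a lowercase "x" prefix for hexadecimal references and a value that fits in a single Java char.

```java
// Hypothetical helper; not part of the NumericCharacterReference binding.
class NumericRefSketch {
    // original is the reference without the leading "&#" and the trailing ";",
    // for example "60" (decimal) or "x48" (hexadecimal).
    static char characterFor(String original) {
        int code = original.startsWith("x")
                ? Integer.parseInt(original.substring(1), 16)
                : Integer.parseInt(original, 10);
        return (char) code; // assumes the value fits in one Java char
    }

    public static void main(String[] args) {
        System.out.println(characterFor("60"));
        System.out.println(characterFor("x48"));
    }
}
```

Keeping the original string alongside the character lets an application write the reference back out in the radix the author used.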
wstring data
boolean isIgnorableWhitespace
Application developers using the DOM for HTML would be wise to use
case-insensitive comparisons when testing for equality.
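In Java, for instance, the distinction might look like the sketch below. TagNameCompare and its two methods are hypothetical helpers; equalsIgnoreCase and equals are the standard String methods.

```java
// Hypothetical helper contrasting HTML and XML name comparison.
class TagNameCompare {
    // HTML element names are not case-sensitive, so compare without regard to case.
    static boolean sameHtmlTag(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    // XML names are case-sensitive, so an exact comparison is appropriate.
    static boolean sameXmlName(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameHtmlTag("IMG", "img"));
        System.out.println(sameXmlName("elementExample", "elementexample"));
    }
}
```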
Appendix A: IDL Interface definitions
Shown below are the core IDL definitions for the objects in the Document Object
Model. The HTML IDL definition is here,
and the XML IDL definition, including the types to represent the document type definition is here.
// $Date: 1997/10/07 19:37:20 $
module DOM {
  // Basic grove object types
  interface Node;
  interface Document;
  // Objects related to the instance
  interface Element;
  typedef sequence<wstring> StringList;
  exception NoSuchNodeException {};
  //////////////////////////////////////////////////////////////////////////
  //                                                                      //
  //                    OBJECTS USED TO DEFINE A GROVE                    //
  //                                                                      //
  //////////////////////////////////////////////////////////////////////////
  // Enumerator class for a node list
  interface NodeEnumerator {
    Node getFirst();
    Node getNext();
    Node getPrevious();
    Node getLast();
    Node getCurrent();
    // not sure about these...
    // the rationale for their existence is that the enumerator may be used
    // internally to a method, which may return some interesting value, and
    // therefore cannot also indicate whether the start or end of enumeration
    // was reached. Any of the traversal methods affects the state, and
    // so are not suitable for usage as predicates (unless possible state
    // manipulation is acceptable).
    boolean atStart();
    boolean atEnd();
  };
  // Define the type for a sequence of nodes
  interface NodeList {
    NodeEnumerator getEnumerator();
    Node item(in unsigned long index)
      raises (NoSuchNodeException);
    void replace(in unsigned long index, in Node replacedNode)
      raises (NoSuchNodeException);
    void append(in Node newNode);
    void prepend(in Node newNode);
    void insert(in unsigned long index, in Node newNode)
      raises (NoSuchNodeException);
    Node remove(in unsigned long index)
      raises (NoSuchNodeException);
    // This may be expensive to compute
    unsigned long getLength();
  };
  // Interface to a node in a grove
  interface Node {
    enum NodeType {
      NODE,
      DOCUMENT,
      ELEMENT,
      ATTRIBUTE,
      PI,
      COMMENT,
      REFERENCE,
      NAMED_CHARACTER_REFERENCE,
      NUMERIC_CHARACTER_REFERENCE,
      TEXT
    };
    NodeType getNodeType();
    // Simple traversal interface
    Node getParentNode();
    NodeList getChildren();
    boolean hasChildren();
    Node getPreviousSibling();
    Node getNextSibling();
    void insertChild(in unsigned long index, in Node newChild);
    Node replaceChild(in unsigned long index, in Node newChild)
      raises (NoSuchNodeException);
    Node removeChild(in unsigned long index)
      raises (NoSuchNodeException);
    NodeEnumerator getElementsByTagName(in wstring name);
  };
  // Named node list
  interface NamedNodeList {
    // Core get and set interface. Note that implementations may
    // build the list lazily
    Node getNode(in wstring name);
    Node setNode(in wstring name, in Node node);
    Node remove(in wstring name) raises (NoSuchNodeException);
    Node item(in unsigned long index)
      raises (NoSuchNodeException);
    unsigned long getLength();
    NodeEnumerator getEnumerator();
  };
  // Placeholders
  typedef wstring Date;
  typedef wstring Location;
  //////////////////////////////////////////////////////////////////////////
  //                                                                      //
  //                   OBJECTS RELATED TO THE INSTANCE                    //
  //                                                                      //
  //////////////////////////////////////////////////////////////////////////
  interface DocumentContext {
    attribute Document document;
  };
  interface Document : Node {
    attribute Node documentType;
    attribute Element documentElement;
  };
  interface Attribute : Node {
    attribute wstring name;
    attribute NodeList value;
    attribute boolean specified;
    // provides a connection to the DTD
    // attribute Node definition;
    wstring toString();
  };
  interface PI : Node {
    attribute wstring name;
    attribute wstring data;
  };
  interface Element : Node {
    attribute wstring tagName;
    attribute NamedNodeList attributes;
    void setAttribute(in Attribute newAttr);
  };
  // Represents the content of
  interface Comment : Node {
    attribute wstring data;
  };
  // base type for named entities, including parameter entities, but not
  // numeric entities
  interface Reference : Node {
    attribute wstring name;
    // Entity will be defined in a later draft of the level one specification
    //attribute Entity definition;
  };
  interface NamedCharacterReference : Reference {
    wstring getReplacementText();
  };
  interface NumericCharacterReference : Node {
    attribute wchar character;
    // The "60" part of &#60; or "x48" for &#x48;
    attribute wstring original;
  };
  interface Text : Node {
    attribute wstring data;
    attribute boolean isIgnorableWhitespace;
  };
};
// $Id: Node.java,v 1.13 1997/10/07 19:37:55 sbb Exp $
package w3c.dom;
public interface Node {
    // Node type enumeration; these represent the set of
    // values returned by the getNodeType() method.
    static final int NODE = 1;
    static final int DOCUMENT = 2;
    static final int ELEMENT = 3;
    static final int ATTRIBUTE = 4;
    static final int PI = 5;
    static final int COMMENT = 6;
    static final int REFERENCE = 7;
    static final int NAMED_CHARACTER_REFERENCE = 8;
    static final int NUMERIC_CHARACTER_REFERENCE = 9;
    static final int TEXT = 10;
    int getNodeType();
    // can return null
    Node getParentNode();
    NodeList getChildren();
    boolean hasChildren();
    Node getPreviousSibling();
    Node getNextSibling();
    void insertChild(int index, Node newChild);
    Node replaceChild(int index, Node newChild)
        throws NoSuchNodeException;
    Node removeChild(int index)
        throws NoSuchNodeException;
    NodeEnumerator getElementsByTagName(String name);
}
// $Id: NodeList.java,v 1.2 1997/10/07 19:37:56 sbb Exp $
package w3c.dom;
public interface NodeList {
    NodeEnumerator getEnumerator();
    Node item(int index)
        throws NoSuchNodeException;
    void replace(int index, Node replacedNode)
        throws NoSuchNodeException;
    void append(Node newNode);
    void prepend(Node newNode);
    void insert(int index, Node newNode)
        throws NoSuchNodeException;
    Node remove(int index)
        throws NoSuchNodeException;
    // This may be expensive to compute
    int getLength();
}
// $Id: NamedNodeList.java,v 1.5 1997/10/07 19:37:54 sbb Exp $
package w3c.dom;
public interface NamedNodeList {
    // Core get and set interface. Note that implementations may
    // build the list lazily
    Node getNode(String name);
    Node setNode(String name, Node node);
    Node remove(String name)
        throws NoSuchNodeException;
    Node item(int index)
        throws NoSuchNodeException;
    int getLength();
    NodeEnumerator getEnumerator();
}
// $Id: NodeEnumerator.java,v 1.5 1997/10/07 19:37:55 sbb Exp $
package w3c.dom;
public interface NodeEnumerator {
    Node getFirst();
    Node getNext();
    Node getPrevious();
    Node getLast();
    Node getCurrent();
    // not sure about these...
    // the rationale for their existence is that the enumerator may be used
    // internally to a method, which may return some interesting value, and
    // therefore cannot also indicate whether the start or end of enumeration
    // was reached. Any of the traversal methods affects the state, and
    // so are not suitable for usage as predicates (unless possible state
    // manipulation is acceptable).
    boolean atStart();
    boolean atEnd();
}
// $Id: DocumentContext.java,v 1.3 1997/10/07 19:37:51 sbb Exp $
package w3c.dom;
public interface DocumentContext {
    Document getDocument();
    void setDocument(Document document);
}
// $Id: Document.java,v 1.4 1997/10/07 19:37:51 sbb Exp $
package w3c.dom;
public interface Document extends Node {
    Node getDocumentType();
    void setDocumentType(Node documentType);
    Element getDocumentElement();
    void setDocumentElement(Element documentElement);
}
// $Id: Attribute.java,v 1.4 1997/10/07 19:37:50 sbb Exp $
package w3c.dom;
public interface Attribute extends Node {
    String getName();
    void setName(String name);
    NodeList getValue();
    void setValue(NodeList value);
    boolean getSpecified();
    void setSpecified(boolean specified);
    String toString();
}
// $Id: PI.java,v 1.4 1997/10/07 19:37:57 sbb Exp $
package w3c.dom;
public interface PI extends Node {
    String getName();
    void setName(String name);
    String getData();
    void setData(String data);
}
// $Id: Element.java,v 1.4 1997/10/07 19:37:52 sbb Exp $
package w3c.dom;
public interface Element extends Node {
    String getTagName();
    void setTagName(String tagName);
    NamedNodeList getAttributes();
    void setAttributes(NamedNodeList attributes);
    void setAttribute(Attribute newAttr);
}
// $Id: Comment.java,v 1.3 1997/10/07 19:37:50 sbb Exp $
package w3c.dom;
// Represents the content of
public interface Comment extends Node {
    String getData();
    void setData(String data);
}
// $Id: Reference.java,v 1.3 1997/10/07 19:37:57 sbb Exp $
package w3c.dom;
// base type for named entities, including parameter entities, but not
// numeric entities
public interface Reference extends Node {
    String getName();
    void setName(String name);
    // not until the DTD exists
    //attribute Entity definition;
}
// $Id: NamedCharacterReference.java,v 1.1 1997/10/07 19:37:54 sbb Exp $
package w3c.dom;
// For Named character references
public interface NamedCharacterReference extends Reference {
    String getReplacementText();
}
// $Id: NumericCharacterReference.java,v 1.3 1997/10/07 19:37:56 sbb Exp $
package w3c.dom;
public interface NumericCharacterReference extends Node {
    char getCharacter();
    void setCharacter(char character);
    // The "60" part of &#60; or "x48" for &#x48;
    String getOriginal();
    void setOriginal(String original);
}
// $Id: Text.java,v 1.3 1997/10/07 19:37:58 sbb Exp $
package w3c.dom;
public interface Text extends Node {
    String getData();
    void setData(String data);
    boolean getIsIgnorableWhitespace();
    void setIsIgnorableWhitespace(boolean isIgnorableWhitespace);
}
// $Id: NoSuchNodeException.java,v 1.3 1997/09/26 17:57:36 sbb Exp $
package w3c.dom;
public class NoSuchNodeException extends Exception {
    // Adds nothing over the base Exception class.
}