The Document Object Model (DOM) is a representation — a model — of a document and its content. [DOM3CORE] The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM.
This specification defines the language represented in the DOM
by features together called DOM5 HTML. DOM5 HTML consists of DOM
Core Document
nodes and DOM Core Element
nodes, along with text nodes and other content.
Elements in the DOM represent things; that is, they have intrinsic meaning, also known as semantics.
For example, an ol
element represents an ordered list.
In addition, documents and elements in the DOM host APIs that extend the DOM Core APIs, providing new features to application developers using DOM5 HTML.
Every XML and HTML document in an HTML UA is represented by a
Document
object. [DOM3CORE]
Document
objects are assumed to be XML documents unless they are flagged as
being HTML documents when they are created.
Whether a document is an HTML document or an XML document affects the behavior of
certain APIs, as well as a few CSS rendering rules. [CSS21]
A Document
object created by the
createDocument()
API on the
DOMImplementation
object is initially an XML document, but can
be made into an HTML
document by calling document.open()
on it.
All Document
objects (in user agents implementing
this specification) must also implement the HTMLDocument
interface, available using
binding-specific methods. (This is the case whether or not the
document in question is an HTML document or indeed whether it contains
any HTML elements at all.)
Document
objects must also implement the
document-level interface of any other namespaces found in the
document that the UA supports. For example, if an HTML
implementation also supports SVG, then the Document
object must implement HTMLDocument
and
SVGDocument
.
Because the HTMLDocument
interface is now obtained
using binding-specific casting methods instead of simply being the
primary interface of the document object, it is no longer defined
as inheriting from Document
.
interface HTMLDocument {
  // Resource metadata management
  [PutForwards=href] readonly attribute Location location;
  readonly attribute DOMString URL;
           attribute DOMString domain;
  readonly attribute DOMString referrer;
           attribute DOMString cookie;
  readonly attribute DOMString lastModified;
  readonly attribute DOMString compatMode;
           attribute DOMString charset;
  readonly attribute DOMString characterSet;
  readonly attribute DOMString defaultCharset;
  readonly attribute DOMString readyState;

  // DOM tree accessors
           attribute DOMString title;
           attribute DOMString dir;
           attribute HTMLElement body;
  readonly attribute HTMLCollection images;
  readonly attribute HTMLCollection embeds;
  readonly attribute HTMLCollection plugins;
  readonly attribute HTMLCollection links;
  readonly attribute HTMLCollection forms;
  readonly attribute HTMLCollection anchors;
  readonly attribute HTMLCollection scripts;
  NodeList getElementsByName(in DOMString elementName);
  NodeList getElementsByClassName(in DOMString classNames);

  // Dynamic markup insertion
           attribute DOMString innerHTML;
  HTMLDocument open();
  HTMLDocument open(in DOMString type);
  HTMLDocument open(in DOMString type, in DOMString replace);
  Window open(in DOMString url, in DOMString name, in DOMString features);
  Window open(in DOMString url, in DOMString name, in DOMString features, in boolean replace);
  void close();
  void write(in DOMString text);
  void writeln(in DOMString text);

  // Interaction
  readonly attribute Element activeElement;
  boolean hasFocus();

  // Commands
  readonly attribute HTMLCollection commands;

  // Editing
           attribute boolean designMode;
  boolean execCommand(in DOMString commandId);
  boolean execCommand(in DOMString commandId, in boolean showUI);
  boolean execCommand(in DOMString commandId, in boolean showUI, in DOMString value);
  boolean queryCommandEnabled(in DOMString commandId);
  boolean queryCommandIndeterm(in DOMString commandId);
  boolean queryCommandState(in DOMString commandId);
  boolean queryCommandSupported(in DOMString commandId);
  DOMString queryCommandValue(in DOMString commandId);
  Selection getSelection();
};
Since the HTMLDocument
interface holds methods and attributes related to a number of
disparate features, the members of this interface are described in
various different sections.
User agents must raise a security
exception whenever any of the members of an HTMLDocument
object are accessed by
scripts whose effective
script origin is not the same as the Document's
effective script origin.
The URL
attribute must return
the document's address.
The domain attribute must be initialized to the document's domain, if it has one, and null otherwise. On getting, the attribute must return its current value. On setting, if the new value is an allowed value (as defined below), the attribute's value must be changed to the new value. If the new value is not an allowed value, then a security exception must be raised instead.
A new value is an allowed value for the document.domain attribute if it is equal to the attribute's current value, or if the new value, prefixed by a U+002E FULL STOP ("."), exactly matches the end of the current value. If the current value is null, new values other than null will never be allowed.
If the Document object's address is hierarchical and uses a server-based naming authority, then its domain is the <host>/<ihost> part of that address. Otherwise, it has no domain.
The domain attribute is used to enable pages on different hosts of a domain to access each other's DOMs, though this is not yet defined by this specification. (Open issue: we should handle IP addresses here.)
The referrer attribute must return either the URI of the active document of the source browsing context at the time the navigation was started (that is, the page which navigated the browsing context to the current document), or the empty string if there is no such originating page, or if the UA has been configured not to report referrers in this case, or if the navigation was initiated for a hyperlink with a noreferrer keyword.
In the case of HTTP, the referrer
DOM
attribute will match the Referer
(sic) header
that was sent when fetching the current page.
Typically user agents are
configured to not report referrers in the case where the referrer
uses an encrypted protocol and the current page does not (e.g. when
navigating from an https:
page to
an http:
page).
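The allowed-value check for the domain attribute, described earlier in this section, can be illustrated with a short non-normative sketch. The function name is_allowed_domain is hypothetical, and Python's None stands in for the DOM null value.

```python
from typing import Optional


def is_allowed_domain(current: Optional[str], new: Optional[str]) -> bool:
    """Return True if `new` is an allowed value given the current value."""
    if current is None:
        # With a null current value, nothing other than null is ever allowed.
        return new is None
    if new is None:
        return False
    # Allowed if identical to the current value, or if "." + new exactly
    # matches the end of the current value.
    return new == current or current.endswith("." + new)
```

Under this reading, "example.com" is allowed when the current value is "www.example.com", while "ample.com" is not, because the comparison includes the leading full stop.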
The cookie attribute represents the cookies of the resource.
On getting, if the sandboxed origin browsing context flag is set on the browsing context of the document, the user agent must raise a security exception. Otherwise, it must return the same string as the value of the Cookie HTTP header it would include if fetching the resource indicated by the document's address over HTTP, as per RFC 2109 section 4.3.4. [RFC2109]
On setting, if the sandboxed origin browsing context flag is set on the browsing context of the document, the user agent must raise a security exception. Otherwise, the user agent must act as it would when processing cookies if it had just attempted to fetch the document's address over HTTP, and had received a response with a Set-Cookie header whose value was the specified value, as per RFC 2109 sections 4.3.1, 4.3.2, and 4.3.3. [RFC2109]
Since the cookie
attribute is accessible across frames, the path restrictions on
cookies are only a tool to help manage which cookies are sent to
which parts of the site, and are not in any way a security
feature.
The lastModified
attribute, on getting, must return the date and time of the
Document
's source file's last modification, in the
user's local timezone, in the following format:
All the numeric components above, other than the year, must be given as two digits in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE representing the number in base ten, zero-padded if necessary.
The Document
's source file's last modification
date and time must be derived from relevant features of the
networking protocols used, e.g. from the value of the HTTP
Last-Modified
header of the document, or from
metadata in the file system for local files. If the last
modification date and time are not known, the attribute must return
the string 01/01/1970 00:00:00
.
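The following non-normative sketch shows one way the lastModified string could be produced. The list of format components is not reproduced in the text above, so the month/day/year order followed by a 24-hour time is an assumption here; it is consistent with the fallback string 01/01/1970 00:00:00 but is not confirmed by it. The function name format_last_modified is hypothetical, and the caller is assumed to pass a datetime already expressed in the user's local timezone.

```python
from datetime import datetime
from typing import Optional

# Value returned when the last modification date and time are not known.
UNKNOWN_LAST_MODIFIED = "01/01/1970 00:00:00"


def format_last_modified(modified: Optional[datetime]) -> str:
    """Format a local-time datetime; component order is an assumption."""
    if modified is None:
        return UNKNOWN_LAST_MODIFIED
    # Every component other than the year is zero-padded to two digits.
    return modified.strftime("%m/%d/%Y %H:%M:%S")
```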
A Document is always set to one of three modes: no quirks mode, the default; quirks mode, used typically for legacy documents; and limited quirks mode, also known as "almost standards" mode. The mode is only ever changed from the default by the HTML parser, based on the presence, absence, or value of the DOCTYPE string.
The compatMode DOM attribute must return the literal string "CSS1Compat" unless the document has been set to quirks mode by the HTML parser, in which case it must instead return the literal string "BackCompat". The document can also be set to limited quirks mode (also known as "almost standards" mode). By default, the document is set to no quirks mode (also known as "standards mode").
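As a non-normative illustration, the mapping from the three document modes to the compatMode string can be sketched as follows. The DocumentMode enumeration and the compat_mode function are hypothetical names introduced only for this example.

```python
from enum import Enum


class DocumentMode(Enum):
    NO_QUIRKS = "no quirks mode"            # the default
    QUIRKS = "quirks mode"                  # typically legacy documents
    LIMITED_QUIRKS = "limited quirks mode"  # "almost standards" mode


def compat_mode(mode: DocumentMode) -> str:
    # Only quirks mode yields "BackCompat"; both other modes yield "CSS1Compat".
    return "BackCompat" if mode is DocumentMode.QUIRKS else "CSS1Compat"
```

Note that limited quirks mode is not distinguishable from no quirks mode through this attribute.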
As far as parsing goes, the quirks I know of are:
Documents have an associated character encoding. When a Document object is created, the document's character encoding must be initialized to UTF-16. Various algorithms during page loading affect this value, as does the charset setter. [IANACHARSET]
The charset DOM attribute must, on getting, return the preferred MIME name of the document's character encoding. On setting, if the new value is an IANA-registered alias for a character encoding, the document's character encoding must be set to that character encoding. (Otherwise, nothing happens.)
The characterSet DOM attribute must, on getting, return the preferred MIME name of the document's character encoding.
The defaultCharset
DOM attribute must, on getting, return the preferred
MIME name of a character encoding, possibly the user's default
encoding, or an encoding associated with the user's current
geographical location, or any arbitrary encoding name.
Each document has a current document readiness. When a Document object is created, it must have its current document readiness set to the string "loading". Various algorithms during page loading affect this value. When the value is set, the user agent must fire a simple event called readystatechanged at the Document object.
The readyState DOM attribute must, on getting, return the current document readiness.
The nodes representing HTML elements in the DOM must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).
The basic interface, from which all the HTML elements' interfaces inherit, and which
must be used by elements that have no additional requirements, is
the HTMLElement
interface.
interface HTMLElement : Element {
  // DOM tree accessors
  NodeList getElementsByClassName(in DOMString classNames);

  // dynamic markup insertion
           attribute DOMString innerHTML;

  // metadata attributes
           attribute DOMString id;
           attribute DOMString title;
           attribute DOMString lang;
           attribute DOMString dir;
           attribute DOMString className;
  readonly attribute DOMTokenList classList;
  readonly attribute DOMStringMap dataset;

  // interaction
           attribute boolean irrelevant;
           attribute long tabIndex;
  void click();
  void focus();
  void blur();
  void scrollIntoView();
  void scrollIntoView(in boolean top);

  // commands
           attribute HTMLMenuElement contextMenu;

  // editing
           attribute boolean draggable;
           attribute DOMString contentEditable;
  readonly attribute DOMString isContentEditable;

  // styling
  readonly attribute CSSStyleDeclaration style;

  // data templates
           attribute DOMString template;
  readonly attribute HTMLDataTemplateElement templateElement;
           attribute DOMString ref;
  readonly attribute Node refNode;
           attribute DOMString registrationMark;
  readonly attribute DocumentFragment originalContent;

  // event handler DOM attributes
           attribute EventListener onabort;
           attribute EventListener onbeforeunload;
           attribute EventListener onblur;
           attribute EventListener onchange;
           attribute EventListener onclick;
           attribute EventListener oncontextmenu;
           attribute EventListener ondblclick;
           attribute EventListener ondrag;
           attribute EventListener ondragend;
           attribute EventListener ondragenter;
           attribute EventListener ondragleave;
           attribute EventListener ondragover;
           attribute EventListener ondragstart;
           attribute EventListener ondrop;
           attribute EventListener onerror;
           attribute EventListener onfocus;
           attribute EventListener onkeydown;
           attribute EventListener onkeypress;
           attribute EventListener onkeyup;
           attribute EventListener onload;
           attribute EventListener onmessage;
           attribute EventListener onmousedown;
           attribute EventListener onmousemove;
           attribute EventListener onmouseout;
           attribute EventListener onmouseover;
           attribute EventListener onmouseup;
           attribute EventListener onmousewheel;
           attribute EventListener onresize;
           attribute EventListener onscroll;
           attribute EventListener onselect;
           attribute EventListener onstorage;
           attribute EventListener onsubmit;
           attribute EventListener onunload;
};
As with the HTMLDocument
interface, the
HTMLElement
interface holds
methods and attributes related to a number of disparate features,
and the members of this interface are therefore described in
various different sections of this specification.
Some DOM attributes are defined to reflect a particular content attribute . This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.
If a reflecting DOM attribute is a DOMString
attribute whose content attribute is defined to contain a URI, then
on getting, the DOM attribute must return the value of the content
attribute, resolved to an absolute URI, and on setting, must set
the content attribute to the specified literal value. If the
content attribute is absent, the DOM attribute must return the
default value, if the content attribute has one, or else the empty
string.
If a reflecting DOM attribute is a DOMString attribute whose content attribute is defined to contain one or more URIs, then on getting, the DOM attribute must split the content attribute on spaces and return the concatenation of each token URI, resolved to an absolute URI, with a single U+0020 SPACE character between each URI; if the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string. On setting, the DOM attribute must set the content attribute to the specified literal value.
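The getter for a reflecting attribute that holds one or more URIs can be sketched as follows (non-normative). The function name get_reflected_uri_list is hypothetical. Two simplifications are assumed: Python's str.split() is used to split on whitespace, which is slightly broader than the set of space characters, and urllib.parse.urljoin stands in for resolving a URI against the document's base.

```python
from typing import Optional
from urllib.parse import urljoin


def get_reflected_uri_list(
    content_value: Optional[str], base_uri: str, default: str = ""
) -> str:
    """Model the getter; None means the content attribute is absent."""
    if content_value is None:
        # Absent attribute: the default value, or else the empty string.
        return default
    # Resolve every token and join them with exactly one U+0020 SPACE.
    return " ".join(urljoin(base_uri, token) for token in content_value.split())
```

The setter needs no counterpart here, since it stores the given literal value unchanged.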
If a reflecting DOM attribute is a DOMString whose content attribute is an enumerated attribute, and the DOM attribute is limited to only known values, then, on getting, the DOM attribute must return the conforming value associated with the state the attribute is in (in its canonical case), or the empty string if the attribute is in a state that has no associated keyword value. On setting, if the new value case-insensitively matches one of the keywords given for that attribute, then the content attribute must be set to the conforming value associated with the state that the attribute would be in if set to the given new value; otherwise, if the new value is the empty string, then the content attribute must be removed; otherwise, the setter must raise a SYNTAX_ERR exception.
If a reflecting DOM attribute is a DOMString
but
doesn't fall into any of the above categories, then the getting and
setting must be done in a transparent, case-preserving manner.
If a reflecting DOM attribute is a boolean attribute, then on getting the DOM attribute must return true if the attribute is set, and false if it is absent. On setting, the content attribute must be removed if the DOM attribute is set to false, and must be set to have the same value as its name if the DOM attribute is set to true. (This corresponds to the rules for boolean content attributes .)
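A non-normative sketch of boolean reflection follows. The FakeElement class and both function names are hypothetical; the class models nothing but a dictionary of content attributes.

```python
from typing import Dict


class FakeElement:
    """Hypothetical stand-in for an element; only models content attributes."""

    def __init__(self) -> None:
        self.attributes: Dict[str, str] = {}


def get_reflected_boolean(element: FakeElement, name: str) -> bool:
    # True if the content attribute is set (whatever its value), false if absent.
    return name in element.attributes


def set_reflected_boolean(element: FakeElement, name: str, value: bool) -> None:
    if value:
        # Setting to true gives the content attribute its own name as its value.
        element.attributes[name] = name
    else:
        # Setting to false removes the content attribute entirely.
        element.attributes.pop(name, None)
```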
If a reflecting DOM attribute is a signed integer type (long) then, on getting, the content attribute must be parsed according to the rules for parsing signed integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, then the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid integer in base ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type (unsigned long) then, on getting, the content attribute must be parsed according to the rules for parsing unsigned integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer in base ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type (unsigned long) that is limited to only positive non-zero numbers, then the behavior is similar to the previous case, but zero is not allowed. On getting, the content attribute must first be parsed according to the rules for parsing unsigned integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, the default value must be returned instead, or 1 if there is no default value. On setting, if the value is zero, the user agent must raise an INDEX_SIZE_ERR exception. Otherwise, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer in base ten and then that string must be used as the new content attribute value.
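The positive non-zero case can be sketched as follows (non-normative). The rules for parsing unsigned integers are defined elsewhere, so parse_unsigned below is only a simplified stand-in that skips leading space characters and reads a run of ASCII digits; that behavior is an assumption. All names, including the IndexSizeError class, are hypothetical.

```python
import re
from typing import Optional


class IndexSizeError(Exception):
    """Stand-in for the DOM INDEX_SIZE_ERR exception."""


# Simplified stand-in parser (assumption): leading spaces, then digits.
_UNSIGNED = re.compile(r"[ \t\n\f\r]*([0-9]+)")


def parse_unsigned(text: str) -> Optional[int]:
    match = _UNSIGNED.match(text)
    return int(match.group(1)) if match else None


def get_positive_long(content_value: Optional[str], default: Optional[int] = None) -> int:
    """Getter; None for content_value means the attribute is absent."""
    parsed = parse_unsigned(content_value) if content_value is not None else None
    if parsed is not None:
        return parsed
    # Parse failure or absent attribute: the default value, or 1 without one.
    return default if default is not None else 1


def set_positive_long(value: int) -> str:
    """Setter; returns the string to store as the content attribute value."""
    if value == 0:
        raise IndexSizeError("zero is not allowed")
    # str() of an int is the shortest base-ten representation.
    return str(value)
```

The signed and plain unsigned cases differ only in the parser used, the fallback value (0 rather than 1), and the absence of the zero check on setting.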
If a reflecting DOM attribute is a floating point number type (float) and the content attribute is defined to contain a time offset, then, on getting, the content attribute must be parsed according to the rules for parsing time offsets, and if that is successful, the resulting value, in seconds, must be returned. If that fails, or if the attribute is absent, the default value must be returned, or the not-a-number value (NaN) if there is no default value. On setting, the given value, interpreted as a time offset in seconds, must be converted to a string using the time offset serialization rules, and that string must be used as the new content attribute value.
If a reflecting DOM attribute is a
floating point number type ( float
) and it
doesn't fall into one of the earlier categories, then, on getting,
the content attribute must be parsed according to the rules for parsing
floating point number values , and if that is successful, the resulting value must be
returned. If, on the other hand, it fails, or if the attribute is
absent, the default value must be returned instead, or 0.0 if there
is no default value. On setting, the given value must be converted
to the shortest possible string representing the number as a
valid floating point
number in base ten and then
that string must be used as the new content attribute value.
If a reflecting DOM attribute is of the type DOMTokenList
, then on getting it must
return a DOMTokenList
object whose underlying string is the element's corresponding
content attribute. When the DOMTokenList
object mutates its
underlying string, the content
attribute must itself be immediately mutated. When the attribute is
absent, then the string represented by the DOMTokenList
object is the empty
string; when the object mutates this empty string, the user agent
must first add the corresponding content attribute, and then mutate
that attribute instead. DOMTokenList
attributes are always
read-only. The same DOMTokenList
object must be returned
every time for each attribute.
If a reflecting DOM attribute has the type HTMLElement
, or an interface that
descends from HTMLElement
,
then, on getting, it must run the following algorithm (stopping at
the first point where a value is returned):
document.getElementById()
method would find
if it was passed as its argument the current value of the
corresponding content attribute.
On setting, if the given element has an id
attribute, then the content
attribute must be set to the value of that id
attribute. Otherwise, the DOM
attribute must be set to the empty string.
The HTMLCollection, HTMLFormControlsCollection, and HTMLOptionsCollection
interfaces represent various lists of DOM nodes. Collectively,
objects implementing these interfaces are called collections.
When a collection is created, a filter and a root are associated with the collection.
For example, when the HTMLCollection
object for the
document.images
attribute is created, it is
associated with a filter that selects only img
elements, and rooted at the root of the
document.
The collection then represents a live view of the subtree rooted at the collection's root, containing only nodes that match the given filter. The view is linear. In the absence of specific requirements to the contrary, the nodes within the collection must be sorted in tree order.
The rows
list is not in tree order.
An attribute that returns a collection must return the same object every time it is retrieved.
The HTMLCollection
interface represents a generic collection of elements.
interface HTMLCollection {
  readonly attribute unsigned long length;
  [IndexGetter] Element item(in unsigned long index);
  [NameGetter] Element namedItem(in DOMString name);
};
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The namedItem(key) method must return the first node in the collection that matches either of the following requirements: it is an a, applet, area, form, img, or object element with a name attribute equal to key; or it is an HTML element of any kind with an id attribute equal to key. (Non-HTML elements, even if they have IDs, are not searched for the purposes of namedItem().) If no such elements are found, then the method must return null.
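The lookup performed by namedItem() on an HTMLCollection can be sketched as follows (non-normative). The Node class, its is_html flag, and the named_item function are hypothetical; the collection is modeled as a plain list already sorted in tree order, and the sketch assumes that only HTML elements are considered for either kind of match.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Elements whose name attribute takes part in the lookup.
NAME_MATCHING_TAGS = {"a", "applet", "area", "form", "img", "object"}


@dataclass
class Node:
    tag: str
    attributes: Dict[str, str] = field(default_factory=dict)
    is_html: bool = True  # False models an element from another namespace


def named_item(collection: List[Node], key: str) -> Optional[Node]:
    for node in collection:
        if not node.is_html:
            continue  # non-HTML elements are never searched
        if node.tag in NAME_MATCHING_TAGS and node.attributes.get("name") == key:
            return node
        if node.attributes.get("id") == key:
            return node
    return None  # no matching element was found
```

A div with name="x" is skipped because div is not in the name-matching set, whereas an img with name="x" or any HTML element with id="x" is returned.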
In ECMAScript implementations, objects that
implement the HTMLCollection interface must also have a [[Get]]
method that, when invoked with a property name that is a number,
acts like the item() method would when invoked with that argument,
and when invoked with a property name that is a string, acts like
the namedItem() method would when invoked with that
argument.
The HTMLFormControlsCollection
interface represents a collection of form controls.
interface HTMLFormControlsCollection {
  readonly attribute unsigned long length;
  [IndexGetter] HTMLElement item(in unsigned long index);
  [NameGetter] Object namedItem(in DOMString name);
};
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The namedItem(key) method must act according to the following algorithm:
1. If, at the time the method is called, there is exactly one node in the collection that has either an id attribute or a name attribute equal to key, then return that node and stop the algorithm.
2. Otherwise, if there are no nodes in the collection that have either an id attribute or a name attribute equal to key, then return null and stop the algorithm.
3. Otherwise, create a NodeList object representing a live view of the HTMLFormControlsCollection object, further filtered so that the only nodes in the NodeList object are those that have either an id attribute or a name attribute equal to key. The nodes in the NodeList object must be sorted in tree order.
4. Return that NodeList object.
The HTMLOptionsCollection
interface represents a list of option
elements.
interface HTMLOptionsCollection {
           attribute unsigned long length;
  [IndexGetter] HTMLOptionElement item(in unsigned long index);
  [NameGetter] Object namedItem(in DOMString name);
};
On getting, the length
attribute must return the number of nodes represented by the collection.
On setting, the
behavior depends on whether the new
value is equal to, greater than, or less than the number of nodes
represented by the collection at that
time. If the number is the same, then setting the attribute must do
nothing. If the new value is greater, then n
new option
elements with no attributes and no child
nodes must be appended to the select
element on which
the HTMLOptionsCollection
is
rooted, where n is the difference between the
two numbers (new value minus old value). If the new value is lower,
then the last n nodes in the collection must be
removed from their parent nodes, where n is the
difference between the two numbers (old value minus new value).
Setting length
never removes or adds any
optgroup
elements, and never adds new children to
existing optgroup
elements (though it can remove
children from them).
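The effect of setting length can be sketched with a small tree model (non-normative). The Node class and the helper names are hypothetical, and the sketch assumes the collection consists of the option children of the select element and of its optgroup children, in tree order; the exact filter is not stated in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison, so removal targets the exact node
class Node:
    tag: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def append(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def options_of(select: Node) -> List[Node]:
    """Options in tree order (assumed filter: select and optgroup children)."""
    result: List[Node] = []
    for child in select.children:
        if child.tag == "option":
            result.append(child)
        elif child.tag == "optgroup":
            result.extend(c for c in child.children if c.tag == "option")
    return result


def set_length(select: Node, new_length: int) -> None:
    options = options_of(select)
    old_length = len(options)
    if new_length > old_length:
        # Append n empty option elements to the select element itself.
        for _ in range(new_length - old_length):
            select.append(Node("option"))
    elif new_length < old_length:
        # Remove the last n options from whichever parent holds them.
        for option in options[new_length:]:
            option.parent.children.remove(option)
            option.parent = None
```

New options always land directly under the select element, while removal may take options out of an optgroup, which matches the note above about optgroup elements.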
The item( index )
method must return the index th node in the collection. If there is no
index th node in the collection, then the
method must return null.
The namedItem(key) method must act according to the following algorithm:
1. If, at the time the method is called, there is exactly one node in the collection that has either an id attribute or a name attribute equal to key, then return that node and stop the algorithm.
2. Otherwise, if there are no nodes in the collection that have either an id attribute or a name attribute equal to key, then return null and stop the algorithm.
3. Otherwise, create a NodeList object representing a live view of the HTMLOptionsCollection object, further filtered so that the only nodes in the NodeList object are those that have either an id attribute or a name attribute equal to key. The nodes in the NodeList object must be sorted in tree order.
4. Return that NodeList object.
We may want to add add()
and
remove()
methods here too because IE implements
HTMLSelectElement and HTMLOptionsCollection on the same object, and
so people use them almost interchangeably in the wild.
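The namedItem() algorithm shared by HTMLFormControlsCollection and HTMLOptionsCollection returns a single node, null, or a live list depending on how many nodes match. A non-normative sketch follows; the Control and LiveNodeList classes are hypothetical, and the collection is modeled as a mutable Python list kept in tree order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Union


@dataclass(eq=False)
class Control:
    attributes: Dict[str, str] = field(default_factory=dict)


def matches(node: Control, key: str) -> bool:
    return node.attributes.get("id") == key or node.attributes.get("name") == key


class LiveNodeList:
    """Re-applies the filter on every access, so it tracks the collection."""

    def __init__(self, source: Callable[[], List[Control]], key: str) -> None:
        self._source = source
        self._key = key

    def items(self) -> List[Control]:
        return [node for node in self._source() if matches(node, self._key)]


def named_item(collection: List[Control], key: str) -> Union[Control, LiveNodeList, None]:
    found = [node for node in collection if matches(node, key)]
    if len(found) == 1:
        return found[0]  # exactly one match: return the node itself
    if not found:
        return None      # no match at all
    # Several matches: return a live, filtered view of the collection.
    return LiveNodeList(lambda: collection, key)
```

Because the list is live, a node removed from the collection after the call also disappears from the returned view.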
The DOMTokenList
interface represents an interface to an underlying string that
consists of an unordered set of unique
space-separated tokens .
Which string underlies a particular DOMTokenList
object is defined when the
object is created. It might be a content attribute (e.g. the string
that underlies the classList
object is the class
attribute), or it
might be an anonymous string (e.g. when a DOMTokenList
object is passed to an
author-implemented callback in the datagrid
APIs).
[Stringifies] interface DOMTokenList {
  readonly attribute unsigned long length;
  [IndexGetter] DOMString item(in unsigned long index);
  boolean has(in DOMString token);
  void add(in DOMString token);
  void remove(in DOMString token);
  boolean toggle(in DOMString token);
};
The length attribute must return the number of unique tokens that result from splitting the underlying string on spaces.
The item(index) method must split the underlying string on spaces, sort the resulting list of tokens by Unicode codepoint, remove exact duplicates, and then return the indexth item in this list. If index is equal to or greater than the number of tokens, then the method must return null.
In ECMAScript implementations, objects
that implement the DOMTokenList interface must also have a [[Get]]
method that, when invoked with a property name that is a number,
acts like the item() method would when invoked with that
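argument.
The has(token) method must run the following algorithm:
1. If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
2. Otherwise, split the underlying string on spaces to get the list of tokens in the object's underlying string.
3. If the token indicated by token is one of the tokens in the object's underlying string then return true and stop this algorithm.
4. Otherwise, return false.
The add(token) method must run the following algorithm:
1. If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
2. Split the underlying string on spaces to get the list of tokens in the object's underlying string.
3. If the given token is already one of the tokens in the DOMTokenList object's underlying string then stop the algorithm.
4. Otherwise, if the DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
5. Append the value of token to the end of the DOMTokenList object's underlying string.
The remove(token) method must run the following algorithm:
1. If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
2. Otherwise, remove the given token from the underlying string.
The toggle(token) method must run the following algorithm:
1. If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
2. Split the underlying string on spaces to get the list of tokens in the object's underlying string.
3. If the given token is already one of the tokens in the DOMTokenList object's underlying string then remove the given token from the underlying string, and stop the algorithm, returning false.
4. Otherwise, if the DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
5. Append the value of token to the end of the DOMTokenList object's underlying string.
6. Return true.
In the ECMAScript DOM binding, objects implementing the DOMTokenList interface must stringify to the object's underlying string.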
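The DOMTokenList methods can be modeled with a small non-normative class. Several points are assumptions made for this sketch: the class and exception names are hypothetical; the space characters are taken to be U+0020, U+0009, U+000A, U+000C and U+000D; the INVALID_CHARACTER_ERR exception is assumed to be raised when the token contains a space character; and remove() rebuilds the string from the remaining tokens joined by single spaces, which may differ from the precise whitespace handling of the real removal algorithm.

```python
import re
from typing import List, Optional

_SPACE_CHARS = " \t\n\f\r"  # assumed set of space characters
_SPACE_RUN = re.compile(r"[ \t\n\f\r]+")


class InvalidCharacterError(Exception):
    """Stand-in for the DOM INVALID_CHARACTER_ERR exception."""


class TokenList:
    def __init__(self, underlying: str = "") -> None:
        self.underlying = underlying

    def _tokens(self) -> List[str]:
        return [t for t in _SPACE_RUN.split(self.underlying) if t]

    @staticmethod
    def _check(token: str) -> None:
        if any(ch in _SPACE_CHARS for ch in token):
            raise InvalidCharacterError(token)

    @property
    def length(self) -> int:
        return len(set(self._tokens()))  # unique tokens only

    def item(self, index: int) -> Optional[str]:
        # Sort by code point and drop duplicates before indexing.
        ordered = sorted(set(self._tokens()))
        return ordered[index] if 0 <= index < len(ordered) else None

    def has(self, token: str) -> bool:
        self._check(token)
        return token in self._tokens()

    def add(self, token: str) -> None:
        self._check(token)
        if token in self._tokens():
            return
        if self.underlying and self.underlying[-1] not in _SPACE_CHARS:
            self.underlying += " "
        self.underlying += token

    def remove(self, token: str) -> None:
        self._check(token)
        # Simplified removal: keep the other tokens, single-space separated.
        self.underlying = " ".join(t for t in self._tokens() if t != token)

    def toggle(self, token: str) -> bool:
        self._check(token)
        if token in self._tokens():
            self.remove(token)
            return False
        self.add(token)
        return True

    def __str__(self) -> str:
        return self.underlying  # stringifies to the underlying string
```

Note that add() leaves existing whitespace and duplicates in the underlying string untouched; only the view exposed by length and item() is deduplicated and sorted.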
The DOMStringMap
interface represents a set of name-value pairs. When
a DOMStringMap
object is instanced, it is associated with three
algorithms, one for getting values from names, one for setting
names to certain values, and one for deleting names.
The names of the methods on this interface are temporary and will be fixed when the Web IDL / "Language Bindings for DOM Specifications" spec is ready to handle this case.
interface DOMStringMap {
  [NameGetter] DOMString XXX1(in DOMString name);
  [NameSetter] void XXX2(in DOMString name, in DOMString value);
  [XXX] boolean XXX3(in DOMString name);
};
The XXX1(
name )
method must
call the algorithm for getting values from names, passing
name as the name, and must return the corresponding value, or
null if name
has no corresponding value.
The XXX2(
name, value)
method must
call the algorithm for setting names to certain values,
passing name
as the name and value as
the value.
The XXX3(
name )
method must
call the algorithm for deleting names, passing name as
the name, and must return true.
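The way a DOMStringMap delegates to its three associated algorithms can be sketched as follows (non-normative). The StringMap class and its method names get, set and delete are hypothetical placeholders for XXX1, XXX2 and XXX3, and the three algorithms are passed in as ordinary callables.

```python
from typing import Callable, Optional


class StringMap:
    def __init__(
        self,
        getter: Callable[[str], Optional[str]],
        setter: Callable[[str, str], None],
        deleter: Callable[[str], None],
    ) -> None:
        # The three algorithms are fixed when the object is created.
        self._getter = getter
        self._setter = setter
        self._deleter = deleter

    def get(self, name: str) -> Optional[str]:
        # Returns the corresponding value, or None (null) if there is none.
        return self._getter(name)

    def set(self, name: str, value: str) -> None:
        self._setter(name, value)

    def delete(self, name: str) -> bool:
        self._deleter(name)
        return True  # always true, whether or not the name existed
```

For instance, a map backed by a plain dictionary could be built as StringMap(store.get, store.__setitem__, lambda name: store.pop(name, None)).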
DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings . [DOM3CORE]
A DOM application can use the hasFeature(feature, version) method of the DOMImplementation interface with parameter values "HTML" and "5.0" (respectively) to determine whether or not this module is supported by the implementation. In addition to the feature string "HTML", the feature string "XHTML" (with version string "5.0") can be used to check if the implementation supports XHTML. User agents should respond with a true value when the hasFeature method is queried with these values. Authors are cautioned, however, that UAs returning true might not be perfectly compliant, and that UAs returning false might well have support for features in this specification; in general, therefore, use of this method is discouraged.
The values "HTML" and "XHTML" (both with version "5.0") should also be supported in the context of the getFeature() and isSupported() methods, as defined by DOM3 Core.
The interfaces defined in this specification are not always supersets of the interfaces defined in DOM2 HTML; some features that were formerly deprecated, poorly supported, rarely used or considered unnecessary have been removed. Therefore it is not guaranteed that an implementation that supports "HTML" "5.0" also supports "HTML" "2.0".
The html
element of a
document is the document's root element, if there is one and it's
an html
element, or null
otherwise.
The head
element of a
document is the first head
element
that is a child of the html
element , if there is one, or null otherwise.
The title element of a document is the first title element in the document (in tree order), if there is one, or null otherwise.
The title
attribute must, on
getting, run the following algorithm:
If the root element is an
svg
element in the " http://www.w3.org/2000/svg
" namespace, and the user
agent supports SVG, then the getter must return the value that
would have been returned by the DOM attribute of the same name on
the SVGDocument
interface.
Otherwise, it must return a concatenation of the data of all the
child text nodes of
the title
element , in tree
order, or the empty string if the
title
element is null.
On setting, the following algorithm must be run:
If the root element is an
svg
element in the " http://www.w3.org/2000/svg
" namespace, and the user
agent supports SVG, then the setter must defer to the setter for
the DOM attribute of the same name on the SVGDocument
interface. Stop the algorithm here.
If the title element is null and the head element is null, then the attribute must do nothing. Stop the algorithm here.
If the title element is null, then a new title element must be created and appended to the head element.
The children of the title element (if any) must all be removed.
A single Text node whose data is the new value being assigned must be appended to the title element.
The title attribute on the HTMLDocument interface should shadow the attribute of the same name on the SVGDocument interface when the user agent supports both HTML and SVG.
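The getter and setter steps above can be sketched as follows. This is a minimal illustration, not a DOM implementation: nodes are hypothetical plain objects, the helper names (el, text, findHead, findTitle, getTitle, setTitle) are invented for the example, the title element is assumed to be located in tree order, and the SVG branch is omitted.

```javascript
// Hypothetical plain-object stand-ins for DOM nodes (not a real DOM API).
function el(name, children = []) {
  return { kind: "element", name, children };
}
function text(data) {
  return { kind: "text", data };
}

// The head element: first head child of an html root element, or null.
function findHead(doc) {
  const root = doc.children.find(c => c.kind === "element");
  if (!root || root.name !== "html") return null;
  return root.children.find(c => c.kind === "element" && c.name === "head") || null;
}

// The title element: first title element in tree order, or null (assumption).
function findTitle(node) {
  for (const child of node.children || []) {
    if (child.kind !== "element") continue;
    if (child.name === "title") return child;
    const found = findTitle(child);
    if (found) return found;
  }
  return null;
}

// Getter: concatenate the data of the title element's child text nodes.
function getTitle(doc) {
  const title = findTitle(doc);
  if (title === null) return "";
  return title.children.filter(c => c.kind === "text").map(c => c.data).join("");
}

// Setter: follows the non-SVG steps listed above.
function setTitle(doc, value) {
  let title = findTitle(doc);
  const head = findHead(doc);
  if (title === null && head === null) return; // nothing to attach to
  if (title === null) {
    title = el("title");
    head.children.push(title);
  }
  title.children = [text(value)]; // remove old children, append one Text node
}
```

Setting the title twice replaces the text rather than appending to it, because the existing children are dropped before the new Text node is added.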
The body element of a document is the
first child of the html
element that is either a body
element or a frameset
element. If there is no such
element, it is null. If the body element is null, then when the
specification requires that events be fired at "the body element",
they must instead be fired at the Document
object.
The body
attribute, on getting,
must return the body element of the
document (either a body
element,
a frameset
element, or null). On setting, the
following algorithm must be run:
If the new value is not a body or frameset element, then raise a HIERARCHY_REQUEST_ERR exception and abort these steps.
Otherwise, replace the incumbent body element with the new value, as if the replaceChild() method had been called with the new value and the incumbent body element as its two arguments respectively, then abort these steps.
The images attribute must return an HTMLCollection rooted at the Document node, whose filter matches only img elements.
The embeds
attribute must return an HTMLCollection
rooted at the Document
node, whose
filter matches only embed
elements.
The plugins
attribute must return the same object as that returned
by the embeds
attribute.
The links
attribute must return
an HTMLCollection
rooted at the Document
node, whose filter matches only
a
elements with href
attributes
and area
elements with
href
attributes.
The forms
attribute must return
an HTMLCollection
rooted at the Document
node, whose filter matches only
form
elements.
The anchors
attribute must
return an HTMLCollection
rooted at the
Document
node, whose filter matches only
a
elements with name
attributes.
The scripts
attribute must return an HTMLCollection
rooted at the Document
node, whose
filter matches only script
elements.
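The collection attributes above all follow one pattern: a root node plus a filter, evaluated live. A rough sketch of that pattern, using hypothetical plain objects of the form { name, attrs, children } instead of real DOM nodes (makeCollection, hasAttr and filters are names invented for this example):

```javascript
// Hypothetical stand-in for a live HTMLCollection rooted at a node.
function makeCollection(root, filter) {
  function* descendants(node) {
    for (const child of node.children || []) {
      yield child;
      yield* descendants(child);
    }
  }
  // The tree is re-walked on every access, which is what makes it "live".
  const snapshot = () => [...descendants(root)].filter(filter);
  return {
    get length() { return snapshot().length; },
    item(index) { return snapshot()[index] || null; },
  };
}

const hasAttr = (node, attr) =>
  Object.prototype.hasOwnProperty.call(node.attrs || {}, attr);

// Filters corresponding to a few of the attributes described above.
const filters = {
  images: n => n.name === "img",
  links: n => (n.name === "a" || n.name === "area") && hasAttr(n, "href"),
  anchors: n => n.name === "a" && hasAttr(n, "name"),
  forms: n => n.name === "form",
};
```

A real user agent would cache results and invalidate them on mutation; re-walking on each access is only the simplest way to show the live behavior.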
The getElementsByName(name) method takes a string name, and must return a live NodeList containing all the a, applet, button, form, iframe, img, input, map, meta, object, select, and textarea elements in that document that have a name attribute whose value is equal to the name argument.
The getElementsByClassName(
classNames )
method takes a string
that contains an unordered set of unique
space-separated tokens representing classes. When called, the
method must return a live NodeList
object containing
all the elements in the document that have all the classes
specified in that argument, having obtained the classes by splitting a string on
spaces . If there are no tokens specified in the argument, then
the method must return an empty NodeList
.
The getElementsByClassName()
method on the HTMLElement
interface must return a live NodeList
with the nodes
that the HTMLDocument
getElementsByClassName()
method would return when passed the same argument(s), excluding any
elements that are not descendants of the HTMLElement
object on which the method
was invoked.
HTML, SVG, and MathML elements define which classes they are in
by having an attribute in the per-element partition with the name
class
containing a space-separated list of
classes to which the element belongs. Other specifications may also
allow elements in their namespaces to be labeled as
being in specific classes. UAs must not assume that all attributes
of the name class
for elements in any
namespace work in this way, however, and must not assume that such
attributes, when used as global attributes, label other elements as
being in specific classes.
Given the following XHTML fragment:
    <div id="example">
     <p id="p1" class="aaa bbb"/>
     <p id="p2" class="aaa ccc"/>
     <p id="p3" class="bbb ccc"/>
    </div>
A call to
document.getElementById('example').getElementsByClassName('aaa')
would return a NodeList
with the two paragraphs
p1
and p2
in it.
A call to getElementsByClassName('ccc bbb')
would only return one node, however, namely p3
. A
call to
document.getElementById('example').getElementsByClassName('bbb ccc ')
would return the same thing.
A call to getElementsByClassName('aaa,bbb')
would
return no nodes; none of the elements above are in the "aaa,bbb"
class.
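The matching rule behind these examples can be sketched as below. The sketch works on hypothetical plain objects ({ id, className, children }) and returns a static array, not a live NodeList; the exact set of characters treated as spaces is an assumption of this example.

```javascript
// Split a string on spaces into a set of unique tokens.
// Assumption: space, tab, LF, FF and CR count as space characters.
function splitOnSpaces(s) {
  return new Set(s.split(/[ \t\n\f\r]+/).filter(token => token !== ""));
}

// Return every descendant of root that carries all the requested classes.
function getElementsByClassName(root, classNames) {
  const wanted = [...splitOnSpaces(classNames)];
  const result = [];
  if (wanted.length === 0) return result; // no tokens: empty list
  (function walk(node) {
    for (const child of node.children || []) {
      const have = splitOnSpaces(child.className || "");
      if (wanted.every(c => have.has(c))) result.push(child);
      walk(child);
    }
  })(root);
  return result;
}

// The XHTML fragment from the example, as plain objects.
const example = {
  id: "example",
  children: [
    { id: "p1", className: "aaa bbb", children: [] },
    { id: "p2", className: "aaa ccc", children: [] },
    { id: "p3", className: "bbb ccc", children: [] },
  ],
};
```

Because the argument is split only on spaces, 'aaa,bbb' is a single token naming a class that no element has.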
The dir
attribute on the HTMLDocument
interface is defined along
with the dir
content attribute.
The document.write()
family of methods and
the innerHTML
family of DOM attributes enable
script authors to dynamically insert markup into the document.
bz argues that innerHTML should be called something else on XML documents and XML elements. Is the sanity worth the migration pain?
Because these APIs interact with the parser, their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser). The following table cross-references the various versions of these APIs.
| | document.write() | innerHTML |
|---|---|---|
| For documents that are HTML documents | document.write() in HTML | innerHTML in HTML |
| For documents that are XML documents | document.write() in XML | innerHTML in XML |
Regardless of the parsing mode, the document.writeln(...)
method must call the document.write()
method with the same
argument(s), and then call the document.write()
method with, as its
argument, a string consisting of a single line feed character
(U+000A).
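In other words, writeln() is a thin wrapper around write(). A minimal sketch, using a hypothetical recording object (makeDoc) in place of a real document:

```javascript
// Hypothetical document stand-in that records what write() receives.
function makeDoc() {
  const doc = {
    written: [],
    write(...args) {
      doc.written.push(args.join(""));
    },
    writeln(...args) {
      doc.write(...args); // same argument(s) as the original call
      doc.write("\n");    // then a single U+000A LINE FEED
    },
  };
  return doc;
}
```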
The open()
method comes in
several variants with different numbers of arguments.
When called with two or fewer arguments, the method must act as follows:
Let type be the value of the first argument,
if there is one, or " text/html
" otherwise.
Let replace be true if there is a second argument and it has the value "replace" , and false otherwise.
If the document has an active parser that isn't a
script-created parser , and the
insertion point associated with that
parser's input stream is not undefined (that
is, it does point to somewhere in the input stream), then
the method does nothing. Abort these steps and return the
Document
object on which the method was invoked.
This basically causes document.open()
to
be ignored when it's called in an inline script found during the
parsing of data sent over the network, while still letting it have
an effect when called asynchronously or on a document that is
itself being spoon-fed using these APIs.
onbeforeunload, onunload, reset timers, empty event queue, kill any pending transactions, XMLHttpRequests, etc
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream.
what about if it doesn't, because it's either like a text/plain, or Atom, or PDF, or XHTML, or image document, or something?
Remove all child nodes of the document.
Change the document's character encoding to UTF-16.
Create a new HTML parser and associate it
with the document. This is a script-created parser (meaning that it can
be closed by the document.open()
and document.close()
methods, and that the tokeniser will wait for an explicit call to
document.close()
before emitting an end-of-file
token).
If type does not have the value "
text/html
" , then act as if the tokeniser had emitted
a pre
element start tag, then set
the HTML parser 's tokenisation stage's content model flag to PLAINTEXT .
If replace is false, then:
Document's History object
Document
Document object, as well as the state of the document at the start of these steps. (This allows the user to step backwards in the session history to see the page before it was blown away by the document.open() call.)
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document
on which the method was
invoked.
We shouldn't hard-code text/plain
there. We should do it some other way, e.g. hand off to the section
on content-sniffing and handling of incoming data streams, the part
that defines how this all works when stuff comes over the
network.
When called with three or more arguments, the open()
method on the
HTMLDocument
object must
call the open()
method on the Window
interface
of the object returned by the defaultView
attribute of the
DocumentView
interface of the HTMLDocument
object, with the same
arguments as the original call to the open()
method, and
return whatever that method returned. If the defaultView
attribute of the
DocumentView
interface of the HTMLDocument
object is null, then the
method must raise an INVALID_ACCESS_ERR
exception.
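The way the argument count selects a variant, and how type and replace are derived, might be sketched like this. The function name interpretOpenArgs and the shape of its return value are invented for illustration only.

```javascript
// Decide which open() variant applies and derive its inputs.
function interpretOpenArgs(...args) {
  if (args.length >= 3) {
    // Three or more arguments: forwarded unchanged to the Window's open().
    return { variant: "window", args };
  }
  return {
    variant: "document",
    type: args.length >= 1 ? args[0] : "text/html",
    replace: args.length >= 2 && args[1] === "replace",
  };
}
```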
The close()
method must do
nothing if there is no script-created
parser associated with the document. If there is such a parser,
then, when the method is called, the user agent must insert an
explicit "EOF" character at the insertion point of the parser's input stream .
In HTML, the document.write(...)
method must act as follows:
If the insertion point is undefined,
the open()
method must be called (with no arguments)
on the document
object. The insertion point will point at just before the end
of the (empty) input stream .
The string consisting of the concatenation of all the arguments to the method must be inserted into the input stream just before the insertion point .
If there is a script that will execute as soon as the parser resumes , then the method must now return without further processing of the input stream .
Otherwise, the tokeniser must process the characters that were
inserted, one at a time, processing resulting tokens as they are
emitted, and stopping when the tokeniser reaches the insertion
point or when the processing of the tokeniser is aborted by the
tree construction stage (this can happen if a script
start tag token is emitted by the
tokeniser).
If the document.write()
method was called
from script executing inline (i.e. executing because the parser
parsed a set of script
tags),
then this is a reentrant invocation of the
parser .
Finally, the method must return.
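The effect of the first two steps on the input stream can be sketched with a string and an index. This is only a model of the insertion-point bookkeeping: tokenisation and script execution are left out, and makeStream, openStream and write are hypothetical names.

```javascript
// Hypothetical input stream: a string plus an insertion point (index),
// which is undefined when no parser is accepting script-written text.
function makeStream() {
  return { chars: "", insertionPoint: undefined };
}

// Stand-in for open(): empty stream, insertion point just before its end.
function openStream(stream) {
  stream.chars = "";
  stream.insertionPoint = 0;
}

function write(stream, ...args) {
  if (stream.insertionPoint === undefined) openStream(stream);
  const inserted = args.join(""); // concatenation of all the arguments
  const at = stream.insertionPoint;
  stream.chars = stream.chars.slice(0, at) + inserted + stream.chars.slice(at);
  // The text went in just before the insertion point, so the point now
  // sits after the inserted text and before whatever followed it.
  stream.insertionPoint = at + inserted.length;
}
```

Successive calls therefore appear in the stream in call order, ahead of any content that was already pending after the insertion point.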
In HTML, the innerHTML DOM attribute of all HTMLElement and HTMLDocument nodes returns a serialization of the node's children using the HTML syntax. On setting, it replaces the node's children with new nodes that result from parsing the given value. The formal definitions follow.
On getting, the innerHTML DOM attribute must return the result of running the HTML fragment serialization algorithm on the node.
On setting, if the node is a document, the innerHTML
DOM
attribute must run the following algorithm:
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream.
what about if it doesn't, because it's either like a text/plain, or Atom, or PDF, or XHTML, or image document, or something?
Remove the children nodes of the Document
whose
innerHTML
attribute is being set.
Create a new HTML parser , in its initial
state, and associate it with the Document
node.
Place into the input stream for the
HTML parser just created the string being
assigned into the innerHTML
attribute.
Start the parser and let it run until it has consumed all the
characters just inserted into the input stream. (The
Document
node will have been populated with elements
and a load
event will have fired on its body element .)
Otherwise, if the node is an element, then setting the
innerHTML
DOM attribute must cause the
following algorithm to run instead:
Invoke the HTML fragment parsing
algorithm , with the element whose innerHTML
attribute is being set as the context
element, and the string being assigned
into the innerHTML
attribute as the input . Let new children be the result
of this algorithm.
Remove the children of the element whose innerHTML
attribute is being set.
Let target document be the ownerDocument
of the Element
node whose
innerHTML
attribute is being set.
Set the ownerDocument
of all the nodes in
new children to the target
document .
Append all the new children nodes to the
node whose innerHTML
attribute is being set,
preserving their order.
script
elements
inserted using innerHTML
do not execute when they are
inserted.
In an XML context, the document.write()
method
must raise an INVALID_ACCESS_ERR
exception.
On the other hand, however, the innerHTML
attribute is
indeed usable in an XML context.
In an XML context, on getting, the innerHTML DOM attribute on HTMLElements must return a string in the form of an internal general parsed entity, and on HTMLDocuments must return a string in the form of a document entity. The string being returned must be XML namespace-well-formed and must be an isomorphic serialization of all of that node's child nodes, in document order. User agents may adjust prefixes and namespace declarations in the serialization (and indeed might be forced to do so in some cases to obtain namespace-well-formed XML). If any of the elements in the serialization are in no namespace, the default namespace in scope for those elements must be explicitly declared as the empty string. [XML] [XMLNS]
If any of the following cases are found in the DOM being serialized, the user agent must raise an INVALID_STATE_ERR exception:
A Document node with no child element nodes.
A DocumentType node that has an external subset public identifier or an external subset system identifier that contains both a U+0022 QUOTATION MARK ('"') and a U+0027 APOSTROPHE ("'").
An Attr node, Text node, CDATASection node, Comment node, or ProcessingInstruction node whose data contains characters that are not matched by the XML Char production. [XML]
A CDATASection node whose data contains the string "]]>".
A Comment node whose data contains two adjacent U+002D HYPHEN-MINUS (-) characters or ends with such a character.
A ProcessingInstruction node whose target name is the string "xml" (case insensitively).
A ProcessingInstruction node whose target name contains a U+003A COLON (":").
A ProcessingInstruction node whose data contains the string "?>".
These are the only ways to make a DOM unserializable. The DOM enforces all the other XML constraints; for example, trying to set an attribute with a name that contains an equals sign (=) will raise an INVALID_CHARACTER_ERR exception.
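A checker for the cases listed above could look roughly like the following. Nodes are hypothetical plain objects with a type string and, where relevant, data, target, publicId, systemId, attributes and children fields; none of these names come from a real DOM binding.

```javascript
// XML "Char" production, expressed as a Unicode-aware regular expression.
const XML_CHARS =
  /^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]*$/u;

function hasBothQuotes(id) {
  return typeof id === "string" && id.includes('"') && id.includes("'");
}

// Returns true if this single node matches one of the listed cases.
function nodeIsUnserializable(node) {
  const data = node.data || "";
  switch (node.type) {
    case "Document":
      return !(node.children || []).some(c => c.type === "Element");
    case "DocumentType":
      return hasBothQuotes(node.publicId) || hasBothQuotes(node.systemId);
    case "Attr":
    case "Text":
      return !XML_CHARS.test(data);
    case "CDATASection":
      return !XML_CHARS.test(data) || data.includes("]]>");
    case "Comment":
      return !XML_CHARS.test(data) || data.includes("--") || data.endsWith("-");
    case "ProcessingInstruction":
      return !XML_CHARS.test(data) ||
        node.target.toLowerCase() === "xml" ||
        node.target.includes(":") ||
        data.includes("?>");
    default:
      return false;
  }
}

// Checks a node and everything below it (attributes and children).
function treeIsUnserializable(node) {
  if (nodeIsUnserializable(node)) return true;
  const below = [...(node.attributes || []), ...(node.children || [])];
  return below.some(treeIsUnserializable);
}
```

A getter built on this would raise INVALID_STATE_ERR when treeIsUnserializable returns true for the node being serialized.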
On setting, in an XML context, the innerHTML
DOM
attribute on HTMLElement
s
and HTMLDocument
s must
run the following algorithm:
The user agent must create a new XML parser .
If the innerHTML
attribute is being set on an
element, the user agent must feed the parser just
created the string corresponding to the start tag of that element,
declaring all the namespace prefixes that are in scope on that
element in the DOM, as well as declaring the default namespace (if
any) that is in scope on that element in the DOM.
The user agent must feed the parser just created
the string being assigned into the innerHTML
attribute.
If the innerHTML
attribute is being set on an
element, the user agent must feed the parser the
string corresponding to the end tag of that element.
If the parser found a well-formedness error, the attribute's
setter must raise a SYNTAX_ERR
exception and abort
these steps.
The user agent must remove the children nodes of the node whose
innerHTML
attribute is being set.
If the attribute is being set on a Document node, let new children be the children of the document, preserving their order. Otherwise, the attribute is being set on an Element node; let new children be the children of the document's root element, preserving their order.
If the attribute is being set on a Document
node,
let target document be that
Document
node. Otherwise, the attribute is being set
on an Element
node; let target
document be the ownerDocument
of that
Element
.
Set the ownerDocument
of all the nodes in
new children to the target
document .
Append all the new children nodes to the
node whose innerHTML
attribute is being set,
preserving their order.
script
elements
inserted using innerHTML
do not execute when they are
inserted.
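The string handed to the XML parser in the first few steps can be illustrated as follows. The element description ({ qualifiedName, namespacesInScope }, with the "" key standing for the default namespace) and the function names are assumptions of this sketch; parsing and the later DOM steps are not shown.

```javascript
// Escape a namespace URI for use inside a double-quoted attribute value.
function escapeAttr(value) {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
}

// Build the text fed to the parser. Pass null for the Document case,
// where only the assigned string itself is fed.
function buildParserInput(element, value) {
  if (element === null) return value;
  const declarations = Object.entries(element.namespacesInScope)
    .map(([prefix, uri]) =>
      prefix === ""
        ? ` xmlns="${escapeAttr(uri)}"`
        : ` xmlns:${prefix}="${escapeAttr(uri)}"`)
    .join("");
  const name = element.qualifiedName;
  return `<${name}${declarations}>${value}</${name}>`;
}
```

Wrapping the value in the element's own start and end tags is what lets prefixes used inside the assigned string resolve against the namespaces in scope on that element.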
For HTML documents, and for HTML elements in HTML documents, certain APIs defined in DOM3 Core become case-insensitive or case-changing, as sometimes defined in DOM3 Core, and as summarized or required below. [DOM3CORE]
This does not apply to XML documents or to elements that are not in the HTML namespace despite being in HTML documents .
Element.tagName
, Node.nodeName
, and Node.localName
These attributes return tag names in all uppercase and attribute names in all lowercase, regardless of the case with which they were created.
Document.createElement()
The canonical form of HTML markup is all-lowercase; thus, this method will lowercase the argument before creating the requisite element. Also, the element created must be in the HTML namespace .
This doesn't apply to Document.createElementNS()
. Thus, it is possible, by
passing this last method a tag name in the wrong case, to create an
element that claims to have the tag name of an element defined in
this specification, but doesn't support its interfaces, because it
really has another tag name not accessible from the DOM APIs.
Element.setAttributeNode()
When an Attr
node is set on an HTML element , it must
have its name lowercased before the element is affected.
This doesn't apply to Element.setAttributeNodeNS().
Element.setAttribute()
When an attribute is set on an HTML element , the name argument must be lowercased before the element is affected.
This doesn't apply to Element.setAttributeNS().
Document.getElementsByTagName()
and
Element.getElementsByTagName()
These methods (but not their namespaced counterparts) must compare the given argument case-insensitively when looking at HTML elements , and case-sensitively otherwise.
Thus, in an HTML document with nodes in multiple namespaces, these methods will be both case-sensitive and case-insensitive at the same time.
Document.renameNode()
If the new namespace is the HTML namespace , then the new qualified name must be lowercased before the rename takes place.
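A few of these rules, for elements in an HTML document, can be modelled with plain records. The record shape ({ namespace, storedName }) and the function names are hypothetical, and prefixes are ignored.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// createElement(): lowercases its argument and uses the HTML namespace.
function createElement(name) {
  return { namespace: HTML_NS, storedName: name.toLowerCase() };
}

// createElementNS(): keeps the name exactly as given.
function createElementNS(namespace, name) {
  return { namespace, storedName: name };
}

// tagName: uppercase for HTML elements, unchanged otherwise.
function tagName(element) {
  return element.namespace === HTML_NS
    ? element.storedName.toUpperCase()
    : element.storedName;
}

// getElementsByTagName() comparison: case-insensitive for HTML elements,
// case-sensitive for everything else.
function matchesTagName(element, query) {
  if (element.namespace === HTML_NS) {
    return element.storedName.toLowerCase() === query.toLowerCase();
  }
  return element.storedName === query;
}
```

The last function shows how one call can be case-insensitive for some nodes and case-sensitive for others within the same document.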