For HTML documents, and for HTML elements in HTML documents, certain APIs defined in DOM Core become case-insensitive or case-changing, as sometimes defined in DOM Core, and as summarized or required below. [DOMCORE]
This does not apply to XML documents or to elements that are not in the HTML namespace despite being in HTML documents.
Element.tagName and Node.nodeName
These attributes must return element names converted to ASCII uppercase, regardless of the case with which they were created.
Document.createElement()
The canonical form of HTML markup is all-lowercase; thus, this method will lowercase the argument before creating the requisite element. Also, the element created must be in the HTML namespace.
This doesn't apply to Document.createElementNS(). Thus, it is possible, by passing this last method a tag name in the wrong case, to create an element that appears to have the same tag name as that of an element defined in this specification when its tagName attribute is examined, but that doesn't support the corresponding interfaces. The "real" element name (unaffected by case conversions) can be obtained from the localName attribute.
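The case rules for tagName, createElement() and createElementNS() described above can be modelled outside a browser. The following is only a sketch in Python, not DOM code: the class and function names are hypothetical, prefixes are ignored, and every element is assumed to live in an HTML document.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_upper(s):
    # ASCII-only uppercasing: only a-z change, unlike str.upper().
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


class ElementModel:
    """Hypothetical stand-in for an element in an HTML document."""

    def __init__(self, namespace, local_name):
        self.namespace = namespace
        self.local_name = local_name  # the "real", unconverted name

    @property
    def tag_name(self):
        # Only HTML-namespace elements are reported in ASCII uppercase.
        if self.namespace == HTML_NS:
            return ascii_upper(self.local_name)
        return self.local_name


def create_element(name):
    # Models Document.createElement(): lowercase first, HTML namespace.
    return ElementModel(HTML_NS, ascii_lower(name))


def create_element_ns(namespace, name):
    # Models Document.createElementNS(): no case conversion at all.
    return ElementModel(namespace, name)
```

In this model, create_element("DIV") and create_element_ns(HTML_NS, "DIV") both report "DIV" as their tag name, but only the first has the local name "div"; the second is the look-alike element the paragraph above warns about.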
Element.setAttribute()
Element.setAttributeNode()
Attribute names are converted to ASCII lowercase.
Specifically: when an attribute is set on an HTML element using Element.setAttribute(), the name argument must be converted to ASCII lowercase before the element is affected; and when an Attr node is set on an HTML element using Element.setAttributeNode(), it must have its name converted to ASCII lowercase before the element is affected.
This doesn't apply to Element.setAttributeNS() and Element.setAttributeNodeNS().
Element.getAttribute()
Element.getAttributeNode()
Attribute names are converted to ASCII lowercase.
Specifically: when the Element.getAttribute() method or the Element.getAttributeNode() method is invoked on an HTML element, the name argument must be converted to ASCII lowercase before the element's attributes are examined.
This doesn't apply to Element.getAttributeNS() and Element.getAttributeNodeNS().
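As an illustration of the attribute rules above, the sketch below models an HTML element's attribute list in Python. It is an assumption-laden simplification: the class and method names are hypothetical, attributes are keyed by (namespace, name), and prefixes and Attr nodes are left out.

```python
def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


class HTMLElementModel:
    """Hypothetical model of the attributes of one HTML element."""

    def __init__(self):
        self._attrs = {}  # (namespace, name) -> value

    def set_attribute(self, name, value):
        # Non-namespaced setter: the name is lowercased before storing.
        self._attrs[(None, ascii_lower(name))] = value

    def get_attribute(self, name):
        # Non-namespaced getter: the name is lowercased before lookup.
        return self._attrs.get((None, ascii_lower(name)))

    def set_attribute_ns(self, namespace, name, value):
        # Namespaced setter: the name is stored exactly as given.
        self._attrs[(namespace, name)] = value

    def get_attribute_ns(self, namespace, name):
        # Namespaced getter: the name is compared exactly as given.
        return self._attrs.get((namespace, name))
```

One consequence visible in the model: an attribute stored with mixed case through the namespaced setter cannot be found again through get_attribute(), because the lookup name is lowercased first.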
Document.getElementsByTagName()
Element.getElementsByTagName()
HTML elements match by lower-casing the argument before comparison; elements from other namespaces are treated as in XML (case-sensitively).
Specifically, these methods (but not their namespaced counterparts) must compare the given argument in a case-sensitive manner, but when looking at HTML elements, the argument must first be converted to ASCII lowercase.
Thus, in an HTML document with nodes in multiple namespaces, these methods will effectively be both case-sensitive and case-insensitive at the same time.
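The mixed behavior described above can be sketched as a single predicate. This Python model is illustrative only: the function name is hypothetical, and the "*" wildcard and prefixed names are deliberately not handled.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def matches_tag_name(element_namespace, element_local_name, argument):
    """Would getElementsByTagName(argument) match this element?"""
    if element_namespace == HTML_NS:
        # For HTML elements the argument is lowercased first; the
        # comparison itself is still case-sensitive.
        return element_local_name == ascii_lower(argument)
    # Elements in other namespaces are compared as given.
    return element_local_name == argument
```

Note that an HTML-namespace element whose local name contains uppercase letters (possible only through createElementNS()) can never match in this model, since the argument is always lowercased before the comparison.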
Implementations of XPath 1.0 that operate on HTML documents parsed or created in the manners described in this specification (e.g. as part of the document.evaluate() API) must act as if the following edit was applied to the XPath 1.0 specification.
First, remove this paragraph:
A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
Then, insert in its place the following:
A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. If the QName has a prefix, then there must be a namespace declaration for this prefix in the expression context, and the corresponding namespace URI is the one that is associated with this prefix. It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
If the QName has no prefix and the principal node type of the axis is element, then the default element namespace is used. Otherwise if the QName has no prefix, the namespace URI is null. The default element namespace is a member of the context for the XPath expression. The value of the default element namespace when executing an XPath expression through the DOM3 XPath API is determined in the following way:
- If the context node is from an HTML DOM, the default element namespace is "http://www.w3.org/1999/xhtml".
- Otherwise, the default element namespace URI is null.
This is equivalent to adding the default element namespace feature of XPath 2.0 to XPath 1.0, and using the HTML namespace as the default element namespace for HTML documents. It is motivated by the desire to have implementations be compatible with legacy HTML content while still supporting the changes that this specification introduces to HTML regarding the namespace used for HTML elements, and by the desire to use XPath 1.0 rather than XPath 2.0.
This change is a willful violation of the XPath 1.0 specification, motivated by the desire to have implementations be compatible with legacy content while still supporting the changes that this specification introduces to HTML regarding which namespace is used for HTML elements. [XPATH10]
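The replacement paragraph for QName expansion can be read as a small decision procedure. The following Python sketch mirrors it under stated assumptions: the function name, the prefix_map dictionary and the context_is_html_dom flag are hypothetical, and the error is modelled as a ValueError.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def expand_qname(qname, prefix_map, principal_node_type, context_is_html_dom):
    """Return (namespace URI or None, local name) for a node-test QName."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        if prefix not in prefix_map:
            # A prefix without a declaration in the expression context
            # is an error.
            raise ValueError("undeclared prefix: " + prefix)
        return (prefix_map[prefix], local)
    if principal_node_type == "element":
        # Unprefixed element tests use the default element namespace.
        default_ns = HTML_NS if context_is_html_dom else None
        return (default_ns, qname)
    # Unprefixed tests on other axes (e.g. attribute) stay in no namespace.
    return (None, qname)
```

With this rule, an unprefixed test such as "p" evaluated against an HTML DOM selects elements in the HTML namespace, while the same test against any other context still selects elements in no namespace.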
XSLT 1.0 processors outputting to a DOM when the output method is "html" (either explicitly or via the defaulting rule in XSLT 1.0) are affected as follows:
If the transformation program outputs an element in no namespace, the processor must, prior to constructing the corresponding DOM element node, change the namespace of the element to the HTML namespace, ASCII-lowercase the element's local name, and ASCII-lowercase the names of any non-namespaced attributes on the element.
This requirement is a willful violation of the XSLT 1.0 specification, required because this specification changes the namespaces and case-sensitivity rules of HTML in a manner that would otherwise be incompatible with DOM-based XSLT transformations. (Processors that serialize the output are unaffected.) [XSLT10]
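The adjustment required of XSLT 1.0 processors can be sketched as a function applied to each output element before the DOM node is built. This Python model is illustrative: the function name and the tuple-based representation of elements and attributes are assumptions.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def normalize_output_element(namespace, local_name, attributes):
    """attributes is a list of (namespace, name, value) tuples.

    Returns the (namespace, local_name, attributes) to use for the DOM node.
    """
    if namespace is not None:
        # Elements that already have a namespace are left untouched.
        return (namespace, local_name, list(attributes))
    adjusted = [
        # Only non-namespaced attribute names are lowercased.
        (ns, ascii_lower(name) if ns is None else name, value)
        for (ns, name, value) in attributes
    ]
    return (HTML_NS, ascii_lower(local_name), adjusted)
```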
There are also additional comments regarding the interaction of XSLT and HTML in the script element section.
APIs for dynamically inserting markup into the document interact with the parser, and thus their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser).
The open() method comes in several variants with different numbers of arguments.
open( [ type [, replace ] ] )
Causes the Document to be replaced in-place, as if it was a new Document object, but reusing the previous object, which is then returned.
If the type argument is omitted or has the value "text/html", then the resulting Document has an HTML parser associated with it, which can be given data to parse using document.write(). Otherwise, all content passed to document.write() will be parsed as plain text.
If the replace argument is present and has the value "replace", the existing entries in the session history for the Document object are removed.
The method has no effect if the Document is still being parsed.
Throws an INVALID_STATE_ERR exception if the Document is an XML document.
open( url, name, features [, replace ] )
Works like the window.open() method.
When called with two or fewer arguments, the method must act as follows:
If the Document object is not flagged as an HTML document, throw an INVALID_STATE_ERR exception and abort these steps.
Let type be the value of the first argument, if there is one, or "text/html" otherwise.
Let replace be true if there is a second argument and it is an ASCII case-insensitive match for the value "replace", and false otherwise.
If the document has an active parser that isn't a script-created parser, and the insertion point associated with that parser's input stream is not undefined (that is, it does point to somewhere in the input stream), then the method does nothing. Abort these steps and return the Document object on which the method was invoked.
This basically causes document.open() to be ignored when it's called in an inline script found during the parsing of data sent over the network, while still letting it have an effect when called asynchronously or on a document that is itself being spoon-fed using these APIs.
Release the storage mutex.
Prompt to unload the Document object. If the user refused to allow the document to be unloaded, then these steps must be aborted.
Unload the Document object, with the recycle parameter set to true.
Unregister all event listeners registered on the Document node and its descendants.
Remove any tasks associated with the Document in any task source.
Remove all child nodes of the document, without firing any mutation events.
Replace the Document's singleton objects with new instances of those objects. (This includes in particular the Window, Location, History, ApplicationCache, and Navigator objects, the various BarProp objects, the two Storage objects, the various HTMLCollection objects, and objects defined by other specifications, like Selection. It also includes all the Web IDL prototypes in the JavaScript binding, including the Document object's prototype.)
Change the document's character encoding to UTF-8.
Set the Document object's reload override flag and set the Document's reload override buffer to the empty string.
Change the document's address to the entry script's document's address.
Create a new HTML parser and associate it with the document. This is a script-created parser (meaning that it can be closed by the document.open() and document.close() methods, and that the tokenizer will wait for an explicit call to document.close() before emitting an end-of-file token). The encoding confidence is irrelevant.
Set the current document readiness of the document to "loading".
If the type string contains a U+003B SEMICOLON character (;), remove the first such character and all characters from it up to the end of the string.
Strip all leading and trailing space characters from type.
If type is not now an ASCII case-insensitive match for the string "text/html", then act as if the tokenizer had emitted a start tag token with the tag name "pre" followed by a single U+000A LINE FEED (LF) character, then switch the HTML parser's tokenizer to the PLAINTEXT state.
Remove all the entries in the browsing context's session history after the current entry. If the current entry is the last entry in the session history, then no entries are removed.
This doesn't necessarily have to affect the user agent's user interface.
Remove any tasks queued by the history traversal task source.
Document.
If replace is false, then add a new entry, just before the last entry, and associate with the new entry the text that was parsed by the previous parser associated with the Document object, as well as the state of the document at the start of these steps. This allows the user to step backwards in the session history to see the page before it was blown away by the document.open() call.
This new entry does not have a Document object, so a new one will be created if the session history is traversed to that entry.
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document on which the method was invoked.
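Among the steps above, the handling of the type and replace arguments is plain string processing and can be sketched in isolation. The Python below is a model only: the function names are hypothetical, and the set of space characters (space, tab, line feed, form feed, carriage return) is assumed to be what "space characters" refers to.

```python
SPACE_CHARACTERS = " \t\n\x0c\r"  # assumed definition of "space characters"


def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def selects_html_parser(type_arg="text/html"):
    """True if open(type) parses as HTML, False if it falls back to plain text."""
    semicolon = type_arg.find(";")
    if semicolon != -1:
        # Drop the first semicolon and everything after it.
        type_arg = type_arg[:semicolon]
    type_arg = type_arg.strip(SPACE_CHARACTERS)
    return ascii_lower(type_arg) == "text/html"


def replace_flag(*args):
    """True only if a second argument exists and matches "replace"."""
    return len(args) >= 2 and ascii_lower(args[1]) == "replace"
```

For example, " Text/HTML ; charset=utf-8" still selects the HTML parser in this model, while "text/plain" leads to the plain-text branch.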
When called with three or more arguments, the open() method on the HTMLDocument object must call the open() method on the Window object of the HTMLDocument object, with the same arguments as the original call to the open() method, and return whatever that method returned. If the HTMLDocument object has no Window object, then the method must raise an INVALID_ACCESS_ERR exception.
close()
Closes the input stream that was opened by the document.open() method.
Throws an INVALID_STATE_ERR exception if the Document is an XML document.
The close() method must run the following steps:
If the Document object is not flagged as an HTML document, throw an INVALID_STATE_ERR exception and abort these steps.
If there is no script-created parser associated with the document, then abort these steps.
Insert an explicit "EOF" character at the end of the parser's input stream.
If there is a pending parsing-blocking script, then abort these steps.
Run the tokenizer, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the explicit "EOF" character or spins the event loop.
document.write()
write(text...)
In general, adds the given string(s) to the Document's input stream.
This method has very idiosyncratic behavior. In some cases, this method can affect the state of the HTML parser while the parser is running, resulting in a DOM that does not correspond to the source of the document. In other cases, the call can clear the current page first, as if document.open() had been called. In yet more cases, the method is simply ignored, or throws an exception. To make matters worse, the exact behavior of this method can in some cases be dependent on network latency, which can lead to failures that are very hard to debug. For all these reasons, use of this method is strongly discouraged.
This method throws an INVALID_STATE_ERR exception when invoked on XML documents.
Document objects have an ignore-destructive-writes counter, which is used in conjunction with the processing of script elements to prevent external scripts from being able to use document.write() to blow away the document by implicitly calling document.open(). Initially, the counter must be set to zero.
The document.write(...) method must act as follows:
If the method was invoked on an XML document, throw an INVALID_STATE_ERR exception and abort these steps.
If the insertion point is undefined and the Document's ignore-destructive-writes counter is greater than zero, then abort these steps.
If the insertion point is undefined, call the open() method on the document object (with no arguments). If the user refused to allow the document to be unloaded, then abort these steps. Otherwise, the insertion point will point at just before the end of the (empty) input stream.
Insert the string consisting of the concatenation of all the arguments to the method into the input stream just before the insertion point.
If the Document object's reload override flag is set, then append the string consisting of the concatenation of all the arguments to the method to the Document's reload override buffer.
If there is no pending parsing-blocking script, have the tokenizer process the characters that were inserted, one at a time, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the insertion point or when the processing of the tokenizer is aborted by the tree construction stage (this can happen if a script end tag token is emitted by the tokenizer).
If the document.write() method was called from script executing inline (i.e. executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser.
Finally, return from the method.
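The first four steps of document.write() determine whether the call throws, is ignored, implicitly opens the document, or simply inserts text. The Python sketch below models only those decisions. The class, its fields and the exception name are hypothetical; prompting to unload, the reload override buffer and tokenization are left out.

```python
class InvalidStateError(Exception):
    """Stands in for the INVALID_STATE_ERR exception."""


class DocumentModel:
    def __init__(self, is_xml=False):
        self.is_xml = is_xml
        self.input_stream = ""
        self.insertion_point = None  # None models "undefined"
        self.ignore_destructive_writes = 0
        self.open_calls = 0

    def open(self):
        # Greatly simplified: start over with an empty input stream.
        self.input_stream = ""
        self.insertion_point = 0
        self.open_calls += 1

    def write(self, *args):
        if self.is_xml:
            raise InvalidStateError("write() on an XML document")
        if self.insertion_point is None and self.ignore_destructive_writes > 0:
            return  # the call is silently ignored
        if self.insertion_point is None:
            self.open()  # implicit document.open()
        text = "".join(args)
        at = self.insertion_point
        # Insert just before the insertion point, which therefore ends up
        # after the newly inserted text.
        self.input_stream = self.input_stream[:at] + text + self.input_stream[at:]
        self.insertion_point = at + len(text)
```

A first write() on a document with no insertion point calls open() once; later calls append without reopening, and a non-zero ignore-destructive-writes counter suppresses the implicit open entirely.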
document.writeln()
writeln(text...)
Adds the given string(s) to the Document's input stream, followed by a newline character. If necessary, calls the open() method implicitly first.
This method throws an INVALID_STATE_ERR exception when invoked on XML documents.
The document.writeln(...) method, when invoked, must act as if the document.write() method had been invoked with the same argument(s), plus an extra argument consisting of a string containing a single line feed character (U+000A).
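Expressed as code, writeln() is a thin wrapper around write(). In this Python sketch the write parameter is a hypothetical callable standing in for document.write().

```python
def writeln(write, *args):
    # Forward the same arguments, plus one extra argument holding a
    # single U+000A LINE FEED character.
    write(*args, "\n")
```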
innerHTML
The innerHTML IDL attribute represents the markup of the node's contents.
innerHTML [ = value ]
Returns a fragment of HTML or XML that represents the Document.
Can be set, to replace the Document's contents with the result of parsing the given string.
In the case of XML documents, will throw an INVALID_STATE_ERR if the Document cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.
innerHTML [ = value ]
Returns a fragment of HTML or XML that represents the element's contents.
Can be set, to replace the contents of the element with nodes parsed from the given string.
In the case of XML documents, will throw an INVALID_STATE_ERR if the element cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.
On getting, if the node's document is an HTML document, then the attribute must return the result of running the HTML fragment serialization algorithm on the node; otherwise, the node's document is an XML document, and the attribute must return the result of running the XML fragment serialization algorithm on the node instead (this might raise an exception instead of returning a string).
On setting, the following steps must be run:
If the node's document is an HTML document: Invoke the HTML fragment parsing algorithm.
If the node's document is an XML document: Invoke the XML fragment parsing algorithm.
In either case, the algorithm must be invoked with the string being assigned into the innerHTML attribute as the input. If the node is an Element node, then, in addition, that element must be passed as the context element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
If the attribute is being set on a Document node, and that document has an active parser, then abort that parser.
Remove the child nodes of the node whose innerHTML attribute is being set, firing appropriate mutation events.
If the attribute is being set on a Document node, let target document be that Document node. Otherwise, the attribute is being set on an Element node; let target document be the ownerDocument of that Element.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node whose innerHTML attribute is being set, preserving their order, and firing mutation events as if a DocumentFragment containing the new children had been inserted.
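The ordering of the setter steps matters: parsing happens first, so a parse failure leaves the existing children in place. The Python sketch below models that flow. The NodeModel class and the parse_fragment callable are hypothetical stand-ins for the DOM and the fragment parsing algorithms, and mutation events are not modelled.

```python
class NodeModel:
    def __init__(self, name, is_document=False, owner_document=None):
        self.name = name
        self.is_document = is_document
        self.owner_document = owner_document
        self.children = []
        self.parser_active = False  # only meaningful for documents


def set_inner_html(node, markup, parse_fragment):
    # Elements are passed as the context element; documents have none.
    context = None if node.is_document else node
    # If parsing raises, nothing below runs and the old children survive.
    new_children = parse_fragment(markup, context)
    if node.is_document and node.parser_active:
        node.parser_active = False  # models "abort that parser"
    node.children.clear()
    target_document = node if node.is_document else node.owner_document
    for child in new_children:
        child.owner_document = target_document
    # Order of the parsed nodes is preserved.
    node.children.extend(new_children)
```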
outerHTML
The outerHTML IDL attribute represents the markup of the element and its contents.
outerHTML [ = value ]
Returns a fragment of HTML or XML that represents the element and its contents.
Can be set, to replace the element with nodes parsed from the given string.
In the case of XML documents, will throw an INVALID_STATE_ERR if the element cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.
Throws a NO_MODIFICATION_ALLOWED_ERR exception if the parent of the element is the Document node.
On getting, if the node's document is an HTML document, then the attribute must return the result of running the HTML fragment serialization algorithm on a fictional node whose only child is the node on which the attribute was invoked; otherwise, the node's document is an XML document, and the attribute must return the result of running the XML fragment serialization algorithm on that fictional node instead (this might raise an exception instead of returning a string).
On setting, the following steps must be run:
Let target be the element whose outerHTML attribute is being set.
If target has no parent node, then abort these steps. There would be no way to obtain a reference to the nodes created even if the remaining steps were run.
If target's parent node is a Document object, throw a NO_MODIFICATION_ALLOWED_ERR exception and abort these steps.
Let parent be target's parent node, unless that is a DocumentFragment node, in which case let parent be an arbitrary body element.
If target's document is an HTML document: Invoke the HTML fragment parsing algorithm.
If target's document is an XML document: Invoke the XML fragment parsing algorithm.
In either case, the algorithm must be invoked with the string being assigned into the outerHTML attribute as the input, and parent as the context element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
Set the ownerDocument of all the nodes in new children to target's document.
Remove target from its parent node, firing mutation events as appropriate, and then insert in its place all the new children nodes, preserving their order, and again firing mutation events as if a DocumentFragment containing the new children had been inserted.
insertAdjacentHTML()
insertAdjacentHTML(position, text)
Parses the given string text as HTML or XML and inserts the resulting nodes into the tree in the position given by the position argument, as follows:
"beforebegin": immediately before the element itself.
"afterbegin": inside the element, before its first child.
"beforeend": inside the element, after its last child.
"afterend": immediately after the element itself.
Throws a SYNTAX_ERR exception if the arguments have invalid values (e.g., in the case of XML documents, if the given string is not well-formed).
Throws a NO_MODIFICATION_ALLOWED_ERR exception if the given position isn't possible (e.g. inserting elements after the root element of a Document).
The insertAdjacentHTML(position, text) method, when invoked, must run the following algorithm:
Let position and text be the method's first and second arguments, respectively.
Let target be the element on which the method was invoked.
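Use the first matching item from this list:
If position is an ASCII case-insensitive match for the string "beforebegin" or the string "afterend":
If target has no parent node, then abort these steps.
If target's parent node is a Document object, then throw a NO_MODIFICATION_ALLOWED_ERR exception and abort these steps.
Otherwise, let destination be the parent node of target.
If position is an ASCII case-insensitive match for the string "afterbegin" or the string "beforeend":
Let destination be the same as target.
Otherwise:
Throw a SYNTAX_ERR exception.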
If target's document is an HTML document: Invoke the HTML fragment parsing algorithm.
If target's document is an XML document: Invoke the XML fragment parsing algorithm.
In either case, the algorithm must be invoked with text as the input, and destination as the context element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
Set the ownerDocument of all the nodes in new children to target's document.
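Use the first matching item from this list:
If position is an ASCII case-insensitive match for the string "beforebegin": Insert all the new children nodes immediately before target.
If position is an ASCII case-insensitive match for the string "afterbegin": Insert all the new children nodes before the first child of target, if there is one. If there is no such child, append them all to target.
If position is an ASCII case-insensitive match for the string "beforeend": Append all the new children nodes to target.
If position is an ASCII case-insensitive match for the string "afterend": Insert all the new children nodes immediately after target.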
The new children nodes must be inserted in a manner that preserves their order and fires mutation events as if a DocumentFragment containing the new children had been inserted.
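The placement logic of insertAdjacentHTML() can be sketched independently of parsing. The Python model below takes already-parsed nodes and places them relative to a target. Several things are assumptions: the four position keywords ("beforebegin", "afterbegin", "beforeend", "afterend") are the values the method is generally known to accept rather than something spelled out in the steps above, matching is assumed to be ASCII case-insensitive, and the class and exception names are hypothetical.

```python
class NoModificationAllowedError(Exception):
    """Stands in for NO_MODIFICATION_ALLOWED_ERR."""


class SyntaxErr(Exception):
    """Stands in for SYNTAX_ERR."""


class NodeModel:
    def __init__(self, name, kind="element"):
        self.name = name
        self.kind = kind  # "element" or "document"
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def ascii_lower(s):
    # ASCII-only lowercasing: only A-Z change, unlike str.lower().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def insert_adjacent(target, position, new_children):
    where = ascii_lower(position)
    if where in ("beforebegin", "afterend"):
        destination = target.parent
        if destination is None:
            return  # no parent: nothing to do
        if destination.kind == "document":
            raise NoModificationAllowedError(position)
        index = destination.children.index(target)
        if where == "afterend":
            index += 1
    elif where in ("afterbegin", "beforeend"):
        destination = target
        index = 0 if where == "afterbegin" else len(target.children)
    else:
        raise SyntaxErr(position)
    for child in new_children:
        child.parent = destination
    # Slice assignment inserts all nodes at once, preserving their order.
    destination.children[index:index] = new_children
```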