HTML5

A vocabulary and associated APIs for HTML and XHTML

3.2.5.1.6 Embedded content

Status: Last call for comments

Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document.

Elements that are from namespaces other than the HTML namespace and that convey content but not metadata are embedded content for the purposes of the content models defined in this specification (for example, MathML or SVG).

Some embedded content elements can have fallback content: content that is to be used when the external resource cannot be used (e.g. because it is of an unsupported format). The element definitions state what the fallback is, if any.

3.2.5.1.7 Interactive content

Interactive content is content that is specifically intended for user interaction.

Certain elements in HTML have an activation behavior, which means that the user can activate them. This triggers a sequence of events dependent on the activation mechanism, and normally culminating in a click event followed by a DOMActivate event, as described below.

The user agent should allow the user to manually trigger elements that have an activation behavior, for instance using keyboard or voice input, or through mouse clicks. When the user triggers an element with a defined activation behavior in a manner other than clicking it, the default action of the interaction event must be to run synthetic click activation steps on the element.

When a user agent is to run synthetic click activation steps on an element, the user agent must run pre-click activation steps on the element, then fire a click event at the element. The default action of this click event must be to run post-click activation steps on the element. If the event is canceled, the user agent must run canceled activation steps on the element instead.

Given an element target, the nearest activatable element is the element returned by the following algorithm:

If target has a defined activation behavior, then return target and abort these steps.
If target has a parent element, then set target to that parent element and return to the first step.
Otherwise, there is no nearest activatable element.
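
The algorithm above can be sketched as a simple walk up the ancestor chain. This is a minimal illustration only: the `Element` class and its `has_activation_behavior` flag are hypothetical stand-ins for a real DOM element, not part of any API defined here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element (assumption for this sketch)."""
    name: str
    parent: Optional["Element"] = None
    has_activation_behavior: bool = False


def nearest_activatable_element(target: Optional[Element]) -> Optional[Element]:
    # Check the target first, then each ancestor in turn.
    while target is not None:
        if target.has_activation_behavior:
            return target
        target = target.parent
    # No element in the chain has a defined activation behavior.
    return None


# Models <p><a href="..."><em>text</em></a></p>, where only `a` is activatable.
p = Element("p")
a = Element("a", parent=p, has_activation_behavior=True)
em = Element("em", parent=a)
```

In this sketch, a click designated at `em` resolves to `a`, while a click designated at `p` has no nearest activatable element.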

When a pointing device is clicked, the user agent must run these steps:

Let e be the nearest activatable element of the element designated by the user, if any.
If there is an element e, run pre-click activation steps on it.
Dispatch the required click event.

If there is an element e, then the default action of the click event must be to run post-click activation steps on element e.

If there is an element e but the event is canceled, the user agent must run canceled activation steps on element e.

The above doesn't happen for arbitrary synthetic events dispatched by author script. However, the click() method can be used to make it happen programmatically.

When a user agent is to run pre-click activation steps on an element, it must run the pre-click activation steps defined for that element, if any.

When a user agent is to run post-click activation steps on an element, the user agent must fire a simple event named DOMActivate that is cancelable at that element. The default action of this event must be to run final activation steps on that element. If the event is canceled, the user agent must run canceled activation steps on the element instead.

When a user agent is to run canceled activation steps on an element, it must run the canceled activation steps defined for that element, if any.

When a user agent is to run final activation steps on an element, it must run the activation behavior defined for that element. Activation behaviors can refer to the click and DOMActivate events that were fired by the steps above leading up to this point.
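
The ordering of the steps above can be easier to follow as a trace. The following sketch only records which steps would run and in what order; the function name, its flags, and the trace strings are assumptions made for illustration, and no real event dispatch is modelled.

```python
from typing import List


def synthetic_click_activation_trace(
    click_canceled: bool = False,
    dom_activate_canceled: bool = False,
) -> List[str]:
    """Return the order of steps for a synthetic click (illustrative only)."""
    trace = ["pre-click activation steps", "fire click"]
    if click_canceled:
        # A canceled click skips the post-click steps entirely.
        trace.append("canceled activation steps")
        return trace
    # Default action of click: post-click steps, which fire DOMActivate.
    trace.append("fire DOMActivate")
    if dom_activate_canceled:
        trace.append("canceled activation steps")
        return trace
    # Default action of DOMActivate: final steps run the activation behavior.
    trace.append("final activation steps")
    return trace
```

Calling the function with no arguments gives the uncanceled path; setting either flag shows where the canceled activation steps take over.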

3.2.5.2 Transparent content models

Some elements are described as transparent; they have "transparent" in the description of their content model.

When a content model includes a part that is "transparent", those parts must not contain content that would not be conformant if all transparent elements in the tree were replaced, in their parent element, by the children in the "transparent" part of their content model, retaining order.

Consider the following markup fragment:

<p>Hello <a href="world.html"><em>wonderful</em> world</a>!</p>

Its DOM looks like the following:

p
- #text: Hello
- a href="world.html"
  - em
    - #text: wonderful
  - #text: world
- #text: !

The content model of the a element is transparent. To see if its contents are conforming, therefore, the element is replaced by its contents:

p
- #text: Hello
- em
  - #text: wonderful
- #text: world
- #text: !

Since that is conforming, the contents of the a are conforming in the original fragment.

When a transparent element has no parent, then the part of its content model that is "transparent" must instead be treated as accepting any flow content.
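
The replacement described above can be sketched as a tree rewrite. The tuple-based tree representation is an assumption made for this example, and only the `a` element is treated as transparent here; a conformance checker would then validate the rewritten tree against the parent's content model.

```python
from typing import List, Tuple, Union

# A node is either a text string or a (tag_name, children) tuple (assumption).
Node = Union[str, Tuple[str, list]]

TRANSPARENT = {"a"}  # only `a` is treated as transparent in this sketch


def replace_transparent(children: List[Node]) -> List[Node]:
    result: List[Node] = []
    for child in children:
        if isinstance(child, str):
            result.append(child)
            continue
        name, kids = child
        kids = replace_transparent(kids)
        if name in TRANSPARENT:
            # Splice the children into the parent, retaining order.
            result.extend(kids)
        else:
            result.append((name, kids))
    return result


# <p>Hello <a href="world.html"><em>wonderful</em> world</a>!</p>
fragment = ("p", ["Hello ", ("a", [("em", ["wonderful"]), " world"]), "!"])
flattened = ("p", replace_transparent(fragment[1]))
```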

3.2.5.3 Paragraphs

The term paragraph as defined in this section is distinct from (though related to) the p element defined later. The paragraph concept defined here is used to describe how to interpret documents.

A paragraph is typically a run of phrasing content that forms a block of text with one or more sentences that discuss a particular topic, as in typography, but can also be used for more general thematic grouping. For instance, an address is also a paragraph, as is a part of a form, a byline, or a stanza in a poem.

In the following example, there are two paragraphs in a section. There is also a heading, which contains phrasing content that is not a paragraph. Note how the comments and inter-element whitespace do not form paragraphs.

<section>
  <h1>Example of paragraphs</h1>
  This is the <em>first</em> paragraph in this example.
  <p>This is the second.</p>
  <!-- This is not a paragraph. -->
</section>

Paragraphs in flow content are defined relative to what the document looks like without the a, ins, del, and map elements complicating matters, since those elements, with their hybrid content models, can straddle paragraph boundaries, as shown in the first two examples below.

Generally, having elements straddle paragraph boundaries is best avoided. Maintaining such markup can be difficult.

The following example takes the markup from the earlier example and puts ins and del elements around some of the markup to show that the text was changed (though in this case, the changes admittedly don't make much sense). Notice how this example has exactly the same paragraphs as the previous one, despite the ins and del elements — the ins element straddles the heading and the first paragraph, and the del element straddles the boundary between the two paragraphs.

<section>
  <ins><h1>Example of paragraphs</h1>
  This is the <em>first</em> paragraph in</ins> this example<del>.
  <p>This is the second.</p></del>
  <!-- This is not a paragraph. -->
</section>

Let view be a view of the DOM that replaces all a, ins, del, and map elements in the document with their contents. Then, in view, for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content as well as phrasing content, let first be the first node of the run, and let last be the last node of the run. For each such run that consists of at least one node that is neither embedded content nor inter-element whitespace, a paragraph exists in the original DOM from immediately before first to immediately after last. (Paragraphs can thus span across a, ins, del, and map elements.)
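
The definition above can be sketched for a single container as follows. This is a simplified illustration under several assumptions: the `Node` class is hypothetical, the sets of phrasing and embedded element names are deliberately tiny, the container is assumed to accept non-phrasing content, and explicit `p` elements are not reported because they form paragraphs on their own.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STRADDLING = {"a", "ins", "del", "map"}
EMBEDDED = {"img", "object", "video", "audio", "canvas"}
PHRASING = {"#text", "em", "strong", "span"} | EMBEDDED  # abbreviated list


@dataclass
class Node:
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def text_of(node: Node) -> str:
    return node.text + "".join(text_of(c) for c in node.children)


def view(nodes: List[Node]) -> List[Node]:
    # Replace a, ins, del, and map elements with their contents.
    out: List[Node] = []
    for n in nodes:
        out.extend(view(n.children) if n.name in STRADDLING else [n])
    return out


def counts(node: Node) -> bool:
    # Embedded content and inter-element whitespace cannot start a paragraph.
    if node.name in EMBEDDED:
        return False
    return not (node.name == "#text" and node.text.strip() == "")


def implicit_paragraphs(container_children: List[Node]) -> List[str]:
    paragraphs: List[str] = []
    run: List[Node] = []
    sentinel: List[Optional[Node]] = [None]  # flushes the final run
    for node in view(container_children) + sentinel:
        if node is not None and node.name in PHRASING:
            run.append(node)
            continue
        if any(counts(n) for n in run):
            joined = "".join(text_of(n) for n in run)
            paragraphs.append(" ".join(joined.split()))
        run = []
    return paragraphs


def t(s: str) -> Node:
    return Node("#text", text=s)


# Children of the <section> in the ins/del example above.
section = [
    t("\n  "),
    Node("ins", children=[
        Node("h1", children=[t("Example of paragraphs")]),
        t("\n  This is the "),
        Node("em", children=[t("first")]),
        t(" paragraph in"),
    ]),
    t(" this example"),
    Node("del", children=[t(".\n  "), Node("p", children=[t("This is the second.")])]),
    t("\n  "),
]
found = implicit_paragraphs(section)
```

Here `found` holds the one implicit paragraph, which spans the `ins` and `del` boundaries; the second paragraph comes from the explicit `p` element.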

Conformance checkers may warn authors of cases where they have paragraphs that overlap each other (this can happen with object, video, audio, and canvas elements, and indirectly through elements in other namespaces that allow HTML to be further embedded therein, like svg or math).

A paragraph is also formed explicitly by p elements.

The p element can be used to wrap individual paragraphs when there would otherwise not be any content other than phrasing content to separate the paragraphs from each other.

In the following example, the link spans half of the first paragraph, all of the heading separating the two paragraphs, and half of the second paragraph. It straddles the paragraphs and the heading.

<aside>
 Welcome!
 <a href="about.html">
  This is home of...
  <h1>The Falcons!</h1>
  The Lockheed Martin multirole jet fighter aircraft!
 </a>
 This page discusses the F-16 Fighting Falcon's innermost secrets.
</aside>

Here is another way of marking this up, this time showing the paragraphs explicitly, and splitting the one link element into three:

<aside>
 <p>Welcome! <a href="about.html">This is home of...</a></p>
 <h1><a href="about.html">The Falcons!</a></h1>
 <p><a href="about.html">The Lockheed Martin multirole jet
 fighter aircraft!</a> This page discusses the F-16 Fighting
 Falcon's innermost secrets.</p>
</aside>

It is possible for paragraphs to overlap when using certain elements that define fallback content. For example, in the following section:

<section>
 <h1>My Cats</h1>
 You can play with my cat simulator.
 <object data="cats.sim">
  To see the cat simulator, use one of the following links:
  <ul>
   <li><a href="cats.sim">Download simulator file</a>
   <li><a href="http://sims.example.com/watch?v=LYds5xY4INU">Use online simulator</a>
  </ul>
  Alternatively, upgrade to the Mellblom Browser.
 </object>
 I'm quite proud of it.
</section>

There are five paragraphs:

The paragraph that says "You can play with my cat simulator. object I'm quite proud of it.", where object is the object element.
The paragraph that says "To see the cat simulator, use one of the following links:".
The paragraph that says "Download simulator file".
The paragraph that says "Use online simulator".
The paragraph that says "Alternatively, upgrade to the Mellblom Browser.".

The first paragraph is overlapped by the other four. A user agent that supports the "cats.sim" resource will only show the first one, but a user agent that shows the fallback will confusingly show the first sentence of the first paragraph as if it was in the same paragraph as the second one, and will show the last paragraph as if it was at the start of the second sentence of the first paragraph.

To avoid this confusion, explicit p elements can be used.

3.2.6 Annotations for assistive technology products (ARIA)

Authors may use the ARIA role and aria-* attributes on HTML elements, in accordance with the requirements described in the ARIA specifications, except where these conflict with the strong native semantics described below. These exceptions are intended to prevent authors from making assistive technology products report nonsensical states that do not represent the actual state of the document. [ARIA]

User agents are required to implement ARIA semantics on all HTML elements, as defined in the ARIA specifications. The implicit ARIA semantics defined below must be recognised by implementations. [ARIAIMPL]

The following table defines the strong native semantics and corresponding implicit ARIA semantics that apply to HTML elements. Each language feature (element or attribute) in a cell in the first column implies the ARIA semantics (role, states, and/or properties) given in the cell in the second column of the same row. Authors must not set the ARIA role and aria-* attributes in a manner that conflicts with the semantics described in the following table. When multiple rows apply to an element, the role from the last row to define a role must be applied, and the states and properties from all the rows must be combined.

Language feature	Strong native semantics and implied ARIA semantics
`a` element that represents a hyperlink	`link` role
`area` element that represents a hyperlink	`link` role
`button` element	`button` role
`datalist` element	`listbox` role, with the `aria-multiselectable` property set to "false"
`h1` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`h2` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`h3` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`h4` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`h5` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`h6` element that does not have an `hgroup` ancestor	`heading` role, with the `aria-level` property set to the element's outline depth
`hgroup` element	`heading` role, with the `aria-level` property set to the element's outline depth
`hr` element	`separator` role
`img` element whose `alt` attribute's value is empty	`presentation` role
`input` element with a `type` attribute in the Button state	`button` role
`input` element with a `type` attribute in the Checkbox state	`checkbox` role, with the `aria-checked` state set to "mixed" if the element's `indeterminate` IDL attribute is true, or "true" if the element's checkedness is true, or "false" otherwise
`input` element with a `type` attribute in the Color state	No role
`input` element with a `type` attribute in the Date state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Date and Time state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Local Date and Time state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the E-mail state with no suggestions source element	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the File Upload state	`button` role
`input` element with a `type` attribute in the Hidden state	No role
`input` element with a `type` attribute in the Image Button state	`button` role
`input` element with a `type` attribute in the Month state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Number state	`spinbutton` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute, the `aria-valuemax` property set to the element's maximum, the `aria-valuemin` property set to the element's minimum, and, if the result of applying the rules for parsing floating point number values to the element's value is a number, with the `aria-valuenow` property set to that number
`input` element with a `type` attribute in the Password state	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Radio Button state	`radio` role, with the `aria-checked` state set to "true" if the element's checkedness is true, or "false" otherwise
`input` element with a `type` attribute in the Range state	`slider` role, with the `aria-valuemax` property set to the element's maximum, the `aria-valuemin` property set to the element's minimum, and the `aria-valuenow` property set to the result of applying the rules for parsing floating point number values to the element's value, if that results in a number, or the default value otherwise
`input` element with a `type` attribute in the Reset Button state	`button` role
`input` element with a `type` attribute in the Search state with no suggestions source element	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Submit Button state	`button` role
`input` element with a `type` attribute in the Telephone state with no suggestions source element	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Text state with no suggestions source element	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Text, Search, Telephone, URL, or E-mail states with a suggestions source element	`combobox` role, with the `aria-owns` property set to the same value as the `list` attribute, and the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Time state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the URL state with no suggestions source element	`textbox` role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`input` element with a `type` attribute in the Week state	No role, with the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`link` element that represents a hyperlink	`link` role
`menu` element with a `type` attribute in the context menu state	No role
`menu` element with a `type` attribute in the list state	`menu` role
`menu` element with a `type` attribute in the toolbar state	`toolbar` role
`nav` element	`navigation` role
`option` element that is in a list of options or that represents a suggestion in a `datalist` element	`option` role, with the `aria-selected` state set to "true" if the element's selectedness is true, or "false" otherwise
`progress` element	`progressbar` role, with, if the progress bar is determinate, the `aria-valuemax` property set to the maximum value of the progress bar, the `aria-valuemin` property set to zero, and the `aria-valuenow` property set to the current value of the progress bar
`select` element with a `multiple` attribute	`listbox` role, with the `aria-multiselectable` property set to "true"
`select` element with no `multiple` attribute	`listbox` role, with the `aria-multiselectable` property set to "false"
`td` element	`gridcell` role, with the `aria-labelledby` property set to the value of the `headers` attribute, if any
`textarea` element	`textbox` role, with the `aria-multiline` property set to "true", and the `aria-readonly` state set to "true" if the element has a `readonly` attribute
`th` element that is neither a column header nor a row header	`gridcell` role, with the `aria-labelledby` property set to the value of the `headers` attribute, if any
`th` element that is a column header	`columnheader` role, with the `aria-labelledby` property set to the value of the `headers` attribute, if any
`th` element that is a row header	`rowheader` role, with the `aria-labelledby` property set to the value of the `headers` attribute, if any
`tr` element	`row` role
An element that defines a command, whose Type facet is "checkbox", and that is a descendant of a `menu` element whose `type` attribute is in the list state	`menuitemcheckbox` role, with the `aria-checked` state set to "true" if the command's Checked State facet is true, and "false" otherwise
An element that defines a command, whose Type facet is "command", and that is a descendant of a `menu` element whose `type` attribute is in the list state	`menuitem` role
An element that defines a command, whose Type facet is "radio", and that is a descendant of a `menu` element whose `type` attribute is in the list state	`menuitemradio` role, with the `aria-checked` state set to "true" if the command's Checked State facet is true, and "false" otherwise
Elements that are disabled	The `aria-disabled` state set to "true"
Elements that are required	The `aria-required` state set to "true"
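
The rule for combining multiple applicable rows of the table above can be sketched as follows. The `Row` representation is an assumption for illustration, and a row is modelled as "not defining a role" by using `None`; whether "No role" rows count as defining a role is not settled by this sketch.

```python
from typing import Dict, List, Optional, Tuple

# (role or None, states and properties): a hypothetical row representation.
Row = Tuple[Optional[str], Dict[str, str]]


def combine_rows(rows: List[Row]) -> Row:
    role: Optional[str] = None
    attrs: Dict[str, str] = {}
    for row_role, row_attrs in rows:
        if row_role is not None:
            role = row_role  # the last row that defines a role wins
        attrs.update(row_attrs)  # states and properties from all rows combine
    return role, attrs


# A disabled, required, unchecked checkbox matches three rows of the table.
rows: List[Row] = [
    ("checkbox", {"aria-checked": "false"}),
    (None, {"aria-disabled": "true"}),
    (None, {"aria-required": "true"}),
]
role, attrs = combine_rows(rows)
```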

Some HTML elements have native semantics that can be overridden. The following table lists these elements and their implicit ARIA semantics, along with the restrictions that apply to those elements. Each language feature (element or attribute) in a cell in the first column implies, unless otherwise overridden, the ARIA semantic (role, state, or property) given in the cell in the second column of the same row, but this semantic may be overridden under the conditions listed in the cell in the third column of that row.

Language feature	Default implied ARIA semantic	Restrictions
`address` element	No role	If specified, role must be `contentinfo` (ARIA restricts usage of this role to one per page)
`article` element	`article` role	Role must be either `article`, `document`, `application`, or `main` (ARIA restricts usage of this role to one per page)
`aside` element	`note` role	Role must be either `note`, `complementary`, or `search`
`footer` element	No role	If specified, role must be `contentinfo` (ARIA restricts usage of this role to one per page)
`header` element	No role	If specified, role must be `banner` (ARIA restricts usage of this role to one per page)
`li` element whose parent is an `ol` or `ul` element	`listitem` role	Role must be either `listitem` or `treeitem`
`ol` element	`list` role	Role must be either `list`, `tree`, or `directory`
`output` element	`status` role	No restrictions
`section` element	`region` role	Role must be either `region`, `document`, `application`, `contentinfo` (ARIA restricts usage of this role to one per page), `main` (ARIA restricts usage of this role to one per page), `search`, `alert`, `dialog`, `alertdialog`, `status`, or `log`
`table` element	`grid` role	Role must be either `grid` or `treegrid`
`ul` element	`list` role	Role must be either `list`, `tree`, or `directory`
The body element	`document` role	Role must be either `document` or `application`

User agents may apply different defaults than those described in this section in order to expose the semantics of HTML elements in a manner more fine-grained than possible with the above definitions.

Conformance checkers are encouraged to phrase errors such that authors are encouraged to use more appropriate elements rather than remove accessibility annotations. For example, if an a element is marked as having the button role, a conformance checker could say "Either a button element or an input element is required when using the button role" rather than "The button role cannot be used with a elements".
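
A checker for the restrictions table might be sketched along the following lines. Only a few rows of the table are transcribed, and the function name and message wording are assumptions for illustration; the message states which roles are allowed rather than simply rejecting the annotation.

```python
from typing import Dict, Optional, Set

# A subset of the restrictions table above (not exhaustive).
ALLOWED_ROLES: Dict[str, Set[str]] = {
    "aside": {"note", "complementary", "search"},
    "ol": {"list", "tree", "directory"},
    "ul": {"list", "tree", "directory"},
    "table": {"grid", "treegrid"},
    "body": {"document", "application"},
}


def check_role(element: str, role: str) -> Optional[str]:
    """Return an advisory message, or None if this sketch finds no problem."""
    allowed = ALLOWED_ROLES.get(element)
    if allowed is None or role in allowed:
        return None
    options = ", ".join(sorted(allowed))
    return (
        f"The {element} element supports these roles: {options}. "
        f"Consider a more appropriate element for the {role} role."
    )
```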

3.3 APIs in HTML documents

For HTML documents, and for HTML elements in HTML documents, certain APIs defined in DOM Core become case-insensitive or case-changing, as sometimes defined in DOM Core, and as summarized or required below. [DOMCORE]

This does not apply to XML documents or to elements that are not in the HTML namespace despite being in HTML documents.

Element.tagName and Node.nodeName

These attributes must return element names converted to ASCII uppercase, regardless of the case with which they were created.

Document.createElement()

The canonical form of HTML markup is all-lowercase; thus, this method will lowercase the argument before creating the requisite element. Also, the element created must be in the HTML namespace.

This doesn't apply to Document.createElementNS(). Thus, it is possible, by passing this last method a tag name in the wrong case, to create an element that appears to have the same tag name as that of an element defined in this specification when its tagName attribute is examined, but that doesn't support the corresponding interfaces. The "real" element name (unaffected by case conversions) can be obtained from the localName attribute.
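
The difference between the two creation methods can be modelled as follows. This is a sketch, not a DOM implementation: the `Element` class and the snake_case function names are hypothetical, and namespace prefixes are ignored.

```python
from dataclasses import dataclass
from typing import Optional

HTML_NS = "http://www.w3.org/1999/xhtml"


def ascii_lower(s: str) -> str:
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_upper(s: str) -> str:
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


@dataclass
class Element:
    namespace: Optional[str]
    local_name: str

    @property
    def tag_name(self) -> str:
        # HTML elements in HTML documents report ASCII-uppercased names.
        if self.namespace == HTML_NS:
            return ascii_upper(self.local_name)
        return self.local_name


def create_element(name: str) -> Element:
    # Models Document.createElement(): lowercase, HTML namespace.
    return Element(HTML_NS, ascii_lower(name))


def create_element_ns(namespace: Optional[str], name: str) -> Element:
    # Models Document.createElementNS(): no case conversion.
    return Element(namespace, name)


created = create_element("DIV")
lookalike = create_element_ns(HTML_NS, "DIV")
```

Both elements report the same `tag_name`, but only `created` has the local name `div`; the `local_name` of `lookalike` reveals that it is not the element defined in this specification.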

Element.setAttribute()

Element.setAttributeNode()

Attribute names are converted to ASCII lowercase.

Specifically: when an attribute is set on an HTML element using Element.setAttribute(), the name argument must be converted to ASCII lowercase before the element is affected; and when an Attr node is set on an HTML element using Element.setAttributeNode(), it must have its name converted to ASCII lowercase before the element is affected.

This doesn't apply to Element.setAttributeNS() and Element.setAttributeNodeNS().

Element.getAttribute()

Element.getAttributeNode()

Attribute names are converted to ASCII lowercase.

Specifically: When the Element.getAttribute() method or the Element.getAttributeNode() method is invoked on an HTML element, the name argument must be converted to ASCII lowercase before the element's attributes are examined.

This doesn't apply to Element.getAttributeNS() and Element.getAttributeNodeNS().
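
The attribute rules above can be modelled with a small class. The class and method names are hypothetical, and attributes are keyed by name only, which is a simplification that ignores namespaces and prefixes.

```python
from typing import Dict, Optional


def ascii_lower(s: str) -> str:
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


class HTMLElementSketch:
    """Hypothetical model of attribute handling on an HTML element."""

    def __init__(self) -> None:
        self._attrs: Dict[str, str] = {}

    def set_attribute(self, name: str, value: str) -> None:
        # Name is converted to ASCII lowercase before the element is affected.
        self._attrs[ascii_lower(name)] = value

    def get_attribute(self, name: str) -> Optional[str]:
        # Name is converted to ASCII lowercase before attributes are examined.
        return self._attrs.get(ascii_lower(name))

    def set_attribute_ns(self, namespace: Optional[str], name: str, value: str) -> None:
        # The namespaced variants perform no case conversion.
        self._attrs[name] = value

    def get_attribute_ns(self, namespace: Optional[str], name: str) -> Optional[str]:
        return self._attrs.get(name)


el = HTMLElementSketch()
el.set_attribute("CLASS", "intro")
el.set_attribute_ns(None, "dataFoo", "1")
```

One consequence visible in this model is that an attribute stored with uppercase letters through the namespaced setter cannot be read back through the lowercasing getter.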

Document.getElementsByTagName()

Element.getElementsByTagName()

HTML elements match by lower-casing the argument before comparison, elements from other namespaces are treated as in XML (case-sensitively).

Specifically, these methods (but not their namespaced counterparts) must compare the given argument in a case-sensitive manner, but when looking at HTML elements, the argument must first be converted to ASCII lowercase.

Thus, in an HTML document with nodes in multiple namespaces, these methods will effectively be both case-sensitive and case-insensitive at the same time.
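
The mixed matching behavior can be sketched as a single comparison function. The function name and its parameters are assumptions for this example; the `*` wildcard and qualified names with prefixes are not modelled.

```python
from typing import Optional

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def ascii_lower(s: str) -> str:
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def tag_name_matches(namespace: Optional[str], local_name: str, argument: str) -> bool:
    if namespace == HTML_NS:
        # For HTML elements the argument is lowercased first.
        argument = ascii_lower(argument)
    # The comparison itself is always case-sensitive.
    return local_name == argument
```

With this model, `"DIV"` matches an HTML `div`, but `"foreignobject"` does not match an SVG `foreignObject`.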

3.4 Interactions with XPath and XSLT

Implementations of XPath 1.0 that operate on HTML documents parsed or created in the manners described in this specification (e.g. as part of the document.evaluate() API) must act as if the following edit was applied to the XPath 1.0 specification.

First, remove this paragraph:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

Then, insert in its place the following:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. If the QName has a prefix, then there must be a namespace declaration for this prefix in the expression context, and the corresponding namespace URI is the one that is associated with this prefix. It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

If the QName has no prefix and the principal node type of the axis is element, then the default element namespace is used. Otherwise if the QName has no prefix, the namespace URI is null. The default element namespace is a member of the context for the XPath expression. The value of the default element namespace when executing an XPath expression through the DOM3 XPath API is determined in the following way:

If the context node is from an HTML DOM, the default element namespace is "http://www.w3.org/1999/xhtml".

Otherwise, the default element namespace URI is null.

This is equivalent to adding the default element namespace feature of XPath 2.0 to XPath 1.0, and using the HTML namespace as the default element namespace for HTML documents. It is motivated by the desire to have implementations be compatible with legacy HTML content while still supporting the changes that this specification introduces to HTML regarding the namespace used for HTML elements, and by the desire to use XPath 1.0 rather than XPath 2.0.

This change is a willful violation of the XPath 1.0 specification, motivated by the desire to have implementations be compatible with legacy content while still supporting the changes that this specification introduces to HTML regarding which namespace is used for HTML elements. [XPATH10]
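
The modified QName expansion can be sketched as follows. The function signature is an assumption made for illustration: the namespace declarations are passed as a plain dictionary, and the principal node type and the kind of context node are passed as simple arguments.

```python
from typing import Dict, Optional, Tuple

HTML_NS = "http://www.w3.org/1999/xhtml"


def expand_qname(
    qname: str,
    declarations: Dict[str, str],
    principal_node_type: str,
    context_is_html_dom: bool,
) -> Tuple[Optional[str], str]:
    """Return (namespace URI, local name) for a QName in a node test."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        if prefix not in declarations:
            # Error: prefix without a declaration in the expression context.
            raise ValueError(f"undeclared prefix: {prefix}")
        return declarations[prefix], local
    if principal_node_type == "element":
        # Unprefixed element tests use the default element namespace.
        return (HTML_NS if context_is_html_dom else None), qname
    # Unprefixed names on other axes (e.g. attribute) have a null namespace.
    return None, qname
```

Under this model, an unprefixed test such as `p` selects HTML-namespace `p` elements when the context node is from an HTML DOM, while `@href` still refers to an attribute in no namespace.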

XSLT 1.0 processors outputting to a DOM when the output method is "html" (either explicitly or via the defaulting rule in XSLT 1.0) are affected as follows:

If the transformation program outputs an element in no namespace, the processor must, prior to constructing the corresponding DOM element node, change the namespace of the element to the HTML namespace, ASCII-lowercase the element's local name, and ASCII-lowercase the names of any non-namespaced attributes on the element.

This requirement is a willful violation of the XSLT 1.0 specification, required because this specification changes the namespaces and case-sensitivity rules of HTML in a manner that would otherwise be incompatible with DOM-based XSLT transformations. (Processors that serialize the output are unaffected.) [XSLT10]
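
The XSLT output adjustment can be sketched as a function applied before each DOM element node is constructed. The tuple-based representation of elements and attributes is an assumption for this example.

```python
from typing import List, Optional, Tuple

HTML_NS = "http://www.w3.org/1999/xhtml"

# (namespace, name, value): a hypothetical attribute representation.
Attribute = Tuple[Optional[str], str, str]


def ascii_lower(s: str) -> str:
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def adjust_output_element(
    namespace: Optional[str], local_name: str, attributes: List[Attribute]
) -> Tuple[Optional[str], str, List[Attribute]]:
    if namespace is not None:
        # Elements already in a namespace are left untouched.
        return namespace, local_name, attributes
    adjusted = [
        (ns, ascii_lower(name) if ns is None else name, value)
        for ns, name, value in attributes
    ]
    return HTML_NS, ascii_lower(local_name), adjusted
```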

3.5 Dynamic markup insertion

APIs for dynamically inserting markup into the document interact with the parser, and thus their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser).

3.5.1 Opening the input stream

The open() method comes in several variants with different numbers of arguments.

document = document . open( [ type [, replace ] ] )

Causes the Document to be replaced in-place, as if it was a new Document object, but reusing the previous object, which is then returned.

If the type argument is omitted or has the value "text/html", then the resulting Document has an HTML parser associated with it, which can be given data to parse using document.write(). Otherwise, all content passed to document.write() will be parsed as plain text.

If the replace argument is present and has the value "replace", the existing entries in the session history for the Document object are removed.

The method has no effect if the Document is still being parsed.

Throws an INVALID_STATE_ERR exception if the Document is an XML document.

window = document . open( url, name, features [, replace ] )

Works like the window.open() method.

When called with two or fewer arguments, the method must act as follows:

If the Document object is not flagged as an HTML document, throw an INVALID_STATE_ERR exception and abort these steps.
Let type be the value of the first argument, if there is one, or "text/html" otherwise.
Let replace be true if there is a second argument and it is an ASCII case-insensitive match for the value "replace", and false otherwise.
If the document has an active parser that isn't a script-created parser, and the insertion point associated with that parser's input stream is not undefined (that is, it does point to somewhere in the input stream), then the method does nothing. Abort these steps and return the Document object on which the method was invoked.

This basically causes document.open() to be ignored when it's called in an inline script found during the parsing of data sent over the network, while still letting it have an effect when called asynchronously or on a document that is itself being spoon-fed using these APIs.
Release the storage mutex.
Prompt to unload the Document object. If the user refused to allow the document to be unloaded, then these steps must be aborted.
Unload the Document object, with the recycle parameter set to true.
If the document has an active parser, then abort that parser.
Unregister all event listeners registered on the Document node and its descendants.
Remove any tasks associated with the Document in any task source.
Remove all child nodes of the document, without firing any mutation events.
Replace the Document's singleton objects with new instances of those objects. (This includes in particular the Window, Location, History, ApplicationCache, UndoManager, Navigator, and Selection objects, the various BarProp objects, the two Storage objects, and the various HTMLCollection objects. It also includes all the Web IDL prototypes in the JavaScript binding, including the Document object's prototype.)
Change the document's character encoding to UTF-16.
Change the document's address to the entry script's document's address.
Create a new HTML parser and associate it with the document. This is a script-created parser (meaning that it can be closed by the document.open() and document.close() methods, and that the tokenizer will wait for an explicit call to document.close() before emitting an end-of-file token). The encoding confidence is irrelevant.
If the type string contains a U+003B SEMICOLON character (;), remove the first such character and all characters from it up to the end of the string.

Strip all leading and trailing space characters from type.

If type is not now an ASCII case-insensitive match for the string "text/html", then act as if the tokenizer had emitted a start tag token with the tag name "pre", then switch the HTML parser's tokenizer to the PLAINTEXT state.
Remove all the entries in the browsing context's session history after the current entry. If the current entry is the last entry in the session history, then no entries are removed.

This doesn't necessarily have to affect the user agent's user interface.
Remove any tasks queued by the history traversal task source.
Remove any earlier entries that share the same Document.
If replace is false, then add a new entry, just before the last entry, and associate with the new entry the text that was parsed by the previous parser associated with the Document object, as well as the state of the document at the start of these steps. (This allows the user to step backwards in the session history to see the page before it was blown away by the document.open() call.)
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document on which the method was invoked.
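
The handling of the type and replace arguments in the steps above can be sketched as a pure JavaScript function. This is only an illustration under stated assumptions, not how a user agent implements it: the names resolveOpenArguments, stripSpaceCharacters and asciiLowercase are hypothetical, and "space characters" is assumed to mean U+0020, U+0009, U+000A, U+000C and U+000D.

```javascript
// Hypothetical helper: models only the argument handling of document.open()
// when it is called with two or fewer arguments. No real Document is touched.

// Assumption: the space characters are space, tab, LF, FF and CR.
const SPACE_CHARACTERS = " \t\n\f\r";

function stripSpaceCharacters(s) {
  let start = 0;
  let end = s.length;
  while (start < end && SPACE_CHARACTERS.includes(s[start])) start++;
  while (end > start && SPACE_CHARACTERS.includes(s[end - 1])) end--;
  return s.slice(start, end);
}

function asciiLowercase(s) {
  // Lowercase only A-Z so that non-ASCII characters are compared unchanged.
  return s.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

function resolveOpenArguments(...args) {
  // type defaults to "text/html" when there is no first argument.
  let type = args.length >= 1 ? String(args[0]) : "text/html";
  // replace is true only for an ASCII case-insensitive match for "replace".
  const replace =
    args.length >= 2 && asciiLowercase(String(args[1])) === "replace";

  // Remove the first semicolon and everything after it, then strip spaces.
  const semicolon = type.indexOf(";");
  if (semicolon !== -1) type = type.slice(0, semicolon);
  type = stripSpaceCharacters(type);

  // Anything other than text/html is handled as plain text.
  const plaintext = asciiLowercase(type) !== "text/html";
  return { type, replace, plaintext };
}
```

Under this sketch, a type such as " Text/HTML ; charset=utf-8" is still treated as HTML, because the parameter is dropped and the comparison ignores ASCII case.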

When called with three or more arguments, the open() method on the HTMLDocument object must call the open() method on the Window object of the HTMLDocument object, with the same arguments as the original call to the open() method, and return whatever that method returned. If the HTMLDocument object has no Window object, then the method must raise an INVALID_ACCESS_ERR exception.
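
The split between the two behaviours of open() depends only on the number of arguments. A rough sketch of that dispatch follows; dispatchOpen and the openInputStream callback are hypothetical names, and doc is a plain object standing in for an HTMLDocument.

```javascript
// Hypothetical model of the argument-count dispatch described above.
// `doc.window` stands in for the Window object of the HTMLDocument, and
// `openInputStream` stands in for the "two or fewer arguments" algorithm.
function dispatchOpen(doc, args, openInputStream) {
  if (args.length >= 3) {
    if (!doc.window) {
      throw Object.assign(new Error("document has no Window object"), {
        name: "INVALID_ACCESS_ERR",
      });
    }
    // Same arguments, and the return value is passed through unchanged.
    return doc.window.open(...args);
  }
  return openInputStream(doc, ...args);
}
```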

3.5.2 Closing the input stream

Status: Last call for comments

document . close()

Closes the input stream that was opened by the document.open() method.

Throws an INVALID_STATE_ERR exception if the Document is an XML document.

The close() method must run the following steps:

If the Document object is not flagged as an HTML document, throw an INVALID_STATE_ERR exception and abort these steps.
If there is no script-created parser associated with the document, then abort these steps.
Insert an explicit "EOF" character at the end of the parser's input stream.
If there is a pending parsing-blocking script, then abort these steps.
Run the tokenizer, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the explicit "EOF" character or spins the event loop.
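
The close() steps can be illustrated with a toy model. Everything here is an assumption made for the sketch: makeToyParser, toyClose and the field names are hypothetical, the "tokenizer" merely copies characters, and spinning the event loop is not modeled.

```javascript
// Toy model only: `doc` and its parser are plain objects, not browser objects.
const EXPLICIT_EOF = Symbol("explicit EOF");

function makeToyParser(initialInput) {
  return {
    inputStream: [...initialInput], // one entry per character
    position: 0,
    consumed: "",
    finished: false,
    // Consume characters until the explicit EOF marker is reached.
    runTokenizer() {
      while (this.position < this.inputStream.length) {
        const next = this.inputStream[this.position++];
        if (next === EXPLICIT_EOF) {
          this.finished = true;
          return;
        }
        this.consumed += next;
      }
    },
  };
}

function toyClose(doc) {
  if (!doc.isHTMLDocument) {
    throw Object.assign(new Error("close() called on an XML document"), {
      name: "INVALID_STATE_ERR",
    });
  }
  const parser = doc.scriptCreatedParser;
  if (!parser) return; // no script-created parser: nothing to close
  parser.inputStream.push(EXPLICIT_EOF);
  // With a pending parsing-blocking script, the EOF is queued but not yet read.
  if (doc.pendingParsingBlockingScript) return;
  parser.runTokenizer();
}
```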

3.5.3 `document.write()`

document . write(text...)

Adds the given string(s) to the Document's input stream. If necessary, calls the open() method implicitly first.

This method throws an INVALID_ACCESS_ERR exception when invoked on XML documents.

Unless called from the body of a script element while the document is being parsed, or called on a script-created document, calling this method will clear the current page first, as if document.open() had been called.

The document.write(...) method must act as follows:

If the method was invoked on an XML document, throw an INVALID_ACCESS_ERR exception and abort these steps.
If the insertion point is undefined, the open() method must be called (with no arguments) on the document object. If the user refused to allow the document to be unloaded, then these steps must be aborted. Otherwise, the insertion point will point at just before the end of the (empty) input stream.
The string consisting of the concatenation of all the arguments to the method must be inserted into the input stream just before the insertion point.
If there is a pending parsing-blocking script, then the method must now return without further processing of the input stream.
Otherwise, the tokenizer must process the characters that were inserted, one at a time, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the insertion point or when the processing of the tokenizer is aborted by the tree construction stage (this can happen if a script end tag token is emitted by the tokenizer).
If the document.write() method was called from script executing inline (i.e. executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser.
Finally, the method must return.
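
The effect of document.write() on the input stream and the insertion point can be sketched with a toy model. This is an illustration only: toyWrite, toyOpen and the stream object are hypothetical, the tokenizer is not modeled, and the prompt to unload is assumed to always succeed.

```javascript
// Toy model only. A stream is { chars, insertionPoint }, where insertionPoint
// is an index into chars, or undefined when no insertion point is defined.

// Stand-in for document.open(): empties the stream and places the insertion
// point just before the end of the (now empty) input.
function toyOpen(stream) {
  stream.chars = "";
  stream.insertionPoint = 0;
}

function toyWrite(doc, ...text) {
  if (doc.isXMLDocument) {
    throw Object.assign(new Error("write() called on an XML document"), {
      name: "INVALID_ACCESS_ERR",
    });
  }
  const stream = doc.stream;
  // An undefined insertion point means the current page is cleared first.
  if (stream.insertionPoint === undefined) toyOpen(stream);

  // All arguments are concatenated and inserted just before the insertion point.
  const inserted = text.map(String).join("");
  const at = stream.insertionPoint;
  stream.chars = stream.chars.slice(0, at) + inserted + stream.chars.slice(at);
  // The insertion point stays immediately after the inserted text.
  stream.insertionPoint = at + inserted.length;
}
```

In this model, writing "<b>", 1 and "</b>" into a stream that holds "<p>rest" with the insertion point at index 0 yields "<b>1</b><p>rest", which mirrors how text written by an inline script is parsed before the rest of the page.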

3.5.4 `document.writeln()`

document . writeln(text...)

Adds the given string(s) to the Document's input stream, followed by a newline character. If necessary, calls the open() method implicitly first.

This method throws an INVALID_ACCESS_ERR exception when invoked on XML documents.

The document.writeln(...) method, when invoked, must act as if the document.write() method had been invoked with the same argument(s), plus an extra argument consisting of a string containing a single line feed character (U+000A).
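
Expressed as code, writeln() is a thin wrapper around write(). The sketch below assumes a hypothetical makeWriteln helper that takes any write-like function; it is not tied to a real Document.

```javascript
// Hypothetical helper: builds a writeln-like function from a write-like one.
function makeWriteln(write) {
  // Same arguments, plus one extra argument holding a single U+000A LINE FEED.
  return (...text) => write(...text, "\n");
}
```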

3.5.5 `innerHTML`

Status: Last call for comments

The innerHTML IDL attribute represents the markup of the node's contents.

document . innerHTML [ = value ]

Returns a fragment of HTML or XML that represents the Document.

Can be set, to replace the Document's contents with the result of parsing the given string.

In the case of XML documents, will throw an INVALID_STATE_ERR if the Document cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.

element . innerHTML [ = value ]

Returns a fragment of HTML or XML that represents the element's contents.

Can be set, to replace the contents of the element with nodes parsed from the given string.

In the case of XML documents, will throw an INVALID_STATE_ERR if the element cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.

On getting, if the node's document is an HTML document, then the attribute must return the result of running the HTML fragment serialization algorithm on the node; otherwise, the node's document is an XML document, and the attribute must return the result of running the XML fragment serialization algorithm on the node instead (this might raise an exception instead of returning a string).

On setting, the following steps must be run:

If the node's document is an HTML document: Invoke the HTML fragment parsing algorithm.

If the node's document is an XML document: Invoke the XML fragment parsing algorithm.

In either case, the algorithm must be invoked with the string being assigned into the innerHTML attribute as the input. If the node is an Element node, then, in addition, that element must be passed as the context element.

If this raises an exception, then abort these steps.

Otherwise, let new children be the nodes returned.
If the attribute is being set on a Document node, and that document has an active parser, then abort that parser.
Remove the child nodes of the node whose innerHTML attribute is being set, firing appropriate mutation events.
If the attribute is being set on a Document node, let target document be that Document node. Otherwise, the attribute is being set on an Element node; let target document be the ownerDocument of that Element.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node whose innerHTML attribute is being set, preserving their order, and firing mutation events as if a DocumentFragment containing the new children had been inserted.
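
One consequence of the ordering of the setter steps is that parsing happens before any existing child is removed, so a parse failure leaves the node's contents untouched. The following toy model illustrates this. All names are hypothetical (toySetInnerHTML, toyCommaParser, and the plain-object node shape), the stand-in parser simply splits on commas, and mutation events are not modeled.

```javascript
// Toy model only: nodes are plain objects of the form
// { kind, parent, ownerDocument, children }.
function toySetInnerHTML(node, markup, parseFragment) {
  // Parse first. If this throws, the steps stop before anything is removed.
  const contextElement = node.kind === "element" ? node : null;
  const newChildren = parseFragment(markup, contextElement);

  // Setting innerHTML on a Document aborts its active parser, if any.
  if (node.kind === "document" && node.activeParser) {
    node.activeParser.aborted = true;
  }

  // Remove the existing children.
  for (const old of node.children.splice(0)) old.parent = null;

  // Adopt the new children into the target document and append them in order.
  const targetDocument = node.kind === "document" ? node : node.ownerDocument;
  for (const child of newChildren) {
    child.ownerDocument = targetDocument;
    child.parent = node;
    node.children.push(child);
  }
}

// Hypothetical stand-in for the fragment parsing algorithms: one text node per
// comma-separated piece, and a SYNTAX_ERR for the marker string "!malformed".
function toyCommaParser(markup) {
  if (markup === "!malformed") {
    throw Object.assign(new Error("not well-formed"), { name: "SYNTAX_ERR" });
  }
  return markup.split(",").map((data) => ({
    kind: "text", data, parent: null, ownerDocument: null, children: [],
  }));
}
```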

3.5.6 `outerHTML`

Status: Last call for comments

The outerHTML IDL attribute represents the markup of the element and its contents.

element . outerHTML [ = value ]

Returns a fragment of HTML or XML that represents the element and its contents.

Can be set, to replace the element with nodes parsed from the given string.

In the case of XML documents, will throw an INVALID_STATE_ERR if the element cannot be serialized to XML, and a SYNTAX_ERR if the given string is not well-formed.

Throws a NO_MODIFICATION_ALLOWED_ERR exception if the parent of the element is the Document node.

On getting, if the node's document is an HTML document, then the attribute must return the result of running the HTML fragment serialization algorithm on a fictional node whose only child is the node on which the attribute was invoked; otherwise, the node's document is an XML document, and the attribute must return the result of running the XML fragment serialization algorithm on that fictional node instead (this might raise an exception instead of returning a string).

On setting, the following steps must be run:

Let target be the element whose outerHTML attribute is being set.
If target has no parent node, then abort these steps. There would be no way to obtain a reference to the nodes created even if the remaining steps were run.
If target's parent node is a Document object, throw a NO_MODIFICATION_ALLOWED_ERR exception and abort these steps.
Let parent be target's parent node, unless that is a DocumentFragment node, in which case let parent be an arbitrary body element.
If target's document is an HTML document: Invoke the HTML fragment parsing algorithm.

If target's document is an XML document: Invoke the XML fragment parsing algorithm.

In either case, the algorithm must be invoked with the string being assigned into the outerHTML attribute as the input, and parent as the context element.

If this raises an exception, then abort these steps.

Otherwise, let new children be the nodes returned.
Set the ownerDocument of all the nodes in new children to target's document.
Remove target from its parent node, firing mutation events as appropriate, and then insert in its place all the new children nodes, preserving their order, and again firing mutation events as if a DocumentFragment containing the new children had been inserted.
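
The outerHTML setter steps can be sketched on the same kind of toy tree. The names below (toySetOuterHTML, the parseFragment callback, and the plain-object node shape) are hypothetical, the "arbitrary body element" is represented by a throwaway object, and mutation events are not modeled.

```javascript
// Toy model only: nodes are plain objects of the form
// { kind, name, parent, ownerDocument, children }.
function toySetOuterHTML(target, markup, parseFragment) {
  const actualParent = target.parent;
  // A detached element is left alone: the new nodes would be unreachable.
  if (!actualParent) return;
  if (actualParent.kind === "document") {
    throw Object.assign(new Error("cannot replace the root element"), {
      name: "NO_MODIFICATION_ALLOWED_ERR",
    });
  }

  // The context element is the parent, or an arbitrary body element when the
  // parent is a DocumentFragment.
  const context =
    actualParent.kind === "fragment"
      ? { kind: "element", name: "body", children: [] }
      : actualParent;

  // Parsing may throw, in which case target stays where it is.
  const newChildren = parseFragment(markup, context);
  for (const child of newChildren) child.ownerDocument = target.ownerDocument;

  // Replace target with the new children at the same position, in order.
  const index = actualParent.children.indexOf(target);
  actualParent.children.splice(index, 1, ...newChildren);
  target.parent = null;
  for (const child of newChildren) child.parent = actualParent;
}
```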

3.5.7 `insertAdjacentHTML()`

element . insertAdjacentHTML(position, text)

Parses the given string text as HTML or XML and inserts the resulting nodes into the tree in the position given by the position argument, as follows:

"beforebegin": Before the element itself.
"afterbegin": Just inside the element, before its first child.
"beforeend": Just inside the element, after its last child.
"afterend": After the element itself.

Throws a SYNTAX_ERR exception if the arguments have invalid values (e.g., in the case of XML documents, if the given string is not well-formed).

Throws a NO_MODIFICATION_ALLOWED_ERR exception if the given position isn't possible (e.g. inserting elements after the root element of a Document).

The insertAdjacentHTML(position, text) method, when invoked, must run the following algorithm:

Let position and text be the method's first and second arguments, respectively.
Let target be the element on which the method was invoked.
Use the first matching item from this list:

If position is an ASCII case-insensitive match for the string "beforebegin" or the string "afterend": If target has no parent node, then abort these steps. If target's parent node is a Document object, then throw a NO_MODIFICATION_ALLOWED_ERR exception and abort these steps. Otherwise, let context be the parent node of target.

If position is an ASCII case-insensitive match for the string "afterbegin" or the string "beforeend": Let context be the same as target.

Otherwise: Throw a SYNTAX_ERR exception.
If target's document is an HTML document: Invoke the HTML fragment parsing algorithm.

If target's document is an XML document: Invoke the XML fragment parsing algorithm.

In either case, the algorithm must be invoked with text as the input, and the element selected as context in the previous step as the context element.

If this raises an exception, then abort these steps.

Otherwise, let new children be the nodes returned.
Set the ownerDocument of all the nodes in new children to target's document.
Use the first matching item from this list:

If position is an ASCII case-insensitive match for the string "beforebegin": Insert all the new children nodes immediately before target.

If position is an ASCII case-insensitive match for the string "afterbegin": Insert all the new children nodes before the first child of target, if there is one. If there is no such child, append them all to target.

If position is an ASCII case-insensitive match for the string "beforeend": Append all the new children nodes to target.

If position is an ASCII case-insensitive match for the string "afterend": Insert all the new children nodes immediately after target.

The new children nodes must be inserted in a manner that preserves their order and fires mutation events as if a DocumentFragment containing the new children had been inserted.
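
The algorithm above can be sketched on a toy tree to show how each position maps to a context element and an insertion index. This is a minimal illustration under stated assumptions: toyInsertAdjacent, the parseFragment callback and the plain-object node shape are hypothetical, and mutation events are not modeled.

```javascript
// Toy model only: nodes are plain objects of the form
// { kind, parent, ownerDocument, children }.
function toyInsertAdjacent(target, position, text, parseFragment) {
  // ASCII case-insensitive comparison: lowercase only A-Z.
  const pos = String(position).replace(/[A-Z]/g, (c) => c.toLowerCase());

  // First list: choose the context element, or fail early.
  let context;
  if (pos === "beforebegin" || pos === "afterend") {
    if (!target.parent) return;
    if (target.parent.kind === "document") {
      throw Object.assign(new Error("cannot insert next to the root element"), {
        name: "NO_MODIFICATION_ALLOWED_ERR",
      });
    }
    context = target.parent;
  } else if (pos === "afterbegin" || pos === "beforeend") {
    context = target;
  } else {
    throw Object.assign(new Error("unknown position: " + position), {
      name: "SYNTAX_ERR",
    });
  }

  // Parsing may throw, in which case nothing is inserted.
  const newChildren = parseFragment(text, context);
  for (const child of newChildren) child.ownerDocument = target.ownerDocument;

  // Second list: choose the container and the index to insert at.
  let container;
  let index;
  if (pos === "beforebegin") {
    container = target.parent;
    index = container.children.indexOf(target);
  } else if (pos === "afterbegin") {
    container = target;
    index = 0;
  } else if (pos === "beforeend") {
    container = target;
    index = target.children.length;
  } else {
    container = target.parent;
    index = container.children.indexOf(target) + 1;
  }

  // Insert all new children at once, preserving their order.
  container.children.splice(index, 0, ...newChildren);
  for (const child of newChildren) child.parent = container;
}
```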
