This is revision 1.5612.
Each element defined in this specification has a content model: a description of the element's expected contents. An HTML element must have contents that match the requirements described in the element's content model.
The space characters are always allowed between elements. User agents represent these characters between elements in the source markup as Text nodes in the DOM. Empty Text nodes and Text nodes consisting of just sequences of those characters are considered inter-element whitespace.
Inter-element whitespace, comment nodes, and processing instruction nodes must be ignored when establishing whether an element's contents match the element's content model or not, and must be ignored when following algorithms that define document and element semantics.
Thus, an element A is said to be preceded or followed by a second element B if A and B have the same parent node and there are no other element nodes or Text nodes (other than inter-element whitespace) between them. Similarly, a node is the only child of an element if that element contains no other nodes other than inter-element whitespace, comment nodes, and processing instruction nodes.
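The filtering rule above can be sketched in script. This is a minimal illustration, not a normative algorithm: it assumes nodes are plain objects with a nodeType, a data string for Text nodes, and a childNodes array (a real NodeList would need converting with Array.from first). The function names are hypothetical.

```javascript
// Node type values as defined on the DOM Node interface.
const TEXT_NODE = 3;
const PROCESSING_INSTRUCTION_NODE = 7;
const COMMENT_NODE = 8;

// The space characters: space, tab, line feed, form feed, carriage return.
// The pattern also matches the empty string, so empty Text nodes qualify.
const SPACE_CHARACTERS_ONLY = /^[ \t\n\f\r]*$/;

// True for an empty Text node or one consisting only of space characters.
function isInterElementWhitespace(node) {
  return node.nodeType === TEXT_NODE && SPACE_CHARACTERS_ONLY.test(node.data);
}

// The children that are taken into account when matching a content model:
// everything except inter-element whitespace, comments, and processing
// instructions. Assumes parent.childNodes is a plain array.
function significantChildren(parent) {
  return parent.childNodes.filter(
    (node) =>
      !isInterElementWhitespace(node) &&
      node.nodeType !== COMMENT_NODE &&
      node.nodeType !== PROCESSING_INSTRUCTION_NODE
  );
}

// True if node is the only child of parent in the sense described above.
function isOnlyChild(parent, node) {
  const children = significantChildren(parent);
  return children.length === 1 && children[0] === node;
}
```

With this sketch, an element whose children are a line break, a comment, and a single td would report that td as its only child.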
Authors must not use HTML elements anywhere except where they are explicitly allowed, as defined for each element, or as explicitly required by other specifications. For XML compound documents, these contexts could be inside elements from other namespaces, if those elements are defined as providing the relevant contexts.
For example, the Atom specification defines a content element. When its type attribute has the value xhtml, the Atom specification requires that it contain a single HTML div element. Thus, a div element is allowed in that context, even though this is not explicitly normatively stated by this specification. [ATOM]
In addition, HTML elements may be orphan nodes (i.e. without a parent node).
For example, creating a td element and storing it in a global variable in a script is conforming, even though td elements are otherwise only supposed to be used inside tr elements.
var data = {
  name: "Banana",
  cell: document.createElement('td'),
};
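Each element in HTML falls into zero or more categories that group elements with similar characteristics together. The following broad categories are used in this specification:
Metadata content
Flow content
Sectioning content
Heading content
Phrasing content
Embedded content
Interactive content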
Some elements also fall into other categories, which are defined in other parts of this specification.
These categories are related as follows:
Sectioning content, heading content, phrasing content, embedded content, and interactive content are all types of flow content. Metadata is sometimes flow content. Metadata and interactive content are sometimes phrasing content. Embedded content is also a type of phrasing content, and sometimes is interactive content.
Other categories are also used for specific purposes, e.g. form controls are specified using a number of categories to define common requirements. Some elements have unique requirements and do not fit into any particular category.
Metadata content is content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information.
Elements from other namespaces whose semantics are primarily metadata-related (e.g. RDF) are also metadata content.
Thus, in the XML serialization, one can use RDF, like this:
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <head>
  <title>Hedral's Home Page</title>
  <r:RDF>
   <Person xmlns="http://www.w3.org/2000/10/swap/pim/contact#"
           r:about="http://hedral.example.com/#">
    <fullName>Cat Hedral</fullName>
    <mailbox r:resource="mailto:hedral@damowmow.com"/>
    <personalTitle>Sir</personalTitle>
   </Person>
  </r:RDF>
 </head>
 <body>
  <h1>My home page</h1>
  <p>I like playing with string, I guess. Sister says squirrels are fun
  too so sometimes I follow her to play with them.</p>
 </body>
</html>
This isn't possible in the HTML serialization, however.
Most elements that are used in the body of documents and applications are categorized as flow content.
a
abbr
address
area (if it is a descendant of a map element)
article
aside
audio
b
bdi
bdo
blockquote
br
button
canvas
cite
code
command
datalist
del
details
dfn
div
dl
em
embed
fieldset
figure
footer
form
h1
h2
h3
h4
h5
h6
header
hgroup
hr
i
iframe
img
input
ins
kbd
keygen
label
map
mark
math
menu
meter
nav
noscript
object
ol
output
p
pre
progress
q
ruby
s
samp
script
section
select
small
span
strong
style (if the scoped attribute is present)
sub
sup
svg
table
textarea
time
u
ul
var
video
wbr
Sectioning content is content that defines the scope of headings and footers.
Each sectioning content element potentially has a heading and an outline. See the section on headings and sections for further details.
There are also certain elements that are sectioning roots. These are distinct from sectioning content, but they can also have an outline.
Heading content defines the header of a section (whether explicitly marked up using sectioning content elements, or implied by the heading content itself).
Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs.
a (if it contains only phrasing content)
abbr
area (if it is a descendant of a map element)
audio
b
bdi
bdo
br
button
canvas
cite
code
command
datalist
del (if it contains only phrasing content)
dfn
em
embed
i
iframe
img
input
ins (if it contains only phrasing content)
kbd
keygen
label
map (if it contains only phrasing content)
mark
math
meter
noscript
object
output
progress
q
ruby
s
samp
script
select
small
span
strong
sub
sup
svg
textarea
time
u
var
video
wbr
As a general rule, elements whose content model allows any phrasing content should have either at least one descendant Text node that is not inter-element whitespace, or at least one descendant element node that is embedded content. For the purposes of this requirement, nodes that are descendants of del elements must not be counted as contributing to the ancestors of the del element.
Most elements that are categorized as phrasing content can only contain elements that are themselves categorized as phrasing content, not any flow content.
Text, in the context of content models, means Text nodes. Text is sometimes used as a content model on its own, but is also phrasing content, and can be inter-element whitespace (if the Text nodes are empty or contain just space characters).
Text nodes and attribute values must consist of Unicode characters, must not contain U+0000 characters, must not contain permanently undefined Unicode characters (noncharacters), and must not contain control characters other than space characters.
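A checker could test these character constraints roughly as follows. This is a sketch under stated assumptions: "control characters" is taken to mean U+0000 to U+001F and U+007F to U+009F, "Unicode characters" is taken to exclude lone surrogates, and the function names are hypothetical. The context-specific constraints mentioned elsewhere in this specification are not covered.

```javascript
// Control characters that are also space characters and therefore allowed:
// tab, line feed, form feed, carriage return.
const ALLOWED_CONTROLS = new Set([0x09, 0x0a, 0x0c, 0x0d]);

// Noncharacters: U+FDD0..U+FDEF plus the last two code points of every plane.
function isNoncharacter(codePoint) {
  return (
    (codePoint >= 0xfdd0 && codePoint <= 0xfdef) ||
    (codePoint & 0xfffe) === 0xfffe
  );
}

// Assumption: control characters are the C0 range, DELETE, and the C1 range.
function isControl(codePoint) {
  return codePoint <= 0x1f || (codePoint >= 0x7f && codePoint <= 0x9f);
}

// An unpaired surrogate surfaces as a single code unit in this range.
function isLoneSurrogate(codePoint) {
  return codePoint >= 0xd800 && codePoint <= 0xdfff;
}

// True if value could be the data of a Text node or an attribute value
// under the general constraints described above.
function isConformingText(value) {
  // Iterating a string with for...of walks it by code point, so valid
  // surrogate pairs arrive as one two-unit string.
  for (const character of value) {
    const codePoint = character.codePointAt(0);
    if (isLoneSurrogate(codePoint)) return false;
    if (isNoncharacter(codePoint)) return false;
    // U+0000 is rejected here too: it is a control and not a space character.
    if (isControl(codePoint) && !ALLOWED_CONTROLS.has(codePoint)) return false;
  }
  return true;
}
```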
This specification includes extra constraints on the exact value of Text nodes and attribute values depending on their precise context.
Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document.
Elements that are from namespaces other than the HTML namespace and that convey content but not metadata are embedded content for the purposes of the content models defined in this specification. (For example, MathML or SVG.)
Some embedded content elements can have fallback content: content that is to be used when the external resource cannot be used (e.g. because it is of an unsupported format). The element definitions state what the fallback is, if any.
Interactive content is content that is specifically intended for user interaction.
a
audio (if the controls attribute is present)
button
details
embed
iframe
img (if the usemap attribute is present)
input (if the type attribute is not in the Hidden state)
keygen
label
menu (if the type attribute is in the toolbar state)
object (if the usemap attribute is present)
select
textarea
video (if the controls attribute is present)
Certain elements in HTML have an activation behavior, which means that the user can activate them. This triggers a sequence of events dependent on the activation mechanism, and normally culminating in a click event, as described below.
The user agent should allow the user to manually trigger elements that have an activation behavior, for instance using keyboard or voice input, or through mouse clicks. When the user triggers an element with a defined activation behavior in a manner other than clicking it, the default action of the interaction event must be to run synthetic click activation steps on the element.
Each element has a click in progress flag, initially set to false.
When a user agent is to run synthetic click activation steps on an element, the user agent must run the following steps:
If the element's click in progress flag is set to true, then abort these steps.
Set the click in progress flag on the element to true.
Run pre-click activation steps on the element.
Fire a click event at the element.
If this click event is not canceled, run post-click activation steps on the element.
If the event is canceled, the user agent must run canceled activation steps on the element instead.
Set the click in progress flag on the element to false.
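The synthetic click activation steps above can be sketched as follows. This is an illustration only: the element is modelled as a plain object, and the property names (clickInProgress, preClick, postClick, canceledActivation, dispatchClick) are hypothetical stand-ins for the flag and the per-element steps, not DOM APIs. dispatchClick is assumed to fire the click event and return true if it was canceled.

```javascript
// Runs the synthetic click activation steps on a mock element.
function runSyntheticClickActivationSteps(element) {
  // Abort if a click is already being processed for this element.
  if (element.clickInProgress) return;
  element.clickInProgress = true;

  // Pre-click activation steps, if the element defines any.
  if (element.preClick) element.preClick();

  // Fire the click event; the mock reports whether it was canceled.
  const canceled = element.dispatchClick();

  if (!canceled) {
    // Post-click activation steps run the element's activation behavior.
    if (element.postClick) element.postClick();
  } else if (element.canceledActivation) {
    // Canceled activation steps run instead when the event was canceled.
    element.canceledActivation();
  }

  element.clickInProgress = false;
}
```

Because the flag is checked first, a click handler that re-enters these steps on the same element returns immediately instead of recursing.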
When a pointing device is clicked, the user agent must run these steps:
If the element's click in progress flag is set to true, then abort these steps.
Set the click in progress flag on the element to true.
Let e be the nearest activatable element of the element designated by the user (defined below), if any.
If there is an element e, run pre-click activation steps on it.
Dispatch the required click event.
If there is an element e and the click event is not canceled, run post-click activation steps on element e.
If there is an element e and the event is canceled, run canceled activation steps on element e.
Set the click in progress flag on the element to false.
The above doesn't happen for arbitrary synthetic events dispatched by author script. However, the click() method can be used to make it happen programmatically.
Click-focusing behavior (e.g. the focusing of a text field when the user clicks in one) typically happens before the click, when the mouse button is first depressed, and is therefore not discussed here.
Given an element target, the nearest activatable element is the element returned by the following algorithm:
If target has a defined activation behavior, then return target and abort these steps.
If target has a parent element, then set target to that parent element and return to the first step.
Otherwise, there is no nearest activatable element.
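The algorithm above is a walk up the ancestor chain, which can be written as a loop. In this sketch the hasActivationBehavior property is a hypothetical flag standing in for "has a defined activation behavior", and returning null represents "no nearest activatable element".

```javascript
// Returns the nearest activatable element of target, or null if none exists.
function nearestActivatableElement(target) {
  while (target) {
    // The first element on the chain with an activation behavior wins.
    if (target.hasActivationBehavior) return target;
    // Otherwise continue with the parent element, if there is one.
    target = target.parentElement;
  }
  return null;
}
```

For instance, clicking an em element nested inside a link would resolve to the link, since the em element itself has no activation behavior.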
When a user agent is to run pre-click activation steps on an element, it must run the pre-click activation steps defined for that element, if any.
When a user agent is to run canceled activation steps on an element, it must run the canceled activation steps defined for that element, if any.
When a user agent is to run post-click activation steps on an element, it must run the activation behavior defined for that element, if any. Activation behaviors can refer to the click event that was fired by the steps above leading up to this point.
As a general rule, elements whose content model allows any flow content or phrasing content should have at least one child node that is palpable content and that does not have the hidden attribute specified.
This requirement is not a hard requirement, however, as there are many cases where an element can be empty legitimately, for example when it is used as a placeholder which will later be filled in by a script, or when the element is part of a template and would on most pages be filled in but on some pages is not relevant.
Conformance checkers are encouraged to provide a mechanism for authors to find elements that fail to fulfill this requirement, as an authoring aid.
The following elements are palpable content:
a
abbr
address
article
aside
audio (if the controls attribute is present)
b
bdi
bdo
blockquote
button
canvas
cite
code
details
dfn
div
dl (if the element's children include at least one name-value group)
em
embed
fieldset
figure
footer
form
h1
h2
h3
h4
h5
h6
header
hgroup
i
iframe
img
input (if the type attribute is not in the Hidden state)
ins
kbd
keygen
label
map
mark
math
menu (if the type attribute is in the toolbar state or the list state)
meter
nav
object
ol (if the element's children include at least one li element)
output
p
pre
progress
q
ruby
s
samp
section
select
small
span
strong
sub
sup
svg
table
textarea
time
u
ul (if the element's children include at least one li element)
var
video
Some elements are described as transparent; they have "transparent" in the description of their content model. The content model of a transparent element is derived from the content model of its parent element: the elements required in the part of the content model that is "transparent" are the same elements as required in the part of the content model of the parent of the transparent element in which the transparent element finds itself.
For instance, an ins element inside a ruby element cannot contain an rt element, because the part of the ruby element's content model that allows ins elements is the part that allows phrasing content, and the rt element is not phrasing content.
In some cases, where transparent elements are nested in each other, the process has to be applied iteratively.
Consider the following markup fragment:
<p><object><param><ins><map><a href="/">Apples</a></map></ins></object></p>
To check whether "Apples" is allowed inside the a element, the content models are examined. The a element's content model is transparent, as is the map element's, as is the ins element's, as is the part of the object element's in which the ins element is found. The object element is found in the p element, whose content model is phrasing content. Thus, "Apples" is allowed, as text is phrasing content.
When a transparent element has no parent, then the part of its content model that is "transparent" must instead be treated as accepting any flow content.
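The iterative resolution described above can be sketched as a walk up the tree. This is a deliberate simplification: each node carries a single hypothetical model string and a parent reference, so elements such as object, whose content model is only partly transparent, are treated as wholly transparent.

```javascript
// Resolves the content model that effectively applies inside element.
// Nodes are plain objects: { model: "transparent" | "phrasing" | ..., parent }.
function effectiveContentModel(element) {
  let current = element;
  while (current.model === "transparent") {
    // A transparent element with no parent accepts any flow content.
    if (!current.parent) return "flow";
    // Otherwise the model is inherited from the parent, which may itself
    // be transparent, so the lookup repeats.
    current = current.parent;
  }
  return current.model;
}
```

Applied to the "Apples" fragment, the walk passes through a, map, ins, and object before stopping at p, giving phrasing content.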
The term paragraph as defined in this section is used for more than just the definition of the p element. The paragraph concept defined here is used to describe how to interpret documents. The p element is merely one of several ways of marking up a paragraph.
A paragraph is typically a run of phrasing content that forms a block of text with one or more sentences that discuss a particular topic, as in typography, but can also be used for more general thematic grouping. For instance, an address is also a paragraph, as is a part of a form, a byline, or a stanza in a poem.
In the following example, there are two paragraphs in a section. There is also a heading, which contains phrasing content that is not a paragraph. Note how the comments and inter-element whitespace do not form paragraphs.
<section>
  <h1>Example of paragraphs</h1>
  This is the <em>first</em> paragraph in this example.
  <p>This is the second.</p>
  <!-- This is not a paragraph. -->
</section>
Paragraphs in flow content are defined relative to what the document looks like without the a, ins, del, and map elements complicating matters, since those elements, with their hybrid content models, can straddle paragraph boundaries, as shown in the first two examples below.
Generally, having elements straddle paragraph boundaries is best avoided. Maintaining such markup can be difficult.
The following example takes the markup from the earlier example and puts ins and del elements around some of the markup to show that the text was changed (though in this case, the changes admittedly don't make much sense). Notice how this example has exactly the same paragraphs as the previous one, despite the ins and del elements: the ins element straddles the heading and the first paragraph, and the del element straddles the boundary between the two paragraphs.
<section>
  <ins><h1>Example of paragraphs</h1>
  This is the <em>first</em> paragraph in</ins> this example<del>.
  <p>This is the second.</p></del>
  <!-- This is not a paragraph. -->
</section>
Let view be a view of the DOM that replaces all a, ins, del, and map elements in the document with their contents. Then, in view, for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content as well as phrasing content, let first be the first node of the run, and let last be the last node of the run. For each such run that consists of at least one node that is neither embedded content nor inter-element whitespace, a paragraph exists in the original DOM from immediately before first to immediately after last. (Paragraphs can thus span across a, ins, del, and map elements.)
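The run-finding rule above might be sketched as follows for a single container. Several assumptions apply: nodes are plain objects of the hypothetical form { type, name, data, children }; the PHRASING and EMBEDDED sets are small illustrative subsets rather than the full category lists; the container is assumed to accept flow content; and comment nodes are skipped, since they are ignored when determining semantics. Paragraphs are returned as runs of nodes in the view rather than as ranges in the original DOM.

```javascript
// Elements that the view replaces with their contents.
const STRADDLING = new Set(["a", "ins", "del", "map"]);
// Illustrative subsets only; the real categories are much larger.
const PHRASING = new Set(["em", "strong", "span", "code", "img", "object"]);
const EMBEDDED = new Set(["img", "object"]);

const isText = (node) => node.type === "text";
const isWhitespace = (node) => isText(node) && /^[ \t\n\f\r]*$/.test(node.data);
const isPhrasing = (node) =>
  isText(node) || (node.type === "element" && PHRASING.has(node.name));
const isEmbedded = (node) => node.type === "element" && EMBEDDED.has(node.name);

// Children of element as seen in the view: a, ins, del, and map elements
// are replaced by their contents, and comments are dropped.
function viewChildren(element) {
  const result = [];
  for (const child of element.children || []) {
    if (child.type === "comment") continue;
    if (child.type === "element" && STRADDLING.has(child.name)) {
      result.push(...viewChildren(child));
    } else {
      result.push(child);
    }
  }
  return result;
}

// Returns the implied paragraphs directly inside container, each as the
// array of view nodes from first to last.
function impliedParagraphs(container) {
  const paragraphs = [];
  let run = [];
  const closeRun = () => {
    // A run only forms a paragraph if it holds at least one node that is
    // neither embedded content nor inter-element whitespace.
    if (run.some((node) => !isEmbedded(node) && !isWhitespace(node))) {
      paragraphs.push(run);
    }
    run = [];
  };
  for (const node of viewChildren(container)) {
    if (isPhrasing(node)) run.push(node);
    else closeRun(); // any other content interrupts the current run
  }
  closeRun();
  return paragraphs;
}
```

In this sketch, the section from the earlier examples yields one implied paragraph with or without the ins and del elements; the second paragraph in those examples comes from the explicit p element.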
Conformance checkers may warn authors of cases where they have paragraphs that overlap each other (this can happen with object, video, audio, and canvas elements, and indirectly through elements in other namespaces that allow HTML to be further embedded therein, like svg or math).
A paragraph is also formed explicitly by p elements.
The p element can be used to wrap individual paragraphs when there would otherwise not be any content other than phrasing content to separate the paragraphs from each other.
In the following example, the link spans half of the first paragraph, all of the heading separating the two paragraphs, and half of the second paragraph. It straddles the paragraphs and the heading.
<header>
 Welcome!
 <a href="about.html">
  This is home of...
  <h1>The Falcons!</h1>
  The Lockheed Martin multirole jet fighter aircraft!
 </a>
 This page discusses the F-16 Fighting Falcon's innermost secrets.
</header>
Here is another way of marking this up, this time showing the paragraphs explicitly, and splitting the one link element into three:
<header>
 <p>Welcome! <a href="about.html">This is home of...</a></p>
 <h1><a href="about.html">The Falcons!</a></h1>
 <p><a href="about.html">The Lockheed Martin multirole jet fighter aircraft!</a>
 This page discusses the F-16 Fighting Falcon's innermost secrets.</p>
</header>
It is possible for paragraphs to overlap when using certain elements that define fallback content. For example, in the following section:
<section>
 <h1>My Cats</h1>
 You can play with my cat simulator.
 <object data="cats.sim">
  To see the cat simulator, use one of the following links:
  <ul>
   <li><a href="cats.sim">Download simulator file</a>
   <li><a href="http://sims.example.com/watch?v=LYds5xY4INU">Use online simulator</a>
  </ul>
  Alternatively, upgrade to the Mellblom Browser.
 </object>
 I'm quite proud of it.
</section>
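There are five paragraphs:
The paragraph that says "You can play with my cat simulator. object I'm quite proud of it.", where object is the object element.
The paragraph that says "To see the cat simulator, use one of the following links:".
The paragraph that says "Download simulator file".
The paragraph that says "Use online simulator".
The paragraph that says "Alternatively, upgrade to the Mellblom Browser.".
The first paragraph is overlapped by the other four. A user agent that supports the "cats.sim" resource will only show the first one, but a user agent that shows the fallback will confusingly show the first sentence of the first paragraph as if it was in the same paragraph as the second one, and will show the last paragraph as if it was at the start of the second sentence of the first paragraph.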
To avoid this confusion, explicit p elements can be used.