Status: Last call for comments
This specification refers to both HTML and XML attributes and IDL attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and IDL attributes for those defined on IDL interfaces. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.
Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.
For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.
An MPEG4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.
What some specifications, in particular the HTTP and URI specifications, refer to as a representation is referred to in this specification as a resource. [HTTP] [RFC3986]
The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]
A string is a valid MIME type if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616. In particular, a valid MIME type may include MIME type parameters. [HTTP]
A string is a valid MIME type with no parameters if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616, but does not contain any U+003B SEMICOLON characters (;). In other words, it consists only of a type and subtype, with no MIME type parameters. [HTTP]
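The "no parameters" case can be sketched as a pattern match. The following is a minimal illustration, not a normative algorithm: it assumes the token grammar of RFC 2616 for the type and subtype, and the function name is hypothetical.

```python
import re

# RFC 2616 "token": one or more US-ASCII characters, excluding control
# characters and the separators ( ) < > @ , ; : \ " / [ ] ? = { } SP HT.
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"

# A type, a U+002F SOLIDUS, and a subtype, with nothing else.
_TYPE_SUBTYPE = re.compile(_TOKEN + "/" + _TOKEN)


def is_valid_mime_type_with_no_parameters(value: str) -> bool:
    """Return True if value is exactly type "/" subtype (hypothetical helper)."""
    return _TYPE_SUBTYPE.fullmatch(value) is not None
```

Because U+003B SEMICOLON is a separator and so cannot appear in a token, a string carrying parameters fails the match without a separate check.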
The term HTML MIME type is used to refer to the MIME types text/html and text/html-sandboxed.
A resource's critical subresources are those that the resource needs to have available to be correctly processed. Which resources are considered critical or not is defined by the specification that defines the resource's format. For CSS resources, only @import rules introduce critical subresources; other resources, e.g. fonts or backgrounds, are not.
To ease migration from HTML to XHTML, UAs conforming to this specification will place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS. The term "HTML elements", when used in this specification, refers to any element in that namespace, and thus refers to both HTML and XHTML elements.
Except where otherwise stated, all elements defined or mentioned in this specification are in the http://www.w3.org/1999/xhtml namespace, and all attributes defined or mentioned in this specification have no namespace.
Attribute names are said to be XML-compatible if they match the Name production defined in XML, they contain no U+003A COLON characters (:), and their first three characters are not an ASCII case-insensitive match for the string "xml". [XML]
The term XML MIME type is used to refer to the MIME types text/xml, application/xml, and any MIME type whose subtype ends with the four characters "+xml". [RFC3023]
The term root element, when not explicitly qualified as referring to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if it has no ancestors. When the node is a part of the document, then the node's root element is indeed the document's root element; however, if the node is not currently part of the document tree, the root element will be an orphaned node.
When an element's root element is the root element of a Document, it is said to be in a Document. An element is said to have been inserted into a document when its root element changes and is now the document's root element. Analogously, an element is said to have been removed from a document when its root element changes from being the document's root element to being another element.
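The root element and "in a Document" definitions can be illustrated with a toy tree. The Node class below is a hypothetical stand-in for DOM nodes, not the DOM interface itself, and the function names are illustrative.

```python
from typing import Optional


class Node:
    """Toy stand-in for a DOM node (hypothetical, not the DOM interface)."""

    def __init__(self, name: str, is_element: bool = True) -> None:
        self.name = name
        self.is_element = is_element
        self.parent: Optional["Node"] = None
        self.children: list = []

    def append(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def root_element(node: Node) -> Node:
    """Furthest ancestor element of node, or node itself if it has none."""
    root = node
    current = node.parent
    while current is not None:
        if current.is_element:
            root = current  # remember the furthest element seen so far
        current = current.parent
    return root


def is_in_document(element: Node, document: Node) -> bool:
    """True if element's root element is the document's root element."""
    document_root = next((c for c in document.children if c.is_element), None)
    return document_root is not None and root_element(element) is document_root
```

For an element in a detached subtree, root_element returns the orphaned top-most element, so is_in_document is false for it.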
A node's home subtree is the subtree rooted at that node's root element. When a node is in a Document, its home subtree is that Document's tree.
The Document of a Node (such as an element) is the Document that the Node's ownerDocument IDL attribute returns. When a Node is in a Document then that Document is always the Node's Document, and the Node's ownerDocument IDL attribute thus always returns that Document.
The term tree order means a pre-order, depth-first traversal of DOM nodes involved (through the parentNode/childNodes relationship).
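A pre-order, depth-first traversal visits a node before any of its children, and visits the children in order. The sketch below uses a hypothetical Node class with a child_nodes list standing in for the DOM's childNodes.

```python
from typing import Iterator


class Node:
    """Toy node; child_nodes stands in for the DOM's childNodes."""

    def __init__(self, name: str, children=None) -> None:
        self.name = name
        self.child_nodes = list(children or [])


def tree_order(root: Node) -> Iterator[Node]:
    """Yield root and its descendants in pre-order, depth-first order."""
    yield root  # the node itself comes before its descendants
    for child in root.child_nodes:
        yield from tree_order(child)


tree = Node("html", [
    Node("head", [Node("title")]),
    Node("body", [Node("p"), Node("div")]),
])
names = [node.name for node in tree_order(tree)]
```

Here names lists html first, then head and its title child, and only then body with its children.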
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM.
The term text node refers to any Text node, including CDATASection nodes; specifically, any Node with node type TEXT_NODE (3) or CDATA_SECTION_NODE (4). [DOMCORE]
A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.
The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".
An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then the attributes and methods on that object operate on the actual underlying data, not a snapshot of the data.
The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. The term trusted event is used as defined by the DOM Events specification. [DOMEVENTS]
The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.
Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.
A user agent must not consider the types text/plain and application/octet-stream as having a registered plugin.
One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separately from the user agent (as opposed to using the same interface) is not a plugin by this definition.
This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. [NPAPI]
Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.
Status: Last call for comments. ISSUE-101 (us-ascii-ref) blocks progress to Last Call
The preferred MIME name of a character encoding is the name or alias labeled as "preferred MIME name" in the IANA Character Sets registry, if there is one, or the encoding's name, if none of the aliases are so labeled. [IANACHARSET]
An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII). [RFC1345]
This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-16, GSM03.38, and EBCDIC variants.
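One part of this definition can be probed mechanically: decode each listed byte on its own and compare the result with its US-ASCII meaning. The sketch below does only that. It assumes Python's codec names, the function name is hypothetical, and a true result is only a hint, since an encoding can pass this single-byte test and still fall outside the definition.

```python
# Bytes that the definition requires to keep their US-ASCII meaning.
_REQUIRED_BYTES = (
    [0x09, 0x0A, 0x0C, 0x0D, 0x26, 0x27]
    + list(range(0x20, 0x23))   # 0x20 - 0x22
    + list(range(0x2C, 0x40))   # 0x2C - 0x3F
    + list(range(0x41, 0x5B))   # 0x41 - 0x5A
    + list(range(0x61, 0x7B))   # 0x61 - 0x7A
)


def passes_single_byte_check(encoding: str) -> bool:
    """Necessary-but-not-sufficient probe (hypothetical helper)."""
    for value in _REQUIRED_BYTES:
        raw = bytes([value])
        try:
            if raw.decode(encoding) != raw.decode("ascii"):
                return False
        except UnicodeDecodeError:
            # The byte is not a complete sequence on its own.
            return False
    return True
```

Under these assumptions, passes_single_byte_check("shift_jis") is true, while "utf-16" and the EBCDIC variant "cp037" fail the probe.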
The term Unicode character is used to mean a Unicode scalar value (i.e. any Unicode code point that is not a surrogate code point). [UNICODE]
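In code point terms, this excludes exactly the surrogate range U+D800 to U+DFFF. A minimal sketch, with a hypothetical function name:

```python
def is_unicode_character(code_point: int) -> bool:
    """True if code_point is a Unicode scalar value."""
    # Code points run from U+0000 to U+10FFFF.
    if not 0 <= code_point <= 0x10FFFF:
        return False
    # Surrogate code points (U+D800 to U+DFFF) are not scalar values.
    return not 0xD800 <= code_point <= 0xDFFF
```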
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
This specification describes the conformance criteria for documents.
Conforming documents are those that comply with all the conformance criteria for documents. For readability, some of these conformance requirements are phrased as conformance requirements on authors; such requirements are implicitly requirements on documents: by definition, all documents are assumed to have had an author. (In some cases, that author may itself be a user agent — such user agents are subject to additional rules, as explained below.)
For example, if a requirement states that "authors must not use the foobar element", it would imply that documents are not allowed to contain elements named foobar.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as the XHTML syntax), and one using a custom format inspired by SGML (referred to as the HTML syntax).
Status: Last call for comments. ISSUE-41 (Decentralized-extensibility) blocks progress to Last Call
HTML has a wide range of extensibility mechanisms that can be used for adding semantics in a safe manner:
Authors can use the class attribute to extend elements, effectively creating their own elements, while using the most applicable existing "real" HTML element, so that browsers and other tools that don't know of the extension can still support it somewhat well. This is the tack used by Microformats, for example.
Authors can use data-*="" attributes. These are guaranteed to never be touched by browsers, and allow scripts to include data on HTML elements that scripts can then look for and process.
Authors can use the <meta name="" content=""> mechanism to include page-wide metadata by registering extensions to the predefined set of metadata names.
Authors can use the rel="" mechanism to annotate links with specific meanings by registering extensions to the predefined set of link types. This is also used by Microformats.
Authors can use the <script type=""> mechanism with a custom type, for further handling by inline or server-side scripts.
Authors can use the embed element. This is how Flash works.
Authors can use the item="" and itemprop="" attributes to embed nested name-value pairs of data to be shared with other applications and sites.
Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.
Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.
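A minimal sketch of this comparison folds only the 26 letters A to Z before comparing. The function names are hypothetical. A general-purpose lowercasing routine is deliberately avoided, because it would also fold characters outside the ASCII range.

```python
def ascii_lowercase(text: str) -> str:
    """Map U+0041..U+005A to U+0061..U+007A; leave everything else alone."""
    return "".join(
        chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in text
    )


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Compare code point for code point after folding A-Z only."""
    return ascii_lowercase(a) == ascii_lowercase(b)
```

For example, "HTML" matches "html", but U+212A KELVIN SIGN does not match "k", because only the ASCII letters are folded.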
Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings. [UNICODE]
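As an illustration, the operation can be approximated with Python's standard library. This sketch assumes the formulation in the Unicode Standard's default caseless matching rules, NFKD(toCasefold(NFKD(toCasefold(NFD(X))))), and uses str.casefold for full case folding. The results depend on the Unicode version of the Python build, and the function names are hypothetical.

```python
import unicodedata


def _compatibility_fold(text: str) -> str:
    """Apply NFKD(toCasefold(NFKD(toCasefold(NFD(text)))))."""
    decomposed = unicodedata.normalize("NFD", text)
    first_pass = unicodedata.normalize("NFKD", decomposed.casefold())
    return unicodedata.normalize("NFKD", first_pass.casefold())


def compatibility_caseless_match(a: str, b: str) -> bool:
    """Two strings match if their folded forms are identical."""
    return _compatibility_fold(a) == _compatibility_fold(b)
```

Unlike the ASCII case-insensitive comparison, this treats compatibility variants as equal, so a fullwidth "Ｈ" (U+FF28) matches "h".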
A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.
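The prefix match definition depends on which kind of comparison is in effect, so the sketch below takes the comparison as a parameter. The function name and the matches parameter are hypothetical, and length is assumed to be counted in code points, as Python's len does for str.

```python
from typing import Callable


def is_prefix_match(
    pattern: str,
    s: str,
    matches: Callable[[str, str], bool] = lambda a, b: a == b,
) -> bool:
    """True if pattern is a prefix match for s under the given comparison.

    The default comparison is case-sensitive (exact) matching.
    """
    # The pattern must not be longer than the string.
    if len(pattern) > len(s):
        return False
    # Truncate s to the pattern's length, then compare the two.
    return matches(pattern, s[: len(pattern)])
```

Passing a different comparison, for example one that folds ASCII case, gives the corresponding case-insensitive prefix match.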