The meta element

Contexts in which this element can be used:
If the charset attribute is present, or if the element's http-equiv attribute is in the Encoding declaration state: in a head element.
If the http-equiv attribute is present but not in the Encoding declaration state: in a head element.
If the http-equiv attribute is present but not in the Encoding declaration state: in a noscript element that is a child of a head element.
If the name attribute is present: where metadata content is expected.

Content attributes:
name
http-equiv
content
charset

DOM interface:
interface HTMLMetaElement : HTMLElement {
  attribute DOMString name;
  attribute DOMString httpEquiv;
  attribute DOMString content;
};
The meta element represents various kinds of metadata that cannot be expressed using the title, base, link, style, and script elements.
The meta
element can represent document-level
metadata with the name
attribute, pragma directives with the http-equiv
attribute, and the
file's character encoding declaration when an HTML
document is serialized to string form (e.g. for transmission over
the network or for disk storage) with the charset
attribute.
Exactly one of the name, http-equiv, and charset attributes must be specified.
If either name
or http-equiv
is specified, then
the content
attribute must
also be specified. Otherwise, it must be omitted.
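The two attribute rules above can be checked mechanically. The following is a minimal sketch, not a conformance checker: it assumes the element's attributes have already been parsed into a dictionary with lowercase attribute names, and the function name `check_meta_attributes` is hypothetical.

```python
def check_meta_attributes(attrs: dict) -> list:
    """Return a list of authoring problems for one meta element's attributes."""
    problems = []

    # Exactly one of name, http-equiv, and charset must be specified.
    kinds = [a for a in ("name", "http-equiv", "charset") if a in attrs]
    if len(kinds) != 1:
        problems.append(
            f"expected exactly one of name, http-equiv, charset; found {len(kinds)}"
        )

    # content is required with name or http-equiv, and must be omitted otherwise.
    needs_content = "name" in attrs or "http-equiv" in attrs
    if needs_content and "content" not in attrs:
        problems.append("content is required when name or http-equiv is specified")
    if not needs_content and "content" in attrs:
        problems.append("content must be omitted without name or http-equiv")

    return problems
```

For example, `check_meta_attributes({"charset": "utf-8"})` reports no problems, while a bare `{"name": "author"}` is flagged for its missing content attribute.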
The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present in an XML document, its value must be an ASCII case-insensitive match for the string "UTF-8" (and the document is therefore forced to use UTF-8 as its encoding).
The charset
attribute on the meta
element has no effect in XML
documents, and is only allowed in order to facilitate migration to
and from XHTML.
There must not be more than one meta
element with a
charset
attribute per
document.
The content
attribute gives the value of the document metadata or pragma
directive when the element is used for those purposes. The allowed
values depend on the exact context, as described in subsequent
sections of this specification.
If a meta
element has a name
attribute, it sets
document metadata. Document metadata is expressed in terms of
name-value pairs, the name
attribute on the meta
element giving the name, and the
content
attribute on the same
element giving the value. The name specifies what aspect of metadata
is being set; valid names and the meaning of their values are
described in the following sections. If a meta
element
has no content
attribute,
then the value part of the metadata name-value pair is the empty
string.
The name and content IDL attributes must reflect the respective content attributes of the same name. The IDL attribute httpEquiv must reflect the content attribute http-equiv.
This specification defines a few names for the name
attribute of the
meta
element.
Names are case-insensitive, and must be compared in an ASCII case-insensitive manner.
application-name
The value must be a short free-form string giving the name of the Web application that the page represents. If the page is not a Web application, the application-name metadata name must not be used. There must not be more than one meta element with its name attribute set to the value application-name per document. User agents may use the application name in UI in preference to the page's title, since the title might include status messages and the like relevant to the status of the page at a particular moment in time instead of just being the name of the application.
author
The value must be a free-form string giving the name of one of the page's authors.
description
The value must be a free-form string that describes the
page. The value must be appropriate for use in a directory of
pages, e.g. in a search engine. There must not be more than one
meta
element with its name
attribute set to the value description
per document.
generator
The value must be a free-form string that identifies one of the software packages used to generate the document.
Here is what a tool called "Frontweaver" could include in its
output, in the page's head
element, to identify
itself as the tool used to generate the page:
<meta name=generator content="Frontweaver 8.2">
keywords
The value must be a set of comma-separated tokens, each of which is a keyword relevant to the page.
This page about typefaces on British motorways uses a
meta
element to specify some keywords that users
might use to look for the page:
<!DOCTYPE HTML>
<html>
 <head>
  <title>Typefaces on UK motorways</title>
  <meta name="keywords" content="british,type face,font,fonts,highway,highways">
 </head>
 <body>
  ...
Many search engines do not consider such keywords, because this feature has historically been used unreliably and even misleadingly to spam search engine results in ways that do not help users.
To obtain the list of keywords that the author has specified as applicable to the page, the user agent must run the following steps:
1. Let keywords be an empty list.
2. For each meta element with a name attribute and a content attribute and whose name attribute's value is keywords, run the following substeps:
   1. Split the value of the element's content attribute on commas.
   2. Add the resulting tokens, if any, to keywords.
3. Remove any duplicates from keywords.
4. Return keywords. This is the list of keywords that the author has specified as applicable to the page.
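The keyword-collection steps above can be sketched as follows. This is an illustration under stated assumptions rather than a normative implementation: each meta element is modelled as a dictionary of its attributes, the name `collect_keywords` is hypothetical, the name comparison is ASCII case-insensitive as required for metadata names, and the sketch trims space characters around each token and drops empty tokens, which is a simplification of splitting on commas.

```python
import string

SPACE_CHARACTERS = " \t\n\f\r"
# Translation table for ASCII-only lowercasing (non-ASCII letters are untouched).
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def collect_keywords(metas: list) -> list:
    """Gather the author's keywords from a list of meta attribute dictionaries."""
    keywords = []
    for attrs in metas:
        if "name" not in attrs or "content" not in attrs:
            continue
        if attrs["name"].translate(ASCII_LOWER) != "keywords":
            continue
        # Split on commas; trimming and dropping empties is a simplification.
        for token in attrs["content"].split(","):
            token = token.strip(SPACE_CHARACTERS)
            if token:
                keywords.append(token)
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(keywords))
```

Applied to the motorway example, the result is the six tokens in source order, with "type face" kept as a single keyword because only commas separate tokens.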
User agents should not use this information when there is insufficient confidence in the reliability of the value.
For instance, it would be reasonable for a content management system to use the keyword information of pages within the system to populate the index of a site-specific search engine, but a large-scale content aggregator that used this information would likely find that certain users would try to game its ranking mechanism through the use of inappropriate keywords.
Extensions to the predefined set of metadata names may be registered in the WHATWG Wiki MetaExtensions page. [WHATWGWIKI]
Anyone is free to edit the WHATWG Wiki MetaExtensions page at any time to add a type. These new names must be specified with the following information:
The actual name being defined. The name should not be confusingly similar to any other defined name (e.g. differing only in case).
A short non-normative description of what the metadata name's meaning is, including the format the value is required to be in.
A list of other names that have exactly the same processing requirements. Authors should not use the names defined to be synonyms; they are only intended to allow user agents to support legacy content. Anyone may remove synonyms that are not used in practice; only names that need to be processed as synonyms for compatibility with legacy content are to be registered in this way.
The status of the name, which is one of "proposed", "ratified", or "discontinued".
If a metadata name is found to be redundant with existing values, it should be removed and listed as a synonym for the existing value.
If a metadata name is registered in the "proposed" state for a period of a month or more without being used or specified, then it may be removed from the registry.
If a metadata name is added with the "proposed" status and found to be redundant with existing values, it should be removed and listed as a synonym for the existing value. If a metadata name is added with the "proposed" status and found to be harmful, then it should be changed to "discontinued" status.
Anyone can change the status at any time, but should only do so in accordance with the definitions above.
Conformance checkers may use the information given on the WHATWG Wiki MetaExtensions page to establish if a value is allowed or not: values defined in this specification or marked as "proposed" or "ratified" must be accepted, whereas values marked as "discontinued" or not listed in either this specification or on the aforementioned page must be reported as invalid. Conformance checkers may cache this information (e.g. for performance reasons or to avoid the use of unreliable network connectivity).
When an author uses a new metadata name not defined by either this specification or the Wiki page, conformance checkers may offer to add the value to the Wiki, with the details described above, with the "proposed" status.
Metadata names whose values are to be URLs must not be proposed or accepted. Links must
be represented using the link
element, not the
meta
element.
When the http-equiv
attribute
is specified on a meta
element, the element is a pragma
directive.
The http-equiv
attribute is an enumerated attribute. The following
table lists the keywords defined for this attribute. The states
given in the first cell of the rows with keywords give the states to
which those keywords map. Some of the keywords
are non-conforming, as noted in the last column.
State | Keyword | Notes |
---|---|---|
Content Language | content-language | Non-conforming |
Encoding declaration | content-type | |
Default style | default-style | |
Refresh | refresh | |
Cookie setter | set-cookie | Non-conforming |
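As a rough illustration, the table can be expressed as a lookup from keyword to state. The sketch below assumes that keywords of an enumerated attribute are matched ASCII case-insensitively (the examples later in this section write "Refresh" with a capital letter); the names `HTTP_EQUIV_STATES` and `http_equiv_state` are hypothetical.

```python
import string

ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# keyword -> (state name, whether the keyword is conforming)
HTTP_EQUIV_STATES = {
    "content-language": ("Content Language", False),
    "content-type": ("Encoding declaration", True),
    "default-style": ("Default style", True),
    "refresh": ("Refresh", True),
    "set-cookie": ("Cookie setter", False),
}


def http_equiv_state(value: str):
    """Return (state, conforming) for an http-equiv value, or None if unknown."""
    return HTTP_EQUIV_STATES.get(value.translate(ASCII_LOWER))
```

A value that matches no keyword, such as a made-up `x-unknown`, yields `None`, meaning none of the algorithms below would be run for it.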
When a meta
element is inserted into the document, if its
http-equiv
attribute is
present and represents one of the above states, then the user agent
must run the algorithm appropriate for that state, as described in
the following list:
Content Language state (http-equiv="content-language")
This feature is non-conforming. Authors are encouraged to use the lang attribute instead.
This pragma sets the pragma-set default language. Until such a pragma is successfully processed, there is no pragma-set default language.
1. If the meta element has no content attribute, or if that attribute's value is the empty string, then abort these steps.
2. If the element's content attribute contains a "," (U+002C) character then abort these steps.
3. Let input be the value of the element's content attribute.
4. Let position point at the first character of input.
5. Skip whitespace.
6. Collect a sequence of characters that are not space characters.
7. Set the pragma-set default language to the string that resulted from the previous step.
This pragma is not exactly equivalent to the HTTP Content-Language header. [HTTP]
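The content-language processing steps can be sketched as a small function. This is an illustration only: the name `pragma_set_default_language` is hypothetical, `None` stands for "abort these steps", and skipping leading space characters before collecting the language string should be read as an assumption of the sketch.

```python
from typing import Optional

SPACE_CHARACTERS = " \t\n\f\r"


def pragma_set_default_language(content: Optional[str]) -> Optional[str]:
    """Return the pragma-set default language, or None if the steps abort."""
    # Abort when the attribute is missing or empty.
    if not content:
        return None
    # Abort when the value contains a comma (U+002C).
    if "," in content:
        return None
    position = 0
    # Skip leading space characters (assumption of this sketch).
    while position < len(content) and content[position] in SPACE_CHARACTERS:
        position += 1
    # Collect a sequence of characters that are not space characters.
    start = position
    while position < len(content) and content[position] not in SPACE_CHARACTERS:
        position += 1
    return content[start:position]
```

Note that a value listing several languages, such as "en, fr", aborts rather than picking the first one, which is one way the pragma differs from the HTTP header.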
Encoding declaration state (http-equiv="content-type")
The Encoding
declaration state is just an alternative form of setting
the charset
attribute: it is a
character encoding declaration. This state's user agent requirements are all handled
by the parsing section of the specification.
For meta elements with an http-equiv attribute in the Encoding declaration state, the content attribute must have a value that is an ASCII case-insensitive match for a string that consists of: the literal string "text/html;", optionally followed by any number of space characters, followed by the literal string "charset=", followed by the character encoding name of the character encoding declaration.
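The required shape of the content value can be expressed as a regular expression. The sketch below is a hedged illustration: the function name `is_valid_encoding_declaration` is hypothetical, the caller is assumed to supply the encoding name being declared, and space characters are taken to be space, tab, LF, FF, and CR.

```python
import re


def is_valid_encoding_declaration(content: str, encoding: str) -> bool:
    """Check a content value against 'text/html;' [spaces] 'charset=' encoding."""
    pattern = r"text/html;[ \t\n\f\r]*charset=" + re.escape(encoding)
    # re.ASCII restricts case-insensitive matching to ASCII letters only.
    flags = re.IGNORECASE | re.ASCII
    return re.fullmatch(pattern, content, flags) is not None
```

With this check, `text/html; charset=utf-8` and `TEXT/HTML;charset=UTF-8` both match for the encoding name `utf-8`, while a value missing the semicolon does not.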
A document must not contain both a meta
element
with an http-equiv
attribute in the Encoding declaration
state and a meta
element with the charset
attribute present.
The Encoding
declaration state may be used in HTML
documents, but elements with an http-equiv
attribute in that
state must not be used in XML documents.
Default style state (http-equiv="default-style")
This pragma sets the name of the default alternative style sheet set.
1. If the meta element has no content attribute, or if that attribute's value is the empty string, then abort these steps.
2. Set the preferred style sheet set to the value of the element's content attribute. [CSSOM]
Refresh state (http-equiv="refresh")
This pragma acts as a timed redirect.
1. If another meta element with an http-equiv attribute in the Refresh state has already been successfully processed (i.e. when it was inserted the user agent processed it and reached the last step of this list of steps), then abort these steps.
2. If the meta element has no content attribute, or if that attribute's value is the empty string, then abort these steps.
3. Let input be the value of the element's content attribute.
4. Let position point at the first character of input.
5. Skip whitespace.
6. Collect a sequence of characters in the range ASCII digits, and parse the resulting string using the rules for parsing non-negative integers. If the sequence of characters collected is the empty string, then no number will have been parsed; abort these steps. Otherwise, let time be the parsed number.
7. Collect a sequence of characters in the range ASCII digits and "." (U+002E). Ignore any collected characters.
8. Skip whitespace.
9. Let url be the address of the current page.
10. If the character in input pointed to by position is a ";" (U+003B), then advance position to the next character. Otherwise, jump to the last step.
11. Skip whitespace.
12. If the character in input pointed to by position is a "U" (U+0055) character or a "u" (U+0075) character, then advance position to the next character. Otherwise, jump to the last step.
13. If the character in input pointed to by position is an "R" (U+0052) character or an "r" (U+0072) character, then advance position to the next character. Otherwise, jump to the last step.
14. If the character in input pointed to by position is an "L" (U+004C) character or an "l" (U+006C) character, then advance position to the next character. Otherwise, jump to the last step.
15. Skip whitespace.
16. If the character in input pointed to by position is a "=" (U+003D), then advance position to the next character. Otherwise, jump to the last step.
17. Skip whitespace.
18. If the character in input pointed to by position is either a "'" (U+0027) character or """ (U+0022) character, then let quote be that character, and advance position to the next character. Otherwise, let quote be the empty string.
19. Let url be equal to the substring of input from the character at position to the end of the string.
20. If quote is not the empty string, and there is a character in url equal to quote, then truncate url at that character, so that it and all subsequent characters are removed.
21. Strip any trailing space characters from the end of url.
22. Strip any "tab" (U+0009), "LF" (U+000A), and "CR" (U+000D) characters from url.
23. Resolve the url value to an absolute URL, relative to the meta element. If this fails, abort these steps.
24. Perform one or more of the following steps:
    - After the refresh has come due (as defined below), if the user has not canceled the redirect and if the meta element's Document's active sandboxing flag set does not have the sandboxed automatic features browsing context flag set, navigate the Document's browsing context to url, with replacement enabled, and with the Document's browsing context as the source browsing context.
      For the purposes of the previous paragraph, a refresh is said to have come due as soon as the later of the following two conditions occurs:
      - At least time seconds have elapsed since the document has completely loaded, adjusted to take into account user or user agent preferences.
      - At least time seconds have elapsed since the meta element was inserted into the Document, adjusted to take into account user or user agent preferences.
    - Provide the user with an interface that, when selected, navigates a browsing context to url, with the document's browsing context as the source browsing context.
    - Do nothing.
In addition, the user agent may, as with anything, inform the user of any and all aspects of its operation, including the state of any timers, the destinations of any timed redirects, and so forth.
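The parsing part of the Refresh algorithm (everything before URL resolution and navigation) can be sketched as below. This is a non-authoritative illustration: the name `parse_refresh` is hypothetical, `None` as the whole result means the steps aborted, and `None` as the URL means "the address of the current page". The sketch skips space characters between tokens, which an input such as `20; URL=page4.html` requires in order to reach the URL.

```python
import string
from typing import Optional, Tuple

SPACE_CHARACTERS = " \t\n\f\r"


def parse_refresh(content: Optional[str]) -> Optional[Tuple[int, Optional[str]]]:
    """Return (time, url); url is None when the current page is to be reloaded."""
    if not content:
        return None  # missing or empty content attribute: abort
    s, pos = content, 0

    def skip_whitespace() -> None:
        nonlocal pos
        while pos < len(s) and s[pos] in SPACE_CHARACTERS:
            pos += 1

    def collect(allowed: str) -> str:
        nonlocal pos
        start = pos
        while pos < len(s) and s[pos] in allowed:
            pos += 1
        return s[start:pos]

    skip_whitespace()
    digits = collect(string.digits)
    if digits == "":
        return None  # no number parsed: abort
    time = int(digits)
    collect(string.digits + ".")  # fractional part is collected and ignored
    skip_whitespace()

    # Expect ";", then "URL" in any ASCII case, then "="; otherwise keep current page.
    for expected in (";", "uU", "rR", "lL", "="):
        if expected in ("uU", "="):
            skip_whitespace()
        if pos < len(s) and s[pos] in expected:
            pos += 1
        else:
            return time, None
    skip_whitespace()

    quote = ""
    if pos < len(s) and s[pos] in "'\"":
        quote = s[pos]
        pos += 1
    url = s[pos:]
    if quote and quote in url:
        url = url[: url.index(quote)]  # truncate at the closing quote
    url = url.rstrip(SPACE_CHARACTERS)
    url = url.translate({0x09: None, 0x0A: None, 0x0D: None})  # strip tab, LF, CR
    return time, url
```

Resolving the returned URL against the document and deciding whether to navigate are deliberately left out, since they depend on the user agent's navigation machinery.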
For meta elements with an http-equiv attribute in the Refresh state, the content attribute must have a value consisting either of:
- just a valid non-negative integer, or
- a valid non-negative integer, followed by a ";" (U+003B) character, followed by one or more space characters, followed by a substring that is an ASCII case-insensitive match for the string "URL", followed by a "=" (U+003D) character, followed by a valid URL that does not start with a literal "'" (U+0027) or """ (U+0022) character.
In the former case, the integer represents a number of seconds before the page is to be reloaded; in the latter case the integer represents a number of seconds before the page is to be replaced by the page at the given URL.
A news organization's front page could include the following
markup in the page's head
element, to ensure that
the page automatically reloads from the server every five
minutes:
<meta http-equiv="Refresh" content="300">
A sequence of pages could be used as an automated slide show by making each page refresh to the next page in the sequence, using markup such as the following:
<meta http-equiv="Refresh" content="20; URL=page4.html">
Cookie setter state (http-equiv="set-cookie")
This pragma sets an HTTP cookie. [COOKIES]
It is non-conforming. Real HTTP headers should be used instead.
1. If the meta element has no content attribute, or if that attribute's value is the empty string, then abort these steps.
2. Act as if receiving a set-cookie-string for the document's address via a "non-HTTP" API, consisting of the value of the element's content attribute encoded as UTF-8. [COOKIES] [RFC3629]
There must not be more than one meta
element with
any particular state in the document at a time.
Extensions to the predefined set of pragma directives may, under certain conditions, be registered in the WHATWG Wiki PragmaExtensions page. [WHATWGWIKI]
Such extensions must use a name that is identical to an HTTP header registered in the Permanent Message Header Field Registry, and must have behavior identical to that described for the HTTP header. [IANAPERMHEADERS]
Pragma directives corresponding to headers describing metadata, or not requiring specific user agent processing, must not be registered; instead, use metadata names. Pragma directives corresponding to headers that affect the HTTP processing model (e.g. caching) must not be registered, as they would result in HTTP-level behavior being different for user agents that implement HTML than for user agents that do not.
Anyone is free to edit the WHATWG Wiki PragmaExtensions page at any time to add a pragma directive satisfying these conditions. Such registrations must specify the following information:
The actual name being defined. The name must match a previously-registered HTTP name with the same requirements.
A short non-normative description of the purpose of the pragma directive.
Conformance checkers may use the information given on the WHATWG Wiki PragmaExtensions page to establish if a value is allowed or not: values defined in this specification or listed on the aforementioned page must be accepted, whereas values not listed in either this specification or on the aforementioned page must be reported as invalid. Conformance checkers may cache this information (e.g. for performance reasons or to avoid the use of unreliable network connectivity).
A character encoding declaration is a mechanism by which the character encoding used to store or transmit a document is specified.
The following restrictions apply to character encoding declarations:
In addition, due to a number of restrictions on meta elements, there can only be one meta-based character encoding declaration per document.
If an HTML document does not
start with a BOM, and its encoding is not explicitly given by Content-Type metadata, and the document
is not an iframe
srcdoc
document, then the
character encoding used must be an ASCII-compatible character
encoding, and the encoding must be specified using a
meta
element with a charset
attribute or a
meta
element with an http-equiv
attribute in the
Encoding declaration
state.
A character encoding declaration is required (either in the Content-Type metadata or explicitly in the file) even if the encoding is US-ASCII, because a character encoding is needed to process non-ASCII characters entered by the user in forms, in URLs generated by scripts, and so forth.
If the document is an iframe srcdoc document, the document must not have a character encoding declaration. (In this case, the source is already decoded, since it is part of the document that contained the iframe.)
If an HTML document contains
a meta
element with a charset
attribute or a
meta
element with an http-equiv
attribute in the
Encoding declaration
state, then the character encoding used must be an
ASCII-compatible character encoding.
Authors are encouraged to use UTF-8. Conformance checkers may advise authors against using legacy encodings. [RFC3629]
Authoring tools should default to using UTF-8 for newly-created documents. [RFC3629]
Encodings in which a series of bytes in the range 0x20 to 0x7E can encode characters other than the corresponding characters in the range U+0020 to U+007E represent a potential security vulnerability: a user agent that does not support the encoding (or does not support the label used to declare the encoding, or does not use the same mechanism to detect the encoding of unlabelled content as another user agent) might end up interpreting technically benign plain text content as HTML tags and JavaScript. For example, this applies to encodings in which the bytes corresponding to "<script>" in ASCII can encode a different string. Authors should not use such encodings, which are known to include JIS_C6226-1983, JIS_X0212-1990, HZ-GB-2312, JOHAB (Windows code page 1361), encodings based on ISO-2022, and encodings based on EBCDIC. Furthermore, authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings, which also fall into this category, because these encodings were never intended for use for Web content. [RFC1345] [RFC1842] [RFC1468] [RFC2237] [RFC1554] [CP50220] [RFC1922] [RFC1557] [CESU8] [UTF7] [BOCU1] [SCSU]
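To make the risk concrete, the following sketch uses UTF-7, one of the encodings named above, and Python's built-in `utf-7` codec. The byte string and the function name `decode_both` are illustrative assumptions: every byte is in the range 0x20 to 0x7E and contains no angle brackets when read as ASCII, yet the same bytes decode to a script element when read as UTF-7.

```python
def decode_both(payload: bytes):
    """Decode the same bytes as ASCII and as UTF-7 to show how they diverge."""
    return payload.decode("ascii"), payload.decode("utf-7")


# Bytes that look like harmless punctuation and letters in ASCII.
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"
as_ascii, as_utf7 = decode_both(payload)

print(as_ascii)  # the literal text, with no "<" or ">" in it
print(as_utf7)   # the same bytes interpreted as markup
```

Two user agents that disagree about the encoding of such content would therefore disagree about whether it contains a script at all.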
Authors should not use UTF-32, as the encoding detection algorithms described in this specification intentionally do not distinguish it from UTF-16. [UNICODE]
Using non-UTF-8 encodings can have unexpected results on form submission and URL encodings, which use the document's character encoding by default.
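A short sketch of the effect on URL encoding, using Python's `urllib.parse.quote` to stand in for what a user agent does with the document's character encoding. The helper name `percent_encode` and the sample word are illustrative assumptions.

```python
from urllib.parse import quote


def percent_encode(text: str, document_encoding: str) -> str:
    """Percent-encode text using the bytes of the given document encoding."""
    return quote(text, encoding=document_encoding)


# The same character produces different bytes, and so different URLs.
print(percent_encode("café", "utf-8"))         # caf%C3%A9
print(percent_encode("café", "windows-1252"))  # caf%E9
```

A server that expects UTF-8 would misread the second form, which is one reason the section encourages UTF-8.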
In XHTML, the XML declaration should be used for inline character encoding information, if necessary.
In HTML, to declare that the character encoding is UTF-8, the
author could include the following markup near the top of the
document (in the head
element):
<meta charset="utf-8">
In XML, the XML declaration would be used instead, at the very top of the markup:
<?xml version="1.0" encoding="utf-8"?>