Copyright ©1997-1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This specification defines the HyperText Markup Language (HTML), the publishing language of the World Wide Web. This specification defines HTML 4.01, which is a subversion of HTML 4. In addition to the text, multimedia, and hyperlink features of the previous versions of HTML (HTML 3.2 [HTML32] and HTML 2.0 [RFC1866]), HTML 4 supports more multimedia options, scripting languages, style sheets, better printing facilities, and documents that are more accessible to users with disabilities. HTML 4 also takes great strides towards the internationalization of documents, with the goal of making the Web truly World Wide.
HTML 4 is an SGML application conforming to International Standard ISO 8879 -- Standard Generalized Markup Language [ISO8879].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This document specifies HTML 4.01, which is part of the HTML 4 line of specifications. The first version of HTML 4 was HTML 4.0 [HTML40], published on 18 December 1997 and revised 24 April 1998. This specification is the first HTML 4.01 Recommendation. It includes non-editorial changes since the 24 April version of HTML 4.0. There have been some changes to the DTDs, for example. This document obsoletes previous versions of HTML 4.0, although W3C will continue to make those specifications and their DTDs available at the W3C Web site.
This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
W3C recommends that user agents and authors (and in particular, authoring tools) produce HTML 4.01 documents rather than HTML 4.0 documents. W3C recommends that authors produce HTML 4 documents instead of HTML 3.2 documents. For reasons of backward compatibility, W3C also recommends that tools interpreting HTML 4 continue to support HTML 3.2 and HTML 2.0 as well.
For information about the next generation of HTML, "The Extensible HyperText Markup Language" [XHTML], please refer to the W3C HTML Activity and the list of W3C Technical Reports.
This document has been produced as part of the W3C HTML Activity. The goals of the HTML Working Group (Members only) are discussed in the HTML Working Group charter (Members only).
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.
Public discussion on HTML features takes place on www-html@w3.org (archives of www-html@w3.org).
The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/MarkUp/html4-updates/translations.
Please report errors in this document to www-html-editor@w3.org.
Contents
This specification is divided into the following sections:
The brief SGML tutorial gives readers some understanding of HTML's relationship to SGML and gives summary information on how to read the HTML Document Type Definition (DTD).
This document has been organized by topic rather than by the grammar of HTML. Topics are grouped into three categories: structure, presentation, and interactivity. Although it is not easy to divide HTML constructs perfectly into these three categories, the model reflects the HTML Working Group's experience that separating a document's structure from its presentation produces more effective and maintainable documents.
The language reference consists of the following information:
What characters may appear in an HTML document.
Basic data types of an HTML document.
Elements that govern the structure of an HTML document, including text, lists, tables, links, and included objects, images, and applets.
Elements that govern the presentation of an HTML document, including style sheets, fonts, colors, rules, and other visual presentation, and frames for multi-windowed presentations.
Elements that govern interactivity with an HTML document, including forms for user input and scripts for active documents.
The SGML formal definition of HTML:
This document has been written with two types of readers in mind: authors and implementors. We hope the specification will provide authors with the tools they need to write efficient, attractive, and accessible documents, without over-exposing them to HTML's implementation details. Implementors, however, should find all they need to build conforming user agents.
The specification may be approached in several ways:
Read from beginning to end. The specification begins with a general presentation of HTML and becomes more and more technical and specific towards the end.
The front pages of each section of the language reference manual extend the initial table of contents with more detail about that section.
Element names are written in uppercase letters (e.g., BODY). Attribute names are written in lowercase letters (e.g., lang, onsubmit). Recall that in HTML, element and attribute names are case-insensitive; the convention is meant to encourage readability.
Element and attribute names in this document have been marked up and may be rendered specially by some user agents.
Each attribute definition specifies the type of its value. If the type allows a small set of possible values, the definition lists the set of values, separated by a bar (|).
After the type information, each attribute definition indicates the case-sensitivity of its values, between square brackets ("[]"). See the section on case information for details.
Informative notes are emphasized to stand out from surrounding text and may be rendered specially by some user agents.
All examples illustrating deprecated usage are marked as "DEPRECATED EXAMPLE". Deprecated examples also include recommended alternate solutions. All examples that illustrate illegal usage are clearly marked "ILLEGAL EXAMPLE".
Examples and notes have been marked up and may be rendered specially by some user agents.
Thanks to everyone who has helped to author the working drafts that went into the HTML 4 specification, and to all those who have sent suggestions and corrections.
Many thanks to the Web Accessibility Initiative task force (WAI HC group) for their work on improving the accessibility of HTML and to T.V. Raman (Adobe) for his early work on developing accessible forms.
The authors of this specification, the members of the W3C HTML Working Group, deserve much applause for their diligent review of this document, their constructive comments, and their hard work: John D. Burger (MITRE), Steve Byrne (JavaSoft), Martin J. Dürst (University of Zurich), Daniel Glazman (Electricité de France), Scott Isaacs (Microsoft), Murray Maloney (GRIF), Steven Pemberton (CWI), Robert Pernett (Lotus), Jared Sorensen (Novell), Powell Smith (IBM), Robert Stevahn (HP), Ed Tecot (Microsoft), Jeffrey Veen (HotWired), Mike Wexler (Adobe), Misha Wolf (Reuters), and Lauren Wood (SoftQuad).
Thank you Dan Connolly (W3C) for rigorous and bountiful input as part-time editor and thoughtful guidance as chairman of the HTML Working Group. Thank you Sally Khudairi (W3C) for your indispensable work on press releases.
Thanks to David M. Abrahamson and Roger Price for their careful reading of the specification and constructive comments.
Thanks to Jan Kärrman, author of html2ps, for helping so much in creating the PostScript version of the specification.
Of particular help from the W3C at Sophia-Antipolis were Janet Bertot, Bert Bos, Stephane Boyera, Daniel Dardailler, Yves Lafon, Håkon Lie, Chris Lilley, and Colas Nahaboo (Bull).
Lastly, thanks to Tim Berners-Lee without whom none of this would have been possible.
Many thanks to Shane McCarron for tracking errata for this revision of the specification.
For information about copyrights, please refer to the W3C Intellectual Property Notice, the W3C Document Notice, and the W3C IPR Software Notice.
The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience:
The ties between the three mechanisms are apparent throughout this specification.
Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Universal Resource Identifier, or "URI".
URIs typically consist of three pieces:
Consider the URI that designates the W3C Technical Reports page:
http://www.w3.org/TR
This URI may be read as follows: There is a document available via the HTTP protocol (see [RFC2616]), residing on the machine www.w3.org, accessible via the path "/TR". Other schemes you may see in HTML documents include "mailto" for email and "ftp" for FTP.
Here is another example of a URI. This one refers to a user's mailbox:
...this is text... For all comments, please send email to <A href="mailto:joe@someplace.com">Joe Cool</A>.
Note. Most readers may be familiar with the term "URL" and not the term "URI". URLs form a subset of the more general URI naming scheme.
Some URIs refer to a location within a resource. This kind of URI ends with "#" followed by an anchor identifier (called the fragment identifier). For instance, here is a URI pointing to an anchor named section_2:
http://somesite.com/html/top.html#section_2
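As a sketch of how such a URI might be used, the destination document declares the anchor and another document links to it. The anchor name section_2 comes from the URI above; the heading and link text are invented for illustration.

```html
<!-- In top.html: the destination anchor that the fragment identifier names -->
<H2><A name="section_2">Section 2</A></H2>

<!-- In some other document: a link whose URI ends with the fragment identifier -->
<P>See <A href="http://somesite.com/html/top.html#section_2">Section 2</A>
for details.
```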
A relative URI doesn't contain any naming scheme information. Its path generally refers to a resource on the same machine as the current document. Relative URIs may contain relative path components (e.g., ".." means one level up in the hierarchy defined by the path), and may contain fragment identifiers.
Relative URIs are resolved to full URIs using a base URI. As an example of relative URI resolution, assume we have the base URI "http://www.acme.com/support/intro.html". The relative URI in the following markup for a hypertext link:
<A href="suppliers.html">Suppliers</A>
would expand to the full URI "http://www.acme.com/support/suppliers.html", while the relative URI in the following markup for an image
<IMG src="../icons/logo.gif" alt="logo">
would expand to the full URI "http://www.acme.com/icons/logo.gif".
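The base URI is normally the URI of the current document, but it may also be declared explicitly. The following sketch assumes the base URI from the example above is set with the BASE element; the document title is invented, and the comments show the full URIs a user agent would be expected to derive.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>Relative URI resolution</TITLE>
<BASE href="http://www.acme.com/support/intro.html">
</HEAD>
<BODY>
<!-- Resolves to http://www.acme.com/support/suppliers.html -->
<P><A href="suppliers.html">Suppliers</A>
<!-- ".." moves one level up: http://www.acme.com/icons/logo.gif -->
<P><IMG src="../icons/logo.gif" alt="logo">
</BODY>
</HTML>
```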
In HTML, URIs are used to:
Please consult the section on the URI type for more information about URIs.
To publish information for global distribution, one needs a universally understood language, a kind of publishing mother tongue that all computers may potentially understand. The publishing language used by the World Wide Web is HTML (from HyperText Markup Language).
HTML gives authors the means to:
HTML was originally developed by Tim Berners-Lee while at CERN, and popularized by the Mosaic browser developed at NCSA. During the course of the 1990s it has blossomed with the explosive growth of the Web. During this time, HTML has been extended in a number of ways. The Web depends on Web page authors and vendors sharing the same conventions for HTML. This has motivated joint work on specifications for HTML.
HTML 2.0 (November 1995, see [RFC1866]) was developed under the aegis of the Internet Engineering Task Force (IETF) to codify common practice in late 1994. HTML+ (1993) and HTML 3.0 (1995, see [HTML30]) proposed much richer versions of HTML. Despite never receiving consensus in standards discussions, these drafts led to the adoption of a range of new features. The efforts of the World Wide Web Consortium's HTML Working Group to codify common practice in 1996 resulted in HTML 3.2 (January 1997, see [HTML32]). Changes from HTML 3.2 are summarized in Appendix A.
Most people agree that HTML documents should work well across different browsers and platforms. Achieving interoperability lowers costs to content providers since they must develop only one version of a document. If the effort is not made, there is much greater risk that the Web will devolve into a proprietary world of incompatible formats, ultimately reducing the Web's commercial potential for all participants.
Each version of HTML has attempted to reflect greater consensus among industry players so that the investment made by content providers will not be wasted and that their documents will not become unreadable in a short period of time.
HTML has been developed with the vision that all manner of devices should be able to use information on the Web: PCs with graphics displays of varying resolution and color depths, cellular telephones, hand held devices, devices for speech output and input, computers with high or low bandwidth, and so on.
HTML 4 extends HTML with mechanisms for style sheets, scripting, frames, embedding objects, improved support for right to left and mixed direction text, richer tables, and enhancements to forms, offering improved accessibility for people with disabilities.
HTML 4.01 is a revision of HTML 4.0 that corrects errors and makes some changes since the previous revision.
This version of HTML has been designed with the help of experts in the field of internationalization, so that documents may be written in every language and be transported easily around the world. This has been accomplished by incorporating [RFC2070], which deals with the internationalization of HTML.
One important step has been the adoption of the ISO/IEC:10646 standard (see [ISO10646]) as the document character set for HTML. This is the world's most inclusive standard dealing with issues of the representation of international characters, text direction, punctuation, and other world language issues.
HTML now offers greater support for diverse human languages within a document. This allows for more effective indexing of documents for search engines, higher-quality typography, better text-to-speech conversion, better hyphenation, etc.
As the Web community grows and its members diversify in their abilities and skills, it is crucial that the underlying technologies be appropriate to their specific needs. HTML has been designed to make Web pages more accessible to those with physical limitations. HTML 4 developments inspired by concerns for accessibility include:
Authors who design pages with accessibility issues in mind will not only receive the blessings of the accessibility community, but will benefit in other ways as well: well-designed HTML documents that distinguish structure and presentation will adapt more easily to new technologies.
Note. For more information about designing accessible HTML documents, please consult [WAI].
The new table model in HTML is based on [RFC1942]. Authors now have greater control over structure and layout (e.g., column groups). The ability of designers to recommend column widths allows user agents to display table data incrementally (as it arrives) rather than waiting for the entire table before rendering.
Note. At the time of writing, some HTML authoring tools rely extensively on tables for formatting, which may easily cause accessibility problems.
HTML now offers a standard mechanism for embedding generic media objects and applications in HTML documents. The OBJECT element (together with its more specific ancestor elements IMG and APPLET) provides a mechanism for including images, video, sound, mathematics, specialized applications, and other objects in a document. It also allows authors to specify a hierarchy of alternate renderings for user agents that don't support a specific rendering.
Style sheets simplify HTML markup and largely relieve HTML of the responsibilities of presentation. They give both authors and users control over the presentation of documents -- font information, alignment, colors, etc.
Style information can be specified for individual elements or groups of elements. Style information may be specified in an HTML document or in external style sheets.
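A minimal sketch of the three places style information may appear, assuming CSS as the style sheet language. The file name corporate.css, the class name warning, and the document text are hypothetical.

```html
<HEAD>
<TITLE>Style sheet example</TITLE>
<!-- An external style sheet, which several documents may share -->
<LINK rel="stylesheet" type="text/css" href="corporate.css">
<!-- Style information for groups of elements within this document -->
<STYLE type="text/css">
  H1 { text-align: center }
  P.warning { color: red }
</STYLE>
</HEAD>
<BODY>
<H1>Quarterly report</H1>
<P class="warning">Figures are provisional.
<!-- Style information for one individual element -->
<P style="font-size: 12pt">Details follow.
</BODY>
```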
The mechanism for associating a style sheet with a document is independent of the style sheet language.
Before the advent of style sheets, authors had limited control over rendering. HTML 3.2 included a number of attributes and elements offering control over alignment, font size, and text color. Authors also exploited tables and images as a means for laying out pages. The relatively long time it takes for users to upgrade their browsers means that these features will continue to be used for some time. However, since style sheets offer more powerful presentation mechanisms, the World Wide Web Consortium will eventually phase out many of HTML's presentation elements and attributes. Throughout the specification, elements and attributes at risk are marked as "deprecated". They are accompanied by examples of how to achieve the same effects with other elements or style sheets.
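For instance, one way a similar visual effect might be expressed with and without deprecated markup is sketched below, assuming CSS as the style sheet language. The class name notice and the text are hypothetical, and the style rule only approximates the deprecated rendering.

```html
<!-- DEPRECATED EXAMPLE: presentation expressed with the FONT element
     and the align attribute -->
<P align="center"><FONT color="red" size="+1">Sale ends Friday</FONT>

<!-- Recommended alternative, part 1: a style rule placed in the HEAD -->
<STYLE type="text/css">
  P.notice { text-align: center; color: red; font-size: larger }
</STYLE>

<!-- Recommended alternative, part 2: structural markup in the BODY -->
<P class="notice">Sale ends Friday
```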
Through scripts, authors may create dynamic Web pages (e.g., "smart forms" that react as users fill them out) and use HTML as a means to build networked applications.
The mechanisms provided to include scripts in an HTML document are independent of the scripting language.
Sometimes, authors will want to make it easy for users to print more than just the current document. When documents form part of a larger work, the relationships between them can be described using the HTML LINK element or using W3C's Resource Description Framework (RDF) (see [RDF10]).
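As a sketch, one chapter of a larger work might describe its neighbours with LINK elements as follows. The file names and title are hypothetical; contents, prev, and next are among the link types HTML 4 defines.

```html
<HEAD>
<TITLE>Chapter 2</TITLE>
<!-- Relationships between this document and the rest of the work -->
<LINK rel="contents" href="toc.html">
<LINK rel="prev" href="chapter1.html">
<LINK rel="next" href="chapter3.html">
</HEAD>
```

A user agent could, for example, follow the next relationships to print the chapters in order.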
We recommend that authors and implementors observe the following general principles when working with HTML 4.
HTML has its roots in SGML, which has always been a language for the specification of structural markup. As HTML matures, more and more of its presentational elements and attributes are being replaced by other mechanisms, in particular style sheets. Experience has shown that separating the structure of a document from its presentational aspects reduces the cost of serving a wide range of platforms, media, etc., and facilitates document revisions.
To make the Web more accessible to everyone, notably those with disabilities, authors should consider how their documents may be rendered on a variety of platforms: speech-based browsers, braille-readers, etc. We do not recommend that authors limit their creativity, only that they consider alternate renderings in their design. HTML offers a number of mechanisms to this end (e.g., the alt attribute, the accesskey attribute, etc.).
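The two attributes mentioned above might be used as in the following sketch. The file names, text, and key assignment are hypothetical.

```html
<!-- alt supplies a text equivalent for user agents that cannot render the image -->
<P><IMG src="sales-chart.gif" alt="Bar chart: sales rose in each quarter">

<!-- accesskey lets the user reach the link from the keyboard -->
<P><A href="contact.html" accesskey="C">Contact us</A>
```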
Furthermore, authors should keep in mind that their documents may be reaching a far-off audience with different computer configurations. In order for documents to be interpreted correctly, authors should include in their documents information about the natural language and direction of the text, how the document is encoded, and other issues related to internationalization.
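A minimal sketch of a document that carries this information, assuming an English document encoded as ISO-8859-1. The title and text are invented for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<!-- lang gives the natural language, dir the base direction of the text -->
<HTML lang="en" dir="ltr">
<HEAD>
<!-- States how the document is encoded -->
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<TITLE>Greetings</TITLE>
</HEAD>
<BODY>
<!-- A change of language inside the document -->
<P>The French say <SPAN lang="fr">bonjour</SPAN>.
</BODY>
</HTML>
```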
By carefully designing their tables and making use of new table features in HTML 4, authors can help user agents render documents more quickly. Authors can learn how to design tables for incremental rendering (see the TABLE element). Implementors should consult the notes on tables in the appendix for information on incremental algorithms.
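As a sketch of a table designed with incremental rendering in mind, the following markup states the column widths before any row data, so a user agent need not wait for the whole table to lay out the first rows. The data and the summary text are hypothetical.

```html
<TABLE summary="Opening hours by day">
<!-- Column widths are known before any row arrives -->
<COLGROUP>
  <COL width="30%">
  <COL width="70%">
</COLGROUP>
<THEAD>
<TR><TH>Day<TH>Hours
</THEAD>
<TBODY>
<TR><TD>Monday<TD>9:00 to 17:00
<TR><TD>Tuesday<TD>9:00 to 17:00
</TBODY>
</TABLE>
```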
This section of the document introduces SGML and discusses its relationship to HTML. A complete discussion of SGML is left to the standard (see [ISO8879]).
SGML is a system for defining markup languages. Authors mark up their documents by representing structural, presentational, and semantic information alongside content. HTML is one example of a markup language. Here is an example of an HTML document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>My first HTML document</TITLE>
   </HEAD>
   <BODY>
      <P>Hello world!
   </BODY>
</HTML>
An HTML document is divided into a head section (here, between <HEAD> and </HEAD>) and a body (here, between <BODY> and </BODY>). The title of the document appears in the head (along with other information about the document), and the content of the document appears in the body. The body in this example contains just one paragraph, marked up with <P>.
Each markup language defined in SGML is called an SGML application. An SGML application is generally characterized by:
This specification includes an SGML declaration, three document type definitions (see the section on HTML version information for a description of the three), and a list of character references.
The following sections introduce SGML constructs that are used in HTML.
The appendix lists some SGML features that are not widely supported by HTML tools and user agents and should be avoided.
An SGML document type definition declares element types that represent structures or desired behavior. HTML includes element types that represent paragraphs, hypertext links, lists, tables, images, etc.
Each element type declaration generally describes three parts: a start tag, content, and an end tag.
The element's name appears in the start tag (written <element-name>) and the end tag (written </element-name>); note the slash before the element name in the end tag. For example, the start and end tags of the UL element type delimit the items in a list:
<UL>
<LI><P>...list item 1...
<LI><P>...list item 2...
</UL>
Some HTML element types allow authors to omit end tags (e.g., the P and LI element types). A few element types also allow the start tags to be omitted; for example, HEAD and BODY. The HTML DTD indicates for each element type whether the start tag and end tag are required.
Some HTML element types have no content. For example, the line break element BR has no content; its only role is to terminate a line of text. Such empty elements never have end tags. The document type definition and the text of the specification indicate whether an element type is empty (has no content) or, if it can have content, what is considered legal content.
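For illustration, the following sketch uses BR twice inside one paragraph; the text is invented. Note that BR has a start tag only.

```html
<!-- Each BR ends a line; there is no </BR> end tag -->
<P>First line of the address<BR>
Second line of the address<BR>
Third line of the address
```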
Element names are always case-insensitive.
Please consult the SGML standard for information about rules governing elements (e.g., elements must be properly nested; an end tag closes, back to the matching start tag, all unclosed intervening start tags whose end tags were omitted (section 7.5.1); etc.).
For example, the following paragraph:
<P>This is the first paragraph.</P> ...a block element...
may be rewritten without its end tag:
<P>This is the first paragraph. ...a block element...
since the <P> start tag is closed by the following block element. Similarly, if a paragraph is enclosed by a block element, as in:
<DIV> <P>This is the paragraph. </DIV>
the end tag of the enclosing block element (here, </DIV>) implies the end tag of the open <P> start tag.
Elements are not tags. Some people refer to elements as tags (e.g., "the P tag"). Remember that the element is one thing, and the tag (be it start or end tag) is another. For instance, the HEAD element is always present, even though both start and end HEAD tags may be missing in the markup.
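To illustrate the distinction, the following sketch is a complete document in which the start and end tags of HTML, HEAD, and BODY are all omitted. Assuming a conforming parser, the HTML, HEAD, and BODY elements are nonetheless present in the parsed document. The title text is invented for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<!-- No HTML, HEAD, or BODY tags appear, yet all three elements are implied -->
<TITLE>A document with omitted tags</TITLE>
<P>Hello world!
```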
All the element types declared in this specification are listed in the element index.
Elements may have associated properties, called attributes, which may have values (by default, or set by authors or scripts). Attribute/value pairs appear before the final ">" of an element's start tag. Any number of (legal) attribute value pairs, separated by spaces, may appear in an element's start tag. They may appear in any order.
In this example, the id attribute is set for an H1 element:
<H1 id="section1"> This is an identified heading thanks to the id attribute </H1>
By default, SGML requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39). Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa. Authors may also use numeric character references to represent double quotes (&#34;) and single quotes (&#39;). For double quotes authors can also use the character entity reference &quot;.
In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58). We recommend using quotation marks even when it is possible to eliminate them.
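The quoting rules above can be sketched as follows. The file names and text are hypothetical.

```html
<!-- Double quotation marks delimit a value that contains single quote marks -->
<IMG src="door.gif" alt="The entrance marked 'Staff only'">

<!-- Single quotation marks delimit a value that contains double quote marks -->
<IMG src="sign.gif" alt='A sign reading "Closed"'>

<!-- A numeric character reference stands in for each double quote -->
<IMG src="sign.gif" alt="A sign reading &#34;Closed&#34;">

<!-- Unquoted: these values contain only letters and a period -->
<IMG src=logo.gif alt=logo>

<!-- Quotation marks required: "/" is not among the permitted characters -->
<A href="http://www.w3.org/TR">W3C Technical Reports</A>
```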
Attribute names are always case-insensitive.
Attribute values are generally case-insensitive. The definition of each attribute in the reference manual indicates whether its value is case-insensitive.
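A short sketch of the difference between names and values. It assumes, as the attribute definitions state, that the value of the type attribute is case-insensitive while the value of the id attribute is case-sensitive; the attribute values themselves are invented.

```html
<!-- Equivalent start tags: element names, attribute names, and the
     value of type are all case-insensitive -->
<INPUT TYPE="TEXT" NAME="city">
<input type="text" name="city">

<!-- Distinct identifiers: the value of id is case-sensitive -->
<P id="Summary">First paragraph.
<P id="summary">Second paragraph.
```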
All the attributes defined by this specification are listed in the attribute index.
Character references are numeric or symbolic names for characters that may be included in an HTML document. They are useful for referring to rarely used characters, or those that authoring tools make it difficult or impossible to enter. You will see character references throughout this document; they begin with a "&" sign and end with a semi-colon (;). Some common examples include:
We discuss HTML character references in detail later in the section on the HTML document character set. The specification also contains a list of character references that may appear in HTML 4 documents.
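A brief sketch of character references in use; the surrounding text is invented.

```html
<!-- The tag name below is displayed as text rather than treated as markup -->
<P>Mark up each paragraph with &lt;P&gt;.

<!-- The ampersand itself is written with a character entity reference -->
<P>Research &amp; Development

<!-- Decimal 229 and hexadecimal E5 name the same character,
     the letter "a" with a ring above -->
<P>&#229; and &#xE5; are rendered identically.
```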
HTML comments have the following syntax:
<!-- this is a comment -->
<!-- and so is this one,
    which occupies more than one line -->
White space is not permitted between the markup declaration open delimiter ("<!") and the comment open delimiter ("--"), but is permitted between the comment close delimiter ("--") and the markup declaration close delimiter (">"). A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments.
Information that appears between comments has no special meaning (e.g., character references are not interpreted as such).
Note that comments are markup.
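The following sketch shows three comments that respect these rules. The use of "=" as a separator character is only a suggestion, not something this specification prescribes.

```html
<!-- White space before the final delimiter is permitted -- >

<!-- The four characters &lt; are not interpreted as a character reference here -->

<!-- ========== a visual separator that avoids adjacent hyphens ========== -->
```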
Each element and attribute declaration in this specification is accompanied by its document type definition fragment. We have chosen to include the DTD fragments in the specification rather than seek a more approachable, but longer and less precise means of describing an element's properties. The following tutorial should allow readers unfamiliar with SGML to read the DTD and understand the technical details of the HTML specification.
In DTDs, comments may spread over one or more lines. In the DTD, comments are delimited by a pair of "--" marks, e.g.
<!ELEMENT PARAM - O EMPTY -- named property value -->
The HTML DTD begins with a series of parameter entity definitions. A parameter entity definition defines a kind of macro that may be referenced and expanded elsewhere in the DTD. These macros may not appear in HTML documents, only in the DTD. Other types of macros, called character references, may be used in the text of an HTML document or within attribute values.
When the parameter entity is referred to by name in the DTD, it is expanded into a string.
A parameter entity definition begins with the keyword <!ENTITY % followed by the entity name, the quoted string the entity expands to, and finally a closing >. Instances of parameter entities in a DTD begin with "%", followed by the parameter entity name, and are terminated by an optional ";".
The following example defines the string that the "%fontstyle;" entity will expand to.
<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
The string the parameter entity expands to may contain other parameter entity names. These names are expanded recursively. In the following example, the "%inline;" parameter entity is defined to include the "%fontstyle;", "%phrase;", "%special;" and "%formctrl;" parameter entities.
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
You will encounter two DTD entities frequently in the HTML DTD: "%block;" and "%inline;". They are used when the content model includes block-level and inline elements, respectively (defined in the section on the global structure of an HTML document).
The bulk of the HTML DTD consists of the declarations of element types and their attributes. The <!ELEMENT keyword begins a declaration and the > character ends it. Between these are specified:
In this example:
<!ELEMENT UL - - (LI)+>
This example illustrates the declaration of an empty element type:
<!ELEMENT IMG - O EMPTY>
The content model describes what may be contained by an instance of an element type. Content model definitions may include:
The content model of an element is specified with the following syntax. Please note that the list below is a simplification of the full SGML syntax rules and does not address, e.g., precedences.
Here are some examples from the HTML DTD:
<!ELEMENT UL - - (LI)+>
The UL element must contain one or more LI elements.
<!ELEMENT DL - - (DT|DD)+>
The DL element must contain one or more DT or DD elements in any order.
<!ELEMENT OPTION - O (#PCDATA)>
The OPTION element may only contain text and entities, such as &amp; -- this is indicated by the SGML data type #PCDATA.
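Markup that satisfies each of the three content models above might look like the following sketch. The text and the control name meal are hypothetical.

```html
<!-- UL: one or more LI elements -->
<UL>
<LI>Apples
<LI>Pears
</UL>

<!-- DL: DT and DD elements in any order, at least one in total -->
<DL>
<DT>SGML
<DD>A system for defining markup languages.
<DT>DTD
<DT>Document type definition
<DD>Declares element types and their attributes.
</DL>

<!-- OPTION: text and entities only; the end tag is omitted -->
<SELECT name="meal">
<OPTION>Fish &amp; chips
<OPTION>Bread &amp; butter
</SELECT>
```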
A few HTML element types use an additional SGML feature to exclude elements from their content model. Excluded elements are preceded by a hyphen. Explicit exclusions override permitted elements.
In this example, the -(A) signifies that the element A cannot appear in another A element (i.e., anchors may not be nested).
<!ELEMENT A - - (%inline;)* -(A)>
Note that the A element type is part of the DTD parameter entity "%inline;", but is excluded explicitly because of -(A).
Similarly, the following element type declaration for FORM prohibits nested forms:
<!ELEMENT FORM - - (%block;|SCRIPT)+ -(FORM)>
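The effect of the exclusion on A can be sketched as follows. The first paragraph is an ILLEGAL EXAMPLE; the file names are hypothetical.

```html
<!-- ILLEGAL EXAMPLE: an A element inside another A element -->
<P><A href="guide.html">Read the <A href="intro.html">introduction</A> first</A>

<!-- Legal alternative: two sibling anchors -->
<P><A href="guide.html">Read the guide</A>, starting with the
<A href="intro.html">introduction</A>.
```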
The <!ATTLIST keyword begins the declaration of attributes that an element may take. It is followed by the name of the element in question, a list of attribute definitions, and a closing >. Each attribute definition is a triplet that defines:
In this example, the name attribute is defined for the MAP element. The attribute is optional for this element.
<!ATTLIST MAP name CDATA #IMPLIED >
The type of values permitted for the attribute is given as CDATA, an SGML data type. CDATA is text that may contain character references.
For more information about "CDATA", "NAME", "ID", and other data types, please consult the section on HTML data types.
The following examples illustrate several attribute definitions:
rowspan     NUMBER     1         -- number of rows spanned by cell --
http-equiv  NAME       #IMPLIED  -- HTTP response header name --
id          ID         #IMPLIED  -- document-wide unique id --
valign      (top|middle|bottom|baseline) #IMPLIED
The rowspan attribute requires values of type NUMBER. The default value is given explicitly as "1". The optional http-equiv attribute requires values of type NAME. The optional id attribute requires values of type ID. The optional valign attribute is constrained to take values from the set {top, middle, bottom, baseline}.
Attribute definitions may also contain parameter entity references.
In this example, we see that the attribute definition list for the LINK element begins with the "%attrs;" parameter entity.
<!ELEMENT LINK - O EMPTY               -- a media-independent link -->
<!ATTLIST LINK
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  media       %MediaDesc;    #IMPLIED  -- for rendering on these media --
  >
Start tag: required, End tag: forbidden
The "%attrs;" parameter entity is defined as follows:
<!ENTITY % attrs "%coreattrs; %i18n; %events;">
The "%coreattrs;" parameter entity in the "%attrs;" definition expands as follows:
The "%attrs;" parameter entity has been defined for convenience since these attributes are defined for most HTML element types.
Similarly, the DTD defines the "%URI;" parameter entity as expanding into the string "CDATA".
As this example illustrates, the parameter entity "%URI;" provides readers of the DTD with more information as to the type of data expected for an attribute. Similar entities have been defined for "%Color;", "%Charset;", "%Length;", "%Pixels;", etc.
Some attributes play the role of boolean variables (e.g., the selected attribute for the OPTION element). Their appearance in the start tag of an element implies that the value of the attribute is "true". Their absence implies a value of "false".
Boolean attributes may legally take a single value: the name of the attribute itself (e.g., selected="selected").
This example defines the selected attribute to be a boolean attribute.
selected (selected) #IMPLIED -- option is pre-selected --
The attribute is set to "true" by appearing in the element's start tag:
<OPTION selected="selected"> ...contents... </OPTION>
In HTML, boolean attributes may appear in minimized form -- the attribute's value appears alone in the element's start tag. Thus, selected may be set by writing:
<OPTION selected>
instead of:
<OPTION selected="selected">
Authors should be aware that many user agents only recognize the minimized form of boolean attributes and not the full form.
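As an illustration only, the following minimal Python sketch shows one way a parser might treat the minimized and full forms of a boolean attribute alike. It relies on the standard-library html.parser module, which reports a minimized attribute with the value None. The names BooleanAttrCollector and BOOLEAN_ATTRS are hypothetical, chosen for this example; they are not part of this specification.

```python
from html.parser import HTMLParser

# Hypothetical, partial set of boolean attribute names used for this sketch.
BOOLEAN_ATTRS = {"selected", "checked", "disabled"}


class BooleanAttrCollector(HTMLParser):
    """Record which boolean attributes are "true" on each start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []  # list of (tag name, set of boolean attributes present)

    def handle_starttag(self, tag, attrs):
        # A boolean attribute is "true" whenever it appears, whether it is
        # minimized (value None) or written in full (value equal to its name).
        present = {name for name, _value in attrs if name in BOOLEAN_ATTRS}
        self.seen.append((tag, present))


collector = BooleanAttrCollector()
collector.feed(
    '<OPTION selected>a</OPTION>'
    '<OPTION selected="selected">b</OPTION>'
    '<OPTION>c</OPTION>'
)
```

In this sketch the first two OPTION elements are both recorded as selected and the third is not; html.parser reports tag and attribute names in lowercase.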
In this section, we begin the specification of HTML 4, starting with the contract between authors, documents, users, and user agents.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.
At times, the authors of this specification recommend good practice for authors and user agents. These recommendations are not normative and conformance with this specification does not depend on their realization. These recommendations contain the expression "We recommend ...", "This specification recommends ...", or some similar wording.
We recommend that authors write documents that conform to the strict DTD rather than the other DTDs defined by this specification. Please see the section on version information for details about the DTDs defined in HTML 4.
A conforming user agent for HTML 4 is one that observes the mandatory conditions ("must") set forth in this specification, including the following points:
However, for recommended error handling behavior, please consult the notes on invalid documents.
User agents should continue to support deprecated elements for reasons of backward compatibility.
Definitions of elements and attributes clearly indicate which are deprecated.
This specification includes examples that illustrate how to avoid using deprecated elements. In most cases these depend on user agent support for style sheets. In general, authors should use style sheets to achieve stylistic and formatting effects rather than HTML presentational attributes. HTML presentational attributes have been deprecated when style sheet alternatives exist (see, for example, [CSS1]).
HTML 4 is an SGML application conforming to International Standard ISO 8879 -- Standard Generalized Markup Language (SGML), defined in [ISO8879].
Examples in the text conform to the strict document type definition unless the example in question refers to elements or attributes only defined by the transitional document type definition or frameset document type definition. For the sake of brevity, most of the examples in this specification do not begin with the document type declaration that is mandatory at the beginning of each HTML document.
DTD fragments in element definitions come from the strict document type definition except for the elements related to frames.
Please consult the section on HTML version information for details about when to use the strict, transitional, or frameset DTD.
Comments appearing in the HTML 4 DTD have no normative value; they are informative only.
User agents must not render SGML processing instructions (e.g., <?full volume>) or comments. For more information about this and other SGML features that may be legal in HTML but aren't widely supported by HTML user agents, please consult the section on SGML features with limited support.
HTML documents are sent over the Internet as a sequence of bytes accompanied by encoding information (described in the section on character encodings). The structure of the transmission, termed a message entity, is defined by [RFC2045] and [RFC2616]. A message entity with a content type of "text/html" represents an HTML document.
The content type for HTML documents is defined as follows:
The optional parameter "charset" refers to the character encoding used to represent the HTML document as a sequence of bytes. Legal values for this parameter are defined in the section on character encodings. Although this parameter is optional, we recommend that it always be present.
In this chapter, we discuss how HTML documents are represented on a computer and over the Internet.
The section on the document character set addresses the issue of what abstract characters may be part of an HTML document. Characters include the Latin letter "A", the Cyrillic letter "I", the Chinese character meaning "water", etc.
The section on character encodings addresses the issue of how those characters may be represented in a file or when transferred over the Internet. As some character encodings cannot directly represent all characters an author may want to include in a document, HTML offers other mechanisms, called character references, for referring to any character.
Since there are a great number of characters throughout human languages, and a great variety of ways to represent those characters, proper care must be taken so that documents may be understood by user agents around the world.
To promote interoperability, SGML requires that each application (including HTML) specify its document character set. A document character set consists of:
Each SGML document (including each HTML document) is a sequence of characters from the repertoire. Computer systems identify each character by its code position; for example, in the ASCII character set, code positions 65, 66, and 67 refer to the characters 'A', 'B', and 'C', respectively.
The ASCII character set is not sufficient for a global information system such as the Web, so HTML uses the much more complete character set called the Universal Character Set (UCS), defined in [ISO10646]. This standard defines a repertoire of thousands of characters used by communities all over the world.
The character set defined in [ISO10646] is character-by-character equivalent to Unicode ([UNICODE]). Both of these standards are updated from time to time with new characters, and the amendments should be consulted at the respective Web sites. In the current specification, "[ISO10646]" is used to refer to the document character set while "[UNICODE]" is reserved for references to the Unicode bidirectional text algorithm.
The document character set, however, does not suffice to allow user agents to correctly interpret HTML documents as they are typically exchanged -- encoded as a sequence of bytes in a file or during a network transmission. User agents must also know the specific character encoding that was used to transform the document character stream into a byte stream.
What this specification calls a character encoding is known by different names in other specifications (which may cause some confusion). However, the concept is largely the same across the Internet. Also, protocol headers, attributes, and parameters referring to character encodings share the same name -- "charset" -- and use the same values from the [IANA] registry (see [CHARSETS] for a complete list).
The "charset" parameter identifies a character encoding, which is a method of converting a sequence of bytes into a sequence of characters. This conversion fits naturally with the scheme of Web activity: servers send HTML documents to user agents as a stream of bytes; user agents interpret them as a sequence of characters. The conversion method can range from simple one-to-one correspondence to complex switching schemes or algorithms.
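The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name decode_document is hypothetical, and the sketch assumes that the "charset" value is a name known to Python's codec registry, which is not identical to the [IANA] registry.

```python
def decode_document(body: bytes, charset: str) -> str:
    """Convert the byte stream of a document into a sequence of characters."""
    return body.decode(charset)


# Three Japanese characters, written as escapes to keep this sketch ASCII-only.
original = "\u65e5\u672c\u8a9e"
raw = original.encode("euc-jp")          # what a server would send
text = decode_document(raw, "EUC-JP")    # what a user agent reconstructs
```

Decoding the same bytes with a different encoding (for example ISO-8859-1) yields a different sequence of characters, which is why the encoding must be labeled correctly.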
A simple one-byte-per-character encoding technique is not sufficient for text strings over a character repertoire as large as [ISO10646]. There are several different encodings of parts of [ISO10646] in addition to encodings of the entire character set (such as UCS-4).
Authoring tools (e.g., text editors) may encode HTML documents in the character encoding of their choice, and the choice largely depends on the conventions used by the system software. These tools may employ any convenient encoding that covers most of the characters contained in the document, provided the encoding is correctly labeled. Occasional characters that fall outside this encoding may still be represented by character references. These always refer to the document character set, not the character encoding.
Servers and proxies may change a character encoding (called transcoding) on the fly to meet the requests of user agents (see section 14.2 of [RFC2616], the "Accept-Charset" HTTP request header). Servers and proxies do not have to serve a document in a character encoding that covers the entire document character set.
Commonly used character encodings on the Web include ISO-8859-1 (also referred to as "Latin-1"; usable for most Western European languages), ISO-8859-5 (which supports Cyrillic), SHIFT_JIS (a Japanese encoding), EUC-JP (another Japanese encoding), and UTF-8 (an encoding of ISO 10646 using a different number of bytes for different characters). Names for character encodings are case-insensitive, so that for example "SHIFT_JIS", "Shift_JIS", and "shift_jis" are equivalent.
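The case-insensitivity of encoding names can be illustrated with a short Python sketch. The helper same_encoding is a hypothetical name; it relies on codecs.lookup from the Python standard library, whose alias table is an assumption of this example and does not necessarily match the [IANA] registry entry for entry.

```python
import codecs


def same_encoding(name_a: str, name_b: str) -> bool:
    """Return True if both labels resolve to the same codec."""
    # codecs.lookup normalizes case, so differently cased labels
    # resolve to the same canonical codec name.
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name
```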
This specification does not mandate which character encodings a user agent must support.
Conforming user agents must correctly map to ISO 10646 all characters in any character encodings that they recognize (or they must behave as if they did).
When HTML text is transmitted in UTF-16 (charset=UTF-16), text data should be transmitted in network byte order ("big-endian", high-order byte first) in accordance with [ISO10646], Section 6.3 and [UNICODE], clause C3, page 3-1.
Furthermore, to maximize chances of proper interpretation, it is recommended that documents transmitted as UTF-16 always begin with a ZERO-WIDTH NON-BREAKING SPACE character (hexadecimal FEFF, also called Byte Order Mark (BOM)) which, when byte-reversed, becomes hexadecimal FFFE, a character guaranteed never to be assigned. Thus, a user-agent receiving a hexadecimal FFFE as the first bytes of a text would know that bytes have to be reversed for the remainder of the text.
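The byte-order check described above might be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name decode_utf16 is hypothetical, and falling back to big-endian when no BOM is present reflects the network byte order recommendation above rather than any mandated behavior.

```python
import codecs


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes, using a leading BOM to choose the byte order."""
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE: byte-reversed BOM
        return data[2:].decode("utf-16-le")
    # No BOM: assume network byte order (big-endian).
    return data.decode("utf-16-be")
```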
The UTF-1 transformation format of [ISO10646] (registered by IANA as ISO-10646-UTF-1), should not be used. For information about ISO 8859-8 and the bidirectional algorithm, please consult the section on bidirectionality and character encoding.
How does a server determine which character encoding applies for a document it serves? Some servers examine the first few bytes of the document, or check against a database of known files and encodings. Many modern servers give Web masters more control over charset configuration than old servers do. Web masters should use these mechanisms to send out a "charset" parameter whenever possible, but should take care not to identify a document with the wrong "charset" parameter value.
How does a user agent know which character encoding has been used? The server should provide this information. The most straightforward way for a server to inform the user agent about the character encoding of the document is to use the "charset" parameter of the "Content-Type" header field of the HTTP protocol ([RFC2616], sections 3.4 and 14.17). For example, the following HTTP header announces that the character encoding is EUC-JP:
Content-Type: text/html; charset=EUC-JP
Please consult the section on conformance for the definition of text/html.
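A user agent might extract the "charset" parameter from such a header along the following lines. This Python sketch is illustrative only: the function name charset_from_content_type is hypothetical, and it uses the standard-library email.message.Message class as a convenient parser for header parameters.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Return the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is given,
    # so no default value is assumed.
    return msg.get_content_charset()
```

Note that the sketch returns None rather than a default when the parameter is missing.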
The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as a default character encoding when the "charset" parameter is absent from the "Content-Type" header field. In practice, this recommendation has proved useless because some servers don't allow a "charset" parameter to be sent, and others may not be configured to send the parameter. Therefore, user agents must not assume any default value for the "charset" parameter.
To address server or configuration limitations, HTML documents may include explicit information about the document's character encoding; the META element can be used to provide user agents with this information.
For example, to specify that the character encoding of the current document is "EUC-JP", a document should include the following META declaration:
<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
The META declaration must only be used when the character encoding is organized such that ASCII-valued bytes stand for ASCII characters (at least until the META element is parsed). META declarations should appear as early as possible in the HEAD element.
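The following Python sketch shows one way to look for such a META declaration before the real encoding is known. It is a rough illustration, not a conforming algorithm: the names MetaCharsetFinder and sniff_meta_charset are hypothetical, and the sketch assumes, as required above, that ASCII-valued bytes stand for ASCII characters, so the leading bytes can be scanned as ASCII.

```python
from email.message import Message
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Remember the charset of the first META Content-Type declaration."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        content = attr.get("content")
        if (attr.get("http-equiv") or "").lower() == "content-type" and content:
            msg = Message()
            msg["Content-Type"] = content
            self.charset = msg.get_content_charset()


def sniff_meta_charset(head_bytes: bytes):
    finder = MetaCharsetFinder()
    # Non-ASCII bytes are replaced; only ASCII markup matters for this scan.
    finder.feed(head_bytes.decode("ascii", errors="replace"))
    return finder.charset
```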
For cases where neither the HTTP protocol nor the META element provides information about the character encoding of a document, HTML also provides the charset attribute on several elements. By combining these mechanisms, an author can greatly improve the chances that, when the user retrieves a resource, the user agent will recognize the character encoding.
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
In addition to this list of priorities, the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents typically have a user-definable, local default character encoding which they apply in the absence of other indicators.
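A priority scheme of this kind can be sketched as a simple fallback chain. The Python example below is an assumption-laden illustration: it takes the order to be the one in which the mechanisms are introduced in the preceding paragraphs (HTTP "charset" parameter, then META declaration, then charset attribute), and the function and parameter names are hypothetical.

```python
def choose_charset(http_charset=None, meta_charset=None,
                   attr_charset=None, local_default=None):
    """Pick a character encoding label, highest-priority source first."""
    # Assumed order: HTTP header, META declaration, charset attribute.
    for candidate in (http_charset, meta_charset, attr_charset):
        if candidate:
            return candidate.lower()
    # Nothing was declared: fall back to a user-definable local default.
    return local_default
```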
User agents may provide a mechanism that allows users to override incorrect "charset" information. However, if a user agent offers such a mechanism, it should only offer it for browsing and not for editing, to avoid the creation of Web pages marked with an incorrect "charset" parameter.
Note. If, for a specific application, it becomes necessary to refer to characters outside [ISO10646], characters should be assigned to a private zone to avoid conflicts with present or future versions of the standard. This is highly discouraged, however, for reasons of portability.
A given character encoding may not be able to express all characters of the document character set. For such encodings, or when hardware or software configurations do not allow users to input some document characters directly, authors may use SGML character references. Character references are a character encoding-independent mechanism for entering any character from the document character set.
Character references in HTML may appear in two forms:
Character references within comments have no special meaning; they are comment data only.
Note. HTML provides other ways to present character data, in particular inline images.
Note. In SGML, it is possible to eliminate the final ";" after a character reference in some cases (e.g., at a line break or immediately before a tag). In other circumstances it may not be eliminated (e.g., in the middle of a word). We strongly suggest using the ";" in all cases to avoid problems with user agents that require this character to be present.
Numeric character references specify the code position of a character in the document character set. Numeric character references may take two forms:
Here are some examples of numeric character references:
Note. Although the hexadecimal representation is not defined in [ISO8879], it is expected to be in the revision, as described in [WEBSGML]. This convention is particularly useful since character standards generally use hexadecimal representations.
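To make the two forms concrete, here is a minimal Python sketch that expands decimal references such as "&#229;" and hexadecimal references such as "&#xE5;" into the characters at those code positions. The function name expand_numeric_refs is hypothetical, and the sketch only handles references that end with ";".

```python
import re

# "&#" followed by either "x"/"X" and hex digits, or decimal digits, then ";".
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def expand_numeric_refs(text: str) -> str:
    """Replace numeric character references by the characters they denote."""
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code)  # code position in the document character set

    return _NUMERIC_REF.sub(_replace, text)
```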
In order to give authors a more intuitive way of referring to characters in the document character set, HTML offers a set of character entity references. Character entity references use symbolic names so that authors need not remember code positions. For example, the character entity reference "&aring;" refers to the lowercase "a" character topped with a ring; "&aring;" is easier to remember than "&#229;".
HTML 4 does not define a character entity reference for every character in the document character set. For instance, there is no character entity reference for the Cyrillic capital letter "I". Please consult the full list of character references defined in HTML 4.
Character entity references are case-sensitive. Thus, "&Aring;" refers to a different character (uppercase A, ring) than "&aring;" (lowercase a, ring).
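The case-sensitivity of entity names can be checked with a short Python sketch. It uses the name2codepoint table from the standard-library html.entities module, which maps HTML 4 entity names to code positions; the helper name entity_to_char is hypothetical.

```python
from html.entities import name2codepoint


def entity_to_char(name: str) -> str:
    """Return the character for an entity name such as "aring" (no & or ;)."""
    # Lookup is case-sensitive: "aring" and "Aring" are different keys.
    return chr(name2codepoint[name])
```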
Four character entity references deserve special mention since they are frequently used to escape special characters:
Authors wishing to put the "<" character in text should use "&lt;" (ASCII decimal 60) to avoid possible confusion with the beginning of a tag (start tag open delimiter). Similarly, authors should use "&gt;" (ASCII decimal 62) in text instead of ">" to avoid problems with older user agents that incorrectly perceive this as the end of a tag (tag close delimiter) when it appears in quoted attribute values.
Authors should use "&amp;" (ASCII decimal 38) instead of "&" to avoid confusion with the beginning of a character reference (entity reference open delimiter). Authors should also use "&amp;" in attribute values since character references are allowed within CDATA attribute values.
Some authors use the character entity reference "&quot;" to encode instances of the double quote mark (") since that character may be used to delimit attribute values.
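An authoring tool might apply these four escapes as in the following Python sketch. It uses html.escape from the standard library; the wrapper name escape_for_html is hypothetical. Note that html.escape also escapes the single quote mark, which goes beyond the four references discussed here.

```python
import html


def escape_for_html(text: str) -> str:
    """Escape "<", ">", "&" and the double quote mark for use in HTML."""
    return html.escape(text, quote=True)
```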
A user agent may not be able to render all characters in a document meaningfully, for instance, because the user agent lacks a suitable font, a character has a value that may not be expressed in the user agent's internal character encoding, etc.
Because there are many different things that may be done in such cases, this document does not prescribe any specific behavior. Depending on the implementation, undisplayable characters may also be handled by the underlying display system and not the application itself. In the absence of more sophisticated behavior, for example tailored to the needs of a particular script or language, we recommend the following behavior for user agents:
This section of the specification describes the basic data types that may appear as an element's content or an attribute's value.
For introductory information about reading the HTML DTD, please consult the SGML tutorial.
Each attribute definition includes information about the case-sensitivity of its values. The case information is presented with the following keys:
If an attribute value is a list, the keys apply to every value in the list, unless otherwise indicated.
The document type definition specifies the syntax of HTML element content and attribute values using SGML tokens (e.g., PCDATA, CDATA, NAME, ID, etc.). See [ISO8879] for their full definitions. The following is a summary of key information:
User agents may ignore leading and trailing white space in CDATA attribute values (e.g., " myval " may be interpreted as "myval"). Authors should not declare attribute values with leading or trailing white space.
For some HTML 4 attributes with CDATA attribute values, the specification imposes further constraints on the set of legal values for the attribute that may not be expressed by the DTD.
Although the STYLE and SCRIPT elements use CDATA for their data model, for these elements, CDATA must be handled differently by user agents. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as terminating the element's content. In valid documents, this would be the end tag for the element.
A number of attributes ( %Text; in the DTD) take text that is meant to be "human readable". For introductory information about attributes, please consult the tutorial discussion of attributes.
This specification uses the term URI as defined in [URI] (see also [RFC1630]).
Note that URIs include URLs (as defined in [RFC1738] and [RFC1808]).
Relative URIs are resolved to full URIs using a base URI. [RFC1808], section 3, defines the normative algorithm for this process. For more information about base URIs, please consult the section on base URIs in the chapter on links.
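As a rough illustration of resolving a relative URI against a base URI, the following Python sketch uses urllib.parse.urljoin from the standard library. This is an assumption of the example: urljoin follows a later URI specification and may differ from the [RFC1808] algorithm in edge cases. The function name resolve and the example.org addresses are hypothetical.

```python
from urllib.parse import urljoin


def resolve(base_uri: str, relative_uri: str) -> str:
    """Resolve a relative URI to a full URI using the given base URI."""
    return urljoin(base_uri, relative_uri)


base = "http://www.example.org/products/intro.html"  # hypothetical base URI
```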
URIs are represented in the DTD by the parameter entity %URI;.
URIs in general are case-sensitive. There may be URIs, or parts of URIs, where case doesn't matter (e.g., machine names), but identifying these may not be easy. Users should always consider that URIs are case-sensitive (to be on the safe side).
Please consult the appendix for information about non-ASCII characters in URI attribute values.
The attribute value type "color" (%Color;) refers to color definitions as specified in [SRGB]. A color value may either be a hexadecimal number (prefixed by a hash mark) or one of the following sixteen color names. The color names are case-insensitive.
Black = "#000000" | Green = "#008000"
Silver = "#C0C0C0" | Lime = "#00FF00"
Gray = "#808080" | Olive = "#808000"
White = "#FFFFFF" | Yellow = "#FFFF00"
Maroon = "#800000" | Navy = "#000080"
Red = "#FF0000" | Blue = "#0000FF"
Purple = "#800080" | Teal = "#008080"
Fuchsia = "#FF00FF" | Aqua = "#00FFFF"
Thus, the color values "#800080" and "Purple" both refer to the color purple.
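The equivalence of names and hexadecimal values can be sketched in Python as follows. The function name parse_color and the choice of returning a (red, green, blue) tuple are assumptions of this example; the sketch accepts only six-digit hexadecimal values and the sixteen names listed above.

```python
import re

COLOR_NAMES = {
    "black": "#000000", "green": "#008000", "silver": "#C0C0C0",
    "lime": "#00FF00", "gray": "#808080", "olive": "#808000",
    "white": "#FFFFFF", "yellow": "#FFFF00", "maroon": "#800000",
    "navy": "#000080", "red": "#FF0000", "blue": "#0000FF",
    "purple": "#800080", "teal": "#008080", "fuchsia": "#FF00FF",
    "aqua": "#00FFFF",
}

_HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def parse_color(value: str):
    """Return (red, green, blue) for a color name or "#RRGGBB" value."""
    # Color names are case-insensitive, so compare in lowercase.
    hex_value = COLOR_NAMES.get(value.strip().lower(), value.strip())
    if not _HEX_COLOR.fullmatch(hex_value):
        raise ValueError("not a recognized color value: %r" % value)
    return tuple(int(hex_value[i:i + 2], 16) for i in (1, 3, 5))
```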
Although colors can add significant amounts of information to documents and make them more readable, please consider the following guidelines when including color in your documents:
HTML specifies three types of length values for attributes:
Length values are case-neutral.
Note. A "media type" (defined in [RFC2045] and [RFC2046]) specifies the nature of a linked resource. This specification employs the term "content type" rather than "media type" in accordance with current usage. Furthermore, in this specification, "media type" may refer to the media where a user agent renders a document.
This type is represented in the DTD by %ContentType;.
Content types are case-insensitive.
Examples of content types include "text/html", "image/png", "image/gif", "video/mpeg", "text/css", and "audio/basic". For the current list of registered MIME types, please consult [MIMETYPES].
The value of attributes whose type is a language code ( %LanguageCode in the DTD) refers to a language code as specified by [RFC1766], section 2. For information on specifying language codes in HTML, please consult the section on language codes. Whitespace is not allowed within the language-code.
Language codes are case-insensitive.
The "charset" attributes (%Charset in the DTD) refer to a character encoding as described in the section on character encodings. Values must be strings (e.g., "euc-jp") from the IANA registry (see [CHARSETS] for a complete list).
Names of character encodings are case-insensitive.
User agents must follow the steps set out in the section on specifying character encodings in order to determine the character encoding of an external resource.
Certain attributes call for a single character from the document character set. These attributes take the %Character type in the DTD.
Single characters may be specified with character references (e.g., "&amp;").
[ISO8601] allows many options and variations in the representation of dates and times. The current specification uses one of the formats described in the profile [DATETIME] for its definition of legal date/time strings ( %Datetime in the DTD).
The format is:
YYYY-MM-DDThh:mm:ssTZD
where:
YYYY = four-digit year
MM = two-digit month (01=January, etc.)
DD = two-digit day of month (01 through 31)
hh = two digits of hour (00 through 23) (am/pm NOT allowed)
mm = two digits of minute (00 through 59)
ss = two digits of second (00 through 59)
TZD = time zone designator
The time zone designator is one of:
Exactly the components shown here must be present, with exactly this punctuation. Note that the "T" appears literally in the string (it must be uppercase), to indicate the beginning of the time element, as specified in [ISO8601].
If a generating application does not know the time to the second, it may use the value "00" for the seconds (and minutes and hours if necessary).
Note. [DATETIME] does not address the issue of leap seconds.
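A strict check of this format might look like the following Python sketch. It rests on an assumption not spelled out in the text above: that the time zone designator is either "Z" or an offset of the form +hh:mm or -hh:mm, as in the [DATETIME] profile. The function name parse_html_datetime is hypothetical, and the sketch requires Python 3.7 or later for its handling of "Z" and colon-separated offsets.

```python
import re
from datetime import datetime

# Exactly the components shown above, with an uppercase "T".
# Assumed TZD forms: "Z", "+hh:mm" or "-hh:mm".
_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})"
)


def parse_html_datetime(value: str) -> datetime:
    """Parse a YYYY-MM-DDThh:mm:ssTZD string into an aware datetime."""
    if not _DATETIME.fullmatch(value):
        raise ValueError("not in YYYY-MM-DDThh:mm:ssTZD form: %r" % value)
    # strptime additionally rejects out-of-range fields such as month 13.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
```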
Authors may use the following recognized link types, listed here with their conventional interpretations. In the DTD, %LinkTypes refers to a space-separated list of link types. White space characters are not permitted within link types.
These link types are case-insensitive, i.e., "Alternate" has the same meaning as "alternate".
User agents, search engines, etc. may interpret these link types in a variety of ways. For example, user agents may provide access to linked documents through a navigation bar.
Authors may wish to define additional link types not described in this specification. If they do so, they should use a profile to cite the conventions used to define the link types. Please see the profile attribute of the HEAD element for more details.
For further discussions about link types, please consult the section on links in HTML documents.
The following is a list of recognized media descriptors ( %MediaDesc in the DTD).
Future versions of HTML may introduce new values and may allow parameterized values. To facilitate the introduction of these extensions, conforming user agents must be able to parse the media attribute value as follows:
media="screen, 3d-glasses, print and resolution > 90dpi"
is mapped to:
"screen" "3d-glasses" "print and resolution > 90dpi"
"screen" "3d-glasses" "print"
Note. Style sheets may include media-dependent variations within them (e.g., the CSS @media construct). In such cases it may be appropriate to use "media=all".
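The mapping shown in the media example above can be sketched in Python. This is an illustration under an assumption inferred from that example: the value is split on commas, and each entry is then truncated just before the first character that is not a US ASCII letter, digit, or hyphen. The function name parse_media is hypothetical.

```python
import re

# Leading run of US ASCII letters, digits and hyphens.
_LEADING_NAME = re.compile(r"[A-Za-z0-9-]*")


def parse_media(value: str):
    """Split a media attribute value into media descriptor names."""
    entries = [entry.strip() for entry in value.split(",")]
    # Truncate each entry before the first character outside the allowed set.
    return [_LEADING_NAME.match(entry).group(0) for entry in entries]
```

Applied to the example value, the sketch yields the three names "screen", "3d-glasses" and "print".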
Script data ( %Script; in the DTD) can be the content of the SCRIPT element and the value of intrinsic event attributes. User agents must not evaluate script data as HTML markup but instead must pass it on as data to a script engine.
The case-sensitivity of script data depends on the scripting language.
Please note that script data that is element content may not contain character references, but script data that is the value of an attribute may contain them. The appendix provides further information about specifying non-HTML data.
Style sheet data (%StyleSheet; in the DTD) can be the content of the STYLE element and the value of the style attribute. User agents must not evaluate style data as HTML markup.
The case-sensitivity of style data depends on the style sheet language.
Please note that style sheet data that is element content may not contain character references, but style sheet data that is the value of an attribute may contain them. The appendix provides further information about specifying non-HTML data.
Except for the reserved names listed below, frame target names (%FrameTarget; in the DTD) must begin with an alphabetic character (a-zA-Z). User agents should ignore all other target names.
The following target names are reserved and have special meanings.
An HTML 4 document is composed of three parts:
White space (spaces, newlines, tabs, and comments) may appear before or after each section. Sections 2 and 3 should be delimited by the HTML element.
Here's an example of a simple HTML document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>My first HTML document</TITLE>
   </HEAD>
   <BODY>
      <P>Hello world!
   </BODY>
</HTML>
A valid HTML document declares what version of HTML is used in the document. The document type declaration names the document type definition (DTD) in use for the document (see [ISO8879]).
HTML 4.01 specifies three DTDs, so authors must include one of the following document type declarations in their documents. The DTDs vary in the elements they support.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
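A tool that needs to know which of the three DTDs a document declares might inspect the public identifier, as in this Python sketch. The function name dtd_flavor and the labels "strict", "transitional" and "frameset" used as return values are choices made for this example; the regular expression is a simplification and not a full SGML parser.

```python
import re

PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN": "strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "frameset",
}

_DOCTYPE = re.compile(r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def dtd_flavor(document: str):
    """Return the declared HTML 4.01 DTD flavor, or None if unrecognized."""
    match = _DOCTYPE.search(document)
    if match is None:
        return None
    return PUBLIC_IDS.get(match.group(1))
```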
The URI in each document type declaration allows user agents to download the DTD and any entity sets that are needed. The following (relative) URIs refer to DTDs and entity sets for HTML 4:
The binding between public identifiers and files can be specified using a catalog file following the format recommended by the Oasis Open Consortium (see [OASISOPEN]). A sample catalog file for HTML 4.01 is included at the beginning of the section on SGML reference information for HTML. The last two letters of the declaration indicate the language of the DTD. For HTML, this is always English ("EN").
Note. As of the 24 December version of HTML 4.01, the HTML Working Group commits to the following policy:
This means that in a document type declaration, authors may safely use a system identifier that refers to the latest version of an HTML 4 DTD. Authors may also choose to use a system identifier that refers to a specific (dated) version of an HTML 4 DTD when validation to that particular DTD is required. W3C will make every effort to make archival documents indefinitely available at their original address in their original form.
<!ENTITY % html.content "HEAD, BODY">
<!ELEMENT HTML O O (%html.content;)    -- document root element -->
<!ATTLIST HTML
  %i18n;                               -- lang, dir --
  >
Start tag: optional, End tag: optional
Attribute definitions
Attributes defined elsewhere
After the document type declaration, the remainder of an HTML document is contained by the HTML element. Thus, a typical HTML document has this structure:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
...The head, body, etc. goes here...
</HTML>
<!-- %head.misc; defined earlier on as "SCRIPT|STYLE|META|LINK|OBJECT" -->
<!ENTITY % head.content "TITLE & BASE?">
<!ELEMENT HEAD O O (%head.content;) +(%head.misc;) -- document head -->
<!ATTLIST HEAD
  %i18n;                               -- lang, dir --
  profile     %URI;          #IMPLIED  -- named dictionary of meta info --
  >
Start tag: optional, End tag: optional
Attribute definitions
Attributes defined elsewhere
The HEAD element contains information about the current document, such as its title, keywords that may be useful to search engines, and other data that is not considered document content. User agents do not generally render elements that appear in the HEAD as content. They may, however, make information in the HEAD available to users through other mechanisms.
<!-- The TITLE element is not considered part of the flow of text.
     It should be displayed, for example as the page header or
     window title. Exactly one title is required per document.
  -->
<!ELEMENT TITLE - - (#PCDATA) -(%head.misc;) -- document title -->
<!ATTLIST TITLE %i18n>
Start tag: required, End tag: required
Attributes defined elsewhere
Every HTML document must have a TITLE element in the HEAD section.
Authors should use the TITLE element to identify the contents of a document. Since users often consult documents out of context, authors should provide context-rich titles. Thus, instead of a title such as "Introduction", which doesn't provide much contextual background, authors should supply a title such as "Introduction to Medieval Bee-Keeping" instead.
For reasons of accessibility, user agents must always make the content of the TITLE element available to users (including TITLE elements that occur in frames). The mechanism for doing so depends on the user agent (e.g., as a caption, spoken).
Titles may contain character entities (for accented characters, special characters, etc.), but may not contain other markup (including comments). Here is a sample document title:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <TITLE>A study of population dynamics</TITLE> ... other head elements... </HEAD> <BODY> ... document body... </BODY> </HTML>
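As an illustration only, a checking tool might verify the "exactly one TITLE" requirement and recover the title text as sketched below. The sketch uses Python's standard html.parser module; the names TitleCollector, find_titles, and sample are hypothetical and are not defined by this specification.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the text of every TITLE element in a document."""

    def __init__(self):
        super().__init__()
        self.titles = []        # one string per TITLE element found
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":      # html.parser reports tag names in lower case
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


def find_titles(html_text):
    collector = TitleCollector()
    collector.feed(html_text)
    collector.close()
    return collector.titles


# Hypothetical sample document modeled on the example above.
sample = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>A study of population dynamics</TITLE></HEAD>
<BODY><P>...document body...</BODY></HTML>"""

titles = find_titles(sample)
print(titles)                                  # ['A study of population dynamics']
print("exactly one TITLE:", len(titles) == 1)  # exactly one TITLE: True
```

A document for which find_titles returns an empty list, or more than one entry, would not satisfy the requirement stated above.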
Attribute definitions
Unlike the TITLE element, which provides information about an entire document and may only appear once, the title attribute may annotate any number of elements. Please consult an element's definition to verify that it supports this attribute.
Values of the title attribute may be rendered by user agents in a variety of ways. For instance, visual browsers frequently display the title as a "tool tip" (a short message that appears when the pointing device pauses over an object). Audio user agents may speak the title information in a similar context. For example, setting the attribute on a link allows user agents (visual and non-visual) to tell users about the nature of the linked resource:
...some text... Here's a photo of <A href="http://someplace.com/neatstuff.gif" title="Me scuba diving"> me scuba diving last summer </A> ...some more text...
The title attribute has an additional role when used with the LINK element to designate an external style sheet. Please consult the section on links and style sheets for details.
Note. To improve the quality of speech synthesis for cases handled poorly by standard techniques, future versions of HTML may include an attribute for encoding phonemic and prosodic information.
Note. The W3C Resource Description Framework (see [RDF10]) became a W3C Recommendation in February 1999. RDF allows authors to specify machine-readable metadata about HTML documents and other network-accessible resources.
HTML lets authors specify meta data -- information about a document rather than document content -- in a variety of ways.
For example, to specify the author of a document, one may use the META element as follows:
<META name="Author" content="Dave Raggett">
The META element specifies a property (here "Author") and assigns a value to it (here "Dave Raggett").
This specification does not define a set of legal meta data properties. The meaning of a property and the set of legal values for that property should be defined in a reference lexicon called a profile. For example, a profile designed to help search engines index documents might define properties such as "author", "copyright", "keywords", etc.
In general, specifying meta data involves two steps:
Note that since a profile is defined for the HEAD element, the same profile applies to all META and LINK elements in the document head.
User agents are not required to support meta data mechanisms. For those that choose to support meta data, this specification does not define how meta data should be interpreted.
<!ELEMENT META - O EMPTY -- generic metainformation --> <!ATTLIST META %i18n; -- lang, dir, for use with content -- http-equiv NAME #IMPLIED -- HTTP response header name -- name NAME #IMPLIED -- metainformation name -- content CDATA #REQUIRED -- associated information -- scheme CDATA #IMPLIED -- select form of content -- >
Start tag: required, End tag: forbidden
Attribute definitions
For the following attributes, the permitted values and their interpretation are profile dependent:
Attributes defined elsewhere
The META element can be used to identify properties of a document (e.g., author, expiration date, a list of key words, etc.) and assign values to those properties. This specification does not define a normative set of properties.
Each META element specifies a property/value pair. The name attribute identifies the property and the content attribute specifies the property's value.
For example, the following declaration sets a value for the Author property:
<META name="Author" content="Dave Raggett">
The lang attribute can be used with META to specify the language for the value of the content attribute. This enables speech synthesizers to apply language dependent pronunciation rules.
In this example, the author's name is declared to be French:
<META name="Author" lang="fr" content="Arnaud Le Hors">
Note. The META element is a generic mechanism for specifying meta data. However, some HTML elements and attributes already handle certain pieces of meta data and may be used by authors instead of META to specify those pieces: the TITLE element, the ADDRESS element, the INS and DEL elements, the title attribute, and the cite attribute.
Note. When a property specified by a META element takes a value that is a URI, some authors prefer to specify the meta data via the LINK element. Thus, the following meta data declaration:
<META name="DC.identifier" content="http://www.ietf.org/rfc/rfc1866.txt">
might also be written:
<LINK rel="DC.identifier" type="text/plain" href="http://www.ietf.org/rfc/rfc1866.txt">
The http-equiv attribute can be used in place of the name attribute and has a special significance when documents are retrieved via the Hypertext Transfer Protocol (HTTP). HTTP servers may use the property name specified by the http-equiv attribute to create an [RFC822]-style header in the HTTP response. Please see the HTTP specification ([RFC2616]) for details on valid HTTP headers.
The following sample META declaration:
<META http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">
will result in the HTTP header:
Expires: Tue, 20 Aug 1996 14:25:27 GMT
This can be used by caches to determine when to fetch a fresh copy of the associated document.
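The mapping from http-equiv declarations to header lines can be sketched as follows. This is a minimal illustration, assuming a server that scans the document with Python's standard html.parser module; the names HttpEquivCollector and http_equiv_headers are hypothetical, and real servers are not required to behave this way.

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collects (header name, value) pairs from META http-equiv declarations."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)            # attribute names arrive in lower case
        name = attributes.get("http-equiv")
        value = attributes.get("content")
        if name and value is not None:
            self.headers.append((name, value))


def http_equiv_headers(html_text):
    """Return one 'Name: value' header line per META http-equiv declaration."""
    collector = HttpEquivCollector()
    collector.feed(html_text)
    collector.close()
    return [f"{name}: {value}" for name, value in collector.headers]


document = '<META http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">'
for line in http_equiv_headers(document):
    print(line)  # Expires: Tue, 20 Aug 1996 14:25:27 GMT
```

META elements that use the name attribute instead of http-equiv are ignored by this sketch, since they do not correspond to HTTP headers.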
Note. Some user agents support the use of META to refresh the current page after a specified number of seconds, with the option of replacing it by a different URI. Authors should not use this technique to forward users to different pages, as this makes the page inaccessible to some users. Instead, automatic page forwarding should be done using server-side redirects.
A common use for META is to specify keywords that a search engine may use to improve the quality of search results. When several META elements provide language-dependent information about a document, search engines may filter on the lang attribute to display search results using the language preferences of the user. For example,
<!-- For speakers of US English --> <META name="keywords" lang="en-us" content="vacation, Greece, sunshine"> <!-- For speakers of British English --> <META name="keywords" lang="en" content="holiday, Greece, sunshine"> <!-- For speakers of French --> <META name="keywords" lang="fr" content="vacances, Grèce, soleil">
The effectiveness of search engines can also be increased by using the LINK element to specify links to translations of the document in other languages, links to versions of the document in other media (e.g., PDF), and, when the document is part of a collection, links to an appropriate starting point for browsing the collection.
Further help is provided in the section on helping search engines index your Web site.
This example illustrates how one can use a META declaration to include a PICS 1.1 label:
<HEAD> <META http-equiv="PICS-Label" content=' (PICS-1.1 "http://www.gcf.org/v2.5" labels on "1994.11.05T08:15-0500" until "1995.12.31T23:59-0000" for "http://w3.org/PICS/Overview.html" ratings (suds 0.5 density 0 color/hue 1)) '> <TITLE>... document title ...</TITLE> </HEAD>
The META element may be used to specify the default information for a document in the following instances:
The following example specifies the character encoding for a document as being ISO-8859-5:
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-5">
This example refers to a hypothetical profile that defines useful properties for document indexing. The properties defined by this profile -- including "author", "copyright", "keywords", and "date" -- have their values set by subsequent META declarations.
<HEAD profile="http://www.acme.com/profiles/core"> <TITLE>How to complete Memorandum cover sheets</TITLE> <META name="author" content="John Doe"> <META name="copyright" content="© 1997 Acme Corp."> <META name="keywords" content="corporate,guidelines,cataloging"> <META name="date" content="1994-11-06T08:49:37+00:00"> </HEAD>
As this specification is being written, it is common practice to use the date formats described in [RFC2616], section 3.3. As these formats are relatively hard to process, we recommend that authors use the [ISO8601] date format. For more information, see the sections on the INS and DEL elements.
The scheme attribute allows authors to provide user agents more context for the correct interpretation of meta data. At times, such additional information may be critical, as when meta data may be specified in different formats. For example, an author might specify a date in the (ambiguous) format "10-9-97"; does this mean 9 October 1997 or 10 September 1997? The scheme attribute value "Month-Day-Year" would disambiguate this date value.
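The effect of the scheme value on the ambiguous date can be sketched as follows. This specification does not define scheme values, so the "Day-Month-Year" entry, the SCHEME_PATTERNS table, and the function name parse_dated_content are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical mapping from scheme attribute values to strptime patterns.
# Only "Month-Day-Year" appears in the text above; the other entry is assumed.
SCHEME_PATTERNS = {
    "Month-Day-Year": "%m-%d-%y",
    "Day-Month-Year": "%d-%m-%y",
}


def parse_dated_content(content, scheme):
    """Interpret a META content value as a date according to its scheme."""
    return datetime.strptime(content, SCHEME_PATTERNS[scheme]).date()


print(parse_dated_content("10-9-97", "Month-Day-Year"))  # 1997-10-09
print(parse_dated_content("10-9-97", "Day-Month-Year"))  # 1997-09-10
```

The same content string yields two different dates, which is why a user agent that lacks the scheme information cannot interpret the value reliably.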
At other times, the scheme attribute may provide helpful but non-critical information to user agents.
For example, the following scheme declaration may help a user agent determine that the value of the "identifier" property is an ISBN code number:
<META scheme="ISBN" name="identifier" content="0-8230-2355-9">
Values for the scheme attribute depend on the property name and the associated profile.
Note. One sample profile is the Dublin Core (see [DCORE]). This profile defines a set of recommended properties for electronic bibliographic descriptions, and is intended to promote interoperability among disparate description models.
<!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL) -- document body --> <!ATTLIST BODY %attrs; -- %coreattrs, %i18n, %events -- onload %Script; #IMPLIED -- the document has been loaded -- onunload %Script; #IMPLIED -- the document has been removed -- >
Start tag: optional, End tag: optional
Attribute definitions
Attributes defined elsewhere
The body of a document contains the document's content. The content may be presented by a user agent in a variety of ways. For example, for visual browsers, you can think of the body as a canvas where the content appears: text, images, colors, graphics, etc. For audio user agents, the same content may be spoken. Since style sheets are now the preferred way to specify a document's presentation, the presentational attributes of BODY have been deprecated.
DEPRECATED EXAMPLE:
The following HTML fragment illustrates the use of the deprecated attributes. It sets the background color of the canvas to white, the text foreground color to black, and the color of hyperlinks to red initially, fuchsia when activated, and maroon once visited.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML> <HEAD> <TITLE>A study of population dynamics</TITLE> </HEAD> <BODY bgcolor="white" text="black" link="red" alink="fuchsia" vlink="maroon"> ... document body... </BODY> </HTML>
Using style sheets, the same effect could be accomplished as follows:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <TITLE>A study of population dynamics</TITLE> <STYLE type="text/css"> BODY { background: white; color: black} A:link { color: red } A:visited { color: maroon } A:active { color: fuchsia } </STYLE> </HEAD> <BODY> ... document body... </BODY> </HTML>
Using external (linked) style sheets gives you the flexibility to change the presentation without revising the source HTML document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <TITLE>A study of population dynamics</TITLE> <LINK rel="stylesheet" type="text/css" href="smartstyle.css"> </HEAD> <BODY> ... document body... </BODY> </HTML>
Framesets and HTML bodies. Documents that contain framesets replace the BODY element by the FRAMESET element. Please consult the section on frames for more information.
Attribute definitions
<P id="myparagraph"> This is a uniquely named paragraph.</P> <P id="yourparagraph"> This is also a uniquely named paragraph.</P>
The id attribute has several roles in HTML:
The class attribute, on the other hand, assigns one or more class names to an element; the element may be said to belong to these classes. A class name may be shared by several element instances. The class attribute has several roles in HTML:
In the following example, the SPAN element is used in conjunction with the id and class attributes to mark up document messages. Messages appear in both English and French versions.
<!-- English messages --> <P><SPAN id="msg1" class="info" lang="en">Variable declared twice</SPAN> <P><SPAN id="msg2" class="warning" lang="en">Undeclared variable</SPAN> <P><SPAN id="msg3" class="error" lang="en">Bad syntax for variable name</SPAN>
<!-- French messages --> <P><SPAN id="msg1" class="info" lang="fr">Variable déclarée deux fois</SPAN> <P><SPAN id="msg2" class="warning" lang="fr">Variable indéfinie</SPAN> <P><SPAN id="msg3" class="error" lang="fr">Erreur de syntaxe pour variable</SPAN>
The following CSS style rules would tell visual user agents to display informational messages in green, warning messages in yellow, and error messages in red:
SPAN.info { color: green } SPAN.warning { color: yellow } SPAN.error { color: red }
Note that the French "msg1" and the English "msg1" may not appear in the same document since they share the same id value. Authors may make further use of the id attribute to refine the presentation of individual messages, make them target anchors, etc.
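A validation tool might detect such a conflict as sketched below. This is an illustration only, assuming Python's standard html.parser module; the names IdCollector, duplicate_ids, and sample are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Records the value of every id attribute in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html_text):
    """Return the id values that occur more than once, sorted."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical document that wrongly combines both language versions.
sample = """
<P><SPAN id="msg1" class="info" lang="en">Variable declared twice</SPAN>
<P><SPAN id="msg2" class="warning" lang="en">Undeclared variable</SPAN>
<P><SPAN id="msg1" class="info" lang="fr">Variable déclarée deux fois</SPAN>
"""

print(duplicate_ids(sample))  # ['msg1']
```

The class values "info" and "warning" may repeat freely; only the id values are required to be unique within a document.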
Almost every HTML element may be assigned identifier and class information.
Suppose, for example, that we are writing a document about a programming language. The document is to include a number of preformatted examples. We use the PRE element to format the examples. We also assign a background color (green) to all instances of the PRE element belonging to the class "example".
<HEAD> <TITLE>... document title ...</TITLE> <STYLE type="text/css"> PRE.example { background : green } </STYLE> </HEAD> <BODY> <PRE class="example" id="example-1"> ...example code here... </PRE> </BODY>
By setting the id attribute for this example, we can (1) create a hyperlink to it and (2) override class style information with instance style information.
Note. The id attribute shares the same name space as the name attribute when used for anchor names. Please consult the section on anchors with id for more information.
Certain HTML elements that may appear in BODY are said to be "block-level" while others are "inline" (also known as "text level"). The distinction is founded on several notions:
Style sheets provide the means to specify the rendering of arbitrary elements, including whether an element is rendered as block or inline. In some cases, such as an inline style for list elements, this may be appropriate, but generally speaking, authors are discouraged from overriding the conventional interpretation of HTML elements in this way.
The alteration of the traditional presentation idioms for block level and inline elements also has an impact on the bidirectional text algorithm. See the section on the effect of style sheets on bidirectionality for more information.
<!ELEMENT DIV - - (%flow;)* -- generic language/style container --> <!ATTLIST DIV %attrs; -- %coreattrs, %i18n, %events -- > <!ELEMENT SPAN - - (%inline;)* -- generic language/style container --> <!ATTLIST SPAN %attrs; -- %coreattrs, %i18n, %events -- >
Start tag: required, End tag: required
Attributes defined elsewhere
The DIV and SPAN elements, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. These elements define content to be inline (SPAN) or block-level (DIV) but impose no other presentational idioms on the content. Thus, authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes.
Suppose, for example, that we wanted to generate an HTML document based on a database of client information. Since HTML does not include elements that identify objects such as "client", "telephone number", "email address", etc., we use DIV and SPAN to achieve the desired structural and presentational effects. We might use the TABLE element as follows to structure the information:
<!-- Example of data from the client database: --> <!-- Name: Stephane Boyera, Tel: (212) 555-1212, Email: sb@foo.org --> <DIV id="client-boyera" class="client"> <P><SPAN class="client-title">Client information:</SPAN> <TABLE class="client-data"> <TR><TH>Last name:<TD>Boyera</TR> <TR><TH>First name:<TD>Stephane</TR> <TR><TH>Tel:<TD>(212) 555-1212</TR> <TR><TH>Email:<TD>sb@foo.org</TR> </TABLE> </DIV> <DIV id="client-lafon" class="client"> <P><SPAN class="client-title">Client information:</SPAN> <TABLE class="client-data"> <TR><TH>Last name:<TD>Lafon</TR> <TR><TH>First name:<TD>Yves</TR> <TR><TH>Tel:<TD>(617) 555-1212</TR> <TR><TH>Email:<TD>yves@coucou.com</TR> </TABLE> </DIV>
Later, we may easily add style sheet declarations to fine tune the presentation of these database entries.
For another example of usage, please consult the example in the section on the class and id attributes.
Visual user agents generally place a line break before and after DIV elements, for instance:
<P>aaaaaaaaa<DIV>bbbbbbbbb</DIV><DIV>ccccc<P>ccccc</DIV>
which is typically rendered as:
aaaaaaaaa
bbbbbbbbb
ccccc

ccccc
<!ENTITY % heading "H1|H2|H3|H4|H5|H6"> <!-- There are six levels of headings from H1 (the most important) to H6 (the least important). --> <!ELEMENT (%heading;) - - (%inline;)* -- heading --> <!ATTLIST (%heading;) %attrs; -- %coreattrs, %i18n, %events -- >
Start tag: required, End tag: required
Attributes defined elsewhere
A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.
There are six levels of headings in HTML with H1 as the most important and H6 as the least. Visual browsers usually render more important headings in larger fonts than less important ones.
The following example shows how to use the DIV element to associate a heading with the document section that follows it. Doing so allows you to define a style for the section (color the background, set the font, etc.) with style sheets.
<DIV class="section" id="forest-elephants" > <H1>Forest elephants</H1> <P>In this section, we discuss the lesser known forest elephants. ...this section continues... <DIV class="subsection" id="forest-habitat" > <H2>Habitat</H2> <P>Forest elephants do not live in trees but among them. ...this subsection continues... </DIV> </DIV>
This structure may be decorated with style information such as:
<HEAD> <TITLE>... document title ...</TITLE> <STYLE type="text/css"> DIV.section { text-align: justify; font-size: 12pt} DIV.subsection { text-indent: 2em } H1 { font-style: italic; color: green } H2 { color: green } </STYLE> </HEAD>
Numbered sections and references
HTML does not itself cause section numbers to be generated from headings. This facility may be offered by user agents, however. Soon, style sheet languages such as CSS will allow authors to control the generation of section numbers (handy for forward references in printed documents, as in "See section 7.2").
Some people consider skipping heading levels to be bad practice. They accept H1 H2 H1 while they do not accept H1 H3 H1 since the heading level H2 is skipped.
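The distinction can be sketched as a simple check over the sequence of heading levels in a document. The function name skipped_heading_levels is hypothetical, and the rule it encodes (a heading may descend by at most one level at a time) is one reading of the practice described above, not a requirement of this specification.

```python
def skipped_heading_levels(levels):
    """Return (index, previous, current) for each heading that descends
    more than one level below the heading that precedes it."""
    problems = []
    for index in range(1, len(levels)):
        previous, current = levels[index - 1], levels[index]
        if current > previous + 1:
            problems.append((index, previous, current))
    return problems


print(skipped_heading_levels([1, 2, 1]))  # []
print(skipped_heading_levels([1, 3, 1]))  # [(1, 1, 3)]
```

Returning from a deeper level to a shallower one (H2 followed by H1, or H3 followed by H1) is never reported, since no level is skipped on the way back up.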
<!ELEMENT ADDRESS - - (%inline;)* -- information on author --> <!ATTLIST ADDRESS %attrs; -- %coreattrs, %i18n, %events -- >
Start tag: required, End tag: required
Attributes defined elsewhere
The ADDRESS element may be used by authors to supply contact information for a document or a major part of a document such as a form. This element often appears at the beginning or end of a document.
For example, a page at the W3C Web site related to HTML might include the following contact information:
<ADDRESS> <A href="../People/Raggett/">Dave Raggett</A>, <A href="../People/Arnaud/">Arnaud Le Hors</A>, contact persons for the <A href="Activity">W3C HTML Activity</A><BR> $Date: 1999/12/24 23:37:50 $ </ADDRESS>
This section of the document discusses two important issues that affect the internationalization of HTML: specifying the language (the lang attribute) and direction (the dir attribute) of text in a document.
Language information specified via the lang attribute may be used by a user agent to control rendering in a variety of ways. Some situations where author-supplied language information may be helpful include:
The lang attribute specifies the language of element content and attribute values; whether it is relevant for a given attribute depends on the syntax and semantics of the attribute and the operation involved.
The intent of the lang attribute is to allow user agents to render content more meaningfully based on accepted cultural practice for a given language. This does not imply that user agents should render characters that are atypical for a particular language in less meaningful ways; user agents must make a best attempt to render all characters, regardless of the value specified by lang.
For instance, if characters from the Greek alphabet appear in the midst of English text:
<P><Q lang="en">Her super-powers were the result of γ-radiation,</Q> he explained.</P>
a user agent (1) should try to render the English content in an appropriate manner (e.g., in its handling of the quotation marks) and (2) must make a best attempt to render γ even though it is not an English character.
Please consult the section on undisplayable characters for related information.
The lang attribute's value is a language code that identifies a natural language spoken, written, or otherwise used for the communication of information among people. Computer languages are explicitly excluded from language codes.
[RFC1766] defines and explains the language codes that must be used in HTML documents.
Briefly, language codes consist of a primary code and a possibly empty series of subcodes:
language-code = primary-code ( "-" subcode )*
Here are some sample language codes:
Two-letter primary codes are reserved for [ISO639] language abbreviations. Two-letter codes include fr (French), de (German), it (Italian), nl (Dutch), el (Greek), es (Spanish), pt (Portuguese), ar (Arabic), he (Hebrew), ru (Russian), zh (Chinese), ja (Japanese), hi (Hindi), ur (Urdu), and sa (Sanskrit).
Any two-letter subcode is understood to be an [ISO3166] country code.
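The structure of a language code can be sketched as follows. The function names are hypothetical, and the second function tests only the shape of a subcode; it does not check the subcode against the actual [ISO3166] list.

```python
def parse_language_code(code):
    """Split a language code into its primary code and a list of subcodes."""
    primary, *subcodes = code.split("-")
    return primary, subcodes


def looks_like_country_subcode(subcode):
    """True for two-letter subcodes, which are understood as country codes."""
    return len(subcode) == 2 and subcode.isalpha()


print(parse_language_code("fr"))          # ('fr', [])
print(parse_language_code("en-US"))       # ('en', ['US'])
print(parse_language_code("en-cockney"))  # ('en', ['cockney'])
print(looks_like_country_subcode("US"))       # True
print(looks_like_country_subcode("cockney"))  # False
```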
An element inherits language code information according to the following order of precedence (highest to lowest):
Content-Language: en-cockney
In this example, the primary language of the document is French ("fr"). One paragraph is declared to be in Spanish ("es"), after which the primary language returns to French. The following paragraph includes an embedded Japanese ("ja") phrase, after which the primary language returns to French.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML lang="fr"> <HEAD> <TITLE>Un document multilingue</TITLE> </HEAD> <BODY> ...Interpreted as French... <P lang="es">...Interpreted as Spanish... <P>...Interpreted as French again... <P>...French text interrupted by<EM lang="ja">some Japanese</EM>French begins here again... </BODY> </HTML>
In the context of HTML, a language code should be interpreted by user agents as a hierarchy of tokens rather than a single token. When a user agent adjusts rendering according to language information (say, by comparing style sheet language codes and lang values), it should always favor an exact match, but should also consider matching primary codes to be sufficient. Thus, if the lang attribute value of "en-US" is set for the HTML element, a user agent should prefer style information that matches "en-US" first, then the more general value "en".
Note. Language code hierarchies do not guarantee that all languages with a common prefix will be understood by those fluent in one or more of those languages. They do allow a user to request this commonality when it is true for that user.
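The preference for an exact match, followed by a match on the primary code, can be sketched as follows. The function name best_language_match and the list of available codes are hypothetical; the case-insensitive comparison is an assumption based on the treatment of language tags in [RFC1766].

```python
def best_language_match(lang, available):
    """Pick the entry in `available` that best matches `lang`.

    An exact (case-insensitive) match is preferred; otherwise an entry
    equal to the primary code is accepted. Returns None if neither exists.
    """
    wanted = lang.lower()
    by_lowercase = {code.lower(): code for code in available}
    if wanted in by_lowercase:
        return by_lowercase[wanted]
    primary = wanted.split("-")[0]
    return by_lowercase.get(primary)


print(best_language_match("en-US", ["en", "en-US", "fr"]))  # en-US
print(best_language_match("en-US", ["en", "fr"]))           # en
print(best_language_match("en-US", ["fr"]))                 # None
```

Note that the sketch never matches "en-US" against a sibling such as "en-GB": only the exact value and the more general primary code are considered, as described above.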
Attribute definitions
In addition to specifying the language of a document with the lang attribute, authors may need to specify the base directionality (left-to-right or right-to-left) of portions of a document's text, of table structure, etc. This is done with the dir attribute.
The [UNICODE] specification assigns directionality to characters and defines a (complex) algorithm for determining the proper directionality of text. If a document does not contain a displayable right-to-left character, a conforming user agent is not required to apply the [UNICODE] bidirectional algorithm. If a document contains right-to-left characters, and if the user agent displays these characters, the user agent must use the bidirectional algorithm.
Although Unicode specifies special characters that deal with text direction, HTML offers higher-level markup constructs that do the same thing: the dir attribute (do not confuse with the DIR element) and the BDO element. Thus, to express a Hebrew quotation, it is more intuitive to write
<Q lang="he" dir="rtl">...a Hebrew quotation...</Q>
than the equivalent with Unicode references:
&#x202B;&#x05F4;...a Hebrew quotation...&#x05F4;&#x202C;
User agents must not use the lang attribute to determine text directionality.
The dir attribute is inherited and may be overridden. Please consult the section on the inheritance of text direction information for details.
The following example illustrates the expected behavior of the bidirectional algorithm. It involves English, a left-to-right script, and Hebrew, a right-to-left script.
Consider the following example text:
english1 HEBREW2 english3 HEBREW4 english5 HEBREW6
The characters in this example (and in all related examples) are stored in the computer the way they are displayed here: the first character in the file is "e", the second is "n", and the last is "6".
Suppose the predominant language of the document containing this paragraph is English. This means that the base direction is left-to-right. The correct presentation of this line would be:
english1 2WERBEH english3 4WERBEH english5 6WERBEH
         <------          <------          <------
            H                H                H
------------------------------------------------->
                        E
The dotted lines indicate the structure of the sentence: English predominates and some Hebrew text is embedded. Achieving the correct presentation requires no additional markup since the Hebrew fragments are reversed correctly by user agents applying the bidirectional algorithm.
If, on the other hand, the predominant language of the document is Hebrew, the base direction is right-to-left. The correct presentation is therefore:
6WERBEH english5 4WERBEH english3 2WERBEH english1
        ------->         ------->         ------->
           E                E                E
<-------------------------------------------------
                        H
In this case, the whole sentence has been presented as right-to-left and the embedded English sequences have been properly reversed by the bidirectional algorithm.
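The two presentations above can be reproduced with the toy model sketched below. It is emphatically not the [UNICODE] bidirectional algorithm: it relies on the convention of these examples that upper-case words stand for Hebrew, and it handles a single level of embedding only. The names is_rtl_word and visual_order are hypothetical.

```python
from itertools import groupby


def is_rtl_word(word):
    # Convention of the examples only: upper-case words stand in for Hebrew.
    return any(ch.isupper() for ch in word)


def visual_order(logical_text, base="ltr"):
    """Toy reordering of the example text; NOT the Unicode bidi algorithm."""
    # Group consecutive words that share a direction into runs.
    runs = [
        (rtl, " ".join(words))
        for rtl, words in groupby(logical_text.split(" "), key=is_rtl_word)
    ]
    if base == "rtl":
        runs.reverse()  # with a right-to-left base, runs are laid out in reverse
    # Characters of right-to-left runs are displayed in reverse order.
    return " ".join(text[::-1] if rtl else text for rtl, text in runs)


logical = "english1 HEBREW2 english3 HEBREW4 english5 HEBREW6"
print(visual_order(logical, base="ltr"))
# english1 2WERBEH english3 4WERBEH english5 6WERBEH
print(visual_order(logical, base="rtl"))
# 6WERBEH english5 4WERBEH english3 2WERBEH english1
```

In both cases the stored (logical) text is identical; only the base direction differs, which is why the base direction must be known before text can be presented correctly.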
The Unicode bidirectional algorithm requires a base text direction for text blocks. To specify the base direction of a block-level element, set the element's dir attribute. The default value of the dir attribute is "ltr" (left-to-right text).
When the dir attribute is set for a block-level element, it remains in effect for the duration of the element and any nested block-level elements. Setting the dir attribute on a nested element overrides the inherited value.
To set the base text direction for an entire document, set the dir attribute on the HTML element.
For example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML dir="RTL"> <HEAD> <TITLE>...a right-to-left title...</TITLE> </HEAD> ...right-to-left text... <P dir="ltr">...left-to-right text...</P> <P>...right-to-left text again...</P> </HTML>
Inline elements, on the other hand, do not inherit the dir attribute. This means that an inline element without a dir attribute does not open an additional level of embedding with respect to the bidirectional algorithm. (Here, an element is considered to be block-level or inline based on its default presentation. Note that the INS and DEL elements can be block-level or inline depending on their context.)
The [UNICODE] bidirectional algorithm automatically reverses embedded character sequences according to their inherent directionality (as illustrated by the previous examples). However, in general only one level of embedding can be accounted for. To achieve additional levels of embedded direction changes, you must make use of the dir attribute on an inline element.
Consider the same example text as before:
english1 HEBREW2 english3 HEBREW4 english5 HEBREW6
Suppose the predominant language of the document containing this paragraph is English. Furthermore, the above English sentence contains a Hebrew section extending from HEBREW2 through HEBREW4 and the Hebrew section contains an English quotation (english3). The desired presentation of the text is thus:
english1 4WERBEH english3 2WERBEH english5 6WERBEH
                 ------->
                    E
         <-----------------------
                    H
------------------------------------------------->
                        E
To achieve two embedded direction changes, we must supply additional information, which we do by delimiting the second embedding explicitly. In this example, we use the SPAN element and the dir attribute to mark up the text:
english1 <SPAN dir="RTL">HEBREW2 english3 HEBREW4</SPAN> english5 HEBREW6
Authors may also use special Unicode characters to achieve multiple embedded direction changes. To achieve left-to-right embedding, surround embedded text with the characters LEFT-TO-RIGHT EMBEDDING ("LRE", hexadecimal 202A) and POP DIRECTIONAL FORMATTING ("PDF", hexadecimal 202C). To achieve right-to-left embedding, surround embedded text with the characters RIGHT-TO-LEFT EMBEDDING ("RLE", hexadecimal 202B) and PDF.
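A minimal sketch of this technique, applied to the running example, follows. The function name embed is hypothetical; the three code points are those given in the preceding paragraph.

```python
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed(text, direction):
    """Surround text with the embedding character for `direction` and PDF."""
    opener = {"ltr": LRE, "rtl": RLE}[direction]
    return opener + text + PDF


# Mark the Hebrew section (HEBREW2 through HEBREW4) as a right-to-left embedding.
logical = "english1 " + embed("HEBREW2 english3 HEBREW4", "rtl") + " english5 HEBREW6"
print([f"U+{ord(ch):04X}" for ch in logical if ch in (LRE, RLE, PDF)])
# ['U+202B', 'U+202C']
```

The formatting characters are invisible when the text is displayed, which is one reason they are harder to maintain in a simple text editor than the equivalent SPAN markup.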
Using HTML directionality markup with Unicode characters. Authors and designers of authoring software should be aware that conflicts can arise if the dir attribute is used on inline elements (including BDO) concurrently with the corresponding [UNICODE] formatting characters. Preferably one or the other should be used exclusively. The markup method offers a better guarantee of document structural integrity and alleviates some problems when editing bidirectional HTML text with a simple text editor, but some software may be more apt at using the [UNICODE] characters. If both methods are used, great care should be exercised to ensure proper nesting of markup and directional embedding or override; otherwise, rendering results are undefined.
<!ELEMENT BDO - - (%inline;)* -- I18N BiDi over-ride --> <!ATTLIST BDO %coreattrs; -- id, class, style, title -- lang %LanguageCode; #IMPLIED -- language code -- dir (ltr|rtl) #REQUIRED -- directionality -- >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
The bidirectional algorithm and the dir attribute generally suffice to manage embedded direction changes. However, some situations may arise when the bidirectional algorithm results in incorrect presentation. The BDO element allows authors to turn off the bidirectional algorithm for selected fragments of text.
Consider a document containing the same text as before:
english1 HEBREW2 english3 HEBREW4 english5 HEBREW6
but assume that this text has already been put in visual order. One reason for this may be that the MIME standard ([RFC2045], [RFC1556]) favors visual order, i.e., that right-to-left character sequences are inserted right-to-left in the byte stream. In an email, the above might be formatted, including line breaks, as:
english1 2WERBEH english3
4WERBEH english5 6WERBEH
This conflicts with the [UNICODE] bidirectional algorithm, because that algorithm would invert 2WERBEH, 4WERBEH, and 6WERBEH a second time, displaying the Hebrew words left-to-right instead of right-to-left.
The solution in this case is to override the bidirectional algorithm by putting the email excerpt in a PRE element (to conserve line breaks) and each line in a BDO element whose dir attribute is set to LTR:
<PRE>
<BDO dir="LTR">english1 2WERBEH english3</BDO>
<BDO dir="LTR">4WERBEH english5 6WERBEH</BDO>
</PRE>
This tells the bidirectional algorithm "Leave me left-to-right!" and would produce the desired presentation:
english1 2WERBEH english3
4WERBEH english5 6WERBEH
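The double inversion described in this example can be illustrated with a toy model. The following Python sketch is only an approximation and not the [UNICODE] bidirectional algorithm: it assumes, as the examples in this section do, that words written in upper case stand for right-to-left (Hebrew) text. The function name reorder_for_display is hypothetical.

```python
def reorder_for_display(line):
    """Toy stand-in for bidirectional reordering on a left-to-right line.

    Assumption: a word containing upper-case letters represents Hebrew
    text, so its characters are reversed for display.
    """
    words = line.split(" ")
    return " ".join(w[::-1] if any(c.isupper() for c in w) else w
                    for w in words)


logical = "english1 HEBREW2 english3"
visual = reorder_for_display(logical)  # the order a reader should see
twice = reorder_for_display(visual)    # visual-order input, reordered again
```

In this model, text that is already in visual order and is reordered a second time comes out with the Hebrew words running left-to-right again, which is the effect that BDO with dir="LTR" suppresses.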
The BDO element should be used in scenarios where absolute control over sequence order is required (e.g., multi-language part numbers). The dir attribute is mandatory for this element.
Authors may also use special Unicode characters to override the bidirectional algorithm -- LEFT-TO-RIGHT OVERRIDE (hexadecimal 202D) or RIGHT-TO-LEFT OVERRIDE (hexadecimal 202E). The POP DIRECTIONAL FORMATTING (hexadecimal 202C) character ends either bidirectional override.
Note. Recall that conflicts can arise if the dir attribute is used on inline elements (including BDO) concurrently with the corresponding [UNICODE] formatting characters.
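As an illustration of the override characters, the following Python sketch wraps a string in an override and its terminating POP DIRECTIONAL FORMATTING character, and checks that overrides are properly closed. The helper names with_override and overrides_balanced are hypothetical, and the check deliberately ignores the embedding characters that POP DIRECTIONAL FORMATTING can also terminate.

```python
import unicodedata

LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Official character names, looked up from the Unicode database.
NAMES = {ch: unicodedata.name(ch) for ch in (LRO, RLO, PDF)}


def with_override(text, direction):
    """Wrap text in an override character and its terminating PDF."""
    start = {"ltr": LRO, "rtl": RLO}[direction.lower()]
    return start + text + PDF


def overrides_balanced(text):
    """Return True if every override is closed and no PDF is unmatched."""
    depth = 0
    for ch in text:
        if ch in (LRO, RLO):
            depth += 1
        elif ch == PDF:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

A tool that mixes these characters with dir markup would need a stricter check than this one, since the nesting of markup and of formatting characters must agree.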
Bidirectionality and character encoding. According to [RFC1555] and [RFC1556], there are special conventions for the use of "charset" parameter values to indicate bidirectional treatment in MIME mail, in particular to distinguish between visual, implicit, and explicit directionality. The parameter value "ISO-8859-8" (for Hebrew) denotes visual encoding, "ISO-8859-8-i" denotes implicit bidirectionality, and "ISO-8859-8-e" denotes explicit directionality.
Because HTML uses the Unicode bidirectionality algorithm, conforming documents encoded using ISO 8859-8 must be labeled as "ISO-8859-8-i". Explicit directional control is also possible with HTML, but cannot be expressed with ISO 8859-8, so "ISO-8859-8-e" should not be used.
The value "ISO-8859-8" implies that the document is formatted visually, misusing some markup (such as TABLE with right alignment and no line wrapping) to ensure reasonable display on older user agents that do not handle bidirectionality. Such documents do not conform to the present specification. If necessary, they can be made to conform to the current specification (and at the same time will be displayed correctly on older user agents) by adding BDO markup where necessary. Contrary to what is said in [RFC1555] and [RFC1556], ISO-8859-6 (Arabic) is not visual ordering.
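The labeling rule for ISO 8859-8 documents can be summarized in a small lookup. The following Python sketch is illustrative only; the table name ORDERING_BY_CHARSET and the function hebrew_label_is_conforming are hypothetical, and labels are compared without regard to case.

```python
# Ordering convention denoted by each "charset" value discussed above.
ORDERING_BY_CHARSET = {
    "iso-8859-8": "visual",
    "iso-8859-8-i": "implicit",
    "iso-8859-8-e": "explicit",
}


def hebrew_label_is_conforming(charset):
    """Return True only for the label required for ISO 8859-8 documents."""
    return ORDERING_BY_CHARSET.get(charset.strip().lower()) == "implicit"
```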
Since ambiguities sometimes arise as to the directionality of certain characters (e.g., punctuation), the [UNICODE] specification includes characters to enable their proper resolution. Also, Unicode includes some characters to control joining behavior where this is necessary (e.g., some situations with Arabic letters). HTML 4 includes character references for these characters.
The following DTD excerpt presents some of the directional entities:
<!ENTITY zwnj CDATA "&#8204;"--=zero width non-joiner-->
<!ENTITY zwj  CDATA "&#8205;"--=zero width joiner-->
<!ENTITY lrm  CDATA "&#8206;"--=left-to-right mark-->
<!ENTITY rlm  CDATA "&#8207;"--=right-to-left mark-->
The zwnj entity is used to block joining behavior in contexts where joining will occur but shouldn't. The zwj entity does the opposite; it forces joining when it wouldn't occur but should. For example, the Arabic letter "HEH" is used to abbreviate "Hijri", the name of the Islamic calendar system. Since the isolated form of "HEH" looks like the digit five as employed in Arabic script (based on Indic digits), in order to prevent confusing "HEH" as a final digit five in a year, the initial form of "HEH" is used. However, there is no following context (i.e., a joining letter) to which the "HEH" can join. The zwj character provides that context.
Similarly, in Persian texts, there are cases where a letter that normally would join a subsequent letter in a cursive connection should not. The character zwnj is used to block joining in such cases.
The other characters, lrm and rlm, are used to force directionality of directionally neutral characters. For example, if a double quotation mark comes between an Arabic (right-to-left) and a Latin (left-to-right) letter, the direction of the quotation mark is not clear (is it quoting the Arabic text or the Latin text?). The lrm and rlm characters have a directional property but no width and no word/line break property. Please consult [UNICODE] for more details.
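The properties of these four characters can be inspected programmatically. The following Python sketch assumes the entities correspond to decimal code points 8204 through 8207 and uses the standard unicodedata module; the table and function names are hypothetical.

```python
import unicodedata

# Decimal code points assumed for the four entities.
DIRECTIONAL_ENTITIES = {"zwnj": 8204, "zwj": 8205, "lrm": 8206, "rlm": 8207}


def describe(entity):
    """Return (Unicode name, bidirectional class) for one entity name."""
    ch = chr(DIRECTIONAL_ENTITIES[entity])
    return unicodedata.name(ch), unicodedata.bidirectional(ch)
```

The bidirectional classes "L" and "R" reported for lrm and rlm are what give these zero-width characters their directional effect on neighboring neutral characters.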
Mirrored character glyphs. In general, the bidirectional algorithm does not mirror character glyphs but leaves them unaffected. The exceptions are characters such as parentheses (see [UNICODE], table 4-7). In cases where mirroring is desired, for example for Egyptian Hieroglyphs, Greek Boustrophedon, or special design effects, this should be controlled with styles.
In general, using style sheets to change an element's visual rendering from block-level to inline or vice-versa is straightforward. However, because the bidirectional algorithm relies on the inline/block-level distinction, special care must be taken during the transformation.
When an inline element that does not have a dir attribute is transformed to the style of a block-level element by a style sheet, it inherits the dir attribute from its closest parent block element to define the base direction of the block.
When a block element that does not have a dir attribute is transformed to the style of an inline element by a style sheet, the resulting presentation should be equivalent, in terms of bidirectional formatting, to the formatting obtained by explicitly adding a dir attribute (assigned the inherited value) to the transformed element.
The following sections discuss issues surrounding the structuring of text. Elements that present text (alignment elements, font elements, style sheets, etc.) are discussed elsewhere in the specification. For information about characters, please consult the section on the document character set.
The document character set includes a wide variety of white space characters. Many of these are typographic elements used in some applications to produce particular visual spacing effects. In HTML, only the following characters are defined as white space characters:
Line breaks are also white space characters. Note that although &#x2028; and &#x2029; are defined in [ISO10646] to unambiguously separate lines and paragraphs, respectively, these do not constitute line breaks in HTML, nor does this specification include them in the more general category of white space characters.
This specification does not indicate the behavior, rendering or otherwise, of space characters other than those explicitly identified here as white space characters. For this reason, authors should use appropriate elements and styles to achieve visual formatting effects that involve white space, rather than space characters.
For all HTML elements except PRE, sequences of white space separate "words" (we use the term "word" here to mean "sequences of non-white space characters"). When formatting text, user agents should identify these words and lay them out according to the conventions of the particular written language (script) and target medium.
This layout may involve putting space between words (called inter-word space), but conventions for inter-word space vary from script to script. For example, in Latin scripts, inter-word space is typically rendered as an ASCII space (&#x0020;), while in Thai it is a zero-width word separator (&#x200B;). In Japanese and Chinese, inter-word space is not typically rendered at all.
Note that a sequence of white spaces between words in the source document may result in an entirely different rendered inter-word spacing (except in the case of the PRE element). In particular, user agents should collapse input white space sequences when producing output inter-word space. This can and should be done even in the absence of language information (from the lang attribute, the HTTP "Content-Language" header field (see [RFC2616], section 14.12), user agent settings, etc.).
The PRE element is used for preformatted text, where white space is significant.
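The collapsing of white space sequences outside PRE can be sketched as follows. This Python example is a simplification under stated assumptions: it takes the white space set to be space, tab, form feed, and zero-width space plus the line break characters, and it assumes a Latin-script rendering in which inter-word space is a single ASCII space. The function name collapse_white_space is hypothetical.

```python
import re

# Assumed white space set: space, tab, form feed, zero-width space,
# plus the line break characters CR and LF. U+2028 and U+2029 are
# deliberately excluded.
_WHITE_SPACE_RUN = re.compile(r"[ \t\f\u200b\r\n]+")


def collapse_white_space(text):
    """Replace each run of white space characters with one ASCII space."""
    return _WHITE_SPACE_RUN.sub(" ", text)
```

A user agent formatting Thai, Japanese, or Chinese text would choose a different replacement for the collapsed run, as described above.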
In order to avoid problems with SGML line break rules and inconsistencies among extant implementations, authors should not rely on user agents to render white space immediately after a start tag or immediately before an end tag. Thus, authors, and in particular authoring tools, should write:
<P>We offer free <A>technical support</A> for subscribers.</P>
and not:
<P>We offer free<A> technical support </A>for subscribers.</P>
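An authoring tool could apply this advice mechanically. The following Python sketch moves white space that immediately follows a start tag to before the tag, and white space that immediately precedes an end tag to after it. It is a regular-expression approximation, not an SGML parser: it assumes that attribute values contain no angle brackets and that the markup is not inside a PRE element. The function name move_space_outside_tags is hypothetical.

```python
import re

_AFTER_START = re.compile(r"(<[A-Za-z][^<>]*>)(\s+)")
_BEFORE_END = re.compile(r"(\s+)(</[A-Za-z][^<>]*>)")


def move_space_outside_tags(markup):
    """Move white space just inside an element's tags to just outside them."""
    markup = _AFTER_START.sub(r"\2\1", markup)   # "<A> text" -> " <A>text"
    return _BEFORE_END.sub(r"\2\1", markup)      # "text </A>" -> "text</A> "
```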
<!ENTITY % phrase "EM | STRONG | DFN | CODE | SAMP | KBD | VAR | CITE | ABBR | ACRONYM" >
<!ELEMENT (%fontstyle;|%phrase;) - - (%inline;)*>
<!ATTLIST (%fontstyle;|%phrase;)
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: required
Attributes defined elsewhere
Phrase elements add structural information to text fragments. The usual meanings of phrase elements are as follows:
EM and STRONG are used to indicate emphasis. The other phrase elements have particular significance in technical documents. These examples illustrate some of the phrase elements:
As <CITE>Harry S. Truman</CITE> said, <Q lang="en-us">The buck stops here.</Q>
More information can be found in <CITE>[ISO-0000]</CITE>.
Please refer to the following reference number in future correspondence: <STRONG>1-234-55</STRONG>
The presentation of phrase elements depends on the user agent. Generally, visual user agents present EM text in italics and STRONG text in bold font. Speech synthesizer user agents may change the synthesis parameters, such as volume, pitch and rate accordingly.
The ABBR and ACRONYM elements allow authors to clearly indicate occurrences of abbreviations and acronyms. Western languages make extensive use of acronyms such as "GmbH", "NATO", and "F.B.I.", as well as abbreviations like "M.", "Inc.", "et al.", "etc.". Both Chinese and Japanese use analogous abbreviation mechanisms, wherein a long name is referred to subsequently with a subset of the Han characters from the original occurrence. Marking up these constructs provides useful information to user agents and tools such as spell checkers, speech synthesizers, translation systems and search-engine indexers.
The content of the ABBR and ACRONYM elements specifies the abbreviated expression itself, as it would normally appear in running text. The title attribute of these elements may be used to provide the full or expanded form of the expression.
Here are some sample uses of ABBR:
<P> <ABBR title="World Wide Web">WWW</ABBR> <ABBR lang="fr" title="Société Nationale des Chemins de Fer"> SNCF </ABBR> <ABBR lang="es" title="Doña">Doña</ABBR> <ABBR title="Abbreviation">abbr.</ABBR>
Note that abbreviations and acronyms often have idiosyncratic pronunciations. For example, while "IRS" and "BBC" are typically pronounced letter by letter, "NATO" and "UNESCO" are pronounced phonetically. Still other abbreviated forms (e.g., "URI" and "SQL") are spelled out by some people and pronounced as words by other people. When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.
<!ELEMENT BLOCKQUOTE - - (%block;|SCRIPT)+ -- long quotation -->
<!ATTLIST BLOCKQUOTE
  %attrs;                              -- %coreattrs, %i18n, %events --
  cite        %URI;          #IMPLIED  -- URI for source document or msg --
  >
<!ELEMENT Q - - (%inline;)*            -- short inline quotation -->
<!ATTLIST Q
  %attrs;                              -- %coreattrs, %i18n, %events --
  cite        %URI;          #IMPLIED  -- URI for source document or msg --
  >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
These two elements designate quoted text. BLOCKQUOTE is for long quotations (block-level content) and Q is intended for short quotations (inline content) that don't require paragraph breaks.
This example formats an excerpt from "The Two Towers", by J.R.R. Tolkien, as a blockquote.
<BLOCKQUOTE cite="http://www.mycom.com/tolkien/twotowers.html"> <P>They went in single file, running like hounds on a strong scent, and an eager light was in their eyes. Nearly due west the broad swath of the marching Orcs tramped its ugly slot; the sweet grass of Rohan had been bruised and blackened as they passed.</P> </BLOCKQUOTE>
Visual user agents generally render BLOCKQUOTE as an indented block.
Visual user agents must ensure that the content of the Q element is rendered with delimiting quotation marks. Authors should not put quotation marks at the beginning and end of the content of a Q element.
User agents should render quotation marks in a language-sensitive manner (see the lang attribute). Many languages adopt different quotation styles for outer and inner (nested) quotations, which should be respected by user agents.
The following example illustrates nested quotations with the Q element.
John said, <Q lang="en-us">I saw Lucy at lunch, she told me <Q lang="en-us">Mary wants you to get some ice cream on your way home.</Q> I think I will get some at Ben and Jerry's, on Gloucester Road.</Q>
Since the language of both quotations is American English, user agents should render them appropriately, for example with single quote marks around the inner quotation and double quote marks around the outer quotation:
John said, "I saw Lucy at lunch, she told me 'Mary wants you to get some ice cream on your way home.' I think I will get some at Ben and Jerry's, on Gloucester Road."
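The alternation of outer and inner quotation marks can be sketched with a small renderer. The following Python example is a minimal illustration, assuming American English conventions only (double marks for the outer quotation, single marks for the inner one) and ignoring the lang attribute; the class QRenderer and the function render_quotes are hypothetical names. It relies on the standard html.parser module, which reports element names in lower case.

```python
from html.parser import HTMLParser

# Assumed quotation marks for American English, indexed by nesting depth.
EN_US_QUOTES = [('"', '"'), ("'", "'")]


class QRenderer(HTMLParser):
    """Collect text content, inserting quotation marks around Q elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "q":
            opening, _ = EN_US_QUOTES[self.depth % 2]
            self.out.append(opening)
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "q":
            self.depth -= 1
            _, closing = EN_US_QUOTES[self.depth % 2]
            self.out.append(closing)

    def handle_data(self, data):
        self.out.append(data)


def render_quotes(markup):
    """Return the text of markup with Q elements shown as quoted text."""
    renderer = QRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.out)
```

A language-sensitive user agent would select the pair of marks from the language of each quotation rather than from a fixed table.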
Note. We recommend that style sheet implementations provide a mechanism for inserting quotation marks before and after a quotation delimited by BLOCKQUOTE in a manner appropriate to the current language context and the degree of nesting of quotations.
However, as some authors have used BLOCKQUOTE merely as a mechanism to indent text, in order to preserve the intention of the authors, user agents should not insert quotation marks in the default style.
The usage of BLOCKQUOTE to indent text is deprecated in favor of style sheets.
<!ELEMENT (SUB|SUP) - - (%inline;)*    -- subscript, superscript -->
<!ATTLIST (SUB|SUP)
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: required
Attributes defined elsewhere
Many scripts (e.g., French) require superscripts or subscripts for proper rendering. The SUB and SUP elements should be used to mark up text in these cases.
H<sub>2</sub>O
E = mc<sup>2</sup>
<SPAN lang="fr">M<sup>lle</sup> Dupont</SPAN>
Authors traditionally divide their thoughts and arguments into sequences of paragraphs. The organization of information into paragraphs is not affected by how the paragraphs are presented: paragraphs that are double-justified contain the same thoughts as those that are left-justified.
The HTML markup for defining a paragraph is straightforward: the P element defines a paragraph.
The visual presentation of paragraphs is not so simple. A number of issues, both stylistic and technical, must be addressed:
We address these questions below. Paragraph alignment and floating objects are discussed later in this document.
Start tag: required, End tag: optional
Attributes defined elsewhere
The P element represents a paragraph. It cannot contain block-level elements (including P itself).
We discourage authors from using empty P elements. User agents should ignore empty P elements.
A line break is defined to be a carriage return (&#x000D;), a line feed (&#x000A;), or a carriage return/line feed pair. All line breaks constitute white space.
For more information about SGML's specification of line breaks, please consult the notes on line breaks in the appendix.
<!ELEMENT BR - O EMPTY                 -- forced line break -->
<!ATTLIST BR
  %coreattrs;                          -- id, class, style, title --
  >
Start tag: required, End tag: forbidden
Attributes defined elsewhere
The BR element forcibly breaks (ends) the current line of text.
For visual user agents, the clear attribute can be used to determine whether markup following the BR element flows around images and other objects floated to the left or right margin, or whether it starts after the bottom of such objects. Further details are given in the section on alignment and floating objects. Authors are advised to use style sheets to control text flow around floating images and other objects.
With respect to bidirectional formatting, the BR element should behave the same way the [ISO10646] LINE SEPARATOR character behaves in the bidirectional algorithm.
Sometimes authors may want to prevent a line break from occurring between two words. The &nbsp; entity (&#160; or &#xA0;) acts as a space where user agents should not cause a line break.
In HTML, there are two types of hyphens: the plain hyphen and the soft hyphen. The plain hyphen should be interpreted by a user agent as just another character. The soft hyphen tells the user agent where a line break can occur.
Those browsers that interpret soft hyphens must observe the following semantics: If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. For operations such as searching and sorting, the soft hyphen should always be ignored.
In HTML, the plain hyphen is represented by the "-" character (&#45; or &#x2D;). The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;).
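The soft hyphen semantics can be sketched as follows. This Python example is a simplified model, assuming that every character occupies one column and that a single word is being fitted to a line; the function names for_search and break_at_soft_hyphen are hypothetical.

```python
SHY = "\u00ad"  # soft hyphen (decimal 173)


def for_search(text):
    """Drop soft hyphens, as recommended for searching and sorting."""
    return text.replace(SHY, "")


def break_at_soft_hyphen(word, width):
    """Split word across two lines at a soft hyphen if it exceeds width.

    Returns a list of one or two strings. A visible hyphen is shown only
    when the line is actually broken at a soft hyphen.
    """
    if len(for_search(word)) <= width:
        return [for_search(word)]
    best, visible = None, 0
    for index, ch in enumerate(word):
        if ch == SHY:
            # The first line must also have room for the visible hyphen.
            if visible + 1 <= width:
                best = index
        else:
            visible += 1
    if best is None:
        return [for_search(word)]
    return [for_search(word[:best]) + "-", for_search(word[best + 1:])]
```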
<!ENTITY % pre.exclusion "IMG|OBJECT|BIG|SMALL|SUB|SUP">
<!ELEMENT PRE - - (%inline;)* -(%pre.exclusion;) -- preformatted text -->
<!ATTLIST PRE
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
The PRE element tells visual user agents that the enclosed text is "preformatted". When handling preformatted text, visual user agents:
Non-visual user agents are not required to respect extra white space in the content of a PRE element.
For more information about SGML's specification of line breaks, please consult the notes on line breaks in the appendix.
The DTD fragment above indicates which elements may not appear within a PRE declaration. This is the same as in HTML 3.2, and is intended to preserve constant line spacing and column alignment for text rendered in a fixed pitch font. Authors are discouraged from altering this behavior through style sheets.
The following example shows a preformatted verse from Shelley's poem To a Skylark:
<PRE>
Higher still and higher
From the earth thou springest
Like a cloud of fire;
The blue deep thou wingest,
And singing still dost soar, and soaring ever singest.
</PRE>
Here is how this is typically rendered:
Higher still and higher
From the earth thou springest
Like a cloud of fire;
The blue deep thou wingest,
And singing still dost soar, and soaring ever singest.
The horizontal tab character. The horizontal tab character (decimal 9 in [ISO10646] and [ISO88591]) is usually interpreted by visual user agents as the smallest non-zero number of spaces necessary to line characters up along tab stops that are every 8 characters. We strongly discourage using horizontal tabs in preformatted text since it is common practice, when editing, to set the tab-spacing to other values, leading to misaligned documents.
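The misalignment risk can be demonstrated with Python's built-in str.expandtabs, which advances to the next tab stop in the manner described above. The sample rows are invented for illustration.

```python
# Two rows whose second column lines up when tab stops are every 8
# characters, but not when an editor uses tab stops every 4 characters.
rows = ["id\tvalue", "a.txt\t12"]

at_8 = [row.expandtabs(8) for row in rows]  # conventional rendering
at_4 = [row.expandtabs(4) for row in rows]  # editor set to 4-column tabs

# A tab that starts exactly on a tab stop still advances a full stop,
# i.e. the number of spaces is never zero.
full_stop = "12345678\tx".expandtabs(8)
```

With tab stops every 8 characters both second columns start at the same position, while with tab stops every 4 characters they do not.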
Note. The following section is an informative description of the behavior of some current visual user agents when formatting paragraphs. Style sheets allow better control of paragraph formatting.
How paragraphs are rendered visually depends on the user agent. Paragraphs are usually rendered flush left with a ragged right margin. Other defaults are appropriate for right-to-left scripts.
HTML user agents have traditionally rendered paragraphs with white space before and after, e.g.,
At the same time, there began to take form a system of numbering, the calendar, hieroglyphic writing, and a technically advanced art, all of which later influenced other peoples. Within the framework of this gradual evolution or cultural progress the Preclassic horizon has been divided into Lower, Middle and Upper periods, to which can be added a transitional or Protoclassic period with several features that would later distinguish the emerging civilizations of Mesoamerica.
This contrasts with the style used in novels which indents the first line of the paragraph and uses the regular line spacing between the final line of the current paragraph and the first line of the next, e.g.,
At the same time, there began to take form a system of numbering, the calendar, hieroglyphic writing, and a technically advanced art, all of which later influenced other peoples. Within the framework of this gradual evolution or cultural progress the Preclassic horizon has been divided into Lower, Middle and Upper periods, to which can be added a transitional or Protoclassic period with several features that would later distinguish the emerging civilizations of Mesoamerica.
Following the precedent set by the NCSA Mosaic browser in 1993, user agents generally don't justify both margins, in part because it's hard to do this effectively without sophisticated hyphenation routines. The advent of style sheets and of anti-aliased fonts with subpixel positioning promises to offer richer choices to HTML authors than were previously possible.
Style sheets provide rich control over the size and style of a font, the margins, space before and after a paragraph, the first line indent, justification and many other details. The user agent's default style sheet renders P elements in a familiar form, as illustrated above. One could, in principle, override this to render paragraphs without the breaks that conventionally distinguish successive paragraphs. In general, since this may confuse readers, we discourage this practice.
By convention, visual HTML user agents wrap text lines to fit within the available margins. Wrapping algorithms depend on the script being formatted.
In Western scripts, for example, text should only be wrapped at white space. Early user agents incorrectly wrapped lines just after the start tag or just before the end tag of an element, which resulted in dangling punctuation. For example, consider this sentence:
A statue of the <A href="cih78">Cihuateteus</A>, who are patron ...
Wrapping the line just before the end tag of the A element causes the comma to be stranded at the beginning of the next line:
A statue of the Cihuateteus
, who are patron ...
This is an error since there was no white space at that point in the markup.
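A wrapping routine that avoids this error treats tags as invisible and breaks lines only where the source contains white space. The following Python sketch is a greedy, Western-script simplification that assumes one column per character; the function name wrap_text is hypothetical.

```python
import re


def wrap_text(markup, width):
    """Greedily wrap the text content of markup, breaking only at white space."""
    # Tags are removed without inserting anything, so a tag boundary
    # never becomes a line break opportunity.
    text = re.sub(r"<[^<>]+>", "", markup)
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

Because "Cihuateteus," contains no white space, the comma always travels with the word it follows.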
<!-- INS/DEL are handled by inclusion on BODY -->
<!ELEMENT (INS|DEL) - - (%flow;)*      -- inserted text, deleted text -->
<!ATTLIST (INS|DEL)
  %attrs;                              -- %coreattrs, %i18n, %events --
  cite        %URI;          #IMPLIED  -- info on reason for change --
  datetime    %Datetime;     #IMPLIED  -- date and time of change --
  >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
INS and DEL are used to mark up sections of the document that have been inserted or deleted with respect to a different version of a document (e.g., in draft legislation where lawmakers need to view the changes).
These two elements are unusual for HTML in that they may serve as either block-level or inline elements (but not both). They may contain one or more words within a paragraph or contain one or more block-level elements such as paragraphs, lists and tables.
This example could be from a bill to change the legislation for how many deputies a County Sheriff can employ from 3 to 5.
<P> A Sheriff can employ <DEL>3</DEL><INS>5</INS> deputies. </P>
The INS and DEL elements must not contain block-level content when these elements behave as inline elements.
ILLEGAL EXAMPLE:
The following is not legal HTML.
<P> <INS><DIV>...block-level content...</DIV></INS> </P>
User agents should render inserted and deleted text in ways that make the change obvious. For instance, inserted text may appear in a special font, deleted text may not be shown at all or be shown as struck-through or with special markings, etc.
Both of the following examples correspond to November 5, 1994, 8:15:30 am, US Eastern Standard Time.
1994-11-05T13:15:30Z
1994-11-05T08:15:30-05:00
Used with INS, this gives:
<INS datetime="1994-11-05T08:15:30-05:00" cite="http://www.foo.org/mydoc/comments.html"> Furthermore, the latest figures from the marketing department suggest that such practice is on the rise. </INS>
The document "http://www.foo.org/mydoc/comments.html" would contain comments about why information was inserted into the document.
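That the two datetime values given earlier denote the same instant can be checked with Python's standard datetime module. This is a minimal sketch; the helper name parse_datetime is hypothetical, and the trailing "Z" is rewritten as "+00:00" because not every version of datetime.fromisoformat accepts it.

```python
from datetime import datetime, timezone


def parse_datetime(value):
    """Parse a datetime attribute value into a timezone-aware datetime."""
    # A trailing "Z" denotes UTC.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


utc_form = parse_datetime("1994-11-05T13:15:30Z")
local_form = parse_datetime("1994-11-05T08:15:30-05:00")
same_instant = utc_form == local_form
```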
Authors may also make comments about inserted or deleted text by means of the title attribute for the INS and DEL elements. User agents may present this information to the user (e.g., as a popup note). For example:
<INS datetime="1994-11-05T08:15:30-05:00" title="Changed as a result of Steve B's comments in meeting."> Furthermore, the latest figures from the marketing department suggest that such practice is on the rise. </INS>
The previous list, for example, is an unordered list, created with the UL element:
<UL> <LI>Unordered information. <LI>Ordered information. <LI>Definitions. </UL>
An ordered list, created using the OL element, should contain information where order should be emphasized, as in a recipe:
Definition lists, created using the DL element, generally consist of a series of term/definition pairs (although definition lists may have other applications). Thus, when advertising a product, one might use a definition list:
defined in HTML as:
<DL> <DT><STRONG>Lower cost</STRONG> <DD>The new version of this product costs significantly less than the previous one! <DT><STRONG>Easier to use</STRONG> <DD>We've changed the product so that it's much easier to use! <DT><STRONG>Safe for kids</STRONG> <DD>You can leave your kids alone in a room with this product and they won't get hurt (not a guarantee). </DL>
Lists may also be nested and different list types may be used together, as in the following example, which is a definition list that contains an unordered list (the ingredients) and an ordered list (the procedure):
The exact presentation of the three list types depends on the user agent. We discourage authors from using lists purely as a means of indenting text. This is a stylistic issue and is properly handled by style sheets.
<!ELEMENT UL - - (LI)+                 -- unordered list -->
<!ATTLIST UL
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
<!ELEMENT OL - - (LI)+                 -- ordered list -->
<!ATTLIST OL
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: required
Start tag: required, End tag: optional
Attribute definitions
Attributes defined elsewhere
Ordered and unordered lists are rendered in an identical manner except that visual user agents number ordered list items. User agents may present those numbers in a variety of ways. Unordered list items are not numbered.
Both types of lists are made up of sequences of list items defined by the LI element (whose end tag may be omitted).
This example illustrates the basic structure of a list.
<UL> <LI> ... first list item... <LI> ... second list item... ... </UL>
DEPRECATED EXAMPLE:
<UL> <LI> ... Level one, number one... <OL> <LI> ... Level two, number one... <LI> ... Level two, number two... <OL start="10"> <LI> ... Level three, number one... </OL> <LI> ... Level two, number three... </OL> <LI> ... Level one, number two... </UL>
Details about number order. In ordered lists, it is not possible to continue list numbering automatically from a previous list or to hide numbering of some list items. However, authors can reset the number of a list item by setting its value attribute. Numbering continues from the new value for subsequent list items. For example:
<ol> <li value="30"> makes this list item number 30. <li value="40"> makes this list item number 40. <li> makes this list item number 41. </ol>
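The numbering rule can be expressed as a short routine. In the following Python sketch each list item is represented by its value attribute, or by None when the attribute is absent; the function name number_items and this representation are assumptions made for illustration.

```python
def number_items(values, start=1):
    """Return the number shown for each list item.

    values holds one entry per LI: an integer when the item sets the
    value attribute, None otherwise. start is the first number of the list.
    """
    numbers, counter = [], start
    for value in values:
        if value is not None:
            counter = value       # the value attribute resets the counter
        numbers.append(counter)
        counter += 1              # numbering continues from the new value
    return numbers
```

For the list above, the call number_items([30, 40, None]) yields the numbers 30, 40, and 41.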
<!-- definition lists - DT for term, DD for its definition -->
<!ELEMENT DL - - (DT|DD)+              -- definition list -->
<!ATTLIST DL
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: required
<!ELEMENT DT - O (%inline;)*           -- definition term -->
<!ELEMENT DD - O (%flow;)*             -- definition description -->
<!ATTLIST (DT|DD)
  %attrs;                              -- %coreattrs, %i18n, %events --
  >
Start tag: required, End tag: optional
Attributes defined elsewhere
Definition lists vary only slightly from other types of lists in that list items consist of two parts: a term and a description. The term is given by the DT element and is restricted to inline content. The description is given with a DD element that contains block-level content.
Here is an example:
<DL> <DT>Dweeb <DD>young excitable person who may mature into a <EM>Nerd</EM> or <EM>Geek</EM> <DT>Hacker <DD>a clever programmer <DT>Nerd <DD>technically bright but socially inept person </DL>
Here is an example with multiple terms and descriptions:
<DL> <DT>Center <DT>Centre <DD> A point equidistant from all points on the surface of a sphere. <DD> In some field sports, the player who holds the middle position on the field, court, or forward line. </DL>
Another application of DL, for example, is for marking up dialogues, with each DT naming a speaker, and each DD containing his or her words.
Note. The following is an informative description of the behavior of some current visual user agents when formatting lists. Style sheets allow better control of list formatting (e.g., for numbering, language-dependent conventions, indenting, etc.).
Visual user agents generally indent nested lists with respect to the current level of nesting.
For both OL and UL, the type attribute specifies rendering options for visual user agents.
For the UL element, possible values for the type attribute are disc, square, and circle. The default value depends on the level of nesting of the current list. These values are case-insensitive.
How each value is presented depends on the user agent. User agents should attempt to present a "disc" as a small filled-in circle, a "circle" as a small circle outline, and a "square" as a small square outline.
A graphical user agent might render this as:
for the value "disc"
for the value "circle"
for the value "square"
For the OL element, possible values for the type attribute are summarized in the table below (they are case-sensitive):
Type | Numbering style | Example sequence |
---|---|---|
1 | arabic numbers | 1, 2, 3, ... |
a | lower alpha | a, b, c, ... |
A | upper alpha | A, B, C, ... |
i | lower roman | i, ii, iii, ... |
I | upper roman | I, II, III, ... |
Note that the type attribute is deprecated and list styles should be handled through style sheets.
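The five numbering styles in the table can be generated as follows. This Python sketch is illustrative only: the function names are hypothetical, the continuation of the alphabetic styles past "z" (with "aa", "ab", ...) is an assumption not stated in this specification, and roman numerals are limited to 1 through 3999.

```python
_ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
          (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
          (5, "v"), (4, "iv"), (1, "i")]


def to_roman(n):
    """Lower-case roman numeral for 1 <= n <= 3999."""
    if not 0 < n < 4000:
        raise ValueError("roman numerals are limited to 1..3999 here")
    parts = []
    for value, symbol in _ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def to_alpha(n):
    """Lower-case alphabetic label: a..z, then aa, ab, ... (assumed)."""
    if n < 1:
        raise ValueError("alphabetic labels start at 1")
    letters = []
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("a") + remainder))
    return "".join(reversed(letters))


def format_counter(n, list_type="1"):
    """Format item number n for a case-sensitive OL type value."""
    if list_type == "1":
        return str(n)
    if list_type == "a":
        return to_alpha(n)
    if list_type == "A":
        return to_alpha(n).upper()
    if list_type == "i":
        return to_roman(n)
    if list_type == "I":
        return to_roman(n).upper()
    raise ValueError("unknown type value: " + repr(list_type))
```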
For example, using CSS, one may specify that the style of numbers for list elements in a numbered list should be lowercase roman numerals. In the excerpt below, every OL element belonging to the class "withroman" will have roman numerals in front of its list items.
<STYLE type="text/css"> OL.withroman { list-style-type: lower-roman } </STYLE> <BODY> <OL class="withroman"> <LI> Step one ... <LI> Step two ... </OL> </BODY>
The rendering of a definition list also depends on the user agent. The example:
<DL> <DT>Dweeb <DD>young excitable person who may mature into a <EM>Nerd</EM> or <EM>Geek</EM> <DT>Hacker <DD>a clever programmer <DT>Nerd <DD>technically bright but socially inept person </DL>
might be rendered as follows:
Dweeb
    young excitable person who may mature into a Nerd or Geek
Hacker
    a clever programmer
Nerd
    technically bright but socially inept person
DIR and MENU are deprecated.
See the Transitional DTD for the formal definition.
Attributes defined elsewhere
The DIR element was designed to be used for creating multicolumn directory lists. The MENU element was designed to be used for single column menu lists. Both elements have the same structure as UL, just different rendering. In practice, a user agent will render a DIR or MENU list exactly as a UL list.
We strongly recommend using UL instead of these elements.
The HTML table model allows authors to arrange data -- text, preformatted text, images, links, forms, form fields, other tables, etc. -- into rows and columns of cells.
Each table may have an associated caption (see the CAPTION element) that provides a short description of the table's purpose. A longer description may also be provided (via the summary attribute) for the benefit of people using speech or Braille-based user agents.
Table rows may be grouped into head, foot, and body sections (via the THEAD, TFOOT and TBODY elements, respectively). Row groups convey additional structural information and may be rendered by user agents in ways that emphasize this structure. User agents may exploit the head/body/foot division to support scrolling of body sections independently of the head and foot sections. When long tables are printed, the head and foot information may be repeated on each page that contains table data.
Authors may also group columns to provide additional structural information that may be exploited by user agents. Furthermore, authors may declare column properties at the start of a table definition (via the COLGROUP and COL elements) in a way that enables user agents to render the table incrementally rather than having to wait for all the table data to arrive before rendering.
Table cells may either contain "header" information (see the TH element) or "data" (see the TD element). Cells may span multiple rows and columns. The HTML 4 table model allows authors to label each cell so that non-visual user agents may more easily communicate heading information about the cell to the user. Not only do these mechanisms greatly assist users with visual disabilities, they make it possible for multi-modal wireless browsers with limited display capabilities (e.g., Web-enabled pagers and phones) to handle tables.
Tables should not be used purely as a means to lay out document content, as this may present problems when rendering to non-visual media. Additionally, when used with graphics, these tables may force users to scroll horizontally to view a table designed on a system with a larger display. To minimize these problems, authors should use style sheets to control layout rather than tables.
Note. This specification includes more detailed information about tables in sections on table design rationale and implementation issues.
Here's a simple table that illustrates some of the features of the HTML table model. The following table definition:
<TABLE border="1" summary="This table gives some statistics about fruit flies: average height and weight, and percentage with red eyes (for both males and females)."> <CAPTION><EM>A test table with merged cells</EM></CAPTION> <TR><TH rowspan="2"><TH colspan="2">Average <TH rowspan="2">Red<BR>eyes <TR><TH>height<TH>weight <TR><TH>Males<TD>1.9<TD>0.003<TD>40% <TR><TH>Females<TD>1.7<TD>0.002<TD>43% </TABLE>
might be rendered something like this on a tty device:
          A test table with merged cells
/-----------------------------------------\
|          |      Average      |   Red    |
|          |-------------------|  eyes    |
|          |  height |  weight |          |
|-----------------------------------------|
| Males    | 1.9     | 0.003   |   40%    |
|-----------------------------------------|
| Females  | 1.7     | 0.002   |   43%    |
\-----------------------------------------/
or like this by a graphical user agent:
<!ELEMENT TABLE - - (CAPTION?, (COL*|COLGROUP*), THEAD?, TFOOT?, TBODY+)> <!ATTLIST TABLE -- table element -- %attrs; -- %coreattrs, %i18n, %events -- summary %Text; #IMPLIED -- purpose/structure for speech output-- width %Length; #IMPLIED -- table width -- border %Pixels; #IMPLIED -- controls frame width around table -- frame %TFrame; #IMPLIED -- which parts of frame to render -- rules %TRules; #IMPLIED -- rulings between rows and cols -- cellspacing %Length; #IMPLIED -- spacing between cells -- cellpadding %Length; #IMPLIED -- spacing within cells -- >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
The TABLE element contains all other elements that specify caption, rows, content, and formatting.
The following informative list describes what operations user agents may carry out when rendering a table:
The HTML table model has been designed so that, with author assistance, user agents may render tables incrementally (i.e., as table rows arrive) rather than having to wait for all the data before beginning to render.
In order for a user agent to format a table in one pass, authors must tell the user agent:
More precisely, a user agent may render a table in a single pass when the column widths are specified using a combination of COLGROUP and COL elements. If any of the columns are specified in relative or percentage terms (see the section on calculating the width of columns), authors must also specify the width of the table itself.
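The single-pass condition can be sketched as a small check. This is only an illustration, not a normative algorithm: the function name and the representation of widths as strings are assumptions, and the treatment of missing widths and of "0*" follows statements made elsewhere in this section (a column without width information, or with the "0*" form, requires the whole column's data).

```python
def can_render_in_one_pass(column_widths, table_width=None):
    """Return True if the width information permits single-pass rendering.

    column_widths: one entry per column, e.g. "30", "50%", "2*", "0*",
                   or None when the author gave no width (hypothetical encoding)
    table_width:   value of the TABLE width attribute, or None if absent
    """
    needs_table_width = False
    for width in column_widths:
        if width is None:
            return False  # unknown width: the column's data must arrive first
        if width == "0*":
            return False  # minimum-content width needs the whole column
        if width.endswith("%") or width.endswith("*"):
            needs_table_width = True  # relative or percentage specification
    return table_width is not None or not needs_table_width
```

For instance, `can_render_in_one_pass(["30", "2*"])` is false because a proportional column appears without a table width, while the same columns with `table_width="200"` pass the check.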
The directionality of a table is either the inherited directionality (the default is left-to-right) or that specified by the dir attribute for the TABLE element.
For a left-to-right table, column zero is on the left side and row zero is at the top. For a right-to-left table, column zero is on the right side and row zero is at the top.
When a user agent allots extra cells to a row (see the section on calculating the number of columns in a table), extra row cells are added to the right of the table for left-to-right tables and to the left side for right-to-left tables.
Note that TABLE is the only element on which dir reverses the visual order of the columns; a single table row (TR) or a group of columns (COLGROUP) cannot be independently reversed.
When set for the TABLE element, the dir attribute also affects the direction of text within table cells (since the dir attribute is inherited by block-level elements).
To specify a right-to-left table, set the dir attribute as follows:
<TABLE dir="RTL"> ...the rest of the table... </TABLE>
The direction of text in individual cells can be changed by setting the dir attribute in an element that defines the cell. Please consult the section on bidirectional text for more information on text direction issues.
<!ELEMENT CAPTION - - (%inline;)* -- table caption --> <!ATTLIST CAPTION %attrs; -- %coreattrs, %i18n, %events -- >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
When present, the CAPTION element's text should describe the nature of the table. The CAPTION element is only permitted immediately after the TABLE start tag. A TABLE element may only contain one CAPTION element.
Visual user agents allow sighted people to quickly grasp the structure of the table from the headings as well as the caption. A consequence of this is that captions will often be inadequate as a summary of the purpose and structure of the table from the perspective of people relying on non-visual user agents.
Authors should therefore take care to provide additional information summarizing the purpose and structure of the table using the summary attribute of the TABLE element. This is especially important for tables without captions. Examples below illustrate the use of the summary attribute.
Visual user agents should avoid clipping any part of the table including the caption, unless a means is provided to access all parts, e.g., by horizontal or vertical scrolling. We recommend that the caption text be wrapped to the same width as the table. (See also the section on recommended layout algorithms.)
<!ELEMENT THEAD - O (TR)+ -- table header --> <!ELEMENT TFOOT - O (TR)+ -- table footer -->
Start tag: required, End tag: optional
<!ELEMENT TBODY O O (TR)+ -- table body -->
Start tag: optional, End tag: optional
<!ATTLIST (THEAD|TBODY|TFOOT) -- table section -- %attrs; -- %coreattrs, %i18n, %events -- %cellhalign; -- horizontal alignment in cells -- %cellvalign; -- vertical alignment in cells -- >
Attributes defined elsewhere
Table rows may be grouped into a table head, table foot, and one or more table body sections, using the THEAD, TFOOT and TBODY elements, respectively. This division enables user agents to support scrolling of table bodies independently of the table head and foot. When long tables are printed, the table head and foot information may be repeated on each page that contains table data.
The table head and table foot should contain information about the table's columns. The table body should contain rows of table data.
When present, each THEAD, TFOOT, and TBODY contains a row group. Each row group must contain at least one row, defined by the TR element.
This example illustrates the order and structure of table heads, feet, and bodies.
<TABLE> <THEAD> <TR> ...header information... </THEAD> <TFOOT> <TR> ...footer information... </TFOOT> <TBODY> <TR> ...first row of block one data... <TR> ...second row of block one data... </TBODY> <TBODY> <TR> ...first row of block two data... <TR> ...second row of block two data... <TR> ...third row of block two data... </TBODY> </TABLE>
TFOOT must appear before TBODY within a TABLE definition so that user agents can render the foot before receiving all of the (potentially numerous) rows of data. The following summarizes which tags are required and which may be omitted:
Conforming user agent parsers must obey these rules for reasons of backward compatibility.
The table of the previous example could be shortened by removing certain end tags, as in:
<TABLE> <THEAD> <TR> ...header information... <TFOOT> <TR> ...footer information... <TBODY> <TR> ...first row of block one data... <TR> ...second row of block one data... <TBODY> <TR> ...first row of block two data... <TR> ...second row of block two data... <TR> ...third row of block two data... </TABLE>
The THEAD, TFOOT, and TBODY sections must contain the same number of columns.
Column groups allow authors to create structural divisions within a table. Authors may highlight this structure through style sheets or HTML attributes (e.g., the rules attribute for the TABLE element). For an example of the visual presentation of column groups, please consult the sample table.
A table may either contain a single implicit column group (no COLGROUP element delimits the columns) or any number of explicit column groups (each delimited by an instance of the COLGROUP element).
The COL element allows authors to share attributes among several columns without implying any structural grouping. The "span" of the COL element is the number of columns that will share the element's attributes.
<!ELEMENT COLGROUP - O (COL)* -- table column group --> <!ATTLIST COLGROUP %attrs; -- %coreattrs, %i18n, %events -- span NUMBER 1 -- default number of columns in group -- width %MultiLength; #IMPLIED -- default width for enclosed COLs -- %cellhalign; -- horizontal alignment in cells -- %cellvalign; -- vertical alignment in cells -- >
Start tag: required, End tag: optional
Attribute definitions
User agents must ignore this attribute if the COLGROUP element contains one or more COL elements.
This attribute specifies a default width for each column in the current column group. In addition to the standard pixel, percentage, and relative values, this attribute allows the special form "0*" (zero asterisk), which means that the width of each column in the group should be the minimum width necessary to hold the column's contents. This implies that a column's entire contents must be known before its width may be correctly computed. Authors should be aware that specifying "0*" will prevent visual user agents from rendering a table incrementally.
This attribute is overridden for any column in the column group whose width is specified via a COL element.
Attributes defined elsewhere
The COLGROUP element creates an explicit column group. The number of columns in the column group may be specified in two, mutually exclusive ways:
The advantage of using the span attribute is that authors may group together information about column widths. Thus, if a table contains forty columns, all of which have a width of 20 pixels, it is easier to write:
<COLGROUP span="40" width="20"> </COLGROUP>
than:
<COLGROUP> <COL width="20"> <COL width="20"> ...a total of forty COL elements... </COLGROUP>
When it is necessary to single out a column (e.g., for style information, to specify width information, etc.) within a group, authors must identify that column with a COL element. Thus, to apply special style information to the last column of the previous table, we single it out as follows:
<COLGROUP width="20"> <COL span="39"> <COL id="format-me-specially"> </COLGROUP>
The width attribute of the COLGROUP element is inherited by all 40 columns. The first COL element refers to the first 39 columns (doing nothing special to them) and the second one assigns an id value to the fortieth column so that style sheets may refer to it.
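The inheritance just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary representation of attributes, and the exact set of inherited attributes are hypothetical choices, not part of this specification.

```python
# Attributes assumed to pass from a COLGROUP down to its columns (an assumption).
INHERITED = ("width", "align", "valign", "char", "charoff")

def expand_colgroup(group_attrs, cols):
    """Return one attribute dict per column of a column group.

    group_attrs: attributes of the COLGROUP element, e.g. {"width": "20"}
    cols:        attribute dicts of the COL children, in order (may be empty)
    """
    inherited = {k: v for k, v in group_attrs.items() if k in INHERITED}
    if not cols:
        # No COL children: the COLGROUP's own span gives the column count.
        return [dict(inherited) for _ in range(int(group_attrs.get("span", 1)))]
    columns = []
    for col in cols:
        own = {k: v for k, v in col.items() if k != "span"}
        for _ in range(int(col.get("span", 1))):
            # Values set on the COL override those inherited from the COLGROUP.
            columns.append({**inherited, **own})
    return columns
```

Applied to the example above, `expand_colgroup({"width": "20"}, [{"span": "39"}, {"id": "format-me-specially"}])` yields forty columns, each 20 pixels wide, with only the last one carrying the id.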
The table in the following example contains two column groups. The first column group contains 10 columns and the second contains 5 columns. The default width for each column in the first column group is 50 pixels. The width of each column in the second column group will be the minimum required for that column.
<TABLE> <COLGROUP span="10" width="50"> <COLGROUP span="5" width="0*"> <THEAD> <TR><TD> ... </TABLE>
<!ELEMENT COL - O EMPTY -- table column --> <!ATTLIST COL -- column groups and properties -- %attrs; -- %coreattrs, %i18n, %events -- span NUMBER 1 -- COL attributes affect N columns -- width %MultiLength; #IMPLIED -- column width specification -- %cellhalign; -- horizontal alignment in cells -- %cellvalign; -- vertical alignment in cells -- >
Start tag: required, End tag: forbidden
Attribute definitions
Attributes defined elsewhere
The COL element allows authors to group together attribute specifications for table columns. The COL element does not group columns together structurally -- that is the role of the COLGROUP element. COL elements are empty and serve only as a support for attributes. They may appear inside or outside an explicit column group (i.e., COLGROUP element).
The width attribute for COL refers to the width of each column in the element's span.
There are two ways to determine the number of columns in a table (in order of precedence):
It is an error if a table contains COLGROUP or COL elements and the two calculations do not result in the same number of columns.
Once the user agent has calculated the number of columns in the table, it may group them into column groups.
For example, for each of the following tables, the two column calculation methods should result in three columns. The first three tables may be rendered incrementally.
<TABLE> <COLGROUP span="3"></COLGROUP> <TR><TD> ... ...rows... </TABLE> <TABLE> <COLGROUP> <COL> <COL span="2"> </COLGROUP> <TR><TD> ... ...rows... </TABLE> <TABLE> <COLGROUP> <COL> </COLGROUP> <COLGROUP span="2"> <TR><TD> ... ...rows... </TABLE> <TABLE> <TR> <TD><TD><TD> </TR> </TABLE>
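The two calculations can be sketched as follows. This is an illustration under assumptions: the data representation and function names are hypothetical, the COL/COLGROUP calculation takes precedence when such elements are present, a COLGROUP's span is used only when it has no COL children, and the row-based calculation ignores cells carried down from earlier rows by rowspan.

```python
def columns_from_col_elements(items):
    """Sum the columns declared by top-level COL and COLGROUP elements.

    items: list of dicts such as {"tag": "COL", "span": 2} or
           {"tag": "COLGROUP", "span": 3, "cols": [{"span": 2}, {}]}
    """
    total = 0
    for item in items:
        if item["tag"] == "COL":
            total += item.get("span", 1)
        else:  # COLGROUP
            children = item.get("cols", [])
            if children:
                # The COLGROUP's own span is ignored when it contains COL elements.
                total += sum(child.get("span", 1) for child in children)
            else:
                total += item.get("span", 1)
    return total

def columns_from_rows(rows):
    """rows: list of rows, each a list of colspan values (rowspan is ignored here)."""
    return max((sum(colspans) for colspans in rows), default=0)

def count_columns(items, rows):
    """Use COL/COLGROUP information when present, otherwise fall back to the rows."""
    return columns_from_col_elements(items) if items else columns_from_rows(rows)
```

Each of the four tables above gives three columns under this sketch, for example `count_columns([{"tag": "COLGROUP", "cols": [{}, {"span": 2}]}], [])` for the second table and `count_columns([], [[1, 1, 1]])` for the fourth.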
Authors may specify column widths in three ways:
However, if the table does not have a fixed width, user agents must receive all table data before they can determine the horizontal space required by the table. Only then may this space be allotted to proportional columns.
If an author specifies no width information for a column, a user agent may not be able to incrementally format the table since it must wait for the entire column of data to arrive in order to allot an appropriate width.
If column widths prove to be too narrow for the contents of a particular table cell, user agents may choose to reflow the table.
The table in this example contains six columns. The first one does not belong to an explicit column group. The next three belong to the first explicit column group and the last two belong to the second explicit column group. This table cannot be formatted incrementally since it contains proportional column width specifications and no value for the width attribute for the TABLE element.
Once the (visual) user agent has received the table's data, it will allot the available horizontal space as follows. First, the user agent will allot 30 pixels each to columns one and two. Then, the minimal space required for the third column will be reserved. The remaining horizontal space will be divided into six equal portions (since 2* + 1* + 3* = 6 portions). Column four (2*) will receive two of these portions, column five (1*) will receive one, and column six (3*) will receive three.
<TABLE> <COLGROUP> <COL width="30"> <COLGROUP> <COL width="30"> <COL width="0*"> <COL width="2*"> <COLGROUP align="center"> <COL width="1*"> <COL width="3*" align="char" char=":"> <THEAD> <TR><TD> ... ...rows... </TABLE>
We have set the value of the align attribute in the third column group to "center". All cells in every column in this group will inherit this value, but may override it. In fact, the final COL does just that, by specifying that every cell in the column it governs will be aligned along the ":" character.
In the following table, the column width specifications allow the user agent to format the table incrementally:
<TABLE width="200"> <COLGROUP span="10" width="15"> <COLGROUP width="*"> <COL id="penultimate-column"> <COL id="last-column"> <THEAD> <TR><TD> ... ...rows... </TABLE>
The first ten columns will be 15 pixels wide each. The last two columns will each receive half of the remaining 50 pixels. Note that the COL elements appear only so that an id value may be specified for the last two columns.
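The allotment described in the two examples above can be sketched as follows. This is an illustration only: the function name is hypothetical, percentage widths, borders, and cell spacing are not handled, and the minimum content width of a "0*" column must be supplied by the caller.

```python
def allot_widths(specs, available, min_content=None):
    """Distribute horizontal space among columns.

    specs:       width strings per column: "30" (pixels), "0*" (minimum
                 content width), or "N*" / "*" (proportional)
    available:   total horizontal space in pixels
    min_content: dict mapping a column index to its minimum content width,
                 required for every "0*" column
    """
    min_content = min_content or {}
    widths = [0.0] * len(specs)
    shares = {}
    for i, spec in enumerate(specs):
        if spec == "0*":
            widths[i] = float(min_content[i])
        elif spec.endswith("*"):
            shares[i] = int(spec[:-1] or 1)  # a bare "*" counts as "1*"
        else:
            widths[i] = float(spec)
    # Whatever is left after fixed and minimum widths is shared proportionally.
    remaining = max(available - sum(widths), 0)
    total_shares = sum(shares.values())
    for i, share in shares.items():
        widths[i] = remaining * share / total_shares
    return widths
```

For the second example, `allot_widths(["15"] * 10 + ["*", "*"], 200)` gives the last two columns 25 pixels each. For the first example, the total space and the third column's minimum width are not stated in the text; assuming 400 pixels and a minimum of 40, columns four to six receive 100, 50, and 150 pixels.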
Note. Although the width attribute on the TABLE element is not deprecated, authors are encouraged to use style sheets to specify table widths.
<!ELEMENT TR - O (TH|TD)+ -- table row --> <!ATTLIST TR -- table row -- %attrs; -- %coreattrs, %i18n, %events -- %cellhalign; -- horizontal alignment in cells -- %cellvalign; -- vertical alignment in cells -- >
Start tag: required, End tag: optional
Attributes defined elsewhere
The TR element acts as a container for a row of table cells. The end tag may be omitted.
This sample table contains three rows, each begun by the TR element:
<TABLE summary="This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar."> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR> ...A header row... <TR> ...First row of data... <TR> ...Second row of data... ...the rest of the table... </TABLE>
<!ELEMENT (TH|TD) - O (%flow;)* -- table header cell, table data cell--> <!-- Scope is simpler than headers attribute for common tables --> <!ENTITY % Scope "(row|col|rowgroup|colgroup)"> <!-- TH is for headers, TD for data, but for cells acting as both use TD --> <!ATTLIST (TH|TD) -- header or data cell -- %attrs; -- %coreattrs, %i18n, %events -- abbr %Text; #IMPLIED -- abbreviation for header cell -- axis CDATA #IMPLIED -- comma-separated list of related headers-- headers IDREFS #IMPLIED -- list of id's for header cells -- scope %Scope; #IMPLIED -- scope covered by header cells -- rowspan NUMBER 1 -- number of rows spanned by cell -- colspan NUMBER 1 -- number of cols spanned by cell -- %cellhalign; -- horizontal alignment in cells -- %cellvalign; -- vertical alignment in cells -- >
Start tag: required, End tag: optional
Attribute definitions
Attributes defined elsewhere
Table cells may contain two types of information: header information and data. This distinction enables user agents to render header and data cells distinctly, even in the absence of style sheets. For example, visual user agents may present header cell text with a bold font. Speech synthesizers may render header information with a distinct voice inflection.
The TH element defines a cell that contains header information. User agents have two pieces of header information available: the contents of the TH element and the value of the abbr attribute. User agents must render either the contents of the cell or the value of the abbr attribute. For visual media, the latter may be appropriate when there is insufficient space to render the full contents of the cell. For non-visual media abbr may be used as an abbreviation for table headers when these are rendered along with the contents of the cells to which they apply.
The headers and scope attributes also allow authors to help non-visual user agents process header information. Please consult the section on labeling cells for non-visual user agents for information and examples.
The TD element defines a cell that contains data.
Cells may be empty (i.e., contain no data).
For example, the following table contains four columns of data, each headed by a column description.
<TABLE summary="This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar."> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR> <TH>Name</TH> <TH>Cups</TH> <TH>Type of Coffee</TH> <TH>Sugar?</TH> <TR> <TD>T. Sexton</TD> <TD>10</TD> <TD>Espresso</TD> <TD>No</TD> <TR> <TD>J. Dinnen</TD> <TD>5</TD> <TD>Decaf</TD> <TD>Yes</TD> </TABLE>
A user agent rendering to a tty device might display this as follows:
Name         Cups   Type of Coffee   Sugar?
T. Sexton    10     Espresso         No
J. Dinnen    5      Decaf            Yes
Cells may span several rows or columns. The number of rows or columns spanned by a cell is set by the rowspan and colspan attributes for the TH and TD elements.
In this table definition, we specify that the cell in row four, column two should span a total of three columns, including the current column.
<TABLE border="1"> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR><TH>Name<TH>Cups<TH>Type of Coffee<TH>Sugar? <TR><TD>T. Sexton<TD>10<TD>Espresso<TD>No <TR><TD>J. Dinnen<TD>5<TD>Decaf<TD>Yes <TR><TD>A. Soria<TD colspan="3"><em>Not available</em> </TABLE>
This table might be rendered on a tty device by a visual user agent as follows:
Cups of coffee consumed by each senator
--------------------------------------
|   Name  |Cups|Type of Coffee|Sugar?|
--------------------------------------
|T. Sexton|10  |Espresso      |No    |
--------------------------------------
|J. Dinnen|5   |Decaf         |Yes   |
--------------------------------------
|A. Soria |Not available             |
--------------------------------------
The next example illustrates (with the help of table borders) how cell definitions that span more than one row or column affect the definition of later cells. Consider the following table definition:
<TABLE border="1"> <TR><TD>1 <TD rowspan="2">2 <TD>3 <TR><TD>4 <TD>6 <TR><TD>7 <TD>8 <TD>9 </TABLE>
As cell "2" spans the first and second rows, the definition of the second row will take it into account. Thus, the second TD in row two actually defines the row's third cell. Visually, the table might be rendered to a tty device as:
-------------
| 1 | 2 | 3 |
----|   |----
| 4 |   | 6 |
----|---|----
| 7 | 8 | 9 |
-------------
while a graphical user agent might render this as:
Note that if the TD defining cell "6" had been omitted, an extra empty cell would have been added by the user agent to complete the row.
Similarly, in the following table definition:
<TABLE border="1"> <TR><TD>1 <TD>2 <TD>3 <TR><TD colspan="2">4 <TD>6 <TR><TD>7 <TD>8 <TD>9 </TABLE>
cell "4" spans two columns, so the second TD in the row actually defines the third cell ("6"):
-------------
| 1 | 2 | 3 |
--------|----
| 4     | 6 |
--------|----
| 7 | 8 | 9 |
-------------
A graphical user agent might render this as:
Defining overlapping cells is an error. User agents may vary in how they handle this error (e.g., rendering may vary).
The following illegal example illustrates how one might create overlapping cells. In this table, cell "5" spans two rows and cell "7" spans two columns, so there is overlap in the cell between "7" and "9":
<TABLE border="1"> <TR><TD>1 <TD>2 <TD>3 <TR><TD>4 <TD rowspan="2">5 <TD>6 <TR><TD colspan="2">7 <TD>9 </TABLE>
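The way spanning cells shift later cell definitions, and the way an overlap arises, can be sketched with a simple slot grid. This is an illustration, not a required algorithm: the function name and the tuple representation of cells are assumptions, and raising an exception is just one possible way of reporting the error.

```python
def layout_cells(rows):
    """Assign each cell definition to the grid slots it occupies.

    rows: list of rows; each row is a list of (label, rowspan, colspan)
    Returns a dict mapping (row, column) to the label of the occupying cell.
    Raises ValueError if two cells claim the same slot.
    """
    grid = {}
    for r, row in enumerate(rows):
        col = 0
        for label, rowspan, colspan in row:
            # Skip slots already taken by a cell spanning down from an earlier row.
            while (r, col) in grid:
                col += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    slot = (r + dr, col + dc)
                    if slot in grid:
                        raise ValueError(
                            f"cells {grid[slot]!r} and {label!r} overlap at {slot}"
                        )
                    grid[slot] = label
            col += colspan
    return grid
```

With the rowspan example above, the second TD of row two lands in slot `(1, 2)`, because slot `(1, 1)` is already held by cell "2". With the illegal example, cell "7" tries to claim slot `(2, 1)`, which cell "5" already occupies, and the sketch raises `ValueError`.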
Note. The following sections describe the HTML table attributes that concern visual formatting. When this specification was first published in 1997, [CSS1] did not offer mechanisms to control all aspects of visual table formatting. Since then, [CSS2] has added properties to allow visual formatting of tables.
HTML 4 includes mechanisms to control:
The following attributes affect a table's external frame and internal rules.
Attribute definitions
To help distinguish the cells of a table, we can set the border attribute of the TABLE element. Consider a previous example:
<TABLE border="1" summary="This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar."> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR> <TH>Name</TH> <TH>Cups</TH> <TH>Type of Coffee</TH> <TH>Sugar?</TH> <TR> <TD>T. Sexton</TD> <TD>10</TD> <TD>Espresso</TD> <TD>No</TD> <TR> <TD>J. Dinnen</TD> <TD>5</TD> <TD>Decaf</TD> <TD>Yes</TD> </TABLE>
In the following example, the user agent should show borders five pixels thick on the left-hand and right-hand sides of the table, with rules drawn between each column.
<TABLE border="5" frame="vsides" rules="cols"> <TR> <TD>1 <TD>2 <TD>3 <TR> <TD>4 <TD>5 <TD>6 <TR> <TD>7 <TD>8 <TD>9 </TABLE>
The following settings should be observed by user agents for backwards compatibility.
For example, the following definitions are equivalent:
<TABLE border="2"> <TABLE border="2" frame="border" rules="all">
as are the following:
<TABLE border> <TABLE frame="border" rules="all">
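The equivalences above can be sketched as a small mapping. This is an illustration under assumptions: the function name and the use of `True` for a bare `border` attribute are hypothetical, explicitly specified frame and rules values are assumed to win over implied ones, and the border="0" case (implying frame="void" and rules="none") is taken from the HTML 4 definition of the border attribute rather than from the examples shown here.

```python
def effective_frame_and_rules(border=None, frame=None, rules=None):
    """Return the (frame, rules) pair implied by a TABLE's attributes.

    border: None if absent, True for a bare `border`, or a pixel count string
    frame, rules: explicitly specified values, or None if absent
    """
    if border is None:
        return frame, rules  # nothing implied; user agent defaults apply
    if border is True:
        # <TABLE border> is read as frame="border", which implies rules="all".
        return "border", rules or "all"
    if int(border) == 0:
        return frame or "void", rules or "none"
    return frame or "border", rules or "all"
```

Under this sketch, `effective_frame_and_rules(border="2")` and `effective_frame_and_rules(border=True)` both give `("border", "all")`, matching the two pairs of equivalent definitions above.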
Note. The border attribute also defines the border behavior for the OBJECT and IMG elements, but takes different values for those elements.
The following attributes may be set for different table elements (see their definitions).
<!-- horizontal alignment attributes for cell contents --> <!ENTITY % cellhalign "align (left|center|right|justify|char) #IMPLIED char %Character; #IMPLIED -- alignment char, e.g. char=':' -- charoff %Length; #IMPLIED -- offset for alignment char --" > <!-- vertical alignment attributes for cell contents --> <!ENTITY % cellvalign "valign (top|middle|bottom|baseline) #IMPLIED" >
Attribute definitions
When charoff is used to set the offset of an alignment character, the direction of offset is determined by the current text direction (set by the dir attribute). In left-to-right texts (the default), offset is from the left margin. In right-to-left texts, offset is from the right margin. User agents are not required to support this attribute.
The table in this example aligns a row of currency values along a decimal point. We set the alignment character to "." explicitly.
<TABLE border="1"> <COLGROUP> <COL><COL align="char" char="."> <THEAD> <TR><TH>Vegetable <TH>Cost per kilo <TBODY> <TR><TD>Lettuce <TD>$1 <TR><TD>Silver carrots <TD>$10.50 <TR><TD>Golden turnips <TD>$100.30 </TABLE>
The formatted table may resemble the following:
------------------------------
|   Vegetable  |Cost per kilo|
|--------------|-------------|
|Lettuce       |        $1   |
|--------------|-------------|
|Silver carrots|       $10.50|
|--------------|-------------|
|Golden turnips|      $100.30|
------------------------------
When the contents of a cell contain more than one instance of the alignment character specified by char and the contents wrap, user agent behavior is undefined. Authors should therefore be attentive in their use of char.
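Alignment on a character can be sketched for plain text as follows. This is an illustration only: the function name is hypothetical, only the first occurrence of the character in each value is considered, and a value without the alignment character is assumed to end just before the alignment position.

```python
def align_on_char(values, char="."):
    """Pad strings so that the first occurrence of `char` lines up in each one."""
    # Split each value into the text before the character, the character
    # itself (empty if absent), and the text after it.
    split = [value.partition(char) for value in values]
    left = max(len(before) for before, _, _ in split)
    right = max(len(sep) + len(after) for _, sep, after in split)
    return [
        before.rjust(left) + (sep + after).ljust(right)
        for before, sep, after in split
    ]
```

For the cost column of the example, `align_on_char(["$1", "$10.50", "$100.30"])` returns `["  $1   ", " $10.50", "$100.30"]`.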
Note. Visual user agents typically render TH elements vertically and horizontally centered within the cell and with a bold font weight.
The alignment of cell contents can be specified on a cell by cell basis, or inherited from enclosing elements, such as the row, column or the table itself.
The order of precedence (from highest to lowest) for the attributes align, char, and charoff is the following:
The order of precedence (from highest to lowest) for the attribute valign (as well as the other inherited attributes lang, dir, and style) is the following:
Furthermore, when rendering cells, horizontal alignment is determined by columns in preference to rows, while for vertical alignment, rows are given preference over columns.
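The difference between horizontal and vertical precedence can be sketched as a lookup over enclosing levels. This is an illustration under assumptions: the names are hypothetical, and only four levels (cell, column, row, table) are modelled, with the cell first and the table last.

```python
# Columns outrank rows for horizontal alignment; rows outrank columns for vertical.
HALIGN_ORDER = ("cell", "column", "row", "table")
VALIGN_ORDER = ("cell", "row", "column", "table")

def resolve_alignment(values, order, default):
    """Return the first value found when walking the levels in `order`.

    values:  dict mapping a level name to the attribute value set there
    order:   tuple of level names, highest precedence first
    default: value to use when no level sets the attribute
    """
    for level in order:
        value = values.get(level)
        if value is not None:
            return value
    return default
```

For a cell whose column says "center" and whose row says "right", `resolve_alignment({"column": "center", "row": "right"}, HALIGN_ORDER, "left")` gives "center"; the same conflict for valign, resolved with `VALIGN_ORDER`, would be decided by the row.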
The default alignment for cells depends on the user agent. However, user agents should choose the default according to the current directionality (i.e., not just "left" in all cases).
User agents that do not support the "justify" value of the align attribute should use the value of the inherited directionality in its place.
Attribute definitions
These two attributes control spacing between and within cells. The following illustration explains how they relate:
In the following example, the cellspacing attribute specifies that cells should be separated from each other and from the table frame by twenty pixels. The cellpadding attribute specifies that the top margin of the cell and the bottom margin of the cell will each be separated from the cell's contents by 10% of the available vertical space (the total being 20%). Similarly, the left margin of the cell and the right margin of the cell will each be separated from the cell's contents by 10% of the available horizontal space (the total being 20%).
<TABLE cellspacing="20" cellpadding="20%"> <TR> <TD>Data1 <TD>Data2 <TD>Data3 </TABLE>
If a table or given column has a fixed width, cellspacing and cellpadding may demand more space than assigned. User agents may give these attributes precedence over the width attribute when a conflict occurs, but are not required to.
Non-visual user agents such as speech synthesizers and Braille-based devices may use the following TD and TH element attributes to render table cells more intuitively:
In the following example, we assign header information to cells by setting the headers attribute. Each cell in the same column refers to the same header cell (via the id attribute).
<TABLE border="1" summary="This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar."> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR> <TH id="t1">Name</TH> <TH id="t2">Cups</TH> <TH id="t3" abbr="Type">Type of Coffee</TH> <TH id="t4">Sugar?</TH> <TR> <TD headers="t1">T. Sexton</TD> <TD headers="t2">10</TD> <TD headers="t3">Espresso</TD> <TD headers="t4">No</TD> <TR> <TD headers="t1">J. Dinnen</TD> <TD headers="t2">5</TD> <TD headers="t3">Decaf</TD> <TD headers="t4">Yes</TD> </TABLE>
A speech synthesizer might render this table as follows:
Caption: Cups of coffee consumed by each senator
Summary: This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar.
Name: T. Sexton, Cups: 10, Type: Espresso, Sugar: No
Name: J. Dinnen, Cups: 5, Type: Decaf, Sugar: Yes
Note how the header "Type of Coffee" is abbreviated to "Type" using the abbr attribute.
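How a non-visual user agent might combine the headers, id, and abbr attributes can be sketched as follows. This is an illustration only: the function name and data representation are hypothetical, and unlike the sample rendering above, the sketch does not strip punctuation from header text (it would say "Sugar?" rather than "Sugar").

```python
def speak_rows(header_cells, data_rows):
    """Prefix each data cell with the text of the header cells it references.

    header_cells: dict mapping an id to {"text": ..., "abbr": ... (optional)}
    data_rows:    list of rows; each row is a list of (headers_attr, content),
                  where headers_attr is a space-separated list of header ids
    """
    lines = []
    for row in data_rows:
        parts = []
        for headers_attr, content in row:
            labels = []
            for header_id in headers_attr.split():
                header = header_cells[header_id]
                # Prefer the abbreviation when the author supplied one.
                labels.append(header.get("abbr") or header["text"])
            parts.append(f"{' '.join(labels)}: {content}")
        lines.append(", ".join(parts))
    return lines
```

Given the header cells t1 to t4 of the example and the first data row, the sketch produces "Name: T. Sexton, Cups: 10, Type: Espresso, Sugar?: No".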
Here is the same example substituting the scope attribute for the headers attribute. Note the value "col" for the scope attribute, meaning "all cells in the current column":
<TABLE border="1" summary="This table charts the number of cups of coffee consumed by each senator, the type of coffee (decaf or regular), and whether taken with sugar."> <CAPTION>Cups of coffee consumed by each senator</CAPTION> <TR> <TH scope="col">Name</TH> <TH scope="col">Cups</TH> <TH scope="col" abbr="Type">Type of Coffee</TH> <TH scope="col">Sugar?</TH> <TR> <TD>T. Sexton</TD> <TD>10</TD> <TD>Espresso</TD> <TD>No</TD> <TR> <TD>J. Dinnen</TD> <TD>5</TD> <TD>Decaf</TD> <TD>Yes</TD> </TABLE>
Here's a somewhat more complex example illustrating other values for the scope attribute:
<TABLE border="1" cellpadding="5" cellspacing="2" summary="History courses offered in the community of Bath arranged by course name, tutor, summary, code, and fee"> <TR> <TH colspan="5" scope="colgroup">Community Courses -- Bath Autumn 1997</TH> </TR> <TR> <TH scope="col" abbr="Name">Course Name</TH> <TH scope="col" abbr="Tutor">Course Tutor</TH> <TH scope="col">Summary</TH> <TH scope="col">Code</TH> <TH scope="col">Fee</TH> </TR> <TR> <TD scope="row">After the Civil War</TD> <TD>Dr. John Wroughton</TD> <TD> The course will examine the turbulent years in England after 1646. <EM>6 weekly meetings starting Monday 13th October.</EM> </TD> <TD>H27</TD> <TD>£32</TD> </TR> <TR> <TD scope="row">An Introduction to Anglo-Saxon England</TD> <TD>Mark Cottle</TD> <TD> One day course introducing the early medieval period reconstruction the Anglo-Saxons and their society. <EM>Saturday 18th October.</EM> </TD> <TD>H28</TD> <TD>£18</TD> </TR> <TR> <TD scope="row">The Glory that was Greece</TD> <TD>Valerie Lorenz</TD> <TD> Birthplace of democracy, philosophy, heartland of theater, home of argument. The Romans may have done it but the Greeks did it first. <EM>Saturday day school 25th October 1997</EM> </TD> <TD>H30</TD> <TD>£18</TD> </TR> </TABLE>
A graphical user agent might render this as:
Note the use of the scope attribute with the "row" value. Although the first cell in each row contains data, not header information, the scope attribute makes the data cell behave like a row header cell. This allows speech synthesizers to provide the relevant course name upon request or to state it immediately before each cell's content.
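As a rough illustration of how a user agent might use scope when speaking a cell, the following Python sketch collects the header text that applies to a given data cell. The grid-of-dictionaries model and the function name headers_for are assumptions made for this example and are not defined by this specification; the sketch handles only the "col" and "row" values, assumes left-to-right directionality, and ignores cells that span rows or columns.

```python
# Hypothetical model: each cell is a dict with "text" and an optional "scope".
# Rows and columns are indexed from 0; spanning cells are not handled.
def headers_for(grid, row, col):
    """Return the header texts that apply to grid[row][col] through scope."""
    found = []
    # scope="col": a cell higher up in the same column covers this cell.
    for r in range(row):
        cell = grid[r][col]
        if cell.get("scope") == "col":
            found.append(cell["text"])
    # scope="row": a cell earlier in the same row covers this cell.
    for c in range(col):
        cell = grid[row][c]
        if cell.get("scope") == "row":
            found.append(cell["text"])
    return found

# A three-column excerpt of the course table above.
grid = [
    [{"text": "Course Name", "scope": "col"},
     {"text": "Course Tutor", "scope": "col"},
     {"text": "Code", "scope": "col"}],
    [{"text": "After the Civil War", "scope": "row"},
     {"text": "Dr. John Wroughton"},
     {"text": "H27"}],
]

print(headers_for(grid, 1, 2))
```

For the cell "H27" the sketch yields the column header "Code" followed by the course name, which is the information a speech synthesizer could state before the cell's content.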
Users browsing a table with a speech-based user agent may wish to hear an explanation of a cell's contents in addition to the contents themselves. One way the user agent might provide an explanation is by speaking associated header information before speaking the data cell's contents (see the section on associating header information with data cells).
Users may also want information about more than one cell, in which case header information provided at the cell level (by headers, scope, and abbr) may not provide adequate context. Consider the following table, which classifies expenses for meals, hotels, and transport in two locations (San Jose and Seattle) over several days:
Users might want to extract information from the table in the form of queries:
Each query involves a computation by the user agent that may involve zero or more cells. In order to determine, for example, the costs of meals on 25 August, the user agent must know which table cells refer to "Meals" (all of them) and which refer to "Dates" (specifically, 25 August), and find the intersection of the two sets.
To accommodate this type of query, the HTML 4 table model allows authors to place cell headers and data into categories. For example, for the travel expense table, an author could group the header cells "San Jose" and "Seattle" into the category "Location", the headers "Meals", "Hotels", and "Transport" in the category "Expenses", and the four days into the category "Date". The previous three questions would then have the following meanings:
Authors categorize a header or data cell by setting the axis attribute for the cell. For instance, in the travel expense table, the cell containing the information "San Jose" could be placed in the "Location" category as follows:
<TH id="a6" axis="location">San Jose</TH>
Any cell containing information related to "San Jose" should refer to this header cell via either the headers or the scope attribute. Thus, meal expenses for 25-Aug-1997 should be marked up to refer to the id attribute (whose value here is "a6") of the "San Jose" header cell:
<TD headers="a6">37.74</TD>
Each headers attribute provides a list of id references. Authors may thus categorize a given cell in any number of ways (or, along any number of "headers", hence the name).
Below we mark up the travel expense table with category information:
<TABLE border="1" summary="This table summarizes travel expenses incurred during August trips to San Jose and Seattle"> <CAPTION> Travel Expense Report </CAPTION> <TR> <TH></TH> <TH id="a2" axis="expenses">Meals</TH> <TH id="a3" axis="expenses">Hotels</TH> <TH id="a4" axis="expenses">Transport</TH> <TD>subtotals</TD> </TR> <TR> <TH id="a6" axis="location">San Jose</TH> <TH></TH> <TH></TH> <TH></TH> <TD></TD> </TR> <TR> <TD id="a7" axis="date">25-Aug-97</TD> <TD headers="a6 a7 a2">37.74</TD> <TD headers="a6 a7 a3">112.00</TD> <TD headers="a6 a7 a4">45.00</TD> <TD></TD> </TR> <TR> <TD id="a8" axis="date">26-Aug-97</TD> <TD headers="a6 a8 a2">27.28</TD> <TD headers="a6 a8 a3">112.00</TD> <TD headers="a6 a8 a4">45.00</TD> <TD></TD> </TR> <TR> <TD>subtotals</TD> <TD>65.02</TD> <TD>224.00</TD> <TD>90.00</TD> <TD>379.02</TD> </TR> <TR> <TH id="a10" axis="location">Seattle</TH> <TH></TH> <TH></TH> <TH></TH> <TD></TD> </TR> <TR> <TD id="a11" axis="date">27-Aug-97</TD> <TD headers="a10 a11 a2">96.25</TD> <TD headers="a10 a11 a3">109.00</TD> <TD headers="a10 a11 a4">36.00</TD> <TD></TD> </TR> <TR> <TD id="a12" axis="date">28-Aug-97</TD> <TD headers="a10 a12 a2">35.00</TD> <TD headers="a10 a12 a3">109.00</TD> <TD headers="a10 a12 a4">36.00</TD> <TD></TD> </TR> <TR> <TD>subtotals</TD> <TD>131.25</TD> <TD>218.00</TD> <TD>72.00</TD> <TD>421.25</TD> </TR> <TR> <TH>Totals</TH> <TD>196.27</TD> <TD>442.00</TD> <TD>162.00</TD> <TD>800.27</TD> </TR> </TABLE>
Note that marking up the table this way also allows user agents to avoid confusing the user with unwanted information. For instance, if a speech synthesizer were to speak all of the figures in the "Meals" column of this table in response to the query "What were all my meal expenses?", a user would not be able to distinguish a day's expenses from subtotals or totals. By carefully categorizing cell data, authors allow user agents to make important semantic distinctions when rendering.
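The category mechanism can be pictured as a lookup over header references. The Python sketch below is one possible way to answer such a query and is not an algorithm defined by this specification; the HEADERS and DATA_CELLS structures and the query function are hypothetical names, and only a few cells of the travel expense table are transcribed.

```python
# Hypothetical data model: header id -> (axis category, header text).
HEADERS = {
    "a2": ("expenses", "Meals"),
    "a3": ("expenses", "Hotels"),
    "a6": ("location", "San Jose"),
    "a7": ("date", "25-Aug-97"),
    "a8": ("date", "26-Aug-97"),
    "a10": ("location", "Seattle"),
    "a11": ("date", "27-Aug-97"),
}

# (cell content, ids listed in the cell's headers attribute)
DATA_CELLS = [
    ("37.74", ["a6", "a7", "a2"]),
    ("112.00", ["a6", "a7", "a3"]),
    ("27.28", ["a6", "a8", "a2"]),
    ("65.02", []),  # subtotal cell: no headers attribute in the markup
    ("96.25", ["a10", "a11", "a2"]),
]

def query(**wanted):
    """Return the contents of cells whose headers match every axis=text pair."""
    results = []
    for value, header_ids in DATA_CELLS:
        # Map each category of this cell to the header text it refers to.
        categories = dict(HEADERS[h] for h in header_ids)
        if all(categories.get(axis) == text for axis, text in wanted.items()):
            results.append(value)
    return results

print(query(expenses="Meals", location="San Jose"))
```

Because the subtotal cell carries no header references, it is not returned by a query on "Meals", which mirrors the distinction between daily expenses and subtotals described above.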
Of course, there is no limit to how authors may categorize information in a table. In the travel expense table, for example, we could add the additional categories "subtotals" and "totals".
This specification does not require user agents to handle information provided by the axis attribute, nor does it make any recommendations about how user agents may present axis information to users or how users may query the user agent about this information.
However, user agents, particularly speech synthesizers, may want to factor out information common to several cells that are the result of a query. For instance, if the user asks "What did I spend for meals in San Jose?", the user agent would first determine the cells in question (25-Aug-1997: 37.74, 26-Aug-1997: 27.28), then render this information. A user agent speaking this information might read it:
Location: San Jose. Date: 25-Aug-1997. Expenses, Meals: 37.74 Location: San Jose. Date: 26-Aug-1997. Expenses, Meals: 27.28
or, more compactly:
San Jose, 25-Aug-1997, Meals: 37.74 San Jose, 26-Aug-1997, Meals: 27.28
An even more economical rendering would factor the common information and reorder it:
San Jose, Meals, 25-Aug-1997: 37.74 26-Aug-1997: 27.28
User agents that support this type of rendering should allow authors a means to customize rendering (e.g., through style sheets).
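The factoring step described above can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout (location, expense, date, amount) and the function name render_factored are invented for the example, and the specification does not prescribe any particular rendering.

```python
def render_factored(rows):
    """Return utterances for (location, expense, date, amount) rows.

    Fields shared by every row are spoken once, first; the remaining
    fields are spoken together with each amount. Assumes rows is non-empty.
    """
    fields = range(3)  # positions of location, expense, date
    shared = [i for i in fields if all(r[i] == rows[0][i] for r in rows)]
    varying = [i for i in fields if i not in shared]
    utterances = []
    if shared:
        utterances.append(", ".join(rows[0][i] for i in shared))
    for r in rows:
        label = ", ".join(r[i] for i in varying)
        utterances.append(f"{label}: {r[3]}" if label else r[3])
    return utterances

rows = [
    ("San Jose", "Meals", "25-Aug-1997", "37.74"),
    ("San Jose", "Meals", "26-Aug-1997", "27.28"),
]
for line in render_factored(rows):
    print(line)
```

The location and expense type are common to both cells, so they are emitted once, and only the date accompanies each amount.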
In the absence of header information from either the scope or headers attribute, user agents may construct header information according to the following algorithm. The goal of the algorithm is to find an ordered list of headers. (In the following description of the algorithm the table directionality is assumed to be left-to-right.)
This sample illustrates grouped rows and columns. The example is adapted from "Developing International Software", by Nadine Kano.
In "ascii art", the following table:
<TABLE border="2" frame="hsides" rules="groups" summary="Code page support in different versions of MS Windows."> <CAPTION>CODE-PAGE SUPPORT IN MICROSOFT WINDOWS</CAPTION> <COLGROUP align="center"> <COLGROUP align="left"> <COLGROUP align="center" span="2"> <COLGROUP align="center" span="3"> <THEAD valign="top"> <TR> <TH>Code-Page<BR>ID <TH>Name <TH>ACP <TH>OEMCP <TH>Windows<BR>NT 3.1 <TH>Windows<BR>NT 3.51 <TH>Windows<BR>95 <TBODY> <TR><TD>1200<TD>Unicode (BMP of ISO/IEC-10646)<TD><TD><TD>X<TD>X<TD>* <TR><TD>1250<TD>Windows 3.1 Eastern European<TD>X<TD><TD>X<TD>X<TD>X <TR><TD>1251<TD>Windows 3.1 Cyrillic<TD>X<TD><TD>X<TD>X<TD>X <TR><TD>1252<TD>Windows 3.1 US (ANSI)<TD>X<TD><TD>X<TD>X<TD>X <TR><TD>1253<TD>Windows 3.1 Greek<TD>X<TD><TD>X<TD>X<TD>X <TR><TD>1254<TD>Windows 3.1 Turkish<TD>X<TD><TD>X<TD>X<TD>X <TR><TD>1255<TD>Hebrew<TD>X<TD><TD><TD><TD>X <TR><TD>1256<TD>Arabic<TD>X<TD><TD><TD><TD>X <TR><TD>1257<TD>Baltic<TD>X<TD><TD><TD><TD>X <TR><TD>1361<TD>Korean (Johab)<TD>X<TD><TD><TD>**<TD>X <TBODY> <TR><TD>437<TD>MS-DOS United States<TD><TD>X<TD>X<TD>X<TD>X <TR><TD>708<TD>Arabic (ASMO 708)<TD><TD>X<TD><TD><TD>X <TR><TD>709<TD>Arabic (ASMO 449+, BCON V4)<TD><TD>X<TD><TD><TD>X <TR><TD>710<TD>Arabic (Transparent Arabic)<TD><TD>X<TD><TD><TD>X <TR><TD>720<TD>Arabic (Transparent ASMO)<TD><TD>X<TD><TD><TD>X </TABLE>
would be rendered something like this:
CODE-PAGE SUPPORT IN MICROSOFT WINDOWS
===============================================================================
Code-Page | Name                         | ACP  OEMCP | Windows Windows Windows
ID        |                              |            | NT 3.1  NT 3.51   95
-------------------------------------------------------------------------------
1200      | Unicode (BMP of ISO 10646)   |            |    X       X       *
1250      | Windows 3.1 Eastern European |  X         |    X       X       X
1251      | Windows 3.1 Cyrillic         |  X         |    X       X       X
1252      | Windows 3.1 US (ANSI)        |  X         |    X       X       X
1253      | Windows 3.1 Greek            |  X         |    X       X       X
1254      | Windows 3.1 Turkish          |  X         |    X       X       X
1255      | Hebrew                       |  X         |                    X
1256      | Arabic                       |  X         |                    X
1257      | Baltic                       |  X         |                    X
1361      | Korean (Johab)               |  X         |           **       X
-------------------------------------------------------------------------------
437       | MS-DOS United States         |        X   |    X       X       X
708       | Arabic (ASMO 708)            |        X   |                    X
709       | Arabic (ASMO 449+, BCON V4)  |        X   |                    X
710       | Arabic (Transparent Arabic)  |        X   |                    X
720       | Arabic (Transparent ASMO)    |        X   |                    X
===============================================================================
A graphical user agent might render this as:
This example illustrates how COLGROUP can be used to group columns and set the default column alignment. Similarly, TBODY is used to group rows. The frame and rules attributes tell the user agent which borders and rules to render.
HTML offers many of the conventional publishing idioms for rich text and structured documents, but what separates it from most other markup languages is its features for hypertext and interactive documents. This section introduces the link (or hyperlink, or Web link), the basic hypertext construct. A link is a connection from one Web resource to another. Although a simple concept, the link has been one of the primary forces driving the success of the Web.
A link has two ends -- called anchors -- and a direction. The link starts at the "source" anchor and points to the "destination" anchor, which may be any Web resource (e.g., an image, a video clip, a sound bite, a program, an HTML document, an element within an HTML document, etc.).
The default behavior associated with a link is the retrieval of another Web resource. This behavior is commonly and implicitly obtained by selecting the link (e.g., by clicking, through keyboard input, etc.).
The following HTML excerpt contains two links, one whose destination anchor is an HTML document named "chapter2.html" and the other whose destination anchor is a GIF image in the file "forest.gif":
<BODY> ...some text... <P>You'll find a lot more in <A href="chapter2.html">chapter two</A>. See also this <A href="../images/forest.gif">map of the enchanted forest.</A> </BODY>
By activating these links (by clicking with the mouse, through keyboard input, voice commands, etc.), users may visit these resources. Note that the href attribute in each source anchor specifies the address of the destination anchor with a URI.
The destination anchor of a link may be an element within an HTML document. The destination anchor must be given an anchor name and any URI addressing this anchor must include the name as its fragment identifier.
Destination anchors in HTML documents may be specified either by the A element (naming it with the name attribute), or by any other element (naming with the id attribute).
Thus, for example, an author might create a table of contents whose entries link to header elements H2, H3, etc., in the same document. Using the A element to create destination anchors, we would write:
<H1>Table of Contents</H1> <P><A href="#section1">Introduction</A><BR> <A href="#section2">Some background</A><BR> <A href="#section2.1">On a more personal note</A><BR> ...the rest of the table of contents... ...the document body... <H2><A name="section1">Introduction</A></H2> ...section 1... <H2><A name="section2">Some background</A></H2> ...section 2... <H3><A name="section2.1">On a more personal note</A></H3> ...section 2.1...
We may achieve the same effect by making the header elements themselves the anchors:
<H1>Table of Contents</H1> <P><A href="#section1">Introduction</A><BR> <A href="#section2">Some background</A><BR> <A href="#section2.1">On a more personal note</A><BR> ...the rest of the table of contents... ...the document body... <H2 id="section1">Introduction</H2> ...section 1... <H2 id="section2">Some background</H2> ...section 2... <H3 id="section2.1">On a more personal note</H3> ...section 2.1...
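To make the two ways of naming a destination anchor concrete, the Python sketch below gathers the anchor names a document declares, taking name from A elements and id from any element. The class name AnchorCollector is an assumption of this example, and the use of Python's html.parser module is merely a convenient choice; nothing here describes how a conforming user agent must be built.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect destination anchor names: name on A elements, id on any element."""

    def __init__(self):
        super().__init__()
        self.anchors = {}  # anchor name -> element name that declares it

    def handle_starttag(self, tag, attrs):
        # html.parser reports element and attribute names in lower case;
        # attribute values are passed through unchanged.
        attrs = dict(attrs)
        if tag == "a" and attrs.get("name"):
            self.anchors[attrs["name"]] = tag
        if attrs.get("id"):
            self.anchors[attrs["id"]] = tag

doc = """
<H1>Table of Contents</H1>
<P><A href="#section1">Introduction</A><BR>
<A href="#section2.1">On a more personal note</A>
<H2><A name="section1">Introduction</A></H2>
<H3 id="section2.1">On a more personal note</H3>
"""

collector = AnchorCollector()
collector.feed(doc)
print(collector.anchors)
```

Both "section1" (declared with name on an A element) and "section2.1" (declared with id on an H3 element) end up in the same table, so a link to either fragment identifier can be resolved the same way.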
By far the most common use of a link is to retrieve another Web resource, as illustrated in the previous examples. However, authors may insert links in their documents that express other relationships between resources than simply "activate this link to visit that related resource". Links that express other types of relationships have one or more link types specified in their source anchor.
The roles of a link defined by A or LINK are specified via the rel and rev attributes.
For instance, links defined by the LINK element may describe the position of a document within a series of documents. In the following excerpt, links within the document entitled "Chapter 5" point to the previous and next chapters:
<HEAD> ...other head information... <TITLE>Chapter 5</TITLE> <LINK rel="prev" href="chapter4.html"> <LINK rel="next" href="chapter6.html"> </HEAD>
The link type of the first link is "prev" and that of the second is "next" (two of several recognized link types). Links specified by LINK are not rendered with the document's contents, although user agents may render them in other ways (e.g., as navigation tools).
Even if they are not used for navigation, these links may be interpreted in interesting ways. For example, a user agent that prints a series of HTML documents as a single document may use this link information as the basis of forming a coherent linear document. Further information is given below on using links for the benefit of search engines.
Although several HTML elements and attributes create links to other resources (e.g., the IMG element, the FORM element, etc.), this chapter discusses links and anchors created by the LINK and A elements. The LINK element may only appear in the head of a document. The A element may only appear in the body.
When the A element's href attribute is set, the element defines a source anchor for a link that may be activated by the user to retrieve a Web resource. The source anchor is the location of the A instance and the destination anchor is the Web resource.
The retrieved resource may be handled by the user agent in several ways: by opening a new HTML document in the same user agent window, opening a new HTML document in a different window, starting a new program to handle the resource, etc. Since the A element has content (text, images, etc.), user agents may render this content in such a way as to indicate the presence of a link (e.g., by underlining the content).
When the name or id attributes of the A element are set, the element defines an anchor that may be the destination of other links.
Authors may set the name and href attributes simultaneously in the same A instance.
The LINK element defines a relationship between the current document and another resource. Although LINK has no content, the relationships it defines may be rendered by some user agents.
The title attribute may be set for both A and LINK to add information about the nature of a link. This information may be spoken by a user agent, rendered as a tool tip, cause a change in cursor image, etc.
Thus, we may augment a previous example by supplying a title for each link:
<BODY> ...some text... <P>You'll find a lot more in <A href="chapter2.html" title="Go to chapter two">chapter two</A>. See also this <A href="../images/forest.gif" title="GIF image of enchanted forest">map of the enchanted forest.</A> </BODY>
Since links may point to documents encoded with different character encodings, the A and LINK elements support the charset attribute. This attribute allows authors to advise user agents about the encoding of data at the other end of the link.
The hreflang attribute provides user agents with information about the language of a resource at the end of a link, just as the lang attribute provides information about the language of an element's content or attribute values.
Armed with this additional knowledge, user agents should be able to avoid presenting "garbage" to the user. Instead, they may either locate resources necessary for the correct presentation of the document or, if they cannot locate the resources, they should at least warn the user that the document will be unreadable and explain the cause.
<!ELEMENT A - - (%inline;)* -(A) -- anchor --> <!ATTLIST A %attrs; -- %coreattrs, %i18n, %events -- charset %Charset; #IMPLIED -- char encoding of linked resource -- type %ContentType; #IMPLIED -- advisory content type -- name CDATA #IMPLIED -- named link end -- href %URI; #IMPLIED -- URI for linked resource -- hreflang %LanguageCode; #IMPLIED -- language code -- rel %LinkTypes; #IMPLIED -- forward link types -- rev %LinkTypes; #IMPLIED -- reverse link types -- accesskey %Character; #IMPLIED -- accessibility key character -- shape %Shape; rect -- for use with client-side image maps -- coords %Coords; #IMPLIED -- for use with client-side image maps -- tabindex NUMBER #IMPLIED -- position in tabbing order -- onfocus %Script; #IMPLIED -- the element got the focus -- onblur %Script; #IMPLIED -- the element lost the focus -- >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
Each A element defines an anchor
Authors may also create an A element that specifies no anchors, i.e., that doesn't specify href, name, or id. Values for these attributes may be set at a later time through scripts.
In the example that follows, the A element defines a link. The source anchor is the text "W3C Web site" and the destination anchor is "http://www.w3.org/":
For more information about W3C, please consult the <A href="http://www.w3.org/">W3C Web site</A>.
This link designates the home page of the World Wide Web Consortium. When a user activates this link in a user agent, the user agent will retrieve the resource, in this case, an HTML document.
User agents generally render links in such a way as to make them obvious to users (underlining, reverse video, etc.). The exact rendering depends on the user agent. Rendering may vary according to whether the user has already visited the link or not. A possible visual rendering of the previous link might be:
For more information about W3C, please consult the W3C Web site.
                                                   ~~~~~~~~~~~~
To tell user agents explicitly what the character encoding of the destination page is, set the charset attribute:
For more information about W3C, please consult the <A href="http://www.w3.org/" charset="ISO-8859-1">W3C Web site</A>
Suppose we define an anchor named "anchor-one" in the file "one.html".
...text before the anchor... <A name="anchor-one">This is the location of anchor one.</A> ...text after the anchor...
This creates an anchor around the text "This is the location of anchor one.". Usually, the contents of A are not rendered in any special way when A defines an anchor only.
Having defined the anchor, we may link to it from the same or another document. URIs that designate anchors contain a "#" character followed by the anchor name (the fragment identifier). Here are some examples of such URIs:
Thus, a link defined in the file "two.html" in the same directory as "one.html" would refer to the anchor as follows:
...text before the link... For more information, please consult <A href="./one.html#anchor-one"> anchor one</A>. ...text after the link...
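A user agent following such a link has to separate the resource to retrieve from the anchor to locate within it. The following Python sketch shows one way to do that with the standard urllib.parse module; the address given for "two.html" is hypothetical and only its directory matters.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical address of "two.html"; "one.html" lives in the same directory.
base = "http://example.org/docs/two.html"

# Resolve the relative reference, then split off the fragment identifier.
absolute = urljoin(base, "./one.html#anchor-one")
resource, fragment = urldefrag(absolute)

print(resource)  # the document to retrieve
print(fragment)  # the anchor name to look for in that document
```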
The A element in the following example specifies a link (with href) and creates a named anchor (with name) simultaneously:
I just returned from vacation! Here's a <A name="anchor-two" href="http://www.somecompany.com/People/Ian/vacation/family.png"> photo of my family at the lake</A>.
This example contains a link to a different type of Web resource (a PNG image). Activating the link should cause the image resource to be retrieved from the Web (and possibly displayed if the system has been configured to do so).
Note. User agents should be able to find anchors created by empty A elements, but some fail to do so. For example, some user agents may not find the "empty-anchor" in the following HTML fragment:
<A name="empty-anchor"></A> <EM>...some HTML...</EM> <A href="#empty-anchor">Link to empty anchor</A>
An anchor name is the value of either the name or id attribute when used in the context of anchors. Anchor names must observe the following rules:
Thus, the following example is correct with respect to string matching and must be considered a match by user agents:
<P><A href="#xxx">...</A> ...more document... <P><A name="xxx">...</A>
ILLEGAL EXAMPLE:
The following example is illegal with respect to uniqueness since the two names are the same except for case:
<P><A name="xxx">...</A> <P><A name="XXX">...</A>
Although the following excerpt is legal HTML, the behavior of the user agent is not defined; some user agents may (incorrectly) consider this a match and others may not.
<P><A href="#xxx">...</A> ...more document... <P><A name="XXX">...</A>
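The behavior shown in these examples can be checked mechanically. The Python sketch below is an informal aid for authors, assuming the rules are as the examples suggest: names that differ only in case must not coexist in a document, and only an exact comparison counts as a reliable match. The function names case_collisions and matches are invented for this example.

```python
def case_collisions(names):
    """Return groups of anchor names that differ only in case.

    Exact duplicates are not reported here; only names that collide
    once case is ignored.
    """
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), set()).add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]

def matches(fragment, anchor_name):
    """Exact, case-sensitive comparison of fragment identifier and anchor name."""
    return fragment == anchor_name

print(case_collisions(["xxx", "XXX", "intro"]))
print(matches("xxx", "xxx"), matches("xxx", "XXX"))
```

An author could run such a check before publishing to avoid relying on user agents that treat "#xxx" and "XXX" as a match.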
Anchor names should be restricted to ASCII characters. Please consult the appendix for more information about non-ASCII characters in URI attribute values.
Links and anchors defined by the A element must not be nested; an A element must not contain any other A elements.
Since the DTD defines the LINK element to be empty, LINK elements may not be nested either.
The id attribute may be used to create an anchor at the start tag of any element (including the A element).
This example illustrates the use of the id attribute to position an anchor in an H2 element. The anchor is linked to via the A element.
You may read more about this in <A href="#section2">Section Two</A>. ...later in the document <H2 id="section2">Section Two</H2> ...later in the document <P>Please refer to <A href="#section2">Section Two</A> above for more details.
The following example names a destination anchor with the id attribute:
I just returned from vacation! Here's a <A id="anchor-two">photo of my family at the lake</A>.
The id and name attributes share the same name space. This means that they cannot both define an anchor with the same name in the same document. It is permissible to use both attributes to specify an element's unique identifier for the following elements: A, APPLET, FORM, FRAME, IFRAME, IMG, and MAP. When both attributes are used on a single element, their values must be identical.
ILLEGAL EXAMPLE:
The following excerpt is illegal HTML since these attributes declare the same name twice in the same document.
<A href="#a1">...</A> ... <H1 id="a1"> ...pages and pages... <A name="a1"></A>
The following example illustrates that id and name must be the same when both appear in an element's start tag:
<P><A name="a1" id="a1" href="#a1">...</A>
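The shared name space lends itself to a simple consistency check. The Python sketch below is an illustration only: the class NameSpaceChecker and the helper check are hypothetical, the element list is taken from the paragraph above, and the sketch compares names exactly without considering case.

```python
from html.parser import HTMLParser

class NameSpaceChecker(HTMLParser):
    """Report names declared more than once through id or name."""

    # Elements on which name may accompany id, as listed in the text above.
    DUAL = {"a", "applet", "form", "frame", "iframe", "img", "map"}

    def __init__(self):
        super().__init__()
        self.seen = set()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        declared = set()
        if attrs.get("id"):
            declared.add(attrs["id"])
        if tag in self.DUAL and attrs.get("name"):
            declared.add(attrs["name"])
        # id and name on one element must carry the same value.
        if len(declared) > 1:
            self.errors.append(f"<{tag}>: id and name differ")
        for value in declared:
            if value in self.seen:
                self.errors.append(f"<{tag}>: '{value}' already declared")
            self.seen.add(value)

def check(markup):
    """Return the list of problems found in a piece of markup."""
    checker = NameSpaceChecker()
    checker.feed(markup)
    return checker.errors

print(check('<A href="#a1">...</A> <H1 id="a1">Title</H1> <A name="a1"></A>'))
print(check('<P><A name="a1" id="a1" href="#a1">...</A>'))
```

The first call reports the second declaration of "a1"; the second call reports nothing, since id and name agree on a single element.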
Because of its specification in the HTML DTD, the name attribute may contain character references. Thus, the value "D&#xfc;rst" is a valid name attribute value, as is "D&uuml;rst". The id attribute, on the other hand, may not contain character references.
Use id or name? Authors should consider the following issues when deciding whether to use id or name for an anchor name:
A reference to an unavailable or unidentifiable resource is an error. Although user agents may vary in how they handle such an error, we recommend the following behavior:
<!ELEMENT LINK - O EMPTY -- a media-independent link --> <!ATTLIST LINK %attrs; -- %coreattrs, %i18n, %events -- charset %Charset; #IMPLIED -- char encoding of linked resource -- href %URI; #IMPLIED -- URI for linked resource -- hreflang %LanguageCode; #IMPLIED -- language code -- type %ContentType; #IMPLIED -- advisory content type -- rel %LinkTypes; #IMPLIED -- forward link types -- rev %LinkTypes; #IMPLIED -- reverse link types -- media %MediaDesc; #IMPLIED -- for rendering on these media -- >
Start tag: required, End tag: forbidden
Attributes defined elsewhere
This element defines a link. Unlike A, it may only appear in the HEAD section of a document, although it may appear any number of times. Although LINK has no content, it conveys relationship information that may be rendered by user agents in a variety of ways (e.g., a tool-bar with a drop-down menu of links).
This example illustrates how several LINK definitions may appear in the HEAD section of a document. The current document is "Chapter2.html". The rel attribute specifies the relationship of the linked document with the current document. The values "Index", "Next", and "Prev" are explained in the section on link types.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <TITLE>Chapter 2</TITLE> <LINK rel="Index" href="../index.html"> <LINK rel="Next" href="Chapter3.html"> <LINK rel="Prev" href="Chapter1.html"> </HEAD> ...the rest of the document...
The rel and rev attributes play complementary roles -- the rel attribute specifies a forward link and the rev attribute specifies a reverse link.
Consider two documents A and B.
Document A: <LINK href="docB" rel="foo">
Has exactly the same meaning as:
Document B: <LINK href="docA" rev="foo">
Both attributes may be specified simultaneously.
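The equivalence of rel and rev can be expressed by normalizing every link to a forward relationship. The Python sketch below assumes a simplified model in which each LINK is a dictionary holding href and a single link type; the function name as_forward_links is invented for this example, and lists of several link types in one attribute are not handled.

```python
def as_forward_links(document, links):
    """Normalize links to (source, link type, destination) triples.

    links: list of dicts with "href" and "rel" and/or "rev".
    A rev link from `document` to href is read as a rel link from
    href back to `document`.
    """
    triples = []
    for link in links:
        if "rel" in link:
            triples.append((document, link["rel"], link["href"]))
        if "rev" in link:
            triples.append((link["href"], link["rev"], document))
    return triples

from_a = as_forward_links("docA", [{"href": "docB", "rel": "foo"}])
from_b = as_forward_links("docB", [{"href": "docA", "rev": "foo"}])
print(from_a == from_b)
```

Both declarations reduce to the same triple, which is the sense in which they have the same meaning.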
When the LINK element links an external style sheet to a document, the type attribute specifies the style sheet language and the media attribute specifies the intended rendering medium or media. User agents may save time by retrieving from the network only those style sheets that apply to the current device.
Media types are further discussed in the section on style sheets.
Authors may use the LINK element to provide a variety of information to search engines, including:
The examples below illustrate how language information, media types, and link types may be combined to improve document handling by search engines.
In the following example, we use the hreflang attribute to tell search engines where to find Dutch, Portuguese, Arabic, and French versions of a document. Note the use of the charset attribute for the Arabic manual. Note also the use of the lang attribute to indicate that the value of the title attribute for the LINK element designating the French manual is in French.
<HEAD> <TITLE>The manual in English</TITLE> <LINK title="The manual in Dutch" type="text/html" rel="alternate" hreflang="nl" href="http://someplace.com/manual/dutch.html"> <LINK title="The manual in Portuguese" type="text/html" rel="alternate" hreflang="pt" href="http://someplace.com/manual/portuguese.html"> <LINK title="The manual in Arabic" type="text/html" rel="alternate" charset="ISO-8859-6" hreflang="ar" href="http://someplace.com/manual/arabic.html"> <LINK lang="fr" title="La documentation en Français" type="text/html" rel="alternate" hreflang="fr" href="http://someplace.com/manual/french.html"> </HEAD>
In the following example, we tell search engines where to find the printed version of a manual.
<HEAD> <TITLE>Reference manual</TITLE> <LINK media="print" title="The manual in postscript" type="application/postscript" rel="alternate" href="http://someplace.com/manual/postscript.ps"> </HEAD>
In the following example, we tell search engines where to find the front page of a collection of documents.
<HEAD> <TITLE>Reference manual -- Page 5</TITLE> <LINK rel="Start" title="The first page of the manual" type="text/html" href="http://someplace.com/manual/start.html"> </HEAD>
Further information is given in the notes in the appendix on helping search engines index your Web site.
<!ELEMENT BASE - O EMPTY -- document base URI --> <!ATTLIST BASE href %URI; #REQUIRED -- URI that acts as base URI -- >
Start tag: required, End tag: forbidden
Attribute definitions
Attributes defined elsewhere
In HTML, links and references to external images, applets, form-processing programs, style sheets, etc. are always specified by a URI. Relative URIs are resolved according to a base URI, which may come from a variety of sources. The BASE element allows authors to specify a document's base URI explicitly.
When present, the BASE element must appear in the HEAD section of an HTML document, before any element that refers to an external source. The path information specified by the BASE element only affects URIs in the document where the element appears.
For example, given the following BASE declaration and A declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <TITLE>Our Products</TITLE> <BASE href="http://www.aviary.com/products/intro.html"> </HEAD> <BODY> <P>Have you seen our <A href="../cages/birds.gif">Bird Cages</A>? </BODY> </HTML>
the relative URI "../cages/birds.gif" would resolve to:
http://www.aviary.com/cages/birds.gif
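The resolution above can be reproduced with Python's standard urllib.parse module, shown here only as a convenient way to check the result. Note that urljoin follows later URI specifications rather than [RFC1808] itself, so agreement in other edge cases is an assumption rather than a guarantee.

```python
from urllib.parse import urljoin

base = "http://www.aviary.com/products/intro.html"  # value of the BASE href
relative = "../cages/birds.gif"                     # value of the A href

# ".." removes the "products/" segment of the base path before "cages/" is added.
print(urljoin(base, relative))  # http://www.aviary.com/cages/birds.gif
```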
User agents must calculate the base URI for resolving relative URIs according to [RFC1808], section 3. The following describes how [RFC1808] applies specifically to HTML.
User agents must calculate the base URI according to the following precedences (highest priority to lowest):
Additionally, the OBJECT and APPLET elements define attributes that take precedence over the value set by the BASE element. Please consult the definitions of these elements for more information about URI issues specific to them.
Note. For versions of HTTP that define a Link header, user agents should handle these headers exactly as LINK elements in the document. HTTP 1.1 as defined by [RFC2616] does not include a Link header field (refer to section 19.6.3).
HTML's multimedia features allow authors to include images, applets (programs that are automatically downloaded and run on the user's machine), video clips, and other HTML documents in their pages.
For example, to include a PNG image in a document, authors may write:
<BODY> <P>Here's a closeup of the Grand Canyon: <OBJECT data="canyon.png" type="image/png"> This is a <EM>closeup</EM> of the Grand Canyon. </OBJECT> </BODY>
Previous versions of HTML allowed authors to include images (via IMG) and applets (via APPLET). These elements have several limitations:
To address these issues, HTML 4 introduces the OBJECT element, which offers an all-purpose solution to generic object inclusion. The OBJECT element allows HTML authors to specify everything required by an object for its presentation by a user agent: source code, initial values, and run-time data. In this specification, the term "object" is used to describe the things that people want to place in HTML documents; other commonly used terms for these things are: applets, plug-ins, media handlers, etc.
The new OBJECT element thus subsumes some of the tasks carried out by existing elements. Consider the following chart of functionalities:
Type of inclusion | Specific element | Generic element |
---|---|---|
Image | IMG | OBJECT |
Applet | APPLET (Deprecated.) | OBJECT |
Another HTML document | IFRAME | OBJECT |
The chart indicates that each type of inclusion has a specific and a general solution. The generic OBJECT element will serve as the solution for implementing future media types.
To include images, authors may use the OBJECT element or the IMG element.
To include applets, authors should use the OBJECT element as the APPLET element is deprecated.
To include one HTML document in another, authors may use either the new IFRAME element or the OBJECT element. In both cases, the embedded document remains independent of the main document. Visual user agents may present the embedded document in a distinct window within the main document. Please consult the notes on embedded documents for a comparison of OBJECT and IFRAME for document inclusion.
Images and other included objects may have hyperlinks associated with them, both through the standard linking mechanisms and via image maps. An image map specifies active geometric regions of an included object and assigns a link to each region. When activated, these links may cause a document to be retrieved, may run a program on the server, etc.
In the following sections, we discuss the various mechanisms available to authors for multimedia inclusions and creating image maps for those inclusions.
<!-- To avoid problems with text-only UAs as well as to make image content understandable and navigable to users of non-visual UAs, you need to provide a description with ALT, and avoid server-side image maps --> <!ELEMENT IMG - O EMPTY -- Embedded image --> <!ATTLIST IMG %attrs; -- %coreattrs, %i18n, %events -- src %URI; #REQUIRED -- URI of image to embed -- alt %Text; #REQUIRED -- short description -- longdesc %URI; #IMPLIED -- link to long description (complements alt) -- name CDATA #IMPLIED -- name of image for scripting -- height %Length; #IMPLIED -- override height -- width %Length; #IMPLIED -- override width -- usemap %URI; #IMPLIED -- use client-side image map -- ismap (ismap) #IMPLIED -- use server-side image map -- >
Start tag: required, End tag: forbidden
Attribute definitions
Attributes defined elsewhere
The IMG element embeds an image in the current document at the location of the element's definition. The IMG element has no content; it is usually replaced inline by the image designated by the src attribute, the exception being for left or right-aligned images that are "floated" out of line.
In an earlier example, we defined a link to a family photo. Here, we insert the photo directly into the current document:
<BODY> <P>I just returned from vacation! Here's a photo of my family at the lake: <IMG src="http://www.somecompany.com/People/Ian/vacation/family.png" alt="A photo of my family at the lake."> </BODY>
This inclusion may also be achieved with the OBJECT element as follows:
<BODY> <P>I just returned from vacation! Here's a photo of my family at the lake: <OBJECT data="http://www.somecompany.com/People/Ian/vacation/family.png" type="image/png"> A photo of my family at the lake. </OBJECT> </BODY>
The alt attribute specifies alternate text that is rendered when the image cannot be displayed (see below for information on how to specify alternate text). User agents must render alternate text when they cannot support images, when they cannot support a certain image type, or when they are configured not to display images.
The following example shows how the longdesc attribute can be used to link to a richer description:
<BODY> <P> <IMG src="sitemap.gif" alt="HP Labs Site Map" longdesc="sitemap.html"> </BODY>
The alt attribute provides a short description of the image. This should be sufficient to allow users to decide whether they want to follow the link given by the longdesc attribute to the longer description, here "sitemap.html".
Please consult the section on the visual presentation of objects, images, and applets for information about image size, alignment, and borders.
<!ELEMENT OBJECT - - (PARAM | %flow;)* -- generic embedded object --> <!ATTLIST OBJECT %attrs; -- %coreattrs, %i18n, %events -- declare (declare) #IMPLIED -- declare but don't instantiate flag -- classid %URI; #IMPLIED -- identifies an implementation -- codebase %URI; #IMPLIED -- base URI for classid, data, archive-- data %URI; #IMPLIED -- reference to object's data -- type %ContentType; #IMPLIED -- content type for data -- codetype %ContentType; #IMPLIED -- content type for code -- archive CDATA #IMPLIED -- space-separated list of URIs -- standby %Text; #IMPLIED -- message to show while loading -- height %Length; #IMPLIED -- override height -- width %Length; #IMPLIED -- override width -- usemap %URI; #IMPLIED -- use client-side image map -- name CDATA #IMPLIED -- submit as part of form -- tabindex NUMBER #IMPLIED -- position in tabbing order -- >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
Most user agents have built-in mechanisms for rendering common data types such as text, GIF images, colors, fonts, and a handful of graphic elements. To render data types they don't support natively, user agents generally run external applications. The OBJECT element allows authors to control whether data should be rendered externally or by some program, specified by the author, that renders the data within the user agent.
In the most general case, an author may need to specify three types of information:
The OBJECT element allows authors to specify all three types of data, but authors may not have to specify all three at once. For example, some objects may not require data (e.g., a self-contained applet that performs a small animation). Others may not require run-time initialization. Still others may not require additional implementation information, i.e., the user agent itself may already know how to render that type of data (e.g., GIF images).
Authors specify an object's implementation and the location of the data to be rendered via the OBJECT element. To specify run-time values, however, authors use the PARAM element, which is discussed in the section on object initialization.
The OBJECT element may also appear in the content of the HEAD element. Since user agents generally do not render elements in the HEAD, authors should ensure that any OBJECT elements in the HEAD do not specify content that may be rendered. Please consult the section on sharing frame data for an example of including the OBJECT element in the HEAD element.
Please consult the section on form controls for information about OBJECT elements in forms.
This document does not specify the behavior of OBJECT elements that use both the classid attribute to identify an implementation and the data attribute to specify data for that implementation. In order to ensure portability, authors should use the PARAM element to tell implementations where to retrieve additional data.
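As a rough sketch of this recommendation, the declaration below names an implementation with classid and passes the location of its data through a PARAM element rather than through the data attribute. The implementation URI, the parameter name "src", and the data file are hypothetical; an actual implementation defines which parameter names it understands.

```html
<P><OBJECT classid="http://www.example.com/viewers/chartviewer">
  <!-- valuetype="ref" marks the value as a URI that designates a resource -->
  <PARAM name="src" value="./data/sales.dat" valuetype="ref">
  A chart of quarterly sales.
</OBJECT>
```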
A user agent must interpret an OBJECT element according to the following precedence rules:
Authors should not include content in OBJECT elements that appear in the HEAD element.
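A minimal sketch of an OBJECT in the HEAD, assuming a hypothetical identifier and data file. The element has no content, so there is nothing for the user agent to render.

```html
<HEAD>
<TITLE>A document with an object in the HEAD</TITLE>
<!-- The identifier "shareddata" and the file "data.bar" are illustrative -->
<OBJECT id="shareddata" data="data.bar"></OBJECT>
</HEAD>
```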
In the following example, we insert an analog clock applet in a document via the OBJECT element. The applet, written in the Python language, requires no additional data or run-time values. The classid attribute specifies the location of the applet:
<P><OBJECT classid="http://www.miamachina.it/analogclock.py"> </OBJECT>
Note that the clock will be rendered as soon as the user agent interprets this OBJECT declaration. It is possible to delay rendering of an object by first declaring the object (described below).
Authors should complete this declaration by including alternate text as the contents of OBJECT in case the user agent cannot render the clock.
<P><OBJECT classid="http://www.miamachina.it/analogclock.py"> An animated clock. </OBJECT>
One significant consequence of the OBJECT element's design is that it offers a mechanism for specifying alternate object renderings; each embedded OBJECT declaration may specify alternate content types. If a user agent cannot render the outermost OBJECT, it tries to render the contents, which may be another OBJECT element, etc.
In the following example, we embed several OBJECT declarations to illustrate how alternate renderings work. A user agent will attempt to render the first OBJECT element it can, in the following order: (1) an Earth applet written in the Python language, (2) an MPEG animation of the Earth, (3) a GIF image of the Earth, (4) alternate text.
<P> <!-- First, try the Python applet --> <OBJECT title="The Earth as seen from space" classid="http://www.observer.mars/TheEarth.py"> <!-- Else, try the MPEG video --> <OBJECT data="TheEarth.mpeg" type="video/mpeg"> <!-- Else, try the GIF image --> <OBJECT data="TheEarth.gif" type="image/gif"> <!-- Else render the text --> The <STRONG>Earth</STRONG> as seen from space. </OBJECT> </OBJECT> </OBJECT>
The outermost declaration specifies an applet that requires no data or initial values. The second declaration specifies an MPEG animation and, since it does not define the location of an implementation to handle MPEG, relies on the user agent to handle the animation. We also set the type attribute so that a user agent that knows it cannot render MPEG will not bother to retrieve "TheEarth.mpeg" from the network. The third declaration specifies the location of a GIF file and furnishes alternate text in case all other mechanisms fail.
Inline vs. external data. Data to be rendered may be supplied in two ways: inline and from an external resource. While the former method will generally lead to faster rendering, it is not convenient when rendering large quantities of data.
Here's an example that illustrates how inline data may be fed to an OBJECT:
<P> <OBJECT id="clock1" classid="clsid:663C8FEF-1EF9-11CF-A3DB-080036F12502" data="data:application/x-oleobject;base64, ...base64 data..."> A clock. </OBJECT>
Please consult the section on the visual presentation of objects, images, and applets for information about object size, alignment, and borders.
<!ELEMENT PARAM - O EMPTY -- named property value --> <!ATTLIST PARAM id ID #IMPLIED -- document-wide unique id -- name CDATA #REQUIRED -- property name -- value CDATA #IMPLIED -- property value -- valuetype (DATA|REF|OBJECT) DATA -- How to interpret value -- type %ContentType; #IMPLIED -- content type for value when valuetype=ref -- >
Start tag: required, End tag: forbidden
Attribute definitions
Attributes defined elsewhere
PARAM elements specify a set of values that may be required by an object at run-time. Any number of PARAM elements may appear in the content of an OBJECT or APPLET element, in any order, but must be placed at the start of the content of the enclosing OBJECT or APPLET element.
The syntax of names and values is assumed to be understood by the object's implementation. This document does not specify how user agents should retrieve name/value pairs nor how they should interpret parameter names that appear twice.
We return to the clock example to illustrate the use of PARAM: suppose that the applet is able to handle two run-time parameters that define its initial height and width. We can set the initial dimensions to 40x40 pixels with two PARAM elements.
<P><OBJECT classid="http://www.miamachina.it/analogclock.py"> <PARAM name="height" value="40" valuetype="data"> <PARAM name="width" value="40" valuetype="data"> This user agent cannot render Python applications. </OBJECT>
In the following example, run-time data for the object's "Init_values" parameter is specified as an external resource (a GIF file). The value of the valuetype attribute is thus set to "ref" and the value is a URI designating the resource.
<P><OBJECT classid="http://www.gifstuff.com/gifappli" standby="Loading Elvis..."> <PARAM name="Init_values" value="./images/elvis.gif" valuetype="ref"> </OBJECT>
Note that we have also set the standby attribute so that the user agent may display a message while the rendering mechanism loads.
When an OBJECT element is rendered, user agents must search the content for only those PARAM elements that are direct children and "feed" them to the OBJECT.
Thus, in the following example, if "obj1" is rendered, "param1" applies to "obj1" (and not "obj2"). If "obj1" is not rendered and "obj2" is, "param1" is ignored, and "param2" applies to "obj2". If neither OBJECT is rendered, neither PARAM applies.
<P> <OBJECT id="obj1"> <PARAM name="param1"> <OBJECT id="obj2"> <PARAM name="param2"> </OBJECT> </OBJECT>
The location of an object's implementation is given by a URI. As we discussed in the introduction to URIs, the first segment of an absolute URI specifies the naming scheme used to transfer the data designated by the URI. For HTML documents, this scheme is frequently "http". Some applets might employ other naming schemes. For instance, when specifying a Java applet, authors may use URIs that begin with "java" and for ActiveX applets, authors may use "clsid".
In the following example, we insert a Java applet into an HTML document.
<P><OBJECT classid="java:program.start"> </OBJECT>
When the author sets the codetype attribute, a user agent can decide whether to retrieve the Java application based on its ability to handle that content type.
<OBJECT codetype="application/java-archive" classid="java:program.start"> </OBJECT>
Some rendering schemes require additional information to identify their implementation and must be told where to find that information. Authors may give path information to the object's implementation via the codebase attribute.
<OBJECT codetype="application/java-archive" classid="java:program.start" codebase="http://foooo.bar.com/java/myimplementation/"> </OBJECT>
The following example specifies (with the classid attribute) an ActiveX object via a URI that begins with the naming scheme "clsid". The data attribute locates the data to render (another clock).
<P><OBJECT classid="clsid:663C8FEF-1EF9-11CF-A3DB-080036F12502" data="http://www.acme.com/ole/clock.stm"> This application is not supported. </OBJECT>
To declare an object so that it is not executed when read by the user agent, set the boolean declare attribute in the OBJECT element. At the same time, authors must identify the declaration by setting the id attribute in the OBJECT element to a unique value. Later instantiations of the object will refer to this identifier.
A declared OBJECT must appear in a document before the first instance of that OBJECT.
An object defined with the declare attribute is instantiated every time an element that refers to that object requires it to be rendered (e.g., a link that refers to it is activated, an object that refers to it is activated, etc.).
In the following example, we declare an OBJECT and cause it to be instantiated by referring to it from a link. Thus, the object can be activated by clicking on some highlighted text, for example.
<P><OBJECT declare id="earth.declaration" data="TheEarth.mpeg" type="video/mpeg"> The <STRONG>Earth</STRONG> as seen from space. </OBJECT> ...later in the document... <P>A neat <A href="#earth.declaration"> animation of The Earth!</A>
The following example illustrates how to specify run-time values that are other objects. In this example, we send text (a poem, in fact) to a hypothetical mechanism for viewing poems. The object recognizes a run-time parameter named "font" (say, for rendering the poem text in a certain font). The value for this parameter is itself an object that inserts (but does not render) the font object. The relationship between the font object and the poem viewer object is achieved by (1) assigning the id "tribune" to the font object declaration and (2) referring to it from the PARAM element of the poem viewer object (with valuetype and value).
<P><OBJECT declare id="tribune" type="application/x-webfont" data="tribune.gif"> </OBJECT> ...view the poem in KublaKhan.txt here... <P><OBJECT classid="http://foo.bar.com/poem_viewer" data="KublaKhan.txt"> <PARAM name="font" valuetype="object" value="#tribune"> <P>You're missing a really cool poem viewer ... </OBJECT>
User agents that don't support the declare attribute must render the contents of the OBJECT declaration.
See the Transitional DTD for the formal definition.
Attribute definitions
When the applet is "deserialized" the start() method is invoked but not the init() method. Attributes valid when the original object was serialized are not restored. Any attributes passed to this APPLET instance will be available to the applet. Authors should use this feature with extreme caution. An applet should be stopped before it is serialized.
Either code or object must be present. If both code and object are given, it is an error if they provide different class names.
Attributes defined elsewhere
This element, supported by all Java-enabled browsers, allows designers to embed a Java applet in an HTML document. It has been deprecated in favor of the OBJECT element.
The content of the APPLET acts as alternate information for user agents that don't support this element or are currently configured not to support applets. User agents must ignore the content otherwise.
DEPRECATED EXAMPLE:
In the following example, the APPLET element includes a Java applet in the document. Since no codebase is supplied, the applet is assumed to be in the same directory as the current document.
<APPLET code="Bubbles.class" width="500" height="500"> Java applet that draws animated bubbles. </APPLET>
This example may be rewritten with OBJECT as follows:
<P><OBJECT codetype="application/java" classid="java:Bubbles.class" width="500" height="500"> Java applet that draws animated bubbles. </OBJECT>
Initial values may be supplied to the applet via the PARAM element.
DEPRECATED EXAMPLE:
The following sample Java applet:
<APPLET code="AudioItem" width="15" height="15"> <PARAM name="snd" value="Hello.au|Welcome.au"> Java applet that plays a welcoming sound. </APPLET>
may be rewritten as follows with OBJECT:
<OBJECT codetype="application/java" classid="java:AudioItem" width="15" height="15"> <PARAM name="snd" value="Hello.au|Welcome.au"> Java applet that plays a welcoming sound. </OBJECT>
An embedded document is entirely independent of the document in which it is embedded. For instance, relative URIs within the embedded document resolve according to the base URI of the embedded document, not that of the main document. An embedded document is only rendered within another document (e.g., in a subwindow); it remains otherwise independent.
For instance, the following line embeds the contents of embed_me.html at the location where the OBJECT definition occurs.
...text before... <OBJECT data="embed_me.html"> Warning: embed_me.html could not be embedded. </OBJECT> ...text after...
Recall that the contents of OBJECT must only be rendered if the file specified by the data attribute cannot be loaded.
The behavior of a user agent in cases where a file includes itself is not defined.
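For comparison, here is a sketch of the same inclusion written with the IFRAME element, which is defined in the Transitional DTD. The width and height values are arbitrary. Note that the content of IFRAME is intended for user agents that do not support frames or are configured not to display them, so the fallback text is worded accordingly.

```html
...text before...
<IFRAME src="embed_me.html" width="400" height="300">
  Your user agent does not display inline frames.
  <A href="embed_me.html">Follow this link to the document instead.</A>
</IFRAME>
...text after...
```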
An image map is created by associating an object with a specification of sensitive geometric areas on the object.
There are two types of image maps:
Client-side image maps are preferred over server-side image maps for at least two reasons: they are accessible to people browsing with non-graphical user agents and they offer immediate feedback as to whether or not the pointer is over an active region.
<!ELEMENT MAP - - ((%block;) | AREA)+ -- client-side image map --> <!ATTLIST MAP %attrs; -- %coreattrs, %i18n, %events -- name CDATA #REQUIRED -- for reference by usemap -- >
Start tag: required, End tag: required
<!ELEMENT AREA - O EMPTY -- client-side image map area --> <!ATTLIST AREA %attrs; -- %coreattrs, %i18n, %events -- shape %Shape; rect -- controls interpretation of coords -- coords %Coords; #IMPLIED -- comma-separated list of lengths -- href %URI; #IMPLIED -- URI for linked resource -- nohref (nohref) #IMPLIED -- this region has no action -- alt %Text; #REQUIRED -- short description -- tabindex NUMBER #IMPLIED -- position in tabbing order -- accesskey %Character; #IMPLIED -- accessibility key character -- onfocus %Script; #IMPLIED -- the element got the focus -- onblur %Script; #IMPLIED -- the element lost the focus -- >
Start tag: required, End tag: forbidden
MAP attribute definitions
AREA attribute definitions
Coordinates are relative to the top, left corner of the object. All values are lengths. All values are separated by commas.
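To make the coordinate conventions concrete, here is a small sketch using the three shapes that appear in this section's examples. The map name, target files, and pixel values are made up for illustration; the comments give the order of the values for each shape.

```html
<MAP name="shapes">
  <!-- rect: left-x, top-y, right-x, bottom-y -->
  <AREA href="rect.html" alt="Rectangle" shape="rect" coords="0,0,100,50">
  <!-- circle: center-x, center-y, radius -->
  <AREA href="circle.html" alt="Circle" shape="circle" coords="150,25,25">
  <!-- poly: x1, y1, x2, y2, ..., xN, yN -->
  <AREA href="poly.html" alt="Triangle" shape="poly" coords="200,0,250,50,200,50">
</MAP>
```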
Attribute to associate an image map with an element
Attributes defined elsewhere
The MAP element specifies a client-side image map (or other navigation mechanism) that may be associated with other elements (IMG, OBJECT, or INPUT). An image map is associated with an element via the element's usemap attribute. The MAP element may be used without an associated image for general navigation mechanisms.
The presence of the usemap attribute for an OBJECT implies that the object being included is an image. Furthermore, when the OBJECT element has an associated client-side image map, user agents may implement user interaction with the OBJECT solely in terms of the client-side image map. This allows user agents (such as an audio browser or robot) to interact with the OBJECT without having to process it; the user agent may even elect not to retrieve (or process) the object. When an OBJECT has an associated image map, authors should not expect that the object will be retrieved or processed by every user agent.
The MAP element content model allows authors to combine the following:
When a MAP element contains mixed content (both AREA elements and block-level content), user agents must ignore the AREA elements.
Authors should specify an image map's geometry completely with AREA elements, or completely with A elements, or completely with both if content is mixed. Authors may wish to mix content so that older user agents will handle map geometries specified by AREA elements and new user agents will take advantage of richer block content.
If two or more defined regions overlap, the region-defining element that appears earliest in the document takes precedence (i.e., responds to user input).
User agents and authors should offer textual alternates to graphical image maps for cases when graphics are not available or the user cannot access them. For example, user agents may use alt text to create textual links in place of a graphical image map. Such links may be activated in a variety of ways (keyboard, voice activation, etc.).
Note. MAP is not backwards compatible with HTML 2.0 user agents.
In the following example, we create a client-side image map for the OBJECT element. We do not want to render the image map's contents when the OBJECT is rendered, so we "hide" the MAP element within the OBJECT element's content. Consequently, the MAP element's contents will only be rendered if the OBJECT cannot be rendered.
<HTML> <HEAD> <TITLE>The cool site!</TITLE> </HEAD> <BODY> <P><OBJECT data="navbar1.gif" type="image/gif" usemap="#map1"> <MAP name="map1"> <P>Navigate the site: <A href="guide.html" shape="rect" coords="0,0,118,28">Access Guide</a> | <A href="shortcut.html" shape="rect" coords="118,0,184,28">Go</A> | <A href="search.html" shape="circle" coords="184,200,60">Search</A> | <A href="top10.html" shape="poly" coords="276,0,276,28,100,200,50,50,276,0">Top Ten</A> </MAP> </OBJECT> </BODY> </HTML>
We may want to render the image map's contents even when a user agent can render the OBJECT. For instance, we may want to associate an image map with an OBJECT element and include a text navigation bar at the bottom of the page. To do so, we define the MAP element outside the OBJECT:
<HTML> <HEAD> <TITLE>The cool site!</TITLE> </HEAD> <BODY> <P><OBJECT data="navbar1.gif" type="image/gif" usemap="#map1"> </OBJECT> ...the rest of the page here... <MAP name="map1"> <P>Navigate the site: <A href="guide.html" shape="rect" coords="0,0,118,28">Access Guide</a> | <A href="shortcut.html" shape="rect" coords="118,0,184,28">Go</A> | <A href="search.html" shape="circle" coords="184,200,60">Search</A> | <A href="top10.html" shape="poly" coords="276,0,276,28,100,200,50,50,276,0">Top Ten</A> </MAP> </BODY> </HTML>
In the following example, we create a similar image map, this time using the AREA element. Note the use of alt text:
<P><OBJECT data="navbar1.gif" type="image/gif" usemap="#map1"> <P>This is a navigation bar. </OBJECT> <MAP name="map1"> <AREA href="guide.html" alt="Access Guide" shape="rect" coords="0,0,118,28"> <AREA href="search.html" alt="Search" shape="rect" coords="184,0,276,28"> <AREA href="shortcut.html" alt="Go" shape="circle" coords="184,200,60"> <AREA href="top10.html" alt="Top Ten" shape="poly" coords="276,0,276,28,100,200,50,50,276,0"> </MAP>
Here is a similar version using the IMG element instead of OBJECT (with the same MAP declaration):
<P><IMG src="navbar1.gif" usemap="#map1" alt="navigation bar">
The following example illustrates how image maps may be shared.
Nested OBJECT elements are useful for providing fallbacks in case a user agent doesn't support certain formats. For example:
<P> <OBJECT data="navbar.png" type="image/png"> <OBJECT data="navbar.gif" type="image/gif"> text describing the image... </OBJECT> </OBJECT>
If the user agent doesn't support the PNG format, it tries to render the GIF image. If it doesn't support GIF (e.g., it's a speech-based user agent), it defaults to the text description provided as the content of the inner OBJECT element. When OBJECT elements are nested this way, authors may share image maps among them:
<P> <OBJECT data="navbar.png" type="image/png" usemap="#map1"> <OBJECT data="navbar.gif" type="image/gif" usemap="#map1"> <MAP name="map1"> <P>Navigate the site: <A href="guide.html" shape="rect" coords="0,0,118,28">Access Guide</a> | <A href="shortcut.html" shape="rect" coords="118,0,184,28">Go</A> | <A href="search.html" shape="circle" coords="184,200,60">Search</A> | <A href="top10.html" shape="poly" coords="276,0,276,28,100,200,50,50,276,0">Top Ten</A> </MAP> </OBJECT> </OBJECT>
The following example illustrates how anchors may be specified to create inactive zones within an image map. The first anchor specifies a small circular region with no associated link. The second anchor specifies a larger circular region with the same center coordinates. Combined, the two form a ring whose center is inactive and whose rim is active. The order of the anchor definitions is important, since the smaller circle must override the larger circle.
<MAP name="map1"> <P> <A shape="circle" coords="100,200,50">I'm inactive.</A> <A href="outer-ring-link.html" shape="circle" coords="100,200,250">I'm active.</A> </MAP>
Similarly, the nohref attribute for the AREA element declares that a geometric region has no associated link.
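A sketch of the same ring written with AREA elements, assuming the map name "map2" (chosen here only to avoid clashing with the earlier example). The inner circle carries nohref and comes first, so it takes precedence where the two regions overlap.

```html
<MAP name="map2">
  <!-- Inner circle: listed first, no link -->
  <AREA nohref alt="Inactive center" shape="circle" coords="100,200,50">
  <!-- Outer circle: same center, larger radius, active -->
  <AREA href="outer-ring-link.html" alt="Outer ring" shape="circle" coords="100,200,250">
</MAP>
```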
Server-side image maps may be interesting in cases where the image map is too complicated for a client-side image map.
It is only possible to define a server-side image map for the IMG and INPUT elements. In the case of IMG, the IMG must be inside an A element and the boolean attribute ismap ([CI]) must be set. In the case of INPUT, the INPUT must be of type "image".
When the user activates the link by clicking on the image, the screen coordinates are sent directly to the server where the document resides. Screen coordinates are expressed as screen pixel values relative to the image. For normative information about the definition of a pixel and how to scale it, please consult [CSS1].
In the following example, the active region defines a server-side link. Thus, a click anywhere on the image will cause the click's coordinates to be sent to the server.
<P><A href="http://www.acme.com/cgi-bin/competition"> <IMG src="game.gif" ismap alt="target"></A>
The location clicked is passed to the server as follows. The user agent derives a new URI from the URI specified by the href attribute of the A element, by appending `?' followed by the x and y coordinates, separated by a comma. The link is then followed using the new URI. For instance, in the given example, if the user clicks at the location x=10, y=27 then the derived URI is "http://www.acme.com/cgi-bin/competition?10,27".
User agents that do not offer the user a means to select specific coordinates (e.g., non-graphical user agents that rely on keyboard input, speech-based user agents, etc.) should send the coordinates "0,0" to the server when the link is activated.
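The INPUT case mentioned earlier can be sketched as follows. The form action reuses the URI from the example above, and the control name "point" is an assumption. When the user clicks the image, the form is submitted and the click coordinates are included in the form data as point.x and point.y.

```html
<FORM action="http://www.acme.com/cgi-bin/competition" method="get">
  <P><INPUT type="image" src="game.gif" name="point" alt="target">
</FORM>
```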
Attribute definitions
When specified, the width and height attributes tell user agents to override the natural image or object size in favor of these values.
When the object is an image, it is scaled. User agents should do their best to scale an object or image to match the width and height specified by the author. Note that lengths expressed as percentages are based on the horizontal or vertical space currently available, not on the natural size of the image, object, or applet.
The height and width attributes give user agents an idea of the size of an image or object so that they may reserve space for it and continue rendering the document while waiting for the image data.
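A brief sketch of both kinds of length, with arbitrary values. The first image is given fixed pixel dimensions, so a user agent can reserve a 320 by 240 pixel box before the image data arrives. The second asks for half of the available horizontal space, whatever the natural width of the image.

```html
<P><IMG src="family.png" alt="A photo of my family at the lake."
        width="320" height="240">
<P><IMG src="navbar1.gif" alt="navigation bar" width="50%">
```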
Attribute definitions
An image or object may be surrounded by a border (e.g., when a border is specified by the user or when the image is the content of an A element).
Attribute definitions
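As an illustration, the deprecated border attribute of the Transitional DTD takes a width in pixels. Setting it to 0 asks the user agent not to draw a border around a linked image. The file names below are placeholders.

```html
<P><A href="guide.html"><IMG src="guide.gif" alt="Access Guide" border="0"></A>
```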
Attribute definitions
The following values for align concern the object's position with respect to surrounding text:
Two other values, left and right, cause the image to float to the current left or right margin. They are discussed in the section on floating objects.
Differing interpretations of align. User agents vary in their interpretation of the align attribute. Some only take into account what has occurred on the text line prior to the element, some take into account the text on both sides of the element.
Attribute definitions
Several non-textual elements (IMG, AREA, APPLET, and INPUT) let authors specify alternate text to serve as content when the element cannot be rendered normally. Specifying alternate text assists users without graphic display terminals, users whose browsers don't support forms, visually impaired users, those who use speech synthesizers, those who have configured their graphical user agents not to display images, etc.
The alt attribute must be specified for the IMG and AREA elements. It is optional for the INPUT and APPLET elements.
While alternate text may be very helpful, it must be handled with care. Authors should observe the following guidelines:
Implementors should consult the section on accessibility for information about how to handle cases of omitted alternate text.
Style sheets represent a major breakthrough for Web page designers, expanding their ability to improve the appearance of their pages. In the scientific environments in which the Web was conceived, people are more concerned with the content of their documents than the presentation. As people from wider walks of life discovered the Web, the limitations of HTML became a source of continuing frustration and authors were forced to sidestep HTML's stylistic limitations. While the intentions have been good -- to improve the presentation of Web pages -- the techniques for doing so have had unfortunate side effects. These techniques work for some of the people, some of the time, but not for all of the people, all of the time. They include:
These techniques considerably increase the complexity of Web pages, offer limited flexibility, suffer from interoperability problems, and create hardships for people with disabilities.
Style sheets solve these problems at the same time they supersede the limited range of presentation mechanisms in HTML. Style sheets make it easy to specify the amount of white space between text lines, the amount lines are indented, the colors used for the text and the backgrounds, the font size and style, and a host of other details.
For example, the following short CSS style sheet (stored in the file "special.css"), sets the text color of a paragraph to green and surrounds it with a solid red border:
P.special { color : green; border: solid red; }
Authors may link this style sheet to their source HTML document with the LINK element:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <LINK href="special.css" rel="stylesheet" type="text/css"> </HEAD> <BODY> <P class="special">This paragraph should have special green text. </BODY> </HTML>
HTML 4 provides support for the following style sheet features:
Style sheets, by contrast, apply to specific media or media groups. A style sheet intended for screen use may be applicable when printing, but is of little use for speech-based browsers. This specification allows you to define the broad categories of media a given style sheet is applicable to. This allows user agents to avoid retrieving inappropriate style sheets. Style sheet languages may include features for describing media dependencies within the same style sheet.
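A sketch of how media dependencies might be declared with the media attribute of LINK. The style sheet file names are hypothetical. A user agent rendering to a screen has no reason to retrieve the print or aural style sheets.

```html
<HEAD>
<TITLE>Media-specific style sheets</TITLE>
<LINK rel="stylesheet" type="text/css" media="screen" href="screen.css">
<LINK rel="stylesheet" type="text/css" media="print" href="print.css">
<LINK rel="stylesheet" type="text/css" media="aural" href="speech.css">
</HEAD>
```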
The current proposal addresses these issues by allowing authors to include rendering instructions within each HTML element. The rendering information is then always available by the time the user agent wants to render each element.
In many cases, authors will take advantage of a common style sheet for a group of documents. In this case, distributing style rules throughout the document will actually lead to worse performance than using a linked style sheet, since for most documents, the style sheet will already be present in the local cache. The public availability of good style sheets will encourage this effect.
Note. The sample default style sheet for HTML 4 that is included in [CSS2] expresses generally accepted default style information for each element. Authors and implementors alike might find this a useful resource.
HTML documents may contain style sheet rules directly in them or they may import style sheets.
Any style sheet language may be used with HTML. A simple style sheet language may suffice for the needs of most users, but other languages may be more suited to highly specialized needs. This specification uses the style language "Cascading Style Sheets" ([CSS1]), abbreviated CSS, for examples.
The syntax of style data depends on the style sheet language.
Authors must specify the style sheet language of style information associated with an HTML document.
Authors should use the META element to set the default style sheet language for a document. For example, to set the default to CSS, authors should put the following declaration in the HEAD of their documents:
<META http-equiv="Content-Style-Type" content="text/css">
The default style sheet language may also be set with HTTP headers. The above META declaration is equivalent to the HTTP header:
Content-Style-Type: text/css
User agents should determine the default style sheet language for a document according to the following steps (highest to lowest priority):
Documents that include elements that set the style attribute but which don't define a default style sheet language are incorrect. Authoring tools should generate default style sheet language information (typically a META declaration) so that user agents do not have to rely on a default of "text/css".
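As an illustration only, the determination of the default style sheet language can be sketched in Python. The list of steps is not reproduced in this text, so the sketch assumes the following order: the last META declaration of "Content-Style-Type" wins, otherwise the last such HTTP header wins, otherwise the default is "text/css". This order is consistent with the rule that META elements implied by HTTP headers are treated as occurring before explicit META elements. The function name and its parameters are hypothetical.

```python
from typing import Sequence


def default_style_language(
    meta_values: Sequence[str],
    http_header_values: Sequence[str],
) -> str:
    """Return the default style sheet language for a document.

    meta_values: "Content-Style-Type" values from META declarations,
        in document order (assumed already extracted by the caller).
    http_header_values: "Content-Style-Type" values from HTTP headers,
        in the order they were received.
    """
    if meta_values:
        # Assumption: the last META declaration in the stream wins.
        return meta_values[-1].strip().lower()
    if http_header_values:
        # Assumption: otherwise the last HTTP header wins.
        return http_header_values[-1].strip().lower()
    # Fallback mentioned in the paragraph above.
    return "text/css"
```

Under these assumptions, a document that declares nothing falls through to "text/css", which is exactly the reliance on a default that authoring tools are asked to avoid.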
Attribute definitions
The syntax of the value of the style attribute is determined by the default style sheet language. For example, for [CSS2] inline style, use the declaration block syntax described in section 4.1.8 (without curly brace delimiters).
This CSS example sets color and font size information for the text in a specific paragraph.
<P style="font-size: 12pt; color: fuchsia">Aren't style sheets wonderful?
In CSS, property declarations have the form "name : value" and are separated by semicolons.
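The "name : value" form can be illustrated with a small Python sketch that splits the value of a style attribute into individual declarations. It is deliberately naive: it assumes that no value contains a semicolon or colon inside a quoted string or a url(...) token, which a real CSS parser must handle. The function name is hypothetical.

```python
from typing import Dict


def parse_declarations(style: str) -> Dict[str, str]:
    """Split an inline CSS declaration list into a {property: value} mapping."""
    declarations: Dict[str, str] = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty or malformed pieces, e.g. after a trailing ";"
        name, value = chunk.split(":", 1)
        name = name.strip().lower()
        value = value.strip()
        if name and value:
            # A later declaration of the same property replaces an earlier one.
            declarations[name] = value
    return declarations
```

Applied to the style attribute of the paragraph above, the sketch yields one entry for font-size and one for color.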
To specify style information for more than one element, authors should use the STYLE element. For optimal flexibility, authors should define styles in external style sheets.
<!ELEMENT STYLE - - %StyleSheet        -- style info -->
<!ATTLIST STYLE
  %i18n;                               -- lang, dir, for use with title --
  type        %ContentType;  #REQUIRED -- content type of style language --
  media       %MediaDesc;    #IMPLIED  -- designed for use with these media --
  title       %Text;         #IMPLIED  -- advisory title --
  >
Start tag: required, End tag: required
Attribute definitions
Attributes defined elsewhere
The STYLE element allows authors to put style sheet rules in the head of the document. HTML permits any number of STYLE elements in the HEAD section of a document.
User agents that don't support style sheets, or don't support the specific style sheet language used by a STYLE element, must hide the contents of the STYLE element. It is an error to render the content as part of the document's text. Some style sheet languages support syntax for hiding the content from non-conforming user agents.
The syntax of style data depends on the style sheet language.
Some style sheet implementations may allow a wider variety of rules in the STYLE element than in the style attribute. For example, with CSS, rules may be declared within a STYLE element for:
Rules for style rule precedences and inheritance depend on the style sheet language.
The following CSS STYLE declaration puts a border around every H1 element in the document and centers it on the page.
<HEAD>
<STYLE type="text/css">
  H1 {border: 1px solid; text-align: center}
</STYLE>
</HEAD>
To specify that this style information should only apply to H1 elements of a specific class, we modify it as follows:
<HEAD>
<STYLE type="text/css">
  H1.myclass {border: 1px solid; text-align: center}
</STYLE>
</HEAD>
<BODY>
<H1 class="myclass"> This H1 is affected by our style </H1>
<H1> This one is not affected by our style </H1>
</BODY>
Finally, to limit the scope of the style information to a single instance of H1, set the id attribute:
<HEAD>
<STYLE type="text/css">
  #myid {border: 1px solid; text-align: center}
</STYLE>
</HEAD>
<BODY>
<H1 class="myclass"> This H1 is not affected </H1>
<H1 id="myid"> This H1 is affected by style </H1>
<H1> This H1 is not affected </H1>
</BODY>
Although style information may be set for almost every HTML element, two elements, DIV and SPAN, are particularly useful in that they do not impose any presentation semantics (besides block-level vs. inline). When combined with style sheets, these elements allow users to extend HTML indefinitely, particularly when used with the class and id attributes.
In the following example, we use the SPAN element to set the font style of the first few words of a paragraph to small caps.
<HEAD>
<STYLE type="text/css">
  SPAN.sc-ex { font-variant: small-caps }
</STYLE>
</HEAD>
<BODY>
<P><SPAN class="sc-ex">The first</SPAN> few words of this paragraph are in small-caps.
</BODY>
In the following example, we use DIV and the class attribute to set the text justification for a series of paragraphs that make up the abstract section of a scientific article. This style information could be reused for other abstract sections by setting the class attribute elsewhere in the document.
<HEAD>
<STYLE type="text/css">
  DIV.Abstract { text-align: justify }
</STYLE>
</HEAD>
<BODY>
<DIV class="Abstract">
  <P>The Chieftain product range is our market winner for the coming year. This report sets out how to position Chieftain against competing products.
  <P>Chieftain replaces the Commander range, which will remain on the price list until further notice.
</DIV>
</BODY>
HTML allows authors to design documents that take advantage of the characteristics of the media where the document is to be rendered (e.g., graphical displays, television screens, handheld devices, speech-based browsers, braille-based tactile devices, etc.). By specifying the media attribute, authors allow user agents to load and apply style sheets selectively. Please consult the list of recognized media descriptors.
The following sample declarations apply to H1 elements. When projected in a business meeting, all instances will be blue. When printed, all instances will be centered.
<HEAD>
<STYLE type="text/css" media="projection">
  H1 { color: blue}
</STYLE>
<STYLE type="text/css" media="print">
  H1 { text-align: center }
</STYLE>
This example adds sound effects to anchors for use in speech output:
<STYLE type="text/css" media="aural">
  A { cue-before: url(bell.aiff); cue-after: url(dong.wav)}
</STYLE>
</HEAD>
Media control is particularly interesting when applied to external style sheets since user agents can save time by retrieving from the network only those style sheets that apply to the current device. For instance, speech-based browsers can avoid downloading style sheets designed for visual rendering. See the section on media-dependent cascades for more information.
Authors may separate style sheets from HTML documents. This offers several benefits:
HTML allows authors to associate any number of external style sheets with a document. The style sheet language defines how multiple external style sheets interact (for example, the CSS "cascade" rules).
Authors may specify a number of mutually exclusive style sheets called alternate style sheets. Users may select their favorite among these depending on their preferences. For instance, an author may specify one style sheet designed for small screens and another for users with weak vision (e.g., large fonts). User agents should allow users to select from alternate style sheets.
The author may specify that one of the alternates is a preferred style sheet. User agents should apply the author's preferred style sheet unless the user has selected a different alternate.
Authors may group several alternate style sheets (including the author's preferred style sheets) under a single style name. When a user selects a named style, the user agent must apply all style sheets with that name. User agents must not apply alternate style sheets with a different style name. The section on specifying external style sheets explains how to name a group of style sheets.
User agents must respect media descriptors when applying any style sheet.
User agents should also allow users to disable the author's style sheets entirely, in which case the user agent must not apply any persistent or alternate style sheets.
Authors specify external style sheets with the following attributes of the LINK element:
User agents should provide a means for users to view and pick from the list of alternate styles. The value of the title attribute is recommended as the name of each choice.
In this example, we first specify a persistent style sheet located in the file mystyle.css:
<LINK href="mystyle.css" rel="stylesheet" type="text/css">
Setting the title attribute makes this the author's preferred style sheet:
<LINK href="mystyle.css" title="compact" rel="stylesheet" type="text/css">
Adding the keyword "alternate" to the rel attribute makes it an alternate style sheet:
<LINK href="mystyle.css" title="Medium" rel="alternate stylesheet" type="text/css">
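The three LINK declarations above differ only in their rel and title attributes. The following Python sketch shows one way a user agent might classify a LINK element as persistent, preferred, or alternate. The function name is hypothetical, and treating an "alternate stylesheet" link without a title as not usable is an assumption of this sketch, not something stated in the examples.

```python
from typing import Optional


def classify_link(rel: str, title: Optional[str] = None) -> Optional[str]:
    """Classify a LINK element by its rel and title attributes.

    Returns "persistent", "preferred", "alternate", or None when the
    link does not designate a usable style sheet.
    """
    keywords = rel.lower().split()  # link types compared case-insensitively
    if "stylesheet" not in keywords:
        return None
    if "alternate" in keywords:
        # Assumption: an alternate style sheet needs a title so that
        # the user has a name to select it by.
        return "alternate" if title else None
    return "preferred" if title else "persistent"
```

With this classification, the first example is persistent, the second (title "compact") is preferred, and the third (rel "alternate stylesheet", title "Medium") is alternate.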
For more information on external style sheets, please consult the section on links and external style sheets.
Authors may also use the META element to set the document's preferred style sheet. For example, to set the preferred style sheet to "compact" (see the preceding example), authors may include the following line in the HEAD:
<META http-equiv="Default-Style" content="compact">
The preferred style sheet may also be specified with HTTP headers. The above META declaration is equivalent to the HTTP header:
Default-Style: "compact"
If two or more META declarations or HTTP headers specify the preferred style sheet, the last one takes precedence. HTTP headers are considered to occur earlier than the document HEAD for this purpose.
If two or more LINK elements specify a preferred style sheet, the first one takes precedence.
Preferred style sheets specified with META or HTTP headers have precedence over those specified with the LINK element.
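The three precedence rules above can be combined into a single lookup. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes the caller has already extracted the style names (with any surrounding quote marks removed) in the order in which they occur.

```python
from typing import Optional, Sequence


def preferred_style_name(
    http_default_styles: Sequence[str],
    meta_default_styles: Sequence[str],
    link_titles: Sequence[str],
) -> Optional[str]:
    """Return the name of the preferred style, or None if there is none.

    http_default_styles: values of Default-Style HTTP headers, in order.
    meta_default_styles: values of Default-Style META declarations, in order.
    link_titles: titles of LINK elements that specify a preferred
        style sheet (rel="stylesheet" with a title), in document order.
    """
    # HTTP headers are considered to occur earlier than the document HEAD,
    # and among META declarations and HTTP headers the last one wins.
    declared = list(http_default_styles) + list(meta_default_styles)
    if declared:
        # META and HTTP declarations take precedence over LINK elements.
        return declared[-1]
    if link_titles:
        # Among LINK elements, the first one wins.
        return link_titles[0]
    return None
```

For instance, a Default-Style header naming "big print" is overridden by a META declaration naming "compact", and either one overrides any titled LINK element.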
Cascading style sheet languages such as CSS allow style information from several sources to be blended together. However, not all style sheet languages support cascading. To define a cascade, authors specify a sequence of LINK and/or STYLE elements. The style information is cascaded in the order the elements appear in the HEAD.
Note. This specification does not specify how style sheets from different style languages cascade. Authors should avoid mixing style sheet languages.
In the following example, we specify two alternate style sheets named "compact". If the user selects the "compact" style, the user agent must apply both external style sheets, as well as the persistent "common.css" style sheet. If the user selects the "big print" style, only the alternate style sheet "bigprint.css" and the persistent "common.css" will be applied.
<LINK rel="alternate stylesheet" title="compact" href="small-base.css" type="text/css">
<LINK rel="alternate stylesheet" title="compact" href="small-extras.css" type="text/css">
<LINK rel="alternate stylesheet" title="big print" href="bigprint.css" type="text/css">
<LINK rel="stylesheet" href="common.css" type="text/css">
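The selection described for the LINK elements above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the Sheet type, the function name, and the exact (case-sensitive) comparison of style names are assumptions of the sketch.

```python
from typing import List, NamedTuple, Optional, Sequence


class Sheet(NamedTuple):
    href: str
    rel: str
    title: Optional[str] = None


# The four LINK elements from the example above.
EXAMPLE_SHEETS = [
    Sheet("small-base.css", "alternate stylesheet", "compact"),
    Sheet("small-extras.css", "alternate stylesheet", "compact"),
    Sheet("bigprint.css", "alternate stylesheet", "big print"),
    Sheet("common.css", "stylesheet"),
]


def sheets_to_apply(
    sheets: Sequence[Sheet],
    selected_style: Optional[str],
    author_styles_enabled: bool = True,
) -> List[str]:
    """Return the hrefs to apply, in document order, for the selected style name."""
    if not author_styles_enabled:
        return []  # the user has disabled the author's style sheets entirely
    applied = []
    for sheet in sheets:
        keywords = sheet.rel.lower().split()
        if "stylesheet" not in keywords:
            continue
        if sheet.title is None:
            if "alternate" not in keywords:
                applied.append(sheet.href)  # persistent: always applied
        elif sheet.title == selected_style:
            applied.append(sheet.href)  # every sheet carrying the selected name
    return applied
```

Calling sheets_to_apply(EXAMPLE_SHEETS, "compact") returns both "small" style sheets together with common.css, while selecting "big print" returns only bigprint.css and common.css.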
Here is a cascade example that involves both the LINK and STYLE elements.
<LINK rel="stylesheet" href="corporate.css" type="text/css">
<LINK rel="stylesheet" href="techreport.css" type="text/css">
<STYLE type="text/css">
  p.special { color: rgb(230, 100, 180) }
</STYLE>
A cascade may include style sheets applicable to different media. Both LINK and STYLE may be used with the media attribute. The user agent is then responsible for filtering out those style sheets that do not apply to the current medium.
In the following example, we define a cascade where the "corporate" style sheet is provided in several versions: one suited to printing, one for screen use and one for speech-based browsers (useful, say, when reading email in the car). The "techreport" style sheet applies to all media. The color rule defined by the STYLE element is used for print and screen but not for aural rendering.
<LINK rel="stylesheet" media="aural" href="corporate-aural.css" type="text/css">
<LINK rel="stylesheet" media="screen" href="corporate-screen.css" type="text/css">
<LINK rel="stylesheet" media="print" href="corporate-print.css" type="text/css">
<LINK rel="stylesheet" href="techreport.css" type="text/css">
<STYLE media="screen, print" type="text/css">
  p.special { color: rgb(230, 100, 180) }
</STYLE>
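The filtering step that the user agent performs on this cascade can be sketched in Python. The sketch follows the prose of the example in treating a style sheet with no media attribute as applying to all media; that treatment, the handling of the "all" keyword, and every name in the code are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def media_matches(media: Optional[str], current: str) -> bool:
    """Return True if a media attribute value covers the current medium."""
    if media is None:
        # Assumption taken from the example's prose: no media attribute
        # means the style sheet applies to all media.
        return True
    entries = [entry.strip() for entry in media.split(",")]
    return "all" in entries or current in entries


def filter_cascade(
    sheets: Sequence[Tuple[str, Optional[str]]], current: str
) -> List[str]:
    """Keep, in order, the names of the sheets that apply to the current medium."""
    return [name for name, media in sheets if media_matches(media, current)]


# The cascade from the example above as (name, media) pairs.
EXAMPLE_CASCADE = [
    ("corporate-aural.css", "aural"),
    ("corporate-screen.css", "screen"),
    ("corporate-print.css", "print"),
    ("techreport.css", None),
    ("STYLE p.special", "screen, print"),
]
```

For the "aural" medium the sketch keeps corporate-aural.css and techreport.css only, so the color rule in the STYLE element is dropped, as the text describes.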
When the user agent wants to render a document, it needs to find values for style properties, e.g. the font family, font style, size, line height, text color and so on. The exact mechanism depends on the style sheet language, but the following description is generally applicable:
The cascading mechanism is used when a number of style rules all apply directly to an element. The mechanism allows the user agent to sort the rules by specificity, to determine which rule to apply. If no rule can be found, the next step depends on whether the style property can be inherited or not. Not all properties can be inherited. For these properties the style sheet language provides default values for use when there are no explicit rules for a particular element.
If the property can be inherited, the user agent examines the immediately enclosing element to see if a rule applies to that. This process continues until an applicable rule is found. This mechanism allows style sheets to be specified compactly. For instance, authors may specify the font family for all elements within the BODY by a single rule that applies to the BODY element.
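The lookup described in the preceding paragraphs can be sketched in Python. This is a simplified model under stated assumptions: selectors are plain element names, specificity is a precomputed integer, ties between equally specific rules go to the later rule, and an inheritable property with no rule anywhere up the tree falls back to the language default. None of these details is fixed by the description above, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


class Element:
    """A node in the document tree; parent is None for the root."""

    def __init__(self, name: str, parent: Optional["Element"] = None) -> None:
        self.name = name
        self.parent = parent


# (element name, property, value, specificity)
Rule = Tuple[str, str, str, int]


def cascaded_value(element: Element, prop: str, rules: List[Rule]) -> Optional[str]:
    """Pick the most specific rule that applies directly to the element."""
    candidates = [
        (specificity, index, value)
        for index, (selector, name, value, specificity) in enumerate(rules)
        if selector == element.name and name == prop
    ]
    if not candidates:
        return None
    # Highest specificity wins; among equals, the later rule (assumption).
    return max(candidates)[2]


def computed_value(
    element: Element,
    prop: str,
    rules: List[Rule],
    inherited: Set[str],
    defaults: Dict[str, str],
) -> Optional[str]:
    """Resolve a property by cascade first, then inheritance, then defaults."""
    node: Optional[Element] = element
    while node is not None:
        value = cascaded_value(node, prop, rules)
        if value is not None:
            return value
        if prop not in inherited:
            break  # non-inherited property: use the default value
        node = node.parent  # examine the immediately enclosing element
    return defaults.get(prop)
```

With a single font-family rule on BODY, any descendant element resolves to that font family through inheritance, which is the compactness benefit mentioned above.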
Some style sheet languages support syntax intended to allow authors to hide the content of STYLE elements from non-conforming user agents.
This example illustrates for CSS how to comment out the content of STYLE elements to ensure that older, non-conforming user agents will not render them as text.
<STYLE type="text/css">
<!--
  H1 { color: red }
  P { color: blue}
-->
</STYLE>
This section only applies to user agents conforming to versions of HTTP that define a Link header field. Note that HTTP/1.1 as defined by [RFC2616] does not include a Link header field (refer to section 19.6.3).
Web server managers may find it convenient to configure a server so that a style sheet will be applied to a group of pages. The HTTP Link header has the same effect as a LINK element with the same attributes and values. Multiple Link headers correspond to multiple LINK elements occurring in the same order. For instance,
Link: <http://www.acme.com/corporate.css>; REL=stylesheet
corresponds to:
<LINK rel="stylesheet" href="http://www.acme.com/corporate.css">
It is possible to specify several alternate styles using multiple Link headers, and then use the rel attribute to determine the default style.
In the following example, "compact" is applied by default since it omits the "alternate" keyword for the rel attribute.
Link: <compact.css>; rel="stylesheet"; title="compact"
Link: <bigprint.css>; rel="alternate stylesheet"; title="big print"
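Because a Link header has the same effect as a LINK element with the same attributes, a user agent might translate one into the other. The Python sketch below shows one possible translation of a single header value (the text after "Link:"). It is intentionally naive: it assumes one link per header value and no semicolons inside quoted attribute values, and the function name is hypothetical.

```python
import re
from typing import Dict

# "<URI>" followed by zero or more ";name=value" parameters.
_LINK_VALUE = re.compile(r"^\s*<([^>]*)>(.*)$")


def link_header_to_attributes(header_value: str) -> Dict[str, str]:
    """Translate one Link header value into LINK element attributes."""
    match = _LINK_VALUE.match(header_value)
    if match is None:
        raise ValueError("not a Link header value: %r" % header_value)
    attributes = {"href": match.group(1)}
    for param in match.group(2).split(";"):
        if "=" not in param:
            continue  # skip the empty piece before the first ";"
        name, value = param.split("=", 1)
        value = value.strip()
        # Quote marks are optional unless the value contains whitespace.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        attributes[name.strip().lower()] = value
    return attributes
```

Applied to the second header above, the sketch produces href "bigprint.css", rel "alternate stylesheet", and title "big print", the same attributes an equivalent LINK element would carry.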
This should also work when HTML documents are sent by email. Some email agents can alter the ordering of [RFC822] headers. To protect against this affecting the cascading order for style sheets specified by Link headers, authors can use header concatenation to merge several instances of the same header field. The quote marks are only needed when the attribute values include whitespace. Use SGML entities to reference characters that are otherwise not permitted within HTTP or email headers, or that are likely to be affected by transit through gateways.
LINK and META elements implied by HTTP headers are defined as occurring before any explicit LINK and META elements in the document's HEAD.