An HTML document is a tree of elements, including a head and body, headings, paragraphs, lists, etc. Form elements are discussed in section Forms.
The HTML document element consists of a head and a body, much like a memo or a mail message. The head contains the title and optional elements. The body is a text flow consisting of paragraphs, lists, and other elements.
The head of an HTML document is an unordered collection of information about the document. For example:
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HEAD>
    <TITLE>Introduction to HTML</TITLE>
    </HEAD>
    ...
Every HTML document must contain a TITLE element.
The title should identify the contents of the document in a global context. A short title, such as "Introduction", may be meaningless out of context. A title such as "Introduction to HTML Elements" is more appropriate. (12)
A user agent may display the title of a document in a history list or as a label for the window displaying the document. This differs from headings (section Headings: H1 ... H6), which are typically displayed within the body text flow.
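As a rough sketch of how a user agent might obtain the label for a history list, the following Python fragment pulls the TITLE text out of a document with the standard library's `html.parser`. The class and function names are hypothetical and chosen for illustration; returning `None` for a missing TITLE is an assumption of this sketch, not behavior required by this specification.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the character data of the first TITLE element."""

    def __init__(self):
        super().__init__()
        self.seen_title = False
        self._in_title = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if tag == "title" and not self.seen_title:
            self.seen_title = True
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def title_text(self):
        # Collapse runs of white space into single spaces.
        return " ".join("".join(self._chunks).split())


def extract_title(document):
    """Return the document title, or None if there is no TITLE element."""
    parser = TitleExtractor()
    parser.feed(document)
    parser.close()
    return parser.title_text() if parser.seen_title else None


sample = "<HEAD> <TITLE>Introduction to HTML</TITLE> </HEAD>"
print(extract_title(sample))
```

A document for which `extract_title` returns `None` lacks the required TITLE element.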
The optional BASE element provides a base address for interpreting relative URLs when the document is read out of context (see section Hyperlinks). The value of the HREF attribute must be an absolute URI.
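As a minimal sketch of how a base address is applied, the following Python fragment resolves relative URLs with the standard library's `urllib.parse.urljoin`. The base address and the relative references are hypothetical values chosen for illustration; the resolution rules that apply to HTML documents are those given in section Hyperlinks.

```python
from urllib.parse import urljoin

# Hypothetical value of the HREF attribute of a BASE element.
# It is an absolute URI, as the BASE element requires.
BASE_HREF = "http://host.example/docs/intro.html"


def resolve(relative_url, base=BASE_HREF):
    """Resolve a relative URL against the base address."""
    return urljoin(base, relative_url)


print(resolve("figures/tree.gif"))
print(resolve("../index.html"))
```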
The ISINDEX element indicates that the user agent should allow the user to search an index by giving keywords. See section Queries and Indexes for details.
The LINK element represents a hyperlink (see section Hyperlinks). Any number of LINK elements may occur in the HEAD element of an HTML document. It has the same attributes as the A element (see section Anchor: A).
The LINK element is typically used to indicate authorship, related indexes and glossaries, older or more recent versions, document hierarchy, associated resources such as style sheets, etc.
The META element is an extensible container for use in identifying specialized document meta-information. Meta-information has two main functions:
Each META element specifies a name/value pair. If multiple META elements are provided with the same name, their combined contents--concatenated as a comma-separated list--is the value associated with that name. (13)
HTTP servers may read the content of the document HEAD to generate header fields corresponding to any elements defining a value for the attribute HTTP-EQUIV. (14)
Attributes of the META element:
If the document contains:
    <META HTTP-EQUIV="Expires" CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">
    <meta http-equiv="Keywords" CONTENT="Fred">
    <META HTTP-EQUIV="Reply-to" content="email@example.com (Roy Fielding)">
    <Meta Http-equiv="Keywords" CONTENT="Barney">
then the server may include the following header fields:
    Expires: Tue, 04 Dec 1993 21:29:02 GMT
    Keywords: Fred, Barney
    Reply-to: email@example.com (Roy Fielding)
as part of the HTTP response to a `GET' or `HEAD' request for that document.
An HTTP server must not use the META element to form an HTTP response header unless the HTTP-EQUIV attribute is present.
An HTTP server may disregard any META elements that specify information controlled by the HTTP server, for example `Server', `Date', and `Last-modified'.
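The rules above can be sketched in Python using the standard library's `html.parser`. This is an illustrative sketch, not a description of any particular server: the class and function names are hypothetical, and the set of server-controlled fields is simply the three names mentioned above. META elements without HTTP-EQUIV are skipped, and repeated names are joined into a comma-separated list.

```python
from html.parser import HTMLParser

# Fields this hypothetical server controls itself and therefore disregards.
SERVER_CONTROLLED = {"server", "date", "last-modified"}


class MetaHeaderCollector(HTMLParser):
    """Collects META elements that carry an HTTP-EQUIV attribute."""

    def __init__(self):
        super().__init__()
        self.values = {}     # first-seen spelling of a name -> list of values
        self._spelling = {}  # lower-cased name -> first-seen spelling

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if name is None or content is None:
            return  # no HTTP-EQUIV: must not form a response header
        key = name.lower()
        if key in SERVER_CONTROLLED:
            return
        spelling = self._spelling.setdefault(key, name)
        self.values.setdefault(spelling, []).append(content)


def meta_to_header_fields(document):
    """Map each HTTP-EQUIV name to its comma-separated combined value."""
    collector = MetaHeaderCollector()
    collector.feed(document)
    collector.close()
    return {name: ", ".join(vals) for name, vals in collector.values.items()}


head = """
<META HTTP-EQUIV="Expires" CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">
<meta http-equiv="Keywords" CONTENT="Fred">
<META HTTP-EQUIV="Reply-to" content="email@example.com (Roy Fielding)">
<Meta Http-equiv="Keywords" CONTENT="Barney">
<META NAME="Author" CONTENT="ignored: no HTTP-EQUIV">
<META HTTP-EQUIV="Date" CONTENT="ignored: server-controlled">
"""
for field, value in meta_to_header_fields(head).items():
    print(f"{field}: {value}")
```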
The NEXTID element is included for historical reasons only. HTML documents should not contain NEXTID elements.
The NEXTID element gives a hint for the name to use for a new A element when editing an HTML document. It should be distinct from all NAME attribute values on A elements. For example:
The BODY element contains the text flow of the document, including headings, paragraphs, lists, etc.
    <BODY>
    <h1>Important Stuff</h1>
    <p>Explanation about important stuff...
    </BODY>
The six heading elements, H1 through H6, denote section headings. Although the order and occurrence of headings are not constrained by the HTML DTD, documents should not skip levels (for example, from H1 to H3), as converting such documents to other representations is often problematic.
Example of use:
    <H1>This is a heading</H1>
    Here is some text
    <H2>Second level heading</H2>
    Here is some more text.
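The recommendation against skipping heading levels can be checked mechanically. The following Python sketch, built on the standard library's `html.parser`, reports each place where a heading is more than one level deeper than the heading before it. The names are hypothetical, and treating only downward jumps as skips is an assumption of this sketch.

```python
from html.parser import HTMLParser


class HeadingLevels(HTMLParser):
    """Records the level (1 to 6) of each heading start-tag in order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive in lower case; "hr" is excluded by the digit test.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(document):
    """Return (previous, current) pairs where a heading level was skipped."""
    parser = HeadingLevels()
    parser.feed(document)
    parser.close()
    return [
        (previous, current)
        for previous, current in zip(parser.levels, parser.levels[1:])
        if current > previous + 1
    ]


print(skipped_levels("<H1>Title</H1><H3>Detail</H3>"))
```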
Typical renderings are:
Block structuring elements include paragraphs, lists, and block quotes. They must not contain heading elements, but they may contain phrase markup, and in some cases, they may be nested.
The P element indicates a paragraph. The exact indentation, leading space, etc. of a paragraph are not specified and may be a function of other tags, style sheets, etc.
Typically, paragraphs are surrounded by a vertical space of one line or half a line. The first line in a paragraph is indented in some cases.
Example of use:
    <H1>This Heading Precedes the Paragraph</H1>
    <P>This is the text of the first paragraph.
    <P>This is the text of the second paragraph. Although you do not
    need to start paragraphs on new lines, maintaining this
    convention facilitates document maintenance.</P>
    <P>This is the text of a third paragraph.</P>
The PRE element represents a character cell block of text and is suitable for text that has been formatted for a monospaced font.
The PRE tag may be used with the optional WIDTH attribute. The WIDTH attribute specifies the maximum number of characters for a line and allows the HTML user agent to select a suitable font and indentation.
Within preformatted text:
Example of use:
    <PRE>
    Line 1.
           Line 2 is to the right of line 1.     <a href="abc">abc</a>
           Line 3 aligns with line 2.            <a href="def">def</a>
    </PRE>
The XMP and LISTING elements are similar to the PRE element, but they have a different syntax. Their content is declared as CDATA, which means that no markup except the end-tag open delimiter-in-context is recognized (see 9.6 "Delimiter Recognition" of [SGML]). (18)
Since CDATA declared content has a number of unfortunate interactions with processing techniques and tends to be used and implemented inconsistently, HTML documents should contain neither XMP nor LISTING elements -- the PRE tag is more expressive and more consistently supported.
The LISTING element should be rendered so that at least 132 characters fit on a line. The XMP element should be rendered so that at least 80 characters fit on a line but is otherwise identical to the LISTING element. (19)
The ADDRESS element contains such information as address, signature and authorship, often at the beginning or end of the body of a document.
Typically, the ADDRESS element is rendered in an italic typeface and may be indented.
Example of use:
    <ADDRESS>
    Newsletter editor<BR>
    J.R. Brown<BR>
    JimquickPost News, Jimquick, CT 01234<BR>
    Tel (123) 456 7890
    </ADDRESS>
The BLOCKQUOTE element contains text quoted from another source.
A typical rendering might be a slight extra left and right indent, and/or italic font. The BLOCKQUOTE typically provides space above and below the quote.
Single-font rendition may reflect the quotation style of Internet mail by putting a vertical line of graphic characters, such as the greater than symbol (>), in the left margin.
Example of use:
    I think the play ends
    <BLOCKQUOTE>
    <P>Soft you now, the fair Ophelia. Nymph, in thy orisons, be all
    my sins remembered.
    </BLOCKQUOTE>
    but I am not sure.
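The single-font rendition mentioned above, in the style of Internet mail, might be sketched as follows. The function name, the default line width, and the choice of "> " as the margin prefix are assumptions made for illustration; the standard library's `textwrap` module does the line filling.

```python
import textwrap


def render_mail_quote(text, width=60):
    """Fill the quoted text and put "> " in the left margin of every line."""
    # Collapse white space first, then wrap, leaving room for the prefix.
    normalized = " ".join(text.split())
    lines = textwrap.wrap(normalized, width=width - 2)
    return "\n".join("> " + line for line in lines)


quote = """Soft you now, the fair Ophelia. Nymph, in thy orisons,
be all my sins remembered."""
print(render_mail_quote(quote, width=40))
```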
HTML includes a number of list elements. They may be used in combination; for example, an OL may be nested in an LI element of a UL.
The COMPACT attribute suggests that a compact rendering be used.
The UL element represents a list of items -- typically rendered as a bulleted list.
The content of a UL element is a sequence of LI elements. For example:
    <UL>
    <LI>First list item
    <LI>Second list item
     <p>second paragraph of second item
    <LI>Third list item
    </UL>
The OL element represents an ordered list of items, sorted by sequence or order of importance. It is typically rendered as a numbered list.
The content of an OL element is a sequence of LI elements. For example:
    <OL>
    <LI>Click the Web button to open URI window.
    <LI>Enter the URI number in the text field of the Open URI
        window. The Web document you specified is displayed.
      <ol>
       <li>substep 1
       <li>substep 2
      </ol>
    <LI>Click highlighted text to move from one link to another.
    </OL>
The DIR element is similar to the UL element. It represents a list of short items, typically up to 20 characters each. Items in a directory list may be arranged in columns, typically 24 characters wide.
The content of a DIR element is a sequence of LI elements. Nested block elements are not allowed in the content of DIR elements. For example:
    <DIR>
    <LI>A-H<LI>I-M
    <LI>M-R<LI>S-Z
    </DIR>
The MENU element is a list of items with typically one line per item. The menu list style is typically more compact than the style of an unordered list.
The content of a MENU element is a sequence of LI elements. Nested block elements are not allowed in the content of MENU elements. For example:
    <MENU>
    <LI>First item in the list.
    <LI>Second item in the list.
    <LI>Third item in the list.
    </MENU>
A definition list is a list of terms and corresponding definitions. Definition lists are typically formatted with the term flush-left and the definition, formatted paragraph style, indented after the term.
The content of a DL element is a sequence of DT elements and/or DD elements, usually in pairs. Multiple DT elements may be paired with a single DD element. Documents should not contain multiple consecutive DD elements.
Example of use:
    <DL>
    <DT>Term<DD>This is the definition of the first term.
    <DT>Term<DD>This is the definition of the second term.
    </DL>
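The pairing rules for DT and DD can be illustrated with a small Python check. It is only a sketch: the function name is hypothetical, and it assumes the children of a DL element have already been extracted as a list of element names.

```python
def consecutive_dd_positions(children):
    """Return the indexes of DD elements that directly follow another DD.

    `children` is the sequence of element names (DT or DD, in any case)
    found in the content of one DL element.
    """
    names = [name.upper() for name in children]
    return [
        index
        for index in range(1, len(names))
        if names[index] == "DD" and names[index - 1] == "DD"
    ]


# Several DT elements sharing one DD is acceptable.
print(consecutive_dd_positions(["DT", "DT", "DD"]))
# Two DD elements in a row is discouraged; the second one is reported.
print(consecutive_dd_positions(["DT", "DD", "DD"]))
```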
If the DT term does not fit in the DT column (typically one third of the display area), it may be extended across the page with the DD section moved to the next line, or it may be wrapped onto successive lines of the left-hand column.
The optional COMPACT attribute suggests that a compact rendering be used, because the list items are small and/or the entire list is large.
Unless the COMPACT attribute is present, an HTML user agent may leave white space between successive DT, DD pairs. The COMPACT attribute may also reduce the width of the left-hand (DT) column.
    <DL COMPACT>
    <DT>Term<DD>This is the first definition in compact format.
    <DT>Term<DD>This is the second definition in compact format.
    </DL>
Phrases may be marked up according to idiomatic usage, typographic appearance, or for use as hyperlink anchors.
User agents must render highlighted phrases distinctly from plain text. Additionally, EM content must be rendered as distinct from STRONG content, and B content must be rendered as distinct from I content.
Phrase elements may be nested within the content of other phrase elements; however, HTML user agents may render nested phrase elements indistinctly from non-nested elements:
plain <B>bold <I>italic</I></B> may be rendered the same as plain <B>bold </B><I>italic</I>
Phrases may be marked up to indicate certain idioms. (20)
The CITE element is used to indicate the title of a book or other citation. It is typically rendered as italics. For example:
He just couldn't get enough of <cite>The Grapes of Wrath</cite>.
The CODE element indicates an example of code, typically rendered in a mono-spaced font. The CODE element is intended for short words or phrases of code; the PRE block structuring element (section Preformatted Text: PRE) is more appropriate for multiple-line listings. For example:
The expression <code>x += 1</code> is short for <code>x = x + 1</code>.
The EM element indicates an emphasized phrase, typically rendered as italics. For example:
A singular subject <em>always</em> takes a singular verb.
The KBD element indicates text typed by a user, typically rendered in a mono-spaced font. This is commonly used in instruction manuals. For example:
Enter <kbd>FIND IT</kbd> to search the database.
The SAMP element indicates a sequence of literal characters, typically rendered in a mono-spaced font. For example:
The only word containing the letters <samp>mt</samp> is dreamt.
The STRONG element indicates strong emphasis, typically rendered in bold. For example:
<strong>STOP</strong>, or I'll say "<strong>STOP</strong>" again!
The VAR element indicates a placeholder variable, typically rendered as italic. For example:
Type <SAMP>html-check <VAR>file</VAR> | more</SAMP> to check <VAR>file</VAR> for markup errors.
Typographic elements are used to specify the format of marked text.
Typical renderings for idiomatic elements may vary between user agents. If a specific rendering is necessary -- for example, when referring to a specific text attribute as in "The italic parts are mandatory" -- a typographic element can be used to ensure that the intended typography is used where possible.
The B element indicates bold text. Where bold typography is unavailable, an alternative representation may be used.
The I element indicates italic text. Where italic typography is unavailable, an alternative representation may be used.
The TT element indicates teletype (monospaced) text. Where a teletype font is unavailable, an alternative representation may be used.
The A element indicates a hyperlink anchor (see section Hyperlinks). At least one of the NAME and HREF attributes should be present. Attributes of the A element:
The BR element specifies a line break between words (see section Characters, Words, and Paragraphs). For example:
    <P> Pease porridge hot<BR>
    Pease porridge cold<BR>
    Pease porridge in the pot<BR>
    Nine days old.
The HR element is a divider between sections of text; typically a full width horizontal rule or equivalent graphic. For example:
    <HR>
    <ADDRESS>February 8, 1995, CERN</ADDRESS>
    </BODY>
The IMG element refers to an image or icon via a hyperlink (see section Simultaneous Presentation of Image Resources).
HTML user agents may process the value of the ALT attribute as an alternative to processing the image resource indicated by the SRC attribute. (22)
Attributes of the IMG element:
Examples of use:
<IMG SRC="triangle.xbm" ALT="Warning:"> Be sure to read these instructions.
    <a href="http://machine/htbin/imagemap/sample">
    <IMG SRC="sample.xbm" ISMAP>
    </a>
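To illustrate how a user agent that does not display images might use the ALT attribute, the following Python sketch renders a fragment as plain text and substitutes each IMG element's ALT value for the image. The names are hypothetical, and rendering nothing for an IMG without ALT is an assumption of this sketch rather than required behavior.

```python
from html.parser import HTMLParser


class AltTextRenderer(HTMLParser):
    """Renders character data, using ALT text in place of images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attribute names arrive in lower case; a valueless attribute
        # such as ISMAP has the value None.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def render_text_only(fragment):
    """Return the fragment as plain text with white space collapsed."""
    renderer = AltTextRenderer()
    renderer.feed(fragment)
    renderer.close()
    return " ".join("".join(renderer.parts).split())


fragment = ('<IMG SRC="triangle.xbm" ALT="Warning:"> '
            'Be sure to read these instructions.')
print(render_text_only(fragment))
```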