4 Conformance: requirements and recommendations

Contents

  1. Definitions
  2. SGML
  3. The text/html content type

In this section, we begin the specification of HTML 4.0, starting with the contract between authors, documents, users, and user agents.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all upper case letters in this specification.

At times, the authors of this specification recommend good practice for authors and user agents. These recommendations are not normative and conformance with this specification does not depend on their realization. These recommendations contain the expression "We recommend ...", "This specification recommends ...", or some similar wording.

4.1 Definitions

HTML document
An HTML document is an SGML document that meets the constraints of this specification.
Author
An author is a person or program that writes or generates HTML documents.
User
A user is a person who interacts with a user agent to view, hear, or otherwise use a rendered HTML document.
HTML user agent
An HTML user agent is any device that interprets HTML documents. User agents include visual browsers (text-only and graphical), non-visual browsers (audio, Braille), search robots, proxies, etc.

A conforming user agent for HTML 4.0 is one that observes the mandatory conditions ("must") set forth in this specification, including the following points:

Error conditions
This specification does not define how conforming user agents handle general error conditions, including how user agents behave when they encounter elements, attributes, attribute values, or entities not specified in this document.

However, to facilitate experimentation and interoperability between implementations of various versions of HTML, we recommend the following behavior:

We also recommend that user agents provide support for notifying the user of such errors.

Since user agents may vary in how they handle error conditions, authors and users must not rely on specific error recovery behavior.
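
As a non-normative illustration of notifying the user about unrecognized markup, the following Python sketch collects element names that fall outside a known set. It is only one possible approach: the KNOWN_ELEMENTS set is a small, hypothetical subset chosen for the example (the actual element names are defined by the HTML 4.0 DTD), and the class and function names are assumptions, not part of this specification.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately incomplete subset used only for this example.
# The real set of element names is defined by the HTML 4.0 DTD.
KNOWN_ELEMENTS = {"html", "head", "title", "body", "p", "em", "strong", "a"}


class UnknownElementReporter(HTMLParser):
    """Records the names of start tags that are not in KNOWN_ELEMENTS."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case.
        if tag not in KNOWN_ELEMENTS and tag not in self.unknown:
            self.unknown.append(tag)


def find_unknown_elements(markup):
    """Return unrecognized element names in order of first appearance."""
    reporter = UnknownElementReporter()
    reporter.feed(markup)
    reporter.close()
    return reporter.unknown
```

A user agent built along these lines could still process the document and report the returned names to the user separately, for example in a status area or a log.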

Deprecated
A deprecated element or attribute is one that has been outdated by newer constructs. Deprecated elements are defined in the reference manual in appropriate locations, but are clearly marked as deprecated. Deprecated elements may become obsolete in future versions of HTML.

User agents should continue to support deprecated elements for reasons of backward compatibility.

Definitions of elements and attributes clearly indicate which are deprecated.

This specification includes examples that illustrate how to avoid using deprecated elements. In most cases these depend on user agent support for style sheets. In general, authors should use style sheets to achieve stylistic and formatting effects rather than HTML presentational attributes. HTML presentational attributes have been deprecated when style sheet alternatives exist (see, for example, [CSS1]).

Obsolete
An obsolete element or attribute is one for which there is no guarantee of support by a user agent. Obsolete elements are no longer defined in the specification, but are listed for historical purposes in the changes section of the reference manual.

4.2 SGML

HTML 4.0 is an SGML application conforming to International Standard ISO 8879 -- Standard Generalized Markup Language (SGML), defined in [ISO8879].

Comments appearing in the HTML 4.0 DTD have no normative value; they are informative only.

User agents must not render SGML processing instructions (e.g., <?full volume>) or comments. For more information about this and other SGML features that may be legal in HTML but aren't widely supported by HTML user agents, please consult the section on SGML features with limited support.
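
The requirement above can be sketched with a toy text-only "renderer" written in Python. This is a minimal illustration, assuming that rendering means collecting character data; the names TextRenderer and render_text are hypothetical, and the sketch relies on the standard library's html.parser module, which reports comments and processing instructions through separate callbacks.

```python
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Collects character data only; comments and PIs are never output."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Ordinary character data is the only thing "rendered".
        self.chunks.append(data)

    def handle_comment(self, data):
        # SGML comments (<!-- ... -->) are deliberately not rendered.
        pass

    def handle_pi(self, data):
        # Processing instructions (e.g., <?full volume>) are not rendered.
        pass


def render_text(markup):
    """Return the text a very simple user agent might display."""
    renderer = TextRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.chunks)
```

For example, render_text("<p>Hello<!-- hidden --> <?full volume>world</p>") yields the string "Hello world": neither the comment nor the processing instruction appears in the output.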

4.3 The text/html content type

HTML documents are sent over the Internet as a sequence of bytes accompanied by encoding information (described in the section on character encodings). The structure of the transmission, termed a message entity, is defined in [RFC2045] and [RFC2068]. A message entity with a content type of "text/html" represents an HTML document.

The content type for HTML documents is defined as follows:

Content type name
text
Content subtype name
html
Required parameters
none
Optional parameters
charset
Encoding considerations
any encoding is allowed
Security considerations
See the notes on security

The optional parameter "charset" refers to the character encoding used to represent the HTML document as a sequence of bytes. Legal values for this parameter are defined in the section on character encodings. Although this parameter is optional, we recommend that it always be present.
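
To illustrate how the content type and its optional "charset" parameter might be read from a header value, here is a small Python sketch. It is an example under stated assumptions rather than a prescribed method: the function name parse_content_type is hypothetical, and the parsing is delegated to the standard library's email.message.Message class.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header value into (media type, charset).

    The charset is None when the optional parameter is absent.
    """
    message = Message()
    message["Content-Type"] = header_value
    # get_content_type() returns the type/subtype in lower case;
    # get_content_charset() returns the charset in lower case, or None.
    return message.get_content_type(), message.get_content_charset()


def is_html(header_value):
    """True when the header value names the text/html content type."""
    media_type, _charset = parse_content_type(header_value)
    return media_type == "text/html"
```

With this sketch, parse_content_type("text/html; charset=ISO-8859-1") returns ("text/html", "iso-8859-1"), while parse_content_type("text/html") returns ("text/html", None), which is the case the recommendation above is meant to avoid.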

Note. The relationship between this specification's encoding of line breaks and [RFC2045], section 2.10 is not yet clear. The editors expect to review this relationship during the Proposed Recommendation review phase.