HTML and MIME

The definition of the HTML content subtype is:
MIME type name:
text
MIME subtype name:
html
Required parameters:
none
Optional parameters:
level, version, charset
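
As an illustration, the parameters above could be read from a content type value as in the following Python sketch. The function name parse_html_content_type and the sample value "text/html; level=2" are hypothetical and are not defined by this specification; the sketch relies only on the standard library's email.message.Message.

```python
from email.message import Message

# Optional parameters named in the definition above.
OPTIONAL_PARAMS = ("level", "version", "charset")


def parse_html_content_type(header_value):
    """Return the optional parameters present in a text/html content type.

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased "type/subtype" pair.
    if msg.get_content_type() != "text/html":
        raise ValueError("not a text/html content type: %r" % header_value)
    params = {}
    for name in OPTIONAL_PARAMS:
        value = msg.get_param(name)  # None when the parameter is absent
        if value is not None:
            params[name] = value
    return params


if __name__ == "__main__":
    # The level value here is an assumed example.
    print(parse_html_content_type("text/html; level=2"))
```

Because there are no required parameters, a bare "text/html" value yields an empty result.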

Level

The level parameter specifies the feature set used in the document. The level is an integer, implying that any features of the same or a lower level may be present in the document. Levels are defined by this specification.
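
One possible reading of this rule, from the point of view of a program receiving a document, is sketched below in Python. The idea that a receiver compares the declared level against the highest level it implements is an assumption made for illustration; the names can_process, document_level, and supported_level are hypothetical.

```python
def can_process(document_level, supported_level):
    """Return True if a receiver implementing supported_level covers
    every feature a document declared at document_level may use.

    Assumes, per the text above, that a document at level N may use
    features of level N or lower, and nothing higher.
    """
    return int(document_level) <= int(supported_level)


if __name__ == "__main__":
    # Level numbers are assumed examples.
    print(can_process("1", 2))
    print(can_process("3", 2))
```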

Version

To help avoid future compatibility problems, the version parameter may be used to give the version number of this specification to which the document conforms. The version number appears at the front of this document and within the public identifier for the SGML DTD.

Character sets

The base character set (the SGML BASESET) for HTML is ISO Latin-1. This is the set referred to by any numeric character references. The actual character set used in the representation of an HTML document may be ISO Latin-1 or its 7-bit subset, ASCII. An HTML document is not obliged to contain any characters above decimal 127. A transport medium such as electronic mail may impose constraints on the number of bits in the representation of a document, though the HTTP access protocol used by W3 always allows 8-bit transfer.

When an HTML document is encoded using 7-bit characters, character references and entity references may be used to encode characters in the upper half of the ISO Latin-1 set. In this way, documents may be prepared that are suitable for mailing through systems limited to 7 bits.
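
The following Python sketch shows one way such a 7-bit form might be produced, using numeric character references only. The function name to_seven_bit_html is hypothetical. The sketch handles document text only: it does not distinguish markup from data, and it rejects characters outside ISO Latin-1 rather than guessing at a representation.

```python
def to_seven_bit_html(text):
    """Replace upper-half ISO Latin-1 characters with numeric
    character references so the result contains only 7-bit characters.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code <= 127:
            # Already within the 7-bit ASCII subset.
            out.append(ch)
        elif code <= 255:
            # Upper half of ISO Latin-1: emit a decimal reference.
            out.append("&#%d;" % code)
        else:
            raise ValueError("character outside ISO Latin-1: %r" % ch)
    return "".join(out)


if __name__ == "__main__":
    # "\u00e9" is e-acute, decimal 233 in ISO Latin-1.
    print(to_seven_bit_html("caf\u00e9"))  # prints caf&#233;
```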

Character set option (proposed)

The SGML declaration specifies ISO Latin-1 as the base character set. The charset parameter is reserved for future use. Its intended significance is to override the base character set of the SGML declaration. Support for character sets other than ISO Latin-1 is not required for conformance with this specification.