HTML 4.01 Test Suite - Assertions
Testable Assertions: Section 6 Basic HTML data types
6 Basic HTML data types - Character data, colors, lengths, URIs, content types, etc.
Each attribute definition includes information about the case-sensitivity of its values. The case information is presented with the following keys:
CS The value is case-sensitive (i.e., user agents interpret "a" and "A" differently).
CI The value is case-insensitive (i.e., user agents interpret "a" and "A" as the same).
CN The value is not subject to case changes, e.g., because it is a number or a character from the document character set.
CA The element or attribute definition itself gives case information.
CT Consult the type definition for details about case-sensitivity.
If an attribute value is a list, the keys apply to every value in the list, unless otherwise indicated.
CDATA is a sequence of characters from the document character set and may include character entities. User agents should interpret attribute values as follows:
Replace character entities with characters,
Ignore line feeds,
Replace each carriage return or tab with a single space.
User agents may ignore leading and trailing white space in CDATA attribute values (e.g., " myval " may be interpreted as "myval").
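The interpretation steps above can be sketched in Python. This is a minimal illustration, not a conforming parser. The function name `normalize_cdata_attribute` and its `strip` flag are hypothetical, `html.unescape` follows HTML5 entity rules rather than those of HTML 4.01, and handling literal white space before decoding entities is an assumption about ordering.

```python
import html


def normalize_cdata_attribute(raw: str, strip: bool = True) -> str:
    """Sketch of CDATA attribute-value interpretation (hypothetical helper)."""
    # Ignore literal line feeds.
    value = raw.replace("\n", "")
    # Replace each literal carriage return or tab with a single space.
    value = value.replace("\r", " ").replace("\t", " ")
    # Replace character entities with characters (HTML5 rules; an assumption).
    value = html.unescape(value)
    # Optionally drop leading and trailing white space, as user agents may.
    return value.strip() if strip else value
```

With these assumptions, `normalize_cdata_attribute(" myval ")` yields `"myval"`.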
Although the STYLE and SCRIPT elements use CDATA for their data model, user agents must handle CDATA differently for these elements. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as the end of the element's content. In valid documents, this would be the end tag for the element.
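The termination rule for STYLE and SCRIPT content can be illustrated with a short Python sketch. The helper name `split_raw_text` is hypothetical. It only locates the first "</" and makes no attempt to tokenize the rest of the document.

```python
def split_raw_text(source: str) -> tuple[str, str]:
    """Split text at the first "</"; return (element content, remainder)."""
    end = source.find("</")
    if end == -1:
        # No end-tag open delimiter: everything is treated as content here.
        return source, ""
    return source[:end], source[end:]


# A "</" inside a script string also ends the content under this rule.
content, rest = split_raw_text('document.write("<b>hi</b>")</script>')
```

Here `content` is `document.write("<b>hi`, which shows why script or style data containing "</" needs to be written so that the sequence does not appear literally.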
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
IDREF and IDREFS are references to ID tokens defined by other attributes. IDREF is a single token and IDREFS is a space-separated list of tokens.
NUMBER tokens must contain at least one digit ([0-9]).
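The token rules above map naturally onto regular expressions. The following Python sketch uses hypothetical helper names. It assumes that a NUMBER token consists only of digits and that IDREFS may be split on any run of white space.

```python
import re

# A letter, then any number of letters, digits, "-", "_", ":" and ".".
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.\-]*")
# At least one digit; digits only is an assumption of this sketch.
NUMBER_TOKEN = re.compile(r"[0-9]+")


def is_name_token(value: str) -> bool:
    """Check an ID or NAME token against the lexical rule."""
    return NAME_TOKEN.fullmatch(value) is not None


def is_number_token(value: str) -> bool:
    """Check a NUMBER token."""
    return NUMBER_TOKEN.fullmatch(value) is not None


def split_idrefs(value: str) -> list[str]:
    """Split an IDREFS value into its individual IDREF tokens."""
    return value.split()
```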
The attribute value type "color" (%Color;) refers to color definitions as specified in [SRGB]. A color value may either be a hexadecimal number (prefixed by a hash mark) or one of the following sixteen color names. The color names are case-insensitive.
Color names and sRGB values
Black   = "#000000"    Green  = "#008000"
Silver  = "#C0C0C0"    Lime   = "#00FF00"
Gray    = "#808080"    Olive  = "#808000"
White   = "#FFFFFF"    Yellow = "#FFFF00"
Maroon  = "#800000"    Navy   = "#000080"
Red     = "#FF0000"    Blue   = "#0000FF"
Purple  = "#800080"    Teal   = "#008080"
Fuchsia = "#FF00FF"    Aqua   = "#00FFFF"
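As an illustration of the two accepted forms, the Python sketch below resolves a color value to an (R, G, B) tuple. The function name `parse_color` is hypothetical, and the sketch assumes that hexadecimal values always carry exactly six digits after the hash mark.

```python
import re

# The sixteen color names, stored in lowercase for case-insensitive lookup.
COLOR_NAMES = {
    "black": "#000000", "green": "#008000", "silver": "#C0C0C0",
    "lime": "#00FF00", "gray": "#808080", "olive": "#808000",
    "white": "#FFFFFF", "yellow": "#FFFF00", "maroon": "#800000",
    "navy": "#000080", "red": "#FF0000", "blue": "#0000FF",
    "purple": "#800080", "teal": "#008080", "fuchsia": "#FF00FF",
    "aqua": "#00FFFF",
}


def parse_color(value: str) -> tuple[int, int, int]:
    """Resolve a color name or "#RRGGBB" value to an (R, G, B) tuple."""
    text = COLOR_NAMES.get(value.lower(), value)
    match = re.fullmatch(r"#([0-9A-Fa-f]{6})", text)
    if match is None:
        raise ValueError(f"not a recognized color value: {value!r}")
    digits = match.group(1)
    # Convert each pair of hex digits to an integer in the range 0-255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
```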
HTML specifies three types of length values for attributes:
Pixels: The value (%Pixels; in the DTD) is an integer that represents the number of pixels of the canvas (screen, paper).
Length: The value (%Length; in the DTD) may be either a %Pixels; or a percentage of the available horizontal or vertical space.
MultiLength: The value (%MultiLength; in the DTD) may be a %Length; or a relative length. A relative length has the form "i*", where "i" is an integer. When allotting space among elements competing for that space, user agents allot pixel and percentage lengths first, then divide up remaining available space among relative lengths. Each relative length receives a portion of the available space that is proportional to the integer preceding the "*". The value "*" is equivalent to "1*". Thus, if 60 pixels of space are available after the user agent allots pixel and percentage space, and the competing relative lengths are 1*, 2*, and 3*, the 1* will be allotted 10 pixels, the 2* will be allotted 20 pixels, and the 3* will be allotted 30 pixels.
Length values are case-neutral.
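The proportional division of remaining space among relative lengths can be sketched in Python as follows. The helper names are hypothetical, and the sketch leaves rounding to whole pixels (which the text above does not specify) to the caller.

```python
import re


def parse_relative_length(value: str) -> int:
    """Return the integer weight of a relative length such as "2*" or "*"."""
    match = re.fullmatch(r"([0-9]*)\*", value)
    if match is None:
        raise ValueError(f"not a relative length: {value!r}")
    # "*" is equivalent to "1*".
    return int(match.group(1) or "1")


def allot_relative(remaining: float, lengths: list[str]) -> list[float]:
    """Divide the remaining space in proportion to each relative length."""
    weights = [parse_relative_length(item) for item in lengths]
    total = sum(weights)
    if total == 0:
        return [0.0 for _ in weights]
    return [remaining * weight / total for weight in weights]
```

For the worked example in the text, `allot_relative(60, ["1*", "2*", "3*"])` returns `[10.0, 20.0, 30.0]`.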
Content types are case-insensitive. Examples of content types include "text/html", "image/png", "image/gif", "video/mpeg", "text/css", and "audio/basic". For the current list of registered MIME types, please consult [MIMETYPES].
Whitespace is not allowed within the language-code. Language codes are case-insensitive.
The "charset" attributes (%Charset; in the DTD) refer to a character encoding as described in the section on character encodings. Values must be strings (e.g., "euc-jp") from the IANA registry (see [CHARSETS] for a complete list). Names of character encodings are case-insensitive.
Single characters may be specified with character references (e.g., "&amp;").
The current specification uses one of the formats described in the profile [DATETIME] for its definition of legal date/time strings (%Datetime; in the DTD). The format is: YYYY-MM-DDThh:mm:ssTZD
TZD = time zone designator. The time zone designator is one of:
Z indicates UTC (Coordinated Universal Time). The "Z" must be uppercase.
+hh:mm indicates that the time is a local time which is hh hours and mm minutes ahead of UTC.
-hh:mm indicates that the time is a local time which is hh hours and mm minutes behind UTC.
Exactly the components shown here must be present, with exactly this punctuation. Note that the "T" appears literally in the string (it must be uppercase), to indicate the beginning of the time element, as specified in [ISO8601].
If a generating application does not know the time to the second, it may use the value "00" for the seconds (and minutes and hours if necessary).
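A shape check for this date/time format can be sketched with a regular expression in Python. The name `is_html_datetime` is hypothetical. The sketch verifies only the components and punctuation, not whether the values form a real calendar date or a valid time.

```python
import re

# YYYY-MM-DDThh:mm:ssTZD, where TZD is "Z", "+hh:mm", or "-hh:mm".
# "T" and "Z" are matched literally, so lowercase forms are rejected.
DATETIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"      # date: YYYY-MM-DD
    r"T[0-9]{2}:[0-9]{2}:[0-9]{2}"     # time: hh:mm:ss
    r"(?:Z|[+-][0-9]{2}:[0-9]{2})"     # time zone designator
)


def is_html_datetime(value: str) -> bool:
    """Check that a string has exactly the required components."""
    return DATETIME.fullmatch(value) is not None
```

A string that omits the seconds, such as "1994-11-05T13:15Z", fails this check, which matches the requirement that exactly the listed components be present.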
Authors may use the following recognized link types, listed here with their conventional interpretations. White space characters are not permitted within link types.
These link types are case-insensitive. User agents, search engines, etc. may interpret these link types in a variety of ways.
Alternate: Designates substitute versions for the document in which the link occurs. When used together with the lang attribute, it implies a translated version of the document. When used together with the media attribute, it implies a version designed for a different medium (or media).
Stylesheet: Refers to an external style sheet. See the section on external style sheets for details. This is used together with the link type "Alternate" for user-selectable alternate style sheets.
Start: Refers to the first document in a collection of documents. This link type tells search engines which document is considered by the author to be the starting point of the collection.
Next: Refers to the next document in a linear sequence of documents. User agents may choose to preload the "next" document, to reduce the perceived load time.
Prev: Refers to the previous document in an ordered series of documents. Some user agents also support the synonym "Previous".
Contents: Refers to a document serving as a table of contents. Some user agents also support the synonym ToC (from "Table of Contents").
Index: Refers to a document providing an index for the current document.
Glossary: Refers to a document providing a glossary of terms that pertain to the current document.
Copyright: Refers to a copyright statement for the current document.
Chapter: Refers to a document serving as a chapter in a collection of documents.
Section: Refers to a document serving as a section in a collection of documents.
Subsection: Refers to a document serving as a subsection in a collection of documents.
Appendix: Refers to a document serving as an appendix in a collection of documents.
Help: Refers to a document offering help (more information, links to other sources of information, etc.).
Bookmark: Refers to a bookmark. A bookmark is a link to a key entry point within an extended document. The title attribute may be used, for example, to label the bookmark. Note that several bookmarks may be defined in each document.
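Because link types are case-insensitive and contain no white space, a list of them can be normalized with a simple split. The Python sketch below uses hypothetical helper names and assumes the value is a space-separated list of link types, as in the rel attribute of HTML 4.01.

```python
def parse_link_types(value: str) -> set[str]:
    """Split a link-type list and fold case for comparison."""
    return {token.lower() for token in value.split()}


def is_alternate_stylesheet(value: str) -> bool:
    """True when both "alternate" and "stylesheet" are present."""
    return {"alternate", "stylesheet"} <= parse_link_types(value)
```

For example, `is_alternate_stylesheet("Alternate StyleSheet")` is true, while `is_alternate_stylesheet("stylesheet")` is false.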
The following is a list of recognized media descriptors (%MediaDesc; in the DTD).
screen: Intended for non-paged computer screens.
tty: Intended for media using a fixed-pitch character grid, such as teletypes, terminals, or portable devices with limited display capabilities.
tv: Intended for television-type devices (low resolution, color, limited scrollability).
projection: Intended for projectors.
handheld: Intended for handheld devices (small screen, monochrome, bitmapped graphics, limited bandwidth).
print: Intended for paged, opaque material and for documents viewed on screen in print preview mode.
braille: Intended for braille tactile feedback devices.
aural: Intended for speech synthesizers.
all: Suitable for all devices.
Conforming user agents must be able to parse the media attribute value as follows:
1. The value is a comma-separated list of entries.
2. Each entry is truncated just before the first character that isn't a US ASCII letter [a-zA-Z] (ISO 10646 hex 41-5a, 61-7a), digit [0-9] (hex 30-39), or hyphen (hex 2d).
3. A case-sensitive match is then made with the set of media types defined above. User agents may ignore entries that don't match.
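The three parsing steps above can be sketched in Python. The function names are hypothetical, and skipping leading white space in each entry before truncation is an assumption of this sketch rather than something the steps state.

```python
import re

MEDIA_TYPES = {
    "screen", "tty", "tv", "projection", "handheld",
    "print", "braille", "aural", "all",
}


def truncate_media_entries(value: str) -> list[str]:
    """Steps 1 and 2: split on commas, then truncate each entry."""
    entries = []
    for entry in value.split(","):
        # Assumption: leading white space is skipped before truncation.
        entry = entry.lstrip()
        # Keep the leading run of US ASCII letters, digits, and hyphens.
        entries.append(re.match(r"[A-Za-z0-9-]*", entry).group(0))
    return entries


def recognized_media(value: str) -> list[str]:
    """Step 3: case-sensitive match; unmatched entries are ignored."""
    return [e for e in truncate_media_entries(value) if e in MEDIA_TYPES]
```

Under these assumptions, "screen, 3d-glasses, print and resolution > 90dpi" truncates to the entries "screen", "3d-glasses", and "print", of which only "screen" and "print" are recognized.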
Script data (%Script; in the DTD) can be the content of the SCRIPT element and the value of intrinsic event attributes. User agents must not evaluate script data as HTML markup but instead must pass it on as data to a script engine.
Script data that is element content may not contain character references, but script data that is the value of an attribute may contain them.
Style sheet data (%StyleSheet; in the DTD) can be the content of the STYLE element and the value of the style attribute. User agents must not evaluate style data as HTML markup.
Style sheet data that is element content may not contain character references, but style sheet data that is the value of an attribute may contain them.
Except for the reserved names, frame target names (%FrameTarget; in the DTD) must begin with an alphabetic character (a-zA-Z). User agents should ignore all other target names.
The following target names are reserved and have special meanings.
_blank The user agent should load the designated document in a new, unnamed window.
_self The user agent should load the document in the same frame as the element that refers to this target.
_parent The user agent should load the document into the immediate FRAMESET parent of the current frame. This value is equivalent to _self if the current frame has no parent.
_top The user agent should load the document into the full, original window (thus canceling all other frames). This value is equivalent to _self if the current frame has no parent.
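The naming rule for frame targets can be checked with a few lines of Python. The helper name `is_usable_target` is hypothetical, and matching the reserved names exactly (including case) is an assumption of this sketch.

```python
import re

RESERVED_TARGETS = {"_blank", "_self", "_parent", "_top"}


def is_usable_target(name: str) -> bool:
    """True for reserved names and for names starting with a-z or A-Z."""
    if name in RESERVED_TARGETS:
        return True
    # All other names must begin with an alphabetic character;
    # user agents should ignore anything else.
    return re.match(r"[A-Za-z]", name) is not None
```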