HTML 4.01 Test Suite - Assertions

Testable Assertions: Section 9 Text




9 Text - Paragraphs, Lines, and Phrases

Assertion 9.1-1

Reference: Section 9.1
(informative) The document character set includes a wide variety of white space characters. Many of these are typographic elements used in some applications to produce particular visual spacing effects. In HTML, only the following characters are defined as white space characters: ASCII space (&#x0020;), ASCII tab (&#x0009;), ASCII form feed (&#x000C;), and zero-width space (&#x200B;).
Tests: None

Assertion 9.1-2

Reference: Section 9.1
(informative) Line breaks are also white space characters. Note that although &#x2028; and &#x2029; are defined in [ISO10646] to unambiguously separate lines and paragraphs, respectively, these do not constitute line breaks in HTML, nor does this specification include them in the more general category of white space characters.
Tests: None
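
The classification in Assertions 9.1-1 and 9.1-2 can be sketched in Python as a simple membership test. This is only an illustration: the names HTML_WHITE_SPACE and is_html_white_space are hypothetical and do not come from the specification or the test suite.

```python
# Characters that HTML 4.01 section 9.1 treats as white space.
# The names used here are illustrative, not taken from the specification.
HTML_WHITE_SPACE = frozenset({
    "\u0020",  # ASCII space
    "\u0009",  # ASCII tab
    "\u000C",  # ASCII form feed
    "\u200B",  # zero-width space
    "\u000D",  # carriage return (a line break, hence white space)
    "\u000A",  # line feed (a line break, hence white space)
})


def is_html_white_space(ch: str) -> bool:
    """Return True if the single character ch is HTML 4.01 white space."""
    return ch in HTML_WHITE_SPACE
```

Under this sketch, the [ISO10646] line and paragraph separators (U+2028 and U+2029) are reported as non-white space, matching Assertion 9.1-2.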

Assertion 9.1-3

Reference: Section 9.1
(author)(should) Authors should use appropriate elements and styles to achieve visual formatting effects that involve white space, rather than space characters.
Tests: None

Assertion 9.1-4

Reference: Section 9.1
(informative) For all HTML elements except PRE, sequences of white space separate "words" (we use the term "word" here to mean "sequences of non-white space characters").
Tests: None

Assertion 9.1-5

Reference: Section 9.1
(should) When formatting text, user agents should identify the sequences of non-white space characters and lay them out according to the conventions of the particular written language (script) and target medium.
Tests: None

Assertion 9.1-6

Reference: Section 9.1
(may) This layout may involve putting space between words (called inter-word space), but conventions for inter-word space vary from script to script.
Tests: None

Assertion 9.1-7

Reference: Section 9.1
(may) A sequence of white spaces between words in the source document may result in an entirely different rendered inter-word spacing (except in the case of the PRE element).
Tests: None

Assertion 9.1-8

Reference: Section 9.1
(should) User agents should collapse input white space sequences when producing output inter-word space. This can and should be done even in the absence of language information.
Tests: None
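
A minimal Python sketch of the collapsing behavior described in Assertion 9.1-8. It assumes that every run of white space becomes one ASCII space, which is a simplification: inter-word spacing conventions vary by script, and PRE content is exempt. The function name collapse_white_space is hypothetical.

```python
import re

# White space per HTML 4.01 section 9.1: space, tab, form feed,
# zero-width space, plus the line break characters CR and LF.
_WHITE_SPACE_RUN = re.compile("[\u0020\u0009\u000C\u200B\u000D\u000A]+")


def collapse_white_space(text: str) -> str:
    """Replace each run of HTML white space with a single ASCII space.

    Assumption: one ASCII space is an acceptable inter-word space.
    This must not be applied to the content of a PRE element.
    """
    return _WHITE_SPACE_RUN.sub(" ", text)
```

No language information is consulted, in line with the statement that collapsing can be done even when such information is absent.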

Assertion 9.1-9

Reference: Section 9.1
(author)(must) The PRE element is used for preformatted text, where white space is significant.
Tests: 9_1-BF-01.html

Assertion 9.1-10

Reference: Section 9.1
(author)(should) In order to avoid problems with SGML line break rules and inconsistencies among extant implementations, authors should not rely on user agents to render white space immediately after a start tag or immediately before an end tag.
Tests: None

Assertion 9.2.1-1

Reference: Section 9.2.1
(must) EM: Indicates emphasis. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-01.html

Assertion 9.2.1-2

Reference: Section 9.2.1
(must) STRONG: Indicates stronger emphasis. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-02.html

Assertion 9.2.1-3

Reference: Section 9.2.1
(must) CITE: Contains a citation or a reference to other sources. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-03.html

Assertion 9.2.1-4

Reference: Section 9.2.1
(must) DFN: Indicates that this is the defining instance of the enclosed term. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-04.html

Assertion 9.2.1-5

Reference: Section 9.2.1
(must) CODE: Designates a fragment of computer code. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-05.html

Assertion 9.2.1-6

Reference: Section 9.2.1
(must) SAMP: Designates sample output from programs, scripts, etc. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-06.html

Assertion 9.2.1-7

Reference: Section 9.2.1
(must) KBD: Indicates text to be entered by the user. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-07.html

Assertion 9.2.1-8

Reference: Section 9.2.1
(must) VAR: Indicates an instance of a variable or program argument. Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-08.html

Assertion 9.2.1-9

Reference: Section 9.2.1
(must) ABBR: Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.). Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-09.html

Assertion 9.2.1-10

Reference: Section 9.2.1
(must) ACRONYM: Indicates an acronym (e.g., WAC, radar, etc.). Phrase elements add structural information to text fragments. Start tag and end tag are required.
Tests: 9_2_1-BF-10.html

Assertion 9.2.1-11

Reference: Section 9.2.1
(must) EM and STRONG are used to indicate emphasis. The other phrase elements have particular significance in technical documents.
Tests: None

Assertion 9.2.1-12

Reference: Section 9.2.1
(should) The presentation of phrase elements depends on the user agent. Generally, visual user agents present EM text in italics and STRONG text in bold font.
Tests: None

Assertion 9.2.1-13

Reference: Section 9.2.1
(may) Speech synthesizer user agents may change the synthesis parameters, such as volume, pitch and rate accordingly.
Tests: None

Assertion 9.2.1-14

Reference: Section 9.2.1
(informative) The ABBR and ACRONYM elements allow authors to clearly indicate occurrences of abbreviations and acronyms. Western languages make extensive use of acronyms such as "GmbH", "NATO", and "F.B.I.", as well as abbreviations like "M.", "Inc.", "et al.", "etc.". Both Chinese and Japanese use analogous abbreviation mechanisms, wherein a long name is referred to subsequently with a subset of the Han characters from the original occurrence. Marking up these constructs provides useful information to user agents and tools such as spell checkers, speech synthesizers, translation systems and search-engine indexers.
Tests: None

Assertion 9.2.1-15

Reference: Section 9.2.1
(author)(must) The content of the ABBR and ACRONYM elements specifies the abbreviated expression itself, as it would normally appear in running text.
Tests: None

Assertion 9.2.1-16

Reference: Section 9.2.1
(may) The title attribute of these elements may be used to provide the full or expanded form of the expression.
Tests: None
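
As a sketch of how a tool might consume the markup described in Assertions 9.2.1-15 and 9.2.1-16, the following Python uses the standard library html.parser module to pair each abbreviated form with its title attribute. The class and function names are hypothetical, and nested ABBR or ACRONYM elements are not handled.

```python
from html.parser import HTMLParser


class AbbreviationCollector(HTMLParser):
    """Collect (abbreviated form, expansion) pairs from ABBR and ACRONYM."""

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (text, title) tuples
        self._title = None   # title attribute of the element being read
        self._text = None    # text chunks, or None when outside an element

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag in ("abbr", "acronym"):
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("abbr", "acronym") and self._text is not None:
            self.pairs.append(("".join(self._text), self._title))
            self._text = None
            self._title = None


def collect_abbreviations(markup):
    """Return a list of (abbreviated form, title or None) tuples."""
    collector = AbbreviationCollector()
    collector.feed(markup)
    collector.close()
    return collector.pairs
```

An element without a title attribute yields None as its expansion, since the attribute is optional.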

Assertion 9.2.1-17

Reference: Section 9.2.1
(informative) Abbreviations and acronyms often have idiosyncratic pronunciations.
Tests: None

Assertion 9.2.1-18

Reference: Section 9.2.1
(author)(should) When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.
Tests: None

Assertion 9.2.2-1

Reference: Section 9.2.2
(must) Start tags and end tags for quotations are required.
Tests: None

Assertion 9.2.2-2

Reference: Section 9.2.2
(should) BLOCKQUOTE and Q: cite = uri [CT]. The value of this attribute is a URI that designates a source document or message. This attribute is intended to give information about the source from which the quotation was borrowed.
Tests: None

Assertion 9.2.2-3

Reference: Section 9.2.2
(author)(must) BLOCKQUOTE and Q designate quoted text. BLOCKQUOTE is for long quotations (block-level content) and Q is intended for short quotations (inline content) that don't require paragraph breaks.
Tests: None

Assertion 9.2.2-4

Reference: Section 9.2.2
(should) Visual user agents generally render BLOCKQUOTE as an indented block.
Tests: None

Assertion 9.2.2-5

Reference: Section 9.2.2
(must) Visual user agents must ensure that the content of the Q element is rendered with delimiting quotation marks. Authors should not put quotation marks at the beginning and end of the content of a Q element.
Tests: 9_2_2-BF-01.html

Assertion 9.2.2-6

Reference: Section 9.2.2
(should) User agents should render quotation marks in a language-sensitive manner (see the lang attribute). Many languages adopt different quotation styles for outer and inner (nested) quotations, which should be respected by user-agents.
Tests: None

Assertion 9.2.2-7

Reference: Section 9.2.2
(should) In the specification's example, where the language of both quotations is American English, user agents should render them appropriately, for example with single quote marks around the inner quotation and double quote marks around the outer quotation.
Tests: None
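
The nesting rule for American English can be sketched in Python as follows. This is not a user agent implementation: it assumes every Q element is American English, ignores the lang attribute, and alternates marks for deeper nesting, which is an assumption beyond the two levels the assertion describes. The names QuoteRenderer and render_quotes are hypothetical.

```python
from html.parser import HTMLParser


class QuoteRenderer(HTMLParser):
    """Render Q elements as plain text with American English quote marks."""

    # Outer quotations use double marks, inner quotations single marks.
    MARKS = [("\u201C", "\u201D"), ("\u2018", "\u2019")]

    def __init__(self):
        super().__init__()
        self.out = []
        self.depth = 0  # number of Q elements currently open

    def handle_starttag(self, tag, attrs):
        if tag == "q":
            self.out.append(self.MARKS[self.depth % 2][0])
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "q" and self.depth > 0:
            self.depth -= 1
            self.out.append(self.MARKS[self.depth % 2][1])

    def handle_data(self, data):
        self.out.append(data)


def render_quotes(markup):
    """Return markup as text with Q elements replaced by quote marks."""
    renderer = QuoteRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.out)
```

Because the marks are generated from the markup, the source text itself carries no quotation marks around Q content, consistent with Assertion 9.2.2-5.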

Assertion 9.2.2-8

Reference: Section 9.2.2
(should) It is recommended that style sheet implementations provide a mechanism for inserting quotation marks before and after a quotation delimited by BLOCKQUOTE in a manner appropriate to the current language context and the degree of nesting of quotations.
Tests: None

Assertion 9.2.2-9

Reference: Section 9.2.2
(should) However, as some authors have used BLOCKQUOTE merely as a mechanism to indent text, in order to preserve the intention of the authors, user agents should not insert quotation marks in the default style.
Tests: None

Assertion 9.2.2-10

Reference: Section 9.2.2
(author)(deprecated) The usage of BLOCKQUOTE to indent text is deprecated in favor of style sheets.
Tests: None

Assertion 9.2.3-1

Reference: Section 9.2.3
(must) Start tags and end tags for subscripts and superscripts are required.
Tests: None

Assertion 9.2.3-2

Reference: Section 9.2.3
(should) Many scripts (e.g., French) require superscripts or subscripts for proper rendering. The SUB and SUP elements should be used to mark up text in these cases.
Tests: None

Assertion 9.3.1-1

Reference: Section 9.3.1
(must) Start tags for paragraphs are required. End tags for paragraphs are optional.
Tests: None

Assertion 9.3.1-2

Reference: Section 9.3.1
(must) The P element represents a paragraph. It cannot contain block-level elements (including P itself).
Tests: None

Assertion 9.3.1-3

Reference: Section 9.3.1
(should) Authors are discouraged from using empty P elements. User agents should ignore empty P elements.
Tests: None

Assertion 9.3.2-1

Reference: Section 9.3.2
(must) A line break is defined to be a carriage return (&#x000D;), a line feed (&#x000A;), or a carriage return/line feed pair. All line breaks constitute white space.
Tests: None
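
The definition in Assertion 9.3.2-1 can be illustrated with a short Python sketch that splits text on exactly these three forms of line break. The function name split_lines is hypothetical. Python's built-in str.splitlines is deliberately avoided here because it also splits on other characters, including U+2028 and U+2029, which are not line breaks in HTML.

```python
import re

# The CR LF pair is listed first so that it counts as one line break
# rather than as a carriage return followed by a line feed.
_LINE_BREAK = re.compile("\r\n|\r|\n")


def split_lines(text):
    """Split text at HTML 4.01 line breaks: CR LF, CR, or LF."""
    return _LINE_BREAK.split(text)
```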

Assertion 9.3.2-2

Reference: Section 9.3.2
(must) Start tags for controlling line breaks are required. End tags for controlling line breaks are forbidden.
Tests: None

Assertion 9.3.2-3

Reference: Section 9.3.2
(must) The BR element forcibly breaks (ends) the current line of text.
Tests: 9_3_2-BF-01.html

Assertion 9.3.2-4

Reference: Section 9.3.2
(must) For visual user agents, the clear attribute can be used to determine whether markup following the BR element flows around images and other objects floated to the left or right margin, or whether it starts after the bottom of such objects.
Tests: None

Assertion 9.3.2-5

Reference: Section 9.3.2
(author)(should) Authors are advised to use style sheets to control text flow around floating images and other objects.
Tests: None

Assertion 9.3.2-6

Reference: Section 9.3.2
(should) With respect to bidirectional formatting, the BR element should behave the same way the [ISO10646] LINE SEPARATOR character behaves in the bidirectional algorithm.
Tests: None

Assertion 9.3.3-1

Reference: Section 9.3.3
(must) Those browsers that interpret soft hyphens must observe the following semantics: If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character.
Tests: None

Assertion 9.3.3-2

Reference: Section 9.3.3
(should) For operations such as searching and sorting, the soft hyphen should always be ignored.
Tests: None
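
The soft hyphen semantics in Assertions 9.3.3-1 and 9.3.3-2 can be sketched in Python. The function names are hypothetical, and the choice of which soft hyphen to break at is passed in by the caller; a real user agent would derive it from the available line width.

```python
SOFT_HYPHEN = "\u00AD"


def break_at_soft_hyphen(word, which):
    """Break word at its soft hyphen number `which` (0-based).

    Returns (first_line, second_line). The first line ends with a visible
    hyphen; soft hyphens where no break occurs are not displayed.
    """
    parts = word.split(SOFT_HYPHEN)
    if not 0 <= which < len(parts) - 1:
        raise ValueError("word has no soft hyphen with that index")
    first = "".join(parts[: which + 1]) + "-"
    second = "".join(parts[which + 1:])
    return first, second


def strip_soft_hyphens(text):
    """Remove soft hyphens, for unbroken display, searching, and sorting."""
    return text.replace(SOFT_HYPHEN, "")
```

Comparing strip_soft_hyphens(a) with strip_soft_hyphens(b) gives a search or sort key that ignores soft hyphens entirely.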

Assertion 9.3.3-3

Reference: Section 9.3.3
(must) The plain hyphen is represented by the "-" character (&#45; or &#x2D;). The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;).
Tests: None

Assertion 9.3.4-1

Reference: Section 9.3.4
(must) Start tags and end tags for preformatted text are required.
Tests: None

Assertion 9.3.4-2

Reference: Section 9.3.4
(must)(deprecated) PRE: width = number [CN] Deprecated. This attribute provides a hint to visual user agents about the desired width of the formatted block. The user agent can use this information to select an appropriate font size or to indent the content appropriately. The desired width is expressed in number of characters. This attribute is not widely supported currently.
Tests: None

Assertion 9.3.4-3

Reference: Section 9.3.4
(must) The PRE element tells visual user agents that the enclosed text is "preformatted".
Tests: None

Assertion 9.3.4-4

Reference: Section 9.3.4
(may) When handling preformatted text, visual user agents: 1. May leave white space intact. 2. May render text with a fixed-pitch font. 3. May disable automatic word wrap. 4. Must not disable bidirectional processing.
Tests: None

Assertion 9.3.4-5

Reference: Section 9.3.4
(may) Non-visual user agents are not required to respect extra white space in the content of a PRE element.
Tests: None

Assertion 9.3.4-6

Reference: Section 9.3.4
(informative) The DTD fragment in Section 9.3.4 of the specification indicates which elements may not appear within a PRE declaration. This is the same as in HTML 3.2, and is intended to preserve constant line spacing and column alignment for text rendered in a fixed pitch font.
Tests: None

Assertion 9.3.4-7

Reference: Section 9.3.4
(author)(should) Authors are discouraged from altering this behavior through style sheets.
Tests: None

Assertion 9.3.4-8

Reference: Section 9.3.4
(should) The horizontal tab character (decimal 9 in [ISO10646] and [ISO88591] ) is usually interpreted by visual user agents as the smallest non-zero number of spaces necessary to line characters up along tab stops that are every 8 characters.
Tests: None
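
The tab interpretation in Assertion 9.3.4-8 amounts to a small calculation: from the current column, advance by the smallest non-zero number of spaces that reaches a multiple of 8. A Python sketch for a single line of text follows; the function name expand_tabs is hypothetical, and the input is assumed to contain no line breaks.

```python
def expand_tabs(line: str, tab_width: int = 8) -> str:
    """Expand horizontal tabs in one line to spaces, with tab stops
    every tab_width characters."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            # Always at least one space, up to the next tab stop.
            spaces = tab_width - (column % tab_width)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

A tab that falls exactly on a tab stop therefore advances a full 8 columns rather than zero.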

Assertion 9.3.4-9

Reference: Section 9.3.4
(should) Using horizontal tabs in preformatted text is strongly discouraged since it is common practice, when editing, to set the tab-spacing to other values, leading to misaligned documents.
Tests: None

Assertion 9.3.5-1

Reference: Section 9.3.5
(should) How paragraphs are rendered visually depends on the user agent. Paragraphs are usually rendered flush left with a ragged right margin. Other defaults are appropriate for right-to-left scripts.
Tests: None

Assertion 9.3.5-2

Reference: Section 9.3.5
(may) HTML user agents have traditionally rendered paragraphs with white space.
Tests: None

Assertion 9.3.5-3

Reference: Section 9.3.5
(should) Following the precedent set by the NCSA Mosaic browser in 1993, user agents generally don't justify both margins, in part because it's hard to do this effectively without sophisticated hyphenation routines. The advent of style sheets, and anti-aliased fonts with subpixel positioning promises to offer richer choices to HTML authors than previously possible.
Tests: None

Assertion 9.3.5-4

Reference: Section 9.3.5
(should) Style sheets provide rich control over the size and style of a font, the margins, space before and after a paragraph, the first line indent, justification and many other details. The user agent's default style sheet renders P elements in a familiar form.
Tests: None

Assertion 9.3.5-5

Reference: Section 9.3.5
(author)(may) One could, in principle, override this to render paragraphs without the breaks that conventionally distinguish successive paragraphs. In general, since this may confuse readers, we discourage this practice.
Tests: None

Assertion 9.3.5-6

Reference: Section 9.3.5
(should) By convention, visual HTML user agents wrap text lines to fit within the available margins. Wrapping algorithms depend on the script being formatted.
Tests: None

Assertion 9.3.5-7

Reference: Section 9.3.5
(should) In Western scripts, for example, text should only be wrapped at white space. Early user agents incorrectly wrapped lines just after the start tag or just before the end tag of an element, which resulted in dangling punctuation.
Tests: None

Assertion 9.4-1

Reference: Section 9.4
(must) Start tags and end tags for INS and DEL elements are required.
Tests: None

Assertion 9.4-2

Reference: Section 9.4
(must) INS and DEL: cite = uri [CT]. The value of this attribute is a URI that designates a source document or message. This attribute is intended to point to information explaining why a document was changed.
Tests: None

Assertion 9.4-3

Reference: Section 9.4
(must) INS and DEL: datetime = datetime [CS]. The value of this attribute specifies the date and time when the change was made.
Tests: None
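
A test harness might want to check datetime values mechanically. The format is not restated in this section; the Python sketch below assumes the form given in Section 6.14 of the HTML 4.01 specification, YYYY-MM-DDThh:mm:ssTZD, where TZD is "Z" or a "+hh:mm" or "-hh:mm" offset. The function name parse_html_datetime is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed format (HTML 4.01 section 6.14): YYYY-MM-DDThh:mm:ssTZD,
# with an uppercase "T" and TZD equal to "Z", "+hh:mm", or "-hh:mm".
_DATETIME_RE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2})"
    r"(Z|[+-][0-9]{2}:[0-9]{2})"
)


def parse_html_datetime(value):
    """Parse a datetime attribute value into a timezone-aware datetime.

    Raises ValueError if the value does not match the assumed format.
    """
    match = _DATETIME_RE.fullmatch(value)
    if match is None:
        raise ValueError("not a valid datetime value: %r" % value)
    year, month, day, hour, minute, second = (
        int(group) for group in match.groups()[:6]
    )
    tzd = match.group(7)
    if tzd == "Z":
        tz = timezone.utc
    else:
        sign = 1 if tzd[0] == "+" else -1
        offset = timedelta(hours=int(tzd[1:3]), minutes=int(tzd[4:6]))
        tz = timezone(sign * offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)
```

Two values that name the same instant in different time zones compare equal after parsing, which is convenient when ordering changes marked with INS and DEL.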

Assertion 9.4-4

Reference: Section 9.4
(author)(must) INS and DEL are used to mark up sections of the document that have been inserted or deleted with respect to a different version of a document.
Tests: None

Assertion 9.4-5

Reference: Section 9.4
(may) INS and DEL are unusual for HTML in that they may serve as either block-level or inline elements (but not both). They may contain one or more words within a paragraph or contain one or more block-level elements such as paragraphs, lists and tables.
Tests: None

Assertion 9.4-6

Reference: Section 9.4
(must) The INS and DEL elements must not contain block-level content when these elements behave as inline elements.
Tests: None

Assertion 9.4-7

Reference: Section 9.4
(should) User agents should render inserted and deleted text in ways that make the change obvious.
Tests: None

Assertion 9.4-8

Reference: Section 9.4
(author)(may) Authors may also make comments about inserted or deleted text by means of the title attribute for the INS and DEL elements. User agents may present this information to the user (e.g., as a popup note).
Tests: None