PRE Is Obsolete and Should Be Removed from the XHTML2 Structural Module
The first question that must be posed in any discussion of PRE is: what precisely constitutes, in the words of the latest editors' draft of XHTML2,
"whitespace in the enclosed text [which] has semantic relevance"? If one needs to preserve whitespace, as in an example of Python code, one should use blockcode for that purpose, as the definition of PRE itself makes clear.
If a string is explicitly designated as "code", an assistive technology can respect and report the whitespace characters it contains. That is not possible with the PRE element unless every whitespace character to be preserved is explicitly expressed as a character entity value.
This would mitigate the need to communicate the presence and number of whitespace characters in an example, across modalities, by using the character entity value of a "non-breaking space" (I am not sure which of the two candidate values is right: the first is from an online list, the second from XML 1.0, Fourth Edition). If the whitespace has "semantic meaning", then the amount of whitespace MUST be explicitly expressed with markup, or the containing element MUST respect and not conflate whitespace in strings marked with the BLOCKCODE element. Only then is the number of whitespace characters perceivable (theoretically definable on a per-element basis through the "layout" attribute, which is discussed further below), and only then can a human or an assistive technology process the precise number of whitespace characters correctly.
Therefore, best practice would be to use the non-breaking space's character entity value for each whitespace character the author wishes to preserve. Variable space delineation is a hallmark of some programming languages, and a user needs to be able to count the whitespace characters, or to have them rendered correctly when sent to a refreshable braille display or embossed into tactile braille.
Ideally, one would use a character entity code to indicate a TAB or multiple TABs, as some programming languages use TAB delineation, and the purpose of BLOCKCODE is to ENSURE that any user can copy and paste the sample of CODE with whitespace and explicitly declared TABs preserved.
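The idea of making each space and TAB explicit can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names encode_whitespace and decode_whitespace are hypothetical, and the numeric references &#160; (U+00A0, no-break space) and &#9; (U+0009, TAB) are chosen here purely for illustration.

```python
import html

# Assumed references for this sketch: U+00A0 NO-BREAK SPACE and U+0009 TAB.
NBSP_REF = "&#160;"
TAB_REF = "&#9;"


def encode_whitespace(sample: str) -> str:
    """Escape markup characters, then make every space and TAB explicit."""
    # Escape first, so the ampersands in the references added below
    # are not themselves escaped.
    escaped = html.escape(sample, quote=False)
    return escaped.replace(" ", NBSP_REF).replace("\t", TAB_REF)


def decode_whitespace(markup: str) -> str:
    """Reverse the encoding; no-break spaces become plain spaces again."""
    # Lossy if the original sample itself contained U+00A0 characters.
    return html.unescape(markup).replace("\u00a0", " ")


# A small code sample indented with four spaces on one line and a TAB on another.
sample = "if x:\n    return 1\n\tpass"
encoded = encode_whitespace(sample)
restored = decode_whitespace(encoded)
```

Decoding the references yields the original sample, which is the copy-and-paste property described above. Note that &#9; decodes to an ordinary TAB character, so whether it survives rendering still depends on how the containing element handles whitespace.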
Rationale for the Obsolescence of PRE
Since the following is explicitly stated in the Structural Module of the current editors' draft of XHTML2:
Note that while historically one use of the pre element has been as a container for source code, the new blockcode element [is] more appropriate for that.
and since the "bad poem" example in the definition of PRE should be controlled by style sheets, by the "layout" attribute (set to relevant), or by the xml:space attribute, defined in Section 2.10 ("White Space Handling") of the Fifth Edition of XML 1.0:
A special attribute named xml:space may be attached to an element to signal an intention that in that element, white space should be preserved by applications. In valid documents, this attribute, like any other, must be declared if it is used. When declared, it must be given as an enumerated type whose values are one or both of "default" and "preserve". For example:
<!ATTLIST poem xml:space (default|preserve) 'preserve'>
and since the "conformance definition" for whitespace in XHTML2 states:
White space must be handled according to the rules of [XML]. All XHTML 2 elements preserve whitespace.
The user agent must use the definition from CSS for processing white space characters [CSS3-TEXT].
PROPOSED 1: That the PRE element is no longer necessary and should therefore be removed from the XHTML2 Structural Module.
CAVEAT 1.1: At the VERY least, PRE should be deprecated into a legacy module, but removing it altogether will eliminate future headaches and break authors of the habit of using PRE as a lazy catch-all solution. Superior mechanisms, such as BLOCKCODE, have been introduced, and CSS is widely supported for controlling columnization, which eliminates another abuse of PRE.
PROPOSED 2: That the definition of BLOCKCODE be amended to indicate that whitespace, line breaks, and other "layout" components contained within BLOCKCODE are intended to be preserved, and that all current references to the PRE element be removed. An authoring tool could very easily translate individual spaces and other delimiters (such as TAB) into the UTF-8 codes for a non-breaking space and/or a TAB or TABs. A quality authoring tool set to encode in UTF-8 would also automatically substitute, for "special" characters contained within BLOCKCODE, their Unicode values expressed as character entity values, so that each delimiter can be faithfully and confidently communicated to the user. This would facilitate the retention of whitespace and TABs when the string contained within
PROPOSED 3: That the "bad poem" example be changed to reflect Section 2.10 of XML 1.0 Fifth Edition through the use of the xml:space="preserve" attribute, as follows:
<!-- begin 'bad poem' example -->
<!-- in HEAD -->
<!ATTLIST poem xml:space (preserve) #FIXED 'preserve'>
<!-- in BODY -->
<!-- ... -->
<p xml:id="poem" layout="relevant" xml:space="preserve">
If I had any talent I would be a poet
</p>
<!-- end 'bad poem' example -->
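As a rough check on how an XML processor treats an element marked this way, the following Python sketch parses a similar fragment with the standard library's ElementTree. It is illustrative only: the line layout of the poem is invented for the sketch, and the "layout" attribute and the DTD declaration are left out. The parser passes all whitespace in the element's content through to the application, and the xml:space attribute is visible as a signal of intent for whatever consumes the document next.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the Namespaces in XML spec.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Illustrative fragment; the indentation of the poem is invented here.
fragment = (
    '<p xml:space="preserve">If I\n'
    '    had any\n'
    '        talent</p>'
)

p = ET.fromstring(fragment)

# ElementTree exposes xml:space under the XML namespace URI.
space_mode = p.get(f"{{{XML_NS}}}space")

# The element's character data, with spaces and line breaks intact.
text = p.text
lines = text.split("\n")
```

Here space_mode holds the declared intent and lines retains the leading spaces of each line, so a downstream renderer or assistive technology has the exact whitespace available to it.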
PROPOSED 4: That the "bad poem" example be removed from the draft altogether, and that the use of spacing for stylistic effect be covered elsewhere in the document (although this could also be, and perhaps should be, handled by a pass-off to a CSS recommendation).