Accessibility, and Separating Form from Content

Joseph Scheuhammer, Adaptive Technology Resource Centre.
July 6, 2000.

Abstract

XML is an emerging standard. A major reason for its creation was the separation of "data" from "presentation" in order to make documents portable and platform independent. This document discusses the data/presentation dichotomy with an eye towards its relevance to accessibility.

Introduction

The main reason for XML is to separate "data" clearly from "presentation", or, to put it another way, to create portable textual data. Portable, structured text was the original intent of HTML: as an application of SGML, HTML was meant to provide a set of elements that would define the structure of a web document. But, as HTML matured, elements of the wrong kind were added, ones whose purpose was entirely format oriented.

This paper refers to the data/presentation distinction using the vernacular terms "content" vs. "form". In order to get a feel for what the difference is, consider the sentence, "The sun is shining". The content of that sentence says something about a celestial object, and has implications about the weather. But, that content is distinct from how it is presented: one could easily re-format the sentence in bold face for emphasis, or perhaps in red. In all three cases (plain, bold, and red), the content remains the same, but is displayed in a different textual format.

The development of XML began from the premise that form and content should be decoupled, and that XML would be used primarily to encode content. While mention is made in this document of XML, most of the discussion will centre around form and content, and how that relates to accessibility.

From the point of view of accessibility, the main reason to promote the separation of content from form is that it has the potential to ease, and in some cases solve, problems of accessibility. When content is clearly distinct from its format, the self-same content can be presented in numerous ways. Consider that a standard problem of accessibility is: how does one acquire and present information to persons with disabilities? To take a concrete example, how does one convey information presented in a visual medium to a blind person? The answer is implied by the question: it is the medium that is visual, not the information. Or, to put it another way, the information content is one thing, and the means of its presentation, another. Once one has gained unequivocal access to the content, that content can be presented via a variety of media. But, in order to do this, it is requisite that the content be clearly separated from its format.

We are all blind

To paraphrase Professor Gregg Vanderheiden, of the TRACE centre, University of Wisconsin: "There are two kinds of blind people; those who can't see, and those who can". This apparently self-contradictory statement was made at a discussion panel at W8, the eighth annual conference of the world wide web. Half of the panel were people interested in, and working towards, making the web accessible. The other half of the panel were people who designed user interfaces for mobile devices. Such devices are limited in their screen size, and are manipulated by pressing buttons on a keypad. It is a challenge, to say the least, to present web content on such a device, and allow meaningful interactions with it. Vanderheiden was making the point that when driving at the speed limit down a four-lane highway, it is highly undesirable that the driver pay much visual or motor attention to their cell phone. They should be looking at the road, with both hands firmly on the wheel. In such a scenario, the driver is effectively blind.

The real point here is that the problem of accessibility is a universal one. It is not relevant just in the case of people with disabilities. The problem of how to provide access to information, say for a blind person, frequently resembles the more general problem of how to provide access to information in other limiting contexts. The problems are the same.

Here are two ways of describing the benefits of separating form from content, one specifically with regard to accessibility, the other with respect to a general user interface issue. I leave it to the reader to decide whether these are precisely identical.

From an accessibility point of view, there is a need to design hypermedia content in a way that allows someone with a disability (e.g., blindness) to interact easily with that media.

From a general interface point of view, there is a need to design hypermedia content in a way that allows it to be presented on a variety of different devices, with different capabilities.

Solution: Separate Form from Content

Part of the solution to both of these problems is the clean separation of form from content. Furthermore, if accessibility is simply a special instance of a universal problem, then the separation of form and content should benefit all. This section discusses the ideas of form and content in greater detail, and why separating them is a Good Thing™.

What is Content?

It was noted earlier that HTML began as a content markup language, whose purpose was to delineate the text of a document according to its logical structure. This is exemplified by the header elements, H1 through H6. As an example, here are two such elements, first as they appear in the markup,

<H1>This is a header level one</H1>
<H2>This is a header level two</H2>

and then as they appear in a web browser:

This is a header level one

This is a header level two

The different heading texts are formatted using different font sizes, weights, and alignments (centred vs. flush left), and so on. But, if you look at the tag, all it really indicates is that the text within is a heading. The different types of header specify levels of importance -- H1 is a major header, H2 is a sub-header, and so on. The Hn tag does not say anything remotely as specific as, "render the text as 18 point bold". Instead, header elements declare the headings or major sections of the text, which is a logical kind of markup; they say nothing about how the text they surround is displayed.

This is good because it forestalls inferences. That is, by being non-committal in terms of how the text will look, the markup states only the category that the text belongs to. This allows one to build devices that are sensitive to such categories even in cases where format is irrelevant. However, if the markup goes further, and does specify the exact format, then the category information is only implicit, and it becomes more difficult to recover it. It must be inferred from the format, and that inference may not always be sound.

Unfortunately, over the course of its development, people insisted on adding elements to HTML for purely formatting reasons.

What is Format?

Initially, there was a pair of HTML elements whose purpose was to indicate that some portion of the text required emphasis; and this pair of elements allowed for two degrees of emphasis. The elements in question are the EM and STRONG elements. At a later stage, the elements for italic and bold text were added; these are the I and B elements. Here is an example of all of them, again as straight markup,

<EM>This is emphasized text.</EM>
<STRONG>This is strongly emphasized text.</STRONG>
<I>This is italic text.</I>
<B>This is bold text.</B>

and as rendered text:

This is emphasized text.
This is strongly emphasized text.
This is italic text.
This is bold text.

Now, the I element means, "render as italic text"; and the B element, "render as bold". Obviously, these are both formatting commands. However, a glance at the EM and STRONG elements reveals that they are also rendered as italic and bold, respectively. What is the difference?

The difference is that if the markup is EM or STRONG, then the text is declared as requiring emphasis, but only that. It does not go further to declare how the text should be emphasized. On the other hand, while using I or B does emphasize the text, it does so in a purely visual way. "Italic" and "bold" are typesetting directives for printed versions of the text. What if the markup were passed to a voice synthesizer to speak the text? How does one speak text that is bold? Using B commits one to a specific way of rendering the text, namely a visual one, when what is really desired is a declaration that this text is somehow different, leaving the expression of that difference to vary depending on the medium in which it is rendered. Indeed, even within a medium, there may be constraints, such as screen real estate, that alter how emphasized text can be drawn. Thus, the use of EM and STRONG elements is superior in the sense that they describe the content without committing it to a specific format.

Style Sheets

Okay, so we have all agreed to use only elements of HTML, and in the future, XML, that declare the structure and content of our documents. Still, we do want to present them somehow, perhaps on a desktop personal computer, or a mobile phone, or both. In either case, we are forced ultimately to render the document somehow, and to format that content in some way. How can this be done? Better still, how can the same markup be rendered in completely different ways?

The short answer is: style sheets. Style sheets are companion documents, residing in files separate from the content, that describe how that content is to be rendered. There are a number of style sheet solutions in the works; here I will describe (briefly) the notion of cascading style sheets (CSS).

Cascading style sheets are essentially declarations of the style (font, size, etc.) to associate with different elements of the markup. The style can be "global" in the sense that the style sheet declares how all elements of a certain type are to be rendered. For example, it could declare that all EM elements be rendered in an italic face. Or the style can be minutely "local" in that a specific instance of an element is rendered in a certain way. CSS is very flexible, and, by using it, one can make one's HTML documents exceedingly rich, without sacrificing the specification of those documents' content.
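
To make the global/local distinction concrete, here is a minimal sketch of such declarations. The particular values, and the ID "caution", are assumptions chosen for illustration; they do not come from any real style sheet.

```css
/* "Global" rule: applies to every EM element in the document. */
EM { font-style: italic }

/* Also global: every level-one heading. The markup says only "heading";
   the look (here 18 point bold, centred) is supplied by the style sheet. */
H1 { font-size: 18pt; font-weight: bold; text-align: center }

/* "Local" rule: applies only to the one element whose ID is "caution"
   (a hypothetical ID), i.e. <EM ID="caution">...</EM> in the markup. */
EM#caution { font-style: normal; color: red }
```

Note that none of these rules touches the markup itself: the document still says only "this is a heading" or "this is emphasized", and a different style sheet could render the same elements quite differently.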

As an example, here is the beginning of a document that uses CSS in the extreme to present a set of paragraphs in different fonts, sizes, and overlapping one another in various ways. The styles were taken from the W3C's "Cascading Style Sheets" web site and the complete document can be found there.


W3C

Cascading Style Sheets

(This page is intended to be viewed with using CSS style sheets, but it need not, of course)

What's new?

Learning CSS

CSS Browsers

Authoring Tools

Specs

CSS1 Test Suite

W3C Core Styles

CSS Validator

And also: SAC, developing CSS3, translations.

Cascading Style Sheets (CSS) is a simple mechanism for adding style (e.g. fonts, colors, spacing) to Web documents. For background information on style sheets, see the Web style sheets resource page. Discussions about CSS are carried out on the www-style@w3.org mailing list and on comp.infosystems.www.authoring.stylesheets.

What's new?

...


By way of comparison, rendered without the style sheet information the page shows exactly the same text, in the same order, in the browser's default formatting. Only the fonts, sizes, and positioning change; the content does not.


As the above examples show, one can design apparently graphic-intense web sites even when restricting oneself to content markup. In other words, no format specific HTML elements are used in the markup of the document since the format information is taken from separate style documents. Furthermore, using style sheets increases the likelihood that the document content will be accessible, since that content is readily available. An audio presentation of the content, for example, can simply ignore the visual style sheets as irrelevant and recite the content appropriately. And, it can do so because the content is now separate from its format.
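
The idea that each kind of renderer takes only the styles relevant to it can be sketched with the media groupings defined in CSS2. The selectors and values below are illustrative assumptions, not the styles of the W3C page shown above.

```css
/* Desktop display: generous, centred headings. */
@media screen {
  H1 { font-size: x-large; text-align: center }
  P  { margin-left: 10%; margin-right: 10% }
}

/* Small hand held display: the same elements, drawn compactly. */
@media handheld {
  H1 { font-size: medium; font-weight: bold; text-align: left }
  P  { margin-left: 0; margin-right: 0 }
}

/* Speech synthesizer: no fonts or margins at all, only voice properties. */
@media aural {
  H1 { pitch: low; pause-before: 1s }
  P  { pause-after: 500ms }
}
```

A renderer for one medium skips the blocks written for the others, so a single document can carry all three presentations without any change to its markup.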

Still, if it is possible to provide visual style information separate from some content, why not provide the analogous audio styles? In point of fact, an audio CSS specification has been developed. In this case, the style sheets give directives on how to vary the presentation of content when using an audio medium. Here is an example section of an audio style sheet that describes how to present headings and paragraphs:

H1, H2, H3, H4, H5, H6 {
	voice-family: paul;
	stress: 20;
	richness: 90;
	cue-before: url("ping.au")
}
P.heidi { azimuth: center-left }
P.peter { azimuth: right }
P.goat  { volume: x-soft }

According to this fragment of the style sheet, all heading text is to be spoken with the voice "paul". This designates a specific voice on a synthesizer, much like a font family does in the case of the visual formatting of text. The stress level is set to "20" on a scale of 0 to 100 whose default is 50, so headings are spoken in a relatively flat tone: stress governs the height of the peaks in the voice's intonation. The richness style indicates how rich, or bright, the voice should be; at "90" the voice is very rich and carries well. Finally, before speaking a heading, the renderer is directed to play a ping sound effect, providing another cue that the text is a heading.

With respect to the presentation of paragraphs, there are three specific kinds, classified as "heidi", "peter", and "goat". These are in addition to standard vanilla paragraphs. Paragraphs of type "heidi" should sound like they are coming from slightly left of centre, whereas "peter" paragraphs should come from the right. Paragraphs of type "goat" should be spoken extra quietly.

A complete audio style sheet would, of course, specify the audio styles for all the content elements. It is hoped that this fragment will give the reader a sense of how such styles could be employed to express the content of an HTML document using an audio medium. For further details concerning audio styles, the reader is invited to visit the URL for Aural Cascading Style Sheets.
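
The same approach suggests one possible answer to the earlier question of how to speak emphasized text. The following is a minimal sketch, assuming a synthesizer that honours the CSS2 aural properties; the numbers are illustrative guesses on the 0 to 100 scales, not recommended values.

```css
/* On screen, the two degrees of emphasis might be shown by typeface. */
@media screen {
  EM     { font-style: italic }
  STRONG { font-weight: bold }
}

/* In speech, the same two degrees are expressed by the voice instead:
   more pronounced intonation for EM, and more again (and louder) for STRONG. */
@media aural {
  EM     { stress: 60; pitch-range: 60 }
  STRONG { stress: 90; pitch-range: 60; volume: loud }
}
```

Because the markup declares only that the text is emphasized, each medium is free to express that emphasis in its own terms. Text marked with I or B offers no such hook, since "italic" and "bold" have no spoken equivalent.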

Conclusions

The separation of form from content is beneficial in a general sense in the context of designing interfaces to web content, and, specifically, as a way of making such content accessible. By using structural markup exclusively to delineate sections of one's documents, and avoiding the use of any format markup, there is a greater opportunity to deploy that content in a wider context, be it desktop personal computers, smaller handheld devices, or intermediate sized internet appliances. The use of style sheets permits the presentation of the same content in this variety of contexts, and simultaneously opens the door to alternative interfaces for those with disabilities.


Copyright © 2000 Adaptive Technology Resource Centre, University of Toronto.

Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.

Updated: 2000 Jul 07 JS
