Formatting model for SGML/HTML style sheets

The `stream of chunks' model

For the moment, the formatting model is as simple as possible. The formatter accepts a stream of words, spaces, vertical skips, paragraph parameters, and inline objects, collectively known as chunks. The formatter fills lines with these chunks (left to right or right to left, depending on the text), and when a line is full it justifies that line and starts the next one.
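The line-filling step can be illustrated with a minimal sketch. It assumes left-to-right text and integer widths, and it leaves out justification, paragraph parameters, and inline objects. The names `Word`, `Space`, `VSkip`, and `fill_lines` are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Word:
    text: str
    width: int  # assumed: width in arbitrary units


@dataclass
class Space:
    width: int = 1


@dataclass
class VSkip:
    height: int  # this sketch only uses a skip to end the current line


Chunk = Union[Word, Space, VSkip]


def fill_lines(chunks, line_width):
    """Greedily pack words into lines no wider than line_width."""
    lines = []         # finished lines, each a list of word texts
    current = []       # words on the line being filled
    used = 0           # width consumed on the current line
    pending_space = 0  # space to add before the next word, if it fits
    for chunk in chunks:
        if isinstance(chunk, Space):
            if current:  # spaces at the start of a line are dropped
                pending_space = chunk.width
        elif isinstance(chunk, Word):
            needed = used + pending_space + chunk.width
            if current and needed > line_width:
                # The line is full: close it and start the next one.
                lines.append(current)
                current, used = [chunk.text], chunk.width
            else:
                current.append(chunk.text)
                used = needed
            pending_space = 0
        elif isinstance(chunk, VSkip):
            # A vertical skip ends the current line, if there is one.
            if current:
                lines.append(current)
            current, used, pending_space = [], 0, 0
    if current:
        lines.append(current)
    return lines
```

A real formatter would also distribute the leftover width of each closed line over its spaces in order to justify it.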

In addition to the main stream, there are a left and a right track, mainly for floating figures. There is some synchronization between these three tracks:

When the stream of chunks switches from the main track to one at the side, the first chunk in the side track will start at the same y-position as the current line in the main track, or lower.
A paragraph in the main track can be forced to start below the last figure in the left, right, or both tracks.
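The two synchronization rules above might be sketched as follows. This is only one possible reading: it assumes that y grows downward and that each side track is a simple stack of figures. The class `Tracks` and its methods are hypothetical names.

```python
class Tracks:
    """Bookkeeping for the main track and the two side tracks."""

    def __init__(self):
        self.main_y = 0  # y-position of the current line in the main track
        self.bottom = {"left": 0, "right": 0}  # bottom of last figure per side

    def advance_main(self, height):
        """Move the main track down by one line of the given height."""
        self.main_y += height

    def place_float(self, side, height):
        """Place a figure in a side track and return its top y-position.

        The figure starts at the current main line, or lower if an
        earlier figure in the same track still occupies that position.
        """
        top = max(self.main_y, self.bottom[side])
        self.bottom[side] = top + height
        return top

    def clear(self, sides):
        """Force the next paragraph below the last figure in the given tracks."""
        for side in sides:
            self.main_y = max(self.main_y, self.bottom[side])
        return self.main_y
```

For example, after `advance_main(10)`, a call to `place_float("left", 50)` returns 10, and a later `clear(["left"])` moves the main track down to 60.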

Two more tracks are for page headers and footers (or for non-scrolling areas, like HTML3's <BANNER>), making a total of five areas. The maximum size of each of them is left to the implementation: a hard-copy program may fail if one of them becomes too large, whereas an interactive program may provide scrollbars in such cases.

For tabular layout, the model is insufficient and a concept of tables with nested frames is needed. Each frame is a simplified version of the global frame, with the same formatting model except that there are no left and right tracks. Each frame exports some information about itself, in particular its current size and its minimum and maximum needed width.

Because table cells will probably also have to be aligned on decimal points or other letters, the frame must also export the horizontal position of the first such letter that it contains.
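To make the exported information concrete, here is a small sketch under strong assumptions: every character is one unit wide, the minimum width is taken to be the longest word, and the maximum width is the text set on a single line. The text above does not prescribe these definitions. The functions `frame_metrics` and `align_column` are hypothetical.

```python
def frame_metrics(text, align_char="."):
    """Return the widths and alignment position a frame might export."""
    words = text.split()
    joined = " ".join(words)
    pos = joined.find(align_char)
    return {
        # Assumed: the frame cannot be narrower than its longest word.
        "min_width": max((len(w) for w in words), default=0),
        # Assumed: the frame never needs more than one unbroken line.
        "max_width": len(joined),
        # Horizontal position of the first alignment character, if any.
        "align_pos": pos if pos >= 0 else None,
    }


def align_column(cells, align_char="."):
    """Pad cells on the left so that their alignment characters line up.

    A cell without the character is aligned as if the character
    followed its last letter (an assumption of this sketch).
    """
    positions = []
    for cell in cells:
        pos = cell.find(align_char)
        positions.append(pos if pos >= 0 else len(cell))
    offset = max(positions, default=0)
    return [" " * (offset - p) + c for p, c in zip(positions, cells)]
```

With these definitions, `align_column(["1.5", "12.25", "100"])` yields `["  1.5", " 12.25", "100"]`, so the decimal points fall in the same column.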

Alternative models

Other models are possible as well. One candidate is the boxes & glue model of TeX, which provides just two kinds of boxes, horizontal and vertical, that can be nested or stacked in various ways with stretchable glue between them.
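The idea of stretchable glue can be shown with a simplified sketch of setting one horizontal box to a target width. It models only stretching, not shrinking or the different orders of infinity that TeX glue has. The names `Box`, `Glue`, and `set_hbox` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    width: float


@dataclass
class Glue:
    natural: float        # width when neither stretched nor shrunk
    stretch: float = 0.0  # relative willingness to grow


def set_hbox(items, target_width):
    """Return the x-position of each Box after stretching the glue."""
    natural = sum(
        i.width if isinstance(i, Box) else i.natural for i in items
    )
    total_stretch = sum(i.stretch for i in items if isinstance(i, Glue))
    extra = target_width - natural
    if extra < 0:
        raise ValueError("overfull box: shrinking is not modelled here")
    # Each glue grows in proportion to its share of the total stretch.
    ratio = extra / total_stretch if total_stretch > 0 else 0.0
    positions = []
    x = 0.0
    for item in items:
        if isinstance(item, Box):
            positions.append(x)
            x += item.width
        else:
            x += item.natural + ratio * item.stretch
    return positions
```

For three boxes of width 10 separated by glue with stretch 1 and 3, setting the row to width 50 places the boxes at 0, 16, and 40: the second glue absorbs three times as much of the extra space as the first.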

The DSSSL approach of providing a set of higher-level boxes for specific purposes is also possible. In DSSSL there are boxes, called `flow objects', for list items (containing a label and a text), normal paragraphs, rules, leaders, fractions (with a numerator and denominator), etc. Each flow object has its own set of parameters.

Clearly, flow objects are less flexible than the more primitive boxes of TeX, but they may be easier for a designer to use. In terms of implementation, they require more code, but they make testing for nonsensical layouts easier.

Other media

The style sheet language should also be able to express the `layout' of the document on media other than a computer screen, such as paper (with pages and page numbers), a speech generator, or a braille page.

The `formatting model' for the latter two media is probably much simpler, because there are fewer dimensions or because concepts of `nesting' have no equivalent. The simple `stream of chunks' model may be adequate here, but with a different set of properties to apply to each chunk and to the space between them.

Bert Bos, 30 May 1995