HTML Extensions
To Improve Layout Control for Viewing and Printing Documents

Sylvan Butler and Roberta MacMillan, Hewlett-Packard
Stephen Waters, Microsoft

Last Modified: April 23, 1996


Contents

Introduction

Flow

Vertical Spacing

Adornments

Page Concepts

Conditional


Introduction

Abstract

Looking at the layout effects possible with contemporary desktop publishing applications can be rather overwhelming. While much of what exists is far beyond the current capability of HTML, or the needs of the majority of web publishers, there are some effects which are frequently needed and obviously lacking in the creations of web authors today.

This document is a discussion of features culled from the layouts of many current web and printed pages. It is not meant to be a complete proposal or specification. It is meant to stimulate discussion into areas where improvements can and need to be made.

Organization

The following sections each cover a specific layout need. In general, the first part of each section discusses how a similar effect could be achieved on the web today, and the final part describes what we would like to see. The features were filtered through the various proposals documented on the W3C pages; however, it is entirely possible, even probable, that one or more of these features is already proposed in another document.

Throughout this document, the terms "page" and "region" are used. Region is defined in the section regarding Column and Callout (region) Formatting. Page is typically used to refer to the content that would fit on a piece of paper as printed, or to a single HTML file, but when this document refers to content flowing onto or into a "page", "region" can be used interchangeably.

Back to Contents

Column and Callout (region) Formatting

Workaround for today

Rectangular callouts can be simulated by using tables, frames, or inline images. Tables and frames are simulations, not true callouts, because the text is not flowed around them in one piece; instead, the author chops it into pieces and places them in the frames or cells surrounding the callout.

The border for table cells can be hidden and the cell padding and cell spacing can be set to 0, giving the look of even line spacing across cells. The text for frames can be restricted to avoid unexpected flows due to resizing of windows, but only at the cost of using scroll bars for navigation, and the borders of the frames will always visibly separate sections.

Creating the callout as an image and embedding it inline will allow for flow around it, but while the typeface and size of the callout can be fixed, that of the surrounding text cannot.

To give the effect of a non-rectangular shape, one can use special alignment or preformatted text. This allows the callout shape to be independent of window sizing; however, this method doesn't allow text to flow freely around the callout.

To allow both non-rectangular shapes and controlled flow around the callout, the most popular method available today is to use a non-HTML document with its specific viewer. The only alternative is to present the entire page as a bitmapped image.

What we'd like to see

A capability based on the draft document "Frames using Style Sheets" but with the added capability to specify non-rectangular regions; perhaps using the "area" tag concept from "Client Side Image Maps". The successive area tags define regions which are filled with the specified element(s) in the order given. If an element is too large, or more element(s) exist than will fit into the specified region(s), then the excess will not be displayed. Touching regions are treated as one large region.

To visualize the effect, imagine the defined regions as filled black areas on an image that is otherwise entirely white. The specified content will be flowed into each black area in normal reading order as if dumped sequentially into each. Other elements not specified to be contained by the region are flowed around the region, respecting the specified borders and margins.

It might be nice to use such an image to specify the region, rather than a list of vertices. Using an image would allow the easy specification of arbitrarily shaped (smooth) curves, etc. No matter how the regions are specified, the flow of content into pages and regions needs careful definition.
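
The fill order described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each region is reduced to the number of lines it can hold, the content is already broken into lines, and the name flow_into_regions is hypothetical. Real regions would need geometry, borders and margins.

```python
def flow_into_regions(lines, capacities):
    """Fill each region in the order given; return (regions, excess).

    lines:      content already broken into lines (an assumption)
    capacities: how many lines each successive region can hold
    """
    regions = []
    pos = 0
    for cap in capacities:
        # Dump the next run of lines into this region.
        regions.append(lines[pos:pos + cap])
        pos += cap
    # Whatever is left does not fit anywhere and would not be displayed.
    return regions, lines[pos:]


regions, excess = flow_into_regions(["a", "b", "c", "d", "e"], [2, 2])
print(regions)  # lines placed in each region, in reading order
print(excess)   # lines that would not be displayed
```

The excess is returned rather than silently dropped so that a user agent could, for example, warn the author about it.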

Widow and Orphan Control

Workaround for today

Widow and orphan lines are lines at the end or beginning of a paragraph which are left by themselves at the top or bottom of a page or region due to dynamically generated breaks. One can attempt to control them manually today in HTML with line break tags <BR>, no break tags <NOBR> and word break tags <WBR>. However, determining whether the document will print with widow or orphan lines is difficult, and is only possible using a print preview feature (assuming one exists in the user agent).

What we'd like to see

Layout control which causes the specified element to be formatted so as to prevent widow and orphan lines with the specified number of lines or fewer. The default is 1 line. This formatting is accomplished by forcing the automatically generated page (or region) break to occur earlier or later than normal to keep the lines together. If the next page or region will not hold more than the specified number of lines, the user agent is free to strand lines as it sees fit, so long as the content remains visible to the extent permitted by the total available area in the two regions.
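
As a rough sketch of the rule above, the following Python function decides how many lines of a paragraph stay on the current page. The function name and parameters are hypothetical, the sketch only moves the break earlier (never later), and it assumes the next page or region is large enough, so the "free to strand lines" case is not modelled.

```python
def lines_before_break(total_lines, room, limit=1):
    """Return how many lines of a paragraph stay on the current page.

    total_lines: lines in the paragraph
    room:        lines that still fit on the current page or region
    limit:       a group of this many lines or fewer must not be stranded
    """
    if total_lines <= room:
        return total_lines  # the whole paragraph fits, no break needed
    head = room
    tail = total_lines - head
    if tail <= limit:
        # Break earlier so that limit + 1 lines move to the next page.
        head = total_lines - (limit + 1)
    if head <= limit:
        # Too few lines would stay behind: move the whole paragraph.
        head = 0
    return head
```

With the default limit of 1, a ten-line paragraph with room for nine lines is broken after the eighth line, so that two lines rather than one start the next page.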

Vertical Justification

Workaround for today

No dynamic vertical control exists in HTML. Some static vertical control can be attained using line breaks or line-height.

What we'd like to see

Layout control which causes the specified element to be formatted with line spacing (leading) evenly increased so as to fill the vertical area available.

Note: should spacing only increase, or may it also decrease?
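
One possible reading of this control, sketched in Python: the spare vertical space is divided evenly among the gaps between lines. The function name is hypothetical, all lines are assumed to share one natural height, and the sketch takes the "increase only" answer to the question above as an assumption.

```python
def justified_leading(available, line_count, natural):
    """Return the baseline-to-baseline distance that fills `available`.

    available:  height of the area to fill
    line_count: number of lines in the element
    natural:    normal line height (same units as `available`)
    """
    if line_count < 2:
        return natural  # a single line has no gaps to stretch
    extra = available - line_count * natural
    if extra <= 0:
        return natural  # assumption: spacing is never decreased
    # Spread the spare space over the gaps between consecutive lines.
    return natural + extra / (line_count - 1)
```

For five lines with a natural height of 12 in an area 100 high, the 40 spare units are split over four gaps.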

Fixed and Floating Line Spacing

Workaround for today

Only the floating model is currently available. Fixed spacing can be simulated by embedding transparent images of specific heights with each line break. This method, however, does not allow the reflowing of text with changes in window size, nor does it address the issue of font size changes. The style sheet proposal does add LINE-HEIGHT, which may be sufficient.

What we'd like to see

Layout control which causes the specified element to be formatted with line spacing fixed to the specified height, or (default) free to float to provide the proper leading for the tallest element on any given line.
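
The difference between the two models can be illustrated with a small Python sketch. The function name is hypothetical, and each line is represented simply as a non-empty list of the heights of the elements on it.

```python
def line_heights(lines, fixed=None):
    """Return the height given to each line.

    lines: one list per line, holding the heights of its elements
    fixed: a fixed line height, or None for the floating default
    """
    if fixed is not None:
        # Fixed model: every line gets the specified height.
        return [fixed for _ in lines]
    # Floating model: each line is as tall as its tallest element.
    return [max(heights) for heights in lines]


sample = [[10, 12], [10, 24], [10]]
print(line_heights(sample))            # floating
print(line_heights(sample, fixed=14))  # fixed
```

Note that in the fixed case an element taller than the specified height (the 24 above) would overlap its neighbours; how a user agent should treat that is left open here.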

Change Bars

Workaround for today

The intent of the change bar is to indicate what portion of the document has been modified without interrupting the flow or changing the formatting. The document could be considered complete with the change bars removed. The difficulty in creating change bars in HTML is that the concept of a "line" is absent. As the window resizes, the line break may shift, and possibly invalidate the change bar. Embedded graphics surrounding changed text can actually change the flow of the document as would any preformatting.

One cumbersome way of assuring that change bars remain with the appropriate text today is to preformat all of the text, or place the text in table cells, allowing the leftmost table cell to contain a change bar. That way the appropriate line can be isolated and marked as changed. Another way of marking changed text is to use color. Unfortunately, color does not transfer well to monochrome printers, and it adds significant overhead to a print job which is meant not for final distribution but for further editing. The easiest solution today is to create the page with an application which allows for change bars and require the use of a specific viewer.

What we'd like to see

Layout control which causes the specified element to be formatted with visible "change bars", in whatever manner defined in the style sheet. The default would be to put a vertical bar at the left of the display area on every line which contains any part of the element. Possible options, in addition to the typical text decorations, include specifying a character to substitute for the vertical bar, or an image to use instead.
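
The default behavior can be sketched in Python to show why the bar must be computed after line breaking rather than placed by the author. The function name is hypothetical, the text is modelled as a list of words, the changed element as a set of word positions, and the line breaking is a simple greedy wrap by character count.

```python
def wrap_with_change_bars(words, changed, width, bar="|"):
    """Greedy-wrap words to `width` and mark lines holding changed words.

    words:   the text as a list of words
    changed: set of indexes into `words` that belong to the changed element
    bar:     character used for the change bar (may be substituted)
    """
    lines = []  # (text, has_change) pairs
    current, flag = [], False
    for i, word in enumerate(words):
        if current and len(" ".join(current + [word])) > width:
            lines.append((" ".join(current), flag))
            current, flag = [], False
        current.append(word)
        flag = flag or i in changed
    if current:
        lines.append((" ".join(current), flag))
    # A bar goes on every line that contains any part of the element.
    return [(bar if has_change else " ") + " " + text
            for text, has_change in lines]


words = "the quick brown fox jumps over the lazy dog".split()
for width in (15, 10):
    print("\n".join(wrap_with_change_bars(words, {3, 4}, width)))
    print()
```

At the wider setting the changed words "fox jumps" share one marked line; at the narrower setting they fall on two lines and both are marked, without any change to the document itself.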

Rotated Text

Workaround for today

Rotated text is especially a problem with long, horizontal headings in tables. Headings simply get wrapped to the next line in the cell containing the heading. The only way to do rotated text today is to create the rotated text as a graphic, but then it doesn't match the appearance or behavior of the other text, especially when printed. The easiest solution today is to create the page with an application which allows for text rotation, and then require the use of an add-on viewer.

What we'd like to see

Layout control which causes the specified text element to be formatted with the specified degree of rotation from the normal. If one visualizes text as occurring normally from 9 to 3 on the clock face, a 90° rotation will cause the text to read from 12 to 6, while a 180° rotation will cause the text to appear upside down so as to be read from 3 to 9.
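
The clock-face description can be checked with a short Python sketch. The function name is hypothetical. The examples in the text (90° reading from 12 to 6) imply that positive angles rotate clockwise, which the sketch takes as an assumption; each hour on the clock face corresponds to 30°.

```python
def reading_direction(degrees):
    """Return (start_hour, end_hour) for text rotated by `degrees`.

    Unrotated text reads from 9 o'clock to 3 o'clock.
    Positive angles are assumed to rotate clockwise.
    """
    def hour(base):
        h = (base + degrees / 30) % 12
        return h or 12  # show 0 as 12 on the clock face

    return hour(9), hour(3)


for angle in (0, 90, 180):
    print(angle, reading_direction(angle))
```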

Non-Tiling Background Images

Workaround for today

Today background images are always tiled and are never printed.

What we'd like to see

Layout control which causes the background image not to be tiled, but to be shown only once at the specified position. Furthermore, user agents need to print the background images, or at least be configurable to do so.

Headers and Footers

Workaround for today

Repeated headers/footers cannot be accomplished in HTML documents today. Any simulation by manually repeating these elements would not be reliable due to font and window resizing.

What we'd like to see

An element which is formatted as a header or footer for the specified page(s) as printed or displayed. (How do browsers which don't understand this element display the header/footer, once at the beginning?) Headers and footers do not appear within regions, but only on pages as defined by paper size or by specified page breaks.

For example, on a page containing two columns with a callout in the center, the header/footer would only appear once on the page. Separate headers and footers can be specified for odd or even pages, first page, etc.
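
How a user agent might pick among separately specified headers can be sketched in Python. The function name and the dictionary keys "first", "odd", "even" and "default" are hypothetical; the text above does not define any syntax for them.

```python
def header_for(page_number, headers):
    """Pick the header for a page from separately specified headers.

    headers: dict which may contain the hypothetical keys
             "first", "odd", "even" and "default"
    """
    if page_number == 1 and "first" in headers:
        return headers["first"]
    parity = "odd" if page_number % 2 else "even"
    # Fall back to the default header, or to None if there is none.
    return headers.get(parity, headers.get("default"))


headers = {"first": "Title", "odd": "Chapter", "even": "Book"}
for page in (1, 2, 3):
    print(page, header_for(page, headers))
```

The same selection would apply to footers. Because headers belong to pages rather than regions, the function takes only a page number.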

A user agent may wish to allow document headers to override the default header for the printout, or add yet another header and/or footer to the paper. The user agent shouldn't ignore the header or footer specified in the HTML document or content may be lost.

See the Page (or region) Break section for a discussion concerning the concept of pages for display vs. print.

Footnote

Workaround for today

Today one can handle footnotes as endnotes or as links to other portions of the same document or to separate documents. In the first method, the endnotes are displayed and printed as expected. If links are used to simulate footnotes, the displayed document works as intended, but in the printed version, the footnote information is not as easy for the reader to locate, and if they are linked outside the document, they may not be printed at all.

What we'd like to see

Footnotes are already in some "HTML 3.0" proposals as the <FN> element. This should be standardized, and additional style specifications are needed to control how footnotes are displayed or printed. Useful options include display in a popup, as a group at the end of the "page", as endnotes, with or without a separator bar, inside the text, etc. Using the capability discussed in Print Only or Display Only Elements, footnotes could display differently when viewing vs. printing.

Page (or region) Break

Workaround for today

There is no reliable way to force a page break today.

What we'd like to see

Layout control which causes the following content to start at the beginning of the next page or region. When being viewed outside of a region, it causes the user agent to display a "new page" with new header and footer elements, footnotes, etc. as specified. When being printed outside of a region, the break causes the user agent to place the following content on the next piece of paper. Within regions, it causes the following content to appear at the top of the next region.

With no break specified and viewing the document on screen as opposed to printing, the entire document just fits on one scrollable "page." In this case only the first page header and footer are visible, footnotes styled at the bottom of the page would appear the same as endnotes, the page number is "1" throughout the document, etc. Without this tag, when printing instead of viewing the document, the page size is dependent on the size and orientation of the paper in the printer. Footnotes, headers and footers are printed according to the style associated with each.
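
The on-screen behavior can be sketched in Python. The marker PAGE_BREAK and the function paginate are hypothetical stand-ins for whatever tag or style is eventually chosen, and the sketch covers viewing outside of regions only.

```python
PAGE_BREAK = object()  # hypothetical stand-in for the break control


def paginate(items):
    """Split a sequence of content items into pages at explicit breaks."""
    pages = [[]]
    for item in items:
        if item is PAGE_BREAK:
            pages.append([])  # following content starts a new page
        else:
            pages[-1].append(item)
    return pages


# With no break, the whole document is one scrollable page numbered 1.
print(len(paginate(["intro", "body"])))
# With a break, page numbers, headers and footers can advance.
for number, page in enumerate(paginate(["intro", PAGE_BREAK, "body"]), start=1):
    print(number, page)
```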

No Page (or region) Break

Workaround for today

There is no way to prevent a page break today.

What we'd like to see

Layout control which forces the specified element to fit into one page or region. If it cannot fit into the current or next region, the user agent is free to split the content in any manner it sees fit, so long as the entire content which can fit into the available pages or regions is displayed. This is closely related to the flow issues discussed in Column and Callout (region) Formatting.
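
The decision described above can be sketched in Python. The function name and the three return values are hypothetical, and sizes are measured in lines for simplicity.

```python
def place_block(block_lines, room, capacity):
    """Decide where an element that must not be broken is placed.

    block_lines: height of the element, in lines
    room:        lines left in the current page or region
    capacity:    lines available in the next page or region
    """
    if block_lines <= room:
        return "current"  # fits where it is
    if block_lines <= capacity:
        return "next"     # moved whole to the next page or region
    return "split"        # too large: the user agent may split it freely


print(place_block(5, 8, 40))
print(place_block(10, 8, 40))
print(place_block(50, 8, 40))
```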

Print Only or Display Only Elements

Workaround for today

No distinction is made between printing and display today.

What we'd like to see

A conditional, such that the specified elements take effect during print or during display, respectively. Additional conditionals (such as color/monochrome) are also useful.
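
The effect of such a conditional can be sketched in Python. The function name and the medium labels "print" and "display" are hypothetical; each element is modelled as a pair of its content and the one medium it is restricted to, or None if it is unconditional.

```python
def visible_elements(elements, medium):
    """Return the content that takes effect for the given medium.

    elements: (content, only) pairs, where `only` is "print",
              "display", or None for unconditional content
    medium:   "print" or "display"
    """
    return [content for content, only in elements
            if only is None or only == medium]


document = [
    ("Body text", None),
    ("Click here to continue", "display"),
    ("Printed from the web", "print"),
]
print(visible_elements(document, "display"))
print(visible_elements(document, "print"))
```

Additional conditionals such as color/monochrome could be handled the same way by testing further properties of the output device.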

Print Alternate Document

Workaround for today

The user has to manually select the correct document to print.

What we'd like to see

A reference to an entirely different document to be printed when the print command is given to the user agent. For example, the alternate document may be a .DOC file, a PDF file, or even another HTML file. This second file is optimized for the best print layout, as opposed to the best screen layout contained in the file intended for viewing.
