A Modest Proposal to Make Web Printing More Satisfying

A position paper for the W3C Workshop on High Quality Printing from the Web

John C. Thomas, jct@pogo.wv.tek.com

The HyperText mechanism for linking document content is one of the most seductive features of the web. But from the standpoint of printing, it is also problematic. While contextual linking may be well suited for "browsing", it is both confusing and frustrating to the reader who wishes to thoroughly traverse document content.

Since I find it easier to read long textual documents from a printed page than from my workstation screen, I find myself reaching for the "Print" button in my favorite web browser any time a document must be scrolled more than a few times. This becomes inconvenient, however, if the document has many HyperText links, since the document tree must be manually traversed. The "[Next]" link which is beginning to appear on web pages from some of the more professionally administered web sites is only a partial solution. The tool I want is an interactive web crawler which retrieves, indexes and prints a document and any linked documents out to some predefined sphere of context. When I click on the "Print" button, I want to walk up to my printer and pick up printed output which contains:

Since I have defined a web document to include a subset of all of its links, one interesting problem is to define just how large a context to print. I would like to be able to define this context interactively, but I would also like to be able to fall back on a set of saved preferences. I may want to select a document and all of its direct links, or perhaps Nth-order indirect links from a particular HTTP server. I may want to restrict the total number of pages printed. Certainly, I want to avoid looping back through links to documents that have already been collected. I see this dialog built directly into the "Print" feature of the web browser.
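The context limits described above can be sketched as a bounded breadth-first walk over links. This is only an illustration under stated assumptions, not a design for any existing browser: the function name collect_print_set, its parameters (max_depth, max_docs, same_host) and the caller-supplied fetch function are all hypothetical, and a cap on documents stands in for a cap on printed pages, which cannot be known before layout.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_print_set(start_url, fetch, max_depth=1, max_docs=20, same_host=True):
    """Return the URLs to print, in breadth-first order.

    fetch is a hypothetical caller-supplied function mapping a URL to HTML text.
    max_depth=1 means "the document and all of its direct links".
    """
    start_url = urldefrag(start_url)[0]
    host = urlparse(start_url).netloc
    seen = {start_url}  # visited set: a circular link is never followed twice
    queue = deque([(start_url, 0)])
    ordered = []

    while queue and len(ordered) < max_docs:
        url, depth = queue.popleft()
        ordered.append(url)
        if depth == max_depth:
            continue  # edge of the context: print it, but do not expand its links
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for href in parser.links:
            target = urldefrag(urljoin(url, href))[0]  # resolve and drop #fragment
            if target in seen:
                continue
            if same_host and urlparse(target).netloc != host:
                continue  # stay on the starting HTTP server
            seen.add(target)
            queue.append((target, depth + 1))
    return ordered
```

A print dialog could expose max_depth, max_docs and same_host as the interactive settings, with the saved preferences supplying their defaults.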

On a slightly different subject, HTML documents are typically designed for on-screen viewing. Small GIF images are used in order to shorten the download delay and to fit comfortably onto low resolution CRT displays. These small images (and the simple rendering models used for HTML text) reduce user impatience and indeed make web pages browsable.

Most users complain that web pages, as formatted for printing by popular browsers, have unsatisfactory print quality. Although severely layout-challenged, HTML text will be rendered with the full resolution of the printer (usually at least 300x300 dpi). It is the pixel-replicated GIF image which is the main contributor to the perception of poor image quality.
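The arithmetic behind pixel replication can be made concrete with a small sketch. It assumes the 75 dpi screen resolution this paper uses as its example, the 300 dpi printer mentioned above, and a browser that prints an image at the same physical size it has on screen; the function names are illustrative only.

```python
def replication_factor(screen_dpi, printer_dpi):
    """Printer dots per image pixel along one axis, at equal physical size."""
    return printer_dpi // screen_dpi


def replicate(pixels, factor):
    """Enlarge a 2-D list of pixel values by repeating each pixel factor x factor times."""
    return [
        [value for value in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


factor = replication_factor(75, 300)        # 4 dots per pixel on each axis
enlarged = replicate([[1, 2], [3, 4]], 2)   # every source pixel becomes a 2x2 block
```

With these assumed figures each GIF pixel is printed as a 4x4 block of identical dots, so the image carries one sixteenth of the detail the printer could render in the same area, while the text beside it uses the full resolution.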

The Adobe PDF format removes both limitations and still maintains reasonable browsability for LAN-based intranet sites. Real-world bandwidth conditions on the internet, however, make Adobe PDF format less desirable because of its generally larger file sizes. I would like to propose two alternatives which should be more browsable than Adobe PDF and produce similar print quality. Both solutions address image format.

The first solution is to use fractal image compression. This technique often produces smaller images than GIF and shorter decompression times than JPEG. But the interesting property of fractal compression is that the stored image may be decompressed to almost arbitrary resolution. Thus, the browser will decompress the image to 75dpi and the printer can decompress the same image to 300dpi. Only one image has been transferred across the internet. No HTTP protocol or HTML extensions are required. And, if the "Print" button metaphor is followed, only the web browser needs to be modified to add this support.

In Navigator 2.0, Netscape added a LOWSRC attribute to the <IMG> tag. The browser loads this image file before loading the file pointed to by the SRC URL. This is analogous to using an interlaced GIF or a progressive scan JPEG image. So the second alternative to improve the print quality of web pages might be to add a HIGHSRC attribute which essentially points to a "printable" high resolution image. That image would be downloaded only if the user chose to print the web page. Print quality should be dramatically improved without adverse impact on browsability.
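The selection rule this implies for a browser is simple, and can be sketched as follows. HIGHSRC is the attribute proposed here, not one that any browser implements, and the function images_to_fetch and the file names in the usage example are hypothetical.

```python
from html.parser import HTMLParser


class ImageSourcePicker(HTMLParser):
    """Records, for each <IMG>, the one URL that should be fetched."""

    def __init__(self, for_print):
        super().__init__()
        self.for_print = for_print
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)  # the parser lowercases attribute names
        src = attributes.get("src")
        highsrc = attributes.get("highsrc")  # proposed attribute, assumed here
        # Use the high resolution image only when printing; otherwise fall back to SRC.
        chosen = highsrc if (self.for_print and highsrc) else src
        if chosen:
            self.sources.append(chosen)


def images_to_fetch(html, for_print):
    """Return the image URLs to load for screen display or for printing."""
    picker = ImageSourcePicker(for_print)
    picker.feed(html)
    return picker.sources


page = '<IMG SRC="logo.gif" HIGHSRC="logo-print.tif"><IMG SRC="rule.gif">'
screen_images = images_to_fetch(page, for_print=False)
print_images = images_to_fetch(page, for_print=True)
```

An image without HIGHSRC falls back to its SRC file, so existing pages would print exactly as they do today and only pages that opt in would pay the extra download at print time.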