Robert S. Sutor
Interactive Scientific Publishing
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598 USA
Embedded objects such as ActiveX controls and Netscape Navigator plug-ins today suffer from a number of inconveniences that significantly decrease the overall quality of the rendering of the HTML pages in which they are contained. In this position paper I list several problems that should be solved in any new version of HTML or markup for embedding and rendering XML data in a larger document. These issues are especially relevant now that the Mathematical Markup Language Specification is an official Recommendation of the W3C. Embedded objects need to be treated as first-class citizens by the HTML markup that defines them and by the browsers whose documents contain them.
The ideas in this paper come from discussions within the W3C Math Working Group and from my own experience in implementing the IBM techexplorer Hypermedia Browser plug-in. Many of these ideas were expressed by Paul Topping in his paper MathML Requirements for XML in HTML for the 11-12 February 1998 "XML in HTML" Coordination Meeting. Although the examples are from the area of mathematics, many of the problems are common to any embedded object that must be rendered in the middle of surrounding text.
The first problem that anyone encounters when trying to embed mathematics within HTML is that the current markup does not allow precise positioning with respect to the baseline in any easy way. That is, consider
Note how the matrix is adjusted vertically with respect to the surrounding text. The matrix is not simply centered vertically. Indeed, the calculation of the correct vertical shift is one of the main challenges in properly rendering mathematics. Compare the above example with this:
The second example occurs when only the height and width of an embedded object are available and the object is placed flat against the baseline.
While it may be possible to use style sheets and perhaps the SPAN element with TOP and POSITION attributes to adjust embedded objects vertically, this is not practical for documents containing hundreds of math expressions within sentences.
A common trick is to create a GIF file containing the matrix and then center the image vertically. If that doesn't quite look right, extra whitespace is added above or below so that the image floats to the right level. However, using GIFs is very much the wrong way of rendering the markup that describes the object. The font size used in rendering the GIF may be different from that used in the browser because of user preferences or style sheets. What is more important, however, is that it is very hard if not impossible to recreate the original markup when it is needed again, say, for computation. Even if you could somehow attach the markup to the GIF, the association between parts of the image and subexpressions in the markup is lost. The point here is that the math object can be rendered just as well as the text, only it is being done by a third-party component employed by the browser. The browser needs help from HTML to allow the embedded object to be positioned correctly.
A formatting box has a depth in addition to its height and width. The depth is equal to the distance the box descends below the surrounding baseline. The total height of the box is equal to the height plus the depth.
Any element that can appear inline with text and accepts a HEIGHT or WIDTH attribute should also process a DEPTH attribute for precise positioning with respect to the baseline. Alternatively, these elements should process the POSITION attribute with the TOP and LEFT attributes.
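The three measurements can be illustrated with a minimal sketch. The names Box, top_edge, and line_extent are hypothetical and do not come from any browser or plug-in API; the sketch assumes all measurements share one unit and that the y coordinate grows downward, as it does on most screens.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical inline formatting box; all values in one unit (e.g. pixels)."""
    height: float  # extent above the baseline
    depth: float   # extent below the baseline
    width: float

    @property
    def total_height(self) -> float:
        # Total height is the height plus the depth.
        return self.height + self.depth


def top_edge(baseline_y: float, box: Box) -> float:
    """Y coordinate of the box's top edge when it sits on the given baseline."""
    return baseline_y - box.height


def line_extent(boxes: List[Box]) -> Tuple[float, float]:
    """Height and depth a line needs so every box shares one baseline."""
    return (max(b.height for b in boxes), max(b.depth for b in boxes))
```

For example, a matrix box with height 100, depth 50, and width 300 has a total height of 150, and on a baseline at y = 200 its top edge is at y = 100. Without the depth, a browser can only guess where inside those 150 units the baseline falls.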
Now that we've determined that there are really three important measurements for an embedded object rather than just two, we must state that it is silly to always expect the author or the application that created the markup for the embedded object to know ahead of time how big the object will be when rendered. Composition consists of non-trivial computations and is very dependent on the fonts used and their sizes. It is the renderer of the markup and not the producer that should compute the correct height, depth, and width of the object. Imagine how ridiculous HTML markup would be if authors were always required ahead of time to state the required height of each paragraph!
Browsers must be able to query a renderer of an embedded object about the size of the object. Moreover, negotiation should be possible when rendering compromises are possible. Consider the following browser (B) and embedded object renderer (R) conversation:
B: How big do you want to be?
R: Height = 100, Depth = 50, Width = 300.
B: Sorry, can only do Width = 200 maximum. Can you adapt?
R: (Hmmm, I guess I'll have to be tall and skinny.) How about Height = 250, Depth = 75, Width = 175?
Of course, the browser could have indicated its constraints when it started negotiating. Embedded objects can react in various ways when the allotted space is either too big or too small. For example, centering within a large space will often be reasonable, and scrollbars or font size reductions can adjust for being crammed into a small rectangle.
Embedded object elements should have attributes that describe if size negotiation should take place.
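The conversation above can be sketched as follows. This is only an illustration under stated assumptions: the Renderer class, its preferred_size and adapt methods, and the allow_negotiation flag are all hypothetical, and a real renderer would recompute its layout rather than pick from a fixed list of candidates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Size:
    height: float
    depth: float
    width: float


class Renderer:
    """Hypothetical embedded-object renderer with a few candidate layouts."""

    def __init__(self, layouts: List[Size]) -> None:
        # Candidate layouts, ordered from most to least preferred.
        self.layouts = layouts

    def preferred_size(self) -> Size:
        return self.layouts[0]

    def adapt(self, max_width: float) -> Optional[Size]:
        """Return the most preferred layout that fits, or None if none does."""
        for size in self.layouts:
            if size.width <= max_width:
                return size
        return None


def negotiate(renderer: Renderer, max_width: float,
              allow_negotiation: bool = True) -> Size:
    """Browser side: ask for the preferred size, negotiate if it is too wide."""
    wanted = renderer.preferred_size()
    if wanted.width <= max_width or not allow_negotiation:
        return wanted  # it fits, or the author switched negotiation off
    adapted = renderer.adapt(max_width)
    # If the renderer cannot adapt, keep the preferred size; the browser
    # would then have to clip or scroll the object.
    return adapted if adapted is not None else wanted
```

With the layouts Size(100, 50, 300) and Size(250, 75, 175) and a maximum width of 200, negotiate returns the tall and skinny layout, matching the exchange between B and R. With allow_negotiation set to False it returns the preferred size unchanged.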
We saw above that GIFs are a bad idea for rendering objects that originally had markup associated with them. In particular, the prevailing style information for documents might be relevant for rendering embedded objects. Note how the text font style information in an HTML document might be passed to a math object renderer so that the same style could be used for text within the math:
The embedded object renderer should be able to query the browser about style information. However, the document author should be able to indicate that style information should not be exchanged and the embedded object rendered in isolation.
Embedded object elements should have an attribute that determines if style information should be exchanged.
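One way the style exchange could behave is sketched below. The resolve_style function, the exchange_style flag, and the default values are assumptions made for illustration; they are not part of HTML, of any browser, or of any plug-in interface.

```python
from typing import Dict, Optional

# Hypothetical defaults a renderer might use when rendering in isolation.
DEFAULT_STYLE: Dict[str, str] = {"font-family": "Times", "font-size": "12pt"}


def resolve_style(browser_style: Optional[Dict[str, str]],
                  exchange_style: bool) -> Dict[str, str]:
    """Return the style the renderer should use for text inside the object."""
    style = dict(DEFAULT_STYLE)  # copy so the defaults are never mutated
    if exchange_style and browser_style:
        # The prevailing document style overrides the renderer defaults.
        style.update(browser_style)
    return style
```

When the author allows the exchange, text inside the math picks up the surrounding font; when the author forbids it, the renderer falls back on its own defaults and the object is rendered in isolation.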