REC-MathML-19980407; revised 19990707

To be effective, MathML must work well with a wide variety of renderers, processors, translators and editors. This chapter addresses some of the interface issues involved in generating and rendering MathML. Since MathML exists primarily to encode mathematics in Web documents, perhaps the most important interface issues are related to embedding MathML in HTML.

There are three kinds of interface issues that arise in embedding MathML in HTML. First, MathML must be semantically integrated into HTML. For example, there must be a mechanism for browsers to recognize MathML markup as embedded content, and not as an HTML syntax error. More generally, the semantic embedding of MathML in HTML is a special case of embedding XML in HTML, which involves issues such as namespace management and document validation.

Second, MathML rendering must be integrated into browser software. Until MathML is rendered natively by browsers, rendering will typically be done by embedded elements. However, to properly render mathematical notation in context in a Web document, improved coordination between browsers and embedded elements will be necessary. For example, embedded elements will need to be able to detect the ambient rendering environment, such as baseline, font family and color scheme, and respond appropriately to reader input such as font size changes. Support for printing is also essential.

Third, tools for generating MathML must be developed, including editors, translators, and export capabilities in computer algebra systems and other scientific software. Since MathML is designed to be powerful and flexible to accommodate a wide range of applications, while at the same time remaining structured and explicit for easy processing, MathML expressions tend to be lengthy, and prone to error when entered by hand. Therefore, special emphasis must be given to ensuring that MathML can be easily generated by user-friendly conversion and authoring tools.

The W3C Math working group is committed to working with software vendors to develop a wide range of equation editors and translation tools, and plans to continue to do so in the future. In particular, the working group monitors the public www-math@w3.org mailing list, and will attempt to provide support to software developers with questions about the MathML specification. The working group also intends to try to stimulate the formation of MathML developer and user groups. For current information about MathML tools, applications and user support activities, consult the W3C Math home page.

MathML specifies a single top-level **math** element, which
encapsulates each instance of MathML markup within an HTML page. As
such, the **math** element provides an attachment point for
information which affects a MathML expression as a whole. For
example, the **math** element is the logical place to attach style
sheet or macro information in the future, when these facilities become
available for MathML.

Ideally, the **math** element should also serve as the interface
for embedding MathML in HTML. To function in this capacity, the
**math** element would have to simultaneously signal the semantic
inclusion of MathML (XML) content in HTML, and provide the necessary
machinery for rendering its content in a browser either by invoking an
embedded element, or by specifying parameters for a native renderer in
the browser. Both semantic inclusion and rendering present a number
of issues that extend beyond the boundaries of W3C Math. To a large
extent, the issues which arise for embedding MathML in HTML are the
same as those for the more general problem of embedding XML in HTML.
Resolving these issues will require the efforts of a number of World
Wide Web Consortium Activities, including the HTML, XML, CSS and DOM
activities.

In order to produce a complete and self-contained description of
MathML, this document only specifies the attributes and usage of the
**math** element as a top-level element for MathML, and not as an
interface element. The W3C Math working group intends to
continue working closely with other World Wide Web Consortium
activities to ensure that emerging standards for embedding XML in
HTML accommodate seamless integration of MathML in HTML. Section 7.1.2
lists requirements which an interface element for MathML would have to
meet in order to fully integrate MathML into HTML. However, it is
important to note that the MathML specification is independent of the
ultimate embedding mechanism.

As stated above, MathML specifies a single top-level **math**
element. All other MathML content must be contained in a **math**
element; equivalently, every valid, complete MathML expression must be
contained in **<math>** tags. The **math** element must
always be the outermost element in a MathML expression; it is an
error for one **math** element to contain another.

Applications which return subexpressions of other MathML
expressions, for example as the result of a cut-and-paste operation,
should always wrap them in **<math>** tags. The presence of
enclosing **<math>** tags should be a reasonable heuristic
test for MathML content. Similarly, applications which insert MathML
expressions in other MathML expressions must take care to remove the
**<math>** tags from the inner expressions.
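The wrapping and unwrapping rules above can be sketched in a few lines. This is a minimal illustration using Python's standard `xml.etree.ElementTree`; the helper names `wrap_in_math` and `insert_expression` are hypothetical and not part of MathML.

```python
import xml.etree.ElementTree as ET


def wrap_in_math(fragment):
    """Return the fragment enclosed in a top-level <math> element."""
    if fragment.tag == "math":
        return fragment  # already wrapped; never nest <math> inside <math>
    math = ET.Element("math")
    math.append(fragment)
    return math


def insert_expression(parent, expr):
    """Append expr to parent, dropping an outer <math> wrapper if present."""
    if expr.tag == "math":
        parent.extend(list(expr))  # splice in the children, not the wrapper
    else:
        parent.append(expr)


# A subexpression copied out of a larger expression is wrapped...
copied = ET.fromstring("<msup><mi>x</mi><mn>2</mn></msup>")
wrapped = wrap_in_math(copied)
print(ET.tostring(wrapped, encoding="unicode"))

# ...and unwrapped again when pasted into another expression.
target = ET.fromstring("<math><mrow><mi>y</mi><mo>=</mo></mrow></math>")
insert_expression(target.find("mrow"), wrapped)
print(ET.tostring(target, encoding="unicode"))
```

With these two helpers, a cut-and-paste round trip never produces a **math** element nested inside another.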

The **math** element can contain an arbitrary number of children
schemata. The children schemata render by default as if they were
contained in an **mrow** element.

- **class**="*value*", **style**="*value*": Provided for future Cascading Style Sheet compatibility.
- **macros**="*URL* *URL* ...": This attribute provides a way of pointing to external macro definition files. Macros are not part of the MathML specification, but a macro mechanism is anticipated as a future extension to MathML.
- **mode**="*display*|*inline*": The **mode** attribute specifies whether the enclosed MathML expression should be rendered in a display style or an in-line style. The default is **mode**="inline".

The top-level **math** element described in the preceding
section is concerned with encapsulating MathML content and defining
attributes which affect the entire enclosed expression. It is, in a
sense, "inward looking." However, to render MathML properly in a
browser, and to integrate it properly into an HTML document, an
"outward looking" interface element is also required. This interface
element must be aware of its surrounding environment, and provide a
mechanism for passing information between the browser, and the MathML
renderer.

As noted above, the MathML interface element and the MathML
top-level element should ideally be one and the same. The **math**
element should not only serve to encapsulate MathML content, it should
signal the semantic embedding of MathML content to an HTML processor,
and admit additional attributes for controlling how the MathML
renderer should interact with the browser.

Since a general mechanism for embedding XML in HTML is anticipated
in the near future which may not be compatible with using the
top-level **math** element for the interface element as well, the
remainder of this section describes attributes and functionality that
a MathML interface element should ultimately provide. In the near
term, implementors attempting to provide interim solutions for
rendering MathML in browsers should try to give authors some way of
passing the following interface attributes to the renderer:

- **type**="*mime type*": The type attribute assigns a MIME type to the tag content. This attribute should ideally be used to invoke an embedded element, such as a Java applet, plug-in or ActiveX control, to render the tag content as described in the next section.
- **name**="*value*": Provided for scripting.
- **height**=*nn*, **width**=*nn*, **baseline**=*nn*: Ideally, embedded elements will soon be able to dynamically negotiate height, width and baseline alignment with browsers. However, these optional attributes are suggested as an interim solution for software vendors that want to support MathML, but are unable to provide dynamic resizing and alignment.
- **overflow**="*scroll|elide|truncate|scale*": In cases where size negotiation is not possible or fails (for example in the case of an extremely long equation), this attribute is provided to suggest an alternative processing method to the renderer.
  - *scroll*: The window provides a viewport into the larger complete display of the mathematical expression. Horizontal or vertical scrollbars are added to the window as necessary to allow the viewport to be moved to a different position.
  - *elide*: The display is abbreviated by removing enough of it so that the remainder fits into the window. For example, a large polynomial might have the first and last terms displayed with "+ ... +" between them. Advanced renderers may provide a facility to zoom in on elided areas.
  - *truncate*: The display is abbreviated by simply truncating it at the right and bottom borders. It is recommended that some indication of truncation is made to the viewer.
  - *scale*: The fonts used to display the mathematical expression are chosen so that the full expression fits in the window. Note that this only happens if the expression is too large. In the case of a window larger than necessary, the expression is shown at its normal size within the larger window.
- **altimg**=*URL*, **alttext**="*value*": These attributes provide graceful fall-backs for browsers that do not support embedded elements or images, respectively.
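The *elide* behavior for a long polynomial can be illustrated with a short sketch. The function name `elide` and the character budget `max_chars` are assumptions made for this example; a real renderer would measure the laid-out width of the expression rather than count characters.

```python
def elide(terms, max_chars):
    """Join terms with ' + ', keeping only the first and last if too wide."""
    full = " + ".join(terms)
    if len(full) <= max_chars or len(terms) < 3:
        return full  # fits, or there is nothing useful to drop
    return f"{terms[0]} + ... + {terms[-1]}"


polynomial = ["x^5", "4x^4", "7x^3", "2x^2", "9x", "1"]
print(elide(polynomial, max_chars=40))  # wide window: shown in full
print(elide(polynomial, max_chars=20))  # narrow window: middle terms elided
```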

Attributes which apply to the MathML interface element necessarily
take effect when the document is first loaded, and therefore suffer
the limitation that they cannot change in response to reader
interaction. The **height** and **width** attributes are good
examples; if the reader changes the current font size, the height and
width of the embedded math fragments also need to change.
Therefore, in order to properly render MathML, an embedded element
must be able to communicate with the browser, and react to reader
input.

At present, browser support for embedded elements is too limited to provide acceptable rendering for MathML. The W3C Math working group is working closely with the Document Object Model working group in an effort to provide better communication between embedded MathML renderers and browsers. Some of the most needed improvements are:

- Embedded elements must be able to determine the ambient style
parameters, including font characteristics, foreground and background
colors, and link color schemes. Embedded elements must also be able
to align themselves to an arbitrary baseline, rather than the existing
top, middle, bottom alignment options.
- Embedded elements must be able to detect and react to reader
input. In particular, embedded elements must be able to dynamically
resize themselves when the ambient font size changes.
- Embedded elements must be able to print in context, and at high resolution.

Until MathML is natively supported by browsers, we anticipate that MathML rendering will be carried out via embedded objects such as plug-ins, applets, or helper applications. In the near term, the W3C Math working group advocates the use of MIME types to bind embedded MathML to renderers. Mechanisms for assigning MIME types already exist in HTML, and mechanisms for registering and automatically invoking embedded elements such as plug-ins based on MIME type already exist in Web browsers.

The **type** attribute, described in the previous section as a requirement
for the MathML interface element, is intended to associate a MIME type
with its content. The HTML element META is proposed as a means of
specifying document-wide default MIME types for an element.

We propose a simple MIME type naming convention which is flexible enough to accommodate several common situations:

- An author wishing to reach as wide an audience as possible
might like MathML to be rendered by any available renderer.
- An author targeting a specific audience might like to indicate that
a particular MathML renderer be used.
- A reader might wish to specify which of several available renderers
should be used.

We propose that generic MathML be assigned the MIME type `text/mathml`, and for browser registry, we suggest that the standard file extension `.mml` be used. To invoke specific renderers, we suggest assigning a MIME type of the following format:

`text/mathml-renderer`

**Example:**

A user downloads and installs renderer A, and registers it with the browser for the `text/mathml` MIME type to process generic MathML. However, renderer A also accepts TeX as an input syntax, and therefore during the installation process, it requests to be registered for `application/x-tex` as well. Later, the user discovers that renderer B provides additional features, such as cut and paste capability. Therefore, the user downloads, installs and registers renderer B for the `text/mathml-rendererB` MIME type.

An author then creates a document that contains the following line in the document header:

`<META Content-math-Type="text/mathml">`

Later, the document contains the following expressions:

`<math> <msup><mi>x</mi><mn>2</mn></msup> </math>`

`<math type="text/mathml-rendererB"> <mi>α</mi><mo>=</mo><mn>0.4</mn> </math>`

When our hypothetical reader views this document, renderer A is invoked to process the first expression, while renderer B is invoked for the second. When the same reader later views a document with MIME type `application/x-tex`, renderer A is again invoked, this time in TeX processing mode.

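The renderer selection in this example can be sketched as a lookup in a registry keyed by MIME type. The registry contents mirror the scenario above; the function name `choose_renderer` is hypothetical, and the fallback from an unregistered specific type to generic `text/mathml` is an assumption of this sketch rather than something the proposal prescribes.

```python
import xml.etree.ElementTree as ET

# Hypothetical browser registry: MIME type -> installed renderer.
REGISTRY = {
    "text/mathml": "renderer A",
    "application/x-tex": "renderer A",
    "text/mathml-rendererB": "renderer B",
}


def choose_renderer(math, document_default, registry=REGISTRY):
    """Pick a renderer for a <math> element.

    The element's own type attribute wins; otherwise the document-wide
    default declared in the META element applies.
    """
    mime = math.get("type", document_default)
    if mime in registry:
        return registry[mime]
    # Assumption: an unregistered specific type falls back to generic MathML.
    return registry.get("text/mathml")


document_default = "text/mathml"  # value taken from the META element
first = ET.fromstring("<math><msup><mi>x</mi><mn>2</mn></msup></math>")
second = ET.fromstring(
    '<math type="text/mathml-rendererB"><mi>α</mi><mo>=</mo><mn>0.4</mn></math>'
)
print(choose_renderer(first, document_default))
print(choose_renderer(second, document_default))
```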
Although rendering MathML expressions typically occurs in place in a Web browser, other MathML processing functions take place more naturally in other applications. Particularly common tasks include opening a MathML expression in an equation editor or computer algebra system.

At present, there is no standard way of specifying that embedded content should be rendered with one application, edited in another, and evaluated by a third. As work progresses on coordination between browsers and embedded elements and the Document Object Model (DOM), providing this kind of functionality should be a priority. Both authors and readers should be able to indicate a preference about what MathML application to use in a given context. For example, one might imagine that some mouse gesture over a MathML expression would cause a browser to present the reader with a pop-up menu, showing the various kinds of MathML processing available on the system, and the MathML processors recommended by the author.

Since MathML will probably be widely generated by authoring tools,
it is particularly important that opening a MathML expression in an
editor should be easy to do and to implement. In many cases, it will
be desirable for an authoring tool to record some information about
its internal state along with a MathML expression, so that an author
can pick up editing where he or she left off.
The MathML specification does not explicitly contain provisions for
recording authoring tool information. In some circumstances, it may
be possible to include authoring tool information
which applies to an entire document as metadata; interested readers
are encouraged to consult the W3C Metadata Activity for
current information about metadata and resource definition. For
encoding authoring tool state information that applies to a particular
MathML instance, readers are referred to the possible use of the **semantics**
element for this purpose.

In order to be fully integrated into HTML, it should be possible not only to embed MathML in HTML, but also to embed HTML in MathML. However, the problem of supporting HTML in MathML presents many difficulties. Moreover, the problems are not specific to MathML; they are problems for XML applications in HTML generally. Therefore, at present, the MathML specification does not permit any HTML elements within a MathML expression, although this may be subject to change in a future revision of MathML, when mechanisms for embedding XML in HTML have been further developed.

In most cases, HTML elements either do not apply in mathematical contexts (headings, paragraphs, lists, etc.), or MathML already provides equivalent or better functionality specifically tailored to mathematical content (tables, style changes, etc.). However, there are two notable exceptions.

MathML has no element which corresponds to the HTML
anchor element **a**. In HTML, anchors are used both to make links, and to
provide locations to link to. MathML, as an XML application, defines
links by the use of the **XML-LINK** attribute. However, MathML at
present does not provide a way for other documents to make links into
a MathML expression. One reason for this omission is that linking
into embedded XML content is better addressed as part of a
general mechanism for embedding XML in HTML. Moreover, until browsers
either natively implement MathML rendering, or substantially better
coordination between embedded elements and browsers becomes possible,
there is no reasonable way of implementing links into MathML
expressions.

MathML linking elements are generic XML linking elements as described in the Extensible Markup Language (XML): Part 2. Linking working draft. The reader is cautioned, however, that this working draft is less mature than the XML syntax working draft, and is therefore more subject to future revision. Since the MathML linking mechanism is defined in terms of the XML linking specification, the same proviso holds for it as well.

A MathML element is designated as a link by the presence of the
**XML-LINK** attribute. The possible values for this attribute
are "simple", "extended", "locator", "group" and "document". Although
all of these values are valid, MathML renderers need only implement
"simple" XML links to be MathML compliant. How links are indicated to
the reader is left to the individual MathML processing application.

Elements which specify the value of the **XML-LINK** attribute
as "simple" must also specify a value for the **HREF** attribute.
These two attributes fully specify a "simple" XML link. Thus, a
typical MathML link might look like:

`<mrow XML-LINK="simple" HREF="http://www.w3.org"> ... </mrow>`

MathML allows almost any element to be used as an XML
linking element. The only elements which cannot serve as linking
elements are those such as the **<sep/>** element which exist
primarily to disambiguate other MathML constructs and in general do
not correspond to any part of a typical visual rendering. The full
list of exceptional elements which cannot be used as linking elements
is given below in table 7.1.5.1.

`<prescripts/>` | `<none/>` | `<sep/>` |
`<power/>` | `<malignmark/>` | `<maligngroup/>` |

Table 7.1.5.1 MathML Elements Which Cannot Be Linking Elements
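The linking rules above lend themselves to a simple check. The following sketch, with the hypothetical helper `link_problems`, reports an element that uses **XML-LINK** incorrectly: an excluded element from Table 7.1.5.1, an unknown link type, or a "simple" link without **HREF**. The message wording is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Elements from Table 7.1.5.1 that cannot be linking elements.
NON_LINKING = {"prescripts", "none", "sep", "power", "malignmark", "maligngroup"}
LINK_TYPES = {"simple", "extended", "locator", "group", "document"}


def link_problems(element):
    """Return a list of problems with the element's XML-LINK usage."""
    link_type = element.get("XML-LINK")
    if link_type is None:
        return []  # not a link, so nothing to check
    problems = []
    if element.tag in NON_LINKING:
        problems.append(f"<{element.tag}> cannot be a linking element")
    if link_type not in LINK_TYPES:
        problems.append(f"unknown XML-LINK value {link_type!r}")
    elif link_type == "simple" and element.get("HREF") is None:
        problems.append("simple link is missing HREF")
    return problems


good = ET.Element("mrow", {"XML-LINK": "simple", "HREF": "http://www.w3.org"})
no_href = ET.Element("mrow", {"XML-LINK": "simple"})
bad_tag = ET.Element("sep", {"XML-LINK": "simple", "HREF": "http://www.w3.org"})
for element in (good, no_href, bad_tag):
    print(element.tag, link_problems(element))
```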

The IMG element has no MathML equivalent. The decision to omit a general image inclusion mechanism in MathML was based on several factors. First, a simple mechanism for including images in MathML along the lines of the IMG element would not be more closely tied to mathematical content or notation than the HTML IMG element itself. Therefore, such an element would likely be superseded by the IMG element if it becomes possible to mix XML and HTML generally.

Another reason for not providing an image facility is that MathML takes great pains to make the notational structure and mathematical content it encodes easily available to processors while information contained in images is only available to a human reader looking at a visual representation. Thus, for example, in the MathML paradigm, it would be preferable to introduce new glyphs by the creation of special symbol fonts, rather than simply including them as images.

Finally, apart from the introduction of new glyphs, many of the situations where one might be inclined to use an image amount to some sort of labeled diagram. For example, knot diagrams, Venn diagrams, Dynkin diagrams, Feynman diagrams and complicated commutative diagrams all fall into this category. As such, their content would be better encoded via some combination of structured graphics and MathML markup. Because of the generality of the "labeled diagram" construction, the definition of a markup language to encode such constructions extends beyond the scope of the W3C Math activity. However, it may be possible to provide such functionality in a future extension of MathML.

Information is increasingly generated, processed and rendered by software tools. The exponential growth of the Web is fueling the development of advanced systems for automatically searching, categorizing, and interconnecting information. Thus, although MathML can be written by hand and read by humans, the future of MathML is also tied to the ability to process it with software tools.

Many different kinds of MathML editors, translators, processors and renderers will be implemented. In addition to supporting the MathML core language, it is reasonable to assume that some of these renderers will provide additional specialized capabilities. Consequently, it is important to specify what one can and cannot expect from a generic MathML compliant application, and in what ways MathML can be extended, or used to pass additional information directly to a specific application that can take advantage of it.

It is important to clearly specify what it means to be a MathML compliant processor. Specifying MathML compliance serves two purposes. First, authors can be assured that their documents will be generally accessible if they refrain from using proprietary extensions. Second, software developers can be assured of the criteria for interoperability.

A well-formed MathML expression is an XML construct determined by the MathML DTD together with the additional requirements given in the MathML specification.

We define a "MathML processor" to mean any application that can accept, produce, or "roundtrip" a well-formed MathML expression. An example of an application that roundtrips a MathML expression is an editor that writes a new file even though no modifications are made.

We specify three forms of MathML compliance:

- A MathML-input-compliant processor must accept all well-formed MathML expressions. For example, a MathML-input-compliant validating parser which implements the MathML specification returns a truth value. A MathML-input-compliant renderer faithfully translates a MathML expression into application-specific form allowing native application operations to be performed.
- A MathML-output-compliant processor must generate well-formed MathML.
  - An embedded MathML-output-compliant processor must return well-formed MathML expressions when queried by the document object model API.
  - In the case where cut-and-paste/drag-and-drop operations are implemented, a MathML-output-compliant processor must return well-formed MathML expressions.
- A MathML-roundtrip-compliant processor must preserve MathML equivalence.

Two MathML expressions are "equivalent" if and only if both expressions have the same interpretation (as stated by the MathML DTD and specification) under any circumstances, by any MathML processor. Equivalence on an element-by-element basis is discussed elsewhere in this document.

We note that being roundtrip-compliant may be very difficult for processors that convert MathML input into an internal form that is structurally very different from the XML expression model. The first generation of processors may very well be input-compliant and output-compliant, but not roundtrip-compliant. Nevertheless, we expect roundtrip-compliant processors to be produced eventually as MathML gains widespread acceptance.

Beyond the above, the MathML core specification makes no demands of individual processors. However, in order to guide developers, the MathML specification includes advisory material; for example, there are suggested rendering rules included in section 3. The remainder of this section makes additional suggestions about a number of interface issues a MathML processor should address in some fashion.

If a MathML-input-compliant application receives input containing
one or more elements with an illegal number or type of attributes or
children schemata, it should nonetheless attempt to render all the
input in an intelligible way, i.e. to render normally those parts of
the input which were well-formed, and to render error messages
(rendered as if enclosed in an **<merror>** element) in place
of ill-formed expressions.

MathML-output-compliant applications such as editors and
translators may choose to generate **<merror>** expressions
to signal errors in their input. This is usually preferable
to generating well-formed, but possibly erroneous, MathML.
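As an illustration of the error handling suggested above, the sketch below replaces elements that have the wrong number of children with an **<merror>** element and leaves the well-formed parts untouched. The table `REQUIRED_CHILDREN` covers only a few presentation elements, and the helper name and message wording are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Required child counts for a few presentation elements (illustrative subset).
REQUIRED_CHILDREN = {"mfrac": 2, "msup": 2, "msub": 2, "msubsup": 3}


def mark_errors(element):
    """Replace ill-formed descendants of element with <merror>, in place."""
    for index, child in enumerate(list(element)):
        expected = REQUIRED_CHILDREN.get(child.tag)
        if expected is not None and len(child) != expected:
            merror = ET.Element("merror")
            mtext = ET.SubElement(merror, "mtext")
            mtext.text = f"{child.tag}: expected {expected} children, got {len(child)}"
            merror.tail = child.tail
            element[index] = merror  # swap the bad element for the message
        else:
            mark_errors(child)  # well-formed here; check its children


doc = ET.fromstring(
    "<math><mrow><msup><mi>x</mi></msup><mo>+</mo><mn>1</mn></mrow></math>"
)
mark_errors(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Only the malformed **msup** is replaced; the `+ 1` that follows it is still rendered normally.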

The MathML attributes described in the MathML specification are necessary for display and content markup. Ideally, the MathML attributes should be an open-ended list so that users could add specific attributes for specific renderers. However, this can't be done within the confines of a single XML DTD. Although it can be done using extensions of the standard DTD, some authors will wish to use nonstandard attributes while remaining strictly in compliance with the standard DTD.

To allow this, this specification also allows the attribute
**other**="..." for all elements, for use as a hook to
pass on renderer-specific information. In particular, it can be used
as a hook for passing information to audio renderers, computer algebra
systems, and for pattern matching in any future macro/extension
mechanism. This idea is used in other languages. For example,
PostScript comments are widely used to pass information that is not
part of PostScript.

At the same time, the intent of the **other** attribute is not
to encourage software developers to use this as a loophole for
circumventing the MathML core markup conventions. We trust both
authors and applications will use the **other** attribute
judiciously.

The value of the **other** attribute should be a string containing
an attribute list in valid XML format (i.e., attr1="val1"
attr2="val2" ..., with appropriate escaping of the
double quotes). Renderers which accept nonstandard attributes
directly should also accept them when they occur within the
string value of the **other** attribute. This is not required for
attributes specifically documented by the MathML standard.
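A renderer might unpack the **other** attribute along the following lines. This is a sketch only: the attribute names `speech-rate` and `cas-hint` are invented for the example, and the rule that directly written attributes take precedence over those inside **other** is an assumption, not a requirement stated here.

```python
import re
import xml.etree.ElementTree as ET

# Matches name="value" pairs inside the other attribute's string value.
ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')


def effective_attributes(element):
    """Merge attributes found in the other string with the element's own."""
    attrs = {k: v for k, v in element.attrib.items() if k != "other"}
    for name, value in ATTR.findall(element.get("other", "")):
        attrs.setdefault(name, value)  # assumption: direct attributes win
    return attrs


# &quot; escapes a double quote inside a double-quoted attribute value;
# the XML parser turns it back into a plain quote before we see it.
source = (
    '<mi color="red" '
    'other="speech-rate=&quot;slow&quot; cas-hint=&quot;symbol&quot;">x</mi>'
)
element = ET.fromstring(source)
print(effective_attributes(element))
```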

MathML is in its infancy; it is to be expected that MathML will need to be extended and revised in various ways. Some of these extensions can be easily foreseen; as noted repeatedly in this chapter, the mechanisms for fully integrating MathML into HTML are not yet developed, and these mechanisms may have a significant impact on some aspects of MathML.

Similarly, there are several kinds of functionality that are fairly obvious candidates for future MathML extensions. These include macros, style sheets, and perhaps a general "labeled diagram" facility. However, there will also no doubt be other desirable extensions to MathML which will only emerge as MathML is widely used. For these extensions, the W3C Math working group relies on the extensible architecture of XML, and the common sense of the larger Web community.

The definition of a style sheet mechanism for XML is part of the ongoing XML activity at the World Wide Web Consortium. Although it is too soon to say what this mechanism will ultimately be like, it is likely that it will accommodate the needs of MathML. It is also possible that such a style sheet mechanism will be sufficiently powerful to provide basic macro capability as well.

Macros, however, play a very important and useful role in encoding mathematical content and meaning. Moreover, it is difficult to devise a coherent, general macro system for MathML, because there are so many distinct applications for MathML macros. Therefore, the W3C Math working group plans to investigate the definition of a macro mechanism specifically tailored to MathML, in addition to participating in general ongoing XML style sheet and macro facility activities.

Some of the possible uses of MathML macros include:

- **Abbreviation:** One common use of macros is for abbreviation. Authors needing to repeat some complicated but constant notation can define a macro. This greatly facilitates hand authoring. Macros that allow for substitution of parameters facilitate such usage even further.
- **Extension of Content Markup:** By defining macros for semantic objects, for example a binomial coefficient or a Bessel function, one can in effect extend the content markup for MathML. Such a macro could include an explicit semantic binding, or such a binding could be easily added by an external application. Narrowly defined disciplines should be able to easily introduce standardized content markup by using standard macro packages. For example, the OpenMath project could release macro packages for attaching OpenMath content markup.
- **Rendering and Style Control:** Another basic way in which macros are often used is to provide a way of controlling style and rendering behavior by replacing high level macro definitions. This is especially important for controlling the rendering behavior of HTML Math content tags in a context sensitive way. Such a macro capability is also necessary to provide a way of attaching renderings to user defined XML extensions to the MathML core.
- **Accessibility:** Reader controlled style sheets are important in providing accessibility to MathML. For example, a reader listening to a voice renderer might by default hear a bit of MathML presentation markup read as "D sub x super 2 of f". Knowing the context to be multivariable calculus, the reader may wish to use a style sheet or macro package which instructs the renderer to render this **<msubsup>** element as "second derivative with respect to x of f".

The set of elements and attributes specified in the MathML specification are necessary for rendering common math expressions. It is recognized that not all mathematical notation is covered by this set of elements, that new notations are continually invented, and that sub-communities within mathematics often have specialized notations; and furthermore that the explicit extension of a standard is a necessarily slow and conservative process; this implies that the MathML standard could never explicitly cover all the presentational forms used by every sub-community of authors and readers of mathematics, much less encode all mathematical content.

In order to facilitate the use of MathML by the widest possible audience, and to enable its smooth evolution to encompass more notational forms and more mathematical content (perhaps eventually covered by explicit extensions to the standard), the set of tags and attributes is open-ended, in the sense described in this section.

MathML is described by an XML-compliant DTD, which necessarily limits the elements and attributes to those which occur in the DTD. Renderers desiring to accept nonstandard elements or attributes, and authors desiring to include these in documents, should accept or produce documents which conform to an appropriately extended XML-compliant DTD which has the standard MathML DTD as a subset.

MathML compliant renderers are allowed, but not required, to
accept nonstandard elements and attributes, and to render them in any
way. If a renderer does not accept some or all nonstandard tags, it
is encouraged to either handle them as errors as described above for
elements with the wrong number of arguments, or to render their
arguments as if they were arguments to an **mrow**, in either case
rendering all standard parts of the input normally.
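The second fallback, rendering the arguments of a nonstandard element as if they were arguments to an **mrow**, can be sketched as a tree rewrite. The set `STANDARD` is a small illustrative subset of the element names in the DTD, the element name `xfancy` is invented, and dropping the nonstandard element's attributes is a choice made by this sketch.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of standard element names; a real renderer
# would consult the full MathML DTD.
STANDARD = {"math", "mrow", "mi", "mn", "mo", "mtext", "msup", "msub", "mfrac", "merror"}


def downgrade_unknown(element):
    """Rename nonstandard elements to <mrow>, keeping their children."""
    for node in element.iter():
        if node.tag not in STANDARD:
            node.tag = "mrow"
            node.attrib.clear()  # sketch choice: drop nonstandard attributes too
    return element


doc = ET.fromstring(
    '<math><xfancy kind="a"><mi>x</mi><mo>+</mo><mn>1</mn></xfancy></math>'
)
print(ET.tostring(downgrade_unknown(doc), encoding="unicode"))
```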
