2 MathML Fundamentals

Overview: Mathematical Markup Language (MathML) Version 3.0

2 MathML Fundamentals
    2.1 MathML Syntax and Grammar
        2.1.1 General Considerations
        2.1.2 MathML and Namespaces
        2.1.3 Children versus Arguments
        2.1.4 MathML and Rendering
        2.1.5 MathML Attribute Values
            2.1.5.1 Syntax notations used in the MathML specification
            2.1.5.2 Length Valued Attributes
            2.1.5.3 Color Valued Attributes
            2.1.5.4 Default values of attributes
        2.1.6 Attributes Shared by all MathML Elements
        2.1.7 Collapsing Whitespace in Input
    2.2 The Top-Level math Element
        2.2.1 Attributes
        2.2.2 Deprecated Attributes
    2.3 Conformance
        2.3.1 MathML Conformance
            2.3.1.1 MathML Test Suite and Validator
            2.3.1.2 Deprecated MathML 1.x and MathML 2.x Features
            2.3.1.3 MathML Extension Mechanisms and Conformance
        2.3.2 Handling of Errors
        2.3.3 Attributes for unspecified data

2.1 MathML Syntax and Grammar

2.1.1 General Considerations

MathML is an application of [XML], Extensible Markup Language, and as such it is governed by the rules of XML syntax. XML syntax is a notation for rooted labeled planar trees. Planarity means that the children of a node may be viewed as given a natural order and MathML depends on this.

The basic `syntax' of MathML is thus defined by XML. Upon this, we layer a `grammar', being the rules for allowed elements, the order in which they can appear, and how they may be contained within each other, as well as additional syntactic rules for the values of attributes. These rules are defined by this specification, and formalized by a RelaxNG schema [RELAX-NG]. The RelaxNG Schema is normative, but a DTD (Document Type Definition) and an XML Schema [XMLSchemas] are provided for continuity (they were normative for MathML2). See Appendix A Parsing MathML.

As an XML vocabulary, MathML's character set must consist of legal characters as specified by Unicode [Unicode]. The use of Unicode characters for mathematics is discussed in Chapter 7 Characters, Entities and Fonts.

The following sections discuss the general aspects of the MathML grammar as well as describe the syntaxes used for attribute values.

2.1.2 MathML and Namespaces

An XML namespace [Namespaces] is a collection of names identified by a URI. The URI for the MathML namespace is:

http://www.w3.org/1998/Math/MathML

To declare a namespace, one uses an xmlns attribute, or an attribute with an xmlns prefix. When the xmlns attribute is used alone, it sets the default namespace for the element on which it appears, and for any child elements. For example:

<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow>...</mrow>
</math>

When the xmlns attribute is used as a prefix, it declares a prefix which can then be used to explicitly associate other elements and attributes with a particular namespace. When embedding MathML within XHTML, one might use:

<body xmlns:m="http://www.w3.org/1998/Math/MathML">
...
<m:math><m:mrow>...</m:mrow></m:math>
...
</body>
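
The two declaration styles above name the same elements. As a minimal sketch (the sample markup, the variable names and the helper mathml_tag are illustrative assumptions, not part of MathML), Python's standard xml.etree.ElementTree parser can be used to confirm that both forms yield the same namespace-qualified element name:

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Default-namespace form: xmlns appears on the math element itself.
default_form = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mi>x</mi></mrow>"
    "</math>"
)

# Prefixed form: the prefix "m" is declared on an enclosing element.
prefixed_form = (
    '<body xmlns:m="http://www.w3.org/1998/Math/MathML">'
    "<m:math><m:mrow><m:mi>x</m:mi></m:mrow></m:math>"
    "</body>"
)

def mathml_tag(local_name):
    """Return a qualified name in ElementTree's {uri}local notation."""
    return "{%s}%s" % (MATHML_NS, local_name)

root_default = ET.fromstring(default_form)
math_prefixed = ET.fromstring(prefixed_form).find(mathml_tag("math"))

# Both elements carry the same qualified name, whatever prefix was used.
print(root_default.tag == math_prefixed.tag)
```

A processor should therefore compare namespace URIs and local names, never the prefix text that happens to appear in the source.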

2.1.3 Children versus Arguments

Many MathML elements require a specific number of children or attach a particular meaning to child elements in certain positions. When children of a given MathML element are subject to these conditions, we will often refer to them as arguments instead of merely as children, in order to emphasize this somewhat mathematical relationship. For elements that act as `containers', the arguments correspond directly to children. This is the case for most presentation elements and some content elements such as set. In other cases, such as the content element apply, it is clearer to refer to the second child of the apply as being the `first argument' of the operator, that operator itself being the first child of the apply. Other cases are presentation elements that conceptually accept only a single argument, but for convenience accept any number of children; then we infer an mrow containing those children which acts as the argument to the element in question; see Section 3.1.3.1 Inferred <mrow>s.

In the detailed discussions of element syntax given with each element throughout the MathML specification, the correspondence of children with arguments, the number of arguments required and their order, as well as other constraints on the content are given. This information is also tabulated for the presentation elements in Section 3.1.3 Required Arguments.
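
The distinction between children and arguments for apply can be illustrated with a short sketch. The sample expression and the helper split_apply are assumptions made for illustration; the sketch ignores qualifier elements and other refinements described in Chapter 4.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/1998/Math/MathML}"

# Hypothetical sample: the content-markup form of a + b.
source = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<apply><plus/><ci>a</ci><ci>b</ci></apply>"
    "</math>"
)

def split_apply(apply_element):
    """Split an apply element into (operator, arguments).

    The first child is the operator; the second child is therefore
    the first argument, the third child the second argument, and so on.
    """
    children = list(apply_element)
    if not children:
        raise ValueError("apply requires an operator as its first child")
    return children[0], children[1:]

apply_element = ET.fromstring(source).find(NS + "apply")
operator, arguments = split_apply(apply_element)

print(operator.tag[len(NS):])          # local name of the operator
print([arg.text for arg in arguments]) # text of each argument
```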

2.1.4 MathML and Rendering

MathML presentation elements only suggest (i.e., do not require) specific ways of rendering in order to allow for medium-dependent rendering and for individual preferences of style.

Nevertheless, some parts of this specification describe suggested visual rendering rules in some detail; in those descriptions it is often assumed that the model of rendering used supports the concepts of a well-defined 'current rendering environment' which, in particular, specifies a 'current font', a 'current display' (for pixel size) and a 'current baseline'. The 'current font' provides certain metric properties and an encoding of glyphs.

2.1.5 MathML Attribute Values

MathML elements take attributes with values that further specialize the meaning or effect of the element. Attribute names are shown in a monospaced font. The meaning of each attribute and its allowable values are described, throughout this document, along with the specification of each element. The syntax of allowable values is given in the notation explained in this section.

When otherwise allowed by the specification for each attribute, MathML attribute values may contain any legal characters specified by the XML recommendation. See Chapter 7 Characters, Entities and Fonts for further clarification.

2.1.5.1 Syntax notations used in the MathML specification

To describe the MathML-specific syntax of permissible attribute values, the following conventions and notations are used for most attributes in the present document.

Notation What it matches
decimal-digit a decimal digit from the range U+0030 to U+0039
hexadecimal-digit a hexadecimal (base 16) digit from the range U+0030 to U+0039 or U+0041 to U+0046
unsigned-integer a string of decimal-digits, representing a non-negative integer.
positive-integer a string of decimal-digits, but not consisting solely of "0"s (U+0030), representing a positive integer.
integer a string of decimal digits, optionally starting with '-' (U+002D)
unsigned-number a decimal integer or rational number (a string of digits, with up to one decimal point represented by U+002E), no sign is allowed.
number a decimal integer or rational number, optionally starting with '-' (U+002D)
character a single non-whitespace character
string an arbitrary character string
length a length, as explained below, Section 2.1.5.2 Length Valued Attributes
color a color, as explained below, Section 2.1.5.3 Color Valued Attributes
id an identifier, unique within the document; must satisfy the NAME syntax of the XML recommendation [XML]
idref an identifier referring to another element within the document; must satisfy the NAME syntax of the XML recommendation [XML]
URI a Uniform Resource Identifier, [RFC3986]
italicized word values as explained in the text for each attribute
literal non-italicized words should appear literally in the attribute value
quoted symbol that same symbol, literally present in the attribute value (e.g. "+" or '+')

The `types' described above, except for string, may be combined into composite patterns using the following operators. They are shown in order of precedence, from highest to lowest:

Notation What it matches
( form ) same as form
[ form ] an optional instance of form
form * zero or more instances of form
form + one or more instances of form
f1 f2 ... fn one instance of each form fi, in sequence, perhaps separated by whitespace
f1 | f2 | ... | fn any one of the specified forms fi

When an attribute value is composed of multiple instances of the above types (e.g., (length *)), adjacent values must be separated by whitespace (see Section 2.1.7 Collapsing Whitespace in Input); whitespace is typically not allowed within the value of a single type (e.g., between a - and a number). This separating whitespace is, however, optional in the case of (character *).

Since some applications are inconsistent about normalization of whitespace, for maximum interoperability it is advisable to use only a single whitespace character for separating parts of a value. Moreover, leading and trailing whitespace in attribute values should be avoided.
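
As a minimal sketch of how a processor might split such a composite value, the following function separates parts on runs of the four whitespace characters listed in Section 2.1.7. The function name split_values is an assumption. Python's built-in str.split() is deliberately not used, since it would also split on characters such as U+00A0 that are not whitespace in the MathML sense.

```python
import re

# The four whitespace characters recognized by MathML (Section 2.1.7):
# space, tab, newline and carriage return.
MATHML_WHITESPACE = "\x20\x09\x0A\x0D"
_WHITESPACE_RUN = re.compile("[" + MATHML_WHITESPACE + "]+")

def split_values(attribute_value):
    """Split a whitespace-separated attribute value into its parts.

    Leading and trailing whitespace is tolerated here, although the
    text advises authors to avoid it.
    """
    stripped = attribute_value.strip(MATHML_WHITESPACE)
    if not stripped:
        return []
    return _WHITESPACE_RUN.split(stripped)

print(split_values("  1em   2em\t3em "))
```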

For most numerical attributes, only those in a subset of the expressible values are sensible; values outside this subset are not errors, unless otherwise specified, but rather are rounded up or down (at the discretion of the renderer) to the closest value within the allowed subset. The set of allowed values may depend on the renderer, and is not specified by MathML.

If a numerical value within an attribute value syntax description is declared to allow a minus sign ('-'), e.g., number or integer, it is not a syntax error when one is provided in cases where a negative value is not sensible. Instead, the value should be handled by the processing application as described in the preceding paragraph. An explicit plus sign ('+') is not allowed as part of a numerical value except when it is specifically listed in the syntax (as a quoted '+' or "+"), and its presence can change the meaning of the attribute value (as documented with each attribute which permits it).
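
The numeric notations can be approximated with regular expressions. This is a sketch only: the pattern names are illustrative, and accepting forms such as "1." and ".5" is an assumption, since the table says only that up to one decimal point may appear. The normative definitions are those of the RelaxNG schema.

```python
import re

# ASCII digits only (U+0030 to U+0039); \d is avoided because it
# would also match non-ASCII decimal digits.
UNSIGNED_INTEGER = re.compile(r"[0-9]+")
INTEGER = re.compile(r"-?[0-9]+")
UNSIGNED_NUMBER = re.compile(r"(?:[0-9]+\.?[0-9]*|\.[0-9]+)")
NUMBER = re.compile(r"-?(?:[0-9]+\.?[0-9]*|\.[0-9]+)")

def is_number(text):
    """Return True if text matches the 'number' notation in full."""
    return NUMBER.fullmatch(text) is not None

def is_integer(text):
    """Return True if text matches the 'integer' notation in full."""
    return INTEGER.fullmatch(text) is not None

for candidate in ("-3.5", "+3", "- 3", "1.2.3"):
    print(candidate, is_number(candidate))
```

Only the first candidate is accepted: an explicit plus sign, whitespace after the minus sign, and a second decimal point all fall outside the notation.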

Editorial note: P. Ion  
The presence or not of an explicit + in attribute values is a place we should be in accord with HTML's conventions, in particular HTML5's, if at all possible.

2.1.5.2 Length Valued Attributes

Most presentation elements have attributes that accept values representing lengths to be used for size, spacing or similar properties. The syntax of a length is specified as

Type Syntax
length number | number unit | namedspace

There is no space between the number and unit.

The possible units and namedspaces, along with their interpretations, are shown below. Note that although the units and their meanings are taken from CSS, the syntax of lengths is not identical. A few MathML elements have length attributes that accept additional keywords; these are specified in the description of those specific elements.

When a length is given as a number without a unit it represents a multiple of the default value. Similarly, a trailing "%" represents a percent of the default value. The default value, or how it is obtained, is listed in the table of attributes for each element. (See also Section 2.1.5.4 Default values of attributes.)

In some cases, the range of acceptable values for a particular attribute may be restricted; implementations are free to round up or down to the closest allowable value.

The possible units in MathML are:

Unit Description
em an em (font-relative unit traditionally used for horizontal lengths)
ex an ex (font-relative unit traditionally used for vertical lengths)
px pixels, or size of a pixel in the current display
in inches (1 inch = 2.54 centimeters)
cm centimeters
mm millimeters
pt points (1 point = 1/72 inch)
pc picas (1 pica = 12 points)
% percentage of the default value

Some additional aspects of units are discussed further under Additional Notes, below.

The following constants, namedspaces, may also be used where a length is needed; they are typically used for spacing or padding between tokens: "veryverythinmathspace" (1/18em), "verythinmathspace" (2/18em), "thinmathspace" (3/18em), "mediummathspace" (4/18em), "thickmathspace" (5/18em), "verythickmathspace" (6/18em), "veryverythickmathspace" (7/18em), as well as the negatives "negativeveryverythinmathspace", "negativeverythinmathspace", "negativethinmathspace", "negativemediummathspace", "negativethickmathspace", "negativeverythickmathspace" and "negativeveryverythickmathspace". Suggested default values for these constants are shown above in parentheses; the actual spacing used is implementation specific.
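
The length syntax above can be sketched as a small parser. The function names parse_length and relative_to_default are hypothetical, the namedspace values are the suggested defaults from the text, and treating each negative namedspace as the negation of its positive counterpart is an assumption.

```python
import re

_STEPS = ("veryverythin", "verythin", "thin", "medium",
          "thick", "verythick", "veryverythick")

# Suggested defaults in em, as listed in the text (1/18em up to 7/18em).
# Assumption: each negative namedspace negates its positive counterpart.
NAMEDSPACES = {}
for _index, _step in enumerate(_STEPS, start=1):
    NAMEDSPACES[_step + "mathspace"] = _index / 18
    NAMEDSPACES["negative" + _step + "mathspace"] = -_index / 18

_LENGTH = re.compile(
    r"(-?(?:[0-9]+\.?[0-9]*|\.[0-9]+))"  # number, nothing after '-'
    r"(em|ex|px|in|cm|mm|pt|pc|%)?"      # optional unit, no space before it
)

def parse_length(text):
    """Return (number, unit); unit is None for a bare number."""
    if text in NAMEDSPACES:
        return NAMEDSPACES[text], "em"
    match = _LENGTH.fullmatch(text)
    if match is None:
        raise ValueError("not a MathML length: %r" % (text,))
    return float(match.group(1)), match.group(2)

def relative_to_default(text, default):
    """Resolve a unitless or percentage length against a default value."""
    number, unit = parse_length(text)
    if unit is None:
        return number * default        # multiple of the default
    if unit == "%":
        return number * default / 100  # percentage of the default
    raise ValueError("%r does not depend on the default value" % (text,))

print(parse_length("1.5em"))
print(parse_length("thinmathspace"))
print(relative_to_default("50%", 10.0))
```

Resolving em, ex and px to device units is left out, because it depends on the rendering environment discussed in the notes on units.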

2.1.5.2.1 Additional notes about units

Lengths are only used in MathML for presentation, and presentation will ultimately involve rendering in or on some medium. For visual media, the display context is assumed to have certain properties available to the rendering agent. A px corresponds to a pixel on the display, to the extent that that is meaningful. The resolution of the display device will affect the correspondence of pixels to the units in, cm, mm, pt and pc.

Moreover, the display context will also provide a default for the font size; the parameters of this font determine the initial values used to interpret the units em and ex, and thus indirectly the sizes of namedspaces. Since these units track the display context, and in particular, the user's preferences for display, the relative units em and ex are generally to be preferred over absolute units such as px or cm.

Two additional aspects of relative units must be clarified, however. First, some elements such as Section 3.4 Script and Limit Schemata or mfrac, implicitly switch to smaller font sizes for some of their arguments. Similarly, mstyle can be used to explicitly change the current font size. In such cases, the effective values of an em or ex inside those contexts will be different from those outside. The second point is that the effective value of an em or ex used for an attribute value can be affected by changes to the current font size. Thus, attributes that affect the current font size, such as mathsize and scriptlevel, must be processed before evaluating other length valued attributes.

If, and how, lengths might affect non-visual media is left up to the implementors.

2.1.5.3 Color Valued Attributes

The color, or background color, of presentation elements may be specified as a color using the following syntax:

Type Syntax
color #RGB | #RRGGBB | html-color-name

A color is specified either by "#" followed by hexadecimal values for the red, green, and blue components, with no intervening whitespace, or by an html-color-name. The color components can be either 1-digit or 2-digit, but must all have the same number of digits; each component ranges from 0 (component not present) to FF (component fully present). Note that a 1-digit component is expanded by repeating the digit, so that #123 is a short form for #112233.

Color values can also be specified as an html-color-name, one of the color-name keywords defined in [HTML4] ("aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon", "navy", "olive", "purple", "red", "silver", "teal", "white", and "yellow"). Note that the color name keywords are not case-sensitive, unlike most keywords in MathML attribute values, for compatibility with CSS and HTML.
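
A minimal sketch of a color parser follows. It assumes the CSS convention that a 1-digit component is expanded by repeating the digit, and it accepts lower-case hexadecimal digits as well, which is an assumption since the notation table lists only U+0041 to U+0046. The RGB values for the keywords are those defined by HTML4; the function name parse_color is illustrative.

```python
import re

# The sixteen HTML4 color keywords with their HTML4 RGB values.
HTML4_COLORS = {
    "aqua": "00FFFF", "black": "000000", "blue": "0000FF",
    "fuchsia": "FF00FF", "gray": "808080", "green": "008000",
    "lime": "00FF00", "maroon": "800000", "navy": "000080",
    "olive": "808000", "purple": "800080", "red": "FF0000",
    "silver": "C0C0C0", "teal": "008080", "white": "FFFFFF",
    "yellow": "FFFF00",
}

# "#" followed by exactly three or exactly six hexadecimal digits.
_HEX_COLOR = re.compile(r"#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})")

def parse_color(text):
    """Return (red, green, blue), each an integer from 0 to 255."""
    match = _HEX_COLOR.fullmatch(text)
    if match is not None:
        digits = match.group(1)
    else:
        # Color keywords are not case-sensitive.
        digits = HTML4_COLORS.get(text.lower())
        if digits is None:
            raise ValueError("not a MathML color: %r" % (text,))
    if len(digits) == 3:
        # Assumed CSS-style expansion: #RGB becomes #RRGGBB.
        digits = "".join(digit * 2 for digit in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_color("#123"))
print(parse_color("Red"))
```

The keyword "transparent", which is permitted only for background colors, is not handled here.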

When a color is applied to an element, it is the color in which the content of tokens is rendered. Additionally, when inherited from mstyle or from the environment in which the complete MathML expression is embedded, it controls the color of all other drawing due to MathML elements, including the lines or radical signs that can be drawn in rendering mfrac, mtable, or msqrt.

When used to specify a background color, the keyword "transparent" is also allowed. The suggested MathML visual rendering rules do not define the precise extent of the region whose background is affected by using the background attribute on mstyle, except that, when mstyle's content does not have negative dimensions and its drawing region is not overlapped by other drawing due to surrounding negative spacing, this region should lie behind all the drawing done to render the content of the mstyle, but should not lie behind any of the drawing done to render surrounding expressions. The effect of overlap of drawing regions caused by negative spacing on the extent of the region affected by the background attribute is not defined by these rules.

2.1.5.4 Default values of attributes

Default values for MathML attributes are, in general, given along with the detailed descriptions of specific elements in the text. Default values shown in plain text in the tables of attributes for an element are literal, but when italicized are descriptions of how default values can be computed.

Default values described as inherited are taken from the rendering environment, as described in Section 3.3.4 Style Change <mstyle>, or in some cases (which are described individually) taken from the values of other attributes of surrounding elements, or from certain parts of those values. The value used will always be one which could have been specified explicitly, had it been known; it will never depend on the content or attributes of the same element, only on its environment. (What it means when used may, however, depend on those attributes or the content.)

Default values described as automatic should be computed by a MathML renderer in a way which will produce a high-quality rendering; how to do this is not usually specified by the MathML specification. The value computed will always be one which could have been specified explicitly, had it been known, but it will usually depend on the element content and possibly on the context in which the element is rendered.

Other italicized descriptions of default values which appear in the tables of attributes are explained individually for each attribute.

The single or double quotes which are required around attribute values in an XML start tag are not shown in the tables of attribute value syntax for each element, but are shown around example attribute values in the text.

Note that, in general, there is no value which can be given explicitly for a MathML attribute which will simulate the effect of not specifying the attribute at all for attributes which are inherited or automatic. Giving the words "inherited" or "automatic" explicitly will not work, and is not generally allowed. Furthermore, even for presentation attributes for which a specific default value is documented here, the mstyle element (Section 3.3.4 Style Change <mstyle>) can be used to change this for the elements it contains.

Note also that the defaults being discussed describe the behavior of MathML applications when an attribute is not supplied; they do not indicate a value that will be filled in by the XML parser, as is sometimes done by DTD-based specifications.

2.1.6 Attributes Shared by all MathML Elements

In addition to the attributes described specifically for each element, the following attributes are also allowed on all MathML elements.

Name values default
id id none
Establishes a unique identifier associated with the element to support linking, cross-references and parallel markup. See xref and Section 5.4 Parallel Markup.
xref idref none
References another element within the document. See id and Section 5.4 Parallel Markup.
class string none
Associates the element with a set of style classes for use with [XSLT] and [CSS2]. Typically this would be a space separated sequence of words, but this is not specified by MathML. See Section 6.5 Using CSS with MathML for discussion of the interaction of MathML and CSS.
style string none
Associates style information with the element for use with [XSLT] and [CSS2]. This typically would be an inline CSS style, but this is not specified by MathML. See Section 6.5 Using CSS with MathML for discussion of the interaction of MathML and CSS.
href URI none
Can be used to establish the element as a hyperlink to the specified URI.

Note that MathML 2 had no direct support for linking, and instead followed the W3C Recommendation "XML Linking Language" [XLink] in defining links using the xlink:href attribute. This has changed, and MathML 3 now uses an href attribute. However, particular compound document formats may specify the use of XML Linking with MathML elements, so user agents that support XML Linking should continue to support the use of the xlink:href attribute with MathML 3 as well.

Every MathML element, because of a legacy from MathML 1.0, also accepts the deprecated attribute other (Section 2.3.3 Attributes for unspecified data) which was conceived for passing non-standard attributes without violating the MathML DTD. MathML renderers are only required to process this attribute if they respond to any attributes which are not standard in MathML. However, the use of other is strongly discouraged when there are already alternate ways within MathML of passing specific information.

See also Section 3.2.2 Mathematics style attributes common to token elements for a list of MathML attributes which can be used on most presentation token elements.

2.1.7 Collapsing Whitespace in Input

In MathML, as in XML, "whitespace" means simple spaces, tabs, newlines, or carriage returns, i.e., characters with hexadecimal Unicode codes U+0020, U+0009, U+000A, or U+000D, respectively.

MathML ignores whitespace occurring outside token elements. Non-whitespace characters are not allowed there. Whitespace occurring within the content of token elements is "trimmed" from the ends, i.e., all whitespace at the beginning and end of the content is removed. Whitespace internal to content of MathML elements is "collapsed" canonically, i.e., each sequence of 1 or more whitespace characters is replaced with one space character (U+0020, sometimes called a blank character).

For example, <mo> ( </mo> is equivalent to <mo>(</mo>, and

<mtext>
  Theorem
  1:
</mtext>

is equivalent to <mtext>Theorem 1:</mtext>.

Authors wishing to encode whitespace characters at the start or end of the content of a token, or in sequences other than a single space, without having them ignored, must use &nbsp; or other non-marking characters that are not trimmed. For example, compare

<mtext>
 Theorem
  1:
</mtext>

with

<mtext>
&#xA0;<!--NO-BREAK SPACE-->Theorem &#xA0;<!--NO-BREAK SPACE-->1: 
</mtext> 

When the first example is rendered, there is no whitespace before "Theorem", one space between "Theorem" and "1:", and no whitespace after "1:". In the second example, a single space is rendered before "Theorem", two spaces are rendered before "1:", and there is no whitespace after the "1:".

Note that the xml:space attribute does not apply in this situation since XML processors pass whitespace in tokens to a MathML processor; it is the MathML processing rules which specify that whitespace is trimmed and collapsed.
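
The trimming and collapsing rules for token content can be sketched as follows. The function name normalize_token_content is an assumption, and the two sample strings reproduce the character content of the two mtext examples above after XML parsing, with U+00A0 standing for the no-break space.

```python
import re

# Space, tab, newline and carriage return: the only characters that
# MathML treats as whitespace. U+00A0 is deliberately not included.
MATHML_WHITESPACE = "\x20\x09\x0A\x0D"
_WHITESPACE_RUN = re.compile("[" + MATHML_WHITESPACE + "]+")

def normalize_token_content(text):
    """Trim whitespace at both ends, then collapse each internal run
    of whitespace to a single space (U+0020)."""
    return _WHITESPACE_RUN.sub(" ", text.strip(MATHML_WHITESPACE))

first = "\n Theorem\n  1:\n"
second = "\n\u00A0Theorem \u00A01: \n"

print(repr(normalize_token_content(first)))   # 'Theorem 1:'
print(repr(normalize_token_content(second)))  # '\xa0Theorem \xa01:'
```

The no-break spaces survive because they are not among the four whitespace characters, which is why the second example renders with extra spacing.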

For whitespace occurring outside the content of the token elements mi, mn, mo, ms, mtext, ci, cn and annotation, an mspace element should be used, as opposed to an mtext element containing only "whitespace" entities.

2.2 The Top-Level math Element

MathML specifies a single top-level or root math element, which encapsulates each instance of MathML markup within a document. All other MathML content must be contained in a math element; equivalently, every valid, complete MathML expression must be contained in <math> tags. The math element must always be the outermost element in a MathML expression; it is an error for one math element to contain another. These considerations also apply when sub-expressions are passed between applications, such as for cut-and-paste operations; see Section 6.3 Transferring MathML.

The math element can contain an arbitrary number of children schemata. The children schemata render by default as if they were contained in an mrow element.
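
The rule that one math element may not contain another can be checked mechanically. The following is a sketch under the assumption that the expression has already been parsed with Python's xml.etree.ElementTree; the function name has_nested_math and both sample strings are illustrative.

```python
import xml.etree.ElementTree as ET

MATH = "{http://www.w3.org/1998/Math/MathML}math"

def has_nested_math(math_element):
    """Return True if another math element occurs below math_element."""
    # Element.iter() also yields the starting element, so skip it.
    return any(element is not math_element
               for element in math_element.iter(MATH))

valid = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>x</mi><mo>+</mo><mn>1</mn>"
    "</math>"
)
invalid = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><math><mi>x</mi></math></mrow>"
    "</math>"
)

print(has_nested_math(valid))
print(has_nested_math(invalid))
```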

2.2.1 Attributes

In addition to the attributes specified in Section 2.1.6 Attributes Shared by all MathML Elements, the math element accepts:

Name values default
display block | inline inline
specifies whether the enclosed MathML expression should be rendered as a separate vertical block (in display style) or inline, aligned with adjacent text. When display="block", displaystyle is initialized to "true", whereas display="inline" initializes it to "false"; in both cases scriptlevel is initialized to 0. When this attribute is missing, a rendering agent is free to initialize the state as appropriate to the context. See Section 3.1.6 Displaystyle and Scriptlevel.
dir ltr | rtl ltr
specifies the overall directionality ltr (Left To Right) or rtl (Right To Left) of layout. See Section 3.1.5 Directionality for further discussion.
maxwidth length available width
specifies the maximum width to be used for linebreaking. The default is the maximum width available in the surrounding environment. If that value cannot be determined, the renderer should assume an infinite rendering width.
overflow linebreak | scroll | elide | truncate | scale linebreak
specifies the preferred handling in cases where an expression is too long to fit in the allowed width. See the discussion below.
altimg URI none
provides a URI referring to an image to display as a fall-back for user agents that do not support embedded MathML.
altimg-width length width of altimg
specifies the width to display altimg, scaling the image if necessary; see altimg-height.
altimg-height length height of altimg
specifies the height to display altimg, scaling the image if necessary; if only one of the attributes altimg-width and altimg-height is given, the scaling should preserve the image's aspect ratio; if neither attribute is given, the image should be shown at its natural size.
altimg-valign length 0ex
specifies the vertical alignment of the image. A positive value of valign shifts the bottom of the image below the current baseline, while a negative value raises it above. By default, the bottom of the image aligns to the baseline.
alttext string none
provides a textual alternative as a fall-back for user agents that do not support embedded MathML or images.
cdgroup URI none
The URI specifies a CD group file that acts as a catalogue of CD bases for locating OpenMath content dictionaries of csymbol, annotation, and annotation-xml elements in this math element; see Section 4.2.3 Content Symbols <csymbol>. When no cdgroup attribute is explicitly specified, the document format embedding this math element may provide a method for determining CD bases. Otherwise the system must determine a CD base; in the absence of specific information, http://www.openmath.org/cd is assumed as the CD base for all csymbol, annotation, and annotation-xml elements. This is the CD base for the collection of standard CDs maintained by the OpenMath Society.
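
The scaling rule given for altimg-width and altimg-height in the table above can be sketched as follows. The function name altimg_display_size is hypothetical, and all dimensions are assumed to have been resolved to one common unit (for example pixels) beforehand.

```python
def altimg_display_size(natural_width, natural_height,
                        width=None, height=None):
    """Return the (width, height) at which the fall-back image is shown.

    width and height stand for altimg-width and altimg-height;
    None means that the attribute was not given.
    """
    if width is None and height is None:
        # Neither attribute given: show the image at its natural size.
        return natural_width, natural_height
    if height is None:
        # Only the width given: preserve the aspect ratio.
        return width, width * natural_height / natural_width
    if width is None:
        # Only the height given: preserve the aspect ratio.
        return height * natural_width / natural_height, height
    # Both given: use them as specified, even if this distorts the image.
    return width, height

print(altimg_display_size(200, 100))
print(altimg_display_size(200, 100, width=100))
print(altimg_display_size(200, 100, height=25))
```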

In cases where size negotiation is not possible or fails (for example in the case of an expression that is too long to fit in the allowed width), the overflow attribute is provided to suggest a processing method to the renderer. Allowed values are:

Value Meaning
linebreak The expression will be broken across several lines. See Section 3.1.7 Linebreaking of Expressions for further discussion.
scroll The window provides a viewport into the larger complete display of the mathematical expression. Horizontal or vertical scrollbars are added to the window as necessary to allow the viewport to be moved to a different position.
elide The display is abbreviated by removing enough of it so that the remainder fits into the window. For example, a large polynomial might have the first and last terms displayed with "+ ... +" between them. Advanced renderers may provide a facility to zoom in on elided areas.
truncate The display is abbreviated by simply truncating it at the right and bottom borders. It is recommended that some indication of truncation is made to the viewer.
scale The fonts used to display the mathematical expression are chosen so that the full expression fits in the window. Note that this only happens if the expression is too large. In the case of a window larger than necessary, the expression is shown at its normal size within the larger window.

2.2.2 Deprecated Attributes

The following attributes of math are deprecated:

Name values default
macros URI * none
intended to provide a way of pointing to external macro definition files. Macros are not part of the MathML specification, and much of the desired functionality can be accommodated by XSL transformations [XSLT].
mode display | inline inline
specified whether the enclosed MathML expression should be rendered in a display style or an in-line style. This attribute is deprecated in favor of the display attribute.

2.3 Conformance

Information is nowadays commonly generated, processed and rendered by software tools. The exponential growth of the Web is fueling the development of advanced systems for automatically searching, categorizing, and interconnecting information. In addition, there are increasing numbers of Web services, some of which offer technically based materials and activities. Thus, although MathML can be written by hand and read by humans, whether machine-aided or just with much concentration, the future of MathML is largely tied to the ability to process it with software tools.

There are many different kinds of MathML processors: editors for authoring MathML expressions, translators for converting to and from other encodings, validators for checking MathML expressions, computation engines that evaluate, manipulate or compare MathML expressions, and rendering engines that produce visual, aural or tactile representations of mathematical notation. What it means to support MathML varies widely between applications. For example, the issues that arise with a validating parser are very different from those for an equation editor.

In this section, guidelines are given for describing different types of MathML support, and for making clear the extent of MathML support in a given application. Developers, users and reviewers are encouraged to use these guidelines in characterizing products. The intention behind these guidelines is to facilitate reuse by and interoperability of MathML applications by accurately setting out their capabilities in quantifiable terms.

The W3C Math Working Group maintains MathML Compliance Guidelines. Consult this document for future updates on conformance activities and resources.

Editorial note: P. Ion  
The Compliance Guidelines mentioned above is still that for MathML2 and requires updating.

2.3.1 MathML Conformance

A valid MathML expression is an XML construct determined by the MathML RelaxNG Schema together with the additional requirements given in this specification.

We shall use the phrase "a MathML processor" to mean any application that can accept, produce, or "round-trip" a valid MathML expression. Perhaps the simplest example of an application that round-trips a MathML expression would be an editor that writes a new file even though no modifications are made.

Three forms of MathML conformance are specified:

  1. A MathML-input-conformant processor must accept all valid MathML expressions, and faithfully translate all MathML expressions into application-specific form allowing native application operations to be performed.

  2. A MathML-output-conformant processor must generate valid MathML, faithfully representing all application-specific data.

  3. A MathML-roundtrip-conformant processor must preserve MathML equivalence. Two MathML expressions are "equivalent" if and only if both expressions have the same interpretation (as stated by the MathML Schema and specification) under any circumstances, by any MathML processor. Equivalence on an element-by-element basis is discussed elsewhere in this document.

Beyond the above definitions, the MathML specification makes no demands of individual processors. In order to guide developers, the MathML specification includes advisory material; for example, there are many suggested rendering rules throughout Chapter 3 Presentation Markup. However, in general, developers are given wide latitude in interpreting what kind of MathML implementation is meaningful for their own particular application.

To clarify the difference between conformance and interpretation of what is meaningful, consider some examples:

  1. In order to be MathML-input-conformant, a validating parser needs only to accept expressions, and return "true" for expressions that are valid MathML. In particular, it need not render or interpret the MathML expressions at all.

  2. A MathML computer-algebra interface based on content markup might choose to ignore all presentation markup. Provided the interface accepts all valid MathML expressions including those containing presentation markup, it would be technically correct to characterize the application as MathML-input-conformant.

  3. An equation editor might have an internal data representation that makes it easy to export some equations as MathML but not others. If the editor exports the simple equations as valid MathML, and merely displays an error message to the effect that conversion failed for the others, it is still technically MathML-output-conformant.
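
To make the second example concrete, here is a sketch of the sum x + 1 in each kind of markup. The hypothetical computer-algebra interface would have to accept both forms without error, but only the content form would need to carry meaning for it.

<!-- presentation markup: describes how the sum is laid out -->
<mrow> <mi>x</mi> <mo>+</mo> <mn>1</mn> </mrow>

<!-- content markup: describes the sum as an application of plus -->
<apply> <plus/> <ci>x</ci> <cn>1</cn> </apply>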

2.3.1.1 MathML Test Suite and Validator

As the previous examples show, to be useful, the concept of MathML conformance frequently involves a judgment about what parts of the language are meaningfully implemented, as opposed to parts that are merely processed in a technically correct way with respect to the definitions of conformance. This requires some mechanism for giving a quantitative statement about which parts of MathML are meaningfully implemented by a given application. To this end, the W3C Math Working Group has provided a test suite.

The test suite consists of a large number of MathML expressions categorized by markup category and dominant MathML element being tested. The existence of this test suite makes it possible, for example, to characterize quantitatively the hypothetical computer algebra interface mentioned above by saying that it is a MathML-input-conformant processor which meaningfully implements MathML content markup, including all of the expressions in the content markup section of the test suite.

Developers who choose not to implement parts of the MathML specification in a meaningful way are encouraged to itemize the parts they leave out by referring to specific categories in the test suite.

For MathML-output-conformant processors, information about currently available tools to validate MathML is maintained at MathML validator. Developers of MathML-output-conformant processors are encouraged to verify their output using this validator.

Customers of MathML applications who wish to verify claims as to which parts of the MathML specification are implemented by an application are encouraged to use the test suites as a part of their decision processes.

2.3.1.2 Deprecated MathML 1.x and MathML 2.x Features

MathML 3.0 contains a number of features of earlier MathML which are now deprecated. The following points define what it means for a feature to be deprecated, and clarify the relation between deprecated features and current MathML conformance.

  1. In order to be MathML-output-conformant, authoring tools must not generate MathML markup containing deprecated features.

  2. In order to be MathML-input-conformant, rendering and reading tools must support deprecated features if they are to be in conformance with MathML 1.x or MathML 2.x. They do not have to support deprecated features to be considered in conformance with MathML 3.0. However, all tools are encouraged to support the old forms as much as possible.

  3. In order to be MathML-roundtrip-conformant, a processor need only preserve MathML equivalence on expressions containing no deprecated features.
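
One deprecated feature, sketched here for illustration, is the mode attribute of the math element (see Section 2.2.2 Deprecated Attributes), which has been superseded by the display attribute. Under the points above, an output-conformant authoring tool would emit only the second form, while a reading tool is encouraged, though not required by MathML 3.0, to accept the first as well.

<!-- deprecated form: should not be generated -->
<math xmlns="http://www.w3.org/1998/Math/MathML" mode="display">
  <mi>x</mi>
</math>

<!-- current form -->
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mi>x</mi>
</math>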

2.3.1.3 MathML Extension Mechanisms and Conformance

MathML 3.0 defines three basic extension mechanisms: the mglyph element provides a way of displaying glyphs for non-Unicode characters, and glyph variants for existing Unicode characters; the maction element uses attributes from other namespaces to obtain implementation-specific parameters; and content markup makes use of the definitionURL attribute, as well as Content Dictionaries and the cd attribute, to point to external definitions of mathematical semantics.

These extension mechanisms are important because they provide a way of encoding concepts that are beyond the scope of MathML 3.0 as presently explicitly specified, which allows MathML to be used for exploring new ideas not yet susceptible to standardization. However, as new ideas take hold, they may become part of future standards. For example, an emerging character that must be represented by an mglyph element today may be assigned a Unicode codepoint in the future. At that time, representing the character directly by its Unicode codepoint would be preferable. This transition into Unicode has already taken place for hundreds of characters used for mathematics.

Because the possibility of future obsolescence is inherent in the use of extension mechanisms to facilitate the discussion of new ideas, MathML can reasonably make no conformance requirements concerning the use of extension mechanisms, even when alternative standard markup is available. For example, using an mglyph element to represent an 'x' is permitted. However, authors and implementors are strongly encouraged to use standard markup whenever possible. Similarly, maintainers of documents employing MathML 3.0 extension mechanisms are encouraged to monitor relevant standards activity (e.g., Unicode, OpenMath, etc.) and to update documents as more standardized markup becomes available.
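
The 'x' case can be sketched as follows. The image file name x-glyph.png is hypothetical and stands in for whatever glyph resource an author might supply. Both forms are permitted, but the second is the standard markup that authors are encouraged to use.

<!-- permitted, but discouraged when a standard character exists -->
<mi><mglyph src="x-glyph.png" alt="x"/></mi>

<!-- preferred: the character itself -->
<mi>x</mi>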

2.3.2 Handling of Errors

If a MathML-input-conformant application receives input containing one or more elements with an illegal number or type of attributes or child schemata, it should nonetheless attempt to render all the input in an intelligible way, i.e., to render normally those parts of the input that were valid, and to render error messages (rendered as if enclosed in an merror element) in place of invalid expressions.
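
As a sketch of this behavior, suppose a renderer receives an mfrac element with three children, where exactly two are expected. It might render the surrounding valid markup normally and treat the invalid fraction as if the input had contained an merror element in its place. The wording of the error message below is an assumption chosen for illustration; the specification does not prescribe it.

<!-- input: the mfrac has an illegal number of children -->
<mrow>
  <mi>y</mi>
  <mo>=</mo>
  <mfrac> <mn>1</mn> <mn>2</mn> <mn>3</mn> </mfrac>
</mrow>

<!-- rendered as if the input had been -->
<mrow>
  <mi>y</mi>
  <mo>=</mo>
  <merror>
    <mtext>mfrac requires exactly 2 arguments, but 3 were given</mtext>
  </merror>
</mrow>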

MathML-output-conformant applications such as editors and translators may choose to generate merror expressions to signal errors in their input. This is usually preferable to generating valid, but possibly erroneous, MathML.

2.3.3 Attributes for unspecified data

The MathML attributes described in the MathML specification are necessary for presentation and content markup. Ideally, the MathML attributes should be an open-ended list so that users can add specific attributes for specific renderers. However, this cannot be done within the confines of a single XML DTD or in a Schema. Although it can be done by, say, extending the standard DTD, some authors will wish to use non-standard attributes to take advantage of renderer-specific capabilities while remaining strictly in conformance with the standard DTD.

To allow this, the MathML 1.0 specification [MathML1] allowed the attribute other on all elements, for use as a hook to pass on renderer-specific information. In particular, it was intended as a hook for passing information to audio renderers, computer algebra systems, and for pattern matching in future macro/extension mechanisms. The motivation for this approach to the problem was historical, looking to PostScript, for example, where comments are widely used to pass information that is not part of PostScript.

In the next period of MathML's evolution, the development of a general XML namespace mechanism seemed to make the other attribute obsolete. In MathML 2.0, the other attribute is deprecated in favor of the use of namespace prefixes to identify non-MathML attributes. The other attribute remains deprecated in MathML 3.0.

For example, in MathML 1.0, it was recommended that if additional information was used in a renderer-specific implementation for the maction element (Section 3.7.1 Bind Action to Sub-Expression <maction>), that information should be passed in using the other attribute:

<maction actiontype="highlight" other="color='#ff0000'"> expression </maction>

From MathML 2.0 onwards, a color attribute from another namespace would be used:

<body xmlns:my="http://www.example.com/MathML/extensions">
...
<maction actiontype="highlight" my:color="#ff0000"> expression </maction>
...
</body>

Note that the intent of allowing non-standard attributes is not to encourage software developers to use this as a loophole for circumventing the core conventions for MathML markup. Authors and applications should use non-standard attributes judiciously.
