Because MathML is typically embedded in a wider context, it is important to describe the conditions that processors should acknowledge in order to recognize XML fragments as MathML. This chapter describes the fundamental mechanisms to recognize and transfer MathML markup fragments within a larger environment such as an XML document or a desktop file system. It then raises the issues of combining external markup within MathML, and indicates how cascading style sheets can be used within MathML.
This chapter applies to both content and presentation MathML and indicates a particular processing model for the semantics, annotation and annotation-xml elements defined in Section 5.3 Semantic Annotations beyond Alternate Representations.
Within an XML document supporting namespaces (TODO: cite xmlns and xml specs), the preferred method to recognize MathML markup is by the identification of the math element in the appropriate namespace, i.e. that of URI http://www.w3.org/1998/Math/MathML.
This is the recommended method to embed MathML within [XHTML] documents. Some user agents' setup may require supplementary information to be available, such as the Microsoft behaviour specification (TODO: quote) used in the MathType browser-extension (TODO:quote).
Markup-language specifications that wish to embed MathML may provide special conditions independent of this recommendation. The conditions should be equivalent and the elements' local-names should remain the same.
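The namespace-based recognition described above can be sketched as follows. This is a minimal illustration, assuming Python's standard xml.etree.ElementTree parser; the function name find_math_elements and the sample document are hypothetical and not part of this specification.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def find_math_elements(document: str) -> list:
    """Return every math element in the MathML namespace, in document order."""
    root = ET.fromstring(document)
    # ElementTree spells a qualified name as "{namespace-uri}local-name",
    # so the prefix chosen by the author (none, mml:, ...) is irrelevant.
    return list(root.iter("{" + MATHML_NS + "}math"))

# Hypothetical host document: two MathML islands and one look-alike element.
xhtml = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <body>
    <p><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math></p>
    <p><mml:math><mml:mn>1</mml:mn></mml:math></p>
    <p><math>not in the MathML namespace</math></p>
  </body>
</html>"""

found = find_math_elements(xhtml)
print(len(found))
```

The third math element inherits the XHTML default namespace, so it is not recognized as MathML; only the namespace URI and the local name are compared, never the prefix.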
Although rendering MathML expressions often occurs in place in a Web browser, other MathML processing functions take place more naturally in other applications. Particularly common tasks include opening a MathML expression in an equation editor or computer algebra system. It is important therefore to specify the encoding-names that MathML fragments should be called with:
MIME types [RFC2045], [RFC2046] offer a strategy that can be used in current user agents to invoke a MathML processor. This is primarily useful when referencing separate files containing MathML markup from an embed or object element, or within a desktop environment. (TODO: check that this still applies)
[RFC3023] assigns MathML the MIME type application/mathml+xml, which is the official MIME type.
The W3C Math Working Group recommends the standard file extension .mml within a registry associating file formats to file extensions.
In MathML 1.0, text/mathml was given as the suggested MIME type. This has been superseded by RFC3023.
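As an illustration of such a registry, the sketch below registers the association between .mml and application/mathml+xml in Python's standard mimetypes module. The helper names is_mathml_file and canonical_mime are hypothetical, and the registration only affects the running process, not the desktop environment.

```python
import mimetypes

MATHML_MIME = "application/mathml+xml"   # MIME type assigned by RFC3023
MATHML_EXT = ".mml"                      # recommended file extension
LEGACY_MIME = "text/mathml"              # MathML 1.0 suggestion, superseded

# Register the association locally; this only affects the current process.
mimetypes.add_type(MATHML_MIME, MATHML_EXT)

def is_mathml_file(filename: str) -> bool:
    """True when the file name maps to the MathML MIME type."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed == MATHML_MIME

def canonical_mime(mime: str) -> str:
    """Map the superseded MathML 1.0 type onto the current one."""
    return MATHML_MIME if mime.lower() == LEGACY_MIME else mime

print(is_mathml_file("formula.mml"))
print(canonical_mime("text/mathml"))
```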
In the next section, alternate encoding names are provided for the purposes of
desktop transfers.
Issue specify-encoding-names-in-details | wiki (member only) |
---|---|
Encoding Names jungle | |
Encoding names are specified in the section below, and are described
in the chapter 5 as attribute values of the It might be worth trying to homogenize our list and maybe specify mime-type equivalence for each encoding names. |
|
Resolution | None recorded |
MathML expressions are often exchanged between applications using the familiar copy-and-paste or drag-and-drop paradigms. This section provides recommended ways to process MathML while applying these paradigms.
Applying them will transfer MathML fragments between the contexts of two applications by making them available in several flavors, often called clipboard formats or data flavors. The copy-and-paste paradigm lets an application place content in a central clipboard, one data stream per clipboard format; consuming applications negotiate by choosing to read the data of the format they elect. The drag-and-drop paradigm lets an application offer content by declaring the available formats; potential recipients accept or reject a drop based on this list, and the drop action then lets the receiving application request delivery of the data in the format it selected. The list of flavors is generally ordered, going from the most preferred to the least preferred flavor.
Current desktop platforms offer both of these transfer paradigms using similar transfer architectures. In this section we specify what applications should provide as transfer flavors, how they should be named, and how they should handle the special semantics, annotation, and annotation-xml elements.
To summarize the two negotiation mechanisms, we shall, here, be talking of flavors, each having a name (a character string) and a content (a stream of binary data), which are exported.
MathML contains two distinct vocabularies: one for encoding mathematical semantics called Chapter 4 Content Markup and one for encoding visual presentation called Chapter 3 Presentation Markup. Some MathML-aware applications import and export only one of these vocabularies, while others may be capable of producing and consuming both. Consequently, we propose three distinct MathML flavors:
Flavor Name | Description |
---|---|
Content MathML | Instance contains content MathML markup only |
Presentation MathML | Instance contains presentation MathML markup only |
MathML | Any well-formed MathML instance: presentation markup, content markup, or a mixture of the two is allowed |
Note that Content MathML, Presentation MathML and MathML are the exact strings that should be used to describe the flavors described above.
On operating systems that allow it, applications should register these names (e.g. with Windows' RegisterClipboardFormat).
When transferring MathML, for example when placing it within a clipboard, an application MUST ensure the content is a well-formed XML instance of a MathML schema. Specifically:
The instance MUST begin with an XML declaration, e.g. <?xml version="1.0"?>
The instance MUST contain exactly one root math element.
Since MathML is frequently embedded within other XML document types, the instance MUST declare the MathML namespace on the root math element. In addition, the instance SHOULD use a schemaLocation attribute on the math element to indicate the location of MathML schema documents against which the instance is valid. Note that the presence of the schemaLocation attribute does not require a consumer of the MathML instance to obtain or use the cited schema documents.
The instance MUST use numeric character references (e.g. &#x03B1;) rather than character entity names (e.g. &alpha;) for greater interoperability.
The character encoding for the instance MUST be either specified in the XML header, UTF-16, or UTF-8. UTF-16-encoded data MUST begin with a byte-order mark (BOM). If no BOM or encoding is given, the character encoding will be assumed to be UTF-8.
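The requirements listed above lend themselves to a mechanical check before data is placed on a clipboard. The following is a minimal sketch, assuming Python's standard library; the function name transfer_violations is hypothetical, encodings declared in the XML header other than UTF-8 and UTF-16 are not handled, and the schemaLocation recommendation is not checked.

```python
import re
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}  # always available in XML

def transfer_violations(data: bytes) -> list:
    """Return a list of human-readable problems; empty means no problem found."""
    problems = []
    try:
        # UTF-16 data must start with a BOM; anything else is read as UTF-8.
        if data.startswith((b"\xff\xfe", b"\xfe\xff")):
            text = data.decode("utf-16")
        else:
            text = data.decode("utf-8-sig")
    except UnicodeDecodeError:
        return ["cannot decode as UTF-8 or UTF-16 (other encodings not handled here)"]
    if not text.startswith("<?xml"):
        problems.append("missing XML declaration")
    named = set(re.findall(r"&([A-Za-z][A-Za-z0-9]*);", text)) - PREDEFINED
    if named:
        problems.append("named entities used: " + ", ".join(sorted(named)))
        return problems  # the parser would reject undefined entities anyway
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        problems.append("not well-formed: %s" % err)
        return problems
    # A well-formed document has exactly one root; it must be MathML's math.
    if root.tag != "{" + MATHML_NS + "}math":
        problems.append("root element is not math in the MathML namespace")
    return problems

good = b'<?xml version="1.0"?><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>&#x3B1;</mi></math>'
print(transfer_violations(good))
```

The entity check is a plain text scan and is therefore conservative: it would also flag an entity-like string inside a comment or CDATA section.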
Applications that transfer MathML SHOULD adhere to the following conventions:
Applications that have pure presentation markup and/or pure content markup versions of an expression SHOULD offer as many of these two flavors as are available.
When both presentation and content are exported, recipients should consider it equivalent to a single MathML instance in which presentation and content are combined at the top level using MathML's semantics element (see Section 5.5.1 Top-level Parallel Markup). (TODO: issue: in DnD you can't read several, at least in java) The order between flavors determines whether presentation wraps content, or vice-versa. Usually, Presentation MathML should be offered first so that it wraps the Content MathML.
When an application has a mixed presentation and content version in addition to pure presentation and/or content versions, it should export the mixed version after the pure presentation and/or content markup versions, and mark it as the generic MathML flavor.
When an application cannot produce pure presentation and/or content markup versions, or cannot determine whether MathML data is pure presentation or content markup (e.g. data being passed through from a third application), it should export only one version marked as the generic MathML flavor.
An application that only has pure presentation and/or content markup versions of an expression available SHOULD NOT export a second copy of the data marked as the generic MathML flavor.
When an application exports a MathML fragment whose root element is a semantics element, it SHOULD offer, after the flavors above, a flavor for each annotation or annotation-xml element: the flavor name should be given by the encoding attribute value, and the content should be the child text in UTF-8 (if the annotation element contains only textual data), a valid XML fragment (if the annotation-xml element contains children), or the data resulting from requesting the URL given by the href attribute.
As a final fallback, applications SHOULD export a version of the data in plain-text flavor (such as CF_UNICODETEXT, UnicodeText, NSStringPboardType, text/plain, ...). When an application has multiple versions of an expression available, it may choose the version to export as text at its discretion. Since some older MathML-aware programs expect MathML instances transferred as text to begin with a math element, the text version should generally omit the XML declaration, DOCTYPE declaration and other XML prolog material before the math element. Similarly, the BOM should be omitted for Unicode text encoded as UTF-16. Note that the Unicode text version of the data should always be the last flavor exported, following the principle that exported flavors should be ordered with the most specific flavor first and the least specific flavor last.
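The ordering conventions above can be summarized in a short sketch. It assumes that the application already knows which versions it holds; the function names ordered_flavors and text_fallback are hypothetical, "Unicode Text" stands in for the platform-specific plain-text format name, and the per-annotation flavors (which would come between the MathML flavors and the text fallback) are left out.

```python
import re
from typing import Optional

def text_fallback(instance: str) -> str:
    """Drop everything before the math element (XML declaration, DOCTYPE, ...)."""
    match = re.search(r"<(?:[A-Za-z_][\w.-]*:)?math[\s/>]", instance)
    return instance[match.start():] if match else instance

def ordered_flavors(presentation: Optional[str] = None,
                    content: Optional[str] = None,
                    mixed: Optional[str] = None) -> list:
    """Return (flavor name, data) pairs, most specific flavor first."""
    flavors = []
    if presentation is not None:      # offered first so that it wraps content
        flavors.append(("Presentation MathML", presentation))
    if content is not None:
        flavors.append(("Content MathML", content))
    if mixed is not None:             # mixed or unclassified data goes after
        flavors.append(("MathML", mixed))
    if not flavors:
        return flavors
    # Plain text is always last. Which version to use is at the application's
    # discretion; this sketch simply reuses the first flavor offered.
    flavors.append(("Unicode Text", text_fallback(flavors[0][1])))
    return flavors

ns = 'xmlns="http://www.w3.org/1998/Math/MathML"'
pres = '<?xml version="1.0"?><math %s><mi>x</mi></math>' % ns
cont = '<?xml version="1.0"?><math %s><ci>x</ci></math>' % ns
for name, data in ordered_flavors(presentation=pres, content=cont):
    print(name, "|", data)
```

An application that cannot classify its data would pass it as mixed only, which yields the generic MathML flavor followed by the text fallback.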
For purposes of determining whether a MathML instance is pure content markup or pure presentation markup, the math element and the semantics, annotation and annotation-xml elements should be regarded as belonging to both the presentation and content markup vocabularies. This is obvious for the root math element which is required for all MathML expressions. However, the semantics element and its child annotation elements comprise an arbitrary annotation mechanism within MathML, and are not tied to either presentation or content markup. Consequently, applications consuming MathML should always process these four elements even if the application only implements one of the two vocabularies.
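A classification along these lines might be sketched as follows. The list of presentation elements is abbreviated and given only for illustration (Chapter 3 is the reference), every other MathML element is assumed to be content markup, and the sketch does not look inside annotation or annotation-xml elements, which is a design assumption rather than a rule stated here. The function name classify is hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# The four elements regarded as belonging to both vocabularies.
NEUTRAL = {"math", "semantics", "annotation", "annotation-xml"}
# Abbreviated, illustrative list of presentation elements (assumption).
PRESENTATION = {
    "mi", "mn", "mo", "mtext", "mspace", "ms", "mglyph", "mrow", "mfrac",
    "msqrt", "mroot", "mstyle", "merror", "mpadded", "mphantom", "mfenced",
    "menclose", "msub", "msup", "msubsup", "munder", "mover", "munderover",
    "mmultiscripts", "mtable", "mlabeledtr", "mtr", "mtd", "maligngroup",
    "malignmark", "maction", "mprescripts", "none",
}

def classify(instance: str) -> str:
    """Return the flavor name that best describes a MathML instance."""
    seen = set()

    def walk(element):
        namespace, _, local = element.tag.rpartition("}")
        if namespace != "{" + MATHML_NS:
            return                      # foreign markup is ignored
        if local in ("annotation", "annotation-xml"):
            return                      # annotation payloads are not inspected
        if local not in NEUTRAL:
            seen.add("presentation" if local in PRESENTATION else "content")
        for child in element:
            walk(child)

    walk(ET.fromstring(instance))
    if seen == {"presentation"}:
        return "Presentation MathML"
    if seen == {"content"}:
        return "Content MathML"
    return "MathML"                     # mixed, empty, or undetermined

ns = 'xmlns="http://www.w3.org/1998/Math/MathML"'
print(classify('<math %s><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>' % ns))
```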
It is worth noting that the above recommendations allow agents producing MathML to provide binary data for the clipboard, for example as an image or an application-specific format. The sole method to do so is to reference the binary data by the href attribute, since XML child text does not allow arbitrary byte streams.
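The derivation of one flavor per annotation can be sketched as below. This is an illustration only: the function name annotation_flavors and the three-part result (flavor name, kind, payload) are hypothetical, and href values are returned unresolved, leaving the retrieval of the referenced data to the caller. Note that ElementTree may rewrite namespace prefixes when serializing the children of annotation-xml; the result is namespace-equivalent but not byte-identical to the source.

```python
import xml.etree.ElementTree as ET

M = "{http://www.w3.org/1998/Math/MathML}"

def annotation_flavors(instance: str) -> list:
    """Return (flavor name, kind, payload) for each annotation of a
    top-level semantics element, in document order."""
    root = ET.fromstring(instance)
    semantics = root if root.tag == M + "semantics" else root.find(M + "semantics")
    if semantics is None:
        return []
    flavors = []
    for child in semantics:
        if child.tag not in (M + "annotation", M + "annotation-xml"):
            continue
        name = child.get("encoding")      # the flavor name
        if name is None:
            continue
        href = child.get("href")
        if href is not None:
            # Binary or external data: the caller must request this URL.
            flavors.append((name, "href", href))
        elif child.tag == M + "annotation":
            flavors.append((name, "text", (child.text or "").encode("utf-8")))
        else:
            xml = "".join(ET.tostring(g, encoding="unicode") for g in child)
            flavors.append((name, "xml", xml.encode("utf-8")))
    return flavors

fragment = r"""<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mn>1</mn><mo>/</mo><mi>x</mi></mrow>
    <annotation encoding="TeX">{1 \over x}</annotation>
    <annotation encoding="image/png" href="formula3848.png"/>
  </semantics>
</math>"""
for flavor in annotation_flavors(fragment):
    print(flavor)
```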
While the above recommendations are intended to improve interoperability between MathML-aware applications utilizing the transfer flavors, it should be noted that they do not guarantee interoperability. For example, references to external resources (e.g. stylesheets, etc.) in MathML data can also cause interoperability problems if the consumer of the data is unable to locate them, just as can happen when cutting and pasting HTML or many other data types. Applications that make use of references to external resources are encouraged to make users aware of potential problems and provide alternate ways for obtaining the referenced resources. In general, consumers of MathML data containing references they cannot resolve or do not understand should ignore them.
An e-Learning application has a database of quiz questions, some of which contain MathML. The MathML comes from multiple sources, and the e-Learning application merely passes the data on for display, but does not have sophisticated MathML analysis capabilities. Consequently, the application is not aware whether a given MathML instance is pure presentation or pure content markup, nor does it know whether the instance is valid with respect to a particular version of the MathML schema. It therefore places the following data formats on the clipboard:
Flavor Name | Flavor Content |
---|---|
MathML | <?xml version="1.0"?> <math xmlns="http://www.w3.org/1998/Math/MathML">...</math> |
Unicode Text | <math xmlns="http://www.w3.org/1998/Math/MathML">...</math> |
An equation editor is able to generate pure presentation markup, valid with respect to MathML 2.0, 2nd Edition. Consequently, it exports the following flavors:
Flavor Name | Flavor Content |
---|---|
Presentation MathML | <?xml version="1.0"?> <math xmlns="http://www.w3.org/1998/Math/MathML">...</math> |
Tiff | (a rendering sample) |
Unicode Text | <math xmlns="http://www.w3.org/1998/Math/MathML">...</math> |
A schema-based content management system contains multiple MathML representations of a collection of mathematical expressions, including mixed markup from authors, pure content markup for interfacing to symbolic computation engines, and pure presentation markup for print publication. Due to the system's use of schemas, markup is stored with a namespace prefix. The system therefore can transfer the following data:
Flavor Name | Flavor Content |
---|---|
Presentation MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> ... </mml:mrow> </mml:math> |
Content MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:apply> ... </mml:apply> </mml:math> |
MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> <mml:apply> ... content markup within presentation markup ... </mml:apply> ... </mml:mrow> </mml:math> |
TeX | {x \over x-1} |
Unicode Text | <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> ... </mml:mrow> </mml:math> |
A similar content management system is web-based and delivers MathML representations of mathematical expressions. The system is able to produce presentation MathML, content MathML, TeX and pictures in PNG format. In web pages being browsed, it could produce a MathML fragment such as the following:
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <mml:semantics>
    <mml:mrow>...</mml:mrow>
    <mml:annotation-xml encoding="MathML content">...</mml:annotation-xml>
    <mml:annotation encoding="TeX">{1 \over x}</mml:annotation>
    <mml:annotation encoding="image/png" href="formula3848.png"/>
  </mml:semantics>
</mml:math>
A web browser that receives such a fragment and tries to export it as part of a drag-and-drop action can offer the following flavors:
Flavor Name | Flavor Content |
---|---|
Presentation MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> ... </mml:mrow> </mml:math> |
Content MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:apply> ... </mml:apply> </mml:math> |
MathML | <?xml version="1.0"?> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> <mml:apply> ... content markup within presentation markup ... </mml:apply> ... </mml:mrow> </mml:math> |
TeX | {1 \over x} |
image/png | (the content of the picture file, requested from formula3848.png) |
Unicode Text | <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mml:mrow> ... </mml:mrow> </mml:math> |
Since MathML is most often generated by authoring tools, it is particularly important that opening a MathML expression in an editor should be easy to do and to implement. In many cases, it will be desirable for an authoring tool to record some information about its internal state along with a MathML expression, so that an author can pick up editing where he or she left off. The following markup is proposed:
For any extra information that is expected to be semantically equivalent, MathML 3 proposes the usage of the semantics element presented in Section 5.3 Semantic Annotations beyond Alternate Representations.
For any extra information that cannot be declared as such and is expected to be private to the application, MathML 3 suggests using the maction element, see Section 3.6.1 Bind Action to Sub-Expression (maction).
In order to fully integrate MathML into XHTML, it should be possible not only to embed MathML in XHTML, as described in Section 7.1.1 Recognizing MathML in an XML Model, but also to embed XHTML in MathML. However, the problem of supporting XHTML in MathML presents many difficulties. Therefore, at present, the MathML specification does not permit any XHTML elements within a MathML expression, although this may be subject to change in a future revision of MathML.
In most cases, XHTML elements (headings, paragraphs, lists, etc.) either do not apply in mathematical contexts, or MathML already provides equivalent or better functionality specifically tailored to mathematical content (tables, mathematics style changes, etc.). However, there are two notable exceptions, the XHTML anchor and image elements. For this functionality, MathML relies on the general XML linking and graphics mechanisms being developed by other W3C Activities.
Issue Linking-and-marking-ids | wiki (member only) |
---|---|
Linking and Marking IDs | |
We wish to stop using xlink for links since it seems unimplemented and add the necessary attributes at presentation elements. |
|
Resolution | None recorded |
MathML has no element that corresponds to the XHTML anchor element a. In XHTML, anchors are used both to make links, and to provide locations to which a link can be made. MathML, as an XML application, defines links by the use of the mechanism described in the W3C Recommendation "XML Linking Language" [XLink].
A MathML element is designated as a link by the presence of the attribute xlink:href. To use the attribute xlink:href, it is also necessary to declare the appropriate namespace. Thus, a typical MathML link might look like:
<mrow xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="sample.xml"> ... </mrow>
MathML designates that almost all elements can be used as XML linking elements. The only elements that cannot serve as linking elements are those which exist primarily to disambiguate other MathML constructs and in general do not correspond to any part of a typical visual rendering. The full list of exceptional elements that cannot be used as linking elements is given in the table below.
MathML elements that cannot be linking elements | |
---|---|
mprescripts | none |
malignmark | maligngroup |
Note that the XML Linking [XLink] and XML Pointer Language [XPointer] specifications also define how to link into a MathML expression. Be aware, however, that such links may or may not be properly interpreted in current software.
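A processor that wants to collect the links of an expression, while honoring the exceptions in the table above, might proceed as in the following sketch. It assumes Python's standard xml.etree.ElementTree; the function name collect_links and the sample expression are hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
# Elements that cannot serve as linking elements (see the table above).
NOT_LINKABLE = {"mprescripts", "none", "malignmark", "maligngroup"}

def collect_links(instance: str) -> list:
    """Return (element local name, link target) pairs in document order."""
    links = []
    for element in ET.fromstring(instance).iter():
        namespace, _, local = element.tag.rpartition("}")
        href = element.get(XLINK_HREF)
        if href is None or namespace != "{" + MATHML_NS:
            continue
        if local in NOT_LINKABLE:
            continue                    # xlink:href is ignored on these
        links.append((local, href))
    return links

sample = """<math xmlns="http://www.w3.org/1998/Math/MathML"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mrow xlink:href="sample.xml"><mi>x</mi></mrow>
  <mmultiscripts>
    <mi>F</mi><none xlink:href="ignored.xml"/><mn>1</mn>
  </mmultiscripts>
</math>"""
print(collect_links(sample))
```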
The img element has no MathML equivalent. The decision to omit a general mechanism for image inclusion from MathML was based on several factors. However, the main reason for not providing an image facility is that MathML takes great pains to make the notational structure and mathematical content it encodes easily available to processors, whereas information contained in images is only available to a human reader looking at a visual representation. Thus, for example, in the MathML paradigm, it would be preferable to introduce new glyphs via the mglyph element which at a minimum identifies them as glyphs, rather than simply including them as images.
Apart from the introduction of new glyphs, many of the situations where one might be inclined to use an image amount to displaying labeled diagrams. For example, knot diagrams, Venn diagrams, Dynkin diagrams, Feynman diagrams and commutative diagrams all fall into this category. As such, their content would be better encoded via some combination of structured graphics and MathML markup. However, at the time of this writing, it is beyond the scope of the W3C Math Activity to define a markup language to encode such a general concept as "labeled diagrams." (See http://www.w3.org/Math for current W3C activity in mathematics and http://www.w3.org/Graphics for the W3C graphics activity.)
One mechanism for embedding additional graphical content is via the semantics element, as in the following example:
<semantics>
  <apply>
    <intersect/>
    <ci>A</ci>
    <ci>B</ci>
  </apply>
  <annotation-xml encoding="SVG1.1">
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 290 180">
      <clipPath id="a">
        <circle cy="90" cx="100" r="60"/>
      </clipPath>
      <circle fill="#AAAAAA" cy="90" cx="190" r="60" style="clip-path:url(#a)"/>
      <circle stroke="black" fill="none" cy="90" cx="100" r="60"/>
      <circle stroke="black" fill="none" cy="90" cx="190" r="60"/>
    </svg>
  </annotation-xml>
  <annotation-xml encoding="application/xhtml+xml">
    <img xmlns="http://www.w3.org/1999/xhtml" src="intersect.gif" alt="A intersect B"/>
  </annotation-xml>
</semantics>
Here, the annotation-xml elements are used to indicate alternative representations of the Content MathML depiction of the intersection of two sets. The first one is in the "Scalable Vector Graphics" format [SVG1.1] (see [XHTML-MathML-SVG] for the definition of an XHTML profile integrating MathML and SVG); the second one uses the XHTML img element embedded as an XHTML fragment. In this situation, a MathML processor can use any of these representations for display, perhaps producing a graphical format such as the image below.
Note that the semantics representation of this example is given in the Content MathML markup, as the first child of the semantics element. In this regard, it is the representation most analogous to the alt attribute of the img element in XHTML, and would likely be the best choice for non-visual rendering.
When MathML is rendered in an environment that supports [CSS21], controlling mathematics style properties with a CSS stylesheet is obviously desirable. MathML 2.0 has significantly redesigned the way presentation element style properties are organized to facilitate better interaction between MathML renderers and CSS style mechanisms. It introduces four new mathematics style attributes with logical values. Roughly speaking, these attributes can be viewed as the proper selectors for CSS rules that affect MathML.
Controlling mathematics styling is not as simple as it might first appear because mathematics styling and text styling are quite different in character. In text, meaning is primarily carried by the relative positioning of characters next to one another to form words. Thus, although the font used to render text may impart nuances to the meaning, transforming the typographic properties of the individual characters leaves the meaning of text basically intact. By contrast, in mathematical expressions, individual characters in specific typefaces tend to function as atomic symbols. Thus, in the same equation, a bold italic 'x' and a normal italic 'x' are almost always intended to be two distinct symbols that mean different things. In traditional usage, there are eight basic typographical categories of symbols. These categories are described by mathematics style attributes, primarily the mathvariant attribute.
Text and mathematics layout also obviously differ in that mathematics uses 2-dimensional layout. As a result, many of the style parameters that affect mathematics layout have no textual analogs. Even in cases where there are analogous properties, the sensible values for these properties may not correspond. For example, traditional mathematical typography usually uses italic fonts for single-character identifiers, and upright fonts for multicharacter identifiers. In text, italicization does not usually depend on the number of letters in a word. Thus although a font-slant property makes sense for both mathematics and text, the natural default values are quite different.
Because of the difference between text and mathematics styling, only the styling aspects that do not affect layout are good candidates for CSS control. MathML 3.0 captures the most important properties with the new mathematics style attributes, and users should try to use them whenever possible over more direct, but less robust, approaches. A sample CSS stylesheet illustrating the use of the mathematical style attributes is available in Appendix C Sample CSS Style Sheet for MathML. Users should not count on MathML implementations to implement any other properties than those in the Font, Colors, and Outlines families of properties described in [CSS2] and implementations should only implement these properties within MathML-elements. Note that these prohibitions do not apply to CSS stylesheets that implement the MathML-CSS profile. (TODO: quote).
TODO: add equivalence statements and conflict resolution and stress that CSS changes should not be considered meaningful.
Generally speaking, the model for CSS interaction with the math style attributes runs as follows. A CSS style sheet might provide a style rule such as:
math *[mathsize="small"] { font-size: 80% }
This rule sets the CSS font-size property for all descendants of the math element that have the mathsize attribute set to small. A MathML renderer would then query the style engine for the CSS environment, and use the values returned as input to its own layout algorithms. MathML does not specify the mechanism by which style information is inherited from the environment. However, some suggested rendering rules for the interaction between properties of the ambient style environment and MathML-specific rendering rules are discussed in Section 3.2.2 Mathematics style attributes common to token elements, and more generally throughout Chapter 3 Presentation Markup.
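To make the scope of such a rule concrete, the sketch below emulates which elements a descendant selector with the attribute test mathsize="small" would select inside a math element. It is not a CSS engine: the function name selected_by_small_rule is hypothetical, the input is assumed to have math as its root element, and the match is restricted to the MathML namespace, which a namespace-unaware CSS selector would not do.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def selected_by_small_rule(instance: str) -> list:
    """Local names of the descendants of math carrying mathsize="small"."""
    math = ET.fromstring(instance)
    if math.tag != "{" + MATHML_NS + "}math":
        return []
    selected = []
    for element in math.iter():
        if element is math:
            continue                    # descendants only, not math itself
        if element.get("mathsize") == "small":
            selected.append(element.tag.rpartition("}")[2])
    return selected

expression = """<math xmlns="http://www.w3.org/1998/Math/MathML" mathsize="small">
  <mrow>
    <mi mathsize="small">x</mi>
    <mo mathsize="big">+</mo>
    <msup><mi>y</mi><mn mathsize="small">2</mn></msup>
  </mrow>
</math>"""
print(selected_by_small_rule(expression))
```

The match depends only on the logical attribute value, at any depth below math, which is what keeps such rules independent of document-wide typographic styles.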
It should be stressed, however, that some caution is required in writing CSS stylesheets for MathML. Because changing typographic properties of mathematics symbols can change the meaning of an equation, stylesheets should be written in a way such that changes to document-wide typographic styles do not affect embedded MathML expressions. By using the MathML 2.0 mathematics style attributes as selectors for CSS rules, this danger is minimized.
Another pitfall to be avoided is using CSS to provide typographic style information necessary to the proper understanding of an expression. Expressions dependent on CSS for meaning will not be portable to non-CSS environments such as computer algebra systems. By using the logical values of the new MathML 3.0 mathematics style attributes as selectors for CSS rules, it can be assured that style information necessary to the sense of an expression is encoded directly in the MathML.
MathML 3.0 does not specify how a user agent should process style information, because there are many non-CSS MathML environments, and because different user agents and renderers have widely varying degrees of access to CSS information. In general, however, developers are urged to provide as much CSS support for MathML as possible.