Mixing Markup Languages

5.1 Semantic Annotations

An important concern of MathML is to represent associations between presentation and content markup forms for an expression, and associations between MathML markup forms and other representations for an expression. An additional concern is the preservation of semantic attributions that are associated with MathML presentation or content forms. These associations are known collectively as semantic annotations. A semantic annotation decorates a MathML expression with a sequence of pairs made up of a symbol, known as the annotation key, and an associated entity, the annotation value.

5.1.1 Annotation elements

MathML uses the semantics, annotation, and annotation-xml elements to represent semantic annotations. The semantics element is the container for the annotated element and a sequence of annotations. Each annotation represents one annotation key/value pair: annotation elements carry character data annotations, and annotation-xml elements carry XML markup annotations.

<semantics>
  <mrow>
    <mrow>
      <mo>sin</mo>
      <mfenced><mi>x</mi></mfenced>
    </mrow>
    <mo>+</mo>
    <mn>5</mn>
  </mrow>
  <annotation cd="TeX" name="plainTeXrep" encoding="TeX">
    \sin x + 5
  </annotation>
  <annotation-xml cd="openmath" name="XMLencoding" encoding="OpenMath">
    <OMA xmlns="http://www.openmath.org/OpenMath">
      <OMS cd="arith1" name="plus"/>
      <OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>
      <OMI>5</OMI>
    </OMA>
  </annotation-xml>
</semantics>

A semantic annotation may provide an alternate representation for a MathML expression, either as another MathML or XML expression, or as character data represented in some other markup language. An annotation may provide an equivalent representation that captures all of the relevant semantic behavior of the expression, or it may extend the object with additional semantic properties that change the expression in an essential way, or it may simply provide additional rendering or other associations that are incidental to the semantics of the expression.

The relationship between the expression to be annotated and the annotation value is identified by a symbol, known as the annotation key. The annotation key is the primary identifier that an application should use to determine if it understands the associated annotation value. If the annotation key is not specified, it defaults to a distinguished annotation key that specifies that the annotation provides an alternate representation for the annotated expression. In this case, an application should use the value of the encoding attribute to determine if it understands the alternate representation.

Each annotation element provides a reference to its annotation key via the cd and name attributes. Taken together, these attributes identify a named symbol from a specific content dictionary that describes the nature of the annotation. The referencing mechanism is the same as for the csymbol element (see Section 4.2.3 Content Symbols <csymbol>), except that the name of the key symbol is given directly by the name attribute. The definitionURL attribute provides an alternative way to reference the key symbol for an annotation, for better compatibility with MathML 2. If none of these attributes are specified, the annotation key is assumed to be the symbol alternate-representation from the mathmlkeys content dictionary.
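
The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not part of MathML: the function name annotation_key and the tuple layout of its result are assumptions, and the sketch assumes that a key is taken from cd and name only when both are present, a case the text above does not spell out.

```python
import xml.etree.ElementTree as ET

# Default key named in the text above; storing it as a tuple is this sketch's choice.
DEFAULT_KEY = ("mathmlkeys", "alternate-representation")

def annotation_key(elem):
    """Describe the annotation key of an annotation or annotation-xml element."""
    cd = elem.get("cd")
    name = elem.get("name")
    # Assumption: cd and name identify a key symbol only when both are given.
    if cd is not None and name is not None:
        return ("symbol", cd, name)
    # Otherwise fall back to the MathML 2 style reference, if any.
    url = elem.get("definitionURL")
    if url is not None:
        return ("url", url)
    # No key information at all: use the distinguished default key.
    return ("symbol",) + DEFAULT_KEY

tex = ET.fromstring('<annotation cd="TeX" name="plainTeXrep" encoding="TeX">\\sin x + 5</annotation>')
key = annotation_key(tex)
```

When the default key is returned, an application would go on to inspect the encoding attribute, as described in the preceding paragraph.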

The semantics element is considered to be both a presentation element and a content element, and may be used in either context. All MathML processors should process the semantics element, even if they only process one of these two subsets of MathML.

5.1.2 Annotation references

In the usual case, each annotation element includes either character data content (in the case of annotation) or XML markup data (in the case of annotation-xml) that represents the annotation value. There is no restriction on the type of annotation that may appear within a semantics element. For example, an annotation could provide a TeX encoding, a linear input form for a computer algebra system, a rendered image, or detailed mathematical type information.

In some cases the alternative children of a semantics element are not an essential part of the behavior of the annotated expression, but may be useful to specialized processors. To make several annotation formats available more efficiently, a semantics element may contain empty annotation and annotation-xml elements that provide encoding and href attributes to specify an external location for the annotation value associated with the annotation. This type of annotation is known as an annotation reference.

<semantics>
  <mfrac><mi>a</mi><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfrac>
  <annotation encoding="image/png" href="333/formula56.png"/>
  <annotation encoding="text/maple" href="333/formula56.ms"/>
</semantics>

Processing agents that anticipate that consumers of exported markup may not be able to retrieve the external entity referenced by such annotations should request the content of the external entity at the indicated location and replace the annotation with its expanded form.

An annotation reference follows the same rules as for other annotations to determine the annotation key that specifies the relationship between the annotated object and the annotation value.
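
As a rough illustration of the expansion step described above, the following Python sketch replaces character-data annotation references by their expanded form. The function names and the load parameter are hypothetical; load stands for whatever retrieval mechanism the processing agent uses and is assumed to return the retrieved text, or None when the entity is not available. The sketch also assumes un-namespaced element names, as in the examples of this section, and handles only textual annotation values.

```python
import xml.etree.ElementTree as ET

def is_annotation_reference(elem):
    """True for an empty annotation element that carries an href attribute."""
    has_content = bool((elem.text or "").strip()) or len(elem) > 0
    return elem.get("href") is not None and not has_content

def expand_references(semantics, load):
    """Expand annotation references whose content `load` can supply.

    `load` is a hypothetical caller-supplied function mapping an href
    string to the retrieved text, or to None if retrieval fails.
    """
    for child in list(semantics)[1:]:        # skip the annotated expression
        if child.tag == "annotation" and is_annotation_reference(child):
            value = load(child.get("href"))
            if value is not None:
                child.text = value           # the expanded form holds the value inline
                del child.attrib["href"]     # and no longer points elsewhere
    return semantics

doc = ET.fromstring(
    '<semantics><mi>a</mi>'
    '<annotation encoding="text/maple" href="333/formula56.ms"/>'
    '</semantics>')
expand_references(doc, {"333/formula56.ms": "a/(a+b)"}.get)
```

The cd, name, and encoding attributes are left untouched, so the expanded annotation keeps the same annotation key as the reference it replaces.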

5.1.3 Alternate representations

A semantic annotation may provide an alternate representation for a MathML expression. For example, in the MathML representation below, the semantics element binds together various representations of the sum of the sine function applied to a variable x and the number 5.

<semantics>
  <mrow>
    <mrow>
      <mo>sin</mo>
      <mfenced open="(" close=")"><mi>x</mi></mfenced>
    </mrow>
    <mo>+</mo>
    <mn>5</mn>
  </mrow>
  <annotation-xml cd="mathml" name="contentequiv" encoding="MathML Content">
    <apply>
      <csymbol cd="algebra-logic" name="plus"/>
      <apply><sin/><ci>x</ci></apply>
      <cn>5</cn>
    </apply>
  </annotation-xml>
  <annotation cd="maple" name="nativerep" encoding="text/maple">sin(x) + 5</annotation>
  <annotation cd="mathematica" name="nativerep" encoding="Mathematica">Sin[x] + 5</annotation>
  <annotation cd="TeX" name="plainTeXrep" encoding="TeX"> \sin x + 5</annotation>
  <annotation-xml cd="openmath" name="XMLencoding" encoding="OpenMath">
    <OMA xmlns="http://www.openmath.org/OpenMath">
      <OMS cd="arith1" name="plus"/>
      <OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>
      <OMI>5</OMI>
    </OMA>
  </annotation-xml>
</semantics>

Here, the presentation element in the first child of the semantics element is annotated with various content-oriented representations. Each annotation and annotation-xml element specifies the nature of the annotation by referencing a key symbol in an appropriate content dictionary. For instance, the first annotation-xml element references the key symbol "contentequiv" from the attribution-keys content dictionary that specifies that the content MathML expression it provides is mathematically equivalent to the annotated presentation MathML expression.

5.1.4 Flattening semantic annotations

One consequence of the syntax for semantic annotation is that annotations may be applied to markup elements that are themselves annotations of other elements. In other words, a semantics element may contain another semantics element as its first child element, as in the sketch below:

<semantics>
  <semantics>A A_1 ... A_k</semantics>
  A_{k+1} ... A_n
</semantics>

where the A_i represent annotation or annotation-xml elements. This expression is equivalent to a single semantics element that contains the union of the annotations from the original semantics elements.

<semantics>
  A
  A_1 ... A_n
</semantics>

The operation that produces an expression with a single layer of semantic annotations is called flattening. Multiple annotations with the same key symbol are allowed. While the order of the given annotations does not imply any notion of priority, it potentially could be significant.
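
A minimal Python sketch of flattening, assuming un-namespaced element names and a semantics element with at least one child, might look as follows. The function name flatten is an illustration only. Because the order of annotations may be significant, the sketch keeps the inner annotations ahead of the outer ones.

```python
import xml.etree.ElementTree as ET

def flatten(semantics):
    """Collapse semantics elements nested in first-child position into one layer."""
    first = semantics[0]
    while first.tag == "semantics":
        inner = list(first)                    # A, A_1 ... A_k
        outer_annotations = list(semantics)[1:]  # A_{k+1} ... A_n
        for child in list(semantics):
            semantics.remove(child)
        # Annotated expression first, then all annotations in their original order.
        semantics.extend(inner + outer_annotations)
        first = semantics[0]
    return semantics

nested = ET.fromstring(
    '<semantics>'
    '<semantics><mi>x</mi><annotation encoding="TeX">x</annotation></semantics>'
    '<annotation encoding="Maple">x</annotation>'
    '</semantics>')
flatten(nested)
```

The loop repeats until the first child is no longer a semantics element, so deeper nesting is reduced to a single layer as well.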

5.2 Elements for Semantic Annotations

This section explains the semantic mapping elements semantics, annotation, and annotation-xml. These elements associate alternate representations for a presentation or content expression, or associate semantic or other attributions that may modify the meaning of the annotated expression.

5.2.1 The `semantics` element

The semantics element is the container element that associates annotations with a MathML expression. The semantics element has as its first child the expression to be annotated. Subsequent children provide the annotations.

An annotation whose representation is XML based is enclosed in an annotation-xml element. An annotation whose representation is text is enclosed in an annotation element.

The semantics element takes the definitionURL and encoding attributes, which reference an external source for some or all of the semantic information for the annotated element, as modified by the annotation. The use of these attributes is deprecated in MathML3. The definitionURL attribute should instead be added to the elements whose meaning is to be clarified.

Attributes of the `semantics` element
Name	values	default
definitionURL	a URI pointing to an equivalent formulation
encoding	the encoding of that equivalent formulation

<semantics>
  <mrow>
    <mrow>
      <mo>sin</mo>
      <mfenced><mi>x</mi></mfenced>
    </mrow>
    <mo>+</mo>
    <mn>5</mn>
  </mrow>
  <annotation-xml cd="mathml" name="contentequiv" encoding="MathML Content">
    <apply>
      <plus/>
      <apply><sin/><ci>x</ci></apply>
      <cn>5</cn>
    </apply>
  </annotation-xml>
  <annotation cd="maple" name="nativerep" encoding="Maple">
    sin(x) + 5
  </annotation>
  <annotation cd="mathematica" name="nativerep" encoding="Mathematica">
    Sin[x] + 5
  </annotation>
  <annotation cd="TeX" name="plainTeXrep" encoding="TeX">
    \sin x + 5
  </annotation>
  <annotation-xml cd="openmath" name="XMLencoding" encoding="OpenMath">
    <OMA xmlns="http://www.openmath.org/OpenMath">
      <OMS cd="arith1" name="plus"/>
      <OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>
      <OMI>5</OMI>
    </OMA>
  </annotation-xml>
</semantics>

The default rendering of a semantics element is the default rendering of its first child. A renderer may use the information contained in the annotations to customize its rendering of the annotated element.
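
One possible selection policy for a renderer is sketched below in Python. It is an assumption of this sketch, not a rule of MathML: the renderer is given an ordered list of encodings it prefers, uses the first annotation that matches one of them, and otherwise falls back to the first child. The name choose_for_rendering is hypothetical.

```python
import xml.etree.ElementTree as ET

def choose_for_rendering(semantics, preferred_encodings=()):
    """Pick the child a renderer would work from under this sketch's policy."""
    children = list(semantics)
    for encoding in preferred_encodings:
        for annotation in children[1:]:
            if annotation.get("encoding") == encoding:
                return annotation
    # Default rendering: the rendering of the first child.
    return children[0]

expr = ET.fromstring(
    '<semantics><mi>x</mi><annotation encoding="TeX">x</annotation></semantics>')
default_choice = choose_for_rendering(expr)
tex_choice = choose_for_rendering(expr, ["TeX"])
```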

5.2.2 The `annotation` element

The annotation element is the container element for a semantic annotation whose representation is parsed character data in a non-XML format. The annotation element should contain the character data for the annotation, and should not contain XML markup elements. If the annotation contains one of the XML reserved characters &, <, >, ', or ", then these characters must be encoded using an XML entity reference or an XML CDATA section.
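
For producers that build annotation elements as strings, the escaping requirement can be met with standard library helpers. The Python sketch below is one way to do this; the function name make_annotation is an illustration only, and only the encoding attribute is emitted.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

def make_annotation(text, encoding):
    """Serialize character data as an annotation element with reserved characters escaped."""
    # escape() handles &, < and >; the extra mapping covers both quote characters.
    body = escape(text, {"'": "&apos;", '"': "&quot;"})
    return "<annotation encoding=%s>%s</annotation>" % (quoteattr(encoding), body)

example = make_annotation('x < y & "z"', "TeX")
# Parsing the serialized element recovers the original character data.
round_trip = ET.fromstring(example).text
```

Wrapping the character data in a CDATA section is the other option mentioned above; it is not shown here.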

The annotation element takes the attributes cd and name. Taken together, these attributes reference the key symbol that identifies the relation between the annotated element and the annotation.

The annotation element takes the definitionURL attribute, which provides an alternative way to reference the key symbol that identifies the relation between the annotated element and the annotation.

If none of these attributes are specified, the key symbol for the annotation is the symbol alternate-representation from the attribution-keys content dictionary.

The annotation element takes the encoding attribute, which describes the content type of the annotation. The value of the encoding attribute may contain a media type that identifies the data format for the encoding data. For data formats that do not have an associated media type, implementors may choose a self-describing character string to identify their content type.

The annotation element allows the href attribute, which provides a mechanism to attach external entities as annotations on MathML expressions.

<annotation cd="TeX" name="plainTeXrep" encoding="TeX">
  \sin x + 5
</annotation>

<annotation encoding="image/png" href="333/formula56.png"/>

The annotation element is a semantic mapping element that may only be used as a child of the semantics element. While there is no default rendering for the annotation element, a renderer may use the information contained in an annotation to customize its rendering of the annotated element.

Attributes of the `annotation` and `annotation-xml` elements
Name	values	default
definitionURL	a URI pointing to the meaning of the annotation relationship
encoding	an encoding name of the alternate representation contained in the annotation
cd	the content-dictionary name of the symbol denoting the annotation relationship	attribution-keys
name	the name of the equivalent symbol	alternate-representation
href	the (relative) URL to the content of the annotation
clipboardflavor	the (standardized or platform specific) flavor name indicating that this annotation should provide a clipboard flavor, see Section 6.3.2 Recommended Behaviors when Transferring

5.2.3 The `annotation-xml` element

The annotation-xml element is the container element for a semantic annotation whose representation is structured markup in an XML format. The annotation-xml element should contain the markup elements, attributes, and character data for the annotation.

The annotation-xml element takes the attributes cd and name. Taken together, these attributes reference the key symbol that identifies the relation between the annotated element and the annotation.

The annotation-xml element takes the definitionURL attribute, which provides an alternative way to reference the key symbol that identifies the relation between the annotated element and the annotation.

The annotation-xml element uses the same attributes as the annotation element for identifying the key symbol of the annotation. It uses the encoding attribute, which describes the content type of the annotation. The value of the encoding attribute may contain a media type that identifies the data format for the encoding data. For data formats that do not have an associated media type, implementors may choose a self-describing character string to identify their content type. For example, Section 6.2.3 Names of MathML Encodings identifies the strings MathML, MathML Presentation, and MathML Content as predefined values for the encoding attribute that may be used to identify MathML markup in an annotation-xml element. Finally, the annotation-xml element allows the href attribute, which provides a mechanism to attach external XML entities as annotations on MathML expressions.

<annotation-xml cd="mathmlkeys" name="contentequiv" encoding="MathML Content">
  <apply>
    <plus/>
    <apply><sin/><ci>x</ci></apply>
    <cn>5</cn>
  </apply>
</annotation-xml>

<annotation-xml cd="openmath" name="XMLencoding" encoding="OpenMath">
  <OMA xmlns="http://www.openmath.org/OpenMath">
    <OMS cd="arith1" name="plus"/>
    <OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>
    <OMI>5</OMI>
  </OMA>
</annotation-xml>

When the annotation value is represented in an XML dialect other than MathML, the namespace for the XML markup for the annotation should be identified by means of namespace attributes and/or namespace prefixes on the annotation value. For instance:

<annotation-xml encoding="application/xhtml+xml">
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head><title>E</title></head>
    <body><p>The base of the natural logarithms, approximately 2.71828.</p></body>
  </html>
</annotation-xml>

The annotation-xml element is a semantic mapping element that may only be used as a child of the semantics element. While there is no default rendering for the annotation-xml element, a renderer may use the information contained in an annotation to customize its rendering of the annotated element.

5.3 Combining Presentation and Content Markup

Presentation markup encodes the notational structure of an expression. Content markup encodes the functional structure of an expression. In certain cases, a particular application of MathML may require a combination of both presentation and content markup. This section describes specific constraints that govern the use of presentation markup within content markup, and vice versa.

5.3.1 Presentation Markup in Content Markup

Presentation markup may be embedded within content markup so long as the resulting expression retains an unambiguous function application structure. Specifically, presentation markup may only appear in content markup in three ways:

within ci and cn token elements
within the csymbol element
within the semantics element

Any other presentation markup occurring within content markup is a MathML error. More detailed discussion of these three cases follows:

Presentation markup within token elements: The token elements ci and cn are permitted to contain any sequence of MathML characters (defined in Chapter 7 Characters, Entities and Fonts) and/or presentation elements. Contiguous blocks of MathML characters in ci or cn elements are treated as if wrapped in mi or mn elements, as appropriate, and the resulting collection of presentation elements is rendered as if wrapped in an implicit mrow element.
Presentation markup within the csymbol element: The csymbol element may contain either MathML characters interspersed with presentation markup, or content markup. It is a MathML error for a csymbol element to contain both presentation and content elements. When the csymbol element contains character data and presentation markup, the same rendering rules that apply to the token elements ci and cn should be used.
Presentation markup within the semantics element: One of the main purposes of the semantics element is to provide a mechanism for incorporating arbitrary MathML expressions into content markup in a semantically meaningful way. In particular, any valid presentation expression can be embedded in a content expression by placing it as the first child of a semantics element. The meaning of this wrapped expression should be indicated by one or more annotation elements also contained in the semantics element.

5.3.2 Content Markup in Presentation Markup

Content markup may be embedded within presentation markup so long as the resulting expression has an unambiguous rendering. That is, it must be possible, in principle, to produce a presentation markup fragment for each content markup fragment that appears in the combined expression. The replacement of each content markup fragment by its corresponding presentation markup should produce a well-formed presentation markup expression. A presentation engine should then be able to process this presentation expression without reference to the content markup bits included in the original expression.

In general, this constraint means that each embedded content expression must be well-formed, as a content expression, and must be able to stand alone outside the context of any containing content markup element. As a result, the following content elements may not appear as an immediate child of a presentation element: annotation, annotation-xml, bvar, condition, degree, logbase, lowlimit, uplimit.

In addition, within presentation markup, content markup may not appear within presentation token elements.
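
The two restrictions above lend themselves to a simple structural check. The Python sketch below is a partial illustration only: the set of forbidden children is taken from the list above, while the sets of presentation containers, presentation tokens, and content elements are small, deliberately incomplete samples chosen for this sketch. Element names are assumed to be un-namespaced.

```python
import xml.etree.ElementTree as ET

# Content elements listed above as not allowed directly under a presentation element.
FORBIDDEN_CHILDREN = {"annotation", "annotation-xml", "bvar", "condition",
                      "degree", "logbase", "lowlimit", "uplimit"}

# Incomplete samples, assumed for illustration only.
PRESENTATION_CONTAINERS = {"mrow", "mfrac", "msqrt", "msup", "msub", "mfenced"}
PRESENTATION_TOKENS = {"mi", "mn", "mo", "mtext", "ms"}
CONTENT_SAMPLE = {"apply", "ci", "cn", "csymbol"}

def violations(root):
    """List (parent, child) tag pairs that break the two restrictions above."""
    problems = []
    for parent in root.iter():
        for child in parent:
            if parent.tag in PRESENTATION_CONTAINERS and child.tag in FORBIDDEN_CHILDREN:
                problems.append((parent.tag, child.tag))
            elif parent.tag in PRESENTATION_TOKENS and child.tag in CONTENT_SAMPLE:
                problems.append((parent.tag, child.tag))
    return problems

bad = ET.fromstring('<mrow><bvar><ci>x</ci></bvar><mi><ci>y</ci></mi></mrow>')
found = violations(bad)
```

A complete checker would need the full lists of presentation and content elements from the MathML schema rather than the samples used here.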

5.4 Parallel Markup

Some applications are able to use both presentation and content information. Parallel markup is a way to combine two or more markup trees for the same mathematical expression. Parallel markup is achieved with the semantics element. Parallel markup for an expression may appear on its own, or as part of a larger content or presentation tree.

5.4.1 Top-level Parallel Markup

In many cases, the goal is to provide presentation markup and content markup for a mathematical expression as a whole. A single semantics element may be used to pair two markup trees, where one child element provides the presentation markup, and the other child element provides the content markup.

The following example encodes the Boolean arithmetic expression (a+b)(c+d) in this way.

<semantics>
  <mrow>
    <mrow><mo>(</mo><mi>a</mi> <mo>+</mo> <mi>b</mi><mo>)</mo></mrow>
    <mo>&#x2062;<!--INVISIBLE TIMES--></mo>
    <mrow><mo>(</mo><mi>c</mi> <mo>+</mo> <mi>d</mi><mo>)</mo></mrow>
  </mrow>
  <annotation-xml encoding="MathML Content">
    <apply><and/>
      <apply><xor/><ci>a</ci> <ci>b</ci></apply>
      <apply><xor/><ci>c</ci> <ci>d</ci></apply>
    </apply>
  </annotation-xml>
</semantics>

Note that the above markup annotates the presentation markup as the first child element, with the content markup as part of the annotation-xml element. An equivalent form could be given that annotates the content markup as the first child element, with the presentation markup as part of the annotation-xml element.

5.4.2 Parallel Markup via Cross-References

To accommodate applications that must process sub-expressions of large objects, MathML supports cross-references between the branches of a semantics element to identify corresponding sub-structures. These cross-references are established by the use of the id and xref attributes within a semantics element. This application of the id and xref attributes within a semantics element should be viewed as best practice to enable a recipient to select arbitrary sub-expressions in each alternative branch of a semantics element. The id and xref attributes may be placed on MathML elements of any type.

The id and xref attributes are supported by MathML to provide cross-references for those applications that do not otherwise require the use of namespaces or validation. Those applications that support namespaces may use the xml:id attribute in the same manner as is described for the id attribute. Similarly, those applications that support validation may use other attributes declared of type ID and IDREF to establish cross-references between corresponding sub-expressions. Of course, cross-references that use custom attributes in this way rely on prior agreement between the producing and consuming applications to preserve the cross-references.

The following example demonstrates cross-references for the Boolean arithmetic expression (a+b)(c+d).

<semantics>
  <mrow id="E">
    <mrow id="E.1">
      <mo id="E.1.1">(</mo>
      <mi id="E.1.2">a</mi>
      <mo id="E.1.3">+</mo>
      <mi id="E.1.4">b</mi>
      <mo id="E.1.5">)</mo>
    </mrow>
    <mo id="E.2">&#x2062;<!--INVISIBLE TIMES--></mo>
    <mrow id="E.3">
      <mo id="E.3.1">(</mo>
      <mi id="E.3.2">c</mi>
      <mo id="E.3.3">+</mo>
      <mi id="E.3.4">d</mi>
      <mo id="E.3.5">)</mo>
    </mrow>
  </mrow>

  <annotation-xml encoding="MathML Content">
    <apply xref="E">
      <and xref="E.2"/>
      <apply xref="E.1">
        <xor xref="E.1.3"/><ci xref="E.1.2">a</ci><ci xref="E.1.4">b</ci>
      </apply>
      <apply xref="E.3">
        <xor xref="E.3.3"/><ci xref="E.3.2">c</ci><ci xref="E.3.4">d</ci>
      </apply>
    </apply>
  </annotation-xml>
</semantics>

An id attribute and associated xref attributes that appear within the same semantics element establish the cross-references between corresponding sub-expressions.

All of the id attributes referenced by any xref attribute must be in the same branch of an enclosing semantics element. This constraint guarantees that the cross-references do not create unintentional cycles. This restriction does not exclude the use of id attributes within other branches of the enclosing semantics element. It does, however, exclude references to these other id attributes originating from the same semantics element.

There is no restriction on which branch of the semantics element may contain the destination id attributes. It is up to the application to determine which branch to use.

In general, there will not be a one-to-one correspondence between nodes in parallel branches. For example, a presentation tree may contain elements, such as parentheses, that have no correspondents in the content tree. It is therefore often useful to put the id attributes on the branch with the finest-grained node structure. Then all of the other branches will have xref attributes to some subset of the id attributes.

In the absence of other criteria, the first branch of the semantics element is a sensible choice to contain the id attributes. Applications that add or remove annotations will then not have to re-assign these attributes as the annotations change.

In general, the use of id and xref attributes allows a full correspondence between sub-expressions to be given in text that is at most a constant factor larger than the original. The direction of the references should not be taken to imply that sub-expression selection is intended to be permitted only on one child of the semantics element. It is equally feasible to select a subtree in any branch and to recover the corresponding subtrees of the other branches.
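
To make the correspondence concrete, the following Python sketch builds a map from each id to the elements whose xref attribute points at it, and checks the constraint that all referenced id attributes lie in the same branch. The function name cross_reference_map and the choice to raise ValueError are assumptions of this sketch; element names are assumed to be un-namespaced, and only the id and xref attributes are considered.

```python
import xml.etree.ElementTree as ET

def cross_reference_map(semantics):
    """Map each referenced id to the list of elements whose xref names it."""
    branches = list(semantics)
    id_branch = {}                          # id value -> index of the branch holding it
    for index, branch in enumerate(branches):
        for node in branch.iter():
            if node.get("id") is not None:
                id_branch[node.get("id")] = index
    refs = {}
    target_branches = set()
    for branch in branches:
        for node in branch.iter():
            target = node.get("xref")
            if target is None:
                continue
            if target not in id_branch:
                raise ValueError("dangling xref: " + target)
            target_branches.add(id_branch[target])
            refs.setdefault(target, []).append(node)
    # All referenced ids must sit in one branch of the semantics element.
    if len(target_branches) > 1:
        raise ValueError("xref targets span more than one branch")
    return refs

doc = ET.fromstring(
    '<semantics>'
    '<mrow id="E"><mi id="E.1">a</mi><mo id="E.2">+</mo><mi id="E.3">b</mi></mrow>'
    '<annotation-xml encoding="MathML Content">'
    '<apply xref="E"><xor xref="E.2"/><ci xref="E.1">a</ci><ci xref="E.3">b</ci></apply>'
    '</annotation-xml>'
    '</semantics>')
refs = cross_reference_map(doc)
```

Given a selected subtree in the branch that carries the id attributes, refs yields the corresponding subtrees in the other branches; the reverse direction is a direct lookup of the xref value.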

Parallel markup with cross-references may be used in any XML-encoded branch of the semantic annotations, as shown by the following example where the Boolean expression of the previous section is annotated with OpenMath markup that includes cross-references:

<semantics>
  <mrow id="E">
    <mrow id="E.1">
      <mo id="E.1.1">(</mo>
      <mi id="E.1.2">a</mi>
      <mo id="E.1.3">+</mo>
      <mi id="E.1.4">b</mi>
      <mo id="E.1.5">)</mo>
    </mrow>
    <mo id="E.2">&#x2062;<!--INVISIBLE TIMES--></mo>
    <mrow id="E.3">
      <mo id="E.3.1">(</mo>
      <mi id="E.3.2">c</mi>
      <mo id="E.3.3">+</mo>
      <mi id="E.3.4">d</mi>
      <mo id="E.3.5">)</mo>
    </mrow>
  </mrow>

  <annotation-xml encoding="MathML Content">
    <apply xref="E">
      <and xref="E.2"/>
      <apply xref="E.1">
        <xor xref="E.1.3"/><ci xref="E.1.2">a</ci><ci xref="E.1.4">b</ci>
      </apply>
      <apply xref="E.3">
        <xor xref="E.3.3"/><ci xref="E.3.2">c</ci><ci xref="E.3.4">d</ci>
      </apply>
    </apply>
  </annotation-xml>

  <annotation-xml encoding="OpenMath" 
                  xmlns:om="http://www.openmath.org/OpenMath">

    <om:OMA href="E">
      <om:OMS name="and" cd="logic1" href="E.2"/>

      <om:OMA href="E.1">
        <om:OMS name="xor" cd="logic1" href="E.1.3"/>
        <om:OMV name="a" href="E.1.2"/>
        <om:OMV name="b" href="E.1.4"/>
      </om:OMA>

      <om:OMA href="E.3">
        <om:OMS name="xor" cd="logic1" href="E.3.3"/>
        <om:OMV name="c" href="E.3.2"/>
        <om:OMV name="d" href="E.3.4"/>
      </om:OMA>
    </om:OMA>
  </annotation-xml>
</semantics>

Here OMA, OMS and OMV are elements defined in the OpenMath standard for representing application, symbol, and variable, respectively. The references from the OpenMath annotation are given by the href attributes.
