4 Content Markup

Overview: Mathematical Markup Language (MathML) Version 3.0

4 Content Markup
    4.1 Introduction
        4.1.1 The Intent of Content Markup
        4.1.2 The Scope of Content Markup
        4.1.3 Basic Concepts of Content Markup
        4.1.4 The structure of MathML3 Content Expressions
        4.1.5 Canonical and Legacy Content MathML
    4.2 Canonical Content Markup
        4.2.1 Numbers
        4.2.2 Identifiers
        4.2.3 Symbols
        4.2.4 The MathML3 Content Dictionaries and Operators
        4.2.5 Function Application
        4.2.6 Bindings and Bound Variables
        4.2.7 Qualifiers
        4.2.8 Structure Sharing
        4.2.9 Semantic Mapping
        4.2.10 In Situ Error Markup
    4.3 Rendering of Content Elements
        4.3.1 General Rules
        4.3.2 Attributes Modifying Content Markup Rendering
    4.4 Legacy Markup in Content MathML
        4.4.1 Numbers with constant type
        4.4.2 Token Elements
        4.4.3 Tokens with Attributes
        4.4.4 Container Markup
        4.4.5 Domain of Application (domainofapplication) in Applications
        4.4.6 Domain of Application (domainofapplication) in Bindings
        4.4.7 Integrals with Calling patterns
        4.4.8 degree
        4.4.9 Upper and Lower Limits
        4.4.10 Lifted Associative Commutative Operators
        4.4.11 Declare (declare)

4.1 Introduction

4.1.1 The Intent of Content Markup

As has been noted in the introductory section of this Recommendation, mathematics can be distinguished by its use of a (relatively) formal language, mathematical notation. However, mathematics and its presentation should not be viewed as one and the same thing. Mathematical sums or products exist and are meaningful to many applications completely without regard to how they are rendered aurally or visually. The intent of the content markup in the Mathematical Markup Language is to provide an explicit encoding of the underlying mathematical structure of an expression, rather than any particular rendering for the expression.

There are many reasons for providing a specific encoding for content. Even a disciplined and systematic use of presentation tags cannot properly capture this semantic information. This is because without additional information it is impossible to decide whether a particular presentation was chosen deliberately to encode the mathematical structure or simply to achieve a particular visual or aural effect. Furthermore, an author using the same encoding to deal with both the presentation and mathematical structure might find a particular presentation encoding unavailable simply because convention had reserved it for a different semantic meaning.

The difficulties stem from the fact that there are many-to-one mappings from presentation to semantics and vice versa. For example, the mathematical construct "H multiplied by e" is often encoded using an explicit operator, as in H × e. In different presentational contexts, the multiplication operator might be invisible, as in "H e", or rendered as the spoken word "times". Generally, many different presentations are possible depending on the context and style preferences of the author or reader. Thus, given "H e" out of context, it may be impossible to decide whether this is the name of a chemical or a mathematical product of two variables H and e.

Mathematical presentation also changes with culture and time: some expressions in combinatorial mathematics today have one meaning to a Russian mathematician, and quite another to a French mathematician; see Section 5.4.1 Notational Style Sheets for an example. Notations may lose currency, for example the use of musical sharp and flat symbols to denote maxima and minima [Chaundy1954]. A notation in use in 1644 for the multiplication mentioned above was ■ H e [Cajori1928].

When we encode the underlying mathematical structure explicitly, without regard to how it is presented aurally or visually, we are able to interchange information more precisely with those systems that are able to manipulate the mathematics. In the trivial example above, such a system could substitute values for the variables H and e and evaluate the result. Further interesting application areas include interactive textbooks and other teaching aids.

4.1.2 The Scope of Content Markup

The semantics of general mathematical notation is not a matter of consensus. It would be an enormous job to systematically codify most of mathematics – a task that can never be complete. Instead, MathML makes explicit a relatively small number of commonplace mathematical constructs, chosen carefully to be sufficient in a large number of applications. In addition, it provides a mechanism for associating semantics with new notational constructs. In this way, mathematical concepts that are not in the base collection of elements can still be encoded.

The base set of content elements is chosen to be adequate for simple coding of most of the formulas used from kindergarten to the end of high school in the United States, and probably beyond through the first two years of college, that is, up to A-Level or Baccalaureate level in Europe. Subject areas covered to some extent in MathML are:

  • arithmetic, algebra, logic and relations

  • calculus and vector calculus

  • set theory

  • sequences and series

  • elementary classical functions

  • statistics

  • linear algebra

It is not claimed, or even suggested, that the proposed set of elements is complete for these areas, but the provision for author extensibility greatly alleviates any problem omissions from this finite list might cause.

4.1.3 Basic Concepts of Content Markup

The design of the MathML content elements is driven by the following principles:

  • The logical/functional tree structure of a mathematical expression should be directly encoded by the MathML content elements.

  • The encoding of an expression tree should be explicit, finite, and not dependent on the special parsing of PCDATA or on additional processing such as operator precedence parsing.

  • The basic set of mathematical content constructs that are provided should have default mathematical semantics.

  • There should be a mechanism for associating specific mathematical semantics with the constructs.

The primary goal of the content encoding is to establish explicit connections between mathematical structures and their mathematical meanings. The content elements correspond directly to parts of the underlying mathematical expression tree. Each structure has an associated default semantics and there is a mechanism for associating new mathematical definitions with new constructs.

Significant advantages to the introduction of content-specific tags include:

  • Usage of presentation elements is less constrained. When mathematical semantics are inferred from presentation markup, processing agents must either be quite sophisticated, or they run the risk of inferring incomplete or incorrect semantics when irregular constructions are used to achieve a particular aural or visual effect.

  • It is immediately clear which kind of information is being encoded simply by the kind of elements that are used.

  • Combinations of semantic and presentation elements can be used to convey both the appearance and its mathematical meaning much more effectively than simply trying to infer one from the other.

Expressions described in terms of content elements must still be rendered. For common expressions, default visual presentations are usually clear. "Take care of the sense and the sounds will take care of themselves" wrote Lewis Carroll [Carroll1871]. Default presentations are included in the detailed description of each element occurring in Chapter 4 Content Markup.

To accomplish these goals, the MathML content encoding is based on the concept of an expression tree. A content expression tree is constructed from a collection of more primitive objects, referred to herein as containers and operators. MathML possesses a rich set of predefined container and operator objects, as well as constructs for combining containers and operators in mathematically meaningful ways. The syntax and usage of these content elements and constructions are described in the next section.

4.1.4 The structure of MathML3 Content Expressions

Since the intent of MathML content markup is to encode mathematical expressions in such a way that the mathematical structure of the expression is clear, the syntax and usage of content markup must be consistent enough to facilitate automated semantic interpretation. There must be no doubt when, for example, an actual sum, product or function application is intended and if specific numbers are present, there must be enough information present to reconstruct the correct number for purposes of computation. Of course, it is still up to a MathML processor to decide what is to be done with such a content-based expression, and computation is only one of many options. A renderer or a structured editor might simply use the data and its own built-in knowledge of mathematical structure to render the object. Alternatively, it might manipulate the object to build a new mathematical object. A more computationally oriented system might attempt to carry out the indicated operation or function evaluation.

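MathML content encoding is based on the concept of an expression tree built up from more primitive objects, the containers and operators introduced in Section 4.1.3 Basic Concepts of Content Markup.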

As a general rule, the terminal nodes in the tree represent basic mathematical objects, such as numbers, variables, arithmetic operations and so on. The internal nodes in the tree generally represent some kind of function application or other mathematical construction that builds up a compound object. Function application provides the most important example; an internal node might represent the application of a function to several arguments, which are themselves represented by the terminal nodes underneath the internal node.

4.1.5 Canonical and Legacy Content MathML

MathML3 has simplified and regularized the structure of content MathML expressions, and has based the meaning of symbols on the concept of content dictionaries. In the long run, this will lead to simpler implementations, but in the short run, this creates problems with legacy representations. Therefore MathML3 does not forbid MathML2 representations, but only deprecates them (i.e. discourages their use).

Concretely, we will distinguish canonical MathML3 (see Section 4.2 Canonical Content Markup), i.e. expression trees that adhere to the MathML3 representational model, from legacy MathML3 (see Section 4.4 Legacy Markup in Content MathML), i.e. expression trees that conform to the MathML2 representational model but are not canonical. Legacy MathML3 expressions are still valid MathML3 expressions, but their semantics is specified by re-interpreting them as equivalent canonical MathML3 expressions. MathML processors may deal with them either by supporting them natively or by translating them to canonical MathML3 during input, e.g. by the XSLT style sheet supplied with the MathML distribution.

Issue legacy2canonical.xsl
Legacy to Canonical Transformation

Do we want to supply one, how do we distribute it? Will that be an appendix?

Resolution None recorded

MathML3-conformant processors should not generate legacy MathML3; existing processors can be upgraded to conform by piping their output through the same style sheet.

4.2 Canonical Content Markup

We introduce the infrastructure of the XML encoding of the Content MathML expression trees in the next sections. In addition to the usage information contained in this section, Appendix C MathML3 Content Dictionaries gives a complete listing of the Content MathML symbols, providing reference information about their attributes, syntax, examples and suggested default semantics and renderings. The rules for using presentation markup within content markup are explained in Section 5.2.3 Presentation Markup Contained in Content Markup. An informal EBNF grammar describing the syntax for the content markup is given in Appendix B Content Markup Validation Grammar.

4.2.1 Numbers

Editorial note: MiKo  
This section will be reworked for OpenMath compatibility

Containers such as cn represent mathematical numbers. For example, the number 12345 is encoded as

<cn>12345</cn>

The attributes and PCDATA content together provide the data necessary for an application to parse the number. For example, a default base of 10 is assumed, but to communicate that the underlying data was actually written in base 8, simply set the base attribute to 8 as in

 <cn base="8">12345</cn> 

while the complex number 3 + 4i can be encoded as

 <cn type="complex-cartesian">3<sep/>4</cn> 

Such information makes it possible for another application to easily parse this into the correct number.

The cn element is the MathML token element used to represent numbers. The supported types of numbers include: "real", "integer", "rational", "complex-cartesian", and "complex-polar", with "real" being the default type. An attribute base is used to help specify how the content is to be parsed. Its value (any numeric string) indicates the numerical base of the number. The default value is "10".

The content itself is essentially PCDATA, separated by <sep/> when two parts are needed in order to fully describe a number. For example, the real number 3 is constructed by <cn type="real">3</cn>, while the rational number 3/4 is constructed as <cn type="rational"> 3<sep/>4 </cn>. The detailed formats are specified below.

The type attribute indicates the type of the number. Predefined values: "e-notation", "integer", "rational", "real", "complex-polar", "complex-cartesian", "constant".

The default value is "real".

Note: Each data type implies that the data adheres to certain formatting conventions, detailed below. If the data fails to conform to the expected format, an error is generated. Details of the individual formats are:

real

A real number is presented in decimal notation. Decimal notation consists of an optional sign ("+" or "-") followed by a string of digits possibly separated into an integer and a fractional part by a "decimal point". Some examples are 0.3, 1, and -31.56. If a different base is specified, then the digits are interpreted as digits in that base.

e-notation

A real number may also be presented in scientific notation. Such numbers have two parts (a mantissa and an exponent) separated by <sep/>. The first part is a real number, while the second part is an integer exponent indicating a power of the base. For example, 12.3<sep/>5 represents 12.3 times 10^5. The default presentation of this example is 12.3e5.

integer

An integer is represented by an optional sign followed by a string of one or more "digits". What a "digit" is depends on the base attribute. If base is present, it specifies the base for the digit encoding, and its own value is written in base 10. Thus base='16' specifies a hex encoding. When base > 10, letters are added in alphabetical order as digits. The legitimate values for base are therefore between 2 and 36.

rational

A rational number is two integers separated by <sep/>. If base is present, it specifies the base used for the digit encoding of both integers.

complex-cartesian

A complex number is given as two real numbers, the real part followed by the imaginary part, separated by <sep/>.

complex-polar

A complex number is specified in the form of a magnitude and an angle (in radians). The raw data is in the form of two real numbers separated by <sep/>.

MathML2 also allowed the type "constant" with the Unicode symbols for certain numeric constants. This is only allowed in MathML3 as part of the legacy markup.
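The formatting conventions above can be sketched as a small parser. The following Python fragment is only an illustration and not part of MathML: the function name parse_cn is hypothetical, fractional digits are handled only in base 10, and the exponent of an e-notation number is assumed to be written in the same base as the mantissa.

```python
import cmath
import xml.etree.ElementTree as ET
from fractions import Fraction


def parse_cn(xml_text):
    """Convert a single cn element into a Python number (illustrative sketch)."""
    cn = ET.fromstring(xml_text)
    kind = cn.get("type", "real")     # "real" is the default type
    base = int(cn.get("base", "10"))  # the base value itself is written in base 10

    # Collect the one or two PCDATA parts: the text before <sep/> and its tail.
    parts = [(cn.text or "").strip()]
    sep = cn.find("sep")
    if sep is not None:
        parts.append((sep.tail or "").strip())

    def real(digits):
        # Simplification: outside base 10 only whole-number digit strings work.
        return float(digits) if base == 10 else float(int(digits, base))

    if kind == "integer":
        return int(parts[0], base)
    if kind == "real":
        return real(parts[0])
    if kind == "rational":
        return Fraction(int(parts[0], base), int(parts[1], base))
    if kind == "e-notation":
        # Assumption: the exponent digits use the same base as the mantissa.
        return real(parts[0]) * base ** int(parts[1], base)
    if kind == "complex-cartesian":
        return complex(real(parts[0]), real(parts[1]))
    if kind == "complex-polar":
        return cmath.rect(real(parts[0]), real(parts[1]))  # magnitude, angle in radians
    raise ValueError(f"unsupported cn type: {kind}")
```

For instance, parse_cn('<cn base="8" type="integer">12345</cn>') yields 5349, and parse_cn('<cn type="rational"> 3<sep/>4 </cn>') yields Fraction(3, 4). The legacy type "constant" is deliberately left unsupported and raises an error.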

4.2.2 Identifiers

MathML3 uses the ci element (for "content identifier") to construct a variable, or an identifier that is not a symbol. The content is either PCDATA or a general presentation construct (see Section 3.1.6 Summary of Presentation Elements). For example,

<ci><msub><mi>c</mi><mn>1</mn></msub></ci>

encodes an atomic symbol that displays visually as c₁ and which, for purposes of content, is treated as a single symbol representing a real number. The definitionURL attribute can be used to identify special properties or to refer to a defining instance of, for example, a bound variable.

A ci element is rendered as if it were actually the presentation element mi (see Section 3.2.3 Identifier (mi)). The actual rendering of a mathematical symbol can be made as elaborate as necessary simply by using the more elaborate presentational constructs (as described in Chapter 3 Presentation Markup) in the body of the ci or csymbol element.

All attributes of the ci element are CDATA.

type

A type attribute indicates the type of object the identifier represents. Typically, ci represents a real scalar, but no default is specified. Predefined values: "integer", "rational", "real", "complex", "complex-polar", "complex-cartesian", "constant", "function" or the name of any content element. The meanings of the attribute values shared with cn are the same as those listed for the cn element. The attribute value "complex" is intended for use when an identifier represents a complex number but the particular representation (such as polar or cartesian) is either not known or is irrelevant.

nargs

The nargs attribute indicates the number of arguments for function declarations. Pre-defined values: "nary", or any numeric string. The default value is "1".

occurrence

The occurrence attribute indicates the occurrence for operator declarations. Pre-defined values: "prefix", "infix", "function-model". The default value is "function-model".

definitionURL

A URI pointing to the detailed semantics of the function.

encoding

The syntax of the detailed semantics of the function.

The type and nargs attributes on the ci element in

<ci type="function"
  nargs="2">f</ci>

declares f to be a two-variable function.

4.2.3 Symbols

The notion of constructing a general expression tree is essentially that of applying an operator to sub-objects. For example, the sum a + b can be thought of as an application of the addition operator to two arguments a and b. In MathML, elements are used for operators for much the same reason that elements are used to contain objects. They are recognized at the level of XML parsing, and their attributes can be used to record or modify the intended semantics.

There is also another reason for using elements to denote operators. There is a crucial semantic distinction between the function itself and the expression resulting from applying that function to zero or more arguments which must be captured. This is addressed by making the functions self-contained objects with their own properties and providing an explicit apply construct corresponding to function application. We will consider the apply construct in the next section.

MathML contains many pre-defined operator elements, covering a range of mathematical subjects (see Section 4.2.4 The MathML3 Content Dictionaries and Operators below). However, an important class of expressions involves unknown or user-defined functions and symbols. For these situations, MathML provides a general csymbol element, which is discussed below.

Due to the nature of mathematics the notation must be extensible. The key to extensibility is the ability of the user to define new functions and other symbols to expand the terrain of mathematical discourse.

It is always possible to create arbitrary expressions, and then to use them as symbols in the language. Their properties can then be inferred directly from that usage as was done in the previous section. However, such an approach would preclude being able to encode the fact that the construct was a known symbol, or to record its mathematical properties except by actually using it. The csymbol element is used as a container to construct a new symbol in much the same way that ci is used to construct an identifier. (Note that "symbol" is used here in the abstract sense and has no connection with any presentation of the construct on screen or paper).

The difference in usage is that csymbol should refer to some mathematically defined concept with an external definition referenced via the csymbol attributes, whereas ci is used for identifiers that are essentially "local" to the MathML expression.

In MathML3, external definitions are grouped in Content Dictionaries (structured documents for the definition of mathematical concepts; see [OpenMath2004] and Appendix C MathML3 Content Dictionaries).

We need three bits of information to fully identify a symbol: a symbol name, a Content Dictionary name, and (optionally) a Content Dictionary base URI, which we encode in three attributes of the csymbol element: name, cd, and cdbase. The Content Dictionary is the location of the definition of the symbol, consisting of a name and, optionally, a unique prefix called a cdbase which is used to disambiguate multiple Content Dictionaries of the same name. As there are multiple encodings for content dictionaries, we use the encoding attribute to specify which one to expect. The value of this attribute is the MIME type of the encoding. If a symbol does not have an explicit cdbase attribute, then it inherits its cdbase from the first ancestor in the XML tree with one, should such an element exist. In this document we have tended to omit the cdbase for clarity.

There are other properties of the symbol that are not explicit in these fields but whose values may be obtained by inspecting the Content Dictionary specified. These include the symbol definition, formal properties and examples and, optionally, a Role which is a restriction on where the symbol may appear in a MathML expression tree. The possible roles are described in Section C.2.3 Symbol Roles.

<csymbol cdbase="http://www.example.com" encoding="application/x-MathML-CD"
	         cd="VectorCalculus" name="Christioffel">Christoffel</csymbol>
Issue encoding_value
encoding value

What should be the value of the encoding attribute? I propose the MIME type. What is the MIME type for MathML content dictionaries?

For the moment I will use application/x-MathML-CD; should we register one when we register application/xml+mathml?

Resolution None recorded

For backwards compatibility with MathML2 and to facilitate the use of MathML within a URI-based framework (such as RDF [rdf] or OWL [owl]), the values of the name, cd, cdbase, and encoding attributes can be combined in the definitionURL attribute: we provide the following scheme for constructing a canonical URI for a MathML symbol, which can be given in the definitionURL attribute.

URI = cdbase-value + '/' + cd-value + '.' + encoding-ext + '#' + name-value

where encoding-ext is the canonical extension for the encoding specified in the encoding attribute. So, for example, the URI for the symbol above would be

<csymbol definitionURL="http://www.example.com/VectorCalculus.mcd#Christioffel">Christoffel</csymbol>
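The construction can be sketched in a few lines of Python. The function name and the extension table are hypothetical; in particular, the extension "mcd" for application/x-MathML-CD is taken from the example URI above and is not a registered mapping, and the separator placed before the extension follows that example.

```python
# Hypothetical table from encoding MIME types to file extensions; only the
# entry suggested by the example above is included.
ENCODING_EXTENSIONS = {"application/x-MathML-CD": "mcd"}


def canonical_symbol_uri(cdbase, cd, name, encoding="application/x-MathML-CD"):
    """Combine the csymbol attribute values into a single definitionURL value."""
    ext = ENCODING_EXTENSIONS[encoding]
    # Drop a trailing slash on the cdbase so that exactly one separator is used.
    return f"{cdbase.rstrip('/')}/{cd}.{ext}#{name}"


uri = canonical_symbol_uri("http://www.example.com", "VectorCalculus", "Christioffel")
```

With the attribute values of the csymbol shown earlier, uri equals the definitionURL value in the example above.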
Issue CD_encoding_table
CD encoding table?

Do we want to keep a table of MIME types (for the encodings) and the default extensions to make the mapping work? Is this something the OpenMath Society should do?

Resolution None recorded
Editorial note  
integrate the following leftover text

The csymbol element, or "content symbol" is used to construct a symbol whose semantics are not part of the core content elements provided by MathML, but defined outside of the MathML specification. csymbol does not make any attempt to describe how to map the arguments occurring in any application of the function into a new MathML expression. Instead, it depends on its definitionURL attribute to point to a particular meaning, and the encoding attribute to give the syntax of this definition. The content of a csymbol is either PCDATA or a general presentation construct (see Section 3.1.6 Summary of Presentation Elements). For example,

<csymbol definitionURL="http://www.example.com/ContDiffFuncs.htm" encoding="text">
  <msup><mi>C</mi><mn>2</mn></msup>
</csymbol>

encodes an atomic symbol that displays visually as C² and that, for purposes of content, is treated as a single symbol representing the space of twice-differentiable continuous functions. The detailed structure and specifications are provided in Section 4.2.3 Symbols.

4.2.4 The MathML3 Content Dictionaries and Operators

The most common operations and functions such as plus and sin have been predefined explicitly as empty elements. The general rule is that for any symbol defined in the MathML3 content dictionaries (see Appendix C MathML3 Content Dictionaries), there is an empty content element with the same name. For instance, the empty MathML element

<plus/>

is equivalent to the element

<csymbol cdbase="http://w3.org/Math/CD" encoding="application/x-MathML-CD"
	         cd="algebra-logic" name="plus"><mo>+</mo></csymbol>

Both can be used interchangeably (see Section 4.4.2 Token Elements for details).

We will now give an overview of the MathML3 content elements, which are grouped into content dictionaries that broadly reflect the area of mathematics from which they come.

4.2.5 Function Application

The most fundamental way of building a compound object in mathematics is by applying a function or an operator to some arguments. MathML supplies an infrastructure to represent this in expression trees, which we will present in this section.

An apply element is used to build an expression tree that represents the result of applying a function or operator to its arguments. The tree corresponds to a complete mathematical expression. Roughly speaking, this means a piece of mathematics that could be surrounded by parentheses or "logical brackets" without changing its meaning.

For example, (x + y) might be encoded as

<apply><plus/><ci>x</ci><ci>y</ci></apply>

The opening and closing tags of apply specify exactly the scope of any operator or function. The most typical way of using apply is simple and recursive. Symbolically, the content model can be described as:

<apply> op a b </apply>

where the operands a and b are MathML expression trees themselves, and op is a MathML expression tree that represents an operator or function. Note that apply constructs can be nested to arbitrary depth.

An apply may in principle have any number of operands:

<apply> op a b [c...] </apply>

For example, (x + y + z) can be encoded as

<apply><plus/><ci>x</ci><ci>y</ci><ci>z</ci></apply>

Mathematical expressions involving a mixture of operations result in nested occurrences of apply. For example, a x + b would be encoded as

<apply><plus/><apply><times/><ci>a</ci><ci>x</ci></apply><ci>b</ci></apply>

There is no need to introduce parentheses or to resort to operator precedence in order to parse the expression correctly. The apply tags provide the proper grouping for the re-use of the expressions within other constructs. Any expression enclosed by an apply element is viewed as a single coherent object.
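Because the grouping is explicit, a processor can evaluate such a tree by plain recursion, without any precedence rules. The following Python sketch illustrates this under stated assumptions: the OPERATORS table and the evaluate function are hypothetical, only plus and times are covered, and the operator is looked up by element name rather than being treated as a general expression.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical operator table covering only the elements used in this example.
OPERATORS = {
    "plus": lambda *args: sum(args),         # zero arguments give 0
    "times": lambda *args: math.prod(args),  # zero arguments give 1
}


def evaluate(node, env):
    """Recursively evaluate a content expression tree, reading variables from env."""
    if node.tag == "cn":
        return float(node.text)
    if node.tag == "ci":
        return env[node.text.strip()]
    if node.tag == "apply":
        # The first child is the operator; the remaining children are operands.
        op, *operands = list(node)
        return OPERATORS[op.tag](*(evaluate(arg, env) for arg in operands))
    raise ValueError(f"unsupported element: {node.tag}")


tree = ET.fromstring(
    "<apply><plus/><apply><times/><ci>a</ci><ci>x</ci></apply><ci>b</ci></apply>"
)
result = evaluate(tree, {"a": 2, "x": 3, "b": 4})  # computes 2*3 + 4
```

The nesting of the apply elements alone determines that the product is formed before the sum.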

An expression such as (F+G)(x) might be a product, as in

<apply><times/><apply><plus/><ci>F</ci><ci>G</ci></apply><ci>x</ci></apply>

or it might indicate the application of the function F + G to the argument x. This is indicated by constructing the sum

<apply><plus/><ci>F</ci><ci>G</ci></apply>

and applying it to the argument x as in

<apply><apply><plus/><ci>F</ci><ci>G</ci></apply><ci>x</ci></apply>

Both the function and the arguments may be simple identifiers or more complicated expressions.

The apply element is conceptually necessary in order to distinguish between a function or operator, and an instance of its use. The expression constructed by applying a function to 0 or more arguments is always an element from the codomain of the function. Proper usage depends on the operator that is being applied. For example, the plus operator may have zero or more arguments, while the minus operator requires one or two arguments to be properly formed.

If the object being applied as a function is not already one of the elements known to be a function (such as sin or plus) then it is treated as if it were a function.

4.2.6 Bindings and Bound Variables

Some complex mathematical objects are constructed by the use of bound variables. For instance, the integration variable in an integral expression is one. Such expressions are represented as MathML expression trees using the bind and bvar elements, possibly augmented by the qualifier element condition (see Section 4.2.7 Qualifiers).

The bvar element is a special qualifier element that is used to denote the bound variable of a binding expression. The bvar element is also used for the bound variable in sums, products, and quantifiers and may be used with user defined functions.

<bind>
  <forall/>
  <bvar><ci>x</ci></bvar>
  <apply><eq/><apply><minus/><ci>x</ci><ci>x</ci></apply><cn>0</cn></apply>
</bind>

Instances of the bound variables are normally recognized by comparing the XML information sets of the relevant ci elements after first carrying out XML space normalization. Such identification can be made explicit by placing an id on the ci element in the bvar element and referring to it using the definitionURL attribute on all other instances. An example of this approach is shown at the end of this section. This id-based approach is especially helpful when constructions involving bound variables are nested.

It can be necessary to associate additional information with a bound variable or with one or more instances of it. The information might be something like a detailed mathematical type, an alternative presentation or encoding, or a domain of application. Such associations are accomplished in the standard way by replacing a ci element (even inside the bvar element) by a semantics element containing both it and the additional information. Recognition of an instance of the bound variable is still based on the actual ci elements and not the semantics elements or anything else they may contain. The id-based approach outlined above may still be used.

<bind>
  <intfun/>
  <bvar><ci id="var-x">x</ci></bvar>
    <apply><power/><ci definitionURL="#var-x">x</ci><cn>7</cn></apply>
</bind>
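The two ways of recognizing instances of a bound variable described in this section can be sketched as follows. This Python fragment is an illustration only: the function names are hypothetical, identifiers are assumed to be plain PCDATA, semantics wrappers are not unwrapped, and a nested bind that rebinds the same name is not treated specially.

```python
import xml.etree.ElementTree as ET


def normalized(ci):
    """Collapse runs of white space in the PCDATA content of a ci element."""
    return " ".join((ci.text or "").split())


def bound_occurrences(bind):
    """Return the ci elements in the body of a bind that refer to its bound variables."""
    declared = [ci for bv in bind.findall("bvar") for ci in bv.findall("ci")]
    names = {normalized(ci) for ci in declared if normalized(ci)}
    refs = {"#" + ci.get("id") for ci in declared if ci.get("id")}
    hits = []
    for child in bind:
        if child.tag == "bvar":
            continue  # the declarations themselves are not occurrences
        for ci in child.iter("ci"):
            # Match either by explicit reference or by normalized content.
            if ci.get("definitionURL") in refs or normalized(ci) in names:
                hits.append(ci)
    return hits
```

Applied to the forall example earlier in this section, the function returns the two x elements inside the minus application; applied to the example with id="var-x", it returns the single ci that carries definitionURL="#var-x".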
Issue integrals_om_mathml (ISSUE-9)
Sort out integrals between OpenMath and MathML

Integrals are used differently in OpenMath and MathML. In OpenMath, we have two symbols: int@calculus1 for the indefinite integral and defint@calculus1 for the definite integral. Both are operators applied to functions (to be constructed by lambda). In MathML we have the int element for definite and indefinite integrals, and it can be used both as a binder and as an applied operator.

Both usage patterns are sensible, but we must (the CDs mandate it) distinguish between binder and applied symbols. The question now is how best to deal with legacy representations of integrals; there are lots of them out there.

Resolution None recorded

4.2.7 Qualifiers

The integrals we have seen so far have all been indefinite, i.e. the range of the bound variable is unspecified. In many situations, we also want to specify the range of bound variables, e.g. in definite integrals. MathML3 provides the optional condition element as a general restriction mechanism for binding expressions.

A condition element contains a single child that represents a truth condition. Compound conditions are indicated by applying operators such as and in the condition. Consider for instance the following representation of a definite integral.

<bind>
  <int/>
  <bvar><ci>x</ci></bvar>
  <condition>
    <apply><in/><ci>x</ci><apply><interval/><cn>0</cn><infty/></apply></apply>
  </condition>
  <apply><sin/><ci>x</ci></apply>
</bind>

Here the condition element restricts the bound variable to range over the non-negative real numbers. A number of common mathematical constructions involve such restrictions, either implicit in conventional notation, such as a bound variable, or thought of as part of the operator rather than an argument, as is the case with the limits of a definite integral.

A typical use of the condition qualifier is to define sets by rule, rather than enumeration. The following markup, for instance, encodes the set {x | x < 1}:

<bind><set/>
  <bvar><ci>x</ci></bvar>
  <condition><apply><lt/><ci>x</ci><cn>1</cn></apply></condition>
  <ci>x</ci>
</bind>

In the context of quantifier operators, this corresponds to the "such that" construct used in mathematical expressions. The next example encodes "for all x in N there exist prime numbers p, q such that p+q = 2x".

<bind><forall/>
  <bvar><ci>x</ci></bvar>
  <condition><apply><in/><ci>x</ci><naturalnumbers/></apply></condition>
  <bind><exists/>
     <bvar><ci>p</ci></bvar>
     <bvar><ci>q</ci></bvar>
     <condition>
       <apply><and/>
         <apply><in/><ci>p</ci><primes/></apply>
         <apply><in/><ci>q</ci><primes/></apply>
       </apply>
     </condition>
     <apply><eq/>
        <apply><plus/><ci>p</ci><ci>q</ci></apply>
        <apply><times/><cn>2</cn><ci>x</ci></apply>
     </apply>
   </bind>
</bind>

This use extends to multivariate domains by using extra bound variables and a domain corresponding to a Cartesian product, as in

<bind><intexp/>
  <bvar><ci>x</ci></bvar>
  <bvar><ci>y</ci></bvar>
  <condition>
    <apply>
      <and/>
      <apply><leq/><cn>0</cn><ci>x</ci></apply>
      <apply><leq/><ci>x</ci><cn>1</cn></apply>
      <apply><leq/><cn>0</cn><ci>y</ci></apply>
      <apply><leq/><ci>y</ci><cn>1</cn></apply>
    </apply>
  </condition>
  <apply>
    <times/>
    <apply><power/><ci>x</ci><cn>2</cn></apply>
    <apply><power/><ci>y</ci><cn>3</cn></apply>
  </apply>
</bind>
Issue bvar_children wiki (member only)
How many bound variables per bvar element?

the OMBVAR allows multiple bound variables, for the MathML, I am not so sure, what do we want?

Resolution None recorded

4.2.8 Structure Sharing

To conserve space, MathML3 expression trees can make use of structure sharing via the share element. This element has an href attribute whose value is a URI referencing the id attribute of a MathML expression tree. When building the MathML expression tree, the share element is replaced by a copy of the MathML expression tree referenced by the href attribute. Note that this copy is structurally equal, but not identical, to the element referenced. The values of the href attribute will often be relative URI references, in which case they are resolved using the base URI of the document containing the share element.

Issue share_presentation wiki (member only)
share in Presentation MathML as well?

In order to get parallel markup working, we might want to introduce a sharing element for presentation MathML as well. That would also potentially give us size benefits.

Resolution None recorded

For instance, the mathematical object f(f(f(a,a),f(a,a)),f(f(a,a),f(a,a))) can be encoded as either one of the following representations (and some intermediate versions as well).

<math>                          <math>
  <apply>                         <apply>
    <ci>f</ci>                      <ci>f</ci>
    <apply>                         <apply id="t1">
      <ci>f</ci>                      <ci>f</ci>
      <apply>                         <apply id="t11">
        <ci>f</ci>                      <ci>f</ci>
        <ci>a</ci>                      <ci>a</ci>
        <ci>a</ci>                      <ci>a</ci>
      </apply>                        </apply>
      <apply>                         <share href="#t11"/>
        <ci>f</ci>
        <ci>a</ci>
        <ci>a</ci>
      </apply>
    </apply>                        </apply>
    <apply>                         <share href="#t1"/>
      <ci>f</ci>
      <apply>
        <ci>f</ci>
        <ci>a</ci>
        <ci>a</ci>
      </apply>
      <apply>
        <ci>f</ci>
        <ci>a</ci>
        <ci>a</ci>
      </apply>
    </apply>
  </apply>                        </apply>
</math>                         </math>
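
The replacement of share elements by copies of their targets can be sketched in a few lines. The following is only an illustration, not a normative algorithm: it assumes Python's standard xml.etree.ElementTree module, markup without namespaces, references given as same-document fragments in the href attribute, and input that satisfies the acyclicity constraint. The function names expand_shares and term are hypothetical, and dropping the id attributes from the copies is our own choice to avoid duplicate identifiers.

```python
import copy
import xml.etree.ElementTree as ET


def expand_shares(root):
    """Return a copy of `root` in which every share element is replaced
    by a copy of the element whose id its href attribute references."""
    root = copy.deepcopy(root)
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}

    def expand(elem):
        for index, child in enumerate(list(elem)):
            if child.tag == "share":
                target = by_id[child.get("href").lstrip("#")]
                replacement = copy.deepcopy(target)
                for node in replacement.iter():
                    node.attrib.pop("id", None)  # copies must not repeat ids
                replacement.tail = child.tail
                elem[index] = replacement
                child = replacement
            expand(child)

    expand(root)
    return root


def term(elem):
    """Render ci/apply trees as text such as f(a,a), for inspection only."""
    if elem.tag == "apply":
        head, *args = list(elem)
        return term(head) + "(" + ",".join(term(a) for a in args) + ")"
    return (elem.text or "").strip()
```

Applied to the shared representation above, term(expand_shares(root)[0]) yields the same text as for the fully written-out tree, which reflects that the two encodings are structurally equal.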

We say that an element dominates all its children and all elements they dominate. A share element dominates its target, i.e. the element that carries the id attribute pointed to by the href attribute. For instance, in the representation above, the apply element with id="t1" and also the second share dominate the apply element with id="t11".

The occurrences of the share element must obey the following global acyclicity constraint: An element may not dominate itself. For instance the following representation violates this constraint:

<apply id="foo">
  <divide/>
  <cn>1</cn>
  <apply>
    <plus/>
    <cn>1</cn>
    <share href="#foo"/>
  </apply>
</apply>

Here, the apply element with id="foo" dominates its third child, which dominates the share element, which dominates its target: the element with id="foo". So by transitivity, this element dominates itself, and by the acyclicity constraint, it is not a MathML expression tree. Even though it could be given the interpretation of the continued fraction \frac{1}{1 + \frac{1}{1 + \frac{1}{1 + \ldots}}}, this would correspond to an infinite tree of applications, which is not admitted by Content MathML.

Note that the acyclicity constraint is not restricted to such simple cases, as the following example shows:

<apply id="bar">                <apply id="baz">
  <plus/>                         <plus/>
  <cn>1</cn>                      <cn>1</cn>
  <share href="#baz"/>            <share href="#bar"/>
</apply>                        </apply>

Here, the apply with id="bar" dominates its third child, the share that references "baz", which dominates its target, the apply with id="baz". This in turn dominates its third child, the share that references "bar", which finally dominates its target, the original apply element with id="bar". So this pair of representations violates the acyclicity constraint.
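
The acyclicity constraint can be checked mechanically by treating "dominates" as the edges of a directed graph and searching for a cycle. The sketch below is one possible way to do this, under the same assumptions as the examples (no namespaces, same-document references in an href attribute, all fragments under one common root); the function name dominates_itself is hypothetical.

```python
import xml.etree.ElementTree as ET


def dominates_itself(root):
    """Return True if some element dominates itself via share references."""
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}

    def successors(elem):
        # An element dominates its children; a share also dominates its target.
        yield from elem
        if elem.tag == "share":
            target = by_id.get(elem.get("href", "").lstrip("#"))
            if target is not None:
                yield target

    state = {}  # id(elem) -> "open" while on the search path, "done" afterwards

    def has_cycle(elem):
        state[id(elem)] = "open"
        for nxt in successors(elem):
            mark = state.get(id(nxt))
            if mark == "open" or (mark is None and has_cycle(nxt)):
                return True
        state[id(elem)] = "done"
        return False

    return any(state.get(id(e)) is None and has_cycle(e) for e in root.iter())
```

A depth-first search suffices here because an element dominates itself exactly when it lies on a cycle of child and share-target edges.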

Note that the share element is a syntactic referencing mechanism: a share element stands for the exact element it points to. In particular, referencing does not interact with binding in a semantically intuitive way, since it allows for variable capture. Consider for instance

<bind id="outer">
  <lambda/>
  <bvar><ci>x</ci></bvar>
  <apply>
    <ci>f</ci>
    <bind id="inner">
      <lambda/>
      <bvar><ci>x</ci></bvar>
      <share id="copy" href="#orig"/>
    </bind>
    <apply id="orig"><ci>g</ci><ci>x</ci></apply>
  </apply>
</bind>

This represents the term \lambda{x}.f(\lambda{x}.g(x),g(x)), which has two sub-terms of the form g(x), one with id="orig" (the one explicitly represented) and one with id="copy", represented by the share element. In the original, the variable x is bound by the outer bind element, and in the copy, the variable x is bound by the inner bind element. We say that the inner bind has captured the variable x.

It is well-known that variable capture does not conserve semantics. For instance, we could use α-conversion to rename the inner occurrence of x into, say, y, arriving at the (same) object \lambda{x}.f(\lambda{y}.g(y),g(x)). Using references that capture variables in this way can easily lead to representation errors, and is not recommended.

4.2.9 Semantic Mapping

This section explains the use of the semantic mapping elements semantics, annotation and annotation-xml.

The use of content markup rather than presentation markup for mathematics is sometimes referred to as semantic tagging [Buswell1996]. The parse-tree of a valid element structure using MathML content elements corresponds directly to the expression tree of the underlying mathematical expression. We therefore regard the content tagging itself as encoding the syntax of the mathematical expression. This is, in general, sufficient to obtain some rendering and even some symbolic manipulation (e.g. polynomial factorization).

However, even in such apparently simple expressions as X + Y, some additional information may be required for applications such as computer algebra. Are X and Y integers, or functions, etc.? "Plus" represents addition over which field? This additional information is referred to as semantic mapping. In MathML, this mapping is provided by the semantics, annotation and annotation-xml elements.

For example, the MathML representation

<semantics>
  <mrow>
    <mrow>
      <mo>sin</mo>
      <mfenced open="(" close=")"><mi>x</mi></mfenced>
    </mrow>
    <mo>+</mo>
    <mn>5</mn>
  </mrow>
  <annotation encoding="Maple">sin(x) + 5</annotation>
  <annotation-xml encoding="MathML-Content">
    <apply>
      <plus/>
      <apply><sin/><ci>x</ci></apply>
      <cn>5</cn>
    </apply>
  </annotation-xml>
  <annotation encoding="Mathematica">Sin[x] + 5</annotation>
  <annotation encoding="TeX"> \sin x + 5</annotation>
  <annotation-xml encoding="OpenMath">
    <OMA xmlns="http://www.openmath.org/OpenMath">
      <OMS cd="arith1" name="plus"/>
      <OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>
      <OMI>5</OMI>
    </OMA>
  </annotation-xml>
</semantics>

binds together various representations of the sum of the sine function applied to a variable x and the number 5. In the sense of a semantic mapping discussed above, we annotate the presentation element in the first child of the semantics element with a content MathML expression tree that clarifies the meaning of the parts involved. See Chapter 5 Combining Presentation and Content Markup for extensions of this idea.

Of course, providing an explicit semantic mapping at all is optional, and in general would only be provided where there is some requirement to process or manipulate the underlying mathematics.

The semantics element is the container element that associates additional representations with a given MathML construct. The semantics element has as its first child the expression being annotated, and the subsequent children are the annotations. There is no restriction on the kind of annotation that can be attached using the semantics element. For example, one might give a TeX encoding, or computer algebra input, or even detailed mathematical type information in an annotation. A definitionURL attribute is used on the annotation to indicate when the semantics of an annotation differs significantly from that of the original expression.

The representations that are XML based are enclosed in an annotation-xml element while those representations that are to be parsed as PCDATA are enclosed in an annotation element.

The semantics element takes the definitionURL and encoding attributes, which can be used to reference an external source for some or all of the semantic information.

An important purpose of the semantics construct is to associate specific semantics with a particular presentation, or additional presentation information with a content construct. The default rendering of a semantics element is the default rendering of its first child. When a MathML-presentation annotation is provided, a MathML renderer may optionally use this information to render the MathML construct. This would typically be the case when the first child is a MathML content construct and the annotation is provided to give a preferred rendering differing from the default for the content elements.
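
A consumer that wants to pick a particular annotation, or fall back to the first child for rendering, could proceed roughly as follows. This is only a sketch: it assumes Python's standard xml.etree.ElementTree module and markup without namespaces, and the function name split_semantics is hypothetical.

```python
import xml.etree.ElementTree as ET


def split_semantics(semantics):
    """Return (annotated expression, {encoding: annotation element})."""
    expression, *annotations = list(semantics)
    by_encoding = {
        a.get("encoding"): a
        for a in annotations
        if a.tag in ("annotation", "annotation-xml")
    }
    return expression, by_encoding
```

The first component is what a renderer displays by default; the second lets an application look up, say, the "MathML-Content" annotation when it needs to process the underlying mathematics.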

Use of semantics to attach additional information in-line to a MathML construct can be contrasted with use of the csymbol for referencing external semantics. See Section 4.2.3 Symbols.

The semantics element is a semantic mapping element.

The annotation-xml container element is used to contain representations that are XML based. It is always used together with the semantics element.

The annotation-xml element takes the attributes definitionURL and encoding that can be used to override the default semantics. Only the encoding attribute is required whenever the semantics remains unchanged.

The annotation element is the container element for a semantic annotation in a non-XML format.

The annotation element takes the attributes definitionURL and encoding that can be used to override the default semantics. Only the encoding attribute is required whenever the semantics remains unchanged.

The annotation element is a semantic mapping element. It is always used with semantics.

4.2.10 In Situ Error Markup

An error is made up of a symbol and a sequence of zero or more MathML expression trees. This object has no direct mathematical meaning. Errors occur as the result of some treatment on an expression tree and are thus of real interest only when some sort of communication is taking place. Errors may occur inside other objects and also inside other errors. Error objects might consist only of a symbol.

To encode an error caused by a division by zero, we would employ an aritherror Content Dictionary with a DivisionByZero symbol with role error, and use the following expression tree:

<cerror>
  <csymbol cd="aritherror" name="DivisionByZero"/>  
  <apply><divide/><ci>x</ci><cn>0</cn></apply>
</cerror>

Note that the error should cover the smallest erroneous subexpression so cerror can be a subexpression of a bigger one, e.g.

<apply><eq/>
  <cerror>
    <csymbol cd="aritherror" name="DivisionByZero"/>  
    <apply><divide/><ci>x</ci><cn>0</cn></apply>
  </cerror>
  <cn>0</cn>
</apply>

If an application wishes to signal that the MathML it has received is invalid or is not well-formed then the offending data must be encoded as a string. For example:

<cerror> 
  <csymbol cd="parser" name="invalid_XML"/>
  <mtext> &lt;apply&gt;&lt;cos&gt; &lt;ci&gt;v&lt;/ci&gt; &lt;/apply&gt; </mtext>
</cerror>

Note that the < and > characters have been escaped as is usual in an XML document.
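
As an illustration of how an application might produce such an error object, the following sketch tries to parse incoming markup and, on failure, wraps the offending data as a string inside a cerror element. It assumes Python's standard xml.etree.ElementTree module; the function name parse_or_cerror is hypothetical, and the symbol name is taken from the example above.

```python
import xml.etree.ElementTree as ET


def parse_or_cerror(source):
    """Parse `source`; on failure return a cerror element carrying it as text."""
    try:
        return ET.fromstring(source)
    except ET.ParseError:
        error = ET.Element("cerror")
        ET.SubElement(error, "csymbol", {"cd": "parser", "name": "invalid_XML"})
        # Plain text content: the serializer escapes < and > when writing it out.
        ET.SubElement(error, "mtext").text = source
        return error
```

Because the offending data is stored as character data, serializing the result with ET.tostring escapes the markup characters automatically, so no manual escaping step is needed.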

4.3 Rendering of Content Elements

While the primary role of the MathML content element set is to directly encode the mathematical structure of expressions independent of the notation used to present the objects, rendering issues cannot be ignored. Each content element has a default rendering, and several mechanisms (including Section 4.3.2 Attributes Modifying Content Markup Rendering) are provided for associating a particular rendering with an object.

4.3.1 General Rules

The default rendering of a simple cn-tagged object is the same as for the presentation element mn with some provision for overriding the presentation of the PCDATA by providing explicit mn tags. This is described in detail in Section 4.2.1 Numbers.

Generally, each mathematical object has global properties that impact everything from the interpretation of operations that are applied to it to how to render the symbols representing it. These mathematical properties are captured by setting attribute values (see Section 4.3.2 Attributes Modifying Content Markup Rendering) or by associating the properties with the object through the use of the semantics element.

A mathematical system that has been passed an apply element is free to do with it whatever it normally does with such mathematical data. It may be that no rendering is involved (e.g. a syntax validator), or that the "function application" is evaluated and that only the result is rendered (e.g. sin(0) \rightarrow 0).

When an unevaluated "function application" is rendered there are a wide variety of appropriate renderings. The choice often depends on the function or operator being applied. Applications of basic operations such as plus are generally presented using an infix notation while applications of sin would use a more traditional functional notation such as sin(x). Consult the default rendering for the operator being applied in its content dictionary (see Section C.2.4 Default Rendering Specifications for details). The same holds for user-defined functions (see csymbol) that are not evaluated by the receiving or rendering application unless an alternative presentation is specified using the semantics tag.

The default rendering of a semantics element is the default rendering of its first child: the annotation and annotation-xml are not rendered.

4.3.2 Attributes Modifying Content Markup Rendering

The type attribute, in addition to conveying semantic information, can be interpreted to provide rendering information. For example in

<ci type="vector">V</ci>

a renderer might display a bold V for the vector.

All content elements support the general attributes class, style, id, and other that can be used to modify the rendering of the markup. The first three are intended for compatibility with Cascading Style Sheets (CSS), as described in Section 2.4.5 Attributes Shared by all MathML Elements.

Issue other_nowadays wiki (member only)   ISSUE-3 (member only)
other is deprecated, delete the following

in particular, how would we do this nowadays?

Resolution None recorded

MathML elements accept an attribute other (see Section 7.2.3 Attributes for unspecified data), which can be used to specify things not specifically documented in MathML. On content tags, this attribute can be used by an author to express a preference between equivalent forms for a particular content element construct, where the selection of the presentation has nothing to do with the semantics. Examples might be

  • inline or displayed equations

  • script-style fractions

  • use of x with a dot for a derivative over dx/dt

Thus, if a particular renderer recognized a display attribute to select between script-style and display-style fractions, an author might write

<apply other='display="scriptstyle"'>
  <divide/>
  <cn>1</cn>
  <ci>x</ci>
</apply>

to indicate that the rendering 1/x is preferred.

The information provided in the other attribute is intended for use by specific renderers or processors, and therefore, the permitted values are determined by the renderer being used. It is legal for a renderer to ignore this information. This might be intentional, as in the case of a publisher imposing a house style, or simply because the renderer does not understand it, or is unable to carry it out.

4.4 Legacy Markup in Content MathML

MathML3 content markup differs from earlier versions of MathML in that it has been regularized and based on the content dictionary model introduced by OpenMath [OpenMath2004]. While this is the preferred representation, MathML3 also supports MathML2 markup as a legacy representation. We will discuss this representation in the following and indicate the equivalent canonical representations, which are preferred in MathML3.

4.4.1 Numbers with "constant" type

The cn element can be used with the value "constant" for the type attribute and the Unicode symbols for the content. This use of the cn is deprecated in favor of the number constants exponentiale, imaginaryi, true, false, notanumber, pi, eulergamma, and infinity in the constants content dictionary, or the use of csymbol with an appropriate value for the definitionURL. For example, instead of using the pi element, an instance of <cn type="constant">&pi;</cn> could be used.

4.4.2 Token Elements

The most common operations and functions such as plus and sin have been predefined explicitly as empty elements. The general rule is that for any symbol defined in the MathML3 content dictionaries (see Appendix C MathML3 Content Dictionaries), there is an empty content element with the same name. For instance, the empty MathML element

<plus/>

is equivalent to the element

<csymbol cdbase="http://w3.org/Math/CD" encoding="application/x-MathML-CD"
        cd="algebra-logic" name="plus"><mo>+</mo></csymbol>

Both can be used interchangeably.

Issue new_tokens_vs_csymbol wiki (member only)
tokens for the new MathML3 symbols?

do we introduce new empty elements for the new symbols for which we introduce definitions in the CDs?

Resolution None recorded

In MathML2, the definitionURL attribute could be used to modify the meaning of an element to allow essentially the same notation to be re-used for a discussion taking place in a different mathematical domain. This use of the attribute is deprecated in MathML3, in favor of using a csymbol with different referencing attributes.

4.4.3 Tokens with Attributes

Issue token_attribs wiki (member only)   ISSUE-11 (member only)
Tokens with Attributes

In MathML2, the meaning of various token elements could be specialized via various attributes, usually the type attribute. Canonical Content MathML does not have this possibility, therefore I propose to either pass these attributes as extra arguments in the apply or bind elements, or to add new symbols for the non-default case to the respective content dictionaries.

Resolution None recorded

In MathML2, the meaning of various token elements could be specialized via various attributes, usually the type attribute. Canonical Content MathML does not have this possibility, therefore these attributes are either passed to the symbols as extra arguments in the apply or bind elements, or MathML3 adds new symbols for the non-default case to the respective content dictionaries.

We will summarize the cases in the following table:

legacy Content MathML       canonical Content MathML
<diff type="function"/>     <csymbol name="diff" cd="calculus_veccalc"/>
<diff type="algebraic"/>    <csymbol name="aDiff" cd="calculus_veccalc"/>
Editorial note: MiKo  
systematically consider all the cases here

4.4.4 Container Markup

To retain compatibility with MathML2, MathML3 provides an alternative representation for applications of constructor elements. For instance for the set element, the following two representations are considered equivalent

<set><ci>a</ci><ci>b</ci><ci>c</ci></set>
<apply><set/><ci>a</ci><ci>b</ci><ci>c</ci></apply>

and, following the discussion in Section 4.2.3 Symbols, they are equivalent to

<apply><csymbol name="set" cd="sets"/><ci>a</ci><ci>b</ci><ci>c</ci></apply>

Other constructors are interval, list, matrix, matrixrow, vector, apply, lambda, piecewise, piece, otherwise.
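
The rewriting from the container form to the application form can be sketched as a simple recursive transformation. The example below is illustrative only: it assumes Python's standard xml.etree.ElementTree module and markup without namespaces, it handles just a few flat constructors (the set named SIMPLE_CONSTRUCTORS and the function container_to_apply are hypothetical), and carrying container attributes over to the operator element is our own assumption rather than something stated here.

```python
import xml.etree.ElementTree as ET

# Flat constructors only; matrix, piecewise and friends need more care.
SIMPLE_CONSTRUCTORS = {"set", "list", "vector"}


def container_to_apply(elem):
    """Rewrite <set>a b c</set> as <apply><set/>a b c</apply>, recursively."""
    children = [container_to_apply(child) for child in elem]
    if elem.tag in SIMPLE_CONSTRUCTORS and children:
        result = ET.Element("apply")
        # Assumption: attributes of the container move to the operator element.
        result.append(ET.Element(elem.tag, dict(elem.attrib)))
        result.extend(children)
        return result
    result = ET.Element(elem.tag, dict(elem.attrib))
    result.text = elem.text
    result.extend(children)
    return result
```

An empty constructor element is left untouched, since it cannot be told apart from the operator symbol itself.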

Issue dom_for_containers wiki (member only)
MathML DOM for Container Elements

Do we want to prescribe one of the representations for the DOM? That would make the processing much simpler.

Resolution

We have decided to keep the MathML DOM directly equivalent to the XML DOM, so this becomes a non-issue.

4.4.5 Domain of Application (domainofapplication) in Applications

The domainofapplication element was used in MathML2 inside an apply element to denote the domain over which a given function is being applied. In contrast to its use as a qualifier in the bind element, the usage in the apply element only marks the argument position for the range argument of the definite integral.

MathML3 supports this representation as a legacy form. For instance, the integral of a function f over an arbitrary domain C can be represented as

<apply><int/>
  <domainofapplication><ci>C</ci></domainofapplication>
  <ci>f</ci>
</apply>

in the legacy representation. It is considered equivalent to

<apply><defintfun/><ci>C</ci><ci>f</ci></apply>
Editorial note: MiKo  
be careful with Int and int here

4.4.6 Domain of Application (domainofapplication) in Bindings

The domainofapplication element was intended as an alternative to condition for specifying the range of bound variables. Generally, a domain of application D can be specified by a condition element requiring that the bound variable is a member of D. For instance, we consider the legacy representation

<apply><int/>
  <bvar><ci>x</ci></bvar>
  <domainofapplication><ci type="set">D</ci></domainofapplication>
  <apply><ci type="function">f</ci><ci>x</ci></apply>
</apply>

as equivalent to the canonical representation

<bind><intexp/>
  <bvar><ci>x</ci></bvar>
  <condition><apply><in/><ci>x</ci><ci type="set">D</ci></apply></condition>
  <apply><ci type="function">f</ci><ci>x</ci></apply>
</bind>

4.4.7 Integrals with Calling patterns

MathML2 used the int element for the definite or indefinite integral of a function or algebraic expression on some sort of domain of application. There are several forms of calling sequences depending on the nature of the arguments, and whether or not it is a definite integral. Those forms using interval, condition, lowlimit, or uplimit, provide convenient shorthand notations for an appropriate domainofapplication.

MathML3 separates the functionality of the int element into three different symbols: intfun, defintfun, and intexp. The first two are integral operators that can be applied to functions, and the last is a binding operator for integrating an algebraic expression with respect to a bound variable.

The following two indefinite function integrals are equivalent.

<apply><int/><sin/></apply>
<apply><intfun/><sin/></apply>

The following two definite function integrals are equivalent (see also Section 4.4.6 Domain of Application (domainofapplication) in Bindings).

<apply><int/>
 <domainofapplication><ci type="set">D</ci></domainofapplication>
 <sin/>
</apply>
<apply><defintfun/><ci type="set">D</ci><sin/></apply>

The following two indefinite integrals over algebraic expressions are equivalent.

<apply><int/><bvar><ci>x</ci></bvar><apply><sin/><ci>x</ci></apply></apply>
<bind><intexp/><bvar><ci>x</ci></bvar><apply><sin/><ci>x</ci></apply></bind>

The following two definite integrals over algebraic expressions are equivalent.

<apply><int/>
 <bvar><ci>x</ci></bvar>
 <domainofapplication><ci type="set">D</ci></domainofapplication>
 <apply><sin/><ci>x</ci></apply>
</apply>
<bind><intexp/>
 <bvar><ci>x</ci></bvar>
 <domainofapplication><ci type="set">D</ci></domainofapplication>
 <apply><sin/><ci>x</ci></apply>
</bind>
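
The four calling patterns above can be told apart by looking only at the children of the legacy apply element. The following sketch shows one way a converter might choose the MathML3 symbol; it assumes Python's standard xml.etree.ElementTree module and markup without namespaces, and the function name canonical_integral_symbol is hypothetical.

```python
import xml.etree.ElementTree as ET

QUALIFIERS = {"domainofapplication", "condition", "interval", "lowlimit", "uplimit"}


def canonical_integral_symbol(legacy):
    """Name the MathML3 symbol matching a legacy <apply><int/>...</apply>."""
    if legacy.tag != "apply" or len(legacy) == 0 or legacy[0].tag != "int":
        raise ValueError("expected an apply element headed by int")
    if legacy.find("bvar") is not None:
        return "intexp"      # binder over an algebraic expression
    if any(child.tag in QUALIFIERS for child in legacy[1:]):
        return "defintfun"   # definite integral of a function
    return "intfun"          # indefinite integral of a function
```

The presence of a bvar child is tested first because it decides between the binding operator and the two applied operators, independently of any qualifier.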

4.4.8 degree

The degree element is a qualifier used by some MathML containers to specify that, for example, a bound variable is repeated several times.

Editorial note: MiKo  
specify a complete list of containers that allow degree elements, so far I see diff, partialdiff, root

The degree element is the container element for the "degree" or "order" of an operation. There are a number of basic mathematical constructs that come in families, such as derivatives and moments. Rather than introduce special elements for each of these families, MathML uses a single general construct, the degree element for this concept of "order".

<bind><diff/>
  <bvar><ci>x</ci><degree><cn>2</cn></degree></bvar>
  <apply><power/><ci>x</ci><cn>5</cn></apply>
</bind>
<bind>
  <partialdiff/>
  <bvar>
    <ci>x</ci>
    <degree><ci> n </ci></degree>
  </bvar>
  <bvar>
    <ci>y</ci>
    <degree><ci>m</ci></degree>
  </bvar>
  <apply><sin/>
    <apply><times/><ci>x</ci><ci>y</ci></apply>
  </apply>
</bind>

A variable that is to be bound is placed in a bvar container. In a derivative, it indicates the variable with respect to which a function is being differentiated. When the bvar element is used to qualify a derivative, it may contain a child degree element that specifies the order of the derivative with respect to that variable.

<apply>
  <diff/>
  <bvar>
    <ci>x</ci>
    <degree><cn>2</cn></degree>
  </bvar>
  <apply><power/><ci>x</ci><cn>4</cn></apply>
</apply>

This representation is equivalent to

<bind>
  <apply><diff/><cn>2</cn></apply>
  <bvar><ci>x</ci></bvar>
  <apply><power/><ci>x</ci><cn>4</cn></apply>
</bind>
Editorial note: MiKo  
what do we want to use for degree?

Note that the degree element is only allowed in the container representation. The canonical representation takes the degree as a regular argument as the second child of the apply or bind element.

Editorial note: MiKo  
Make sure that all MMLdefinitions of degree-carrying symbols get a paragraph like the one for root.

The default rendering of the degree element and its contents depends on the context. In the example above, the degree elements would be rendered as the exponents in the differentiation symbols:

\frac{\partial^{n+m}}{\partial x^n \partial y^m} \sin(xy)

4.4.9 Upper and Lower Limits

The uplimit and lowlimit elements are legacy qualifiers that can be used to restrict the range of a bound variable to an interval, e.g. in some integrals and sums. uplimit/lowlimit pairs can be expressed via the interval element from the CD Basic Content Elements. For instance, we consider the legacy representation

<apply><int/>
  <bvar><ci> x </ci></bvar>
  <lowlimit><ci>a</ci></lowlimit>
  <uplimit><ci>b</ci></uplimit>
  <apply><ci type="function">f</ci><ci>x</ci></apply>
</apply>

as equivalent to the following canonical representation

<bind><int/>
  <bvar><ci>x</ci></bvar>
  <condition>
    <apply><in/><ci>x</ci><apply><interval/><ci>a</ci><ci>b</ci></apply></apply>
  </condition>
  <apply><ci type="function">f</ci><ci>x</ci></apply>
</bind>

If the lowlimit qualifier is missing, it is interpreted as negative infinity; similarly, if the uplimit qualifier is missing, it is interpreted as positive infinity.
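
The translation of a lowlimit/uplimit pair into a condition can be sketched as follows. This is an illustration under several assumptions: Python's standard xml.etree.ElementTree module, markup without namespaces, a single bound variable, and both limits present (the defaults for a missing limit are deliberately not handled). The function name limits_to_condition is hypothetical, and the operator element is copied unchanged.

```python
import copy
import xml.etree.ElementTree as ET


def limits_to_condition(legacy):
    """Rewrite the lowlimit/uplimit qualifiers of a legacy apply element
    as a condition inside a bind element."""
    bvar = legacy.find("bvar")
    low, up = legacy.find("lowlimit"), legacy.find("uplimit")
    if bvar is None or low is None or up is None:
        raise ValueError("expected bvar, lowlimit and uplimit children")
    body = [c for c in legacy[1:] if c.tag not in ("bvar", "lowlimit", "uplimit")]

    interval = ET.Element("apply")
    interval.append(ET.Element("interval"))
    interval.append(copy.deepcopy(low[0]))
    interval.append(copy.deepcopy(up[0]))

    membership = ET.Element("apply")
    membership.append(ET.Element("in"))
    membership.append(copy.deepcopy(bvar[0]))
    membership.append(interval)

    bind = ET.Element("bind")
    bind.append(copy.deepcopy(legacy[0]))   # operator, copied as is
    bind.append(copy.deepcopy(bvar))
    ET.SubElement(bind, "condition").append(membership)
    bind.extend([copy.deepcopy(c) for c in body])
    return bind
```

The resulting condition states that the bound variable is a member of the interval built from the two limits, mirroring the canonical representation shown above.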

4.4.10 Lifted Associative Commutative Operators

Issue lifted_operators wiki (member only)   ISSUE-8 (member only)
New Symbols for Lifted Operators

MathML2 allowed the use of n-ary operators as binding operators with bound variables induced by them. For instance union could be used as the equivalent for the TeX \cup as well as \bigcup. While the relation between the n-ary and the set-based operators is deterministic, i.e. the induced big operators are fully determined by them, the concepts are quite different in nature (different notational conventions, different types, different occurrence schemata). I therefore propose to extend the MathML K-14 CDs with symbols for big operators, much like we already have sum as the big operator for the n-ary plus symbol, and prod for times. For the new symbols, I propose the naming convention of capitalizing the big operators (as an alternative, we could follow TeX and prepend "big"). For example we could have Union as a big operator for union.

Resolution None recorded

MathML2 allowed associative operators to be "lifted" to "big operators", for instance the n-ary union operator to the union operator over sets, as in the union of the U-complements over a family F of sets in this construction:

<apply>
  <union/>
  <bvar><ci>S</ci></bvar>
  <condition>
    <apply><in/><ci>S</ci><ci>F</ci></apply>
  </condition>
  <apply><setdiff/><ci>U</ci><ci>S</ci></apply>
</apply>

While the relation between the n-ary and the set-based operators is deterministic, i.e. the induced big operators are fully determined by them, the concepts are quite different in nature (different notational conventions, different types, different occurrence schemata). Therefore the MathML3 content dictionaries provide explicit symbols for the "big operators", much like MathML2 did with sum as the big operator for the n-ary plus symbol, and prod for times. Concretely, these are Union, Intersect, Max, Min, Gcd, Lcm, Or, And, and Xor. With these, we can express all legacy expressions. For instance, the union above can be represented canonically as

<bind><Union/>
  <bvar><ci>S</ci></bvar>
  <condition>
    <apply><in/><ci>S</ci><ci>F</ci></apply>
  </condition>
  <apply><setdiff/><ci>U</ci><ci>S</ci></apply>
</bind>
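
The rewriting from a lifted legacy operator to the corresponding big operator is mechanical, as the following sketch suggests. It assumes Python's standard xml.etree.ElementTree module and markup without namespaces; the table BIG_OPERATORS pairs the MathML2 element names with the capitalized symbols listed above, and the function name lift_operator is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

BIG_OPERATORS = {
    "union": "Union", "intersect": "Intersect", "max": "Max", "min": "Min",
    "gcd": "Gcd", "lcm": "Lcm", "or": "Or", "and": "And", "xor": "Xor",
}


def lift_operator(legacy):
    """Turn <apply><union/><bvar>...</apply> into <bind><Union/>...</bind>."""
    lifted = (
        legacy.tag == "apply"
        and len(legacy) > 0
        and legacy[0].tag in BIG_OPERATORS
        and legacy.find("bvar") is not None
    )
    if not lifted:
        return copy.deepcopy(legacy)  # ordinary n-ary use, left unchanged
    bind = ET.Element("bind", dict(legacy.attrib))
    bind.append(ET.Element(BIG_OPERATORS[legacy[0].tag]))
    bind.extend([copy.deepcopy(child) for child in legacy[1:]])
    return bind
```

Only applications that carry a bvar child are rewritten; a plain n-ary application such as the union of two sets keeps its original form.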

For the exact meaning of the new symbols, consult the content dictionaries.

4.4.11 Declare (declare)

The declare element is a legacy construct with two primary roles. The first is to change or set the default attribute values for a mathematical identifier. The second is to introduce a new identifier "name" for an object. Once a declaration is in effect, the identifier

<ci>name</ci>

acquires the new attribute settings, and (if the second object is present) stands for the object. The actual instances of a declared ci element are normally recognized by comparing their content with that of the declared element. Equality of two elements is determined by comparing the XML information set of the two expressions after XML space normalization (see [XPath]).

All declare elements must occur at the beginning of a math element. The scope of a declaration is "local" to the surrounding math element. The scope attribute can only be assigned to "local". It was intended to support future extensions, but MathML3 contains no provision for making document-wide declarations, so the scope remains fixed to local.

Occurrences of declare with only one argument can be eliminated by adding the respective attributes to all other occurrences of the same identifier in the respective math element. E.g.

<math>
  <declare type="function" nargs="nary"><ci>F</ci></declare>
  <apply><eq/>
    <apply><ci>F</ci><ci>X</ci><ci>Y</ci></apply>
    <apply><ci>F</ci><ci>Y</ci><ci>X</ci></apply>
  </apply>
</math>

is equivalent to the representation

<math>
  <apply><eq/>
    <apply><ci type="function" nargs="nary">F</ci><ci>X</ci><ci>Y</ci></apply>
    <apply><ci type="function" nargs="nary">F</ci><ci>Y</ci><ci>X</ci></apply>
  </apply>
</math>
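
The elimination of single-argument declarations can be sketched as below. The sketch assumes Python's standard xml.etree.ElementTree module and markup without namespaces, and it simplifies the identifier comparison to a comparison of whitespace-stripped text. The function name inline_attribute_declares is hypothetical; letting attributes already present on a ci take precedence, and skipping the scope attribute, are our own choices.

```python
import copy
import xml.etree.ElementTree as ET


def inline_attribute_declares(math):
    """Copy the attributes of single-argument declare elements onto every
    matching ci in the same math element, then drop the declarations."""
    math = copy.deepcopy(math)
    for declare in [d for d in math.findall("declare") if len(d) == 1]:
        name = (declare[0].text or "").strip()
        math.remove(declare)
        for ci in math.iter("ci"):
            if (ci.text or "").strip() != name:
                continue
            for key, value in declare.attrib.items():
                if key != "scope":
                    # Assumption: explicit attributes on the ci win.
                    ci.attrib.setdefault(key, value)
    return math
```

Run on the first representation above, this produces a math element whose two occurrences of F carry type="function" and nargs="nary", as in the second representation.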

Occurrences of the declare element with a second argument can be eliminated with the help of the MathML share element. If the declared identifier (the first child of the declare element) is not used in the expression, the declare element can be dropped. If it is used once, it can simply be replaced with the second declare child. If it is used two or more times, we replace one of its occurrences with the second declare child, add a new id attribute, and replace all other occurrences by share elements that point to this. For instance

<math>
  <declare>
    <ci>fivefac</ci>
    <apply><times/><cn>1</cn><cn>2</cn><cn>3</cn><cn>4</cn><cn>5</cn></apply>
  </declare>
  <apply><times/>
     <ci>fivefac</ci>
     <ci>fivefac</ci>
     <ci>fivefac</ci>
  </apply>
</math>

is equivalent to

<math>
  <apply><times/>
    <apply id="newfoo"><times/><cn>1</cn><cn>2</cn><cn>3</cn><cn>4</cn><cn>5</cn></apply>
    <share href="#newfoo"/>
    <share href="#newfoo"/>
  </apply>
</math>
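
The procedure for two-argument declarations can likewise be sketched in code. The example assumes Python's standard xml.etree.ElementTree module, markup without namespaces, references written as fragments in the href attribute, and identifier comparison by whitespace-stripped text. The function name inline_value_declare and its new_id parameter are hypothetical; for simplicity the sketch also adds the id when the identifier is used only once, which is harmless but not required.

```python
import copy
import xml.etree.ElementTree as ET


def inline_value_declare(math, new_id):
    """Eliminate the first two-argument declare in `math` using share."""
    math = copy.deepcopy(math)
    declare = next((d for d in math.findall("declare") if len(d) == 2), None)
    if declare is None:
        return math
    name, value = (declare[0].text or "").strip(), declare[1]
    math.remove(declare)
    first = True
    # Snapshot the tree first so that inserted copies are not revisited.
    for parent in list(math.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "ci" and (child.text or "").strip() == name:
                if first:
                    new = copy.deepcopy(value)
                    new.set("id", new_id)
                    first = False
                else:
                    new = ET.Element("share", {"href": "#" + new_id})
                new.tail = child.tail
                parent[index] = new
    return math
```

For the fivefac example, calling inline_value_declare(math, "newfoo") gives one full copy of the product carrying id="newfoo" followed by two share elements that reference it.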