Parsing MathML

A.1 Use of MathML as Well-Formed XML

DTD and W3C XML Schema
Issue update_schema	`wiki (member only)`
DTD and W3C XML Schema need updating to MathML3
Resolution	None recorded

A MathML document must be a well-formed XML document using elements in the MathML namespace as defined by this specification. However, it is not required that the document refer to any specific Document Type Definition (DTD) or schema that specifies MathML. It is sometimes advantageous not to specify such a language definition: these files are large, often much larger than the MathML expression, and unless they have been previously cached by the MathML application, the time taken to fetch the DTD or schema may have an appreciable effect on the processing of the MathML document.

Note also that if no DTD is specified with a DOCTYPE declaration, entity references (for example, to refer to MathML characters by name) may not be used. The document should be encoded in an encoding (for example UTF-8) in which all needed characters may be encoded as character data, or characters may be referenced using numeric character references, for example &#x222B; rather than &int;.
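As a rough illustration of this point, the following Python sketch parses the same fragment twice without a DOCTYPE declaration: once with a numeric character reference and once with the named entity. The choice of the standard-library `xml.etree.ElementTree` parser and the helper name `parse_operator` are assumptions made for the example, not something this specification prescribes.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def parse_operator(source):
    """Return the text of the first <mo> child, or None if the
    fragment is rejected as not well-formed."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        # Without a DTD, an undeclared named entity cannot be resolved.
        return None
    mo = root.find("{%s}mo" % MATHML_NS)
    return None if mo is None else mo.text

# Numeric character reference: needs no DTD.
numeric = '<math xmlns="%s"><mo>&#x222B;</mo></math>' % MATHML_NS
# Named entity reference: only meaningful if a DTD declares it.
named = '<math xmlns="%s"><mo>&int;</mo></math>' % MATHML_NS

print(parse_operator(numeric) == "\u222b")
print(parse_operator(named) is None)
```

Other XML parsers may report the unresolved entity differently, but a conforming parser should not silently accept it.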

If a MathML fragment is parsed without a DTD, in other words as a well-formed XML fragment, it is the responsibility of the processing application to treat the white space characters occurring outside of token elements as not significant.
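One way an application might meet this responsibility is sketched below in Python. The set of token element names and the function names are assumptions for the example; a real processor would take the list of token elements from the relevant chapters of this specification.

```python
import xml.etree.ElementTree as ET

XML_SPACE = " \t\r\n"
# Assumed list of token elements, for illustration only.
TOKEN_ELEMENTS = {"mi", "mn", "mo", "mtext", "ms", "ci", "cn", "csymbol"}

def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]

def strip_insignificant_whitespace(root):
    """Remove white-space-only text that sits outside token elements."""
    for element in root.iter():
        if local_name(element.tag) in TOKEN_ELEMENTS:
            continue  # white space inside tokens is left to token rules
        if element.text is not None and not element.text.strip(XML_SPACE):
            element.text = None
        for child in element:
            # A child's tail is text belonging to this (non-token) parent.
            if child.tail is not None and not child.tail.strip(XML_SPACE):
                child.tail = None
    return root

sample = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">\n'
    ' <mrow>\n  <mi> x </mi>\n  <mo>+</mo>\n </mrow>\n'
    '</math>'
)
strip_insignificant_whitespace(sample)
```

The sketch deliberately leaves the content of token elements untouched, since white space there is governed by separate rules.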

However, in many circumstances, especially while producing or editing MathML, it is useful to use a language definition to constrain the editing process or to check the correctness of generated files. The following section, Section A.2 Using the RelaxNG Schema for MathML3, discusses the RelaxNG Schema for MathML3 [RelaxNG], which forms a normative part of the specification. Following that, Section A.3 Using the MathML DTD and Section A.4 Using the MathML XML Schema discuss alternative language definitions using document type definitions (DTD) and the W3C XML Schema language [XMLSchemas], both of which are derived automatically from the normative RelaxNG schema. Note that the schema definition of the language is currently stricter than the DTD version. That is, a schema-validating processor will declare invalid some documents that are declared valid by a DTD-validating XML parser. This is partly because the XML Schema language can express additional constraints not expressible in the DTD, and partly because, for reasons of compatibility with earlier releases, the DTD is intentionally forgiving in some places and does not enforce constraints that are specified in the text of this specification.

A.2 Using the RelaxNG Schema for MathML3

MathML documents should be validated using the RelaxNG Schema for MathML, either in the XML encoding (http://www.w3.org/Math/RelaxNG/mathml3/mathml3.rng) or in compact notation (http://www.w3.org/Math/RelaxNG/mathml3/mathml3.rnc), which is also shown below.

In contrast to DTDs, there is no in-document method to associate a RelaxNG schema with a document.
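Because the association is made outside the document, the schema has to be handed to the validator explicitly. The Python sketch below assumes that the `jing` RelaxNG validator is installed and on the search path, and that local copies of the schema files exist under the hypothetical names used in the usage note. None of this is required by the specification.

```python
import subprocess

def relaxng_command(schema_path, document_path):
    """Build the argument list for validating one document against one schema."""
    command = ["jing"]
    if schema_path.endswith(".rnc"):
        command.append("-c")  # tell jing the schema is in compact syntax
    command += [schema_path, document_path]
    return command

def validate(schema_path, document_path):
    """Run the validator; True means it exited with status 0."""
    result = subprocess.run(relaxng_command(schema_path, document_path))
    return result.returncode == 0
```

A call such as `validate("mathml3.rnc", "example.mml")` would then pair the document with the schema at validation time, which is the only place the pairing is expressed.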

Editorial note: MiKo
I have included the schema verbatim for reference, a better version should be generated somehow

We provide five RelaxNG schemata for sub-languages of MathML3:

The grammar for Presentation MathML without content elements mixed in
The grammar for strict Content MathML3
The grammar for pragmatic Content MathML3 without presentation MathML in token elements
The grammar for full MathML without deprecated parts
The grammar for full MathML with deprecated parts

We present them in detail in the following sections. As the compact notation for RelaxNG grammars is more readable, we use this format here.

Note that the RelaxNG grammars here are considerably more strict than the MathML2 DTDs (even in strict mode).

A.2.1 The Grammar for Presentation MathML

#     This is the Mathematical Markup Language (MathML) 3.0, an XML
#     application for describing mathematical notation and capturing
#     both its structure and content.
#
#     Copyright 1998-2007 World Wide Web Consortium
#        (Massachusetts Institute of Technology, Institut National de
#         Recherche en Informatique et en Automatique, Keio University).
#         All Rights Reserved.
#
#     Permission to use, copy, modify and distribute the RelaxNG schema for MathML3
#     and its accompanying documentation for any purpose and without fee is
#     hereby granted in perpetuity, provided that the above copyright notice
#     and this paragraph appear in all copies.  The copyright holders make
#     no representation about the suitability of the Schema for any purpose.
#
#     This file contains the grammar rules for pure presentation MathML3, i.e. without
#     content MathML mixed in. 
#     It is provided "as is" without expressed or implied warranty.
#
#     Revision:   $Id: appendixa.html,v 1.1 2008/04/08 17:00:08 jules Exp $
#     Author: Michael Kohlhase http://kwarc.info/kohlhase

default namespace m = "http://www.w3.org/1998/Math/MathML"

include "mathml3-common.rnc"

math.content |= ContInPres

MathML.Common.attrib |= attribute class {xsd:NMTOKENS}?,attribute style {xsd:string}?


Browser-interface.attrib = attribute baseline {xsd:string}?,
                           attribute overflow {"scroll" | "elide" | "truncate" | "scale"}?,
                           attribute altimg {xsd:anyURI}?,
                           attribute alttext {xsd:string}?,
       			   attribute type {xsd:string}?,
			   attribute name {xsd:string}?,	    
			   attribute height {xsd:string}?,
			   attribute width {xsd:string}?

math.attlist |= Browser-interface.attrib,attribute display {"block" | "inline"}?

simple-size = "small" | "normal" | "big"

centering.values = "left" | "center" | "right"

named-space = "veryverythinmathspace" | "verythinmathspace" | "thinmathspace" | 
              "mediummathspace" | 
              "thickmathspace" | "verythickmathspace" | "veryverythickmathspace"
thickness = "thin" | "medium" | "thick"

# number with units used to specify lengths 

length-with-unit = 
    xsd:string #{pattern="(-?([0-9]+|[0-9]*\.[0-9]+)(em|ex|px|in|cm|mm|pt|pc|%))|0"}
length-with-optional-unit = 
   xsd:string #{pattern="-?([0-9]+|[0-9]*\.[0-9]+)(em|ex|px|in|cm|mm|pt|pc|%)?"}

# This is just "infinity" that can be used as a length 
infinity = "infinity"

# colors defined as RGB 
RGB-color = xsd:string {pattern="#(([0-9]|[a-f]){3}|([0-9]|[a-f]){6})"}

# The mathematics style attributes. These attributes are valid on all
#     presentation token elements except "mspace" and "mglyph", and on no
#     other elements except "mstyle". 

Token-style.attrib = attribute mathvariant
		       {"normal" | "bold" | "italic" | "bold-italic" | "double-struck" | 
                        "bold-fraktur" | "script" | "bold-script" | "fraktur" | 
 			"sans-serif" | "bold-sans-serif" | "sans-serif-italic" | 
			"sans-serif-bold-italic" | "monospace"}?,
                     attribute mathsize {simple-size | length-with-unit}?,

                     attribute mathcolor {xsd:string}?,
   		     attribute mathbackground {xsd:string}?

truefalse = "true" | "false"

Operator.attrib = 
# this attribute value is normally inferred from the position of
# the operator in its "<mrow>" 
   attribute form {"prefix" | "infix" | "postfix"}?,
   # set by dictionary, else it is "thickmathspace" 
   attribute lspace {length-with-unit | named-space}?,
   # set by dictionary, else it is "thickmathspace" 
   attribute rspace {length-with-unit | named-space}?,
   # set by dictionary, else it is "false" 
   attribute fence {truefalse}?,
   # set by dictionary, else it is "false" 
   attribute separator {truefalse}?,
   # set by dictionary, else it is "false" 
   attribute stretchy {truefalse}?,
   # set by dictionary, else it is "true" 
   attribute symmetric {truefalse}?,
   # set by dictionary, else it is "false" 
   attribute movablelimits {truefalse}?,
   # set by dictionary, else it is "false" 
   attribute accent {truefalse}?,
   # set by dictionary, else it is "false" 
   attribute largeop {truefalse}?,
   attribute minsize {length-with-unit | named-space}?,
   attribute maxsize {length-with-unit | named-space | infinity | xsd:float}?


mglyph = element mglyph {MathML.Common.attrib,
                     attribute alt {xsd:string}?,
                     (attribute src {xsd:anyURI}| attribute fontfamily {xsd:string}),
		     attribute width {xsd:string}?,
		     attribute height {xsd:string}?,
		     attribute baseline {xsd:string}?,
		     attribute index {xsd:positiveInteger}?}


linethickness.attrib = attribute linethickness {length-with-optional-unit|thickness}
mline = element mline {MathML.Common.attrib,
      linethickness.attrib?,
      attribute spacing {xsd:string}?,
      attribute length {length-with-unit | named-space}?}

Glyph-alignmark = malignmark|mglyph

mi = element mi {MathML.Common.attrib,Token-style.attrib,(Glyph-alignmark|text)*}

mo = element mo {MathML.Common.attrib,Operator.attrib,Token-style.attrib,
                 (text|Glyph-alignmark)*}

mn = element mn {MathML.Common.attrib,Token-style.attrib,(text|Glyph-alignmark)*}

mtext = element mtext {MathML.Common.attrib,Token-style.attrib,(text|Glyph-alignmark)*}

ms = element ms {MathML.Common.attrib,Token-style.attrib,
                 attribute lquote {xsd:string}?,
		 attribute rquote {xsd:string}?,
		 (text|Glyph-alignmark)*}

# And the group of any token 
Pres-token = mi | mo | mn | mtext | ms

msub = element msub {MathML.Common.attrib,
                  attribute subscriptshift {length-with-unit}?,
                  ContInPres,ContInPres}

msup = element msup {MathML.Common.attrib,
                  attribute superscriptshift {length-with-unit}?, 
                  ContInPres,ContInPres}

msubsup = element msubsup {MathML.Common.attrib,
                     attribute subscriptshift {length-with-unit}?, 
                     attribute superscriptshift {length-with-unit}?, 
                     ContInPres,ContInPres,ContInPres}

munder = element munder {MathML.Common.attrib,
                         attribute accentunder {truefalse}?, 
                         ContInPres,ContInPres}

mover = element mover {MathML.Common.attrib,
                       attribute accent {truefalse}?, 
                       ContInPres,ContInPres}

munderover = element munderover {MathML.Common.attrib,
                                 attribute accentunder {truefalse}?, 
                                 attribute accent {truefalse}?, 
                                 ContInPres,ContInPres,ContInPres}

PresExp-or-none = ContInPres | none
mmultiscripts = element mmultiscripts{MathML.Common.attrib,
	                              ContInPres, 
				      (PresExp-or-none,PresExp-or-none)*,
				      (mprescripts,(PresExp-or-none,PresExp-or-none)*)?}
none = element none {empty}
mprescripts = element mprescripts {empty}

Pres-script = msub|msup|msubsup|munder|mover|munderover|mmultiscripts
linebreak-values = "auto" | "newline" | "indentingnewline" | "nobreak" | "goodbreak" | "badbreak"
mspace = element mspace {MathML.Common.attrib,
                         attribute width {length-with-unit | named-space}?,
	           	 attribute height {length-with-unit}?,
	           	 attribute depth {length-with-unit}?,
                   	 attribute linebreak {linebreak-values}?}

mrow = element mrow {MathML.Common.attrib,ContInPres*}

mfrac = element mfrac {MathML.Common.attrib,
                       attribute bevelled {truefalse}?,
                       attribute denomalign {centering.values}?,
		       attribute numalign {centering.values}?,
		       linethickness.attrib?,
		       ContInPres,ContInPres}
msqrt = element msqrt {MathML.Common.attrib,ContInPres*}

mroot = element mroot {MathML.Common.attrib,ContInPres,ContInPres}

mpadded-space = xsd:string {pattern="(\+|-)?([0-9]+|[0-9]*\.[0-9]+)(((%?)*(width|lspace|height|depth))|(em|ex|px|in|cm|mm|pt|pc))"}



mpadded-width-space = xsd:string {pattern="((\+|-)?([0-9]+|[0-9]*\.[0-9]+)(((%?) *(width|lspace|height|depth)?)|(width|lspace|height|depth)|(em|ex|px|in|cm|mm|pt|pc)))|((veryverythin|verythin|thin|medium|thick|verythick|veryverythick)mathspace)|0"}

mpadded = element mpadded {MathML.Common.attrib,
	                   attribute width {mpadded-width-space}?,
  			   attribute lspace {mpadded-space}?,
  			   attribute height {mpadded-space}?,
  			   attribute depth {mpadded-space}?,
  			   ContInPres*}

mphantom = element mphantom {MathML.Common.attrib,ContInPres*}

mfenced = element mfenced {MathML.Common.attrib,
                           attribute open {xsd:string}?,
  	                   attribute close {xsd:string}?,
  			   attribute separators {xsd:string}?,
			   ContInPres*}

notation-values = "actuarial"|"longdiv"|"radical"| 
                              "box"|"roundedbox"|"circle"| 
                              "left"|"right"|"top"|"bottom"|
                              "updiagonalstrike"|"downdiagonalstrike"| 
                              "verticalstrike"|"horizontalstrike"
menclose = element menclose {MathML.Common.attrib,
                          attribute notation {notation-values}?,
			  ContInPres*}

# And the group of everything 
Pres-layout = mrow|mfrac|msqrt|mroot|mpadded|mphantom|mfenced|menclose

Table-alignment.attrib = attribute rowalign 
 	     {xsd:string {pattern="(top|bottom|center|baseline|axis)( (top|bottom|center|baseline|axis))*"}}?,
        attribute columnalign {xsd:string {pattern="(left|center|right)( (left|center|right))*"}}?,
        attribute groupalign {xsd:string}?

mtr.content = mtd
mtr = element mtr {Table-alignment.attrib, MathML.Common.attrib,(mtr.content)+}

mlabeledtr = element mlabeledtr {Table-alignment.attrib,MathML.Common.attrib,(mtr.content)*}

mtd = element mtd {MathML.Common.attrib,
                   Table-alignment.attrib,
                   attribute columnspan {xsd:positiveInteger}?,
  		   attribute rowspan {xsd:positiveInteger}?,
		   ContInPres*}

mtable.content = mtr|mlabeledtr
mtable = element mtable {Table-alignment.attrib,
                         attribute align {xsd:string}?,
			 attribute alignmentscope {xsd:string {pattern="(true|false)( true| false)*"}}?,
			 attribute columnwidth {xsd:string}?,
  			 attribute width {xsd:string}?,
  			 attribute rowspacing {xsd:string}?,
  			 attribute columnspacing {xsd:string}?,
  			 attribute rowlines {xsd:string}?,
  			 attribute columnlines {xsd:string}?,
  			 attribute frame {"none" | "solid" | "dashed"}?,
  			 attribute framespacing {xsd:string}?,
  			 attribute equalrows {truefalse}?,
  			 attribute equalcolumns {truefalse}?,
  			 attribute displaystyle {truefalse}?,
			 attribute side {"left"|"right"|"leftoverlap"|"rightoverlap"}?,
  			 attribute minlabelspacing {length-with-unit}?,
  			 MathML.Common.attrib,
			 (mtable.content)*}

maligngroup = element maligngroup {MathML.Common.attrib,
     attribute groupalign {"left" | "center" | "right" | "decimalpoint"}?}

malignmark = element malignmark {MathML.Common.attrib,attribute edge {"left" | "right"}?}

Pres-table = mtable|maligngroup|malignmark

mcolumn = element mcolumn {MathML.Common.attrib,
     attribute align {"left" | "right"}}

mstyle = element mstyle {MathML.Common.attrib,
                         attribute scriptlevel {xsd:integer}?,
                         attribute displaystyle {truefalse}?,
			 attribute scriptsizemultiplier {xsd:decimal}?,
  			 attribute scriptminsize {length-with-unit}?,
  			 attribute color {xsd:string}?,
  			 attribute background {xsd:string}?,
  			 attribute veryverythinmathspace {length-with-unit}?,
  			 attribute verythinmathspace {length-with-unit}?,
			 attribute thinmathspace {length-with-unit}?,
                         attribute mediummathspace {length-with-unit}?,
                         attribute thickmathspace {length-with-unit}?,
                         attribute verythickmathspace {length-with-unit}?,
                         attribute veryverythickmathspace {length-with-unit}?,
                         linethickness.attrib?,
  			 Operator.attrib,Token-style.attrib,
			 ContInPres*}

merror = element merror {MathML.Common.attrib,ContInPres*}

maction = element maction {MathML.Common.attrib,
			   attribute actiontype {xsd:string}?,
  	                   attribute selection {xsd:positiveInteger}?,
  			   ContInPres*}

semantics-pmml = element semantics {semantics.attribs,PresExp, semantics-annotation*}

PresExp = Pres-token | Pres-layout | Pres-script | Pres-table 
	      |  mspace | mline | mcolumn |  maction | merror | mstyle
	      | semantics-pmml

ContInPres = PresExp


Issue rnc_browserinterface	`wiki (member only)`
this should probably only go into mathml3-presentation.rnc
Resolution	None recorded


Issue rnc_units-patterns	`wiki (member only)`
need final decision on the patterns here and refactor to horizontal and vertical ones
Resolution	None recorded


Issue rnc_mathvariant	`wiki (member only)`
For both of the following attributes the types should be more restricted
Resolution	None recorded


Issue mglyph_alt	`wiki (member only)`
perhaps make alt required (but that breaks stuff), or just make it required if there is a src attribute
Resolution	None recorded


Issue rnc_leftover-max	`wiki (member only)`
MaxF: definition from spec seems wrong, fixing to ([+|-] unsigned-number (%[pseudo-unit]|pseudo-unit|h-unit)) | namedspace | 0
Resolution	None recorded
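To make the proposed definition in the issue above concrete, the following Python sketch transcribes it into a regular expression. It reads `[+|-]` as an optional sign and `%[pseudo-unit]` as a percent sign with an optional pseudo-unit. The spelling of unsigned-number, the pseudo-unit names, the unit names and the named spaces are copied from the patterns in the listing above; that reading, and the name `is_mpadded_value`, are assumptions rather than a decision of the Working Group.

```python
import re

UNSIGNED = r"(?:[0-9]+|[0-9]*\.[0-9]+)"
PSEUDO_UNIT = r"(?:width|lspace|height|depth)"
H_UNIT = r"(?:em|ex|px|in|cm|mm|pt|pc)"
NAMED_SPACE = (r"(?:veryverythin|verythin|thin|medium|thick|verythick"
               r"|veryverythick)mathspace")

# ([+|-] unsigned-number (%[pseudo-unit] | pseudo-unit | h-unit))
#   | namedspace | 0
MPADDED_VALUE = re.compile(
    "[+-]?" + UNSIGNED
    + "(?:%" + PSEUDO_UNIT + "?|" + PSEUDO_UNIT + "|" + H_UNIT + ")"
    + "|" + NAMED_SPACE
    + "|0"
)

def is_mpadded_value(value):
    """True if the whole string matches the transcribed definition."""
    return MPADDED_VALUE.fullmatch(value) is not None

for candidate in ["+10%", "150%width", "2height", "-1.5em",
                  "thinmathspace", "0", "10", "1 em"]:
    print(candidate, is_mpadded_value(candidate))
```

Under this reading a bare number other than 0 is rejected, and no white space is allowed between the number and its unit, which differs from the `mpadded-width-space` pattern in the listing.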

Issue permissive_units	`wiki (member only)`
more permissive lengths/widths
David wrote in an e-mail: `length-with-unit` doesn't allow white space anywhere. Which (if any) of the following do we want to allow: " 2em ", "2 em", "- 2 em"? It also insists on starting with a digit or `-`, but do we want to allow ".5em" and "-.5em"? However, we do claim CSS compatibility here, which may suggest some answers to the above: `http://www.w3.org/TR/CSS21/syndata.html#length-units`. CSS allows an optional leading `+` as well (`+2em`). CSS requires the number to "immediately" follow any sign and the unit to "immediately" follow the number, which I think means no intervening white space. CSS <number> values are allowed to start with a `.`, so `.5em` is allowed. CSS insists on a digit following a `.`, so `5.em` is not allowed. Once we have firm answers to the above it should be easy to drop the regexp back in and make the text match. I think we should not allow white space except at the beginning and end, but allow a leading `+` (a change from MathML2) and allow no digits before the `.`, but insist on digits after a `.`, which would be `([\-\+]?([0-9]+(\.[0-9]+)?|\.[0-9]+)(em|ex|px|in|cm|mm|pt|pc|%))|0`. As written this doesn't allow " `2em` ", but I think we can set white space trim properties to apply before the regex is checked (I'll check).
Resolution	None recorded
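The difference between the pattern in the listing and the one proposed in this issue can be explored with a small Python sketch. Both expressions are transcribed here with balanced parentheses, and `fullmatch` is used because XML Schema pattern facets must match the whole value. The names `LISTED`, `PROPOSED` and `accepts`, and the optional trimming step, are assumptions for the example.

```python
import re

UNITS = r"(?:em|ex|px|in|cm|mm|pt|pc|%)"

# Pattern that appears (commented out) on length-with-unit in the listing.
LISTED = re.compile(r"(-?([0-9]+|[0-9]*\.[0-9]+)" + UNITS + r")|0")

# Pattern proposed in the e-mail: optional sign, digits required after '.'.
PROPOSED = re.compile(r"([-+]?([0-9]+(\.[0-9]+)?|\.[0-9]+)" + UNITS + r")|0")

def accepts(pattern, value, trim=False):
    """Check a candidate length, optionally trimming XML white space first."""
    if trim:
        value = value.strip(" \t\r\n")
    return pattern.fullmatch(value) is not None

for candidate in ["2em", "+2em", ".5em", "5.em", "2 em", " 2em "]:
    print(repr(candidate),
          accepts(LISTED, candidate),
          accepts(PROPOSED, candidate),
          accepts(PROPOSED, candidate, trim=True))
```

Trimming before matching corresponds to the white space handling suggested at the end of the issue; whether the schema datatype can be made to behave that way is left open there.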

A.2.2 The Grammar for Strict Content MathML3

The grammar for Strict Content MathML3 can be found at http://www.w3.org/Math/RelaxNG/mathml3/mathml3-strict.rnc.

#     This is the Mathematical Markup Language (MathML) 3.0, an XML
#     application for describing mathematical notation and capturing
#     both its structure and content.
#
#     Copyright 1998-2007 World Wide Web Consortium
#        (Massachusetts Institute of Technology, Institut National de
#         Recherche en Informatique et en Automatique, Keio University).
#         All Rights Reserved.
#
#     Permission to use, copy, modify and distribute the RelaxNG schema for MathML3
#     and its accompanying documentation for any purpose and without fee is
#     hereby granted in perpetuity, provided that the above copyright notice
#     and this paragraph appear in all copies.  The copyright holders make
#     no representation about the suitability of the Schema for any purpose.
#
#     This file contains the grammar rules for strict content MathML3
#     It is provided "as is" without expressed or implied warranty.
#
#     Revision:   $Id: appendixa.html,v 1.1 2008/04/08 17:00:08 jules Exp $
#     Author: Michael Kohlhase http://kwarc.info/kohlhase

#  This is the RelaxNG schema module for the strict content part of MathML.

default namespace m = "http://www.w3.org/1998/Math/MathML"

include "mathml3-common.rnc"

math.content |= ContExp


opel.content = text

# we want to extend this in pragmatic CMathML, so we introduce abbrevs here.

cn.content = text
cn.type.vals  = "e-notation"|"integer"|"rational"|"real" |
                         "complex-cartesian"|"complex-polar"


cn = element cn {#attribute base {xsd:positiveInteger [1,...,36]},
                 attribute type {cn.type.vals}?,
  		 Definition.attrib,
  		 MathML.Common.attrib,	
		 (cn.content)*}

ci = element ci {attribute type {xsd:string}?,
                 attribute nargs {xsd:string}?,
		 attribute occurrence {xsd:string}?,		
                 Definition.attrib,	
  		 MathML.Common.attrib,
		 opel.content,
		 name.attrib?}

cdname.attrib = attribute cd {xsd:NCName}

csymbol       = element csymbol {MathML.Common.attrib,
	                         Definition.attrib,cdname.attrib?,cdbase.attrib?, 
				 opel.content}

# the content of the apply element, leave it empty and extend it later
apply = element apply {MathML.Common.attrib,cdbase.attrib?,apply.content}
apply-head = apply|bind|ci|csymbol|semantics-apply
apply.content = apply-head,ContExp*
semantics-apply = element semantics {semantics.attribs,apply-head, semantics-annotation*}

qualifier = condition

# the content of the bind element, leave it empty and extend it later
bind = element bind {MathML.Common.attrib,cdbase.attrib?,bind.content}
bind-head = apply|csymbol|semantics-bind
bind.content = bind-head,bvar*,qualifier?,ContExp
semantics-bind   = element semantics {semantics.attribs,bind-head, semantics-annotation*}

bvar = element bvar {MathML.Common.attrib,cdbase.attrib?,bvar-head}
bvar-head = ci|semantics-bvar
semantics-bvar   = element semantics {semantics.attribs,bvar-head, semantics-annotation*}

condition = element condition {Definition.attrib,cdbase.attrib?,ContExp}

share = element share {MathML.Common.attrib,attribute href {xsd:anyURI}}

# the content of the cerror element, leave it empty and extend it later
cerror = element cerror {MathML.Common.attrib,cdbase.attrib?,cerror.content}
cerror-head = csymbol|apply|semantics-cerror
cerror.content = cerror-head,ContExp*
semantics-cerror = element semantics {semantics.attribs,cerror-head, semantics-annotation*}

semantics-cmml = element semantics {semantics.attribs,ContExp, semantics-annotation*}

ContExp = cn| ci | csymbol | apply | bind | share | cerror | semantics-cmml
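As an informal reading aid for the grammar above, the Python sketch below checks the rule `apply.content = apply-head,ContExp*` on a parsed fragment. It covers only the `cn`, `ci`, `csymbol` and `apply` alternatives of `ContExp`; `bind`, `share`, `cerror` and `semantics` are deliberately left out, attributes are not checked, and the function names are hypothetical. It is not a substitute for validating against the schema.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/1998/Math/MathML}"
LEAVES = {"cn", "ci", "csymbol"}          # text-only content in this subset
APPLY_HEADS = {"apply", "ci", "csymbol"}  # subset of apply-head

def name(element):
    """Local name of a MathML element, or None for other namespaces."""
    tag = element.tag
    return tag[len(NS):] if tag.startswith(NS) else None

def is_cont_exp(element):
    """Check the cn | ci | csymbol | apply part of ContExp."""
    kind = name(element)
    if kind in LEAVES:
        return len(element) == 0  # no element children, text only
    if kind == "apply":
        children = list(element)
        # apply.content = apply-head, ContExp*
        return (len(children) >= 1
                and name(children[0]) in APPLY_HEADS
                and all(is_cont_exp(child) for child in children))
    return False

good = ET.fromstring(
    '<apply xmlns="http://www.w3.org/1998/Math/MathML">'
    '<csymbol cd="arith1">plus</csymbol><ci>x</ci><cn type="integer">1</cn>'
    '</apply>'
)
bad = ET.fromstring(
    '<apply xmlns="http://www.w3.org/1998/Math/MathML">'
    '<cn type="integer">1</cn><ci>x</ci>'
    '</apply>'
)
print(is_cont_exp(good), is_cont_exp(bad))
```

The second fragment is rejected because a `cn` is not among the alternatives of `apply-head`.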


Issue rnc_opel-content	`wiki (member only)`
What is the content of an operator element, currently all text?
Resolution	None recorded


Issue rnc_cn-content	`wiki (member only)`
What is the content of a cn?
Resolution	None recorded


Issue rnc_cn	`wiki (member only)`
cn needs to be totally reworked once the spec is fixed
Resolution	None recorded

A.2.3 The Grammar for Pragmatic MathML

The grammar for pragmatic MathML3 can be found at http://www.w3.org/Math/RelaxNG/mathml3/mathml3-pragmatic.rnc.

#     This is the Mathematical Markup Language (MathML) 3.0, an XML
#     application for describing mathematical notation and capturing
#     both its structure and content.
#
#     Copyright 1998-2007 World Wide Web Consortium
#        (Massachusetts Institute of Technology, Institut National de
#         Recherche en Informatique et en Automatique, Keio University).
#         All Rights Reserved.
#
#     Permission to use, copy, modify and distribute the RelaxNG schema for MathML3
#     and its accompanying documentation for any purpose and without fee is
#     hereby granted in perpetuity, provided that the above copyright notice
#     and this paragraph appear in all copies.  The copyright holders make
#     no representation about the suitability of the Schema for any purpose.
#
#     This file contains the grammar rules for pragmatic content MathML3
#     It is provided "as is" without expressed or implied warranty.
#
#     Revision:   $Id: appendixa.html,v 1.1 2008/04/08 17:00:08 jules Exp $
#     Author: Michael Kohlhase http://kwarc.info/kohlhase
#
#     This is the RelaxNG schema module for the pragmatic content part of 
#     MathML (but without the presentation in token elements).

default namespace m = "http://www.w3.org/1998/Math/MathML"

include "mathml3-strict.rnc"

## the content of "cn" may have <sep> elements in it
sep = element sep {empty}
cn.content |= sep
cn.type.vals |= "constant" 

# allow degree in bvar
degree = element degree {MathML.Common.attrib,ContExp+}
bvar-head |= (degree?,ci)|(ci,degree?)
# allow degree to modify <root/>
apply.content |= root_arith1_elt,degree,ContExp*


domainofapplication = element domainofapplication {Definition.attrib,MathML.Common.attrib,cdbase.attrib?,ContExp}

lowlimit = element lowlimit {Definition.attrib,MathML.Common.attrib,cdbase.attrib?,ContExp+}
uplimit = element uplimit {Definition.attrib,MathML.Common.attrib,cdbase.attrib?,ContExp+}

## allow the non-strict qualifiers
qualifier |= domainofapplication|(uplimit,lowlimit?)|(lowlimit,uplimit?)|degree

## we collect the operator elements by role
opel.constant = notAllowed
opel.binder = notAllowed
opel.application = notAllowed
opel.semantic-attribution = notAllowed
opel.attribution = notAllowed
opel.error = notAllowed

opels = opel.constant | opel.binder | opel.application | 
        opel.semantic-attribution | opel.attribution |
	opel.error
container = notAllowed

## the values of the MathML type attributes;  
MathMLType |= "real" | "complex" | "function" | "algebraic" | "integer"

## include the relevant content dictionaries
include "mathml3-cds-pragmatic.rnc"

## we instantiate the strict content model by structure checking
apply-binder-head = semantics-apply-binder|opel.binder
apply.content |= apply-binder-head,bvar+,qualifier?,ContExp
semantics-apply-binder = element semantics {semantics.attribs,apply-binder-head, semantics-annotation*}

apply-head |= opel.application
bind-head |= opel.binder
cerror-head |= opel.error

## allow all functions, constants, and containers to be content expressions on their own
ContExp |= opel.constant|opel.application|container 
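The extension `cn.content |= sep` above means that the content of a pragmatic `cn` is text interleaved with empty `sep` elements. A consumer might split such content as in the following Python sketch. The function name `cn_parts` and the decision to trim white space around each part are assumptions for the example.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/1998/Math/MathML}"

def cn_parts(cn):
    """Split the mixed content of a <cn> at each <sep/> child."""
    parts = [(cn.text or "").strip()]
    for child in cn:
        if child.tag != NS + "sep":
            raise ValueError("unexpected child in cn: %s" % child.tag)
        # Text following <sep/> is stored as the tail of the sep element.
        parts.append((child.tail or "").strip())
    return parts

rational = ET.fromstring(
    '<cn xmlns="http://www.w3.org/1998/Math/MathML" type="rational">'
    '22<sep/>7</cn>'
)
print(cn_parts(rational))  # ['22', '7']
```

A `cn` without any `sep` child yields a single part, so the same function also handles the strict form.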

This grammar focuses on the pragmatic extensions in , , , , and .

Editorial note: MiKo
check this again

The pragmatic extensions in , , , , , rely on information that is specified in the MathML content dictionaries. This is handled in the schema http://www.w3.org/Math/RelaxNG/mathml3/mathml3-cds-pragmatic.rnc.

Finally, the pragmatic extensions given in are not covered in this schema, but will be left for full MathML in the next section.

A.2.4 Full MathML

The RelaxNG schema for full MathML without deprecated functionality builds on the schemata for presentation MathML in and pragmatic Content MathML in , mixing the content models as described in . It can be found at http://www.w3.org/Math/RelaxNG/mathml3/mathml3.rnc.

#     This is the Mathematical Markup Language (MathML) 3.0, an XML
#     application for describing mathematical notation and capturing
#     both its structure and content.
#
#     Copyright 1998-2007 World Wide Web Consortium
#        (Massachusetts Institute of Technology, Institut National de
#         Recherche en Informatique et en Automatique, Keio University).
#         All Rights Reserved.
#
#     Permission to use, copy, modify and distribute the RelaxNG schema for MathML3
#     and its accompanying documentation for any purpose and without fee is
#     hereby granted in perpetuity, provided that the above copyright notice
#     and this paragraph appear in all copies.  The copyright holders make
#     no representation about the suitability of the Schema for any purpose.
#
#     This file contains the grammar driver for MathML3
#     It is provided "as is" without expressed or implied warranty.
#
#     Revision:   $Id: appendixa.html,v 1.1 2008/04/08 17:00:08 jules Exp $
#     Author: Michael Kohlhase http://kwarc.info/kohlhase

default namespace m = "http://www.w3.org/1998/Math/MathML"

include "mathml3-common.rnc"

## Content Expressions now allow pMathML in ci and csymbol
ContExp = grammar {include "mathml3-pragmatic.rnc" {start=ContExp opel.content = text|parent PresExp}}

## Presentation Expressions allow Content Expressions mixed in everywhere
PresExp = grammar {include "mathml3-presentation.rnc" {start=PresExp ContInPres=PresExp|parent ContExp}}

## the math element can contain one content element or several presentation elements
math.content|=ContExp|PresExp+
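
The last rule of this grammar can be illustrated outside RelaxNG. The following Python sketch classifies the children of a math element as "one content element" or "several presentation elements". It is only an approximation under stated assumptions: the sets CONTENT and PRESENTATION are small illustrative subsets of the element names rather than the lists defined by the schema, the helper names are hypothetical, and the mixing of content and presentation markup that the grammar permits deeper in the tree is not examined.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Assumption: illustrative subsets only, not the element lists of the schema.
CONTENT = {"apply", "bind", "ci", "cn", "csymbol", "cerror"}
PRESENTATION = {"mi", "mn", "mo", "mtext", "mrow", "mfrac", "msup", "msub", "msqrt"}


def local_name(elem):
    """Return the local name of a MathML element, or None for other namespaces."""
    prefix = "{" + MATHML_NS + "}"
    return elem.tag[len(prefix):] if elem.tag.startswith(prefix) else None


def math_content_kind(source):
    """Classify the direct children of the root math element.

    Returns "content" for exactly one content element, "presentation" for one
    or more presentation elements, and "unknown" for anything else.
    """
    root = ET.fromstring(source)
    if local_name(root) != "math":
        raise ValueError("root element is not math in the MathML namespace")
    names = [local_name(child) for child in root]
    if len(names) == 1 and names[0] in CONTENT:
        return "content"
    if names and all(name in PRESENTATION for name in names):
        return "presentation"
    return "unknown"
```

A result of "unknown" does not mean that the document is invalid, since math.content is extended with |= and may have further alternatives; only validation against the schema itself is authoritative.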

A.2.5 Full MathML with Deprecated Elements

The grammar for the elements deprecated in MathML3 can be found at http://www.w3.org/Math/RelaxNG/mathml3/mathml3-deprecated.rnc.

#     This is the Mathematical Markup Language (MathML) 3.0, an XML
#     application for describing mathematical notation and capturing
#     both its structure and content.
#
#     Copyright 1998-2007 World Wide Web Consortium
#        (Massachusetts Institute of Technology, Institut National de
#         Recherche en Informatique et en Automatique, Keio University).
#         All Rights Reserved.
#
#     Permission to use, copy, modify and distribute the RelaxNG schema for MathML3
#     and its accompanying documentation for any purpose and without fee is
#     hereby granted in perpetuity, provided that the above copyright notice
#     and this paragraph appear in all copies.  The copyright holders make
#     no representation about the suitability of the Schema for any purpose.
#
#     This file contains the grammar driver for MathML3
#     It is provided "as is" without expressed or implied warranty.
#
#     Revision:   $Id: appendixa.html,v 1.1 2008/04/08 17:00:08 jules Exp $
#     Author: Michael Kohlhase http://kwarc.info/kohlhase

default namespace m = "http://www.w3.org/1998/Math/MathML"

include "mathml3.rnc"

Token-style.attrib |=
  attribute fontsize {xsd:string}? |
  attribute fontstyle {xsd:string}? |
  attribute color {xsd:string}? |
  attribute fontfamily {xsd:string}? |
  attribute fontweight {xsd:string}? 

#Deprecated Content Elements
dep-content = 
  element reln {ContExp*}|
  element fn {ContExp}

ContExp |= dep-content

apply-head |= dep-content

declare = element declare {attribute type {xsd:string}?,
                           attribute scope {xsd:string}?,
                           attribute nargs {xsd:nonNegativeInteger}?,
                           attribute occurrence {"prefix"|"infix"|"function-model"}?,
                           Definition.attrib,cdbase.attrib?, 
                           ContExp+}
ContExp |= declare
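
To decide whether a document needs this grammar rather than the one in the previous section, it can be enough to look for the names that the grammar above introduces. The following Python sketch does so with the standard library. The element and attribute names are taken from the listing above; the function name find_deprecated is hypothetical, and, as a simplification, the attributes are reported on any element in the MathML namespace rather than only where Token-style.attrib applies.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Names taken from the deprecated grammar listed above.
DEPRECATED_ELEMENTS = {"reln", "fn", "declare"}
DEPRECATED_ATTRIBUTES = {"fontsize", "fontstyle", "color", "fontfamily", "fontweight"}


def find_deprecated(source):
    """Return a sorted list of (kind, name) pairs for deprecated markup in source."""
    prefix = "{" + MATHML_NS + "}"
    found = set()
    for elem in ET.fromstring(source).iter():
        # Skip anything that is not an element in the MathML namespace.
        if not isinstance(elem.tag, str) or not elem.tag.startswith(prefix):
            continue
        name = elem.tag[len(prefix):]
        if name in DEPRECATED_ELEMENTS:
            found.add(("element", name))
        # Simplification: attributes are checked on every MathML element.
        for attr in elem.attrib:
            if attr in DEPRECATED_ATTRIBUTES:
                found.add(("attribute", attr))
    return sorted(found)
```

An empty result suggests that the document can be checked against mathml3.rnc directly; it is not a substitute for validation.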

A.2.6 Generated Grammar for Arity & Type Checking

Section A.2.3 The Grammar for Pragmatic MathML showed an example of a grammar that is generated from information present in the MathML content dictionaries. The type information that comes with the CDs could be used in the same way to generate grammars for arity and type checking.

Editorial note: MiKo
maybe we should have a note about generating Grammars from the CDs. I will have to generate arity and type checking files to mix in and import them here. Maybe this should not be treated in a normative appendix?

A.2.7 MathML as a module in a RelaxNG Schema

Normally, a MathML expression does not constitute an entire XML document. MathML is designed to be used as the mathematics fragment of larger markup languages; in particular, it is designed to be used as a module in documents marked up with the XHTML family of markup languages. As RelaxNG directly supports modular development, this is usually very easy: an XHTML+MathML schema can be specified as simply as

# A RelaxNG Schema for  XHTML+MathML
include "xhtml.rnc"
math = external "mathml3.rnc"
Inline.class |= math
Block.class |= math

assuming that we have access to a modular RelaxNG schema for XHTML that uses Inline.class and Block.class to collect the content models for inline and block-level elements.
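
From the point of view of a processing application, MathML embedded in a host language appears as math elements in the MathML namespace among the elements of the host. The following Python sketch, using only the standard library, collects these fragments from a well-formed XHTML document. The sample document and the function name mathml_islands are illustrative assumptions, not part of this specification.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_islands(source):
    """Return every math element in the MathML namespace found in source."""
    return list(ET.fromstring(source).iter("{%s}math" % MATHML_NS))


# Hypothetical host document with one inline and one block-level expression.
XHTML_SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample</title></head>
  <body>
    <p>Inline: <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math></p>
    <div><math xmlns="http://www.w3.org/1998/Math/MathML"><mn>2</mn></math></div>
  </body>
</html>"""

islands = mathml_islands(XHTML_SAMPLE)
```

Each element returned could then be serialized and checked separately against mathml3.rnc, which is an alternative to validating the whole document against a combined schema.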

Editorial note: Miko
check this and reference an external schema

Specializing the MathML3 schema so that we can check the content of annotation-xml elements is similarly simple:

# A RelaxNG Schema for MathML with OpenMath3 annotations
omobj = external "openmath3.rnc" 
include "mathml3.rnc" {annotation-xml.model = omobj}

For details about RelaxNG grammars and modularization see [RelaxNG] or [RelaxNGBook].

Editorial note: Miko
check this and reference an external schema; I think we can even tie the OpenMath model to the value `OpenMath` in the `encoding` attribute.

A.3 Using the MathML DTD

Editorial note: David
DTD to be generated from Relax NG

A.4 Using the MathML XML Schema

Editorial note: David
XSD schema to be generated from Relax NG
