W3C Math Working Group XML Query Language Requirements

Draft for Working Group Review November 18, 1998

Editors:
Stephen Hunt [Lead], Wolfram Research
Roger Hunter, MacKichan Software, Inc.

Abstract

This document details the W3C Math Working Group's requirements for the XML Query Language effort. Querying MathML requires a pattern language that covers both element structure and content and that returns a subtree of the MathML expression tree. The pattern language could be derived from XSL patterns by allowing multiple branches and positions to be specified relative to any node, and by adding regular expression matching on content.

Summary of Requirements

  1. Pattern language - to identify both element subtree structure and content.

Pattern language

To illustrate the MathML requirement, draft XSL patterns are extended in a manner consistent with the XQL proposal. The extension is used merely to provide an example of the MathML requirement and is not an explicit recommendation in itself.

Draft XSL patterns use a directory metaphor to identify a pattern relative to some node in the expression tree: parent/child, ancestor//descendant. Extending this specification to include multiple branches and positions at arbitrary depth relative to any given node would allow the structure of a MathML subexpression to be identified.

Extension 1: Multiple branches

The qualifier notation can be used to constrain a node to contain branch patterns. Ordered branch sequences can be delimited by a semicolon and unordered branch patterns enforced by the 'and' operator.

A MathML pattern is a hierarchical structure that may contain an arbitrary number of levels. The expression tree is descended recursively and each subtree is tested against the pattern; the return value is the matching subtree. On a match, the search can stop and return that match, or continue and return the first n matches or all matches from all levels.
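The traversal described above can be sketched in Python. This is only an illustration under stated assumptions: the names iter_matches and is_msup are hypothetical, the predicate stands in for a real pattern test, the sample markup (b^2 - c^2) is invented for the sketch, and the standard xml.etree.ElementTree module plays the role of the expression tree.

```python
import xml.etree.ElementTree as ET
from itertools import islice

# Hypothetical sample expression: b^2 - c^2 (not taken from the requirements text).
MARKUP = (
    "<mrow>"
    "<msup><mi>b</mi><mn>2</mn></msup>"
    "<mo>-</mo>"
    "<msup><mi>c</mi><mn>2</mn></msup>"
    "</mrow>"
)

def iter_matches(root, predicate):
    """Yield each subtree of root, in document order, that satisfies predicate."""
    # Element.iter() visits root and all of its descendants depth-first.
    for element in root.iter():
        if predicate(element):
            yield element

def is_msup(element):
    """Stand-in pattern test: accept any msup element."""
    return element.tag == "msup"

root = ET.fromstring(MARKUP)

# Stop at the first match.
first = next(iter_matches(root, is_msup), None)
# Return the first n matches (here n = 1).
first_n = list(islice(iter_matches(root, is_msup), 1))
# Return all matches from all levels.
all_matches = list(iter_matches(root, is_msup))
```

Because iter_matches is a generator, the caller decides whether to stop after one match, after n matches, or to exhaust the tree.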

Example: b^2

Presentation markup:

        <msup>
            <mi>b</mi>
            <mn>2</mn>
        </msup>
Pattern:
            msup[mi;mn] - msup element with children mi followed by mn
            msup[mi and mn] - msup element with children mi and mn, in any order
Example: b^(2a)

Presentation markup:

        <msup>
            <mi>b</mi>
            <mrow>
                <mn>2</mn>
                <mo>&InvisibleTimes;</mo>
                <mi>a</mi>
            </mrow>
        </msup>
Pattern:
            msup[mi;mrow/mi] - mrow/mi matches the second branch above
            msup[mi and mrow/mi]
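One possible reading of the two branch operators can be sketched in Python. The function names (branch_matches, matches_ordered, matches_unordered) are hypothetical. The sketch assumes that ';' requires the branches to match consecutive children and that 'and' only requires each branch to match some child. The &InvisibleTimes; entity is written as the character U+2062, because a standalone XML parser does not know the MathML entity set.

```python
import xml.etree.ElementTree as ET

# b^(2a) from the example above, with &InvisibleTimes; written as U+2062.
MARKUP = (
    "<msup>"
    "<mi>b</mi>"
    "<mrow><mn>2</mn><mo>\u2062</mo><mi>a</mi></mrow>"
    "</msup>"
)

def branch_matches(child, path):
    """Test one branch such as 'mi' or 'mrow/mi' against a child element."""
    head, _, rest = path.partition("/")
    if child.tag != head:
        return False
    # Any remaining path steps must lead to an existing descendant.
    return rest == "" or child.find(rest) is not None

def matches_ordered(element, branches):
    """Assumed meaning of [p;q]: the branches match consecutive children in order."""
    children = list(element)
    width = len(branches)
    for start in range(len(children) - width + 1):
        window = children[start:start + width]
        if all(branch_matches(c, b) for c, b in zip(window, branches)):
            return True
    return False

def matches_unordered(element, branches):
    """Assumed meaning of [p and q]: every branch matches some child, in any order."""
    return all(any(branch_matches(c, b) for c in element) for b in branches)

msup = ET.fromstring(MARKUP)
ordered_ok = matches_ordered(msup, ["mi", "mrow/mi"])
reversed_ordered = matches_ordered(msup, ["mrow/mi", "mi"])
reversed_unordered = matches_unordered(msup, ["mrow/mi", "mi"])
```

Under these assumptions the reversed branch list fails the ordered test but passes the unordered one, which is the distinction the two operators are meant to express.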


Extension 2: Position

The ; sequence operator implicitly identifies position. It can be extended to express sequences of elements or content at a given level.

            p;q - p immediately precedes q
            p;;q - p precedes q
            p;;;q - p precedes q, either p or q may be empty
Example: 4 a c

Presentation markup:

        <mrow>
            <mn>4</mn>
            <mo>&InvisibleTimes;</mo>
            <mi>a</mi>
            <mo>&InvisibleTimes;</mo>
            <mi>c</mi>
        </mrow>
Pattern:
            mrow[*;;mi;;*] - matches only the first of the two mi elements
            mrow[*;;mi;;;*] - matches both the first and the second mi element
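The difference between the two patterns above can be sketched in Python. This is an interpretation inferred from the two example results, not a definition of the operators: the sketch assumes that *;;mi needs at least one sibling before the mi, that mi;;* needs at least one sibling after it, and that mi;;;* allows the trailing part to be empty. The name match_between and its allow_empty_after parameter are hypothetical, and &InvisibleTimes; is written as the character U+2062.

```python
import xml.etree.ElementTree as ET

# 4 a c from the example above, with &InvisibleTimes; written as U+2062.
MARKUP = (
    "<mrow>"
    "<mn>4</mn><mo>\u2062</mo><mi>a</mi><mo>\u2062</mo><mi>c</mi>"
    "</mrow>"
)

def match_between(parent, tag, allow_empty_after):
    """Return children named tag that have siblings before them and, unless
    allow_empty_after is set, siblings after them as well."""
    children = list(parent)
    hits = []
    for index, child in enumerate(children):
        if child.tag != tag:
            continue
        has_before = index > 0
        has_after = index < len(children) - 1
        if has_before and (has_after or allow_empty_after):
            hits.append(child)
    return hits

mrow = ET.fromstring(MARKUP)

# Assumed reading of mrow[*;;mi;;*]: something before and something after.
strict = [mi.text for mi in match_between(mrow, "mi", allow_empty_after=False)]
# Assumed reading of mrow[*;;mi;;;*]: the trailing part may be empty.
relaxed = [mi.text for mi in match_between(mrow, "mi", allow_empty_after=True)]
```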

Extension 3: Content

Content can be queried by regarding it as data with no element head.

            [content] - identifies the content of an element
            [regular expression] - identifies content matching a regular expression pattern
Example:
         <mn>4</mn>
Pattern:
            mn[[4]]
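A content test such as mn[[4]] can be sketched in Python with the standard re module. The function name content_matches is hypothetical, the sample markup (b^2 - 4) is invented for the sketch, and the sketch assumes that the regular expression must match the whole content of the element rather than a part of it; the requirement itself does not say which is intended.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample expression: b^2 - 4.
MARKUP = (
    "<mrow>"
    "<msup><mi>b</mi><mn>2</mn></msup>"
    "<mo>-</mo>"
    "<mn>4</mn>"
    "</mrow>"
)

def content_matches(root, tag, pattern):
    """Return elements named tag whose text content fully matches pattern."""
    regex = re.compile(pattern)
    # Element.text is None for empty elements, so fall back to "".
    return [el for el in root.iter(tag) if regex.fullmatch(el.text or "")]

root = ET.fromstring(MARKUP)

# mn[[4]]: number elements whose content is exactly 4.
fours = content_matches(root, "mn", "4")
# A regular expression pattern: any single-digit number element.
digits = content_matches(root, "mn", r"\d")
```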

Putting it all together: A compound example.

Example: b^2 - 4 a c
        <mrow>
            <msup>
                <mi>b</mi>
                <mn>2</mn>
            </msup>
            <mo>-</mo>
            <mrow>
                <mn>4</mn>
                <mo>&InvisibleTimes;</mo>
                <mi>a</mi>
                <mo>&InvisibleTimes;</mo>
                <mi>c</mi>
            </mrow>
        </mrow>
Query: Search for a superscript followed, after one intervening element, by a product (an mrow containing an &InvisibleTimes; operator).

Pattern:

            *[msup;*;mrow[*;;mo[[&InvisibleTimes;]];;*]]
or
            *[msup;*;mrow[*;;mo[[*Times*]];;*]]
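The first compound pattern can be sketched in Python by combining structure, position and content tests. The names is_product and find_compound are hypothetical, and the sketch rests on assumptions: ';' is read as "immediately followed by", the inner *;;mo;;* is read as "an mo with at least one sibling on each side", and &InvisibleTimes; is written as the character U+2062 because a standalone XML parser does not know the MathML entity set. Matching on the entity name, as in the second pattern, would need access to the unparsed source and is not covered here.

```python
import xml.etree.ElementTree as ET

INVISIBLE_TIMES = "\u2062"  # the character that &InvisibleTimes; stands for

# b^2 - 4 a c from the example above.
MARKUP = (
    "<mrow>"
    "<msup><mi>b</mi><mn>2</mn></msup>"
    "<mo>-</mo>"
    "<mrow>"
    "<mn>4</mn><mo>\u2062</mo><mi>a</mi><mo>\u2062</mo><mi>c</mi>"
    "</mrow>"
    "</mrow>"
)

def is_product(element):
    """Assumed reading of mrow[*;;mo[[&InvisibleTimes;]];;*]."""
    if element.tag != "mrow":
        return False
    children = list(element)
    # Only interior positions qualify: something before and something after.
    for child in children[1:-1]:
        if child.tag == "mo" and child.text == INVISIBLE_TIMES:
            return True
    return False

def find_compound(root):
    """Yield elements with consecutive children msup, any element, product mrow."""
    for element in root.iter():
        children = list(element)
        for start in range(len(children) - 2):
            first, _middle, last = children[start:start + 3]
            if first.tag == "msup" and is_product(last):
                yield element
                break  # report each matching element once

root = ET.fromstring(MARKUP)
matches = list(find_compound(root))
```

For the sample expression the only match is the outer mrow, which is the subtree the query returns.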

References

Mathematical Markup Language (MathML) 1.0 Specification
http://www.w3.org/TR/REC-MathML
The W3C Query Languages Workshop Call for Participation
http://www.w3.org/TandS/QL/QL98/cfp
XML Query Language: A Proposal
http://www.w3.org/Style/XSL/Group/1998/09/XQL-proposal.html

Maintained by:

   Stephen Hunt (Math Working Group XML-QL Requirements Lead).
   Angel Diaz (co-chair for the Math working group).
   Patrick Ion (co-chair for the Math working group).
W3C contact for math: Dave Raggett.
Last revised: 1998/11/17 by aldiaz

Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.