October 07, 2008 XHTML - myths and reality
Channel: Technorati Search for: mathml
The Planet MathML aggregates posts from various blogs that concern MathML. Although it is hosted by W3C, the content of the individual entries represents only the opinion of their respective authors and does not reflect the position of W3C.
If you own a blog with a focus on MathML, and want to be added or removed from this aggregator, please get in touch with Bert Bos at bert@w3.org.
All times are UTC.
Atom feed
Author: | Channel: Murray Sargent: Math in Office
One subject that seems to come up every other month or so is how RichEdit tables work. So I might as well post the answer. Hopefully RichEdit tables will eventually be described in the Windows SDK. They are not directly related to Math in Office, but I had mathematical expressions in mind when designing RichEdit’s table facility. Both mathematics and tables are recursive. For example, you can have a fraction in the numerator of another fraction, and you can have a table in the cell of another table. So implementing tables seemed like a useful project that might also reveal how to build a WYSIWYG implementation of mathematics. In fact, MathML <mtable> elements have a lot in common with general tables.
Most people at the time (1999) were recommending that a table cell should be represented by a whole RichEdit instance, which would give great generality. But I wanted a model that was much smaller, faster and worked with the built-in Find/Replace functionality and the RTF file converters. To this end, we needed a model, like Word’s, that was part of a single document instance, and could be overlaid on the existing paragraph structure. Accordingly, RichEdit's table implementation is very efficient and fast, in fact much faster than Word’s (although less general). Improvements have been made over the years, but the discussion that follows applies to RichEdit 4.0, which shipped with Office 2002, and RichEdit 4.1, which ships with Windows XP and Vista to this day. It also applies to later versions that ship with Office 2003 & 2007, which have additional features.
Specifically a cell containing a single line of text is represented only by that text, not by some larger structure. An empty cell consists of the single character, the cell mark U+0007. A cell containing multiple lines of text is expressed in terms of a structure that is substantially smaller than a complete edit instance, followed by the CELL mark. Tables can be nested up to 15 levels deep; higher nestings are represented by tab-delimited text. Cells can contain multiple paragraphs of any kind, e.g., bidirectional text, arbitrary tabs and alignments.
The Spring of 1999 was shortly after the Unicode Technical Committee added the U+FFF9..U+FFFB delimiter characters for describing ruby text in Japanese. These characters were available for more general use and seemed ideal for RichEdit’s internal table structure. This choice preceded the addition of the internal-use-only U+FDD0..U+FDEF characters that we use for mathematical structure characters, among other things.
In the (in-memory) backing store, a table row has the form
{CR...}CR
where { stands for the Unicode STARTGROUP character U+FFF9, and CR is the ASCII Carriage Return character U+000D. The delimiter } stands for the Unicode ENDGROUP character U+FFFB and ... stands for a sequence of cells, each consisting of cell text terminated by the CELL mark U+0007. For example, a row with three empty cells has the plain text understructure U+FFF9 U+000D U+0007 U+0007 U+0007 U+FFFB U+000D. The start and end group character pairs are assigned identical PARAFORMAT2 information that describes the row and cell parameters. If rows with different parameters are needed, they may follow one another with appropriate PARAFORMAT2 parameters. A horizontally or vertically merged cell has two characters: NOTACHAR (0xFFFF) followed by CELL (0x7). Any text that appears in a merged cell is stored in the first cell of the set of merged cells.
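As a rough illustration of the layout just described, the following sketch writes the plain-text understructure of a row of empty cells into a buffer of UTF-16 code units. The function name build_empty_row and the buffer convention are assumptions made for this example; they are not part of RichEdit. The sketch produces only the characters: a working table also needs the PARAFORMAT2 information on the row delimiters, which plain text cannot carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code points named in the text. */
enum {
    STARTGROUP = 0xFFF9,  /* "{" row-start delimiter */
    ENDGROUP   = 0xFFFB,  /* "}" row-end delimiter   */
    CELL       = 0x0007,  /* cell mark               */
    CR         = 0x000D   /* carriage return         */
};

/* Hypothetical helper (not a RichEdit API): writes the understructure
 * of one row containing `cells` empty cells into `out` and returns the
 * number of code units written, or 0 if `cap` is too small. */
size_t build_empty_row(uint16_t *out, size_t cap, size_t cells)
{
    size_t n = 0;

    assert(out != NULL);
    if (cap < 4 || cells > cap - 4)
        return 0;                 /* need cells + 4 code units */

    out[n++] = STARTGROUP;        /* {  */
    out[n++] = CR;                /* CR */
    for (size_t i = 0; i < cells; i++)
        out[n++] = CELL;          /* one cell mark per empty cell */
    out[n++] = ENDGROUP;          /* }  */
    out[n++] = CR;                /* CR */
    return n;
}
```

Calling build_empty_row(buf, 8, 3) fills buf with the seven code units of the three-cell example given in the text.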
One way to insert tables is to copy/paste tables from Word. RichEdit reads and writes table RTF. For more programmatic purposes, RichEdit 4.0 introduced the message EM_INSERTTABLEROW, which acts similarly to EM_REPLACESEL but inserts one or more table rows with empty cells instead of plain text. Specifically it deletes the text (if any) currently selected by the selection and then inserts empty table row(s) with the row and cell parameters given by wparam and lparam, respectively, as defined below. It leaves the selection pointing to the start of the first cell in the first row. The client can then populate the table cells by pointing the selection at the various cell end marks and inserting and formatting the desired text. Such text can include nested table rows, etc. Since wparam and lparam point at row and cell parameter structures, this API isn't compatible with Visual Basic and can't be easily added to RichEdit’s object model TOM, although TOM2 does have a general set of table interfaces.
The TABLEROWPARMS and TABLECELLPARMS structures are defined as
typedef struct _tableRowParms
{                           // EM_INSERTTABLEROW wparam is a (TABLEROWPARMS *)
    BYTE  cbRow;            // Count of bytes in this structure
    BYTE  cbCell;           // Count of bytes in TABLECELLPARMS
    BYTE  cCell;            // Count of cells
    BYTE  cRow;             // Count of rows
    LONG  dxCellMargin;     // Cell left/right margin (\trgaph)
    LONG  dxIndent;         // Row left (right if fRTL) indent (similar to \trleft)
    LONG  dyHeight;         // Row height (\trrh)
    DWORD nAlignment:3;     // Row alignment (like PARAFORMAT::bAlignment,
                            //  \trql, \trqr, \trqc)
    DWORD fRTL:1;           // Display cells in RTL order (\rtlrow)
    DWORD fKeep:1;          // Keep row together (\trkeep)
    DWORD fKeepFollow:1;    // Keep row on same page as following row (\trkeepfollow)
    DWORD fWrap:1;          // Wrap text to right/left (depending on bAlignment)
                            //  (see \tdfrmtxtLeftN, \tdfrmtxtRightN)
    DWORD fIdentCells:1;    // lparam points at single struct valid for all cells
} TABLEROWPARMS;
typedef struct _tableCellParms
{                           // EM_INSERTTABLEROW lparam is a (TABLECELLPARMS *)
    LONG     dxWidth;       // Cell width (\cellx)
    WORD     nVertAlign:2;  // Vertical alignment (0/1/2 = top/center/bottom;
                            //  \clvertalt (def), \clvertalc, \clvertalb)
    WORD     fMergeTop:1;   // Top cell for vertical merge (\clvmgf)
    WORD     fMergePrev:1;  // Merge with cell above (\clvmrg)
    WORD     fVertical:1;   // Display text top to bottom, right to left (\cltxtbrlv)
    WORD     wShading;      // Shading in .01% (\clshdng) e.g., 10000 flips fore/back
    SHORT    dxBrdrLeft;    // Left border width (\clbrdrl\brdrwN) (in twips)
    SHORT    dyBrdrTop;     // Top border width (\clbrdrt\brdrwN)
    SHORT    dxBrdrRight;   // Right border width (\clbrdrr\brdrwN)
    SHORT    dyBrdrBottom;  // Bottom border width (\clbrdrb\brdrwN)
    COLORREF crBrdrLeft;    // Left border color (\clbrdrl\brdrcf)
    COLORREF crBrdrTop;     // Top border color (\clbrdrt\brdrcf)
    COLORREF crBrdrRight;   // Right border color (\clbrdrr\brdrcf)
    COLORREF crBrdrBottom;  // Bottom border color (\clbrdrb\brdrcf)
    COLORREF crBackPat;     // Background color (\clcbpat)
    COLORREF crForePat;     // Foreground color (\clcfpat)
} TABLECELLPARMS;
Note that paragraph-format information containing the TABLEROWPARMS and TABLECELLPARMS information is attached to the table-row delimiters as set up by the EM_INSERTTABLEROW message, so merely duplicating the plain-text table structure in the backing store isn't enough to insert a working table. In fact, methods like ITextRange::SetText() convert the special delimiters U+FFF9..U+FFFB to spaces (U+0020). Note also that this table structure is nestable.
The definition of EM_INSERTTABLEROW is extensible, since in the future we'll probably have to support more parameters for table rows and cells. The API also inserts a consistent table row all at once, so that no illegal table parts are present on return. Hence if the document is saved after such an insertion, valid Word-compatible RTF will be written. lparam points at the TABLECELLPARMS structure for the first cell in an array of TABLECELLPARMS structures. It's important that cbCell = sizeof(TABLECELLPARMS). That way RichEdit knows how much cell information the client is specifying. In particular, in the future if more cell parameters are defined, older clients can get away with specifying less and the new RichEdit can assign default values for the new parameters. Similarly cbRow says how many bytes are defined by the client for TABLEROWPARMS, in case RichEdit is revised to support more row parameters that the client doesn't know about.
To make simple tables easier to define, if fIdentCells = 1, lparam points at a single TABLECELLPARMS structure that is valid for all cells in the row. Note that a nonzero cell border width is guaranteed to give at least a one-pixel border.
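The role of cbRow and cbCell can be illustrated with a small, portable sketch of the size-field convention. The two structures and the function read_cell_parms below are hypothetical stand-ins invented for this example; they are not the real TABLECELLPARMS layout or RichEdit's actual code. The sketch only shows how a newer library might accept a shorter structure from an older client and fill in defaults for the fields the client did not know about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "old" layout that an older client was compiled against. */
typedef struct {
    long dxWidth;
} CellParmsOld;

/* Hypothetical "new" layout: same leading field plus one added later. */
typedef struct {
    long dxWidth;
    long dxPadding;   /* invented field, for illustration only */
} CellParmsNew;

/* Library side: the client passes `cb` bytes (its own sizeof).
 * Fields beyond `cb` keep their default values. */
void read_cell_parms(const void *src, size_t cb, CellParmsNew *out)
{
    CellParmsNew defaults = { 0, 15 };  /* 15 is an arbitrary default */

    assert(src != NULL && out != NULL);
    if (cb > sizeof *out)
        cb = sizeof *out;   /* ignore fields from a still newer client */

    *out = defaults;        /* start from defaults ...                 */
    memcpy(out, src, cb);   /* ... then overlay what the client gave   */
}
```

An old client would call read_cell_parms(&parms, sizeof(CellParmsOld), &result) and get dxPadding set to the default, while a new client passing sizeof(CellParmsNew) would have both fields honored. This is why the size passed by the client has to match the structure it actually filled in.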
The colors are limited to the standard 16 colors defined by
RGB( 0, 0, 0), // \red0\green0\blue0
RGB( 0, 0, 255), // \red0\green0\blue255
RGB( 0, 255, 255), // \red0\green255\blue255
RGB( 0, 255, 0), // \red0\green255\blue0
RGB(255, 0, 255), // \red255\green0\blue255
RGB(255, 0, 0), // \red255\green0\blue0
RGB(255, 255, 0), // \red255\green255\blue0
RGB(255, 255, 255), // \red255\green255\blue255
RGB( 0, 0, 128), // \red0\green0\blue128
RGB( 0, 128, 128), // \red0\green128\blue128
RGB( 0, 128, 0), // \red0\green128\blue0
RGB(128, 0, 128), // \red128\green0\blue128
RGB(128, 0, 0), // \red128\green0\blue0
RGB(128, 128, 0), // \red128\green128\blue0
RGB(128, 128, 128), // \red128\green128\blue128
RGB(192, 192, 192), // \red192\green192\blue192
plus two custom colors. The border widths are limited to the range 0 to 255 twips.
If the color index is not in the range 1..18, then autocolor is used, which usually ends up being the system Text or Background colors.
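The color rule can be sketched as a small lookup. This is only an illustration under stated assumptions: it assumes that indices 1..16 select the standard colors in the order listed above and that 17 and 18 are the two custom colors, neither of which the text states explicitly. The names RGB_, classify_color_index and ColorKind are invented for the example; RGB_ mimics the 0x00BBGGRR layout of the Windows RGB() macro so the sketch compiles without Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as a COLORREF built by the Windows RGB() macro: 0x00BBGGRR. */
#define RGB_(r, g, b) \
    ((uint32_t)(r) | ((uint32_t)(g) << 8) | ((uint32_t)(b) << 16))

/* The 16 standard colors, in the order given in the text. */
static const uint32_t kStdColors[16] = {
    RGB_(  0,   0,   0), RGB_(  0,   0, 255), RGB_(  0, 255, 255),
    RGB_(  0, 255,   0), RGB_(255,   0, 255), RGB_(255,   0,   0),
    RGB_(255, 255,   0), RGB_(255, 255, 255), RGB_(  0,   0, 128),
    RGB_(  0, 128, 128), RGB_(  0, 128,   0), RGB_(128,   0, 128),
    RGB_(128,   0,   0), RGB_(128, 128,   0), RGB_(128, 128, 128),
    RGB_(192, 192, 192)
};

typedef enum { COLOR_AUTO, COLOR_STANDARD, COLOR_CUSTOM } ColorKind;

/* Hypothetical helper: classifies a color index. For a standard color
 * the RGB value is stored in *rgb; otherwise *rgb is set to 0.
 * ASSUMPTION: 1..16 are the standard colors, 17..18 the custom ones. */
ColorKind classify_color_index(int index, uint32_t *rgb)
{
    assert(rgb != NULL);
    *rgb = 0;
    if (index >= 1 && index <= 16) {
        *rgb = kStdColors[index - 1];
        return COLOR_STANDARD;
    }
    if (index == 17 || index == 18)
        return COLOR_CUSTOM;      /* value chosen by the client */
    return COLOR_AUTO;            /* outside 1..18: autocolor   */
}
```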
Channel: www-math@w3.org Mail Archives
Author: | Channel: Murray Sargent: Math in Office
Two very interesting developments are happening that will improve Word 2007’s MathML support. The first is key to getting Word 2007 math text into the scientific and technical publisher workflows, and the second may help in this regard too. Specifically, new transforms are now available in beta versions enabling Word to read and write MathML. These XSLT files are responsible for converting between Word’s native math format OMML and MathML 2.0. If you’d like to try out the new files (omml2mml.xsl and mml2omml.xsl), you can download them from the Microsoft Connect site using the invitation code: 0707-84P4-DPWT. Once you’ve downloaded the files, copy them to the C:\Program Files\Microsoft Office\Office12 subdirectory, or wherever winword.exe is. Before doing so, you might want to rename the current omml2mml.xsl and mml2omml.xsl files to omml2mml.xsl.bak and mml2omml.xsl.bak, respectively, in case you want to back out the update at a later date. But I doubt you will. The new ones are significantly better.
The second development is that Word 2007 will have a service pack release that enables it to read and write the ISO standard ODF files as well as the native ISO standard OOXML files. In the ODF standard, math zones are represented by MathML 2.0. So when Word converts to and from ODF, it will use MathML 2.0 for all math zones. And it will use the files above to do the translations.
Channel: W3C Math Home
Author: | Channel: eds.activemath.org - MathML
I recently had a very simple request… soooo simple: our user just wishes to copy the formula from Mathematica (which can copy it in MathML) and paste it on something that does web.
I just went around and tried… SeaMonkey should support that in editor and reader: copy a piece of HTML with MathML and paste it, didn’t even work… my 1/x became a place full of nbsps in three lines!
Author: | Channel: Making Math Accessible
I'll be participating in a panel session at the American Council of the Blind Annual Convention, July 5-12 at the Galt House Hotel & Suites in Louisville, Kentucky. The American Council of the Blind is one of the nation's leading membership organizations of blind and visually impaired people. The panel session, Books Unbound: How Technology is Writing a New Chapter on Accessible Textbooks, will also include presenters from the National Instructional Materials Accessibility Center, Recording for the Blind and Dyslexic, Bookshare.org and the National Braille Press. During my segment, I'll be giving folks an overview of the latest news on math support in assistive technology products and how the provision of MathML within electronic textbook files is going to make fully accessible math textbooks a reality for people with visual impairments, with support for both math-to-speech and math-to-braille access. If you're going to be in Louisville for ACB, be sure to come to this session and say hello. For details on the time and location of this session, as well as information on other conferences where we will be speaking or exhibiting, see our Events Schedule.
Author: | Channel: Murray Sargent: Math in Office
Subscript and Superscript Bases
For proper math typography, it’s important to know the base of a subscript or superscript expression. For example, in Einstein’s equation E = mc², the superscript expression c² appears and c is the base, not mc. Knowing what the base is allows proper kerning of the base relative to the script (superscript or subscript) as well as providing more accurate semantics in interoperating with mathematical calculation engines.
This post describes the subscript/superscript base rules used by Word 2007 and RichEdit 6 in building up math text from the linear format. The rules are good, but not infallible, and users can overrule them either directly in the linear format or after they are built up into the Professional format.
Unicode math alphabetics: Ordinarily when a user types an ASCII letter or a Greek lower case letter α..ω (along with some variants), the letter is automatically converted to the corresponding Unicode math italic letter. These special mathematical letters, along with the basic set of Latin letters in Fraktur, script, and open-face math styles, are reserved for mathematical variables. Accordingly, if a subscript or superscript follows such a letter, that letter is considered to be the base. In linear format if you type E=mc^2<space>, you get E = mc², where the letters are given by math italic characters (not used here in this blog post). In particular, c would be given by the math italic c, U+1D450, rather than by the ASCII c, U+0063. This single math italic c is the base of the superscript expression c². For more information on the math alphabetics, please see Section 2.1 of the Unicode Technical Report #25.
Numbers: A consecutive string of ASCII digits is treated as a base. So in the expression 100², the 100 is the base of the superscript expression and has the mathematical meaning of “one hundred squared”. This quantity is typed in as 100^2.
ASCII letter strings: Since mathematical variables are almost always represented by math alphabetics, a consecutive string of ASCII letters is treated as a base. So in the superscript expression sin⁻¹, the base is “sin”. Actually this case is usually handled by the function name mechanism described next. You can enter an ASCII letter string by turning off the italic button before you type or by selecting the corresponding math italic letters and then turning off the italic button. Be sure to turn the italic button back on if you want to enter math italic variables.
Function names: when a consecutive string of English alphabetics is typed followed by a space or bracket of some kind, the resulting math italic string is “folded” down to the corresponding ASCII letter string and compared to entries in a mathematical function dictionary. If found, the folded version of the string is used followed by the function-apply operator U+2061. The dictionary includes trigonometric functions like sin, cos, tan, etc., along with many other famous math function names. Users can modify this dictionary. If the function-apply operator is then followed by a subscript or superscript, that script is transferred to the function name, and the function name becomes the base of the script expression. This is handy for typing in expressions like sin⁻¹x.
Embellished operators: If an operator character precedes a subscript or superscript, the operator is the base. For example, in the expression +², the + is the base.
Built-up math objects: If a built-up math object such as a stacked fraction precedes a subscript or superscript, that object is the base.
Superscripting a subscript object: Exceptions to the rule above occur for superscripting a subscript object and subscripting a superscript object. In both of these cases, the combination is turned into a subsup object, which has special typography, typically placing the superscript over the subscript.
Opaque strings: Opaque strings are whatever is inside a \begin \end expression. Such strings are bases if followed by a subscript or superscript. This is the catch-all method of letting most any mathematical text be a subscript/superscript base. The user is cautioned to use reasonable choices so that the result is understandable to readers.
Complex script characters: In Indic scripts like Devanagari, a number of Unicode characters may be combined to form a character “cluster”. If such a cluster is followed by a subscript or superscript, the cluster becomes the base. However, this doesn’t occur for Arabic ligatures, for which only the last character is treated as the base. One can force the whole ligature to be the base by putting it inside a \begin \end expression, i.e., by making it an opaque string.
Ordinary text: Expressions resulting from the linear format “rate” are called ordinary text and are useful as variables when you want to spell out the variables’ names. Such ordinary text strings are treated as bases.
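A few of the rules above can be sketched in code. The function below is a simplified illustration, not Word's or RichEdit's actual build-up logic: it handles only plain ASCII input and covers just the digit-string, ASCII-letter-string and single-operator rules. The name script_base_start and its calling convention are assumptions made for the example. Math italic letters, function names, built-up objects, opaque strings and complex-script clusters are all outside its scope.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: given a linear-format string `s` and the index
 * `script_pos` of a '^' or '_' character, returns the index at which
 * the base of that script starts.
 *   - a run of ASCII digits is one base        ("100^2" -> "100")
 *   - a run of ASCII letters is one base       ("sin^-1" -> "sin")
 *   - otherwise the single preceding character ("+^2"   -> "+")
 * If nothing precedes the script, script_pos itself is returned. */
size_t script_base_start(const char *s, size_t script_pos)
{
    assert(s != NULL);
    if (script_pos == 0)
        return 0;                         /* no base available */

    size_t i = script_pos - 1;
    unsigned char last = (unsigned char)s[i];

    if (isdigit(last)) {
        /* extend the base backwards over consecutive digits */
        while (i > 0 && isdigit((unsigned char)s[i - 1]))
            i--;
    } else if (isalpha(last)) {
        /* extend the base backwards over consecutive letters */
        while (i > 0 && isalpha((unsigned char)s[i - 1]))
            i--;
    }
    /* any other character (e.g. an operator) is a one-character base */
    return i;
}
```

Note that for the plain ASCII input mc^2 this sketch reports mc as the base, in line with the ASCII letter string rule. In Word the letters would already have been converted to math italic characters, so only the single letter c would be the base.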
Author: | Channel: eds.activemath.org - MathML
There’s a wind for more content construction in the ActiveMath group, with at least two projects at the University focussed on creating content (and a adapt platform and…). And MathML starts to play an important role there.
Author: | Channel: SF.net - DocBook to LaTeX Publishing