This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 9859 - add document-conformance constraints for documents that contain SVG or MathML content that in turn contains HTML content
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-06-05 05:39 UTC by Michael[tm] Smith
Modified: 2010-10-04 14:48 UTC
CC List: 5 users

See Also:


Attachments

Description Michael[tm] Smith 2010-06-05 05:39:51 UTC
I note that the 'The "in foreign content" insertion mode' subsection of the "Tree construction" section defines as a parse error any instance of 'A start tag whose tag name is one of: "b", "big", "blockquote", "body", "br", "center",', etc., when the parser is in the "in foreign content" insertion mode.

http://dev.w3.org/html5/spec/tokenization.html#parsing-main-inforeign

That's all great, but the author view of the spec doesn't provide any corresponding requirements on documents/authors when they are creating documents that contain SVG and MathML content.

So I think it would be helpful if the spec could state something like, 'Documents [meaning serialized documents, not parsed DOM representations] must not have SVG or MathML content that contains any of the following elements: "b", "big", "blockquote", "body", "br", "center",' etc.

I realize that a parsed document in memory is not going to contain such instances. I mean for this suggestion to apply to guidance for authors on how to produce conformant documents.
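
As an illustration of the kind of author-side check this suggestion implies (not part of the spec or of any validator), here is a minimal Python sketch built on the standard library's html.parser. Everything in it is an assumption made for the example: the names FLAGGED, ForeignContentLint and lint are hypothetical, FLAGGED holds only the six tag names quoted above (the spec's list is longer), and the set of elements treated as places where HTML is allowed is a simplification. html.parser is not an HTML5 tree builder, so this only scans the serialized markup and does not model how a real parser recovers from the error.

```python
from html.parser import HTMLParser

# Only the tag names quoted in this report; the spec's actual list is longer.
FLAGGED = {"b", "big", "blockquote", "body", "br", "center"}
FOREIGN_ROOTS = {"svg", "math"}
# Assumed simplification: elements inside which HTML content is accepted.
# html.parser lowercases tag names, hence "foreignobject".
HTML_HOSTS = {"mi", "mo", "mn", "ms", "mtext", "foreignobject", "desc", "title"}


class ForeignContentLint(HTMLParser):
    """Collects flagged HTML start tags that appear directly in SVG/MathML."""

    def __init__(self):
        super().__init__()
        self.context = []   # open foreign roots and HTML hosts, innermost last
        self.problems = []  # (tag, (line, column)) pairs

    def handle_starttag(self, tag, attrs):
        in_foreign = bool(self.context) and self.context[-1] in FOREIGN_ROOTS
        if in_foreign and tag in FLAGGED:
            self.problems.append((tag, self.getpos()))
        # Track foreign roots everywhere, HTML hosts only inside foreign content.
        if tag in FOREIGN_ROOTS or (self.context and tag in HTML_HOSTS):
            self.context.append(tag)

    def handle_endtag(self, tag):
        if tag in self.context:
            # Pop back to, and including, the matching tracked element.
            while self.context.pop() != tag:
                pass


def lint(markup):
    """Return the flagged start tags found in a serialized document."""
    checker = ForeignContentLint()
    checker.feed(markup)
    checker.close()
    return checker.problems
```

For example, lint("<math><mrow><b>x</b></mrow></math>") reports the b start tag, while the same b inside an mtext element, or outside any svg or math element, is not reported.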
Comment 1 Michael[tm] Smith 2010-06-05 05:50:23 UTC
I'm looking at this test case: http://validator.nu/?doc=https%3A%2F%2Feyeasme.com%2FJoe%2FMathML%2FHTML5%2Fextras.html&showsource=yes#l230c87

...which is originally from http://www.mozilla.org/projects/mathml/demo/extras.xhtml

Validation of it fails (in fact, error-free parsing of it fails) because, as far as I can tell, it has an HTML img element within a MathML annotation-xml element.

But HTML img within MathML annotation-xml seems to be valid as far as MathML is concerned, and as far as validator.nu is concerned when the document is served with an XML MIME type:

http://validator.nu/?doc=https%3A%2F%2Feyeasme.com%2FJoe%2FMathML%2FHTML5%2Fextras.xhtml&showsource=yes

It seems the only case where it's not conformant is when it's served as text/html.
Comment 2 Simon Pieters 2010-06-07 03:40:26 UTC
"A start tag whose tag name is "svg", if the current node is an annotation-xml element in the MathML namespace."

The HTML5 parser only allows <svg> or MathML in <annotation-xml>. Not HTML. <span> will be in MathML namespace.

We could change the parser to allow HTML there (like it allows HTML in MathML <mi>), but then it would break MathML in <annotation-xml>. The MathML spec has an example with MathML in <annotation-xml>.

Maybe we need a special tag that enables HTML in <annotation-xml>? <div>? It means MathML can't have a future element called <div>.

"A start tag whose tag name is "svg" or "div" if the current node is an annotation-xml element in the MathML namespace."

However... The XHTML version doesn't use <annotation-xml> at all. Why does the HTML version? Shouldn't it use <mn> or <mtext> instead?
Comment 3 Michael[tm] Smith 2010-06-07 05:12:00 UTC
(In reply to comment #2)
> "A start tag whose tag name is "svg", if the current node is an annotation-xml
> element in the MathML namespace."
> 
> The HTML5 parser only allows <svg> or MathML in <annotation-xml>. Not HTML.
> <span> will be in MathML namespace.

I see now... I had not noticed that previously.

> However... The XHTML version doesn't use <annotation-xml> at all. Why does the
> HTML version? Shouldn't it use <mn> or <mtext> instead?

I also hadn't noticed that the original source did not use <annotation-xml>. I'll ask the person who originally reported it why he made that change.
Comment 4 Henri Sivonen 2010-06-07 06:55:06 UTC
Could be my fault for making an incorrect statement about annotation-xml on the Mozilla Hacks blog.
Comment 5 Michael[tm] Smith 2010-06-09 02:29:24 UTC
FYI, I opened a related bug, bug 9887, "parsing algorithm should allow HTML content in MathML <annotation-xml>"
Comment 6 Michael[tm] Smith 2010-06-14 01:34:47 UTC
I had some off-list discussion about where HTML elements should be allowed in MathML content, and that leads me to suggest that they should be restricted to <mtext> elements. I realize that the parsing algorithm also processes HTML content correctly if it's in <mi>, <mn>, <mo>, and <ms> elements, but I think this is a case where it makes sense for the document-conformance requirements to be stricter than what the parsing algorithm allows.
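
To make the distinction concrete, here is a small Python sketch (illustrative only; the names PARSER_HTML_HOSTS, PROPOSED_CONFORMING_HOSTS and classify_html_host are hypothetical). It encodes just the two sets named in this comment: the token elements in which the parsing algorithm handles HTML content, and the single element to which this proposal would restrict conforming documents.

```python
# Token elements in which the parsing algorithm handles HTML content,
# as listed in this comment.
PARSER_HTML_HOSTS = frozenset({"mi", "mn", "mo", "ms", "mtext"})

# The stricter document-conformance rule proposed in this comment.
PROPOSED_CONFORMING_HOSTS = frozenset({"mtext"})


def classify_html_host(mathml_parent):
    """Classify a MathML element as a possible parent of HTML content."""
    if mathml_parent in PROPOSED_CONFORMING_HOSTS:
        return "conforming"
    if mathml_parent in PARSER_HTML_HOSTS:
        return "parses, but non-conforming under the proposal"
    return "not an HTML host"
```

Under this sketch, classify_html_host("mtext") is "conforming", classify_html_host("mn") is "parses, but non-conforming under the proposal", and classify_html_host("mrow") is "not an HTML host".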
Comment 7 Simon Pieters 2010-06-14 06:21:47 UTC
> I think
> this is a case where it makes sense for the document-conformance requirements
> to be stricter than what the parsing algorithm allows

Why? Is this something that will change over time (i.e. do you think it will make sense to allow HTML in <mn> in the future)?
Comment 8 Michael[tm] Smith 2010-06-14 06:37:21 UTC
(In reply to comment #7)
> > I think
> > this is a case where it makes sense for the document-conformance requirements
> > to be stricter than what the parsing algorithm allows
> 
> Why? Is this something that will change over time (i.e. do you think it will
> make sense to allow HTML in <mn> in the future)?

I don't think it will make sense to allow HTML in <mn> in the future, no. But from what I've gleaned about both MathML semantics and the MathML tools situation, putting HTML into <mn> (or any of the "token elements" except for <mtext>) is both a poor match for MathML semantics and a potential problem for non-browser MathML processing tools. I think the case is that the existing tools don't necessarily expect to do any processing at all of <mtext> contents -- and so any non-MathML-namespace markup they find in <mtext> is not going to cause them any processing failures -- whereas some tools do not expect such markup in <mn>, etc., so it may cause them some processing failures.

Anyway, I think what we'll eventually need is for someone knowledgeable about MathML semantic differences and the MathML tool situation -- preferably somebody from the Math WG -- to post a comment here with some more details.
Comment 9 Ian 'Hixie' Hickson 2010-08-27 20:46:35 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: It's already the case that you can't legally put those elements in the places where they cause problems, by virtue of them not being valid in SVG and MathML.