<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>25104</bug_id>
          
          <creation_ts>2014-03-20 10:56:56 +0000</creation_ts>
          <short_desc>The RelaxNG schema should recognize more encoding values for the &lt;annotation-xml&gt; element</short_desc>
          <delta_ts>2015-08-23 06:58:45 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML Checker</product>
          <component>General</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>fred.wang</reporter>
          <assigned_to name="Michael[tm] Smith">mike+validator</assigned_to>
          <cc>davidc</cc>
    
    <cc>mike</cc>
          
          <qa_contact name="qa-dev tracking">www-validator-cvs</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>102701</commentid>
    <comment_count>0</comment_count>
    <who name="">fred.wang</who>
    <bug_when>2014-03-20 10:56:56 +0000</bug_when>
    <thetext>The HTML5 RelaxNG schema currently allows the following values for the encoding attribute of the &lt;semantics&gt; element:

HTML =&gt; string &quot;application/xhtml+xml&quot; | string &quot;text/html&quot;
SVG =&gt; string &quot;SVG1.1&quot;
MathML =&gt; string &quot;MathML&quot; | string &quot;MathML-Content&quot; | string &quot;MathML-Presentation&quot;

MathML3 suggests using the MathML/SVG MIME types as encoding values and keeping the old values for backwards compatibility:

http://www.w3.org/TR/MathML3/chapter6.html
http://www.w3.org/TR/MathML3/chapter6.html#encoding-names
http://www.w3.org/TR/MathML3/chapter6.html#interf.graphics
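
To make the request concrete, here is a rough Python sketch of the restriction as I understand it. It is not the validator code: the names are made up for illustration, the first set just copies the values listed above, and the second set is my assumption about which MIME types the MathML3 links refer to.

```python
# Illustration only; every name here is hypothetical.
# Encoding values the HTML5 RelaxNG schema currently lists (copied from above).
CURRENTLY_ALLOWED = frozenset({
    "application/xhtml+xml",
    "text/html",
    "SVG1.1",
    "MathML",
    "MathML-Content",
    "MathML-Presentation",
})

# Assumed MIME types for MathML and SVG annotations (my reading of MathML3 chapter 6).
REQUESTED_MIME_TYPES = frozenset({
    "application/mathml-presentation+xml",
    "application/mathml-content+xml",
    "image/svg+xml",
})


def schema_accepts(encoding, allowed=CURRENTLY_ALLOWED):
    """Return True if the value is in the allowed set (exact, case-sensitive match)."""
    return encoding in allowed


def newly_accepted(requested=REQUESTED_MIME_TYPES, allowed=CURRENTLY_ALLOWED):
    """Return the requested values that the current set does not yet accept."""
    return sorted(requested - allowed)
```

With these sets, schema_accepts("SVG1.1") is true while schema_accepts("image/svg+xml") is false, which is the gap this report is about.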

Gecko &amp; WebKit recognize these MathML/SVG MIME types, and I think the RelaxNG schema should allow them too.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102720</commentid>
    <comment_count>1</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2014-03-21 02:56:33 +0000</bug_when>
    <thetext>I&apos;m happy to change this as long as it actually conforms to the MathML spec. I&apos;ll read up from the links you provided.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102721</commentid>
    <comment_count>2</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2014-03-21 03:01:56 +0000</bug_when>
    <thetext>Cc&apos;ing David Carlisle.

David, as far as I know we&apos;re using this part of upstream MathML schema as-is–without changes–so if what Fred says is correct, this seems like a change that you should also make to the upstream schema.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102722</commentid>
    <comment_count>3</comment_count>
    <who name="">fred.wang</who>
    <bug_when>2014-03-21 05:33:37 +0000</bug_when>
    <thetext>(In reply to Michael[tm] Smith from comment #2)
&gt; Cc&apos;ing David Carlisle.
&gt; 
&gt; David, as far as I know we&apos;re using this part of upstream MathML schema
&gt; as-is–without changes–so if what Fred says is correct, this seems like a
&gt; change that you should also make to the upstream schema.

Sorry, I forgot to mention the RelaxNG links. As I understand, the upstream schema accepts arbitrary content for semantics:

http://www.w3.org/Math/RelaxNG/mathml3/mathml3-common.rnc

while the HTML5 one has been restricted to accept only SVG/MathML/HTML:

https://bitbucket.org/validator/validator/src/5ee4172d2929787d5b78519c2035e62b503eee6c/schema/mml3/mathml3-common.rnc?at=default#cl-64
https://bitbucket.org/validator/validator/src/5ee4172d2929787d5b78519c2035e62b503eee6c/schema/xhtml5-svg-mathml.rnc?at=default#cl-31</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102723</commentid>
    <comment_count>4</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2014-03-21 05:48:11 +0000</bug_when>
    <thetext>(In reply to fred.wang from comment #3)
&gt; Sorry, I forgot to mention the RelaxNG links. As I understand, the upstream
&gt; schema accepts arbitrary content for semantics:
&gt; 
&gt; http://www.w3.org/Math/RelaxNG/mathml3/mathml3-common.rnc

The schema we use for the validator allows exactly the same content and attributes for the &lt;semantics&gt; element that the above file from the upstream schema does. I made no change at all to &lt;semantics&gt; as far as I can tell.

&gt; while the HTML5 one has been restricted to accept only SVG/MathML/HTML:
&gt; 
&gt; https://bitbucket.org/validator/validator/src/
&gt; 5ee4172d2929787d5b78519c2035e62b503eee6c/schema/mml3/mathml3-common.
&gt; rnc?at=default#cl-64
&gt; https://bitbucket.org/validator/validator/src/
&gt; 5ee4172d2929787d5b78519c2035e62b503eee6c/schema/xhtml5-svg-mathml.
&gt; rnc?at=default#cl-31

Those links are for changes not to &quot;the encoding attribute of the &lt;semantics&gt; element&quot; (what you mention in the Description for this issue) but instead for changes to the &lt;annotation-xml&gt; element.

Is this bug about &lt;annotation-xml&gt; or about &lt;semantics&gt;?

At this point it&apos;s unclear to me what change you&apos;re asking for here.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102727</commentid>
    <comment_count>5</comment_count>
    <who name="">fred.wang</who>
    <bug_when>2014-03-21 06:10:21 +0000</bug_when>
    <thetext>&gt; Is this bug about &lt;annotation-xml&gt; or about &lt;semantics&gt;?
&gt; At this point it&apos;s unclear to me what change you&apos;re asking for here.

It&apos;s about &lt;annotation-xml&gt;, sorry (&lt;semantics&gt; does not have an encoding attribute AFAIK). Of course &lt;annotation-xml&gt; elements are used as children of &lt;semantics&gt;, thus the confusion.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102728</commentid>
    <comment_count>6</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2014-03-21 06:17:47 +0000</bug_when>
    <thetext>(In reply to fred.wang from comment #5)
&gt; &gt; Is this bug about &lt;annotation-xml&gt; or about &lt;semantics&gt;?
&gt; &gt; At this point it&apos;s unclear to me what change you&apos;re asking for here.
&gt; 
&gt; It&apos;s about &lt;annotation-xml&gt;, sorry (&lt;semantics&gt; does not have encoding
&gt; attribute AFAIK). Of course &lt;annotation-xml&gt;&apos;s are used as children of
&gt; &lt;semantics&gt;, thus the confusion.

OK, got it. I&apos;ll wait for David to weigh in, because I really still don&apos;t understand the use cases for &lt;annotation-xml&gt; enough to know myself and would like to get his take on what he thinks is the best thing to do here.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102736</commentid>
    <comment_count>7</comment_count>
    <who name="David Carlisle">davidc</who>
    <bug_when>2014-03-21 09:36:33 +0000</bug_when>
    <thetext>If I&apos;d ruled the world things would have been different:-)

The situation is that, in the core MathML RelaxNG schema, annotation-xml allows arbitrary content and the encoding attribute takes arbitrary values.

attribute encoding {xsd:string}?

In application/xhtml+xml parsing there is an argument that says that this should be as above, however I think there is also the argument that the HTML+MathML+SVG schema should try to steer people towards HTML/XHTML compatibility by default.

In text/html parsing things are as usual rather more murky.

annotation-xml doesn&apos;t really take arbitrary content: it&apos;s either parsed as HTML (encoding=text/html or application/xhtml+xml) or as MathML (any other encoding).
Namespaces in the content are as usual mangled/ignored.


This means for example that if you want to put SVG in the annotation-xml you need to use encoding application/xhtml+xml because then the html parser sees &lt;svg&gt; and automatically puts things back in foreign content in svg namespace and things work. If you put any SVG-related value for the encoding the elements will be parsed as unknown elements in the MathML namespace. 

So... In an ideal world I&apos;d make the HTML parsing more sensible, but until or unless there is a proposal to make things work in HTML parsing I wouldn&apos;t say that the validator is wrong to warn about any encoding other than the HTML or MathML ones. That doesn&apos;t mean that the browsers shouldn&apos;t accept whatever makes sense of course.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102739</commentid>
    <comment_count>8</comment_count>
    <who name="David Carlisle">davidc</who>
    <bug_when>2014-03-21 09:51:04 +0000</bug_when>
    <thetext>(In reply to David Carlisle from comment #7)

oops.
&gt; 
&gt; In text/html parsing things are as usual rather more murky.

I got that part right:-)

&gt; 
&gt; This means for example that if you want to put SVG in the annotation-xml you
&gt; need to use encoding application/xhtml+xml because then the html parser sees
&gt; &lt;svg&gt; and automatically puts things back in foreign content in svg namespace
&gt; and things work. If you put any SVG-related value for the encoding the
&gt; elements will be parsed as unknown elements in the MathML namespace. 


I got that part wrong: &lt;svg&gt; is also recognised as svg in MathML (foreign content parsing)

However it is still basically true that in text/html there are only really two useful values for encoding, text/html and application/xhtml+xml. Any other value has the same effect as not having the encoding attribute at all.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>102847</commentid>
    <comment_count>9</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2014-03-24 17:56:08 +0000</bug_when>
    <thetext>(In reply to fred.wang from comment #0)
&gt; The HTML5 RelaxNG schema currently allows the following values for the
&gt; encoding attribute of the &lt;semantics&gt; element:
&gt; 
&gt; HTML =&gt; string &quot;application/xhtml+xml&quot; | string &quot;text/html&quot;

(In reply to David Carlisle from comment #8)
&gt; However it is still basically true that in text/html there are only really
&gt; two useful values for encoding, text/html and application/xhtml+xml. Any
&gt; other value has the same effect as not having the encoding attribute at all.

So yeah the HTML parser depends on looking for those specific values for the &quot;encoding&quot; attribute.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#html-integration-point

That part of the HTML spec says:

— A node is an HTML integration point if it is one of the following elements:

— An annotation-xml element in the MathML namespace whose start tag token had an attribute with the name &quot;encoding&quot; whose value was an ASCII case-insensitive match for the string &quot;text/html&quot;
— An annotation-xml element in the MathML namespace whose start tag token had an attribute with the name &quot;encoding&quot; whose value was an ASCII case-insensitive match for the string &quot;application/xhtml+xml&quot;
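
As a minimal sketch of that check (not the validator or any parser implementation; the function and constant names are hypothetical), the two annotation-xml conditions above could be written in Python like this:

```python
import string

# The two encoding values named in the HTML integration point definition.
HTML_INTEGRATION_ENCODINGS = frozenset({"text/html", "application/xhtml+xml"})

# Translation table that lowercases only the ASCII letters A-Z, which is
# what an ASCII case-insensitive comparison needs.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def is_html_integration_encoding(encoding):
    """Return True if an annotation-xml encoding value would make the element
    an HTML integration point. Pass None when the attribute is absent."""
    if encoding is None:
        return False
    return encoding.translate(_ASCII_LOWER) in HTML_INTEGRATION_ENCODINGS
```

Under this sketch a value such as Text/HTML matches, while any SVG or MathML value, or a missing attribute, does not.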

Then there are other parts of the spec that say what to do when an &quot;HTML integration point&quot; is encountered; e.g.:

  http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inforeign

—Pop an element from the stack of open elements, and then keep popping more elements from the stack of open elements until the current node is a MathML text integration point, an HTML integration point, or an element in the HTML namespace.

So as far as I can see, the validator is aligned with the HTML spec here. If you want something other than encoding=application/xhtml+xml or encoding=text/html to be supported then you need to file a bug against the HTML spec.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>