This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 20707 - Please add a Scope section per the qualification of the TAG's support for REC track publication
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: CR HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Leif Halvard Silli
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 12725
Reported: 2013-01-18 12:13 UTC by Henri Sivonen
Modified: 2013-09-01 20:57 UTC (History)

Description Henri Sivonen 2013-01-18 12:13:35 UTC
See http://lists.w3.org/Archives/Public/public-html/2012Dec/0082.html from the TAG.

The email says, in part:
“
We understand that the HTML WG is currently debating whether to publish the Polyglot draft as a Recommendation or a Note. We support the publication of the Polyglot draft as a Recommendation, with the addition of a Scope section that makes the intended uses of polyglot clear. The scope should indicate that

 * the use of polyglot is suitable as an
   option for tool chains that operate in
   controlled environments and for authoring
   tools

 * XML-based HTML tools or systems intended
   for the most general contexts of use cannot
   depend on polyglot input: for maximum flexibility,
   such tools should use the technique of using an
   HTML parser that produces an XML-compatible DOM or
   event stream

Making these points would make it clearer what polyglot is being recommended for, and what it is not being recommended for.
”

Please add a Scope section that communicates the points outlined in the TAG email.
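
The technique named in the TAG's second bullet point (an HTML parser that produces an XML-compatible event stream) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's standard `html.parser` tokenizer, which is not a conforming HTML5 parser, and the names `HtmlToXmlEvents`, `html_to_xml` and `VOID_ELEMENTS` are hypothetical. A real tool would use a conforming HTML parser instead.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML void elements never take an end tag, so they are closed immediately.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class HtmlToXmlEvents(HTMLParser):
    """Hypothetical adapter: forwards HTML tokenizer events to an XML tree builder."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements that are still open

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") arrive as None; XML needs a value.
        xml_attrs = {name: (value if value is not None else "") for name, value in attrs}
        self.builder.start(tag, xml_attrs)
        if tag in VOID_ELEMENTS:
            self.builder.end(tag)
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS or tag not in self.open_tags:
            return  # ignore stray end tags
        # Close implicitly closed elements (e.g. an unclosed <p>) up to the match.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # text outside the root element is dropped
            self.builder.data(data)

    def finish(self):
        self.close()  # flush any buffered text
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def html_to_xml(source: str) -> str:
    """Parse HTML input leniently and serialize the result as well-formed XML."""
    parser = HtmlToXmlEvents()
    parser.feed(source)
    return ET.tostring(parser.finish(), encoding="unicode")


if __name__ == "__main__":
    # Non-polyglot input: unclosed <p>, <br> without a slash, valueless attribute.
    print(html_to_xml("<!DOCTYPE html><html><body><p>Hello<br>world"
                      "<input disabled></body></html>"))
```

The point of the sketch is that the consuming tool does not depend on polyglot input: the input above is not well-formed XML, yet the XML side of the tool chain still receives a well-formed tree.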
Comment 1 Leif Halvard Silli 2013-01-19 11:37:15 UTC
(In reply to comment #0)

>  * XML-based HTML tools or systems intended
>    for the most general contexts of use cannot
>    depend on polyglot input: for maximum flexibility,
>    such tools should use the technique of using an
>    HTML parser that produces an XML-compatible DOM or
>    event stream

In the context of Polyglot Markup, the above doesn’t adequately express what "maximum flexibility" implies. For a specification of polyglot markup, which *itself* is a format for "maximum flexibility" (read: maximum robustness), "maximum flexibility" means that the XML-based HTML tool *must* output polyglot markup.

The following reformulation expresses what I mean - hopefully Henri is happy with it too:

   * XML-based HTML tools or systems intended
     for the most general contexts of use cannot
     depend on polyglot input: for maximum flexibility
<ins>when creating polyglot markup with</ins> such tools,
<del>such</del><ins>the</ins> tools should use the technique of using an
     HTML parser that produces an XML-compatible DOM or
     event stream
Comment 2 Jeni Tennison 2013-01-19 21:31:00 UTC
(In reply to comment #1)
>    * XML-based HTML tools or systems intended
>      for the most general contexts of use cannot
>      depend on polyglot input: for maximum flexibility
> <ins>when creating polyglot markup with</ins> such tools,
> <del>such</del><ins>the</ins> tools should use the technique of using an
>      HTML parser that produces an XML-compatible DOM or
>      event stream

The first part of the sentence is talking about tools that are *consuming* markup. I don't think that the TAG intended for these exact words to be used within the Scope section, but if it helps to clarify the intent, it would be better to say:

    ... for maximum flexibility, XML-based tools that consume
    HTML should use the technique of using an HTML parser that
    produces a DOM or event stream that can be consumed as XML.

Also, note the reference to 'authoring tools' in the first bullet point could encompass any HTML-generating tool, though it's not as strong as you'd like (suggesting providing polyglot output as an option rather than saying it *must* be produced).
Comment 3 Leif Halvard Silli 2013-01-19 22:10:26 UTC
I want the spec to say that *if* one produces XHTML syntax for the consumption as HTML, then it is RECOMMENDED for EVERYONE - and not just for people with special tools and special processes - to serve polyglot markup.

Does that request fit into this bug? Or should I file another one? (Both too few and too many bugs are bad …)
Comment 4 Leif Halvard Silli 2013-01-20 00:56:39 UTC
(In reply to comment #2)
> (In reply to comment #1)
> >    * XML-based HTML tools or systems intended
> >      for the most general contexts of use cannot
> >      depend on polyglot input: for maximum flexibility
> > <ins>when creating polyglot markup with</ins> such tools,
> > <del>such</del><ins>the</ins> tools should use the technique of using an
> >      HTML parser that produces an XML-compatible DOM or
> >      event stream
> 
> The first part of the sentence is talking about tools that are *consuming*
> markup.

The problem is that, taken together, *both* bullet points talk about *consuming* markup. When the first bullet point talks about "controlled environments", it has in mind environments where both consumption and output are XML. The second bullet point talks about "uncontrolled" environments, in which it is of course the parsing that becomes the problem.

The problem with the focus on the parsing is that it builds on an idea from the XML-HTML Task Force which assumes that the point of polyglot markup is to create documents that can serve as food for an XML tool chain.

However, that is not the purpose. Being ready to be parsed by such tools is of course one benefit of polyglot markup. But, actually, polyglot markup is more an "HTML-parsing safe" format: a robust format which is guaranteed to survive an XML toolchain with a more or less dubious HTML preparser.

The other important purpose of polyglot markup is simply to survive Web browsers’ parsing: the very basic rules, such as that the DOCTYPE is required, ensure that no quirks mode is triggered, and thus that elements are treated the same way by both XHTML and HTML parsers.

> I don't think that the TAG intended for these exact words to be used
> within the Scope section, but if it helps to clarify the intent, it would be
> better to say:
> 
>     ... for maximum flexibility, XML-based tools that consume
>     HTML should use the technique of using an HTML parser that
>     produces a DOM or event stream that can be consumed as XML.
> 
> Also, note the reference to 'authoring tools' in the first bullet point
> could encompass any HTML-generating tool, though it's not as strong as you'd
> like (suggesting providing polyglot output as an option rather than saying
> it *must* be produced).

1) Regarding my "*must* output polyglot markup", please replace it with "outputs polyglot markup". The point I tried to make was that this spec should not become a document that teaches good ways to produce HTML in general. And in particular, it is not this spec's task to tell authors how they can produce something *other* than polyglot markup; such a thing only produces FUD (fear, uncertainty and doubt).

2) This is not really meant as an argument against the TAG text, but in my view, the spec is *already* a bit too concerned with explaining that "polyglot is not for all". Because, yes, it is for all. It is for all who want to produce polyglot markup and have the benefits of that.

3) Regarding "authoring tools": Thanks for pointing that out. When I now reread the first bullet point, I understand that "and for authoring tools" in fact reflects what I myself have claimed earlier in the debate. But the way I see it, e.g. a WYSIWYG authoring tool is more comparable to an "XML tool chain with HTML parser" than to a "controlled environment". (Well, it depends on how the tool handles pre-existing markup, of course …)

4) Regarding your rephrasing of "for maximum flexibility": it flows better than the original text.
Comment 5 Henri Sivonen 2013-01-21 08:17:18 UTC
First, what Jeni said.

Second:

(In reply to comment #3)
> I want the spec to say that *if* one produces XHTML syntax for the
> consumption as HTML, then it is RECOMMENDED for EVERYONE - and not just for
> people with special tools and special processes - to serve polyglot markup.

I think that sort of thing would decrease rather than increase consensus. (I object to that formulation.)

“Produces XHTML syntax” is ambiguous. XHTMLness depends on Content-Type—not syntax. As for syntax, HTML5 deliberately allows XHTMLisms in text/html to ease migration to valid HTML5 from Appendix C-influenced markup. It would be entirely inappropriate to say that if you have some XHTMLisms in text/html, you have to go all the way to polyglot. That would make migration harder—not easier.
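
The point that XHTMLness depends on Content-Type rather than syntax can be illustrated with a small sketch. It assumes a hypothetical consumer, `element_names`, that picks its parser from the media type alone; Python's `html.parser` stands in for an HTML parser and `xml.etree.ElementTree` for an XML parser. Neither is claimed to behave like a browser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records start-tag names in document order (stand-in for an HTML parser)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def element_names(document: str, content_type: str) -> list:
    """Hypothetical consumer: the media type, not the syntax, selects the parser."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        collector = TagCollector()
        collector.feed(document)
        collector.close()
        return collector.tags
    if media_type == "application/xhtml+xml":
        root = ET.fromstring(document)  # raises ET.ParseError if not well-formed
        # Strip the "{namespace}" prefix that ElementTree adds to tag names.
        return [element.tag.split("}")[-1] for element in root.iter()]
    raise ValueError("unsupported media type: " + media_type)


# A small document written so that both parsers accept it.
POLYGLOT = (
    '<!DOCTYPE html>'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>t</title></head>'
    '<body><p>x<br/>y</p></body>'
    '</html>'
)

if __name__ == "__main__":
    print(element_names(POLYGLOT, "text/html; charset=utf-8"))
    print(element_names(POLYGLOT, "application/xhtml+xml"))
```

In this sketch the same document yields the same element sequence under both media types, whereas text/html input with a few XHTMLisms but no well-formedness (for example `<p>a<br>b</p>`) is accepted only on the text/html path.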
Comment 6 Leif Halvard Silli 2013-01-22 15:06:18 UTC
(In reply to comment #5)
>First, what Jeni said.

Since you did not comment on my proposed amendment to the text, I am hereby submitting an updated version:

* Polyglot Markup can be produced by any XHTML or HTML tool
  that adheres to its requirements. As such it is available
  to anyone striving for the robustness of this format.

* Polyglot markup might be simplest to produce in controlled
  environment tool chains and authoring tools. XML-based
  HTML tools or systems intended for the most general
  contexts should, before deciding about output markup and
  for maximum flexibility, use the technique of using an HTML
  parser that produces a DOM or event stream that can be
  consumed as XML.

* Polyglot Markup is particularly suitable if the author wants
  to limit their output to fewer, safer options. Polyglot
  Markup does not aim to be the sole option, but it does aim
  to be the safest and the most robust.

* In addition, as a subset of XML, Polyglot Markup represents
  a target format for XHTML production tools that are to be
  updated for HTML5 conformance through adjustments to their
  XHTML output. At the time of writing, it was the sole such
  XHTML5 subset that had been specified.

> “Produces XHTML syntax” is ambiguous.

If it isn't well-formed, then it isn't XML. I meant that statement *only* about well-formed XHTML documents.


> XHTMLness depends on Content-Type—not syntax.

Ditto for HTMLness.


> As for syntax, HTML5 deliberately allows XHTMLisms in text/html to
> ease migration to valid HTML5 from Appendix C-influenced markup.

Indeed.


> It would be
> entirely inappropriate to say that if you have some XHTMLisms in text/html,
> you have to go all the way to polyglot.

Indeed. I did not intend to say something like that. Overall, I tried to avoid what I perceive that you do not avoid, namely a message on the pattern of "if you have such and such starting point, then you must convert it to this or that flavor of HTML5". All I tried to say was that, regardless of background, polyglot markup is definitely the most sensible variant of XHTML5 to produce, once you have decided to produce XHTML5. (In fact, for now, it is the only description of such an XHTML5 variant.)


> That would make migration harder—not easier.

Any language that hints that polyglot is only suitable for authors that have such and such a starting point decreases agreement, in my view.
Comment 7 Larry Masinter 2013-02-09 01:04:14 UTC
I offer another perspective on use cases for Polyglot which might fit into the "Scope" section:


http://lists.w3.org/Archives/Public/www-tag/2013Feb/0018.html

casts Polyglot as a transition technology which can be Recommended as part of how to transition from XHTML 1.0 (the previous W3C Recommendation for hypertext markup) to HTML5.
Comment 8 Leif Halvard Silli 2013-09-01 20:57:50 UTC
Accepted. I added Henri’s two points from comment 0 more or less literally.

There is now a Scope section within the intro section.

http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html#introduction