Re: ISSUE-4 - versioning/DOCTYPEs

Boris Zbarsky, Thu, 13 May 2010 21:41:18 -0400:
> On 5/13/10 4:28 PM, Larry Masinter wrote:
>> Boris, please review again the definition for "polyglot" documents:
>> those that can be processed equally as XHTML and HTML, served
>> equally well as text/html and application/xhtml+xml.
> 
> If a document indeed satisfies such a constraint, then editing it as 
> XHTML should work, no?  Leif's complaint was that editing such 
> documents in particular editors doesn't work, right?
> 
>> You may not have a personal interest in serving the community
>> that wants to use such documents, e.g., to be able to interchange
>> between XHTML and text/html but why are you insisting on preventing
>> those who want such a choice from having it?
> 
> Where am I doing that?
> 
>> It's glib to say that "they're just broken", but by what
>> measure are they "broken", exactly? Not meeting your personal
>> requirements?
> 
> I said that an editor that makes a XHTML document that it's editing 
> into non-well-formed XML is "broken".  Do you have a different 
> adjective to describe such an editor?

It is true that KompoZer has a bug in xhtml+xml mode (it basically 
saves xml:lang="*" as lang="*"). That is a bug KompoZer should fix, of 
course. But note that, apart from that specific bug, KompoZer *does* 
produce polyglot/Appendix C XHTML1 syntax regardless of whether it 
edits in xhtml+xml mode or text/html mode. Thus KompoZer takes its 
information about which syntax to follow not from the MIME type but 
from the DOCTYPE (it even creates HTML4 syntax in XHTML mode, provided 
you use an HTML4 doctype).
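
To make the xml:lang bug concrete, here is a minimal sketch of a check for its symptom: elements that carry lang="*" but no matching xml:lang="*". It is not KompoZer code. The function name is hypothetical, and the check assumes the XHTML 1.0 Appendix C advice of using both attributes together.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def elements_missing_xml_lang(xhtml_source):
    """Return local names of elements that have lang but lack xml:lang.

    Hypothetical helper: it assumes the input is well-formed XML and that
    the file follows the Appendix C habit of pairing lang with xml:lang.
    """
    root = ET.fromstring(xhtml_source)
    missing = []
    for element in root.iter():
        if "lang" in element.attrib and XML_LANG not in element.attrib:
            # Drop the "{namespace}" prefix that ElementTree adds to tags.
            missing.append(element.tag.rsplit("}", 1)[-1])
    return missing
```

Because the check parses the file as XML, it only applies to files that are well-formed in the first place.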

And you are also right that, if KompoZer *does* fix the xml:lang bug, 
it *could* start to require that XHTML5 documents be edited in 
xhtml+xml mode and HTML5 documents in text/html mode. Put simply: 
require that HTML5 files have the .html suffix and XHTML5 files the 
.xhtml suffix.

But then, what about XHTML1 files? Should KompoZer also stop creating 
and respecting XHTML1 files? Most XHTML1 files are saved with the .html 
suffix. So if the .html suffix automatically causes KompoZer to turn 
the file into an HTML5 file, I am not so sure that authors will be 
happy. Or should it single out the XHTML1 doctype and treat such a file 
differently from how it otherwise treats files with the .html suffix? 
(That is what it does now, you could say.)

There is of course also a third option: KompoZer always creates 
polyglot syntax, regardless of which MIME type it uses. If I were a 
KompoZer developer, this is perhaps the solution I would find simplest. 
It is, in fact, the most backward-compatible solution, considering how 
KompoZer handles XHTML1 documents. This solution could also work for 
most text and WYSIWYG editors, I think: they could all agree, or be 
asked, to use polyglot syntax.

But as long as this is not the case, that is, as long as some insist on 
using HTML4-like syntax in HTML5 files and non-polyglot syntax in 
XHTML5 files, I think we need a third parameter, an optional DOCTYPE, 
that can inform authors and tools about which syntax flavor a file is 
supposed to use.

An alternative to a "real" doctype, though, is to make it a rule, for 
*tools*, that 
	<!DOCTYPE HTML> and 
	<!doctype html> et cetera
mean that the file SHOULD/MUST be generated with HTML4-like syntax, 
while 
	<!DOCTYPE html> 
means that the file SHOULD/MUST be generated with XHTML5 polyglot 
syntax. (After all, pure XHTML5 does not, in principle, need a doctype, 
so the very presence of a DOCTYPE there could be used as such a 
trigger ...) This would not be backward compatible with current 
versions of KompoZer, but it could work in the future. There are also 
many HTML5 pages out there that use <!DOCTYPE html> without using 
polyglot syntax. Still, I think this could work, if we wanted it to. We 
would then also have to choose whether <!DOCTYPE html> should trigger 
polyglot syntax *only* in xhtml+xml mode or in text/html mode as well. 
I think the latter is best. Does this sound more plausible to you?
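
A minimal sketch of how a tool might apply the DOCTYPE-spelling rule above. Everything in it is an assumption for illustration: the flavor labels and the function name are hypothetical, "et cetera" is read as "any spelling of the short doctype other than exactly <!DOCTYPE html>", and only a doctype at the very start of the file (after optional whitespace) is considered.

```python
import re

# Hypothetical flavor labels; the names are illustrative only.
HTML4_LIKE = "html4-like"
POLYGLOT = "xhtml5-polyglot"
UNSPECIFIED = "unspecified"

# The short doctype in any letter case, at the start of the file
# (leading whitespace allowed).
_SHORT_DOCTYPE = re.compile(r"\s*(<!doctype\s+html\s*>)", re.IGNORECASE)


def syntax_flavor(source):
    """Guess which syntax flavor a tool should generate for this file."""
    match = _SHORT_DOCTYPE.match(source)
    if match is None:
        # No short doctype found: the rule gives the tool no hint.
        return UNSPECIFIED
    if match.group(1) == "<!DOCTYPE html>":
        # Exactly this spelling is the proposed polyglot trigger.
        return POLYGLOT
    # Any other spelling (<!DOCTYPE HTML>, <!doctype html>, ...).
    return HTML4_LIKE
```

Under these assumptions, syntax_flavor("<!doctype html>") gives the HTML4-like label and syntax_flavor("<!DOCTYPE html>") gives the polyglot label, independent of MIME type or file suffix.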

(PS: I am talking about authoring guidelines/rules, not about what is 
*valid*.)
-- 
leif halvard silli

Received on Friday, 14 May 2010 11:53:15 UTC