W3C

Results of Questionnaire ISSUE-4 & ISSUE-84: HTML5 versioning and doctypes - Straw Poll for Objections

The results of this questionnaire are available to anybody.

This questionnaire was open from 2010-07-22 to 2010-07-30.

12 answers have been received.

1. Objections to the Change Proposal to Allow much more general DOCTYPE syntax to be conforming, possibly including optional version information.

We have a Change Proposal to allow much more general DOCTYPE syntax to be conforming, possibly including optional version information. This Change Proposal would also affect the set of legacy doctypes that are considered conforming. If you have strong objections to adopting this Change Proposal, please state your objections below.

Keep in mind, you must actually state an objection, not merely cite someone else. If you feel that your objection has already been adequately addressed by someone else, then it is not necessary to repeat it.

Details

Responder Objections to the Change Proposal to Allow much more general DOCTYPE syntax to be conforming, possibly including optional version information.
Grant Simpson
David Baron Making varying DOCTYPE syntax conforming will make it more likely that browsers will use variations in DOCTYPE declarations to trigger new mode changes, even if the specification declares this to be nonconforming. I believe authors would be more willing to use apparently-non-proprietary mode changing syntax than clearly proprietary markup or clearly-proprietary comment syntax. Having more modes makes it harder for new entrants in the browser market, as I described in http://dbaron.org/log/2007-03#e20070325a , http://lists.w3.org/Archives/Public/public-html/2007Apr/0279.html , and http://lists.w3.org/Archives/Public/www-tag/2009Aug/0054.html .

Allowing variations in syntax that are conforming, but do nothing, seems likely to encourage authors to believe that those variations will do something. (Witness, for example, the large numbers of authors who change the "EN" within the HTML 4.0 FPI to their own language code.)

Allowing DOCTYPE declarations that can be connected to DTD-based validators encourages people to use inferior DTD-based validation tools rather than a conformant HTML5 conformance checker.
Boris Zbarsky Fundamentally, if UAs treat all HTML versions as the same there is no need for versioning information.

So a proposal to create such information presupposes that UAs will NOT treat versions the same. We have this to a small extent now in terms of quirks mode, and even that is a significant maintenance burden. So in effect, a proposal that we introduce versioning is a proposal that favors big players over small ones in the HTML space: the next obvious course of action is to make sure several incompatible versions of HTML are created and that all need to be supported. This raises barriers to entry into the field, and while it may be good for certain W3C member companies I believe it's very bad for HTML and HTML consumers as a whole.

I can see various scenarios where a producer might want to keep internal versioning information, but I believe this can be put in a comment as well as in a doctype, and in any case ideally should be removed before publishing the content (so that conformance issues don't arise even if such tools want to put the information in the doctype).

While I agree that it's possible that in the future a situation will arise where HTML documents from that point on need to be differentiated from all preceding ones, such a situation has not arisen in practice in the nearly 20 years of HTML's existence. In the unlikely event that it does arise, that would be the right time to consider adding a version indicator. Clearly this would not be done lightly at that point, but neither should changes to HTML that are backwards-incompatible to the extent envisioned in this scenario be undertaken lightly.
Henri Sivonen I strongly object to this change proposal.

A WBS bug prevented me from posting my objections here, so I posted them to http://lists.w3.org/Archives/Public/www-archive/2010Jul/0124.html as advised by the Team Contact. Please consider that message as if the objections in it were stated here.
Toby Inkster
Anne van Kesteren I have the same objections to this proposal as I have had to the ISSUE-88 change proposal to allow something without any observable effect. This will only lead to confusion for authors.

http://www.w3.org/2002/09/wbs/40318/issue-88-objection-poll/results

In addition, this would give the opportunity for Web browsers to fragment HTML which would be a very bad thing for the Web. If we ever need versioning for HTML (and we have not since HTML was invented) we can add it at that point. Designing for a theoretical future is not at all necessary here and only complicates matters.
James Graham I strongly object to the proposal to add versioning information
in HTML. In general terms, versioning on the UA side is bad for
the web, significantly increasing the burden of complexity in new
implementations, and therefore reducing competition in the
browser space.

Versioning is also bad for authors because it means they have to learn
magic incantations to opt into the behaviour they expect, leading
to confusion when things go wrong due to picking the wrong
version. These problems can be seen today with quirks vs limited
quirks vs standards modes.

Given that switching behaviour based on versions is bad for both
end users and authors, we should be designing the language to
make it difficult for UAs to version switch, even if it is in their
interests to do so, because it is not in the interests of the web as a
whole.

Although the change proposal makes some attempt to limit the scope of
when versioning information is permitted and how UAs may react to it,
it seems foolish to rely on both authors and vendors following such a
complex set of rules when the downsides of breaking them are so
great. In particular, if UA vendors break the rules, costs are mostly
in the form of externalities incurred by other vendors, authors and
ultimately end users. We should be working to avoid such things, not
encouraging them by making them easy.

The suggested rules continually refer to "controlled
environments". This implies that the change proposal is mainly
concerned with non-web use cases. I do not believe we should give
priority to non-web use cases particularly when they directly conflict
with the requirements for the web.

I also note that were these rules to be followed, and versioning only
used in non-web applications, several of the supposed benefits of
versioning could not actually occur in the manner described, unless we
believe the argument that an unused syntactic construct is materially
different from one that does not exist. However it is not clear why
this is so; for example one suggested benefit of versioning is
continuity of conformance for existing documents in the face of
changes to the conformance requirements of the language. It is hard to
see how this is realised in the case where there is a versioning
mechanism that a document is not permitted to use, but not in the case
where there is no versioning mechanism. In this sense I believe that
the proposal is internally inconsistent.

This inconsistency would be more troubling if the purported benefits
of versioning were real. However I believe they are not, and agree
with the criticisms that Henri made in his response.

It has been suggested that non-browser UAs, in particular editors (or
"editing workflows"), may benefit from versioning information. However
it is not clear why this would be uniquely important, or even
necessary, metadata for an editor. Such a UA may also want to store
information such as the current cursor position, or details of the
allowed features according to a specific set of browsers that the
author wished to target. I do not think such UAs benefit substantially
from a version indicator and certainly I see no need to raise it to a
privileged position above all other possible metadata.
Jeremy Orlow Backwards compatibility is central to HTML. If your HTML needs to target specific versions of the spec, then we've failed. Adding the notion of versions to the DOCTYPE will simply confuse authors and provides no real benefit as far as I can tell.

If, for some reason, we do need to break backwards compatibility in the future, we should then consider adding a mechanism for citing versions. Until then, the version number will just be noise/overhead/confusing.
Aryeh Gregor First, point #11 of the Rationale suggests that if an incompatible change to HTML is ever needed, adding support for version indicators now will help to avert the cost of switching to the new and incompatible version. But according to the proposal, browsers will treat a versioned doctype the same as an unversioned one, so why not just use the latter exclusively? If versioning ever needs to be added in the future, add it then. Pages with plain <!doctype html> can then be treated as "old" pages, while ones with the new versioning syntax (whatever it is) will be "new" ones.

Indeed, this is exactly what HTTP did. HTTP 0.9 <http://www.w3.org/Protocols/HTTP/AsImplemented.html> had no explicit versioning, and when HTTP 1.0 <http://tools.ietf.org/html/rfc1945#page-12> added versioning, it provided that "If the protocol version is not specified, the recipient must assume that the message is in the simple HTTP/0.9 format." This scheme proved successful in practice, obviously.

It is not explained why permitting versioning now makes later incompatible changes easier. As far as I can tell, this eliminates the rationale for the entire proposal, because it leaves no reason to actually permit explicit versioning now.


Second, the Change Proposal (also in point #11 of the Rationale) states:

"""
In the history of computer languages, there are no languages that have not evolved, been extended, or otherwise "versioned" as long as the language has been in use. This applies to network protocols, character encoding standards, programming languages, and certainly to every known technology found on the web. There are no known cases where a language hasn't gone through some at least minor incompatible change. The standards process is established as a way of evolving specifications and implementations in a way to reduce the likelihood of complete failure to interoperate, but certainly not to guarantee that no incompatible changes will be needed in the future.
"""

I believe this is materially very wrong. HTML, CSS, and JavaScript are all languages that have maintained an extremely high degree of backward compatibility, and in particular have not ever used explicit versioning to date (that is, none that authors or UAs respected in practice, formal talismans notwithstanding).

There are many other languages and interfaces that try to maintain this level of compatibility as well. The Linux system call API, the X Window System, and the Windows system call interface have all kept changes down to a level where they do not require programs or operating systems to advertise particular versions. I cannot think of a single programming language that requires programs to explicitly state what version of the language they're written in. (Perhaps this just shows my limited knowledge of programming languages, but then the original Change Proposal should have given supporting evidence for the claim in the first place.)

Indeed, it's difficult to think of successful technologies that really rely on explicit versioning in practice. HTTP and XML have explicit versions, but in practice they're stuck at 1.1 and 1.0 respectively, and any new features that are adopted in practice are added backwards-compatibly, without incrementing the version number. So although they technically have version numbers, they're not good support for this Change Proposal.

It's true that all languages sometimes need to break backward compatibility a bit, but version indicators are rarely necessary or helpful in doing this. In most cases, such breakage is kept as minimal as possible, so incrementing a version number would be excessive. In some cases, such as Python, there are large-scale backward compatibility breaks, but version indicators are still not used -- because they impose a permanent cost on the language to moderately ease a temporary transition period.

On top of that, even if a few other languages do use explicit version indicators in the real world, they are not in the same position as web technologies. HTML is used by billions of people every day, and there are trillions of web pages in existence, under the control of countless different authors. Browser implementers have consistently emphasized that in order for people to use their browsers, they *cannot* break significant numbers of web pages. It is not a question of whether we want to change HTML incompatibly; it is impossible whether we want to or not, because market forces prohibit it. No reason is given for why this situation would ever change, as long as HTML remains widely deployed.

Thus adding explicit version indicators is not only useless in the event of an incompatible change, as I explain in my first point. It also exemplifies one of the cardinal problems with pursuing long-term theoretical purity over short- to mid-term pragmatism: it burdens the language with extra syntax to avoid a problem that will very likely never exist, so that the cure is worse than the disease.

In summary, the proposal seeks to address a nonexistent problem, and fails to do even that.


On a final and more minor note, I also specifically object to the suggestion that the wording "mostly useless" be changed. It's good to use clear and forceful wording to help get the message out to authors that doctypes have no real purpose on the web, since they don't control browser processing (except by avoiding quirks mode). Weaker wording like "of limited utility" would reduce the effectiveness of this message. We do not need to write the spec in an overly formal fashion if that reduces the effectiveness of its phrasing.
Tab Atkins Jr. Robin Berjon addressed the issue of a version indicator eloquently in his series on "XML Bad Practices", based on his experiences developing the SVG specification: http://berjon.com/blog/2009/12/xmlbp-naive-versioning.html.

If content is expected to be rejected across version boundaries, then "versions" are actually different languages entirely. Pretending that they're just different versions helps no one, and so a versioning indicator is not needed. Luckily, HTML has never fallen into this category and likely never will - the next category is the relevant one here.

If content is expected to be accepted across version boundaries, then good "versioning" behavior must be relatively complex and subtle to correctly handle unknown (read: newly introduced) content. The versioning indicator itself doesn't help in any way with this - what happens if the author says the content is version 2, but uses a construct from version 1 that's obsolete in version 2, and a construct from version 3 that hadn't been thought of at the time version 2 was created? Proper error-handling (for the version1 construct) and forward-compatible design (for the version3 construct) need to be designed into the language in the first place, and an additional version indicator provides precisely zero help here.

Thus, a versioning indicator is never needed. As this appears to be the primary justification for this change proposal, then, I strongly object to changing the language in this way.
Jace Voracek It is a key necessity for the World Wide Web to operate on a basis which does not require version-specific syntax with DOCTYPE for HTML in order to avoid confusion among platforms. Allowing such syntax may break the universal standards of HTML if many different DOCTYPE documents are produced.
David Singer
- HTML and other core specifications of the Web platform are processed in an unversioned way in practice. Having an in-band version indicator is misleading to authors, who expect it to have a material effect, and creates a great deal of misunderstanding.
- It may be tempting for some implementors to use a standard version indicator as an implementation version indicator. For reasons well-explained by Adam Barth, David Baron and others, this would have an anti-competitive effect on the browser market; content would end up locked in to specific bug profiles.
- The proposal not only allows versioning in the doctype but also allows creation of ad-hoc custom doctypes. Profiling the language this way is likely to lead to interoperability problems.
- If versioning is to be supported, versioning the doctype is inferior to versioning via an attribute; other specs use an attribute, XML documents generally would not use a doctype, and the doctype is not viable for compound documents such as Atom that embeds XHTML.
- The change proposal argues that versioning in the doctype may be useful for controlled environments, but on the other hand argues that versioning must be ignored by implementations. If it has an effect, it creates an interoperability problem; if it has no effect, then it is unclear how it helps controlled environments.
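
As an illustration of the HTTP precedent cited in Aryeh Gregor's response above, the following is a minimal sketch of "no version means the oldest format": a request line without a version token is treated as HTTP/0.9, as the quoted RFC 1945 text requires. The names `RequestLine` and `parse_request_line` are hypothetical and chosen only for this example; the sketch does not validate methods or version strings and is not taken from any real server.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Parse an HTTP request line, assuming HTTP/0.9 when no version is given."""
    parts = line.strip().split()
    if len(parts) == 2:
        # Simple-Request form ("GET /path"): no version token present,
        # so the recipient assumes the HTTP/0.9 format.
        method, target = parts
        return RequestLine(method, target, "HTTP/0.9")
    if len(parts) == 3:
        # Full-Request form ("GET /path HTTP/1.0"): version is explicit.
        method, target, version = parts
        return RequestLine(method, target, version)
    raise ValueError(f"malformed request line: {line!r}")
```

The same pattern is what that response proposes for HTML: documents with a plain <!doctype html> would count as the "old" format if a versioning syntax were ever introduced later.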

2. Objections to the Change Proposal to Not Put a Version Indicator in the DOCTYPE.

We have a Change Proposal to not put a version indicator in the DOCTYPE. This Change Proposal would also leave the set of legacy doctypes that are considered conforming unchanged from the current draft. If you have strong objections to adopting this Change Proposal, please state your objections below.

Keep in mind, you must actually state an objection, not merely cite someone else. If you feel that your objection has already been adequately addressed by someone else, then it is not necessary to repeat it.

Details

Responder Objections to the Change Proposal to Not Put a Version Indicator in the DOCTYPE.
Grant Simpson In my academic capacity, I often deal with historical electronic documents. While the DOCTYPE stated in the document is not an ironclad indication of which DOCTYPE the author meant to use (or even that the author knew what a DOCTYPE is), it limits the likely age of the document to no older than the DOCTYPE itself. The existing HTML DOCTYPEs are versioned, which makes this task much easier than it would be if we adopted a non-versioned DOCTYPE. Essentially, if we go with simply <!DOCTYPE html> and continue to do the same for iterations of HTML after 5 (as is implied by the lack of a version number), we greatly increase the complexity of attempting to date HTML documents according to internal criteria. We would merely be saying that this document is HTML 5+, period. Similarly, if we allow HTML 5 documents to use legacy DOCTYPEs, we exacerbate the problem further. I realize, first, that my concerns are not of primary importance to matters of conformance checking and document parsing and, second, that because people regularly select the wrong DOCTYPE (or have it selected for them by WYSIWYG editors), one would have to look to more internal criteria (and, more helpfully, to external criteria) to determine age. However, I do think this is an important concern for people working with historical HTML documents. Heretofore, DOCTYPE has been a serviceable first stab at determining date.
David Baron
Boris Zbarsky
Henri Sivonen
Toby Inkster The proposal confuses version indicators with rendering pragmas.

Allowing authors to include a DOCTYPE of their choice would not require people working on HTML rendering engines and other HTML processors to pay any special attention to them.
Anne van Kesteren
James Graham
Jeremy Orlow
Aryeh Gregor
Tab Atkins Jr.
Jace Voracek
David Singer
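
The dating heuristic described in Grant Simpson's response above could be sketched roughly as follows: look up the public identifier in a document's DOCTYPE and return the year the corresponding specification was published, which gives the earliest year the document could plausibly have been written. The function name `earliest_plausible_year` and the lookup table are hypothetical; the table is deliberately incomplete and is an assumption of this example, not a list taken from the questionnaire.

```python
import re
from typing import Optional

# Publication year of the specification that defined each public identifier.
# Illustrative and incomplete; a real tool would need a fuller table.
EARLIEST_YEAR_BY_PUBLIC_ID = {
    "-//IETF//DTD HTML 2.0//EN": 1995,
    "-//W3C//DTD HTML 3.2 Final//EN": 1997,
    "-//W3C//DTD HTML 4.0//EN": 1997,
    "-//W3C//DTD HTML 4.01//EN": 1999,
    "-//W3C//DTD HTML 4.01 Transitional//EN": 1999,
    "-//W3C//DTD XHTML 1.0 Strict//EN": 2000,
}

# Matches the quoted public identifier of a DOCTYPE declaration.
DOCTYPE_PUBLIC_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE
)


def earliest_plausible_year(document: str) -> Optional[int]:
    """Return the earliest year the document could date from, or None.

    None means the DOCTYPE is missing, unversioned (for example
    <!DOCTYPE html>), or not in the lookup table.
    """
    match = DOCTYPE_PUBLIC_RE.search(document)
    if match is None:
        return None
    return EARLIEST_YEAR_BY_PUBLIC_ID.get(match.group(1))
```

With an unversioned <!DOCTYPE html> the function returns None, which is the loss of dating information the response is concerned about.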

More details on responses

  • Grant Simpson: last responded on 22 July 2010 at 18:31 (UTC)
  • David Baron: last responded on 22 July 2010 at 20:16 (UTC)
  • Boris Zbarsky: last responded on 23 July 2010 at 05:07 (UTC)
  • Henri Sivonen: last responded on 23 July 2010 at 12:19 (UTC)
  • Toby Inkster: last responded on 26 July 2010 at 21:14 (UTC)
  • Anne van Kesteren: last responded on 27 July 2010 at 08:17 (UTC)
  • James Graham: last responded on 27 July 2010 at 11:43 (UTC)
  • Jeremy Orlow: last responded on 27 July 2010 at 14:00 (UTC)
  • Aryeh Gregor: last responded on 27 July 2010 at 18:20 (UTC)
  • Tab Atkins Jr.: last responded on 27 July 2010 at 19:41 (UTC)
  • Jace Voracek: last responded on 29 July 2010 at 18:11 (UTC)
  • David Singer: last responded on 30 July 2010 at 10:11 (UTC)

Everybody has responded to this questionnaire.

