7546 – "HTML 5" Editor's draft misnamed and suboptimal for HTML content authors unless refactored into HTML (main) and DOM API (appendix).

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	VERIFIED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	pre-LC1 HTML5 spec (editor: Ian Hickson)
Version:	unspecified
Hardware:	All All

Importance:	P2 major
Target Milestone:	---
Assignee:	Ian 'Hixie' Hickson
QA Contact:	HTML WG Bugzilla archive list

Keywords:	NE, NoReply

Reported:	2009-09-08 19:38 UTC by Steven Rowat
Modified:	2010-10-04 13:58 UTC
CC List:	7 users

Description Steven Rowat 2009-09-08 19:38:32 UTC

Abstract:

It's been clearly stated that HTML 5 is being re-specified in terms of the DOM (as opposed to HTML 4). This is a top-level change, and I believe that unless it is managed carefully it will cause serious problems for the entire web, for the following two related reasons:
1. "HTML 5" specified in terms of the DOM becomes in effect a Javascript Implementation Specification, not an HTML specification: calling the current document "HTML 5" is then misleading.
2. This so-called "HTML 5" specification becomes unreadable by average HTML 4 authors unless they can master the DOM, shifting control of the web towards Javascript specialists (who are usually corporate) and away from those who use plain HTML (and are usually individual authors).

Solution:
Refactor the "HTML 5" specification into a self-consistent plain-language HTML section and a DOM (Javascript) appendix.

Details:

The simplest definition of the DOM I can find is this from Wikipedia:

"...the Document Object Model is the way JavaScript sees its containing HTML page and browser state."

http://en.wikipedia.org/wiki/Document_Object_Model

The editor of HTML 5, Ian Hickson, said in an interview:

"The main advantage of defining the HTML DOM APIs and the HTML elements in the same specification is that we don't let stuff fall through the cracks."

http://www.webstandards.org/2009/05/13/interview-with-ian-hickson-editor-of-the-html-5-specification/

However, to me the current interweaving of DOM and HTML in the HTML 5 specification (Editor's draft, August 25, 2009) appears hideously complex to those who do not use the DOM/Javascript (like myself). It thus becomes unreadable by average HTML 4 authors, shifting control of the web towards Javascript specialists.

Based on the above interview, it's probably accurate to say that the reason "HTML 5" has been produced is in order to specify Javascript use by user-agents ('middlemen'), not HTML document production by document authors. It seems worthwhile noting that all the co-chairs and editors of "HTML 5" are employees of the largest software corporations: Microsoft, Google, IBM, and Apple.

Thus it's natural that it has evolved to be readable primarily by those conversant in browser scripting; but this creates a danger that Javascript specialists (corporate), rather than HTML authors (individuals), will now obtain control of the latest developments in web usage.

I do not see this as a good thing. I believe it is a major downgrading of the inclusiveness of the Web.

SUGGESTED SOLUTION:

Refactor the existing "HTML 5" document as follows:
1. Remove almost all references to the DOM from the specification proper, and place them in an appendix, or in a second fully separate section, possibly published separately.
2. Make the leading section, containing new HTML 5 elements and changes from HTML 4, a fully self-consistent document that can be readable and understood by anyone conversant in HTML 4 (or plain language, at best; like the HTML 4 specification). At no place should understanding even of the existence of the DOM, much less its attributes, be presumed.

I will re-quote Mr. Hickson's goal for the HTML DOM:

"The main advantage of defining the HTML DOM APIs and the HTML elements in the same specification is that we don't let stuff fall through the cracks."

It seems feasible to me to achieve this goal by separately specifying everything currently in the HTML 5 DOM, while at the same time not forcing those who don't use it (HTML authors) to wade through it when they don't want or need to.

Otherwise, what falls through the cracks might be the authoring of HTML pages itself. And then, what use would there be for a DOM to manipulate what isn't there?

Steven Rowat

Comment 1 Michael[tm] Smith 2009-09-09 02:26:46 UTC

I changed "disastrous" to "suboptimal" because including words like "disastrous" in bug-report descriptions unnecessarily distracts from the substance of the bug report.

Comment 2 Steven Rowat 2009-09-09 03:28:24 UTC

(In reply to comment #1)
> I changed "disastrous" to "suboptimal"

Yes that seems like a good change, thank you.

Comment 3 Geoffrey Sneddon 2009-09-11 18:57:12 UTC

Do we not inevitably need some sort of infoset to define elements/attributes for both HTML and XHTML, as well as to define the tree the parser creates? I can see no way to define parsing without having some sort of tree model to parse to, and creating an infoset that is not DOM will likely alienate more readers, as many are familiar with DOM through its presence in scripting.

Comment 4 Steven Rowat 2009-09-13 03:58:44 UTC

(In reply to comment #3)
> Do we not inevitably need some sort of infoset to define elements/attributes
> for both HTML and XHTML, as well as to define the tree the parser creates? I
> can see no way to define parsing without having some sort of tree model to
> parse to, and creating an infoset that is not DOM will likely alienate more
> readers, as many are familiar with DOM through its presence in scripting.
> 

You may be correct in all these statements; but, as far as I understand them, they don't seem to relate directly to my suggestion. And I can't help noting that it's somewhat ironic that you are using language (infoset, tree model) that I'm only marginally familiar with to explain why HTML authors will be presented with HTML5 that is largely incomprehensible to them unless they learn new programming languages.

Putting this in my own terms, which have allowed me to write HTML/CSS for many years without the DOM: doesn't a browser encounter code on the page, and then make a model internally and execution decisions based on what it finds in that code? And isn't that code either pure markup, or a mix of markup and scripting (javascript)? 
But in either case, the model the browser makes is just a single model (an "infoset"?). I have no argument with that. And I don't think it relates to the issue raised in this bug, which is that mixing an expanded DOM terminology throughout the specification, in among the markup terminology, is like handing someone a Fortran manual and saying "Oh, by the way, we explained a lot of the issues using Python examples, and mixed them in all the way through. You don't mind, do you?"

And also, after considering it further (before encountering your comment), I realized that this bug is only a symptom of a much larger problem.

So I ask that you indulge me in this slightly awkward situation: I have now spread this issue into another list: it seemed like the TAG list was the correct place to have a discussion at the theoretical level about society's relationship to the new direction taken in HTML 5, and I have posted a longer essay amplifying and extending the ideas from this bug, particularly around monetization of the web. It can be found here:

http://lists.w3.org/Archives/Public/www-tag/2009Sep/0028.html

I suggest that it might help to read that post before further attempts to discuss the issue raised in this bug.

Steven Rowat

Comment 5 Ian 'Hixie' Hickson 2009-09-18 21:48:52 UTC

The "DOM" as used for the definition of HTML5 is primarily an abstract model and is necessary for defining the behaviour of HTML in the face of invalid markup. There's not really much we can do about that.

It doesn't mean you have to use JavaScript. It's no different from the Infoset model used by most XML specs (explicitly or implicitly).

I agree that if you don't understand basic technologies like the DOM, that HTML5 appears complex -- but that's not because the spec talks about things in terms of the DOM, it's because HTML is complex. We wouldn't make it any simpler by using some other model like the Infoset or SAX.

Doing what HTML4 did isn't an option either. HTML4 got around this problem by being so vague that it failed to define most of the behaviour of HTML.
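
To make the point about invalid markup concrete, here is a minimal Python sketch of why a tree model is needed before one can say what broken markup "means". It is not the HTML5 parsing algorithm. The Node and TreeBuilder classes, the parse() helper, and the single recovery rule (a new <p> closes an open <p>) are assumptions made for illustration; only html.parser.HTMLParser comes from the standard library.

```python
from html.parser import HTMLParser


class Node:
    """A hypothetical, minimal element node: a tag name plus children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # Node instances or text strings

    def outline(self, depth=0):
        """Return the subtree as a list of indented lines."""
        lines = ["  " * depth + self.tag]
        for child in self.children:
            if isinstance(child, Node):
                lines.extend(child.outline(depth + 1))
            else:
                lines.append("  " * (depth + 1) + repr(child))
        return lines


class TreeBuilder(HTMLParser):
    """Builds a Node tree from markup, tolerating two kinds of errors."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Assumed recovery rule: a new <p> implicitly closes an open <p>.
        if tag == "p" and self.current.tag == "p":
            self.current = self.current.parent
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag name.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        # A stray end tag (no matching open element) is ignored.
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data.strip())


def parse(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


if __name__ == "__main__":
    # Unclosed <p> elements and a stray </b>: invalid, but still a tree.
    print("\n".join(parse("<p>one<p>two</b>").outline()))
```

Run as a script, this prints a #document node with two sibling p children holding 'one' and 'two'. Whether two paragraphs is the "right" answer for that input can only be stated in terms of the resulting tree, which is the role the DOM plays in the specification; no scripting is involved in producing it.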

Comment 6 Steven Rowat 2009-09-20 01:12:27 UTC

(In reply to comment #5)

> I agree that if you don't understand basic technologies like the DOM, that
> HTML5 appears complex -- but that's not because the spec talks about things in
> terms of the DOM, it's because HTML is complex. We wouldn't make it any simpler
> by using some other model like the Infoset or SAX.

We agree it's complex. You imply though that it's fixed; there's no choice. But there's always a choice (which is part of the curse of being human). 

And I think this choice actually depends on "complex for whom", which is the point of this bug and which I do not feel you have addressed adequately enough to close the bug. I just posted a bit more about this in the "Complexity" thread here:
http://lists.w3.org/Archives/Public/public-html/2009Sep/0813.html

I'll repeat the relevant section:

"To understand what are acceptable amounts and types of complexity, 
from my perspective it seems useful to see what happens if we first 
split "regular web authors" into:
     a) people who produce their own content that they wish to have 
distributed (or sold) via web pages, versus
     b) web-page coders who code to support other people's content 
(usually as a profession).

After doing this, it appears to me that the complexity of HTML5, 
relative to HTML4, is a significant new burden on the "a" group, with 
the result that HTML5 could cause a large shift in HTML authoring 
towards the professional coders."

Why this is a problem is, I think, best demonstrated with use-cases of individual content-authors who might be affected; I have been working on ten such, with their rights/commerce preferences (i.e. metadata needs), and will post them in the near future.

Comment 7 Ian 'Hixie' Hickson 2009-09-22 10:41:45 UTC

I disagree with the premise of this bug report. The spec isn't complex intentionally, it's complex because the platform it describes, which by and large predates the spec, is complex. There's nothing that can be done about that without sacrificing fundamental goals like comprehensive precision in the description of the platform to get full interoperability without reverse engineering.

If you disagree, please escalate this to the chairs.

Comment 8 Steven Rowat 2009-09-22 18:35:30 UTC

(In reply to comment #7)
> I disagree with the premise of this bug report. The spec isn't complex
> intentionally.....
> 
> If you disagree, please escalate this to the chairs.

I do not believe the spec is complex intentionally; that is not a premise of the bug report. Different people have different goals and skills; I'm pointing out that the skills and goals of those described in the 10 use cases, which I've just posted to the TAG [1], are different from those being served by HTML5. I do not believe this comes about by nefarious intention; rather by different belief systems, knowledge, and experiences in society. What may seem like an advance to one set of people in HTML5 will effectively block another set of people. I'm attempting to give them (and myself, who have been one of them at times) a voice in the decision.

In the 10 use cases discussion, which I researched after originally posting this bug, I've come to the point of view that almost nothing can be done in the current web architecture to fix this problem as I see it. In Van Jacobson's terminology, the point-to-point architecture currently used is "disenfranchising creators" [1,2].

So I agree now that your new WONTFIX is probably the best solution for this bug at the present time. 

> If you disagree, please escalate this to the chairs.

And therefore escalation to the chairs would be inappropriate. I believe escalation to the TAG is appropriate; hence essay [1].

Finally: in that essay I propose that W3C begin studying and thereafter possibly actively moving to the CCN model [1]; and part of that proposal is that appropriate hooks or enablers be placed into HTML5 so the back-compatibility from CNN can be as smooth as possible. If the move to CNN comes to pass, or seems increasingly likely to via a policy decision of the TAG, then it would be appropriate for this bug to be reopened and then fixed by such co-ordination.

References:

[1] "Ten Use-Cases of Individual Content Authors Requiring Rights/Commerce Metadata: success in HTML4; HTML5; and CCN"
http://lists.w3.org/Archives/Public/www-tag/2009Sep/0055.html

[2] Interview with Van Jacobson, 2009: content-centric networking
http://mags.acm.org/queue/200901/

S.R.

Comment 9 Maciej Stachowiak 2010-03-14 14:50:51 UTC

This bug predates the HTML Working Group Decision Policy.

If you are satisfied with the resolution of this bug, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
  http://dev.w3.org/html5/decision-policy/decision-policy.html

This bug is now being moved to VERIFIED. Please respond within two weeks. If this bug is not closed, reopened or escalated within two weeks, it may be marked as NoReply and will no longer be considered a pending comment.

Comment 10 Steven Rowat 2010-03-14 22:13:48 UTC

(In reply to comment #9)
 
> This bug is now being moved to VERIFIED....[snip]...it may be
> marked as NoReply .

I stand by the conclusion I reached in Comment #8 above.

As such, and in accordance with your new Working Group Decision Policy,

http://dev.w3.org/html5/decision-policy/decision-policy.html

it seems best to choose the NoReply option. I believe HTML 5 is barking up the wrong tree, but based on my experience so far, I don't think escalating these ideas within the HTML 5 Group will bring any changes; these will have to come at the TAG level or by the response of the Internet itself during attempted implementation, or via HTML 5 being superseded by CCN or a similar structural change.


But: congratulations on the writing of the 'HTML Working Group Decision Policy', which is clear, concise, and understandable by non-experts. Would that the HTML 5 spec could achieve this! 


(Final aside:
I would like to point out the two typos in the final paragraph of Comment #8, where I incorrectly wrote 'CNN' instead of 'CCN'.)

Comment 11 Maciej Stachowiak 2010-03-15 06:52:51 UTC

(In reply to comment #10)
> (In reply to comment #9)
> 
> > This bug is now being moved to VERIFIED....[snip]...it may be
> > marked as NoReply .
> 
> I stand by the conclusion I reached in Comment #8 above.
> 
> As such, and in accordance with your new Working Group Decision Policy,
> 
> http://dev.w3.org/html5/decision-policy/decision-policy.html
> 
> it seems best to choose the NoReply option. I believe HTML 5 is barking up the
> wrong tree, but based on my experience so far, I don't think escalating these
> ideas within the HTML 5 Group will bring any changes; these will have to come
> at the TAG level or by the response of the Internet itself during attempted
> implementation, or via HTML 5 being superseded by CCN or a similar structural
> change.
> 

Since you're here and available to reply: If your position is that you do not wish to pursue the issue further, but wish to register your disagreement for the record, then the best thing to do would be to put the bug in CLOSED and add the Disagree keyword. NoReply is intended for cases where the originator of a comment is not available to respond in a timely manner. We'll use it if we have to, but a proper reply is preferred, even if it is one of disagreement.