TAG telcon -- 01 Oct 2009

Admin

NM: Next call 2009-10-08, scribe-designate is DC
... We are still waiting on DC for f2f minute cleanup for 23 and JAR for 25

HTML 5 review

<noah> http://www.w3.org/2001/tag/2009/09/HTMLIssuesRevised.pdf

LM: Let's look for things we can conclude

HST: I explored some background wrt this topic

<masinter> it's ok, i hope we eventually get to all

HT: I checked whether a relationship is intended between document conformance and processor error recovery. The answer is no.
... For a binary attribute, an unexpected value raises no error condition and triggers no recovery. This allows for extensibility.

HT: So, not a high priority

<DanC> (huh? for binary, the spec determines a "true" or "false" value for every case, no?)

<ht> Yes, DC, but there is no error recovery required

<ht> No error is even detected

<jar> because there are no errors in that case?

<DanC> well, returning "true" or "false" is error recovery, no?

<ht> What I mean is there is nothing in the spec of the form "recover by ...." or any indication wrt the processor that anything unusual has happened at all
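
A minimal sketch of the point under discussion, assuming the "binary attribute" is an HTML boolean attribute such as disabled. It uses Python's standard html.parser; the BooleanAttrReader class and the sample markup are illustrative and were not shown on the call. A consumer that treats mere presence as "true" reaches an answer for every value and never signals an error, even for a value that a conformance checker would reject.

```python
from html.parser import HTMLParser


class BooleanAttrReader(HTMLParser):
    """Illustrative consumer: records whether each <input> counts as disabled."""

    def __init__(self):
        super().__init__()
        self.disabled = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # Presence alone decides the result; the value is never inspected,
            # so an unexpected value cannot raise or trigger any recovery step.
            names = {name for name, _value in attrs}
            self.disabled.append("disabled" in names)


reader = BooleanAttrReader()
reader.feed(
    '<input disabled>'
    '<input disabled="disabled">'
    '<input disabled="banana">'  # not a conforming value, still read as true
    '<input>'
)
print(reader.disabled)  # [True, True, True, False]
```

This matches both readings in the exchange above: the consumer always yields true or false, yet nothing in its logic is labelled as error handling.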

<Zakim> masinter, you wanted to say the purpose of a mime type is to tell the receiver what the sender intended when the sender sent the message with the mime type label

Relationship between the DOM and its HTML serializations

TBL: This sounds like the fact that the parsing of XML and HTML are different
... even if you parse the same document in the two different ways you may get different DOMs
... e.g. the insertion (or not) of a tbody
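
The tbody difference can be sketched as follows. This is an illustration only: the XML side uses Python's standard xml.etree.ElementTree, while add_implied_tbody is a hypothetical helper that imitates the effect of an HTML parser on this one input. It is not the HTML 5 parsing algorithm.

```python
import xml.etree.ElementTree as ET

SOURCE = "<table><tr><td>x</td></tr></table>"


def xml_children(markup):
    """Parse as XML and return the tag names of the root's direct children."""
    return [child.tag for child in ET.fromstring(markup)]


def add_implied_tbody(table):
    """Hypothetical helper: wrap bare tr children of a table in a tbody,
    imitating what an HTML parser does for this simple case."""
    rows = [child for child in table if child.tag == "tr"]
    if rows:
        tbody = ET.Element("tbody")
        for row in rows:
            table.remove(row)
            tbody.append(row)
        table.append(tbody)
    return table


xml_view = xml_children(SOURCE)
html_like_view = [child.tag for child in add_implied_tbody(ET.fromstring(SOURCE))]

print(xml_view)        # ['tr']
print(html_like_view)  # ['tbody']
```

The same characters therefore give two different trees, so a script or stylesheet that selects "table > tr" behaves differently depending on which parser was used.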

<noah> Anyone else remember the nuances of what was intended by this one?

<noah> If we can't get agreement on what this issue was, and why it's high priority, I'm going to pick another.

<DanC> (I wonder if the <tbody> thing is in the appendix C guidelines http://www.w3.org/TR/xhtml1/#guidelines ...)

<DanC> (yes... it is, briefly... "e.g. the tbody element within table")

HT: I think this has to do with "what is the spec about"? Is it the semantics of the DOM? The serialization? At the moment it's about both.

JAR: And about the APIs?

<ht> scribenick: ht

LM: There were once two documents, HTML and the DOM, but the interactions were sufficiently complex that they've now been merged
... but the result has sort of lost the definition of the HTML language
... which leads us back to the desire to separate out the language definition

LM: I have some sympathy wrt this because browsers are in fact based on the DOM API

<timbl> Browsers -- code in general which treats HTML or XML should be written in terms of the DOM, of course, not the serialization.

HT: Yes. The argument that document.write makes it difficult to separate behavioral and static aspects of the spec is a sound one. Until we find "allies" and reach partial consensus on a version that does not support document.write, at least in the context of live scripting (at parse time)...
... Script outside the head.

<Zakim> masinter, you wanted to say 'not possible' is wrong; 'difficult' perhaps and to point out also about security contexts which don't allow the things that also give difficulty

LM: I think it is possible to define a language and DOM w/o document.write
... and then add document.write as a feature
... There are also security contexts in which document.write is not allowed
... They would benefit from a clean definition of the language

NM: When?

<DanC> (html mail is an example)

<timbl> Suppose we recognize that document.write() is a dirty level-breaker and those who enter abandon hope of simplicity or wide interoperability. And we promote the alternatives -- often document.write is used when a cleaner alternative exists, or should be designed.

<noah> Thanks dan, that's what I expected to hear.

LM: Mashups

TBL: document.write is a classic piece of dangerous engineering -- distrust self-modifying code -- we could divide the cases into reasonable ones, for which a safer, cleaner alternative is provided, and bad ones
... innerHTML=... [??? a better alternative ???]

<DanC> (raman pointed me to a script loading technique that didn't use document.write(); I think I lost track of it before I managed to study it closely.)

TBL: In contrast, squeezing in a new script element via document.write in order to get around the lack of dynamic script loading

<DanC> tim said .becomes() is better than .innerHTML =

TBL: We could promote the alternatives

<Zakim> DanC, you wanted to note (a) a streaming subset of HTML that hsivonen found/implemented that's pretty interesting and (b) remaining open issues in specifying document.write() and

DC: Separating out document.write has been asserted to be hard, but Henri Sivonen allowed document.write in a one-pass HTML parser -- that could/should be promoted
... People talk as if the HTML 5 parsing algorithm is done, but it isn't -- when Henri implemented the published algorithm in the Mozilla code carcass, it doesn't go back and reparse as often as the browsers actually do, and this causes interop bugs

TBL: Example?

DC: Heuristics for distinguishing actual comments from JS inside comments
... There is a list of 3 problems from Henri [ref?]
... This is classified as 'needs research' in Hixie's queue

DC: Discussion to date by WHATWG has not reached a good conclusion

<DanC> re fixes re reparsing: "Wow. Please can we stick to just the current magic escapes and not add even more magic?" -- Ian Hickson http://lists.w3.org/Archives/Public/public-html/2009Aug/0617.html

<DanC> (re streaming subset... looks like hsivonen doesn't support it as a feature, but he sketched it in http://lists.w3.org/Archives/Public/public-html/2009May/0582.html )

NM: I think it's true in principle that documenting a language w/o document.write would be a good thing
... but in practice it is widely used, and insofar as HTML 5 is about what browsers actually do
... we have to document the true complexity of the current situation
... So either someone has to demonstrate that layering will achieve this
... or we will just have to write two documents

NM: And I think the first is unlikely to succeed
... because of asynchrony, for instance

<jar> maybe some macrology... generate multiple docs from single source

NM: So, without a sketch of what a declarative story would look like, I don't think we can take this forward

<noah> Specifically, I think the challenge is to find someone who can do a more declarative exposition of all the logic they now have, including document.write

<noah> Note to Henry: gee, the DOM/document spec split reminds a lot of the difficulties in writing a convenient specification for XSD at both the serialization and component level

<Zakim> masinter, you wanted to ask to separate editorial vs. normative requirements issues

LM: The definition of interop used in the HTML 5 discussion is pretty different from what we're used to -- it's "works the same" rather than "as specified"
... For example, different chrome and different behaviour of the back button

[scribe is lost, invites LM to write this down]

<noah> [fwiw, I'm finding it hard to follow Larry too]

LM: Roy F.'s argument is that many HTML applications don't have a DOM, and there's no place to put the conformance requirements on those applications

LM: so all you have is a definition of conformant documents, and a definition of conformant DOM-based consumers, but no definition of conformant processors which are not DOM-based

<jar> The conformance section of HTML5 explicitly acknowledges non-browser (and I think non-DOM) consumers

<Zakim> timbl, you wanted to say one could rough out a very conservative version of the language, for example which just doesn't have scripts or doesn't have document.write(). and to say one

TBL: What about "if in doubt, throw it out" -- that is,
... not looking for biggest language you can define

TBL: I don't use script (or tbody) or document.write, so life is simpler for me
... And we could document how simple it gets
... Having specs for docs and for behaviour isn't crazy, as long as you can show they match up and give a coherent protocol

TBL: It would be nice to show that relationship holds mathematically

<noah> I think Tim missed my point, which is that these folks will want to document how browsers work. I never questioned that smaller languages could be documented. I questioned whether anyone would invest in maintaining that spec in duplicate with the full browser spec.

NM: There seems to be a consensus in the community that a documentation of current browser behaviour is needed

<DanC> masinter, re proxies etc., they can treat "DOM" as "abstract syntax" or "data model" or "infoset", as sivonen argued in http://lists.w3.org/Archives/Public/public-html/2009Aug/1322.html

NM: If that's going to be done, and done well, that leaves the simpler cleaner subsets

TBL: What about a profile?

NM: It's hard to profile the non-declarative form of the current spec.

TBL: But if you just say "look at that spec., don't use document.write, use tbody, then things are much better"

NM: Maybe, but that wouldn't be the same as a much simpler spec. written explicitly to define the simpler language
... which defines the two simple trees (doc and DOM) and a clean non-procedural mapping between them

<DanC> (yes, it does seem to me that the goal of HTML 5, to specify every case of how scripts work, involves solving the halting problem)

<Zakim> noah, you wanted to ask, with chair hat on, is this discussion on a path to useful input to the HTML 5 group, or at least articulating our concerns more clearly and accurately?

NM: How is this feeding into helping the HTML 5 WG?

<Zakim> timbl2, you wanted to say to Larry (a) it is in general correct and normal to spec the semantics of a language in terms of the abstract model not the syntax, for XML and RDF for example

<Zakim> masinter, you wanted to point out that the goal has been to accomplish something that no standard in the history of standards, much less in the history of computer software

DC: I think if you look at the DOM as abstract syntax, similar to XPath data model or infoset, then the "agents that are not DOM-based lose" argument collapses

LM: I'm not sure the consensus is ...
... The group is chartered to produce an incremental update of the HTML language and the DOM, but not necessarily a browser implementation spec.

<noah> Is Larry's characterization of the HTML WG charter correct? I took it as implicit, and perhaps explicit, that when WhatWG teamed up with W3C, that a user agent spec was very much a goal. Am I wrong?

LM: The intent to write a standard which matches exactly and precisely what is implemented has no precedent
... Maybe a C language spec.

<noah> Class loaders in Java?

<DanC> a user agent spec isn't explicit in the charter. (http://www.w3.org/2007/03/HTML-WG-charter.html )

<masinter> http://krijnhoetmer.nl/irc-logs/whatwg/20090927#l-459

LM: The desire to specify exactly what browsers do moves you from language description to algorithms to describe behaviour
... This seems different from any spec. I'm familiar with

HT: Interestingly, the XML grammar simultaneously defines the XML language both before and after entity replacement. Trying to do something like that for HTML, in order to make the treatment of load-time scripting and document.write at least partly declarative, is an interesting challenge. Then there is Noah/Raman's question: even if we could, would it have impact on the working group?

HT: We would have to convince them that this would lead to a better way of achieving their existing goals. Seems unlikely.

<noah> I agree with Dan; that bit of the XML spec is tricky, not a model for future work.

<timbl> DanC, in what way ... what params, or whether a char stream or byte stream, etc.?

<DanC> char stream, or dictionary of entity_name->char_stream, etc.

<DanC> (noah and timbl are responding to an unminuted comment that I made: I was never happy with that aspect of the XML spec... it's not at all clear what the input to an XML parser is )

NM: http://www.w3.org/2007/03/HTML-WG-charter.html

"A language evolved from HTML4 for describing the semantics of documents and applications on the World Wide Web. This will be a complete specification, not a delta specification."

NM: The history of all this is unusual
... The WHATWG perceived a gap in what the W3C was doing
... I've always understood that their goal was to document what was required to build the next Mozilla or Opera or Chrome
... not at the superficial level
... but in terms of how the language and the DOM and CSS combine to do the right thing

NM: That project attracted support, and we can't ignore that
... saying the world doesn't or shouldn't care about that project is pointless -- that train has left the station
... Our only possible way forward is to show that an alternative, layered, approach actually achieves their goals, and has other benefits

LM: I disagree that the train has left the station -- even if this project goes ahead unchanged, it's still appropriate for the TAG to describe what should have happened

<noah> I think this spec. will have more direct impact than any other spec the W3C produces in this period of several years. I want to put a top priority on helping it, or making it better. Just saying why it's wrong or could have done better, even if true, strikes me as a much lower priority for now.

LM: Doing more than just hand-waving is important, to delve into our concerns
... DC may think that converting DOM-construction specifications into more declarative ones isn't difficult, but it seems to make backing off of some aspects much much harder

<DanC> ("difficult" I don't disagree with. The claim [I heard] was "does not apply to non-DOM-based agents")

<noah> I think the future will be reached by evolving from whatever winds up in this spec. Maybe or maybe not getting someone to write a clean spec in parallel will have good influence as well, but the XHTML experience makes me pessimistic.

NM: So, suppose we agreed with all you said, what should we do?

<Zakim> timbl, you wanted to point out the future is longer than the past

LM: Historically outputs for us include Notes, Findings and WebArch. . .
... advice to members, or to the Team
... we could encourage the production of another spec. with better properties

TBL: In one sense it's left, but it has a long way to go in the future
... Pointing out that a simpler language has good properties might stimulate interest
... A book might get written without our help

<masinter> I think the concern is around allowing use of the same spec with different normative requirements for different applications, for example, those that don't have DOMs

TBL: There will be vast numbers of books about HTML 5 -- the spec. is not the last thing that will happen
... and more specs will come

<noah> FWIW, while I appeared to disagree with Tim earlier, I agree completely with what he's saying here. Pointing out the advantages of cleaner and/or smaller formulations is a good thing to do.

TBL: I would like to find things to point to which say that BNF is better than C code

<noah> I don't think it fulfills our responsibility to help the larger spec be more successful too.

TBL: Prioritize specific feedback requesting changes to the document
... For example, the Data section should be removed

<masinter> looking for "From Roy Fielding Wed 8/26/2009 10:23 AM"

LM: Maybe we should start a document: Summary of TAG issues/discussions wrt the HTML 5 spec. . .
... Questions/issues/places where we're uneasy/would like further work
... Could even become a companion piece

<masinter> that's what you were asking for, Noah, is "what are we going to do next"?

NM: Not ready for that kind of structuring question, still aiming for specific feedback to the HTML WG on specific issues

LM: Are we procedurally blocked, or do we disagree about the technical issue?
... Is there something which in principle we all agree would be a good thing?

TBL: Yes, removing the Data section

<DanC> and what position are you asking whether I agree to?

<masinter> DanC: on Issue 11 about the spec being based on the DOM

NM: Back to issue 7

<masinter> sorry 7, not 11

<noah> Still on issue 7 :-)

TBL: Proposal: we should ask that the spec. specifies a set of constraints under which one can publish XML to be read as HTML [without loss]

<masinter> I'm concerned about normative requirements for non-DOM-based HTML consumers

<noah> Larry, we'll get to non-Dom issue in a minute. I heard you say that, and several others disagree.

TBL: Anyone disagree?

DC: I do

JAR: I would like to know how that would be different from XHTML5

<DanC> TimBL asked whether I thought this would be a useful comment. I think (a) the spec already allows it and (b) the resulting discussion would be anything but useful.

TBL: If you mean the XML serialization of HTML5, the problem is that goes through a different path

<jar> must be quarantined under a different media type

NM: So, one of your constraints would rule out document.write ?

TBL: Typically, yes

<DanC> the technical substance of TimBL's request was achieved when Sam Ruby convinced Ian Hickson to allow <br />

TBL: One possibility would be don't use scripts
... or, scripts OK, but document.write out
... [another one, scribe missed]

<masinter> I think Roy has the most cogent argument about this, suggest reading the discussion

<noah> Link?

<masinter> http://lists.w3.org/Archives/Public/public-html/2009Aug/1298.html

HST: So I heard LM to say that the output of our work need not be entirely, or even mostly, feedback to the HTML WG -- even if we don't feed back to them on certain issues, recording our discussions and conclusions would be valuable in any case

TBL: Feedback to the WG is the most important thing

LM: Understood -- just that we may not feed back on everything we've talked about

NM: For the future, and for other specs, setting out what we've learned is valuable even if it doesn't affect the HTML 5 spec.
... but I reiterate that the HTML 5 spec. is going to have a huge impact, and we need to focus on helping to make it better
... It will be the basis on which the future is built

<masinter> i don't understand why you think I disagree with you

NM: There are times when writing the clean spec. and getting away from the old cruft is good
... but we can't shirk the responsibility to help make this spec that they are writing better
... We've asked them to specify what they call an Authoring spec.
... TBL has suggested another request, that they specify another thing, a clean subset

<masinter> noah: who are you arguing with?

TBL: Could we get the TAG to resolve that there should be a document subset of HTML that can be serialized as XML?

<masinter> +1, although I'd like something stronger

Noah: Please type in IRC whether you agree with the following proposal: that there should be a documented subset of HTML that can be serialized as XML

<DanC> -1; it's not clear how this is asking for something that's not already in the spec

<jar> jar defers to ht. can't fathom the implications

<masinter> although the discussion at F2F on alternate serializations of XML infoset are interesting

<masinter> and the MS proposal on namespaces in HTML is interesting

<masinter> everyone see that?

TBL: +1

<DanC> (ht is no longer on this call, jar)

<noah> +1 Yes in principle; still on the fence as to whether asking the HTML WG to do this now is on balance the right way and the right time

<jar> oh foo.

<jar> +0 I'd like to understand better. I'm being dense

<timbl> Jar, there is a statement like "If you use doctype HTML and you use tbody and you don't use document.write(), then you can serialize as XML and label as HTML"

DC: I think this may already be there, in two ways 1) xml serialization is defined and 2) lots of XML docs match the HTML grammar

<DanC> (yes, that's what I said, though there's not actually a grammar)

NM: Fair enough... the implied grammar :-)

DC: yes it does

TBL: Really?

<masinter> http://lists.w3.org/Archives/Public/public-html/2009Aug/1184.html

<timbl> So which XML docs don't match the HTML grammar, dan?

<DanC> the technical substance of TimBL's request was achieved when Sam Ruby convinced Ian Hickson to allow <br />

DC: Yes

<masinter> "removal of the permission for sending syntactic profiles of XML as text/html. "

<img />?

<timbl> masinter, is that an issue ?

Time check: 5 mins to go.

<DanC> (well, Ian Hickson's goal.)

masinter: There is an issue in HTML WG to update the registration of text/html, to remove the permission to serve XML syntax as text/html

TBL: Open, closed or rejected issue?

LM: Closed.

TBL: Dropped?

DC: You sure it's closed?
... Issue 53 is pending review.

<DanC> http://www.w3.org/html/wg/tracker/issues/53 ISSUE-53

<DanC> mediatypereg Need to update media type registrations

<DanC> State: PENDING REVIEW

LM: We can have influence by offering opinions on issues not yet closed in HTML WG

1 minute

<DanC> (entertaining specific proposals does seem to be productive)
