TAG Sep 2008

23 Sep 2008



Chair: Stuart Williams
Scribe: DanC, jar


<fixup> Date: 23 Sep 2008

<fixup> Scribe: DanC

Convene, review agenda

SKW: agenda changes?

AM: how about metadata?

SKW: under URNsAndRegistries-50?

JAR (and AM): probably won't fit there.

SKW: ok... will try to fit it somewhere

AM: or maybe discuss it over lunch.

FTF meeting schedule

"proposal for a TAG F2F meeting Weds-Fri 10-12 December 2008 in Cambridge Mass, USA."

-- http://www.w3.org/2002/09/wbs/34270/200812-F2FDecision/results

SKW: there was a question as to why end of week [Wed-Fri as opposed to Tue-Thu]
... I think it was due to an XML conference

HT: that XML conference has no technical track; I don't plan to go

TBL: how about 9-11 Dec?

several: that would be better for me

PROPOSED: To hold a TAG meeting Tue-Thu 9-11 Dec 2008 in Cambridge, MA, USA, TBL/W3C to host.
... 3 full days

TBL: or noon to noon?

PROPOSED: to hold a TAG meeting Tue-Thu 9-11 Dec 2008 in Cambridge, MA, USA, TBL/W3C to host, ending at 3:30p Thu

RESOLUTION: to hold a TAG meeting Tue-Thu 9-11 Dec 2008 in Cambridge, MA, USA, TBL/W3C to host, ending at 3:30p Thu

TVR abstaining

<DaveO> probable regrets for in person attendance at dec f2f

TAG at TPAC 2008

SKW: I asked chairs whether there was interest to meet with TAG members during the TPAC...
... I won't be there.
... I've seen 2 responses:
... WebApps WG on the topic of URI Schemes for Widgets
... WAI-PF invitation to observe

NM: I'm likely to be available to meet with WebApps
... there will be a room for a TAG meeting?

SKW: yes, there's meeting space reserved for TAG
... ok... so it looks like there's interest from TAG members to meet with WebApps re URI schemes


SKW: The OASIS XRI TC is asking whether their present direction is something the TAG thinks is promising.

discussion of communications between TAG chair and XRI TC chairs...

SKW: examples [of base URIs]: http://xri.net/ or http://boeing.com.xri.net or http://xri/ or http://boeing.com.xri/

NM: an important point is: these =xyz things... they aren't expected to be recognized out of context, but treated as a relative URI, and the XRI-related policy stuff would be communicated in the usual top-down ways

TBL: do they still expect to use this [?] in OpenID?

HT: I expect so, as that's a major use case.
... yes, this looks promising, but the devil will be in the details...

<noah> Right. What I said was: I would find it unacceptable if anyone proposed that the =xxx syntax was supposed to be recognized as an XRI in an arbitrary URI [context]; what I think is being proposed, and what's OK with me, is that URI http://example.com/=XRISyntaxHere is recognized as an XRI if and only if example.com says that they have used their URI space in this way.

HT: the XML base rules are specified in some detail by specs from the XML Core WG, and unless this proposal plays by those rules exactly, we'll have to discuss it in detail
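
A minimal sketch of the reading Noah describes, in Python: the =xyz string is treated as a relative reference and resolved against a base URI, and the result counts as an XRI only when the host has said it uses its URI space that way. The set HOSTS_DECLARING_XRI, both function names, and the "/=" path test are hypothetical and exist only for illustration; the base URIs are the examples SKW listed. It uses the generic resolution in urllib.parse, not the XML Base rules HT mentions.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical: hosts that have declared they use their URI space for XRIs.
HOSTS_DECLARING_XRI = {"xri.net", "boeing.com.xri.net"}


def resolve_xri(base: str, xri_ref: str) -> str:
    """Resolve an XRI-style relative reference (e.g. '=xyz') against a base URI."""
    return urljoin(base, xri_ref)


def is_recognized_as_xri(uri: str) -> bool:
    """True only if the URI's host has declared the XRI convention.

    Assumption: only the '=' form discussed in the minutes is checked.
    """
    parts = urlsplit(uri)
    return parts.hostname in HOSTS_DECLARING_XRI and parts.path.startswith("/=")


print(resolve_xri("http://xri.net/", "=xyz"))
print(is_recognized_as_xri("http://xri.net/=xyz"))
print(is_recognized_as_xri("http://example.com/=xyz"))
```

Under these assumptions the same =xyz string is an XRI under http://xri.net/ but an ordinary path segment under a host that has made no such declaration.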

SKW: any discussion against? no? ok... I'll take the ball...

<scribe> ACTION: Stuart encourage discussion of proposal around http://xri.net/ or http://boeing.com.xri.net or http://xri/ or http://boeing.com.xri/ in www-tag

<trackbot> Created ACTION-174 - Encourage discussion of proposal around http://xri.net/ or http://boeing.com.xri.net or http://xri/ or http://boeing.com.xri/ in www-tag [on Stuart Williams - due 2008-09-30].

SKW: the XRI TC is concerned that tracking this under URNsAndRegistries-50 is misleading
... and suggests some tweaks to our issue tracking...

HT: it's administratively inconvenient to change the issue name, since our discussion is indexed under UrnsAndRegistries

TBL: how about...
... this issue covers a) URIs for namespace names b) URNs and other proposed schemes for location-independent names c) XML registries, and perhaps centralized vs. decentralized vocabulary tracking.

NM: does that capture the indirection enough? there's something to the "registries" bit

SKW: sounds like there's willingness to revise the issue description

<scribe> ACTION: Stuart to collect input from TimBL and others and revise issue description

<trackbot> Created ACTION-175 - Collect input from TimBL and others and revise issue description [on Stuart Williams - due 2008-09-30].

SKW: and we have another bit of writing... "Dirk and Nadia design a naming scheme" draft 16 Sep http://www.w3.org/2001/tag/doc/justSayHTTP

HT: after 2 years of [writer's block], I think I'm on to something...
... earlier drafts would neither (a) convert a skeptic, nor (b) serve as an introduction to someone new to the issue. They only spoke to those who already agreed.
... this approach is perhaps not effective enough to convert those who take an extreme view, but I think it's not as off-putting as earlier drafts
... is this a promising direction?

DC: I had an "I don't see point X covered yet" feeling, though I can't recall what the X was

AM: [oops; missed the question]

HT: I'm particularly happy with the list under "So, here are the requirements in detail"
... it doesn't yet incorporate a uri-in-uri requirement... I haven't gotten my head around that

TBL: perhaps in a break I can help with that

HT: there's a bit of "apples and oranges" between 'delegation' and the sorts of stability...

HT goes to the whiteboard...

headings: Owner / Resource / Representation

under each, 0 = centralized, 1 = decentralized

HT: all 8 possibilities...
... the 0 / 0 / 0 extreme... e.g. W3C... naming and resources are centralized.

TBL: er... ok... I'll stipulate for now...

HT: the 1 / 1 / 1 extreme is, e.g. "the web". ownership is distributed, etc.
... the hypothesis is "we can do all 8 rows with http". That's what I'm working through.
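
The eight rows of the whiteboard table can be enumerated mechanically. This is only a sketch of the table's shape, in Python; the headings and the 0/1 meanings come from the whiteboard, while the function name is made up for illustration and nothing here says which naming schemes belong in which row.

```python
from itertools import product

HEADINGS = ("Owner", "Resource", "Representation")
LABELS = {0: "centralized", 1: "decentralized"}


def whiteboard_rows():
    """Return all 2**3 combinations of 0/1 under the three headings."""
    return [dict(zip(HEADINGS, bits)) for bits in product((0, 1), repeat=3)]


for row in whiteboard_rows():
    # e.g. "Owner=centralized / Resource=centralized / Representation=centralized"
    print(" / ".join(f"{heading}={LABELS[value]}" for heading, value in row.items()))
```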

TBL: a weakness in this table is that it assumes ownership, but naming schemes like using sha-1 don't really have ownership

DC: yes, I think we need an appendix to say "we're not treating the things that don't involve ownership"
... I've seen "the TAG thinks all new URI schemes are harmful" but I want to be clear that we're not going that far... only "no new schemes when there's an administrative hierarchy in place"
... counterexamples include freenode/p2p, sha-1, etc.

NM: I'm not sure the W3C example is a good one for the 0/0/0 case
... the W3C has delegation internally

poll shows support for this direction

[I presume there's a drafting action on HT that continues]


<ht> http://www.ltg.ed.ac.uk/~ht/exi.html

(hmm... a copy of that should go to www-archive or something.)


<trackbot> ACTION-93 -- Henry S. Thompson to review EXI WDs since 20 Dec -- due 2008-09-23 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/93

HT: the EXI WG has published 5 drafts:

* Best Practices
* Primer
* Evaluation
* WD
* Impacts

HT: I think "Best Practices" is most relevant for us... it's now 10 months old
... "Primer" seems to be orthogonal to / irrelevant for TAG considerations

<timbl> I completely agree with your comment on section 4.2

HT: several years ago, the approach for exi was as a character encoding...
... then more recently, a new story emerged... "an EXI document is not a well-formed XML document..."


HT: 2 questions:
... 1. EXI documents should not be well-formed XML
... 2. 2 bits (leading 1 0) is enough to distinguish EXI from XML
... i.e. does the TAG agree with 1 and 2?
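
Question 2 can be illustrated with a small check, sketched here in Python. It assumes only what is stated above, that an EXI stream begins with the bits 1 0, and it does not model any other part of an EXI stream. The function name and the sample list are hypothetical, and the list of XML beginnings is illustrative rather than exhaustive.

```python
def has_exi_distinguishing_bits(data: bytes) -> bool:
    """True if the first two bits of the byte stream are 1 0."""
    return len(data) > 0 and (data[0] >> 6) == 0b10


# Some ways an XML document can begin; none has 1 0 as its first two bits.
XML_STARTS = [
    b"<?xml version='1.0'?>",  # '<' is 0x3C, first bits 0 0
    b"\xef\xbb\xbf<doc/>",     # UTF-8 BOM, 0xEF, first bits 1 1
    b"\xfe\xff\x00<",          # UTF-16 big-endian BOM, 0xFE, first bits 1 1
    b"\xff\xfe<\x00",          # UTF-16 little-endian BOM, 0xFF, first bits 1 1
    b"\x00<\x00?",             # UTF-16 big-endian without BOM, 0x00, first bits 0 0
    b" \n<doc/>",              # leading whitespace, 0x20, first bits 0 0
]

for sample in XML_STARTS:
    print(sample[:4], has_exi_distinguishing_bits(sample))
```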

NM: I gather this is a new content encoding... [?]

<Zakim> noah, you wanted to ask whether Content Type:application/___+xml; Content-Encoding: x-exi is the right way to go

SKW: how about the Evaluation document?

HT: I haven't worked on that lately

SKW: we asked them to give a more succinct argument in the evaluation document

TBL: a concern with the content-encoding approach is that it suggests exi applies to any byte sequence [?]
... [and a similar concern about the charset approach; scribe couldn't distinguish]

HT: yes, all approaches have their down sides... [missed the gist of it]

NM: one requirement is to round-trip XML documents, preserving single vs double quotes. This doesn't seem to meet that.

HT: right.

NM: ah. so it's acknowledged that this requirement isn't met.

HT: another design choice is to duplicate the XML mime hierarchy; i.e. application/exi to go with application/xml, application/svg+exi to go with application/svg+xml and so on
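
The parallel-hierarchy idea amounts to a simple renaming rule, sketched here in Python. The function name is hypothetical, and the resulting type names are the ones floated in the discussion, not registered media types.

```python
def exi_media_type(xml_media_type: str) -> str:
    """Map an XML media type to its hypothetical EXI counterpart."""
    if xml_media_type == "application/xml":
        return "application/exi"
    if xml_media_type.endswith("+xml"):
        # Swap the '+xml' suffix for '+exi'.
        return xml_media_type[: -len("+xml")] + "+exi"
    raise ValueError(f"not an XML media type: {xml_media_type}")


print(exi_media_type("application/xml"))
print(exi_media_type("application/svg+xml"))
```

Every existing +xml type would need a counterpart under this rule, which is one way to see the cost of duplicating the hierarchy.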

[incoherent chorus of down-sides to that approach]

TBL draws a table: + / - for charset, content-encoding

a + for the charset approach is that it applies to all XML files

a - for charset is "not connegable". [DanC isn't sure about that. There's Accept-Charset, no?]

a + for content-encoding is that it works for all content types [disputed by Noah]

a - for content-encoding is: software architecture on client needs a kludge

HT: a downside of the charset is the 32 bytes <?xml version="1.0" encoding="exi"?>

[incoherent debate about whether that really requires 32 bytes]
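
The size of the declaration can be checked mechanically. The Python sketch below counts the bytes of the declaration exactly as quoted in these minutes, assuming an ASCII-compatible serialization; other spellings (single quotes, extra whitespace, a different encoding name) would give other counts. The constant and function names are hypothetical.

```python
# The declaration exactly as quoted in the minutes.
DECLARATION = '<?xml version="1.0" encoding="exi"?>'


def declaration_size(declaration: str = DECLARATION) -> int:
    """Number of bytes the declaration occupies when serialized as ASCII."""
    return len(declaration.encode("ascii"))


print(declaration_size())
```

For the string as quoted, this prints 36.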

SKW: we're requested to participate in their last call review 19 Sep to 7 Nov

<noah> Evaluation document: http://www.w3.org/TR/2008/WD-exi-evaluation-20080728/

DanC: yes; we should double-check that their new drafts address the comments they said they'd address
... in reply to http://lists.w3.org/Archives/Public/www-tag/2007Nov/0065

<timbl> Do they have machine-readable data behind those graphs?

NM: re efficiency, the gist of our comments, I see [something like] "this will be done in CR"

DO: as I recall, we also asked for some less exotic use cases... I don't see that.

<timbl> It would be nice if there were a web form based thing to test your own data, so that David could for example try it with a couple of bits of XML from his daily life

NM: I see convincing results re compactness but not yet wrt efficiency; so now I'd like to know if that's good enough for an interesting set of use cases

<scribe> ACTION: Noah work with Dave to draft comments on exi w.r.t. evaluation and efficiency

<trackbot> Created ACTION-176 - Work with Dave to draft comments on exi w.r.t. evaluation and efficiency [on Noah Mendelsohn - due 2008-09-30].

<scribe> ACTION: Stuart to schedule more discussion of exi architecture charset/encoding etc.

<trackbot> Created ACTION-177 - Schedule more discussion of exi architecture charset/encoding etc. [on Stuart Williams - due 2008-09-30].

HTML and Web: the Big Picture

SKW: I'm bi-modal about this topic. I understand that it is a high priority for the TAG and why. However, there are now communities with such established momentum in [many] directions that I wonder what impact the TAG could expect to have. In addition, my day-job focus is not on hypertext which means that finding time to commit to doing a proper review of a 450+ page spec. is a real challenge - it's not my style to skim read. In 3-5 years time, what advice does the W3C expect to be giving hypertext content creators - what recommendation will it be making? I think that the W3C presenting multiple options does not help content creators and amounts to "...let the market figure it out...".

DO: the way the HTML 5 spec is being produced... I have concerns about the process... but process issues are awkward for the TAG to address...
... meanwhile, it interferes with technical input, e.g. w.r.t. distributed extensibility

NM: perhaps nothing novel to add, but...
... we should give what technical input we think will help, and if our input isn't taken up, such is life
... if HTML and XHTML co-exist, the TAG should help W3C tell people when to use one and when to use the other and such
... that advice might include recommending practical approaches despite conflict with architectural principles
... one view is "there's clean XHTML and messy tag soup". Then I've read comments on the HTML 5 spec about the way it mixes a spec for browsers with a language spec...
... perhaps you could write a clean language spec for HTML 5 and separate that from the browser-patch-up stuff. I don't hear much about that approach

HT: again, perhaps not novel... I'd like to help get the good stuff in HTML 5 more visible
... perhaps using traditional formal techniques and separate it from error recovery and such
... I see some support for this approach
... To the extent the TAG can do that, I'm interested.
... meanwhile, as a researcher, I'm interested in being more formal about the error recovery stuff.

DC: well, I was hugely frustrated with making no progress toward my goals for HTML 5 in the first year of the HTML WG, but a few months later, perhaps one year is not that much in the HTML timeframe. So I'm open to looking at lots of approaches

AM: I'm new to this area... the WHATWG is new on my radar. Looking at the landscape, I wonder if the chances of having impact are so low that... well... should we use our time for other things?

JAR: it looks somewhat daunting to have impact; most of my work is disjoint with this technical area

TVR: there's a potential crisis for the W3C: if we say "HTML 5 is so difficult to deal with that we'll ignore it", then much of the work in W3C becomes irrelevant to deployment on the Web

<jar> scribe: jar

Around the table regarding html5

<timbl> Concern over - Losses of engineering quality, and architectural principles, which have serious consequences; - Changes of philosophy about improving the web as opposed to letting it fester while describing it; - Socially, lack of review by other groups who can't read the huge spec; - Socially and engineering .. making willful departure from other specs without negotiation with the other community.

timbl: people have accused of partisanship

stuart: Putting our analysis on record is a good idea.
... Let's document TAG opinion even if it risks having no effect.

noah: Important for TAG to approach the needs of the different communities sympathetically. Core intuition - we need to document what the browsers do - is beneficial
... But to mix this with a language spec is not a service to the community
... Better if the document could be relayered. Separate permissive behavior from 'clean' behavior
... having a clean spec is good for content creators

ht: It would be good for goals of html5wg AND for w3c if a traditional language spec were separated out from monolithic
... and if it were described formally (with a grammar)
... (2) two bits of bridge-building to pursue: look at media type namespace defaulting proposal; and
... we need to find mechanisms, perhaps the w3c validator, perhaps via changes to specs, that help people to see that there are incremental improvements possible in the quality of their html documents

<noah> Actually, where I'm scribed as saying "separate permissive behavior from clean behavior" isn't quite the nuance I had in mind. I think a language specification indicates which documents are legal, and what they mean. That's one spec. I think HTML 5 as drafted also includes a specification for pieces of code we might call browsers, which by the way attempt to provide useful output for content that would not be "legal" in the language spec, e.g. improperly nested elements. I think having both specifications is very important, but I would prefer that the browser specification, including fixup of bad content, was separate from the specification of the clean language and its correct interpretation. The former spec. would be for authors and for those who might in future be able to deploy less permissive UAs; the latter would be to achieve interoperability among browsers as we know them.

raman: From TAG perspective, we must ask: How does the rest of W3C's work fit in with HTML5?

<Zakim> raman, you wanted to add treat html5+js as the assembly language of the Web, compile better languages to the assembly language for delivery. That is what everyone is doing right

noah: Consider possibility of encouraging people who can contribute to this discussion to run for an elected TAG spot?

HTML5 should be modularized?

noah: Different axes along which to modularize

ashok: There's a fairly large section on microformats - that should go in an appendix

danc: Does parsing html5 require parsing numbers?
... So what about document.write ?

ht: XML spec defines (BNF) which simultaneously defines language including entity refs AND language after entity ref expansion
... so to define two grammars is not completely unprecedented

noah: would be nice to define a correspondence between nicely formed input and the DOM

<DanC> (I think the spec for how markup relates to the DOM is currently specified as serialization rather than parsing in the current HTML5 draft)

<DanC> (http://www.w3.org/html/wg/html5/#serializing 8.4 Serializing HTML fragments )

noah: defining the DOM should be straightforward... then the script runs. Suppose new text is not nicely formed. Maybe the specification for the clean language can just not specify what happens in the intermediate states while new text is being written, until the document character string is once again properly formed. ?
... try to make it as declarative as possible

raman: Let's define clean version separately from conversion from not-clean to clean.
... Effect of executing document.write: apply same rules that would apply to get you from not-clean to clean
... html 5 spec is not clear on this. complicated state table
... browsers can load script synchronously or asynchronously (example of two .js files executing asynchronously)
... this twist is an accident. people noticed that the first script blocks the second script, workaround is document.createElement of a script tag

stuart: Remember Douglas Crockford article against document.write ?

ht: If you want a walled community, here's what the walls look like

raman: google ajax has no document.write. no namespace pollution
... you do your library as a javascript library. publish a 2nd file that's a load hook. defines one new name. there is no document.write or createElement. read the ajax documentation

timbl: About document.write: the createElement alternative is a pain ...

<bubbles> Dan, see http://www.google.com/uds/samples/places.html and http://code.google.com/apis/ajaxsearch/samples.html

<bubbles> show source on the first one, and note the call to google.load()

timbl: but there's an ecmascript extension that makes it more concise

ht: it makes xml a first-class syntactic object in javascript
... Languages that permit embedded xml elements have never taken off.

raman: Once you say declarative, all the imperative people come out and say you can't do everything declaratively
... but in Lisp, no one ever said "declarative". Instead they just made special forms that abstracted out the imperative part

timbl: Is there a clean, concise alternative to document.write?

<DanC> Fixing HTML Douglas Crockford 2007-11-28

raman: If you have eval, you don't need document.write

timbl: Are there languages where SQL is integrated well into another language? [looking for precedent]

noah: (about xaml)

raman: document.write is a challenge to modularization. can we do: [core], error recovery, document.write ?
... document.write is what connects all the other clumps together

noah: Language spec says what tags, attributes mean. A spec for a processor talks about what can be done incrementally, what is accepted. Error recovery is about not-legal input, here's how to process it.
... Language to DOM could be specified declaratively. It's not a processing problem
... Wants one module that says: This is the clean spec, here's what you should author to
... then error recovery would go in another module

danc: HTML 5 is not coordinating with other w3c activities. How should this be addressed. [looking at agenda - css, other web languages, ...]


URI Parsing in HTML5


danc: Why should I be worried about this?

raman: HTML5 spec has no business talking about how URLs are parsed
... should not be talking about how these particular strings should be interpreted. Rationale for doing so was error recovery...
... but this doesn't warrant defining the parsing rules for URLs.

Danc: Doesn't see where HTML5 respecifies anything
... Is quite surgical.

noah: It says it's not using "URI" as defined in rfc3986. Maybe a better way to write it would be to give a new name for the syntactic change/extension to URI?

raman: Goes back to what group owns which specs

ht: XML Core has names for things like this. System Identifier is a nonescaped URI, vintage 1998. What you put into an XML doc that gets turned into a URI by a specified process
... 5 specs have this same problem - what to call things that get turned into URIs. They all cut & paste text that was written for XLink 1.0
... When IRI came along, this became untenable. Current revision of IRI spec will include a section on 'legacy extended IRIs'
... Not clear when new RFC is coming out. hostage to IDN
... A part of HTML 5's problem has been solved by LEIRIs
... The problem is error recovery - what to do when the string isn't a URI
... First bullet (2.5.2 of [what document??]) is addressed by the WG note [on LEIRIs]

<ht> http://tools.ietf.org/html/draft-duerst-iri-bis-03#page-33

noah: First issue is stripping leading and trailing space ... no one is saying the spaces are part of the URI

ht: Principle of least surprise says we'd be better if across the board users had a single expectation on how to write URIs in the documents
... Ergo, (1) we need the RFC specifying URIs; and (2) we need to say how to write them in documents

stuart: Jena handles IRIs - has 6 modes of operation

ht: XML Core is trying to reduce this to 1

raman: URL situation is an example of the social problem. I don't expect to solve this, or if we do, for our solution to make any difference
... Rather, how do we bring about solution to social problem, of who owns what specs?

danc: Maybe take URI-related parts of HTML 5 and send them to IETF for review?

noah: Space trimming is a funny use of "error recovery". I think "error recovery" has to do with processing...

danc: agreed.

jar: Is this an example of clean/not-clean distinction?

noah: No, it's a detail

timbl: The problem is that the HTML 5 spec doesn't distinguish clean from not-clean
... [thinks that space stripping is not-clean but recoverable]
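
One way to read "not-clean but recoverable" is a processor that repairs the input while still recording that the input was in error. The Python sketch below does that for leading and trailing whitespace only. The set of characters in HTML_SPACE is an assumption about what the HTML 5 draft treats as space characters, and the function name is hypothetical.

```python
# Assumption: space, tab, line feed, form feed, carriage return.
HTML_SPACE = " \t\n\f\r"


def clean_url_attribute(value: str):
    """Strip surrounding whitespace from a URL attribute value.

    Returns (stripped_value, was_error): the value is recovered, but the
    caller can still see that the original was not clean.
    """
    stripped = value.strip(HTML_SPACE)
    return stripped, stripped != value


print(clean_url_attribute("  http://example.org/ \n"))
print(clean_url_attribute("http://example.org/"))
```

A browser would ignore the flag and carry on; a validator or authoring tool could report it.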

<DanC> for reference, http://www.w3.org/html/wg/tracker/issues/56 Assess whether "URLs" section/definition conflicts with Web architecture

<timbl> I think it is important that the cases which don't meet the IRI spec are referred to as errors, even if the errors are ignored in HTML5 browser handling

jar: compare to C programming language 1970's vs. 1980's

raman: No problem with having the browser accept everything
... Problem is that the spec is writing in: Go ahead and write these things, it's OK

jar: Balance of power is different. In C the language had no power relative to the "browsers" (CPUs), so new CPUs could dictate language changes

ht: Is the third bullet (about %s) telling us that % should, or shouldn't be %-encoded?
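
The two readings HT asks about give different results for the same input. This Python sketch uses urllib.parse.quote to show both; it does not claim to reproduce what the HTML 5 draft specifies, and the function names are hypothetical.

```python
from urllib.parse import quote


def encode_percent_as_data(text: str) -> str:
    """Reading 1: '%' is ordinary data and is itself %-encoded."""
    return quote(text, safe="/")


def encode_preserving_percent(text: str) -> str:
    """Reading 2: '%' is left alone, so existing escapes survive."""
    return quote(text, safe="/%")


print(encode_percent_as_data("a%20b c"))     # a%2520b%20c
print(encode_preserving_percent("a%20b c"))  # a%20b%20c
```

Under the first reading an already-escaped string is escaped twice; under the second a stray "%" that is not part of an escape is passed through unchanged.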

still on 2.5.2 of HTML 5 draft

danc: issue of string + document encoding pairing ...
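
A sketch of why the pairing matters, assuming the issue is that non-ASCII characters in a URL string are %-encoded using the document's character encoding: the same string then yields different bytes in different documents. The function name and the example path are hypothetical, and which parts of a URL are affected is not modelled.

```python
from urllib.parse import quote


def percent_encode(text: str, document_encoding: str) -> str:
    """%-encode non-ASCII characters using the given document encoding."""
    return quote(text, safe="/?=&", encoding=document_encoding)


url = "/search?q=\u00e9"  # 'e' with acute accent in the query

print(percent_encode(url, "utf-8"))       # /search?q=%C3%A9
print(percent_encode(url, "iso-8859-1"))  # /search?q=%E9
```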

<DanC> uri encoding test cases http://hixie.ch/tests/adhoc/uri/encoding/

<DanC> SKW: "It is possible for xml:base attributes to be present even in HTML fragments, as such attributes can be added dynamically using script. (Such scripts would not be conforming, however, as xml:base attributes are not allowed in HTML documents.)"

<DanC> see also The xml:base attribute (XML only) http://www.w3.org/html/wg/html5/#the-xmlbase

danc: (unknown elements go in the dom. unknown attributes don't)

noah: rationale?

danc: unknown
... XQuery has test cases for handling of URIs in attribute values, right?

ht: There are XML test cases for 'system identifiers'

danc: IETF is concerned about scope of HTML 5 including a protocol (web sockets)

raman: What about interaction between metadata (e.g. xml:lang=) in document vs. in HTTP headers?
... Does xml:lang or lang override HTTP headers? This should technically be decided by HTTP WG

stuart: 'authoritative metadata' tag finding
... Is html LANG well specified anywhere?

danc: see HTML 4

<DanC> long thread around lang vs xml:lang vs http content-language http://lists.w3.org/Archives/Public/public-html/2008Aug/thread.html#msg856

<Stuart> http://www.w3.org/International/questions/qa-http-and-lang

danc: (Wondering about how to liaise regarding URI spec(s))

<DanC> for reference: URIs in HTML5 and issues arising Ian Hickson (Sunday, 29 June) http://lists.w3.org/Archives/Public/uri/2008Jun/0088.html

stuart: The original idea of http-equiv was that the server would pull it out and supply as http header?

danc: Yes, since some people had no [other] way to influence the server

stuart: and then the clients started interpreting http-equiv as well

<ht> http://www.xkcd.com/477/


Summary of Action Items

[NEW] ACTION: Noah work with Dave to draft comments on exi w.r.t. evaluation and efficiency
[NEW] ACTION: Stuart encourage discussion of proposal around http://xri.net/ or http://boeing.com.xri.net or http://xri/ or http://boeing.com.xri/ in www-tag
[NEW] ACTION: Stuart to collect input from TimBL and others and revise issue description
[NEW] ACTION: Stuart to schedule more discussion of exi architecture charset/encoding etc.
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.133 (CVS log)
$Date: 2008/10/08 14:53:27 $