W3C

- DRAFT -

Technical Architecture Group Teleconference

01 Oct 2013

See also: Agenda, IRC log

Attendees

Present
Tim Berners-Lee, Daniel Appelquist, Yves Lafon, Yehuda Katz, Sergey Konstantinov, Peter Linss, Alex Russell, Henry Thompson, Anne van Kesteren, Noah Mendelsohn
Regrets
Jeni Tennison
Chair
Daniel Appelquist, Peter Linss
Scribe
Alex Russell, Yehuda Katz

Contents


<dka> trackbot, this will be tag

<trackbot> Sorry, dka, I don't understand 'trackbot, this will be tag'. Please refer to <http://www.w3.org/2005/06/tracker/irc> for help.


<dka> trackbot, start meeting

<trackbot> Date: 01 October 2013

<dka> darobin can you join us this morning?

<darobin> ohai

<darobin> dka: I can join in ~10min, for about 30-45min

<darobin> after that I'm off to tell the lovely people of Lyon how much they'd enjoy making standards

<wycats_> I'm on my way, fyi

<wycats_> Is Alex already there?

<Yves> not yet

<Yves> nor anne

<wycats_> OK I am at his hotel waiting for him

<dka> Ok - we really need to start shortly due to Philippe's schedule.

<wycats_> Anne's here

<dka> Suggest you come over - maybe you can ring up to Alex's room?

<wycats_> I texted him

<dka> ok we have Philippe until 10:30 so we can wait a few minutes

<dka> Robin we are on 0824

<scribe> scribenick: noah

<darobin> [Polyglot has an update about to be published: http://dev.w3.org/html5/html-xhtml-author-guide/]

HTML5

We have Philippe le Hegaret visiting in person and Robin Berjon on the phone

HT: Proposed topics: RDFa, polyglot, authoring spec, media type

<darobin> [RDFa isn't us, but Microdata is more or less in a bad state it would seem]

TBL: Is it appropriate to talk about a proposed solution to the Web IDL problem?

AVK: DRM?

HT: EME = Encrypted Media Extensions are actually going into the browsers

<darobin> [the TAG brainstormed about DRM last time (or the time before?), did anything come of that?]

PLH: EME is in the HTML WG, pushed by the Web & TV Interest Group. 1.5 years of work in an HTML WG subgroup.
... Microsoft, Netflix and Google are active and contributed editors to the specification
... Use case is ability to play back protected premium content


PLH: Want you to be able to interact with the platform's Content Decryption Module (CDM), which provides whatever actual DRM, possibly in hardware or software
... You can also use the API for less constrained systems, but in practice browser vendors are implementing it to use the underlying platform
... Mozilla is said to be working on this but their detailed plans aren't clear to me. Henri Sivonen is involved from Mozilla

HT: Google means Webkit?

Several: No, Blink

PLH: (scribe missed some details about MPEG-DASH)

HT: Is the CDM target exclusively Windows?

PLH: No, Apple as well

HT: Linux?

PLH: Well, if you want some content on Linux, there are providers who require DRM.

YK: Jeff's opinion last time, with which I disagree, is that it could run on Linux should the community wish to do it. I (Yehuda) think most agree that the necessary support won't be on Linux in practice.

DA: Why can't others write it for Linux

YK: You won't get certs

TBL: Too many assumptions here

YK: OK, I retract, but I'm betting Netflix won't publish to Linux without DRM

TBL: Why?
... There are lots of possible architectures. The problem with building it this way is that it depends on support from the machine manufacturer. You could imagine, for example, hardware built into the LCD that does the decryption. You could imagine a band publishing music, some free, some priced.
... You could imagine a deployed infrastructure for distributing the keys. For the musician, that is likely preferable to having control of the keys so centralized. Moving to a more open market is desirable.

YK: I'm not saying we will necessarily not have DRM on Linux, I'm skeptical that companies with content like Netflix will publish to it.

TBL: It's a false assumption that there is only one CPU; phones have many, for example. I have root access on my Mac and can install open source on it. It's not easy for me to copy something downloaded using iTunes. I know (or suspect) the machine also has subsystems to which I don't have access.

<Zakim> darobin, you wanted to talk about potential reform action

PLH: Today, to watch Hulu or Netflix you need a Flash plugin. If EME works right, then no need for plugin. Now there's a burden on browsers to work with the platform, and this is especially problematic for open source systems.


RB: Media is just the tip of the iceberg. We need to anticipate requests for protection of Apps, Books, media. Wondering whether the TAG could help push on question of Web-compatible copyrights.

DA: Bookmark that thought

PLH: So, no plugin, but browser has no control over underlying DRM systems, e.g. issues like accessibility.

HT: The well-argued >technical< objection is that it is a step toward taking control of the platform away from the owner. That's the key problem. We are making a link in that chain easier, but the deep evil (if you believe it's evil) is further down the chain.

PLH: It's a small step

<Zakim> wycats_, you wanted to ask about the distinction between built-in EME and built-in Flash

YK: I see benefits of standard thing with standard interface. I wouldn't like it but can see benefit. I'm not convinced EME is better than plugin.

PLH: You would not have to ship Flash, which reduces the browser footprint

YK: Do you think that will happen

<darobin> [with EME you push the plugin further down the stack and reduce its surface compared to Flash]

PLH: Yes, streaming etc would need to be addressed

<slightlyoff> AR: necessary but not sufficient

PLH: I'd guess EME will show on mobile platforms over time

AVK: We never standardized the plugin API, and we know plugins have problems. Now we're opening an extra hole.

YK: At least with Flash you could implement it yourself in principle. With EME it's not clear that you can.

AVK: Standardizing this seems counter to the W3C's mission

<wycats_> to be clear, I didn't mean that you could implement the DRM part -- what I meant was that you could implement FLASH yourself

TBL: That's a complex discussion

<wycats_> so the undocumented EME API is worse than the undocumented Flash API as it relates to the web content that each of them is enabling

AVK: I am worried about the Web depending on proprietary technologies not easily ported to other platforms.

Robin Berjon thanks the group but has to leave.

<Zakim> noah, you wanted to say even small steps are symbolically significant

<wycats_> For me, the nut of the issue is that new browsers will not be able to run "web content" without a non-trivial amount of non-technical work

<annevk> +1 to wycats_

<slightlyoff> thanks darobin !

<slightlyoff> wycats_: that seems like a no-op vs the current state, no?

PLH: Regarding Web IDL: it's a metasyntax for writing specifications. Within a few years we'll use either Web IDL or JS IDL.

<wycats_> slightlyoff: the current state is not in a standard -- so the current content is not "web content" and we're honest about it

<wycats_> the claimed benefit of the new state is that it can be described as "web content"

<wycats_> except it's not actually

<slightlyoff> I accept the semantic distinction.

PLH: Some groups are using the full semantics of Web IDL but older stuff doesn't support that.

<wycats_> this conversation is nuanced... we should have it in person

PLH: Both Geolocation and Touch Events claim to be using ECMAScript bindings, but when it gets to things like prototypes it doesn't work.
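[Editor's note: PLH's complaint can be made concrete. Web IDL's ECMAScript binding requires a named interface object whose prototype object carries the operations, with platform objects inheriting from it. A minimal runnable sketch in plain JavaScript, using a hypothetical stand-in for the browser's Geolocation binding, not the real one:]

```javascript
// What Web IDL's ECMAScript binding requires: a named interface object whose
// prototype object carries the operations, and platform objects that inherit
// from that prototype. "Geolocation" below is a mock, not the browser binding.
function Geolocation() {}
Geolocation.prototype.getCurrentPosition = function (success) {
  // Dummy result standing in for a real position fix.
  success({ coords: { latitude: 0, longitude: 0 } });
};

// Stand-in for navigator.geolocation.
const geolocation = new Geolocation();

// The idlharness-style checks that some 2013 implementations were failing:
const hasInterfaceObject = typeof Geolocation === "function";
const inheritsFromPrototype =
  Object.getPrototypeOf(geolocation) === Geolocation.prototype;
const operationOnPrototype =
  !Object.prototype.hasOwnProperty.call(geolocation, "getCurrentPosition");

console.log(hasInterfaceObject, inheritsFromPrototype, operationOnPrototype);
```

[An idlharness-style test suite makes checks of exactly this shape against the real `navigator.geolocation`; a browser that attaches methods per-instance, or omits the interface object, fails them even though the API "works".]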

AVK: Browser bugs?

<wycats_> I totally disagree that these are just "browser bugs"

YK: Yes, but it's not that simple, even browser vendors have trouble knowing what's right. Ordinary users have no chance.

<Yves> type conversion is also overlooked

PLH: Leads to a situation where specifications lose credibility because they don't match reality. Conformance tests fail.
... The incentive to fix them isn't strong enough.

YK: I agree that in principle this stuff tends to be edge cases, but I have spent perhaps 100 hours of my career figuring out such mismatches between spec and reality. Getting the specs to where they are an accurate guide to practical reality is important. (scribe has paraphrased)

AVK: I don't follow. You want specs to say where implementations will go, not where they are.

TBL: Anne, delighted to hear that!

YK: Not making strong statement on that distinction

AVK: Mozilla uses Web IDL for geolocation

AR: Chrome doesn't always follow the IDL
... Is the issue that prototypes aren't being linearized?

YK: Part of the problem

AR: We in Chrome can't accept performance regressions, and have spent years figuring out how to linearize prototypes. So far, we haven't found a way that performs.

TBL: Eventually?

AR: Eventually we should be able to do it except maybe for some real edge cases

YK: On Chrome I can monkey-patch an event target, and on other platforms it may not work
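[Editor's note: the monkey-patching point can be illustrated. Patching works only when the operation lives on a reachable prototype, as Web IDL specifies; a sketch using the `EventTarget` built into Node 15+, with the same pattern applying to DOM event targets in a browser:]

```javascript
// Wrap EventTarget.prototype.addEventListener to record every registration.
// This only observes calls because the operation lives on the prototype, as
// Web IDL requires; an implementation that attached a per-instance copy of
// the method to each object would silently bypass this patch.
const registered = [];
const original = EventTarget.prototype.addEventListener;
EventTarget.prototype.addEventListener = function (type, listener, options) {
  registered.push(type);
  return original.call(this, type, listener, options);
};

const target = new EventTarget();
target.addEventListener("ping", () => {});

console.log(registered); // → [ 'ping' ]
```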

AVK: We've had IDL based stuff since forever

AR: Yeah, it's a pain to do it by hand

AVK: Then we moved to Web IDL and some did it piecemeal (Chrome?). We are trying for the whole thing, but across browser versions it's a huge job. Real interop will take you a long time.

YK: So, your view is that Web IDL documents ideal reality, and it's just taking a very long time for real reality to catch up?

AVK: Yes

YK: I have more optimism that it's a few years, not decade(s)
... Things like some browsers requiring the capture flag and some not really do bother developers. We pay lip service to interop

AVK: Pressure for new features

YK: Maybe, the HTML5 parser was welcomed for improving interop

AR: Web IDL is high fidelity, but to the wrong ideal

YK: But PLH was asking whether there is enough info there; if so, then browser vendors should please prioritize

TBL: So, we have specs that are not really Web IDL compatible but still use Web IDL to list parameters

YK: Does the prose tell you?

TBL: I'm saying it's being used just for the parameter list, with no implication that there are prototypes, setters, getters, etc. They are usefully using it as a descriptive notation.

Several: Whoa, there are specifications saying that things like getters and setters >are< implied

TBL: So, let's say they don't have getters and setters: should they not use Web IDL and instead use something else, or should they say they're using Web IDL in a limited way?

AR: First, do you think the API without getters/setters is good?

TBL: Right now, they are trying to document a spec that for better or worse does not intend to have getters and setters

PLH: So, today the prototypes are being followed by all except WebKit and Blink. Mozilla and Microsoft are doing what they reasonably can.
... We need to make sure specs describe deployed reality even if later we hope to improve implementations

TBL: Is it pragmatic for them to use the WebIDL syntax?

AR: Two issues. A) It carries implications to which they are not signing up, so it's misleading. B) It's going to mislead implementors

TBL: The implementations were done "without thinking" in the sense of without attending to Web IDL

AR: If you are attempting to describe a JavaScript interface that doesn't match Web IDL, use JavaScript to describe it
... Insofar as JavaScript is a poor description language, that's on TC39 to fix. It is however possible to write out an interface description, e.g. with a dummy implementation, that can meet the need

PLH: There are a few like that, maybe Web storage

TBL: Or you could clone Web IDL and rip out pieces

YK: People have enough problems reading/learning Web IDL. A subset version adds confusion.

Someone: So, the browsers are on the road to fixing the APIs?

PLH: But doing things like adding getters is very low priority

YK: I can live with that. Web IDL is the desired reality, fixing bugs is low priority

TBL: Not OK. When bugs are likely to be open for ten years, then that's not OK because specs aren't affecting reality

<annevk> plh: http://mxr.mozilla.org/mozilla-central/source/dom/webidl/Geolocation.webidl

YK: Is it really going to take ten years for geolocation?

AVK: No

YK: So?

TBL: I think I'm hearing browser vendors don't want to fix some of these bugs

YK: I'm more optimistic that implementations are converging on the spec faster than that

DA: Does this fit into the bucket of "TAG provides guidance to working groups"? If so, what guidance?

TBL: I think the deep dive here is valuable.
... Popping back up might lose that value

YK: I think we can get behind saying "standards should drive interop"

DA: But TAG can get involved at detail level as we did with Web audio. Anything like that here?

AVK: The geolocation group is "run" by two people who work on Blink, so they are likely to give the corresponding answer

AR: If it's only geolocation, then we can fix this

AVK: I assume they felt that fixing this wasn't consistent with getting to REC in a timely way
... We >want< the prototype and we want it interoperable

YK: It would be useful to know why Blink isn't doing it. Was there a deep reason making it really hard, or did they just not push hard on it?

AVK: What's a long time?

YK: Having a spec not interoperate for multiple years

AVK: We have several

YK: We should consider avoiding publication of specs likely not to be interoperable for extended periods

<slightlyoff> plh: do you have a pointer to the discussions about this?

<slightlyoff> hey JeniT!

NM: If you're going to write specs that cover ideal behavior and interim behavior, it can be useful to give them a formal distinction as differently named conformance levels.

AR: Confused. Of what substantive issue is geolocation an example?

<plh> http://dev.w3.org/geo/api/idl-test-suite/idlharness.html

PLH: Well, touch events is more clearcut, but considering geolocation, the specification says the XXX object must be supported on the navigator interface, but it is not.

AVK: Makes sense, because navigator was complex and took a long time to convert.

<annevk> To be clear, Firefox Nightly passes that test

AR: When geolocation was published, WEB IDL was only a working draft. What is the core issue, that Blink isn't linearizing prototypes, or is there an issue unique to geolocation?

PLH: I now see that something told to the director was unduly pessimistic. We thought we heard there were not two implementations because IE passed but Firefox failed IDL compliance. We now hear that was a short term concern, bug fixed, and so we do have two implementations.
... The touch events one was also presented to the Director (Tim). Looked at running on both Firefox and Chrome mobile. There were pieces of a test failing (scribe isn't sure he got this right)

TBL: Can you do touch events with a trackpad on desktop?

PLH: In principle, but I can't check that myself

AVK: If someone had checked the Firefox bug database they would have seen the fix was on the way.

PLH: We somewhat trust the WGs

AVK: Not clear this WG has deep insight into implementations other than Blink

PL: Did I hear someone concerned about the need to express things beyond what Web IDL can do?

YK: Well, I heard Tim speculate that even if Web IDL semantics are not right for all, the syntax might still be used.

<dka> I will note: we have talked about 3 APIs this morning - Geolocation enables access to an underlying capability of the device which may be implemented using a patent-encumbered technology (GPS); Touch events enables access to an underlying capability of the device which may be implemented using a patent-encumbered technology (multi-touch); EME enables access to an underlying capability of the device which may be implemented using a patent-encumbered technology (some underlying DRM).

TBL: They came to me and asked about syntactic conformance (just matches Web IDL grammar). I said "no", we need syntax and semantics here

PLH: Touch events is a bit tricky. IE doesn't implement.

YK: What's the status of the patents

PLH: There is a PAG.

YK: Aren't we moving to pointer events?

PLH: Yes, but there are deployments of Touch Events so they want a 1.0 REC

YK: But not likely to get much attention in the future? Then I don't care about Web IDL so much.

AVK: But I care, because we do it with Web IDL

AR: What's the issue?

<plh> http://w3c-test.org/webevents/tests/touch-events-v1/submissions/Nokia/idlharness.html

YK: Should touch events become a REC if not describable by suitable IDL?
... Given that Web IDL is part of the spec, we don't have interoperable implementations of touch events

DA: Is this a TAG issue? Unconvinced.

YK: The broader issue is whether Web IDL semantics are important when considering interoperable implementations. I think we agree: yes.

PLH: Anything else in the remaining 4 minutes?

HT: I want to know where we are on Polyglot

PLH: HTML WG is publishing it.

HT: Henri had an objection

PLH: I think it's moving forward
... As long as someone is willing to do the work it will move forward.
... On other things: RDFa went to REC a while ago; microdata is going to Note
... We're doing the same with the Authoring spec

NM: We had a clear, negotiated agreement with the HTML WG chairs that the authoring spec goes to REC. Please check the record from a couple of years ago.

DA: I agree with Noah

PLH: I will check

<plh> http://www.w3.org/2013/09/html-charter.html

<ht> http://www.w3.org/blog/news/archives/3253

PLH: The HTML WG got a new charter yesterday with two new things: 1) a new dual license which can be used for extension specifications
... 2) the DOM4 specification was moved into the HTML WG (also announced yesterday) and can get the CC BY license

AVK: I am concerned that the CC BY license is not GPL-compatible.

AR: I'm unhappy discussing document licenses for software

<plh> Plh: an example is: Is Anne willing to edit the DOM specification in the HTML WG under the new dual license?

<plh> Anne: no, because it's incompatible with GPL

TBL: There are some people in this discussion who have taken the trouble to be quite pedantic about it. The notion that GPL would be incompatible with CC BY is a bit pedantic.

AR: You can't drill on this without asking who would sue whom and why? Who owns the rights? If these are rights granted to the Free Software Foundation, they may be litigious. I think it's less likely that W3C or Mozilla will. There are multiple potential ways to solve this. One might be a covenant not to sue.

PLH: We published a FAQ. Our legal people believe that in this particular case your use of CC BY is compatible with GPL.

YK: How do you link a spec.

AVK: Given that FSF considers it incompatible...

<ht> http://www.w3.org/Consortium/Legal/IPR-FAQ-20000620

<plh> http://www.w3.org/2013/09/html-faq.html

<plh> http://www.w3.org/2013/09/html-faq.html#gplcompatibility

AR: Henri Sivonen makes an argument, of which I'm not convinced, that if you are going to automatically generate help or error text that extracts from the specification and winds up in code, the code is then potentially not GPL-clean.
... I believe that a covenant should resolve that concern.
... I do understand that this doesn't, in the general case, resolve the question of whether CC BY leaves you GPL-clean

TBL: Definitely not just help text. It's also generating parsers from BNF, etc.

NM: I agree with Tim

DA: The question is: FSF and Creative Commons say it's incompatible and W3C says compatible; are we getting them together?

PLH: Harder than that.

AR: FSF has their own interpretation that they can enforce with respect to the products they control

DA: Thank you Philippe

Philippe leaves

TBL: This is about people doing things on principle. Principle is important, and important things follow from principles. Viral licenses are complicated to design and are tweaked occasionally. The fact that FSF has come to this conclusion is a failure of the GPL license. Compatibility with CC BY should be straightforward and should be a requirement for GPL.

AVK: Yes, some of it is principle. I would like my work to be in the public domain, much as Tim got the ideas and software around the Web to be given away.

AR: CC0 is NOT public domain. Having a license and being in the public domain are different things. In US law, governments can put things in the public domain but individuals can't.

SK: But laws typically say that the owner has all rights to do anything.

DA: Which jurisdiction?

SK: I guess I'm speaking of Russia, but that's based closely on Berne convention and should apply at least to most of Europe.

AR: ...and US
... What can the TAG do? Ask the US Library of Congress to let W3C put work into the public domain.

TBL: But our members don't want to put our work into the public domain.
... I did not get the Web into the public domain. I got CERN to agree not to charge license fees. The work I did here is MIT-licensed and copyright MIT, similar to BSD and CC BY.

AR: So key was?

TBL: As I recall, not to charge royalties.

<ht> Yehuda -- yes, this is the "Authoring Spec": http://www.w3.org/TR/html5-author/

<ht> "HTML5: Edition for Web Authors"

<slightlyoff> big deprecated warning at the top?

<slightlyoff> "This document has been discontinued and is only made available for historical purposes. The HTML specification includes a style switcher that will hide implementer-oriented content."

<ht> It's a Note, whereas Polyglot (http://www.w3.org/TR/html-polyglot/) is still on the REC track

<ht> But the link to the Authoring spec. from the charter says "The HTML Working Group will complete work on the following existing deliverables of the group:"

<ht> ???

<annevk> timbl: http://www.w3.org/Policy.html

<annevk> timbl: "The definition of protocols such as HTTP and data formats such as HTML are in the public domain and may be freely used by anyone. -- Tim BL"

<ht> Anne, yes, but as AR and SK said, _saying_ that is an indication of author's intent, but it doesn't, legally, make it so IIUC: only the US gov't can 'make' something public domain wrt US law, I believe.

<annevk> ht: this was back in Switzerland I suspect

<annevk> ht: either way, CC0 covers the US gov case


<ht> Here is the (relatively) famous analysis of DRM effect on Vista and chip design: http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.html

<slightlyoff> Scribe: slightlyoff

API design

DKA: sergey, you volunteered the topic, perhaps you can talk a little bit about it?

<twirl> https://github.com/w3ctag/api-design-guide/blob/master/API%20Design%20Guide.md

SK: in my opinion, the design guide should serve 2 main goals
... 1.) to be a guide for folks who design APIs
... 2.) to build a guide for API reviews
... the second is guide for the TAG

YK: should non-platform builders trying to design things use the guide? what about Ember?

SK: what I've written is very general
... it should primarily be focused on the web platform

YK: if it can't be useful for library developers, it'll probably fail the idiomaticness test

DKA: my sense is that we should be focusing primarily on the work of spec developers

YK: I'm saying this is an acceptance criteria

AVK: "idiomatic" changes over time
... what AWB does is different in style from what jQuery does

SK: there are some bits that are specific to our field, but hopefully it will be relatively general

<twirl> http://www.w3.org/TR/api-design/#basics

SK: it's not a guide, per se, but it includes good IDL design notes

<wycats_> AR: Prescriptive text should be more fluid

<wycats_> AR: I would like there to be a more general design guidelines

<wycats_> ... my sense of Robin's guidelines is that they are good tactical advice but not a broad sense of the landscape

(thanks wycats_)

<wycats_> ... we need something that helps people navigate the landscape

<scribe> Scribe: slightlyoff

SK: I think it should have 2 parts
... something to help folks designing interfaces and more general guidance. It could be 2 documents.

YK: some of this stuff is going to need to change, e.g. the advice about globals will change in the light of ES6 modules

DKA: don't we ant to build looking forward to that future?

YK: if we do that, there's a group in TC39 that's refactoring the existing platform in terms of modules and classes, and they'll need to be roped into this effort to make it forward-looking

<wycats_> AR: I feel like that is an attractive thing to want to do, but it's maybe too soon

<wycats_> ... since we don't have a lot of design expertise with modules

<wycats_> ... and no consumers among working groups yet

<wycats_> ... I would like to focus on the design challenges that we've observed

YK: new specs will want to use modules in 6 months

AVK: I'm not sure

YK: that's my sense of the timeline

DKA: my main concern for this work is getting started on what we can get consensus on
... I'd like to have something meaty by the end of the year

YK: if we can get it done by the end of the year, the current outline seems good
... it sounds like there's a lack of consensus

DKA: until there's something concrete, perhaps we should be talking about what we should say in this document

YK: I think there are some areas where we should be explaining to people how to use modules to help get them over the hurdle of using them

AVK: there's a problem with shipping

YK: slightlyoff is right…there's a transition problem
... we need to provide both a back-off strategy and someplace to do things with modules in the interim

AVK: once things ship in 2 browsers, I think it'll be "game on"

SK: the platform is evolving continually. Every 6 months there will be some changes that will cause spec authors to need to revise their work
... modules are not blocking this
... the TAG has a role: questions about whether or not to use modules _will_ be directed to the TAG, so we'll be on the hook for providing advice; both the current state and how to think about the future
... I don't know if we should write down the current state in this guide, how to cope, and how to think about designing in the future. Perhaps we should have 2 parts? something general that doesn't change frequently and a part that's more tactical: how to use modules, promises, etc.

<wycats_> Just to be clear, I am not saying that everyone will be using modules in 6 months, but that new specs will want to use them in 6 months

SK: are we agreed about the main goals of the guide?

[agreement]

SK: I have a plan for the relatively general part
... I'd like to present it and collect your feedback, and if we agree, I'll start working on the general part of the guide
... API developers should be explaining what tasks the API should solve
... how are those tasks solved in other platforms/APIs (prior art)
... and there should be some rationale and exposition about what to borrow and why to leave behind
... the other question is: what are the use cases?
... I've reviewed 3 APIs before this meeting: web audio, web animations, and push notifications
... all presented me the same problem
... I don't know what the common use-cases are and the spec doesn't provide direction about where to find them
... I think that's a major problem with these specs
... I have to find out how they're used and why the current solutions aren't used

AR: it might be good for spec authors to have that sort of text for their own sake, but it doesn't seem like it'll be a shortcut for reviewers

SK: I agree, I do still need to review and check the background
... when you're reviewing the spec, you need to come to an independent view
... so it's good to have the list to see if the spec authors were hitting the right issues
... for example web audio: it's intended for use in a browser, but it has no mechanisms for working with audio more than a minute in length
... it's important to have the use cases both for the folks doing review and for the folks doing the specs
... the designer should be reviewing the spec at checkpoints to make sure they align with the major design guidance
... else things may drift in scope
... I'll write this down with extended examples
... both good and bad

AR: how do we imagine this being used primarily? In coordination with us? independently?

DKA: my though was that folks who are building specs would use this as an input

AR: so it needs to stand alone and be self-supporting

DKA: agree

SK: we don't want to be going to each spec designer and having to explain the guide

AR: I think that raises the risk for advice that has a sell-by date

YK: yeah, promises, streams, globals

<wycats_> AR: I think the right way to do this is to extract advice out of our interactions with working groups

<wycats_> AR: otherwise we'll find ourselves with a pile of good but eventually contradictory statements that will not hang together

(thanks wycats_ )

DKA: is this a traditional TAG finding?
... or is this different kind of document?
... TAG findings have publication dates and errata
... or should this be a living doc that only lives on github, etc.

NM: it could even go to REC

DKA: but if the ground is shifting quickly....

YL: we need a clear way of updating it and marking some advice obsolete

DKA: agree

SK: I think this document will be tested when we continue to do reviews
... and we can come up with determination about how it should live as a result

DKA: what can we pull out of our work with WebAudio? WebRTC?

wycats_: there's a chicken-and-egg problem with Alex's approach: it won't help us spread new technology that should be broadly used

AR: agree

SK: next chapter

AR: what do we want to do about it?

YK: I think we're going to have to take a bit of a leading role

SK: the second level that should be explained in a spec is defining the levels of abstraction
... what are the data structures?
... how abstract is it?
... what UI does the spec provide?
... how do the spec objects interact with each other?
... so Web Audio becomes clear: the data structures are very low level and it doesn't have UI

AR: how does this help us or a spec author get clarity on whether or not these are the right choices to make?

SK: it's pretty basic that the levels of abstraction should be outlined. If the abstraction levels are defined, things will compose well

AR: but does this provide practical advice? How does noting the level of abstraction turn into guidance?

SK: I've thought a lot about this….

YK: do we think this sort of document will help folks in the wrong headspace?

SK: I do think it'll help. It won't be a magic bullet, but it will help clarify

PL: it'd be nice if we had a clear description of where the levels of abstraction really ARE in the platform. Seems like something the TAG should provide.

YK: agree. That seems like the first thing we should do here

PL: I think a lot of us have pictures of it in our heads, but we haven't enunciated it

YK: if you're crossing layers, at a minimum there should be an API

PL: I think we should define this model

DKA: that seems like a separate part of this document that needs to be written

YK: yeah, this was the thing that Alex and I had in mind and we should talk about it more
... [ need offline discussion ]

DKA: if you don't have the bandwidth, perhaps you can review?

YK: yeah, need to find time

DKA: you agree with the basic formulation of the plan? draft and let sergey edit?

[ agreement ]

SK: defining the abstraction levels is the hard part

YK: I think there's a growing consensus
... I'm optimistic that we're more on the same page than not

SK: are there written results?

[ no ]

YK: I think the extensible web manifesto is one work product. Trying to explain the platform in terms of those broad ideas is something we need to do

[ breaking for lunch ]

DKA: after lunch, we have wendy seltzer joining us to discuss privacy and security
... after that we can loop back to this

lunch

(back at 1pm EST)

<dka> trackbot, start meeting

<trackbot> Meeting: Technical Architecture Group Teleconference

<trackbot> Date: 01 October 2013

Security & Privacy

<dka> Scribe: wycats_

<wycats> Scribe: wycats

Wendy (WS): Security and Privacy expert

WS: Here to talk about where society is going with regard to security and privacy

DKA: We had some ideas about what we can talk about

1) What could we (TAG) be doing with regard to the government snooping situation?

DKA: It's not our role to wade into the politics
... but the question of securing the web is something that is clearly in the realm of the TAG and Web Architecture
... so could we be giving more guidance about the use or non-use of "security" technologies

2) Publishing and Linking

scribe: is there any action that W3C is planning to take with amicus briefs, or anything where we can provide background?

<slightlyoff> thanks wycats

3) Some top-level thoughts on "let's make a deal" application security model

DKA: 4) any input into the API design guide around privacy and security
... 5) What should we be thinking about

WS: Let's sketch out what T&S is doing right now
... and we'll see if there are architectural questions that TAG can help with
... to influence a "secure and trustworthy web"

DKA: Let's start with (1) "government snooping"
... Alex, can you outline your security proposals?

AR: I have the benefit of leaning on much smarter people
... specifically Adam Langley, who works on SSL in Chrome
... I would trust AGL with my private data
... I asked him: (1) should the TAG weigh in
... and he said yes
... (2) what specific things can we do around SSL?
... (a) Perfect Forward Secrecy
... (b) strong keys
... (c) cert pinning and OCSP

(http://en.wikipedia.org/wiki/Online_Certificate_Status_Protocol)

scribe: (d) removing weak TLS versions
... (e) Strict Transport Security headers

(http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security)

scribe: he doesn't believe that pinning is something most sites can do
... cert pinning is really hard
... OCSP pinning is a performance optimization
... OCSP is a response to the problem of revoking certificates
... browsers can look up in a global list for revoke certs
... high lag / low compliance

YK: issues with captive portals
... (The previous discussion was about CRLs)
... OCSP is an improvement on CRLs

(http://en.wikipedia.org/wiki/Revocation_list)

scribe: OCSP is blocking
... therefore sucks
... OCSP pinning is a way to provide a response to the OCSP question as part of the handshake

YK: Is there a solution for OCSP and Captive Portals?

AR: We need to never trust captive portals on the browser side
... AGL thinks both kinds of pinning are too hard for most publishers
... because OCSP pinning requires relatively frequently updating your certs and maintaining strict security around the key material

YK: Google does pinning

AR: Yep
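For context, public-key pinning is at its core a hash comparison: the client hashes the server's SubjectPublicKeyInfo and checks it against a stored set. The sketch below is illustrative only — the function names and the pinned value are hypothetical, and real browser pinning operates on the whole certificate chain during the TLS handshake:

```python
import base64
import hashlib

# Hypothetical pinned value (base64 SHA-256 of a DER SubjectPublicKeyInfo);
# a real deployment would ship pins for its actual keys plus backup keys.
PINNED_SPKI_HASHES = {"r/mIkG3eEpVdm+u/ko/cwxzOMo1bk4TyHIlByibiA5E="}

def spki_pin(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")

def pin_matches(spki_der: bytes) -> bool:
    """True if the presented public key matches one of the pinned hashes."""
    return spki_pin(spki_der) in PINNED_SPKI_HASHES
```

Pinning the key rather than the certificate lets a site rotate certificates without rotating pins — which is also why AGL's caveat bites: it demands strict control of the pinned key material.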

HT: I assumed there was an informal consensus that SSL itself isn't fit for purpose

AR: Let's try to be more specific
... the issue with SSL is trusting the root signer
... you have to trust the root signer for a long time and you have to update them sometime
... there are economic forces that drive prices down
... it's legitimate to have a varying view of their (roots of trusts) compliance efforts

HT: I hit cert errors from sites I have reason to trust so I blow them off
... and I think it's a universal experience
... which means certs aren't useful

AR: (f) Crypto all the things

DKA: Website authors may tell people to skip cert errors

HT: UofE does this

YK: This is really an issue of self-signed certs

AR: The advice is "spend a couple hundred bucks"

TBL: MIT says to install the MIT root cert

<Yves> install... but what to do when it's revoked?

AR: PFS can be implemented without high costs
... we need to get more browsers to implement it
... it would be great if they did
... IE doesn't support PFS
... it's an option in the SSL handshake

(scribe suggests reading the wikipedia page for more information -- cannot scribe all the technical details)

<dka> http://en.wikipedia.org/wiki/Perfect_forward_secrecy

AR: Proposal: We should advocate that major publishers should provide it and all browsers should implement it

TBL: How do you tell someone to do it

http://stackoverflow.com/questions/17308690/how-do-i-enable-perfect-forward-secrecy-by-default-on-apache
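For the record, enabling PFS on Apache mod_ssl comes down to preferring ephemeral (EC)DHE cipher suites. A minimal sketch along the lines of that thread — the cipher list is abbreviated and illustrative, not a vetted recommendation; consult a current best-practices guide before deploying:

```apache
# Disable broken protocol versions and prefer the server's cipher order.
SSLProtocol all -SSLv2 -SSLv3
SSLHonorCipherOrder on
# Put ephemeral (EC)DHE suites first so sessions get forward secrecy.
SSLCipherSuite ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:!aNULL:!eNULL:!RC4
```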

DKA: This really all gets to Wendy's discussion about UI

WS: The user research that I'm aware of is that users are terrible at any question about security

YK: What about the "zomg don't enter this site" malicious modal dialog

AR: I'm looking into it

SK: I think over 90% don't follow it

AR: We've made it more and more onerous over time

HT: There is at least one page that I don't know how to get past

AVK: There are issues with captive portals

PL: Is there a standard for captive portals

YK: There is a proposed status code

TBL: Can we (TAG) kill captive portals

YK/AVK: No

AR: We can maybe limit them to reserved IPs

PL: We need a solution before HTTP status codes

TBL: Which adds problems for cert errors

<wseltzer> [the captive portal problem... how many AP devices provide the bulk of these?]

should we change the OS UI when you're behind a captive portal?

YK: explains the generate_204 solution

(http://www.chromium.org/chromium-os/chromiumos-design-docs/network-portal-detection)
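The generate_204 approach boils down to probing a URL known to return an empty 204; anything else means someone is intercepting. A minimal sketch of the classification step (the labels and function name are illustrative; Chrome's real detector has more states):

```python
from typing import Optional

def classify_probe(status: Optional[int], body: bytes) -> str:
    """Classify connectivity from a probe of an endpoint that should
    return HTTP 204 with an empty body.

    - No response at all: offline.
    - 204 with an empty body: the open internet is reachable.
    - Anything else (a 302 to a login page, a rewritten 200): something
      between us and the endpoint is intercepting -- a captive portal.
    """
    if status is None:
        return "offline"
    if status == 204 and not body:
        return "online"
    return "captive-portal"
```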

AVK: There are also issues with timeouts

DKA: Captive portals make people oblivious to security errors

TBL: Captive portals violate web principles
... it makes HTTP meaningless

<slightlyoff> AGL advises that the data here is pretty fresh: http://www.cs.berkeley.edu/~devdatta/papers/alice-in-warningland.pdf

(http://tools.ietf.org/html/rfc6585)

WS: There isn't "one owner" of this problem
... there are many interactions
... you can't solve it with one solution

<slightlyoff> Click-through rates for badware/malware are low, click-through rates for SSL warnings are sadly very high =(

WS: but maybe an architectural solution can work across the board to solve it cooperatively

slightlyoff: how do users perceive the difference

<slightlyoff> IIRC, different colors, but I'd have to go dig up the HTML

DKA: Work item Proposal: Recommendations for Captive Portal Owners

YK: You might be able to design a good captive portal

PL: Yes, you can for example block all ports but 80 and 443

TBL: They can look at the user agent

AR: Key strength is in flux
... we can count on governments and non-commercial actors being able to do 2^80 computation
... being able to break 1024 bit keys now or at least soon
... we need to take 1024 keys off the table
... browsers will warn
... (for 1024 keys)

AVK: What's moving faster - ability to generate keys or ability to crack them

Yves: Weak ciphers can also render the keys useless

HT: If you're paranoid you may believe that government agencies may have cracked 1024-bit keys with better than brute force

<slightlyoff> was looking for this earlier....: https://www.ssllabs.com/downloads/SSL_TLS_Deployment_Best_Practices_1.3.pdf

YK: It may still be worth protecting ourselves from others than the NSA even if the NSA cracked RSA

TBL: The TAG could just say "don't use 1024 bit keys"

AR: And we can say "don't use the null cipher"

TBL: We can ask the validator suite to add validation for sane SSL practices

AR: And a name-and-shame list

<Yves> https://www.ssllabs.com/ssltest/

AR: Strong Versions
... 1.1 and 1.2 are the state of play, people should not be using something else
... don't use TLS <= 1.0 or SSL <= 3.0
... Strict Transport Security
... http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security
... Crypto All the Things
... Public services should all be HTTPS
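Strict Transport Security, in that list, is just one response header; a small sketch of parsing its directives (the header value shown is illustrative):

```python
def parse_hsts(header: str) -> dict:
    """Parse a Strict-Transport-Security header value, e.g.
    "max-age=31536000; includeSubDomains", into its directives."""
    directives = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Valueless directives like includeSubDomains become True.
        directives[name.lower()] = value or True
    return directives

parse_hsts("max-age=31536000; includeSubDomains")
# → {'max-age': '31536000', 'includesubdomains': True}
```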

Yves: What about caching

HT: HTTP's design assumes the web only works because of caching

YK: Public caches are such bad actors that you may wish to use SSL just to opt out of them

HT: The W3C servers were being brought to their knees by poorly written proxies that were requesting namespace documents
... IBM said bandwidth was cheaper than writing a cache

<slightlyoff> HSTS background: http://www.chromium.org/sts

YK: The mixed content warning has helped with getting SSL support in CDNs

AR: TL;DR We need more crypto
... the expectation that the web's traffic is mostly unencrypted is "an invitation to be embarrassed later"

DKA: What other unintended consequences are there

(scattered discussion)

<slightlyoff> heh

YL: People tend to use SSL and/or port 443 for new services to avoid proxies messing around

<Yves> especially interception proxies

WS: UI issues
... in some ways we have steered away from recommending UI because browsers see it as a competitive advantage
... but maybe we really do need to make some recommendations now
... so users have a consistent mental model of what good behavior is vs. bad behavior with regard to warnings
... and have a hope of making better decisions about security
... rather than throwing up our hands and letting users choose browsers that make decisions for them

<wseltzer> "warning fatigue"

YK: Why are self-signed certs still emitting warnings
... Why not "no padlock, no warning"?

AR: You could imagine this
... the only real value is that it makes the work of hackers more difficult
... it also puts encrypted traffic in a different bucket

(if we think this warning is useful, why not yell at the user when they use HTTP)

(surely a self-signed cert is no worse than unencrypted HTTP?)

<wseltzer> (how do you help the user differentiate between seeing a self-signed cert for well-known-site and one for his own/friend's site?

HT: I have my own self-signed cert -- what's the additional vulnerability if I don't install the self-signed certs?

YK: Strict Transport Security may help

AR: It gives you temporal security

TBL/AR: Man in the middle is still easy

("easy")

TBL: I like being able to opt into trusting a specific self-signed cert

YK: I think you are in the top 0.00001% of the mental model of the situation

TBL: Teenagers using facebook understand friends of friends
... the only major difference is UI

<wseltzer> "we have bad security heuristics"

WS: This is a very hard set of problems
... what can we do to chip away at it?
... how can we think about the different elements of the threat model?
... for example the pervasive passive adversary vs. the adversary targeting you vs. the adversary targeting a type of communication
... for example crypto all the things helps with passive collections
... and a warning for self-signed-certs interferes with "Crypto All the Things" which helps with passive adversaries
... we now know that there is much more of the passive adversary threat
... so maybe Crypto All the Things is a high priority
... as IETF liaison I'm thinking more about what orgs have what responsibilities

<slightlyoff> http://infrequently.org/13/crypto-all-the-things.jpg

AR: What is IETF thinking

WS: Crypto All the Things

AR: Good Crypto All the Things

WS: Also PFS to avoid passive collection and future cracking

YL: What about DNSSEC?

<annevk> Crypto All the Things Strongly? -> CATS!

<Yves> DANE

<Yves> within IETF

<slightlyoff> https://datatracker.ietf.org/wg/dane/charter/

YK: I want to remove the padlock and the warning for self-signed certs

AVK: File a bug

YK: Sounds good
... I don't think enlisting users to yell at webmasters is a generally effective strategy

TBL: Maybe the browser should send requests to a website so it shows up in their error logs

DKA: Should we have a work item?

AR: Let's get SSL experts as collaborators

DKA: we have some good starting points there

<dka> q

YK: Let's make a starting document that isn't controversial

Sergey: what about expired certs?

AR: No. We need to warn people
... the idea is to create a class of certs that aren't meant to imply protection from MitM
... what about expired self-signed certs

YK: People aren't willing to warn on plain HTTP traffic, so anything better than HTTP shouldn't warn either, unless the server expresses intent to have a padlock show up

DKA: Let's rope some people in
... like your friend

ACTION Alex to start writing some text for security recommendations with AGL's help

<trackbot> Created ACTION-831 - Start writing some text for security recommendations with agl's help [on Alex Russell - due 2013-10-08].

(scattered discussion about spy vs. spy icon)

DKA: What should we be thinking

WS: We're focusing on (a) DNT Header (b) Privacy Interest Group
... we don't have formal security reviews

http://www.w3.org/Privacy/

http://en.wikipedia.org/wiki/Do_Not_Track

YK: Is there interest in doing security reviews?

WS: The IETF does it

YK: Why doesn't the W3C do it?

DKA: unknown
... TBL?

TBL: The TAG hasn't recommended it
... at first, it seemed like lip service
... but there have been a few times where I felt it was useful
... so I would be happy to do it

YK: I'm suggesting that there is a formal security review of specs

(as opposed to the lip service "security considerations" section)

TBL: We do this already for accessibility and internationalization
... so maybe we should do it for security

YK: I didn't enjoy doing this for JSON API

<wseltzer> bureaucratic hassle--

http://www.iana.org/assignments/media-types/application/vnd.api+json

TBL: Would have been nice if they had to write security considerations when they wrote SMTP

WS: Please submit any documents that you want wider reviews on to the IG or the Workshop
... there aren't yet any specific plans for the Workshop

<wseltzer> http://www.strews.eu/ (EU Project)

API Guide

SK: Part III: Defining Object Responsibilities

<dka> trackbot, start meeting

<trackbot> Meeting: Technical Architecture Group Teleconference

<trackbot> Date: 01 October 2013

SK: This stage can help find design problems
... IV: Object Interface
... this stage helps ensure consistency with the names used in the details of the APIs
... this feeds into the Platform-Specific guide

DKA: Maybe we can discuss this in terms of an API like Web Audio

SK: I had a push API review that we could use for this

AR: I looked at it
... it seems like a good way to analyze OO APIs
... what about layering?
... how do these things relate to markup?

YK: We asked for example how Web Audio related to <audio>

AR: We should keep these things front and center
... because it can be easily lost when you're focused on a particular layer

YK: You could imagine an HTML form of the push API

SK: Push API doesn't have any elements
... it may be problematic to add elements
... there are lots of questions about how all this should work
... I cannot make assumptions about how this should look

YK: Not every tag has a visual representation... it's just our declarative tool of choice

AR: What else?

YK: What, if any, new capabilities does this proposal introduce? If none, what capabilities is it described in terms of?
... If there isn't a direct JavaScript API for something for performance, what is the rationale?

DKA: What platform invariants exist? Does this proposal violate any?

YK: We know of raciness, but also things like not leaking same origin
... And doing things like Fetch and Service Workers may make it easier to avoid breaking Same Origin because there's a JS model for what's happening

DKA: What other WGs should we be reaching out to?

YK: I think we should review the Web Components family of specs

ACTION Alex to invite Dmitry to present about Web Components

<trackbot> Created ACTION-832 - Invite dmitry to present about web components [on Alex Russell - due 2013-10-08].

ACTION Yehuda to write some text about capability layering

<trackbot> Created ACTION-833 - Write some text about capability layering [on Yehuda Katz - due 2013-10-08].

ACTION Sergey to start fleshing out an API review document

<trackbot> Error finding 'Sergey'. You can review and register nicknames at <http://www.w3.org/2001/tag/group/track/users>.

ACTION Сергей to start fleshing out an API review document

<trackbot> Error finding 'Сергей'. You can review and register nicknames at <http://www.w3.org/2001/tag/group/track/users>.

ACTION twirl to start fleshing out an API review document

<trackbot> Created ACTION-834 - Start fleshing out an api review document [on Сергей / Sergey Константинов / Konstantinov - due 2013-10-08].

<slightlyoff> Scribe: slightlyoff

<annevk> https://etherpad.mozilla.org/tagzipurls

<annevk> for notes

YK: consensus is that we need bundling for ES6 modules
... AVK came up with the idea that if you want to represent the file inside a zip, use the fragment identifier

Zip URLs

YK: ...but then AVK noted that there's an issue with it and we need to work it out.

<annevk> plinss: can you project http://wiki.whatwg.org/wiki/Zip#URLs maybe

YK: the platform uses fragments to navigate *inside* resources (generally), but this now changes it to move fragments into part of the resource being identified

TBL: you shouldn't feel constrained to what HTML fragments do

YK: everyone's happy with fragment semantics
... when you're writing HTML that's RELATIVE to another file (say from a filesystem), it's easy:
... you do <a href="../thinger.html">
... now imagine the cluster of files lives inside a zip file
... it doesn't appear that ".." can be relative to something inside a fragment.
... the inclination was "surely this happened in XML...maybe that was solved"
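YK's point is easy to check with standard RFC 3986 resolution, e.g. via Python's urllib (the zip URL and file names below are made up): the base URL's fragment is discarded before a relative reference is resolved, so ".." escapes the zip entirely.

```python
from urllib.parse import urljoin

# Fragment-based addressing of a file inside a (hypothetical) zip bundle:
base = "http://example.com/assets/app.zip#inner/page.html"

# Standard resolution drops the base's fragment first, so the relative
# reference resolves against /assets/app.zip -- not the path in the zip.
urljoin(base, "../thinger.html")
# → 'http://example.com/thinger.html'
```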

HT: the media type defines the semantics of fragment identifiers

[ inaudible ]

AVK: if I get an HTML file with its own media type, and now the question is "what's the URL of the document"?
... is there an "outer URL/inner URL" boundary?

<dka> Noting we are also working on a capability URLs doc: https://github.com/w3ctag/capability-urls - and Yehuda and I have talked over lunch about a possibility of a "URL best practices for web applications" doc… Now we're discussing Zip URLs…

HT: this occurs in multipart-mime as well

TBL: now imagine an address space where you have fragments for things that are named by bits of a fragment

HT: there's a base URI for the virtual installed location for the whole package

<dka> Also, see http://www.w3.org/2001/tag/doc/IdentifyingApplicationState for what the TAG has said previously on hash URIs in web app context.

HT: and relative URLs are relative to the package

YL: what about a Service Worker in the zip file that can do the resolution?

TBL: my suggestion was to use "/"

[ whiteboard example of <a href="../foo.html"> ] in a zip with #foo.html and #bar.html

YK: so is the idea that there's a virtual URL?

AVK: yes, there's an internal addressing scheme
... zip://..../bar.html

TBL: alternative design with different problems
... http://..../foo.zip/something/a.html

AVK: how does this work?

TBL: 3xx redirect from the unresolvable URL to foo.zip

[ we need something that doesn't require server support ]

TBL: this has nice properties (natural URLs, etc.)
... requires both server and client support

AVK: the problem with this is that we've learned that deploying anything at the HTTP layer is REALLY hard. Even CORS headers are a tall ask.
... this is negotiation style...it's a lot harder

YK: it's trivial to do in rails
... you need to get it in apache
... in general we reject solutions that require .htaccess files

AVK: if we disregard fragment, and we disregard the redirect solution, the other ways to do it...
... you can have nested URLs: outer URL with new scheme

<wycats> I am unwilling to disregard the fragment solution

AVK: there are some nasty problems; URLs are now a stack, zip URLs require changing URL parsing, etc.

TBL: historically that sort of thing has gone down rather badly
... you might have had "zip:" and then a URL encoded URL

AVK: [ notes that FF has "jar:" ]
... the 4th solution is to have some sort of sub-path
... some new sort of separator
... I proposed "$sub=" which would be very unique
... everything after "$sub=" is for local processing
... it works for polyfilling too

YK: I think whatever rules you'd sub in for stuff after "$sub=" you should be able to use in the hash solution

AVK: nooo....

TBL: can I put slashes in the post "$sub=" area?

[ yes ]

HT: what about a generated URI for the contents?

YK: can you say that the fragment is part of the base URI for that scheme?

AVK: what scheme? there's no scheme for the fragment

[ discussion about the relativeness of the mimetypes and the address-bar duality ]

AVK: the "$sub=" only affects the URL parser
... what the IETF specifies isn't what we have
... "$" is illegal...it's reserved

YK: not use "$"?

AVK: people use illegal characters =(

[ backing up ]

PL: what's the basic usecase?
... is it something that's always going to do processing on the client? or are we hoping for smart servers that can collaborate and send the right things?

TBL: when people put things on the web as zips, now we've closed them off

[ TBL articulates an example where the server might only want to send individual files, not the whole zip ]

[ discussion about server assistance and what gets sent to the server ]

PL: the reason I asked is that I can see utility for both versions
... we might want to do a version that supports both
... with combinations of smart/dumb clients/servers

YK: for a dumb server, you need to send the extra info in a separate header

PL: I'm suggesting HTTP extensions

YK: you still need a new separator

PL: if that's our goal, it rules out fragments and new schemes

<wycats> I am warming to the % solution

HT: can I present a different solution?

<wycats> as a new URL separator that means "don't send this to the server, but it's part of the base URI" (and maybe the rest is sent in a header)

HT: suppose we start at the following point: somebody requests the package, and it's got the relevant content-type.
... the client is going to get a directory listing and what you'll get in the URL bar is something with zip://example.com/package.zip/

(and you get a list of URLs in the bundle)

HT: zip://example.org/package.zip/foo.html

TBL: doesn't have enough information! No delimiter

HT: we said we want this to work: http://e...o/package.zip#foo.html

TBL: you don't know where to break it up

[ YK and AVK discuss the public/private URL merits ]

YK: I'm looking at the % solution: send only the base to the server

AVK: % isn't popular because you can't polyfill it. 400 on apache and issues on IE and IIS
... naked "%" is interesting as it's an extension to URL parsing
... nobody can be using it

YK: polyfilling is about what old *browsers* do, not old servers
... the fragment has the same problem

AVK: yes

[ discussion about how to make it work in old clients ]

TBL: there's always been the problem that when you introduce a new mimetype, there's download vs. understand

YK: in the ideal world, the old browser would send the request and the server would have a chance to send the right file
... you could unpack the file on disk next to the real zip

AVK: we need a delimiter, and we need something so unique that it won't conflict in data URLs

(etc.)

YK: is that important?

AVK: useful for feature testing

YK: make a blob!
... we need to find a char that's supported by old browsers

<Yves> See semicolon https://tools.ietf.org/html/rfc3986#section-3.3

<Yves> [[ For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. ]]

TBL: old browser, new server...

YK: we could say the new semantics are "%", but that "$sub=" is equivalent

HT: what about "|"

[ can't use ';' ]

YK: need something illegal that no browser sends or doesn't occur in the corpus of the web

<annevk> Yves: ; and = work fine in URLs today

<annevk> Yves: people use = in the query string for sure

[ discussion about unpacking for compat ]

AVK: we need to require a mime type
... otherwise sites that upload zip files become vulnerable to attacks

<Yves> something like /foo.zip;sub=root/blah.html would be usable, even if illegal characters might be in use on the web today

AVK: if they host zip files which have HTML in them, they are subject to XSS via the HTML inside the zips

PL: does zip content need to be a different origin?

YK: can't be

TBL: do you want to say within a site that some bits are different origins?

YK: polyfilling is an interesting constraint that can help us tie-break...so what's in the tie?
... TBL hates "%"

AVK: "%!" would work, but you can't just have "%"

YK: so semantics are, [base]%![location in zip]
... the base is sent to the server
... and you send a new header with [location in zip]

PL: an old client w/ an old server, would pass the whole thing to the server and 404
... old client / new server might do the right thing

YK: the reason I want a different separator is that right now we have 2 semantics: thing to be sent and thing to be reserved
... but it turns out you really want both

PL: what I like is that you send the second half in an additional header

TBL: why is "?...." weird?

AVK: it's the query string....

TBL: right, servers lop it off and send the zip file

[ discussion of servers throwing away stuff after "?" for static assets ]

YK: sort of works and is semantically reasonable....empirical question
... IRL, it already has these semantics
... how does ".." resolve with "?"?

<plinss> ‽

[ it jumps behind the "?" ]

<annevk> plinss: if only URLs were not bytes

TBL: my requirements are that I should be able to link within, without, and out with an abs URL

<dka> ¿

YK: "%!" is appealing

AR: what's the case against "%!"?

AVK: doesn't work in IE...IE doesn't parse the URL

YL: changing URL parsing on the server is impossible

<dka> ¶

<dka> ∑

AVK: most of the solutions require that sort of change

<annevk> bytes people, URLs are bytes

<plinss> ?!

<annevk> :-)

<dka> £

<annevk> http://example.org/zip^_^image.jpg

TBL: if I wasn't so enamoured with clean specs, what I'd do is send things, look for 404s, and then look for the zip on disk

AVK: "%!" fails the OldIE test

YK: what about "$!"?

[ discussion of ".." vs. "?" ]

Yves: +1

AVK: so this can't happen after the query string?

[ yes ]

AVK: there are download services that treat the full URL as the address

[ relative URLs vs query strings ]

[ what about "$!" ? ]

<timbl> [ discussion of ".." vs. "?" ∀ values of ..]

AVK: haven't looked at that

YK: has to not fail on old IE

AVK: and has to work with data URLs?

YK: seems like a good tiebreaker?

AVK: at what level in the URL parser does this enter?

YK: URL parsers need to see it as part of the delimiter

[ data: questions ]

AVK: data: urls do have query strings
... we could do that, but then you don't get the zip out of it

AR: love that you're gonna be base64 encoding the zip ;-)

YK: turns out that the limits stick us with a new delimiter that works in old browsers...preferably not alphanumeric

TBL: the architecture of URLs is that there are these chunks...
... things that are being sent to various parties but not others...lots of stuff is built on this...and worried that this is new

YK: doesn't break infrastructure because new browsers won't be sending broken stuff into the wild and old browsers might, but that's the price for supporting them
... thought there was a binary choice between sending and not sending to the server, but it turns out there's a union option
... proposal is a new URL delimiter
... [before]$![after]
... [before] is sent to server
... [after] is sent in a new header
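A minimal sketch of that split, assuming the "$!" delimiter under discussion (the function and its return shape are illustrative; no header name was settled in the session):

```python
from typing import Optional, Tuple

DELIMITER = "$!"  # the separator proposed in the discussion

def split_zip_url(url: str) -> Tuple[str, Optional[str]]:
    """Split a URL on the proposed delimiter: the part before goes on the
    wire as the request URL; the part after would travel in a new header."""
    before, sep, after = url.partition(DELIMITER)
    return (before, after if sep else None)

split_zip_url("http://example.com/app.zip$!img/logo.png")
# → ('http://example.com/app.zip', 'img/logo.png')
```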

[ what about query strings? does it break 'em? ]

PL: should be part of the query string if it's going to break them

[ discussion about ".." relative to "?" ]

[ polyfill case ]

YK: it really is a new semantic, so you actually need new syntax

HT: you're not gonna get it until HTTP 2

YK: why not? this is a client behavior

[ seems old browsers send "$" ]

(encoded or not?)

AVK: not encoded in gecko

YK: wanting something that's a delimiter and doesn't send stuff to the server means we need to change URLs

AR: agree

TBL: what about "/$/" ?
... it'll 404 on an old server

YK: if it's not used on the web, seems fine
... do we agree on the shape of the solution?

HT: would like to see something that thinks about what it means to add containment to the ontology of client/server interaction

[ that's what this is ]

TBL: one of the things about having a "/" is that...if you have lots of icons, you could just 303 to the zipfile anyway

HT: I do want the nesting..and we're not going to get that = \
... I want a way to say I want you to start the whole resolution over again...mounting a zipfile as a filesystem analogy

YK: I think that's what this delimiter proposal does

TBL: you want the base URL to be somewhere else?

HT: yeah

YL: new clients will have separate logic for redirects inside zips...need to identify the main content of the file

(HALP, scribe error)

TBL: the whole TAG is labouring under the burden of getting a change to apache...trying to get mimetypes added...etc.
... apache doesn't change until IETF stability...then a long lag

<Yves> server http://www.example.com/foo.zip/bar.html returns a Content-Location of http://www.example.com/foo.zip#bar.html; new clients will recognize that the "container" is a zip and fetch it for further resolution

TBL: it'd be nice to have a zip-aware apache module...

YK: I think that'll happen, but we should also support low-fidelity modes

TBL: you could unwrap..

(inaudible)

HT: agree that this covers all the viable proposals...the idea that you need a new delimiter
... a mechanism whereby a new syntactic signal strips a part of a URI and puts it in a header...that seems like the only thing that talks about nesting containers

DKA: what's the proposed TAG output?

[ none ]

DKA: are we sure?
... a summary? a note?

<timbl> If you use a /$/ then 1) new clients can just fetch the zip & DTRT. 2) old clients will work if EITHER the server has been hacked to be smart and 303 to the zip file OR someone has unzipped the thing into a directory called $

YK: a note seems good

timbl, agree

[ agreement about a note ]

ACTION wycats draft a note regarding constraints and decisions regarding zip URLs

<trackbot> Created ACTION-835 - Draft a note regarding constraints and decisions regarding zip urls [on Yehuda Katz - due 2013-10-08].

<timbl> If you use a / then 1) new clients cannot just fetch the zip so they may have to use an algorithm on old servers, but on a new server they get a 303 2) old clients will work if EITHER the server has been hacked to be smart and 303 to the zip file OR someone has unzipped the thing into a directory called $

AR: do these compose? is parsing specified as left-to-right?

YK: yes

PL: yes, it should just work

AVK: you need to repeat the delimiter

[ yes, of course ]

PL: one of the things I was considering...smart client and server let you compose arbitrary URLs with these delimiters and the server might be able to help you zip things up with arbitrary paths
... these delimiters should be equivalent to a slash for caching, etc.

TBL: that's why I like "/"

PL: right, but I need a way to signal that this one is special

TBL: analogous: instead of having a domain name in the wrong order, thought through inverting domain names... /org/example/...
... would have been extra work, but nice for lots of reasons
... would like this to be clean in that way
... an appealing property

YK: the syntax isn't expressive enough to help a smart server do the Right Thing™

PL: the protocol about what sort of response I get from what sort of server needs to be specified

AVK: for now we don't need to do the server part

YK: we need to say that there's a header

HT: no 303 needed...it's a mediatype solution

YL: you need a container type

PL: if the server only sends one resource, it needs to send the right mimetype

[ discussion about navigating around inside a zip ]

HT: does apache give you header-based hooks?

PL: rewrite rule

HT: so a vanilla apache will work...don't need to rev the server to get this to work...a sysadmin can do this

TBL: right. There are 2 levels. The .htaccess level...not obvious how easy that is to make it work
... can't look at the file tree since there isn't one

HT: uncovered a difference in media type expectations: does this work with any zipfile?

TBL: I'm hearing a requirement that zip files be usable without hassle

AVK: that was the plan but it's not clear we can do that security-wise
... new problems...cache-manifest issue

AR: can we solve this with cors?

[ no, same-origin all the time ]

YK: need a new opt-in

SK: do we care about encrypted zip files?

[ maybe? ]

YK: what about the username/passwd URL features?

AVK: no, but we need to talk about the format...what do we support
... and there is some work on the fetch side about how long things persist, are cached, etc.
... [inaudible] pinning []

AR: ? ?

AVK: you need to keep the zip alive for some amount of time
... don't want to throw it away quickly

HT: what we're looking for is a "super slash"
... what we want it to mean is "this is something that redirects...to understand this you have to unpack to proceed"

TBL: I prefer a clean hierarchy
... "//" was a mistake
... it should have been /com/microsoft/.....
... there's level-breaking here...things that aren't really hierarchies

HT: that's what they're proposing...assimilating it to the notion of slash is reasonable...we can't use "/" as such....

[ missed it...sorry ]

HT: TBL is saying "/" will work

YK: you can't disambiguate

HT: how does the client know?

TBL: it gets a redirection

HT: example.org/foo.zip/index.html

[ discussion of 303 solution ]

TBL: pushing back...why not the server solution?

YK: in my dayjob I do write servers, and while I'd love for servers to be the answer, it's not tenable
... e.g., github pages

TBL: a sad case

YK: millions of things on it!

AVK: shouldn't have taken mimetypes out of the body :(

[ argument about archeology ]

(whimsical debate about polyglot zip/html)

YK: what about the mimetype? sniffing?

AVK: sniffing doesn't support all file types

<annevk> http://fetch.spec.whatwg.org/#zip-resource-types

AVK: you can't sniff css, html, xml, ...
... so you use the extension
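An illustrative sketch of the extension-to-type mapping AVK describes; the table below is a made-up subset for illustration, not the actual list in the fetch spec draft linked above:

```python
# Illustrative sketch only: map a zip member's file extension to a MIME
# type, since CSS/HTML/XML can't be reliably sniffed. The table is an
# assumption, not the fetch spec's real list.
import os

EXTENSION_TYPES = {
    ".html": "text/html",
    ".css": "text/css",
    ".js": "text/javascript",
    ".xml": "text/xml",
    ".png": "image/png",
    ".svg": "image/svg+xml",
}

def member_mime_type(name):
    """Guess a type from the extension; unknown extensions get octet-stream."""
    ext = os.path.splitext(name)[1].lower()
    return EXTENSION_TYPES.get(ext, "application/octet-stream")
```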

AR: you need to not have a manifest...it imperils the streaming

YK: surprised and nervous about .png being different inside a zip file
... what if it's a GIF?

AVK: image rendering system ignores the mimetype already...no worries

YK: if you take a bunch of content and zip it up, want it to behave the same
... want the manifest to help me avoid weird behavior

AVK: we could, but I don't want to

AR: type comes through use

YK: existing content may be relying on the quirks
... and may break if you zip it up

TBL: the server could do a request to itself and bundle up responses

AVK: we should consider the complexity budget

YK: but we're adding a new thing...a new semantic
... the extension dominating is something we've never had before
... and this has new semantics

AVK: the alternative is a manifest

YK: these semantics are new

AVK: so is zip
... sniffing is not sensible, it doesn't cover all bases

AR: what actually breaks?

AVK: plain text in an iframe...what should it do?

TBL: different behavior for same file at two locations

<annevk> SVG without a MIME type won't work in <img>

SK: we're losing content type and encoding and we can't detect the charset

AVK: you can

YK: we try hard, but it's not 100%

DKA: we have a system like this...widget
... it has its own URI spec for internal resources

HT: why aren't you using multi-part mime?

[ charset, text/plain discussion ]

<dka> Just for your bedtime reading, the Widget family of specifications: http://www.w3.org/2008/webapps/wiki/WidgetSpecs

thanks!

<dka> Widgets URI scheme: http://www.w3.org/TR/widgets-uri/

<ht> TBL: Whatever determines how the server would serve a document (wrt Content-type and charset) should also be what a server will include in a 'zip' file

AR: crazy idea: let the service worker have access to unzipping APIs and let it do this

<ht> AR, That works best if we take TBL's suggestion and just use /

TBL: similar problem in another space...serving some things up raw...tables in the data area

ht, true

TBL: and we mostly don't use FTP servers...we just use HTTP servers

AR: what about subsetting? only a few mime types allowed

[ no ]

TBL: manifest is appealing because you can introduce new mime types on old servers

AR: perhaps zip won't work?

YK: not sure...

[ SVG and image ]

TBL: suppose we say the zip mimetype, there's this set of extensions that map to mime types

<ht> I hear two positions: No manifest, but some combination of sniffing and context-of-use will determine (most) media types, vs. manifest (at the beginning)

[ that's annevk's proposal ]

YK: you can't just zip things up
... now you have to fix in extensions

[ discussion about constraint of zipping up a dir ]

[ we don't have locale to fall back on ]

AVK: I'm ok with a manifest...stuff that's not in the manifest will be application/octet-stream

<ht> Actually, it was revisiting the point wrt "zipping up a dir" that relative URIs should continue to work

YK: so css has to be in the manifest?

AVK: yes

<ht> W/o a manifest, no-one has come up with a way to guarantee charset
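A sketch of the manifest-or-default rule AVK concedes to, with a charset carried alongside the type (addressing HT's charset concern); the manifest shape and names here are assumptions, not a proposed format:

```python
# Hedged sketch of the manifest rule discussed: the manifest maps member
# names to a (type, charset) pair; anything absent from the manifest is
# served as application/octet-stream. The manifest structure is assumed.
def type_for_member(manifest, name):
    """Return the Content-Type for a zip member per the manifest-or-default rule."""
    entry = manifest.get(name)
    if entry is None:
        return "application/octet-stream"
    mime, charset = entry
    return f"{mime}; charset={charset}" if charset else mime
```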

AR: nail in the right-click argument's coffin: ordering for the right-click case is likely to be very bad, making things slow
... in that case, why not multipart mime?

YK: hrm....
... we need to arrive at a format that lets us have multiple HTTP documents with their headers

HT: you can zip a multipart mime file

<ht> or, as others seem to prefer, gzip
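A rough sketch of this multipart alternative using Python's stdlib email package: each part keeps its own Content-Type, and individual parts can be gzipped via Content-Encoding while others stay uncompressed, as HT notes. The bundle shape is an assumption for illustration only:

```python
# Sketch (assumptions, not a proposed format): bundle several documents
# into one multipart/mixed message, each part with its own Content-Type,
# optionally gzip-compressing individual parts via Content-Encoding.
import gzip
from email.message import EmailMessage

def bundle(parts):
    """parts: iterable of (filename, maintype, subtype, body_bytes, compress)."""
    msg = EmailMessage()
    for name, maintype, subtype, body, compress in parts:
        data = gzip.compress(body) if compress else body
        msg.add_attachment(data, maintype=maintype, subtype=subtype,
                           filename=name)
        if compress:
            # Mark the part we just attached as gzip-encoded.
            msg.get_payload()[-1]["Content-Encoding"] = "gzip"
    return msg
```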

not sure why I try to participate

<dka> Another red herring for the fire: http://en.wikipedia.org/wiki/Web_ARChive

<dka> And another: http://en.wikipedia.org/wiki/Webarchive

<dka> One more: http://en.wikipedia.org/wiki/Mozilla_Archive_Format

I was going to say that it'd be really nice to have an API that deals in terms of WS Response objects and a map

<timbl> Due to the temporary shutdown of the federal government,

<timbl> the Library of Congress is closed to the public and researchers beginning October 1, 2013 until further notice.

<timbl> Who can keep a copy of the LoC in case the USG goes down? Wikileaks?

<timbl> http://www.loc.gov/home2/shutdown-message.html

<ht> ar, what changes if you move 'down' to the HTTP Response layer? What's better there?

<ht> I guess you get the Transfer Encoding vs. Charset Encoding distinction. . .

ht: I don't think a ton...except that we have this nice API symmetry

AVK: now we need to define the format if we use multipart-mime

TBL: isn't there content-disposition?

[ lots of discussion about zipping, multipart mime...etc. ]

<ht> AR: I don't want to zip the images in a bundle

thanks ht

sorry for failing at the scribing

<ht> Various: multipart-mime allows that some parts are zipped, and some aren't -- use content-encoding: zip/gzip/...

calling for help in filling in the last 20 min

<dka> adjourned

\o/

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2013/10/10 13:14:52 $
