Technical Architecture Group Face-to-Face -- 03 Apr 2012

Web APIs and security

JAR: I know what Adam Barth is up to, and what the SES people are up to

Noah: Robin and I exchanged email.

Noah: My previous experience was that web apps have been less successful than I had hoped. I got this music catalog through the post, with "ios rocks" in the music catalog -- an amazing list of things people are building with native apps. Like a mixing board with an iPad dock and the software on the iPad. This needs a proprietary hardware connector, for example. 6 channel streaming recorder, effects pedal, etc., so you need low-level access to things like DSPs

Robin: That is access to the core audio API.

Yves: Actually for audio you can do all the processing in JS: it is a question of latency why you need core audio.

<masinter> aspects of native: (a) performance (b) access to APIs (c) monetization (d) trust

<masinter> maybe (b) and (d) are related? vendors link (b) to (c) because platform vendors take percentage

<masinter> (a) performance = throughput but also latency

Noah: A risk of de-facto standardization around a particular connector

TimBL: An RDF client library has to be aware of 303s etc, so that it understands the right relationships between things. You must be able to write code which will work in trusted and untrusted apps: write once, run anywhere. When running as a script (untrusted) rather than as, say, a Firefox extension (trusted code) it must be the same RDF library. In extension mode, it's omnipotent; otherwise, in script mode, it is very constrained. But the API must be basically the same. If you violate the cross-site-scripting attack checks in a browser, at the moment, there is no error code, no error message, so it is really hard to invoke code conditional on, and in adaptation to, the untrusted environment.

TimBL: When I tried this there was no error code or exception. I raised it on the list. The response I got is "it's really important not to give a response, so the app can't phish to find out what's possible. And trusted apps are not a goal."

TimBL: Two points: 1) I think it's distressing to have a system that doesn't help you debug; 2) the system has to be capable of running in a trusted mode where you're sure you'll get some kind of response, either success or error
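A minimal Python sketch of Tim's two points, not taken from the meeting. It assumes a hypothetical `make_fetcher` standing in for XHR, and hypothetical `example` URLs; browsers at the time did not report the failure this way. The same library code runs in both modes, and a blocked request raises an explicit error the library can adapt to, rather than a callback that never fires.

```python
class CrossOriginBlocked(Exception):
    """Explicit signal that the environment refused a cross-origin request."""


def make_fetcher(trusted):
    """Return a fetch function for a trusted or an untrusted environment."""
    def fetch(url, page_origin):
        same_origin = url.startswith(page_origin + "/")
        if trusted or same_origin:
            return f"response from {url}"
        # Untrusted mode: fail loudly instead of staying silent.
        raise CrossOriginBlocked(url)
    return fetch


def load_document(fetch, url, page_origin, proxy=None):
    """Library code shared by both modes; falls back to a same-origin proxy."""
    try:
        return fetch(url, page_origin)
    except CrossOriginBlocked:
        if proxy is None:
            raise
        return fetch(f"{proxy}?uri={url}", page_origin)


origin = "https://app.example"        # hypothetical page origin
target = "https://data.example/card"  # hypothetical remote document
print(load_document(make_fetcher(True), target, origin))
print(load_document(make_fetcher(False), target, origin,
                    proxy="https://app.example/proxy"))
```

Because the refusal is an exception rather than silence, the library can choose a fallback, and a developer debugging it can see why the request failed.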

JAR: Seems analogous to Noah's point about access to the hardware port

Noah: Yes, except the port access stuff may be harder to make portable if the ports are proprietary and different

<masinter> air & phonegap (http://phonegap.com/) are examples of 'native app' development tools which allow writing apps as both webapps and native

TimBL: A lot of people have assumed there will be shades of gray between fully trusted and untrusted apps. It seems some people feel the middle ground may be too hard to work out. The architecture which is emerging has only the two extremes.

JAR: Do you, Tim, agree that the middle ground is unlikely to be worth working toward?

TimBL: Seems like a research project. I'm interested in the TAG's position on the question: should APIs always give good responses for both trusted and untrusted apps?

jar: The question is, is there any middle ground between the completely trusted and untrusted app. Orthogonal question, can you design APIs which work in either situation? There are two general approaches on the table, from 50km view. One approach is origin case, origin(module) defines power of module, links to CORS design and Adam Barth's academic work. The other design is you get power by being passed it as a parameter. This is a 30 year old ACL vs Capability argument, we should not get into it now. People are polarized. In Tim's example, using XHR, you are saying 'here is the URL' and later getting a response callback, or your callback just doesn't happen in the bad case. In the origin case, you would use the origin of the module to decide whether to authorize the delivery of the error code to the XHR caller.

Robin: The origin is the one -- the HTML page -- which involved the js, not the actual URI the js was loaded from, which is irrelevant/not tracked

Robin: The js is not namespaced -- anything can put callbacks on anything, no boundaries.

jar: JavaScript's kind of like Java -- The aim is "write once, run anywhere".

NM: Well, Java has a pretty elaborate class loader model that's pertinent to how Java code is loaded and gets privilege

jar: java security was a disaster -- based on call chain -- like the origin system

jar: in the capability method, you have a param you can pass which gives you the right to do things and you pass it to the library
... there is intense pressure to make js apps work and access things for which you need privs
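A minimal Python sketch, not from the meeting, contrasting the two approaches JAR describes. All names (`fetch_by_origin`, `make_fetch_capability`, `library_load`) and the `example` URLs are hypothetical. In the origin style, authority is looked up from who the caller is; in the capability style, authority is an object the caller holds and passes to the library.

```python
class AccessDenied(Exception):
    """Raised when a request is outside the caller's authority."""


# Origin/ACL style: the check consults a list keyed by the caller's origin.
ALLOWED_ORIGINS = {"https://app.example"}


def fetch_by_origin(caller_origin, url):
    if caller_origin not in ALLOWED_ORIGINS:
        raise AccessDenied(f"{caller_origin} may not fetch {url}")
    return f"contents of {url}"


# Capability style: whoever holds the returned function may use it.
def make_fetch_capability(allowed_prefix):
    def fetch(url):
        if not url.startswith(allowed_prefix):
            raise AccessDenied(f"capability does not cover {url}")
        return f"contents of {url}"
    return fetch


def library_load(fetch, url):
    # The library has no ambient authority; it can only use what it was given.
    return fetch(url)


print(fetch_by_origin("https://app.example", "https://data.example/x"))
data_only = make_fetch_capability("https://data.example/")
print(library_load(data_only, "https://data.example/x"))
```

In the capability version the grant can be narrower than the whole app: only the code that was handed `data_only` can fetch, and only under that prefix.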

Dominique Hazaël-Massieux joins the meeting

dom: a lot of the topics you have been discussing may be very relevant to what I will present

jar: personally, I find this the way to think about it -- it is a question of privs and to whom they are granted. There are more than two priv levels -- in fact there are many levels -- it might have access to the net but not the core audio for example. In fact there are questions of to what inside the app it is granted -- not the whole app, as now.

<Zakim> darobin, you wanted to point out that there is some possibility for APIs in the grey areas as well; point out new work; different design for trusted APIs and to say that SES does

<Zakim> noah, you wanted to talk about shared libraries

Robin: many things to say

Robin: 1) w3c has sent out announcement that it is looking into new work for system level APIs -- see member only https://lists.w3.org/Archives/Member/w3c-ac-members/2012JanMar/0057.html

dom: The device API meeting is open and discussed this

robin: When you design APIs which work inside the browser security model, the API looks very different from something done with full trust access. There is investigation of new work area for APIs specifically for completely trusted APIs

Robin: 2) even if we take the very simple binary on/off trust, there is some room for grey area.

Robin: The example for the XHR where you want to not give error messages

Robin: In firefox, you double-click the tab and it makes it an installed app.

tim: really?

Robin: Concept of installed apps. They don't have to have high-level apps, you can give them specific privs -- greater local storage, system notifications, getting error messages could be some

Robin: [Who said this? RB, tutti to check] Most of what I wanted to talk about can go into DOM's session

Robin: [Who said this? RB, tutti to check. JAR thinks Dom said this.] SES are not a solution to the trust issue

jar: you mean security

Robin: They intermesh -- SES allows you to bring in 3rd parties which operate inside a limited space without access to each other

Robin: [Who said this? RB, tutti to check. JAR guesses Dom] All policy based systems which don't plug the XSS hole are really threatened by that hole

jar: SES doesn't give you a notion of what things [principals] have what authority in running code - you have to say, (like in Powerbox etc. and ongoing work) how you [? collect the query - (missed)]

TimBL: This isn't just about trusted apps. I use the same code, server side, and on the command line, including for test harnesses. I want all that to run my AJAX code. This needs to be part of normal computing. So, it's not just trusted and untrusted apps in the browser, includes things like node.js on the command line and server side.

<darobin> http://www.phantomjs.org/ -> PhantomJS, run a browser on the command line

<darobin> https://github.com/tmpvar/jsdom -> JSDOM, emulation of a browser environment in NodeJS

TimBL: Also... when you download software modules from different people and representing different people, we'll need the concepts of agents running on behalf of completely different entities. We will have to surface remote entities as first class principals within the system. Like having Adobe have an account on my system with the privilege to update its own apps.

TimBL: If I install a bunch of stuff like /application/microsoft, I'm willing to give Microsoft certain rights to e.g. update code in that part of the space. I'd like to know what rights I'm giving them. I think the origin represents this legal entity in an obviously broken way. Maybe some Un*x systems will go some way toward associating origins with such points in the filesystem trees.

jar: I think you have to specify the granularity of the grant of authority -- is it object, function, program, etc

ashok: how can I as a user give this authority to an app?

robin: unsolved problem.
... Policy of a rathole to fall into. [? RB to check]

jar: what about Powerbox?

robin: later
... My personal take [? RB to check] is a hard-to-get-through process you can't do by mistake.

dom: at the moment you can buy stuff on the web with no review

Noah: We are out of time.

Web Applications: Security and Web Applications Permissions

<noah> ACTION-344?

<trackbot> ACTION-344 -- Jonathan Rees to alert TAG chair when CORS and/or UMP goes to LC to trigger security review -- due 2012-03-27 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/344

Noah: Note the change in order from the published agenda, this brought forward from 10:00 today

<dom> http://www.w3.org/2012/Talks/dhm-tag/

Dom: [ presents a talk of 16 slides]

Larry: Isn't monetization also a driver for native apps?

dom: Phonegap is addressing that, but I don't think it is the biggest driver
... We also are looking at payment in the W3C headlights process
... [edits slide 2 to add Monetization]

<darobin> FYI I proposed an approach to modularisation for features, but there was no interest: http://w3c-test.org/dap/proposals/request-feature/

jar: For privacy with camera, how about confinement?

dom: basically impossible

jar: confinement being limiting the ability of the app to send any data back home

dom: Interesting to explore this approach though
... [slide 6]

robin: recommend panopticlick (http://panopticlick.eff.org/)

anon1: [identity suppressed for privacy reasons] I see [from Panopticlick]:

Your browser fingerprint appears to be unique among the 2,119,594 tested so far. Currently, we estimate that your browser has a fingerprint that conveys at least 21.02 bits of identifying information.

[discussion of fingerprinting details]

<Ashok> Hmmm... I got exactly the same message from Panopticlick

<darobin> http://www.mozilla.org/en-US/b2g/ -> The B2G Project

<darobin> https://www.tizen.org/ -> Tizen Project

<darobin> http://www.w3.org/community/coremob/ -> Core Mobile Web Platform CG

<masinter> http://tools.ietf.org/html/draft-ietf-geopriv-dhcp-lbyr-uri-option-14

<jrees> RSA Conference 2011 - Making Security Decisions Disappear into the User's Workflow - Alan Karp http://www.youtube.com/watch?v=POA8SLCT5EY&noredirect=1

<masinter> http://www.w3.org/2010/api-privacy-ws/

<jrees> here's Karp's tech report http://www.hpl.hp.com/techreports/2009/HPL-2009-341.pdf

_______

Robin: this is how web intents basically works. You have a service page which defines an action, like Pick. Picking a set of contacts from the addressbook for say sending an email. The service definition declares what it can do, pick a set of contacts. Then the user agent registers this service, modulo user input (?TBD (chrome people think it can be registered without UI)) RB to check

Robin: You then have a client page. Suppose you have a game -- you don't give the game full access to the entire addressbook. You just want access to it to be given to a set of people. The client page says "Start activity... pick contacts" and includes a button which the user must press on. This pops up a dialog to choose which service. Then it instantiates the service page on the side [in an iframe?] and you pick your contacts, and the contacts are returned to the original page. RB to check

jenit: Who defines which fields are actually transferred?

robin: The service page just gives a URI identifying the action, and one identifying the "type" which is a random semantic-free parameter used just as a filter
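A minimal Python simulation, not from the meeting, of the flow Robin describes. It is not the Web Intents API: `UserAgent`, `register`, `start_activity`, the action URI and the address book data are all hypothetical, and the two callbacks stand in for the user's choices in the browser UI. The point it illustrates is that the client only ever receives the contacts the user picked.

```python
class UserAgent:
    """Stands in for the browser: registers services and mediates requests."""

    def __init__(self):
        self._services = {}  # (action, type) -> {service name: service}

    def register(self, action, type_, name, service):
        self._services.setdefault((action, type_), {})[name] = service

    def start_activity(self, action, type_, choose_service, choose_items):
        # "type" is used purely as a filter, as in the discussion.
        candidates = self._services.get((action, type_), {})
        if not candidates:
            raise LookupError(f"no service registered for {action} / {type_}")
        name = choose_service(sorted(candidates))  # user picks a service
        return candidates[name](choose_items)      # user picks the items


ADDRESS_BOOK = [
    {"name": "Ann", "email": "ann@example.org"},
    {"name": "Bob", "email": "bob@example.org"},
]


def address_book_service(choose_items):
    """Service page: shows the whole book to the user, returns only the picks."""
    return choose_items(list(ADDRESS_BOOK))


PICK = "http://intents.example/pick"  # hypothetical action URI
ua = UserAgent()
ua.register(PICK, "contacts", "Address Book", address_book_service)

# Client page (the game): it never sees the full address book.
picked = ua.start_activity(
    PICK, "contacts",
    choose_service=lambda names: names[0],
    choose_items=lambda contacts: [c for c in contacts if c["name"] == "Ann"],
)
print(picked)
```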

______ [break] _____

Dom: Right now, policy considerations are all at the browser level -- ability to access a website is granted indefinitely

Ashok: You asked about geolocation -- that uses policy?

dom: In a browser-dependent way - it all depends on whether user has granted it many times, etc.
... this is all left to the browser

Ashok: Not to the user?

Dom: For geolocation...

Ashok: Can you as a user author a policy?

dom: you can revoke access for a given website. There is no UI for it, there is no policy API in the web browser. In device APIs, we really did explore that space quite a lot. They could see the long term value but ...

dom: Some talk about having a generic application-wide policy, like CIA want to prevent location ever being available

dom: [slide 9/16]

Robin: The ideal is for the user to make an informed decision without thinking 0.5 ;-)

jar: what info is not sensitive?

larry: sometimes a problem is one person giving away info sensitive to another person. Like mentioning their name and email in the same sentence.

larry: [Who said this? LM, tutti to check] We can't just restrict heis [?] talk of privacy to a user's own information

larry: When geopriv talked about privacy policy, they ended up with an API extension which insisted on passing a policy and a timeout with every API call

dom: I am not saying this is the solution, I am just pointing out what is out there

larry: Consent, opt in and opt out .. we should look hard at the assumption that assent helps.

ht: I am personally in the "always run virus check" mode for anything I install, as I had a terrible experience with a bad download once.

ht: there is nothing comparable on my phone which allows me to look at any web app, look at the JavaScript, and figure out whether it is a bad one.

<jrees> http://www.veracode.com/ ???

Noah: Different -- for viruses you just look for the signature of particular hacks; for js in general, you can't just look at the code

tim: Codepath tracing is getting pretty sophisticated, and maybe in the future you might be able to

robin: There is a crowd-sourced database of known bad web apps

timbl: Some kind of "Nutrition Facts" for what apps do for you would be a great addition to the add-on store, as it would remove the "after that a free-for-all" problem.

Noah: Different users might care about different things

robin: The android UI is generally regarded as horrible

<darobin> http://i.imgur.com/JWEII.jpg -> screenshot of the Android permissions dialog

tim: Maybe with some rethinking it could be better, particularly if it makes promises about what the app will do rather than talk about the low-level access the app is allowed.

larry: We don't have a vocabulary for trust. I would like to see use cases; we have stories that we should collect together, then analyse them.

tim: We've been doing that within MIT for 10 years. Anyone who tries to make an algebra of trust is making a big mistake. They don't match the real world. Trust systems have to connect to the real world, and therefore have to be a semweb application. I want to be able to say that my coworkers can access something, that the DIG blog could be commented on by friends of friends or who had attended a particular conference. I don't want to have a Google Circle to drag them into. You have to connect trust to reality, which is what the semantic web does.

<masinter> http://masinter.blogspot.com/2011/08/internet-privacy-telling-friend-may.html

larry: I was complaining about the word 'owner' to talk about meaning, because we don't have a good notion of identity. In order to talk about trust, you have to have a model of identity. If there's a problem defining the owner of a URI, or the namespace of individuals, perhaps we create a namespace of identity by projecting owners. You provide identity by saying which URIs they control.

Larry: maybe we could identify principals by the URI [domain names, email ids etc] they control

timbl: That's OpenID. It identifies you as the person who has write access to a given page.

robin: and BrowserID identifies you through an email address

timbl: and WebID does the same thing [URI you control]

<masinter> I wonder what is the identity of "browser vendors": product safety evaluations

Larry: This is like product safety. Cars that you can drive off a cliff aren't unsafe. There's an assumption that asking permission where people understand the permission is better than one where the permission isn't clear. Perhaps these are like product safety ratings. Are we looking for PICS extended to apps, as we talk about rating and validating?

robin: That could be done in the ecosystem, but not at this level

larry: The stuff about which apps get into the app store

robin: If you have a policy-based system; that's the question we have to ask first

dom: Out-of-band curation is one possible approach. I think we'll see multiple approaches. There isn't a shared understanding within the WGs about what will work for the Web

robin: Or what the stories are, what the problem spaces are, what the terminology is

larry: The stuff about origin is also a matter of trust. A matter of brand. I trust my bank, and things I download from my bank. Brands give you trust

dom: I agree that origin is related to brand

larry: There's something about PICS we don't want to repeat, as it didn't succeed. But we can't avoid it by just saying we're not going there

dom: I don't think any of us know where exactly we're going. I think the TAG, as involved with cross-group, cross-technology issues, should be helping: to identify terminology, to identify experts

larry: I'm trying to map out the space: brand, trust, rating, authority. Finding others who have mapped out the space, and adopt the framework

noah: We've often said that the TAG should work in this space, but not found someone to do it

robin: I would like to do this work. The first step, which might lead to further work . . . would be to agree on some terminology. Which is currently chaotic. It would be very helpful for cross-group understanding

noah: Who else would we have to involve?

robin: From B2G project, from the Trident project

noah: Could we do that without starting a Community Group?

robin: maybe a TF? I'd avoid a CG because it can be hard for members to join. I'd prefer a TF, separate from www-tag

ashok: A Finding on this would be wonderful. Terminology, mapping the landscape, use cases

noah: We have to work out the initial scope

ashok: I would go beyond terminology, to use cases and landscape

robin: Terminology alone won't cut it. The first success would be to get the right people talking together. Include people from Privacy IG

noah: What other deliverables?

jar: The use case list and terminology mesh very nicely

robin: I'm happy to do that, and I can get funding to do it

noah: Does anyone object to this?

JeniT: how does the privacy draft fit into this?

robin: I will need to think about whether it should be a product of the TF

noah: From TAG logistics, is it one or two things to track?

timbl: We could bank what we have, publish it as a Note. Get it out there

noah: There's a dated editor's draft available. To publish it as a Note, we'd need more sessions

timbl: We should produce something sooner rather than later

noah: We were reviewing as first draft yesterday

robin: I have a bunch of updates to make on it. I'll do another draft, and let's see what people think of it then

larry: I'd be happy publishing it to say "this is our initial work on this topic, which we will take forward". My objections were about taking it forward as a longer-term effort. In terms of RFC categories, it's not April Fools and it's not Standards Track. Publishing things early is good as long as the status is clear

noah: I'm only worried that people might take it as being something the TAG believes

<darobin> ACTION: Robin to update Privacy by Design in APIs [recorded in http://www.w3.org/2001/tag/2012/04/03-minutes.html#action01]

<trackbot> Created ACTION-684 - Update Privacy by Design in APIs [on Robin Berjon - due 2012-04-10].

ashok: How does this relate to the bigger Finding we talked about?

noah: Robin should scope that larger thing, I think we should leave it to him. Draft a product page

jar: Limited scope for Note as written. I don't see the relationship with the other

larry: I think this is more about an architecture around security and permissions

<noah> ACTION-514?

<trackbot> ACTION-514 -- Robin Berjon to draft a finding on API minimization -- due 2012-05-01 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/514

<noah> close ACTION-514

<trackbot> ACTION-514 Draft a finding on API minimization closed

<noah> ACTION-684 Due 2012-05-08

<trackbot> ACTION-684 Update Privacy by Design in APIs due date now 2012-05-08

<darobin> .ACTION: Robin to create a product page proposing the Task Force on Web Security/Privileges/Trust/etc.

<noah> ACTION: Robin to create a product page proposing the Task Force on Web Security/Privileges/Trust/etc. - Due 2012-04-17 [recorded in http://www.w3.org/2001/tag/2012/04/03-minutes.html#action02]

<trackbot> Created ACTION-685 - create a product page proposing the Task Force on Web Security/Privileges/Trust/etc. [on Robin Berjon - due 2012-04-17].

<jrees> Task force on X where X = ? some options: [Web] Privilege Grants; Web Trust use cases & terminology

URI comparison

<masinter> http://tools.ietf.org/html/draft-ietf-iri-comparison-01

A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value. For example, "%20" is the percent-encoding for the
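A short Python illustration, not from the meeting, of the triplet form described in the quoted text. `percent_encode_octet` is a hypothetical helper; `quote` and `unquote` are the standard library's `urllib.parse` functions. Octet 0x20 is the space character in US-ASCII.

```python
from urllib.parse import quote, unquote


def percent_encode_octet(octet: int) -> str:
    """Encode one octet as '%' followed by its two hexadecimal digits."""
    return "%{:02X}".format(octet)


print(percent_encode_octet(0x20))  # %20
print(quote("a b/c", safe=""))     # a%20b%2Fc
print(unquote("a%20b%2Fc"))        # a b/c
```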

<jrees> Larry and I agree that http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref is inconsistent with RFC 3986 view of equivalence

<jrees> and that therefore the strings that are called "URIs" in RDF are not really URIs
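A minimal Python sketch, not from the meeting, of the inconsistency noted above. The RDF definition compares URI references character by character, while RFC 3986 allows normalization (case-insensitive scheme and host, percent-encoded unreserved characters decoded) before comparison. `normalize` is a simplified, hypothetical helper covering only those rules; for instance it lowercases the whole authority, which is not correct when userinfo is present.

```python
import re
from urllib.parse import urlsplit, urlunsplit

UNRESERVED = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "abcdefghijklmnopqrstuvwxyz"
              "0123456789-._~")


def _normalize_escapes(text):
    """Decode escapes of unreserved characters; uppercase the hex of the rest."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)


def normalize(uri):
    """Apply a small subset of RFC 3986 syntax-based normalization."""
    parts = urlsplit(uri)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),  # simplification: assumes no userinfo
        _normalize_escapes(parts.path),
        _normalize_escapes(parts.query),
        _normalize_escapes(parts.fragment),
    ))


a = "HTTP://Example.org/%7Ejar"
b = "http://example.org/~jar"
print(a == b)                        # False: character-by-character comparison
print(normalize(a) == normalize(b))  # True: equivalent after normalization
```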

<timbl> We noted that the HTTP BIS had been changed significantly to be consistent with a non-document view of the web which it had not started with.

<timbl> over lunch

WebApps Storage

http://www.w3.org/2001/tag/2012/04/02-agenda#storage

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/111

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/113

NM: Time to get this over the last hurdles

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/114

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/112

http://www.w3.org/2001/tag/doc/Seamless%20Applications.pdf

AM: The above is a discussion document asking us to consider whether we should go in this direction

LM: We could also consider combining this with web vs. native apps topic

NM: [points us to draft product page: http://www.w3.org/2001/tag/products/clientsidestorage-2012-03-02.html]

<masinter> I think there's a strong correlation between local storage with backup (for native apps), vs web storage with caching (for web apps)

<darobin> [larry, I think that native or web is orthogonal to this problem—issues are about identifying resources irrespective of storage location, and the value of client/server synch]

NM: [takes us through the product page]

NM: Is this roughly in the right direction?

<noah> https://www.w3.org/2001/tag/doc/Seamless%20Applications.pdf

AM: My doc addresses the question of how to write apps which would run seamlessly whether connected or disconnected
... Three requirements I came up with:
... 1) What it requires when it's connected;
... 2) Minimum requirement when not connected
... 3) Where it might find those requirements

<noah> I think we need to state the relationship between identification and access when connected and when not

AM: Hints about (3) might be AppCache, IndexedDB, local file store, Web Storage
... Regardless of how the local information is found, should be accessible in a uniform way

NM, TBL: That sounds contradictory

RB: By user or by app?

AM: By app

TBL: MySQL API and filestore API are different, right?

AM: Yes, but once you access a particular resource, the API thereafter is the same

TBL: So a resource is for instance a JSON blob

Tutti: So there are two layers -- a layer of access, which is different for different stores, and a layer of utilisation of a resource once accessed, which is uniform wherever it comes from

NM: So if the store happens to be a SQL store, access might involve joins

AM: Yes

<masinter> I'm concerned about error recovery, update conflict resolution, etc. when working offline?

NM: So we don't lose the unique value of the particular storage media

AM: Right
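A minimal Python sketch, not from the meeting, of the two layers just discussed. The `Resource` class, the two `open_from_*` functions, the table layout and the sample data are all hypothetical. Access differs per store (a key lookup versus a SQL join), but once a resource has been obtained the app uses it through the same interface.

```python
import json
import sqlite3


class Resource:
    """Uniform layer: the same operations wherever the data came from."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def get(self, name):
        return self._fields[name]

    def set(self, name, value):
        self._fields[name] = value

    def as_dict(self):
        return dict(self._fields)


# Access layer 1: a key/value store holding JSON blobs.
def open_from_key_value(store, key):
    return Resource(json.loads(store[key]))


# Access layer 2: a SQL store, where access happens to involve a join.
def open_from_sql(conn, event_id):
    row = conn.execute(
        "SELECT e.title, r.name FROM events AS e "
        "JOIN rooms AS r ON r.id = e.room_id WHERE e.id = ?",
        (event_id,),
    ).fetchone()
    return Resource({"title": row[0], "room": row[1]})


kv = {"event:1": json.dumps({"title": "Review", "room": "Room A"})}
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE events "
             "(id INTEGER PRIMARY KEY, title TEXT, room_id INTEGER)")
conn.execute("INSERT INTO rooms VALUES (1, 'Room A')")
conn.execute("INSERT INTO events VALUES (1, 'Review', 1)")

for resource in (open_from_key_value(kv, "event:1"), open_from_sql(conn, 1)):
    resource.set("title", resource.get("title") + " (moved)")
    print(resource.as_dict())
```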

TBL: Does anyone understand where this is going/why?

AM: The fact is that there will be lots of different storage media

<jrees> ashok urging shared API for the objects retrieved using all the various APIs?

NM: So once I've got a JSON blob I can do another join

AM: Not talking about that
... Think of this as a calendar app
... So suppose you got the blob which is your calendar
... as you work with it, you update it
... If the app was running connected it would be working with both local and global calendar
... but if running disconnected, you have only the local resource available

NM: does this require distributed 2-phase commit?

AM: yes

AM: Once you get connected, you start transactions at both levels, back out all local-only changes, recommit them all both locally and globally, then complete the transactions

NM: That requires a lot of mechanism, to support distributed two-phase commit, and is typically not nearly stateless.

TBL: Backing up, 'access' built out of parts, or blob stored monolithically?

AM: Let's not go to complex access, e.g. joins, simpler to assume monolithic storage.

TBL: iPhoto stores [in a more complex way]

NM: I'm pushing on this because I think he's solving the wrong problem

AM: If you exploit a particular storage scheme's special properties, then you are tied to it
... but I didn't want to go there

HST: I've had this problem: you have a storage problem and an interoperability problem. You don't know what provision the different Web platforms have. I had to write different shims for the different storage facilities across the different browsers: cookies, Google Gears or whatever. That's what Ashok is trying to solve.

HST: I understand the problem AM is trying to solve, it's the fact that different platforms today support different basic offline storage models

NM: Right, that's just a matter of API design, not a problem the TAG needs to work on

AM: The problem I see is that not all the backends have transactions, which my story needs

JAR: They will

RB: localStorage won't

TBL: You can use e.g. git on top of a local filestore. . .

AM: Moving on -- if the commit described above fails, the user loses all their work

NM: Fails, or there's a conflict?

AM: Conflict, right -- that's the bad case
... Can we say anything beyond "The app has to do what it can"

NM: There is 30 years of work on this problem

TBL: Apple Sync Services requires you to declare your object type, e.g. Calendar Event
... Mostly works, but if you have conflicting values for the same field, there's a generic tabular conflict resolution display to the user
... My experience is that this sometimes happens when I can't see any difference in that display, or even when I haven't touched the app on the phone at all. . .

NM: Lotus Notes has application-specific handlers
... Default is to make two copies of the relevant unit
... Difference between deletion and creation is tricky, sometimes handled by 'tombstones', with timeouts
... so you can tell the difference between "I deleted, you didn't" and "You created, I didn't"
... Multi-person, multi-year task and then you don't get it right -- we shouldn't go there
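A minimal Python sketch, not from the meeting, of why tombstones help. It is not the Lotus Notes algorithm: `Entry`, `merge` and `live` are hypothetical, conflicts are resolved by a simple "latest change wins" rule, and tombstone timeouts are not modelled. A deletion is recorded as an entry of its own, so at merge time "I deleted, you didn't" is distinguishable from "you created, I didn't".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    value: Optional[str]   # None once the item is deleted
    stamp: int             # logical time of the last change
    deleted: bool = False  # True marks a tombstone


def merge(mine: Dict[str, Entry], yours: Dict[str, Entry]) -> Dict[str, Entry]:
    """Keep, for every key, whichever side changed it most recently."""
    merged = dict(mine)
    for key, theirs in yours.items():
        ours = merged.get(key)
        if ours is None or theirs.stamp > ours.stamp:
            merged[key] = theirs
    return merged


def live(replica: Dict[str, Entry]) -> Dict[str, Optional[str]]:
    """The items a user would see: everything that is not a tombstone."""
    return {key: e.value for key, e in replica.items() if not e.deleted}


mine = {"standup": Entry(None, stamp=2, deleted=True)}  # I deleted it
yours = {"standup": Entry("Standup", stamp=1),           # you still have it
         "lunch": Entry("Lunch", stamp=3)}               # you created this
print(live(merge(mine, yours)))
```

Without the tombstone, `mine` would simply lack "standup" and the merge would resurrect it from `yours`, exactly as if it were a new item like "lunch".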

TBL: Another route is to enforce universal undo, so you can step back one step at a time

NM: You're relying on there always being a human available to help

<noah> Right...that's my bottom line. This is the wrong problem for us to be trying to solve and, even if it were the right problem, the solutions are horrendously difficult, have been worked on for 30 years, and would be in the hands of a design/development group, not the TAG

AM: Yes, some DBs do that

<noah> I would like us to look at one particular problem: when I use an application that runs locally and potentially disconnected, to update information that we otherwise want on the Web, what is good architecture regarding identification, and what latitude should be available for implementation?

<noah> I would like to see a finding that if information is to be identified with a URI for use on the Web, then it should be identified with the same URI when accessed disconnected.

Tutti: discussion of various source control systems' approach to related problems

JAR: I agree with NM that there is a huge background wrt sync -- is that what we want to work on?

AM: Is it important for us to be able to write "seamless apps", and support others who want to do so?

RB: We are seeing a collection of offline stores being deployed, can we get in now to help exploit them responsibly?

NM: If information is to be identified with a URI for use on the Web, then it should be identified with the same URI when accessed disconnected.

AM: I asked Raman about this, wrt using GMail offline -- does the message have a URI?
... He said probably not until it gets online

NM: I'm not saying it's obvious how to do this, but it would have real value if we did
... Consider working on an airplane, writing a document and an email which points to the document, by its URI
... So that when I get online, I synchronize and the email ships
... The email should point to the document online
... This is (close to) what Lotus Notes has done for years
... This may be too hard, at least in some cases, but it is an architectural desideratum

AM: How can you have the same URI -- you're not on the Web when you are on the airplane

NM: Yes I am -- the Web is not a set of servers, it's an information space
... I suspect it follows that the apps do any necessary synchronization, not the underlying storage mechanism
... That means e.g. the JScript in GMail knows enough to create URIs in a way consistent with the way those names will be created at sync time
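A minimal Python sketch, not from the meeting, of one way an app could meet NM's desideratum. It assumes the app controls a namespace (`BASE`, a hypothetical URL) and mints collision-resistant names with UUIDs while disconnected; `OfflineClient` and the dict standing in for the server are hypothetical too. The URI written into the email on the airplane is the same URI the document has after sync.

```python
import uuid

BASE = "https://docs.example/items/"  # hypothetical namespace the app controls


def mint_uri():
    """Create a URI offline that will still be the item's URI after sync."""
    return BASE + str(uuid.uuid4())


class OfflineClient:
    def __init__(self):
        self.pending = {}  # uri -> content created while disconnected

    def create(self, content):
        uri = mint_uri()
        self.pending[uri] = content
        return uri

    def sync(self, server):
        """Upload everything under the URIs already handed out."""
        server.update(self.pending)
        self.pending.clear()


client = OfflineClient()
doc_uri = client.create("Draft minutes")
mail_uri = client.create(f"Please review {doc_uri}")

server = {}  # stands in for the online store
client.sync(server)
print(doc_uri in server[mail_uri])  # the link written offline resolves online
```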

AM: So, is all we can say application-specific architectures will exist, or can we say something overarching?

NM: Well, at least Good Practices, as above, and maybe design patterns and even maybe APIs to support them?

TBL: LDP API work relevant?

AM: Maybe

NM: I'm guessing that in practice apps would mostly do the syncing, as they do today. There might be some shareable infrastructure that emerges to help the apps, e.g. for storing URI-identified rows in IndexedDB or SQL and/or tracking updates since last connect. I don't think the TAG should spec the exact sync protocol or shared facilities. We should make statements about how URIs are used. Of course, we need to be sure that what we recommend is deployable in practice, and that it meets the intended needs.

TBL: LDP apps use a triple-based API, which is grounded in a generic store

TBL: Interaction between API and store is "fetch/store the entire store" or "delta"
... That's where sync has to happen
... So this would enable a generic approach to sync

NM: So, where do we go with this?
... We've seen AM's proposal, my alternative, and TBL's LDP example
... Not sure whether LDP is a third proposal

AM: I think the LDP story goes way beyond NM's approach

NM: So what story are we trying to tell?

HST: Do we have a client? Is anybody asking for this? Is anybody listening?

NM: Not as such -- people are building stores, but no-one has asked for our advice

JAR: I prefer RB's "Goal is to try to anticipate pitfalls and raise awareness" to the existing product page's goal

NM: Yes, if you mean high-level pitfalls, i.e. we are the T A G

RB: I have these problems today, and don't know where to look for help

NM: As long as we don't try to roll our own

TBL: Pointing to existing solution spaces

JAR: Commissioning ourselves to do a report on the problem

NM: CouchDB guys said they were building on some of the Lotus Notes work, e.g. tombstones

<darobin> http://couchdb.apache.org/

<noah> CouchDB Overview: http://couchdb.apache.org/docs/overview.html

RB: CouchDB is simple, you put JSON docs in, nothing is deleted, you access with Map-Reduce
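
[Editorial sketch, not part of the discussion: why tombstones help with sync. A delete is recorded as a new revision marked deleted, so it can propagate to other replicas instead of being undone by them. The integer revisions and the highest-revision-wins merge are simplifying assumptions; this is not CouchDB's actual revision model.]

```python
def delete(replica, doc_id):
    """Deleting writes a tombstone rather than removing the entry."""
    rev = replica[doc_id]["rev"] + 1
    replica[doc_id] = {"rev": rev, "deleted": True}


def merge(a, b):
    """Merge two replicas: for each id keep the entry with the higher revision."""
    merged = dict(a)
    for doc_id, entry in b.items():
        if doc_id not in merged or entry["rev"] > merged[doc_id]["rev"]:
            merged[doc_id] = entry
    return merged


def live_docs(replica):
    """The documents a reader sees: everything that is not a tombstone."""
    return {k: v["body"] for k, v in replica.items() if not v.get("deleted")}
```

Had the delete simply removed the entry, merging with a replica that still held revision 1 would have brought the document back.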

AM: What can we say generally?

<dom> [I'm not sure the TAG documenting Web apps sync will reach the right audience (presumably Web developers?)]

HST: I think this is a Vietnam, we should walk away

NM: Straw poll:

. . . Nothing: 3+
. . . Work towards a uniform API, maybe including sync, per AM/Product page: 0?
. . . Patterns/pitfalls: 5

NM: If we tried to do PaPi (per RB), volunteers?

RB: I'll review and advise

LM: As before

AM: Yes, I'll try

NM: I'll review
... So, clean up the Product page and get started on the work

<masinter> the product page is meta, not worth spending much time on when we can work on the document

<noah> ACTION-647?

<trackbot> ACTION-647 -- Ashok Malhotra to draft product page on client-side storage focusing on specific goals and success criteria -- due 2012-03-06 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/647

<masinter> are we commissioning a study or just a survey

LM: If we find a survey then this can be simple -- we just distil and point

<masinter> telling people where the cliffs are that they might fall off, we don't have to build the guard rails

HST: I was worried that if JAR's summary is right, i.e. that we need to do the survey ourselves, then this is too big a task

<masinter> the product page is just there to tell people the general area where we're working, don't depend on it

<darobin> .ACTION: Robin to draft scope and goals for the Patterns/Pitfalls work in local/remote storage synch

<HST> This was not done here, but was done subsequently (telcon of 2012-04-12) and is ACTION-693

<noah> ACTION-572?

<trackbot> ACTION-572 -- Yves Lafon to look at AppCache in HTML5 -- due 2012-03-06 -- CLOSED

<trackbot> http://www.w3.org/2001/tag/group/track/actions/572

NM: Adjourned until 1600, then DHM on threats and opportunities on the Mobile Web

<masinter> http://tools.ietf.org/html/draft-ietf-iri-comparison-01 should update 3986

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/112

<masinter> http://trac.tools.ietf.org/wg/iri/trac/ticket/114

<masinter> at IRI meeting last week we resolved to look at http://tools.ietf.org/wg/precis/

<jrees> on break, TimBL, Larry, JAR about whether web spec level should be separated from application level and/or social good level (?)

<jrees> maybe s/level/scope/?

<masinter> conformance vs. social expectation

<masinter> conformance doesn't require you to do things that the social expectation for normal use of the web might require you to do

<masinter> and if you want to create applications that rely on conforming properties, you might not be able to rely on the social conventions being followed

[Resuming at 1608]

NM: Mobile issues, then Admin

Mobile threats and opportunities

DHM: Two main points
... Disruptive impact coming from the Web being on so many different platforms, but also that you can build cross-/multi-platform applications
... E.g. using web app on phone so that tilting phone reorients image on a different device
... "Hyper-devices": the Web enables new use of our devices

<jrees> dom, link to blog post please?

<dom> http://www.w3.org/QA/2011/11/from_hypertext_to_hyperdevices.html

NM: Blue Spruce at IBM looked at cross-linked browsing experience

<noah> Project Blue Spruce may be of interest: http://www-01.ibm.com/software/ebusiness/jstart/bluespruce/

<darobin> http://www.readwriteweb.com/archives/ibm_blue_spruce_first_look.php

AM: WebEx ?

NM: Bluespruce is not shared desktop, but rather a coordinated browsing experience in which DOM changes are hooked, and broadcast to all participants.
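
[Editorial sketch, not part of the discussion: the "hook changes and broadcast them" idea in miniature. `Doc` stands in for a DOM and `Session` for the set of participants; both are hypothetical names, and nothing here reflects how Blue Spruce was actually implemented.]

```python
class Doc:
    """Stand-in for a DOM: node ids mapped to text, with change hooks."""

    def __init__(self):
        self.nodes = {}
        self.hooks = []

    def set_text(self, node_id, text, notify=True):
        self.nodes[node_id] = text
        if notify:
            for hook in self.hooks:
                hook(node_id, text)


class Session:
    """Relays each participant's changes to every other participant."""

    def __init__(self):
        self.docs = []

    def join(self, doc):
        self.docs.append(doc)
        doc.hooks.append(lambda node_id, text: self.broadcast(doc, node_id, text))

    def broadcast(self, source, node_id, text):
        for doc in self.docs:
            if doc is not source:
                # notify=False stops the relayed change from echoing back.
                doc.set_text(node_id, text, notify=False)
```

What is shared is the change to the document, not pixels, which is the distinction from a shared desktop.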

NM: Linkage at the level of DOM

DHM: Not sure about the architectural impact, but thought worth mentioning
... Other area is WebRTC
... Real-time peer-to-peer communication
... File-sharing as a side-effect
... WebRTC is essentially Skype within the browser
... Audio-Video comms within the browser is the driving app
... Two parts: Access to the camera, mike and audio out; peer-to-peer connection
... There is a requirement for a mediation server, but there is work at eliminating it
... There's a Javascript API defined at W3C, plus a UDP-based protocol defined at IETF
... Two phases, establishing the connection and then actually trading data
... RTCWeb is name for IETF protocol, WebRTC is the W3C API

<masinter> http://tools.ietf.org/wg/rtcweb/

<masinter> http://tools.ietf.org/html/draft-ietf-rtcweb-data-channel-00

<masinter> The Web Real-Time Communication (WebRTC) working group is charged to provide protocol support for direct interactive rich communication using audio, video, and data between two peers' web-browsers. This document describes the non-media data transport aspects of the WebRTC framework. It provides an architectural overview of how the Stream Control Transmission Protocol (SCTP) is used in the WebRTC context as a generic transport service allowing Web Browser to exchange generic data from peer to peer.

NM: Patent problems?

DHM: Pretty confident at IETF this stuff is safe
... But the codec issue is still live, as it must be common between the two peers
... Some vendors don't want MPEG4 to be allowed

LM: How far along is codec and transport?

DHM: Last week at IETF Paris chairs put pressure on getting to consensus

JAR: Doesn't this raise the possibility of peer-to-peer HTTP?

DHM: Yes in principle, but not in practise yet, but that's one of the potential disruptive impacts that's coming

TBL: I've always been interested in p2p for HTTP as a tool against censorship

DHM: At the moment the peers have to access essentially the same Web page to initiate the connection

<masinter> note that SCTP vs. SPDY was a hot discussion at IETF

TBL: Jonathan Zittrain at the Berkman Center at Harvard has a project "Mirror as you link" to develop data sharing on the Web

NM: We haven't yet proven that this approach to p2p maps to the existing uses of e.g. bittorrent

JAR: I was surprised that these two were tied

TBL: Discovery is a big complex problem
... E.g. use a distributed hash table of everyone who is looking for a connection
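
[Editorial sketch, not part of the discussion: a toy hash ring showing how peers looking for a connection could find each other under a shared key without a central directory. Node names, the rendezvous key and the class `Ring` are assumptions for illustration; a real DHT also has to handle routing, churn and untrusted nodes, none of which appear here.]

```python
import bisect
import hashlib


def h(value):
    """Map a string to a point on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, node_names):
        self.points = sorted((h(n), n) for n in node_names)
        # Each node holds only the keys that hash into its part of the ring.
        self.tables = {n: {} for n in node_names}

    def owner(self, key):
        # First node at or after the key's point, wrapping around the ring.
        i = bisect.bisect_left(self.points, (h(key), ""))
        return self.points[i % len(self.points)][1]

    def announce(self, key, peer):
        self.tables[self.owner(key)].setdefault(key, set()).add(peer)

    def lookup(self, key):
        return self.tables[self.owner(key)].get(key, set())
```

Two peers that agree on a key out of band both hash it to the same node, announce themselves there, and learn of each other by looking it up.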

DHM: There is a whole stack here, with security and encryption and so on
... Just SSL isn't good enough, to avoid man-in-the-middle attacks at the connection initiation time
... Because we don't have universal crypto-secure personal identities
... One proposal is to use mutually-trusted shared identity providers, such as Facebook, to reciprocally verify

<dom> http://tools.ietf.org/html/draft-rescorla-rtcweb-generic-idp-01

<masinter> we talked earlier about using "owner(URI)" as an identity token

<masinter> http://www.ietf.org/proceedings/83/slides/slides-83-rtcweb-3.pdf

<jrees> The link that LM entered is to a presentation "Media Security: A chat about RTP, SRTP, Security Descriptions, DTLS-SRTP, EKT, the past and the future"

<masinter> presentation from last week's RTCWEB discussing key management and RTCWeb security

AM: Isn't it easier to just encrypt the conversation?

DHM: But we don't have a deployed PK system on the Web

TBL: PK doesn't need PKI -- it can be much simpler

NM: Ray Ozzie did instant group-creation before his company was bought by Microsoft, called Groove

JAR: PKI can be decoupled from the problem, and p2p doesn't need the whole PKI as we understand it

<noah> NM: Groove uses a peer exchange of public keys to establish identities, then allows collaboration groups to be created across organizations

NM: Thanks to DHM!

Administration

RESOLUTION: Minutes of 8, 15, 22 March all approved as a fair record of the respective meetings

<noah> http://www.w3.org/2001/tag/tag-weekly#Admin

NM: Agreed in the past that we would meet 12-14 June
... in Cambridge

<masinter> does TAG have opinions about W3C process http://lists.w3.org/Archives/Public/www-archive/2012Mar/att-0007/AB_List_of_Concerns-20120306.htm ?

NM: Our end-of-summer f2f has yet to be scheduled
... I will have difficulty travelling in September or for TPAC in November
... Options include -- yet again in Cambridge, Septemberish

<JeniT> http://www.w3.org/2012/10/TPAC/

NM: Another alternative would be a weekend before/after TPAC
... although that is in Europe again
... Or without me

RB: Weekend OK but not next to TPAC

NM: Net-net -- we will wait a while before trying to schedule the next f2f after June

XML Error Recovery

RB: At XML Prague March 2012 a lot of discussion about future of XML, XML and JSON, etc.
... A panel on XML / HTML issues, chaired by Norm Walsh
... There was consensus of interest in a processing model for XML that would not halt and catch fire at first well-formedness error

JT: There would be reporting of any error recovery actions to e.g. Firebug and/or the console

RB: The advantage would be that users would not be punished for the errors of others

NM: The scoping to end-user browser scenarios seems too limiting. There are other important uses of XML, and the same XML that goes to a browser sometimes needs to be processed in other ways. Some of the recovery that's safe when presenting info to a user is dangerous if the resulting data is to be trusted as untainted in a database.

JT: Not exclusively
... Other discussions identified other use cases: editors "of necessity" go through states where the documents are not well-formed, but a tree-view is still useful
... Mark Logic has an error-recovery mode for loading into the DB
... As do some editors
... but all of that is idiosyncratic
... So the question was if we could have uniform and predictable error recovery
... across all three use cases
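
[Editorial sketch, not part of the discussion: what uniform, predictable and non-silent recovery might look like on a toy input language of bare tags and text. The recovery rules below are invented for illustration and are not those of XML5 or of any Community Group draft; attributes, comments, entities and so on are out of scope.]

```python
import re

# Bare open/close tags or runs of text; nothing else is recognised.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")


def recover(text):
    """Return (repaired_text, report); every repair is listed in the report."""
    out, stack, report = [], [], []
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() != pos:
            report.append(f"skipped unparseable input {text[pos:m.start()]!r}")
        pos = m.end()
        closing, name, chars = m.groups()
        if chars is not None:
            out.append(chars)
        elif not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Close any elements left open inside the one being closed.
            while stack[-1] != name:
                inner = stack.pop()
                out.append(f"</{inner}>")
                report.append(f"auto-closed <{inner}> before </{name}>")
            stack.pop()
            out.append(f"</{name}>")
        else:
            report.append(f"dropped stray </{name}>")
    if pos != len(text):
        report.append(f"skipped unparseable input {text[pos:]!r}")
    while stack:
        inner = stack.pop()
        out.append(f"</{inner}>")
        report.append(f"auto-closed <{inner}> at end of input")
    return "".join(out), report
```

The report is what would go to the console or an editor's problem list, and because the rules are fixed, the same input always yields the same tree and the same report.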

RB: [libxml pattern] -- same document twice gives same result
... Primary use case is in trying to deploy XML to user-facing apps
... The halt-and-catch-fire experience blows that, so browsers have started silently correcting

JAR: But we know where silent error recovery leads -- it leads to HTML5 -- the moving target aspect is really bad

NM: We can address that by publishing a TAG finding to insist on no silent error recovery

JAR: Errors have to be ugly, to put pressure on fixing them

TBL: Designing the level of ugliness is important -- the console is too well hidden -- show the warning briefly
... and allowing it to be configured to persist, for instance

RB: So that discussion led to a W3C Community Group, with Anne van Kesteren editing his earlier XML 5 draft, but the work product will not be called XML 5
... This is not going to run at breakneck speed, but will work its way along

AM: Does Mark Logic have a patent in the area?

JT: They use the schema to help, I don't know about a patent

HST: There's prior art . . .

<Zakim> noah, you wanted to comment on Robin's proposal and to discuss why use cases matter and why standardization matters

NM: The stakes go up for automatic data import
... There are gambles you are willing to take when heading for a web page that are inappropriate for importing mission-critical data which may not be used for some time. . .
... So starting with an existing algorithm w/o much inclination to change it makes me nervous

<masinter> quiet error recovery in popular browsers is more harmful than vendor prefix, but we have this with MIME type sniffing too, which is a kind of quiet fixup

<masinter> sniffing application/xhtml+xml => text/html is an automatic fixup

NM: The pervasiveness of consistent error recovery will change community expectations

RB: For me user-facing software is the key case
... But browser deployment will leak, no matter what

<Zakim> jrees, you wanted to comment on Noah's idea

TBL: RDF allows XML buried in RDF, it would be good to allow XML ER in there [?]
... Feeds with XML in can cause real problems -- RSS readers must be super-tolerant -- but we keep seeing e.g. DOCTYPEs in tweets??? TBL to correct/fill in please

JAR: So you are heading for tolerance

RB: Not tolerance, they are still errors, with well-defined recovery strategies
... The HTML situation is horrible not because of tolerance, but because the recovery rules are so complicated because the recovery heritage is so complex

JAR: This will promote a race to the bottom

RB: Is that a problem, and if so why?

JAR: There will be no selective pressure
... Drift in the correction landscape will eventually lead to meaning change

LM: Sniffing itself has promoted this by the sniffing of application/xhtml+xml => text/html
... If the popular receivers are strict, then producers will check first

RB: Indeed, and sending the same doc to different browsers with different media types makes it worse

LM: The right place to put this is in Apache and IIS, so the data that goes out is fixed

TBL: And sends a message to root!
... Whenever you have a string with two different potential readings, you have a security hole

<jrees> correction is fine but *silent* (i.e. painless) correction is a big security risk

TBL: Simple security attack example for different parsers doing different things: Tim puts up a page which he knows Larry's browser and Ashok's browser will see differently, asks Larry to OK it to Ashok, and then Ashok transfers money to Tim, as he sees a different message.
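
[Editorial sketch, not part of the discussion: the same attack shape using JSON as a stand-in, since duplicate keys are a well-known case where parsers legitimately differ. Python's `json` module keeps the last duplicate; the `first_wins` hook models a hypothetical second parser that keeps the first. The message and field name are made up.]

```python
import json


def first_wins(pairs):
    """Model a parser that keeps the first occurrence of a duplicated key."""
    result = {}
    for key, value in pairs:
        result.setdefault(key, value)
    return result


# One string, two potential readings.
message = '{"pay_to": "Larry", "pay_to": "Tim"}'

reviewer_view = json.loads(message, object_pairs_hook=first_wins)
payer_view = json.loads(message)  # Python's json keeps the last duplicate
```

The reviewer approves a payment to Larry; the payer, reading the identical bytes, sends the money to Tim. Neither parser reported anything unusual.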

JT: We are committed to non-silent recovery

RB: Exactly what that means is up for discussion and implementation choice

HST: It's precisely those honest additions to your assurances that make us worried . . .

NM: Isn't this going to make the sniffing of text/plain as application/xml have dire consequences?

RB: That isn't in scope for the XML ER CG in my opinion, because what causes the UA to treat something as XML is prior
... The sniffing stuff is someone else's problem

LM: The sniffing document was originally in the HTML WG
... It was moved to the IETF Web Security group
... Where some members raised doubts
... I'm not involved in the document
... It expired at the IETF
... The WebApps packaging draft makes normative reference to the expired draft
... The HTML5 draft has a normative reference to the expired draft
... One of the issues raised against the document was to never sniff to PDF, the original editor declined to make any change
... No examples have been forthcoming

RB: The opposite case does arise, that is, correctly labelled application/pdf docs being sniffed as something else, particularly short ones

LM: My suggestion wrt sniffing was that any document whose media type was determined by sniffing to be different from its published type should get a different/unique origin
... We have an abandoned document that a) is normatively referenced; b) creates a problem wrt XML and error recovery; c) contradicts the Authoritative Metadata finding
... We should do something, particularly about the XML case
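
[Editorial sketch, not part of the discussion: LM's suggestion expressed as a function. The name `effective_origin`, the `opaque-` prefix and the plain string comparison of media types (no parameters such as charset) are assumptions for illustration; no specification defines this behaviour.]

```python
import uuid


def effective_origin(page_origin, declared_type, sniffed_type):
    """Keep the normal origin only if sniffing agrees with the published type."""
    if sniffed_type.strip().lower() == declared_type.strip().lower():
        return page_origin
    # The receiver overrode the server's label: isolate the content in a
    # fresh origin that matches nothing else, including other sniffed documents.
    return "opaque-" + uuid.uuid4().hex
```

Under this rule sniffed content could still be displayed, but it would lose access to cookies and storage belonging to the site that served it.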

JAR: If the XML ER CG doesn't say anything about sniffing, the TAG will have to. . .

NM: Sniffing XML as non-XML is clearly not relevant to the XML ER CG, but they can say "This algorithm is not robust / appropriate / safe when applied to non-XML sniffed as XML, don't do that"

NM: Please reread authoritative metadata since it clearly talks about security holes

NM: People know the arguments against sniffing, they just think *their* considerations are more important

[scribe notes that discussion continued past the end of scheduled meeting closure]

<timbl> Of course the semicolon-adding Javascript behaviour of js parsers is a possible security hole, bug etc too

RB: I'm really concerned that the sniffing spec is dead

LM: I tried to get that actioned w/o success

<jrees> yes, applying the pressure early in the development chain is best, but if a problem gets past all intermediaries, then the final consumer needs to suffer a little, so that there is some selective pressure

<darobin> ACTION: Robin to try to find who is in charge of the current browser content sniffing clustermess, and see if there is a way of moving out of the quagmire - due 2012-05-01 [recorded in http://www.w3.org/2001/tag/2012/04/03-minutes.html#action03]

<trackbot> Created ACTION-686 - try to find who is in charge of the current browser content sniffing clustermess, and see if there is a way of moving out of the quagmire [on Robin Berjon - due 2012-05-01].

<darobin> mmm, so long as automatic semicolon insertion is well-defined (and it is) I think it's security-safe

<darobin> whether it's good programming style is another issue, of course

<jrees> no.

<jrees> speciation in xml would be bad

<scribe> ACTION: Noah to look for opportunities to discuss putting forward something to the AB about the Process and the failed reference from RECs to expired RFCs as a side-effect of scope creep etc. [recorded in http://www.w3.org/2001/tag/2012/04/03-minutes.html#action04]

<trackbot> Created ACTION-687 - Look for opportunities to discuss putting forward something to the AB about the Process and the failed reference from RECs to expired RFCs as a side-effect of scope creep etc. [on Noah Mendelsohn - due 2012-04-10].

ACTION-687 due 2012-04-04

<trackbot> ACTION-687 Look for opportunities to discuss putting forward something to the AB about the Process and the failed reference from RECs to expired RFCs as a side-effect of scope creep etc. due date now 2012-04-04

<darobin> Jim Fuller has done some excellent work on generation of XML language validators based on XSLT/XQuery genetic algorithms, so I think speciation in XML may not be so bad ;-)
