Technical Architecture Group F2F -- 08 Jun 2010

[agenda discussion]

Content-type Override (sniffing)

action-370?

<trackbot> ACTION-370 -- Henry S. Thompson to hST to send a revised-as-amended version of http://lists.w3.org/Archives/Public/www-tag/2009Dec/0068.html to the HTTP bis list on behalf of the TAG -- due 2010-05-17 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/370

ht: Yves is one of the editors of the spec. He has suggested putting our requested text into another section (of HTTP bis).

Yves: We have discussed with Mark. He has suggested [and it was the consensus in HTTP bis] that it should be in the security section.

ht: I think it's the best compromise.

<Yves> http://lists.w3.org/Archives/Public/ietf-http-wg/2010JanMar/0327.html

<noah> We are reviewing the above email, which has suggested amendments to the TAG's proposed HTTPbis text on sniffing.

ht: [comparing yves's version to henry's version]

ht: this is better than the status quo.
... the thing that has been taken out is the idea that you shouldn't do any kind of media type change unless you have evidence to support the belief that the media type is mistaken.

noah: [points to "risks of sniffing" proposed text in http://lists.w3.org/Archives/Public/ietf-http-wg/2010AprJun/0171.html]
... text "type of the actual document" doesn't [scan]. If you have an XML document, it is also text/plain in some sense but if you server it as XML then you are signalling that [the receiver] should process as XML.

ht: I feel that 7.3 [Media Type Issue] in Yves's message is a reasonable compromise.
... My feeling is that - I would be happy if we closed this issue with the following email to the [http bis] mailing list - "thank you for your response to our suggestion. The prose offered by Yves would be a significant improvement. Further changes suggested by Mark Nottingham would be [fine] but we're not worried.'

noah: I'm not so convinced.
... I'm not happy with the "may not be that of the actual document" text. Also disinclined to close the issue.

ht: how about a response more like : "thank you for re-opening this issue. We think the version suggested by Yves in his message of 24 March would adequately address our concerns. We're less comfortable with the suggested addition in Mark's message of June XX because it confuses..."
... I'm not happy that it goes in the security section because the core question is that it overrides user intent but that's a cost I'm willing to accept for getting the text in the document at all.

Noah: this word - "identify" - bothers me. I don't think the content-type header identifies.

ht: Suggest another word.

Noah: I think it's about the sender's intent. If I send you something as [e.g.] PDF, I don't want you to interpret it as text/plain.

[some wordsmithing]

noah: ...correctly convey the intended [type] of the content...

Yves: it's worth clarifying the text.

ht: "does not reflect the intended interpretation of the content"

<noah> Sounds good to me.

Noah: ok to seek consensus on this?

Tim: What the TAG often ends up doing is coming up with phrases like this that [can only be interpreted] with a background in Web Arch.
... I think the new text is fine.

jar: I wonder what it would be in IETF-jargon?
... the core idea in IETF is the stack - e.g. the server automatically provides a content type which is not appropriate at the application level.

Noah: we could say "we don't think it's appropriate to say that the content type identifies the content" and suggest an alternative.

ht: quotes chapter and verse

noah: anyone unhappy?
... i think the right concerns have been laid out...
... I propose to put down 370... and look at other actions related to sniffing.
... Larry has written an interesting message in relation to ACTION-425. We could go over it...

ACTION-425?

<trackbot> ACTION-425 -- Larry Masinter to draft updated MIME finding(s), with help from DanA, based on www-tag discussion -- due 2010-05-31 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/425

Noah: Shall we do ACTION-387 first?

ht: should we now bring that message up and review it?

ACTION-387?

<trackbot> ACTION-387 -- Henry S. Thompson to review JK/NM's stuff on sniffing, authoritative metadata, self-describing web, incl. http://lists.w3.org/Archives/Public/www-tag/2010Jan/0025.html -- due 2010-06-07 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/387

noah: The real context for this - we had a set of discussions. John Kemp suggested some updates. I had some concerns... We could leave this to this afternoon when John will join us.

ACTION-425?

<trackbot> ACTION-425 -- Larry Masinter to draft updated MIME finding(s), with help from DanA, based on www-tag discussion -- due 2010-05-31 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/425

http://masinter.blogspot.com/2010/06/mime-and-web.html

<noah> http://lists.w3.org/Archives/Public/www-tag/2010May/0063.html

<noah> Mail 0063 is a draft from Larry that we will now review.

<noah> Yves says http://masinter.blogspot.com/2010/06/mime-and-web.html is the same text, with better format. We'll review that.

Noah: Brings up Larry's blog entry.

[group reviewing]

tim: I really like this article.
... and I like the fact that he's passionate about cleaning things up and has included a list of to-dos. To this is a good role model for future TAG things.
... now the philosophical rant: there has been some misunderstanding in semantic web contexts that the semantics of a fragment identifier depend on mime type.
... we've been telling people to use a URI name but it also depends on the mime type...
... so that was the confusion. Both things are true - "this [the fragment identifier] is a name" - and when you dereference it the MIME type [needs to be consistent].
... in the RDF spec says "if you want to know what the triples are in this document - this is how you parse them into a set of triples." it doesn't say "some of these URIs may include fragment identifiers."

jar: You're right that it doesn't say that but the RDF/XML media type registration does say that.

<Yves> http://www.ietf.org/rfc/rfc3870.txt

jar: you run into trouble if you're using another type - e.g. RDFa. So the HTML doc you get back from the FOAF namespace URI, it avoids the dual use...

tim: There are two designs and an "un-design"
... design 2: you can have the same ID being used in a way so that the human reader when they look at an abstract property they are directed to a written description.
... design 3: "we know what we mean..."

ht: second tim's description on 2nd position. There's a sentence on RFD-3986 which says "there's content negotiation and this notion that the media type defines the semantics of fragment identifies" concluding : if you use fragment identifies in context of content negotiation it's your job to see to it that the semantics of the fragment IDs are not incompatible with each other.

<noah> http://tools.ietf.org/html/rfc3023#page-16

noah: I see [a potential problem] in the RDF / XML space.
... this is the spec that defines the "+xml" convention.

<ht> http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-03.html#naming

[EdNote: We were looking at the above, at least some of us some of the time, but the most recent version is http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-04.html. It does not appear to differ in the areas under discussion.]

noah: +xml convention spec [specifically calls out] fragment identification. If you look at that section [in 3023 itself] it says "not defined yet." So there is a potential train wreck coming up.

ht: You're right - this is the definitive statement - but nobody pays attention to it.

tim: What's the solution?

ht: It's time to look at this again - it's opportune.

<Yves> http://tools.ietf.org/html/rfc3870#section-3

Yves: Pointed to fragment identifier section of RDF media type registation...

Noah: This doesn't resolve anything.

ht: the particular question on 3023 BIS is - it does not give any advice.

noah: the problem is that the +xml stuff is tricky.
... there is a published rfc-3023 - to which http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-03.html#naming is a proposed revision...
... I would remove "fragment identification" as something that can be [automatically] identified...

ht: I think - if you write in a media type registration document e.g. for SCD+XML - if we define in the W3C spec that defines SCD, "fragment identifier semantics is thus and overrides XML semantics"... then that's [OK].
... you're right - the clear implication of http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-03.html#naming as it stands is that if you use the +xml media type you are licensing all the interpretations of 3023bis.
... the RDF serialisation can do what it does because 3023 is under-specified. Once this new version comes out, we've got a real conflict.

noah: I think the spirit is very similar.

ht: but the letter is different.
... quotes: "if [xpointerfranework] and [xpointerelement] are inappropriate for some XM-based media types it should not follow the naming convention +xml"

tim: two options: A) remove stuff about frag id, leave rdf+xml; B) leave frag ID, remove +xml from rdf

ht: I had in mind a 3rd way: unlike the other properties, fragment identifiers can be overridden in a + spec.
... that would still require changing the spec.

noah: [seeking clarification on (A)]: "remove stuff about frag id as part of generic processing"

tim: Yes.

noah: I may have written software that doesn't recognize RDF but does know the +xml convention. Now that won't work for RDF+XML [responding to Henry's 3rd way].

tim: [names Henry's proposal: "grandfather"]

ht: fragment identifier interpretation happens...

tim: In your FOAF file you can use an XSLT -

jar: If you serve it as rdf+xml it won't work in a browser - you have to serve it as .../xml.

noah: what are our options?

jar: 4th option: figure out by context what is identified.

yves: pragmatically that's what implementations will do.
... [gives an example involving video]

tim: That should be in the spec.
... We could say "the video tag must only be used for pointing to a video and all video mimetypes must use media fragments mime types."
... if you happen to have scalable video graphics or animated SVG as a video then the mime types must support media fragments.
... The TAG should change the architecture document to add that text.

noah: I'm OK with that.

yves: [on using the frag id for specifying a part of a video] if the time-range is unknown for the media type then it is ignored.

ht: whatever happened to vertical bar? | server-side fragments?
... real question: is this what you said, Jonathan, for the 'pragma' option: "it's OK for generic processors to process generically and 'RDF' processors to process specially."

jar: [sort of]
... so the question is: suppose through con-neg you get 2 completely different documents? Then you'd say "that URI is not a good URI because we don't know what it's designating."
... this is the HTTP semantics problem.

ht: if you give an XInclude processor an XML document to process and it finds xi:include elements with href elements, it will interpret the result as an XML document - unless you say not to.
... whereas an RDF processor goes the other direction.

noah: I'm more worried about people who run generic tools on documents that are lying around ...

tim: xinclude - can I say "xinclude something#foo" ?

ht: yes: [No, corrected later -- XInclude forbids fragids in its 'href' attrs, and provides a separate way of designating fragments]

tim: Then this does break.
... if you use that to refer to the element tree, in an RDF document it doesn't refer to an element tree it refers [to a concept].

Photo of whiteboard with options:

options for dealing with generic processing of media types conflict

noah: [calls a break] we will pick this up again tomorrow. I propose what we do is come back to ACTION-425 when Larry can be with us. ACTION-370 we come back to tomorrow. ACTION-387 we come to this afternoon.

[back from break]

http semantics

jar: [presents slides at http://www.w3.org/2001/tag/2010/06/http-semantics.pdf]
... problems that I see in these discussions on "what do URIs mean / name" etc...
... these are the sorts of questions that I'm trying to model or address.
... didn't get as far as I wanted to get.
... not trying to "define meaning."
... trying to dispel fear and uncertainty around providing metadata for resources on the Web.
... needs to be grounded [in operational interpretation] but not too soon - you do it formally so what you do has broad applicability.
... [presents "http semantics landscapes: exchanges" diagram from slides]
... this is not a class diagram.
... it's a template for diagrams involving instances.
... the point is: you can do an ontological account - a scientific account - looking at what really happens in an HTTP exchange.
... [introducing "exchange variability" diagram] if you want to look at patterns -

jar: you can have 2 different request messages that contain the same URI. They can lead to 3 different response messages... This leads to a cascade of events. Eventually you'll have a response. The outcome of these 3 exchanges are these 3 different representations.
... [e.g. 2 are in PDF and 1 is in HTML]
... there are some properties of the representation which are common and some which differ.
... talking about "properties of exchanges."
... subclasses of exchanges: defined by restriction or by inclusion.
... in this modelling, we have some basic axioms already emerging - e.g. "every exchange has a send request."
... which can be expressed in OWL...
... even with this minimal theory you can do something interesting - e.g. dated URIs used at W3C - a resource with an interesting property in that it only has one representation. It is possible to model meaning to an extent.

jar: you could talk about the class of exchanges with this URI - class of exchanges with this URI possesses this invariant.
... so classes of exchanges talking about invariance on a representation, invariance on the exchange, ...
... you can also talk about the relation of these exchanges to the world.
... there are ways you can specify classes not just on restrictions on individual members - you can say "these particular members" or "has been accessed twice" ...
... [in response to question by Ashok] the origin is specified as a string and a port number... those would be properties of the exchange. If you wanted you could have rule - if the exchange was on this origin then do this... a restriction.

Ashok: It would make this more useful [to add rules]...

jar: you need a good foundation first...
... [back to slides - "subclasses of Exchange" slide 2]
... some subclasses: empirical / promise / API / constraint

tim: here we're talking about network protocols and meaning.
... there are social protocols - e.g. when somebody lies - in order to design something in the real world you need to consider all the possible things an "evil" person could do. Some of these cases you don't have to worry about.
... we need to distinguish between modelling how things are supposed to work and what happens when they don't work.

jar: this gives a start...
...

tim: what's interesting to me - what the TAG says is a meta-level promise. If "people behave" the TAG will promise that the server will never complain that people have accused them of saying something they didn't say...

[brief toilet-roll discussion]

jar: [back, thankfully, to the slides]
... you can say : all the exchanges I've observed are members of the following class...

ht: is the following claim a claim you want to make: 2 exchanges that differ in ways which are not captured by these classes are actually not different in any interesting way?

jar: is that a definition of interesting or a consequence of interesting?

ht: definition, I guess.

tim: [gives example of how ethernet cable gives a promise of containing ethernet packets]

jar: hypotheses - take e.g. dublin core metadata - if you make an assertion that something has this dc:title, etc... there are different ways to interpret that but one way to interpret is that the representation will satisfy those dublin core properties.

yves: when you have a promise - that somebody made the assumption that something will happen - there is a link to time - during this time X the promise might change...

jar: promises might go bad.
... the time at which it happened is a property of an exchange.
... last subclass of exchange: induced.
... if you're a librarian or a scientist, what you're interested in talking about is content, data, measurements, ...
... e.g. a telescope is an example of an "induced" resource.
... [more on induced exchange classes]
... maybe for each kind of thing that you want to allow 200 responses for you state [what happens].

tim: this is why it's important to model the semantic web stuff.

jar: the question of "what is an information resource" is a rat-hole. The interesting question is what classes of exchange are appropriate to this information resource? And if the answer is "none" then [maybe?] it's not an information resource.

ht: what kinds of differences between representations would be allowed for representations of the same resource?

jar: suppose i want a class of "news sites"? I can do that. but if I were to do that with something that today provides news but tomorrow is frozen in time then tomorrow it wouldn't be news.

yves: the news site is a good example. If you do a get on newssite.com/latestnews then you get back something with a timestamp...

ht: brian smith wrote "semantics of clocks" article - [jar] said you could say "an internet clock is a good internet clock if the time represented by the relevant string in the response is within some epsilon of the time of the response." you could say something parallel about news sites...

DKA: Can you define art?

<ht> Scots law requires that responses to offers of oral contracts must be "timeous" for the contract to be binding

jar: I wanted to touch on validators as a possible application. Web architecture validators. A Link http header could be way to communicate promises. Even if you're the one making the promise you might want to self-monitor with a validator.
... [digression] a promise is a speech act and it has parts.
... we now have 10 active metadata ontologies - e.g. "on this web page there is a mention of this gene" if they use the URI as the subject of that statement then they are worried it might change, the con-neg might go wrong, etc...
... their fear is that they won't be understood.
... what is acceptable to this audience now is things like web citation - "from this URI I got the following stuff or some stuff with these properties at this date..."
... the link header might give some assurance - if it says "this is stable content" - they would have more assurance to be able to use that URI as a subject.

tim: so "persistence commons"
... w3c has a draft persistence policy...

yves: they don't want to use the URI because it might change...

ht: there isn't a transparent way to write a URI which is recognisably a reference to a published journal article.

jar: you can say "info:doi/" - [with some caveats]

ht: [+1 to the idea of a validator]

noah: wrap for lunch?

<jar>[A location detection service: http://npdoty.name/location/]

[Resume after lunch]

Client-side storage

AM: http://www.w3.org/2001/tag/2010/05/WebApps.html

[AM speaks to the doc't, not scribed]

<noah> NM: Is this an API spec?

<noah> http://www.w3.org/TR/2009/WD-webstorage-20091222/

AM: Yes

NM: In this design, the javascript gets the data and can do what it wants, but the browser doesn't do anything automatically (c.f. cookies)

AM: Yes
... Name-value pairs

JR: 1970s-era file system

AM: Maybe name-value pairs plus indices

NM: What is index good for that names can't do?

AM: More like a database

Google Gears was there

scribe: with SQL-like access
... but not typed
... WebSQL was based on [a particular] product

<DKA> Web SQL: http://dev.w3.org/html5/webdatabase/

AM: Hasn't progressed because there is only one (proprietary) implementation
... But it looks like the indexed API is going ahead
... More efficient than simple name-value

<DKA> The note from the Web SQL spec: This specification has reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path. Until another implementor is interested in implementing this spec, the description of the SQL dialect has been left as simply a reference to Sqlite, which isn't acceptable for a standard. Should you be an implementor interested in imp

<DKA> ... contact the editor so that he can write a specification for the dialect, thus allowing this specification to move forward."

JR: You don't need CORS . . .

[scribe lost]

NM: App has javascript from A, tries to get stockquote from B, not possible w/o CORS
... Same thing applies once we have local storage -- JS from A wants stored information originating from B

TBL: Suppose we had a consistent access-control ontology
... with people, and sites, and origins
... and reconstruct something similar to CORS, but more general

<DKA> http://dev.w3.org/html5/webstorage/#privacy seems problematic to me - full of MAY clauses...

<timbl> yes.

<timbl> "User agents may, if so configured by the user, automatically delete stored data after a period of time." or configure by the (eg) bank.

AM: Agreed

NM: How do we follow up on this?

AM: It's possible that the 'may' is opening the door for something like CORS

<timbl> "User agents may allow sites to access session storage areas in an unrestricted manner, but require the user to authorize access to local storage areas." -- that is the user expressing trust in a set of code - that is equivalent to software instalation

DKA: [Eudora example]

TBL: You have a local IMAP cache
... some clients expose the cache in IMAP format

NM: what about a better GMail searcher, which would need special permissions to get at the GMail cache

TBL: Facebook is using web storage, and LinkedIn offers better functionality if you allow it in

DKA: I want to build that using OAuth

NM: But it won't work offline

DKA: Yes it will, it will keep working on my behalf [between the two, w/o my participation]

TBL: That's going all the way down the road to trusting big services
... I'd like to be able to run my IMAP client offline

DKA: But now a polluted website can inject a script which can harvest my email

TBL: There's no such thing as a little bit trusting

NM: Distribution is a complex issue
... This whole thing also breaks down because the storage stuff isn't integrated into our access model
... Compare this to the way caches work
... where we do have names
... The web storage model has no names in that sense, and no RESTfulness
... So the existing access control mechanisms may well not generalize

DKA: Always assumed this was just an optimization

JR: These things always start small and market forces drive towards more functionality

<timbl> Possible recommenders of trust in scrupts: 1) The user 2) The writer of another script (which created some data) 3) a third party registry like the addons.mozilla.org or iTunes AppStore

HST: I had the experience of using earlier versions of some of this
... and it was all same-origin protected -- every access had a URI-prefix which was SO checked and which constrained subsequent access

HST: Looking briefly at the WD-webstorage-20091222, that appears to be still true
... The session store vs. local store difference matters
... Not clear exactly what patterns of access are allowed . . .

YL: This is all similar to what happens wrt information in local HTTP cache -- the browsers know where data comes from, even if it comes from cache

<timbl> Yves discusses provenance tracking at the javascript language level

YL: So you could require annotation of data with its origin

JR: But this will still be vulnerable to CSRF

TBL: A lot of semweb are going in the direction of annotation of all triples with provenence info -- triples are becoming quads
... And feeding into reason maintenance systems so conclusions can be quickly withdrawn if premises are compromised

ACTION Ashok to comment to web storage guys: basically all of this is origin-based, but section 6.1 has a 'may' -- is this a door being held open for CORS?

<trackbot> Created ACTION-438 - Comment to web storage guys: basically all of this is origin-based, but section 6.1 has a 'may' -- is this a door being held open for CORS? [on Ashok Malhotra - due 2010-06-15].

NM: Concerned we need a plan of how to feed this into our overall WebApps story

AM: Work towards some bullet points?

TBL: Propose "Storage structure and access control should be considered separately"

<Zakim> timbl, you wanted to ask JAR what easier way of doing this was

TBL: JR, how do we do this?

JR: How do we avoid confused deputy attacked? Where a deputy gets too much power, and does more than it was intended to be allowed to do

<noah> The chair infers that the group would like to continue technical discussion.

JR: The way to do this is to pull the authorization as far forward as possible, by recording who/what/when is authorized
... instead of saying there was a point at which they were authorized to do everything

TBL: You have to have a way of specifying what is allowed -- is that code?

JR: Could be

TBL: Two ways -- characteristic function, or name, for what they are going to do, in order to authorize up front

JR: You have a description of what they want to do, it's to call some function with some args
... The pblm with CORS is that it just controls access to particular API, as it were
... Whereas e.g. with Kerberos the authorization act includes what it is you can do with the authorization
... Whereas with a login-based mechanism, you get authorization to "be" a particular user, and all kinds of things come from that

[scribe missed a bit]

<timbl> jar: This (login is much too high granularity

JR: In Javascript this is easy, the auth. decision is an object, which encapsulates the capability to do the action . . .

JR: Adam Barth says this isn't yet viable, because it depends on secure ECMAScript, but there's a lot of ordinary Javascript still around

AM: Who are we aiming this at?

NM: As with the Web Arch document, we should target not the authors of HTTP, but the smart user population looking for guidance as to how to use it... So we should describe a stable architecture which shows how the specs fit together (which may as a by-product help the authors of the specs that are still under construction)

<jar> Adam has done some cool work making origin-based authorization decisions work as well as they can in javascript

AM: I had started with the understanding that there were two goals: session storage and persistent storage. Now I see we need to add a third: [controlled] cross-site access.

JR: What's needed is a narrative, not a design

DKA: Possible kinds of actions: We could try to intervene in the progress of the Web Storage spec. What kind of proposal could we point them towards?

NM: Implementations shipped already, or not?

AM: Not even in last call

NM: No, I meant the code

TBL: I'd be surprised if it was baked

HST: But Mozilla shipped with something close to this at least a year ago

DKA: I will try to find out

JR: Adam Barth is trying to write an RFC on how to use cookies securely

<jar> http://www.eros-os.org/pipermail/cap-talk/2010-February/013753.html

NM: We could say something from the TAG on client-side storage which didn't get into the deep security pblms

<johnk> hi, I'm on the call now

<noah> Ashok will prepare input to discuss on the telcon on the 17th (if we have one)

<noah> ***BREAK***

Web Applications: Security

NM: Goal is to have a better idea of whether there's serious writing to be done
... what next steps if not writing

AM: If you want sites to cooperate, is CORS/UMP/OAuth/What the way to do it?

NM: What is your opinion, if you have one --- let's go around the room seeing what people think as of now.

AM: I don't know yet

JR: Hard to answer, because I've been in the capability camp for so long
... So if I were designing an app, I would take whatever mechanism I could find that would let me layer capabilities on top

NM: But the web community is going to make a call. . .

JR: I don't think CORS will scale

NM: What about UMP?

JR: Yes, I think it has the right properties -- from an engineering perspective

<johnk> +1 to Jonathan

<johnk> agree that UMP is better from architectural perspective

JR: I see you could need some more structure on top, to match web needs . . . there would be a certain amount of re-invention of the wheel for the next level, for instance wrt UI

AM: I still need to think about it, read the specs again

DKA: I'm more familiar with the CORS approach, I also know OAuth, which I think is sometimes the right response
... It's a well-used bit of web arch.
... I think the widgets digital signature stuff could also be useful
... Similar to CORS, but based on signatures, not origin

NM: How is this an access-control mechanism?

<DKA> http://www.w3.org/TR/widgets-access/

DKA: Here's the spec:

The doc't is specific to widgets, but it could be extended to docs as well

ACTION Appelquist to introduce signature-based approaches to access control

<trackbot> Created ACTION-439 - Introduce signature-based approaches to access control [on Daniel Appelquist - due 2010-06-15].

YL: I don't think CORS is the ultimate solution to WebApp security -- UMP is much simpler. People understand Same Origin, CORS is hard to make sense of, where UMP is simpler
... It provides what people mostly need, which is cooperation between two sites
... Much better way in for devs who are not seriously engaged with security issues

HST: So, is Kerberos a capability-based system?

JR: Like capability, in that you get an unguessable token, but it's still a capability just to be a particular person
... But it illustrates that the outcome of login can be an object

<johnk> OAuth is the new Kerberos, roughly-speaking

<johnk> SAML was the new Kerberos, before OAuth

HST: So I get that's it's not a clean distinction. . .

<noah> Possibly pertinent Kerberos RFC http://www.ietf.org/rfc/rfc1510.txt

<noah> Wikipedia simplified description of Kerberos protocol: http://simple.wikipedia.org/wiki/Kerberos_%28protocol%29#Simplified_description_of_the_protocol

HST:I've recently started using the MarkLogic XQuery + XML datastore system. It uses a very fine-grained inventory of permissions, down to the individual function level, as well as distinguishing read, add and change permissions on the datastore, but it still assigns sets of those to user groups, and so depends on user authentication == ambient authority

HST: Having said all that, +1 to finding, articulating and selling a capability-based story

NM: Not sure I have the grounding to make an informed judgement
... The concerns I hear about CORS come from impressive sources, and I think I hear good things about UMP, as far as it goes
... In particular, I don't see how UMP extends to cookie functionality, which seems to be what people really need
... WRT capabilities, my early experience was that it's easy to build systems that are architecturally pure, but that don't scale well to the necessary range of real-world scenarios. So I will need to be convinced in practice we can make it work

<noah> Not quite what I said: capability systems are very cool and can work well, but they tend to be somewhat difficult to implement and roll out. So, I want to see something that the community will accept as practical. Maybe or maybe not something like WebKeys signals a direction. I just don't want to gamble on some unseen capability system working until we've seen early versions getting traction on the Web.

AM: JR pointed out that Javascript does not have encapsulation -- without that, how can you do capabilities

JR: By adding features to the language -- that's what SES (Secure ECMAScript) does

JR: UMP is capabililties at the protocol level, that's not the same as capabilities in the programming environment, we need both

<jar> Too many features -> too much access -> loss of encapsulation. Javascript. Remove features -> can't access stuff you shouldn't be able to get -> recovery of encapsulation. Secure Javascript.

JK: Based on experience and reading the specs, I think the origin-based approach we have to security today needs to be fixed, CORS is not going to do it, and UMP looks like it will
... But I also note that the in-practice approach people have taken to protecting against CSRF and click-jacking amounts to something very close to capabilities in practice

JK: Fixing origin-based by hacking pseudo-capabilities on top of other hacks seems doomed, something cleaner from the ground up should be preferred

<noah> John's framing resonates for me: a key question is whether the permission token is integrated with or separate from the identifier. The latter approach is, more or less capabilities.

TBL: Is this a research-grade problem, or is it really ready? The extra stuff you, JR, said was needed?

JR: There is Waterken, which is built on top of UMP using web keys, which is a commercial product from Tyler Close

<jar> http://waterken.sourceforge.net/

JR: Note that UMP does not imply Web Key

<Zakim> jar, you wanted to briefly note that use of UMP does not require use of unguessable secret URIs. you can put the CSRF token in a parameter i.e. elsewhere in the request e.g. in

JR: Using UMP for CSRF protection is isomorphic to what people are doing today, for example using hidden POST parameters
... containing a nonce

<johnk> yes exactly

JR: UMP extends this existing story wrt e.g. click-jacking, w/o any javascript, into general javascript usage

ht:I have a PhD studnet, Alex Milowski, who has been looking at capability-based security for distributed computation with semi-trusted actors. A simplified example is the 'The triangle architecture'

Think of SETI at home... hierarchical decomposition

suppose a URI names a part of a decomposition of a probem

Diagram description: Henry shows a central coordination node sending http interactions to a farm of worker nodes.

ht: but the GET has to be done asynchronously, the results may take days or weeks to compute.

<noah> He described it as peer-to-peer, but the diagram is a one-level tree.

ht: triangle architecture: I will give URI to each work for use in delivering answer, as well as the question
... so I have an S3 account... I want you to do a computation... and put the answer there (on my nickel)

OAUTH doesn't work here. Too much authority granted

timbl: OAUTH is designed to do that

noah: SAML too?

ht: we're just starting to understand the options here

<noah> SAML = Security assertions markup language

ht: so we want a capability to store the result. time, location and scale limited.

<noah> Not sure, but XACML may be pertinent too (see: http://www.oasis-open.org/committees/xacml/ )

<noah> Wikipedia on XACML http://en.wikipedia.org/wiki/XACML

ht: We know S3 doesn't have this ability today, but we can build a gateway to S3 that does...
... Capability seems the simplest way to talk about this.

noah: I'd recommend you look at ACL-based approaches

AM: I don't thing XACML is capability-like
... I think it's about access control

NM: I think it's a bit more fine-grained. . .

DKA: What's also relevant is that we have browser and server as actors
... which makes me look to OAuth, rather than XACML which would set a very high bar

<johnk> XACML is a language for representing access control policies

<johnk> SAML is a language for making assertions about a "security principal"

NM: I find the Web Services approach awkward, but it is driven by use cases similar to the one just offered
... such as long response times

JR: We're not in an either-or situation altogether -- based on time-scale, for example, capabilities are designed for time-bounded, seconds, days or weeks, whereas ACLs are more 'permanent'

TBL: Capabilities have completely different properties depending on whether they can be copied or faked

JR: If you're cooperating over a secure connection, no third party can get the capability

HST: But you don't necessarily trust the person you send the capability to
... E.g. in the triangle architecture

TBL: You may assume everyone is reliably identified, so you can grant a capability only to me

JR: Signing things is only necessary if they can be interpreted

<jar> HT's example is interesting because it's ok if the capability is leaked. doesn't protect anything that matters much

<Zakim> noah, you wanted to talk about tokens integrated with or not integrated with the identifier

TBL: MiM could get the capability and immediately use it

HST: Not if the real actor's public key was in the capability request

JR: Or if the connection is encrypted

NM: JK said something really interesting: comparing a token in a POST header with a capability expressed in a URI
... but slightly different, in that the coupling between the identifier and the capability is different, not by much, but different
... which reminded me of my concerns about web key
... Since a link is a capability, I have to be careful to send my URIs carefully, i.e. using HTTPS
... But we've also added a post facto requirement for people to protect their caches, logs, ....
... Maybe there's existing advice which forestalls this, but it runs against part of our story about making the Web work: share links, etc.
... Makes me nervous to be sure we've spotted all the places that have to be secured to guarantee that URI-embedded capabilities can't leak

JR: Just remember that UMP isn't committed to web key

NM: yes, understood

JR: There's always a credential -- it's got to be communicated, and it has to be protected
... currently, it's distributed among cookies, ssl keys, IP addresses
... It's always in the response
... And that's not going to change

NM: Yes, but my comment was that putting the secret in URIs entrains a huge history of what browsers and other tools do with URIs
... and that history is very different in impact from the parallel history wrt e.g. cookies

JR: It's a question of what Javascript you write.

NM: I thought there was a crucial difference between UMP and XMLHttpRequest wrt cookies -- XMLHttpRequest automatically attaches my cookies for site X to any request to site X
... UMP doesn't

HT: If I do use URI-encoded capabilities, and do what I described in my example, with a GET over HTTPs, but my Apache server logs every URI, the log file, albeit restricted to root, contains the capability and goes to all my sysadmins. I don't want that.

JAR: Yes, if not used with something else like a login.

<noah> http://www.w3.org/2001/tag/tag-weekly#WebAppsSec

HST: So NM's point is simply that expectations about for example privacy vary wrt existing bits of Web Arch, e.g. URI vs. header

JR: Right. Nothing to do with UMP?

NM: Right.
... JK, what about ACTION-340?

JK: I think that's done

<noah> ACTION-340?

<trackbot> ACTION-340 -- John Kemp to summarize recent discussion around XHR and UMP -- due 2010-06-04 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/340

<noah> close ACTION-340

<trackbot> ACTION-340 summarize recent discussion around XHR and UMP closed

<noah> ACTION-417 Due 2010-07-13

<trackbot> ACTION-417 Frame section 7, security due date now 2010-07-13

<noah> ACTION-435?

<trackbot> ACTION-435 -- Jonathan Rees to consult Tyler Close regarding UMP-informed web storage vulnerability analysis -- due 2010-06-15 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/435

<noah> ACTION-435 has been bumped to 15 June

<noah> ACTION-280?

<trackbot> ACTION-280 -- John Kemp to (with John K) to enumerate some CSRF scenarios discussed in Jun in Cambridge -- due 2010-07-06 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/280

<noah> We just moved ACTION-280 from Dan to John, and made date in early July.

ACTION-33 due 2010-07-15

<trackbot> ACTION-33 revise naming challenges story in response to Dec 2008 F2F discussion due date now 2010-07-15

<noah> ACTION-344?

<trackbot> ACTION-344 -- Jonathan Rees to alert TAG chair when CORS and/or UMP goes to LC -- due 2010-06-10 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/344

<noah> JAR: I want text of action to record why I'm doing this. I think it's so we can do a review of their Last Call regarding security considerations.

<noah> close ACTION-412

<trackbot> ACTION-412 Try the clarification question, blog item, or wiki approach to metadata-in-uris vs CSRF closed

Technical Architecture Group Teleconference

08 Jun 2010

Attendees

Contents

Content-type Override (sniffing)

http semantics

Client-side storage

Web Applications: Security