W3C

TAG F2F

07 Jun 2010

Agenda

See also: IRC log

Attendees

Present
Jonathan_Rees, Noah_Mendelsohn, Ashok_Malhotra, Henry_Thompson, Daniel_Appelquist, TimBL, Yves_Lafon
Regrets
Larry, Masinter
Chair
Noah Mendelsohn
Scribe
Jonathan Rees, Yves

Contents


<jar> convening

<jar> scribe: Jonathan Rees

<jar> scribenick: jar

item: Convene, review agenda

Welcome Yves

F2F Main Goals:

Make progress on writing related to Web Application Architecture

XML / HTML integration

Other technical topics

noah: Question: what is the best form for our publications? recs, findings, blogs
... Let's get things written, push them out, then figure out their disposition (finding etc)
... Move domain name discussion from Tues to Mon

ht: My goal was to brainstorm with a few (non-TAG) people on whether a more substantial meeting for consensus building is worthwhile
... Kevin Ashley, new director of DCC (digital curation center)

<ht> Helen Hockx-Yu

ht: Helen is from British Library

(discussion of agenda, how to involve John K and others who have to call in)

<DKA> http://www.w3.org/2001/tag/2010/05/web-apps-notes.html

Web Applications: Overview

dka: (talking from projected page, follow above link)
... TAG could act as a 'lens' relating frontline work to broader community
... Robin Berjon has been helpful
... Look at how how 'web applications' are being used in the field
... Originally people used GET... maybe not always in the best way
... Look at patterns of web app usage, say here's how they use the artifacts of web arch. Maybe this would be useful
... relation of messaging protocols to webarch - e.g. in social networks

timbl: Messages vs. information space

dka: e.g. Websockets is making this connection; that's interesting

ht: +1 to drilling down enough to understand how XMPP addresses [this]

timbl: Pattern approach is good, say: If you use these protocols in this way you get this value.
... Applying this to web apps arch, there are many different areas, so [difficult]
... Taxonomies are ratholes...

ht: People are inventing new things faster than we can classify them

dka: People know REST, and ask what can the TAG add to the REST story?

timbl: REST used to mean web + Fielding... now there are new patterns that build on this (e.g. odata, JSON, SPARQL patterns)

noah: There's a level where we might say here's the architecture pattern, that connects client side & server side use of URIs
... Raman is giving interesting particular examples; [that's a second level]
... connect the 2 levels (architectural patterns & usage patterns)

dka: I'm influenced by what we're doing in mobile web best practices group. Look into applying some of that methodology. Here's a usage, evaluate it, does it lead to a good result?

timbl: often TAG work has been triggered by a very specific issue. Intense discussion, followed sometimes by saying it's OK.

dka: Where can we have an activist agenda?

noah: Our mandate is architecture... the web is an information space, much of the value comes from linking
... network effects

dka: The 'cool uris don't change' resonates with people, they know w3c says this

yves: Hard to link into a silo. Entertainment industry, for example, tends to make monolithic [insular] apps

ashok: Is it lack of education, or is it business reasons?

timbl: One concern about web apps is that they're too slow, this motivates things like app stores [native applications that could have been web apps]

noah: The native apps are less clunky, little things

timbl: We can [try to] clean up the web... we could push for very hot web apps...

ht: Many of these "you need specific browser X to look at this page" errors are false - the page works fine if you edit out the browser detection code

timbl: (writing on white board, 3rd item in list after 'very hot apps') Decent social networking (socially aware storage; P2P)

dka: I care about this last one, but I wouldn't put it under the topic of web apps

noah, timbl: "For more, see the facebook page"

timbl: The problem is that the garden center is committed to FB being their ISP

ht: But then it is on the web, right? [question preceded timbl: The problem...]

dka: re security, I've been focusing on DAP, GEO
... in widgets, there's the concept of a feature having a URI (e.g. the geolocation)
... this mirrors what app stores are trying to do. E.g. "this app wants to access you camera and location, ok?"

noah: There are 3 levels of apps, app store, drive-by, widgets (with install step)

dka: What constitutes a web app?

ht: we talked about this in Cambridge, with John K

timbl: negotiating the interchange of data between otherwise untrusted parties. An install is part of a trust system. The dominant web sites are ones that get trusted.
... different [orthogonal] from technical questions, such as shuffling data, caching

noah: Users think they know what it means to install something. At install points they expect these annoying questions

ashok: There can be problems if your situation changes (you don't want to allow use at a particular location, etc)

noah: Tension between install step & on-the-spot decision making

dka: Is there a middle way?
... People who are building widgets think of them as app containers, maybe [missed]

yves: You want a local storage of [privacy/security] properties that can be examined and changed

timbl: BT open zone
... there's an assumption that the domain is a principal ...

individuals (social entities) and orgs are principals

scribe: and then there are lumps of software - also treated as principals

timbl: Would be good if the TAG looked at this... maybe recommend keys... the user interface is the hard part

ht: It always comes back to the question of principals. These systems founder because the identity not being useful to you.
... That doesn't tell you whether you should accept content from that principal.

jar: Lampson emphasizes accountability

ashok: Certs are often meaningless

<ht> http://cacm.acm.org/magazines/2009/11/48419-usable-security-how-to-get-it/fulltext

noah: PKI, Verisign but also PGP, Lotus Notes, integrated UI... PKI is a low level building block

jar: Lampson article is cynical, interesting even if you don't agree

timbl: The CORS idea seems bizarre to me, isn't the web public anyhow, why should scripts be different from other agents? - but it's a reputation system

yves: Host X is fine for some things, but what if I don't like their tracking system?...

dka: Let's look at the list, think about priorities

Break for 15 minutes.

<DKA> FYI - art heist blog article in NYT I was referring to earlier: http://nyti.ms/d7qjaW [context: freedom of the Internet, also privacy]

Reconvening.

Agenda shuffling

HTML / XML Unification

<DKA> msg ht let me know if you think guests this afternoon will want internet access - if so, I need their email addresses if possible so I can get them passwords.

<DKA> hrm

noah: There've been repeated efforts on tag soup / html moving forward.
... Raman had nominated TimBL to convene some kind of group to tackle this problem

timbl: When Raman said the TAG was ineffective, I said what should be done?
... I've wanted to bring the XML and HTML forks back together
... Making a client has been made more difficult, with two stacks
... re mime types, Larry has said to fix it by fixing incentives ...
... we want a culture where it's considered good to clean things up ...
... maybe we'll try and fail
... So how can we clean this up.
... E.g. maybe fix the validator, so it would say "you could do such and such and then it would work with so and so client"
... To what extent do you move HTML toward XML, or XML toward HTML
... One end of the scale, non-nested close tags are silly, we can just say it's not OK
... On the other hand, default namespaces seem really convenient
... thus a range of attitudes depending on feature
... Possibility of a group with representation from both communities, people who understand issues deeply
... I've wanted to do this, but have felt there hasn't been backing. TAG support would be helpful

<Zakim> ht, you wanted to mention the widespread use of XML toolchains in web/document management environment

ht: Yes

<DKA> +1 to bringing together moderates rather than extremists.

<noah> I don't think moderates/extremists is the point. In fact, most of the people in the community I think have important perspectives. I think we need to bring together experts who understands the needs of the communities and the details of the technologies.

ht: I've become aware that there's a big community that has bought into XML toolchains.
... But this doesn't count with HTML5 community because they're not building web apps. Instead they're making documents

timbl: Example website?

ht: Many of the drivers for HTML are trying to make the browser the application delivery platform
... The member participation in XHTML2 was from info delivery bus, not app deliv bus (yes, this is a specious distinction)
... There's overlap but these are different communities
... We mustn't leave the XML community behind. They bought into XML, and they get no benefit from HTML5
... They're shipping XHTML. They're uncertain about what would happen if they moved to HTML5

<Zakim> DKA, you wanted to give a brief counter-example...

dka: When we launched VF5 portal over browser on phone, that was based on an XML toolchain
... news, weather, horoscope - info sources don't want to build their own mobile sites, so we say, give us XML (partner markup) and we'll take care of it

<ht> Here's an example of an all XML toolchain-produced website, managed using the Factonomy product I mentioned -- the site owners (almost) never see markup, but maintain and augment the site via form-based interfaces: http://www.scottishhumanrights.com/

dka: XSLT, WML, XHTML
... Now, the devices have grown up, the clients have more mature browsers, the info providers now have skills, so portal infrastructure may have a diminishing role
... Portal is evolving toward a directory

noah: compare AOL, Yahoo! ?

dka: Maybe this story gives a data point
... If it's easier for an intermediary to use HTML5, they're going to gravitate toward it

timbl: A little person will get drupal ...

<Zakim> ht, you wanted to present the above example

<timbl> .. and hope that drupal works well with mobile

<timbl> So the CMSs (open source and other) have to be\ fixed to work with mobile.

ht: SHR writes no documents, just adds, organizes. Skins and layouts chosen from menu. The toolchain pulls both structure and content out of database and renders it in XHTML
... javascript is used to make site look same in all browsers ... [grumble...]

<Zakim> noah, you wanted to noodle on JSON

noah: Why was XML to W3C? Because hope was that it was to be used generally on the web

x/XML to/XML brought to/

noah: Losing to JSON et al.

<ht> HST wonders how sure you are that XML is (still) losing to JSON -- I hear less about it than I used to, and more about ATOM

noah: the HTML5 app builders are not using XML ...

timbl: RSS, ATOM
... there's lots of XML on the web... Word ...
... issue is important to anyone who writes scripts, has to deal with two different DOMs

<timbl> SVG, DOCX

yves: JSON has implied datatypes; XML datatypes are complex. JSON easier, natural
... compound document is a hard problem, huge problem in XML/HTML integration

timbl: If people are moving data around, the direction to go is to standardize some way to map JSON to/from RDF

ht: parsing XML is faster than parsing JSON...

noah: transient effect

<timbl> ht, pointers please!!!!

ht: (that was just a footnote)

<ht> http://performance.survol.fr/2008/04/json/

<ht> http://blogs.nitobi.com/dave/2005/09/29/javascript-benchmarking-iv-json-revisited/

<noah> JAR: Somehow reminds me of the LISP vision: universal format for data (did I get that right?_

<noah> JAR: RDF somewhat the same

<noah> JAR: Seems there's a constant struggle around getting data sent around in a machine processable way, and integrated with UI stuff.

<ht> http://labs.mudynamics.com/2009/05/01/json-xml-performance/

<ht> http://ejohn.org/apps/jsonvxml/jsonvxml.png

ht: [cleaning up html ...] validator should give positive feedback, feedback directed to the authors ... this seems promising
... What is the benefit we're trying to achieve? What is the motivating headline? We want to make the web (fill in blank)?

timbl: simpler, cleaner?

<ht> Whitepaper about short- and long-term benefits

timbl: benefits, motto

<ht> And then a catchphrase similar to "cool URIs don't change"

timbl: Since 2 years ago, progress on polyglot documents, important idea & set of use cases
... Many discussions on namespaces

<ht> The survival of polyglot as a 1st-class citizen is what really matters for the XML-tool-chain-doc-management stuff I've been banging on about

timbl: strategy?

jar: Task force vs. working group vs. other - let's be clear


. action timbl to Create a task force on XML / HTML integration

action timbl to Create a task force on XML / HTML convergence

<trackbot> Created ACTION-437 - Create a task force on XML / HTML convergence [on Tim Berners-Lee - due 2010-06-14].

action-437 due in one month

<trackbot> ACTION-437 Create a task force on XML / HTML convergence due date now in one month


. RESOLVED: The TAG recommends the formation of a task force to drive the reunification of XML and HTML

RESOLUTION: The TAG recommends the formation of a task force to drive the reunification of XML and HTML

(passed by general acclaim)

<ht> OK, I have found the beginning of the thread about Tim's document: http://lists.w3.org/Archives/Member/tag/2008Aug/0067.html

Yves will scribe this afternoon

ADJOURNED until after lunch.

<ht> [Member-only link above, sorry]

<Yves> scribenick: Yves

<scribe> scribe: Yves

Domain name persistence (workshop planning)

(round of introductions)

Helen Hockx-Yu & Kevin Ashley

ht: commonly raised issue about using http URIs for persistence is because of possibilities of 404.

persistence for multi hundred of years, how (owned) domain names could achieve such persistence?

might lead to a workshop on digital data persistence.

there are two main lines, first one is to go to IANA for new TLDs under new rules that deserve to live forever

leading to more stable reference than with usual domain names

alternate way is to allow existing domains to fall in that category of "persistent domain"

open question, what would be the basis for selecting/allowing specific domains to become persistent

also some legal issues relative to endowments

jar: question about attaching metadata to a document, more difficult to do in an archive system than on a regular web site

advocates for URN claiming it was more reliable, how to you solve the URN resolving issue?

Kevin: difference between persistent of content, and persistence of domain name

jar: there are three aspects, the data, the cataloguing aspect, the metadata, and how to access them

persistence, cataloguing are almost solved problems, but automatic retrieval and update of metadata in archived records is an unsolved issue

Helen: we start with seeds URI and crawls. We did two domain crawls (of .uk). between the 2 crawls, more than 5% became unavailable, and new one emerged. Also ownership changed on some domain, with different content.

timbl: how many domains have meaningful whois information?

Helen: the goal is not to detect who the owner is, but that it changed

memento project is to embed time and date dimension in http requests, allowing you to do conneg based on time

you can link to dated version, and the memento server would serve the closest in time dated version of the content

there is a plugin for firefox that add a time slider.

timbl: so you need to add in your server a link to the memento server of your choice
... difficulty at W3C to mementoify our content is because of ACLs is not versionned in time, so you can get content at a specific time, but not the list of who was able to access this data

Kevin: same issue with government material

ht: there are some examples of useful software where the free version of it is only accessible through the wayback machine

Helen: in our case, we store things in archival format, so when people request archived copy, they may not be able to access the content

(arc)

ht: there are different requirements. in the IETF or W3C case, "follow your nose" means that when you get an http exchange, you can follow blindy the set of rules, so everything you need to know is on the web. But it all depends on http URI resolving.

having the bits available in an archived site is good, but having the /TR/ and RFCs available at their URIs is crucial

<Zakim> ht, you wanted to mention the meeting last Friday, and next steps

<noah> TAG Finding on linking to alternate formats: http://www.w3.org/2001/tag/doc/alternatives-discovery.html

trade off between conneg and linkable versions. If you have multiple versions, it is good practise to have URIs of the specific versions, and a way to list and compare alternates

Helen: because of technology limitation archived content is not an 100% faithful copy of a website (because of active content), also lots of producers don't want theit content archived

<ht> So, kinds of reference preservation goals. . . in archives, for web standards (follow your nose), [what else]

like making accessible things that were removed on purpose (like for legal reasons) on one site

<Zakim> jar, you wanted to suggest 'persistent linking'

<ht> and "interarchival reference" to the list above

<noah> Should we start thinking about the workshop?

<ht> UNESCO ICA -- International Council xxx Archives

ht: archival task is a bit similar to search engines to avoid tarpits (automatic crawling)
... jar's issue is persistent linking and not persistent content
... need to find a direction

<ht> ISO group forming wrt Web (archive) Metrics, says Helen H-Y

(discussions on who would be interested in participating in a workshop on persistence)

<ht> Possible attendees: DNS people (Ask TLR?), Memento designer (Herbert van der ???), Internet Archive (Wayback) ???

<ht> (different) National Library people

<ht> Ray Dennenbarg, Stu Weibel

<Ashok> s/Dennebarg/Dennenberg/

<ht> JR's question: Which intervention point(s) are the most promising?

<ht> Australian National Data Service; IIPC (International Internet Preservation Consortium), ???, Int'l library of Singapore (mtg at IPRES, Vienna, September), then IIPC next May in NL

<jar> http://lists.w3.org/Archives/Public/www-tag/2009Dec/0109.html

<ht> , National Library of Scotland, IIPC

<noah> Henry lists things to do moving forward:

<jar> http://netpreserve.org/about/index.php international internet preservation consortium

ht: need a white paper to document the different use cases and scenarios, so that we can point possible invitees to

<noah> Is the whitepaper input to or prepared with output from the proposed workshop?

white paper as a way to ground the discussion on agreed and understood points

Web Application, Client-side state

Web Applications: Client-side state

<noah> ACTION-430?

<trackbot> ACTION-430 -- Ashok Malhotra to propose a plan for his contributions to section 5: Client-side state -- due 2010-06-07 -- OPEN

<trackbot> http://www.w3.org/2001/tag/group/track/actions/430

http://www.w3.org/2001/tag/2010/05/WebApps.html

<johnk> hello, this is John - is it OK for me to dial in?

<noah> Yes, great JK. We'll figure out the phone here and dial in ASAP.

<johnk> thanks

<johnk> yes, when my phone boots up ;)

<noah> If you'll be on soon, we're waiting.

discussion moving to security

<noah> http://www.w3.org/2001/tag/tag-weekly#WebAppsSec

Web Applications - Security

<noah> http://www.w3.org/2001/tag/2010/06/01-cross-domain.html

<noah> ACTION-240?

<trackbot> ACTION-240 -- John Kemp to read thread on RDFa, CURIEs and profile and summarize http://lists.w3.org/Archives/Public/www-tag/2009Feb/0295.html -- due 2009-03-21 -- CLOSED

<trackbot> http://www.w3.org/2001/tag/group/track/actions/240

<noah> ACTION-340?

<trackbot> ACTION-340 -- John Kemp to summarize recent discussion around XHR and UMP -- due 2010-06-04 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2001/tag/group/track/actions/340

http://www.w3.org/2001/tag/2010/06/01-cross-domain.html

JohnK: document updated after email feedback from the original TAG email on the topic

it is clear that no specification completely forbid to leak information form one site to the other (cross-origin forgeries)

Tim: is it because it is not about all URIs and methods for retrieving them (like script, or <img> tags)?

jar: it is considered safe to load an image or script, not sending information

<jar> *loading* script is ok... it's honoring the script's requests that's troublesome

ashok: if you want websites to cooperate, same origin policy will forbid that. cors will help that while still allowing same origin policy checks
... does cors provide solution to xsrf/clickjacking? => not

johnK: validating origin alone is not enough to decide to do the request or not

timbl: origin is the origin of the script, the main question is how to run script form an untrusted website

Noah: I sign on an airline site, get a cookie from that site. I visit another site that wants to cancel my flight (attack script). it starts with a GET request. With CORS, the cookie will go out, potentially authenticating me, it will have extra header

for the browser to hand or not the data to the script if it is from the authorized list

now, what if it is a POST (or any unsafe operation)

JohnK: for any unsafe operation there is a 'pre-flight' request

http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/1324.html

<noah> http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/1324.html

<jar> DBAD = don't be a deputy

http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/att-0468/CORS.pdf

jar: if you reuse credentials, there will always be confused deputy attacks

Noah: surprising in the GET case that instead of having the server not returning the data if not allowed, it sends back data and trust the user-agent to do the security check on its behalf

<johnk> dka: that link is also to CORS - (see the link below the title to /cors)

<DKA> oops :)

<DKA> I meant: http://www.w3.org/TR/widgets-access/

jar: when you authenticate yourself, you give information to your user-agent

ht: in a capability oriented system, there are URIs that allows to identify compound resources that are resources+locks

<johnk> a capability URI is not only a pointer to a resource, but the ability to use it

<johnk> ... do something to that resource

Noah: is there an guessability of capabiity URI?

jar: not if you don't have the credentials to use that capability
... protection against CSRF is about unguessable tokens, in the URI or not

ht: unguessable, but it is interceptable?

jar: in banks, they also use defense against interception

<johnk> CSRF prevention example (from my Google account register domain HTML form):

<johnk> <input type="hidden" name="secTok" id="secTok"

<johnk> value='a06972ccb723db43c20d95985e5f17c5'/>

http://www.w3.org/TR/UMP/#access-control-allow-origin-header (about U:)

s,http://www.w3.org/TR/UMP/#access-control-allow-origin-header,http://www.w3.org/TR/2010/WD-UMP-20100126/#access-control-allow-origin-header

<jar> oauth vs. capabilities: http://www.eros-os.org/pipermail/cap-talk/2010-June/014235.html

<jar> Why we're talking about this (1) the finding that could be read as no secrets in URIs, (2) web app security in general

<noah> Explaining a bit about OAuth vs. CORS/UMP might be worthwhile, but I'm a little reluctant to get into doing an analysis of the whole server-side vs. client-side data integration tradeoff.

<johnk> ok

<jar> timbl: Do we accept the idea that origins can be principals?

<jar> timbl: Consider a company that provides scripts, and is generally trusted to do so

timbl: using domains as principals means that you trust the script to do the right thing, be an intermediary, respect the access controls

<jar> timbl: Hypothesis: using domains as principals counts, and means something... that you trust the scripts to do the right thing ... to be trustworthy intermediaries, to implement some kind of access control on behalf of others

<timbl> So in an ACL-based system, users have to be able to add script domains to the ACLs.

johnk: how do we move forward? broad architectural isue about security on the web. The current model is mixing user agent authentication and server-based authentication

so security is build on hacks over hacks

johnK: the algorithm for user agent is sufficiently complex that is may leave holes in the implementation

<jar> johnk: will ordinary web developers be able to do the right thing, or will it be too complex

need to have a way to allow web developers to do the right thing without having to think too much about it

ashok: critical to allow people to assess trust, but why is it at the header level, and not using policy URIs?

<Zakim> noah, you wanted to ask about followup & more sessions at this F2F

jar: it has been proposed

ADJOURNED

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2010/06/07 15:59:33 $