Provenance Incubator Group Teleconference

21 May 2010


See also: IRC log


Present: [IPcaller], Yolanda, SamCoppens, smiles, +1.619.223.aaaa, jcheney, Ivan, +1.617.768.aabb, Irini, olaf, michaelp, +86528aacc, +1.937.775.aadd, +0238059aaee, harry, +1.518.763.aaff, Lalana
Chair: Yolanda Gil
Scribe: James Cheney


<trackbot> Date: 21 May 2010

<scribe> Scribe: James Cheney

<scribe> ScribeNick: jcheney

Yolanda: not on IRC

Provenance in social web

Yolanda: New members?

<pgroth> loving swan

(didn't catch your name...)

SWAN ontology one of ontologies we are trying to integrate

<smiles> jcheney, his name is Paolo Ciccarese I believe


While we wait for harry, discuss other agenda items

<pgroth> harry is dialing in now

<Paolo> jcheney, sorry about that. Paolo Ciccarese is my name indeed

Discussion of scenarios including news aggregator - adding technical requirements leading to state of the art/technology and identifying gaps

Paul: followup - architecture diagrams, association between tech and user requirements

Do we want to do this for all three scenarios? Or just web scenario?

Yolanda: finish first, next in June

Paul: Need one more diagram (Yogesh), then tech reqs

Yolanda: Any other interest in state-of-art report?

Luc: yes

Olaf: will have a look

Paul: will sort out next steps with Yogesh and others

Yolanda: Harry will give an overview of story so far with Prov & Social Web XG interaction, scenarios

Harry: Social Web XG looking at standardization for distributed open social networks
... Broad scope, trying to zero in on candidates for standards

<hhalpin> http://www.foaf-project.org/

RDF, FOAF emerging. Some adoption but most use cases not possible with RDF 1.0

Need versioning, access control, provenance

and most use cases involve these

Determining genuineness as data moves between social nets: need a solution, no good one now

<hhalpin> http://www.internetidentityworkshop.com/

<hhalpin> openid, certificate, foaf+ssl

Most activity so far: what is a good internet ID? (>50% of social web XG so far)

<hhalpin> URIs, Web ID

<hhalpin> policy-driven

Provenance needs identity (personal, authority, certificates); also foundational for policy driven use cases, esp. privacy

<hhalpin> the privacy jungle

<hhalpin> trust

<hhalpin> provenance

Objective, data-driven measures of trust?

Harry: Questions?

<JimM_> provenance == source of ~FOAF network info -- info about the authority stating it?

Luc: You think prov should be based on identity in soc. nets. What does this mean?

<pgroth> it means (i think ) that provenance "grounds out" in identity

Harry: Identity needed to determine "Who" in open systems. (lack of central authentication?)

What is a reliable open-ended system for identifying people over internet?

Use cases demonstrate provenance, but problematize identity

<hhalpin> http://www.w3.org/2005/Incubator/socialweb/wiki/UserStories

<pgroth> http://www.w3.org/2005/Incubator/prov/wiki/Social_Web

People in social web have multiple identities

<Paolo> Related project for identifying people http://esw.w3.org/Foaf%2Bssl

Ability to "take down" your data (facebook death)

whereas most provenance is "append only". Need to support deletion of provenance?

In social network, need to track down copies and delete them too (difficult...)

<pgroth> we added that

<pgroth> :-)

Paul: We did add a "deletion" aspect to the use case

Harry: Not much academic work on this

Iranian election - false information, which messages are real

(on the other hand, maybe the Iranian twitterers didn't want to be easily identifiable!)

need "anonymous IDs" - know this is the same person but not who it really is? persistence of trust relationships

government-level spamming of social network sites

Vocabulary for "person made claim on date" - still need to build "time" and "versioning" into RDF to support this
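
A minimal sketch of what a "person made claim on date" record might look like, written in Python rather than RDF. The `Claim` type, its field names, the `claims_as_of` helper, and the `ex:` identifiers are all hypothetical illustrations, not a vocabulary proposed in the meeting.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """One hypothetical 'person made claim on date' record."""
    claimant: str     # identifier of who made the claim (e.g. a URI)
    statement: tuple  # (subject, predicate, object) being asserted
    made_on: date     # when the claim was made


def claims_as_of(claims, day):
    """Return the claims made on or before `day`, oldest first."""
    return sorted((c for c in claims if c.made_on <= day),
                  key=lambda c: c.made_on)


# Two claims by the same (made-up) person, about different employers.
claims = [
    Claim("ex:alice", ("ex:alice", "ex:worksFor", "ex:acme"), date(2009, 6, 1)),
    Claim("ex:alice", ("ex:alice", "ex:worksFor", "ex:initech"), date(2010, 5, 1)),
]

# Only the 2009 claim existed at the start of 2010.
early = claims_as_of(claims, date(2010, 1, 1))
```

Keeping the date on each claim, instead of overwriting the old statement, is what lets a reader ask what was asserted at a given time.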

<hhalpin> http://activitystrea.ms/

Example: Activity Streams (FriendFeed) allow sharing of updates among sites.

<hhalpin> atom-based protocol

Define a simple atom-based protocol for profile updates (non-RDF, finite vocab)

<hhalpin> change-control

Lack of provenance framework to deal with change control

<hhalpin> who-why-what

<hhalpin> atom

Problem with Atom: Have to subscribe and poll feed to see if anything changed

<hhalpin> real-time updates

<hhalpin> HTTP GET

<hhalpin> 5000 http atom feed

Need push model instead of "pull" / polling model
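
The pull versus push contrast can be sketched with toy in-memory classes. This is an illustration only: `Feed` and `Hub` are hypothetical names, nothing here uses HTTP or Atom, and it does not implement the actual pub/sub protocol mentioned in the discussion.

```python
class Feed:
    """A toy feed that counts how many times it is fetched."""

    def __init__(self):
        self.entries = []
        self.fetch_count = 0

    def fetch(self):
        self.fetch_count += 1
        return list(self.entries)


class Hub:
    """A toy hub that pushes each new entry to every subscriber."""

    def __init__(self):
        self.subscribers = []
        self.push_count = 0

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, entry):
        for callback in self.subscribers:
            callback(entry)
            self.push_count += 1


# Pull: 3 subscribers poll 10 times each, although only one entry exists.
feed = Feed()
feed.entries.append("profile updated")
for _ in range(10):
    for _subscriber in range(3):
        feed.fetch()

# Push: the same 3 subscribers each receive exactly one delivery.
hub = Hub()
inboxes = [[] for _ in range(3)]
for inbox in inboxes:
    hub.subscribe(inbox.append)
hub.publish("profile updated")
```

In the polling case the server's load grows with the number of subscribers times the polling rate, whether or not anything changed. In the push case it grows only with the number of actual updates.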

<hhalpin> load

<hhalpin> pubsubhub

Server overload problem due to centralization

<hhalpin> push

<hhalpin> vodafone onesocialweb

Social Web XG members developing products (Vodafone, Google Buzz) using PubSubHubbub, Atom, Activity Streams

Atom describes updates, but not fine-grained deletion or identity; not as rich as OPM

Jim: Is RDF the problem or is the problem the lack of an RDF syndication/update formalism?

Harry: Lacks notification, change control

<hhalpin> rdf lacks change control, real-time notification, and a rich model of identity.

Jim: Seems like a non-technical issue

<pgroth> +q

Harry: The difference is that there are de facto standards for updates (Atom) and pub/sub (PubSubHubbub).

Jim: Prov community could encourage standardization, recognition of importance of pub/sub issues

Harry: Scalability to 100,000s of users (why facebook bought friendfeed)

Jim: OPM accounts may handle some of these issues

<Luc> a few years ago, (using the pasoa model), we had defined a provenance-aware rss, which required headers to be communicated, as part of messages.

Paul: Jim, is it enough to develop a vocabulary, or are there protocol/architecture issues (with RDF pub/sub)?

<Luc> so, it's both vocabulary and protocol!

Jim: Third option: prov vocabularies plus other vocabularies

<pgroth> can't hear you harry

<hhalpin> cvs on rdf

Harry: want declarative vocabulary, but also want (distributed?) CVS for RDF

<hhalpin> versioning and vocabulary and realtime update mechanism

<hhalpin> atom/xmpp/json

Versioning a subset of prov vocabulary, and real-time (change driven) update mechanism.

<ssahoo2> would RDF named graph help in managing RDF graph versioning?

<JimM_> yes
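
One way named graphs might carry versions, sketched with plain Python tuples instead of an RDF library. Each version of a profile is assumed to live in its own named graph; the graph names, the `ex:` terms, and the `triples_in` and `diff` helpers are hypothetical, and no particular RDF store or syntax is implied.

```python
# Each quad is (subject, predicate, object, graph_name); all names are made up.
quads = {
    ("ex:alice", "foaf:name", "Alice", "ex:profile-v1"),
    ("ex:alice", "ex:worksFor", "ex:acme", "ex:profile-v1"),
    ("ex:alice", "foaf:name", "Alice", "ex:profile-v2"),
    ("ex:alice", "ex:worksFor", "ex:initech", "ex:profile-v2"),
}


def triples_in(graph_name):
    """Return the set of triples stored in one named graph."""
    return {(s, p, o) for (s, p, o, g) in quads if g == graph_name}


def diff(old_graph, new_graph):
    """Return (added, removed) triples between two versions."""
    old, new = triples_in(old_graph), triples_in(new_graph)
    return new - old, old - new


added, removed = diff("ex:profile-v1", "ex:profile-v2")
```

Because each graph has its own name, statements about a version (who published it, when) could be attached to that name, which is where provenance would connect to versioning.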

Really would like clear connection between prov and policy, trust.

Social Web XG running until Sept 2010, drafts of final report in next month or two, will circulate

Satya: how complicated are queries in social web context? is more complicated analysis of provenance required?

Harry: Example: want to distribute photo update to (possibly complicated) group of friends, possibly inferred from previous user actions

Satya: Could be related to comparison/diff of provenance graphs

Harry: No one does this yet, but coming - Google Buzz, facebook policy mishaps leading to changes, policy imposition

Privacy and trust weren't built into the system

Satya: Requirement in eScience that users want to be able to repeat experiments. Social web: reconstruct history?

Harry: Might want to be able to replay / script / infer actions or privacy policies

Example: User changes jobs, privacy policy for new job can be based on that for old job

Yolanda: Moving on to additional topics

Next item: RDF Next Steps workshop submission

Jun: Draft started last month, based on user requirements

Message: Need best practice for common requirements for RDF concerning provenance, vocabulary for provenance patterns

Need proper identity management (data, graphs, annotation)

<pgroth> is it accepted?


<pgroth> excellent!

Accepted for presentation, small changes needed

Three other papers mention provenance

<ivan> ian davis

http://www.w3.org/2009/12/rdf-ws/papers/ws06/, http://www.w3.org/2009/12/rdf-ws/papers/ws12, http://www.w3.org/2009/12/rdf-ws/papers/ws09

Next topic: Query provenance (Irini)

Irini: goal was to show how existing relational data provenance models can be used for SPARQL/RDF


Showed that existing model works for positive SPARQL

Negation problematic (as usual...)

Thanks to Olaf for comments

Can discuss in future meeting

Yolanda: Questions?

Next topic: New use case, internet architecture use case

Will revisit next week


trackbot, end telcon

Summary of Action Items

[End of minutes]

