

Fabien Gandon & Alexandre Monnin : Tagging standards, focus on NiceTag


Lalana Kagal & Slim Trabelsi : AIR policy language & PrimeLife 16 Jun 2010


Paul Trevithick : State of Digital Identity 2010 09 Jun 2010


Michael Cooper: W3C Team from Protocols and Formats Working Group on Accessibility in Distributed Social Networks

STUB -- did this ever happen?

Paul Groth : Intersection of Provenance and the Social Web 12 Mar 2010


Oshani Seneviratne : Using RDFa to check and share licensing permissions (Creative Commons) 14 Apr 2010


Matthew Eltringham : User Generated Content (UGC) Hub at the BBC


Atom, OData, and RDF 07 Apr 2010

STUB -- Was there an invited speaker?

Marcos Caceres, Robin Berjon, & Scott Wilson : W3C Widgets, OpenSocial, and Apache work 10 Mar 2010


Tim Berners-Lee : Open Distributed Social Networking 03 Mar 2010


Dick Hardt : OpenID

STUB -- Did this happen?

Manu Sporny (BitMunk) & Doug Schepers & Dave Raggett (W3C) : Micropayments 10 Feb 2010


Anita Doehler & Daniel Appelquist : OneSocialWeb 03 Feb 2010


John Panzer : Salmon Protocol 20 Jan 2010


Christine, Renato, and Kaliya : Social Web Frameworks

STUB -- Did this happen?

Dan Brickley & Henry Story : FOAF & FOAF+SSL 09 Dec 2009


Chris Messina : ActivityStreams 02 Dec 2009


Eran Hammer-Lahav : Yahoo (XRDS) Open Discovery Stack 25 Nov 2009


Elias Bizannes & Joseph Smarr : 28 Oct 2009


David Recordon & Luke Shepard: Facebook 21 Oct 2009


Matt Lee : GNU social 14 Oct 2009


Evan Prodromou : OpenMicroBlogging 07 Oct 2009


Peter Saint-Andre of XMPP


Peter is the Executive Director of the XMPP Standards Foundation, formerly known as the Jabber Software Foundation. He heads up this independent, non-profit standards development initiative, whose primary mission is to define an open, XML-based protocol for presence, instant messaging, and real-time communications.

In addition to describing the work undertaken when formalising the XMPP protocol stack, Peter hinted at ways in which XMPP could be integrated into the Social Web, and identified how the work fits in with the rest of the W3C, specifically the work of the SWXG. Given that up until now the W3C has largely ignored social web technology, XMPP has been identified as one of the key technological enablers for a distributed, decentralised social network.

The IETF's Extensible Messaging and Presence Protocol (XMPP) builds upon the work carried out by the Jabber development community, in which Peter has been active since late 1999. XMPP grew out of instant messaging and is currently used in Jabber IM clients, in Google Talk, and in the newly released Google Wave platform. The initial idea was to build an extensible protocol for passing around XML; taking lessons learnt from building Jabber, XML could be used to build streams of messages. The XMPP RFCs came about in 2004, and since the release of RFC 3920 people can write their own implementations. XMPP was designed to be a generalised transport mechanism for shifting XML between peers in a network.
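The "streams of XML" idea can be roughly illustrated with Python's standard library. This is not a real XMPP client -- a real stream wraps many stanzas inside a long-lived stream element over TCP -- but the stanza shape below follows the message syntax defined in RFC 3920/3921 (the addresses are placeholders):

```python
import xml.etree.ElementTree as ET

# Build a minimal <message/> stanza of the kind that flows over an
# XMPP stream. The 'from'/'to' attributes carry Jabber IDs (JIDs).
message = ET.Element("message", attrib={
    "from": "romeo@example.net/orchard",
    "to": "juliet@example.com",
    "type": "chat",
})
body = ET.SubElement(message, "body")
body.text = "Art thou not Romeo, and a Montague?"

# Serialise the stanza as it would appear on the wire...
wire = ET.tostring(message, encoding="unicode")

# ...and parse it back out, as the receiving peer would.
parsed = ET.fromstring(wire)
print(parsed.get("to"))          # juliet@example.com
print(parsed.find("body").text)  # Art thou not Romeo, and a Montague?
```

Because the payload is plain XML, any extension is just another namespaced child element, which is what makes the protocol "generalised" in the sense described above.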

The following applications of XMPP were mentioned:


Before XMPP can overtake IRC for multi-user chatting, XMPP requires some further additions. The chess game presented below builds upon multi-user chat room technology. Furthermore, the notion of an XML-based multi-user chat environment may enable the collaborative editing of documents via the XMPP protocol.

The XMPP community is currently looking at ways of facilitating distributed multi-user chat rooms over a network, so that if one of the peers goes down the room remains accessible. At this point Peter stressed that XMPP is not trying to get rid of technologies such as IRC; the community just wants to build a protocol for the exchange of structured data. Only when XMPP has a viable solution to the distributed-room problem might it be considered a viable replacement for IRC. Further information on progress in this space can be found here :

One project attempting to deliver XMPP on a clustered, high-availability network for high volumes of users is the community. Prosody is also looking at ways of implementing distributed multi-user chat rooms for its instant messaging network.

Another compelling feature of XMPP over IRC is that it uses cert-based authentication, and that it does not reveal a user's IP address as IRC does. Because cert-based authentication has been implemented in XMPP, the US Department of Defense (DoD) has adopted XMPP as its only approved chat mechanism; this, along with a number of security holes identified in IRC, helped the DoD decide to adopt XMPP.

Unlike IRC, XMPP has native support for presence information and for network availability. XMPP also caters for audience segregation when it comes to presence information. It was noted that 90% of all traffic in an IM environment is presence information, not chat. XMPP enumerates strategies to manage presence information so that one doesn't have to receive presence data continuously. One implementation trying to tackle this problem is the beta Nokia chat client :

Furthermore, there is work underway in the project to look at calculating diffs on presence information so as to reduce duplication of information sent to a user's various clients.

'XMPP, Push and Poll'

Given that many services are switching away from a polling method of acquiring updates, XMPP attempts to solve this problem by using push technology. Push technologies are becoming popular in domains such as real-time search, as they use fewer resources than traditional polling. Push is used by XMPP to disseminate presence information; the outcome of this work can be found in the pubsub extension of XMPP.
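The push-versus-poll distinction can be sketched with a toy publish-subscribe hub. This is a drastic simplification of XMPP pubsub (XEP-0060) -- the class, node, and callback names here are invented for illustration -- but it shows why push avoids the wasted round-trips of polling: subscribers register once and are then called as each item arrives.

```python
from collections import defaultdict

class PubSubHub:
    """Toy pubsub hub: subscribers register a callback once and are
    pushed every later publication, instead of polling for updates."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, node, callback):
        self.subscribers[node].append(callback)

    def publish(self, node, item):
        # Push: the hub notifies each subscriber as the item arrives,
        # so subscribers never have to ask "anything new yet?".
        for callback in self.subscribers[node]:
            callback(item)

hub = PubSubHub()
received = []
hub.subscribe("presence/juliet", received.append)
hub.publish("presence/juliet", "online")
hub.publish("presence/juliet", "away")
print(received)  # ['online', 'away']
```

In real XMPP pubsub the "callback" is a stanza pushed over the subscriber's existing stream, which is what makes it suitable for high-volume presence dissemination.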

'XMPP and Security'

It was noted that XMPP has always used SSL/TLS for encryption. As it stands, however, XMPP only supports per-hop encryption of data, not end-to-end encryption. One well-known approach to end-to-end encryption of chat is OTR ('Off-the-Record Messaging').

Since 2004 there has been a STARTTLS command, which enables per-hop encryption. Given that this is not end-to-end encryption, there will always be clear text stored somewhere on a server. Work is still ongoing in the XMPP community to enable end-to-end encryption; the community has been looking at OpenPGP and S/MIME, and has yet to come to any conclusions on this topic. The XMPP community aims to work on end-to-end encryption first, and then on group chat encryption.

'XMPP and OAuth'

Another piece of work is the integration of OAuth over XMPP; the document outlining how the interactions would happen is currently hosted as an Experimental XMPP extension, which can be found here :

'XMPP and Geolocation'

Yahoo's Fire Eagle also makes use of XMPP to share geolocation information:

'XMPP and RDF'

There has also been some work undertaken by DanBri to show RDF query over XMPP :

A list of existing XMPP implementations discussed:

Online Games : The work undertaken on online games presents a foundation for the development of Multi-User Chat rooms (MUCs).

File Sharing :

Google Wave :

Google Talk : Google Talk uses XMPP for their signaling channel, which is used to setup voice and video calls.

Pidgin : An open-source IM client with XMPP support.

Questions and Answers

Question from petef: Google Wave servers do server federation using XMPP; will any more of Wave be based on XMPP?

Answer: Google is not that interested in creating a client-side XMPP implementation for Wave. The Wave servers are based on XMPP, as XMPP does federation really well and supports server-to-server encryption. But no: as far as Peter is aware, Google is not putting much effort into the creation of XMPP-based clients.

Google Wave XMPP server code :

Question: Henry Story asked about client side authentication based on certs (foaf+ssl).

Answer: XMPP uses SASL EXTERNAL, which allows XMPP to make use of client-based certs. It was noted that the main problem with X.509 certificates is that most people don't have them - although everyone does have a social networking page. Furthermore, Peter informed the group that X.509 certs are being used by the US Army to create encrypted connections and to perform authentication. Henry presented foaf+ssl as a method of allowing one's browser to create a client-side cert, which could then be used for end-to-end encryption and authentication.

Question: hajons asked about standards and their evolution. Given that XMPP is being used in Google Wave, how might XMPP be shaped as it is developed by large companies?

Answer: Google published a specification for Google Talk's video signalling, but it was not built on top of any existing XMPP standard. The Google Talk signalling spec was developed in house and released as a finished final product. The work undertaken at Google when developing Talk was very similar to the work undertaken on Jingle. The extension specified by Jingle has now become an official part of XMPP, and given that it is now an official XMPP extension, Google is looking to harmonise its work in Google Talk with the specification produced by Jingle.

It is still early days for Google Wave, and Google is trying to develop the technology in the open, a different approach than was taken in the development of Google Talk.

Ros Lawler, Head of eCommerce, Random House Group - Meeting on September 16th 2009

A highly successful and dynamic marketeer, Ros Lawler spoke to the group about Random House's use of social media as a marketing tool.

Ros explained that the simplest use of social networks is that fans of a given publisher self-organise, so you can meet them and engage directly. For example, Dave Gorman had a fan site, which Random House approached. They were able to offer the group video content and more, so that Random House could join the conversation. Members of that group were invited to a pre-launch event and met Dave Gorman, and that works well.

Where no fan group exists, Random House can create one. For example, they did this for Terry Pratchett by working with Bebo – which has a younger age profile and is generally more creative than Facebook. They ran a competition where people dressed up as their favourite Discworld character and submitted videos.

At the end of the day it's about return on investment. The highest ROI is achieved by e-mail, so one activity of social media usage is data capture, enabling e-mail marketing. But this is time-consuming work, and they constantly have to balance the effort with the return.

Initial usage of social media was done from people's personal Twitter feeds and SN pages, but this quickly became unmanageable, so they created corporate accounts. Colleagues go through digital training in how to use, say, Tweetdeck – and then the problem hits, because corporate IT departments won't allow such software to be installed and run. This led to a discussion of corporate culture – backed up by other big companies represented in the meeting. The conservative IT culture goes against the social media culture. People being paid to spend time on Facebook, uploading and watching video, causes more queries internally than just the (high) bandwidth use.

This led to a suggestion of corporate IT Best Practice guidance.

We also discussed the problems of maintaining multiple accounts on multiple platforms – clearly a standardised approach for all SNs would make life easier and allow, for example, the ability to manage multiple accounts through a single interface. The existence of W3C widgets is a step in this direction.

There was a discussion about Habitat's (ab)use of the #iranelection tag on Twitter, and the case of Moonfruit (who provide white-label websites). Moonfruit asked people to include the #moonfruit tag to get a free something, and this led to Twitter eventually banning the tag – with all sorts of implications for editorial control and ownership. Similarly, Ning's terms and conditions are incompatible with Random House's own copyright rules – which prevents them including any of their book content on that network.

Overall, this was an interesting discussion, coming at social media from a different perspective than the group normally has.

Thanks to Ros for her time and insight.

Dave Raggett - Meeting on September 16th 2009

W3C Fellow Dave Raggett presented some slides entitled "New Directions for the Web". He covered a number of topics including delegation, the Web of Things, privacy, and context. The general feeling was that we have a number of technologies to tackle these issues, such as P3P, OpenID, and OAuth, but that the field is still very much in its infancy. Dave is also part of the W3C Context Awareness and Personalisation Working Group, which follows on from the Ubiquitous Web Applications Working Group and aims to create patterns and ontologies for dealing with context and personalisation on the web.

Moving on to questions, Dave suggested that it would be advantageous to have liaison between groups, so that we can try to achieve a unified vision. He thought that privacy was perhaps a more key issue than identity, and that we should learn lessons about usability from existing projects such as P3P, looking at what works and, more importantly, what doesn't. For example, usability was a drawback for P3P, which prevented it reaching its full potential. On privacy, new crypto techniques such as Zero-Knowledge Proofs should also be examined. One key aim suggested was to prevent vocabulary fragmentation: for example, the Delivery Context Ontology could be mapped to others in the area. Today we use tools such as Protégé, but this is an area where we should aim to become more collaborative, and ideally the toolset should evolve accordingly.

Dan Brickley: WebFinger


Nathan Eagle - Meeting on August 19th 2009

Looking at the big picture, Nathan was quick to point out that while there are a large number of mobile devices out there, there is a vast discrepancy between mobile phone usage in the developed world and in the developing world. For example, most of the mobile phones in the world (c. 2.5 billion) are on prepay packages that are enormously expensive in local terms. This is one reason that the average duration of a call in Africa is only about 3 seconds. He suggested that enabling affordable Internet around the world is a more immediate goal than standardization of data formats. To enable the developing world to participate in the social web, pressure needs to be applied to telcos to make access affordable.

Regarding privacy, telcos take the view that they own your data and that it is valuable. However, it is very hard to analyze, even using supercomputers. With respect to data on a mobile device, he said that the data available on your mobile is richer than that recorded by the telcos, but gathering it generally involves installing spyware on the device.

On the topic of standardization, many attempts have been made in both industry and academia at data gathering and sharing, from the Nokia Scope project to the CrowdAdd project, but these have been largely unsuccessful.

One reason he suggested for this lack of success was that the privacy issues have never been sorted out universally. Asked about the role of data formats such as ontologies, he said that they would be useful, but that in the near term other challenges, such as providing affordable pricing schemes for generating user-generated content (e.g. local job opportunities), would have the greatest impact.

Sam Critchley from Gypsii - Meeting on July 22nd 2009

A number of businesses like Gypsii are building their business on context-aware mobile applications for social networking that make it easy for users to generate content. First, there is a real need to standardize context information. Second, they have to build their application for different phones, such as the iPhone and Windows Mobile. Lastly, they would like an easy way to share data like phone numbers; while OpenID 2.0 can do this, it is not well known how to do it in an extensible manner. Christine Perey then hosted a discussion of standards for measuring social networks. As a social network is not just a web site, the sheer number of accounts is not a great metric, as many people make an account but do not use it. What is needed is a metric of how much time a user spends on a social networking service, and how that changes over time. The ability to discover which components of a social web site a user spends the most time on would be ideal. A simple standard for sharing this type of information from the W3C would help analysts and researchers interested in social networks.
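The time-spent metric discussed here could be computed from session logs roughly as follows (the log format and user names are invented for illustration; real services would aggregate from clickstream data):

```python
from collections import defaultdict

# Each record: (user, week number, minutes spent in one session).
sessions = [
    ("alice", 1, 30), ("alice", 1, 15), ("alice", 2, 5),
    ("bob",   1, 0),   # registered but inactive
]

# Total minutes per user per week -- a better engagement signal than
# the raw account count, which would count bob as a full user.
minutes = defaultdict(float)
for user, week, mins in sessions:
    minutes[(user, week)] += mins

for (user, week), total in sorted(minutes.items()):
    print(user, week, total)
```

Comparing these per-week totals over time gives exactly the "how that changes over time" trend the discussion asked for, and a shared schema for the input records is where a simple standard could help.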

Simon Tennant from Buddycloud on the OSLO Initiative and Matt Womer : W3C Geolocation API 15 July 2009


Joseph Bonneau, Sören Preibusch : "The Privacy Jungle: On the Market for Data Protection in Social Networks" - Meeting on July 01 2009

All the data gathered and the paper can be found on Sören's website, The Privacy Jungle. Slides covering the basis of the talk given to the Social Web XG can also be found there.

As per Sören's email, the abstract of the "Privacy Jungle" was presented as :

"We have conducted the first thorough analysis of the market for privacy practices and policies in online social networks. From an evaluation of 45 social networking sites using 260 criteria we find that many popular assumptions regarding privacy and social networking need to be revisited when considering the entire ecosystem instead of only a handful of well-known sites. Contrary to the common perception of an oligopolistic market, we find evidence of vigorous competition for new users. Despite observing many poor security practices, there is evidence that social network providers are making efforts to implement privacy enhancing technologies with substantial diversity in the amount of privacy control offered. However, privacy is rarely used as a selling point, even then only as auxiliary, non-decisive feature. Sites also failed to promote their existing privacy controls within the site. We similarly found great diversity in the length and content of formal privacy policies, but found an opposite promotional trend: though almost all policies are not accessible to ordinary users due to obfuscating legal jargon, they conspicuously vaunt the sites' privacy practices. We conclude that the market for privacy in social networks is dysfunctional in that there is significant variation in sites' privacy controls, data collection requirements, and legal privacy policies, but this is not effectively conveyed to users. Our empirical findings motivate us to introduce the novel model of a privacy communication game, where the economically rational choice for a site operator is to make privacy control available to evade criticism from privacy fundamentalists, while hiding the privacy control interface and privacy policy to maximise sign-up numbers and encourage data sharing from the pragmatic majority of users."

Summary of Discussion

The high usage of social networking sites (SNSs) is well documented, and given reports of the high penetration levels of SNSs, Bonneau & Preibusch's work was motivated by an illustration showing the per-country breakdown of the most popular SNSs. The data illustrates the state of affairs as of November 2008. Given the list of most popular social networking sites, the Privacy Jungle evaluates 45 of the top SNSs using ~260 criteria, so as to test popular assumptions regarding privacy and social networking.

The ability for people to share information with other people through the process of selecting friends was the overriding pre-condition for a site being labeled a "social networking site" (SNS). Only sites which allow you to make friends were considered. This is also one of the characteristics of SNSs as defined in danah boyd & Nicole Ellison's Social Network Sites: Definition, History, and Scholarship.

It was also stressed that sites claiming to be "leading sites" are often far from it, and that statistical data about social networks given to advertisers seems more reliable than data given in press releases and to the general public. The information gathered to rank and narrow down the social networking sites into a list of 45 was based on the number of users, taken from fact sheets or surveys. Flickr was identified as one big SNS that was not considered in the study: the main use case behind Flickr is posting vacation pictures to be shared with friends, and befriending people is not the main focus of activity there. Gmail was also flagged as another social networking site, highlighting the fact that most sites are now becoming social in nature. Discussion of whether blogging sites (i.e. Blogger, Wordpress, and so on) are social networking sites came to no real conclusion, but they were not considered in the Privacy Jungle study. YouTube was presented as not being a social network (SN), as friend-based communication there is very limited. This follows on from the idea that the Social Web XG needs to create or employ a definition of what it means to be a social networking site. The Alexa list was stripped down to a list of 45 sites, all of which had English-language environments so that the research could be undertaken as intended; the data can be grabbed from here.

The key aspects of the SNSs' privacy policies were evaluated using the following broad categories: language used, technical accessibility, presence of metadata, and data collection and sharing practices according to the policy. Also investigated was the P3P deployment on the sites, which was low. Further analysed were the different privacy controls that the sites offered to users to control access to personal information through the platform.

It was also highlighted that sites usually do not promote privacy on their front page, but do so on pages expected to be read only by those who are interested in privacy anyway: privacy seals were absent from the main pages of every network, but seven out of 45 did display a privacy seal in their privacy policy. At the same time, service providers do not wish for their data to be leaked outside of their SN. The "lock-in effect" of social networks is nothing new and has been around since Amazon; as a result, privacy has been important from a business point of view for a while now.

Insight was also presented regarding what happens to content when it is deleted from a social networking site. Most social networking sites do not claim to ever remove data; some make partial claims about how they will delete information you wish deleted. This work was undertaken manually by Joseph, whereby he deleted content and then checked how long it took for the sites to actually remove it from the web. One reason this removal process is a problem is that most social networking sites outsource their content hosting to delivery networks - for example, Facebook uses Akamai - and as a result, if you ever delete a photo from Facebook it might still be available for the next 30 days. With respect to removing/deleting information, Facebook's privacy policy states that "Removed information may persist in back-up copies for a reasonable period of time but will not be generally available to members of Facebook."; the ambiguity lies in what counts as a "reasonable period of time". It was also mentioned that SNSs do not tend to inform their users about which countries their content delivery networks are based in - that is, sites tend to have caches in different countries in order to speed up their HTTP requests.

An SNS's "content delivery network" is under the service provider's control; the provider tends to hold the master copy of the content, but where the copies are held tends to be unknown. E.g. once a photo has been viewed in the UK via Akamai, there will be a local cache of the photo on a UK Akamai server. This is a huge grey area legally, for data protection laws differ from country to country, and it is unclear what the status of your content is from place to place. It was highlighted that the larger sites are the tip of the iceberg in privacy practices, and that the smaller sites often started with bad privacy, probably because they are not yet big enough to have their own privacy lawyer.

A question was posed to the presenters on how they thought this work could be taken forward, and the following two suggestions were made:

  1. A task to diff the privacy texts, to see what percentage of wording was reused across the sites.
  2. There are some ideas around the textual analysis of the policies; this may identify :
    1. Reuse of boilerplate text
    2. How difficult it is to parse the text
    3. The various relevant dimensions of a privacy policy
    4. Help identify how the various policies differ from one another
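The first suggestion, diffing policy texts to estimate reused wording, can be sketched with Python's standard difflib (the two policy snippets below are invented examples, not text from any of the 45 sites studied):

```python
import difflib

policy_a = ("We may share your personal information with trusted "
            "third parties for the purposes described in this policy.")
policy_b = ("We may share your personal data with selected "
            "third parties for the purposes set out in this policy.")

# SequenceMatcher.ratio() returns the fraction of matching characters
# between the two texts (0.0 to 1.0); consistently high ratios across
# many sites' policies would suggest boilerplate reuse.
ratio = difflib.SequenceMatcher(None, policy_a, policy_b).ratio()
print(round(ratio, 2))
```

A real study would likely normalise whitespace and compare at the sentence or clause level rather than raw characters, but the same similarity measure applies.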

And finally there was mention of the POWDER and P3P technologies. Only 15% of the sites investigated implemented P3P, and some of those that did were implementing the spec incorrectly. There was also talk of looking into how the W3C could step in to facilitate best practices for publishing privacy policies, perhaps in the form of a cut-down dialect of P3P in RDFa, to increase uptake.

Kaliya Hamlin from Identity Commons - Meeting on June 17 2009

One of the fundamental components of the Social Web is an open, standards-based way to build a notion of identity in a federated and distributed manner. Many of the relevant technologies have been developed within the broad camp of the Identity Commons. Identity Commons came together to see how individuals could manage and control their own identities on the Web. Formed between 2004 and 2005, it was initially focused on user-centric identity, but has since branched out into a more nuanced understanding of identity, including organization- and card-centric identity. One of its achievements is a common lexicon for talking about identity on the Web, and some of the work of this diverse community is captured in the "Venn of Identity" paper. Identity Commons is bound together by its principles and a twice-yearly meeting called the Internet Identity Workshop, whose discussions have led to well-known identity technologies being launched, including OpenID and information-card identity work such as the Higgins Project. In general, when their technology matures, Identity Commons working groups tend to take their work for standardization through OASIS or the IETF, as they have believed that the W3C is not interested in digital identity. Furthermore, most people in Identity Commons believe that market adoption, rather than standards organizations, will determine the deployment of digital identity. However, if the W3C shows genuine engagement with grassroots groups like Identity Commons, there is likely a role for the W3C in digital identity.

TimBL & Peter Mika : VCard in RDF May 27th 2009