Semantic enhancements for social networks

Augmenting raw data and expressing sticky constraints to allow for better data handling

by Rigo Wenning, W3C Privacy Activity Lead and Legal Counsel
Ivan Herman, Semantic Web Activity Lead

Social networks today come with a certain business model and feel rather like remote applications that one uses via a web browser. But the social connections between people go far beyond mere questions of applications and interoperability. Behind all this are the stories of fate and fortune of people who communicate with their friends and enemies, communicate for business and political reasons or just for fun, seek vanity and fame or just look for lost connections.

This raises two kinds of questions. On the one hand, there are technical questions of interoperability and enhancements. On the other hand, there are questions about how society will deal with the phenomenon. And there is the edge between both. This position paper is mainly about this thin line between technology and society, which may help to mediate or reconcile the contradictory goals that we have seen flaring up again and again throughout the short history of the web. I will start by weighing some technical questions, continue with social questions and finally suggest a means to add data handling hooks that would allow for better governance in line with the needs of a democratic society.

I. Technical questions

As the workshop points out, there is a basic interoperability issue. Most social networks today operate in server-side application sandboxes that use web technology but aren't really connected to the rest of the world in a webbed way. Some people would even talk about walled gardens. So there is a great desire among many users for an interoperable format that allows them to transport their social graph. Social networks share that interest, since an easier transition makes it easier to gain new subscribers. Subscription to multiple social networks by a single person is by no means exceptional.

This point will be made by many others at the workshop, so a short statement of support seems to be in order and sufficient. But having such an interoperable representation of the social graph has consequences beyond exchanging my list of friends between sites. The focus of this position paper will therefore be on the social aspects, assuming that there is an interoperable exchange format for the social graph.

II. Social questions

Most of the issues in social networks are known and are linked to the concept of personality. This encompasses social concepts like reputation, false allegations, the right to one's image, privacy, insult, discrimination of all sorts and the like. Operators of social networks constantly try to evolve their platforms to better match users' preferences and to respond to the latest incident.

New challenges arise from the creation of interlinked services, where a social network creates partnerships with vendors and others to feed metadata from user actions in third-party services back into its network. Facebook has already faced such an incident: user behavior on those third-party services was plainly reported back as an action trail to the social network and exposed without appropriate governance and access control. A potentially useful interaction between vendor services and the social network was not accompanied by measures that take into account privacy, secrecy issues and well-adapted access control to the action trail of the user. No wonder. How could they have done it? For the moment, safeguards of this kind do not exist or are in their infancy.

Taking reputation as an example, it involves exposure and the gain of renown to achieve higher recognition, which translates into greater influence and more power. While people try to acquire a higher reputation, others try to benefit from that reputation or from the vulnerabilities on the way to it. This is nothing new in our societies. Most legal systems around the world therefore know some concept of protection of the natural and virtual person in this game. This ranges from legal institutions like trademarks to the very personal regime of privacy protection that is mostly tied to the natural person. It ranges from the protection of one's image to the everlasting battle between reputation protection and free speech.

As people struggle to gain reputation in one world, they want to take that reputation to another world. We already know a precedent from the phone networks. In Europe, one can request that the phone number follow the individual when changing operators. This was done to ease switching and so improve competition between phone operators. Imagine you know 500 people and you would have to contact them all to tell them about your changed mobile phone number. This burden alone creates a strong incentive to remain with the current operator unless there is a major pain forcing people to move to another one. The same may hold for social networks. How can one take along all the investment made in one social network while additionally subscribing to another social network or moving there? For the moment, one would have to restart from scratch, make contacts and start all over again. People would be grateful if they were able to take their profile and their assets with them and also enjoy more than one social networking operator. Data self-determination taken seriously would say that the profile is an asset of the user, one she has often invested in heavily. Not treating the user as an asset of the social network operator means doing the right thing. If the social network operator makes money by knowing more about us, that's perhaps a different issue and would also work under the conditions outlined here.

Consequently, this paper is an argument against the hype about intermediaries that try to cash in on the mindset that users are assets of the intermediary. Only if the intermediary understands that it provides a service to the user, and that services can also be rewarded financially, can we advance to a mindset where we help the user to accomplish his venture for more visibility and recognition in this world.

This requires an interoperable serialization of a given user's social graph as an exchange format, but it requires more. It requires the ability to express socially relevant metadata in an interoperable way, so that the social information and concepts, namely privacy, access, reputation and the like, can be transported as well. This is challenging and requires more than just one operator inventing and re-inventing policy languages in its own corner. One is not enough; all are kindly asked to take up the challenge and bring their experience to the table. The competition will then happen on the implementation side. Here, the user interface questions are key to success.

Adding such policy metadata to the interoperable serialization of a social graph is tricky for the simple reason that social values, attributes and concepts are as diverse as our real world is. So the attempt to create the one all-encompassing ontology for metadata or policy data to be used together with the social graph is bound to fail. The metadata itself must be extensible and must allow for additions and regionalizations. But this diversity must rest on stable common ground: a common set of universal basic attributes, properties and expressions. Too much diversity would confuse users and would destroy any hope for basic but extensible interoperability, let alone the task of conveying such complex information to the user.
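To make the idea of a small common core with room for regional or operator-specific additions more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the three core terms, the prefix convention and the extension names are all hypothetical and are not taken from any existing vocabulary.

```python
# Hypothetical core terms that every consumer is assumed to understand.
CORE_TERMS = {"audience", "purpose", "retention_days"}


def split_policy(policy):
    """Separate core terms from prefixed extension terms."""
    core = {k: v for k, v in policy.items() if k in CORE_TERMS}
    extensions = {k: v for k, v in policy.items() if k not in CORE_TERMS}
    return core, extensions


def unknown_extensions(policy, understood_prefixes):
    """List extension terms whose prefix this consumer does not understand."""
    _, extensions = split_policy(policy)
    return sorted(
        term for term in extensions
        if term.split(":", 1)[0] not in understood_prefixes
    )


# Hypothetical policy attached to one profile attribute.
policy = {
    "audience": "friends",
    "purpose": "social-contact",
    "retention_days": 365,
    "eu:image-right": "no-republication",  # hypothetical regional extension
    "corp:ad-segment": "allowed",          # hypothetical operator extension
}

core, extensions = split_policy(policy)
not_understood = unknown_extensions(policy, understood_prefixes={"eu"})
print(sorted(core))
print(not_understood)
```

A consumer that only knows the core can still act on it, and can tell the user which additions it had to ignore. How such unknown terms should be treated (ignored, or treated as a reason to refuse processing) is exactly the kind of question a common effort would have to settle.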

An additional difficulty is that incidents are not always reported under the right label addressing the right concept. Attacks on reputation or misrepresentations are often reported under the privacy banner. This triggers reactions directed at the wrong values and applying the wrong tools. A good analysis of past incidents and of which assets and rights were affected would be of great help. Such an analysis would need an interdisciplinary approach, as is so often the case for issues generated by the web and its applications.

Finally, looking a bit ahead, social networks are a new way of communicating. Communication in a democratic society contributes to the making of opinions. Communicating in social networks is a particularly powerful way of making opinions. Consequently, social networks have some social responsibility. This responsibility translates into greater attention to the balance of power and fair communication practices. Again, policy languages are indispensable to achieving something in this area.

III. The edge

Having societal requirements alone is not sufficient either. Those requirements have to be constantly verified and checked against feasibility. Sometimes it takes very little to adapt a requirement socially while it would take a lot to adapt the technology to meet that requirement, and vice versa. An interdisciplinary approach may be slower in the beginning, but it will prove to be much more efficient in the end. Technologists and people from the social sciences have to come together to determine ways to accommodate the needs of our societies that were only slightly touched upon here.

As many others will suggest, one way to create a base with easy extensibility is the use of semantic web technologies. FOAF was surely a first step to express parts of the social graph in RDF. However, FOAF by itself is not the complete solution. Indeed, it does not contain policy language terms, although whether such terms should be added to FOAF or should be the subject of a separate vocabulary is a minor detail. There are also technical issues to overcome in making assertions on a graph representing an open world.
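As an illustration of what expressing a piece of the social graph in RDF looks like, the following Python sketch emits a few FOAF statements in N-Triples syntax using only the standard library. The terms foaf:Person, foaf:name and foaf:knows are part of FOAF; the example.org IRIs and the helper function names are hypothetical and chosen purely for illustration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def person_triples(person_iri, name, knows=()):
    """Return N-Triples lines describing one person and whom they know."""
    lines = [
        f"{iri(person_iri)} {iri(RDF_TYPE)} {iri(FOAF + 'Person')} .",
        f"{iri(person_iri)} {iri(FOAF + 'name')} {literal(name)} .",
    ]
    for friend_iri in knows:
        lines.append(
            f"{iri(person_iri)} {iri(FOAF + 'knows')} {iri(friend_iri)} ."
        )
    return lines


# Hypothetical example IRIs; these are not real profiles.
alice = "http://example.org/people/alice#me"
bob = "http://example.org/people/bob#me"

graph = person_triples(alice, "Alice", knows=[bob]) + person_triples(bob, "Bob")
print("\n".join(graph))
```

Because every statement is an independent triple, a separate policy vocabulary could add further statements about the same IRIs without changing the FOAF terms themselves.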

Reputation systems are well explored in closed systems. Extending them to an open world assumption has not yet seen wide discussion. How could we exchange reputations, how will we measure them and how will the user expect them to be conveyed?

There are many more attributes and requirements at stake. In order to create interoperable profile information that still preserves the business models of social networks, we have to distinguish the revenue-generating attributes of a person from those attributes that are relevant to the democratic society and thus protected by human rights (and consequently also by laws implementing that protection, which may contain sanctions against infringers). There may even be a way to de-personalize some of the revenue-generating properties of the social graph and thus ease their commercialization.

In the privacy arena, the Primelife project continues to explore the path paved by the PRIME project: exploring a policy language that can help with privacy-related data handling, obligations and access control. But Primelife is just a pilot in an overall system of using metadata to express socially desirable constraints on the usage of data. This will benefit users, as they will be able to regain some protection of their human rights. But it will also benefit the social networks and their commercial clients, as it will provide richer data that is less prone to abusive processing, which could otherwise lead to a scandal damaging the reputation of, and trust in, that social network. This trust, in turn, is the basis for healthy growth of the social network, so there is also a commercial interest in implementing and applying the user preferences expressed in this constraint-augmented profile data.
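The notion of constraint-augmented profile data can be sketched as follows. This is not the policy language of PRIME or Primelife; it is a deliberately small Python illustration, with hypothetical class, field and function names, of a constraint that stays attached to a data item when the item is exported and that a recipient can check before use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StickyPolicy:
    """Hypothetical constraint set that travels with a data item."""
    allowed_purposes: frozenset
    allowed_recipients: frozenset
    expires_on: date


@dataclass(frozen=True)
class ProfileItem:
    name: str
    value: str
    policy: StickyPolicy


def may_use(item, recipient, purpose, today):
    """Check a requested use against the policy attached to the item."""
    p = item.policy
    return (
        recipient in p.allowed_recipients
        and purpose in p.allowed_purposes
        and today <= p.expires_on
    )


def export_item(item):
    """Serialize value and policy together so the constraint stays attached."""
    return {
        "name": item.name,
        "value": item.value,
        "policy": {
            "allowed_purposes": sorted(item.policy.allowed_purposes),
            "allowed_recipients": sorted(item.policy.allowed_recipients),
            "expires_on": item.policy.expires_on.isoformat(),
        },
    }


# Hypothetical example: a casual remark meant only for friends, for one year.
remark = ProfileItem(
    name="status",
    value="Long night, short morning.",
    policy=StickyPolicy(
        allowed_purposes=frozenset({"conversation"}),
        allowed_recipients=frozenset({"friends"}),
        expires_on=date(2010, 1, 1),
    ),
)

exported = export_item(remark)
ok_for_friends = may_use(remark, "friends", "conversation", date(2009, 6, 1))
ok_for_employer = may_use(remark, "employer", "recruiting", date(2009, 6, 1))
```

Such a check only helps if the receiving side actually honors it; the sketch shows the data model, not the enforcement.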

With respect to the expressiveness of such constraints, there are still major gaps. As of today, we are not able to protect group conversations against world reading in a meaningful and healthy way. We know how to do fine-grained access control, but we do not know how to avoid being dragged into the (painful) public spotlight against our will. We want to gossip on those networks, but we do not want future employers to base their decisions on gossip we spread at some point in time, perhaps out of some passing mood. We want to have our reputation protected against false accusations. We want to prevent others from abusing our reputation for a purpose that we do not want to encourage.

As we can see, this joins the initiatives to express general constraints over data, better known as policy languages. PLING, the Policy Languages Interest Group, was created because there are already many policy languages, but the Workshop on Languages for Privacy Policy Negotiation and Semantics-Driven Enforcement showed that those are not interoperable. Social networks today suffer from this fragmentation more than others. They are encouraged to help us work for more interoperability in this field and also to give us more insight into the special needs of social networks in this area.