Authors: GeorgeAnadiotis, ThomasFranz, SusanneBoll

Use Case: Collaborative Tagging

1. Introduction

Tags are perhaps the simplest form of annotation: user-provided keywords that are assigned to resources in order to support subsequent retrieval. In itself, this idea is not particularly new or revolutionary: keyword-based retrieval has been around for a while. In contrast to the formal semantics provided by the Semantic Web standards, tags carry no semantic relations whatsoever, not even a hierarchy; they are just flat collections of keywords.

There are, however, two new dimensions that have boosted the popularity of this approach and given a new perspective on an old theme: low-cost applicability and collaborative tagging.

Tagging lowers the barrier to metadata annotation, since it requires minimal effort on the part of annotators: there are no special tools or complex interfaces that the user needs to become familiar with, and no deep understanding of logic principles or formal semantics is required, just some standard technical expertise. Tagging seems to work in a way that is intuitive to most people, as demonstrated by its widespread adoption, as well as by certain studies conducted in the field [5]. Thus, it helps bridge the 'semantic gap' between content creators and content consumers by offering 'alternative points of access' [5] to document collections.

The main idea behind collaborative tagging is simple: collaborative tagging platforms (or, alternatively, distributed classification systems, DCSs [4]) provide the technical means, usually via some sort of web-based interface, that support users in tagging resources. The important aspect is that they aggregate the collection of tags that an individual uses (his or her tag vocabulary, called a personomy [9]) into what has been termed a folksonomy: a collection of all personomies [7, 8].
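The relationship between personomies and a folksonomy can be illustrated with a minimal Python sketch. The (user, tag, resource) triple model, the function names, and the sample data are assumptions made for illustration only; they are not taken from any of the platforms discussed here.

```python
from collections import defaultdict

# Illustrative tag assignments as (user, tag, resource) triples.
assignments = [
    ("mary", "holidays", "photo-1"),
    ("mary", "beach", "photo-1"),
    ("sylvia", "holidays", "photo-7"),
    ("sylvia", "sunset", "photo-7"),
]


def personomies(triples):
    """Group assignments per user: user -> {tag -> set of resources}."""
    result = defaultdict(lambda: defaultdict(set))
    for user, tag, resource in triples:
        result[user][tag].add(resource)
    return result


def folksonomy(per_user):
    """Aggregate all personomies: tag -> set of (user, resource) pairs."""
    merged = defaultdict(set)
    for user, tags in per_user.items():
        for tag, resources in tags.items():
            for resource in resources:
                merged[tag].add((user, resource))
    return merged


per_user = personomies(assignments)
folk = folksonomy(per_user)
print(sorted(per_user["mary"]))   # Mary's tag vocabulary (her personomy)
print(sorted(folk["holidays"]))   # who used 'holidays', and on which resource
```

In this reading, a personomy is simply the per-user slice of the data, and the folksonomy is what a platform obtains by pooling all slices.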

Some of the most popular collaborative tagging systems are Delicious (bookmarks), Flickr (images), Last.fm (music), YouTube (video), Connotea (bibliographic information), steve.museum (museum items) and Technorati (blogging). Using these platforms is free, although in some cases users can pay for an upgraded account with more advanced features. The most prominent among them are Delicious and Flickr, for which some quantitative user studies are available [1, 2]. These studies document phenomenal growth, which indicates that, in real life, tagging is a very viable solution for annotating any type of resource.

2. Motivating Scenario

Let us view some of the current limitations of tag-based annotation, by examining a motivating example:

Let's suppose that user Mary has an account on platform S1, which specializes in images. Mary has been using S1 for a while, so she has progressively built a large image collection, as well as a rich vocabulary of tags (personomy).

Another user, Sylvia, who is Mary's friend, is using a different platform, S2, to annotate her images. At some point, Mary and Sylvia attended the same event, and each took some pictures with her own camera. As each user has her reasons for choosing a preferred platform, neither of them would like to change. They would, however, like to be able to link to each other's annotated pictures where applicable: since the pictures were taken at the same time and place, it can be expected that some of them are annotated in a similar way (with the same tags), even by different annotators. Such pictures may therefore (within the boundaries of word ambiguity) be about the same topic.
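One simple way to find such candidate links is to compare the tag sets of pictures across the two platforms. The sketch below uses Jaccard similarity over case-folded tags; the choice of measure, the threshold of 0.5, and the picture identifiers are assumptions for illustration, and the scenario itself does not prescribe any particular matching method.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize(tags):
    # Case-fold and strip whitespace so 'Holidays' and 'holidays ' match.
    return {t.strip().casefold() for t in tags}


def link_candidates(mine, theirs, threshold=0.5):
    """Return (my_id, their_id, score) for pairs whose tag overlap
    meets the threshold, best matches first."""
    pairs = []
    for my_id, my_tags in mine.items():
        for their_id, their_tags in theirs.items():
            score = jaccard(normalize(my_tags), normalize(their_tags))
            if score >= threshold:
                pairs.append((my_id, their_id, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical pictures on S1 (Mary) and S2 (Sylvia) with their tags.
mary_s1 = {
    "s1/img1": ["wedding", "Anna", "cake"],
    "s1/img2": ["holidays", "beach"],
}
sylvia_s2 = {
    "s2/a": ["Wedding", "cake", "anna"],
    "s2/b": ["cat"],
}

links = link_candidates(mary_s1, sylvia_s2)
print(links)
```

Matching on tag strings alone cannot resolve ambiguous words, which is exactly the caveat noted in the scenario.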

In the course of time, Mary also becomes interested in video and starts shooting some of her own. As her personal video collection begins to grow, she decides to start using another collaborative tagging system, S3, which specializes in video, in order to organize it better. Since she has already built a rich personomy in S1, she would naturally like to reuse it in S3 to the extent possible: while some of the tags may not be appropriate, as they represent one-off ('29-08-06') or photography-specific ('CameraXYZ') use, others could well be reused across modalities and domains when they represent high-level concepts ('holidays'). So if Mary has both video and photographic material of some event, she would like to be able to reuse her S1 personomy, at least partially, on S3 as well.
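The distinction between one-off, modality-specific, and reusable tags could be approximated with a few heuristics. The following sketch is only one possible reading: the date pattern, the minimum-use threshold, the user-supplied exclusion list, and the usage counts are all assumptions introduced for illustration.

```python
import re

# Heuristic assumption: date-like tags such as '29-08-06' are one-off.
DATE_LIKE = re.compile(r"^\d{1,4}[-./]\d{1,2}[-./]\d{1,4}$")


def reusable_tags(tag_counts, min_uses=2, excluded=()):
    """Pick tags that look worth carrying over to another platform.

    tag_counts: mapping of tag -> number of resources it was applied to.
    excluded:   tags the user marks as modality-specific (e.g. camera models).
    """
    blocked = {t.casefold() for t in excluded}
    keep = []
    for tag, count in tag_counts.items():
        if DATE_LIKE.match(tag):
            continue  # date-like, probably tied to a single event
        if tag.casefold() in blocked:
            continue  # the user says it only makes sense for photos
        if count < min_uses:
            continue  # used too rarely to count as a lasting concept
        keep.append(tag)
    return sorted(keep)


# Hypothetical usage counts from Mary's S1 personomy.
s1_counts = {"29-08-06": 14, "CameraXYZ": 120, "holidays": 45, "blurry": 1}
selected = reusable_tags(s1_counts, excluded=["CameraXYZ"])
print(selected)
```

In practice the user would probably want to review the selection, since no heuristic can reliably tell a high-level concept from a platform-specific one.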

3. Issues

The above scenario demonstrates limitations of tag-based systems with respect to personomy reuse:

Media does not reside only on Internet platforms; it is most likely maintained on a local computer first. Local organizational structures, however, cannot easily be transferred to a tagging platform. The opposite holds as well: a personomy maintained on a tagging platform cannot easily be reused on a desktop computer.

Personomy reuse is currently not easily possible, as each platform uses ad-hoc solutions and only provides tag navigation within its own boundaries: there is no standardization that regulates how tags and the relations between tags, users, and resources are represented. Because of this lack of standardization, further technical issues become visible through the application programming interfaces provided by some tagging platforms.

4. Possible Solutions

When it comes to interoperability, standards-based solutions have repeatedly proven successful in bridging different systems. This could also be the case here, as a standard for expressing personomies and folksonomies would enable interoperability across platforms. On the other hand, use of a standard should not enforce changes in the way tags are handled internally by each system; it simply aims to function as a bridge between different systems. The question is then, what standard?

We may be able to answer this question if we consider a personomy as a concept scheme: tags used by an individual express his or her expertise, interests and vocabulary, thus constituting the individual's own concept scheme. A recent W3C standard that has been designed specifically to express the basic structure and content of concept schemes is SKOS Core [6]. The SKOS Core Vocabulary is an application of the Resource Description Framework (RDF), which can be used to express a concept scheme as an RDF graph. Using RDF allows data to be linked to and/or merged with other RDF data by Semantic Web applications.
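To make the idea of a personomy as a concept scheme more concrete, the sketch below serializes a list of tags as a SKOS concept scheme in Turtle syntax, using only the Python standard library. The URI layout under `http://example.org/` and the function name are hypothetical; the SKOS terms used (`skos:ConceptScheme`, `skos:Concept`, `skos:prefLabel`, `skos:inScheme`) come from the SKOS Core Vocabulary. This is one possible mapping, not the one defined by the tag ontology in [3].

```python
from urllib.parse import quote

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def personomy_to_turtle(base_uri, owner, tags):
    """Serialize a user's tags as a SKOS concept scheme in Turtle.

    base_uri and the URI layout are illustrative assumptions.
    """
    scheme = f"<{base_uri}{quote(owner)}/personomy>"
    lines = [
        f"@prefix skos: <{SKOS_NS}> .",
        "",
        f"{scheme} a skos:ConceptScheme .",
    ]
    for tag in sorted(set(tags)):
        # One concept per distinct tag, with the tag text as its label.
        concept = f"<{base_uri}{quote(owner)}/tag/{quote(tag, safe='')}>"
        # Escape backslashes and double quotes for the Turtle literal.
        label = tag.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"{concept} a skos:Concept ;")
        lines.append(f'    skos:prefLabel "{label}" ;')
        lines.append(f"    skos:inScheme {scheme} .")
    return "\n".join(lines) + "\n"


turtle = personomy_to_turtle(
    "http://example.org/", "mary", ["holidays", "beach"]
)
print(turtle)
```

Because the result is plain RDF, it could in principle be merged with the output for another platform or linked to existing ontologies by any RDF-aware tool.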

Expressing personomies and folksonomies using SKOS is a good match for promoting a standard representation for tags, as well as for integrating tag representation with Semantic Web standards: not only does it enable the expression of personomies in a standard format that fits semantically, but it also allows mixing personomies with existing Semantic Web ontologies. There is already a publicly available SKOS-based tagging ontology that can be built on [3], as well as some existing efforts to induce an ontology from collaborative tagging platforms [10].

Ideally, we would expect existing collaborative tagging platforms to build on a standard representation for tags in order to enable interoperability and offer this as a service to their users. In practice, however, even if such a representation were eventually adopted as a standard, we expect that both technical and political reasons could hinder its adoption by the platforms. A different strategy that may be able to deal with this issue would be to implement this as a separate service that integrates disparate collaborative tagging platforms based on such an emerging standard for tag representation, in the spirit of Web 2.0 mashups. This service could either be provided by a third party or be self-hosted by individual users, in the spirit of [11, 12].
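The core of such an integration service could be a set of small adapters that translate each platform's ad-hoc export into one common representation. The sketch below illustrates this under stated assumptions: the two export shapes for S1 and S3 are invented for the example, and no real platform API is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagAssignment:
    """Common representation shared by all adapters."""
    user: str
    tag: str
    resource: str
    platform: str


def from_s1(export):
    """Adapter for S1 (assumed shape): list of
    {'owner': ..., 'url': ..., 'tags': [...]} records."""
    for rec in export:
        for tag in rec["tags"]:
            yield TagAssignment(rec["owner"], tag.casefold(), rec["url"], "S1")


def from_s3(export):
    """Adapter for S3 (assumed shape): list of
    (user, video_id, 'space separated tags') tuples."""
    for user, video_id, tag_string in export:
        for tag in tag_string.split():
            yield TagAssignment(user, tag.casefold(), video_id, "S3")


def merged_personomy(user, *sources):
    """Combine one user's assignments from several platforms
    into a mapping of tag -> sorted list of resources."""
    merged = {}
    for source in sources:
        for a in source:
            if a.user == user:
                merged.setdefault(a.tag, set()).add(a.resource)
    return {tag: sorted(res) for tag, res in sorted(merged.items())}


# Hypothetical exports for Mary from an image and a video platform.
s1_export = [{"owner": "mary", "url": "s1/img2", "tags": ["Holidays", "beach"]}]
s3_export = [("mary", "s3/vid9", "holidays surfing")]

combined = merged_personomy("mary", from_s1(s1_export), from_s3(s3_export))
print(combined)
```

Keeping the adapters separate from the merge step means that each platform can continue to handle tags internally as it wishes, which matches the bridging role described above.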

5. References

[1] HitWise Intelligence: Del.icio.us Traffic More Than Doubled Since January

http://weblogs.hitwise.com/leeann-prescott/2006/08/delicious_traffic_more_than_do.html

[2] Nielsen//NetRatings: User-Generated Content Drives Half of U.S. Top 10 Fastest Growing Web Brands, According to Nielsen//NetRatings

http://www.nielsen-netratings.com/pr/PR_060810.PDF

[3] Richard Newman, Danny Ayers, Seth Russell: Tag Ontology. http://www.holygoat.co.uk/owl/redwood/0.1/tags/

[4] Ulises Ali Mejias, 2005: Tag literacy

http://ideant.typepad.com/ideant/2005/04/tag_literacy.html

[5] J. Trant (2006), "Exploring the potential for social tagging and folksonomy in art museums: proof of concept", New Review of Hypermedia and Multimedia, 2006

[6] SKOS Core: http://www.w3.org/2004/02/skos/core/

[7] Mathes, A., Folksonomies - Cooperative Classification and Communication Through Shared Metadata. Computer Mediated Communication - LIS590CMC, Graduate School of Library and Information Science, University of Illinois Urbana-Champaign, 2004

http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html

[8] Smith, G. “Atomiq: Folksonomy: social classification.” Aug 3, 2004

http://atomiq.org/archives/2004/08/folksonomy_social_classification.html

[9] Hotho, A., Jäschke, R., Schmitz, C., Stumme, G. (2006): Information Retrieval in Folksonomies: Search and Ranking. 3rd European Semantic Web Conference, 11-14 June 2006, Budva, Montenegro

http://www.kde.cs.uni-kassel.de/hotho/pub/2006/seach2006hotho_eswc.pdf

[10] Schmitz, P. "Inducing Ontology from Flickr Tags", paper for the Collaborative Web Tagging Workshop at WWW2006, Edinburgh, 22 May 2006

http://www.rawsugar.com/www2006/22.pdf

[11] Koivunen, M. Annotea and Semantic Web Supported Collaboration, in Proc. of the ESWC2005 Conference, Crete, May 2005.

http://www.annotea.org/eswc2005/01_koivunen_final.pdf

[12] Segawa, O. 2006. Web annotation sharing using P2P. In Proceedings of the 15th International Conference on World Wide Web (Edinburgh, Scotland, May 23 - 26, 2006). WWW '06. ACM Press, New York, NY, 851-852.

http://doi.acm.org/10.1145/1135777.1135910