PhD Thesis - From User Behaviour to Collective Semantics
The World Wide Web has developed into an important platform for social interactions with the rise of social networking applications of different kinds. Collaborative tagging systems, as prominent examples of these applications, allow users to share their resources and to interact with each other. By assigning tags to resources on the Web in a collaborative manner, users contribute to the emergence of complex networks now commonly known as folksonomies, in which users, documents and tags are interconnected with each other.
To reveal the implicit semantics of entities involved in a folksonomy, one requires an understanding of the characteristics of the collective behaviours that create these interconnections. This thesis studies how user behaviours in collaborative tagging systems can be analysed to acquire a better understanding of the collective semantics of entities in folksonomies. We approach this problem from three different but closely related perspectives. Firstly, we study how tags are used by users and how their different intended meanings can be identified. Secondly, we develop a method for assessing the expertise of users and the quality of documents in folksonomies by introducing the notion of implicit endorsement. Finally, we study the relations between documents induced from collaborative tagging and compare them with existing hyperlinks between Web documents. We show that, in each of these scenarios, it is crucial to consider the collective behaviours of the users and the social contexts in order to understand the characteristics of the entities.
This project can be considered a case study of the Social Web, the research outcomes of which can be readily generalised to many other social networking applications. It also fits into the larger framework for understanding the Web set out by the emerging interdisciplinary field of Web Science, as the work involves analyses of the interactions and behaviour of Web users in order to understand how we can improve existing systems and facilitate information sharing and retrieval on the Web.
SPEAR (SPamming-Resistant Expertise Analysis and Ranking) Algorithm
The SPEAR (SPamming-Resistant Expertise Analysis and Ranking) algorithm was jointly developed by my collaborator Michael Noll from the Hasso Plattner Institute and me. It was first introduced in our paper 'On Measuring Expertise in Collaborative Tagging Systems', presented at the Web Science Conference 2009, and was later tested and discussed more thoroughly in our SIGIR 2009 paper 'Telling Experts from Spammers: Expertise Ranking in Folksonomies'. The details of the algorithm can be found in our papers and on the SPEAR Website established by Michael. In addition to having the chance to present our ideas at conferences, we have also been very fortunate to have our work mentioned in the media:
- 'New Ranking Algorithm Separates Digital Wheat from Chaff', September 2009.
Article on the Website of the Communications of the ACM
- 'Finding Better Friends: Delicious and SPEAR', August 2009.
Article on ReadWriteWeb
- 'How SPEAR Identifies Domain Experts within Delicious', August 2009.
Invited article for Yahoo!, published on the Delicious.com blog
- 'Speer gegen Spam' (in German), August 2009.
Michael's interview with 20 Minutes, the most popular daily newspaper in Switzerland
- 'A Better Way to Rank Expertise Online', July 2009.
Article in Technology Review, published by MIT