A Case Study of Collaboration and Reputation in Social Web Search
Summary
1. INTRODUCTION
- The scale of the Web and the heterogeneous nature of its content [Signorini and Gulli 2005] introduce many significant information discovery challenges.
- Part of the problem rests with the searchers themselves: with an average of only 2-3 terms [Lawrence and Giles 1998; Spink and Jansen 2004], the typical Web search query is often vague with respect to the searcher’s true intentions or information needs [Song et al. 2007].
- Moreover, searchers sometimes choose query terms that are not well represented in the page that they are seeking and so simply increasing the length of queries will not necessarily improve search performance.
- This paper focuses on the second idea, that of collaboration.
2 · McNally et al.
- Information discovery now spans the traditional world of web search and the information-sharing world of social networks, referred to here as discovery worlds.
- Only a few years ago, the majority of people located information of interest through their favourite mainstream search engine; increasingly, information is also discovered through social networks.
- This shift in information discovery habits has led to an explosion in the number and variety of new social-search services, all of which can influence users' information discovery activities, bringing the world of web search and social networks even closer together (see Figure 1).
- A key contribution of this paper is a detailed description of a recent live-user trial of HeyStaks in order to understand the usage and collaboration patterns of users and also the quality of HeyStaks’ social recommendations relative to the organic results of mainstream search engines.
2. BACKGROUND
- This paper focuses on discussing HeyStaks as a collaborative information retrieval technology, augmented by a reputation system based on the collaborations that implicitly take place between searchers in the HeyStaks social search utility.
- As such this background section covers recent, relevant work in the two broad areas of collaborative information retrieval and reputation systems.
2.1 Collaborative Information Retrieval
- Approaches to collaborative information retrieval can be usefully distinguished in terms of two important dimensions, time — synchronous versus asynchronous search — and place — that is, co-located versus remote searchers.
- Co-located systems offer a collaborative search experience for multiple searchers at a single location, typically sharing a single PC [Amershi and Morris 2008; Smeaton et al. 2008], whereas remote approaches allow searchers to perform their searches at different locations across multiple devices [Morris and Horvitz 2007a; 2007b; Smyth et al. 2009b].
- The former enjoy the obvious benefit of an increased faculty for direct collaboration that is enabled by the face-to-face nature of co-located search, while the latter offer a greater opportunity for collaborative search.
ACM Transactions on Intelligent Systems and Technology, Vol. 2, No. 3, 09 2001.
- Once again, preliminary studies speak to the potential for such an approach to improve overall search productivity and collaboration, at least in specific types of information access tasks.
- The iBingo system allows a group of users to collaborate on an image search task, with each user using an iPod Touch device as their primary search/feedback device (although conventional PCs appear to be just as applicable).
- SearchTogether supports synchronous collaborative search by allowing searchers to invite others to join in specific search tasks, allowing cooperating searchers to synchronously view the results of each others’ searches via a split-screen style results interface.
- Work by Pickens et al. [2008] describes an approach to collaborative search that is more tightly integrated with the underlying search engine resource so that the operation of the search engine is itself influenced by the activities of collaborating searchers.
2.2 Reputation Systems
- Recently there has been considerable interest in reputation systems to provide a mechanism to evaluate user reputation and inter-user trust across a growing number of social web and e-commerce applications [Jøsang and Golbeck 2009; O’Donovan and Smyth 2005; 2006; Sabater and Sierra 2005; Resnick and Zeckhauser 2002; Resnick et al. 2000].
- This work is, in part, motivated by the idea that an understanding of user reputation can serve as the basis for strategies to guard against malicious users [Lazzari 2010; Hoffman et al.].
- Here, the authors present a brief review of the work that has been undertaken in this regard.
- Jøsang et al. [2007] confirm this, stating that such systems require manual curation and protection from malicious users.
- Unlike in conventional reputation systems such as eBay's, reputation here is not calculated by examining feedback received directly from users.
- The collaborative filtering algorithm is modified to add a user-user trust score to complement the normal profile- or item-based similarity score, so that recommendation partners are chosen from users who are not only similar to the target user but who also have a positive recommendation history with that user.
- Using this metric, average prediction error is improved by 22%.
- Similar to O’Donovan and Smyth [2005], Massa and Avesani [2007] propose a reputation algorithm called MoleTrust that can be used to augment an existing collaborative filtering system.
- Other recent research has examined reputation systems employed in social networking platforms.
- Applying reputation globally affords malicious users influence over the entire system, which adds to its vulnerability.
3. HEYSTAKS: A SOCIAL SEARCH UTILITY
- In designing HeyStaks, the authors' primary goal is to provide social Web search enhancements while allowing searchers to continue to use their favourite search engine.
- First, it allows users to create search staks, a type of folder for their search experiences; the stak creator can invite initial members by providing their email addresses.
- As shown in Figure 2, HeyStaks takes the form of two basic components: a client-side browser toolbar and a back-end server.
- In the following sections the authors review how HeyStaks captures search activities within search staks and how this search knowledge is used to generate and filter result recommendations at search time; further technical detail can be found in [Smyth et al. 2009a; 2009b].
3.1 Profiling Stak Pages
- In HeyStaks each search stak (S) serves as a profile of the search activities of the stak members.
- A number of primary actions are facilitated, for example Selections (or click-thrus): a user selects a search result (whether organic or recommended).
- It is also a weak indicator of relevance, because users will frequently select pages that turn out to be irrelevant.
- In this way, each result page is associated with a set of term data (query terms and/or tag terms) and a set of usage data (the selection, tag, share, and voting counts).
- At search time, recommendations are produced in a number of stages: first, relevant results are retrieved and ranked from the stak index; next, these recommendation candidates are filtered based on the usage evidence to eliminate noisy recommendations; and, finally, the remaining results are added to the Google result-list according to a set of recommendation rules.
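The three-stage pipeline above (retrieve and rank from the stak index, filter by usage evidence, then promote into the engine's result list) can be sketched as follows. All names, thresholds, and the shape of the index are illustrative assumptions, not HeyStaks' actual API:

```python
def recommend(query, search_stak_index, usage_evidence, organic_results,
              min_evidence=2, max_promotions=3):
    """Sketch of the retrieve -> filter -> promote recommendation pipeline."""
    # Stage 1: probe the stak index with the query; the index is assumed to
    # be a callable returning (url, relevance_score) pairs.
    candidates = sorted(search_stak_index(query), key=lambda c: c[1], reverse=True)
    # Stage 2: drop candidates without enough usage evidence (selections,
    # tags, shares, votes) -- a stand-in for HeyStaks' evidence model.
    filtered = [url for url, _ in candidates
                if usage_evidence.get(url, 0) >= min_evidence]
    # Stage 3: insert the top survivors ahead of the organic results.
    promoted = filtered[:max_promotions]
    return promoted + [r for r in organic_results if r not in promoted]
```

In this sketch the evidence filter runs before ranking cut-off, mirroring the text's point that noisy candidates are eliminated before results are added to the Google result-list.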
3.2 Retrieval & Ranking
- Briefly, there are two types of recommendation candidates: primary recommendations are results that come from the active stak St; whereas secondary recommendations come from other staks in the searcher’s stak-list.
- To generate these recommendation candidates, the HeyStaks server uses the current query qt as a probe into each stak index, Si, to identify a set of relevant stak results R(Si, qt).
- Each candidate result, r, is assigned a relevance score using a TF*IDF-based retrieval function as per Equation 2, which serves as the basis for an initial recommendation ranking.
- Staks are inevitably noisy, in the sense that they will frequently contain results that are not on topic.
- The precise details of this model are beyond the scope of this paper but suffice it to say that any results which do not meet the necessary evidence thresholds are eliminated from further consideration; further detail can be found in [Smyth et al. 2009a; 2009b].
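As a rough illustration of the kind of TF*IDF scoring that Equation 2 describes, the sketch below weights each query-term hit by its frequency in the candidate result's profile and its inverse document frequency across the stak. This is a generic reconstruction under stated assumptions; the paper's exact formulation is in the cited Smyth et al. papers.

```python
import math

def tfidf_score(query_terms, result_terms, stak, n_results):
    """Generic TF*IDF relevance score for one candidate result.

    result_terms: {term: count} profile of the candidate result.
    stak: {url: {term: count}} -- every result profile in the stak index.
    """
    score = 0.0
    for t in query_terms:
        tf = result_terms.get(t, 0)
        if tf == 0:
            continue  # term absent from this result's profile
        # Document frequency: how many stak results mention the term.
        df = sum(1 for profile in stak.values() if t in profile)
        idf = math.log(n_results / df)
        score += tf * idf
    return score
```

Note that a term appearing in every stak result contributes nothing (idf = 0), which is the usual TF*IDF behaviour.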
3.3 Summary Discussion
- HeyStaks is designed to help users to collaborate during Web search tasks and, importantly, it succeeds in integrating collaborative recommendation techniques with mainstream search engines.
- In the next section the authors introduce their user reputation model, which is based on the collaboration events that inherently occur between users who share their search experiences.
- In turn, the authors show how this model can be employed to further enhance the quality of recommendations provided by HeyStaks by using reputation to influence the ranking of recommended results.
4. A REPUTATION MODEL FOR SOCIAL SEARCH
- The many different types of activity that a user can perform on a web page (click-thrus, tagging, voting, sharing) are ultimately combined and leveraged by HeyStaks to make recommendations at search time.
- Intuitively, the authors might expect that some users are more experienced searchers than others and, as such, perhaps their activities should be considered more reliable at recommendation time.
- This is particularly important given the potential for malicious users to disrupt stak quality by introducing dubious results to a stak.
- If unchecked, this type of gaming has the potential to significantly degrade recommendation quality; see also recent related research on malicious users and robustness by the recommender systems community [Bryan et al.].
- In the following section, the authors describe how user activities in HeyStaks can be harnessed to generate a computational model of user reputation, based on the collaboration events that naturally occur between HeyStaks users who share their search experiences.
4.1 From Activities to Reputation
- It seems natural that the reputation of searchers should be linked to the search knowledge that they contribute to HeyStaks.
- Each activity on the part of users causes the creation of new search knowledge.
- If the target page is new to a stak, then its selection, sharing, voting, or tagging will cause it to be added to the stak for the first time.
- Under the heading of “more search knowledge is better than less search knowledge” it might make sense to model reputation as a direct function of the sheer volume of activity that a given searcher engages in.
- On the contrary, one of the major concerns in any social recommender is the potential for malicious users to game the system.
4.2 Reputation as Collaboration
- The long-term value of HeyStaks as a social search service depends critically on the ability of users to benefit from its quality search knowledge and if, for example, all of the best search experiences are tied up in private staks and never shared, then this long-term value will be greatly diminished.
- The key idea is that, ultimately, the quality of shared search knowledge can be estimated by looking at the frequency of search collaborations within HeyStaks.
- In other words, the producer created search knowledge that was deemed to be relevant enough to be recommended and useful enough for the consumer to act upon it.
- This collaboration-based model of reputation incentivizes users not just to create search knowledge of high quality but also to share it with others.
4.3 A Computational Model of Reputation
- The conferral of reputation by a single consumer on a single producer (Figure 4(a)) is the simplest case of their reputation model.
- A specific producer may have been the first to select the result in a given stak, but subsequent users may have selected it for different queries, or they may have voted on it or tagged it or shared it with others independently of its other producers.
- In this way reputation is shared equally among its k contributing producers; see Figure 5 for an example of how user reputation can evolve over time.
- Given the formulation of the reputation model, some protection against malicious activity is inherently provided because users only benefit if their results are recommended and selected by other users.
- The problem is that the current reputation model distributes reputation equally among all producers.
- The intuition is that a producer should receive more reputation if many of their past contributions have been consumed by other users, but less reputation if most of their contributions have not been consumed.
- More formally, for a producer pi, let nt(pi, t− 1) be the total number of distinct results that this user has added to the stak in question prior to time t; remember that pi refers to a single user and a specific stak.
- Further, let nr(pi, t − 1) be the number of these results that have been subsequently recommended and consumed by other users.
- Accordingly, if a producer has a high consumption ratio it means that many of their contributions have been consumed by other users, suggesting that the producer has consistently added useful content to the stak.
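The consumption ratio described above, nr(pi, t−1) / nt(pi, t−1), might be computed as in the sketch below. The function and argument names are hypothetical; it simply counts what fraction of a producer's distinct contributions to a stak have since been consumed by other users:

```python
def consumption_ratio(contributed_urls, consumed_urls):
    """Fraction of a producer's stak contributions later consumed by others.

    contributed_urls: distinct results this producer added to the stak (n_t).
    consumed_urls: results that were subsequently recommended and acted
    on by other users; only the overlap with the producer's own
    contributions counts towards n_r.
    """
    n_t = len(set(contributed_urls))
    if n_t == 0:
        return 0.0  # no contributions yet, so no reputation evidence
    n_r = len(set(contributed_urls) & set(consumed_urls))
    return n_r / n_t
```

A high ratio means most of the producer's additions proved useful to others, and so, per the model, they should receive a larger share of conferred reputation.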
4.4 Reputation and Result Recommendation
- In the previous sections the authors have described a reputation model for users.
- In this section the authors describe how this reputation information can be used to produce better recommendations at search time.
- One option is to simply add the reputation scores of the producers.
- In their work the authors have found a third option to work best.
- Thus, as long as at least some of the producers are considered reputable then this result will receive a high reputation score, even if many of the producers have low reputation scores.
5. EVALUATION
- The authors describe a live-user trial of HeyStaks, designed to evaluate the utility of HeyStaks' brand of collaborative search in fact-finding, information discovery tasks.
- In addition the authors also have the opportunity to evaluate the potential benefits of their new reputation model when it comes to boosting the relevance of HeyStaks’ default promotions.
- It is worth highlighting that this present evaluation complements earlier evaluations of HeyStaks such as that carried out by Smyth et al. [2009b].
- These earlier evaluations had the benefit of being open-ended trials, following users during routine search tasks, but were limited in their ability to evaluate the relevance of HeyStaks recommendations.
- Instead, these earlier evaluations reported on typical usage by HeyStaks users, focusing on stak creation and sharing behaviour.
5.1 Dataset and Methodology
- The authors' experiment involves 64 first-year undergraduate university students with varying degrees of search expertise.
- It was highly unlikely that students would be able to answer any significant number of these questions from their own general knowledge and so the purpose of this experiment was to look at how the students used HeyStaks and Google to help them answer these questions.
- The solitary staks served as a straightforward benchmark to evaluate the search effectiveness of individual users in a non-collaborative search setting, whereas the different sizes of shared staks provided an opportunity to examine the effectiveness of collaborative search across a range of different group sizes.
- During the 60 minute trial a total of 3,124 queries and 1,998 result activities (selections, tagging, voting, popouts) were logged, and 724 unique results were selected.
- Result pages were categorised as: not relevant (the page content had no relevance with respect to a question); partially relevant (the page contains an implicit reference to the answer, or to part of the answer, to a question); or relevant (the page contains an answer to a question).
- Figure 6(b) shows a relevance breakdown of the result pages logged during the course of the trial.
- 66% of result pages acted on were categorised as being not relevant with respect to the questions posed, while only 14% were deemed relevant.
- These findings demonstrate the difficulty of the questions presented as mentioned above.
- The authors will return to this relevance information later in this section when they use it to evaluate the relevance of HeyStaks recommendations.
5.2 Research Questions
- Using this trial data the authors can explore a number of important questions pertaining to the benefits, or otherwise, of social web search and the value of reputation during result recommendation.
- To answer these questions the authors look at the outcome of the quiz as the core search task.
5.3 Quiz Performance
- To begin with it is worth looking at the overall performance of students during the quiz as a basic outcome measure for this search task.
- These results point to the benefit of sharing and collaboration during this search task.
- By comparison, the median values across shared staks are between 5.5 and 8 questions attempted and between 4 and 7 questions correctly answered.
- In general the influence of stak size is less clear in terms of these measures of overall performance.
- It is likely that the search expertise of individual users is playing a role here and, as such, a simple measure such as stak size is unlikely to be a powerful predictor of overall performance, given the variation in expertise that likely exists between the individual members of a stak.
5.4 Search Queries & Result Activities
- The authors have presented evidence above to show how the members of their shared staks perform better than solitary searchers in their search task.
- The authors' key hypothesis is that this is due, at least in part, to the benefits of the type of search collaboration that HeyStaks is designed to facilitate.
- Specifically, the authors posit that the members of shared staks will benefit from relevant results, promoted due to the activities of other stak members, results that might otherwise be difficult to find.
- The authors will look in more detail at these promotions in the next section but first it is useful to look at the level of granular search activity across the different search staks.
- The authors view the number of queries submitted by a searcher as a proxy for their search effort, and the number of activities (result selections, tagging, etc.) they generate as an indicator of the relevance of the results returned for these queries.
- Now the authors can see a very significant difference between the activities per query for the solitary searchers (approximately 0.4 activities per query) and the collaborating searchers in the shared staks (approximately 0.6 – 0.8 activities per query).
- The former benefit significantly from results that are, apparently at least, more relevant than those experienced by the latter.
- In the case of the former, on average they correctly answer 0.044 questions per query, but for the latter this ratio increases to 0.15.
5.5 Recommendations & Relevance
- Given that the members of shared search staks seem to be enjoying improved search productivity when compared to their solitary counterparts, the authors now turn their attention to the likely source of this improvement: the recommendations that are generated by HeyStaks.
- To begin with it is worth looking at how often HeyStaks is able to recommend results to the members of the different staks.
- This is presented in Figure 10 as the percentage of queries that result in at least one HeyStaks recommendation.
- As expected, larger staks mean more recommendations, because there are more search experiences to act as a source of recommendations.
- For the solitary staks, the authors find that only 16% of the queries lead to recommendations, and, by definition, these recommendations are due to the solitary searcher submitting queries that are similar to those they have used previously.
- For the 5-person stak, nearly 40% of queries lead to recommendations, growing to over 62% for the largest 25-person stak.
- Comparing the graphs for the recommended results versus the organic results the authors can see a significant relevance benefit for the former.
- Similarly, the authors find that, on average, 41% of the organic result activities are for not relevant results compared to only 21% for the recommended result activities.
- To better quantify this relevance benefit the authors compute a relevance ratio for organic and recommended results as per Equation 9: relevance ratio = a_r / a_nr, i.e. the ratio of activities on relevant results to activities on not-relevant results.
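Equation 9 is then simple arithmetic; assuming a_r counts user activities on relevant results and a_nr counts activities on not-relevant results (an interpretation of the garbled equation, not a quote from the paper):

```python
def relevance_ratio(a_r, a_nr):
    """Relevance ratio: activities on relevant results divided by
    activities on not-relevant results. A ratio above 1.0 means users
    acted on relevant results more often than on not-relevant ones."""
    return a_r / a_nr
```

Computed separately for organic and recommended results, a higher ratio for the recommended set would quantify the relevance benefit the text describes.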
- This stak is the best performer (e.g. more questions answered correctly per user), most likely because its members are better searchers to begin with.
5.6 Searcher Reputation
- The results of the previous section highlight the potential benefits of the HeyStaks form of collaborative web search in the context of the target search task.
- Recommended results turned out to be significantly more relevant, according to their independent relevance metric, than conventional organic results.
- Even absent overtly malicious users, recommendation quality can degrade if prolific, but inexperienced, searchers contribute large quantities of irrelevant results to a stak.
- Clearly there is a diverse range of reputation scores across all of these users.
- Figure 14 plots the reputation score of a user versus the number of distinct results contributed to collaboration events by that user.
- Members of the 9-person stak achieve a very similar median reputation score (14.5) to that of the 19-person stak (14.9), despite having ten fewer members.
- The authors know from their earlier performance results that the users in the 9-person stak perform particularly well, both in terms of their quiz performance (e.g. median questions correct per queries submitted) and the relevance of their search results.
- It is also interesting to look at how reputation builds during the course of the trial.
- To examine this the authors note the number of users with non-zero reputation score at 5-minute intervals during the trial; they do this retrospectively by analysing the collaboration logs.
- The authors see a consistent reputation profile across the 4 staks with reputation beginning to accumulate from an early stage, albeit more slowly, as expected, for the 5-person stak.
5.7 Reputation for Recommendation Ranking
- As discussed previously the motivation for incorporating a reputation model into the HeyStaks recommendation engine is to provide a way for searcher expertise to influence recommendation.
- In principle, by increasing the reputation threshold in this way the authors should experience an improvement in recommendation quality, but at the same time it will reduce recommendation coverage — the number of recommendations that can be made — because none of the recommendations for certain queries will exceed the threshold.
- The results of this experiment are presented in Figure 17(b) as the relative benefit (percentage increase in relevance ratio) of reputation-based ranking, in comparison to the default HeyStaks recommendation ranking, for different values of the reputation weight (w), from 0 to 1, and for 3 different reputation thresholds (0, 0.3, and 0.5).
- As the reputation weight is increased, initially the authors see a rapid increase in its relative benefit score but as the reputation weight exceeds 0.6 they see relative benefit fall back as it begins to over-influence the recommendation rankings.
- This holds at least in this experiment, most likely because of the limits of the trial.
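One plausible reading of reputation-based ranking with a reputation weight w and a reputation threshold is a linear blend of relevance and reputation after thresholding; the blend rule is an assumption, as the summary does not give the exact combination:

```python
def rerank(candidates, w=0.4, rep_threshold=0.3):
    """Sketch of reputation-weighted re-ranking.

    candidates: list of (url, relevance, reputation) tuples, with both
    scores normalised to [0, 1]. Raising rep_threshold should improve
    quality but reduces coverage, since some queries lose all their
    candidates; raising w lets reputation dominate the ranking.
    """
    # Threshold: drop candidates whose producers lack reputation.
    kept = [(url, rel, rep) for url, rel, rep in candidates
            if rep >= rep_threshold]
    # Blend: linear combination of relevance and reputation.
    scored = [(url, (1 - w) * rel + w * rep) for url, rel, rep in kept]
    return [url for url, _ in sorted(scored, key=lambda x: x[1], reverse=True)]
```

Under this sketch the trade-off reported in the text is visible directly: a higher threshold filters more candidates (lower coverage), and a weight beyond a certain point lets reputation over-influence the ordering.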
5.8 Limitations & Results Summary
- In this evaluation the authors have described the results of a live-user trial of HeyStaks.
- Fact-finding is, for example, just one of the many reasons why users avail of search engines, and there is clearly an opportunity for further work to broaden the evaluation to cover more open-ended search and discovery tasks; preliminary results for such open-ended evaluations have been presented elsewhere in Smyth et al. [2009b].
- Given these trial limitations, the outcome of their evaluation has been very positive.
- These recommendations effectively amplified the relevance of results selected by search leaders and benefitted search followers accordingly.
- Prior work (e.g. Keane et al. [2008]) shows that searchers focus on top-ranked results and, as such, it is generally accepted that if one can produce rankings where top-ranked results are more relevant, then these rankings are likely to meet with a better user response.
6. CONCLUSIONS
- Many of users' information needs are now being met by sharing through social networks as much as through queries to search engines.
- As web search evolves, there is a significant opportunity for search engines to embrace this more social and collaborative model of information discovery.
Frequently Asked Questions
Q2. What is the main focus of this paper?
The focus of this paper is the HeyStaks search service (www.heystaks.com), which adds a layer of collaboration on top of mainstream search engines: so users continue to search as normal but benefit from a more collaborative/social search experience.
Q3. What other forms of action must the user choose to use?
The authors refer to the three other forms of action (voting, sharing, tagging) as explicit actions, in the sense that they are not part of the normal search process but are HeyStaks-specific actions that the user must choose to use.
Q4. What is the problem with Web search?
Part of the problem rests with the searchers themselves: with an average of only 2-3 terms [Lawrence and Giles 1998; Spink and Jansen 2004], the typical Web search query is often vague with respect to the searcher’s true intentions or information needs [Song et al. 2007].
Q5. How is the relevance score assigned to a candidate result?
Each candidate result, r, is assigned a relevance score using a TF*IDF-based retrieval function as per Equation 2, which serves as the basis for an initial recommendation ranking.
Q6. What is the name of the algorithm?
Similar to O’Donovan and Smyth [2005], Massa and Avesani [2007] propose a reputation algorithm called MoleTrust that can be used to augment an existing collaborative filtering system.
Q7. What motivates this work?
This work is, in part, motivated by the idea that an understanding of user reputation can serve as the basis for strategies to guard against malicious users [Lazzari 2010; Hoffman et al.].
Q8. How is the standard collaborative filtering algorithm modified?
The standard collaborative filtering algorithm is modified to add a user-user trust score to complement the normal profile- or item-based similarity score, so that recommendation partners are chosen from users who are not only similar to the target user but who also have a positive recommendation history with that user.
Q9. What is the main contribution of this paper?
In addition, a second contribution of this paper is a novel enhanced reputation model for HeyStaks, which has been developed in order to evaluate the reputation of individual searchers based on their search contributions.