
Showing papers by "Nigel Shadbolt" published in 2015


Proceedings ArticleDOI
18 Apr 2015
TL;DR: Presents a case study of the Zooniverse, the largest single platform of citizen-driven data analysis projects to date, by eliciting, through structured reflection, the experiences of core members of its design team; the grounded analysis yielded four sets of themes, focusing on Task Specificity, Community Development, Task Design, and Public Relations and Engagement.
Abstract: Designing an effective and sustainable citizen science (CS) project requires consideration of a great number of factors. This makes the overall process unpredictable, even when a sound, user-centred design approach is followed by an experienced team of UX designers. Moreover, when such systems are deployed, the complexity of the resulting interactions challenges any attempt at generalisation from retrospective analysis. In this paper, we present a case study of the largest single platform of citizen-driven data analysis projects to date, the Zooniverse. By eliciting, through structured reflection, experiences of core members of its design team, our grounded analysis yielded four sets of themes, focusing on Task Specificity, Community Development, Task Design, and Public Relations and Engagement, each supported by two to four specific design claims. For each theme, we propose a set of design claims (DCs), drawing comparisons to the literature on crowdsourcing and online communities to contextualise our findings.

85 citations


Proceedings ArticleDOI
18 May 2015
TL;DR: Defines a predictive model for estimating the most appropriate incentives for individual workers based on their previous contributions, allowing for a personalised game experience, and shows that gamification leads to better accuracy and lower costs than conventional approaches that use only monetary incentives.
Abstract: Crowdsourcing via paid microtasks has been successfully applied in a plethora of domains and tasks. Previous efforts for making such crowdsourcing more effective have considered aspects as diverse as task and workflow design, spam detection, quality control, and pricing models. Our work expands upon such efforts by examining the potential of adding gamification to microtask interfaces as a means of improving both worker engagement and effectiveness. We run a series of experiments in image labeling, one of the most common use cases for microtask crowdsourcing, and analyse worker behavior in terms of number of images completed, quality of annotations compared against a gold standard, and response to financial and game-specific rewards. Each experiment studies these parameters in two settings: one based on a state-of-the-art, non-gamified task on CrowdFlower and another one using an alternative interface incorporating several game elements. Our findings show that gamification leads to better accuracy and lower costs than conventional approaches that use only monetary incentives. In addition, it seems to make paid microtask work more rewarding and engaging, especially when sociality features are introduced. Following these initial insights, we define a predictive model for estimating the most appropriate incentives for individual workers, based on their previous contributions. This allows us to build a personalised game experience, with gains seen on the volume and quality of work completed.

70 citations
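The abstract above mentions a predictive model that estimates the most appropriate incentive for each worker from their previous contributions, without giving implementation detail. Below is a minimal sketch of how such a per-worker model could look, assuming hypothetical features (tasks completed, accuracy against the gold standard, points earned) and an off-the-shelf logistic regression; the paper's actual features, data and model are not reproduced here.

```python
# Hypothetical sketch of a per-worker incentive predictor, loosely following the idea
# in the abstract: predict from past contributions whether a worker is more likely to
# respond to game-style rewards than to extra payment.
# Feature names and training data are illustrative, not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [tasks_completed, accuracy_vs_gold, points_earned_per_task]
X = np.array([
    [120, 0.91, 14.0],
    [ 15, 0.70,  2.5],
    [ 60, 0.85,  9.0],
    [  8, 0.62,  1.0],
])
# Label: 1 = responded better to game rewards, 0 = responded better to payment.
y = np.array([1, 0, 1, 0])

model = LogisticRegression().fit(X, y)

def best_incentive(worker_features):
    """Return which incentive to offer next, based on the fitted model."""
    p_game = model.predict_proba([worker_features])[0, 1]
    return "game reward" if p_game > 0.5 else "monetary bonus"

print(best_incentive([45, 0.88, 7.5]))
```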


Book Chapter
22 May 2015
TL;DR: This work states that knowledge elicitation is a sub-process of knowledge acquisition (which deals with the acquisition or capture of knowledge from any source), and knowledge acquisition is, in turn, a sub-process of knowledge engineering (which is a discipline that has evolved to support the whole process of specifying, developing and deploying knowledge-based systems).
Abstract: Knowledge elicitation consists of a set of techniques and methods that attempt to elicit the knowledge of a domain expert, typically through some form of direct interaction with the expert. Knowledge elicitation is a sub-process of knowledge acquisition (which deals with the acquisition or capture of knowledge from any source), and knowledge acquisition is, in turn, a sub-process of knowledge engineering (which is a discipline that has evolved to support the whole process of specifying, developing and deploying knowledge-based systems).

52 citations


Book ChapterDOI
20 Jul 2015
TL;DR: Surveys a large number of state-of-the-art deanonymisation techniques, applied through a variety of methods and to different types of data, and proposes a framework to guide a thorough analysis and classification.
Abstract: The problem of disclosing private anonymous data has become increasingly serious, particularly with the possibility of carrying out deanonymisation attacks on published data. The related work available in the literature is inadequate in terms of the number of techniques analysed, and is limited to certain contexts such as Online Social Networks. We survey a large number of state-of-the-art deanonymisation techniques, applied through a variety of methods and to different types of data. Our aim is to build a comprehensive understanding of the problem. For this survey, we propose a framework to guide a thorough analysis and classification. We are interested in classifying deanonymisation approaches based on the type and source of auxiliary information and on the structure of target datasets. Moreover, potential attacks, threats and some suggested assistive techniques are identified. This can inform research towards a better understanding of the deanonymisation problem and assist in the advancement of privacy protection.

23 citations
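The survey's framework classifies deanonymisation approaches by the type and source of auxiliary information and by the structure of the target dataset. The following is a purely hypothetical sketch of how such classification dimensions could be encoded as a small data structure; the enum values and example attacks are illustrative labels, not the taxonomy defined in the paper.

```python
# Hypothetical encoding of the classification dimensions mentioned in the abstract:
# auxiliary-information source and target-dataset structure. The enum members and
# example entries are illustrative, not the categories defined in the survey itself.
from dataclasses import dataclass
from enum import Enum

class AuxSource(Enum):
    SAME_PLATFORM = "same platform"
    CROSS_PLATFORM = "cross platform"
    PUBLIC_RECORDS = "public records"

class TargetStructure(Enum):
    RELATIONAL = "relational table"
    GRAPH = "social graph"
    TRAJECTORY = "location trajectories"

@dataclass
class DeanonymisationAttack:
    name: str
    aux_source: AuxSource
    target_structure: TargetStructure

attacks = [
    DeanonymisationAttack("seed-and-expand graph matching", AuxSource.CROSS_PLATFORM, TargetStructure.GRAPH),
    DeanonymisationAttack("quasi-identifier linkage", AuxSource.PUBLIC_RECORDS, TargetStructure.RELATIONAL),
]

# Group attacks by target structure to compare which settings are most studied.
by_structure = {}
for a in attacks:
    by_structure.setdefault(a.target_structure, []).append(a.name)
print(by_structure)
```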


Proceedings ArticleDOI
28 Jun 2015
TL;DR: This paper uses a survey to examine ways in which people fabricate, omit or alter the truth online, and concludes that lying may be essential to maintaining a humane online society.
Abstract: Portraying matters as other than they truly are is an important part of everyday human communication. In this paper, we use a survey to examine ways in which people fabricate, omit or alter the truth online. Many reasons are found, including creative expression, hiding sensitive information, role-playing, and avoiding harassment or discrimination. The results suggest lying is often used for benign purposes, and we conclude that its use may be essential to maintaining a humane online society.

17 citations


Book ChapterDOI
31 May 2015
TL;DR: The findings show that crowd workers are adept at recognizing people, locations, and implicitly identified entities within shorter microposts; these findings are expected to inform the design of more advanced NER pipelines, including the way in which tweets are chosen to be outsourced or processed by automatic tools.
Abstract: This paper explores the factors that influence the human component in hybrid approaches to named entity recognition (NER) in microblogs, which combine state-of-the-art automatic techniques with human and crowd computing. We identify a set of content and crowdsourcing-related features (number of entities in a post, types of entities, skipped true-positive posts, average time spent to complete the tasks, and interaction with the user interface) and analyse their impact on the accuracy of the results and the timeliness of their delivery. Using CrowdFlower and a simple, custom-built gamified NER tool, we run experiments on three datasets from the related literature and a fourth newly annotated corpus. Our findings show that crowd workers are adept at recognizing people, locations, and implicitly identified entities within shorter microposts. We expect them to lead to the design of more advanced NER pipelines, informing the way in which tweets are chosen to be outsourced or processed by automatic tools. Experimental results are published as JSON-LD for further use by the research community.

15 citations
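The abstract above reports accuracy of crowd annotations measured against a gold standard, broken down by features such as entity type. Below is a minimal sketch, using a hypothetical gold standard and annotation format, of how per-entity-type recall could be computed; the actual dataset fields and the JSON-LD vocabulary used by the authors are not reproduced here.

```python
# Hypothetical sketch: compare crowd entity annotations against a gold standard and
# break accuracy down by entity type, echoing the finding that people and locations
# in shorter microposts are recognised well. Field names and data are illustrative.
from collections import defaultdict

gold = [  # (post_id, surface form, entity type)
    ("t1", "Barack Obama", "PERSON"),
    ("t1", "Washington", "LOCATION"),
    ("t2", "Grand Canyon", "LOCATION"),
    ("t2", "NASA", "ORGANISATION"),
]
crowd = {  # annotations returned by workers, keyed by post
    "t1": {("Barack Obama", "PERSON"), ("Washington", "LOCATION")},
    "t2": {("Grand Canyon", "LOCATION")},
}

hits, totals = defaultdict(int), defaultdict(int)
for post_id, surface, etype in gold:
    totals[etype] += 1
    if (surface, etype) in crowd.get(post_id, set()):
        hits[etype] += 1

for etype in totals:
    print(etype, f"recall = {hits[etype] / totals[etype]:.2f}")
```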


Proceedings ArticleDOI
18 May 2015
TL;DR: This paper presents a generic information cascade model that exploits only the temporal order of information sharing activities, combined with inherent properties of the shared information resources, applied to data from the world's largest online citizen science platform Zooniverse.
Abstract: This paper is an attempt to lay out foundations for a general theory of coincidence in information spaces such as the World Wide Web, expanding on existing work on bursty structures in document streams and information cascades. We elaborate on the hypothesis that every resource that is published in an information space enters a temporary interaction with another resource once a unique explicit or implicit reference between the two is found. This thought is motivated by Erwin Schrödinger's notion of entanglement between quantum systems. We present a generic information cascade model that exploits only the temporal order of information sharing activities, combined with inherent properties of the shared information resources. The approach was applied to data from the world's largest online citizen science platform, the Zooniverse, and we report on the findings of this case study.

14 citations
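The cascade model described above relies only on the temporal order of information-sharing activities plus intrinsic properties of the shared resources. Below is a minimal sketch, over a hypothetical event log, of how cascades could be assembled by linking time-ordered posts that share a content feature; the paper's exact linking rules are not spelled out in the abstract and are not claimed here.

```python
# Hypothetical sketch of building information cascades from a time-ordered event log.
# Two events join the same cascade when they share a content feature (here, a hashtag).
# The event data and the choice of hashtags as the shared feature are illustrative.
events = [  # (timestamp, user, set of content features)
    (1, "alice", {"#planet", "#transit"}),
    (2, "bob",   {"#planet"}),
    (3, "carol", {"#transit", "#gap"}),
    (4, "dave",  {"#gap"}),
]

cascades = {}  # feature -> ordered list of (timestamp, user)
for ts, user, features in sorted(events, key=lambda e: e[0]):
    for f in features:
        cascades.setdefault(f, []).append((ts, user))

for feature, chain in cascades.items():
    print(feature, "->", [u for _, u in chain])
```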


Proceedings ArticleDOI
25 Aug 2015
TL;DR: This paper applies a method for constructing cascades of information co-occurrence, suitable for tracing emergent structures in information in scenarios where rich contextual features are unavailable, to analyse information dissemination patterns across the active online citizen science project Planet Hunters.
Abstract: In this paper, we investigate a method for constructing cascades of information co-occurrence, which is suitable for tracing emergent structures in information in scenarios where rich contextual features are unavailable. Our method relies only on the temporal order of content-sharing activities and intrinsic properties of the shared content itself. We apply this method to analyse information dissemination patterns across the active online citizen science project Planet Hunters, a part of the Zooniverse platform. Our results lend insight into both structural and informational properties of different types of identifiers that can be used and combined to construct cascades. In particular, significant differences are found in the structural properties of information cascades when hashtags are used as cascade identifiers, compared with other content features. We also explain apparent local information losses in cascades in terms of information obsolescence and cascade divergence; e.g., when a cascade branches into multiple, divergent cascades with combined capacity equal to the original.

14 citations
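The abstract above compares the structural properties of cascades built with hashtags as identifiers against cascades built from other content features. The following is a small illustrative sketch, with hypothetical data, of computing two of the simplest structural measures (size and time span) per identifier type; the metrics reported in the paper are not reproduced here.

```python
# Hypothetical sketch of comparing structural properties of cascades grouped by the
# kind of identifier that defined them (hashtag vs. other content feature).
# Size and time span are the simplest structural measures; the measures and the data
# shown are illustrative, not the ones reported in the paper.
cascades = {
    ("hashtag", "#gap"):       [1, 3, 4, 9],          # timestamps of sharing events
    ("hashtag", "#transit"):   [2, 5],
    ("keyword", "lightcurve"): [1, 2, 2, 3, 7, 8],
}

summary = {}
for (id_type, _), times in cascades.items():
    stats = summary.setdefault(id_type, {"count": 0, "total_size": 0, "total_span": 0})
    stats["count"] += 1
    stats["total_size"] += len(times)
    stats["total_span"] += max(times) - min(times)

for id_type, s in summary.items():
    print(id_type,
          "mean size =", s["total_size"] / s["count"],
          "mean span =", s["total_span"] / s["count"])
```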


Proceedings ArticleDOI
18 May 2015
TL;DR: This paper uses a survey to examine ways in which people fabricate, omit or alter the truth online, and concludes that lying may be essential to maintaining a humane online society.
Abstract: Portraying matters as other than they truly are is an important part of everyday human communication. In this paper, we use a survey to examine ways in which people fabricate, omit or alter the truth online. Many reasons are found, including creative expression, hiding sensitive information, role-playing, and avoiding harassment or discrimination. The results suggest lying is often used for benign purposes, and we conclude that its use may be essential to maintaining a humane online society.

11 citations


Proceedings ArticleDOI
28 Jun 2015
TL;DR: Describes a set of behavioural characteristics that identify different types of players within the EyeWire platform, based on an analysis of how features of the system facilitate player interaction and communication alongside completing the gamified scientific task.
Abstract: Citizen science is changing the process of scientific knowledge discovery. Successful projects rely on an active and able collection of volunteers. In order to attract and sustain citizen scientists, designers are faced with the task of transforming complex scientific tasks into something accessible, interesting, and, hopefully, engaging. In this paper, we examine the citizen science game EyeWire. Our analysis draws upon a dataset of over 4,000,000 completed game entries and 885,000 chat entries made by over 90,000 players. The analysis provides a detailed understanding of how features of the system facilitate player interaction and communication alongside completing the gamified scientific task. Based on the analysis, we describe a set of behavioural characteristics which identify different types of players within the EyeWire platform.

10 citations
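The behavioural characteristics above are derived from millions of game entries and chat messages per player. Below is a hedged sketch, using hypothetical per-player activity features and a standard k-means clustering, of how players might be grouped into behavioural types; the features and method actually used by the authors are not specified in the abstract and may differ.

```python
# Hypothetical sketch: cluster players into behavioural types from simple activity
# counts (games completed, chat messages sent, days active). The features, data and
# choice of k-means are illustrative; the paper's own analysis may differ.
import numpy as np
from sklearn.cluster import KMeans

# Rows: one player each -> [games_completed, chat_messages, active_days]
players = np.array([
    [5000, 1200, 300],   # highly active, very social
    [4800,   10, 250],   # highly active, rarely chats
    [  40,  300,  20],   # mostly social
    [  30,    2,   5],   # casual drop-in
])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(players)
for row, label in zip(players, labels):
    print(row, "-> behavioural type", label)
```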


Proceedings ArticleDOI
28 Feb 2015
TL;DR: This paper elaborates a thesis about the computational capability embodied in information-sharing activities that happen on the Web, termed socio-technical computation, reflecting not only explicitly conditional activities but also the organic potential residing in information on the Web.
Abstract: Motivated by the significant amount of successful collaborative problem solving activity on the Web, we ask: Can the accumulated information propagation behavior on the Web be conceived as a giant machine, and reasoned about accordingly? In this paper we elaborate a thesis about the computational capability embodied in information sharing activities that happen on the Web, which we term socio-technical computation, reflecting not only explicitly conditional activities but also the organic potential residing in information on the Web.

Proceedings ArticleDOI
18 May 2015
TL;DR: This short paper presents the technical design of a prototype social machine platform, INDX, which realises both of the key design needs for building decentralised social machines: that of supporting heterogeneous social apps and multiple, separable user identities.
Abstract: Personal Data Stores are among the many efforts that are currently underway to try to re-decentralise the Web, and to bring more data management and storage capability under the control of the user. Few of these architectures, however, have considered the needs of supporting decentralised social software from the user's perspective. In this short paper, we present the results of our design exercise, focusing on two key design needs for building decentralised social machines: those of supporting heterogeneous social apps and multiple, separable user identities. We then present the technical design of a prototype social machine platform, INDX, which realises both of these requirements, and a prototype heterogeneous microblogging application which demonstrates its capabilities.
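The abstract highlights two design needs: heterogeneous social apps over one store, and multiple, separable user identities. The sketch below is purely hypothetical and is intended only to make the second requirement concrete, by keeping stored objects partitioned per identity; it is not INDX's actual API or data model.

```python
# Purely hypothetical sketch of a personal data store that keeps objects partitioned
# by user identity, so different social apps can read/write under separate personas.
# This illustrates the requirement only; it is not the INDX platform's real API.
class PersonalDataStore:
    def __init__(self):
        self._boxes = {}  # identity name -> {object id: object}

    def create_identity(self, name):
        self._boxes.setdefault(name, {})

    def put(self, identity, obj_id, obj):
        self._boxes[identity][obj_id] = obj

    def get(self, identity, obj_id):
        # An app acting under one identity cannot see another identity's objects.
        return self._boxes[identity].get(obj_id)

store = PersonalDataStore()
store.create_identity("work-persona")
store.create_identity("hobby-persona")
store.put("work-persona", "post1", {"text": "Quarterly report is out"})
print(store.get("hobby-persona", "post1"))  # None: identities are separable
```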

01 Jan 2015
TL;DR: A consensus report outlining a menu of privacy “bridges” that can be built to bring the European Union and the United States closer together is prepared, aimed at providing a framework of practical options that advance strong, globally-accepted privacy values in a manner that respects the substantive and procedural differences between the two jurisdictions.
Abstract: The EU and US share a common commitment to privacy protection as a cornerstone of democracy. Following the Treaty of Lisbon, data privacy is a fundamental right that the European Union must proactively guarantee. In the United States, data privacy derives from constitutional protections in the First, Fourth and Fifth Amendments as well as federal and state statute, consumer protection law and common law. The ultimate goal of effective privacy protection is shared. However, current friction between the two legal systems poses challenges to realizing privacy and the free flow of information across the Atlantic. The recent expansion of online surveillance practices underlines these challenges. Over nine months, the group prepared a consensus report outlining a menu of privacy “bridges” that can be built to bring the European Union and the United States closer together. The efforts are aimed at providing a framework of practical options that advance strong, globally-accepted privacy values in a manner that respects the substantive and procedural differences between the two jurisdictions. The report will be presented at the 2015 International Conference of Privacy and Data Protection Commissioners, which the Dutch Data Protection Authority will host in Amsterdam on 28-29 October 2015.

Proceedings ArticleDOI
18 May 2015
TL;DR: This paper examines WikiProjects, an emergent, community-driven feature of Wikipedia; the analysis reveals that, per WikiProject, the numbers of article and talk contributions are increasing, as is the number of new Wikipedians contributing to individual WikiProjects.
Abstract: In this paper we examine WikiProjects, an emergent, community-driven feature of Wikipedia. We analysed 3.2 million Wikipedia articles associated with 618 active WikiProjects. The dataset contained the logs of over 115 million article revisions and 15 million talk entries, together representing the activity of 15 million unique Wikipedians. Our analysis revealed that, per WikiProject, the numbers of article and talk contributions are increasing, as is the number of new Wikipedians contributing to individual WikiProjects. Based on these findings, we consider how studying Wikipedia at the sub-community level may provide a means to measure Wikipedia activity.
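The analysis above counts article revisions, talk-page entries and newly contributing Wikipedians per WikiProject over time. Below is a minimal sketch, over hypothetical log records, of aggregating such activity per project, per year and per edit kind, the sort of roll-up that would expose the growth trend reported in the abstract; the record format is illustrative, not the study's dataset.

```python
# Hypothetical sketch: aggregate revision and talk activity per WikiProject per year
# from a flat log of edits. The records below are illustrative, not the study's data.
from collections import defaultdict

edits = [  # (wikiproject, year, kind) where kind is "article" or "talk"
    ("WikiProject Medicine", 2013, "article"),
    ("WikiProject Medicine", 2013, "talk"),
    ("WikiProject Medicine", 2014, "article"),
    ("WikiProject Medicine", 2014, "article"),
    ("WikiProject Physics",  2014, "talk"),
]

activity = defaultdict(lambda: defaultdict(int))  # project -> (year, kind) -> count
for project, year, kind in edits:
    activity[project][(year, kind)] += 1

for project, counts in activity.items():
    for (year, kind), n in sorted(counts.items()):
        print(project, year, kind, n)
```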

Journal ArticleDOI
TL;DR: The right to be de-indexed should be understood in the context of moves to improve communication with data subjects and support subjects' autonomy (particularly within the notice and consent regime), and of understanding the role of obscurity of information, undermining the current binary assumption that information is either public or not.
Abstract: This paper examines the recent Google Spain ruling establishing a right to de-indexing based on existing rights to data protection. This ruling has had a divisive effect on the relations between the EU and the US, but this article argues that we should understand the right to de-indexing in the context of: (i) moves to improve communication with data subjects and support subjects’ autonomy, particularly within the notice and consent regime; (ii) understanding the role of obscurity of information, and undermining the current binary assumption that information is either public or not; and (iii) moves to improve the quality of search engines’ output. If we do this, then the right to be de-indexed (and possibly other types of ‘right to be forgotten’) could become a point of contact between the EU and US privacy regimes, not a point of conflict.

30 Jun 2015
TL;DR: It is argued that providing false information on occasion is a common strategy online and offline for people to protect their privacy and determine their representation in the world, and some empirical findings to that effect are discussed.
Abstract: In this position paper, we discuss legal and technical aspects of protecting privacy using Personal Data Management Architectures (PDMAs), which include, but are not limited to, Personal Data Stores and Personal Information Management Services. We argue that providing false information on occasion is a common strategy, online and offline, for people to protect their privacy and determine their representation in the world, and we discuss some empirical findings to that effect. We describe a potential, and technically feasible, ecosystem of digital practices and technologies to facilitate this practice, and consider what legal frameworks would be required to support it.