scispace - formally typeset
Search or ask a question
Topic

Crowdsourcing

About: Crowdsourcing is a research topic. Over the lifetime, 12889 publications have been published within this topic receiving 230638 citations.


Papers
More filters
Proceedings ArticleDOI
05 Jun 2011
TL;DR: This paper introduces a framework to address the problem of moderating online content using crowdsourced ratings, and presents efficient algorithms to accurately detect abuse that only require knowledge about the identity of a single 'good' agent, who rates contributions accurately more than half the time.
Abstract: A large fraction of user-generated content on the Web, such as posts or comments on popular online forums, consists of abuse or spam. Due to the volume of contributions on popular sites, a few trusted moderators cannot identify all such abusive content, so viewer ratings of contributions must be used for moderation. But not all viewers who rate content are trustworthy and accurate. What is a principled approach to assigning trust and aggregating user ratings, in order to accurately identify abusive content? In this paper, we introduce a framework to address the problem of moderating online content using crowdsourced ratings. Our framework encompasses users who are untrustworthy or inaccurate to an unknown extent --- that is, both the content and the raters are of unknown quality. With no knowledge whatsoever about the raters, it is impossible to do better than a random estimate. We present efficient algorithms to accurately detect abuse that only require knowledge about the identity of a single 'good' agent, who rates contributions accurately more than half the time. We prove that our algorithm can infer the quality of contributions with error that rapidly converges to zero as the number of observations increases; we also numerically demonstrate that the algorithm has very high accuracy for much fewer observations. Finally, we analyze the robustness of our algorithms to manipulation by adversarial or strategic raters, an important issue in moderating online content, and quantify how the performance of the algorithm degrades with the number of manipulating agents.

198 citations

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work presents an approach for live learning of object detectors, in which the system autonomously refines its models by actively requesting crowd-sourced annotations on images crawled from the Web, and introduces a novel part-based detector amenable to linear classifiers.
Abstract: Active learning and crowdsourcing are promising ways to efficiently build up training sets for object recognition, but thus far techniques are tested in artificially controlled settings. Typically the vision researcher has already determined the dataset's scope, the labels “actively” obtained are in fact already known, and/or the crowd-sourced collection process is iteratively fine-tuned. We present an approach for live learning of object detectors, in which the system autonomously refines its models by actively requesting crowd-sourced annotations on images crawled from the Web. To address the technical issues such a large-scale system entails, we introduce a novel part-based detector amenable to linear classifiers, and show how to identify its most uncertain instances in sub-linear time with a hashing-based solution. We demonstrate the approach with experiments of unprecedented scale and autonomy, and show it successfully improves the state-of-the-art for the most challenging objects in the PASCAL benchmark. In addition, we show our detector competes well with popular nonlinear classifiers that are much more expensive to train.

198 citations

Proceedings ArticleDOI
07 May 2011
TL;DR: Findings suggest that crowd based design processes may be effective, and point the way toward computer-human interactions that might further encourage crowd creativity.
Abstract: A sketch combination system is introduced and tested: a crowd of 1047 participated in an iterative process of design, evaluation and combination. Specifically, participants in a crowdsourcing marketplace sketched chairs for children. One crowd created a first generation of chairs, and then successive crowds created new generations by combining the chairs made by previous crowds. Other participants evaluated the chairs. The crowd judged the chairs from the third generation more creative than those from the first generation. An analysis of the design evolution shows that participants inherited and modified presented features, and also added new features. These findings suggest that crowd based design processes may be effective, and point the way toward computer-human interactions that might further encourage crowd creativity.

197 citations

Journal ArticleDOI
TL;DR: The most significant opportunities offered by smartphone-based crowdsourcing, and the major challenges that have to be faced, are discussed.
Abstract: Smartphone-based crowdsourcing fosters the rise of radically novel systems and applications in the context of network monitoring. This article discusses the most significant opportunities offered by this approach, and the major challenges that have to be faced. Our experience in building a smartphone-based crowdsourcing system, Portolan, is also included to provide a practical background to the discussion and to demonstrate the possible benefits.

197 citations

Proceedings ArticleDOI
20 May 2012
TL;DR: It is shown that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard, and heuristic functions are provided to select the maximum given evidence, and to select additional votes.
Abstract: We consider a crowdsourcing database system that may cleanse, populate, or filter its data by using human workers. Just like a conventional DB system, such a crowdsourcing DB system requires data manipulation functions such as select, aggregate, maximum, average, and so on, except that now it must rely on human operators (that for example compare two objects) with very different latency, cost and accuracy characteristics. In this paper, we focus on one such function, maximum, that finds the highest ranked object or tuple in a set. In particularm we study two problems: given a set of votes (pairwise comparisons among objects), how do we select the maximum? And how do we improve our estimate by requesting additional votes? We show that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard. We then provide heuristic functions to select the maximum given evidence, and to select additional votes. We experimentally evaluate our functions to highlight their strengths and weaknesses.

196 citations


Network Information
Related Topics (5)
Social network
42.9K papers, 1.5M citations
87% related
User interface
85.4K papers, 1.7M citations
86% related
Deep learning
79.8K papers, 2.1M citations
85% related
Cluster analysis
146.5K papers, 2.9M citations
85% related
The Internet
213.2K papers, 3.8M citations
85% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023637
20221,420
2021996
20201,250
20191,341
20181,396