Topic

Crowdsourcing

About: Crowdsourcing is a research topic. Over its lifetime, 12,889 publications have been published within this topic, receiving 230,638 citations.


Papers
Proceedings ArticleDOI
18 Jun 2014
TL;DR: Corleone, a hands-off crowdsourcing (HOC) solution for entity matching (EM), is described; it uses the crowd in all major steps of the EM process, and the work's implications for executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers are discussed.
Abstract: Recent approaches to crowdsourcing entity matching (EM) are limited in that they crowdsource only parts of the EM workflow, requiring a developer to execute the remaining parts. Consequently, these approaches do not scale to the growing EM need at enterprises and crowdsourcing startups, and cannot handle scenarios where ordinary users (i.e., the masses) want to leverage crowdsourcing to match entities. In response, we propose the notion of hands-off crowdsourcing (HOC), which crowdsources the entire workflow of a task, thus requiring no developers. We show how HOC can represent a next logical direction for crowdsourcing research, scale up EM at enterprises and crowdsourcing startups, and open up crowdsourcing for the masses. We describe Corleone, a HOC solution for EM, which uses the crowd in all major steps of the EM process. Finally, we discuss the implications of our work for executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers.

251 citations
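The paper itself contains no code; as a rough illustration of the crowd-in-the-loop EM workflow it describes, here is a minimal Python sketch. Everything in it (the ask_crowd stub, the Jaccard blocker, the threshold "matcher") is a hypothetical simplification, not the actual Corleone system.

```python
# Minimal sketch of a crowd-in-the-loop entity-matching (EM) workflow,
# loosely inspired by the hands-off crowdsourcing idea above.
# All names (ask_crowd, jaccard, train_threshold) are hypothetical.

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two string records."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def ask_crowd(pair):
    """Stub for a crowdsourcing call (e.g. a task posted to workers).
    Here we fake an answer so the sketch runs end to end."""
    return jaccard(*pair) > 0.5  # pretend the crowd labels by eye

def train_threshold(labeled):
    """'Learn' a matcher: pick the similarity threshold that best
    separates the crowd's yes/no labels (a stand-in for a real model)."""
    best_t, best_acc = 0.5, -1.0
    for t in (i / 10 for i in range(1, 10)):
        acc = sum((jaccard(*p) > t) == y for p, y in labeled) / len(labeled)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Blocking: keep only pairs with some token overlap, to cut crowd cost.
left = ["iphone 13 pro", "galaxy s21", "pixel 6"]
right = ["apple iphone 13 pro", "samsung galaxy s21", "google pixel 6 pro"]
candidates = [(a, b) for a in left for b in right if jaccard(a, b) > 0.1]

# The crowd labels a small sample; a matcher is trained on those labels;
# the remaining pairs are then matched automatically.
labeled = [(p, ask_crowd(p)) for p in candidates[:4]]
threshold = train_threshold(labeled)
matches = [p for p in candidates if jaccard(*p) > threshold]
print(matches)
```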

Journal ArticleDOI
TL;DR: In this paper, the authors show through an analysis of a massive YouTube data set that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads; a lack of attention leads to a drop in uploads that in many cases asymptotes to no uploads whatsoever.
Abstract: We show through an analysis of a massive data set from YouTube that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads. Conversely, a lack of attention leads to a decrease in the number of videos uploaded and the consequent drop in productivity, which in many cases asymptotes to no uploads whatsoever. Moreover, short-term contributors compare their performance to the average contributor's performance, while long-term contributors compare it to their own median.

248 citations

Proceedings Article
01 Jan 2011
TL;DR: This paper presents an automated quality assurance process that is inexpensive and scalable, and finds that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
Abstract: Crowdsourcing is an effective tool for scalable data annotation in both research and enterprise contexts. Due to crowdsourcing's open participation model, quality assurance is critical to the success of any project. Present methods rely on EM-style post-processing or manual annotation of large gold standard sets. In this paper we present an automated quality assurance process that is inexpensive and scalable. Our novel process relies on programmatic gold creation to provide targeted training feedback to workers and to prevent common scamming scenarios. We find that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.

247 citations
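As a rough illustration of the gold-based quality assurance idea, here is a minimal Python sketch: known-answer ("gold") tasks are mixed into a worker's queue, and accuracy on them gates acceptance of the batch. All names, data, and thresholds are hypothetical, not the paper's actual pipeline.

```python
import random

# Hypothetical sketch of gold-based quality assurance in crowdsourcing:
# estimate each worker's accuracy from known-answer ("gold") tasks
# hidden among the real ones, and reject batches from likely scammers.

GOLD = {"task_7": "cat", "task_19": "dog", "task_23": "cat"}  # known answers

def score_worker(answers: dict) -> float:
    """Fraction of gold tasks a worker answered correctly."""
    hits = sum(answers.get(t) == truth for t, truth in GOLD.items())
    return hits / len(GOLD)

def accept_work(answers: dict, threshold: float = 0.7) -> bool:
    """Accept a worker's batch only if gold accuracy clears the threshold;
    in a real pipeline a failed gold item would also trigger targeted
    feedback explaining the correct answer."""
    return score_worker(answers) >= threshold

# A diligent worker versus a random-clicking "scammer".
diligent = {"task_7": "cat", "task_19": "dog", "task_23": "cat", "task_42": "dog"}
scammer = {t: random.choice(["cat", "dog"]) for t in list(GOLD) + ["task_42"]}

print(accept_work(diligent))  # True
print(accept_work(scammer))   # usually False
```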

Journal ArticleDOI
TL;DR: In this paper, a taxonomic theory of crowdsourcing is developed by organizing the empirical variants into nine distinct forms of crowdsourcing models, focusing on the notion of managerial control systems.
Abstract: In this article, the authors first provide a practical yet rigorous definition of crowdsourcing that incorporates “crowds,” outsourcing, and social web technologies. They then analyze 103 well-known crowdsourcing web sites using content analysis methods and the hermeneutic reading principle. Based on their analysis, they develop a “taxonomic theory” of crowdsourcing by organizing the empirical variants in nine distinct forms of crowdsourcing models. They also discuss key issues and directions, concentrating on the notion of managerial control systems.

244 citations

Journal ArticleDOI
TL;DR: The findings inform recent discussions about potential benefits from crowd science, suggest that involving the crowd may be more effective for some kinds of projects than others, provide guidance for project managers, and raise important questions for future research.
Abstract: Scientific research performed with the involvement of the broader public (the crowd) attracts increasing attention from scientists and policy makers. A key premise is that project organizers may be able to draw on underused human resources to advance research at relatively low cost. Despite a growing number of examples, systematic research on the effort contributions volunteers are willing to make to crowd science projects is lacking. Analyzing data on seven different projects, we quantify the financial value volunteers can bring by comparing their unpaid contributions with counterfactual costs in traditional or online labor markets. The volume of total contributions is substantial, although some projects are much more successful in attracting effort than others. Moreover, contributions received by projects are very uneven across time—a tendency toward declining activity is interrupted by spikes typically resulting from outreach efforts or media attention. Analyzing user-level data, we find that most contributors participate only once and with little effort, leaving a relatively small share of users who return responsible for most of the work. Although top contributor status is earned primarily through higher levels of effort, top contributors also tend to work faster. This speed advantage develops over multiple sessions, suggesting that it reflects learning rather than inherent differences in skills. Our findings inform recent discussions about potential benefits from crowd science, suggest that involving the crowd may be more effective for some kinds of projects than others, provide guidance for project managers, and raise important questions for future research.

243 citations
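The study's valuation method is simple enough to show as arithmetic: volunteer hours are priced at counterfactual wages from traditional and online labor markets. The figures below are made up for illustration; they are not the study's data.

```python
# Toy illustration (made-up numbers) of valuing volunteer effort by its
# counterfactual cost in a labor market: hours contributed are priced
# at a traditional wage and at a typical online-microtask rate.

volunteer_hours = 12_500    # hypothetical total unpaid hours on a project
traditional_wage = 18.00    # USD/hour, e.g. a paid research assistant
microtask_wage = 4.00       # USD/hour, e.g. an online labor market

print(f"Traditional-market value: ${volunteer_hours * traditional_wage:,.0f}")
print(f"Online-market value:      ${volunteer_hours * microtask_wage:,.0f}")
```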


Network Information
Related Topics (5)
Social network: 42.9K papers, 1.5M citations (87% related)
User interface: 85.4K papers, 1.7M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Cluster analysis: 146.5K papers, 2.9M citations (85% related)
The Internet: 213.2K papers, 3.8M citations (85% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    637
2022    1,420
2021    996
2020    1,250
2019    1,341
2018    1,396