Topic
Crowdsourcing
About: Crowdsourcing is a research topic. Over the lifetime, 12889 publications have been published within this topic receiving 230638 citations.
Papers published on a yearly basis
Papers
18 Jun 2014
TL;DR: Describes Corleone, a hands-off crowdsourcing (HOC) solution for entity matching (EM) that uses the crowd in all major steps of the EM process, and discusses the implications of this work for executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers.
Abstract: Recent approaches to crowdsourcing entity matching (EM) are limited in that they crowdsource only parts of the EM workflow, requiring a developer to execute the remaining parts. Consequently, these approaches do not scale to the growing EM need at enterprises and crowdsourcing startups, and cannot handle scenarios where ordinary users (i.e., the masses) want to leverage crowdsourcing to match entities. In response, we propose the notion of hands-off crowdsourcing (HOC), which crowdsources the entire workflow of a task, thus requiring no developers. We show how HOC can represent a next logical direction for crowdsourcing research, scale up EM at enterprises and crowdsourcing startups, and open up crowdsourcing for the masses. We describe Corleone, a HOC solution for EM, which uses the crowd in all major steps of the EM process. Finally, we discuss the implications of our work for executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers.
251 citations
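The crowdsourced EM workflow the abstract describes (generate candidate pairs, then resolve them with crowd labels) can be sketched roughly as below. This is not Corleone's actual implementation; the blocking threshold and the `ask_crowd` stand-in (which simulates a crowd judgment with a string-similarity check) are illustrative assumptions.

```python
# Hypothetical sketch of a crowdsourced entity-matching loop: block cheaply,
# then send surviving candidate pairs to the crowd. `ask_crowd` is a stand-in
# for a real crowdsourcing-platform call.
from difflib import SequenceMatcher

def block(records_a, records_b, threshold=0.5):
    """Cheap blocking step: keep only pairs with rough string similarity."""
    pairs = []
    for a in records_a:
        for b in records_b:
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((a, b))
    return pairs

def ask_crowd(pair):
    """Placeholder for a crowd task; simulated here with a similarity check."""
    a, b = pair
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() > 0.8

def crowdsourced_em(records_a, records_b):
    candidates = block(records_a, records_b)
    return [p for p in candidates if ask_crowd(p)]

matches = crowdsourced_em(["Apple Inc.", "IBM Corp"], ["apple inc", "Intel"])
print(matches)  # [('Apple Inc.', 'apple inc')]
```

In a real HOC system the blocking rules and the matcher would themselves be learned with crowd input rather than hard-coded as here.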
TL;DR: In this paper, the authors show that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads, and that a lack of attention leads to a drop in uploads which in many cases asymptotes to no uploads whatsoever.
Abstract: We show through an analysis of a massive data set from YouTube that the productivity exhibited in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads. Conversely, a lack of attention leads to a decrease in the number of videos uploaded and the consequent drop in productivity, which in many cases asymptotes to no uploads whatsoever. Moreover, short-term contributors compare their performance to the average contributor's performance while long-term contributors compare it to their own median.
248 citations
01 Jan 2011
TL;DR: This paper presents an automated quality assurance process that is inexpensive and scalable, and finds that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
Abstract: Crowdsourcing is an effective tool for scalable data annotation in both research and enterprise contexts. Due to crowdsourcing's open participation model, quality assurance is critical to the success of any project. Present methods rely on EM-style post-processing or manual annotation of large gold standard sets. In this paper we present an automated quality assurance process that is inexpensive and scalable. Our novel process relies on programmatic gold creation to provide targeted training feedback to workers and to prevent common scamming scenarios. We find that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
247 citations
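The gold-based quality assurance the abstract describes can be sketched in miniature: known-answer ("gold") tasks are mixed into a worker's queue, and the worker's accuracy on gold decides whether their remaining answers are trusted. The task ids, answers, and 0.75 threshold below are illustrative assumptions, not the paper's parameters.

```python
# Hedged sketch of gold-question quality assurance for crowdsourced labeling.
# GOLD maps task ids to known correct answers; all names here are hypothetical.
GOLD = {"q1": "cat", "q4": "dog"}

def gold_accuracy(answers):
    """Fraction of gold tasks the worker answered correctly."""
    gold_items = [(tid, ans) for tid, ans in answers.items() if tid in GOLD]
    if not gold_items:
        return 0.0
    correct = sum(1 for tid, ans in gold_items if ans == GOLD[tid])
    return correct / len(gold_items)

def trusted_answers(answers, min_accuracy=0.75):
    """Keep a worker's non-gold answers only if they pass the gold check."""
    if gold_accuracy(answers) < min_accuracy:
        return {}
    return {tid: ans for tid, ans in answers.items() if tid not in GOLD}

worker = {"q1": "cat", "q2": "bird", "q3": "fish", "q4": "dog"}
print(trusted_answers(worker))  # {'q2': 'bird', 'q3': 'fish'}
```

The paper's "programmatic gold creation" goes further by generating such gold tasks automatically; the sketch above only shows how gold answers gate worker output.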
TL;DR: In this paper, a taxonomic theory of crowdsourcing is developed by organizing the empirical variants into nine distinct forms of crowdsourcing models, concentrating on the notion of managerial control systems.
Abstract: In this article, the authors first provide a practical yet rigorous definition of crowdsourcing that incorporates “crowds,” outsourcing, and social web technologies. They then analyze 103 well-known crowdsourcing web sites using content analysis methods and the hermeneutic reading principle. Based on their analysis, they develop a “taxonomic theory” of crowdsourcing by organizing the empirical variants in nine distinct forms of crowdsourcing models. They also discuss key issues and directions, concentrating on the notion of managerial control systems.
244 citations
TL;DR: The findings inform recent discussions about potential benefits from crowd science, suggest that involving the crowd may be more effective for some kinds of projects than others, provide guidance for project managers, and raise important questions for future research.
Abstract: Scientific research performed with the involvement of the broader public (the crowd) attracts increasing attention from scientists and policy makers. A key premise is that project organizers may be able to draw on underused human resources to advance research at relatively low cost. Despite a growing number of examples, systematic research on the effort contributions volunteers are willing to make to crowd science projects is lacking. Analyzing data on seven different projects, we quantify the financial value volunteers can bring by comparing their unpaid contributions with counterfactual costs in traditional or online labor markets. The volume of total contributions is substantial, although some projects are much more successful in attracting effort than others. Moreover, contributions received by projects are very uneven across time—a tendency toward declining activity is interrupted by spikes typically resulting from outreach efforts or media attention. Analyzing user-level data, we find that most contributors participate only once and with little effort, leaving a relatively small share of users who return responsible for most of the work. Although top contributor status is earned primarily through higher levels of effort, top contributors also tend to work faster. This speed advantage develops over multiple sessions, suggesting that it reflects learning rather than inherent differences in skills. Our findings inform recent discussions about potential benefits from crowd science, suggest that involving the crowd may be more effective for some kinds of projects than others, provide guidance for project managers, and raise important questions for future research.
243 citations
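Two of the abstract's quantities can be illustrated with toy arithmetic: the counterfactual value of volunteer effort (unpaid hours valued at a benchmark market wage) and the concentration of contributions among a small share of returning users. The hours and the $12/hour wage below are made-up numbers, not figures from the paper.

```python
# Illustrative sketch, with invented data, of the paper's two measurements:
# counterfactual cost of volunteer hours, and the share of work done by the
# top contributors.

def counterfactual_value(hours_per_user, hourly_wage):
    """Value total unpaid hours at a benchmark labor-market wage."""
    return sum(hours_per_user) * hourly_wage

def top_share(hours_per_user, top_fraction=0.1):
    """Share of total hours contributed by the top `top_fraction` of users."""
    ranked = sorted(hours_per_user, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

# Highly skewed, as the paper reports: one heavy contributor, many one-offs.
hours = [40, 5, 2, 1, 1, 1, 0.5, 0.5, 0.5, 0.5]
print(counterfactual_value(hours, 12.0))  # 624.0
print(round(top_share(hours), 3))  # 0.769 — top 10% of users do ~77% of work
```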