Topic

Crowdsourcing

About: Crowdsourcing is a research topic. Over the lifetime, 12,889 publications have been published within this topic, receiving 230,638 citations.


Papers
Journal ArticleDOI
TL;DR: In this article, the authors propose a crowdsourcing framework for decentralized federated learning (FL) that accounts for communication efficiency during parameter exchange, and formulate a two-stage Stackelberg game to find the game's equilibria.
Abstract: Federated learning (FL) rests on the notion of training a global model in a decentralized manner. Under this setting, mobile devices perform computations on their local data before uploading the required updates to improve the global model. However, when the participating clients implement an uncoordinated computation strategy, the difficulty is to handle the communication efficiency (i.e., the number of communications per iteration) while exchanging the model parameters during aggregation. Therefore, a key challenge in FL is how users participate to build a high-quality global model with communication efficiency. We tackle this issue by formulating a utility maximization problem, and propose a novel crowdsourcing framework to leverage FL that considers the communication efficiency during parameters exchange. First, we show an incentive-based interaction between the crowdsourcing platform and the participating client's independent strategies for training a global learning model, where each side maximizes its own benefit. We formulate a two-stage Stackelberg game to analyze such a scenario and find the game's equilibria. Second, we formalize an admission control scheme for participating clients to ensure a level of local accuracy. Simulation results demonstrate the efficacy of our proposed solution with up to 22% gain in the offered reward.

159 citations
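As an illustration of the two-stage interaction described in the abstract above, the sketch below has the platform (leader) post a reward rate and each client (follower) choose the effort level that maximizes its own utility; the platform then searches over reward rates for the one that maximizes its net benefit at those best responses. The quadratic client cost and the logarithmic valuation of aggregate effort are assumptions made for the sketch, not the utility functions from the paper.

```python
# Illustrative two-stage Stackelberg sketch for an FL crowdsourcing incentive:
# the platform (leader) posts a reward rate, each client (follower) picks an
# effort level that maximizes its own utility, and the platform searches for
# the reward rate that maximizes its net benefit given those best responses.
# The quadratic cost and log-valuation below are illustrative assumptions.

import math

client_costs = [0.8, 1.0, 1.3, 2.0]  # hypothetical per-client effort costs


def client_best_response(reward_rate: float, cost: float) -> float:
    """Follower stage: maximize reward_rate*e - cost*e^2, giving e* = reward_rate / (2*cost)."""
    return reward_rate / (2.0 * cost)


def platform_utility(reward_rate: float) -> float:
    """Leader stage: value of aggregate effort minus the total reward paid out."""
    efforts = [client_best_response(reward_rate, c) for c in client_costs]
    total_effort = sum(efforts)
    value = 10.0 * math.log(1.0 + total_effort)   # diminishing returns on model quality
    payout = reward_rate * total_effort
    return value - payout


if __name__ == "__main__":
    # Grid search over the leader's reward rate to approximate the Stackelberg equilibrium.
    best_r = max((r / 100.0 for r in range(1, 501)), key=platform_utility)
    print(f"approx. equilibrium reward rate: {best_r:.2f}, "
          f"platform utility: {platform_utility(best_r):.3f}")
```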

Proceedings ArticleDOI
24 Oct 2011
TL;DR: This paper uses behavioral observations (HIT completion time, fraction of useful labels, label accuracy) to define five worker types (Spammer, Sloppy, Incompetent, Competent, Diligent) and relates them to label accuracy and the `Big Five' personality dimensions.
Abstract: Crowdsourcing platforms offer unprecedented opportunities for creating evaluation benchmarks, but suffer from varied output quality from crowd workers who possess different levels of competence and aspiration. This raises new challenges for quality control and requires an in-depth understanding of how workers' characteristics relate to the quality of their work. In this paper, we use behavioral observations (HIT completion time, fraction of useful labels, label accuracy) to define five worker types: Spammer, Sloppy, Incompetent, Competent, Diligent. Using data collected from workers engaged in the crowdsourced evaluation of the INEX 2010 Book Track Prove It task, we relate the worker types to label accuracy and personality trait information along the `Big Five' personality dimensions. We expect that these new insights about the types of crowd workers and the quality of their work will inform how to design HITs to attract the best workers to a task and explain why certain HIT designs are more effective than others.

159 citations
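The worker typology above is derived empirically in the paper; the toy sketch below only illustrates how the three behavioral observations named in the abstract (HIT completion time, fraction of useful labels, label accuracy) could be turned into a rule-based worker classification. All thresholds and the exact mapping to the five types are hypothetical.

```python
# Toy rule-based classifier over the behavioral observations named in the abstract.
# Thresholds are invented for illustration; the paper derives its typology
# empirically from INEX 2010 Book Track Prove It data.

from dataclasses import dataclass


@dataclass
class WorkerStats:
    completion_time_s: float   # median seconds spent per HIT
    useful_fraction: float     # fraction of submitted labels that were usable
    accuracy: float            # agreement with gold-standard labels


def worker_type(w: WorkerStats) -> str:
    if w.useful_fraction < 0.2 and w.completion_time_s < 10:
        return "Spammer"       # very fast, mostly useless output
    if w.accuracy < 0.5 and w.useful_fraction < 0.6:
        return "Sloppy"        # careless, though not outright gaming the task
    if w.accuracy < 0.6:
        return "Incompetent"   # trying, but labels are unreliable
    if w.accuracy < 0.85:
        return "Competent"
    return "Diligent"          # careful work with high accuracy


print(worker_type(WorkerStats(completion_time_s=45, useful_fraction=0.95, accuracy=0.9)))
# -> Diligent
```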

Journal ArticleDOI
TL;DR: This paper reports on the first attempts to combine crowdsourcing and TREC: the aim is to validate the use of crowdsourcing for relevance assessment, using the Amazon Mechanical Turk crowdsourcing platform to run experiments on TREC data, evaluate the outcomes, and discuss the results.
Abstract: Crowdsourcing has recently gained a lot of attention as a tool for conducting different kinds of relevance evaluations. At a very high level, crowdsourcing describes the outsourcing of tasks to a large group of people instead of assigning such tasks to an in-house employee. This crowdsourcing approach makes it possible to conduct information retrieval experiments extremely fast, with good results at a low cost. This paper reports on the first attempts to combine crowdsourcing and TREC: our aim is to validate the use of crowdsourcing for relevance assessment. To this aim, we use the Amazon Mechanical Turk crowdsourcing platform to run experiments on TREC data, evaluate the outcomes, and discuss the results. We place emphasis on the experiment design, execution, and quality control to gather useful results, with particular attention to the issue of agreement among assessors. Our position, supported by the experimental results, is that crowdsourcing is a cheap, quick, and reliable alternative for relevance assessment.

159 citations
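A minimal sketch of the aggregation and agreement step discussed above: redundant Mechanical Turk relevance judgments are collapsed by majority vote, and pairwise percent agreement gives a rough measure of assessor consistency. The items and labels are invented; the paper's experiments use TREC topics and documents with their own quality controls.

```python
# Aggregate redundant crowd relevance judgments by majority vote and report
# per-item pairwise percent agreement as a simple consistency measure.

from collections import Counter
from itertools import combinations

# labels[(topic, doc)] = binary relevance judgments from different workers (invented data)
labels = {
    ("topic401", "doc12"): [1, 1, 0],
    ("topic401", "doc37"): [0, 0, 0],
    ("topic402", "doc05"): [1, 0, 1],
}


def majority_vote(judgments):
    """Collapse redundant judgments into a single relevance label."""
    return Counter(judgments).most_common(1)[0][0]


def pairwise_agreement(judgments):
    """Fraction of assessor pairs that gave the same label for one item."""
    pairs = list(combinations(judgments, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


for item, js in labels.items():
    print(item, "label:", majority_vote(js), "agreement:", round(pairwise_agreement(js), 2))
```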

Proceedings Article
01 May 2012
TL;DR: In this article, the authors explore the feasibility of a crowdsourced Sybil detection system for Online Social Networks (OSNs) and conduct a large user study on the ability of humans to detect today's Sybil accounts, using a large corpus of ground-truth Sybil accounts from the Facebook and Renren networks.
Abstract: As popular tools for spreading spam and malware, Sybils (or fake accounts) pose a serious threat to online communities such as Online Social Networks (OSNs). Today, sophisticated attackers are creating realistic Sybils that effectively befriend legitimate users, rendering most automated Sybil detection techniques ineffective. In this paper, we explore the feasibility of a crowdsourced Sybil detection system for OSNs. We conduct a large user study on the ability of humans to detect today’s Sybil accounts, using a large corpus of ground-truth Sybil accounts from the Facebook and Renren networks. We analyze detection accuracy by both “experts” and “turkers” under a variety of conditions, and find that while turkers vary significantly in their effectiveness, experts consistently produce near-optimal results. We use these results to drive the design of a multi-tier crowdsourcing Sybil detection system. Using our user study data, we show that this system is scalable, and can be highly effective either as a standalone system or as a complementary technique to current tools.

158 citations
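The sketch below illustrates the multi-tier idea from the abstract: a cheap turker tier votes first, strong-consensus cases are decided immediately, and only ambiguous profiles are escalated to a smaller, near-optimal expert tier. The accuracies, thresholds, and tier sizes are invented for the illustration and are not the values measured in the study.

```python
# Illustrative multi-tier crowdsourced Sybil-detection pipeline: turkers filter
# the obvious cases, ambiguous profiles escalate to experts. All rates are invented.

import random

random.seed(0)


def turker_vote(is_sybil: bool) -> bool:
    """One simulated turker: 70% hit rate on Sybils, 20% false-positive rate (assumed)."""
    return random.random() < (0.7 if is_sybil else 0.2)


def expert_vote(is_sybil: bool) -> bool:
    """One simulated expert: near-optimal accuracy (assumed)."""
    return random.random() < (0.98 if is_sybil else 0.02)


def classify(is_sybil: bool, n_turkers: int = 5, n_experts: int = 3) -> str:
    """Classify one profile by simulating votes against its (hidden) ground truth."""
    flags = sum(turker_vote(is_sybil) for _ in range(n_turkers))
    if flags >= 4:                      # strong turker consensus: flag without escalation
        return "sybil"
    if flags <= 1:                      # strong consensus the other way
        return "legitimate"
    # Ambiguous middle band goes to the smaller, more expensive expert tier.
    expert_flags = sum(expert_vote(is_sybil) for _ in range(n_experts))
    return "sybil" if expert_flags * 2 > n_experts else "legitimate"


profiles = [True] * 50 + [False] * 50   # hypothetical ground truth
correct = sum(classify(p) == ("sybil" if p else "legitimate") for p in profiles)
print(f"accuracy on {len(profiles)} simulated profiles: {correct / len(profiles):.2f}")
```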

Journal ArticleDOI
TL;DR: Crowdsourcing the analysis of complex and massive data has emerged as a framework to find robust methodologies to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories.
Abstract: Considerable resources are required to gain maximal insights into the diverse big data sets in biomedicine. In this Review, the authors discuss how crowdsourcing, in the form of collaborative competitions (known as Challenges), can engage the scientific community to provide the diverse expertise and methodological approaches that can robustly address some of the most pressing questions in genetics, genomics and biomedical sciences. The generation of large-scale biomedical data is creating unprecedented opportunities for basic and translational science. Typically, the data producers perform initial analyses, but it is very likely that the most informative methods may reside with other groups. Crowdsourcing the analysis of complex and massive data has emerged as a framework to find robust methodologies. When the crowdsourcing is done in the form of collaborative scientific competitions, known as Challenges, the validation of the methods is inherently addressed. Challenges also encourage open innovation, create collaborative communities to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories.

158 citations


Network Information
Related Topics (5)
Social network: 42.9K papers, 1.5M citations (87% related)
User interface: 85.4K papers, 1.7M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Cluster analysis: 146.5K papers, 2.9M citations (85% related)
The Internet: 213.2K papers, 3.8M citations (85% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    637
2022    1,420
2021    996
2020    1,250
2019    1,341
2018    1,396