Open Access, Posted Content

Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing

TLDR
Finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model are provided. They can be used to analyze many aggregation methods, including majority voting, weighted majority voting and the oracle Maximum A Posteriori rule.
Abstract
Crowdsourcing has become an effective and popular tool for human-powered computation to label large datasets. Since the workers can be unreliable, it is common in crowdsourcing to assign multiple workers to one task, and to aggregate the labels in order to obtain results of high quality. In this paper, we provide finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model. The bounds are derived for multi-class labeling, and can be used to analyze many aggregation methods, including majority voting, weighted majority voting and the oracle Maximum A Posteriori (MAP) rule. We show that the oracle MAP rule approximately optimizes our upper bound on the mean error rate of weighted majority voting in certain settings. We propose an iterative weighted majority voting (IWMV) method that optimizes the error rate bound and approximates the oracle MAP rule. Its one-step version has a provable theoretical guarantee on the error rate. The IWMV method is intuitive and computationally simple. Experimental results on simulated and real data show that IWMV performs at least on par with the state-of-the-art methods while having a much lower computational cost (around one hundred times faster).
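
The abstract describes IWMV only at a high level, so the following is a minimal sketch of the general idea rather than the authors' exact procedure. It assumes a weight update of the form weight = L * (agreement rate) - 1, where L is the number of classes and the agreement rate is the fraction of a worker's labels that match the current aggregated labels. That update rule, the function names `weighted_vote` and `iwmv`, the data layout, and the stopping criterion are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(labels, weights, num_classes):
    """Aggregate labels per item by weighted majority voting.

    labels: dict mapping item -> list of (worker, label) pairs,
            with labels encoded as integers in range(num_classes).
    weights: dict mapping worker -> float weight.
    """
    result = {}
    for item, votes in labels.items():
        scores = [0.0] * num_classes
        for worker, label in votes:
            scores[label] += weights[worker]
        # Ties are broken in favor of the smallest class index.
        result[item] = max(range(num_classes), key=lambda k: scores[k])
    return result


def iwmv(labels, num_classes, max_iters=10):
    """Alternate between voting and re-weighting workers (assumed update rule)."""
    workers = {w for votes in labels.values() for w, _ in votes}
    weights = {w: 1.0 for w in workers}  # equal weights: plain majority voting
    predicted = weighted_vote(labels, weights, num_classes)
    for _ in range(max_iters):
        agree = defaultdict(int)
        total = defaultdict(int)
        for item, votes in labels.items():
            for worker, label in votes:
                total[worker] += 1
                agree[worker] += int(label == predicted[item])
        # Assumed rule: a worker at chance level (1 / num_classes) gets weight 0.
        weights = {w: num_classes * agree[w] / total[w] - 1.0 for w in workers}
        updated = weighted_vote(labels, weights, num_classes)
        if updated == predicted:  # stop once the aggregated labels are stable
            break
        predicted = updated
    return predicted, weights


# Toy data: workers "a" and "b" always agree, worker "c" always disagrees.
toy = {
    0: [("a", 0), ("b", 0), ("c", 1)],
    1: [("a", 1), ("b", 1), ("c", 0)],
    2: [("a", 0), ("b", 0), ("c", 1)],
    3: [("a", 1), ("b", 1), ("c", 0)],
}
toy_predicted, toy_weights = iwmv(toy, num_classes=2)
```

On this toy input the consistently disagreeing worker ends up with a negative weight, so its votes count against the label it reports. Each iteration is a single pass over the collected labels, which is in keeping with the abstract's description of the method as computationally simple.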


Citations
Journal Article

Learning from crowdsourced labeled data: a survey

TL;DR: This survey introduces the basic concepts of label quality and learning models, and presents openly accessible real-world data sets collected from crowdsourcing systems as well as open source libraries and tools.
Proceedings Article

Max-margin majority voting for learning from crowds

TL;DR: This paper presents max-margin majority voting (M$^3$V) to improve the discriminative ability of majority voting and further presents a Bayesian generalization to incorporate the flexibility of generative methods on modeling noisy observations with worker confusion matrices for different application settings.
Journal Article

Crowd intelligence in AI 2.0 era

TL;DR: This paper describes the concept of crowd intelligence, and explains its relationship to the existing related concepts, e.g., crowdsourcing and human computation, and introduces four categories of representative crowd intelligence platforms.
Journal Article

Max-Margin Majority Voting for Learning from Crowds

TL;DR: In this paper, the authors proposed a max-margin majority voting (M$^3$V) method to improve the discriminative ability of majority voting and further presented a Bayesian generalization to incorporate the flexibility of generative methods on modeling noisy observations with worker confusion matrices for different application settings.
Journal Article

Domain-Weighted Majority Voting for Crowdsourcing

TL;DR: This paper proposes to learn the weights for weighted MV by exploiting the expertise of annotators, modeling the domain knowledge of different annotators with different distributions and treating the crowdsourcing problem as a domain adaptation problem.
References
Journal Article

A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models

TL;DR: In this paper, the authors describe the EM algorithm for finding the parameters of a mixture of Gaussian densities and a hidden Markov model (HMM) for both discrete and Gaussian mixture observation models.
Proceedings Article

Cheap and Fast -- But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks

TL;DR: This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Book

Essai Sur L'Application de L'Analyse a la Probabilite Des Decisions Rendues a la Pluralite Des Voix

TL;DR: Condorcet's paradox (the non-transitivity of majority preferences) is seen as the direct ancestor of Arrow's paradox as discussed by the authors, and it was rediscovered as a foundational work in the theory of voting and societal preferences.