Proceedings ArticleDOI

Semi-supervised learning using randomized mincuts

TLDR
Experiments on several datasets show that when the structure of the graph supports small cuts, the randomized-mincut approach can yield highly accurate classifiers with good accuracy/coverage tradeoffs; the approach can also be given theoretical justification from both a Markov random field perspective and from sample complexity considerations.
Abstract
In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for utilizing this unlabeled data is to construct a graph on all the data points based on distance relationships among examples, and then to use the known labels to perform some type of graph partitioning. One natural partitioning to use is the minimum cut that agrees with the labeled data (Blum & Chawla, 2001), which can be thought of as giving the most probable label assignment if one views labels as generated according to a Markov Random Field on the graph. Zhu et al. (2003) propose a cut based on a relaxation of this field, and Joachims (2003) gives an algorithm based on finding an approximate min-ratio cut.

In this paper, we extend the mincut approach by adding randomness to the graph structure. The resulting algorithm addresses several shortcomings of the basic mincut approach, and can be given theoretical justification from both a Markov random field perspective and from sample complexity considerations. In cases where the graph does not have small cuts for a given classification problem, randomization may not help. However, our experiments on several datasets show that when the structure of the graph supports small cuts, this can result in highly accurate classifiers with good accuracy/coverage tradeoffs. In addition, we are able to achieve good performance with a very simple graph-construction procedure.
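The approach the abstract describes can be sketched in a few dozen lines: connect labeled nodes to an auxiliary source/sink with infinite-capacity edges, take the minimum s-t cut, then repeat over randomly perturbed edge weights and vote. This is a minimal illustrative sketch, not the paper's exact procedure — the multiplicative weight noise, the vote threshold, and all function names here are assumptions made for the example.

```python
import random
from collections import defaultdict, deque

def mincut_labels(n, edges, pos, neg):
    """Label nodes 0..n-1 by a minimum s-t cut that agrees with the
    labeled examples; `pos`/`neg` are lists of labeled node indices."""
    INF = float("inf")
    S, T = n, n + 1                      # auxiliary source and sink
    cap = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:                # undirected similarity edges
        cap[u][v] += w
        cap[v][u] += w
    for p in pos:                        # clamp labeled nodes with
        cap[S][p] = INF                  # infinite-capacity edges
    for q in neg:
        cap[q][T] = INF

    def augmenting_path():               # BFS in the residual graph
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == T:
                        return parent
                    queue.append(v)
        return None

    while True:                          # Edmonds-Karp max flow
        parent = augmenting_path()
        if parent is None:
            break
        bottleneck, v = INF, T
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck      # saturate forward edge
            cap[v][u] += bottleneck      # open residual edge
            v = u

    reachable = {S}                      # source side of the min cut
    queue = deque([S])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 1e-12 and v not in reachable:
                reachable.add(v)
                queue.append(v)
    return [1 if i in reachable else 0 for i in range(n)]

def randomized_mincut(n, edges, pos, neg, trials=20, noise=0.5, seed=0):
    """Perturb edge weights and vote across repeated mincuts; returns
    the fraction of runs labeling each node positive (values near 0.5
    indicate low confidence, enabling an accuracy/coverage tradeoff)."""
    rng = random.Random(seed)
    votes = [0] * n
    for _ in range(trials):
        jittered = [(u, v, w * (1 + noise * rng.random()))
                    for u, v, w in edges]
        for i, lab in enumerate(mincut_labels(n, jittered, pos, neg)):
            votes[i] += lab
    return [v / trials for v in votes]
```

On a six-node chain whose edges all have weight 10 except a weak weight-1 edge between nodes 2 and 3, labeling node 0 positive and node 5 negative, `mincut_labels` cuts the weak edge and returns `[1, 1, 1, 0, 0, 0]`; every perturbed run agrees, so `randomized_mincut` assigns confidence 1.0 on one side and 0.0 on the other.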



Citations
Book

Sentiment Analysis and Opinion Mining

TL;DR: Sentiment analysis and opinion mining, as discussed by the authors, is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language; it is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining.
BookDOI

Semi-Supervised Learning

TL;DR: Semi-supervised learning (SSL), as discussed by the authors, is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given).
Posted Content

Semi-Supervised Learning with Deep Generative Models

TL;DR: It is shown that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Book

Introduction to Semi-Supervised Learning

TL;DR: This introductory book presents some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi- supervised support vector machines, and discusses their basic mathematical formulation.
References
Proceedings Article

Semi-supervised learning using Gaussian fields and harmonic functions

TL;DR: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.
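The Gaussian fields / harmonic functions idea summarized above admits a very short iterative sketch: clamp labeled nodes to their 0/1 values and repeatedly replace each unlabeled node's score with the weighted average of its neighbors'. This is an illustrative Gauss-Seidel relaxation under assumed names and a small dense graph, not the cited paper's implementation (which solves the harmonic system in closed form).

```python
def harmonic_labels(n, edges, labeled, iters=500):
    """Relax toward the harmonic function on the graph: labeled nodes
    (a dict node -> 0.0/1.0) stay clamped, every other node's score is
    repeatedly set to the weighted average of its neighbors' scores."""
    w = [[0.0] * n for _ in range(n)]    # dense weight matrix (small n)
    for u, v, wt in edges:
        w[u][v] += wt
        w[v][u] += wt
    f = [labeled.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        for i in range(n):
            if i in labeled:
                continue                 # clamped to its known label
            deg = sum(w[i])
            if deg > 0:
                f[i] = sum(w[i][j] * f[j] for j in range(n)) / deg
    return f                             # threshold at 0.5 to classify
```

On a five-node path with unit weights, labeling node 0 as 1.0 and node 4 as 0.0, the harmonic solution interpolates linearly, so the scores converge to approximately `[1.0, 0.75, 0.5, 0.25, 0.0]`.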
Journal ArticleDOI

A database for handwritten text recognition research

TL;DR: An image database for handwritten text recognition research is described; built to overcome the limitations of earlier databases, it contains digital images of approximately 5000 city names, 5000 state names, 10000 ZIP Codes, and 50000 alphanumeric characters.
Proceedings ArticleDOI

Learning from Labeled and Unlabeled Data using Graph Mincuts

TL;DR: An algorithm based on finding minimum cuts in graphs is considered, which uses pairwise relationships among the examples in order to learn from both labeled and unlabeled data.
Proceedings Article

Transductive learning via spectral graph partitioning

TL;DR: This work proposes an algorithm that robustly achieves good generalization performance and can be trained efficiently; it shows a connection to transductive Support Vector Machines and shows that an effective Co-Training algorithm arises as a special case.