Proceedings ArticleDOI
Semi-supervised learning using randomized mincuts
Avrim Blum,John Lafferty,Mugizi Robert Rwebangira,Raj Reddy +3 more
- Vol. 69, pp 13
Reads0
Chats0
TLDR
The experiments on several datasets show that when the structure of the graph supports small cuts, this can result in highly accurate classifiers with good accuracy/coverage tradeoffs, and can be given theoretical justification from both a Markov random field perspective and from sample complexity considerations.Abstract:
In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for utilizing this unlabeled data is to construct a graph on all the data points based on distance relationships among examples, and then to use the known labels to perform some type of graph partitioning. One natural partitioning to use is the minimum cut that agrees with the labeled data (Blum & Chawla, 2001), which can be thought of as giving the most probable label assignment if one views labels as generated according to a Markov Random Field on the graph. Zhu et al. (2003) propose a cut based on a relaxation of this field, and Joachims (2003) gives an algorithm based on finding an approximate min-ratio cut.In this paper, we extend the mincut approach by adding randomness to the graph structure. The resulting algorithm addresses several short-comings of the basic mincut approach, and can be given theoretical justification from both a Markov random field perspective and from sample complexity considerations. In cases where the graph does not have small cuts for a given classification problem, randomization may not help. However, our experiments on several datasets show that when the structure of the graph supports small cuts, this can result in highly accurate classifiers with good accuracy/coverage tradeoffs. In addition, we are able to achieve good performance with a very simple graph-construction procedure.read more
Citations
More filters
Book
Sentiment Analysis and Opinion Mining
TL;DR: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language as discussed by the authors and is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining.
BookDOI
Semi-Supervised Learning
TL;DR: Semi-supervised learning (SSL) as discussed by the authors is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised training (where no label data are given).
Posted Content
Semi-Supervised Learning with Deep Generative Models
TL;DR: It is shown that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Book
Introduction to Semi-Supervised Learning
TL;DR: This introductory book presents some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi- supervised support vector machines, and discusses their basic mathematical formulation.
References
More filters
Proceedings Article
Semi-supervised learning using Gaussian fields and harmonic functions
TL;DR: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.
Journal ArticleDOI
A database for handwritten text recognition research
TL;DR: An image database for handwritten text recognition research is described that contains digital images of approximately 5000 city names, 5000 state names, 10000 ZIP Codes, and 50000 alphanumeric characters to overcome the limitations of earlier databases.
Journal ArticleDOI
Exact Maximum A Posteriori Estimation for Binary Images
Proceedings ArticleDOI
Learning from Labeled and Unlabeled Data using Graph Mincuts
Avrim Blum,Shuchi Chawla +1 more
TL;DR: An algorithm based on finding minimum cuts in graphs, that uses pairwise relationships among the examples in order to learn from both labeled and unlabeled data is considered.
Proceedings Article
Transductive learning via spectral graph partitioning
TL;DR: This work proposes an algorithm that robustly achieves good generalization performance and that can be trained efficiently, and shows a connection to transductive Support Vector Machines, and that an effective Co-Training algorithm arises as a special case.