Semi-supervised learning with graphs

Open Access

TLDR

This work presents a series of novel semi-supervised learning approaches arising from a graph representation, where labeled and unlabeled instances are represented as vertices and edges encode the similarity between instances.

Abstract:

In traditional machine learning approaches to classification, one uses only a labeled set to train the classifier. Labeled instances, however, are often difficult, expensive, or time-consuming to obtain, as they require the efforts of experienced human annotators. Meanwhile, unlabeled data may be relatively easy to collect, but there have been few ways to use them. Semi-supervised learning addresses this problem by using a large amount of unlabeled data, together with the labeled data, to build better classifiers. Because semi-supervised learning requires less human effort and gives higher accuracy, it is of great interest both in theory and in practice. We present a series of novel semi-supervised learning approaches arising from a graph representation, where labeled and unlabeled instances are represented as vertices, and edges encode the similarity between instances. They address the following questions: How to use unlabeled data? (label propagation); What is the probabilistic interpretation? (Gaussian fields and harmonic functions); What if we can choose labeled data? (active learning); How to construct good graphs? (hyperparameter learning); How to work with kernel machines like SVM? (graph kernels); How to handle complex data like sequences? (kernel conditional random fields); How to handle scalability and induction? (harmonic mixtures). An extensive literature review is included at the end.
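
The abstract names the graph representation and the harmonic-function view but does not spell out a procedure, so the following is only a minimal sketch under stated assumptions, not the author's implementation. It assumes a fully connected graph with Gaussian (RBF) edge weights, labeled vertices listed first, and the commonly used closed form in which the scores of unlabeled vertices solve (D_uu - W_uu) F_u = W_ul F_l. The function names `rbf_weights` and `harmonic_labels`, the bandwidth `sigma`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_weights(X, sigma=1.0):
    """Dense similarity graph with w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    The RBF weighting and the bandwidth sigma are assumptions made for this sketch.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def harmonic_labels(W, y_labeled, n_classes):
    """Class scores for unlabeled vertices, assuming the first len(y_labeled)
    rows and columns of W correspond to the labeled vertices."""
    l = len(y_labeled)
    laplacian = np.diag(W.sum(axis=1)) - W   # graph Laplacian D - W
    L_uu = laplacian[l:, l:]                 # unlabeled-unlabeled block
    W_ul = W[l:, :l]                         # unlabeled-labeled weights
    F_l = np.eye(n_classes)[y_labeled]       # one-hot labels, clamped
    # Solve (D_uu - W_uu) F_u = W_ul F_l rather than forming an explicit inverse.
    return np.linalg.solve(L_uu, W_ul @ F_l)

# Toy data: two labeled points (one per class) followed by four unlabeled points.
X = np.array([[0.0, 0.0], [5.0, 5.0],
              [0.2, 0.1], [0.1, 0.3], [4.9, 5.2], [5.1, 4.8]])
y_labeled = np.array([0, 1])

W = rbf_weights(X, sigma=1.0)
F_u = harmonic_labels(W, y_labeled, n_classes=2)
predicted = F_u.argmax(axis=1)  # hard labels for the unlabeled points
```

In this sketch each unlabeled vertex ends up with a score that is the weighted average of its neighbors' scores, so labels spread along strong edges from the labeled vertices. A dense solve is cubic in the number of unlabeled points, which is one reason scalability appears as a separate question in the abstract.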
