A theory of learning from different domains
Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, Jennifer Wortman Vaughan
TL;DR
A classifier-induced divergence measure is given that can be estimated from finite, unlabeled samples from the domains, and it is shown how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class.
Abstract
Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. Often, however, we have plentiful labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and little or no labeled training data. In this work we investigate two questions. First, under what conditions can a classifier trained from source data be expected to perform well on target data? Second, given a small amount of labeled target data, how should we combine it during training with the large amount of labeled source data to achieve the lowest target error at test time?
We address the first question by bounding a classifier's target error in terms of its source error and the divergence between the two domains. We give a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains. Under the assumption that there exists some hypothesis that performs well in both domains, we show that this quantity together with the empirical source error characterize the target error of a source-trained classifier.
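As a minimal illustration of how such a divergence can be estimated from finite, unlabeled samples, the sketch below follows the common "proxy A-distance" recipe over a toy hypothesis class of one-dimensional threshold rules: train a classifier to tell the two domains apart, and convert its best achievable error into a divergence score. The function names are hypothetical, and this is an assumption-laden toy, not the paper's exact procedure.

```python
# Illustrative sketch (not the paper's exact procedure): estimating a
# classifier-induced divergence between two unlabeled 1-D samples via
# the "proxy A-distance" recipe, with threshold rules as the
# hypothesis class. All names here are hypothetical.

def domain_error(threshold, source, target):
    """Error of the rule 'x >= threshold -> target' when source points
    are labeled 0 and target points are labeled 1."""
    mistakes = sum(1 for x in source if x >= threshold)   # source mistaken for target
    mistakes += sum(1 for x in target if x < threshold)   # target mistaken for source
    return mistakes / (len(source) + len(target))

def proxy_a_distance(source, target):
    """2 * (1 - 2 * err), where err is the best domain-classification
    error over all thresholds and their negations. Close to 0 when the
    samples are indistinguishable, close to 2 when they are separable."""
    candidates = sorted(set(source) | set(target))
    best = min(
        min(domain_error(t, source, target),
            1.0 - domain_error(t, source, target))  # allow the flipped rule
        for t in candidates
    )
    return 2.0 * (1.0 - 2.0 * best)
```

On identical samples the best domain classifier does no better than chance, so the estimate is near 0; on well-separated samples a threshold classifies the domains perfectly and the estimate approaches 2.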
We answer the second question by bounding the target error of a model which minimizes a convex combination of the empirical source and target errors. Previous theoretical work has considered minimizing just the source error, just the target error, or weighting instances from the two domains equally. We show how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class. The resulting bound generalizes the previously studied cases and is always at least as tight as a bound which considers minimizing only the target error or an equal weighting of source and target errors.
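The convex combination described above can be sketched explicitly. Writing $\hat{\epsilon}_S, \hat{\epsilon}_T$ for the empirical source and target errors and $\alpha \in [0,1]$ for the target weight (consult the paper for the precise theorem statements):

```latex
\hat{\epsilon}_\alpha(h) = \alpha\,\hat{\epsilon}_T(h) + (1-\alpha)\,\hat{\epsilon}_S(h),
\qquad
\bigl|\epsilon_\alpha(h) - \epsilon_T(h)\bigr|
  \le (1-\alpha)\Bigl(\tfrac{1}{2}\,d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S,\mathcal{D}_T) + \lambda\Bigr),
```

where $\lambda = \min_{h} \bigl(\epsilon_S(h) + \epsilon_T(h)\bigr)$ is the combined error of the ideal joint hypothesis: the penalty for leaning on source data shrinks as $\alpha \to 1$, and vanishes entirely when the domains coincide.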
Citations
Book Chapter
Domain-adversarial training of neural networks
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky
TL;DR: In this article, a new representation-learning approach for domain adaptation is proposed for the setting where data at training and test time come from similar but different distributions; the approach promotes features that cannot discriminate between the training (source) and test (target) domains while remaining discriminative for the main learning task on the source domain.
Posted Content
Learning Transferable Features with Deep Adaptation Networks
TL;DR: A new Deep Adaptation Network (DAN) architecture is proposed, which generalizes deep convolutional neural networks to the domain adaptation scenario, learns transferable features with statistical guarantees, and scales linearly via an unbiased estimate of the kernel embedding.
Posted Content
Unsupervised Domain Adaptation by Backpropagation
Yaroslav Ganin, Victor Lempitsky
TL;DR: In this paper, a gradient reversal layer is proposed to promote the emergence of deep features that are discriminative for the main learning task on the source domain and invariant with respect to the shift between the domains.
Proceedings Article
Unsupervised Domain Adaptation by Backpropagation
Yaroslav Ganin, Victor Lempitsky
TL;DR: The method performs very well in a series of image classification experiments, achieving an adaptation effect in the presence of large domain shifts and outperforming the previous state-of-the-art on the Office datasets.
Proceedings Article
Deep transfer learning with joint adaptation networks
TL;DR: JAN aligns the joint distributions of multiple domain-specific layers across domains based on a joint maximum mean discrepancy (JMMD) criterion, making the distributions of the source and target domains more indistinguishable.
References
Statistical learning theory
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Journal Article
Sample Selection Bias as a Specification Error
TL;DR: In this article, the bias that results from using nonrandomly selected samples to estimate behavioral relationships is treated as an ordinary specification error, or "omitted variables" bias, and the asymptotic distribution of the estimator is derived.
Thumbs up? Sentiment Classification using Machine Learning Techniques
TL;DR: In this paper, the problem of classifying documents not by topic but by overall sentiment, e.g., determining whether a review is positive or negative, was considered, and three machine learning methods (naive Bayes, maximum entropy classification, and support vector machines) were employed.
Proceedings Article
Thumbs up? Sentiment Classification using Machine Learning Techniques
TL;DR: This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.
Posted Content
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
TL;DR: A simple unsupervised learning algorithm classifies a review as recommended (thumbs up) or not recommended (thumbs down) according to whether the average semantic orientation of its phrases is positive or negative.