Open Access, Journal Article

A theory of learning from different domains

TLDR
This work gives a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains, and shows how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class.
Abstract
Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. Often, however, we have plentiful labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and little or no labeled training data. In this work we investigate two questions. First, under what conditions can a classifier trained from source data be expected to perform well on target data? Second, given a small amount of labeled target data, how should we combine it during training with the large amount of labeled source data to achieve the lowest target error at test time? We address the first question by bounding a classifier's target error in terms of its source error and the divergence between the two domains. We give a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains. Under the assumption that there exists some hypothesis that performs well in both domains, we show that this quantity together with the empirical source error characterize the target error of a source-trained classifier. We answer the second question by bounding the target error of a model which minimizes a convex combination of the empirical source and target errors. Previous theoretical work has considered minimizing just the source error, just the target error, or weighting instances from the two domains equally. We show how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class. The resulting bound generalizes the previously studied cases and is always at least as tight as a bound which considers minimizing only the target error or an equal weighting of source and target errors.
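
The two quantities named in the abstract can be sketched in a few lines. The sketch below is illustrative only and is not the authors' code. It assumes a hypothesis class of one-dimensional threshold classifiers (a hypothetical choice made here so the minimization can be done by enumeration), and the function names `empirical_h_divergence` and `alpha_error` are invented for this example. For a symmetric hypothesis class, the divergence between two unlabeled samples is estimated as 2(1 - min over h of [fraction of source points h assigns to the target + fraction of target points h assigns to the source]), so a classifier that separates the domains perfectly gives a value of 2 and indistinguishable samples give a value near 0.

```python
from typing import Callable, Sequence, Tuple


def empirical_h_divergence(source: Sequence[float], target: Sequence[float]) -> float:
    """Estimate the classifier-induced divergence between two unlabeled samples.

    Assumption: the hypothesis class is 1-D thresholds h(x) = [x > t] together
    with their complements, so the class is symmetric.
    """
    candidates = [float("-inf")] + sorted(set(source) | set(target))
    best = 1.0  # a hypothesis or its complement always reaches at most 1.0
    for t in candidates:
        # h(x) = 1 means "predicted source", h(x) = 0 means "predicted target".
        src_err = sum(1 for x in source if x <= t) / len(source)
        tgt_err = sum(1 for x in target if x > t) / len(target)
        total = src_err + tgt_err
        # The complement classifier has summed error 2 - total.
        best = min(best, total, 2.0 - total)
    return 2.0 * (1.0 - best)


def alpha_error(
    h: Callable[[float], int],
    source: Sequence[Tuple[float, int]],
    target: Sequence[Tuple[float, int]],
    alpha: float,
) -> float:
    """Convex combination alpha * target error + (1 - alpha) * source error."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    err_s = sum(1 for x, y in source if h(x) != y) / len(source)
    err_t = sum(1 for x, y in target if h(x) != y) / len(target)
    return alpha * err_t + (1.0 - alpha) * err_s
```

Setting `alpha = 0` recovers training on source error alone and `alpha = 1` recovers training on target error alone, which are the special cases the abstract says the combined bound generalizes. How to pick `alpha` from the divergence, sample sizes, and hypothesis-class complexity is the subject of the paper's bound and is not reproduced here.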


Citations
Book Chapter

Domain-adversarial training of neural networks

TL;DR: In this article, a new representation learning approach is proposed for domain adaptation, in which data at training and test time come from similar but different distributions. The approach promotes features that are discriminative for the main learning task on the source domain but cannot discriminate between the training (source) and test (target) domains.
Posted Content

Learning Transferable Features with Deep Adaptation Networks

TL;DR: A new Deep Adaptation Network (DAN) architecture is proposed, which generalizes deep convolutional neural networks to the domain adaptation scenario, can learn transferable features with statistical guarantees, and scales linearly via an unbiased estimate of the kernel embedding.
Posted Content

Unsupervised Domain Adaptation by Backpropagation

TL;DR: In this paper, a gradient reversal layer is proposed to promote the emergence of deep features that are discriminative for the main learning task on the source domain and invariant with respect to the shift between the domains.
Proceedings Article

Unsupervised Domain Adaptation by Backpropagation

TL;DR: The method performs very well in a series of image classification experiments, achieving adaptation effect in the presence of big domain shifts and outperforming previous state-of-the-art on Office datasets.
Proceedings Article

Deep transfer learning with joint adaptation networks

TL;DR: Joint Adaptation Networks (JAN) align the joint distributions of multiple domain-specific layers across domains based on a joint maximum mean discrepancy (JMMD) criterion; an adversarial training strategy maximizes JMMD to make the distributions of the source and target domains more distinguishable.
References
Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Journal Article

Sample Selection Bias as a Specification Error

James J. Heckman
- 01 Jan 1979 - 
TL;DR: In this article, the bias that results from using non-randomly selected samples to estimate behavioral relationships is discussed as an ordinary specification error or "omitted variables" bias, and the asymptotic distribution of the estimator is derived.

Thumbs up? Sentiment Classification using Machine Learning Techniques

TL;DR: In this paper, the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, was considered and three machine learning methods (Naive Bayes, maximum entropy classification, and support vector machines) were employed.
Proceedings Article

Thumbs up? Sentiment Classification using Machine Learning Techniques

TL;DR: This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.
Posted Content

Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews

TL;DR: A simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). A review is classified as recommended if the average semantic orientation of its phrases is positive.