Open Access · Posted Content

Towards a Theoretical Framework of Out-of-Distribution Generalization

TL;DR
This article introduces a new concept, the expansion function, which characterizes the extent to which variance is amplified in the test domains relative to the training domains, thereby giving a quantitative meaning to invariant features.
Abstract
Generalization to out-of-distribution (OOD) data, or domain generalization, is one of the central problems in modern machine learning. Recently, there has been a surge of attempts to propose algorithms for OOD generalization, most of which build upon the idea of extracting invariant features. Although intuitively reasonable, theoretical understanding of what kind of invariance can guarantee OOD generalization is still limited, and generalization to arbitrary out-of-distribution data is clearly impossible. In this work, we take the first step towards rigorous and quantitative definitions of 1) what OOD is; and 2) what it means for an OOD problem to be learnable. We also introduce a new concept, the expansion function, which characterizes to what extent the variance is amplified in the test domains over the training domains, and therefore gives a quantitative meaning to invariant features. Based on these, we prove OOD generalization error bounds. It turns out that OOD generalization largely depends on the expansion function. As recently pointed out by Gulrajani and Lopez-Paz (2020), any OOD learning algorithm without a model selection module is incomplete. Our theory naturally induces a model selection criterion. Extensive experiments on benchmark OOD datasets demonstrate that our model selection criterion has a significant advantage over baselines.
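As a rough sketch of the expansion-function idea (the notation below is illustrative and follows one reading of the abstract, not necessarily the paper's exact formulation): a function s : [0, ∞) → [0, ∞) is an expansion function if it is monotonically increasing, satisfies s(x) ≥ x, and s(x) → 0 as x → 0⁺. Writing V(φ, E) for the variation of a feature φ across a set of domains E, an OOD problem with available training domains E_avail inside a full domain set E_all is learnable when

\[
\mathcal{V}(\phi, \mathcal{E}_{\mathrm{all}}) \le s\big(\mathcal{V}(\phi, \mathcal{E}_{\mathrm{avail}})\big) \quad \text{for all features } \phi,
\]

so the amplification of variation from training to test domains is uniformly controlled by s; the smaller the expansion, the tighter the resulting OOD generalization error bounds.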


Citations
Posted Content

Learning Causal Semantic Representation for Out-of-Distribution Prediction

TL;DR: It is proved that, under proper conditions, CSG identifies the semantic factor by fitting the training data, and that this semantic identification guarantees bounded OOD generalization error and successful adaptation.
Posted Content

Towards Out-Of-Distribution Generalization: A Survey

TL;DR: The authors provide a formal definition of the OOD problem and a comprehensive survey of OOD generalization methods, also available at http://out-of-distributiongeneralization.com.
Journal Article

Out-of-Distribution (OOD) Detection Based on Deep Learning: A Review

Peng Cui, +1 more · 28 Oct 2022
TL;DR: In this paper, the authors classify OOD detection methods into supervised, semi-supervised, and unsupervised methods according to the training data; where supervised data are used, the methods are further categorized by technical means: model-based, distance-based, and density-based.
Posted Content

Quantifying and Improving Transferability in Domain Generalization

TL;DR: In this article, the authors formally define a notion of transferability that can be quantified and computed in domain generalization, prove that this transferability can be estimated with enough samples, and give a new upper bound on the target error based on it.
Posted Content

A Theory of Label Propagation for Subpopulation Shift

TL;DR: In this paper, a provably effective framework for domain adaptation based on label propagation is proposed; it not only propagates labels to the target domain but also improves upon the teacher classifier trained on the source domain.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; their entry won 1st place on the ILSVRC 2015 classification task.
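For context, the core formulation of residual learning is compact: rather than asking a stack of layers to fit a desired mapping H(x) directly, the layers fit the residual F(x) := H(x) − x, and a building block computes

\[
y = \mathcal{F}(x, \{W_i\}) + x,
\]

where the identity shortcut x is added back to the learned residual \mathcal{F}, making very deep networks easier to optimize.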
Journal Article

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition; it synthesizes a complex decision surface capable of classifying high-dimensional patterns such as handwritten characters.
Book

The Mathematics of Computerized Tomography

TL;DR: In this paper, the Radon transform and related transforms are studied with respect to stability, sampling, resolution, and accuracy, and considerable attention is given to the derivation, analysis, and practical examination of reconstruction algorithms, for both standard problems and problems with incomplete data.
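For context (a standard definition, not specific to this summary), the central object of the book is the Radon transform, which in the two-dimensional setting of computerized tomography integrates a function along lines:

\[
(Rf)(\theta, s) = \int_{x \cdot \theta = s} f(x)\, \mathrm{d}x, \qquad \theta \in S^1,\ s \in \mathbb{R},
\]

i.e. the integral of f over the line with unit normal θ at signed distance s from the origin; reconstruction algorithms aim to recover f from (possibly incomplete) samples of Rf.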
Journal Article

A theory of learning from different domains

TL;DR: A classifier-induced divergence measure is introduced that can be estimated from finite, unlabeled samples from the two domains, and it is shown how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class.
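The paper's well-known generalization bound makes this concrete (stated informally here, omitting the finite-sample confidence terms): for any hypothesis h in the class H, the target-domain error is bounded by the source-domain error, the HΔH-divergence between the two domains, and the error λ of the best joint hypothesis:

\[
\varepsilon_T(h) \le \varepsilon_S(h) + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T) + \lambda .
\]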
Proceedings Article

Unbiased look at dataset bias

TL;DR: A comparison study across a set of popular datasets is presented, evaluated on a number of criteria including relative data bias, cross-dataset generalization, effects of the closed-world assumption, and sample value.