Open Access Posted Content
Towards Out-Of-Distribution Generalization: A Survey
TL;DR: A comprehensive survey of OOD generalization methods, in which the authors provide a formal definition of the OOD problem; a summary of the reviewed methods is available at http://out-of-distribution-generalization.com.
Abstract:
Classic machine learning methods are built on the $i.i.d.$ assumption that training and testing data are independent and identically distributed. In real scenarios, however, the $i.i.d.$ assumption can hardly be satisfied, and the performance of classic machine learning algorithms drops sharply under distributional shifts, which underlines the importance of studying the Out-of-Distribution (OOD) generalization problem: the challenging setting where the testing distribution is unknown and different from the training distribution. This paper is the first effort to systematically and comprehensively discuss the OOD generalization problem, from its definition, methodology, and evaluation to its implications and future directions. First, we provide a formal definition of the OOD generalization problem. Second, we categorize existing methods into three parts based on their position in the learning pipeline, namely unsupervised representation learning, supervised model learning, and optimization, and discuss typical methods in each category in detail. We then demonstrate the theoretical connections between the categories and introduce the commonly used datasets and evaluation metrics. Finally, we summarize the literature and raise some future directions for the OOD generalization problem. A summary of the OOD generalization methods reviewed in this survey can be found at http://out-of-distribution-generalization.com.
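The failure mode the abstract describes can be shown with a toy sketch (illustrative only, not from the survey): a linear model fit under the i.i.d. assumption degrades sharply once the test inputs come from a shifted range.

```python
# Toy covariate-shift demo: fit y ≈ w * x on one input range, then
# evaluate on a shifted range where the true (nonlinear) relation diverges.
def fit_linear(xs, ys):
    # Ordinary least squares for y ≈ w * x (no intercept term).
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mse(xs, ys, w):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

true_fn = lambda x: x * x                        # real relation is quadratic
train_xs = [i / 10 for i in range(1, 11)]        # training support: (0, 1]
test_xs = [2 + i / 10 for i in range(1, 11)]     # shifted support: (2, 3]

w = fit_linear(train_xs, [true_fn(x) for x in train_xs])
in_dist_err = mse(train_xs, [true_fn(x) for x in train_xs], w)
ood_err = mse(test_xs, [true_fn(x) for x in test_xs], w)
print(ood_err > in_dist_err)  # True: error explodes off the training support
```

The same linear fit that looks adequate in-distribution fails badly under the shift, which is exactly the gap OOD generalization methods target.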
Citations
Posted Content
OoD-Bench: Benchmarking and Understanding Out-of-Distribution Generalization Datasets and Algorithms
TL;DR: The authors identify and measure two distinct kinds of distribution shift that are ubiquitous across datasets, and compare OoD generalization algorithms on a new benchmark dominated by these two shifts.
Posted Content
Deep Long-Tailed Learning: A Survey
TL;DR: A comprehensive survey of deep long-tailed learning, in which the authors group existing studies into three main categories: class re-balancing, information augmentation, and module improvement.
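As a minimal sketch of the class re-balancing category (details assumed, not taken from that survey), inverse-frequency loss weighting upweights rare classes in a long-tailed label distribution:

```python
# Sketch of one class re-balancing idea: weight each class's loss by the
# inverse of its frequency, so tail classes are not drowned out by the head.
from collections import Counter

def inverse_frequency_weights(labels):
    counts = Counter(labels)
    total = len(labels)
    # Rare classes receive proportionally larger loss weights;
    # weights average to 1.0 across classes.
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

weights = inverse_frequency_weights(["cat"] * 90 + ["dog"] * 9 + ["fox"])
print(weights["fox"] > weights["dog"] > weights["cat"])  # True
```

In practice these weights would multiply the per-sample loss during training; smoother variants (e.g. effective-number weighting) temper the extreme weights this simple version gives very rare classes.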
Posted ContentDOI
Constructing benchmark test sets for biological sequence analysis using independent set algorithms
TL;DR: Independent-set graph algorithms are used to split sequence data into dissimilar training and test sets, such that each test sequence is less than p% identical to any individual training sequence.
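A minimal sketch of the splitting idea, assuming a generic pairwise similarity predicate in place of the paper's sequence-identity graph construction (the greedy pass below is an illustration, not the paper's algorithm):

```python
# Greedy independent-set-style split: no test item may be "similar" to any
# training item, so similarity edges never cross the train/test boundary.
def dissimilar_split(items, similar, test_fraction=0.3):
    """similar(a, b) -> True if the pair is too alike to cross the split."""
    test, train = [], []
    target = int(len(items) * test_fraction)
    for item in items:
        if len(test) < target and all(not similar(item, t) for t in train):
            test.append(item)
        elif all(not similar(item, t) for t in test):
            train.append(item)
        # Items similar to members of both sets are dropped entirely.
    return train, test

# Toy usage: integers count as "similar" when they differ by less than 2.
train, test = dissimilar_split([0, 10, 1, 11, 20],
                               similar=lambda a, b: abs(a - b) < 2)
print(all(abs(a - b) >= 2 for a in train for b in test))  # True
```

For real sequences, `similar` would be a percent-identity threshold, and the paper's independent-set formulation additionally maximizes how much data survives the split rather than greedily dropping conflicts.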
Posted Content
Confounder Identification-free Causal Visual Feature Learning
TL;DR: In this article, a Confounder Identification-free Causal Visual Feature Learning (CICF) method is proposed to learn causal features that are free of interference from confounders.
Posted Content
A benchmark with decomposed distribution shifts for 360 monocular depth estimation
Georgios Albanis, Nikolaos Zioulis, Petros Drakoulis, Federico Alvarez, Dimitrios Zarpalas, Petros Daras +5 more
TL;DR: A distribution shift benchmark for monocular depth estimation is proposed, which decomposes the wider distribution shift of uncontrolled testing on in-the-wild data into three distinct distribution shifts.
References
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Journal ArticleDOI
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models, called the lasso, is proposed: it minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant.
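A minimal sketch of the lasso's shrinkage effect, assuming an orthonormal design, in which case the lasso solution reduces to soft-thresholding the ordinary-least-squares coefficients:

```python
# Soft-thresholding: the closed-form lasso update for orthonormal designs.
# Each OLS coefficient is shrunk toward zero by lam, and set exactly to
# zero if its magnitude is below lam — this is how the lasso selects.
def soft_threshold(beta_ols, lam):
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0

coeffs = [3.0, -0.5, 1.5, 0.1]
shrunk = [soft_threshold(b, lam=1.0) for b in coeffs]
print(shrunk)  # [2.0, 0.0, 0.5, 0.0]
```

The exact zeros are the point: unlike ridge regression, the L1 constraint performs variable selection and shrinkage simultaneously. For general (non-orthonormal) designs the same update is applied coordinate-wise inside an iterative solver.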
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
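A minimal sketch of the normalization step for a single feature over a mini-batch (the `gamma`, `beta`, and `eps` values here are illustrative assumptions; the learned scale-and-shift pair is part of the original method):

```python
# Batch normalization forward pass for one feature: normalize the batch to
# zero mean / unit variance, then apply the learnable scale and shift.
def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    # eps guards against division by zero for constant batches.
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in xs]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
print(abs(sum(out)) < 1e-6)  # True: normalized activations have ~zero mean
```

At inference time, real implementations replace the per-batch statistics with running averages accumulated during training, so single inputs can be normalized consistently.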
Proceedings ArticleDOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.