Open Access · Posted Content

Manifold Mixup: Better Representations by Interpolating Hidden States.

TLDR
Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations, improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
Abstract
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as an additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class representations with fewer directions of variance. We prove theory on why this flattening happens under ideal conditions, validate it in practical settings, and connect it to previous work on information theory and generalization. Despite incurring no significant computational overhead and being implementable in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
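As a rough illustration of how the regularizer operates, here is a minimal sketch that mixes hidden states (and labels) at a randomly chosen layer of a small PyTorch MLP. The network, layer choice, and hyperparameters (`ToyMLP`, `alpha=2.0`) are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of Manifold Mixup on a toy MLP (illustrative, not the official code).
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU()),
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU()),
        ])
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x, y_onehot=None, alpha=2.0):
        """If y_onehot is given, mix hidden states (and labels) at a randomly chosen layer."""
        mixing = y_onehot is not None
        mix_layer = np.random.randint(0, len(self.layers) + 1) if mixing else -1
        lam = np.random.beta(alpha, alpha) if mixing else 1.0
        h, y_mixed = x, y_onehot
        for k, layer in enumerate(self.layers):
            if k == mix_layer:
                perm = torch.randperm(h.size(0))
                h = lam * h + (1 - lam) * h[perm]                      # interpolate hidden states
                y_mixed = lam * y_onehot + (1 - lam) * y_onehot[perm]  # interpolate labels
            h = layer(h)
        if mix_layer == len(self.layers):                              # mix after the last hidden layer
            perm = torch.randperm(h.size(0))
            h = lam * h + (1 - lam) * h[perm]
            y_mixed = lam * y_onehot + (1 - lam) * y_onehot[perm]
        return self.head(h), y_mixed

# One training step with soft (mixed) targets.
model = ToyMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, y_mixed = model(x, F.one_hot(y, 10).float())
loss = -(y_mixed * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

Choosing layer 0 here reduces to standard input mixup; the other choices interpolate hidden representations.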


Citations
Proceedings Article

Interpolation consistency training for semi-supervised learning.

TL;DR: Interpolation Consistency Training (ICT) encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points.
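A hedged sketch of the consistency term summarized above: predictions on mixed unlabeled inputs are pushed toward the corresponding mixture of teacher predictions. The `student`/`teacher` split (a mean-teacher style setup) and the squared-error penalty are assumptions chosen for illustration.

```python
# Sketch of an ICT-style consistency loss on unlabeled data.
# Assumes `student` and `teacher` are torch.nn.Module classifiers with identical output shapes.
import numpy as np
import torch
import torch.nn.functional as F

def ict_consistency_loss(student, teacher, u_batch, alpha=1.0):
    """Encourage f(mix(u_i, u_j)) to match mix(f(u_i), f(u_j)) on unlabeled inputs."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(u_batch.size(0))
    u_mixed = lam * u_batch + (1 - lam) * u_batch[perm]
    with torch.no_grad():  # targets come from the (EMA) teacher and are not backpropagated
        p = F.softmax(teacher(u_batch), dim=1)
        target = lam * p + (1 - lam) * p[perm]
    pred = F.softmax(student(u_mixed), dim=1)
    return F.mse_loss(pred, target)
```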
Posted Content

Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup

TL;DR: Experiments show that Puzzle Mix achieves state-of-the-art generalization and adversarial robustness compared to other mixup methods on the CIFAR-100, Tiny-ImageNet, and ImageNet datasets.
Posted Content

REMIND Your Neural Network to Prevent Catastrophic Forgetting

TL;DR: REMIND is trained in an online manner, learning one example at a time (closer to how humans learn), and outperforms other methods for incremental class learning on the ImageNet ILSVRC-2012 dataset.
Posted Content

GraphMix: Improved Training of GNNs for Semi-Supervised Learning

TL;DR: GraphMix is presented, a regularization method for Graph Neural Network based semi-supervised object classification, in which a fully-connected network is trained jointly with the graph neural network via parameter sharing and interpolation-based regularization.
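A heavily simplified sketch of the parameter-sharing idea mentioned in the TL;DR: a fully-connected branch over node features reuses the linear weights of a GCN-style branch, and interpolation-based regularization (e.g. Manifold Mixup) would be applied on the fully-connected branch. The single shared layer, dense adjacency matrix, and module names are illustrative assumptions.

```python
# Simplified sketch of parameter sharing between an FCN and a GNN over node features.
# A_hat is assumed to be a normalized (dense) adjacency matrix, for brevity.
import torch
import torch.nn as nn

class SharedFCNGNN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)     # shared by both branches
        self.lin2 = nn.Linear(hidden, n_classes)  # shared by both branches

    def fcn_forward(self, x):
        # Plain fully-connected branch; interpolation-based regularization
        # would be applied to these hidden states during training.
        return self.lin2(torch.relu(self.lin1(x)))

    def gnn_forward(self, x, A_hat):
        # GCN-style branch reusing the same weights, with neighborhood aggregation.
        h = torch.relu(A_hat @ self.lin1(x))
        return A_hat @ self.lin2(h)
```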
Posted Content

SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization

TL;DR: This work proposes SaliencyMix, a saliency-guided data augmentation strategy that carefully selects a representative image patch with the help of a saliency map and mixes this indicative patch with a target image, leading the model to learn more appropriate feature representations and achieving new state-of-the-art top-1 error.
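A rough sketch of the select-and-paste step described above, using a cheap gradient-magnitude stand-in for a real saliency model; the patch size, helper names, and area-based label-mixing rule are illustrative assumptions, not the paper's exact pipeline.

```python
# Rough sketch of saliency-guided patch mixing (illustrative only).
import numpy as np

def toy_saliency(img):
    """Cheap stand-in for a saliency model: per-pixel gradient magnitude of the gray image."""
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    return np.abs(gx) + np.abs(gy)

def saliency_mix(source, target, source_label, target_label, patch=16):
    """Paste the most salient source patch onto the target; mix labels by patch area."""
    sal = toy_saliency(source)
    cy, cx = np.unravel_index(sal.argmax(), sal.shape)   # peak-saliency location
    h, w = source.shape[:2]
    y0 = int(np.clip(cy - patch // 2, 0, h - patch))
    x0 = int(np.clip(cx - patch // 2, 0, w - patch))
    mixed = target.copy()
    mixed[y0:y0 + patch, x0:x0 + patch] = source[y0:y0 + patch, x0:x0 + patch]
    lam = 1.0 - (patch * patch) / (h * w)                # label weight for the target image
    mixed_label = lam * target_label + (1 - lam) * source_label
    return mixed, mixed_label
```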
References
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
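For reference, the batch-normalization transform applied to each activation x_i over a mini-batch of size m, with learned scale gamma and shift beta, is:

```latex
\mu_{\mathcal{B}} = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad
\sigma_{\mathcal{B}}^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_i - \mu_{\mathcal{B}}\right)^2, \qquad
\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}}, \qquad
y_i = \gamma \hat{x}_i + \beta
```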
Book Chapter

Visualizing and Understanding Convolutional Networks

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.
Journal Article

Approximation by superpositions of a sigmoidal function

TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
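Concretely, the result states that, for any continuous sigmoidal function sigma, sums of the following form are dense in the space of continuous functions on the unit hypercube I_n:

```latex
G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(w_j^{\top} x + \theta_j\right),
\qquad x \in I_n = [0,1]^n
```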
Proceedings Article

Intriguing properties of neural networks

TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units according to various methods of unit analysis, suggesting that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Proceedings Article

Efficient Estimation of Word Representations in Vector Space

TL;DR: Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.
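As a rough illustration of the skip-gram flavor of these architectures, the toy sketch below learns word vectors with negative sampling on a tiny corpus; the corpus, window size, embedding dimension, and training loop are assumptions for illustration, not the paper's original implementation.

```python
# Toy skip-gram with negative sampling (illustrative sketch, not the original word2vec code).
import torch
import torch.nn as nn
import torch.nn.functional as F

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# (center, context) index pairs within a window of 2
pairs = [(idx[corpus[i]], idx[corpus[j]])
         for i in range(len(corpus))
         for j in range(max(0, i - 2), min(len(corpus), i + 3)) if i != j]

dim = 16
center_emb = nn.Embedding(len(vocab), dim)
context_emb = nn.Embedding(len(vocab), dim)
opt = torch.optim.Adam(list(center_emb.parameters()) + list(context_emb.parameters()), lr=0.05)

centers = torch.tensor([c for c, _ in pairs])
contexts = torch.tensor([o for _, o in pairs])
for _ in range(200):
    negatives = torch.randint(0, len(vocab), (len(pairs),))          # crude negative sampling
    pos_score = (center_emb(centers) * context_emb(contexts)).sum(dim=1)
    neg_score = (center_emb(centers) * context_emb(negatives)).sum(dim=1)
    # Maximize similarity for observed pairs, minimize it for sampled negatives.
    loss = -(F.logsigmoid(pos_score) + F.logsigmoid(-neg_score)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```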