Open Access Proceedings Article

Manifold Mixup: Better Representations by Interpolating Hidden States

TL;DR
Manifold Mixup, as discussed by the authors, leverages semantic interpolations as an additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation; as a result, neural networks trained with Manifold Mixup learn class representations with fewer directions of variance.
Abstract
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as an additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class representations with fewer directions of variance. We prove theory on why this flattening happens under ideal conditions, validate it in practical situations, and connect it to previous works on information theory and generalization. Despite incurring no significant computational overhead and being implemented in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
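To make the idea concrete, here is a minimal sketch of the mixing step (an illustration only, not the authors' reference implementation; the two-stage network, the Beta(α, α) coefficient, and the set of eligible layers are assumptions):

```python
import numpy as np
import torch
import torch.nn as nn

class ManifoldMixupNet(nn.Module):
    """Toy two-stage classifier; mixup can be applied at the input (k=0)
    or at the hidden representation after stage1 (k=1)."""
    def __init__(self, in_dim=32, hidden=64, n_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.stage2 = nn.Linear(hidden, n_classes)

    def forward(self, x, y_onehot=None, alpha=2.0):
        if y_onehot is None:                   # plain inference path
            return self.stage2(self.stage1(x)), None
        lam = np.random.beta(alpha, alpha)     # interpolation coefficient
        k = np.random.randint(0, 2)            # randomly chosen mixing layer
        perm = torch.randperm(x.size(0))       # pair each example with another
        if k == 0:                             # input mixup
            x = lam * x + (1 - lam) * x[perm]
        h = self.stage1(x)
        if k == 1:                             # hidden-state (manifold) mixup
            h = lam * h + (1 - lam) * h[perm]
        y_mixed = lam * y_onehot + (1 - lam) * y_onehot[perm]
        return self.stage2(h), y_mixed
```

Training then minimizes cross-entropy against the mixed soft targets, e.g. `-(y_mixed * logits.log_softmax(-1)).sum(-1).mean()`.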



Citations
Proceedings Article

CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features

TL;DR: CutMix, as discussed by the authors, augments the training data by cutting and pasting patches among training images, where the ground-truth labels are also mixed proportionally to the area of the patches.
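A minimal sketch of that cut-and-paste step for an NCHW image batch (illustrative only; the function name and the Beta(α, α) sampling follow common practice, not necessarily the paper's released code):

```python
import numpy as np
import torch

def cutmix(images, labels_onehot, alpha=1.0):
    """Paste a random patch from a shuffled copy of the batch and mix the
    labels in proportion to the patch area."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    h, w = images.shape[2], images.shape[3]
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)   # patch center
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm][:, :, y1:y2, x1:x2]
    # Recompute lambda from the actual (possibly clipped) patch area
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    mixed_labels = lam * labels_onehot + (1 - lam) * labels_onehot[perm]
    return mixed, mixed_labels
```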
Journal Article

Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation

TL;DR: This article provides a detailed review of deep learning solutions for medical image segmentation with imperfect datasets, summarizing both the technical novelties and empirical results, and compares the benefits and requirements of the surveyed methodologies, offering recommended solutions.
Posted Content

InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization

TL;DR: Experimental results on the tasks of graph classification and molecular property prediction show that InfoGraph is superior to state-of-the-art baselines and InfoGraph* can achieve performance competitive with state-of-the-art semi-supervised models.
Proceedings ArticleDOI

Interpolation Consistency Training for Semi-Supervised Learning

TL;DR: Interpolation Consistency Training (ICT), as mentioned in this paper, encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points.
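The consistency term could be sketched as follows (assumptions: a mean-teacher-style target network and an MSE penalty, both common in this setting):

```python
import numpy as np
import torch
import torch.nn.functional as F

def ict_consistency_loss(student, teacher, u, alpha=1.0):
    """Prediction at a mixed unlabeled input should match the same mix of
    the teacher's predictions at the two endpoints."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(u.size(0))
    u_mixed = lam * u + (1 - lam) * u[perm]
    with torch.no_grad():                       # targets are held fixed
        p = F.softmax(teacher(u), dim=-1)
        target = lam * p + (1 - lam) * p[perm]
    q = F.softmax(student(u_mixed), dim=-1)
    return F.mse_loss(q, target)
```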
Proceedings Article

BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition

TL;DR: As mentioned in this paper, BBN is a unified Bilateral-Branch Network that takes care of both representation learning and classifier learning simultaneously, where each branch performs its own duty separately.
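A loose sketch of the cumulative-learning idea (the parabolic schedule and the per-branch classifiers are simplifying assumptions, not the paper's exact wiring):

```python
def bbn_mixed_logits(f_conv, f_rebal, clf_conv, clf_rebal, epoch, total_epochs):
    """Blend the two branches: early on the conventional branch dominates
    (representation learning); later the re-balancing branch takes over
    (classifier learning for the tail classes)."""
    alpha = 1.0 - (epoch / total_epochs) ** 2   # decays from 1 toward 0
    return alpha * clf_conv(f_conv) + (1.0 - alpha) * clf_rebal(f_rebal)
```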
References
Book

Deep Learning

TL;DR: Deep learning, as mentioned in this paper, is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
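For reference, the per-feature transform at the core of the method looks roughly like this in training mode (a sketch only; the running statistics used at inference are omitted):

```python
import torch

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.
    x: (batch, features); gamma, beta: learned (features,) parameters."""
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    x_hat = (x - mean) / torch.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta                  # learned scale and shift
```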
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
Posted Content

Rethinking the Inception Architecture for Computer Vision

TL;DR: This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
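As one example of the factorizations discussed, an n×n convolution can be replaced by a 1×n followed by an n×1 convolution with the same receptive field but fewer parameters (the channel counts below are arbitrary placeholders):

```python
import torch.nn as nn

# 7x7 receptive field built from two cheaper asymmetric convolutions
factorized_7x7 = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 7), padding=(0, 3)),
    nn.Conv2d(64, 64, kernel_size=(7, 1), padding=(3, 0)),
)
```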
Book Chapter

Visualizing and Understanding Convolutional Networks

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.