Open Access Proceedings Article

Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

TLDR
This paper interprets conventional training with noise-injection regularization as optimizing a lower bound of the true objective, and proposes a technique that achieves a tighter lower bound by using multiple noise samples per training example in each stochastic gradient descent iteration.
Abstract
Overfitting is one of the most critical challenges in deep neural networks, and various regularization methods exist to improve generalization performance. Injecting noise into hidden units during training, e.g., dropout, is known to be a successful regularizer, but it is still not well understood why such training techniques work in practice and how their benefit can be maximized in the presence of two conflicting objectives: fitting the true data distribution and preventing overfitting through regularization. This paper addresses these issues by 1) interpreting conventional training with noise-injection regularization as optimizing a lower bound of the true objective and 2) proposing a technique that achieves a tighter lower bound by using multiple noise samples per training example in a stochastic gradient descent iteration. We demonstrate the effectiveness of our idea in several computer vision applications.
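To make the multi-sample idea concrete, below is a minimal, hypothetical PyTorch-style sketch (not the authors' released code): several dropout-noise samples are drawn per example in one SGD step, and the per-sample log-likelihoods are combined with a log-mean-exp, which yields a tighter lower bound than averaging the log-likelihoods directly. The network, data, and hyperparameters are placeholders.

```python
# Hypothetical sketch: K noise (dropout) samples per example in one SGD step,
# combined via log-mean-exp of per-sample log-likelihoods (a tighter bound
# than the usual single-sample or averaged objective).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropoutMLP(nn.Module):
    def __init__(self, d_in=784, d_hid=256, n_cls=10, p=0.5):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hid)
        self.drop = nn.Dropout(p)            # the injected noise
        self.fc2 = nn.Linear(d_hid, n_cls)

    def forward(self, x):
        return self.fc2(self.drop(F.relu(self.fc1(x))))

def multi_sample_loss(model, x, y, k=5):
    # Each forward pass draws a fresh dropout mask; stack the per-sample
    # log-likelihoods and combine them with log-mean-exp.
    log_p = torch.stack(
        [-F.cross_entropy(model(x), y, reduction='none') for _ in range(k)],
        dim=0,
    )                                         # shape: (k, batch)
    tighter_bound = torch.logsumexp(log_p, dim=0) - math.log(k)
    return -tighter_bound.mean()

model = DropoutMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
opt.zero_grad()
multi_sample_loss(model, x, y, k=5).backward()
opt.step()
```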


Citations
Book Chapter

Towards Robust Neural Networks via Random Self-ensemble

TL;DR: Random Self-Ensemble (RSE) adds random noise layers to the neural network to defend against strong gradient-based attacks, and ensembles the prediction over random noise draws to stabilize performance.
Posted Content

Towards Robust Neural Networks via Random Self-ensemble

TL;DR: This paper proposes a new defense algorithm called Random Self-Ensemble (RSE), which adds random noise layers to the neural network to defend against strong gradient-based attacks, and ensembles the prediction over random noise draws to stabilize performance.
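As a rough illustration of the RSE recipe summarized above (layer sizes, noise scales, and the number of draws are assumptions, not the paper's settings), the sketch below injects Gaussian noise before layers and averages softmax outputs over several noise draws at prediction time.

```python
# Illustrative sketch of random noise layers plus ensembled prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseLayer(nn.Module):
    def __init__(self, std=0.1):
        super().__init__()
        self.std = std

    def forward(self, x):
        # Additive Gaussian noise, applied at both training and test time.
        return x + self.std * torch.randn_like(x)

class RSENet(nn.Module):
    def __init__(self, d_in=784, d_hid=256, n_cls=10):
        super().__init__()
        self.net = nn.Sequential(
            NoiseLayer(0.2), nn.Linear(d_in, d_hid), nn.ReLU(),
            NoiseLayer(0.1), nn.Linear(d_hid, n_cls),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def ensemble_predict(model, x, n_draws=10):
    # Average softmax outputs over independent noise draws to stabilize
    # the prediction against the injected randomness.
    probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_draws)])
    return probs.mean(dim=0).argmax(dim=-1)
```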
Proceedings Article

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

TL;DR: The authors establish the first benchmark of controlled real-world label noise collected from the web, enabling web label noise to be studied in a controlled setting, and show that their method achieves the best results on this dataset as well as on two public benchmarks (CIFAR and WebVision).
Journal Article

Big-Data Science in Porous Materials: Materials Genomics and Machine Learning

TL;DR: It is shown that the availability of so many materials allows big-data methods to be used as a powerful technique to study these materials and to discover complex correlations.
Proceedings Article

Supervised autoencoders: Improving generalization performance with unsupervised regularizers

TL;DR: This work theoretically and empirically analyzes supervised autoencoders and provides a novel generalization result for linear autoencoders, proving uniform stability based on the inclusion of the reconstruction error in a neural network that predicts both inputs (reconstruction) and targets jointly.
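A minimal sketch of the supervised-autoencoder idea, under assumed layer sizes and a single linear encoder: one head reconstructs the input, another predicts the target, and the reconstruction error is added to the supervised loss as an unsupervised regularizer.

```python
# Sketch of a supervised autoencoder: joint prediction of targets and inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SupervisedAutoencoder(nn.Module):
    def __init__(self, d_in=784, d_code=64, n_cls=10):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_code)
        self.decoder = nn.Linear(d_code, d_in)      # reconstruction head
        self.classifier = nn.Linear(d_code, n_cls)  # supervised head

    def forward(self, x):
        h = F.relu(self.encoder(x))
        return self.classifier(h), self.decoder(h)

def sae_loss(model, x, y, recon_weight=1.0):
    # Supervised loss plus reconstruction error as an auxiliary regularizer.
    logits, x_hat = model(x)
    return F.cross_entropy(logits, y) + recon_weight * F.mse_loss(x_hat, x)
```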
References
Proceedings Article

Deep Pyramidal Residual Networks

TL;DR: This research gradually increases the feature map dimension at all units so as to involve as many locations as possible, and proposes a novel residual unit that further improves classification accuracy with the new network architecture.
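The following is an illustrative, simplified pyramidal-style residual unit (channel widths and the exact layer ordering are assumptions based on the description above, not the reference implementation): the output has more channels than the input, and the identity shortcut is zero-padded along the channel dimension to match.

```python
# Simplified pyramidal-style residual unit with a zero-padded shortcut.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidalBlock(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(c_in)
        self.conv1 = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(c_out)
        self.conv2 = nn.Conv2d(c_out, c_out, 3, padding=1, bias=False)
        self.bn3 = nn.BatchNorm2d(c_out)
        self.extra = c_out - c_in   # channels added by this block

    def forward(self, x):
        out = self.conv1(self.bn1(x))
        out = self.conv2(F.relu(self.bn2(out)))
        out = self.bn3(out)
        # Zero-pad the identity shortcut along the channel dimension.
        shortcut = F.pad(x, (0, 0, 0, 0, 0, self.extra))
        return out + shortcut
```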
Proceedings Article

Dropout Training as Adaptive Regularization

TL;DR: By casting dropout as regularization, this work develops a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer; the approach consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.
Posted Content

Neural Module Networks

TL;DR: The authors decompose questions into their linguistic substructures and use these structures to dynamically instantiate modular networks (with reusable components for recognizing dogs, classifying colors, etc.) for visual question answering.
Proceedings Article

Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction

TL;DR: This paper proposes a joint network for ImageQA that combines a CNN with a parameter prediction network; it is trained end-to-end through back-propagation, with weights initialized from a pre-trained CNN and GRU.
Proceedings Article

Adaptive dropout for training deep neural networks

TL;DR: A method called 'standout' is described in which a binary belief network is overlaid on a neural network and used to regularize its hidden units by selectively setting activities to zero; it achieves lower classification error rates than other feature learning methods, including standard dropout, denoising auto-encoders, and restricted Boltzmann machines.
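A rough sketch of the 'standout' idea (the gate here reuses the layer's own pre-activation for brevity, which only approximates the paper's overlaid belief network): a sigmoid belief gives each hidden unit its own keep probability, units are dropped stochastically during training, and the expected activation is used at test time.

```python
# Sketch of adaptive (standout-style) dropout with unit-wise keep probabilities.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StandoutLinear(nn.Module):
    def __init__(self, d_in, d_out, alpha=1.0, beta=0.0):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.alpha, self.beta = alpha, beta

    def forward(self, x):
        a = self.fc(x)
        # Unit-wise belief that the unit should stay active.
        keep_prob = torch.sigmoid(self.alpha * a + self.beta)
        if self.training:
            mask = torch.bernoulli(keep_prob)   # stochastic binary gate
            return mask * F.relu(a)
        return keep_prob * F.relu(a)            # expectation at test time
```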