Open AccessJournal Article
Discovering Parametric Activation Functions
Reads0
Chats0
TLDR
This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance, and discovers both general activation functions and specialized functions for different architectures, consistently improving accuracy over ReLU and other activation functions by significant margins.Abstract:
Recent studies have shown that the choice of activation function can significantly affect the performance of deep learning networks. However, the benefits of novel activation functions have been inconsistent and task-dependent, and therefore the rectified linear unit (ReLU) is still the most commonly used. This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Evolutionary search is used to discover the general form of the function, and gradient descent to optimize its parameters for different parts of the network and over the learning process. Experiments with three different neural network architectures on the CIFAR-100 image classification dataset show that this approach is effective. It discovers different activation functions for different architectures, and consistently improves accuracy over ReLU and other recently proposed activation functions by significant margins. The approach can therefore be used as an automated optimization step in applying deep learning to new tasks.read more
Citations
More filters
Journal ArticleDOI
Review: Deep Learning in Electron Microscopy
TL;DR: In this paper, a review of deep learning in electron microscopy is presented, with a focus on hardware and software needed to get started with deep learning and interface with electron microscopes.
Journal ArticleDOI
Deep learning in electron microscopy
TL;DR: This review paper offers a practical perspective aimed at developers with limited familiarity of deep learning in electron microscopy that discusses hardware and software needed to get started with deep learning and interface with electron microscopes.
Journal ArticleDOI
Smish: A Novel Activation Function for Deep Learning Methods
TL;DR: Experiments show that Smish tends to operate more efficiently than Logish, Mish, and other activation functions on EfficientNet models with open datasets, and the performance of Smish in various deep learning models and the parameters of its function f(x)=αx·tanh[ln(1+sigmoid(βx))], and where α = 1 and β = 1, Smish was found to exhibit the highest accuracy.
Journal ArticleDOI
How important are activation functions in regression and classification? A survey, performance comparison, and future directions
TL;DR: This work surveys the activation functions that have been employed in the past as well as the current state-of-the-art, and presents various developments in activation functions over the years and the advantages and disadvantages or limitations of these activation functions.
Posted Content
Low Curvature Activations Reduce Overfitting in Adversarial Training.
TL;DR: This article showed that using activation functions with low (exact or approximate) curvature values has a regularization effect that significantly reduces both the standard and robust generalization gaps in adversarial training.
References
More filters
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe,Christian Szegedy +1 more
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images, described in the next section.
Proceedings Article
Rectified Linear Units Improve Restricted Boltzmann Machines
Vinod Nair,Geoffrey E. Hinton +1 more
TL;DR: Restricted Boltzmann machines were developed using binary stochastic hidden units that learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
Related Papers (5)
Object recognition algorithm based on optimized nonlinear activation function-global convolutional neural network
Feng-Ping An,Jun-e Liu,Lei Bai +2 more