Open Access Journal Article

Discovering Parametric Activation Functions

TLDR
This paper proposes a technique for automatically customizing activation functions, yielding reliable improvements in performance; it discovers both general activation functions and functions specialized to particular architectures, consistently improving accuracy over ReLU and other activation functions by significant margins.
Abstract
Recent studies have shown that the choice of activation function can significantly affect the performance of deep learning networks. However, the benefits of novel activation functions have been inconsistent and task-dependent, and therefore the rectified linear unit (ReLU) is still the most commonly used. This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Evolutionary search is used to discover the general form of the function, and gradient descent to optimize its parameters for different parts of the network and over the learning process. Experiments with three different neural network architectures on the CIFAR-100 image classification dataset show that this approach is effective. It discovers different activation functions for different architectures, and consistently improves accuracy over ReLU and other recently proposed activation functions by significant margins. The approach can therefore be used as an automated optimization step in applying deep learning to new tasks.
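As a rough illustration of the approach described in the abstract (not the authors' actual discovered functions), the sketch below shows what a parametric activation with gradient-trainable parameters could look like in PyTorch. The Swish-like functional form, parameter names, and layer sizes are assumptions for illustration; in the paper the functional form itself is found by evolutionary search.

```python
import torch
import torch.nn as nn

class ParametricActivation(nn.Module):
    """Illustrative parametric activation with learnable alpha and beta.

    The form alpha * x * sigmoid(beta * x) is only a placeholder; in the
    paper the form is discovered by evolutionary search, while its
    parameters are tuned by gradient descent along with the weights.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.alpha * x * torch.sigmoid(self.beta * x)

# One instance per layer, so the parameters can specialize to
# different parts of the network during training.
model = nn.Sequential(
    nn.Linear(128, 256), ParametricActivation(),
    nn.Linear(256, 10),
)
```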


Citations
Journal Article

Deep learning in electron microscopy

TL;DR: This review offers a practical perspective aimed at developers with limited familiarity with deep learning, discussing the hardware and software needed to get started with deep learning and to interface with electron microscopes.
Journal Article

Smish: A Novel Activation Function for Deep Learning Methods

TL;DR: Experiments show that Smish, defined as f(x) = αx·tanh[ln(1 + sigmoid(βx))] with α = 1 and β = 1, tends to operate more efficiently than Logish, Mish, and other activation functions on EfficientNet models with open datasets, and it exhibited the highest accuracy across the deep learning models evaluated.
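Since the summary quotes Smish's closed form, a small NumPy sketch of the function (my own illustration, using the α = β = 1 defaults above) is included below.

```python
import numpy as np

def smish(x, alpha=1.0, beta=1.0):
    """Smish activation: f(x) = alpha * x * tanh(ln(1 + sigmoid(beta * x)))."""
    sig = 1.0 / (1.0 + np.exp(-beta * x))
    return alpha * x * np.tanh(np.log1p(sig))

x = np.linspace(-5.0, 5.0, 11)
print(smish(x))  # smooth, and slightly negative for negative inputs, like Mish
```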
Journal Article

How important are activation functions in regression and classification? A survey, performance comparison, and future directions

TL;DR: This work surveys the activation functions employed in the past as well as the current state of the art, tracing their development over the years and discussing the advantages, disadvantages, and limitations of each.
Posted Content

Low Curvature Activations Reduce Overfitting in Adversarial Training.

TL;DR: This article showed that using activation functions with low (exact or approximate) curvature values has a regularization effect that significantly reduces both the standard and robust generalization gaps in adversarial training.
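To make the notion of activation curvature concrete, here is a brief sketch (my own illustration, not code from the article) that estimates the largest second derivative of softplus with PyTorch autograd; a sharper (higher-β) softplus approaches ReLU and has higher curvature at the kink.

```python
import torch
import torch.nn.functional as F

def max_abs_curvature(act, xs):
    """Estimate max |f''(x)| of an elementwise activation over sample points."""
    xs = xs.clone().requires_grad_(True)
    (grad,) = torch.autograd.grad(act(xs).sum(), xs, create_graph=True)
    (grad2,) = torch.autograd.grad(grad.sum(), xs)
    return grad2.abs().max().item()

xs = torch.linspace(-4.0, 4.0, 1001)
for beta in (1.0, 4.0, 16.0):
    c = max_abs_curvature(lambda x, b=beta: F.softplus(x, beta=b), xs)
    print(f"softplus(beta={beta}): max |f''| ~ {c:.2f}")  # roughly beta/4, grows with beta
```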
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this paper, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; an ensemble of these residual networks won 1st place in the ILSVRC 2015 classification task.
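A minimal sketch of the residual idea (an identity shortcut added around a small stack of layers), written as an assumed PyTorch-style block; downsampling and projection shortcuts are omitted for brevity.

```python
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Computes y = ReLU(x + F(x)); the shortcut eases training of deep stacks."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut
```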
Proceedings Article

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
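The normalization step the summary refers to can be written in a few lines; below is a simplified training-time sketch in NumPy (no running statistics or inference mode, and the parameter names are illustrative).

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    x: (batch, features); gamma, beta: (features,) learnable parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 3.0 + 5.0                   # a poorly scaled batch
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))   # roughly 0 and 1 per feature
```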
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: This dissertation describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

TL;DR: Replacing the binary stochastic hidden units of restricted Boltzmann machines with rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
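For context, here is a short NumPy sketch of a rectified linear unit together with the noisy variant max(0, x + N(0, sigmoid(x))) used to sample rectified linear hidden states; this is an illustrative reconstruction, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    """Deterministic rectified linear unit."""
    return np.maximum(0.0, x)

def noisy_relu(x):
    """Noisy rectified linear unit: max(0, x + N(0, sigmoid(x)))."""
    noise_var = 1.0 / (1.0 + np.exp(-x))      # variance sigmoid(x)
    return np.maximum(0.0, x + rng.normal(0.0, np.sqrt(noise_var)))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), noisy_relu(x))
```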
Trending Questions (1)
Do activation functions affect the predictive performance of neural networks?

Activation functions significantly impact neural network performance. Customized activation functions discovered through evolutionary search and gradient descent consistently outperform traditional functions like ReLU, enhancing predictive accuracy.