Open Access Journal Article

Discovering Parametric Activation Functions

TLDR
This paper proposes a technique for automatically customizing activation functions, yielding reliable improvements in performance; it discovers both general activation functions and functions specialized to particular architectures, consistently improving accuracy over ReLU and other activation functions by significant margins.
Abstract
Recent studies have shown that the choice of activation function can significantly affect the performance of deep learning networks. However, the benefits of novel activation functions have been inconsistent and task-dependent, and therefore the rectified linear unit (ReLU) is still the most commonly used. This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Evolutionary search is used to discover the general form of the function, and gradient descent to optimize its parameters for different parts of the network and over the learning process. Experiments with three different neural network architectures on the CIFAR-100 image classification dataset show that this approach is effective. It discovers different activation functions for different architectures, and consistently improves accuracy over ReLU and other recently proposed activation functions by significant margins. The approach can therefore be used as an automated optimization step in applying deep learning to new tasks.
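As a rough illustration of the approach described in the abstract (not the authors' actual discovered functions), the sketch below shows what a parametric activation with gradient-trainable parameters could look like in PyTorch. The Swish-like functional form, parameter names, and layer sizes are assumptions for illustration; in the paper the functional form itself is found by evolutionary search.

```python
import torch
import torch.nn as nn

class ParametricActivation(nn.Module):
    """Illustrative parametric activation with learnable alpha and beta.

    The form alpha * x * sigmoid(beta * x) is only a placeholder; in the
    paper the form is discovered by evolutionary search, while its
    parameters are tuned by gradient descent along with the weights.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.alpha * x * torch.sigmoid(self.beta * x)

# One instance per layer, so the parameters can specialize to
# different parts of the network during training.
model = nn.Sequential(
    nn.Linear(128, 256), ParametricActivation(),
    nn.Linear(256, 10),
)
```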


Citations
Journal Article

Deep learning in electron microscopy

TL;DR: This review offers a practical perspective aimed at developers with limited familiarity with deep learning, discussing the hardware and software needed to get started with deep learning and to interface with electron microscopes.
Journal Article

Smish: A Novel Activation Function for Deep Learning Methods

TL;DR: Experiments show that Smish, defined as f(x) = αx·tanh[ln(1 + sigmoid(βx))] with α = 1 and β = 1, tends to operate more efficiently than Logish, Mish, and other activation functions on EfficientNet models with open datasets, and it exhibited the highest accuracy across the deep learning models evaluated.
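Since the summary quotes Smish's closed form, a small NumPy sketch of the function (my own illustration, using the α = β = 1 defaults above) is included below.

```python
import numpy as np

def smish(x, alpha=1.0, beta=1.0):
    """Smish activation: f(x) = alpha * x * tanh(ln(1 + sigmoid(beta * x)))."""
    sig = 1.0 / (1.0 + np.exp(-beta * x))
    return alpha * x * np.tanh(np.log1p(sig))

x = np.linspace(-5.0, 5.0, 11)
print(smish(x))  # smooth, and slightly negative for negative inputs, like Mish
```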
Journal Article

How important are activation functions in regression and classification? A survey, performance comparison, and future directions

TL;DR: This work surveys the activation functions employed in the past as well as the current state of the art, tracing their development over the years and discussing the advantages, disadvantages, and limitations of each.
Posted Content

Low Curvature Activations Reduce Overfitting in Adversarial Training.

TL;DR: This article showed that using activation functions with low (exact or approximate) curvature values has a regularization effect that significantly reduces both the standard and robust generalization gaps in adversarial training.
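To make the notion of activation curvature concrete, here is a brief sketch (my own illustration, not code from the article) that estimates the largest second derivative of softplus with PyTorch autograd; a sharper (higher-β) softplus approaches ReLU and has higher curvature at the kink.

```python
import torch
import torch.nn.functional as F

def max_abs_curvature(act, xs):
    """Estimate max |f''(x)| of an elementwise activation over sample points."""
    xs = xs.clone().requires_grad_(True)
    (grad,) = torch.autograd.grad(act(xs).sum(), xs, create_graph=True)
    (grad2,) = torch.autograd.grad(grad.sum(), xs)
    return grad2.abs().max().item()

xs = torch.linspace(-4.0, 4.0, 1001)
for beta in (1.0, 4.0, 16.0):
    c = max_abs_curvature(lambda x, b=beta: F.softplus(x, beta=b), xs)
    print(f"softplus(beta={beta}): max |f''| ~ {c:.2f}")  # roughly beta/4, grows with beta
```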
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this paper, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; an ensemble of these residual networks won 1st place in the ILSVRC 2015 classification task.
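A minimal sketch of the residual idea (an identity shortcut added around a small stack of layers), written as an assumed PyTorch-style block; downsampling and projection shortcuts are omitted for brevity.

```python
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Computes y = ReLU(x + F(x)); the shortcut eases training of deep stacks."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut
```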
Proceedings Article

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
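The normalization step the summary refers to can be written in a few lines; below is a simplified training-time sketch in NumPy (no running statistics or inference mode, and the parameter names are illustrative).

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    x: (batch, features); gamma, beta: (features,) learnable parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 3.0 + 5.0                   # a poorly scaled batch
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))   # roughly 0 and 1 per feature
```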
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: This dissertation describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

TL;DR: Replacing the binary stochastic hidden units of restricted Boltzmann machines with rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
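For context, here is a short NumPy sketch of a rectified linear unit together with the noisy variant max(0, x + N(0, sigmoid(x))) used to sample rectified linear hidden states; this is an illustrative reconstruction, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    """Deterministic rectified linear unit."""
    return np.maximum(0.0, x)

def noisy_relu(x):
    """Noisy rectified linear unit: max(0, x + N(0, sigmoid(x)))."""
    noise_var = 1.0 / (1.0 + np.exp(-x))      # variance sigmoid(x)
    return np.maximum(0.0, x + rng.normal(0.0, np.sqrt(noise_var)))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), noisy_relu(x))
```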
Trending Questions (1)
Do activation functions affect the predictive performance of neural networks?

Activation functions significantly impact neural network performance. Customized activation functions discovered through evolutionary search and gradient descent consistently outperform traditional functions like ReLU, enhancing predictive accuracy.