
Sigmoid function

About: Sigmoid function is a research topic. Over its lifetime, 2,228 publications have been published within this topic, receiving 59,557 citations. The topic is also known as: S curve.
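For reference, the standard logistic sigmoid (the most common member of this S-shaped family) and its derivative are:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma(x) \in (0, 1), \qquad \sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)
```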


Papers
Proceedings Article
29 Nov 1999
TL;DR: For mixture density estimation, it is shown that a k-component mixture estimated by maximum likelihood (or by an iterative likelihood improvement that is introduced) achieves log-likelihood within order 1/k of the log-likelihood achievable by any convex combination.
Abstract: Gaussian mixtures (or so-called radial basis function networks) for density estimation provide a natural counterpart to sigmoidal neural networks for function fitting and approximation. In both cases, it is possible to give simple expressions for the iterative improvement of performance as components of the network are introduced one at a time. In particular, for mixture density estimation we show that a k-component mixture estimated by maximum likelihood (or by an iterative likelihood improvement that we introduce) achieves log-likelihood within order 1/k of the log-likelihood achievable by any convex combination. Consequences for approximation and estimation using Kullback-Leibler risk are also given. A Minimum Description Length principle selects the optimal number of components k that minimizes the risk bound.

285 citations
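Schematically, the 1/k guarantee stated in the abstract above takes the following form (a paraphrase of the stated result, not the paper's exact statement; c denotes an unspecified constant depending on the component family):

```latex
L\!\left(\hat{f}_k\right) \;\ge\; \sup_{g \,\in\, \mathrm{conv}(\mathcal{G})} L(g) \;-\; \frac{c}{k}
```

Here L denotes the log-likelihood, \hat{f}_k the k-component mixture fitted by maximum likelihood (or by the iterative likelihood improvement), and conv(G) the set of all convex combinations of component densities.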

Journal ArticleDOI
TL;DR: Findings show how the sigmoid function offers significantly greater advantages than the other functions in the same decisional model.
Abstract: Fuzzy cognitive maps (FCM) are graph-based modeling tools. FCM can be used for structuring and supporting decisional processes. FCM also allow what-if analysis through the definition of scenarios. It is possible to choose among four activation functions: (1) sigmoid function, (2) hyperbolic tangent function, (3) step function and (4) threshold linear function. The use of each function can lead to different alternatives. In this context, the main objective of the present study is to develop a benchmarking analysis among the mentioned functions using the same decisional model. Findings show that the sigmoid function offers significantly greater advantages than the other functions.

278 citations
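As a concrete illustration of the comparison above, the sketch below applies the four activation functions to one step of a common FCM state update. It is a minimal sketch under standard FCM conventions; the toy weight matrix, the steepness parameter lam, and the update rule without self-feedback are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# The four candidate FCM activation functions compared in the study.
def sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + np.exp(-lam * x))      # maps to (0, 1)

def hyperbolic_tangent(x):
    return np.tanh(x)                          # maps to (-1, 1)

def step(x):
    return (x > 0).astype(float)               # binary 0/1

def threshold_linear(x):
    return np.clip(x, 0.0, 1.0)                # linear, clipped to [0, 1]

def fcm_step(weights, state, activation):
    """One iteration of a basic FCM update: A_i <- f(sum_j w_ji * A_j)."""
    return activation(weights.T @ state)

# Toy 3-concept map (illustrative weights, not from the paper).
W = np.array([[0.0,  0.6, -0.4],
              [0.3,  0.0,  0.5],
              [0.0, -0.7,  0.0]])
A0 = np.array([0.5, 0.2, 0.8])

for f in (sigmoid, hyperbolic_tangent, step, threshold_linear):
    print(f.__name__, fcm_step(W, A0, f))
```

Running the loop shows how each function produces a different successor state from the same map and initial activations, which is exactly the kind of scenario comparison the benchmarking study performs.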

Journal ArticleDOI
TL;DR: A special admissibility condition for neural activation functions is introduced, which requires that the activation function be oscillatory, and linear transforms are constructed that represent quite general functions f as a superposition of ridge functions.

277 citations
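The ridge-function superposition referred to in the summary has the generic form below (notation is illustrative, not taken from the paper); each term is constant along directions orthogonal to a_k:

```latex
f(x) \;\approx\; \sum_{k=1}^{K} c_k \, g\!\left(a_k \cdot x + b_k\right), \qquad x, a_k \in \mathbb{R}^d
```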

Proceedings ArticleDOI
01 Jan 1989
TL;DR: Multilayer feedforward networks possess universal approximation capabilities by virtue of the presence of intermediate layers with sufficiently many parallel processors; the properties of the intermediate-layer activation function are not so crucial.
Abstract: K.M. Hornik, M. Stinchcombe, and H. White (Univ. of California at San Diego, Dept. of Economics Discussion Paper, June 1988; to appear in Neural Networks) showed that multilayer feedforward networks with as few as one hidden layer, no squashing at the output layer, and arbitrary sigmoid activation function at the hidden layer are universal approximators: they are capable of arbitrarily accurate approximation to arbitrary mappings, provided sufficiently many hidden units are available. The present authors obtain identical conclusions but do not require the hidden-unit activation to be sigmoid. Instead, it can be a rather general nonlinear function. Thus, multilayer feedforward networks possess universal approximation capabilities by virtue of the presence of intermediate layers with sufficiently many parallel processors; the properties of the intermediate-layer activation function are not so crucial. In particular, sigmoid activation functions are not necessary for universal approximation.

274 citations
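The sketch below is a quick numerical illustration of the point made in the abstract above, that the hidden-unit activation need not be sigmoid. It is not the authors' construction: it simply builds a single hidden layer with random weights and a deliberately non-sigmoid (Gaussian bump) activation, and fits a linear, unsquashed output layer by least squares; the target function, layer width, and random scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target mapping to approximate on [-3, 3].
x = np.linspace(-3, 3, 400).reshape(-1, 1)
y = np.sin(2 * x).ravel()

def hidden_layer(x, w, b, activation):
    """Single hidden layer: activation(x @ w + b)."""
    return activation(x @ w + b)

# A deliberately non-sigmoid hidden activation (Gaussian bump).
gaussian = lambda z: np.exp(-z ** 2)

n_hidden = 50
w = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.uniform(-3, 3, size=n_hidden)

H = hidden_layer(x, w, b, gaussian)
# Linear output layer (no squashing at the output), fitted by least squares.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

print("max abs error:", np.max(np.abs(H @ beta - y)))
```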

Journal ArticleDOI
TL;DR: It is shown theoretically that the recently developed extreme learning machine (ELM) algorithm can be used to train the neural networks with threshold functions directly instead of approximating them with sigmoid functions.
Abstract: Neural networks with threshold activation functions are highly desirable because of the ease of hardware implementation. However, the popular gradient-based learning algorithms cannot be directly used to train these networks as the threshold functions are nondifferentiable. Methods available in the literature mainly focus on approximating the threshold activation functions by using sigmoid functions. In this paper, we show theoretically that the recently developed extreme learning machine (ELM) algorithm can be used to train the neural networks with threshold functions directly instead of approximating them with sigmoid functions. Experimental results based on real-world benchmark regression problems demonstrate that the generalization performance obtained by ELM is better than other algorithms used in threshold networks. Also, the ELM method does not need control variables (manually tuned parameters) and is much faster.

268 citations
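Below is a minimal sketch of the ELM recipe described in the abstract above, applied to a hard-threshold hidden layer: the hidden weights are drawn at random and never trained, so the non-differentiability of the threshold function does not matter, and the output weights are obtained in closed form by least squares. The toy data, network size, and random scheme are illustrative assumptions, not the paper's benchmarks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data (illustrative, not a benchmark from the paper).
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

def elm_fit(X, y, n_hidden=100):
    """ELM with a hard-threshold hidden layer: random input weights and
    biases, output weights solved by least squares (no gradient descent)."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = (X @ W + b > 0).astype(float)          # threshold activation
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return ((X @ W + b > 0).astype(float)) @ beta

W, b, beta = elm_fit(X, y)
pred = elm_predict(X, W, b, beta)
print("training RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

Because the only fitted quantity is the linear readout beta, there are no manually tuned control variables beyond the number of hidden units, which matches the abstract's claim about the simplicity of the method.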


Network Information
Related Topics (5)
Artificial neural network: 207K papers, 4.5M citations (69% related)
Deep learning: 79.8K papers, 2.1M citations (68% related)
Linear regression: 21.3K papers, 1.2M citations (68% related)
Convolutional neural network: 74.7K papers, 2M citations (67% related)
Sampling (statistics): 65.3K papers, 1.2M citations (67% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    253
2022    674
2021    121
2020    158
2019    167
2018    134