Topic
Sigmoid function
About: The sigmoid function is a research topic. Over its lifetime, 2228 publications have appeared within this topic, receiving 59557 citations. The topic is also known as: S curve.
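For orientation, the S curve in question is the logistic sigmoid, σ(x) = 1/(1 + e^(−x)); a minimal sketch:

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real x into (0, 1), tracing the S curve."""
    return 1.0 / (1.0 + math.exp(-x))

# Symmetric about (0, 0.5); saturates towards 0 and 1 in the tails.
print(sigmoid(0.0))   # 0.5
print(round(sigmoid(4.0), 3))
```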
Papers published on a yearly basis
Papers
29 Nov 1999
TL;DR: For mixture density estimation, it is shown that a k-component mixture estimated by maximum likelihood (or by an iterative likelihood improvement that is introduced) achieves log-likelihood within order 1/k of the log-likelihood achievable by any convex combination.
Abstract: Gaussian mixtures (or so-called radial basis function networks) for density estimation provide a natural counterpart to sigmoidal neural networks for function fitting and approximation. In both cases, it is possible to give simple expressions for the iterative improvement of performance as components of the network are introduced one at a time. In particular, for mixture density estimation we show that a k-component mixture estimated by maximum likelihood (or by an iterative likelihood improvement that we introduce) achieves log-likelihood within order 1/k of the log-likelihood achievable by any convex combination. Consequences for approximation and estimation using Kullback-Leibler risk are also given. A Minimum Description Length principle selects the optimal number of components k that minimizes the risk bound.
285 citations
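The greedy construction described in the abstract can be illustrated with a toy numerical sketch. The data, the fixed kernel bandwidth, and the candidate-centre grid below are all illustrative assumptions, not the paper's setup; the point is only the mechanism: at step k, one new Gaussian kernel is blended into the current density by a convex combination with weight 1/k, keeping the centre that most improves the average log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data from a two-bump distribution (illustrative only).
data = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

def gauss(x, mu, sigma=0.5):
    """Gaussian kernel with a fixed, assumed bandwidth."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Greedy k-component construction via iterative likelihood improvement.
density = np.full_like(data, 1e-12)   # current mixture density at the data points
lls = []
for k in range(1, 5):
    alpha = 1.0 / k                   # convex mixing weight for the new kernel
    ll, mu = max((np.mean(np.log((1 - alpha) * density + alpha * gauss(data, m))), m)
                 for m in np.linspace(-4, 4, 41))
    density = (1 - alpha) * density + alpha * gauss(data, mu)
    lls.append(float(ll))

print([round(v, 2) for v in lls])  # average log-likelihood improves as k grows
```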
TL;DR: Findings show how the sigmoid function offers significantly greater advantages than the other functions in the same decisional model.
Abstract: Fuzzy cognitive maps (FCM) are graph-based modeling tools. FCM can be used for structuring and supporting decisional processes. FCM also allow what-if analysis through the definition of scenarios. It is possible to choose among four activation functions: (1) the sigmoid function, (2) the hyperbolic tangent function, (3) the step function, and (4) the threshold linear function. Each function can produce different alternatives. In this context, the main objective of the present study is to develop a benchmarking analysis among the mentioned functions using the same decisional model. Findings show how the sigmoid function offers significantly greater advantages than the other functions.
278 citations
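A minimal sketch of one FCM inference step under each of the four activation functions named above. The 3-concept map, the weight matrix, the 0.5 step threshold, and the update rule A' = f(A · W) are illustrative assumptions, not the paper's model:

```python
import numpy as np

# The four activation functions compared in the study (0.5 step threshold assumed).
def sigmoid(x):          return 1.0 / (1.0 + np.exp(-x))
def hyperbolic_tan(x):   return np.tanh(x)
def step(x):             return (x >= 0.5).astype(float)
def threshold_linear(x): return np.clip(x, 0.0, 1.0)

# One inference step of a toy 3-concept map: new activations A' = f(A @ W).
A = np.array([0.6, 0.4, 0.8])                # current concept values
W = np.array([[ 0.0,  0.5, -0.3],            # causal weight matrix
              [ 0.4,  0.0,  0.6],
              [-0.2,  0.7,  0.0]])

for f in (sigmoid, hyperbolic_tan, step, threshold_linear):
    print(f"{f.__name__:16s}", np.round(f(A @ W), 3))
```

One way to read the reported advantage: the sigmoid keeps every concept value inside (0, 1) and varies smoothly with the input, whereas the step and threshold-linear functions discard gradations between scenarios.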
TL;DR: A special admissibility condition for neural activation functions is introduced, requiring that the activation function be oscillatory; linear transforms are then constructed which represent quite general functions f as a superposition of ridge functions.
277 citations
01 Jan 1989
TL;DR: Multilayer feedforward networks possess universal approximation capabilities by virtue of the presence of intermediate layers with sufficiently many parallel processors; the properties of the intermediate-layer activation function are not so crucial.
Abstract: K.M. Hornik, M. Stinchcombe, and H. White (Univ. of California at San Diego, Dept. of Economics Discussion Paper, June 1988; to appear in Neural Networks) showed that multilayer feedforward networks with as few as one hidden layer, no squashing at the output layer, and arbitrary sigmoid activation function at the hidden layer are universal approximators: they are capable of arbitrarily accurate approximation to arbitrary mappings, provided sufficiently many hidden units are available. The present authors obtain identical conclusions but do not require the hidden-unit activation to be sigmoid. Instead, it can be a rather general nonlinear function. Thus, multilayer feedforward networks possess universal approximation capabilities by virtue of the presence of intermediate layers with sufficiently many parallel processors; the properties of the intermediate-layer activation function are not so crucial. In particular, sigmoid activation functions are not necessary for universal approximation.
274 citations
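The conclusion above can be observed numerically: a one-hidden-layer network with a non-sigmoid activation (cosine here, an arbitrary choice), random hidden weights, and output weights fitted by least squares approximates a smooth target better as hidden units are added. The target function and sampling grid are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-np.pi, np.pi, 200)[:, None]   # inputs on an interval
y = np.sin(2 * x).ravel()                      # target mapping to approximate

def fit_error(n_hidden):
    """Max approximation error of a one-hidden-layer net with cosine activation."""
    Wh = rng.normal(size=(1, n_hidden))        # random hidden weights
    b = rng.normal(size=n_hidden)              # random hidden biases
    H = np.cos(x @ Wh + b)                     # non-sigmoid hidden layer
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit output weights only
    return np.max(np.abs(H @ beta - y))

print(fit_error(5), fit_error(100))  # error shrinks with more hidden units
```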
TL;DR: It is shown theoretically that the recently developed extreme learning machine (ELM) algorithm can be used to train the neural networks with threshold functions directly instead of approximating them with sigmoid functions.
Abstract: Neural networks with threshold activation functions are highly desirable because of the ease of hardware implementation. However, the popular gradient-based learning algorithms cannot be directly used to train these networks as the threshold functions are nondifferentiable. Methods available in the literature mainly focus on approximating the threshold activation functions by using sigmoid functions. In this paper, we show theoretically that the recently developed extreme learning machine (ELM) algorithm can be used to train the neural networks with threshold functions directly instead of approximating them with sigmoid functions. Experimental results based on real-world benchmark regression problems demonstrate that the generalization performance obtained by ELM is better than other algorithms used in threshold networks. Also, the ELM method does not need control variables (manually tuned parameters) and is much faster.
268 citations
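A minimal ELM sketch in the spirit of the abstract (the toy regression target and network size are illustrative assumptions, not the paper's benchmarks): the hidden weights and biases are random and never trained, so the nondifferentiable threshold activation causes no problem, and only the linear output weights are solved in closed form.

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.linspace(0, 1, 300)[:, None]            # toy inputs
y = np.sin(2 * np.pi * X).ravel()              # toy regression target

n_hidden = 200
Wh = rng.normal(size=(1, n_hidden))            # random hidden weights (untrained)
b = rng.uniform(-1, 1, size=n_hidden)          # random biases (untrained)
H = (X @ Wh + b > 0).astype(float)             # threshold activation, used directly
beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # output weights by least squares

rmse = np.sqrt(np.mean((H @ beta - y) ** 2))
print(round(float(rmse), 3))  # small: threshold features fit the smooth target
```

A gradient-based approach would instead have to replace the step with a sigmoid surrogate and tune its slope; here nothing is tuned at all, which matches the paper's point about control variables.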