
Showing papers on "Activation function" published in 2002


Journal ArticleDOI
TL;DR: In this paper, the authors study the absolute periodicity and absolute stability of delayed neural networks, derive simple and checkable conditions for guaranteeing both properties, and provide simulations for absolute periodicity.
Abstract: Proposes to study the absolute periodicity of delayed neural networks. A neural network is said to be absolutely periodic, if for every activation function in some suitable functional set and every input periodic vector function, a unique periodic solution of the network exists and all other solutions of the network converge exponentially to it. Absolute stability of delayed neural networks is also studied in this paper. Simple and checkable conditions for guaranteeing absolute periodicity and absolute stability are derived. Simulations for absolute periodicity are given.

98 citations


Journal ArticleDOI
07 Aug 2002
TL;DR: Based on globally Lipschitz continuous activation functions, new conditions ensuring existence, uniqueness and global robust exponential stability of the equilibrium point of interval neural networks with delays are obtained.
Abstract: In this paper, based on globally Lipschitz continuous activation functions, new conditions ensuring existence, uniqueness and global robust exponential stability of the equilibrium point of interval neural networks with delays are obtained. The delayed Hopfield network, bidirectional associative memory network and cellular neural network are special cases of the network model considered. All the results obtained are generalizations of some recent results reported in the literature for neural networks with constant delays.

83 citations


Journal ArticleDOI
TL;DR: It is seen that for this class of objective functions, the dimensionality of the problem is critical and with increasing numbers of decision variables, the learning becomes more and more difficult for ESs, and an “efficient” parameterization becomes crucial.

73 citations


Journal ArticleDOI
TL;DR: The architecture of radial basis function neural networks is modified to model linear as well as the usual nonlinear input–output relationships; the resulting network is at least as powerful as the Takagi–Sugeno type of fuzzy rule-based systems.

73 citations


Journal ArticleDOI
TL;DR: New results on bounds for the gradient and Hessian of the error are provided; using these bounds, it is possible to estimate how much the error can be reduced by changing the centers, and a step size can be specified to achieve a guaranteed amount of reduction in error.
Abstract: In radial basis function (RBF) networks, the placement of centers is said to have a significant effect on network performance. Supervised learning of center locations in some applications shows that such networks are superior to networks whose centers are located using unsupervised methods. But such networks can take the same training time as sigmoid networks: the increased time needed for supervised learning offsets the short training time of regular RBF networks. One way to overcome this may be to train the network with a set of centers selected by unsupervised methods and then to fine-tune the center locations. This can be done by first evaluating whether moving the centers would decrease the error and then, depending on the required level of accuracy, changing the center locations. This paper provides new results on bounds for the gradient and Hessian of the error, considered first as a function of the independent set of parameters, namely the centers, widths, and weights, and then as a function of centers and widths, where the linear weights are now functions of the basis function parameters for networks of fixed size. Moreover, bounds for the Hessian are also provided along a line beginning at the initial set of parameters. Using these bounds, it is possible to estimate how much the error can be reduced by changing the centers, and a step size can be specified to achieve a guaranteed amount of reduction in error.
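The abstract describes checking whether moving the centers would decrease the error before fine-tuning them. As a rough illustration of that check (not the paper's bounds), here is a minimal sketch of the squared error and its gradient with respect to Gaussian RBF centers, assuming fixed widths and fixed linear weights; all names are illustrative.

```python
import numpy as np

def rbf_design(X, centers, widths):
    """Gaussian basis matrix: Phi[n, k] = exp(-||x_n - c_k||^2 / (2 s_k^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))

def error_and_center_gradient(X, t, centers, widths, weights):
    """Squared error and its gradient w.r.t. the centers; the sign and size
    of the gradient indicate whether (and roughly how far) moving a center helps."""
    Phi = rbf_design(X, centers, widths)            # (N, K)
    r = Phi @ weights - t                           # residuals, (N,)
    E = 0.5 * np.dot(r, r)
    grad = np.zeros_like(centers)
    for k in range(centers.shape[0]):
        coeff = r * weights[k] * Phi[:, k] / widths[k] ** 2
        grad[k] = coeff @ (X - centers[k])          # dE/dc_k
    return E, grad
```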

73 citations


Book ChapterDOI
01 Jan 2002
TL;DR: This chapter deals with neural networks using a supervised learning procedure that can learn a nonlinear relationship between an input and the corresponding output based on the desired or target output, and examines three types of neural networks: multilayer perceptrons, finite impulse response (FIR) multilayer perceptrons, and recurrent neural networks.
Abstract: Publisher Summary An artificial neural network is a highly connected array of elementary processors called neurons. Neural networks have been extensively investigated and have been successfully applied to a variety of areas; one of these areas is time-series prediction. This chapter deals with neural networks using a supervised learning procedure that can learn a nonlinear relationship between an input and the corresponding output based on the desired or target output. Such neural networks can learn the nonlinear relationship by using the past values of the time series as the input and the desired output, and can implicitly construct the underlying model required for time-series prediction. This chapter discusses three types of neural networks: multilayer perceptrons, finite impulse response (FIR) multilayer perceptrons, and recurrent neural networks. The multilayer perceptron consists of an input layer of distribution nodes, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal into the input layer passes through the hidden layers to the output layer in a forward direction, on a layer-by-layer basis. FIR multilayer perceptrons can be constructed by replacing the synaptic weights with FIR synaptic filters in the structure of the standard multilayer perceptron. The recurrent neural networks use an ordinary model of a neuron, but the networks develop a temporal processing capability through feedback built into the architecture. This chapter examines the aforementioned neural networks using training algorithms and applications.
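To make the time-series setup concrete, here is a minimal sketch (not from the chapter) of feeding the past p values of a series into a one-hidden-layer perceptron to predict the next value; the window length, layer sizes, and tanh activation are assumptions for illustration.

```python
import numpy as np

def make_windows(series, p):
    """Past p values are the input pattern; the next value is the target."""
    X = np.array([series[i:i + p] for i in range(len(series) - p)])
    y = np.asarray(series[p:])
    return X, y

def mlp_predict(X, W1, b1, W2, b2):
    """Forward pass of a multilayer perceptron with one hidden layer."""
    h = np.tanh(X @ W1 + b1)      # hidden layer of computation nodes
    return h @ W2 + b2            # linear output node: predicted next value
```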

71 citations


Book ChapterDOI
01 Jan 2002
TL;DR: A major advantage of fuzzy wavenet techniques in comparison to most neurofuzzy methods is that the rules are validated online during learning by using a simple algorithm based on the fast wavelet decomposition algorithm.
Abstract: The combination of wavelet theory and neural networks has led to the development of wavelet networks. Wavelet networks are feed-forward neural networks using wavelets as the activation function. They have been used in classification and identification problems with some success. Their strength lies in capturing essential features in "frequency-rich" signals. In wavelet networks, both the position and the dilation of the wavelets are optimized, besides the weights. Wavenet is another term used to describe wavelet networks. Originally, wavenets referred to neural networks using dyadic wavelets; in wavenets, the position and dilation of the wavelets are fixed and only the weights are optimized by the network. We propose to adopt this terminology. The theory of wavenets has been generalized by the author to biorthogonal wavelets, and this extension has led to the development of fuzzy wavenets. A serious difficulty with most neurofuzzy methods is that they often furnish rules without a transparent interpretation. A solution to this problem is furnished by multiresolution techniques. The most appropriate membership functions are chosen from a dictionary of membership functions forming a multiresolution. The dictionary contains membership functions that are symmetric, everywhere positive, and have a single maximum; this family includes, among others, splines and some radial functions. The main advantage of using a dictionary of membership functions is that each term, such as "small" or "large", is well defined beforehand and is not modified during learning. The multiresolution properties of the membership functions in the dictionary permit membership functions to be fused or split quite easily, so that the rules can be expressed in a linguistically understandable and intuitive form for the human expert. Different techniques, generally referred to by the term "fuzzy-wavelet", have been developed for data on a regular grid. Fuzzy wavenets extend these techniques to online learning. A major advantage of fuzzy wavenet techniques in comparison to most neurofuzzy methods is that the rules are validated online during learning by using a simple algorithm based on the fast wavelet decomposition algorithm. Significant applications of wavelet networks and fuzzy wavenets are discussed to illustrate the potential of these methods.
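As a concrete illustration of a wavelet used as an activation function, here is a minimal sketch of a one-dimensional wavelet network; the Mexican-hat mother wavelet is an assumption for illustration (the chapter itself works with dyadic and biorthogonal wavelets).

```python
import numpy as np

def mexican_hat(u):
    """Second derivative of a Gaussian ('Mexican hat') mother wavelet."""
    return (1.0 - u ** 2) * np.exp(-u ** 2 / 2.0)

def wavelet_network(x, weights, positions, dilations):
    """y(x) = sum_k w_k * psi((x - b_k) / a_k).

    In a wavelet network, the positions b_k and dilations a_k are optimized
    together with the weights; in a wavenet they stay fixed on a dyadic grid
    and only the weights are learned."""
    u = (x[:, None] - positions[None, :]) / dilations[None, :]
    return mexican_hat(u) @ weights
```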

70 citations


Journal ArticleDOI
TL;DR: It is shown that, without knowing the system parameters and with only basic information on the uncertainties, the procedure provides simple solutions to the classical problems of neural network function approximation, as well as eccentricity control and friction compensation of mechanical systems.

58 citations


Journal ArticleDOI
TL;DR: This paper attempts to generalize Piche's method by deriving a universal expression of MLP sensitivity for antisymmetric squashing activation functions, without any restriction on input and output perturbations.
Abstract: Sensitivity analysis on a neural network is mainly investigated after the network has been designed and trained. Very few have considered this as a critical issue prior to network design. Piche's statistical method (1992, 1995) is useful for multilayer perceptron (MLP) design, but too severe limitations are imposed on both input and weight perturbations. This paper attempts to generalize Piche's method by deriving a universal expression of MLP sensitivity for antisymmetric squashing activation functions, without any restriction on input and output perturbations. Experimental results, based on a three-layer MLP with 30 nodes per layer, agree closely with our theoretical investigations. The effects of the network design parameters, such as the number of layers, the number of neurons per layer, and the chosen activation function, are analyzed, and they provide useful information for network design decision-making. Based on the sensitivity analysis of the MLP, we present a network design method for a given application to determine the network structure and estimate the permitted weight range for network training.

54 citations


Proceedings ArticleDOI
18 Nov 2002
TL;DR: This paper investigates existence conditions of energy functions for a class of fully connected complex-valued neural networks and proposes an energy function, analogous to those of real-valued Hopfield-type neural networks, and shows that, similar to the real- valued ones, the energy function enables us to analyze qualitative behaviors of the complex- valued neural networks.
Abstract: Recently models of neural networks that can deal with complex numbers, complex-valued neural networks, have been proposed and several studies on their abilities of information processing have been done. In this paper we investigate existence conditions of energy functions for a class of fully connected complex-valued neural networks and propose an energy function, analogous to those of real-valued Hopfield-type neural networks. It is also shown that, similar to the real-valued ones, the energy function enables us to analyze qualitative behaviors of the complex-valued neural networks. We present dynamic properties of the complex-valued neural networks obtained by qualitative analysis using the energy function. A synthesis method of complex-valued associative memories by utilizing the analysis results is also discussed.
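A minimal sketch of what an energy function "analogous to those of real-valued Hopfield-type networks" might look like for a complex state vector is given below; the exact form and the conditions on the weight matrix proved in the paper are not reproduced here, so treat this as an assumption-laden illustration.

```python
import numpy as np

def complex_hopfield_energy(W, b, z):
    """Illustrative energy for a fully connected complex-valued network,
    modelled on the real-valued Hopfield function E = -1/2 x^T W x - b^T x.
    Here z and b are complex vectors and W a complex weight matrix; taking
    the real part makes E a real quantity that can decrease along the
    network dynamics (under suitable conditions on W, not shown here)."""
    return -0.5 * np.real(np.conj(z) @ W @ z) - np.real(np.conj(b) @ z)
```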

53 citations


Journal ArticleDOI
TL;DR: In this article, the authors demonstrate that relative to a random initial population, seeding the initial population of an evolutionary search with center-crossing networks significantly improves both the frequency and the speed with which high-fitness oscillatory circuits evolve on a simple walking task.
Abstract: A center-crossing recurrent neural network is one in which the null-(hyper)surfaces of each neuron intersect at their exact centers of symmetry, ensuring that each neuron's activation function is centered over the range of net inputs that it receives. We demonstrate that relative to a random initial population, seeding the initial population of an evolutionary search with center-crossing networks significantly improves both the frequency and the speed with which high-fitness oscillatory circuits evolve on a simple walking task. The improvement is especially striking at low mutation variances. Our results suggest that seeding with center-crossing networks may often be beneficial, since a wider range of dynamics is more likely to be easily accessible from a population of center-crossing networks than from a population of random networks.
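To make the center-crossing condition concrete, here is a minimal sketch, assuming the standard logistic activation (output in (0, 1)) and the convention that W[i, j] is the weight from neuron j to neuron i; the formula follows from centring each neuron's bias on the midpoint of its possible net input, and the weight range below is arbitrary.

```python
import numpy as np

def center_crossing_biases(W):
    """Biases that centre each neuron's activation function over the range
    of net inputs it can receive: with logistic outputs the midpoint of each
    presynaptic output is 1/2, so theta_i = -sum_j W[i, j] / 2."""
    return -W.sum(axis=1) / 2.0

# Seeding an evolutionary search: draw random weights, derive matching biases.
rng = np.random.default_rng(0)
W = rng.uniform(-5.0, 5.0, size=(3, 3))
theta = center_crossing_biases(W)
```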

Journal ArticleDOI
TL;DR: FC networks, which generalize the earlier corner classification networks, have been compared against Backpropagation and Radial Basis Function networks and are seen to have excellent performance for prediction of time-series and pattern recognition.

Journal ArticleDOI
TL;DR: A digital design for piecewise-linear (PWL) approximation to the sigmoid function is presented, based on a recursive algorithm that uses the lattice operators max and min to approximate nonlinear functions.
Abstract: A digital design for piecewise-linear (PWL) approximation to the sigmoid function is presented. Circuit operation is based on a recursive algorithm that uses the lattice operators max and min to approximate nonlinear functions. The resulting hardware is programmable, allowing for control of the delay-time/approximation-accuracy trade-off.
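The paper's recursion is not reproduced in the abstract; the sketch below only illustrates the general idea of building a PWL sigmoid from affine segments combined with the lattice operators max and min, with illustrative (not the paper's) slopes and breakpoints.

```python
def pwl_sigmoid(x):
    """Piecewise-linear sigmoid approximation using only min/max and a few
    affine segments; the segment coefficients here are arbitrary examples."""
    # Concave half (x >= 0): minimum of affine segments and the ceiling 1.0.
    upper = min(0.25 * x + 0.5, 0.09375 * x + 0.625, 1.0)
    # Antisymmetry about (0, 0.5) gives the convex half for x < 0.
    lower = 1.0 - min(-0.25 * x + 0.5, -0.09375 * x + 0.625, 1.0)
    return upper if x >= 0 else lower
```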

Proceedings ArticleDOI
07 Aug 2002
TL;DR: A new activation function to accelerate backpropagation learning is presented and simulation using this activation function shows improvement in the learning speed compared with other commonly used functions and the newactivation function proposed by Bilski (2000).
Abstract: Multilayer feedforward networks trained by the backpropagation algorithm suffer from slow learning. One of the reasons for slow convergence is the diminishing value of the derivative of the commonly used activation functions as the nodes approach saturation. In this paper, we present a new activation function to accelerate backpropagation learning. A comparison among the commonly used activation functions, the recently proposed logarithmic function, and the proposed activation function shows accelerated convergence with the proposed one. This activation function can be used in conjunction with other techniques to further accelerate the learning speed or reduce the chance of being trapped in local minima. Simulation using this activation function shows improvement in learning speed compared with other commonly used functions and the new activation function proposed by Bilski (2000). This function may also be used in other multilayer feedforward training algorithms.
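The proposed activation function itself is not given in the abstract; the short check below only illustrates the problem the paper attacks, namely how the logistic derivative (and hence the backpropagation update) shrinks as a node saturates.

```python
import numpy as np

net = np.array([0.0, 2.0, 4.0, 8.0])
s = 1.0 / (1.0 + np.exp(-net))   # logistic activation
ds = s * (1.0 - s)               # derivative used in the BP weight update
# ds ~ [0.25, 0.105, 0.018, 0.0003]: near saturation the gradient all but
# vanishes, which slows convergence and motivates alternative activations.
print(np.round(ds, 4))
```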

Proceedings ArticleDOI
11 Nov 2002
TL;DR: A novel approach for applying genetic algorithms to the configuration of radial basis function networks is presented and a new crossover operator that allows for some control over the competing conventions problem is introduced.
Abstract: A novel approach for applying genetic algorithms to the configuration of radial basis function networks is presented. A new crossover operator that allows for some control over the competing conventions problem is introduced. A minimalist initialization scheme, which tends to generate more parsimonious models, is also presented. Finally, a reformulation of the generalized cross-validation criterion for model selection, making it more conservative, is discussed. The proposed model is submitted to a computational experiment in order to verify its effectiveness.

Proceedings ArticleDOI
06 Nov 2002
TL;DR: This paper analyses which logical operations performed by a single-layer perceptron are linearly separable and which are not, shows that XOR is not linearly separable, and proposes several solutions to the XOR problem.
Abstract: This paper explains the network structures and methods of the single-layer perceptron and the multi-layer perceptron. It also analyses which logical operations performed by a single-layer perceptron are linearly separable and which are not. XOR is not a linearly separable operation and therefore cannot be handled by a single-layer perceptron. Based on this analysis, several solutions to the XOR problem are proposed: the single-layer perceptron can be improved to a multi-layer perceptron, a functional perceptron, or a perceptron with a quadratic function. These solutions are designed and analysed.
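A minimal sketch of the multi-layer solution the paper points to: no single threshold unit can separate XOR's classes, but one hidden layer of two threshold units followed by an output threshold unit realises it. The particular weights below are one well-known hand-crafted solution, not taken from the paper.

```python
import numpy as np

step = lambda z: (z >= 0).astype(int)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: one unit computes OR, the other NAND; the output unit ANDs them.
W1 = np.array([[1.0, 1.0],      # OR unit
               [-1.0, -1.0]])   # NAND unit
b1 = np.array([-0.5, 1.5])
W2 = np.array([1.0, 1.0])
b2 = -1.5

h = step(X @ W1.T + b1)
y = step(h @ W2 + b2)
print(y)   # [0 1 1 0] -- XOR, which no single-layer perceptron can realise
```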

Journal ArticleDOI
TL;DR: The aim of this paper is to present an efficient implementation of unsupervised adaptive-activation function neurons dedicated to one-dimensional probability density estimation, with application to independent component analysis.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: A PD-PI-type fuzzy controller has been developed where the membership functions are adjusted by tuning the scaling factors using a neural network, demonstrating that the sigmoidal function and its shape can represent the nonlinearity of the system.
Abstract: The limitations of conventional model-based control mechanisms for flexible manipulator systems have stimulated the development of intelligent control mechanisms incorporating fuzzy logic and neural networks. Problems have been encountered in applying the traditional PD-, PI-, and PID-type fuzzy controllers to flexible-link manipulators. A PD-PI-type fuzzy controller has been developed where the membership functions are adjusted by tuning the scaling factors using a neural network. Such a network needs a sufficient number of neurons in the hidden layer to approximate the nonlinearity. A simple realisable network is desirable and hence a single neuron network with a nonlinear activation function is used. It has been demonstrated that the sigmoidal function and its shape can represent the nonlinearity of the system. A genetic algorithm is used to learn the weights, biases and shape of the sigmoidal function of the neural network.

Journal ArticleDOI
10 Dec 2002
TL;DR: The combined periodic activation function is found to possess the fast convergence and multicluster classification capabilities of the sinusoidal activation function while keeping the robustness property of the sigmoid function required in the modelling of unknown systems.
Abstract: The authors investigate the convergence and pruning performance of multilayer feedforward neural networks with different types of neuronal activation functions in solving various problems. Three types of activation functions are adopted in the network, namely, the traditional sigmoid function, the sinusoidal function and a periodic function that can be considered as a combination of the first two functions. To speed up the learning, as well as to reduce the network size, the extended Kalman filter (EKF) algorithm in conjunction with a pruning method is used to train the network. The corresponding networks are applied to solve five typical problems, namely, the 4-point XOR logic function, parity generation, handwritten digit recognition, piecewise linear function approximation and sunspot series prediction. Simulation results show that periodic activation functions perform better than monotonic ones in solving multicluster classification problems. Moreover, the combined periodic activation function is found to possess the fast convergence and multicluster classification capabilities of the sinusoidal activation function while keeping the robustness property of the sigmoid function required in the modelling of unknown systems.
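The abstract does not give the combined function explicitly; the sketch below is only one plausible way to blend a monotonic sigmoid with a sinusoidal term, shown to illustrate the idea of keeping the sigmoid's global shape while adding periodic structure. The blend coefficient and frequency are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def combined_activation(x, a=0.1, w=np.pi):
    """Illustrative 'sigmoid plus sinusoid' activation: globally shaped like
    the sigmoid, but with periodic ripples that help separate multiple
    clusters. Not the exact function used in the paper."""
    return sigmoid(x) + a * np.sin(w * x)
```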

Proceedings ArticleDOI
TL;DR: The most important characteristics of RBF networks are illustrated with a number of examples and the same algorithm and program may be successfully applied to regression modeling or pattern classification.
Abstract: This paper discusses an implementation and application of Radial Basis Function (RBF) Networks. This type of neural network provides a universal approach to function approximation. The same algorithm and program may be successfully applied to regression modeling or pattern classification. We illustrate the most important characteristics of RBF networks with a number of examples and discuss network behavior in depth. The software has been implemented in the A+ language, which became available to developers in January 2001.
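The A+ implementation itself is not shown in the abstract; here is a minimal numpy sketch of the usual RBF recipe the paper describes: basis centres fixed by a simple unsupervised step (here, evenly spaced), linear output weights fitted by least squares, with the same machinery serving regression or classification. All values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)    # noisy regression target

centers = np.linspace(-3, 3, 10)                      # "unsupervised" placement
width = 0.8
Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)           # linear output weights
y_hat = Phi @ w   # regression fit; with class-indicator targets the same
                  # design matrix and solve give a pattern classifier
```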

Journal ArticleDOI
TL;DR: The N-bit parity problem is solved with a neural network that allows direct connections between the input layer and the output layer; the network can be simplified by using a single hidden-layer neuron with a "staircase" type activation function instead of ⌊N/2⌋ hidden-layer neurons.
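A minimal sketch of the single-neuron idea: with unit input weights, the hidden neuron's net input is simply the number of ones, and a staircase-shaped activation that alternates between 0 and 1 on successive unit intervals turns that count directly into the parity bit. The exact staircase used in the paper may differ.

```python
import numpy as np

def staircase(u):
    """Staircase activation: alternates 0/1 on successive unit intervals."""
    return int(np.floor(u)) % 2

def parity(bits):
    """One hidden neuron with unit weights sums the bits; the staircase
    activation maps that sum to the N-bit parity."""
    return staircase(float(np.sum(bits)))

assert [parity(b) for b in ([0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1])] == [0, 1, 0, 1]
```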

Proceedings ArticleDOI
13 May 2002
TL;DR: It is shown that fully complex FNNs can universally approximate any complex mapping to an arbitrary accuracy on a compact set of input patterns with probability 1.
Abstract: Recently, we have presented the 'fully' complex feed-forward neural network (FNN) using a subset of complex elementary transcendental functions (ETFs) as the nonlinear activation functions. In this paper, we show that fully complex FNNs can universally approximate any complex mapping to an arbitrary accuracy on a compact set of input patterns with probability 1. The proof is extended to a new family of complex activation functions possessing essential singularities. We discuss properties of the complex activation functions based on the types of their singularities and the implications of these for the efficiency and the domain of convergence in their applications.
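As a tiny illustration of a complex ETF used as an activation (tanh being one of the elementary transcendental functions considered in this line of work), the complex net input is fed directly through the function rather than splitting real and imaginary parts; the numbers are arbitrary.

```python
import numpy as np

# Complex net inputs from a fully complex layer (arbitrary example values).
net = np.array([0.3 + 0.4j, 1.0 - 2.0j])
out = np.tanh(net)   # a complex ETF applied directly to the complex net input
```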

Journal ArticleDOI
V. Deolalikar
TL;DR: A new two-layer neural paradigm based on increasing the dimensionality of the output of the first layer is proposed and is shown to be capable of forming any arbitrary decision region in input space.
Abstract: It is well known that a two-layer perceptron network with threshold neurons is incapable of forming arbitrary decision regions in input space, while a three-layer perceptron has that capability. The effect of replacing the output neuron in a two-layer perceptron with a bithreshold element is studied. The limitations of this modified two-layer perceptron are observed. Results on the separating capabilities of a pair of parallel hyperplanes are obtained. Based on these, a new two-layer neural paradigm based on increasing the dimensionality of the output of the first layer is proposed and is shown to be capable of forming any arbitrary decision region in input space. Then a type of logic called bithreshold logic, based on the bithreshold neuron transfer function, is studied. Results on the limits of switching function realizability using bithreshold gates are obtained.
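Assuming the natural interval-indicator form of the bithreshold transfer function (firing only when the net input lies between two thresholds, i.e. between a pair of parallel hyperplanes), a minimal sketch looks like this:

```python
def bithreshold(u, t_low, t_high):
    """Bithreshold neuron transfer function: output 1 only when the net
    input lies between the two thresholds, 0 otherwise."""
    return 1 if t_low <= u <= t_high else 0

# With a linear first layer, a single bithreshold output unit therefore fires
# exactly on the slab between two parallel hyperplanes in input space.
```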

Proceedings ArticleDOI
28 Oct 2002
TL;DR: A new adaptive activation function for multilayer feedforward neural networks is proposed; based upon the backpropagation (BP) algorithm, the free parameters of the activation function are adjusted along with the connection weights, and the resulting learning speed is much faster than that of traditional networks with a fixed activation function.
Abstract: The aim of this paper is to propose a new adaptive activation function for multilayer feedforward neural networks. Based upon the backpropagation (BP) algorithm, an effective learning method is derived to adjust the free parameters in the activation function as well as the connection weights between neurons. Its performance is demonstrated on the N-parity and two-spiral problems. The simulation results show that the proposed method is well suited to pattern classification problems and that its learning speed is much faster than that of traditional networks with a fixed activation function.

Journal ArticleDOI
TL;DR: Two different artificial neural networks are used for estimating a time-dependent boundary condition (x = 0) in a slab: a multilayer perceptron (MP) and a radial basis function (RBF) network.
Abstract: Two different artificial neural networks (NN) are used for estimating a time-dependent boundary condition (x = 0) in a slab: a multilayer perceptron (MP) and a radial basis function (RBF) network. The input for the NN is the temperature time series obtained from a probe next to the boundary of interest. Our numerical experiments follow the work of Krejsa et al. [4]. The NNs were trained with 5 per cent noise in the experimental data. The training was performed with 500 similar test functions and with 500 different test functions. Inversions with NNs trained on different test functions were better. The RBF-NN gave slightly better results than the MP-NN.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: Simulation results show that periodic activation functions perform better than monotonic ones in solving multi-cluster classification problems such as handwritten digit recognition.
Abstract: The problem of handwritten digit recognition is dealt with by multilayer feedforward neural networks with different types of neuronal activation functions. Three types of activation functions are adopted in the network, namely, the traditional sigmoid function, sinusoidal function and a periodic function that can be considered as a combination of the first two functions. To speed up the learning, as well as to reduce the network size, an extended Kalman filter algorithm with the pruning method is used to train the network. Simulation results show that periodic activation functions perform better than monotonic ones in solving multi-cluster classification problems such as handwritten digit recognition.

Proceedings ArticleDOI
12 May 2002
TL;DR: A co-evolutionary approach is used to train each of the created networks by adjusting both the weights of the hidden-layer neurons and the parameters for their activation functions.
Abstract: This paper describes a hierarchical evolutionary technique developed to design and train feedforward neural networks with different activation functions on their hidden-layer neurons (heterogeneous neural networks). At the upper level, a genetic algorithm is used to determine the number of neurons in the hidden layer and the type of the activation function of those neurons. At the second level, neural nets compete against each other across generations so that the nets with the lowest test errors survive. Finally, on the third level, a co-evolutionary approach is used to train each of the created networks by adjusting both the weights of the hidden-layer neurons and the parameters for their activation functions.

Journal ArticleDOI
TL;DR: A class of data-reusing learning algorithms for real-time recurrent neural networks (RNNs) is analyzed; the analysis is undertaken for a general sigmoid nonlinear activation function of a neuron under the real-time recurrent learning training algorithm.
Abstract: A class of data-reusing learning algorithms for real-time recurrent neural networks (RNNs) is analyzed. The analysis is undertaken for a general sigmoid nonlinear activation function of a neuron for the real-time recurrent learning training algorithm. Error bounds and convergence conditions for such data-reusing algorithms are provided for both contractive and expansive activation functions. The analysis covers various configurations that are generalizations of a linear structure infinite impulse response adaptive filter.

Proceedings ArticleDOI
07 Nov 2002
TL;DR: This method exploits a DC approach for constructing a fault dictionary and uses the neural network's classification capability to provide robust diagnosis, with a mechanism to deal with component tolerance and to reduce testing time.
Abstract: This paper presents a method for analog circuit fault diagnosis by using neural networks. This method exploits a DC approach for constructing a dictionary in fault diagnosis using the neural network's classification capability. Also, Radial Basis Function (RBF) and backward error propagation (BEP) networks are considered and compared for analog fault diagnosis. The primary focus of the paper is to provide robust diagnosis using a mechanism to deal with the problem of component tolerance and reduce testing time. Simulation results show that the radial basis function network with reasonable dimension has double precision in fault classification but its classification is local, while the backward error propagation network with reasonable dimension has single precision in fault classification but its classification is global.

Proceedings ArticleDOI
03 Nov 2002
TL;DR: In this paper, a new hidden layer error function is proposed which de-emphasizes net function errors that correspond to saturated activation function values, and an adaptive learning rate based on the local shape of the error surface is used in hidden layer training.
Abstract: The output weight optimization-hidden weight optimization (OWO-HWO) feedforward network training algorithm alternately solves linear equations for output weights and reduces a separate hidden layer error function with respect to hidden layer weights. Here, a new hidden layer error function is proposed which de-emphasizes net function errors that correspond to saturated activation function values. In addition, an adaptive learning rate based on the local shape of the error surface is used in hidden layer training. Faster learning convergence is experimentally verified.
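The paper's exact hidden-layer error function is not given in the abstract; the sketch below only illustrates one way such de-emphasis could look, weighting each unit's net-function error by the local sigmoid slope so that saturated units (slope near zero) contribute little. Function and variable names are assumptions.

```python
import numpy as np

def weighted_hidden_error(delta_net, net):
    """Illustrative hidden-layer error: each unit's desired net-function
    change delta_net is weighted by the local sigmoid slope, so errors at
    saturated units are de-emphasized."""
    s = 1.0 / (1.0 + np.exp(-net))
    slope = s * (1.0 - s)            # ~0.25 mid-range, ~0 when saturated
    return 0.5 * np.sum(slope * delta_net ** 2)
```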