
Showing papers on "Activation function published in 1992"


Posted Content
TL;DR: In this article, it was shown that a standard multilayer feedforward network with a locally bounded piecewise continuous activation function can approximate any continuous function to any degree of accuracy if and only if the network's activation function is not a polynomial.
Abstract: Several researchers characterized the activation function under which multilayer feedforward networks can act as universal approximators. We show that all the characterizations that were reported thus far in the literature are special cases of the following general result: a standard multilayer feedforward network with a locally bounded piecewise continuous activation function can approximate any continuous function to any degree of accuracy if and only if the network's activation function is not a polynomial. We also emphasize the important role of the threshold, asserting that without it the last theorem does not hold.
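A compact restatement of the result in generic notation (a paraphrase, not a quotation from the paper): for a locally bounded, piecewise continuous activation function σ and inputs x in R^n,

\[
\operatorname{span}\{\,x \mapsto \sigma(w\cdot x+\theta)\;:\;w\in\mathbb{R}^{n},\ \theta\in\mathbb{R}\,\}
\ \text{is dense in}\ C(K)\ \text{for every compact}\ K\subset\mathbb{R}^{n}
\;\Longleftrightarrow\;
\sigma\ \text{is not a polynomial.}
\]

The remark about the threshold can be seen from a simple example: with σ = sin and no threshold θ, every network output is an odd function of x, so even the constant function 1 cannot be approximated.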

216 citations


Proceedings ArticleDOI
16 Dec 1992
TL;DR: Simulation results are presented to demonstrate that the methods presented can be used for the effective control of complex nonlinear systems and it is shown that globally stable adaptive controllers can be determined.
Abstract: Some of the problems that arise in the control of nonlinear systems in the presence of uncertainty are considered. Multilayer neural networks and radial basis function networks are used in the design of identifiers and controllers, and gradient methods are used to adjust their parameters. For a restricted class of nonlinear systems, it is shown that globally stable adaptive controllers can be determined. Simulation results are presented to demonstrate that the methods presented can be used for the effective control of complex nonlinear systems.

197 citations


Proceedings ArticleDOI
Sherif Hashem1
07 Jun 1992
TL;DR: A method for computing the network output sensitivities with respect to variations in the inputs for multilayer feedforward artificial neural networks with differentiable activation functions is presented.
Abstract: A method for computing the network output sensitivities with respect to variations in the inputs for multilayer feedforward artificial neural networks with differentiable activation functions is presented. It is applied to obtain expressions for the first- and second-order sensitivities. An example is introduced along with a discussion to illustrate how the sensitivities are calculated and to show how they compare to the actual derivatives of the function being modeled by the neural network.
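The first-order sensitivities of a single-hidden-layer network follow from the chain rule applied layer by layer. The sketch below is a minimal illustration in generic notation (tanh hidden units and variable names chosen here, not the paper's), checked against a finite difference.

import numpy as np

def mlp_input_sensitivities(x, W1, b1, W2, b2):
    """First-order sensitivities dy/dx of y = W2 @ tanh(W1 @ x + b1) + b2.
    Returns the Jacobian with shape (n_outputs, n_inputs)."""
    h = np.tanh(W1 @ x + b1)            # hidden-layer activations
    dact = 1.0 - h ** 2                 # derivative of tanh at the hidden net inputs
    return W2 @ (dact[:, None] * W1)    # chain rule: W2 * diag(tanh') * W1

# Quick check against a central finite difference on one input component.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)
x, eps = rng.normal(size=3), 1e-6
J = mlp_input_sensitivities(x, W1, b1, W2, b2)
f = lambda z: W2 @ np.tanh(W1 @ z + b1) + b2
fd = (f(x + eps * np.eye(3)[0]) - f(x - eps * np.eye(3)[0])) / (2 * eps)
assert np.allclose(J[:, 0], fd, atol=1e-5)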

118 citations


Journal ArticleDOI
TL;DR: It is shown how to construct a perceptron with two hidden layers for multivariate function approximation, which can perform function approximation in the same manner as networks based on Gaussian potential functions, by linear combination of local functions.
Abstract: Mathematical theorems establish the existence of feedforward multilayered neural networks, based on neurons with sigmoidal transfer functions, that approximate arbitrarily well any continuous multivariate function. However, these theorems do not provide any hint on how to find the network parameters in practice. It is shown how to construct a perceptron with two hidden layers for multivariate function approximation. Such a network can perform function approximation in the same manner as networks based on Gaussian potential functions, by linear combination of local functions.

100 citations


Journal ArticleDOI
TL;DR: In this article, a stepwise regression algorithm based on orthogonalization and a series of statistical tests is employed for designing and training RBF networks; this yields non-linear models that are stable and linear in the model parameters.

94 citations


Journal ArticleDOI
TL;DR: Simulation results on the use of the second-order function and the bipolar sigmoid function for training multilayer feedforward networks using the backpropagation algorithm show that they have similar generalisation properties, while the second-order function has a slight advantage in convergence speed.
Abstract: A simple sigmoid-like second-order piecewise activation function suitable for direct digital hardware implementation is presented. Simulation results on the use of the second-order function and the bipolar sigmoid function for training multilayer feedforward networks using the backpropagation algorithm show that they have similar generalisation properties, while the second-order function has a slight advantage in convergence speed.
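One possible form of such a function, shown only as an illustration (the exact polynomial segments used in the paper may differ): a bipolar quadratic spline that rises with slope 2/L at the origin and saturates smoothly at +/-1, with L chosen as a power of two so that the division becomes a shift in fixed-point hardware.

import numpy as np

def second_order_sigmoid(x, L=4.0):
    """Sigmoid-like, second-order piecewise activation (illustrative form only).
    Inside [-L, L]: sign(x) * t * (2 - t) with t = |x|/L, a quadratic spline;
    outside [-L, L]: clamped to +/-1. Its derivative is piecewise linear."""
    x = np.asarray(x, dtype=float)
    t = np.minimum(np.abs(x) / L, 1.0)
    return np.sign(x) * t * (2.0 - t)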

87 citations


Proceedings Article
30 Nov 1992
TL;DR: This work compares activation functions in terms of the approximation power of their feedforward nets in the case of analog as well as boolean input.
Abstract: We compare activation functions in terms of the approximation power of their feedforward nets. We consider the case of analog as well as boolean input.

76 citations


Journal ArticleDOI
TL;DR: A modification of the generalized delta rule is described that is capable of training multilayer networks of value units, i.e. units defined by a particular non-monotonic activation function, the Gaussian, which suggests that value unit networks may be better suited for learning some pattern classification tasks and for answering general questions related to the organization of neurophysiological systems.
Abstract: A modification of the generalized delta rule is described that is capable of training multilayer networks of value units, i.e. units defined by a particular non-monotonic activation function, the Gaussian. For simple problems of pattern classification, this rule produces networks with several advantages over standard feedforward networks: they require fewer processing units and can be trained much more quickly. Though superficially similar, there are fundamental differences between the networks trained by this new learning rule and radial basis function networks. These differences suggest that value unit networks may be better suited for learning some pattern classification tasks and for answering general questions related to the organization of neurophysiological systems.
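For reference, a value unit applies a Gaussian to the net input rather than to a distance, which is what makes it non-monotonic. A minimal sketch with an assumed parameterization (the constants used in the paper may differ):

import numpy as np

def value_unit(x, w, mu):
    """Non-monotonic 'value unit': a Gaussian of the net input w.x (illustrative
    parameterization). The response peaks when the net input equals mu and
    falls off on both sides, unlike a monotonic sigmoid."""
    net = np.dot(w, x)
    return np.exp(-np.pi * (net - mu) ** 2)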

61 citations


Proceedings ArticleDOI
08 Mar 1992
TL;DR: The authors proposed an architecture of multilayer feedforward neural networks for classification problems of fuzzy vectors where the activation function is extended to a fuzzy input-output relation by the extension principle.
Abstract: The authors proposed an architecture of multilayer feedforward neural networks for classification problems of fuzzy vectors. A fuzzy input vector is mapped to a fuzzy number by the proposed neural network where the activation function is extended to a fuzzy input-output relation by the extension principle. A learning algorithm is derived from a cost function defined by a target output and the level set of a fuzzy output. The proposed classification method of fuzzy vectors is illustrated by a numerical example.

56 citations


Journal ArticleDOI
01 May 1992
TL;DR: A connectionist approach to the problem of PID autotuning is proposed, based on integral measures of the step response, which gives a major reduction in the number of iterations needed to achieve a local minimum.
Abstract: A connectionist method for autotuning PID controllers is proposed. This technique, which is applicable both in open and in closed loops, employs multilayer perceptrons to approximate the mappings between the identification measures of the plant and the optimal PID values. The neural network controller is designed to adapt to changing system structures and parameter values online. To achieve this objective, the network weighting coefficients are determined during an offline training phase. Simulation results are presented to illustrate the properties of the controller. One of the important aspects of neural networks is the convergence characteristic of this training phase. In the proposed approach, multilayer perceptrons are employed for nonlinear function approximation. As a consequence, the neurons have a linear activation function in their output layer. It is shown that a new learning criterion can be defined for this class of multilayer perceptrons, which is commonly found in control systems applications. Comparisons of the standard and the reformulated criteria, using different training algorithms, show that the new formulation achieves a significant reduction in the number of iterations needed to converge to a local minimum.

51 citations


Proceedings ArticleDOI
24 Jun 1992
TL;DR: In this article, it is shown that the methodology developed for adaptive control applications of radial basis function networks can also be used to produce stable, convergent, recursive identifiers in both continuous and discrete time; the latter is of particular interest as it can serve as a model of the general neural network functional learning process.
Abstract: The methodology developed for adaptive control applications of radial basis function networks can easily also be used to produce stable, convergent, recursive identifiers, in both continuous and discrete time. The latter is of particular interest as it can serve as a model of the general neural network functional learning process, and hence gives some direct insights into the factors influencing the success of these methods.

Journal ArticleDOI
01 May 1992
TL;DR: A number of different implementations for the first derivative of the sigmoid function are proposed based on overall speed performance (circuit speed and training time) and hardware requirements.
Abstract: This paper proposes a number of different implementations for the first derivative of the sigmoid function. The implementation of the sigmoid function employs a powers-of-two piecewise linear approximation. The best implementation scheme for the derivative is suggested based on overall speed performance (circuit speed and training time) and hardware requirements.
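As a rough illustration of the idea (segment boundaries and slopes assumed here, not taken from the paper): a piecewise-linear sigmoid whose slopes are powers of two can be computed with shifts and adds, and its first derivative is then a piecewise-constant function that also takes only power-of-two values.

import numpy as np

def pwl_sigmoid(x):
    """Piecewise-linear sigmoid approximation with power-of-two slopes
    (illustrative segment boundaries, not the paper's exact scheme)."""
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    y = np.where(a < 1.0, 0.5 + 0.25 * a,                           # slope 1/4 near the origin
                 np.where(a < 3.0, 0.75 + 0.125 * (a - 1.0), 1.0))  # slope 1/8, then flat
    return np.where(x < 0.0, 1.0 - y, y)

def pwl_sigmoid_derivative(x):
    """Matching first derivative: piecewise constant with values 1/4, 1/8, 0,
    so it can be generated by shifts alone in digital hardware."""
    a = np.abs(np.asarray(x, dtype=float))
    return np.where(a < 1.0, 0.25, np.where(a < 3.0, 0.125, 0.0))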

01 Apr 1992
TL;DR: For normalized inputs, multilayer perceptron networks are radial function networks (albeit with a non-standard radial function).
Abstract: Both multilayer perceptrons (MLP) and Generalized Radial Basis Functions (GRBF) have good approximation properties, theoretically and experimentally. Are they related? The main point of this paper is to show that for normalized inputs, multilayer perceptron networks are radial function networks (albeit with a non-standard radial function). This provides an interpretation of the weights as centers of the radial function network, and therefore as equivalent to templates. This insight may be useful for practical applications, including better initialization procedures for MLP. In the remainder of the paper, we discuss the relation between the radial functions that correspond to the sigmoid for normalized inputs and well-behaved radial basis functions, such as the Gaussian. In particular, we observe that the radial function associated with the sigmoid is an activation function that is a good approximation to Gaussian basis functions for a range of values of the bias parameter. The implication is that an MLP network can always simulate a Gaussian GRBF network (with the same number of units but fewer parameters); the converse is true only for certain values of the bias parameter. Numerical experiments indicate that this constraint is not always satisfied in practice by MLP networks trained with backpropagation. Multiscale RBF networks, on the other hand, can approximate MLP networks with a similar number of parameters.
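The key observation can be paraphrased in generic notation as follows (a sketch of the argument, not the paper's exact derivation). For a hidden unit σ(w·x + b) and inputs normalized to ||x|| = 1,

\[
w\cdot x=\tfrac{1}{2}\bigl(\lVert w\rVert^{2}+\lVert x\rVert^{2}-\lVert x-w\rVert^{2}\bigr)
=\tfrac{1}{2}\bigl(\lVert w\rVert^{2}+1\bigr)-\tfrac{1}{2}\lVert x-w\rVert^{2},
\]
so that
\[
\sigma(w\cdot x+b)=\sigma\!\Bigl(\tfrac{1}{2}\bigl(\lVert w\rVert^{2}+1\bigr)+b-\tfrac{1}{2}\lVert x-w\rVert^{2}\Bigr),
\]

which depends on x only through the distance ||x - w||: each hidden unit is a (non-standard) radial function centered at its weight vector, which is what allows the weights to be read as centers or templates.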

Journal ArticleDOI
TL;DR: It was shown that the partial differential coefficients of output strength with respect to input parameters were useful to analyze the relationship between inputs and outputs in the neural network and characteristics of each data in the set of data.
Abstract: The operation of the perceptron-type neural network can be regarded as a function which transforms an input vector to another (output) vector. We have presented the analytical formula for the partial derivative of this function with respect to the elements of the input vector. Using numerical data, we have examined the accuracy, the independence of the elements of the input vector, and the ability to recognize a function within mixed functions. It was shown that the partial differential coefficients of output strength with respect to input parameters were useful for analyzing the relationship between inputs and outputs in the neural network and the characteristics of each item in the data set.

Proceedings ArticleDOI
07 Jun 1992
TL;DR: The authors propose the multidendrite multiactivation product unit and the vectorial connection model for artificial neural networks and an optimal weight initialization algorithm is developed for a three-layer network with hidden units of 2D vectorial connections.
Abstract: The authors propose the multidendrite multiactivation product unit and the vectorial connection model for artificial neural networks. A generalized backpropagation learning rule is also developed for multilayer feedforward networks with a new neuron model and connections. Each hidden neuron is a multiactivation product unit which requires vectorial axon connections and a productive activation function. An optimal weight initialization algorithm is developed for a three-layer network with hidden units of 2D vectorial connections. The weights between the input layer and the hidden layer are derived from the feature selection methods used in pattern recognition. The activation function is the product of a 2D Hermite spline base function. The weights between the hidden layer and the third layer are scaled coefficients of the 2D Hermite spline interpolations. The performances of networks initialized by the new algorithm are compared with those obtained by selecting random initial weights.

Journal ArticleDOI
TL;DR: A design method for multilayer feedforward neural networks with simplified sigmoid activation functions and one-power-of-two weights is proposed; the network can retain a generalisation capability nearly identical to that of the corresponding network using continuous weights, while having increased computational speed in applications and reduced cost in digital hardware implementation.
Abstract: A design method for multilayer feedforward neural networks with simplified sigmoid activation functions and one-power-of-two weights is proposed. The designed multilayer feedforward neural network can retain a generalisation capability nearly identical to that of the corresponding network using continuous weights, while having increased computational speed in applications and reduced cost in digital hardware implementation.
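A simple way to see the hardware appeal (an illustrative quantizer, not the design method of the paper): once each weight is a signed power of two, every multiplication in the forward pass reduces to a bit shift.

import numpy as np

def quantize_to_power_of_two(w, min_exp=-4, max_exp=0):
    """Map each weight to the nearest signed power of two (on a log2 scale)
    within 2**min_exp .. 2**max_exp. Illustrative only; the paper's design
    procedure for choosing the weights is not reproduced here."""
    w = np.asarray(w, dtype=float)
    sign = np.where(w < 0.0, -1.0, 1.0)
    mag = np.clip(np.abs(w), 2.0 ** min_exp, 2.0 ** max_exp)
    exp = np.clip(np.round(np.log2(mag)), min_exp, max_exp)
    return np.where(w == 0.0, 0.0, sign * 2.0 ** exp)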

Journal ArticleDOI
TL;DR: This paper defines appropriate classes of feedforward neural networks with specified fan-in, accuracy of computation and depth, and, using techniques of communication complexity, shows that the classes fit into a well-studied hierarchy of Boolean circuits.

Journal ArticleDOI
TL;DR: A massively parallel implementation of a linear version of the neural technique on the Associative String Processor (ASP) machine and promising results are shown in terms of learning speed and quality of the reconstructed images.
Abstract: In this paper a neural autoassociative technique applied to image compression is presented. Particular attention is devoted to the preprocessing stage. The validity of some of the already established theoretical results is discussed and an experimental study of the mapping capabilities of the network based on a nonlinear parametrized activation function is presented. In order to test the image reconstruction capabilities of the neural technique, comparisons with more traditional image processing tools such as the Karhunen-Loeve Transform (KLT) are shown. A massively parallel implementation of a linear version of the neural technique on the Associative String Processor (ASP) machine is presented. Despite the linear structure of the ASP and the use of fixed arithmetic for the implementation, promising results are shown in terms of learning speed (of the order of 10^9 connections per second) and quality of the reconstructed images.

Proceedings ArticleDOI
07 Jun 1992
TL;DR: A second-order multilayer perceptron that uses a different activation function, the quadratic sigmoid function, is proposed, and a learning algorithm based on this new activation function is developed to approximate continuous-valued functions.
Abstract: A second-order multilayer perceptron that uses a different activation function, the quadratic sigmoid function, is proposed. Unlike the conventional sigmoid activation function, the quadratic sigmoid function exhibits second-order characteristics among the input components. Based on this new activation function, a learning algorithm is developed for the new multilayer perceptron. The proposed multilayer perceptron has been used to approximate continuous-valued functions. The approximation results show that the learning speed and the network size were significantly improved in comparison with conventional multilayer perceptrons which use the sigmoid activation function.
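One way to realize a unit with second-order characteristics among the input components, given here only as an illustrative reading of the term "quadratic sigmoid" (the paper's exact parameterization may differ), is to pass a quadratic form of the inputs through the usual sigmoid:

import numpy as np

def quadratic_sigmoid_unit(x, Q, w, b):
    """Hidden unit whose net input is a quadratic form of the inputs (captures
    pairwise cross-terms x_i * x_j) passed through a logistic sigmoid."""
    net = x @ Q @ x + w @ x + b
    return 1.0 / (1.0 + np.exp(-net))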

Patent
Kazuyuki Shiomi1, Sei Watanabe1
18 Feb 1992
TL;DR: In this patent, the characteristic data that determine the characteristics of the transfer functions (for example, sigmoid functions) of the neurons of the hidden layer and the output layer of a neural network are learned and corrected in a manner similar to the correction of weighting data and threshold values.
Abstract: The characteristic data that determine the characteristics of the transfer functions of the neurons of the hidden layer and the output layer of a neural network (for example, the gradients of sigmoid functions) are learned and corrected in a manner similar to the correction of weighting data and threshold values. Since at least one characteristic datum which determines the characteristics of the transfer function of each neuron is learned, the transfer function characteristics can differ between neurons in the network, independently of the problem and/or the number of neurons, and be optimal. Accordingly, learning with high precision can be performed in a short time.

Proceedings Article
30 Nov 1992
TL;DR: This paper gives a polynomial time algorithm that PAC learns these networks under the uniform distribution and suggests that, under reasonable distributions, µ-perceptron networks may be easier to learn than fully connected networks.
Abstract: Neural networks with binary weights are very important from both the theoretical and practical points of view. In this paper, we investigate the learnability of single binary perceptrons and unions of µ-binary-perceptron networks, i.e. an "OR" of binary perceptrons where each input unit is connected to one and only one perceptron. We give a polynomial time algorithm that PAC learns these networks under the uniform distribution. The algorithm is able to identify both the network connectivity and the weight values necessary to represent the target function. These results suggest that, under reasonable distributions, µ-perceptron networks may be easier to learn than fully connected networks.

PatentDOI
Bernhard E. Boser1
TL;DR: In this paper, the computationally complex nonlinear function in each neuron or computational element is replaced, after the network has been trained, by a similar but less complex nonlinear function; in one embodiment, a hyperbolic tangent function is replaced by a piecewise linear threshold logic function.
Abstract: Higher operational speed is obtained without sacrificing computational accuracy and reliability in a neural network by interchanging a computationally complex nonlinear function with a similar but less complex nonlinear function in each neuron or computational element after each neuron of the network has been trained by an appropriate training algorithm for the classifying problem addressed by the neural network. In one exemplary embodiment, a hyperbolic tangent function is replaced by a piecewise linear threshold logic function.
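The scheme can be mimicked in a few lines: train with the exact nonlinearity, then swap in a cheaper piecewise-linear stand-in for the forward pass. The segmentation below is an assumption chosen for illustration, not the patent's.

import numpy as np

def tanh_pwl(x, limit=1.0):
    """Piecewise-linear threshold-logic stand-in for tanh, used only after
    training (illustrative single-segment clamp; the patented segmentation
    may differ)."""
    return np.clip(np.asarray(x, dtype=float), -limit, limit)

def forward(x, W1, b1, W2, b2, act=np.tanh):
    """One-hidden-layer forward pass; 'act' is np.tanh during training and
    can be replaced by tanh_pwl at inference time."""
    return W2 @ act(W1 @ x + b1) + b2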

Proceedings ArticleDOI
24 Jun 1992
TL;DR: In this paper, Radial basis function networks are compared with sigmoidal activation function feedforward networks using data from a large industrial process and the contribution that RBF networks can make to the process modelling and control toolbox is examined.
Abstract: There are strong relationships between radial basis function (RBF) approaches and neural network representations. Indeed, the RBF representation can be implemented in the form of a two-layered network. This paper examines the contribution that RBF networks can make to the process modelling and control toolbox. Radial basis function networks are compared with sigmoidal activation function feedforward networks using data from a large industrial process.

Patent
Juergen Hollatz1, Volker Tresp1
11 Sep 1992
TL;DR: In this article, the neural network is pre-structured in a given network configuration using rule-based knowledge, and each initial value is defined by a normalised linear summation function in terms of weighted and unweighted base functions.
Abstract: The neural network is pre-structured in a given network configuration using rule-based knowledge. Each initial value is defined by a normalised linear summation function in terms of weighted and unweighted base functions. The latter are obtained from the mean values of localised positive functions of the neural network input values. Preferably, the base function is obtained via an exponential function of an inverse covariance matrix, multiplied by a normalisation factor for the base function.
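Read literally, the abstract describes a normalised sum of Gaussian basis functions; a minimal sketch under that reading (array shapes and names are assumptions, not the claimed method itself):

import numpy as np

def normalized_gaussian_network(x, centers, inv_covs, weights):
    """y(x) = sum_i w_i b_i(x) / sum_i b_i(x), where each base function
    b_i(x) = exp(-0.5 (x - c_i)^T S_i (x - c_i)) uses an inverse covariance S_i.
    Illustrative reading of the patent abstract."""
    diffs = x - centers                                    # (m, d): x broadcast against m centers
    quad = np.einsum('id,idk,ik->i', diffs, inv_covs, diffs)
    b = np.exp(-0.5 * quad)
    return np.dot(weights, b) / np.sum(b)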

Proceedings ArticleDOI
TL;DR: A variety of artificial neural networks are evaluated for their classification abilities under noisy inputs, including feedforward networks, localized basis function networks, and exemplar classifiers.
Abstract: A variety of artificial neural networks are evaluated for their classification abilities under noisy inputs. These networks include feedforward networks, localized basis function networks, and exemplar classifiers. The performance of radial basis function classifiers deteriorates rapidly in the presence of noise, but elliptical basis variants are able to adapt to extraneous input components quite robustly. For feedforward networks, selective pruning of weights based on an 'optimal brain damage' approach helps in noise-tolerant classification. Results from a radar classification problem are presented.

Proceedings ArticleDOI
01 Jan 1992
TL;DR: An analysis is presented which suggests how the use of this function can improve convergence and generalization and tests on simulated data provide evidence of improved generalization with the log likelihood cost function.
Abstract: The log likelihood cost function is discussed as an alternative to the least-squares criterion for training feedforward neural networks. An analysis is presented which suggests how the use of this function can improve convergence and generalization. Tests on simulated data using both training algorithms provide evidence of improved generalization with the log likelihood cost function.
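The usual form of the argument (a standard derivation consistent with, but not copied from, the paper): for a sigmoid output y = σ(net) and binary target t,

\[
E_{\mathrm{LS}}=\tfrac{1}{2}(t-y)^{2},\qquad
E_{\mathrm{LL}}=-\bigl[t\log y+(1-t)\log(1-y)\bigr],
\]
\[
\frac{\partial E_{\mathrm{LL}}}{\partial\,\mathrm{net}}=y-t,\qquad
\frac{\partial E_{\mathrm{LS}}}{\partial\,\mathrm{net}}=(y-t)\,y\,(1-y).
\]

The least-squares gradient carries the extra factor y(1 - y), which vanishes when the output unit saturates; the log likelihood cost removes that factor, which is the intuition behind the improved convergence.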

Proceedings Article
12 Jul 1992
TL;DR: The improved algorithm guarantees that a global minimum is found in linear time for tree-like subnetworks, is self-stabilizing for trees (cycle-free undirected graphs), and remains correct under various scheduling demons.
Abstract: Symmetric networks that are based on energy minimization, such as Boltzmann machines or Hopfield nets, are used extensively for optimization, constraint satisfaction, and approximation of NP-hard problems. Nevertheless, finding a global minimum for the energy function is not guaranteed, and even a local minimum may take an exponential number of steps. We propose an improvement to the standard activation function used for such networks. The improved algorithm guarantees that a global minimum is found in linear time for tree-like subnetworks. The algorithm is uniform and does not assume that the network is a tree. It performs no worse than the standard algorithms for any network topology. In the case where there are trees growing from a cyclic subnetwork, the new algorithm performs better than the standard algorithms by avoiding local minima along the trees and by optimizing the free energy of these trees in linear time. The algorithm is self-stabilizing for trees (cycle-free undirected graphs) and remains correct under various scheduling demons. However, no uniform protocol exists to optimize trees under a pure distributed demon and no such protocol exists for cyclic networks under central demon.

Proceedings ArticleDOI
07 Jun 1992
TL;DR: The loading problem for a four-node neural network with a node function set equal to AC₁⁰ plus the three-input equality function is shown to be NP-complete, and it is indicated how the required results can be derived in a similar fashion.
Abstract: It is shown that the loading problem for a six-node neural network with a node function set AC₁⁰ (i.e., the conjunction or disjunction of a subset of the inputs or their complements) is NP-complete. It can be deduced from this observation that the loading problem for a six-node analog neural network is NP-hard. The loading problem for a four-node neural network with a node function set equal to AC₁⁰ plus the three-input equality function is shown to be NP-complete, and it is indicated how the required results can be derived in a similar fashion. Three loading problems are studied.

Proceedings ArticleDOI
09 Aug 1992
TL;DR: Computer simulation shows that the authors' network structure and design approach are valid; a sufficient stability criterion, which can be realized by a redesign of the neuron function, is given.
Abstract: A general structure of the cellular neural network is introduced. An analytical method is presented to find the weight matrix for a given set of desired vectors. An energy function is then constructed. Using the energy function, a sufficient stability criterion which can be realized by a redesign of the neuron function is given. Computer simulation shows that the authors' network structure and design approach are valid.

01 Jan 1992
TL;DR: The mathematical framework for the development of Wave-Nets is presented, various aspects of their practical implementation are discussed, and the problem of predicting a chaotic time-series is solved as an illustrative example.
Abstract: A novel artificial neural network with one hidden layer of nodes, whose basis functions are drawn from a family of orthonormal wavelets, is developed in this paper. Wavelet Networks or Wave-Nets are based on firm theoretical foundations of functional analysis. The good localization characteristics of the basis functions, both in the input and frequency domains, allow hierarchical, multi-resolution learning of input-output maps from experimental data. Furthermore, Wave-Nets allow explicit estimation of global and local prediction error-bounds, and thus lend themselves to a rigorous and transparent design of the network. Computational complexity arguments prove that the training and adaptation efficiency of Wave-Nets is at least an order of magnitude better than other networks. This paper presents the mathematical framework for the development of Wave-Nets and discusses various aspects of their practical implementation. The problem of predicting a chaotic time-series is solved as an illustrative example.

Learning by artificial neural networks represents an expansion of the unknown nonlinear relationship between inputs, x, and outputs, F(x), into a space spanned by the activation functions of the network's nodes. Specifically, Poggio and Girosi (1989) have shown that learning by feedforward neural networks can be regarded as synthesizing an approximation of a multi-dimensional function over a space spanned by the activation functions \phi_l(x, k), l = 1, 2, ..., m, where k are adjustable parameters, i.e.

\[ F(x) = \sum_{l=1}^{m} c_l \,\phi_l(x, k) \qquad (1) \]

Using empirical data, the activation function parameters and the network parameters c_l, l = 1, 2, ..., m, are adjusted in such a way as to minimize the approximation error. The solution to this nonlinear problem is often ad hoc, requiring trial and error, giving artificial neural networks a "black box" character. Two types of activation functions are commonly used: global, as in Backpropagation Networks (BPN), and local, as in Radial Basis Function Networks (RBFN). Both networks are capable of approximating any continuous function with arbitrary accuracy, given enough nodes, but have different approximation properties. Adaptation and incremental learning with global approximators is slow due to the interaction of many nodes, and may not converge. They could also lead to large extrapolation errors without warning. These problems are overcome in neural networks with local activation functions.

Improved understanding of the relationship between neural networks, approximation theory and functional analysis has prompted several researchers to look for better ways to design neural networks. From the theory of functional analysis it is well known that functions can be represented as a weighted sum of orthogonal basis functions. Such expansions can be easily represented as neural nets which can be designed for the desired error rate using the properties of orthonormal expansions, thus decreasing the ad-hocness of neural net design. Unfortunately, most orthogonal functions are global approximators and suffer from the disadvantages mentioned above. In order to take full advantage of the orthonormality of basis functions and of localized learning, we need a set of basis functions which are local and orthogonal. It was believed until recently that it was not possible to build simple orthonormal bases with good localization properties. Such functions, belonging to the class of wavelets, have been developed and have found applications in several fields such as signal processing and quantum physics (Daubechies, 1988; Mallat, 1989). In this paper we propose the development of neural networks with activation functions derived from various classes of orthogonal wavelets. The resulting wavelet network, or Wave-Net, has all the advantages of true localized learning.

Furthermore, in most learning problems the training data are often non-uniformly distributed in the input space. An efficient way of solving such problems is by learning at multiple resolutions (Moody, 1989). A higher resolution of the input space may be used where data are dense, and a lower resolution where they are sparse. Wavelets, in addition to forming an orthogonal basis, are also capable of explicitly representing the behavior of a function at various resolutions of the input variables. Consequently, a Wave-Net is first trained to learn the mapping between inputs and outputs at the coarsest resolution of input values. Subsequently, it is trained to incorporate elements of the input-output mapping at higher resolutions of the input variables until the desired level of generalization has been reached. Such hierarchical, multi-resolution training has many attractive features for solving engineering problems, e.g. a meaningful interpretation of the resulting mapping and estimation of mapping errors both in local ...
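A minimal one-dimensional sketch of the idea (Haar wavelets, names, and a least-squares fit of the outer coefficients are all choices made here, not the authors' implementation): the hidden nodes are dilated and translated orthonormal wavelets, and resolution levels can be added one at a time.

import numpy as np

def haar(t):
    """Haar mother wavelet, supported on [0, 1)."""
    t = np.asarray(t, dtype=float)
    return np.where((0.0 <= t) & (t < 0.5), 1.0,
                    np.where((0.5 <= t) & (t < 1.0), -1.0, 0.0))

def wavenet_design_matrix(x, max_level):
    """Columns are the constant (scaling) term and the orthonormal Haar wavelets
    psi_{j,k}(x) = 2**(j/2) * haar(2**j * x - k), j = 0..max_level, k = 0..2**j - 1."""
    cols = [np.ones_like(x)]
    for j in range(max_level + 1):
        for k in range(2 ** j):
            cols.append(2.0 ** (j / 2) * haar(2.0 ** j * x - k))
    return np.column_stack(cols)

# Fit the outer-layer coefficients at a chosen resolution by least squares.
x = np.linspace(0.0, 1.0, 200, endpoint=False)
y = np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)
A = wavenet_design_matrix(x, max_level=4)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coeffs        # multi-resolution reconstruction of the target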