
Showing papers on "Activation function published in 1990"


Journal ArticleDOI
TL;DR: A probabilistic neural network is formed that can compute nonlinear decision boundaries approaching the Bayes optimal, and a four-layer network of the type proposed can map any input pattern to any number of classifications.

3,772 citations


Journal ArticleDOI
TL;DR: A shoulder strap retainer having a base to be positioned on the exterior shoulder portion of a garment with securing means attached to the undersurface of the base for removably securing the base to the exterior shoulders portion of the garment.

1,709 citations


Journal ArticleDOI
TL;DR: The multilayer perceptron, when trained as a classifier using backpropagation, is shown to approximate the Bayes optimal discriminant function.
Abstract: The multilayer perceptron, when trained as a classifier using backpropagation, is shown to approximate the Bayes optimal discriminant function. The result is demonstrated for both the two-class problem and multiple classes. It is shown that the outputs of the multilayer perceptron approximate the a posteriori probability functions of the classes being trained. The proof applies to any number of layers and any type of unit activation function, linear or nonlinear.

866 citations
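
The result above, that squared-error training of a classifier drives the outputs toward the class posterior probabilities, can be checked numerically. Below is a minimal NumPy sketch on a synthetic two-class Gaussian problem; the 1-8-1 architecture, learning rate, and data are illustrative choices, not taken from the paper.

```python
# Minimal sketch (architecture and data are illustrative, not from the paper):
# a 1-8-1 perceptron trained with squared error on 0/1 class labels; its output
# should approach the Bayes posterior P(class = 1 | x) for two Gaussian classes.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, n)                       # two equally likely classes
x = rng.normal(loc=2.0 * y - 1.0, scale=1.0)    # class 0 ~ N(-1,1), class 1 ~ N(+1,1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 1, (8, 1)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0, 1, (1, 8)); b2 = np.zeros(1)   # output layer
lr = 0.1
for epoch in range(50):
    for i in rng.permutation(n):
        h = sigmoid(W1[:, 0] * x[i] + b1)          # hidden activations
        o = sigmoid(W2 @ h + b2)                   # network output
        d_o = (o - y[i]) * o * (1 - o)             # output delta (squared error)
        d_h = (W2.T @ d_o).ravel() * h * (1 - h)   # hidden deltas
        W2 -= lr * np.outer(d_o, h);       b2 -= lr * d_o
        W1 -= lr * np.outer(d_h, [x[i]]);  b1 -= lr * d_h

# Compare with the true posterior P(y = 1 | x), which is sigmoid(2x) for
# these two unit-variance Gaussians with equal priors.
xs = np.linspace(-3, 3, 7)
posterior = 1.0 / (1.0 + np.exp(-2.0 * xs))
net_out = np.array([sigmoid(W2 @ sigmoid(W1[:, 0] * v + b1) + b2)[0] for v in xs])
print(np.c_[xs, posterior, net_out])               # last two columns roughly agree
```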


Proceedings ArticleDOI
17 Jun 1990
TL;DR: It is shown that feedforward networks having bounded weights are not undesirably restricted, but are in fact universal approximators, provided that the hidden-layer activation function belongs to one of several suitably broad classes of functions: polygonal functions, certain piecewise polynomial functions, or a class of functions analytic on some open interval.
Abstract: It is shown that feedforward networks having bounded weights are not undesirably restricted, but are in fact universal approximators, provided that the hidden-layer activation function belongs to one of several suitably broad classes of functions: polygonal functions, certain piecewise polynomial functions, or a class of functions analytic on some open interval. These results are obtained by trading bounds on network weights for possible increments to network complexity, as indexed by the number of hidden nodes. The hidden-layer activation functions used include functions not admitted by previous universal approximation results, so the present results also extend the already broad class of activation functions for which universal approximation results are available. A theorem is also given which establishes the ability of these approximators of arbitrary mappings to learn when examples are generated by a stationary ergodic process.

125 citations
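
The trade-off described above, bounded weights compensated by more hidden nodes, can be illustrated with a toy experiment. This is not the paper's construction: only the hidden-layer weights and biases are constrained here, and the linear output layer is simply fitted by least squares.

```python
# Illustration only (not the paper's construction): with the hidden-layer
# weights and biases confined to [-1, 1], enlarging the hidden layer still
# shrinks the error of a one-hidden-layer tanh approximation of sin(x).
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-np.pi, np.pi, 400)
target = np.sin(x)
BOUND = 1.0                                       # weight bound for the hidden layer

for n_hidden in (2, 8, 32, 128):
    w = rng.uniform(-BOUND, BOUND, n_hidden)      # bounded input weights
    b = rng.uniform(-BOUND, BOUND, n_hidden)      # bounded biases
    H = np.tanh(np.outer(x, w) + b)               # hidden-layer outputs
    coef, *_ = np.linalg.lstsq(H, target, rcond=None)   # fit linear output layer
    err = np.max(np.abs(H @ coef - target))
    print(f"{n_hidden:4d} hidden nodes -> max |error| = {err:.4f}")
```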


Journal ArticleDOI
01 Aug 1990
TL;DR: This paper demystifies the multi-layer perceptron network by showing that it simply divides the input space into regions bounded by hyperplanes, and uses this insight to construct minimal training sets.
Abstract: In this paper we investigate multi-layer perceptron networks in the task domain of Boolean functions. We demystify the multi-layer perceptron network by showing that it simply divides the input space into regions bounded by hyperplanes. We use this information to construct minimal training sets. Despite using minimal training sets, the learning time of multi-layer perceptron networks with backpropagation scales exponentially for complex Boolean functions. But modular neural networks which consist of independently trained subnetworks scale very well. We conjecture that the next generation of neural networks will be genetic neural networks which evolve their structure. We confirm Minsky and Papert: “The future of neural networks is tied not to the search for some single, universal scheme to solve all problems at once, but to the evolution of a many-faceted technology of network design.”

58 citations
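
The hyperplane picture is easy to make concrete. The sketch below hand-wires a 2-2-1 perceptron for XOR, the classic Boolean example; the weights are my own illustrative choice, with each hidden unit implementing one hyperplane and the output unit selecting the region between them.

```python
# Sketch (illustrative weights, not from the paper): a 2-2-1 perceptron whose
# two hidden units are the hyperplanes x1 + x2 = 0.5 and x1 + x2 = 1.5; the
# region between them is exactly where XOR is true.
import numpy as np

def step(z):
    return (z > 0).astype(float)

W1 = np.array([[1.0, 1.0],      # hidden unit 1: fires when x1 + x2 > 0.5
               [1.0, 1.0]])     # hidden unit 2: fires when x1 + x2 > 1.5
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -1.0])      # output: h1 AND NOT h2, i.e. the middle region
b2 = -0.5

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = step(W1 @ np.array(x, dtype=float) + b1)
    y = step(W2 @ h + b2)
    print(x, "->", int(y))      # prints the XOR truth table: 0, 1, 1, 0
```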


Patent
09 Oct 1990
TL;DR: In this paper, a plurality of neural networks are coupled to an output neural network, or judge network, to form a clustered neural network and the judge network combines the outputs of the plurality of individual neural networks to provide the output from the entire clustered network.
Abstract: A plurality of neural networks are coupled to an output neural network, or judge network, to form a clustered neural network. Each of the plurality of networks comprises a back-propagation neural network trained with a supervised learning rule. Each of the clustered neural networks is trained to perform substantially the same mapping function before they are clustered. Following training, the clustered neural network computes its output by taking an "average" of the outputs of the individual neural networks that make up the cluster: the judge network combines the outputs of the plurality of individual neural networks to provide the output of the entire clustered network. In addition, the output of the judge network may be fed back to each of the individual neural networks and used as a training input thereto, in order to provide for continuous training. The use of the clustered network increases the speed of learning and results in better generalization. In addition, clustering multiple back-propagation networks provides increased performance and fault tolerance when compared to a single unclustered network having substantially the same computational complexity. The present invention may be used in applications that are amenable to neural network solutions, including control and image-processing applications. Clustering also permits the use of smaller networks, and the synergy among the clustered back-propagation networks improves the properties of the clustered network over a comparably complex non-clustered network.

56 citations
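
A toy version of the clustering idea is sketched below: several back-propagation networks are trained independently on the same mapping and a "judge" combines their outputs, here by plain averaging rather than the patent's trainable judge network or its feedback scheme. All data, sizes, and hyperparameters are placeholders.

```python
# Minimal sketch of the clustering idea (my own toy code, not the patent's):
# several independently trained networks perform the same mapping, and a
# "judge" combines their outputs, here simply by averaging them.
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 200)[:, None]
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.shape)   # noisy target

def train_member(seed, hidden=10, lr=0.05, epochs=2000):
    """One back-propagation network of the cluster (tanh hidden, linear output)."""
    r = np.random.default_rng(seed)
    W1, b1 = r.normal(0, 1, (1, hidden)), np.zeros(hidden)
    W2, b2 = r.normal(0, 1, (hidden, 1)), 0.0
    for _ in range(epochs):
        h = np.tanh(x @ W1 + b1)
        e = (h @ W2 + b2) - y
        W2 -= lr * h.T @ e / len(x);  b2 -= lr * e.mean()
        dh = (e @ W2.T) * (1 - h ** 2)
        W1 -= lr * x.T @ dh / len(x); b1 -= lr * dh.mean(axis=0)
    return lambda q: np.tanh(q @ W1 + b1) @ W2 + b2

members = [train_member(s) for s in (10, 11, 12)]            # independently trained
judge = lambda q: np.mean([m(q) for m in members], axis=0)   # averaging "judge"

clean = np.sin(2 * np.pi * x)
for i, m in enumerate(members):
    print("member", i, "MSE:", float(np.mean((m(x) - clean) ** 2)))
# The averaged output is never worse than the members' average error.
print("cluster MSE:", float(np.mean((judge(x) - clean) ** 2)))
```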


Proceedings ArticleDOI
17 Jun 1990
TL;DR: A complex-valued generalization of neural networks is presented and an activation function with more desirable characteristics in the complex plane is proposed, including the possibility of self oscillation.
Abstract: A complex-valued generalization of neural networks is presented. The dynamics of complex neural networks have parallels in discrete complex dynamics which give rise to the Mandelbrot set and other fractals. The continuation to the complex plane of common activation functions and the resulting neural dynamics are discussed. An activation function with more desirable characteristics in the complex plane is proposed. The dynamics of this activation function include the possibility of self-oscillation. Possible applications in signal processing and neurobiological modeling are discussed.

44 citations
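
For concreteness, the sketch below continues the ordinary logistic sigmoid to complex arguments and shows the poles on the imaginary axis that make it awkward as a complex activation; the better-behaved activation actually proposed in the paper is not reproduced here.

```python
# Sketch (not the paper's proposed activation): the logistic sigmoid continued
# to the complex plane, 1 / (1 + exp(-z)).  It has poles where exp(-z) = -1,
# i.e. at z = i*pi*(2k+1), which is one reason a better-behaved complex
# activation is desirable.
import numpy as np

def complex_sigmoid(z):
    """Analytic continuation of the logistic sigmoid to complex arguments."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([0.5 + 0.5j, 2.0 - 1.0j, 0.0 + 3.0j])
print(complex_sigmoid(z))

# Near a pole the magnitude blows up:
near_pole = 1e-3 + 1j * np.pi
print(abs(complex_sigmoid(near_pole)))   # very large (~1000)
```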


Proceedings ArticleDOI
17 Jun 1990
TL;DR: A representation theorem is developed for backpropagation neural networks that states that each term in the power series for F(x) is realizable using a building block, and each building block has one hidden layer.
Abstract: A representation theorem is developed for backpropagation neural networks. First, it is assumed that the function to be approximated, F(x) for the vector x, is continuous and has finite support, so that it can be approximated arbitrarily well by a multidimensional power series. The activation function, sigmoid or otherwise, is then approximated by a power-series function of the net. Basic building-block subnetworks, realizing the monomial or product of the inputs, are implemented with any desired degree of accuracy. Each term in the power series for F(x) is realizable using a building block, and each building block has one hidden layer. Hence, the overall network has one hidden layer.

30 citations
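
One way such a one-hidden-layer building block can work is sketched below; this is a reconstruction of the general idea, not the paper's exact construction. Assuming the activation has a power series about some bias point with a nonzero quadratic coefficient, a second difference isolates a square, and products follow by polarization.

```latex
% Reconstruction of a one-hidden-layer "squaring" building block (assumption:
% the activation \sigma is analytic about a bias point b with a_2 \neq 0).
\[
  \sigma(b + t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + \cdots , \qquad a_2 \neq 0 .
\]
\[
  \frac{\sigma(b + \lambda u) - 2\,\sigma(b) + \sigma(b - \lambda u)}{2 a_2 \lambda^2}
  \;=\; u^2 + O(\lambda^2) \qquad (\lambda \to 0),
\]
% so three hidden units (one of them with zero input weights) and a linear
% output realize u^2 to any accuracy; products, and hence each monomial in the
% power series for F(x), then follow from the polarization identity
\[
  x y \;=\; \tfrac{1}{4}\bigl[(x + y)^2 - (x - y)^2\bigr].
\]
```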


Journal ArticleDOI
TL;DR: It is shown here that the transcriptional activation function of LEU3 resides within the C-terminal 32 amino acids, and that an alpha-isopropylmalate-induced conformational change in the central region releases and thus activates the activation domain.

28 citations


Proceedings ArticleDOI
17 Jun 1990
TL;DR: A learning procedure based on back-propagation is presented for obtaining a neural network with discrete weights, under the assumption that the neuron activation function is computed through a lookup table (LUT) and that a LUT can be shared among many neurons.
Abstract: The feasibility of restricting the weight values in multilayer perceptrons to powers of two or sums of powers of two is studied. Multipliers could thus be replaced by shifters and adders in digital hardware, saving both time and chip area, under the assumption that the neuron activation function is computed through a lookup table (LUT) and that a LUT can be shared among many neurons. A learning procedure based on back-propagation for obtaining a neural network with such discrete weights is presented. This learning procedure requires full real arithmetic and therefore must be performed offline. It starts from a multilayer perceptron with continuous weights learned using back-propagation. Then a weight normalization is made to ensure that the whole shifting dynamics is used and to maximize the match between the continuous and discrete weights of neurons sharing the same LUT. Finally, a discrete version of the BP algorithm with automatic learning-rate control is applied up to convergence. Test runs on a simple pattern recognition problem show the feasibility of the approach.

27 citations
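
Only the weight-discretization step is sketched here; the weight normalization and the discrete back-propagation phase described in the abstract are not reproduced, and the greedy rounding below is an assumption of mine rather than the paper's exact procedure.

```python
# Sketch of the discretization idea (details are my assumptions, not the
# paper's exact procedure): round each weight to a sum of at most two signed
# powers of two, so a multiply becomes one or two shifts plus an add.
import numpy as np

def nearest_power_of_two(w):
    """Round w to a signed power of two (nearest exponent in log2 scale); 0 stays 0."""
    if w == 0.0:
        return 0.0
    e = np.round(np.log2(abs(w)))
    return np.sign(w) * 2.0 ** e

def quantize_sum_of_two_powers(w):
    """Greedy rounding of w to p1 + p2 with p1, p2 signed powers of two."""
    p1 = nearest_power_of_two(w)
    p2 = nearest_power_of_two(w - p1)
    return p1 + p2

weights = np.array([0.8, -0.3, 0.55, -1.7, 0.02])
quantized = np.array([quantize_sum_of_two_powers(w) for w in weights])
print(np.c_[weights, quantized])
# e.g. 0.55 -> 0.5 + 0.0625 = 0.5625: a right-shift by 1 plus a right-shift by 4.
```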


Book ChapterDOI
01 Jan 1990
TL;DR: The parameter dependence of the resulting diffusion tensor suggests how, for perfectly trainable networks, parameters can be made to converge to globally optimal values corresponding to an error-free implementation of the desired input-output relations.
Abstract: Stochastic pattern presentation induces fluctuations in the weights of backpropagation networks, which enable the system to escape from local minima in parameter space. For small learning rates we find that learning is governed by a Fokker-Planck equation. The parameter dependence of the resulting diffusion tensor suggests how, for perfectly trainable networks, parameters can be made to converge to globally optimal values corresponding to an error-free implementation of the desired input-output relations. For cases where perfect learning is impossible we demonstrate the usefulness of a simulated-annealing-like procedure to reach the minimal-error state. We also propose a new activation function which can drastically improve learning, as is demonstrated for the parity problem.
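
The simulated-annealing-like idea can be illustrated generically: add noise of slowly decaying variance to the gradient updates so the parameters can hop out of a poor minimum early and settle later. The schedule, the toy one-dimensional error surface, and all constants below are my own; they are not the chapter's Fokker-Planck analysis or its proposed activation function.

```python
# Generic illustration (not the chapter's actual schedule or activation):
# gradient descent plus noise whose "temperature" decays over time, in the
# spirit of simulated annealing.
import numpy as np

rng = np.random.default_rng(3)

def anneal_step(weights, gradient, step, lr=0.02, t0=2.0, decay=5e-4):
    """One annealed update: a gradient step plus noise of decaying variance."""
    temperature = t0 / (1.0 + decay * step)
    noise = rng.normal(0.0, np.sqrt(lr * temperature), size=weights.shape)
    return weights - lr * gradient + noise

# Toy usage on a 1-D double-well "error surface" E(w) = (w**2 - 1)**2 + 0.3*w:
# the shallower minimum is near w = +1, the deeper one near w = -1.
w = np.array([1.0])                      # start in the worse minimum
for step in range(20000):
    grad = 4 * w * (w ** 2 - 1) + 0.3    # dE/dw
    w = anneal_step(w, grad, step)
print(w)                                  # usually settles near the deeper minimum, w ~ -1
```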


Journal Article
TL;DR: This model represents the first computer-based network simulation using actual experimental neural data obtained from a large number of spontaneously active cells in a small intact ganglion, and indicates that the networks may synthesize patterns of activity needed for biological function.
Abstract: Techniques are described that allow the use of multiple-neuron spike data in a computational neural network architecture. The network architecture was devised to match the number of actual neurons from which data were obtained. The network was successfully trained to accurately predict the multiple neuron spike trains. Simultaneous spike histories of 44 neurons were modeled by a network architecture consisting of 44 input units, 88 hidden units with recurrent connections, and 44 output units. The activation function of each unit was determined by data unique to a single neuron. These data were coupled with an analog gradient that preserved both the exact spiking times and the relative spiking tendency of each neuron. The input activation values were compared to network output target values calculated to occur 5 ms forward in the composite spiking records of all neurons. Following 2000 training cycles with the gradient data, the average error of each unit in the network was 0.0016. Discrete output values for each network unit were correlated with those of all other units. These correlations were comparable to those computed using the actual neuron data. Both correlations reveal a functional connectivity pattern among the units and neurons. These connectivity patterns indicate that the networks may synthesize patterns of activity needed for biological function; in this case, flight patterns carried out in the mesothoracic ganglion of the dragonfly. This model represents, to the best of our knowledge, the first computer-based network simulation using actual experimental neural data obtained from a large number of spontaneously active cells in a small intact ganglion.
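
A structural sketch of the 44-88-44 recurrent architecture described above follows, run on random placeholder spike data; the per-neuron activation functions fitted from the recordings, the analog gradient coding, and the training procedure are not reproduced.

```python
# Structural sketch only: the 44-88-44 recurrent architecture from the abstract,
# forward pass only, on fake spike data.  The per-neuron activation functions
# and the actual dragonfly recordings are not reproduced here.
import numpy as np

rng = np.random.default_rng(4)
N_IN, N_HID, N_OUT = 44, 88, 44
STEP_MS = 5                                                 # predict activity 5 ms ahead

spikes = (rng.random((1000, N_IN)) < 0.05).astype(float)    # placeholder spike raster (1 ms bins)

W_in  = rng.normal(0, 0.1, (N_IN, N_HID))
W_rec = rng.normal(0, 0.1, (N_HID, N_HID))                  # recurrent hidden connections
W_out = rng.normal(0, 0.1, (N_HID, N_OUT))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h = np.zeros(N_HID)
predictions = []
for t in range(len(spikes) - STEP_MS):
    h = np.tanh(spikes[t] @ W_in + h @ W_rec)               # recurrent hidden state
    predictions.append(sigmoid(h @ W_out))                  # predicted activity at t + 5 ms
predictions = np.array(predictions)
targets = spikes[STEP_MS:]                                  # the activity 5 ms forward
print(predictions.shape, targets.shape)                     # (995, 44) (995, 44)
```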

Book ChapterDOI
01 Jan 1990
TL;DR: No general principle or guideline is available for such a synthesis task in multilayer perceptron classification; existing methods generally lead to structures that deal only with a particular classification problem.
Abstract: Designing a multilayer perceptron for general-purpose classification has important practical implications. Since the capacity of a multilayer perceptron to realize arbitrary dichotomies (or two-class classifications) is limited, the most important step in a design procedure is the determination of the number of layers and the number of nodes in each layer, apart from the determination of the weights and the threshold values. Unfortunately, there has been no general principle or guideline available for such a synthesis task; design normally proceeds on an ad hoc and empirical basis, and the methods generally lead to structures that deal only with a particular classification problem [1][2].

Proceedings ArticleDOI
17 Jun 1990
TL;DR: It is shown that k hidden units with an asymptotic activation function are able to transfer any given k+1 different inputs to linearly independent GHUVs (generated hidden unit vectors) by properly setting weights and thresholds, leading to a scheme for understanding associative memory in three-layer networks.
Abstract: It is shown that k hidden units with an asymptotic activation function are able to transfer any given k+1 different inputs to linearly independent GHUVs (generated hidden unit vectors) by properly setting weights and thresholds. The number of hidden units with the LIT (linearly independent transformation) capability for a polynomial activation function is limited by the order of the polynomial. For analytic asymptotic activation functions and given different inputs, the LIT is a generic capability and a probability-1 capability when weights and thresholds are set randomly. It is a generic and probability-1 property for any random input if the weight and threshold setting has the LIT capability for some k+1 inputs. A three-layer net with k hidden units, in which the activation function is asymptotic and the output layer has no activation function, is sufficient to record k+1 arbitrary real samples. The probability of recording k+2 random real samples is 0 if the activation is a unit step function; the same holds for the sigmoid function in the case of associative memory. These conclusions lead to a scheme for understanding associative memory in three-layer networks.
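
The linearly-independent-transformation property is easy to verify numerically. The sketch below uses sigmoid hidden units and random weights (my choices, consistent with but not taken from the paper): k+1 distinct inputs yield, with probability 1, a full-rank matrix of hidden-unit vectors augmented by a constant, so a linear output layer stores k+1 arbitrary real samples exactly.

```python
# Numerical illustration (my own, using sigmoid hidden units): with k hidden
# units and random weights, k+1 distinct inputs generate hidden-unit vectors
# that, augmented with a constant, are linearly independent, so a linear
# output layer can store k+1 arbitrary real samples exactly.
import numpy as np

rng = np.random.default_rng(5)
k = 6                                         # number of hidden units
inputs = rng.normal(size=(k + 1, 3))          # k+1 distinct 3-dimensional inputs
targets = rng.normal(size=k + 1)              # arbitrary real samples to store

W = rng.normal(size=(3, k))                   # random input-to-hidden weights
b = rng.normal(size=k)                        # random hidden thresholds
H = 1.0 / (1.0 + np.exp(-(inputs @ W + b)))   # generated hidden unit vectors
G = np.hstack([H, np.ones((k + 1, 1))])       # augment with the constant

print("rank of GHUV matrix:", np.linalg.matrix_rank(G))       # k + 1 (full rank)
v = np.linalg.solve(G, targets)               # output weights plus output threshold
print("max recall error:", np.max(np.abs(G @ v - targets)))   # ~ 0 (exact recall)
```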

Book ChapterDOI
Masahiko Arai
01 Jan 1990
TL;DR: Conditions on the activation function of hidden units are considered for the purpose of utilizing backpropagation for three-layer-net learning, and conditions under which the vectors made from the hidden states and a constant become linearly independent are discussed.
Abstract: This paper considers conditions on the activation function of hidden units for the purpose of utilizing backpropagation for three-layer-net learning. A necessary condition for the convergence of backpropagation procedures to a global minimum of a cost function is that the set of states of the hidden layer is linearly separable. A sufficient condition for this separability is that the vectors made from the states and a constant are linearly independent. This paper discusses the conditions under which the vectors become linearly independent.

01 Jan 1990
TL;DR: The main conclusion of the dissertation is that the back propagation algorithm can be made more robust by not only making the weights adaptive, but by making the slopes of the nonlinearities adaptive as well.
Abstract: This dissertation is an investigation of the effect of the slope of an activation function (the node nonlinearity) on the performance of the back propagation algorithm in training a multilayer perceptron. When the slope of the activation function is too steep, the input of the nonlinearity often falls in the saturation region. When this occurs, the derivative of the nonlinearity becomes very small, resulting in a very small update of the weights. The main conclusion of the dissertation is that the back propagation algorithm can be made more robust by making not only the weights adaptive but the slopes of the nonlinearities as well. A piecewise-linear activation function is also investigated as a computationally more efficient approximation to the commonly used sigmoid function.
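
A sketch of the adaptive-slope idea follows; the parameterization and the comparison are my own formulation, not the dissertation's exact update rules. The slope a of the sigmoid gets its own partial derivative, which can be back-propagated alongside the weight gradients.

```python
# Sketch of the adaptive-slope idea (my formulation, not the dissertation's
# exact rules): a sigmoid with trainable slope a,
#   f(net) = 1 / (1 + exp(-a * net)),
# whose derivative with respect to a is available to back-propagation in
# addition to the usual derivative with respect to net.
import numpy as np

def sigmoid_slope(net, a):
    return 1.0 / (1.0 + np.exp(-a * net))

def grads(net, a):
    """Partial derivatives of the output with respect to net and to the slope a."""
    y = sigmoid_slope(net, a)
    d_net = a * y * (1.0 - y)       # df/dnet
    d_a = net * y * (1.0 - y)       # df/da
    return d_net, d_a

# With a steep slope even a moderate |net| saturates the unit and its derivatives
# shrink, stalling the weight updates; adapting a toward smaller values keeps the
# unit out of the saturation region.
for a in (8.0, 1.0, 0.25):
    d_net, d_a = grads(net=3.0, a=a)
    print(f"a = {a:4.2f}:  df/dnet = {d_net:.5f},  df/da = {d_a:.5f}")
```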

Proceedings ArticleDOI
17 Jun 1990
TL;DR: A general unsupervised learning scheme based on a competitive cost function is presented; the network is able to cluster the data and performs well compared with the dynamic clusters technique, though it fails to find an optimal partition of the data for some problems.
Abstract: A general unsupervised learning scheme based on a competitive cost function is presented. A gradient technique is used to minimize the cost function. The algorithm is then applied to clustering problems by using a particular unit activation function. A quadratic potential function is used, which permits clustering the data with ellipsoids. Comparisons are made with the dynamic clusters method on artificial data and on R.A. Fisher's (1936) iris data set. Results show that the network is able to cluster the data and performs well compared with the dynamic clusters technique, though it fails to find an optimal partition of the data for some problems. Moreover, it automatically finds the number of clusters, unlike most clustering techniques.
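
A generic sketch of competitive, gradient-style clustering with a quadratic (ellipsoidal) potential follows; the cost function and updates are simplified stand-ins of mine (including a heuristic variance update and a fixed number of units), not the paper's scheme.

```python
# Generic sketch of competitive clustering with a quadratic (ellipsoidal)
# potential.  The paper's exact cost function, updates, and automatic choice of
# the number of clusters are not reproduced; the variance adaptation below is a
# simple exponential-average heuristic of mine.
import numpy as np

rng = np.random.default_rng(6)
# Two elongated 2-D Gaussian blobs as toy data.
data = np.vstack([
    rng.normal([0, 0], [2.0, 0.3], size=(200, 2)),
    rng.normal([6, 4], [0.3, 2.0], size=(200, 2)),
])

K, lr, decay = 2, 0.05, 0.05
centers = data[rng.choice(len(data), K, replace=False)].copy()
variances = np.ones((K, 2))                    # per-cluster, per-axis (ellipsoid shape)

for epoch in range(30):
    for x in data[rng.permutation(len(data))]:
        d = np.sum((x - centers) ** 2 / variances, axis=1)   # quadratic potential
        w = np.argmin(d)                                     # competitive winner
        centers[w] += lr * (x - centers[w])                  # move winner toward x
        if epoch >= 10:      # adapt the ellipsoid axes once the centers have settled
            variances[w] += decay * ((x - centers[w]) ** 2 - variances[w])
            variances[w] = np.maximum(variances[w], 1e-3)

print("centers:\n", centers)
print("axis variances:\n", variances)          # elongated along each blob's long axis
```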