
Showing papers in "IEEE Transactions on Neural Networks in 1995"


Journal Article•DOI•
TL;DR: A theoretical justification for the random vector version of the functional-link (RVFL) net is presented, based on a general approach to adaptive function approximation; a main result is that the RVFL is a universal approximator for continuous functions on bounded finite-dimensional sets.
Abstract: A theoretical justification for the random vector version of the functional-link (RVFL) net is presented in this paper, based on a general approach to adaptive function approximation. The approach consists of formulating a limit-integral representation of the function to be approximated and subsequently evaluating that integral with the Monte Carlo method. Two main results are: (1) the RVFL is a universal approximator for continuous functions on bounded finite-dimensional sets, and (2) the RVFL is an efficient universal approximator, with the rate of approximation error convergence to zero of order O(C/√n), where n is the number of basis functions and C is independent of n. Similar results are also obtained for neural nets with hidden nodes implemented as products of univariate functions or radial basis functions. Some possible ways of enhancing the accuracy of multivariate function approximations are discussed.

794 citations
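
To make the RVFL idea above concrete, here is a minimal sketch (not the authors' code): the hidden-layer weights and biases are drawn at random and left untrained, and only the linear output weights are fitted, here by regularized least squares. The activation, weight ranges, ridge term, and test function are illustrative assumptions.

```python
# Minimal RVFL sketch: random, untrained hidden layer + least-squares output weights.
# Illustrative only; architecture details (activation, ranges, ridge term) are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def fit_rvfl(X, y, n_hidden=200, reg=1e-6):
    """Fix random input weights/biases, solve the output weights by ridge regression."""
    d = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(d, n_hidden))   # random, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)                            # random basis functions
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_rvfl(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy check on a smooth 2D target
X = rng.uniform(-1, 1, size=(500, 2))
y = np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])
W, b, beta = fit_rvfl(X, y)
print("train RMSE:", np.sqrt(np.mean((predict_rvfl(X, W, b, beta) - y) ** 2)))
```

Many functional-link formulations also include direct input-to-output connections; they are omitted here to keep the sketch short.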


Journal Article•DOI•
TL;DR: This paper studies the approximation and learning properties of one class of recurrent networks, known as high-order neural networks, and applies these architectures to the identification of dynamical systems.
Abstract: Several continuous-time and discrete-time recurrent neural network models have been developed and applied to various engineering problems. One of the difficulties encountered in the application of recurrent networks is the derivation of efficient learning algorithms that also guarantee the stability of the overall system. This paper studies the approximation and learning properties of one class of recurrent networks, known as high-order neural networks, and applies these architectures to the identification of dynamical systems. In recurrent high-order neural networks, the dynamic components are distributed throughout the network in the form of dynamic neurons. It is shown that if enough high-order connections are allowed then this network is capable of approximating arbitrary dynamical systems. Identification schemes based on high-order network architectures are designed and analyzed.

761 citations


Journal Article•DOI•
TL;DR: Convergence theorems for the adaptive backpropagation algorithms are developed for both the DRNI and the DRNC, and an approach that uses adaptive learning rates is derived by introducing a Lyapunov function.
Abstract: A new neural paradigm called diagonal recurrent neural network (DRNN) is presented. The architecture of DRNN is a modified model of the fully connected recurrent neural network with one hidden layer, and the hidden layer comprises self-recurrent neurons. Two DRNN's are utilized in a control system, one as an identifier called diagonal recurrent neuroidentifier (DRNI) and the other as a controller called diagonal recurrent neurocontroller (DRNC). A controlled plant is identified by the DRNI, which then provides the sensitivity information of the plant to the DRNC. A generalized dynamic backpropagation algorithm (DBP) is developed and used to train both DRNC and DRNI. Due to the recurrence, the DRNN can capture the dynamic behavior of a system. To guarantee convergence and for faster learning, an approach that uses adaptive learning rates is developed by introducing a Lyapunov function. Convergence theorems for the adaptive backpropagation algorithms are developed for both DRNI and DRNC. The proposed DRNN paradigm is applied to numerical problems and the simulation results are included.

725 citations
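
For orientation, the defining feature described above, each hidden neuron feeding back only to itself, amounts to a diagonal recurrent weight matrix. The sketch below shows such a forward pass with random weights; the paper's dynamic backpropagation (DBP) training and adaptive learning rates are not reproduced, and the layer sizes and tanh activation are assumptions.

```python
# Sketch of a diagonal recurrent network forward pass: each hidden unit has a single
# self-feedback weight (a diagonal recurrent matrix). Weights are random for illustration;
# the paper's DBP training is not shown.
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 1, 8
W_in  = rng.normal(scale=0.5, size=(n_hidden, n_in))   # input -> hidden
w_d   = rng.uniform(-0.5, 0.5, size=n_hidden)          # diagonal self-recurrence
w_out = rng.normal(scale=0.5, size=n_hidden)           # hidden -> output

def drnn_forward(x_seq):
    s = np.zeros(n_hidden)                              # hidden states
    outputs = []
    for x in x_seq:
        s = np.tanh(W_in @ np.atleast_1d(x) + w_d * s)  # only self-feedback, no cross terms
        outputs.append(w_out @ s)
    return np.array(outputs)

print(drnn_forward(np.sin(np.linspace(0, 6, 20)))[:5])
```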


Journal Article•DOI•
TL;DR: The main results are that every Tauber-Wiener function qualifies as an activation function for the hidden layer of a three-layered neural network, and that neural computation can approximate the output of a dynamical system as a whole, thus identifying the system.
Abstract: The purpose of this paper is to investigate neural network capability systematically. The main results are: 1) every Tauber-Wiener function is qualified as an activation function in the hidden layer of a three-layered neural network; 2) for a continuous function in S'(R^1) to be a Tauber-Wiener function, the necessary and sufficient condition is that it is not a polynomial; 3) the capability of approximating nonlinear functionals defined on some compact set of a Banach space and nonlinear operators has been shown; and 4) the possibility of approximating, by neural computation, the output as a whole (not at a fixed point) of a dynamical system, thus identifying the system.

704 citations


Journal Article•DOI•
TL;DR: The SAMANN network offers the generalization ability of projecting new data, which is not present in the original Sammon's projection algorithm; the NDA method and NP-SOM network provide new powerful approaches for visualizing high dimensional data.
Abstract: Classical feature extraction and data projection methods have been well studied in the pattern recognition and exploratory data analysis literature. We propose a number of networks and learning algorithms which provide new or alternative tools for feature extraction and data projection. These networks include a network (SAMANN) for J.W. Sammon's (1969) nonlinear projection, a linear discriminant analysis (LDA) network, a nonlinear discriminant analysis (NDA) network, and a network for nonlinear projection (NP-SOM) based on Kohonen's self-organizing map. A common attribute of these networks is that they all employ adaptive learning algorithms, which makes them suitable in some environments where the distribution of patterns in feature space changes with respect to time. The availability of these networks also facilitates hardware implementation of well-known classical feature extraction and projection approaches. Moreover, the SAMANN network offers the generalization ability of projecting new data, which is not present in the original Sammon's projection algorithm; the NDA method and NP-SOM network provide new powerful approaches for visualizing high dimensional data. We evaluate five representative neural networks for feature extraction and data projection based on a visual judgement of the two-dimensional projection maps and three quantitative criteria on eight data sets with various properties.

695 citations
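
For context on what the SAMANN network is trained to minimize, Sammon's stress (the standard definition, stated here for reference rather than quoted from the paper) is

$$E \;=\; \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}},$$

where d*_ij is the distance between patterns i and j in the original feature space and d_ij the distance between their low-dimensional projections; the SAMANN learning rule amounts to gradient descent on this stress.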


Journal Article•DOI•
TL;DR: The author discusses advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones and presents some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks.
Abstract: Surveys learning algorithms for recurrent neural networks with hidden units and puts the various techniques into a common framework. The authors discuss fixed point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann machines, and nonfixed point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an on-line technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. The author discusses advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, and continues with some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks. The author presents some simulations, and at the end, addresses issues of computational complexity and learning speed.

627 citations


Journal Article•DOI•
TL;DR: It is shown that standard backpropagation, when used for real-time closed-loop control, can yield unbounded NN weights if (1) the net cannot exactly reconstruct a certain required control function or (2) there are bounded unknown disturbances in the robot dynamics.
Abstract: A neural net (NN) controller for a general serial-link robot arm is developed. The NN has two layers so that linearity in the parameters holds, but the "net functional reconstruction error" and robot disturbance input are taken as nonzero. The structure of the NN controller is derived using a filtered error/passivity approach, leading to new NN passivity properties. Online weight tuning algorithms including a correction term to backpropagation, plus an added robustifying signal, guarantee tracking as well as bounded NN weights. The NN controller structure has an outer tracking loop so that the NN weights are conveniently initialized at zero, with learning occurring online in real-time. It is shown that standard backpropagation, when used for real-time closed-loop control, can yield unbounded NN weights if (1) the net cannot exactly reconstruct a certain required control function or (2) there are bounded unknown disturbances in the robot dynamics. The role of persistency of excitation is explored.

611 citations


Journal Article•DOI•
TL;DR: This paper describes a method for representing more complex compositional structure in distributed representations that uses circular convolution to associate items, which are represented by vectors.
Abstract: Associative memories are conventionally used to represent data with very simple structure: sets of pairs of vectors. This paper describes a method for representing more complex compositional structure in distributed representations. The method uses circular convolution to associate items, which are represented by vectors. Arbitrary variable bindings, short sequences of various lengths, simple frame-like structures, and reduced representations can be represented in a fixed width vector. These representations are items in their own right and can be used in constructing compositional structures. The noisy reconstructions extracted from convolution memories can be cleaned up by using a separate associative memory that has good reconstructive properties.

597 citations
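
The binding operation described above is easy to state in code. The sketch below (an illustration, not the paper's implementation) binds a role to a filler with circular convolution, superimposes several bindings into one fixed-width trace, decodes with circular correlation, and cleans up the noisy result against a small item memory; the vector width and the nearest-neighbour clean-up are assumptions.

```python
# Sketch of circular-convolution binding/unbinding as in holographic reduced
# representations. Dimension and the clean-up step are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)
n = 1024                                   # fixed vector width for all items

def rand_item():
    return rng.normal(0.0, 1.0 / np.sqrt(n), size=n)  # roughly unit-norm random vector

def cconv(a, b):   # circular convolution: binding
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):   # circular correlation: approximate unbinding
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

role, filler, other = rand_item(), rand_item(), rand_item()
trace = cconv(role, filler) + cconv(other, rand_item())   # superposed bindings

noisy = ccorr(role, trace)                 # noisy reconstruction of `filler`
# "Clean-up memory": pick the best-matching known item
items = {"filler": filler, "other": other}
print(max(items, key=lambda k: items[k] @ noisy))   # expected to print 'filler'
```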


Journal Article•DOI•
TL;DR: The approach is to use a modular network architecture, reducing a K-class problem to a set of K two-class problems, with a separately trained network for each of the simpler problems.
Abstract: The rate of convergence of net output error is very low when training feedforward neural networks for multiclass problems using the backpropagation algorithm. While backpropagation will reduce the Euclidean distance between the actual and desired output vectors, the differences between some of the components of these vectors increase in the first iteration. Furthermore, the magnitudes of subsequent weight changes in each iteration are very small, so that many iterations are required to compensate for the increased error in some components in the initial iterations. Our approach is to use a modular network architecture, reducing a K-class problem to a set of K two-class problems, with a separately trained network for each of the simpler problems. Speedups of one order of magnitude have been obtained experimentally, and in some cases convergence was possible using the modular approach but not using a nonmodular network.

407 citations
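
The decomposition itself is simple to reproduce. The sketch below (illustrative only) reduces a K-class problem to K separately trained two-class networks and classifies by the most confident module; scikit-learn's MLPClassifier merely stands in for the paper's backpropagation networks, and the toy data, hidden-layer size, and argmax decision rule are assumptions.

```python
# One-network-per-class decomposition: a K-class task becomes K independent
# two-class (one-vs-rest) problems, each solved by its own small network.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def train_modular(X, y, n_classes):
    nets = []
    for k in range(n_classes):
        net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000)
        net.fit(X, (y == k).astype(int))      # class k vs. everything else
        nets.append(net)
    return nets

def predict_modular(nets, X):
    # each module outputs support for "its" class; pick the most confident module
    scores = np.column_stack([net.predict_proba(X)[:, 1] for net in nets])
    return scores.argmax(axis=1)

X, y = make_blobs(n_samples=300, centers=4, random_state=0)
nets = train_modular(X, y, n_classes=4)
print("training accuracy:", (predict_modular(nets, X) == y).mean())
```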


Journal Article•DOI•
TL;DR: A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated, which lays a physical foundation for the oscillatory correlation theory of feature binding and may provide an effective computational framework for scene segmentation and figure/ground segregation in real time.
Abstract: A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated. The model of each oscillator corresponds to a standard relaxation oscillator with two time scales. In the network, an oscillator jumping up to its active phase rapidly recruits the oscillators stimulated by the same pattern, while preventing other oscillators from jumping up. Computer simulations demonstrate that the network rapidly achieves both synchronization within blocks of oscillators that are stimulated by connected regions and desynchronization between different blocks. This model lays a physical foundation for the oscillatory correlation theory of feature binding and may provide an effective computational framework for scene segmentation and figure/ground segregation in real time.

266 citations


Journal Article•DOI•
TL;DR: Most of the known results on linear networks, including backpropagation learning and the structure of the error function landscape, the temporal evolution of generalization, and unsupervised learning algorithms and their properties are surveyed.
Abstract: Networks of linear units are the simplest kind of networks, where the basic questions related to learning, generalization, and self-organization can sometimes be answered analytically. We survey most of the known results on linear networks, including: 1) backpropagation learning and the structure of the error function landscape, 2) the temporal evolution of generalization, and 3) unsupervised learning algorithms and their properties. The connections to classical statistical ideas, such as principal component analysis (PCA), are emphasized as well as several simple but challenging open questions. A few new results are also spread across the paper, including an analysis of the effect of noise on backpropagation networks and a unified view of all unsupervised algorithms.

Journal Article•DOI•
TL;DR: Unlike neural network training, this estimation procedure does not rely on stochastic gradient type techniques such as the celebrated "backpropagation" and it completely avoids the problem of poor convergence or undesirable local minima.
Abstract: "Constructive wavelet networks" are investigated as a universal tool for function approximation. The parameters of such networks are obtained via some "direct" Monte Carlo procedures. Approximation bounds are given. Typically, it is shown that such networks with one layer of "wavelons" achieve an L/sub 2/ error of order O(N/sup -(/spl rhod)/), where N is the number of nodes, d is the problem dimension and /spl rho/ is the number of summable derivatives of the approximated function. An algorithm is also proposed to estimate this approximation based on noisy input-output data observed from the function under consideration. Unlike neural network training, this estimation procedure does not rely on stochastic gradient type techniques such as the celebrated "backpropagation" and it completely avoids the problem of poor convergence or undesirable local minima. >

Journal Article•DOI•
TL;DR: The authors' robust rules improve the performances of the existing PCA algorithms significantly when outliers are present and perform excellently for fulfilling various PCA-like tasks such as obtaining the first principal component vector, the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually.
Abstract: This paper applies statistical physics to the problem of robust principal component analysis (PCA). The commonly used PCA learning rules are first related to energy functions. These functions are generalized by adding a binary decision field with a given prior distribution so that outliers in the data are dealt with explicitly in order to make PCA robust. Each of the generalized energy functions is then used to define a Gibbs distribution from which a marginal distribution is obtained by summing over the binary decision field. The marginal distribution defines an effective energy function, from which self-organizing rules have been developed for robust PCA. Under the presence of outliers, both the standard PCA methods and the existing self-organizing PCA rules studied in the literature of neural networks perform quite poorly. By contrast, the robust rules proposed here resist outliers well and perform excellently for fulfilling various PCA-like tasks such as obtaining the first principal component vector, the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually. Comparative experiments have been made, and the results show that the authors' robust rules improve the performances of the existing PCA algorithms significantly when outliers are present.

Journal Article•DOI•
TL;DR: The asymptotic properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture.
Abstract: Many authors use feedforward neural networks for modeling and forecasting time series. Most of these applications are mainly experimental, and it is often difficult to extract a general methodology from the published studies. In particular, the choice of architecture is a tricky problem. We try to combine the statistical techniques of linear and nonlinear time series with the connectionist approach. The asymptotic properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture. This method (SSM or statistical stepwise method) is compared to other pruning techniques and is applied to some artificial series, to the famous Sunspots benchmark, and to daily electrical consumption data.

Journal Article•DOI•
TL;DR: This general approach organizes and simplifies all the known algorithms and results which have been originally derived for different problems (fixed point/trajectory learning), for different models, for different architectures, and using different techniques.
Abstract: Gives a unified treatment of gradient descent learning algorithms for neural networks using a general framework of dynamical systems. This general approach organizes and simplifies all the known algorithms and results which have been originally derived for different problems (fixed point/trajectory learning), for different models (discrete/continuous), for different architectures (forward/recurrent), and using different techniques (backpropagation, variational calculus, adjoint methods, etc.). The general approach can also be applied to derive new algorithms. The author then briefly examines some of the complexity issues and limitations intrinsic to gradient descent learning. Throughout the paper, the author focuses on the problem of trajectory learning.

Journal Article•DOI•
TL;DR: An algorithm for constructing a single hidden layer feedforward neural network that uses the quasi-Newton method to minimize the sequence of error functions associated with the growing network is described.
Abstract: This paper describes an algorithm for constructing a single hidden layer feedforward neural network. A distinguishing feature of this algorithm is that it uses the quasi-Newton method to minimize the sequence of error functions associated with the growing network. Experimental results indicate that the algorithm is very efficient and robust. The algorithm was tested on two test problems. The first was the n-bit parity problem and the second was the breast cancer diagnosis problem from the University of Wisconsin Hospitals. For the n-bit parity problem, the algorithm was able to construct a neural network having fewer than n hidden units that solved the problem for n = 4, ..., 7. For the cancer diagnosis problem, the neural networks constructed by the algorithm had a small number of hidden units and high accuracy rates on both the training data and the testing data.
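
A rough sketch of the constructive idea (not the authors' algorithm, whose details differ): keep a single hidden layer, minimize the squared error over all current weights with a quasi-Newton method, and if the error is still too large, append a freshly initialized hidden unit and re-optimize. SciPy's BFGS routine, the parameter packing, the tolerance, and the parity demo are assumptions made for illustration.

```python
# Constructive training sketch: grow hidden units one at a time and re-fit the whole
# single-hidden-layer net with a quasi-Newton optimizer after each addition.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

def unpack(theta, d, h):
    W = theta[: h * d].reshape(h, d)          # one row of input weights per hidden unit
    b = theta[h * d : h * d + h]              # hidden biases
    v = theta[h * d + h :]                    # output weights
    return W, b, v

def net_out(theta, X, h):
    W, b, v = unpack(theta, X.shape[1], h)
    return np.tanh(X @ W.T + b) @ v

def mse(theta, X, y, h):
    return np.mean((net_out(theta, X, h) - y) ** 2)

def grow_network(X, y, max_hidden=6, tol=1e-4):
    d = X.shape[1]
    theta = rng.normal(scale=0.3, size=d + 2)         # start with one hidden unit
    for h in range(1, max_hidden + 1):
        theta = minimize(mse, theta, args=(X, y, h), method="BFGS").x
        if mse(theta, X, y, h) < tol or h == max_hidden:
            return theta, h
        # append a freshly initialized hidden unit (new W row, bias, output weight)
        theta = np.concatenate([theta[: h * d], rng.normal(scale=0.3, size=d),
                                theta[h * d : h * d + h], rng.normal(scale=0.3, size=1),
                                theta[h * d + h :], rng.normal(scale=0.3, size=1)])

# Toy use: 4-bit parity
X = np.array([[int(c) for c in np.binary_repr(i, 4)] for i in range(16)], float)
y = X.sum(axis=1) % 2
theta, h = grow_network(X, y)
print("hidden units:", h, "final MSE:", mse(theta, X, y, h))
```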

Journal Article•DOI•
TL;DR: It is proved that the attached cost function is local minima free with respect to all the weights in the case of networks using radial basis functions (RBF), which provides some theoretical foundations for a massive application of RBF in pattern recognition.
Abstract: Learning from examples plays a central role in artificial neural networks. The success of many learning schemes is not guaranteed, however, since algorithms like backpropagation may get stuck in local minima, thus providing suboptimal solutions. For feedforward networks, optimal learning can be achieved provided that certain conditions on the network and the learning environment are met. This principle is investigated for the case of networks using radial basis functions (RBF). It is assumed that the patterns of the learning environment are separable by hyperspheres. In that case, we prove that the attached cost function is local minima free with respect to all the weights. This provides us with some theoretical foundations for a massive application of RBF in pattern recognition.

Journal Article•DOI•
TL;DR: An enhancement of the traditional k-means algorithm that approximates an optimal clustering solution with an efficient adaptive learning rate, which renders it usable even in situations where the statistics of the problem task vary slowly with time.
Abstract: Adaptive k-means clustering algorithms have been used in several artificial neural network architectures, such as radial basis function networks or feature-map classifiers, for a competitive partitioning of the input domain. This paper presents an enhancement of the traditional k-means algorithm. It approximates an optimal clustering solution with an efficient adaptive learning rate, which renders it usable even in situations where the statistics of the problem task vary slowly with time. This modification is based on the optimality criterion for the k-means partition stating that all the regions in an optimal k-means partition have the same variations if the number of regions in the partition is large and the underlying distribution for generating input patterns is smooth. The goal of equalizing these variations is introduced in the competitive function that assigns each new pattern vector to the "appropriate" region. To evaluate the optimal k-means algorithm, the authors first compare it to other k-means variants on several simple tutorial examples, then the authors evaluate it on a practical application: vector quantization of image data.
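
For reference, the sequential (online) k-means update that the paper refines is sketched below; centers move toward each incoming pattern with a per-cluster decaying rate. The paper's actual contribution (an adaptive learning rate and a competitive function biased toward equalizing per-cluster variations) is not reproduced here, and the initialization, pass count, and toy data are assumptions.

```python
# Baseline sequential k-means: one-pattern-at-a-time center updates with a
# per-cluster decaying learning rate. Shown only as the starting point the
# paper's adaptive-rate scheme builds on.
import numpy as np

rng = np.random.default_rng(4)

def online_kmeans(X, k, n_passes=5):
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    counts = np.zeros(k)
    for _ in range(n_passes):
        for x in X[rng.permutation(len(X))]:
            j = np.argmin(((centers - x) ** 2).sum(axis=1))   # winner-take-all
            counts[j] += 1
            centers[j] += (x - centers[j]) / counts[j]        # decaying learning rate
    return centers

X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in ([0, 0], [3, 0], [0, 3])])
print(online_kmeans(X, 3).round(2))
```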

Journal Article•DOI•
TL;DR: A connectionist expert system model, based on a fuzzy version of the multilayer perceptron developed by the authors, is proposed, which infers the output class membership value(s) of an input pattern and also generates a measure of certainty expressing confidence in the decision.
Abstract: A connectionist expert system model, based on a fuzzy version of the multilayer perceptron developed by the authors, is proposed. It infers the output class membership value(s) of an input pattern and also generates a measure of certainty expressing confidence in the decision. The model is capable of querying the user for the more important input feature information, if and when required, in case of partial inputs. Justification for an inferred decision may be produced in rule form, when so desired by the user. The magnitudes of the connection weights of the trained neural network are utilized in every stage of the proposed inferencing procedure. The antecedent and consequent parts of the justificatory rules are provided in natural forms. The effectiveness of the algorithm is tested on the speech recognition problem, on some medical data and on artificially generated intractable (linearly nonseparable) pattern classes.

Journal Article•DOI•
TL;DR: This paper presents a method for combining multiple networks based on fuzzy logic, especially the fuzzy integral, which non-linearly combines objective evidence, in the form of a network output, with subjective evaluation of the importance of the individual neural networks.
Abstract: Multilayer feedforward networks trained by minimizing the mean squared error and by using a one-of-c teaching function yield network outputs that estimate posterior class probabilities. This provides a sound basis for combining the results from multiple networks to get more accurate classification. This paper presents a method for combining multiple networks based on fuzzy logic, especially the fuzzy integral. This method non-linearly combines objective evidence, in the form of a network output, with subjective evaluation of the importance of the individual neural networks. The experimental results with the recognition problem of on-line handwriting characters show that the performance of individual networks could be improved significantly.
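
A sketch of the kind of combination the abstract describes, using the Sugeno fuzzy integral over a λ-fuzzy measure (a standard formulation; the paper's exact choices may differ). The per-network densities g, which encode how much each network is trusted, and the example outputs are made-up numbers; in practice the densities would come from validation performance.

```python
# Sugeno fuzzy integral over a lambda-fuzzy measure for combining several
# classifiers' per-class outputs. Densities and outputs below are illustrative.
import numpy as np
from scipy.optimize import brentq

def lambda_measure(g):
    """Solve prod(1 + lam*g_i) = 1 + lam for the nonzero root lam > -1."""
    f = lambda lam: np.prod(1.0 + lam * g) - 1.0 - lam
    if np.isclose(g.sum(), 1.0):
        return 0.0
    return brentq(f, 1e-9, 1e6) if g.sum() < 1 else brentq(f, -1 + 1e-9, -1e-9)

def sugeno_integral(h, g):
    """h: per-network support for one class; g: per-network trust densities."""
    g = np.asarray(g, float)
    lam = lambda_measure(g)
    order = np.argsort(h)[::-1]                  # sort evidence in decreasing order
    h_sorted, g_sorted = np.asarray(h, float)[order], g[order]
    G, best = 0.0, 0.0
    for hi, gi in zip(h_sorted, g_sorted):
        G = gi + G + lam * gi * G                # measure of the growing coalition
        best = max(best, min(hi, G))             # Sugeno integral = max of mins
    return best

# Three networks' outputs for classes A and B, plus trust densities
outputs = {"A": [0.7, 0.6, 0.2], "B": [0.3, 0.4, 0.8]}
g = [0.3, 0.4, 0.2]
print({c: round(sugeno_integral(h, g), 3) for c, h in outputs.items()})
```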

Journal Article•DOI•
TL;DR: This paper considers dynamic learning rate optimization of the BP algorithm using derivative information and an efficient method of deriving the first and second derivatives of the objective function with respect to the learning rate is explored.
Abstract: It has been observed by many authors that the backpropagation (BP) error surfaces usually consist of a large amount of flat regions as well as extremely steep regions. As such, the BP algorithm with a fixed learning rate will have low efficiency. This paper considers dynamic learning rate optimization of the BP algorithm using derivative information. An efficient method of deriving the first and second derivatives of the objective function with respect to the learning rate is explored, which does not involve explicit calculation of second-order derivatives in weight space, but rather uses the information gathered from the forward and backward propagation. Several learning rate optimization approaches are subsequently established based on linear expansion of the actual outputs and line searches with acceptable descent value and Newton-like methods, respectively. Simultaneous determination of the optimal learning rate and momentum is also introduced by showing the equivalence between the momentum version of BP and the conjugate gradient method. Since these approaches are constructed by simple manipulations of the obtained derivatives, the computational and storage burden scales with the network size exactly like the standard BP algorithm, and the convergence of the BP algorithm is accelerated, with a remarkable reduction (typically by a factor of 10 to 50, depending upon network architectures and applications) in the running time for the overall learning process. Numerous computer simulation results are provided to support the present approaches.
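
To illustrate the underlying idea, treating the learning rate as the single variable of a one-dimensional problem along the gradient direction, the sketch below does a secant search on dE/dη for a toy quadratic. It is not the paper's derivation (which reuses quantities from the forward and backward passes of an actual network); the test function, bracketing values, and iteration count are assumptions.

```python
# Learning-rate selection as a 1D search: with E(eta) = E(w - eta*g),
# dE/deta = -g . grad E(w - eta*g), so the rate can be refined from gradient
# evaluations along the descent ray. Demonstrated on a quadratic bowl.
import numpy as np

A = np.diag([1.0, 10.0, 100.0])           # badly conditioned quadratic, E(w) = 0.5 w^T A w
grad = lambda w: A @ w                     # gradient of E

def best_rate(w, g, eta0=1e-3, eta1=1.0, iters=20):
    dE = lambda eta: -g @ grad(w - eta * g)     # derivative of E along the ray
    f0, f1 = dE(eta0), dE(eta1)
    for _ in range(iters):                      # secant iteration on dE/deta = 0
        if abs(f1) < 1e-9 or f1 == f0:
            break
        eta0, f0, eta1 = eta1, f1, eta1 - f1 * (eta1 - eta0) / (f1 - f0)
        f1 = dE(eta1)
    return eta1

w = np.array([1.0, 1.0, 1.0])
g = grad(w)
eta = best_rate(w, g)
print("adaptive step:", round(eta, 5), "exact optimum:", round((g @ g) / (g @ A @ g), 5))
```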

Journal Article•DOI•
DeLiang Wang
TL;DR: It is found that locally coupled neural oscillators can yield global synchrony; the potential of locally connected oscillator networks in perceptual grouping and pattern segmentation, which seems missing in globally connected ones, is also illustrated.
Abstract: The discovery of long range synchronous oscillations in the visual cortex has triggered much interest in understanding the underlying neural mechanisms and in exploring possible applications of neural oscillations. Many neural models thus proposed end up relying on global connections, leading to the question of whether lateral connections alone can produce remote synchronization. With a formulation different from frequently used phase models, we find that locally coupled neural oscillators can yield global synchrony. The model employs a previously suggested mechanism that the efficacy of the connections is allowed to change on a fast time scale. Based on the known connectivity of the visual cortex, the model outputs closely resemble the experimental findings. Furthermore, we illustrate the potential of locally connected oscillator networks in perceptual grouping and pattern segmentation, which seems missing in globally connected ones.

Journal Article•DOI•
TL;DR: The authors formulate the MSE-OLC problem for trained NN's and derive two closed-form expressions for the optimal combination weights; an example of significant improvement in model accuracy is included.
Abstract: Neural network (NN) based modeling often requires trying multiple networks with different architectures and training parameters in order to achieve an acceptable model accuracy. Typically, only one of the trained networks is selected as "best" and the rest are discarded. The authors propose using optimal linear combinations (OLC's) of the corresponding outputs on a set of NN's as an alternative to using a single network. Modeling accuracy is measured by mean squared error (MSE) with respect to the distribution of random inputs. Optimality is defined by minimizing the MSE, with the resultant combination referred to as MSE-OLC. The authors formulate the MSE-OLC problem for trained NN's and derive two closed-form expressions for the optimal combination-weights. An example that illustrates significant improvement in model accuracy as a result of using MSE-OLC's of the trained networks is included.
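
The closed-form flavour of MSE-OLC can be illustrated with ordinary least squares: stack the component models' outputs on held-out data and solve for the combination weights that minimize the squared error. The stand-in "models", the added constant column, and the toy data below are assumptions, not the paper's experiments.

```python
# MSE-optimal linear combination of several trained models, estimated by
# least squares on their stacked outputs (a simple illustrative version).
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-3, 3, size=400)
y = np.sin(x) + 0.05 * rng.normal(size=x.size)

# Pretend these are three imperfect trained networks (simple stand-ins)
models = [lambda x: np.sin(x) + 0.2,                                  # biased
          lambda x: 0.8 * np.sin(x),                                  # shrunk
          lambda x: np.sin(x) + 0.3 * rng.normal(size=np.size(x))]    # noisy

F = np.column_stack([m(x) for m in models] + [np.ones_like(x)])   # outputs + constant
w, *_ = np.linalg.lstsq(F, y, rcond=None)                         # MSE-optimal weights

combo = F @ w
for name, pred in [("best single model", models[1](x)), ("MSE-OLC", combo)]:
    print(name, "MSE:", round(np.mean((pred - y) ** 2), 4))
```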

Journal Article•DOI•
TL;DR: The authors formulate and prove a PE condition on both the system state parameters and control inputs, and study affine RBF network identification that is important for affine nonlinear system control.
Abstract: Considers radial basis function (RBF) network approximation of a multivariate nonlinear mapping as a linear parametric regression problem. Linear recursive identification algorithms applied to this problem are known to converge, provided the regressor vector sequence has the persistency of excitation (PE) property. The main contribution of this paper is the formulation and proof of PE conditions on the input variables. In the RBF network identification, the regressor vector is a nonlinear function of these input variables. According to the formulated condition, the inputs provide PE if they belong to domains around the network node centers. For a two-input network with Gaussian RBFs that have typical width and are centered on a regular mesh, these domains cover about 25% of the input domain volume. The authors further generalize the proposed solution of the standard RBF network identification problem and study affine RBF network identification that is important for affine nonlinear system control. For the affine RBF network, the authors formulate and prove a PE condition on both the system state parameters and control inputs.

Journal Article•DOI•
TL;DR: A new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer, and which yields accuracy and convergence rates that are orders of magnitude superior to conventional backpropagation learning.
Abstract: Multilayer perceptrons are successfully used in an increasing number of nonlinear signal processing applications. The backpropagation learning algorithm, or variations thereof, is the standard method applied to the nonlinear optimization problem of adjusting the weights in the network in order to minimize a given cost function. However, backpropagation as a steepest descent approach is too slow for many applications. In this paper a new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer. In order to limit the introduced linearization error a penalty term is added to the cost function. The new learning algorithm is applied to the problem of nonlinear prediction of chaotic time series. The proposed algorithm yields accuracy and convergence rates that are orders of magnitude superior to conventional backpropagation learning.

Journal Article•DOI•
TL;DR: A neural-network classifier for detecting vascular structures in angiograms was developed and demonstrated its superiority in classification performance and was equivalent to a generalized matched filter with a nonlinear decision tree.
Abstract: A neural-network classifier for detecting vascular structures in angiograms was developed. The classifier consisted of a multilayer feedforward network window in which the center pixel was classified using gray-scale information within the window. The network was trained by using the backpropagation algorithm with the momentum term. Based on this image segmentation problem, the effect of changing network configuration on the classification performance was also characterized. Factors including topology, rate parameters, training sample set, and initial weights were systematically analyzed. The training set consisted of 75 selected points from a 256×256 digitized cineangiogram. While different network topologies showed no significant effect on performance, both the learning process and the classification performance were sensitive to the rate parameters. In a comparative study, the network demonstrated its superiority in classification performance. It was also shown that the trained neural-network classifier was equivalent to a generalized matched filter with a nonlinear decision tree.

Journal Article•DOI•
TL;DR: A decision-based neural network (DBNN) is proposed, which combines a perceptron-like learning rule with a hierarchical nonlinear network structure; its effectiveness is confirmed by simulations conducted for several applications, including texture classification, OCR, and ECG analysis.
Abstract: Supervised learning networks based on a decision-based formulation are explored. More specifically, a decision-based neural network (DBNN) is proposed, which combines the perceptron-like learning rule and hierarchical nonlinear network structure. The decision-based mutual training can be applied to both static and temporal pattern recognition problems. For static pattern recognition, two hierarchical structures are proposed: hidden-node and subcluster structures. The relationships between DBNN's and other models (linear perceptron, piecewise-linear perceptron, LVQ, and PNN) are discussed. As to temporal DBNN's, model-based discriminant functions may be chosen to compensate for possible temporal variations, such as waveform warping and alignments. Typical examples include DTW distance, prediction error, or likelihood functions. For classification applications, DBNN's are very effective in computation time and performance. This is confirmed by simulations conducted for several applications, including texture classification, OCR, and ECG analysis.

Journal Article•DOI•
TL;DR: A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning, which achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP).
Abstract: A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning. Applications include spatio-temporal image understanding and prediction and 3D object recognition from a series of ambiguous 2D views. The architecture, called ART-EMAP, achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP). ART-EMAP extends the capabilities of fuzzy ARTMAP in four incremental stages. Stage 1 introduces distributed pattern representation at a view category field. Stage 2 adds a decision criterion to the mapping between view and object categories, delaying identification of ambiguous objects when faced with a low confidence prediction. Stage 3 augments the system with a field where evidence accumulates in medium-term memory. Stage 4 adds an unsupervised learning process to fine-tune performance after the limited initial period of supervised network training. Each ART-EMAP stage is illustrated with a benchmark simulation example, using both noisy and noise-free data.

Journal Article•DOI•
TL;DR: A new technique, called guided evolutionary simulated annealing (GESA), is proposed, which incorporates the idea of simulated annealing into the practice of simulated evolution in place of arbitrary heuristics and is used primarily for combinatorial optimization.
Abstract: Feasible approaches to the task of solving NP-complete problems usually entail the incorporation of heuristic procedures so as to increase the efficiency of the methods used. We propose a new technique, which incorporates the idea of simulated annealing into the practice of simulated evolution, in place of arbitrary heuristics. The proposed technique is called guided evolutionary simulated annealing (GESA). We report on the use of the GESA approach primarily for combinatorial optimization. In addition, we report the case of function optimization, treating the task as a search problem. The traveling salesman problem is taken as a benchmark problem in the first case. Simulation results are reported. The results show that the GESA approach can discover a very good near optimum solution after examining an extremely small fraction of possible solutions. A very complicated function with many local minima is used in the second case. The results in both cases indicate that the GESA technique is a practicable method which yields consistent and good near optimal solutions, superior to simulated evolution.

Journal Article•DOI•
TL;DR: It is found that the boundedness condition on the sigmoidal function plays an essential role in the approximation, in contrast to continuity or monotonicity conditions.
Abstract: In this paper, we investigate the capability of approximating functions in C(R̄^n) by three-layered neural networks with a sigmoidal function in the hidden layer. It is found that the boundedness condition on the sigmoidal function plays an essential role in the approximation, in contrast to continuity or monotonicity conditions. We point out that, in order to prove the approximation capability of the neural network in the n-dimensional case, all one needs to do is to prove the case for one dimension. The approximation in the L^p norm (1 ≤ p < ∞) ...