
Showing papers in "IEEE Transactions on Neural Networks in 1995"


Journal Article•DOI•
TL;DR: A theoretical justification for the random vector version of the functional-link (RVFL) net is presented, based on a general approach to adaptive function approximation; a main result is that the RVFL is a universal approximator for continuous functions on bounded finite-dimensional sets.
Abstract: A theoretical justification for the random vector version of the functional-link (RVFL) net is presented in this paper, based on a general approach to adaptive function approximation. The approach consists of formulating a limit-integral representation of the function to be approximated and subsequently evaluating that integral with the Monte Carlo method. Two main results are: (1) the RVFL is a universal approximator for continuous functions on bounded finite-dimensional sets, and (2) the RVFL is an efficient universal approximator, with the rate of approximation error convergence to zero of order O(C/√n), where n is the number of basis functions and C is independent of n. Similar results are also obtained for neural nets with hidden nodes implemented as products of univariate functions or radial basis functions. Some possible ways of enhancing the accuracy of multivariate function approximations are discussed.

794 citations
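
To make the RVFL idea above concrete, here is a minimal sketch (not the authors' code): the hidden-layer weights and biases are drawn at random and left untrained, and only the linear output weights are fitted, here by regularized least squares. The activation, weight ranges, ridge term, and test function are illustrative assumptions.

```python
# Minimal RVFL sketch: random, untrained hidden layer + least-squares output weights.
# Illustrative only; architecture details (activation, ranges, ridge term) are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def fit_rvfl(X, y, n_hidden=200, reg=1e-6):
    """Fix random input weights/biases, solve the output weights by ridge regression."""
    d = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(d, n_hidden))   # random, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)                            # random basis functions
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_rvfl(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy check on a smooth 2D target
X = rng.uniform(-1, 1, size=(500, 2))
y = np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])
W, b, beta = fit_rvfl(X, y)
print("train RMSE:", np.sqrt(np.mean((predict_rvfl(X, W, b, beta) - y) ** 2)))
```

Many functional-link formulations also include direct input-to-output connections; they are omitted here to keep the sketch short.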


Journal Article•DOI•
TL;DR: This paper studies the approximation and learning properties of one class of recurrent networks, known as high-order neural networks, and applies these architectures to the identification of dynamical systems.
Abstract: Several continuous-time and discrete-time recurrent neural network models have been developed and applied to various engineering problems. One of the difficulties encountered in the application of recurrent networks is the derivation of efficient learning algorithms that also guarantee the stability of the overall system. This paper studies the approximation and learning properties of one class of recurrent networks, known as high-order neural networks, and applies these architectures to the identification of dynamical systems. In recurrent high-order neural networks, the dynamic components are distributed throughout the network in the form of dynamic neurons. It is shown that if enough high-order connections are allowed then this network is capable of approximating arbitrary dynamical systems. Identification schemes based on high-order network architectures are designed and analyzed.

761 citations


Journal Article•DOI•
TL;DR: Convergence theorems for the adaptive backpropagation algorithms are developed for both the DRNI and the DRNC, and an approach that uses adaptive learning rates is derived by introducing a Lyapunov function.
Abstract: A new neural paradigm called diagonal recurrent neural network (DRNN) is presented. The architecture of DRNN is a modified model of the fully connected recurrent neural network with one hidden layer, and the hidden layer comprises self-recurrent neurons. Two DRNN's are utilized in a control system, one as an identifier called diagonal recurrent neuroidentifier (DRNI) and the other as a controller called diagonal recurrent neurocontroller (DRNC). A controlled plant is identified by the DRNI, which then provides the sensitivity information of the plant to the DRNC. A generalized dynamic backpropagation algorithm (DBP) is developed and used to train both DRNC and DRNI. Due to the recurrence, the DRNN can capture the dynamic behavior of a system. To guarantee convergence and for faster learning, an approach that uses adaptive learning rates is developed by introducing a Lyapunov function. Convergence theorems for the adaptive backpropagation algorithms are developed for both DRNI and DRNC. The proposed DRNN paradigm is applied to numerical problems and the simulation results are included.

725 citations
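
For orientation, the defining feature described above, each hidden neuron feeding back only to itself, amounts to a diagonal recurrent weight matrix. The sketch below shows such a forward pass with random weights; the paper's dynamic backpropagation (DBP) training and adaptive learning rates are not reproduced, and the layer sizes and tanh activation are assumptions.

```python
# Sketch of a diagonal recurrent network forward pass: each hidden unit has a single
# self-feedback weight (a diagonal recurrent matrix). Weights are random for illustration;
# the paper's DBP training is not shown.
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 1, 8
W_in  = rng.normal(scale=0.5, size=(n_hidden, n_in))   # input -> hidden
w_d   = rng.uniform(-0.5, 0.5, size=n_hidden)          # diagonal self-recurrence
w_out = rng.normal(scale=0.5, size=n_hidden)           # hidden -> output

def drnn_forward(x_seq):
    s = np.zeros(n_hidden)                              # hidden states
    outputs = []
    for x in x_seq:
        s = np.tanh(W_in @ np.atleast_1d(x) + w_d * s)  # only self-feedback, no cross terms
        outputs.append(w_out @ s)
    return np.array(outputs)

print(drnn_forward(np.sin(np.linspace(0, 6, 20)))[:5])
```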


Journal Article•DOI•
TL;DR: The main results are that every Tauber-Wiener function qualifies as an activation function for the hidden layer of a three-layered neural network, and that neural computation can approximate the output of a dynamical system as a whole, thus identifying the system.
Abstract: The purpose of this paper is to investigate neural network capability systematically. The main results are: 1) every Tauber-Wiener function is qualified as an activation function in the hidden layer of a three-layered neural network; 2) for a continuous function in S'(R^1) to be a Tauber-Wiener function, the necessary and sufficient condition is that it is not a polynomial; 3) the capability of approximating nonlinear functionals defined on some compact set of a Banach space and nonlinear operators has been shown; and 4) the possibility of approximating, by neural computation, the output as a whole (not at a fixed point) of a dynamical system, thus identifying the system.

704 citations


Journal Article•DOI•
TL;DR: The SAMANN network offers the generalization ability of projecting new data, which is not present in the original Sammon's projection algorithm; the NDA method and NP-SOM network provide new powerful approaches for visualizing high dimensional data.
Abstract: Classical feature extraction and data projection methods have been well studied in the pattern recognition and exploratory data analysis literature. We propose a number of networks and learning algorithms which provide new or alternative tools for feature extraction and data projection. These networks include a network (SAMANN) for J.W. Sammon's (1969) nonlinear projection, a linear discriminant analysis (LDA) network, a nonlinear discriminant analysis (NDA) network, and a network for nonlinear projection (NP-SOM) based on Kohonen's self-organizing map. A common attribute of these networks is that they all employ adaptive learning algorithms, which makes them suitable in some environments where the distribution of patterns in feature space changes with respect to time. The availability of these networks also facilitates hardware implementation of well-known classical feature extraction and projection approaches. Moreover, the SAMANN network offers the generalization ability of projecting new data, which is not present in the original Sammon's projection algorithm; the NDA method and NP-SOM network provide new powerful approaches for visualizing high dimensional data. We evaluate five representative neural networks for feature extraction and data projection based on a visual judgement of the two-dimensional projection maps and three quantitative criteria on eight data sets with various properties.

695 citations
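
For context on what the SAMANN network is trained to minimize, Sammon's stress (the standard definition, stated here for reference rather than quoted from the paper) is

$$E \;=\; \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}},$$

where d*_ij is the distance between patterns i and j in the original feature space and d_ij the distance between their low-dimensional projections; the SAMANN learning rule amounts to gradient descent on this stress.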


Journal Article•DOI•
TL;DR: The author discusses advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones and presents some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks.
Abstract: Surveys learning algorithms for recurrent neural networks with hidden units and puts the various techniques into a common framework. The authors discuss fixed point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann machines, and nonfixed point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an on-line technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. The author discusses advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, and continues with some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks. The author presents some simulations, and at the end, addresses issues of computational complexity and learning speed.

627 citations


Journal Article•DOI•
TL;DR: It is shown that standard backpropagation, when used for real-time closed-loop control, can yield unbounded NN weights if (1) the net cannot exactly reconstruct a certain required control function or (2) there are bounded unknown disturbances in the robot dynamics.
Abstract: A neural net (NN) controller for a general serial-link robot arm is developed. The NN has two layers so that linearity in the parameters holds, but the "net functional reconstruction error" and robot disturbance input are taken as nonzero. The structure of the NN controller is derived using a filtered error/passivity approach, leading to new NN passivity properties. Online weight tuning algorithms including a correction term to backpropagation, plus an added robustifying signal, guarantee tracking as well as bounded NN weights. The NN controller structure has an outer tracking loop so that the NN weights are conveniently initialized at zero, with learning occurring online in real-time. It is shown that standard backpropagation, when used for real-time closed-loop control, can yield unbounded NN weights if (1) the net cannot exactly reconstruct a certain required control function or (2) there are bounded unknown disturbances in the robot dynamics. The role of persistency of excitation is explored.

611 citations


Journal Article•DOI•
TL;DR: This paper describes a method for representing more complex compositional structure in distributed representations that uses circular convolution to associate items, which are represented by vectors.
Abstract: Associative memories are conventionally used to represent data with very simple structure: sets of pairs of vectors. This paper describes a method for representing more complex compositional structure in distributed representations. The method uses circular convolution to associate items, which are represented by vectors. Arbitrary variable bindings, short sequences of various lengths, simple frame-like structures, and reduced representations can be represented in a fixed width vector. These representations are items in their own right and can be used in constructing compositional structures. The noisy reconstructions extracted from convolution memories can be cleaned up by using a separate associative memory that has good reconstructive properties.

597 citations
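
The binding operation described above is easy to state in code. The sketch below (an illustration, not the paper's implementation) binds a role to a filler with circular convolution, superimposes several bindings into one fixed-width trace, decodes with circular correlation, and cleans up the noisy result against a small item memory; the vector width and the nearest-neighbour clean-up are assumptions.

```python
# Sketch of circular-convolution binding/unbinding as in holographic reduced
# representations. Dimension and the clean-up step are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)
n = 1024                                   # fixed vector width for all items

def rand_item():
    return rng.normal(0.0, 1.0 / np.sqrt(n), size=n)  # roughly unit-norm random vector

def cconv(a, b):   # circular convolution: binding
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):   # circular correlation: approximate unbinding
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

role, filler, other = rand_item(), rand_item(), rand_item()
trace = cconv(role, filler) + cconv(other, rand_item())   # superposed bindings

noisy = ccorr(role, trace)                 # noisy reconstruction of `filler`
# "Clean-up memory": pick the best-matching known item
items = {"filler": filler, "other": other}
print(max(items, key=lambda k: items[k] @ noisy))   # expected to print 'filler'
```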


Journal Article•DOI•
TL;DR: The approach is to use a modular network architecture, reducing a K-class problem to a set of K two-class problems, with a separately trained network for each of the simpler problems.
Abstract: The rate of convergence of net output error is very low when training feedforward neural networks for multiclass problems using the backpropagation algorithm. While backpropagation will reduce the Euclidean distance between the actual and desired output vectors, the differences between some of the components of these vectors increase in the first iteration. Furthermore, the magnitudes of subsequent weight changes in each iteration are very small, so that many iterations are required to compensate for the increased error in some components in the initial iterations. Our approach is to use a modular network architecture, reducing a K-class problem to a set of K two-class problems, with a separately trained network for each of the simpler problems. Speedups of one order of magnitude have been obtained experimentally, and in some cases convergence was possible using the modular approach but not using a nonmodular network.

407 citations
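
The decomposition itself is simple to reproduce. The sketch below (illustrative only) reduces a K-class problem to K separately trained two-class networks and classifies by the most confident module; scikit-learn's MLPClassifier merely stands in for the paper's backpropagation networks, and the toy data, hidden-layer size, and argmax decision rule are assumptions.

```python
# One-network-per-class decomposition: a K-class task becomes K independent
# two-class (one-vs-rest) problems, each solved by its own small network.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def train_modular(X, y, n_classes):
    nets = []
    for k in range(n_classes):
        net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000)
        net.fit(X, (y == k).astype(int))      # class k vs. everything else
        nets.append(net)
    return nets

def predict_modular(nets, X):
    # each module outputs support for "its" class; pick the most confident module
    scores = np.column_stack([net.predict_proba(X)[:, 1] for net in nets])
    return scores.argmax(axis=1)

X, y = make_blobs(n_samples=300, centers=4, random_state=0)
nets = train_modular(X, y, n_classes=4)
print("training accuracy:", (predict_modular(nets, X) == y).mean())
```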


Journal Article•DOI•
TL;DR: A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated, which lays a physical foundation for the oscillatory correlation theory of feature binding and may provide an effective computational framework for scene segmentation and figure/ground segregation in real time.
Abstract: A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated. The model of each oscillator corresponds to a standard relaxation oscillator with two time scales. In the network, an oscillator jumping up to its active phase rapidly recruits the oscillators stimulated by the same pattern, while preventing other oscillators from jumping up. Computer simulations demonstrate that the network rapidly achieves both synchronization within blocks of oscillators that are stimulated by connected regions and desynchronization between different blocks. This model lays a physical foundation for the oscillatory correlation theory of feature binding and may provide an effective computational framework for scene segmentation and figure/ground segregation in real time.

266 citations


Journal Article•DOI•
TL;DR: Most of the known results on linear networks, including backpropagation learning and the structure of the error function landscape, the temporal evolution of generalization, and unsupervised learning algorithms and their properties are surveyed.
Abstract: Networks of linear units are the simplest kind of networks, where the basic questions related to learning, generalization, and self-organization can sometimes be answered analytically. We survey most of the known results on linear networks, including: 1) backpropagation learning and the structure of the error function landscape, 2) the temporal evolution of generalization, and 3) unsupervised learning algorithms and their properties. The connections to classical statistical ideas, such as principal component analysis (PCA), are emphasized as well as several simple but challenging open questions. A few new results are also spread across the paper, including an analysis of the effect of noise on backpropagation networks and a unified view of all unsupervised algorithms.

Journal Article•DOI•
TL;DR: Unlike neural network training, this estimation procedure does not rely on stochastic gradient type techniques such as the celebrated "backpropagation" and it completely avoids the problem of poor convergence or undesirable local minima.
Abstract: "Constructive wavelet networks" are investigated as a universal tool for function approximation. The parameters of such networks are obtained via some "direct" Monte Carlo procedures. Approximation bounds are given. Typically, it is shown that such networks with one layer of "wavelons" achieve an L/sub 2/ error of order O(N/sup -(/spl rhod)/), where N is the number of nodes, d is the problem dimension and /spl rho/ is the number of summable derivatives of the approximated function. An algorithm is also proposed to estimate this approximation based on noisy input-output data observed from the function under consideration. Unlike neural network training, this estimation procedure does not rely on stochastic gradient type techniques such as the celebrated "backpropagation" and it completely avoids the problem of poor convergence or undesirable local minima. >

Journal Article•DOI•
TL;DR: The authors' robust rules improve the performances of the existing PCA algorithms significantly when outliers are present and perform excellently for fulfilling various PCA-like tasks such as obtaining the first principal component vector, the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually.
Abstract: This paper applies statistical physics to the problem of robust principal component analysis (PCA). The commonly used PCA learning rules are first related to energy functions. These functions are generalized by adding a binary decision field with a given prior distribution so that outliers in the data are dealt with explicitly in order to make PCA robust. Each of the generalized energy functions is then used to define a Gibbs distribution from which a marginal distribution is obtained by summing over the binary decision field. The marginal distribution defines an effective energy function, from which self-organizing rules have been developed for robust PCA. Under the presence of outliers, both the standard PCA methods and the existing self-organizing PCA rules studied in the literature of neural networks perform quite poorly. By contrast, the robust rules proposed here resist outliers well and perform excellently for fulfilling various PCA-like tasks such as obtaining the first principal component vector, the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually. Comparative experiments have been made, and the results show that the authors' robust rules improve the performances of the existing PCA algorithms significantly when outliers are present.

Journal Article•DOI•
TL;DR: The asymptotic properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture.
Abstract: Many authors use feedforward neural networks for modeling and forecasting time series. Most of these applications are mainly experimental, and it is often difficult to extract a general methodology from the published studies. In particular, the choice of architecture is a tricky problem. We try to combine the statistical techniques of linear and nonlinear time series with the connectionist approach. The asymptotic properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture. This method (SSM or statistical stepwise method) is compared to other pruning techniques and is applied to some artificial series, to the famous Sunspots benchmark, and to daily electrical consumption data.

Journal Article•DOI•
TL;DR: This general approach organizes and simplifies all the known algorithms and results which have been originally derived for different problems (fixed point/trajectory learning), for different models, for different architectures, and using different techniques.
Abstract: Gives a unified treatment of gradient descent learning algorithms for neural networks using a general framework of dynamical systems. This general approach organizes and simplifies all the known algorithms and results which have been originally derived for different problems (fixed point/trajectory learning), for different models (discrete/continuous), for different architectures (forward/recurrent), and using different techniques (backpropagation, variational calculus, adjoint methods, etc.). The general approach can also be applied to derive new algorithms. The author then briefly examines some of the complexity issues and limitations intrinsic to gradient descent learning. Throughout the paper, the author focuses on the problem of trajectory learning.

Journal Article•DOI•
TL;DR: An algorithm for constructing a single hidden layer feedforward neural network that uses the quasi-Newton method to minimize the sequence of error functions associated with the growing network is described.
Abstract: This paper describes an algorithm for constructing a single hidden layer feedforward neural network. A distinguishing feature of this algorithm is that it uses the quasi-Newton method to minimize the sequence of error functions associated with the growing network. Experimental results indicate that the algorithm is very efficient and robust. The algorithm was tested on two test problems. The first was the n-bit parity problem and the second was the breast cancer diagnosis problem from the University of Wisconsin Hospitals. For the n-bit parity problem, the algorithm was able to construct a neural network having fewer than n hidden units that solved the problem for n = 4, ..., 7. For the cancer diagnosis problem, the neural networks constructed by the algorithm had a small number of hidden units and high accuracy rates on both the training data and the testing data.
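
A rough sketch of the constructive idea (not the authors' algorithm, whose details differ): keep a single hidden layer, minimize the squared error over all current weights with a quasi-Newton method, and if the error is still too large, append a freshly initialized hidden unit and re-optimize. SciPy's BFGS routine, the parameter packing, the tolerance, and the parity demo are assumptions made for illustration.

```python
# Constructive training sketch: grow hidden units one at a time and re-fit the whole
# single-hidden-layer net with a quasi-Newton optimizer after each addition.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

def unpack(theta, d, h):
    W = theta[: h * d].reshape(h, d)          # one row of input weights per hidden unit
    b = theta[h * d : h * d + h]              # hidden biases
    v = theta[h * d + h :]                    # output weights
    return W, b, v

def net_out(theta, X, h):
    W, b, v = unpack(theta, X.shape[1], h)
    return np.tanh(X @ W.T + b) @ v

def mse(theta, X, y, h):
    return np.mean((net_out(theta, X, h) - y) ** 2)

def grow_network(X, y, max_hidden=6, tol=1e-4):
    d = X.shape[1]
    theta = rng.normal(scale=0.3, size=d + 2)         # start with one hidden unit
    for h in range(1, max_hidden + 1):
        theta = minimize(mse, theta, args=(X, y, h), method="BFGS").x
        if mse(theta, X, y, h) < tol or h == max_hidden:
            return theta, h
        # append a freshly initialized hidden unit (new W row, bias, output weight)
        theta = np.concatenate([theta[: h * d], rng.normal(scale=0.3, size=d),
                                theta[h * d : h * d + h], rng.normal(scale=0.3, size=1),
                                theta[h * d + h :], rng.normal(scale=0.3, size=1)])

# Toy use: 4-bit parity
X = np.array([[int(c) for c in np.binary_repr(i, 4)] for i in range(16)], float)
y = X.sum(axis=1) % 2
theta, h = grow_network(X, y)
print("hidden units:", h, "final MSE:", mse(theta, X, y, h))
```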

Journal Article•DOI•
TL;DR: It is proved that the attached cost function is local minima free with respect to all the weights in the case of networks using radial basis functions (RBF), which provides some theoretical foundations for a massive application of RBF in pattern recognition.
Abstract: Learning from examples plays a central role in artificial neural networks. The success of many learning schemes is not guaranteed, however, since algorithms like backpropagation may get stuck in local minima, thus providing suboptimal solutions. For feedforward networks, optimal learning can be achieved provided that certain conditions on the network and the learning environment are met. This principle is investigated for the case of networks using radial basis functions (RBF). It is assumed that the patterns of the learning environment are separable by hyperspheres. In that case, we prove that the attached cost function is local minima free with respect to all the weights. This provides us with some theoretical foundations for a massive application of RBF in pattern recognition.

Journal Article•DOI•
TL;DR: An enhancement of the traditional k-means algorithm that approximates an optimal clustering solution with an efficient adaptive learning rate, which renders it usable even in situations where the statistics of the problem task vary slowly with time.
Abstract: Adaptive k-means clustering algorithms have been used in several artificial neural network architectures, such as radial basis function networks or feature-map classifiers, for a competitive partitioning of the input domain. This paper presents an enhancement of the traditional k-means algorithm. It approximates an optimal clustering solution with an efficient adaptive learning rate, which renders it usable even in situations where the statistics of the problem task vary slowly with time. This modification is based on the optimality criterion for the k-means partition stating that all the regions in an optimal k-means partition have the same variations if the number of regions in the partition is large and the underlying distribution for generating input patterns is smooth. The goal of equalizing these variations is introduced in the competitive function that assigns each new pattern vector to the "appropriate" region. To evaluate the optimal k-means algorithm, the authors first compare it to other k-means variants on several simple tutorial examples, then the authors evaluate it on a practical application: vector quantization of image data.
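
For reference, the sequential (online) k-means update that the paper refines is sketched below; centers move toward each incoming pattern with a per-cluster decaying rate. The paper's actual contribution (an adaptive learning rate and a competitive function biased toward equalizing per-cluster variations) is not reproduced here, and the initialization, pass count, and toy data are assumptions.

```python
# Baseline sequential k-means: one-pattern-at-a-time center updates with a
# per-cluster decaying learning rate. Shown only as the starting point the
# paper's adaptive-rate scheme builds on.
import numpy as np

rng = np.random.default_rng(4)

def online_kmeans(X, k, n_passes=5):
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    counts = np.zeros(k)
    for _ in range(n_passes):
        for x in X[rng.permutation(len(X))]:
            j = np.argmin(((centers - x) ** 2).sum(axis=1))   # winner-take-all
            counts[j] += 1
            centers[j] += (x - centers[j]) / counts[j]        # decaying learning rate
    return centers

X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in ([0, 0], [3, 0], [0, 3])])
print(online_kmeans(X, 3).round(2))
```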

Journal Article•DOI•
TL;DR: A connectionist expert system model, based on a fuzzy version of the multilayer perceptron developed by the authors, is proposed, which infers the output class membership value(s) of an input pattern and also generates a measure of certainty expressing confidence in the decision.
Abstract: A connectionist expert system model, based on a fuzzy version of the multilayer perceptron developed by the authors, is proposed. It infers the output class membership value(s) of an input pattern and also generates a measure of certainty expressing confidence in the decision. The model is capable of querying the user for the more important input feature information, if and when required, in case of partial inputs. Justification for an inferred decision may be produced in rule form, when so desired by the user. The magnitudes of the connection weights of the trained neural network are utilized in every stage of the proposed inferencing procedure. The antecedent and consequent parts of the justificatory rules are provided in natural forms. The effectiveness of the algorithm is tested on the speech recognition problem, on some medical data and on artificially generated intractable (linearly nonseparable) pattern classes.

Journal Article•DOI•
TL;DR: This paper presents a method for combining multiple networks based on fuzzy logic, especially the fuzzy integral, which non-linearly combines objective evidence, in the form of a network output, with subjective evaluation of the importance of the individual neural networks.
Abstract: Multilayer feedforward networks trained by minimizing the mean squared error and by using a one-of-c teaching function yield network outputs that estimate posterior class probabilities. This provides a sound basis for combining the results from multiple networks to get more accurate classification. This paper presents a method for combining multiple networks based on fuzzy logic, especially the fuzzy integral. This method non-linearly combines objective evidence, in the form of a network output, with subjective evaluation of the importance of the individual neural networks. The experimental results with the recognition problem of on-line handwriting characters show that the performance of individual networks could be improved significantly.
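
A sketch of the kind of combination the abstract describes, using the Sugeno fuzzy integral over a λ-fuzzy measure (a standard formulation; the paper's exact choices may differ). The per-network densities g, which encode how much each network is trusted, and the example outputs are made-up numbers; in practice the densities would come from validation performance.

```python
# Sugeno fuzzy integral over a lambda-fuzzy measure for combining several
# classifiers' per-class outputs. Densities and outputs below are illustrative.
import numpy as np
from scipy.optimize import brentq

def lambda_measure(g):
    """Solve prod(1 + lam*g_i) = 1 + lam for the nonzero root lam > -1."""
    f = lambda lam: np.prod(1.0 + lam * g) - 1.0 - lam
    if np.isclose(g.sum(), 1.0):
        return 0.0
    return brentq(f, 1e-9, 1e6) if g.sum() < 1 else brentq(f, -1 + 1e-9, -1e-9)

def sugeno_integral(h, g):
    """h: per-network support for one class; g: per-network trust densities."""
    g = np.asarray(g, float)
    lam = lambda_measure(g)
    order = np.argsort(h)[::-1]                  # sort evidence in decreasing order
    h_sorted, g_sorted = np.asarray(h, float)[order], g[order]
    G, best = 0.0, 0.0
    for hi, gi in zip(h_sorted, g_sorted):
        G = gi + G + lam * gi * G                # measure of the growing coalition
        best = max(best, min(hi, G))             # Sugeno integral = max of mins
    return best

# Three networks' outputs for classes A and B, plus trust densities
outputs = {"A": [0.7, 0.6, 0.2], "B": [0.3, 0.4, 0.8]}
g = [0.3, 0.4, 0.2]
print({c: round(sugeno_integral(h, g), 3) for c, h in outputs.items()})
```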

Journal Article•DOI•
TL;DR: This paper considers dynamic learning rate optimization of the BP algorithm using derivative information and an efficient method of deriving the first and second derivatives of the objective function with respect to the learning rate is explored.
Abstract: It has been observed by many authors that the backpropagation (BP) error surfaces usually consist of a large amount of flat regions as well as extremely steep regions. As such, the BP algorithm with a fixed learning rate will have low efficiency. This paper considers dynamic learning rate optimization of the BP algorithm using derivative information. An efficient method of deriving the first and second derivatives of the objective function with respect to the learning rate is explored, which does not involve explicit calculation of second-order derivatives in weight space, but rather uses the information gathered from the forward and backward propagation. Several learning rate optimization approaches are subsequently established based on linear expansion of the actual outputs and line searches with acceptable descent value and Newton-like methods, respectively. Simultaneous determination of the optimal learning rate and momentum is also introduced by showing the equivalence between the momentum version of BP and the conjugate gradient method. Since these approaches are constructed by simple manipulations of the obtained derivatives, the computational and storage burden scales with the network size exactly like the standard BP algorithm, and the convergence of the BP algorithm is accelerated, with a remarkable reduction (typically by a factor of 10 to 50, depending upon network architectures and applications) in the running time for the overall learning process. Numerous computer simulation results are provided to support the present approaches.
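
To illustrate the underlying idea, treating the learning rate as the single variable of a one-dimensional problem along the gradient direction, the sketch below does a secant search on dE/dη for a toy quadratic. It is not the paper's derivation (which reuses quantities from the forward and backward passes of an actual network); the test function, bracketing values, and iteration count are assumptions.

```python
# Learning-rate selection as a 1D search: with E(eta) = E(w - eta*g),
# dE/deta = -g . grad E(w - eta*g), so the rate can be refined from gradient
# evaluations along the descent ray. Demonstrated on a quadratic bowl.
import numpy as np

A = np.diag([1.0, 10.0, 100.0])           # badly conditioned quadratic, E(w) = 0.5 w^T A w
grad = lambda w: A @ w                     # gradient of E

def best_rate(w, g, eta0=1e-3, eta1=1.0, iters=20):
    dE = lambda eta: -g @ grad(w - eta * g)     # derivative of E along the ray
    f0, f1 = dE(eta0), dE(eta1)
    for _ in range(iters):                      # secant iteration on dE/deta = 0
        if abs(f1) < 1e-9 or f1 == f0:
            break
        eta0, f0, eta1 = eta1, f1, eta1 - f1 * (eta1 - eta0) / (f1 - f0)
        f1 = dE(eta1)
    return eta1

w = np.array([1.0, 1.0, 1.0])
g = grad(w)
eta = best_rate(w, g)
print("adaptive step:", round(eta, 5), "exact optimum:", round((g @ g) / (g @ A @ g), 5))
```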

Journal Article•DOI•
DeLiang Wang
TL;DR: It is found that locally coupled neural oscillators can yield global synchrony; the potential of locally connected oscillator networks in perceptual grouping and pattern segmentation, which seems missing in globally connected ones, is also illustrated.
Abstract: The discovery of long range synchronous oscillations in the visual cortex has triggered much interest in understanding the underlying neural mechanisms and in exploring possible applications of neural oscillations. Many neural models thus proposed end up relying on global connections, leading to the question of whether lateral connections alone can produce remote synchronization. With a formulation different from frequently used phase models, we find that locally coupled neural oscillators can yield global synchrony. The model employs a previously suggested mechanism that the efficacy of the connections is allowed to change on a fast time scale. Based on the known connectivity of the visual cortex, the model outputs closely resemble the experimental findings. Furthermore, we illustrate the potential of locally connected oscillator networks in perceptual grouping and pattern segmentation, which seems missing in globally connected ones.

Journal Article•DOI•
TL;DR: The authors formulate the MSE-OLC problem for trained NN's and derive two closed-form expressions for the optimal combination weights; an example of significant improvement in model accuracy is included.
Abstract: Neural network (NN) based modeling often requires trying multiple networks with different architectures and training parameters in order to achieve an acceptable model accuracy. Typically, only one of the trained networks is selected as "best" and the rest are discarded. The authors propose using optimal linear combinations (OLC's) of the corresponding outputs on a set of NN's as an alternative to using a single network. Modeling accuracy is measured by mean squared error (MSE) with respect to the distribution of random inputs. Optimality is defined by minimizing the MSE, with the resultant combination referred to as MSE-OLC. The authors formulate the MSE-OLC problem for trained NN's and derive two closed-form expressions for the optimal combination-weights. An example that illustrates significant improvement in model accuracy as a result of using MSE-OLC's of the trained networks is included.
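
The closed-form flavour of MSE-OLC can be illustrated with ordinary least squares: stack the component models' outputs on held-out data and solve for the combination weights that minimize the squared error. The stand-in "models", the added constant column, and the toy data below are assumptions, not the paper's experiments.

```python
# MSE-optimal linear combination of several trained models, estimated by
# least squares on their stacked outputs (a simple illustrative version).
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-3, 3, size=400)
y = np.sin(x) + 0.05 * rng.normal(size=x.size)

# Pretend these are three imperfect trained networks (simple stand-ins)
models = [lambda x: np.sin(x) + 0.2,                                  # biased
          lambda x: 0.8 * np.sin(x),                                  # shrunk
          lambda x: np.sin(x) + 0.3 * rng.normal(size=np.size(x))]    # noisy

F = np.column_stack([m(x) for m in models] + [np.ones_like(x)])   # outputs + constant
w, *_ = np.linalg.lstsq(F, y, rcond=None)                         # MSE-optimal weights

combo = F @ w
for name, pred in [("best single model", models[1](x)), ("MSE-OLC", combo)]:
    print(name, "MSE:", round(np.mean((pred - y) ** 2), 4))
```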

Journal Article•DOI•
TL;DR: The authors formulate and prove a PE condition on both the system state parameters and control inputs, and study affine RBF network identification that is important for affine nonlinear system control.
Abstract: Considers radial basis function (RBF) network approximation of a multivariate nonlinear mapping as a linear parametric regression problem. Linear recursive identification algorithms applied to this problem are known to converge, provided the regressor vector sequence has the persistency of excitation (PE) property. The main contribution of this paper is the formulation and proof of PE conditions on the input variables. In the RBF network identification, the regressor vector is a nonlinear function of these input variables. According to the formulated condition, the inputs provide PE if they belong to domains around the network node centers. For a two-input network with Gaussian RBFs that have typical width and are centered on a regular mesh, these domains cover about 25% of the input domain volume. The authors further generalize the proposed solution of the standard RBF network identification problem and study affine RBF network identification that is important for affine nonlinear system control. For the affine RBF network, the authors formulate and prove a PE condition on both the system state parameters and control inputs.

Journal Article•DOI•
TL;DR: A new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer, and which yields accuracy and convergence rates that are orders of magnitude superior to conventional backpropagation learning.
Abstract: Multilayer perceptrons are successfully used in an increasing number of nonlinear signal processing applications. The backpropagation learning algorithm, or variations thereof, is the standard method applied to the nonlinear optimization problem of adjusting the weights in the network in order to minimize a given cost function. However, backpropagation as a steepest descent approach is too slow for many applications. In this paper a new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer. In order to limit the introduced linearization error a penalty term is added to the cost function. The new learning algorithm is applied to the problem of nonlinear prediction of chaotic time series. The proposed algorithm yields accuracy and convergence rates that are orders of magnitude superior to conventional backpropagation learning.

Journal Article•DOI•
TL;DR: A neural-network classifier for detecting vascular structures in angiograms was developed and demonstrated its superiority in classification performance and was equivalent to a generalized matched filter with a nonlinear decision tree.
Abstract: A neural-network classifier for detecting vascular structures in angiograms was developed. The classifier consisted of a multilayer feedforward network window in which the center pixel was classified using gray-scale information within the window. The network was trained by using the backpropagation algorithm with the momentum term. Based on this image segmentation problem, the effect of changing network configuration on the classification performance was also characterized. Factors including topology, rate parameters, training sample set, and initial weights were systematically analyzed. The training set consisted of 75 selected points from a 256×256 digitized cineangiogram. While different network topologies showed no significant effect on performance, both the learning process and the classification performance were sensitive to the rate parameters. In a comparative study, the network demonstrated its superiority in classification performance. It was also shown that the trained neural-network classifier was equivalent to a generalized matched filter with a nonlinear decision tree.

Journal Article•DOI•
TL;DR: A decision-based neural network (DBNN) is proposed, which combines a perceptron-like learning rule with a hierarchical nonlinear network structure; its effectiveness is confirmed by simulations conducted for several applications, including texture classification, OCR, and ECG analysis.
Abstract: Supervised learning networks based on a decision-based formulation are explored. More specifically, a decision-based neural network (DBNN) is proposed, which combines the perceptron-like learning rule and hierarchical nonlinear network structure. The decision-based mutual training can be applied to both static and temporal pattern recognition problems. For static pattern recognition, two hierarchical structures are proposed: hidden-node and subcluster structures. The relationships between DBNN's and other models (linear perceptron, piecewise-linear perceptron, LVQ, and PNN) are discussed. As to temporal DBNN's, model-based discriminant functions may be chosen to compensate for possible temporal variations, such as waveform warping and alignments. Typical examples include DTW distance, prediction error, or likelihood functions. For classification applications, DBNN's are very effective in computation time and performance. This is confirmed by simulations conducted for several applications, including texture classification, OCR, and ECG analysis.

Journal Article•DOI•
TL;DR: A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning, which achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP).
Abstract: A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning. Applications include spatio-temporal image understanding and prediction and 3D object recognition from a series of ambiguous 2D views. The architecture, called ART-EMAP, achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP). ART-EMAP extends the capabilities of fuzzy ARTMAP in four incremental stages. Stage 1 introduces distributed pattern representation at a view category field. Stage 2 adds a decision criterion to the mapping between view and object categories, delaying identification of ambiguous objects when faced with a low confidence prediction. Stage 3 augments the system with a field where evidence accumulates in medium-term memory. Stage 4 adds an unsupervised learning process to fine-tune performance after the limited initial period of supervised network training. Each ART-EMAP stage is illustrated with a benchmark simulation example, using both noisy and noise-free data.

Journal Article•DOI•
TL;DR: A new technique, called guided evolutionary simulated annealing (GESA), is proposed, which incorporates the idea of simulated annealing into the practice of simulated evolution in place of arbitrary heuristics and is used primarily for combinatorial optimization.
Abstract: Feasible approaches to the task of solving NP-complete problems usually entail the incorporation of heuristic procedures so as to increase the efficiency of the methods used. We propose a new technique, which incorporates the idea of simulated annealing into the practice of simulated evolution, in place of arbitrary heuristics. The proposed technique is called guided evolutionary simulated annealing (GESA). We report on the use of the GESA approach primarily for combinatorial optimization. In addition, we report the case of function optimization, treating the task as a search problem. The traveling salesman problem is taken as a benchmark problem in the first case. Simulation results are reported. The results show that the GESA approach can discover a very good near optimum solution after examining an extremely small fraction of possible solutions. A very complicated function with many local minima is used in the second case. The results in both cases indicate that the GESA technique is a practicable method which yields consistent and good near optimal solutions, superior to simulated evolution.

Journal Article•DOI•
TL;DR: It is found that the boundedness condition on the sigmoidal function plays an essential role in the approximation, in contrast to continuity or monotonicity conditions.
Abstract: In this paper, we investigate the capability of approximating functions in C(R̄^n) by three-layered neural networks with a sigmoidal function in the hidden layer. It is found that the boundedness condition on the sigmoidal function plays an essential role in the approximation, in contrast to continuity or monotonicity conditions. We point out that, in order to prove the approximation capability of the neural network in the n-dimensional case, all one needs to do is to prove the case for one dimension. The approximation in the L^p norm (1 ≤ p < ∞) ...