
Showing papers on "Recurrent neural network" published in 1997


Journal ArticleDOI
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations
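
As a concrete illustration of the mechanism described above, the following NumPy sketch implements one step of a 1997-style LSTM cell (input and output gates only; the forget gate came later). It is a minimal sketch, not the paper's exact formulation: the tanh squashing functions, the sizes, and the random weights are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W):
    """One step of a 1997-style LSTM cell (input and output gates only).

    The cell state c is the "constant error carousel": it is updated
    additively through a self-connection of fixed weight 1.0, so error
    can flow across long time lags without decaying.
    """
    z = np.concatenate([x, h])               # combined input at this step
    i = sigmoid(W["i"] @ z + W["bi"])        # input gate: opens write access
    o = sigmoid(W["o"] @ z + W["bo"])        # output gate: opens read access
    g = np.tanh(W["g"] @ z + W["bg"])        # candidate cell input
    c = c + i * g                            # CEC: additive update, weight 1.0
    h = o * np.tanh(c)                       # gated output
    return h, c

# Hypothetical sizes: 4 inputs, 3 memory cells.
rng = np.random.default_rng(0)
nx, nh = 4, 3
W = {k: rng.normal(scale=0.1, size=(nh, nx + nh)) for k in ("i", "o", "g")}
W.update({f"b{k}": np.zeros(nh) for k in ("i", "o", "g")})
h, c = np.zeros(nh), np.zeros(nh)
for t in range(1000):                        # a 1000-step lag poses no decay problem
    h, c = lstm_step(rng.normal(size=nx), h, c, W)
```

Because the only path through time is the additive update of c, the backpropagated error along that path is multiplied by 1.0 at every step, which is the constant error flow the abstract refers to.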


Journal ArticleDOI
TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
Abstract: In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.

7,290 citations
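
The bidirectional idea lends itself to a short sketch: two independent RNNs scan the sequence in the positive and negative time directions, and each output frame combines the hidden states from both, so no preset future-frame limit applies. This is a minimal illustration with hypothetical sizes and untrained weights, not the paper's training procedure.

```python
import numpy as np

def rnn_pass(xs, Wx, Wh, b):
    """Run a simple tanh RNN over a sequence, returning all hidden states."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h + b)
        states.append(h)
    return states

def brnn(xs, fwd, bwd, Wo):
    """Bidirectional pass: forward states summarize the past, backward
    states summarize the future; each output frame sees both."""
    hf = rnn_pass(xs, *fwd)                      # positive time direction
    hb = rnn_pass(xs[::-1], *bwd)[::-1]          # negative time direction
    return [Wo @ np.concatenate([f, b]) for f, b in zip(hf, hb)]

# Hypothetical sizes: 5-frame sequence, 4-dim inputs, 3 hidden units each way.
rng = np.random.default_rng(1)
nx, nh, ny, T = 4, 3, 2, 5
make = lambda: (rng.normal(size=(nh, nx)), rng.normal(size=(nh, nh)), np.zeros(nh))
ys = brnn([rng.normal(size=nx) for _ in range(T)], make(), make(),
          rng.normal(size=(ny, 2 * nh)))
```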


Journal ArticleDOI
TL;DR: Examples of the use of multilayer feed-forward neural networks for the prediction of carbon-13 NMR chemical shifts of alkanes are given, and the advantages and disadvantages of multilayer feed-forward neural networks are discussed.

1,206 citations


Journal ArticleDOI
TL;DR: It is shown that neural networks can, in fact, represent and classify structured patterns, and that all the supervised networks developed for the classification of sequences can, on the whole, be generalized to structures.
Abstract: Standard neural networks and statistical methods are usually believed to be inadequate when dealing with complex structures because of their feature-based approach. In fact, feature-based approaches usually fail to give satisfactory solutions because of the sensitivity of the approach to the a priori selection of the features, and the incapacity to represent any specific information on the relationships among the components of the structures. However, we show that neural networks can, in fact, represent and classify structured patterns. The key idea underpinning our approach is the use of the so-called "generalized recursive neuron", which is essentially a generalization to structures of a recurrent neuron. By using generalized recursive neurons, all the supervised networks developed for the classification of sequences, such as backpropagation through time networks, real-time recurrent networks, simple recurrent networks, recurrent cascade correlation networks, and neural trees can, on the whole, be generalized to structures. The results obtained by some of the above networks (with generalized recursive neurons) on the classification of logic terms are presented.

569 citations


Journal ArticleDOI
01 Apr 1997
TL;DR: It is constructively proved that the NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks and thus Turing machines, raising the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power.
Abstract: Recently, fully connected recurrent neural networks have been proven to be computationally rich: at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have a limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Ψ(u(t−n_u), …, u(t−1), u(t), y(t−n_y), …, y(t−1)), where u(t) and y(t) represent the input and output of the network at time t, n_u and n_y are the input and output orders, and the function Ψ is the mapping performed by a multilayer perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and thus Turing machines. We conclude that in theory one can use NARX models rather than conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power.

462 citations
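
The formalization above translates directly into code. The sketch below runs a NARX network forward: tapped delay lines hold the last n_u inputs and n_y outputs, and an MLP (here one hidden layer with hypothetical, untrained weights) plays the role of Ψ.

```python
import numpy as np

def narx_step(u_hist, y_hist, u_t, mlp):
    """One NARX update: y(t) = Psi(u(t-nu), ..., u(t), y(t-ny), ..., y(t-1)),
    where Psi is a one-hidden-layer MLP."""
    z = np.concatenate([u_hist, [u_t], y_hist])   # contents of the delay lines
    W1, b1, W2, b2 = mlp
    return W2 @ np.tanh(W1 @ z + b1) + b2

# Hypothetical orders n_u = n_y = 2, scalar input/output, 5 hidden units.
rng = np.random.default_rng(2)
nu, ny, nh = 2, 2, 5
mlp = (rng.normal(size=(nh, nu + 1 + ny)), np.zeros(nh),
       rng.normal(size=(1, nh)), np.zeros(1))
u_hist, y_hist = np.zeros(nu), np.zeros(ny)
for t in range(100):
    u_t = np.sin(0.1 * t)
    y_t = narx_step(u_hist, y_hist, u_t, mlp)
    u_hist = np.append(u_hist[1:], u_t)           # shift the input delay line
    y_hist = np.append(y_hist[1:], y_t)           # feed the output back
```

Note how the only recurrence is the output delay line y_hist; there is no feedback from hidden states, which is exactly the limited-feedback property the paper analyzes.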


Book
20 Feb 1997
TL;DR: Part I covers the fundamentals of pattern recognition, including basic concepts of pattern recognition, decision-theoretic algorithms, and structural pattern recognition.
Abstract: Part I FUNDAMENTALS OF PATTERN RECOGNITION 0. Basic Concepts of Pattern Recognition 1. Decision Theoretic Algorithms 2. Structural Pattern Recognition Part II INTRODUCTORY NEURAL NETWORKS 3. Artificial Neural Network Structures 4. Supervised Training via Error Backpropagation: Derivations 5. Acceleration and Stabilization of Supervised Gradient Training of MLPs Part III ADVANCED FUNDAMENTALS OF NEURAL NETWORKS 6. Supervised Training via Strategic Search 7. Advances in Network Algorithms for Recognition 8. Using Hopfield Recurrent Neural Networks Part IV NEURAL, FEATURE, AND DATA ENGINEERING 9. Neural Engineering and Testing of FANNs 10. Feature and Data Engineering

375 citations


Journal ArticleDOI
TL;DR: An adaptive observer for a class of single-input single-output (SISO) nonlinear systems is proposed using a generalized dynamic recurrent neural network (DRNN) that is tuned on-line, with no off-line learning required.

232 citations


Journal ArticleDOI
TL;DR: It is shown that the unique fixed point of the TCNN can actually evolve into a snap-back repeller, which generates chaotic structure, if several conditions are satisfied, and that the obtained theoretical results hold for a wide class of discrete-time neural networks.

163 citations
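
For intuition about how a decaying self-feedback term can produce transient chaos around a fixed point, here is a single-neuron sketch loosely in the style of Aihara-type chaotic neuron dynamics. The update form and every parameter value are assumptions for illustration only; they are not the paper's model or its snap-back repeller conditions.

```python
import numpy as np

def simulate_chaotic_neuron(steps=500, k=0.9, beta=0.01, z0=0.08,
                            I0=0.65, I=0.2, eps=0.02):
    """Single neuron with an annealed self-feedback term z(t).
    While z is large the map can wander irregularly near its fixed
    point; as z decays the dynamics settles down (all values are
    illustrative assumptions)."""
    y, z = 0.0, z0
    xs = []
    for _ in range(steps):
        x = 1.0 / (1.0 + np.exp(-y / eps))   # steep sigmoid output
        y = k * y - z * (x - I0) + I         # internal state update
        z = (1.0 - beta) * z                 # self-feedback decays over time
        xs.append(x)
    return xs

trace = simulate_chaotic_neuron()            # early part irregular, tail settles
```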


Journal ArticleDOI
TL;DR: The simulation results of this paper indicate that recurrent networks filter noise more successfully than feedforward networks in small as well as large samples.

138 citations


Journal ArticleDOI
TL;DR: It is shown that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction.
Abstract: Recurrent neural networks have become popular models for system identification and time series prediction. Nonlinear autoregressive models with exogenous inputs (NARX) neural network models are a popular subclass of recurrent networks and have been used in many applications. Although embedded memory can be found in all recurrent network models, it is particularly prominent in NARX models. We show that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction.

131 citations


Proceedings ArticleDOI
01 Jul 1997
TL;DR: A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed in this paper; its combined structure and parameter learning forms a fast algorithm for building a small yet powerful dynamic neural fuzzy network.
Abstract: A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed in this paper. The RSONFIN is constructed from a series of dynamic fuzzy rules. The temporal relations embedded in the network are built by adding some feedback connections representing the memory elements to a feedforward neural fuzzy network. Each weight as well as node in the RSONFIN has its own meaning and represents a special element in a fuzzy rule. There are no hidden nodes (i.e., no membership functions and fuzzy rules) initially in the RSONFIN. They are created online via concurrent structure identification (the construction of dynamic fuzzy if-then rules) and parameter identification (the tuning of the free parameters of membership functions). The structure learning together with the parameter learning forms a fast learning algorithm for building a small, yet powerful, dynamic neural fuzzy network. Simulations on temporal problems are performed.

Book
01 Jan 1997
TL;DR: A brief introduction to the field is given as well as an implementation of automatic neural network generation using genetic programming, a step towards automation in architecture generation.
Abstract: This paper reports the application of evolutionary computation in the automatic generation of a neural network architecture. It is a usual practice to use trial and error to find a suitable neural network architecture. This is not only time consuming but may not generate an optimal solution for a given problem. The use of evolutionary computation is a step towards automation in architecture generation. In this paper a brief introduction to the field is given as well as an implementation of automatic neural network generation using genetic programming.

Journal ArticleDOI
TL;DR: Three recurrent neural networks are presented for computing the pseudoinverses of rank-deficient matrices; the first has a dynamical equation similar to the one proposed earlier for matrix inversion and is capable of Moore-Penrose inversion under the condition of zero initial states.
Abstract: Three recurrent neural networks are presented for computing the pseudoinverses of rank-deficient matrices. The first recurrent neural network has a dynamical equation similar to the one proposed earlier for matrix inversion and is capable of Moore-Penrose inversion under the condition of zero initial states. The second recurrent neural network consists of an array of neurons corresponding to a pseudoinverse matrix with decaying self-connections and constant connections in each row or column. The third recurrent neural network consists of two layers of neuron arrays corresponding, respectively, to a pseudoinverse matrix and a Lagrangian matrix with constant connections. All three recurrent neural networks are also composed of a number of independent subnetworks corresponding to the rows or columns of a pseudoinverse. The proposed recurrent neural networks are shown to be capable of computing the pseudoinverses of rank-deficient matrices.
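
A discrete-time caricature of why zero initial states matter: gradient flow on the residual A X − I, started from X = 0, keeps the iterates in the row space of A and therefore converges to the Moore-Penrose pseudoinverse even when A is rank-deficient. This is a sketch of the idea, not the paper's exact dynamical equations.

```python
import numpy as np

def pseudoinverse_net(A, steps=2000):
    """Discretized gradient flow dX/dt = -A^T (A X - I) for computing A^+.
    With X(0) = 0 every column stays in range(A^T), so the flow converges
    to the minimum-norm least-squares solution, i.e. the pseudoinverse,
    even for rank-deficient A."""
    m, n = A.shape
    X = np.zeros((n, m))                       # zero initial states: essential
    eta = 1.0 / np.linalg.norm(A, 2) ** 2      # step size below 2 / sigma_max^2
    for _ in range(steps):
        X -= eta * A.T @ (A @ X - np.eye(m))
    return X

A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])   # rank 1: rank-deficient
print(np.allclose(pseudoinverse_net(A), np.linalg.pinv(A), atol=1e-6))  # True
```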

Journal ArticleDOI
TL;DR: It is shown that all the existing RNN architectures can be considered as special cases of the general RNN architectures, and how these existing architectures can be transformed to the general RNN architectures.

Proceedings Article
01 Dec 1997
TL;DR: If an object has a continuous family of instantiations, it should be represented by a continuous attractor, and this idea is illustrated with a network that learns to complete patterns.
Abstract: One approach to invariant object recognition employs a recurrent neural network as an associative memory. In the standard depiction of the network's state space, memories of objects are stored as attractive fixed points of the dynamics. I argue for a modification of this picture: if an object has a continuous family of instantiations, it should be represented by a continuous attractor. This idea is illustrated with a network that learns to complete patterns. To perform the task of filling in missing information, the network develops a continuous attractor that models the manifold from which the patterns are drawn. From a statistical viewpoint, the pattern completion task allows a formulation of unsupervised learning in terms of regression rather than density estimation.

Journal ArticleDOI
TL;DR: Application of recurrent neural networks in the design of an adaptive power system stabilizer (PSS) is presented in this paper, and simulation studies show that the artificial neural network (ANN) based PSS can provide very good damping over a wide range of operating conditions.
Abstract: Application of recurrent neural networks in the design of an adaptive power system stabilizer (PSS) is presented in this paper. The architecture of the proposed adaptive PSS has two recurrent neural networks. One functions as a tracker to learn the dynamic characteristics of the power plant and the second one functions as a controller to damp the oscillations caused by the disturbances. In the proposed approach, the weights of the neural networks are updated on-line. Therefore, any new information available during actual control of the plant is considered. Simulation studies show that the artificial neural network (ANN) based PSS can provide very good damping over a wide range of operating conditions.

Journal ArticleDOI
TL;DR: Through simulations, the dynamical behavior of CNFCMs is presented, and their inference capabilities are illustrated in comparison to those of the classical FCM by means of an example.

Book
01 Sep 1997
TL;DR: An introduction to fuzzy systems, neural networks, and genetic algorithms, together with a new approach to neurofuzzy learning algorithms.
Abstract: Foreword P.P. Wang. Editor's Preface Da Ruan. Part 1: Basic Principles and Methodologies. 1. Introduction to Fuzzy Systems, Neural Networks, and Genetic Algorithms H. Takagi. 2. A Fuzzy Neural Network for Approximate Fuzzy Reasoning L.P. Maguire, et al. 3. Novel Neural Algorithms for Solving Fuzzy Relation Equations Xiaozhong Li, Da Ruan. 4. Methods for Simplification of Fuzzy Models U. Kaymak, et al. 5. A New Approach of Neurofuzzy Learning Algorithm M. Mizumoto, Yan Shi. Part 2: Data Analysis and Information Systems. 6. Neural Networks in Intelligent Data Analysis Xiaohui Liu. 7. Data-Driven Identification of Key Variables Bo Yuan, G. Klir. 8. Applications of Intelligent Techniques in Process Analysis J. Angstenberger, R. Weber. 9. Neurofuzzy-Chaos Engineering for Building Intelligent Adaptive Information Systems N.K. Kasabov, R. Kozma. 10. A Sequential Training Strategy for Locally Recurrent Neural Networks Jie Zhang, A.J. Morris. Part 3: Nonlinear Systems and System Identification. 11. Adaptive Genetic Programming for System Identification A. Bastian. 12. Nonlinear System Identification with Neurofuzzy Methods O. Nelles. 13. A Genetic Algorithm for Mixed-Integer Optimisation in Power and Water System Design and Control Kai Chen, et al. 14. Soft Computing Based Signal Prediction, Restoration, and Filtering E. Uchino, T. Yamakawa. Subject Index.

Journal ArticleDOI
TL;DR: A novel approach to learning in recurrent neural networks (RNNs) that exploits the principle of discriminative learning, minimizing an error functional that is a direct measure of the classification error.
Abstract: Neural networks (NNs) have been extensively applied to many signal processing problems. In particular, due to their capacity to form complex decision regions, NNs have been successfully used in adaptive equalization of digital communication channels. The mean square error (MSE) criterion, which is usually adopted in neural learning, is not directly related to the minimization of the classification error, i.e., bit error rate (BER), which is of interest in channel equalization. Moreover, common gradient-based learning techniques are often characterized by slow speed of convergence and numerical ill conditioning. In this paper, we introduce a novel approach to learning in recurrent neural networks (RNNs) that exploits the principle of discriminative learning, minimizing an error functional that is a direct measure of the classification error. The proposed method extends to RNNs a technique applied with success to fast learning of feedforward NNs and is based on the descent of the error functional in the space of the linear combinations of the neurons (the neuron space); its main features are higher speed of convergence and better numerical conditioning w.r.t. gradient-based approaches, whereas numerical stability is assured by the use of robust least squares solvers. Experiments regarding the equalization of PAM signals in different transmission channels are described, which demonstrate the effectiveness of the proposed approach.

Proceedings ArticleDOI
23 Mar 1997
TL;DR: A novel hybrid neural network algorithm for noisy time series prediction is presented which exhibits excellent performance on the problem and permits the inference and extraction of rules.
Abstract: The paper considers the prediction of noisy time series data, specifically, the prediction of foreign exchange rate data. A novel hybrid neural network algorithm for noisy time series prediction is presented which exhibits excellent performance on the problem. The method is motivated by consideration of how neural networks work, and by fundamental difficulties with random correlations when dealing with small sample sizes and high noise data. The method permits the inference and extraction of rules. One of the greatest complaints against neural networks is that it is hard to figure out exactly what they are doing; this work provides one answer for the internal workings of the network. Furthermore, these rules can be used to gain insight into both the real world system and the predictor. The paper focuses on noisy time series prediction and rule inference; use of the system in trading would typically involve the utilization of other financial indicators and domain knowledge.

Journal ArticleDOI
TL;DR: It is shown that the Wiener-type cascade dynamical model, in which a simple linear plant is used as the dynamic subsystem and a three-layer feedforward artificial neural network is employed as the nonlinear static subsystem, can uniformly approximate a continuous trajectory of a general nonlinear dynamical system with arbitrarily high precision on a compact time domain.
Abstract: In this article we first show that the Wiener-type cascade dynamical model, in which a simple linear plant is used as the dynamic subsystem and a three-layer feedforward artificial neural network is employed as the nonlinear static subsystem, can uniformly approximate a continuous trajectory of a general nonlinear dynamical system with arbitrarily high precision on a compact time domain. We then report some successful simulation results, obtained by training the neural network using a model-reference adaptive control method, for identification of continuous-time and discrete-time chaotic systems, including the typical Duffing, Henon, and Lozi systems. This Wiener-type cascade structure is believed to have great potential for chaotic dynamics identification, control, and synchronization.
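
Structurally, the model is simple to write down: a linear dynamic subsystem feeds a static feedforward network. The sketch below uses a first-order filter as the linear plant and an untrained one-hidden-layer net as the static nonlinearity; the filter coefficients, sizes, and weights are illustrative assumptions (the paper trains the network via model-reference adaptive control, which is omitted here).

```python
import numpy as np

def wiener_cascade(u_seq, a, b, net):
    """Wiener-type cascade: a linear dynamic plant followed by a static
    feedforward network. The plant is the first-order filter
    x(t+1) = a*x(t) + b*u(t); the MLP maps the filtered state to y(t)."""
    W1, b1, W2, b2 = net
    x, ys = 0.0, []
    for u in u_seq:
        x = a * x + b * u                       # linear dynamic subsystem
        y = W2 @ np.tanh(W1 * x + b1) + b2      # static nonlinear subsystem
        ys.append(y.item())
    return ys

# Hypothetical, untrained weights just to show the structure.
rng = np.random.default_rng(3)
net = (rng.normal(size=5), np.zeros(5), rng.normal(size=(1, 5)), np.zeros(1))
ys = wiener_cascade(np.sin(0.3 * np.arange(50)), a=0.8, b=0.2, net=net)
```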

Journal ArticleDOI
TL;DR: The PRNN-based predictor presented in this paper is shown to be promising and practically feasible in obtaining the best adaptive prediction of real-time MPEG video traffic.
Abstract: This paper investigates the application of a pipelined recurrent neural network (PRNN) to the adaptive traffic prediction of MPEG video signals over dynamic ATM networks. The traffic signal of each picture type (I, P, and B) of MPEG video is characterized by a general nonlinear autoregressive moving average (NARMA) process. Moreover, a minimum mean-squared error predictor based on the NARMA model is developed to provide the best prediction for the video traffic signal. However, the explicit functional expression of the best mean-squared error predictor is actually unknown. To tackle this difficulty, a PRNN consisting of a number of simpler, small-scale recurrent neural network (RNN) modules with lower computational complexity is constructed to introduce the best nonlinear approximation capability into the minimum mean-squared error predictor model. This allows the future behavior of MPEG video traffic to be accurately predicted over a relatively short time period, based on adaptive learning for each module from previous measurement data, so that faster and more accurate control action can be taken to avoid the effects of excessive load. Since the modules of the PRNN can be run simultaneously in a pipelined parallel fashion, this leads to a significant improvement in the total computational efficiency of the PRNN. To further improve the convergence of the adaptive algorithm for the PRNN, a learning-rate annealing schedule is proposed to accelerate the adaptive learning process. Another advantage of the PRNN-based predictor is its generalization from learning, which is useful in a dynamic environment for MPEG video traffic prediction in ATM networks where observations may be incomplete, delayed, or partially available. The PRNN-based predictor presented in this paper is shown to be promising and practically feasible in obtaining the best adaptive prediction of real-time MPEG video traffic.

Journal ArticleDOI
TL;DR: Experimental results have confirmed that the proposed recurrent neural network improves discrimination and generalization powers in the recognition of visual patterns.
Abstract: We propose a new type of recurrent neural-network architecture, in which each output unit is connected to itself and is also fully connected to other output units and all hidden units. The proposed recurrent neural network differs from Jordan's and Elman's recurrent neural networks with respect to function and architecture, because it was originally extended from a plain multilayer feedforward neural network in order to improve discrimination and generalization powers. We also prove the convergence properties of the learning algorithm in the proposed recurrent neural network, and analyze the performance of the proposed recurrent neural network by performing recognition experiments with the totally unconstrained handwritten numeric database of Concordia University, Montreal, Canada. Experimental results have confirmed that the proposed recurrent neural network improves discrimination and generalization powers in the recognition of visual patterns.
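
The described wiring can be sketched in a few lines: the hidden layer stays feedforward, while the output layer receives connections from the hidden units and from the previous output vector (self connections and cross connections among output units). All sizes and weights below are hypothetical, and the softmax output is an assumption for a classification setting.

```python
import numpy as np

def step(x, y_prev, Wxh, Why, Wyy, bh, by):
    """One step of the described architecture: an ordinary hidden layer,
    plus an output layer with recurrent connections from itself (self and
    cross connections among output units) and from all hidden units."""
    h = np.tanh(Wxh @ x + bh)                     # feedforward hidden layer
    a = Why @ h + Wyy @ y_prev + by               # recurrent output layer
    return np.exp(a) / np.exp(a).sum()            # softmax class scores

rng = np.random.default_rng(4)
nx, nh, ny = 8, 6, 10                             # e.g., 10 digit classes
Wxh, Why, Wyy = (rng.normal(scale=0.1, size=s)
                 for s in [(nh, nx), (ny, nh), (ny, ny)])
y = np.zeros(ny)
for x in rng.normal(size=(5, nx)):                # a short input sequence
    y = step(x, y, Wxh, Why, Wyy, np.zeros(nh), np.zeros(ny))
```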

Book ChapterDOI
04 Jun 1997
TL;DR: A modification of Pollack's RAAM is presented, called a Recursive Hetero-Associative Memory (RHAM), and it is shown that it is capable of learning simple translation tasks, by building a state-space representation of each input string and unfolding it to obtain the corresponding output string.
Abstract: This paper presents a modification of Pollack's RAAM (Recursive Auto-Associative Memory), called a Recursive Hetero-Associative Memory (RHAM), and shows that it is capable of learning simple translation tasks, by building a state-Space representation of each input string and unfolding it to obtain the corresponding output string. RHAM-based translators are computationally more powerful and easier to train than their corresponding double-RAAM counterparts in the literature.

Book ChapterDOI
22 Aug 1997
TL;DR: This paper proposes recognizing logo images by using an adaptive model referred to as recursive artificial neural network, which contains the topological structured information of logo and continuous values pertaining to each contour node in the contour-tree representation of logo image.
Abstract: In this paper we propose recognizing logo images by using an adaptive model referred to as recursive artificial neural network. At first, logo images are converted into a structured representation based on contour trees. Recursive neural networks are then learnt using the contourtrees as inputs to the neural nets. On the other hand, the contour-tree is constructed by associating a node with each exterior or interior contour extracted from the logo instance. Nodes in the tree are labeled by a feature vector, which describes the contour by means of its perimeter, surrounded area, and a synthetic representation of its curvature plot. The contour-tree representation contains the topological structured information of logo and continuous values pertaining to each contour node. Hence symbolic and sub-symbolic information coexist in the contour-tree representation of logo image. Experimental results are reported on 40 real logos distorted with artificial noise and performance of recursive neural network is compared with another two types of neural approaches.

Journal ArticleDOI
TL;DR: This paper presents two recurrent neural networks for solving the assignment problem, the primal and dual assignment networks; both are guaranteed to make optimal assignments, and the dual network is even simpler in architecture than the primal network.
Abstract: This paper presents two recurrent neural networks for solving the assignment problem. Simplifying the architecture of a recurrent neural network based on the primal assignment problem, the first recurrent neural network, called the primal assignment network, has less complex connectivity than its predecessor. The second recurrent neural network, called the dual assignment network, based on the dual assignment problem, is even simpler in architecture than the primal assignment network. The primal and dual assignment networks are guaranteed to make optimal assignment. The applications of the primal and dual assignment networks for sorting and shortest-path routing are discussed. The performance and operating characteristics of the dual assignment network are demonstrated by means of illustrative examples.

Proceedings ArticleDOI
20 Apr 1997
TL;DR: A method of stable motion generation for a biped locomotion robot using recurrent neural networks trained by genetic algorithms with a self-adaptive mutation operator; it is verified that the calculated stable motion trajectory can be successfully applied to practical biped locomotion.
Abstract: The purpose of this research is to generate natural motion of a biped locomotion robot, such as human-like walking, in various environments. In this paper, we propose a method of stable motion generation for a biped locomotion robot. We apply the proposed control method using eight force sensors at the soles of the biped locomotion robot. The zero moment point (ZMP) is a well-known stability index for walking robots. The ZMP is determined by the configuration of the robot; however, many configurations correspond to a given ZMP. Therefore, when we use the ZMP as the stability index, we must select the best configuration from among the many stable configurations, and the problem is which configuration to select. In this paper, we solve this problem with recurrent neural networks. In both the single-support and double-support periods, we calculate the position of the ZMP using the values from the four force sensors at each sole, and the actuated joint angles can be determined by recurrent neural networks without the ZMP moving out of the support area of the sole. We employ recurrent neural networks with genetic algorithms for learning capability and a self-adaptive mutation operator. Furthermore, we built a trial biped locomotion robot with 13 joints and verified that the calculated stable motion trajectory can be successfully applied to practical biped locomotion.
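
For the sensing step, the ZMP can be estimated from the sole force sensors as the force-weighted centroid of the sensor positions, which is presumably how the four readings per sole are used. The sketch below, with hypothetical sensor coordinates and force readings, illustrates only that computation, not the recurrent network or the genetic algorithm.

```python
import numpy as np

def zmp_from_force_sensors(positions, forces):
    """Zero moment point from sole force sensors: the ZMP (x, y) is the
    force-weighted centroid of the sensor positions, using the normal
    force measured at each sensor."""
    f = np.asarray(forces, dtype=float)           # (N,) normal forces
    p = np.asarray(positions, dtype=float)        # (N, 2) sensor coordinates
    return (f[:, None] * p).sum(axis=0) / f.sum()

# Four sensors at the sole corners (metres); hypothetical readings (newtons).
corners = [(0.00, 0.00), (0.12, 0.00), (0.00, 0.06), (0.12, 0.06)]
zmp = zmp_from_force_sensors(corners, [80.0, 120.0, 60.0, 90.0])
print(zmp)  # stability requires this point to stay inside the support area
```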

Journal ArticleDOI
TL;DR: The paper discusses how Narendra's (1990, 1991) dynamic backpropagation procedure, which is used for identifying recurrent neural networks from I/O measurements, can be modified with an NL_q stability constraint in order to ensure globally asymptotically stable identified models.
Abstract: It is known that many discrete-time recurrent neural networks, such as neural state space models, multilayer Hopfield networks, and locally recurrent globally feedforward neural networks, can be represented as NL_q systems. Sufficient conditions for global asymptotic stability and input/output stability of NL_q systems are available, including three types of criteria: (1) diagonal scaling; (2) criteria depending on diagonal dominance; (3) condition number factors of certain matrices. The paper discusses how Narendra's (1990, 1991) dynamic backpropagation procedure, which is used for identifying recurrent neural networks from I/O measurements, can be modified with an NL_q stability constraint in order to ensure globally asymptotically stable identified models. An example illustrates how system identification of an internally stable model corrupted by process noise may lead to unwanted limit cycle behavior, and how this problem can be avoided by adding the stability constraint.

Journal ArticleDOI
TL;DR: Case studies dealing with fuel alcohol production using renewable biomass from agricultural wastes by fermentation with Zymomonas mobilis and recombinant Escherichia coli, and preliminary results for the production of monoclonal antibody using hybridoma cells are examined, indicating that RBF networks are unsuitable when extrapolation is desired.

Journal ArticleDOI
TL;DR: This work develops artificial neural network group theory, then shows how neural network groups are able to approximate any kind of piecewise continuous function to any degree of accuracy, as illustrated by way of an ANN expert system for rainfall estimation.