
Showing papers on "Recurrent neural network" published in 2007


Journal ArticleDOI
TL;DR: Three different uses of a recurrent neural network as a reservoir that is not trained but instead read out by a simple external classification layer are compared, and a new measure for the reservoir dynamics based on Lyapunov exponents is introduced.

930 citations
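
A minimal sketch of the reservoir scheme summarized above, assuming a random tanh reservoir read out by ridge regression; the sizes, the spectral-radius scaling of 0.9 and the delayed-input toy task are illustrative choices, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200

# Random, untrained reservoir; scale the recurrent weights to a target
# spectral radius so the dynamics stay near the edge of stability.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def run_reservoir(u):
    """Drive the reservoir with input sequence u (T, n_in); collect states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ u_t)
        states.append(x.copy())
    return np.array(states)                # (T, n_res)

# Only the linear readout is trained (ridge regression).
u = rng.uniform(-1, 1, (1000, n_in))
y = np.roll(u[:, 0], 3)                    # toy target: input delayed by 3 steps
X = run_reservoir(u)
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```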


Journal ArticleDOI
TL;DR: Stability conditions are presented, a stochastic gradient descent method is introduced, and the usefulness of leaky-integrator ESNs is demonstrated for learning very slow dynamical systems and replaying the learned system at different speeds.

740 citations
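
One common formulation of the leaky-integrator update behind these ESNs (a sketch of the general idea; the paper's exact parameterization may differ):

```latex
% Leaky-integrator ESN state update, with leak rate \alpha \in (0, 1]:
x(n+1) = (1 - \alpha)\, x(n) + \alpha \tanh\!\big(W x(n) + W_{\mathrm{in}} u(n+1)\big)
```

A smaller \alpha slows the effective reservoir dynamics, which is what makes such networks suitable for very slow systems; adjusting it after training is one way to replay the learned system at different speeds.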


Book ChapterDOI
16 Aug 2007
TL;DR: The Artificial Bee Colony (ABC) Algorithm, which has good exploration and exploitation capabilities in searching for an optimal weight set, is used to train neural networks.
Abstract: Training an artificial neural network is an optimization task, since the goal of the training process is to find an optimal weight set for the network. Traditional training algorithms have some drawbacks, such as getting stuck in local minima and computational complexity. Therefore, evolutionary algorithms are employed to train neural networks and overcome these issues. In this work, the Artificial Bee Colony (ABC) Algorithm, which has good exploration and exploitation capabilities in searching for an optimal weight set, is used to train neural networks.

450 citations
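
A compact sketch of the ABC loop described above, applied to the weights of a toy one-hidden-layer network; all sizes, bounds and colony constants (SN, limit) are illustrative, not the paper's experimental settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task; the flattened weight vector of a tiny one-hidden-layer
# network is the "food source" position that ABC searches over.
X = rng.uniform(-1, 1, (64, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])
n_hidden = 8
dim = 2 * n_hidden + n_hidden + n_hidden + 1       # W1, b1, w2, b2

def loss(w):
    W1 = w[:2 * n_hidden].reshape(2, n_hidden)
    b1 = w[2 * n_hidden:3 * n_hidden]
    w2 = w[3 * n_hidden:4 * n_hidden]
    pred = np.tanh(X @ W1 + b1) @ w2 + w[-1]
    return np.mean((pred - y) ** 2)

SN, limit, iters, bound = 20, 30, 200, 2.0
food = rng.uniform(-bound, bound, (SN, dim))
cost = np.array([loss(s) for s in food])
trial = np.zeros(SN, dtype=int)

def neighbor_search(i):
    """Perturb source i toward/away from a random partner in one dimension."""
    k = rng.choice([m for m in range(SN) if m != i])
    j = rng.integers(dim)
    v = food[i].copy()
    v[j] += rng.uniform(-1, 1) * (food[i][j] - food[k][j])
    c = loss(v)
    if c < cost[i]:                                  # greedy selection
        food[i], cost[i], trial[i] = v, c, 0
    else:
        trial[i] += 1

for _ in range(iters):
    for i in range(SN):                              # employed bee phase
        neighbor_search(i)
    fit = 1.0 / (1.0 + cost)                         # onlookers prefer good sources
    for i in rng.choice(SN, size=SN, p=fit / fit.sum()):
        neighbor_search(i)
    worst = np.argmax(trial)                         # scout phase: abandonment
    if trial[worst] > limit:
        food[worst] = rng.uniform(-bound, bound, dim)
        cost[worst] = loss(food[worst])
        trial[worst] = 0

print("best training MSE:", cost.min())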


Book ChapterDOI
09 Sep 2007
TL;DR: A discriminative keyword spotting system based on recurrent neural networks only, that uses information from long time spans to estimate word-level posterior probabilities of sub-word units, is presented.
Abstract: The goal of keyword spotting is to detect the presence of specific spoken words in unconstrained speech. The majority of keyword spotting systems are based on generative hidden Markov models and lack discriminative capabilities. However, existing discriminative keyword spotting systems are based on frame-level posterior probabilities of sub-word units. This paper presents a discriminative keyword spotting system based on recurrent neural networks only, which uses information from long time spans to estimate word-level posterior probabilities. In a keyword spotting task on a large database of unconstrained speech, the system achieved a keyword spotting accuracy of 84.5%.

278 citations


Journal ArticleDOI
TL;DR: It is shown that Evolino-based LSTM can solve tasks that Echo State nets cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-basedLSTM.
Abstract: In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudo-inverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM.

264 citations
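
The linear-output half of Evolino reduces to pseudo-inverse regression on the hidden states; a hedged sketch of just that step, with stand-in states in place of an actual evolved LSTM:

```python
import numpy as np

# H: hidden-state matrix from an RNN whose recurrent weights were evolved
# (T time steps x n_hidden units); Y: desired outputs (T x n_out).
rng = np.random.default_rng(0)
H = np.tanh(rng.normal(size=(500, 30)))    # stand-in for evolved RNN states
Y = rng.normal(size=(500, 2))              # stand-in targets

W_out = np.linalg.pinv(H) @ Y              # least-squares optimal linear readout
fitness = -np.mean((H @ W_out - Y) ** 2)   # residual error scores the individual
```

In the full method this residual error scores each individual in the evolutionary search over recurrent weights; replacing the regression with margin-maximizing quadratic programming yields the recurrent support vector machine variant mentioned above.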


Proceedings Article
03 Dec 2007
TL;DR: A system capable of directly transcribing raw online handwriting data is described, consisting of an advanced recurrent neural network with an output layer designed for sequence labelling, combined with a probabilistic language model.
Abstract: In online handwriting recognition the trajectory of the pen is recorded during writing. Although the trajectory provides a compact and complete representation of the written output, it is hard to transcribe directly, because each letter is spread over many pen locations. Most recognition systems therefore employ sophisticated preprocessing techniques to put the inputs into a more localised form. However, these techniques require considerable human effort, and are specific to particular languages and alphabets. This paper describes a system capable of directly transcribing raw online handwriting data. The system consists of an advanced recurrent neural network with an output layer designed for sequence labelling, combined with a probabilistic language model. In experiments on an unconstrained online database, we record excellent results using either raw or preprocessed data, well outperforming a state-of-the-art HMM-based system in both cases.

262 citations


Journal ArticleDOI
TL;DR: A novel chaotic time-series prediction method based on support vector machines (SVMs) and echo-state mechanisms is proposed; its generalization ability and robustness are obtained by a regularization operator and a robust loss function.
Abstract: A novel chaotic time-series prediction method based on support vector machines (SVMs) and echo-state mechanisms is proposed. The basic idea is replacing the "kernel trick" with a "reservoir trick" in dealing with nonlinearity, that is, performing linear support vector regression (SVR) in the high-dimensional "reservoir" state space; the solution benefits from the structural risk minimization principle, and we call the result support vector echo-state machines (SVESMs). SVESMs belong to a special kind of recurrent neural network (RNN) with a convex objective function, and their solution is global, optimal, and unique. SVESMs are especially efficient in dealing with real-life nonlinear time series, and their generalization ability and robustness are obtained by a regularization operator and a robust loss function. The method is tested on the benchmark prediction problem of the Mackey-Glass time series and applied to some real-life time series, such as the monthly sunspots series and the runoff series of the Yellow River, and the prediction results are promising.

244 citations
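
A rough sketch of the "reservoir trick": unroll an untrained echo state network over the series, then fit a linear SVR in the reservoir state space. scikit-learn's LinearSVR is used here as a modern stand-in for the paper's SVR solver; all sizes and constants are illustrative:

```python
import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.default_rng(2)
n_res = 100
W_in = rng.uniform(-0.1, 0.1, (n_res, 1))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

# Unroll the untrained reservoir over a toy noisy series.
series = np.sin(np.arange(1200) * 0.3) + 0.05 * rng.normal(size=1200)
x = np.zeros(n_res)
states = []
for u in series[:-1]:
    x = np.tanh(W @ x + W_in[:, 0] * u)
    states.append(x.copy())
states = np.array(states)

# Linear SVR in the high-dimensional state space; the epsilon-insensitive
# loss is the robust loss function the abstract refers to.
svr = LinearSVR(epsilon=0.01, C=10.0)
svr.fit(states[200:], series[201:])        # one-step-ahead target, washout = 200
print("fit score:", svr.score(states[200:], series[201:]))
```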


Posted Content
TL;DR: In this paper, multi-dimensional recurrent neural networks (MDRNNs) are proposed, extending the potential applicability of RNNs to vision, video processing, medical imaging and many other areas; experimental results are provided for image segmentation.
Abstract: Recurrent neural networks (RNNs) have proved effective at one dimensional sequence learning tasks, such as speech and online handwriting recognition. Some of the properties that make RNNs suitable for such tasks, for example robustness to input warping, and the ability to access contextual information, are also desirable in multidimensional domains. However, there has so far been no direct way of applying RNNs to data with more than one spatio-temporal dimension. This paper introduces multi-dimensional recurrent neural networks (MDRNNs), thereby extending the potential applicability of RNNs to vision, video processing, medical imaging and many other areas, while avoiding the scaling problems that have plagued other multi-dimensional models. Experimental results are provided for two image segmentation tasks.

220 citations


Book ChapterDOI
09 Sep 2007
TL;DR: Multi-dimensional recurrent neural networks are introduced, thereby extending the potential applicability of RNNs to vision, video processing, medical imaging and many other areas, while avoiding the scaling problems that have plagued other multi-dimensional models.
Abstract: Recurrent neural networks (RNNs) have proved effective at one dimensional sequence learning tasks, such as speech and online handwriting recognition. Some of the properties that make RNNs suitable for such tasks, for example robustness to input warping, and the ability to access contextual information, are also desirable in multi-dimensional domains. However, there has so far been no direct way of applying RNNs to data with more than one spatio-temporal dimension. This paper introduces multi-dimensional recurrent neural networks, thereby extending the potential applicability of RNNs to vision, video processing, medical imaging and many other areas, while avoiding the scaling problems that have plagued other multi-dimensional models. Experimental results are provided for two image segmentation tasks.

215 citations
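
A minimal sketch of the core MDRNN recursion in two dimensions, where the hidden state at each pixel depends on the input plus the hidden states of the already-visited neighbors; names and sizes are illustrative:

```python
import numpy as np

def mdrnn_2d_forward(image, W_in, W_x, W_y, b):
    """One directional 2D-RNN pass: h[i,j] depends on the pixel plus the
    hidden states of the neighbors already visited, (i-1,j) and (i,j-1)."""
    H, Wd = image.shape[:2]
    n_hidden = b.shape[0]
    h = np.zeros((H, Wd, n_hidden))
    for i in range(H):
        for j in range(Wd):
            up = h[i - 1, j] if i > 0 else np.zeros(n_hidden)
            left = h[i, j - 1] if j > 0 else np.zeros(n_hidden)
            h[i, j] = np.tanh(W_in @ image[i, j] + W_x @ left + W_y @ up + b)
    return h

rng = np.random.default_rng(3)
n_hidden, n_channels = 16, 3
img = rng.uniform(0, 1, (32, 32, n_channels))
h = mdrnn_2d_forward(
    img,
    rng.normal(scale=0.1, size=(n_hidden, n_channels)),
    rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
    rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
    np.zeros(n_hidden),
)
# A full MDRNN runs one such pass per scan direction (4 in 2D) and feeds the
# combined hidden layers to an output layer, e.g. for per-pixel segmentation.
```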


Journal ArticleDOI
TL;DR: This research provides a scientific approach for evaluating the short-term seismic hazard potential of a region; the recurrent neural network model yields the best prediction accuracies compared with the LMBP and RBF networks.
Abstract: Neural networks are investigated for predicting the magnitude of the largest seismic event in the following month, based on the analysis of eight mathematically computed parameters known as seismicity indicators. The indicators are selected based on the Gutenberg-Richter and characteristic earthquake magnitude distributions and on the conclusions drawn by recent earthquake prediction studies. Since there is no known established mathematical or even empirical relationship between these indicators and the location and magnitude of a succeeding earthquake in a particular time window, the problem is modeled using three different neural networks: a feed-forward Levenberg-Marquardt backpropagation (LMBP) neural network, a recurrent neural network, and a radial basis function (RBF) neural network. Prediction accuracies of the models are evaluated using four different statistical measures: the probability of detection, the false alarm ratio, the frequency bias, and the true skill score or R score. The models are trained and tested using data for two seismically different regions: Southern California and the San Francisco Bay region. Overall, the recurrent neural network model yields the best prediction accuracies compared with the LMBP and RBF networks. While at present earthquake prediction cannot be made with a high degree of certainty, this research provides a scientific approach for evaluating the short-term seismic hazard potential of a region.

209 citations
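
The four evaluation measures named above are standard 2x2 contingency-table statistics; a small sketch using their common forecast-verification definitions (not formulas quoted from the paper):

```python
def verification_scores(hits, misses, false_alarms, correct_negatives):
    """Standard 2x2 contingency-table measures for event forecasts."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias
    pofd = false_alarms / (false_alarms + correct_negatives)
    tss = pod - pofd                                # true skill score (R score)
    return pod, far, bias, tss

print(verification_scores(hits=42, misses=8, false_alarms=10, correct_negatives=140))
```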


Proceedings Article
01 Jan 2007
TL;DR: A new connectionist approach to on-line handwriting recognition is introduced, addressing in particular the problem of recognizing handwritten whiteboard notes, using a recently introduced objective function, Connectionist Temporal Classification (CTC), that directly trains the network to label unsegmented sequence data.
Abstract: In this paper we introduce a new connectionist approach to on-line handwriting recognition and address in particular the problem of recognizing handwritten whiteboard notes. The approach uses a bidirectional recurrent neural network with the long short-term memory architecture. We use a recently introduced objective function, known as Connectionist Temporal Classification (CTC), that directly trains the network to label unsegmented sequence data. Our new system achieves a word recognition rate of 74.0%, compared with 65.4% using a previously developed HMM-based recognition system.
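
The CTC objective used here is available in modern libraries; a minimal usage sketch with PyTorch's nn.CTCLoss standing in for the original implementation (the network shape and sequence lengths are illustrative, and the original work predates this API):

```python
import torch
import torch.nn as nn

T, N, C = 50, 4, 28           # time steps, batch size, characters + blank
rnn = nn.LSTM(input_size=10, hidden_size=64, bidirectional=True)
proj = nn.Linear(2 * 64, C)
ctc = nn.CTCLoss(blank=0)     # CTC sums over all alignments of the label string

x = torch.randn(T, N, 10)                       # (T, N, features) pen trajectory
out, _ = rnn(x)
log_probs = proj(out).log_softmax(dim=2)        # (T, N, C) log-probabilities

targets = torch.randint(1, C, (N, 12))          # label indices; 0 is the blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()               # trains the network on unsegmented sequences
```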

Journal ArticleDOI
01 Mar 2007
TL;DR: GAs are applied to optimize the number of time delays and the network architectural factors simultaneously for the ATNN and TDNN models, and the accuracy of the proposed integrated approach is shown to be higher than that of the standard ATNN, TDNN and recurrent neural network (RNN).
Abstract: This study investigates the effectiveness of a hybrid approach that combines artificial neural networks (ANNs) designed for time series properties, namely the adaptive time delay neural network (ATNN) and the time delay neural network (TDNN), with genetic algorithms (GAs) for detecting temporal patterns in stock market prediction tasks. Since the ATNN and TDNN use time-delayed links in a multi-layer feed-forward network, the topology of which grows by one layer at every time step, the number of time delays must be estimated in addition to the usual control variables of ANN design. To estimate these many aspects of ATNN and TDNN design, a general method based on trial and error, along with various heuristics or statistical techniques, has traditionally been used. However, because determining the number of time delays or the network architectural factors in a stand-alone mode does not guarantee a meaningful improvement in performance when building ATNN and TDNN models, we apply GAs to optimize the number of time delays and the network architectural factors simultaneously. The results show that the accuracy of the proposed integrated approach is higher than that of the standard ATNN, TDNN and the recurrent neural network (RNN).

Journal ArticleDOI
TL;DR: Extensive simulation results demonstrate that the LF-DFNN models exhibit superior performance compared to other network types suggested in the literature, and that DRPE outperforms three gradient descent algorithms in training the recurrent forecast models.

Book
01 Jan 2007
TL;DR: The Structure of Neural Networks and Methods of Problem Solving in the Neural Network Logical Basis.
Abstract: The Structure of Neural Networks.- Transfer from the Logical Basis of Boolean Elements "AND, OR, NOT" to the Threshold Logical Basis.- Qualitative Characteristics of Neural Network Architectures.- Optimization of Cross Connection Multilayer Neural Network Structure.- Continual Neural Networks.- Optimal Models of Neural Networks.- Investigation of Neural Network Input Signal Characteristics.- Design of Neural Network Optimal Models.- Analysis of the Open-Loop Neural Networks.- Development of Multivariable Function Extremum Search Algorithms.- Adaptive Neural Networks.- Neural Network Adjustment Algorithms.- Adjustment of Continuum Neural Networks.- Selection of Initial Conditions During Neural Network Adjustment - Typical Neural Network Input Signals.- Analysis of Closed-Loop Multilayer Neural Networks.- Synthesis of Multilayer Neural Networks with Flexible Structure.- Informative Feature Selection in Multilayer Neural Networks.- Neural Network Reliability and Diagnostics.- Neural Network Reliability.- Neural Network Diagnostics.- Conclusion.- Methods of Problem Solving in the Neural Network Logical Basis.

Journal ArticleDOI
TL;DR: It is demonstrated that the SVM achieved diagnostic accuracies higher than those of the other automated diagnostic systems, which were compared in order to determine an optimum classification scheme with high diagnostic accuracy for breast cancer detection.
Abstract: This paper presents an integrated view of implementing automated diagnostic systems for breast cancer detection. The major objective of the paper is to serve as a guide for readers who want to develop an automated decision support system for the detection of breast cancer. Because of the importance of making the right decision, better classification procedures for breast cancer have been sought. The classification accuracies of different classifiers, namely the multilayer perceptron neural network (MLPNN), combined neural network (CNN), probabilistic neural network (PNN), recurrent neural network (RNN) and support vector machine (SVM), which were trained on the attributes of each record in the Wisconsin breast cancer database, were compared. The purpose was to determine an optimum classification scheme with high diagnostic accuracy for this problem. This research demonstrated that the SVM achieved diagnostic accuracies higher than those of the other automated diagnostic systems.

Book ChapterDOI
09 Sep 2007
TL;DR: Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations, is presented.
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a "Long Short-Term Memory" architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
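
A generic recurrent REINFORCE-style sketch of the idea, backpropagating return-weighted log-probability terms through an LSTM policy; the environment is a dummy stand-in, and this is not the paper's exact gradient estimator:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Stochastic policy with memory: observation -> LSTM -> action logits."""
    def __init__(self, n_obs, n_actions, n_hidden=32):
        super().__init__()
        self.lstm = nn.LSTMCell(n_obs, n_hidden)
        self.head = nn.Linear(n_hidden, n_actions)

    def forward(self, obs, state):
        h, c = self.lstm(obs, state)
        return torch.distributions.Categorical(logits=self.head(h)), (h, c)

policy = RecurrentPolicy(n_obs=4, n_actions=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# One episode against a dummy environment (replace with a real POMDP).
state, log_probs, rewards = None, [], []
obs = torch.zeros(1, 4)
for t in range(100):
    dist, state = policy(obs, state)
    action = dist.sample()
    log_probs.append(dist.log_prob(action))
    obs, reward = torch.randn(1, 4), float(action.item() == 0)  # dummy dynamics
    rewards.append(reward)

# Return-weighted eligibilities, backpropagated through time by autograd.
returns = torch.tensor([sum(rewards[t:]) for t in range(len(rewards))])
returns = (returns - returns.mean()) / (returns.std() + 1e-8)   # crude baseline
loss = -(torch.stack(log_probs).squeeze() * returns).sum()
opt.zero_grad()
loss.backward()
opt.step()
```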

Journal ArticleDOI
TL;DR: A recurrent neural network (RNN) and particle swarm optimization (PSO) approach is proposed to infer genetic regulatory networks from time series gene expression data, providing meaningful insights into the nonlinear dynamics of gene expression time series and revealing potential regulatory interactions between genes.
Abstract: Genetic regulatory network inference is critically important for revealing fundamental cellular processes, investigating gene functions, and understanding their relations. The availability of time series gene expression data makes it possible to investigate the gene activities of whole genomes, rather than those of only a pair of genes or among several genes. However, current computational methods do not sufficiently consider the temporal behavior of this type of data and lack the capability to capture the complex nonlinear system dynamics. We propose a recurrent neural network (RNN) and particle swarm optimization (PSO) approach to infer genetic regulatory networks from time series gene expression data. Under this framework, gene interaction is explained through a connection weight matrix. Based on the fact that the measured time points are limited and the assumption that the genetic networks are usually sparsely connected, we present a PSO-based search algorithm to unveil potential genetic network constructions that fit well with the time series data and explore possible gene interactions. Furthermore, PSO is used to train the RNN and determine the network parameters. Our approach has been applied to both synthetic and real data sets. The results demonstrate that the RNN/PSO can provide meaningful insights in understanding the nonlinear dynamics of the gene expression time series and revealing potential regulatory interactions between genes.
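
A hedged sketch of the framework: a discrete-time RNN whose weight matrix encodes gene interactions, fitted to expression data by a plain global-best PSO. The model form and PSO constants here are generic choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(4)
n_genes, T = 4, 20
data = rng.uniform(0.1, 0.9, (T, n_genes))    # stand-in expression time series

def simulate(W, x0, steps):
    """Discrete-time RNN gene model: x(t+1) = sigmoid(W @ x(t))."""
    xs = [x0]
    for _ in range(steps - 1):
        xs.append(1.0 / (1.0 + np.exp(-W @ xs[-1])))
    return np.array(xs)

def cost(flat_w):
    W = flat_w.reshape(n_genes, n_genes)
    return np.mean((simulate(W, data[0], T) - data) ** 2)

# Plain global-best PSO over the flattened weight matrix.
n_particles, dim, iters = 30, n_genes * n_genes, 300
w, c1, c2 = 0.7, 1.5, 1.5                     # inertia and acceleration constants
pos = rng.uniform(-2, 2, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
gbest = pbest[np.argmin(pbest_cost)].copy()

for _ in range(iters):
    r1, r2 = rng.uniform(size=(2, n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    costs = np.array([cost(p) for p in pos])
    better = costs < pbest_cost
    pbest[better], pbest_cost[better] = pos[better], costs[better]
    gbest = pbest[np.argmin(pbest_cost)].copy()

# Fitted weight signs/magnitudes are read as candidate regulatory interactions.
print("best fit MSE:", pbest_cost.min())
```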

Journal ArticleDOI
TL;DR: It is shown that the suggested learning algorithms outperform three gradient descent algorithms in training the recurrent forecast models, and that the models exhibit superior performance compared to other network types suggested in the literature.

Journal ArticleDOI
TL;DR: A proof of the universal approximation ability of RNNs in state space model form is given and extended to Error Correction and Normalized Recurrent Neural Networks.
Abstract: Recurrent Neural Networks (RNNs) have been developed for a better understanding and analysis of open dynamical systems. Still, the question often arises whether RNNs are able to map every open dynamical system, which would be desirable for a broad spectrum of applications. In this article we give a proof of the universal approximation ability of RNNs in state space model form and extend it to Error Correction and Normalized Recurrent Neural Networks.
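
The state space model form referred to above can be written as follows (one standard convention; a sketch, not necessarily the article's exact notation):

```latex
% RNN in state space model form:
s_t = f(A s_{t-1} + B x_t - \theta), \qquad y_t = C s_t
% Universal approximation, informally: for any open dynamical system
%   s_t = g(s_{t-1}, x_t), \; y_t = h(s_t)
% and any \varepsilon > 0, weights (A, B, C, \theta) exist such that the RNN
% output stays \varepsilon-close to y_t over a finite time horizon.
```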

Journal ArticleDOI
TL;DR: In this article, the analysis problem for the existence and stability of periodic solutions is investigated for a class of general discrete-time recurrent neural networks with time-varying delays, and several sufficient conditions are established to ensure the existence, uniqueness and global exponential stability of the periodic solution for the addressed neural network.

Journal ArticleDOI
TL;DR: It is demonstrated that the BPTT algorithm is more efficient for gradient calculations, but the RTRL algorithm is more efficient for Jacobian calculations.
Abstract: This paper introduces a general framework for describing dynamic neural networks: the layered digital dynamic network (LDDN). This framework allows the development of two general algorithms for computing the gradients and Jacobians of these dynamic networks: backpropagation-through-time (BPTT) and real-time recurrent learning (RTRL). The structure of the LDDN framework enables an efficient implementation of both algorithms for arbitrary dynamic networks. This paper demonstrates that the BPTT algorithm is more efficient for gradient calculations, but the RTRL algorithm is more efficient for Jacobian calculations.
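
A minimal numpy sketch of BPTT for a vanilla recurrent network, i.e. one forward pass that stores the states and one backward pass through time; the LDDN framework in the paper generalizes this to arbitrary layered dynamic networks:

```python
import numpy as np

def bptt_gradient(W, W_in, u, targets):
    """Gradient of the squared error w.r.t. the recurrent weights of a vanilla
    RNN h[t] = tanh(W h[t-1] + W_in u[t]), read out directly as y[t] = h[t][0]."""
    T, n = len(u), W.shape[0]
    hs = np.zeros((T + 1, n))
    for t in range(T):                          # forward pass, store all states
        hs[t + 1] = np.tanh(W @ hs[t] + W_in @ u[t])
    dW = np.zeros_like(W)
    delta = np.zeros(n)
    for t in reversed(range(T)):                # backward pass through time
        err = np.zeros(n)
        err[0] = hs[t + 1][0] - targets[t]      # d(loss)/d(y[t])
        delta = (err + W.T @ delta) * (1 - hs[t + 1] ** 2)
        dW += np.outer(delta, hs[t])
    return dW

rng = np.random.default_rng(5)
n, T = 8, 40
W = rng.normal(scale=0.3, size=(n, n))
W_in = rng.normal(scale=0.3, size=(n, 1))
u = rng.normal(size=(T, 1))
targets = rng.normal(size=T)
print(bptt_gradient(W, W_in, u, targets)[:2, :2])
```

RTRL instead carries the full state Jacobian forward at every step, which is why it pays off when Jacobians rather than a single gradient are required.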

Journal ArticleDOI
TL;DR: In this paper, the state estimation problem for a class of recurrent neural networks (RNNs) with mixed discrete and distributed delays is dealt with, where activation functions are assumed to be neither monotonic, nor differentiable, nor bounded.

Journal ArticleDOI
TL;DR: First, recurrent neural networks are applied to model these two processes together, and an extra factor characterizing the dispersion of prediction repetitions is incorporated into the performance function.

Journal ArticleDOI
TL;DR: The experimental results on two data sets studied in this paper demonstrate that the DEPSO algorithm performs better in RNN training, and the RNN-based model can provide meaningful insight in capturing the nonlinear dynamics of genetic networks and revealing genetic regulatory interactions.

Journal ArticleDOI
TL;DR: In this paper, the robust stability of uncertain stochastic neural networks with time-varying delay is analyzed; the parameter uncertainties are assumed to be norm bounded, and the usual restrictions that the time-varying delay be differentiable with a derivative strictly smaller than one are removed.
Abstract: This paper is concerned with the robust stability analysis problem for uncertain stochastic neural networks with time-varying delay. The parameter uncertainties are assumed to be norm bounded. By defining a new Lyapunov–Krasovskii functional, the restrictions that the time-varying delay be differentiable and that its derivative be strictly smaller than one are removed. Based on the linear matrix inequality approach, delay-dependent stability criteria are obtained such that, for all admissible uncertainties, the stochastic neural network is globally asymptotically stable in the mean square. Two slack variables are introduced into the obtained stability criteria to reduce conservatism. Finally, a numerical example is given to illustrate the effectiveness of the developed method.

Journal ArticleDOI
TL;DR: Under mild conditions, it is proved that the dynamical systems with unbounded time-varying delays are globally mu-stable.
Abstract: In this letter, dynamical systems with unbounded time-varying delays are investigated. We address the following question: to what extent can time-varying delays grow while the system remains stable? Moreover, a new concept of stability, global mu-stability, is proposed. Under mild conditions, we prove that dynamical systems with unbounded time-varying delays are globally mu-stable.

Journal ArticleDOI
24 Sep 2007
TL;DR: The designed neural network in a specific case turns out to be the primal-dual network for solving quadratic or linear programming problems and is guaranteed to be globally convergent to solutions of the LVI under the condition that the linear mapping Mx + p is monotone on the constrained set.
Abstract: Most existing neural networks for solving linear variational inequalities (LVIs) with the mapping Mx + p require positive definiteness (or positive semidefiniteness) of M. In this correspondence, it is revealed that this condition is sufficient but not necessary for an LVI being strictly monotone (or monotone) on its constrained set where equality constraints are present. Then, it is proposed to reformulate monotone LVIs with equality constraints into LVIs with inequality constraints only, which are then possible to be solved by using some existing neural networks. General projection neural networks are designed in this correspondence for solving the transformed LVIs. Compared with existing neural networks, the designed neural networks feature lower model complexity. Moreover, the neural networks are guaranteed to be globally convergent to solutions of the LVI under the condition that the linear mapping Mx + p is monotone on the constrained set. Because quadratic and linear programming problems are special cases of LVI in terms of solutions, the designed neural networks can solve them efficiently as well. In addition, it is discovered that the designed neural network in a specific case turns out to be the primal-dual network for solving quadratic or linear programming problems. The effectiveness of the neural networks is illustrated by several numerical examples.

Book ChapterDOI
09 Sep 2007
TL;DR: In this article, the authors introduce three different time-scales and show that performance and computational complexity are highly dependent on these time-scales, as demonstrated on an isolated spoken digits task.
Abstract: Reservoir Computing (RC) is a recent research area in which an untrained recurrent network of nodes is used for the recognition of temporal patterns. Contrary to Recurrent Neural Networks (RNNs), where the weights of the connections between the nodes are trained, only a linear output layer is trained. We introduce three different time-scales and show that the performance and computational complexity are highly dependent on these time-scales. This is demonstrated on an isolated spoken digits task.

Journal ArticleDOI
TL;DR: An augmented complex-valued extended Kalman filter algorithm for the class of nonlinear adaptive filters realized as fully connected recurrent neural networks is introduced based on some recent developments in the so-called augmented complex statistics and the use of general fully complex nonlinear activation functions within the neurons.
Abstract: An augmented complex-valued extended Kalman filter (ACEKF) algorithm for the class of nonlinear adaptive filters realized as fully connected recurrent neural networks is introduced. This is achieved based on some recent developments in the so-called augmented complex statistics and the use of general fully complex nonlinear activation functions within the neurons. This makes the ACEKF suitable for processing general complex-valued nonlinear and nonstationary signals and also bivariate signals with strong component correlations. Simulations on benchmark and real-world complex-valued signals support the approach.

Journal ArticleDOI
TL;DR: In this paper, the pth moment exponential stability of stochastic recurrent neural networks with time-varying delays is investigated in detail, employing the method of variation of parameters and inequality techniques.
Abstract: In this paper, the issue of pth moment exponential stability of stochastic recurrent neural networks with time-varying delays is investigated in detail. Employing the method of variation of parameters and inequality techniques, several sufficient conditions ensuring pth moment exponential stability are obtained. Compared with previous methods, our method does not resort to any Lyapunov function, and the results derived in this paper improve and generalize some earlier works reported in the literature. Two numerical examples are given to illustrate the effectiveness of our results.