scispace - formally typeset

Showing papers on "Recurrent neural network published in 2010"


Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model. Speech recognition experiments show around an 18% reduction in word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except for their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition

5,751 citations
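The mixture result above comes from linearly interpolating the per-word probabilities of several language models before computing perplexity. A minimal sketch of that computation, with hypothetical per-word probabilities (in the paper the interpolation weights would be tuned on held-out data):

```python
import math

def perplexity(probs):
    """Perplexity of a word sequence given its per-word model probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def interpolate(model_probs, weights):
    """Linearly interpolate per-word probabilities from several models."""
    return [sum(w * p[i] for w, p in zip(weights, model_probs))
            for i in range(len(model_probs[0]))]

# Hypothetical per-word probabilities from a backoff LM and two RNN LMs
backoff = [0.10, 0.05, 0.20]
rnn_a = [0.30, 0.15, 0.25]
rnn_b = [0.25, 0.20, 0.30]

mix = interpolate([backoff, rnn_a, rnn_b], [0.3, 0.35, 0.35])
print(perplexity(backoff), perplexity(mix))  # the mixture has lower perplexity
```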


Book
01 Jan 2010
TL;DR: Refocused, revised and renamed to reflect the duality of neural networks and learning machines, this edition recognizes that the subject matter is richer when these topics are studied together.
Abstract: For graduate-level neural network courses offered in the departments of Computer Engineering, Electrical Engineering, and Computer Science. Neural Networks and Learning Machines, Third Edition is renowned for its thoroughness and readability. This well-organized and completely up-to-date text remains the most comprehensive treatment of neural networks from an engineering perspective. This is ideal for professional engineers and research scientists. Matlab codes used for the computer experiments in the text are available for download at: http://www.pearsonhighered.com/haykin/ Refocused, revised and renamed to reflect the duality of neural networks and learning machines, this edition recognizes that the subject matter is richer when these topics are studied together. Ideas drawn from neural networks and machine learning are hybridized to perform improved learning tasks beyond the capability of either independently.

4,943 citations


Book
14 Oct 2010
TL;DR: This book provides an overview of neural network modeling, covering model design methodology, neural identification of controlled dynamical systems and recurrent networks, self-organizing maps and unsupervised classification, and neural networks without training for optimization.
Abstract: Neural Networks: An Overview.- Modeling with Neural Networks: Principles and Model Design Methodology.- Modeling Methodology: Dimension Reduction and Resampling Methods.- Neural Identification of Controlled Dynamical Systems and Recurrent Networks.- Closed-Loop Control Learning.- Discrimination.- Self-Organizing Maps and Unsupervised Classification.- Neural Networks without Training for Optimization.

519 citations


Journal ArticleDOI
TL;DR: New delay-dependent stability criteria for RNNs with time-varying delay are derived by applying this weighting-delay method, which are less conservative than previous results.
Abstract: In this paper, a weighting-delay-based method is developed for the study of the stability problem of a class of recurrent neural networks (RNNs) with time-varying delay. Different from previous results, the delay interval [0, d(t)] is divided into some variable subintervals by employing weighting delays. Thus, new delay-dependent stability criteria for RNNs with time-varying delay are derived by applying this weighting-delay method, which are less conservative than previous results. The proposed stability criteria depend on the positions of weighting delays in the interval [0, d(t)], which can be denoted by the weighting-delay parameters. Different weighting-delay parameters lead to different stability margins for a given system. Thus, a solution based on optimization methods is further given to calculate the optimal weighting-delay parameters. Several examples are provided to verify the effectiveness of the proposed criteria.

374 citations


Journal ArticleDOI
TL;DR: By constructing a novel Lyapunov-Krasovskii functional, and using some new approaches and techniques, several novel sufficient conditions are obtained to ensure the exponential stability of the trivial solution in the mean square.
Abstract: This paper is concerned with the problem of exponential stability for a class of Markovian jump impulsive stochastic Cohen-Grossberg neural networks with mixed time delays and known or unknown parameters. The jumping parameters are determined by a continuous-time, discrete-state Markov chain, and the mixed time delays under consideration comprise both time-varying delays and continuously distributed delays. To the best of the authors' knowledge, the exponential stability problem for this class of generalized neural networks had not previously been solved, since continuously distributed delays are considered in this paper. The main objective of this paper is to fill this gap. By constructing a novel Lyapunov-Krasovskii functional, and using some new approaches and techniques, several novel sufficient conditions are obtained to ensure the exponential stability of the trivial solution in the mean square. The results presented in this paper generalize and improve many known results. Finally, two numerical examples and their simulations are given to show the effectiveness of the theoretical results.

282 citations


Journal ArticleDOI
TL;DR: It is found that inputs not only drive network responses, but they also actively suppress ongoing activity, ultimately leading to a phase transition in which chaos is completely eliminated.
Abstract: Neuronal activity arises from an interaction between ongoing firing generated spontaneously by neural circuits and responses driven by external stimuli. Using mean-field analysis, we ask how a neural network that intrinsically generates chaotic patterns of activity can remain sensitive to extrinsic input. We find that inputs not only drive network responses, but they also actively suppress ongoing activity, ultimately leading to a phase transition in which chaos is completely eliminated. The critical input intensity at the phase transition is a nonmonotonic function of stimulus frequency, revealing a "resonant" frequency at which the input is most effective at suppressing chaos even though the power spectrum of the spontaneous activity peaks at zero and falls exponentially. A prediction of our analysis is that the variance of neural responses should be most strongly suppressed at frequencies matching the range over which many sensory systems operate.

256 citations
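The suppression effect can be illustrated numerically. The sketch below (hypothetical parameters, forward Euler integration, not the paper's mean-field analysis) integrates a random rate network x' = -x + J tanh(x) + I cos(wt) and compares how far two trajectories started 1e-6 apart end up, without and with input; in the chaotic regime (coupling gain g > 1), sufficiently strong input is expected to shrink this separation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, g, dt, steps = 200, 1.5, 0.05, 4000
J = rng.normal(0.0, g / np.sqrt(N), (N, N))   # coupling gain g > 1: chaotic

def simulate(amplitude, omega=1.0, x0=None):
    """Forward-Euler integration of x' = -x + J tanh(x) + amplitude*cos(wt)."""
    x = np.zeros(N) if x0 is None else x0.copy()
    for k in range(steps):
        x += dt * (-x + J @ np.tanh(x) + amplitude * np.cos(omega * k * dt))
    return x

# Final distance between two trajectories started 1e-6 apart,
# without input and with a strong sinusoidal input
x0 = rng.normal(0, 1, N)
pert = x0 + 1e-6 * rng.normal(0, 1, N)
sep_free = np.linalg.norm(simulate(0.0, x0=x0) - simulate(0.0, x0=pert))
sep_driven = np.linalg.norm(simulate(5.0, x0=x0) - simulate(5.0, x0=pert))
print(sep_free, sep_driven)  # strong input is expected to shrink the separation
```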


Proceedings ArticleDOI
18 Jul 2010
TL;DR: A simplified mathematical model is proposed to characterize the pinched hysteretic feature of the memristor, a memristor-based recurrent neural network model is given, and its global stability is studied.
Abstract: The memristor is a newly prototyped nonlinear circuit device. Its resistance is not fixed; it changes according to the magnitude and polarity of the voltage applied to it. In this paper, a simplified mathematical model is proposed to characterize the pinched hysteretic feature of the memristor, a memristor-based recurrent neural network model is given, and its global stability is studied. Using differential inclusions, two sufficient conditions for the global uniform asymptotic stability of memristor-based recurrent neural networks are obtained.

193 citations
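The paper proposes its own simplified memristor model; for orientation, here is the standard HP linear-drift sketch (all parameter values hypothetical), which reproduces the pinched hysteresis: whenever the applied voltage crosses zero, the current does too, while the memristance depends on the charge history:

```python
import numpy as np

# A minimal HP-style linear-drift memristor sketch (hypothetical parameters),
# not the simplified model proposed in the paper.
Ron, Roff, D, mu = 100.0, 16e3, 10e-9, 1e-14
dt, steps = 1e-5, 2000
w = 0.1  # normalised width of the doped region, kept in [0, 1]

vs, cs = [], []
for k in range(steps):
    t = k * dt
    v = np.sin(2 * np.pi * 50 * t)       # 50 Hz sinusoidal drive
    M = Ron * w + Roff * (1.0 - w)       # state-dependent memristance
    i = v / M
    w += dt * mu * Ron / D**2 * i        # linear drift of the internal state
    w = min(max(w, 0.0), 1.0)
    vs.append(v); cs.append(i)

# Pinched hysteresis: whenever v = 0, i = 0 as well, since i = v / M
print(max(abs(c) for c in cs))
```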


Journal ArticleDOI
TL;DR: A sufficient condition is obtained to ensure that n-neuron recurrent neural networks can have (4k-1)^n equilibrium points and (2k)^n of them are locally exponentially stable, which improves and extends the existing stability results in the literature.
Abstract: In this brief, stability of multiple equilibria of recurrent neural networks with time-varying delays and the piecewise linear activation function is studied. A sufficient condition is obtained to ensure that n-neuron recurrent neural networks can have (4k-1)^n equilibrium points, and (2k)^n of them are locally exponentially stable. This condition improves and extends the existing stability results in the literature. Simulation results are also discussed in one illustrative example.

189 citations


Journal ArticleDOI
TL;DR: The method introduced in this paper allows for training arbitrarily connected neural networks, therefore, more powerful neural network architectures with connections across layers can be efficiently trained.
Abstract: The method introduced in this paper allows for training arbitrarily connected neural networks, therefore, more powerful neural network architectures with connections across layers can be efficiently trained. The proposed method also simplifies neural network training, by using the forward-only computation instead of the traditionally used forward and backward computation.

187 citations
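The "arbitrarily connected" architectures mentioned above allow connections across layers, not just between adjacent ones. A minimal sketch of a forward pass through such a network (the topology and weights are hypothetical; this illustrates the architecture only, not the paper's forward-only gradient computation):

```python
import math

def forward(inputs, neurons):
    """Forward pass through an arbitrarily connected feed-forward network.

    `neurons` is a list in topological order; each entry is (weights, bias),
    where `weights` maps source node indices (inputs or earlier neurons) to
    weights. Cross-layer connections are simply extra dictionary entries."""
    values = list(inputs)                 # node values; input nodes come first
    for weights, bias in neurons:
        s = bias + sum(w * values[src] for src, w in weights.items())
        values.append(math.tanh(s))
    return values

# Hypothetical 2-input network: node 2 is hidden, node 3 is the output neuron,
# connected both to the hidden neuron and directly (across layers) to input 0
net = [
    ({0: 0.5, 1: -0.3}, 0.1),   # hidden neuron: fed by both inputs
    ({2: 1.2, 0: 0.7}, 0.0),    # output neuron: hidden + cross-layer link
]
vals = forward([1.0, 2.0], net)
print(round(vals[-1], 4))
```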


Journal ArticleDOI
TL;DR: A general model of recurrent neural networks that perform complex rule-based tasks is proposed, and it is found that the diversity of neuronal responses plays a fundamental role when the behavioral responses are context-dependent.
Abstract: Neural activity of behaving animals, especially in the prefrontal cortex, is highly heterogeneous, with selective responses to diverse aspects of the executed task. We propose a general model of recurrent neural networks that perform complex rule-based tasks, and we show that the diversity of neuronal responses plays a fundamental role when the behavioral responses are context-dependent. Specifically, we found that when the inner mental states encoding the task rules are represented by stable patterns of neural activity (attractors of the neural dynamics), the neurons must be selective for combinations of sensory stimuli and inner mental states. Such mixed selectivity is easily obtained by neurons that connect with random synaptic strengths both to the recurrent network and to neurons encoding sensory inputs. The number of randomly connected neurons needed to solve a task is on average only three times as large as the number of neurons needed in a network designed ad hoc. Moreover, the number of needed neurons grows only linearly with the number of task-relevant events and mental states, provided that each neuron responds to a large proportion of events (dense/distributed coding). A biologically realistic implementation of the model captures several aspects of the activity recorded from monkeys performing context-dependent tasks. Our findings explain the importance of the diversity of neural responses and provide us with simple and general principles for designing attractor neural networks that perform complex computation.

185 citations


DOI
01 Oct 2010
TL;DR: The developed ARNN is constructed based on the adaptive/recurrent neural network architecture and the network weights are adaptively optimized using the recursive Levenberg-Marquardt (RLM) method.
Abstract: Prognostics is an emerging science of predicting the health condition of a system (or its components) based upon current and previous system states. A reliable predictor is very useful to a wide array of industries to predict the future states of the system such that the maintenance service could be scheduled in advance when needed. In this paper, an adaptive recurrent neural network (ARNN) is proposed for system dynamic state forecasting. The developed ARNN is constructed based on the adaptive/recurrent neural network architecture and the network weights are adaptively optimized using the recursive Levenberg-Marquardt (RLM) method. The effectiveness of the proposed ARNN is demonstrated via an application in remaining useful life prediction of lithium-ion batteries.

Journal ArticleDOI
TL;DR: A novel technique for incremental recognition of the user's emotional state as it is applied in a sensitive artificial listener (SAL) system designed for socially competent human-machine communication.
Abstract: The automatic estimation of human affect from the speech signal is an important step towards making virtual agents more natural and human-like. In this paper, we present a novel technique for incremental recognition of the user's emotional state as it is applied in a sensitive artificial listener (SAL) system designed for socially competent human-machine communication. Our method is capable of using acoustic, linguistic, as well as long-range contextual information in order to continuously predict the current quadrant in a two-dimensional emotional space spanned by the dimensions valence and activation. The main system components are a hierarchical dynamic Bayesian network (DBN) for detecting linguistic keyword features and long short-term memory (LSTM) recurrent neural networks which model phoneme context and emotional history to predict the affective state of the user. Experimental evaluations on the SAL corpus of non-prototypical real-life emotional speech data consider a number of variants of our recognition framework: continuous emotion estimation from low-level feature frames is evaluated as a new alternative to the common approach of computing statistical functionals of given speech turns. Further performance gains are achieved by discriminatively training LSTM networks and by using bidirectional context information, leading to a quadrant prediction F1-measure of up to 51.3 %, which is only 7.6 % below the average inter-labeler consistency.

Journal ArticleDOI
TL;DR: Investigating the influence of the network connectivity (parameterized by the neuron in-degree) on a family of network models that interpolates between analog and binary networks reveals that the phase transition between ordered and chaotic network behavior of binary circuits qualitatively differs from the one in analog circuits, leading to decreased computational performance observed in binary circuits that are densely connected.
Abstract: Reservoir computing (RC) systems are powerful models for online computations on input sequences. They consist of a memoryless readout neuron that is trained on top of a randomly connected recurrent neural network. RC systems are commonly used in two flavors: with analog or binary (spiking) neurons in the recurrent circuits. Previous work indicated a fundamental difference in the behavior of these two implementations of the RC idea. The performance of an RC system built from binary neurons seems to depend strongly on the network connectivity structure. In networks of analog neurons, such clear dependency has not been observed. In this letter, we address this apparent dichotomy by investigating the influence of the network connectivity (parameterized by the neuron in-degree) on a family of network models that interpolates between analog and binary networks. Our analyses are based on a novel estimation of the Lyapunov exponent of the network dynamics with the help of branching process theory, rank measures that estimate the kernel quality and generalization capabilities of recurrent networks, and a novel mean field predictor for computational performance. These analyses reveal that the phase transition between ordered and chaotic network behavior of binary circuits qualitatively differs from the one in analog circuits, leading to differences in the integration of information over short and long timescales. This explains the decreased computational performance observed in binary circuits that are densely connected. The mean field predictor is also used to bound the memory function of recurrent circuits of binary neurons.
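The Lyapunov exponent that separates the ordered and chaotic regimes can also be estimated directly by tracking a renormalised perturbation (the classic Benettin approach; the paper instead derives a novel estimate via branching process theory). A sketch for a discrete-time analog network with hypothetical gains:

```python
import numpy as np

def lyapunov_estimate(W, steps=2000, eps=1e-8, seed=0):
    """Benettin-style estimate of the largest Lyapunov exponent of
    x(t+1) = tanh(W x(t)), tracking a renormalised perturbation.
    (The paper derives a different, branching-process-based estimate.)"""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = rng.normal(0, 1, n)
    d = rng.normal(0, 1, n)
    d *= eps / np.linalg.norm(d)
    acc = 0.0
    for _ in range(steps):
        x_pert = np.tanh(W @ (x + d))
        x = np.tanh(W @ x)
        d = x_pert - x
        growth = np.linalg.norm(d) / eps
        acc += np.log(growth)
        d *= eps / np.linalg.norm(d)   # renormalise the perturbation to eps
    return acc / steps

rng = np.random.default_rng(1)
n = 100
ordered = rng.normal(0, 0.5 / np.sqrt(n), (n, n))   # gain 0.5: ordered regime
chaotic = rng.normal(0, 2.0 / np.sqrt(n), (n, n))   # gain 2.0: chaotic regime
le_ordered = lyapunov_estimate(ordered)
le_chaotic = lyapunov_estimate(chaotic)
print(round(le_ordered, 3), round(le_chaotic, 3))
```

A negative exponent indicates the ordered regime (perturbations die out); a positive one indicates chaos.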

Journal ArticleDOI
TL;DR: This work presents a novel approach to on-line emotion recognition from speech using Long Short-Term Memory Recurrent Neural Networks, in which recognition is performed on low-level signal frames, similar to those used for speech recognition.
Abstract: For many applications of emotion recognition, such as virtual agents, the system must select responses while the user is speaking. This requires reliable on-line recognition of the user’s affect. However most emotion recognition systems are based on turnwise processing. We present a novel approach to on-line emotion recognition from speech using Long Short-Term Memory Recurrent Neural Networks. Emotion is recognised frame-wise in a two-dimensional valence-activation continuum. In contrast to current state-of-the-art approaches, recognition is performed on low-level signal frames, similar to those used for speech recognition. No statistical functionals are applied to low-level feature contours. Framing at a higher level is therefore unnecessary and regression outputs can be produced in real-time for every low-level input frame. We also investigate the benefits of including linguistic features on the signal frame level obtained by a keyword spotter.

Proceedings Article
01 Jan 2010
TL;DR: This paper presents a new onset detector with superior performance and temporal precision for all kinds of music, including complex music mixes, based on auditory spectral features and relative spectral differences processed by a bidirectional Long Short-Term Memory recurrent neural network, which acts as a reduction function.
Abstract: Many different onset detection methods have been proposed in recent years. However those that perform well tend to be highly specialised for certain types of music, while those that are more widely applicable give only moderate performance. In this paper we present a new onset detector with superior performance and temporal precision for all kinds of music, including complex music mixes. It is based on auditory spectral features and relative spectral differences processed by a bidirectional Long Short-Term Memory recurrent neural network, which acts as a reduction function. The network is trained with a large database of onset data covering various genres and onset types. Due to its data-driven nature, our approach does not require the onset detection method and its parameters to be tuned to a particular type of music. We compare results on the Bello onset data set and conclude that our approach is on par with related results on the same set and outperforms them in most cases in terms of F1-measure. For complex music with mixed onset types, an absolute improvement of 3.6% is reported.
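For comparison with the learned reduction function described above, the conventional hand-crafted alternative is spectral flux: half-wave-rectified frame-to-frame spectral differences. A minimal sketch on a synthetic signal (all parameters hypothetical):

```python
import numpy as np

def onset_strength(signal, frame=1024, hop=512):
    """Spectral flux reduction function: half-wave-rectified frame-to-frame
    magnitude-spectrum differences (a hand-crafted baseline, not the BLSTM)."""
    window = np.hanning(frame)
    mags = np.array([np.abs(np.fft.rfft(signal[s:s + frame] * window))
                     for s in range(0, len(signal) - frame, hop)])
    diff = np.diff(mags, axis=0)
    return np.maximum(diff, 0.0).sum(axis=1)

# Toy signal: silence followed by a burst of noise; the flux peaks at the burst
sr = 16000
rng = np.random.default_rng(0)
sig = np.zeros(sr)
sig[8000:] = rng.normal(0, 0.5, sr - 8000)
flux = onset_strength(sig)
print(int(np.argmax(flux)))  # frame index near the onset at sample 8000
```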

Journal ArticleDOI
TL;DR: In this article, a class of recurrent neural networks with time delay in the leakage term under impulsive perturbations is considered, and sufficient conditions to guarantee the existence, uniqueness and global asymptotic stability of the equilibrium point are presented.
Abstract: In this paper, a class of recurrent neural networks with time delay in the leakage term under impulsive perturbations is considered. First, a sufficient condition is given to ensure the global existence and uniqueness of the solution for the addressed neural networks by using the contraction mapping theorem. Then, we present some sufficient conditions to guarantee the existence, uniqueness and global asymptotic stability of the equilibrium point by using topological degree theory, Lyapunov–Krasovskii functionals and some analysis techniques. The proposed results, which do not require the boundedness, differentiability and monotonicity of the activation functions, can be easily checked via the linear matrix inequality (LMI) control toolbox in MATLAB. Moreover, they indicate that the stability behavior of neural networks is very sensitive to the time delay in the leakage term. In the absence of leakage delay, the results obtained are also new results. Finally, two numerical examples are given to show the effectiveness of the proposed results.

Journal ArticleDOI
TL;DR: Experimental investigations show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
Abstract: This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
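In the linear special case, the recursive second-order update reduces to a Kalman/recursive-least-squares recursion, which conveys the flavor of the algorithm. The paper applies an analogous recursion to RNN Jacobians and additionally adapts the noise and prior hyperparameters; the code below is only that simplified sketch, with hypothetical data:

```python
import numpy as np

def rls_step(w, P, x, y, noise=1e-2):
    """One recursive Gauss-Newton/LM-style update for a linear model y ~ w.x.
    P plays the role of the inverse Hessian (weight covariance); `noise` is
    the measurement-noise hyperparameter that damps the update."""
    Px = P @ x
    k = Px / (noise + x @ Px)      # gain vector
    w = w + k * (y - w @ x)        # weight update from the prediction error
    P = P - np.outer(k, Px)        # covariance (inverse Hessian) update
    return w, P

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
w, P = np.zeros(3), np.eye(3) * 10.0
for _ in range(300):
    x = rng.normal(0, 1, 3)
    y = true_w @ x + rng.normal(0, 0.01)
    w, P = rls_step(w, P, x, y)
print(np.round(w, 2))  # converges towards true_w
```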

Book ChapterDOI
15 Sep 2010
TL;DR: Experimental results show that the proposed approach for action classification in soccer videos outperforms classification methods of related works, and that the combination of the two features (BoW and dominant motion) leads to a classification rate of 92%.
Abstract: In this paper, we propose a novel approach for action classification in soccer videos using a recurrent neural network scheme. For each video action, we extract at each timestep a set of features describing both the visual content (by means of a BoW approach) and the dominant motion (with a keypoint-based approach). A Long Short-Term Memory-based Recurrent Neural Network is then trained to classify each video sequence considering the temporal evolution of the features for each timestep. Experimental results on the MICC-Soccer-Actions-4 database show that the proposed approach outperforms classification methods of related works (with a classification rate of 77%), and that the combination of the two features (BoW and dominant motion) leads to a classification rate of 92%.

Journal ArticleDOI
TL;DR: This brief deals with the problem of stability analysis for a class of recurrent neural networks with a time-varying delay in a range and proposes a new type of delay-range-dependent condition using the free-weighting matrix technique to obtain a tighter upper bound on the derivative of the Lyapunov-Krasovskii functional.
Abstract: This brief deals with the problem of stability analysis for a class of recurrent neural networks (RNNs) with a time-varying delay in a range. Both delay-independent and delay-dependent conditions are derived. For the former, an augmented Lyapunov functional is constructed and the derivative of the state is retained. Since the obtained criterion realizes the decoupling of the Lyapunov function matrix and the coefficient matrix of the neural networks, it can be easily extended to handle neural networks with polytopic uncertainties. For the latter, a new type of delay-range-dependent condition is proposed using the free-weighting matrix technique to obtain a tighter upper bound on the derivative of the Lyapunov-Krasovskii functional. Two examples are given to illustrate the effectiveness and the reduced conservatism of the proposed results.

Journal ArticleDOI
TL;DR: In this article, the effect of the cylinder's surface temperature on both the direct and inverse dynamics of the damper is studied, and the neural network model is shown to be reasonably robust against significant temperature variation.

Journal ArticleDOI
TL;DR: In this brief, state feedback controllers are established to not only guarantee exponential stable synchronization between two general chaotic neural networks with or without time delays, but also reduce the effect of external disturbance on the synchronization error to a minimal H∞ norm constraint.
Abstract: This brief studies exponential H∞ synchronization of a class of general discrete-time chaotic neural networks with external disturbance. On the basis of the drive-response concept and H∞ control theory, and using Lyapunov-Krasovskii (or Lyapunov) functionals, state feedback controllers are established to not only guarantee exponential stable synchronization between two general chaotic neural networks with or without time delays, but also reduce the effect of external disturbance on the synchronization error to a minimal H∞ norm constraint. The proposed controllers can be obtained by solving the convex optimization problems represented by linear matrix inequalities. Most discrete-time chaotic systems with or without time delays, such as Hopfield neural networks, cellular neural networks, bidirectional associative memory networks, recurrent multilayer perceptrons, Cohen-Grossberg neural networks, Chua's circuits, etc., can be transformed into this general chaotic neural network, so that the H∞ synchronization controller can be designed in a unified way. Finally, some illustrated examples with their simulations have been utilized to demonstrate the effectiveness of the proposed methods.

Journal ArticleDOI
TL;DR: A delay&sum readout is introduced, which adds trainable delays in the synaptic connections of output neurons and therefore vastly improves the memory capacity of echo state networks.
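A delay&sum readout can be sketched as a two-stage procedure: pick, per reservoir unit, the delay that best correlates with the target, then fit an ordinary linear readout on the delayed signals. The sketch below (hypothetical toy data, simple correlation-based delay selection rather than the paper's exact learning rule) recovers planted delays:

```python
import numpy as np

def delay_and_sum_readout(states, target, max_delay=10):
    """Per reservoir unit, choose the delay best correlated with the target,
    then fit a plain least-squares linear readout on the delayed signals."""
    T, n = states.shape
    delayed = np.empty((T - max_delay, n))
    delays = []
    for j in range(n):
        corrs = [abs(np.corrcoef(states[max_delay - d:T - d, j],
                                 target[max_delay:])[0, 1])
                 for d in range(max_delay + 1)]
        d = int(np.argmax(corrs))
        delays.append(d)
        delayed[:, j] = states[max_delay - d:T - d, j]
    w, *_ = np.linalg.lstsq(delayed, target[max_delay:], rcond=None)
    return w, delays

# Toy data: target = unit 0 delayed by 3 steps + 0.5 * unit 1 delayed by 7
rng = np.random.default_rng(0)
T = 500
states = rng.normal(0, 1, (T, 2))
target = np.zeros(T)
target[3:] += states[:-3, 0]
target[7:] += 0.5 * states[:-7, 1]
w, delays = delay_and_sum_readout(states, target)
print(delays, np.round(w, 2))  # recovers delays [3, 7] and weights [1, 0.5]
```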

Journal ArticleDOI
TL;DR: A review of the theory, extension models, learning algorithms and applications of the RNN, which has been applied in a variety of areas including pattern recognition, classification, image processing, combinatorial optimization and communication systems.
Abstract: The random neural network (RNN) is a recurrent neural network model inspired by the spiking behaviour of biological neuronal networks. Contrary to most artificial neural network models, neurons in the RNN interact by probabilistically exchanging excitatory and inhibitory spiking signals. The model is described by analytical equations, has a low complexity supervised learning algorithm and is a universal approximator for bounded continuous functions. The RNN has been applied in a variety of areas including pattern recognition, classification, image processing, combinatorial optimization and communication systems. It has also inspired research activity in modelling interacting entities in various systems such as queueing and gene regulatory networks. This paper presents a review of the theory, extension models, learning algorithms and applications of the RNN.

Journal ArticleDOI
TL;DR: Speech recognition classification performance is investigated using two standard neural network structures as classifiers: a feed-forward neural network with the back-propagation algorithm and a radial basis function neural network.
Abstract: This paper presents an investigation of speech recognition classification performance, performed using two standard neural network structures as classifiers: a feed-forward neural network (NN) with the back-propagation algorithm and a radial basis function (RBF) neural network.

Journal ArticleDOI
01 Dec 2010
TL;DR: The less conservative robust stability criteria for SNTNNs with time-varying delays are proposed by using a new Lyapunov-Krasovskii functional and a novel series compensation (SC) technique, and the simulation results demonstrate the effectiveness of the proposed criteria.
Abstract: This paper studies a class of new neural networks referred to as switched neutral-type neural networks (SNTNNs) with time-varying delays, which combines switched systems with a class of neutral-type neural networks. Less conservative robust stability criteria for SNTNNs with time-varying delays are proposed by using a new Lyapunov-Krasovskii functional and a novel series compensation (SC) technique. Based on the new functional, SNTNNs with fast-varying neutral-type delay (the derivative of the delay may exceed one) are first considered. The benefit brought by employing the SC technique is that some useful negative definite elements can be included in the stability criteria, which are generally ignored in the estimation of the upper bound of the derivative of the Lyapunov-Krasovskii functional in the literature. Furthermore, the criteria proposed in this paper are also effective and less conservative for switched recurrent neural networks, which can be considered as special cases of SNTNNs. The simulation results based on several numerical examples demonstrate the effectiveness of the proposed criteria.

Journal ArticleDOI
TL;DR: A novel delay-dependent stability criterion is established for the considered recurrent neural networks via a new Lyapunov function and a linear matrix inequality (LMI) approach.
Abstract: This brief investigates the problem of global exponential stability analysis for discrete recurrent neural networks with time-varying delays. In terms of a linear matrix inequality (LMI) approach, a novel delay-dependent stability criterion is established for the considered recurrent neural networks via a new Lyapunov function. The obtained condition is less conservative and involves fewer variables than the existing ones. A numerical example is given to demonstrate the effectiveness of the proposed method.


Journal ArticleDOI
TL;DR: The Recurrent Policy Gradient as discussed by the authors approximates a policy gradient for a recurrent neural network by backpropagating return-weighted characteristic eligibilities through time, and is able to outperform previous RL methods on three important benchmark tasks.
Abstract: Reinforcement learning for partially observable Markov decision problems (POMDPs) is a challenge as it requires policies with an internal state. Traditional approaches suffer significantly from this shortcoming and usually make strong assumptions on the problem domain such as perfect system models, state-estimators and a Markovian hidden system. Recurrent neural networks (RNNs) offer a natural framework for dealing with policy learning using hidden state and require only few limiting assumptions. As they can be trained well using gradient descent, they are suited for policy gradient approaches. In this paper, we present a policy gradient method, the Recurrent Policy Gradient, which constitutes a model-free reinforcement learning method. It is aimed at training limited-memory stochastic policies on problems which require long-term memories of past observations. The approach involves approximating a policy gradient for a recurrent neural network by backpropagating return-weighted characteristic eligibilities through time. Using a "Long Short-Term Memory" RNN architecture, we are able to outperform previous RL methods on three important benchmark tasks. Furthermore, we show that using history-dependent baselines helps reducing estimation variance significantly, thus enabling our approach to tackle more challenging, highly stochastic environments.
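The core of the Recurrent Policy Gradient is REINFORCE-style learning with return-weighted eligibilities accumulated through a recurrent state. The toy sketch below trains only the readout weights of a fixed random recurrent network on a minimal memory task (remember a cue for five steps); the paper's method additionally backpropagates through the recurrent (LSTM) weights, which is omitted here, and all parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T_steps, episodes, lr = 20, 5, 5000, 0.1
Wx = rng.normal(0.0, 1.0, n)                    # fixed input weights
Wh = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))  # fixed recurrent weights
v = np.zeros(n)                                  # trained readout (policy) weights

def run_episode(cue):
    """Present the cue at t=0 only; the recurrent state must carry it."""
    h = np.zeros(n)
    for t in range(T_steps):
        x = cue if t == 0 else 0.0
        h = np.tanh(Wx * x + Wh @ h)
    p = 1.0 / (1.0 + np.exp(-v @ h))             # Bernoulli action probability
    a = rng.random() < p
    r = 1.0 if a == (cue > 0) else 0.0           # reward: action matches cue
    elig = (a - p) * h                           # d log pi / d v
    return r, elig

baseline = 0.5
for _ in range(episodes):
    cue = float(rng.integers(0, 2)) * 2.0 - 1.0  # cue in {-1, +1}
    r, elig = run_episode(cue)
    v += lr * (r - baseline) * elig              # return-weighted eligibility
    baseline += 0.05 * (r - baseline)            # running baseline

acc = np.mean([run_episode(float(c) * 2.0 - 1.0)[0]
               for c in rng.integers(0, 2, 500)])
print(acc)  # accuracy well above the 0.5 chance level after training
```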

Proceedings ArticleDOI
26 Sep 2010
TL;DR: A novel NNLM adaptation method using a cascaded network is proposed and consistent WER reductions were obtained on a state-of-the-art Arabic LVCSR task over conventional NNLMs.
Abstract: Neural network language models (NNLM) have become an increasingly popular choice for large vocabulary continuous speech recognition (LVCSR) tasks, due to their inherent generalisation and discriminative power. This paper presents two techniques to improve the performance of standard NNLMs. First, the standard NNLM is extended by introducing an additional output layer node to model the probability mass of out-of-shortlist (OOS) words. An associated probability normalisation scheme is explicitly derived. Second, a novel NNLM adaptation method using a cascaded network is proposed. Consistent WER reductions were obtained on a state-of-the-art Arabic LVCSR task over conventional NNLMs. Further performance gains were also observed after NNLM adaptation.
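The out-of-shortlist idea can be sketched as a softmax with one extra node whose probability mass is redistributed over OOS words according to a backoff distribution. The normalisation below is one plausible scheme, not necessarily the one derived in the paper, and all numbers are hypothetical:

```python
import numpy as np

def nnlm_probs(logits_shortlist, logit_oos, backoff_oos_probs):
    """Softmax over shortlist words plus a single OOS node; the OOS mass is
    spread over OOS words proportionally to a backoff LM distribution."""
    z = np.concatenate([logits_shortlist, [logit_oos]])
    e = np.exp(z - z.max())                 # numerically stable softmax
    sm = e / e.sum()
    p_shortlist, p_oos_mass = sm[:-1], sm[-1]
    p_oos = p_oos_mass * backoff_oos_probs / backoff_oos_probs.sum()
    return p_shortlist, p_oos

# Hypothetical logits for a 3-word shortlist and backoff probs for 3 OOS words
p_in, p_out = nnlm_probs(np.array([2.0, 1.0, 0.5]), 0.0,
                         np.array([0.02, 0.01, 0.01]))
total = p_in.sum() + p_out.sum()
print(round(total, 6))  # the combined distribution sums to 1
```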

Journal ArticleDOI
TL;DR: A less conservative delay-dependent global asymptotical stability criterion is first proposed for RNNs with multiple delays, and the obtained stability result is easy to check and improves upon existing ones.
Abstract: This paper studies the stability problem of a class of recurrent neural networks (RNNs) with multiple delays. By using an augmented matrix-vector transformation for delays and a novel line integral-type Lyapunov-Krasovskii functional, a less conservative delay-dependent global asymptotical stability criterion is first proposed for RNNs with multiple delays. The obtained stability result is easy to check and improves upon existing ones. Then, two numerical examples are given to verify the effectiveness of the proposed criterion.