
Showing papers on "Artificial neural network" published in 2005


Proceedings Article
01 Jan 2005
TL;DR: In this article, a modified, full gradient version of the LSTM learning algorithm was used for framewise phoneme classification, using the TIMIT database, and the results support the view that contextual information is crucial to speech processing, and suggest that bidirectional networks outperform unidirectional ones.
Abstract: In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.
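
To make the setup concrete, below is a minimal sketch of a bidirectional LSTM for framewise classification, written with PyTorch rather than the authors' original implementation; the feature dimension, hidden size, and number of phoneme classes are illustrative placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

class FramewiseBLSTM(nn.Module):
    def __init__(self, n_features=26, n_hidden=100, n_classes=61):
        super().__init__()
        # bidirectional=True runs one LSTM forward and one backward in time,
        # so each frame's representation sees both past and future context
        self.blstm = nn.LSTM(n_features, n_hidden,
                             batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * n_hidden, n_classes)  # 2x: both directions

    def forward(self, x):            # x: (batch, frames, features)
        h, _ = self.blstm(x)         # h: (batch, frames, 2 * n_hidden)
        return self.out(h)           # one class-score vector per frame

model = FramewiseBLSTM()
frames = torch.randn(4, 300, 26)     # dummy batch of acoustic feature sequences
logits = model(frames)               # (4, 300, 61): framewise phoneme scores
```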

3,028 citations


Proceedings ArticleDOI
07 Aug 2005
TL;DR: RankNet is introduced, an implementation of these ideas using a neural network to model the underlying ranking function, and test results on toy data and on data from a commercial internet search engine are presented.
Abstract: We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data from a commercial internet search engine.
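
The probabilistic cost is simple enough to sketch: the modelled probability that item i ranks above item j is a logistic function of the score difference, trained with cross-entropy against the known preference. Below is a hedged PyTorch illustration; the scoring network and feature dimension are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# a small neural network mapping a feature vector to a relevance score
scorer = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

def ranknet_loss(x_i, x_j, target):
    # target = 1.0 when item i should rank above item j, 0.0 otherwise
    s_diff = scorer(x_i) - scorer(x_j)       # score difference o_i - o_j
    # cross-entropy between the target and sigmoid(score difference)
    return F.binary_cross_entropy_with_logits(s_diff.squeeze(-1), target)

x_i, x_j = torch.randn(8, 10), torch.randn(8, 10)  # dummy feature pairs
target = torch.ones(8)                              # i preferred over j
loss = ranknet_loss(x_i, x_j, target)
loss.backward()                                     # ready for gradient descent
```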

2,813 citations


Journal ArticleDOI
TL;DR: In this article, a modified, full gradient version of the LSTM learning algorithm was used for framewise phoneme classification, using the TIMIT database, and the results support the view that contextual information is crucial to speech processing, and suggest that bidirectional networks outperform unidirectional ones.

2,200 citations


Proceedings ArticleDOI
27 Dec 2005
TL;DR: A new neural model, called graph neural network (GNN), capable of directly processing graphs, is presented; it extends recursive neural networks and can be applied to most practically useful kinds of graphs, including directed, undirected, labelled and cyclic graphs.
Abstract: In several applications the information is naturally represented by graphs. Traditional approaches cope with graphical data structures using a preprocessing phase which transforms the graphs into a set of flat vectors. However, in this way, important topological information may be lost and the achieved results may heavily depend on the preprocessing stage. This paper presents a new neural model, called graph neural network (GNN), capable of directly processing graphs. GNNs extend recursive neural networks and can be applied to most of the practically useful kinds of graphs, including directed, undirected, labelled and cyclic graphs. A learning algorithm for GNNs is proposed, and experiments assessing the properties of the model are discussed.
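
A toy numpy sketch of the fixed-point computation this describes follows; the linear-plus-tanh transition map is an illustrative stand-in for the learned function (which the paper constrains so that the iteration converges), and the small cyclic graph is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 4                                   # nodes, state dimension
adj = np.array([[0, 1, 1, 0, 0],              # undirected, cyclic toy graph
                [1, 0, 1, 0, 0],
                [1, 1, 0, 1, 0],
                [0, 0, 1, 0, 1],
                [0, 0, 0, 1, 0]])
labels = rng.normal(size=(n, d))              # node label vectors
W = 0.05 * rng.normal(size=(d, d))            # kept small so the update contracts

x = np.zeros((n, d))                          # node states
for _ in range(100):                          # iterate to a fixed point
    x_new = np.tanh(adj @ x @ W + labels)     # aggregate neighbour states
    if np.max(np.abs(x_new - x)) < 1e-6:
        break
    x = x_new
# node-level outputs would then be read out from the converged states x
```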

1,569 citations


Journal ArticleDOI
TL;DR: This paper investigates the predictability of financial movement direction with SVM by forecasting the weekly movement direction of the NIKKEI 225 index and proposes a combining model that integrates SVM with other classification methods.

984 citations


Journal ArticleDOI
TL;DR: A statistical framework based on the point process likelihood function to relate a neuron's spiking probability to three typical covariates: the neuron's own spiking history, concurrent ensemble activity, and extrinsic covariates such as stimuli or behavior.
Abstract: Multiple factors simultaneously affect the spiking activity of individual neurons. Determining the effects and relative importance of these factors is a challenging problem in neurophysiology. We propose a statistical framework based on the point process likelihood function to relate a neuron's spiking probability to three typical covariates: the neuron's own spiking history, concurrent ensemble activity, and extrinsic covariates such as stimuli or behavior. The framework uses parametric models of the conditional intensity function to define a neuron's spiking probability in terms of the covariates. The discrete time likelihood function for point processes is used to carry out model fitting and model analysis. We show that, by modeling the logarithm of the conditional intensity function as a linear combination of functions of the covariates, the discrete time point process likelihood function is readily analyzed in the generalized linear model (GLM) framework. We illustrate our approach for both GLM and non-GLM likelihood functions using simulated data and multivariate single-unit activity data simultaneously recorded from the motor cortex of a monkey performing a visuomotor pursuit-tracking task. The point process framework provides a flexible, computationally efficient approach for maximum likelihood estimation, goodness-of-fit assessment, residual analysis, model selection, and neural decoding. The framework thus allows for the formulation and analysis of point process models of neural spiking activity that readily capture the simultaneous effects of multiple covariates and enables the assessment of their relative importance.
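
A numpy sketch of the discrete-time likelihood in the GLM case follows; the design matrix standing in for spike-history, ensemble, and stimulus covariates is synthetic, and the bin width and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T, p, dt = 1000, 6, 0.001             # time bins, covariates, bin width (s)
X = rng.normal(size=(T, p))           # covariates: history, ensemble, stimulus
beta = 0.1 * rng.normal(size=p)

lam = np.exp(X @ beta)                # conditional intensity, log-linear in X
spikes = rng.random(T) < lam * dt     # simulated 0/1 spike train

def neg_log_likelihood(beta, X, spikes, dt):
    log_lam = X @ beta
    # discrete-time point-process log-likelihood: log(lambda*dt) summed over
    # spike bins, minus the integrated conditional intensity
    return -(np.sum(log_lam[spikes] + np.log(dt)) - np.sum(np.exp(log_lam) * dt))

print(neg_log_likelihood(beta, X, spikes, dt))
```

Because the model is a Poisson regression with a log link, this objective is convex in beta, which is what makes maximum likelihood fitting and model comparison straightforward in the GLM framework.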

982 citations


Book ChapterDOI
03 Oct 2005
TL;DR: NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron, is introduced and it is shown empirically, that reasonably few interactions with the plant are needed to generate control policies of high quality.
Abstract: This paper introduces NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron. Based on the principle of storing and reusing transition experiences, a model-free, neural network based Reinforcement Learning algorithm is proposed. The method is evaluated on three benchmark problems. It is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.
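
A compressed sketch of the NFQ loop follows, assuming a stored set of transition tuples and a discrete action set; scikit-learn's MLPRegressor stands in for the multilayer perceptron (the paper trains its network with Rprop, which sklearn does not use), so this shows the structure only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

gamma, actions = 0.95, [0, 1]
rng = np.random.default_rng(2)
# stored experience: (state, action, reward, next_state) tuples from the plant
D = [(rng.normal(size=3), rng.choice(actions), rng.normal(), rng.normal(size=3))
     for _ in range(500)]

q = None
for _ in range(10):                             # NFQ iterations
    X, y = [], []
    for s, a, r, s2 in D:                       # reuse ALL stored transitions
        if q is None:
            target = r                          # no value estimate yet
        else:                                   # bootstrapped target
            target = r + gamma * max(
                q.predict(np.append(s2, b).reshape(1, -1))[0] for b in actions)
        X.append(np.append(s, a))
        y.append(target)
    q = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=2000)
    q.fit(np.array(X), np.array(y))             # batch supervised training step
```

Because every iteration retrains on the whole stored experience set, few fresh interactions with the plant are needed, which is the efficiency argument the abstract makes.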

944 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the effect of data preprocessing, including deseasonalization and detrending, on neural network modeling and forecasting performance for seasonal time series.

803 citations


Journal ArticleDOI
TL;DR: This paper applies support vector machines (SVMs) to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability, and shows that SVM outperforms the other methods.
Abstract: Bankruptcy prediction has drawn a lot of research interest in previous literature, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability. To serve this purpose, we use a grid-search technique with 5-fold cross-validation to find the optimal parameter values of the SVM's kernel function. In addition, to evaluate the prediction accuracy of the SVM, we compare its performance with those of multiple discriminant analysis (MDA), logistic regression analysis (Logit), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that the SVM outperforms the other methods.
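
The model-selection step described here maps directly onto scikit-learn; in the sketch below the grid values and the synthetic data are placeholders for the paper's financial ratios and bankruptcy labels.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))        # stand-in for financial-ratio features
y = rng.integers(0, 2, size=200)      # stand-in bankrupt / healthy labels

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)  # 5-fold CV
search.fit(X, y)
print(search.best_params_, search.best_score_)
```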

797 citations


Journal ArticleDOI
TL;DR: A hybrid learning algorithm is proposed which uses a differential evolution algorithm to select the input weights and the Moore-Penrose (MP) generalized inverse to analytically determine the output weights.
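
The analytic output-weight step is compact enough to sketch in numpy: given candidate input weights (which the differential evolution search would propose), the output weights follow in closed form from the Moore-Penrose pseudoinverse. The single-hidden-layer sizes and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))         # inputs
T = rng.normal(size=(100, 1))         # targets
W = rng.normal(size=(5, 20))          # candidate input weights (from DE)
b = rng.normal(size=20)               # hidden biases

H = np.tanh(X @ W + b)                # hidden-layer output matrix
beta = np.linalg.pinv(H) @ T          # MP generalized inverse: closed-form fit
mse = np.mean((H @ beta - T) ** 2)    # the fitness DE would minimize
```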

734 citations


Journal ArticleDOI
TL;DR: The results demonstrate that the accuracy and generalization performance of SVM are better than those of BPN as the training set size gets smaller, and several superior points of the SVM algorithm compared with BPN are investigated.
Abstract: This study investigates the efficacy of applying support vector machines (SVM) to the bankruptcy prediction problem. Although it is a well-known fact that the back-propagation neural network (BPN) performs well in pattern recognition tasks, the method has some limitations in that finding an appropriate model structure and optimal solution is something of an art. Furthermore, as many training samples as possible must be loaded into the network in order to search for the network's weights. On the other hand, since SVM captures geometric characteristics of feature space without deriving weights of networks from the training data, it is capable of extracting the optimal solution with a small training set. In this study, we show that the proposed SVM classifier outperforms BPN on the problem of corporate bankruptcy prediction. The results demonstrate that the accuracy and generalization performance of SVM are better than those of BPN as the training set size gets smaller. We also examine the effect of the variability in performance with respect to various values of parameters in SVM. In addition, we investigate and summarize several superior points of the SVM algorithm compared with BPN.

01 Jan 2005
TL;DR: It is found that neural networks are not able to capture seasonal or trend variations effectively with the unpreprocessed raw data and either detrending or deseasonalization can dramatically reduce forecasting errors.
Abstract: Neural networks have been widely used as a promising method for time series forecasting. However, limited empirical studies on seasonal time series forecasting with neural networks yield mixed results. While some find that neural networks are able to model seasonality directly and prior deseasonalization is not necessary, others conclude just the opposite. In this paper, we investigate the issue of how to effectively model time series with both seasonal and trend patterns. In particular, we study the effectiveness of data preprocessing, including deseasonalization and detrending, on neural network modeling and forecasting performance. Both simulation and real data are examined and results are compared to those obtained from the Box–Jenkins seasonal autoregressive integrated moving average models. We find that neural networks are not able to capture seasonal or trend variations effectively with the unpreprocessed raw data and either detrending or deseasonalization can dramatically reduce forecasting errors. Moreover, a combined detrending and deseasonalization is found to be the most effective data preprocessing approach.
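
A numpy sketch of the combined preprocessing the paper finds most effective follows: fit and remove a linear trend, then remove per-month seasonal means, before handing the series to the network. The synthetic monthly series is illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(120)                                  # ten years of monthly data
season = 10 * np.sin(2 * np.pi * t / 12)
y = 0.5 * t + season + rng.normal(size=120)         # trend + seasonality + noise

# detrend: subtract a least-squares linear fit
slope, intercept = np.polyfit(t, y, 1)
detrended = y - (slope * t + intercept)

# deseasonalize: subtract the average value of each calendar month
monthly_mean = np.array([detrended[t % 12 == m].mean() for m in range(12)])
preprocessed = detrended - monthly_mean[t % 12]
# 'preprocessed' is what the forecasting network would see; its forecasts are
# re-trended and re-seasonalized by adding the removed components back
```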

Journal ArticleDOI
TL;DR: A new, simplified and overarching theory of noradrenaline function is inspired by an invertebrate model: neuromodulators in crustacea abruptly interrupt activity in neural networks and reorganize the elements into new functional networks determining the behavioral output.

Journal ArticleDOI
TL;DR: The paper first introduces the concept of significance for the hidden neurons and then uses it in the learning algorithm to realize parsimonious networks; the resulting algorithm outperforms several other sequential learning algorithms in terms of learning speed, network size and generalization performance regardless of the sampling density function of the training data.
Abstract: This work presents a new sequential learning algorithm for radial basis function (RBF) networks referred to as generalized growing and pruning algorithm for RBF (GGAP-RBF). The paper first introduces the concept of significance for the hidden neurons and then uses it in the learning algorithm to realize parsimonious networks. The growing and pruning strategy of GGAP-RBF is based on linking the required learning accuracy with the significance of the nearest or intentionally added new neuron. Significance of a neuron is a measure of the average information content of that neuron. The GGAP-RBF algorithm can be used for any arbitrary sampling density for training samples and is derived from a rigorous statistical point of view. Simulation results for benchmark problems in the function approximation area show that the GGAP-RBF outperforms several other sequential learning algorithms in terms of learning speed, network size and generalization performance regardless of the sampling density function of the training data.
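
A heavily simplified sketch in the spirit of the growing-and-pruning loop follows; the pruning criterion used here (output-weight magnitude) is a crude proxy for the paper's statistically derived significance measure, and the thresholds and toy target function are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)
centers, widths, w = [], [], []           # RBF network, grown from empty
eps, sig_min, width0 = 0.5, 0.05, 0.4     # growth distance, prune threshold

def predict(x):
    if not centers:
        return 0.0
    phi = np.exp(-np.sum((np.array(centers) - x) ** 2, axis=1)
                 / np.array(widths) ** 2)
    return float(np.array(w) @ phi)

for _ in range(2000):                     # sequential learning on a toy stream
    x = rng.uniform(-3, 3, size=2)
    y = np.sin(x[0]) * np.cos(x[1])
    err = y - predict(x)
    dists = [np.linalg.norm(c - x) for c in centers] or [np.inf]
    if min(dists) > eps and abs(err) > sig_min:
        centers.append(x)                 # grow a neuron at the novel input
        widths.append(width0)
        w.append(err)
    elif centers:
        w[int(np.argmin(dists))] += 0.2 * err   # adapt the nearest neuron
    # prune neurons whose proxy significance has become negligible
    keep = [i for i in range(len(centers)) if abs(w[i]) > sig_min]
    centers = [centers[i] for i in keep]
    widths = [widths[i] for i in keep]
    w = [w[i] for i in keep]

print(len(centers), "hidden neurons retained")
```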

Journal ArticleDOI
15 Oct 2005 - Talanta
TL;DR: A new method to divide a pool of samples into calibration and validation subsets for multivariate modelling is proposed, and the results of F-tests at 95% confidence level reveal that the proposed technique may be an advantageous alternative to the other three strategies.

Journal ArticleDOI
TL;DR: It is found that optimized component rearrangements could substantially reduce total wiring length in all tested neural networks, suggesting that neural systems are not exclusively optimized for minimal global wiring, but for a variety of factors including the minimization of processing steps.
Abstract: It has been suggested that neural systems across several scales of organization show optimal component placement, in which any spatial rearrangement of the components would lead to an increase of total wiring. Using extensive connectivity datasets for diverse neural networks combined with spatial coordinates for network nodes, we applied an optimization algorithm to the network layouts, in order to search for wire-saving component rearrangements. We found that optimized component rearrangements could substantially reduce total wiring length in all tested neural networks. Specifically, total wiring among 95 primate (Macaque) cortical areas could be decreased by 32%, and wiring of neuronal networks in the nematode Caenorhabditis elegans could be reduced by 48% on the global level, and by 49% for neurons within frontal ganglia. Wiring length reductions were possible due to the existence of long-distance projections in neural networks. We explored the role of these projections by comparing the original networks with minimally rewired networks of the same size, which possessed only the shortest possible connections. In the minimally rewired networks, the number of processing steps along the shortest paths between components was significantly increased compared to the original networks. Additional benchmark comparisons also indicated that neural networks are more similar to network layouts that minimize the length of processing paths, rather than wiring length. These findings suggest that neural systems are not exclusively optimized for minimal global wiring, but for a variety of factors including the minimization of processing steps.
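
The rearrangement search can be illustrated with a simple greedy swap optimizer over component positions; the random network and coordinates below are stand-ins for the primate and C. elegans datasets, and the paper's actual optimization algorithm may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30
adj = rng.random((n, n)) < 0.1                 # toy connectivity matrix
pos = rng.random((n, 2))                       # spatial component coordinates

def total_wiring(adj, pos):
    # sum of Euclidean lengths over all existing connections
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return d[adj].sum()

best = total_wiring(adj, pos)
for _ in range(20000):                         # swap-based layout optimization
    i, j = rng.integers(n, size=2)
    pos[[i, j]] = pos[[j, i]]                  # swap two components' positions
    cost = total_wiring(adj, pos)
    if cost < best:
        best = cost                            # keep wire-saving swaps
    else:
        pos[[i, j]] = pos[[j, i]]              # undo worsening swaps
print("optimized wiring length:", best)
```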

Journal ArticleDOI
TL;DR: The GMM-based limb motion classification system demonstrates exceptional classification accuracy and results in a robust method of motion classification with low computational load.
Abstract: This paper introduces and evaluates the use of Gaussian mixture models (GMMs) for multiple limb motion classification using continuous myoelectric signals. The focus of this work is to optimize the configuration of this classification scheme. To that end, a complete experimental evaluation of this system is conducted on a 12 subject database. The experiments examine the GMM's algorithmic issues, including model order selection and variance limiting, the segmentation of the data, and various feature sets including time-domain features and autoregressive features. The benefits of postprocessing the results using a majority vote rule are demonstrated. The performance of the GMM is compared to three commonly used classifiers: a linear discriminant analysis, a linear perceptron network, and a multilayer perceptron neural network. The GMM-based limb motion classification system demonstrates exceptional classification accuracy and results in a robust method of motion classification with low computational load.
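
A scikit-learn sketch of this scheme follows: one Gaussian mixture per motion class, maximum-likelihood assignment per frame, and a majority vote over a window of consecutive decisions. Feature extraction from the myoelectric signal is omitted; the synthetic features and model order are illustrative, with reg_covar playing roughly the role of the paper's variance limiting.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
n_classes = 4
X_train = [rng.normal(loc=c, size=(200, 6)) for c in range(n_classes)]

# fit one GMM per class on that class's training features
gmms = [GaussianMixture(n_components=4, reg_covar=1e-3).fit(X)
        for X in X_train]

def classify(frames):
    # per-frame decision: class whose GMM gives the highest log-likelihood
    ll = np.column_stack([g.score_samples(frames) for g in gmms])
    decisions = ll.argmax(axis=1)
    # majority vote over the window smooths spurious single-frame errors
    return np.bincount(decisions, minlength=n_classes).argmax()

window = rng.normal(loc=2, size=(10, 6))   # ten consecutive feature frames
print(classify(window))                     # expected: class 2
```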

Journal ArticleDOI
TL;DR: Past research is extended by providing an advanced, genetic algorithm based, multilayered structural optimization strategy that can assist both in the proper representation of traffic flow data with temporal and spatial characteristics as well as in the selection of the appropriate neural network structure.
Abstract: Short-term forecasting of traffic parameters such as flow and occupancy is an essential element of modern Intelligent Transportation Systems research and practice. Although many different methodologies have been used for short-term predictions, literature suggests neural networks as one of the best alternatives for modeling and predicting traffic parameters. However, because of limited knowledge regarding a network's optimal structure given a specific dataset, researchers have to rely on time-consuming and questionably efficient rules of thumb when developing them. This paper extends past research by providing an advanced, genetic algorithm based, multilayered structural optimization strategy that can assist both in the proper representation of traffic flow data with temporal and spatial characteristics as well as in the selection of the appropriate neural network structure. Further, it evaluates the performance of the developed network by applying it to both univariate and multivariate traffic flow data from an urban signalized arterial. The results show that the capabilities of a simple static neural network, with genetically optimized step size, momentum and number of hidden units, are very satisfactory when modeling both univariate and multivariate traffic data.
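
A compact sketch of the kind of genetic-algorithm search this automates follows, over step size, momentum, and hidden-unit count; the fitness function is a placeholder that in practice would train a network on the traffic data and return the negated validation error.

```python
import random

random.seed(0)

def random_genome():
    return {"hidden": random.randint(2, 64),
            "step": 10 ** random.uniform(-4, -1),
            "momentum": random.uniform(0.0, 0.9)}

def fitness(g):
    # placeholder: stands in for "train a network with g, return -val_error"
    return -((g["hidden"] - 24) ** 2) - 100 * (g["step"] - 0.01) ** 2

def crossover(a, b):
    return {k: random.choice([a[k], b[k]]) for k in a}

def mutate(g):
    g = dict(g)
    k = random.choice(list(g))
    g[k] = random_genome()[k]            # resample one gene
    return g

pop = [random_genome() for _ in range(20)]
for generation in range(30):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                   # truncation selection
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]
print(max(pop, key=fitness))
```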

Journal ArticleDOI
TL;DR: Locally weighted projection regression is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
Abstract: Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high-dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high-dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it (1) learns rapidly with second-order learning methods based on incremental training, (2) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, (3) adjusts its weighting kernels based on only local information in order to minimize the danger of negative interference of incremental learning, (4) has a computational complexity that is linear in the number of inputs, and (5) can deal with a large number of—possibly redundant—inputs, as shown in various empirical evaluations with up to 90 dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
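
A simplified numpy sketch of the core idea follows: predictions are a normalized, receptive-field-weighted blend of local linear models. The full algorithm's incremental partial-least-squares fits, receptive-field adaptation, and leave-one-out statistics are all omitted here; the centers, bandwidth, and batch fits are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

centers = np.linspace(-3, 3, 10)[:, None]    # receptive-field centers
D = 1.0                                      # kernel bandwidth

def weights(X, c):
    # Gaussian receptive field around center c
    return np.exp(-0.5 * np.sum((X - c) ** 2, axis=1) / D ** 2)

# fit one weighted least-squares linear model per receptive field
A = np.column_stack([X, np.ones(len(X))])    # affine design matrix
models = []
for c in centers:
    w = weights(X, c)
    WA = A * w[:, None]
    beta = np.linalg.lstsq(WA.T @ A, WA.T @ y, rcond=None)[0]
    models.append(beta)

def predict(Xq):
    Aq = np.column_stack([Xq, np.ones(len(Xq))])
    w = np.column_stack([weights(Xq, c) for c in centers])    # (n, K)
    preds = Aq @ np.column_stack(models)                      # local predictions
    return np.sum(w * preds, axis=1) / np.sum(w, axis=1)      # weighted blend

print(predict(np.array([[0.5]])))    # should be close to sin(0.5) ~ 0.48
```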

Journal ArticleDOI
TL;DR: The experimental results show that accurate predictions can be achieved with a standard feedforward neural network trained with the Levenberg–Marquardt algorithm, which provides the best results for forecasts up to 18 months ahead.

Journal ArticleDOI
TL;DR: In this article, forecasting techniques to predict the 24 market-clearing prices of a day-ahead electric energy market were considered, including time series analysis, neural networks and wavelets, and extensive analysis was conducted using data from the PJM Interconnection.

Journal ArticleDOI
TL;DR: Two fundamentally different approaches for designing classification models (classifiers) are introduced: the traditional statistical method based on logistic regression, and the emerging computationally powerful techniques based on ANNs.

Journal ArticleDOI
TL;DR: Particle identification with boosting algorithms performs better than with artificial neural networks for the MiniBooNE experiment, and it is expected that boosting algorithms will find wide application in physics.
Abstract: The efficacy of particle identification is compared using artificial neural networks and boosted decision trees. The comparison is performed in the context of MiniBooNE, an experiment at Fermilab searching for neutrino oscillations. Based on studies of Monte Carlo samples of simulated data, particle identification with boosting algorithms has better performance than that with artificial neural networks for the MiniBooNE experiment. Although the tests in this paper were for one experiment, it is expected that boosting algorithms will find wide application in physics.
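
The comparison can be sketched with scikit-learn stand-ins, boosted decision trees versus a feedforward network on the same features; the synthetic two-class data replaces the Monte Carlo particle-identification samples, and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

boost = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                           n_estimators=200).fit(X_tr, y_tr)
ann = MLPClassifier(hidden_layer_sizes=(30,), max_iter=1000).fit(X_tr, y_tr)

print("boosted trees:", boost.score(X_te, y_te))
print("neural net:   ", ann.score(X_te, y_te))
```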

Journal ArticleDOI
TL;DR: This work reviews network models of internally generated activity, focusing on three types of network dynamics: sustained responses to transient stimuli, which provide a model of working memory; oscillatory network activity; and chaotic activity, which models complex patterns of background spiking in cortical and other circuits.
Abstract: Neural network modeling is often concerned with stimulus-driven responses, but most of the activity in the brain is internally generated. Here, we review network models of internally generated activity, focusing on three types of network dynamics: (a) sustained responses to transient stimuli, which provide a model of working memory; (b) oscillatory network activity; and (c) chaotic activity, which models complex patterns of background spiking in cortical and other circuits. We also review propagation of stimulus-driven activity through spontaneously active networks. Exploring these aspects of neural network dynamics is critical for understanding how neural circuits produce cognitive function.
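
The third regime, chaotic internally generated activity, has a classic minimal model: a random recurrent rate network whose coupling gain exceeds one. The numpy sketch below is an illustration of that model only; the network size, gain, and integration step are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(11)
N, g, dt, steps = 500, 1.5, 0.1, 2000
J = rng.normal(scale=g / np.sqrt(N), size=(N, N))   # random coupling matrix

x = rng.normal(size=N)                              # initial network state
trace = []
for _ in range(steps):                              # Euler integration of
    x += dt * (-x + J @ np.tanh(x))                 # dx/dt = -x + J tanh(x)
    trace.append(x[0])
# with g > 1, 'trace' shows ongoing irregular fluctuations despite the
# absence of any external input: internally generated, chaotic activity
```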

Journal ArticleDOI
TL;DR: The input determination methodology is applied to a real-world case study in order to determine suitable model inputs for forecasting salinity in the River Murray, South Australia, 14 days in advance.

Journal ArticleDOI
TL;DR: The obtained results demonstrated that the proposed RNNs employing the Lyapunov exponents can be useful in analyzing long-term EEG signals for early detection of the electroencephalographic changes.
Abstract: There are a number of different quantitative models that can be used in a medical diagnostic decision support system, including parametric methods, non-parametric methods and several neural network models. Unfortunately, there is no theory available to guide model selection. The aim of this study is to evaluate the diagnostic accuracy of recurrent neural networks (RNNs) employing Lyapunov exponents trained with the Levenberg-Marquardt algorithm on electroencephalogram (EEG) signals. An approach based on the consideration that the EEG signals are chaotic signals was used in developing a reliable classification method for electroencephalographic changes. This consideration was tested successfully using non-linear dynamics tools, like the computation of Lyapunov exponents. We explored the ability of designed and trained Elman RNNs, combined with the Lyapunov exponents, to discriminate the EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures). The RNNs achieved accuracy rates which were higher than those of the feedforward neural network models. The obtained results demonstrated that the proposed RNNs employing the Lyapunov exponents can be useful in analyzing long-term EEG signals for early detection of electroencephalographic changes.
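
A sketch of the classification stage follows, using PyTorch's nn.RNN (an Elman-type network); the input sequences are placeholders standing in for Lyapunov-exponent features extracted from EEG segments, and the sizes and three-class setup only loosely mirror the paper's task.

```python
import torch
import torch.nn as nn

class ElmanClassifier(nn.Module):
    def __init__(self, n_in=1, n_hidden=32, n_classes=3):
        super().__init__()
        self.rnn = nn.RNN(n_in, n_hidden, batch_first=True)  # Elman network
        self.out = nn.Linear(n_hidden, n_classes)

    def forward(self, x):              # x: (batch, seq_len, n_in)
        _, h = self.rnn(x)             # h: final hidden state (1, batch, hidden)
        return self.out(h.squeeze(0))  # one class-score vector per sequence

model = ElmanClassifier()
features = torch.randn(8, 64, 1)       # 8 segments x 64 feature values each
logits = model(features)               # (8, 3): healthy / interictal / seizure
```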

Journal ArticleDOI
TL;DR: Several sufficient conditions are derived for the existence, uniqueness, and GRS of equilibria for interval neural networks with time delays by use of a new Lyapunov function and matrix inequality.
Abstract: In this paper, two related problems, global asymptotic stability (GAS) and global robust stability (GRS) of neural networks with time delays, are studied. First, GAS of delayed neural networks is discussed based on the Lyapunov method and linear matrix inequalities. New criteria are given to ascertain the GAS of delayed neural networks. In the design and application of neural networks, it is necessary to consider the deviation effects of bounded perturbations of network parameters. In this case, a delayed neural network must be formulated as an interval neural network model. Several sufficient conditions are derived for the existence, uniqueness, and GRS of equilibria for interval neural networks with time delays by use of a new Lyapunov function and matrix inequality. These results are less restrictive than those given in the earlier references.
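
As a generic illustration of the type of analysis involved (a sketch of the standard approach, not the paper's exact construction), consider a delayed network and a Lyapunov-Krasovskii functional of the usual form:

```latex
\dot{x}(t) = -C\,x(t) + A f(x(t)) + B f(x(t-\tau)) + u,
\qquad
V(x_t) = x^{\top}(t)\,P\,x(t)
       + \int_{t-\tau}^{t} f^{\top}(x(s))\,Q\,f(x(s))\,\mathrm{d}s,
\qquad P \succ 0,\; Q \succ 0 .
```

Global asymptotic stability follows when dV/dt < 0 along trajectories; for sector-bounded activations f, that requirement can be rearranged into a linear matrix inequality in P and Q and checked with standard solvers.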


Proceedings ArticleDOI
05 Dec 2005
TL;DR: This paper takes a renewed look at signal classification using spectral coherence and neural networks; the performance of the approach is characterized by Monte Carlo simulations.
Abstract: Channel sensing and spectrum allocation have long been of interest as a prospective addition to cognitive radios for wireless communications systems occupying license-free bands. Conventional approaches to cyclic spectral analysis have been proposed as a method for classifying signals in applications where the carrier frequencies and bandwidths are unknown, but they are computationally complex and require a significant amount of observation time for adequate performance. Neural networks have been used for signal classification, but only for situations where the baseband signal is present. By combining these techniques, a more efficient and reliable classifier can be developed in which a significant amount of processing is performed offline, thus reducing online computation. In this paper we take a renewed look at signal classification using spectral coherence and neural networks, the performance of which is characterized by Monte Carlo simulations.
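
A sketch combining the two ingredients, coherence features computed offline with scipy and a small neural-network classifier, follows; the synthetic tones, reference signal, and two "signal classes" are placeholders, since the paper's features come from cyclic spectral analysis of signals with unknown carriers.

```python
import numpy as np
from scipy.signal import coherence
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(9)
fs = 1e3
t = np.arange(4096) / fs

def features(sig, ref):
    # magnitude-squared coherence between signal and reference vs. frequency
    _, cxy = coherence(sig, ref, fs=fs, nperseg=256)
    return cxy

ref = np.sin(2 * np.pi * 100 * t)
X, y = [], []
for label in (0, 1):                      # two synthetic "signal classes"
    tone = 100 if label == 0 else 250
    for _ in range(50):
        sig = np.sin(2 * np.pi * tone * t) + rng.normal(size=t.size)
        X.append(features(sig, ref))
        y.append(label)

clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000)
clf.fit(np.array(X), np.array(y))         # classifier trained offline
```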

Journal ArticleDOI
TL;DR: Simulation results substantiate the theoretical analysis and demonstrate the efficacy of the neural model on time-varying matrix inversion, especially when using a power-sigmoid activation function.
Abstract: Following the idea of using first-order time derivatives, this paper presents a general recurrent neural network (RNN) model for online inversion of time-varying matrices. Different kinds of activation functions are investigated to guarantee the global exponential convergence of the neural model to the exact inverse of a given time-varying matrix. The robustness of the proposed neural model is also studied with respect to different activation functions and various implementation errors. Simulation results, including the application to kinematic control of redundant manipulators, substantiate the theoretical analysis and demonstrate the efficacy of the neural model on time-varying matrix inversion, especially when using a power-sigmoid activation function.
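
A numpy sketch of recurrent dynamics of this kind follows: the state X evolves so that the error E = A(t)X - I is driven toward zero through an odd activation function. Euler integration, the gain value, and the linear activation used in place of the paper's power-sigmoid function are illustrative simplifications.

```python
import numpy as np

def A(t):       # an example time-varying, always-invertible 2x2 matrix
    return np.array([[2 + np.sin(t), 0.5 * np.cos(t)],
                     [0.5 * np.cos(t), 2 - np.sin(t)]])

def A_dot(t):   # its time derivative
    return np.array([[np.cos(t), -0.5 * np.sin(t)],
                     [-0.5 * np.sin(t), -np.cos(t)]])

gamma, h = 100.0, 1e-4                  # convergence gain, Euler step size
phi = lambda E: E                       # odd activation (linear, for simplicity)

X = np.eye(2)                           # initial state (not the true inverse)
for k in range(200000):
    t = k * h
    # dynamics X' = -X A'(t) X - gamma * X phi(A(t) X - I), obtained by
    # imposing E' = -gamma * phi(E) on the error E = A X - I
    E = A(t) @ X - np.eye(2)
    X = X + h * (-X @ A_dot(t) @ X - gamma * X @ phi(E))

print(np.max(np.abs(A(t) @ X - np.eye(2))))   # residual should be small
```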