Topic

Feedforward neural network

About: Feedforward neural network is a research topic. Over its lifetime, 11,431 publications have been published within this topic, receiving 310,905 citations. The topic is also known as: feed-forward neural network & feed forward neural network.


Papers
Book
01 Jul 1994
TL;DR: Neural nets based on competition, adaptive resonance theory, and the backpropagation neural net are among the architectures studied in this book.
Abstract: 1. Introduction. 2. Simple Neural Nets for Pattern Classification. 3. Pattern Association. 4. Neural Networks Based on Competition. 5. Adaptive Resonance Theory. 6. Backpropagation Neural Net. 7. A Sampler of Other Neural Nets. Glossary. References. Index.

2,665 citations

Journal Article (DOI)
TL;DR: The NLPCA method is demonstrated using time-dependent, simulated batch reaction data; results show that it successfully reduces dimensionality and produces a feature space map resembling the actual distribution of the underlying system parameters.
Abstract: Nonlinear principal component analysis is a novel technique for multivariate data analysis, similar to the well-known method of principal component analysis. NLPCA, like PCA, is used to identify and remove correlations among problem variables as an aid to dimensionality reduction, visualization, and exploratory data analysis. While PCA identifies only linear correlations between variables, NLPCA uncovers both linear and nonlinear correlations, without restriction on the character of the nonlinearities present in the data. NLPCA operates by training a feedforward neural network to perform the identity mapping, where the network inputs are reproduced at the output layer. The network contains an internal “bottleneck” layer (containing fewer nodes than input or output layers), which forces the network to develop a compact representation of the input data, and two additional hidden layers. The NLPCA method is demonstrated using time-dependent, simulated batch reaction data. Results show that NLPCA successfully reduces dimensionality and produces a feature space map resembling the actual distribution of the underlying system parameters.

2,643 citations
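
As a rough illustration of the bottleneck architecture described in the abstract above, the following sketch trains a small identity-mapping network (a tanh mapping layer, a linear bottleneck, a tanh de-mapping layer, and a linear output) on synthetic 3-D data with plain NumPy gradient descent. The layer sizes, tanh activations, learning rate, and synthetic curve are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-D data that actually lives on a 1-D curve (illustrative, not the
# batch-reaction data used in the paper).
t = rng.uniform(-1.0, 1.0, size=(500, 1))
X = np.hstack([t, t**2, np.sin(np.pi * t)]) + 0.01 * rng.normal(size=(500, 3))

d, m, k = X.shape[1], 8, 1          # input dim, mapping-layer width, bottleneck size
W1 = 0.1 * rng.normal(size=(d, m)); b1 = np.zeros(m)   # mapping layer (tanh)
W2 = 0.1 * rng.normal(size=(m, k)); b2 = np.zeros(k)   # linear bottleneck
W3 = 0.1 * rng.normal(size=(k, m)); b3 = np.zeros(m)   # de-mapping layer (tanh)
W4 = 0.1 * rng.normal(size=(m, d)); b4 = np.zeros(d)   # linear output

lr, n = 0.05, X.shape[0]
for step in range(5000):
    # Forward pass: the network tries to reproduce its inputs at the output.
    h1 = np.tanh(X @ W1 + b1)
    z = h1 @ W2 + b2                 # nonlinear "principal component" scores
    h2 = np.tanh(z @ W3 + b3)
    Xhat = h2 @ W4 + b4
    # Backward pass for mean squared reconstruction error.
    d4 = (Xhat - X) / n
    d3 = (d4 @ W4.T) * (1 - h2**2)
    d2 = d3 @ W3.T
    d1 = (d2 @ W2.T) * (1 - h1**2)
    W4 -= lr * h2.T @ d4; b4 -= lr * d4.sum(0)
    W3 -= lr * z.T @ d3;  b3 -= lr * d3.sum(0)
    W2 -= lr * h1.T @ d2; b2 -= lr * d2.sum(0)
    W1 -= lr * X.T @ d1;  b1 -= lr * d1.sum(0)

print("reconstruction MSE:", np.mean((Xhat - X) ** 2))
```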

Proceedings Article (DOI)
01 Jan 2014
TL;DR: The first distributed training of LSTM RNNs using asynchronous stochastic gradient descent optimization on a large cluster of machines is introduced, and it is shown that a two-layer deep LSTM RNN, where each LSTM layer has a linear recurrent projection layer, can exceed state-of-the-art speech recognition performance.
Abstract: Long Short-Term Memory (LSTM) is a specific recurrent neural network (RNN) architecture that was designed to model temporal sequences and their long-range dependencies more accurately than conventional RNNs. In this paper, we explore LSTM RNN architectures for large scale acoustic modeling in speech recognition. We recently showed that LSTM RNNs are more effective than DNNs and conventional RNNs for acoustic modeling, considering moderately-sized models trained on a single machine. Here, we introduce the first distributed training of LSTM RNNs using asynchronous stochastic gradient descent optimization on a large cluster of machines. We show that a two-layer deep LSTM RNN where each LSTM layer has a linear recurrent projection layer can exceed state-of-the-art speech recognition performance. This architecture makes more effective use of model parameters than the others considered, converges quickly, and outperforms a deep feed forward neural network having an order of magnitude more parameters. Index Terms: Long Short-Term Memory, LSTM, recurrent neural network, RNN, speech recognition, acoustic modeling.

2,492 citations
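
To make the linear recurrent projection layer concrete, here is a minimal NumPy forward pass (no training) for stacked LSTM layers in which the output of each layer's memory cells is projected down to a smaller recurrent state before being fed back and passed upward. The layer sizes, random initialization, and toy input sequence are illustrative assumptions; the paper's distributed asynchronous-SGD training is not shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstmp_forward(X, params, n_cells, n_proj):
    """Forward pass of one LSTM layer with a linear recurrent projection.

    X: (T, n_in) input sequence. Returns the (T, n_proj) projected outputs,
    which serve both as the recurrent state and as the input to the next layer.
    """
    Wx, Wr, b, Wp = params          # gate weights on input / projection, biases, projection matrix
    T = X.shape[0]
    c = np.zeros(n_cells)           # memory cells
    r = np.zeros(n_proj)            # projected recurrent state
    out = np.zeros((T, n_proj))
    for t in range(T):
        gates = X[t] @ Wx + r @ Wr + b              # all four gates in one matmul
        i, f, o, g = np.split(gates, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                           # cell state update
        m = o * np.tanh(c)                          # cell output (size n_cells)
        r = m @ Wp                                  # linear projection (size n_proj < n_cells)
        out[t] = r
    return out

def init_lstmp(n_in, n_cells, n_proj, rng):
    s = 0.1
    return (s * rng.normal(size=(n_in, 4 * n_cells)),
            s * rng.normal(size=(n_proj, 4 * n_cells)),
            np.zeros(4 * n_cells),
            s * rng.normal(size=(n_cells, n_proj)))

# Two stacked LSTMP layers on a toy sequence of 40-dim frames
# (sizes are illustrative, not those used in the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))
layer1 = init_lstmp(40, 256, 64, rng)
layer2 = init_lstmp(64, 256, 64, rng)
h1 = lstmp_forward(X, layer1, 256, 64)
h2 = lstmp_forward(h1, layer2, 256, 64)
print(h2.shape)   # (100, 64)
```

Because the recurrent and inter-layer connections see only the n_proj-dimensional projected state rather than all n_cells memory cells, the weight matrices shrink accordingly, which is consistent with the abstract's claim that the architecture uses model parameters more effectively.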

Journal Article (DOI)
TL;DR: This paper proves, by an incremental constructive method, that for SLFNs to work as universal approximators, one may simply choose hidden nodes at random and then adjust only the output weights linking the hidden layer and the output layer.
Abstract: According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjustable. However, as observed in most neural network implementations, tuning all the parameters of the networks may make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves, by an incremental constructive method, that for SLFNs to work as universal approximators, one may simply choose hidden nodes at random and then adjust only the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions g: R → R, and the activation functions for RBF nodes can be any integrable piecewise continuous functions g: R → R with ∫_R g(x) dx ≠ 0. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a new network is fully automatic, and users need not intervene in the learning process by manually tuning control parameters.

2,413 citations
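
A minimal sketch of the incremental idea above for a single-output regression problem: hidden nodes are generated at random one at a time, and only each node's output weight is fitted against the current residual error. The sigmoid activation, node count, and toy data are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-output regression problem (illustrative).
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2

def g(a):                       # bounded nonconstant piecewise-continuous activation
    return 1.0 / (1.0 + np.exp(-a))

residual = y.copy()
hidden = []                     # (w, b, beta) for each added node
for _ in range(200):            # add random hidden nodes one at a time
    w = rng.uniform(-1, 1, size=X.shape[1])   # hidden parameters are NOT tuned
    b = rng.uniform(-1, 1)
    h = g(X @ w + b)
    beta = (residual @ h) / (h @ h)   # only the output weight is fitted
    residual -= beta * h              # shrink the residual error
    hidden.append((w, b, beta))

def predict(Xq):
    return sum(beta * g(Xq @ w + b) for w, b, beta in hidden)

print("training RMSE:", np.sqrt(np.mean((predict(X) - y) ** 2)))
```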

Journal Article (DOI)
TL;DR: In this article, the authors review how optimal data selection techniques have been used with feed-forward neural networks and show how the same principles may be applied to select data for two alternative, statistically based learning architectures: mixtures of Gaussians and locally weighted regression.
Abstract: For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.

2,122 citations
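
Below is a simplified sketch of variance-based data selection for locally weighted regression: estimate the learner's predictive variance at a grid of candidate inputs and query the point where that variance is largest. This uncertainty-sampling shortcut stands in for, but is not identical to, the expected-variance-reduction criterion derived in the paper; the kernel bandwidth, ridge term, candidate grid, and toy target function are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(4 * x)                 # unknown target function (toy assumption)

X = rng.uniform(0, 1, size=8)               # small initial training set
y = f(X) + 0.05 * rng.normal(size=X.size)

def lwr_predict(xq, X, y, h=0.1):
    """Locally weighted linear regression at query xq; returns the prediction
    and a rough variance estimate for that prediction."""
    w = np.exp(-(X - xq) ** 2 / (2 * h ** 2))              # Gaussian kernel weights
    A = np.column_stack([np.ones_like(X), X])              # bias + linear term
    AtWA = A.T @ (w[:, None] * A) + 1e-6 * np.eye(2)       # small ridge for stability (assumption)
    beta = np.linalg.solve(AtWA, A.T @ (w * y))
    aq = np.array([1.0, xq])
    yhat = aq @ beta
    sigma2 = np.sum(w * (y - A @ beta) ** 2) / np.sum(w)   # local noise estimate
    var = sigma2 * aq @ np.linalg.solve(AtWA, aq)          # variance of the fit at xq
    return yhat, var

candidates = np.linspace(0, 1, 101)
for _ in range(10):                          # actively grow the training set
    variances = [lwr_predict(xq, X, y)[1] for xq in candidates]
    xnew = candidates[int(np.argmax(variances))]     # query the most uncertain point
    X = np.append(X, xnew)
    y = np.append(y, f(xnew) + 0.05 * rng.normal())

print("queried points:", np.round(np.sort(X[8:]), 2))
```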


Network Information
Related Topics (5)
Artificial neural network: 207K papers, 4.5M citations, 95% related
Feature extraction: 111.8K papers, 2.1M citations, 89% related
Fuzzy logic: 151.2K papers, 2.3M citations, 87% related
Control theory: 299.6K papers, 3.1M citations, 87% related
Optimization problem: 96.4K papers, 2.1M citations, 87% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2024: 1
2023: 116
2022: 309
2021: 451
2020: 529
2019: 488