
Journal ISSN: 0899-7667

Neural Computation 

About: Neural Computation is an academic journal. The journal publishes primarily in the areas: Artificial neural network & Population. It has an ISSN identifier of 0899-7667. Over its lifetime, 3,182 publications have been published, receiving 381,647 citations.
Papers

Journal ArticleDOI
Sepp Hochreiter, Jürgen Schmidhuber
01 Nov 1997-Neural Computation
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

49,735 citations
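
The constant error carousel described above is the additive cell-state update in what is now the standard LSTM cell, with multiplicative gates controlling access to it. Below is a minimal NumPy sketch of one LSTM step; the stacked weight layout, the inclusion of a forget gate (a later extension by Gers et al., not part of the 1997 formulation), and the toy usage loop are illustrative assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of a single LSTM step, illustrating the additive cell-state
# update (the "constant error carousel") and the multiplicative gates.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x: input (d,); h_prev, c_prev: previous hidden and cell state (n,);
    W: (4n, d), U: (4n, n), b: (4n,) stacked parameters for the
    input gate, forget gate, output gate, and candidate cell input.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0 * n:1 * n])        # input gate
    f = sigmoid(z[1 * n:2 * n])        # forget gate (later extension)
    o = sigmoid(z[2 * n:3 * n])        # output gate
    g = np.tanh(z[3 * n:4 * n])        # candidate cell input
    c = f * c_prev + i * g             # additive update: error flows through c
    h = o * np.tanh(c)                 # gated output
    return h, c

# Usage: run a toy sequence of length 10 through the cell.
rng = np.random.default_rng(0)
d, n = 3, 5
W, U, b = rng.normal(size=(4 * n, d)), rng.normal(size=(4 * n, n)), np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(10, d)):
    h, c = lstm_step(x, h, c, W, U, b)
```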


Journal ArticleDOI
01 Jul 2006-Neural Computation
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.

13,005 citations
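
The greedy layer-wise procedure described above trains a stack of restricted Boltzmann machines, each on the hidden activities produced by the previous one. The sketch below shows one-step contrastive divergence (CD-1) and the stacking loop in NumPy; bias terms, the label units, and the wake-sleep fine-tuning stage are omitted, and all layer sizes, data, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of greedy layer-wise pretraining with RBMs and CD-1.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.05):
    """Train one RBM with CD-1; return (weights, hidden activation probabilities)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    for _ in range(epochs):
        for v0 in data:
            h0_prob = sigmoid(v0 @ W)
            h0 = (rng.random(n_hidden) < h0_prob).astype(float)   # sample hidden units
            v1_prob = sigmoid(h0 @ W.T)                           # reconstruction
            h1_prob = sigmoid(v1_prob @ W)
            # CD-1 update: data correlations minus reconstruction correlations
            W += lr * (np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob))
    return W, sigmoid(data @ W)

# Greedy stacking: each RBM's hidden activations become the next layer's data.
data = (rng.random((200, 20)) < 0.3).astype(float)   # toy binary "images"
layer_sizes = [15, 10]
activations, weights = data, []
for n_hidden in layer_sizes:
    W, activations = train_rbm(activations, n_hidden)
    weights.append(W)
```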


Journal ArticleDOI
01 Nov 1995-Neural Computation
TL;DR: It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing and dependencies of information transfer on time delays are derived.
Abstract: We derive a new self-organizing learning algorithm that maximizes the information transferred in a network of nonlinear units. The algorithm does not assume any knowledge of the input distributions, and is defined here for the zero-noise limit. Under these conditions, information maximization has extra properties not found in the linear case (Linsker 1989). The nonlinearities in the transfer function are able to pick up higher-order moments of the input distributions and perform something akin to true redundancy reduction between units in the output representation. This enables the network to separate statistically independent components in the inputs: a higher-order generalization of principal components analysis. We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers. We also show that a variant on the network architecture is able to perform blind deconvolution (cancellation of unknown echoes and reverberation in a speech signal). Finally, we derive dependencies of information transfer on time delays. We suggest that information maximization provides a unifying framework for problems in "blind" signal processing.

8,549 citations
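
The information-maximization rule described above adapts an unmixing matrix so that the outputs of sigmoidal units carry maximal information about their inputs, which drives the network to separate statistically independent sources. The sketch below uses the natural-gradient form of the infomax update rather than the original 1995 gradient rule; the toy super-Gaussian sources, mixing matrix, step size, and iteration count are illustrative assumptions.

```python
# Minimal sketch of infomax ICA (natural-gradient form) for blind source separation.
import numpy as np

rng = np.random.default_rng(0)

# Two toy super-Gaussian (Laplacian-like) sources, linearly mixed.
n_samples = 5000
S = np.sign(rng.normal(size=(2, n_samples))) * rng.exponential(size=(2, n_samples))
A = np.array([[1.0, 0.6], [0.4, 1.0]])       # unknown mixing matrix
X = A @ S                                     # observed mixtures

W = np.eye(2)                                 # unmixing matrix to learn
lr = 0.01
for _ in range(500):
    U = W @ X                                 # current source estimates
    Y = 1.0 / (1.0 + np.exp(-U))              # sigmoidal unit outputs
    # Natural-gradient infomax update: W <- W + lr * (I + (1 - 2Y) U^T) W
    W += lr * (np.eye(2) + (1.0 - 2.0 * Y) @ U.T / n_samples) @ W

# After enough iterations, W @ X approximates the sources up to scale and permutation.
recovered = W @ X
```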


Journal ArticleDOI
01 Jul 1998-Neural Computation
TL;DR: A new method for performing a nonlinear form of principal component analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.
Abstract: A new method for performing a nonlinear form of principal component analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map—for instance, the space of all possible five-pixel products in 16 × 16 images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.

7,611 citations
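
The method described above replaces an explicit nonlinear feature map with a kernel (Gram) matrix whose centered eigenvectors yield nonlinear principal components. A minimal NumPy sketch follows, assuming a degree-2 polynomial kernel and toy Gaussian data; these choices are illustrative, not the paper's experimental setup.

```python
# Minimal sketch of kernel PCA with a polynomial kernel.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                   # toy data: 100 points in 2-D

degree = 2
K = (X @ X.T + 1.0) ** degree                   # polynomial kernel (Gram) matrix

# Center the kernel matrix in feature space: K_c = K - 1K - K1 + 1K1
n = K.shape[0]
one = np.full((n, n), 1.0 / n)
K_centered = K - one @ K - K @ one + one @ K @ one

# Eigendecompose (eigh returns eigenvalues in ascending order) and take the top 3.
eigvals, eigvecs = np.linalg.eigh(K_centered)
idx = np.argsort(eigvals)[::-1][:3]
alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))  # normalize coefficients

# Nonlinear principal components of the training points.
components = K_centered @ alphas
```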


Journal ArticleDOI
Yann LeCun, Bernhard E. Boser, John S. Denker, D. Henderson, +3 more
01 Dec 1989-Neural Computation
TL;DR: This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network, successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
Abstract: The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. This approach has been successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service. A single network learns the entire recognition operation, going from the normalized image of the character to the final classification.

7,328 citations
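
The architectural constraint the abstract refers to is local receptive fields with shared weights, i.e. convolution, so the same feature detector is replicated across the image. The sketch below shows only that forward computation for a single feature map in NumPy; the kernel size, image size, and nonlinearity are illustrative assumptions, and the paper's full multi-layer network and its backpropagation training are not reproduced.

```python
# Minimal sketch of a shared-weight (convolutional) feature map.
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(image, kernel):
    """Valid 2-D cross-correlation with a single shared kernel."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The same kernel (shared weights) is applied at every location.
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = rng.normal(size=(16, 16))        # a toy 16x16 "digit" image
kernel = rng.normal(size=(5, 5))         # one shared 5x5 feature detector
feature_map = np.tanh(conv2d_valid(image, kernel))   # 12x12 feature map
```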


Network Information
Related Journals (5)
IEEE Transactions on Neural Networks - 6.7K papers, 522K citations - 86% related
Journal of Machine Learning Research - 3.1K papers, 519.3K citations - 86% related
arXiv: Learning - 45K papers, 837.1K citations - 84% related
Nature Neuroscience - 6.2K papers, 1.1M citations - 83% related
Journal of Neurophysiology - 21.5K papers, 1.6M citations - 83% related
Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2021    102
2020    78
2019    88
2018    106
2017    111
2016    96

Top Attributes


Journal's top 5 most impactful authors

Shun-ichi Amari - 42 papers, 4.7K citations
Terrence J. Sejnowski - 33 papers, 5.5K citations
Masashi Sugiyama - 21 papers, 535 citations
Christof Koch - 14 papers, 703 citations
Terry Elliott - 13 papers, 179 citations