
Journal ISSN: 0893-6080

Neural Networks 

About: Neural Networks is an academic journal. The journal publishes mainly in the areas of artificial neural networks and recurrent neural networks. It has an ISSN identifier of 0893-6080. Over its lifetime, 4,821 publications have been published, receiving 312,118 citations.
Papers

Journal Article
TL;DR: It is rigorously established that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.
Abstract: This paper rigorously establishes that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available. In this sense, multilayer feedforward networks are a class of universal approximators.

15,834 citations
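
As a hedged illustration of the result above (not code from the paper), the sketch below fits a single hidden layer of logistic squashing units to a smooth one-dimensional target. The library (scikit-learn's MLPRegressor), the target function, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the universal-approximation idea: one hidden layer of
# sigmoid ("squashing") units approximating a smooth target function.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(2000, 1))
y = np.sin(X).ravel()                      # target function to approximate

# One hidden layer of 50 logistic (sigmoid) units; sizes are arbitrary choices.
net = MLPRegressor(hidden_layer_sizes=(50,), activation="logistic",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(X, y)

X_test = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
err = np.max(np.abs(net.predict(X_test) - np.sin(X_test).ravel()))
print(f"max approximation error on [-pi, pi]: {err:.4f}")
```

Adding more hidden units generally drives the approximation error down, which is the practical face of the "sufficiently many hidden units" condition in the theorem.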


Journal Article
Jürgen Schmidhuber
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Abstract: In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

11,176 citations



Journal Article
Aapo Hyvärinen, Erkki Oja
TL;DR: The basic theory and applications of ICA are presented; the goal of ICA is to find a linear representation of non-Gaussian data such that the components are statistically independent, or as independent as possible.
Abstract: A fundamental problem in neural network research, as well as in many other disciplines, is finding a suitable representation of multivariate data, i.e. random vectors. For reasons of computational and conceptual simplicity, the representation is often sought as a linear transformation of the original data. In other words, each component of the representation is a linear combination of the original variables. Well-known linear transformation methods include principal component analysis, factor analysis, and projection pursuit. Independent component analysis (ICA) is a recently developed method in which the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible. Such a representation seems to capture the essential structure of the data in many applications, including feature extraction and signal separation. In this paper, we present the basic theory and applications of ICA, and our recent work on the subject.

7,434 citations
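
The sketch below is a minimal illustration of the setting described in the abstract, using scikit-learn's FastICA implementation of the fixed-point algorithm associated with Hyvärinen and Oja. The synthetic sources and mixing matrix are assumptions made up for the example, not data from the paper.

```python
# Recover independent non-Gaussian sources from observed linear mixtures x = A s.
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                          # non-Gaussian source 1 (sinusoid)
s2 = np.sign(np.sin(3 * t))                 # non-Gaussian source 2 (square wave)
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5],                   # unknown mixing matrix (assumed here)
              [0.5, 1.0]])
X = S @ A.T                                 # observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)                # recovered sources, up to scale and order
print("estimated mixing matrix:\n", ica.mixing_)
```

As the abstract notes, the components are recovered only up to ordering and scaling, which is why the estimated mixing matrix need not match A column for column.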


Journal Article
TL;DR: The conclusion is that for almost any real-world generalization problem one should use some version of stacked generalization to minimize the generalization error rate.
Abstract: This paper introduces stacked generalization, a scheme for minimizing the generalization error rate of one or more generalizers. Stacked generalization works by deducing the biases of the generalizer(s) with respect to a provided learning set. This deduction proceeds by generalizing in a second space whose inputs are (for example) the guesses of the original generalizers when taught with part of the learning set and trying to guess the rest of it, and whose output is (for example) the correct guess. When used with multiple generalizers, stacked generalization can be seen as a more sophisticated version of cross-validation, exploiting a strategy more sophisticated than cross-validation's crude winner-takes-all for combining the individual generalizers. When used with a single generalizer, stacked generalization is a scheme for estimating (and then correcting for) the error of a generalizer which has been trained on a particular learning set and then asked a particular question. After introducing stacked generalization and justifying its use, this paper presents two numerical experiments. The first demonstrates how stacked generalization improves upon a set of separate generalizers for the NETtalk task of translating text to phonemes. The second demonstrates how stacked generalization improves the performance of a single surface-fitter. With the other experimental evidence in the literature, the usual arguments supporting cross-validation, and the abstract justifications presented in this paper, the conclusion is that for almost any real-world generalization problem one should use some version of stacked generalization to minimize the generalization error rate. This paper ends by discussing some of the variations of stacked generalization, and how it touches on other fields like chaos theory.

4,693 citations
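
As a hedged sketch of the scheme described above (not Wolpert's original NETtalk or surface-fitting experiments), the example below trains level-0 generalizers on cross-validation folds and feeds their out-of-fold guesses to a level-1 generalizer, via scikit-learn's StackingClassifier. The choice of estimators and the synthetic dataset are illustrative assumptions.

```python
# Stacked generalization: level-0 models produce out-of-fold predictions that
# become the inputs of a level-1 model, instead of winner-takes-all selection.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

level0 = [("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
          ("svm", SVC(probability=True, random_state=0))]
stack = StackingClassifier(estimators=level0,
                           final_estimator=LogisticRegression(),
                           cv=5)            # out-of-fold guesses train the level-1 model
stack.fit(X_tr, y_tr)
print("stacked accuracy:", stack.score(X_te, y_te))
```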


Network Information
Related Journals (5)
Neural Computation - 3.1K papers, 381.6K citations, 93% related
Neurocomputing - 16.5K papers, 389.6K citations, 93% related
IEEE Transactions on Neural Networks - 6.7K papers, 522K citations, 92% related
Pattern Recognition Letters - 7.5K papers, 279.1K citations, 82% related
arXiv: Learning - 45K papers, 837.1K citations, 82% related
Performance Metrics
No. of papers from the Journal in previous years
Year    Papers
2022    43
2021    390
2020    374
2019    203
2018    219
2017    159