Journal Article

First- and second-order methods for learning: between steepest descent and Newton's method

Roberto Battiti
01 Mar 1992 · Vol. 4, Iss. 2, pp. 141-166
TLDR
First- and second-order optimization methods for learning in feedforward neural networks are reviewed to illustrate the main characteristics of the different methods and their mutual relations.
Abstract
On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
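To make the first-order/second-order distinction concrete, here is a minimal sketch (not taken from the paper; the quadratic error function, matrices, and learning rate are illustrative assumptions) contrasting a steepest-descent step with a Newton step on the same error surface:

    import numpy as np

    # Stand-in training error E(w) = 0.5 * w^T A w - b^T w; A and b are made up.
    A = np.array([[3.0, 0.5],
                  [0.5, 1.0]])            # symmetric positive definite "Hessian"
    b = np.array([1.0, -2.0])

    def grad(w):                          # first-order information (as in backpropagation)
        return A @ w - b

    def hess(w):                          # second-order information
        return A

    w = np.zeros(2)
    eta = 0.1                             # learning rate

    w_steepest = w - eta * grad(w)                     # steepest-descent step
    w_newton = w - np.linalg.solve(hess(w), grad(w))   # Newton step: solve H p = g

    print("steepest descent:", w_steepest)
    print("Newton:", w_newton)            # exact minimizer for a quadratic error

On a quadratic error the Newton step reaches the minimizer in one iteration, while steepest descent needs many small steps; the methods reviewed in the paper can be read as compromises between these two extremes.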


Citations
Journal Article

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character and document recognition; gradient-based learning is used to synthesize complex decision surfaces that can classify such high-dimensional patterns.
Journal Article

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.
Journal Article

An information-maximization approach to blind separation and blind deconvolution

TL;DR: It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing, and dependencies of information transfer on time delays are derived.
Journal Article

Training feedforward networks with the Marquardt algorithm

TL;DR: The Marquardt algorithm for nonlinear least squares is presented and incorporated into the backpropagation algorithm for training feedforward neural networks; it is found to be much more efficient than the other techniques tested when the network contains no more than a few hundred weights.
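For orientation, a minimal sketch of the Marquardt (Levenberg-Marquardt) update on a small nonlinear least-squares problem; the curve-fitting example, the damping schedule, and all names are illustrative assumptions, not the implementation from the cited paper:

    import numpy as np

    # Fit y ~ w0 * exp(w1 * x) by least squares; the data are synthetic.
    x = np.linspace(0.0, 1.0, 20)
    y = 2.0 * np.exp(-1.5 * x) + 0.01 * np.random.default_rng(0).standard_normal(20)

    def residuals(w):
        return w[0] * np.exp(w[1] * x) - y

    def jacobian(w):
        return np.column_stack([np.exp(w[1] * x),              # d r / d w0
                                w[0] * x * np.exp(w[1] * x)])   # d r / d w1

    w, mu = np.array([1.0, 0.0]), 1e-2    # initial weights and damping parameter
    for _ in range(50):
        r, J = residuals(w), jacobian(w)
        # Marquardt step: solve (J^T J + mu * I) dw = -J^T r
        dw = np.linalg.solve(J.T @ J + mu * np.eye(2), -J.T @ r)
        if np.sum(residuals(w + dw) ** 2) < np.sum(r ** 2):
            w, mu = w + dw, mu * 0.5      # error decreased: accept step, relax damping
        else:
            mu *= 2.0                     # otherwise increase damping (closer to gradient descent)

    print("fitted parameters:", w)        # close to the true values (2.0, -1.5)

With large damping the step approaches a small gradient-descent step, and with small damping it approaches the Gauss-Newton step, which is why the method works well for networks with a moderate number of weights.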
Book

Pattern recognition and neural networks

TL;DR: In this self-contained account, Professor Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks.
References

Numerical recipes in C

Book

Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Classics in Applied Mathematics, 16)

TL;DR: In this book, Dennis and Schnabel present a modular system of algorithms for unconstrained minimization and nonlinear equations, developed from Newton's method for solving one equation in one unknown and from basic results on the convergence of sequences of real numbers.
Book

Numerical methods for unconstrained optimization and nonlinear equations

TL;DR: Covers Newton's method for nonlinear equations and unconstrained minimization, as well as methods for nonlinear least-squares problems with special structure.
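Both Dennis and Schnabel entries refer to the same core iteration; as a reminder, a minimal sketch of Newton's method for a system of nonlinear equations (the example system, starting point, and tolerance are made up for illustration):

    import numpy as np

    # Solve F(x) = 0 for an illustrative two-equation system.
    def F(x):
        return np.array([x[0] ** 2 + x[1] ** 2 - 1.0,   # unit circle
                         x[0] - x[1]])                   # diagonal line

    def J(x):                                            # Jacobian of F
        return np.array([[2.0 * x[0], 2.0 * x[1]],
                         [1.0, -1.0]])

    x = np.array([1.0, 0.5])                             # starting guess
    for _ in range(20):
        x = x - np.linalg.solve(J(x), F(x))              # Newton step: solve J s = F
        if np.linalg.norm(F(x)) < 1e-12:
            break

    print("root:", x)                                    # about (0.7071, 0.7071)

For unconstrained minimization the same iteration is applied to the gradient, with the Hessian in the role of the Jacobian; the safeguards discussed in the book (line searches, trust regions) are what make the pure step reliable in practice.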
Book

Adaptive Signal Processing

TL;DR: This book discusses adaptive arrays and adaptive beamforming, other adaptive algorithms and structures, and the z-transform in adaptive signal processing.
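The workhorse of this area is Widrow's LMS rule, an on-line stochastic-gradient update closely related to on-line first-order backpropagation; a minimal sketch, with the signals, filter length, and step size invented for illustration:

    import numpy as np

    # Identify an "unknown" FIR system with an LMS adaptive filter; data are synthetic.
    rng = np.random.default_rng(0)
    M, mu = 4, 0.05                           # filter length and step size
    x = rng.standard_normal(500)              # input signal
    d = np.convolve(x, [0.6, -0.3, 0.1], mode="full")[:len(x)]  # desired signal

    w = np.zeros(M)
    for n in range(M - 1, len(x)):
        u = x[n - M + 1:n + 1][::-1]          # last M input samples, newest first
        e = d[n] - w @ u                      # prediction error
        w = w + 2 * mu * e * u                # LMS update: stochastic gradient descent on e^2

    print("adapted weights:", w)              # approach the true taps (0.6, -0.3, 0.1, 0)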
Journal Article

A scaled conjugate gradient algorithm for fast supervised learning

TL;DR: Experiments show that the scaled conjugate gradient algorithm (SCG) is considerably faster than standard backpropagation (BP), conjugate gradient with line search (CGL), and BFGS, and avoids a time-consuming line search.
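As background for this comparison, a minimal sketch of the classical conjugate gradient iteration on a quadratic error. This shows only the conjugate-direction idea; it is not Møller's SCG algorithm, which handles general network error functions and replaces the line search with a scaled, Levenberg-Marquardt-style step estimate. The matrix and vector below are illustrative:

    import numpy as np

    # Minimize E(w) = 0.5 * w^T A w - b^T w, i.e. solve A w = b; A and b are made up.
    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 0.5],
                  [0.0, 0.5, 2.0]])          # symmetric positive definite
    b = np.array([1.0, -2.0, 0.5])

    w = np.zeros(3)
    r = b - A @ w                            # residual = negative gradient
    d = r.copy()                             # first direction = steepest descent
    for _ in range(len(b)):
        alpha = (r @ r) / (d @ A @ d)        # exact step along d (no line search needed here)
        w = w + alpha * d
        r_new = r - alpha * (A @ d)
        beta = (r_new @ r_new) / (r @ r)     # coefficient making the next direction conjugate
        d = r_new + beta * d
        r = r_new

    print("minimizer:", w)
    print("residual:", A @ w - b)            # essentially zero after n = 3 steps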