Journal Article

First- and second-order methods for learning: between steepest descent and Newton's method

Roberto Battiti
01 Mar 1992 · Vol. 4, Iss. 2, pp. 141-166
TLDR
First- and second-order optimization methods for learning in feedforward neural networks are reviewed to illustrate the main characteristics of the different methods and their mutual relations.
Abstract
On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
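To make the first-order/second-order distinction concrete, here is a minimal sketch (not taken from the paper; the quadratic error function, matrices, and learning rate are illustrative assumptions) contrasting a steepest-descent step with a Newton step on the same error surface:

    import numpy as np

    # Stand-in training error E(w) = 0.5 * w^T A w - b^T w; A and b are made up.
    A = np.array([[3.0, 0.5],
                  [0.5, 1.0]])            # symmetric positive definite "Hessian"
    b = np.array([1.0, -2.0])

    def grad(w):                          # first-order information (as in backpropagation)
        return A @ w - b

    def hess(w):                          # second-order information
        return A

    w = np.zeros(2)
    eta = 0.1                             # learning rate

    w_steepest = w - eta * grad(w)                     # steepest-descent step
    w_newton = w - np.linalg.solve(hess(w), grad(w))   # Newton step: solve H p = g

    print("steepest descent:", w_steepest)
    print("Newton:", w_newton)            # exact minimizer for a quadratic error

On a quadratic error the Newton step reaches the minimizer in one iteration, while steepest descent needs many small steps; the methods reviewed in the paper can be read as compromises between these two extremes.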


Citations
Journal Article

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character and document recognition; gradient-based learning is used to synthesize complex decision surfaces that can classify such high-dimensional patterns.
Journal Article

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.
Journal Article

An information-maximization approach to blind separation and blind deconvolution

TL;DR: It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing, and dependencies of information transfer on time delays are derived.
Journal Article

Training feedforward networks with the Marquardt algorithm

TL;DR: The Marquardt algorithm for nonlinear least squares is presented and incorporated into the backpropagation algorithm for training feedforward neural networks; it is found to be much more efficient than the other techniques tested when the network contains no more than a few hundred weights.
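For orientation, a minimal sketch of the Marquardt (Levenberg-Marquardt) update on a small nonlinear least-squares problem; the curve-fitting example, the damping schedule, and all names are illustrative assumptions, not the implementation from the cited paper:

    import numpy as np

    # Fit y ~ w0 * exp(w1 * x) by least squares; the data are synthetic.
    x = np.linspace(0.0, 1.0, 20)
    y = 2.0 * np.exp(-1.5 * x) + 0.01 * np.random.default_rng(0).standard_normal(20)

    def residuals(w):
        return w[0] * np.exp(w[1] * x) - y

    def jacobian(w):
        return np.column_stack([np.exp(w[1] * x),              # d r / d w0
                                w[0] * x * np.exp(w[1] * x)])   # d r / d w1

    w, mu = np.array([1.0, 0.0]), 1e-2    # initial weights and damping parameter
    for _ in range(50):
        r, J = residuals(w), jacobian(w)
        # Marquardt step: solve (J^T J + mu * I) dw = -J^T r
        dw = np.linalg.solve(J.T @ J + mu * np.eye(2), -J.T @ r)
        if np.sum(residuals(w + dw) ** 2) < np.sum(r ** 2):
            w, mu = w + dw, mu * 0.5      # error decreased: accept step, relax damping
        else:
            mu *= 2.0                     # otherwise increase damping (closer to gradient descent)

    print("fitted parameters:", w)        # close to the true values (2.0, -1.5)

With large damping the step approaches a small gradient-descent step, and with small damping it approaches the Gauss-Newton step, which is why the method works well for networks with a moderate number of weights.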
Book

Pattern recognition and neural networks

TL;DR: In this self-contained account, Professor Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks.
References

Numerical recipes in C

Book

Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Classics in Applied Mathematics, 16)

TL;DR: In this book, Dennis and Schnabel present a modular system of algorithms for unconstrained minimization and nonlinear equations, developed from Newton's method for solving one equation in one unknown and from basic results on the convergence of sequences of real numbers.
Book

Numerical methods for unconstrained optimization and nonlinear equations

TL;DR: Covers Newton's method for nonlinear equations and unconstrained minimization, as well as methods for nonlinear least-squares problems with special structure.
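Both Dennis and Schnabel entries refer to the same core iteration; as a reminder, a minimal sketch of Newton's method for a system of nonlinear equations (the example system, starting point, and tolerance are made up for illustration):

    import numpy as np

    # Solve F(x) = 0 for an illustrative two-equation system.
    def F(x):
        return np.array([x[0] ** 2 + x[1] ** 2 - 1.0,   # unit circle
                         x[0] - x[1]])                   # diagonal line

    def J(x):                                            # Jacobian of F
        return np.array([[2.0 * x[0], 2.0 * x[1]],
                         [1.0, -1.0]])

    x = np.array([1.0, 0.5])                             # starting guess
    for _ in range(20):
        x = x - np.linalg.solve(J(x), F(x))              # Newton step: solve J s = F
        if np.linalg.norm(F(x)) < 1e-12:
            break

    print("root:", x)                                    # about (0.7071, 0.7071)

For unconstrained minimization the same iteration is applied to the gradient, with the Hessian in the role of the Jacobian; the safeguards discussed in the book (line searches, trust regions) are what make the pure step reliable in practice.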
Book

Adaptive Signal Processing

TL;DR: This book discusses adaptive arrays and adaptive beamforming, other adaptive algorithms and structures, and the z-transform in adaptive signal processing.
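The workhorse of this area is Widrow's LMS rule, an on-line stochastic-gradient update closely related to on-line first-order backpropagation; a minimal sketch, with the signals, filter length, and step size invented for illustration:

    import numpy as np

    # Identify an "unknown" FIR system with an LMS adaptive filter; data are synthetic.
    rng = np.random.default_rng(0)
    M, mu = 4, 0.05                           # filter length and step size
    x = rng.standard_normal(500)              # input signal
    d = np.convolve(x, [0.6, -0.3, 0.1], mode="full")[:len(x)]  # desired signal

    w = np.zeros(M)
    for n in range(M - 1, len(x)):
        u = x[n - M + 1:n + 1][::-1]          # last M input samples, newest first
        e = d[n] - w @ u                      # prediction error
        w = w + 2 * mu * e * u                # LMS update: stochastic gradient descent on e^2

    print("adapted weights:", w)              # approach the true taps (0.6, -0.3, 0.1, 0)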
Journal Article

A scaled conjugate gradient algorithm for fast supervised learning

TL;DR: Experiments show that the scaled conjugate gradient algorithm (SCG) is considerably faster than standard backpropagation (BP), conjugate gradient with line search (CGL), and BFGS, and avoids a time-consuming line search.
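As background for this comparison, a minimal sketch of the classical conjugate gradient iteration on a quadratic error. This shows only the conjugate-direction idea; it is not Møller's SCG algorithm, which handles general network error functions and replaces the line search with a scaled, Levenberg-Marquardt-style step estimate. The matrix and vector below are illustrative:

    import numpy as np

    # Minimize E(w) = 0.5 * w^T A w - b^T w, i.e. solve A w = b; A and b are made up.
    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 0.5],
                  [0.0, 0.5, 2.0]])          # symmetric positive definite
    b = np.array([1.0, -2.0, 0.5])

    w = np.zeros(3)
    r = b - A @ w                            # residual = negative gradient
    d = r.copy()                             # first direction = steepest descent
    for _ in range(len(b)):
        alpha = (r @ r) / (d @ A @ d)        # exact step along d (no line search needed here)
        w = w + alpha * d
        r_new = r - alpha * (A @ d)
        beta = (r_new @ r_new) / (r @ r)     # coefficient making the next direction conjugate
        d = r_new + beta * d
        r = r_new

    print("minimizer:", w)
    print("residual:", A @ w - b)            # essentially zero after n = 3 steps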