
Learning Algorithms for Connectionist Networks: Applied Gradient Methods of Nonlinear Optimization

Raymond L. Watrous
Vol. 2, pp. 619-627
TLDR
It is shown that in plateau regions of relatively constant gradient, the momentum term acts to increase the step size by a factor of 1/(1-μ), where μ is the momentum term, and in valley regions with steep sides, the momentum constant acts to focus the search direction toward the local minimum by averaging oscillations in the gradient.
Abstract
The problem of learning using connectionist networks, in which network connection strengths are modified systematically so that the response of the network increasingly approximates the desired response, can be structured as an optimization problem. The widely used back propagation method of connectionist learning [19, 21, 18] is set in the context of nonlinear optimization. In this framework, the issues of stability, convergence and parallelism are considered. As a form of gradient descent with fixed step size, back propagation is known to be unstable, which is illustrated using Rosenbrock's function. This is contrasted with stable methods that involve a line search in the gradient direction. The convergence criterion for connectionist problems involving binary functions is discussed relative to the behavior of gradient descent in the vicinity of local minima. A minimax criterion is compared with the least squares criterion. The contribution of the momentum term [19, 18] to more rapid convergence is interpreted relative to the geometry of the weight space. It is shown that in plateau regions of relatively constant gradient, the momentum term acts to increase the step size by a factor of 1/(1-μ), where μ is the momentum term. In valley regions with steep sides, the momentum constant acts to focus the search direction toward the local minimum by averaging oscillations in the gradient.

Comments: University of Pennsylvania Department of Computer and Information Science Technical Report No. MSCIS-88-62. This technical report is available at ScholarlyCommons: http://repository.upenn.edu/cis_reports/597
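The 1/(1-μ) factor follows from unrolling the momentum recursion over a region of roughly constant gradient; a minimal sketch of that derivation is given below, written with the usual momentum update and a step size η (the symbol η and the exact update form are assumed notation, since only μ appears in the abstract).

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Momentum update: each weight change is a gradient step plus a fraction
% \mu of the previous weight change.
\[
  \Delta w_t = -\eta\, \nabla E(w_t) + \mu\, \Delta w_{t-1}
\]
% On a plateau the gradient is approximately constant, \nabla E(w_t) \approx g,
% so (starting from \Delta w_0 = 0) the recursion unrolls into a geometric
% series in \mu:
\[
  \Delta w_t = -\eta\, g \sum_{k=0}^{t-1} \mu^{k}
  \;\longrightarrow\;
  -\frac{\eta}{1-\mu}\, g
  \qquad (t \to \infty,\; 0 \le \mu < 1),
\]
% i.e. the effective step size is amplified by the factor 1/(1-\mu).
\end{document}

In valley regions the same averaging works in the opposite direction: gradient components that alternate in sign from step to step largely cancel in the running sum, which is the focusing effect described in the abstract.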



Citations
Journal ArticleDOI

ANFIS: adaptive-network-based fuzzy inference system

TL;DR: The architecture and learning procedure underlying ANFIS (adaptive-network-based fuzzy inference system) are presented; ANFIS is a fuzzy inference system implemented in the framework of adaptive networks.
Journal ArticleDOI

Original Contribution: A scaled conjugate gradient algorithm for fast supervised learning

TL;DR: Experiments show that SCG is considerably faster than BP, CGL, and BFGS, and avoids a time-consuming line search.
Proceedings Article

The Cascade-Correlation Learning Architecture

TL;DR: The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network.
Journal ArticleDOI

Fuzzy logic systems for engineering: a tutorial

TL;DR: After a fuzzy logic system (FLS) is synthesized, it is demonstrated that the FLS can be expressed mathematically as a linear combination of fuzzy basis functions and is a nonlinear universal function approximator, a property it shares with feedforward neural networks.
Book ChapterDOI

Theory of the backpropagation neural network

TL;DR: A speculative neurophysiological model illustrating how the backpropagation neural network architecture might plausibly be implemented in the mammalian brain for corticocortical learning between nearby regions of the cerebral cortex is presented.
References
Journal ArticleDOI

A simplex method for function minimization

TL;DR: A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point.
Book ChapterDOI

Learning internal representations by error propagation

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Journal ArticleDOI

Neural networks and physical systems with emergent collective computational abilities

TL;DR: A model of a system having a large number of simple equivalent components, based on aspects of neurobiology but readily adapted to integrated circuits, produces a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size.