Journal ArticleDOI

Original Contribution: A scaled conjugate gradient algorithm for fast supervised learning

06 Apr 1993 - Neural Networks (Elsevier Science Ltd.) - Vol. 6, Iss. 4, pp. 525-533
TL;DR: Experiments show that SCG is considerably faster than standard backpropagation (BP), conjugate gradient with line search (CGL), and the BFGS quasi-Newton algorithm, and that it avoids the time-consuming line search on which CGL and BFGS depend.
About: This article was published in Neural Networks on 1993-04-06 and has received 3882 citations to date. It focuses on the topics: Conjugate gradient method & Broyden–Fletcher–Goldfarb–Shanno algorithm.
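
The key to avoiding the line search is that SCG estimates the curvature of the error E along a search direction p from a single extra gradient evaluation, via a finite-difference approximation to the Hessian-vector product. A minimal sketch of that estimate in Python (illustrative code, not the paper's own; dE stands for an assumed gradient callable):

    import numpy as np

    def curvature_along(dE, w, p, sigma0=1e-4):
        """Approximate p^T H p, where H is the Hessian of the error at w."""
        sigma = sigma0 / np.linalg.norm(p)          # small step relative to |p|
        s = (dE(w + sigma * p) - dE(w)) / sigma     # s is approximately H @ p
        return p @ s                                # scalar curvature along p

With this scalar in hand, the step size along p has a closed form (see the full loop sketched under the references below), so no iterative line search is needed.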
Citations
Book
18 Nov 2016
TL;DR: Deep learning enables computers to learn from experience and to understand the world in terms of a hierarchy of concepts; this textbook covers its mathematical foundations, the core techniques used in industry, and applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Journal ArticleDOI
Xin Yao
01 Sep 1999
TL;DR: It is shown, through an extensive literature review, that combinations of ANNs and EAs can lead to significantly better intelligent systems than relying on ANNs or EAs alone.
Abstract: Learning and evolution are two fundamental forms of adaptation. There has been great interest in recent years in combining learning and evolution with artificial neural networks (ANNs). This paper: 1) reviews different combinations of ANNs and evolutionary algorithms (EAs), including using EAs to evolve ANN connection weights, architectures, learning rules, and input features; 2) discusses the different search operators that have been used in various EAs; and 3) points out possible future research directions. It is shown, through an extensive literature review, that combinations of ANNs and EAs can lead to significantly better intelligent systems than relying on ANNs or EAs alone.

2,877 citations
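
The first combination the survey lists, using an EA to evolve connection weights instead of training them by gradient descent, is easy to sketch. The network shape, mutation scheme, and every name below are illustrative assumptions, not taken from the survey:

    import numpy as np

    rng = np.random.default_rng(0)

    def mlp_predict(w, X, n_hidden=4):
        """Tiny fixed-architecture tanh network; weights live in one flat vector."""
        n_in = X.shape[1]
        W1 = w[:n_in * n_hidden].reshape(n_in, n_hidden)
        return np.tanh(X @ W1) @ w[n_in * n_hidden:]

    def evolve_weights(X, y, pop_size=50, n_gen=200, sigma=0.1, n_hidden=4):
        """Evolve weight vectors with Gaussian mutation and truncation
        selection on mean squared error: a deliberately minimal stand-in
        for the EA variants the survey reviews."""
        dim = X.shape[1] * n_hidden + n_hidden
        pop = rng.normal(size=(pop_size, dim))
        for _ in range(n_gen):
            children = pop + sigma * rng.normal(size=pop.shape)
            everyone = np.vstack([pop, children])
            mse = np.array([np.mean((mlp_predict(w, X, n_hidden) - y) ** 2)
                            for w in everyone])
            pop = everyone[np.argsort(mse)[:pop_size]]   # keep the fittest half
        return pop[0]                                    # best weights found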

Book
01 Jan 2009
TL;DR: This book surveys previous comparisons and theoretical work, describes the methods and datasets used, sets out criteria and methodology for comparison (including validation), and reports empirical results, closing with machine learning applied to machine learning.
Abstract: Survey of previous comparisons and theoretical work; descriptions of methods; dataset descriptions; criteria for comparison and methodology (including validation); empirical results; machine learning on machine learning.

2,325 citations

References
Book ChapterDOI
01 Jan 1988
TL;DR: This chapter presents the generalized delta rule with simulation results, some further generalizations, and conclusions.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

17,604 citations
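
The generalized delta rule named in the chapter sections is the weight update now known as backpropagation. A minimal sketch under standard assumptions (sigmoid units, squared error, a single training pattern); this is illustrative code, not the chapter's own:

    import numpy as np

    def delta_rule_step(W1, W2, x, t, eta=0.1):
        """One generalized-delta-rule update for a two-layer sigmoid network:
        each weight from unit i to unit j changes by eta * delta_j * o_i,
        where delta_j is an error signal propagated back from the output."""
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
        h = sigmoid(W1 @ x)                             # hidden activations
        y = sigmoid(W2 @ h)                             # output activations
        delta_out = (t - y) * y * (1.0 - y)             # output error signals
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)  # back-propagated signals
        W2 += eta * np.outer(delta_out, h)              # eta * delta_j * o_i
        W1 += eta * np.outer(delta_hid, x)
        return W1, W2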

Book
03 Jan 1986
TL;DR: This chapter develops the generalized delta rule, evaluates it through simulation results, and discusses some further generalizations and conclusions.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

13,579 citations

Book
01 Jan 2009
TL;DR: This book gives a comprehensive treatment of both unconstrained and constrained optimization, from Newton-like and conjugate direction methods to linear, quadratic, and nonlinear programming.
Abstract: Preface; Table of Notation. Part 1: Unconstrained Optimization - Introduction; Structure of Methods; Newton-like Methods; Conjugate Direction Methods; Restricted Step Methods; Sums of Squares and Nonlinear Equations. Part 2: Constrained Optimization - Introduction; Linear Programming; The Theory of Constrained Optimization; Quadratic Programming; General Linearly Constrained Optimization; Nonlinear Programming; Other Optimization Problems; Non-Smooth Optimization. References; Subject Index.

7,278 citations


"Original Contribution: A scaled con..." refers background or methods in this paper

  • ...The Levenberg-Marquardt algorithm [2] has to raise λk with a constant factor, whenever...

  • ...Levenberg-Marquardt approach [2] in order to scale the step size....

  • ...λk is also known as a Lagrange Multiplier [2]....

  • ...Praxis shows that this is often the case [2]....

  • ...The idea of the strategy is illustrated in the pseudo algorithm presented below, which minimizes the error function E(w) [2]....
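
Putting the quoted pieces together, here is a sketch of the paper's scaled conjugate gradient loop in Python. It follows the pseudo-algorithm's structure (finite-difference curvature estimate, λk damping raised or lowered by how well the quadratic model predicted the error reduction), but the constants and names are illustrative defaults, not a definitive implementation:

    import numpy as np

    def scg(E, dE, w, n=None, max_iter=500, sigma0=1e-4, lam=1e-6, tol=1e-8):
        """Sketch of scaled conjugate gradient. E and dE are callables for
        the error and its gradient; sigma0, the initial lambda, and the
        0.25/0.75 trust thresholds follow the paper's recommendations."""
        n = n or len(w)                  # restart period = problem dimension
        r = -dE(w)                       # steepest-descent direction
        p = r.copy()
        success = True
        for k in range(1, max_iter + 1):
            p2 = p @ p
            if success:
                # Curvature along p from one extra gradient call (no Hessian).
                sigma = sigma0 / np.sqrt(p2)
                delta = p @ ((dE(w + sigma * p) - dE(w)) / sigma)
            delta_k = delta + lam * p2   # damped curvature (the lambda scaling)
            if delta_k <= 0:             # indefinite: raise lambda until positive
                lam = 2.0 * (lam - delta_k / p2)
                delta_k = delta + lam * p2
            mu = p @ r
            alpha = mu / delta_k         # closed-form step size: no line search
            # How well did the quadratic model predict the actual reduction?
            Delta = 2.0 * delta_k * (E(w) - E(w + alpha * p)) / mu**2
            if Delta >= 0:               # error went down: accept the step
                w = w + alpha * p
                r_new = -dE(w)
                beta = (r_new @ r_new - r_new @ r) / mu
                p = r_new if k % n == 0 else r_new + beta * p  # periodic restart
                r = r_new
                success = True
                if Delta >= 0.75:
                    lam *= 0.25          # model is trustworthy: damp less
            else:
                success = False          # keep w; retry with a larger lambda
            if Delta < 0.25:
                lam += delta_k * (1.0 - Delta) / p2   # model is poor: damp more
            if r @ r < tol:
                break
        return w

As a toy check, scg(lambda w: w @ w, lambda w: 2 * w, np.ones(5)) drives the quadratic error to zero in a handful of iterations, with no line search anywhere in the loop.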

Book
01 Jan 1984
TL;DR: This classic textbook covers linear programming, unconstrained optimization, and constrained optimization, emphasizing the connection between the analytical character of an optimization problem and the behavior of the algorithms used to solve it.
Abstract: This new edition covers the central concepts of practical optimization techniques, with an emphasis on methods that are both state-of-the-art and popular. One major insight is the connection between the purely analytical character of an optimization problem and the behavior of algorithms used to solve it. This was a major theme of the first edition of this book, and the fourth edition expands and further illustrates this relationship. As in the earlier editions, the material in this fourth edition is organized into three separate parts. Part I is a self-contained introduction to linear programming. The presentation in this part is fairly conventional, covering the main elements of the underlying theory of linear programming, many of the most effective numerical algorithms, and many of its important special applications. Part II, which is independent of Part I, covers the theory of unconstrained optimization, including both derivations of the appropriate optimality conditions and an introduction to basic algorithms. This part of the book explores the general properties of algorithms and defines various notions of convergence. Part III extends the concepts developed in the second part to constrained optimization problems. Except for a few isolated sections, this part is also independent of Part I. It is possible to go directly into Parts II and III omitting Part I, and, in fact, the book has been used in this way in many universities. New to this edition is a chapter devoted to Conic Linear Programming, a powerful generalization of Linear Programming. Indeed, many conic structures are possible and useful in a variety of applications. It must be recognized, however, that conic linear programming is an advanced topic, requiring special study. Another important topic is an accelerated steepest descent method that exhibits superior convergence properties and, for this reason, has become quite popular. Proofs of the convergence properties of both standard and accelerated steepest descent methods are presented in Chapter 8. As in previous editions, end-of-chapter exercises appear for all chapters. From the reviews of the Third Edition: "This very well-written book is a classic textbook in Optimization. It should be present in the bookcase of each student, researcher, and specialist from the host of disciplines from which practical optimization applications are drawn." (Jean-Jacques Strodiot, Zentralblatt MATH, Vol. 1207, 2011)

4,908 citations
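
The accelerated steepest descent the blurb highlights is not specified further here; a common scheme with the stated superior convergence rate is Nesterov-style acceleration, sketched below for intuition (whether it matches the book's exact method is an assumption):

    import numpy as np

    def accelerated_descent(dE, w, eta=0.01, n_iters=500):
        """Nesterov-style accelerated steepest descent. On smooth convex
        problems its error decays as O(1/k^2), versus O(1/k) for plain
        steepest descent: the kind of superiority the text alludes to."""
        v = w.copy()                                   # lookahead point
        for k in range(1, n_iters + 1):
            w_next = v - eta * dE(v)                   # gradient step at v
            v = w_next + (k - 1) / (k + 2) * (w_next - w)  # extrapolate
            w = w_next
        return w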