Topic

Convergence (routing)

About: Convergence (routing) is a research topic. Over the lifetime, 23,702 publications have been published within this topic, receiving 415,745 citations.


Papers
Journal Article
TL;DR: A new technique for working set selection in SMO-type decomposition methods uses second order information to achieve fast convergence; theoretical properties such as linear convergence are established.
Abstract: Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information.
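The selection rule the abstract describes can be illustrated with a short, hypothetical implementation of second-order working-set selection for the SVM dual, following the published rule of Fan, Chen & Lin; the function name, the toy data, and the tie-breaking by first index are assumptions for illustration:

```python
import numpy as np

def select_working_pair(alpha, y, G, Q, C, tau=1e-12):
    """Second-order working-set selection for SMO-type decomposition.

    alpha : current dual variables, shape (n,)
    y     : labels in {-1, +1}, shape (n,)
    G     : gradient of the dual objective, G = Q @ alpha - 1
    Q     : label-scaled kernel matrix, Q[s, t] = y[s]*y[t]*K(x_s, x_t)
    Returns (i, j), or (-1, -1) if no violating pair exists.
    """
    n = len(alpha)
    # Index sets induced by the box constraints 0 <= alpha <= C.
    I_up = [t for t in range(n)
            if (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)]
    I_low = [t for t in range(n)
             if (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)]

    # First index: maximal violation (first-order information).
    i = max(I_up, key=lambda t: -y[t] * G[t])

    # Second index: largest decrease predicted by a second-order model.
    j, best = -1, 0.0
    for t in I_low:
        b = -y[i] * G[i] + y[t] * G[t]
        if b > 0:                      # t violates optimality together with i
            a = Q[i, i] + Q[t, t] - 2 * y[i] * y[t] * Q[i, t]
            a = max(a, tau)            # guard against non-positive curvature
            gain = b * b / a           # predicted objective decrease
            if gain > best:
                j, best = t, gain
    return (i, j) if j >= 0 else (-1, -1)

# Toy data: 3 points with an identity kernel (assumed for illustration).
alpha = np.zeros(3)
y = np.array([1, -1, 1])
Q = np.outer(y, y) * np.eye(3)
G = Q @ alpha - np.ones(3)           # gradient of the dual at alpha = 0
i, j = select_working_pair(alpha, y, G, Q, C=1.0)
```

The first index is chosen by first-order (maximal-violation) information; the second maximizes the decrease predicted by a one-dimensional second-order model, which is what gives the method its faster convergence over purely first-order selection.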

1,461 citations

Journal ArticleDOI
TL;DR: It is shown theoretically that the new algorithm is stable, and it is proved to be the only member of the class considered for which a certain matrix error is reduced strictly monotonically when minimizing quadratic functions.
Abstract: The Convergence of a Class of Double-rank Minimization Algorithms 2. The New Algorithm. Here q and q1 are uniquely determined orthonormal vectors. The parameter η is essentially arbitrary in that it depends upon β. It was suggested in Part 1 that a suitable choice for η would be zero, since if it were negative, or large and positive, the matrix K_i and hence H_i might become needlessly badly conditioned. It was noted moreover that choosing η in this way gives rise to a new algorithm. Of the two algorithms in this class already published, that due to Davidon (1959) as modified by Fletcher & Powell (1963) is obtained by putting β equal to zero, and it was shown in Part 1 that this led, in general, to negative values of η. We thus expect the sequence of matrices {H_i} obtained by that algorithm to exhibit a tendency to singularity, and this tendency has been noted by, among others, Broyden (1967) and Pearson (1969). In a more recent algorithm, due to Greenstadt (1967), if H is positive definite the values of η are even more negative than those occurring in the DFP algorithm. One result of this is that for this algorithm the matrices H cannot, unlike those of the DFP algorithm, be proved to be positive definite, and this has serious implications when considering numerical stability. In this paper we show theoretically that the new algorithm is stable, and we prove that it is the only member of the class considered for which a certain matrix error is reduced strictly monotonically when minimizing quadratic functions. We examine the effect of rounding and of poor conditioning of H on the attainable accuracy of the solution, and conclude by presenting the results of a numerical survey in which the performance of the new algorithm on a variety of test problems is compared with that of the DFP algorithm. C. G. Broyden, Computing Centre, University of Essex, Wivenhoe Park, Colchester, Essex
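In modern notation the new algorithm of this paper is the update that became known as BFGS. A minimal sketch of its inverse-Hessian form, run on a made-up two-variable quadratic with an exact line search (all data below are illustrative assumptions):

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS update of the inverse-Hessian approximation H
    (s = step taken, y = change in gradient)."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Illustration on a made-up quadratic f(x) = 0.5*x@A@x - b@x, for which
# the exact line search has the closed form used below.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x, H = np.zeros(2), np.eye(2)
for _ in range(2):                      # at most n steps on a quadratic
    g = A @ x - b                       # gradient of f at x
    d = -H @ g                          # quasi-Newton search direction
    t = -(g @ d) / (d @ A @ d)          # exact line search step length
    s = t * d
    H = bfgs_update(H, s, A @ s)        # for a quadratic, y = A s
    x = x + s
```

With exact line searches the method terminates in at most n iterations on a quadratic; the positive definiteness of the updated H, which the paper establishes, is what distinguishes this member of the class from the DFP and Greenstadt members discussed above.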

1,414 citations

01 Jan 1970
TL;DR: This paper presents a more detailed analysis than has previously appeared of a class of minimization algorithms that includes the DFP (Davidon-Fletcher-Powell) method as a special case, and investigates how the successive errors depend, again for quadratic functions, upon the initial choice of iteration matrix.
Abstract: This paper presents a more detailed analysis of a class of minimization algorithms, which includes as a special case the DFP (Davidon-Fletcher-Powell) method, than has previously appeared. Only quadratic functions are considered but particular attention is paid to the magnitude of successive errors and their dependence upon the initial matrix. On the basis of this a possible explanation of some of the observed characteristics of the class is tentatively suggested. PROBABLY the best-known algorithm for determining the unconstrained minimum of a function of many variables, where explicit expressions are available for the first partial derivatives, is that of Davidon (1959) as modified by Fletcher & Powell (1963). This algorithm has many virtues. It is simple and does not require at any stage the solution of linear equations. It minimizes a quadratic function exactly in a finite number of steps and this property makes convergence of this algorithm rapid, when applied to more general functions, in the neighbourhood of the solution. It is, at least in theory, stable since the iteration matrix H_i, which transforms the ith gradient into the ith step direction, may be shown to be positive definite. In practice the algorithm has been generally successful, but it has exhibited some puzzling behaviour. Broyden (1967) noted that H_i does not always remain positive definite, and attributed this to rounding errors. Pearson (1968) found that for some problems the solution was obtained more efficiently if H_i was reset to a positive definite matrix, often the unit matrix, at intervals during the computation. Bard (1968) noted that H_i could become singular, attributed this to rounding error and suggested the use of suitably chosen scaling factors as a remedy.
In this paper we analyse the more general algorithm given by Broyden (1967), of which the DFP algorithm is a special case, and determine how for quadratic functions the choice of an arbitrary parameter affects convergence. We investigate how the successive errors depend, again for quadratic functions, upon the initial choice of iteration matrix paying particular attention to the cases where this is either the unit matrix or a good approximation to the inverse Hessian. We finally give a tentative explanation of some of the observed experimental behaviour in the case where the function to be minimized is not quadratic.
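The dependence on the initial matrix discussed above can be seen concretely in a small experiment: with H0 = I, exact line searches, and a quadratic objective, the DFP method terminates in n steps and the final H reproduces the inverse Hessian. The quadratic below is made-up test data:

```python
import numpy as np

def dfp_update(H, s, y):
    """DFP update of the inverse-Hessian approximation H
    (s = step taken, y = change in gradient)."""
    Hy = H @ y
    return H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)

# Made-up quadratic f(x) = 0.5*x@A@x - b@x, minimised from x0 = 0
# with the initial iteration matrix H0 = I (the unit matrix).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x, H = np.zeros(2), np.eye(2)
for _ in range(2):                      # n = 2 variables
    g = A @ x - b                       # gradient of f at x
    d = -H @ g                          # search direction
    t = -(g @ d) / (d @ A @ d)          # exact line search step length
    s = t * d
    H = dfp_update(H, s, A @ s)         # for a quadratic, y = A s
    x = x + s
```

After n steps the iterates reach the exact minimizer and H equals the inverse Hessian, the hereditary property that underlies the error analysis sketched in the abstract; with a non-quadratic objective or inexact line searches these exact identities no longer hold, which is where the puzzling behaviour described above arises.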

1,271 citations

Journal ArticleDOI
TL;DR: The characterization of pattern search methods is exploited to establish a global convergence theory that does not enforce a notion of sufficient decrease; the analysis is possible because the iterates of a pattern search method lie on a scaled, translated integer lattice.
Abstract: We introduce an abstract definition of pattern search methods for solving nonlinear unconstrained optimization problems. Our definition unifies an important collection of optimization methods that neither compute nor explicitly approximate derivatives. We exploit our characterization of pattern search methods to establish a global convergence theory that does not enforce a notion of sufficient decrease. Our analysis is possible because the iterates of a pattern search method lie on a scaled, translated integer lattice. This allows us to relax the classical requirements on the acceptance of the step, at the expense of stronger conditions on the form of the step, and still guarantee global convergence.
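A minimal member of this class is compass (coordinate) search, which polls along the coordinate directions, accepts any simple decrease, and halves the mesh on failure; until the mesh is refined, iterates stay on a scaled, translated integer lattice, which is the property the convergence theory exploits. The objective and starting point below are made up for illustration:

```python
import numpy as np

def compass_search(f, x0, delta=1.0, tol=1e-6, max_iter=10_000):
    """Compass search: poll f at x +/- delta * e_i for each coordinate i.

    Any improving point is accepted (simple decrease, no sufficient-decrease
    test). Until delta is halved, every iterate lies on the lattice
    x0 + delta * Z^n, which is what guarantees global convergence.
    """
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    n = len(x)
    for _ in range(max_iter):
        improved = False
        for i in range(n):
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * delta
                ft = f(trial)
                if ft < fx:            # simple decrease suffices
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            delta *= 0.5               # unsuccessful poll: refine the mesh
            if delta < tol:
                break
    return x

# Made-up smooth test function with minimizer (1, -2), no derivatives used.
x_min = compass_search(lambda x: (x[0] - 1) ** 2 + 3 * (x[1] + 2) ** 2,
                       x0=[0.0, 0.0])
```

Note the trade-off the abstract describes: the acceptance test is weak (any decrease), but the steps are rigidly structured (coordinate directions on a mesh), and it is the lattice structure rather than a sufficient-decrease condition that drives convergence.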

1,229 citations

Journal ArticleDOI
TL;DR: This work develops and analyzes distributed algorithms based on dual subgradient averaging, provides sharp bounds on their convergence rates as a function of the network size and topology, and shows that the number of iterations required by the algorithm scales inversely in the spectral gap of the network.
Abstract: The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. It arises in various application domains, including distributed tracking and localization, multi-agent coordination, estimation in sensor networks, and large-scale machine learning. We develop and analyze distributed algorithms based on dual subgradient averaging, and we provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis allows us to clearly separate the convergence of the optimization algorithm itself and the effects of communication dependent on the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network, and confirm this prediction's sharpness both by theoretical lower bounds and simulations for various networks. Our approach includes the cases of deterministic optimization and communication, as well as problems with stochastic optimization and/or communication.
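The paper's algorithm is dual subgradient averaging; as an illustration of the same ingredients (local gradients plus mixing through a doubly stochastic matrix W), here is its simpler cousin, the decentralized subgradient method, on a made-up four-node ring with scalar local objectives:

```python
import numpy as np

# Hypothetical setup: 4 nodes on a ring jointly minimise
# sum_i f_i(x) with local f_i(x) = (x - a[i])^2, whose global
# minimiser is mean(a) = 3.0. Each node only sees its own a[i]
# and the iterates of its ring neighbours.
a = np.array([1.0, 2.0, 3.0, 6.0])           # local data, one value per node
W = np.array([[0.50, 0.25, 0.00, 0.25],      # doubly stochastic ring weights
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
x = np.zeros(4)                              # one scalar iterate per node
for t in range(1, 20001):
    grad = 2 * (x - a)                       # each node's local gradient
    step = 0.5 / np.sqrt(t)                  # diminishing step size
    x = W @ x - step * grad                  # mix with neighbours, then step
# every node's iterate approaches the global minimiser 3.0
```

Because W is doubly stochastic, mixing preserves the network average, and the spectral gap of W (here 1 - 0.5 = 0.5) governs how fast disagreement between nodes dies out, mirroring the inverse-spectral-gap scaling of the iteration count described in the abstract.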

1,224 citations


Network Information
Related Topics (5)
- Nonlinear system: 208.1K papers, 4M citations, 91% related
- Optimization problem: 96.4K papers, 2.1M citations, 90% related
- Differential equation: 88K papers, 2M citations, 90% related
- Partial differential equation: 70.8K papers, 1.6M citations, 90% related
- Matrix (mathematics): 105.5K papers, 1.9M citations, 89% related
Performance
Metrics
No. of papers in the topic in previous years:

Year    Papers
2022    62
2021    1,831
2020    1,524
2019    1,346
2018    1,321
2017    1,075