
Showing papers on "Line search published in 2015"


Journal ArticleDOI
TL;DR: Recent results on trust region methods for unconstrained optimization, constrained optimization, nonlinear equations and nonlinear least squares, nonsmooth optimization and optimization without derivatives are reviewed.
Abstract: Trust region methods are a class of numerical methods for optimization. Unlike line search type methods where a line search is carried out in each iteration, trust region methods compute a trial step by solving a trust region subproblem where a model function is minimized within a trust region. Due to the trust region constraint, nonconvex models can be used in trust region subproblems, and trust region algorithms can be applied to nonconvex and ill-conditioned problems. Normally it is easier to establish the global convergence of a trust region algorithm than that of its line search counterpart. In the paper, we review recent results on trust region methods for unconstrained optimization, constrained optimization, nonlinear equations and nonlinear least squares, nonsmooth optimization and optimization without derivatives. Results on trust region subproblems and regularization methods are also discussed.
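
The accept/reject logic that distinguishes a trust region step from a line search can be illustrated with a minimal sketch (hypothetical function names; the Cauchy-point subproblem solver stands in for the dogleg, Steihaug, or exact solvers the survey actually covers):

```python
import numpy as np

def trust_region_step(f, grad, hess, x, delta, eta=0.1):
    """One basic trust-region iteration: build a quadratic model, solve the
    subproblem approximately (here via the Cauchy point), then accept or
    reject the trial step by comparing actual and predicted reduction.
    A generic sketch, not a specific method from the survey."""
    g, B = grad(x), hess(x)
    gnorm = np.linalg.norm(g)
    if gnorm == 0:
        return x, delta                      # stationary point: nothing to do
    gBg = g @ B @ g
    tau = 1.0 if gBg <= 0 else min(1.0, gnorm**3 / (delta * gBg))
    p = -tau * (delta / gnorm) * g           # Cauchy point inside the region
    pred = -(g @ p + 0.5 * p @ B @ p)        # predicted reduction of the model
    ared = f(x) - f(x + p)                   # actual reduction of f
    rho = ared / pred if pred > 0 else -1.0
    if rho < 0.25:
        delta *= 0.25                        # poor agreement: shrink the region
    elif rho > 0.75 and np.isclose(np.linalg.norm(p), delta):
        delta *= 2.0                         # good agreement at the boundary: expand
    return (x + p if rho > eta else x), delta
```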

249 citations


Journal ArticleDOI
TL;DR: In this article, the authors derived convergence results for projected line-search methods on the real-algebraic variety of real $m \times n$ matrices of rank at most $k$.
Abstract: The aim of this paper is to derive convergence results for projected line-search methods on the real-algebraic variety $\mathcal{M}_{\le k}$ of real $m \times n$ matrices of rank at most $k$. Such methods extend Riemannian optimization methods, which are successfully used on the smooth manifold $\mathcal{M}_k$ of rank-$k$ matrices, to its closure by taking steps along gradient-related directions in the tangent cone, and afterwards projecting back to $\mathcal{M}_{\le k}$. Considering such a method circumvents the difficulties which arise from the nonclosedness and the unbounded curvature of $\mathcal{M}_k$. The pointwise convergence is obtained for real-analytic functions on the basis of a Łojasiewicz inequality for the projection of the antigradient to the tangent cone. If the derived limit point lies on the smooth part of $\mathcal{M}_{\le k}$, i.e., in $\mathcal{M}_k$, this boils down to more or less known results, but with the benefit that asymptotic convergence rate estimates (for specific step-sizes...
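
The projection-based update the paper analyzes can be sketched under simplifying assumptions: take a step along the antigradient in the ambient space and project back to the variety by truncating the SVD (a hypothetical Armijo-type variant; the paper uses gradient-related directions in the tangent cone and a Łojasiewicz-based analysis):

```python
import numpy as np

def project_rank_k(Y, k):
    """Metric projection onto the variety of rank-at-most-k matrices,
    i.e. SVD truncation (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def projected_linesearch_step(f, grad, X, k, alpha0=1.0, beta=0.5, c=1e-4):
    """One projected line-search iteration with backtracking: step along the
    negative gradient, project back to rank <= k, and shrink the step until a
    sufficient-decrease test holds.  A sketch of the general scheme only."""
    G = grad(X)
    fX, alpha = f(X), alpha0
    while alpha > 1e-12:
        X_new = project_rank_k(X - alpha * G, k)
        if f(X_new) <= fX - c * alpha * np.linalg.norm(G) ** 2:
            return X_new
        alpha *= beta
    return X
```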

110 citations


Posted Content
TL;DR: A probabilistic line search is constructed by combining the structure of existing deterministic methods with notions from Bayesian optimization; it retains a Gaussian process surrogate of the univariate optimization objective and uses a probabilistic belief over the Wolfe conditions to monitor the descent.
Abstract: In deterministic optimization, line searches are a standard tool ensuring stability and efficiency. Where only stochastic gradients are available, no direct equivalent has so far been formulated, because uncertain gradients do not allow for a strict sequence of decisions collapsing the search space. We construct a probabilistic line search by combining the structure of existing deterministic methods with notions from Bayesian optimization. Our method retains a Gaussian process surrogate of the univariate optimization objective, and uses a probabilistic belief over the Wolfe conditions to monitor the descent. The algorithm has very low computational cost, and no user-controlled parameters. Experiments show that it effectively removes the need to define a learning rate for stochastic gradient descent.
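
The deterministic Wolfe conditions that the probabilistic belief is placed over are easy to state; a minimal sketch (generic function names, exact gradients assumed, which is precisely what the stochastic setting lacks):

```python
import numpy as np

def wolfe_conditions(f, grad, x, d, t, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for step length t along direction d.
    The probabilistic line search replaces this hard yes/no test with a belief
    computed from a Gaussian process surrogate when f and grad are noisy."""
    f0, g0 = f(x), grad(x) @ d                     # value and slope at t = 0
    ft, gt = f(x + t * d), grad(x + t * d) @ d
    sufficient_decrease = ft <= f0 + c1 * t * g0   # Armijo condition
    curvature = gt >= c2 * g0                      # curvature condition
    return sufficient_decrease and curvature

# Toy usage: quadratic objective, steepest-descent direction, full step.
x = np.array([1.0, -2.0])
print(wolfe_conditions(lambda v: 0.5 * v @ v, lambda v: v, x, -x, t=1.0))  # True
```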

104 citations


Journal ArticleDOI
TL;DR: The new method performs remarkably well for the nearest low-rank correlation matrix problem in terms of speed and solution quality and is considerably competitive with the widely used SCF iteration for the Kohn–Sham total energy minimization.
Abstract: This paper considers optimization problems on the Stiefel manifold $X^{\mathsf{T}}X=I_p$, where $X\in \mathbb{R}^{n \times p}$ is the variable and $I_p$ is the $p$-by-$p$ identity matrix. A framework of constraint-preserving update schemes is proposed by decomposing each feasible point into the range space of $X$ and the null space of $X^{\mathsf{T}}$. While this general framework can unify many existing schemes, a new update scheme with low complexity cost is also discovered. Then we study a feasible Barzilai–Borwein-like method under the new update scheme. The global convergence of the method is established with an adaptive nonmonotone line search. The numerical tests on the nearest low-rank correlation matrix problem, the Kohn–Sham total energy minimization and a specific problem from statistics demonstrate the efficiency of the new method. In particular, the new method performs remarkably well for the nearest low-rank correlation matrix problem in terms of speed and solution quality and is considerably competitive with the widely used SCF iteration for the Kohn–Sham total energy minimization.
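
The Barzilai–Borwein step length at the core of the feasible method fits in a few lines; a minimal sketch of the plain BB1 rule for a generic vector variable (the constraint-preserving update and the adaptive nonmonotone safeguard from the paper are omitted):

```python
import numpy as np

def bb1_step_length(x, x_prev, g, g_prev, fallback=1e-3):
    """Barzilai-Borwein step length alpha = s's / s'y built from successive
    iterates and gradients; a safeguard is used when the curvature estimate
    is not positive.  The paper embeds such a rule in a feasible update
    scheme with an adaptive nonmonotone line search."""
    s, y = x - x_prev, g - g_prev
    sy = s @ y
    return (s @ s) / sy if sy > 0 else fallback
```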

98 citations


Journal ArticleDOI
TL;DR: In this article, the equivalence between mirror descent and the conditional gradient method was shown through convex duality, which implies that for certain problems, such as supervised machine learning problems with nonsmooth losses or problems regularized by nonsmooth regularizers, the primal subgradient method and the dual conditional gradient method are formally equivalent.
Abstract: Given a convex optimization problem and its dual, there are many possible first-order algorithms. In this paper, we show the equivalence between mirror descent algorithms and algorithms generalizing the conditional gradient method. This is done through convex duality and implies notably that for certain problems, such as for supervised machine learning problems with nonsmooth losses or problems regularized by nonsmooth regularizers, the primal subgradient method and the dual conditional gradient method are formally equivalent. The dual interpretation leads to a form of line search for mirror descent, as well as guarantees of convergence for primal-dual certificates.
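
As a concrete instance of the conditional gradient side of the equivalence, here is a minimal Frank–Wolfe sketch for a least-squares objective over the probability simplex, with an exact line search along the Frank–Wolfe direction (a toy setup chosen for illustration, not a problem from the paper):

```python
import numpy as np

def frank_wolfe_simplex(A, b, iters=100):
    """Conditional gradient (Frank-Wolfe) for min 0.5*||Ax - b||^2 over the
    probability simplex.  The linear minimization oracle picks a vertex, and
    the step size comes from an exact line search, which is closed-form for a
    quadratic objective."""
    n = A.shape[1]
    x = np.full(n, 1.0 / n)                        # start at the simplex centre
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        s = np.zeros(n)
        s[np.argmin(grad)] = 1.0                   # best vertex of the simplex
        d = s - x
        Ad = A @ d
        denom = Ad @ Ad
        gamma = 0.0 if denom == 0 else np.clip(-(grad @ d) / denom, 0.0, 1.0)
        x = x + gamma * d
    return x
```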

85 citations


Journal ArticleDOI
TL;DR: This paper shows that the Weiszfeld approach may be extended to a wide variety of problems to find an Lq mean for 1 ≤ q < 2, while maintaining simplicity and provable convergence, and experiments show improved reliability and robustness compared to L2 optimization.
Abstract: In many computer vision applications, a desired model of some type is computed by minimizing a cost function based on several measurements. Typically, one may compute the model that minimizes the $L_2$ cost, that is, the sum of squares of measurement errors with respect to the model. However, the $L_q$ solution, which minimizes the sum of the $q$th power of errors, usually gives more robust results in the presence of outliers for some values of $q$, for example, $q = 1$. The Weiszfeld algorithm is a classic algorithm for finding the geometric $L_1$ mean of a set of points in Euclidean space. It is provably optimal and requires neither differentiation nor line search. The Weiszfeld algorithm has also been generalized to find the $L_1$ mean of a set of points on a Riemannian manifold of non-negative curvature. This paper shows that the Weiszfeld approach may be extended to a wide variety of problems to find an $L_q$ mean for $1 \le q < 2$, while maintaining simplicity and provable convergence. We apply this approach to both single-rotation averaging (under which the algorithm provably finds the global $L_q$ optimum) and multiple rotation averaging (for which no such proof exists). Experimental results of $L_q$ optimization for rotations show the improved reliability and robustness compared to $L_2$ optimization.
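
The classical Weiszfeld iteration that the paper generalizes is short enough to state directly; a sketch of the Euclidean $L_1$ case (the degenerate situation where an iterate lands exactly on a data point is only crudely guarded here):

```python
import numpy as np

def weiszfeld(points, iters=100, eps=1e-12):
    """Weiszfeld iteration for the geometric median (L1 mean) of points in R^d:
    each update is a weighted average of the points with weights 1/distance.
    The paper's Lq generalization (1 <= q < 2) uses weights 1/distance^(2-q)."""
    y = points.mean(axis=0)                       # initialize at the centroid
    for _ in range(iters):
        d = np.linalg.norm(points - y, axis=1)
        w = 1.0 / np.maximum(d, eps)              # guard against zero distance
        y = (w[:, None] * points).sum(axis=0) / w.sum()
    return y
```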

56 citations


Journal ArticleDOI
TL;DR: In this article, a line search approach is proposed to ensure the convergence of the successive second-order cone programming (SOCP) method for the maximum-crossrange problem.

54 citations


Journal ArticleDOI
TL;DR: This paper suggests and analyzes the Levenberg-Marquardt method for solving the system of absolute value equations Ax - |x| = b, and considers numerical examples to illustrate the implementation and efficiency of the method.

50 citations


Journal ArticleDOI
TL;DR: This paper shows that the average case performance of CGIHT is robust to additive noise well beyond its theoretical worst case guarantees and, in this setting, is typically the fastest iterative hard thresholding algorithm for sparse approximation.
Abstract: Conjugate gradient iterative hard thresholding (CGIHT) for compressed sensing combines the low per iteration computational cost of simple line search iterative hard thresholding algorithms with the improved convergence rates of more sophisticated sparse approximation algorithms. This paper shows that the average case performance of CGIHT is robust to additive noise well beyond its theoretical worst case guarantees and, in this setting, is typically the fastest iterative hard thresholding algorithm for sparse approximation. Moreover, CGIHT is observed to benefit more than other iterative hard thresholding algorithms when jointly considering multiple sparse vectors whose sparsity patterns coincide.
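
The "simple line search iterative hard thresholding" building block mentioned above can be sketched as follows (a plain normalized-IHT step for reference; CGIHT itself replaces the gradient direction by a conjugate direction, which is not shown):

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = x.copy()
    out[np.argsort(np.abs(x))[:-s]] = 0.0
    return out

def iht_step(A, y, x, s):
    """One normalized IHT step for min 0.5*||y - Ax||^2 s.t. x is s-sparse:
    a gradient step whose length comes from an exact line search restricted
    to the current support, followed by hard thresholding."""
    g = A.T @ (y - A @ x)                 # negative gradient of the residual term
    mask = np.abs(x) > 0
    gs = np.where(mask, g, 0.0) if mask.any() else g
    Ags = A @ gs
    alpha = (gs @ gs) / (Ags @ Ags) if (Ags @ Ags) > 0 else 1.0
    return hard_threshold(x + alpha * g, s)
```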

46 citations


Journal ArticleDOI
TL;DR: By introducing an adaptive LM parameter into the AMLM algorithm, an efficient AMLM algorithm is proposed; the cubic convergence of the new algorithm is presented, and numerical experiments show that the new algorithm is promising.

40 citations


Journal ArticleDOI
TL;DR: This paper proposes an improved version of the nonmonotone line search technique of Zhang and Hager; numerical results demonstrate that the new line search strategy outperforms other similar ones.
Abstract: In this paper, a new nonmonotone line search rule is proposed, which is verified to be an improved version of the nonmonotone line search technique proposed by Zhang and Hager. Unlike Zhang and Hager's method, our nonmonotone line search is proved to possess a property similar to that of the standard Armijo line search. By virtue of this property, global convergence is established for the developed algorithm, where the search direction is supposed to satisfy some mild conditions and the stepsize is chosen by the new line search rule. R-linear convergence of the developed algorithm is proved for strongly convex objective functions. The developed algorithm is used to solve the test problems available in CUTEr; the numerical results demonstrate that the new line search strategy outperforms other similar ones.
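
For reference, the Zhang–Hager scheme being improved upon replaces the usual Armijo reference value $f(x_k)$ by a weighted average $C_k$ of past function values; a minimal sketch of that baseline (the paper's new rule forms the reference differently):

```python
def nonmonotone_armijo(f, x, d, g_dot_d, C, alpha0=1.0, rho=0.5, delta=1e-4):
    """Backtracking search enforcing the nonmonotone Armijo condition
    f(x + alpha*d) <= C + delta*alpha*(g'd), where C is the Zhang-Hager
    weighted average of past function values (x and d are numpy arrays)."""
    alpha = alpha0
    while f(x + alpha * d) > C + delta * alpha * g_dot_d and alpha > 1e-12:
        alpha *= rho
    return alpha

def zhang_hager_update(C, Q, f_new, eta=0.85):
    """Update the reference value C_k and the counter Q_k after a step:
    Q_{k+1} = eta*Q_k + 1,  C_{k+1} = (eta*Q_k*C_k + f(x_{k+1})) / Q_{k+1}."""
    Q_new = eta * Q + 1.0
    C_new = (eta * Q * C + f_new) / Q_new
    return C_new, Q_new
```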

Journal ArticleDOI
TL;DR: Two descent methods are proposed based on the minimization of a suitable exact penalty function, but they use different rules for updating the penalization parameter and they rely on different types of line search.
Abstract: This paper deals with equilibrium problems with nonlinear constraints. Exploiting a gap function which relies on a polyhedral approximation of the feasible region, we propose two descent methods. They are both based on the minimization of a suitable exact penalty function, but they use different rules for updating the penalization parameter and they rely on different types of line search. The convergence of both algorithms is proved under standard assumptions.

Journal ArticleDOI
TL;DR: Nonmonotonicity of the line search combines well with the variable sample size scheme as it allows more freedom in choosing the search direction and the step size while the sample size is not the maximal one and increases the chances of finding a global solution.
Abstract: Nonmonotone line search methods for unconstrained minimization with objective functions in the form of a mathematical expectation are considered. The objective function is approximated by the sample average approximation (SAA) with a large sample of fixed size. The nonmonotone line search framework is embedded with a variable sample size strategy such that a different sample size at each iteration allows us to reduce the cost of the sample average approximation. The variable sample scheme we consider takes into account the decrease in the approximate objective function and the quality of the approximation of the objective function at each iteration, and thus the sample size may increase or decrease at each iteration. Nonmonotonicity of the line search combines well with the variable sample size scheme, as it allows more freedom in choosing the search direction and the step size while the sample size is not the maximal one, and increases the chances of finding a global solution. Eventually the maximal sample size is used, so the variable sample size strategy generates a solution of the same quality as the SAA method but with a significantly smaller number of function evaluations. Various nonmonotone strategies are compared on a set of test problems.

Posted Content
TL;DR: A new general purpose proximal Newton algorithm that is able to deal with learning problems where both the loss function and the regularizer are nonconvex but belong to the class of difference of convex (DC) functions.
Abstract: We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are nonconvex but belong to the class of difference of convex (DC) functions. Our contribution is a new general purpose proximal Newton algorithm that is able to deal with such a situation. The algorithm consists in obtaining a descent direction from an approximation of the loss function and then in performing a line search to ensure sufficient descent. A theoretical analysis is provided showing that the iterates of the proposed algorithm admit as limit points stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art for a problem with a convex loss function and a nonconvex regularizer. We also illustrate the benefit of our algorithm on a high-dimensional transductive learning problem where both the loss function and the regularizer are nonconvex.

Journal ArticleDOI
TL;DR: Results of computer simulations and real 3-D data show that the proposed algorithm converges much faster than the conventional EM and PCG for smooth edge-preserving regularization and can also be more efficient than the current state-of-the-art algorithms for the nonsmooth l1 regularization.
Abstract: Iterative image reconstruction for positron emission tomography can improve image quality by using spatial regularization. The most commonly used quadratic penalty often oversmooths sharp edges and fine features in reconstructed images, while nonquadratic penalties can preserve edges and achieve higher contrast recovery. Existing optimization algorithms such as the expectation maximization (EM) and preconditioned conjugate gradient (PCG) algorithms work well for the quadratic penalty, but are less efficient for high-curvature or nonsmooth edge-preserving regularizations. This paper proposes a new algorithm to accelerate edge-preserving image reconstruction by using two strategies: trust surrogate and optimization transfer descent. Trust surrogate approximates the original penalty by a smoother function at each iteration, but guarantees that the algorithm descends monotonically; optimization transfer descent accelerates a conventional optimization transfer algorithm by using conjugate gradient and line search. Results of computer simulations and real 3-D data show that the proposed algorithm converges much faster than the conventional EM and PCG for smooth edge-preserving regularization and can also be more efficient than the current state-of-the-art algorithms for the nonsmooth $\ell_1$ regularization.

Journal ArticleDOI
TL;DR: The algorithm presented in this paper needs neither to compute the largest eigenvalue of the related matrix nor to use any line search scheme.

Journal ArticleDOI
TL;DR: Numerical tests and comparisons show that the constructed scheme outperforms some known iterative methods for unconstrained optimization with respect to all three tested properties: number of iterations, CPU time, and number of function evaluations.

Journal ArticleDOI
TL;DR: This paper proposes some new FR-type directions within the framework of an algorithm that combines the conjugate gradient approach and the hyperplane projection technique to solve nonlinear monotone systems.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed three derivative-free projection methods for solving nonlinear equations with convex constraints, which can be regarded as the combinations of some recently developed conjugate gradient methods and the well-known projection method.
Abstract: In this paper, we propose three derivative-free projection methods for solving nonlinear equations with convex constraints, which can be regarded as combinations of some recently developed conjugate gradient methods and the well-known projection method. Compared with the existing derivative-free projection methods, we use some new hyperplanes to obtain the new iterate, and without the requirement of the Lipschitz continuity of the equation, we prove the three new methods are globally convergent with an Armijo-type line search. Preliminary numerical results are reported to show the efficiency of the proposed methods.
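
The hyperplane-projection step shared by this family of methods can be sketched generically (a Solodov–Svaiter-style iteration; the three proposed methods differ in the choice of the direction d and of the hyperplane, and details of the Armijo-type search are simplified here):

```python
import numpy as np

def hyperplane_projection_step(F, proj_C, x, d, sigma=1e-4, beta=0.5):
    """One derivative-free projection iteration for monotone equations
    F(x) = 0 over a convex set C.  An Armijo-type search picks a trial point
    z on the ray x + t*d; x is then projected onto the hyperplane through z
    separating x from the solution set, and finally back onto C via proj_C."""
    t = 1.0
    while True:
        z = x + t * d
        Fz = F(z)
        if -(Fz @ d) >= sigma * t * (d @ d) or t < 1e-12:
            break
        t *= beta
    if Fz @ Fz == 0.0:
        return z                              # z already solves F(z) = 0
    lam = (Fz @ (x - z)) / (Fz @ Fz)          # offset of the separating hyperplane
    return proj_C(x - lam * Fz)
```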

Journal ArticleDOI
26 Oct 2015 - PLOS ONE
TL;DR: The numerical results indicate that the first algorithm is effective and competitive for solving unconstrained optimization problems and that the second algorithms is effective for solving large-scale nonlinear equations.
Abstract: Two new PRP conjugate gradient algorithms are proposed in this paper based on two modified PRP conjugate gradient methods: the first algorithm is proposed for solving unconstrained optimization problems, and the second algorithm is proposed for solving nonlinear equations. The first method contains two aspects of information: the function value and the gradient value. The two methods both possess some good properties: 1) βk ≥ 0; 2) the search direction has the trust region property without the use of any line search method; and 3) the search direction has the sufficient descent property without the use of any line search method. Under some suitable conditions, we establish the global convergence of the two algorithms. We conduct numerical experiments to evaluate our algorithms. The numerical results indicate that the first algorithm is effective and competitive for solving unconstrained optimization problems and that the second algorithm is effective for solving large-scale nonlinear equations.
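
Property 1), the nonnegativity of βk, is shared with the classical PRP+ safeguard; a minimal sketch of a PRP-type direction update with that safeguard (a generic variant for illustration, not the paper's exact modified formulas):

```python
import numpy as np

def prp_plus_direction(g_new, g_old, d_old):
    """Conjugate gradient direction d = -g + beta*d_old with the PRP+ choice
    beta = max(0, g_new'(g_new - g_old) / ||g_old||^2), which enforces
    beta >= 0.  The paper's modified formulas additionally give the direction
    trust-region and sufficient-descent properties without any line search."""
    beta = max(0.0, g_new @ (g_new - g_old) / (g_old @ g_old))
    return -g_new + beta * d_old
```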

Book ChapterDOI
TL;DR: An overall hybrid algorithm combining the appealing properties of both exact and heuristic methods is discussed, with focus on Particle Swarm Optimization (PSO) and line search-based derivative-free algorithms.
Abstract: The hybrid use of exact and heuristic derivative-free methods for global unconstrained optimization problems is presented. Many real-world problems are modeled by computationally expensive functions, such as problems in simulation-based design of complex engineering systems. Objective-function values are often provided by systems of partial differential equations, solved by computationally expensive black-box tools. The objective-function is likely noisy and its derivatives are often not available. On the one hand, the use of exact optimization methods might be computationally too expensive, especially if asymptotic convergence properties are sought. On the other hand, heuristic methods do not guarantee the stationarity of their final solutions. Nevertheless, heuristic methods are usually able to provide an approximate solution at a reasonable computational cost, and have been widely applied to real-world simulation-based design optimization problems. Herein, an overall hybrid algorithm combining the appealing properties of both exact and heuristic methods is discussed, with focus on Particle Swarm Optimization (PSO) and line search-based derivative-free algorithms. The theoretical properties of the hybrid algorithm are detailed, in terms of limit points stationarity. Numerical results are presented for a specific test function and for two real-world optimization problems in ship hydrodynamics.

Journal ArticleDOI
TL;DR: This approach adopts a monotone projected Barzilai–Borwein (MPBB) method as an essential subroutine in which, to accelerate convergence, the step length is determined without a line search.

Journal ArticleDOI
TL;DR: By proving that the optimal arc flow solution of the bi-level problem must lie on the boundary of the capacity constraints, an exact line search method called golden section search is embedded in a scatter search method for solving this complicated MNDP.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a path-based partial linearization algorithm to approximately solve the restricted CDA-PCL-SUE, which is a three-phase iterative process, where phase 1 is an entropy maximization problem on O-D flow space that can be solved by Bregman's balancing algorithm.
Abstract: The equivalent mathematical formulation of the combined doubly-constrained gravity-based trip distribution and paired-combinatorial-logit stochastic user equilibrium assignment problem (CDA-PCL-SUE) is proposed. Its first order conditions are shown to be equal to the gravity equations and the PCL formula. The proposed solution method is a path-based partial linearization algorithm to approximately solve the restricted CDA-PCL-SUE. The proposed algorithm is a three-phase iterative process. Phase 1 is an entropy maximization problem on O-D flow space that can be solved by Bregman's balancing algorithm. Phase 2 is a PCL SUE problem that can be solved by the PCL formula. Phase 3 is a line search. CDA-PCL-SUE is solved on a small network and a real network, the city of Winnipeg network. The proposed algorithms with the six line search methods, namely golden section (GS), bisection (BS), Armijo's rule (AR), the method of successive averages (MSA), the self-regulated averaging (SRA) scheme, and the quadratic interpolation (QI) scheme, are compared in terms of various convergence characteristics: root mean square error, step size, KKT-based mean square error, and objective function. In terms of computational efficiency, under different path set sizes, dispersion parameters, impedance parameters and demand levels, the line search methods are ordered from best to worst as follows: SRA, GS, AR, QI, BS and MSA. The performances of Armijo's rule and QI show greater variance. The performance of QI worsens as the path set size increases. Given all other factors being the same, increasing the dispersion parameter, path set size or demand level increases the CPU time, whereas changing the impedance parameter does not influence CPU time. In addition, CDA-PCL-SUE is compared with its multinomial-logit counterpart (CDA-MNL-SUE).
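
Golden section search, the second-best of the six step-size rules in the comparison above, is simple to state; a minimal sketch for a unimodal merit function phi on an interval [a, b] (in Phase 3 the interval would be the feasible step-size range, e.g. [0, 1]):

```python
def golden_section(phi, a, b, tol=1e-6):
    """Golden section search for the minimizer of a unimodal function phi on
    [a, b].  Each iteration shrinks the bracket by the inverse golden ratio;
    function values are re-evaluated here for clarity rather than cached."""
    invphi = (5 ** 0.5 - 1) / 2                  # ~0.618
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c                          # minimizer lies in [a, d]
            c = b - invphi * (b - a)
        else:
            a, c = c, d                          # minimizer lies in [c, b]
            d = a + invphi * (b - a)
    return (a + b) / 2
```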

Journal ArticleDOI
TL;DR: A new computational procedure is proposed for solving the optimal zero-forcing beamforming problem in multiple antenna channels, maximizing the user achievable rate subject to per-antenna element power constraints.

Journal ArticleDOI
TL;DR: A two-phase descent direction method for the unconstrained stochastic optimization problem is proposed, and the almost sure convergence of the proposed method is established under standard assumptions for descent direction and SA methods.
Abstract: A two-phase descent direction method for the unconstrained stochastic optimization problem is proposed. A line-search method with an arbitrary descent direction is used to determine the step sizes during the initial phase, and the second phase uses the stochastic approximation (SA) step sizes. The almost sure convergence of the proposed method is established under standard assumptions for descent direction and SA methods. The algorithm used for practical implementation combines a line-search quasi-Newton (QN) method, in particular the Broyden–Fletcher–Goldfarb–Shanno (BFGS) and Symmetric Rank 1 (SR1) methods, with the SA iterations. Numerical results show good performance of the proposed method for different noise levels.
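
The two-phase structure can be sketched abstractly as a switch from line-search-determined steps to diminishing stochastic approximation step sizes (the switching point, the constant placeholder step in phase 1, and the 1/k schedule are all simplified stand-ins for the paper's QN-based first phase and its convergence conditions):

```python
import numpy as np

def two_phase_descent(grad_est, x0, phase1_iters=50, total_iters=500, a=1.0):
    """Phase 1: descent steps with a fixed step length standing in for the
    line-search quasi-Newton phase; phase 2: stochastic approximation (SA)
    steps with diminishing step sizes a_k = a/k, the schedule under which SA
    methods converge almost surely.  grad_est returns a (noisy) gradient."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, total_iters + 1):
        g = grad_est(x)
        if k <= phase1_iters:
            step = 0.5                           # phase 1 placeholder step
        else:
            step = a / (k - phase1_iters)        # phase 2: SA step sizes
        x = x - step * g
    return x
```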

Journal ArticleDOI
TL;DR: This paper presents a new hybrid CG method related to the famous Polak-Ribiere-Polyak (PRP) formula and offers a remedy for the PRP case, which is not globally convergent with the strong Wolfe-Powell (SWP) line search.
Abstract: The conjugate gradient (CG) method is an interesting tool for solving optimization problems in many fields, such as design, economics, physics, and engineering. In this paper, we present a new hybrid CG method related to the famous Polak-Ribiere-Polyak (PRP) formula. It offers a remedy for the PRP case, which is not globally convergent with the strong Wolfe-Powell (SWP) line search. The new formula possesses the sufficient descent condition and the global convergence properties. In addition, we further explain the cases in which the PRP method fails with the SWP line search. Furthermore, we provide numerical computations for the new hybrid CG method, which is almost always better than other related PRP formulas in both the number of iterations and the CPU time on some standard test functions.

Journal ArticleDOI
TL;DR: Based on a new smoothing function of the Fischer–Burmeister function, a smoothing-type algorithm is proposed for solving the second-order cone complementarity problem; its nonmonotone line search scheme contains the usual monotone line search as a special case.
Abstract: The second-order cone complementarity problem (denoted by SOCCP) can be effectively solved by smoothing-type algorithms, which in general are designed based on some monotone line search. In this paper, based on a new smoothing function of the Fischer–Burmeister function, we propose a smoothing-type algorithm for solving the SOCCP. The proposed algorithm uses a new nonmonotone line search scheme, which contains the usual monotone line search as a special case. Under suitable assumptions, we show that the proposed algorithm is globally and locally quadratically convergent. Some numerical results are reported which indicate the effectiveness of the proposed algorithm.
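
For intuition, the scalar Fischer–Burmeister function and a standard smoothing of it are shown below; the paper's new smoothing function is built on the vector-valued FB function associated with the second-order cone, which is more involved than this scalar prototype:

```python
import numpy as np

def fischer_burmeister(a, b):
    """Scalar Fischer-Burmeister complementarity function:
    phi(a, b) = 0  iff  a >= 0, b >= 0 and a*b = 0."""
    return a + b - np.sqrt(a * a + b * b)

def smoothed_fb(a, b, mu):
    """A standard smoothing of the FB function with parameter mu > 0; it is
    smooth everywhere and recovers phi(a, b) as mu -> 0.  This is only the
    scalar prototype, not the new SOC smoothing function proposed in the paper."""
    return a + b - np.sqrt(a * a + b * b + 2.0 * mu * mu)
```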

Proceedings ArticleDOI
10 May 2015
TL;DR: A novel method called smart line search (SLS) is proposed in this paper and combined with Teaching Learning Based Optimization (TLBO); the new method is shown to effectively improve damping time as well as power frequency and oscillation when a fault occurs.
Abstract: In this paper a new and effective optimization approach is proposed that optimizes the power system stabilizer (PSS) parameters in a multi-machine power system. The PSS parameters are established for four PSSs which are linked to four synchronous generators. There is an increasing demand for the development of such algorithms, which has led researchers to look for methods that are not only metaheuristic but also inherit desirable properties of deterministic approaches. To accomplish this objective, a novel method called smart line search (SLS) is proposed in this paper and combined with Teaching Learning Based Optimization (TLBO). SLS tries to take advantage of gradient methods through a weighted stochastic selection approach, which not only uncovers new local optima but also runs faster than the conventional algorithms used so far. The performance of the proposed approach is also compared with other methods. Simulation results on a two-area four-machine power system, compared with each of the older algorithms, show that the new method effectively improves damping time as well as power frequency and oscillation when a fault occurs.

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Local search algorithms for timing-driven placement optimization find local slack optima for cells under arbitrary delay models and can be applied late in the design flow; the key ingredients are an implicit path straightening and a clustering of neighboring cells.
Abstract: We present local search algorithms for timing-driven placement optimization. They find local slack optima for cells under arbitrary delay models and can be applied late in the design flow. The key ingredients are an implicit path straightening and a clustering of neighboring cells. Cell clusters are moved jointly to speed up the algorithm and escape suboptimal solutions, in which single cell algorithms are trapped, particularly in the presence of layer assignments. Given a cell cluster, we initially perform a line search for maximum slack on the straight line segment connecting the most critical upstream and downstream cells of the cluster. Thereby, the Euclidean path length is minimized. An iterative application will implicitly straighten the path. Later, slacks are improved further by applying ascent steps in the estimated supergradient direction. The benefit of our algorithms is demonstrated experimentally within an industrial microprocessor design flow, and on recent ICCAD benchmark circuits.