Showing papers in "SIAM Journal on Optimization in 2013"


Journal ArticleDOI
TL;DR: This paper studies an alternative inexact BCD approach which updates the variable blocks by successively minimizing a sequence of approximations of $f$ which are either locally tight upper bounds of $f$ or strictly convex local approximations of $f$.
Abstract: The block coordinate descent (BCD) method is widely used for minimizing a continuous function $f$ of several block variables. At each iteration of this method, a single block of variables is optimized, while the remaining variables are held fixed. To ensure the convergence of the BCD method, the subproblem of each block variable needs to be solved to its unique global optimal. Unfortunately, this requirement is often too restrictive for many practical scenarios. In this paper, we study an alternative inexact BCD approach which updates the variable blocks by successively minimizing a sequence of approximations of $f$ which are either locally tight upper bounds of $f$ or strictly convex local approximations of $f$. The main contributions of this work include the characterizations of the convergence conditions for a fairly wide class of such methods, especially for the cases where the objective functions are either nondifferentiable or nonconvex. Our results unify and extend the existing convergence results ...
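A minimal sketch of the kind of inexact block update described above, assuming a user-supplied gradient oracle grad_f(blocks, i) for block $i$ and using a simple proximal quadratic upper bound as the surrogate (the paper's admissible approximation classes are considerably more general):

```python
# Sketch of inexact block coordinate descent via surrogate minimization.
# The surrogate here is the quadratic upper bound f(x) + (L/2)||x_i - y_i||^2,
# whose exact minimizer over block i is a single gradient step of size 1/L.
import numpy as np

def inexact_bcd(grad_f, x_blocks, L=1.0, n_iters=100):
    """grad_f(blocks, i) -> gradient of f w.r.t. block i (hypothetical interface)."""
    blocks = [np.array(b, dtype=float) for b in x_blocks]
    for _ in range(n_iters):
        for i in range(len(blocks)):      # cyclic sweep over the blocks
            g = grad_f(blocks, i)
            blocks[i] = blocks[i] - g / L  # minimize the quadratic surrogate on block i
    return blocks
```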

1,032 citations


Journal ArticleDOI
TL;DR: The randomized stochastic gradient (RSG) method introduced in this paper is a stochastic approximation type algorithm for nonlinear (possibly nonconvex) stochastic programming problems, and it possesses a nearly optimal rate of convergence if the problem is convex.
Abstract: In this paper, we introduce a new stochastic approximation type algorithm, namely, the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a postoptimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and we show that such modification allows us to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.
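The distinguishing device of such randomized methods is that the output is an iterate chosen at random from the trajectory, which is what enables complexity bounds for approximate stationary points of nonconvex problems. A minimal sketch under that reading, with stochastic_grad a hypothetical unbiased gradient oracle and a uniform selection distribution chosen purely for illustration:

```python
# Sketch of a randomized stochastic gradient (RSG)-style method: run a stochastic
# gradient loop, then return an iterate selected at random from the trajectory.
import numpy as np

def rsg(stochastic_grad, x0, step=1e-2, n_iters=1000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    iterates = []
    for _ in range(n_iters):
        x = x - step * stochastic_grad(x)   # noisy gradient step
        iterates.append(x.copy())
    # Uniform random selection; the paper analyzes more general distributions.
    return iterates[rng.integers(n_iters)]
```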

599 citations


Journal ArticleDOI
TL;DR: This paper analyzes the block coordinate gradient projection method in which each iteration consists of performing a gradient projection step with respect to a certain block taken in a cyclic order and establishes global sublinear rate of convergence.
Abstract: In this paper we study smooth convex programming problems where the decision variables vector is split into several blocks of variables. We analyze the block coordinate gradient projection method in which each iteration consists of performing a gradient projection step with respect to a certain block taken in a cyclic order. Global sublinear rate of convergence of this method is established and it is shown that it can be accelerated when the problem is unconstrained. In the unconstrained setting we also prove a sublinear rate of convergence result for the so-called alternating minimization method when the number of blocks is two. When the objective function is also assumed to be strongly convex, linear rate of convergence is established.

576 citations


Journal ArticleDOI
TL;DR: This work proposes a new algorithm for matrix completion that minimizes the least-square distance on the sampling set over the Riemannian manifold of fixed-rank matrices and proves convergence of a regularized version of the algorithm under the assumption that the restricted isometry property holds for incoherent matrices throughout the iterations.
Abstract: The matrix completion problem consists of finding or approximating a low-rank matrix based on a few samples of this matrix. We propose a new algorithm for matrix completion that minimizes the least-square distance on the sampling set over the Riemannian manifold of fixed-rank matrices. The algorithm is an adaptation of classical nonlinear conjugate gradients, developed within the framework of retraction-based optimization on manifolds. We describe all the objects from differential geometry necessary to perform optimization over this low-rank matrix manifold, seen as a submanifold embedded in the space of matrices. In particular, we describe how metric projection can be used as retraction and how vector transport lets us obtain the conjugate search directions. Finally, we prove convergence of a regularized version of our algorithm under the assumption that the restricted isometry property holds for incoherent matrices throughout the iterations. The numerical experiments indicate that our approach...

512 citations


Journal ArticleDOI
TL;DR: Numerical experiments confirm that, combined with an effective preconditioner, these methods are among the most efficient and robust currently available for computing the CPD, rank-$(L_r,L_r,1)$ BTD, and their generalized decomposition.
Abstract: The canonical polyadic and rank-$(L_r,L_r,1)$ block term decomposition (CPD and BTD, respectively) are two closely related tensor decompositions. The CPD and, recently, BTD are important tools in psychometrics, chemometrics, neuroscience, and signal processing. We present a decomposition that generalizes these two and develop algorithms for its computation. Among these algorithms are alternating least squares schemes, several general unconstrained optimization techniques, and matrix-free nonlinear least squares methods. In the latter we exploit the structure of the Jacobian's Gramian to reduce computational and memory cost. Combined with an effective preconditioner, numerical experiments confirm that these methods are among the most efficient and robust currently available for computing the CPD, rank-$(L_r,L_r,1)$ BTD, and their generalized decomposition.

282 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of minimizing a general continuously differentiable function subject to sparsity constraints is treated and several different optimality criteria which are based on the notions of stationarity and coordinatewise optimality are derived.
Abstract: This paper treats the problem of minimizing a general continuously differentiable function subject to sparsity constraints. We present and analyze several different optimality criteria which are based on the notions of stationarity and coordinatewise optimality. These conditions are then used to derive three numerical algorithms aimed at finding points satisfying the resulting optimality criteria: the iterative hard thresholding method and the greedy and partial sparse-simplex methods. The first algorithm is essentially a gradient projection method, while the remaining two algorithms are of a coordinate descent type. The theoretical convergence of these techniques and their relations to the derived optimality conditions are studied. The algorithms and results are illustrated by several numerical examples.
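Of the three algorithms, the iterative hard thresholding method is the easiest to make concrete: a gradient step followed by projection onto the set of $s$-sparse vectors. A minimal sketch, with grad_f and the step size supplied by the user:

```python
# Sketch of iterative hard thresholding for min f(x) subject to ||x||_0 <= s.
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    z = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    z[idx] = x[idx]
    return z

def iht(grad_f, x0, s, step=1e-2, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        # Gradient step, then projection onto the sparsity constraint set.
        x = hard_threshold(x - step * grad_f(x), s)
    return x
```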

272 citations


Journal ArticleDOI
TL;DR: A multistage AC-SA algorithm is introduced, which possesses an optimal rate of convergence for solving strongly convex SCO problems in terms of the dependence on not only the target accuracy, but also a number of problem parameters and the selection of initial points.
Abstract: In this paper we study new stochastic approximation (SA) type algorithms, namely, the accelerated SA (AC-SA), for solving strongly convex stochastic composite optimization (SCO) problems. Specifically, by introducing a domain shrinking procedure, we significantly improve the large-deviation results associated with the convergence rate of a nearly optimal AC-SA algorithm presented by Ghadimi and Lan in [SIAM J. Optim., 22 (2012), pp 1469--1492]. Moreover, we introduce a multistage AC-SA algorithm, which possesses an optimal rate of convergence for solving strongly convex SCO problems in terms of the dependence on not only the target accuracy, but also a number of problem parameters and the selection of initial points. To the best of our knowledge, this is the first time that such an optimal method has been presented in the literature. From our computational results, these AC-SA algorithms can substantially outperform the classical SA and some other SA type algorithms for solving certain classes of strongly...

226 citations


Journal ArticleDOI
TL;DR: A framework of block-decomposition prox-type algorithms for solving the monotone inclusion problem is introduced, and it is shown that any method in this framework is also a special instance of the hybrid proximal extragradient (HPE) method introduced by Solodov and Svaiter.
Abstract: In this paper, we consider the monotone inclusion problem consisting of the sum of a continuous monotone map and a point-to-set maximal monotone operator with a separable two-block structure and introduce a framework of block-decomposition prox-type algorithms for solving it which allows for each one of the single-block proximal subproblems to be solved in an approximate sense. Moreover, by showing that any method in this framework is also a special instance of the hybrid proximal extragradient (HPE) method introduced by Solodov and Svaiter, we derive corresponding convergence rate results. We also describe some instances of the framework based on specific and inexpensive schemes for solving the single-block proximal subproblems. Finally, we consider some applications of our methodology to establish for the first time (i) the iteration-complexity of an algorithm for finding a zero of the sum of two arbitrary maximal monotone operators and, as a consequence, the ergodic iteration-complexity of the Douglas-...

224 citations


Journal ArticleDOI
TL;DR: A family of conjugate gradient methods for unconstrained optimization and an improved Wolfe line search are proposed; the line search avoids a numerical drawback of the original Wolfe line search and guarantees the global convergence of the conjugate gradient method under mild conditions.
Abstract: In this paper, we seek the conjugate gradient direction closest to the direction of the scaled memoryless BFGS method and propose a family of conjugate gradient methods for unconstrained optimization. An improved Wolfe line search is also proposed, which can avoid a numerical drawback of the original Wolfe line search and guarantee the global convergence of the conjugate gradient method under mild conditions. To accelerate the algorithm, we introduce adaptive restarts along negative gradients based on the extent to which the function approximates some quadratic function during previous iterations. Numerical experiments with the CUTEr collection show that the proposed algorithm is promising.

214 citations


Journal ArticleDOI
TL;DR: A convergence analysis is proposed for accelerated forward-backward splitting methods for composite function minimization when the proximity operator is not available in closed form and can only be computed up to a certain precision.
Abstract: We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision. We prove that the $1/k^2$ convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
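A minimal sketch of an accelerated forward-backward (FISTA-type) iteration in which the proximity operator is only computed approximately; approx_prox is a hypothetical oracle returning an eps_k-accurate proximal point, and the error schedule below is just one example of a sufficiently fast decay:

```python
# Sketch of accelerated forward-backward splitting with an inexact prox oracle.
import numpy as np

def accelerated_fb(grad_smooth, approx_prox, x0, L, n_iters=200):
    x = y = np.array(x0, dtype=float)
    t = 1.0
    for k in range(1, n_iters + 1):
        eps_k = 1.0 / k**3                              # illustrative fast-decaying error
        x_new = approx_prox(y - grad_smooth(y) / L, 1.0 / L, eps_k)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov extrapolation
        x, t = x_new, t_new
    return x
```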

199 citations


Journal ArticleDOI
TL;DR: In this article, a notion of local subfirm nonexpansiveness with respect to the intersection is introduced for consistent feasibility problems. Together with a coercivity condition that relates to the regularity of the collection of sets at points in the intersection, this yields local linear convergence of AP for a wide class of nonconvex problems.
Abstract: We consider projection algorithms for solving (nonconvex) feasibility problems in Euclidean spaces. Of special interest are the method of alternating projections (AP) and the Douglas--Rachford algorithm (DR). In the case of convex feasibility, firm nonexpansiveness of projection mappings is a global property that yields global convergence of AP and for consistent problems DR. A notion of local subfirm nonexpansiveness with respect to the intersection is introduced for consistent feasibility problems. This, together with a coercivity condition that relates to the regularity of the collection of sets at points in the intersection, yields local linear convergence of AP for a wide class of nonconvex problems and even local linear convergence of nonconvex instances of the DR algorithm.
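For two sets given through (possibly nonconvex) projection maps, the two methods compared above take the following form; proj_A and proj_B are assumed user-supplied projectors:

```python
# Sketch of alternating projections (AP) and Douglas--Rachford (DR) iterations.
import numpy as np

def alternating_projections(proj_A, proj_B, x0, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = proj_A(proj_B(x))              # project onto B, then onto A
    return x

def douglas_rachford(proj_A, proj_B, x0, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        r_a = 2.0 * proj_A(x) - x          # reflect across A
        r_b = 2.0 * proj_B(r_a) - r_a      # reflect across B
        x = 0.5 * (x + r_b)                # average with the current point
    return proj_A(x)                       # shadow point in A
```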

Journal ArticleDOI
TL;DR: This paper presents an accelerated variant of the hybrid proximal extragradient (HPE) method for convex optimization, referred to as the accelerated HPE (A-HPE) framework, as well as a special version of it, where a large stepsize condition is imposed.
Abstract: This paper presents an accelerated variant of the hybrid proximal extragradient (HPE) method for convex optimization, referred to as the accelerated HPE (A-HPE) framework. Iteration-complexity results are established for the A-HPE framework, as well as a special version of it, where a large stepsize condition is imposed. Two specific implementations of the A-HPE framework are described in the context of a structured convex optimization problem whose objective function consists of the sum of a smooth convex function and an extended real-valued nonsmooth convex function. In the first implementation, a generalization of a variant of Nesterov's method is obtained for the case where the smooth component of the objective function has Lipschitz continuous gradient. In the second implementation, an accelerated Newton proximal extragradient (A-NPE) method is obtained for the case where the smooth component of the objective function has Lipschitz continuous Hessian. It is shown that the A-NPE method has a ${\cal O}...

Journal ArticleDOI
TL;DR: In this paper, the first-order optimality conditions for sparse approximation problems, in which the $l_0$-``norm'' of a vector is a part of the constraints or the objective function, were studied, and penalty decomposition (PD) methods for solving them were proposed.
Abstract: In this paper we consider sparse approximation problems, that is, general $l_0$ minimization problems with the $l_0$-``norm'' of a vector being a part of constraints or objective function. In particular, we first study the first-order optimality conditions for these problems. We then propose penalty decomposition (PD) methods for solving them in which a sequence of penalty subproblems are solved by a block coordinate descent (BCD) method. Under some suitable assumptions, we establish that any accumulation point of the sequence generated by the PD methods satisfies the first-order optimality conditions of the problems. Furthermore, for the problems in which the $l_0$ part is the only nonconvex part, we show that such an accumulation point is a local minimizer of the problems. In addition, we show that any accumulation point of the sequence generated by the BCD method is a block coordinate minimizer of the penalty subproblem. Moreover, for the problems in which the $l_0$ part is the only nonconvex part, we e...

Journal ArticleDOI
TL;DR: A novel matrix recurrence is introduced yielding a new spectral analysis of the local transient convergence behavior of the alternating direction method of multipliers (ADMM), for the particular case of a quadratic program or a linear program.
Abstract: We introduce a novel matrix recurrence yielding a new spectral analysis of the local transient convergence behavior of the alternating direction method of multipliers (ADMM), for the particular case of a quadratic program or a linear program. We identify a particular combination of vector iterates whose convergence can be analyzed via a spectral analysis. The theory predicts that ADMM should go through up to four convergence regimes, such as constant step convergence or linear convergence, ending with the latter when close enough to the optimal solution if the optimal solution is unique and satisfies strict complementarity.
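To make the iteration being analyzed concrete, here is a standard ADMM instance for a box-constrained QP, $\min \tfrac12 x^T P x + q^T x$ subject to $lo \le x \le hi$, using the splitting $x = z$; this is a generic textbook instance, not the paper's spectral analysis:

```python
# Sketch of ADMM for a box-constrained quadratic program.
import numpy as np

def admm_qp(P, q, lo, hi, rho=1.0, n_iters=200):
    n = len(q)
    x = z = u = np.zeros(n)
    K = P + rho * np.eye(n)                            # factor once in a real solver
    for _ in range(n_iters):
        x = np.linalg.solve(K, -q + rho * (z - u))     # quadratic x-update
        z = np.clip(x + u, lo, hi)                     # projection z-update
        u = u + x - z                                  # dual (scaled multiplier) update
    return z
```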

Journal ArticleDOI
TL;DR: This paper presents the convex hull of the underlying mixed integer linear set; the effectiveness of this reformulation and of the associated facet-defining inequalities is computationally evaluated on five classes of instances.
Abstract: In this paper, we examine a mixed integer linear programming reformulation for mixed integer bilinear problems where each bilinear term involves the product of a nonnegative integer variable and a nonnegative continuous variable. This reformulation is obtained by first replacing a general integer variable with its binary expansion and then using McCormick envelopes to linearize the resulting product of continuous and binary variables. We present the convex hull of the underlying mixed integer linear set. The effectiveness of this reformulation and associated facet-defining inequalities are computationally evaluated on five classes of instances.
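The two steps of the reformulation can be illustrated on a single bilinear term $xz$ with integer $0 \le z \le Z$ and continuous $0 \le x \le U$; the bounds here are generic and chosen only for illustration:

```latex
\[
z=\sum_{k=0}^{K}2^{k}y_{k},\quad y_{k}\in\{0,1\},\quad K=\lfloor\log_{2}Z\rfloor,
\qquad
xz=\sum_{k=0}^{K}2^{k}w_{k},\quad w_{k}=x\,y_{k},
\]
\[
w_{k}\le U y_{k},\qquad w_{k}\le x,\qquad w_{k}\ge x-U(1-y_{k}),\qquad w_{k}\ge 0 .
\]
```

Because each $y_k$ is binary, the McCormick envelope of $w_k = x y_k$ is exact, so the linearized model is a valid reformulation rather than a relaxation.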

Journal ArticleDOI
TL;DR: The paper addresses the problem of low-rank trace norm minimization with an algorithm that alternates between fixed-rank optimization and rank-one updates and presents a second-order trust-region algorithm with a guaranteed quadratic rate of convergence.
Abstract: The paper addresses the problem of low-rank trace norm minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of duality gap numerically tractable. The search space is nonlinear but is equipped with a Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges superlinearly to the global solution while maintaining complexity that is linear in the number of rows and columns of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters we propose a predictor-corrector approach that outperforms the naive warm-restart approach on the fixed-rank quotient manifold. The performance of the proposed algorithm is illustrated on p...

Journal ArticleDOI
TL;DR: This work proves O(1/k) convergence rates for two variants of cyclic coordinate descent under an isotonicity assumption by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm.
Abstract: Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in signal processing, statistics, and machine learning. Reasons for this renewed interest include the simplicity, speed, and stability of the method, as well as its competitive performance on $\ell_1$ regularized smooth optimization problems. Surprisingly, very little is known about its nonasymptotic convergence behavior on these problems. Most existing results either just prove convergence or provide asymptotic rates. We fill this gap in the literature by proving $O(1/k)$ convergence rates (where $k$ is the iteration count) for two variants of cyclic coordinate descent under an isotonicity assumption. Our analysis proceeds by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm. We show that the iterates generated by the cyclic coordinate descent methods remain better than those of gradient descent uniformly over time.

Journal ArticleDOI
TL;DR: Two different primal-dual splitting algorithms are proposed for solving inclusions involving mixtures of composite and parallel-sum type monotone operators; both rely on an inexact Douglas--Rachford splitting method, but applied in different underlying Hilbert spaces.
Abstract: In this paper we propose two different primal-dual splitting algorithms for solving inclusions involving mixtures of composite and parallel-sum type monotone operators which rely on an inexact Douglas--Rachford splitting method, but applied in different underlying Hilbert spaces. Most importantly, the algorithms allow one to process the bounded linear operators and the set-valued operators occurring in the formulation of the monotone inclusion problem separately at each iteration, the latter being individually accessed via their resolvents. The performance of the primal-dual algorithms is emphasized via some numerical experiments on location and image denoising problems.

Journal ArticleDOI
TL;DR: This paper provides a new relaxation including second-order-cone constraints that strengthens the usual SDP relaxation for the case where an additional ellipsoidal constraint is added to TRS, resulting in the ``two trust-region subproblem'' (TTRS).
Abstract: The classical trust-region subproblem (TRS) minimizes a nonconvex quadratic objective over the unit ball. In this paper, we consider extensions of TRS having extra constraints. When two parallel cuts are added to TRS, we show that the resulting nonconvex problem has an exact representation as a semidefinite program with additional linear and second-order-cone constraints. For the case where an additional ellipsoidal constraint is added to TRS, resulting in the ``two trust-region subproblem'' (TTRS), we provide a new relaxation including second-order-cone constraints that strengthens the usual SDP relaxation.

Journal ArticleDOI
TL;DR: In this paper, a homotopy continuation strategy is proposed that solves the $\ell_1$-regularized least-squares problem for a sequence of decreasing values of the regularization parameter and uses an approximate solution at the end of each stage to warm start the next stage.
Abstract: We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often exhibits fast linear convergence in the final stage. We exploit the local linear convergence using a homotopy continuation strategy, i.e., we solve the $\ell_1$-LS problem for a sequence of decreasing values of the regularization parameter, and use an approximate solution at the end of each stage to warm start the next stage. Although similar strategies have been studied in the literature, there have been no theoretical analysis of their global iteration complexity. This paper shows that under suitable assumptions for sparse recovery, the proposed homotopy strategy ensures that all iterates along the homotopy sol...
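A minimal sketch of the homotopy strategy, using proximal gradient (iterative soft-thresholding) as the inner solver; the shrink factor, inner iteration count, and starting value of the regularization parameter are illustrative choices only:

```python
# Sketch of homotopy continuation for min 0.5*||Ax - b||^2 + lam*||x||_1,
# decreasing lam stage by stage and warm-starting each stage from the last.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_ls_homotopy(A, b, lam_target, lam0=None, shrink=0.7, inner_iters=100):
    L = np.linalg.norm(A, 2) ** 2                    # Lipschitz constant of the gradient
    lam = np.max(np.abs(A.T @ b)) if lam0 is None else lam0
    x = np.zeros(A.shape[1])
    while True:
        lam = max(lam * shrink, lam_target)          # next stage's regularization level
        for _ in range(inner_iters):                 # proximal gradient, warm started
            x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
        if lam <= lam_target:
            return x
```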

Journal ArticleDOI
TL;DR: The scenario approach as mentioned in this paper provides an intuitive way of approximating the solution to chance-constrained optimization programs, based on finding the optimal solution under a finite number of sampled outcomes of the uncertainty.
Abstract: The scenario-based optimization approach (``scenario approach'') provides an intuitive way of approximating the solution to chance-constrained optimization programs, based on finding the optimal solution under a finite number of sampled outcomes of the uncertainty (``scenarios''). A key merit of this approach is that it neither requires explicit knowledge of the uncertainty set, as in robust optimization, nor of its probability distribution, as in stochastic optimization. The scenario approach is also computationally efficient because it only requires the solution to a convex optimization program, even if the original chance-constrained problem is nonconvex. Recent research has obtained a rigorous foundation for the scenario approach, by establishing a direct link between the number of scenarios and bounds on the constraint violation probability. These bounds are tight in the general case of an uncertain optimization problem with a single chance constraint. This paper shows that the bounds can be improved...

Journal ArticleDOI
TL;DR: In this paper, a generalization of the ellipsoid algorithm was proposed to minimize a convex Lipschitz function under a stochastic bandit feedback model, where the algorithm is allowed to observe noisy realizations of the function value at any query point.
Abstract: This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \mathcal{X}$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\widetilde{\mathcal{O}}({\rm poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.

Journal ArticleDOI
TL;DR: A limited memory version of the nonlinear conjugate gradient method is developed that possesses a global convergence property similar to that of the memoryless algorithm but has much better practical performance.
Abstract: In theory, the successive gradients generated by the conjugate gradient method applied to a quadratic should be orthogonal. However, for some ill-conditioned problems, orthogonality is quickly lost due to rounding errors, and convergence is much slower than expected. A limited memory version of the nonlinear conjugate gradient method is developed. The memory is used to both detect the loss of orthogonality and to restore orthogonality. An implementation of the algorithm is presented based on the CG_DESCENT nonlinear conjugate gradient method. Limited memory CG_DESCENT (L-CG_DESCENT) possesses a global convergence property similar to that of the memoryless algorithm but has much better practical performance. Numerical comparisons to the limited memory BFGS method (L-BFGS) are given using the CUTEr test problems.

Journal ArticleDOI
TL;DR: This work considers the primal problem of finding the zeros of the sum of a maximal monotone operator and the composition of another maximal monotone operator with a linear continuous operator.
Abstract: We consider the primal problem of finding the zeros of the sum of a maximal monotone operator and the composition of another maximal monotone operator with a linear continuous operator. By formulat...

Journal ArticleDOI
TL;DR: It is proved that uniform second-order growth, tilt stability, and strong metric regularity of the subdifferential---three notions that have appeared in entirely different settings---are all essentially equivalent for any lower-semicontinuous, extended real-valued function.
Abstract: We prove that uniform second-order growth, tilt stability, and strong metric regularity of the subdifferential---three notions that have appeared in entirely different settings---are all essentially equivalent for any lower-semicontinuous, extended real-valued function.

Journal ArticleDOI
TL;DR: This paper derives affine-scaled second order necessary and sufficient conditions for local minimizers of minimization problems with nonconvex, nonsmooth, perhaps non-Lipschitz penalty functions and proposes a globally convergent smoothing trust region Newton method.
Abstract: Regularized minimization problems with nonconvex, nonsmooth, perhaps non-Lipschitz penalty functions have attracted considerable attention in recent years, owing to their wide applications in image restoration, signal reconstruction, and variable selection. In this paper, we derive affine-scaled second order necessary and sufficient conditions for local minimizers of such minimization problems. Moreover, we propose a globally convergent smoothing trust region Newton method which can find a point satisfying the affine-scaled second order necessary optimality condition from any starting point. Numerical examples are given to demonstrate the effectiveness of the smoothing trust region Newton method.

Journal ArticleDOI
TL;DR: In this article, a variant of the pessimistic bilevel optimization problem is studied, which comprises constraints that must be satisfied for any optimal solution of a subordinate (lower-level) optimization problem.
Abstract: We study a variant of the pessimistic bilevel optimization problem, which comprises constraints that must be satisfied for any optimal solution of a subordinate (lower-level) optimization problem. We present conditions that guarantee the existence of optimal solutions in such a problem, and we characterize the computational complexity of various subclasses of the problem. We then focus on problem instances that may lack convexity, but that satisfy a certain independence property. We develop convergent approximations for these instances, and we derive an iterative solution scheme that is reminiscent of the discretization techniques used in semi-infinite programming. We also present a computational study that illustrates the numerical behavior of our algorithm on standard benchmark instances.

Journal ArticleDOI
TL;DR: A general primal-dual splitting algorithm for solving systems of structured coupled monotone inclusions in Hilbert spaces is introduced and its asymptotic behavior is analyzed, providing a flexible solution method applicable to a variety of problems beyond the reach of the state-of-the-art.
Abstract: A general primal-dual splitting algorithm for solving systems of structured coupled monotone inclusions in Hilbert spaces is introduced and its asymptotic behavior is analyzed. Each inclusion in the primal system features compositions with linear operators, parallel sums, and Lipschitzian operators. All the operators involved in this structured model are used separately in the proposed algorithm, most steps of which can be executed in parallel. This provides a flexible solution method applicable to a variety of problems beyond the reach of the state-of-the-art. Several applications are discussed to illustrate this point.

Journal ArticleDOI
TL;DR: A broad setting is described in which computing the generalized Hessian of Mordukhovich is easy, and in this setting the idea of tilt stability introduced by Poliquin and Rockafellar is equivalent to a classical smooth second-order condition.
Abstract: We compare two recent variational-analytic approaches to second-order conditions and sensitivity analysis for nonsmooth optimization. We describe a broad setting where computing the generalized Hessian of Mordukhovich is easy. In this setting, the idea of tilt stability introduced by Poliquin and Rockafellar is equivalent to a classical smooth second-order condition.

Journal ArticleDOI
TL;DR: This work provides a new regularization method for MPECs whose limit points are M-stationary, in contrast with most existing regularization methods, which converge only to C-stationary points, a very weak stationarity concept.
Abstract: Mathematical programs with equilibrium (or complementarity) constraints (MPECs) form a difficult class of optimization problems. The feasible set has a very special structure and violates most of the standard constraint qualifications. Therefore, one typically applies specialized algorithms in order to solve MPECs. One very prominent class of specialized algorithms are the regularization (or relaxation) methods. The first regularization method for MPECs is due to Scholtes [SIAM J. Optim., 11 (2001), pp. 918--936], but in the meantime, there exist a number of different regularization schemes which try to relax the difficult constraints in different ways. However, almost all regularization methods converge to C-stationary points only, which is a very weak stationarity concept. An exception is a recent method by Kadrani, Dussault, and Benchakroun [SIAM J. Optim., 20 (2009), pp. 78--103], whose limit points are shown to be M-stationary. Here we provide a new regularization method which also converges to M-sta...