Showing papers in &quot;Mathematical Programming in 2017&quot;

On the linear convergence of the alternating direction method of multipliers

TL;DR: In this paper, the stochastic average gradient (SAG) method is used to optimize the sum of a finite number of smooth convex functions, which achieves a faster convergence rate than black-box SG methods.

...read moreread less

Abstract: We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from $$O(1/\sqrt{k})$$O(1/k) to O(1 / k) in general, and when the sum is strongly-convex the convergence rate is improved from the sub-linear O(1 / k) to a linear convergence rate of the form $$O(\rho ^k)$$O(?k) for $$\rho < 1$$?<1. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work Le Roux et al. (Adv Neural Inf Process Syst, 2012), which only lead to a faster rate for well-conditioned strongly-convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.

...read moreread less

769 citations

Journal Article•DOI•

[...]

Mingyi Hong¹, Zhi-Quan Luo²•Institutions (2)

Iowa State University¹, The Chinese University of Hong Kong²

From error bounds to the complexity of first-order descent methods for convex functions

TL;DR: This paper establishes the global R-linear convergence of the ADMM for minimizing the sum of any number of convex separable functions, assuming that a certain error bound condition holds true and the dual stepsize is sufficiently small.

...read moreread less

Abstract: We analyze the convergence rate of the alternating direction method of multipliers (ADMM) for minimizing the sum of two or more nonsmooth convex separable functions subject to linear constraints. Previous analysis of the ADMM typically assumes that the objective function is the sum of only two convex functions defined on two separable blocks of variables even though the algorithm works well in numerical experiments for three or more blocks. Moreover, there has been no rate of convergence analysis for the ADMM without strong convexity in the objective function. In this paper we establish the global R-linear convergence of the ADMM for minimizing the sum of any number of convex separable functions, assuming that a certain error bound condition holds true and the dual stepsize is sufficiently small. Such an error bound condition is satisfied for example when the feasible set is a compact polyhedron and the objective function consists of a smooth strictly convex function composed with a linear mapping, and a nonsmooth $$\ell _1$$l1 regularizer. This result implies the linear convergence of the ADMM for contemporary applications such as LASSO without assuming strong convexity of the objective function.

...read moreread less

705 citations

Journal Article•DOI•

[...]

Jérôme Bolte¹, Jérôme Bolte², Trong Phong Nguyen², Juan Peypouquet³, Bruce W. Suter⁴ - Show less +1 more•Institutions (4)

Yamaguchi University¹, University of Toulouse², Federico Santa María Technical University³, Air Force Research Laboratory⁴

01 Oct 2017-Mathematical Programming

TL;DR: In particular, the authors showed that the complexity of first-order descent methods for convex minimization can be computed using the Kurdyka-Łojasiewicz (KL) inequality.

...read moreread less

Abstract: This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. In a first stage, this objective led us to revisit the interplay between error bounds and the Kurdyka-Łojasiewicz (KL) inequality. One can show the equivalence between the two concepts for convex functions having a moderately flat profile near the set of minimizers (as those of functions with Holderian growth). A counterexample shows that the equivalence is no longer true for extremely flat functions. This fact reveals the relevance of an approach based on KL inequality. In a second stage, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is completely original and makes use of a one-dimensional worst-case proximal sequence in the spirit of the famous majorant method of Kantorovich. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results for the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple method: derive an error bound, compute the desingularizing function whenever possible, identify essential constants in the descent method and finally compute the complexity using the one-dimensional worst case proximal sequence. Our method is illustrated through projection methods for feasibility problems, and through the famous iterative shrinkage thresholding algorithm (ISTA), for which we show that the complexity bound is of the form $O(q^{k})$ where the constituents of the bound only depend on error bound constants obtained for an arbitrary least squares objective with $\ell ^1$ regularization.

...read moreread less

258 citations

Journal Article•DOI•

Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions

[...]

Mengdi Wang¹, Ethan X. Fang², Han Liu¹•Institutions (2)

Princeton University¹, Pennsylvania State University²

Smooth strongly convex interpolation and exact worst-case performance of first-order methods

TL;DR: It is proved that the SCGD converge almost surely to an optimal solution for convex optimization problems, as long as such a solution exists and any limit point generated by SCGD is a stationary point, for which the convergence rate analysis is provided.

...read moreread less

Abstract: Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values or a composition of two expected-value functions, i.e., the problem $$\min _x \mathbf{E}_v\left[ f_v\big (\mathbf{E}_w [g_w(x)]\big ) \right] .$$minxEvfv(Ew[gw(x)]). In order to solve this stochastic composition problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of quasi-gradient method. SCGD update the solutions based on noisy sample gradients of $$f_v,g_{w}$$fv,gw and use an auxiliary variable to track the unknown quantity $$\mathbf{E}_w\left[ g_w(x)\right] $$Ewgw(x). We prove that the SCGD converge almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different time scales. For nonsmooth convex problems, the SCGD achieves a convergence rate of $$\mathcal {O}(k^{-1/4})$$O(k-1/4) in the general case and $$\mathcal {O}(k^{-2/3})$$O(k-2/3) in the strongly convex case, after taking k samples. For smooth convex problems, the SCGD can be accelerated to converge at a rate of $$\mathcal {O}(k^{-2/7})$$O(k-2/7) in the general case and $$\mathcal {O}(k^{-4/5})$$O(k-4/5) in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide the convergence rate analysis. Indeed, the stochastic setting where one wants to optimize compositions of expected-value functions is very common in practice. The proposed SCGD methods find wide applications in learning, estimation, dynamic programming, etc.

...read moreread less

209 citations

Journal Article•DOI•

[...]

Adrien B. Taylor¹, Julien M. Hendrickx¹, François Glineur¹•Institutions (1)

Université catholique de Louvain¹

A trust region algorithm with a worst-case iteration complexity of $$\mathcal{O}(\epsilon ^{-3/2})$$O(∈-3/2) for nonconvex optimization

TL;DR: In this article, the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs.

...read moreread less

Abstract: We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs. Finding the worst-case performance of a black-box first-order method is formulated as an optimization problem over a set of smooth (strongly) convex functions and initial conditions. We develop closed-form necessary and sufficient conditions for smooth (strongly) convex interpolation, which provide a finite representation for those functions. This allows us to reformulate the worst-case performance estimation problem as an equivalent finite dimension-independent semidefinite optimization problem, whose exact solution can be recovered up to numerical precision. Optimal solutions to this performance estimation problem provide both worst-case performance bounds and explicit functions matching them, as our smooth (strongly) convex interpolation procedure is constructive. Our works build on those of Drori and Teboulle (Math Program 145(1---2):451---482, 2014) who introduced and solved relaxations of the performance estimation problem for smooth convex functions. We apply our approach to different fixed-step first-order methods with several performance criteria, including objective function accuracy and gradient norm. We conjecture several numerically supported worst-case bounds on the performance of the fixed-step gradient, fast gradient and optimized gradient methods, both in the smooth convex and the smooth strongly convex cases, and deduce tight estimates of the optimal step size for the gradient method.

...read moreread less

165 citations

Journal Article•DOI•

[...]

Frank E. Curtis¹, Daniel P. Robinson², Mohammadreza Samadi¹•Institutions (2)

Lehigh University¹, Johns Hopkins University²

Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models

TL;DR: It is proved that the trust region algorithm, entitled trace, follows a trust region framework, but employs modified step acceptance criteria and a novel trust region update mechanism that allow the algorithm to achieve such a worst-case global complexity bound.

...read moreread less

Abstract: We propose a trust region algorithm for solving nonconvex smooth optimization problems. For any $$\overline{\epsilon }\in (0,\infty )$$∈¯?(0,?), the algorithm requires at most $$\mathcal{O}(\epsilon ^{-3/2})$$O(∈-3/2) iterations, function evaluations, and derivative evaluations to drive the norm of the gradient of the objective function below any $$\epsilon \in (0,\overline{\epsilon }]$$∈?(0,∈¯]. This improves upon the $$\mathcal{O}(\epsilon ^{-2})$$O(∈-2) bound known to hold for some other trust region algorithms and matches the $$\mathcal{O}(\epsilon ^{-3/2})$$O(∈-3/2) bound for the recently proposed Adaptive Regularisation framework using Cubics, also known as the arc algorithm. Our algorithm, entitled trace, follows a trust region framework, but employs modified step acceptance criteria and a novel trust region update mechanism that allow the algorithm to achieve such a worst-case global complexity bound. Importantly, we prove that our algorithm also attains global and fast local convergence guarantees under similar assumptions as for other trust region algorithms. We also prove a worst-case upper bound on the number of iterations, function evaluations, and derivative evaluations that the algorithm requires to obtain an approximate second-order stationary point.

...read moreread less

165 citations

Journal Article•DOI•

[...]

Ernesto G. Birgin¹, John Lenon Cardoso Gardenghi¹, José Mario Martínez², Sandra A. Santos², Ph. L. Toint³ - Show less +1 more•Institutions (3)

University of São Paulo¹, State University of Campinas², Université de Namur³

Iteration complexity analysis of block coordinate descent methods

TL;DR: The worst-case evaluation complexity for smooth (possibly nonconvex) unconstrained optimization is considered and it is shown that an $$epsilon $$ϵ-approximate first-order critical point can be computed in at most O(ϵ-(p+1)/p) evaluations of the problem’s objective function and its derivatives.

...read moreread less

Abstract: The worst-case evaluation complexity for smooth (possibly nonconvex) unconstrained optimization is considered. It is shown that, if one is willing to use derivatives of the objective function up to order p (for $$p\ge 1$$pź1) and to assume Lipschitz continuity of the p-th derivative, then an $$\epsilon $$∈-approximate first-order critical point can be computed in at most $$O(\epsilon ^{-(p+1)/p})$$O(∈-(p+1)/p) evaluations of the problem's objective function and its derivatives. This generalizes and subsumes results known for $$p=1$$p=1 and $$p=2$$p=2.

...read moreread less

154 citations

Journal Article•DOI•

[...]

Mingyi Hong¹, Xiangfeng Wang², Meisam Razaviyayn³, Zhi-Quan Luo⁴•Institutions (4)

Iowa State University¹, East China Normal University², Stanford University³, The Chinese University of Hong Kong⁴

A unified approach to error bounds for structured convex optimization problems

TL;DR: In this paper, a unified iteration complexity analysis for a family of general block coordinate descent methods, covering popular methods such as the block coordinate gradient descent and the block coordinates proximal gradient, under various different coordinate update rules, is provided.

...read moreread less

Abstract: In this paper, we provide a unified iteration complexity analysis for a family of general block coordinate descent methods, covering popular methods such as the block coordinate gradient descent and the block coordinate proximal gradient, under various different coordinate update rules. We unify these algorithms under the so-called block successive upper-bound minimization (BSUM) framework, and show that for a broad class of multi-block nonsmooth convex problems, all algorithms covered by the BSUM framework achieve a global sublinear iteration complexity of $$\mathcal{{O}}(1/r)$$O(1/r), where r is the iteration index. Moreover, for the case of block coordinate minimization where each block is minimized exactly, we establish the sublinear convergence rate of O(1/r) without per block strong convexity assumption.

...read moreread less

152 citations

Journal Article•DOI•

[...]

Zirui Zhou¹, Anthony Man-Cho So¹•Institutions (1)

The Chinese University of Hong Kong¹

25 Jan 2017-Mathematical Programming

TL;DR: In this paper, the authors present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper function.

...read moreread less

Abstract: Error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. In this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates not only fairly general constrained minimization problems but also various regularized loss minimization formulations in machine learning, signal processing, and statistics. Using our framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. To further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. We believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.

...read moreread less

121 citations

Journal Article•DOI•

An efficient inexact symmetric Gauss---Seidel based majorized ADMM for high-dimensional convex composite conic programming

[...]

Liang Chen¹, Defeng Sun², Kim-Chuan Toh²•Institutions (2)

Hunan University¹, National University of Singapore²

Mixed-integer quadratic programming is in NP

TL;DR: In this article, an inexact 2-block majorized semi-proximal ADMM was proposed for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy.

...read moreread less

Abstract: In this paper, we propose an inexact multi-block ADMM-type first-order method for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy. The design of this method combines an inexact 2-block majorized semi-proximal ADMM and the recent advances in the inexact symmetric Gauss---Seidel (sGS) technique for solving a multi-block convex composite quadratic programming whose objective contains a nonsmooth term involving only the first block-variable. One distinctive feature of our proposed method (the sGS-imsPADMM) is that it only needs one cycle of an inexact sGS method, instead of an unknown number of cycles, to solve each of the subproblems involved. With some simple and implementable error tolerance criteria, the cost for solving the subproblems can be greatly reduced, and many steps in the forward sweep of each sGS cycle can often be skipped, which further contributes to the efficiency of the proposed method. Global convergence as well as the iteration complexity in the non-ergodic sense is established. Preliminary numerical experiments on some high-dimensional linear and convex quadratic SDP problems with a large number of linear equality and inequality constraints are also provided. The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2---3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.

...read moreread less

112 citations

Journal Article•DOI•

[...]

Alberto Del Pia¹, Santanu S. Dey², Marco Molinaro²•Institutions (2)

University of Wisconsin-Madison¹, Pontifical Catholic University of Rio de Janeiro²

Accelerated schemes for a class of variational inequalities

TL;DR: In this paper, it was shown that the decision version of mixed-integer quadratic programming is in NP-complete and that there is no polynomial-size solution for this problem.

...read moreread less

Abstract: Mixed-integer quadratic programming is the problem of optimizing a quadratic function over points in a polyhedral set where some of the components are restricted to be integral. In this paper, we prove that the decision version of mixed-integer quadratic programming is in NP, thereby showing that it is NP-complete. This is established by showing that if the decision version of mixed-integer quadratic programming is feasible, then there exists a solution of polynomial size. This result generalizes and unifies classical results that quadratic programming is in NP (Vavasis in Inf Process Lett 36(2):73---77 [17]) and integer linear programming is in NP (Borosh and Treybig in Proc Am Math Soc 55:299---304 [1], von zur Gathen and Sieveking in Proc Am Math Soc 72:155---158 [18], Kannan and Monma in Lecture Notes in Economics and Mathematical Systems, vol. 157, pp. 161---172. Springer [9], Papadimitriou in J Assoc Comput Mach 28:765---768 [15]).

...read moreread less

Journal Article•DOI•

[...]

Yunmei Chen¹, Guanghui Lan², Yuyuan Ouyang³•Institutions (3)

University of Florida¹, Georgia Institute of Technology², Clemson University³

Linearly convergent away-step conditional gradient for non-strongly convex functions

TL;DR: In this article, a stochastic accelerated mirror-prox (SAMP) method was proposed for solving a class of monotone variational inequalities (SVI), which is based on a multi-step acceleration scheme.

...read moreread less

Abstract: We propose a novel stochastic method, namely the stochastic accelerated mirror-prox (SAMP) method, for solving a class of monotone stochastic variational inequalities (SVI). The main idea of the proposed algorithm is to incorporate a multi-step acceleration scheme into the stochastic mirror-prox method. The developed SAMP method computes weak solutions with the optimal iteration complexity for SVIs. In particular, if the operator in SVI consists of the stochastic gradient of a smooth function, the iteration complexity of the SAMP method can be accelerated in terms of their dependence on the Lipschitz constant of the smooth function. For SVIs with bounded feasible sets, the bound of the iteration complexity of the SAMP method depends on the diameter of the feasible set. For unbounded SVIs, we adopt the modified gap function introduced by Monteiro and Svaiter for solving monotone inclusion, and show that the iteration complexity of the SAMP method depends on the distance from the initial point to the set of strong solutions. It is worth noting that our study also significantly improves a few existing complexity results for solving deterministic variational inequality problems. We demonstrate the advantages of the SAMP method over some existing algorithms through our preliminary numerical experiments.

...read moreread less

Journal Article•DOI•

[...]

Amir Beck¹, Shimrit Shtern¹•Institutions (1)

Technion – Israel Institute of Technology¹

01 Jul 2017-Mathematical Programming

TL;DR: In this paper, the authors consider the problem of minimizing the sum of a linear function and a composition of a strongly convex function with a linear transformation over a compact polyhedral set.

...read moreread less

Abstract: We consider the problem of minimizing the sum of a linear function and a composition of a strongly convex function with a linear transformation over a compact polyhedral set. Jaggi and Lacoste-Julien (An affine invariant linear convergence analysis for Frank-Wolfe algorithms. NIPS 2013 Workshop on Greedy Algorithms, Frank-Wolfe and Friends, 2014) show that the conditional gradient method with away steps -- employed on the aforementioned problem without the additional linear term -- has a linear rate of convergence, depending on the so-called pyramidal width of the feasible set. We revisit this result and provide a variant of the algorithm and an analysis based on simple linear programming duality arguments, as well as corresponding error bounds. This new analysis (a) enables the incorporation of the additional linear term, and (b) depends on a new constant, that is explicitly expressed in terms of the problem's parameters and the geometry of the feasible set. This constant replaces the pyramidal width, which is difficult to evaluate.

...read moreread less

Journal Article•DOI•

Tightness of the maximum likelihood semidefinite relaxation for angular synchronization

[...]

Afonso S. Bandeira¹, Nicolas Boumal², Amit Singer²•Institutions (2)

New York University¹, Princeton University²

Relative entropy optimization and its applications

TL;DR: In this paper, the authors consider the problem of estimating a collection of n phases, given noisy measurements of the pairwise relative phases, and show that the classical semidefinite relaxation for it is tight with high probability.

...read moreread less

Abstract: Maximum likelihood estimation problems are, in general, intractable optimization problems. As a result, it is common to approximate the maximum likelihood estimator (MLE) using convex relaxations. In some cases, the relaxation is tight: it recovers the true MLE. Most tightness proofs only apply to situations where the MLE exactly recovers a planted solution (known to the analyst). It is then sufficient to establish that the optimality conditions hold at the planted signal. In this paper, we study an estimation problem (angular synchronization) for which the MLE is not a simple function of the planted solution, yet for which the convex relaxation is tight. To establish tightness in this context, the proof is less direct because the point at which to verify optimality conditions is not known explicitly. Angular synchronization consists in estimating a collection of n phases, given noisy measurements of the pairwise relative phases. The MLE for angular synchronization is the solution of a (hard) non-bipartite Grothendieck problem over the complex numbers. We consider a stochastic model for the data: a planted signal (that is, a ground truth set of phases) is corrupted with non-adversarial random noise. Even though the MLE does not coincide with the planted signal, we show that the classical semidefinite relaxation for it is tight, with high probability. This holds even for high levels of noise.

...read moreread less

Journal Article•DOI•

[...]

Venkat Chandrasekaran¹, Parikshit Shah²•Institutions (2)

California Institute of Technology¹, Wisconsin Institutes for Discovery²

On smoothing, regularization, and averaging in stochastic approximation methods for stochastic variational inequality problems

TL;DR: This expository article studies optimization problems specified via linear and relative entropy inequalities and provides solutions based on REPs to a range of problems such as permanent maximization, robust optimization formulations of GPs, and hitting-time estimation in dynamical systems.

...read moreread less

Abstract: In this expository article, we study optimization problems specified via linear and relative entropy inequalities. Such relative entropy programs (REPs) are convex optimization problems as the relative entropy function is jointly convex with respect to both its arguments. Prominent families of convex programs such as geometric programs (GPs), second-order cone programs, and entropy maximization problems are special cases of REPs, although REPs are more general than these classes of problems. We provide solutions based on REPs to a range of problems such as permanent maximization, robust optimization formulations of GPs, and hitting-time estimation in dynamical systems. We survey previous approaches to some of these problems and the limitations of those methods, and we highlight the more powerful generalizations afforded by REPs. We conclude with a discussion of quantum analogs of the relative entropy function, including a review of the similarities and distinctions with respect to the classical case. We also describe a stylized application of quantum relative entropy optimization that exploits the joint convexity of the quantum relative entropy function.

...read moreread less

Journal Article•DOI•

[...]

Farzad Yousefian¹, Angelia Nedic², Uday V. Shanbhag³•Institutions (3)

Oklahoma State University–Stillwater¹, Arizona State University², Pennsylvania State University³

Stochastic variational inequalities: single-stage to multistage

TL;DR: A regularized smoothed SA (RSSA) scheme wherein the stepsize, smoothing, and regularization parameters are reduced after every iteration at a prescribed rate, and it is shown that the algorithm generates iterates that converge to the least norm solution in an almost sure sense.

...read moreread less

Abstract: Traditionally, most stochastic approximation (SA) schemes for stochastic variational inequality (SVI) problems have required the underlying mapping to be either strongly monotone or monotone and Lipschitz continuous. In contrast, we consider SVIs with merely monotone and non-Lipschitzian maps. We develop a regularized smoothed SA (RSSA) scheme wherein the stepsize, smoothing, and regularization parameters are reduced after every iteration at a prescribed rate. Under suitable assumptions on the sequences, we show that the algorithm generates iterates that converge to the least norm solution in an almost sure sense, extending the results in Koshal et al. (IEEE Trans Autom Control 58(3):594–609, 2013) to the non-Lipschitzian regime. Additionally, we provide rate estimates that relate iterates to their counterparts derived from a smoothed Tikhonov trajectory associated with a deterministic problem. To derive non-asymptotic rate statements, we develop a variant of the RSSA scheme, denoted by aRSSA $$_r$$ , in which we employ a weighted iterate-averaging, parameterized by a scalar r where $$r = 1$$ provides us with the standard averaging scheme. The main contributions are threefold: (i) when $$r<1$$ and the parameter sequences are chosen appropriately, we show that the averaged sequence converges to the least norm solution almost surely and a suitably defined gap function diminishes at an approximate rate $$\mathcal{O}({1}\slash {\root 6 \of {k}})$$ after k steps; (ii) when $$r<1$$ , and smoothing and regularization are suppressed, the gap function admits the rate $$\mathcal{O}({1}\slash {\sqrt{k}})$$ , thus improving the rate $$\mathcal{O}(\ln (k)/\sqrt{k})$$ under standard averaging; and (iii) we develop a window-based variant of this scheme that also displays the optimal rate for $$r < 1$$ . Notably, we prove the superiority of the scheme with $$r < 1$$ with its counterpart with $$r=1$$ in terms of the constant factor of the error bound when the size of the averaging window is sufficiently large. We present the performance of the developed schemes on a stochastic Nash–Cournot game with merely monotone and non-Lipschitzian maps.

...read moreread less

Journal Article•DOI•

[...]

R. Tyrrell Rockafellar¹, Roger J.-B. Wets²•Institutions (2)

University of Washington¹, University of California, Davis²

Min---max---min robust combinatorial optimization

TL;DR: The idea that nonanticipativity as an explicit constraint on responses along with an associated “multiplier” element which captures the “price of information” and provides a means of decomposition as a tool in algorithmic developments is extended here to a framework which supports multistage optimization and equilibrium models while also clarifying the single-stage picture.

...read moreread less

Abstract: Variational inequality modeling, analysis and computations are important for many applications, but much of the subject has been developed in a deterministic setting with no uncertainty in a problem's data. In recent years research has proceeded on a track to incorporate stochasticity in one way or another. However, the main focus has been on rather limited ideas of what a stochastic variational inequality might be. Because variational inequalities are especially tuned to capturing conditions for optimality and equilibrium, stochastic variational inequalities ought to provide such service for problems of optimization and equilibrium in a stochastic setting. Therefore they ought to be able to deal with multistage decision processes involving actions that respond to increasing levels of information. Critical for that, as discovered in stochastic programming, is introducing nonanticipativity as an explicit constraint on responses along with an associated "multiplier" element which captures the "price of information" and provides a means of decomposition as a tool in algorithmic developments. That idea is extended here to a framework which supports multistage optimization and equilibrium models while also clarifying the single-stage picture.

...read moreread less

Journal Article•DOI•

[...]

Christoph Buchheim¹, Jannis Kurtz¹•Institutions (1)

Technical University of Dortmund¹

On the Douglas---Rachford algorithm

TL;DR: The case where no first stage variables exist is considered and the min–max–min problem is shown to be NP-hard for every fixed number k, even when the uncertainty set is a polyhedron, given by an inner description.

...read moreread less

Abstract: The idea of k-adaptability in two-stage robust optimization is to calculate a fixed number k of second-stage policies here-and-now. After the actual scenario is revealed, the best of these policies is selected. This idea leads to a min---max---min problem. In this paper, we consider the case where no first stage variables exist and propose to use this approach to solve combinatorial optimization problems with uncertainty in the objective function. We investigate the complexity of this special case for convex uncertainty sets. We first show that the min---max---min problem is as easy as the underlying certain problem if k is greater than the number of variables and if we can optimize a linear function over the uncertainty set in polynomial time. We also provide an exact and practical oracle-based algorithm to solve the latter problem for any underlying combinatorial problem. On the other hand, we prove that the min---max---min problem is NP-hard for every fixed number k, even when the uncertainty set is a polyhedron, given by an inner description. For the case that k is smaller or equal to the number of variables, we finally propose a fast heuristic algorithm and evaluate its performance.

...read moreread less

Journal Article•DOI•

[...]

Heinz H. Bauschke¹, Walaa M. Moursi¹•Institutions (1)

University of British Columbia¹

01 Jul 2017-Mathematical Programming

TL;DR: In this paper, it was shown that in the (possibly inconsistent) convex feasibility setting, the shadow sequence remains bounded and its weak cluster points solve a best approximation problem, and a more general sufficient condition for weak convergence in the general case is presented.

...read moreread less

Abstract: The Douglas---Rachford algorithm is a very popular splitting technique for finding a zero of the sum of two maximally monotone operators. The behaviour of the algorithm remains mysterious in the general inconsistent case, i.e., when the sum problem has no zeros. However, more than a decade ago, it was shown that in the (possibly inconsistent) convex feasibility setting, the shadow sequence remains bounded and its weak cluster points solve a best approximation problem. In this paper, we advance the understanding of the inconsistent case significantly by providing a complete proof of the full weak convergence in the convex feasibility setting. In fact, a more general sufficient condition for the weak convergence in the general case is presented. Our proof relies on a new convergence principle for Fejer monotone sequences. Numerous examples illustrate our results.

...read moreread less

Journal Article•DOI•

Nonanticipative duality, relaxations, and formulations for chance-constrained stochastic programs

[...]

Shabbir Ahmed¹, James Luedtke², Yongjia Song³, Weijun Xie¹•Institutions (3)

Georgia Institute of Technology¹, University of Wisconsin-Madison², Virginia Commonwealth University³

Two-stage stochastic variational inequalities: an ERM-solution procedure

TL;DR: This work proposes two new Lagrangian dual problems for chance-constrained stochastic programs based on relaxing nonanticipativity constraints and proposes a new heuristic method and two new exact algorithms based on these duals and formulations.

...read moreread less

Abstract: We propose two new Lagrangian dual problems for chance-constrained stochastic programs based on relaxing nonanticipativity constraints. We compare the strength of the proposed dual bounds and demonstrate that they are superior to the bound obtained from the continuous relaxation of a standard mixed-integer programming (MIP) formulation. For a given dual solution, the associated Lagrangian relaxation bounds can be calculated by solving a set of single scenario subproblems and then solving a single knapsack problem. We also derive two new primal MIP formulations and demonstrate that for chance-constrained linear programs, the continuous relaxations of these formulations yield bounds equal to the proposed dual bounds. We propose a new heuristic method and two new exact algorithms based on these duals and formulations. The first exact algorithm applies to chance-constrained binary programs, and uses either of the proposed dual bounds in concert with cuts that eliminate solutions found by the subproblems. The second exact method is a branch-and-cut algorithm for solving either of the primal formulations. Our computational results indicate that the proposed dual bounds and heuristic solutions can be obtained efficiently, and the gaps between the best dual bounds and the heuristic solutions are small.

...read moreread less

Journal Article•DOI•

[...]

Xiaojun Chen¹, Ting Kei Pong¹, Roger J.-B. Wets²•Institutions (2)

Hong Kong Polytechnic University¹, University of California, Davis²

14 Mar 2017-Mathematical Programming

TL;DR: A two-stage stochastic variational inequality model to deal with random variables in variational inequalities is proposed, and the solvability, differentiability and convexity of the two- Stage Stochastic programming and the convergence of its sample average approximation are established.

...read moreread less

Abstract: We propose a two-stage stochastic variational inequality model to deal with random variables in variational inequalities, and formulate this model as a two-stage stochastic programming with recourse by using an expected residual minimization solution procedure. The solvability, differentiability and convexity of the two-stage stochastic programming and the convergence of its sample average approximation are established. Examples of this model are given, including the optimality conditions for stochastic programs, a Walras equilibrium problem and Wardrop flow equilibrium. We also formulate stochastic traffic assignments on arcs flow as a two-stage stochastic variational inequality based on Wardrop flow equilibrium and present numerical results of the Douglas–Rachford splitting method for the corresponding two-stage stochastic programming with recourse.

...read moreread less

Journal Article•DOI•

Quadratic reformulations of nonlinear binary optimization problems

[...]

Martin Anthony¹, Endre Boros², Yves Crama³, Aritanan Gruber⁴•Institutions (4)

London School of Economics and Political Science¹, Rutgers University², University of Liège³, University of São Paulo⁴

How to convexify the intersection of a second order cone and a nonconvex quadratic

TL;DR: This paper provides tight lower and upper bounds on the number of auxiliary variables needed in the worst-case for general objective functions, for bounded-degree functions, and for a restricted class of quadratizations.

...read moreread less

Abstract: Very large nonlinear unconstrained binary optimization problems arise in a broad array of applications. Several exact or heuristic techniques have proved quite successful for solving many of these problems when the objective function is a quadratic polynomial. However, no similarly efficient methods are available for the higher degree case. Since high degree objectives are becoming increasingly important in certain application areas, such as computer vision, various techniques have been recently developed to reduce the general case to the quadratic one, at the cost of increasing the number of variables by introducing additional auxiliary variables. In this paper we initiate a systematic study of these quadratization approaches. We provide tight lower and upper bounds on the number of auxiliary variables needed in the worst-case for general objective functions, for bounded-degree functions, and for a restricted class of quadratizations. Our upper bounds are constructive, thus yielding new quadratization procedures. Finally, we completely characterize all "minimal" quadratizations of negative monomials.

...read moreread less

Journal Article•DOI•

[...]

Samuel Burer¹, Fatma Kılınç-Karzan²•Institutions (2)

University of Iowa¹, Carnegie Mellon University²

Two-stage non-cooperative games with risk-averse players

TL;DR: A recent series of papers has examined the extension of disjunctive-programming techniques to mixed-integer second-order-cone programming as discussed by the authors, and it has been shown that the convex hull of the intersection of an ellipsoid and a split disjunction is a disjoint, and that it is possible to compute the disjunct with respect to a convex shape.

...read moreread less

Abstract: A recent series of papers has examined the extension of disjunctive-programming techniques to mixed-integer second-order-cone programming. For example, it has been shown--by several authors using different techniques--that the convex hull of the intersection of an ellipsoid, $$\mathcal {E}$$E, and a split disjunction, $$(l - x_j)(x_j - u) \le 0$$(l-xj)(xj-u)≤0 with $$l < u$$l

...read moreread less

Journal Article•DOI•

[...]

Jong-Shi Pang¹, Suvrajeet Sen¹, Uday V. Shanbhag²•Institutions (2)

University of Southern California¹, Pennsylvania State University²

A spatial branch-and-cut method for nonconvex QCQP with bounded complex variables

TL;DR: A unified formulation of the two-stage SNEP with risk-averse players and convex quadratic recourse functions is introduced and a generalized diagonal dominance condition on the players’ smoothed objective functions is imposed that facilitates the application and ensures the convergence of an iterative best-response scheme.

...read moreread less

Abstract: This paper formally introduces and studies a non-cooperative multi-agent game under uncertainty. The well-known Nash equilibrium is employed as the solution concept of the game. While there are several formulations of a stochastic Nash equilibrium problem, we focus mainly on a two-stage setting of the game wherein each agent is risk-averse and solves a rival-parameterized stochastic program with quadratic recourse. In such a game, each agent takes deterministic actions in the first stage and recourse decisions in the second stage after the uncertainty is realized. Each agent's overall objective consists of a deterministic first-stage component plus a second-stage mean-risk component defined by a coherent risk measure describing the agent's risk aversion. We direct our analysis towards a broad class of quantile-based risk measures and linear-quadratic recourse functions. For this class of non-cooperative games under uncertainty, the agents' objective functions can be shown to be convex in their own decision variables, provided that the deterministic component of these functions have the same convexity property. Nevertheless, due to the non-differentiability of the recourse functions, the agents' objective functions are at best directionally differentiable. Such non-differentiability creates multiple challenges for the analysis and solution of the game, two principal ones being: (1) a stochastic multi-valued variational inequality is needed to characterize a Nash equilibrium, provided that the players' optimization problems are convex; (2) one needs to be careful in the design of algorithms that require differentiability of the objectives. Moreover, the resulting (multi-valued) variational formulation cannot be expected to be of the monotone type in general. The main contributions of this paper are as follows: (a) Prior to addressing the main problem of the paper, we summarize several approaches that have existed in the literature to deal with uncertainty in a non-cooperative game. (b) We introduce a unified formulation of the two-stage SNEP with risk-averse players and convex quadratic recourse functions and highlight the technical challenges in dealing with this game. (c) To handle the lack of smoothness, we propose smoothing schemes and regularization that lead to differentiable approximations. (d) To deal with non-monotonicity, we impose a generalized diagonal dominance condition on the players' smoothed objective functions that facilitates the application and ensures the convergence of an iterative best-response scheme. (e) To handle the expectation operator, we rely on known methods in stochastic programming that include sampling and approximation. (f) We provide convergence results for various versions of the best-response scheme, particularly for the case of private recourse functions. Overall, this paper lays the foundation for future research into the class of SNEPs that provides a constructive paradigm for modeling and solving competitive decision making problems with risk-averse players facing uncertainty; this paradigm is very much at an infancy stage of research and requires extensive treatment in order to meet its broad applications in many engineering and economics domains.

...read moreread less

Journal Article•DOI•

[...]

Chen Chen¹, Alper Atamtürk¹, Shmuel S. Oren¹•Institutions (1)

University of California, Berkeley¹

01 Oct 2017-Mathematical Programming

TL;DR: In this paper, a spatial branch-and-cut approach for nonconvex quadratically constrained quadratic programs with bounded complex variables (CQCQP) was developed.

...read moreread less

Abstract: We develop a spatial branch-and-cut approach for nonconvex quadratically constrained quadratic programs with bounded complex variables (CQCQP). Linear valid inequalities are added at each node of the search tree to strengthen semidefinite programming relaxations of CQCQP. These valid inequalities are derived from the convex hull description of a nonconvex set of $$2 \times 2$$ positive semidefinite Hermitian matrices subject to a rank-one constraint. We propose branching rules based on an alternative to the rank-one constraint that allows for local measurement of constraint violation. Closed-form bound tightening procedures are used to reduce the domain of the problem. We apply the algorithm to solve the alternating current optimal power flow problem with complex variables as well as the box-constrained quadratic programming problem with real variables.

...read moreread less

Journal Article•DOI•

Exact augmented Lagrangian duality for mixed integer linear programming

[...]

Mohammad Javad Feizollahi¹, Shabbir Ahmed², Andy Sun²•Institutions (2)

J. Mack Robinson College of Business¹, Georgia Institute of Technology²

A cutting-plane approach for large-scale capacitated multi-period facility location using a specialized interior-point method

TL;DR: It is shown that, under some mild conditions, ALD using any norm as the augmenting function is able to close the duality gap of an MIP with a finite penalty coefficient, which generalizes the result in Boland and Eberhard (2015) from pure integer programming problems with bounded feasible region to general MIPs.

...read moreread less

Abstract: We investigate the augmented Lagrangian dual (ALD) for mixed integer linear programming (MIP) problems. ALD modifies the classical Lagrangian dual by appending a nonlinear penalty function on the violation of the dualized constraints in order to reduce the duality gap. We first provide a primal characterization for ALD for MIPs and prove that ALD is able to asymptotically achieve zero duality gap when the weight on the penalty function is allowed to go to infinity. This provides an alternative characterization and proof of a recent result in Boland and Eberhard (Math Program 150(2):491---509, 2015, Proposition 3). We further show that, under some mild conditions, ALD using any norm as the augmenting function is able to close the duality gap of an MIP with a finite penalty coefficient. This generalizes the result in Boland and Eberhard (2015, Corollary 1) from pure integer programming problems with bounded feasible region to general MIPs. We also present an example where ALD with a quadratic augmenting function is not able to close the duality gap for any finite penalty coefficient.

...read moreread less

Journal Article•DOI•

[...]

Jordi Castro¹, Stefano Nasini², Francisco Saldanha-da-Gama³•Institutions (3)

Polytechnic University of Catalonia¹, Lille Catholic University², University of Lisbon³

On imposing connectivity constraints in integer programs

TL;DR: In this article, a cutting-plane approach was proposed to solve a class of capacitated multi-period facility location problems, where the primal block-angular structure of the resulting linear optimization problems is exploited by the interior-point method, allowing the efficient solution of large instances.

...read moreread less

Abstract: We propose a cutting-plane approach (namely, Benders decomposition) for a class of capacitated multi-period facility location problems. The novelty of this approach lies on the use of a specialized interior-point method for solving the Benders subproblems. The primal block-angular structure of the resulting linear optimization problems is exploited by the interior-point method, allowing the (either exact or inexact) efficient solution of large instances. The consequences of different modeling conditions and problem specifications on the computational performance are also investigated both theoretically and empirically, providing a deeper understanding of the significant factors influencing the overall efficiency of the cutting-plane method. The methodology proposed allowed the solution of instances of up to 200 potential locations, one million customers and three periods, resulting in mixed integer linear optimization problems of up to 600 binary and 600 millions of continuous variables. Those problems were solved by the specialized approach in less than one hour and a half, outperforming other state-of-the-art methods, which exhausted the (144 GB of) available memory in the largest instances.

...read moreread less

Journal Article•DOI•

[...]

Yiming Wang¹, Austin Buchanan², Sergiy Butenko¹•Institutions (2)

Texas A&M University¹, Oklahoma State University–Stillwater²

08 Feb 2017-Mathematical Programming

TL;DR: This paper studies the connected subgraph polytope of a graph, which is the convex hull of subsets of vertices that induce a connected sub graph.

...read moreread less

Abstract: In many network applications, one searches for a connected subset of vertices that exhibits other desirable properties. To this end, this paper studies the connected subgraph polytope of a graph, which is the convex hull of subsets of vertices that induce a connected subgraph. Much of our work is devoted to the study of two nontrivial classes of valid inequalities. The first are the a, b-separator inequalities, which have been successfully used to enforce connectivity in previous computational studies. The second are the indegree inequalities, which have previously been shown to induce all nontrivial facets for trees. We determine the precise conditions under which these inequalities induce facets and when each class fully describes the connected subgraph polytope. Both classes of inequalities can be separated in polynomial time and admit compact extended formulations. However, while the a, b-separator inequalities can be lifted in linear time, it is NP-hard to lift the indegree inequalities.

...read moreread less

Journal Article•DOI•

On reduced semidefinite programs for second order moment bounds with applications

[...]

Karthik Natarajan¹, Chung-Piaw Teo²•Institutions (2)

Singapore University of Technology and Design¹, National University of Singapore²