
Showing papers in "Mathematical Programming in 2017"


Journal ArticleDOI
TL;DR: In this paper, the stochastic average gradient (SAG) method is used to optimize the sum of a finite number of smooth convex functions, which achieves a faster convergence rate than black-box SG methods.
Abstract: We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from $O(1/\sqrt{k})$ to $O(1/k)$ in general, and when the sum is strongly convex the convergence rate is improved from the sub-linear $O(1/k)$ to a linear convergence rate of the form $O(\rho^k)$ for $\rho < 1$. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work (Le Roux et al., Adv Neural Inf Process Syst, 2012), which only led to a faster rate for well-conditioned strongly convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
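As a concrete illustration of the mechanism the abstract describes, here is a minimal SAG sketch on a toy least-squares instance. The problem data, step-size rule, and iteration budget are illustrative choices, not taken from the paper:

```python
import numpy as np

def sag(grads, x0, n, step, iters, rng):
    """Minimal SAG sketch: keep a table with the last gradient seen for
    each of the n terms; each iteration refreshes one random entry and
    steps along the average of the stored gradients."""
    x = np.array(x0, dtype=float)
    table = np.zeros((n, x.size))      # stored gradient per term
    avg = np.zeros_like(x)             # running average of the table
    for _ in range(iters):
        i = rng.integers(n)
        g = grads[i](x)
        avg += (g - table[i]) / n      # O(d) update of the average
        table[i] = g
        x = x - step * avg
    return x

# Toy instance: minimize (1/n) * sum_i (a_i @ x - b_i)^2 with consistent
# data, so the minimizer can be checked directly.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))
b = A @ np.array([1.0, -2.0, 0.5])
grads = [lambda x, a=A[i], y=b[i]: 2.0 * a * (a @ x - y) for i in range(50)]
L_max = 2.0 * np.max(np.sum(A ** 2, axis=1))  # largest per-term Lipschitz constant
x_hat = sag(grads, np.zeros(3), n=50, step=1.0 / (3.0 * L_max), iters=10000, rng=rng)
```

Note how the memory makes each iteration as cheap as an SG step (one gradient evaluation, one $O(d)$ table update) while the search direction aggregates information from all terms.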

769 citations


Journal ArticleDOI
TL;DR: This paper establishes the global R-linear convergence of the ADMM for minimizing the sum of any number of convex separable functions, assuming that a certain error bound condition holds true and the dual stepsize is sufficiently small.
Abstract: We analyze the convergence rate of the alternating direction method of multipliers (ADMM) for minimizing the sum of two or more nonsmooth convex separable functions subject to linear constraints. Previous analysis of the ADMM typically assumes that the objective function is the sum of only two convex functions defined on two separable blocks of variables even though the algorithm works well in numerical experiments for three or more blocks. Moreover, there has been no rate of convergence analysis for the ADMM without strong convexity in the objective function. In this paper we establish the global R-linear convergence of the ADMM for minimizing the sum of any number of convex separable functions, assuming that a certain error bound condition holds true and the dual stepsize is sufficiently small. Such an error bound condition is satisfied for example when the feasible set is a compact polyhedron and the objective function consists of a smooth strictly convex function composed with a linear mapping, and a nonsmooth $\ell_1$ regularizer. This result implies the linear convergence of the ADMM for contemporary applications such as LASSO without assuming strong convexity of the objective function.
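To make the LASSO setting concrete, here is a minimal two-block ADMM sketch in scaled form for $\min \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|z\|_1$ subject to $x = z$. The penalty parameter, regularization weight, and iteration count are illustrative choices, not from the paper:

```python
import numpy as np

def lasso_admm(A, b, lam, rho=10.0, iters=300):
    """Two-block ADMM sketch (scaled form) for the LASSO problem
    min 0.5*||Ax - b||^2 + lam*||z||_1  subject to  x = z."""
    d = A.shape[1]
    x, z, u = np.zeros(d), np.zeros(d), np.zeros(d)  # u is the scaled dual
    M = A.T @ A + rho * np.eye(d)  # x-update solves a fixed linear system
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(M, Atb + rho * (z - u))
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z
    return z

# Sparse ground truth with noiseless measurements, so the LASSO solution
# sits very close to it for a small lam.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true
z_hat = lasso_admm(A, b, lam=0.1)
```

The objective here is exactly of the form the error bound condition covers: a strictly convex smooth function composed with a linear map, plus an $\ell_1$ term, so linear convergence holds without strong convexity.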

705 citations


Journal ArticleDOI
TL;DR: In particular, the authors showed that complexity bounds for first-order descent methods in convex minimization can be derived from error bounds via the Kurdyka-Łojasiewicz (KL) inequality.
Abstract: This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. In a first stage, this objective led us to revisit the interplay between error bounds and the Kurdyka-Łojasiewicz (KL) inequality. One can show the equivalence between the two concepts for convex functions having a moderately flat profile near the set of minimizers (such as functions with Hölderian growth). A counterexample shows that the equivalence is no longer true for extremely flat functions. This fact reveals the relevance of an approach based on the KL inequality. In a second stage, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is completely original and makes use of a one-dimensional worst-case proximal sequence in the spirit of the famous majorant method of Kantorovich. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results for the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple method: derive an error bound, compute the desingularizing function whenever possible, identify essential constants in the descent method and finally compute the complexity using the one-dimensional worst-case proximal sequence. Our method is illustrated through projection methods for feasibility problems, and through the famous iterative shrinkage thresholding algorithm (ISTA), for which we show that the complexity bound is of the form $O(q^{k})$ where the constituents of the bound only depend on error bound constants obtained for an arbitrary least squares objective with $\ell^1$ regularization.
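For reference, the ISTA iteration the abstract analyzes is very short; below is a minimal sketch on an $\ell^1$-regularized least-squares instance (problem data and iteration count are illustrative, and the code only checks the monotone descent property, not the $O(q^k)$ rate):

```python
import numpy as np

def ista(A, b, lam, iters=500):
    """ISTA sketch: a gradient step on 0.5*||Ax - b||^2 with step 1/L,
    followed by soft-thresholding at lam/L."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    history = []
    for _ in range(iters):
        v = x - A.T @ (A @ x - b) / L
        x = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)
        history.append(0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x)))
    return x, history

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
b = rng.standard_normal(40)
x_hat, history = ista(A, b, lam=0.5)
```

With step $1/L$ the objective values in `history` decrease monotonically; the paper's contribution is turning the error bound for this objective into an explicit linear rate for that decrease.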

258 citations


Journal ArticleDOI
TL;DR: It is proved that SCGD converges almost surely to an optimal solution for convex optimization problems whenever such a solution exists, and that for nonconvex problems every limit point generated by SCGD is a stationary point; convergence rate analyses are provided for both cases.
Abstract: Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values, i.e., a composition of two expected-value functions: the problem $\min_x \mathbf{E}_v\left[ f_v\big(\mathbf{E}_w[g_w(x)]\big)\right]$. In order to solve this stochastic composition problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of the quasi-gradient method. SCGD updates the solution based on noisy sample gradients of $f_v, g_w$ and uses an auxiliary variable to track the unknown quantity $\mathbf{E}_w[g_w(x)]$. We prove that SCGD converges almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different time scales. For nonsmooth convex problems, SCGD achieves a convergence rate of $\mathcal{O}(k^{-1/4})$ in the general case and $\mathcal{O}(k^{-2/3})$ in the strongly convex case, after taking k samples. For smooth convex problems, SCGD can be accelerated to converge at a rate of $\mathcal{O}(k^{-2/7})$ in the general case and $\mathcal{O}(k^{-4/5})$ in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide a convergence rate analysis. Indeed, the stochastic setting in which one wants to optimize compositions of expected-value functions is very common in practice. The proposed SCGD methods find wide applications in learning, estimation, dynamic programming, etc.
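The two-timescale structure is easy to see on a toy composition. The sketch below uses the illustrative choices $f(y) = \|y\|^2$ and $g_w(x) = x + w$ with zero-mean noise $w$ (so the true objective is $\|x\|^2$); the constant step sizes are also illustrative, whereas the paper uses decaying ones:

```python
import numpy as np

def scgd(x0, iters, rng, alpha=0.05, beta=0.2):
    """Basic SCGD sketch for min_x f(E_w[g_w(x)]) with the toy choices
    f(y) = ||y||^2 and g_w(x) = x + w (w zero-mean noise), so the true
    objective is ||x||^2 with minimizer 0. The auxiliary variable y
    tracks the unknown inner expectation E_w[g_w(x)] on a faster
    timescale than the x-update."""
    x = np.array(x0, dtype=float)
    y = np.zeros_like(x)
    for _ in range(iters):
        w = 0.1 * rng.standard_normal(x.size)
        y = (1.0 - beta) * y + beta * (x + w)  # fast: track E[g_w(x)]
        grad = np.eye(x.size) @ (2.0 * y)      # sampled Jacobian^T of g times grad f at y
        x = x - alpha * grad                   # slow: descent step on x
    return x

rng = np.random.default_rng(0)
x_hat = scgd(2.0 * np.ones(3), iters=2000, rng=rng)
```

The key point, as in the abstract, is that the gradient of the composition is never observed directly: only samples of $g_w$, its Jacobian, and $\nabla f_v$ are used, with the running average `y` standing in for the inner expectation.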

209 citations


Journal ArticleDOI
TL;DR: In this article, the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs.
Abstract: We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs. Finding the worst-case performance of a black-box first-order method is formulated as an optimization problem over a set of smooth (strongly) convex functions and initial conditions. We develop closed-form necessary and sufficient conditions for smooth (strongly) convex interpolation, which provide a finite representation for those functions. This allows us to reformulate the worst-case performance estimation problem as an equivalent finite dimension-independent semidefinite optimization problem, whose exact solution can be recovered up to numerical precision. Optimal solutions to this performance estimation problem provide both worst-case performance bounds and explicit functions matching them, as our smooth (strongly) convex interpolation procedure is constructive. Our work builds on that of Drori and Teboulle (Math Program 145(1–2):451–482, 2014), who introduced and solved relaxations of the performance estimation problem for smooth convex functions. We apply our approach to different fixed-step first-order methods with several performance criteria, including objective function accuracy and gradient norm. We conjecture several numerically supported worst-case bounds on the performance of the fixed-step gradient, fast gradient and optimized gradient methods, both in the smooth convex and the smooth strongly convex cases, and deduce tight estimates of the optimal step size for the gradient method.

165 citations


Journal ArticleDOI
TL;DR: The proposed algorithm, entitled TRACE, follows a trust region framework but employs modified step acceptance criteria and a novel trust region update mechanism that allow it to achieve a worst-case global complexity bound of $\mathcal{O}(\epsilon^{-3/2})$.
Abstract: We propose a trust region algorithm for solving nonconvex smooth optimization problems. For any $\overline{\epsilon} \in (0,\infty)$, the algorithm requires at most $\mathcal{O}(\epsilon^{-3/2})$ iterations, function evaluations, and derivative evaluations to drive the norm of the gradient of the objective function below any $\epsilon \in (0,\overline{\epsilon}]$. This improves upon the $\mathcal{O}(\epsilon^{-2})$ bound known to hold for some other trust region algorithms and matches the $\mathcal{O}(\epsilon^{-3/2})$ bound for the recently proposed Adaptive Regularisation framework using Cubics, also known as the ARC algorithm. Our algorithm, entitled TRACE, follows a trust region framework, but employs modified step acceptance criteria and a novel trust region update mechanism that allow the algorithm to achieve such a worst-case global complexity bound. Importantly, we prove that our algorithm also attains global and fast local convergence guarantees under similar assumptions as for other trust region algorithms. We also prove a worst-case upper bound on the number of iterations, function evaluations, and derivative evaluations that the algorithm requires to obtain an approximate second-order stationary point.
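For orientation, here is a sketch of the classical trust region framework that TRACE departs from: quadratic model, ratio test, radius update. The Cauchy-point step, the test problem, and all constants are illustrative simplifications; TRACE's modified acceptance criteria and radius update are not reproduced here:

```python
import numpy as np

def trust_region(f, grad, hess, x0, delta=1.0, iters=500):
    """Classical trust-region sketch: build a quadratic model, take the
    Cauchy point, accept/reject by the ratio test, and grow or shrink
    the radius. TRACE modifies the acceptance criteria and the radius
    update; those modifications are not reproduced here."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        g, H = grad(x), hess(x)
        gnorm = np.linalg.norm(g)
        gHg = g @ H @ g
        # Cauchy point: minimize the model along -g inside the radius.
        tau = min(gnorm ** 3 / (delta * gHg), 1.0) if gHg > 0 else 1.0
        s = -tau * delta * g / (gnorm + 1e-16)
        pred = -(g @ s + 0.5 * s @ H @ s)   # model decrease
        ared = f(x) - f(x + s)              # actual decrease
        rho = ared / pred if pred > 0 else -1.0
        if rho > 0.1:                       # ratio test: accept the step
            x = x + s
        delta = 2.0 * delta if rho > 0.75 else (0.5 * delta if rho < 0.25 else delta)
    return x

# Rosenbrock, a standard nonconvex smooth test problem.
f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                           200 * (x[1] - x[0] ** 2)])
hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
                           [-400 * x[0], 200.0]])
x0 = np.array([-1.2, 1.0])
x_hat = trust_region(f, grad, hess, x0)
```

This vanilla scheme only has the $\mathcal{O}(\epsilon^{-2})$ guarantee mentioned in the abstract; the paper's contribution is precisely the modified acceptance and update rules that lift it to $\mathcal{O}(\epsilon^{-3/2})$.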

165 citations


Journal ArticleDOI
TL;DR: The worst-case evaluation complexity for smooth (possibly nonconvex) unconstrained optimization is considered and it is shown that an $\epsilon$-approximate first-order critical point can be computed in at most $O(\epsilon^{-(p+1)/p})$ evaluations of the problem's objective function and its derivatives.
Abstract: The worst-case evaluation complexity for smooth (possibly nonconvex) unconstrained optimization is considered. It is shown that, if one is willing to use derivatives of the objective function up to order p (for $p \ge 1$) and to assume Lipschitz continuity of the p-th derivative, then an $\epsilon$-approximate first-order critical point can be computed in at most $O(\epsilon^{-(p+1)/p})$ evaluations of the problem's objective function and its derivatives. This generalizes and subsumes results known for $p=1$ and $p=2$.

154 citations


Journal ArticleDOI
TL;DR: In this paper, a unified iteration complexity analysis is provided for a family of general block coordinate descent methods, covering popular methods such as the block coordinate gradient descent and the block coordinate proximal gradient, under various coordinate update rules.
Abstract: In this paper, we provide a unified iteration complexity analysis for a family of general block coordinate descent methods, covering popular methods such as the block coordinate gradient descent and the block coordinate proximal gradient, under various coordinate update rules. We unify these algorithms under the so-called block successive upper-bound minimization (BSUM) framework, and show that for a broad class of multi-block nonsmooth convex problems, all algorithms covered by the BSUM framework achieve a global sublinear iteration complexity of $\mathcal{O}(1/r)$, where r is the iteration index. Moreover, for the case of block coordinate minimization where each block is minimized exactly, we establish the sublinear convergence rate of $O(1/r)$ without a per-block strong convexity assumption.
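The exact-minimization rule mentioned at the end is the simplest member of the family; here is a minimal sketch of cyclic block coordinate minimization on a least-squares objective (the instance and block partition are illustrative choices):

```python
import numpy as np

def bcd_least_squares(A, b, blocks, sweeps=500):
    """Cyclic block coordinate minimization sketch for min ||Ax - b||^2:
    each block is minimized exactly while the other blocks are held
    fixed (the exact-minimization rule covered by the BSUM framework)."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for idx in blocks:
            # Residual with this block's own contribution removed.
            r = b - A @ x + A[:, idx] @ x[idx]
            x[idx] = np.linalg.lstsq(A[:, idx], r, rcond=None)[0]
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 6))
x_true = rng.standard_normal(6)
b = A @ x_true
blocks = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
x_hat = bcd_least_squares(A, b, blocks)
```

Each inner step solves a small least-squares subproblem in one block, which is exactly the per-block upper-bound minimization that BSUM covers with no per-block strong convexity needed.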

152 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function.
Abstract: Error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. In this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates not only fairly general constrained minimization problems but also various regularized loss minimization formulations in machine learning, signal processing, and statistics. Using our framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. To further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. We believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.

121 citations


Journal ArticleDOI
TL;DR: In this article, an inexact 2-block majorized semi-proximal ADMM was proposed for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy.
Abstract: In this paper, we propose an inexact multi-block ADMM-type first-order method for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy. The design of this method combines an inexact 2-block majorized semi-proximal ADMM and the recent advances in the inexact symmetric Gauss–Seidel (sGS) technique for solving a multi-block convex composite quadratic programming whose objective contains a nonsmooth term involving only the first block-variable. One distinctive feature of our proposed method (the sGS-imsPADMM) is that it only needs one cycle of an inexact sGS method, instead of an unknown number of cycles, to solve each of the subproblems involved. With some simple and implementable error tolerance criteria, the cost for solving the subproblems can be greatly reduced, and many steps in the forward sweep of each sGS cycle can often be skipped, which further contributes to the efficiency of the proposed method. Global convergence as well as the iteration complexity in the non-ergodic sense is established. Preliminary numerical experiments on some high-dimensional linear and convex quadratic SDP problems with a large number of linear equality and inequality constraints are also provided. The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2–3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.

112 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that the decision version of mixed-integer quadratic programming is in NP, and hence NP-complete, by proving that every feasible instance admits a solution of polynomial size.
Abstract: Mixed-integer quadratic programming is the problem of optimizing a quadratic function over points in a polyhedral set where some of the components are restricted to be integral. In this paper, we prove that the decision version of mixed-integer quadratic programming is in NP, thereby showing that it is NP-complete. This is established by showing that if the decision version of mixed-integer quadratic programming is feasible, then there exists a solution of polynomial size. This result generalizes and unifies classical results that quadratic programming is in NP (Vavasis in Inf Process Lett 36(2):73–77 [17]) and integer linear programming is in NP (Borosh and Treybig in Proc Am Math Soc 55:299–304 [1], von zur Gathen and Sieveking in Proc Am Math Soc 72:155–158 [18], Kannan and Monma in Lecture Notes in Economics and Mathematical Systems, vol. 157, pp. 161–172. Springer [9], Papadimitriou in J Assoc Comput Mach 28:765–768 [15]).

Journal ArticleDOI
TL;DR: In this article, a stochastic accelerated mirror-prox (SAMP) method, based on a multi-step acceleration scheme, was proposed for solving a class of monotone stochastic variational inequalities (SVIs).
Abstract: We propose a novel stochastic method, namely the stochastic accelerated mirror-prox (SAMP) method, for solving a class of monotone stochastic variational inequalities (SVI). The main idea of the proposed algorithm is to incorporate a multi-step acceleration scheme into the stochastic mirror-prox method. The developed SAMP method computes weak solutions with the optimal iteration complexity for SVIs. In particular, if the operator in SVI consists of the stochastic gradient of a smooth function, the iteration complexity of the SAMP method can be accelerated in terms of their dependence on the Lipschitz constant of the smooth function. For SVIs with bounded feasible sets, the bound of the iteration complexity of the SAMP method depends on the diameter of the feasible set. For unbounded SVIs, we adopt the modified gap function introduced by Monteiro and Svaiter for solving monotone inclusion, and show that the iteration complexity of the SAMP method depends on the distance from the initial point to the set of strong solutions. It is worth noting that our study also significantly improves a few existing complexity results for solving deterministic variational inequality problems. We demonstrate the advantages of the SAMP method over some existing algorithms through our preliminary numerical experiments.
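As background for the method being accelerated, here is the deterministic Euclidean extragradient iteration at the core of mirror-prox, on an illustrative bilinear saddle problem (the operator, step size, and iteration count are toy choices; the paper's method adds stochasticity, a mirror map, and multi-step acceleration on top of this):

```python
import numpy as np

def extragradient(F, z0, step, iters):
    """Euclidean extragradient sketch (the deterministic, non-accelerated
    core of mirror-prox): a prediction step, then a correction step using
    the operator evaluated at the predicted point."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        z_half = z - step * F(z)   # prediction
        z = z - step * F(z_half)   # correction
    return z

# Monotone operator of the bilinear saddle problem min_x max_y x^T A y,
# whose unique solution is x = y = 0.
A = np.array([[1.0, 0.5], [-0.5, 1.0]])
def F(z):
    x, y = z[:2], z[2:]
    return np.concatenate([A @ y, -A.T @ x])

z_hat = extragradient(F, np.ones(4), step=0.3, iters=500)
```

On this operator, plain forward steps $z - \gamma F(z)$ spiral outward for every $\gamma > 0$ (the operator is skew-symmetric, so each forward step strictly increases the norm); the extra prediction step is what makes the iteration contract.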

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of minimizing the sum of a linear function and a composition of a strongly convex function with a linear transformation over a compact polyhedral set.
Abstract: We consider the problem of minimizing the sum of a linear function and a composition of a strongly convex function with a linear transformation over a compact polyhedral set. Jaggi and Lacoste-Julien (An affine invariant linear convergence analysis for Frank-Wolfe algorithms. NIPS 2013 Workshop on Greedy Algorithms, Frank-Wolfe and Friends, 2014) show that the conditional gradient method with away steps -- employed on the aforementioned problem without the additional linear term -- has a linear rate of convergence, depending on the so-called pyramidal width of the feasible set. We revisit this result and provide a variant of the algorithm and an analysis based on simple linear programming duality arguments, as well as corresponding error bounds. This new analysis (a) enables the incorporation of the additional linear term, and (b) depends on a new constant, that is explicitly expressed in terms of the problem's parameters and the geometry of the feasible set. This constant replaces the pyramidal width, which is difficult to evaluate.
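A minimal away-step Frank-Wolfe sketch on the unit simplex illustrates the algorithm family being analyzed; the objective $\|Ax-b\|^2$ is a strongly convex function composed with a linear map, as in the abstract, though without the extra linear term, and the instance and iteration budget are illustrative:

```python
import numpy as np

def away_step_fw(A, b, n, iters=2000):
    """Away-step Frank-Wolfe sketch on the unit simplex for
    min ||Ax - b||^2. Over the simplex the iterate's coordinates are
    exactly its vertex weights, so active-set bookkeeping is trivial."""
    x = np.zeros(n)
    x[0] = 1.0
    for _ in range(iters):
        g = 2.0 * A.T @ (A @ x - b)
        s = np.argmin(g)                      # Frank-Wolfe vertex
        active = np.flatnonzero(x > 1e-12)
        v = active[np.argmax(g[active])]      # away vertex
        if g @ x - g[s] >= g[v] - g @ x:      # larger gap wins
            d, gamma_max = np.eye(n)[s] - x, 1.0
        else:
            d = x - np.eye(n)[v]              # move away from vertex v
            gamma_max = x[v] / (1.0 - x[v]) if x[v] < 1.0 else 1.0
        Ad = A @ d
        denom = 2.0 * Ad @ Ad
        if denom <= 1e-16:
            break
        gamma = np.clip(-(g @ d) / denom, 0.0, gamma_max)  # exact line search
        x = x + gamma * d
    return x

# Consistent toy instance: the target lies in the image of the simplex,
# so the optimal value is zero.
rng = np.random.default_rng(0)
n = 5
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
b = A @ np.full(n, 1.0 / n)
x_hat = away_step_fw(A, b, n)
```

The away step is what removes mass from bad vertices and enables the linear rate; the paper's contribution is replacing the hard-to-evaluate pyramidal width in that rate with an explicit constant.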

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of estimating a collection of n phases, given noisy measurements of the pairwise relative phases, and show that the classical semidefinite relaxation for it is tight with high probability.
Abstract: Maximum likelihood estimation problems are, in general, intractable optimization problems. As a result, it is common to approximate the maximum likelihood estimator (MLE) using convex relaxations. In some cases, the relaxation is tight: it recovers the true MLE. Most tightness proofs only apply to situations where the MLE exactly recovers a planted solution (known to the analyst). It is then sufficient to establish that the optimality conditions hold at the planted signal. In this paper, we study an estimation problem (angular synchronization) for which the MLE is not a simple function of the planted solution, yet for which the convex relaxation is tight. To establish tightness in this context, the proof is less direct because the point at which to verify optimality conditions is not known explicitly. Angular synchronization consists in estimating a collection of n phases, given noisy measurements of the pairwise relative phases. The MLE for angular synchronization is the solution of a (hard) non-bipartite Grothendieck problem over the complex numbers. We consider a stochastic model for the data: a planted signal (that is, a ground truth set of phases) is corrupted with non-adversarial random noise. Even though the MLE does not coincide with the planted signal, we show that the classical semidefinite relaxation for it is tight, with high probability. This holds even for high levels of noise.
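A small numerical illustration of the estimation problem itself: planted phases observed through noisy pairwise relative-phase measurements. For simplicity the sketch solves the eigenvector relaxation rather than the paper's semidefinite relaxation (whose tightness is the actual subject of the paper); the noise model and all constants are illustrative:

```python
import numpy as np

# Angular synchronization toy instance: recover n planted phases from a
# noisy Hermitian matrix of pairwise relative phases.
rng = np.random.default_rng(0)
n, sigma = 20, 0.3
z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))   # planted phases
N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = np.outer(z, z.conj()) + sigma * (N + N.conj().T) / 2.0  # Hermitian data

# Eigenvector relaxation: take the leading eigenvector of C and push
# each entry back to unit modulus (a simpler cousin of the SDP).
vals, vecs = np.linalg.eigh(C)
v = vecs[:, -1]
z_hat = v / np.abs(v)
corr = np.abs(z_hat.conj() @ z) / n   # alignment up to a global phase
```

Since the data only constrain relative phases, the estimate is meaningful only up to a global rotation, which is why alignment is measured through $|\hat{z}^* z|/n$.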

Journal ArticleDOI
TL;DR: This expository article studies optimization problems specified via linear and relative entropy inequalities and provides solutions based on REPs to a range of problems such as permanent maximization, robust optimization formulations of GPs, and hitting-time estimation in dynamical systems.
Abstract: In this expository article, we study optimization problems specified via linear and relative entropy inequalities. Such relative entropy programs (REPs) are convex optimization problems as the relative entropy function is jointly convex with respect to both its arguments. Prominent families of convex programs such as geometric programs (GPs), second-order cone programs, and entropy maximization problems are special cases of REPs, although REPs are more general than these classes of problems. We provide solutions based on REPs to a range of problems such as permanent maximization, robust optimization formulations of GPs, and hitting-time estimation in dynamical systems. We survey previous approaches to some of these problems and the limitations of those methods, and we highlight the more powerful generalizations afforded by REPs. We conclude with a discussion of quantum analogs of the relative entropy function, including a review of the similarities and distinctions with respect to the classical case. We also describe a stylized application of quantum relative entropy optimization that exploits the joint convexity of the quantum relative entropy function.

Journal ArticleDOI
TL;DR: A regularized smoothed SA (RSSA) scheme is developed wherein the stepsize, smoothing, and regularization parameters are reduced after every iteration at a prescribed rate, and it is shown that the algorithm generates iterates that converge to the least norm solution in an almost sure sense.
Abstract: Traditionally, most stochastic approximation (SA) schemes for stochastic variational inequality (SVI) problems have required the underlying mapping to be either strongly monotone or monotone and Lipschitz continuous. In contrast, we consider SVIs with merely monotone and non-Lipschitzian maps. We develop a regularized smoothed SA (RSSA) scheme wherein the stepsize, smoothing, and regularization parameters are reduced after every iteration at a prescribed rate. Under suitable assumptions on the sequences, we show that the algorithm generates iterates that converge to the least norm solution in an almost sure sense, extending the results in Koshal et al. (IEEE Trans Autom Control 58(3):594–609, 2013) to the non-Lipschitzian regime. Additionally, we provide rate estimates that relate iterates to their counterparts derived from a smoothed Tikhonov trajectory associated with a deterministic problem. To derive non-asymptotic rate statements, we develop a variant of the RSSA scheme, denoted by aRSSA$_r$, in which we employ a weighted iterate-averaging, parameterized by a scalar r, where $r = 1$ provides the standard averaging scheme. The main contributions are threefold: (i) when $r<1$ and the parameter sequences are chosen appropriately, we show that the averaged sequence converges to the least norm solution almost surely and a suitably defined gap function diminishes at an approximate rate $\mathcal{O}(1/\sqrt[6]{k})$ after k steps; (ii) when $r<1$, and smoothing and regularization are suppressed, the gap function admits the rate $\mathcal{O}(1/\sqrt{k})$, thus improving the rate $\mathcal{O}(\ln(k)/\sqrt{k})$ under standard averaging; and (iii) we develop a window-based variant of this scheme that also displays the optimal rate for $r < 1$. Notably, we prove the superiority of the scheme with $r < 1$ over its counterpart with $r=1$ in terms of the constant factor of the error bound when the size of the averaging window is sufficiently large. We present the performance of the developed schemes on a stochastic Nash–Cournot game with merely monotone and non-Lipschitzian maps.

Journal ArticleDOI
TL;DR: The idea of imposing nonanticipativity as an explicit constraint on responses, along with an associated "multiplier" element that captures the "price of information" and provides a means of decomposition in algorithmic developments, is extended here to a framework that supports multistage optimization and equilibrium models while also clarifying the single-stage picture.
Abstract: Variational inequality modeling, analysis and computations are important for many applications, but much of the subject has been developed in a deterministic setting with no uncertainty in a problem's data. In recent years research has proceeded on a track to incorporate stochasticity in one way or another. However, the main focus has been on rather limited ideas of what a stochastic variational inequality might be. Because variational inequalities are especially tuned to capturing conditions for optimality and equilibrium, stochastic variational inequalities ought to provide such service for problems of optimization and equilibrium in a stochastic setting. Therefore they ought to be able to deal with multistage decision processes involving actions that respond to increasing levels of information. Critical for that, as discovered in stochastic programming, is introducing nonanticipativity as an explicit constraint on responses along with an associated "multiplier" element which captures the "price of information" and provides a means of decomposition as a tool in algorithmic developments. That idea is extended here to a framework which supports multistage optimization and equilibrium models while also clarifying the single-stage picture.

Journal ArticleDOI
TL;DR: The case where no first stage variables exist is considered, and the min–max–min problem is shown to be NP-hard for every fixed number k, even when the uncertainty set is a polyhedron given by an inner description.
Abstract: The idea of k-adaptability in two-stage robust optimization is to calculate a fixed number k of second-stage policies here-and-now. After the actual scenario is revealed, the best of these policies is selected. This idea leads to a min–max–min problem. In this paper, we consider the case where no first stage variables exist and propose to use this approach to solve combinatorial optimization problems with uncertainty in the objective function. We investigate the complexity of this special case for convex uncertainty sets. We first show that the min–max–min problem is as easy as the underlying certain problem if k is greater than the number of variables and if we can optimize a linear function over the uncertainty set in polynomial time. We also provide an exact and practical oracle-based algorithm to solve the latter problem for any underlying combinatorial problem. On the other hand, we prove that the min–max–min problem is NP-hard for every fixed number k, even when the uncertainty set is a polyhedron given by an inner description. For the case that k is smaller than or equal to the number of variables, we finally propose a fast heuristic algorithm and evaluate its performance.

Journal ArticleDOI
TL;DR: In this paper, it was shown that in the (possibly inconsistent) convex feasibility setting, the shadow sequence remains bounded and its weak cluster points solve a best approximation problem, and a more general sufficient condition for weak convergence in the general case is presented.
Abstract: The Douglas–Rachford algorithm is a very popular splitting technique for finding a zero of the sum of two maximally monotone operators. The behaviour of the algorithm remains mysterious in the general inconsistent case, i.e., when the sum problem has no zeros. However, more than a decade ago, it was shown that in the (possibly inconsistent) convex feasibility setting, the shadow sequence remains bounded and its weak cluster points solve a best approximation problem. In this paper, we advance the understanding of the inconsistent case significantly by providing a complete proof of the full weak convergence in the convex feasibility setting. In fact, a more general sufficient condition for the weak convergence in the general case is presented. Our proof relies on a new convergence principle for Fejér monotone sequences. Numerous examples illustrate our results.
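To make the "shadow sequence" concrete, here is a minimal Douglas–Rachford sketch for a consistent two-set feasibility problem; the sets (a disk and a line), starting point, and iteration count are illustrative choices:

```python
import numpy as np

def douglas_rachford(proj_A, proj_B, z0, iters=2000):
    """Douglas-Rachford sketch for the feasibility problem 'find a point
    in A intersect B', given the projectors onto A and B. The returned
    point proj_A(z) is the 'shadow' of the governing sequence z_k."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        x = proj_A(z)
        y = proj_B(2.0 * x - z)   # reflect through x, then project onto B
        z = z + y - x
    return proj_A(z)

# Consistent toy instance: the unit disk and a line cutting through it.
proj_disk = lambda z: z / max(np.linalg.norm(z), 1.0)
proj_line = lambda z: np.array([z[0], 0.5])          # the line y = 0.5
x_hat = douglas_rachford(proj_disk, proj_line, np.array([3.0, -2.0]))
```

In this consistent example the shadow sequence converges to a point of the intersection; the paper's contribution concerns the harder inconsistent case, where the governing sequence $z_k$ diverges but the shadow still converges weakly.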

Journal ArticleDOI
TL;DR: This work proposes two new Lagrangian dual problems for chance-constrained stochastic programs based on relaxing nonanticipativity constraints and proposes a new heuristic method and two new exact algorithms based on these duals and formulations.
Abstract: We propose two new Lagrangian dual problems for chance-constrained stochastic programs based on relaxing nonanticipativity constraints. We compare the strength of the proposed dual bounds and demonstrate that they are superior to the bound obtained from the continuous relaxation of a standard mixed-integer programming (MIP) formulation. For a given dual solution, the associated Lagrangian relaxation bounds can be calculated by solving a set of single scenario subproblems and then solving a single knapsack problem. We also derive two new primal MIP formulations and demonstrate that for chance-constrained linear programs, the continuous relaxations of these formulations yield bounds equal to the proposed dual bounds. We propose a new heuristic method and two new exact algorithms based on these duals and formulations. The first exact algorithm applies to chance-constrained binary programs, and uses either of the proposed dual bounds in concert with cuts that eliminate solutions found by the subproblems. The second exact method is a branch-and-cut algorithm for solving either of the primal formulations. Our computational results indicate that the proposed dual bounds and heuristic solutions can be obtained efficiently, and the gaps between the best dual bounds and the heuristic solutions are small.

Journal ArticleDOI
TL;DR: A two-stage stochastic variational inequality model is proposed to deal with random variables in variational inequalities, and the solvability, differentiability and convexity of the two-stage stochastic programming and the convergence of its sample average approximation are established.
Abstract: We propose a two-stage stochastic variational inequality model to deal with random variables in variational inequalities, and formulate this model as a two-stage stochastic programming with recourse by using an expected residual minimization solution procedure. The solvability, differentiability and convexity of the two-stage stochastic programming and the convergence of its sample average approximation are established. Examples of this model are given, including the optimality conditions for stochastic programs, a Walras equilibrium problem and Wardrop flow equilibrium. We also formulate stochastic traffic assignments on arcs flow as a two-stage stochastic variational inequality based on Wardrop flow equilibrium and present numerical results of the Douglas–Rachford splitting method for the corresponding two-stage stochastic programming with recourse.

Journal ArticleDOI
TL;DR: This paper provides tight lower and upper bounds on the number of auxiliary variables needed in the worst-case for general objective functions, for bounded-degree functions, and for a restricted class of quadratizations.
Abstract: Very large nonlinear unconstrained binary optimization problems arise in a broad array of applications. Several exact or heuristic techniques have proved quite successful for solving many of these problems when the objective function is a quadratic polynomial. However, no similarly efficient methods are available for the higher degree case. Since high degree objectives are becoming increasingly important in certain application areas, such as computer vision, various techniques have been recently developed to reduce the general case to the quadratic one, at the cost of increasing the number of variables by introducing additional auxiliary variables. In this paper we initiate a systematic study of these quadratization approaches. We provide tight lower and upper bounds on the number of auxiliary variables needed in the worst-case for general objective functions, for bounded-degree functions, and for a restricted class of quadratizations. Our upper bounds are constructive, thus yielding new quadratization procedures. Finally, we completely characterize all "minimal" quadratizations of negative monomials.
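A concrete instance of the quadratization idea discussed above is the standard one-auxiliary-variable construction for a negative monomial (due to Freedman and Drineas, and the object of the paper's "minimal quadratizations of negative monomials" characterization): $$-x_1 x_2 \cdots x_d = \min_{y \in \{0,1\}} y\big((d-1) - \sum_i x_i\big)$$. The brute-force check below verifies this identity for $$d = 3$$.

```python
from itertools import product

def neg_monomial(x):
    """-x1*x2*...*xd for a binary tuple x."""
    prod = 1
    for xi in x:
        prod *= xi
    return -prod

def quadratization(x, y):
    """Freedman-Drineas quadratization of the negative monomial:
    y * ((d - 1) - sum(x)).  Quadratic in (x, y), since y*sum(x) is bilinear."""
    d = len(x)
    return y * ((d - 1) - sum(x))

d = 3
for x in product((0, 1), repeat=d):
    # Minimizing over the single auxiliary variable y recovers the monomial.
    assert min(quadratization(x, y) for y in (0, 1)) == neg_monomial(x)
print("quadratization verified for all", 2 ** d, "assignments")
```

Only when every $$x_i = 1$$ does setting $$y = 1$$ pay off (value $$-1$$); otherwise $$y = 0$$ is optimal and the expression vanishes, exactly matching $$-\prod_i x_i$$.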

Journal ArticleDOI
TL;DR: A recent series of papers has examined the extension of disjunctive-programming techniques to mixed-integer second-order-cone programming, and it has been shown that the convex hull of the intersection of an ellipsoid and a split disjunction can be described in closed form by adding a single second-order-cone constraint.
Abstract: A recent series of papers has examined the extension of disjunctive-programming techniques to mixed-integer second-order-cone programming. For example, it has been shown--by several authors using different techniques--that the convex hull of the intersection of an ellipsoid, $$\mathcal {E}$$, and a split disjunction, $$(l - x_j)(x_j - u) \le 0$$ with $$l < u$$, is the intersection of $$\mathcal {E}$$ with an additional second-order-cone constraint.

Journal ArticleDOI
TL;DR: A unified formulation of the two-stage SNEP with risk-averse players and convex quadratic recourse functions is introduced and a generalized diagonal dominance condition on the players’ smoothed objective functions is imposed that facilitates the application and ensures the convergence of an iterative best-response scheme.
Abstract: This paper formally introduces and studies a non-cooperative multi-agent game under uncertainty. The well-known Nash equilibrium is employed as the solution concept of the game. While there are several formulations of a stochastic Nash equilibrium problem, we focus mainly on a two-stage setting of the game wherein each agent is risk-averse and solves a rival-parameterized stochastic program with quadratic recourse. In such a game, each agent takes deterministic actions in the first stage and recourse decisions in the second stage after the uncertainty is realized. Each agent's overall objective consists of a deterministic first-stage component plus a second-stage mean-risk component defined by a coherent risk measure describing the agent's risk aversion. We direct our analysis towards a broad class of quantile-based risk measures and linear-quadratic recourse functions. For this class of non-cooperative games under uncertainty, the agents' objective functions can be shown to be convex in their own decision variables, provided that the deterministic component of these functions has the same convexity property. Nevertheless, due to the non-differentiability of the recourse functions, the agents' objective functions are at best directionally differentiable. Such non-differentiability creates multiple challenges for the analysis and solution of the game, two principal ones being: (1) a stochastic multi-valued variational inequality is needed to characterize a Nash equilibrium, provided that the players' optimization problems are convex; (2) one needs to be careful in the design of algorithms that require differentiability of the objectives. Moreover, the resulting (multi-valued) variational formulation cannot be expected to be of the monotone type in general.
The main contributions of this paper are as follows: (a) Prior to addressing the main problem of the paper, we summarize several approaches that have existed in the literature to deal with uncertainty in a non-cooperative game. (b) We introduce a unified formulation of the two-stage SNEP with risk-averse players and convex quadratic recourse functions and highlight the technical challenges in dealing with this game. (c) To handle the lack of smoothness, we propose smoothing schemes and regularization that lead to differentiable approximations. (d) To deal with non-monotonicity, we impose a generalized diagonal dominance condition on the players' smoothed objective functions that facilitates the application and ensures the convergence of an iterative best-response scheme. (e) To handle the expectation operator, we rely on known methods in stochastic programming that include sampling and approximation. (f) We provide convergence results for various versions of the best-response scheme, particularly for the case of private recourse functions. Overall, this paper lays the foundation for future research into the class of SNEPs that provides a constructive paradigm for modeling and solving competitive decision making problems with risk-averse players facing uncertainty; this paradigm is still in its infancy and requires extensive treatment in order to meet its broad applications in many engineering and economics domains.
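The role of the diagonal dominance condition in item (d) can be illustrated on a deterministic toy game (a stand-in of my own, not the paper's risk-averse SNEP; all coefficients are illustrative). Each player minimizes a quadratic in its own variable given the rival's, and own-variable curvature exceeding the coupling term makes the best-response map a contraction.

```python
# Gauss-Seidel best-response for a toy two-player quadratic game.
# Player i minimizes 0.5*a_i*x_i**2 + b_i*x_i*x_j + c_i*x_i over x_i,
# so its best response is x_i = -(b_i*x_j + c_i)/a_i.  Diagonal dominance
# a_i > |b_i| makes this map a contraction, so the iteration converges.

a = (2.0, 2.0)   # own-variable curvature
b = (1.0, 1.0)   # coupling coefficients, |b_i| < a_i
c = (-2.0, -2.0)

x = [0.0, 0.0]
for _ in range(100):
    x[0] = -(b[0] * x[1] + c[0]) / a[0]   # player 1 responds to player 2
    x[1] = -(b[1] * x[0] + c[1]) / a[1]   # player 2 responds to player 1

print(x)  # converges to the Nash equilibrium (2/3, 2/3)
```

At the fixed point each player's response condition $$x_i = (2 - x_j)/2$$ holds simultaneously, giving the equilibrium $$x_1 = x_2 = 2/3$$; each Gauss-Seidel sweep shrinks the error by a factor of four.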

Journal ArticleDOI
TL;DR: In this paper, a spatial branch-and-cut approach for nonconvex quadratically constrained quadratic programs with bounded complex variables (CQCQP) was developed.
Abstract: We develop a spatial branch-and-cut approach for nonconvex quadratically constrained quadratic programs with bounded complex variables (CQCQP). Linear valid inequalities are added at each node of the search tree to strengthen semidefinite programming relaxations of CQCQP. These valid inequalities are derived from the convex hull description of a nonconvex set of $$2 \times 2$$ positive semidefinite Hermitian matrices subject to a rank-one constraint. We propose branching rules based on an alternative to the rank-one constraint that allows for local measurement of constraint violation. Closed-form bound tightening procedures are used to reduce the domain of the problem. We apply the algorithm to solve the alternating current optimal power flow problem with complex variables as well as the box-constrained quadratic programming problem with real variables.

Journal ArticleDOI
TL;DR: It is shown that, under some mild conditions, ALD using any norm as the augmenting function is able to close the duality gap of an MIP with a finite penalty coefficient, which generalizes the result in Boland and Eberhard (2015) from pure integer programming problems with bounded feasible region to general MIPs.
Abstract: We investigate the augmented Lagrangian dual (ALD) for mixed integer linear programming (MIP) problems. ALD modifies the classical Lagrangian dual by appending a nonlinear penalty function on the violation of the dualized constraints in order to reduce the duality gap. We first provide a primal characterization for ALD for MIPs and prove that ALD is able to asymptotically achieve zero duality gap when the weight on the penalty function is allowed to go to infinity. This provides an alternative characterization and proof of a recent result in Boland and Eberhard (Math Program 150(2):491–509, 2015, Proposition 3). We further show that, under some mild conditions, ALD using any norm as the augmenting function is able to close the duality gap of an MIP with a finite penalty coefficient. This generalizes the result in Boland and Eberhard (2015, Corollary 1) from pure integer programming problems with bounded feasible region to general MIPs. We also present an example where ALD with a quadratic augmenting function is not able to close the duality gap for any finite penalty coefficient.
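The gap-closing effect of a norm penalty can be seen by brute force on a two-variable toy MIP (the instance and all numbers below are my own, chosen for illustration, not taken from the paper): the classical Lagrangian dual leaves a duality gap, while a 1-norm augmenting term closes it at a finite penalty coefficient.

```python
from itertools import product

# Toy MIP:  min -x1 - x2  s.t.  x1 + 2*x2 = 2,  x in {0,1}^2.
# The only feasible point is (0,1), so the optimal value is -1.
X = list(product((0, 1), repeat=2))
obj = lambda x: -x[0] - x[1]
viol = lambda x: 2 - x[0] - 2 * x[1]   # residual of the dualized constraint

# Classical Lagrangian dual: max_lam min_x obj(x) + lam*viol(x),
# approximated by a fine grid of multipliers.
grid = [i / 100 for i in range(-300, 301)]
LD = max(min(obj(x) + lam * viol(x) for x in X) for lam in grid)

# Augmented Lagrangian dual with a 1-norm penalty (multiplier fixed at 0):
# min_x obj(x) + rho*|viol(x)| is a valid lower bound for every rho >= 0,
# since the penalty vanishes at feasible points.
ALD = lambda rho: min(obj(x) + rho * abs(viol(x)) for x in X)

print(LD)        # -1.5: the classical dual leaves a duality gap of 0.5
print(ALD(1.0))  # -1.0: the finite penalty rho = 1 already closes the gap
```

The classical dual bound stalls at the value of the convexified relaxation, whereas the nonlinear penalty prices out the infeasible integer points one by one, which is the mechanism behind the paper's finite-penalty result.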

Journal ArticleDOI
TL;DR: In this article, a cutting-plane approach was proposed to solve a class of capacitated multi-period facility location problems, where the primal block-angular structure of the resulting linear optimization problems is exploited by the interior-point method, allowing the efficient solution of large instances.
Abstract: We propose a cutting-plane approach (namely, Benders decomposition) for a class of capacitated multi-period facility location problems. The novelty of this approach lies in the use of a specialized interior-point method for solving the Benders subproblems. The primal block-angular structure of the resulting linear optimization problems is exploited by the interior-point method, allowing the (either exact or inexact) efficient solution of large instances. The consequences of different modeling conditions and problem specifications on the computational performance are also investigated both theoretically and empirically, providing a deeper understanding of the significant factors influencing the overall efficiency of the cutting-plane method. The methodology proposed allowed the solution of instances of up to 200 potential locations, one million customers and three periods, resulting in mixed integer linear optimization problems of up to 600 binary variables and 600 million continuous variables. Those problems were solved by the specialized approach in less than an hour and a half, outperforming other state-of-the-art methods, which exhausted the (144 GB of) available memory in the largest instances.

Journal ArticleDOI
TL;DR: This paper studies the connected subgraph polytope of a graph, which is the convex hull of subsets of vertices that induce a connected sub graph.
Abstract: In many network applications, one searches for a connected subset of vertices that exhibits other desirable properties. To this end, this paper studies the connected subgraph polytope of a graph, which is the convex hull of subsets of vertices that induce a connected subgraph. Much of our work is devoted to the study of two nontrivial classes of valid inequalities. The first are the a, b-separator inequalities, which have been successfully used to enforce connectivity in previous computational studies. The second are the indegree inequalities, which have previously been shown to induce all nontrivial facets for trees. We determine the precise conditions under which these inequalities induce facets and when each class fully describes the connected subgraph polytope. Both classes of inequalities can be separated in polynomial time and admit compact extended formulations. However, while the a, b-separator inequalities can be lifted in linear time, it is NP-hard to lift the indegree inequalities.
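The validity of an a, b-separator inequality is easy to verify exhaustively on a small graph (the 4-cycle example below is mine, not from the paper): for nonadjacent vertices a, b and a vertex set C whose deletion disconnects a from b, every connected induced subgraph satisfies $$x_a + x_b - \sum_{c \in C} x_c \le 1$$.

```python
from itertools import combinations

# 4-cycle 0-1-2-3-0.  Vertices a=0 and b=2 are nonadjacent, and
# C = {1, 3} is an a,b-separator: deleting it disconnects 0 from 2.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

def connected(S):
    """Is the subgraph induced by vertex set S connected? (Empty set counts.)"""
    S = set(S)
    if not S:
        return True
    seen, stack = set(), [next(iter(S))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adj[v] & S)   # only traverse edges inside S
    return seen == S

a, b, C = 0, 2, {1, 3}
for k in range(5):
    for S in combinations(range(4), k):
        if connected(S):
            # a,b-separator inequality: x_a + x_b - sum_{c in C} x_c <= 1.
            lhs = (a in S) + (b in S) - sum(c in S for c in C)
            assert lhs <= 1, (S, lhs)
print("a,b-separator inequality holds for every connected induced subgraph")
```

Any connected subgraph containing both 0 and 2 must pass through 1 or 3, so the separator variables compensate for the two endpoints; subsets like {0, 2} violate the bound but are excluded because they induce a disconnected subgraph.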

Journal ArticleDOI
TL;DR: The usefulness of the reduced semidefinite programming bounds in estimating the expected range of random variables with two applications arising in random walks and best–worst choice models is illustrated.
Abstract: We show that the complexity of computing the second order moment bound on the expected optimal value of a mixed integer linear program with a random objective coefficient vector is closely related to the complexity of characterizing the convex hull of the points $$\left\{ \binom{1}{\varvec{x}} \binom{1}{\varvec{x}}^{\top } \;\middle|\; \varvec{x} \in \mathcal {X} \right\}$$ where $$\mathcal {X}$$ is the feasible region. In fact, we can replace the completely positive programming formulation for the moment bound on $$\mathcal {X}$$, with an associated semidefinite program, provided we have a linear or a semidefinite representation of this convex hull. As an application of the result, we identify a new polynomial time solvable semidefinite relaxation of the distributionally robust multi-item newsvendor problem by exploiting results from the Boolean quadric polytope. For $$\mathcal {X}$$ described explicitly by a finite set of points, our formulation leads to a reduction in the size of the semidefinite program. We illustrate the usefulness of the reduced semidefinite programming bounds in estimating the expected range of random variables with two applications arising in random walks and best–worst choice models.

Journal ArticleDOI
TL;DR: Any local solution (stationary point) is a sparse estimator, under some conditions on the parameters of the folded concave penalties, and under the restricted eigenvalue condition, any S$$^3$$ONC solution with a better objective value than the Lasso solution entails the strong oracle property.
Abstract: This paper concerns the folded concave penalized sparse linear regression (FCPSLR), a class of popular sparse recovery methods. Although FCPSLR yields desirable recovery performance when solved globally, computing a global solution is NP-complete. Despite some existing statistical performance analyses on local minimizers or on specific FCPSLR-based learning algorithms, it remains an open question whether local solutions that are known to admit fully polynomial-time approximation schemes (FPTAS) may already be sufficient to ensure the statistical performance, and whether that statistical performance can be non-contingent on the specific designs of computing procedures. To address these questions, this paper presents the following threefold results: (i) Any local solution (stationary point) is a sparse estimator, under some conditions on the parameters of the folded concave penalties. (ii) Perhaps more importantly, any local solution satisfying a significant subspace second-order necessary condition (S$$^3$$ONC), which is weaker than the second-order KKT condition, yields a bounded error in approximating the true parameter with high probability. In addition, if the minimal signal strength is sufficient, the S$$^3$$ONC solution likely recovers the oracle solution. This result also explicates that the goal of improving the statistical performance is consistent with the optimization criteria of minimizing the suboptimality gap in solving the non-convex programming formulation of FCPSLR. (iii) We apply (ii) to the special case of FCPSLR with the minimax concave penalty (MCP) and show that under the restricted eigenvalue condition, any S$$^3$$ONC solution with a better objective value than the Lasso solution entails the strong oracle property. In addition, such a solution generates a model error (ME) comparable to the optimal but exponential-time sparse estimator given a sufficient sample size, while the worst-case ME is comparable to the Lasso in general.
Furthermore, computing a solution that satisfies the S$$^3$$ONC admits an FPTAS.
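The minimax concave penalty referenced in the abstract has a simple closed form (the standard definition of Zhang, 2010; the sketch below and its parameter values are illustrative): for concavity parameter $$\gamma > 0$$, $$p_\lambda(t) = \lambda |t| - t^2/(2\gamma)$$ on $$|t| \le \gamma\lambda$$ and the constant $$\gamma\lambda^2/2$$ beyond.

```python
# Standard minimax concave penalty (MCP, Zhang 2010): quadratic taper up to
# the knot |t| = gamma*lam, then flat, so large coefficients are unpenalized
# at the margin -- the "folded concave" shape that removes the Lasso's bias.
def mcp(t, lam=1.0, gamma=3.0):
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return gamma * lam * lam / 2.0

lam, gamma = 1.0, 3.0
knot = gamma * lam
# The two pieces agree at the knot (both equal gamma*lam^2/2 = 1.5), so the
# penalty is continuous, and its slope lam - t/gamma vanishes there.
print(mcp(knot, lam, gamma), gamma * lam ** 2 / 2)   # 1.5 1.5
print(mcp(10.0, lam, gamma))                          # 1.5: saturated
```

Near the origin the penalty behaves like the Lasso's $$\lambda |t|$$ (enforcing sparsity), while beyond the knot it is constant, which is the mechanism behind the strong oracle property discussed in result (iii).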