
Showing papers in "Mathematical Programming in 2020"


Journal ArticleDOI
TL;DR: In this paper, the authors present a new class of decentralized primal-dual type algorithms, namely the decentralized communication sliding (DCS) methods, which can skip the inter-node communications while agents solve the primal subproblems iteratively through linearizations of their local objective functions.
Abstract: We present a new class of decentralized first-order methods for nonsmooth and stochastic optimization problems defined over multiagent networks. Considering that communication is a major bottleneck in decentralized optimization, our main goal in this paper is to develop algorithmic frameworks which can significantly reduce the number of inter-node communications. Our major contribution is to present a new class of decentralized primal–dual type algorithms, namely the decentralized communication sliding (DCS) methods, which can skip the inter-node communications while agents solve the primal subproblems iteratively through linearizations of their local objective functions. By employing DCS, agents can find an $$\epsilon $$-solution both in terms of functional optimality gap and feasibility residual in $${{\mathcal {O}}}(1/\epsilon )$$ (resp., $${{\mathcal {O}}}(1/\sqrt{\epsilon })$$) communication rounds for general convex functions (resp., strongly convex functions), while maintaining the $${{\mathcal {O}}}(1/\epsilon ^2)$$ (resp., $$\mathcal{O}(1/\epsilon )$$) bound on the total number of intra-node subgradient evaluations. We also present a stochastic counterpart for these algorithms, denoted by SDCS, for solving stochastic optimization problems whose objective function cannot be evaluated exactly. In comparison with existing results for decentralized nonsmooth and stochastic optimization, we can reduce the total number of inter-node communication rounds by orders of magnitude while still maintaining the optimal complexity bounds on intra-node stochastic subgradient evaluations. The bounds on the (stochastic) subgradient evaluations are actually comparable to those required for centralized nonsmooth and stochastic optimization under certain conditions on the target accuracy.

219 citations


Journal ArticleDOI
TL;DR: Chordal decomposition is employed to reformulate a large and sparse semidefinite program (SDP), either in primal or dual standard form, into an equivalent SDP with smaller positive semidefinite (PSD) constraints, enabling the development of efficient and scalable algorithms.
Abstract: We employ chordal decomposition to reformulate a large and sparse semidefinite program (SDP), either in primal or dual standard form, into an equivalent SDP with smaller positive semidefinite (PSD) constraints. In contrast to previous approaches, the decomposed SDP is suitable for the application of first-order operator-splitting methods, enabling the development of efficient and scalable algorithms. In particular, we apply the alternating direction method of multipliers (ADMM) to solve decomposed primal- and dual-standard-form SDPs. Each iteration of such ADMM algorithms requires a projection onto an affine subspace, and a set of projections onto small PSD cones that can be computed in parallel. We also formulate the homogeneous self-dual embedding (HSDE) of a primal-dual pair of decomposed SDPs, and extend a recent ADMM-based algorithm to exploit the structure of our HSDE. The resulting HSDE algorithm has the same leading-order computational cost as those for the primal or dual problems only, with the advantage of being able to identify infeasible problems and produce an infeasibility certificate. All algorithms are implemented in the open-source MATLAB solver CDCS. Numerical experiments on a range of large-scale SDPs demonstrate the computational advantages of the proposed methods compared to common state-of-the-art solvers.
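
As a rough illustration of the parallel projection step described above (a minimal numpy sketch, not code from CDCS; the helper below is our own), each ADMM iteration projects every clique block onto a small PSD cone via an eigendecomposition:

```python
import numpy as np

def project_psd(S):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone."""
    S = (S + S.T) / 2              # symmetrize against round-off
    w, V = np.linalg.eigh(S)       # spectral decomposition
    w = np.clip(w, 0.0, None)      # zero out negative eigenvalues
    return (V * w) @ V.T

# The clique blocks are independent, so these projections can run in parallel.
blocks = [np.random.randn(5, 5) for _ in range(4)]
projected = [project_psd(B) for B in blocks]
```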

93 citations


Journal ArticleDOI
TL;DR: On the basis of a regularization technique using the Moreau envelope, a class of first-order algorithms with inertial features, involving both viscous and Hessian-driven dampings, is extended to non-smooth convex functions with extended real values.
Abstract: In a Hilbert space setting, for convex optimization, we analyze the convergence rate of a class of first-order algorithms involving inertial features. They can be interpreted as discrete time versions of inertial dynamics involving both viscous and Hessian-driven dampings. The geometrical damping driven by the Hessian intervenes in the dynamics in the form $$\nabla^2 f(x(t))\,\dot{x}(t)$$. By treating this term as the time derivative of $$\nabla f(x(t))$$, this gives, in discretized form, first-order algorithms in time and space. In addition to the convergence properties attached to Nesterov-type accelerated gradient methods, the algorithms thus obtained are new and show a rapid convergence towards zero of the gradients. On the basis of a regularization technique using the Moreau envelope, we extend these methods to non-smooth convex functions with extended real values. The introduction of time scale factors makes it possible to further accelerate these algorithms. We also report numerical results on structured problems to support our theoretical findings.
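
As a hedged illustration of the discretization idea (a generic template in our own notation; the paper's exact parameter choices and correction terms differ), replacing $$\nabla^2 f(x(t))\,\dot{x}(t)$$ by a difference of consecutive gradients yields an update that uses only first-order information:

$$
y_k = x_k + \alpha_k (x_k - x_{k-1}) - \beta \big( \nabla f(x_k) - \nabla f(x_{k-1}) \big), \qquad x_{k+1} = y_k - s\, \nabla f(y_k).
$$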

77 citations


Journal ArticleDOI
TL;DR: This work proposes a BCP solver for a generic model that encompasses a wide class of VRPs and incorporates the key elements found in the best existing VRP algorithms: ng-path relaxation, rank-1 cuts with limited memory, path enumeration, and rounded capacity cuts; all generalized through the new concepts of “packing set” and “elementarity set”.
Abstract: Major advances were recently obtained in the exact solution of vehicle routing problems (VRPs). Sophisticated branch-cut-and-price (BCP) algorithms for some of the most classical VRP variants now solve many instances with up to a few hundreds of customers. However, adapting and reimplementing those successful algorithms for other variants can be a very demanding task. This work proposes a BCP solver for a generic model that encompasses a wide class of VRPs. It incorporates the key elements found in the best existing VRP algorithms: ng-path relaxation, rank-1 cuts with limited memory, path enumeration, and rounded capacity cuts; all generalized through the new concepts of “packing set” and “elementarity set”. The concepts are also used to derive a branching rule based on accumulated resource consumption and to generalize the Ryan and Foster branching rule. Extensive experiments on several variants show that the generic solver has an excellent overall performance, in many problems being better than the best specific algorithms. Even some non-VRPs, like bin packing, vector packing and generalized assignment, can be modeled and effectively solved.

64 citations


Journal ArticleDOI
TL;DR: It is shown that the PDHG algorithm can be viewed as a special case of the Douglas–Rachford splitting algorithm for minimizing the sum of two convex functions.
Abstract: The primal-dual hybrid gradient (PDHG) algorithm proposed by Esser, Zhang, and Chan, and by Pock, Cremers, Bischof, and Chambolle is known to include as a special case the Douglas–Rachford splitting algorithm for minimizing the sum of two convex functions. We show that, conversely, the PDHG algorithm can be viewed as a special case of the Douglas–Rachford splitting algorithm.
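
For reference, a minimal sketch of the PDHG iteration for $$\min_x f(x) + g(Kx)$$ (a generic template, assuming the proximal operators prox_f of $$f$$ and prox_gconj of the conjugate $$g^*$$ are supplied by the caller as prox(v, t)):

```python
import numpy as np

def pdhg(K, prox_f, prox_gconj, x, y, tau, sigma, iters=1000):
    """Generic PDHG sketch for min_x f(x) + g(Kx)."""
    for _ in range(iters):
        x_new = prox_f(x - tau * (K.T @ y), tau)                  # primal step
        y = prox_gconj(y + sigma * (K @ (2 * x_new - x)), sigma)  # dual step with extrapolation
        x = x_new
    return x, y
```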

61 citations


Journal ArticleDOI
TL;DR: This paper shows that for a sequence of over-relaxation parameters, that do not satisfy Nesterov’s rule, one can still expect some relatively fast convergence properties for the objective function.
Abstract: In this paper we study the convergence of an inertial forward–backward algorithm with a particular choice of over-relaxation term. In particular, we show that for a sequence of over-relaxation parameters that do not satisfy Nesterov’s rule, one can still expect some relatively fast convergence properties for the objective function. In addition, we complement this work by studying the convergence of the algorithm in the case where the proximal operator is computed inexactly, and we give sufficient conditions on these errors in order to obtain convergence properties for the objective function.
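
A minimal sketch of the generic inertial forward–backward template for $$\min_x f(x) + g(x)$$ with smooth $$f$$ (our own schematic; the over-relaxation sequence rho is deliberately left free, which is exactly the parameter the paper's analysis concerns):

```python
def inertial_fb(grad_f, prox_g, x0, step, rho, iters=500):
    """Generic inertial forward-backward sketch; rho(k) is the
    over-relaxation (inertial) parameter at iteration k, and
    prox_g(v, t) evaluates the proximal operator of t*g at v."""
    x_prev, x = x0, x0
    for k in range(iters):
        y = x + rho(k) * (x - x_prev)                      # inertial extrapolation
        x_prev, x = x, prox_g(y - step * grad_f(y), step)  # forward-backward step
    return x
```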

60 citations


Journal ArticleDOI
TL;DR: For the random over-complete tensor decomposition problem, the authors show that all local optima are (approximately) global optima: for any small constant $$\varepsilon > 0$$, among the set of points with function values a $$(1+\varepsilon)$$-factor larger than the expectation of the function, all local maxima are approximate global maxima.
Abstract: Non-convex optimization with local search heuristics has been widely used in machine learning, achieving many state-of-the-art results. It becomes increasingly important to understand why they can work for these NP-hard problems on typical data. The landscape of many objective functions in learning has been conjectured to have the geometric property that “all local optima are (approximately) global optima”, and thus they can be solved efficiently by local search algorithms. However, establishing such a property can be very difficult. In this paper, we analyze the optimization landscape of the random over-complete tensor decomposition problem, which has many applications in unsupervised learning, especially in learning latent variable models. In practice, it can be efficiently solved by gradient ascent on a non-convex objective. We show that for any small constant $$\varepsilon > 0$$, among the set of points with function values a $$(1+\varepsilon)$$-factor larger than the expectation of the function, all the local maxima are approximate global maxima. Previously, the best-known result only characterizes the geometry in small neighborhoods around the true components. Our result implies that even with an initialization that is barely better than a random guess, the gradient ascent algorithm is guaranteed to solve this problem. However, achieving such an initialization with a random guess would still require a super-polynomial number of attempts. Our main technique uses the Kac–Rice formula and random matrix theory. To the best of our knowledge, this is the first time the Kac–Rice formula has been successfully applied to counting the number of local optima of a highly structured random polynomial with dependent coefficients.
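
As a hedged sketch of the gradient ascent procedure the result certifies (hypothetical code under our own conventions, assuming a symmetric 4th-order tensor T stored as a dense numpy array):

```python
import numpy as np

def tensor_ascent(T, x, step=0.05, iters=200):
    """Projected gradient ascent on the unit sphere for f(x) = T(x, x, x, x),
    assuming T is symmetric; an illustration, not the paper's code."""
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        g = 4 * np.einsum('ijkl,j,k,l->i', T, x, x, x)  # gradient of f at x
        x = x + step * g
        x /= np.linalg.norm(x)                          # retract to the sphere
    return x
```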

58 citations


Journal ArticleDOI
TL;DR: In this article, an iterative algorithm based on Newton's method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of the objective function, was proposed.
Abstract: We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of the objective function. The algorithm tracks Newton-conjugate gradient procedures developed in the 1980s closely, but includes enhancements that allow worst-case complexity results to be proved for convergence to points that satisfy approximate first-order and second-order optimality conditions. The complexity results match the best known results in the literature for second-order methods.
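
A simplified sketch of the negative-curvature test inside a Newton–conjugate-gradient loop (a schematic in our own notation, not the paper's exact safeguarded procedure):

```python
import numpy as np

def cg_with_curvature_check(H, g, eps=1e-6, iters=100):
    """Solve H d = -g by conjugate gradients, but return early with a
    direction of (near-)negative curvature if one is encountered."""
    d = np.zeros_like(g)
    r = -g.copy()          # residual of H d = -g at d = 0
    p = r.copy()
    for _ in range(iters):
        Hp = H @ p
        if p @ Hp < eps * (p @ p):          # curvature test on direction p
            return p, 'negative_curvature'
        alpha = (r @ r) / (p @ Hp)
        d = d + alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < 1e-8:
            return d, 'newton_step'
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d, 'newton_step'
```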

52 citations


Journal ArticleDOI
TL;DR: In this paper, an efficient augmented Lagrangian method for large-scale non-overlapping sparse group Lasso problems with each subproblem being solved by a superlinearly convergent inexact semismooth Newton method was developed.
Abstract: The sparse group Lasso is a widely used statistical model which encourages the sparsity both on a group and within the group level. In this paper, we develop an efficient augmented Lagrangian method for large-scale non-overlapping sparse group Lasso problems with each subproblem being solved by a superlinearly convergent inexact semismooth Newton method. Theoretically, we prove that, if the penalty parameter is chosen sufficiently large, the augmented Lagrangian method converges globally at an arbitrarily fast linear rate for the primal iterative sequence, the dual infeasibility, and the duality gap of the primal and dual objective functions. Computationally, we derive explicitly the generalized Jacobian of the proximal mapping associated with the sparse group Lasso regularizer and exploit fully the underlying second order sparsity through the semismooth Newton method. The efficiency and robustness of our proposed algorithm are demonstrated by numerical experiments on both the synthetic and real data sets.
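
For non-overlapping groups, the proximal mapping of the sparse group Lasso regularizer has a well-known closed form (elementwise soft-thresholding followed by group shrinkage); the sketch below is our own simplification for illustration, not the paper's generalized-Jacobian machinery:

```python
import numpy as np

def soft(v, t):
    """Elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sparse_group_lasso(v, groups, lam1, lam2):
    """Prox of lam1*||x||_1 + lam2*sum_g ||x_g||_2 for non-overlapping groups."""
    u = soft(v, lam1)                    # elementwise (within-group) sparsity
    out = np.zeros_like(v)
    for g in groups:                     # each g is a list of indices
        ng = np.linalg.norm(u[g])
        if ng > lam2:
            out[g] = (1.0 - lam2 / ng) * u[g]   # group-level shrinkage
    return out
```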

48 citations


Journal ArticleDOI
TL;DR: An alternative to EM grounded in the Riemannian geometry of positive definite matrices is proposed, and a non-asymptotic convergence analysis for the stochastic method is provided, which is also the first (to the authors' knowledge) such global analysis for Riemannian stochastic gradient.
Abstract: We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
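
A hedged sketch of one Riemannian gradient step on the SPD manifold with the affine-invariant metric (schematic with our own helper names; the paper's reformulation and its batch/stochastic algorithms contain further ingredients):

```python
import numpy as np
from scipy.linalg import expm, fractional_matrix_power

def spd_gradient_step(S, egrad, lr):
    """One Riemannian gradient-descent step on the SPD manifold;
    egrad is the Euclidean gradient of the objective at S."""
    egrad = (egrad + egrad.T) / 2
    rgrad = S @ egrad @ S            # affine-invariant Riemannian gradient
    S_half = fractional_matrix_power(S, 0.5)
    S_nhalf = fractional_matrix_power(S, -0.5)
    # exponential map: Exp_S(xi) = S^{1/2} expm(S^{-1/2} xi S^{-1/2}) S^{1/2}
    return S_half @ expm(-lr * (S_nhalf @ rgrad @ S_nhalf)) @ S_half
```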

47 citations


Journal ArticleDOI
TL;DR: It is shown that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.
Abstract: Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate both the theoretical and practical performance improvement of first-order methods (FOMs) for solving extensive-form games through better design of the dilated entropy function—a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has only a logarithmic dependence on the branching factor of the player. This result improves the overall convergence rate of several FOMs working with dilated entropy function by a factor of $$\Omega (b^dd)$$, where b is the branching factor of the player, and d is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than FOMs despite their theoretically inferior convergence rates. Using our new weighting scheme and a practical parameter tuning procedure we show that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.

Journal ArticleDOI
TL;DR: An augmented Lagrangian algorithm is proposed that generates these types of sequences and new constraint qualifications are proposed, weaker than previously considered ones, which are sufficient for the global convergence of the algorithm to a stationary point.
Abstract: Sequential optimality conditions have played a major role in unifying and extending global convergence results for several classes of algorithms for general nonlinear optimization. In this paper, we extend these concepts to nonlinear semidefinite programming. We define two sequential optimality conditions for nonlinear semidefinite programming. The first is a natural extension of the so-called Approximate-Karush–Kuhn–Tucker (AKKT) condition, well known in nonlinear optimization. The second one, called Trace-AKKT, is more natural in the context of semidefinite programming as the computation of eigenvalues is avoided. We propose an augmented Lagrangian algorithm that generates these types of sequences, and new constraint qualifications are proposed, weaker than previously considered ones, which are sufficient for the global convergence of the algorithm to a stationary point.
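
For orientation, the classical AKKT condition in nonlinear programming (here for inequality constraints $$g_i(x) \le 0$$, in our notation) that the paper extends requires sequences $$x^k \rightarrow x^*$$ and multipliers $$\lambda^k \ge 0$$ such that

$$
\nabla f(x^k) + \sum_i \lambda_i^k \nabla g_i(x^k) \rightarrow 0, \qquad \min\{ -g_i(x^k),\, \lambda_i^k \} \rightarrow 0 \ \ \text{for all } i.
$$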

Journal ArticleDOI
TL;DR: In this paper, the authors study a class of quadratically constrained quadratic programs (QCQPs), called diagonal QCQPs, which contain no off-diagonal terms and provide a sufficient condition on the problem data guaranteeing that the basic Shor semidefinite relaxation is exact.
Abstract: We study a class of quadratically constrained quadratic programs (QCQPs), called diagonal QCQPs, which contain no off-diagonal terms $$x_j x_k$$ for $$j \ne k$$, and we provide a sufficient condition on the problem data guaranteeing that the basic Shor semidefinite relaxation is exact. Our condition complements and refines those already present in the literature and can be checked in polynomial time. We then extend our analysis from diagonal QCQPs to general QCQPs, i.e., ones with no particular structure. By reformulating a general QCQP into diagonal form, we establish new, polynomial-time-checkable sufficient conditions for the semidefinite relaxations of general QCQPs to be exact. Finally, these ideas are extended to show that a class of random general QCQPs has exact semidefinite relaxations with high probability as long as the number of constraints grows no faster than a fixed polynomial in the number of variables. To the best of our knowledge, this is the first result establishing the exactness of the semidefinite relaxation for random general QCQPs.
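
Schematically, in our own notation, a diagonal QCQP and its Shor relaxation take the form (with each $$D_i$$ diagonal, so that only $$\operatorname{diag}(X)$$ enters):

$$
\min_{x}\ x^{\top} D_0 x + 2 c_0^{\top} x \ \ \text{s.t.}\ \ x^{\top} D_i x + 2 c_i^{\top} x \le b_i
\quad\leadsto\quad
\min_{x,\,X}\ \langle D_0, X\rangle + 2 c_0^{\top} x \ \ \text{s.t.}\ \ \langle D_i, X\rangle + 2 c_i^{\top} x \le b_i,\ \ \begin{pmatrix} 1 & x^{\top} \\ x & X \end{pmatrix} \succeq 0,
$$

and exactness means the relaxation's optimal value equals that of the original QCQP.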

Journal ArticleDOI
TL;DR: In this paper, the authors study risk sharing problems with quantile-based risk measures and heterogeneous beliefs, motivated by the use of internal models in finance and insurance, and obtain explicit forms of Pareto-optimal allocations and competitive equilibria by solving various optimization problems.
Abstract: We study risk sharing problems with quantile-based risk measures and heterogeneous beliefs, motivated by the use of internal models in finance and insurance. Explicit forms of Pareto-optimal allocations and competitive equilibria are obtained by solving various optimization problems. For Expected Shortfall (ES) agents, Pareto-optimal allocations are shown to be equivalent to equilibrium allocations, and the equilibrium pricing measure is unique. For Value-at-Risk (VaR) agents or mixed VaR and ES agents, a competitive equilibrium does not exist. Our results generalize existing ones on risk sharing problems with risk measures and belief homogeneity, and draw an interesting connection to early work on optimization properties of ES and VaR.
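
For reference, the two quantile-based risk measures involved are standardly defined, for a loss $$X$$ and level $$\alpha \in (0,1)$$, by

$$
\mathrm{VaR}_{\alpha}(X) = \inf\{ x \in \mathbb{R} : \mathbb{P}(X \le x) \ge \alpha \}, \qquad
\mathrm{ES}_{\alpha}(X) = \frac{1}{1-\alpha} \int_{\alpha}^{1} \mathrm{VaR}_{u}(X)\, du.
$$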

Journal ArticleDOI
TL;DR: A highly efficient augmented Lagrangian method (ALM) is designed for solving a class of convex quadratic programming (QP) problems constrained by the Birkhoff polytope and is demonstrated to be much more efficient than Gurobi in solving a collection of QP problems arising from the relaxation of quadratic assignment problems.
Abstract: We derive an explicit formula, as well as an efficient procedure, for constructing a generalized Jacobian for the projector of a given square matrix onto the Birkhoff polytope, i.e., the set of doubly stochastic matrices. To guarantee the high efficiency of our procedure, a semismooth Newton method for solving the dual of the projection problem is proposed and efficiently implemented. Extensive numerical experiments are presented to demonstrate the merits and effectiveness of our method by comparing its performance against other powerful solvers such as the commercial software Gurobi and the academic code PPROJ (Hager and Zhang in SIAM J Optim 26:1773–1798, 2016). In particular, our algorithm is able to solve the projection problem with over one billion variables and nonnegative constraints to a very high accuracy in less than 15 min on a modest desktop computer. More importantly, based on our efficient computation of the projections and their generalized Jacobians, we can design a highly efficient augmented Lagrangian method (ALM) for solving a class of convex quadratic programming (QP) problems constrained by the Birkhoff polytope. The resulting ALM is demonstrated to be much more efficient than Gurobi in solving a collection of QP problems arising from the relaxation of quadratic assignment problems.
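
In our notation, the projection problem at the core of the paper is the convex QP

$$
\min_{X \in \mathbb{R}^{n \times n}} \ \tfrac{1}{2} \| X - G \|_F^2 \quad \text{s.t.} \quad X \mathbf{1} = \mathbf{1}, \ \ X^{\top} \mathbf{1} = \mathbf{1}, \ \ X \ge 0,
$$

whose dual is the problem to which the semismooth Newton method is applied.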

Journal ArticleDOI
TL;DR: This paper introduces suitable notions of weak, Mordukhovich-, and strong stationarity for mathematical programs with switching constraints and presents some associated constraint qualifications, and applies these results to optimization problems with either-or-constraints.
Abstract: In optimal control, switching structures demanding at most one control to be active at any time instance appear frequently. Discretizing such problems, a so-called mathematical program with switching constraints is obtained. Although these problems are related to other types of disjunctive programs like optimization problems with complementarity or vanishing constraints, their inherent structure makes a separate consideration necessary. Since standard constraint qualifications are likely to fail at the feasible points of switching-constrained optimization problems, stationarity notions which are weaker than the associated Karush–Kuhn–Tucker conditions need to be investigated in order to find applicable necessary optimality conditions. Furthermore, appropriately tailored constraint qualifications need to be formulated. In this paper, we introduce suitable notions of weak, Mordukhovich-, and strong stationarity for mathematical programs with switching constraints and present some associated constraint qualifications. Our findings are exploited to state necessary optimality conditions for (discretized) optimal control problems with switching constraints. Furthermore, we apply our results to optimization problems with either-or-constraints. First, a novel reformulation of such problems using switching constraints is presented. Second, the derived surrogate problem is exploited to obtain necessary optimality conditions for the original program.
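
Schematically, a mathematical program with switching constraints takes the form (notation ours):

$$
\min_{x}\ f(x) \quad \text{s.t.} \quad g(x) \le 0,\ \ h(x) = 0,\ \ G_i(x)\, H_i(x) = 0 \ \ (i = 1, \dots, m),
$$

so that at each feasible point at most one of $$G_i(x)$$ and $$H_i(x)$$ can be nonzero.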

Journal ArticleDOI
TL;DR: This analysis applies to simplices, balls and convex bodies that locally look like a ball, while also allowing for a broader class of reference measures, including the Lebesgue measure.
Abstract: We consider the problem of computing the minimum value $$f_{\min,K}$$ of a polynomial $$f$$ over a compact set $$K \subseteq \mathbb{R}^n$$, which can be reformulated as finding a probability measure $$\nu$$ on $$K$$ minimizing $$\int_K f \, d\nu$$. Lasserre showed that it suffices to consider such measures of the form $$\nu = q\mu$$, where $$q$$ is a sum-of-squares polynomial and $$\mu$$ is a given Borel measure supported on $$K$$. By bounding the degree of $$q$$ by $$2r$$ one gets a converging hierarchy of upper bounds $$f^{(r)}$$ for $$f_{\min,K}$$. When $$K$$ is the hypercube $$[-1,1]^n$$, equipped with the Chebyshev measure, the parameters $$f^{(r)}$$ are known to converge to $$f_{\min,K}$$ at a rate in $$O(1/r^2)$$. We extend this error estimate to a wider class of convex bodies, while also allowing for a broader class of reference measures, including the Lebesgue measure. Our analysis applies to simplices, balls and convex bodies that locally look like a ball. In addition, we show an error estimate in $$O(\log r / r)$$ when $$K$$ satisfies a minor geometrical condition, and in $$O(\log^2 r / r^2)$$ when $$K$$ is a convex body, equipped with the Lebesgue measure. This improves upon the currently best known error estimates in $$O(1/\sqrt{r})$$ and $$O(1/r)$$ for these two respective cases.
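
In symbols, the Lasserre upper bound at level $$r$$ described above is

$$
f^{(r)} = \min_{q}\ \int_{K} f\, q \, d\mu \quad \text{s.t.} \quad \int_{K} q \, d\mu = 1, \ \ q \ \text{a sum of squares}, \ \ \deg q \le 2r,
$$

which is a semidefinite program since $$q$$ ranges over sum-of-squares polynomials of bounded degree.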

Journal ArticleDOI
TL;DR: In this article, the authors present necessary conditions for monotonicity of fixed point iterations of mappings that may violate the usual nonexpansive property, and specialize these results to the alternating projections iteration where the metric subregularity property takes on a distinct geometric characterization of sets at points of intersection called subtransversality.
Abstract: We present necessary conditions for monotonicity of fixed point iterations of mappings that may violate the usual nonexpansive property. Notions of linear-type monotonicity of fixed point sequences—weaker than Fejer monotonicity—are shown to imply metric subregularity. This, together with the almost averaging property recently introduced by Luke et al. (Math Oper Res, 2018. https://doi.org/10.1287/moor.2017.0898), guarantees linear convergence of the sequence to a fixed point. We specialize these results to the alternating projections iteration where the metric subregularity property takes on a distinct geometric characterization of sets at points of intersection called subtransversality. Subtransversality is shown to be necessary for linear convergence of alternating projections for consistent feasibility.

Journal ArticleDOI
TL;DR: The convergence rate of a hierarchy of upper bounds for polynomial minimization problems, proposed by Lasserre (SIAM J Optim 21(3):864-885, 2011), for the special case when the feasible set is the unit (hyper)sphere was studied in this article.
Abstract: We study the convergence rate of a hierarchy of upper bounds for polynomial minimization problems, proposed by Lasserre (SIAM J Optim 21(3):864–885, 2011), for the special case when the feasible set is the unit (hyper)sphere. The upper bound at level $$r \in \mathbb{N}$$ of the hierarchy is defined as the minimal expected value of the polynomial over all probability distributions on the sphere, when the probability density function is a sum-of-squares polynomial of degree at most $$2r$$ with respect to the surface measure. We show that the rate of convergence is $$O(1/r^2)$$ and we give a class of polynomials of any positive degree for which this rate is tight. In addition, we explore the implications for the related rate of convergence for the generalized problem of moments on the sphere.

Journal ArticleDOI
TL;DR: A classic result of Cook et al. (Math. Program. 34:251–264, 1986) bounds the distance between optimal solutions of mixed-integer linear programs and of their linear relaxations; here this distance is shown to be bounded in terms of the number of integer variables and a parameter Δ, which quantifies sub-determinants of the underlying linear inequalities.
Abstract: A classic result of Cook et al. (Math. Program. 34:251–264, 1986) bounds the distances between optimal solutions of mixed-integer linear programs and optimal solutions of the corresponding linear relaxations. Their bound is given in terms of the number of variables and a parameter $$ \varDelta $$, which quantifies sub-determinants of the underlying linear inequalities. We show that this distance can be bounded in terms of $$ \varDelta $$ and the number of integer variables rather than the total number of variables. To this end, we make use of a result by Olson (J. Number Theory 1:8–10, 1969) in additive combinatorics and demonstrate how it implies feasibility of certain mixed-integer linear programs. We conjecture that our bound can be improved to a function that only depends on $$ \varDelta $$, in general.
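
For context, the classical bound is, in the pure-integer case, roughly of the form (our paraphrase)

$$
\| x^{*}_{\mathrm{LP}} - z^{*} \|_{\infty} \ \le \ n \, \Delta,
$$

with $$n$$ the total number of variables; the paper's contribution is to replace the total number of variables by the number of integer variables.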

Journal ArticleDOI
TL;DR: This work presents an improved proximity condition under which the Peng–Wei relaxation of k-means recovers the underlying clusters exactly; this condition improves upon Kumar and Kannan's and is comparable to that of Awasthi and Sheffet.
Abstract: Given a set of data, one central goal is to group them into clusters based on some notion of similarity between the individual objects. One of the most popular and widely-used approaches is k-means despite the computational hardness to find its global minimum. We study and compare the properties of different convex relaxations by relating them to corresponding proximity conditions, an idea originally introduced by Kumar and Kannan. Using conic duality theory, we present an improved proximity condition under which the Peng–Wei relaxation of k-means recovers the underlying clusters exactly. Our proximity condition improves upon that of Kumar and Kannan and is comparable to that of Awasthi and Sheffet, where proximity conditions are established for projective k-means. In addition, we provide a necessary proximity condition for the exactness of the Peng–Wei relaxation. For the special case of equal cluster sizes, we establish a different and completely localized proximity condition under which the Amini–Levina relaxation yields exact clustering, thereby having addressed an open problem by Awasthi and Sheffet in the balanced case. Our framework is not only deterministic and model-free but also comes with a clear geometric meaning which allows for further analysis and generalization. Moreover, it can be conveniently applied to analyzing various data generative models such as the stochastic ball models and Gaussian mixture models. With this method, we improve the current minimum separation bound for the stochastic ball models and achieve the state-of-the-art results of learning Gaussian mixture models.
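
For reference, the Peng–Wei relaxation referred to above is, in a common formulation (with $$D_{ij} = \|x_i - x_j\|^2$$ the squared-distance matrix of the $$N$$ data points and $$k$$ the number of clusters):

$$
\min_{Z \in \mathbb{R}^{N \times N}} \ \langle D, Z \rangle \quad \text{s.t.} \quad Z \succeq 0, \ \ Z \ge 0, \ \ Z \mathbf{1} = \mathbf{1}, \ \ \operatorname{tr}(Z) = k.
$$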

Journal ArticleDOI
TL;DR: A pure cutting plane algorithm is constructed which is shown to converge if the initial relaxation is a polyhedron, and a theory of outer-product-free sets is developed, where S is the set of real, symmetric matrices of the form $$xx^T$$.
Abstract: This paper introduces cutting planes that involve minimal structural assumptions, enabling the generation of strong polyhedral relaxations for a broad class of problems. We consider valid inequalities for the set $$S\cap P$$ , where S is a closed set, and P is a polyhedron. Given an oracle that provides the distance from a point to S, we construct a pure cutting plane algorithm which is shown to converge if the initial relaxation is a polyhedron. These cuts are generated from convex forbidden zones, or S-free sets, derived from the oracle. We also consider the special case of polynomial optimization. Accordingly we develop a theory of outer-product-free sets, where S is the set of real, symmetric matrices of the form $$xx^T$$ . All maximal outer-product-free sets of full dimension are shown to be convex cones and we identify several families of such sets. These families are used to generate strengthened intersection cuts that can separate any infeasible extreme point of a linear programming relaxation efficiently. Computational experiments demonstrate the promise of our approach.

Journal ArticleDOI
TL;DR: In this paper, the authors study multistage distributionally robust mixed-integer programs under endogenous uncertainty, where the probability distribution of stage-wise uncertainty depends on the decisions made in previous stages.
Abstract: We study multistage distributionally robust mixed-integer programs under endogenous uncertainty, where the probability distribution of stage-wise uncertainty depends on the decisions made in previous stages. We first consider two ambiguity sets defined by decision-dependent bounds on the first and second moments of uncertain parameters and by mean and covariance matrix that exactly match decision-dependent empirical ones, respectively. For both sets, we show that the subproblem in each stage can be recast as a mixed-integer linear program (MILP). Moreover, we extend the general moment-based ambiguity set in Delage and Ye (Oper Res 58(3):595–612, 2010) to the multistage decision-dependent setting, and derive mixed-integer semidefinite programming (MISDP) reformulations of stage-wise subproblems. We develop methods for attaining lower and upper bounds of the optimal objective value of the multistage MISDPs, and approximate them using a series of MILPs. We deploy the Stochastic Dual Dynamic integer Programming (SDDiP) method for solving the problem under the three ambiguity sets with risk-neutral or risk-averse objective functions, and conduct numerical studies on multistage facility-location instances having diverse sizes under different parameter and uncertainty settings. Our results show that the SDDiP quickly finds optimal solutions for moderate-sized instances under the first two ambiguity sets, and also finds good approximate bounds for the multistage MISDPs derived under the third ambiguity set. We also demonstrate the efficacy of incorporating decision-dependent distributional ambiguity in multistage decision-making processes.
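
Schematically (in our own notation), the first type of decision-dependent ambiguity set bounds the moments of the uncertainty by functions of the earlier decisions $$x$$:

$$
\mathcal{P}(x) = \Big\{ P \ : \ \underline{\mu}(x) \le \mathbb{E}_{P}[\xi] \le \overline{\mu}(x), \ \ \mathbb{E}_{P}\big[\xi \xi^{\top}\big] \preceq \Sigma(x) \Big\}.
$$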

Journal ArticleDOI
TL;DR: It is shown how all solutions within a given tolerance of the optimal value can be efficiently and compactly represented in a weighted decision diagram; the authors conclude that postoptimality analysis based on sound-reduced diagrams has the potential to extract significantly more useful information from an integer programming model than was previously feasible.
Abstract: It is often useful in practice to explore near-optimal solutions of an integer programming problem. We show how all solutions within a given tolerance of the optimal value can be efficiently and compactly represented in a weighted decision diagram. The structure of the diagram facilitates rapid processing of a wide range of queries about the near-optimal solution space, as well as reoptimization after changes in the objective function. We also exploit the paradoxical fact that the diagram can be reduced in size if it is allowed to represent additional solutions. We show that a “sound reduction” operation, applied repeatedly, yields the smallest such diagram that is suitable for postoptimality analysis, and one that is typically far smaller than a tree that represents the same set of near-optimal solutions. We conclude that postoptimality analysis based on sound-reduced diagrams has the potential to extract significantly more useful information from an integer programming model than was previously feasible.

Journal ArticleDOI
TL;DR: In this paper, a 3-operator resolvent-splitting with provably minimal lifting is presented that directly generalizes DRS, where lifting roughly corresponds to enlarging the problem size; generalizing DRS to 3 operators without any lifting is shown to be impossible.
Abstract: Given the success of Douglas–Rachford splitting (DRS), it is natural to ask whether DRS can be generalized. Are there other 2 operator resolvent-splittings sharing the favorable properties of DRS? Can DRS be generalized to 3 operators? This work presents the answers: no and no. In a certain sense, DRS is the unique 2 operator resolvent-splitting, and generalizing DRS to 3 operators is impossible without lifting, where lifting roughly corresponds to enlarging the problem size. The impossibility result further raises a question. How much lifting is necessary to generalize DRS to 3 operators? This work presents the answer by providing a novel 3 operator resolvent-splitting with provably minimal lifting that directly generalizes DRS.
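
For reference, the classical two-operator DRS iteration that the paper takes as its starting point can be sketched as follows (with proximal maps standing in for the resolvents):

```python
def drs(prox_f, prox_g, z, alpha, iters=500):
    """Classical Douglas-Rachford splitting sketch for 0 in A(x) + B(x);
    prox_f, prox_g play the roles of the resolvents J_{alpha A}, J_{alpha B}."""
    for _ in range(iters):
        x = prox_f(z, alpha)
        y = prox_g(2 * x - z, alpha)
        z = z + y - x          # single governing sequence: no lifting needed
    return x
```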

Journal ArticleDOI
TL;DR: This work provides a general description of MIDAS, and proves its almost-sure convergence to a $$2T\varepsilon$$-optimal policy for problems with T stages when the Bellman functions are known to be monotonic, and the sampling process satisfies standard assumptions.
Abstract: Mixed integer dynamic approximation scheme (MIDAS) is a new sampling-based algorithm for solving finite-horizon stochastic dynamic programs with monotonic Bellman functions. MIDAS approximates these value functions using step functions, leading to stage problems that are mixed integer programs. We provide a general description of MIDAS, and prove its almost-sure convergence to a $$2T\varepsilon $$-optimal policy for problems with T stages when the Bellman functions are known to be monotonic, and the sampling process satisfies standard assumptions.

Journal ArticleDOI
TL;DR: An algorithm for minimax problems that arise in robust optimization in the absence of objective function derivatives that utilizes an extension of methods for inexact outer approximation in sampling a potentially infinite-cardinality uncertainty set is developed.
Abstract: We develop an algorithm for minimax problems that arise in robust optimization in the absence of objective function derivatives. The algorithm utilizes an extension of methods for inexact outer approximation in sampling a potentially infinite-cardinality uncertainty set. Clarke stationarity of the algorithm output is established alongside desirable features of the model-based trust-region subproblems encountered. We demonstrate the practical benefits of the algorithm on a new class of test problems.

Journal ArticleDOI
TL;DR: A single-exponential algorithm is developed for so-called combinatorial n-fold integer programs, which are remarkably similar to prior ILP formulations for various problems but, unlike them, also allow variable dimension.
Abstract: Many fundamental $$\mathsf {NP}$$ -hard problems can be formulated as integer linear programs (ILPs). A famous algorithm by Lenstra solves ILPs in time that is exponential only in the dimension of the program, and polynomial in the size of the ILP. That algorithm became a ubiquitous tool in the design of fixed-parameter algorithms for $$\mathsf {NP}$$ -hard problems, where one wishes to isolate the hardness of a problem by some parameter. However, in many cases using Lenstra’s algorithm has two drawbacks: First, the run time of the resulting algorithms is often double-exponential in the parameter, and second, an ILP formulation in small dimension cannot easily express problems involving many different costs. Inspired by the work of Hemmecke et al. (Math Program 137(1–2, Ser. A):325–341, 2013), we develop a single-exponential algorithm for so-called combinatorial n-fold integer programs, which are remarkably similar to prior ILP formulations for various problems, but unlike them, also allow variable dimension. We then apply our algorithm to many relevant problems like Closest String, Swap Bribery, Weighted Set Multicover, and several others, and obtain exponential speedups in the dependence on the respective parameters, the input size, or both. Unlike Lenstra’s algorithm, which is essentially a bounded search tree algorithm, our result uses the technique of augmenting steps. At its heart is a deep result stating that in combinatorial n-fold IPs, existence of an augmenting step implies existence of a “local” augmenting step, which can be found using dynamic programming. Our results provide an important insight into many problems by showing that they exhibit this phenomenon, and highlight the importance of augmentation techniques.
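
For orientation, an n-fold integer program has a constraint matrix of the block shape

$$
\begin{pmatrix}
A & A & \cdots & A \\
B & 0 & \cdots & 0 \\
0 & B & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & B
\end{pmatrix},
$$

with n copies of the blocks A and B; the combinatorial case studied here places additional structure on these blocks.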

Journal ArticleDOI
TL;DR: This work considers the fully dynamic bin packing problem, using a new dynamic rounding technique and new ideas to handle small items in a dynamic setting such that no amortization is needed.
Abstract: We consider the fully dynamic bin packing problem, where items arrive and depart in an online fashion and repacking of previously packed items is allowed. The goal is, of course, to minimize both the number of bins used as well as the amount of repacking. A recently introduced way of measuring the repacking costs at each timestep is the migration factor, defined as the total size of repacked items divided by the size of an arriving or departing item. Concerning the trade-off between number of bins and migration factor, if we wish to achieve an asymptotic competitive ratio of $$1 + \epsilon $$ for the number of bins, a relatively simple argument proves a lower bound of $$\Omega ({1}/{\epsilon })$$ for the migration factor. We establish a nearly matching upper bound of $$O({1}/{\epsilon }^4 \log {1}/{\epsilon })$$ using a new dynamic rounding technique and new ideas to handle small items in a dynamic setting such that no amortization is needed. The running time of our algorithm is polynomial in the number of items n and in $${1}/{\epsilon }$$. The previous best trade-off was for an asymptotic competitive ratio of $${5}/{4}$$ for the bins (rather than $$1+\epsilon $$) and needed an amortized number of $$O(\log n)$$ repackings (while in our scheme the number of repackings is independent of n and non-amortized).

Journal ArticleDOI
TL;DR: This work derives effective lower bounds for the bilevel knapsack problem and presents an exact method that exploits the structure of the induced follower’s problem, which strongly outperforms the current state-of-the-art algorithms designed for the problem.
Abstract: We consider the bilevel knapsack problem with interdiction constraints, an extension of the classic 0–1 knapsack problem formulated as a Stackelberg game with two agents, a leader and a follower, that choose items from a common set and hold their own private knapsacks. First, the leader selects some items to be interdicted for the follower while satisfying a capacity constraint. Then the follower packs a set of the remaining items according to his knapsack constraint in order to maximize the profits. The goal of the leader is to minimize the follower’s total profit. We derive effective lower bounds for the bilevel knapsack problem and present an exact method that exploits the structure of the induced follower’s problem. The approach strongly outperforms the current state-of-the-art algorithms designed for the problem. We extend the same algorithmic framework to the interval min–max regret knapsack problem after providing a novel bilevel programming reformulation. Also for this problem, the proposed approach outperforms the exact algorithms available in the literature.
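
In our notation, the problem reads as the min–max program

$$
\min_{x \in \{0,1\}^n : \ w^{\top} x \le C_L} \ \ \max_{y \in \{0,1\}^n : \ v^{\top} y \le C_F, \ y \le \mathbf{1} - x} \ p^{\top} y,
$$

where $$x$$ encodes the leader's interdicted items, $$C_L$$ and $$C_F$$ are the two knapsack capacities, and the constraint $$y \le \mathbf{1} - x$$ keeps the follower from packing interdicted items.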