
Showing papers in "Mathematical Programming in 2020"


Journal ArticleDOI
TL;DR: In this paper, the authors present a new class of decentralized primal-dual type algorithms, namely the decentralized communication sliding (DCS) methods, which can skip the inter-node communications while agents solve the primal subproblems iteratively through linearizations of their local objective functions.
Abstract: We present a new class of decentralized first-order methods for nonsmooth and stochastic optimization problems defined over multiagent networks. Considering that communication is a major bottleneck in decentralized optimization, our main goal in this paper is to develop algorithmic frameworks which can significantly reduce the number of inter-node communications. Our major contribution is to present a new class of decentralized primal–dual type algorithms, namely the decentralized communication sliding (DCS) methods, which can skip the inter-node communications while agents solve the primal subproblems iteratively through linearizations of their local objective functions. By employing DCS, agents can find an $$\epsilon $$-solution both in terms of functional optimality gap and feasibility residual in $${{\mathcal {O}}}(1/\epsilon )$$ (resp., $${{\mathcal {O}}}(1/\sqrt{\epsilon })$$) communication rounds for general convex functions (resp., strongly convex functions), while maintaining the $${{\mathcal {O}}}(1/\epsilon ^2)$$ (resp., $$\mathcal{O}(1/\epsilon )$$) bound on the total number of intra-node subgradient evaluations. We also present a stochastic counterpart for these algorithms, denoted by SDCS, for solving stochastic optimization problems whose objective function cannot be evaluated exactly. In comparison with existing results for decentralized nonsmooth and stochastic optimization, we can reduce the total number of inter-node communication rounds by orders of magnitude while still maintaining the optimal complexity bounds on intra-node stochastic subgradient evaluations. The bounds on the (stochastic) subgradient evaluations are actually comparable to those required for centralized nonsmooth and stochastic optimization under certain conditions on the target accuracy.

219 citations


Journal ArticleDOI
TL;DR: Chordal decomposition is employed to reformulate a large and sparse semidefinite program (SDP), either in primal or dual standard form, into an equivalent SDP with smaller positive semidefinite (PSD) constraints, enabling the development of efficient and scalable algorithms.
Abstract: We employ chordal decomposition to reformulate a large and sparse semidefinite program (SDP), either in primal or dual standard form, into an equivalent SDP with smaller positive semidefinite (PSD) constraints. In contrast to previous approaches, the decomposed SDP is suitable for the application of first-order operator-splitting methods, enabling the development of efficient and scalable algorithms. In particular, we apply the alternating direction method of multipliers (ADMM) to solve decomposed primal- and dual-standard-form SDPs. Each iteration of such ADMM algorithms requires a projection onto an affine subspace, and a set of projections onto small PSD cones that can be computed in parallel. We also formulate the homogeneous self-dual embedding (HSDE) of a primal-dual pair of decomposed SDPs, and extend a recent ADMM-based algorithm to exploit the structure of our HSDE. The resulting HSDE algorithm has the same leading-order computational cost as those for the primal or dual problems only, with the advantage of being able to identify infeasible problems and produce an infeasibility certificate. All algorithms are implemented in the open-source MATLAB solver CDCS. Numerical experiments on a range of large-scale SDPs demonstrate the computational advantages of the proposed methods compared to common state-of-the-art solvers.
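
As a rough illustration of the parallel projection step described above (a minimal numpy sketch, not code from CDCS; the helper below is our own), each ADMM iteration projects every clique block onto a small PSD cone via an eigendecomposition:

```python
import numpy as np

def project_psd(S):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone."""
    S = (S + S.T) / 2              # symmetrize against round-off
    w, V = np.linalg.eigh(S)       # spectral decomposition
    w = np.clip(w, 0.0, None)      # zero out negative eigenvalues
    return (V * w) @ V.T

# The clique blocks are independent, so these projections can run in parallel.
blocks = [np.random.randn(5, 5) for _ in range(4)]
projected = [project_psd(B) for B in blocks]
```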

93 citations


Journal ArticleDOI
TL;DR: On the basis of a regularization technique using the Moreau envelope, a class of first-order algorithms with inertial features, involving both viscous and Hessian-driven dampings, is extended to non-smooth convex functions with extended real values.
Abstract: In a Hilbert space setting, for convex optimization, we analyze the convergence rate of a class of first-order algorithms involving inertial features. They can be interpreted as discrete time versions of inertial dynamics involving both viscous and Hessian-driven dampings. The geometrical damping driven by the Hessian intervenes in the dynamics in the form $$\nabla^2 f(x(t))\,\dot{x}(t)$$. By treating this term as the time derivative of $$\nabla f(x(t))$$, this gives, in discretized form, first-order algorithms in time and space. In addition to the convergence properties attached to Nesterov-type accelerated gradient methods, the algorithms thus obtained are new and show a rapid convergence towards zero of the gradients. On the basis of a regularization technique using the Moreau envelope, we extend these methods to non-smooth convex functions with extended real values. The introduction of time scale factors makes it possible to further accelerate these algorithms. We also report numerical results on structured problems to support our theoretical findings.
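
As a hedged illustration of the discretization idea (a generic template in our own notation; the paper's exact parameter choices and correction terms differ), replacing $$\nabla^2 f(x(t))\,\dot{x}(t)$$ by a difference of consecutive gradients yields an update that uses only first-order information:

$$
y_k = x_k + \alpha_k (x_k - x_{k-1}) - \beta \big( \nabla f(x_k) - \nabla f(x_{k-1}) \big), \qquad x_{k+1} = y_k - s\, \nabla f(y_k).
$$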

77 citations


Journal ArticleDOI
TL;DR: This work proposes a BCP solver for a generic model that encompasses a wide class of VRPs and incorporates the key elements found in the best existing VRP algorithms: ng-path relaxation, rank-1 cuts with limited memory, path enumeration, and rounded capacity cuts; all generalized through the new concepts of “packing set” and “elementarity set”.
Abstract: Major advances were recently obtained in the exact solution of vehicle routing problems (VRPs). Sophisticated branch-cut-and-price (BCP) algorithms for some of the most classical VRP variants now solve many instances with up to a few hundreds of customers. However, adapting and reimplementing those successful algorithms for other variants can be a very demanding task. This work proposes a BCP solver for a generic model that encompasses a wide class of VRPs. It incorporates the key elements found in the best existing VRP algorithms: ng-path relaxation, rank-1 cuts with limited memory, path enumeration, and rounded capacity cuts; all generalized through the new concepts of “packing set” and “elementarity set”. The concepts are also used to derive a branching rule based on accumulated resource consumption and to generalize the Ryan and Foster branching rule. Extensive experiments on several variants show that the generic solver has an excellent overall performance, in many problems being better than the best specific algorithms. Even some non-VRPs, like bin packing, vector packing and generalized assignment, can be modeled and effectively solved.

64 citations


Journal ArticleDOI
TL;DR: It is shown that the PDHG algorithm can be viewed as a special case of the Douglas–Rachford splitting algorithm for minimizing the sum of two convex functions.
Abstract: The primal-dual hybrid gradient (PDHG) algorithm proposed by Esser, Zhang, and Chan, and by Pock, Cremers, Bischof, and Chambolle is known to include as a special case the Douglas–Rachford splitting algorithm for minimizing the sum of two convex functions. We show that, conversely, the PDHG algorithm can be viewed as a special case of the Douglas–Rachford splitting algorithm.
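
For reference, a minimal sketch of the PDHG iteration for $$\min_x f(x) + g(Kx)$$ (a generic template, assuming the proximal operators prox_f of $$f$$ and prox_gconj of the conjugate $$g^*$$ are supplied by the caller as prox(v, t)):

```python
import numpy as np

def pdhg(K, prox_f, prox_gconj, x, y, tau, sigma, iters=1000):
    """Generic PDHG sketch for min_x f(x) + g(Kx)."""
    for _ in range(iters):
        x_new = prox_f(x - tau * (K.T @ y), tau)                  # primal step
        y = prox_gconj(y + sigma * (K @ (2 * x_new - x)), sigma)  # dual step with extrapolation
        x = x_new
    return x, y
```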

61 citations


Journal ArticleDOI
TL;DR: This paper shows that for a sequence of over-relaxation parameters, that do not satisfy Nesterov’s rule, one can still expect some relatively fast convergence properties for the objective function.
Abstract: In this paper we study the convergence of an inertial forward–backward algorithm with a particular choice of over-relaxation term. In particular, we show that for a sequence of over-relaxation parameters that do not satisfy Nesterov’s rule, one can still expect some relatively fast convergence properties for the objective function. In addition, we complement this work by studying the convergence of the algorithm in the case where the proximal operator is computed inexactly, and we give sufficient conditions on these errors in order to obtain convergence properties for the objective function.
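
A minimal sketch of the generic inertial forward–backward template for $$\min_x f(x) + g(x)$$ with smooth $$f$$ (our own schematic; the over-relaxation sequence rho is deliberately left free, which is exactly the parameter the paper's analysis concerns):

```python
def inertial_fb(grad_f, prox_g, x0, step, rho, iters=500):
    """Generic inertial forward-backward sketch; rho(k) is the
    over-relaxation (inertial) parameter at iteration k, and
    prox_g(v, t) evaluates the proximal operator of t*g at v."""
    x_prev, x = x0, x0
    for k in range(iters):
        y = x + rho(k) * (x - x_prev)                      # inertial extrapolation
        x_prev, x = x, prox_g(y - step * grad_f(y), step)  # forward-backward step
    return x
```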

60 citations


Journal ArticleDOI
TL;DR: For the random over-complete tensor decomposition problem, the authors show that all local optima are (approximately) global optima: for any small constant $$\varepsilon > 0$$, among the set of points with function values a $$(1+\varepsilon)$$-factor larger than the expectation of the function, all local maxima are approximate global maxima.
Abstract: Non-convex optimization with local search heuristics has been widely used in machine learning, achieving many state-of-the-art results. It becomes increasingly important to understand why they can work for these NP-hard problems on typical data. The landscape of many objective functions in learning has been conjectured to have the geometric property that “all local optima are (approximately) global optima”, and thus they can be solved efficiently by local search algorithms. However, establishing such a property can be very difficult. In this paper, we analyze the optimization landscape of the random over-complete tensor decomposition problem, which has many applications in unsupervised learning, especially in learning latent variable models. In practice, it can be efficiently solved by gradient ascent on a non-convex objective. We show that for any small constant $$\varepsilon > 0$$, among the set of points with function values a $$(1+\varepsilon)$$-factor larger than the expectation of the function, all the local maxima are approximate global maxima. Previously, the best-known result only characterizes the geometry in small neighborhoods around the true components. Our result implies that even with an initialization that is barely better than a random guess, the gradient ascent algorithm is guaranteed to solve this problem. However, achieving such an initialization with a random guess would still require a super-polynomial number of attempts. Our main technique uses the Kac–Rice formula and random matrix theory. To the best of our knowledge, this is the first time the Kac–Rice formula has been successfully applied to counting the number of local optima of a highly structured random polynomial with dependent coefficients.
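
As a hedged sketch of the gradient ascent procedure the result certifies (hypothetical code under our own conventions, assuming a symmetric 4th-order tensor T stored as a dense numpy array):

```python
import numpy as np

def tensor_ascent(T, x, step=0.05, iters=200):
    """Projected gradient ascent on the unit sphere for f(x) = T(x, x, x, x),
    assuming T is symmetric; an illustration, not the paper's code."""
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        g = 4 * np.einsum('ijkl,j,k,l->i', T, x, x, x)  # gradient of f at x
        x = x + step * g
        x /= np.linalg.norm(x)                          # retract to the sphere
    return x
```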

58 citations


Journal ArticleDOI
TL;DR: In this article, an iterative algorithm based on Newton's method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of the objective function, was proposed.
Abstract: We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of the objective function. The algorithm tracks Newton-conjugate gradient procedures developed in the 1980s closely, but includes enhancements that allow worst-case complexity results to be proved for convergence to points that satisfy approximate first-order and second-order optimality conditions. The complexity results match the best known results in the literature for second-order methods.
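
A simplified sketch of the negative-curvature test inside a Newton–conjugate-gradient loop (a schematic in our own notation, not the paper's exact safeguarded procedure):

```python
import numpy as np

def cg_with_curvature_check(H, g, eps=1e-6, iters=100):
    """Solve H d = -g by conjugate gradients, but return early with a
    direction of (near-)negative curvature if one is encountered."""
    d = np.zeros_like(g)
    r = -g.copy()          # residual of H d = -g at d = 0
    p = r.copy()
    for _ in range(iters):
        Hp = H @ p
        if p @ Hp < eps * (p @ p):          # curvature test on direction p
            return p, 'negative_curvature'
        alpha = (r @ r) / (p @ Hp)
        d = d + alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < 1e-8:
            return d, 'newton_step'
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d, 'newton_step'
```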

52 citations


Journal ArticleDOI
TL;DR: In this paper, an efficient augmented Lagrangian method for large-scale non-overlapping sparse group Lasso problems with each subproblem being solved by a superlinearly convergent inexact semismooth Newton method was developed.
Abstract: The sparse group Lasso is a widely used statistical model which encourages the sparsity both on a group and within the group level. In this paper, we develop an efficient augmented Lagrangian method for large-scale non-overlapping sparse group Lasso problems with each subproblem being solved by a superlinearly convergent inexact semismooth Newton method. Theoretically, we prove that, if the penalty parameter is chosen sufficiently large, the augmented Lagrangian method converges globally at an arbitrarily fast linear rate for the primal iterative sequence, the dual infeasibility, and the duality gap of the primal and dual objective functions. Computationally, we derive explicitly the generalized Jacobian of the proximal mapping associated with the sparse group Lasso regularizer and exploit fully the underlying second order sparsity through the semismooth Newton method. The efficiency and robustness of our proposed algorithm are demonstrated by numerical experiments on both the synthetic and real data sets.
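
For non-overlapping groups, the proximal mapping of the sparse group Lasso regularizer has a well-known closed form (elementwise soft-thresholding followed by group shrinkage); the sketch below is our own simplification for illustration, not the paper's generalized-Jacobian machinery:

```python
import numpy as np

def soft(v, t):
    """Elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sparse_group_lasso(v, groups, lam1, lam2):
    """Prox of lam1*||x||_1 + lam2*sum_g ||x_g||_2 for non-overlapping groups."""
    u = soft(v, lam1)                    # elementwise (within-group) sparsity
    out = np.zeros_like(v)
    for g in groups:                     # each g is a list of indices
        ng = np.linalg.norm(u[g])
        if ng > lam2:
            out[g] = (1.0 - lam2 / ng) * u[g]   # group-level shrinkage
    return out
```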

48 citations


Journal ArticleDOI
TL;DR: An alternative to EM grounded in the Riemannian geometry of positive definite matrices is proposed, and a non-asymptotic convergence analysis for the stochastic method is provided, which is also the first (to the authors' knowledge) such global analysis for Riemannian stochastic gradient.
Abstract: We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
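
A hedged sketch of one Riemannian gradient step on the SPD manifold with the affine-invariant metric (schematic with our own helper names; the paper's reformulation and its batch/stochastic algorithms contain further ingredients):

```python
import numpy as np
from scipy.linalg import expm, fractional_matrix_power

def spd_gradient_step(S, egrad, lr):
    """One Riemannian gradient-descent step on the SPD manifold;
    egrad is the Euclidean gradient of the objective at S."""
    egrad = (egrad + egrad.T) / 2
    rgrad = S @ egrad @ S            # affine-invariant Riemannian gradient
    S_half = fractional_matrix_power(S, 0.5)
    S_nhalf = fractional_matrix_power(S, -0.5)
    # exponential map: Exp_S(xi) = S^{1/2} expm(S^{-1/2} xi S^{-1/2}) S^{1/2}
    return S_half @ expm(-lr * (S_nhalf @ rgrad @ S_nhalf)) @ S_half
```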

47 citations


Journal ArticleDOI
TL;DR: It is shown that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.
Abstract: Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate both the theoretical and practical performance improvement of first-order methods (FOMs) for solving extensive-form games through better design of the dilated entropy function—a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has only a logarithmic dependence on the branching factor of the player. This result improves the overall convergence rate of several FOMs working with dilated entropy function by a factor of $$\Omega (b^dd)$$, where b is the branching factor of the player, and d is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than FOMs despite their theoretically inferior convergence rates. Using our new weighting scheme and a practical parameter tuning procedure we show that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.

Journal ArticleDOI
TL;DR: An augmented Lagrangian algorithm is proposed that generates these types of sequences and new constraint qualifications are proposed, weaker than previously considered ones, which are sufficient for the global convergence of the algorithm to a stationary point.
Abstract: Sequential optimality conditions have played a major role in unifying and extending global convergence results for several classes of algorithms for general nonlinear optimization. In this paper, we extend these concepts to nonlinear semidefinite programming. We define two sequential optimality conditions for nonlinear semidefinite programming. The first is a natural extension of the so-called Approximate-Karush–Kuhn–Tucker (AKKT) condition, well known in nonlinear optimization. The second one, called Trace-AKKT, is more natural in the context of semidefinite programming as the computation of eigenvalues is avoided. We propose an augmented Lagrangian algorithm that generates these types of sequences, and new constraint qualifications are proposed, weaker than previously considered ones, which are sufficient for the global convergence of the algorithm to a stationary point.
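
For orientation, the classical AKKT condition in nonlinear programming (here for inequality constraints $$g_i(x) \le 0$$, in our notation) that the paper extends requires sequences $$x^k \rightarrow x^*$$ and multipliers $$\lambda^k \ge 0$$ such that

$$
\nabla f(x^k) + \sum_i \lambda_i^k \nabla g_i(x^k) \rightarrow 0, \qquad \min\{ -g_i(x^k),\, \lambda_i^k \} \rightarrow 0 \ \ \text{for all } i.
$$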

Journal ArticleDOI
TL;DR: In this paper, the authors study a class of quadratically constrained quadratic programs (QCQPs), called diagonal QCQPs, which contain no off-diagonal terms and provide a sufficient condition on the problem data guaranteeing that the basic Shor semidefinite relaxation is exact.
Abstract: We study a class of quadratically constrained quadratic programs (QCQPs), called diagonal QCQPs, which contain no off-diagonal terms $$x_j x_k$$ for $$j \ne k$$, and we provide a sufficient condition on the problem data guaranteeing that the basic Shor semidefinite relaxation is exact. Our condition complements and refines those already present in the literature and can be checked in polynomial time. We then extend our analysis from diagonal QCQPs to general QCQPs, i.e., ones with no particular structure. By reformulating a general QCQP into diagonal form, we establish new, polynomial-time-checkable sufficient conditions for the semidefinite relaxations of general QCQPs to be exact. Finally, these ideas are extended to show that a class of random general QCQPs has exact semidefinite relaxations with high probability as long as the number of constraints grows no faster than a fixed polynomial in the number of variables. To the best of our knowledge, this is the first result establishing the exactness of the semidefinite relaxation for random general QCQPs.
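
Schematically, in our own notation, a diagonal QCQP and its Shor relaxation take the form (with each $$D_i$$ diagonal, so that only $$\operatorname{diag}(X)$$ enters):

$$
\min_{x}\ x^{\top} D_0 x + 2 c_0^{\top} x \ \ \text{s.t.}\ \ x^{\top} D_i x + 2 c_i^{\top} x \le b_i
\quad\leadsto\quad
\min_{x,\,X}\ \langle D_0, X\rangle + 2 c_0^{\top} x \ \ \text{s.t.}\ \ \langle D_i, X\rangle + 2 c_i^{\top} x \le b_i,\ \ \begin{pmatrix} 1 & x^{\top} \\ x & X \end{pmatrix} \succeq 0,
$$

and exactness means the relaxation's optimal value equals that of the original QCQP.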

Journal ArticleDOI
TL;DR: In this paper, the authors study risk sharing problems with quantile-based risk measures and heterogeneous beliefs, motivated by the use of internal models in finance and insurance, and obtain explicit forms of Pareto-optimal allocations and competitive equilibria by solving various optimization problems.
Abstract: We study risk sharing problems with quantile-based risk measures and heterogeneous beliefs, motivated by the use of internal models in finance and insurance. Explicit forms of Pareto-optimal allocations and competitive equilibria are obtained by solving various optimization problems. For Expected Shortfall (ES) agents, Pareto-optimal allocations are shown to be equivalent to equilibrium allocations, and the equilibrium pricing measure is unique. For Value-at-Risk (VaR) agents or mixed VaR and ES agents, a competitive equilibrium does not exist. Our results generalize existing ones on risk sharing problems with risk measures and belief homogeneity, and draw an interesting connection to early work on optimization properties of ES and VaR.
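
For reference, the two quantile-based risk measures involved are standardly defined, for a loss $$X$$ and level $$\alpha \in (0,1)$$, by

$$
\mathrm{VaR}_{\alpha}(X) = \inf\{ x \in \mathbb{R} : \mathbb{P}(X \le x) \ge \alpha \}, \qquad
\mathrm{ES}_{\alpha}(X) = \frac{1}{1-\alpha} \int_{\alpha}^{1} \mathrm{VaR}_{u}(X)\, du.
$$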

Journal ArticleDOI
TL;DR: A highly efficient augmented Lagrangian method (ALM) is designed for solving a class of convex quadratic programming (QP) problems constrained by the Birkhoff polytope and is demonstrated to be much more efficient than Gurobi in solving a collection of QP problems arising from the relaxation of quadratic assignment problems.
Abstract: We derive an explicit formula, as well as an efficient procedure, for constructing a generalized Jacobian for the projector of a given square matrix onto the Birkhoff polytope, i.e., the set of doubly stochastic matrices. To guarantee the high efficiency of our procedure, a semismooth Newton method for solving the dual of the projection problem is proposed and efficiently implemented. Extensive numerical experiments are presented to demonstrate the merits and effectiveness of our method by comparing its performance against other powerful solvers such as the commercial software Gurobi and the academic code PPROJ (Hager and Zhang in SIAM J Optim 26:1773–1798, 2016). In particular, our algorithm is able to solve the projection problem with over one billion variables and nonnegative constraints to a very high accuracy in less than 15 min on a modest desktop computer. More importantly, based on our efficient computation of the projections and their generalized Jacobians, we can design a highly efficient augmented Lagrangian method (ALM) for solving a class of convex quadratic programming (QP) problems constrained by the Birkhoff polytope. The resulting ALM is demonstrated to be much more efficient than Gurobi in solving a collection of QP problems arising from the relaxation of quadratic assignment problems.
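
In our notation, the projection problem at the core of the paper is the convex QP

$$
\min_{X \in \mathbb{R}^{n \times n}} \ \tfrac{1}{2} \| X - G \|_F^2 \quad \text{s.t.} \quad X \mathbf{1} = \mathbf{1}, \ \ X^{\top} \mathbf{1} = \mathbf{1}, \ \ X \ge 0,
$$

whose dual is the problem to which the semismooth Newton method is applied.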

Journal ArticleDOI
TL;DR: This paper introduces suitable notions of weak, Mordukhovich-, and strong stationarity for mathematical programs with switching constraints and presents some associated constraint qualifications, and applies these results to optimization problems with either-or-constraints.
Abstract: In optimal control, switching structures demanding at most one control to be active at any time instance appear frequently. Discretizing such problems, a so-called mathematical program with switching constraints is obtained. Although these problems are related to other types of disjunctive programs like optimization problems with complementarity or vanishing constraints, their inherent structure makes a separate consideration necessary. Since standard constraint qualifications are likely to fail at the feasible points of switching-constrained optimization problems, stationarity notions which are weaker than the associated Karush–Kuhn–Tucker conditions need to be investigated in order to find applicable necessary optimality conditions. Furthermore, appropriately tailored constraint qualifications need to be formulated. In this paper, we introduce suitable notions of weak, Mordukhovich-, and strong stationarity for mathematical programs with switching constraints and present some associated constraint qualifications. Our findings are exploited to state necessary optimality conditions for (discretized) optimal control problems with switching constraints. Furthermore, we apply our results to optimization problems with either-or-constraints. First, a novel reformulation of such problems using switching constraints is presented. Second, the derived surrogate problem is exploited to obtain necessary optimality conditions for the original program.
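
Schematically, a mathematical program with switching constraints takes the form (notation ours):

$$
\min_{x}\ f(x) \quad \text{s.t.} \quad g(x) \le 0,\ \ h(x) = 0,\ \ G_i(x)\, H_i(x) = 0 \ \ (i = 1, \dots, m),
$$

so that at each feasible point at most one of $$G_i(x)$$ and $$H_i(x)$$ can be nonzero.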

Journal ArticleDOI
TL;DR: This analysis applies to simplices, balls and convex bodies that locally look like a ball, while also allowing for a broader class of reference measures, including the Lebesgue measure.
Abstract: We consider the problem of computing the minimum value $$f_{\min,K}$$ of a polynomial $$f$$ over a compact set $$K \subseteq \mathbb{R}^n$$, which can be reformulated as finding a probability measure $$\nu$$ on $$K$$ minimizing $$\int_K f \, d\nu$$. Lasserre showed that it suffices to consider such measures of the form $$\nu = q\mu$$, where $$q$$ is a sum-of-squares polynomial and $$\mu$$ is a given Borel measure supported on $$K$$. By bounding the degree of $$q$$ by $$2r$$ one gets a converging hierarchy of upper bounds $$f^{(r)}$$ for $$f_{\min,K}$$. When $$K$$ is the hypercube $$[-1,1]^n$$, equipped with the Chebyshev measure, the parameters $$f^{(r)}$$ are known to converge to $$f_{\min,K}$$ at a rate in $$O(1/r^2)$$. We extend this error estimate to a wider class of convex bodies, while also allowing for a broader class of reference measures, including the Lebesgue measure. Our analysis applies to simplices, balls and convex bodies that locally look like a ball. In addition, we show an error estimate in $$O(\log r / r)$$ when $$K$$ satisfies a minor geometrical condition, and in $$O(\log^2 r / r^2)$$ when $$K$$ is a convex body, equipped with the Lebesgue measure. This improves upon the currently best known error estimates in $$O(1/\sqrt{r})$$ and $$O(1/r)$$ for these two respective cases.
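
In symbols, the Lasserre upper bound at level $$r$$ described above is

$$
f^{(r)} = \min_{q}\ \int_{K} f\, q \, d\mu \quad \text{s.t.} \quad \int_{K} q \, d\mu = 1, \ \ q \ \text{a sum of squares}, \ \ \deg q \le 2r,
$$

which is a semidefinite program since $$q$$ ranges over sum-of-squares polynomials of bounded degree.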

Journal ArticleDOI
TL;DR: In this article, the authors present necessary conditions for monotonicity of fixed point iterations of mappings that may violate the usual nonexpansive property, and specialize these results to the alternating projections iteration where the metric subregularity property takes on a distinct geometric characterization of sets at points of intersection called subtransversality.
Abstract: We present necessary conditions for monotonicity of fixed point iterations of mappings that may violate the usual nonexpansive property. Notions of linear-type monotonicity of fixed point sequences—weaker than Fejer monotonicity—are shown to imply metric subregularity. This, together with the almost averaging property recently introduced by Luke et al. (Math Oper Res, 2018. https://doi.org/10.1287/moor.2017.0898), guarantees linear convergence of the sequence to a fixed point. We specialize these results to the alternating projections iteration where the metric subregularity property takes on a distinct geometric characterization of sets at points of intersection called subtransversality. Subtransversality is shown to be necessary for linear convergence of alternating projections for consistent feasibility.

Journal ArticleDOI
TL;DR: The convergence rate of a hierarchy of upper bounds for polynomial minimization problems, proposed by Lasserre (SIAM J Optim 21(3):864-885, 2011), for the special case when the feasible set is the unit (hyper)sphere was studied in this article.
Abstract: We study the convergence rate of a hierarchy of upper bounds for polynomial minimization problems, proposed by Lasserre (SIAM J Optim 21(3):864–885, 2011), for the special case when the feasible set is the unit (hyper)sphere. The upper bound at level $$r \in \mathbb{N}$$ of the hierarchy is defined as the minimal expected value of the polynomial over all probability distributions on the sphere, when the probability density function is a sum-of-squares polynomial of degree at most $$2r$$ with respect to the surface measure. We show that the rate of convergence is $$O(1/r^2)$$ and we give a class of polynomials of any positive degree for which this rate is tight. In addition, we explore the implications for the related rate of convergence for the generalized problem of moments on the sphere.

Journal ArticleDOI
TL;DR: A classic result of Cook et al. (Math. Program. 34:251–264, 1986) bounds the distance between optimal solutions of mixed-integer linear programs and of their linear relaxations; here this distance is shown to be bounded in terms of the number of integer variables and a parameter Δ, which quantifies sub-determinants of the underlying linear inequalities.
Abstract: A classic result of Cook et al. (Math. Program. 34:251–264, 1986) bounds the distances between optimal solutions of mixed-integer linear programs and optimal solutions of the corresponding linear relaxations. Their bound is given in terms of the number of variables and a parameter $$ \varDelta $$, which quantifies sub-determinants of the underlying linear inequalities. We show that this distance can be bounded in terms of $$ \varDelta $$ and the number of integer variables rather than the total number of variables. To this end, we make use of a result by Olson (J. Number Theory 1:8–10, 1969) in additive combinatorics and demonstrate how it implies feasibility of certain mixed-integer linear programs. We conjecture that our bound can be improved to a function that only depends on $$ \varDelta $$, in general.
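
For context, the classical bound is, in the pure-integer case, roughly of the form (our paraphrase)

$$
\| x^{*}_{\mathrm{LP}} - z^{*} \|_{\infty} \ \le \ n \, \Delta,
$$

with $$n$$ the total number of variables; the paper's contribution is to replace the total number of variables by the number of integer variables.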

Journal ArticleDOI
TL;DR: This work presents an improved proximity condition under which the Peng–Wei relaxation of k-means recovers the underlying clusters exactly; this condition improves upon Kumar and Kannan's and is comparable to that of Awasthi and Sheffet.
Abstract: Given a set of data, one central goal is to group them into clusters based on some notion of similarity between the individual objects. One of the most popular and widely-used approaches is k-means despite the computational hardness to find its global minimum. We study and compare the properties of different convex relaxations by relating them to corresponding proximity conditions, an idea originally introduced by Kumar and Kannan. Using conic duality theory, we present an improved proximity condition under which the Peng–Wei relaxation of k-means recovers the underlying clusters exactly. Our proximity condition improves upon that of Kumar and Kannan and is comparable to that of Awasthi and Sheffet, where proximity conditions are established for projective k-means. In addition, we provide a necessary proximity condition for the exactness of the Peng–Wei relaxation. For the special case of equal cluster sizes, we establish a different and completely localized proximity condition under which the Amini–Levina relaxation yields exact clustering, thereby having addressed an open problem by Awasthi and Sheffet in the balanced case. Our framework is not only deterministic and model-free but also comes with a clear geometric meaning which allows for further analysis and generalization. Moreover, it can be conveniently applied to analyzing various data generative models such as the stochastic ball models and Gaussian mixture models. With this method, we improve the current minimum separation bound for the stochastic ball models and achieve the state-of-the-art results of learning Gaussian mixture models.
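
For reference, the Peng–Wei relaxation referred to above is, in a common formulation (with $$D_{ij} = \|x_i - x_j\|^2$$ the squared-distance matrix of the $$N$$ data points and $$k$$ the number of clusters):

$$
\min_{Z \in \mathbb{R}^{N \times N}} \ \langle D, Z \rangle \quad \text{s.t.} \quad Z \succeq 0, \ \ Z \ge 0, \ \ Z \mathbf{1} = \mathbf{1}, \ \ \operatorname{tr}(Z) = k.
$$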

Journal ArticleDOI
TL;DR: A pure cutting plane algorithm is constructed which is shown to converge if the initial relaxation is a polyhedron, and a theory of outer-product-free sets is developed, where S is the set of real, symmetric matrices of the form $$xx^T$$.
Abstract: This paper introduces cutting planes that involve minimal structural assumptions, enabling the generation of strong polyhedral relaxations for a broad class of problems. We consider valid inequalities for the set $$S\cap P$$ , where S is a closed set, and P is a polyhedron. Given an oracle that provides the distance from a point to S, we construct a pure cutting plane algorithm which is shown to converge if the initial relaxation is a polyhedron. These cuts are generated from convex forbidden zones, or S-free sets, derived from the oracle. We also consider the special case of polynomial optimization. Accordingly we develop a theory of outer-product-free sets, where S is the set of real, symmetric matrices of the form $$xx^T$$ . All maximal outer-product-free sets of full dimension are shown to be convex cones and we identify several families of such sets. These families are used to generate strengthened intersection cuts that can separate any infeasible extreme point of a linear programming relaxation efficiently. Computational experiments demonstrate the promise of our approach.

Journal ArticleDOI
TL;DR: In this paper, the authors study multistage distributionally robust mixed-integer programs under endogenous uncertainty, where the probability distribution of stage-wise uncertainty depends on the decisions made in previous stages.
Abstract: We study multistage distributionally robust mixed-integer programs under endogenous uncertainty, where the probability distribution of stage-wise uncertainty depends on the decisions made in previous stages. We first consider two ambiguity sets defined by decision-dependent bounds on the first and second moments of uncertain parameters and by mean and covariance matrix that exactly match decision-dependent empirical ones, respectively. For both sets, we show that the subproblem in each stage can be recast as a mixed-integer linear program (MILP). Moreover, we extend the general moment-based ambiguity set in Delage and Ye (Oper Res 58(3):595–612, 2010) to the multistage decision-dependent setting, and derive mixed-integer semidefinite programming (MISDP) reformulations of stage-wise subproblems. We develop methods for attaining lower and upper bounds of the optimal objective value of the multistage MISDPs, and approximate them using a series of MILPs. We deploy the Stochastic Dual Dynamic integer Programming (SDDiP) method for solving the problem under the three ambiguity sets with risk-neutral or risk-averse objective functions, and conduct numerical studies on multistage facility-location instances having diverse sizes under different parameter and uncertainty settings. Our results show that the SDDiP quickly finds optimal solutions for moderate-sized instances under the first two ambiguity sets, and also finds good approximate bounds for the multistage MISDPs derived under the third ambiguity set. We also demonstrate the efficacy of incorporating decision-dependent distributional ambiguity in multistage decision-making processes.
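
Schematically (in our own notation), the first type of decision-dependent ambiguity set bounds the moments of the uncertainty by functions of the earlier decisions $$x$$:

$$
\mathcal{P}(x) = \Big\{ P \ : \ \underline{\mu}(x) \le \mathbb{E}_{P}[\xi] \le \overline{\mu}(x), \ \ \mathbb{E}_{P}\big[\xi \xi^{\top}\big] \preceq \Sigma(x) \Big\}.
$$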

Journal ArticleDOI
TL;DR: It is shown how all solutions within a given tolerance of the optimal value can be efficiently and compactly represented in a weighted decision diagram; the authors conclude that postoptimality analysis based on sound-reduced diagrams has the potential to extract significantly more useful information from an integer programming model than was previously feasible.
Abstract: It is often useful in practice to explore near-optimal solutions of an integer programming problem. We show how all solutions within a given tolerance of the optimal value can be efficiently and compactly represented in a weighted decision diagram. The structure of the diagram facilitates rapid processing of a wide range of queries about the near-optimal solution space, as well as reoptimization after changes in the objective function. We also exploit the paradoxical fact that the diagram can be reduced in size if it is allowed to represent additional solutions. We show that a “sound reduction” operation, applied repeatedly, yields the smallest such diagram that is suitable for postoptimality analysis, and one that is typically far smaller than a tree that represents the same set of near-optimal solutions. We conclude that postoptimality analysis based on sound-reduced diagrams has the potential to extract significantly more useful information from an integer programming model than was previously feasible.

Journal ArticleDOI
TL;DR: In this paper, a 3-operator resolvent-splitting with provably minimal lifting is presented that directly generalizes DRS, where lifting roughly corresponds to enlarging the problem size; generalizing DRS to 3 operators without any lifting is shown to be impossible.
Abstract: Given the success of Douglas–Rachford splitting (DRS), it is natural to ask whether DRS can be generalized. Are there other 2 operator resolvent-splittings sharing the favorable properties of DRS? Can DRS be generalized to 3 operators? This work presents the answers: no and no. In a certain sense, DRS is the unique 2 operator resolvent-splitting, and generalizing DRS to 3 operators is impossible without lifting, where lifting roughly corresponds to enlarging the problem size. The impossibility result further raises a question. How much lifting is necessary to generalize DRS to 3 operators? This work presents the answer by providing a novel 3 operator resolvent-splitting with provably minimal lifting that directly generalizes DRS.
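
For reference, the classical two-operator DRS iteration that the paper takes as its starting point can be sketched as follows (with proximal maps standing in for the resolvents):

```python
def drs(prox_f, prox_g, z, alpha, iters=500):
    """Classical Douglas-Rachford splitting sketch for 0 in A(x) + B(x);
    prox_f, prox_g play the roles of the resolvents J_{alpha A}, J_{alpha B}."""
    for _ in range(iters):
        x = prox_f(z, alpha)
        y = prox_g(2 * x - z, alpha)
        z = z + y - x          # single governing sequence: no lifting needed
    return x
```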

Journal ArticleDOI
TL;DR: This work provides a general description of MIDAS, and proves its almost-sure convergence to a $$2T\varepsilon$$-optimal policy for problems with T stages when the Bellman functions are known to be monotonic, and the sampling process satisfies standard assumptions.
Abstract: Mixed integer dynamic approximation scheme (MIDAS) is a new sampling-based algorithm for solving finite-horizon stochastic dynamic programs with monotonic Bellman functions. MIDAS approximates these value functions using step functions, leading to stage problems that are mixed integer programs. We provide a general description of MIDAS, and prove its almost-sure convergence to a $$2T\varepsilon $$-optimal policy for problems with T stages when the Bellman functions are known to be monotonic, and the sampling process satisfies standard assumptions.

Journal ArticleDOI
TL;DR: An algorithm for minimax problems that arise in robust optimization in the absence of objective function derivatives that utilizes an extension of methods for inexact outer approximation in sampling a potentially infinite-cardinality uncertainty set is developed.
Abstract: We develop an algorithm for minimax problems that arise in robust optimization in the absence of objective function derivatives. The algorithm utilizes an extension of methods for inexact outer approximation in sampling a potentially infinite-cardinality uncertainty set. Clarke stationarity of the algorithm output is established alongside desirable features of the model-based trust-region subproblems encountered. We demonstrate the practical benefits of the algorithm on a new class of test problems.

Journal ArticleDOI
TL;DR: A single-exponential algorithm is developed for so-called combinatorial n-fold integer programs, which are remarkably similar to prior ILP formulations for various problems but, unlike them, also allow variable dimension.
Abstract: Many fundamental $$\mathsf {NP}$$ -hard problems can be formulated as integer linear programs (ILPs). A famous algorithm by Lenstra solves ILPs in time that is exponential only in the dimension of the program, and polynomial in the size of the ILP. That algorithm became a ubiquitous tool in the design of fixed-parameter algorithms for $$\mathsf {NP}$$ -hard problems, where one wishes to isolate the hardness of a problem by some parameter. However, in many cases using Lenstra’s algorithm has two drawbacks: First, the run time of the resulting algorithms is often double-exponential in the parameter, and second, an ILP formulation in small dimension cannot easily express problems involving many different costs. Inspired by the work of Hemmecke et al. (Math Program 137(1–2, Ser. A):325–341, 2013), we develop a single-exponential algorithm for so-called combinatorial n-fold integer programs, which are remarkably similar to prior ILP formulations for various problems, but unlike them, also allow variable dimension. We then apply our algorithm to many relevant problems like Closest String, Swap Bribery, Weighted Set Multicover, and several others, and obtain exponential speedups in the dependence on the respective parameters, the input size, or both. Unlike Lenstra’s algorithm, which is essentially a bounded search tree algorithm, our result uses the technique of augmenting steps. At its heart is a deep result stating that in combinatorial n-fold IPs, existence of an augmenting step implies existence of a “local” augmenting step, which can be found using dynamic programming. Our results provide an important insight into many problems by showing that they exhibit this phenomenon, and highlight the importance of augmentation techniques.
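
For orientation, an n-fold integer program has a constraint matrix of the block shape

$$
\begin{pmatrix}
A & A & \cdots & A \\
B & 0 & \cdots & 0 \\
0 & B & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & B
\end{pmatrix},
$$

with n copies of the blocks A and B; the combinatorial case studied here places additional structure on these blocks.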

Journal ArticleDOI
TL;DR: This work considers the fully dynamic bin packing problem, using a new dynamic rounding technique and new ideas to handle small items in a dynamic setting such that no amortization is needed.
Abstract: We consider the fully dynamic bin packing problem, where items arrive and depart in an online fashion and repacking of previously packed items is allowed. The goal is, of course, to minimize both the number of bins used as well as the amount of repacking. A recently introduced way of measuring the repacking costs at each timestep is the migration factor, defined as the total size of repacked items divided by the size of an arriving or departing item. Concerning the trade-off between number of bins and migration factor, if we wish to achieve an asymptotic competitive ratio of $$1 + \epsilon $$ for the number of bins, a relatively simple argument proves a lower bound of $$\Omega ({1}/{\epsilon })$$ for the migration factor. We establish a nearly matching upper bound of $$O({1}/{\epsilon }^4 \log {1}/{\epsilon })$$ using a new dynamic rounding technique and new ideas to handle small items in a dynamic setting such that no amortization is needed. The running time of our algorithm is polynomial in the number of items n and in $${1}/{\epsilon }$$. The previous best trade-off was for an asymptotic competitive ratio of $${5}/{4}$$ for the bins (rather than $$1+\epsilon $$) and needed an amortized number of $$O(\log n)$$ repackings (while in our scheme the number of repackings is independent of n and non-amortized).

Journal ArticleDOI
TL;DR: This work derives effective lower bounds for the bilevel knapsack problem and presents an exact method that exploits the structure of the induced follower’s problem, which strongly outperforms the current state-of-the-art algorithms designed for the problem.
Abstract: We consider the bilevel knapsack problem with interdiction constraints, an extension of the classic 0–1 knapsack problem formulated as a Stackelberg game with two agents, a leader and a follower, that choose items from a common set and hold their own private knapsacks. First, the leader selects some items to be interdicted for the follower while satisfying a capacity constraint. Then the follower packs a set of the remaining items according to his knapsack constraint in order to maximize the profits. The goal of the leader is to minimize the follower’s total profit. We derive effective lower bounds for the bilevel knapsack problem and present an exact method that exploits the structure of the induced follower’s problem. The approach strongly outperforms the current state-of-the-art algorithms designed for the problem. We extend the same algorithmic framework to the interval min–max regret knapsack problem after providing a novel bilevel programming reformulation. Also for this problem, the proposed approach outperforms the exact algorithms available in the literature.
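
In our notation, the problem reads as the min–max program

$$
\min_{x \in \{0,1\}^n : \ w^{\top} x \le C_L} \ \ \max_{y \in \{0,1\}^n : \ v^{\top} y \le C_F, \ y \le \mathbf{1} - x} \ p^{\top} y,
$$

where $$x$$ encodes the leader's interdicted items, $$C_L$$ and $$C_F$$ are the two knapsack capacities, and the constraint $$y \le \mathbf{1} - x$$ keeps the follower from packing interdicted items.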