
Showing papers in "Mathematical Programming in 2011"


Journal ArticleDOI
TL;DR: A simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines, particularly well suited for large text classification problems, where it demonstrates an order-of-magnitude speedup over previous SVM learning methods.
Abstract: We describe and analyze a simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy $${\epsilon}$$ is $${\tilde{O}(1 / \epsilon)}$$, where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require $${\Omega(1 / \epsilon^2)}$$ iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total run-time of our method is $${\tilde{O}(d/(\lambda \epsilon))}$$, where d is a bound on the number of non-zero features in each example. Since the run-time does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to non-linear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an order-of-magnitude speedup over previous SVM learning methods.

2,037 citations
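
A minimal numpy sketch of the kind of update analyzed above: one random training example per iteration, step size 1/(λt), and the optional projection step. Names and parameters are illustrative, not the authors' code.

```python
import numpy as np

def pegasos_style_svm(X, y, lam=0.1, n_iters=1000, seed=0):
    """Stochastic sub-gradient descent on the SVM primal objective
    lam/2 * ||w||^2 + (1/m) * sum_i max(0, 1 - y_i * <w, x_i>)."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(m)                    # one training example per iteration
        eta = 1.0 / (lam * t)                  # step size 1/(lambda * t)
        if y[i] * (X[i] @ w) < 1.0:            # margin violated: hinge subgradient
            w = (1.0 - eta * lam) * w + eta * y[i] * X[i]
        else:
            w = (1.0 - eta * lam) * w
        nrm = np.linalg.norm(w)                # optional projection onto the ball
        if nrm > 1.0 / np.sqrt(lam):           # of radius 1/sqrt(lambda)
            w *= 1.0 / (np.sqrt(lam) * nrm)
    return w
```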


Journal ArticleDOI
TL;DR: Fixed point and Bregman iterative algorithms for nuclear norm minimization, with a convergence proof for the first of them, and a very fast, robust and powerful algorithm called FPCA (Fixed Point Continuation with Approximate SVD) that can solve very large matrix rank minimization problems.
Abstract: The linearly constrained matrix rank minimization problem is widely applicable in many fields such as control, signal processing and system identification. The tightest convex relaxation of this problem is the linearly constrained nuclear norm minimization. Although the latter can be cast as a semidefinite programming problem, such an approach is computationally expensive to solve when the matrices are large. In this paper, we propose fixed point and Bregman iterative algorithms for solving the nuclear norm minimization problem and prove convergence of the first of these algorithms. By using a homotopy approach together with an approximate singular value decomposition procedure, we get a very fast, robust and powerful algorithm, which we call FPCA (Fixed Point Continuation with Approximate SVD), that can solve very large matrix rank minimization problems (the code can be downloaded from http://www.columbia.edu/~sm2756/FPCA.htm for non-commercial use). Our numerical results on randomly generated and real matrix completion problems demonstrate that this algorithm is much faster and provides much better recoverability than semidefinite programming solvers such as SDPT3. For example, our algorithm can recover 1000 × 1000 matrices of rank 50 with a relative error of $${10^{-5}}$$ in about 3 min by sampling only 20% of the elements. We know of no other method that achieves as good recoverability. Numerical experiments on online recommendation, DNA microarray data set and image inpainting problems demonstrate the effectiveness of our algorithms.

1,099 citations
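
The core fixed point iteration is a gradient step on the data-fidelity term followed by shrinkage of the singular values. A bare-bones sketch for matrix completion, without the continuation schedule and approximate SVD that make the authors' FPCA fast (names and parameters are illustrative):

```python
import numpy as np

def svt(Z, tau):
    """Shrink the singular values of Z by tau (the matrix soft-threshold)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fixed_point_completion(M, mask, mu=1.0, tau=1.0, n_iters=500):
    """Fixed point iteration X <- svt(X - tau*g(X), tau*mu) for
    min_X mu*||X||_* + 0.5*||mask*(X - M)||_F^2."""
    X = np.zeros_like(M)
    for _ in range(n_iters):
        g = mask * (X - M)              # gradient of the quadratic fidelity term
        X = svt(X - tau * g, tau * mu)  # shrinkage step
    return X
```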


Journal ArticleDOI
TL;DR: A convergence and rate of convergence analysis of a variety of incremental methods, including some that involve randomization in the selection of components, and applications in a few contexts, including signal processing and inference/machine learning are discussed.
Abstract: We consider the minimization of a sum $${\sum_{i=1}^mf_i(x)}$$ consisting of a large number of convex component functions $${f_i}$$. For this problem, incremental methods consisting of gradient or subgradient iterations applied to single components have proved very effective. We propose new incremental methods, consisting of proximal iterations applied to single components, as well as combinations of gradient, subgradient, and proximal iterations. We provide a convergence and rate of convergence analysis of a variety of such methods, including some that involve randomization in the selection of components. We also discuss applications in a few contexts, including signal processing and inference/machine learning.

381 citations
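
A sketch of the incremental proximal idea for one concrete case where the component prox is available in closed form, least squares components f_i(x) = ½(aᵢᵀx − bᵢ)²; the paper covers far more general components and mixed gradient/subgradient/proximal steps:

```python
import numpy as np

def incremental_proximal_ls(A, b, alpha0=1.0, n_epochs=50, seed=0):
    """Incremental proximal iterations for min_x sum_i 0.5*(a_i'x - b_i)^2:
    each step applies the exact prox of a single, randomly chosen component."""
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.zeros(d)
    k = 1
    for _ in range(n_epochs):
        for i in rng.permutation(m):           # randomized component order
            alpha = alpha0 / k                 # diminishing stepsize
            r = A[i] @ x - b[i]
            # closed-form prox of 0.5*(a'x - b)^2 around the current iterate
            x = x - (alpha * r / (1.0 + alpha * (A[i] @ A[i]))) * A[i]
            k += 1
    return x
```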


Journal ArticleDOI
TL;DR: An Adaptive Regularisation algorithm using Cubics (ARC) is proposed for unconstrained optimization, generalizing at the same time an unpublished method due to Griewank, an algorithm by Nesterov and Polyak and a proposal by Weiser et al.
Abstract: An Adaptive Regularisation algorithm using Cubics (ARC) is proposed for unconstrained optimization, generalizing at the same time an unpublished method due to Griewank (Technical Report NA/12, 1981, DAMTP, University of Cambridge), an algorithm by Nesterov and Polyak (Math Program 108(1):177–205, 2006) and a proposal by Weiser et al. (Optim Methods Softw 22(3):413–431, 2007). At each iteration of our approach, an approximate global minimizer of a local cubic regularisation of the objective function is determined, and this ensures a significant improvement in the objective so long as the Hessian of the objective is locally Lipschitz continuous. The new method uses an adaptive estimation of the local Lipschitz constant and approximations to the global model-minimizer which remain computationally-viable even for large-scale problems. We show that the excellent global and local convergence properties obtained by Nesterov and Polyak are retained, and sometimes extended to a wider class of problems, by our ARC approach. Preliminary numerical experiments with small-scale test problems from the CUTEr set show encouraging performance of the ARC algorithm when compared to a basic trust-region implementation.

366 citations
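
A compact sketch of the adaptive loop described above, using a generic local solver for the cubic subproblem; a serious implementation would use Lanczos-based approximate model minimizers as the paper describes. All names are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def arc_sketch(f, grad, hess, x0, sigma=1.0, eta=0.1, tol=1e-6, max_iter=200):
    """Adaptive cubic regularisation: approximately minimize the model
    m(s) = g's + 0.5*s'Hs + (sigma/3)*||s||^3, then adapt sigma by comparing
    actual to predicted decrease."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        model = lambda s: g @ s + 0.5 * s @ H @ s + (sigma / 3) * np.linalg.norm(s) ** 3
        s = minimize(model, -g / (np.linalg.norm(H) + sigma)).x  # approximate minimizer
        rho = (f(x) - f(x + s)) / max(-model(s), 1e-16)  # actual vs predicted decrease
        if rho > eta:
            x, sigma = x + s, max(0.5 * sigma, 1e-8)     # success: accept, relax sigma
        else:
            sigma *= 2.0                                  # failure: regularise harder
    return x
```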


Journal ArticleDOI
TL;DR: The approach is more general in that it allows the cubic model to be solved only approximately and may employ approximate Hessians, and the orders of these bounds match those proved for Algorithm 3.3 of Nesterov and Polyak which minimizes the cubic model globally on each iteration.
Abstract: An Adaptive Regularisation framework using Cubics (ARC) was proposed for unconstrained optimization and analysed in Cartis, Gould and Toint (Part I, Math Program, doi: 10.1007/s10107-009-0286-5, 2009), generalizing at the same time an unpublished method due to Griewank (Technical Report NA/12, 1981, DAMTP, University of Cambridge), an algorithm by Nesterov and Polyak (Math Program 108(1):177–205, 2006) and a proposal by Weiser, Deuflhard and Erdmann (Optim Methods Softw 22(3):413–431, 2007). In this companion paper, we further the analysis by providing worst-case global iteration complexity bounds for ARC and a second-order variant to achieve approximate first-order, and for the latter second-order, criticality of the iterates. In particular, the second-order ARC algorithm requires at most $${\mathcal{O}(\epsilon^{-3/2})}$$ iterations, or equivalently, function- and gradient-evaluations, to drive the norm of the gradient of the objective below the desired accuracy $${\epsilon}$$, and $${\mathcal{O}(\epsilon^{-3})}$$ iterations to reach approximate nonnegative curvature in a subspace. The orders of these bounds match those proved for Algorithm 3.3 of Nesterov and Polyak which minimizes the cubic model globally on each iteration. Our approach is more general in that it allows the cubic model to be solved only approximately and may employ approximate Hessians.

305 citations


Journal ArticleDOI
TL;DR: This paper proposes an efficient method to estimate the approximation error introduced by this rather drastic means of complexity reduction: it applies the linear decision rule restriction not only to the primal but also to a dual version of the stochastic program.
Abstract: Linear stochastic programming provides a flexible toolbox for analyzing real-life decision situations, but it can become computationally cumbersome when recourse decisions are involved. The latter are usually modeled as decision rules, i.e., functions of the uncertain problem data. It has recently been argued that stochastic programs can quite generally be made tractable by restricting the space of decision rules to those that exhibit a linear data dependence. In this paper, we propose an efficient method to estimate the approximation error introduced by this rather drastic means of complexity reduction: we apply the linear decision rule restriction not only to the primal but also to a dual version of the stochastic program. By employing techniques that are commonly used in modern robust optimization, we show that both arising approximate problems are equivalent to tractable linear or semidefinite programs of moderate sizes. The gap between their optimal values estimates the loss of optimality incurred by the linear decision rule approximation. Our method remains applicable if the stochastic program has random recourse and multiple decision stages. It also extends to cases involving ambiguous probability distributions.

282 citations
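
To make the primal side of the construction concrete, here is a toy two-stage problem under the linear decision rule y(xi) = y0 + y1·xi with xi uniform on [-1, 1]; the data are made up, and the paper's method would pair this upper bound with a dual LDR lower bound to estimate the gap:

```python
import cvxpy as cp

# Toy problem: min x + 2*E[y(xi)]  s.t.  x + y(xi) >= xi,  y(xi) >= 0
# for all xi in [-1, 1], and x >= 0.  Under y(xi) = y0 + y1*xi and E[xi] = 0,
# each semi-infinite constraint has a finite robust counterpart.
x, y0, y1 = cp.Variable(), cp.Variable(), cp.Variable()
cons = [x + y0 >= cp.abs(y1 - 1),   # x + y0 + (y1 - 1)*xi >= 0 for all |xi| <= 1
        y0 >= cp.abs(y1),           # y0 + y1*xi >= 0 for all |xi| <= 1
        x >= 0]
prob = cp.Problem(cp.Minimize(x + 2 * y0), cons)
prob.solve()   # optimal value is an upper bound on the true two-stage problem
```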


Journal ArticleDOI
TL;DR: In this paper, the authors propose a statistical ranking method called HodgeRank for ranking data that may be incomplete and imbalanced, characteristics common in modern datasets coming from e-commerce and internet applications.
Abstract: We propose a technique that we call HodgeRank for ranking data that may be incomplete and imbalanced, characteristics common in modern datasets coming from e-commerce and internet applications. We are primarily interested in cardinal data based on scores or ratings though our methods also give specific insights on ordinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method exploits the graph Helmholtzian, which is the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We shall study the graph Helmholtzian using combinatorial Hodge theory, which provides a way to unravel ranking information from edge flows. In particular, we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the $${\ell_2}$$-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained—if this is large, then it indicates that the data does not have a good global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provide information on whether inconsistency in the ranking data arises locally or globally. When applied to statistical ranking problems, Hodge decomposition sheds light on whether a given dataset may be globally ranked in a meaningful way or if the data is inherently inconsistent and thus could not have any reasonable global ranking; in the latter case it provides information on the nature of the inconsistencies. An obvious advantage over the NP-hardness of Kemeny optimization is that HodgeRank may be easily computed via a linear least squares regression. We also discuss connections with well-known ordinal ranking techniques such as Kemeny optimization and Borda count from social choice theory.

278 citations
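
The l2-optimal global ranking is just a least squares problem on the comparison graph, which is the computational point made above. A small sketch with illustrative names; the residual is the divergence-free (cyclic) component:

```python
import numpy as np

def hodgerank_l2(n, comparisons):
    """Global scores from pairwise data: solve min_s ||B s - y||_2, where B is
    the edge-vertex incidence matrix and y the observed edge flow.
    `comparisons` is a list of (i, j, y_ij), with y_ij ~ score_j - score_i."""
    B = np.zeros((len(comparisons), n))
    y = np.zeros(len(comparisons))
    for k, (i, j, yij) in enumerate(comparisons):
        B[k, i], B[k, j], y[k] = -1.0, 1.0, yij
    s, *_ = np.linalg.lstsq(B, y, rcond=None)
    s -= s.mean()                    # scores are defined up to an additive constant
    cyclic = y - B @ s               # inconsistent part of the data
    return s, cyclic

# A 3-cycle has no meaningful global ranking: scores ~ 0, residual is all of y.
scores, cyclic = hodgerank_l2(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
```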


Journal ArticleDOI
TL;DR: It is proved that finding the global minimal value of the problem is strongly NP-Hard, but computing a local minimizer of the problem can be done in polynomial time.
Abstract: We discuss the $${L_p}$$ (0 ≤ p < 1) minimization problem arising from sparse solution construction and compressed sensing. For any fixed 0 < p < 1, we prove that finding the global minimal value of the problem is strongly NP-Hard, but computing a local minimizer of the problem can be done in polynomial time. We also develop an interior-point potential reduction algorithm with a provable complexity bound and demonstrate preliminary computational results on the effectiveness of the algorithm.

274 citations


Journal ArticleDOI
TL;DR: The new formulations for piecewise linear functions of one and two variables, which use a number of binary variables and extra constraints logarithmic in the number of linear pieces, are proved to have favorable tightness properties and shown to significantly outperform other mixed integer binary formulations.
Abstract: Many combinatorial constraints over continuous variables such as SOS1 and SOS2 constraints can be interpreted as disjunctive constraints that restrict the variables to lie in the union of a finite number of specially structured polyhedra. Known mixed integer binary formulations for these constraints have a number of binary variables and extra constraints linear in the number of polyhedra. We give sufficient conditions for constructing formulations for these constraints with a number of binary variables and extra constraints logarithmic in the number of polyhedra. Using these conditions we introduce mixed integer binary formulations for SOS1 and SOS2 constraints that have a number of binary variables and extra constraints logarithmic in the number of continuous variables. We also introduce the first mixed integer binary formulations for piecewise linear functions of one and two variables that use a number of binary variables and extra constraints logarithmic in the number of linear pieces of the functions. We prove that the new formulations for piecewise linear functions have favorable tightness properties and present computational results showing that they can significantly outperform other mixed integer binary formulations.

229 citations
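
The flavor of the logarithmic construction can be sketched for SOS2: with a Gray code over the pieces, only ⌈log2 n⌉ binaries and 2⌈log2 n⌉ extra constraints are needed. A sketch of the index-set construction in the style of the formulations described above (the paper's formulations differ in detail and generalize further):

```python
import math

def log_sos2_index_sets(n):
    """Index sets for a logarithmic SOS2 model on lambda_0, ..., lambda_n
    (n pieces). With a reflected Gray code over the pieces and binaries
    y_0, ..., y_{L-1}, the model adds, for each bit l:
        sum(lambda_v for v in ones[l])  <= y_l
        sum(lambda_v for v in zeros[l]) <= 1 - y_l
    which forces the support of lambda onto two adjacent breakpoints."""
    L = max(1, math.ceil(math.log2(n)))
    code = [s ^ (s >> 1) for s in range(n)]                   # Gray code of piece s
    pieces = lambda v: [s for s in (v - 1, v) if 0 <= s < n]  # pieces touching vertex v
    ones = [[v for v in range(n + 1)
             if all(code[s] >> l & 1 for s in pieces(v))] for l in range(L)]
    zeros = [[v for v in range(n + 1)
              if all(not (code[s] >> l & 1) for s in pieces(v))] for l in range(L)]
    return ones, zeros
```

For n = 4 pieces this yields two binaries, and each 0/1 assignment of (y_0, y_1) forces the support of lambda onto one adjacent pair of breakpoints.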


Journal ArticleDOI
TL;DR: The new matrix decomposition theorems for Hermitian positive semidefinite matrices are proven by construction in this paper, and it is demonstrated that the constructive procedures can be implemented efficiently, stably, and accurately.
Abstract: In this paper, we present several new rank-one decomposition theorems for Hermitian positive semidefinite matrices, which generalize our previous results in Huang and Zhang (Math Oper Res 32(3):758–768, 2007) and Ai and Zhang (SIAM J Optim 19(4):1735–1756, 2009). The new matrix rank-one decomposition theorems appear to have wide applications in theory as well as in practice. On the theoretical side, for example, we show how to further extend some of the classical results including a lemma due to Yuan (Math Program 47:53–63, 1990), the classical results on the convexity of the joint numerical ranges (Pang and Zhang in Unpublished Manuscript, 2004; Au-Yeung and Poon in Southeast Asian Bull Math 3:85–92, 1979), and the so-called Finsler’s lemma (Bohnenblust in Unpublished Manuscript; Au-Yeung and Poon in Southeast Asian Bull Math 3:85–92, 1979). On the practical side, we show that the new results can be applied to solve two typical problems in signal processing and communication: one for radar code optimization and the other for robust beamforming. The new matrix decomposition theorems are proven by construction in this paper, and we demonstrate that the constructive procedures can be implemented efficiently, stably, and accurately. The URL of our Matlab programs is given in this paper. We strongly believe that the new decomposition procedures, as a means to solve non-convex quadratic optimization with a few quadratic constraints, are useful for many other potential engineering applications.

180 citations
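
For orientation, the plain spectral rank-one decomposition of a Hermitian PSD matrix is a one-liner; the theorems above are stronger in that the constructed terms additionally control prescribed quadratic-form values $${p^* A_k p}$$:

```python
import numpy as np

def spectral_rank_one_terms(X, tol=1e-10):
    """Write Hermitian PSD X as sum_k p_k p_k^*, via the eigendecomposition.
    (The paper's constructive procedures impose extra side conditions.)"""
    w, V = np.linalg.eigh(X)
    return [np.sqrt(lam) * V[:, k] for k, lam in enumerate(w) if lam > tol]
```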


Journal ArticleDOI
TL;DR: This paper characterizes properties of the null space of the linear operator defining the constraint set that are necessary and sufficient for the heuristic to succeed, and obtains dimension-free bounds under which these null space properties hold almost surely as the matrix dimensions tend to infinity.
Abstract: Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in machine learning, control theory, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-hard, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic replaces the rank function with the nuclear norm—equal to the sum of the singular values—of the decision variable and has been shown to provide the optimal low rank solution in a variety of scenarios. In this paper, we assess the practical performance of this heuristic for finding the minimum rank matrix subject to linear equality constraints. We characterize properties of the null space of the linear operator defining the constraint set that are necessary and sufficient for the heuristic to succeed. We then analyze linear constraints sampled uniformly at random, and obtain dimension-free bounds under which our null space properties hold almost surely as the matrix dimensions tend to infinity. Finally, we provide empirical evidence that these probabilistic bounds provide accurate predictions of the heuristic’s performance in non-asymptotic scenarios.
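
The heuristic being analyzed is itself easy to state as a convex program. A cvxpy sketch with illustrative names, not the authors' code:

```python
import cvxpy as cp

def nuclear_norm_heuristic(A_list, b, shape):
    """Surrogate for min rank(X) s.t. <A_i, X> = b_i: minimize the nuclear
    norm (sum of singular values) subject to the same linear constraints."""
    X = cp.Variable(shape)
    cons = [cp.trace(A.T @ X) == bi for A, bi in zip(A_list, b)]
    cp.Problem(cp.Minimize(cp.normNuc(X)), cons).solve()
    return X.value
```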

Journal ArticleDOI
TL;DR: This paper discusses first-order methods suitable for solving primal-dual convex and nonsmooth minimization reformulations of the cone programming problem, and proposes a variant of Nesterov’s optimal method which outperformed the original method in the authors’ computational experiments.
Abstract: In this paper we consider the general cone programming problem, and propose primal-dual convex (smooth and/or nonsmooth) minimization reformulations for it. We then discuss first-order methods suitable for solving these reformulations, namely, Nesterov’s optimal method (Nesterov in Doklady AN SSSR 269:543–547, 1983; Math Program 103:127–152, 2005), Nesterov’s smooth approximation scheme (Nesterov in Math Program 103:127–152, 2005), and Nemirovski’s prox-method (Nemirovski in SIAM J Opt 15:229–251, 2005), and propose a variant of Nesterov’s optimal method which has outperformed the latter one in our computational experiments. We also derive iteration-complexity bounds for these first-order methods applied to the proposed primal-dual reformulations of the cone programming problem. The performance of these methods is then compared using a set of randomly generated linear programming and semidefinite programming instances. We also compare the approach based on the variant of Nesterov’s optimal method with the low-rank method proposed by Burer and Monteiro (Math Program Ser B 95:329–357, 2003; Math Program 103:427–444, 2005) for solving a set of randomly generated SDP instances.
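
For reference, the basic shape of Nesterov's optimal method for a smooth convex objective with L-Lipschitz gradient (a constant-step sketch; the paper applies such methods to primal-dual reformulations of cone programs and proposes a variant):

```python
import numpy as np

def nesterov_optimal(grad, L, x0, n_iters=500):
    """Accelerated gradient method with O(1/k^2) objective error."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iters):
        x_next = y - grad(y) / L                          # gradient step at extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x
```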

Journal ArticleDOI
TL;DR: In this paper, the authors study a projected multi-agent subgradient algorithm under state-dependent communication and show that the algorithm converges to the same optimal solution with probability one under different assumptions on the local constraint sets and the stepsize sequence.
Abstract: We study distributed algorithms for solving global optimization problems in which the objective function is the sum of local objective functions of agents and the constraint set is given by the intersection of local constraint sets of agents. We assume that each agent knows only his own local objective function and constraint set, and exchanges information with the other agents over a randomly varying network topology to update his information state. We assume a state-dependent communication model over this topology: communication is Markovian with respect to the states of the agents and the probability with which the links are available depends on the states of the agents. We study a projected multi-agent subgradient algorithm under state-dependent communication. The state-dependence of the communication introduces significant challenges and couples the study of information exchange with the analysis of subgradient steps and projection errors. We first show that the multi-agent subgradient algorithm when used with a constant stepsize may result in the agent estimates to diverge with probability one. Under some assumptions on the stepsize sequence, we provide convergence rate bounds on a “disagreement metric” between the agent estimates. Our bounds are time-nonhomogeneous in the sense that they depend on the initial starting time. Despite this, we show that agent estimates reach an almost sure consensus and converge to the same optimal solution of the global optimization problem with probability one under different assumptions on the local constraint sets and the stepsize sequence.
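
Stripped of the state-dependent randomness, one iteration of the projected multi-agent subgradient method mixes neighbor estimates and then takes a local step. A sketch with a fixed weight matrix W (the paper's communication weights vary randomly with the agents' states):

```python
import numpy as np

def multi_agent_subgradient(subgrads, projections, W, X0, alpha0=0.1, n_iters=1000):
    """Row i of X is agent i's estimate. Each iteration: average neighbors'
    estimates with weights W[i, j], step along the local subgradient, and
    project onto the local constraint set."""
    X = np.array(X0, dtype=float)
    for k in range(1, n_iters + 1):
        X = W @ X                                      # consensus / mixing step
        for i in range(X.shape[0]):
            step = (alpha0 / k) * subgrads[i](X[i])    # diminishing stepsize
            X[i] = projections[i](X[i] - step)
    return X
```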

Journal ArticleDOI
TL;DR: This paper deals with iterative gradient and subgradient methods with random feasibility steps for solving constrained convex minimization problems, where the constraint set is specified as the intersection of possibly infinitely many constraint sets.
Abstract: This paper deals with iterative gradient and subgradient methods with random feasibility steps for solving constrained convex minimization problems, where the constraint set is specified as the intersection of possibly infinitely many constraint sets. Each constraint set is assumed to be given as a level set of a convex but not necessarily differentiable function. The proposed algorithms are applicable to the situation where the whole constraint set of the problem is not known in advance, but it is rather learned in time through observations. Also, the algorithms are of interest for constrained optimization problems where the constraints are known but the number of constraints is either large or not finite. We analyze the proposed algorithm for the case when the objective function is differentiable with Lipschitz gradients and the case when the objective function is not necessarily differentiable. The behavior of the algorithm is investigated both for diminishing and non-diminishing stepsize values. The almost sure convergence to an optimal solution is established for diminishing stepsize. For non-diminishing stepsize, the error bounds are established for the expected distances of the weighted averages of the iterates from the constraint set, as well as for the expected sub-optimality of the function values along the weighted averages.
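
A sketch of the gradient method with random feasibility steps, for constraints given as level sets g_w(x) ≤ 0: after each objective step, one randomly drawn constraint is corrected by a Polyak-type subgradient projection step (illustrative names):

```python
import numpy as np

def gradient_random_feasibility(grad_f, gs, dgs, x0, alpha0=1.0, n_iters=5000, seed=0):
    """gs[w](x) is the w-th constraint value, dgs[w](x) a subgradient of it."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_iters + 1):
        x = x - (alpha0 / k) * grad_f(x)      # objective step, diminishing stepsize
        w = rng.integers(len(gs))             # observe one random constraint
        viol = max(gs[w](x), 0.0)
        if viol > 0.0:
            d = dgs[w](x)
            x = x - (viol / (d @ d)) * d      # feasibility step toward the level set
    return x
```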

Journal ArticleDOI
TL;DR: A basic framework for exploiting sparsity via positive semidefinite matrix completion is presented for an optimization problem with linear and nonlinear matrix inequalities and preliminary numerical results on the conversion methods indicate their potential for improving the efficiency of solving various problems.
Abstract: A basic framework for exploiting sparsity via positive semidefinite matrix completion is presented for an optimization problem with linear and nonlinear matrix inequalities. The sparsity, characterized with a chordal graph structure, can be detected in the variable matrix or in a linear or nonlinear matrix-inequality constraint of the problem. We classify the sparsity in two types, the domain-space sparsity (d-space sparsity) for the symmetric matrix variable in the objective and/or constraint functions of the problem, which is required to be positive semidefinite, and the range-space sparsity (r-space sparsity) for a linear or nonlinear matrix-inequality constraint of the problem. Four conversion methods are proposed in this framework: two for exploiting the d-space sparsity and the other two for exploiting the r-space sparsity. When applied to a polynomial semidefinite program (SDP), these conversion methods enhance the structured sparsity of the problem called the correlative sparsity. As a result, the resulting polynomial SDP can be solved more effectively by applying the sparse SDP relaxation. Preliminary numerical results on the conversion methods indicate their potential for improving the efficiency of solving various problems.

Journal ArticleDOI
TL;DR: Using theoretical analysis, it is shown that the recently developed doubly nonnegative relaxation is equivalent to the Shor relaxation, when the latter is enhanced with a partial first-order relaxation-linearization technique.
Abstract: At the intersection of nonlinear and combinatorial optimization, quadratic programming has attracted significant interest over the past several decades. A variety of relaxations for quadratically constrained quadratic programming (QCQP) can be formulated as semidefinite programs (SDPs). The primary purpose of this paper is to present a systematic comparison of SDP relaxations for QCQP. Using theoretical analysis, it is shown that the recently developed doubly nonnegative relaxation is equivalent to the Shor relaxation, when the latter is enhanced with a partial first-order relaxation-linearization technique. These two relaxations are shown to theoretically dominate six other SDP relaxations. A computational comparison reveals that the two dominant relaxations require three orders of magnitude more computational time than the weaker relaxations, while providing relaxation gaps averaging 3% as opposed to gaps of up to 19% for weaker relaxations, on 700 randomly generated problems with up to 60 variables. An SDP relaxation derived from Lagrangian relaxation, after the addition of redundant nonlinear constraints to the primal, achieves gaps averaging 13% in a few CPU seconds.
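
The Shor relaxation that anchors the comparison is short to write down: lift xxᵀ to a matrix variable X and couple the two through a PSD constraint. A cvxpy sketch for min xᵀQ₀x + c₀ᵀx s.t. xᵀQᵢx + cᵢᵀx ≤ bᵢ, without the strengthening RLT inequalities studied in the paper:

```python
import cvxpy as cp

def shor_relaxation(Q0, c0, Qs, cs, bs):
    """Shor SDP relaxation: replace xx' by X with [[1, x'], [x, X]] >> 0."""
    n = len(c0)
    M = cp.Variable((n + 1, n + 1), symmetric=True)
    x, X = M[0, 1:], M[1:, 1:]
    cons = [M >> 0, M[0, 0] == 1]
    cons += [cp.trace(Q @ X) + c @ x <= b for Q, c, b in zip(Qs, cs, bs)]
    prob = cp.Problem(cp.Minimize(cp.trace(Q0 @ X) + c0 @ x), cons)
    prob.solve()
    return prob.value
```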

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problems of finding a maximum clique in a graph and a maximum-edge biclique in a bipartite graph, both of which are NP-hard.
Abstract: We consider the problems of finding a maximum clique in a graph and finding a maximum-edge biclique in a bipartite graph. Both problems are NP-hard. We write both problems as matrix-rank minimization and then relax them using the nuclear norm. This technique, which may be regarded as a generalization of compressive sensing, has recently been shown to be an effective way to solve rank optimization problems. In the special case that the input graph has a planted clique or biclique (i.e., a single large clique or biclique plus diversionary edges), our algorithm successfully provides an exact solution to the original instance. For each problem, we provide two analyses of when our algorithm succeeds. In the first analysis, the diversionary edges are placed by an adversary. In the second, they are placed at random. In the case of random edges for the planted clique problem, we obtain the same bound as Alon, Krivelevich and Sudakov as well as Feige and Krauthgamer, but we use different techniques.

Journal ArticleDOI
TL;DR: This paper presents a generic branching scheme in which the pricing oracle of the root node remains of use after branching; the resulting method is the first branch-and-price algorithm capable of solving cutting stock and bin packing problems to integrality without modifying the subproblem or expanding its variable space.
Abstract: Developing a branching scheme that is compatible with the column generation procedure can be challenging. Application specific and generic schemes have been proposed in the literature, but they have their drawbacks. One generic scheme is to implement standard branching in the space of the compact formulation to which the Dantzig-Wolfe reformulation was applied. However, in the presence of multiple identical subsystems, the mapping to the original variable space typically induces symmetries. An alternative, in an application specific context, can be to expand the compact formulation to offer a wider choice of branching variables. Other existing generic schemes for use in branch-and-price imply modifications to the pricing problem. This is a concern because the pricing oracle on which the method relies might become obsolete beyond the root node. This paper presents a generic branching scheme in which the pricing oracle of the root node remains of use after branching (assuming that the pricing oracle can handle bounds on the subproblem variables). The scheme does not require the use of an extended formulation of the original problem. It proceeds by recursively partitioning the subproblem solution set. Branching constraints are enforced in the pricing problem instead of being dualized via Lagrangian relaxation, and the pricing problem is solved by a limited number of calls to the pricing oracle. This generic scheme builds on previously proposed approaches and unifies them. We illustrate its use on the cutting stock and bin packing problems. This is the first branch-and-price algorithm capable of solving such problems to integrality without modifying the subproblem or expanding its variable space.

Journal ArticleDOI
TL;DR: It is proved that the directed cut model for the STP defined in the layered graph dominates the best previously known models for the HMSTP, and it is shown that the Steiner directed cuts in the extended layered graph space can be viewed as a stronger version of some previously known HMSTP cuts in the original design space.
Abstract: The hop-constrained minimum spanning tree problem (HMSTP) is an NP-hard problem arising in the design of centralized telecommunication networks with quality of service constraints. We show that the HMSTP is equivalent to a Steiner tree problem (STP) in an appropriate layered graph. We prove that the directed cut model for the STP defined in the layered graph, dominates the best previously known models for the HMSTP. We also show that the Steiner directed cuts in the extended layered graph space can be viewed as being a stronger version of some previously known HMSTP cuts in the original design space. Moreover, we show that these strengthened cuts can be combined and projected into new families of cuts, including facet defining ones, in the original design space. We also adapt the proposed approach to the diameter-constrained minimum spanning tree problem (DMSTP). Computational results with a branch-and-cut algorithm show that the proposed method is significantly better than previously known methods on both problems.
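
The graph transformation at the heart of the paper is easy to sketch: copy (v, h) means v is reached in h hops, and each non-root node gets a single terminal copy reachable at zero cost. This helper is illustrative; the paper's cut-based formulation is then defined on this digraph:

```python
def hmstp_layered_graph(nodes, edges, cost, root, H):
    """Arcs of the layered digraph in which the HMSTP becomes a Steiner tree
    problem with terminals {(v, H) : v != root}. `cost[u, v]` is assumed
    defined for each undirected edge (u, v) in `edges`."""
    arcs = {}
    for u, v in edges:
        if root in (u, v):
            w = v if u == root else u
            arcs[(root, 0), (w, 1)] = cost[u, v]
        else:
            for h in range(1, H):
                arcs[(u, h), (v, h + 1)] = cost[u, v]
                arcs[(v, h), (u, h + 1)] = cost[u, v]
    for v in nodes:
        if v != root:
            for h in range(1, H):
                arcs[(v, h), (v, H)] = 0.0    # reach the terminal copy at no cost
    terminals = [(v, H) for v in nodes if v != root]
    return arcs, terminals
```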

Journal ArticleDOI
TL;DR: This paper performs a polyhedral analysis of a relevant mixed-integer set and exploits the structure of the utility function h to strengthen the standard submodular formulation significantly, and shows the effectiveness of the new formulation on expected utility maximization in capital budgeting.
Abstract: Given a finite ground set N and a value vector $${a \in \mathbb{R}^N}$$, we consider optimization problems involving maximization of a submodular set utility function of the form $${h(S)= f \left(\sum_{i \in S} a_i \right ), S \subseteq N}$$, where f is a strictly concave, increasing, differentiable function. This utility function appears frequently in combinatorial optimization problems when modeling risk aversion and decreasing marginal preferences, for instance, in risk-averse capital budgeting under uncertainty, competitive facility location, and combinatorial auctions. These problems can be formulated as linear mixed 0-1 programs. However, the standard formulation of these problems using submodular inequalities is ineffective for their solution, except for very small instances. In this paper, we perform a polyhedral analysis of a relevant mixed-integer set and, by exploiting the structure of the utility function h, strengthen the standard submodular formulation significantly. We show the lifting problem of the submodular inequalities to be a submodular maximization problem with a special structure solvable by a greedy algorithm, which leads to an easily-computable strengthening by subadditive lifting of the inequalities. Computational experiments on expected utility maximization in capital budgeting show the effectiveness of the new formulation.

Journal ArticleDOI
TL;DR: This paper shows that an SDP-based algorithm of Nemirovski, which is developed for solving a class of quadratic optimization problems with orthogonality constraints, has a logarithmic approximation guarantee, which improves upon the polynomial approximation guarantee established earlier by Nemirovski.
Abstract: In this paper, we consider various moment inequalities for sums of random matrices—which are well-studied in the functional analysis and probability theory literature—and demonstrate how they can be used to obtain the best known performance guarantees for several problems in optimization. First, we show that the validity of a recent conjecture of Nemirovski is actually a direct consequence of the so-called non-commutative Khintchine’s inequality in functional analysis. Using this result, we show that an SDP-based algorithm of Nemirovski, which is developed for solving a class of quadratic optimization problems with orthogonality constraints, has a logarithmic approximation guarantee. This improves upon the polynomial approximation guarantee established earlier by Nemirovski. Furthermore, we obtain improved safe tractable approximations of a certain class of chance constrained linear matrix inequalities. Secondly, we consider a recent result of Delage and Ye on the so-called data-driven distributionally robust stochastic programming problem. One of the assumptions in the Delage–Ye result is that the underlying probability distribution has bounded support. However, using a suitable moment inequality, we show that the result in fact holds for a much larger class of probability distributions. Given the close connection between the behavior of sums of random matrices and the theoretical properties of various optimization problems, we expect that the moment inequalities discussed in this paper will find further applications in optimization.

Journal ArticleDOI
TL;DR: A cutting plane based solution algorithm is described, and a computational study demonstrates its effectiveness and scale-up properties, as applied to the SSD model of Roman et al. (Math Program, Ser B 108:541–569, 2006).
Abstract: Second-order stochastic dominance (SSD) is widely recognised as an important decision criterion in portfolio selection. Unfortunately, stochastic dominance models are known to be very demanding from a computational point of view. In this paper we consider two classes of models which use SSD as a choice criterion. The first, proposed by Dentcheva and Ruszczynski (J Bank Finance 30:433–451, 2006), uses a SSD constraint, which can be expressed as integrated chance constraints (ICCs). The second, proposed by Roman et al. (Math Program, Ser B 108:541–569, 2006) uses SSD through a multi-objective formulation with CVaR objectives. Cutting plane representations and algorithms were proposed by Klein Haneveld and Van der Vlerk (Comput Manage Sci 3:245–269, 2006) for ICCs, and by Kunzi-Bay and Mayer (Comput Manage Sci 3:3–27, 2006) for CVaR minimization. These concepts are taken into consideration to propose representations and solution methods for the above class of SSD based models. We describe a cutting plane based solution algorithm and outline implementation details. A computational study is presented, which demonstrates the effectiveness and the scale-up properties of the solution algorithm, as applied to the SSD model of Roman et al. (Math Program, Ser B 108:541–569, 2006).
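
The CVaR building block referenced above has a standard scenario LP form (the Rockafellar-Uryasev formulation); the cutting planes discussed in the paper are generated against exactly this kind of scenario structure. A cvxpy sketch with illustrative names:

```python
import cvxpy as cp

def min_cvar(R, beta=0.95):
    """Scenario LP for min CVaR_beta of the portfolio loss -R @ w,
    where R is the S x n matrix of scenario returns."""
    S, n = R.shape
    w = cp.Variable(n)
    a = cp.Variable()                         # value-at-risk level variable
    u = cp.Variable(S, nonneg=True)           # scenario excess losses
    cons = [u >= -R @ w - a, cp.sum(w) == 1, w >= 0]
    cp.Problem(cp.Minimize(a + cp.sum(u) / ((1 - beta) * S)), cons).solve()
    return w.value
```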

Journal ArticleDOI
TL;DR: This paper presents the first polynomial-time approximation schemes for maximum-weight matching andmaximum-weight matroid intersection with one additional budget constraint, and exploits the adjacency relations on the solution polytope and, surprisingly, the solution to an old combinatorial puzzle.
Abstract: Many polynomial-time solvable combinatorial optimization problems become NP-hard if an additional complicating constraint is added to restrict the set of feasible solutions. In this paper, we consider two such problems, namely maximum-weight matching and maximum-weight matroid intersection with one additional budget constraint. We present the first polynomial-time approximation schemes for these problems. Similarly to other approaches for related problems, our schemes compute two solutions to the Lagrangian relaxation of the problem and patch them together to obtain a near-optimal solution. However, due to the richer combinatorial structure of the problems considered here, standard patching techniques do not apply. To circumvent this problem, we crucially exploit the adjacency relations on the solution polytope and, surprisingly, the solution to an old combinatorial puzzle.
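
The Lagrangian relaxation step is plain max-weight matching with penalized weights; here is a bisection sketch over the multiplier using networkx. The paper's actual contribution, patching the two Lagrangian solutions along the matching polytope, is not shown:

```python
import networkx as nx

def lagrangian_budgeted_matching(G, budget, lam_hi=100.0, n_bisect=50):
    """G has edge attributes 'weight' and 'cost'. Each evaluation solves an
    ordinary max-weight matching with adjusted weights w_e - lam * c_e."""
    def solve(lam):
        H = nx.Graph()
        for u, v, d in G.edges(data=True):
            H.add_edge(u, v, weight=d["weight"] - lam * d["cost"])
        M = nx.max_weight_matching(H)
        return M, sum(G[u][v]["cost"] for u, v in M)

    lam_lo = 0.0
    for _ in range(n_bisect):
        lam = 0.5 * (lam_lo + lam_hi)
        _, spent = solve(lam)
        if spent > budget:
            lam_lo = lam          # over budget: penalize cost more
        else:
            lam_hi = lam          # within budget: try a smaller penalty
    return solve(lam_hi)[0]
```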

Journal ArticleDOI
TL;DR: In this article, the authors used semidefinite relaxation techniques to test the nullspace property on a matrix A and showed on some numerical examples that these relaxation bounds can prove perfect recovery of sparse solutions with relatively high cardinality.
Abstract: Recent results in compressed sensing show that, under certain conditions, the sparsest solution to an underdetermined set of linear equations can be recovered by solving a linear program. These results either rely on computing sparse eigenvalues of the design matrix or on properties of its nullspace. So far, no tractable algorithm is known to test these conditions and most current results rely on asymptotic properties of random matrices. Given a matrix A, we use semidefinite relaxation techniques to test the nullspace property on A and show on some numerical examples that these relaxation bounds can prove perfect recovery of sparse solutions with relatively high cardinality.
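
The linear program whose success these nullspace conditions certify is the standard l1 recovery problem; for completeness, the usual split-variable form (illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def l1_recovery(A, b):
    """min ||x||_1 s.t. Ax = b, via the split x = u - v with u, v >= 0."""
    m, n = A.shape
    res = linprog(c=np.ones(2 * n),
                  A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]
```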

Journal ArticleDOI
TL;DR: In this paper, necessary and sufficient conditions for a sensing matrix to be "s-good" are discussed for exact $${\ell_1}$$-recovery of sparse signals with s nonzero entries when no measurement noise is present.
Abstract: We discuss necessary and sufficient conditions for a sensing matrix to be “s-good”—to allow for exact $${\ell_1}$$-recovery of sparse signals with s nonzero entries when no measurement noise is present. Then we express the error bounds for imperfect $${\ell_1}$$-recovery (nonzero measurement noise, nearly s-sparse signal, near-optimal solution of the optimization problem yielding the $${\ell_1}$$-recovery) in terms of the characteristics underlying these conditions. Further, we demonstrate (and this is the principal result of the paper) that these characteristics, although difficult to evaluate, lead to verifiable sufficient conditions for exact sparse $${\ell_1}$$-recovery and to efficiently computable upper bounds on those s for which a given sensing matrix is s-good. We also establish instructive links between our approach and the basic concepts of the Compressed Sensing theory, like Restricted Isometry or Restricted Eigenvalue properties.

Journal ArticleDOI
TL;DR: It is shown that a greedy augmentation procedure that employs only directions from certain Graver bases needs only polynomially many augmentation steps to solve the given problem.
Abstract: In this paper we consider the solution of certain convex integer minimization problems via greedy augmentation procedures. We show that a greedy augmentation procedure that employs only directions from certain Graver bases needs only polynomially many augmentation steps to solve the given problem. We extend these results to convex N-fold integer minimization problems and to convex 2-stage stochastic integer minimization problems. Finally, we present some applications of convex N-fold integer minimization problems for which our approach provides polynomial time solution algorithms.
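
A sketch of greedy augmentation under the setting above: moves along Graver basis directions keep the equality constraints invariant, so only box feasibility and improvement need checking. This is a toy step search, not the paper's polynomially bounded scheme:

```python
import numpy as np

def graver_greedy_augment(f, graver, x, lo, hi, max_steps=10000):
    """Minimize f over {x integer: Ax = b, lo <= x <= hi}, starting from a
    feasible x, by best improving steps alpha * g along Graver directions.
    Assumes `graver` contains both g and -g for each element."""
    x = np.array(x)
    for _ in range(max_steps):
        best, best_val = None, f(x)
        for g in graver:
            alpha = 1
            while True:
                cand = x + alpha * g
                if not (np.all(lo <= cand) and np.all(cand <= hi)):
                    break                     # left the box: stop this direction
                if f(cand) < best_val:
                    best, best_val = cand, f(cand)
                alpha += 1
        if best is None:
            return x                          # no augmenting step: Graver-optimal
        x = best
    return x
```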

Journal ArticleDOI
TL;DR: Closedness of the constraint set mapping with respect to perturbations of the underlying probability measure is derived and large-scale, block-structured, mixed-integer linear programming equivalents to the dominance constrained stochastic programs are identified.
Abstract: We introduce stochastic integer programs with second-order dominance constraints induced by mixed-integer linear recourse. Closedness of the constraint set mapping with respect to perturbations of the underlying probability measure is derived. For discrete probability measures, large-scale, block-structured, mixed-integer linear programming equivalents to the dominance constrained stochastic programs are identified. For these models, a decomposition algorithm is proposed and tested with instances from power optimization.

Journal ArticleDOI
TL;DR: It is shown how to derive conic valid inequalities for a conic integer program from conic inequalities valid for its lower-dimensional restrictions.
Abstract: Lifting is a procedure for deriving valid inequalities for mixed-integer sets from valid inequalities for suitable restrictions of those sets. Lifting has been shown to be very effective in developing strong valid inequalities for linear integer programming and it has been successfully used to solve such problems with branch-and-cut algorithms. Here we generalize the theory of lifting to conic integer programming, i.e., integer programs with conic constraints. We show how to derive conic valid inequalities for a conic integer program from conic inequalities valid for its lower-dimensional restrictions. In order to simplify the computations, we also discuss sequence-independent lifting for conic integer programs. When the cones are restricted to nonnegative orthants, conic lifting reduces to the lifting for linear integer programming as one may expect.

Journal ArticleDOI
TL;DR: This paper reduces three sphere constrained polynomial optimization problems to that of determining the $${L_2}$$-diameters of certain convex bodies, and shows that they can all be approximated to within a factor of $${\Omega((\log n/n)^{d/2-1})}$$ deterministically, which improves upon the currently best known approximation bound in the literature.
Abstract: Due to their fundamental nature and numerous applications, sphere constrained polynomial optimization problems have received a lot of attention lately. In this paper, we consider three such problems: (i) maximizing a homogeneous polynomial over the sphere; (ii) maximizing a multilinear form over a Cartesian product of spheres; and (iii) maximizing a multiquadratic form over a Cartesian product of spheres. Since these problems are generally intractable, our focus is on designing polynomial-time approximation algorithms for them. By reducing the above problems to that of determining the $${L_2}$$-diameters of certain convex bodies, we show that they can all be approximated to within a factor of $${\Omega((\log n/n)^{d/2-1})}$$ deterministically, where n is the number of variables and d is the degree of the polynomial. This improves upon the currently best known approximation bound of $${\Omega((1/n)^{d/2-1})}$$ in the literature. We believe that our approach will find further applications in the design of approximation algorithms for polynomial optimization problems with provable guarantees.

Journal ArticleDOI
TL;DR: This paper investigates the main ingredients of a disjunctive cut separation procedure, and analyzes their impact on the quality of the root-node bound for a set of instances taken from the MIPLIB library.
Abstract: Disjunctive cuts for Mixed-Integer Linear Programs (MIPs) were introduced by Egon Balas in the late 1970s and have been successfully exploited in practice since the late 1990s. In this paper we investigate the main ingredients of a disjunctive cut separation procedure, and analyze their impact on the quality of the root-node bound for a set of instances taken from the MIPLIB library. We compare alternative normalization conditions, and try to better understand their role. In particular, we point out that constraints that become redundant (because of the disjunction used) can produce over-weak cuts, and analyze this property with respect to the normalization used. Finally, we introduce a new normalization condition and analyze its theoretical properties and computational behavior. Along the way, we make use of a number of small numerical examples to illustrate some basic (and often misinterpreted) disjunctive programming features.
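
For reference, one common form of the separation problem discussed above: for $${P = \{x : Ax \ge b\}}$$ and the split disjunction $${\pi^T x \le \pi_0}$$ or $${\pi^T x \ge \pi_0 + 1}$$, a most violated cut $${\alpha^T x \ge \beta}$$ at the fractional point x* solves the cut generating LP

$$
\begin{aligned}
\min_{\alpha,\beta,u,v,u_0,v_0}\ & \alpha^T x^* - \beta \\
\text{s.t.}\ & \alpha = A^T u - u_0\,\pi, \qquad \beta \le b^T u - u_0\,\pi_0, \\
& \alpha = A^T v + v_0\,\pi, \qquad \beta \le b^T v + v_0\,(\pi_0 + 1), \\
& u, v \ge 0,\ u_0, v_0 \ge 0,
\end{aligned}
$$

together with a normalization condition, e.g. $${\mathbf{1}^T(u + v) + u_0 + v_0 = 1}$$; the choice of this normalization is precisely what the paper analyzes.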