Showing papers in "Mathematical Programming in 2012"

Sample size selection in optimization methods for machine learning

TL;DR: The accelerated stochastic approximation (AC-SA) algorithm based on Nesterov’s optimal method for smooth CP is introduced, and it is shown that the AC-SA algorithm can achieve the aforementioned lower bound on the rate of convergence for SCO.

Abstract: This paper considers an important class of convex programming (CP) problems, namely, the stochastic composite optimization (SCO), whose objective function is given by the summation of general nonsmooth and smooth stochastic components. Since SCO covers non-smooth, smooth and stochastic CP as certain special cases, a valid lower bound on the rate of convergence for solving these problems is known from the classic complexity theory of convex programming. Note however that the optimization algorithms that can achieve this lower bound had never been developed. In this paper, we show that the simple mirror-descent stochastic approximation method exhibits the best-known rate of convergence for solving these problems. Our major contribution is to introduce the accelerated stochastic approximation (AC-SA) algorithm based on Nesterov’s optimal method for smooth CP (Nesterov in Doklady AN SSSR 269:543–547, 1983; Nesterov in Math Program 103:127–152, 2005), and show that the AC-SA algorithm can achieve the aforementioned lower bound on the rate of convergence for SCO. To the best of our knowledge, it is also the first universally optimal algorithm in the literature for solving non-smooth, smooth and stochastic CP problems. We illustrate the significant advantages of the AC-SA algorithm over existing methods in the context of solving a special but broad class of stochastic programming problems.

531 citations

Journal Article•DOI•

[...]

Richard H. Byrd¹, Gillian M. Chin², Jorge Nocedal², Yuchen Wu³•Institutions (3)

University of Colorado Boulder¹, Northwestern University², Google³

Smoothing methods for nonsmooth, nonconvex minimization

TL;DR: A criterion for increasing the sample size, based on variance estimates obtained during the computation of a batch gradient, is proposed, and an $${O(1/\epsilon)}$$ complexity bound on the total cost of a gradient method is established.

Abstract: This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large-scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. We propose a criterion for increasing the sample size based on variance estimates obtained during the computation of a batch gradient. We establish an $${O(1/\epsilon)}$$ complexity bound on the total cost of a gradient method. The second part of the paper describes a practical Newton method that uses a smaller sample to compute Hessian-vector products than to evaluate the function and the gradient, and that also employs a dynamic sampling technique. The focus of the paper shifts in the third part to $$L_1$$-regularized problems designed to produce sparse solutions. We propose a Newton-like method that consists of two phases: a (minimalistic) gradient projection phase that identifies zero variables, and a subspace phase that applies a subsampled Hessian Newton iteration in the free variables. Numerical tests on speech recognition problems illustrate the performance of the algorithms.
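To make the idea of a variance-based sample size test concrete, the following is a minimal sketch, not the authors' implementation. It assumes a test of the form "componentwise sample variance of the per-example gradients, summed and divided by the sample size, is at most theta squared times the squared norm of the batch gradient". The function name `sample_is_large_enough` and the parameter `theta` are hypothetical names chosen for illustration.

```python
def sample_is_large_enough(per_example_grads, theta=0.5):
    """Return True if the batch gradient looks reliable for the current sample.

    per_example_grads: list of gradient vectors (lists of floats), one per
    example in the current sample. theta is an assumed tolerance in (0, 1).
    """
    n = len(per_example_grads)
    if n < 2:
        raise ValueError("need at least two examples to estimate a variance")
    dim = len(per_example_grads[0])

    # Batch gradient: the mean of the per-example gradients.
    mean = [sum(g[j] for g in per_example_grads) / n for j in range(dim)]

    # Unbiased componentwise sample variance of the per-example gradients.
    var = [
        sum((g[j] - mean[j]) ** 2 for g in per_example_grads) / (n - 1)
        for j in range(dim)
    ]

    # Estimated variance of the batch gradient vs. its squared norm.
    lhs = sum(var) / n
    rhs = theta ** 2 * sum(m * m for m in mean)
    return lhs <= rhs


if __name__ == "__main__":
    consistent = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    noisy = [[1.0, 0.0], [-1.0, 0.0]]
    print(sample_is_large_enough(consistent))  # gradients agree
    print(sample_is_large_enough(noisy))       # gradients cancel out
```

When the test fails, a dynamic sampling scheme would enlarge the sample before taking the next step; how much to enlarge it is a separate design choice that this sketch leaves open.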

380 citations

Journal Article•DOI•

[...]

Xiaojun Chen¹•Institutions (1)

Hong Kong Polytechnic University¹

Is bilevel programming a special case of a mathematical program with complementarity constraints

TL;DR: In this article, the authors consider a class of smoothing methods for minimization problems where the feasible set is convex but the objective function is not convex, not differentiable and perhaps not even locally Lipschitz at the solutions.

Abstract: We consider a class of smoothing methods for minimization problems where the feasible set is convex but the objective function is not convex, not differentiable and perhaps not even locally Lipschitz at the solutions. Such optimization problems arise from wide applications including image restoration, signal reconstruction, variable selection, optimal control, stochastic equilibrium and spherical approximations. In this paper, we focus on smoothing methods for solving such optimization problems, which use the structure of the minimization problems and composition of smoothing functions for the plus function $$(x)_+$$. Many existing optimization algorithms and codes can be used in the inner iteration of the smoothing methods. We present properties of the smoothing functions and the gradient consistency of subdifferential associated with a smoothing function. Moreover, we describe how to update the smoothing parameter in the outer iteration of the smoothing methods to guarantee convergence of the smoothing methods to a stationary point of the original minimization problem.
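As an illustration of what a smoothing function for the plus function looks like, here is a small sketch using one well-known choice, $$\phi(t,\mu) = \tfrac{1}{2}\left(t + \sqrt{t^2 + 4\mu^2}\right)$$. This choice is an assumption made for the example and is not necessarily the one emphasized in the paper. For this function the gap to $$(t)_+$$ lies between 0 and $$\mu$$, so it vanishes as the smoothing parameter is driven to zero.

```python
import math


def plus(t):
    """The nonsmooth plus function (t)_+ = max(t, 0)."""
    return max(t, 0.0)


def smoothed_plus(t, mu):
    """A smooth approximation of (t)_+ for mu > 0 (illustrative choice)."""
    return 0.5 * (t + math.sqrt(t * t + 4.0 * mu * mu))


if __name__ == "__main__":
    # The approximation error at the kink t = 0 equals mu and shrinks with it.
    for mu in (1.0, 0.1, 0.001):
        gap = smoothed_plus(0.0, mu) - plus(0.0)
        print(f"mu={mu}: gap at t=0 is {gap}")
```

In a smoothing method of the kind the abstract describes, an inner solver would minimize the smoothed objective for a fixed `mu`, and an outer loop would decrease `mu` according to some update rule.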

270 citations

Journal Article•DOI•

[...]

Stephan Dempe, Joydeep Dutta¹•Institutions (1)

Indian Institute of Technology Kanpur¹

On the Power and Limitations of Affine Policies in Two-Stage Adaptive Optimization

TL;DR: It is shown that global optimal solutions of the MPCC correspond to global optimal solutions of the bilevel problem provided the lower-level problem satisfies Slater’s constraint qualification, and that this correspondence can fail if Slater’s constraint qualification fails to hold at the lower level.

Abstract: Bilevel programming problems are often reformulated using the Karush–Kuhn–Tucker conditions for the lower level problem, resulting in a mathematical program with complementarity constraints (MPCC). Clearly, both problems are closely related. But the answer to the question posed is “No” even in the case when the lower level programming problem is a parametric convex optimization problem. This is not obvious and concerns local optimal solutions. We show that global optimal solutions of the MPCC correspond to global optimal solutions of the bilevel problem provided the lower-level problem satisfies Slater’s constraint qualification. We also show by examples that this correspondence can fail if Slater’s constraint qualification fails to hold at the lower level. When we consider the local solutions, the relationship between the bilevel problem and its corresponding MPCC is more complicated. We also demonstrate the issues relating to a local minimum through examples.

211 citations

Journal Article•DOI•

[...]

Dimitris Bertsimas¹, Vineet Goyal²•Institutions (2)

Massachusetts Institute of Technology¹, Columbia University²

01 Sep 2012-Mathematical Programming

TL;DR: It is shown that the worst-case cost of an optimal affine policy can be $${\Omega(m^{1/2-\delta})}$$ times the worst-case cost of an optimal fully-adaptable solution for any δ > 0, where m is the number of linear constraints.

Abstract: We consider a two-stage adaptive linear optimization problem under right hand side uncertainty with a min–max objective and give a sharp characterization of the power and limitations of affine policies (where the second stage solution is an affine function of the right hand side uncertainty). In particular, we show that the worst-case cost of an optimal affine policy can be $${\Omega(m^{1/2-\delta})}$$ times the worst-case cost of an optimal fully-adaptable solution for any δ > 0, where m is the number of linear constraints. We also show that the worst-case cost of the best affine policy is $${O(\sqrt m)}$$ times the optimal cost when the first-stage constraint matrix has non-negative coefficients. Moreover, if there are only k ≤ m uncertain parameters, we generalize the performance bound for affine policies to $${O(\sqrt k)}$$, which is particularly useful if only a few parameters are uncertain. We also provide an $${O(\sqrt k)}$$-approximation algorithm for the general case without any restriction on the constraint matrix, but the solution is not an affine function of the uncertain parameters. We also give a tight characterization of the conditions under which an affine policy is optimal for the above model. In particular, we show that if the uncertainty set $${{\mathcal U} \subseteq {\mathbb R}^m_+}$$ is a simplex, then an affine policy is optimal. However, an affine policy is suboptimal even if $${{\mathcal U}}$$ is a convex combination of only (m + 3) extreme points (only two more extreme points than a simplex), and the worst-case cost of an optimal affine policy can be a factor (2 − δ) worse than the worst-case cost of an optimal fully-adaptable solution for any δ > 0.

180 citations

Journal Article•DOI•

Tractable stochastic analysis in high dimensions via robust optimization

[...]

Chaithanya Bandi¹, Dimitris Bertsimas¹•Institutions (1)

Massachusetts Institute of Technology¹

Validation analysis of mirror descent stochastic approximation method

TL;DR: This work proposes a new approach to analyze stochastic systems based on robust optimization, which replaces the Kolmogorov axioms and the concept of random variables as primitives of probability theory with uncertainty sets that are derived from some of the asymptotic implications of probability theory like the central limit theorem.

Abstract: Modern probability theory, whose foundation is based on the axioms set forth by Kolmogorov, is currently the major tool for performance analysis in stochastic systems. While it offers insights in understanding such systems, probability theory, in contrast to optimization, has not been developed with computational tractability as an objective when the dimension increases. Correspondingly, some of its major areas of application remain unsolved when the underlying systems become multidimensional: Queueing networks, auction design in multi-item, multi-bidder auctions, network information theory, pricing multi-dimensional options, among others. We propose a new approach to analyze stochastic systems based on robust optimization. The key idea is to replace the Kolmogorov axioms and the concept of random variables as primitives of probability theory, with uncertainty sets that are derived from some of the asymptotic implications of probability theory like the central limit theorem. In addition, we observe that several desired system properties such as incentive compatibility and individual rationality in auction design are naturally expressed in the language of robust optimization. In this way, the performance analysis questions become highly structured optimization problems (linear, semidefinite, mixed integer) for which there exist efficient, practical algorithms that are capable of solving problems in high dimensions. We demonstrate that the proposed approach achieves computationally tractable methods for (a) analyzing queueing networks, (b) designing multi-item, multi-bidder auctions with budget constraints, and (c) pricing multi-dimensional options.

157 citations

Journal Article•DOI•

[...]

Guanghui Lan¹, Arkadi Nemirovski², Alexander Shapiro²•Institutions (2)

University of Florida¹, Georgia Institute of Technology²

01 Sep 2012-Mathematical Programming

TL;DR: It is demonstrated that for a certain class of convex stochastic programs these bounds are comparable in quality with similar bounds computed by the sample average approximation method, while their computational cost is considerably smaller.

Abstract: The main goal of this paper is to develop accuracy estimates for stochastic programming problems by employing stochastic approximation (SA) type algorithms. To this end we show that while running a Mirror Descent Stochastic Approximation procedure one can compute, with a small additional effort, lower and upper statistical bounds for the optimal objective value. We demonstrate that for a certain class of convex stochastic programs these bounds are comparable in quality with similar bounds computed by the sample average approximation method, while their computational cost is considerably smaller.

151 citations

Journal Article•DOI•

A relaxed constant positive linear dependence constraint qualification and applications

[...]

Roberto Andreani¹, Gabriel Haeser², María Laura Schuverdt³, Paulo J. S. Silva⁴•Institutions (4)

State University of Campinas¹, Federal University of São Paulo², National Scientific and Technical Research Council³, University of São Paulo⁴

An implementable proximal point algorithmic framework for nuclear norm minimization

TL;DR: This work introduces a relaxed version of the constant positive linear dependence constraint qualification (CPLD), shows that it is enough to ensure the convergence of an augmented Lagrangian algorithm, and shows that it asserts the validity of an error bound.

Abstract: In this work we introduce a relaxed version of the constant positive linear dependence constraint qualification (CPLD) that we call RCPLD. This development is inspired by a recent generalization of the constant rank constraint qualification by Minchenko and Stakhovski that was called RCRCQ. We show that RCPLD is enough to ensure the convergence of an augmented Lagrangian algorithm and that it asserts the validity of an error bound. We also provide proofs and counter-examples that show the relations of RCRCQ and RCPLD with other known constraint qualifications. In particular, RCPLD is strictly weaker than CPLD and RCRCQ, while still stronger than Abadie’s constraint qualification. We also verify that the second order necessary optimality condition holds under RCRCQ.

147 citations

Journal Article•DOI•

[...]

Yong-Jin Liu¹, Defeng Sun², Kim-Chuan Toh²•Institutions (2)

Shenyang Aerospace University¹, National University of Singapore²

On mixing sets arising in chance-constrained programming

TL;DR: This paper investigates the performance of the proposed algorithms in which the inner sub-problems are approximately solved by the gradient projection method or the accelerated proximal gradient method, and shows that these algorithms perform favorably in comparison to several recently proposed state-of-the-art algorithms.

Abstract: The nuclear norm minimization problem is to find a matrix with the minimum nuclear norm subject to linear and second order cone constraints. Such a problem often arises from the convex relaxation of a rank minimization problem with noisy data, and arises in many fields of engineering and science. In this paper, we study inexact proximal point algorithms in the primal, dual and primal-dual forms for solving the nuclear norm minimization with linear equality and second order cone constraints. We design efficient implementations of these algorithms and present comprehensive convergence results. In particular, we investigate the performance of our proposed algorithms in which the inner sub-problems are approximately solved by the gradient projection method or the accelerated proximal gradient method. Our numerical results for solving randomly generated matrix completion problems and real matrix completion problems show that our algorithms perform favorably in comparison to several recently proposed state-of-the-art algorithms. Interestingly, our proposed algorithms are connected with other algorithms that have been studied in the literature.

137 citations

Journal Article•DOI•

[...]

Simge Küçükyavuz¹•Institutions (1)

Ohio State University¹

01 Apr 2012-Mathematical Programming

TL;DR: A compact extended reformulation that characterizes a linear programming equivalent of a single chance constraint with equal scenario probabilities is proposed, and a compact extended linear program for the intersection of multiple mixing sets and a cardinality constraint is given for a special case.

Abstract: The mixing set with a knapsack constraint arises in the deterministic equivalent of chance-constrained programming problems with finite discrete distributions. We first consider the case that the chance-constrained program has equal probabilities for each scenario. We study the resulting mixing set with a cardinality constraint and propose facet-defining inequalities that subsume known explicit inequalities for this set. We extend these inequalities to obtain valid inequalities for the mixing set with a knapsack constraint. In addition, we propose a compact extended reformulation (with a polynomial number of variables and constraints) that characterizes a linear programming equivalent of a single chance constraint with equal scenario probabilities. We introduce a blending procedure to find valid inequalities for the intersection of multiple mixing sets. We propose a polynomial-size extended formulation for the intersection of multiple mixing sets with a knapsack constraint that is stronger than the original mixing formulation. We also give a compact extended linear program for the intersection of multiple mixing sets and a cardinality constraint for a special case. We illustrate the effectiveness of the proposed inequalities in our computational experiments with probabilistic lot-sizing problems.

128 citations

Journal Article•DOI•

The integer approximation error in mixed-integer optimal control

[...]

Sebastian Sager¹, Hans Georg Bock¹, Moritz Diehl²•Institutions (2)

Interdisciplinary Center for Scientific Computing¹, Katholieke Universiteit Leuven²

On convex relaxations for quadratically constrained quadratic programming

TL;DR: A constructive way to obtain an integer solution with a guaranteed bound on the performance loss in polynomial time is presented and it is proved that this bound depends linearly on the control discretization grid.

Abstract: We extend recent work on nonlinear optimal control problems with integer restrictions on some of the control functions (mixed-integer optimal control problems, MIOCP). We improve a theorem (Sager et al. in Math Program 118(1):109–149, 2009) that states that the solution of a relaxed and convexified problem can be approximated with arbitrary precision by a solution fulfilling the integer requirements. Unlike in previous publications, the new proof avoids the usage of the Krein-Milman theorem, which is undesirable as it only states the existence of a solution that may switch infinitely often. We present a constructive way to obtain an integer solution with a guaranteed bound on the performance loss in polynomial time. We prove that this bound depends linearly on the control discretization grid. A numerical benchmark example illustrates the procedure. As a byproduct, we obtain an estimate of the Hausdorff distance between reachable sets. We improve the approximation order to linear grid size h instead of the previously known result with order $${\sqrt{h}}$$ (Hackl in Reachable sets, control sets and their computation, Augsburger Mathematisch-Naturwissenschaftliche Schriften. Dr. Bernd Wisner, Augsburg, 1996). We are able to include a Special Ordered Set condition which will allow for a transfer of the results to a more general, multi-dimensional and nonlinear case compared to the theorems in Pietrus and Veliov (Syst Control Lett 58:395–399, 2009).

Journal Article•DOI•

[...]

Kurt M. Anstreicher¹•Institutions (1)

University of Iowa¹

01 Dec 2012-Mathematical Programming

TL;DR: This work considers convex relaxations for the problem of minimizing a (possibly nonconvex) quadratic objective subject to linear and (possibly nonconvex) quadratic constraints.

Abstract: We consider convex relaxations for the problem of minimizing a (possibly nonconvex) quadratic objective subject to linear and (possibly nonconvex) quadratic constraints. Let $$\mathcal{F }$$ denote the feasible region for the linear constraints. We first show that replacing the quadratic objective and constraint functions with their convex lower envelopes on $$\mathcal{F }$$ is dominated by an alternative methodology based on convexifying the range of the quadratic form $$\genfrac(){0.0pt}{}{1}{x}\genfrac(){0.0pt}{}{1}{x}^T$$ for $$x\in \mathcal{F }$$. We next show that the use of "$$\alpha $$BB" underestimators as computable estimates of convex lower envelopes is dominated by a relaxation of the convex hull of the quadratic form that imposes semidefiniteness and linear constraints on diagonal terms. Finally, we show that the use of a large class of D.C. ("difference of convex") underestimators is dominated by a relaxation that combines semidefiniteness with RLT constraints.

Journal Article•DOI•

Global optimization of mixed-integer quadratically-constrained quadratic programs (MIQCQP) through piecewise-linear and edge-concave relaxations

[...]

Ruth Misener¹, Christodoulos A. Floudas¹•Institutions (1)

Princeton University¹

24 May 2012-Mathematical Programming

TL;DR: A deterministic global optimization approach, whose novel contributions are rooted in the edge-concave and piecewise-linear underestimators, to address nonconvex mixed-integer quadratically-constrained quadratic programs (MIQCQP) to global optimality.

Abstract: We propose a deterministic global optimization approach, whose novel contributions are rooted in the edge-concave and piecewise-linear underestimators, to address nonconvex mixed-integer quadratically-constrained quadratic programs (MIQCQP) to $${\epsilon}$$-global optimality. The facets of low-dimensional (n ≤ 3) edge-concave aggregations dominating the termwise relaxation of MIQCQP are introduced at every node of a branch-and-bound tree. Concave multivariable terms and sparsely distributed bilinear terms that do not participate in connected edge-concave aggregations are addressed through piecewise-linear relaxations. Extensive computational studies are presented for point packing problems, standard and generalized pooling problems, and examples from GLOBALLib (Meeraus, Globallib. http://www.gamsworld.org/global/globallib.htm).

Journal Article•DOI•

An Augmented Lagrangian Approach for Sparse Principal Component Analysis

[...]

Zhaosong Lu¹, Yong Zhang¹•Institutions (1)

Simon Fraser University¹

Robust inversion, dimensionality reduction, and randomized sampling

TL;DR: In this paper, the authors proposed a new formulation for sparse PCA, aiming at finding sparse and nearly uncorrelated PCs with orthogonal loading vectors while explaining as much of the total variance as possible.

Abstract: Principal component analysis (PCA) is a widely used technique for data analysis and dimension reduction with numerous applications in science and engineering. However, the standard PCA suffers from the fact that the principal components (PCs) are usually linear combinations of all the original variables, and it is thus often difficult to interpret the PCs. To alleviate this drawback, various sparse PCA approaches were proposed in literature [15, 6, 17, 28, 8, 25, 18, 7, 16]. Despite success in achieving sparsity, some important properties enjoyed by the standard PCA are lost in these methods such as uncorrelation of PCs and orthogonality of loading vectors. Also, the total explained variance that they attempt to maximize can be too optimistic. In this paper we propose a new formulation for sparse PCA, aiming at finding sparse and nearly uncorrelated PCs with orthogonal loading vectors while explaining as much of the total variance as possible. We also develop a novel augmented Lagrangian method for solving a class of nonsmooth constrained optimization problems, which is well suited for our formulation of sparse PCA. We show that it converges to a feasible point, and moreover under some regularity assumptions, it converges to a stationary point. Additionally, we propose two nonmonotone gradient methods for solving the augmented Lagrangian subproblems, and establish their global and local convergence. Finally, we compare our sparse PCA approach with several existing methods on synthetic, random, and real data, respectively. The computational results demonstrate that the sparse PCs produced by our approach substantially outperform those by other methods in terms of total explained variance, correlation of PCs, and orthogonality of loading vectors.

Journal Article•DOI•

[...]

Aleksandr Y. Aravkin¹, Michael P. Friedlander¹, Felix J. Herrmann¹, Tristan van Leeuwen¹•Institutions (1)

University of British Columbia¹

Extending the QCR method to general mixed-integer programs

TL;DR: A class of inverse problems in which the forward model is the solution operator to linear ODEs or PDEs is considered, which admits several dimensionality-reduction techniques based on data averaging or sampling, which are especially useful for large-scale problems.

Abstract: We consider a class of inverse problems in which the forward model is the solution operator to linear ODEs or PDEs. This class admits several dimensionality-reduction techniques based on data averaging or sampling, which are especially useful for large-scale problems. We survey these approaches and their connection to stochastic optimization. The data-averaging approach is only viable, however, for a least-squares misfit, which is sensitive to outliers in the data and artifacts unexplained by the forward model. This motivates us to propose a robust formulation based on the Student’s t-distribution of the error. We demonstrate how the corresponding penalty function, together with the sampling approach, can obtain good results for a large-scale seismic inverse problem with 50% corrupted data.

Journal Article•DOI•

[...]

Alain Billionnet, Sourour Elloumi¹, Amélie Lambert¹•Institutions (1)

Conservatoire national des arts et métiers¹

Divide to conquer: decomposition methods for energy optimization

TL;DR: It is proved that the reformulation of (MQP) is the best one within a convex reformulation scheme, from the continuous relaxation point of view, and can be solved by a standard solver that uses a branch and bound algorithm.

Abstract: Let (MQP) be a general mixed integer quadratic program that consists of minimizing a quadratic function subject to linear constraints. In this paper, we present a convex reformulation of (MQP), i.e. we reformulate (MQP) into an equivalent program with a convex objective function. Such a reformulation can be solved by a standard solver that uses a branch and bound algorithm. We prove that our reformulation is the best one within a convex reformulation scheme, from the continuous relaxation point of view. This reformulation, which we call MIQCR (Mixed Integer Quadratic Convex Reformulation), is based on the solution of an SDP relaxation of (MQP). Computational experiments are carried out with instances of (MQP) including one equality constraint or one inequality constraint. The results show that most of the considered instances with up to 40 variables can be solved in 1 h of CPU time by a standard solver.

Journal Article•DOI•

[...]

Claudia Sagastizábal¹•Institutions (1)

Instituto Nacional de Matemática Pura e Aplicada¹

Analysis of direct searches for discontinuous functions

TL;DR: For structured optimization and generalized equilibrium problems, this work explores some variants of solution methods based on Lagrangian relaxation and on Benders decomposition and keeps as a leading thread the actual practical value of such techniques in terms of their efficiency to solve energy related problems.

Abstract: Modern electricity systems provide a plethora of challenging issues in optimization. The increasing penetration of low carbon renewable sources of energy introduces uncertainty in problems traditionally modeled in a deterministic setting. The liberalization of the electricity sector brought the need of designing sound markets, ensuring capacity investments while properly reflecting strategic interactions. In all these problems, hedging risk, possibly in a dynamic manner, is also a concern. The fact of representing uncertainty and/or competition of different companies in a multi-settlement power market considerably increases the number of variables and constraints. For this reason, usually a trade-off needs to be found between modeling and numerical tractability: the more details are brought into the model, the harder becomes the optimization problem. For structured optimization and generalized equilibrium problems, we explore some variants of solution methods based on Lagrangian relaxation and on Benders decomposition. Throughout we keep as a leading thread the actual practical value of such techniques in terms of their efficiency to solve energy related problems.

Journal Article•DOI•

[...]

Luís Nunes Vicente¹, A. L. Custódio•Institutions (1)

University of Coimbra¹

Some results on the strength of relaxations of multilinear functions

TL;DR: It is shown that Rockafellar derivatives are also nonnegative along the limit directions of those subsequences of unsuccessful iterates when the function values converge to the function value at the limit point.

Abstract: It is known that the Clarke generalized directional derivative is nonnegative along the limit directions generated by directional direct-search methods at a limit point of certain subsequences of unsuccessful iterates, if the function being minimized is Lipschitz continuous near the limit point. In this paper we generalize this result for discontinuous functions using Rockafellar generalized directional derivatives (upper subderivatives). We show that Rockafellar derivatives are also nonnegative along the limit directions of those subsequences of unsuccessful iterates when the function values converge to the function value at the limit point. This result is obtained assuming that the function is directionally Lipschitz with respect to the limit direction. It is also possible under appropriate conditions to establish more insightful results by showing that the sequence of points generated by these methods eventually approaches the limit point along the locally best branch or step function (when the number of steps is equal to two). The results of this paper are presented for constrained optimization and illustrated numerically.

Journal Article•DOI•

[...]

James Luedtke¹, Mahdi Namazifar¹, Jeff Linderoth¹•Institutions (1)

University of Wisconsin-Madison¹

21 Oct 2012-Mathematical Programming

TL;DR: It is shown that for a multilinear function having a single product term, this approach yields the convex and concave envelopes if the bounds on all variables are symmetric around zero, and for bilinear functions it is proved that the difference between the concave upper bounding and convex lower bounding functions obtained from the McCormick relaxation approach is always within a constant of the difference between the concave and convex envelopes.

Abstract: We study approaches for obtaining convex relaxations of global optimization problems containing multilinear functions. Specifically, we compare the concave and convex envelopes of these functions with the relaxations that are obtained with a standard relaxation approach, due to McCormick. The standard approach reformulates the problem to contain only bilinear terms and then relaxes each term independently. We show that for a multilinear function having a single product term, this approach yields the convex and concave envelopes if the bounds on all variables are symmetric around zero. We then review and extend some results on conditions when the concave envelope of a multilinear function can be written as a sum of concave envelopes of its individual terms. Finally, for bilinear functions we prove that the difference between the concave upper bounding and convex lower bounding functions obtained from the McCormick relaxation approach is always within a constant of the difference between the concave and convex envelopes. These results, along with numerical examples we provide, give insight into how to construct strong relaxations of multilinear functions.
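For readers unfamiliar with the McCormick relaxation of a single bilinear term, the sketch below evaluates the standard under- and overestimators of x·y on a box. The function name `mccormick_bounds` and its argument names are hypothetical. The example only illustrates the relaxation of one term and does not reproduce any result from the paper.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Return (lower, upper) McCormick bounds on x*y for xl<=x<=xu, yl<=y<=yu.

    Each inequality follows from multiplying two nonnegative factors,
    for example (x - xl) * (y - yl) >= 0.
    """
    lower = max(
        xl * y + x * yl - xl * yl,  # from (x - xl)(y - yl) >= 0
        xu * y + x * yu - xu * yu,  # from (xu - x)(yu - y) >= 0
    )
    upper = min(
        xu * y + x * yl - xu * yl,  # from (xu - x)(y - yl) >= 0
        xl * y + x * yu - xl * yu,  # from (x - xl)(yu - y) >= 0
    )
    return lower, upper


if __name__ == "__main__":
    lo, hi = mccormick_bounds(0.5, -0.25, -1.0, 1.0, -1.0, 1.0)
    print(lo, 0.5 * -0.25, hi)  # the true product lies between the bounds
```

The bounds are tight at the corners of the box and loosest in its interior, which is why the strength of such termwise relaxations is worth analyzing.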

Journal Article•DOI•

An improved column generation algorithm for minimum sum-of-squares clustering

[...]

Daniel Aloise¹, Pierre Hansen², Leo Liberti²•Institutions (2)

Federal University of Rio Grande do Norte¹, École Polytechnique²

TL;DR: This work proposes a new way to solve the auxiliary problem of finding a column with negative reduced cost based on geometric arguments that greatly improves the efficiency of the whole algorithm and leads to exact solution of instances with over 2,300 entities.

Abstract: Given a set of entities associated with points in Euclidean space, minimum sum-of-squares clustering (MSSC) consists in partitioning this set into clusters such that the sum of squared distances from each point to the centroid of its cluster is minimized. A column generation algorithm for MSSC was given by du Merle et al. in SIAM Journal Scientific Computing 21:1485–1505. The bottleneck of that algorithm is the resolution of the auxiliary problem of finding a column with negative reduced cost. We propose a new way to solve this auxiliary problem based on geometric arguments. This greatly improves the efficiency of the whole algorithm and leads to exact solution of instances with over 2,300 entities, i.e., more than 10 times as much as previously done.

Journal Article•DOI•

Fixed point optimization algorithm and its application to power control in CDMA data networks

[...]

Hideaki Iiduka¹•Institutions (1)

Kyushu Institute of Technology¹

TL;DR: A fixed point optimization algorithm for the variational inequality problem for a continuous operator over the fixed point set of a nonexpansive mapping is devised and a convergence analysis is performed on it.

Abstract: We discuss the variational inequality problem for a continuous operator over the fixed point set of a nonexpansive mapping. One application of this problem is a power control for a direct-sequence code-division multiple-access data network. For such a power control, each user terminal has to be able to quickly transmit at an ideal power level such that it can get a sufficient signal-to-interference-plus-noise ratio and achieve the required quality of service. Iterative algorithms to solve this problem should not involve auxiliary optimization problems and complicated computations. To ensure this, we devise a fixed point optimization algorithm for the variational inequality problem and perform a convergence analysis on it. We give numerical examples of the algorithm as a power control.

Journal Article•DOI•

Sample average approximation of stochastic dominance constrained programs

[...]

Jian Hu¹, Tito Homem-de-Mello², Sanjay Mehrotra¹•Institutions (2)

Northwestern University¹, University of Illinois at Chicago²

TL;DR: The concept of $${\mathcal{C}}$$-dominance is introduced, which generalizes some notions of multi-variate dominance found in the literature, and a finitely convergent method is developed to find an $${\epsilon}$$-optimal solution of the SAA problem.

Abstract: In this paper we study optimization problems with second-order stochastic dominance constraints. This class of problems allows for the modeling of optimization problems where a risk-averse decision maker wants to ensure that the solution produced by the model dominates certain benchmarks. Here we deal with the case of multi-variate stochastic dominance under general distributions and nonlinear functions. We introduce the concept of $${\mathcal{C}}$$-dominance, which generalizes some notions of multi-variate dominance found in the literature. We apply the Sample Average Approximation (SAA) method to this problem, which results in a semi-infinite program, and study asymptotic convergence of optimal values and optimal solutions, as well as the rate of convergence of the feasibility set of the resulting semi-infinite program as the sample size goes to infinity. We develop a finitely convergent method to find an $${\epsilon}$$-optimal solution of the SAA problem. An important aspect of our contribution is the construction of practical statistical lower and upper bounds for the true optimal objective value. We also show that the bounds are asymptotically tight as the sample size goes to infinity.

Journal Article•DOI•

Reformulations in mathematical programming: automatic symmetry detection and exploitation

[...]

Leo Liberti¹•Institutions (1)

École Polytechnique¹

TL;DR: This work proposes a method for automatically finding the formulation group of any given Mixed-Integer Nonlinear Program, and for reformulating the problem by means of static symmetry breaking constraints.

Abstract: If a mathematical program has many symmetric optima, solving it via Branch-and-Bound techniques often yields search trees of disproportionate sizes; thus, finding and exploiting symmetries is an important task. We propose a method for automatically finding the formulation group of any given Mixed-Integer Nonlinear Program, and for reformulating the problem by means of static symmetry breaking constraints. The reformulated problem, which is likely to have fewer symmetric optima, can then be solved via standard Branch-and-Bound codes such as CPLEX (for linear programs) and Couenne (for nonlinear programs). Our computational results include formulation group tables for the MIPLib3, MIPLib2003, GlobalLib and MINLPLib instance libraries and solution tables for some instances in the aforementioned libraries.

Journal Article•DOI•

LP-based approximation algorithms for capacitated facility location

[...]

Retsef Levi¹, David B. Shmoys², Chaitanya Swamy³•Institutions (3)

Massachusetts Institute of Technology¹, Cornell University², University of Waterloo³

TL;DR: This work gives a 5-approximation algorithm for the special case in which all of the facility costs are equal, by rounding the optimal solution to the standard LP relaxation.

Abstract: In the capacitated facility location problem with hard capacities, we are given a set of facilities, $${\mathcal{F}}$$, and a set of clients $${\mathcal{D}}$$ in a common metric space. Each facility $$i$$ has a facility opening cost $$f_i$$ and capacity $$u_i$$ that specifies the maximum number of clients that may be assigned to this facility. We want to open some facilities from the set $${\mathcal{F}}$$ and assign each client to an open facility so that at most $$u_i$$ clients are assigned to any open facility $$i$$. The cost of assigning client $$j$$ to facility $$i$$ is given by the distance $$c_{ij}$$, and our goal is to minimize the sum of the facility opening costs and the client assignment costs. The only known approximation algorithms that deliver solutions within a constant factor of optimal for this NP-hard problem are based on local search techniques. It is an open problem to devise an approximation algorithm for this problem based on a linear programming lower bound (or indeed, to prove a constant integrality gap for any LP relaxation). We make progress on this question by giving a 5-approximation algorithm for the special case in which all of the facility costs are equal, by rounding the optimal solution to the standard LP relaxation. One notable aspect of our algorithm is that it relies on partitioning the input into a collection of single-demand capacitated facility location problems, approximately solving them, and then combining these solutions in a natural way.

Journal Article•DOI•

A convex polynomial that is not sos-convex

[...]

Amir Ali Ahmadi¹, Pablo A. Parrilo¹•Institutions (1)

Massachusetts Institute of Technology¹

TL;DR: A negative answer to the question of whether sos-convexity is also a necessary condition for convexity of polynomials is given by presenting an explicit example of a trivariate homogeneous polynomial of degree eight that is convex but not sos-convex.

Abstract: A multivariate polynomial $$p(x) = p(x_1, \ldots, x_n)$$ is sos-convex if its Hessian $$H(x)$$ can be factored as $$H(x) = M^T(x) M(x)$$ with a possibly nonsquare polynomial matrix $$M(x)$$. It is easy to see that sos-convexity is a sufficient condition for convexity of $$p(x)$$. Moreover, the problem of deciding sos-convexity of a polynomial can be cast as the feasibility of a semidefinite program, which can be solved efficiently. Motivated by this computational tractability, it is natural to study whether sos-convexity is also a necessary condition for convexity of polynomials. In this paper, we give a negative answer to this question by presenting an explicit example of a trivariate homogeneous polynomial of degree eight that is convex but not sos-convex.

Journal Article•DOI•

A line search exact penalty method using steering rules

[...]

Richard H. Byrd¹, Gabriel Lopez-Calva², Jorge Nocedal²•Institutions (2)

University of Colorado Boulder¹, Northwestern University²

TL;DR: An exact penalization approach is described that extends the class of problems that can be solved with line search sequential quadratic programming methods and it is shown that the algorithm enjoys favorable global convergence properties.

Abstract: Line search algorithms for nonlinear programming must include safeguards to enjoy global convergence properties. This paper describes an exact penalization approach that extends the class of problems that can be solved with line search sequential quadratic programming methods. In the new algorithm, the penalty parameter is adjusted at every iteration to ensure sufficient progress in linear feasibility and to promote acceptance of the step. A trust region is used to assist in the determination of the penalty parameter, but not in the step computation. It is shown that the algorithm enjoys favorable global convergence properties. Numerical experiments illustrate the behavior of the algorithm on various difficult situations.

Journal Article•DOI•

Stabilized SQP revisited

[...]

Alexey F. Izmailov¹, Mikhail V. Solodov²•Institutions (2)

Moscow State University¹, Instituto Nacional de Matemática Pura e Aplicada²

TL;DR: It is shown that sSQP, the stabilized version of the sequential quadratic programming algorithm, is locally superlinearly convergent for equality-constrained problems under the noncritical multiplier assumption, which is weaker than the SOSC employed originally.

Abstract: The stabilized version of the sequential quadratic programming algorithm (sSQP) was developed in order to achieve superlinear convergence in situations when the Lagrange multipliers associated with a solution are not unique. Within the framework of Fischer (Math Program 94:91–124, 2002), the keys to local superlinear convergence of sSQP are the following two properties: upper Lipschitzian behavior of solutions of the Karush-Kuhn-Tucker (KKT) system under canonical perturbations and local solvability of sSQP subproblems with the associated primal-dual step being of the order of the distance from the current iterate to the solution set of the unperturbed KKT system. According to Fernandez and Solodov (Math Program 125:47–73, 2010), both of these properties are ensured by the second-order sufficient optimality condition (SOSC) without any constraint qualification assumptions. In this paper, we state precise relationships between the upper Lipschitzian property of solutions of KKT systems, error bounds for KKT systems, the notion of critical Lagrange multipliers (a subclass of multipliers that violate SOSC in a very special way), the second-order necessary condition for optimality, and solvability of sSQP subproblems. Moreover, for the problem with equality constraints only, we prove superlinear convergence of sSQP under the assumption that the dual starting point is close to a noncritical multiplier. Since noncritical multipliers include all those satisfying SOSC but are not limited to them, we believe this gives the first superlinear convergence result for any Newtonian method for constrained optimization under assumptions that do not include any constraint qualifications and are weaker than SOSC. In the general case when inequality constraints are present, we show that such a relaxation of assumptions is not possible. We also consider applying sSQP to the problem where inequality constraints are reformulated into equalities using slack variables, and discuss the assumptions needed for convergence in this approach. We conclude with consequences for local regularization methods proposed in (Izmailov and Solodov SIAM J Optim 16:210–228, 2004; Wright SIAM J. Optim. 15:673–676, 2005). In particular, we show that these methods are still locally superlinearly convergent under the noncritical multiplier assumption, weaker than SOSC employed originally.

Journal Article•DOI•

A primal–dual interior point method for nonlinear semidefinite programming

[...]

Hiroshi Yamashita, Hiroshi Yabe¹, Kouhei Harada•Institutions (1)

Tokyo University of Science¹

TL;DR: By combining the primal barrier penalty function and the primal–dual barrier function, a new primal–dual merit function is proposed, and the global convergence of the method is proved.

Abstract: This paper is concerned with a primal–dual interior point method for solving nonlinear semidefinite programming problems. The method consists of the outer iteration (SDPIP) that finds a KKT point and the inner iteration (SDPLS) that calculates an approximate barrier KKT point. Algorithm SDPLS uses a commutative class of Newton-like directions for the generation of line search directions. By combining the primal barrier penalty function and the primal–dual barrier function, a new primal–dual merit function is proposed. We prove the global convergence property of our method. Finally some numerical experiments are given.

Journal Article•DOI•

A limited memory steepest descent method

[...]

Roger Fletcher¹•Institutions (1)

University of Dundee¹