Showing papers in "SIAM Journal on Optimization in 2013"


Journal ArticleDOI
TL;DR: This paper studies an alternative inexact BCD approach which updates the variable blocks by successively minimizing a sequence of approximations of $f$ which are either locally tight upper bounds of $f$ or strictly convex local approximations of $f$.
Abstract: The block coordinate descent (BCD) method is widely used for minimizing a continuous function $f$ of several block variables. At each iteration of this method, a single block of variables is optimized, while the remaining variables are held fixed. To ensure the convergence of the BCD method, the subproblem of each block variable needs to be solved to its unique global optimal. Unfortunately, this requirement is often too restrictive for many practical scenarios. In this paper, we study an alternative inexact BCD approach which updates the variable blocks by successively minimizing a sequence of approximations of $f$ which are either locally tight upper bounds of $f$ or strictly convex local approximations of $f$. The main contributions of this work include the characterizations of the convergence conditions for a fairly wide class of such methods, especially for the cases where the objective functions are either nondifferentiable or nonconvex. Our results unify and extend the existing convergence results ...
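A minimal sketch of the kind of inexact block update described above, assuming a user-supplied gradient oracle grad_f(blocks, i) for block $i$ and using a simple proximal quadratic upper bound as the surrogate (the paper's admissible approximation classes are considerably more general):

```python
# Sketch of inexact block coordinate descent via surrogate minimization.
# The surrogate here is the quadratic upper bound f(x) + (L/2)||x_i - y_i||^2,
# whose exact minimizer over block i is a single gradient step of size 1/L.
import numpy as np

def inexact_bcd(grad_f, x_blocks, L=1.0, n_iters=100):
    """grad_f(blocks, i) -> gradient of f w.r.t. block i (hypothetical interface)."""
    blocks = [np.array(b, dtype=float) for b in x_blocks]
    for _ in range(n_iters):
        for i in range(len(blocks)):      # cyclic sweep over the blocks
            g = grad_f(blocks, i)
            blocks[i] = blocks[i] - g / L  # minimize the quadratic surrogate on block i
    return blocks
```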

1,032 citations


Journal ArticleDOI
TL;DR: The randomized stochastic gradient (RSG) method introduced in this paper is a stochastic approximation type algorithm for nonlinear (possibly nonconvex) stochastic programming problems, and it possesses a nearly optimal rate of convergence if the problem is convex.
Abstract: In this paper, we introduce a new stochastic approximation type algorithm, namely, the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a postoptimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and we show that such modification allows us to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.
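The distinguishing device of such randomized methods is that the output is an iterate chosen at random from the trajectory, which is what enables complexity bounds for approximate stationary points of nonconvex problems. A minimal sketch under that reading, with stochastic_grad a hypothetical unbiased gradient oracle and a uniform selection distribution chosen purely for illustration:

```python
# Sketch of a randomized stochastic gradient (RSG)-style method: run a stochastic
# gradient loop, then return an iterate selected at random from the trajectory.
import numpy as np

def rsg(stochastic_grad, x0, step=1e-2, n_iters=1000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    iterates = []
    for _ in range(n_iters):
        x = x - step * stochastic_grad(x)   # noisy gradient step
        iterates.append(x.copy())
    # Uniform random selection; the paper analyzes more general distributions.
    return iterates[rng.integers(n_iters)]
```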

599 citations


Journal ArticleDOI
TL;DR: This paper analyzes the block coordinate gradient projection method in which each iteration consists of performing a gradient projection step with respect to a certain block taken in a cyclic order and establishes global sublinear rate of convergence.
Abstract: In this paper we study smooth convex programming problems where the decision variables vector is split into several blocks of variables. We analyze the block coordinate gradient projection method in which each iteration consists of performing a gradient projection step with respect to a certain block taken in a cyclic order. Global sublinear rate of convergence of this method is established and it is shown that it can be accelerated when the problem is unconstrained. In the unconstrained setting we also prove a sublinear rate of convergence result for the so-called alternating minimization method when the number of blocks is two. When the objective function is also assumed to be strongly convex, linear rate of convergence is established.

576 citations


Journal ArticleDOI
TL;DR: This work proposes a new algorithm for matrix completion that minimizes the least-square distance on the sampling set over the Riemannian manifold of fixed-rank matrices and proves convergence of a regularized version of the algorithm under the assumption that the restricted isometry property holds for incoherent matrices throughout the iterations.
Abstract: The matrix completion problem consists of finding or approximating a low-rank matrix based on a few samples of this matrix. We propose a new algorithm for matrix completion that minimizes the least-square distance on the sampling set over the Riemannian manifold of fixed-rank matrices. The algorithm is an adaptation of classical nonlinear conjugate gradients, developed within the framework of retraction-based optimization on manifolds. We describe all the objects from differential geometry necessary to perform optimization over this low-rank matrix manifold, seen as a submanifold embedded in the space of matrices. In particular, we describe how metric projection can be used as retraction and how vector transport lets us obtain the conjugate search directions. Finally, we prove convergence of a regularized version of our algorithm under the assumption that the restricted isometry property holds for incoherent matrices throughout the iterations. The numerical experiments indicate that our approach...

512 citations


Journal ArticleDOI
TL;DR: Numerical experiments confirm that, combined with an effective preconditioner, these methods are among the most efficient and robust currently available for computing the CPD, rank-$(L_r,L_r,1)$ BTD, and their generalized decomposition.
Abstract: The canonical polyadic and rank-$(L_r,L_r,1)$ block term decomposition (CPD and BTD, respectively) are two closely related tensor decompositions. The CPD and, recently, BTD are important tools in psychometrics, chemometrics, neuroscience, and signal processing. We present a decomposition that generalizes these two and develop algorithms for its computation. Among these algorithms are alternating least squares schemes, several general unconstrained optimization techniques, and matrix-free nonlinear least squares methods. In the latter we exploit the structure of the Jacobian's Gramian to reduce computational and memory cost. Combined with an effective preconditioner, numerical experiments confirm that these methods are among the most efficient and robust currently available for computing the CPD, rank-$(L_r,L_r,1)$ BTD, and their generalized decomposition.

282 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of minimizing a general continuously differentiable function subject to sparsity constraints is treated and several different optimality criteria which are based on the notions of stationarity and coordinatewise optimality are derived.
Abstract: This paper treats the problem of minimizing a general continuously differentiable function subject to sparsity constraints. We present and analyze several different optimality criteria which are based on the notions of stationarity and coordinatewise optimality. These conditions are then used to derive three numerical algorithms aimed at finding points satisfying the resulting optimality criteria: the iterative hard thresholding method and the greedy and partial sparse-simplex methods. The first algorithm is essentially a gradient projection method, while the remaining two algorithms are of a coordinate descent type. The theoretical convergence of these techniques and their relations to the derived optimality conditions are studied. The algorithms and results are illustrated by several numerical examples.
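Of the three algorithms, the iterative hard thresholding method is the easiest to make concrete: a gradient step followed by projection onto the set of $s$-sparse vectors. A minimal sketch, with grad_f and the step size supplied by the user:

```python
# Sketch of iterative hard thresholding for min f(x) subject to ||x||_0 <= s.
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    z = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    z[idx] = x[idx]
    return z

def iht(grad_f, x0, s, step=1e-2, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        # Gradient step, then projection onto the sparsity constraint set.
        x = hard_threshold(x - step * grad_f(x), s)
    return x
```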

272 citations


Journal ArticleDOI
TL;DR: A multistage AC-SA algorithm is introduced, which possesses an optimal rate of convergence for solving strongly convex SCO problems in terms of the dependence on not only the target accuracy, but also a number of problem parameters and the selection of initial points.
Abstract: In this paper we study new stochastic approximation (SA) type algorithms, namely, the accelerated SA (AC-SA), for solving strongly convex stochastic composite optimization (SCO) problems. Specifically, by introducing a domain shrinking procedure, we significantly improve the large-deviation results associated with the convergence rate of a nearly optimal AC-SA algorithm presented by Ghadimi and Lan in [SIAM J. Optim., 22 (2012), pp 1469--1492]. Moreover, we introduce a multistage AC-SA algorithm, which possesses an optimal rate of convergence for solving strongly convex SCO problems in terms of the dependence on not only the target accuracy, but also a number of problem parameters and the selection of initial points. To the best of our knowledge, this is the first time that such an optimal method has been presented in the literature. From our computational results, these AC-SA algorithms can substantially outperform the classical SA and some other SA type algorithms for solving certain classes of strongly...

226 citations


Journal ArticleDOI
TL;DR: A framework of block-decomposition prox-type algorithms for solving the monotone inclusion problem is introduced, and it is shown that any method in this framework is also a special instance of the hybrid proximal extragradient (HPE) method introduced by Solodov and Svaiter.
Abstract: In this paper, we consider the monotone inclusion problem consisting of the sum of a continuous monotone map and a point-to-set maximal monotone operator with a separable two-block structure and introduce a framework of block-decomposition prox-type algorithms for solving it which allows for each one of the single-block proximal subproblems to be solved in an approximate sense. Moreover, by showing that any method in this framework is also a special instance of the hybrid proximal extragradient (HPE) method introduced by Solodov and Svaiter, we derive corresponding convergence rate results. We also describe some instances of the framework based on specific and inexpensive schemes for solving the single-block proximal subproblems. Finally, we consider some applications of our methodology to establish for the first time (i) the iteration-complexity of an algorithm for finding a zero of the sum of two arbitrary maximal monotone operators and, as a consequence, the ergodic iteration-complexity of the Douglas-...

224 citations


Journal ArticleDOI
TL;DR: A family of conjugate gradient methods for unconstrained optimization and an improved Wolfe line search are proposed; the line search avoids a numerical drawback of the original Wolfe line search and guarantees the global convergence of the conjugate gradient method under mild conditions.
Abstract: In this paper, we seek the conjugate gradient direction closest to the direction of the scaled memoryless BFGS method and propose a family of conjugate gradient methods for unconstrained optimization. An improved Wolfe line search is also proposed, which can avoid a numerical drawback of the original Wolfe line search and guarantee the global convergence of the conjugate gradient method under mild conditions. To accelerate the algorithm, we introduce adaptive restarts along negative gradients based on the extent to which the function approximates some quadratic function during previous iterations. Numerical experiments with the CUTEr collection show that the proposed algorithm is promising.

214 citations


Journal ArticleDOI
TL;DR: A convergence analysis is proposed for accelerated forward-backward splitting methods for composite function minimization when the proximity operator is not available in closed form and can only be computed up to a certain precision.
Abstract: We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form, and can only be computed up to a certain precision. We prove that the $1/k^2$ convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
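A minimal sketch of an accelerated forward-backward (FISTA-type) iteration in which the proximity operator is only computed approximately; approx_prox is a hypothetical oracle returning an eps_k-accurate proximal point, and the error schedule below is just one example of a sufficiently fast decay:

```python
# Sketch of accelerated forward-backward splitting with an inexact prox oracle.
import numpy as np

def accelerated_fb(grad_smooth, approx_prox, x0, L, n_iters=200):
    x = y = np.array(x0, dtype=float)
    t = 1.0
    for k in range(1, n_iters + 1):
        eps_k = 1.0 / k**3                              # illustrative fast-decaying error
        x_new = approx_prox(y - grad_smooth(y) / L, 1.0 / L, eps_k)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov extrapolation
        x, t = x_new, t_new
    return x
```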

199 citations


Journal ArticleDOI
TL;DR: In this article, a notion of local subfirm nonexpansiveness with respect to the intersection is introduced for consistent feasibility problems. Together with a coercivity condition that relates to the regularity of the collection of sets at points in the intersection, this yields local linear convergence of AP for a wide class of nonconvex problems.
Abstract: We consider projection algorithms for solving (nonconvex) feasibility problems in Euclidean spaces. Of special interest are the method of alternating projections (AP) and the Douglas--Rachford algorithm (DR). In the case of convex feasibility, firm nonexpansiveness of projection mappings is a global property that yields global convergence of AP and for consistent problems DR. A notion of local subfirm nonexpansiveness with respect to the intersection is introduced for consistent feasibility problems. This, together with a coercivity condition that relates to the regularity of the collection of sets at points in the intersection, yields local linear convergence of AP for a wide class of nonconvex problems and even local linear convergence of nonconvex instances of the DR algorithm.
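For two sets given through (possibly nonconvex) projection maps, the two methods compared above take the following form; proj_A and proj_B are assumed user-supplied projectors:

```python
# Sketch of alternating projections (AP) and Douglas--Rachford (DR) iterations.
import numpy as np

def alternating_projections(proj_A, proj_B, x0, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = proj_A(proj_B(x))              # project onto B, then onto A
    return x

def douglas_rachford(proj_A, proj_B, x0, n_iters=500):
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        r_a = 2.0 * proj_A(x) - x          # reflect across A
        r_b = 2.0 * proj_B(r_a) - r_a      # reflect across B
        x = 0.5 * (x + r_b)                # average with the current point
    return proj_A(x)                       # shadow point in A
```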

Journal ArticleDOI
TL;DR: This paper presents an accelerated variant of the hybrid proximal extragradient (HPE) method for convex optimization, referred to as the accelerated HPE (A-HPE) framework, as well as a special version of it, where a large stepsize condition is imposed.
Abstract: This paper presents an accelerated variant of the hybrid proximal extragradient (HPE) method for convex optimization, referred to as the accelerated HPE (A-HPE) framework. Iteration-complexity results are established for the A-HPE framework, as well as a special version of it, where a large stepsize condition is imposed. Two specific implementations of the A-HPE framework are described in the context of a structured convex optimization problem whose objective function consists of the sum of a smooth convex function and an extended real-valued nonsmooth convex function. In the first implementation, a generalization of a variant of Nesterov's method is obtained for the case where the smooth component of the objective function has Lipschitz continuous gradient. In the second implementation, an accelerated Newton proximal extragradient (A-NPE) method is obtained for the case where the smooth component of the objective function has Lipschitz continuous Hessian. It is shown that the A-NPE method has a ${\cal O}...

Journal ArticleDOI
TL;DR: In this paper, the first-order optimality conditions for sparse approximation problems, in which the $l_0$-``norm'' of a vector is a part of the constraints or the objective function, were studied, and penalty decomposition (PD) methods for solving them were proposed.
Abstract: In this paper we consider sparse approximation problems, that is, general $l_0$ minimization problems with the $l_0$-``norm'' of a vector being a part of constraints or objective function. In particular, we first study the first-order optimality conditions for these problems. We then propose penalty decomposition (PD) methods for solving them in which a sequence of penalty subproblems are solved by a block coordinate descent (BCD) method. Under some suitable assumptions, we establish that any accumulation point of the sequence generated by the PD methods satisfies the first-order optimality conditions of the problems. Furthermore, for the problems in which the $l_0$ part is the only nonconvex part, we show that such an accumulation point is a local minimizer of the problems. In addition, we show that any accumulation point of the sequence generated by the BCD method is a block coordinate minimizer of the penalty subproblem. Moreover, for the problems in which the $l_0$ part is the only nonconvex part, we e...

Journal ArticleDOI
TL;DR: A novel matrix recurrence is introduced yielding a new spectral analysis of the local transient convergence behavior of the alternating direction method of multipliers (ADMM), for the particular case of a quadratic program or a linear program.
Abstract: We introduce a novel matrix recurrence yielding a new spectral analysis of the local transient convergence behavior of the alternating direction method of multipliers (ADMM), for the particular case of a quadratic program or a linear program. We identify a particular combination of vector iterates whose convergence can be analyzed via a spectral analysis. The theory predicts that ADMM should go through up to four convergence regimes, such as constant step convergence or linear convergence, ending with the latter when close enough to the optimal solution if the optimal solution is unique and satisfies strict complementarity.
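To make the iteration being analyzed concrete, here is a standard ADMM instance for a box-constrained QP, $\min \tfrac12 x^T P x + q^T x$ subject to $lo \le x \le hi$, using the splitting $x = z$; this is a generic textbook instance, not the paper's spectral analysis:

```python
# Sketch of ADMM for a box-constrained quadratic program.
import numpy as np

def admm_qp(P, q, lo, hi, rho=1.0, n_iters=200):
    n = len(q)
    x = z = u = np.zeros(n)
    K = P + rho * np.eye(n)                            # factor once in a real solver
    for _ in range(n_iters):
        x = np.linalg.solve(K, -q + rho * (z - u))     # quadratic x-update
        z = np.clip(x + u, lo, hi)                     # projection z-update
        u = u + x - z                                  # dual (scaled multiplier) update
    return z
```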

Journal ArticleDOI
TL;DR: This paper presents the convex hull of the underlying mixed integer linear set; the effectiveness of this reformulation and of the associated facet-defining inequalities is computationally evaluated on five classes of instances.
Abstract: In this paper, we examine a mixed integer linear programming reformulation for mixed integer bilinear problems where each bilinear term involves the product of a nonnegative integer variable and a nonnegative continuous variable. This reformulation is obtained by first replacing a general integer variable with its binary expansion and then using McCormick envelopes to linearize the resulting product of continuous and binary variables. We present the convex hull of the underlying mixed integer linear set. The effectiveness of this reformulation and associated facet-defining inequalities are computationally evaluated on five classes of instances.
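The two steps of the reformulation can be illustrated on a single bilinear term $xz$ with integer $0 \le z \le Z$ and continuous $0 \le x \le U$; the bounds here are generic and chosen only for illustration:

```latex
\[
z=\sum_{k=0}^{K}2^{k}y_{k},\quad y_{k}\in\{0,1\},\quad K=\lfloor\log_{2}Z\rfloor,
\qquad
xz=\sum_{k=0}^{K}2^{k}w_{k},\quad w_{k}=x\,y_{k},
\]
\[
w_{k}\le U y_{k},\qquad w_{k}\le x,\qquad w_{k}\ge x-U(1-y_{k}),\qquad w_{k}\ge 0 .
\]
```

Because each $y_k$ is binary, the McCormick envelope of $w_k = x y_k$ is exact, so the linearized model is a valid reformulation rather than a relaxation.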

Journal ArticleDOI
TL;DR: The paper addresses the problem of low-rank trace norm minimization with an algorithm that alternates between fixed-rank optimization and rank-one updates and presents a second-order trust-region algorithm with a guaranteed quadratic rate of convergence.
Abstract: The paper addresses the problem of low-rank trace norm minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of duality gap numerically tractable. The search space is nonlinear but is equipped with a Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges superlinearly to the global solution while maintaining complexity that is linear in the number of rows and columns of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters we propose a predictor-corrector approach that outperforms the naive warm-restart approach on the fixed-rank quotient manifold. The performance of the proposed algorithm is illustrated on p...

Journal ArticleDOI
TL;DR: This work proves O(1/k) convergence rates for two variants of cyclic coordinate descent under an isotonicity assumption by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm.
Abstract: Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in signal processing, statistics, and machine learning. Reasons for this renewed interest include the simplicity, speed, and stability of the method, as well as its competitive performance on $\ell_1$ regularized smooth optimization problems. Surprisingly, very little is known about its nonasymptotic convergence behavior on these problems. Most existing results either just prove convergence or provide asymptotic rates. We fill this gap in the literature by proving $O(1/k)$ convergence rates (where $k$ is the iteration count) for two variants of cyclic coordinate descent under an isotonicity assumption. Our analysis proceeds by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm. We show that the iterates generated by the cyclic coordinate descent methods remain better than those of gradient descent uniformly over time.

Journal ArticleDOI
TL;DR: Two different primal-dual splitting algorithms are proposed for solving inclusions involving mixtures of composite and parallel-sum type monotone operators; both rely on an inexact Douglas--Rachford splitting method, but applied in different underlying Hilbert spaces.
Abstract: In this paper we propose two different primal-dual splitting algorithms for solving inclusions involving mixtures of composite and parallel-sum type monotone operators which rely on an inexact Douglas--Rachford splitting method, but applied in different underlying Hilbert spaces. Most importantly, the algorithms allow one to process the bounded linear operators and the set-valued operators occurring in the formulation of the monotone inclusion problem separately at each iteration, the latter being individually accessed via their resolvents. The performance of the primal-dual algorithms is emphasized via some numerical experiments on location and image denoising problems.

Journal ArticleDOI
TL;DR: This paper provides a new relaxation including second-order-cone constraints that strengthens the usual SDP relaxation for the case where an additional ellipsoidal constraint is added to TRS, resulting in the ``two trust-region subproblem'' (TTRS).
Abstract: The classical trust-region subproblem (TRS) minimizes a nonconvex quadratic objective over the unit ball. In this paper, we consider extensions of TRS having extra constraints. When two parallel cuts are added to TRS, we show that the resulting nonconvex problem has an exact representation as a semidefinite program with additional linear and second-order-cone constraints. For the case where an additional ellipsoidal constraint is added to TRS, resulting in the ``two trust-region subproblem'' (TTRS), we provide a new relaxation including second-order-cone constraints that strengthens the usual SDP relaxation.

Journal ArticleDOI
TL;DR: In this paper, a homotopy continuation strategy is proposed that solves the $\ell_1$-regularized least-squares problem for a sequence of decreasing values of the regularization parameter and uses an approximate solution at the end of each stage to warm start the next stage.
Abstract: We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often exhibits fast linear convergence in the final stage. We exploit the local linear convergence using a homotopy continuation strategy, i.e., we solve the $\ell_1$-LS problem for a sequence of decreasing values of the regularization parameter, and use an approximate solution at the end of each stage to warm start the next stage. Although similar strategies have been studied in the literature, there have been no theoretical analysis of their global iteration complexity. This paper shows that under suitable assumptions for sparse recovery, the proposed homotopy strategy ensures that all iterates along the homotopy sol...
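A minimal sketch of the homotopy strategy, using proximal gradient (iterative soft-thresholding) as the inner solver; the shrink factor, inner iteration count, and starting value of the regularization parameter are illustrative choices only:

```python
# Sketch of homotopy continuation for min 0.5*||Ax - b||^2 + lam*||x||_1,
# decreasing lam stage by stage and warm-starting each stage from the last.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_ls_homotopy(A, b, lam_target, lam0=None, shrink=0.7, inner_iters=100):
    L = np.linalg.norm(A, 2) ** 2                    # Lipschitz constant of the gradient
    lam = np.max(np.abs(A.T @ b)) if lam0 is None else lam0
    x = np.zeros(A.shape[1])
    while True:
        lam = max(lam * shrink, lam_target)          # next stage's regularization level
        for _ in range(inner_iters):                 # proximal gradient, warm started
            x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
        if lam <= lam_target:
            return x
```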

Journal ArticleDOI
TL;DR: The scenario approach as mentioned in this paper provides an intuitive way of approximating the solution to chance-constrained optimization programs, based on finding the optimal solution under a finite number of sampled outcomes of the uncertainty.
Abstract: The scenario-based optimization approach (``scenario approach'') provides an intuitive way of approximating the solution to chance-constrained optimization programs, based on finding the optimal solution under a finite number of sampled outcomes of the uncertainty (``scenarios''). A key merit of this approach is that it neither requires explicit knowledge of the uncertainty set, as in robust optimization, nor of its probability distribution, as in stochastic optimization. The scenario approach is also computationally efficient because it only requires the solution to a convex optimization program, even if the original chance-constrained problem is nonconvex. Recent research has obtained a rigorous foundation for the scenario approach, by establishing a direct link between the number of scenarios and bounds on the constraint violation probability. These bounds are tight in the general case of an uncertain optimization problem with a single chance constraint. This paper shows that the bounds can be improved...

Journal ArticleDOI
TL;DR: In this paper, a generalization of the ellipsoid algorithm was proposed to minimize a convex Lipschitz function under a stochastic bandit feedback model, where the algorithm is allowed to observe noisy realizations of the function value at any query point.
Abstract: This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \mathcal{X}$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\widetilde{\mathcal{O}}({\rm poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.

Journal ArticleDOI
TL;DR: A limited memory version of the nonlinear conjugate gradient method is developed that possesses a global convergence property similar to that of the memoryless algorithm but has much better practical performance.
Abstract: In theory, the successive gradients generated by the conjugate gradient method applied to a quadratic should be orthogonal. However, for some ill-conditioned problems, orthogonality is quickly lost due to rounding errors, and convergence is much slower than expected. A limited memory version of the nonlinear conjugate gradient method is developed. The memory is used to both detect the loss of orthogonality and to restore orthogonality. An implementation of the algorithm is presented based on the CG_DESCENT nonlinear conjugate gradient method. Limited memory CG_DESCENT (L-CG_DESCENT) possesses a global convergence property similar to that of the memoryless algorithm but has much better practical performance. Numerical comparisons to the limited memory BFGS method (L-BFGS) are given using the CUTEr test problems.

Journal ArticleDOI
TL;DR: This work considers the primal problem of finding the zeros of the sum of a maximal monotone operator and the composition of another maximal monotone operator with a linear continuous operator.
Abstract: We consider the primal problem of finding the zeros of the sum of a maximal monotone operator and the composition of another maximal monotone operator with a linear continuous operator. By formulat...

Journal ArticleDOI
TL;DR: It is proved that uniform second-order growth, tilt stability, and strong metric regularity of the subdifferential---three notions that have appeared in entirely different settings---are all essentially equivalent for any lower-semicontinuous, extended real-valued function.
Abstract: We prove that uniform second-order growth, tilt stability, and strong metric regularity of the subdifferential---three notions that have appeared in entirely different settings---are all essentially equivalent for any lower-semicontinuous, extended real-valued function.

Journal ArticleDOI
TL;DR: This paper derives affine-scaled second order necessary and sufficient conditions for local minimizers of minimization problems with nonconvex, nonsmooth, perhaps non-Lipschitz penalty functions and proposes a globally convergent smoothing trust region Newton method.
Abstract: Regularized minimization problems with nonconvex, nonsmooth, perhaps non-Lipschitz penalty functions have attracted considerable attention in recent years, owing to their wide applications in image restoration, signal reconstruction, and variable selection. In this paper, we derive affine-scaled second order necessary and sufficient conditions for local minimizers of such minimization problems. Moreover, we propose a globally convergent smoothing trust region Newton method which can find a point satisfying the affine-scaled second order necessary optimality condition from any starting point. Numerical examples are given to demonstrate the effectiveness of the smoothing trust region Newton method.

Journal ArticleDOI
TL;DR: In this article, a variant of the pessimistic bilevel optimization problem is studied, which comprises constraints that must be satisfied for any optimal solution of a subordinate (lower-level) optimization problem.
Abstract: We study a variant of the pessimistic bilevel optimization problem, which comprises constraints that must be satisfied for any optimal solution of a subordinate (lower-level) optimization problem. We present conditions that guarantee the existence of optimal solutions in such a problem, and we characterize the computational complexity of various subclasses of the problem. We then focus on problem instances that may lack convexity, but that satisfy a certain independence property. We develop convergent approximations for these instances, and we derive an iterative solution scheme that is reminiscent of the discretization techniques used in semi-infinite programming. We also present a computational study that illustrates the numerical behavior of our algorithm on standard benchmark instances.

Journal ArticleDOI
TL;DR: A general primal-dual splitting algorithm for solving systems of structured coupled monotone inclusions in Hilbert spaces is introduced and its asymptotic behavior is analyzed, providing a flexible solution method applicable to a variety of problems beyond the reach of the state-of-the-art.
Abstract: A general primal-dual splitting algorithm for solving systems of structured coupled monotone inclusions in Hilbert spaces is introduced and its asymptotic behavior is analyzed. Each inclusion in the primal system features compositions with linear operators, parallel sums, and Lipschitzian operators. All the operators involved in this structured model are used separately in the proposed algorithm, most steps of which can be executed in parallel. This provides a flexible solution method applicable to a variety of problems beyond the reach of the state-of-the-art. Several applications are discussed to illustrate this point.

Journal ArticleDOI
TL;DR: A broad setting is described in which computing the generalized Hessian of Mordukhovich is easy, and in this setting the idea of tilt stability introduced by Poliquin and Rockafellar is equivalent to a classical smooth second-order condition.
Abstract: We compare two recent variational-analytic approaches to second-order conditions and sensitivity analysis for nonsmooth optimization. We describe a broad setting where computing the generalized Hessian of Mordukhovich is easy. In this setting, the idea of tilt stability introduced by Poliquin and Rockafellar is equivalent to a classical smooth second-order condition.

Journal ArticleDOI
TL;DR: This work provides a new regularization method for MPECs whose limit points are M-stationary, in contrast with most existing regularization methods, which converge only to C-stationary points, a very weak stationarity concept.
Abstract: Mathematical programs with equilibrium (or complementarity) constraints (MPECs) form a difficult class of optimization problems. The feasible set has a very special structure and violates most of the standard constraint qualifications. Therefore, one typically applies specialized algorithms in order to solve MPECs. One very prominent class of specialized algorithms are the regularization (or relaxation) methods. The first regularization method for MPECs is due to Scholtes [SIAM J. Optim., 11 (2001), pp. 918--936], but in the meantime, there exist a number of different regularization schemes which try to relax the difficult constraints in different ways. However, almost all regularization methods converge to C-stationary points only, which is a very weak stationarity concept. An exception is a recent method by Kadrani, Dussault, and Benchakroun [SIAM J. Optim., 20 (2009), pp. 78--103], whose limit points are shown to be M-stationary. Here we provide a new regularization method which also converges to M-sta...