Showing papers in "arXiv: Optimization and Control in 2015"

Open Access

Journal Article•DOI•

JuMP: A Modeling Language for Mathematical Optimization

Iain Dunning, Joey Huchette, Miles Lubin

09 Aug 2015-arXiv: Optimization and Control

TL;DR: JuMP is an open-source modeling language that allows users to express a wide range of optimization problems (linear, mixed-integer, quadratic, conic-quadratic, semidefinite, and nonlinear) in a high-level, algebraic syntax.

Abstract: JuMP is an open-source modeling language that allows users to express a wide range of optimization problems (linear, mixed-integer, quadratic, conic-quadratic, semidefinite, and nonlinear) in a high-level, algebraic syntax. JuMP takes advantage of advanced features of the Julia programming language to offer unique functionality while achieving performance on par with commercial modeling tools for standard tasks. In this work we will provide benchmarks, present the novel aspects of the implementation, and discuss how JuMP can be extended to new problem classes and composed with state-of-the-art tools for visualization and interactivity.

907 citations

Posted Content•

Data-driven Distributionally Robust Optimization Using the Wasserstein Metric: Performance Guarantees and Tractable Reformulations

Peyman Mohajerin Esfahani¹, Daniel Kuhn²•Institutions (2)

Delft University of Technology¹, École Polytechnique Fédérale de Lausanne²

19 May 2015-arXiv: Optimization and Control

TL;DR: It is demonstrated that the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs—in many interesting cases even as tractable linear programs.

Abstract: We consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset. Using the Wasserstein metric, we construct a ball in the space of (multivariate and non-discrete) probability distributions centered at the uniform distribution on the training samples, and we seek decisions that perform best in view of the worst-case distribution within this Wasserstein ball. The state-of-the-art methods for solving the resulting distributionally robust optimization problems rely on global optimization techniques, which quickly become computationally excruciating. In this paper we demonstrate that, under mild assumptions, the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs---in many interesting cases even as tractable linear programs. Leveraging recent measure concentration results, we also show that their solutions enjoy powerful finite-sample performance guarantees. Our theoretical results are exemplified in mean-risk portfolio optimization as well as uncertainty quantification.

808 citations

Posted Content•

Coordinate Descent Algorithms

Stephen J. Wright¹•Institutions (1)

University of Wisconsin-Madison¹

17 Feb 2015-arXiv: Optimization and Control

TL;DR: Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes; this paper describes their fundamentals, variants, and convergence properties, with particular attention to a problem structure that arises frequently in machine learning.

Abstract: Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes. They have been used in applications for many years, and their popularity continues to grow because of their usefulness in data analysis, machine learning, and other areas of current interest. This paper describes the fundamentals of the coordinate descent approach, together with variants and extensions and their convergence properties, mostly with reference to convex objectives. We pay particular attention to a certain problem structure that arises frequently in machine learning applications, showing that efficient implementations of accelerated coordinate descent algorithms are possible for problems of this type. We also present some parallel variants and discuss their convergence properties under several models of parallel execution.

659 citations
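
The coordinate-wise minimization described in the abstract above can be illustrated with a minimal sketch. It assumes a strictly convex quadratic objective f(x) = 0.5 x^T A x - b^T x with A symmetric positive definite, for which each one-dimensional minimization has a closed form. The function name, the cyclic update order, and the test problem are illustrative choices, not taken from the paper.

```python
def coordinate_descent(A, b, x0, sweeps=100):
    """Cyclic coordinate descent for f(x) = 0.5 * x^T A x - b^T x.

    Assumes A is symmetric positive definite (hypothetical test setting).
    """
    x = list(x0)
    n = len(x)
    for _ in range(sweeps):
        for i in range(n):
            # Exact minimization over x[i] with the other coordinates fixed:
            # solve (A x - b)[i] = 0 for x[i].
            residual = b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = residual / A[i][i]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_cd = coordinate_descent(A, b, [0.0, 0.0])
```

For this quadratic the scheme coincides with the Gauss-Seidel iteration for A x = b, so `x_cd` approaches the solution (1/11, 7/11).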

Journal Article•DOI•

Markov Chain Analysis of Cumulative Step-size Adaptation on a Linear Constrained Problem

Alexandre Chotard, Anne Auger¹, Nikolaus Hansen¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

15 Oct 2015-arXiv: Optimization and Control

TL;DR: This paper analyzes a (1, $\lambda$)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint that is handled by resampling.

Abstract: This paper analyzes a (1, $\lambda$)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence, depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed.

463 citations

Posted Content•

Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Xiangru Lian¹, Yijun Huang¹, Yuncheng Li¹, Ji Liu¹•Institutions (1)

University of Rochester¹

27 Jun 2015-arXiv: Optimization and Control

TL;DR: This paper studies two asynchronous parallel implementations of stochastic gradient (SG), one on a computer network and one on a shared memory system, and shows that linear speedup is achievable if the number of workers is bounded by $\sqrt{K}$, where $K$ is the total number of iterations.

Abstract: Asynchronous parallel implementations of stochastic gradient (SG) have been broadly used in training deep neural networks and have recently achieved many successes in practice. However, existing theories cannot explain their convergence and speedup properties, mainly due to the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill the gaps in theory and provide theoretical support, this paper studies two asynchronous parallel implementations of SG: one on a computer network and the other on a shared memory system. We establish an ergodic convergence rate of $O(1/\sqrt{K})$ for both algorithms and prove that linear speedup is achievable if the number of workers is bounded by $\sqrt{K}$ ($K$ is the total number of iterations). Our results generalize and improve existing analysis for convex minimization.

346 citations

Posted Content•

Global Convergence of ADMM in Nonconvex Nonsmooth Optimization

Yu Wang¹, Wotao Yin², Jinshan Zeng³•Institutions (3)

University of California, Berkeley¹, University of California, Los Angeles², Jiangxi Normal University³

18 Nov 2015-arXiv: Optimization and Control

TL;DR: ADMM might be a better choice than ALM for some nonconvex nonsmooth problems, because ADMM is not only easier to implement, it is also more likely to converge for the concerned scenarios.

Abstract: In this paper, we analyze the convergence of the alternating direction method of multipliers (ADMM) for minimizing a nonconvex and possibly nonsmooth objective function, $\phi(x_0,\ldots,x_p,y)$, subject to coupled linear equality constraints. Our ADMM updates each of the primal variables $x_0,\ldots,x_p,y$, followed by updating the dual variable. We separate the variable $y$ from $x_i$'s as it has a special role in our analysis. The developed convergence guarantee covers a variety of nonconvex functions such as piecewise linear functions, $\ell_q$ quasi-norm, Schatten-$q$ quasi-norm ($0

300 citations

Posted Content•

A Universal Catalyst for First-Order Optimization

Hongzhou Lin, Julien Mairal, Zaid Harchaoui¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

06 Jun 2015-arXiv: Optimization and Control

TL;DR: In this article, a generic scheme for accelerating first-order optimization methods in the sense of Nesterov is introduced, which builds upon a new analysis of the accelerated proximal point algorithm.

Abstract: We introduce a generic scheme for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. Our approach consists of minimizing a convex objective by approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. This strategy applies to a large class of algorithms, including gradient descent, block coordinate descent, SAG, SAGA, SDCA, SVRG, Finito/MISO, and their proximal variants. For all of these methods, we provide acceleration and explicit support for non-strongly convex objectives. In addition to theoretical speed-up, we also show that acceleration is useful in practice, especially for ill-conditioned problems where we measure significant improvements.

270 citations

Posted Content•

Low-rank Solutions of Linear Matrix Equations via Procrustes Flow

Stephen Tu¹, Ross Boczar¹, Max Simchowitz¹, Mahdi Soltanolkotabi, Benjamin Recht¹•Institutions (1)

University of California, Berkeley¹

13 Jul 2015-arXiv: Optimization and Control

TL;DR: Procrustes Flow recovers a low-rank matrix from linear measurements by starting from an initial estimate obtained by a thresholding scheme and then running gradient descent on a non-convex objective; as long as the measurements obey a standard restricted isometry property, the algorithm converges to the unknown matrix at a geometric rate.

Abstract: In this paper we study the problem of recovering a low-rank matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a non-convex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an $n_1 \times n_2$ matrix of rank $r$ when the number of measurements exceeds a constant times $(n_1+n_2)r$.

242 citations

Journal Article•DOI•

Secondary Frequency and Voltage Control of Islanded Microgrids via Distributed Averaging

John W. Simpson-Porco¹, Qobad Shafiee², Florian Dörfler³, Juan C. Vasquez², Josep M. Guerrero², Francesco Bullo¹•Institutions (3)

University of California, Santa Barbara¹, Aalborg University², École Polytechnique Fédérale de Lausanne³

26 Apr 2015-arXiv: Optimization and Control

TL;DR: This work presents new distributed controllers for secondary frequency and voltage control in islanded microgrids. Inspired by techniques from cooperative control, the proposed controllers use localized information and nearest-neighbor communication to collectively perform secondary control actions.

Abstract: In this work we present new distributed controllers for secondary frequency and voltage control in islanded microgrids. Inspired by techniques from cooperative control, the proposed controllers use localized information and nearest-neighbor communication to collectively perform secondary control actions. The frequency controller rapidly regulates the microgrid frequency to its nominal value while maintaining active power sharing among the distributed generators. Tuning of the voltage controller provides a simple and intuitive trade-off between the conflicting goals of voltage regulation and reactive power sharing. Our designs require no knowledge of the microgrid topology, impedances or loads. The distributed architecture allows for flexibility and redundancy, and eliminates the need for a central microgrid controller. We provide a voltage stability analysis and present extensive experimental results validating our designs, verifying robust performance under communication failure and during plug-and-play operation.

231 citations

Posted Content•

A Unified Algorithmic Framework for Block-Structured Optimization Involving Big Data

Mingyi Hong, Meisam Razaviyayn, Zhi-Quan Luo, Jong-Shi Pang

09 Nov 2015-arXiv: Optimization and Control

TL;DR: The Block Successive Upper bound Minimization (BSUM) is an algorithmic framework for big data optimization that includes as special cases many well-known methods for analyzing massive data sets, such as Block Coordinate Descent (BCD), the Convex-Concave Procedure (CCCP), the Block Coordinate Proximal Gradient (BCPG) method, Nonnegative Matrix Factorization (NMF), and the Expectation Maximization (EM) method.

Abstract: This article presents a powerful algorithmic framework for big data optimization, called the Block Successive Upper bound Minimization (BSUM). The BSUM includes as special cases many well-known methods for analyzing massive data sets, such as the Block Coordinate Descent (BCD), the Convex-Concave Procedure (CCCP), the Block Coordinate Proximal Gradient (BCPG) method, the Nonnegative Matrix Factorization (NMF), the Expectation Maximization (EM) method and so on. In this article, various features and properties of the BSUM are discussed from the viewpoint of design flexibility, computational efficiency, parallel/distributed implementation and the required communication overhead. Illustrative examples from networking, signal processing and machine learning are presented to demonstrate the practical performance of the BSUM framework.

215 citations
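
The idea of successively minimizing an upper bound one block at a time can be sketched on a toy problem. The example below assumes two scalar blocks and a quadratic upper bound of the form f(x_k) + g (x - x_k) + (L/2)(x - x_k)^2 for each block, whose minimizer is a gradient step of length 1/L. The objective, the constants `L_u` and `L_v`, and all function names are hypothetical and chosen only for illustration; the paper's framework is far more general.

```python
def bsum_two_blocks(grad_u, grad_v, L_u, L_v, u0, v0, sweeps=500):
    """Successive upper-bound minimization over two scalar blocks (sketch)."""
    u, v = u0, v0
    for _ in range(sweeps):
        # Block u: minimize the surrogate
        #   f(u_k, v) + grad_u * (u - u_k) + (L_u / 2) * (u - u_k)^2,
        # which upper-bounds f in u when L_u >= curvature along u.
        u = u - grad_u(u, v) / L_u
        # Block v: same construction, using the freshly updated u.
        v = v - grad_v(u, v) / L_v
    return u, v


# Toy objective: f(u, v) = 0.5 * (u + 2v - 3)^2 + 0.5 * (u^2 + v^2).
def grad_u(u, v):
    return (u + 2.0 * v - 3.0) + u


def grad_v(u, v):
    return 2.0 * (u + 2.0 * v - 3.0) + v


# Curvature is 2 along u and 5 along v; 3 and 6 give valid (loose) upper bounds.
u_opt, v_opt = bsum_two_blocks(grad_u, grad_v, 3.0, 6.0, 0.0, 0.0)
```

Setting both partial derivatives to zero gives the minimizer (0.5, 1.0), which the iterates approach. Choosing the surrogate constants equal to the exact curvatures would reduce this sketch to exact block coordinate descent.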

Journal Article•DOI•

ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates

Zhimin Peng¹, Yangyang Xu², Ming Yan, Wotao Yin¹•Institutions (2)

University of California, Los Angeles¹, University of Alabama²

08 Jun 2015-arXiv: Optimization and Control

TL;DR: It is shown that if the nonexpansive operator $T$ has a fixed point, then, with probability one, ARock generates a sequence that converges to a fixed point of $T$.

Abstract: Finding a fixed point of a nonexpansive operator, i.e., $x^*=Tx^*$, abstracts many problems in numerical linear algebra, optimization, and other areas of scientific computing. To solve fixed-point problems, we propose ARock, an algorithmic framework in which multiple agents (machines, processors, or cores) update $x$ in an asynchronous parallel fashion. Asynchrony is crucial to parallel computing since it reduces synchronization wait, relaxes communication bottleneck, and thus speeds up computing significantly. At each step of ARock, an agent updates a randomly selected coordinate $x_i$ based on possibly out-of-date information on $x$. The agents share $x$ through either global memory or communication. If writing $x_i$ is atomic, the agents can read and write $x$ without memory locks. Theoretically, we show that if the nonexpansive operator $T$ has a fixed point, then with probability one, ARock generates a sequence that converges to a fixed point of $T$. Our conditions on $T$ and step sizes are weaker than comparable work. Linear convergence is also obtained. We propose special cases of ARock for linear systems, convex optimization, machine learning, as well as distributed and decentralized consensus problems. Numerical experiments of solving sparse logistic regression problems are presented.
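
The coordinate update with out-of-date information described in the abstract above can be mimicked in a single thread. The sketch below is a serial simulation under stated assumptions, not the paper's parallel implementation: staleness is modeled by reading a randomly chosen recent iterate, coordinates are drawn uniformly, and the step size `eta`, the delay bound `max_delay`, and the test operator `T` (a gradient step on a small quadratic) are all hypothetical choices.

```python
import random


def arock_serial_sketch(T, x0, eta=0.1, max_delay=3, steps=5000, seed=0):
    """Serial simulation of random coordinate fixed-point updates with stale reads."""
    rng = random.Random(seed)
    x = list(x0)
    history = [list(x0)]  # recent iterates, used to simulate out-of-date reads
    for _ in range(steps):
        # An agent reads a possibly out-of-date copy of x ...
        x_hat = history[rng.randrange(len(history))]
        # ... selects one coordinate uniformly at random ...
        i = rng.randrange(len(x))
        # ... and applies x_i <- x_i - eta * ((I - T) x_hat)_i.
        x[i] -= eta * (x_hat[i] - T(x_hat)[i])
        history.append(list(x))
        if len(history) > max_delay + 1:
            history.pop(0)
    return x


# Hypothetical operator: T(x) = x - gamma * (A x - b), whose fixed point solves A x = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
gamma = 0.2


def T(x):
    return [x[i] - gamma * (sum(A[i][j] * x[j] for j in range(2)) - b[i])
            for i in range(2)]


x_fp = arock_serial_sketch(T, [0.0, 0.0])
```

With this choice of `gamma` the operator is a contraction, and the iterates approach the fixed point (1/11, 7/11). A real deployment would replace the simulated history with concurrent agents sharing `x`.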

Journal Article•DOI•

Projected Reflected Gradient Methods for Monotone Variational Inequalities

Yu. V. Malitsky

17 Feb 2015-arXiv: Optimization and Control

TL;DR: This paper proposes a projected reflected gradient method for variational inequality problems with monotone, Lipschitz-continuous mappings in Hilbert space; the constant-stepsize version requires only one projection onto the feasible set and one evaluation of the mapping per iteration.

Abstract: This paper is concerned with some new projection methods for solving variational inequality problems with monotone and Lipschitz-continuous mapping in Hilbert space. First, we propose the projected reflected gradient algorithm with a constant stepsize. It is similar to the projected gradient method, namely, the method requires only one projection onto the feasible set and only one value of the mapping per iteration. This distinguishes our method from most other projection-type methods for variational inequalities with monotone mapping. Also we prove that it has R-linear rate of convergence under the strong monotonicity assumption. The usual drawback of algorithms with constant stepsize is the requirement to know the Lipschitz constant of the mapping. To avoid this, we modify our first algorithm so that the algorithm needs at most two projections per iteration. In fact, our computational experience shows that such cases with two projections are very rare. This scheme, at least theoretically, seems to be very effective. All methods are shown to be globally convergent to a solution of the variational inequality. Preliminary results from numerical experiments are quite promising.
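
The constant-stepsize scheme described in the abstract above, with one projection and one evaluation of the mapping per iteration, can be sketched as follows. The update x_{k+1} = P_C(x_k - step * F(2 x_k - x_{k-1})) is the commonly cited form of the projected reflected gradient iteration and is an assumption here, since the abstract does not spell it out. The test mapping, the box constraint, and the step 0.4 (chosen below the bound (sqrt(2) - 1)/L usually quoted for this scheme, with L = 1) are likewise illustrative assumptions.

```python
def projected_reflected_gradient(F, project, x0, step, iters=2000):
    """Projected reflected gradient iteration with a constant step (sketch)."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        # Reflect the current iterate through the previous one: y = 2 x_k - x_{k-1}.
        y = [2.0 * a - c for a, c in zip(x, x_prev)]
        Fy = F(y)  # the only evaluation of F in this iteration
        # The only projection in this iteration.
        x_next = project([a - step * g for a, g in zip(x, Fy)])
        x_prev, x = x, x_next
    return x


def F(z):
    # Skew (rotation) mapping: monotone and 1-Lipschitz, but not strongly monotone.
    return [z[1], -z[0]]


def project_box(z):
    # Euclidean projection onto the box [-1, 1]^2.
    return [min(1.0, max(-1.0, v)) for v in z]


x_vi = projected_reflected_gradient(F, project_box, [1.0, 1.0], step=0.4)
```

For this mapping the only solution of the variational inequality on the box is the origin, and the plain projected gradient iteration x_{k+1} = P_C(x_k - step * F(x_k)) does not converge to it, which is why a rotation is a common test case.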

Posted Content•

A General Analysis of the Convergence of ADMM

Robert Nishihara¹, Laurent Lessard¹, Benjamin Recht¹, Andrew Packard¹, Michael I. Jordan¹•Institutions (1)

University of California, Berkeley¹

06 Feb 2015-arXiv: Optimization and Control

TL;DR: The authors provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex, based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014).

Abstract: We provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex. Our proof is based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014), reducing algorithm convergence to verifying the stability of a dynamical system. This approach generalizes a number of existing results and obviates any assumptions about specific choices of algorithm parameters. On a numerical example, we demonstrate that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances. We complement our upper bound by constructing a nearly-matching lower bound on the worst-case rate of convergence.
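
For readers unfamiliar with the iteration being analyzed, the following minimal sketch runs scaled-form ADMM on a toy problem in which one objective term is strongly convex, the setting the abstract above refers to. The problem (minimize 0.5 * ||x - a||^2 + lam * ||z||_1 subject to x - z = 0), the penalty parameter `rho`, and the function names are illustrative assumptions and not the paper's numerical example.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.| applied to a scalar."""
    return max(v - kappa, 0.0) - max(-v - kappa, 0.0)


def admm_l1_denoise(a, lam, rho=1.0, iters=200):
    """ADMM for: minimize 0.5*||x - a||^2 + lam*||z||_1 subject to x - z = 0."""
    n = len(a)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: minimize 0.5*(x - a)^2 + (rho/2)*(x - z + u)^2 per coordinate.
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: minimize lam*|z| + (rho/2)*(x - z + u)^2 per coordinate.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Dual update on the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


z_admm = admm_l1_denoise([3.0, -0.5, 1.2], lam=1.0)
```

The minimizer of this toy problem is the soft-thresholding of `a` at level `lam`, here (2.0, 0.0, 0.2), so the result is easy to check. Varying `rho` changes how fast the iterates settle, which is the kind of parameter choice the abstract discusses.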

Posted Content•

When Are Nonconvex Problems Not Scary

Ju Sun, Qing Qu, John Wright

21 Oct 2015-arXiv: Optimization and Control

TL;DR: This work describes a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations, and highlights alternatives.

Abstract: In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional curvature. Concrete applications such as dictionary learning, generalized phase retrieval, and orthogonal tensor decomposition are known to induce such structures. We describe a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations. Finally we highlight alternatives, and open problems in this direction.

Posted Content•

On the Global Linear Convergence of Frank-Wolfe Optimization Variants

Simon Lacoste-Julien¹, Martin Jaggi²•Institutions (2)

École Normale Supérieure¹, ETH Zurich²

18 Nov 2015-arXiv: Optimization and Control

TL;DR: This paper highlights and clarifies several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice (away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm) and proves for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective.

Abstract: The Frank-Wolfe (FW) optimization algorithm has lately re-gained popularity thanks in particular to its ability to nicely handle the structured constraints appearing in machine learning applications. However, its convergence rate is known to be slow (sublinear) when the solution lies at the boundary. A simple less-known fix is to add the possibility to take 'away steps' during optimization, an operation that importantly does not require a feasibility oracle. In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective. The constant in the convergence rate has an elegant interpretation as the product of the (classical) condition number of the function with a novel geometric quantity that plays the role of a 'condition number' of the constraint set. We provide pointers to where these algorithms have made a difference in practice, in particular with the flow polytope, the marginal polytope and the base polytope for submodular optimization.
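
The 'away steps' mentioned in the abstract above can be sketched on the probability simplex, where the vertices are unit vectors and the active set is simply the support of the iterate. The example assumes the objective 0.5 * ||x - b||^2 with an exact line search; the function name, the test vector `b`, and the tolerance are hypothetical choices for illustration, and the sketch is not the authors' implementation.

```python
def away_step_frank_wolfe(b, x0, max_iter=1000, tol=1e-12):
    """Away-steps Frank-Wolfe for 0.5 * ||x - b||^2 over the probability simplex."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        g = [x[i] - b[i] for i in range(n)]               # gradient at x
        s = min(range(n), key=lambda i: g[i])             # Frank-Wolfe vertex e_s
        support = [i for i in range(n) if x[i] > 0.0]     # active vertices
        v = max(support, key=lambda i: g[i])              # away vertex e_v
        gx = sum(g[i] * x[i] for i in range(n))
        gap_fw = gx - g[s]                                # <-g, e_s - x>
        gap_away = g[v] - gx                              # <-g, x - e_v>
        if gap_fw <= tol:
            break
        if gap_fw >= gap_away:
            d = [-xi for xi in x]                         # direction e_s - x
            d[s] += 1.0
            gamma_max, gap, is_away = 1.0, gap_fw, False
        else:
            d = list(x)                                   # direction x - e_v
            d[v] -= 1.0
            gamma_max, gap, is_away = x[v] / (1.0 - x[v]), gap_away, True
        # Exact line search for the quadratic objective, clipped to stay feasible.
        gamma = min(gamma_max, gap / sum(di * di for di in d))
        x = [x[i] + gamma * d[i] for i in range(n)]
        if is_away and gamma == gamma_max:
            x[v] = 0.0                                    # drop step: remove vertex v
    return x


x_fw = away_step_frank_wolfe([0.6, 0.6, -0.2], [0.0, 0.0, 1.0])
```

The minimizer is the projection of `b` onto the simplex, (0.5, 0.5, 0), which lies on the boundary. That is the situation in which the abstract says plain Frank-Wolfe slows down, and here the away step removes the weight on the third vertex in a single drop step.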

Posted Content•

Distributionally Robust Logistic Regression

Soroosh Shafieezadeh-Abadeh¹, Peyman Mohajerin Esfahani¹, Daniel Kuhn¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

30 Sep 2015-arXiv: Optimization and Control

TL;DR: This paper uses the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples, and proposes a distributionally robust logistic regression model that minimizes a worst-case expected logloss function.

Abstract: This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulate a distributionally robust logistic regression model that minimizes a worst-case expected logloss function, where the worst case is taken over all distributions in the Wasserstein ball. We prove that this optimization problem admits a tractable reformulation and encapsulates the classical as well as the popular regularized logistic regression problems as special cases. We further propose a distributionally robust approach based on Wasserstein balls to compute upper and lower confidence bounds on the misclassification probability of the resulting classifier. These bounds are given by the optimal values of two highly tractable linear programs. We validate our theoretical out-of-sample guarantees through simulated and empirical experiments.

Journal Article•DOI•

Distributed Convex Optimization for Continuous-Time Dynamics with Time-Varying Cost Function

Salar Rahili, Wei Ren

17 Jul 2015-arXiv: Optimization and Control

TL;DR: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems, and two discontinuous algorithms based on the signum function are proposed to solve the problem in each case.

Abstract: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems. Control algorithms are designed for the cases of single-integrator and double-integrator dynamics. Two discontinuous algorithms based on the signum function are proposed to solve the problem in each case. Then in the case of double-integrator dynamics, two continuous algorithms based on, respectively, a time-varying and a fixed boundary layer are proposed as continuous approximations of the signum function. Also, to account for inter-agent collision for physical agents, a distributed convex optimization problem with swarm tracking behavior is introduced for both single-integrator and double-integrator dynamics.

Posted Content•

Fast Local Voltage Control under Limited Reactive Power: Optimality and Stability Analysis

Hao Zhu¹, Hao Jan Liu¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

23 Oct 2015-arXiv: Optimization and Control

TL;DR: In this article, a local reactive power (VAR) control framework is developed that can respond quickly to voltage mismatch and address the robustness issues of (de-)centralized approaches against communication delays and noise.

Abstract: High penetration of distributed energy resources presents several challenges and opportunities for voltage regulation in power distribution systems. A local reactive power (VAR) control framework is developed that can respond quickly to voltage mismatch and address the robustness issues of (de-)centralized approaches against communication delays and noise. Using local bus voltage measurements, the proposed gradient-projection based schemes explicitly account for the VAR limit of every bus, and are proven convergent to a surrogate centralized problem with proper parameter choices. This optimality result quantifies the capability of local VAR control without requiring any real-time communications. The proposed framework and analysis generalize earlier results on the droop VAR control design, which may suffer from under-utilization of VAR resources in order to ensure stability. Numerical tests have demonstrated the validity of our analytical results and the effectiveness of proposed approaches implemented on realistic three-phase systems.

Posted Content•

A geometric alternative to Nesterov's accelerated gradient descent

Sébastien Bubeck, Yin Tat Lee, Mohit Singh

26 Jun 2015-arXiv: Optimization and Control

TL;DR: A new method for unconstrained optimization of a smooth and strongly convex function is proposed, which attains the optimal rate of convergence of Nesterov’s accelerated gradient descent.

Abstract: We propose a new method for unconstrained optimization of a smooth and strongly convex function, which attains the optimal rate of convergence of Nesterov’s accelerated gradient descent. The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method. We provide some numerical evidence that the new method can be superior to Nesterov’s accelerated gradient descent.

Posted Content•

Linear convergence of first order methods for non-strongly convex optimization

Ion Necoara, Yurii Nesterov, François Glineur

23 Apr 2015-arXiv: Optimization and Control

TL;DR: This paper derives linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition.

Abstract: The standard assumption for proving linear convergence of first order methods for smooth convex optimization is the strong convexity of the objective function, an assumption which does not hold for many practical applications. In this paper, we derive linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition. In particular, in the case of smooth constrained convex optimization, we provide several relaxations of the strong convexity conditions and prove that they are sufficient for getting linear convergence for several first order methods such as projected gradient, fast gradient and feasible descent methods. We also provide examples of functional classes that satisfy our proposed relaxations of strong convexity conditions. Finally, we show that the proposed relaxed strong convexity conditions cover important applications, including solving linear systems, linear programming, and dual formulations of linearly constrained convex problems.

Book Chapter•DOI•

Weak Solutions for First Order Mean Field Games with Local Coupling

Pierre Cardaliaguet¹•Institutions (1)

Paris Dauphine University¹

01 Jan 2015-arXiv: Optimization and Control

TL;DR: In this paper, the existence and uniqueness of a weak solution for first order mean field game systems with local coupling are obtained by variational methods; the solution can be used to devise $\epsilon$-Nash equilibria for deterministic differential games with a finite (but large) number of players.

Abstract: Existence and uniqueness of a weak solution for first order mean field game systems with local coupling are obtained by variational methods. This solution can be used to devise $\epsilon$-Nash equilibria for deterministic differential games with a finite (but large) number of players. For smooth data, the first component of the weak solution of the MFG system is proved to satisfy (in a viscosity sense) a time-space degenerate elliptic differential equation.

Posted Content•

Fast Convergence Rates for Distributed Non-Bayesian Learning

Angelia Nedic¹, Alex Olshevsky², César A. Uribe³•Institutions (3)

Arizona State University¹, Boston University², University of Illinois at Urbana–Champaign³

21 Aug 2015-arXiv: Optimization and Control

TL;DR: This work proposes a distributed algorithm and establishes consistency, as well as a nonasymptotic, explicit, and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses.

Abstract: We consider the problem of distributed learning, where a network of agents collectively aim to agree on a hypothesis that best explains a set of distributed observations of conditionally independent random processes. We propose a distributed algorithm and establish consistency, as well as a non-asymptotic, explicit and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses. Additionally, if the agents interact over static networks, we provide an improved learning protocol with better scalability with respect to the number of nodes in the network.

Journal Article•DOI•

Robust Collaborative Nonnegative Matrix Factorization For Hyperspectral Unmixing (R-CoNMF)

Jun Li, Jose M. Bioucas-Dias, Antonio Plaza, Lin Liu

16 Jun 2015-arXiv: Optimization and Control

TL;DR: In this paper, a robust version of CoNMF called R-CoNMF has been proposed to estimate the number of endmembers, the mixing matrix, and the fractional abundances from hyperspectral linear mixtures.

Abstract: The recently introduced collaborative nonnegative matrix factorization (CoNMF) algorithm was conceived to simultaneously estimate the number of endmembers, the mixing matrix, and the fractional abundances from hyperspectral linear mixtures. This paper introduces R-CoNMF, which is a robust version of CoNMF. The robustness has been added by a) including a volume regularizer which penalizes the distance to a mixing matrix inferred by a pure pixel algorithm; and by b) introducing a new proximal alternating optimization (PAO) algorithm for which convergence to a critical point is guaranteed. Our experimental results indicate that R-CoNMF provides effective estimates both when the number of endmembers is unknown and when it is known.

Posted Content•

A Linearly-Convergent Stochastic L-BFGS Algorithm

Philipp Moritz¹, Robert Nishihara¹, Michael I. Jordan¹•Institutions (1)

University of California, Berkeley¹

09 Aug 2015-arXiv: Optimization and Control

TL;DR: In this paper, a new stochastic L-BFGS algorithm was proposed and proved to have a linear convergence rate for strongly convex and smooth functions, and the algorithm was shown to perform well for a wide range of step sizes.

Abstract: We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.

Journal Article•DOI•

Optimal Placement of Virtual Inertia in Power Grids

Bala Kameshwar Poolla¹, Saverio Bolognani¹, Florian Dörfler¹•Institutions (1)

ETH Zurich¹

06 Oct 2015-arXiv: Optimization and Control

TL;DR: In this article, the optimal inertia placement problem in a linear network-reduced power system model is investigated and a set of closed-form global optimality results for particular problem instances as well as a computational approach resulting in locally optimal solutions are provided.

Abstract: A major transition in the operation of electric power grids is the replacement of synchronous machines by distributed generation connected via power electronic converters. The accompanying "loss of rotational inertia" and the fluctuations by renewable sources jeopardize the system stability, as testified by the ever-growing number of frequency incidents. As a remedy, numerous studies demonstrate how virtual inertia can be emulated through various devices, but few of them address the question of "where" to place this inertia. It is however strongly believed that the placement of virtual inertia hugely impacts system efficiency, as demonstrated by recent case studies. In this article, we carry out a comprehensive analysis in an attempt to address the optimal inertia placement problem. We consider a linear network-reduced power system model along with an H2 performance metric accounting for the network coherency. The optimal inertia placement problem turns out to be non-convex, yet we provide a set of closed-form global optimality results for particular problem instances as well as a computational approach resulting in locally optimal solutions. Further, we also consider the robust inertia allocation problem, wherein the optimization is carried out accounting for the worst-case disturbance location. We illustrate our results with a three-region power grid case study and compare our locally optimal solution with different placement heuristics in terms of different performance metrics.

Journal Article•DOI•

An Efficient Inexact Symmetric Gauss-Seidel Based Majorized ADMM for High-Dimensional Convex Composite Conic Programming

Liang Chen¹, Defeng Sun², Kim-Chuan Toh²•Institutions (2)

Hunan University¹, National University of Singapore²

02 Jun 2015-arXiv: Optimization and Control

TL;DR: The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2–3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.

Abstract: In this paper, we propose an inexact multi-block ADMM-type first-order method for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy. The design of this method combines an inexact 2-block majorized semi-proximal ADMM and the recent advances in the inexact symmetric Gauss-Seidel (sGS) technique for solving a multi-block convex composite quadratic programming whose objective contains a nonsmooth term involving only the first block-variable. One distinctive feature of our proposed method (the sGS-imsPADMM) is that it only needs one cycle of an inexact sGS method, instead of an unknown number of cycles, to solve each of the subproblems involved. With some simple and implementable error tolerance criteria, the cost for solving the subproblems can be greatly reduced, and many steps in the forward sweep of each sGS cycle can often be skipped, which further contributes to the efficiency of the proposed method. Global convergence as well as the iteration complexity in the non-ergodic sense is established. Preliminary numerical experiments on some high-dimensional linear and convex quadratic SDP problems with a large number of linear equality and inequality constraints are also provided. The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2 to 3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.

Posted Content•

Asymptotic convergence of constrained primal-dual dynamics

Ashish Cherukuri¹, Enrique Mallada², Jorge E. Cortes¹•Institutions (2)

University of California, San Diego¹, California Institute of Technology²

07 Oct 2015-arXiv: Optimization and Control

TL;DR: This paper uses the invariance principle for discontinuous Caratheodory systems to establish that the primal-dual optimizers are globally asymptotically stable under the primal-dual dynamics and that each solution of the dynamics converges to an optimizer.

Abstract: This paper studies the asymptotic convergence properties of the primal-dual dynamics designed for solving constrained concave optimization problems using classical notions from stability analysis. We motivate the need for this study by providing an example that rules out the possibility of employing the invariance principle for hybrid automata to study asymptotic convergence. We understand the solutions of the primal-dual dynamics in the Caratheodory sense and characterize their existence, uniqueness, and continuity with respect to the initial condition. We use the invariance principle for discontinuous Caratheodory systems to establish that the primal-dual optimizers are globally asymptotically stable under the primal-dual dynamics and that each solution of the dynamics converges to an optimizer.

Journal Article•DOI•

Dynamic State Estimation for Multi-Machine Power System by Unscented Kalman Filter with Enhanced Numerical Stability

Junjian Qi¹, Kai Sun², Jianhui Wang¹, Hui Liu¹•Institutions (2)

Argonne National Laboratory¹, University of Tennessee²

24 Sep 2015-arXiv: Optimization and Control

TL;DR: A new UKF with guaranteed positive semidefinite estimation error covariance (UKF-GPS) is proposed and compared with five existing approaches, finding that UKF-modified and SR-UKF always work well, indicating their better scalability mainly due to the enhanced numerical stability.

Abstract: In this paper, in order to enhance the numerical stability of the unscented Kalman filter (UKF) used for power system dynamic state estimation, a new UKF with guaranteed positive semidefinite estimation error covariance (UKF-GPS) is proposed and compared with five existing approaches, including UKF-schol, UKF-$\kappa$, UKF-modified, UKF-$\Delta Q$, and the square-root unscented Kalman filter (SR-UKF). These methods and the extended Kalman filter (EKF) are tested by performing dynamic state estimation on the WSCC 3-machine 9-bus system and the NPCC 48-machine 140-bus system. For the WSCC system, all methods obtain good estimates. However, for the NPCC system, both EKF and the classic UKF fail. It is found that UKF-schol, UKF-$\kappa$, and UKF-$\Delta Q$ do not work well in some estimations while UKF-GPS works well in most cases. UKF-modified and SR-UKF can always work well, indicating their better scalability mainly due to the enhanced numerical stability.

Proceedings Article•DOI•

Optimal Control and Coordination of Connected and Automated Vehicles at Urban Traffic Intersections

Yue J. Zhang¹, Andreas A. Malikopoulos², Christos G. Cassandras¹•Institutions (2)

Boston University¹, Oak Ridge National Laboratory²

29 Sep 2015-arXiv: Optimization and Control

TL;DR: A decentralized optimal control framework whose solution yields for each vehicle the optimal acceleration/deceleration at any time in the sense of minimizing fuel consumption is presented.

Abstract: We address the problem of coordinating online a continuous flow of connected and automated vehicles (CAVs) crossing two adjacent intersections in an urban area. We present a decentralized optimal control framework whose solution yields for each vehicle the optimal acceleration/deceleration at any time in the sense of minimizing fuel consumption. The solution, when it exists, allows the vehicles to cross the intersections without the use of traffic lights, without creating congestion on the connecting road, and under the hard safety constraint of collision avoidance. The effectiveness of the proposed solution is validated through simulation considering two intersections located in downtown Boston, and it is shown that coordination of CAVs can reduce significantly both fuel consumption and travel time.

Posted Content•

An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization

Hamid Reza Feyzmahdavian¹, Arda Aytekin¹, Mikael Johansson¹•Institutions (1)

Royal Institute of Technology¹

18 May 2015-arXiv: Optimization and Control

TL;DR: This work proposes an asynchronous mini-batch algorithm for regularized stochastic optimization problems that eliminates idle waiting and allows workers to run at their maximal update rates and enjoys near-linear speedup if the number of workers is O(1/√ϵ).

Abstract: Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, the state-of-the-art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operations are inefficient since they will leave workers idle waiting for the slower nodes to complete their computations. In this paper, we propose an asynchronous mini-batch algorithm for regularized stochastic optimization problems with smooth loss functions that eliminates idle waiting and allows workers to run at their maximal update rates. We show that by suitably choosing the step-size values, the algorithm achieves a rate of the order $O(1/\sqrt{T})$ for general convex regularization functions, and the rate $O(1/T)$ for strongly convex regularization functions, where $T$ is the number of iterations. In both cases, the impact of asynchrony on the convergence rate of our algorithm is asymptotically negligible, and a near-linear speedup in the number of workers can be expected. Theoretical results are confirmed in real implementations on a distributed computing infrastructure.
