Showing papers in "arXiv: Optimization and Control in 2015"


Journal ArticleDOI
TL;DR: JuMP is an open-source modeling language that allows users to express a wide range of optimization problems (linear, mixed-integer, quadratic, conic-quadratic, semidefinite, and nonlinear) in a high-level, algebraic syntax.
Abstract: JuMP is an open-source modeling language that allows users to express a wide range of optimization problems (linear, mixed-integer, quadratic, conic-quadratic, semidefinite, and nonlinear) in a high-level, algebraic syntax. JuMP takes advantage of advanced features of the Julia programming language to offer unique functionality while achieving performance on par with commercial modeling tools for standard tasks. In this work we will provide benchmarks, present the novel aspects of the implementation, and discuss how JuMP can be extended to new problem classes and composed with state-of-the-art tools for visualization and interactivity.

907 citations


Posted Content
TL;DR: It is demonstrated that the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs—in many interesting cases even as tractable linear programs.
Abstract: We consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset. Using the Wasserstein metric, we construct a ball in the space of (multivariate and non-discrete) probability distributions centered at the uniform distribution on the training samples, and we seek decisions that perform best in view of the worst-case distribution within this Wasserstein ball. The state-of-the-art methods for solving the resulting distributionally robust optimization problems rely on global optimization techniques, which quickly become computationally excruciating. In this paper we demonstrate that, under mild assumptions, the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs---in many interesting cases even as tractable linear programs. Leveraging recent measure concentration results, we also show that their solutions enjoy powerful finite-sample performance guarantees. Our theoretical results are exemplified in mean-risk portfolio optimization as well as uncertainty quantification.

808 citations


Posted Content
TL;DR: Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes; this paper describes their fundamentals, variants, extensions, and convergence properties, with attention to problem structures arising in machine learning.
Abstract: Coordinate descent algorithms solve optimization problems by successively performing approximate minimization along coordinate directions or coordinate hyperplanes. They have been used in applications for many years, and their popularity continues to grow because of their usefulness in data analysis, machine learning, and other areas of current interest. This paper describes the fundamentals of the coordinate descent approach, together with variants and extensions and their convergence properties, mostly with reference to convex objectives. We pay particular attention to a certain problem structure that arises frequently in machine learning applications, showing that efficient implementations of accelerated coordinate descent algorithms are possible for problems of this type. We also present some parallel variants and discuss their convergence properties under several models of parallel execution.
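
As a minimal sketch of the basic idea, the following pure-Python example applies cyclic coordinate descent with exact minimization along each coordinate to a strongly convex quadratic $f(x) = \tfrac{1}{2} x^\top A x - b^\top x$. The objective, the data `A` and `b`, and the function name `coordinate_descent` are illustrative assumptions and are not taken from the paper.

```python
def coordinate_descent(A, b, sweeps=100):
    """Cyclic coordinate descent for f(x) = 0.5 * x^T A x - b^T x.

    Assumes A is symmetric positive definite (an illustrative choice),
    so each one-dimensional subproblem has a closed-form minimizer.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Partial derivative of f with respect to x_i: (A x - b)_i
            grad_i = sum(A[i][j] * x[j] for j in range(n)) - b[i]
            # Exact minimization along coordinate i
            x[i] -= grad_i / A[i][i]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_cd = coordinate_descent(A, b)
print(x_cd)  # approaches the solution of A x = b, namely [1/11, 7/11]
```

For this quadratic the scheme coincides with the Gauss-Seidel iteration for the linear system $Ax = b$; randomized, block, and accelerated variants change how the coordinate is selected and how the step is computed.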

659 citations


Journal ArticleDOI
TL;DR: In this paper, a (1, $\lambda$)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, is analyzed on a linear function with a linear constraint, where resampling is used to handle the constraint.
Abstract: This paper analyzes a (1, $\lambda$)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence, depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed.

463 citations


Posted Content
TL;DR: In this article, two asynchronous parallel implementations of stochastic gradient (SG) are studied, one over a computer network and one on a shared memory system, and it is shown that linear speedup is achievable if the number of workers is bounded by $\sqrt{K}$, where $K$ is the total number of iterations.
Abstract: Asynchronous parallel implementations of stochastic gradient (SG) have been broadly used in training deep neural networks and have achieved many successes in practice recently. However, existing theories cannot explain their convergence and speedup properties, mainly due to the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill the gaps in theory and provide theoretical support, this paper studies two asynchronous parallel implementations of SG: one over a computer network and the other on a shared memory system. We establish an ergodic convergence rate $O(1/\sqrt{K})$ for both algorithms and prove that linear speedup is achievable if the number of workers is bounded by $\sqrt{K}$ ($K$ is the total number of iterations). Our results generalize and improve existing analysis for convex minimization.

346 citations


Posted Content
TL;DR: ADMM might be a better choice than ALM for some nonconvex nonsmooth problems, because ADMM is not only easier to implement, it is also more likely to converge for the concerned scenarios.
Abstract: In this paper, we analyze the convergence of the alternating direction method of multipliers (ADMM) for minimizing a nonconvex and possibly nonsmooth objective function, $\phi(x_0,\ldots,x_p,y)$, subject to coupled linear equality constraints. Our ADMM updates each of the primal variables $x_0,\ldots,x_p,y$, followed by updating the dual variable. We separate the variable $y$ from $x_i$'s as it has a special role in our analysis. The developed convergence guarantee covers a variety of nonconvex functions such as piecewise linear functions, $\ell_q$ quasi-norm, Schatten-$q$ quasi-norm ($0

300 citations


Posted Content
TL;DR: In this article, a generic scheme for accelerating first-order optimization methods in the sense of Nesterov is introduced, which builds upon a new analysis of the accelerated proximal point algorithm.
Abstract: We introduce a generic scheme for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. Our approach consists of minimizing a convex objective by approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. This strategy applies to a large class of algorithms, including gradient descent, block coordinate descent, SAG, SAGA, SDCA, SVRG, Finito/MISO, and their proximal variants. For all of these methods, we provide acceleration and explicit support for non-strongly convex objectives. In addition to theoretical speed-up, we also show that acceleration is useful in practice, especially for ill-conditioned problems where we measure significant improvements.

270 citations


Posted Content
TL;DR: Procrustes Flow recovers a low-rank matrix from linear measurements by running gradient descent on a non-convex objective, starting from an initial estimate obtained by a thresholding scheme; as long as the measurements obey a standard restricted isometry property, the algorithm converges to the unknown matrix at a geometric rate.
Abstract: In this paper we study the problem of recovering a low-rank matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a non-convex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an $n_1 \times n_2$ matrix of rank $r$ when the number of measurements exceeds a constant times $(n_1+n_2)r$.

242 citations


Journal ArticleDOI
TL;DR: New distributed controllers for secondary frequency and voltage control in islanded microgrids are presented. Inspired by techniques from cooperative control, the proposed controllers use localized information and nearest-neighbor communication to collectively perform secondary control actions.
Abstract: In this work we present new distributed controllers for secondary frequency and voltage control in islanded microgrids. Inspired by techniques from cooperative control, the proposed controllers use localized information and nearest-neighbor communication to collectively perform secondary control actions. The frequency controller rapidly regulates the microgrid frequency to its nominal value while maintaining active power sharing among the distributed generators. Tuning of the voltage controller provides a simple and intuitive trade-off between the conflicting goals of voltage regulation and reactive power sharing. Our designs require no knowledge of the microgrid topology, impedances or loads. The distributed architecture allows for flexibility and redundancy, and eliminates the need for a central microgrid controller. We provide a voltage stability analysis and present extensive experimental results validating our designs, verifying robust performance under communication failure and during plug-and-play operation.

231 citations


Posted Content
TL;DR: The Block Successive Upper Bound Minimization (BSUM) is a powerful algorithmic framework for big data optimization that includes as special cases many well-known methods for analyzing massive data sets, such as the Block Coordinate Descent (BCD), the Convex-Concave Procedure (CCCP), the Block Coordinate Proximal Gradient (BCPG) method, the Nonnegative Matrix Factorization (NMF), and the Expectation Maximization (EM) method.
Abstract: This article presents a powerful algorithmic framework for big data optimization, called the Block Successive Upper bound Minimization (BSUM). The BSUM includes as special cases many well-known methods for analyzing massive data sets, such as the Block Coordinate Descent (BCD), the Convex-Concave Procedure (CCCP), the Block Coordinate Proximal Gradient (BCPG) method, the Nonnegative Matrix Factorization (NMF), the Expectation Maximization (EM) method and so on. In this article, various features and properties of the BSUM are discussed from the viewpoint of design flexibility, computational efficiency, parallel/distributed implementation and the required communication overhead. Illustrative examples from networking, signal processing and machine learning are presented to demonstrate the practical performance of the BSUM framework.

215 citations


Journal ArticleDOI
TL;DR: Theoretically, it is shown that if the nonexpansive operator $T$ has a fixed point, then with probability one, ARock generates a sequence that converges to a fixed point of $T$.
Abstract: Finding a fixed point to a nonexpansive operator, i.e., $x^*=Tx^*$, abstracts many problems in numerical linear algebra, optimization, and other areas of scientific computing. To solve fixed-point problems, we propose ARock, an algorithmic framework in which multiple agents (machines, processors, or cores) update $x$ in an asynchronous parallel fashion. Asynchrony is crucial to parallel computing since it reduces synchronization wait, relaxes communication bottleneck, and thus speeds up computing significantly. At each step of ARock, an agent updates a randomly selected coordinate $x_i$ based on possibly out-of-date information on $x$. The agents share $x$ through either global memory or communication. If writing $x_i$ is atomic, the agents can read and write $x$ without memory locks. Theoretically, we show that if the nonexpansive operator $T$ has a fixed point, then with probability one, ARock generates a sequence that converges to a fixed point of $T$. Our conditions on $T$ and step sizes are weaker than comparable work. Linear convergence is also obtained. We propose special cases of ARock for linear systems, convex optimization, machine learning, as well as distributed and decentralized consensus problems. Numerical experiments of solving sparse logistic regression problems are presented.
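
The update described above (a randomly selected coordinate $x_i$ is modified using possibly out-of-date information on $x$) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: it does not run agents in parallel, it mimics staleness by reading a copy of $x$ from up to `max_delay` steps earlier, and the affine operator $Tx = Mx + c$ (a contraction here, which is stronger than nonexpansive), the step size `eta`, and the name `arock_sketch` are all hypothetical choices made for the example.

```python
import random
from collections import deque


def arock_sketch(M, c, eta=0.1, max_delay=2, iters=20000, seed=0):
    """Sequential simulation of an ARock-style coordinate update.

    Each step picks a random coordinate i and applies
        x_i <- x_i - eta * (x_hat - T(x_hat))_i,
    where x_hat is a possibly stale copy of x and T(x) = M x + c.
    """
    rng = random.Random(seed)
    m = len(c)
    x = [0.0] * m
    # Keep the last (max_delay + 1) iterates so that stale reads are possible.
    history = deque([list(x)], maxlen=max_delay + 1)
    for _ in range(iters):
        i = rng.randrange(m)                        # randomly selected coordinate
        delay = rng.randint(0, len(history) - 1)    # simulated staleness
        x_hat = history[-1 - delay]                 # out-of-date view of x
        t_i = sum(M[i][j] * x_hat[j] for j in range(m)) + c[i]
        x[i] -= eta * (x_hat[i] - t_i)              # update only coordinate i
        history.append(list(x))
    return x


# Illustrative contraction: every row of M has absolute sum at most 0.4.
M = [[0.2, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.2]]
c = [1.0, 2.0, 3.0]
x_fix = arock_sketch(M, c)
residual = max(
    abs(x_fix[i] - (sum(M[i][j] * x_fix[j] for j in range(3)) + c[i]))
    for i in range(3)
)
print(x_fix, residual)
```

The printed residual measures how far `x_fix` is from satisfying $x = Tx$. In a real asynchronous implementation the stale reads arise from concurrent agents rather than from an explicit history buffer.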

Journal ArticleDOI
TL;DR: In this paper, a projected reflected gradient algorithm with a constant stepsize is proposed for solving variational inequality problems with monotone and Lipschitz-continuous mapping in Hilbert space; it requires only one projection onto the feasible set and only one value of the mapping per iteration.
Abstract: This paper is concerned with some new projection methods for solving variational inequality problems with monotone and Lipschitz-continuous mapping in Hilbert space. First, we propose the projected reflected gradient algorithm with a constant stepsize. It is similar to the projected gradient method, namely, the method requires only one projection onto the feasible set and only one value of the mapping per iteration. This distinguishes our method from most other projection-type methods for variational inequalities with monotone mapping. Also we prove that it has R-linear rate of convergence under the strong monotonicity assumption. The usual drawback of algorithms with constant stepsize is the requirement to know the Lipschitz constant of the mapping. To avoid this, we modify our first algorithm so that the algorithm needs at most two projections per iteration. In fact, our computational experience shows that such cases with two projections are very rare. This scheme, at least theoretically, seems to be very effective. All methods are shown to be globally convergent to a solution of the variational inequality. Preliminary results from numerical experiments are quite promising.
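
A minimal sketch of a projected reflected gradient iteration is given below, written as $x_{k+1} = P_C\big(x_k - \lambda F(2x_k - x_{k-1})\big)$, so each step uses one projection and one evaluation of $F$. The test problem is an assumption made for illustration: $F(x) = Ax + q$ with $A = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}$ and $q = (-1, 1)$, which is strongly monotone with Lipschitz constant $\sqrt{5}$, and $C = [0,1]^2$. The step size 0.15 is chosen below $(\sqrt{2}-1)/\sqrt{5} \approx 0.185$; the function names are hypothetical.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n."""
    return [min(max(v, lo), hi) for v in x]


def F(x):
    """Illustrative monotone affine mapping F(x) = A x + q."""
    return [x[0] + 2.0 * x[1] - 1.0, -2.0 * x[0] + x[1] + 1.0]


def projected_reflected_gradient(F, project, x0, step, iters=5000):
    """Iterate x_{k+1} = P_C(x_k - step * F(2 x_k - x_{k-1}))."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        # Reflected point: the only place where F is evaluated in this step
        y = [2.0 * a - b for a, b in zip(x, x_prev)]
        Fy = F(y)
        # Single projection onto the feasible set
        x_next = project([a - step * g for a, g in zip(x, Fy)])
        x_prev, x = x, x_next
    return x


x_vi = projected_reflected_gradient(F, project_box, [0.0, 0.0], step=0.15)
print(x_vi)  # approaches (0.6, 0.2), where F vanishes inside the box
```

Because $F(0.6, 0.2) = 0$ and this point lies in the box, it solves the variational inequality for this example. The variant from the abstract that avoids knowing the Lipschitz constant is not shown here.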

Posted Content
TL;DR: In this article, the authors provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex, based on a framework for analyzing optimization algorithms introduced in Lessard et al.
Abstract: We provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex. Our proof is based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014), reducing algorithm convergence to verifying the stability of a dynamical system. This approach generalizes a number of existing results and obviates any assumptions about specific choices of algorithm parameters. On a numerical example, we demonstrate that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances. We complement our upper bound by constructing a nearly-matching lower bound on the worst-case rate of convergence.

Posted Content
TL;DR: This work describes a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations, and highlights alternatives.
Abstract: In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional curvature. Concrete applications such as dictionary learning, generalized phase retrieval, and orthogonal tensor decomposition are known to induce such structures. We describe a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations. Finally we highlight alternatives, and open problems in this direction.

Posted Content
TL;DR: This paper highlights and clarifies several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and proves for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective.
Abstract: The Frank-Wolfe (FW) optimization algorithm has lately re-gained popularity thanks in particular to its ability to nicely handle the structured constraints appearing in machine learning applications. However, its convergence rate is known to be slow (sublinear) when the solution lies at the boundary. A simple less-known fix is to add the possibility to take 'away steps' during optimization, an operation that importantly does not require a feasibility oracle. In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective. The constant in the convergence rate has an elegant interpretation as the product of the (classical) condition number of the function with a novel geometric quantity that plays the role of a 'condition number' of the constraint set. We provide pointers to where these algorithms have made a difference in practice, in particular with the flow polytope, the marginal polytope and the base polytope for submodular optimization.
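
The 'away steps' idea can be sketched on a small problem whose solution lies on the boundary of the feasible set. The example below minimizes $\tfrac{1}{2}\|x - b\|^2$ over the probability simplex, where the vertices are the unit vectors and the coordinates of $x$ serve as the weights of the active vertices. The objective, the exact line search, the starting point, and the name `away_step_frank_wolfe` are illustrative assumptions, not the paper's setup.

```python
def away_step_frank_wolfe(b, iters=1000, tol=1e-12):
    """Away-steps Frank-Wolfe for min 0.5*||x - b||^2 over the simplex."""
    n = len(b)
    x = [1.0 / n] * n                                   # start at the barycenter
    for _ in range(iters):
        g = [xi - bi for xi, bi in zip(x, b)]           # gradient at x
        s = min(range(n), key=lambda i: g[i])           # FW vertex e_s
        support = [i for i in range(n) if x[i] > 0.0]   # active vertices
        v = max(support, key=lambda i: g[i])            # away vertex e_v
        gx = sum(gi * xi for gi, xi in zip(g, x))
        gap_fw = gx - g[s]                              # -g . (e_s - x)
        gap_away = g[v] - gx                            # -g . (x - e_v)
        if gap_fw <= tol:                               # FW gap certifies optimality
            break
        take_away = gap_away > gap_fw
        if take_away:
            d = list(x)
            d[v] -= 1.0                                 # direction x - e_v
            gamma_max = x[v] / (1.0 - x[v])             # keeps weight of e_v >= 0
            gap = gap_away
        else:
            d = [-xi for xi in x]
            d[s] += 1.0                                 # direction e_s - x
            gamma_max = 1.0
            gap = gap_fw
        # Exact line search for this quadratic, clipped to the feasible range
        gamma = min(gap / sum(di * di for di in d), gamma_max)
        x = [xi + gamma * di for xi, di in zip(x, d)]
        if take_away and gamma == gamma_max:
            x[v] = 0.0                                  # drop step: e_v leaves the active set
    return x


x_fw = away_step_frank_wolfe([0.5, 0.3, -0.4])
print(x_fw)  # approaches [0.6, 0.4, 0.0], the projection of b onto the simplex
```

On the simplex the away direction needs no separate bookkeeping of the active set; for general polytopes the vertex weights must be stored explicitly.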

Posted Content
TL;DR: This paper uses the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples, and proposes a distributionally robust logistic regression model that minimizes a worst-case expected logloss function.
Abstract: This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulate a distributionally robust logistic regression model that minimizes a worst-case expected logloss function, where the worst case is taken over all distributions in the Wasserstein ball. We prove that this optimization problem admits a tractable reformulation and encapsulates the classical as well as the popular regularized logistic regression problems as special cases. We further propose a distributionally robust approach based on Wasserstein balls to compute upper and lower confidence bounds on the misclassification probability of the resulting classifier. These bounds are given by the optimal values of two highly tractable linear programs. We validate our theoretical out-of-sample guarantees through simulated and empirical experiments.

Journal ArticleDOI
TL;DR: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems, and two discontinuous algorithms based on the signum function are proposed to solve the problem in each case.
Abstract: In this paper, a time-varying distributed convex optimization problem is studied for continuous-time multi-agent systems. Control algorithms are designed for the cases of single-integrator and double-integrator dynamics. Two discontinuous algorithms based on the signum function are proposed to solve the problem in each case. Then in the case of double-integrator dynamics, two continuous algorithms based on, respectively, a time-varying and a fixed boundary layer are proposed as continuous approximations of the signum function. Also, to account for inter-agent collision for physical agents, a distributed convex optimization problem with swarm tracking behavior is introduced for both single-integrator and double-integrator dynamics.

Posted Content
TL;DR: In this article, a local reactive power (VAR) control framework is developed that can respond quickly to voltage mismatch and address the robustness issues of (de-)centralized approaches against communication delays and noise.
Abstract: High penetration of distributed energy resources presents several challenges and opportunities for voltage regulation in power distribution systems. A local reactive power (VAR) control framework is developed that can respond quickly to voltage mismatch and address the robustness issues of (de-)centralized approaches against communication delays and noise. Using local bus voltage measurements, the proposed gradient-projection based schemes explicitly account for the VAR limit of every bus, and are proven convergent to a surrogate centralized problem with proper parameter choices. This optimality result quantifies the capability of local VAR control without requiring any real-time communications. The proposed framework and analysis generalize earlier results on the droop VAR control design, which may suffer from under-utilization of VAR resources in order to ensure stability. Numerical tests have demonstrated the validity of our analytical results and the effectiveness of proposed approaches implemented on realistic three-phase systems.

Posted Content
TL;DR: A new method for unconstrained optimization of a smooth and strongly convex function is proposed, which attains the optimal rate of convergence of Nesterov’s accelerated gradient descent.
Abstract: We propose a new method for unconstrained optimization of a smooth and strongly convex function, which attains the optimal rate of convergence of Nesterov’s accelerated gradient descent. The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method. We provide some numerical evidence that the new method can be superior to Nesterov’s accelerated gradient descent.

Posted Content
TL;DR: This paper derives linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition.
Abstract: The standard assumption for proving linear convergence of first order methods for smooth convex optimization is the strong convexity of the objective function, an assumption which does not hold for many practical applications. In this paper, we derive linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition. In particular, in the case of smooth constrained convex optimization, we provide several relaxations of the strong convexity conditions and prove that they are sufficient for getting linear convergence for several first order methods such as projected gradient, fast gradient and feasible descent methods. We also provide examples of functional classes that satisfy our proposed relaxations of strong convexity conditions. Finally, we show that the proposed relaxed strong convexity conditions cover important applications such as solving linear systems, linear programming, and dual formulations of linearly constrained convex problems.

Book ChapterDOI
TL;DR: In this paper, the existence and uniqueness of a weak solution for first order mean field game systems with local coupling are obtained by variational methods; the solution can be used to devise $\epsilon$-Nash equilibria for deterministic differential games with a finite (but large) number of players.
Abstract: Existence and uniqueness of a weak solution for first order mean field game systems with local coupling are obtained by variational methods. This solution can be used to devise $\epsilon$-Nash equilibria for deterministic differential games with a finite (but large) number of players. For smooth data, the first component of the weak solution of the MFG system is proved to satisfy (in a viscosity sense) a time-space degenerate elliptic differential equation.

Posted Content
TL;DR: This work proposes a distributed algorithm and establishes consistency, as well as a nonasymptotic, explicit, and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses.
Abstract: We consider the problem of distributed learning, where a network of agents collectively aim to agree on a hypothesis that best explains a set of distributed observations of conditionally independent random processes. We propose a distributed algorithm and establish consistency, as well as a non-asymptotic, explicit and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses. Additionally, if the agents interact over static networks, we provide an improved learning protocol with better scalability with respect to the number of nodes in the network.

Journal ArticleDOI
TL;DR: In this paper, a robust version of CoNMF called R-CoNMF has been proposed to estimate the number of endmembers, the mixing matrix, and the fractional abundances from hyperspectral linear mixtures.
Abstract: The recently introduced collaborative nonnegative matrix factorization (CoNMF) algorithm was conceived to simultaneously estimate the number of endmembers, the mixing matrix, and the fractional abundances from hyperspectral linear mixtures. This paper introduces R-CoNMF, which is a robust version of CoNMF. The robustness has been added by a) including a volume regularizer which penalizes the distance to a mixing matrix inferred by a pure pixel algorithm; and by b) introducing a new proximal alternating optimization (PAO) algorithm for which convergence to a critical point is guaranteed. Our experimental results indicate that R-CoNMF provides effective estimates both when the number of endmembers is unknown and when it is known.

Posted Content
TL;DR: In this paper, a new stochastic L-BFGS algorithm was proposed and proved to have a linear convergence rate for strongly convex and smooth functions, and the algorithm was shown to perform well for a wide range of step sizes.
Abstract: We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.

Journal ArticleDOI
TL;DR: In this article, the optimal inertia placement problem in a linear network-reduced power system model is investigated and a set of closed-form global optimality results for particular problem instances as well as a computational approach resulting in locally optimal solutions are provided.
Abstract: A major transition in the operation of electric power grids is the replacement of synchronous machines by distributed generation connected via power electronic converters. The accompanying "loss of rotational inertia" and the fluctuations by renewable sources jeopardize the system stability, as testified by the ever-growing number of frequency incidents. As a remedy, numerous studies demonstrate how virtual inertia can be emulated through various devices, but few of them address the question of "where" to place this inertia. It is however strongly believed that the placement of virtual inertia hugely impacts system efficiency, as demonstrated by recent case studies. In this article, we carry out a comprehensive analysis in an attempt to address the optimal inertia placement problem. We consider a linear network-reduced power system model along with an H2 performance metric accounting for the network coherency. The optimal inertia placement problem turns out to be non-convex, yet we provide a set of closed-form global optimality results for particular problem instances as well as a computational approach resulting in locally optimal solutions. Further, we also consider the robust inertia allocation problem, wherein the optimization is carried out accounting for the worst-case disturbance location. We illustrate our results with a three-region power grid case study and compare our locally optimal solution with different placement heuristics in terms of different performance metrics.

Journal ArticleDOI
TL;DR: The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2 to 3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.
Abstract: In this paper, we propose an inexact multi-block ADMM-type first-order method for solving a class of high-dimensional convex composite conic optimization problems to moderate accuracy. The design of this method combines an inexact 2-block majorized semi-proximal ADMM and the recent advances in the inexact symmetric Gauss-Seidel (sGS) technique for solving a multi-block convex composite quadratic programming whose objective contains a nonsmooth term involving only the first block-variable. One distinctive feature of our proposed method (the sGS-imsPADMM) is that it only needs one cycle of an inexact sGS method, instead of an unknown number of cycles, to solve each of the subproblems involved. With some simple and implementable error tolerance criteria, the cost for solving the subproblems can be greatly reduced, and many steps in the forward sweep of each sGS cycle can often be skipped, which further contributes to the efficiency of the proposed method. Global convergence as well as the iteration complexity in the non-ergodic sense is established. Preliminary numerical experiments on some high-dimensional linear and convex quadratic SDP problems with a large number of linear equality and inequality constraints are also provided. The results show that for the vast majority of the tested problems, the sGS-imsPADMM is 2 to 3 times faster than the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is currently the benchmark among first-order methods for solving multi-block linear and quadratic SDP problems though its convergence is not guaranteed.

Posted Content
TL;DR: This paper uses the invariance principle for discontinuous Caratheodory systems to establish that the primal-dual optimizers are globally asymptotically stable under the primal-dual dynamics and that each solution of the dynamics converges to an optimizer.
Abstract: This paper studies the asymptotic convergence properties of the primal-dual dynamics designed for solving constrained concave optimization problems using classical notions from stability analysis. We motivate the need for this study by providing an example that rules out the possibility of employing the invariance principle for hybrid automata to study asymptotic convergence. We understand the solutions of the primal-dual dynamics in the Caratheodory sense and characterize their existence, uniqueness, and continuity with respect to the initial condition. We use the invariance principle for discontinuous Caratheodory systems to establish that the primal-dual optimizers are globally asymptotically stable under the primal-dual dynamics and that each solution of the dynamics converges to an optimizer.
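As a rough illustration of the primal-dual dynamics described above, here is a minimal forward-Euler sketch on a one-dimensional toy problem. The problem instance, the step size `dt`, and the function name `primal_dual_flow` are assumptions made for illustration and do not come from the paper; the paper analyzes the continuous-time dynamics, not this discretization.

```python
def primal_dual_flow(steps=5000, dt=0.01):
    """Forward-Euler simulation of projected primal-dual dynamics.

    Toy problem (an assumption, not taken from the paper):
        maximize f(x) = -(x - 2)^2  subject to  g(x) = x - 1 <= 0
    with Lagrangian L(x, lam) = f(x) - lam * g(x).
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = -2.0 * (x - 2.0) - lam  # dL/dx: ascent direction in x
        g = x - 1.0                      # constraint value, equal to -dL/dlam
        x_next = x + dt * grad_x
        # The max(0, .) projection keeps the multiplier nonnegative; it is
        # the source of the discontinuity in the continuous-time vector field.
        lam = max(0.0, lam + dt * g)
        x = x_next
    return x, lam


x_opt, lam_opt = primal_dual_flow()
```

For this instance the constrained maximizer is x = 1 with multiplier 2, so the iterates should settle near that pair when the step size is small.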

Journal ArticleDOI
TL;DR: A new UKF with guaranteed positive semidefinite estimation error covariance (UKF-GPS) is proposed and compared with five existing approaches, finding that UKF-modified and SR-UKF always work well, indicating their better scalability mainly due to the enhanced numerical stability.
Abstract: In this paper, in order to enhance the numerical stability of the unscented Kalman filter (UKF) used for power system dynamic state estimation, a new UKF with guaranteed positive semidefinite estimation error covariance (UKF-GPS) is proposed and compared with five existing approaches, including UKF-schol, UKF-$\kappa$, UKF-modified, UKF-$\Delta Q$, and the square-root unscented Kalman filter (SR-UKF). These methods and the extended Kalman filter (EKF) are tested by performing dynamic state estimation on the WSCC 3-machine 9-bus system and the NPCC 48-machine 140-bus system. For the WSCC system, all methods obtain good estimates. However, for the NPCC system, both EKF and the classic UKF fail. It is found that UKF-schol, UKF-$\kappa$, and UKF-$\Delta Q$ do not work well in some estimations while UKF-GPS works well in most cases. UKF-modified and SR-UKF can always work well, indicating their better scalability mainly due to the enhanced numerical stability.

Proceedings ArticleDOI
TL;DR: A decentralized optimal control framework whose solution yields for each vehicle the optimal acceleration/deceleration at any time in the sense of minimizing fuel consumption is presented.
Abstract: We address the problem of coordinating online a continuous flow of connected and automated vehicles (CAVs) crossing two adjacent intersections in an urban area. We present a decentralized optimal control framework whose solution yields for each vehicle the optimal acceleration/deceleration at any time in the sense of minimizing fuel consumption. The solution, when it exists, allows the vehicles to cross the intersections without the use of traffic lights, without creating congestion on the connecting road, and under the hard safety constraint of collision avoidance. The effectiveness of the proposed solution is validated through simulation considering two intersections located in downtown Boston, and it is shown that coordination of CAVs can reduce significantly both fuel consumption and travel time.

Posted Content
TL;DR: This work proposes an asynchronous mini-batch algorithm for regularized stochastic optimization problems that eliminates idle waiting and allows workers to run at their maximal update rates and enjoys near-linear speedup if the number of workers is O(1/√ϵ).
Abstract: Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, the state-of-the-art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operations are inefficient since they will leave workers idle waiting for the slower nodes to complete their computations. In this paper, we propose an asynchronous mini-batch algorithm for regularized stochastic optimization problems with smooth loss functions that eliminates idle waiting and allows workers to run at their maximal update rates. We show that by suitably choosing the step-size values, the algorithm achieves a rate of the order $O(1/\sqrt{T})$ for general convex regularization functions, and the rate $O(1/T)$ for strongly convex regularization functions, where $T$ is the number of iterations. In both cases, the impact of asynchrony on the convergence rate of our algorithm is asymptotically negligible, and a near-linear speedup in the number of workers can be expected. Theoretical results are confirmed in real implementations on a distributed computing infrastructure.
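The idea of updating with mini-batch gradients computed at stale iterates can be sketched in a single-process simulation. This is a simplified illustration, not the paper's algorithm: the scalar problem (smooth loss 0.5 (x - ξ)^2 with ξ drawn from N(mu, 1), plus an l1 regularizer), the random bounded delay model, the step-size schedule, and all names (`async_minibatch_prox`, `soft_threshold`, `max_delay`) are assumptions chosen to keep the example short.

```python
import random
from collections import deque


def soft_threshold(v, t):
    """Proximal operator of t * |x|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def async_minibatch_prox(iters=5000, batch=8, max_delay=4,
                         mu=3.0, reg=1.0, seed=0):
    """Proximal mini-batch updates using gradients from delayed iterates."""
    rng = random.Random(seed)
    x = 0.0
    history = deque([x], maxlen=max_delay + 1)  # recent iterates, newest last
    for k in range(iters):
        # Simulated asynchrony: the gradient that arrives was evaluated
        # at an iterate that is `delay` updates old.
        delay = rng.randint(0, min(max_delay, len(history) - 1))
        x_stale = history[-1 - delay]
        # Mini-batch gradient of the smooth loss 0.5 * (x - xi)^2.
        grad = sum(x_stale - rng.gauss(mu, 1.0) for _ in range(batch)) / batch
        step = 1.0 / (k + 10)  # decreasing step size (hypothetical schedule)
        x = soft_threshold(x - step * grad, step * reg)
        history.append(x)
    return x


x_final = async_minibatch_prox()
```

With the default parameters the regularized minimizer is mu - reg = 2, and the delayed updates should end close to that value, which loosely mirrors the claim that bounded asynchrony has little effect on the asymptotic behavior.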