
Showing papers on "Dynamic programming published in 2019"


Journal ArticleDOI
TL;DR: The proposed ADPED algorithm can adapt to both day-ahead and intra-day operation under uncertainty and can make full use of the historical prediction error distribution to reduce the influence of inaccurate forecasts on system operation.
Abstract: This paper proposes an approximate dynamic programming (ADP)-based approach for the economic dispatch (ED) of a microgrid with distributed generations. The time-variant renewable generation, electricity price, and power demand are considered as stochastic variables in this paper. An ADP-based ED (ADPED) algorithm is proposed to optimally operate the microgrid under these uncertainties. To deal with the uncertainties, the Monte Carlo method is adopted to sample the training scenarios that give empirical knowledge to ADPED. A piecewise linear function (PLF) approximation with an improved slope updating strategy is employed for the proposed method. With sufficient information extracted from these scenarios and embedded in the PLF, the proposed ADPED algorithm can be used not only in day-ahead scheduling but also in the intra-day optimization process. The algorithm can make full use of the historical prediction error distribution to reduce the influence of inaccurate forecasts on system operation. Numerical simulations demonstrate the effectiveness of the proposed approach: the near-optimal decision obtained by ADPED is very close to the global optimum, and it adapts to both day-ahead and intra-day operation under uncertainty.
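
As an illustration of the kind of piecewise linear value-function approximation with slope updating that the abstract mentions, here is a minimal Python sketch; the segment width, step size, and the concavity-restoring projection rule are our own simplifying assumptions, not details taken from the paper.

```python
import numpy as np

N_SEG = 10                      # resource level (e.g., stored energy) split into segments
slopes = np.zeros(N_SEG)        # one slope per segment; concavity = non-increasing slopes

def update_slope(slopes, seg, sampled_marginal, stepsize=0.1):
    """Smooth a sampled marginal value into segment `seg`, then restore concavity
    with a simple 'leveling' projection (one of several possible projection rules)."""
    s = slopes.copy()
    s[seg] = (1 - stepsize) * s[seg] + stepsize * sampled_marginal
    v = s[seg]
    for i in range(seg - 1, -1, -1):     # slopes to the left must stay >= v
        if s[i] < v:
            s[i] = v
        else:
            break
    for i in range(seg + 1, N_SEG):      # slopes to the right must stay <= v
        if s[i] > v:
            s[i] = v
        else:
            break
    return s

def value(slopes, x, seg_width=1.0):
    """Evaluate the concave piecewise linear value function at resource level x."""
    full, frac = int(x // seg_width), x % seg_width
    full = min(full, N_SEG)
    tail = frac * slopes[full] if full < N_SEG else 0.0
    return seg_width * slopes[:full].sum() + tail

# one training step: a Monte Carlo scenario produced a marginal value of 3.2 at segment 4
slopes = update_slope(slopes, seg=4, sampled_marginal=3.2)
print(value(slopes, 4.5))
```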

198 citations


Journal ArticleDOI
TL;DR: An extension to SDDP—called stochastic dual dynamic integer programming (SDDiP)—for solving MSIP problems with binary state variables is proposed and it is shown that, under fairly reasonable assumptions, an MSIP problem with general state variables can be approximated by one with binary state variables to desired precision with only a modest increase in problem size.
Abstract: Multistage stochastic integer programming (MSIP) combines the difficulty of uncertainty, dynamics, and non-convexity, and constitutes a class of extremely challenging problems. A common formulation for these problems is a dynamic programming formulation involving nested cost-to-go functions. In the linear setting, the cost-to-go functions are convex polyhedral, and decomposition algorithms, such as nested Benders’ decomposition and its stochastic variant, stochastic dual dynamic programming (SDDP), which proceed by iteratively approximating these functions by cuts or linear inequalities, have been established as effective approaches. However, it is difficult to directly adapt these algorithms to MSIP due to the nonconvexity of integer programming value functions. In this paper we propose an extension to SDDP—called stochastic dual dynamic integer programming (SDDiP)—for solving MSIP problems with binary state variables. The crucial component of the algorithm is a new reformulation of the subproblems in each stage and a new class of cuts, termed Lagrangian cuts, derived from a Lagrangian relaxation of a specific reformulation of the subproblems in each stage, where local copies of state variables are introduced. We show that the Lagrangian cuts satisfy a tightness condition and provide a rigorous proof of the finite convergence of SDDiP with probability one. We show that, under fairly reasonable assumptions, an MSIP problem with general state variables can be approximated by one with binary state variables to desired precision with only a modest increase in problem size. Thus our proposed SDDiP approach is applicable to very general classes of MSIP problems. Extensive computational experiments on three classes of real-world problems, namely electric generation expansion, financial portfolio management, and network revenue management, show that the proposed methodology is very effective in solving large-scale multistage stochastic integer optimization problems.
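
A small hedged sketch of the binary approximation idea referenced above: a bounded continuous state variable is encoded, to a chosen precision, by binary variables so that the stage subproblems only carry binary state. The helper names and the precision parameter are illustrative, not the paper's notation.

```python
import math

def binarize(x, upper, eps):
    """Binary digits lam[0..K-1] with x ≈ eps * sum_k 2**k * lam[k], 0 <= x <= upper."""
    K = math.ceil(math.log2(upper / eps + 1))   # number of binary state variables needed
    q = round(x / eps)                          # nearest point on the eps-grid
    return [(q >> k) & 1 for k in range(K)]

def debinarize(lam, eps):
    return eps * sum(bit << k for k, bit in enumerate(lam))

lam = binarize(7.36, upper=10.0, eps=0.01)      # e.g., a reservoir level of 7.36
print(len(lam), debinarize(lam, 0.01))          # 10 binary variables, ~7.36 recovered
```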

196 citations


Journal ArticleDOI
TL;DR: A modular and tractable framework for solving an adaptive distributionally robust linear optimization problem, where the worst-case expected cost is minimized over an ambiguity set of probability distributions, and it is shown that the adaptive distributionally robust linear optimization problem can be formulated as a classical robust optimization problem.
Abstract: We develop a modular and tractable framework for solving an adaptive distributionally robust linear optimization problem, where we minimize the worst-case expected cost over an ambiguity set of probability distributions.

192 citations


Journal ArticleDOI
TL;DR: A novel dynamic energy management system is developed to incorporate efficient management of the energy storage system into MG real-time dispatch while considering power flow constraints and uncertainties in load, renewable generation and real-time electricity price.
Abstract: This paper focuses on the economical operation of a microgrid (MG) in real time. A novel dynamic energy management system is developed to incorporate efficient management of the energy storage system into MG real-time dispatch while considering power flow constraints and uncertainties in load, renewable generation and real-time electricity price. The developed dynamic energy management mechanism does not require long-term forecasts, long-term optimization, or knowledge of the uncertainty distributions, but can still optimize the long-term operational costs of MGs. First, the real-time scheduling problem is modeled as a finite-horizon Markov decision process over a day. Then, approximate dynamic programming and deep recurrent neural network learning are employed to derive a near-optimal real-time scheduling policy. Finally, using real power grid data from the California Independent System Operator, a detailed simulation study is carried out to validate the effectiveness of the proposed method.
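
To make the finite-horizon Markov decision process structure concrete, here is a toy Python sketch that solves a deterministic, discretized one-battery version of the dispatch problem by exact backward induction; the paper instead handles uncertainty and replaces the exact value table with ADP and a deep recurrent network, and all numbers below are invented.

```python
import numpy as np

T = 24                                                     # hours in the horizon
prices = 30 + 20 * np.sin(np.arange(T) / T * 2 * np.pi)    # toy price curve ($/MWh)
soc_levels = np.arange(0, 11)                              # battery state of charge, 0..10 MWh
actions = [-1, 0, 1]                                       # discharge 1 MWh, idle, charge 1 MWh

V = np.zeros((T + 1, len(soc_levels)))                     # terminal value = 0
policy = np.zeros((T, len(soc_levels)), dtype=int)         # greedy action per (hour, state)

for t in range(T - 1, -1, -1):                             # backward induction
    for i, soc in enumerate(soc_levels):
        best = -np.inf
        for a in actions:
            nxt = soc + a
            if nxt < 0 or nxt > 10:
                continue
            reward = -prices[t] * a                        # buying energy costs, selling earns
            q = reward + V[t + 1, nxt]
            if q > best:
                best, policy[t, i] = q, a
        V[t, i] = best

print(V[0, 5])                                             # value of starting the day half full
```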

155 citations


Journal ArticleDOI
TL;DR: An efficient DP-SH (dynamic programming with shooting heuristic as a subroutine) algorithm is proposed for the integrated optimization problem that can simultaneously optimize the trajectories of CAVs and intersection controllers, and a two-step approach is developed to effectively obtain near-optimal intersection and trajectory control plans.
Abstract: Connected and automated vehicle (CAV) technologies offer promising solutions to challenges that face today’s transportation systems. Vehicular trajectory control and intersection controller optimization based on CAV technologies are two approaches that have significant potential to mitigate congestion, lessen the risk of crashes, reduce fuel consumption, and decrease emissions at intersections. These two approaches should be integrated into a single process such that both aspects can be optimized simultaneously to achieve maximum benefits. This paper proposes an efficient DP-SH (dynamic programming with shooting heuristic as a subroutine) algorithm for the integrated optimization problem that can simultaneously optimize the trajectories of CAVs and intersection controllers (i.e., signal timing and phasing of traffic signals), and develops a two-step approach (DP-SH and trajectory optimization) to effectively obtain near-optimal intersection and trajectory control plans. The proposed DP-SH algorithm can also handle mixed traffic stream scenarios with different levels of CAV market penetration. Numerical experiments are conducted, and the results prove the efficiency and sound performance of the proposed optimization framework. Compared to adaptive signal control, the proposed DP-SH algorithm can reduce the average travel time by up to 35.72% and fuel consumption by up to 31.5%. In mixed traffic scenarios, system performance improves with increasing market penetration rates; even with low levels of penetration, there are significant benefits in fuel consumption savings. The computational efficiency, as evidenced in the case studies, indicates the applicability of DP-SH for real-time implementation.

155 citations


Journal ArticleDOI
TL;DR: The proposed energy management strategy, based on a double deep Q-learning algorithm, prevents the training process from falling into overoptimistic estimates of the policy value and shows significant advantages in terms of iterative convergence rate and optimization performance.
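
The core trick behind double (deep) Q-learning is to decouple action selection from action evaluation so that values are not systematically overestimated; below is a hedged tabular sketch of that update. The paper applies the idea with neural-network function approximation, and the sizes and rates here are arbitrary.

```python
import numpy as np

n_states, n_actions = 5, 3
rng = np.random.default_rng(0)
Q_a = np.zeros((n_states, n_actions))
Q_b = np.zeros((n_states, n_actions))

def double_q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Randomly pick which table to update; select the action with one table,
    evaluate it with the other, which suppresses overoptimistic value estimates."""
    global Q_a, Q_b
    if rng.random() < 0.5:
        a_star = np.argmax(Q_a[s_next])            # select with Q_a ...
        target = r + gamma * Q_b[s_next, a_star]   # ... evaluate with Q_b
        Q_a[s, a] += alpha * (target - Q_a[s, a])
    else:
        a_star = np.argmax(Q_b[s_next])
        target = r + gamma * Q_a[s_next, a_star]
        Q_b[s, a] += alpha * (target - Q_b[s, a])

double_q_update(s=0, a=1, r=1.0, s_next=2)
```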

143 citations


Journal ArticleDOI
TL;DR: A joint computation offloading and multiuser scheduling algorithm is proposed for an NB-IoT edge computing system that minimizes the long-term average weighted sum of delay and power consumption under stochastic traffic arrivals.
Abstract: The Internet of Things (IoT) connects a huge number of resource-constrained IoT devices to the Internet, which generate a massive amount of data that can be offloaded to the cloud for computation. As some of the applications may require very low latency, the emerging mobile edge computing (MEC) architecture offers cloud services by deploying MEC servers at the mobile base stations (BSs). The IoT devices can transmit the offloaded data to the BS for computation at the MEC server. Narrowband-IoT (NB-IoT) is a new cellular technology for the transmission of IoT data to the BS. In this paper, we propose a joint computation offloading and multiuser scheduling algorithm for an NB-IoT edge computing system that minimizes the long-term average weighted sum of delay and power consumption under stochastic traffic arrivals. We formulate the dynamic optimization problem as an infinite-horizon average-reward continuous-time Markov decision process (CTMDP) model. To deal with the curse-of-dimensionality problem, we use approximate dynamic programming techniques, i.e., linear value-function approximation and temporal-difference learning with a post-decision state and a semi-gradient descent method, to derive a simple algorithm for the solution of the CTMDP model. The proposed algorithm is semi-distributed: the offloading algorithm is performed locally at the IoT devices, while the scheduling algorithm is auction-based, with the IoT devices submitting bids to the BS, which makes the scheduling decision centrally. Simulation results show that the proposed algorithm provides significant performance improvement over the two baseline algorithms and the MUMTO algorithm, which is designed based on a deterministic task model.
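
A hedged sketch of the approximation machinery named in the abstract: a linear value function over state features updated by semi-gradient temporal-difference learning. The feature map, step size, and the discounted form are our simplifications; the paper works with an average-reward CTMDP and a post-decision state.

```python
import numpy as np

n_features = 8
w = np.zeros(n_features)                         # linear value-function weights

def features(state):
    """Placeholder feature map phi(s); the paper would build it from queue/channel state."""
    rng = np.random.default_rng(hash(state) % (2**32))
    return rng.standard_normal(n_features)

def td0_update(s, cost, s_next, alpha=0.05, gamma=0.95):
    """Semi-gradient TD(0): move w along phi(s) by the temporal-difference error."""
    global w
    phi, phi_next = features(s), features(s_next)
    td_error = cost + gamma * w @ phi_next - w @ phi
    w = w + alpha * td_error * phi

td0_update(s=("queue", 3), cost=2.0, s_next=("queue", 2))
```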

103 citations


Journal ArticleDOI
TL;DR: This work proposes a new type of decomposition algorithm, based on the recently proposed framework of stochastic dual dynamic integer programming (SDDiP), to solve the multistage stochastic unit commitment (MSUC) problem, and proposes a variety of computational enhancements to SDDiP.
Abstract: Unit commitment (UC) is a key operational problem in power systems for the optimal schedule of daily generation commitment. Incorporating uncertainty in this already difficult mixed-integer optimization problem introduces significant computational challenges. Most existing stochastic UC models consider either a two-stage decision structure, where the commitment schedule for the entire planning horizon is decided before the uncertainty is realized, or a multistage stochastic programming model with relatively small scenario trees to ensure tractability. We propose a new type of decomposition algorithm, based on the recently proposed framework of stochastic dual dynamic integer programming (SDDiP), to solve the multistage stochastic unit commitment (MSUC) problem. We propose a variety of computational enhancements to SDDiP, and conduct systematic and extensive computational experiments to demonstrate that the proposed method is able to handle elaborate stochastic processes and can solve MSUCs with a huge number of scenarios that are impossible to handle by existing methods.

99 citations


Journal ArticleDOI
TL;DR: This paper proposes an approximate dynamic programming (ADP) based algorithm for the real-time operation of a microgrid under uncertainties, which decomposes the original multi-time-period MINLP problem into single-time-period nonlinear programming problems.
Abstract: This paper proposes an approximate dynamic programming (ADP) based algorithm for the real-time operation of a microgrid under uncertainties. First, the optimal operation of the microgrid is formulated as a stochastic mixed-integer nonlinear programming (MINLP) problem, combining the ac power flow and the detailed operational characteristics of the battery. For this NP-hard problem, the proposed ADP-based energy management algorithm decomposes the original multi-time-period MINLP problem into single-time-period nonlinear programming problems. Thus, the sequential decisions can be made by solving Bellman's equation. Historical data are utilized offline to improve the optimality of the real-time decision, and the dependency on forecast information is reduced. Comparative numerical simulations with several existing methods demonstrate the effectiveness and efficiency of the proposed algorithm.

95 citations


Journal ArticleDOI
TL;DR: This work presents the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure; it leads to significantly more accurate predictions on the longest sequence families in the benchmark database, as well as improved accuracies for long-range base pairs.
Abstract: Motivation: Predicting the secondary structure of a ribonucleic acid (RNA) sequence is useful in many applications. Existing algorithms (based on dynamic programming) suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Results: We present a novel alternative O(n³)-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in O(n) time and O(n) space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5'-to-3') direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models. Availability and implementation: Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100,000 nt). Supplementary information: Supplementary data are available at Bioinformatics online.
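
The beam pruning idea is easy to see in isolation: scan left to right and keep only the b highest-scoring partial structures at each position, which is what turns the cubic-time search into a linear-time approximation. The sketch below is generic and assumes a user-supplied `expand` function; it is not the LinearFold code.

```python
import heapq

def beam_step(candidates, beam_size):
    """candidates: list of (score, state). Keep only the top `beam_size` by score."""
    return heapq.nlargest(beam_size, candidates, key=lambda c: c[0])

def scan(sequence, expand, beam_size=100):
    """Left-to-right scan: extend every surviving partial structure, then prune."""
    beam = [(0.0, None)]                        # (score, partial structure)
    for j, nucleotide in enumerate(sequence):
        candidates = []
        for score, state in beam:
            candidates.extend(expand(score, state, j, nucleotide))
        beam = beam_step(candidates, beam_size)
    return max(beam, key=lambda c: c[0])

# `expand` would enumerate the legal ways to extend a partial structure at position j;
# here a dummy that simply carries the score forward.
best = scan("GCAUCG", expand=lambda s, st, j, nt: [(s + 0.0, st)], beam_size=10)
```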

88 citations


Journal ArticleDOI
TL;DR: In the improved BAP, to speed up the solution of the pricing problem, a multi-vehicle approximate dynamic programming (MVADP) algorithm based on the labeling algorithm is developed; it reduces the number of labels by integrating the calculation of the pricing problems for all vehicle types.
Abstract: Heterogeneous fleet vehicles can be used to reduce carbon emissions. We propose an improved branch-and-price (BAP) algorithm to precisely solve the heterogeneous fleet green vehicle routing problem with time windows (HFGVRPTW). In the improved BAP, to speed up the solution for the pricing problem, we develop a multi-vehicle approximate dynamic programming (MVADP) algorithm that is based on the labeling algorithm. The MVADP algorithm reduces labels by integrating the calculation of pricing problems for all vehicle types. In addition, to rapidly obtain a tighter upper bound, we propose an integer branch method. For each branch, we solve the master problem with the integer constraint by the CPLEX solver using the columns produced by column generation. We retain the smaller of the obtained integer solution and the current upper bound, and the branches are thus reduced significantly. Extensive computational experiments were performed on the Solomon benchmark instances. The results show that the branches and computational time were reduced significantly by the improved BAP algorithm.

Journal ArticleDOI
01 Jan 2019-Energy
TL;DR: An approximate optimization method, called rapid dynamic programming (Rapid-DP), is developed and discussed in this paper, and is leveraged, for the first time, to optimize key powertrain parameters for power split hybrid electric vehicles.

Journal ArticleDOI
TL;DR: A novel hybrid modeling method combining recurrent neural networks and the Ornstein–Uhlenbeck process is developed to obtain accurate power models for both photovoltaic panels and loads; the energy management issue is then formulated as a stochastic optimal control problem and solved via a dynamic programming approach.
Abstract: In this paper, an energy management issue is considered for the energy Internet, where microgrids (MGs) are interconnected via energy routers (ERs). Focusing on an individual MG, we propose controllers in the microturbines (MTs) and the ER such that the following three criteria hold simultaneously. First, a bottom-up energy management approach is realized. Second, the operation cost of utilizing battery energy storage devices is minimized. Third, overcontrol of the MTs is avoided. In addition, we develop a novel hybrid modeling method combining recurrent neural networks and the Ornstein–Uhlenbeck process to obtain accurate power models for both photovoltaic panels and loads. Next, we formulate the energy management issue as a stochastic optimal control problem and solve it via a dynamic programming approach. Finally, examples illustrating the feasibility of the proposed methods are provided.
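
For the stochastic half of the hybrid model, an Ornstein–Uhlenbeck process can be simulated with a simple Euler–Maruyama scheme, as sketched below; the parameter values are made up and the coupling to the recurrent-network prediction is omitted.

```python
import numpy as np

def simulate_ou(x0, theta, mu, sigma, dt, n_steps, seed=0):
    """dX = theta*(mu - X) dt + sigma dW, discretised with step dt (Euler-Maruyama)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + theta * (mu - x[k]) * dt + sigma * dw
    return x

# e.g., mean-reverting fluctuation around a zero-mean prediction error, 4 hours at 1-min steps
path = simulate_ou(x0=0.0, theta=1.5, mu=0.0, sigma=0.3, dt=1/60, n_steps=240)
```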

Journal ArticleDOI
TL;DR: A policy iteration algorithm based on distributed asynchronous update mechanism is proposed to learn the coupled Hamilton–Jacobi–Bellman equations online and the measured data-based critic-actor neural networks are adopted to approximate the value functions and the control policies, respectively.
Abstract: This paper is concerned with data-driven distributed optimal consensus control for unknown multiagent systems (MASs) with input delays. The input-delayed MAS model is first converted into a delay-free form using a model reduction method. By establishing an equivalent relationship between the predesigned performance indices of the two MASs, optimal consensus control of the input-delayed MAS can be fully transformed into that of the delay-free MAS. Based on the coupled Hamilton–Jacobi equations and Bellman’s optimality principle, optimal consensus control policies are derived for the transformed delay-free MAS. Then a policy iteration algorithm based on a distributed asynchronous update mechanism is proposed to learn the coupled Hamilton–Jacobi–Bellman equations online. To implement the proposed data-driven adaptive dynamic programming algorithm, we adopt measured-data-based critic and actor neural networks to approximate the value functions and the control policies, respectively. Finally, a simulation example is given to illustrate the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: A novel adaptive dynamic programming (ADP) algorithm is developed to solve the optimal tracking control problem of discrete-time multi-agent systems, and an actor-critic neural network is used to approximate both the iterative control laws and the iterative performance index functions.

Journal ArticleDOI
TL;DR: A novel iterative adaptive dynamic programming (ADP) algorithm is developed to obtain the desired suboptimal solution with the help of an auxiliary quasi-HJB equation, and the convergence of the algorithm is investigated via intensive mathematical analysis.

Journal ArticleDOI
TL;DR: This paper addresses the energy maximization problem of wave energy converters subject to nonlinearities and constraints, and presents an efficient online control strategy based on the principle of adaptive dynamic programming (ADP) for solving the associated Hamilton–Jacobi–Bellman equation.
Abstract: In this paper, we address the energy maximization problem of wave energy converters (WECs) subject to nonlinearities and constraints, and present an efficient online control strategy based on the principle of adaptive dynamic programming (ADP) for solving the associated Hamilton–Jacobi–Bellman equation. To solve the derived constrained nonlinear optimal control problem, a critic neural network (NN) is used to approximate the time-dependent optimal cost value and then calculate the practical suboptimal causal control action. The proposed novel WEC control strategy leads to a simplified ADP framework without involving the widely used actor NN. The significantly improved computational efficacy of the proposed control makes it attractive for practical implementation on a WEC to achieve a reduced unit cost of energy output, which is especially important when the dynamics of a WEC are complicated and need to be described accurately by a high-order model with nonlinearities and constraints. Simulation results are provided to show the efficacy of the proposed control method.

Proceedings ArticleDOI
02 Jul 2019
TL;DR: It is proved that, for any given sampling strategy, the Maximum Age First (MAF) scheduling strategy provides the best age performance among all scheduling strategies.
Abstract: In this paper, we consider the problem of minimizing the age of information in a multi-source system, where samples are taken from multiple sources and sent to a destination via a channel with random delay. Due to interference, only one source can be scheduled at a time. We consider the problem of finding a decision policy that determines the sampling times and transmission order of the sources for minimizing the total average peak age (TaPA) and the total average age (TaA) of the sources. Our investigation of this problem results in an important separation principle: The optimal scheduling strategy and the optimal sampling strategy are independent of each other. In particular, we prove that, for any given sampling strategy, the Maximum Age First (MAF) scheduling strategy provides the best age performance among all scheduling strategies. This transforms our overall optimization problem into an optimal sampling problem, given that the decision policy follows the MAF scheduling strategy. While the zero-wait sampling strategy (in which a sample is generated once the channel becomes idle) is shown to be optimal for minimizing the TaPA, it does not always minimize the TaA. We use Dynamic Programming (DP) to investigate the optimal sampling problem for minimizing the TaA. Finally, we provide an approximate analysis of Bellman's equation to approximate the TaA-optimal sampling strategy by a water-filling solution which is shown to be very close to optimal through numerical evaluations.
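
The Maximum Age First rule itself is one line: whenever the channel becomes idle and a new sample is to be delivered, serve the source whose information at the destination is currently the oldest. A minimal sketch, with a data layout assumed by us:

```python
def maximum_age_first(ages):
    """ages[i] = current age of source i at the destination; return the source to serve."""
    return max(range(len(ages)), key=lambda i: ages[i])

# example: three sources whose last delivered samples are 4.2, 1.0 and 7.5 seconds old
print(maximum_age_first([4.2, 1.0, 7.5]))   # -> 2
```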

Journal ArticleDOI
TL;DR: The case study demonstrates that it is possible but time-consuming to solve the MTHS problem to optimality, and shows that a new type of cut, known as the strengthened Benders cut, contributes significantly to closing the optimality gap compared to classical Benders cuts.
Abstract: Hydropower producers rely on stochastic optimization when scheduling their resources over long periods of time. Due to its computational complexity, the optimization problem is normally cast as a stochastic linear program. In a future power market with more volatile power prices, it becomes increasingly important to capture parts of the hydropower operational characteristics that are not easily linearized, e.g., unit commitment and nonconvex generation curves. Stochastic dual dynamic programming (SDDP) is a state-of-the-art algorithm for long- and medium-term hydropower scheduling with a linear problem formulation. A recently proposed extension of the SDDP method, known as stochastic dual dynamic integer programming (SDDiP), has proven convergence also in the nonconvex case. We apply the SDDiP algorithm to the medium-term hydropower scheduling (MTHS) problem and elaborate on how to incorporate stagewise-dependent stochastic variables in the right-hand sides and the objective of the optimization problem. Finally, we demonstrate the capability of the SDDiP algorithm on a case study for a Norwegian hydropower producer. The case study demonstrates that it is possible but time-consuming to solve the MTHS problem to optimality. However, it also shows that a new type of cut, known as the strengthened Benders cut, contributes significantly to closing the optimality gap compared to classical Benders cuts.

Journal ArticleDOI
TL;DR: A load-adaptive rule-based control strategy is proposed that has a stronger capability of battery protection and energy saving under unknown load patterns, and can achieve near-optimal energy management in real time with low computational cost.

Journal ArticleDOI
TL;DR: The whole controller, consisting of a distributed adaptive feedforward tracking controller and a distributed optimal feedback controller, not only guarantees that all signals in the closed-loop system are uniformly ultimately bounded, but also guarantees that the cooperative cost function is minimized.
Abstract: This paper investigates the distributed optimal tracking control problem for nonlinear multiagent systems with a fixed directed graph. The dynamics of the followers are in strict-feedback form with unknown nonlinearities and input saturation. Fuzzy logic systems and an auxiliary system are introduced to identify the unknown nonlinearities and compensate for the effect of input saturation, respectively. Then, by using the command-filtered backstepping technique, the distributed optimal tracking control problem is transformed into a distributed optimal regulation problem of tracking error dynamics in affine form. Subsequently, the distributed optimal feedback controller is derived via the adaptive dynamic programming technique, in which a critic network is constructed to approximate the associated cost function online with a designed weight update law. Therefore, the whole controller, consisting of a distributed adaptive feedforward tracking controller and a distributed optimal feedback controller, not only guarantees that all signals in the closed-loop system are uniformly ultimately bounded, but also guarantees that the cooperative cost function is minimized. The effectiveness of the proposed method is demonstrated by simulation on the cooperative guidance problem of multimissile systems.

Journal ArticleDOI
TL;DR: A novel near-optimal control scheme for a class of unknown nonlinear continuous-time non-zero-sum (NZS) differential games is investigated and an identifier-critic architecture is developed to obtain the event-triggered controller.
Abstract: In this paper, by incorporating the event-triggered mechanism and the adaptive dynamic programming algorithm, a novel near-optimal control scheme for a class of unknown nonlinear continuous-time non-zero-sum (NZS) differential games is investigated. First, a generalized fuzzy hyperbolic model based identifier is established, using only the input–output data, to relax the requirement for the complete system dynamics. Then, under the event-based framework, the coupled Hamilton–Jacobi equations are derived for the multiplayer NZS games. Next, the adaptive critic design method is employed to approximate the optimal control policies; thus, an identifier-critic architecture is developed to obtain the event-triggered controller. By virtue of Lyapunov theory, a state-dependent triggering condition, which is different from those in existing works, is developed to achieve the stability of the closed-loop control system for both the continuous and jump dynamics. Finally, two numerical examples are simulated to substantiate the feasibility of the analytical design.

Journal ArticleDOI
TL;DR: The convergence of the MsHDP algorithm is proved by demonstrating that it converges to the solution of the Bellman equation.
Abstract: In this paper, the optimal output tracking control problem of discrete-time nonlinear systems is considered. First, the augmented system is derived and the tracking control problem is converted to a regulation problem with a discounted performance index, which relies on the solution of the Bellman equation. It is known that policy iteration and value iteration are two classical algorithms for solving the Bellman equation. Through analysis of the two algorithms, it is found that policy iteration converges quickly but requires an initial admissible control policy, while value iteration avoids the requirement of an initial admissible control policy but converges slowly. To achieve a tradeoff between policy iteration and value iteration, the multistep heuristic dynamic programming (MsHDP) is proposed by using a multistep policy evaluation scheme. The convergence of the MsHDP algorithm is proved by demonstrating that it converges to the solution of the Bellman equation. Subsequently, a neural network-based actor-critic structure is developed to implement the MsHDP algorithm. The effectiveness and advantages of the developed MsHDP method are validated through comparative simulation studies.
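
A hedged tabular sketch of the multistep evaluation idea behind MsHDP: the new value estimate backs up h applications of the current policy's Bellman operator and then bootstraps on the previous estimate, so h = 1 resembles value iteration's cheap evaluation while large h approaches full policy evaluation. The toy MDP, names, and parameters below are ours, not the paper's.

```python
import numpy as np

def multistep_evaluation(P, R, policy, V, gamma=0.95, h=3):
    """P[a][s, s'] transition probs, R[a][s] rewards, policy[s] action, V previous values."""
    V_new = V.copy()
    for _ in range(h):                        # h applications of the policy's Bellman operator
        V_next = np.empty_like(V_new)
        for s, a in enumerate(policy):
            V_next[s] = R[a][s] + gamma * P[a][s] @ V_new
        V_new = V_next
    return V_new

n = 4
P = [np.full((n, n), 1 / n), np.eye(n)]       # two toy actions: diffuse or stay
R = [np.ones(n), np.zeros(n)]
V = multistep_evaluation(P, R, policy=[0, 0, 1, 1], V=np.zeros(n))
```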

Journal ArticleDOI
TL;DR: This work studies the traffic signal control problem with connected vehicles by assuming a fixed cycle length and proposes a two-step method to make sure that the obtained optimal solution can lead to the fixed cycle length.
Abstract: We study the traffic signal control problem with connected vehicles by assuming a fixed cycle length so that the proposed model can be extended readily for the coordination of multiple signals. The problem can be first formulated as a mixed-integer nonlinear program, by considering the information of individual vehicle’s trajectories (i.e., second-by-second vehicle locations and speeds) and their realistic driving/car-following behaviors. The objective function is to minimize the weighted sum of total fuel consumption and travel time. Due to the large dimension of the problem and the complexity of the nonlinear car-following model, solving the nonlinear program directly is challenging. We then reformulate the problem as a dynamic programming model by dividing the timing decisions into stages (one stage for a signal phase) and approximating the fuel consumption and travel time of a stage as functions of the state and decision variables of the stage. We also propose a two-step method to make sure that the obtained optimal solution can lead to the fixed cycle length. Numerical experiments are provided to test the performance of the proposed model using data generated by traffic simulation.
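
A minimal sketch of the stage-wise dynamic program described above: one stage per signal phase, the state is the green time still available in the fixed cycle, the decision is the phase's green duration, and the stage cost stands in for the approximated fuel-plus-travel-time function. All numbers and the cost model are placeholders of ours, not the paper's.

```python
import math
from functools import lru_cache

CYCLE = 60                       # fixed cycle length (s)
PHASES = 4
G_MIN, G_MAX = 5, 40             # allowed green duration per phase (s)

def stage_cost(phase, green):
    """Placeholder for the approximated fuel + travel-time cost of one phase."""
    return (green - 10 - 3 * phase) ** 2

@lru_cache(maxsize=None)
def best(phase, time_left):
    """Minimum remaining cost given the green time still unallocated in the cycle."""
    if phase == PHASES:
        return 0.0 if time_left == 0 else math.inf   # the whole cycle must be used
    cost = math.inf
    for g in range(G_MIN, min(G_MAX, time_left) + 1):
        cost = min(cost, stage_cost(phase, g) + best(phase + 1, time_left - g))
    return cost

print(best(0, CYCLE))
```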

Journal ArticleDOI
TL;DR: The results revealed the superior performance of the branch and bound, dynamic programming, and hybrid genetic algorithm with simulated annealing methods over all the compared algorithms, and indicated that the hybrid algorithm can be applied as an alternative to solve small- and large-sized 0–1 knapsack problems.
Abstract: In this paper, we present some initial results of several meta-heuristic optimization algorithms, namely, genetic algorithms, simulated annealing, branch and bound, dynamic programming, a greedy search algorithm, and a hybrid genetic algorithm-simulated annealing, for solving 0-1 knapsack problems. Each algorithm is designed in such a way that it penalizes infeasible solutions and optimizes the feasible solution. The experiments are carried out using both low-dimensional and high-dimensional knapsack problems. The numerical results of the hybrid algorithm are compared with the results achieved by the individual algorithms. The results revealed the superior performance of the branch and bound, dynamic programming, and hybrid genetic algorithm with simulated annealing methods over all the compared algorithms. This performance was established by taking into account both the algorithm computational time and the solution quality. In addition, the obtained results also indicated that the hybrid algorithm can be applied as an alternative to solve small- and large-sized 0-1 knapsack problems.
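
For reference, the textbook dynamic program for the 0-1 knapsack problem that the comparison includes looks like this (the instance data is invented):

```python
def knapsack(values, weights, capacity):
    """Return the best total value using each item at most once (O(n * capacity) table)."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # iterate downwards so each item is used once
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50))   # -> 220
```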

Journal ArticleDOI
TL;DR: In the paper “Robust Dual Dynamic Programming,” Angelos Georghiou, Angelos Tsoukalas, and Wolfram Wiesemann propose a novel solution scheme for addressing planning problems with long horizons.
Abstract: In the paper “Robust Dual Dynamic Programming,” Angelos Georghiou, Angelos Tsoukalas, and Wolfram Wiesemann propose a novel solution scheme for addressing planning problems with long horizons. Such...

Journal ArticleDOI
TL;DR: This paper investigates the robust control issues of nonlinear multiplayer systems by utilizing adaptive dynamic programming methods and fills a gap in the ADP field, where actuator uncertainties for multiplayer systems are still not addressed.
Abstract: This paper investigates the robust control issues of nonlinear multiplayer systems by utilizing adaptive dynamic programming (ADP) methods and fills a gap in the ADP field, where actuator uncertainties for multiplayer systems are still not addressed. Two types of actuator uncertainties including bounded nonlinear perturbation and unknown constant actuator fault are taken into consideration. First, a data-driven reinforcement learning (RL) approach is derived to learn the optimal solutions of multiplayer nonzero-sum games. Then, based on the obtained optimal control policies, two robust control schemes are developed to handle these two different types of uncertainties, respectively, and the associated stability analysis is also provided. To implement the proposed iterative RL approach, a single neural network (NN) architecture with least-square-based updating law is given, which reduces the computation burden compared with the traditional dual NN architecture. Finally, two numerical examples are shown to test the feasibility of our proposed schemes.

Journal ArticleDOI
TL;DR: A new output feedback-based Q-learning approach is presented for solving the linear quadratic regulation (LQR) control problem for discrete-time systems, and it is shown that the proposed algorithms converge to the solution of the LQR Riccati equation.
Abstract: Approximate dynamic programming (ADP) and reinforcement learning (RL) have emerged as important tools in the design of optimal and adaptive control systems. Most of the existing RL and ADP methods make use of full-state feedback, a requirement that is often difficult to satisfy in practical applications. As a result, output feedback methods are more desirable as they relax this requirement. In this paper, we present a new output feedback-based Q-learning approach to solving the linear quadratic regulation (LQR) control problem for discrete-time systems. The proposed scheme is completely online in nature and works without requiring the system dynamics information. More specifically, a new representation of the LQR Q-function is developed in terms of the input–output data. Based on this new Q-function representation, output feedback LQR controllers are designed. We present two output feedback iterative Q-learning algorithms based on the policy iteration and the value iteration methods. This scheme has the advantage that it does not incur any excitation noise bias, and therefore, the need of using discounted cost functions is circumvented, which in turn ensures closed-loop stability. It is shown that the proposed algorithms converge to the solution of the LQR Riccati equation. A comprehensive simulation study is carried out, which illustrates the proposed scheme.

Journal ArticleDOI
Ji Li, Quan Zhou, Yinglong He, Bin Shuai, Ziyang Li, Huw Williams, Hongming Xu
TL;DR: An online predictive control strategy for series-parallel plug-in hybrid electric vehicles (PHEVs) is investigated, resulting in a novel online optimization methodology, named dual-loop online intelligent programming (DOIP), for velocity prediction and energy-flow control.

Journal ArticleDOI
01 Dec 2019-Energy
TL;DR: A hybrid optimization method that combines the genetic algorithm (GA) and dynamic programming (DP) is proposed; it increases the overall performance, and its computing time is acceptable for the scheduling of the energy system.