
Showing papers on "Optimal control published in 2023"


Journal ArticleDOI
TL;DR: In this article, an event-triggered adaptive dynamic programming (ETADP) algorithm is proposed to study the optimal decentralized control issue of interconnected nonlinear systems subject to stochastic dynamics.

15 citations


Journal ArticleDOI
TL;DR: In this article, an interval uncertainty-oriented optimal control method based on the linear quadratic regulator (LQR) is proposed for spacecraft attitude control; it balances minimization of the optimal control cost function against state-vector fluctuation by treating these two interval indices as constraints in the optimal control problem.
Abstract: Uncertainty-oriented optimal attitude control of spacecraft subject to complex space environments and multi-source uncertainties is a research hotspot. Considering that the uncertain parameters in the control system are difficult to quantify, this study proposes an interval uncertainty-oriented optimal control method based on the linear quadratic regulator (LQR) for spacecraft attitude control. The interval state-space equation of the spacecraft attitude dynamics with uncertain controlled feedback gain is constituted by expanding the deterministic model into an order-extended interval matrix format. Based on the interval uncertainty propagation method, the interval-based Riccati equation in LQR is proposed using the modified interval estimation method. Therefore, the interval-controlled feedback gain and interval cost function can be obtained, and the overestimation attributed to interval expansion can be avoided. The interval-based reliability is investigated using the state-threshold interference model, and an interval-based safety index is developed. An interval uncertainty-based multi-objective optimal control model with constraints is proposed to balance minimization of the optimal control cost function against state-vector fluctuation by treating these two interval indices as constraints in the optimal control problem. A flowchart and a numerical example of satellite attitude control are presented to demonstrate the effectiveness of the method.
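As background, the deterministic LQR baseline that the interval method extends can be sketched as follows. This is a minimal illustration, not the paper's interval formulation: the double-integrator plant (a stand-in for a single rigid-body attitude axis) and the identity cost weights are assumptions for the example, and the Riccati equation is solved via the stable invariant subspace of the Hamiltonian matrix.

```python
import numpy as np

def lqr(A, B, Q, R):
    """Continuous-time LQR gain via the stable invariant subspace of the
    Hamiltonian matrix (equivalent to solving the algebraic Riccati equation)."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T],
                  [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]                 # eigenvectors of stable eigenvalues
    P = np.real(stable[n:] @ np.linalg.inv(stable[:n]))
    K = Rinv @ B.T @ P
    return K, P

# Double integrator as a stand-in for a single attitude axis (assumed model).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
K, P = lqr(A, B, Q=np.eye(2), R=np.array([[1.0]]))
print(K)  # analytic solution for this plant: [[1, sqrt(3)]]
```

The interval method in the paper propagates parameter intervals through this same Riccati structure rather than solving it for a single nominal model.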

11 citations


Journal ArticleDOI
31 Jan 2023-Symmetry
TL;DR: In this paper, a mathematical model of the deadly COVID-19 pandemic is formulated to understand the dynamic behavior of COVID-19, and the positivity and boundedness of the solutions are proved using the fractional-order properties of the Laplace transformation.
Abstract: In this manuscript, we formulate a mathematical model of the deadly COVID-19 pandemic to understand its dynamic behavior. For the dynamic study, a new SEIAPHR fractional model is proposed in which infectious individuals are divided into three sub-compartments. The purpose is to construct a more reliable and realistic model for a complete mathematical and computational analysis and the design of different control strategies for the proposed Caputo–Fabrizio fractional model. We prove the existence and uniqueness of solutions by employing well-known theorems of fractional calculus and functional analysis. The positivity and boundedness of the solutions are proved using the fractional-order properties of the Laplace transformation. The basic reproduction number for the model is computed using the next-generation technique to handle the future dynamics of the pandemic. The local and global stability of the model is also investigated at each equilibrium point. We propose basic fixed controls through manipulation of quarantine rates and formulate an optimal control problem to find the best controls (quarantine rates) to employ on infected, asymptomatic, and “superspreader” humans, respectively, to restrict the spread of the disease. For the numerical solution of the fractional model, a computationally efficient Adams–Bashforth method is presented. A fractional-order optimal control problem and the associated optimality conditions of Pontryagin's maximum principle are discussed in order to optimally reduce the number of infected, asymptomatic, and superspreader humans. The obtained numerical results are discussed and shown through graphs.
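As a minimal illustration of the next-generation technique mentioned above, the following computes the basic reproduction number for a classical integer-order SEIR model. The compartments and parameter values are assumptions for this sketch, not the paper's SEIAPHR model: R0 is the spectral radius of F V^{-1}, where F collects new-infection rates and V the transition rates between infected compartments.

```python
import numpy as np

# Next-generation matrix for a plain SEIR model (illustrative parameters).
beta, sigma, gamma = 0.5, 0.2, 0.25
F = np.array([[0.0, beta],
              [0.0, 0.0]])            # new infections enter E at rate beta*I
V = np.array([[sigma, 0.0],
              [-sigma, gamma]])       # E -> I at rate sigma, I recovers at rate gamma

R0 = max(abs(np.linalg.eigvals(F @ np.linalg.inv(V))))  # spectral radius of F V^{-1}
print(R0)  # for SEIR this reduces to beta/gamma = 2.0
```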

10 citations


Journal ArticleDOI
TL;DR: In this article, a distributed fuzzy optimal consensus control problem for state-constrained nonlinear strict-feedback systems under an identifier-actor-critic architecture is investigated, where a fuzzy identifier is designed to approximate each agent's unknown nonlinear dynamics.
Abstract: This article investigates the distributed fuzzy optimal consensus control problem for state-constrained nonlinear strict-feedback systems under an identifier-actor-critic architecture. First, a fuzzy identifier is designed to approximate each agent's unknown nonlinear dynamics. Then, by defining multiple barrier-type local optimal performance indexes for each agent, the optimal virtual and actual control laws are obtained, where two fuzzy-logic systems working as the actor network and critic network are used to execute control behavior and evaluate control performance, respectively. It is proved that the proposed control protocol can drive all agents to reach consensus without violating state constraints, and make the local performance indexes reach the Nash equilibrium simultaneously. Simulation studies are given to verify the effectiveness of the developed fuzzy optimal consensus control approach.

8 citations


Journal ArticleDOI
TL;DR: In this article, the authors used parallel control to investigate the problem of event-triggered near-optimal control (ETNOC) for unknown discrete-time (DT) nonlinear systems.
Abstract: This article uses parallel control to investigate the problem of event-triggered near-optimal control (ETNOC) for unknown discrete-time (DT) nonlinear systems. First, to achieve parallel control, an augmented nonlinear system (ANS) with an augmented performance index (API) is proposed to introduce the control input into the feedback system. The control stability relationship between the ANS and the original system is analyzed, and it is shown that, by choosing a proper API, optimal control of the ANS with the API can be seen as near-optimal control of the original system with the original performance index (OPI). Second, based on parallel control, a novel event-triggered scheme is proposed, and then a novel ETNOC method is developed using the time-triggered optimal value function of the ANS with the API. The control stability is proved, and an upper bound, which is related to the design parameter, is provided for the actual performance index in advance. Then, to implement the developed ETNOC method for unknown DT nonlinear systems, a novel online learning algorithm is developed without reconstructing unknown systems, and neural network (NN) and adaptive dynamic programming (ADP) techniques are employed in the developed algorithm. The convergence of the signals in the closed-loop system (CLS) is shown using the Lyapunov approach, and the assumption of boundedness of input dynamics is not required. Finally, two simulations justify the theoretical conjectures.

8 citations


Journal ArticleDOI
TL;DR: In this paper , the authors reviewed the recent progress in model identification-based learning and optimal control and its applications to multi-agent systems (MASs) and expounded the current applications of model identificationbased adaptive dynamic programming (ADP) methods in the fields of single-agent system (SAS) and MASs.
Abstract: This paper reviews recent progress in model identification-based learning and optimal control and its applications to multi-agent systems (MASs). First, a class of learning-based optimal control methods, namely adaptive dynamic programming (ADP), is introduced, and existing results using ADP methods to solve optimal control problems are reviewed. Then, the paper investigates various model identification methods and analyzes the feasibility of combining them with ADP to solve the optimal control of unknown systems. In addition, the paper expounds on the current applications of model identification-based ADP methods in the fields of single-agent systems (SASs) and MASs. Finally, conclusions and future directions are presented.
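The core ADP idea covered by such surveys can be sketched in its simplest form: policy iteration for a scalar discrete-time LQR problem. This is a generic model-based illustration under assumed scalar dynamics, not any specific algorithm from the review; the scalar case lets policy evaluation be done in closed form.

```python
# Policy iteration for scalar discrete-time LQR (illustrative dynamics).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
k = 0.5                                # any stabilizing initial gain (|a - b*k| < 1)
for _ in range(50):
    acl = a - b * k
    # Policy evaluation: solve p = acl^2 * p + q + r * k^2 in closed form.
    p = (q + r * k * k) / (1.0 - acl * acl)
    # Policy improvement.
    k = b * p * a / (r + b * p * b)

# The fixed point is the discrete Riccati solution, here the golden ratio.
print(round(p, 6), round(k, 6))  # 1.618034 0.618034
```

Model-free ADP variants replace the closed-form evaluation step with estimates learned from input/state data.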

7 citations


Journal ArticleDOI
TL;DR: In this article, a new deterministic SEIHR fractional model is developed for the first time, and an optimal control problem is formulated to find the best controls for the quarantine and hospitalization strategies employed on exposed and infected humans, respectively.
Abstract: In this manuscript, we have studied the dynamical behavior of the deadly COVID-19 pandemic, which has caused frustration in the human community. For this study, a new deterministic SEIHR fractional model is developed for the first time. The purpose is to perform a complete mathematical analysis and design an optimal control strategy for the proposed Caputo–Fabrizio fractional model. We prove the existence and uniqueness of solutions by employing the principle of mathematical induction. The positivity and boundedness of the solutions are proved using comprehensive mathematical techniques. Two main equilibrium points of the pandemic model are stated. The basic reproduction number for the model is computed using the next-generation technique to handle the future dynamics of the pandemic. We develop an optimal control problem to find the best controls for the quarantine and hospitalization strategies employed on exposed and infected humans, respectively. For the numerical solution of the fractional model, we implemented the Adams–Bashforth method to demonstrate the importance of the fractional order. A general fractional-order optimal control problem and the associated optimality conditions of Pontryagin type are discussed, with the goal of minimizing the number of exposed and infected humans. The extremals are obtained numerically.

6 citations


Journal ArticleDOI
TL;DR: In this article, an adaptive rolling planning method for the state of charge trajectory of the plug-in hybrid-electric bus (PHEB) power battery is proposed, which can reduce fuel consumption by 0.13 L/100 km.
Abstract: The state of charge (SOC) trajectory planning for the power battery is the basis for realizing globally online-optimal control of the plug-in hybrid-electric bus (PHEB) energy management system. This article proposes an adaptive rolling planning method for the SOC trajectory of the PHEB power battery. First, a mathematical model for simulation research is established. Second, the driving cycles of PHEBs are collected, and the optimal SOC trajectories of the driving cycles are obtained using dynamic programming (DP). To provide training data for the planning model of the optimal SOC trajectory of a trip segment (TS), the driving cycle and its optimal SOC trajectory are cut into segments with a time length of 30 s. Incremental learning is used to construct a planning model of the TS's optimal SOC trajectory and to plan the optimal SOC trajectory of the actual TS. The optimal SOC trajectory of the actual TS is applied to proportional–integral–derivative (PID) control and compared with PID based on mileage-based SOC planning (BMP). The results show that the proposed method can reduce fuel consumption by 0.8 L/100 km for the same driving cycle. Through incremental learning, fuel consumption can be further reduced by 0.13 L/100 km.
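The DP step above can be sketched with a toy backward recursion over a discretized SOC grid. The power-demand profile, grid resolution, and fuel model here are illustrative assumptions, not the paper's PHEB model; the point is the structure of the recursion that yields a DP-optimal SOC trajectory.

```python
# Toy backward dynamic program over a discretized SOC grid.
demand = [3, 1, 4, 2]            # power demanded at each trip segment (assumed)
soc_grid = range(0, 6)           # discrete SOC levels 0..5
actions = [-1, 0, 1]             # SOC change per segment: discharge/idle/charge

T = len(demand)
cost = {s: 0.0 for s in soc_grid}          # terminal cost: none
policy = [{} for _ in range(T)]
for t in reversed(range(T)):
    new_cost = {}
    for s in soc_grid:
        best = None
        for a in actions:
            s2 = s + a
            if s2 not in soc_grid:
                continue
            # Discharging (a = -1) offsets demand; charging (a = +1) adds load.
            fuel = max(demand[t] + a, 0)
            c = fuel + cost[s2]
            if best is None or c < best[0]:
                best = (c, a)
        new_cost[s], policy[t][s] = best
    cost = new_cost

# Roll the optimal SOC trajectory forward from an initial SOC of 3.
soc = 3
traj = [soc]
for t in range(T):
    soc += policy[t][soc]
    traj.append(soc)
print(traj, cost[3])  # [3, 2, 1, 0, 0] 7
```

The paper replaces this offline recursion with an incrementally learned planner so the trajectory can be produced online for previously unseen trip segments.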

5 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigated the optimal output tracking problem for linear discrete-time systems with unknown dynamics using reinforcement learning (RL) and robust output regulation theory, and proposed an off-policy RL algorithm using only the measured output data along the trajectory of the system and the reference output.
Abstract: In this article, we investigate the optimal output tracking problem for linear discrete-time systems with unknown dynamics using reinforcement learning (RL) and robust output regulation theory. This output tracking problem permits the use of only the outputs of the reference system and the controlled system, rather than their states, and thus differs from most existing works that depend on the state of the system. The optimal tracking problem is formulated as a linear quadratic regulation problem by proposing a family of dynamic discrete-time controllers. Then, it is shown that solving the output tracking problem is equivalent to solving the output regulation equations, whose solution, however, requires knowledge of the complete and accurate system dynamics. To remove such a requirement, an off-policy RL algorithm is proposed that uses only the measured output data along the trajectory of the system and the reference output. By introducing the re-expression error and analyzing the rank condition of the parameterization matrix, we ensure the uniqueness of the proposed RL-based optimal control via output feedback.

5 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed two novel notions, referred to as sample identifying complexity and sample deciphering time, in an encrypted control framework: the former explicitly captures the relation between the dynamical characteristics of control systems and the level of identifiability of the systems, while the latter shows the relation between the computation time for the identification and the key length of a cryptosystem.
Abstract: In the state-of-the-art literature on cryptography and control theory, there has been no systematic methodology for constructing cyber–physical systems that can achieve the desired control performance while being protected against eavesdropping attacks. In this article, we tackle this challenging problem. We first propose two novel notions referred to as sample identifying complexity and sample deciphering time in an encrypted control framework. The former explicitly captures the relation between the dynamical characteristics of control systems and the level of identifiability of the systems, while the latter shows the relation between the computation time for the identification and the key length of a cryptosystem. Based on these two tractable new notions, we propose a systematic method for designing both an optimal key length, which prevents system identification with a given precision within a given life span of the system, and an optimal controller, which maximizes both the control performance and the difficulty of the identification. The efficiency of the proposed method in terms of security level and real-time performance is investigated through numerical simulations. To the best of our knowledge, this article is the first to connect the security of cryptography and dynamical systems from a control-theoretic perspective.

5 citations


Journal ArticleDOI
TL;DR: In this paper, a time-coarsening strategy for model predictive control (MPC) is proposed to overcome the computational challenges associated with optimal control problems that span multiple timescales.
Abstract: We analyze a time-coarsening strategy for model predictive control (MPC) that we call diffusing-horizon MPC. This strategy seeks to overcome the computational challenges associated with optimal control problems that span multiple timescales. The coarsening approach uses a time discretization grid that becomes exponentially more sparse as one moves forward in time. This design is motivated by a recently established property of optimal control problems that is known as exponential decay of sensitivity (EDS). This property states that the impact of a parametric perturbation at a future time decays exponentially as one moves backward in time. We establish conditions under which this property holds for a constrained MPC formulation with linear dynamics and costs. Moreover, we show that the proposed coarsening scheme can be cast as a parametric perturbation of the MPC problem and, thus, the exponential decay condition holds. We use an HVAC plant case study with real data to demonstrate the proposed approach. Specifically, we show that computational times can be reduced by two orders of magnitude while increasing the closed-loop cost by only 3%.
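The exponentially coarsening grid at the heart of this strategy can be sketched as follows. This is a minimal illustration: the step-doubling rule and the horizon value are assumptions, not the paper's exact grid construction.

```python
# Build a time grid whose step size doubles as one moves into the future,
# so near-term dynamics are resolved finely and distant ones coarsely.
def diffusing_grid(horizon, base_step=1):
    grid, t, step = [0], 0, base_step
    while t < horizon:
        t = min(t + step, horizon)
        grid.append(t)
        step *= 2          # exponential coarsening
    return grid

print(diffusing_grid(100))  # [0, 1, 3, 7, 15, 31, 63, 100]
```

A 100-step horizon collapses from 101 uniform grid points to 8, which is the source of the reported order-of-magnitude computational savings.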

Journal ArticleDOI
TL;DR: In this paper, an adaptive dynamic programming (ADP) controller is designed for optimal feedback control and combined with steady-state control to realize trajectory-tracking guidance.

Journal ArticleDOI
TL;DR: In this article, a capacity maximization scheme based on a double deep Q-network (DDQN) is proposed to maximize the system capacity under the UAV energy consumption constraint.
Abstract: Reconfigurable Intelligent Surface (RIS) technology has grown rapidly due to the performance improvement it offers wireless networks, and the integration of unmanned aerial vehicles (UAVs) and RIS has obtained widespread attention. In this paper, the downlink of non-orthogonal multiple access (NOMA) UAV networks equipped with RIS is considered. The objective is to optimize the UAV trajectory together with the RIS phase shifts to maximize the system capacity under the UAV energy consumption constraint. Using deep reinforcement learning, a capacity maximization scheme under energy consumption constraints based on the double deep Q-network (DDQN) is proposed, and the joint optimization of the UAV trajectory and the RIS phase shift design is achieved by the DDQN algorithm. The numerical results show that the proposed optimization scheme increases the system capacity of the RIS-UAV-assisted NOMA networks.
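The double-DQN target that distinguishes DDQN from plain DQN can be sketched as follows. This is a generic illustration with made-up Q-values, not the paper's RIS-UAV implementation: the online network selects the next action while the target network evaluates it, which reduces the overestimation bias of a single-network maximum.

```python
import numpy as np

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))         # select with the online network
    return reward + gamma * q_target_next[a_star]  # evaluate with the target network

q_online = np.array([1.0, 3.0, 2.0])   # online Q-values at the next state (assumed)
q_target = np.array([0.5, 1.5, 4.0])   # target-network Q-values at the same state
print(ddqn_target(1.0, q_online, q_target))  # 1 + 0.99 * 1.5, approx. 2.485
```

Note that a plain DQN target would use max(q_target) = 4.0 here; decoupling selection from evaluation picks the 1.5 instead.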

Journal ArticleDOI
TL;DR: In this article, the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises was studied using reinforcement learning techniques, and a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, was proposed.
Abstract: This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to find iteratively near-optimal policies of the adaptive optimal stationary control problem directly from input/state data without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.

Journal ArticleDOI
TL;DR: In this article, a novel conic input mapping (CIM) design methodology was proposed to incorporate the online process data into the optimal and robust optimal ILC designs, respectively.
Abstract: In this article, we study optimal iterative learning control (ILC) for constrained systems with bounded uncertainties via a novel conic input mapping (CIM) design methodology. Due to the limited understanding of the process of interest, modeling uncertainties are generally inevitable, significantly reducing the convergence rate of the control systems. However, huge amounts of measured process data interacting with the model uncertainties can easily be collected. Incorporating these data into the optimal controller design could unlock new opportunities to reduce the error of the current trial's optimization. Building on several existing optimal ILC methods, we incorporate the online process data into the optimal and robust optimal ILC designs, respectively. Our methodology, called CIM, utilizes the process data for the first time by applying convex cone theory and maps the data into the design of the control inputs. CIM-based optimal ILC and robust optimal ILC methods are developed for uncertain systems to achieve better control performance and a faster convergence rate. Next, rigorous theoretical analyses of the two methods are presented. Finally, two illustrative numerical examples are provided to validate our methods with improved performance.

Journal ArticleDOI
TL;DR: In this paper, the authors extend the PINN framework to PDE-constrained optimal control problems, for which the governing PDE is fully known and the goal is to find a control variable that minimizes a desired cost objective.

Journal ArticleDOI
TL;DR: In this paper, the authors developed a novel cost function (performance index function) to overcome the obstacles in solving the optimal tracking control problem for a class of nonlinear systems with known system dynamics via the adaptive dynamic programming technique.
Abstract: This article develops a novel cost function (performance index function) to overcome the obstacles in solving the optimal tracking control problem for a class of nonlinear systems with known system dynamics via the adaptive dynamic programming (ADP) technique. For traditional optimal control problems, the assumption that the controlled system has zero equilibrium is generally required to guarantee the finiteness of an infinite-horizon cost function and a unique solution. In order to solve the optimal tracking control problem of nonlinear systems with nonzero equilibrium, a specific cost function related to the tracking errors and their derivatives is designed in this article, in which the aforementioned assumption and related obstacles are removed and the controller design process is simplified. Finally, comparative simulations are conducted on an inverted pendulum system to illustrate the effectiveness and advantages of the proposed optimal tracking control strategy.

Journal ArticleDOI
TL;DR: In this paper, an optimal control model for the transmission dynamics of COVID-19 is investigated, and important model properties such as nonnegativity and boundedness of solutions, as well as the region of invariance, are established.
Abstract: In this paper, an optimal control model for the transmission dynamics of COVID-19 is investigated. We established important model properties such as nonnegativity and boundedness of solutions, as well as the region of invariance. Further, an expression for the basic reproduction number is computed, and a sensitivity analysis with respect to the model parameters is carried out to identify the most sensitive parameter. Based on the sensitivity analysis, optimal control strategies are presented to reduce the disease burden and related costs. It is demonstrated that an optimal control exists and is unique. The characterization of optimal trajectories is analytically studied via Pontryagin's Minimum Principle. Moreover, various simulations were performed to support the analytical results. The simulation results showed that the proposed controls significantly reduce the disease burden compared to the no-control case. Further, they reveal that the applied control strategies are effective throughout the intervention period in reducing COVID-19 in the community. Besides, the simulation results of the optimal control suggest that concurrently applying all control strategies outperforms any other preventive measure in mitigating the spread of COVID-19.

Journal ArticleDOI
TL;DR: In this paper, a soft actor-critic (SAC)-based method is proposed to optimize the train driving strategy to minimize the trip time of the journey with constant energy consumption.
Abstract: The energy-efficient train control (EETC) problem is investigated in this article, and a soft actor-critic (SAC)-based method is proposed to optimize the train driving strategy. First, the EETC problem is converted to its inverse problem, i.e., minimizing the trip time of the journey with constant energy consumption. Based on this conversion, the EETC problem is reformulated as a finite Markov decision process, which can be solved by deep reinforcement learning algorithms. Second, an optimization method based on SAC is designed to calculate the optimal driving strategy of the train, introducing the reservoir sampling method. Finally, case studies are conducted to verify the effectiveness and performance of the proposed method. Simulation results demonstrate that good energy-saving performance can be achieved. In a single interval, the SAC-based method reduces energy consumption by about 1.65% compared with a numerical method, and the reduction extends to 6.49% when the proposed approach is applied in multiple intervals.

Journal ArticleDOI
TL;DR: In this paper, an actor-critic RL strategy was used to learn the optimal Proportional Integral (PI) controller dynamics from a Direct Current (DC) motor speed control simulation environment.
Abstract: To benefit from the advantages of Reinforcement Learning (RL) in industrial control applications, RL methods can be used for optimal tuning of classical controllers based on simulation scenarios of operating conditions. In this study, the Twin Delayed Deep Deterministic (TD3) policy gradient method, an effective actor-critic RL strategy, is implemented to learn optimal Proportional Integral (PI) controller dynamics from a Direct Current (DC) motor speed control simulation environment. For this purpose, the PI controller dynamics are introduced to the actor-network by using the PI-based observer states from the control simulation environment. A suitable Simulink simulation environment is adapted to perform the training process of the TD3 algorithm. The actor-network learns the optimal PI controller dynamics by using a reward mechanism that implements the minimization of the optimal control objective function. A setpoint filter is used to describe the desired setpoint response, and step disturbance signals with random amplitude are incorporated in the simulation environment to improve disturbance rejection skills with the help of experience-based learning in the designed control simulation environment. When the training task is completed, the optimal PI controller coefficients are obtained from the weight coefficients of the actor-network. The performances of the optimal PI dynamics learned by the TD3 algorithm and by the Deep Deterministic Policy Gradient algorithm are compared. Moreover, the control performance improvement of this RL-based PI controller tuning method (RL-PI) is demonstrated relative to the performance of both integer- and fractional-order PI controllers tuned by several popular metaheuristic optimization algorithms such as the Genetic Algorithm, Particle Swarm Optimization, Grey Wolf Optimization, and Differential Evolution.

Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of finding the optimal feedback gain that minimizes the worst-case conditional value-at-risk (CVaR) of a quadratic objective function subject to additive disturbances whose first two moments of the distribution are known.
Abstract: Stochastic linear quadratic control problems are considered from the viewpoint of risk. In particular, a worst-case conditional value-at-risk (CVaR) of a quadratic objective function is minimized subject to additive disturbances whose first two moments of the distribution are known. The study focuses on finding the optimal feedback gain that minimizes the quadratic cost in three settings: stationary distribution, one step, and infinite time horizon. For the stationary distribution problem, it is proved that the optimal control gain that minimizes the worst-case CVaR of the quadratic cost is equivalent to that of the standard (stochastic) linear quadratic regulator. For the one-step problem, an approach to an optimal solution as well as analytical suboptimal solutions are presented. For the infinite-time-horizon problem, two suboptimal solutions that bound the optimal solution and an approach to an optimal solution for a special case are discussed. The presented theorems are illustrated with numerical examples.
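The CVaR risk measure minimized above can be sketched with a simple empirical estimator: the mean of the worst (1 - alpha) fraction of sampled costs. This is a generic sample-based illustration with made-up numbers, independent of the paper's worst-case distributional formulation.

```python
import numpy as np

def cvar(samples, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of costs."""
    s = np.sort(np.asarray(samples, dtype=float))
    cut = int(round(alpha * len(s)))       # index where the worst tail begins
    return s[cut:].mean()

costs = [1.0, 2.0, 3.0, 4.0, 100.0]        # illustrative quadratic-cost samples
print(cvar(costs, alpha=0.8))   # mean of the worst 20%: 100.0
print(cvar(costs, alpha=0.6))   # mean of the worst 40%: (4 + 100) / 2 = 52.0
```

Unlike the expectation, CVaR is dominated by the tail, which is why the 100.0 outlier drives both values here.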

Journal ArticleDOI
TL;DR: In this article, a hierarchical energy optimization control architecture based on networked information is designed, and a traffic signal timing model is used for vehicle target speed range planning in the upper system.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the approximate optimal control problem for nonlinear affine systems under the periodic event-triggered control (PETC) strategy; this is the first time PETC has been presented for the optimal control of nonlinear systems.
Abstract: This article investigates the approximate optimal control problem for nonlinear affine systems under the periodic event-triggered control (PETC) strategy. In terms of optimal control, a theoretical comparison of continuous control, traditional event-based control (ETC), and PETC is made from the perspective of stability convergence, concluding that PETC does not significantly affect the convergence rate compared with ETC. This is the first time PETC has been presented for the optimal control of nonlinear systems. A critic network is introduced to approximate the optimal value function based on the idea of reinforcement learning (RL). It is proven that the discrete updating time series from PETC can also be utilized to determine the updating times of the learning network. In this way, the gradient-based weight estimation for continuous systems is developed in discrete form. Then, the uniformly ultimately bounded (UUB) condition of the controlled systems is analyzed to ensure the stability of the designed method. Finally, two illustrative examples are given to show the effectiveness of the method.

Journal ArticleDOI
TL;DR: In this paper, an optimal control problem in which a dynamical system is controlled by a nonlinear Caputo fractional state equation is investigated in the case when the Pontryagin maximum principle degenerates, that is, it is satisfied trivially.
Abstract: In this paper, we consider an optimal control problem in which a dynamical system is controlled by a nonlinear Caputo fractional state equation. The problem is investigated in the case when the Pontryagin maximum principle degenerates, that is, it is satisfied trivially. Then the second-order optimality conditions are derived for the considered problem.

Journal ArticleDOI
TL;DR: In this article, a coupled two-level variational model in Sobolev-Orlicz spaces with non-standard growth conditions of the objective functional is presented, and its consistency and solvability issues are discussed.
Abstract: We study a coupled two-level variational model in Sobolev-Orlicz spaces with non-standard growth conditions of the objective functional and discuss its consistency and solvability issues. At the first level, we deal with the so-called temporal interpolation problem, which can be cast as a state-constrained optimal control problem for an anisotropic convection-diffusion equation with two types of control functions: distributed $L^2$-control and $BV$-bounded control in coefficients. At the second level, we have a constrained minimization problem with a nonstandard-growth energy functional that lives in a variable Sobolev-Orlicz space. The characteristic feature of the proposed model is that the variable exponent, which is associated with the non-standard growth of the objective functional, is unknown a priori and depends on the solution of the first-level optimal control problem.

Journal ArticleDOI
TL;DR: In this paper, a data-driven robust optimal control (DROC) method is proposed for uncertain nonlinear systems, where a robust evaluation strategy is introduced to capture the relationship between the approximating errors and the control variables.

Journal ArticleDOI
TL;DR: In this paper, the authors combine the notions of optimal control and stochastic resetting to quantify the effectiveness of the heuristic strategy of restarting, a problem that had remained an open challenge.
Abstract: "When in a difficult situation, it is sometimes better to give up and start all over again." While this empirical truth has been regularly observed in a wide range of circumstances, quantifying the effectiveness of such a heuristic strategy remains an open challenge. In this paper, we combine the notions of optimal control and stochastic resetting to address this problem. The emerging analytical framework allows one not only to measure the performance of a given restarting policy, but also to obtain the optimal strategy for a wide class of dynamical systems. We apply our technique to a system with a final reward and show that the reward value must be larger than a critical threshold for resetting to be effective. Our approach, analogous to the celebrated Hamilton-Jacobi-Bellman paradigm, provides the basis for the investigation of realistic restarting strategies across disciplines. As an application, we show that the framework can be applied to an epidemic model to predict the optimal lockdown policy.
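A concrete instance of "resetting pays only above a threshold" is the classic Evans-Majumdar diffusive-search formula, whose optimal resetting rate the sketch below finds by grid search. This is an illustrative baseline from the stochastic-resetting literature, not the reward framework of the paper:

```python
import math

def mean_fpt_reset(r, D=1.0, L=1.0):
    """Evans-Majumdar mean first-passage time of a 1D diffusive searcher
    (diffusivity D) to a target at distance L, under Poissonian resetting
    to the origin at rate r:  T(r) = (exp(L * sqrt(r/D)) - 1) / r."""
    z = math.sqrt(r / D) * L
    return (math.exp(z) - 1.0) / r

# Without resetting (r -> 0) the mean first-passage time diverges; a dense
# grid search over r locates the finite optimum.
rates = [0.01 + 1e-4 * k for k in range(100000)]
r_star = min(rates, key=mean_fpt_reset)
```

The optimum satisfies the known transcendental condition z/2 = 1 - exp(-z) with z = L * sqrt(r/D), giving z* ~ 1.594, i.e. r* ~ 2.54 for D = L = 1.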

Journal ArticleDOI
TL;DR: In this paper, the optimal explicit decentralized controllers are obtained by solving the forward and backward stochastic difference equations (FBSDEs) under the assumption of linear control strategies, and the optimal estimators based on asymmetric information are derived.
Abstract: This paper considers the decentralized LQG control problem for a discrete-time decentralized system controlled by two players. In this scenario, player 1 shares its unit-delayed observations and control inputs with the controller of player 2, whereas, due to limited channel capacity, the controller of player 1 cannot obtain the observations and control inputs of player 2, which leads to asymmetric one-step-delayed information. It should be emphasized that this structure makes the classical separation principle fail. Under the assumption of linear control strategies, we derive the optimal estimators based on the asymmetric information. By virtue of Pontryagin's maximum principle, the optimal explicit decentralized controllers are obtained by solving the forward and backward stochastic difference equations (FBSDEs). It is noted that the control gains are coupled with the estimation gains. Moreover, the estimation gains satisfy forward Riccati equations and the control gains follow backward Riccati equations, which cannot be solved simultaneously. To this end, we present iterative solutions to the coupled forward and backward Riccati equations. Finally, a sufficient condition for the stabilization problem is given in terms of the coupled algebraic Riccati equations. Numerical examples are illustrated to show the effectiveness of the proposed algorithm.
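The backward Riccati recursion underlying such controllers can be illustrated in the simplest uncoupled setting. The sketch below is a plain scalar finite-horizon LQR recursion (standard textbook form, with plant numbers chosen arbitrarily), not the paper's coupled forward/backward system:

```python
def backward_riccati(a, b, q, r, n):
    """Scalar finite-horizon discrete-time LQR via the backward Riccati
    recursion:  P_N = Q,
      K_k = (R + B P_{k+1} B)^(-1) B P_{k+1} A,
      P_k = Q + A P_{k+1} A - A P_{k+1} B K_k.
    Returns the gain sequence K_0..K_{N-1} and the cost-to-go P_0."""
    p = q
    gains = []
    for _ in range(n):
        k = b * p * a / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()  # computed backward in time, applied forward
    return gains, p

# Unstable scalar plant x_{k+1} = 1.2 x_k + u_k with Q = R = 1.
gains, p0 = backward_riccati(a=1.2, b=1.0, q=1.0, r=1.0, n=200)
```

For a long horizon, P_0 converges to the stationary solution of the scalar DARE P = Q + A^2 P - A^2 P^2 B^2 / (R + P B^2), here P ~ 1.952, and the first applied closed-loop factor A - B K_0 ~ 0.41 is stable.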

Journal ArticleDOI
TL;DR: In this paper, an event-based Hamilton-Jacobi-Isaacs (HJI) equation was proposed to solve the tracking control problem in a class of nonlinear networked systems subject to limited network bandwidth and unmatched disturbance.
Abstract: This article concentrates on optimal tracking control for a class of nonlinear networked systems subject to limited network bandwidth and unmatched disturbance. Given the models of the control and reference systems, the considered optimal tracking control issue is initially formulated as a minimax optimization problem. Then, with the introduction of an event-triggered mechanism used for saving bandwidth, the formulated problem is transformed into solving an event-based Hamilton-Jacobi-Isaacs (HJI) equation by resorting to the Bellman optimality theory. Based on the HJI equation, we demonstrate that stability of the concerned system in the sense of uniform ultimate boundedness (UUB) can be guaranteed under the derived optimal control and worst-case disturbance policies. Here, the disturbance policy can be updated at every sampling period, while the control policy is updated only at event-triggering instants, which differs from existing research. Furthermore, we propose a reinforcement learning (RL)-based algorithm to handle the constructed HJI equation and thus settle the studied tracking control problem. The effectiveness of the algorithm is finally validated by both theoretical analysis and simulations.
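As a toy illustration of the event-triggered idea (unrelated to the paper's RL algorithm, with all plant numbers my own arbitrary choices), the sketch below holds the last transmitted state sample and updates it only when a gap threshold is exceeded; the closed loop stays ultimately bounded while using far fewer transmissions than time steps:

```python
def event_triggered_run(a=1.1, b=1.0, K=0.8, eps=0.05, x0=5.0, steps=200):
    """Event-triggered state feedback for the scalar plant
    x_{k+1} = a x_k + b u_k.  The controller holds the last transmitted
    sample x_hat; a new sample is transmitted only when |x - x_hat|
    exceeds eps, so the state is ultimately bounded (UUB) rather than
    asymptotically driven to zero."""
    x, x_hat, events = x0, x0, 0
    for _ in range(steps):
        if abs(x - x_hat) > eps:  # triggering condition: transmit fresh state
            x_hat = x
            events += 1
        u = -K * x_hat            # control held constant between events
        x = a * x + b * u
    return x, events

x_final, events = event_triggered_run()
```

With these numbers the triggered closed loop contracts by a factor a - b*K = 0.3 at each fresh sample, and between events the open-loop drift keeps the state inside a band of a few eps around zero.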

Journal ArticleDOI
TL;DR: In this article, the observer-based dynamic optimal recoil controller design of a deepwater drilling riser system subject to the friction force of fluid discharge and platform heave motion was investigated.