Proceedings ArticleDOI

Anti-lock braking systems data-driven control using Q-learning

TL;DR: A model-free tire slip control solution for a fast, highly nonlinear Anti-lock Braking System (ABS) is proposed via a reinforcement Q-learning optimal control approach, tailored around a batch neural fitted scheme that uses two neural networks to approximate the value function and the controller, respectively.
Abstract: A model-free tire slip control solution for a fast, highly nonlinear Anti-lock Braking System (ABS) is proposed in this work via a reinforcement Q-learning optimal control approach. The solution is tailored around a batch neural fitted scheme using two neural networks to approximate the value function and the controller, respectively. The transition samples are collected from the process through interaction by online exploiting the current iteration controller (or policy) under an e-greedy exploration strategy. The ABS process fits this type of learning-by-interaction since it does not need an initial stabilizing controller. The validation case studies carried out on a real laboratory setup reveal that high control system performance can be achieved after several tens of interaction episodes with the controlled process. Insightful comments on the observed control behavior in a set of real-time experiments are offered along with performance comparisons with several other controllers.
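The batch fitted Q-learning scheme described in the abstract can be sketched in miniature. The snippet below is an illustrative toy, not the authors' implementation: it replaces the two neural networks with a coarse tabular approximation and the real ABS dynamics with a hypothetical one-dimensional stand-in, keeping only the structure of the method (collect transitions by interacting under an ε-greedy policy, then fit Q to Bellman targets in batch).

```python
import numpy as np

rng = np.random.default_rng(0)

ACTIONS = np.array([-0.1, 0.0, 0.1])  # toy braking-input increments
TARGET = 0.5                          # illustrative slip-ratio setpoint
GAMMA, EPS, N_BINS = 0.9, 0.2, 21

def step(x, u):
    """One-dimensional stand-in for the slip dynamics (NOT the ABS model)."""
    x_next = float(np.clip(x + u + 0.01 * rng.standard_normal(), 0.0, 1.0))
    cost = (x_next - TARGET) ** 2     # stage cost: squared slip error
    return x_next, cost

def bin_of(x):
    return min(int(x * N_BINS), N_BINS - 1)

Q = np.zeros((N_BINS, len(ACTIONS)))  # tabular surrogate for the critic net

def collect_episode(n_steps=50):
    """Interact with the process under the current eps-greedy policy."""
    x, samples = float(rng.uniform(0.0, 1.0)), []
    for _ in range(n_steps):
        if rng.random() < EPS:
            a = int(rng.integers(len(ACTIONS)))   # explore
        else:
            a = int(np.argmin(Q[bin_of(x)]))      # exploit (minimise cost)
        x_next, c = step(x, ACTIONS[a])
        samples.append((bin_of(x), a, c, bin_of(x_next)))
        x = x_next
    return samples

batch = []
for episode in range(60):
    batch += collect_episode()
    for _ in range(10):  # batch "fitted" sweep toward the Bellman targets
        for s, a, c, s2 in batch:
            target = c + GAMMA * Q[s2].min()
            Q[s, a] += 0.1 * (target - Q[s, a])

def greedy(x):
    """Greedy controller implied by the learned Q (the 'actor' here)."""
    return float(ACTIONS[int(np.argmin(Q[bin_of(x)]))])
```

After a few dozen episodes the greedy policy pushes the toy slip variable toward the setpoint from either side, mirroring the paper's observation that usable control emerges after several tens of interaction episodes.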
Citations
Journal ArticleDOI
TL;DR: In this paper, the problem of constraint control for an antilock braking system (ABS) with asymmetric slip ratio constraints is addressed, and a nonlinear backstepping control method based on a barrier function is proposed.
Abstract: This paper is concerned with the problem of constraint control for an Antilock Braking System (ABS) with asymmetric slip ratio constraints. A nonlinear backstepping control method based on Barrier ...

12 citations


Cites background from "Anti-lock braking systems data-driv..."

  • ...Hence, many theoretical studies have been conducted on slip ratio control algorithms, such as nonlinear control [1], Fuzzy control [2], optimisation control [3], extremum seeking control [4], Adaptive Sliding mode controller [5,6], combined control [7], robust predictive control [8], fuzzy neural network controls [9], reinforcement Q-learning [10]....


  • ...In this method, a reinforcement Q-learning optimal control approach was inserted in a neural fitted scheme by using two neural networks to approximate the value function and the controller....


  • ...[10] proposed the design and implementation of a model-free tire slip control for a fast and highly nonlinear ABS....


Journal ArticleDOI
TL;DR: In this paper, an active disturbance rejection control technique for the anti-skid braking system is proposed, which keeps the closed-loop system operating near the peak of the stable region of the friction curve.
Abstract: A high-quality and secure touchdown run is essential for an aircraft for economic, operational, and safety reasons. The shortest viable touchdown run without any skidding requires variable braking pressure to manage the friction between the runway surface and the braking tire at all times. The anti-skid braking system (ABS) must therefore handle strong nonlinearity and unmeasurable disturbances while regulating the wheel slip ratio so that the braking system operates safely. This work proposes an active disturbance rejection control technique for the anti-skid braking system. The control law ensures bounded, manageable action, and the algorithm keeps the closed-loop system operating near the peak of the stable region of the friction curve, thereby improving overall braking performance and safety. The stability of the proposed algorithm is proven primarily by means of Lyapunov-based methods, and its effectiveness is assessed by means of simulations on a semi-physical aircraft brake simulation platform.
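The active disturbance rejection idea summarised above can be illustrated with a minimal linear ADRC loop. Everything below is a hypothetical sketch, not the paper's controller: a toy first-order plant stands in for the wheel-slip dynamics, a second-order extended state observer (ESO) estimates the lumped "total disturbance", and the control law cancels that estimate before applying a simple proportional law. The gains `b0`, `kp`, and `wo` are assumed design values.

```python
import numpy as np

b0, kp = 1.0, 4.0            # assumed input gain and proportional gain
wo = 20.0                    # observer bandwidth (design choice)
l1, l2 = 2.0 * wo, wo ** 2   # ESO gains via bandwidth parameterisation
dt, r = 0.001, 0.5           # step size and slip-ratio setpoint (toy values)

y = 0.0                      # plant output (stand-in for wheel slip)
z1, z2 = 0.0, 0.0            # ESO estimates of y and the total disturbance
for k in range(5000):
    u0 = kp * (r - z1)       # outer proportional law on the estimated output
    u = (u0 - z2) / b0       # cancel the estimated total disturbance
    # toy plant: unknown internal dynamics plus a slow external disturbance
    f = -2.0 * y + 0.3 * np.sin(0.002 * k)
    y = y + dt * (f + b0 * u)
    # second-order ESO, forward-Euler discretised: track y and f
    e = y - z1
    z1 = z1 + dt * (z2 + b0 * u + l1 * e)
    z2 = z2 + dt * (l2 * e)
```

Despite never modelling the plant's internal dynamics or the disturbance, the loop regulates the output to the setpoint, which is the essential ADRC property the abstract relies on.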

2 citations

01 Jan 2018
TL;DR: The aim of this thesis is to adapt, to model-plant mismatches and process variations, a non-linear Anti-lock Braking System (ABS) controller for a passenger car obtained as a simplified symbolic approximation of the solution to the Bellman equation.
Abstract: Closed-loop control systems, which utilize output signals for feedback to generate control inputs, can achieve high performance. However, the robustness of feedback control loops can be lost if system changes and uncertainties are too large. Adaptive control combines the traditional feedback structure with adaptation mechanisms that adjust a controller for a system with parameter uncertainties by using performance error information online. Reinforcement learning (RL) is one of the many methods that can be used for adaptive control. The aim of this thesis is to adapt, to model-plant mismatches and process variations, a non-linear Anti-lock Braking System (ABS) controller for a passenger car obtained as a simplified symbolic approximation of the solution to the Bellman equation. Results for adaptation to dry and wet asphalt have been obtained successfully and have been compared with hand-tuned and adaptive proportional-integral (P-I) controllers.

Cites methods from "Anti-lock braking systems data-driv..."

  • ...In contrast to [7], which gives a data-driven method to apply model-free Q-learning for ABS control, model-based RL has been used here....


References
Journal ArticleDOI
TL;DR: This paper presents and proves in detail a convergence theorem forQ-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action- values are represented discretely.
Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
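For reference, the incremental update the convergence theorem concerns is the standard one-step Q-learning rule (notation added here, not taken from the abstract):

```latex
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t)
  + \alpha_t \Big[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \Big]
```

Convergence to the optimal action-values holds with probability 1 provided every state-action pair is sampled infinitely often and the step sizes satisfy \(\sum_t \alpha_t = \infty\) and \(\sum_t \alpha_t^2 < \infty\).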

8,450 citations

Journal ArticleDOI
TL;DR: In this paper, the authors discuss a variety of adaptive critic designs (ACDs) for neurocontrol, which are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches.
Abstract: We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming, dual heuristic programming, and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.

1,109 citations

Journal ArticleDOI
TL;DR: The new design method is direct and can be applied using a single set of data generated by the plant, with no need for specific experiments nor iterations, and it is shown that the method searches for the global optimum of the design criterion.
Abstract: This paper considers the problem of designing a controller for an unknown plant based on input/output measurements. The new design method we propose is direct (no model identification of the plant is needed) and can be applied using a single set of data generated by the plant, with no need for specific experiments nor iterations. It is shown that the method searches for the global optimum of the design criterion and that, in the case of restricted complexity controller design, the achieved controller is a good approximation of the restricted complexity global optimal controller. A simulation example shows the effectiveness of the method.

901 citations

Journal ArticleDOI
TL;DR: The main objective of this paper is to review and summarize the recent achievements in data-based techniques, especially for complicated industrial applications, thus providing a referee for further study on the related topics both from academic and practical points of view.
Abstract: This paper provides an overview of the recent developments in data-based techniques focused on modern industrial applications. As one of the hottest research topics for complicated processes, the data-based techniques have been rapidly developed over the past two decades and widely used in numerous industrial sectors nowadays. The core of data-based techniques is to take full advantage of the huge amounts of available process data, aiming to acquire the useful information within. Compared with the well-developed model-based approaches, data-based techniques provide efficient alternative solutions for different industrial issues under various operating conditions. The main objective of this paper is to review and summarize the recent achievements in data-based techniques, especially for complicated industrial applications, thus providing a referee for further study on the related topics both from academic and practical points of view. This paper begins with a brief evolutionary overview of data-based techniques in the last two decades. Then, the methodologies only based on process measurements and the model-data integrated techniques will be further introduced. The recent developments for modern industrial applications are, respectively, presented mainly from perspectives of monitoring and control. The new trends of data-based technique as well as potential application fields are finally discussed.

856 citations

Journal ArticleDOI
TL;DR: In this article, the authors describe the use of reinforcement learning to design feedback controllers for discrete and continuous-time dynamical systems that combine features of adaptive control and optimal control, which are not usually designed to be optimal in the sense of minimizing user-prescribed performance functions.
Abstract: This article describes the use of principles of reinforcement learning to design feedback controllers for discrete- and continuous-time dynamical systems that combine features of adaptive control and optimal control. Adaptive control [1], [2] and optimal control [3] represent different philosophies for designing feedback controllers. Optimal controllers are normally designed offline by solving Hamilton-Jacobi-Bellman (HJB) equations, for example, the Riccati equation, using complete knowledge of the system dynamics. Determining optimal control policies for nonlinear systems requires the offline solution of nonlinear HJB equations, which are often difficult or impossible to solve. By contrast, adaptive controllers learn online to control unknown systems using data measured in real time along the system trajectories. Adaptive controllers are not usually designed to be optimal in the sense of minimizing user-prescribed performance functions. Indirect adaptive controllers use system identification techniques to first identify the system parameters and then use the obtained model to solve optimal design equations [1]. Adaptive controllers may satisfy certain inverse optimality conditions [4].

841 citations