Reinforcement learning control of robot manipulators in uncertain environments

doi:10.1109/ICIT.2009.4939504

Home
/
Papers
/
Reinforcement learning control of robot manipulators in uncertain environments

Proceedings Article•DOI•

Reinforcement learning control of robot manipulators in uncertain environments

Hitesh Shah¹, M. Gopal¹•Institutions (1)

Indian Institute of Technology Delhi¹

10 Feb 2009-pp 1-6

TL;DR: This work investigates here the robust tracking performance of reinforcement learning control of manipulators, subjected to parameter variations and extraneous disturbances, and shows the importance of fuzzy Q-learning control.

read less

Abstract: Considerable attention has been given to the design of stable controllers for robot manipulators, in the presence of uncertainties. We investigate here the robust tracking performance of reinforcement learning control of manipulators, subjected to parameter variations and extraneous disturbances. Robustness properties in terms of average error, absolute maximum errors and absolute maximum control efforts, have been compared for reinforcement learning systems using various parameterized function approximators, such as fuzzy, neural network, decision tree, and support vector machine. Simulation results show the importance of fuzzy Q-learning control. Further improvements in this control approach through dynamic fuzzy Q-learning have also been highlighted.

...read moreread less

Citations

PDF

Open Access

More filters

Journal Article•DOI•

Reinforcement learning and optimal adaptive control: An overview and implementation examples

[...]

Said Ghani Khan¹, Guido Herrmann², Frank L. Lewis³, Tony Pipe¹, Chris Melhuish¹ - Show less +1 more•Institutions (3)

University of the West of England¹, University of Bristol², University of Texas at Arlington³

01 Apr 2012-Annual Reviews in Control

TL;DR: An example of an implementation of a novel model-free Q-learning based discrete optimal adaptive controller for a humanoid robot arm that uses a novel adaptive dynamic programming (ADP) reinforcement learning (RL) approach to develop an optimal policy on-line.

...read moreread less

172 citations

Cites background from "Reinforcement learning control of r..."

...Shah and Gopal (2009) have presented reinforcement learning control for robot manipulators in uncertain environments....
[...]

Journal Article•DOI•

Model-Free Predictive Control of Nonlinear Processes Based on Reinforcement Learning

[...]

Hitesh Shah¹, Madan Gopal²•Institutions (2)

G H Patel College Of Engineering & Technology¹, Shiv Nadar University²

01 Jan 2016-IFAC-PapersOnLine

TL;DR: In this paper, a model-free predictive control (MFPC) framework is proposed to take care of both the issues of conventional MPC and the excessive computational burden associated with the control optimization.

...read moreread less

28 citations

Journal Article•DOI•

On the Development of Learning Control for Robotic Manipulators

[...]

Dan Zhang, Bin Wei

27 Sep 2017-Robotics

TL;DR: The authors review and discuss the learning control in robotic manipulators and some issues in learning control for robotic manipulator and are able to give a general guideline for future research.

...read moreread less

11 citations

Cites background from "Reinforcement learning control of r..."

...In [74], the tracking performance of reinforcement learning control for a two link robotic mechanism that has parameter variations and external disturbances is studied....
[...]

Proceedings Article•DOI•

Applying Q-Learning Algorithm to Study Line-Grasping Control Policy for Transmission Line Deicing Robot

[...]

Shuning Wei¹, Yaonan Wang¹, Yiming Yang¹, Feng Yin¹, Wenming Cao¹, Yong Tang² - Show less +2 more•Institutions (2)

Hunan University¹, National University of Defense Technology²

13 Oct 2010

TL;DR: This paper introduces a preliminary design of deicing robot, which travels on transmission lines and automatically remove ices, and implements a graphical simulation environment and uses it to evaluate the Q-learning based line-grasping control policy study algorithm.

...read moreread less

Abstract: Ice coating in power networks could result in power-tower collapse and power interruption. This paper introduces a preliminary design of deicing robot, which travels on transmission lines and automatically remove ices. Inevitably, the deicing robot will encounter some obstacles. To cross an obstacle, the deicing robot needs to control its arms to grasp transmission line over the obstacle. In this paper, Q-learning, one of reinforcement learning algorithms is applied to study the line-grasping control strategies for deicing robot. We implement a graphical simulation environment and use it to evaluate the Q-learning based line-grasping control policy study algorithm. Simulation results show that the proposed algorithm is promising for the line-grasping control of deicing robot.

...read moreread less

4 citations

Cites background from "Reinforcement learning control of r..."

...And some papers gave the comparison among them [13, 14]....
[...]

Proceedings Article•DOI•

Hybridization of model-based approach with reinforcement fuzzy system design

[...]

Hitesh Shah¹, M. Gopal¹•Institutions (1)

Indian Institute of Technology Delhi¹

05 Jul 2009

TL;DR: This paper proposes a method for hybridization of model-based approach with RL, which is the right solution for such control problems and shows superiority in terms of robustness of the controller to parameter variations in the plant.

...read moreread less

Abstract: Reinforcement learning (RL) is a popular learning paradigm to adaptive learning control of nonlinear systems, and is able to work without an explicit model. However, learning from scratch, i.e., without any a priori knowledge, is a daunting undertaking, which results in long training time and instability of learning process with large continuous state space. For physical systems, one must consider that the design of controller is very rarely a tabula rasa: some approximate mathematical model of the system is always available. In this paper, our focus is on control applications wherein the system to be controlled is a physical system. We can always obtain at least an approximate mathematical model of the plant to be controlled. We propose a method for hybridization of model-based approach with RL, which is the right solution for such control problems. The superiority of proposed hybrid approach has been established through simulation experiments on a cart-pole balance bench mark problem, comparing it with model-free RL system. We have used fuzzy inference system for function approximation; it can deal with continuous action space in Q-learning. Comparison with other function approximators has shown its superiority in terms of robustness of the controller to parameter variations in the plant.

...read moreread less

2 citations

References

PDF

Open Access

More filters

Book•

The Nature of Statistical Learning Theory

[...]

Vladimir Vapnik¹•Institutions (1)

Bell Labs¹

01 Jan 1995

TL;DR: Setting of the learning problem consistency of learning processes bounds on the rate of convergence ofLearning processes controlling the generalization ability of learning process constructing learning algorithms what is important in learning theory?

...read moreread less

Abstract: Setting of the learning problem consistency of learning processes bounds on the rate of convergence of learning processes controlling the generalization ability of learning processes constructing learning algorithms what is important in learning theory?.

...read moreread less

40,147 citations

"Reinforcement learning control of r..." refers methods in this paper

...SVM Q-learning SVM is a new universal learning machine in the framework of structural risk minimization (SRM) [13]....
[...]
...D. SVM Q-learning SVM is a new universal learning machine in the framework of structural risk minimization (SRM) [13]....
[...]
...SVM uses a kernel function that satisfies Mercer’s condition [13], to map the input data into a highdimensional feature space, and then construct a linear optimal separating hyper plane in that space....
[...]
...SRM has better generalization ability and is superior to the traditional empirical risk minimization (ERM) principle....
[...]
...The membership function parameters used in this paper are same as in [13]....
[...]

Book•

Reinforcement Learning: An Introduction

[...]

Richard S. Sutton¹, Andrew G. Barto•Institutions (1)

Massachusetts Institute of Technology¹

01 Jan 1988

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

...read moreread less

Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

...read moreread less

37,989 citations

"Reinforcement learning control of r..." refers methods in this paper

...In order to explore the set of possible actions and acquire experience through the RL signals, actions are selected using an exploration/exploitation policy (EEP) [11]....
[...]
...If †iu , the action selected in rule iR is ε -greedy iu (where ε -greedy is a function implementing the EEP strategy), while *iu is the maximizing action, i.e., *( , ) max ( , )i b mq i u q i b≤= , then Q-value for the inferred action ku is † 1 1 ( , ) ( ) ( , ) ( ) N Nk k k k i i i i i Q x u x q i u xα α = = = ∑ ∑ , and value of state kx is: 1 1 ( ) ( ) ( , ) ( ) N Nk k k i i i i i V x x q i u xα α∗ = = = ∑ ∑ ....
[...]
...This information is used to calculate temporal difference (TD) [11] approximation error as: 1 ( ) ( , ) k k k n k Q c V x Q x u γ + ∆ = + − and q parameter values are updated as...
[...]

Journal Article•DOI•

Least Squares Support Vector Machine Classifiers

[...]

Johan A. K. Suykens¹, Joos Vandewalle¹•Institutions (1)

Katholieke Universiteit Leuven¹

01 Jun 1999-Neural Processing Letters

TL;DR: A least squares version for support vector machine (SVM) classifiers that follows from solving a set of linear equations, instead of quadratic programming for classical SVM's.

...read moreread less

Abstract: In this letter we discuss a least squares version for support vector machine (SVM) classifiers. Due to equality type constraints in the formulation, the solution follows from solving a set of linear equations, instead of quadratic programming for classical SVM‘s. The approach is illustrated on a two-spiral benchmark classification problem.

...read moreread less

8,811 citations

Learning from delayed rewards

[...]

Chris Watkins

01 Jan 1989

4,916 citations

"Reinforcement learning control of r..." refers background or methods in this paper

...One of the most popular RL approaches is the Q-learning [3]....
[...]
...It is an adaptation of Watkin’s Q-learning [3] for FIS, where both the actions and Q-functions are inferred from fuzzy rules....
[...]

Journal Article•DOI•

Reinforcement learning is direct adaptive optimal control

[...]

Richard S. Sutton, Andrew G. Barto¹, Ronald J. Williams²•Institutions (2)

University of Massachusetts Amherst¹, Northeastern University²

01 Apr 1992-IEEE Control Systems Magazine

TL;DR: Reinforcement learning methods are presented as a computationally simple, direct approach to the adaptive optimal control of nonlinear systems.

...read moreread less

Abstract: Neural network reinforcement learning methods are described and considered as a direct approach to adaptive optimal control of nonlinear systems. These methods have their roots in studies of animal learning and in early learning control work. An emerging deeper understanding of these methods is summarized that is obtained by viewing them as a synthesis of dynamic programming and stochastic approximation methods. The focus is on Q-learning systems, which maintain estimates of utilities for all state-action pairs and make use of these estimates to select actions. The use of hybrid direct/indirect methods is briefly discussed. >

...read moreread less

437 citations