
Showing papers by "Kumpati S. Narendra" published in 2016


Proceedings Article
01 Dec 2016
TL;DR: This paper presents a new approach, based on the use of multiple models (or estimates), that may alleviate the slow convergence of reinforcement learning schemes and increase their speed of response.
Abstract: Reinforcement learning aims to find the optimal decision in uncertain environments on the basis of qualitative and noisy on-line performance feedback provided by the environments. During the past four decades, learning theory has grown into a vast field in which a very large number of problems have been studied. One of the primary limitations of reinforcement schemes, acknowledged by workers in the field, is their slow speed of convergence. The principal objective of this paper is to present a new approach, based on the use of multiple models (or estimates), that may alleviate this problem and increase the speed of response. In adaptive control theory, multiple-model-based methods that substantially improve system performance have been proposed over the past two decades. The authors undertook to apply similar concepts in reinforcement learning, and this paper represents the first effort in this direction. Simple situations of learning in feed-forward networks are considered, and the new approach is compared with two different schemes. It is shown that convergence more than an order of magnitude faster than that of the first scheme can be achieved in some cases. While the second scheme is comparable to the new approach in many situations, it exhibits undesirable behaviour in others, where the new approach is more robust. The new approach is currently being extended incrementally and systematically to more complex problems that have been discussed in the literature. The ultimate aim of the authors is to apply this approach to learning in discrete- and continuous-state dynamic environments.

25 citations


Book Chapter
01 Jan 2016
TL;DR: This tutorial chapter critically examines four multiple-model methods for controlling rapidly time-varying plants, based on (1) switching, (2) switching and tuning, (3) interactive/evolutionary adaptation, and (4) second-level adaptation.
Abstract: Adaptive systems that continuously monitor their own performance and adjust their control strategies to improve it have been studied for more than 50 years. The theory of such systems when the plants (or processes) to be controlled are linear and time-invariant is currently well understood. Numerous methods exist to achieve a satisfactory and robust response when the uncertainty in the system is small. During the past three decades, workers in the field have made numerous attempts to extend these methods to systems with larger uncertainties. During this period the author and his colleagues have attempted to use a general method based on multiple models to control rapidly time-varying plants. More specifically, they have proposed four distinct methods based on (1) switching, (2) switching and tuning, (3) interactive/evolutionary adaptation, and (4) second-level adaptation. In this chapter, which is tutorial in nature, the four methods are critically examined. Work currently in progress that attempts to combine them using a hierarchical approach is also described. Since many of the problems considered were formulated only recently, they are open-ended, and it is hoped that they will be of interest to a wide audience, including both beginners and experts.

13 citations