# M. Waltz

Bio: M. Waltz is an academic researcher who has contributed to research in the topics of Adaptive control and Automatic control. The author has an h-index of 1 and has co-authored 1 publication, receiving 136 citations.

##### Papers

TL;DR: A learning control system that uses a reinforcement technique to control a plant that may be nonlinear and nonstationary, learning the best control choice for each control situation.

Abstract: This paper describes a learning control system using a reinforcement technique. The controller is capable of controlling a plant that may be nonlinear and nonstationary. The only a priori information required by the controller is the order of the plant. The approach is to design a controller which partitions the control measurement space into sets called control situations and then learns the best control choice for each control situation. The control measurements are those indicating the state of the plant and environment. The learning is accomplished by reinforcement of the probability of choosing a particular control choice for a given control situation. The system was simulated on an IBM 1710-GEDA hybrid computer facility. Experimental results obtained from the simulation are presented.

149 citations
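The scheme summarized above (partition the control measurement space into situations, then reinforce the probability of successful control choices) can be sketched roughly as follows. The class name, the linear reward-inaction update, and the step size are illustrative assumptions, not the paper's exact design:

```python
import random

class ReinforcementController:
    """Rough sketch of a learning controller in the spirit of the paper:
    per control situation, maintain a probability distribution over a
    finite set of control choices and reinforce choices that succeed.
    The update rule is a standard linear reward-inaction scheme, assumed
    here for illustration."""

    def __init__(self, n_situations, choices, alpha=0.1):
        self.choices = choices
        self.alpha = alpha  # reinforcement step size
        # start with a uniform distribution in every situation
        self.probs = [[1.0 / len(choices)] * len(choices)
                      for _ in range(n_situations)]

    def act(self, situation):
        """Sample a control-choice index for the given situation."""
        return random.choices(range(len(self.choices)),
                              weights=self.probs[situation])[0]

    def reinforce(self, situation, choice, success):
        """On success, shift probability mass toward the chosen control;
        the update keeps each distribution normalized."""
        if not success:
            return
        p = self.probs[situation]
        for i in range(len(p)):
            if i == choice:
                p[i] += self.alpha * (1.0 - p[i])
            else:
                p[i] -= self.alpha * p[i]
```

Repeated success for one choice drives its probability toward 1, so the controller gradually commits to the best-known control in each situation.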

##### Cited by

01 Jan 1998

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.

Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations
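Of the three solution families the book presents, temporal-difference learning is the most characteristic. Below is a minimal TD(0) value-estimation sketch on a toy random walk; the task and all constants are illustrative choices, not examples taken from the book:

```python
import random

def td0(episodes, alpha=0.1, gamma=1.0, n_states=5, seed=0):
    """TD(0) state-value estimation on a random walk over states
    0..n_states-1 that terminates on the left (reward 0) or on the
    right (reward 1). A toy illustration only."""
    rng = random.Random(seed)
    V = [0.5] * n_states  # value estimates for the non-terminal states
    for _ in range(episodes):
        s = n_states // 2  # every episode starts in the middle
        while True:
            s2 = s + rng.choice((-1, 1))
            if s2 < 0:           # left terminal: reward 0, episode ends
                V[s] += alpha * (0.0 - V[s])
                break
            if s2 >= n_states:   # right terminal: reward 1, episode ends
                V[s] += alpha * (1.0 - V[s])
                break
            # nonterminal step: bootstrap from the next state's estimate
            V[s] += alpha * (gamma * V[s2] - V[s])
            s = s2
    return V
```

With five states the true values are 1/6 through 5/6; the estimates approach them, increasing from left to right.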

TL;DR: The proposed RADP methodology can be viewed as an extension of ADP to uncertain nonlinear systems and has been applied to the controller design problems for a jet engine and a one-machine power system.

Abstract: This paper studies robust optimal control design for a class of uncertain nonlinear systems from the perspective of robust adaptive dynamic programming (RADP). The objective is to fill a gap in the adaptive dynamic programming (ADP) literature, in which dynamic uncertainties or unmodeled dynamics are not addressed. A key strategy is to integrate tools from modern nonlinear control theory, such as robust redesign, backstepping, and the nonlinear small-gain theorem, with the theory of ADP. The proposed RADP methodology can be viewed as an extension of ADP to uncertain nonlinear systems. Practical learning algorithms are developed and applied to the controller design problems for a jet engine and a one-machine power system.

328 citations
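The ADP machinery that RADP generalizes is, at bottom, policy iteration: alternate policy evaluation with greedy policy improvement. The paper works with uncertain continuous-time nonlinear systems; the sketch below shows only the underlying loop on a finite MDP, with an assumed transition/reward encoding:

```python
def policy_iteration(P, R, gamma=0.9):
    """Policy iteration on a finite MDP. P[s][a] is a list of
    (probability, next_state) pairs and R[s][a] is the immediate
    reward; this encoding is assumed for illustration."""
    n_s, n_a = len(P), len(P[0])

    def q(s, a, V):
        # one-step lookahead value of taking action a in state s
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    pi = [0] * n_s
    while True:
        # policy evaluation: iterate the Bellman equation for pi
        V = [0.0] * n_s
        for _ in range(1000):
            V = [q(s, pi[s], V) for s in range(n_s)]
        # policy improvement: act greedily with respect to V
        new_pi = [max(range(n_a), key=lambda a: q(s, a, V))
                  for s in range(n_s)]
        if new_pi == pi:
            return pi, V
        pi = new_pi
```

RADP replaces the exact evaluation step with learning from data and adds robustness arguments (robust redesign, the small-gain theorem) to tolerate dynamic uncertainty.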

TL;DR: A stochastic reinforcement learning algorithm for learning functions with continuous outputs using a connectionist network, demonstrated on an underconstrained positioning task with a simulated 3-degree-of-freedom robot arm.

Abstract: Most of the research in reinforcement learning has been on problems with discrete action spaces. However, many control problems require the application of continuous control signals. In this paper, we present a stochastic reinforcement learning algorithm for learning functions with continuous outputs using a connectionist network. We define stochastic units that compute their real-valued outputs as a function of random activations generated using the normal distribution. Learning takes place by using our algorithm to adjust the two parameters of the normal distribution so as to increase the probability of producing the optimal real value for each input pattern. The performance of the algorithm is studied by using it to learn tasks of varying levels of difficulty. Further, as an example of a potential application, we present a network incorporating these stochastic real-valued units that learns to perform an underconstrained positioning task using a simulated 3-degree-of-freedom robot arm.

306 citations
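The stochastic units described above draw real-valued outputs from a normal distribution and adapt its parameters so that well-reinforced outputs become more probable. Below is a simplified single-unit sketch; the REINFORCE-style mean update with a reward baseline, the fixed standard deviation, and all constants are assumptions rather than the paper's exact equations:

```python
import random

class SRVUnit:
    """Single stochastic real-valued unit: output drawn from N(mu, sigma).
    Only the mean is adapted here, against a running reward baseline;
    the full algorithm also adapts the standard deviation."""

    def __init__(self, mu=0.0, sigma=1.0, alpha=0.05, seed=0):
        self.mu, self.sigma, self.alpha = mu, sigma, alpha
        self.baseline = 0.0  # running estimate of expected reinforcement
        self.rng = random.Random(seed)

    def act(self):
        """Draw a real-valued output from the unit's distribution."""
        return self.rng.gauss(self.mu, self.sigma)

    def learn(self, y, r):
        """Move the mean toward outputs that beat the baseline and away
        from outputs that fall short of it."""
        self.mu += self.alpha * (r - self.baseline) * (y - self.mu) / self.sigma
        self.baseline += 0.1 * (r - self.baseline)
```

Trained with a reward such as r = -|y - target|, the mean drifts toward the target: the continuous-output analogue of reinforcing a discrete action's probability.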

TL;DR: A stochastic real-valued (SRV) reinforcement learning algorithm is described and used for learning control, and the authors show how it can be used with nonlinear multilayer ANNs.

Abstract: Skill acquisition is a difficult yet important problem in robot performance. The authors focus on two skills, namely robotic assembly and balancing, and on two classic tasks for developing these skills via learning: the peg-in-hole insertion task and the ball balancing task. A stochastic real-valued (SRV) reinforcement learning algorithm is described and used for learning control, and the authors show how it can be used with nonlinear multilayer ANNs. In the peg-in-hole insertion task, the SRV network successfully learns to insert a peg into a hole with extremely low clearance, in spite of high sensor noise. In the ball balancing task, the SRV network successfully learns to balance the ball with minimal feedback.

210 citations

01 Jan 2013

TL;DR: This chapter argues that the answer to both questions is assuredly “yes” and that the machine learning framework of reinforcement learning is particularly appropriate for bringing learning together with what in animals one would call motivation.

Abstract: Psychologists distinguish between extrinsically motivated behavior, which is behavior undertaken to achieve some externally supplied reward, such as a prize, a high grade, or a high-paying job, and intrinsically motivated behavior, which is behavior done for its own sake. Is an analogous distinction meaningful for machine learning systems? Can we say of a machine learning system that it is motivated to learn, and if so, is it possible to provide it with an analog of intrinsic motivation? Despite the fact that a formal distinction between extrinsic and intrinsic motivation is elusive, this chapter argues that the answer to both questions is assuredly “yes” and that the machine learning framework of reinforcement learning is particularly appropriate for bringing learning together with what in animals one would call motivation. Despite the common perception that a reinforcement learning agent’s reward has to be extrinsic because the agent has a distinct input channel for reward signals, reinforcement learning provides a natural framework for incorporating principles of intrinsic motivation.

207 citations
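A common concrete realization of intrinsic motivation within the reinforcement learning framework is a count-based novelty bonus added to the extrinsic reward, so that rarely visited states pay an extra reward. The 1/sqrt(n) form and the beta coefficient below are conventional choices, not the chapter's prescription:

```python
from collections import Counter

def intrinsic_bonus(counts, state, beta=1.0):
    """Count-based novelty bonus: record a visit to `state` and return
    an intrinsic reward that shrinks as the state becomes familiar."""
    counts[state] += 1
    return beta / counts[state] ** 0.5

# the agent would then learn from r_total = r_extrinsic + intrinsic_bonus(...)
```

The agent still receives reward through the usual input channel, yet part of that reward is generated internally from its own visitation history, matching the chapter's point that a distinct reward channel need not make reward extrinsic.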