Showing papers on "Reinforcement learning" published in 1970


Journal Article
King-Sun Fu
TL;DR: The basic concept of learning control is introduced, and the following five learning schemes are briefly reviewed: 1) trainable controllers using pattern classifiers, 2) reinforcement learning control systems, 3) Bayesian estimation, 4) stochastic approximation, and 5) stochastic automata models.
Abstract: The basic concept of learning control is introduced. The following five learning schemes are briefly reviewed: 1) trainable controllers using pattern classifiers, 2) reinforcement learning control systems, 3) Bayesian estimation, 4) stochastic approximation, and 5) stochastic automata models. Potential applications and problems for further research in learning control are outlined.

204 citations


Journal Article
Yu-Chi Ho
TL;DR: This article examines the problem of control in a generalized framework in which there is more than one criterion and more than one intelligent controller, each of which has access to different information.
Abstract: In this paper, we examine the problem of control in a generalized framework in which there is more than one criterion and more than one intelligent controller, each of which has access to different information. It is argued that optimal control, differential games, dynamic team theory, etc., can all be viewed as special cases of this generalized control theory. Useful concepts and potential difficulties are discussed.

131 citations


Journal Article
TL;DR: A linear reinforcement learning technique is proposed to provide a memory and thus accelerate the convergence of successive approximation algorithms; in stochastic approximation, this memory establishes a consistent direction of search that is insensitive to perturbations introduced by the random variables involved.
Abstract: A linear reinforcement learning technique is proposed to provide a memory and thus accelerate the convergence of successive approximation algorithms. The learning scheme is used to update weighting coefficients applied to the components of the correction terms of the algorithm. A direction of the search approaching the direction of a "ridge" will result in a gradient peak-seeking method which considerably accelerates the convergence to a neighborhood of the extremum. In a stochastic approximation algorithm the learning scheme provides the required memory to establish a consistent direction of search insensitive to perturbations introduced by the random variables involved. The accelerated algorithms and the respective proofs of convergence are presented. Illustrative examples demonstrate the validity of the proposed algorithms.

42 citations
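
The idea in the abstract above, weighting coefficients on each component of the correction term that are updated by a linear reinforcement rule, can be illustrated with a minimal sketch. This is not the paper's algorithm: the update rule, the sign-agreement test, and every name and constant here (`linear_reinforcement`, `seek_peak`, `rate`, `w_min`, `w_max`) are assumptions chosen for illustration, and only the noise-free peak-seeking case is shown.

```python
def linear_reinforcement(w, agree, rate=0.2, w_min=0.25, w_max=4.0):
    """Move a weight linearly toward w_max on agreement, toward w_min otherwise.

    Assumed rule for illustration; the bounds and rate are hypothetical.
    """
    target = w_max if agree else w_min
    return w + rate * (target - w)


def seek_peak(grad, x0, step=0.01, iters=200, adaptive=True):
    """Gradient peak-seeking with one weighting coefficient per component.

    A component whose gradient keeps the same sign as on the previous step
    is "rewarded" (its weight grows), which acts as a memory of a consistent
    search direction. A sign flip "penalizes" the weight.
    """
    x = list(x0)
    w = [1.0] * len(x)       # weighting coefficients on the correction terms
    prev = [0.0] * len(x)    # previous gradient, used as the memory
    for _ in range(iters):
        g = grad(x)
        for i in range(len(x)):
            if adaptive and prev[i] != 0.0:
                w[i] = linear_reinforcement(w[i], g[i] * prev[i] > 0)
            x[i] += step * w[i] * g[i]
        prev = g
    return x


# Ridge-like objective f(x, y) = -(x^2 + 25 y^2): steep in y, shallow in x.
def ridge_grad(p):
    return [-2.0 * p[0], -50.0 * p[1]]


plain = seek_peak(ridge_grad, [10.0, 1.0], adaptive=False)
fast = seek_peak(ridge_grad, [10.0, 1.0], adaptive=True)
print("plain:", plain)
print("weighted:", fast)
```

In this sketch the shallow `x` direction sees a gradient of constant sign, so its weight grows toward `w_max` and progress along the ridge speeds up, while the steep `y` direction is damped whenever it overshoots.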


Journal Article
TL;DR: A two-stage computer learning method is employed: a coarse first stage attempts to satisfy the terminal boundary conditions through subgoal learning, yielding an approximation to the optimum control law, and a tuning stage improves on this result by reinforcement learning.
Abstract: Learning heuristics for an on-line controller are presented, and various aspects of the problem are discussed. The controller is required to achieve optimal regulator control for an unknown process in the face of random disturbances. A computer method of two-stage learning is employed in which the first stage is coarse and attempts to satisfy the terminal boundary conditions on the basis of subgoal learning. This yields an approximation to the optimum control law. Rote learning is also carried out during this time. The second, or tuning, stage improves on this result by a technique of reinforcement learning applied to the integral performance criterion. The effect of varying the parameters associated with the learning algorithm is studied. A discussion of a hybrid computer simulation of a second-order plant subject to one input with two possible levels is presented.

10 citations


01 Jan 1970
TL;DR: This book explains the science behind reinforcement learning and optimal control in a way that is accessible to students with a background in calculus and matrix algebra, and offers insight into why reinforcement learning sometimes fails.
Abstract: A high school student can create deep Q-learning code to control her robot, without any understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is designed to explain the science behind reinforcement learning and optimal control in a way that is accessible to students with a background in calculus and matrix algebra. A unique focus is algorithm design to obtain the fastest possible speed of convergence for learning algorithms, along with insight into why reinforcement learning sometimes fails. Advanced stochastic process theory is avoided at the start by substituting random exploration with more intuitive deterministic probing for learning. Once these ideas are understood, it is not difficult to master techniques rooted in stochastic control. These topics are covered in the second part of the book, starting with Markov chain theory and ending with a fresh look at actor-critic methods for reinforcement learning.

7 citations


Journal Article
TL;DR: A planning system is presented that integrates the reinforcement learning method with a neural network approach, with the aim of ensuring autonomous robot behavior in unpredictable working conditions, evaluating robot behavior, and inducing new knowledge or improving existing knowledge.

3 citations


01 Jun 1970
TL;DR: A theory of human learning processes is described, followed by a computer game that simulates the human capabilities of reasoning and learning; the program is required to make intelligent decisions based on past experiences and critical analysis of the present situation.
Abstract: The purpose of this thesis is a discussion of developing human-like behavior in the computer. A theory of the human learning processes is first described. This leads to the presentation of a computer game which simulates the human capabilities of reasoning and learning. The program is required to make intelligent decisions based on past experiences and critical analysis of the present situation.

2 citations


Journal Article
TL;DR: An algorithm is given for implementing a learning controller, based on the subgoal concept, that is applicable to linear stationary systems.
Abstract: An algorithm for implementing a learning controller, based on the subgoal concept, applicable to linear stationary systems.

1 citation