
Showing papers on "Reinforcement learning" published in 1966


Journal Article
TL;DR: Mathematical models for optimizing the learning process are presented, showing how instructional problems can be reformulated as optimization problems.
Abstract: Mathematical models for optimizing the learning process, showing the reformulation of instructional problems as problems in the theory of optimization.

70 citations


Journal Article
J. Sklansky
TL;DR: Current developments in learning systems for automatic control are discussed from the point of view of pattern recognition, and Markov chain theory provides an approach to modelling the dynamics of learning controllers.
Abstract: Recent developments in learning systems for automatic control are discussed from the point of view of pattern recognition. The following mathematical areas are given special attention: 1) decision theory, which produces control policies from gradually adjusted estimates of pattern probabilities, 2) trainable threshold logic, which produces control policies from networks of adjustable threshold devices, 3) stochastic approximation, which produces asymptotically optimum controllers, and 4) Markov chain theory, which provides an approach to modelling the dynamics of learning controllers. Projected applications in the following areas are discussed: process control, automated design of controllers, reliability control, numerical computation, and communication systems. A selected bibliography is included.

63 citations


Journal Article
TL;DR: The formation of the stimulus-response interconnection in a learning system is analysed, and a utility criterion and a decision system are necessary in order that the learning system selects the optimum response to a given stimulus.
Abstract: In this note, the formation of the stimulus-response interconnection in a learning system is analysed. The system, a technical or a biological one, establishes this correspondence by processing the information fed back from the medium during the learning process. A utility criterion and a decision system are necessary in order that the learning system selects the optimum response to a given stimulus.

3 citations