Journal ArticleDOI

Bettering Operation of Robots by Learning

01 Jun 1984-Journal of Robotic Systems (Wiley)-Vol. 1, Iss: 2, pp 123-140
TL;DR: A betterment process for the operation of a mechanical robot is proposed, in the sense that it betters the next operation of the robot by using the previous operation's data.
Abstract: This article proposes a betterment process for the operation of a mechanical robot in a sense that it betters the next operation of a robot by using the previous operation's data. The process has an iterative learning structure such that the (k + 1)th input to joint actuators consists of the kth input plus an error increment composed of the derivative difference between the kth motion trajectory and the given desired motion trajectory. The convergence of the process to the desired motion trajectory is assured under some reasonable conditions. Numerical results by computer simulation are presented to show the effectiveness of the proposed learning scheme.
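The iterative update can be sketched in a short simulation. The first-order plant, desired trajectory, learning gain, and iteration count below are illustrative stand-ins (the article treats multi-joint robot dynamics); the (k+1)th input is formed as the kth input plus a gain times the derivative of the kth tracking error, as described above:

```python
import numpy as np

# Illustrative setup (not from the article): first-order plant
#   x'(t) = -a*x(t) + u(t),  y = x,  repeated from the same initial state.
a, dt = 1.0, 0.01
T = np.arange(0.0, 2.0, dt)
y_d = np.sin(np.pi * T)              # desired output trajectory, y_d(0) = 0
gamma = 0.9                          # D-type learning gain (example value)

def run_trial(u):
    """Simulate one repetition under identical initial conditions."""
    x = 0.0
    y = np.zeros_like(T)
    for i in range(len(T)):
        y[i] = x
        x += dt * (-a * x + u[i])    # forward-Euler integration
    return y

u = np.zeros_like(T)                 # 0th operation: no feedforward at all
errors = []
for k in range(20):
    e = y_d - run_trial(u)
    errors.append(np.max(np.abs(e)))
    # Betterment update: u_{k+1}(t) = u_k(t) + gamma * de_k/dt
    de = np.append(np.diff(e), 0.0) / dt
    u = u + gamma * de
```

Because the learning gain satisfies the contraction condition for this plant, the peak tracking error in `errors` shrinks from one repetition to the next.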
Citations
Journal ArticleDOI
TL;DR: Though beginning its third decade of active research, the field of ILC shows no sign of slowing down and includes many results and learning algorithms beyond the scope of this survey.
Abstract: This article surveyed the major results in iterative learning control (ILC) analysis and design over the past two decades. Problems in stability, performance, learning transient behavior, and robustness were discussed along with four design techniques that have emerged as among the most popular. The content of this survey was selected to provide the reader with a broad perspective of the important ideas, potential, and limitations of ILC. Indeed, the maturing field of ILC includes many results and learning algorithms beyond the scope of this survey. Though beginning its third decade of active research, the field of ILC shows no sign of slowing down.

2,645 citations


Cites background or methods from "Bettering Operation of Robots by Le..."

  • ...systems can be found in [5], [36], [115]....


  • ...The P-, D-, and PD-type learning functions are arguably the most widely used types of learning functions, particularly for nonlinear systems [5], [12], [39], [81], [83], [86]–[92]....


  • ...Just as with PD feedback controllers, the most commonly employed method for selecting the gains of the PD-type learning function is by tuning [5], [6], [10], [12], [17], [27], [43]....


  • ...ILC has been successfully applied to industrial robots [5]–[9], computer numerical control (CNC) machine tools [10], wafer stage motion systems [11], injection-molding machines [12], [13], aluminum extruders [14], cold rolling mills [15], induction motors [16], chain conveyor systems [17], camless engine valves [18], autonomous vehicles [19], antilock braking [20], rapid thermal processing [21], [22], and semibatch chemical reactors [23]....


  • ...However, these ideas lay dormant until a series of articles in 1984 [5], [30]–[32] sparked widespread interest....


Journal ArticleDOI
TL;DR: This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.
Abstract: Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide both inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning as well as notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail we demonstrate how reinforcement learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.

2,391 citations


Additional excerpts

  • ...This approach is known as iterative learning control (Arimoto et al., 1984)....


Journal ArticleDOI
TL;DR: A direct adaptive tracking control architecture is proposed and evaluated for a class of continuous-time nonlinear dynamic systems for which an explicit linear parameterization of the uncertainty in the dynamics is either unknown or impossible.
Abstract: A direct adaptive tracking control architecture is proposed and evaluated for a class of continuous-time nonlinear dynamic systems for which an explicit linear parameterization of the uncertainty in the dynamics is either unknown or impossible. The architecture uses a network of Gaussian radial basis functions to adaptively compensate for the plant nonlinearities. Under mild assumptions about the degree of smoothness exhibited by the nonlinear functions, the algorithm is proven to be globally stable, with tracking errors converging to a neighborhood of zero. A constructive procedure is detailed, which directly translates the assumed smoothness properties of the nonlinearities involved into a specification of the network required to represent the plant to a chosen degree of accuracy. A stable weight adjustment mechanism is determined using Lyapunov theory. The network construction and performance of the resulting controller are illustrated through simulations with example systems.
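The representational idea (a Gaussian radial-basis-function network reproducing an unknown nonlinearity to a chosen accuracy) can be sketched as follows. The centers, width, and target function are illustrative, and the weights are fit here by batch least squares rather than by the article's Lyapunov-derived online adaptation law:

```python
import numpy as np

# Fixed grid of Gaussian basis centers (illustrative choices throughout).
centers = np.linspace(-3.0, 3.0, 25)
width = 0.5

def phi(x):
    """Gaussian RBF feature vector for a scalar input x."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

# Stand-in for the unknown plant nonlinearity to be compensated.
f = lambda x: np.sin(x) + 0.3 * x ** 2

# Weights fit by batch least squares over sampled data; the article
# instead adjusts them online while guaranteeing closed-loop stability.
X = np.linspace(-3.0, 3.0, 200)
Phi = np.array([phi(x) for x in X])
w, *_ = np.linalg.lstsq(Phi, f(X), rcond=None)

approx_err = np.max(np.abs(Phi @ w - f(X)))   # uniform error on the samples
```

With this density of overlapping Gaussians, the smooth target is reproduced to well below one percent uniform error on the sampled interval, which is the sense in which the network "represents the plant to a chosen degree of accuracy."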

2,254 citations


Cites methods from "Bettering Operation of Robots by Le..."

  • ...1), these methods can be used to estimate directly the values required for u_ad(x) [3, 4, 15]....


01 Jan 2012
TL;DR: A survey of work in reinforcement learning for behavior generation in robots can be found in this article, where the authors highlight key challenges in robot reinforcement learning as well as notable successes and discuss the role of algorithms, representations and prior knowledge in achieving these successes.

1,513 citations

Journal ArticleDOI
TL;DR: A hierarchical neural network model which accounts for the learning and control capability of the CNS and provides a promising parallel-distributed control scheme for a large-scale complex object whose dynamics are only partially known is proposed.
Abstract: In order to control voluntary movements, the central nervous system (CNS) must solve the following three computational problems at different levels: the determination of a desired trajectory in the visual coordinates, the transformation of its coordinates to the body coordinates and the generation of motor command. Based on physiological knowledge and previous models, we propose a hierarchical neural network model which accounts for the generation of motor command. In our model the association cortex provides the motor cortex with the desired trajectory in the body coordinates, where the motor command is then calculated by means of long-loop sensory feedback. Within the spinocerebellum — magnocellular red nucleus system, an internal neural model of the dynamics of the musculoskeletal system is acquired with practice, because of the heterosynaptic plasticity, while monitoring the motor command and the results of movement. Internal feedback control with this dynamical model updates the motor command by predicting a possible error of movement. Within the cerebrocerebellum — parvocellular red nucleus system, an internal neural model of the inverse-dynamics of the musculo-skeletal system is acquired while monitoring the desired trajectory and the motor command. The inverse-dynamics model substitutes for other brain regions in the complex computation of the motor command. The dynamics and the inverse-dynamics models are realized by a parallel distributed neural network, which comprises many sub-systems computing various nonlinear transformations of input signals and a neuron with heterosynaptic plasticity (that is, changes of synaptic weights are assumed proportional to a product of two kinds of synaptic inputs). 
Control and learning performance of the model was investigated by computer simulation, in which a robotic manipulator was used as a controlled system, with the following results: (1) Both the dynamics and the inverse-dynamics models were acquired during control of movements. (2) As motor learning proceeded, the inverse-dynamics model gradually took the place of external feedback as the main controller. Concomitantly, overall control performance became much better. (3) Once the neural network model learned to control some movement, it could control quite different and faster movements. (4) The neural network model worked well even when only very limited information about the fundamental dynamical structure of the controlled system was available. Consequently, the model not only accounts for the learning and control capability of the CNS, but also provides a promising parallel-distributed control scheme for a large-scale complex object whose dynamics are only partially known.
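The division of labor described in result (2), where an external feedback controller is gradually displaced by a learned inverse-dynamics model trained on the feedback signal itself, is the essence of feedback-error learning. A minimal sketch, assuming a point mass with a single unknown parameter (the gains, learning rate, and trajectory are illustrative, not from the article):

```python
import numpy as np

# Point mass with one unknown parameter: m * x'' = u (m unknown to the learner).
m_true = 2.0
Kp, Kd = 100.0, 20.0        # feedback gains (illustrative)
lr, dt = 0.5, 0.001         # learning rate and integration step
m_hat = 0.0                 # inverse-dynamics model: u_ff = m_hat * x_d''

x, dx = 0.0, 0.0
for step in range(30000):   # 30 s of a persistently exciting trajectory
    t = step * dt
    xd, dxd, ddxd = np.sin(t), np.cos(t), -np.sin(t)
    u_fb = Kp * (xd - x) + Kd * (dxd - dx)   # external feedback controller
    u_ff = m_hat * ddxd                      # learned feedforward command
    ddx = (u_fb + u_ff) / m_true
    x, dx = x + dt * dx, dx + dt * ddx
    # Feedback-error learning: the feedback signal itself is the training
    # error for the inverse model, so u_fb shrinks as m_hat takes over.
    m_hat += lr * u_fb * ddxd * dt
```

As the estimate approaches the true parameter, the feedforward term carries the control and the feedback contribution fades, mirroring the simulation finding that the inverse-dynamics model becomes the main controller.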

1,508 citations

References
Journal ArticleDOI
TL;DR: A new approach to the dynamic control of manipulators is developed from the viewpoint of mechanics, and it is shown that a linear feedback of generalized coordinates and their derivatives is effective for motion control in the large.
Department of Mechanical Engineering, Faculty of Engineering Science, Osaka University, Toyonaka, Osaka, 560 Japan
Abstract: A new approach to the dynamic control of manipulators is developed from the viewpoint of mechanics. It is first shown that a linear feedback of generalized coordinates and their derivatives is effective for motion control in the large. Next, we propose a method for task-oriented coordinate control which can be easily implemented by a micro-computer and is suited to sensor feedback control. The proposed method is applicable even when holonomic constraints are added to the system. Effectiveness of the proposed method is verified by computer simulation.
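The "linear feedback of generalized coordinates and their derivatives" amounts to PD joint control. A minimal sketch on a single pendulum joint; the model parameters, gains, and gravity-compensation term are illustrative additions, not taken from the abstract:

```python
import numpy as np

# Single joint modeled as a damped pendulum (illustrative):
#   m*l^2 * q'' = -m*g*l*sin(q) - b*q' + tau
m, l, g, b = 1.0, 1.0, 9.81, 0.1
Kp, Kd = 50.0, 10.0                  # linear feedback gains
q_des = np.pi / 4                    # target joint angle

q, dq, dt = 0.0, 0.0, 0.001
for _ in range(20000):               # 20 s of simulation
    # Linear feedback of the coordinate and its derivative, plus
    # gravity compensation at the measured configuration.
    tau = Kp * (q_des - q) - Kd * dq + m * g * l * np.sin(q)
    ddq = (-m * g * l * np.sin(q) - b * dq + tau) / (m * l * l)
    q, dq = q + dt * dq, dq + dt * ddq
```

With the gravity term cancelled, the closed loop is a damped linear spring about the setpoint, so the joint settles at `q_des`; this is the sense in which the linear feedback works "in the large" rather than only near a linearization point.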

1,155 citations

Proceedings ArticleDOI
01 Dec 1984
TL;DR: In this paper, a new concept called the "betterment process" is proposed, which gives a class of multi-input multi-output servomechanism or mechatronics systems, such as mechanical robots, the ability to learn to autonomously construct a better control input.
Abstract: A new concept called the "betterment process" is proposed for the purpose of giving a class of multi-input multi-output servomechanism or mechatronics systems, such as mechanical robots, the ability to learn to autonomously construct a better control input. It is assumed that those dynamic systems can be operated repeatedly at low cost and in a relatively short time under invariant initial physical conditions, but knowledge of a precise description of their dynamics is not required. The betterment process is composed of a simple iteration rule that autonomously generates a present actuator input better than the previous one, provided a desired output response is given. The convergence of the iteration is proved for a simple betterment process in which the (k+1)th input is composed of the kth input plus an increment of the derivative error between the kth output response and the given desired response. Discussions of potential applications of the proposed theory to controlling robots or other mechanical systems are presented, together with future subjects to be investigated.

382 citations

Journal ArticleDOI
TL;DR: These schemes not only assure the asymptotic stability of positioning, but also make it possible to move robot manipulators at high speed and stop them smoothly with high positioning accuracy.

32 citations