Open Access, Proceedings Article

Obstacle Avoidance through Reinforcement Learning

Tony J. Prescott, +1 more
- Vol. 4, pp 523-530
TLDR
A method is described for generating plan-like, reflexive, obstacle avoidance behaviour in a mobile robot that adapts its responses to sensory stimuli so as to minimise the negative reinforcement arising from collisions.
Abstract
A method is described for generating plan-like, reflexive, obstacle avoidance behaviour in a mobile robot. The experiments reported here use a simulated vehicle with a primitive range sensor. Avoidance behaviour is encoded as a set of continuous functions of the perceptual input space. These functions are stored using CMACs and trained by a variant of Barto and Sutton's adaptive critic algorithm. As the vehicle explores its surroundings it adapts its responses to sensory stimuli so as to minimise the negative reinforcement arising from collisions. Strategies for local navigation are therefore acquired in an explicitly goal-driven fashion. The resulting trajectories form elegant collision-free paths through the environment.
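The combination of a CMAC and a critic trained from collision penalties can be illustrated with a small sketch. This is not the authors' implementation: it assumes a one-dimensional range reading in [0, 1], a plain TD(0) critic without the action-selection half of the adaptive critic, and a vehicle that simply drives toward a wall. The names `CMAC`, `td_update`, and `train_critic`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np


class CMAC:
    """Minimal CMAC (tile coding) over a 1-D input in [lo, hi]. Illustrative only."""

    def __init__(self, n_tilings=8, n_tiles=10, lo=0.0, hi=1.0):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lo, self.hi = lo, hi
        # One extra tile per tiling absorbs the shift introduced by the offsets.
        self.w = np.zeros((n_tilings, n_tiles + 1))

    def active_tiles(self, x):
        # Scale x into tile units, then shift each tiling by a fraction of a tile.
        s = (np.clip(x, self.lo, self.hi) - self.lo) / (self.hi - self.lo) * self.n_tiles
        offsets = np.arange(self.n_tilings) / self.n_tilings
        return np.floor(s + offsets).astype(int)

    def predict(self, x):
        # The output is the sum of the weights of the active tile in each tiling.
        idx = self.active_tiles(x)
        return self.w[np.arange(self.n_tilings), idx].sum()

    def update(self, x, error, lr=0.1):
        # Spread the correction evenly over the active tiles.
        idx = self.active_tiles(x)
        self.w[np.arange(self.n_tilings), idx] += lr * error / self.n_tilings


def td_update(critic, x, reward, x_next, terminal, gamma=0.9, lr=0.1):
    """One TD(0) step: move the prediction at x toward reward + gamma * V(x_next)."""
    target = reward if terminal else reward + gamma * critic.predict(x_next)
    delta = target - critic.predict(x)
    critic.update(x, delta, lr)
    return delta


def train_critic(episodes=500):
    """Assumed toy task: the range reading shrinks by 0.1 per step until a collision."""
    critic = CMAC()
    for _ in range(episodes):
        for k in range(10, 0, -1):
            x, x_next = k / 10, (k - 1) / 10
            collided = (k - 1 == 0)
            reward = -1.0 if collided else 0.0  # negative reinforcement on collision
            td_update(critic, x, reward, x_next, terminal=collided)
    return critic


if __name__ == "__main__":
    critic = train_critic()
    for x in (0.1, 0.15, 0.5, 0.9):
        print(f"range={x:.2f}  predicted value={critic.predict(x):.3f}")
```

In this toy setting the learned prediction becomes more negative as the range reading approaches zero, and a reading the vehicle never visited (such as 0.15) receives a value blended from neighbouring tiles, which is the kind of generalisation over a continuous input space that a CMAC provides.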



Citations
Journal Article

Rapid, safe, and incremental learning of navigation strategies

TL;DR: A reinforcement connectionist learning architecture that allows an autonomous robot to acquire efficient navigation strategies in a few trials and has high tolerance to noisy sensory data and good generalization abilities is proposed.
Journal Article

Trajectory Planning and Obstacle Avoidance for Hyper-Redundant Serial Robots

TL;DR: The authors present an optimization algorithm for the motion planning of a hyper-redundant robot where the motion of one end (head) is an arbitrary desired path.

The sensorimotor foundations of phonology: a computational model of early childhood articulatory and phonetic development

TL;DR: HABLAR is a computational model of the sensorimotor foundations of early childhood phonological development. It is intended to explain key characteristics of normal phonological development, including the phonetic characteristics of babble, systematic and context-sensitive patterns of sound substitutions and deletions, and overgeneralization of pronunciation patterns.
Journal Article

Learning Signaling Behaviors and Specialization in Cooperative Agents

TL;DR: A learning mechanism is presented that allows a multiagent system to cooperate to achieve a gathering task efficiently in unknown and changing environments; simulation results show that the multiagent system always achieves near-optimal performance.
Proceedings Article

Path Planning of Humanoid Arm Based on Deep Deterministic Policy Gradient

TL;DR: A new obstacle avoidance algorithm is proposed that uses an existing deep reinforcement learning framework, deep deterministic policy gradient (DDPG), to plan the trajectory of a robot arm so that it avoids obstacles.
References
Book

The Sciences of the Artificial

TL;DR: A new edition of Simon's classic work on artificial intelligence adds a chapter that sorts out the current themes and tools for analyzing complexity and complex systems, taking into account important advances in cognitive psychology and the science of design while confirming and extending Simon's basic thesis that a physical symbol system has the necessary and sufficient means for intelligent action.
Journal Article

The Sciences of the Artificial

Journal Article

Neuronlike adaptive elements that can solve difficult learning control problems

TL;DR: This article shows that a system consisting of two neuron-like adaptive elements can solve a difficult learning control problem, where the task is to balance a pole that is hinged to a movable cart by applying forces to the cart base.
Journal Article

A Theory of Cerebellar Function

TL;DR: It is demonstrated that, in order for the learning process to be stable, pattern storage must be accomplished principally by weakening synaptic weights rather than by strengthening them.