Jan Peters
Researcher at Technische Universität Darmstadt
Publications - 725
Citations - 35857
Jan Peters is an academic researcher at Technische Universität Darmstadt. His research focuses on the topics of reinforcement learning and robotics. The author has an h-index of 81 and has co-authored 669 publications receiving 29940 citations. Previous affiliations of Jan Peters include the National Technical University of Athens and the Helen Wills Neuroscience Institute.
Papers
Journal ArticleDOI
Reinforcement learning in robotics: A survey
Jens Kober, Jan Peters +1 more
TL;DR: This article strengthens the links between the reinforcement learning and robotics communities by surveying work on reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes, and discussing the role of algorithms, representations, and prior knowledge in achieving those successes.
Journal ArticleDOI
2008 Special Issue: Reinforcement learning of motor skills with policy gradients
Jan Peters, Stefan Schaal +1 more
TL;DR: This paper examines the learning of complex motor skills with human-like limbs. It combines modular motor control by means of motor primitives, used as a way to generate parameterized control policies, with the theory of stochastic policy-gradient learning.
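The idea of improving a parameterized motor primitive with a policy gradient can be sketched in a few lines. Everything task-specific below (the sine-wave target, the Gaussian basis functions, the noise scale, and the learning rate) is an illustrative assumption, not taken from the paper; the gradient estimator is a generic likelihood-ratio (REINFORCE-style) estimate with a batch-mean baseline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical motor primitive: a trajectory is a weighted sum of
# normalized Gaussian basis functions over time.
t = np.linspace(0.0, 1.0, 50)
centers = np.linspace(0.0, 1.0, 5)
psi = np.exp(-((t[:, None] - centers[None, :]) ** 2) / 0.02)  # (50, 5)
psi /= psi.sum(axis=1, keepdims=True)                         # normalize rows

target = np.sin(np.pi * t)   # assumed desired trajectory, for illustration
w = np.zeros(5)              # primitive weights = policy parameters
sigma, alpha = 0.05, 2.0     # exploration noise and step size (assumptions)

def reward(weights):
    # Negative mean squared tracking error of the generated trajectory.
    return -np.mean((psi @ weights - target) ** 2)

err_init = -reward(w)
for _ in range(300):
    eps = rng.normal(0.0, sigma, size=(16, 5))      # parameter exploration
    rs = np.array([reward(w + e) for e in eps])
    baseline = rs.mean()                            # variance reduction
    # Likelihood-ratio gradient estimate for Gaussian parameter noise.
    grad = (eps * (rs - baseline)[:, None]).mean(axis=0) / sigma**2
    w += alpha * grad                               # gradient ascent on reward
err_final = -reward(w)
```

After training, the tracking error `err_final` should be far below the initial error, with the residual set by how well five basis functions can represent the target.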
Book
A Survey on Policy Search for Robotics
TL;DR: This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy, and presents a unified view of existing algorithms.
Proceedings Article
Natural Actor-Critic
TL;DR: This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural-gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression.
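The actor-critic structure described in this summary can be sketched on a toy problem. The single-state quadratic-reward task and all constants below are assumptions for illustration only; the point is the mechanism: regressing rewards onto the policy's score function (plus a constant) yields, as the score coefficient, the natural-gradient direction that the actor then follows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-state task: Gaussian policy a ~ N(theta, sigma^2),
# reward r(a) = -(a - 2)^2, so the optimal policy mean is theta = 2.
theta, sigma, alpha = 0.0, 1.0, 0.1

for _ in range(500):
    actions = rng.normal(theta, sigma, size=64)
    rewards = -(actions - 2.0) ** 2

    # Score function d log pi(a) / d theta for each sampled action.
    scores = (actions - theta) / sigma**2

    # Critic: linear regression of rewards on [score, 1]. The score
    # coefficient equals Cov(s, r)/Var(s) = F^{-1} g, i.e. the natural
    # gradient; the intercept estimates a value baseline.
    X = np.column_stack([scores, np.ones_like(scores)])
    w, baseline = np.linalg.lstsq(X, rewards, rcond=None)[0]

    # Actor: natural-gradient ascent step on the policy parameter.
    theta += alpha * w
```

With these settings `theta` contracts geometrically toward the optimum at 2.0, up to sampling noise; unlike the vanilla gradient, the natural-gradient step is invariant to the curvature induced by the policy parameterization.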