
Jan Peters

Researcher at Technische Universität Darmstadt

Publications -  725
Citations -  35857

Jan Peters is an academic researcher from Technische Universität Darmstadt. The author has contributed to research on the topics of Reinforcement learning & Robot. The author has an h-index of 81 and has co-authored 669 publications receiving 29940 citations. Previous affiliations of Jan Peters include National Technical University of Athens & Helen Wills Neuroscience Institute.

Papers
Journal ArticleDOI

Reinforcement learning in robotics: A survey

Jens Kober, +1 more

TL;DR: This article strengthens the links between the machine learning and robotics communities by surveying work on reinforcement learning for behavior generation in robots. It highlights both key challenges in robot reinforcement learning and notable successes, and discusses the role of algorithms, representations, and prior knowledge in achieving those successes.
Journal ArticleDOI

2008 Special Issue: Reinforcement learning of motor skills with policy gradients

TL;DR: This paper examines the learning of complex motor skills with human-like limbs, combining the idea of modular motor control by means of motor primitives, which provide parameterized control policies suitable for reinforcement learning, with the theory of stochastic policy gradient learning.
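For readers unfamiliar with the approach, the sketch below illustrates the general flavor of episodic stochastic policy gradient learning over motor-primitive parameters. It is a minimal toy example under assumed names, not the paper's algorithm; in particular, `rollout_return` is a hypothetical stand-in for executing the primitive on a robot or simulator.

```python
import numpy as np

# Hypothetical sketch: episodic likelihood-ratio ("REINFORCE"-style) policy
# gradient for a Gaussian exploration policy over motor-primitive parameters,
# theta ~ N(mu, sigma^2 I).

def rollout_return(theta):
    # Placeholder for running one episode with primitive parameters theta
    # and returning its total reward; here a toy quadratic objective.
    target = np.ones_like(theta)
    return -np.sum((theta - target) ** 2)

def reinforce_update(mu, sigma, n_rollouts=20, lr=0.05):
    """One gradient step on the mean of the parameter-exploring policy."""
    grads, returns = [], []
    for _ in range(n_rollouts):
        theta = mu + sigma * np.random.randn(*mu.shape)   # sample parameters
        returns.append(rollout_return(theta))
        grads.append((theta - mu) / sigma**2)             # grad of log N(theta; mu, sigma^2 I) w.r.t. mu
    returns = np.array(returns)
    baseline = returns.mean()                             # variance-reducing baseline
    g = np.mean([(R - baseline) * dlog for R, dlog in zip(returns, grads)], axis=0)
    return mu + lr * g

mu = np.zeros(5)
for _ in range(200):
    mu = reinforce_update(mu, sigma=0.3)
print(mu)   # approaches the toy target parameters
```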
Book

A Survey on Policy Search for Robotics

TL;DR: This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy, and presents a unified view of existing algorithms.
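The three organizing ingredients named in this TL;DR (exploration, policy evaluation, policy update) can be pictured as one generic loop. The skeleton below is a hypothetical illustration of that unified view with a toy reward-weighted update; it is not an algorithm taken from the survey, and all function names are placeholders.

```python
import numpy as np

# Hypothetical skeleton of the generic model-free policy search loop:
# explore -> evaluate -> update, repeated until convergence.

def policy_search(init_params, explore, evaluate, update, n_iters=100):
    """explore(params)  -> candidate parameter samples (exploration strategy)
       evaluate(samples) -> return per sample, e.g. from rollouts (policy evaluation strategy)
       update(params, samples, values) -> improved parameters (policy update strategy)"""
    params = init_params
    for _ in range(n_iters):
        samples = explore(params)
        values = evaluate(samples)
        params = update(params, samples, values)
    return params

# Toy instantiation: Gaussian parameter perturbation and a reward-weighted
# average update, loosely in the spirit of EM-like policy search methods.
def explore(params, n=30, noise=0.2):
    return params + noise * np.random.randn(n, params.size)

def evaluate(samples):
    target = np.ones(samples.shape[1])
    return -np.sum((samples - target) ** 2, axis=1)

def update(params, samples, values):
    w = np.exp(values - values.max())        # exponentiated returns as weights
    return (w[:, None] * samples).sum(axis=0) / w.sum()

print(policy_search(np.zeros(4), explore, evaluate, update))
```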
Proceedings Article

Natural Actor-Critic

TL;DR: This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic, where the actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression.
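A minimal sketch of the core idea, assuming a Gaussian policy with a linear mean: fitting advantage estimates by least-squares regression on the compatible features ∇θ log π(a|s) yields weights that coincide with the natural policy gradient, so the actor step simply adds them. The code below is an illustrative toy under those assumptions, not the paper's actual implementation, and the data fed to it is a made-up placeholder for real rollouts.

```python
import numpy as np

# Hypothetical Natural Actor-Critic sketch: the critic is a linear regression
# on compatible features grad_theta log pi(a|s); its weights w serve as the
# natural policy gradient, so the actor update is theta += alpha * w.
# Policy assumed here: a ~ N(phi(s) @ theta, sigma^2), 1-D action.

def grad_log_pi(theta, phi, a, sigma=0.5):
    return phi * (a - phi @ theta) / sigma**2          # compatible features

def natural_actor_critic_step(theta, transitions, alpha=0.1, sigma=0.5):
    """transitions: list of (phi_s, action, advantage_estimate) tuples, where
    advantage_estimate is a placeholder, e.g. Monte-Carlo return minus a baseline."""
    X = np.array([grad_log_pi(theta, phi, a, sigma) for phi, a, _ in transitions])
    y = np.array([adv for _, _, adv in transitions])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)          # least-squares critic fit
    return theta + alpha * w                           # natural-gradient actor update

# Toy usage with synthetic data standing in for real rollouts.
rng = np.random.default_rng(0)
theta = np.zeros(3)
transitions = [(rng.standard_normal(3), rng.standard_normal(), rng.standard_normal())
               for _ in range(50)]
print(natural_actor_critic_step(theta, transitions))
```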