
Yuval Tassa

Researcher at Google

Publications -  58
Citations -  16371

Yuval Tassa is an academic researcher at Google. His research focuses on reinforcement learning and humanoid robots. He has an h-index of 31 and has co-authored 53 publications receiving 13,493 citations. His previous affiliations include the University of Washington and the Hebrew University of Jerusalem.

Papers
Posted Content

Continuous control with deep reinforcement learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
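The core of the deterministic policy gradient used here is the chain rule through the critic: the actor parameters move along E[∇ₐQ(s, μ(s)) · ∇_θ μ_θ(s)]. A minimal toy sketch of that update (not code from the paper — it assumes a 1-D state, a linear policy a = θ·s, and a known analytic critic rather than a learned neural one):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy critic (assumed for illustration): Q(s, a) = -(a - s)^2,
# maximized when a = s, i.e. when the policy parameter theta equals 1.
def dQ_da(s, a):
    return -2.0 * (a - s)          # analytic gradient of Q w.r.t. the action

theta = 0.0                         # actor parameter of mu_theta(s) = theta * s
lr = 0.1
for _ in range(200):
    s = rng.uniform(-1.0, 1.0, size=32)   # batch of sampled states
    a = theta * s                          # deterministic policy output
    # Deterministic policy gradient: E[ dQ/da * d mu/d theta ]
    grad = np.mean(dQ_da(s, a) * s)
    theta += lr * grad                     # gradient ascent on expected return

print(round(theta, 3))              # converges toward the optimal value 1.0
```

The full algorithm replaces the analytic critic with a learned Q-network, adds a replay buffer, and stabilizes training with slowly-updated target networks.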
Proceedings ArticleDOI

MuJoCo: A physics engine for model-based control

TL;DR: A new physics engine tailored to model-based control, based on the modern velocity-stepping approach, which avoids the difficulties of spring-dampers and can compute both forward and inverse dynamics.
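The "velocity-stepping" idea can be illustrated in one dimension: rather than modeling contact with a stiff penalty spring-damper (which forces tiny time steps), each step solves for a new velocity that satisfies the non-penetration constraint directly. A minimal numerical sketch, assuming a point mass dropping onto the ground at height 0 (a stand-in illustration, not MuJoCo's actual constraint solver):

```python
# A point mass under gravity with an inelastic ground contact at q = 0.
dt, g = 0.01, 9.81
q, v = 1.0, 0.0                 # height and velocity
for _ in range(1000):
    v = v - g * dt              # unconstrained velocity step
    if q + v * dt < 0.0:        # the step would penetrate the ground:
        v = max(v, -q / dt)     # smallest velocity correction keeping q_next >= 0
    q = q + v * dt              # position update uses the constrained velocity

print(q, v)                     # mass comes to rest on the ground
```

The step stays stable at large `dt` because contact is resolved at the velocity level instead of through a stiff restoring force.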
Proceedings Article

Continuous control with deep reinforcement learning

TL;DR: In this paper, an actor-critic, model-free algorithm based on the deterministic policy gradient is proposed to operate over continuous action spaces; it is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain.
Proceedings ArticleDOI

Synthesis and stabilization of complex behaviors through online trajectory optimization

TL;DR: An online trajectory optimization method and software platform applicable to complex humanoid robots performing challenging tasks such as getting up from an arbitrary pose on the ground and recovering from large disturbances using dexterous acrobatic maneuvers is presented.
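Online trajectory optimization of this kind follows a receding-horizon loop: solve a finite-horizon optimal-control problem from the current state, apply only the first control, then re-plan. A minimal linear-dynamics sketch of that loop (a simplified stand-in using a finite-horizon LQR backward pass, not the paper's iLQG solver or its nonlinear humanoid model):

```python
import numpy as np

# Assumed toy system: a double integrator x' = A x + B u with dt = 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)                      # state cost
R = np.array([[0.01]])             # control cost
H = 20                             # planning horizon

def first_lqr_gain(A, B, Q, R, H):
    """Backward Riccati recursion; returns the gain for the first step."""
    P = Q.copy()
    K = None
    for _ in range(H):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

x = np.array([[1.0], [0.0]])       # start displaced from the origin
for _ in range(100):               # closed loop: plan, apply first control, re-plan
    K = first_lqr_gain(A, B, Q, R, H)
    x = A @ x - B @ (K @ x)

print(float(np.linalg.norm(x)))    # state driven near the origin
```

The paper's method replaces the linear backward pass with iterative LQG on the full nonlinear dynamics, re-solved fast enough to run in the control loop and recover from large disturbances.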
Posted Content

Emergence of Locomotion Behaviours in Rich Environments

TL;DR: This paper explores how a rich environment can promote the learning of complex behaviour, and finds that this encourages the emergence of robust behaviours that perform well across a suite of tasks.