Danijar Hafner

Researcher at Google

Publications: 41
Citations: 4545

Danijar Hafner is an academic researcher at Google. He has contributed to research on topics including reinforcement learning and intelligent agents. He has an h-index of 17 and has co-authored 41 publications receiving 2605 citations. His previous affiliations include the Hasso Plattner Institute and the University of Toronto.

Papers
Proceedings Article

Learning Latent Dynamics for Planning from Pixels

TL;DR: Proposes the Deep Planning Network (PlaNet), a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space, using a latent dynamics model with both deterministic and stochastic transition components.
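As a rough illustration of the transition structure described in the TL;DR (my sketch, not the authors' code): the latent state combines a deterministic recurrent component with a stochastic sample, and imagined rollouts are scored by a learned reward head without decoding back to images. All weights, dimensions, and the reward head here are made up for illustration.

```python
import jax
import jax.numpy as jnp

def transition(params, h, z, action, key):
    """One latent step: deterministic recurrent update plus a stochastic state."""
    # Deterministic component: simple tanh recurrence over (h, z, action).
    x = jnp.concatenate([z, action])
    h_next = jnp.tanh(params["Wh"] @ h + params["Wx"] @ x)
    # Stochastic component: Gaussian whose mean/std are predicted from h_next.
    mean = params["Wm"] @ h_next
    std = jax.nn.softplus(params["Ws"] @ h_next) + 1e-3
    z_next = mean + std * jax.random.normal(key, mean.shape)
    return h_next, z_next

def init_params(key, h_dim=8, z_dim=4, a_dim=2):
    k1, k2, k3, k4 = jax.random.split(key, 4)
    return {
        "Wh": jax.random.normal(k1, (h_dim, h_dim)) * 0.1,
        "Wx": jax.random.normal(k2, (h_dim, z_dim + a_dim)) * 0.1,
        "Wm": jax.random.normal(k3, (z_dim, h_dim)) * 0.1,
        "Ws": jax.random.normal(k4, (z_dim, h_dim)) * 0.1,
    }

def rollout_return(params, h, z, actions, key, reward_w):
    """Score a candidate action sequence entirely in latent space."""
    total = 0.0
    for a in actions:
        key, sub = jax.random.split(key)
        h, z = transition(params, h, z, a, sub)
        total += jnp.dot(reward_w, h)  # stand-in for a learned reward predictor
    return total
```

In the paper, fast online planning searches over candidate action sequences scored this way; that search procedure is omitted from the sketch.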
Journal ArticleDOI

A deep learning framework for neuroscience

TL;DR: It is argued that a deep network is best understood in terms of the components used to design it (objective functions, architecture, and learning rules) rather than by unit-by-unit computation.
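A toy sketch (my illustration, not from the paper) of that framing: a network described by its three design components rather than by the activity of individual units. The specific architecture, objective, and learning rule below are arbitrary placeholders.

```python
import jax
import jax.numpy as jnp

def architecture(params, x):
    """Architecture: a two-layer tanh network (one design choice among many)."""
    h = jnp.tanh(params["W1"] @ x)
    return params["W2"] @ h

def objective(params, x, y):
    """Objective function: squared error between prediction and target."""
    return jnp.sum((architecture(params, x) - y) ** 2)

def learning_rule(params, x, y, lr=1e-2):
    """Learning rule: gradient descent on the objective (others would also fit)."""
    grads = jax.grad(objective)(params, x, y)
    return {k: params[k] - lr * grads[k] for k in params}
```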
Proceedings Article

Dream to Control: Learning Behaviors by Latent Imagination

TL;DR: Dreamer is presented, a reinforcement learning agent that solves long-horizon tasks purely by latent imagination and efficiently learns behaviors by backpropagating analytic gradients of learned state values through trajectories imagined in the compact state space of a learned world model.
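A minimal sketch of that key mechanism (toy linear dynamics and value function of my own, not Dreamer's actual networks): imagine a latent trajectory with a differentiable model and backpropagate the gradient of the summed predicted values through the whole trajectory into the policy parameters.

```python
import jax
import jax.numpy as jnp

def imagined_value(policy_w, z0, dyn_w, value_w, horizon=15):
    """Differentiable imagined return: policy -> dynamics -> value, repeated."""
    z, total = z0, 0.0
    for _ in range(horizon):
        a = jnp.tanh(policy_w @ z)                      # policy acts in latent space
        z = jnp.tanh(dyn_w @ jnp.concatenate([z, a]))   # learned latent dynamics
        total += jnp.dot(value_w, z)                    # learned value of imagined state
    return total

key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
z_dim, a_dim = 6, 2
policy_w = jax.random.normal(k1, (a_dim, z_dim)) * 0.1
dyn_w = jax.random.normal(k2, (z_dim, z_dim + a_dim)) * 0.1
value_w = jax.random.normal(k3, (z_dim,)) * 0.1
z0 = jax.random.normal(k4, (z_dim,))

# Analytic gradient of the imagined return w.r.t. the policy parameters,
# obtained by backpropagating through the entire imagined trajectory.
grad = jax.grad(imagined_value)(policy_w, z0, dyn_w, value_w)
policy_w = policy_w + 1e-2 * grad  # one ascent step on imagined values
```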
Proceedings ArticleDOI

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

TL;DR: This system can learn quadruped locomotion from scratch using simple reward signals, and users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed.
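A toy sketch of the idea in the TL;DR (all names, gains, and the sinusoidal pattern are hypothetical, not taken from the paper): a simple reward, plus an optional open-loop reference trajectory blended into the commanded joint targets to bias the learned gait.

```python
import jax.numpy as jnp

def reference_gait(t, amplitude=0.3, frequency=1.5):
    """Hypothetical open-loop sinusoidal pattern for 8 actuated joints."""
    phases = jnp.arange(8) * jnp.pi / 4.0
    return amplitude * jnp.sin(2.0 * jnp.pi * frequency * t + phases)

def action(policy_residual, t, blend=0.5):
    """Final joint targets: open-loop reference plus a learned residual."""
    return blend * reference_gait(t) + (1.0 - blend) * policy_residual

def reward(forward_velocity, energy):
    """Simple reward signal: move forward, penalize energy use."""
    return forward_velocity - 0.005 * energy
```

Setting blend to zero recovers learning from scratch with only the reward; raising it gives the user more control over the resulting gait.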
Posted Content

Learning Latent Dynamics for Planning from Pixels

TL;DR: In this article, the Deep Planning Network (PlaNet) learns the environment dynamics from images and chooses actions through fast online planning in latent space, achieving state-of-the-art performance on continuous control tasks with contact dynamics, partial observability, and sparse rewards.