Daan Wierstra
Researcher at Google
Publications - 71
Citations - 66536
Daan Wierstra is an academic researcher from Google. The author has contributed to research on topics including reinforcement learning and recurrent neural networks. The author has an h-index of 51 and has co-authored 71 publications receiving 51290 citations. Previous affiliations of Daan Wierstra include Utrecht University and the Dalle Molle Institute for Artificial Intelligence Research.
Papers
Journal ArticleDOI
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Posted Content
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Posted Content
Continuous control with deep reinforcement learning
Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Proceedings Article
Matching networks for one shot learning
TL;DR: This paper proposes a network that maps a small labeled support set and an unlabeled example to its label, obviating the need for fine-tuning to adapt to new class types.
Posted Content
Stochastic Backpropagation and Approximate Inference in Deep Generative Models
TL;DR: In this article, a recognition model is introduced to represent approximate posterior distributions and act as a stochastic encoder of the data, which allows for joint optimisation of the parameters of both the generative model and the recognition model.