Daan Wierstra

Researcher at Google

Publications -  71
Citations -  66536

Daan Wierstra is an academic researcher at Google. The author has contributed to research on topics including reinforcement learning and recurrent neural networks. The author has an h-index of 51 and has co-authored 71 publications receiving 51290 citations. Previous affiliations of Daan Wierstra include Utrecht University and the Dalle Molle Institute for Artificial Intelligence Research.

Papers
Journal Article

Human-level control through deep reinforcement learning

TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Posted Content

Playing Atari with Deep Reinforcement Learning

TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Posted Content

Continuous control with deep reinforcement learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Proceedings Article

Matching networks for one shot learning

TL;DR: This paper proposes a network that maps a small labeled support set and an unlabeled example to its label, obviating the need for fine-tuning to adapt to new class types.
Posted Content

Stochastic Backpropagation and Approximate Inference in Deep Generative Models

TL;DR: This article introduces a recognition model that represents approximate posterior distributions and acts as a stochastic encoder of the data, which allows for joint optimisation of the parameters of both the generative model and the recognition model.