Yan Duan

Researcher at University of California, Berkeley

Publications -  49
Citations -  14077

Yan Duan is an academic researcher from University of California, Berkeley. The author has contributed to research in topics: Reinforcement learning & Feature learning. The author has an h-index of 33, co-authored 48 publications receiving 11634 citations.

Papers
Posted Content

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

TL;DR: InfoGAN is a generative adversarial network that maximizes the mutual information between a small subset of the latent variables and the observation; it can be interpreted as a variation of the Wake-Sleep algorithm.
Proceedings Article

InfoGAN: interpretable representation learning by information maximizing generative adversarial nets

TL;DR: InfoGAN is an information-theoretic extension to the GAN that learns disentangled representations in a completely unsupervised manner; on the CelebA face dataset it discovers visual concepts that include hair styles, presence of eyeglasses, and emotions.
Proceedings Article

Benchmarking deep reinforcement learning for continuous control

TL;DR: In this paper, the authors present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with high state and action dimensionality such as 3D humanoid locomotion, and tasks with partial observations.
Posted Content

RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning

TL;DR: This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data: the fast algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.
Journal Article

Motion planning with sequential convex optimization and convex collision checking

TL;DR: This paper presents a sequential convex optimization procedure that penalizes collisions with a hinge loss and increases the penalty coefficients in an outer loop as necessary, together with an efficient formulation of the no-collisions constraint that directly considers continuous-time safety.
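
The hinge-loss penalty with an outer loop described in the motion planning summary above can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the paper's implementation: the problem is one-dimensional and already convex, so no convexification step is needed, and the wall position, target, safety margin, and the function names `golden_section_min` and `penalty_method` are all hypothetical choices made for illustration.

```python
import math


def hinge(v):
    """Hinge function: zero when the constraint is satisfied, linear otherwise."""
    return max(0.0, v)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal 1-D function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def penalty_method(target=-1.0, d_safe=0.5, mu=1.0, mu_scale=10.0,
                   tol=1e-6, max_outer=10):
    """Toy penalty loop: a point wants to reach `target` but must stay
    at least `d_safe` away from a wall occupying x <= 0 (assumed setup)."""

    def signed_distance(x):
        # Signed distance to the wall: positive outside, negative inside.
        return x

    x = target
    for _ in range(max_outer):
        def objective(x, mu=mu):
            # Quadratic cost plus hinge penalty on the safety violation.
            return (x - target) ** 2 + mu * hinge(d_safe - signed_distance(x))

        # Inner solve: the penalized problem is convex in this toy setting.
        x = golden_section_min(objective, -5.0, 5.0)

        # Outer loop: raise the penalty coefficient only if still unsafe.
        violation = hinge(d_safe - signed_distance(x))
        if violation <= tol:
            break
        mu *= mu_scale
    return x, mu


x_final, mu_final = penalty_method()
```

With the initial coefficient the penalized minimizer still violates the safety margin, so the coefficient is raised once; after that the minimizer sits on the boundary of the safe region. In a full sequential convex optimization method the inner solve would instead repeatedly convexify a nonconvex problem around the current iterate.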