Yan Duan
Researcher at University of California, Berkeley
Publications - 49
Citations - 14077
Yan Duan is an academic researcher from University of California, Berkeley. The author has contributed to research in topics: Reinforcement learning & Feature learning. The author has an h-index of 33 and has co-authored 48 publications receiving 11634 citations.
Papers
Posted Content
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
TL;DR: InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation; its training procedure can be interpreted as a variation of the Wake-Sleep algorithm.
Proceedings Article
InfoGAN: interpretable representation learning by information maximizing generative adversarial nets
TL;DR: InfoGAN is an information-theoretic extension to the GAN that learns disentangled representations in a completely unsupervised manner; on the CelebA face dataset it discovers visual concepts including hair styles, presence of eyeglasses, and emotions.
Proceedings Article
Benchmarking deep reinforcement learning for continuous control
TL;DR: In this paper, the authors present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with high state and action dimensionality such as 3D humanoid locomotion, and tasks with partial observations.
Posted Content
RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning
TL;DR: This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data: the fast algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.
Journal Article
Motion planning with sequential convex optimization and convex collision checking
John Schulman, Yan Duan, Jonathan Ho, Alex X. Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, Pieter Abbeel
TL;DR: A sequential convex optimization procedure, which penalizes collisions with a hinge loss and increases the penalty coefficients in an outer loop as necessary, and an efficient formulation of the no-collisions constraint that directly considers continuous-time safety are presented.