Shixiang Gu
Researcher at Google
Publications - 84
Citations - 13594
Shixiang Gu is an academic researcher at Google. He has contributed to research in topics including reinforcement learning and computer science, has an h-index of 35, and has co-authored 62 publications receiving 9,130 citations. Previous affiliations of Shixiang Gu include the Max Planck Society and the University of Cambridge.
Papers
Proceedings Article
Categorical Reparameterization with Gumbel-Softmax
Eric Jang,Shixiang Gu,Ben Poole +2 more
TL;DR: Gumbel-Softmax, as introduced in this paper, replaces non-differentiable samples from a categorical distribution with differentiable samples from a novel Gumbel-Softmax distribution, which has the essential property that it can be smoothly annealed into the categorical distribution.
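The paper's own implementation is not reproduced here, but the relaxation it describes can be sketched in a few lines. This minimal NumPy version (function name and array shapes are illustrative) perturbs the logits with Gumbel noise and applies a temperature-scaled softmax; as the temperature approaches zero, the sample approaches a one-hot categorical draw.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed (differentiable) sample from a categorical distribution."""
    # Standard Gumbel noise: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = rng.uniform(size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Perturb the logits and apply a temperature-scaled softmax
    y = (logits + gumbel) / temperature
    y = np.exp(y - y.max())  # subtract max for numerical stability
    return y / y.sum()

rng = np.random.default_rng(0)
# The result lies on the probability simplex (non-negative, sums to 1)
sample = gumbel_softmax_sample(np.array([1.0, 2.0, 0.5]), temperature=0.5, rng=rng)
```

At high temperature the sample is close to uniform; at low temperature it concentrates on one category, which is what "smoothly annealed into the categorical distribution" refers to.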
Posted Content
Categorical Reparameterization with Gumbel-Softmax
Eric Jang,Shixiang Gu,Ben Poole +2 more
TL;DR: It is shown that the Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
Proceedings ArticleDOI
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
TL;DR: This paper shows that a deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
Posted Content
Towards Deep Neural Network Architectures Robust to Adversarial Examples
Shixiang Gu,Luca Rigazio +1 more
TL;DR: This paper proposes Deep Contractive Network, a new end-to-end training procedure that includes a smoothness penalty inspired by the contractive autoencoder (CAE), which increases the network's robustness to adversarial examples without a significant performance penalty.
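The CAE-style smoothness penalty mentioned above is the squared Frobenius norm of the Jacobian of a layer's activations with respect to its input. A minimal NumPy sketch for a single sigmoid layer (the function name and argument shapes are illustrative, not the paper's code):

```python
import numpy as np

def contractive_penalty(x, W, b):
    """CAE-style smoothness penalty for one sigmoid layer:
    the squared Frobenius norm of the Jacobian dh/dx."""
    h = 1.0 / (1.0 + np.exp(-(x @ W + b)))  # sigmoid activations
    # For h_j = sigmoid(w_j . x + b_j):  dh_j/dx_i = h_j (1 - h_j) W_ij,
    # so ||J||_F^2 = sum_j (h_j (1 - h_j))^2 * ||W[:, j]||^2
    return np.sum((h * (1 - h)) ** 2 * np.sum(W ** 2, axis=0))
```

Adding this term to the training loss discourages large input-output gradients, which is the mechanism the paper leverages for robustness to small adversarial perturbations.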
Posted Content
Continuous Deep Q-Learning with Model-based Acceleration
TL;DR: This paper proposes normalized advantage functions (NAF) as an alternative to the more commonly used policy gradient and actor-critic methods, to accelerate model-free reinforcement learning for continuous control tasks.
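The key idea behind NAF is to restrict the advantage term of the Q-function to a quadratic in the action, so the greedy action is available in closed form. A minimal NumPy sketch of that decomposition (names and shapes are illustrative, assuming a precomputed positive-definite matrix P):

```python
import numpy as np

def naf_q_value(state_value, mu, P, action):
    """NAF decomposition: Q(s, a) = V(s) + A(s, a), with quadratic advantage
    A(s, a) = -0.5 * (a - mu)^T P (a - mu), P positive definite."""
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return state_value + advantage
```

Because the advantage is a negative quadratic, Q is maximized exactly at `action = mu`, so the policy never needs a separate actor network or an inner optimization over actions.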