Top 7 papers published by Scott Reed from Google in 2017

Proceedings Article

Robust Imitation of Diverse Behaviors

Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess

04 Dec 2017

TL;DR: A new version of GAIL is developed that is much more robust than the purely-supervised controller, especially with few demonstrations, and avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not.

Abstract: Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train. In this paper, we show how to combine the favourable aspects of these two approaches. The base of our model is a new type of variational autoencoder on demonstration trajectories that learns semantic policy embeddings. We show that these embeddings can be learned on a 9 DoF Jaco robot arm in reaching tasks, and then smoothly interpolated with a resulting smooth interpolation of reaching behavior. Leveraging these policy representations, we develop a new version of GAIL that (1) is much more robust than the purely-supervised controller, especially with few demonstrations, and (2) avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not. We demonstrate our approach on learning diverse gaits from demonstration on a 2D biped and a 62 DoF 3D humanoid in the MuJoCo physics environment.

115 citations

Posted Content

Parallel Multiscale Autoregressive Density Estimation

Scott Reed, Aaron van den Oord, Nal Kalchbrenner, Sergio Gomez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas

10 Mar 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: A parallelized PixelCNN is proposed that allows more efficient inference by modeling certain pixel groups as conditionally independent. It achieves competitive density estimation and an orders-of-magnitude speedup (O(log N) sampling instead of O(N)), enabling the practical generation of 512x512 images.

Abstract: PixelCNN achieves state-of-the-art results in density estimation for natural images. Although training is fast, inference is costly, requiring one network evaluation per pixel; O(N) for N pixels. This can be sped up by caching activations, but still involves generating each pixel sequentially. In this work, we propose a parallelized PixelCNN that allows more efficient inference by modeling certain pixel groups as conditionally independent. Our new PixelCNN model achieves competitive density estimation and orders of magnitude speedup - O(log N) sampling instead of O(N) - enabling the practical generation of 512x512 images. We evaluate the model on class-conditional image generation, text-to-image synthesis, and action-conditional video generation, showing that our model achieves the best results among non-pixel-autoregressive density models that allow efficient sampling.

114 citations

Proceedings Article

Parallel Multiscale Autoregressive Density Estimation

Scott Reed¹, Aaron van den Oord¹, Nal Kalchbrenner¹, Sergio Gomez Colmenarejo¹, Ziyu Wang², Yutian Chen¹, Dan Belov¹, Nando de Freitas¹

Google¹, University College London²

17 Jul 2017

100 citations

Generating Interpretable Images with Controllable Structure

Scott Reed¹, Aaron van den Oord¹, Nal Kalchbrenner¹, Victor Bapst¹, Matthew Botvinick¹, Nando de Freitas¹

Google¹

24 Apr 2017

TL;DR: Text-to-image synthesis with controllable object locations is improved using an extension of Pixel Convolutional Neural Networks (PixelCNN); the model can also generate images conditioned on part keypoints and segmentation masks.

Abstract: We demonstrate improved text-to-image synthesis with controllable object locations using an extension of Pixel Convolutional Neural Networks (PixelCNN). In addition to conditioning on text, we show how the model can generate images conditioned on part keypoints and segmentation masks. The character-level text encoder and image generation network are jointly trained end-to-end via maximum likelihood. We establish quantitative baselines in terms of text and structure-conditional pixel log-likelihood for three data sets: Caltech-UCSD Birds (CUB), MPII Human Pose (MHP), and Common Objects in Context (MS-COCO).

61 citations

Posted Content

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions

Scott Reed¹, Yutian Chen¹, Tom Le Paine², Aaron van den Oord³, S. M. Ali Eslami¹, Danilo Jimenez Rezende¹, Oriol Vinyals¹, Nando de Freitas⁴

Google¹, University of Illinois at Urbana–Champaign², Ghent University³, University of Oxford⁴

27 Oct 2017-arXiv: Neural and Evolutionary Computing

TL;DR: Neural attention and meta-learning techniques are combined with autoregressive models (PixelCNN) to enable few-shot density estimation, so that visual concepts can be learned from only a handful of examples. Standard deep autoregressive models, by contrast, require many thousands of gradient-based weight updates and unique image examples for training.

Abstract: Deep autoregressive models have shown state-of-the-art performance in density estimation for natural images on large-scale datasets such as ImageNet. However, such models require many thousands of gradient-based weight updates and unique image examples for training. Ideally, the models would rapidly learn visual concepts from only a handful of examples, similar to the manner in which humans learn across many vision tasks. In this paper, we show how 1) neural attention and 2) meta learning techniques can be used in combination with autoregressive models to enable effective few-shot density estimation. Our proposed modifications to PixelCNN result in state-of-the-art few-shot density estimation on the Omniglot dataset. Furthermore, we visualize the learned attention policy and find that it learns intuitive algorithms for simple tasks such as image mirroring on ImageNet and handwriting on Omniglot without supervision. Finally, we extend the model to natural images and demonstrate few-shot image generation on the Stanford Online Products dataset.

38 citations

Posted Content

ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner

29 Dec 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, a fully-convolutional generative 3D CNN model whose filter kernels are invariant to the overall scene size is proposed to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases.

Abstract: We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels. The key contribution of our method is its ability to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases. To this end, we devise a fully-convolutional generative 3D CNN model whose filter kernels are invariant to the overall scene size. The model can be trained on scene subvolumes but deployed on arbitrarily large scenes at test time. In addition, we propose a coarse-to-fine inference strategy in order to produce high-resolution output while also leveraging large input context sizes. In an extensive series of experiments, we carefully evaluate different model design choices, considering both deterministic and probabilistic models for completion and semantic inference. Our results show that we outperform other methods not only in the size of the environments handled and processing efficiency, but also with regard to completion quality and semantic segmentation performance by a significant margin.

35 citations

Posted Content

Robust Imitation of Diverse Behaviors

Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess

10 Jul 2017-arXiv: Learning

TL;DR: A variational autoencoder on demonstration trajectories is used to learn semantic policy embeddings; the embeddings can be smoothly interpolated, yielding a correspondingly smooth interpolation of reaching behavior.

Abstract: Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train. In this paper, we show how to combine the favourable aspects of these two approaches. The base of our model is a new type of variational autoencoder on demonstration trajectories that learns semantic policy embeddings. We show that these embeddings can be learned on a 9 DoF Jaco robot arm in reaching tasks, and then smoothly interpolated with a resulting smooth interpolation of reaching behavior. Leveraging these policy representations, we develop a new version of GAIL that (1) is much more robust than the purely-supervised controller, especially with few demonstrations, and (2) avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not. We demonstrate our approach on learning diverse gaits from demonstration on a 2D biped and a 62 DoF 3D humanoid in the MuJoCo physics environment.

27 citations

Showing papers by "Scott Reed" published in 2017