Top 18 papers published by Xi Chen from University of California, Berkeley in 2016

Posted Content•

Improved Techniques for Training GANs

Tim Salimans¹, Ian Goodfellow², Wojciech Zaremba³, Vicki Cheung, Alec Radford¹, Xi Chen⁴

OpenAI¹, Google², Facebook³, University of California, Berkeley⁴

10 Jun 2016-arXiv: Learning

TL;DR: In this article, the authors present a variety of new architectural features and training procedures that apply to the generative adversarial networks (GANs) framework and achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN.

Abstract: We present a variety of new architectural features and training procedures that we apply to the generative adversarial networks (GANs) framework. We focus on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic. Unlike most work on generative models, our primary goal is not to train a model that assigns high likelihood to test data, nor do we require the model to be able to learn well without using any labels. Using our new techniques, we achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN. The generated images are of high quality as confirmed by a visual Turing test: our model generates MNIST samples that humans cannot distinguish from real data, and CIFAR-10 samples that yield a human error rate of 21.3%. We also present ImageNet samples with unprecedented resolution and show that our methods enable the model to learn recognizable features of ImageNet classes.

5,711 citations

Proceedings Article•

Improved techniques for training GANs

Tim Salimans¹, Ian Goodfellow², Wojciech Zaremba³, Vicki Cheung, Alec Radford¹, Xi Chen⁴

OpenAI¹, Google², Facebook³, University of California, Berkeley⁴

05 Dec 2016

TL;DR: In this article, a variety of new architectural features and training procedures are applied to the generative adversarial networks (GANs) framework, achieving state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN.

Abstract: We present a variety of new architectural features and training procedures that we apply to the generative adversarial networks (GANs) framework. Using our new techniques, we achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN. The generated images are of high quality as confirmed by a visual Turing test: our model generates MNIST samples that humans cannot distinguish from real data, and CIFAR-10 samples that yield a human error rate of 21.3%. We also present ImageNet samples with unprecedented resolution and show that our methods enable the model to learn recognizable features of ImageNet classes.

3,332 citations

Posted Content•

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

Xi Chen¹, Yan Duan¹, Rein Houthooft¹, John Schulman¹, Ilya Sutskever², Pieter Abbeel¹

University of California, Berkeley¹, OpenAI²

12 Jun 2016-arXiv: Learning

TL;DR: InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation; its training procedure can be interpreted as a variation of the Wake-Sleep algorithm.

Abstract: This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound to the mutual information objective that can be optimized efficiently, and show that our training procedure can be interpreted as a variation of the Wake-Sleep algorithm. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.

2,409 citations
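
A hedged sketch of the lower bound mentioned in the InfoGAN abstract above. The notation is an assumption for illustration and does not appear in this listing: c is the small subset of latent variables, z the remaining noise, G(z, c) the generator output, P(c) the prior over c, and Q(c | x) an auxiliary distribution that approximates the posterior over c. Since a KL divergence between the true posterior and Q is non-negative, a standard variational argument gives:

```latex
\begin{aligned}
I\bigl(c;\, G(z,c)\bigr)
  &= H(c) - H\bigl(c \mid G(z,c)\bigr) \\
  &\ge H(c) + \mathbb{E}_{c \sim P(c),\; x \sim G(z,c)}\bigl[\log Q(c \mid x)\bigr]
  \;=:\; L_I(G, Q).
\end{aligned}
```

Under these assumptions the right-hand side only needs samples of c and an evaluation of Q on generated data, which is presumably why it can be optimized efficiently alongside the usual GAN objective.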

Proceedings Article•

InfoGAN: interpretable representation learning by information maximizing generative adversarial nets

Xi Chen¹, Yan Duan¹, Rein Houthooft¹, John Schulman¹, Ilya Sutskever², Pieter Abbeel¹

University of California, Berkeley¹, OpenAI²

05 Dec 2016

TL;DR: InfoGAN is an information-theoretic extension to the GAN that is able to learn disentangled representations in a completely unsupervised manner; it also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset.

Abstract: This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods. For an up-to-date version of this paper, please see https://arxiv.org/abs/1606.03657.

2,290 citations

Proceedings Article•

Benchmarking deep reinforcement learning for continuous control

Yan Duan¹, Xi Chen¹, Rein Houthooft¹, John Schulman¹, Pieter Abbeel¹

University of California, Berkeley¹

19 Jun 2016

TL;DR: In this paper, the authors present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with high state and action dimensionality such as 3D humanoid locomotion, and tasks with partial observations.

Abstract: Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuous control due to the lack of a commonly adopted benchmark. In this work, we present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure. We report novel findings based on the systematic evaluation of a range of implemented reinforcement learning algorithms. Both the benchmark and reference implementations are released at https://github.com/rllab/rllab in order to facilitate experimental reproducibility and to encourage adoption by other researchers.

1,038 citations

Proceedings Article•

Improved Variational Inference with Inverse Autoregressive Flow

Durk P. Kingma, Tim Salimans¹, Rafal Jozefowicz², Xi Chen³, Ilya Sutskever², Max Welling⁴

OpenAI¹, Google², University of California, Berkeley³, University of Amsterdam⁴

01 Jan 2016

TL;DR: A new type of normalizing flow, inverse autoregressive flow (IAF), is proposed that, in contrast to earlier published flows, scales well to high-dimensional latent spaces and significantly improves upon diagonal Gaussian approximate posteriors.

Abstract: The framework of normalizing flows provides a general strategy for flexible variational inference of posteriors over latent variables. We propose a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces. The proposed flow consists of a chain of invertible transformations, where each transformation is based on an autoregressive neural network. In experiments, we show that IAF significantly improves upon diagonal Gaussian approximate posteriors. In addition, we demonstrate that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.

901 citations
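
The abstract above describes IAF as a chain of invertible transformations, each based on an autoregressive neural network. The following is a minimal NumPy sketch of that idea, not the authors' implementation. As a loud simplification, the autoregressive network is reduced to two linear maps with strictly lower-triangular masks, and the names `iaf_step`, `iaf_chain`, `w_mu` and `w_log_sigma` are hypothetical.

```python
import numpy as np


def iaf_step(z, w_mu, w_log_sigma):
    """Apply one autoregressive affine transformation to the vector z.

    Assumption for illustration: the "autoregressive neural network" is two
    linear maps whose weights are masked to be strictly lower triangular,
    so output i depends only on z[:i].
    """
    d = z.shape[0]
    mask = np.tril(np.ones((d, d)), k=-1)  # zeros on and above the diagonal
    mu = (w_mu * mask) @ z
    log_sigma = (w_log_sigma * mask) @ z
    z_new = mu + np.exp(log_sigma) * z
    # The Jacobian dz_new/dz is lower triangular with diagonal exp(log_sigma),
    # so its log-determinant is the sum of log_sigma.
    log_det = np.sum(log_sigma)
    return z_new, log_det


def iaf_chain(z0, params):
    """Chain several steps and accumulate their log-determinants."""
    z, total_log_det = z0, 0.0
    for w_mu, w_log_sigma in params:
        z, log_det = iaf_step(z, w_mu, w_log_sigma)
        total_log_det += log_det
    return z, total_log_det
```

Because every output coordinate is computed from the full input vector in one pass, each step is parallel over dimensions, and the log-determinant costs a single sum.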

Proceedings Article•

Improving Variational Inference with Inverse Autoregressive Flow

Diederik P. Kingma¹, Tim Salimans², Rafal Jozefowicz³, Xi Chen⁴, Ilya Sutskever³, Max Welling⁵

University of Amsterdam¹, OpenAI², Google³, University of California, Berkeley⁴, Canadian Institute for Advanced Research⁵

15 Jun 2016

TL;DR: This article proposed a data transformation called inverse autoregressive flows (IAF) to transform a simple distribution over the latent variables into a much more flexible distribution, while still allowing us to compute the resulting variables' probability density function.

Abstract: We propose a simple and scalable method for improving the flexibility of variational inference through a transformation with autoregressive neural networks. Autoregressive neural networks, such as RNNs or the PixelCNN, are very powerful models and potentially interesting for use as variational posterior approximation. However, ancestral sampling in such networks is a long sequential operation, and therefore typically very slow on modern parallel hardware, such as GPUs. We show that by inverting autoregressive neural networks we can obtain equally powerful posterior models from which we can sample efficiently on modern hardware. We show that such data transformations, inverse autoregressive flows (IAF), can be used to transform a simple distribution over the latent variables into a much more flexible distribution, while still allowing us to compute the resulting variables' probability density function. The method is simple to implement, can be made arbitrarily flexible and, in contrast with previous work, is well applicable to models with high-dimensional latent spaces, such as convolutional generative models. The method is applied to a novel deep architecture of variational auto-encoders. In experiments with natural images, we demonstrate that autoregressive flow leads to significant performance gains.

767 citations
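
The abstract above notes that the transformed variables still have a computable probability density. A hedged way to see this is the change-of-variables formula, written here with assumed notation that is not part of this listing: z_0 is drawn from a simple distribution q(z_0 | x), each of the T steps applies z_t = \mu_t + \sigma_t \odot z_{t-1}, and \mu_t, \sigma_t are autoregressive functions of z_{t-1} over D dimensions.

```latex
\log q(z_T \mid x)
  = \log q(z_0 \mid x) - \sum_{t=1}^{T} \sum_{i=1}^{D} \log \sigma_{t,i}
```

Under these assumptions each step has a triangular Jacobian with diagonal \sigma_t, so the density correction reduces to a sum of logarithms.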

Posted Content•

RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning

Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel

04 Nov 2016-arXiv: Artificial Intelligence

TL;DR: This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data: the algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.

Abstract: Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a "fast" reinforcement learning algorithm, we propose to represent it as a recurrent neural network (RNN) and learn it from data. In our proposed method, RL$^2$, the algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm. The RNN receives all information a typical RL algorithm would receive, including observations, actions, rewards, and termination flags; and it retains its state across episodes in a given Markov Decision Process (MDP). The activations of the RNN store the state of the "fast" RL algorithm on the current (previously unseen) MDP. We evaluate RL$^2$ experimentally on both small-scale and large-scale problems. On the small-scale side, we train it to solve randomly generated multi-arm bandit problems and finite MDPs. After RL$^2$ is trained, its performance on new MDPs is close to human-designed algorithms with optimality guarantees. On the large-scale side, we test RL$^2$ on a vision-based navigation task and show that it scales up to high-dimensional problems.

668 citations
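
The RL$^2$ abstract above says the RNN receives observations, actions, rewards, and termination flags, and retains its state across episodes in a given MDP. A minimal sketch of that data flow follows. The encoding (one-hot action, plain concatenation), the tanh cell, and the names `make_rnn_input` and `run_trial` are assumptions made for illustration; the policy output and the outer "slow" RL training loop are omitted.

```python
import numpy as np


def make_rnn_input(obs, prev_action, prev_reward, done, n_actions):
    """Concatenate observation, one-hot action, reward and termination flag."""
    one_hot = np.zeros(n_actions)
    one_hot[prev_action] = 1.0
    extras = [float(prev_reward), float(done)]
    return np.concatenate([np.asarray(obs, dtype=float), one_hot, extras])


def run_trial(episodes, w_h, w_x, n_actions):
    """Feed every step of every episode of one MDP through a tanh RNN.

    episodes: list of episodes, each a list of (obs, action, reward, done).
    The hidden state h is initialised once per MDP and deliberately NOT
    reset at episode boundaries.
    """
    h = np.zeros(w_h.shape[0])
    for episode in episodes:
        for obs, action, reward, done in episode:
            x = make_rnn_input(obs, action, reward, done, n_actions)
            h = np.tanh(w_h @ h + w_x @ x)
    return h
```

Keeping `h` alive across episode boundaries is what lets the activations carry what was learned about the current MDP from one episode to the next; in this sketch `h` would be re-initialised only when a new MDP is sampled.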

Posted Content•

Benchmarking Deep Reinforcement Learning for Continuous Control

Yan Duan¹, Xi Chen¹, Rein Houthooft¹, John Schulman¹, Pieter Abbeel¹

University of California, Berkeley¹

22 Apr 2016-arXiv: Learning

TL;DR: In this article, the authors present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with high state and action dimensionality such as 3D humanoid locomotion, and tasks with partial observations.

Abstract: Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuous control due to the lack of a commonly adopted benchmark. In this work, we present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure. We report novel findings based on the systematic evaluation of a range of implemented reinforcement learning algorithms. Both the benchmark and reference implementations are released at https://github.com/rllab/rllab in order to facilitate experimental reproducibility and to encourage adoption by other researchers.

521 citations

Posted Content•

VIME: Variational Information Maximizing Exploration

Rein Houthooft¹, Xi Chen², Yan Duan², John Schulman², Filip De Turck¹, Pieter Abbeel²

Ghent University¹, University of California, Berkeley²

31 May 2016-arXiv: Learning

TL;DR: Variational Information Maximizing Exploration (VIME) is an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics; it modifies the MDP reward function and can be applied with several different underlying RL algorithms.

Abstract: Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.

426 citations
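
The VIME abstract above says the method modifies the MDP reward function using information gain about the agent's belief of the dynamics. One way this could look, sketched under explicit assumptions: the belief over dynamics-model parameters is a diagonal Gaussian, the information gain for a transition is approximated by the KL divergence between the belief after and before updating on that transition, and the bonus is scaled by a hypothetical coefficient `eta`. None of these specifics are stated in this listing.

```python
import numpy as np


def kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old):
    """KL( N(mu_new, sigma_new^2) || N(mu_old, sigma_old^2) ), summed over dims."""
    mu_new = np.asarray(mu_new, dtype=float)
    sigma_new = np.asarray(sigma_new, dtype=float)
    mu_old = np.asarray(mu_old, dtype=float)
    sigma_old = np.asarray(sigma_old, dtype=float)
    per_dim = (
        np.log(sigma_old / sigma_new)
        + (sigma_new ** 2 + (mu_new - mu_old) ** 2) / (2.0 * sigma_old ** 2)
        - 0.5
    )
    return float(np.sum(per_dim))


def augmented_reward(reward, mu_new, sigma_new, mu_old, sigma_old, eta=0.1):
    """External reward plus an intrinsic bonus proportional to the belief change."""
    gain = kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old)
    return reward + eta * gain
```

Because only the reward is changed, the augmented value can be handed to any underlying RL algorithm unchanged, which matches the abstract's claim that the strategy is algorithm-agnostic.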

Posted Content•

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

Haoran Tang¹, Rein Houthooft², Davis Foote¹, Adam Stooke¹, Xi Chen¹, Yan Duan¹, John Schulman³, Filip De Turck², Pieter Abbeel¹

University of California, Berkeley¹, Ghent University², OpenAI³

15 Nov 2016-arXiv: Artificial Intelligence

TL;DR: A simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks, and simple hash functions are found to achieve surprisingly good results on many challenging tasks.

Abstract: Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which allows their occurrences to be counted with a hash table. These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals important aspects of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration.
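
The mechanism in the abstract above (hash each state, count occurrences in a hash table, turn the count into a reward bonus) can be sketched in a few lines. The specifics below are assumptions for illustration: a random-projection sign hash stands in for the "simple hash function", the bonus takes the form beta / sqrt(count), and the class name `HashCountBonus` with its parameters is hypothetical.

```python
import numpy as np
from collections import defaultdict


class HashCountBonus:
    """Count-based exploration bonus over hashed states (illustrative sketch)."""

    def __init__(self, state_dim, n_bits=16, beta=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection; more bits give a finer hash granularity.
        self.projection = rng.standard_normal((n_bits, state_dim))
        self.beta = beta
        self.counts = defaultdict(int)

    def hash_code(self, state):
        """Map a continuous state to a binary code via the signs of projections."""
        bits = (self.projection @ np.asarray(state, dtype=float)) >= 0
        return bits.astype(np.uint8).tobytes()

    def bonus(self, state):
        """Increment the count of the state's hash bucket and return the bonus."""
        key = self.hash_code(state)
        self.counts[key] += 1
        return self.beta / np.sqrt(self.counts[key])
```

A typical use would be to add `bonus(state)` to the environment reward at every step. The `n_bits` parameter is the knob for the granularity the abstract mentions: too few bits merge unrelated states, too many make every state look new.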

Posted Content•

Variational Lossy Autoencoder

Xi Chen¹, Diederik P. Kingma², Tim Salimans², Yan Duan¹, Prafulla Dhariwal², John Schulman¹, Ilya Sutskever³, Pieter Abbeel¹

University of California, Berkeley¹, OpenAI², Google³

08 Nov 2016-arXiv: Learning

TL;DR: This paper combines VAE with neural autoregressive models such as RNN, MADE and PixelRNN/CNN to learn a representation for 2D images that describes only global structure and discards information about detailed texture.

Abstract: Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to have control over what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, and hence the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both prior distribution $p(z)$ and decoding distribution $p(x|z)$, we can greatly improve generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.
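
For orientation, the roles of the prior $p(z)$ and the decoding distribution $p(x|z)$ named in the abstract above can be located in the standard VAE objective. The formulas below are the generic evidence lower bound and a generic autoregressive factorization of the decoder; the approximate posterior $q(z|x)$ and the index notation $x_{<i}$ are assumed notation, and the exact conditioning structure used in the paper is not given in this listing.

```latex
\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}\bigl[\log p(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q(z \mid x)\,\|\,p(z)\bigr),
\qquad
p(x \mid z) = \prod_{i} p\bigl(x_i \mid z,\, x_{<i}\bigr)
```

Read this way, whatever the autoregressive factors can predict from neighbouring pixels need not be stored in $z$, which is one plausible reading of how architecture design controls what the latent code discards.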

Proceedings Article•

Variational Lossy Autoencoder

Xi Chen¹, Diederik P. Kingma², Tim Salimans², Yan Duan¹, Prafulla Dhariwal², John Schulman¹, Ilya Sutskever³, Pieter Abbeel¹

University of California, Berkeley¹, OpenAI², Google³

04 Nov 2016

TL;DR: This paper presents a simple but principled method to learn global representations by combining Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN, which also greatly improves the generative modeling performance of VAEs.

Abstract: Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to have control over what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, and hence the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both prior distribution $p(z)$ and decoding distribution $p(x|z)$, we can greatly improve generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.

Proceedings Article•

VIME: Variational Information Maximizing Exploration

Rein Houthooft¹, Xi Chen², Yan Duan², John Schulman², Filip De Turck¹, Pieter Abbeel²

Ghent University¹, University of California, Berkeley²

26 Aug 2016

TL;DR: Variational Information Maximizing Exploration (VIME) is an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics; it modifies the MDP reward function and can be applied with several different underlying RL algorithms.

Abstract: Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.

Posted Content•

Improving Variational Inference with Inverse Autoregressive Flow

Diederik P. Kingma¹, Tim Salimans², Rafal Jozefowicz³, Xi Chen⁴, Ilya Sutskever³, Max Welling⁵

University of Amsterdam¹, OpenAI², Google³, University of California, Berkeley⁴, Canadian Institute for Advanced Research⁵

15 Jun 2016-arXiv: Learning

TL;DR: This paper proposes a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces, and demonstrates that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.

Abstract: The framework of normalizing flows provides a general strategy for flexible variational inference of posteriors over latent variables. We propose a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces. The proposed flow consists of a chain of invertible transformations, where each transformation is based on an autoregressive neural network. In experiments, we show that IAF significantly improves upon diagonal Gaussian approximate posteriors. In addition, we demonstrate that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.

InfoGAN: interpretable representation learning by information maximizing Generative Adversarial Nets

Xi Chen¹, Yan Duan¹, Rein Houthooft¹, John Schulman¹, Ilya Sutskever², Pieter Abbeel¹

University of California, Berkeley¹, OpenAI²

01 Jan 2016

TL;DR: Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.

Abstract: This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods.

Posted Content•

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks.

Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

31 May 2016-arXiv: Learning

TL;DR: VIME is introduced, an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics which efficiently handles continuous state and action spaces and can be applied with several different underlying RL algorithms.

Abstract: Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.

Proceedings Article•

Improving Variational Autoencoders with Inverse Autoregressive Flow

Diederik P. Kingma¹, Tim Salimans², Rafal Jozefowicz³, Xi Chen⁴, Ilya Sutskever³, Max Welling¹

University of Amsterdam¹, OpenAI², Google³, University of California, Berkeley⁴

01 Jan 2016

TL;DR: In experiments with natural images, it is demonstrated that autoregressive flow leads to significant performance gains and is well applicable to models with high-dimensional latent spaces, such as convolutional generative models.

Abstract: We propose a simple and scalable method for improving the flexibility of variational inference through a transformation with autoregressive neural networks. Autoregressive neural networks, such as RNNs or the PixelCNN, are very powerful models and potentially interesting for use as variational posterior approximation. However, ancestral sampling in such networks is a long sequential operation, and therefore typically very slow on modern parallel hardware, such as GPUs. We show that by inverting autoregressive neural networks we can obtain equally powerful posterior models from which we can sample efficiently on modern hardware. We show that such data transformations, inverse autoregressive flows (IAF), can be used to transform a simple distribution over the latent variables into a much more flexible distribution, while still allowing us to compute the resulting variables' probability density function. The method is simple to implement, can be made arbitrarily flexible and, in contrast with previous work, is well applicable to models with high-dimensional latent spaces, such as convolutional generative models. The method is applied to a novel deep architecture of variational auto-encoders. In experiments with natural images, we demonstrate that autoregressive flow leads to significant performance gains.
