Open AccessProceedings Article
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
TLDR
This work discusses the implementation of PixelCNNs, a recently proposed class of powerful generative models with tractable likelihood that contains a number of modifications to the original model that both simplify its structure and improve its performance.Abstract:
PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs which we make available at this https URL. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance. 1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find to speed up training. 2) We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure. 3) We use downsampling to efficiently capture structure at multiple resolutions. 4) We introduce additional short-cut connections to further speed up optimization. 5) We regularize the model using dropout. Finally, we present state-of-the-art log likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.read more
Citations
More filters
Posted Content
Denoising Diffusion Probabilistic Models
TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
Proceedings Article
Zero-Shot Text-to-Image Generation
Aditya Ramesh,Mikhail Pavlov,Gabriel Goh,Scott Gray,Chelsea Voss,Alec Radford,Mark Chen,Ilya Sutskever +7 more
TL;DR: This work describes a simple approach based on a transformer that autoregressively models the text and image tokens as a single stream of data that is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
Journal ArticleDOI
Adversarial Examples: Attacks and Defenses for Deep Learning
TL;DR: In this paper, the authors review recent findings on adversarial examples for DNNs, summarize the methods for generating adversarial samples, and propose a taxonomy of these methods.
Posted Content
Generating Long Sequences with Sparse Transformers.
TL;DR: This paper introduces sparse factorizations of the attention matrix which reduce this to $O(n)$, and generates unconditional samples that demonstrate global coherence and great diversity, and shows it is possible in principle to use self-attention to model sequences of length one million or more.
Proceedings Article
Masked Autoregressive Flow for Density Estimation
TL;DR: Masked autoregressive flow as mentioned in this paper is a generalization of Inverse Autoregressive Flow that uses the random numbers that the model uses internally when generating data for density estimation.
References
More filters
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Proceedings Article
Auto-Encoding Variational Bayes
Diederik P. Kingma,Max Welling +1 more
TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Posted Content
U-Net: Convolutional Networks for Biomedical Image Segmentation
TL;DR: It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Posted Content
Stochastic Backpropagation and Approximate Inference in Deep Generative Models
TL;DR: In this article, a generative and recognition model is proposed to represent approximate posterior distributions and act as a stochastic encoder of the data, which allows for joint optimisation of the parameters of both the generative model and the recognition model.
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord,Sander Dieleman,Heiga Zen,Karen Simonyan,Oriol Vinyals,Alex Graves,Nal Kalchbrenner,Andrew W. Senior,Koray Kavukcuoglu +8 more
TL;DR: WaveNet, a deep neural network for generating raw audio waveforms, is introduced; it is shown that it can be efficiently trained on data with tens of thousands of samples per second of audio, and can be employed as a discriminative model, returning promising results for phoneme recognition.