Open Access · Posted Content

Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions

TLDR
In this paper, the authors analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality and, for a variety of generators, expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches or represented as convex-concave games.
Abstract
Generative Adversarial Networks (GANs) are commonly used for modeling complex distributions of data. Both the generators and discriminators of GANs are often modeled by neural networks, posing a non-transparent optimization problem which is non-convex and non-concave over the generator and discriminator, respectively. Such networks are often heuristically optimized with gradient descent-ascent (GDA), but it is unclear whether the optimization problem contains any saddle points, or whether heuristic methods can find them in practice. In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games. Using this convex duality interpretation, we further demonstrate the impact of different activation functions of the discriminator. Our observations are verified with numerical results demonstrating the power of the convex interpretation, with applications in progressive training of convex architectures corresponding to linear generators and quadratic-activation discriminators for CelebA image generation. The code for our experiments is available at this https URL.
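For intuition about the closed-form flavor of result the abstract mentions, here is a minimal NumPy sketch under illustrative assumptions: a linear generator G(z) = Wz + mu with z ~ N(0, I) and a quadratic-activation discriminator, in which case matching second moments reduces to a PCA-style eigendecomposition of the data covariance. This is a toy reading of the setting, not the authors' exact algorithm or their CelebA pipeline.

```python
import numpy as np

# Toy setup (illustrative sizes, not the paper's CelebA pipeline)
rng = np.random.default_rng(0)
n, d, k = 1000, 8, 3                     # samples, data dim, latent dim
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))

mu = X.mean(axis=0)
Sigma = (X - mu).T @ (X - mu) / n        # empirical covariance

# With G(z) = W z + mu and z ~ N(0, I_k), the generated covariance is W W^T.
# The best rank-k second-moment match comes from the top-k eigenpairs:
vals, vecs = np.linalg.eigh(Sigma)
top = np.argsort(vals)[::-1][:k]
W = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Sample from the fitted generator; its covariance matches Sigma on the
# top-k eigen-subspace (the residual is the discarded spectrum).
Z = rng.standard_normal((5000, k))
G = Z @ W.T + mu
print(np.linalg.norm(np.cov(G.T, bias=True) - W @ W.T))   # sampling error only
```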


Citations
Posted Content

Convex Geometry and Duality of Over-parameterized Neural Networks

TL;DR: A convex analytic framework for ReLU neural networks is developed that elucidates the inner workings of hidden neurons and their function-space characteristics, and establishes a connection to $\ell_0$-$\ell_1$ equivalence for neural networks analogous to minimal-cardinality solutions in compressed sensing.
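As a rough illustration of what such a convex reformulation looks like, the sketch below builds a finite-dimensional convex program for a two-layer ReLU network with squared loss and group-norm regularization, using randomly sampled hyperplane-arrangement patterns. The sampling of patterns, problem sizes, and variable names are assumptions for illustration; the exact program enumerates all arrangements.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, P = 20, 3, 30                      # samples, features, sampled arrangements
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
beta = 1e-3                              # regularization strength

# Sample diagonal arrangement matrices D_i = diag(1[X u >= 0]).
U = rng.standard_normal((d, P))
D = (X @ U >= 0).astype(float)           # n x P; column i is the diagonal of D_i

V = cp.Variable((d, P))                  # positive-branch neurons
W = cp.Variable((d, P))                  # negative-branch neurons
residual = cp.sum(cp.multiply(D, X @ (V - W)), axis=1) - y
reg = cp.sum(cp.norm(V, axis=0) + cp.norm(W, axis=0))
constraints = []
for i in range(P):
    s = 2 * D[:, i] - 1                  # sign pattern (2 D_i - I)
    constraints += [cp.multiply(s, X @ V[:, i]) >= 0,
                    cp.multiply(s, X @ W[:, i]) >= 0]
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(residual) + beta * reg),
                  constraints)
prob.solve()                             # a standard SOCP, solved globally
```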
Posted Content

Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks

TL;DR: In this article, a path-regularized parallel architecture with multiple ReLU sub-networks is considered, and it is shown that the computational complexity of globally optimizing the equivalent convex problem is polynomial in the number of data samples and the feature dimension.
Posted Content

The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program

TL;DR: In this article, the authors study non-convex subgradient flows for training two-layer ReLU neural networks from a convex geometry and duality perspective, and derive a sufficient condition on the dual variables ensuring that the stationary points of the non-convex objective are the KKT points of the convex objective.
References
Posted Content

Adam: A Method for Stochastic Optimization

TL;DR: In this article, Adam is introduced: a method for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
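The update rule itself is compact; the following NumPy transcription mirrors the algorithm described in the paper (exponential moving averages of the first two moments with bias correction):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: biased moment estimates followed by bias correction."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                 # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```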
Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
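A minimal example of the imperative, define-by-run style the summary refers to (standard PyTorch APIs; the toy model and data are illustrative):

```python
import torch

# One regression step: ordinary Python control flow, immediate tensor values,
# and autograd building the graph as the code runs.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(32, 4), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()                       # gradients computed on the fly
opt.step()
opt.zero_grad()
print(loss.item())                    # tensors are inspectable at any point
```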
Proceedings ArticleDOI

Image-to-Image Translation with Conditional Adversarial Networks

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
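A sketch of the conditional-GAN generator objective in that formulation: an adversarial term on the discriminator's judgment of (input, output) pairs plus an L1 reconstruction term. `D` and `G` are placeholder modules and the PyTorch phrasing is illustrative; the lambda = 100 weighting follows the paper's default.

```python
import torch
import torch.nn.functional as F

def generator_loss(D, G, x, y, lam=100.0):
    """Generator objective: fool the conditional discriminator D(x, G(x))
    while staying close to the target y in L1."""
    fake = G(x)
    pred = D(torch.cat([x, fake], dim=1))   # discriminator sees input + output
    adv = F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
    return adv + lam * F.l1_loss(fake, y)
```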
Posted Content

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

TL;DR: This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.
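A minimal PyTorch generator respecting the architectural constraints the summary alludes to (fractionally-strided convolutions instead of pooling, batch normalization, ReLU hidden activations, Tanh output, no fully connected hidden layers); the layer sizes are illustrative, not the paper's exact configuration:

```python
import torch.nn as nn

# Maps a 100-dim latent code (shaped 100x1x1) to a 3x32x32 image.
G = nn.Sequential(
    nn.ConvTranspose2d(100, 256, 4, 1, 0, bias=False), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),  nn.BatchNorm2d(64),  nn.ReLU(True),
    nn.ConvTranspose2d(64, 3, 4, 2, 1, bias=False),    nn.Tanh(),
)
```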
Proceedings ArticleDOI

A Style-Based Generator Architecture for Generative Adversarial Networks

TL;DR: This paper proposes an alternative generator architecture for GANs, borrowing from the style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.
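The style injection in that architecture works through adaptive instance normalization (AdaIN); a minimal sketch, with the style-derived scale and bias assumed to come from the mapping network (parameter names are illustrative):

```python
import torch

def adain(x, style_scale, style_bias, eps=1e-5):
    """Normalize each feature map per-sample, then scale and shift with
    style-derived parameters (assumed shape: broadcastable to x)."""
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True)
    return style_scale * (x - mu) / (sigma + eps) + style_bias
```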