Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Soheil Kolouri, Gustavo K. Rohde, Heiko Hoffmann
- pp 3427-3436
TLDR
This work proposes an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm that can estimate high-dimensional data distributions more faithfully than the EM algorithm.
Abstract:
Gaussian mixture models (GMM) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating the GMM parameters. However, EM guarantees only convergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced-Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL-divergence, the energy landscape for the sliced-Wasserstein distance is more well-behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation results in parameter estimates that are more robust to random initializations and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.
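The sliced Wasserstein distance underlying the abstract reduces the high-dimensional optimal-transport problem to many one-dimensional ones: project both distributions onto random directions, where the 1-D Wasserstein distance has a closed form via sorting, and average over directions. The sketch below is a minimal Monte Carlo estimator for two empirical samples of equal size; it is illustrative only (the function name and parameters are not from the paper, and the paper's algorithm additionally differentiates this quantity with respect to GMM parameters).

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=50, rng=None):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance between
    two empirical samples X, Y of shape (n_samples, dim), equal n_samples."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto every direction: shape (n_samples, n_projections).
    X_proj = X @ theta.T
    Y_proj = Y @ theta.T
    # In 1-D, the Wasserstein-1 distance between equal-size empirical
    # measures is the mean absolute difference of the sorted projections.
    X_sorted = np.sort(X_proj, axis=0)
    Y_sorted = np.sort(Y_proj, axis=0)
    return np.mean(np.abs(X_sorted - Y_sorted))
```

Because each term is a sort plus an elementwise difference, the estimator is cheap and (with fixed projections) subdifferentiable, which is what makes it amenable to the stochastic gradient descent scheme the abstract describes.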
Citations
Proceedings ArticleDOI
Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
TL;DR: The proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers and enables efficient distribution alignment in an end-to-end trainable fashion.
Posted Content
Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning
TL;DR: This tutorial argues that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.
Posted Content
Max-Sliced Wasserstein Distance and its use for GANs
Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander G. Schwing
TL;DR: This work demonstrates that the recently proposed sliced Wasserstein distance trains GANs on high-dimensional images up to a resolution of 256x256 easily and develops the max-sliced Wasserstein distance, which enjoys compelling sample complexity while reducing projection complexity, albeit necessitating a max estimation.
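The max-sliced variant in this TLDR replaces the average over random projections with the single most discriminative direction. A crude way to illustrate the idea is to approximate the max by sampling many candidate directions and keeping the worst case; the actual paper optimizes the direction rather than sampling it, so the sketch below (with its made-up function name and candidate-sampling scheme) is only a hedged approximation.

```python
import numpy as np

def max_sliced_wasserstein(X, Y, n_candidates=500, rng=None):
    """Rough estimate of the max-sliced 1-Wasserstein distance between
    equal-size samples X, Y: take the maximum 1-D Wasserstein distance
    over a pool of random candidate directions instead of the mean."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(n_candidates, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Sorted projections give the 1-D Wasserstein-1 distance per direction.
    X_sorted = np.sort(X @ theta.T, axis=0)
    Y_sorted = np.sort(Y @ theta.T, axis=0)
    per_direction = np.mean(np.abs(X_sorted - Y_sorted), axis=0)
    # Keep only the worst-case (most discriminative) direction.
    return per_direction.max()
```

By construction this value upper-bounds the averaged sliced estimate on the same candidate directions, which is why a single well-chosen projection can suffice, reducing projection complexity as the TLDR notes.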
Proceedings ArticleDOI
Generative Multiplane Images: Making a 2D GAN 3D-Aware
TL;DR: This work modifies a classical GAN, i.e.
Posted Content
Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
TL;DR: In this article, a sliced Wasserstein discrepancy (SWD) is proposed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers, which provides a geometrically meaningful guidance to detect target samples that are far from the support of the source.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Journal ArticleDOI
Maximum likelihood from incomplete data via the EM algorithm
Arthur P. Dempster, Nan M. Laird, Donald B. Rubin
Journal ArticleDOI
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Book
Machine Learning: A Probabilistic Perspective
Kevin P. Murphy
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Proceedings ArticleDOI
Deep Learning Face Attributes in the Wild
Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang
TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.