Open Access · Proceedings ArticleDOI

Sliced Wasserstein Distance for Learning Gaussian Mixture Models

TLDR
This work proposes an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm that can estimate high-dimensional data distributions more faithfully than the EM algorithm.
Abstract
Gaussian mixture models (GMMs) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating the GMM parameters. However, EM guarantees only convergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL divergence, the energy landscape of the sliced Wasserstein distance is better behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation results in parameter estimates that are more robust to random initializations, and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.
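A minimal, illustrative sketch of the idea in the abstract: draw samples from the current mixture model, project both the model samples and a data batch onto random unit directions, match the sorted 1-D projections to estimate the sliced Wasserstein distance, and update the GMM parameters by stochastic gradient descent. This is not the authors' implementation; the diagonal covariances, fixed equal mixture weights, and the helper names (sliced_wasserstein, sample_gmm) are simplifying assumptions written for PyTorch.

```python
# Sketch: fit a diagonal-covariance GMM by minimizing a Monte-Carlo estimate of the
# sliced Wasserstein distance between model samples and data samples (assumption:
# equal, fixed component weights so that sampling stays differentiable w.r.t. the
# means and scales only).
import torch

def sliced_wasserstein(x, y, n_proj=64):
    """Monte-Carlo sliced W2^2 between two equally sized sample sets of shape (n, d)."""
    d = x.shape[1]
    theta = torch.randn(d, n_proj, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)   # random unit directions
    px, _ = torch.sort(x @ theta, dim=0)              # 1-D optimal transport = sort and match
    py, _ = torch.sort(y @ theta, dim=0)
    return ((px - py) ** 2).mean()

def sample_gmm(means, log_scales, n):
    """Reparameterized samples from a K-component diagonal GMM with equal weights."""
    K, d = means.shape
    comp = torch.randint(0, K, (n,))                  # component assignments (not differentiated)
    eps = torch.randn(n, d)
    return means[comp] + log_scales[comp].exp() * eps

# Toy data: two well-separated blobs in 2-D.
data = torch.cat([torch.randn(500, 2) - 3.0, torch.randn(500, 2) + 3.0])

K, d = 2, 2
means = torch.randn(K, d, requires_grad=True)
log_scales = torch.zeros(K, d, requires_grad=True)
opt = torch.optim.Adam([means, log_scales], lr=5e-2)

for step in range(500):
    batch = data[torch.randint(0, data.shape[0], (256,))]
    model_samples = sample_gmm(means, log_scales, 256)
    loss = sliced_wasserstein(batch, model_samples)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(means.detach())   # should land near (-3, -3) and (3, 3)
```

Because 1-D optimal transport reduces to sorting, each projection costs O(n log n) per gradient step, which is what makes the sliced formulation convenient for stochastic gradient descent.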



Citations
Proceedings ArticleDOI

Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation

TL;DR: The proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers and enables efficient distribution alignment in an end-to-end trainable fashion.
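As a rough illustration of that idea (a sketch under assumptions, not the paper's released code): the discrepancy below projects the class-probability outputs of two task-specific classifiers onto random unit directions and compares the sorted 1-D projections, giving an end-to-end differentiable alignment loss. The classifier definitions and the n_proj parameter are hypothetical placeholders.

```python
# Sketch of a sliced Wasserstein discrepancy between two classifiers' softmax outputs.
import torch
import torch.nn.functional as F

def sliced_wasserstein_discrepancy(p1, p2, n_proj=128):
    """p1, p2: (n, C) class-probability outputs of two classifiers on the same inputs."""
    C = p1.shape[1]
    theta = torch.randn(C, n_proj, device=p1.device)
    theta = theta / theta.norm(dim=0, keepdim=True)   # random unit projection directions
    q1, _ = torch.sort(p1 @ theta, dim=0)             # sorted 1-D projections
    q2, _ = torch.sort(p2 @ theta, dim=0)
    return ((q1 - q2) ** 2).mean()

# Toy usage with two hypothetical linear classifiers on shared features.
feats = torch.randn(32, 64)
c1, c2 = torch.nn.Linear(64, 10), torch.nn.Linear(64, 10)
loss = sliced_wasserstein_discrepancy(F.softmax(c1(feats), dim=1),
                                      F.softmax(c2(feats), dim=1))
loss.backward()   # gradients reach both classifiers (and the features, if they require grad)
```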
Posted Content

Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

TL;DR: This tutorial argues that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.
Posted Content

Max-Sliced Wasserstein Distance and its use for GANs

TL;DR: This work demonstrates that the recently proposed sliced Wasserstein distance easily trains GANs on high-dimensional images up to a resolution of 256x256, and develops the max-sliced Wasserstein distance, which enjoys compelling sample complexity while reducing projection complexity, albeit necessitating a max estimation.
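To illustrate the distinction (again only a sketch under assumptions, not the paper's implementation): the snippet below replaces the average over random projections with the worst-case projection. The actual max-sliced distance maximizes over the projection direction itself; this crude random search merely approximates that max estimation.

```python
# Sketch: max-sliced Wasserstein vs. the usual sliced variant (max over directions
# instead of the mean), approximated here by sampling many random directions.
import torch

def max_sliced_wasserstein(x, y, n_proj=256):
    d = x.shape[1]
    theta = torch.randn(d, n_proj)
    theta = theta / theta.norm(dim=0, keepdim=True)
    px, _ = torch.sort(x @ theta, dim=0)
    py, _ = torch.sort(y @ theta, dim=0)
    per_direction = ((px - py) ** 2).mean(dim=0)   # 1-D W2^2 along each direction
    return per_direction.max()                     # keep only the worst direction

x, y = torch.randn(512, 64), torch.randn(512, 64) + 0.5
print(max_sliced_wasserstein(x, y))
```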
Proceedings ArticleDOI

Generative Multiplane Images: Making a 2D GAN 3D-Aware

TL;DR: This work modifies a classical GAN, i.e.
Posted Content

Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation

TL;DR: In this article, a sliced Wasserstein discrepancy (SWD) is proposed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers; it provides geometrically meaningful guidance for detecting target samples that are far from the support of the source.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
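For reference, a compact sketch of the Adam update as summarized above: exponential moving averages of the gradient and its square, bias-corrected, drive the parameter step. The default hyperparameters follow the paper; the helper name adam_step and the toy objective are illustrative.

```python
# Sketch of one Adam step with bias-corrected first and second moment estimates.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias corrections
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = (x - 5)^2.
theta, m, v = np.zeros(1), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = 2 * (theta - 5.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta)   # approximately 5
```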
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Book

Machine Learning: A Probabilistic Perspective

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Proceedings ArticleDOI

Deep Learning Face Attributes in the Wild

TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.