
Showing papers by "Marco Cuturi published in 2013"


Proceedings Article
Marco Cuturi1
05 Dec 2013
TL;DR: This work smooths the classic optimal transport problem with an entropic regularization term, and shows that the resulting optimum is also a distance which can be computed through Sinkhorn's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transport solvers.
Abstract: Optimal transport distances are a fundamental family of distances for probability measures and histograms of features. Despite their appealing theoretical properties, excellent performance in retrieval tasks and intuitive formulation, their computation involves the resolution of a linear program whose cost can quickly become prohibitive whenever the size of the support of these measures or the histograms' dimension exceeds a few hundred. We propose in this work a new family of optimal transport distances that look at transport problems from a maximum-entropy perspective. We smooth the classic optimal transport problem with an entropic regularization term, and show that the resulting optimum is also a distance which can be computed through Sinkhorn's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transport solvers. We also show that this regularized distance improves upon classic optimal transport distances on the MNIST classification problem.

2,681 citations
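
For illustration, here is a minimal NumPy sketch of the Sinkhorn matrix scaling iteration the abstract refers to, assuming two histograms r and c and a pairwise cost matrix M; the regularization weight lam, the iteration count, and the toy data are hypothetical choices for the example, not values from the paper.

```python
import numpy as np

def sinkhorn_distance(r, c, M, lam=10.0, n_iter=200):
    """Entropy-regularized transport between histograms r and c with cost matrix M.

    Sketch of Sinkhorn's matrix scaling: alternately rescale the rows and
    columns of K = exp(-lam * M) until the coupling's marginals match r and c.
    """
    K = np.exp(-lam * M)                 # elementwise kernel derived from the cost
    u = np.ones_like(r)
    for _ in range(n_iter):
        v = c / (K.T @ u)                # scale columns to match the marginal c
        u = r / (K @ v)                  # scale rows to match the marginal r
    P = np.diag(u) @ K @ np.diag(v)      # regularized optimal coupling
    return np.sum(P * M)                 # transport cost under that coupling

# toy usage: two 3-bin histograms and a squared-distance cost on bin indices
r = np.array([0.5, 0.3, 0.2])
c = np.array([0.2, 0.3, 0.5])
M = (np.arange(3)[:, None] - np.arange(3)[None, :]) ** 2.0
print(sinkhorn_distance(r, c, M))
```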


Posted Content
TL;DR: This work proposes to smooth the Wasserstein distance with an entropic regularizer, recovering a strictly convex objective whose gradients can be computed at a considerably lower computational cost using matrix scaling algorithms.
Abstract: We present new algorithms to compute the mean of a set of empirical probability measures under the optimal transport metric. This mean, known as the Wasserstein barycenter, is the measure that minimizes the sum of its Wasserstein distances to each element in that set. We propose two original algorithms to compute Wasserstein barycenters that build upon the subgradient method. A direct implementation of these algorithms is, however, too costly because it would require the repeated resolution of large primal and dual optimal transport problems to compute subgradients. Extending the work of Cuturi (2013), we propose to smooth the Wasserstein distance used in the definition of Wasserstein barycenters with an entropic regularizer and recover in doing so a strictly convex objective whose gradients can be computed for a considerably cheaper computational cost using matrix scaling algorithms. We use these algorithms to visualize a large family of images and to solve a constrained clustering problem.

155 citations
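
A rough sketch of the smoothed-barycenter idea described above: for each input histogram, a gradient of the entropy-regularized distance with respect to the barycenter is read off the Sinkhorn scaling vectors, and the barycenter is updated on the simplex. The function name, step size, and the multiplicative (mirror-descent style) update below are assumptions made for illustration, not the paper's exact algorithm.

```python
import numpy as np

def smoothed_barycenter(B, M, lam=10.0, n_outer=100, n_sink=50, step=0.1):
    """Sketch: barycenter of the columns of B (d x n histograms) under the
    entropy-regularized transport distance with cost matrix M (d x d)."""
    K = np.exp(-lam * M)
    d, n = B.shape
    a = np.full(d, 1.0 / d)                  # start from the uniform histogram
    for _ in range(n_outer):
        grad = np.zeros(d)
        for j in range(n):
            u = np.ones(d)
            for _ in range(n_sink):          # Sinkhorn scaling between a and B[:, j]
                v = B[:, j] / (K.T @ u)
                u = a / (K @ v)
            grad += np.log(u) / lam          # dual potential ~ gradient of the smoothed distance in a
        a = a * np.exp(-step * grad / n)     # mirror-descent style step on the simplex
        a /= a.sum()
    return a

# toy usage: average three histograms on 4 bins with a squared-index cost
M = (np.arange(4)[:, None] - np.arange(4)[None, :]) ** 2.0
B = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.1, 0.1],
              [0.1, 0.1, 0.1, 0.7]]).T
print(smoothed_barycenter(B, M))
```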


Posted Content
TL;DR: This work smooths the classical optimal transportation problem with an entropic regularization term, and shows that the resulting optimum is also a distance which can be computed through Sinkhorn-Knopp's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transportation solvers.
Abstract: Optimal transportation distances are a fundamental family of parameterized distances for histograms. Despite their appealing theoretical properties, excellent performance in retrieval tasks and intuitive formulation, their computation involves the resolution of a linear program whose cost is prohibitive whenever the histograms' dimension exceeds a few hundred. We propose in this work a new family of optimal transportation distances that look at transportation problems from a maximum-entropy perspective. We smooth the classical optimal transportation problem with an entropic regularization term, and show that the resulting optimum is also a distance which can be computed through Sinkhorn-Knopp's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transportation solvers. We also report improved performance over classical optimal transportation distances on the MNIST benchmark problem.

122 citations
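
This preprint describes the same Sinkhorn-Knopp scaling as the conference version above. As a complement rather than a repetition, the sketch below shows a log-domain implementation of the same scaling; this stabilization is not taken from the paper, it is simply a common way to run the iteration when the regularization is weak. Here eps plays the role of 1/lambda in the entropic term.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_log_domain(r, c, M, eps=0.1, n_iter=200):
    """Log-domain Sinkhorn-Knopp scaling (numerically stable for small eps)."""
    f = np.zeros_like(r)
    g = np.zeros_like(c)
    for _ in range(n_iter):
        # row update: match the marginal r
        f = eps * np.log(r) - eps * logsumexp((g[None, :] - M) / eps, axis=1)
        # column update: match the marginal c
        g = eps * np.log(c) - eps * logsumexp((f[:, None] - M) / eps, axis=0)
    P = np.exp((f[:, None] + g[None, :] - M) / eps)   # regularized coupling
    return np.sum(P * M)

# toy usage on the same kind of inputs as the earlier sketch
r = np.array([0.5, 0.3, 0.2])
c = np.array([0.2, 0.3, 0.5])
M = (np.arange(3)[:, None] - np.arange(3)[None, :]) ** 2.0
print(sinkhorn_log_domain(r, c, M))
```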


Proceedings Article
Marco Cuturi1
01 Jan 2013

46 citations


Proceedings Article
16 Jun 2013
TL;DR: It is shown that many of the optimization problems arising in this context, isolating mean-reverting linear combinations of a multivariate stochastic process subject to a variance constraint, can be solved exactly using semidefinite programming and a variant of the S-lemma.
Abstract: Starting from a sample path of a multivariate stochastic process, we study several techniques to isolate linear combinations of the variables with a maximal amount of mean reversion, while constraining the variance of the combination to be larger than a given threshold. We show that many of the optimization problems arising in this context can be solved exactly using semidefinite programming and a variant of the S-lemma. In finance, these methods can be used to isolate statistical arbitrage opportunities, i.e. mean reverting baskets with enough variance to overcome market friction. In a more general setting, mean reversion and its generalizations can also be used as a proxy for stationarity, while variance simply measures signal strength.

17 citations
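
The paper's own criteria (predictability, portmanteau and crossing statistics) and its S-lemma argument are specific; the sketch below only illustrates the general shape of such a semidefinite relaxation, using a simplified lag-1 autocovariance proxy for mean reversion. The names A0, A1 and nu (covariance, symmetrized autocovariance, variance threshold) and the toy data are assumptions for the example.

```python
import numpy as np
import cvxpy as cp

# toy sample path of a 5-dimensional process; in practice this is market data
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
Xc = X - X.mean(axis=0)
A0 = Xc.T @ Xc / len(Xc)                     # sample covariance
A1 = Xc[:-1].T @ Xc[1:] / (len(Xc) - 1)      # lag-1 autocovariance
A1 = 0.5 * (A1 + A1.T)                       # symmetrize

nu = 0.25 * np.trace(A0) / A0.shape[0]       # variance threshold (hypothetical choice)

Y = cp.Variable(A0.shape, PSD=True)          # relaxation of the rank-one matrix y y^T
objective = cp.Minimize(cp.trace(A1 @ Y))    # toy mean-reversion proxy: low lag-1 autocovariance
constraints = [cp.trace(A0 @ Y) >= nu,       # basket variance above the threshold
               cp.trace(Y) == 1]             # normalization
cp.Problem(objective, constraints).solve()

# recover basket weights from the leading eigenvector of the SDP solution
w, V = np.linalg.eigh(Y.value)
basket = V[:, -1]
print(basket)
```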


Journal Article
TL;DR: This paper provides algorithms to estimate the parameters of a family of embeddings, generalizing those proposed by Aitchison (1982), that map the probability simplex onto a suitable Euclidean space, and shows that these algorithms lead to representations that outperform alternative approaches to compare histograms.
Abstract: Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.

3 citations
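
As context for the Aitchison (1982) embeddings mentioned above, here is a small sketch of the classical centered log-ratio map together with a toy parameterized variant. The matrix W, the offset b, and the smoothing constant alpha are placeholders introduced for illustration; the actual family of maps and the estimation algorithms are those described in the paper.

```python
import numpy as np

def clr_embedding(h, alpha=1e-3):
    """Classical Aitchison centered log-ratio map of a histogram h in the simplex.

    alpha is a small additive smoothing so that zero bins do not break the log
    (a common trick; the paper's own parameterization is richer and learned)."""
    p = (h + alpha) / (h + alpha).sum()
    logp = np.log(p)
    return logp - logp.mean()                # center by the mean of the logs

def generalized_embedding(h, W, b, alpha=1e-3):
    """Toy 'generalized' variant: a linear map W and offset b applied to the log
    of the smoothed histogram; a stand-in for the learned maps in the paper."""
    p = (h + alpha) / (h + alpha).sum()
    return W @ np.log(p) + b

# usage: compare two histograms with a Euclidean distance after embedding
h1 = np.array([0.7, 0.2, 0.1, 0.0])
h2 = np.array([0.6, 0.3, 0.1, 0.0])
print(np.linalg.norm(clr_embedding(h1) - clr_embedding(h2)))
```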


Proceedings Article
21 Oct 2013
TL;DR: In this article, the authors address the problem of comparing histograms in the probability simplex by generalizing a family of embeddings proposed by Aitchison (1982) to map the simplex onto a suitable Euclidean space.
Abstract: Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.

1 citation