
Showing papers by "Marco Cuturi published in 2022"


Proceedings ArticleDOI
28 Jun 2022
TL;DR: CondOT is introduced, an approach to estimate OT maps conditioned on a context variable, using several pairs of measures tagged with a context label c_i, to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations separately.
Abstract: Optimal transport (OT) theory describes general principles to define and select, among many possible choices, the most efficient way to map a probability measure onto another. That theory has been mostly used to estimate, given a pair of source and target probability measures $(\mu, \nu)$, a parameterized map $T_\theta$ that can efficiently map $\mu$ onto $\nu$. In many applications, such as predicting cell responses to treatments, pairs of input/output data measures $(\mu, \nu)$ that define optimal transport problems do not arise in isolation but are associated with a context $c$, as for instance a treatment when comparing populations of untreated and treated cells. To account for that context in OT estimation, we introduce CondOT, a multi-task approach to estimate a family of OT maps conditioned on a context variable, using several pairs of measures $\left(\mu_i, \nu_i\right)$ tagged with a context label $c_i$. CondOT learns a global map $\mathcal{T}_\theta$ conditioned on context that is not only expected to fit all labeled pairs in the dataset $\left\{\left(c_i,\left(\mu_i, \nu_i\right)\right)\right\}$, i.e., $\mathcal{T}_\theta\left(c_i\right) \sharp \mu_i \approx \nu_i$, but should also generalize to produce meaningful maps $\mathcal{T}_\theta\left(c_{\text{new}}\right)$ when conditioned on unseen contexts $c_{\text{new}}$. Our approach harnesses and provides a novel usage for partially input convex neural networks, for which we introduce a robust and efficient initialization strategy inspired by Gaussian approximations. We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations separately.

22 citations


Journal Article
TL;DR: In this paper, the authors propose a new framework to reconstruct a stochastic process using only samples from its marginal distributions, observed at start and end times 0 and T, which is useful to infer population dynamics, e.g., when modeling the time-evolution of cell populations from single-cell sequencing data.
Abstract: We propose a new framework to reconstruct a stochastic process {P_t : t ∈ [0, T]} using only samples from its marginal distributions, observed at start and end times 0 and T. This reconstruction is useful to infer population dynamics, a crucial challenge, e.g., when modeling the time-evolution of cell populations from single-cell sequencing data. Our general framework encompasses the more specific Schrödinger bridge (SB) problem, where P_t represents the evolution of a thermodynamic system at almost equilibrium. Estimating such bridges is notoriously difficult, motivating our proposal for a novel adaptive scheme called GSBFLOW. Our goal is to rely on Gaussian approximations of the data to provide the reference stochastic process needed to estimate SBs. To that end, we solve the SB problem with Gaussian marginals, for which we provide, as a central contribution, a closed-form solution and SDE representation. We use these formulas to define the reference process used to estimate more complex SBs, and show that this does indeed help with their numerical solution. We obtain notable improvements when reconstructing both synthetic processes and single-cell genomics experiments.

10 citations


Proceedings ArticleDOI
15 Jun 2022
TL;DR: It is shown empirically that carefully chosen initializations can be used off-the-shelf, with little to no tuning, and result in consistent speed-ups for a variety of OT problems, without biasing gradients computed with implicit differentiation.
Abstract: While the optimal transport (OT) problem was originally formulated as a linear program, the addition of entropic regularization has proven beneficial both computationally and statistically, for many applications. The Sinkhorn fixed-point algorithm is the most popular approach to solve this regularized problem, and, as a result, multiple attempts have been made to reduce its runtime using, e.g., annealing in the regularization parameter, momentum or acceleration. The premise of this work is that initialization of the Sinkhorn algorithm has received comparatively little attention, possibly due to two preconceptions: since the regularized OT problem is convex, it may not be worth crafting a good initialization, since any is guaranteed to work; secondly, because the outputs of the Sinkhorn algorithm are often unrolled in end-to-end pipelines, a data-dependent initialization would bias Jacobian computations. We challenge this conventional wisdom, and show that data-dependent initializers result in dramatic speed-ups, with no effect on differentiability as long as implicit differentiation is used. Our initializations rely on closed-forms for exact or approximate OT solutions that are known in the 1D, Gaussian or GMM settings. They can be used with minimal tuning, and result in consistent speed-ups for a wide variety of OT problems.
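The speed-up mechanism can be illustrated with a minimal log-domain Sinkhorn solver that accepts an initial dual potential. For brevity, the paper's closed-form 1D/Gaussian initializers are replaced here by a warm start from a previously computed potential; the function names, tolerance, and convergence criterion are illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_log(a, b, C, eps=0.05, f_init=None, tol=1e-6, max_iter=5000):
    """Log-domain Sinkhorn iterations; returns dual potentials (f, g)
    and the number of iterations until the sup-norm change in f < tol."""
    f = np.zeros_like(a) if f_init is None else f_init.copy()
    for it in range(1, max_iter + 1):
        # alternate soft-min (c-transform) updates of the dual potentials
        g = -eps * logsumexp((f[:, None] - C) / eps + np.log(a)[:, None], axis=0)
        f_new = -eps * logsumexp((g[None, :] - C) / eps + np.log(b)[None, :], axis=1)
        if np.max(np.abs(f_new - f)) < tol:
            return f_new, g, it
        f = f_new
    return f, g, max_iter

rng = np.random.default_rng(0)
x, y = np.sort(rng.uniform(size=50)), np.sort(rng.uniform(size=50))
a = b = np.full(50, 1.0 / 50)
C = (x[:, None] - y[None, :]) ** 2
f0, g0, it_cold = sinkhorn_log(a, b, C)             # default zero initialization
f1, g1, it_warm = sinkhorn_log(a, b, C, f_init=f0)  # data-dependent warm start
```

Warm-starting cuts the iteration count drastically on this toy problem; the paper's point is that closed-form 1D or Gaussian dual potentials provide such initializations essentially for free, and that implicit differentiation keeps Jacobians unbiased regardless of the starting point.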

7 citations


Proceedings Article
17 Feb 2022
TL;DR: In this article, the authors show that debiasing can yield better approximations to the Monge map under favorable conditions on P and Q, but can be provably detrimental when the regularization strength is large or the number of samples is small.
Abstract: Estimating optimal transport (OT) maps (a.k.a. Monge maps) between two measures P and Q is a problem fraught with computational and statistical challenges. A promising approach lies in using the dual potential functions obtained when solving an entropy-regularized OT problem between samples P_n and Q_n, which can be used to recover an approximately optimal map. The negentropy penalization in that scheme introduces, however, an estimation bias that grows with the regularization strength. A well-known remedy to debias such estimates, which has gained wide popularity among practitioners of regularized OT, is to center them, by subtracting auxiliary problems involving P_n and itself, as well as Q_n and itself. We do prove that, under favorable conditions on P and Q, debiasing can yield better approximations to the Monge map. However, and perhaps surprisingly, we present a few cases in which debiasing is provably detrimental in a statistical sense, notably when the regularization strength is large or the number of samples is small. These claims are validated experimentally on synthetic and real datasets, and should reopen the debate on whether debiasing is needed when using entropic OT.
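The centering recipe described above can be sketched numerically. The helper below uses the linear cost <P, C> of the entropic plan (one of several conventions for the entropic cost) and a fixed number of Sinkhorn iterations; both are illustrative choices rather than the paper's estimator:

```python
import numpy as np
from scipy.special import logsumexp

def entropic_cost(x, y, eps=0.1, n_iter=2000):
    """Linear cost <P, C> of the entropic OT plan between two uniform
    1D point clouds x and y, with squared-distance ground cost."""
    C = (x[:, None] - y[None, :]) ** 2
    n, m = len(x), len(y)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        g = -eps * logsumexp((f[:, None] - C) / eps - np.log(n), axis=0)
        f = -eps * logsumexp((g[None, :] - C) / eps - np.log(m), axis=1)
    P = np.exp((f[:, None] + g[None, :] - C) / eps) / (n * m)
    return float(np.sum(P * C))

def debiased_cost(x, y, eps=0.1):
    """Centered (debiased) estimate: subtract the self-transport terms."""
    return (entropic_cost(x, y, eps)
            - 0.5 * entropic_cost(x, x, eps)
            - 0.5 * entropic_cost(y, y, eps))
```

Centering makes the divergence vanish for identical inputs, removing the entropic blur; the paper's contribution is to characterize when this helps the estimation of the Monge map, and when, counter-intuitively, it hurts.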

7 citations


Proceedings ArticleDOI
24 May 2022
TL;DR: The strong theoretical foundations provided in this paper motivate further studies of the empirical behaviour of the LOT estimator, notably on suitable local minima and on improving the convergence of the mirror descent (MD) scheme using other adaptive choices of step size.
Abstract: The matching principles behind optimal transport (OT) play an increasingly important role in machine learning, a trend which can be observed when OT is used to disambiguate datasets in applications (e.g. single-cell genomics) or used to improve more complex methods (e.g. balanced attention in transformers or self-supervised learning). To scale to more challenging problems, there is a growing consensus that OT requires solvers that can operate on millions, not thousands, of points. The low-rank optimal transport (LOT) approach advocated in \cite{scetbon2021lowrank} holds several promises in that regard, and was shown to complement more established entropic regularization approaches, being able to insert itself in more complex pipelines, such as quadratic OT. LOT restricts the search for low-cost couplings to those that have a low nonnegative rank, yielding linear-time algorithms in cases of interest. However, these promises can only be fulfilled if the LOT approach is seen as a legitimate contender to entropic regularization when compared on properties of interest, where the scorecard typically includes theoretical properties (statistical complexity and relation to other methods) or practical aspects (debiasing, hyperparameter tuning, initialization). We target each of these areas in this paper in order to cement the impact of low-rank approaches in computational OT.

6 citations


Proceedings Article
11 Feb 2022
TL;DR: In this article, the authors study the Schrödinger bridge (SB) problem between Gaussians in the dynamic setting and provide closed-form expressions for SBs between Gaussian measures.
Abstract: The static optimal transport (OT) problem between Gaussians seeks to recover an optimal map, or more generally a coupling, to morph a Gaussian into another. It has been well studied and applied to a wide variety of tasks. Here we focus on the dynamic formulation of OT, also known as the Schrödinger bridge (SB) problem, which has recently seen a surge of interest in machine learning due to its connections with diffusion-based generative models. In contrast to the static setting, much less is known about the dynamic setting, even for Gaussian distributions. In this paper, we provide closed-form expressions for SBs between Gaussian measures. In contrast to the static Gaussian OT problem, which can be simply reduced to studying convex programs, our framework for solving SBs requires significantly more involved tools such as Riemannian geometry and generator theory. Notably, we establish that the solutions of SBs between Gaussian measures are themselves Gaussian processes with explicit mean and covariance kernels, and thus are readily amenable for many downstream applications such as generative modeling or interpolation. To demonstrate the utility, we devise a new method for modeling the evolution of single-cell genomics data and report significantly improved numerical stability compared to existing SB-based approaches.
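For contrast with the dynamic setting, the static Gaussian OT map mentioned above has the classical closed form T(x) = m2 + A(x - m1) with A = S1^{-1/2} (S1^{1/2} S2 S1^{1/2})^{1/2} S1^{-1/2}; a quick numerical check (standard textbook material, not code from the paper):

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_monge_map(m1, S1, m2, S2):
    """Closed-form OT (Monge) map between N(m1, S1) and N(m2, S2):
    T(x) = m2 + A (x - m1), with A the Bures-Wasserstein transport matrix."""
    S1_half = np.real(sqrtm(S1))
    S1_inv_half = np.linalg.inv(S1_half)
    A = S1_inv_half @ np.real(sqrtm(S1_half @ S2 @ S1_half)) @ S1_inv_half
    return (lambda x: m2 + A @ (x - m1)), A

m1, m2 = np.zeros(2), np.array([1.0, -1.0])
S1 = np.array([[2.0, 0.5], [0.5, 1.0]])
S2 = np.array([[1.0, -0.3], [-0.3, 3.0]])
T, A = gaussian_monge_map(m1, S1, m2, S2)
# pushing N(m1, S1) through the affine map T yields covariance A S1 A^T = S2
```

The dynamic SB solution derived in the paper refines this static picture: the bridge's marginals interpolate between the two Gaussians, with explicit mean and covariance kernels at every intermediate time.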

6 citations


Proceedings ArticleDOI
18 Apr 2022
TL;DR: This work proposes using matching techniques found in the optimal transport (OT) literature, resulting in images that faithfully reflect a wide diversity of prompts, and provides numerous illustrations showing that OT avoids some of the pitfalls arising from estimating vectors with mean distances.
Abstract: Recent advances in deep learning, such as powerful generative models and joint text-image embeddings, have provided the computational creativity community with new tools, opening new perspectives for artistic pursuits. Text-to-image synthesis approaches that operate by generating images from text cues provide a case in point. These images are generated with a latent vector that is progressively refined to agree with text cues. To do so, patches are sampled within the generated image, and compared with the text prompts in the common text-image embedding space; the latent vector is then updated, using gradient descent, to reduce the mean (average) distance between these patches and text cues. While this approach provides artists with ample freedom to customize the overall appearance of images, through their choice in generative models, the reliance on a simple criterion (mean of distances) often causes mode collapse: the entire image is drawn to the average of all text cues, thereby losing their diversity. To address this issue, we propose using matching techniques found in the optimal transport (OT) literature, resulting in images that are able to reflect faithfully a wide diversity of prompts. We provide numerous illustrations showing that OT avoids some of the pitfalls arising from estimating vectors with mean distances, and demonstrate the capacity of our proposed method to perform better in experiments, qualitatively and quantitatively.
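The mode-collapse issue is easy to reproduce on toy embeddings: averaging distances cannot distinguish a diverse patch set from one collapsed onto the prompts' mean, whereas a matching objective can. The sketch below uses exact one-to-one assignment (SciPy's linear_sum_assignment) as a stand-in for the paper's OT solver; all embeddings are made up for illustration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mean_distance_loss(patches, prompts):
    """Average distance of every patch embedding to every prompt embedding."""
    D = np.linalg.norm(patches[:, None, :] - prompts[None, :, :], axis=-1)
    return D.mean()

def matching_loss(patches, prompts):
    """Total cost of the optimal one-to-one patch-prompt assignment
    (an exact-matching stand-in for an OT coupling)."""
    D = np.linalg.norm(patches[:, None, :] - prompts[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(D)
    return D[rows, cols].sum()

prompts = np.array([[0.0, 0.0], [10.0, 0.0]])    # two distinct text cues
diverse = np.array([[0.0, 0.0], [10.0, 0.0]])    # one patch near each cue
collapsed = np.array([[5.0, 0.0], [5.0, 0.0]])   # both patches at the mean
```

Here mean_distance_loss is 5.0 for both patch sets, so gradient descent has no incentive to keep patches diverse, while matching_loss is 0 for the diverse set and 10 for the collapsed one.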

1 citation


Posted ContentDOI
25 Mar 2022-bioRxiv
TL;DR: This work reformulates the MMD-MA optimization problem using linear algebra and solves it with KeOps, a CUDA framework for symbolic matrix computation in Python, and shows that LSMMD-MA scales to a million cells in each modality, two orders of magnitude greater than existing implementations.
Abstract:
Motivation: Modality matching in single-cell omics data analysis (i.e., matching cells across data sets collected using different types of genomic assays) has become an important problem, because unifying perspectives across different technologies holds the promise of yielding biological and clinical discoveries. However, single-cell dataset sizes can now reach hundreds of thousands to millions of cells, which remains out of reach for most multi-modal computational methods.
Results: We propose LSMMD-MA, a large-scale Python implementation of the MMD-MA method for multimodal data integration. In LSMMD-MA we reformulate the MMD-MA optimization problem using linear algebra and solve it with KeOps, a CUDA framework for symbolic matrix computation in Python. We show that LSMMD-MA scales to a million cells in each modality, two orders of magnitude greater than existing implementations.
Availability: LSMMD-MA is freely available at https://github.com/google-research/large_scale_mmdma
Contact: lpapaxanthos@google.com
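The quantity at the heart of MMD-MA is the kernel maximum mean discrepancy between the two modalities' embeddings. A dense NumPy version is below; the comment marks the kernel matrices that LSMMD-MA instead expresses as symbolic KeOps reductions so they never materialize in memory (the helper names here are illustrative, not the package's API):

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X (n, d) and Y (m, d)
    under a Gaussian kernel k(u, v) = exp(-||u - v||^2 / (2 sigma^2))."""
    def gram(A, B):
        # Dense (len(A), len(B)) kernel matrix: this is the object a KeOps
        # implementation would keep symbolic (a LazyTensor) so the reduction
        # runs in linear memory and scales to millions of rows.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * sigma ** 2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()
```

MMD vanishes for identical samples and grows as the two embedded populations drift apart, which is what drives the alignment of modalities during optimization.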

Journal ArticleDOI
TL;DR: This work proposes a new framework to average datasets of evolving populations, with the goal of synthesizing a representative template trajectory from multiple trajectories; it leverages a smooth formulation of DTW, shown to capture temporal shifts, and UOT to handle variations in both space and size.
Abstract: Several fields in science, from genomics to neuroimaging, require monitoring populations (measures) that evolve with time. These complex datasets, describing dynamics with both time and spatial components, pose new challenges for data analysis. We propose in this work a new framework to carry out averaging of these datasets, with the goal of synthesizing a representative template trajectory from multiple trajectories. We show that this requires addressing three sources of invariance: shifts in time, space, and total population size (or mass/amplitude). Here we draw inspiration from dynamic time warping (DTW), optimal transport (OT) theory and its unbalanced extension (UOT) to propose a criterion that can address all three issues. This proposal leverages a smooth formulation of DTW (Soft-DTW) that is shown to capture temporal shifts, and UOT to handle both variations in space and size. Our proposed loss can be used to define spatio-temporal barycenters as Fréchet means. Using Fenchel duality, we show how these barycenters can be computed efficiently, in parallel, via a novel variant of entropy-regularized debiased UOT. Experiments on handwritten letters and brain imaging data confirm our theoretical findings and illustrate the effectiveness of the proposed loss for spatio-temporal data.
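The temporal component of the criterion can be sketched in isolation: Soft-DTW replaces the hard minimum in the classical DTW recursion with a smoothed soft-minimum, which makes the alignment cost differentiable. A minimal value-only implementation follows; the paper combines this with UOT in space, and the squared ground cost and names here are illustrative:

```python
import numpy as np
from scipy.special import logsumexp

def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW value between 1D sequences x, y with squared ground cost.
    softmin_gamma(r) = -gamma * log(sum(exp(-r / gamma))), which tends to
    min(r) as gamma -> 0, recovering classical DTW."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    D = (x[:, None] - y[None, :]) ** 2
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # soft-min over the three admissible predecessor cells
            prev = np.array([R[i - 1, j - 1], R[i - 1, j], R[i, j - 1]])
            R[i, j] = D[i - 1, j - 1] - gamma * logsumexp(-prev / gamma)
    return R[n, m]
```

Because the soft-minimum is smooth, gradients of this value with respect to the input sequences flow through every alignment path, which is what makes barycenters (Fréchet means) computable by first-order methods.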