# Inverse Optimal Transport

25 Feb 2020 · SIAM Journal on Applied Mathematics (Society for Industrial and Applied Mathematics) · Vol. 80, Iss. 1, pp. 599–619

TL;DR: This paper proposes a systematic approach to infer unknown costs from noisy observations of optimal transportation plans. It uses a graph-based formulation of the problem, with countries at the nodes of a graph and nonzero weighted adjacencies only on edges between countries that share a border.

Abstract: Discrete optimal transportation problems arise in various contexts in engineering, the sciences, and the social sciences. Often the underlying cost criterion is unknown, or only partly known, and the observed optimal solutions are corrupted by noise. This paper proposes a systematic approach to infer unknown costs from noisy observations of optimal transportation plans.

Topics: Estimation theory (50%)

## Summary (2 min read)

Jump to: [2.2.] – [2.3. Inverse problem and identifiability.] – [3. Algorithms for inversion.] – [4. Numerical results.] – [4.2. Graph-based cost.] – [4.3. Toeplitz cost.] and [5.]

### 2.2. Cost criteria.

- Problems (2.1) and (2.4) are formulated for general cost matrices C; the specific structure of C depends on the application considered.
- The authors will investigate the behavior of the proposed methodologies for C being (i) Toeplitz (ii) nonsymmetric (iii) determined by an underlying graph structure.
- Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php.
- This resulting discrete underlying structure, which relates the cost matrix to a directed graph representing the migration network between countries, is detailed in the following.
- The minimal cost of moving between vertices of a graph can be computed using Dijkstra's algorithm, recalled in section 3.1 below.
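As an illustration, the shortest-path computation on such a border graph can be sketched as follows — a minimal Dijkstra implementation with hypothetical edge weights, not the costs estimated in the paper:

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path costs from `source` in a weighted directed graph.
    `adj` maps node -> list of (neighbor, nonnegative edge weight)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy border graph with made-up edge weights (illustrative only).
adj = {
    "DE": [("PL", 1.0), ("CZ", 2.0), ("DK", 1.5), ("NL", 1.0), ("LU", 1.0)],
    "PL": [("DE", 1.0), ("CZ", 1.0)],
    "CZ": [("DE", 2.0), ("PL", 1.0)],
    "DK": [("DE", 1.5)],
    "NL": [("DE", 1.0)],
    "LU": [("DE", 1.0)],
}
dist = dijkstra(adj, "PL")  # e.g. dist["NL"] == 2.0 (PL -> DE -> NL)
```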

### 2.3. Inverse problem and identifiability.

- The authors discuss how to sample from the posterior, using a random walk Metropolis (RwM) method, in section 3.2.

### 3. Algorithms for inversion.

- In 1970 Hastings introduced a wide class of MCMC methods, now known as Metropolis–Hastings algorithms [14], and in principle this provides a wide range of variants on RwM that may be used for their Bayesian formulation of inverse OT.
- In high dimensional spaces it can be hard to design proposals which are accepted with a reasonable acceptance probability, and the idea of fixing subsets of the variables, and proposing moves in the remainder, is natural.
- At each iteration one (or several) components of the unknown parameter are updated by sampling from the full conditional probability distribution, cycling through all the variables.
- The method may be relaxed to allow a RwM step from the conditional probability distribution, rather than a full sample.
- Note that in general, for all the methods described here, any proposal which decreases the value of Φ and remains in the admissible set U is accepted with probability one.
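The component-wise updating described above can be sketched as follows — a minimal RwM-within-Gibbs loop on a hypothetical target; the function Φ, the set U, and the step size below are placeholders, not the paper's setup:

```python
import math, random

def rwm_within_gibbs(phi, in_U, x0, step=0.2, n_sweeps=500):
    """RwM-within-Gibbs sketch: cycle through the components, making a
    random-walk Metropolis move in one component at a time (a relaxation
    of full Gibbs sampling from the conditionals). Proposals that decrease
    phi and stay in the admissible set U are accepted with probability one."""
    x = list(x0)
    fx = phi(x)
    for _ in range(n_sweeps):
        for i in range(len(x)):
            y = list(x)
            y[i] += step * random.gauss(0.0, 1.0)
            if not in_U(y):
                continue  # proposals leaving U are rejected outright
            fy = phi(y)
            if math.log(random.random()) < fx - fy:  # prob min(1, exp(fx - fy))
                x, fx = y, fy
    return x

# Hypothetical target: Gaussian negative log density on the positive orthant.
phi = lambda u: 0.5 * sum(ui * ui for ui in u)
in_U = lambda u: all(ui > 0.0 for ui in u)
random.seed(1)
sample = rwm_within_gibbs(phi, in_U, [1.0, 1.0, 1.0])
```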

### 4. Numerical results.

- The authors start by presenting estimates for the European network shown in Figure 2.1.
- The network is constructed so that edges connect countries sharing a border.
- The authors use the estimated transportation map reported in [21] and assume that the noise level is 4%.
- The authors perform two runs of the RwM-within-Gibbs algorithm, using the exact solver in the first and Sinkhorn's algorithm with ε = 0.04 in the second.
- The acceptance rate of the exact solver is 50.8% (53.8%, 53.7%, and 44.9% for the components u, v, and f, respectively); for Sinkhorn the authors have 82.9% (84.7%, 85.5%, 78.6%).

### 4.2. Graph-based cost.

- Next the authors investigate the sensitivity of the results with respect to the forward solver used in Algorithm 3.1.
- The authors run two RwM simulations: the first using the exact solver and the second using the Sinkhorn algorithm.
- The authors observe that both runs give similar posterior distributions if the regularization parameter ε is chosen in a sensible way; see Figure 4.5.
- Generally speaking it seems advisable to choose ε similar to the noise level (as in the results shown).
- The authors will investigate the impact of the regularization parameter in the next subsection in more detail.

### 4.3. Toeplitz cost.

- Next the authors generate the noisy transportation map using the Sinkhorn algorithm.
- In each case the authors perform two different RwM runs, first using the Sinkhorn algorithm and then the exact solver.
- The respective posterior distributions are shown in Figures 4.12 and 4.13.
- The authors recall that the Sinkhorn algorithm solves the respective regularized optimization problem, which has a unique minimum.
- The authors investigate the identification from generated data in the case of 4% noise.

### 5.

- This paper introduces a systematic approach to infer unknown costs from noisy observations of OT plans.
- It is based on the Bayesian framework for inverse problems and allows us to quantify uncertainty in the obtained estimates; however, the methodology may also be viewed as a stochastic optimization procedure in its own right, tuning the unknowns so that the OT plan better fits the data.
- In this context reported annual migration flow statistics can be interpreted as noisy observations of OT plans with cost related to the geographical position of countries.
- The numerical investigations show that the proposed methodologies are robust and consistent for different cost functions and parametrizations.


Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

SIAM J. APPL. MATH. © 2020 Society for Industrial and Applied Mathematics
Vol. 80, No. 1, pp. 599–619

INVERSE OPTIMAL TRANSPORT∗

ANDREW M. STUART† AND MARIE-THERESE WOLFRAM‡

**Abstract.** Discrete optimal transportation problems arise in various contexts in engineering, the sciences, and the social sciences. Often the underlying cost criterion is unknown, or only partly known, and the observed optimal solutions are corrupted by noise. In this paper we propose a systematic approach to infer unknown costs from noisy observations of optimal transportation plans. The algorithm requires only the ability to solve the forward optimal transport problem, which is a linear program, and to generate random numbers. It has a Bayesian interpretation and may also be viewed as a form of stochastic optimization. We illustrate the developed methodologies using the example of international migration flows. Reported migration flow data captures (noisily) the number of individuals moving from one country to another in a given period of time. It can be interpreted as a noisy observation of an optimal transportation map, with costs related to the geographical position of countries. We use a graph-based formulation of the problem, with countries at the nodes of graphs and nonzero weighted adjacencies only on edges between countries which share a border. We use the proposed algorithm to estimate the weights, which represent cost of transition, and to quantify uncertainty in these weights.

**Key words.** optimal transport, international migration flows, linear program, parameter estimation, Bayesian inversion

**AMS subject classifications.** 90C08, 62F15, 65K10

**DOI.** 10.1137/19M1261122

**1. Introduction.**

**1.1. Background.** There are many problems in engineering, the sciences, and the social sciences, in which an input is transformed into output in an optimal way according to a cost criterion. We are interested in problems where the transformation from input to output is known, and the objective is to infer the cost criterion which drives this transformation. Our primary motivation is optimal transport (OT) problems in which the transport plan is known but the cost is not. More generally, linear programs in which the solution is known, but the cost function and constraints are to be determined, fall into the category of problems to which the methodology introduced in this paper applies. We illustrate the type of problem of interest by means of an example.

**Example: International migration.** Quantifying migration flows between countries is essential to understand contemporary migration flow patterns. Typically two types of migration statistics are collected: flow and stock data. Migration stock data states the number of foreign born individuals present in a country at a given time and is usually based on population censuses. Stock data is available for almost all countries in the world. Migration flow data captures the number of migrants entering and leaving (inflow and outflow, respectively) a country over the course of a specific period, such as one year; see [1]. It is collected by most developed countries, but no

∗Received by the editors May 10, 2019; accepted for publication (in revised form) December 4, 2019; published electronically February 25, 2020. https://doi.org/10.1137/19M1261122
Funding: The work of the first author was supported by the U.S. National Science Foundation (NSF) under grant DMS 1818977 and by AFOSR grant FA9550-17-1-0185. The work of the second author was partially supported by the Royal Society International Exchanges grant IE 161662.
†California Institute of Technology, Pasadena, CA 91125 (astuart@caltech.edu).
‡University of Warwick, Coventry CV4 7AL, UK, and RICAM, Austrian Academy of Sciences, 4040 Linz, Austria (m.wolfram@warwick.ac.uk).


Table 1.1. Harmonized migration flow statistics for the period 2002–2007; see [9].

| From |   | CZ | DE | DK | LU | NL | PL |
|------|---|----|----|----|----|----|----|
| CZ | R | 0 | 9,218 | 262 | 4 | 511 | 45 |
|    | S | 0 | 560 | 24 | 3 | 81 | 583 |
| DE | R | 1,362 | 0 | 4,001 | 454 | 9,182 | 2,876 |
|    | S | 8,104 | 0 | 3,095 | 1,686 | 9,293 | 100,827 |
| DK | R | 46 | 2,687 | 0 | 11 | 475 | 34 |
|    | S | 179 | 2,612 | 0 | 1,387 | 602 | 833 |
| LU | R | 2 | 2,282 | 162 | 0 | 161 | 5 |
|    | S | 13 | 911 | 99 | 0 | 97 | 23 |
| NL | R | 255 | 13,681 | 864 | 27 | 0 | 163 |
|    | S | 298 | 10,493 | 533 | 191 | 0 | 1,020 |
| PL | R | 1,608 | 136,927 | 2,436 | 19 | 5,744 | 0 |
|    | S | 63 | 14,417 | 111 | 23 | 577 | 0 |
| Tot | S | 3,273 | 164,795 | 7,725 | 515 | 16,073 | 3,123 |
|     | R | 8,657 | 28,993 | 3,862 | 2,041 | 10,650 | 103,286 |

international standards are defined. For example, the time of residence after which a person counts as an international migrant varies from country to country. Because of the different definitions and data collection methods, these statistics can be hard to compare. International agencies, such as the United Nations Statistics Division or the Statistical Office of the European Union (Eurostat), publish annual migration flow estimates. These estimates are often based on Poisson or Bayesian linear regression. For more information about the estimation of migration flows using flow or stock statistics we refer to [2, 4, 20, 21]. For the purposes of this paper the main issue to appreciate is that migration data is available but should be viewed as noisy.

Flow data is typically presented in an origin-destination matrix, in which the (i, j)th off-diagonal entry contains the number of people moving from country i to country j in a given period of time. This origin-destination data can be reported by both the sending (S) and the receiving (R) countries. Hence two migration flow tables are available, often disaggregated by sex and age groups. Table 1.1 shows harmonized data, which was preprocessed to improve comparability, reported by 6 European countries for the period 2002–2007. The numbers of the sending and receiving countries vary significantly. For example, Germany reported that 136,927 people immigrated from Poland, while Poland reported 14,417 individuals who left for Germany. These very different numbers naturally raise the question of the true migration flows. In many settings it is natural to place greater weight on receiving data rather than departure data. But even this data is not subject to uniform standards, and therefore providing reliable estimates and quantifying uncertainty is of great interest.

We interpret the reported origin-destination data maps (when appropriately normalized) as a noisy estimate of a transport plan arising from an OT problem with unknown cost. It is then natural to try and infer the transportation cost, as it carries information about the migration process.

The preceding example serves as motivation, and we will come back to it throughout this paper. However, we reemphasize that the identification methodologies that we introduce in this paper can be used for general inverse OT and linear programming problems; further examples will serve to illustrate this fact.
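The normalization step can be sketched as follows, using a few of the receiving-country entries from Table 1.1 (the restriction to a 3×3 sub-table is purely illustrative):

```python
import numpy as np

# Reported origin-destination flows (people moving i -> j) for CZ, DE, DK,
# taken from the receiving-country (R) rows of Table 1.1.
flows = np.array([[0.0, 9218.0, 262.0],
                  [1362.0, 0.0, 4001.0],
                  [46.0, 2687.0, 0.0]])

# Normalize so that the matrix lies in P^{n x n} (entries sum to one):
# it can then be read as a noisy observation of an OT plan.
T_obs = flows / flows.sum()
p, q = T_obs.sum(axis=1), T_obs.sum(axis=0)  # empirical marginals
```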

**1.2. Literature review.** OT originates with the French mathematician Gaspard Monge who, in 1781, investigated the problem of finding the most cost-effective


way to move a pile of sand to fill a hole of the same volume. Kantorovich introduced the modern (relaxed) formulation of the problem, in which mass can be split, in 1942. In more mathematical terms Kantorovich considered the following setup: given two positive measures (of equal mass) and a cost function, find the transportation map that moves one measure to the other minimizing the transport cost. The corresponding infimum induces a distance between these two measures, the so-called Wasserstein distance. The Wasserstein distance plays an important role in probability theory, partial differential equations, and many other fields in applied mathematics [27, 30]. Furthermore the techniques and methodologies developed in OT have found application in a variety of scientific disciplines including data science, economics, imaging, and meteorology [13].

With the spread and application of OT into different scientific disciplines the interest in computational methodologies has increased. Commonly used numerical methods broadly speaking fall into two categories: linear programming [8] and methods specific to the structure of OT. Linear programs are classic problems which have been extensively studied in the field of optimization and operations research. Many computational methodologies have been developed, such as the famous simplex algorithm (and its many variants), the Hungarian algorithm, and the auction algorithm. All these methods work well for small to medium sized problems but are too slow in modern applications such as imaging or supply chain management. Recently a significant speed up of linear programming was achieved by considering a regularized OT problem, leading to the Sinkhorn algorithm (or variants thereof), in which an additional entropic regularization term is added to the objective function; this allows efficient computation of the corresponding minimizer and induces a trade-off between fidelity to the original problem and computational speed. This family of efficient algorithms resulted in the rapid advancement of computational OT in recent years, especially in the context of imaging and data science; see [7, 19, 22].

Inverse problems for linear programming have received considerable interest in the engineering literature. The paper [3], building on earlier work in [32], studies the problem by seeking a cost function nearest to a given one in $\ell_p$ for which the given solution is an optimal linear program; this problem is itself a linear program in the case $p = 1$. The formulation of an inverse problem for linear programming in [10] took a slightly more general perspective, as it does not assume that the given data necessarily arises as the solution of a linear program, and rather seeks to minimize the distance to the solution set of a linear program. A recent application of the inverse problem for linear programming may be found in [26], for example. The work most closely related to this paper is the recent publication by Li et al. (see [16]), in which the authors minimize the log likelihood function to estimate the underlying cost. These papers on inverse linear programming are foundational and have opened up a great deal of subsequent research. However, the methods used in them do not account in a systematic way for noise in the data provided and for the incorporation of prior information. We address these issues by adopting a Bayesian formulation of the inverse problem for linear programming, concentrating on OT in particular; the ideas are readily generalized to inverse linear programming in general. The Bayesian approach not only allows for the quantification of uncertainty but also leads to algorithms which may be viewed as stochastic methods for exploring the space of solutions, constrained by the observed data. An overview of the computational state of the art for Bayesian inversion may be found in [15]. The specific methods that we introduce have the desirable feature that they require only solution of the forward OT problem and the ability to generate random numbers.


**1.3. Our contribution.** Our contributions to the subject of inverse problems within linear programming are as follows.

- We formulate inverse OT problems in a Bayesian framework.
- We provide a computational framework for solving inverse OT problems in an efficient fashion.
- We give a systematic discussion of identifiability issues arising for finite dimensional inverse OT.
- We introduce graph-based cost functions for OT, using graph-shortest paths in an integral way.
- Graph-based OT has considerable potential for application, and we introduce a new way of studying migration flow data using inverse OT in the graph-based setting.

We emphasize that, while the graph-based formulation of cost corresponds to a rather specific way of designing cost functions for discrete linear programs, the framework and algorithms developed in this paper apply quite generally to inverse linear programming and hence to OT in general. We develop the methodology in general, using graph-based migration flow as a primary illustrative example. In section 2 we define OT as a linear program, describe the cost criteria considered, and formulate inverse OT in a Bayesian setting; in this section we also discuss the identifiability issue for finite dimensional inverse OT. Section 3 presents algorithms for the forward and inverse OT problems, and section 4 contains numerical results.

We will use the following notation throughout this manuscript. Let $|\cdot|$ and $\langle \cdot, \cdot \rangle$ denote the Euclidean norm and inner product on $\mathbb{R}^n$ and the Frobenius norm and inner product on $\mathbb{R}^{n \times n}$. The spaces of probability matrices, probability vectors, and probability matrices with specified marginals are defined as

$$\mathcal{P}^{n \times n} = \Big\{ B \in \mathbb{R}^{n \times n} : B_{ij} \ge 0,\ \sum_{i,j=1}^{n} B_{ij} = 1 \Big\}, \qquad \mathcal{P}^{n} = \Big\{ u \in \mathbb{R}^{n} : u_j \ge 0,\ \sum_{j=1}^{n} u_j = 1 \Big\},$$

$$\mathcal{S}_{p,q} = \big\{ B \in \mathcal{P}^{n \times n} : B \mathbb{1} = p,\ B^{T} \mathbb{1} = q \big\} \quad \text{for } p, q \in \mathcal{P}^{n}, \text{ where } \mathbb{1} = (1, \dots, 1)^{T} \in \mathbb{R}^{n}.$$

**2. Inverse OT.** In this section we introduce the forward OT problem and discuss specific cost criteria, before formulating the respective inverse OT problem in the Bayesian framework.

**2.1. Forward problem.** We consider two discrete probability vectors $q \in \mathcal{P}^{n}$ and $p \in \mathcal{P}^{n}$ and a given cost $C \in \mathcal{P}^{n \times n}$. Then the OT problem corresponds to finding a map transporting $p$ to $q$ at minimal cost. Note that in OT the cost matrix has nonnegative entries, which can be normalized to be an element of $\mathcal{P}^{n \times n}$ without loss of generality. The respective forward OT problem is to find

$$T^{*} \in \operatorname*{argmin}_{T \in \mathcal{S}_{p,q}} \langle C, T \rangle. \tag{2.1}$$

Problem (2.1) falls into the more general class of linear programs. Linear programs (and their many variants) arise in various specific settings, such as the earth mover's distance [25] or cost network flows [5], in different scientific communities. The problem (2.1) has, by virtue of being a specific class of linear programs, at least one solution; this solution lies on the boundary of the feasible set (defined by the equality constraints). If the solution is unique, then we define the mapping $F : \mathcal{P}^{n} \times \mathcal{P}^{n} \times \mathcal{P}^{n \times n} \to \mathcal{P}^{n \times n}$ by

$$T^{*} = F(p, q, C). \tag{2.2}$$


In the nonunique setting we define $F(p, q, C)$ to be the unique element determined by running a specific nonrandom algorithm for the linear program to termination, started at a specific initial guess.
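Since the forward problem (2.1) is a linear program, it can be solved with any LP solver by stacking the marginal constraints; below is a minimal sketch using `scipy.optimize.linprog` (the 2×2 instance is made up for illustration, not taken from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def solve_ot(p, q, C):
    """Solve the discrete OT problem (2.1) as a linear program:
    minimise <C, T> over T >= 0 with row sums p and column sums q."""
    n = len(p)
    A_eq = []
    # Row-sum constraints: sum_j T_ij = p_i.
    for i in range(n):
        row = np.zeros((n, n)); row[i, :] = 1.0
        A_eq.append(row.ravel())
    # Column-sum constraints: sum_i T_ij = q_j.
    for j in range(n):
        col = np.zeros((n, n)); col[:, j] = 1.0
        A_eq.append(col.ravel())
    b_eq = np.concatenate([p, q])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
    return res.x.reshape(n, n)

# Small illustrative instance.
p = np.array([0.5, 0.5])
q = np.array([0.25, 0.75])
C = np.array([[0.0, 1.0], [1.0, 0.0]])
T = solve_ot(p, q, C)  # optimal cost <C, T> = 0.25
```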

We now consider (2.1) regularized by the addition of the discrete entropy, an approach popularized in [7, 19] which has led to considerable analytical and computational developments. Besides the advantageous analytical and computational aspects, the regularization term can be interpreted as an inherent uncertainty in the cost due to the heterogeneity of agents. Galichon and Salanié [13] propose the same regularization in the context of marriage market and matching problems. The entropy term is

$$H(T) = \sum_{i,j=1}^{n} T_{i,j} \big( \log T_{i,j} - 1 \big), \tag{2.3}$$

where the logarithm is applied elementwise, and the resulting regularized problem is

$$T^{*}_{\varepsilon} = \operatorname*{argmin}_{T \in \mathcal{S}_{p,q}} \big\{ \langle C, T \rangle + \varepsilon H(T) \big\}. \tag{2.4}$$

This problem has a unique minimizer $T^{*}_{\varepsilon}$, since $H(T)$ is strongly convex. Following our previous notation we define the corresponding mapping $F_{\varepsilon} : \mathcal{P}^{n} \times \mathcal{P}^{n} \times \mathcal{P}^{n \times n} \to \mathcal{P}^{n \times n}$ by

$$T^{*}_{\varepsilon} = F_{\varepsilon}(p, q, C). \tag{2.5}$$

It is, in contrast to the optimal solution of (2.1), not sparse. It is known that solutions to (2.4) converge to minimizers of (2.1) as $\varepsilon \to 0$; determining the rate of convergence is still an open problem. The special structure of this regularized problem can be used to construct efficient splitting algorithms. These methods are based on the equivalent formulation of finding the projection of the joint coupling with respect to the Kullback–Leibler divergence

$$D_{\mathrm{KL}}(T \,\|\, K) := \sum_{i,j=1}^{n} \Big( T_{i,j} \log \frac{T_{i,j}}{K_{i,j}} - T_{i,j} + K_{i,j} \Big),$$

where the logarithm and division are applied elementwise and $K$ is the Gibbs kernel

$$K_{i,j} = \exp\Big( -\frac{C_{i,j}}{\varepsilon} \Big). \tag{2.6}$$

In particular

$$T^{*}_{\varepsilon} = \operatorname*{argmin}_{T \in \mathcal{S}_{p,q}} D_{\mathrm{KL}}(T \,\|\, K). \tag{2.7}$$

Problem (2.7) can be solved extremely efficiently using proximal methods, yielding, for example, the celebrated Sinkhorn algorithm. We will briefly outline the underlying ideas in section 3.1.
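The alternating-scaling structure behind the Sinkhorn algorithm can be sketched as follows — a minimal implementation of the iteration for the regularized problem (2.4), run on an illustrative instance rather than the paper's data:

```python
import numpy as np

def sinkhorn(p, q, C, eps, n_iters=500):
    """Sinkhorn iterations for the entropy-regularised OT problem (2.4):
    alternately rescale the Gibbs kernel K = exp(-C/eps) to match the marginals."""
    K = np.exp(-C / eps)              # Gibbs kernel (2.6)
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (K.T @ u)             # rescale to match column marginals
        u = p / (K @ v)               # rescale to match row marginals
    return u[:, None] * K * v[None, :]  # T_eps = diag(u) K diag(v)

# Illustrative instance: small eps gives a dense plan close to the LP optimum.
p = np.array([0.5, 0.5])
q = np.array([0.25, 0.75])
C = np.array([[0.0, 1.0], [1.0, 0.0]])
T = sinkhorn(p, q, C, eps=0.1)
```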

**2.2. Cost criteria.** Problems (2.1) and (2.4) are formulated for general cost matrices $C$; the specific structure of $C$ depends on the application considered. We will investigate the behavior of the proposed methodologies for $C$ being

(i) Toeplitz,
(ii) nonsymmetric,
(iii) determined by an underlying graph structure.
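For case (i), a Toeplitz cost matrix has entries depending only on the index difference; a small sketch of the symmetric case, with hypothetical parameters:

```python
import numpy as np

def toeplitz_cost(c):
    """Build a symmetric Toeplitz cost matrix with C_ij = c[|i - j|],
    so the cost depends only on the distance between indices."""
    n = len(c)
    return np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])

# Hypothetical cost parameters (not estimates from the paper).
C = toeplitz_cost([0.0, 1.0, 3.0, 4.0])
```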

##### Citations


Abstract: Optimal Transport (OT) defines geometrically meaningful "Wasserstein" distances, used in machine learning applications to compare probability distributions. However, a key bottleneck is the design of a "ground" cost which should be adapted to the task under study. In most cases, supervised metric learning is not accessible, and one usually resorts to some ad-hoc approach. Unsupervised metric learning is thus a fundamental problem to enable data-driven applications of Optimal Transport. In this paper, we propose for the first time a canonical answer by computing the ground cost as a positive eigenvector of the function mapping a cost to the pairwise OT distances between the inputs. This map is homogeneous and monotone, thus framing unsupervised metric learning as a non-linear Perron-Frobenius problem. We provide criteria to ensure the existence and uniqueness of this eigenvector. In addition, we introduce a scalable computational method using entropic regularization, which - in the large regularization limit - operates a principal component analysis dimensionality reduction. We showcase this method on synthetic examples and datasets. Finally, we apply it in the context of biology to the analysis of a high-throughput single-cell RNA sequencing (scRNAseq) dataset, to improve cell clustering and infer the relationships between genes in an unsupervised way.

2 citations


TL;DR: This paper proposes mean-field game inverse-problem models to reconstruct the ground metrics and interaction kernels in the running costs, and numerically demonstrates that the model is both efficient and robust to noise.

Abstract: Mean-field games arise in various fields including economics, engineering, and machine learning. They study strategic decision making in large populations where the individuals interact via certain mean-field quantities. The ground metrics and running costs of the games are of essential importance but are often unknown or only partially known. In this paper, we propose mean-field game inverse-problem models to reconstruct the ground metrics and interaction kernels in the running costs. The observations are the macro motions, to be specific, the density distribution, and the velocity field of the agents. They can be corrupted by noise to some extent. Our models are PDE constrained optimization problems, which are solvable by first-order primal-dual methods. Besides, we apply Bregman iterations to find the optimal model parameters. We numerically demonstrate that our model is both efficient and robust to noise.

### Cites background from "Inverse Optimal Transport"

...[37] proposes a framework to learn the unknown ground costs from noisy observations during optimal transport....

[...]

...In particular, [30, 37] focus on the linear programming formulation of inverse optimal transport problems....

[...]

...the static joint distribution in [30, 37]....

[...]


14 Nov 2020

TL;DR: This paper investigates the solution of an inverse parametric nonlinear transportation problem in which, for certain values of the parameters, the unit transportation costs in the basic problem are adapted as little as possible so that a specific feasible alternative becomes an optimal solution.

Abstract: This paper investigates the solution of an inverse parametric nonlinear transportation problem in which, for certain values of the parameters, the unit transportation costs in the basic problem are adapted as little as possible so that a specific feasible alternative becomes an optimal solution. In addition, a stability set of these parameters is investigated to keep the new optimal (feasible) solution unchanged. The idea of the study is based on using tuning parameters λ∈Rm in the objective function and input parameters υ∈Rl in the constraint set. In the inverse parametric nonlinear cost transportation problem P(λ,υ), the tuning parameters λ∈Rm are adapted as little as possible so that the specific feasible solution x∘ becomes optimal for certain values of υ∈Rl; a stability set of the parameters is then investigated to keep the new optimal solution x∘ unchanged. The proposed method consists of three phases. First, based on the optimality conditions, the parameters λ∈Rm are tuned as little as possible so that the initial feasible solution x∘ becomes the new optimal solution. Second, using the input parameters υ∈Rl, the resulting problem is reformulated in parametric form P(υ). Third, based on stability notions, the availability domain of the input parameters is detected to keep the optimal solution unchanged. Finally, numerical examples treating the inverse nonlinear programming problem and the inverse transportation problem of minimizing nonlinear cost functions are presented to clarify the effectiveness of the proposed algorithm, not only for inverse transportation problems but also for nonlinear programming problems.


Matthieu Heitz, Nicolas Bonneel, David Coeurjolly, Marco Cuturi, et al.

TL;DR: This paper considers the GML problem when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest, and seeks a graph ground metric such that the OT interpolation between the starting and ending densities that result from that ground metric agrees with the observed evolution.

Abstract: Optimal transport (OT) distances between probability distributions are parameterized by the ground metric they use between observations. Their relevance for real-life applications strongly hinges on whether that ground metric parameter is suitably chosen. Selecting it adaptively and algorithmically from prior knowledge, the so-called ground metric learning GML) problem, has therefore appeared in various settings. We consider it in this paper when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest. This imposes a rich structure for candidate metrics, but also enables far more efficient learning procedures when compared to a direct optimization over the space of all metric matrices. We use this setting to tackle an inverse problem stemming from the observation of a density evolving with time: we seek a graph ground metric such that the OT interpolation between the starting and ending densities that result from that ground metric agrees with the observed evolution. This OT dynamic framework is relevant to model natural phenomena exhibiting displacements of mass, such as for instance the evolution of the color palette induced by the modification of lighting and materials.

### Cites background from "Inverse Optimal Transport"

...[20], Stuart and Wolfram [46], Li et al....

[...]

...Stuart and Wolfram [46] infer graph-based cost functions similar to ours, but learn from noisy observations of transport plans, in a Bayesian framework....

[...]