A Riemannian Framework for Tensor Computing
Summary
1 Introduction
- Positive definite symmetric matrices (so-called tensors in this article) are often encountered in image processing, for instance as covariance matrices for characterizing statistics on deformations, or as an encoding of the principal diffusion directions in Diffusion Tensor Imaging (DTI).
- Even when an SVD is performed to smooth the rotation (the eigenvector trihedron) and the eigenvalues independently, there is a continuity problem around equal eigenvalues.
- This statistical framework was then reorganized and extended in [Pennec, 1999, Pennec, 2004] for general Riemannian manifolds, with invariance properties leading in some cases to a natural choice for the metric.
- The authors show how this theory can be applied to tensors, leading to a new intrinsic computing framework for these geometric features, with many important theoretical properties as well as practical computational advantages.
- The authors show that this problem can be tackled efficiently through a weighted mean optimization.
2 Statistics on geometric features
- The authors summarize in this Section the theory of statistics on Riemannian manifolds developed in [Pennec, 1999, Pennec, 2004].
- The aim is to exemplify the fact that choosing a Riemannian metric “automatically” determines a powerful framework to work on the manifold through the use of a few tools from differential geometry.
- To compute the length of the curve, the authors can proceed as usual by integrating the norm of its tangent vector along the curve (written out after this list).
- The distance between two points of a connected Riemannian manifold is the minimum length among the curves joining these points.
- This means that the manifold has no boundary nor any singular point that the authors can reach in a finite time.
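In symbols (standard definitions, not specific to this paper): for a curve $\gamma$ on the manifold with tangent vector $\dot\gamma(t)$,

$$L(\gamma) = \int_0^1 \big\|\dot\gamma(t)\big\|_{\gamma(t)}\, dt, \qquad \operatorname{dist}(x, y) = \min_{\gamma(0) = x,\; \gamma(1) = y} L(\gamma),$$

where $\|\cdot\|_{\gamma(t)}$ is the norm induced by the Riemannian metric on the tangent space at $\gamma(t)$.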
2.1 Exponential chart
- From the theory of second order differential equations, the authors know that there exists one and only one geodesic starting from that point with this tangent vector.
- The function that maps each vector $\overrightarrow{xy} \in T_xM$ to the point $y$ of the manifold that is reached after a unit time by the geodesic starting at $x$ with this tangent vector is called the exponential map (written out in symbols after this list).
- On the sphere S2(1) for instance, the cut locus of a point x is its antipodal point and the tangential cut locus is the circle of radius π.
- In this exponential chart, geodesics starting from $x$ are straight lines, and the distance to the reference point is conserved.
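In symbols, writing $\gamma_{(x,v)}$ for the unique geodesic with $\gamma(0) = x$ and $\dot\gamma(0) = v$:

$$\exp_x(v) = \gamma_{(x,v)}(1), \qquad \log_x(y) = \overrightarrow{xy} = \exp_x^{-1}(y),$$

so that $\operatorname{dist}(x, y) = \|\log_x(y)\|_x$ for any $y$ outside the cut locus of $x$.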
2.2 Practical implementation
- In fact, most of the usual operations using additions and subtractions may be reinterpreted in a Riemannian framework using the notion of bipoint, a precursor of the notion of vector introduced during the 19th century.
- Indeed, one defines vectors as equivalence classes of bipoints (oriented couples of points) in a Euclidean space.
- This is possible because the authors have a canonical way (the translation) to compare what happens at two different points.
- In a Riemannian manifold, the authors can still compare things locally (by parallel transportation), but not any more globally.
- Conversely, the logarithmic map may be used to map almost any bipoint (x, y) into a vector −→xy = logx(y) of TxM.
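The correspondence this yields (the paper's Table 1, reconstructed here from the surrounding text) reads:

- subtraction: $\overrightarrow{xy} = y - x$ becomes $\overrightarrow{xy} = \log_x(y)$;
- addition: $y = x + v$ becomes $y = \exp_x(v)$;
- distance: $\operatorname{dist}(x, y) = \|y - x\|$ becomes $\operatorname{dist}(x, y) = \|\log_x(y)\|_x$;
- gradient descent: $x_{t+1} = x_t - \varepsilon \nabla C(x_t)$ becomes $x_{t+1} = \exp_{x_t}(-\varepsilon \nabla C(x_t))$.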
2.3 Basic statistical tools
- The Riemannian metric induces an infinitesimal volume element on each tangent space, and thus a measure dM on the manifold that can be used to measure random events on the manifold and to define the probability density function (if it exists) of these random elements.
- With the probability measure of a random element, the authors can integrate functions from the manifold to any vector space, thus defining the expected value of this function.
- As for the mean, the authors chose in [Pennec, 1996, Pennec, 1999, Pennec, 2004] a variational approach to generalize the Normal law: they define it as the distribution that minimizes the information (i.e., maximizes the entropy) knowing the mean and the covariance.
- The relation between the concentration matrix (the “metric” used in the exponential of the probability density function) and the covariance matrix is slightly more complex than the simple inversion of the vectorial case, as it has to be corrected for the curvature of the manifold.
- This opens the way to the generalization of many other statistical tests, as the authors may expect similarly simple approximations for sufficiently centered distributions.
3 Working on the Tensor space
- Let us now focus on the space $\operatorname{Sym}^+_n$ of positive definite symmetric matrices.
- The goal is to find a Riemannian metric with interesting enough properties.
- This leads to a very regular manifold structure in which tensors with null or infinite eigenvalues lie at an infinite distance from any positive definite symmetric matrix: the cone of positive definite symmetric matrices is replaced by a space that has an infinite development in each of its $n(n+1)/2$ directions.
- Moreover, there is one and only one geodesic joining any two tensors, and the authors can even define globally consistent orthonormal coordinate systems of tangent spaces.
- Thus, the structure the authors obtain is very close to a vector space, except that the space is curved.
3.1 Exponential, logarithm and square root of tensors
- In the case of symmetric matrices, the authors have some important simplifications.
- The series defining the exponential function converges for any matrix argument, but this is generally not the case for the series defining its inverse function: the logarithm.
- Thus, the matrix exponential realizes a one-to-one mapping between the space of symmetric matrices and the space of tensors.
- The square root is always defined and moreover unique: let $\Sigma = U D^2 U^T$ be a diagonalization (with positive values for the $d_i$'s); the square root is then $\Sigma^{1/2} = U D U^T$.
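As a concrete illustration, here is a minimal numpy sketch of these three maps via eigendecomposition (function names are ours, not the paper's):

```python
import numpy as np

def sym_fun(S, f):
    # apply a scalar function to the eigenvalues: U diag(d) U^T -> U diag(f(d)) U^T
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def sym_exp(W):
    # matrix exponential of a symmetric matrix: always a tensor
    return sym_fun(W, np.exp)

def spd_log(Sigma):
    # matrix logarithm of a tensor: well defined since all eigenvalues are > 0
    return sym_fun(Sigma, np.log)

def spd_sqrt(Sigma):
    # the unique positive square root: Sigma = U D^2 U^T -> U D U^T
    return sym_fun(Sigma, np.sqrt)

# sanity check: exp and log are inverse one-to-one maps between symmetric
# matrices and tensors
A = np.random.randn(3, 3)
Sigma = A @ A.T + 3.0 * np.eye(3)          # a random tensor (SPD matrix)
assert np.allclose(sym_exp(spd_log(Sigma)), Sigma)
assert np.allclose(spd_sqrt(Sigma) @ spd_sqrt(Sigma), Sigma)
```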
3.3 An invariant Riemannian metric
- Another way to determine the invariant distance is through the Riemannian metric.
- Let us take the simplest scalar product on the tangent space at the identity matrix: if $W_1$ and $W_2$ are tangent vectors (i.e. symmetric matrices, not necessarily definite nor positive), the authors define the scalar product to be the standard matrix scalar product $\langle W_1 \mid W_2 \rangle = \operatorname{Tr}(W_1^T W_2)$ (the resulting invariant metric and distance are written out after this list).
- To find the geodesic without going through the computation of Christoffel symbols, the authors may rely on a result from differential geometry [Gamkrelidze, 1991, Helgason, 1978, Kobayashi and Nomizu, 1969] which says that the geodesics for the invariant metrics on affine symmetric spaces are generated by the action of the one-parameter subgroups of the acting Lie group.
- Introducing this solution into the equation of geodesics (Eq. 3) does not lead to a very tractable expression.
- Thanks to the invariance by the linear group, this property holds for the distance to any (positive definite) tensor of the manifold.
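Written out (as in the paper), the metric obtained by translating this scalar product with the group action, and the distance it induces, are

$$\langle W_1 \mid W_2 \rangle_\Sigma = \operatorname{Tr}\!\big(\Sigma^{-1/2}\, W_1\, \Sigma^{-1}\, W_2\, \Sigma^{-1/2}\big),$$

$$\operatorname{dist}^2(\Sigma_1, \Sigma_2) = \big\|\log\big(\Sigma_1^{-1/2}\, \Sigma_2\, \Sigma_1^{-1/2}\big)\big\|^2 = \sum_{i=1}^{n} \log^2 \lambda_i,$$

where the $\lambda_i$ are the joint eigenvalues of $\Sigma_1$ and $\Sigma_2$ (i.e. the eigenvalues of $\Sigma_1^{-1} \Sigma_2$) and $\|\cdot\|$ denotes the Frobenius norm.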
3.4 Exponential and logarithm maps
- This mapping is called the exponential map, because it corresponds to the usual exponential in some matrix groups.
- As explained in Section 2.1, these charts can be viewed as the development of the manifold onto the tangent space along the geodesics.
- Moreover, as the manifold has a non-positive curvature [Skovgaard, 1984], there is no cut-locus and the statistical properties detailed in [Pennec, 2004] hold in their most general form.
- The authors have the existence and uniqueness of the mean of any distribution with a compact support [Kendall, 1990].
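A minimal numpy sketch of these two maps and of the induced distance, using the closed forms $\exp_\Sigma(W) = \Sigma^{1/2} \exp\!\big(\Sigma^{-1/2} W \Sigma^{-1/2}\big) \Sigma^{1/2}$ and $\log_\Sigma(\Lambda) = \Sigma^{1/2} \log\!\big(\Sigma^{-1/2} \Lambda \Sigma^{-1/2}\big) \Sigma^{1/2}$ (function names are ours):

```python
import numpy as np

def sym_fun(S, f):
    # U diag(f(d)) U^T for a symmetric matrix S = U diag(d) U^T
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def exp_map(Sigma, W):
    # Riemannian exponential at Sigma: tangent vector (symmetric matrix) -> tensor
    S = sym_fun(Sigma, np.sqrt)
    Si = sym_fun(Sigma, lambda d: 1.0 / np.sqrt(d))
    return S @ sym_fun(Si @ W @ Si, np.exp) @ S

def log_map(Sigma, Lam):
    # Riemannian logarithm at Sigma: the tangent vector mapped to Lam by exp_map
    S = sym_fun(Sigma, np.sqrt)
    Si = sym_fun(Sigma, lambda d: 1.0 / np.sqrt(d))
    return S @ sym_fun(Si @ Lam @ Si, np.log) @ S

def dist(S1, S2):
    # affine-invariant distance: Frobenius norm of log(S1^-1/2 S2 S1^-1/2)
    Si = sym_fun(S1, lambda d: 1.0 / np.sqrt(d))
    return np.linalg.norm(sym_fun(Si @ S2 @ Si, np.log))
```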
3.5 Induced and orthonormal coordinate systems
- One has to be careful because the coordinate system of all these charts is not orthonormal.
- Even if this basis is orthonormal at some points of the manifold (such as at the identity for their tensors), it has to be corrected for the Riemannian metric at other places due to the manifold curvature.
- In their case, the transformation Σ1/2 ∈ GLn is moreover uniquely defined (as a positive square root) and is a smooth function of Σ over the complete tensor manifold.
- This group action approach was chosen in earlier works [Pennec, 1996, Pennec and Thirion, 1997, Pennec and Ayache, 1998] with what the authors called the placement function.
- On the sphere, there is a singularity at the antipodal point of the chosen origin for any otherwise smooth placement function.
3.6 Gradient descent and PDEs: an intrinsic geodesic marching
- The principle of a first order gradient descent is to go toward the steepest descent, in the direction opposite to the gradient for a short time-step ε, and iterate the process.
- The standard operator $\Sigma_{t+1} = \Sigma_t - \varepsilon W_t$ is only valid for very short time-steps in the flat Euclidean matrix space, and the authors could easily leave the cone of positive definite tensors.
- A much more interesting numerical operator is obtained by following the geodesic backward, starting at $\Sigma_t$ with tangent vector $W_t$: $\Sigma_{t+1} = \exp_{\Sigma_t}(-\varepsilon W_t)$.
- This intrinsic gradient descent ensures that the authors cannot leave the manifold.
- This intrinsic scheme is trivially generalized to partial differential evolution equations (PDEs) on tensor fields such as $\partial_t \Sigma(x, t) = -W(x, t)$: the authors obtain $\Sigma(x, t+dt) = \exp_{\Sigma(x,t)}(-dt\, W(x, t))$.
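A sketch of one such intrinsic step (the scheme $\Sigma_{t+1} = \exp_{\Sigma_t}(-\varepsilon W_t)$ described above; names ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)              # S = U diag(d) U^T
    return (U * f(d)) @ U.T

def intrinsic_step(Sigma, W, eps):
    # geodesic marching: Sigma_{t+1} = exp_Sigma(-eps * W); unlike the flat
    # update Sigma - eps * W, this cannot leave the tensor cone
    S = sym_fun(Sigma, np.sqrt)
    Si = sym_fun(Sigma, lambda d: 1.0 / np.sqrt(d))
    return S @ sym_fun(-eps * (Si @ W @ Si), np.exp) @ S
```

The PDE scheme of the last bullet is this same step applied at every voxel, with $W = W(x, t)$ and $\varepsilon = dt$.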
3.7 Example with the mean value
- The Karcher or Fréchet mean is the set of tensors minimizing the sum of squared distances: $C(\Sigma) = \sum_{i=1}^{N} \operatorname{dist}^2(\Sigma, \Sigma_i)$.
- In the case of tensors, the manifold has non-positive curvature [Skovgaard, 1984], so that there is one and only one mean value $\bar\Sigma$ [Kendall, 1990].
- Thus, the intrinsic Newton gradient descent algorithm gives the following mean value at estimation step $t+1$: $\bar\Sigma_{t+1} = \exp_{\bar\Sigma_t}\!\big(\tfrac{1}{N} \sum_{i=1}^{N} \log_{\bar\Sigma_t}(\Sigma_i)\big)$ (see the sketch after this list).
- This gradient descent algorithm usually converges very fast (about 10 iterations, see Fig. 2).
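A sketch of this iteration, the fixed point $\bar\Sigma_{t+1} = \exp_{\bar\Sigma_t}\!\big(\sum_i w_i \log_{\bar\Sigma_t}(\Sigma_i)\big)$; the optional weights anticipate the weighted means of Sections 4 and 5 (names ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)              # S = U diag(d) U^T
    return (U * f(d)) @ U.T

def karcher_mean(tensors, weights=None, n_iter=100, tol=1e-10):
    # intrinsic Newton/gradient descent for the (weighted) Karcher mean
    N = len(tensors)
    w = np.full(N, 1.0 / N) if weights is None else np.asarray(weights, float)
    mean = sum(wi * T for wi, T in zip(w, tensors))   # Euclidean mean as init
    for _ in range(n_iter):
        S = sym_fun(mean, np.sqrt)
        Si = sym_fun(mean, lambda d: 1.0 / np.sqrt(d))
        # tangent mean in the whitened frame: sum_i w_i log(mean^-1/2 T_i mean^-1/2)
        V = sum(wi * sym_fun(Si @ T @ Si, np.log) for wi, T in zip(w, tensors))
        mean = S @ sym_fun(V, np.exp) @ S             # exp_mean of the tangent mean
        if np.linalg.norm(V) < tol:                   # gradient vanished: converged
            break
    return mean

# commuting example: the mean of two tensors is their geodesic midpoint,
# with the geometric means of the eigenvalues on the diagonal
M = karcher_mean([np.diag([1.0, 1.0]), np.diag([4.0, 0.25])])
assert np.allclose(M, np.diag([2.0, 0.5]))
```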
3.8 Simple statistical operations on tensors
- As described in [Pennec, 2004], the authors may generalize most of the usual statistical methods by using the exponential chart at the mean point.
- One can generate a random Gaussian tensor using the following procedure (sketched in code after this list): the authors sample $n(n+1)/2$ independent and normalized real Gaussian samples, multiply the corresponding vector by the square root of the desired covariance matrix (expressed in their Vec coordinate system), and come back to the tensor manifold using the inverse Vec mapping.
- This theorem states that the empirical mean of N independently and identically distributed (IID) random variables with a variance γ2 asymptotically follows a Gaussian law of variance γ2/N , centered at the exact mean value.
- Moreover, a Kolmogorov-Smirnov test confirms that the distance between the empirical and theoretical cumulative pdf is not significant (p-value of 0.19).
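The sampling procedure above can be sketched as follows (the $\sqrt{2}$ factor on off-diagonal coordinates makes the Vec basis orthonormal for the matrix scalar product; all names are ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def vec_inv(x, n):
    # symmetric matrix from n(n+1)/2 coordinates in an orthonormal basis
    # (off-diagonal basis elements are (E_ij + E_ji) / sqrt(2))
    W = np.zeros((n, n))
    W[np.triu_indices(n)] = x
    W = W + W.T
    W[np.diag_indices(n)] /= 2.0               # undo the doubled diagonal
    W[~np.eye(n, dtype=bool)] /= np.sqrt(2.0)  # orthonormal-basis scaling
    return W

def random_gaussian_tensor(mean, sqrt_cov, rng=None):
    # Gaussian coordinates in the orthonormal basis at the mean, mapped
    # back to the manifold through the exponential map
    rng = np.random.default_rng() if rng is None else rng
    n = mean.shape[0]
    x = sqrt_cov @ rng.standard_normal(n * (n + 1) // 2)
    W = vec_inv(x, n)                          # tangent coordinates at the mean
    S = sym_fun(mean, np.sqrt)
    return S @ sym_fun(W, np.exp) @ S          # exp_mean of that tangent vector

T = random_gaussian_tensor(np.eye(3), 0.1 * np.eye(6))   # an example draw
```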
4 Tensor Interpolation
- One of the important operations in geometric data processing is to interpolate values between known measurements.
- In 3D image processing, (tri-) linear interpolation is often used thanks to its very low computational load and comparatively much better results than nearest neighbor interpolation.
- A typical example where the convolution kernel has an infinite support is sinc (sinus cardinalis) interpolation.
- To ensure that this is an interpolating function, one has to require that wi(xj) = δij (where δij is the Kronecker symbol).
- Typical examples in triangulations or tetrahedrizations are barycentric and natural neighbor coordinates [Sibson, 1981].
4.1 Interpolation through weighted mean
- To generalize interpolation methods defined using weighted means to their tensor manifold, let us assume that the sample weights wk(x) are defined as above in Rd.
- Thanks to their normalization, the value $f(x)$ interpolated from the vectors $f_k$ verifies $\sum_{i=1}^{N} w_i(x)\,(f_i - f(x)) = 0$.
- Of course, the authors lose in general the existence and uniqueness properties.
- For positive weights, the existence and uniqueness theorems for the Karcher mean can be adapted.
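On the tensor manifold, the interpolated value $\Sigma(x)$ is accordingly defined as the weighted Karcher mean, i.e. as the solution of the same barycentric equation with differences replaced by logarithms:

$$\sum_{i=1}^{N} w_i(x)\, \log_{\Sigma(x)}(\Sigma_i) = 0.$$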
4.2 Example of the linear interpolation
- The linear interpolation is simple, since it is a walk along the geodesic joining the two tensors (sketched in code after this list).
- The authors displayed in Fig. 3 the flat and the Riemannian interpolations between 2D tensors of eigenvalues (5,1) horizontally and (1,50) at 45 degrees, along with the evolution of the eigenvalues, their mean (i.e. trace of the matrix) and product (i.e. determinant of the matrix or volume of the ellipsoid).
- With the standard matrix coefficient interpolation, the evolution of the trace is perfectly linear (which was expected since this is a linear function of the coefficients), and the principal eigenvalue grows almost linearly, while the smallest eigenvalue slightly grows toward a local maximum before decreasing.
- What is much more annoying is that the determinant (i.e. the volume) does not grow regularly in between the two tensors, but goes through a maximum.
- On the contrary, one can clearly see a regular evolution of the eigenvalues and of their product with the interpolation in their Riemannian space.
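A sketch of this geodesic walk, using the closed form $\Sigma(t) = \exp_{\Sigma_1}\!\big(t \log_{\Sigma_1}(\Sigma_2)\big) = \Sigma_1^{1/2} \big(\Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}\big)^{t}\, \Sigma_1^{1/2}$ (names ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def geodesic_interp(S1, S2, t):
    # point at parameter t on the unique geodesic from S1 (t = 0) to S2 (t = 1)
    S = sym_fun(S1, np.sqrt)
    Si = sym_fun(S1, lambda d: 1.0 / np.sqrt(d))
    return S @ sym_fun(Si @ S2 @ Si, lambda d: d ** t) @ S
```

Along this path the determinant interpolates geometrically, $\det \Sigma(t) = (\det \Sigma_1)^{1-t} (\det \Sigma_2)^{t}$, which is exactly the regular evolution of the eigenvalue product noted above.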
4.3 Tri-linear interpolation
- The bi- and tri-linear interpolation of tensors on a regular grid in 2D or 3D is almost as simple, except that the authors no longer have an explicit solution using geodesics, since there are more than two reference points.
- After computing the (bi-) tri-linear weights with respect to the neighboring sites of the point the authors want to evaluate, they now have to go through the iterative optimization of the weighted mean (Eq. 5) to compute the interpolated tensor.
- One can see that the volume of the tensors is much larger with the classical interpolation than with the Riemannian one.
- The authors also get a much smoother interpolation of the principal directions with their method.
4.4 Interpolation of non regular measurements
- When tensors are not measured on a regular grid but “randomly” localized in space, defining neighbors becomes an issue.
- One solution, proposed by [Sibson, 1981] and later used for surfaces by [Cazals and Boissonnat, 2001], is the natural neighbor interpolation.
- For any point x, its natural neighbors are the points of {xi} whose Voronoi cells are chopped off upon insertion of x into the Voronoi diagram.
- Moreover, the authors have little control on the quality of this approximation.
5 Filtering tensor fields
- Let us now consider that the authors have a tensor field, for instance like in Diffusion Tensor Imaging (DTI) [Le Bihan et al., 2001], where the tensor is a first order approximation of the anisotropic diffusion of the water molecules at each point of the imaged tissues.
- In the brain, diffusion is strongly favored along oriented structures (axon fibers).
- The tensor field obtained from the images is noisy and needs to be regularized before being further analyzed.
- The generalization to tensor fields is quite straightforward using once again weighted means (Section 5.1 below).
5.1 Gaussian Filtering
- As before, this weighted mean can be computed on their manifold using their intrinsic gradient descent scheme (a sketch follows this list).
- One can see a more pronounced blurring of the corpus callosum fiber tracts when using the flat metric.
- The integration of this filtering scheme into a complete fiber tracking system would be necessary to fully evaluate the pros and cons of each metric.
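A sketch of this filtering at one voxel of a 2D tensor field (reusing `karcher_mean` from the sketch in Section 3.7; the grid layout and names are ours):

```python
import numpy as np
# assumes karcher_mean(tensors, weights=...) from the Section 3.7 sketch

def gaussian_filter_voxel(field, i, j, sigma, radius):
    # replace field[i, j] by the Gaussian-weighted Karcher mean of its
    # neighborhood; field is an (ni, nj, n, n) array of tensors
    ni, nj = field.shape[:2]
    neighbors, weights = [], []
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            ii, jj = i + di, j + dj
            if 0 <= ii < ni and 0 <= jj < nj:
                neighbors.append(field[ii, jj])
                weights.append(np.exp(-(di**2 + dj**2) / (2.0 * sigma**2)))
    weights = np.asarray(weights)
    return karcher_mean(neighbors, weights=weights / weights.sum())
```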
5.2 Spatial gradient of Tensor fields
- The linearity of the derivatives implies that the authors could use directional derivatives in more than the d orthogonal directions.
- This is especially well adapted to stabilize the discrete computations: the finite difference estimation of the directional derivative is $\partial_u F(x) \approx F(x+u) - F(x)$.
- The authors experimentally found in other applications (e.g. to compute the Jacobian of a deformation field in non-rigid registration [Rey et al., 2002, p. 169]) that this gradient approximation scheme was more stable and much faster than computing all derivatives using convolutions, for instance by the derivative of the Gaussian.
- To quantify the local amount of variability independently of the space direction, one usually takes the norm of the gradient: $\|\nabla F(x)\|^2 = \sum_{i=1}^{d} \|\partial_i F(x)\|^2$ (Eq. 7). Notice that this approximation is consistent with the previous one only if the directions $u$ are normalized to unity.
- For a manifold-valued field $\Sigma(x)$ defined on $\mathbb{R}^d$, the authors can proceed similarly, except that the directional derivatives $\partial_i \Sigma(x)$ are now tangent vectors of $T_{\Sigma(x)}M$ (see the sketch after this list).
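A sketch of the squared gradient norm at an interior voxel of a 2D tensor field: each directional derivative is approximated by the log map toward the neighbor, and its invariant norm at $\Sigma(x)$ is the Frobenius norm in the whitened frame (names ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def grad_norm_sq(field, i, j):
    # ||grad Sigma||^2 = sum over the d grid directions of ||d_u Sigma||_Sigma^2;
    # in the whitened frame, ||W||_Sigma is the Frobenius norm of
    # log(Sigma^-1/2 @ neighbor @ Sigma^-1/2)
    Si = sym_fun(field[i, j], lambda d: 1.0 / np.sqrt(d))
    total = 0.0
    for ui, uj in ((1, 0), (0, 1)):       # unit directions, interior voxel assumed
        W = sym_fun(Si @ field[i + ui, j + uj] @ Si, np.log)
        total += np.linalg.norm(W) ** 2
    return total
```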
5.3 Filtering using PDEs
- Regularizing a scalar, vector or tensor field F aims at reducing the amount of its spatial variations.
- The vector case: let us decompose their vector field $F(x)$ into its $n$ scalar components $f_i(x)$.
- Thus, choosing an orthonormal coordinate system on the space $\mathbb{R}^n$, their regularization criterion decouples into $n$ independent scalar regularization problems: $\operatorname{Reg}(F) = \sum_{i=1}^{n} \int_\Omega \|\nabla f_i(x)\|^2\, dx = \sum_{i=1}^{n} \operatorname{Reg}(f_i)$.
- The authors may avoid the introduction of additional complex mathematical tools by coming back to the basic definitions.
- For the numerical computation of the Laplacian, the authors may approximate the first and second order tensor derivative by their Euclidean derivatives.
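A sketch of the resulting discrete scheme on a 2D tensor field: each grid direction $u$ contributes $\Delta_u \Sigma \approx \log_{\Sigma(x)}(\Sigma(x+u)) + \log_{\Sigma(x)}(\Sigma(x-u))$ to the Laplacian, followed by a geodesic step; the optional weight function `c` anticipates the anisotropic variant of Section 5.4 (names ours):

```python
import numpy as np

def sym_fun(S, f):
    d, U = np.linalg.eigh(S)
    return (U * f(d)) @ U.T

def diffusion_step(field, dt, c=None):
    # one explicit step of dSigma/dt = Delta Sigma (interior voxels only);
    # if c is given, each directional contribution is weighted by
    # c(||d_u Sigma||_Sigma) as in Section 5.4
    out = field.copy()
    ni, nj = field.shape[:2]
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            S = sym_fun(field[i, j], np.sqrt)
            Si = sym_fun(field[i, j], lambda d: 1.0 / np.sqrt(d))
            L = np.zeros_like(field[i, j])
            for ui, uj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                W = sym_fun(Si @ field[i + ui, j + uj] @ Si, np.log)
                L += W if c is None else c(np.linalg.norm(W)) * W
            out[i, j] = S @ sym_fun(dt * L, np.exp) @ S   # exp_Sigma(dt * Delta)
    return out
```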
5.4 Anisotropic filtering
- In practice, the authors would like to filter within the homogeneous regions, but not across their boundaries.
- Figures 7 and 8 present example results of this very simple anisotropic filtering scheme on synthetic and real DTI images.
- The anisotropic smoothing perfectly preserves the discontinuity while completely smoothing each region.
- Thus, similarly to the Euclidean mean of independently and identically distributed measurements, the authors expect the standard deviation of the regularized tensors to be roughly $7 \simeq \sqrt{48}$ times smaller than that of the noisy input tensors.
- One can see that the tensors are regularized in “homogeneous” regions (ventricles, temporal areas), while the main tracts are left unchanged.
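With the `diffusion_step` sketch of Section 5.3, the anisotropic scheme is one extra argument; the particular weight function and threshold below are our choice, not the paper's:

```python
import numpy as np
# field: an (ni, nj, n, n) array of tensors; diffusion_step as in Section 5.3

kappa = 0.5  # edge threshold (our choice): c decreases from c(0) = 1 toward 0
smoothed = diffusion_step(field, dt=0.1, c=lambda t: np.exp(-(t / kappa) ** 2))
```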
6 Regularization and restoration of tensor fields
- The pure diffusion is efficient to reduce the noise in the data, but it also reduces the amount of information.
- At an infinite diffusion time, the tensor field will be completely homogeneous (or homogeneous by part for some anisotropic diffusion schemes), with a value corresponding to the mean of the measurements over the region (with Neumann boundary conditions).
- Thus, the absolute minimum of their regularization criterion alone is of little interest.
- To keep close to the measured tensor field Σ0(x) while still regularizing, a more theoretically grounded approach is to consider an optimization problem with a competition between a data attachment term and a possibly non-linear anisotropic regularization term: C(Σ) = Sim(Σ, Σ0) + λ Reg(Σ).
6.1 The regularization term
- To preserve the discontinuities, the gradient of this criterion (the Laplacian) may be tailored to prevent the smoothing across them, as the authors have done in Section 5.4.
- There is no longer a convergence guarantee, since this anisotropic regularization "force" may not derive from a well-posed criterion.
- This scheme was used in [Chefd’hotel et al., 2004] with the flat Euclidean metric on tensors, in conjunction with a geometric numerical integration scheme that preserves the rank of the matrix.
- This produces the eigenvalue swelling effect they observed.
- The authors may compute the directional derivatives using finite differences in the flat matrix space and use the intrinsic evolution scheme, but they believe that there are more efficient ways to do it using the exponential map.
6.2 A least-squares attachment term
- Usually, one considers that the data (e.g. a scalar image or a displacement vector field F0(x)) are corrupted by a uniform Gaussian noise independent at each spatial position.
- On the tensor manifold, assuming a uniform Gaussian noise independent at each position also leads to a least-squares criterion through a maximum likelihood approach.
6.3 A least-squares attachment term for sparsely distributed tensors
- Now, let us consider the case where the authors do not have a dense measure of their tensor field, but only N measures Σi at irregularly distributed sample points xi.
- The tensor field Σ(x) is related to the data only at the measurement points xi through the Dirac distributions δ(x−xi).
- While the introduction of distributions can be handled in the theoretical differentiation of the criterion with respect to the continuous tensor field $\Sigma$, it is a real problem for the numerical implementation.
6.4 Interpolation through diffusion
- With the sparse data attachment term (Eq. 16) and the isotropic first order regularization term (Eq. 10), the authors are looking for a tensor field that minimizes its spatial variations while interpolating (or, more precisely, approximating at the desired precision) the measured values: $C(\Sigma) = \sum_{i=1}^{N} \int_\Omega G_\sigma(x - x_i)\, \operatorname{dist}^2\!\big(\Sigma(x), \Sigma_i\big)\, dx + \lambda \int_\Omega \|\nabla \Sigma(x)\|^2_{\Sigma(x)}\, dx$.
- Last but not least, the authors need an initialization of the tensor field Σ0(x) to obtain a fully operational algorithm.
- Thus, the authors have a potentially destructive competition between the two terms of the criterion in very localized areas.
- In that case, the Laplacian regularization will spread from these boundaries with no constraints until it reaches the counterbalancing forces of the data attachment term in the immediate vicinity of the sparse measurements.
7 Conclusion
- The authors propose in this paper an affine-invariant metric that endows the space of positive definite symmetric matrices with a very regular manifold structure.
- The authors exemplify some of the good properties of this metric on simple statistical operations.
- A second contribution of the paper is the application of this framework to important geometric data processing problems such as interpolation, filtering, diffusion and restoration of tensor fields.
- This framework is based on the choice of a Riemannian metric on one side, which leads to powerful differential geometry tools such as the exponential maps and geodesic marching techniques, and on the transformation of linear combinations or integrals into minimization problems on the other side.
- The authors are currently investigating another very interesting application field in collaboration with P. Thompson and A. Toga at UCLA: the modeling and analysis of the variability of the brain anatomy.
Frequently Asked Questions
Q2. What are the common types of symmetric matrices?
Positive definite symmetric matrices (so-called tensors in this article) are often encountered in image processing, for instance as covariance matrices for characterizing statistics on deformations, or as an encoding of the principal diffusion directions in Diffusion Tensor Imaging (DTI).
Q3. Why are many usual operations stable on symmetric positive definite matrices?
As symmetric positive definite matrices constitute a convex half-cone in the vector space of matrices, many usual operations (like the mean) are stable in this space.
Q4. How can the authors generate a random Gaussian tensor?
For instance, the authors can generate a random (generalized) Gaussian tensor using the following procedure: the authors sample n(n+1)/2 independent and normalized real Gaussian samples, multiply the corresponding vector by the square root of the desired covariance matrix (expressed in their Vec coordinate system), and come back to the tensor manifold using the inverse Vec mapping.
Q5. What is the way to compute the directional derivatives?
The authors may compute the directional derivatives using finite differences in the flat matrix space and use the intrinsic evolution scheme, but the authors believe that there are more efficient ways to do it using the exponential map.
Q6. How do the eigenvalues evolve under the standard matrix coefficient interpolation?
With the standard matrix coefficient interpolation, the evolution of the trace is perfectly linear (which was expected since this is a linear function of the coefficients), and the principal eigenvalue grows almost linearly, while the smallest eigenvalue slightly grows toward a local maximum before decreasing.
Q7. How do the authors get the geodesic starting from any other point of the manifold?
Using the invariance of their metric, the authors easily obtain the geodesic starting from any other point of the manifold using their group action: $\Gamma_{(\Sigma, W)}(t) = \Sigma^{1/2} \exp\!\big(t\, \Sigma^{-1/2} W \Sigma^{-1/2}\big)\, \Sigma^{1/2}$.
Q8. What are the main problems that arise when estimating tensors from data?
Problems arise when estimating tensors from data (in standard DTI, the estimated symmetric matrix could have negative eigenvalues), or when smoothing fields of tensors: the numerical schemes used to solve the Partial Differential Equation (PDE) may sometimes lead to negative eigenvalues if the time step is not small enough.
Q9. What is the intrinsic integration scheme for PDEs?
An intrinsic integration scheme for PDEs that uses the exponential map has been added in [Chefd’hotel et al., 2002], and makes it possible to perform PDE evolutions on the considered manifold without re-projections.
Q10. What is the way to keep close to the measured tensor field?
To keep close to the measured tensor field $\Sigma_0(x)$ while still regularizing, a more theoretically grounded approach is to consider an optimization problem with a competition between a data attachment term and a possibly non-linear anisotropic regularization term: $C(\Sigma) = \operatorname{Sim}(\Sigma, \Sigma_0) + \lambda \operatorname{Reg}(\Sigma)$.
Q11. What is the tangential cut locus of a point x?
On the sphere S2(1) for instance, the cut locus of a point x is its antipodal point and the tangential cut locus is the circle of radius π.
Q12. What is the weighting function of the Laplace-Beltrami operator?
If $c(\cdot)$ is a weighting function decreasing from $c(0) = 1$ to $c(+\infty) = 0$, this can be realized directly in the discrete implementation of the Laplacian (Eq. 13): the contribution $\Delta_u \Sigma$ of the spatial direction $u$ to the Laplace-Beltrami operator is weighted by this decreasing function according to the norm $\|\partial_u \Sigma\|_\Sigma$ of the gradient in that direction.
Q13. What is the procedure for a tensor field (x) ?
For a tensor field Σ(x) ∈ Sym+n over Rd, the procedure is more complex as the authors should use the covariant derivative (the connection) to differentiate vectors fields on their manifold.
Q14. What is the problem around equal eigenvalues?
Even when an SVD is performed to smooth the rotation (the eigenvector trihedron) and the eigenvalues independently, there is a continuity problem around equal eigenvalues.
Q15. What is the directional derivative $\partial_H \operatorname{Sim}(F)$?
This time, the directional derivative $\partial_H \operatorname{Sim}(F)$ is directly expressed using a scalar product with $H$ in the proper functional space, so that the steepest ascent direction is $\nabla \operatorname{Sim}(F) = 2\,(F(x) - F_0(x))$.
Q16. What is the reinterpretation of addition and subtraction using logarithmic and exponential maps?
This reinterpretation of addition and subtraction using logarithmic and exponential maps is very powerful to generalize algorithms working on vector spaces to algorithms on Riemannian manifolds, as illustrated by Table 1.