Equivalence of distance-based and RKHS-based statistics in hypothesis testing

doi:10.1214/13-AOS1140

Dino Sejdinovic, +3 more

01 Oct 2013, Annals of Statistics, Vol. 41, Iss. 5, pp. 2263-2291

TLDR

This paper presents a unifying framework linking two classes of statistics used in two-sample and independence testing: energy distances and distance covariances from the statistics literature, and maximum mean discrepancies (MMD), that is, distances between embeddings of distributions in reproducing kernel Hilbert spaces (RKHS).

Abstract:

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
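The two-sample equivalence described in the abstract can be checked numerically. The sketch below is an illustration, not the authors' code. It assumes that the distance kernel takes the form k(x, y) = 0.5 * [rho(x, z0) + rho(y, z0) - rho(x, y)] for an arbitrary fixed centre point z0, that the energy distance is 2 E rho(X, Y) - E rho(X, X') - E rho(Y, Y'), and that the parametric family is rho_q(x, y) = ||x - y||^q with 0 < q <= 2. Under those assumptions the energy distance should equal twice the squared MMD, whatever z0 is. All function names are hypothetical, and both quantities are estimated with simple biased (V-statistic) averages over all pairs.

```python
import math
import random


def rho(x, y, q=1.0):
    """Assumed semimetric rho_q(x, y) = ||x - y||^q; q = 1 is the usual energy distance."""
    return math.dist(x, y) ** q


def mean_over_pairs(f, xs, ys):
    """Average f(x, y) over all pairs (a biased V-statistic estimate)."""
    return sum(f(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys, q=1.0):
    """Estimate 2 E rho(X, Y) - E rho(X, X') - E rho(Y, Y')."""
    d = lambda a, b: rho(a, b, q)
    return (2.0 * mean_over_pairs(d, xs, ys)
            - mean_over_pairs(d, xs, xs)
            - mean_over_pairs(d, ys, ys))


def distance_kernel(x, y, z0, q=1.0):
    """Assumed distance kernel centred at z0: 0.5 * [rho(x, z0) + rho(y, z0) - rho(x, y)]."""
    return 0.5 * (rho(x, z0, q) + rho(y, z0, q) - rho(x, y, q))


def mmd_squared(xs, ys, z0, q=1.0):
    """Estimate squared MMD: E k(X, X') + E k(Y, Y') - 2 E k(X, Y)."""
    k = lambda a, b: distance_kernel(a, b, z0, q)
    return (mean_over_pairs(k, xs, xs)
            + mean_over_pairs(k, ys, ys)
            - 2.0 * mean_over_pairs(k, xs, ys))


# Two small synthetic samples in R^3 with shifted means (illustrative data only).
rng = random.Random(0)
xs = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(60)]
ys = [tuple(rng.gauss(0.5, 1.0) for _ in range(3)) for _ in range(50)]
z0 = (0.0, 0.0, 0.0)

for q in (0.5, 1.0, 1.5):
    e = energy_distance(xs, ys, q)
    m = mmd_squared(xs, ys, z0, q)
    print(f"q={q}: energy={e:.6f}, 2*MMD^2={2.0 * m:.6f}")
```

The terms involving z0 cancel when the three kernel averages are combined, which is why the choice of centre point does not affect the MMD. Varying q moves through the assumed family of distance kernels; q = 1 corresponds to the standard energy distance.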

Citations

Learning Transferable Features with Deep Adaptation Networks

Deep Transfer Learning with Joint Adaptation Networks

Contrastive Adaptation Network for Unsupervised Domain Adaptation

VisDA: The Visual Domain Adaptation Challenge

Related Papers (5)

Measuring and testing dependence by correlation of distances

A kernel two-sample test

Brownian distance covariance

Measuring statistical dependence with Hilbert-Schmidt norms

Energy statistics: A class of statistics based on distances