On the multivariate runs test
Norbert Henze, Mathew D. Penrose
TL;DR: This paper shows that the multivariate two-sample test based on the number of minimal-spanning-tree edges joining points from different samples is asymptotically distribution-free.
Abstract:
For independent $d$-variate random variables $X_1,\dots,X_m$ with common density $f$ and $Y_1,\dots,Y_n$ with common density $g$, let $R_{m,n}$ be the number of edges in the minimal spanning tree with vertices $X_1,\dots,X_m$, $Y_1,\dots,Y_n$ that connect points from different samples. Friedman and Rafsky conjectured that a test of $H_0: f = g$ that rejects $H_0$ for small values of $R_{m,n}$ should have power against general alternatives. We prove that $R_{m,n}$ is asymptotically distribution-free under $H_0$, and that the multivariate two-sample test based on $R_{m,n}$ is universally consistent.
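As an illustration of the statistic defined in the abstract, the following is a minimal sketch that builds the Euclidean minimal spanning tree of the pooled sample with Prim's algorithm and counts the edges joining points from different samples. The function names `cross_edge_count` and `permutation_p_value` are hypothetical, not taken from the paper. The p-value here comes from a label-permutation scheme, which is a substitute for the asymptotic null distribution the paper establishes, and the sketch assumes distinct interpoint distances so that the tree is unique.

```python
import math
import random


def cross_edge_count(xs, ys):
    """Count MST edges that join a point of xs to a point of ys.

    xs, ys: sequences of equal-length coordinate tuples.
    Uses an O(N^2) Prim's algorithm on the pooled sample.
    """
    points = list(xs) + list(ys)
    labels = [0] * len(xs) + [1] * len(ys)
    n = len(points)
    if n < 2:
        return 0
    in_tree = [False] * n
    best_dist = [math.inf] * n   # distance from each vertex to the current tree
    best_from = [-1] * n         # tree vertex realising that distance
    best_dist[0] = 0.0           # start the tree at the first pooled point
    cross = 0
    for _ in range(n):
        # Attach the vertex closest to the current tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0 and labels[u] != labels[best_from[u]]:
            cross += 1
        # Update distances of the remaining vertices to the enlarged tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return cross


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Assumed permutation scheme (not the paper's asymptotic test).

    Small values of the statistic count as evidence against H0: f = g,
    so the p-value is the share of relabellings with a count at most
    as large as the observed one.
    """
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    m = len(xs)
    observed = cross_edge_count(xs, ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if cross_edge_count(pooled[:m], pooled[m:]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

For two well-separated clusters such as `[(0, 0), (0, 1), (1, 0)]` and `[(10, 10), (10, 11), (11, 10)]`, the count is 1, since a single edge bridges the clusters. For perfectly interleaved points on a line it equals the total number of tree edges. The sketch rebuilds the tree for every relabelling for simplicity; because the tree depends only on the pooled points, it could be computed once and reused.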
Citations
Journal Article
A kernel two-sample test
TL;DR: This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
Proceedings Article
A Kernel Method for the Two-Sample-Problem
TL;DR: This work proposes two statistical tests to determine if two samples are from different distributions, and applies this approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where the test performs strongly.
Posted Content
A Kernel Method for the Two-Sample Problem
TL;DR: In this paper, the test statistic is defined as the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS); it can be computed in quadratic time, although efficient linear-time approximations are available.
Graph Kernels
TL;DR: A unified framework to study graph kernels is presented, and a kernel that is close to the optimal assignment kernel of Frohlich et al. (2006) yet provably positive semi-definite is provided.
Journal Article
On a new multivariate two-sample test
L. Baringhaus, C. Franz
TL;DR: In this paper, the authors proposed a new test for the multivariate two-sample problem, where the test statistic is the difference of the sum of all the Euclidean interpoint distances between the random variables from the two different samples and one-half of the two corresponding sums of distances of the variables within the same sample.
References
Book
Extreme Values, Regular Variation, and Point Processes
TL;DR: In this paper, the authors present a survey of the main domains of attraction and norming constants in point processes, and their relationship with multivariate extremity processes.
Journal Article
Multivariate Generalizations of the Wald-Wolfowitz and Smirnov Two-Sample Tests
TL;DR: In this paper, generalizations of the Wald-Wolfowitz runs statistic and the Smirnov maximum deviation statistic for the two-sample problem are presented based on the minimal spanning tree of the pooled sample points.
Journal Article
Multivariate Two-Sample Tests Based on Nearest Neighbors
TL;DR: In this paper, a new class of simple tests is proposed for the general multivariate two-sample problem based on the (possibly weighted) proportion of all k nearest neighbor comparisons in which observations and their neighbors belong to the same sample.
Journal Article
A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences
TL;DR: In this article, it was shown that the limiting (normal) distribution of the number of comparisons in which observations and their neighbors belong to the same sample does not depend on the density of the observations.