Open Access Journal Article

On the multivariate runs test

Norbert Henze, +1 more
01 Mar 1999, Vol. 27, Iss. 1, pp. 290-298
TLDR
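This paper shows that the multivariate two-sample statistic counting the edges of the minimal spanning tree that join points from different samples is asymptotically distribution-free under the null hypothesis, and that the test based on it is universally consistent.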
Abstract
For independent $d$-variate random variables $X_1,\dots,X_m$ with common density $f$ and $Y_1,\dots,Y_n$ with common density $g$, let $R_{m,n}$ be the number of edges in the minimal spanning tree with vertices $X_1,\dots,X_m$, $Y_1,\dots,Y_n$ that connect points from different samples. Friedman and Rafsky conjectured that a test of $H_0: f = g$ that rejects $H_0$ for small values of $R_{m,n}$ should have power against general alternatives. We prove that $R_{m,n}$ is asymptotically distribution-free under $H_0$, and that the multivariate two-sample test based on $R_{m,n}$ is universally consistent.
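The statistic $R_{m,n}$ described in the abstract can be illustrated with a minimal sketch, assuming Euclidean distances and points stored as tuples of floats. The function names `mst_edges`, `cross_edge_count`, and `permutation_p_value` are hypothetical and do not come from the paper. The calibration shown is a Monte Carlo relabelling of the pooled sample, which is an assumption made for illustration; the paper itself concerns the asymptotic behaviour of $R_{m,n}$, not this finite-sample procedure.

```python
import math
import random


def mst_edges(points):
    """Edges (i, j) of a Euclidean minimal spanning tree, via Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # distance from each vertex to the current tree
    best_from = [-1] * n        # tree vertex that realises that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Add the vertex outside the tree that is closest to it.
        k = min((j for j in range(n) if not in_tree[j]),
                key=best_dist.__getitem__)
        in_tree[k] = True
        edges.append((best_from[k], k))
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[k], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = k
    return edges


def cross_edge_count(xs, ys):
    """R_{m,n}: number of MST edges joining an X point to a Y point."""
    pooled = list(xs) + list(ys)
    m = len(xs)
    return sum((i < m) != (j < m) for i, j in mst_edges(pooled))


def permutation_p_value(xs, ys, n_perm=999, seed=0):
    """Illustrative assumption: estimate P(R <= observed) by relabelling.

    The tree depends only on the pooled points, so it is built once and
    only the sample labels are shuffled.
    """
    pooled = list(xs) + list(ys)
    edges = mst_edges(pooled)
    labels = [0] * len(xs) + [1] * len(ys)

    def count(lab):
        return sum(lab[i] != lab[j] for i, j in edges)

    observed = count(labels)
    rng = random.Random(seed)
    hits = 1  # the observed labelling counts as one arrangement
    for _ in range(n_perm):
        rng.shuffle(labels)
        if count(labels) <= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

For two well-separated clusters, such as `xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]` and `ys = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]`, the tree needs only one edge to bridge the samples, so `cross_edge_count(xs, ys)` is small. This matches the rule of rejecting $H_0$ for small values of $R_{m,n}$.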


Citations
Journal Article

A kernel two-sample test

TL;DR: This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
Proceedings Article

A Kernel Method for the Two-Sample-Problem

TL;DR: This work proposes two statistical tests to determine if two samples are from different distributions, and applies this approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where the test performs strongly.
Posted Content

A Kernel Method for the Two-Sample Problem

TL;DR: In this paper, the test statistic is defined as the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS); it can be computed in quadratic time, although efficient linear-time approximations are available.

Graph Kernels

TL;DR: A unified framework to study graph kernels is presented, and a kernel that is close to the optimal assignment kernel of Frohlich et al. (2006) yet provably positive semi-definite is provided.
Journal Article

On a new multivariate two-sample test

TL;DR: In this paper, the authors proposed a new test for the multivariate two-sample problem, where the test statistic is the difference of the sum of all the Euclidean interpoint distances between the random variables from the two different samples and one-half of the two corresponding sums of distances of the variables within the same sample.
References
Book

Extreme Values, Regular Variation, and Point Processes

TL;DR: In this book, the author surveys domains of attraction and norming constants, point processes, and their relationship with multivariate extremes.
Journal Article

Multivariate Generalizations of the Wald-Wolfowitz and Smirnov Two-Sample Tests

TL;DR: In this paper, generalizations of the Wald-Wolfowitz runs statistic and the Smirnov maximum deviation statistic for the two-sample problem are presented based on the minimal spanning tree of the pooled sample points.
Journal Article

Multivariate Two-Sample Tests Based on Nearest Neighbors

TL;DR: In this paper, a new class of simple tests is proposed for the general multivariate two-sample problem based on the (possibly weighted) proportion of all k nearest neighbor comparisons in which observations and their neighbors belong to the same sample.
Journal Article

A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences

Norbert Henze
01 Jun 1988
TL;DR: In this article, it was shown that the limiting (normal) distribution of the number of comparisons in which observations and their neighbors belong to the same sample does not depend on the density of the observations.