Journal Article

The Calculation of Distributions of Two-Sided Kolmogorov-Smirnov Type Statistics

01 Feb 1972-Annals of Mathematical Statistics (Institute of Mathematical Statistics)-Vol. 43, Iss: 1, pp 58-64
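TL;DR: A recursion is given for computing the probability that the order statistics of a sample fall within arbitrary bounds. It is applied to the distribution of two-sided Kolmogorov-Smirnov type statistics for n = 1, 2, 10, 100, and further computations determine the precision of a truncated development of $1 - P_{\mathrm{KS}}$ in powers of $\lambda^{-2}$.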
Abstract: Let $X_1^n \leqq X_2^n \leqq \cdots \leqq X_n^n$ be the order statistics of a size $n$ sample from any distribution function $F$ not necessarily continuous. Let $\alpha_j, \beta_j, (j = 1,2, \cdots, n)$ be any numbers. Let $P_n = P(\alpha_j < X_j^n \leqq \beta_j, j = 1,2, \cdots, n)$. A recursion is given which calculates $P_n$ for any $F$ and any $\alpha_j, \beta_j$. Suppose now that $F$ is continuous. A two-sided statistic of Kolmogorov-Smirnov type has the distribution function $P_{\mathrm{KS}} = P\lbrack\sup n^{\frac{1}{2}}\psi(F) \cdot |F^n - F| \leqq \lambda\rbrack$, where $F^n$ is the empirical distribution function of the sample and $\psi(x)$ is any nonnegative weight function. As $P_{\mathrm{KS}}$ has the form $P_n$, its calculation as a function of $\lambda$ can be carried out by means of the recursion. This has been done for the case $\psi(x) = \lbrack x(1 - x)\rbrack^{-\frac{1}{2}}$. Curves are given which represent $\lambda$ versus $1 - P_{\mathrm{KS}}$ for $n = 1,2, 10, 100$. From additional computations, the precision of a truncated development of $1 - P_{\mathrm{KS}}$ in powers of $\lambda^{-2}$ has been determined.
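The abstract's reduction of $P_{\mathrm{KS}}$ to a probability of the form $P_n$ can be illustrated with a minimal Python sketch for continuous $F$ (so the sample may be taken as Uniform(0,1)). It is a grid-based recursion in the same spirit as the one described, not necessarily the form given in the paper. The function names `order_stat_box_prob` and `ks_type_cdf`, the `weighted` flag, and the use of plain floating point are assumptions made for illustration, and the bounds are assumed to be nondecreasing and to lie in [0, 1].

```python
from math import factorial, sqrt

def order_stat_box_prob(alpha, beta):
    """P(alpha[j] < U_(j+1) <= beta[j] for every j), U_(1) <= ... <= U_(n)
    being the order statistics of n i.i.d. Uniform(0,1) draws.
    Assumes alpha and beta are nondecreasing and lie in [0, 1]."""
    n = len(alpha)
    # Merge all boundary points into one sorted grid on [0, 1].
    grid = sorted(set(alpha) | set(beta) | {0.0, 1.0})
    # q[i] = (probability that i points lie at or below the current grid
    # point and all constraints so far hold) divided by n!.
    q = [1.0] + [0.0] * n
    prev = 0.0
    for t in grid:
        d = t - prev
        lo = sum(1 for b in beta if b <= t)   # at least this many points <= t
        hi = sum(1 for a in alpha if a < t)   # at most this many points <= t
        new = [0.0] * (n + 1)
        for i in range(lo, hi + 1):
            new[i] = sum(q[l] * d ** (i - l) / factorial(i - l)
                         for l in range(i + 1))
        q, prev = new, t
    return factorial(n) * q[n]

def ks_type_cdf(n, lam, weighted=False):
    """P(sup sqrt(n) * psi(F) * |F^n - F| <= lam) for continuous F.
    weighted=False: psi = 1 (ordinary two-sided Kolmogorov-Smirnov).
    weighted=True:  psi(x) = (x * (1 - x)) ** -0.5."""
    c = lam / sqrt(n)

    def band(p):
        # Interval of t on which |p - t| <= c / psi(t).
        if not weighted:
            return p - c, p + c
        disc = c * sqrt(c * c + 4.0 * p * (1.0 - p))
        denom = 2.0 * (1.0 + c * c)
        return (2.0 * p + c * c - disc) / denom, (2.0 * p + c * c + disc) / denom

    def clip(x):
        return min(1.0, max(0.0, x))

    # The j-th order statistic must stay inside the band for the levels
    # j/n (from below) and (j-1)/n (from above) of the empirical d.f.
    alpha = [clip(band(j / n)[0]) for j in range(1, n + 1)]
    beta = [clip(band((j - 1) / n)[1]) for j in range(1, n + 1)]
    return order_stat_box_prob(alpha, beta)

print(ks_type_cdf(10, 1.0))                  # ordinary KS, n = 10
print(ks_type_cdf(10, 3.0, weighted=True))   # weighted statistic, n = 10
```

The sketch is meant for small to moderate $n$: the factorial scaling in floating point loses accuracy as $n$ grows, so a careful implementation would rescale the terms or use exact rational arithmetic.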


Citations
Journal Article
TL;DR: In this article, maximum likelihood methods are used to test for a change in a sequence of independent exponential family random variables, with particular emphasis on the exponential distribution, and the confidence regions for the change point cover historical events that may have caused the changes.
Abstract: SUMMARY Maximum likelihood methods are used to test for a change in a sequence of independent exponential family random variables, with particular emphasis on the exponential distribution. The exact null and alternative distributions of the test statistics are found, and the power is compared with a test based on a linear trend statistic. Exact and approximate confidence regions for the change-point are based on the values accepted by a level $\alpha$ likelihood ratio test and a modification of the method proposed by Cox & Spjøtvoll (1982). The methods are applied to a classical data set on the time intervals between coal mine explosions, and the change in variation of stock market returns. In both cases the confidence regions for the change-point cover historical events that may have caused the changes.

241 citations

Journal Article
Vijayan N. Nair
TL;DR: In this paper, the problem of obtaining simultaneous confidence bands for the survival function S(x) when the data are arbitrarily right censored is considered, and the usual pointwise confidence intervals based on Greenwood's variance formula can be adapted to yield a large-sample confidence band.
Abstract: Consider the problem of obtaining simultaneous confidence bands for the survival function S(x) when the data are arbitrarily right censored. The usual pointwise confidence intervals based on Greenwood's variance formula can be adapted to yield a large-sample confidence band. This band has, in a certain sense, equal precision at each point of S(x). It is compared with the censored versions of the Kolmogorov band and the Renyi band. The comparisons are made in terms of the widths and the adequacy of large-sample approximations and are carried out under various censoring models and degrees of censoring. The bands are illustrated by applying them to data from a mechanical-switch life test.

188 citations

Journal Article
TL;DR: In this paper, the authors introduced the stabilized probability plot, which is a new and powerful goodness-of-fit statistic, analogous to the standard Kolmogorov-Smirnov statistic D, defined to be the maximum deviation of the plotted points from their theoretical values.
Abstract: SUMMARY The stabilized probability plot is introduced. An attractive feature of the plot that enhances its interpretability is that the variances of the plotted points are approximately equal. This prompts the definition of a new and powerful goodness-of-fit statistic Dsp which, analogous to the standard Kolmogorov-Smirnov statistic D, is defined to be the maximum deviation of the plotted points from their theoretical values. Using either D or Dsp it is shown how to construct acceptance regions for QQ, PP and the new plots. Acceptance regions can help remove much of the subjectivity from the interpretation of these probability plots.

156 citations


Cites methods from "The Calculation of Distributions of..."

  • ...Exact critical points for Dsp for testing a simple hypothesis were computed using a recursive algorithm described by Noe (1972)....

    [...]

  • ...power for D and Dsp was computed using the recursive algorithm of Noe (1972)....

    [...]

Journal Article
TL;DR: A unified family of goodness-of-fit tests based on $\phi$-divergences is introduced and studied in this article; it includes both the supremum version of the Anderson-Darling statistic and the test statistic of Berk and Jones as special cases.
Abstract: A unified family of goodness-of-fit tests based on $\phi$-divergences is introduced and studied. The new family of test statistics $S_n(s)$ includes both the supremum version of the Anderson--Darling statistic and the test statistic of Berk and Jones [Z. Wahrsch. Verw. Gebiete 47 (1979) 47--59] as special cases ($s=2$ and $s=1$, resp.). We also introduce integral versions of the new statistics. We show that the asymptotic null distribution theory of Berk and Jones [Z. Wahrsch. Verw. Gebiete 47 (1979) 47--59] and Wellner and Koltchinskii [High Dimensional Probability III (2003) 321--332. Birkhauser, Basel] for the Berk--Jones statistic applies to the whole family of statistics $S_n(s)$ with $s\in[-1,2]$. On the side of power behavior, we study the test statistics under fixed alternatives and give extensions of the ``Poisson boundary'' phenomena noted by Berk and Jones for their statistic. We also extend the results of Donoho and Jin [Ann. Statist. 32 (2004) 962--994] by showing that all our new tests for $s\in[-1,2]$ have the same ``optimal detection boundary'' for normal shift mixture alternatives as Tukey's ``higher-criticism'' statistic and the Berk--Jones statistic.

120 citations


Cites methods from "The Calculation of Distributions of..."

  • ...Hence, with qn(s,α) denoting the upper 1 − α quantile of the distribution of Sn(s) under F0 (which is computable via Noé’s recursion as discussed in Section 3.1 or can be approximated for large n via Theorem 3.1), it follows that PF (Sn(s,F )≤ qn(s,α)) = PF0(Sn(s)≤ qn(s,α)) = 1−α for each fixed α ∈ (0,1) and n....

    [...]

  • ...(See Shorack and Wellner [38], pages 362–366 for an exposition of Noé’s methods.)...

    [...]

  • ...Jager [23] gives exact finite sample computations for the whole family of statistics via Noé’s recursions for values of n up to 3000....

    [...]

  • ...Owen [36] showed how to use the recursions of Noé [35] to obtain finite sample critical points of the Berk–Jones statistic Rn = Sn(1) for values of n up to 1000....

    [...]

  • ...Finite sample critical points via Noé’s recursion....

    [...]

Journal Article
TL;DR: In this article, the authors present an exact formula for sample sizes up to 31, six recursion formulae, and one matrix formula that can be used to calculate a two-sided p value.
Abstract: One of the most widely used goodness-of-fit tests is the two-sided one sample Kolmogorov-Smirnov (K-S) test which has been implemented by many computer statistical software packages. To calculate a two-sided p value (evaluate the cumulative sampling distribution), these packages use various methods including recursion formulae, limiting distributions, and approximations of unknown accuracy developed over thirty years ago. Based on an extensive literature search for the two-sided one sample K-S test, this paper identifies an exact formula for sample sizes up to 31, six recursion formulae, and one matrix formula that can be used to calculate a p value. To ensure accurate calculation by avoiding catastrophic cancelation and eliminating rounding error, each of these formulae is implemented in rational arithmetic. For the six recursion formulae and the matrix formula, computational experience for sample sizes up to 500 shows that computational times are increasing functions of both the sample size and the number of digits in the numerator and denominator integers of the rational number test statistic. The computational times of the seven formulae vary immensely but the Durbin recursion formula is almost always the fastest. Linear search is used to calculate the inverse of the cumulative sampling distribution (find the confidence interval half-width) and tables of calculated half-widths are presented for sample sizes up to 500. Using calculated half-widths as input, computational times for the fastest formula, the Durbin recursion formula, are given for sample sizes up to two thousand.

118 citations


Cites background or methods from "The Calculation of Distributions of..."

  • ...Note that Noe (1972) actually sorts the combined list of the αj's and the βj's but since the distribution function F(z) is a non-decreasing function, this is equivalent to sorting the distribution function where ties are broken by choosing the α-boundary....

    [...]

  • ...The general recursion formula by Noe (1972) contains both the two-sided and the one-sided K-S statistics as special cases....

    [...]