
Showing papers in "Annals of Statistics in 1973"


Journal ArticleDOI
TL;DR: In this article, a class of prior distributions, called Dirichlet process priors, is proposed for nonparametric problems; these priors have broad support yet yield analytically manageable posteriors, so that many nonparametric statistical problems may be treated with results comparable to the classical theory.
Abstract: The Bayesian approach to statistical problems, though fruitful in many ways, has been rather unsuccessful in treating nonparametric problems. This is due primarily to the difficulty in finding workable prior distributions on the parameter space, which in nonparametric problems is taken to be a set of probability distributions on a given sample space. There are two desirable properties of a prior distribution for nonparametric problems. (I) The support of the prior distribution should be large--with respect to some suitable topology on the space of probability distributions on the sample space. (II) Posterior distributions given a sample of observations from the true probability distribution should be manageable analytically. These properties are antagonistic in the sense that one may be obtained at the expense of the other. This paper presents a class of prior distributions, called Dirichlet process priors, broad in the sense of (I), for which (II) is realized, and for which treatment of many nonparametric statistical problems may be carried out, yielding results that are comparable to the classical theory. In Section 2, we review the properties of the Dirichlet distribution needed for the description of the Dirichlet process given in Section 3. Briefly, this process may be described as follows. Let $\mathscr{X}$ be a space and $\mathscr{A}$ a $\sigma$-field of subsets, and let $\alpha$ be a finite non-null measure on $(\mathscr{X}, \mathscr{A})$. Then a stochastic process $P$ indexed by elements $A$ of $\mathscr{A}$ is said to be a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$ if for any measurable partition $(A_1, \cdots, A_k)$ of $\mathscr{X}$, the random vector $(P(A_1), \cdots, P(A_k))$ has a Dirichlet distribution with parameter $(\alpha(A_1), \cdots, \alpha(A_k))$.
$P$ may be considered a random probability measure on $(\mathscr{X}, \mathscr{A})$. The main theorem states that if $P$ is a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$, and if $X_1, \cdots, X_n$ is a sample from $P$, then the posterior distribution of $P$ given $X_1, \cdots, X_n$ is also a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha + \sum^n_1 \delta_{x_i}$, where $\delta_x$ denotes the measure giving mass one to the point $x$. In Section 4, an alternative definition of the Dirichlet process is given. This definition exhibits a version of the Dirichlet process that gives probability one to the set of discrete probability measures on $(\mathscr{X}, \mathscr{A})$. This is in contrast to Dubins and Freedman [2], whose methods for choosing a distribution function on the interval [0, 1] lead with probability one to singular continuous distributions. Methods of choosing a distribution function on [0, 1] that with probability one is absolutely continuous have been described by Kraft [7]. The general method of choosing a distribution function on [0, 1], described in Section 2 of Kraft and van Eeden [10], can of course be used to define the Dirichlet process on [0, 1]. Special mention must be made of the papers of Freedman and Fabius. Freedman [5] defines a notion of tailfree for a distribution on the set of all probability measures on a countable space $\mathscr{X}$. For a tailfree prior, the posterior distribution given a sample from the true probability measure may be computed fairly easily. Fabius [3] extends the notion of tailfree to the case where $\mathscr{X}$ is the unit interval [0, 1], but it is clear that his extension may be made to cover quite general $\mathscr{X}$. With such an extension, the Dirichlet process would be a special case of a tailfree distribution for which the posterior distribution has a particularly simple form.
There are disadvantages to the fact that $P$ chosen by a Dirichlet process is discrete with probability one. These appear mainly because in sampling from a $P$ chosen by a Dirichlet process, we expect eventually to see one observation exactly equal to another. For example, consider the goodness-of-fit problem of testing the hypothesis $H_0$ that a distribution on the interval [0, 1] is uniform. If on the alternative hypothesis we place a Dirichlet process prior with parameter $\alpha$ itself a uniform measure on [0, 1], and if we are given a sample of size $n \geqq 2$, the only nontrivial nonrandomized Bayes rule is to reject $H_0$ if and only if two or more of the observations are exactly equal. This is really a test of the hypothesis that a distribution is continuous against the hypothesis that it is discrete. Thus, there is still a need for a prior that chooses a continuous distribution with probability one and yet satisfies properties (I) and (II). Some applications in which the possible doubling up of the values of the observations plays no essential role are presented in Section 5. These include the estimation of a distribution function, of a mean, of quantiles, of a variance and of a covariance. A two-sample problem is considered in which the Mann-Whitney statistic, equivalent to the rank-sum statistic, appears naturally. A decision theoretic upper tolerance limit for a quantile is also treated. Finally, a hypothesis testing problem concerning a quantile is shown to yield the sign test. In each of these problems, useful ways of combining prior information with the statistical observations appear. Other applications exist. In his Ph. D. dissertation [1], Charles Antoniak finds a need to consider mixtures of Dirichlet processes. He treats several problems, including the estimation of a mixing distribution, bio-assay, empirical Bayes problems, and discrimination problems.
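The conjugate posterior update in the main theorem can be sketched numerically. In the fragment below, the function name, the choice $\alpha = c \cdot \mathrm{Uniform}[0,1]$, and the synthetic data are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative base measure: alpha = c * Uniform[0, 1], with total mass c.
c = 5.0
data = rng.uniform(0.2, 0.6, size=20)  # stand-in for a sample from P

def posterior_mean_prob(lo, hi, data, c):
    """Posterior mean of P([lo, hi]) under a DP(alpha) prior with
    alpha = c * Uniform[0, 1].  By the main theorem the posterior is a
    Dirichlet process with parameter alpha + sum_i delta_{x_i}, so
    E[P(A) | data] = (alpha(A) + #{x_i in A}) / (c + n)."""
    alpha_A = c * (hi - lo)
    n_in = int(np.sum((data >= lo) & (data <= hi)))
    return (alpha_A + n_in) / (c + len(data))
```

The estimate interpolates between the prior guess $\alpha(A)/\alpha(\mathscr{X})$ and the empirical frequency, with the total mass $c$ acting as a prior sample size.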

5,033 citations


Journal ArticleDOI
TL;DR: In this paper, maximum likelihood type robust estimates of regression are studied while the number of parameters $p$ grows with the number of observations $n$; the initial terms of a formal power series expansion (essentially in powers of $p/n$) agree closely with Monte Carlo results, in most cases down to 4 observations per parameter.
Abstract: Maximum likelihood type robust estimates of regression are defined and their asymptotic properties are investigated both theoretically and empirically. Perhaps the most important new feature is that the number $p$ of parameters is allowed to increase with the number $n$ of observations. The initial terms of a formal power series expansion (essentially in powers of $p/n$) show an excellent agreement with Monte Carlo results, in most cases down to 4 observations per parameter.

2,221 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that a random probability measure P* on X has a Ferguson distribution with parameter p if for every finite partition (B1, *. *, B) of X, the vector p*(B,), * * *, p *(B) has a Dirichlet distribution with parameters (Bj), *--, cp(B,) (when p(B), = 0, this means p*) = 0 with probability 1).
Abstract: Let $\mu$ be any finite positive measure on (the Borel sets of) a complete separable metric space $X$. We shall say that a random probability measure $P^*$ on $X$ has a Ferguson distribution with parameter $\mu$ if for every finite partition $(B_1, \cdots, B_k)$ of $X$ the vector $(P^*(B_1), \cdots, P^*(B_k))$ has a Dirichlet distribution with parameter $(\mu(B_1), \cdots, \mu(B_k))$ (when $\mu(B_i) = 0$, this means $P^*(B_i) = 0$ with probability 1). Ferguson (3) has shown that, for any $\mu$, Ferguson $P^*$ exist and when used as prior distributions yield Bayesian counterparts to well-known classical nonparametric tests. He also shows that $P^*$ is a.s. discrete. His approach involves a rather deep study of the gamma process. One of us (1) has given a different and perhaps simpler proof that Ferguson priors concentrate on discrete distributions. In this note we give still a third approach to Ferguson distributions, exploiting their connection with generalized Polya urn schemes. We shall say that a sequence $\{X_n, n \geq 1\}$ of random variables with values in $X$ is a Polya sequence with parameter $\mu$ if for every $B \subset X$ (1) $P\{X_1 \in B\} = \mu(B)/\mu(X)$ and (2) $P\{X_{n+1} \in B \mid X_1, \cdots, X_n\} = \mu_n(B)/\mu_n(X)$, where $\mu_n = \mu + \sum^n_1 \delta(X_i)$ and $\delta(x)$ denotes the unit measure concentrating at $x$. Note that, for finite $X$, the sequence $\{X_n\}$ represents the results of successive draws from an urn where initially the urn has $\mu(x)$ balls of color $x$ and, after each draw, the ball drawn is replaced and another ball of its same color is added to the urn. Note also that, without the restriction to finite $X$, for any (Borel measurable) function $\varphi$ on $X$, the sequence $\{\varphi(X_n)\}$ is a Polya sequence with parameter $\varphi\mu$, where $\varphi\mu(A) = \mu\{\varphi \in A\}$. We now describe the connections between Polya sequences and Ferguson distributions.
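The urn scheme translates directly into a sampler. The sketch below (function name and the Uniform(0,1) base measure are illustrative assumptions) draws a Polya sequence with parameter $\mu = c \cdot \mathrm{Uniform}(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(1)

def polya_sequence(n, c=1.0):
    """Draw X_1, ..., X_n from the generalized Polya urn with parameter
    mu = c * Uniform(0, 1): X_{k+1} is a fresh draw from the base measure
    with probability c / (c + k), and otherwise repeats one of the first
    k values chosen uniformly at random (the mass added by point masses)."""
    xs = []
    for k in range(n):
        if rng.random() < c / (c + k):
            xs.append(rng.random())          # new value from the base measure
        else:
            xs.append(xs[rng.integers(k)])   # repeat an earlier value
    return xs
```

Even with an atomless base measure, repeated values appear almost surely, mirroring the a.s. discreteness of Ferguson priors.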

1,469 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider density estimates of the usual type generated by a weight function and obtain limit theorems for the maximum of the normalized deviation of the estimate from its expected value, and for quadratic norms of the same quantity.
Abstract: We consider density estimates of the usual type generated by a weight function. Limit theorems are obtained for the maximum of the normalized deviation of the estimate from its expected value, and for quadratic norms of the same quantity. Using these results we study the behavior of tests of goodness-of-fit and confidence regions based on these statistics. In particular, we obtain a procedure which uniformly improves the chi-square goodness-of-fit test when the number of observations and cells is large and yet remains insensitive to the estimation of nuisance parameters. A new limit theorem for the maximum absolute value of a type of nonstationary Gaussian process is also proved.

703 citations


Journal ArticleDOI
TL;DR: In this paper, the authors formalized robust test problems between two approximately known simple hypotheses as minimax test problems, where the composite hypotheses can be described in terms of alternating capacities of order 2 (in the sense of Choquet), and the minimax tests are ordinary Neyman-Pearson tests between a fixed representative pair of simple hypotheses.
Abstract: Robust test problems between two approximately known simple hypotheses can be formalized as minimax test problems between two composite hypotheses. We show that if the composite hypotheses can be described in terms of alternating capacities of order 2 (in the sense of Choquet), then the minimax tests are ordinary Neyman-Pearson tests between a fixed representative pair of simple hypotheses; moreover, the condition is in a certain sense also necessary. All the neighborhoods customarily used to formalize approximate knowledge happen to have this particular structure.

486 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown that the rates of separation for tests and estimates need not agree in general, but that certain metric dimensionality assumptions on the parameter set imply convergence at the required rate of the Bayes estimates or maximum probability estimates, for independent identically distributed observations whose distribution depends on a parameter.
Abstract: Consider independent identically distributed observations whose distribution depends on a parameter $\theta$. Measure the distance between two parameter points $\theta_1, \theta_2$ by the Hellinger distance $h(\theta_1, \theta_2)$. Suppose that for $n$ observations there is a good but not perfect test of $\theta_0$ against $\theta_n$. Then $n^{\frac{1}{2}}h(\theta_0, \theta_n)$ stays away from zero and infinity. The usual parametric examples, regular or irregular, also have the property that there are estimates $\hat{\theta}_n$ such that $n^{\frac{1}{2}}h(\hat{\theta}_n, \theta_0)$ stays bounded in probability, so that rates of separation for tests and estimates are essentially the same. The present paper shows that this need not be true in general but is correct under certain metric dimensionality assumptions on the parameter set. It is then shown that these assumptions imply convergence at the required rate of the Bayes estimates or maximum probability estimates.

462 citations


Journal ArticleDOI
TL;DR: In this paper, the weak convergence of the sample df is studied under a given sequence of alternative hypotheses when parameters are estimated from the data; for a general class of estimators it is shown that the sample df, when normalised, converges weakly to a specified normal process.
Abstract: The weak convergence of the sample df is studied under a given sequence of alternative hypotheses when parameters are estimated from the data. For a general class of estimators it is shown that the sample df, when normalised, converges weakly to a specified normal process. The results are specialised to the case of efficient estimation.

378 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that the solution of the generalized least squares equations is asymptotically efficient if consistent estimates of the covariance matrix are used to obtain the coefficients of the linear equations.
Abstract: One or more observations are made on a random vector, whose covariance matrix may be a linear combination of known symmetric matrices and whose mean vector may be a linear combination of known vectors; the coefficients of the linear combinations are unknown parameters to be estimated. Under the assumption of normality, equations are developed for the maximum likelihood estimates. These equations may be solved by an iterative method; in each step a set of linear equations is solved. If consistent estimates of $\sigma_0, \sigma_1, \cdots, \sigma_m$ are used to obtain the coefficients of the linear equations, the solution of these equations is asymptotically efficient as the number of observations on the random vector tends to infinity. This result is a consequence of a theorem that the solution of the generalized least squares equations is asymptotically efficient if a consistent estimate of the covariance matrix is used. Applications are made to the components of variance model in the analysis of variance and the finite moving average model in time series analysis.
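The asymptotic-efficiency result rests on plugging a consistent covariance estimate into the generalized least squares equations. A minimal sketch of one such step (names are illustrative, not the paper's notation):

```python
import numpy as np

def feasible_gls(X, y, Sigma_hat):
    """One feasible-GLS step: solve the generalized least squares normal
    equations  X' W X beta = X' W y  with  W = Sigma_hat^{-1},  where
    Sigma_hat is a (consistent) estimate of the error covariance matrix."""
    W = np.linalg.inv(Sigma_hat)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

With `Sigma_hat` equal to the identity this reduces to ordinary least squares; in the iterative scheme of the paper, each step re-solves such a linear system with updated coefficient estimates.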

275 citations


Journal ArticleDOI
TL;DR: In this article, it is shown that in order for the trimmed mean to be asymptotically normal, it is necessary and sufficient that the sample be trimmed at sample percentiles such that the corresponding population percentiles are uniquely defined.
Abstract: In this paper it is shown that in order for the trimmed mean to be asymptotically normal, it is necessary and sufficient that the sample be trimmed at sample percentiles such that the corresponding population percentiles are uniquely defined. (The sufficiency of this condition is well known.) In addition, the (non-normal) limiting distribution of the trimmed mean when this condition is not satisfied is derived, and it is shown that in some situations the use of the trimmed mean may lead to severely biased inferences. Some possible remedies are briefly discussed, including the use of "smoothly" trimmed means.
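For concreteness, the ordinary symmetric trimmed mean discussed above can be sketched as follows (the proportion-based trimming rule is one common convention, chosen here for illustration):

```python
import numpy as np

def trimmed_mean(x, prop):
    """Symmetric trimmed mean: sort, drop the smallest and the largest
    floor(prop * n) observations, and average the rest.  Asymptotic
    normality requires the corresponding population percentiles to be
    uniquely defined."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(prop * len(x))
    return float(x[k:len(x) - k].mean())
```

Trimming at 20% discards one observation from each tail of a sample of five, so a single gross outlier no longer moves the estimate.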

246 citations


Journal ArticleDOI
TL;DR: In this article, the large-sample distributions of the maximum-likelihood estimates for the index, skewness, scale, and location parameters (respectively $\alpha$, $\beta$, $c$, and $\delta$) of a stable distribution are studied.
Abstract: The large-sample distributions of the maximum-likelihood estimates for the index, skewness, scale, and location parameters (respectively $\alpha, \beta, c$, and $\delta$) of a stable distribution are studied. It is shown that if both $\alpha$ and $\delta$ are unknown, then the likelihood function $L$ will have no maximum over the full range $0 < \alpha \leqq 2$ of the index, but that if $\alpha$ is bounded away from zero, say $\alpha \geqq \epsilon > 0$, then the maximum-likelihood estimates are consistent and $n^{\frac{1}{2}}(\hat{\alpha} - \alpha, \hat{\beta} - \beta, \hat{c} - c, \hat{\delta} - \delta)$ has a limiting normal distribution with mean (0,0,0,0) and covariance matrix $\mathbf{I}^{-1}$, where $\mathbf{I}$ is the Fisher information matrix. There are some exceptional values of $\alpha$ and $\beta$ for which the argument presented does not hold. The argument consists in showing that the family of stable distributions satisfies conditions given in the literature and in doing so it is proven that certain asymptotic expansions for stable densities can be differentiated arbitrarily with respect to the parameters.

239 citations


Journal ArticleDOI
TL;DR: In this article, the existence of a Borel measurable Bayes procedure is shown to be a non-trivial problem, and a counterexample is given in which such a procedure fails to exist.
Abstract: Let $f: X \times Y \rightarrow R$. We prove two theorems concerning the existence of a measurable function $\varphi$ such that $f(x, \varphi(x)) = \inf_y f(x,y)$. The first concerns Borel measurability and the second concerns absolute (or universal) measurability. These results are related to the existence of measurable projections of sets $S \subset X \times Y$. Among other applications these theorems can be applied to the problem of finding measurable Bayes procedures according to the usual procedure of minimizing the a posteriori risk. This application is described here and a counterexample is given in which a Borel measurable Bayes procedure fails to exist.

Journal ArticleDOI
TL;DR: In this paper, new test criteria are proposed for testing various hypotheses concerning covariance matrices and asymptotic expansions of their null distributions are derived in terms of the $\chi^2$-distribution.
Abstract: Some new test criteria are proposed for testing various hypotheses concerning covariance matrices. Asymptotic expansions of their null distributions are derived in terms of the $\chi^2$-distribution.

Journal ArticleDOI
TL;DR: In this paper, it is shown that a quadratic form in a multivariate sample has a certain rank, and that its nonzero eigenvalues are distinct with probability one, under the assumptions that the matrix defining the quadratic form satisfies a certain rank condition and that the underlying distribution of the sample is absolutely continuous with respect to Lebesgue measure.
Abstract: This paper shows that a quadratic form in a multivariate sample has a certain rank and its nonzero eigenvalues are distinct with probability one under the assumption that the matrix defining the quadratic form satisfies a certain rank condition and that the underlying distribution of the sample is absolutely continuous with respect to Lebesgue measure.

Journal ArticleDOI
TL;DR: In this paper, a calibration problem is considered in which, from a single calibration experiment, one wishes to estimate the unknown values of a hard-to-measure quantity corresponding to a very large number of measurements of an easy-to-measure one; the problem is solved by a procedure of interval estimation, whose operating characteristic is expressed in terms of a reformulation of the law of large numbers.
Abstract: The kind of calibration problem considered may be roughly described as follows: There are two related quantities $\mathscr{U}$ and $\mathscr{V}$ such that $\mathscr{U}$ is relatively easy to measure and $\mathscr{V}$ relatively difficult, requiring more effort or expense; furthermore the error in a measurement of $\mathscr{V}$ is negligible compared with that for $\mathscr{U}$. A distinguishing feature of the problem is that, from a single calibration experiment, where measurements are made on a number of pairs $(\mathscr{U}, \mathscr{V})$, we wish subsequently to estimate the unknown values of $\mathscr{V}$ corresponding to a very large number of measurements of $\mathscr{U}$. The problem is solved by a procedure of interval estimation, whose operating characteristic is expressed in terms of a reformulation of the law of large numbers. Some idea of the contents of the article may be obtained from the table of contents.


Journal ArticleDOI
TL;DR: In this article, a model for Bernoulli trials with Markov dependence is developed which possesses the usual frequency parameter $p$ and an additional dependence parameter $\lambda$; small and large sample distribution theory for the sufficient statistics of the model is presented.
Abstract: A model for Bernoulli trials with Markov dependence is developed which possesses the usual frequency parameter $p = P\lbrack X_i = 1\rbrack$ and an additional dependence parameter $\lambda = P\lbrack X_i = 1 \mid X_{i-1} = 1\rbrack$. Sufficient statistics for the model with $p$ and $\lambda$ unknown are found and an exact closed form expression for their small sample joint distribution is given. Large sample distribution theory is also given and small sample variances compared with large sample approximations. Easily computed estimators of $p$ and $\lambda$ are recommended and shown to be asymptotically efficient. With $p$ unknown, the u.m.p. unbiased test of independence is noted to be the run test. An application to a rainfall example is given.
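A simulation sketch of the model with naive plug-in estimates (the transition-frequency estimators here are simple illustrations, not necessarily the paper's recommended ones):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_chain(n, p, lam):
    """Stationary two-state chain with P[X_i = 1] = p and dependence
    parameter lam = P[X_i = 1 | X_{i-1} = 1].  Stationarity forces
    P[X_i = 1 | X_{i-1} = 0] = q, where p = p*lam + (1 - p)*q."""
    q = p * (1.0 - lam) / (1.0 - p)
    x = np.empty(n, dtype=int)
    x[0] = rng.random() < p
    for i in range(1, n):
        x[i] = rng.random() < (lam if x[i - 1] else q)
    return x

def plug_in_estimates(x):
    """Sample frequency for p; fraction of 1 -> 1 transitions for lam."""
    prev_one = x[:-1] == 1
    return x.mean(), x[1:][prev_one].mean()
```

With $\lambda > p$ runs of successes are longer than under independence, which is why the run test appears as the u.m.p. unbiased test of independence.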

Journal ArticleDOI
TL;DR: In this article, the authors establish necessary and sufficient conditions for the random matrix $XAX'$ to be positive definite with probability 1, and show that in the i.i.d. case the sample covariance matrix $\sum(X_i - \bar{X})(X_i - \bar{X})'$ is positive definite w.p. 1 iff $P\lbrack X_1 \in F\rbrack = 0$ for every proper flat $F \subset R^p$.
Abstract: Let $X = (X_1, \cdots, X_n)$ where the $X_i: p \times 1$ are independent random vectors, and let $A: n \times n$ be positive semi-definite symmetric. This paper establishes necessary and sufficient conditions that the random matrix $XAX'$ be positive definite w.p.1. The results are applied to cases where $A$ has a particular form or $X_1, \cdots, X_n$ are i.i.d. In particular, it is shown that in the i.i.d. case, the sample covariance matrix $\sum(X_i - \bar{X})(X_i - \bar{X})'$ is positive definite w.p. 1 $\operatorname{iff} P\lbrack X_1 \in F\rbrack = 0$ for every proper flat $F \subset R^p$.
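The iff condition can be illustrated numerically. A sketch under the assumption of an absolutely continuous distribution (standard normal rows) with more observations than dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)

def scatter_matrix(X):
    """Centered scatter matrix sum_i (X_i - Xbar)(X_i - Xbar)' for the
    rows X_i of X.  It is positive definite w.p. 1 when the rows are
    i.i.d. from a distribution putting zero mass on every proper flat
    and the number of rows exceeds the dimension."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc
```

A distribution concentrated on a proper flat (e.g. a hyperplane) would make the centered rows linearly dependent and the scatter matrix singular, which is exactly the excluded case.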

Journal ArticleDOI
TL;DR: In this paper, modifications of Fedorov's sequence of designs converging to a $D$-optimal design are given to improve the speed of convergence, and the analogous sequence for estimating some of the parameters is shown to converge to a $D$-optimal design whether or not all the parameters are estimable under the limiting design.
Abstract: Fedorov (Theory of Optimal Experiments (1972)) gives a sequence of designs converging to a $D$-optimal design. Several modifications of that sequence are given to improve the speed of convergence. The analogous sequence for estimating some of the parameters is shown to converge to a $D$-optimal design, whether or not all the parameters are estimable under the limiting design. We prove the result $d(x, \xi)\xi(x) \leqq 1$, and several related results.

Journal ArticleDOI
TL;DR: In this article, the authors examined the large sample behavior of some estimates of the regression parameters and showed that the asymptotic efficiency of these procedures is independent of the design matrix.
Abstract: We consider the general linear model with independent symmetric errors. In this context we propose and examine the large sample behavior of some estimates of the regression parameters. For the location model these statistics are linear combinations of order statistics. In general they depend on a preliminary estimate and the ordered residuals based on it. The asymptotic efficiency of these procedures is independent of the design matrix. Specifically analogues of the median and trimmed and Winsorized means are proposed.

Journal ArticleDOI
TL;DR: In this paper, a new optimality criterion called moment optimality is studied: a policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return, with a positive (negative) sign if the moment is odd (even).
Abstract: Standard finite state and action discrete time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). This criterion is equivalent to being a little risk averse. It is shown that a stationary policy is moment optimal by examining the negative of the Laplace transform of the total return random variable. An algorithm to construct all stationary moment optimal policies is developed. The algorithm is shown to be finite.

Journal ArticleDOI
TL;DR: In this paper, the authors investigate estimation of the mean vector of a multivariate normal distribution with covariance matrix $\sigma^2\mathbf{I}_p$, $\sigma^2$ unknown, and show that for sufficiently large sample sizes (which never need exceed 4) proper Bayes minimax estimators exist for this problem when $p \geqq 5$.
Abstract: We investigate the problem of estimating the mean vector $\mathbf{\theta}$ of a multivariate normal distribution with covariance matrix equal to $\sigma^2\mathbf{I}_p, \sigma^2$ unknown, and loss $\|\delta - \mathbf{\theta}\|^2/\sigma^2$. We first find a class of minimax estimators for this problem which enlarges a class given by Baranchik. This result is then used to show that for sufficiently large sample sizes (which never need exceed 4) proper Bayes minimax estimators exist for $\mathbf{\theta}$ if $p \geqq 5$.
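For flavor, here is a classical James-Stein-type shrinkage estimator with estimated variance, a standard textbook member of the kind of minimax class discussed (it is not the paper's proper Bayes construction, and the positive-part truncation is an added convenience):

```python
import numpy as np

def james_stein(x, s, m):
    """Shrinkage estimate of theta from a single x ~ N_p(theta, sigma^2 I),
    given an independent statistic s with s / sigma^2 ~ chi^2_m:
        delta(x) = (1 - (p - 2)/(m + 2) * s / ||x||^2)_+ x,
    which dominates x itself for p >= 3."""
    x = np.asarray(x, dtype=float)
    p = x.size
    shrink = 1.0 - (p - 2) / (m + 2) * s / float(x @ x)
    return max(shrink, 0.0) * x  # positive-part variant
```

The loss $\|\delta - \theta\|^2/\sigma^2$ in the abstract is scale-invariant, which is why the shrinkage factor uses the variance statistic $s$ rather than a known $\sigma^2$.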

Journal ArticleDOI
TL;DR: In this paper, the authors establish the inadmissibility of the traditional maximum likelihood estimators by exhibiting explicit procedures having lower risk than the corresponding maximum likelihood procedure.
Abstract: Consider a multiple regression problem in which the dependent variable and (3 or more) independent variables have a joint normal distribution. This problem was investigated some time ago by Charles Stein, who proposed reasonable loss functions for various problems involving estimation of the regression coefficients and who obtained various minimax and admissibility results. In this paper we continue this investigation and establish the inadmissibility of the traditional maximum likelihood estimators. Inadmissibility is proved by exhibiting explicit procedures having lower risk than the corresponding maximum likelihood procedure. These results are given in Theorems 1 and 2 of Section 3.

Journal ArticleDOI
TL;DR: In this paper, an Edgeworth-type expansion for the distribution of a minimum contrast estimator, and expansions suitable for the computation of critical regions of prescribed error (type one) as well as confidence intervals of prescribed confidence coefficient are presented.
Abstract: This paper contains an Edgeworth-type expansion for the distribution of a minimum contrast estimator, and expansions suitable for the computation of critical regions of prescribed error (type one) as well as confidence intervals of prescribed confidence coefficient. Furthermore, it is shown that, for one-sided alternatives, the test based on the maximum likelihood estimator as well as the test based on the derivative of the log-likelihood function is uniformly most powerful up to a term of order $O(n^{-1})$. Finally, an estimator is proposed which is median unbiased up to an error of order $O(n^{-1})$ and which is--within the class of all estimators with this property--maximally concentrated about the true parameter up to a term of order $O(n^{-1})$. The results of this paper refer to real parameters and to families of probability measures which are "continuous" in some appropriate sense (which excludes the common discrete distributions).


Journal ArticleDOI
TL;DR: In this paper, the authors seek, among all sequential tests with prescribed error probabilities of the null hypothesis $H_0: \theta = -\theta_1$ versus the simple alternative $H_1: \theta = \theta_1$, the test minimizing the maximum expected sample size; they formulate this as an optimal stopping problem and find an optimal stopping rule.
Abstract: Among all sequential tests with prescribed error probabilities of the null hypothesis $H_0: \theta = -\theta_1$ versus the simple alternative $H_1: \theta = \theta_1$, where $\theta$ is the unknown mean of a normal population, we want to find the test which minimizes the maximum expected sample size. In this paper, we formulate the problem as an optimal stopping problem and find an optimal stopping rule. The analogous problem in continuous time is also studied, where we want to test whether the drift coefficient of a Wiener process is $-\theta_1$ or $\theta_1$. By reducing the corresponding optimal stopping problem to a free boundary problem, we obtain upper and lower bounds as well as the asymptotic behavior of the stopping boundaries.

Journal ArticleDOI
TL;DR: In this article, bounds on the moments of the difference between a $U$-statistic and its projection are established and used to derive rates of convergence in the central limit theorem and the strong law of large numbers for $U$-statistics.
Abstract: Bounds are provided for the rates of convergence in the central limit theorem and the strong law of large numbers for $U$-statistics. The results are obtained by establishing suitable bounds upon the moments of the difference between a $U$-statistic and its projection. Analogous conclusions for the associated von Mises statistical functions are indicated. Statistics considered for exemplification are the sample variance and the Wilcoxon two-sample statistic.
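As a worked instance of the first statistic used for exemplification: the sample variance is the $U$-statistic with kernel $h(a, b) = (a - b)^2/2$, averaged over all unordered pairs.

```python
import itertools

def u_statistic_variance(x):
    """Sample variance written as a U-statistic: the average of the
    kernel h(a, b) = (a - b)^2 / 2 over all n-choose-2 unordered pairs.
    This equals the usual unbiased variance with divisor n - 1."""
    vals = [(a - b) ** 2 / 2.0 for a, b in itertools.combinations(x, 2)]
    return sum(vals) / len(vals)
```

The projection of such a statistic replaces the pair average by a sum of conditional expectations given one observation at a time; the paper's rates come from bounding the moments of the difference between the two.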

Journal ArticleDOI
TL;DR: In this article, the law of large numbers with respect to monotone distributions has been studied in the context of Euclidean spaces, and Brunk has investigated their consistency when $N = 1 and some results are obtained in the case of 2.
Abstract: For each $t$ in some subset $T$ of $N$-dimensional Euclidean space let $F_t$ be a distribution function with mean $m(t)$. Suppose $m(t)$ is non-decreasing in each of the coordinates of $t$. Let $t_1, t_2,\cdots$ be a sequence of points in $T$ and let $Y_1, Y_2,\cdots$ be an independent sequence of random variables such that the distribution function of $Y_k$ is $F_{t_k}$. Estimators $\hat{m}_n(t; Y_1,\cdots, Y_n)$ of $m(t)$ which are monotone in each coordinate of $t$ and which minimize $\sum^n_{i=1} \lbrack\hat{m}_n(t_i; Y_1,\cdots, Y_n) - Y_i\rbrack^2$ are already known. Brunk has investigated their consistency when $N = 1$. In this paper additional consistency results are obtained when $N = 1$ and some results are obtained in the case $N = 2$. In addition, we prove several lemmas about the law of large numbers which we believe to be of independent interest.

Journal ArticleDOI
TL;DR: In this paper, the authors give conditions under which the distribution of centered quadratic forms in i.i.d. random variables converges, and apply them to the asymptotic distributions of test criteria of the form $Q_n^W = \sum \lbrack F_0(X_{kn}) - k/(n + 1)\rbrack^2 W(k/(n + 1))$ for testing the hypothesis that $X_{1n}, X_{2n},\cdots, X_{nn}$ are the order statistics of an independent sample from the distribution function $F_0$.
Abstract: Let $Z_1, Z_2,\cdots$, be independent and identically distributed random variables and $\{c_{ijn}\}$ real numbers; put $T_n = \sum^n_{i,j = 1} c_{ijn}Z_iZ_j$. This paper gives conditions under which the distribution of $T_n - ET_n$ converges to the distribution of $\sum \Upsilon_m(Y_m^2 - 1)$ with $\{\Upsilon_m\}$ a real sequence and $Y_1, Y_2,\cdots$ independent $N(0, 1)$ random variables. The results are applied to the calculation of the asymptotic distributions of test criteria of the form $Q_n^W = \sum \lbrack F_0(X_{kn}) - k/(n + 1)\rbrack^2 W(k/(n + 1))$ for testing the hypothesis that $X_{1n}, X_{2n},\cdots, X_{nn}$ are the order statistics of an independent sample from the distribution function $F_0$; here $W$ is a weight function.

Journal ArticleDOI
TL;DR: In this paper, the theory of rank tests is developed with no assumptions concerning the continuous or discrete nature of the underlying distribution function; conditional rank tests given the vector of ties are shown to be similar, and the average scores, midranks, and randomized ranks methods of assigning scores are discussed and compared.
Abstract: The theory of rank tests has been developed primarily for continuous random variables. Recently the asymptotic theory of linear rank tests has been extended to include purely discrete random variables under the null hypothesis of randomness (including the two-sample and $k$-sample problems) and under contiguous alternatives, for the two methods of assigning scores known as the average scores method and the randomized ranks method. In this paper the theory of rank tests is developed with no assumptions concerning the continuous or discrete nature of the underlying distribution function. Conditional rank tests, given the vector of ties, are shown to be similar, and the locally most powerful conditional rank test is given. The asymptotic distribution of linear rank statistics is given under the null hypotheses of randomness and symmetry (which includes the one-sample problem), and under contiguous alternatives. Three methods of assigning scores, the average scores, midranks, and randomized ranks methods, are discussed and briefly compared.

Journal ArticleDOI
TL;DR: In this article, it is shown that certain conditional distributions, obtained by conditioning on a sufficient statistic, can be used to transform a set of random variables into a smaller set of random variables that are independently and identically distributed with uniform distributions on the interval from zero to one.
Abstract: It is shown that certain conditional distributions, obtained by conditioning on a sufficient statistic, can be used to transform a set of random variables into a smaller set of random variables that are identically and independently distributed with uniform distributions on the interval from zero to one. This result is then used to construct distribution-free tests of fit for composite goodness-of-fit problems. In particular, distribution-free chi-square goodness-of-fit tests are obtained for univariate normal, exponential, and normal linear regression model families of distributions.
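One classical transformation of this kind, sketched here for the exponential family (the function name is illustrative): conditioning i.i.d. exponentials on their sum, the partial-sum ratios behave as order statistics of uniforms, yielding $n - 1$ uniform-equivalent values regardless of the unknown scale.

```python
import numpy as np

rng = np.random.default_rng(4)

def exponentials_to_uniforms(x):
    """Given i.i.d. exponential observations x_1..x_n (any scale), the
    partial-sum ratios S_k / S_n, k = 1..n-1, are distributed as the
    order statistics of n - 1 Uniform(0, 1) variables, independently of
    the sufficient statistic S_n -- a distribution-free reduction."""
    s = np.cumsum(np.asarray(x, dtype=float))
    return s[:-1] / s[-1]
```

Because the reduced values are distribution-free, any standard test of uniformity applied to them becomes a test of fit for the exponential family with unknown scale, in the spirit of the composite goodness-of-fit constructions above.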