
Showing papers in "Annals of Mathematical Statistics in 1972"


Journal ArticleDOI
TL;DR: In this article, the authors generalize the iterative scaling method to allow the exponents $b_{si}$ to be arbitrary real numbers, considerably expanding the class of probability distributions in product form that can be estimated by maximum likelihood, maximum entropy, or minimum discrimination information.
Abstract: Say that a probability distribution $\{p_i; i \in I\}$ over a finite set $I$ is in "product form" if (1) $p_i = \pi_i\mu \prod^d_{s=1} \mu_s^{b_{si}}$ where $\pi_i$ and $\{b_{si}\}$ are given constants and where $\mu$ and $\{\mu_s\}$ are determined from the equations (2) $\sum_{i \in I} b_{si} p_i = k_s, s = 1, 2, \cdots, d$; (3) $\sum_{i \in I} p_i = 1$. Probability distributions in product form arise from minimizing the discriminatory information $\sum_{i \in I} p_i \log p_i/\pi_i$ subject to (2) and (3) or from maximizing entropy or maximizing likelihood. The theory of the iterative scaling method of determining (1) subject to (2) and (3) has, until now, been limited to the case when $b_{si} = 0, 1$. In this paper the method is generalized to allow the $b_{si}$ to be any real numbers. This expands considerably the list of probability distributions in product form which it is possible to estimate by maximum likelihood.
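The product-form fixed point can be illustrated numerically. The sketch below is not the authors' generalized iterative scaling procedure itself, but a plain gradient ascent on the same dual problem; `pi` plays the role of $\pi_i$, `B[s][i]` of the real-valued $b_{si}$, and `k` of the targets $k_s$, all hypothetical inputs chosen for illustration.

```python
import math

def product_form_fit(pi, B, k, lr=0.5, iters=5000):
    """Fit p_i proportional to pi_i * prod_s mu_s^{b_si} so that
    sum_i b_si p_i = k_s, by gradient ascent on the dual variables
    lambda_s = log(mu_s). Assumes a feasible target k."""
    d, n = len(B), len(pi)
    lam = [0.0] * d
    for _ in range(iters):
        # Current product-form distribution for these multipliers.
        w = [pi[i] * math.exp(sum(lam[s] * B[s][i] for s in range(d)))
             for i in range(n)]
        Z = sum(w)
        p = [wi / Z for wi in w]
        # Move each multiplier toward matching its constraint (2).
        for s in range(d):
            lam[s] += lr * (k[s] - sum(B[s][i] * p[i] for i in range(n)))
    return p
```

On a toy three-point problem the returned distribution satisfies the moment constraint (2) and normalization (3) to high accuracy.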

1,292 citations



Journal ArticleDOI
TL;DR: A selective review on robust statistics, centering on estimates of location, but extending into other estimation and testing problems, can be found in this paper, where three important classes of estimates are singled out and some basic heuristic tools for assessing properties of robust estimates (or test statistics) are discussed.
Abstract: This is a selective review on robust statistics, centering on estimates of location, but extending into other estimation and testing problems. After some historical remarks, several possible concepts of robustness are critically reviewed. Three important classes of estimates are singled out and some basic heuristic tools for assessing properties of robust estimates (or test statistics) are discussed: the influence curve and jackknifing. Then we give some asymptotic and finite sample minimax results for estimation and testing. The material is complemented by miscellaneous remarks on: computational aspects; other estimates; scale, regression, time series and other estimation problems; some tentative practical recommendations.

557 citations


Journal ArticleDOI
TL;DR: In this article, the authors give some measures of the dispersion of a set of numbers and define their estimates as the parameter values that minimize the dispersion of the residuals; these estimates are asymptotically equivalent to estimates recently proposed by Jureckova.
Abstract: An appealing approach to the problem of estimating the regression coefficients in a linear model is to find those values of the coefficients which make the residuals as small as possible. We give some measures of the dispersion of a set of numbers, and define our estimates as those values of the parameters which minimize the dispersion of the residuals. We consider dispersion measures which are certain linear combinations of the ordered residuals. We show that the estimates derived from them are asymptotically equivalent to estimates recently proposed by Jureckova. In the case of a single parameter, we show that our estimate is a "weighted median" of the pairwise slopes $(Y_j - Y_i)/(c_j - c_i)$.
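For the single-parameter case, the weighted-median characterization can be sketched directly. The weights below, $|c_j - c_i|$ (the choice associated with Wilcoxon scores), are an assumption for illustration and are not specified in the abstract.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= half:
            return v

def rank_slope_estimate(c, y):
    """Weighted median of the pairwise slopes (y_j - y_i)/(c_j - c_i),
    weighted here (an illustrative assumption) by |c_j - c_i|."""
    slopes, weights = [], []
    for i in range(len(c)):
        for j in range(i + 1, len(c)):
            if c[j] != c[i]:
                slopes.append((y[j] - y[i]) / (c[j] - c[i]))
                weights.append(abs(c[j] - c[i]))
    return weighted_median(slopes, weights)
```

On data lying exactly on a line the estimate recovers the slope, and a single gross outlier in $y$ leaves it unchanged, which is the robustness the dispersion criterion is after.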

515 citations



Journal ArticleDOI
TL;DR: In this article, a necessary and sufficient condition for joint asymptotic normality in a new (strong) sense, in the case of independence, is given, in which the dimensionality of the vector random variable under consideration is allowed to increase indefinitely.
Abstract: The concept of asymptotic normality takes on some new aspects when the dimensionality of the vector random variable under consideration is allowed to increase indefinitely. A necessary and sufficient condition for joint asymptotic normality in a new (strong) sense, in the case of independence, is given.

266 citations


Journal ArticleDOI
TL;DR: In this article, the authors determine the asymptotic behavior of the extreme values of the waiting-time and queue-length processes in a $GI/G/1$ queue in terms of the traffic intensity $\rho$, reducing the case $\rho < 1$ to the tail behavior of the maximum of a random walk with negative drift.
Abstract: Consider a $GI/G/1$ queue in which $W_n$ is the waiting time of the $n$th customer, $W(t)$ is the virtual waiting time at time $t$, and $Q(t)$ is the number of customers in the system at time $t$. We let the extreme values of these processes be $W_n^\ast = \max \{W_j: 0 \leqq j \leqq n\}, W^\ast(t) = \sup \{W(s): 0 \leqq s \leqq t\}$, and $Q^\ast(t) = \sup \{Q(s): 0 \leqq s \leqq t\}$. The asymptotic behavior of the queue is determined by the traffic intensity $\rho$, the ratio of arrival rate to service rate; the behavior differs in the three cases $\rho < 1, \rho = 1$, and $\rho > 1$. For the case $\rho < 1$, it is necessary to obtain the tail behavior of the maximum of a random walk with negative drift before it first enters the set $(-\infty, 0\rbrack$.
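The waiting-time process satisfies the standard Lindley recursion $W_{n+1} = \max(0, W_n + S_n - A_{n+1})$, so the extreme $W_n^\ast$ is easy to simulate. A minimal sketch (the function name and the exponential inputs in the test are illustrative, not from the paper):

```python
import random

def waiting_time_maxima(n, service, interarrival, seed=0):
    """Simulate GI/G/1 waiting times via the Lindley recursion
    W_{k+1} = max(0, W_k + S_k - A_{k+1}), where `service` and
    `interarrival` draw S_k and A_{k+1} from a given rng.
    Returns the running maximum W_n* over the first n customers."""
    rng = random.Random(seed)
    w, w_star = 0.0, 0.0
    for _ in range(n):
        w = max(0.0, w + service(rng) - interarrival(rng))
        w_star = max(w_star, w)
    return w_star
```

For example, exponential service with mean 0.8 against exponential interarrival times with mean 1 gives $\rho = 0.8 < 1$, the stable case studied via the random-walk maximum.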

243 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered the problem of converting input sequences of symbols generated by a stationary random process into sequences of independent, equiprobable output symbols, measuring the efficiency of such a procedure when the input sequence is finite by the expected value of the ratio of output symbols to input symbols.
Abstract: We consider procedures for converting input sequences of symbols generated by a stationary random process into sequences of independent, equiprobable output symbols, measuring the efficiency of such a procedure when the input sequence is finite by the expected value of the ratio of output symbols to input symbols. For a large class of processes and a large class of procedures we give an obvious information-theoretic upper bound to efficiency. We also construct procedures which attain this bound in the limit of long input sequences without making use of the process parameters, for two classes of processes. In the independent case we generalize a 1951 result of von Neumann and 1970 results of Hoeffding and Simons for independent but biased binary input, gaining a factor of 3 or 4 in efficiency. In the finite-state case we generalize a 1968 result of Samuelson for two-state binary Markov input, gaining a larger factor in efficiency.
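The independent-case baseline that the paper generalizes is von Neumann's 1951 procedure: scan non-overlapping pairs of biased bits, output the first bit of each unequal pair, and discard equal pairs. A minimal sketch:

```python
def von_neumann_extract(bits):
    """von Neumann debiasing: for each non-overlapping pair,
    emit the first bit if the pair is 01 or 10, discard 00 and 11.
    For i.i.d. input bits the output bits are independent and fair."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out
```

For input bias $p$ this yields on average $p(1-p)$ output bits per input pair, which is the inefficiency (the "factor of 3 or 4") that the constructions in the paper recover in the limit.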

200 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the number of trials of a multinomial distribution, from an incomplete observation of the cell totals, under constraints on the cell probabilities, is investigated.
Abstract: This paper deals with the problem of estimating the number of trials of a multinomial distribution, from an incomplete observation of the cell totals, under constraints on the cell probabilities. More specifically let $(n_1, \cdots, n_k)$ be distributed according to the multinomial law $M(N; p_1, \cdots, p_k)$ where $N$ is the number of trials and the $p_i$'s are the cell probabilities, $\sum^k_{i=1}p_i$ being equal to 1. Suppose that only a proper subset of $(n_1, \cdots, n_k)$ is observable, that $N, p_1, \cdots, p_k$ are unknown and that $N$ is to be estimated. Without loss of generality, $(n_1, \cdots, n_{l-1}), l \leqq k$ may be taken to be the observable random vector. For fixed $N, (n_1, \cdots, n_{l-1}, N - n)$ has the multinomial distribution $M(N; p_1, \cdots, p_l)$ where $n$ denotes $\sum^{l-1}_{i=1}n_i$ and $p_l$ denotes $1 - \sum^{l-1}_{i=1}p_i$. If the parameter space is such that $N$ can take any nonnegative integral value and each $p_i$ can take any value between 0 and 1 with $\sum^{l-1}_{i=1}p_i < 1$, then $N$ is not estimable: any value of $N \geqq n$ is compatible with the observed data. In specific situations, it might, however, be possible to postulate constraints of the type \begin{equation*}\tag{1.1} p_i = f_i(\theta),\quad i = 1, \cdots, l\end{equation*} where $\theta = (\theta_1, \cdots, \theta_r)$ is a vector of $r$ independent parameters and $f_i$ are known functions. This may lead to estimability of $N$. The problem of estimating $N$ in such a situation is studied here. The present investigation is motivated by the following problem. Experiments in particle physics often involve visual scanning of film containing photographs of particles (occurring, for instance, inside a bubble chamber). The scanning is done with a view to counting the number $N$ of particles of a predetermined type (these particles will be referred to as events).
But owing to poor visibility caused by such characteristics as low momentum, the distribution and configuration of nearby track patterns, etc., some events are likely to be missed during the scanning process. The question, then, is: How does one get an estimate of $N$? The usual procedure of estimating $N$ is as follows. Film containing the $N$ (unknown) events is scanned separately by $w$ scanners (ordered in some specific way) using the same instructions. For each event $E$ let a $w$-vector $Z(E)$ be defined, such that the $j$th component $Z_j$ of $Z(E)$ is 1 if $E$ is detected by the $j$th scanner and is 0 otherwise. Let $\mathscr{J}$ be the set of $2^w w$-vectors of 1's and 0's and let $I_0$ be the vector of 0's. Let $x_I$ be the number of events $E$ whose $Z(E) = I$. For $I \in \mathscr{J} - \{I_0\}$, the $x_I$'s are observed. A probability model is assumed for the results of the scanning process. That is, it is assumed that there is a probability $p_I$ that $Z(E)$ assumes the value $I$ and that these $p_I$'s are constrained by equations of the type (1.1). (These constraints vary according to the assumptions made about the scanners and events, thus giving rise to different models. An example of $p_I(\theta)$ would be $E(v^{\Sigma^w_{j=1}I_j}(1 - v)^{w-\Sigma^w_{j=1}I_j})$ where $I_j$ is the $j$th component of $I$ and expectation is taken with respect to the two-parameter beta density for $v$. This is the result of assuming that all scanners are equally efficient in detecting events, that the probability $v$ that an event is seen by any scanner is a random variable and that the results of the different scans are locally independent. For a discussion of various models, see Sanathanan (1969), Chapter III.) $N$ is then estimated using the observed $x_I$'s and the constraints on the $p_I$'s, provided certain conditions (e.g., the minimum number of scans required) are met.
The following formulation of the problem of estimating $N$, however, leads to some systematic study including a development of the relevant asymptotic distribution theory for the estimators. The $Z(E)$'s may be regarded as realizations of $N$ independent identically distributed random variables whose common distribution is discrete with probabilities $p_I$ at $I$. (In particle counting problems, it is usually true that the particles of interest are sparsely distributed throughout the film on account of their Poisson distribution with low intensity. Thus in spite of the factors affecting their visibility outlined earlier, the events can be assumed to be independent.) The joint distribution of the $x_I$'s is, then, multinomial $M(N; p_I, I \in \mathscr{J})$. The problem of estimating $N$ is now in the form stated at the beginning of this section. Since the estimate depends on the constraints provided for the $p_I$'s, it is important to test the "fit" of the model selected. The conditional distribution of the $x_I$'s $(I \neq I_0)$ given $x$ is multinomial $M(x; p_I/p, I \neq I_0)$ where $x$ is defined as $\sum_{I \neq I_0} x_I$ and $p$ as $\sum_{I \neq I_0}p_I$. The corresponding $\chi^2$ goodness of fit test may therefore be used to test the adequacy of a model in question. Various estimators of $N$ are considered in this paper and among them is, of course, the maximum likelihood estimator of $N$. Asymptotic theory for maximum likelihood estimation of the parameters of a multinomial distribution has been developed before for the case where $N$ is known but not for the case where $N$ is unknown. Asymptotic theory related to the latter case is developed in Section 4. The result on the asymptotic joint distribution of the relevant maximum likelihood estimators is stated in Theorem 2. A second method of estimation considered is that of maximizing the likelihood based on the conditional probability of observing $(n_1,\cdots, n_{l-1})$, given $n$.
This method is called the conditional maximum likelihood (C.M.L.) method. The C.M.L. estimator of $N$ is shown (Theorem 2) to be asymptotically equivalent to the maximum likelihood estimator. Section 5 contains an extension of these results to the situation involving several multinomial distributions. This situation arises in the particle scanning context when the detected events are classified into groups based on some factor like momentum which is related to visibility of an event, and a separate scanning record is available for each group. A third method of estimation considered is that of equating certain linear combinations of the cell totals (presumably chosen on the basis of some criterion) to their respective expected values. Asymptotic theory for this method is given in Section 6. This discussion is motivated by a particular case which is applicable to some models in the particle scanning problem, using a criterion based on the method of moments for the unobservable random variable, given by the number of scanners detecting an event (Discussion of the particular case can be found in Sanathanan (1969) Chapter III.). In the next section we give some definitions and a preliminary lemma.
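In the simplest special case of two independent scanners, the estimate of $N$ reduces to the classical capture-recapture (Lincoln-Petersen) form. The sketch below shows only that textbook special case, not the general procedure of the paper; `x10`, `x01`, `x11` denote the counts of events seen only by scanner 1, only by scanner 2, and by both.

```python
def lincoln_petersen(x10, x01, x11):
    """Capture-recapture estimate of the total number of events N from
    two independent scans: N-hat = n1 * n2 / x11, where n1 = x10 + x11
    and n2 = x01 + x11 are the totals seen by each scanner."""
    n1, n2 = x10 + x11, x01 + x11
    return n1 * n2 / x11
```

The estimate exceeds the number of distinct events actually seen, the excess accounting for events missed by both scans.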

195 citations


Journal ArticleDOI
TL;DR: In this article, the authors studied the performance of estimation-preceded-by-testing in the context of estimating the mean vector of a multivariate normal distribution with quadratic loss.
Abstract: Estimation-preceded-by-testing is studied in the context of estimating the mean vector of a multivariate normal distribution with quadratic loss. It is shown that although there are parameter values for which the risk of a preliminary-test estimator is less than that of the usual estimator, there are also values for which its risk exceeds that of the usual estimator, and that it is dominated by the positive-part version of the Stein-James estimator. The results apply to preliminary-test estimators corresponding to any linear hypothesis concerning the mean vector, e.g., an hypothesis in a regression model. The case in which the covariance matrix of the multi-normal distribution is known up to a multiplicative constant and the case in which it is completely unknown are treated.
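The two estimators being compared are easy to write down for the known-covariance case $\Sigma = \sigma^2 I$ with the hypothesis "mean vector is zero"; the pretest threshold `c` below is a free parameter, and the sketch is illustrative rather than the paper's exact formulation.

```python
def james_stein_pp(x, sigma2=1.0):
    """Positive-part Stein-James estimate shrinking toward the origin
    (for dimension p >= 3): (1 - (p - 2) * sigma2 / ||x||^2)_+ * x."""
    p = len(x)
    s = sum(v * v for v in x)
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / s) if s > 0 else 0.0
    return [factor * v for v in x]

def pretest(x, c, sigma2=1.0):
    """Preliminary-test estimator: accept the null (estimate 0) when the
    test statistic ||x||^2 / sigma2 is below the threshold c, else use x."""
    s = sum(v * v for v in x) / sigma2
    return [0.0] * len(x) if s <= c else list(x)
```

The pretest rule jumps discontinuously between the two extremes, whereas the positive-part estimator shrinks smoothly; the paper shows the latter dominates the former under quadratic loss.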

175 citations


Journal ArticleDOI
TL;DR: A U-statistic $J_n$ is proposed for testing the hypothesis $H_0$ that a new item has stochastically the same life length as a used item of any age (i.e., the life distribution is exponential) against the alternative hypothesis $H_1$ that a new item has stochastically greater life length.
Abstract: A U-statistic $J_n$ is proposed for testing the hypothesis $H_0$ that a new item has stochastically the same life length as a used item of any age (i.e., the life distribution $F$ is exponential), against the alternative hypothesis $H_1$ that a new item has stochastically greater life length ($\bar{F}(x)\bar{F}(y) \geqq \bar{F}(x + y)$ for all $x \geqq 0, y \geqq 0$, where $\bar{F} = 1 - F$). $J_n$ is unbiased; in fact, under a partial ordering of $H_1$ distributions, $J_n$ is ordered stochastically in the same way. Consistency against $H_1$ alternatives is shown, and asymptotic relative efficiencies are computed. Small sample null tail probabilities are derived, and critical values are tabulated to permit application of the test.

Book ChapterDOI
TL;DR: In this paper, it was shown how martingale theorems can be used to widen the scope of classical inferential results concerning autocorrelations in time series analysis.
Abstract: In this paper it is shown how martingale theorems can be used to appreciably widen the scope of classical inferential results concerning autocorrelations in time series analysis. The object of study is a process which is basically the second-order stationary purely non-deterministic process and contains, in particular, the mixed autoregressive and moving average process. We obtain a strong law and a central limit theorem for the autocorrelations of this process under very general conditions. These results show in particular that, subject to mild regularity conditions, the classical theory of inference for the process in question goes through if the best linear predictor is the best predictor (both in the least squares sense).

Journal ArticleDOI
TL;DR: Two theorems on the asymptotic normality of linear combinations of functions of order statistics are given in this article; one requires a smooth scoring function but allows the underlying df to be discontinuous and even to depend on the sample size.
Abstract: Two theorems on the asymptotic normality of linear combinations of functions of order statistics are given. Theorem 1 requires a "smooth" scoring function, but the underlying df need not be continuous and may even depend on the sample size. Theorem 2 allows general scoring functions but places additional restrictions on the df. Applications are included.

Journal ArticleDOI
TL;DR: In this paper, the authors study how rapidly an estimator of the value $f(0)$ of a density at 0 can converge, deriving lower bounds for the possible rate of convergence over classes of densities defined by Lipschitz conditions.
Abstract: Estimation of the value $f(0)$ of a density function evaluated at 0 is studied, $f: \mathbb{R}_m \rightarrow \mathbb{R}, 0 \in \mathbb{R}_m$. Sequences of estimators $\{\gamma_n, n \geqq 1\}$, one estimator for each sample size, are studied. We are interested in the problem, given a set $C$ of density functions and a sequence of numbers $\{a_n, n \geqq 1\}$, how rapidly can $a_n$ tend to zero and yet have $\lim\inf_{n\rightarrow\infty} \inf_{f\in C}P_f(|\gamma_n(X_1,\cdots, X_n) - f(0)|\leqq a_n) > 0?$ In brief, by "rate of convergence" we will mean the rate at which $a_n$ tends to zero. For a continuum of different choices of the set $C$ specified by various Lipschitz conditions on the $k$th partial derivatives of $f, k \geqq 0$, lower bounds for the possible rate of convergence are obtained. Combination of these lower bounds with known methods of estimation gives best possible rates of convergence in a number of cases.
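A standard kernel estimator of $f(0)$ is the kind of "known method of estimation" such lower bounds are matched against. A univariate sketch for simplicity (the Gaussian kernel and bandwidth `h` are illustrative assumptions):

```python
import math

def kde_at_zero(xs, h):
    """Kernel estimate of f(0): average of Gaussian kernels of
    bandwidth h centered at the observations, evaluated at 0."""
    n = len(xs)
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return sum(c * math.exp(-0.5 * (x / h) ** 2) for x in xs) / (n * h)
```

A single observation at 0 with $h = 1$ returns the kernel's own peak value $1/\sqrt{2\pi}$, and mass far from 0 contributes essentially nothing.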

Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of estimating the regression function of a regression function on a set of points (x, y, k) of the form (m_n(x) = \sum Y_ik((x - X_i)/a_n)/\sum k((x − X - i)/a-n)n) ).
Abstract: As an approximation to the regression function $m$ of $Y$ on $X$ based upon empirical data, E. A. Nadaraya and G. S. Watson have studied estimates of $m$ of the form $m_n(x) = \sum Y_ik((x - X_i)/a_n)/\sum k((x - X_i)/a_n)$. For distinct points $x_1, \cdots, x_k$, we establish conditions under which $(na_n)^{\frac{1}{2}}(m_n(x_1) - m(x_1), \cdots, m_n(x_k) - m(x_k))$ is asymptotically multivariate normal.
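The Nadaraya-Watson estimate named in the abstract can be sketched directly (the Gaussian kernel is chosen here for illustration; the abstract's $k$ is a general kernel):

```python
import math

def nadaraya_watson(x, xs, ys, bandwidth):
    """Nadaraya-Watson kernel regression estimate m_n(x):
    a locally weighted average of the ys, with weights
    k((x - X_i)/a_n) from a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```

Because the weights are normalized, a constant response is reproduced exactly, and a query point midway between two observations averages them equally.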

Journal ArticleDOI
TL;DR: In this article, the authors derive asymptotic formulas, valid under general conditions, for inclusion probabilities and sums of sampled variate values when a finite population is sampled by successive drawings without replacement, the probability of drawing item $s$ at each draw being proportional to a number $p_s > 0$ if item $s$ remains in the population and 0 otherwise.
Abstract: To each of the items $1,2,\cdots, N$ in a finite population there is associated a variate value. The population is sampled by successive drawings without replacement in the following way. At each draw the probability of drawing item $s$ is proportional to a number $p_s > 0$ if item $s$ remains in the population and is 0 otherwise. Let $\Delta(s; n)$ be the probability that item $s$ is obtained in the first $n$ draws and let $Z_n$ be the sum of the variate values obtained in the first $n$ draws. Asymptotic formulas, valid under general conditions when $n$ and $N$ both are "large", are derived for $\Delta(s; n), EZ_n$ and $\operatorname{Cov}(Z_{n_1}, Z_{n_2})$. Furthermore it is shown that, still under general conditions, the joint distribution of $Z_{n_1}, Z_{n_2},\cdots, Z_{n_d}$ is asymptotically normal. The general results are then applied to obtain asymptotic results for a "quasi"-Horvitz-Thompson estimator of the population total.
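The sampling scheme is straightforward to simulate: at each draw an item still in the population is selected with probability proportional to its $p_s$. The function name and the use of `random.choices` are implementation choices, not from the paper.

```python
import random

def successive_sample(p, n, rng):
    """Draw n items without replacement from a population of len(p) items;
    at each draw, item s is chosen with probability proportional to p[s]
    among the items remaining in the population."""
    remaining = list(range(len(p)))
    sample = []
    for _ in range(n):
        weights = [p[s] for s in remaining]
        s = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(s)
        sample.append(s)
    return sample
```

Averaging indicator outcomes over repeated simulations gives Monte Carlo estimates of the inclusion probabilities $\Delta(s; n)$ that the asymptotic formulas approximate.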

Journal ArticleDOI
TL;DR: In this paper, it is shown that for any right continuous martingale $M(t)$ there is a right continuous family of minimal stopping times $T(t)$ such that $W(T(t))$ has the same finite joint distributions as $M(t)$; it is also shown that a minimal Skorokhod stopping time embedding a stable distribution of index $\alpha > 1$ lies in the domain of attraction of a stable distribution of index $\alpha/2$.
Abstract: A stopping time $T$ for the Wiener process $W(t)$ is called minimal if there is no stopping time $S \leqq T$ such that $W(S)$ and $W(T)$ have the same distribution. In the first section, it is shown that if $E\{W(T)\} = 0$, then $T$ is minimal if and only if the process $W(t \wedge T)$ is uniformly integrable. Also, if $T$ is minimal and $E\{W(T)\} = 0$ then $E\{T\} = E\{W(T)^2\}$. In the second section, these ideas are used to show that for any right continuous martingale $M(t)$, there is a right continuous family of minimal stopping times $T(t)$ such that $W(T(t))$ has the same finite joint distributions as $M(t)$. In the last section it is shown that if $T$ is defined in the manner proposed by Skorokhod (and therefore minimal) such that $W(T)$ has a stable distribution of index $\alpha > 1$ then $T$ is in the domain of attraction of a stable distribution of index $\alpha/2$.

Journal ArticleDOI
TL;DR: In this article, a Bernoulli process with unknown expectations is selected and observed at each of n$ stages, and the objective is to maximize the expected number of successes from the n$ selections.
Abstract: One of two independent Bernoulli processes (arms) with unknown expectations $\rho$ and $\lambda$ is selected and observed at each of $n$ stages. The selection problem is sequential in that the process which is selected at a particular stage is a function of the results of previous selections as well as of prior information about $\rho$ and $\lambda$. The variables $\rho$ and $\lambda$ are assumed to be independent under the (prior) probability distribution. The objective is to maximize the expected number of successes from the $n$ selections. Sufficient conditions for the optimality of selecting one or the other of the arms are given and illustrated for example distributions. The stay-on-a-winner rule is proved.
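The stay-on-a-winner rule itself (keep the current arm after a success, switch after a failure) is simple to play. The paper proves that staying on a winner is optimal under the Bayes formulation; the sketch below merely simulates the rule as a fixed policy, with illustrative arm probabilities.

```python
import random

def stay_on_winner(n, p_arms, seed=0):
    """Play the stay-on-a-winner rule on a two-armed Bernoulli bandit:
    keep the current arm after a success, switch arms after a failure.
    Returns the number of successes in n pulls."""
    rng = random.Random(seed)
    arm, successes = 0, 0
    for _ in range(n):
        if rng.random() < p_arms[arm]:
            successes += 1          # success: stay on this arm
        else:
            arm = 1 - arm           # failure: switch arms
    return successes
```

With a sure-thing first arm the rule never switches and wins every pull; with two hopeless arms it alternates and wins nothing.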

Journal ArticleDOI
TL;DR: In this paper, the authors consider some unresolved relationships among various notions of bivariate dependence and show, in particular, that positive regression dependence of $T$ on $S$ implies that $S$ and $T$ are associated.
Abstract: We consider some unresolved relationships among various notions of bivariate dependence. In particular we show that $P\lbrack T > t \mid S > s\rbrack \uparrow$ in $s$ (or alternately, $P\lbrack T \leqq t \mid S \leqq s\rbrack \downarrow$ in $s$) implies $S, T$ are associated, i.e. $\operatorname{Cov} \lbrack f(S, T), g(S, T)\rbrack \geqq 0$ for all non-decreasing $f$ and $g$.

Journal ArticleDOI
TL;DR: In this article, a Cramer-von Mises type statistic is proposed for testing the symmetry of a continuous distribution function, and its asymptotic null distribution is found explicitly.
Abstract: A Cramer-von Mises type statistic is proposed for testing the symmetry of a continuous distribution function. Its asymptotic null distribution is found explicitly, and its asymptotic distribution under a sequence of local alternatives is described. A Monte Carlo study indicates that the asymptotic formulae are accurate for sample sizes as small as twenty.

Journal ArticleDOI
TL;DR: In this paper, conditions for strong consistency and asymptotic normality of the MLE estimator for multiparameter exponential models are given, and the conditions are less restrictive than required by general theorems in this area.
Abstract: Conditions are given for the strong consistency and asymptotic normality of the MLE (maximum likelihood estimator) for multiparameter exponential models. Because of the special structure assumed, the conditions are less restrictive than required by general theorems in this area. The technique involves certain convex functions on Euclidean spaces that arise naturally in the present context. Some examples are considered; among them, the multinomial distribution. Some convexity and continuity properties of multivariate cumulant generating functions are also discussed.

Journal ArticleDOI
TL;DR: In this paper, the optimality criteria formulated in terms of the power functions of individual tests are given for problems where several hypotheses are tested simultaneously, subject to the constraint that the expected number of false rejections is less than a given constant gamma when all null hypotheses are true.
Abstract: Optimality criteria formulated in terms of the power functions of the individual tests are given for problems where several hypotheses are tested simultaneously. Subject to the constraint that the expected number of false rejections is less than a given constant gamma when all null hypotheses are true, tests are found which maximize the minimum average power and the minimum power of the individual tests over certain alternatives. In the common situations in the analysis of variance this leads to application of multiple t-tests. Recommendations for choosing the value of gamma are given by relating gamma to the probability of no false rejections if all hypotheses are true. Based upon the optimality of the tests, a similar optimality property of joint confidence sets is also derived.

Journal ArticleDOI
TL;DR: In this article, the authors consider separable mean zero Gaussian processes $X(t)$ with correlation functions $\rho(t, s)$ for which $1 - \rho(t, s)$ is asymptotic to a regularly varying (at zero) function of $|t - s|$ with exponent $\alpha, 0 \leqq \alpha \leqq 2$.
Abstract: The authors consider two problems for separable mean zero Gaussian processes $X(t)$ with correlation functions $\rho(t, s)$ for which $1 - \rho(t, s)$ is asymptotic to a regularly varying (at zero) function of $|t - s|$ with exponent $\alpha, 0 \leqq \alpha \leqq 2$. In showing the existence of such (stationary) processes for $0 \leqq \alpha < 2$, the authors relate the magnitude of the tails of the spectral distributions to the behavior of the covariance function at the origin. For $0 < \alpha \leqq 2$, they obtain the asymptotic distribution of the maximum of $X(t)$. This second result is used to obtain a result for $X(t)$ as $t \rightarrow \infty$ similar to the law of the iterated logarithm.

Journal ArticleDOI
TL;DR: In this article, the authors determine the optimal expected return $\sup E\lbrack r(S_T)\rbrack$, where the supremum is taken over all stop rules $T$, give conditions under which it is finite, and determine optimal stopping times when those conditions hold.
Abstract: We determine $\sup E\lbrack r(S_T)\rbrack$, where $S_n$ is a sequence of partial sums of independent identically distributed random variables, for two reward functions: $r(x) = x^+$ and $r(x) = (e^x - 1)^+$. The supremum is taken over all stop rules $T$. We give conditions under which the optimal expected return is finite. Under these conditions, optimal stopping times exist, and we determine them. The problem has an interpretation in an action timing problem in finance.

Journal ArticleDOI
TL;DR: In this article, a recursion is given for computing the distribution of two-sided Kolmogorov-Smirnov type statistics for arbitrary distribution functions, and the precision of a truncated expansion of $1 - P_{\mathrm{KS}}$ in powers of $\lambda^{-2}$ is determined; curves of $\lambda$ versus $1 - P_{\mathrm{KS}}$ are given for $n = 1, 2, 10, 100$.
Abstract: Let $X_1^n \leqq X_2^n \leqq \cdots \leqq X_n^n$ be the order statistics of a size $n$ sample from any distribution function $F$ not necessarily continuous. Let $\alpha_j, \beta_j, (j = 1,2, \cdots, n)$ be any numbers. Let $P_n = P(\alpha_j < X_j^n \leqq \beta_j, j = 1,2, \cdots, n)$. A recursion is given which calculates $P_n$ for any $F$ and any $\alpha_j, \beta_j$. Suppose now that $F$ is continuous. A two-sided statistic of Kolmogorov-Smirnov type has the distribution function $P_{\mathrm{KS}} = P\lbrack\sup n^{\frac{1}{2}}\psi(F) \cdot |F^n - F| \leqq \lambda\rbrack$, where $F^n$ is the empirical distribution function of the sample and $\psi(x)$ is any nonnegative weight function. As $P_{\mathrm{KS}}$ has the form $P_n$, its calculation as a function of $\lambda$ can be carried out by means of the recursion. This has been done for the case $\psi(x) = \lbrack x(1 - x)\rbrack^{-\frac{1}{2}}$. Curves are given which represent $\lambda$ versus $1 - P_{\mathrm{KS}}$ for $n = 1,2, 10, 100$. From additional computations, the precision of a truncated development of $1 - P_{\mathrm{KS}}$ in powers of $\lambda^{-2}$ has been determined.

Journal ArticleDOI
TL;DR: Asymptotic normality of linear rank statistics for testing the hypothesis of independence is established under fixed alternatives in this article, where a generalization of a result of Bhuchongkul [1] is obtained both with respect to the conditions concerning the orders of magnitude of the score functions and to the smoothness conditions on these functions.
Abstract: Asymptotic normality of linear rank statistics for testing the hypothesis of independence is established under fixed alternatives. A generalization of a result of Bhuchongkul [1] is obtained both with respect to the conditions concerning the orders of magnitude of the score functions and with respect to the smoothness conditions on these functions.

Journal ArticleDOI
TL;DR: In this paper, a sequential procedure for estimating the mean of an exponential distribution is proposed, which is shown to perform well for large values of the mean, and the results of a Monte Carlo study indicate that it also performs well for moderate values of a mean.
Abstract: A sequential procedure for estimating the mean of an exponential distribution is proposed. It is shown to perform well for large values of the mean, and the results of a Monte Carlo study indicate that it also performs well for moderate values of the mean.

Journal ArticleDOI
TL;DR: In this article, a sequence of decision problems is considered where for each problem the observation has a probability density function of exponential type with parameter $\lambda$, where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$.
Abstract: A sequence of decision problems is considered where for each problem the observation has a probability density function of exponential type with parameter $\lambda$, where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$. It is supposed that in each of the problems one of two possible actions (e.g., "accept" or "reject") must be taken. Under various assumptions, reasonably sharp upper bounds are found for the rate at which the risk of the $n$th problem approaches the smallest possible risk for certain refinements of the standard empirical Bayes procedures. For suitably chosen procedures, under situations likely to occur in practice, rates faster than $n^{-1+\epsilon}$ may be obtained for arbitrarily small $\epsilon > 0$. Arbitrarily slow rates can occur in pathological situations.

Journal ArticleDOI
TL;DR: For partial cumulative sums of independent and identically distributed random variables with a finite (positive) variance, weak convergence to Brownian motion processes has been established by Donsker (1951, 1952).
Abstract: For partial cumulative sums of independent and identically distributed random variables (i.i.d.r.v.) with a finite (positive) variance, weak convergence to Brownian motion processes has been established by Donsker (1951, 1952). The result is extended here to differentiable statistical functions of von Mises (1947) and $U$-statistics of Hoeffding (1948). Along with the extension to generalized $U$-statistics, a few applications are briefly sketched.

Journal ArticleDOI
TL;DR: In this article, a convex version of the Edgeworth asymptotic expansion for the quadratic form of the distribution function of X_n is presented, where the assumption of continuous distribution is removed.
Abstract: Ranga Rao [10] developed a version of the Edgeworth asymptotic expansion for $\mathrm{Pr}(X_n \in B)$, where $X_n = n^{-\frac{1}{2}} \sum^n_{i=1} Z_i$, $\{Z_i\}$ is a sequence of independent random vectors in $R_k$ having a common lattice distribution with mean vector zero and nonsingular covariance matrix $\Sigma$, and $B$ is a Borel set. Use of this expansion is very difficult, except for the distribution function of $X_n$. In this paper, Ranga Rao's expansion is used to obtain a different expansion, when $B$ is convex. This new expansion is much simpler to evaluate. In the special case when $B = \lbrack x \mid x^T\Sigma^{-1}x < c\rbrack$, the new expansion assumes its simplest form. The first partial sum is the usual multivariate normal approximation, and Esseen ([6] pages 110-111) determined the order of magnitude of its error, i.e., $\mathrm{Pr}(X_n \in B) = K_k(c) + O(n^{-k/(k+1)})$ where $K_k(c)$ is the chi-square distribution function with $k$ degrees of freedom. Note that the order of magnitude of the error is $n^{-\frac{1}{2}}$ for $k = 1$ and approaches $n^{-1}$ as $k$ increases. The second partial sum is $\mathrm{Pr}(X_n \in B) = K_k(c) + (N(nc) - V(nc)) \frac{\exp(-c/2)}{(2\pi n)^{k/2}|\Sigma|^{\frac{1}{2}}} + O(n^{-1})$ where $N(nc)$ is the number of integer vectors $m$ in the ellipsoid $(m + na)^T\Sigma^{-1}(m + na) < nc$ having center at $-na$, and $V(nc)$ is the volume of this ellipsoid. This provides a new expansion for the distribution function of the quadratic form $X_n^T\Sigma^{-1}X_n$. When $Z_i$ has a multinomial distribution with parameters $N = 1, p_1, \cdots, p_m, \sum^m_{i=1} p_i = 1, X_n^T\Sigma^{-1}X_n$ is the chi-square goodness-of-fit statistic, and the new expansion (with $k = m - 1$) provides very accurate approximations for its distribution function.
The accuracy of the first several partial sums, and of the Edgeworth approximation under the (inappropriate) assumption that $Z_i$ has a continuous distribution, is examined numerically for a number of multinomial distributions. It is concluded that the Edgeworth approximation assuming a continuous distribution should never be used when $Z_i$ has a lattice distribution, and that the second partial sum of the new expansion is much more accurate than the normal approximation for all multinomial distributions examined.