
Showing papers in "Annals of Mathematical Statistics in 1958"



Journal ArticleDOI
TL;DR: In this article, the authors show that the treatment mean square and the treatment $\times$ group interaction can be tested in the same approximate fashion by using the Box procedure, and that the conservative test would be $F(1, n - 1)$.
Abstract: The mixed model in a 2-way analysis of variance is characterized by a fixed classification, e.g., treatments, and a random classification, e.g., plots or individuals. If we consider $k$ different treatments each applied to every one of $n$ individuals, and assume the usual analysis of variance assumptions of uncorrelated errors, equal variances and normality, an appropriate analysis for the set of $nk$ observations $x_{ij}, i = 1, 2, \cdots, n, j = 1, 2, \cdots, k$, is the usual two-way analysis of variance, with the treatment mean square tested against the treatment $\times$ individual mean square, where the $F$ ratio under the null hypothesis has the $F$ distribution with $(k - 1)$ and $(k - 1)(n - 1)$ degrees of freedom. As is well known, if we extend the situation so that the errors have equal correlations instead of being uncorrelated, the $F$ ratio has the same distribution. Under the null hypothesis, the numerator estimates the same quantity as the denominator, namely, $(1 - \rho)\sigma^2$, where $\rho$ is the constant correlation coefficient among the treatments. This case can also be considered as a sampling of $n$ vectors (individuals) from a $k$-variate normal population with variance-covariance matrix $$V = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}.$$ If we consider this type of formulation and suppose the $k$ treatment errors to have a multivariate normal distribution with unknown variance-covariance matrix (the same for each individual), then the usual test described above is valid for $k = 2$. For $k > 2$, and $n \geqq k$, Hotelling's $T^2$ is the appropriate test for the homogeneity of the treatment means. However, the working statistician is sometimes confronted with the case where $k > n$, or he does not have the adequate means for computing large order inverse matrices and would therefore like to use the original test ratio, which in general does not have the requisite $F$ distribution.
Box [1] and [2] has given an approximate distribution of the test ratio to be $F\lbrack(k - 1)\epsilon, (k - 1)(n - 1)\epsilon\rbrack$ where $\epsilon$ is a function of the population variances and covariances and may further be approximated by the sample variances and covariances. We show in Section 3 that $\epsilon \geqq (k - 1)^{-1}$, and therefore a conservative test would be $F(1, n - 1)$. Box referred only to one group of $n$ individuals. We shall extend his results to a frequently occurring case, namely, the analysis of $g$ groups where the $\alpha$th group has $n_\alpha$ individuals, $\alpha = 1, 2, \cdots g$, and $\Sigma^g_{\alpha = 1} n_\alpha = N$. We will show that the treatment mean square and the treatment $\times$ group interaction can be tested in the same approximate fashion by using the Box procedure.
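The correction described above is easy to compute from a sample covariance matrix. Below is a minimal sketch (my own illustration with NumPy, not code from the paper; function names are assumptions) of the $\epsilon$ estimate obtained by replacing population variances and covariances by sample ones. It always lies between $(k - 1)^{-1}$ and 1, and equals 1 exactly under the equal-variance, equal-covariance structure shown above.

```python
import numpy as np

def box_epsilon(S):
    """Box's sphericity correction epsilon for a k x k covariance matrix S.

    Computed from the doubly centred covariance A = P S P, whose nonzero
    eigenvalues are those of the covariance of orthonormal contrasts:
        epsilon = tr(A)^2 / ((k - 1) * tr(A @ A)).
    Satisfies 1/(k-1) <= epsilon <= 1, with epsilon = 1 under compound
    symmetry (equal variances, equal covariances).
    """
    k = S.shape[0]
    P = np.eye(k) - np.ones((k, k)) / k   # centering projector
    A = P @ S @ P
    return float(np.trace(A) ** 2 / ((k - 1) * np.trace(A @ A)))
```

The adjusted test would then use $F\lbrack(k - 1)\epsilon, (k - 1)(n - 1)\epsilon\rbrack$; taking the worst case $\epsilon = (k - 1)^{-1}$ gives the conservative $F(1, n - 1)$ test mentioned above.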

1,102 citations


Journal ArticleDOI
TL;DR: In this paper, the estimation of a parameter lying in a subset of a set of possible parameters is considered, and the estimator considered lies in the subset and is a solution of likelihood equations containing a Lagrangian multiplier.
Abstract: The estimation of a parameter lying in a subset of a set of possible parameters is considered. This subset is the null space of a well-behaved function, and the estimator considered lies in the subset and is a solution of likelihood equations containing a Lagrangian multiplier. It is proved that, under certain conditions analogous to those of Cramer, these equations have a solution which gives a local maximum of the likelihood function. The asymptotic distribution of this `restricted maximum likelihood estimator' and an iterative method of solving the equations are discussed. Finally a test is introduced of the hypothesis that the true parameter does lie in the subset; this test, which is of wide applicability, makes use of the distribution of the random Lagrangian multiplier appearing in the likelihood equations.

598 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown for all alternative hypotheses that the Fisher-Yates-Terry-Hoeffding $c_1$-statistic is asymptotically normal and the test for translation based on it is at least as efficient as the $t$-test.
Abstract: Theorems of Wald and Wolfowitz, Noether, Hoeffding, Lehmann, Madow, and Dwass have given sufficient conditions for the asymptotic normality of $T_N$. In this paper we extend some of these results to cover more situations with $F \neq G$. In particular it is shown for all alternative hypotheses that the Fisher-Yates-Terry-Hoeffding $c_1$-statistic is asymptotically normal and the test for translation based on it is at least as efficient as the $t$-test. 2. Introduction. Finding the distributions of nonparametric test statistics and establishing optimum properties of these tests for small samples has progressed more slowly than the corresponding large sample theory. Even so, it is not possible to state that the basic framework of the large sample theory has been completed. Dwass [3] has recently presented a general theorem on the asymptotic normality of certain nonparametric test statistics under alternative hypotheses. His results, however, do not apply to such important and interesting procedures as the $c_1$-test [11]. Many papers have appeared giving the asymptotic efficiency of particular tests. Hodges and Lehmann [7] have discussed the asymptotic efficiency of the Wilcoxon test with respect to all translation alternatives. In the same paper they have conjectured that the $c_1$-test is as efficient as the $t$-test for normal alternatives and at least as efficient as the $t$-test for all other alternatives. The beginning of our work came from a desire to verify the Hodges and Lehmann conjecture. Related to the conjecture is the hypothesis that the $c_1$-statistic is asymptotically normally distributed. Thus our work has two parts: developing a new theorem for asymptotic normality of nonparametric test statistics and establishing the variational argument required for determining the minimum efficiency of test procedures.

468 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown that the distribution of the successive $x$'s is not uniquely determined by that of the $u$'s alone: the distribution of $x_0$ must also be specified. For $|\alpha| = 1$, a moment generating function is found, the inversion of which will yield the asymptotic distribution of the estimator of $\alpha$.
Abstract: Several authors have studied the discrete stochastic process $(x_t)$ in which the $x$'s are related by the stochastic difference equation \begin{equation*}\tag{1.1}x_t = \alpha x_{t - 1} + u_t, \quad t = 1,2, \cdots, T,\end{equation*} where the $u$'s are unobservable disturbances, independent and identically distributed with mean zero and variance $\sigma^2$, and $\alpha$ is an unknown parameter. The statistical problem is to find some appropriate function of the $x$'s as an estimator for $\alpha$ and examine its properties. We may rewrite (1.1) as \begin{equation*}\tag{1.2}x_t = u_t + \alpha u_{t - 1} + \cdots + \alpha^{t - 1}u_1 + \alpha^tx_0.\end{equation*} From (1.2) we see that the distribution of the successive $x$'s is not uniquely determined by that of the $u$'s alone. The distribution of $x_0$ must also be specified. Three distributions which have been proposed for $x_0$ are the following: (A) $x_0$ = a constant (with probability one), (B) $x_0$ is normally distributed with mean zero and variance $\sigma^2/(1 - \alpha^2)$, (C) $x_0 = x_T$. Distribution (B) is perhaps the most appealing from a physical point of view, since if $x_0$ has this distribution and if the $u$'s are normally distributed, then the process is stationary (e.g., see Koopmans [4]). However, there are several analytic difficulties which arise in the statistical treatment of this process. Distribution (C), the so-called circular distribution, has been proposed as an approximation to (B) and is much easier to analyze (e.g., see Dixon [2]). Distribution (A) has been studied extensively by Mann and Wald [5]. An interesting feature of distribution (A) is that $\alpha$ may assume any finite value, while for distributions (B) and (C) $\alpha$ must be between $-1$ and 1. 
From (1.2) we see that a process satisfying (1.1) and (A) has \begin{equation*}\tag{1.3}\operatorname{var}(x_t) = \sigma^2(1 + \alpha^2 + \cdots + \alpha^{2(t - 1)}).\end{equation*} If $|\alpha| \geqq 1$, $\lim_{t \to \infty} \operatorname{var}(x_t) = \infty$ and the process is said to be "explosive." Mann and Wald [5] considered only the case $|\alpha| < 1$. Here the explosive case is treated as well: for $|\alpha| > 1$, it is shown that the asymptotic distribution of the estimator of $\alpha$ is the Cauchy distribution. For $|\alpha| = 1$, a moment generating function is found, the inversion of which will yield the asymptotic distribution.
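A quick simulation makes the setup concrete. The sketch below (my own illustration with NumPy, not from the paper; names are assumptions) generates the process (1.1) under case (A), $x_0$ a fixed constant, and computes the least-squares estimator $\hat\alpha = \sum_t x_t x_{t-1} / \sum_t x^2_{t-1}$.

```python
import numpy as np

def simulate_ar1(alpha, sigma, T, x0=0.0, seed=0):
    """Case (A): x_0 a constant. Generates x_t = alpha*x_{t-1} + u_t,
    u_t i.i.d. N(0, sigma^2), for t = 1, ..., T."""
    rng = np.random.default_rng(seed)
    x = np.empty(T + 1)
    x[0] = x0
    for t in range(1, T + 1):
        x[t] = alpha * x[t - 1] + rng.normal(0.0, sigma)
    return x

def ls_estimate(x):
    """Least-squares estimator of alpha: sum x_t*x_{t-1} / sum x_{t-1}^2."""
    return float(np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1]))
```

In the explosive case $|\alpha| > 1$ the estimator converges very quickly, consistent with its nonstandard (Cauchy) limit theory; in the stable case it behaves like an ordinary root-$T$-consistent estimator.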

415 citations



Journal ArticleDOI
TL;DR: In this article, the authors give an unbiased estimator of the squared multiple correlation, which is a strictly increasing function of the usual estimator, differing from it only by terms of order $1/n$ and consequently having the same asymptotic distribution.
Abstract: 1. Summary and introduction. This paper deals with the unbiased estimation of the correlation of two variates having a bivariate normal distribution (Sec. 2), and of the intraclass correlation, i.e., the common correlation coefficient of a $p$-variate normal distribution with equal variances and equal covariances (Sec. 3). In both cases, the estimator has the following properties. It is a function of a complete sufficient statistic and is therefore the unique (except for sets of probability zero) minimum variance unbiased estimator. Its range is the region of possible values of the estimated quantity. It is a strictly increasing function of the usual estimator, differing from it only by terms of order $1/n$ and consequently having the same asymptotic distribution. Since the unbiased estimators are cumbersome in form in that they are expressed as series or integrals, tables are included giving the unbiased estimators as functions of the usual estimators. In Sec. 4 we give an unbiased estimator of the squared multiple correlation. It has the properties mentioned above except that it may be negative, which the squared multiple correlation cannot be. In each case the estimator is obtained by inverting a Laplace transform. We are grateful to W. H. Kruskal and L. J. Savage for very helpful comments and suggestions, and to R. R. Blough for his able computations.

407 citations


Journal ArticleDOI
TL;DR: In this article, the Robbins-Monro procedure and the Kiefer-Wolfowitz procedure are considered, for which the magnitude of the $n$th step depends on the number of changes in sign in $(X_i - X_{i - 1})$ for $i = 2, \cdots, n$.
Abstract: Using a stochastic approximation procedure $\{X_n\}, n = 1, 2, \cdots$, for a value $\theta$, it seems likely that frequent fluctuations in the sign of $(X_n - \theta) - (X_{n - 1} - \theta) = X_n - X_{n - 1}$ indicate that $|X_n - \theta|$ is small, whereas few fluctuations in the sign of $X_n - X_{n - 1}$ indicate that $X_n$ is still far away from $\theta$. In view of this, certain approximation procedures are considered, for which the magnitude of the $n$th step (i.e., $X_{n + 1} - X_n$) depends on the number of changes in sign in $(X_i - X_{i - 1})$ for $i = 2, \cdots, n$. In theorems 2 and 3, $$X_{n + 1} - X_n$$ is of the form $b_nZ_n$, where $Z_n$ is a random variable whose conditional expectation, given $X_1, \cdots, X_n$, has the opposite sign of $X_n - \theta$ and $b_n$ is a positive real number. $b_n$ depends in our processes on the changes in sign of $$X_i - X_{i - 1}(i \leqq n)$$ in such a way that more changes in sign give a smaller $b_n$. Thus the smaller the number of changes in sign before the $n$th step, the larger we make the correction on $X_n$ at the $n$th step. These procedures may accelerate the convergence of $X_n$ to $\theta$, when compared to the usual procedures ([3] and [5]). The result that the considered procedures converge with probability one may be useful for finding optimal procedures. Application to the Robbins-Monro procedure (Theorem 2) seems more interesting than application to the Kiefer-Wolfowitz procedure (Theorem 3).
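A hypothetical rendering of such a procedure, in the spirit of the abstract but with all names and details my own: the step factor $b_n$ is held at $b_0/c_n$, where $c_n$ counts the sign changes of $X_i - X_{i-1}$ observed so far, so few sign changes (far from $\theta$) keep the steps large while frequent sign changes (near $\theta$) shrink them.

```python
import numpy as np

def accelerated_rm(observe, x0=0.0, b0=1.0, steps=5000, seed=0):
    """Robbins-Monro-type iteration x_{n+1} = x_n - b_n * y_n, where
    observe(x, rng) returns a noisy value of an increasing regression
    function with root theta, and b_n = b0 / (1 + sign changes of the
    increments X_i - X_{i-1} seen so far)."""
    rng = np.random.default_rng(seed)
    x, prev_diff, changes = x0, 0.0, 1
    for _ in range(steps):
        y = observe(x, rng)
        diff = -(b0 / changes) * y        # the n-th step X_{n+1} - X_n
        if diff * prev_diff < 0:
            changes += 1                  # more sign changes -> smaller b_n
        x, prev_diff = x + diff, diff
    return x
```

Compared with the fixed schedule $b_n = b_0/n$, the step size here decays only as fast as the iterate's oscillations warrant, which is the acceleration idea described above.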

403 citations






Journal ArticleDOI
TL;DR: In this paper, the authors show that many commonly employed symmetrical designs such as Balanced Incomplete Block Designs (BIBDs), Latin Squares (LS's), Youden Squares, etc., have optimum properties among the class of non-randomized designs.
Abstract: Many commonly employed symmetrical designs such as Balanced Incomplete Block Designs (BIBD's), Latin Squares (LS's), Youden Squares (YS's), etc., are shown to have optimum properties among the class of non-randomized designs (Section 3). This represents an extension of a property first proved by Wald for LS's in [1]; a similar property demonstrated by Ehrenfeld for LS's in [2] (as well as a third optimum property considered here) is shown to be an immediate consequence of the Wald property, and the Wald property is shown to be the more relevant when one considers optimality rigorously (Section 2). Surprisingly, all of these optimum properties fail to hold if randomized designs are considered (Section 4); the results of Sections 2 and 3, as well as those appearing previously in the literature (as in [1], [2], [3]) must be interpreted in this sense. Generalizations of the BIBD's and YS's, for which analogous results hold, are introduced.

Journal ArticleDOI
TL;DR: Theorem 2.1 provides the basis for a stepwise procedure for identifying which restricting sets contain the minimizing point on their boundaries, when both the function to be minimized and the restricting sets are convex; it makes no contribution, however, to the problem of finding the minimizing point on a given boundary or intersection of boundaries.
Abstract: There are collected in this paper several observations and results more or less loosely related by their connections with the subject mentioned in the title. The discussion moves from the general to the specific, beginning with some remarks on minimization of convex functions subject to side conditions, and ending with a discussion of uniform consistency of estimators of linearly ordered parameters. Section 2 deals with one aspect of the problem of minimizing a function of several variables, subject to side conditions which specify that the variables must satisfy certain inequalities. It is frequently true in such problems that information as to which of the restricting sets contain the minimizing point on their boundaries is of great assistance in finding this point. Theorem 2.1 provides the basis for a stepwise procedure leading to this information when both the function to be minimized and the restricting sets are convex. It makes no contribution, however, to the problem of finding the minimizing point on a given boundary or intersection of boundaries. Brief mention is made in Section 3 of some examples of estimation problems for which the remark to which Section 2 is devoted is appropriate. Section 4 is concerned with a situation in which samples are taken from $k$ populations, each known to belong to a given one-parameter "exponential family". The problem is the maximum likelihood estimation of the $k$ parameters determining the populations, subject to certain restrictions. Methods are discussed of finding the minimizing point on a given intersection of boundaries of restricting sets. In the particular case when all populations belong to the same exponential family and when the restrictions on the parameters are order restrictions, it is observed that the maximum likelihood estimators (MLE's) of the means are independent of the particular exponential family. 
In Section 5 is discussed a property, related to sufficiency, of the MLE's discussed in Section 4. Let $y$ denote a vector representing a set of possible values of the MLE's, $E$ a Borel subset of the sample space, $\tau$ a parameter point, $S_0$ the intersection of the restricting sets. If $S_0$ is bounded by hyperplanes, there is a determination of the conditional probability $p_\tau(E \mid y)$ which is independent of $\tau$ when $y$ is interior to $S_0$, and, when $y$ lies on a face, edge, or vertex of $S_0$, is independent of $\tau$ on the closure of that face, edge, or vertex. This result may be regarded as a generalization of a remark ([16], p. 77) to the effect that if $X$ and $Y$ are normally distributed random variables with unit standard deviation and means $\xi$ and $\eta$ respectively, and if $\xi$ and $\eta$ are known to satisfy a linear equation, then the foot of the perpendicular from the observation point $(x, y)$ to the line is a sufficient estimator. Section 6 is devoted to the same problem as are Sections 4 and 5, except that the parameters are linearly ordered, and that the populations need not belong to exponential families. Conditions are obtained for the strong uniform consistency of an estimator which is the MLE when the populations do belong to the same exponential family. An asymptotic lower bound is given for the probability of achieving a given precision uniformly.

Journal ArticleDOI
TL;DR: In this article, methods are given for constructing sets of simultaneous confidence intervals for the means of variables which follow a multivariate normal distribution, including sets whose confidence is known exactly, rather than merely being bounded below.
Abstract: Methods are given for constructing sets of simultaneous confidence intervals for the means of variables which follow a multivariate normal distribution. In section (3), a set of confidence intervals is obtained for each of two special cases; first when the variances are assumed to be known, and second when the variances are assumed to be equal. These two sets have the property that the confidence is known exactly, rather than merely being bounded below. In the case of known variances, the intervals are of fixed lengths (i.e., the lengths are the same from sample to sample); when the variances are unknown, the intervals are of variable lengths. It may be surprising to note that nothing need be known about the covariances in order to obtain confidence intervals of fixed lengths whose confidence coefficient is exact. These intervals are long, and do not make use of all the information provided by the sample. Each of sections (4) to (7) considers a different method for obtaining confidence intervals of bounded confidence level. In each section a set of fixed lengths is obtained when the variances are assumed to be known, while a set of variable lengths is obtained when the variances are unknown but equal. In section (5) the set of variable lengths applies to the general multivariate normal distribution, all the other confidence intervals in this paper require some assumption concerning the variances. In section (8) the sets of intervals are compared on the basis of length. One of the bounded confidence level methods, which has been established only for two or three variables or for an arbitrary number of variables with a special type of correlation matrix, is shown to yield the best possible set. Another of the bounded confidence level methods, whose use is established in general, is shown to be almost as good as the best set for confidence coefficients of practical interest. 
It is interesting to notice that intervals with bounded confidence level are found which are much shorter than the ones whose confidence level is exact. This need not surprise us, however. In the case of just one variable, we might easily find that the 95% confidence intervals for the mean using the $t$-statistic were shorter on the average than 94% confidence intervals using order statistics. Moreover, since in admitting sets of confidence intervals with bounded confidence level we consider a much broader class of methods, we might almost expect that some of them would give better intervals.
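The "bounded below" idea can be illustrated with the familiar Bonferroni device (my own example, not necessarily one of the paper's methods; names are assumptions): running each of $k$ individual intervals at level $1 - (1 - \gamma)/k$ guarantees joint confidence at least $\gamma$, with fixed lengths when the variances are known, and without any assumption on the covariances.

```python
from statistics import NormalDist

def bonferroni_intervals(means, sds, n, conf=0.95):
    """Simultaneous fixed-length intervals for k means with known variances.

    Each interval uses the normal quantile at level 1 - (1 - conf)/(2k),
    so by the Bonferroni inequality the joint confidence is *at least*
    conf -- bounded below rather than exact."""
    k = len(means)
    z = NormalDist().inv_cdf(1 - (1 - conf) / (2 * k))
    half = [z * s / n ** 0.5 for s in sds]
    return [(m - h, m + h) for m, h in zip(means, half)]
```

For $k = 1$ this reduces to the ordinary known-variance interval; as $k$ grows, each interval widens only through the quantile, so the lengths stay fixed from sample to sample.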

Journal ArticleDOI
TL;DR: In this paper, a new method which generates B.A.N. estimates as roots of certain linear forms is introduced and investigated, and a particular application of the method, the estimation of the bacterial density in an experiment using dilution series is considered.
Abstract: Various minimum $\chi^2$ methods used for generating B.A.N. estimates are summarized, and a new method which generates B.A.N. estimates as roots of certain linear forms is introduced and investigated. As a particular application of the method, the estimation of the bacterial density in an experiment using dilution series is considered.

Journal ArticleDOI
TL;DR: In this paper, it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference provided the number of variables (or more precisely, r) can be made sufficiently large.
Abstract: 0. Summary. The classical multivariate 2-sample significance test based on Hotelling's $T^2$ is undefined when the number $k$ of variables exceeds the number of within-sample degrees of freedom available for estimation of variances and covariances. Addition of an a priori Euclidean metric to the affine $k$-space assumed by the classical method leads to an alternative approach to the same problem. A test statistic $F$ which is the ratio of 2 mean square distances is proposed and 3 methods of attaching a significance level to $F$ are described. The third method is considered in detail and leads to a "non-exact" significance test where the null hypothesis distribution of $F$ depends, in approximation, on a single unknown parameter $r$ for which an estimate must be substituted. Approximate distribution theory leads to 2 independent estimates of $r$ based on nearly sufficient statistics and these may be combined to yield a single estimate. A test of $F$ nominally at the 5% level but based on an estimate of $r$ rather than $r$ itself has a true significance level which is a function of $r$. This function is investigated and shown to be quite near 5%. The sensitivity of the test to a parameter measuring statistical distance between population means is discussed and it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference provided the number of variables (or, more precisely, $r$) can be made sufficiently large. This sensitivity discussion has stated implications for the a priori choice of metric in $k$-space. Finally a geometrical description of the case of large $r$ is presented. 1. Introduction. The statistical problem here treated is that of significance testing for the difference of the means of 2 $k$-variate populations which may be assumed to have the same structure of variances and covariances, the test being based on a sample from each population with sample sizes denoted by $n_1$ and $n_2$.
It is intended to provide a method applicable to data where the number $k$ of characteristics measured on each individual is large but where the number of individuals measured may be quite small. The usual method of classical multivariate statistics encounters a mathematical barrier and becomes inapplicable when $k > n_1 + n_2 - 2$, but certainly the need has arisen in applied statistical work for techniques handling small samples of highly described individuals. The classical method has 2 equivalent formulations in terms of the $T^2$ statistic
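The statistic described, a ratio of a between-sample to a within-sample mean-square distance, can be sketched as follows. This is my own plausible rendering using the pooled within-group sum of squares, not necessarily the paper's exact definition; note that it stays finite even when $k > n_1 + n_2 - 2$, where Hotelling's $T^2$ is undefined.

```python
import numpy as np

def mean_square_distance_f(X, Y):
    """Ratio of between-sample to within-sample mean-square distance for
    samples X (n1 x k) and Y (n2 x k) under an a priori Euclidean metric.
    A hypothetical form of the 'non-exact' statistic described in the text."""
    n1, n2 = X.shape[0], Y.shape[0]
    d = X.mean(axis=0) - Y.mean(axis=0)
    between = (n1 * n2 / (n1 + n2)) * float(d @ d)    # squared distance of means
    W = np.vstack([X - X.mean(axis=0), Y - Y.mean(axis=0)])
    within = float((W * W).sum()) / (n1 + n2 - 2)     # pooled mean square
    return between / within
```

No matrix inversion is required, which is exactly the feature that makes this kind of statistic usable for "small samples of highly described individuals."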

Journal ArticleDOI
TL;DR: In this paper, a solution for the problem of obtaining a distribution-free one-sided confidence interval for $p = \Pr \{Y < X\}$ has been proposed.
Abstract: A solution for the problem of obtaining a distribution-free one-sided confidence interval for $p = \Pr \{Y < X\}$ has been proposed in [1]. At present a numerical procedure is given for computing the sample sizes needed for such a confidence interval with given width and confidence level.
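The parameter $p = \Pr\{Y < X\}$ admits a simple distribution-free estimate: the proportion of sample pairs with $y_j < x_i$, i.e., the Mann-Whitney proportion. A minimal sketch (my own illustration; the name is an assumption):

```python
def prob_y_less_x(x, y):
    """Distribution-free estimate of p = P(Y < X): the fraction of the
    len(x) * len(y) pairs (x_i, y_j) with y_j < x_i."""
    count = sum(1 for xi in x for yj in y if yj < xi)
    return count / (len(x) * len(y))
```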

Journal ArticleDOI
TL;DR: In this paper, a step-down procedure for multivariate analysis of variance is proposed, where the variates are arranged in descending order of importance and the compound hypothesis is accepted if and only if each of the univariate hypotheses are accepted.
Abstract: Test criteria for (i) multivariate analysis of variance, (ii) comparison of variance-covariance matrices, and (iii) multiple independence of groups of variates when the parent population is multivariate normal are usually derived either from the likelihood-ratio principle [6] or from the "union-intersection" principle [2]. An alternative procedure, called the "step-down" procedure, has been recently used by Roy and Bargmann [5] in devising a test for problem (iii). In this paper the step-down procedure is applied to problems (i) and (ii) in deriving new tests of significance and simultaneous confidence-bounds on a number of "deviation-parameters." The essential point of the step-down procedure in multivariate analysis is that the variates are supposed to be arranged in descending order of importance. The hypothesis concerning the multivariate distribution is then decomposed into a number of hypotheses--the first hypothesis concerning the marginal univariate distribution of the first variate, the second hypothesis concerning the conditional univariate distribution of the second variate given the first variate, the third hypothesis concerning the conditional univariate distribution of the third variate given the first two variates, and so on. For each of these component hypotheses concerning univariate distributions, well known test procedures with good properties are usually available, and these are made use of in testing the compound hypothesis on the multivariate distribution. The compound hypothesis is accepted if and only if each of the univariate hypotheses is accepted. It so turns out that the component univariate tests are independent, if the compound hypothesis is true. It is therefore possible to determine the level of significance of the compound test in terms of the levels of significance of the component univariate tests and to derive simultaneous confidence-bounds on certain meaningful parametric functions on the lines of [3] and [4].
The step-down procedure obviously is not invariant under a permutation of the variates and should be used only when the variates can be arranged on a priori grounds. Some advantages of the step-down procedure are (i) the procedure uses widely known statistics like the variance-ratio, (ii) the test is carried out in successive stages and if significance is established at a certain stage, one can stop at that stage and no further computations are needed, and (iii) it leads to simultaneous confidence-bounds on certain meaningful parametric functions. 1.1 Notations. The operator $\varepsilon$ applied to a matrix of random variables is used to generate the matrix of expected values of the corresponding random variables. The form of a matrix is denoted by a subscript; thus $A_{n \times m}$ indicates that the matrix $A$ has $n$ rows and $m$ columns. The maximum latent root of a square matrix $B$ is denoted by $\lambda_{\max}(B)$. Given a vector $a = (a_1, a_2, \cdots, a_t)'$ and a subset $T$ of the natural numbers $1, 2, \cdots, t$, say $T = (j_1, j_2, \cdots, j_u)$ where $j_1 < j_2 < \cdots < j_u$, the notation $T\lbrack a\rbrack$ will be used to denote the positive quantity: $$T\lbrack a\rbrack = + \{a^2_{j_1} + a^2_{j_2} + \cdots + a^2_{j_u}\}^{1/2}.$$ $T\lbrack a\rbrack$ will be called the $T$-norm of $a$. Similarly, given a matrix $B_{t \times t}$, we shall write $B_{(T)}$ for the $u \times u$ submatrix formed by taking the $j_1$th, $j_2$th, $\cdots, j_u$th rows and columns of $B$. We shall call $B_{(T)}$ the $T$-submatrix of $B$.
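Because the component tests are independent under the compound hypothesis, the level bookkeeping is a one-liner: with component levels $\alpha_1, \cdots, \alpha_p$, the compound level is $1 - \prod_i (1 - \alpha_i)$. A sketch (function name mine):

```python
def stepdown_level(alphas):
    """Overall significance level of the step-down compound test, given the
    levels of the independent component univariate tests:
        alpha = 1 - prod_i (1 - alpha_i)."""
    keep = 1.0
    for a in alphas:
        keep *= 1.0 - a
    return 1.0 - keep
```

For example, two stages each run at 5% give an overall level of 9.75%; conversely, one can choose the $\alpha_i$ so that the product hits a target overall level.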

Journal ArticleDOI
TL;DR: In this article, the generalized variance was used as a criterion for the efficiency of estimating the coefficients of a polynomial regression curve of given degree for the classical regression model and for models based on a particular stationary stochastic process.
Abstract: Using the generalized variance as a criterion for the efficiency of estimation, the best choice of fixed variable values within an interval for estimating the coefficients of a polynomial regression curve of given degree is determined for the classical regression model. Using this same criterion, some results are obtained on the increased efficiency arising from doubling the number of equally spaced observation points (i) when the total interval is fixed and (ii) when the total interval is doubled. Measures of the increased efficiency are found for the classical regression model and for models based on a particular stationary stochastic process and a pure birth stochastic process.
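The criterion is easy to evaluate for any candidate design: with unit error variance, the generalized variance of the least-squares coefficient estimates is $\det\lbrack(X'X)^{-1}\rbrack$, where $X$ is the polynomial design matrix on the chosen points. A sketch (my own illustration with NumPy; the function name is an assumption):

```python
import numpy as np

def generalized_variance(points, degree):
    """det((X'X)^{-1}) for polynomial regression of the given degree on the
    given fixed design points, with unit error variance: the generalized
    variance of the least-squares coefficient estimates. Requires at least
    degree + 1 distinct points."""
    X = np.vander(np.asarray(points, dtype=float), degree + 1)
    return 1.0 / np.linalg.det(X.T @ X)
```

Smaller is better under this criterion; for a straight line on $[-1, 1]$ with two observations, placing them at the endpoints beats any interior placement, illustrating why the best choice of points concentrates on the extremes of the interval.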

Journal ArticleDOI
TL;DR: In this article, a test based on the union-intersection principle is proposed for overall independence between variates distributed according to the multivariate normal law, and this is extended to the hypothesis of independence between several groups of variates which have a joint multiivariate normal distribution.
Abstract: In this paper a test based on the union-intersection principle is proposed for overall independence between $p$ variates distributed according to the multivariate normal law, and this is extended to the hypothesis of independence between several groups of variates which have a joint multivariate normal distribution. Methods used in earlier papers [3, 4] have been applied in order to invert these tests for each situation, and to obtain, with a joint confidence coefficient greater than or equal to a preassigned level, simultaneous confidence bounds on certain parametric functions. These parametric functions are, in case I, the moduli of the regression vectors: (a) of the variate $p$ on the variates $(p - 1), (p - 2), \cdots, 2, 1,$ or on any subset of the latter; (b) of the variate $(p - 1)$ on the variates $(p - 2), (p - 3), \cdots, 2, 1,$ or any subset of the latter, etc.; and finally, (c) of the variate 2 on the variate 1. For case II, parallel to each case considered above, there is an analogous statement in which the regression vector is replaced by a regression matrix, $\beta$, say, and the "modulus" of the regression vector is replaced by the (positive) square-root of the largest characteristic root of $(\beta\beta')$. Simultaneous confidence bounds on these sets of parameters are given. As far as the proposed tests of hypotheses of multiple independence are concerned they are offered as an alternative to another class of tests based on the likelihood-ratio criterion [5, 6] which has been known for a long time. So far as the confidence bounds are concerned it is believed, however, that no other easily obtainable confidence bounds are available in this area. One of the objects of these confidence bounds is the detection of the "culprit variates" in the case of rejection of the hypothesis of multiple independence, for the "complex" hypothesis is, in this case, the intersection of several more "elementary" hypotheses of two-by-two independence.


Journal ArticleDOI
TL;DR: For finite-state indecomposable channels, Shannon's basic theorem, that transmission is possible at any rate less than channel capacity but not at any greater rate, is proved.
Abstract: For finite-state indecomposable channels, Shannon's basic theorem, that transmission is possible at any rate less than channel capacity but not at any greater rate, is proved. A necessary and sufficient condition for indecomposability, from which it follows that every channel with finite memory is indecomposable, is given. An important tool is a modification, for some processes which are not quite stationary, of theorems of McMillan and Breiman on probabilities of long sequences in ergodic processes.

Journal ArticleDOI
TL;DR: In this article, a procedure is given for selecting a subset such that the probability that all the populations better than the standard are included in the subset is equal to or greater than a predetermined number.
Abstract: A procedure is given for selecting a subset such that the probability that all the populations better than the standard are included in the subset is equal to or greater than a predetermined number $P^{\ast}$. Section 3 deals with the problem of the location parameter for the normal distribution with known and unknown variance. Section 4 deals with the scale parameter problem for the normal distribution with known and unknown mean, as well as the chi-square distribution. Section 5 deals with binomial distributions where the parameter of interest is the probability of failure on a single trial. In each of the above cases, the known-standard and unknown-standard situations are treated separately. Tables are available for some problems; in other problems transformations are used such that the given tables are again appropriate.


Book ChapterDOI
TL;DR: In this paper, the authors propose a test (decision rule) which makes the probability of an erroneous decision small when $F$ belongs to $G$ or $H$, and at the same time exercises some control over the number of observations required to reach a decision when $F$ is in $\mathcal{F}$.
Abstract: Suppose it is desired to make one of two decisions, $d_1$ and $d_2$, on the basis of independent observations on a chance variable whose distribution $F$ is known to belong to a set $\mathcal{F}$. There are given two subsets $G$ and $H$ of $\mathcal{F}$ such that decision $d_1$ ($d_2$) is strongly preferred if $F$ is in $G$ ($H$). Then it is reasonable to look for a test (decision rule) which makes the probability of an erroneous decision small when $F$ belongs to $G$ or $H$, and at the same time exercises some control over the number of observations required to reach a decision when $F$ is in $\mathcal{F}$ (not only in $G$ or $H$).

Journal ArticleDOI
TL;DR: In this paper, the authors derived the Pitman limiting power of the frequency $\chi^2$-test when the unknown parameters occurring in the specification of class probabilities are estimated from the sample by an asymptotically efficient method like the method of maximum likelihood, minimum $\chi^2$, etc.
Abstract: in the usual manner, Cochran, in an expository article [10], has suggested the derivation of its Pitman limiting power [11], and he illustrated it in the case of the simple goodness-of-fit test. The concept of asymptotic power suggested by Pitman has also been used extensively in various other areas, such as nonparametric inference (see, e.g., Hoeffding and Rosenblatt [12]), and seems to be a useful tool for comparing alternative consistent tests, or alternative designs for experimentation, with regard to their performance in the immediate neighbourhood of the null hypothesis. The consistency of the frequency $\chi^2$-test has already been established by Neyman [13]. The object of the present paper is to obtain the Pitman limiting power of this test when the unknown parameters occurring in the specification of class probabilities are estimated from the sample by an asymptotically efficient method like the method of maximum likelihood, minimum $\chi^2$, etc. In Section 5, we discuss a few applications of the Pitman limiting power for frequency $\chi^2$-tests.
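For the simple goodness-of-fit case the abstract mentions, the Pitman limiting power has a standard closed form: under local alternatives $p_i + \delta_i/\sqrt{n}$ the frequency $\chi^2$ statistic is asymptotically noncentral $\chi^2$ with noncentrality $\lambda = \sum_i \delta_i^2/p_i$. The sketch below computes that limiting power numerically; it is an illustration of the classical simple-hypothesis case, not of the paper's extension to estimated parameters, and the function name is ours.

```python
# Illustrative sketch (simple goodness-of-fit case only, not the paper's
# estimated-parameter extension): Pitman limiting power of the frequency
# chi-square test under local alternatives p_i + delta_i / sqrt(n).
import numpy as np
from scipy.stats import chi2, ncx2

def pitman_power(p, delta, alpha=0.05):
    """Limiting power of the simple goodness-of-fit chi-square test."""
    p = np.asarray(p, dtype=float)
    delta = np.asarray(delta, dtype=float)
    assert abs(delta.sum()) < 1e-12        # perturbations must sum to zero
    df = len(p) - 1                        # k classes -> k - 1 d.f.
    lam = float(np.sum(delta ** 2 / p))    # noncentrality parameter
    crit = chi2.ppf(1 - alpha, df)         # central chi-square critical value
    return float(ncx2.sf(crit, df, lam))   # noncentral chi-square tail prob.

# Four equiprobable classes with a small local perturbation:
power = pitman_power([0.25] * 4, [0.1, -0.1, 0.05, -0.05])
```

Doubling the perturbation vector quadruples the noncentrality, so the limiting power increases monotonically with the size of the local departure.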

Journal ArticleDOI
TL;DR: De la Garza as mentioned in this paper considered the estimation of a polynomial of degree n from observations in a given range of the independent variable n. This range may conveniently be taken to be from $+1$ to $-1$ and showed that for any arbitrary distribution of the points of observation there was a distribution of n observations at only $p + 1$ points for which the variances were the same.
Abstract: De la Garza ([1], [2]) has considered the estimation of a polynomial of degree $p$ from $n$ observations in a given range of the independent variable $x$. This range may conveniently be taken to be from $+1$ to $-1$. He showed that for any arbitrary distribution of the points of observation there was a distribution of the $n$ observations at only $p + 1$ points for which the variances (determined by the matrix $\mathbf{X}^T\mathbf{W X}$) were the same. He then considered how these $p + 1$ points should be distributed so that the maximum variance of the fitted value in the range of interpolation should be as small as possible. In the present note general formulae will be obtained for the distribution of the points of observation and for the variances of the fitted values in the minimax variance case, and the variances will be compared with those for the uniform spacing case.
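The variance function the abstract refers to can be written out directly: with all $n$ observations placed at $p + 1$ points $x_j$ with weights $w_j$ and unit error variance, $\operatorname{Var}(\hat{y}(x)) = f(x)'(\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}f(x)$ where $f(x) = (1, x, \cdots, x^p)'$. A small sketch (our own illustration, not de la Garza's or this note's formulae) evaluates it for the straight-line case:

```python
# Sketch (illustrative, not the paper's general formulae): variance of the
# fitted polynomial value for a design putting weight w_j at point x_j,
# assuming unit error variance: Var(yhat(x)) = f(x)' (X'WX)^{-1} f(x).
import numpy as np

def fitted_variance(points, weights, x, p):
    X = np.vander(points, p + 1, increasing=True)    # design matrix rows f(x_j)'
    W = np.asarray(weights, dtype=float)
    M = X.T @ (W[:, None] * X)                       # information matrix X'WX
    f = np.vander(np.atleast_1d(x), p + 1, increasing=True)
    return np.einsum('ij,jk,ik->i', f, np.linalg.inv(M), f)

# Straight line (p = 1), n = 10 observations split between the endpoints
# of [-1, 1]; here Var(yhat(x)) = (1 + x^2)/10, maximal at x = +/-1.
v = fitted_variance([-1.0, 1.0], [5.0, 5.0], np.linspace(-1, 1, 5), 1)
```

For this two-point design the maximum variance over the interpolation range is $2/n = 0.2$, attained at the endpoints, matching the minimax-variance discussion in the note.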

Journal ArticleDOI
TL;DR: In this paper, a multivariate Tchebycheff inequality is given, in terms of the covariances of the random variables in question, and it is shown that the inequality is sharp, i.e., the bound given can be achieved.
Abstract: A multivariate Tchebycheff inequality is given, in terms of the covariances of the random variables in question, and it is shown that the inequality is sharp, i.e., the bound given can be achieved. This bound is obtained from the solution of a certain matrix equation and cannot be computed easily in general. Some properties of the solution are given, and the bound is given explicitly for some special cases. A less sharp but easily computed and useful bound is also given.
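The one-dimensional bound that this paper generalizes is the classical $P(|X - \mu| \geq k\sigma) \leq 1/k^2$. The sketch below (our illustration of the univariate case only, not the paper's multivariate matrix bound) checks it empirically for a skewed distribution:

```python
# Sketch of the classical one-dimensional Tchebycheff inequality,
# P(|X - mu| >= k*sigma) <= 1/k^2, checked empirically for an
# exponential distribution (mean 1, standard deviation 1).
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200_000)
mu, sigma = x.mean(), x.std()
for k in (1.5, 2.0, 3.0):
    freq = np.mean(np.abs(x - mu) >= k * sigma)  # empirical tail frequency
    assert freq <= 1 / k ** 2                    # distribution-free bound holds
```

The bound is distribution-free, which is why it is loose for any particular distribution; the paper's contribution is a multivariate version that is sharp given the covariance structure.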

Journal ArticleDOI
TL;DR: In this paper, a table for computing trivariate normal probabilities is presented, and a derivation of the relationship between the trivariate normal integral and the tabulated function, a description of the table, a numerical example, and an extension of the method to higher dimensions are given.
Abstract: A table for computing trivariate normal probabilities is given. A summary of the formulas used, a derivation of the relationship between the trivariate normal integral and the tabulated function, a description of the table, a numerical example, and an extension of the method to higher dimensions are also given.
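The trivariate normal integral this table supports can nowadays be evaluated numerically; the sketch below (our illustration using scipy's numerical CDF, not the paper's tabulated function) computes an orthant probability with a known exact value:

```python
# Sketch: trivariate normal probability via numerical integration
# (scipy), checked against the exact independent-case orthant
# probability P(X < 0, Y < 0, Z < 0) = (1/2)^3 = 0.125.
import numpy as np
from scipy.stats import multivariate_normal

cov = np.eye(3)                               # independent standard normals
dist = multivariate_normal(mean=np.zeros(3), cov=cov)
p = float(dist.cdf(np.zeros(3)))              # trivariate normal integral
```

Correlated cases are handled the same way by supplying a non-diagonal covariance matrix, which is the regime the tabulated function was built to cover.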