
Showing papers in "Biometrika in 1963"




Journal ArticleDOI
TL;DR: In this article, the recursive method proposed by Durbin (1960) for the fitting of autoregressive schemes of successively increasing order is generalized to the fitting of multivariate autoregressions, and of schemes with rational spectral density function.
Abstract: SUMMARY The recursive method proposed by Durbin (1960) for the fitting of autoregressive schemes of successively increasing order is generalized to the fitting of multivariate autoregressions, and of schemes with rational spectral density function. It is also shown that an autoregression fitted from the Yule-Walker relations, even if of insufficient order, has the necessary stability properties. This property holds in the multivariate case, too, and is important in connexion with a problem arising in multivariate prediction: the approximate factorization of a spectral density matrix.
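The recursion referred to here is, in its univariate form, the Levinson-Durbin algorithm. A minimal Python sketch of that univariate case, assuming the autocovariances r(0), ..., r(p) are already available, is given below; the multivariate and rational-spectral-density generalizations treated in the paper are not shown, and all names are illustrative.

```python
import numpy as np

def levinson_durbin(r, max_order):
    """Fit AR(1)..AR(max_order) recursively from autocovariances r[0..max_order]
    via the Yule-Walker relations (the univariate Durbin recursion)."""
    phi = np.zeros(0)        # coefficients of the current order
    sigma2 = r[0]            # innovation variance for order 0
    fits = []
    for m in range(1, max_order + 1):
        # reflection coefficient for stepping from order m-1 to order m
        k = (r[m] - phi @ r[1:m][::-1]) / sigma2
        phi = np.concatenate([phi - k * phi[::-1], [k]])
        sigma2 *= (1.0 - k ** 2)
        fits.append((phi.copy(), sigma2))
    return fits

# toy usage: autocovariances of an AR(1) process with coefficient 0.6, unit innovations
rho = 0.6
r = np.array([rho ** k / (1 - rho ** 2) for k in range(6)])
for order, (phi, s2) in enumerate(levinson_durbin(r, 5), start=1):
    print(order, np.round(phi, 3), round(s2, 3))
```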

380 citations


Journal ArticleDOI
Akio Kudo
TL;DR: In this article, a multivariate analogue of the one-sided test of significance was developed for multivariate normal population with known variance matrix, which can be solved by either the normal or the t-distribution functions.
Abstract: In this paper we consider the following problem. Given a multivariate normal population with known variance matrix, what test is appropriate to determine whether the means are slipped to the right? In the case when the population is univariate normal, this problem can be solved by the ordinary one-sided test using either the normal or the t-distribution functions. It is the purpose of this paper to develop what may be termed a multivariate analogue of the one-sided test of significance.
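As a point of reference for the multivariate generalization developed in the paper, here is a small Python sketch of the ordinary univariate one-sided test mentioned above, using the normal distribution when the variance is known and the t-distribution otherwise; the function name and the simulated data are illustrative only.

```python
import numpy as np
from scipy import stats

def one_sided_test(x, mu0=0.0, sigma=None):
    """One-sided test of H0: mu = mu0 against H1: mu > mu0 ('slipped to the right').
    Uses the normal distribution when sigma is known, otherwise the t-distribution."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if sigma is not None:                                  # known variance: z-test
        z = (x.mean() - mu0) / (sigma / np.sqrt(n))
        return z, stats.norm.sf(z)
    t = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))    # unknown variance: t-test
    return t, stats.t.sf(t, df=n - 1)

rng = np.random.default_rng(0)
print(one_sided_test(rng.normal(0.3, 1.0, size=25), sigma=1.0))
```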

351 citations


Journal ArticleDOI
TL;DR: As outlined in this paper, the contents run from random events and their probabilities, through the classical limit theorem and the theory of infinitely divisible distributions, to the theory of stochastic processes and the elements of statistics.
Abstract: 1. Random Events and Their Probabilities 2. Sequences in Independent Trials 3. Markov Chains 4. Random Variables and Distribution Functions 5. Numerical Characteristics of Random Variables 6. The Law of Large Numbers 7. Characteristic Functions 8. The Classical Limit Theorem 9. The Theory of Infinitely Divisible Distributions 10. The Theory of Stochastic Processes 11. The Elements of Statistics Appendix

282 citations



Journal ArticleDOI
TL;DR: In this paper, the authors considered the linear relationship between certain pairs of physical vector quantities which may be described by 3 x 3 symmetric matrices and derived confidence intervals for the lengths of the principal axes.
Abstract: 1. SUMMARY The tensors considered in this paper are the linear relationships between certain pairs of physical vector quantities which may be described by 3 x 3 symmetric matrices. Such a linear relationship may well vary from point to point in a non-homogeneous medium. From measurements at a given point in the medium, the least squares estimates of the components of the tensor at that point are obtained; from these the tensor's principal axes may be estimated. On the assumption that the errors of measurement are small and normally distributed, confidence intervals for the lengths of the principal axes are derived, together with confidence regions on the unit sphere for the directions of these axes. Tests for equality of pairs of principal axes, for isotropy, and for comparing the tensors at two or more points are given. In certain common experimental situations, a design for this work may be represented by a set of points on the unit sphere. If such a design is rotatable (Box & Hunter, 1957), the estimates have optimum properties, and the confidence intervals and regions take on particularly simple forms. Seven rotatable designs are given, which should cover most practical requirements. These results are illustrated with a numerical example and diagram, taken from recent work on rock magnetism. Finally, all the results are extended to more general symmetric matrices, and some further rotatable designs are given.
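A rough Python sketch of the least-squares step is given below. It assumes a simple measurement model in which the response vector T·u is observed, with small errors, for each design direction u on the unit sphere; this model, and all names in the code, are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def design_rows(u):
    """Rows of the design matrix for the six free components
    (T11, T22, T33, T12, T13, T23) of a symmetric 3x3 tensor, so that each
    component of T @ u is linear in those six unknowns."""
    ux, uy, uz = u
    return np.array([
        [ux, 0, 0, uy, uz, 0],   # (T u)_x = T11 ux + T12 uy + T13 uz
        [0, uy, 0, ux, 0, uz],   # (T u)_y = T12 ux + T22 uy + T23 uz
        [0, 0, uz, 0, ux, uy],   # (T u)_z = T13 ux + T23 uy + T33 uz
    ])

def fit_tensor(directions, responses):
    """Least-squares estimate of a symmetric tensor from (direction, response) pairs,
    followed by an eigen-decomposition to estimate the principal axes."""
    X = np.vstack([design_rows(u) for u in directions])
    y = np.concatenate(responses)
    t, *_ = np.linalg.lstsq(X, y, rcond=None)
    T = np.array([[t[0], t[3], t[4]],
                  [t[3], t[1], t[5]],
                  [t[4], t[5], t[2]]])
    lengths, axes = np.linalg.eigh(T)   # principal-axis lengths and directions
    return T, lengths, axes

# toy usage with a known anisotropic tensor and small normal errors
rng = np.random.default_rng(1)
true_T = np.diag([3.0, 2.0, 1.0])
dirs = [u / np.linalg.norm(u) for u in rng.normal(size=(12, 3))]
obs = [true_T @ u + rng.normal(scale=0.01, size=3) for u in dirs]
T_hat, lengths, axes = fit_tensor(dirs, obs)
print(np.round(lengths, 3))
```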

202 citations


Journal ArticleDOI
TL;DR: In this article, the problem of experimental design which arises in the fitting of a graduating function, previously discussed by Box & Draper (1959), is briefly restated.
Abstract: When the actual functional form is not known, useful information about the nature of the actual relationship in some particular region R of the ξ space can often be obtained by approximating the relationship by a graduating function g(ξ, β), where β is a vector of adjustable constants. The graduating functions usually employed have been polynomials in the variables ξ. The problem of experimental design which arises in the fitting of a graduating function has been discussed by Box & Draper (1959) and will be briefly restated here. We desire to choose a design matrix D of N rows and k columns which will specify the levels of the k variables to be run in N experiments. Denote the uth row of this matrix by ξu. This vector has as elements the levels (ξ1u, ξ2u, ..., ξku) of the k factors to be employed in the uth experiment, u = 1, 2, ..., N. Our primary objective will be to choose these levels so that when the graduating function g(ξ, β) is fitted by least squares, it will closely represent the true function y(ξ) within the region of interest R. Subject to the satisfaction of our primary objective, we shall also require that the factor levels be such that there is a high chance that the inadequacy of the graduating function g(ξ, β) to represent y(ξ) will be detected. As will be seen, a subclass of all possible designs can be selected which will satisfy our primary requirement. We can then make use of our secondary requirement to make a selection of a particular design from this subclass. We now define 'region of interest', 'closely represent' and 'detection of inadequacy of model'.
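A small Python sketch of the basic fitting step, assuming a first-order polynomial graduating function in the k factors; the design, the true surface and the error scale in the example are purely illustrative.

```python
import numpy as np

def model_matrix(D):
    """Expand an N x k design matrix D (factor levels for N runs) into the model
    matrix of a first-order polynomial graduating function g = b0 + sum_i bi xi."""
    N = D.shape[0]
    return np.hstack([np.ones((N, 1)), D])

def fit_graduating_function(D, y):
    """Least-squares fit of the graduating function to responses y at the design points."""
    X = model_matrix(D)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta          # coefficients and fitted values

# toy usage: a 2^2 factorial design in coded levels, with a hypothetical true surface
D = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
rng = np.random.default_rng(2)
y = 5 + 2 * D[:, 0] - D[:, 1] + rng.normal(scale=0.1, size=4)
beta, fitted = fit_graduating_function(D, y)
print(np.round(beta, 2))
```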

195 citations




Journal ArticleDOI
TL;DR: In this article, a comparison is made of the distribution functions of three leptokurtic distributions, namely (i) the Pearson type IV, (ii) the non-central t, and (iii) Johnson's SU, when their first four moments have identical values.
Abstract: In the history of the development of statistical distribution theory there have been many instances where it has been possible to determine the sampling moments of the distribution of a statistic, without any immediate prospect of deriving the mathematical distribution itself in explicit form. Two examples of this are the distributions of (i) the Cramér-von Mises-Smirnov statistic WN (or Nω²) and (ii) the standardized fourth moment b2 = m4/m2² in samples from a normal universe (where ms is the sth central sample moment). In so far as there may be a number of alternative mathematical forms which could be used to approximate the unknown true distribution, the question arises as to how to select between them. Suppose, for example, that we take two different frequency functions each having the same first four moments as the unknown true distribution; should we expect that the empirical function whose higher moments are the closer to the true values will give the better representation? And what do we mean by better representation? In so far as a distribution can be represented by a Gram-Charlier or Fisher-Cornish type of expansion we might expect in theory that agreement in moments would lead to agreement in probability integrals, but it is well known that questions of the convergence of such expansions arise in the case of distributions which are far from normal. The distributions (i) and (ii) referred to above are indeed extremely leptokurtic. In the following paper it is proposed to draw together several hitherto unpublished investigations, some dating back a number of years, which bear on these points. In particular, we shall: (a) Consider the proportionate contributions, arising from different parts of the parent frequency, to each of the first six moments of certain selected distributions. (b) Make a comparison of the distribution functions of three leptokurtic distributions, namely (i) the Pearson type IV, (ii) the non-central t, and (iii) Johnson's SU, when their first four moments have identical values. (c) Apply some of the conclusions drawn from the studies (a) and (b) to the problem of determining significance points for the moment ratio statistics √b1 = m3/m2^(3/2) and b2 = m4/m2², used in testing for departure from normality. To help the comparison of distributions, we shall represent them as points on a (β1, β2) chart as illustrated in Figs. 2 and 3, where β1 = μ3²/μ2³ and β2 = μ4/μ2² are the usual moment ratios.
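For point (c), the moment-ratio statistics can be computed directly from a sample; a minimal Python sketch follows, using the definitions √b1 = m3/m2^(3/2) and b2 = m4/m2² quoted above.

```python
import numpy as np

def moment_ratios(x):
    """Sample moment-ratio statistics sqrt(b1) = m3 / m2^(3/2) and b2 = m4 / m2^2,
    where m_s is the s-th central sample moment."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2, m3, m4 = (np.mean(d ** s) for s in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2

rng = np.random.default_rng(3)
print(moment_ratios(rng.normal(size=1000)))   # close to (0, 3) for a normal sample
```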

112 citations



Journal ArticleDOI
TL;DR: In this paper, a solution to the problem of arbitrary K is given in terms of tolerance intervals on the distributions of future observations, the intervals being (probabilistically) simultaneous in each possible value of the independent variable.
Abstract: SUMMARY Joint prediction intervals (based upon the original fitted model) for K future responses at each of K separate settings of the independent variable have been treated by Lieberman (1961). When K is unknown and possibly arbitrarily large, these results do not apply. A solution to the problem of arbitrary K is given in terms of tolerance intervals on the distributions of future observations, the intervals being (probabilistically) simultaneous in each possible value of the independent variable. Four alternative techniques are proposed and compared for their applicability in different situations. The first is the simultaneous extension of the Wallis (1951) technique. The other three are based on Scheffé simultaneous confidence principles. One gives intervals for a fixed central proportion P of the distribution which are simultaneous in all values of the independent variable; the other two give intervals simultaneous in the independent variable and different central proportions P. A numerical example is analysed, and some remarks are made on the applicability of the Scheffé techniques to the detection of outliers.
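The Scheffé-type simultaneous-confidence ingredient can be illustrated with the familiar Working-Hotelling/Scheffé band for a fitted straight line; the Python sketch below shows only that band, not the tolerance-interval constructions developed in the paper, and the example data are invented.

```python
import numpy as np
from scipy import stats

def scheffe_band(x, y, x_new, alpha=0.05):
    """Scheffe (Working-Hotelling) simultaneous confidence band for a fitted straight
    line, valid for all values of the independent variable at once."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    half = np.sqrt(2 * stats.f.ppf(1 - alpha, 2, n - 2)) * s * np.sqrt(
        1.0 / n + (x_new - x.mean()) ** 2 / sxx)
    fit = b0 + b1 * x_new
    return fit - half, fit + half

rng = np.random.default_rng(10)
x = np.linspace(0, 10, 30)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=30)
lo, hi = scheffe_band(x, y, x_new=np.array([0.0, 5.0, 10.0]))
print(np.round(lo, 2), np.round(hi, 2))
```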

Journal ArticleDOI
TL;DR: In the course of a more general investigation, Roberts and Ursell (1960), as discussed by the authors, examined random walk on the surface of a sphere when each angular step is governed by a symmetrical density function; in particular, they gave the density function for a point which, starting at the north pole, moves with Brownian diffusion over the surface.
Abstract: In the course of a more general investigation, Roberts & Ursell (1960) examined random walk on the surface of a sphere when each angular step is governed by a symmetrical density function. In particular they gave the density function for a point which, starting at the north pole, moves with Brownian diffusion over the surface. The density of the polar coordinate θ, measuring the angular displacement of the pole, is
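A hedged Monte Carlo sketch of this setup in Python: walks start at the north pole, each step moves through an angle drawn from a symmetric density (a half-normal magnitude is used purely for illustration) in a uniformly random tangent direction, and the colatitude θ is recorded. The step distribution and all parameter values are assumptions for the example.

```python
import numpy as np

def random_walk_colatitude(n_steps, step_scale, n_walks, rng):
    """Simulate random walks on the unit sphere starting at the north pole and
    return the final colatitude theta (angular displacement from the pole)."""
    thetas = np.empty(n_walks)
    for w in range(n_walks):
        p = np.array([0.0, 0.0, 1.0])                  # north pole
        for _ in range(n_steps):
            alpha = abs(rng.normal(scale=step_scale))  # symmetric angular step size
            v = rng.normal(size=3)                     # random tangent direction
            t = v - (v @ p) * p
            t /= np.linalg.norm(t)
            p = np.cos(alpha) * p + np.sin(alpha) * t  # move along a great circle
        thetas[w] = np.arccos(np.clip(p[2], -1.0, 1.0))
    return thetas

rng = np.random.default_rng(4)
theta = random_walk_colatitude(n_steps=50, step_scale=0.1, n_walks=2000, rng=rng)
print(round(theta.mean(), 3))
```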


Journal ArticleDOI
TL;DR: In this paper, the authors examined the adequacy of diffusion approximations when the population size is small by comparing numerically the values given by diffusion methods with exact values calculated by a high-speed computer.
Abstract: In a well-known genetic model due to Wright (1931) exact expressions are not available for either the probability that a particular gene is eventually lost from the population by random elimination or for the mean time until elimination of one or other gene. However, in this model diffusion approximations are readily obtained for these quantities, and the purpose of this paper is to examine the adequacy of these and other approximations when the population size is small by comparing numerically the values given by diffusion methods with exact values calculated by a high-speed computer. The exact values were obtained simply by powering a transition matrix and also by a matrix inversion. We examine the diffusion approximation for the following quantities: (i) the probability that one or other gene is eventually eliminated; (ii) the mean time for such elimination; (iii) the probability that one or other gene is eliminated by the nth generation; (iv) the dominant non-unit latent root of a transition matrix. The numerical results obtained by powering a transition matrix are subject to the rounding error in the computer. However, these results were checked by adding the elements in each row of each power of the matrix. It was found that the sums thus obtained differed from unity at the most in the sixth decimal place, so that since all results are given here to four decimal places only they may be taken as being correct to this order. Although the population size considered is extremely small, so that diffusion results may not be expected to be very accurate, the numerical results obtained indicate that for populations of reasonable size the approximate results obtained by diffusion methods may be expected to be remarkably accurate. This conclusion is useful in that, in many cases, exact formulae for quantities of statistical interest are not available, while a comparatively simple formula exists for the diffusion approximation. As far as the four quantities (i)-(iv) mentioned above are concerned, the results obtained indicate that even though the population size considered is very small, the diffusion approximation for (i) is quite accurate, the approximation for (ii) is reasonably accurate but could be improved, the approximation for (iii) is accurate for large n, and the approximation for (iv) is quite fair. A further point of interest is as follows. It has been thought by some writers (e.g. Fisher, 1930; Kolmogorov, 1959) that diffusion methods break down near the boundaries within which the variate under consideration lies and that branching process techniques are necessary to examine the behaviour of the process near such points. This has led, amongst other things, to the necessity for 'fusing' the branching process results with the diffusion results in the neighbourhood of the boundaries. However, it is difficult to see from the nature and derivation of the diffusion equation why diffusion results should not continue to hold down to the boundaries, and in fact the numerical results obtained here suggest that
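The exact calculations described here can be reproduced for a toy case with a short Python sketch. It assumes the neutral binomial-sampling (Wright-Fisher) form of Wright's model and uses the matrix-inversion route for the absorption quantities; the number of gene copies and all names are illustrative, not the values used in the paper.

```python
import numpy as np
from scipy import stats

def wright_fisher_matrix(n_genes):
    """Transition matrix of the neutral Wright-Fisher model: given i copies of a gene
    among n_genes, next generation's count is Binomial(n_genes, i / n_genes)."""
    i = np.arange(n_genes + 1)
    return np.array([stats.binom.pmf(i, n_genes, j / n_genes) for j in i])

def absorption_summaries(P):
    """Exact fixation probabilities and mean times until one or other gene is
    eliminated, from the transient part Q of the transition matrix."""
    n = P.shape[0] - 1
    Q = P[1:n, 1:n]                              # transient states 1..n-1
    F = np.linalg.inv(np.eye(n - 1) - Q)         # fundamental matrix
    mean_time = F @ np.ones(n - 1)               # mean generations to elimination
    prob_fix = F @ P[1:n, n]                     # probability the gene is fixed (not lost)
    return prob_fix, mean_time

P = wright_fisher_matrix(8)                      # very small number of gene copies
prob_fix, mean_time = absorption_summaries(P)
print(np.round(prob_fix, 4))   # equals the initial frequency i/8, as diffusion theory also gives
print(np.round(mean_time, 2))
```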

Journal ArticleDOI
TL;DR: The negative multinomial distribution is a generalization of the negative binomial distribution, as discussed by the authors; just as the latter can be deduced from widely different models, so also can the negative multinomial, which arises here under inverse sampling.
Abstract: INTRODUCTION In the generalization of Bernoulli trials where we have k possible outcomes of each trial, let the probability of the ith outcome in each trial be pi (i = 1, ..., k), where p1 + p2 + ... + pk = 1. For a fixed number of trials (n), the probability of exactly x1 occurrences of outcome 1, x2 of 2, ..., xk of k is given by the multinomial distribution. If, however, the number of trials is not fixed in advance, but we continue to consider new trials until exactly r occurrences of the kth outcome (say) have been noted, then the probability of exactly x1 occurrences of outcome 1, x2 of 2, ..., xk-1 of k-1, is given by the negative multinomial distribution. This latter procedure we shall refer to as 'inverse sampling' (cf. Tweedie, 1952). The negative multinomial distribution is clearly a generalization of the negative binomial. Just as this latter can be deduced from widely different models, so also can the negative multinomial. Thus Bates & Neyman (1952) arrive at this distribution (their multivariate negative binomial distribution) by considering the joint distribution of k-1 independent Poisson variables Xi (i = 1, ..., k-1), each with mean ai λ. The parameter λ is in turn regarded as a random variable Λ having a γ (or type III) distribution. Integrating the joint frequency function of the X's and Λ with respect to the latter reveals that the marginal frequency function of the X's is that of the negative multinomial. (Using this approach r (r > 0) is the parameter of a γ-distribution and is not restricted to integer values as in the model considered here.) Wishart (1949, p. 48) uses the term 'Pascal multinomial' for the case where r is an integer. The terms of the negative multinomial distribution are readily obtained as the coefficients of t1^x1 t2^x2 ... tk-1^xk-1 in the expansion of [pk / (1 - p1 t1 - p2 t2 - ... - pk-1 tk-1)]^r.
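A small Python sketch of the inverse-sampling scheme just described: trials are continued until the kth outcome has occurred r times and the counts of the other outcomes are recorded. The probabilities and the value of r in the example are arbitrary.

```python
import numpy as np

def inverse_sample(p, r, rng):
    """Inverse sampling: run independent trials with outcome probabilities p
    (p[-1] is the k-th outcome) until the k-th outcome has occurred exactly r times;
    return the counts of outcomes 1..k-1 observed along the way."""
    k = len(p)
    counts = np.zeros(k - 1, dtype=int)
    seen_k = 0
    while seen_k < r:
        outcome = rng.choice(k, p=p)
        if outcome == k - 1:
            seen_k += 1
        else:
            counts[outcome] += 1
    return counts

# toy usage: the recorded counts follow a negative multinomial distribution
rng = np.random.default_rng(5)
p = [0.2, 0.3, 0.5]
samples = np.array([inverse_sample(p, r=4, rng=rng) for _ in range(5000)])
print(samples.mean(axis=0))   # close to r * p_i / p_k = 4 * [0.2, 0.3] / 0.5
```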

Journal ArticleDOI
TL;DR: The 'simple stochastic epidemic' is a simplified epidemic model which involves infection but not recovery (see Bailey, 1950, 1957); as mentioned in this paper, it might well be applicable to the spread of a mild upper respiratory infection, at least in the preliminary stages of rapid spread before recovery or removal from circulation could occur.
Abstract: 1. THE MATHEMATICAL MODEL The so-called 'simple stochastic epidemic' is a simplified epidemic model which involves infection but not recovery (see Bailey, 1950, 1957). It might well be applicable, for example, to the spread of a mild upper respiratory infection, at least in the preliminary stages of rapid spread before recovery or removal from circulation could occur. We suppose that initially, when the time t = 0, there is a group of n susceptibles and one infective. These are assumed to mix together homogeneously and randomly with contact rate β. This means that the chance of a contact, sufficient to transmit infection, in an interval Δt between any two specified individuals is βΔt. Let the probability that there are still r susceptibles uninfected at time t be pr(t). Thus the chance of one new infection in the whole group in an interval Δt is βr(n − r + 1)Δt, since there are r susceptibles and n − r + 1 infectives. If we change the time scale to τ = βt, the chance becomes r(n − r + 1)Δτ. The basic differential-difference equations for the process are easily found to be dpr/dτ = (r + 1)(n − r) pr+1 − r(n − r + 1) pr, with pn+1 ≡ 0.
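A minimal Python simulation sketch of this model, using the stated infection rate βr(n − r + 1) and exponential waiting times between infections; the values of n, β and the number of runs are illustrative choices.

```python
import numpy as np

def simple_epidemic(n, beta, rng):
    """Simulate the simple stochastic epidemic: n susceptibles, one initial infective,
    no recovery.  With r susceptibles left, the time to the next infection is
    exponential with rate beta * r * (n - r + 1); returns the infection times."""
    times, t, r = [], 0.0, n
    while r > 0:
        rate = beta * r * (n - r + 1)
        t += rng.exponential(1.0 / rate)
        r -= 1
        times.append(t)
    return np.array(times)

rng = np.random.default_rng(6)
runs = [simple_epidemic(n=20, beta=0.05, rng=rng) for _ in range(1000)]
print(round(np.mean([run[-1] for run in runs]), 2))   # mean time until all are infected
```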

Journal ArticleDOI
TL;DR: In this article, the authors discuss the asymptotic properties of Geary's and Reiersol's procedures using spectral methods and show that the spectral information introduced by these authors can best be viewed as information on the relative shapes of the spectra of the δj,t themselves.
Abstract: where the xj,t are not directly observable but instead we record ξj,t = xj,t + δj,t. It is well known that if et and the δj,t are sequences of uncorrelated random variables then the βj cannot be identified by means of estimates based only on variances and covariances. Reiersol (1941) and Geary (1949) have, amongst others, considered the use of 'instrumental variables' to enable the relation (1) to be estimated. The main instrumental variables considered by them were lagged values of the ξj,t themselves, it being assumed that the δj,t are not serially correlated while the xj,t are. We will discuss the asymptotic properties, including the asymptotic efficiency, of Geary's and Reiersol's procedures using spectral methods. The additional information introduced by these authors can best be viewed as information on the relative shapes of the spectra of the δj,t.
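The instrumental-variable idea can be sketched in a few lines of Python: with δj,t serially uncorrelated and xj,t serially correlated, a lagged observed value serves as an instrument. The single-regressor model, the AR(1) signal and all parameter values below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def iv_lagged(y, xi, lag=1):
    """Instrumental-variable estimate of beta in y_t = beta * x_t + e_t when only
    xi_t = x_t + delta_t is observed (series assumed mean zero): use the lagged
    observed series xi_{t-lag} as instrument, valid when delta_t is serially
    uncorrelated but x_t is serially correlated."""
    z, x_obs, y_obs = xi[:-lag], xi[lag:], y[lag:]
    return np.sum(z * y_obs) / np.sum(z * x_obs)

# toy usage: AR(1) signal x, white measurement error delta, white equation error e
rng = np.random.default_rng(7)
T, beta = 5000, 2.0
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + rng.normal()
xi = x + rng.normal(size=T)                          # observed with error
y = beta * x + rng.normal(size=T)
print(round(np.sum(xi * y) / np.sum(xi * xi), 2))    # least squares: biased towards zero
print(round(iv_lagged(y, xi), 2))                    # IV with lagged xi: close to beta
```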

Journal ArticleDOI
TL;DR: In this article, the correspondence between PDC's and PBIB's with m associate classes is considered; two rather general classes of designs are given, one derived from a generalization of group divisible PBIB's with two associate classes and the other believed to be new.
Abstract: Various forms of the diallel cross have been used in plant and animal breeding, but this form of investigation requires a large number of lines or individuals which very often leads to rather unmanageable experiments. For this reason the concept of the partial diallel cross (PDC) has been introduced (Kempthorne, 1957; Gilbert, 1958; Hinkelmann & Stern, 1960; Kempthorne & Curnow, 1961), in which only a sample of all possible crosses of the complete diallel cross (CDC) is carried out. Gilbert (1958) has mentioned the analogy between PDC's and incomplete block designs with blocks of size two. Since a CDC corresponds to a balanced incomplete block design (Kempthorne & Curnow, 1961) it was to be expected that partially balanced incomplete block designs (PBIB) would be related to certain types of PDC's. Curnow (1963) and Fyfe & Gilbert (1963) have given some PDC's derived from PBIB's with two associate classes, and Fyfe & Gilbert (1963) further introduce PDC's derived from a PBIB with three associate classes (their factorial designs). In this paper we shall consider the correspondence between PDC's and PBIB's with m associate classes and give a general method of analysing these designs (§ 2). In search for some more flexible designs than those derived from PBIB's with two associate classes we came across two rather general classes. One class is derived from a generalization of group divisible PBIB's with two associate classes (Roy, 1953-54). The other class is believed to be new, representing an extension of a design with three associate classes given by Vartak (1959). The definitions of these PBIB's are given in § 3. The formation of PDC's from these PBIB's is discussed briefly in § 4. An example is presented in Appendix A. In §§ 5 and 6 the analysis of these PDC's is given. For some selected values of N, the number of lines involved in a diallel cross experiment, we list the best designs out of these two classes in Appendix B.

Journal ArticleDOI
TL;DR: The main point of the new theory is that, for a wide class of more practical designs, the complete analysis can be carried out almost by inspection of the normal equations as mentioned in this paper.
Abstract: This paper deals with some applications of a general theory for the analysis of factorial experiments as reported by the authors in the June 1962 issue of the Annals of Mathematical Statistics. General expressions are given for the usual quantities associated with the analysis of variance for the cases where simple treatments or factorial treatment-combinations are applied to Randomized Blocks, Balanced Incomplete Blocks, Group Divisible designs, and a wide class of Kronecker Product designs. The main point of the new theory is that, for a wide class of the more practical designs, the complete analysis can be carried out almost by inspection of the normal equations, with no requirement for inverting the normal equations. The complete version of this paper is published in BIOMETRIKA, Vol. 50, Parts 1 and 2, June 1963.




Journal ArticleDOI
TL;DR: As discussed in this paper, the discriminant function becomes of the second degree when the dispersion matrices differ, and it is worth examining this special situation even when the mean differences are zero, as it may sometimes arise in practice.
Abstract: and p2(x) the same proportion of the probability for the second group. In the case of multivariate normal populations with the same variance-covariance matrix, this criterion is equivalent to the linear discriminant function, but the discriminating function becomes of the second degree if the dispersion matrices differ (Smith, 1946-47). If the mean differences between two groups are all zero, no discrimination is of course possible in the normal case with equal dispersion, but it is worth examining this special situation in the case of unequal dispersions, as it may sometimes arise in practice, as will be shown below by an example.
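A short Python sketch of discrimination with identical (zero) means but unequal dispersion matrices, using the second-degree discriminant obtained as the difference of the two normal log-densities; the covariance matrices chosen are illustrative.

```python
import numpy as np

def quadratic_score(x, mean, cov):
    """Log-density of a multivariate normal, up to an additive constant; the
    difference of two such scores is the second-degree discriminant function."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (logdet + d @ np.linalg.solve(cov, d))

def classify(x, mean1, cov1, mean2, cov2):
    """Assign x to group 1 or group 2 by the quadratic discriminant rule."""
    return 1 if quadratic_score(x, mean1, cov1) > quadratic_score(x, mean2, cov2) else 2

# toy usage: identical (zero) means but different dispersion matrices; discrimination
# is still possible because the quadratic terms of the discriminant differ
rng = np.random.default_rng(8)
mean = np.zeros(2)
cov1, cov2 = np.diag([1.0, 1.0]), np.diag([4.0, 0.25])
x1 = rng.multivariate_normal(mean, cov1, size=2000)
x2 = rng.multivariate_normal(mean, cov2, size=2000)
acc1 = np.mean([classify(x, mean, cov1, mean, cov2) == 1 for x in x1])
acc2 = np.mean([classify(x, mean, cov1, mean, cov2) == 2 for x in x2])
print(round(acc1, 2), round(acc2, 2))   # both clearly above 0.5
```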


Journal ArticleDOI
TL;DR: A number of methods are available for estimating total populations, survival rates and emergence rates of mobile animal populations from capture-recapture data as mentioned in this paper, and their methods differ mainly in the grouping of the data and in the type of estimates used.
Abstract: A number of methods are available for estimating total populations, survival rates and emergence rates of mobile animal populations from capture-recapture data. Most assume a deterministic model in which each class in the population is assumed to be subject to exactly the same survival rate, and their methods differ mainly in the grouping of the data and in the type of estimates used. The animals captured on each occasion are assumed to be a random sample from the population. The two commonest methods of grouping are method A which amounts to recording total number of previous captures among the animals caught at each sampling, and method B in which only the last of the previous captures of a particular animal is counted and any earlier captures ignored. The earlier writers, Jackson (1939, 1948), and Fisher & Ford (1947) have grouped their data according to method A, and Bailey (1951) discusses the variances of the estimates these writers have used. Grouping according to method B was first introduced by Leslie & Chitty (1951), who give the maximum likelihood equations for both methods A and B for a chain of five successive samplings when the survival rate is assumed constant. The two methods are compared and it is shown that method A results in loss of information, whereas method B is fully efficient under the assumption of the deterministic model. These ideas are extended in Leslie (1952) to cover the more general case when the survival rate is not constant over the period of sampling and when dilution of the population may be occurring due either to fresh emergences or immigration into the area. Solution of the maximum likelihood equations, whether or not the survival rate is assumed constant, is done by iteration to give estimates of total populations, survival and dilution rates, the computation becoming somewhat laborious in the case of a long chain of samples. Leslie also gives a quick method of obtaining approximate estimates from a triangle of three entries in the table of recaptures. In an example of the practical application of estimating population parameters, Leslie, Chitty & Chitty (1953) derive their estimates by an improved approximate method utilizing data from two adjacent columns of their recapture table to provide estimates for any one sampling occasion. Moran (1952) discusses the theoretical aspects associated with the deterministic and other models. Hammersley (1953) and Darroch (1958, 1959) base their estimates on a fully stochastic model. Darroch has shown that in the three cases when the population is: (1) a closed one subject neither to death nor immigration, (2) subject to death only, (3) subject to immigration only, the population parameters can be very simply estimated together with their variances and covariances. In the more general case when both death and immigration are occurring, estimation equations are derived. A method is indicated for obtaining variances and covariances, but the procedure is complex and formulae are not given.
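One way to read the two groupings described above is sketched in Python below, tabulating capture histories by method A (total number of previous captures recorded at each occasion) and by method B (occasion of the last previous capture only); the exact tabular layout used by Leslie & Chitty is an assumption here, and the histories are invented.

```python
import numpy as np

def group_recaptures(histories):
    """Group capture-recapture data in the two classical ways.
    histories: boolean array, one row per animal, one column per sampling occasion.
    Method A sums, for each occasion, the numbers of previous captures of the
    animals caught then; method B counts recaptures classified by the occasion
    of the last previous capture, ignoring earlier ones."""
    n_animals, n_occasions = histories.shape
    method_a = np.zeros(n_occasions, dtype=int)
    method_b = np.zeros((n_occasions, n_occasions), dtype=int)  # [last capture, current]
    for h in histories:
        caught = np.flatnonzero(h)
        for idx, t in enumerate(caught):
            method_a[t] += idx                  # number of previous captures
            if idx > 0:
                method_b[caught[idx - 1], t] += 1
    return method_a, method_b

# toy usage: five animals, four sampling occasions (1 = captured)
histories = np.array([[1, 0, 1, 1],
                      [0, 1, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 1],
                      [1, 0, 0, 0]], dtype=bool)
a, b = group_recaptures(histories)
print(a)   # method A totals per occasion
print(b)   # method B: recaptures by occasion of last previous capture
```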

Journal ArticleDOI
TL;DR: In this paper, the theoretical properties of the sampling procedure are investigated and a distribution-free control median test is proposed, which is as efficient asymptotically as the joint median test proposed by Mood (1954).
Abstract: In this paper the theoretical statistical properties of this sampling procedure are investigated. It is shown that a distribution-free control median test can be based on this procedure which is as efficient asymptotically as the joint median test proposed by Mood (1954). An associated confidence interval is also derived. The efficiency of this test is compared with those based on fully efficient parametric procedures. Finally, a sequential form of the test is discussed based on an underlying logistic family of curves.

Journal ArticleDOI
TL;DR: The non-parametric extreme rank sum test proposed by Youden (1963) is developed mathematically in this paper; it can be used to supplement the concordance coefficient in the same way that the maximum Studentized deviate supplements the analysis of variance.
Abstract: Consider a complete unreplicated two-way classification. Analysis of variance provides a test for the equality, say, of the row means; while Friedman's χr² (1937) or the concordance coefficient, W, of Kendall & Babington Smith (1939) give analogous non-parametric tests, based on ranking, of essentially the same hypothesis. The maximum Studentized deviate (Halperin, Greenhouse, Cornfield & Zalokar, 1955) is one statistic which may be used to supplement the analysis of variance to point out the row, or rows, which have means different from the rest. The non-parametric extreme rank sum test proposed by Youden (1963) and developed mathematically in this paper can be used to supplement the concordance coefficient in the same way. Table 1 presents the coded data of a well-known experiment (Youden, 1960) performed to test the identity of the triple points of four chemical 'cells' used as temperature reference standards. The temperature of each cell at its triple point was determined according to four different thermometers.
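A small Python sketch of the ranking step: treatments are ranked within each block, the treatment rank sums are formed, and Friedman's χr² is computed; the extreme rank sum is then the quantity examined by the test developed in the paper (its critical values are not reproduced here). The numbers in the example are illustrative, not Youden's Table 1.

```python
import numpy as np
from scipy import stats

def rank_sums_and_friedman(data):
    """For an n-blocks x k-treatments table, rank treatments within each block and
    return the treatment rank sums and Friedman's chi_r^2 statistic.  The largest or
    smallest rank sum is the quantity examined by the extreme rank sum test."""
    n, k = data.shape
    ranks = np.apply_along_axis(stats.rankdata, 1, data)   # rank within each block
    R = ranks.sum(axis=0)                                  # treatment rank sums
    chi_r2 = 12.0 / (n * k * (k + 1)) * np.sum(R ** 2) - 3.0 * n * (k + 1)
    return R, chi_r2

# toy usage: 4 blocks (thermometers) x 4 treatments (cells), invented numbers
data = np.array([[0.1, 0.5, 0.2, 0.9],
                 [0.2, 0.4, 0.1, 0.8],
                 [0.0, 0.6, 0.3, 0.7],
                 [0.1, 0.5, 0.2, 0.6]])
R, chi_r2 = rank_sums_and_friedman(data)
print(R, round(chi_r2, 2))   # the fourth treatment has the extreme rank sum here
```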

Journal ArticleDOI
TL;DR: In Part 1 (Stephens, 1963) the moments, and some small-sample results, of the distribution of U²N were given; U²N is a goodness-of-fit statistic introduced by Watson (1961, 1962) which is particularly useful when the observations are points on a circle.
Abstract: In Part 1 (Stephens, 1963) were given the moments, and some small-sample results, of the distribution of U²N. This is a goodness-of-fit statistic introduced by Watson (1961, 1962), and tests the null hypothesis that N observations come from a cumulative distribution function F(x). Throughout this paper the distribution of U²N will refer to the distribution on the null hypothesis. U²N is particularly useful when the observations are points on a circle; Watson has shown that its distribution is independent of F(x), and its value, in the circular case, is independent of the choice of origin. Suppose the observations, in increasing order, are x1, x2, ..., xN, and let yj = F(xj), with ȳ the mean of the yj. The value of U²N may be calculated from

U²N = Σ_{j=1..N} {yj − (2j − 1)/(2N)}² − N(ȳ − 1/2)² + 1/(12N).  (1)
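A direct Python implementation of formula (1), assuming the hypothesized distribution function F is supplied as a callable; the uniform example is illustrative.

```python
import numpy as np

def watson_u2(x, cdf):
    """Watson's U^2_N goodness-of-fit statistic, computed from formula (1):
    transform the ordered observations by the hypothesized cdf F and combine."""
    y = np.sort(cdf(np.asarray(x, dtype=float)))
    n = len(y)
    j = np.arange(1, n + 1)
    return (np.sum((y - (2 * j - 1) / (2 * n)) ** 2)
            - n * (y.mean() - 0.5) ** 2 + 1.0 / (12 * n))

# toy usage: observations uniform on the circle, F(x) = x on [0, 1)
rng = np.random.default_rng(9)
print(round(watson_u2(rng.uniform(size=50), cdf=lambda x: x), 4))
```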