
Showing papers in "Biometrika in 1965"


Journal ArticleDOI
S. S. Shapiro1, M. B. Wilk1
TL;DR: In this article, a new statistical procedure for testing a complete sample for normality is introduced, which is obtained by dividing the square of an appropriate linear combination of the sample order statistics by the usual symmetric estimate of variance.
Abstract: The main intent of this paper is to introduce a new statistical procedure for testing a complete sample for normality. The test statistic is obtained by dividing the square of an appropriate linear combination of the sample order statistics by the usual symmetric estimate of variance. This ratio is both scale and origin invariant and hence the statistic is appropriate for a test of the composite hypothesis of normality. Testing for distributional assumptions in general and for normality in particular has been a major area of continuing statistical research-both theoretically and practically. A possible cause of such sustained interest is that many statistical procedures have been derived based on particular distributional assumptions-especially that of normality. Although in many cases the techniques are more robust than the assumptions underlying them, still a knowledge that the underlying assumption is incorrect may temper the use and application of the methods. Moreover, the study of a body of data with the stimulus of a distributional test may encourage consideration of, for example, normalizing transformations and the use of alternate methods such as distribution-free techniques, as well as detection of gross peculiarities such as outliers or errors. The test procedure developed in this paper is defined and some of its analytical properties described in § 2. Operational information and tables useful in employing the test are detailed in § 3 (which may be read independently of the rest of the paper). Some examples are given in § 4. Section 5 consists of an extract from an empirical sampling study of the comparison of the effectiveness of various alternative tests. Discussion and concluding remarks are given in § 6. 2. THE W TEST FOR NORMALITY (COMPLETE SAMPLES) 2.1. Motivation and early work This study was initiated, in part, in an attempt to summarize formally certain indications of probability plots. In particular, could one condense departures from statistical linearity of probability plots into one or a few 'degrees of freedom' in the manner of the application of analysis of variance in regression analysis? In a probability plot, one can consider the regression of the ordered observations on the expected values of the order statistics from a standardized version of the hypothesized distribution-the plot tending to be linear if the hypothesis is true. Hence a possible method of testing the distributional assumption is by means of an analysis of variance type procedure. Using generalized least squares (the ordered variates are correlated) linear and higher-order
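
A minimal sketch of running this W test today, using SciPy's implementation of the Shapiro-Wilk statistic (scipy.stats.shapiro); the sample below is simulated purely for illustration.

```python
# Hedged example: apply the Shapiro-Wilk W test to a complete sample.
# scipy.stats.shapiro computes the ratio of the squared linear combination
# of order statistics to the usual variance estimate described above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=50)   # complete sample under test

W, p = stats.shapiro(x)
print(f"W = {W:.4f}, p = {p:.4f}")   # large p: no evidence against normality
```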

16,906 citations


Journal ArticleDOI
TL;DR: Some comparisons are made for five cases of varying degrees of censoring and tying between probabilities from the exact test and those from the proposed test and these suggest the test is appropriate under certain conditions when the sample size is five in each group.
Abstract: The hypothesis H0: F1(t) = F2(t) (t < T) is tested against the alternative H1: F1(t) < F2(t) (t < T). The asymptotic efficiency of the test relative to the efficient parametric test when the distributions are exponential is at least 0.75 and increases with degree of censoring. When H0 is true, the test is not seriously affected by real differences in the percentage censored in the two groups. Some comparisons are made for five cases of varying degrees of censoring and tying between probabilities from the exact test and those from the proposed test and these suggest the test is appropriate under certain conditions when the sample size is five in each group. A worked example is presented and some discussion is given to further problems.
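
The pairwise scoring idea behind a generalized Wilcoxon test for censored samples can be sketched as follows; the scoring rule, toy data and function names are illustrative assumptions, not the paper's exact procedure or reference distribution.

```python
# Hedged sketch of Gehan-type scoring for two right-censored samples:
# a pair contributes +1 or -1 only when censoring leaves the ordering
# of the two survival times unambiguous, and 0 otherwise.
import itertools

def score(t1, d1, t2, d2):
    """+1 if subject 1 is known to outlive subject 2, -1 for the reverse.
    t = observed or censoring time, d = 1 if the event was observed."""
    if d2 == 1 and (t1 > t2 or (t1 == t2 and d1 == 0)):
        return 1
    if d1 == 1 and (t2 > t1 or (t2 == t1 and d2 == 0)):
        return -1
    return 0   # ordering indeterminate because of censoring

def gehan_statistic(ta, da, tb, db):
    return sum(score(x, dx, y, dy)
               for (x, dx), (y, dy) in itertools.product(zip(ta, da),
                                                         zip(tb, db)))

# toy data: 1 = event observed, 0 = censored
print(gehan_statistic([3, 5, 7, 9, 18], [1, 1, 0, 1, 0],
                      [2, 4, 4, 6, 8], [1, 1, 1, 1, 1]))
```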

3,318 citations


Journal ArticleDOI
TL;DR: The first purpose of the present paper is to derive a general probability distribution designed to fit the majority of capture-recapture problems involving a 'single' population.
Abstract: A brief review, with references, of the literature on capture-recapture theory is given in Jolly (1963). More recently, Cormack (1964) gives a solution, including asymptotic variances, for a specific situation involving the marking and release of a non-random sample of fulmar petrels. His model is stochastic and will be referred to in § 4. Seber (1962, 1965) has also produced some interesting solutions, and these together with Darroch (1958, 1959) are most directly relevant to our present problem. Darroch (1958, 1959) shows that in a fully stochastic model with either immigration (often called dilution) or death (or emigration), the population parameters can be easily estimated by maximum likelihood. For the more general case when death and immigration are operating simultaneously he derives estimation equations by equating certain observations to their expectations, but does not give variances or covariances for the estimates. In a later paper (Darroch, 1961), he considers estimation for a closed population consisting of different strata. Seber (1962) establishes a stochastic model for what he calls the multi-sample single recapture census in which an individual cannot be recaptured more than once. This situation arises, for example, when the recaptures are made in the course of hunting or fishing. He allows for both death and immigration in the population, provides explicit maximum-likelihood estimates of the parameters with variances, and suggests tests for certain of the assumptions. In a second paper (Seber, 1965), he considers a multiple-recapture model differing only slightly from that of Darroch (1959), with both death and immigration. Again, he provides explicit maximum-likelihood estimates of the parameters with variances. A test is also given for equi-catchability in a closed population of individuals with different capture histories. The first purpose of the present paper is to derive a general probability distribution designed to fit the majority of capture-recapture problems involving a 'single' population. The word 'single' here denotes a population covering an area within whose boundaries the animals (or, in general, individuals or members) are free to move and to mix with others of their kind, but which is regarded as a single area in respect of which parameters are to be estimated. The type of situation which is thus excluded by this definition is one where the population is split into a number of defined areas, and separate population estimates are required for each area as well as for numbers of animals moving from one area to another. The single population, however, need not be homogeneous but may consist of different classes of animals behaving in different ways. The other assumptions underlying the model are stated with the notation in § 2, and the generalized probability distribution is derived in § 3.

2,449 citations


Journal ArticleDOI
TL;DR: Jolly (1965), tackling this problem from a different viewpoint, gives a very elegant solution to the problem of finding maximumlikelihood estimates of the unknown population parameters and gives the means and variances of these estimates.
Abstract: In the field of biology the problem of estimating population parameters such as population size, death rate, birth rate, etc., for animal populations is obviously an important one. Various capture-tag-recapture models have been developed to estimate these population parameters with a minimum number of assumptions on the underlying population. One such method, the multiple-recapture census, has been the topic of many papers and is described briefly as follows. The experimenter takes a sequence of random samples a1, a2, ..., as, say. The members of each sample ai are tagged and returned to the population before taking the next sample. Thus the members of a2, a3, ..., as can be classified according to when, if at all, they have been captured before. Although several models have been developed from different basic assumptions, three papers in particular by Darroch (1958, 1959) and Jolly (1965) give the most general treatment of this method in the form of exact, fully stochastic models which lend themselves readily to the method of maximum-likelihood estimation. The first paper of Darroch's deals with the closed population, i.e. a population in which there is neither augmentation due to immigration (or birth) nor departure due to death (or emigration). This population is usually dealt with separately as the algebra involved is quite different from that which arises in more general populations. The main assumption made, which in fact underlies all capture-recapture models, is that marked and unmarked individuals have the same probability of being caught, and as far as the author knows no test statistic has been given for testing this assumption. In § 4 of this paper we shall consider this problem and a likelihood-ratio test statistic will be derived which gives an approximate test of the above assumption. In Darroch's second paper he derives, for a population in which there is either immigration or death (but not both), modified maximum-likelihood estimates of the population parameters that are almost unbiased and asymptotically efficient, that is efficient for a certain class of reasonable estimates. For the population in which there is both immigration and death he gives only the probability generating function for the model and derives moment estimates of the parameters. However, Jolly (1965), tackling this problem from a different viewpoint, gives a very elegant solution to the problem of finding maximum-likelihood estimates of the unknown population parameters and gives the means and variances of these estimates. In this paper we shall consider this general population with both immigration and death and set up a model which differs slightly from that of Darroch and Jolly in that certain parameters are treated as unknown constants rather than as random variables. The notation and approach to the model is essentially that of Darroch's while the method of solution is similar to that given in Seber (1962). The estimates obtained from this model will be compared briefly with those obtained by Jolly.
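
For orientation, the simplest closed-population special case of the capture-recapture idea (two samples, no death or immigration) fits in a few lines; this is the Lincoln-Petersen estimator in Chapman's form, not the open-population Jolly/Seber model of the papers above, and the counts are invented.

```python
# Hedged sketch: Chapman's nearly unbiased version of the two-sample
# Lincoln-Petersen estimator N-hat = n1*n2/m2 for a closed population.
n1 = 120   # animals caught and marked in sample 1
n2 = 100   # animals caught in sample 2
m2 = 25    # marked animals found among the n2

N_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
var_N = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
         / ((m2 + 1) ** 2 * (m2 + 2)))
print(f"N-hat = {N_hat:.0f}, s.e. = {var_N ** 0.5:.0f}")
```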

1,749 citations


Journal ArticleDOI
TL;DR: In the present paper, a class of problems where the dispersion matrix has a known structure is considered and the appropriate statistical methods are discussed.
Abstract: In an earlier paper (Rao, 1959), the author discussed the method of least squares when the observations are dependent and the dispersion matrix is unknown but an independent estimate is available. The unknown dispersion matrix was, however, considered as an arbitrary positive definite matrix. In the present paper we shall consider a class of problems where the dispersion matrix has a known structure and discuss the appropriate statistical methods. More specifically the structure of the dispersion matrix results from considering the parameters in the well-known Gauss-Markoff linear model as random variables. Let Y be a vector random variable with the structure
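
The abstract breaks off here, but the setting it describes can be sketched with numpy under the illustrative assumption (not taken from the paper) that the random parameters give the dispersion matrix the familiar structure V = Z G Z' + sigma^2 I, so that generalized least squares applies directly.

```python
# Hedged sketch: generalized least squares with a structured dispersion matrix.
import numpy as np

rng = np.random.default_rng(1)
n, q = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed-effect design
Z = rng.normal(size=(n, q))                              # random-effect design
G = 0.5 * np.eye(q)                                      # dispersion of random parameters
sigma2 = 1.0
V = Z @ G @ Z.T + sigma2 * np.eye(n)                     # known-structure dispersion

beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.multivariate_normal(np.zeros(n), V)

Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # GLS estimate
print("GLS estimate:", beta_gls)
```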

609 citations



Journal ArticleDOI
TL;DR: Integral expressions are given for the expected area, expected perimeter, expected probability content and expected number of sides of the convex hull of N independently and identically distributed random points in the plane, extending earlier results of Rényi & Sulanke.
Abstract: SUMMARY Various expectations concerning the convex hull of N independently and identically distributed random points in the plane or in space are evaluated. Integral expressions are given for the expected area, expected perimeter, expected probability content and expected number of sides. These integrals are shown to be particularly simple when the underlying distribution is normal or uniform over a disk or sphere. In two recent papers Rényi & Sulanke (1963, 1964) have given expressions for the expected area, perimeter, and number of vertices of the convex hull of N independently and identically selected random points in the plane. In these papers, limit theorems for asymptotically large N receive the greatest attention. Here the emphasis will be on the development of convenient formulae for fixed values of N. Some new results for random convex hulls in the plane are derived, such as the expected probability content, and also various expectations concerned with random convex hulls in three and higher dimensions. Special attention is given to the case of normally distributed points and also to the case of points drawn uniformly from an ellipse or an ellipsoid. Historically, calculating the expected probability content of three random points in two dimensions is known as 'Sylvester's problem', and has been solved explicitly for many different distributions by Deltheil (1926, p. 42). The corresponding problem of four points in three dimensions is connected with the name of Hostinsky (1925). A discussion of Sylvester's problem is given in Kendall & Moran's monograph, Geometrical Probability (1963). 2. THE EXPECTED NUMBER OF VERTICES, FACES, AND EDGES
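
These expectations are easy to check by simulation; the sketch below estimates the expected area and expected number of vertices of the convex hull of N standard normal points in the plane (sample sizes are arbitrary, and in 2-D scipy's ConvexHull reports the enclosed area in its `volume` attribute).

```python
# Monte Carlo check of convex-hull expectations for normal points.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(2)
N, reps = 10, 2000
areas, verts = [], []
for _ in range(reps):
    hull = ConvexHull(rng.normal(size=(N, 2)))
    areas.append(hull.volume)           # .volume is the area in 2-D
    verts.append(len(hull.vertices))
print(f"E[area] ~ {np.mean(areas):.3f}, E[vertices] ~ {np.mean(verts):.3f}")
```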

271 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of making inferences about a possible shift in level of the series associated with the occurrence of an event at some particular time; for example, the observations might be of some economic indicator and one might suspect a change in level to occur in a particular interval because of a change in fiscal policy.
Abstract: Suppose that observations z_t of a time series are available at equally spaced time intervals. The authors consider the problem of making inferences about a possible shift in level of the series associated with the occurrence of an event E at some particular time. For example, the observations might be of some economic indicator and one might suspect a change in level to occur in a particular interval because of a change in fiscal policy. Alternatively, z_t could be the daily output of a chemical process and the event E might be a change in the supplier of raw material.
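
As a baseline for this problem, the naive shift estimate simply compares means before and after the event; the sketch below does exactly that and deliberately ignores the serial correlation that motivates the paper's more careful treatment (data and event time are invented).

```python
# Naive level-shift estimate around a known event time (illustrative only;
# treating z_t as independent understates uncertainty for a real series).
import numpy as np

rng = np.random.default_rng(3)
t_event = 60
z = np.concatenate([rng.normal(10.0, 1.0, t_event),   # before E
                    rng.normal(10.8, 1.0, 40)])       # after E, level shifted
before, after = z[:t_event], z[t_event:]
shift = after.mean() - before.mean()
se = np.sqrt(before.var(ddof=1) / before.size + after.var(ddof=1) / after.size)
print(f"estimated shift = {shift:.2f} (naive s.e. {se:.2f})")
```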

255 citations


Journal ArticleDOI
TL;DR: In this paper, it is supposed that n multivariate observations are available, with the uth such observation as specified in the summary.
Abstract: 1. SUMMARY Suppose that n multivariate observations are available with the uth such observation

255 citations




Journal ArticleDOI
J. L. Naus1

Journal ArticleDOI
Ying Yao1
TL;DR: This paper studies an extension of the Welch 'approximate degrees of freedom' (APDF) solution provided by Tukey (1959), and discusses the results of a Monte Carlo sampling study on this new APDF solution and its comparison with the James series solution.
Abstract: The comparison of the means of two populations on the basis of two independent samples is one of the oldest problems in statistics. Indeed, it has been a testing ground for many methods of inference as well as for a variety of analytic approaches to practical problems. The univariate problem was first studied by Behrens (1929) and his solution was presented by Fisher (1935) in terms of the fiducial theory. Welch studied it in the confidence theory framework and provided an 'approximate degrees of freedom' solution as well as an asymptotic series solution (1936, 1947). Many others have investigated this topic and various methods of approach were also suggested by Jeffreys (1940), Scheffé (1943), McCullough, Gurland & Rosenberg (1960), Banerjee (1961), and Savage (1961). In the multivariate extension of the Behrens-Fisher problem, Bennett (1951) has extended the Scheffé solution, and James (1954) the Welch series solution. The present paper studies an extension of the Welch 'approximate degrees of freedom' (APDF) solution provided by Tukey (1959), and discusses the results of a Monte Carlo sampling study on this new APDF solution and its comparison with the James series solution.
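
The univariate Welch 'approximate degrees of freedom' solution that the paper extends is available directly in SciPy; a one-call sketch with simulated unequal-variance samples:

```python
# Welch's APDF two-sample test (equal_var=False) on illustrative data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=20)
y = rng.normal(0.5, 3.0, size=35)   # unequal variances: the Behrens-Fisher setting
t, p = stats.ttest_ind(x, y, equal_var=False)
print(f"Welch t = {t:.3f}, p = {p:.3f}")
```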

Journal ArticleDOI
TL;DR: The Hermite distribution is the generalized Poisson distribution whose probability generating function (P.G.F.) is exp [a1(s - 1) + a2(s² - 1)], and the probabilities (and factorial moments) can be conveniently expressed in terms of modified Hermite polynomials (hence the proposed name).
Abstract: SUMMARY The Hermite distribution is the generalized Poisson distribution whose probability generating function (P.G.F.) is exp [a1(s - 1) + a2(s² - 1)]. The probabilities (and factorial moments) can be conveniently expressed in terms of modified Hermite polynomials (hence the proposed name). Writing these in confluent hypergeometric form leads to two quite different representations of any particular probability-one a finite series, the other an infinite series. The cumulants and moments are given. Necessary conditions on the parameters and their maximum-likelihood estimation are discussed. It is shown, with examples, that the Hermite distribution is a special case of the Poisson Binomial distribution (n = 2) and may be regarded as either the distribution of the sum of two correlated Poisson variables or the distribution of the sum of an ordinary Poisson variable and an (independent) Poisson 'doublet' variable (i.e. the occurrence of pairs of events is distributed as a Poisson). Lastly its use as a penultimate limiting form of distributions with P.G.F.
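
The doublet representation mentioned in the summary gives an immediate way to simulate the distribution and check its first two cumulants (parameter values below are arbitrary):

```python
# Hermite variable as Poisson(a1) + 2 * Poisson(a2): this sum has P.G.F.
# exp[a1(s-1) + a2(s^2-1)], with mean a1 + 2*a2 and variance a1 + 4*a2.
import numpy as np

rng = np.random.default_rng(5)
a1, a2, n = 2.0, 0.7, 200_000
x = rng.poisson(a1, n) + 2 * rng.poisson(a2, n)
print(f"mean ~ {x.mean():.3f} (theory {a1 + 2*a2:.3f})")
print(f"var  ~ {x.var():.3f} (theory {a1 + 4*a2:.3f})")
```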

Journal ArticleDOI
TL;DR: The problem of estimating variance components has attracted the attention of many writers-see, for example, Bross (1950), Bulmer (1957), Bush & Anderson (1963), Crump (1946, 1951), Daniels (1939), Fisher (1935), Green (1954) and Healy (1963).
Abstract: Thus, E(y_ij - μ)² = σ² + σ_a². The parameters σ² and σ_a² are therefore also called variance components. The problem of estimating variance components has attracted the attention of many writers-see, for example, Bross (1950), Bulmer (1957), Bush & Anderson (1963), Crump (1946, 1951), Daniels (1939), Fisher (1935), Green (1954) and Healy (1963), etc. In most of these works, the problem has been analysed from the sampling theory point of view. A difficulty which has concerned many of these writers is the so-called 'negative estimated variance' problem. For example, using the model in (1.1) with the added assumption that the ai's and eij's are independent among themselves, the following unbiased estimator for
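
The standard unbiased (ANOVA) estimators for a balanced one-way random-effects model make the 'negative estimated variance' problem easy to see in simulation; the design sizes and parameter values below are arbitrary.

```python
# One-way random-effects model y_ij = mu + a_i + e_ij with balanced data:
# E[MSA] = sigma^2 + n*sigma_a^2 and E[MSE] = sigma^2, so the unbiased
# estimator (MSA - MSE)/n of sigma_a^2 can come out negative.
import numpy as np

rng = np.random.default_rng(6)
a, n = 8, 5                          # a groups, n observations each
sigma_a2, sigma2 = 0.05, 1.0         # true variance components
ai = rng.normal(0, np.sqrt(sigma_a2), size=(a, 1))
y = 10.0 + ai + rng.normal(0, np.sqrt(sigma2), size=(a, n))

msa = n * y.mean(axis=1).var(ddof=1)            # between-group mean square
mse = sum(row.var(ddof=1) for row in y) / a     # within-group mean square
print(f"sigma^2-hat = {mse:.3f}, sigma_a^2-hat = {(msa - mse) / n:.3f}")
```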

Journal ArticleDOI
TL;DR: In this paper, the authors considered the special case of probabilities of joint events associated with the pair of random variables 1Tf = (X + δ1)/Y and 2Tf = (X + δ2)/Y, i.e. where X1 = X2 = X (or ρ = 1).
Abstract: Let X1 and X2 have a bivariate normal distribution with means zero and variances one, and correlation ρ. Let Y be distributed independent of the X's and let Y have a square root of a chi-square (with f degrees of freedom) divided by f-distribution, where the square root extends over the f in the denominator also. Then (X1 + δ1)/Y and (X2 + δ2)/Y have non-central t-distributions with f degrees of freedom and non-centrality parameters δ1 and δ2, respectively. A bivariate non-central t-distribution may then be defined as the joint distribution of (X1 + δ1)/Y and (X2 + δ2)/Y. This definition is in conformance with the definition of a multivariate t-distribution given by Dunnett & Sobel (1954). In this paper, we will be interested in the special case of probabilities of joint events associated with the pair of random variables 1Tf = (X + δ1)/Y and 2Tf = (X + δ2)/Y, i.e. where X1 = X2 = X (or ρ = 1). Certain of these probabilities have applications to two-sided tolerance limits and two-sided sampling plans. The connexion between these two-sided tolerance limits, two-sided sampling plans and one-sided sampling plans and tolerance limits will also be discussed. We will also use the notation Tf = (X + δ)/Y for a random variable with the univariate non-central t-distribution.
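
Joint probabilities for this ρ = 1 case are straightforward to estimate by simulating the construction directly (degrees of freedom, non-centrality parameters and the event below are arbitrary choices):

```python
# Simulate (X + d1)/Y and (X + d2)/Y with a shared X ~ N(0,1) and
# Y = sqrt(chi2_f / f) independent of X, then estimate a joint probability.
import numpy as np

rng = np.random.default_rng(7)
f, d1, d2, n = 10, 0.5, 1.5, 500_000
X = rng.normal(size=n)
Y = np.sqrt(rng.chisquare(f, size=n) / f)
T1, T2 = (X + d1) / Y, (X + d2) / Y
print("P(T1 < 2, T2 < 3) ~", np.mean((T1 < 2) & (T2 < 3)))
```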


Journal ArticleDOI
TL;DR: In this article, a series expansion of the distribution p(X) of X = χ'² (denoting a non-central chi-squared by χ'²) is developed in terms of Laguerre polynomials.
Abstract: Fisher (1928) expressed the distribution function of the non-central χ² in terms of Bessel functions with imaginary argument. Wishart (1932) and Tang (1938) evaluated the probability integrals of the non-central χ² and F distributions; they involve a heavy amount of labour. In this paper Laguerre series expansions of these distributions are derived. In § 2 a series expansion of the distribution p(X) of X = χ'² (denoting a non-central chi-squared by χ'²) is developed in terms of Laguerre polynomials, namely
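
The probability integrals that these Laguerre expansions approximate are available in library form today; the check below uses SciPy's non-central chi-squared and non-central F distributions as a reference point rather than reimplementing the paper's series.

```python
# Reference values for non-central chi-squared and F probability integrals.
from scipy import stats

df, nc = 5, 2.0                              # degrees of freedom, non-centrality
print("P(chi'^2 <= 7.5) =", stats.ncx2.cdf(7.5, df, nc))
print("P(F' <= 2.0)     =", stats.ncf.cdf(2.0, 4, 12, nc))   # non-central F
```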

Journal ArticleDOI
TL;DR: A simple test based on Durbin's modification is described in this article, and its asymptotic relative efficiency with respect to the most powerful test against the alternative of a renewal process with gamma-distributed intervals is given.
Abstract: SUMMARY It is shown that a test proposed by Barnard for Poisson processes, using certain distribution-free statistics, is not consistent against renewal alternatives. However, empirical evidence is given which suggests that a modification of this test due to Durbin results in relatively powerful tests of the Poisson hypothesis. A simple test based on Durbin's modification is described, and its asymptotic relative efficiency with respect to the asymptotically most powerful test against the alternative of a renewal process with gamma-distributed intervals is given. A central problem in the statistical analysis of stationary series of events (point processes) is to test whether an observed series is a realization of a Poisson process. Denote by λ the rate parameter of the Poisson process. The test is for a composite null hypothesis, λ being a nuisance parameter. In a Poisson process the numbers of events in non-overlapping intervals are independent Poisson variates, and moreover the intervals between events are independent random variables with the simple exponential distribution. Consequently tests of the Poisson hypothesis can be devised for various alternative hypotheses. Epstein (1960) has surveyed these tests. We will be concerned here with tests which are useful against rather vaguely specified alternatives. There are many such tests and we will try to assess the relative utility of the most important ones. The main source of these tests is an idea of Barnard's (1953). Assume first of all that observation of the series is for a fixed time period (0, t) and that n events occur at times t1, t2, ..., tn, measured from the origin. The idea is to test, conditionally on the number of events in (0, t), N, being equal to n, that the variables Ui = Ti/t (i = 1, 2, ..., n) are independent variates with the uniform distribution function G(u) = 0 (u < 0), u (0 ≤ u ≤ 1), 1 (u > 1).
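
Barnard's conditional device is easy to apply in code: given N = n events in (0, t), the scaled times Ui = Ti/t behave as ordered uniforms under the Poisson hypothesis, so any uniformity test applies. The Kolmogorov-Smirnov call below is a stand-in for illustration, not Durbin's modification.

```python
# Test a point-process record for the Poisson hypothesis via the
# conditional-uniformity property of the scaled event times.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
t_end = 100.0
# conditional on N = n, Poisson event times are ordered Uniform(0, t) draws
times = np.sort(rng.uniform(0, t_end, size=40))
u = times / t_end
print(stats.kstest(u, "uniform"))
```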

Journal ArticleDOI
TL;DR: In this article, an adaptation of the Kolmogorov statistic is proposed to test the null hypothesis that a random sample of size N comes from a population with given continuous distribution function F(x).
Abstract: 1.1. Kuiper (1960) has proposed VN, an adaptation of the Kolmogorov statistic, to test the null hypothesis that a random sample of size N comes from a population with given continuous distribution function F(x). If the sample distribution function is FN(x), VN is defined by VN = sup {FN(x) - F(x)} - inf {FN(x) - F(x)}, (1) where the supremum and infimum are taken over -∞ < x < ∞.
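
Definition (1) translates directly into code; the sketch below evaluates VN for a sample against a fully specified continuous F (a standard normal here, as an arbitrary example), leaving critical values to the paper's tables.

```python
# Kuiper's statistic V_N = sup(F_N - F) + (-inf(F_N - F)) for a given F.
import numpy as np
from scipy import stats

def kuiper_vn(x, cdf):
    x = np.sort(np.asarray(x))
    n = len(x)
    F = cdf(x)
    d_plus = np.max(np.arange(1, n + 1) / n - F)    # sup (F_N - F)
    d_minus = np.max(F - np.arange(0, n) / n)       # -inf (F_N - F)
    return d_plus + d_minus

rng = np.random.default_rng(9)
print("V_N =", kuiper_vn(rng.normal(size=100), stats.norm.cdf))
```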

Journal ArticleDOI
E. N. Gilbert1
TL;DR: In this article, the probability P(N) that N randomly placed caps cover the sphere, i.e. that every point x on the surface of the unit sphere belongs to at least one cap, is estimated partly theoretically and partly by a computer simulation experiment.
Abstract: The caps C1, ..., CN cover the sphere if every point x on the surface of the unit sphere belongs to at least one cap. X1, ..., XN are to be chosen independently at random with constant probability density 1/(4π) over the surface of the sphere. Let P(N) be the probability that the N caps cover the sphere. This note will estimate P(N) partly theoretically and partly by a computer simulation experiment. An exact formula P(N) = 1 - (N² - N + 2) 2^(-N) holds in the special case p = 90°. For other values of p only bounds on P(N) are found. Kendall & Moran (1963) mention P(N) as an example of a difficult problem in geometrical probability. Moran & Fazekas de St Groth (1962) give an approximation formula for P(N) and estimate P(N) experimentally for the value p = 53° 26′. This angle occurs in an interesting biological application in which the sphere is a virus, X1, ..., XN are points at which antibodies attack the virus, and P(N) is the probability that N antibodies render the virus harmless. Alternatively if N observatories, situated at random on the Earth, can each look at angles p or less away from the zenith then P(N) is the probability that no point in the sky is hidden from all N observatories. Each cap covers a fraction f = sin²(½p) (1)
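
The hemisphere case makes a convenient correctness check for a simulation, since the exact formula is available there; the grid-based coverage test below is approximate and all sizes are arbitrary choices.

```python
# Monte Carlo estimate of P(N) for caps of angular radius 90 degrees,
# checked against the exact value 1 - (N^2 - N + 2) * 2^(-N).
import numpy as np

rng = np.random.default_rng(10)

def random_unit_vectors(m):
    v = rng.normal(size=(m, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

N, reps = 6, 4000
grid = random_unit_vectors(2000)     # fixed test directions on the sphere
covered = 0
for _ in range(reps):
    centers = random_unit_vectors(N)
    # a 90-degree cap centred at c covers x iff x . c >= 0
    uncovered = np.all(grid @ centers.T < 0, axis=1)
    covered += not uncovered.any()
print(f"Monte Carlo P({N}) ~ {covered / reps:.3f},",
      f"exact {1 - (N*N - N + 2) * 2.0 ** -N:.3f}")
```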

Journal ArticleDOI
TL;DR: The reason for the discrepancy between the theories is discussed and it is suggested that the discrepancy is a consequence of restricting attention to fixed sample size experiments; although the difference is not usually large, it may be serious in extreme cases.
Abstract: SUMMARY Suppose that we have a sample of observations on a continuous random variable with a distribution depending on one unknown parameter 0. We consider the problem of making inferences about 0 when there is no prior information about the value of the parameter. Some examples are discussed for which it is not possible to find Bayesian methods possessing the properties required by frequentist statisticians. The magnitude of the difference between comparable Bayesian and frequentist probability statements is calculated. It is found that although the difference is not usually large, even in small samples, it may be serious in extreme cases. We discuss the reason for the discrepancy between the theories and suggest that it is a consequence of restricting attention to fixed sample size experiments. The argument is supported by a detailed discussion of an experimental situation common in industrial

Journal ArticleDOI
TL;DR: The history of the first few years of the 1890-1905 period of mathematical statistics is described in this article, with a focus on the men concerned in the pioneer movements, from what background they started and what was the combination of circumstances which led to the particular lines of advance which they followed.
Abstract: Perhaps the two great formative periods in the history of mathematical statistics were the years 1890-1905 and 1915-30. In both, the remarkable leap forward was made in answer to a need for new theory and techniques to help in solving very real problems in the biological field. In the earlier years the original questions posed concerned the interpretation of data bearing on theories of heredity and evolution; in the later period the first call was to sharpen and develop the tools used in agricultural experimentation. There is considerable fascination in trying to find out how things looked at the time to the men concerned in such pioneer movements, from what background they started and what was the combination of circumstances which led to the particular lines of advance which they followed. Below I shall try to describe some of the history of the first few years of the 1890-1905 period. A good deal of this has already been put on record, for example, by K. Pearson (1906, 1930) and I shall draw freely on this material, but the availability of certain letters between Francis Galton (1822-1911), F. Y. Edgeworth (1845-1926), Karl Pearson (1857-1936) and W. F. R. Weldon (1860-1906) has made it possible to add some illuminating personal touches to what is already on record. The final event which brought about the association of Pearson and Weldon, leading 10 years later to the founding of Biometrika, was Weldon's election in 1890 to the Jodrell Chair of Zoology at University College, London where Pearson had been Professor of Applied Mathematics since 1884. But to understand the basis of the co-operation between these two men we must look still further back. The threads were gathered from many sources. In the second half of the 1880's Pearson was by profession an applied mathematician, a good deal of whose teaching was to students of engineering. Between 1884 and 1893 he had first prepared for the press W. K. Clifford's unfinished manuscript of The Common Sense of the Exact Sciences and had then undertaken the far more arduous task of completing Isaac Todhunter's A History of the Theory of Elasticity; the second volume of this, containing some 1300 pages, was almost entirely Pearson's contribution. But throughout the '80's his research energies were also occupied in a quite different direction, in the study of mediaeval and Renaissance German literature and folk lore. This work is recorded in the series of essays, most of them first given as lectures, which were later published in The Ethic of Freethought (1888) and The Chances of Death (1897). The

Journal ArticleDOI
TL;DR: In this paper, general expressions approximating the upper end of the C.D.F. of the largest of s non-null characteristic roots are given for the first time, and these expressions are used to compute the upper 5 and 1 % points of the largest root for s = 8, 9 and 10.
Abstract: The cumulative distribution function (C.D.F.) of the largest characteristic root of a matrix in multivariate analysis has been studied by Pillai (1954, 1956a, 1957, 1960 and 1964) with a view to obtaining an approximation to this C.D.F. useful for computing the upper percentage points. The approach, so far, has been to approximate the C.D.F. of the largest root for each value of the number, s, of non-null roots from two to seven. In this paper, general expressions are given for the first time, approximating at the upper end the C.D.F. of the largest of s non-null characteristic roots. Further, these expressions are used to compute the upper 5 and 1 % points of the largest root for s = 8, 9 and 10. It may be pointed out that Roy (1945, 1953, 1957) has shown that tests of certain hypotheses in multivariate analysis and associated confidence interval estimation can be based on the extreme characteristic roots. The monotonic character of the power functions of two multivariate tests using the largest characteristic root has been demonstrated by several authors (Roy & Mikhail, 1961; Das Gupta, Anderson & Mudholkar, 1964; Anderson & Das Gupta, 1964).

Journal ArticleDOI
TL;DR: The most common method of testing the hypothesis that an observed spatial distribution of points in the Euclidean plane is a realization of a Poisson point process, or in practical terminology that the points are distributed 'completely at random' is based on the Index of Dispersion as mentioned in this paper.
Abstract: The most familiar method of testing the hypothesis that an observed spatial distribution of points in the Euclidean plane is a realization of a Poisson point process, or in practical terminology that the points are distributed 'completely at random' is based on the Index of Dispersion. In recent years, however, statistical ecologists have shown interest in what are known as distance, or nearest neighbour methods for studying the pattern of plant distributions. These involve choosing a set of n sampling points in a fixed sampling area B according to some rule, and measuring the distances Xij from the ith sample point to the jth nearest plant to it, measurements to plants outside B being permitted if necessary. The observations Xij are then used in estimators of the density of the spatial point process of which the plant distribution is a realization, or in testing hypotheses about its form. Distance methods have been discussed by Skellam (1952), Strand (1953), Moore (1954), Hopkins (1954), Thompson (1956), Matérn (1959), Pielou (1959), Mountford (1961), Kendall & Moran (1963, p. 38), Keuls, Over & De Wit (1963), Persson (1964) and Holgate (1964, 1965a, b). Tests of randomness based wholly or partly on distance methods have been proposed by Skellam, Moore, Hopkins and Pielou (see also Mountford's paper). These tests are discussed in the present paper, with particular emphasis on their power to detect certain kinds of departure from randomness. In Holgate (1965a) two more tests are proposed and discussed.
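
One of the distance tests discussed (Hopkins's) compares point-to-plant distances with plant-to-plant nearest-neighbour distances; the sketch below implements a common form of that statistic, glossing over edge effects and the exact sampling rules of the original proposal.

```python
# Hopkins-type distance test of spatial randomness: values near 0.5 are
# consistent with a 'completely random' pattern. Illustrative only.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(11)
events = rng.uniform(0, 1, size=(200, 2))   # simulated random plant pattern
m = 25
tree = cKDTree(events)

d_point, _ = tree.query(rng.uniform(0, 1, size=(m, 2)))   # point-to-nearest-plant
idx = rng.choice(len(events), size=m, replace=False)
d_event = tree.query(events[idx], k=2)[0][:, 1]           # k=2 skips self-distance

H = np.sum(d_point**2) / (np.sum(d_point**2) + np.sum(d_event**2))
print(f"Hopkins-type statistic: {H:.3f}")
```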

Journal ArticleDOI
TL;DR: A clear and flexible framework within which inference and decision prediction can be discussed is suggested, to indicate briefly how existing inference procedures fall within this framework, and then to develop the model towards specific decision prediction procedures.
Abstract: SUMMARY A general framework is introduced for the study of inference and decision predictions about the outcome of a future experiment from the data of an independent informative experiment. This allows a simple classification of prediction problems, and shows the place of standard inference predictions within the framework. A Bayesian approach to decision prediction is then presented and techniques appropriate to a variety of realistic utility functions are developed. Finally, some prediction problems associated with classes of experiments are considered. Statistical prediction is the use of the data from an informative experiment E to make some statement about the outcome of a future experiment F. The prediction statements commonly treated in the literature are of inference type, in which the purpose is to give some indication of the likely outcome of F, or to suggest some subset of possible outcomes in which the actual outcome of F is likely to fall. There are also, however, prediction problems of a decision type, for which the decision space consists of subsets of the outcome space of F, and where the prediction is related in a much more precise way to some specific purpose. Our own interest in the subject has arisen from decision problems in the supply of hospital engineering services (e.g. oxygen, gas, conditioned air, suction, etc.). In a simple version the supply system may be supposed to function at a series of independent operations, at each of which a constant quantity r (e.g. number of outlets) of the commodity is available for supply. At each operation of the system some variable quantity y is demanded; this may be below or above r. If y > r the system has failed fully to meet demand and if y < r the system has oversupplied. The extent to which fixing the supply at r is satisfactory depends on the relative demerits of failing to meet demand and of oversupplying, and on the variation in y. Here we can suppose F to be the observation of a free demand, unrestricted by the limited supply. The informative experiment E may consist of demands x1, ..., xn on an existing similar system which has been overdesigned, so that E consists of n replicates of F. If the existing system is not overdesigned but supplies r1, say, at each operation then E may be regarded as n replicates of F, with observations truncated or censored at r1; this case can be treated only by asymptotic methods and we shall not consider it here. Our purpose in this paper is first to suggest a clear and flexible framework within which such inference and decision prediction can be discussed, to indicate briefly how existing inference procedures fall within this framework, and then to develop the model towards specific decision prediction procedures. We do this for the case in which E and F are independent experiments; thus we do not consider the situation where the outcome of E is a part realization of a stochastic process and F is the continuation of the process.
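
A toy version of the supply decision can be sketched with a normal model for demand, for which the predictive distribution of a future demand is Student-t; the target probability, data and model are all illustrative assumptions, not the paper's utility-based procedure.

```python
# Choose the smallest supply level r with predictive P(y <= r) >= target,
# under a normal demand model (predictive distribution is Student-t).
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
x = rng.normal(50.0, 8.0, size=30)      # past free demands (experiment E)
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

target = 0.95
r = xbar + s * np.sqrt(1 + 1 / n) * stats.t.ppf(target, df=n - 1)
print(f"supply level r ~ {r:.1f} meets a future demand with prob. ~ {target}")
```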

Journal ArticleDOI
TL;DR: The tables give percentage points of the S_U distribution in the same form as a corresponding set of tables for Pearson frequency curves in Johnson, Nixon, Amos & Pearson (1963), and a table of the latter form can be constructed from the present tables by using a simple relation.
Abstract: 0.9995, 0.999, 0.9975, 0.995, 0.99, 0.975, 0.95, 0.90, 0.75, 0.50, 0.25, 0.10, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.001 and 0.0005 points of the S_U distribution. The tables are in the same form as a corresponding set of tables for Pearson frequency curves in Johnson, Nixon, Amos & Pearson (1963). It is a simple matter to construct such a table from the present tables by using the relation
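
Percentage points of an S_U distribution of the kind tabled here can now be computed directly; the sketch below uses scipy.stats.johnsonsu (shape parameters a = gamma, b = delta), with arbitrary parameter values.

```python
# Percentage points of a Johnson S_U distribution via SciPy.
from scipy import stats

a, b = 1.0, 2.0    # gamma, delta in the usual S_U notation (illustrative)
for q in (0.005, 0.05, 0.25, 0.50, 0.75, 0.95, 0.995):
    print(f"{q:>6}: {stats.johnsonsu.ppf(q, a, b): .4f}")
```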