
Showing papers in "Biometrika in 1938"


Journal Article
TL;DR: This paper proposes a measure of rank correlation for comparing two rankings of the same set of individuals, either to judge an observer's ranking against an objectively defined order or to judge whether two observers' rankings are alike enough to indicate similarity of taste.
Abstract: 1. In psychological work the problem of comparing two different rankings of the same set of individuals may be divided into two types. In the first type the individuals have a given order A which is objectively defined with reference to some quality, and a characteristic question is: if an observer ranks the individuals in an order B, does a comparison of B with A suggest that he possesses a reliable judgment of the quality, or, alternatively, is it probable that B could have arisen by chance? In the second type no objective order is given. Two observers consider the individuals and rank them in orders A and B. The question now is, are these orders sufficiently alike to indicate similarity of taste in the observers, or, on the other hand, are A and B incompatible within assigned limits of probability? An example of the first type occurs in the familiar experiments wherein an observer has to arrange a known set of weights in ascending order of weight; the second type would arise if two observers had to rank a set of musical compositions in order of preference. The measure of rank correlation proposed in this paper is capable of being applied to both problems, which are, in fact, formally very much the same. For purposes of simplicity in the exposition it has, however, been thought convenient to preserve a distinction between them.

5,688 citations
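
The abstract above does not state the formula for the proposed measure. As a minimal sketch, assuming it is the pairwise-agreement coefficient now usually called Kendall's tau, and assuming neither ranking contains ties, the comparison of orders A and B could be computed as follows. The function name `rank_correlation` and the example rankings are illustrative, not taken from the paper.

```python
from itertools import combinations


def rank_correlation(rank_a, rank_b):
    """Pairwise-agreement rank correlation (assumed: Kendall's tau, no ties).

    rank_a[i] and rank_b[i] are the ranks given to individual i in
    orders A and B. Each pair of individuals scores +1 if both orders
    place the pair the same way round and -1 if they disagree; the total
    is divided by the number of pairs, n(n - 1)/2.
    """
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("need two rankings of the same n >= 2 individuals")
    score = 0
    for i, j in combinations(range(n), 2):
        product = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if product > 0:
            score += 1      # pair ordered the same way in A and B
        elif product < 0:
            score -= 1      # pair ordered oppositely in A and B
    return score / (n * (n - 1) / 2)


# First type of problem: objective order A against an observer's order B.
objective_order = [1, 2, 3, 4]
observer_order = [1, 3, 2, 4]   # hypothetical ranking with one swapped pair
tau = rank_correlation(objective_order, observer_order)
```

Under these assumptions the coefficient is +1 for identical orders and -1 for exactly reversed orders. Whether a given value could have arisen by chance is a separate question that this sketch does not address.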





Journal Article
TL;DR: This paper derives a quantity that may be regarded as a generalization of Fisher's z and that provides a test for certain generalized analysis of variance problems, with the advantage of being easy to apply.
Abstract: and $A_{ij}$ is the cofactor of $a_{ij}$ in the determinant $|A| = |a_{ij}|$. At a later date Wilks (1932) defined a generalized variance and found the appropriate $\lambda$-criteria for testing certain hypotheses concerning the means, variances, and covariances of k normal multivariate populations from which k independent samples have been drawn. These criteria were developed more fully by Pearson and Wilks in a subsequent paper (1933) for the case of two variates, but though the sampling distributions obtained were in certain cases relatively simple the arithmetical calculations required for practical application were, in general, not of a very simple nature. In this paper we shall find a quantity which may be regarded as a generalization of Fisher's z and which provides a test suitable for dealing with certain generalized analysis of variance problems, having the advantage of being easy to apply.

186 citations


Journal Article
TL;DR: To test the hypothesis $H_0$ that n independent observations $x_1, x_2, \ldots, x_n$ follow a probability law of specified form $p(x \mid H_0)$, one may instead test the equivalent hypothesis $h_0$ that the values $y_1, y_2, \ldots, y_n$ obtained by the transformation (2) have been randomly drawn from the rectangular distribution (3).
Abstract: y is a non-decreasing function of x, having values confined to the interval (0, 1). Further, $p(y) = p(x) \Big/ \dfrac{dy}{dx} = 1$ for $0 < y < 1$. (3) In other words the probability law for the integral, y, is rectangular, all values of y between 0 and 1 being equally likely to occur. It follows that if we wish to use a set of n independent observations $x_1, x_2, \ldots, x_n$ to test the hypothesis $H_0$ that a probability law is of specified form, say $p(x \mid H_0)$, it may be possible to carry out this by testing the equivalent hypothesis, $h_0$, that the corresponding values $y_1, y_2, \ldots, y_n$, obtained by means of the transformation (2), have been randomly drawn from the rectangular distribution (3). The relation between $x_i$ and $y_i$ is illustrated in Fig. 1; corresponding to the abscissae $x_i$ (i = 1, 2, ..., 10) of the ten ordinates drawn above, are ten values of y shown below on the scale 0 to 1. The hypothesis $H_0$ that the ten x's are a random sample from a population distribution represented by the frequency curve is therefore equivalent to the hypothesis $h_0$ that the ten y's form a random sample from a rectangular distribution, range 0 to 1. If the probability laws p(x) are not the same for all the x's, so that

137 citations
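
Transformation (2) is not reproduced in the abstract above. As a minimal sketch, assuming it is the probability integral of the hypothesized law (y equal to the cumulative distribution function evaluated at x), the step from the x's to the y's could look like this. The standard normal null hypothesis, the helper name `probability_integral_transform` and the sample values are illustrative assumptions, not taken from the paper.

```python
from statistics import NormalDist


def probability_integral_transform(observations, cdf):
    """Map each observation x to y = F(x), the integral of p(x | H0) up to x.

    If H0 is true, the resulting y's behave like a random sample from the
    rectangular (uniform) distribution on the interval (0, 1).
    """
    return [cdf(x) for x in observations]


# Assumed H0 for illustration only: a standard normal probability law.
h0 = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical observations, listed in ascending order.
xs = [-1.2, -0.4, 0.0, 0.3, 1.5]
ys = probability_integral_transform(xs, h0.cdf)
```

Because y is a non-decreasing function of x, the y's keep the order of the x's and all lie between 0 and 1. Testing whether they are compatible with a rectangular distribution is the second step, which this sketch leaves open.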




Journal Article

38 citations




Journal Article
TL;DR: In sampling from curves with one mode not at the end of the range, the distributions of statistics such as "Student's" t, the correlation coefficient and, in certain cases, Fisher's z differ very slightly from one population to another; the author argues that further experimental sampling is the most likely route to conclusions of practical value.
Abstract: THE theoretical distribution of many statistics calculated from small samples is known when the population is normal, but when it is not normal we know very little about the distribution of such statistics. Such work as has been done has generally assumed population forms of standard types, but we may occasionally come up against samples from populations which do not appear to fit into any known type. This has led to many attempts being made to build up, by experimental sampling from non-normal data, partial populations of samples from which can be inferred in an empirical way the laws of distribution followed by derived statistics. A list of papers dealing with this subject which have come to the author's notice is given on pp. 79, 80 below. In many cases it has been found that in sampling from curves with one mode not at the end of the range, the distributions of statistics such as "Student's" "t", the correlation coefficient and, in certain cases, Fisher's "z", differ very slightly from one population to another. On the whole these investigations have suggested that in such cases we can neglect the departure of the population from normality without introducing serious error into our tests of significance. The possibility of further theoretical work must not be overlooked, but unless our results are independent of population form (as, for instance, in recent work by Pitman and Welch) it is unlikely that we shall be able to make much practical use of the results. It is customary to designate a non-normal population by the values of $\beta_1$ and $\beta_2$; but in the case of samples of 100 or less from a normal population the range of values of $\beta_1$ and $\beta_2$ excluding 5 % of the total at each end is comparable with the range of $\beta_1$ and $\beta_2$ in the non-normal populations which have been used for sampling experiments. Further, this range of populations is considered by E. S. Pearson to cover most cases which will be found to occur in practice.
On these grounds I think that conclusions of practical value are most likely to be reached by further sampling. No attempt appears to have been made to carry out an experimental sampling from a bivariate population in which the distribution surface is not normal and in which the correlation coefficient is high, or to take sets of samples from a univariate non-normal population and to assign the samples to blocks and treatments in a randomized block experiment, taking a completely fresh







Journal Article
TL;DR: This note replies to what its author describes as a regular campaign by Prof. R. A. Fisher to discredit the work of the late Karl Pearson, taking as a single instance Fisher's (1937b) remark that the confusion of thought into which Sheppard's work brought order is well illustrated throughout Pearson's (1895) paper.
Abstract: READERS of statistical journals can hardly fail to have noticed what appears to be a regular campaign carried on by Prof. R. A. Fisher to discredit the work of the late Karl Pearson. Owing to the tone and form of these depreciations, one feels reluctant to reply, but it seems to be useful to point out just one instance illustrating the methods used by Prof. Fisher. Commenting on the achievements of the late W. F. Sheppard, Prof. Fisher (1937b, pp. 10-11 note) recently wrote: "The confusion of thought into which his [Sheppard's] work brought order is well illustrated throughout Pearson's (1895) paper. Thus to the formula


Journal Article
TL;DR: This paper considers the sickness recovery curve (S.R.C.) of a population at risk: the number still sick at any instant out of those who fell sick at a specified previous instant, which, with no deaths from sickness and no incubation period, decreases monotonically with time towards zero.
Abstract: IN any population at risk there is a certain morbidity rate, and ideally there would be machinery for recording the number falling sick at any instant, or more precisely, per unit time. Of those who so fall sick, the great majority recover, and ideally that fraction so recovering of the number first recorded would be recorded also at any subsequent instant. If the death-rate from sickness is nil we should then have a sickness recovery curve (S.R.C.) which would approach asymptotically with unlimited time the value zero, the curve giving at any instant the number still sick out of those who fell sick at any specified previous instant. We may further suppose that of the number who at any particular instant become unfit all are at once as sick then as they will be, i.e. there is no incubation period, and that they all immediately begin, in varying degrees and at varying rates, to recover. The S.R.C. will therefore be a J curve, monotone in its decrease with time. Thus in a particular day secondary school, taking boys and girls of ages 10-19, a table of duration of absence was as follows:
