# Karl Pearson

Other affiliations: University of London

Bio: Karl Pearson is an academic researcher from University College London. The author has contributed to research in topics: Inheritance (heredity) & Population. The author has an h-index of 60 and has co-authored 351 publications receiving 23,260 citations. Previous affiliations of Karl Pearson include University of London.

##### Papers

••

TL;DR: The paper considers a system of deviations x1, x2 … xn from the means of n variables with standard deviations σ1, σ2 … σn and with correlations r12, r13, r23 … rn−1,n.

Abstract: Let x1, x2 … xn be a system of deviations from the means of n variables with standard deviations σ1, σ2 … σn and with correlations r12, r13, r23 … rn−1,n.

2,912 citations

••

TL;DR: In this paper, the authors consider a population in which sexual selection and natural selection may or may not be taking place, and assume only that the deviations from the mean in the case of any organ of any generation follow exactly or closely the normal law of frequency.

Abstract: Consider a population in which sexual selection and natural selection may or may not be taking place. Assume only that the deviations from the mean in the case of any organ of any generation follow exactly or closely the normal law of frequency, then the following expressions may be shown to give the law of inheritance of the population.

2,394 citations

••

TL;DR: This paper discusses the dissection of abnormal frequency-curves into normal curves, treating the special case of n = 2 in detail; the equations for dissection into n normal curves can be written down in the same manner but require the calculation of higher moments.

Abstract: (1.) If measurements be made of the same part or organ in several hundred or thousand specimens of the same type or family, and a curve be constructed of which the abscissa x represents the size of the organ and the ordinate y the number of specimens falling within a definite small range δx of organ, this curve may be termed a frequency-curve . The centre or origin for measurement of the organ may, if we please, be taken at the mean of all the specimens measured. In this case the frequency-curve may be looked upon as one in which the frequency—per thousand or per ten thousand, as the case may be—of a given small range of deviations from the mean, is plotted up to the mean of that range. Such frequency-curves play a large part in the mathematical theory of evolution, and have been dealt with by Mr. F. Galton, Professor Weldon, and others. In most cases, as in the case of errors of observation, they have a fairly definite symmetrical shape and one that approaches with a close degree of approximation to the well-known error or probability-curve. A frequency-curve, which, for practical purposes, can be represented by the error curve, will for the remainder of this paper be termed a normal curve . When a series of measurements gives rise to a normal curve, we may probably assume something approaching a stable condition; there is production and destruction impartially round the mean. In the case of certain biological, sociological, and economic measurements there is, however, a well-marked deviation from this normal shape, and it becomes important to determine the direction and amount of such deviation. The asymmetry may arise from the fact that the units grouped together in the measured material are not really homogeneous. It may happen that we have a mixture of 2, 3, . . . n homogeneous groups, each of which deviates about its own mean symmetrically and in a manner represented with sufficient accuracy by the normal curve. 
Thus an abnormal frequency-curve may be really built up of normal curves having parallel but not necessarily coincident axes and different parameters. Even where the material is really homogeneous, but gives an abnormal frequency-curve the amount and direction of the abnormality will be indicated if this frequency-curve can be split up into normal curves. The object of the present paper is to discuss the dissection of abnormal frequency-curves into normal curves. The equations for the dissection of a frequency-curve into n normal curves can be written down in the same manner as for the special case of n = 2 treated in this paper; they require us only to calculate higher moments. But the analytical difficulties, even for the case of n = 2, are so considerable, that it may be questioned whether the general theory could ever be applied in practice to any numerical case. There are reasons, indeed, why the resolution into two is of special importance. A family probably breaks up first into two species, rather than three or more, owing to the pressure at a given time of some particular form of natural selection; in attempting to procure an absolutely homogeneous material, we are less likely to have got a mixture of three or more heterogeneous groups than of two only. Lastly, even where the heterogeneity may be threefold or more, the dissection into two is likely to give us, at any rate, an approximation to the two chief groups. In the case of homogeneous material, with an abnormal frequency-curve, dissection into two normal curves will generally give us the amount and direction of the chief abnormality. So much, then, may be said of the value of the special case dealt with here.

1,614 citations
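The abstract above notes that a mixture of homogeneous groups, each following a normal curve, can produce an asymmetric frequency-curve. The following is a minimal sketch of that forward direction only, using the standard moment formulas for a two-component normal mixture. It is not Pearson's dissection method, which works in the reverse direction from observed moments; the function name and the example parameters are hypothetical.

```python
import math

def two_normal_mixture_moments(p, mean1, sd1, mean2, sd2):
    """Return (mean, variance, skewness) of a mixture that draws from
    N(mean1, sd1^2) with probability p and N(mean2, sd2^2) otherwise.
    Illustrative helper; names and signature are assumptions."""
    weights = (p, 1.0 - p)
    means = (mean1, mean2)
    sds = (sd1, sd2)
    # Mixture mean is the weighted mean of the component means.
    mean = sum(w * m for w, m in zip(weights, means))
    # Each component contributes its own variance plus its squared offset.
    variance = sum(w * (s * s + (m - mean) ** 2)
                   for w, m, s in zip(weights, means, sds))
    # Third central moment of the mixture (normal components have zero
    # third central moment about their own means).
    third = sum(w * ((m - mean) ** 3 + 3.0 * (m - mean) * s * s)
                for w, m, s in zip(weights, means, sds))
    skewness = third / math.pow(variance, 1.5)
    return mean, variance, skewness

# A single homogeneous group (both components identical) is symmetric.
symmetric = two_normal_mixture_moments(0.5, 0.0, 1.0, 0.0, 1.0)

# Unequal groups with separated means give a skewed curve.
skewed = two_normal_mixture_moments(0.25, 0.0, 1.0, 4.0, 1.0)
print(symmetric, skewed)
```

With the second set of parameters the mixture has mean 3, variance 4 and a negative skewness, so the asymmetry points toward the smaller group.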

••

TL;DR: A considerable portion of this memoir is devoted to the expansion and fuller development of Galton's ideas, particularly their application to the problem of bi-parental inheritance.

Abstract: There are few branches of the Theory of Evolution which appear to the mathematical statistician so much in need of exact treatment as those of Regression, Heredity, and Panmixia. Round the notion of panmixia much obscurity has accumulated, owing to the want of precise definition and quantitative measurement. The problems of regression and heredity have been dealt with by Mr. Francis Galton in his epoch-making work on ‘Natural Inheritance,’ but, although he has shown exact methods of dealing, both experimentally and mathematically, with the problems of inheritance, it does not appear that mathematicians have hitherto developed his treatment, or that biologists and medical men have yet fully appreciated that he has really shown how many of the problems which perplex them may receive at any rate a partial answer. A considerable portion of the present memoir will be devoted to the expansion and fuller development of Mr. Galton’s ideas, particularly their application to the problem of bi-parental inheritance. At the same time I shall endeavour to point out how the results apply to some current biological and medical problems. In the first place, we must definitely free our minds, in the present state of our knowledge of the mechanism of inheritance and reproduction, of any hope of reaching a mathematical relation expressing the degree of correlation between individual parent and individual offspring. The causes in any individual case of inheritance are far too complex to admit of exact treatment; and up to the present the classification of the circumstances under which greater or less degrees of correlation between special groups of parents and offspring may be expected has made but little progress.
This is largely owing to a certain prevalence of almost metaphysical speculation as to the causes of heredity, which has usurped the place of that careful collection and elaborate experiment by which alone sufficient data might have been accumulated, with a view to ultimately narrowing and specialising the circumstances under which correlation was measured. We must proceed from inheritance in the mass to inheritance in narrower and narrower classes, rather than attempt to build up general rules on the observation of individual instances. Shortly, we must proceed by the method of statistics, rather than by the consideration of typical cases. It may seem discouraging to the medical practitioner, with the problem before him of inheritance in a particular family, to be told that nothing but averages, means, and probabilities with regard to large classes can as yet be scientifically dealt with; but the very nature of the distribution of variation, whether healthy or morbid, seems to indicate that we are dealing with that sphere of indefinitely numerous small causes, which in so many other instances has shown itself only amenable to the calculus of chance, and not to any analysis of the individual instance. On the other hand, the mathematical theory will be of assistance to the medical man by answering, inter alia, in its discussion of regression the problem as to the average effect upon the offspring of given degrees of morbid variation in the parents. It may enable the physician, in many cases, to state a belief based on a high degree of probability, if it offers no ground for dogma in individual cases. One of the most noteworthy results of Mr. Francis Galton’s researches is his discovery of the mode in which a population actually reproduces itself by regression and fraternal variation. It is with some expansion and fuller mathematical treatment of these ideas that this memoir commences.

1,367 citations

##### Cited by

••

TL;DR: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject.

Abstract: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject. Moreover, this probability of a correct ranking is the same quantity that is estimated by the already well-studied nonparametric Wilcoxon statistic. These two relationships are exploited to (a) provide rapid closed-form expressions for the approximate magnitude of the sampling variability, i.e., standard error that one uses to accompany the area under a smoothed ROC curve, (b) guide in determining the size of the sample required to provide a sufficiently reliable estimate of this area, and (c) determine how large sample sizes should be to ensure that one can statistically detect difference...

19,398 citations
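The equivalence stated in the abstract above, between the area under the ROC curve and the probability of a correct ranking estimated by the Wilcoxon statistic, can be checked numerically. The sketch below is an illustration under stated assumptions: the function names and the rating data are hypothetical, ties are counted as one half, and the code does not reproduce the paper's standard-error formulas.

```python
from itertools import product

def auc_by_pairs(diseased, healthy):
    """Fraction of (diseased, healthy) pairs in which the diseased
    subject has the higher rating; tied pairs count one half."""
    wins = 0.0
    for d, h in product(diseased, healthy):
        if d > h:
            wins += 1.0
        elif d == h:
            wins += 0.5
    return wins / (len(diseased) * len(healthy))

def auc_by_rank_sum(diseased, healthy):
    """Same quantity from the Wilcoxon rank sum of the diseased group,
    using midranks for tied ratings."""
    pooled = sorted(diseased + healthy)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at sorted positions i+1 .. j share the mean of those ranks.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_d, n_h = len(diseased), len(healthy)
    rank_sum = sum(rank_of[x] for x in diseased)
    # Subtract the smallest possible rank sum to get the U statistic.
    u = rank_sum - n_d * (n_d + 1) / 2
    return u / (n_d * n_h)

# Hypothetical ratings on a 1-5 suspicion scale.
diseased_scores = [3, 4, 4, 5, 5]
healthy_scores = [1, 2, 2, 3, 4]

print(auc_by_pairs(diseased_scores, healthy_scores))
print(auc_by_rank_sum(diseased_scores, healthy_scores))
```

Both functions give the same value for these ratings, which is the point of the equivalence: the pairwise definition is easier to interpret, while the rank-sum form avoids the quadratic loop over pairs.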

••

TL;DR: In this article, a test of the hypothesis that the samples are from the same population is made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided.

Abstract: Given C samples, with n_i observations in the ith sample, a test of the hypothesis that the samples are from the same population may be made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided. One of the most important applications of the test is in detecting differences among the population means. (Based in part on research supported by the Office of Naval Research at the Statistical Research Center, University of Chicago.)

9,365 citations
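The ranking procedure described in the abstract above can be sketched in a few lines. The abstract does not spell out the formula for H; the code assumes the standard form H = 12 / (N(N + 1)) · Σ R_i² / n_i − 3(N + 1), where R_i is the rank sum of sample i and N = Σn_i, and it omits the usual correction factor for ties. The function names and the example data are hypothetical.

```python
def midranks(values):
    """Map each distinct value to its rank in the pooled sample,
    giving tied values the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return rank_of

def kruskal_wallis_h(samples):
    """H statistic without tie correction (assumed standard form)."""
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    rank_of = midranks(pooled)
    weighted = 0.0
    for sample in samples:
        rank_sum = sum(rank_of[x] for x in sample)
        weighted += rank_sum * rank_sum / len(sample)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

# Three hypothetical samples with no overlap between groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(h)
```

For C = 3 samples, H would be compared against χ² with 2 degrees of freedom; with samples this small the abstract's caveat applies and exact tables are the safer reference.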

••

TL;DR: Fractional kinetic equations of the diffusion, diffusion-advection, and Fokker-Planck type are presented as a useful approach for the description of transport dynamics in complex systems which are governed by anomalous diffusion and non-exponential relaxation patterns.

7,412 citations

••

TL;DR: This article presents a multiple comparison procedure for comparing several treatments with a control.

Abstract: (1955). A Multiple Comparison Procedure for Comparing Several Treatments with a Control. Journal of the American Statistical Association: Vol. 50, No. 272, pp. 1096-1121.

5,756 citations

••

TL;DR: Measures of directional and stabilizing selection on each of a set of phenotypically correlated characters are derived. The analysis is retrospective, based on observed changes in the multivariate distribution of characters within a generation, not on the evolutionary response to selection.

Abstract: Natural selection acts on phenotypes, regardless of their genetic basis, and produces immediate phenotypic effects within a generation that can be measured without recourse to principles of heredity or evolution. In contrast, evolutionary response to selection, the genetic change that occurs from one generation to the next, does depend on genetic variation. Animal and plant breeders routinely distinguish phenotypic selection from evolutionary response to selection (Mayo, 1980; Falconer, 1981). Upon making this critical distinction, emphasized by Haldane (1954), precise methods can be formulated for the measurement of phenotypic natural selection. Correlations between characters seriously complicate the measurement of phenotypic selection, because selection on a particular trait produces not only a direct effect on the distribution of that trait in a population, but also produces indirect effects on the distribution of correlated characters. The problem of character correlations has been largely ignored in current methods for measuring natural selection on quantitative traits. Selection has usually been treated as if it acted only on single characters (e.g., Haldane, 1954; Van Valen, 1965a; O'Donald, 1968, 1970; reviewed by Johnson, 1976 Ch. 7). This is obviously a tremendous oversimplification, since natural selection acts on many characters simultaneously and phenotypic correlations between traits are ubiquitous. In an important but neglected paper, Pearson (1903) showed that multivariate statistics could be used to disentangle the direct and indirect effects of selection to determine which traits in a correlated ensemble are the focus of direct selection. Here we extend and generalize Pearson's major results. The purpose of this paper is to derive measures of directional and stabilizing (or disruptive) selection on each of a set of phenotypically correlated characters. 
The analysis is retrospective, based on observed changes in the multivariate distribution of characters within a generation, not on the evolutionary response to selection. Nevertheless, the measures we propose have a close connection with equations for evolutionary change. Many other commonly used measures of the intensity of selection (such as selective mortality, change in mean fitness, variance in fitness, or estimates of particular forms of fitness functions) have little predictive value in relation to evolutionary change in quantitative traits. To demonstrate the utility of our approach, we analyze selection on four morphological characters in a population of pentatomid bugs during a brief period of high mortality. We also summarize a multivariate selection analysis on nine morphological characters of house sparrows caught in a severe winter storm, using the classic data of Bumpus (1899). Direct observations and measurements of natural selection serve to clarify one of the major factors of evolution. Critiques of the "adaptationist program" (Lewontin, 1978; Gould and Lewontin, 1979) stress that adaptation and selection are often invoked without strong supporting evidence. We suggest quantitative measurements of selection as the best alternative to the fabrication of adaptive scenarios. Our optimism that measurement can replace rhetorical claims for adaptation and selection is founded in the growing success of field workers in their efforts to measure major components of fitness in natural populations (e.g., Thornhill, 1976; Howard, 1979; Downhower and Brown, 1980; Boag and Grant, 1981; Clutton-Brock et

4,990 citations