
Showing papers in "Biometrics in 1965"



Journal Article•DOI•
TL;DR: In this article, the division of a population into equal age groups is replaced by one of unequal stage groups, no assumptions being made about the variation of the duration of the stage that different individuals may show.
Abstract: In this extension to the use of matrices in population mathematics (Lewis [1942] and Leslie [1945]), the division of a population into equal age groups is replaced by one of unequal stage groups, no assumptions being made about the variation of the duration of the stage that different individuals may show. This extension has application in ecological studies where the age of an individual is rarely known. The model is briefly applied to three experimental situations.

933 citations
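
The stage-classified projection described above can be sketched in a few lines. This is a minimal illustration, not the paper's model: the matrix entries (stage retention, advancement, and fecundity rates) and the three stages are invented for the example; the key feature, unlike an age-classified Leslie matrix, is that individuals may remain in a stage (nonzero diagonal) rather than advance every time step.

```python
# Minimal sketch of a stage-classified (Lefkovitch-style) projection.
# All rates below are illustrative, not taken from the paper.

def project(matrix, pop, steps):
    """Left-multiply the population vector by the stage matrix `steps` times."""
    for _ in range(steps):
        pop = [sum(matrix[i][j] * pop[j] for j in range(len(pop)))
               for i in range(len(matrix))]
    return pop

# Rows/cols: egg, larva, adult.  Column j gives the per-step fate of stage j:
# remain in stage (diagonal), advance to the next stage, or (top row) reproduce.
A = [
    [0.2, 0.0, 4.0],   # eggs: 20% remain eggs; each adult contributes 4 eggs
    [0.5, 0.6, 0.0],   # 50% of eggs hatch; 60% of larvae stay larvae
    [0.0, 0.3, 0.7],   # 30% of larvae mature; 70% adult survival
]

pop = [100.0, 20.0, 5.0]
print(project(A, pop, 1))  # one-step projection
```

Because stage durations are not assumed equal, the diagonal terms absorb the unknown variation in how long individuals spend in each stage.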




Journal Article•DOI•
TL;DR: This comparison shows that the two-period change-over design is preferable when the residual effects of the treatments are equal and the correlation between the responses to the two treatments is positive; otherwise the design in which there is random assignment to a single treatment is preferable.

Abstract: Some properties of the two-period change-over design are investigated. It is shown that if the effect of subjects is assumed to be a random variable, the difference between the direct effects and the difference between the residual effects of treatments are estimable, but the difference between periods is not. The amount of experimentation necessary to achieve a specified power of the test of equality of the direct effects of treatments resulting from a two-period change-over design is compared to the amount required for a design in which the subjects are assigned randomly to a single treatment. This comparison shows that the two-period change-over design is preferable when the residual effects of the treatments are equal and the correlation between the responses to the two treatments is positive. Otherwise the design in which there is random assignment to a single treatment is preferable.

765 citations


Journal Article•DOI•
TL;DR: A method for investigating the relation of points in multidimensional space: using an analysis of variance technique, the points are divided into the two most-compact clusters, and the process is repeated sequentially so that a tree diagram is formed.

Abstract: A method for investigating the relation of points in multidimensional space is described. Using an analysis of variance technique, the points are divided into the two most-compact clusters, and the process is repeated sequentially so that a tree diagram is formed. It is pointed out that the method is well suited to electronic computing. The application of the method to problems of classification is stressed, and numerical examples are given of applications in the analysis of chromosome patterns in cells. (C.H.)

480 citations
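
A single divisive step of the kind the abstract describes can be sketched as follows. This is an illustration restricted to one dimension for brevity (where the best two-cluster partition is always a cut of the sorted data); the data and the exhaustive search are assumptions, not the paper's algorithm, but the criterion, minimizing the combined within-cluster sum of squares, is the analysis-of-variance idea it names. Applying the split recursively to each cluster yields the tree diagram.

```python
# One divisive step: split points into the two most-compact clusters
# (minimum total within-cluster sum of squares).

def wss(xs):
    """Within-cluster sum of squares about the cluster mean."""
    if not xs:
        return 0.0
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_split(points):
    """Try every cut of the sorted data; return the split with the
    minimal combined within-cluster sum of squares."""
    pts = sorted(points)
    best = None
    for k in range(1, len(pts)):
        left, right = pts[:k], pts[k:]
        score = wss(left) + wss(right)
        if best is None or score < best[0]:
            best = (score, left, right)
    return best[1], best[2]

left, right = best_split([1.0, 1.2, 0.8, 9.9, 10.1, 10.0])
print(left, right)  # the two compact groups around 1 and around 10
```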


Journal Article•DOI•
TL;DR: It may be said forthwith that factor analysis's most valuable functions lie in the biological and behavioral sciences, where a great array of phenomena are multiply determined and where the conceptual independent variables are not easily located and agreed upon.

Abstract: Factor analysis aims to explain observed relations among numerous variables in terms of simpler relations. The simplification may consist of producing a set of classificatory categories, or creating a smaller number of hypothetical variables. Such resolution is so central to all scientific work that we must pause for a moment to ask about the relation of factor analysis to investigatory methods in general. However, it may be said forthwith that its most valuable functions lie in the biological and behavioral sciences, where a great array of phenomena are multiply determined and where the conceptual independent variables are not easily located and agreed upon. For example, in the behavioral sciences, a student of delinquency or mental disorder may have doubts whether there is such a single thing as delinquency or mental disorder and must have an open mind as to where the important causal influences lie in the bewildering array of possibilities. In such a situation he may take a set of variables, each semantically entitled to be called a form of delinquency, measure each on members of a population, and then seek the functional unities by computing correlation coefficients for each pair of variables over the list of people concerned. Into this correlation matrix (as we designate the square matrix of coefficients shown in Table 1 below) he may also introduce measures of putative causal influences, seeking in the subsequent factor analysis to find those underlying 'hypothetical variables' or factors among them which are of theoretical importance. Or again, a biologist concerned with nutrition might correlate, over many animals each fed in a different way, various signs of health and disease and various ingredients in the food. The vitamins, for example, might historically have been located (perhaps more economically of research time than they were) as factors, each with a certain pattern of deficiency signs and of relative presence in various foods.

362 citations


Journal Article•DOI•
TL;DR: This paper presents a method of estimating survival distributions when the survival times are assumed to follow simple exponential distributions, with a different parameter for each patient.

Abstract: Concomitant information on a patient's condition often accompanies survival time information. This paper presents a method of estimating survival distributions when the survival times are assumed to follow simple exponential distributions, with a different parameter for each patient. The parameter associated with each patient's distribution is functionally related to the concomitant variates. The parameters are estimated by the method of maximum likelihood. An illustration of the method using acute leukemia survival data is given.

357 citations
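
The likelihood in a model of this kind can be sketched directly. The log-linear link lambda_i = exp(a + b * x_i) and the data below are assumptions for illustration (the paper only says the rate is functionally related to the concomitant variates); the exponential log-likelihood itself is standard. With the covariate coefficient fixed at zero, the maximizing intercept has the closed form a = log(n / sum(t)), which gives a quick check that the gradient vanishes there.

```python
# Sketch: exponential survival times with a patient-specific rate
# lambda_i = exp(a + b * x_i).  Link and data are illustrative.
import math

def log_lik(a, b, times, xs):
    """Exponential log-likelihood: sum of log(lambda_i) - lambda_i * t_i."""
    total = 0.0
    for t, x in zip(times, xs):
        lam = math.exp(a + b * x)
        total += math.log(lam) - lam * t
    return total

times = [2.0, 5.0, 1.0, 8.0]   # survival times (fully observed here)
xs = [1.0, 0.0, 1.0, 0.0]      # a single concomitant variate

# With b fixed at 0 the MLE of a is a = log(n / sum(t)); the derivative
# of the log-likelihood with respect to a is zero at that point.
a_hat = math.log(len(times) / sum(times))
grad_a = sum(1.0 - math.exp(a_hat) * t for t in times)
print(round(grad_a, 10))                       # ~0 at the maximum
print(round(log_lik(a_hat, 0.0, times, xs), 4))  # → -9.5452
```

In the paper's full setting both coefficients would be found by maximizing this likelihood numerically.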


Journal Article•DOI•
TL;DR: In this article, an analysis of an experiment in which p species are compared by growing pure stand and mixtures in pots or plots at equivalent total density of planting is presented, where each species is grown in monoculture and each combination of two grown in equal proportion.
Abstract: Biologists have, for some time, been studying the effect of growing one plant species in close proximity to another. In some cases the yield of a species is increased over its yield in monoculture and in other cases decreased. A species is regarded as a good competitor if its yield generally increases when grown with other species. This paper presents an analysis of an experiment in which p species are compared by growing pure stand and mixtures in pots or plots at equivalent total density of planting. Each species is grown in monoculture and each combination of two grown in equal proportion. It is assumed that only one harvest is made and that each species is competing either for just one nutrient or for the same nutrients with no differentiation being made between competition for differing nutrients. Such an analysis has previously been presented by K. Sakai [1961] and Williams [1962] and, although the analysis is essentially similar to that of Williams, a different parametrisation is used which corresponds more closely to easily defined biological concepts.

201 citations


Journal Article•DOI•
TL;DR: A Monte Carlo investigation of 2 × n tables with fixed marginals shows that the probability of Type I error given by the conventional χ² test is in general conservative for 5 or more degrees of freedom, even when expectations of successes are very small in each cell.

Abstract: A Monte Carlo investigation of 2 × n tables with fixed marginals has been performed. The results of the Monte Carlo distribution show that the probability of Type I error given by the conventional χ² test is in general conservative for 5 or more degrees of freedom, even when expectations of successes are very small in each cell.

189 citations
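
A study of this shape is easy to reproduce in miniature. The column sizes, success total, and replicate count below are invented; tables with both margins fixed are generated by randomly permuting a pooled 0/1 vector, and the fraction of conventional χ² statistics exceeding the nominal 5% point (11.070 on 5 d.f.) estimates the true Type I error rate.

```python
# Monte Carlo sketch: 2 x n tables with fixed marginals, conventional
# chi-square statistic.  Dimensions and counts are illustrative.
import random

def chi_square_2xn(successes, totals, overall_p):
    """Conventional X^2 against expected counts totals[j] * overall_p."""
    stat = 0.0
    for s, n in zip(successes, totals):
        e1, e0 = n * overall_p, n * (1 - overall_p)
        stat += (s - e1) ** 2 / e1 + ((n - s) - e0) ** 2 / e0
    return stat

def random_table(totals, total_successes, rng):
    """Fix both margins by shuffling a pooled 0/1 vector into columns."""
    pool = [1] * total_successes + [0] * (sum(totals) - total_successes)
    rng.shuffle(pool)
    out, i = [], 0
    for n in totals:
        out.append(sum(pool[i:i + n]))
        i += n
    return out

rng = random.Random(1965)
totals, s_total = [8] * 6, 12            # 6 columns of 8; 12 successes in all
p = s_total / sum(totals)                # expected successes per cell: only 2
stats = [chi_square_2xn(random_table(totals, s_total, rng), totals, p)
         for _ in range(2000)]
rate = sum(s > 11.070 for s in stats) / len(stats)  # nominal 5% point, 5 d.f.
print(rate)
```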


Journal Article•DOI•
TL;DR: From factoring the correlations among factors one can derive second and higher order factor systems which do not exist in the orthogonal case, and the implications of the oblique model are outlined.
Abstract: If, as the extensive findings with simple structure, supported by the few presently available confactor rotations, suggest, oblique factors are the rule in nature, with orthogonality as a special, rather rare, case, it behoves us to outline more fully the implications of the oblique model, for it has certain important new properties. Two main consequences appear, both unknown in the orthogonal case: (1) The correlations of a factor with its variables, on the one hand, and its loadings on them on the other, which fuse into the same values in orthogonal factors, here become recognizably and importantly different; (2) From factoring the correlations among factors one can derive second and higher order factor systems which do not exist in the orthogonal case. The differentiation of a correlation and a loading can be shown geometrically in Figure 3, which also brings out the distinction between reference vectors and factors. The reference vector is the perpendicular to a hyperplane, and it is by examination of the projections formed by correlations with the reference vector that we perform rotations for simple structure. The corresponding factor is the line of intersection of the hyperplanes of all the reference vectors other than the one perpendicular to the hyperplane. Corresponding reference vectors and factors are shown in Figure 3, whence one sees that where the former are positively correlated the latter are inversely correlated. In the orthogonal case, reference vectors and factors are one and the same. As Figure 3 shows, the lines of projection of variables on factor axes, as in all mathematical oblique representation, must run parallel, in getting projections on one factor, to the other factor axis. Thus, as is easily seen, the correlations of variables with a reference vector are

174 citations




Journal Article•DOI•
TL;DR: It is assumed that only carriers are responsible for the spread of the disease, and that public health measures are efficient enough to isolate infected individuals who may be able to transmit the disease to others.
Abstract: Quantitative problems in the phenomenon of epidemics have been of interest for over fifty years. Specific applications of the theoretical developments have been made to malaria (Ross [1911]; Martini [1921]), to measles (Wilson et al. [1939]; Bartlett [1957]), and to other communicable diseases. Specific references may be found in the excellent monograph by Bailey [1957]. All of the diseases whose theory has been developed so far are such that a population can be divided into subpopulations whose members are considered to be susceptible, infected, or immune. However, there are several diseases in which carriers are a significant factor in the spread of the epidemic. The prime example of these is typhoid, although carriers may be important in the spread of bilharzia, amoebic dysentery, and typhus. A carrier is defined to be an individual who does not have overt disease symptoms but nevertheless is able to communicate the disease to others. Under this category we may include not only human carriers but also inanimate sources of disease such as polluted streams which may be used by a fairly large population. Diseases involving carriers are still important notwithstanding modern health controls, as witness the recent outbreak of typhoid in Zermatt, Switzerland. In more primitive societies the problem can obviously be more acute. To date, no theory seems to have been developed to make quantitative the factors involved in a carrier-borne disease. It is the purpose of this paper to analyze a fairly simple and admittedly incomplete model of an epidemic involving carriers. I will assume, in the present work, that only carriers are responsible for the spread of the disease. By implication this assumes that public health measures are efficient enough to isolate infected individuals who may be able to transmit the disease to others. This may not be as unrealistic as it sounds. Consider, for example, the case of typhoid.
It is estimated that about one or two percent of all those who have re-
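
A carrier-driven epidemic of the kind described can be sketched with a toy pair of rate equations. Everything here is an assumption for illustration, not the paper's model: susceptibles S are infected only through contact with carriers C (rate beta), carriers are detected and removed at constant rate gamma, and the equations are stepped with a crude Euler scheme.

```python
# Illustrative carrier model: dS/dt = -beta*S*C, dC/dt = -gamma*C.
# Rates, initial values, and the Euler step are invented.

def simulate(s0, c0, beta, gamma, dt, steps):
    """Euler integration; carriers decay, susceptibles decline while
    carriers remain, and some susceptibles escape infection entirely."""
    s, c = s0, c0
    for _ in range(steps):
        s, c = s - beta * s * c * dt, c - gamma * c * dt
    return s, c

s_final, c_final = simulate(s0=1000.0, c0=10.0, beta=0.0005,
                            gamma=0.5, dt=0.01, steps=2000)
print(round(s_final, 1), round(c_final, 4))
```

Because the carriers are removed exponentially, the epidemic is self-limiting even without depleting the susceptible pool, which is the qualitative feature that distinguishes this setting from the classical susceptible-infected-immune models cited above.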

Journal Article•DOI•
TL;DR: Three-cycled Fourier equations are used in representing the frontal and lateral views of the human face and the general pattern of a group of individuals can also be obtained.
Abstract: Three-cycled Fourier equations are used in representing the frontal and lateral views of the human face. Fourier equations representing the general pattern of a group of individuals can also be obtained. The group pattern then represents the pattern characteristics common to all individuals. Difference in patterns may be represented by the difference between two Fourier equations. In the frontal view, the cosine terms measure the form of symmetry and the sine terms measure the form of asymmetry.

Journal Article•DOI•
TL;DR: In this article, the authors review various methods commonly used to combine the results from several 2 × 2 contingency tables and discuss Cochran's suggested approach and test of significance, which consists of calculating a weighted average of the differences in efficacy between the two treatments in the various 2 × 2 tables, the weights being based on some consideration of optimality.

Abstract: After reviewing various methods commonly used to combine the results from several 2 × 2 contingency tables, Cochran [1954] suggested a new approach and test of significance. Although the method is general, it will for convenience be discussed against the background of clinical trials. Briefly, it consists of calculating a weighted average of the differences in efficacy between the two treatments in the various 2 × 2 tables, the weights being based on some consideration of optimality.
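
The combination the abstract describes can be sketched in a few lines. The weight w = n1*n2/(n1+n2) used below is one of the choices associated with Cochran's 1954 suggestion; treat it and the invented trial data as assumptions of this sketch rather than the paper's exact recommendation.

```python
# Weighted average of treatment-control differences in proportions
# across several 2x2 tables.  Data and weights are illustrative.

def combined_difference(tables):
    """tables: list of (a, n1, b, n2) = successes/arm size per table."""
    num = den = 0.0
    for a, n1, b, n2 in tables:
        w = n1 * n2 / (n1 + n2)        # one common weighting choice
        num += w * (a / n1 - b / n2)   # per-table difference in proportions
        den += w
    return num / den

trials = [
    (12, 20, 8, 20),   # trial 1: 60% vs 40%
    (30, 50, 20, 50),  # trial 2: 60% vs 40%
]
print(combined_difference(trials))  # both differences are 0.2
```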

Journal Article•DOI•

Journal Article•DOI•
TL;DR: In this paper, the relative efficiency of matched and simple random samples in a variety of experimental situations is empirically assessed on a series of experiments simulated with the aid of an electronic computer.
Abstract: The general principles of matching and the appropriate statistical tests are discussed. The relative efficiency of matched and simple random samples in a variety of experimental situations is empirically assessed on a series of experiments simulated with the aid of an electronic computer. It is concluded that matching appears to be less efficient than covariance analysis on simple random samples for investigations involving quantitative response irrespective of whether the matching criteria are quantitative or qualitative. For investigations with all-or-none response, matching is a useful technique for ensuring group comparability and increasing the precision of the comparison. The effect of matching on the duration of experiments is also considered.

Journal Article•DOI•
TL;DR: In this paper, formulae are presented for predicting the correlation between a ratio and its denominator in terms of the correlation between the numerator and the denominator and of the coefficients of variation of these two variables.

Abstract: Formulae are presented for the prediction of the correlation between a ratio and its denominator in terms of the correlation between the numerator and the denominator and of the coefficients of variation of these two variables. Graphs of the predicted values are shown for various ratios of the two coefficients of variation, as the primary correlation between numerator and denominator changes from -1.0 to +1.0. Experimental results from mice studies show reasonable agreement between the predicted and the actual correlation between gain in body weight and 'efficiency of feed use' when the latter is measured by the ratio, feed consumed/gain in body weight.
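
A prediction of this kind can be written down from a delta-method (first-order Taylor) approximation. The exact form below, for the correlation of w = y/x with x given the numerator-denominator correlation r and the coefficients of variation cv_x and cv_y, is a standard approximation and should be taken as an assumption of this sketch; the paper's own formulae may differ in detail.

```python
# Delta-method approximation to corr(y/x, x) in terms of
# r = corr(y, x) and the coefficients of variation of y and x.
import math

def predicted_corr(r, cv_x, cv_y):
    num = r * cv_y - cv_x
    den = math.sqrt(cv_x ** 2 + cv_y ** 2 - 2 * r * cv_x * cv_y)
    return num / den

# If numerator and denominator are uncorrelated and equally variable,
# the ratio is predicted to be negatively correlated with its denominator:
print(round(predicted_corr(0.0, 0.2, 0.2), 4))  # → -0.7071
```

This spurious negative correlation is exactly the phenomenon that makes ratios such as feed consumed/gain in body weight correlate with their denominators even when numerator and denominator are independent.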

Journal Article•DOI•
TL;DR: In this paper, the Pearl-Reed function is used to locate a point of inflection free from the effects of two additional parameters specifying the manner in which the function approaches its asymptotes.
Abstract: Many functions previously employed in describing growth phenomena can now be included in a single new function obtained principally by raising to a power the difference between a weighted elementary function and its exponential raised to another power. Furthermore, new parameters for the simplest two-asymptote, third-degree form of the well-known Pearl-Reed function can be specified so as to locate a point of inflection free from the effects of two additional parameters specifying the manner in which the function approaches its asymptotes. Convenient expressions for derivatives and partial derivatives of all the new functions and the locations of points of inflection in terms of the new parameters are given to help interested scientists in preparing short FORTRAN or ALGOL subroutines for use with existing computer programs, and in making close initial estimates of parameters so that convergence of iterative solutions will be more likely.

Journal Article•DOI•
TL;DR: The examination of designs for the estimation of polynomials in which the design is as 'nearly-saturated' as possible is suggested; interest centers almost exclusively on the variances of the predicted responses over the relevant region of factor space.

Abstract: A situation which is frequently encountered in experimental work is that in which a multi-response system is to be explored with the minimum of experimental work. In research and exploratory work, in particular, it is often necessary to ascertain whether there is any region of the factor space for which certain inequalities on the responses hold. This requirement almost inevitably implies polynomial estimation (responses as functions of the factors) together with some suitable graphical display such as that provided by contour plots. Often there is a severe limitation on the amount of experimental work that can be undertaken, although more will certainly be performed as a check if the initial experimentation suggests that a desirable region of factor space appears to exist. The foregoing comments suggest the examination of designs for the estimation of polynomials (specifically, quadratic polynomials in this paper) in which the design is as 'nearly-saturated' as possible. The possibility of providing tests on individual coefficients of the polynomial will not be required of such designs, since their main function is to allow estimation of the coefficients of the polynomial; this polynomial can then be used as an interpolating function over the appropriate region of factor space. In such cases, our interest will center, almost exclusively, on the variances of the predicted responses over the relevant region of factor space. The general quadratic in n factors is: y = b0 + sum_i b_i x_i + sum_i b_ii x_i^2 + sum_(i<j) b_ij x_i x_j.
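
The 'nearly-saturated' idea turns on how many coefficients the general quadratic in n factors has: one constant, n linear terms, n pure quadratic terms, and n(n-1)/2 cross-products, i.e. (n+1)(n+2)/2 in all. A saturated design has exactly that many runs. Simple counting, nothing specific to the paper's designs:

```python
# Number of coefficients in the general quadratic in n factors;
# a saturated design needs exactly this many runs.

def quadratic_terms(n):
    constant = 1
    linear = n
    pure_quadratic = n
    cross_products = n * (n - 1) // 2
    return constant + linear + pure_quadratic + cross_products

for n in (2, 3, 4):
    print(n, quadratic_terms(n))  # equals (n + 1) * (n + 2) // 2
```

The count grows quadratically in n, which is why saturated or nearly-saturated quadratic designs become attractive as soon as experimental runs are expensive.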

Journal Article•DOI•
TL;DR: In this paper, the authors considered two types of plans, one with a fixed sample size and one with random pairing and sequential, one-pair-at-a-time, observations.
Abstract: In a previous paper (Colton [1963]) I considered a cost function approach to the design of a clinical trial for the comparison of two medical treatments. I examined two types of plans, one with a fixed sample size and one with random pairing and sequential, one-pair-at-a-time, observations. Anscombe [1963] considered a similar formulation, dealing with the sequential case. The results of these investigations indicated that the optimal sequential plan led to an overall smaller cost (or, alternatively, an overall greater gain) than the corresponding optimal fixed sample size plan. Numerical results showed that the overall net gain with the optimal sequential plan could be as much as 25 percent more than that for the corresponding optimal fixed sample size plan. Here I consider intermediate plans: in particular, two possible two-stage plans. The appropriate cost functions are determined, the optimal plans are obtained, and the corresponding net gains are compared with each other and with the previously reported fixed sample size and sequential results. The results show that although the two-stage plans proposed are quite different, their optimal net gains are similar. The overall net gain of the two-stage optimal plans is at most 13 percent more than that of the corresponding optimal fixed sample size plan. However, the optimal two-stage plan can account for as much as 50 percent of the difference in overall net gain between the fixed sample size and sequential plans. Hence, within this cost formulation it appears that by going from the one extreme of a fixed sample size plan to a two-stage plan one can achieve about half of the gain that is attainable by the opposite extreme of a pair-by-pair fully sequential plan.


Journal Article•DOI•
TL;DR: Several distributions have been proposed for fitting to a large class of biological data which have been characterized as being contagious as mentioned in this paper, and the problem of finding the best distribution for a particular set of data is complicated by the interrelations of these distributions and their occasionally ambiguous relationship to the models used in forming them.
Abstract: Several distributions have been proposed for fitting to a large class of biological data which have been characterized as being contagious. Two of the earliest distributions used in fitting this type of data are the Negative Binomial and the Neyman Type A (cf. Greenwood and Yule [1920] and Neyman [1939]). More recently the Poisson Binomial and the Poisson with zeroes have been suggested (cf. McGuire et al. [1957] and A. C. Cohen [1960]). The selection of the best distribution for a particular set of data is complicated by the interrelations of these distributions and their occasionally ambiguous relationship to the models used in formulating them. Some of the distributions are limiting forms of the others for extreme parameter values. For example, the Neyman Type A and the Negative Binomial are limits of the Poisson Binomial as n → ∞ and n → 0 respectively, while both the Poisson and Poisson with zeroes are limiting forms of the Neyman Type A. Feller [1943] and Gurland [1958] have shown that different models can be used in deriving the same distributions. Apparently because of the complexity of the Neyman Type A distribution, it is often fitted by the method of moments, and then compared with the negative binomial distribution, which is fitted by the maximum likelihood method (cf. Bliss [1953] and McGuire et al. [1957]). This results in the confounding of the fitting process with the comparison of the distributions. The same is true about the fitting of the Poisson Binomial. We note for reference that certain simplifications have been obtained by Douglas [1955] for the Neyman Type A fitting and by Sprott [1958] for the Poisson Binomial fitting. In spite of the simplification of various methods of fitting some of the contagious distributions, the task of setting up computer routines or of using desk calculators to fit these distributions is relatively tedious.
This is particularly true when the investigator is not directly concerned with the distributions fitted in this paper but wishes to use them for a comparison with some new distribution. In view of this, it was decided to set up programs

Journal Article•DOI•
TL;DR: The maximum likelihood solution for the regression coefficients is presented below, an extension of the method presented by Lea [1945], who adapted previous work of Bliss and Stevens [1937].
Abstract: Situations frequently occur in practice where a regression analysis seems suitable but where the dependent variable, in some instances, is known only to be greater than a particular value which varies among individuals. Such a situation would arise, for example, if we were studying times to death for a group of patients with a specific disease and at the time of the analysis some patients were still alive. If we wish to do a regression analysis with time to death (or a function of time to death) as the dependent variable, the survival time for each of the surviving patients would obviously be known to be greater than its value at the time of the analysis. Given this situation, the maximum likelihood solution for the regression coefficients is presented below. The solution is an extension of the method presented by Lea [1945], who adapted previous work of Bliss and Stevens [1937]. Other related work has been published by Cohen [1950; 1955; 1957], Sampford [1952a,b; 1954], Gupta [1952], Halperin [1952], and Boag [1949], as well as others (see references in these papers). In these papers, when more than one variable was considered, multivariate normality was assumed. The present approach assumes that the X's are fixed. Lea's presentation and notation will be closely followed throughout this paper.
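
The likelihood structure behind such a solution can be sketched as follows, assuming normal errors: a fully observed response contributes the density at its residual, while a right-censored response (known only to exceed its recorded value) contributes the survival probability P(Y > c). The linear model, error distribution, and data below are assumptions of this sketch, not the paper's exact formulation.

```python
# Censored-regression log-likelihood sketch with normal errors.
import math

def normal_log_pdf(z):
    """Log density of the standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def normal_log_sf(z):
    """Log of P(Z > z), via the complementary error function."""
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))

def log_lik(beta0, beta1, sigma, data):
    """data: list of (x, value, censored_flag)."""
    total = 0.0
    for x, v, censored in data:
        z = (v - beta0 - beta1 * x) / sigma
        if censored:
            total += normal_log_sf(z)                     # survival term
        else:
            total += normal_log_pdf(z) - math.log(sigma)  # density term
    return total

# Two observed responses and one still-alive (censored) case:
data = [(1.0, 3.1, False), (2.0, 5.2, False), (3.0, 6.0, True)]
print(round(log_lik(0.0, 2.0, 1.0, data), 4))
```

Maximizing this function over the coefficients and sigma gives the censored-data analogue of least squares; dropping the censored term recovers the ordinary normal regression likelihood.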

Journal Article•DOI•
TL;DR: In this article, a procedure is described for finding approximate confidence limits for the coefficient of variation if a sample of size n from a gamma distribution (formula (1), below) is given.
Abstract: 1. In this paper a procedure is described for finding approximate confidence limits for the coefficient of variation if a sample of size n from a gamma distribution (formula (1), below) is given. In many instances the variables with which one is dealing are nonnegative. In some of these cases the normal distribution cannot be assumed, either because it obviously fits badly, or else because admitting the possibility of negative values is embarrassing. In such cases the obvious next choice is either the lognormal distribution or the gamma distribution (sometimes also called the Pearson type III distribution, or the distribution of aχ²). Both distributions are similar in form and have two parameters. The advantage of the lognormal distribution is that normal theory can be applied once the variables have been transformed. However, it can happen that one does not wish to analyse transformed variables, and in such cases the gamma distribution is preferable, because it is easier to handle analytically. In many fields of application the coefficient of variation is more popular as a descriptive parameter than the variance or standard deviation as such. This is so because often the coefficient of variation, but not the variance, remains invariant if one shifts to a distribution with another mean. This is another reason for using a gamma distribution; in fact, one of its parameters determines the coefficient of variation, whereas the variance depends on both parameters.


Journal Article•DOI•
TL;DR: The paper gives the solution of the method of maximum likelihood for the estimation of the density of a suspension of infective particles when counts are available for a number of independent dilutions of the original suspension, which has a smaller true variance than the conventional one.
Abstract: (1) The paper gives the solution of the method of maximum likelihood for the estimation of the density of a suspension of infective particles when counts are available for a number of independent dilutions of the original suspension. This estimate has a smaller true variance than the conventional one, viz. the arithmetic mean of the estimates from the individual dilutions, and is, additionally, more easily calculated. (2) Overcrowding and/or clumping is likely to occur at high concentrations. Inclusion of counts from such dilutions underestimates the density of the suspension. A test for the presence of this effect has been devised so that counts for dilutions for which there is evidence that the count is too low can be excluded and particle density estimated without bias.
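
Under Poisson counting the pooled maximum likelihood estimate has a simple closed form: if count c_i is observed at relative dilution d_i (expected count = density × d_i), the MLE of the density is sum(c) / sum(d). The Poisson assumption is standard for dilution counts, but the data below are invented; the second function is the conventional estimate the abstract compares against.

```python
# Pooled MLE vs. arithmetic mean of per-dilution estimates.
# Counts and dilution factors are illustrative.

def mle_density(counts, dilutions):
    """Poisson MLE of particle density from all dilutions at once."""
    return sum(counts) / sum(dilutions)

def mean_of_individual_estimates(counts, dilutions):
    """Conventional estimate: average the per-dilution estimates c_i / d_i."""
    ests = [c / d for c, d in zip(counts, dilutions)]
    return sum(ests) / len(ests)

counts = [80, 41, 19, 11]
dilutions = [1.0, 0.5, 0.25, 0.125]

print(mle_density(counts, dilutions))                 # pooled MLE
print(mean_of_individual_estimates(counts, dilutions))
```

The pooled estimate effectively weights each dilution by its expected count, which is why it is both easier to compute and less variable than the unweighted average.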

Journal Article•DOI•
TL;DR: Differences between the mortality patterns of the two sexes were especially striking and seemed to have a common character for all strains and both species.
Abstract: Male and female flour beetles are described in terms of one mortality characteristic: adult age at death. Four strains of Tribolium confusum and four strains of Tribolium castaneum were studied. The actual data are reported in two Appendix tables. Death rates were found to depend upon the species, strain, sex, and age of the individuals concerned. Differences between the mortality patterns of the two sexes were especially striking and seemed to have a common character for all strains and both species.

Journal Article•DOI•
TL;DR: In this paper, the disturbance to the level of significance of additive sums of squares methods of analyzing disproportionate data is studied for several patterns of subclass numbers in the two-way classification.
Abstract: The disturbance to the level of significance of additive sums of squares methods of analyzing disproportionate data is studied for several patterns of subclass numbers in the two-way classification. These methods yield too many significant results for main effects under the null hypothesis although this disturbance is judged to be moderate for the method of unweighted means. Factors to remove the bias in the method of expected subclass numbers are given. A procedure for computing the power of the exact methods, similar to the computations for equal subclass numbers, and approximate procedures for determining the power of the method of unweighted means are described.