
Showing papers in "Biometrics in 1976"



Journal Article•DOI•
TL;DR: In this article, the inverse of a numerator relationship matrix is needed for best linear unbiased prediction of breeding values, and a simple method for computing the elements of this inverse without computing the relationship matrix itself is presented.
Abstract: The inverse of a numerator relationship matrix is needed for best linear unbiased prediction of breeding values. The purpose of this paper is to present a rapid and simple method for computing the elements of this inverse without computing the relationship matrix itself. The method is particularly useful in noninbred populations but is much faster than the conventional method in the presence of inbreeding.
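The direct construction alluded to here is usually summarized as Henderson's rules. A minimal sketch for the noninbred case follows; the function name and 1-based animal coding are illustrative choices, not the paper's notation.

```python
# Sketch of Henderson's rules for building the inverse numerator
# relationship matrix A^-1 directly from a pedigree, assuming no
# inbreeding. Animals are coded 1..n; parent code 0 means "unknown".
import numpy as np

def a_inverse(pedigree):
    """pedigree: list of (animal, sire, dam), animals numbered 1..n in order."""
    n = len(pedigree)
    ainv = np.zeros((n, n))
    for animal, sire, dam in pedigree:
        i = animal - 1
        par = [p - 1 for p in (sire, dam) if p != 0]
        if len(par) == 2:
            alpha = 2.0          # both parents known
        elif len(par) == 1:
            alpha = 4.0 / 3.0    # one parent known
        else:
            alpha = 1.0          # both parents unknown
        ainv[i, i] += alpha
        for p in par:
            ainv[i, p] += -alpha / 2.0
            ainv[p, i] += -alpha / 2.0
            for q in par:
                ainv[p, q] += alpha / 4.0
    return ainv
```

For a sire, a dam, and their offspring, the result inverts the familiar 3x3 relationship matrix with off-diagonals of 0.5 between parent and offspring.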

793 citations



Journal Article•DOI•
TL;DR: Methods to analyse confounded, polytomous and interacting risk factors are proposed and it is shown that there is a simple relationship between two distinct estimators previously suggested for use with deleterious and beneficial (or preventive) factors.
Abstract: Various measures of attributable risk are discussed together with a rationale for their use as an alternative to relative risk in health research. Methods of estimation are presented for use with three important kinds of epidemiological study design with one dichotomous risk factor for a dichotomous disease outcome; the study designs are then compared with respect to efficiency. Procedures to analyse confounded, polytomous and interacting risk factors are proposed and it is shown that there is a simple relationship between two distinct estimators previously suggested for use with deleterious and beneficial (or preventive) factors. Finally the relevance of attributable risk to an assessment of the potential effects of risk factor modification is discussed in the preventive medicine framework.
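One common version of the attributable-risk measures discussed here is Levin's population attributable risk, computable from a 2x2 cohort table. The sketch below is illustrative, not the paper's estimator.

```python
# Hedged sketch: population attributable risk from a 2x2 cohort table,
# using Levin's formula AR = (P(D) - P(D | unexposed)) / P(D).
def attributable_risk(a, b, c, d):
    """a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases."""
    p_disease = (a + c) / (a + b + c + d)
    p_disease_unexposed = c / (c + d)
    return (p_disease - p_disease_unexposed) / p_disease
```

The result is the fraction of overall disease risk that would be removed if the exposed group had the unexposed group's risk.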

430 citations


Journal Article•DOI•

419 citations


Journal Article•DOI•
TL;DR: In this article, a modification of Henderson's procedure for finding the diagonal elements of an L (or A) matrix which does not require that L or A be stored in memory is described.
Abstract: A numerator relationship matrix for a group of animals is, by definition, the matrix with the ijth off-diagonal element equal to the numerator of Wright's [1922] coefficient of relationship between the ith and jth animals and with the ith diagonal element equal to 1 + f_i, where f_i is Wright's [1922] coefficient of inbreeding for the ith animal. The numerator relationship matrix, say A, can be computed recursively (see Emik and Terrill [1949]), and for most situations, inbreeding and relationship coefficients can be calculated with a computer more rapidly in this manner than by path coefficient methods (Wright [1922]). The exception is when the dimension of A is too large for it to be stored in computer memory; then computation of A is exceedingly time consuming. In addition to its usefulness for obtaining inbreeding and relationship coefficients, the inverse of A is required for best linear unbiased prediction of breeding values (Henderson [1973]) but, in general, A is too large to invert by conventional means. Recently, however, Henderson [1976] has described methods for computing a lower triangular matrix L, defined such that LL' = A, with the object of computing A^-1 = (L')^-1 L^-1. He discovered that A^-1 can be found directly from a list of sires and dams and the diagonal elements of L. Since the latter are functions of the diagonal elements of A, A^-1 for a noninbred population can be computed without having to compute either A or L. However, for an inbred population, the diagonal elements of L (or A) must first be found and, when L is too large to store in computer memory, this can be very time consuming if Henderson's computing formulas are used. The purpose of this paper is to describe a modification of Henderson's procedure for finding the diagonal elements of an L (or A) matrix which does not require that L (or A) be stored in memory.
It is therefore possible to rapidly compute inbreeding coefficients or the inverse of a numerator relationship matrix for very large numbers of animals. For example, less than three minutes were required by an IBM 370/135 to compute the diagonal elements and the inverse of a numerator relationship matrix for 1000 animals. Use of this procedure …
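The recursive computation of A cited above (Emik and Terrill [1949]) is often called the tabular method. A minimal sketch, with naming conventions of my own choosing, assuming parents are listed before their offspring:

```python
# Sketch of the recursive (tabular) computation of the numerator
# relationship matrix A: with animals ordered so parents precede
# offspring, a_ij = 0.5*(a_{j,sire(i)} + a_{j,dam(i)}) for j < i, and
# a_ii = 1 + 0.5*a_{sire(i),dam(i)}, so f_i = 0.5*a_{sire(i),dam(i)}.
import numpy as np

def relationship_matrix(pedigree):
    """pedigree: list of (animal, sire, dam), animals 1..n, 0 = unknown parent."""
    n = len(pedigree)
    A = np.zeros((n, n))
    def entry(j, p):            # a_{j, parent}; zero if the parent is unknown
        return A[j, p - 1] if p != 0 else 0.0
    for animal, sire, dam in pedigree:
        i = animal - 1
        for j in range(i):
            A[i, j] = A[j, i] = 0.5 * (entry(j, sire) + entry(j, dam))
        if sire != 0 and dam != 0:
            A[i, i] = 1.0 + 0.5 * A[sire - 1, dam - 1]
        else:
            A[i, i] = 1.0
    return A
```

A sire-daughter mating, for example, yields an offspring diagonal of 1.25, i.e. an inbreeding coefficient of 0.25.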

388 citations


Journal Article•DOI•

362 citations


Journal Article•DOI•
TL;DR: The relationship between response probability and dosage in quantal response bioassay is modelled using a four parameter class and the data of Bliss (1935) illustrates the potential improvement over usual methods in the estimation of critical dose levels.
Abstract: The relationship between response probability and dosage in quantal response bioassay is modelled using a four parameter class. In addition to location and scale quantities the model includes two shape parameters that essentially index skewness and heaviness of tails of the dose-response curve. The class of models includes such special cases as the logistic, normal, extreme minimum value, extreme maximum value, double exponential, exponential and reflected exponential distribution functions. Score tests are derived for logistic and normal hypotheses and certain submodels are discussed for which the model fitting is computationally convenient. The data of Bliss (1935) illustrates the potential improvement over usual methods in the estimation of critical dose levels.

340 citations


Journal Article•DOI•
TL;DR: The error of asserting admixture whenever there is skewness has been avoided, and estimates of admixture parameters provide a basis for more conclusive tests in relatives or other populations.
Abstract: A likelihood ratio test is given for distinguishing skewness from commingled distributions, using a power transform to remove skewness appropriately for each of the alternatives tested. The alternative hypotheses postulate that the transformed data are from one normal or a mixture of two or three normal homoscedastic distributions. Since each mixture has unique asymmetry, skewness is estimated simultaneously with the means, proportions and variance of components. Commingling cannot be rigorously proven in this way, as some other transform may provide a better approximation to normality. However, the error of asserting admixture whenever there is skewness has been avoided, and estimates of admixture parameters provide a basis for more conclusive tests in relatives or other populations. Two examples are given, one in which adjustment for skewness left evidence of commingling.

316 citations


Journal Article•DOI•
TL;DR: This modification has the effect of decreasing the "effective" length of the confidence interval, on which the decision concerning bioequivalence is based, while increasing the confidence coefficient.
Abstract: The conventional method of setting confidence intervals for the difference of the means of two normal populations gives an interval which is not, in general, symmetrical about zero. A modification of the conventional method which leads to symmetry about zero is discussed and is recommended as particularly appropriate for use in bioequivalence trials. This modification has the effect of decreasing the "effective" length of the confidence interval, on which the decision concerning bioequivalence is based, while increasing the confidence coefficient.

309 citations


Journal Article•DOI•
TL;DR: The procedure, which is an extension of the Kruskal-Wallis ranks test, allows for the calculation of interaction effects and linear contrasts in ranked data arising from completely randomized factorial designs.
Abstract: This paper presents a method for the analysis of ranked data arising from completely randomized factorial designs. The procedure, which is an extension of the Kruskal-Wallis ranks test, allows for the calculation of interaction effects and linear contrasts. A Monte Carlo study of the convergence of the test and a worked example are presented.
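For reference, the base statistic that the paper extends is the Kruskal-Wallis H, computed from pooled ranks. A minimal sketch (tie correction via average ranks; the factorial extension itself is not reproduced here):

```python
# Kruskal-Wallis H statistic from pooled ranks of k groups,
# with average ranks assigned to ties.
def kruskal_wallis_h(groups):
    pooled = sorted(x for g in groups for x in g)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    h = 0.0
    for g in groups:
        r_sum = sum(rank[x] for x in g)
        h += r_sum ** 2 / len(g)
    return 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
```

Under the null of identical distributions, H is approximately chi-squared with k - 1 degrees of freedom.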

Journal Article•DOI•
TL;DR: Blocking, elementary and intermediate fractional replication, incomplete factorials, and nested designs are among the topics covered, as discussed by the authors.
Abstract: Introduction. Simple Comparison Experiments. Two Factors, Each at Two Levels. Two Factors, Each at Three Levels. Unreplicated Three-Factor, Two-Level Experiments. Unreplicated Four-Factor, Two-Level Experiments. Three Five-Factor, Two-Level Unreplicated Experiments. Larger Two-Way Layouts. The Size of Industrial Experiments. Blocking Factorial Experiments. Fractional Replication: Elementary. Fractional Replication: Intermediate. Incomplete Factorials. Sequences of Fractional Replicates. Trend-Robust Plans. Nested Designs. Conclusions and Apologies.

Journal Article•DOI•
TL;DR: In this paper, various distance-based methods of testing for randomness in a population of spatially distributed events are described, with special emphasis placed upon preliminary analysis in which the complete enumeration of the events within the study area is not available.
Abstract: Summary Various distance-based methods of testing for randomness in a population of spatially distributed events are described. Special emphasis is placed upon preliminary analysis in which the complete enumeration of the events within the study area is not available. Analytical progress in assessing the power of the techniques against extremes of aggregation and regularity is reviewed and the results obtained from the Monte Carlo simulation of more realistic processes are presented. It is maintained that the method of T-square sampling can help to provide quick and informative results and is especially suited to large populations. Some comments on contiguous quadrat methods are made.

Journal Article•DOI•
TL;DR: A logistic regression model is used to study the association between a dichotomous exposure variable and a disease and a case-control study relating post-menopausal estrogen use and endometrial cancer provides illustration.
Abstract: A logistic regression model is used to study the association between a dichotomous exposure variable and a disease. The method takes into account factors that may confound the association and leads to a quantitative study of the influence of factors which are related to the strength of the association. Special results are given for matched pair data. A case-control study relating post-menopausal estrogen use and endometrial cancer provides illustration.
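For the matched-pair case mentioned above, the conditional maximum likelihood estimate of the odds ratio reduces to a ratio of discordant pair counts. A hedged sketch (function name mine):

```python
# Matched case-control pairs: the conditional ML odds ratio estimate is
# b/c, where b = pairs with only the case exposed and c = pairs with
# only the control exposed (concordant pairs carry no information).
def matched_pair_odds_ratio(pairs):
    """pairs: iterable of (case_exposed, control_exposed) booleans."""
    b = sum(1 for case, ctrl in pairs if case and not ctrl)
    c = sum(1 for case, ctrl in pairs if ctrl and not case)
    return b / c
```

This is the quantity a matched-pair logistic analysis estimates when no other covariates are modelled.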

Journal Article•DOI•
TL;DR: A regression model for the analysis of survival data adjusting for concomitant information is developed and can lead to the log linear exponential model (Glasser [1967]) and the life table regression model of Cox [1972].
Abstract: A regression model for the analysis of survival data adjusting for concomitant information is developed. The model presented can lead to the log linear exponential model (Glasser [1967]) and the life table regression model of Cox [1972]. In addition, the model described can be used to analyze data from the commonly employed actuarial life table. A discussion of the special case where one is comparing two survival curves is presented. The methods developed are illustrated using data from a clinical trial investigating treatments for lung cancer.
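The log linear exponential special case mentioned here has a closed-form ML solution with censoring, which a short sketch can illustrate (names and simplifications are mine):

```python
# Exponential survival with right censoring: the ML hazard estimate is
# (number of observed deaths) / (total follow-up time), and the fitted
# survival curve is S(t) = exp(-lambda * t).
import math

def exponential_hazard(times, events):
    """times: follow-up times; events: 1 if death observed, 0 if censored."""
    return sum(events) / sum(times)

def survival(lam, t):
    return math.exp(-lam * t)
```

Covariates enter the log linear model by letting log(lambda) be a linear function of the concomitant variables; the constant-hazard case above is the intercept-only fit.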

Journal Article•DOI•
TL;DR: It is argued that the prevailing paradigm of diagnostic statistics, which concentrates on incidence of symptoms for given disease, is largely inappropriate and should be replaced by an emphasis on diagnostic distributions, and the generalized logistic model is seen to fit naturally into the new framework.
Abstract: In applications of statistical methods to medical diagnosis, information on patients' diseases and symptoms is collected and the resulting data-base is used to diagnose new patients. The data-structure is complicated by a number of factors, two of which are examined here: selection bias and unstable population. Under reasonable conditions, no correction for selection bias is required when assessing probabilities for diseases based on symptom information, and it is suggested that these "diagnostic distributions" should form the principal object of study. Transformation of these distributions under changing population structure is considered and shown to take on a simple form in many situations. It is argued that the prevailing paradigm of diagnostic statistics, which concentrates on incidence of symptoms for given disease, is largely inappropriate and should be replaced by an emphasis on diagnostic distributions. The generalized logistic model is seen to fit naturally into the new framework.


Journal Article•DOI•
TL;DR: A prologue on science and statistics focuses attention on the role of statistics in science and the formulation, modification and verification of stochastic models designed to represent natural phenomena.
Abstract: A prologue on science and statistics focuses attention on the role of statistics in science. Statisticians must be trained as scientists and to meet the needs of science. Those needs surely involve the formulation, modification and verification of stochastic models designed to represent natural phenomena. The method of paired comparisons provides a simple experimental technique but one with a literature rich in model development.
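The best-known stochastic model for paired comparisons is the Bradley-Terry model. A hedged sketch of the standard iterative fit (this is an illustration of the model class, not the paper's development):

```python
# Fitting the Bradley-Terry paired-comparison model by the standard
# iterative (MM) updates: p_i <- w_i / sum_{j != i} n_ij / (p_i + p_j),
# where w_i is item i's total wins and n_ij the number of i-vs-j comparisons.
def bradley_terry(wins, n_iter=200):
    """wins[i][j] = number of times item i beat item j."""
    n = len(wins)
    p = [1.0] * n
    for _ in range(n_iter):
        new = []
        for i in range(n):
            w_i = sum(wins[i])
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(w_i / denom if denom > 0 else p[i])
        s = sum(new)
        p = [x / s * n for x in new]   # rescale: p is only identified up to a constant
    return p
```

With two items where i beat j twice and lost once, the fitted worth ratio is 2, matching the raw win ratio.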

Journal Article•DOI•
TL;DR: In this article, the authors present examples of multivariate matching methods that will yield the same percent reduction in bias for each matching variable for a variety of underlying distributions, and for each one, matching methods are defined which are equal percent bias reducing.
Abstract: Multivariate matching methods are commonly used in the behavioral and medical sciences in an attempt to control bias when randomization is not feasible. Some examples of multivariate matching methods are discussed in Althauser and Rubin (1970) and Cochran and Rubin (1973), but such methods otherwise seem to have received little attention in the literature. Here, we present examples of multivariate matching methods that will yield the same percent reduction in bias for each matching variable for a variety of underlying distributions. Eleven distributional cases are considered, and for each one, matching methods are defined which are equal percent bias reducing. Methods discussed in Section 8, which are based on the values of the estimated best linear discriminant or which define distance by a sample based inner-product, will probably be the most generally applicable in practice.

Journal Article•DOI•
TL;DR: In this article, the authors consider one class of multivariate matching methods which yield the same percent reduction in expected bias for each of the matching variables, and derive the expression for the maximum attainable percent reduction of bias given fixed distributions and fixed sample sizes.
Abstract: Matched sampling is a method of data collection designed to reduce bias and variability due to specific matching variables. Although often used to control for bias in studies in which randomization is practically impossible, there is virtually no statistical literature devoted to investigating the ability of matched sampling to control bias in the common case of many matching variables. An obvious problem in studying the multivariate matching situation is the variety of sampling plans, underlying distributions, and intuitively reasonable matching methods. This article considers one class of multivariate matching methods which yield the same percent reduction in expected bias for each of the matching variables. The primary result is the derivation of the expression for the maximum attainable percent reduction in bias given fixed distributions and fixed sample sizes. An examination of trends in this maximum leads to a procedure for estimating minimum ratios of sample sizes needed to obtain well-matched samples.
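The "percent reduction in bias" central to both matching papers above has a simple empirical counterpart for a single matching variable. A hedged illustration (not the papers' expected-bias derivation):

```python
# Percent reduction in bias for one matching variable: bias is the
# treated-minus-control difference in means, and matching is scored by
# the share of the initial bias it removes.
def percent_bias_reduction(treated, controls, matched_controls):
    mean = lambda xs: sum(xs) / len(xs)
    bias_before = mean(treated) - mean(controls)
    bias_after = mean(treated) - mean(matched_controls)
    return 100.0 * (1 - bias_after / bias_before)
```

An "equal percent bias reducing" method is one for which this quantity (in expectation) is the same for every matching variable.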

Journal Article•DOI•
TL;DR: The method becomes particularly easy to apply if a purely deterministic system is used, that is, where the treatment assignment for a new patient is entirely determined by the prognostic details of that patient and the previous assignments, and there is no random element.
Abstract: One apparent drawback to the use of the method of Pocock and Simon [1975] for sequential assignment in controlled clinical trials, where it is desired to balance treatment numbers for each level of various prognostic factors, is the large amount of calculation required before a new patient entering the trial can be assigned to a treatment series. Pocock and Simon's suggestion that a small computer be programmed to perform the calculations is not realistic, given the resources of many institutions, and this therefore limits the flexibility of application of the method. It is possible, however, to introduce a number of simplifications into the method to enable it to be applied quickly using hand calculations. The method becomes particularly easy to apply if a purely deterministic system is used, that is, where the treatment assignment for a new patient is entirely determined by the prognostic details of that patient and the previous assignments, and there is no random element. This is feasible when the clinician submitting a patient does not have the full information necessary for predicting the next treatment assignment, such as is the case with single patient entries in a multi-center trial. In other situations some form of randomization should be introduced, so that the clinician cannot know in advance what the treatment assignment for his next patient will be. The method is more complicated in this case, but it is still practicable to apply it by hand.
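The purely deterministic variant described above can be sketched in a few lines. This is an illustrative simplification in the spirit of Pocock and Simon's minimization, with data structures of my own choosing:

```python
# Deterministic minimization sketch: assign each new patient to the arm
# that minimizes total imbalance in treatment counts, summed over the
# patient's prognostic factor levels (range = max - min count per level).
def assign(counts, patient_levels, n_arms=2):
    """counts[factor][level] = list of patients-so-far per arm; returns chosen arm."""
    def imbalance(arm):
        total = 0
        for factor, level in patient_levels.items():
            c = counts[factor][level][:]
            c[arm] += 1                      # hypothetical assignment to this arm
            total += max(c) - min(c)
        return total
    # deterministic rule: lowest resulting imbalance, ties broken by arm index
    return min(range(n_arms), key=imbalance)
```

In practice, a random element would replace the deterministic tie-breaking whenever the submitting clinician could otherwise predict the next assignment.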

Journal Article•DOI•
TL;DR: The authors show that each multiplicative model for a contingency table corresponds to one particular covariance selection model, and point at the resulting similarities in the interpretation of patterns, in test statistics for each pattern and in implied marginal associations among variable pairs.
Abstract: A certain class of patterns of association can be investigated by fitting multiplicative models to a contingency table or by using covariance selection on a covariance matrix. We show that each multiplicative model for a contingency table corresponds to one particular covariance selection model, and we point at the resulting similarities in the interpretation of patterns, in test statistics for each pattern and in implied marginal associations among variable pairs.

Journal Article•DOI•
TL;DR: These studies and those of Klotz, Milton and Zacks point, with some exceptions, to the greater efficiency of ML estimators under a range of experimental settings.
Abstract: Explicit solutions have been derived for the maximum likelihood (ML) and restricted maximum likelihood (REML) equations under normality for four common variance components models with balanced (equal subclass numbers) data. Solutions of the REML equations are identical to analysis of variance (AOV) estimators. Ratios of mean squared errors of REML and ML solutions have also been derived. Unbalanced (unequal subclass numbers) data have been used in a series of numerical trials to compare ML and REML procedures with three other estimation methods using a two-way crossed classification mixed model with no interaction and zero or one observation per cell. Results are similar to those reported by Hocking and Kutner [1975] for the balanced incomplete block design. Collectively, these studies and those of Klotz, Milton and Zacks [1969] point, with some exceptions, to the greater efficiency of ML estimators under a range of experimental settings.
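For balanced data, the AOV (and hence REML) estimators in the simplest case, the one-way random effects model, are explicit. A minimal sketch (function name mine):

```python
# One-way random effects model with a groups of n observations each:
# the AOV (= REML for balanced data) estimators are
#   sigma2_e = MSE  and  sigma2_a = (MSA - MSE) / n.
def variance_components(groups):
    a = len(groups)
    n = len(groups[0])                 # balanced: equal subclass numbers
    grand = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]
    msa = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    mse = sum((x - m) ** 2
              for g, m in zip(groups, means) for x in g) / (a * (n - 1))
    return mse, (msa - mse) / n
```

Note that (MSA - MSE)/n can be negative in samples, one reason ML and REML (which can constrain estimates) are compared against AOV.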

Journal Article•DOI•
TL;DR: A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right angle or sighting distances, and it is shown that there are nonparametric approaches to density estimation using the observed right angle distances.
Abstract: A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right angle or sighting distances. The probability of observing a point given its right angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown there are nonparametric approaches to density estimation using the observed right angle distances. The model is then generalized to include sighting distances (r). Let f(y|r) be the conditional distribution of right angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires that we know the transformation of r given by f(0|r).
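The right-angle-distance estimator this framework supports takes the form D = n*f(0)/(2L). A hedged sketch with a deliberately crude histogram estimate of f(0) (the strip width and function name are my choices):

```python
# Line-transect density estimator from perpendicular (right angle)
# distances: with g(0) = 1, D = n * f(0) / (2 * L), where f(0) is the
# perpendicular-distance density at zero, here crudely estimated from
# the fraction of distances falling within a narrow strip of the line.
def density_estimate(distances, line_length, strip=1.0):
    n = len(distances)
    f0 = sum(1 for y in distances if y <= strip) / (n * strip)
    return n * f0 / (2.0 * line_length)
```

Nonparametric refinements replace the crude histogram f(0) with a smoother estimate, which is where the paper's framework does its work.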

Journal Article•DOI•
TL;DR: A non-iterative model search technique is proposed to find simple patterns of association for several variables; all patterns under consideration are interpretable in terms of zero partial associations of variable pairs.
Abstract: We propose a non-iterative model search technique to find simple patterns of association for several variables. Our selection procedure is restricted to multiplicative models, therefore all patterns under consideration are interpretable in terms of zero partial associations of variable pairs. We illustrate the selection technique on two sets of data, one in a contingency table, one in a covariance matrix.

Journal Article•DOI•
TL;DR: In this paper, the estimation of the group level, group amplitude and group phase of a certain group of individuals based on their time series data was studied for both the small-sample and the large-sample cases.
Abstract: In the model under consideration for the circadian rhythm study there are three unknown physiological parameters involved: the level, the amplitude and the phase. This paper concerns the estimation of the group level, group amplitude and group phase of a certain group of individuals based on their time series data. Special attention is paid to the amplitude and phase parameters, and solutions are obtained for both the small-sample and the large-sample cases.

Journal Article•DOI•
TL;DR: In laboratory proficiency surveys which use reference laboratories for the evaluation of participant laboratories, a measure of the agreement of the participant laboratory with the reference laboratories is needed which considers the extent of agreement (or disagreement) among the reference laboratories themselves.
Abstract: In laboratory proficiency surveys which use reference laboratories for the evaluation of participant laboratories, a measure of the agreement of the participant laboratory with the reference laboratories is needed which considers the extent of agreement (or disagreement) among the reference laboratories themselves. I_n, a measure of nominal scale agreement, is proposed. I_n is interpreted as follows: Let a specimen be selected at random and rated by a reference laboratory which itself has been randomly selected from the n reference laboratories. If the specimen was also rated by the participant laboratory, this second rating would agree with the first at a rate I_n of the rate that would be obtained by a second randomly selected reference laboratory. An approximate (large sample) confidence interval for the ratio I_n is developed. In order to account for the more general case of scaled agreement, a weighted index of agreement is also considered.
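The ratio interpretation above suggests a direct empirical analogue: the participant's agreement rate with the references, divided by the references' agreement rate among themselves. A hedged sketch (this sample version is mine, not the paper's estimator):

```python
# Agreement ratio sketch: participant-vs-reference agreement rate
# divided by the reference-vs-reference agreement rate, over specimens
# rated by all laboratories.
def agreement_ratio(participant, references):
    """participant: one rating per specimen; references: list of such lists."""
    n_spec = len(participant)
    # participant compared with each reference laboratory
    part_agree = sum(participant[s] == ref[s]
                     for ref in references for s in range(n_spec))
    part_pairs = len(references) * n_spec
    # each pair of distinct reference laboratories compared with each other
    ref_agree = sum(references[i][s] == references[j][s]
                    for i in range(len(references))
                    for j in range(i + 1, len(references))
                    for s in range(n_spec))
    ref_pairs = len(references) * (len(references) - 1) // 2 * n_spec
    return (part_agree / part_pairs) / (ref_agree / ref_pairs)
```

A value near 1 means the participant agrees with the references about as often as they agree with one another; values above 1 are possible when the references themselves disagree frequently.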

Journal Article•DOI•
TL;DR: Formulae are derived and examples given for the expected values of the order statistics from a sample of several groups or families of equal size where the group members have intra-class correlation t.
Abstract: Formulae are derived and examples given for the expected values of the order statistics from a sample of several groups or families of equal size where the group members have intra-class correlation t. As t increases from 0 the means of the highest ranking individuals are little reduced initially, but as t approaches 1 the change becomes more rapid, the total reduction depending on the number of groups. The effect on selection differentials of the intra-class correlation of family members is generally small for t less than 0.5, for example if selection is practised on individual performance. If, however, family mean performance is used in an index, the selection differentials may be substantially reduced.

Journal Article•DOI•
TL;DR: The estimation of maternal genetic variances by a multivariate maximum likelihood method is discussed and modifications required when parents are selected on their phenotypic values are given.
Abstract: The estimation of maternal genetic variances by a multivariate maximum likelihood method is discussed. As an illustration the method is applied to data on Tribolium using a model based on partitioning the maternal genetic effect into additive and dominance components. An alternative model due to Falconer (1965) is also fitted. The method is applied to designs suggested for estimating maternal variances by Eisen [1967]. Modifications required when parents are selected on their phenotypic values are given.

Journal Article•DOI•
TL;DR: The intentions in presenting this model are to provide a means of relating survival and another time-dependent event to one another, and to obtain more precise estimates of survival time by exploiting its relationship with this other event.
Abstract: In clinical trials and other investigations of survival time, information is often available on a time-dependent event other than survival. An example of such an auxiliary event in cancer studies is objective progression of disease. While some patients expire without experiencing objective disease progression, others die after progression is observed. This paper proposes a stochastic model which utilizes this type of information in the evaluation of survival time. Our intentions in presenting this model are to provide a means of relating survival and another time-dependent event to one another (each of which may be used in the evaluation of a patient's condition), and to obtain more precise estimates of survival time by exploiting its relationship with this other event. The intrinsic aspects of the model are related to the semi-Markov model proposed by Weiss and Zelen [1965]. An important difference is that the present model incorporates incomplete (censored) observations as well as covariate variables. Analysis of the model via the method of maximum likelihood and its testability are discussed. The methods are applied to the results of a recent lung cancer study. This paper proposes a method of analyzing survival data when auxiliary information is available for a time-dependent event which may be related to survival time. To illustrate, consider a clinical trial for advanced lung cancer in which the survival experiences of the participating patients are to be investigated. Suppose that in addition to survival time, there is also available for each patient (i) an indicator of whether his primary lesion has appreciably grown in size, and (ii) if so, the elapsed time until this "progression" of disease occurred. Thus, the information available for some patients consists of both time until progression and time until death, while for others (i.e., those who expire without experiencing progression) it consists only of survival time.
Since disease progression may be closely related to survival time (e.g., patients who experience progression tend to expire relatively soon thereafter), one would expect a more informative assessment of survival to be made possible by incorporating this auxiliary variable into the survival model. This would also be expected in the interim analysis of an ongoing clinical trial where some observations may be incomplete (censored). The interim information for some patients would be that they are alive and have not yet progressed by a specified time, and for others would be their actual progression time along with a censored observation on their survival time. For both of these censored situations …