Showing papers in "Biometrics in 1986"


Journal Article
TL;DR: A class of generalized estimating equations (GEEs) for the regression parameters is proposed, extending those used in quasi-likelihood methods; their solutions are consistent and asymptotically Gaussian even when the time dependence is misspecified, as the authors expect will often happen.
Abstract: Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One objective of statistical analysis is to describe the marginal expectation of the outcome variable as a function of the covariates while accounting for the correlation among the repeated observations for a given subject. This paper proposes a unifying approach to such analysis for a variety of discrete and continuous outcomes. A class of generalized estimating equations (GEEs) for the regression parameters is proposed. The equations are extensions of those used in quasi-likelihood (Wedderburn, 1974, Biometrika 61, 439-447) methods. The GEEs have solutions which are consistent and asymptotically Gaussian even when the time dependence is misspecified as we often expect. A consistent variance estimate is presented. We illustrate the use of the GEE approach with longitudinal data from a study of the effect of mothers' stress on children's morbidity.

7,080 citations


Journal Article
TL;DR: This work addresses the question of how to analyze unbalanced or incomplete repeated-measures data through maximum likelihood analysis using a general linear model for expected responses and arbitrary structural models for the within-subject covariances.
Abstract: The question of how to analyze unbalanced or incomplete repeated-measures data is a common problem facing analysts. We address this problem through maximum likelihood analysis using a general linear model for expected responses and arbitrary structural models for the within-subject covariances. Models that can be fit include standard univariate and multivariate models with incomplete data, random-effects models, and models with time-series and factor-analytic error structures. We describe Newton-Raphson and Fisher scoring algorithms for computing maximum likelihood estimates, and generalized EM algorithms for computing restricted and unrestricted maximum likelihood estimates. An example fitting several models to a set of growth data is included.

1,332 citations


Journal Article
TL;DR: The method developed is used to analyze data from an animal tumorigenicity study and also a clinical trial, and results given for testing the hypothesis of a zero regression coefficient lead to a generalization of the log-rank test for comparison of several survival curves.
Abstract: This paper develops a method for fitting the proportional hazards regression model when the data contain left-, right-, or interval-censored observations. Results given for testing the hypothesis of a zero regression coefficient lead to a generalization of the log-rank test for comparison of several survival curves. The method is used to analyze data from an animal tumorigenicity study and also a clinical trial.

625 citations




Journal Article
TL;DR: This paper reviews recent literature on the estimation of animal abundance and related parameters such as survival rates, and suggests further avenues for research.
Abstract: During the past 5 years there have been a number of important developments in the estimation of animal abundance and related parameters such as survival rates. Many of the new techniques need to be more widely publicized as they supplant previous methods. The aim of this paper is to review this literature and suggest further avenues for research.

498 citations


Journal Article
TL;DR: The lod score method for testing linkage and estimating the recombination fraction assumes that the genetic parameters are known. When wrong values are used, the power of the linkage test is sensitive to the degree of dominance, slightly sensitive to the penetrance, and insensitive to the gene frequency, whereas the estimate of the recombination fraction may be strongly affected by an error in any parameter.
Abstract: The lod score method is widely used to test linkage and to estimate the recombination fraction between a disease locus and a marker locus. The parameters (gene frequency, penetrance, and degree of dominance) are assumed to be known at each locus. This condition may not be fulfilled at the disease locus. In this paper, we evaluate the errors due to the use of wrong parameters. The power of the linkage test is sensitive to the degree of dominance, and slightly to the penetrance, but not to the gene frequency. In contrast, the estimation of the recombination fraction may be strongly affected by an error on any genetic parameter.

443 citations
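
As a rough illustration of the lod score mentioned in the abstract above, the sketch below assumes the simplest textbook setting: `meioses` phase-known, fully informative meioses of which `recombinants` are recombinant, so that the likelihood is proportional to θ^k (1 − θ)^(n − k). The function names and the example counts are hypothetical and are not taken from the paper, which studies the harder problem of misspecified disease-locus parameters.

```python
import math


def lod_score(theta, recombinants, meioses):
    """Lod score log10[L(theta) / L(0.5)] for phase-known, fully informative meioses.

    Assumption (not from the paper): the likelihood is theta^k * (1 - theta)^(n - k).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination, theta = 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses, grid_size=500):
    """Grid search over theta in (0, 0.5]; returns (theta_hat, maximum lod)."""
    grid = [0.5 * (i + 1) / grid_size for i in range(grid_size)]
    best = max(grid, key=lambda t: lod_score(t, recombinants, meioses))
    return best, lod_score(best, recombinants, meioses)


if __name__ == "__main__":
    theta_hat, z_max = max_lod(recombinants=2, meioses=20)
    print(f"theta_hat = {theta_hat:.3f}, max lod = {z_max:.2f}")
```

In this idealized setting the maximizing θ is simply the observed recombinant fraction; with unknown phase or incomplete penetrance the likelihood is more involved and this closed form no longer applies.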


Journal Article
TL;DR: A new estimator of the variance of the Mantel-Haenszel estimator of the common odds ratio is proposed that is easily computed and consistent in both sparse-data and large-strata limiting models.
Abstract: This paper proposes a new estimator of the variance of the Mantel-Haenszel estimator of common odds ratio that is easily computed and consistent in both sparse data and large-strata limiting models. Monte Carlo experiments compare its performance to that of previously proposed variance estimators.

421 citations
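
For readers unfamiliar with the quantity whose variance is being estimated, here is a minimal sketch of the standard Mantel-Haenszel point estimator of a common odds ratio across stratified 2×2 tables. The function name and the table layout `(a, b, c, d)` are illustrative assumptions; the variance estimator proposed in the paper is not given in the abstract and is not reproduced here.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel estimate of a common odds ratio.

    strata: iterable of 2x2 tables (a, b, c, d), assumed laid out as
        a = exposed cases,    b = unexposed cases,
        c = exposed controls, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0.0:
        raise ValueError("estimator undefined: every b*c product is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Two hypothetical strata, each with a stratum-specific odds ratio of 4.
    tables = [(10, 20, 5, 40), (4, 1, 2, 2)]
    print(mantel_haenszel_odds_ratio(tables))
```

Because each stratum contributes only products divided by its total, the estimator remains usable when individual strata are small, which is the sparse-data setting the abstract refers to.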


Journal Article
TL;DR: The simple Markovian structures of dependence, defined previously for continuous traits, are extended here to familial disease and other binary traits through the use of the logistic function.
Abstract: The simple Markovian structures of dependence, defined previously for continuous traits, are extended here to familial disease and other binary traits through the use of the logistic function. The regressive models so formulated can incorporate explanatory variables and major gene effects for segregation and linkage analyses. Thus, the goals of epidemiology and genetics in the analysis of familial disease can be accomplished in the same computational scheme.

365 citations



Journal Article
TL;DR: The method of Lachin (1981, Controlled Clinical Trials 2, 93-113) is extended to allow for cases where patients enter the trial in a nonuniform manner over time, patients may exit from the trial due to loss to follow-up, and a stratified analysis may be planned according to one or more prognostic covariates.
Abstract: When designing a clinical trial to test the equality of survival distributions for two treatment groups, the usual assumptions are exponential survival, uniform patient entry, full compliance, and censoring only administratively at the end of the trial. Various authors have presented methods for estimation of sample size or power under these assumptions, some of which allow for an R-year accrual period with T total years of study, T greater than R. The method of Lachin (1981, Controlled Clinical Trials 2, 93-113) is extended to allow for cases where patients enter the trial in a nonuniform manner over time, patients may exit from the trial due to loss to follow-up (other than administrative), other patients may continue follow-up although failing to comply with the treatment regimen, and a stratified analysis may be planned according to one or more prognostic covariates.

Journal Article
TL;DR: A Markov model is developed to assess the dependence of risk of death on marker level or disease state, with inferences based directly on marker data collected at infrequent time points during follow-up.
Abstract: In studies of serial cancer markers or disease states and their relation to survival, data on the marker or state are usually obtained at infrequent time points during follow-up. A Markov model is developed to assess the dependence of risk of death on marker level or disease state and inferences within this model are based directly on data collected in this haphazard way. An application relating changing levels of serum alpha-fetoprotein to death in hepatocellular carcinoma is discussed in detail.

Journal Article
TL;DR: A family of statistical models is presented for bivariate, discrete response to a regressor when both components of the response have ordered categories.
Abstract: A family of statistical models is presented for bivariate, discrete response to a regressor when both components of the response have ordered categories. Association between components is expressed in terms of global cross-ratios, cross-product ratios of quadrant probabilities, for each double dichotomy of the response table of probabilities into quadrants (Pearson and Heron, 1913, Biometrika 9, 159-315). These models are extensions to the work of Plackett (1965, Journal of the American Statistical Association 60, 516-522) and Mantel and Brown (1973, Biometrics 29, 649-665). The marginal cumulative probabilities may satisfy linear logistic or other generalized linear models (McCullagh, 1980, Journal of the Royal Statistical Society, Series B 42, 109-142). An analysis of patients' postoperative pain level and medication frequency illustrates these methods.

Journal Article
TL;DR: In this article, a doubly stochastic Poisson distribution for the number of deaths with mean proportional to the population size and an exponential function of a linear combination of the explanatories was proposed.
Abstract: The first concern of this work is the development of approximations to the distributions of crude mortality rates, age-specific mortality rates, age-standardized rates, standardized mortality ratios, and the like for the case of a closed population or period study. It is found that assuming Poisson birthtimes and independent lifetimes implies that the number of deaths and the corresponding midyear population have a bivariate Poisson distribution....It is...suggested that situations in which explanatory variables are present may be modelled via a doubly stochastic Poisson distribution for the number of deaths with mean proportional to the population size and an exponential function of a linear combination of the explanatories. Such a model is fit to mortality data for Canadian females classified by age and year. A dynamic variant of the model is further fit to the time series of total female deaths alone by year. The models with extra-Poisson variation are found to lead to substantially improved fits. (EXCERPT)


Journal Article
TL;DR: In this article, the authors describe and provide examples of four different applications of the use of neighbouring plot values in the analysis of agricultural field experiments: check plots to control environmental variation in large unreplicated variety trials; spatial models, based on first differences, to accommodate fertility effects in trials that have some degree of replication; adjustment for interplot competition using a simultaneous-equations formulation; and the analysis of trials in which plot values are affected by the treatments on neighbouring plots.
Abstract: The paper describes and provides examples of four different applications of the use of neighbouring plot values in the analysis of agricultural field experiments. The topics include the use of check plots to control environmental variation in large unreplicated variety trials; spatial models, based on first differences, to accommodate fertility effects in trials that have some degree of replication; adjustment for interplot competition using a simultaneous-equations formulation; and the analysis of trials in which plot values are affected by the particular treatments on neighbouring plots.



Journal Article
TL;DR: The beta-binomial distribution is utilized to introduce varying degrees of intralitter correlation, and a logistic dose-response model that describes the logit of risk as a straight-line function of ln(dose) is considered.
Abstract: The fitting of dose-response models to teratology data involving littermates in order to generate estimates of teratogenic risk is receiving increasing attention as a potential alternative to the "safety-factor" approach to risk estimation. In this paper, we utilize the beta-binomial distribution to introduce varying degrees of intralitter correlation, and, for purposes of illustration, consider a logistic dose-response model that describes the logit of risk as a straight-line function of ln(dose). The biases and (exact and asymptotic) variances of the maximum likelihood estimators of the intercept and slope are studied by simulation as a function of the intralitter correlation structure.
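
The two ingredients named in the abstract above can be sketched as follows: a logistic dose-response curve in ln(dose), and a beta-binomial litter distribution parametrized by the mean response `p` and the intralitter correlation `rho`. The mapping a = p(1 − ρ)/ρ, b = (1 − p)(1 − ρ)/ρ is the usual textbook reparametrization; the function names and the numerical values are hypothetical, and the paper's simulation study of estimator bias and variance is not reproduced.

```python
import math


def log_beta(x, y):
    """Natural log of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def logistic_risk(dose, intercept, slope):
    """Risk p satisfying logit(p) = intercept + slope * ln(dose)."""
    eta = intercept + slope * math.log(dose)
    return 1.0 / (1.0 + math.exp(-eta))


def beta_binomial_pmf(k, n, p, rho):
    """P(k of n littermates affected), mean risk p, intralitter correlation rho in (0, 1)."""
    a = p * (1.0 - rho) / rho
    b = (1.0 - p) * (1.0 - rho) / rho
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


if __name__ == "__main__":
    p = logistic_risk(dose=2.0, intercept=-2.0, slope=1.0)  # illustrative values only
    pmf = [beta_binomial_pmf(k, 10, p, rho=0.2) for k in range(11)]
    print(f"p = {p:.3f}, P(no affected pups in a litter of 10) = {pmf[0]:.3f}")
```

Under this parametrization the litter count has variance n p (1 − p) [1 + (n − 1) ρ], so ρ directly controls the extra-binomial variation.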

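Journal Article
TL;DR: A combined maximum likelihood and moments method is presented for estimating the parameters of a quantitative trait locus linked to a genetic marker in an F-2 generation. With this method, the genetic parameters of a locus affecting plant height linked to an electrophoretic marker for esterase were accurately estimated from a sample of 1596 F-2 progeny of a cross between two species of Lycopersicon (tomato).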
Abstract: A method is presented to estimate the biometric parameters of a quantitative trait locus linked to a genetic marker when both loci are segregating in the F-2 generation of a cross between two inbred lines. The method, which assumes underlying normal distributions, is a combination of maximum likelihood and moments methods and uses the statistics of the genetic marker genotype samples for the quantitative trait to estimate the recombination frequency between the two loci and the means and variances of the genotypes of the quantitative trait locus. With this method, the genetic parameters of a locus affecting plant height linked to an electrophoretic marker for esterase were accurately estimated from a sample of 1596 F-2 progeny of a cross between two species of Lycopersicon (tomato). Linkage distance between the two loci was 38 map units and the effect of the quantitative trait locus was 1.6 phenotypic standard deviation units. Accurate estimates of the genetic parameters and linkage distance for populations of 2000 individuals simulated with a segregating codominant locus with an effect of 1.63 standard deviations linked to a genetic marker with .2 recombination were also derived by this method. The method is not effective in distinguishing between complete and partial linkage in samples of only 500 individuals or for quantitative loci with effects less than a phenotypic standard deviation. The method is more effective for codominant than for dominant loci.

Journal Article
TL;DR: Ten parameters extracted from six currently used parametrizations of the four-parameter logistic model, and one new proposal, were examined for their statistical behavior in nonlinear least-squares estimation in combination with ELISA and RIA data.
Abstract: Ten parameters extracted from six currently used parametrizations of the four-parameter logistic model, and one new proposal, were examined for their statistical behavior in nonlinear least-squares estimation in combination with ELISA and RIA data. Those which are adequately near-linear on the basis of the Lowry-Morton lambda statistic were identified and can be recommended for use in practice.
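
For orientation, one widely used parametrization of the four-parameter logistic curve, together with its inverse for back-calculating a concentration from a response, might look like the sketch below. Which six parametrizations the paper examined is not stated in the abstract, so the form shown here (parameters `a`, `b`, `c`, `d`) is an assumption chosen for illustration only.

```python
def four_parameter_logistic(x, a, b, c, d):
    """Response at concentration x >= 0.

    a: response at zero concentration
    b: slope (steepness) parameter, assumed positive
    c: concentration giving a response midway between a and d
    d: response at infinite concentration
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_parameter_logistic(y, a, b, c, d):
    """Concentration that yields response y (y strictly between a and d)."""
    ratio = (a - d) / (y - d) - 1.0
    if ratio <= 0.0:
        raise ValueError("response must lie strictly between a and d")
    return c * ratio ** (1.0 / b)


if __name__ == "__main__":
    params = dict(a=2.0, b=1.5, c=10.0, d=0.1)  # hypothetical assay curve
    y = four_parameter_logistic(25.0, **params)
    print(y, inverse_four_parameter_logistic(y, **params))
```

Reparametrizing, for example by estimating ln(c) instead of c, leaves the fitted curve unchanged but can alter how close to linear the least-squares estimators behave, which is the property the abstract evaluates.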

Journal Article
TL;DR: A beta-geometric model for the number of menstrual cycles required to achieve pregnancy is illustrated with data on smokers and nonsmokers, with adjustment for covariates; for a cross-sectional study, the pre-interview attempt time is also shown to follow a beta-geometric distribution.
Abstract: A convenient measure of fecundability is time (number of menstrual cycles) required to achieve pregnancy. Couples attempting pregnancy are heterogeneous in their per-cycle probability of success. If success probabilities vary among couples according to a beta distribution, then cycles to pregnancy will have a beta-geometric distribution. Under this model, the inverse of the cycle-specific conception rate is a linear function of time. Data on cycles to pregnancy can be used to estimate the beta parameters by maximum likelihood in a straightforward manner with a package such as GLIM. The likelihood ratio test can thus be employed in studies of exposures that may impair fecundability. Covariates are incorporated in a natural way. The model is illustrated by applying it to data on cycles to pregnancy in smokers and nonsmokers, with adjustment for covariates. For a cross-sectional study, when length-biased sampling is taken into account, the pre-interview attempt time is shown to follow a beta-geometric distribution, so that the same methods of analysis can be applied even though all of the available data are right-censored. For a cohort followed prospectively, there will be some couples enrolled whose fecundability is effectively 0, and for such applications, the beta could be considered to be contaminated by a distribution degenerate at 0. The mixing parameter (proportion sterile) can be estimated by application of the expectation-maximization (EM) algorithm. This, too, can be carried out using GLIM.
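
The abstract's statement that the inverse of the cycle-specific conception rate is linear in time can be checked numerically. The sketch below assumes per-cycle success probabilities drawn from a Beta(a, b) distribution and computes the beta-geometric probabilities directly from beta functions; the function names and the values a = 2, b = 3 are illustrative and not taken from the paper.

```python
import math


def log_beta(x, y):
    """Natural log of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def beta_geometric_pmf(t, a, b):
    """P(conception occurs in cycle t), t = 1, 2, ..., i.e. E[p * (1 - p)^(t - 1)]."""
    return math.exp(log_beta(a + 1.0, b + t - 1.0) - log_beta(a, b))


def conception_rate(t, a, b):
    """Cycle-specific conception rate P(T = t | T >= t)."""
    still_trying = math.exp(log_beta(a, b + t - 1.0) - log_beta(a, b))
    return beta_geometric_pmf(t, a, b) / still_trying


if __name__ == "__main__":
    a, b = 2.0, 3.0
    for t in range(1, 6):
        # The inverse rate should grow by the constant 1 / a per cycle.
        print(t, round(1.0 / conception_rate(t, a, b), 6))
```

The ratio of beta functions simplifies to a / (a + b + t − 1), so the inverse rate is (a + b − 1)/a + t/a, a straight line in t with slope 1/a.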

Journal Article
TL;DR: Different methods of obtaining confidence intervals for the intraclass correlation coefficient rho in the unbalanced one-way random-effects model are investigated, focusing on applications to family studies.
Abstract: Different methods of obtaining confidence intervals for the intraclass correlation coefficient rho in the unbalanced one-way random-effects model are investigated, focusing on applications to family studies. Methods based on simple modifications of formulas for the case of equal group sizes are found to provide adequate coverage at small to moderate values of rho. A method based on the large-sample standard error of the sample intraclass correlation, as derived by Smith (1956, Annals of Human Genetics 21, 363-373), is shown to provide consistently good coverage at all values of rho. A method proposed by Thomas and Hultquist (1978, Annals of Statistics 6, 582-587) also provides consistently good coverage, but generates mean interval widths substantially greater than those generated by Smith's method at values of rho likely to arise in practice.
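
As background for the interval methods compared above, the point estimate they are built around can be sketched with the standard one-way ANOVA estimator for unequal group sizes. This is an assumption about the usual textbook estimator, not a reproduction of any of the confidence interval methods in the paper; the function name and the example data are hypothetical.

```python
def anova_intraclass_correlation(groups):
    """ANOVA estimator of the intraclass correlation for unbalanced groups.

    groups: list of lists of observations, one inner list per family or group.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total - k)

    # Adjusted average group size used in place of n when sizes differ.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1.0) * ms_within)


if __name__ == "__main__":
    families = [[4.1, 3.9], [5.2, 5.0, 5.4], [2.8, 3.1, 3.0, 2.7]]  # made-up data
    print(anova_intraclass_correlation(families))
```

With equal group sizes `n0` reduces to the common size n, which is the sense in which the simpler methods in the abstract modify equal-group-size formulas.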

Journal Article
TL;DR: A modification is proposed of Shirley's nonparametric version of Williams' test for determining the lowest dose level at which there is evidence of a toxic effect; the modification improves the power of the procedure.
Abstract: Shirley (1977, Biometrics 33, 386-389) proposed a nonparametric version of Williams' test for determining the lowest dose level at which there is evidence of a toxic effect. A modification of Shirley's procedure is now proposed, which improves its power.

Journal Article
TL;DR: A unified approach to mathematical modeling of the host's immune response to viral and bacterial challenge is presented, and an approach to estimating the parameters of a particular patient is suggested.
Abstract: A unified approach to mathematical modeling of the host's immune response to viral and bacterial challenge is presented. Models are formulated as systems of delay-differential equations within the framework of Burnet's principle of clonal selection and the major histocompatibility complex restricted recognition of antigens by T-cells; they consider T- and B-cells of one specificity and fixed-affinity antibodies to the pathogen's antigen, and are derived using birth-death cell population balances. The models are used for quantitative analysis of viral hepatitis B infection, influenza A virus infection, acute pneumonia, and viral-bacterial complications in the lung. An approach to estimating parameters of a particular patient is suggested. Adjoint equations are used for sensitivity analysis of the models.

Journal Article
TL;DR: In this article, maximum likelihood estimation is described for parameters of genetic and mating structure under two plant mating system models, the mixed-mating model and the effective selfing model, treating as the unit of observation the entire array of open-pollinated progeny genotypes at a single locus descended from a maternal parent of unknown genotype.
Abstract: Maximum likelihood estimation is described for parameters of genetic and mating structure under two plant mating system models, the mixed-mating model and the effective selfing model. Genetic structure is described by Wright's gene fixation index, which measures excess homozygosity at a single locus, and mating structure is described by selfing rates. The unit of observation is treated as the entire array of open-pollinated progeny genotypes at a single locus, descended from a maternal parent of unknown genotype. For the mixed-mating model, recursions are given for joint estimates of self-fertilization rate s, fixation index of parents F, and allele frequencies of each sex. For the effective selfing model, recursions are given for joint estimates of effective selfing rate of inbred parents s_i, effective selfing rate of outbred parents s_o, fixation index F, and allele frequency among both sexes. Two alternate forms are given for each recursion. The first is the single-variable Newton-Raphson (Fisher scoring) method, where information divides the score. The second is the expectation-maximization (EM) method, which can be derived from the first method by increasing the information to that expected if the underlying variables of incomplete data could be directly observed. The EM form is more numerically stable, whereas the Newton-Raphson form gives faster convergence and allows negative estimates. Consideration of the progeny array as the unit of sampling allows asymptotic variances to be a function of progeny array size, for fixed total zygote sample size. The optimal number of progeny per array, which minimizes variance of estimates, is from 4 to 8 zygotes for F and the generational change of gene fixation, ΔF. It is near 4 for inbred and/or substructured populations and for more alleles/more extreme allele frequencies, and near 8 for outbred populations. Optimal number is large for s, while in contrast, optimal number can be 4-5 for the effective selfing rate E (an average of s_i and s_o) because of variation of selfing rate among plants in substructured populations. Progeny sizes as low as 2 give information but a minimum size of 3 is recommended.

Journal Article
TL;DR: A model for cell lineage data is presented and analysed that is an extension of the classical first-order autoregression, used in time-series studies, to bifurcating data trees of general size and shape.
Abstract: A model for cell lineage data is presented and analysed. The model is an extension of the classical first-order autoregression, used in time-series studies, to bifurcating data trees of general size and shape. Maximum likelihood theory is developed and compared with an extensive simulation study. Some properties of moment estimators are also presented.
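
To make the idea of a first-order autoregression on a bifurcating tree concrete, the sketch below simulates one under stated assumptions: cell `t` divides into daughters `2t` and `2t + 1`, each daughter's value is a linear function of its mother's value plus Gaussian noise, and the two sisters' noise terms share a correlation `sister_corr`. The indexing scheme, the noise structure, and all parameter names are assumptions made for illustration; the abstract does not spell out the model's details, and neither the likelihood theory nor the moment estimators are reproduced.

```python
import math
import random


def simulate_bifurcating_ar1(generations, intercept, slope, sister_corr,
                             sigma=1.0, root_value=0.0, seed=0):
    """Simulate values on a complete binary lineage tree.

    Returns a dict mapping cell index -> value, with cell 1 as the root and
    cells 2t, 2t + 1 as the daughters of cell t. sister_corr must lie in [0, 1].
    """
    rng = random.Random(seed)
    values = {1: root_value}
    for mother in range(1, 2 ** generations):
        shared = rng.gauss(0.0, 1.0)  # component common to both sisters
        mean = intercept + slope * values[mother]
        for daughter in (2 * mother, 2 * mother + 1):
            own = rng.gauss(0.0, 1.0)
            noise = sigma * (math.sqrt(sister_corr) * shared
                             + math.sqrt(1.0 - sister_corr) * own)
            values[daughter] = mean + noise
    return values


if __name__ == "__main__":
    tree = simulate_bifurcating_ar1(generations=3, intercept=1.0, slope=0.5,
                                    sister_corr=0.3, seed=42)
    for cell in sorted(tree):
        print(cell, round(tree[cell], 3))
```

Splitting the noise into a shared and an individual component is one simple way to give sisters a common correlation while keeping each cell's noise variance equal to `sigma` squared.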

Journal Article
TL;DR: A design is proposed for "case-control within cohort" studies where controls are sampled without replacement from failure-free members of the cohort at each distinct failure time.
Abstract: A design is proposed for "case-control within cohort" studies. In this design, controls are sampled without replacement from failure-free members of the cohort at each distinct failure time. Upon selection, a subject ceases to be eligible for control selection at later failure times. Also, if a subject failing at time t had been selected as a control at t' less than t, then the matched controls at t are selected to have also been at risk at t'. In these circumstances correlation exists between score statistic contributions at t and t'. An estimator is developed for this correlation. A small simulation study compares the design just described to other possible synthetic case-control designs.


Journal Article
TL;DR: In this paper, a relatively new partial least squares (PLS) regression method was used to relate yearly differences in straw length of 15 barley genotypes to climatic variations over 9 years.
Abstract: The relatively new partial least squares (PLS) regression method is used to relate yearly differences in straw length of 15 barley genotypes to climatic variations over 9 years. After main effects of genotypes and years were removed by conventional two-way ANOVA, the residual table of genotype × year interactions was related to a table of rainfall, temperature, and global radiation at different stages throughout the growth season, over the 9 years. In contrast to ANOVA interaction analysis based on principal component analysis, the related PLS regression method gave a direct, parsimonious linear modelling of the systematic relationships between the two tables in one single estimation procedure. Two main factors were found. The first one related warm, dry weather during the ear emergence to higher straw lengths in 6-row barleys than in 2-row barleys. The second factor related climatic differences between the first and last part of the growing season to straw length variations in certain related 2-row barley genotypes.