
Showing papers in "Biometrics in 1987"




Journal Article•DOI•
TL;DR: A point estimator and its associated confidence interval for the size of a closed population are proposed under models that incorporate heterogeneity of capture probability, and numerical results show that the proposed confidence interval performs satisfactorily in maintaining the nominal levels.
Abstract: A point estimator and its associated confidence interval for the size of a closed population are proposed under models that incorporate heterogeneity of capture probability. Real data sets provided in Edwards and Eberhardt (1967, Journal of Wildlife Management 31, 87-96) and Carothers (1973, Journal of Animal Ecology 42, 125-146) are used to illustrate this method and to compare it with other estimates. The performance of the proposed procedure is also investigated by means of Monte Carlo experiments. The method is especially useful when most of the captured individuals are caught once or twice in the sample, for which case the jackknife estimator usually does not work well. Numerical results also show that the proposed confidence interval performs satisfactorily in maintaining the nominal levels.

2,173 citations


Journal Article•DOI•
TL;DR: The first and still the only book of its kind, this volume offers a concise introduction to human genetic linkage analysis and gene mapping and introduces the reader to many of the intricate aspects of complex traits.
Abstract: The first and still the only book of its kind, this volume offers a concise introduction to human genetic linkage analysis and gene mapping. Jurg Ott provides mathematical and statistical foundations of linkage analysis for researchers and practitioners, as well as practical comments on available computer programs and websites. Each chapter ends with a set of problems, whose solutions are found at the end of the book. New to this edition is a chapter on complex traits, such as diabetes, some cancers, and psychiatric conditions. Also new is an overview of nonparametric approaches to linkage and association analysis. A chapter on two-locus inheritance introduces the reader to many of the intricate aspects of complex traits. Although the book's primary audience is in the field of genetics, physicians and others without sophisticated training in genetics can understand and apply the principles and techniques discussed.

1,509 citations


Journal Article•DOI•
TL;DR: A new approach using empirical Bayes estimation is proposed to map incidence and mortality from diseases such as cancer and the resulting estimators represent a weighted compromise between the SMR, the overall mean relative rate, and a local mean of the relative rate in nearby areas.
Abstract: There have been many attempts in recent years to map incidence and mortality from diseases such as cancer. Such maps usually display either relative rates in each district, as measured by a standardized mortality ratio (SMR) or some similar index, or the statistical significance level for a test of the difference between the rates in that district and elsewhere. Neither of these approaches is fully satisfactory and we propose a new approach using empirical Bayes estimation. The resulting estimators represent a weighted compromise between the SMR, the overall mean relative rate, and a local mean of the relative rate in nearby areas. The compromise solution depends on the reliability of each individual SMR and on estimates of the overall amount of dispersion of relative rates over different districts.
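The weighted-compromise idea lends itself to a short sketch. Below is a minimal gamma-Poisson empirical Bayes smoother in Python, assuming observed counts O_i ~ Poisson(E_i·theta_i) with a gamma prior on the relative risks; the crude moment estimator of the prior is a simplification of the authors' fitting procedure, and the function name is ours:

```python
import numpy as np

def eb_smooth_smr(O, E):
    """Gamma-Poisson empirical Bayes smoothing of SMRs (sketch).

    Assumes relative risks theta_i ~ Gamma(nu, alpha); the posterior
    mean given O_i ~ Poisson(E_i * theta_i) is (O_i + nu)/(E_i + alpha),
    a compromise between the raw SMR O_i/E_i and the overall mean rate.
    """
    O, E = np.asarray(O, float), np.asarray(E, float)
    smr = O / E
    mu = O.sum() / E.sum()                  # overall mean relative rate
    # crude moment estimate of the between-area variance of true rates
    tau2 = max(np.var(smr) - mu * np.mean(1.0 / E), 1e-6)
    alpha, nu = mu / tau2, mu**2 / tau2     # matching Gamma(nu, alpha) prior
    return (O + nu) / (E + alpha)           # shrunken estimates
```

Areas with small expected counts E_i shrink strongly toward the overall mean, while well-observed areas keep estimates close to their raw SMRs.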

1,204 citations




Report•DOI•
TL;DR: In this paper, a linear random-effects model with a probit model for the right censoring process is used to estimate the expected rates of change and the parameters of the right censoring process.
Abstract: In estimating and comparing the rates of change of a continuous variable between two groups, the unweighted averages of individual simple least squares estimates from each group are often used. Under a linear random-effects model, when all individuals have complete observations at identical time points, these statistics are maximum likelihood estimates for the expected rates of change. However, with censored or missing data, these estimates are no longer efficient when compared to generalized least squares estimates. When, in addition, the right censoring process is dependent upon the individual rates of change (i.e., informative right censoring), the generalized least squares estimates will be biased. A likelihood ratio test for informativeness of the censoring process and maximum likelihood estimates for the expected rates of change and the parameters of the right censoring process are developed under a linear random-effects model with a probit model for the right censoring process. In realistic situations, we illustrate that the bias in estimating group rates of change and the reduction of power in comparing group differences can be substantial when strong dependency of the right censoring process on individual rates of change is ignored.

676 citations


Journal Article•DOI•
TL;DR: This paper explores one such test, applicable to any set of asymptotically normal test statistics, presents two examples, and discusses the relative merits of the proposed strategies.
Abstract: Treatment comparisons in randomized clinical trials usually involve several endpoints such that conventional significance testing can seriously inflate the overall Type I error rate. One option is to select a single primary endpoint for formal statistical inference, but this is not always feasible. Another approach is to apply Bonferroni correction (i.e., multiply each P-value by the total number of endpoints). Its conservatism for correlated endpoints is examined for multivariate normal data. A third approach is to derive an appropriate global test statistic and this paper explores one such test applicable to any set of asymptotically normal test statistics. Quantitative, binary, and survival endpoints are all considered within this general framework. Two examples are presented and the relative merits of the proposed strategies are discussed.
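The two strategies compared above can be sketched in a few lines. The Bonferroni adjustment is standard; the global statistic shown is a generalized-least-squares combination of the endpoint Z-statistics (an O'Brien-type form consistent in spirit with, but not necessarily identical to, the test the paper develops):

```python
import numpy as np

def bonferroni(pvals):
    """Bonferroni correction: multiply each P-value by the number of
    endpoints, capping at 1 (conservative for correlated endpoints)."""
    p = np.asarray(pvals, float)
    return np.minimum(p * p.size, 1.0)

def global_z(z, corr):
    """Global test combining K asymptotically normal endpoint statistics.

    Sketch of a GLS combination: T = 1'R^{-1}z / sqrt(1'R^{-1}1),
    where R is the correlation matrix of the Z-statistics; T is
    asymptotically standard normal under the global null.
    """
    z = np.asarray(z, float)
    Rinv = np.linalg.inv(np.asarray(corr, float))
    ones = np.ones_like(z)
    return ones @ Rinv @ z / np.sqrt(ones @ Rinv @ ones)
```

With independent endpoints the global statistic reduces to the mean of the Z-values scaled by sqrt(K).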

495 citations


Journal Article•DOI•
TL;DR: A comparison of ordination methods by multiple Procrustes analysis and classification and a proof of the triangle-area theorem are presented.

467 citations



Journal Article•DOI•
TL;DR: The study of occurrence problems in medicine focuses on the design of the occurrence relation and elements of data analysis and inference, and the analysis of case-referent data.

386 citations



Journal Article•DOI•
TL;DR: In this paper, it was shown that both the Holm and Shaffer procedures can be improved under the assumption of positive orthant dependence for the test statistics, and that the new procedure can be used in place of its predecessors whenever the required observed significance levels are available.
Abstract: … complexity. These procedures can be used whenever the observed levels of significance are available for all individual tests. It is shown that both the Holm and Shaffer procedures can be improved under the assumption of positive orthant dependence for the test statistics. It is noted that this assumption is met in many important practical situations, and it is recommended that in these cases the new procedure be used in place of its predecessors whenever the required observed significance levels are available. The methodology is illustrated with a numerical example.
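For reference, the Holm step-down procedure that serves as the baseline here can be sketched directly from its definition (the improved procedure under positive orthant dependence is not reproduced):

```python
def holm_adjust(pvals):
    """Holm step-down adjusted P-values (the baseline the paper improves on).

    Sort the P-values, multiply the k-th smallest by (m - k + 1), and
    enforce monotonicity so adjusted values never decrease down the list.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adj[i] = min(running_max, 1.0)
    return adj
```

Rejecting hypotheses with adjusted P-value below alpha controls the familywise error rate without any dependence assumption; the paper's point is that orthant dependence permits a uniformly less conservative rule.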


Journal Article•DOI•
TL;DR: A class of group sequential tests, indexed by a single parameter, that yields approximately optimal results, together with tables of key values to help in the design of group sequential tests that meet selected specifications.
Abstract: We present a class of group sequential tests, indexed by a single parameter, that yields approximately optimal results. We also provide tables of key values to help in the design of group sequential tests that meet selected specifications.
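A one-parameter family of this kind can be calibrated by simulation. The sketch below uses the boundary shape b_k = C·(k/K)^(Δ−1/2) (the Wang-Tsiatis form, which contains the Pocock and O'Brien-Fleming tests as special cases) and finds C by bisection on the simulated overall Type I error; whether this indexing matches the paper's exactly is an assumption:

```python
import numpy as np

def calibrate_boundary(K, alpha, delta, n_sim=200_000, seed=0):
    """Monte Carlo calibration of C in the boundary family
    b_k = C * (k/K)**(delta - 0.5)  (delta = 0.5 ~ Pocock,
    delta = 0 ~ O'Brien-Fleming).  Sketch only: simulates the null
    joint distribution of the K interim Z-statistics and bisects on C
    until the overall two-sided crossing probability equals alpha."""
    rng = np.random.default_rng(seed)
    inc = rng.standard_normal((n_sim, K))        # independent increments
    info = np.arange(1, K + 1)
    z = np.cumsum(inc, axis=1) / np.sqrt(info)   # Z_k at information fraction k/K
    shape = (info / K) ** (delta - 0.5)
    lo, hi = 1.0, 5.0
    for _ in range(40):                          # bisection on C
        C = 0.5 * (lo + hi)
        reject = np.any(np.abs(z) >= C * shape, axis=1).mean()
        lo, hi = (C, hi) if reject > alpha else (lo, C)
    return 0.5 * (lo + hi)
```

With K = 1 this recovers the fixed-sample critical value (about 1.96 for alpha = 0.05); with K = 5 and delta = 0.5 it approaches the tabulated Pocock constant near 2.41.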

Journal Article•DOI•
TL;DR: Algorithms for generating the exact distribution of a finite sample drawn from a population in Hardy-Weinberg equilibrium are given for multiple alleles.
Abstract: Algorithms for generating the exact distribution of a finite sample drawn from a population in Hardy-Weinberg equilibrium are given for multiple alleles. The finite sampling distribution is derived analogously to Fisher's 2 X 2 exact distribution and is equivalent to Levene's conditional finite sampling distribution for Hardy-Weinberg populations. The algorithms presented are fast computationally and allow for quick alternatives to standard methods requiring corrections and approximations. Computation time is on the order of a few seconds for three-allele examples and up to 2 minutes for four-allele examples on an IBM 3081 machine.
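The biallelic special case of this conditional distribution is compact enough to sketch; the paper's algorithms generalize it to multiple alleles. The function name and the probability-based two-sided P-value convention are our choices:

```python
from math import factorial

def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test for two alleles, using Levene's
    conditional distribution of the heterozygote count given the
    allele counts.  Returns the probability-based two-sided P-value
    (sum of outcome probabilities no larger than the observed one)."""
    n = n_aa + n_ab + n_bb
    na = 2 * n_aa + n_ab                  # count of A alleles

    def prob(h):                          # P(heterozygotes = h | na, n)
        if (na - h) % 2 or h > na or h > 2 * n - na:
            return 0.0
        aa = (na - h) // 2
        bb = n - aa - h
        if aa < 0 or bb < 0:
            return 0.0
        return (factorial(n) * 2**h * factorial(na) * factorial(2 * n - na)
                / (factorial(aa) * factorial(h) * factorial(bb) * factorial(2 * n)))

    p_obs = prob(n_ab)
    return sum(p for h in range(na + 1) if (p := prob(h)) <= p_obs + 1e-12)
```

Enumeration over heterozygote counts keeps the biallelic case trivially fast; the computational effort the abstract quotes comes from the multi-allele generalization.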

Journal Article•DOI•
TL;DR: The efficiency of using a simulated critical point for exact intervals, which has been suggested before but never put to serious test, is investigated and is found to be completely reliable and essentially exact.
Abstract: A frequently encountered problem in practice is that of simultaneous interval estimation of p linear combinations of a parameter beta in the setting of (or equivalent to) a univariate linear model. This problem has been solved adequately only in a few settings when the covariance matrix of the estimator is diagonal; in other cases, conservative solutions can be obtained by the methods of Scheffe, Bonferroni, or Sidak (1967, Journal of the American Statistical Association 62, 626-633). Here we investigate the efficiency of using a simulated critical point for exact intervals, which has been suggested before but never put to serious test. We find the simulation-based method to be completely reliable and essentially exact. Sample size savings are substantial (in our settings): 3-19% over the Sidak method, 4-37% over the Bonferroni method, and 27-33% over the Scheffe method. We illustrate the efficiency and flexibility of the simulation-based method with case studies in physiology and marine ecology.
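The simulated-critical-point idea can be sketched in a few lines of numpy for the known-covariance (Z) case; the paper works with the t analogue, so the constant below only approximates theirs:

```python
import numpy as np

def simulated_critical_point(C, cov, alpha=0.05, n_sim=100_000, seed=1):
    """Simulated critical point q for simultaneous intervals
    c_j' beta_hat +- q * se_j over the rows c_j of C (sketch,
    known-covariance Z version).  q is the empirical 1-alpha quantile
    of max_j |c_j' e| / se_j with e ~ N(0, cov), so the intervals have
    exact joint coverage up to Monte Carlo error."""
    rng = np.random.default_rng(seed)
    C, cov = np.asarray(C, float), np.asarray(cov, float)
    se = np.sqrt(np.einsum('ij,jk,ik->i', C, cov, C))   # se of each combination
    L = np.linalg.cholesky(cov)
    e = rng.standard_normal((n_sim, cov.shape[0])) @ L.T
    maxstat = np.max(np.abs(e @ C.T) / se, axis=1)
    return np.quantile(maxstat, 1 - alpha)
```

Unlike Bonferroni, the simulated point automatically exploits correlation: two perfectly correlated contrasts cost nothing extra over one.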

Journal Article•DOI•
TL;DR: The statistical properties and tests of significance for two nonparametric measures of phenotypic stability (mean of the absolute rank differences of a genotype over the environments and variance among the ranksover the environments) are investigated.
Abstract: Parametric methods for estimating genotype-environment interactions and phenotypic stability (stability of genotypes over environments) are widely used in plant and animal breeding and production. Several nonparametric methods proposed by Huhn (1979, EDP in Medicine and Biology 10, 112-117) are based on the ranks of genotypes in each environment and use the idea of homeostasis as a measure of stability. In this study the statistical properties and tests of significance for two of these nonparametric measures of phenotypic stability (mean of the absolute rank differences of a genotype over the environments and variance among the ranks over the environments) are investigated. The purpose of this study is to develop approximate but easily applied statistical tests based on the normal distribution. Finally, application of the theoretical results is demonstrated using data on 20 genotypes (varieties) of winter wheat in 10 environments (locations) from the official German registration trials.
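The two rank-based measures are straightforward to compute. A sketch for a genotypes-by-environments yield table, assuming no ties within an environment (ties would need midranks):

```python
import numpy as np

def stability_measures(yields):
    """Huhn's nonparametric phenotypic-stability measures for a
    genotypes-by-environments table:
      S1 = mean absolute rank difference of a genotype over all
           pairs of environments,
      S2 = variance of its ranks over the environments."""
    y = np.asarray(yields, float)
    # rank genotypes within each environment (1 = lowest yield)
    ranks = y.argsort(axis=0).argsort(axis=0) + 1.0
    n_env = y.shape[1]
    diffs = np.abs(ranks[:, :, None] - ranks[:, None, :])
    s1 = diffs.sum(axis=(1, 2)) / (n_env * (n_env - 1))
    s2 = ranks.var(axis=1, ddof=1)
    return s1, s2
```

A perfectly stable genotype (same rank everywhere) scores zero on both measures; the paper's contribution is normal-approximation tests for whether observed values exceed what chance would produce.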

Journal Article•DOI•
George E. Bonney1•
TL;DR: The discussion includes serially dependent outcomes, equally predictive outcomes, more specialized patterns of dependence, multidimensional tables, and three examples.
Abstract: The likelihood of a set of binary dependent outcomes, with or without explanatory variables, is expressed as a product of conditional probabilities each of which is assumed to be logistic. The models are called regressive logistic models. They provide a simple but relatively unknown parametrization of the multivariate distribution. They have the theoretical and practical advantage that they can be analyzed and fitted as in logistic regression for independent outcomes, and with the same computer programs. The paper is largely expository and is intended to motivate the development and usage of the regressive logistic models. The discussion includes serially dependent outcomes, equally predictive outcomes, more specialized patterns of dependence, multidimensional tables, and three examples.
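The practical point, that the model can be fitted with ordinary logistic-regression software, comes down to building the right design matrix. A sketch below uses a plus/minus-one coding of prior outcomes with zero for not-yet-observed positions; this coding convention is our reading of the approach, not a quotation:

```python
import numpy as np

def regressive_design(Y, X):
    """Expand multivariate binary outcomes (n x T matrix Y) into one
    row per conditional term of the regressive likelihood: the t-th
    outcome gets the covariates X plus outcomes 1..t-1 as extra
    predictors, so any logistic-regression program can fit the model."""
    rows, resp = [], []
    n, T = Y.shape
    for i in range(n):
        for t in range(T):
            prev = np.zeros(T - 1)
            prev[:t] = 2 * Y[i, :t] - 1     # +/-1 coding; 0 = not applicable
            rows.append(np.concatenate([X[i], prev]))
            resp.append(Y[i, t])
    return np.array(rows), np.array(resp)
```

Fitting a standard logistic regression of `resp` on `rows` then maximizes the product of conditional probabilities that defines the regressive model.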

Journal Article•DOI•
TL;DR: In this paper, a spatial analysis of field experiments is proposed which takes account of association between neighboring plots, and the residual maximum likelihood (REML) method of Patterson and Thompson (1971, Biometrika 58, 545-554) is used to estimate parameters of a general neighbour model, which can be expressed as an autoregressive moving average (ARMA) model.
Abstract: A spatial analysis of field experiments is proposed which takes account of association between neighbouring plots. The residual maximum likelihood (REML) method of Patterson and Thompson (1971, Biometrika 58, 545-554) is used to estimate parameters of a general neighbour model, which can be expressed as an autoregressive moving average (ARMA) model. Three data sets are analysed to (i) highlight the need for a model selection procedure, (ii) illustrate the differing results between incomplete block and neighbour analysis and the effect of including treated border plots in the design, and (iii) illustrate the environmental variation within an experiment using prediction of trend.

Journal Article•DOI•
TL;DR: In this article, the authors discuss the consequences of group sequential methods on the design and analysis of randomized clinical trials, and provide guidelines for the conduct of interim analysis including tables of nominal significance levels and required sample sizes for several group sequential plans.
Abstract: Recent developments in group sequential methods have had a great impact on the design and analysis of randomized clinical trials. The consequences for both planned and unplanned interim analyses are discussed, using several real trials as illustrations. Guidelines for the conduct of interim analysis are given, including tables of nominal significance levels and required sample sizes for several group sequential plans. Areas in need of further theoretical advance include multiple endpoints, estimation of treatment differences, stratification, and design of multiple-armed trials.



Journal Article•DOI•
TL;DR: In this article, a form of cross-validation, in the context of principal component analysis, which has a number of useful aspects as regards multivariate data inspection and description, is described.
Abstract: This paper describes a form of cross-validation, in the context of principal component analysis, which has a number of useful aspects as regards multivariate data inspection and description. Topics covered include choice of dimensionality, identification of influential observations, and selection of important variables. The methods are motivated by and illustrated on a well-known data set.

1. Data Set and Objectives

Jeffers (1967) described two detailed multivariate case studies, one of which concerned 19 variables measured on each of 40 winged aphids (alate adelges) that had been caught in a light trap. The 19 variables are listed in Table 1. Principal component analysis (PCA) was used to examine the structure in the data, and if possible to answer the following questions: (i) How many dimensions of the individuals are being measured? (ii) How many distinct taxa are present in the habitat? (iii) Which variables among the 19 are redundant for distinguishing between taxa, and which must be retained in future work?

Of the 19 variables, 14 are length or width measurements, four are counts, and one (anal fold) is a presence/absence variable scored 0 or 1. In view of this disparity in variable type, Jeffers elected to standardise the data and thus effect the PCA by finding the latent roots and vectors of the correlation (rather than covariance) matrix of the data. The elements of each latent vector provide the coefficients of one of 19 linear combinations of the standardised original variables that successively maximise sample variance subject to being orthogonal with each other, and the corresponding latent root is the sample variance of that linear combination. The 19 observations for each aphid were subjected to each of these 19 linear transformations to form the 19 principal component scores for that aphid.

The above questions were then answered as follows: (i) The latent roots of the correlation matrix were as given in Table 1. The four largest comprise 73.0%, 12.5%, 3.9%, and 2.6%, respectively, of the total variance (19.0) of the standardised variables; the dimensionality of the data was therefore taken to be 2. (ii) When the scores of the first two principal components for the 40 aphids were plotted against orthogonal axes, the resulting 40 points divided into four groups as shown in Figure 1. Hence, four distinct species were identified for the aphids. (iii) From consideration of the size of coefficients in the first three principal components, it was concluded that only the four variables length of tibia, number of ovipositor
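Jeffers's correlation-matrix PCA is easy to reproduce in outline. A sketch in numpy; the function returns latent roots, the percentage of total variance each component explains, and the component scores (names are ours):

```python
import numpy as np

def pca_correlation(X):
    """PCA via the correlation matrix, as in the aphid example:
    standardise the columns, take latent roots and vectors of the
    correlation matrix, and form principal component scores."""
    Z = (X - X.mean(0)) / X.std(0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    roots, vectors = np.linalg.eigh(R)
    order = np.argsort(roots)[::-1]          # largest latent root first
    roots, vectors = roots[order], vectors[:, order]
    scores = Z @ vectors                     # principal component scores
    return roots, 100.0 * roots / roots.sum(), scores
```

The latent roots sum to the number of variables (each standardised variable contributes unit variance), which is why the abstract quotes a total variance of 19.0.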

Journal Article•DOI•
TL;DR: The method has advantages similar to those of rank transformations--namely, it is easy to use and is resistant to extreme observations, and is a useful generalization of least squares.
Abstract: A method is presented for choosing an additive constant c when transforming data x to y = log(x + c). The method preserves Type I error probability and power in ANOVA under the assumption that the x + c for some c are log-normally distributed. The method has advantages similar to those of rank transformations--namely, it is easy to use and is resistant to extreme observations. Since the special case c → ∞ corresponds in ANOVA to y = x, the method is a useful generalization of least squares.
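Choosing c by maximum likelihood under the shifted-lognormal assumption can be sketched with a profiled grid search. The criterion below is our reading of the ML approach, not necessarily the author's exact procedure; note the well-known caveat that the three-parameter lognormal likelihood can spike as c approaches -min(x), so the grid should stay away from that boundary:

```python
import numpy as np

def choose_shift(x, grid):
    """Profile-likelihood sketch for the additive constant c in
    y = log(x + c), assuming x + c is lognormal.  For each candidate c
    the normal mean and variance are profiled out, leaving (up to a
    constant)  loglik(c) = -n/2 * log(var(log(x+c))) - sum(log(x+c))."""
    x = np.asarray(x, float)
    n = x.size
    best_c, best_ll = None, -np.inf
    for c in grid:
        if x.min() + c <= 0:            # log undefined: skip candidate
            continue
        y = np.log(x + c)
        ll = -0.5 * n * np.log(y.var()) - y.sum()
        if ll > best_ll:
            best_c, best_ll = c, ll
    return best_c
```

On data generated as exp(normal) minus a known shift, the profile likelihood peaks near the true shift when the grid excludes the boundary region.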

Journal Article•DOI•
TL;DR: The results of a large-scale Monte Carlo simulation study indicate that for data with a constant coefficient of variation, the present method is superior to other published methods, including the conventional transformations to linearity and the nonparametric technique proposed by Eisenthal and Cornish-Bowden.
Abstract: An application of the method of maximum likelihood (ML) is described for analysing the results of enzyme kinetic experiments in which the Michaelis-Menten equation is obeyed. Accurate approximate solutions to the ML equations for the parameter estimates are presented for the case in which the experimental errors are of constant relative magnitude. Formulae are derived that approximate the standard errors of these estimates. The estimators are shown to be asymptotically unbiased and the standard errors observed in simulated data rapidly approach the theoretical lower bound as the sample size increases. The results of a large-scale Monte Carlo simulation study indicate that for data with a constant coefficient of variation, the present method is superior to other published methods, including the conventional transformations to linearity and the nonparametric technique proposed by Eisenthal and Cornish-Bowden (1974, Biochemical Journal 139, 715-720). Finally, the present results are extended to the analysis of simple receptor binding experiments using the general approach described by Munson and Rodbard (1980, Analytical Biochemistry 107, 220-239).
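The constant-relative-error criterion can be approximated by minimising squared relative residuals, with Vmax profiled out in closed form for each trial Km. This least-squares stand-in is a sketch, not the paper's exact ML solution:

```python
import numpy as np

def fit_michaelis_menten(S, v, km_grid):
    """Constant-CV fit of v = Vmax * S / (Km + S): minimises the squared
    *relative* residuals sum((v/f - 1)^2), appropriate when errors are
    of constant relative magnitude.  For each trial Km the optimal Vmax
    has a closed form, so only Km needs a one-dimensional search."""
    S, v = np.asarray(S, float), np.asarray(v, float)
    best = (np.inf, None, None)
    for km in km_grid:
        g = S / (km + S)
        r = v / g                               # v_i / g_i = Vmax + error
        vmax = (r ** 2).sum() / r.sum()         # profile minimiser for Vmax
        sse = ((v / (vmax * g) - 1.0) ** 2).sum()
        if sse < best[0]:
            best = (sse, km, vmax)
    return best[1], best[2]                     # (Km, Vmax)
```

On noise-free data the fit recovers the generating parameters exactly whenever the true Km lies on the grid.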

Journal Article•DOI•
TL;DR: In this article, the mean squared error of prediction (MSEP) as a criterion for evaluating models used for studying ecological and agronomic systems is considered, and the different sources of error that have been discussed in the literature for such models can be rigorously defined in terms of the MSEP.
Abstract: The mean squared error of prediction (MSEP) as a criterion for evaluating models used for studying ecological and agronomic systems is considered. It is shown that the different sources of error that have been discussed in the literature for such models can be rigorously defined in terms of the MSEP. A situation that occurs frequently for such models is that the model parameters are determined independently of data used to test the model, and the population of interest is structured in subpopulations. It is shown that in this case obtaining estimators of MSEP and of the individual error contributions can be reduced to the classic problem of estimating components of variance in a one-way random model. For comparison, the case where the parameters are estimated from the same data used to test the model is also addressed here.

Journal Article•DOI•
TL;DR: A noniterative method for estimating and comparing location parameters in random-coefficient growth curve models and two criteria for testing multivariate general linear hypotheses are introduced and their asymptotic properties are investigated.
Abstract: Growth and dose-response curve studies often result in incomplete or unbalanced data. Random-effects models together with a variety of computer-intensive iterative techniques have been suggested for the analysis of such data. This paper is concerned with a noniterative method for estimating and comparing location parameters in random-coefficient growth curve models. Consistent and asymptotically efficient estimators of the location parameters are obtained using estimated generalized least squares. Two criteria for testing multivariate general linear hypotheses are introduced and their asymptotic properties are investigated. The results are applied to clinical data obtained on the blood ultrafiltration performance of hemodialyzers used in the treatment of patients with end-stage renal disease.

Journal Article•DOI•
TL;DR: In many cases, the sample size for the independent-sample case provides a conservative approximation for the matched-pair design, and a simple alternative approximation is presented here.
Abstract: Miettinen (1968, Biometrics 24, 339-352) presented an approximation for power and sample size for testing the differences between proportions in the matched-pair case. Duffy (1984, Biometrics 40, 1005-1015) gave the exact power for this case and showed that Miettinen's approximation tends to slightly overestimate the power or underestimate the sample size necessary for the design power. A simple alternative approximation that is more conservative is presented here. In many cases, the sample size for the independent-sample case provides a conservative approximation for the matched-pair design.
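A common normal-approximation formula for the matched-pair sample size takes the two discordant-pair probabilities as inputs. The version below is a widely quoted textbook form, not necessarily the author's exact approximation:

```python
import math
from statistics import NormalDist

def matched_pairs_n(p10, p01, alpha=0.05, power=0.80, two_sided=True):
    """Approximate number of pairs for McNemar's test of paired
    proportions (sketch; constants follow a common textbook version).

    p10, p01 : the two discordant-pair probabilities whose difference
               is the effect of interest.
    """
    z = NormalDist().inv_cdf
    za = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    zb = z(power)
    d = p10 + p01                    # total discordant probability
    diff = p10 - p01
    n = (za * d ** 0.5 + zb * (d - diff ** 2) ** 0.5) ** 2 / diff ** 2
    return math.ceil(n)
```

Holding the discordant total fixed, a larger difference between the discordant probabilities sharply reduces the required number of pairs.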


Journal Article•DOI•
TL;DR: It is demonstrated that the perceived advantage of not preassigning scores is illusory, and recommendations are to assign reasonable column scores whenever possible, and to consider equally spaced scores when the choice is not apparent.
Abstract: The numerous statistical methods for testing no association between a binary response (rows) and K ordered categories (columns) group naturally into two classes: those that require preassigned numerical column scores and those that do not. An example of the former would be a logistic regression analysis, and of the latter would be a Wilcoxon rank-sum test. In this paper we demonstrate that the perceived advantage of not preassigning scores is illusory. We do this by presenting an example from our consulting experience in which the midrank scores used by the rank tests that do not require preassigned scores are clearly inappropriate. Our recommendations are to assign reasonable column scores whenever possible, and to consider equally spaced scores when the choice is not apparent. Midranks as scores should always be examined for their appropriateness before a rank test is applied.
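Preassigned scores enter through a Cochran-Armitage-type trend statistic, which is easy to sketch (shown without the finite-population (N-1)/N correction some packages apply):

```python
import numpy as np

def trend_z(successes, totals, scores):
    """Cochran-Armitage-type trend statistic for a binary response
    across K ordered columns with preassigned scores (the choice the
    paper argues for).  Returns a Z value, asymptotically N(0,1)
    under the null of no association."""
    r, n, x = (np.asarray(a, float) for a in (successes, totals, scores))
    N, R = n.sum(), r.sum()
    p = R / N                                  # pooled success probability
    xbar = (n * x).sum() / N                   # score mean
    num = (x * (r - n * p)).sum()
    var = p * (1 - p) * (n * (x - xbar) ** 2).sum()
    return num / np.sqrt(var)
```

Replacing the preassigned scores with midranks turns this into the Wilcoxon-type test the paper cautions about; with equally spaced columns the two often agree, which is why the authors suggest equally spaced scores as the default.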