
Showing papers in "Biometrical Journal in 2008"


Journal ArticleDOI
TL;DR: This paper describes simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters, and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc.
Abstract: Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of rejecting erroneously at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures have to be used which adjust for multiplicity and thus control the overall type I error rate. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results.

10,545 citations
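
As a rough illustration of the single-step adjustment described in this abstract (a sketch, not the paper's own code; the full methodology, including the multivariate-t refinement, is available in R, e.g. in the multcomp package), the following Python snippet computes max-|z| adjusted p-values and simultaneous confidence intervals for linear contrasts of OLS parameters by Monte Carlo. All data and contrasts are made up.

```python
# Minimal sketch of single-step simultaneous inference for linear functions
# K @ beta of a fitted model, using a Monte Carlo max-|z| adjustment.
# Assumptions: statsmodels OLS fit; the multivariate-t refinement used in
# practice is replaced here by a multivariate-normal approximation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
# toy data: three groups, cell-means coding
g = np.repeat([0, 1, 2], 20)
X = np.eye(3)[g]                      # design matrix (one column per group)
y = np.array([0.0, 0.5, 1.0])[g] + rng.normal(size=60)

fit = sm.OLS(y, X).fit()
K = np.array([[-1.0, 1.0, 0.0],       # group 2 - group 1
              [-1.0, 0.0, 1.0],       # group 3 - group 1
              [ 0.0,-1.0, 1.0]])      # group 3 - group 2
est = K @ fit.params
cov = K @ fit.cov_params() @ K.T
se = np.sqrt(np.diag(cov))
z = est / se
R = cov / np.outer(se, se)            # correlation of the contrast estimates

# Monte Carlo distribution of max |Z| under H0 with correlation R
Z = rng.multivariate_normal(np.zeros(len(z)), R, size=50000)
maxabs = np.abs(Z).max(axis=1)
adj_p = [(maxabs >= abs(zi)).mean() for zi in z]   # single-step adjusted p-values
crit = np.quantile(maxabs, 0.95)                   # simultaneous critical value
for name, e, s, p in zip(["2-1", "3-1", "3-2"], est, se, adj_p):
    print(f"{name}: est={e:.3f}  ci=({e-crit*s:.3f},{e+crit*s:.3f})  adj.p={p:.4f}")
```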


Journal ArticleDOI
TL;DR: The limitations and usefulness of each method are addressed in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.
Abstract: The receiver operating characteristic (ROC) curve is used to evaluate a biomarker's ability for classifying disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood (ML)) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches, respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators' bias, root mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreases with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.

801 citations
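
The empirical (non-parametric) version of the Youden index is easy to sketch. The snippet below computes J and the optimal cut-point c* for simulated gamma-distributed biomarker values, with values below an assumed LOD crudely substituted by LOD/sqrt(2); it does not implement the paper's ML or ROC-GLM corrections.

```python
# Sketch: empirical Youden index J and optimal cut-point c* for a biomarker,
# with values below an assumed limit of detection (LOD) crudely substituted
# by LOD/sqrt(2).  Illustrative only; not the paper's ML / ROC-GLM corrections.
import numpy as np

rng = np.random.default_rng(2)
controls = rng.gamma(shape=2.0, scale=1.0, size=200)   # hypothetical biomarker values
cases    = rng.gamma(shape=2.0, scale=2.0, size=200)
LOD = 0.5
controls = np.where(controls < LOD, LOD / np.sqrt(2), controls)
cases    = np.where(cases    < LOD, LOD / np.sqrt(2), cases)

cuts = np.unique(np.concatenate([controls, cases]))
sens = np.array([(cases    >  c).mean() for c in cuts])
spec = np.array([(controls <= c).mean() for c in cuts])
J = sens + spec - 1.0
best = np.argmax(J)
print(f"Youden index J = {J[best]:.3f} at cut-point c* = {cuts[best]:.3f}")
```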


Journal ArticleDOI
TL;DR: A systematic review of the modern way of assessing risk prediction models, using measures of predictive performance derived from ROC methodology and from probability forecasting theory.
Abstract: For medical decision making and patient information, predictions of future status variables play an important role. Risk prediction models can be derived with many different statistical approaches. To compare them, measures of predictive performance are derived from ROC methodology and from probability forecasting theory. These tools can be applied to assess single markers, multivariable regression models and complex model selection algorithms. This article provides a systematic review of the modern way of assessing risk prediction models. Particular attention is put on proper benchmarks and resampling techniques that are important for the interpretation of measured performance. All methods are illustrated with data from a clinical study in head and neck cancer patients.

249 citations
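
Two of the performance measures discussed (the AUC from ROC methodology and the Brier score from probability forecasting) can be sketched in a few lines. The toy example below uses in-sample predictions from a logistic model and a naive bootstrap; as the paper stresses, proper benchmarks and resampling (e.g. cross-validation) are needed to avoid optimism, which this sketch ignores.

```python
# Sketch: AUC and Brier score for a toy logistic-regression risk model,
# with a simple bootstrap for uncertainty.  Illustrative data only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=(n, 2))
p_true = 1 / (1 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1])))
y = rng.binomial(1, p_true)

fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
risk = fit.predict(sm.add_constant(x))            # in-sample predictions (optimistic!)

def auc(y, r):
    # probability that a random case gets a higher risk than a random control
    pos, neg = r[y == 1], r[y == 0]
    return (pos[:, None] > neg[None, :]).mean() + 0.5 * (pos[:, None] == neg[None, :]).mean()

def brier(y, r):
    return np.mean((y - r) ** 2)

boot_auc, boot_brier = [], []
for _ in range(500):
    idx = rng.integers(0, n, n)
    boot_auc.append(auc(y[idx], risk[idx]))
    boot_brier.append(brier(y[idx], risk[idx]))
print(f"AUC   {auc(y, risk):.3f}  (95% CI {np.quantile(boot_auc, .025):.3f}-{np.quantile(boot_auc, .975):.3f})")
print(f"Brier {brier(y, risk):.3f}  (95% CI {np.quantile(boot_brier, .025):.3f}-{np.quantile(boot_brier, .975):.3f})")
```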


Journal ArticleDOI
TL;DR: It is shown that signal intensity plots are a conditio sine qua non in today's GWAs, and different strategies aimed at tackling the problem of multiple testing are discussed, including adjustment of p-values, the false positive report probability and the false discovery rate.
Abstract: To search the entire human genome for association is a novel and promising approach to unravelling the genetic basis of complex genetic diseases. In these genome-wide association studies (GWAs), several hundreds of thousands of single nucleotide polymorphisms (SNPs) are analyzed at the same time, posing substantial biostatistical and computational challenges. In this paper, we discuss a number of biostatistical aspects of GWAs in detail. We specifically consider quality control issues and show that signal intensity plots are a conditio sine qua non in today's GWAs. Approaches to detect and adjust for population stratification are briefly examined. We discuss different strategies aimed at tackling the problem of multiple testing, including adjustment of p-values, the false positive report probability and the false discovery rate. Another aspect of GWAs requiring special attention is the search for gene-gene and gene-environment interactions. We finally describe multistage approaches to GWAs.

168 citations
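
A minimal sketch of the multiplicity adjustments mentioned above, applied to a vector of simulated SNP p-values (Bonferroni and the Benjamini-Hochberg false discovery rate via statsmodels); the data are artificial.

```python
# Sketch: Bonferroni adjustment and the Benjamini-Hochberg false discovery
# rate applied to simulated genome-wide association p-values.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(4)
m = 100_000                                  # number of SNPs
pvals = rng.uniform(size=m)                  # mostly null SNPs
pvals[:20] = rng.uniform(0, 1e-8, size=20)   # a few truly associated SNPs

bonf_reject, bonf_p, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
fdr_reject,  fdr_q,  _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print("Bonferroni rejections:", bonf_reject.sum())
print("BH (FDR 5%) rejections:", fdr_reject.sum())
```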


Journal ArticleDOI
TL;DR: A general multistage (stepwise) procedure is proposed for dealing with arbitrary gatekeeping problems including parallel and serial gatekeeping and is based on the idea of carrying forward the Type I error rate for any rejected hypotheses to test hypotheses in the next ordered family.
Abstract: A general multistage (stepwise) procedure is proposed for dealing with arbitrary gatekeeping problems including parallel and serial gatekeeping. The procedure is very simple to implement since it does not require the application of the closed testing principle and the consequent need to test all nonempty intersections of hypotheses. It is based on the idea of carrying forward the Type I error rate for any rejected hypotheses to test hypotheses in the next ordered family. This requires the use of a so-called separable multiple test procedure (MTP) in the earlier family. The Bonferroni MTP is separable, but other standard MTPs such as Holm, Hochberg, Fallback and Dunnett are not. Their truncated versions are proposed which are separable and more powerful than the Bonferroni MTP. The proposed procedure is illustrated by a clinical trial example.

119 citations
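
The separable building block described here, the truncated Holm procedure, is simple to sketch. The snippet below implements its step-down critical values for one family; the bookkeeping that carries unused Type I error forward to the next ordered family is not reproduced. The p-values are made up.

```python
# Sketch: the truncated Holm procedure with truncation parameter gamma in [0,1),
# a "separable" multiple test procedure.  gamma = 0 gives Bonferroni; gamma -> 1
# approaches ordinary Holm.
import numpy as np

def truncated_holm(pvals, alpha=0.05, gamma=0.5):
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for step, idx in enumerate(order):                 # step-down over ordered p-values
        thresh = (gamma / (m - step) + (1 - gamma) / m) * alpha
        if pvals[idx] <= thresh:
            reject[idx] = True
        else:
            break                                      # stop at the first acceptance
    return reject

p_family1 = np.array([0.009, 0.013, 0.04])
print(truncated_holm(p_family1, alpha=0.05, gamma=0.5))
```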


Journal ArticleDOI
TL;DR: Inference methods for the conditional logistic regression model in this setup are developed, which can be formulated within a generalized estimating equation (GEE) framework, permitting the use of statistical techniques developed for GEE-based inference, such as robust variance estimators and model selection criteria adapted for non-independent data.
Abstract: This paper considers inference methods for case-control logistic regression in longitudinal setups. The motivation is provided by an analysis of plains bison spatial location as a function of habitat heterogeneity. The sampling is done according to a longitudinal matched case-control design in which, at certain time points, exactly one case, the actual location of an animal, is matched to a number of controls, the alternative locations that could have been reached. We develop inference methods for the conditional logistic regression model in this setup, which can be formulated within a generalized estimating equation (GEE) framework. This permits the use of statistical techniques developed for GEE-based inference, such as robust variance estimators and model selection criteria adapted for non-independent data. The performance of the methods is investigated in a simulation study and illustrated with the bison data analysis.

111 citations
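
For a 1:m matched design, the conditional logistic likelihood can be maximised directly. The sketch below simulates matched sets (one chosen location versus m alternatives) and fits the model with scipy; the GEE-type robust variance estimators and model selection criteria developed in the paper are not included.

```python
# Sketch: conditional logistic regression for a 1:m matched design (one case,
# i.e. the chosen location, versus m control locations per stratum), fitted by
# directly maximising the conditional likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(6)
n_sets, m_controls, beta_true = 200, 4, np.array([1.0, -0.5])

# X[s] holds the covariates of the case (row 0) and its controls (rows 1..m)
X = rng.normal(size=(n_sets, m_controls + 1, 2))
# simulate which alternative is "chosen" and reorder so the case is row 0
util = X @ beta_true
choice = np.array([rng.choice(m_controls + 1, p=np.exp(u) / np.exp(u).sum()) for u in util])
for s, c in enumerate(choice):
    X[s, [0, c]] = X[s, [c, 0]]

def negloglik(beta):
    lin = X @ beta                               # shape (n_sets, m+1)
    return np.sum(logsumexp(lin, axis=1) - lin[:, 0])

res = minimize(negloglik, x0=np.zeros(2), method="BFGS")
print("estimated beta:", res.x)
```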


Journal ArticleDOI
TL;DR: Resampling-based multiple testing methods that control the Familywise Error Rate in the strong sense are presented and it is shown that no assumptions whatsoever are required to obtain a reasonably powerful and flexible class of multiple testing procedures.
Abstract: Resampling-based multiple testing methods that control the Familywise Error Rate in the strong sense are presented. It is shown that no assumptions whatsoever on the data-generating process are required to obtain a reasonably powerful and flexible class of multiple testing procedures. Improvements are obtained with mild assumptions. The methods are applicable to gene expression data in particular, but more generally to any multivariate, multiple group data that may be character or numeric. The role of the disputed "subset pivotality" condition is clarified.

106 citations
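
A generic permutation max-T procedure conveys the resampling idea behind strong FWER control (this is an illustration, not the paper's specific construction):

```python
# Sketch: permutation max-T procedure for strong FWER control when comparing
# two groups on many variables (e.g. genes).
import numpy as np

rng = np.random.default_rng(7)
n1, n2, m = 15, 15, 200
x = rng.normal(size=(n1 + n2, m))
x[:n1, :5] += 1.5                                  # five truly shifted variables
labels = np.array([0] * n1 + [1] * n2)

def tstats(data, lab):
    a, b = data[lab == 0], data[lab == 1]
    se = np.sqrt(a.var(ddof=1, axis=0) / len(a) + b.var(ddof=1, axis=0) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

obs = tstats(x, labels)
B = 2000
maxT = np.empty(B)
for b in range(B):
    perm = rng.permutation(labels)                 # relabel the samples
    maxT[b] = tstats(x, perm).max()
adj_p = np.array([(maxT >= t).mean() for t in obs])   # max-T adjusted p-values
print("rejected at FWER 5%:", np.where(adj_p <= 0.05)[0])
```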


Journal ArticleDOI
TL;DR: Four different general approaches for testing the ratio of two Poisson rates are compared and some recommendations favoring the likelihood ratio and certain asymptotic tests are based on these simulation results.
Abstract: In this paper we compare the properties of four different general approaches for testing the ratio of two Poisson rates. Asymptotically normal tests, tests based on approximate p-values, exact conditional tests, and a likelihood ratio test are considered. The properties and power performance of these tests are studied by a Monte Carlo simulation experiment. Sample size calculation formulae are given for each of the test procedures and their validities are studied. Some recommendations favoring the likelihood ratio and certain asymptotic tests are based on these simulation results. Finally, all of the test procedures are illustrated with two real life medical examples.

102 citations
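
Two of the four approaches compared, the exact conditional test and the likelihood ratio test for H0: lambda1/lambda2 = 1, can be sketched as follows (counts and exposure times are invented):

```python
# Sketch: exact conditional test and likelihood ratio test for the ratio of
# two Poisson rates, with counts x1, x2 observed over exposure times t1, t2.
import numpy as np
from scipy.stats import binomtest, chi2, poisson

x1, t1 = 30, 100.0
x2, t2 = 16, 120.0

# exact conditional test: given x1 + x2, x1 is Binomial(n, t1/(t1+t2)) under H0
cond = binomtest(x1, n=x1 + x2, p=t1 / (t1 + t2))
print("exact conditional p-value:", cond.pvalue)

# likelihood ratio test
lam1_hat, lam2_hat = x1 / t1, x2 / t2            # unrestricted MLEs
lam0_hat = (x1 + x2) / (t1 + t2)                 # common rate under H0
ll = lambda l1, l2: poisson.logpmf(x1, l1 * t1) + poisson.logpmf(x2, l2 * t2)
lr = 2 * (ll(lam1_hat, lam2_hat) - ll(lam0_hat, lam0_hat))
print("LRT p-value:", chi2.sf(lr, df=1))
```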


Journal ArticleDOI
TL;DR: This paper proposes a general approach for handling multiple contrast tests for normally distributed data in the presence of heteroscedasticity, and three candidate procedures are described and compared by simulations.
Abstract: This paper proposes a general approach for handling multiple contrast tests for normally distributed data in the presence of heteroscedasticity. Three candidate procedures are described and compared by simulations. Only the procedure with both comparison-specific degrees of freedom and a correlation matrix depending on sample variances maintains the α-level over all situations. Other approaches may fail notably as the variances differ more. Furthermore, related approximate simultaneous confidence intervals are given. The approach will be applied to a toxicological experiment.

93 citations
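
The two ingredients highlighted in the abstract, comparison-specific (Satterthwaite) degrees of freedom and a correlation matrix depending on the sample variances, are easy to compute; the final plug-in into a multivariate-t quantile is omitted in this sketch, and all data are simulated.

```python
# Sketch: comparison-specific Satterthwaite degrees of freedom and a plug-in
# correlation matrix for heteroscedastic multiple contrast tests.
import numpy as np

rng = np.random.default_rng(9)
groups = [rng.normal(0.0, 1.0, 12), rng.normal(0.3, 2.0, 15), rng.normal(0.8, 4.0, 10)]
means = np.array([g.mean() for g in groups])
v_over_n = np.array([g.var(ddof=1) / len(g) for g in groups])
n = np.array([len(g) for g in groups])

C = np.array([[-1.0, 1.0, 0.0],        # Dunnett-type contrasts vs. group 1
              [-1.0, 0.0, 1.0]])
est = C @ means
var = (C ** 2) @ v_over_n
df = var ** 2 / ((C ** 2 * v_over_n) ** 2 @ (1.0 / (n - 1)))   # Satterthwaite, per contrast
corr = (C * v_over_n) @ C.T / np.sqrt(np.outer(var, var))      # plug-in correlation
print("estimates:", est, "\ndf:", df, "\ncorrelation:\n", corr)
```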


Journal ArticleDOI
TL;DR: An approximative method for maximum likelihood estimation of parameters of Neyman-Scott and similar point processes is proposed, based on the point pattern resulting from forming all difference points of pairs of points in the window of observation.
Abstract: This paper proposes an approximative method for maximum likelihood estimation of parameters of Neyman-Scott and similar point processes. It is based on the point pattern resulting from forming all difference points of pairs of points in the window of observation. The intensity function of this constructed point process can be expressed in terms of second-order characteristics of the original process. This opens the way to parameter estimation, if the difference pattern is treated as a non-homogeneous Poisson process. The computational feasibility and accuracy of this approach is examined by means of simulated data. Furthermore, the method is applied to two biological data sets. For these data, various cluster process models are considered and compared with respect to their goodness-of-fit.

93 citations
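
The construction underlying the method, the pattern of all pairwise difference points, can be sketched as follows for a simulated Thomas-type cluster process on the unit square; fitting the inhomogeneous Poisson likelihood to this difference pattern, as the paper proposes, is not shown.

```python
# Sketch: the difference pattern of a clustered point pattern in a window.
import numpy as np

rng = np.random.default_rng(16)
# simulate a toy Neyman-Scott (Thomas) process on the unit square
parents = rng.uniform(size=(rng.poisson(15), 2))
points = np.vstack([p + rng.normal(0, 0.03, size=(rng.poisson(5), 2)) for p in parents])
points = points[(points >= 0).all(axis=1) & (points <= 1).all(axis=1)]

# difference pattern: x_i - x_j for all ordered pairs i != j
diff = (points[:, None, :] - points[None, :, :]).reshape(-1, 2)
diff = diff[np.any(diff != 0, axis=1)]          # drop the zero self-differences
print(f"{len(points)} points -> {len(diff)} difference points")
```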


Journal ArticleDOI
TL;DR: It is shown that cluster mean imputation yields valid inferences and, given its simplicity, may be an attractive option in some large community intervention trials that are subject to individual-level attrition only; however, it may yield less powerful inferences than alternative procedures that pool across clusters, especially when the cluster sizes are small and cluster follow-up rates are highly variable.
Abstract: In cluster randomized trials, intact social units such as schools, worksites or medical practices - rather than individuals themselves - are randomly allocated to intervention and control conditions, while the outcomes of interest are then observed on individuals within each cluster. Such trials are becoming increasingly common in the fields of health promotion and health services research. Attrition is a common occurrence in randomized trials, and a standard approach for dealing with the resulting missing values is imputation. We consider imputation strategies for missing continuous outcomes, focusing on trials with a completely randomized design in which fixed cohorts from each cluster are enrolled prior to random assignment. We compare five different imputation strategies with respect to Type I and Type II error rates of the adjusted two-sample t-test for the intervention effect. Cluster mean imputation is compared with multiple imputation, using either within-cluster data or data pooled across clusters in each intervention group. In the case of pooling across clusters, we distinguish between standard multiple imputation procedures which do not account for intracluster correlation and a specialized procedure which does account for intracluster correlation but is not yet available in standard statistical software packages. A simulation study is used to evaluate the influence of cluster size, number of clusters, degree of intracluster correlation, and variability among cluster follow-up rates. We show that cluster mean imputation yields valid inferences and, given its simplicity, may be an attractive option in some large community intervention trials which are subject to individual-level attrition only; however, it may yield less powerful inferences than alternative procedures which pool across clusters, especially when the cluster sizes are small and cluster follow-up rates are highly variable. When pooling across clusters, the imputation procedure should generally take intracluster correlation into account to obtain valid inferences; however, as long as the intracluster correlation coefficient is small, we show that standard multiple imputation procedures may yield acceptable Type I error rates; moreover, these procedures may yield more powerful inferences than a specialized procedure, especially when the number of available clusters is small. Within-cluster multiple imputation is shown to be the least powerful among the procedures considered.
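
Cluster mean imputation itself is straightforward; a sketch with simulated data follows (the multiple-imputation alternatives studied in the paper are not shown).

```python
# Sketch: cluster mean imputation for a missing continuous outcome in a cluster
# randomized trial -- each missing value is replaced by the mean of the observed
# outcomes in its own cluster.
import numpy as np

rng = np.random.default_rng(10)
n_clusters, m = 10, 20
cluster = np.repeat(np.arange(n_clusters), m)
y = rng.normal(0, 1, size=n_clusters)[cluster] + rng.normal(size=n_clusters * m)
y[rng.uniform(size=y.size) < 0.2] = np.nan            # ~20% attrition

y_imp = y.copy()
for c in range(n_clusters):
    in_c = cluster == c
    y_imp[in_c & np.isnan(y)] = np.nanmean(y[in_c])   # cluster mean imputation
print("remaining missing values:", np.isnan(y_imp).sum())
```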

Journal ArticleDOI
TL;DR: Dose finding studies play a key role in any drug development program and are often the gate-keeper for large confirmatory studies, especially in the context of pharmaceutical drug development.
Abstract: A good understanding and characterization of the dose response relationship of any new compound is an important and ubiquitous problem in many areas of scientific investigation. This is especially true in the context of pharmaceutical drug development, where it is mandatory to launch safe drugs which demonstrate a clinically relevant effect. Selecting a dose too high may result in unacceptable safety problems, while selecting a dose too low may lead to ineffective drugs. Dose finding studies thus play a key role in any drug development program and are often the gate-keeper for large confirmatory studies. In this overview paper we focus on definitive and confirmatory dose finding studies in Phase II or III, reviewing relevant statistical design and analysis methods. In particular, we describe multiple comparison procedures, modeling approaches, and hybrid methods combining the advantages of both. An outlook to adaptive dose finding methods is also given. We use a real data example to illustrate the methods, together with a brief overview of relevant software.
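
The modelling side of dose finding can be illustrated with a simple Emax fit and the estimation of a target dose; the model, data and clinically relevant effect delta below are all invented, and the multiple comparison step of the hybrid approaches is omitted.

```python
# Sketch: fitting an Emax dose-response curve by nonlinear least squares and
# reading off the dose that achieves a clinically relevant effect delta.
import numpy as np
from scipy.optimize import curve_fit

def emax(dose, e0, emax_, ed50):
    return e0 + emax_ * dose / (ed50 + dose)

doses = np.repeat([0, 10, 25, 50, 100, 150], 12)
rng = np.random.default_rng(11)
resp = emax(doses, 0.2, 1.5, 40.0) + rng.normal(0, 0.7, size=doses.size)

(e0, em, ed50), _ = curve_fit(emax, doses, resp, p0=[0.0, 1.0, 30.0])
delta = 1.0                                            # clinically relevant effect
target_dose = ed50 * delta / (em - delta)              # dose with effect delta over placebo
print(f"E0={e0:.2f}, Emax={em:.2f}, ED50={ed50:.1f}, estimated target dose={target_dose:.1f}")
```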

Journal ArticleDOI
Jack Bowden, Ekkehard Glimm
TL;DR: This paper extends the method of Cohen and Sackrowitz (1989), who proposed a two-stage unbiased estimate for the best performing treatment at interim, enabling their estimate to work for unequal stage one and two sample sizes, and also when the quantity of interest is the best, second best, or j-th best treatment out of k.
Abstract: Straightforward estimation of a treatment's effect in an adaptive clinical trial can be severely hindered when it has been chosen from a larger group of potential candidates. This is because selection mechanisms that condition on the rank order of treatment statistics introduce bias. Nevertheless, designs of this sort are seen as a practical and efficient way to fast track the most promising compounds in drug development. In this paper we extend the method of Cohen and Sackrowitz (1989), who proposed a two-stage unbiased estimate for the best performing treatment at interim. This enables their estimate to work for unequal stage one and two sample sizes, and also when the quantity of interest is the best, second best, or j-th best treatment out of k. The implications of this new flexibility are explored via simulation.
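
The selection bias that motivates the paper is easy to demonstrate by simulation; the sketch below shows only the problem, it does not reproduce the unbiased two-stage estimator.

```python
# Sketch: Monte Carlo illustration of selection bias -- the naive stage-1
# estimate of the treatment selected as "best" is biased upwards.
import numpy as np

rng = np.random.default_rng(12)
k, n1, sigma = 4, 50, 1.0
true_effects = np.zeros(k)                  # all arms truly have zero effect

naive = []
for _ in range(20000):
    stage1_means = rng.normal(true_effects, sigma / np.sqrt(n1))
    naive.append(stage1_means.max())        # estimate reported for the selected arm
print("true effect of any arm: 0.00")
print("mean naive estimate of the selected 'best' arm: %.3f" % np.mean(naive))
```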

Journal ArticleDOI
TL;DR: This paper reviews methods for nearest neighbour analysis that adjust for local trend in one dimension, including first differences and the Papadakis method, and discusses mixed model representations of these methods on the scale of the observed data.
Abstract: This paper reviews methods for nearest neighbour analysis that adjust for local trend in one dimension. Such methods are commonly used in plant breeding and variety testing. The focus is on simple differencing methods, including first differences and the Papadakis method. We discuss mixed model representations of these methods on the scale of the observed data. Modelling observed data has a number of practical advantages compared to differencing, for example the facility to conveniently compute adjusted cultivar means. Most models considered involve a linear variance-covariance structure and can be represented as state-space models. The reviewed methods and models are exemplified using three datasets.

Journal ArticleDOI
TL;DR: A hierarchical specification using spatial random effects modeled with a Dirichlet process prior is developed, a dynamic formulation extends the model to spatio-temporal settings, and the methodology is demonstrated with simulated data as well as with a data set on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.
Abstract: Disease incidence or mortality data are typically available as rates or counts for specified regions, collected over time. We propose Bayesian nonparametric spatial modeling approaches to analyze such data. We develop a hierarchical specification using spatial random effects modeled with a Dirichlet process prior. The Dirichlet process is centered around a multivariate normal distribution. This latter distribution arises from a log-Gaussian process model that provides a latent incidence rate surface, followed by block averaging to the areal units determined by the regions in the study. With regard to the resulting posterior predictive inference, the modeling approach is shown to be equivalent to an approach based on block averaging of a spatial Dirichlet process to obtain a prior probability model for the finite dimensional distribution of the spatial random effects. We introduce a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings. Posterior inference is implemented through Gibbs sampling. We illustrate the methodology with simulated data as well as with a data set on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.

Journal ArticleDOI
TL;DR: In a simulation study it is found that no one method dominates the others in terms of power, apart from the adaptive Dunnett test, which dominates the classical Dunnett test by construction.
Abstract: Traditionally drug development is generally divided into three phases which have different aims and objectives. Recently so-called adaptive seamless designs that allow combination of the objectives of different development phases into a single trial have gained much interest. Adaptive trials combining treatment selection typical for Phase II and confirmation of efficacy as in Phase III are referred to as adaptive seamless Phase II/III designs and are considered in this paper. We compared four methods for adaptive treatment selection, namely the classical Dunnett test, an adaptive version of the Dunnett test based on the conditional error approach, the combination test approach, and an approach within the classical group-sequential framework. The latter two approaches have only recently been published. In a simulation study we found that no one method dominates the others in terms of power apart from the adaptive Dunnett test that dominates the classical Dunnett by construction. Furthermore, scenarios under which one approach outperforms others are described.

Journal ArticleDOI
Olivier Guilbaud
TL;DR: This article provides simultaneous confidence regions for Holm's MTP and, more generally, for any MTP in the class of short-cut consonant closed-testing procedures based on marginal p-values and weighted Bonferroni tests for intersection hypotheses considered by Hommel, Bretz and Maurer (2007).
Abstract: Holm's (1979) step-down multiple-testing procedure (MTP) is appealing for its flexibility, transparency, and general validity, but the derivation of corresponding simultaneous confidence regions has remained an unsolved problem. This article provides such confidence regions. In fact, simultaneous confidence regions are provided for any MTP in the class of short-cut consonant closed-testing procedures based on marginal p-values and weighted Bonferroni tests for intersection hypotheses considered by Hommel, Bretz and Maurer (2007). In addition to Holm's MTP, this class includes the fixed-sequence MTP, recently proposed gatekeeping MTPs, and the fallback MTP. The simultaneous confidence regions are generally valid if underlying marginal p-values and corresponding marginal confidence regions (assumed to be available) are valid. The marginal confidence regions and estimated quantities are not assumed to be of any particular kinds/dimensions. Compared to the rejections made by the MTP for the family of null hypotheses H under consideration, the proposed confidence regions provide extra free information. In particular, with Holm's MTP, such extra information is provided: for all non-rejected H's, in case not all H's are rejected; or for certain (possibly all) H's, in case all H's are rejected. In case not all H's are rejected, no extra information is provided for rejected H's. This drawback seems, however, difficult to overcome. Illustrations concerning clinical studies are given.

Journal ArticleDOI
TL;DR: A class of generalized log-rank tests is presented for partly interval-censored survival data, where the observed data include both exact and interval-censored observations on the survival time of interest, and their asymptotic properties are established.
Abstract: In this paper, we consider incomplete survival data: partly interval-censored failure time data where observed data include both exact and interval-censored observations on the survival time of interest. We present a class of generalized log-rank tests for this type of survival data and establish their asymptotic properties. The method is evaluated using simulation studies and illustrated by a set of real data from a diabetes study.

Journal ArticleDOI
TL;DR: The zero-truncated negative binomial regression model is presented to estimate the population size in the presence of a single registration file and yields a substantially higher population size estimate than the Poisson model.
Abstract: This paper presents the zero-truncated negative binomial regression model to estimate the population size in the presence of a single registration file. The model is an alternative to the zero-truncated Poisson regression model and it may be useful if the data are overdispersed due to unobserved heterogeneity. Horvitz-Thompson point and interval estimates for the population size are derived, and the performance of these estimators is evaluated in a simulation study. To illustrate the model, the size of the population of opiate users in the city of Rotterdam is estimated. In comparison to the Poisson model, the zero-truncated negative binomial regression model fits these data better and yields a substantially higher population size estimate.
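
A sketch of the same idea without covariates: fit a zero-truncated negative binomial to the observed (non-zero) counts by maximum likelihood and apply a Horvitz-Thompson type correction N = n/(1 - P(0)). The regression version with covariates used in the paper is not reproduced, and the data are simulated.

```python
# Sketch: zero-truncated negative binomial fit (no covariates) with a
# Horvitz-Thompson type estimate of the total population size.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(13)
counts = rng.negative_binomial(n=1.2, p=0.4, size=3000)   # "true" population of 3000
observed = counts[counts > 0]                             # only registered individuals seen

def negloglik(theta):
    r, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
    logpmf = nbinom.logpmf(observed, r, p)
    return -(logpmf - np.log1p(-nbinom.pmf(0, r, p))).sum()   # truncate at zero

res = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
r_hat, p_hat = np.exp(res.x[0]), 1 / (1 + np.exp(-res.x[1]))
N_hat = len(observed) / (1 - nbinom.pmf(0, r_hat, p_hat))
print(f"observed n = {len(observed)}, estimated population size N = {N_hat:.0f}")
```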

Journal ArticleDOI
TL;DR: A survey of the general theory of multiple comparisons and its recent results in applications is given, emphasizing the closure of multiple tests.
Abstract: The introduction of sequentially rejective multiple test procedures (Einot and Gabriel, 1975; Naik, 1975; Holm, 1977; Holm, 1979) has caused considerable progress in the theory of multiple comparisons. Emphasizing the closure of multiple tests we give a survey of the general theory and its recent results in applications. Some new applications are given including a discussion of the connection with the theory of confidence regions.
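
The closure principle at the heart of such procedures can be sketched in a few lines for three hypotheses with Bonferroni tests of the intersections (raw p-values invented):

```python
# Sketch: closure principle for H1, H2, H3 with Bonferroni intersection tests.
# Hi is rejected at FWER level alpha only if every intersection containing it
# is rejected.
from itertools import combinations

p = {1: 0.012, 2: 0.030, 3: 0.25}      # raw p-values for H1, H2, H3
alpha = 0.05

def bonferroni_reject(subset):
    return min(p[i] for i in subset) <= alpha / len(subset)

hypotheses = list(p)
closure = [set(s) for r in range(1, 4) for s in combinations(hypotheses, r)]
for i in hypotheses:
    rejected = all(bonferroni_reject(s) for s in closure if i in s)
    print(f"H{i}: {'reject' if rejected else 'retain'}")
```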

Journal ArticleDOI
TL;DR: It is found that, although the array of candidate models can be improved, finite mixtures of a small number of components (point masses or simple diffuse distributions) represent a promising direction, and the usefulness of finite mixture models is noted.
Abstract: We consider parametric distributions intended to model heterogeneity in population size estimation, especially parametric stochastic abundance models for species richness estimation. We briefly review (conditional) maximum likelihood estimation of the number of species, and summarize the results of fitting 7 candidate models to frequency-count data, from a database of >40000 such instances, mostly arising from microbial ecology. We consider error estimation, goodness-of-fit assessment, data subsetting, and other practical matters. We find that, although the array of candidate models can be improved, finite mixtures of a small number of components (point masses or simple diffuse distributions) represent a promising direction. Finally we consider the connections between parametric models for abundance and incidence data, again noting the usefulness of finite mixture models.
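
For orientation, the frequency-count data format referred to above, together with a classical nonparametric estimate (the Chao1 lower bound) computed from it, can be sketched as follows; this is not one of the mixture-model fits evaluated in the paper.

```python
# Sketch: frequency-count data and the Chao1 lower bound for species richness.
# f[k] = number of species observed exactly k times in the sample
freq_counts = {1: 120, 2: 45, 3: 22, 4: 10, 5: 6, 6: 3}
S_obs = sum(freq_counts.values())
f1, f2 = freq_counts[1], freq_counts[2]

chao1 = S_obs + f1 ** 2 / (2 * f2)           # lower bound for total species richness
print(f"observed species: {S_obs}, Chao1 estimate: {chao1:.1f}")
```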

Journal ArticleDOI
Marc Vandemeulebroecke
TL;DR: A structured review is given of the current controversial discussion on practical issues, opportunities and challenges of group sequential and adaptive designs for clinical trials.
Abstract: In recent times, group sequential and adaptive designs for clinical trials have attracted great attention from industry, academia and regulatory authorities. These designs allow analyses on accumulating data - as opposed to classical, "fixed-sample" statistics. The rapid development of a great variety of statistical procedures is accompanied by a lively debate on their potential merits and shortcomings. The purpose of this review article is to ease orientation in both respects. First, we provide a concise overview of the essential technical concepts, with special emphasis on their interrelationships. Second, we give a structured review of the current controversial discussion on practical issues, opportunities and challenges of these new designs.

Journal ArticleDOI
TL;DR: An intuitive interpretation for "independence" between samples based on 2 x 2 categorical data formed by capture/non-capture in each of the two samples is provided and a general measure of "dependence" is reviewed.
Abstract: The Petersen-Lincoln estimator has been used to estimate the size of a population in a single mark release experiment. However, the estimator is not valid when the capture sample and recapture sample are not independent. We provide an intuitive interpretation for "independence" between samples based on 2 x 2 categorical data formed by capture/non-capture in each of the two samples. From the interpretation, we review a general measure of "dependence" and quantify the correlation bias of the Petersen-Lincoln estimator when two types of dependences (local list dependence and heterogeneity of capture probability) exist. An important implication in the census undercount problem is that instead of using a post enumeration sample to assess the undercount of a census, one should conduct a prior enumeration sample to avoid correlation bias. We extend the Petersen-Lincoln method to the case of two populations. This new estimator of the size of the shared population is proposed and its variance is derived. We discuss a special case where the correlation bias of the proposed estimator due to dependence between samples vanishes. The proposed method is applied to a study of the relapse rate of illicit drug use in Taiwan.
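
The Petersen-Lincoln estimator itself (and Chapman's less biased variant) takes only a few lines under the independence assumption discussed above; the numbers are invented.

```python
# Sketch: Petersen-Lincoln and Chapman estimators for a single
# mark-release-recapture experiment, assuming independent samples.
n1 = 400          # marked in the first sample
n2 = 350          # caught in the second sample
m  = 60           # marked animals recaptured in the second sample

petersen = n1 * n2 / m
chapman = (n1 + 1) * (n2 + 1) / (m + 1) - 1
var_chapman = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / ((m + 1) ** 2 * (m + 2))
print(f"Petersen-Lincoln: {petersen:.0f}")
print(f"Chapman: {chapman:.0f}  (SE ~ {var_chapman ** 0.5:.0f})")
```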

Journal ArticleDOI
TL;DR: This work summarises mixture model results for closed populations, using a skink data set for illustration, and discusses new mixture models for heterogeneous open populations.
Abstract: Modelling heterogeneity of capture is an important problem in estimating animal abundance from capture-recapture data, with underestimation of abundance occurring if different animals have intrinsically high or low capture probabilities. Mixture models are useful in many cases to model the heterogeneity. We summarise mixture model results for closed populations, using a skink data set for illustration. New mixture models for heterogeneous open populations are discussed, and a closed population model is shown to have new and potentially effective applications in community analysis.

Journal ArticleDOI
Gerhard Hommel, Frank Bretz
TL;DR: This paper discusses aesthetical concepts and requirements for reasonable multiple test procedures, and considers three different concepts of monotonicity, including the recently proposed "fallback procedure".
Abstract: In this paper we discuss aesthetical concepts and requirements for reasonable multiple test procedures. Aesthetical considerations lead to logical decision patterns which are conceivable and, if possible, simple to use and to communicate. Such considerations are sometimes contradictory to the ubiquitous requirement of maximizing power for a multiple test procedure. We illustrate the necessary trade-offs with several examples. We start by considering important logical properties and then discuss three different concepts of monotonicity. Afterwards we have a closer look at the recently proposed "fallback procedure" and show that it has some less appealing properties. Finally, we investigate the distribution of the numbers of significant results with respect to both expectation and variance.

Journal ArticleDOI
TL;DR: The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure.
Abstract: This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(V_n, S_n) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(V_n, S_n)], for arbitrary functions g(V_n, S_n) of the numbers of false positives V_n and true positives S_n. Of particular interest are error rates based on the proportion g(V_n, S_n) = V_n/(V_n + S_n) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[V_n/(V_n + S_n)]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure.

Journal ArticleDOI
TL;DR: This work revisits the heterogeneous closed population multiple recapture problem, modeling individual-level heterogeneity using the Grade of Membership model, and proposes a full hierarchical Bayes specification and a MCMC algorithm to obtain samples from the posterior distribution.
Abstract: We revisit the heterogeneous closed population multiple recapture problem, modeling individual-level heterogeneity using the Grade of Membership model (Woodbury et al., 1978). This strategy allows us to postulate the existence of homogeneous latent "ideal" or "pure" classes within the population, and construct a soft clustering of the individuals, where each one is allowed partial or mixed membership in all of these classes. We propose a full hierarchical Bayes specification and a MCMC algorithm to obtain samples from the posterior distribution. We apply the method to simulated data and to three real life examples.

Journal ArticleDOI
TL;DR: This work extends recently proposed confidence interval methods for the difference of two proportions or single contrasts to multiple contrasts by using quantiles of the multivariate normal distribution, taking the correlation into account.
Abstract: Simultaneous confidence intervals for contrasts of means in a one-way layout with several independent samples are well established for Gaussian distributed data. Procedures addressing different hypotheses are available, such as all pairwise comparisons or comparisons to control, comparison with average, or different tests for order-restricted alternatives. However, if the distribution of the response is not Gaussian, corresponding methods are usually not available or not implemented in software. For the case of comparisons among several binomial proportions, we extended recently proposed confidence interval methods for the difference of two proportions or single contrasts to multiple contrasts by using quantiles of the multivariate normal distribution, taking the correlation into account. The small sample performance of the proposed methods was investigated in simulation studies. The simple adjustment of adding 2 pseudo-observations to each sample estimate leads to reasonable coverage probabilities. The methods are illustrated by the evaluation of real data examples of a clinical trial and a toxicological study. The proposed methods and examples are available in the R package MCPAN.
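
A sketch of the adjustment described above for differences of proportions against a control: add one success and one failure (the two pseudo-observations) per sample and use an equicoordinate multivariate-normal quantile obtained by Monte Carlo. The full methodology is in the R package MCPAN mentioned in the abstract; the data below are invented.

```python
# Sketch: add-2 adjusted Wald-type simultaneous confidence intervals for
# differences of several binomial proportions against a control, with a Monte
# Carlo approximation to the equicoordinate multivariate normal quantile.
import numpy as np

x = np.array([5, 12, 18, 22])          # successes: control + 3 treatment groups
n = np.array([40, 40, 40, 40])
p = (x + 1) / (n + 2)                  # add one success and one failure per group
v = p * (1 - p) / (n + 2)

C = np.array([[-1, 1, 0, 0],           # each treatment minus control
              [-1, 0, 1, 0],
              [-1, 0, 0, 1]], dtype=float)
est = C @ p
se = np.sqrt((C ** 2) @ v)
corr = (C * v) @ C.T / np.outer(se, se)

rng = np.random.default_rng(18)
Z = rng.multivariate_normal(np.zeros(len(est)), corr, size=50000)
crit = np.quantile(np.abs(Z).max(axis=1), 0.95)       # two-sided equicoordinate quantile
for i, (e, s) in enumerate(zip(est, se), start=1):
    print(f"treatment {i} - control: {e:.3f}  [{e - crit * s:.3f}, {e + crit * s:.3f}]")
```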

Journal ArticleDOI
TL;DR: A new bootstrapping approach is proposed to improve upon the currently available bootstrap percentile method, and the most efficient way of constructing confidence intervals is identified and extended to the censored-data case.
Abstract: In health policy and economics studies, the incremental cost-effectiveness ratio (ICER) has long been used to compare the economic consequences relative to the health benefits of therapies. Due to the skewed distributions of the costs and ICERs, much research has been done on how to obtain confidence intervals of ICERs, using either parametric or nonparametric methods, with or without the presence of censoring. In this paper, we will examine and compare the finite sample performance of many approaches via simulation studies. For the special situation when the health effect of the treatment is not statistically significant, we will propose a new bootstrapping approach to improve upon the bootstrap percentile method that is currently available. The most efficient way of constructing confidence intervals will be identified and extended to the censored data case. Finally, a data example from a cardiovascular clinical trial is used to demonstrate the application of these methods.
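
The baseline method the paper sets out to improve, a bootstrap percentile confidence interval for the ICER, can be sketched as follows with invented cost and effect data; the proposed refinement is not reproduced.

```python
# Sketch: nonparametric bootstrap percentile confidence interval for the
# incremental cost-effectiveness ratio (ICER).
import numpy as np

rng = np.random.default_rng(19)
n = 300
cost_new, eff_new = rng.gamma(2, 4000, n), rng.normal(0.60, 0.9, n)   # hypothetical data
cost_old, eff_old = rng.gamma(2, 3000, n), rng.normal(0.35, 0.9, n)

icer = (cost_new.mean() - cost_old.mean()) / (eff_new.mean() - eff_old.mean())

boot = []
for _ in range(4000):
    i = rng.integers(0, n, n)
    j = rng.integers(0, n, n)
    boot.append((cost_new[i].mean() - cost_old[j].mean()) /
                (eff_new[i].mean() - eff_old[j].mean()))
lo, hi = np.quantile(boot, [0.025, 0.975])
print(f"ICER = {icer:.0f} per unit of effect, 95% percentile CI ({lo:.0f}, {hi:.0f})")
```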

Journal ArticleDOI
TL;DR: Specific assumptions necessary for permutation multiple tests to control the Familywise Error Rate (FWER) are discussed, including the surprising fact that, in the case of a linear model with i.i.d. errors, validity is not affected by the parameters of the joint distribution of the observations, provided the test statistic is of a particular form.
Abstract: This article discusses specific assumptions necessary for permutation multiple tests to control the Familywise Error Rate (FWER). At issue is that, in comparing parameters of the marginal distributions of two sets of multivariate observations, validity of permutation testing is affected by all the parameters in the joint distributions of the observations. We show the surprising fact that, in the case of a linear model with i.i.d. errors such as in the analysis of Quantitative Trait Loci (QTL), this issue has no impact on control of FWER, if the test statistic is of a particular form. On the other hand, in the analysis of gene expression levels or multiple safety endpoints, unless some assumption connecting the marginal distributions of the observations to their joint distributions is made, permutation multiple tests may not control FWER.