
Showing papers in "Biometrics in 2001"


Journal ArticleDOI
Wei Pan
TL;DR: This work proposes a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term.
Abstract: Correlated response data are common in biomedical studies. Regression analysis based on the generalized estimating equations (GEE) is an increasingly important method for such data. However, there seem to be few model-selection criteria available in GEE. The well-known Akaike Information Criterion (AIC) cannot be directly applied since AIC is based on maximum likelihood estimation while GEE is nonlikelihood based. We propose a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term. Its performance is investigated through simulation studies. For illustration, the method is applied to a real data set.
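
As a rough illustration of the criterion, the sketch below computes a QIC-type quantity from a fitted GEE, assuming the quasi-likelihood evaluated under the working-independence model, the corresponding model-based information matrix, and the robust (sandwich) covariance estimate are already available; the function and argument names are hypothetical.

```python
import numpy as np

def qic(quasi_loglik_indep, omega_indep, v_robust):
    """Quasi-likelihood-based information criterion for GEE.

    quasi_loglik_indep : quasi-likelihood Q(beta_hat; I) evaluated under the
                         working-independence correlation structure
    omega_indep        : model-based (independence) information matrix
    v_robust           : robust (sandwich) covariance estimate of beta_hat
    """
    penalty = 2.0 * np.trace(omega_indep @ v_robust)
    return -2.0 * quasi_loglik_indep + penalty

# Example use: compare two working correlation structures fitted to the same
# data; the structure with the smaller QIC would be preferred.
# qic_exch = qic(q_exch, omega_i, v_r_exch)
# qic_ar1  = qic(q_ar1,  omega_i, v_r_ar1)
```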

2,233 citations


Journal ArticleDOI
TL;DR: This paper proposes a bias-corrected covariance estimator for generalized estimating equations (GEE) that gives tests with sizes close to the nominal level even when the number of subjects is 10 and cluster sizes are unequal, whereas the robust and jackknife covariance estimators give tests with sizes that can be 2-3 times the nominal level.
Abstract: Summary. In this paper, we propose an alternative covariance estimator to the robust covariance estimator of generalized estimating equations (GEE). Hypothesis tests using the robust covariance estimator can have inflated size when the number of independent clusters is small. Resampling methods, such as the jackknife and bootstrap, have been suggested for covariance estimation when the number of clusters is small. A drawback of the resampling methods when the response is binary is that the methods can break down when the number of subjects is small due to zero or near-zero cell counts caused by resampling. We propose a bias-corrected covariance estimator that avoids this problem. In a small simulation study, we compare the bias-corrected covariance estimator to the robust and jackknife covariance estimators for binary responses for situations involving 10–40 subjects with equal and unequal cluster sizes of 16–64 observations. The bias-corrected covariance estimator gave tests with sizes close to the nominal level even when the number of subjects was 10 and cluster sizes were unequal, whereas the robust and jackknife covariance estimators gave tests with sizes that could be 2–3 times the nominal level. The methods are illustrated using data from a randomized clinical trial on treatment for bone loss in subjects with periodontal disease.

455 citations


Journal ArticleDOI
TL;DR: The multiplicative model corresponds to that used in the multivariate technique of factor analysis and provides a parsimonious and interpretable model for the genetic covariances between environments.
Abstract: The recommendation of new plant varieties for commercial use requires reliable and accurate predictions of the average yield of each variety across a range of target environments and knowledge of important interactions with the environment. This information is obtained from series of plant variety trials, also known as multi-environment trials (MET). Cullis, Gogel, Verbyla, and Thompson (1998) presented a spatial mixed model approach for the analysis of MET data. In this paper we extend the analysis to include multiplicative models for the variety effects in each environment. The multiplicative model corresponds to that used in the multivariate technique of factor analysis. It allows a separate genetic variance for each environment and provides a parsimonious and interpretable model for the genetic covariances between environments. The model can be regarded as a random effects analogue of AMMI (additive main effects and multiplicative interactions). We illustrate the method using a large set of MET data from a South Australian barley breeding program.
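
The sketch below illustrates the form of a factor-analytic genetic covariance matrix of the kind described (environment loadings times their transpose plus environment-specific variances); the dimensions and numbers are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n_env, k = 8, 2                               # environments, multiplicative terms
Lam = rng.normal(size=(n_env, k))             # environment loadings
Psi = np.diag(rng.uniform(0.1, 0.5, n_env))   # environment-specific variances

# Factor-analytic genetic covariance between environments:
# a separate genetic variance per environment (diagonal of G), with the
# between-environment covariances modeled parsimoniously through Lam.
G = Lam @ Lam.T + Psi
print(np.diag(G))
```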

448 citations


Journal ArticleDOI
TL;DR: A Bayesian approach is used to draw inferences about the disease prevalence and test properties while adjusting for the possibility of conditional dependence between tests, particularly when results are available from fewer than the four tests required for classical identifiability.
Abstract: Many analyses of results from multiple diagnostic tests assume the tests are statistically independent conditional on the true disease status of the subject. This assumption may be violated in practice, especially in situations where none of the tests is a perfectly accurate gold standard. Classical inference for models accounting for the conditional dependence between tests requires that results from at least four different tests be used in order to obtain an identifiable solution, but it is not always feasible to have results from this many tests. We use a Bayesian approach to draw inferences about the disease prevalence and test properties while adjusting for the possibility of conditional dependence between tests, particularly when we have only two tests. We propose both fixed and random effects models. Since with fewer than four tests the problem is nonidentifiable, the posterior distributions are strongly dependent on the prior information about the test properties and the disease prevalence, even with large sample sizes. If the degree of correlation between the tests is known a priori with high precision, then our methods adjust for the dependence between the tests. Otherwise, our methods provide adjusted inferences that incorporate all of the uncertainty inherent in the problem, typically resulting in wider interval estimates. We illustrate our methods using data from a study on the prevalence of Strongyloides infection among Cambodian refugees to Canada.
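
The sketch below writes out one common parameterization of two conditionally dependent tests, in which covariance terms among diseased and non-diseased subjects are added to the conditional-independence cell probabilities; the parameter values are illustrative, and with only two tests such a model is not identifiable without informative priors.

```python
import numpy as np

def cell_probs(prev, se1, se2, sp1, sp2, cov_pos, cov_neg):
    """Joint probabilities of two binary test results under conditional
    dependence, with covariances among diseased (cov_pos) and non-diseased
    (cov_neg) subjects added to the conditional-independence terms."""
    p = np.array([
        prev * (se1 * se2 + cov_pos) + (1 - prev) * ((1 - sp1) * (1 - sp2) + cov_neg),  # + +
        prev * (se1 * (1 - se2) - cov_pos) + (1 - prev) * ((1 - sp1) * sp2 - cov_neg),  # + -
        prev * ((1 - se1) * se2 - cov_pos) + (1 - prev) * (sp1 * (1 - sp2) - cov_neg),  # - +
        prev * ((1 - se1) * (1 - se2) + cov_pos) + (1 - prev) * (sp1 * sp2 + cov_neg),  # - -
    ])
    assert np.isclose(p.sum(), 1.0)
    return p

# With two tests there are more parameters than observable degrees of freedom,
# so a Bayesian analysis relies on prior information for identification.
print(cell_probs(0.4, 0.9, 0.8, 0.95, 0.9, 0.02, 0.01))
```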

434 citations


Journal ArticleDOI
TL;DR: A method of analyzing collections of related curves is proposed in which the individual curves are modeled as spline functions with random coefficients; this produces a low-rank, low-frequency approximation to the covariance structure that can be estimated naturally by the EM algorithm.
Abstract: We propose a method of analyzing collections of related curves in which the individual curves are modeled as spline functions with random coefficients. The method is applicable when the individual curves are sampled at variable and irregularly spaced points. This produces a low-rank, low-frequency approximation to the covariance structure, which can be estimated naturally by the EM algorithm. Smooth curves for individual trajectories are constructed as best linear unbiased predictor (BLUP) estimates, combining data from that individual and the entire collection. This framework leads naturally to methods for examining the effects of covariates on the shapes of the curves. We use model selection techniques--Akaike information criterion (AIC), Bayesian information criterion (BIC), and cross-validation--to select the number of breakpoints for the spline approximation. We believe that the methodology we propose provides a simple, flexible, and computationally efficient means of functional data analysis.

402 citations


Journal ArticleDOI
TL;DR: A score test for testing zero-inflated Poisson regression models against zero-inflated negative binomial alternatives is provided.
Abstract: Count data often show a higher incidence of zero counts than would be expected if the data were Poisson distributed. Zero-inflated Poisson regression models are a useful class of models for such data, but parameter estimates may be seriously biased if the nonzero counts are overdispersed in relation to the Poisson distribution. We therefore provide a score test for testing zero-inflated Poisson regression models against zero-inflated negative binomial alternatives.

382 citations


Journal ArticleDOI
TL;DR: A general method is presented that integrates the concept of adaptive interim analyses into classical group sequential testing, allowing the researcher to represent every group sequential plan as an adaptive trial design and to make design changes after every interim analysis during the course of the trial.
Abstract: A general method is presented integrating the concept of adaptive interim analyses into classical group sequential testing. This allows the researcher to represent every group sequential plan as an adaptive trial design and to make design changes during the course of the trial after every interim analysis in the same way as with adaptive designs. The concept of adaptive trial designing is thereby generalized to a large variety of possible sequential plans.
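
One way to see how a group sequential plan can be treated adaptively is through the conditional rejection probability: at an interim analysis one computes the probability, under the null hypothesis, that the remainder of the original plan rejects, and any redesign whose conditional Type I error stays below that value preserves the overall level. The sketch below computes this quantity for a simple two-stage plan under a normal model; the boundary value and information fraction are illustrative.

```python
import numpy as np
from scipy.stats import norm

def crp_two_stage(z1, c2, t):
    """Conditional probability, under H0, that the second stage of a
    two-stage group sequential plan rejects, given the interim statistic z1.
    t is the information fraction at the interim analysis, and under H0
    Z2 = sqrt(t) * Z1 + sqrt(1 - t) * Z* with Z* standard normal."""
    return 1.0 - norm.cdf((c2 - np.sqrt(t) * z1) / np.sqrt(1.0 - t))

# A redesign of the remainder of the trial (e.g., a new second-stage sample
# size) is permissible as long as its conditional Type I error does not
# exceed this value.
print(crp_two_stage(z1=1.2, c2=2.0, t=0.5))
```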

331 citations


Journal ArticleDOI
TL;DR: The solution is an adaptation of a procedure by Firth originally developed to reduce the bias of maximum likelihood estimates, which produces finite parameter estimates by means of penalized maximum likelihood estimation and is exemplified in the analysis of a breast cancer study.
Abstract: The phenomenon of monotone likelihood is observed in the fitting process of a Cox model if the likelihood converges to a finite value while at least one parameter estimate diverges to +/- infinity. Monotone likelihood primarily occurs in small samples with substantial censoring of survival times and several highly predictive covariates. Previous options to deal with monotone likelihood have been unsatisfactory. The solution we suggest is an adaptation of a procedure by Firth (1993, Biometrika 80, 27-38) originally developed to reduce the bias of maximum likelihood estimates. This procedure produces finite parameter estimates by means of penalized maximum likelihood estimation. Corresponding Wald-type tests and confidence intervals are available, but it is shown that penalized likelihood ratio tests and profile penalized likelihood confidence intervals are often preferable. An empirical study of the suggested procedures confirms satisfactory performance of both estimation and inference. The advantage of the procedure over previous options of analysis is finally exemplified in the analysis of a breast cancer study.
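
The core device is easy to state: the log partial likelihood is penalized by half the log determinant of the information matrix, which keeps the maximizer finite. A minimal sketch, assuming user-supplied functions for the Cox log partial likelihood and its information matrix (the names are hypothetical):

```python
import numpy as np

def penalized_log_partial_lik(beta, log_partial_lik, information):
    """Firth-type penalized log partial likelihood for the Cox model:
    l*(beta) = l(beta) + 0.5 * log det I(beta).
    `log_partial_lik` and `information` return the Cox log partial
    likelihood and its information matrix evaluated at beta."""
    _, logdet = np.linalg.slogdet(information(beta))
    return log_partial_lik(beta) + 0.5 * logdet

# Maximizing l* instead of l keeps the estimates finite even when the
# unpenalized partial likelihood is monotone in some coefficient.
```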

295 citations


Journal ArticleDOI
TL;DR: It is demonstrated that standard information criteria may be used to choose the tuning parameter and detect departures from normality, and the approach is illustrated via simulation and using longitudinal data from the Framingham study.
Abstract: Normality of random effects is a routine assumption for the linear mixed model, but it may be unrealistic, obscuring important features of among-individual variation. We relax this assumption by approximating the random effects density by the seminonparameteric (SNP) representation of Gallant and Nychka (1987, Econometrics 55, 363-390), which includes normality as a special case and provides flexibility in capturing a broad range of nonnormal behavior, controlled by a user-chosen tuning parameter. An advantage is that the marginal likelihood may be expressed in closed form, so inference may be carried out using standard optimization techniques. We demonstrate that standard information criteria may be used to choose the tuning parameter and detect departures from normality, and we illustrate the approach via simulation and using longitudinal data from the Framingham study.
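
The SNP representation approximates a density by a squared polynomial multiplied by a normal density and renormalized, with the polynomial degree acting as the tuning parameter (degree zero recovers normality). A one-dimensional sketch with illustrative coefficients:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def snp_density(z, coefs):
    """One-dimensional seminonparametric (SNP) density: a squared polynomial
    times a standard normal density, normalized to integrate to one.
    coefs are polynomial coefficients (lowest degree first); a single
    constant coefficient recovers the normal density itself."""
    poly = np.polynomial.polynomial.polyval(z, coefs)
    unnorm = poly**2 * norm.pdf(z)
    const, _ = quad(
        lambda u: np.polynomial.polynomial.polyval(u, coefs)**2 * norm.pdf(u),
        -np.inf, np.inf)
    return unnorm / const

# A degree-one polynomial with a nonzero linear coefficient already produces
# a skewed, potentially bimodal shape.
z = np.linspace(-4, 4, 9)
print(snp_density(z, coefs=[1.0, 0.6]))
```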

278 citations


Journal ArticleDOI
TL;DR: Two general shrinkage approaches to estimating the covariance matrix and regression coefficients are considered: the first involves shrinking the eigenvalues of the unstructured ML or REML estimator, and the second involves shrinking an unstructured estimator toward a structured estimator.
Abstract: Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
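
A simplified sketch of the two kinds of shrinkage described, with the shrinkage weights fixed by hand rather than estimated from the data as in the paper:

```python
import numpy as np

def shrink_toward_structure(S_unstruct, S_struct, lam):
    """Linear shrinkage of an unstructured covariance estimate toward a
    structured target (e.g., compound symmetry); lam in [0, 1] would be
    determined by the data in the authors' approach."""
    return lam * S_struct + (1.0 - lam) * S_unstruct

def shrink_eigenvalues(S, alpha):
    """Shrink the eigenvalues of S toward their mean by a factor alpha in
    [0, 1], pulling down the largest and pushing up the smallest."""
    vals, vecs = np.linalg.eigh(S)
    shrunk = (1.0 - alpha) * vals + alpha * vals.mean()
    return (vecs * shrunk) @ vecs.T

# The final estimator discussed in the paper combines both ideas:
# shrink the eigenvalues, then shrink toward a structured target.
```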

264 citations


Journal ArticleDOI
TL;DR: It is found that the sequential procedure generally results in fewer treatment failures than the other procedures, particularly when the success probabilities of treatments are smaller.
Abstract: We derive the optimal allocation between two treatments in a clinical trial based on the following optimality criterion: for fixed variance of the test statistic, what allocation minimizes the expected number of treatment failures? A sequential design is described that leads asymptotically to the optimal allocation and is compared with the randomized play-the-winner rule, sequential Neyman allocation, and equal allocation at similar power levels. We find that the sequential procedure generally results in fewer treatment failures than the other procedures, particularly when the success probabilities of treatments are smaller.
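
Under the stated criterion, the optimal proportion of patients assigned to treatment A is proportional to the square roots of the success probabilities. A minimal sketch of that target allocation (in practice a sequential design substitutes current estimates of the success probabilities as the trial proceeds):

```python
import numpy as np

def optimal_allocation(p_a, p_b):
    """Target proportion assigned to treatment A that minimizes the expected
    number of failures for a fixed variance of the estimated difference in
    success probabilities: sqrt(p_A) / (sqrt(p_A) + sqrt(p_B))."""
    return np.sqrt(p_a) / (np.sqrt(p_a) + np.sqrt(p_b))

# With p_A = 0.7 and p_B = 0.4, roughly 57% of patients would go to A.
print(optimal_allocation(0.7, 0.4))
```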

Journal ArticleDOI
TL;DR: This work offers some practical modifications to robust Wald-type tests from estimating equations that are sums of K independent or approximately independent terms, and examines by simulation the modifications applied to the generalized estimating equations of Liang and Zeger (1986), conditional logistic regression, and the Cox proportional hazards model.
Abstract: The sandwich estimator of variance may be used to create robust Wald-type tests from estimating equations that are sums of K independent or approximately independent terms. For example, for repeated measures data on K individuals, each term relates to a different individual. These tests applied to a parameter may have greater than nominal size if K is small, or more generally if the parameter to be tested is essentially estimated from a small number of terms in the estimating equation. We offer some practical modifications to these robust Wald-type tests, which asymptotically approach the usual robust Wald-type tests. We show that one of these modifications provides exact coverage for a simple case and examine by simulation the modifications applied to the generalized estimating equations of Liang and Zeger (1986), conditional logistic regression, and the Cox proportional hazard model.

Journal ArticleDOI
TL;DR: The influence of perturbing a missing-at-random dropout model in the direction of nonrandom dropout is explored, and the method is applied to data from a randomized experiment on the inhibition of testosterone production in rats.
Abstract: Diggle and Kenward (1994, Applied Statistics 43, 49-93) proposed a selection model for continuous longitudinal data subject to nonrandom dropout. It has provoked a large debate about the role for such models. The original enthusiasm was followed by skepticism about the strong but untestable assumptions on which this type of model invariably rests. Since then, the view has emerged that these models should ideally be made part of a sensitivity analysis. This paper presents a formal and flexible approach to such a sensitivity assessment based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133-169). The influence of perturbing a missing-at-random dropout model in the direction of nonrandom dropout is explored. The method is applied to data from a randomized experiment on the inhibition of testosterone production in rats.

Journal ArticleDOI
TL;DR: This work presents a model for estimating availability for detection that relaxes two assumptions required in previous approaches, and applies it to estimate survival and breeding probability in a study of hawksbill sea turtles, where previous approaches are not appropriate.
Abstract: Capture-recapture studies are crucial in many circumstances for estimating demographic parameters for wildlife and fish populations. Pollock's robust design, involving multiple sampling occasions per period of interest, provides several advantages over classical approaches. This includes the ability to estimate the probability of being present and available for detection, which in some situations is equivalent to breeding probability. We present a model for estimating availability for detection that relaxes two assumptions required in previous approaches. The first is that the sampled population is closed to additions and deletions across samples within a period of interest. The second is that each member of the population has the same probability of being available for detection in a given period. We apply our model to estimate survival and breeding probability in a study of hawksbill sea turtles (Eretmochelys imbricata), where previous approaches are not appropriate.

Journal ArticleDOI
TL;DR: In this article, the authors show that less conservative behavior results from inverting a single two-sided test than from inverting two separate one-sided tests of half the nominal level each.
Abstract: Summary. The traditional definition of a confidence interval requires the coverage probability at any value of the parameter to be at least the nominal confidence level. In constructing such intervals for parameters in discrete distributions, less conservative behavior results from inverting a single two-sided test than inverting two separate one-sided tests of half the nominal level each. We illustrate for a variety of discrete problems, including interval estimation of a binomial parameter, the difference and the ratio of two binomial parameters for independent samples, and the odds ratio.
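
A sketch of the inversion idea for a single binomial proportion, using one possible exact two-sided test (the total probability of outcomes no more probable than the one observed); inverting this single two-sided test typically gives a shorter interval than the Clopper-Pearson interval, which inverts two one-sided tests at half the nominal level each. The test choice and grid resolution are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import binom

def twosided_pvalue(k, n, p0):
    """Exact two-sided p-value: total probability, under p0, of outcomes
    no more probable than the observed count."""
    probs = binom.pmf(np.arange(n + 1), n, p0)
    return probs[probs <= probs[k] + 1e-12].sum()

def ci_invert_twosided(k, n, alpha=0.05, grid=2000):
    """Confidence limits obtained by inverting the single two-sided test:
    keep every p0 whose two-sided p-value is at least alpha."""
    ps = np.linspace(1e-6, 1 - 1e-6, grid)
    keep = [p for p in ps if twosided_pvalue(k, n, p) >= alpha]
    return min(keep), max(keep)

print(ci_invert_twosided(k=3, n=20))
```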

Journal ArticleDOI
TL;DR: A Bayesian nonlinear approach for the analysis of spatial count data is presented that allows probability statements to be made about the incidence rates around point sources without any parametric assumptions about the nature of the influence between the sources and the surrounding location.
Abstract: Summary. This paper presents a Bayesian nonlinear approach for the analysis of spatial count data. It extends the Bayesian partition methodology of Holmes, Denison, and Mallick (1999, Bayesian partitioning for classification and regression, Technical Report, Imperial College, London) to handle data that involve counts. A demonstration involving incidence rates of leukemia in New York state is used to highlight the methodology. The model allows us to make probability statements on the incidence rates around point sources without making any parametric assumptions about the nature of the influence between the sources and the surrounding location.

Journal ArticleDOI
TL;DR: Estimators for the difference of the restricted mean lifetime between two groups that account for treatment imbalances in prognostic factors assuming a proportional hazards relationship are proposed and large-sample properties of these estimators based on martingale theory for counting processes are derived.
Abstract: Summary. When comparing survival times between two treatment groups, it may be more appropriate to compare the restricted mean lifetime, i.e., the expectation of lifetime restricted to a time L, rather than mean lifetime in order to accommodate censoring. When the treatments are not assigned to patients randomly, as in observational studies, we also need to account for treatment imbalances in confounding factors. In this article, we propose estimators for the difference of the restricted mean lifetime between two groups that account for treatment imbalances in prognostic factors assuming a proportional hazards relationship. Large-sample properties of our estimators based on martingale theory for counting processes are also derived. Simulation studies were conducted to compare these estimators and to assess the adequacy of the large-sample approximations. Our methods are also applied to an observational database of acute coronary syndrome patients from Duke University Medical Center to estimate the treatment effect on the restricted mean lifetime over 5 years.
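
The quantity being compared is the area under the survival curve up to the truncation time L. The sketch below computes that area for a survival step function; the paper's contribution, covariate adjustment of this quantity under a proportional hazards model, is not attempted here.

```python
import numpy as np

def restricted_mean(times, surv, L):
    """Restricted mean lifetime: the area under a right-continuous survival
    step function from 0 up to L.  `times` are the sorted jump times and
    `surv` the survival probabilities just after each jump."""
    t = np.concatenate(([0.0], times[times <= L], [L]))
    s = np.concatenate(([1.0], surv[times <= L], [np.nan]))  # last value unused
    return float(np.sum(np.diff(t) * s[:-1]))

# Illustrative step function: S drops to 0.8 at t=1 and to 0.5 at t=3.
times = np.array([1.0, 3.0])
surv = np.array([0.8, 0.5])
print(restricted_mean(times, surv, L=5.0))  # 1*1.0 + 2*0.8 + 2*0.5 = 3.6
```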

Journal ArticleDOI
TL;DR: A class of simple designs is presented for early dose-finding studies in HIV; in contrast with Phase I designs in cancer, these designs have much of the flavor of Phase II trials.
Abstract: We present a class of simple designs that can be used in early dose-finding studies in HIV. Such designs, in contrast with Phase I designs in cancer, have a lot of the Phase II flavor about them. Information on efficacy is obtained during the trial and is as important as that relating to toxicity. The designs proposed here sequentially incorporate the information obtained on viral reduction. Initial doses are given from some fixed range of dose regimens. The doses are ordered in terms of their toxic potential. At any dose, a patient can have one of three outcomes: inability to take the treatment (toxicity), ability to take the treatment but insufficient reduction in viral load (viral failure), and ability to take the treatment as well as a sufficient reduction of viral load (success). A clear goal for some class of designs would be the identification of the dose leading to the greatest percentage of successes. Under certain assumptions, which we identify and discuss, we can obtain efficient designs for this task. Under weaker, sometimes more realistic assumptions, we can still obtain designs that have good operating characteristics in identifying a level, if such a level exists, having some given or greater success rate. In the absence of such a level, the designs will come to an early closure, indicating the ineffectiveness of the new treatment.

Journal ArticleDOI
TL;DR: The aim of this article is to present hierarchical Bayesian approaches that allow one to simultaneously incorporate temporal and spatial dependencies between pixels directly in the model formulation.
Abstract: Mapping of the human brain by means of functional magnetic resonance imaging (fMRI) is an emerging field in cognitive and clinical neuroscience. Current techniques to detect activated areas of the brain mostly proceed in two steps. First, conventional methods of correlation, regression, and time series analysis are used to assess activation by a separate, pixelwise comparison of the fMRI signal time courses to the reference function of a presented stimulus. Spatial aspects caused by correlations between neighboring pixels are considered in a separate second step, if at all. The aim of this article is to present hierarchical Bayesian approaches that allow one to simultaneously incorporate temporal and spatial dependencies between pixels directly in the model formulation. For reasons of computational feasibility, models have to be comparatively parsimonious, without oversimplifying. We introduce parametric and semiparametric spatial and spatiotemporal models that proved appropriate and illustrate their performance applied to visual fMRI data.

Journal ArticleDOI
TL;DR: A new dissimilarity measure based on Kullback–Leibler discrepancy between frequencies of all n‐words in the two sequences is introduced and can significantly enhance the current technology in comparing large datasets of DNA sequences.
Abstract: In molecular biology, the issue of quantifying the similarity between two biological sequences is very important. Past research has shown that word-based search tools are computationally efficient and can find some new functional similarities or dissimilarities invisible to other algorithms like FASTA. Recently, under the independent model of base composition, Wu, Burke, and Davison (1997, Biometrics 53, 1431-1439) characterized a family of word-based dissimilarity measures that defined distance between two sequences by simultaneously comparing the frequencies of all subsequences of n adjacent letters (i.e., n-words) in the two sequences. Specifically, they introduced the use of Mahalanobis distance and standardized Euclidean distance into the study of DNA sequence dissimilarity. They showed that both distances had better sensitivity and selectivity than the commonly used Euclidean distance. The purpose of this article is to extend Mahalanobis and standardized Euclidean distances to Markov chain models of base composition. In addition, a new dissimilarity measure based on Kullback-Leibler discrepancy between frequencies of all n-words in the two sequences is introduced. Applications to real data demonstrate that Kullback-Leibler discrepancy gives a better performance than Euclidean distance. Moreover, under a Markov chain model of order kQ for base composition, where kQ is the estimated order based on the query sequence, standardized Euclidean distance performs very well. Under such a model, it performs as well as Mahalanobis distance and better than Kullback-Leibler discrepancy and Euclidean distance. Since standardized Euclidean distance is drastically faster to compute than Mahalanobis distance, in a usual workstation/PC computing environment, the use of standardized Euclidean distance under the Markov chain model of order kQ of base composition is generally recommended. However, if the user is very concerned with computational efficiency, then the use of Kullback-Leibler discrepancy, which can be computed as fast as Euclidean distance, is recommended. This can significantly enhance the current technology in comparing large datasets of DNA sequences.
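
A minimal sketch of the word-frequency machinery with a Kullback-Leibler-type discrepancy: overlapping n-words are counted in each sequence, converted to relative frequencies (with a small pseudocount so the logarithms stay finite), and compared. The symmetrized form and the pseudocount are implementation choices made here for illustration, not details taken from the paper.

```python
import numpy as np
from itertools import product

def nword_freqs(seq, n, alphabet="ACGT"):
    """Relative frequencies of all overlapping n-words in a DNA sequence,
    with a small pseudocount so the discrepancy below stays finite."""
    words = ["".join(w) for w in product(alphabet, repeat=n)]
    counts = {w: 1e-6 for w in words}   # pseudocount
    for i in range(len(seq) - n + 1):
        counts[seq[i:i + n]] += 1
    total = sum(counts.values())
    return np.array([counts[w] / total for w in words])

def kl_discrepancy(seq1, seq2, n=2):
    """Symmetrized Kullback-Leibler discrepancy between the n-word
    frequency vectors of two sequences."""
    f, g = nword_freqs(seq1, n), nword_freqs(seq2, n)
    return float(np.sum(f * np.log(f / g)) + np.sum(g * np.log(g / f)))

print(kl_discrepancy("ACGTACGTGGCA", "ACGTTTTTACGA", n=2))
```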

Journal ArticleDOI
TL;DR: Three statistical models are developed for multiply imputing the missing values of airborne particulate matter and it is expected that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
Abstract: Summary. Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.

Journal ArticleDOI
TL;DR: This work studies whether the use of multiple markers can improve inferences about a treatment's effects on a clinical endpoint and proposes two complementary measures of the relative benefit of multiple surrogates as opposed to a single one.
Abstract: Surrogate endpoints are desirable because they typically result in smaller, faster efficacy studies compared with the ones using the clinical endpoints. Research on surrogate endpoints has received substantial attention lately, but most investigations have focused on the validity of using a single biomarker as a surrogate. Our paper studies whether the use of multiple markers can improve inferences about a treatment's effects on a clinical endpoint. We propose a joint model for a time to clinical event and for repeated measures over time on multiple biomarkers that are potential surrogates. This model extends the formulation of Xu and Zeger (2001, in press) and Fawcett and Thomas (1996, Statistics in Medicine 15, 1663-1685). We propose two complementary measures of the relative benefit of multiple surrogates as opposed to a single one. Markov chain Monte Carlo is implemented to estimate model parameters. The methodology is illustrated with an analysis of data from a schizophrenia clinical trial.

Journal ArticleDOI
TL;DR: A method is proposed to estimate the regression coefficients in a competing risks model in which the cause-specific hazard for the cause of interest is related to covariates through a proportional hazards relationship and the cause of failure is missing for some individuals.
Abstract: We propose a method to estimate the regression coefficients in a competing risks model where the cause-specific hazard for the cause of interest is related to covariates through a proportional hazards relationship and when cause of failure is missing for some individuals. We use multiple imputation procedures to impute missing cause of failure, where the probability that a missing cause is the cause of interest may depend on auxiliary covariates, and combine the maximum partial likelihood estimators computed from several imputed data sets into an estimator that is consistent and asymptotically normal. A consistent estimator for the asymptotic variance is also derived. Simulation results suggest the relevance of the theory in finite samples. Results are also illustrated with data from a breast cancer study.
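
For flavor, the sketch below applies the generic multiple-imputation combining rules to coefficient estimates from several imputed data sets; the paper derives its own consistent variance estimator for this setting, so the simple rule shown here is only illustrative.

```python
import numpy as np

def combine_imputations(estimates, variances):
    """Generic multiple-imputation combining rules for a scalar coefficient:
    pool point estimates and variances from M imputed data sets."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    m = len(estimates)
    point = estimates.mean()
    within = variances.mean()
    between = estimates.var(ddof=1)
    total_var = within + (1.0 + 1.0 / m) * between
    return point, total_var

# e.g., log hazard ratios and their estimated variances from 5 imputations
# of the missing causes of failure (made-up numbers):
print(combine_imputations([0.42, 0.47, 0.40, 0.45, 0.44],
                          [0.010, 0.011, 0.010, 0.012, 0.011]))
```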

Journal ArticleDOI
TL;DR: Reducing dimensionality by specifying semiparametric cause-specific selection models is proposed; these models are useful for conducting a sensitivity analysis to examine how inference for the treatment-arm mean difference changes as one varies the magnitude of the cause-specific selection bias over a plausible range.
Abstract: We consider inference for the treatment-arm mean difference of an outcome that would have been measured at the end of a randomized follow-up study if, during the course of the study, patients had not initiated a nonrandomized therapy or dropped out. We argue that the treatment-arm mean difference is not identified unless unverifiable assumptions are made. We describe identifying assumptions that are tantamount to postulating relationships between the components of a pattern-mixture model but that can also be interpreted as imposing restrictions on the cause-specific censoring probabilities of a selection model. We then argue that, although sufficient for identification, these assumptions are insufficient for inference due to the curse of dimensionality. We propose reducing dimensionality by specifying semiparametric cause-specific selection models. These models are useful for conducting a sensitivity analysis to examine how inference for the treatment-arm mean difference changes as one varies the magnitude of the cause-specific selection bias over a plausible range. We provide methodology for conducting such sensitivity analysis and illustrate our methods with an analysis of data from the AIDS Clinical Trial Group (ACTG) study 002.

Journal ArticleDOI
TL;DR: It is found that the benchmark calculations are highly dependent on the choice of the dose‐effect function and the definition of the benchmark dose, and it is recommended that several sets of biologically relevant default settings be used to illustrate the effect on the benchmark results.
Abstract: A threshold for dose-dependent toxicity is crucial for standards setting but may not be possible to specify from empirical studies. Crump (1984) instead proposed calculating the lower statistical confidence bound of the benchmark dose, which he defined as the dose that causes a small excess risk. This concept has several advantages and has been adopted by regulatory agencies for establishing safe exposure limits for toxic substances such as mercury. We have examined the validity of this method as applied to an epidemiological study of continuous response data associated with mercury exposure. For models that are linear in the parameters, we derived an approximative expression for the lower confidence bound of the benchmark dose. We find that the benchmark calculations are highly dependent on the choice of the dose-effect function and the definition of the benchmark dose. We therefore recommend that several sets of biologically relevant default settings be used to illustrate the effect on the benchmark results and to stimulate research that will guide an a priori choice of proper default settings.

Journal ArticleDOI
TL;DR: Generalized additive mixed models are proposed for the analysis of geographic and temporal variability of mortality rates and for the production of a series of smoothed maps from which spatial patterns of mortality risk can be monitored over time.
Abstract: This article proposes generalized additive mixed models for the analysis of geographic and temporal variability of mortality rates. This class of models accommodates random spatial effects and fixed and random temporal components. Spatiotemporal models that use autoregressive local smoothing across the spatial dimension and B‐spline smoothing over the temporal dimension are developed. The objective is the identification of temporal trends and the production of a series of smoothed maps from which spatial patterns of mortality risks can be monitored over time. Regions with consistently high rate estimates may be followed for further investigation. The methodology is illustrated by analysis of British Columbia infant mortality data.

Journal ArticleDOI
TL;DR: In this proposal, a two-stage adaptive design consists of a main stage and an extension stage: the main stage has sufficient power to reject the null hypothesis under the anticipated effect size, and the extension stage allows the sample size to be increased if the true effect size is smaller than anticipated.
Abstract: Proschan and Hunsberger (1995, Biometrics 51, 1315-1324) proposed a two-stage adaptive design that maintains the Type I error rate. For practical applications, a two-stage adaptive design is also required to achieve a desired statistical power while limiting the maximum overall sample size. In our proposal, a two-stage adaptive design is comprised of a main stage and an extension stage, where the main stage has sufficient power to reject the null under the anticipated effect size and the extension stage allows increasing the sample size in case the true effect size is smaller than anticipated. For statistical inference, methods for obtaining the overall adjusted p-value, point estimate and confidence intervals are developed. An exact two-stage test procedure is also outlined for robust inference.

Journal ArticleDOI
TL;DR: A semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution is proposed and it is shown that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates.
Abstract: We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates. Several novel properties of the proposed model are derived. In addition, we propose a class of improper noninformative priors based on this model and examine the properties of the implied posterior. Also, a class of informative priors based on historical data is proposed and its theoretical properties are investigated. A case study involving a melanoma clinical trial is discussed in detail to demonstrate the proposed methodology.

Journal ArticleDOI
TL;DR: This work shows that the degradation of odds-ratio estimates caused by even small discrepancies between the guessed and actual classification probabilities is mitigated by a Bayes analysis that incorporates uncertainty about the classification probabilities as prior information.
Abstract: Summary. Consider case-control analysis with a dichotomous exposure variable that is subject to misclassification. If the classification probabilities are known, then methods are available to adjust odds-ratio estimates in light of the misclassification. We study the realistic scenario where reasonable guesses, but not exact values, are available for the classification probabilities. If the analysis proceeds by simply treating the guesses as exact, then even small discrepancies between the guesses and the actual probabilities can seriously degrade odds-ratio estimates. We show that this problem is mitigated by a Bayes analysis that incorporates uncertainty about the classification probabilities as prior information.
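
The fragility being addressed can be seen in the simple plug-in correction that treats guessed sensitivity and specificity as exact: observed exposure proportions are back-corrected and an odds ratio is formed from the corrected values. The sketch below shows how the corrected odds ratio moves when the assumed specificity is nudged; the Bayes analysis in the paper instead places prior distributions on the classification probabilities.

```python
def corrected_odds_ratio(q_case, q_control, se, sp):
    """Back-correct observed exposure proportions in cases and controls using
    assumed sensitivity (se) and specificity (sp) of exposure classification,
    then form the odds ratio from the corrected proportions."""
    def correct(q):
        return (q + sp - 1.0) / (se + sp - 1.0)
    p1, p0 = correct(q_case), correct(q_control)
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

# Treating guessed classification probabilities as exact can be fragile:
# nudging the assumed specificity shifts the corrected odds ratio even
# though the observed data are unchanged.
print(corrected_odds_ratio(0.30, 0.20, se=0.85, sp=0.95))
print(corrected_odds_ratio(0.30, 0.20, se=0.85, sp=0.93))
```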

Journal ArticleDOI
TL;DR: In this paper, Markov chain Monte Carlo (MCMC) algorithms based on the approach of Albert and Chib (1993, Journal of the American Statistical Association 88, 669-679) are developed for the fitting of these models.
Abstract: Summary. This paper considers the class of sequential ordinal models in relation to other models for ordinal response data. Markov chain Monte Carlo (MCMC) algorithms, based on the approach of Albert and Chib (1993, Journal of the American Statistical Association 88, 669–679), are developed for the fitting of these models. The ideas and methods are illustrated in detail with a real data example on the length of hospital stay for patients undergoing heart surgery. A notable aspect of this analysis is the comparison, based on marginal likelihoods and training sample priors, of several nonnested models, such as the sequential model, the cumulative ordinal model, and Weibull and log-logistic models.