
Showing papers in "Lifetime Data Analysis in 2018"


Journal ArticleDOI
TL;DR: A new conditional screening method for survival outcome data is proposed that computes the marginal contribution of each biomarker given a priori known biological information, based on the premise that some biomarkers are known to be associated with disease outcomes a priori.
Abstract: Identifying important biomarkers that are predictive for cancer patients’ prognosis is key to gaining better insights into the biological influences on the disease and has become a critical component of precision medicine. The emergence of large-scale biomedical survival studies, which typically involve an excessive number of biomarkers, has created high demand for efficient screening tools for selecting predictive biomarkers. The vast number of biomarkers defies any existing variable selection method via regularization. The recently developed variable screening methods, though powerful in many practical settings, fail to incorporate prior information on the importance of each biomarker and are less powerful in detecting marginally weak but jointly important signals. We propose a new conditional screening method for survival outcome data that computes the marginal contribution of each biomarker given a priori known biological information. This is based on the premise that some biomarkers are known to be associated with disease outcomes a priori. Our method possesses the sure screening property and a vanishing false selection rate. The utility of the proposal is further confirmed with extensive simulation studies and an analysis of a diffuse large B-cell lymphoma dataset. We are pleased to dedicate this work to Jack Kalbfleisch, who has made instrumental contributions to the development of modern methods for analyzing survival data.

37 citations


Journal ArticleDOI
TL;DR: The proposed generalized estimating equation methods to model RMST as a function of baseline covariates avoid potentially problematic distributional assumptions pertaining to restricted survival time and allow censoring to depend on both baseline and time-dependent factors.
Abstract: Restricted mean survival time (RMST) is often of great clinical interest in practice. Several existing methods involve explicitly projecting out patient-specific survival curves using parameters estimated through Cox regression. However, it would often be preferable to directly model the restricted mean for convenience and to yield more directly interpretable covariate effects. We propose generalized estimating equation methods to model RMST as a function of baseline covariates. The proposed methods avoid potentially problematic distributional assumptions pertaining to restricted survival time. Unlike existing methods, we allow censoring to depend on both baseline and time-dependent factors. Large sample properties of the proposed estimators are derived and simulation studies are conducted to assess their finite sample performance. We apply the proposed methods to model RMST in the absence of liver transplantation among end-stage liver disease patients. This analysis requires accommodation for dependent censoring since pre-transplant mortality is dependently censored by the receipt of a liver transplant.

24 citations
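As background for readers unfamiliar with RMST: nonparametrically, the RMST up to a horizon tau is the area under the Kaplan-Meier curve on [0, tau]. The sketch below illustrates that definition only; it is not the paper's GEE regression method, and the function names are ours.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at the distinct event times.
    `events` is 1 for an observed failure, 0 for right-censoring."""
    times, events = np.asarray(times, float), np.asarray(events)
    event_times = np.unique(times[events == 1])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)
        d = np.sum((times == t) & (events == 1))
        s *= 1.0 - d / at_risk
        surv.append(s)
    return event_times, np.array(surv)

def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM curve on [0, tau]."""
    t, s = kaplan_meier(times, events)
    grid = np.concatenate(([0.0], t[t < tau], [tau]))   # step boundaries
    vals = np.concatenate(([1.0], s[t < tau]))          # S(t) on each step
    return float(np.sum(vals * np.diff(grid)))
```

A quick sanity check: with no censoring and tau beyond the last event time, the RMST reduces to the sample mean of the failure times.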


Journal ArticleDOI
TL;DR: This paper considers situations where certain covariates are expensive to measure, so they are obtained only for selected individuals in a cohort, and focuses on testing hypotheses of no association between failure time and expensive covariates.
Abstract: Two- or multi-phase study designs are often used in settings involving failure times. In most studies, whether or not certain covariates are measured on an individual depends on their failure time and status. For example, when failures are rare, case-cohort or case-control designs are used to increase the number of failures relative to a random sample of the same size. Another scenario is where certain covariates are expensive to measure, so they are obtained only for selected individuals in a cohort. This paper considers such situations and focuses on cases where we wish to test hypotheses of no association between failure time and expensive covariates. Efficient score tests based on maximum likelihood are developed and shown to have a simple form for a wide class of models and sampling designs. Some numerical comparisons of study designs are presented.

21 citations


Journal ArticleDOI
TL;DR: This study shows that the exponentiated Weibull distribution is closed under the accelerated failure time family, formulates a regression model based on the exponentiated Weibull distribution, and develops large sample theory for statistical inference.
Abstract: The Weibull, log-logistic and log-normal distributions are extensively used to model time-to-event data. The Weibull family accommodates only monotone hazard rates, whereas the log-logistic and log-normal are widely used to model unimodal hazard functions. The increasing availability of lifetime data with a wide range of characteristics motivates us to develop more flexible models that accommodate both monotone and nonmonotone hazard functions. One such model is the exponentiated Weibull distribution which not only accommodates monotone hazard functions but also allows for unimodal and bathtub-shaped hazard rates. This distribution has demonstrated considerable potential in univariate analysis of time-to-event data. However, the primary focus of many studies is rather on understanding the relationship between the time to the occurrence of an event and one or more covariates. This leads to a consideration of regression models that can be formulated in different ways in survival analysis. One such strategy involves formulating models for the accelerated failure time family of distributions. The most commonly used distributions serving this purpose are the Weibull, log-logistic and log-normal distributions. In this study, we show that the exponentiated Weibull distribution is closed under the accelerated failure time family. We then formulate a regression model based on the exponentiated Weibull distribution, and develop large sample theory for statistical inference. We also describe a Bayesian approach for inference. Two comparative studies based on real and simulated data sets reveal that the exponentiated Weibull regression can be valuable in adequately describing different types of time-to-event data.

18 citations
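To make the hazard-shape claims concrete, one common parameterization of the exponentiated Weibull takes F(t) = (1 - exp(-(lam*t)^k))^alpha, reducing to the ordinary Weibull at alpha = 1. The sketch below (our parameter names, not necessarily the paper's) evaluates its hazard, whose shape is governed by k and alpha*k:

```python
import numpy as np

def ew_cdf(t, alpha, k, lam=1.0):
    """Exponentiated Weibull CDF: F(t) = (1 - exp(-(lam*t)**k))**alpha."""
    return (1.0 - np.exp(-(lam * t) ** k)) ** alpha

def ew_pdf(t, alpha, k, lam=1.0):
    """Density obtained by differentiating the CDF above."""
    u = np.exp(-(lam * t) ** k)
    return alpha * k * lam * (lam * t) ** (k - 1) * u * (1.0 - u) ** (alpha - 1)

def ew_hazard(t, alpha, k, lam=1.0):
    """Hazard rate h(t) = f(t) / (1 - F(t))."""
    return ew_pdf(t, alpha, k, lam) / (1.0 - ew_cdf(t, alpha, k, lam))
```

At alpha = k = 1 the hazard is constant (exponential); alpha = 1 with k > 1 gives a monotone increasing Weibull hazard; and, e.g., k < 1 with alpha*k > 1 yields a unimodal hazard, a shape the Weibull family alone cannot produce.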


Journal ArticleDOI
TL;DR: A new method is proposed to simulate uniform pairs with a PVF dependence structure, based on conditional sampling for copulas and on numerical approximation to solve a target equation; small sample properties of the Bayesian estimators are explored.
Abstract: Copula models have become increasingly popular for modelling the dependence structure in multivariate survival data. The two-parameter Archimedean family of Power Variance Function (PVF) copulas includes the Clayton, Positive Stable (Gumbel) and Inverse Gaussian copulas as special or limiting cases, and thus offers a unified approach to fitting these important copulas. Two-stage frequentist procedures for estimating the marginal distributions and the PVF copula have been suggested by Andersen (Lifetime Data Anal 11:333–350, 2005), Massonnet et al. (J Stat Plann Inference 139(11):3865–3877, 2009) and Prenen et al. (J R Stat Soc Ser B 79(2):483–505, 2017), which first estimate the marginal distributions and then, conditioning on these, estimate the PVF copula parameters in a second step. Here we explore a one-stage Bayesian approach that simultaneously estimates the marginal and the PVF copula parameters. For the marginal distributions, we consider both parametric and semiparametric models. We propose a new method to simulate uniform pairs with PVF dependence structure based on conditional sampling for copulas and on numerical approximation to solve a target equation. In a simulation study, small sample properties of the Bayesian estimators are explored. We illustrate the usefulness of the methodology using data on times to appendectomy for adult twins in the Australian NH&MRC Twin registry. Parameters of the marginal distributions and the PVF copula are simultaneously estimated in a parametric as well as a semiparametric approach where the marginal distributions are modelled using Weibull and piecewise exponential distributions, respectively.

18 citations
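The conditional-sampling idea behind the paper's simulation method is easiest to see in the Clayton special case, where the inverse conditional distribution has a closed form; for the general PVF copula the target equation must instead be solved numerically, which this sketch does not attempt:

```python
import numpy as np

def clayton_pairs(n, theta, seed=None):
    """Simulate n uniform pairs (U, V) with Clayton copula dependence via
    conditional sampling: draw U and W independently uniform, then set V
    to the inverse of the conditional CDF of V given U, evaluated at W."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    # closed-form inverse of the Clayton conditional distribution
    v = ((w ** (-theta / (theta + 1.0)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v
```

Larger theta gives stronger (lower-tail) dependence; for the Clayton copula, Kendall's tau equals theta / (theta + 2).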


Journal ArticleDOI
TL;DR: A censored cumulative residual independent screening method is proposed that is model-free, enjoys the sure independent screening property, is invariant to monotone transformations of the response, and requires substantially weaker moment conditions.
Abstract: For complete ultrahigh-dimensional data, sure independent screening methods can effectively reduce the dimensionality while retaining all the active variables with high probability. However, few screening methods have been developed for ultrahigh-dimensional survival data subject to censoring. We propose a censored cumulative residual independent screening method that is model-free and enjoys the sure independent screening property. Active variables tend to be ranked above the inactive ones in terms of their association with the survival times. Compared with several existing methods, our model-free screening method works well with general survival models, is invariant to monotone transformations of the response, and requires substantially weaker moment conditions. Numerical studies demonstrate the usefulness of the censored cumulative residual independent screening method, and the new approach is illustrated with a gene expression data set.

14 citations


Journal ArticleDOI
TL;DR: This work develops a method to conduct valid analysis when additional auxiliary variables are available for cases only, and uses an informative likelihood approach that will yield consistent estimates even when the underlying model for missing cause of failure is misspecified.
Abstract: In the analysis of time-to-event data with multiple causes using a competing risks Cox model, often the cause of failure is unknown for some of the cases. The probability of a missing cause is typically assumed to be independent of the cause given the time of the event and covariates measured before the event occurred. In practice, however, the underlying missing-at-random assumption does not necessarily hold. Motivated by colorectal cancer molecular pathological epidemiology analysis, we develop a method to conduct valid analysis when additional auxiliary variables are available for cases only. We consider a weaker missing-at-random assumption, with missing pattern depending on the observed quantities, which include the auxiliary covariates. We use an informative likelihood approach that will yield consistent estimates even when the underlying model for missing cause of failure is misspecified. The superiority of our method over naive methods in finite samples is demonstrated by simulation study results. We illustrate the use of our method in an analysis of colorectal cancer data from the Nurses' Health Study cohort, where, apparently, the traditional missing-at-random assumption fails to hold.

14 citations


Journal ArticleDOI
TL;DR: In this paper, a general technique, reverse alignment, is used for constructing statistical models for survival processes, termed revival models, which incorporates covariate and treatment effects into both the distribution of survival times and the joint distribution of health outcomes.
Abstract: Survival studies often generate not only a survival time for each patient but also a sequence of health measurements at annual or semi-annual check-ups while the patient remains alive. Such a sequence of random length accompanied by a survival time is called a survival process. Robust health is ordinarily associated with longer survival, so the two parts of a survival process cannot be assumed independent. This paper is concerned with a general technique—reverse alignment—for constructing statistical models for survival processes, here termed revival models. A revival model is a regression model in the sense that it incorporates covariate and treatment effects into both the distribution of survival times and the joint distribution of health outcomes. The revival model also determines a conditional survival distribution given the observed history, which describes how the subsequent survival distribution is determined by the observed progression of health outcomes.

12 citations


Journal ArticleDOI
TL;DR: This work relies on semi-parametric theory to derive an augmented inverse probability of censoring weighted (AIPCW) estimator and applies it to evaluate the safety and efficacy of three anti-HIV regimens in a randomized trial conducted by the AIDS Clinical Trial Group, ACTG A5095.
Abstract: Competing risks occur in a time-to-event analysis in which a patient can experience one of several types of events. Traditional methods for handling competing risks data presuppose one censoring process, which is assumed to be independent. In a controlled clinical trial, censoring can occur for several reasons: some independent, others dependent. We propose an estimator of the cumulative incidence function in the presence of both independent and dependent censoring mechanisms. We rely on semi-parametric theory to derive an augmented inverse probability of censoring weighted (AIPCW) estimator. We demonstrate the efficiency gained when using the AIPCW estimator compared to a non-augmented estimator via simulations. We then apply our method to evaluate the safety and efficacy of three anti-HIV regimens in a randomized trial conducted by the AIDS Clinical Trial Group, ACTG A5095.

12 citations
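For reference, the standard nonparametric cumulative incidence (Aalen-Johansen) estimator under a single independent censoring process can be sketched as follows; the paper's AIPCW estimator augments this construction to also handle dependent censoring, which this illustration omits:

```python
import numpy as np

def cumulative_incidence(times, status, cause, tau):
    """Aalen-Johansen cumulative incidence of `cause` at time `tau`.
    `status` is 0 for censored observations, otherwise the event cause."""
    times, status = np.asarray(times, float), np.asarray(status)
    s, cif = 1.0, 0.0   # s is the left limit of the all-cause KM curve
    for t in np.unique(times[status != 0]):
        if t > tau:
            break
        at_risk = np.sum(times >= t)
        d_all = np.sum((times == t) & (status != 0))
        d_cause = np.sum((times == t) & (status == cause))
        cif += s * d_cause / at_risk  # reach t event-free, then fail from `cause`
        s *= 1.0 - d_all / at_risk
    return cif
```

With no censoring, the cause-specific cumulative incidences sum to the overall failure probability, a useful sanity check.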


Journal ArticleDOI
TL;DR: A joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework is developed.
Abstract: Longitudinal and time-to-event data are often observed together. Finite mixture models are currently used to analyze nonlinear heterogeneous longitudinal data, which, by releasing the homogeneity restriction of nonlinear mixed-effects (NLME) models, can cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, and be associated with clinically important time-to-event data. This article develops a joint modeling approach to a finite mixture of NLME models for longitudinal data and proportional hazard Cox model for time-to-event data, linked by individual latent class indicators, under a Bayesian framework. The proposed joint models and method are applied to a real AIDS clinical trial data set, followed by simulation studies to assess the performance of the proposed joint model and a naive two-step model, in which finite mixture model and Cox model are fitted separately.

12 citations


Journal ArticleDOI
TL;DR: An adaptive group bridge method is proposed for competing risks data, enabling simultaneous selection both within and between groups; it possesses excellent asymptotic properties, including variable selection consistency at the group and within-group levels.
Abstract: Variable selection in the presence of grouped variables is troublesome for competing risks data: while some recent methods deal with group selection only, simultaneous selection of both groups and within-group variables remains largely unexplored. In this context, we propose an adaptive group bridge method, enabling simultaneous selection both within and between groups, for competing risks data. The adaptive group bridge is applicable to independent and clustered data. It also allows the number of variables to diverge as the sample size increases. We show that our new method possesses excellent asymptotic properties, including variable selection consistency at group and within-group levels. We also show superior performance in simulated and real data sets over several competing approaches, including group bridge, adaptive group lasso, and AIC / BIC-based methods.

Journal ArticleDOI
TL;DR: A nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation.
Abstract: The Dabrowska (Ann Stat 16:1475–1489, 1988) product integral representation of the multivariate survivor function is extended, leading to a nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation. Empirical process methods are used to sketch proofs for this estimator’s strong consistency and weak convergence properties. Summary measures of pairwise and higher-order dependencies are also defined and nonparametrically estimated. Simulation evaluation is given for the special case of three failure time variates.

Journal ArticleDOI
TL;DR: A sieve maximum likelihood approach is developed for the joint analysis, in which Bernstein polynomials are used to approximate unknown functions; the asymptotic properties of the resulting estimators are established and the proposed estimators of regression parameters are shown to be semiparametrically efficient.
Abstract: Interval-censored failure time data and panel count data are two types of incomplete data that commonly occur in event history studies and many methods have been developed for their analysis separately (Sun in The statistical analysis of interval-censored failure time data. Springer, New York, 2006; Sun and Zhao in The statistical analysis of panel count data. Springer, New York, 2013). Sometimes one may be interested in or need to conduct their joint analysis, such as in clinical trials with composite endpoints, for which no established approach seems to exist in the literature. In this paper, a sieve maximum likelihood approach is developed for the joint analysis; in the proposed method, Bernstein polynomials are used to approximate unknown functions. The asymptotic properties of the resulting estimators are established and, in particular, the proposed estimators of regression parameters are shown to be semiparametrically efficient. In addition, an extensive simulation study was conducted and the proposed method is applied to a set of real data arising from a skin cancer study.

Journal ArticleDOI
TL;DR: It is shown that the Kolmogorov Forward Differential Equations can be used to derive a relation between the prevalence and the transition rates in the illness-death model, and it is proved mathematical well-definedness and epidemiological meaningfulness of the prevalence of the disease.
Abstract: The aim of this work is to relate the theory of stochastic processes with the differential equations associated with multistate (compartment) models. We show that the Kolmogorov Forward Differential Equations can be used to derive a relation between the prevalence and the transition rates in the illness-death model. Then, we prove mathematical well-definedness and epidemiological meaningfulness of the prevalence of the disease. As an application, we derive the incidence of diabetes from a series of cross-sections.
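The relation the paper exploits can be illustrated numerically: integrating the Kolmogorov forward equations of a three-state illness-death model (healthy, ill, dead) yields the state occupation probabilities, and prevalence among the living is p_ill / (p_healthy + p_ill). The sketch below assumes constant transition rates and a simple forward Euler step, purely for illustration:

```python
import numpy as np

def illness_death_prevalence(incidence, mort_healthy, mort_ill, t_max, dt=0.001):
    """Forward-Euler integration of dp/dt = p Q for the illness-death
    model with constant rates; returns (prevalence at t_max, p)."""
    p = np.array([1.0, 0.0, 0.0])  # everyone starts healthy
    Q = np.array([
        [-(incidence + mort_healthy), incidence, mort_healthy],
        [0.0, -mort_ill, mort_ill],
        [0.0, 0.0, 0.0],           # death is absorbing
    ])
    for _ in range(int(round(t_max / dt))):
        p = p + dt * (p @ Q)
    return p[1] / (p[0] + p[1]), p
```

A sanity check on the relation between prevalence and the transition rates: when the mortality rates of the ill and the healthy coincide, prevalence among the living reduces to 1 - exp(-incidence * t).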

Journal ArticleDOI
TL;DR: A sieve approximation maximum likelihood approach is presented and the asymptotic properties of the resulting estimators are established and an extensive simulation study indicates that the method seems to work well for practical situations.
Abstract: This paper discusses regression analysis of doubly censored failure time data when there may exist a cured subgroup. By doubly censored data, we mean that the failure time of interest denotes the elapsed time between two related events and the observations on both event times can suffer censoring (Sun in The statistical analysis of interval-censored failure time data. Springer, New York, 2006). One typical example of such data is given by an acquired immune deficiency syndrome cohort study. Although many methods have been developed for their analysis (De Gruttola and Lagakos in Biometrics 45:1–12, 1989; Sun et al. in Biometrics 55:909–914, 1999; 60:637–643, 2004; Pan in Biometrics 57:1245–1250, 2001), no established method seems to exist for the situation with a cured subgroup. This paper discusses this latter problem and presents a sieve approximation maximum likelihood approach. In addition, the asymptotic properties of the resulting estimators are established and an extensive simulation study indicates that the method seems to work well for practical situations. An application is also provided.

Journal ArticleDOI
TL;DR: A marginal approach is chosen to study more complex correlation structures in modeling the infection times of the four udder quarters clustered within the cow, leaving the modeling of marginal distributions unaffected by the association parameters.
Abstract: The correlation structure imposed on multivariate time to event data is often of a simple nature, such as in the shared frailty model where pairwise correlations between event times in a cluster are all the same. In modeling the infection times of the four udder quarters clustered within the cow, more complex correlation structures are possibly required, and if so, such more complex correlation structures give more insight in the infection process. In this article, we will choose a marginal approach to study more complex correlation structures, therefore leaving the modeling of marginal distributions unaffected by the association parameters. The dependency of failure times will be induced through copula functions. The methods are shown for (mixtures of) the Clayton copula, but can be generalized to mixtures of Archimedean copulas for which the nesting conditions are met (McNeil in J Stat Comput Simul 6:567–581, 2008; Hofert in Comput Stat Data Anal 55:57–70, 2011).

Journal ArticleDOI
TL;DR: A conditional likelihood approach is proposed, and the conditional maximum likelihood estimators (cMLE) for the regression parameters and cumulative hazard function of these models are developed and shown to be consistent and asymptotically normal.
Abstract: Left-truncated data often arise in epidemiology and individual follow-up studies due to a biased sampling plan since subjects with shorter survival times tend to be excluded from the sample. Moreover, the survival times of recruited subjects are often subject to right censoring. In this article, a general class of semiparametric transformation models that include the proportional hazards model and the proportional odds model as special cases is studied for the analysis of left-truncated and right-censored data. We propose a conditional likelihood approach and develop the conditional maximum likelihood estimators (cMLE) for the regression parameters and cumulative hazard function of these models. The derived score equations for the regression parameters and the infinite-dimensional function suggest an iterative algorithm for the cMLE. The cMLE is shown to be consistent and asymptotically normal. The limiting variances for the estimators can be consistently estimated using the inverse of the negative Hessian matrix. Intensive simulation studies are conducted to investigate the performance of the cMLE. An application to the Channing House data is given to illustrate the methodology.

Journal ArticleDOI
TL;DR: This work considers observational studies in pregnancy where the outcome of interest is spontaneous abortion (SAB), and develops a conditional nonparametric maximum likelihood approach that addresses both occurrence and timing of SAB as compared to existing approaches in practice.
Abstract: We consider observational studies in pregnancy where the outcome of interest is spontaneous abortion (SAB). This at first sight is a binary ‘yes’ or ‘no’ variable, although there is left truncation as well as right-censoring in the data. Women who do not experience SAB by gestational week 20 are ‘cured’ from SAB by definition, that is, they are no longer at risk. Our data is different from the common cure data in the literature, where the cured subjects are always right-censored and not actually observed to be cured. We consider a commonly used cure rate model, with the likelihood function tailored specifically to our data. We develop a conditional nonparametric maximum likelihood approach. To tackle the computational challenge we adopt an EM algorithm making use of “ghost copies” of the data, and a closed form variance estimator is derived. Under suitable assumptions, we prove the consistency of the resulting estimator which involves an unbounded cumulative baseline hazard function, as well as the asymptotic normality. Simulation studies are carried out to evaluate the finite sample performance. We present the analysis of the motivating SAB study to illustrate the advantages of our model addressing both occurrence and timing of SAB, as compared to existing approaches in practice.

Journal ArticleDOI
TL;DR: This work considers the problem of selecting important prognostic biomarkers from a large set of candidates when the event times of interest are truncated and right- or interval-censored and describes an expectation–maximization algorithm which is empirically shown to perform well.
Abstract: With the increasing availability of large prospective disease registries, scientists studying the course of chronic conditions often have access to multiple data sources, with each source generated based on its own entry conditions. The different entry conditions of the various registries may be explicitly based on the response process of interest, in which case the statistical analysis must recognize the unique truncation schemes. Moreover, intermittent assessment of individuals in the registries can lead to interval-censored times of interest. We consider the problem of selecting important prognostic biomarkers from a large set of candidates when the event times of interest are truncated and right- or interval-censored. Methods for penalized regression are adapted to handle truncation via a Turnbull-type complete data likelihood. An expectation-maximization algorithm is described which is empirically shown to perform well. Inverse probability weights are used to adjust for the selection bias when assessing predictive accuracy based on individuals whose event status is known at a time of interest. Application to the motivating study of the development of psoriatic arthritis in patients with psoriasis in both the psoriasis cohort and the psoriatic arthritis cohort illustrates the procedure.

Journal ArticleDOI
TL;DR: A fully parametric approach to semi-competing risks modeling, where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is the first passage time of the same process to a stochastic threshold S.
Abstract: In semi-competing risks one considers a terminal event, such as death of a person, and a non-terminal event, such as disease recurrence. We present a model where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is represented by the first passage time of the same process to a stochastic threshold S, assumed to be independent of the stochastic process. In order to be explicit, we let the stochastic process be a gamma process, but other processes with independent increments may alternatively be used. For semi-competing risks this appears to be a new modeling approach, being an alternative to traditional approaches based on illness-death models and copula models. In this paper we consider a fully parametric approach. The likelihood function is derived and statistical inference in the model is illustrated on both simulated and real data.

Journal ArticleDOI
TL;DR: Simulation studies show that with realistic sample sizes and censoring rates, the proposed tests have the desired Type I error probabilities and are more powerful than the adjusted log-rank test when the treatment-specific hazards differ in non-proportional ways.
Abstract: When observational data are used to compare treatment-specific survivals, regular two-sample tests, such as the log-rank test, need to be adjusted for the imbalance between treatments with respect to baseline covariate distributions. Besides, the standard assumption that survival time and censoring time are conditionally independent given the treatment, required for the regular two-sample tests, may not be realistic in observational studies. Moreover, treatment-specific hazards are often non-proportional, resulting in small power for the log-rank test. In this paper, we propose a set of adjusted weighted log-rank tests and their supremum versions by inverse probability of treatment and censoring weighting to compare treatment-specific survivals based on data from observational studies. These tests are proven to be asymptotically correct. Simulation studies show that with realistic sample sizes and censoring rates, the proposed tests have the desired Type I error probabilities and are more powerful than the adjusted log-rank test when the treatment-specific hazards differ in non-proportional ways. A real data example illustrates the practical utility of the new methods.
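The building block of such tests is a log-rank statistic in which each subject carries an inverse-probability weight. The sketch below computes only the observed-minus-expected numerator with user-supplied weights; the paper's tests additionally weight by censoring probabilities and derive a proper variance, both of which this illustration omits, and the function names are ours:

```python
import numpy as np

def weighted_logrank_numerator(times, events, group, weights):
    """Observed-minus-expected deaths in group 1, accumulated over event
    times, with subject-level weights (e.g. inverse probability of
    treatment). Positive values mean group 1 fails faster than expected
    under the null of equal hazards."""
    times, events = np.asarray(times, float), np.asarray(events)
    group, w = np.asarray(group), np.asarray(weights, float)
    stat = 0.0
    for t in np.unique(times[events == 1]):
        risk = times >= t                       # still at risk just before t
        dead = (times == t) & (events == 1)     # fail exactly at t
        Yw, Y1w = w[risk].sum(), w[risk & (group == 1)].sum()
        Dw, D1w = w[dead].sum(), w[dead & (group == 1)].sum()
        stat += D1w - Y1w * Dw / Yw
    return stat
```

With unit weights and two groups having identical failure patterns, the statistic is exactly zero, as expected under the null.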

Journal ArticleDOI
TL;DR: Under the additive hazards model, a frailty model is employed to describe the relationship between the failure time of interest and the censoring time through some latent variables, and an estimated partial likelihood estimator of the regression parameters that makes use of the available auxiliary information is proposed.
Abstract: This paper discusses regression analysis of current status failure time data with informative observations and continuous auxiliary covariates. Under the additive hazards model, we employ a frailty model to describe the relationship between the failure time of interest and censoring time through some latent variables and propose an estimated partial likelihood estimator of regression parameters that makes use of the available auxiliary information. Asymptotic properties of the resulting estimators are established. To assess the finite sample performance of the proposed method, an extensive simulation study is conducted, and the results indicate that the proposed method works well. An illustrative example is also provided.

Journal ArticleDOI
TL;DR: This paper develops a novel modeling approach for estimating the time-lag period and for comparing the two treatments properly after the time-lag effect is accommodated, and shows that it is effective in practice.
Abstract: Medical treatments often take a period of time to reveal their impact on subjects, which is the so-called time-lag effect in the literature. In the survival data analysis literature, most existing methods compare two treatments in the entire study period. In cases when there is a substantial time-lag effect, these methods would not be effective in detecting the difference between the two treatments, because the similarity between the treatments during the time-lag period would diminish their effectiveness. In this paper, we develop a novel modeling approach for estimating the time-lag period and for comparing the two treatments properly after the time-lag effect is accommodated. Theoretical arguments and numerical examples show that it is effective in practice.

Journal ArticleDOI
TL;DR: This paper proposes to use a mixture of Gaussian distributions as an approximation to this unknown distribution and adopts an Expectation–Maximization (EM) algorithm for computation; the proposed method is demonstrated via a number of simulation studies.
Abstract: Joint models with shared Gaussian random effects have been conventionally used in analysis of longitudinal outcome and survival endpoint in biomedical or public health research. However, misspecifying the normality assumption of random effects can lead to serious bias in parameter estimation and future prediction. In this paper, we study joint models of general longitudinal outcomes and survival endpoint but allow the underlying distribution of shared random effect to be completely unknown. For inference, we propose to use a mixture of Gaussian distributions as an approximation to this unknown distribution and adopt an Expectation-Maximization (EM) algorithm for computation. AIC or BIC criteria are adopted for selecting the number of mixtures. We demonstrate the proposed method via a number of simulation studies. We illustrate our approach with the data from the Carolina Head and Neck Cancer Study (CHANCE).

Journal ArticleDOI
TL;DR: This work introduces an inverse censoring probability re-weighted semi-parametric single index model based approach to estimate conditional state occupation probabilities of a given individual in a multistate model under right-censoring.
Abstract: Inference for the state occupation probabilities, given a set of baseline covariates, is an important problem in survival analysis and time to event multistate data. We introduce an inverse censoring probability re-weighted semi-parametric single index model based approach to estimate conditional state occupation probabilities of a given individual in a multistate model under right-censoring. Besides obtaining a temporal regression function, we also test the potential time varying effect of a baseline covariate on future state occupation. We show that the proposed technique has desirable finite-sample performance and is competitive with three other existing approaches. We illustrate the proposed methodology using two different data sets. First, we re-examine a well-known data set dealing with leukemia patients undergoing bone marrow transplant with various state transitions. Our second illustration is based on data from a study involving functional status of a set of spinal cord injured patients undergoing a rehabilitation program.
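The inverse-censoring-probability reweighting step can be sketched as follows: estimate the censoring survival function G by Kaplan-Meier with the roles of event and censoring swapped, then weight each uncensored observation by 1/G(T−). This is a generic IPCW sketch on toy data, not the authors' full single-index estimator:

```python
import numpy as np

def censoring_km(time, event):
    """Kaplan-Meier estimate G of the censoring survival function.

    Censoring is treated as the event of interest, so the indicator
    is flipped to 1 - event. Returns a step function G(s).
    """
    order = np.argsort(time)
    t, d = time[order], 1 - event[order]
    at_risk = len(t) - np.arange(len(t))
    surv = np.cumprod(1.0 - d / at_risk)

    def G(s):
        idx = np.searchsorted(t, s, side="right") - 1  # last time <= s
        return surv[idx] if idx >= 0 else 1.0

    return G

# Toy data: observed times and event indicators (1 = event, 0 = censored).
time = np.array([2.0, 3.0, 5.0, 7.0, 8.0, 10.0])
event = np.array([1, 0, 1, 1, 0, 1])
G = censoring_km(time, event)

# IPCW weight for subject i: event_i / G(T_i-); censored subjects get weight 0.
eps = 1e-8
weights = np.where(event == 1,
                   1.0 / np.array([G(ti - eps) for ti in time]),
                   0.0)
```

Subjects whose event occurs late, when censoring has already removed much of the risk set, receive larger weights, which is what lets a complete-case analysis of the uncensored observations remain unbiased for the full population.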

Journal ArticleDOI
TL;DR: The dynamics of a chronic disease and its associated exacerbation-remission process over two time scales: calendar time and time-since-onset is considered and nonparametric estimation techniques for characteristic quantities of the process are provided.
Abstract: In the literature studying recurrent event data, a large amount of work has been focused on univariate recurrent event processes where the occurrence of each event is treated as a single point in time. There are many applications, however, in which univariate recurrent events are insufficient to characterize the feature of the process because patients experience nontrivial durations associated with each event. This results in an alternating event process where the disease status of a patient alternates between exacerbations and remissions. In this paper, we consider the dynamics of a chronic disease and its associated exacerbation-remission process over two time scales: calendar time and time-since-onset. In particular, over calendar time, we explore population dynamics and the relationship between incidence, prevalence and duration for such alternating event processes. We provide nonparametric estimation techniques for characteristic quantities of the process. In some settings, exacerbation processes are observed from an onset time until death; to account for the relationship between the survival and alternating event processes, nonparametric approaches are developed for estimating the exacerbation process over the lifetime. By understanding the population dynamics and within-process structure, this paper provides a new and general way to study alternating event processes.
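The incidence-prevalence-duration relationship can be checked numerically: for a stationary alternating renewal process, the long-run fraction of time spent in exacerbation equals mean duration / (mean gap + mean duration). A minimal simulation sketch, where exponential gaps and durations are an illustrative assumption rather than the paper's model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Remission gaps with mean 10, exacerbation durations with mean 2, so the
# stationary prevalence should be 2 / (10 + 2) = 1/6 (about 0.167).
mean_gap, mean_dur, horizon, n_subj = 10.0, 2.0, 1000.0, 200

def time_in_exacerbation(rng):
    """Total time a single subject spends in exacerbation over [0, horizon]."""
    t, total = 0.0, 0.0
    while t < horizon:
        t += rng.exponential(mean_gap)           # remission period
        d = rng.exponential(mean_dur)            # exacerbation episode
        total += min(d, max(horizon - t, 0.0))   # truncate at end of follow-up
        t += d
    return total

# Empirical prevalence: average fraction of follow-up spent in exacerbation.
prevalence = np.mean([time_in_exacerbation(rng) / horizon
                      for _ in range(n_subj)])
```

The incidence rate here is 1 / (mean_gap + mean_dur) episodes per unit time, so the identity prevalence = incidence × mean duration holds; the nonparametric estimators in the paper target these quantities without the exponential assumptions used above.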

Journal ArticleDOI
TL;DR: The properties of a regularized variable selection procedure in a stratified case-cohort design under an additive hazards model with a diverging number of parameters are investigated; the consistency and asymptotic normality of the penalized estimator are established and its oracle property is proved.
Abstract: Case-cohort designs are commonly used in large epidemiological studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large. An efficient variable selection method is needed for case-cohort studies where the covariates are only observed in a subset of the sample. Current literature on this topic has been focused on the proportional hazards model. However, in many studies the additive hazards model is preferred over the proportional hazards model either because the proportional hazards assumption is violated or the additive hazards model provides more relevant information to the research question. Motivated by one such study, the Atherosclerosis Risk in Communities (ARIC) study, we investigate the properties of a regularized variable selection procedure in a stratified case-cohort design under an additive hazards model with a diverging number of parameters. We establish the consistency and asymptotic normality of the penalized estimator and prove its oracle property. Simulation studies are conducted to assess the finite sample performance of the proposed method with a modified cross-validation tuning parameter selection method. We apply the variable selection procedure to the ARIC study to demonstrate its practical use.
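The flavor of penalized variable selection can be conveyed with a generic lasso fit by cyclic coordinate descent on a linear model. This is a stand-in illustration only; the paper's penalization operates on the additive-hazards estimating equations with case-cohort weights, which is substantially more involved:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso (0.5 * ||y - X b||^2 + lam * ||b||_1) via coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding feature j.
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r
            # Soft-thresholding update sets small coefficients exactly to zero.
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

# Sparse truth: only the first two of ten covariates matter.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:2] = [2.0, -1.5]
y = X @ true_beta + rng.standard_normal(n)
beta_hat = lasso_cd(X, y, lam=40.0)
```

The oracle property studied in the paper says that, with a suitable penalty and tuning rate, the selected model behaves asymptotically as if the true sparse support were known in advance; the soft-thresholding zeroes above are the finite-sample mechanism producing that sparsity.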

Journal ArticleDOI
TL;DR: This paper considers statistical inference for survival data with missing censoring indicators under the additive hazards model, proposing reweighting methods based on simple and augmented inverse probability weighting and establishing their asymptotic properties.
Abstract: Survival data with missing censoring indicators are frequently encountered in biomedical studies. In this paper, we consider statistical inference for this type of data under the additive hazards model. Reweighting methods based on simple and augmented inverse probability weighting are proposed. The asymptotic properties of the proposed estimators are established. Furthermore, we provide a numerical technique for checking the adequacy of the fitted model with missing censoring indicators. Our simulation results show that the proposed estimators outperform the simple and augmented inverse probability weighted estimators without reweighting. The proposed methods are illustrated by analyzing a dataset from a breast cancer study.
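The value of inverse-probability reweighting when indicators are missing can be seen in a toy setting where missingness depends on a covariate: the complete-case proportion is biased, while the IPW estimate recovers the truth. This is an illustrative sketch with known missingness probabilities, not the paper's estimator:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

# Covariate X drives both the censoring indicator delta and the chance
# that delta is observed, so complete-case analysis is biased.
X = rng.binomial(1, 0.5, n)
p_delta = np.where(X == 1, 0.9, 0.5)   # P(delta = 1 | X); marginal mean 0.7
delta = rng.binomial(1, p_delta)
pi = np.where(X == 1, 0.9, 0.3)        # P(indicator observed | X)
R = rng.binomial(1, pi)                # 1 if delta is observed

naive = delta[R == 1].mean()           # complete-case, biased toward X = 1
ipw = np.mean(R * delta / pi)          # inverse-probability weighted, consistent
```

In practice pi must itself be estimated (e.g. by a logistic model), and the augmented version adds a regression-based correction term that gives double robustness; both refinements are beyond this sketch.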

Journal ArticleDOI
Jing Yang1, Limin Peng1
TL;DR: A new nonparametric estimator of this dependence measure with left truncated semi-competing risks data is proposed that overcomes the limitation of the existing estimator that results from a strong assumption on the truncation mechanism.
Abstract: A semi-competing risks setting often arises in biomedical studies, involving both a nonterminal event and a terminal event. Cross quantile residual ratio (Yang and Peng in Biometrics 72:770-779, 2016) offers a flexible and robust perspective to study the dependency between the nonterminal and the terminal events which can shed useful scientific insight. In this paper, we propose a new nonparametric estimator of this dependence measure with left truncated semi-competing risks data. The new estimator overcomes the limitation of the existing estimator that results from a strong assumption on the truncation mechanism. We establish the asymptotic properties of the proposed estimator and develop inference procedures accordingly. Simulation studies suggest good finite-sample performance of the proposed method. Our proposal is illustrated via an application to the Denmark diabetes registry data.