Journal Article

Regression analysis of competing risks data with general missing pattern in failure types

TL;DR: In this paper, the cause-specific hazard rates under the general missing pattern were estimated using some semi-parametric models, and the regression coefficients and the baseline hazards were investigated.
About: This article is published in Statistical Methodology. The article was published on 2016-03-01 and is currently open access. It has received 2 citations to date. The article focuses on the topics: Missing data & Nelson–Aalen estimator.

Summary

1 Introduction

  • Competing risks models are usually employed to analyse this type of data.
  • Owing to inadequacy in the diagnostic mechanism, one is often uncertain about the true failure type, or reluctant to reveal the value of J, for certain individuals.
  • In carcinogenicity studies, deaths of individuals can be classified into deaths with tumour or deaths due to other reasons.

2 Data and Models

  • Note that these hazard rates in (3) can also be viewed as another set of cause-specific hazard rates with the different g’s in G representing the different failure types.
  • This may be a strong assumption, but the authors relax it to some extent in the second model.
  • The baseline cause-specific hazard rates, $\lambda_{0j}(t)$’s, are now assumed to be proportional to each other under this model, with the $e^{\gamma_j}$’s being the proportionality constants.
  • Both models can be tested independently using standard competing risks data without any missingness.
  • Under the general missing pattern as discussed, none of these can be tested.
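
The displays (3)–(5) themselves are not reproduced in this summary. As a rough sketch only, one reading consistent with the bullets above is given below. The symbol p_{gj}(t) (the probability that a type-j failure at time t is observed as pattern g), the common coefficient β in Model 1, and the type-specific β_j in Model 2 are assumptions made for illustration and may differ from the paper's exact formulation.

```latex
% Hazards of the 'modified' competing risks problem, one per observed pattern g in G
% (p_{gj}(t) is an assumed notation for the masking probabilities)
\lambda^*_g(t; z) = \sum_{j \in g} p_{gj}(t)\, \lambda_j(t; z), \qquad g \in G

% Model 1 (assumed form): type-specific baselines, common covariate effect
\lambda_j(t; z) = \lambda_{0j}(t)\, e^{z^\top \beta}

% Model 2 (assumed form): proportional baselines e^{\gamma_j}\lambda_0(t), type-specific effects
\lambda_j(t; z) = \lambda_0(t)\, e^{\gamma_j + z^\top \beta_j}
```

Under this reading, the common factor e^{z^\top \beta} in Model 1 passes through the sum over j in g, which is why the hazards for the patterns g would keep the same semi-parametric form.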

3 Estimation under Model 1

  • Also, these cause-specific hazard rates for the ‘modified’ competing risks problem are of the same semi-parametric form as those for the original cause-specific hazard rates in (4).
  • Hence, the following partial likelihood is the most appropriate to estimate the regression parameters β, in the absence of any knowledge on the baseline cause-specific hazards λ∗0g(t), based on ‘modified’ competing risks data which is available without any missing failure type (Kalbfleisch and Prentice, 1980, Sec. 7.2.3).

3.1 Estimation of regression parameters

  • Then, at each of these g-type failure times, say t(gi), the authors consider the conditional probability that the individual (gi) with covariate value z(gi) fails at time t(gi), given the history up to time t(gi)− and that one failure with missing pattern g occurs at time t(gi).
  • Clearly, this partial likelihood (7) can accommodate tied failure times with different missing patterns, but an approximation may be needed to deal with tied failure times with the same missing pattern.
  • Note that the standard asymptotic likelihood techniques can be applied to this partial likelihood (7) and to the estimate β̂ to make inference on β.
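
The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a common regression vector β across patterns, time-fixed covariates, no tied failure times within a pattern, and a risk set made of everyone still under observation just before the failure time. The function name and the data layout (a `frozenset` label per observed pattern, `None` for censoring) are hypothetical.

```python
import math

def log_partial_likelihood(beta, times, patterns, covariates):
    """Log partial likelihood summed over patterns g and g-type failures.

    patterns[i] is a hashable label for the observed set of possible
    failure types (for example frozenset({1, 2})), or None if censored.
    """
    # Linear predictor z_l' beta for every individual.
    eta = [sum(b * x for b, x in zip(beta, z)) for z in covariates]
    observed = {g for g in patterns if g is not None}
    loglik = 0.0
    for g in observed:
        for i, g_i in enumerate(patterns):
            if g_i != g:
                continue
            # Risk set: individuals still under observation just before times[i].
            denom = sum(math.exp(eta[l])
                        for l, t in enumerate(times) if t >= times[i])
            loglik += eta[i] - math.log(denom)
    return loglik

# Toy data (made up): four individuals, one censored.
times = [1.0, 2.0, 3.0, 4.0]
patterns = [frozenset({1}), frozenset({1, 2}), None, frozenset({1})]
covariates = [[0.0], [1.0], [0.0], [1.0]]

# At beta = 0 the value is -log(4) - log(3) - log(1) = -log(12), up to rounding.
print(log_partial_likelihood([0.0], times, patterns, covariates))
```

Maximising this function over β (for example with a generic numerical optimiser) would give an estimate in the spirit of β̂. The standard errors would come from the usual observed-information argument mentioned in the last bullet.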

3.2 Estimation of baseline cumulative cause-specific hazards

  • The baseline cumulative cause-specific hazards $\Lambda^*_{0g}(t) = \int_0^t \lambda^*_{0g}(u)\,du$ for the ‘modified’ competing risks problem can also be estimated as follows.
  • Under the same set of regularity conditions as required for the asymptotic normality of β̂, the process (Λ̂∗0g(t) − ...
  • Some earlier work addresses estimation of these masking probabilities, usually requiring either additional modeling assumptions or secondary data.
  • In practice, one can use the “pooling-the-adjacent-violators” algorithm to achieve monotonicity.
  • If some of the {Ng(t)}’s are not observed to have any jump during the study, the corresponding Λ̂∗0g(t)’s, and the associated entries of Σ̂(t), turn out to be zero; the corresponding rows of P are also estimated, as given in Dewanji and Sengupta (2003), to be zero.
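
The monotonicity step mentioned above can be illustrated with a small pool-adjacent-violators routine. This is a generic sketch of the standard algorithm (weighted least-squares non-decreasing fit), not code from the paper. The input is assumed to be a hypothetical sequence of estimated cumulative hazard values at ordered time points, and the function name is illustrative.

```python
def pool_adjacent_violators(values, weights=None):
    """Return the weighted least-squares non-decreasing fit to `values`."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand block means back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

# Hypothetical estimates that dip at the third time point.
raw = [0.1, 0.3, 0.2, 0.5]
print(pool_adjacent_violators(raw))
```

Here the second and third values are pooled into their average, so the fitted sequence is non-decreasing while staying as close as possible to the raw estimates in the least-squares sense.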

4 Estimation under Model 2

  • These have a semi-parametric form similar to that of the original cause-specific hazard rates in (5), except that the parametric components $f_g(z, t, \underset{\sim}{\theta})$, for different g’s, are not of the simple exponential form.
  • Nevertheless, from Kalbfleisch and Prentice (1980, Sec. 7.2.3), the partial likelihood in (13) is the most appropriate for estimating the vector of regression parameters $\underset{\sim}{\theta}$, in the absence of any knowledge of the baseline cause-specific hazard $\lambda_0(t)$, based on the ‘modified’ competing risks data, which are available without any missingness.

4.1 Estimation of regression parameters

  • Then, at each of these failure times, say t(i), the authors consider the conditional probability that the individual (i) with covariate value z(i) fails at time t(i) with missing pattern g(i), given the history up to time t(i)− and that one failure occurs at time t(i).
  • See Dewanji (1992) for a special case of this partial likelihood.
  • An approximation may be needed to deal with tied failure times.
  • Note that, unlike the model in (6), the model (12) cannot be written as the multiplicative hazard competing risks model of Andersen et al. (1993, Ch. VII.2), so the asymptotic results therein are not readily available.
  • The proofs of the asymptotic results follow steps similar to those in Andersen and Gill (1982) and Andersen et al. (1993, Ch. VII.2), with little modification, as worked out by Prentice and Self (1983) in the context of ordinary survival data with general relative risk form.
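
The conditional probability described in the first bullet can be written out as a worked sketch. It assumes only that the hazard for pattern g factors as a common baseline times the parametric component f_g, as stated in Section 4. The risk-set notation R(t), the number of failures n, and the absence of ties are assumptions for illustration, and the display is not claimed to match (13) term by term.

```latex
% theta below is the parameter vector written with a tilde underneath in the text.
% Assumed factorisation of the pattern-g hazard under Model 2
\lambda^*_g(t; z) = \lambda_0(t)\, f_g(z, t, \theta)

% Probability that individual (i) fails with pattern g_{(i)}, given one failure at t_{(i)};
% the unknown baseline \lambda_0(t_{(i)}) cancels between numerator and denominator
\frac{f_{g_{(i)}}\bigl(z_{(i)}, t_{(i)}, \theta\bigr)}
     {\sum_{l \in R(t_{(i)})} \sum_{g \in G} f_g\bigl(z_l, t_{(i)}, \theta\bigr)}

% Resulting partial likelihood over the n observed failure times
L(\theta) = \prod_{i=1}^{n}
  \frac{f_{g_{(i)}}\bigl(z_{(i)}, t_{(i)}, \theta\bigr)}
       {\sum_{l \in R(t_{(i)})} \sum_{g \in G} f_g\bigl(z_l, t_{(i)}, \theta\bigr)}
```

Because the denominator sums over all patterns as well as over the risk set, the failure time and the observed pattern are handled jointly here, in contrast to the pattern-by-pattern construction of Section 3.1.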

4.2 Estimation of baseline cumulative cause-specific hazards

  • A note on a test for competing risks with missing failure type.
  • Nonparametric prevalence and mortality estimators for animal experiments with incomplete cause of death data.


Citations
Book Chapter
01 Jan 2017
TL;DR: In this article, some statistical inference procedures used when the cause of failure is missing or masked for some units are reviewed.
Abstract: Competing risks data arise when the study units are exposed to several risks at the same time but it is assumed that the eventual failure of a unit is due to only one of these risks, which is called the “cause of failure”. Statistical inference procedures when the time to failure and the cause of failure are observed for each unit are well documented. In some applications, it is possible that the cause of failure is either missing or masked for some units. In this article, we review some statistical inference procedures used when the cause of failure is missing or masked for some units.
Journal Article
01 Jun 2017
TL;DR: This paper considers the nonparametric estimation of cumulative cause specific reversed hazard rates for left censored competing risks data under masked causes of failure with maximum likelihood estimators and least squares type estimators.
Abstract: In the analysis of competing risks data, it is common that the exact cause of failure for certain study subjects is missing. This problem of missing failure type may be due to inadequacy in the diagnostic mechanism or reluctance to report the exact cause of failure. In the present paper, we consider the nonparametric estimation of cumulative cause specific reversed hazard rates for left censored competing risks data under masked causes of failure. We first develop maximum likelihood estimators of cumulative cause specific reversed hazard rates. We then consider the least squares type estimators for cumulative cause specific reversed hazard rates, when the information about the conditional probability of exact failure type given a set of possible failure types is available. Simulation studies are conducted to assess the performance of the proposed estimators. We illustrate the applicability of the proposed methods using a data set.
References
Journal Article
TL;DR: In this article, the authors show how stage 1 and stage 2 information can be combined to provide statistical inference about (a) survival functions of individual risks, (b) the proportions of failures associated with individual risks and (c) probability, for a specified masked case, that each of the masked competing risks is responsible for the failure.
Abstract: Consider a life testing situation in which systems are subject to failure from independent competing risks. The hazards of various risks are proportional to each other. When a failure occurs, immediate, i.e. stage 1, procedures are used in an attempt to reach a definitive diagnosis. If a diagnosis is not reached, this phenomenon is called masking. Stage 2 procedures, such as failure analysis or autopsy, provide definitive diagnosis for a small sample of the masked cases. This paper shows how stage 1 and stage 2 information can be combined to provide statistical inference about (a) survival functions of the individual risks, (b) the proportions of failures associated with individual risks and (c) probability, for a specified masked case, that each of the masked competing risks is responsible for the failure.

57 citations

Journal Article
TL;DR: An estimator of the net survival function, for time-to-death due to the cause of interest, is developed and is consistent and asymptotically normally distributed.
Abstract: SUMMARY A nonparametric estimator for the survival function, accommodating censored survival times and uncertainty in the assignment of cause of death, is proposed. For example, in a carcinogenicity experiment the data on each animal may consist of an observed age-at-death and some indication of the probability that the tumor type under study caused death. An estimator of the net survival function, for time-to-death due to the cause of interest, is developed. Under certain assumptions, the proposed estimator is consistent and asymptotically normally distributed. Monte Carlo simulations were used to compare this estimator with the Kaplan-Meier estimator. Forcing the cause of death to be specified with certainty, as required by the Kaplan-Meier estimator, may result in substantial biases.

52 citations

Journal Article
TL;DR: In this paper, the authors proposed an EM algorithm for estimating the parameters of a weakly parameterised competing risks model with masked causes of failure and second-stage data, which is applied to a real dataset and the asymptotic and robustness properties of the estimators are investigated through simulation.
Abstract: SUMMARY In this paper we propose inference methods based on the EM algorithm for estimating the parameters of a weakly parameterised competing risks model with masked causes of failure and second-stage data. With a carefully chosen definition of complete data, the maximum likelihood estimation of the cause-specific hazard functions and of the masking probabilities is performed via an EM algorithm. Both the E- and M-steps can be solved in closed form under the full model and under some restricted models of interest. We illustrate the flexibility of the method by showing how grouped data and tests of common hypotheses in the literature on missing cause of death can be handled. The method is applied to a real dataset and the asymptotic and robustness properties of the estimators are investigated through simulation.

52 citations


"Regression analysis of competing ri..." refers background or methods or result in this paper

  • ...Craiu and Duchesne (2004) suggested an estimation procedure...

    [...]

  • ...This result on weak convergence to a Gaussian process similarly holds for the estimate of Dewanji and Sengupta (2003) and can be useful for nonparametric one- and k-sample tests for the cumulative cause-specific hazards in the spirit of Andersen and Borgan (1985, Section 5) and Andersen et al....

    [...]

  • ...Recently, Dewanji and Sengupta (2003), in addition to suggesting a nonparametric estimator using EM algorithm, developed a Nelson-Aalen type estimator of the cumulative cause-specific hazard rates (and also a smooth estimator of the cause-specific hazard rates), when certain information on the diagnostic probabilities are available from the experimentalists, but the missing pattern could be allowed to be non-ignorable and no second stage diagnosis was required....

    [...]

Journal Article
TL;DR: In this article, a random censorship model is proposed to permit uncertainty in the cause of death assessments for a subset of the subjects in a survival experiment. But only some of the solutions are consistent; i.e., the MLEs and self-consistent estimators are not consistent in general.

47 citations


"Regression analysis of competing ri..." refers background in this paper

  • ...and Louis, T. A. (1988). Use of tumour lethality to interpret tumorigenicity experiments lacking cause-of-death data....

    [...]

  • ...This problem was subsequently studied by different researchers (see Miyakawa, 1984; Racine-Poon and Hoel, 1984; Lo, 1991; Mukherjee and Wang, 1993)....

    [...]

Journal Article
TL;DR: It is shown how stage-1 and stage-2 information can be combined to provide statistical inference about survival functions of the individual risks, the proportions of failures associated with individual risks and probability, for a specified masked case, that each of the masked competing risks is responsible for the failure.
Abstract: We consider a life testing situation in which systems are subject to failure from independent competing risks. Following a failure, immediate (stage-1) procedures are used in an attempt to reach a definitive diagnosis. If these procedures fail to result in a diagnosis, this phenomenon is called masking. Stage-2 procedures, such as failure analysis or autopsy, provide definitive diagnosis for a sample of the masked cases. We show how stage-1 and stage-2 information can be combined to provide statistical inference about (a) survival functions of the individual risks, (b) the proportions of failures associated with individual risks and (c) probability, for a specified masked case, that each of the masked competing risks is responsible for the failure. Our development is based on parametric distributional assumptions and the special case for which the failure times for the competing risks have a Weibull distribution is discussed in detail.

47 citations

Frequently Asked Questions (1)
Q1. What have the authors contributed in "Regression analysis of competing risks data with general missing pattern in failure types"?

In this work, the authors deal with the regression problem, in which the cause-specific hazard rates may depend on some covariates, and consider estimation of the regression coefficients and the cause-specific baseline hazards under the general missing pattern using some semi-parametric models. The authors consider two different proportional hazards type semi-parametric models for their analysis. The authors also consider an example from an animal experiment to illustrate their methodology.