
Showing papers on "Semiparametric model published in 2003"



Posted Content
TL;DR: In this article, easy-to-verify sufficient conditions are given for the consistency and asymptotic normality of a class of semiparametric optimization estimators whose criterion function does not obey standard smoothness conditions and simultaneously depends on preliminary nonparametric estimators.
Abstract: We provide easy-to-verify sufficient conditions for the consistency and asymptotic normality of a class of semiparametric optimization estimators where the criterion function does not obey standard smoothness conditions and simultaneously depends on some preliminary nonparametric estimators. Our results extend existing theories like those of Pakes and Pollard (1989), Andrews (1994a), and Newey (1994). We apply our results to two examples: a 'hit rate' and a partially linear median regression with some endogenous regressors.

354 citations


Posted Content
TL;DR: In this article, the authors analyzed the effects of public R&D subsidies on research expenditure in the German manufacturing sector and found that public funding increases firms' investment in research. But the magnitude of the treatment effect depends on the assumptions imposed by the particular selection model.
Abstract: This paper analyzes the effects of public R&D subsidies on R&D expenditure in the German manufacturing sector. The focus is on the question whether public R&D funding stimulates or crowds out private investment. Cross-sectional data at the firm level are used. By applying parametric and semiparametric selection models, it turns out that public funding increases firms' R&D expenditure, although the magnitude of the treatment effect depends on the assumptions imposed by the particular selection model.

299 citations


Book
01 Jan 2003
TL;DR: In this article, a collection of techniques for analyzing nonparametric and semiparametric regression models is provided, including simple goodness-of-fit tests and residual regression tests, which can be used to test hypotheses such as parametric and semiparametric specifications, significance, monotonicity and additive separability.
Abstract: This book provides an accessible collection of techniques for analyzing nonparametric and semiparametric regression models. Worked examples include estimation of Engel curves and equivalence scales, scale economies, semiparametric Cobb-Douglas, translog and CES cost functions, household gasoline consumption, hedonic housing prices, option prices and state price density estimation. The book should be of interest to a broad range of economists including those working in industrial organization, labor, development, urban, energy and financial economics. A variety of testing procedures are covered including simple goodness of fit tests and residual regression tests. These procedures can be used to test hypotheses such as parametric and semiparametric specifications, significance, monotonicity and additive separability. Other topics include endogeneity of parametric and nonparametric effects, as well as heteroskedasticity and autocorrelation in the residuals. Bootstrap procedures are provided.

246 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed inference tools in a semiparametric partially linear regression model with missing response data and defined a class of estimators that includes as special cases a semi-parametric regression imputation estimator, a marginal average estimator and a (marginal) propensity score weighted estimator.
Abstract: We develop inference tools in a semiparametric partially linear regression model with missing response data. A class of estimators is defined that includes as special cases a semiparametric regression imputation estimator, a marginal average estimator, and a (marginal) propensity score weighted estimator. We show that any of our class of estimators is asymptotically normal. The three special estimators have the same asymptotic variance. They achieve the semiparametric efficiency bound in the homoscedastic Gaussian case. We show that the jackknife method can be used to consistently estimate the asymptotic variance. Our model and estimators are defined with a view to avoid the curse of dimensionality, which severely limits the applicability of existing methods. The empirical likelihood method is developed. It is shown that when missing responses are imputed using the semiparametric regression method the empirical log-likelihood is asymptotically a scaled chi-squared variable. An adjusted empirical log-likel...
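As a loose illustration of the imputation idea (a simplified sketch, not the authors' partially linear estimator: here the regression function is fit fully nonparametrically, and the data-generating process, bandwidth, and missingness model are invented for the example), missing responses can be filled in with a Nadaraya-Watson fit on complete cases before taking a marginal average:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 3000
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 1.0 + rng.normal(0.0, 0.3, n)  # E[Y] = 1 over x ~ U(0,1)
p_obs = 0.5 + 0.4 * x                      # response observed more often for large x
delta = rng.uniform(0.0, 1.0, n) < p_obs   # missing-at-random indicator

def nw_fit(x0, xs, ys, h=0.05):
    """Nadaraya-Watson regression estimate at each point of x0."""
    w = np.exp(-0.5 * ((x0[:, None] - xs[None, :]) / h) ** 2)
    return (w @ ys) / w.sum(axis=1)

m_hat = nw_fit(x, x[delta], y[delta])   # regression fit from complete cases
y_fill = np.where(delta, y, m_hat)      # impute the missing responses
theta_hat = y_fill.mean()               # marginal-average-type estimator of E[Y]
naive = y[delta].mean()                 # complete-case mean, biased under MAR here
```

Because observation probability depends on x, the complete-case mean is biased while the imputation-based marginal average is not.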

213 citations


01 Jan 2003
TL;DR: In this article, the authors developed asymptotically normal estimators of the second order parameter ρ, a parameter related to the rate of convergence of maximum values, linearly normalized, towards its limit.
Abstract: The main goal of this paper is to develop, under a semi-parametric context, asymptotically normal estimators of the second order parameter ρ, a parameter related to the rate of convergence of maximum values, linearly normalized, towards its limit. Asymptotic normality of such estimators is achieved under a third order condition on the tail 1 − F of the underlying model F , and for suitably large intermediate ranks. The class of estimators introduced is dependent on some control or tuning parameters and has the advantage of providing estimators with stable sample paths, as functions of the number k of top order statistics to be considered, for large values of k; such a behaviour makes obviously less important the choice of an optimal k. The practical validation of asymptotic results for small finite samples is done by means of simulation techniques in Fréchet and Burr models.

178 citations


Journal ArticleDOI
TL;DR: In this paper, a generalized structural mean model is proposed to estimate cause-effect relationships in empirical research where exposures are not completely controlled, as in observational studies or with patient noncompliance and self-selected treatment switches in randomized clinical trials.
Abstract: Summary. We estimate cause–effect relationships in empirical research where exposures are not completely controlled, as in observational studies or with patient non-compliance and self-selected treatment switches in randomized clinical trials. Additive and multiplicative structural mean models have proved useful for this but suffer from the classical limitations of linear and log-linear models when accommodating binary data. We propose the generalized structural mean model to overcome these limitations. This is a semiparametric two-stage model which extends the structural mean model to handle non-linear average exposure effects. The first-stage structural model describes the causal effect of received exposure by contrasting the means of observed and potential exposure-free outcomes in exposed subsets of the population. For identification of the structural parameters, a second stage ‘nuisance’ model is introduced. This takes the form of a classical association model for expected outcomes given observed exposure. Under the model, we derive estimating equations which yield consistent, asymptotically normal and efficient estimators of the structural effects. We examine their robustness to model misspecification and construct robust estimators in the absence of any exposure effect. The double-logistic structural mean model is developed in more detail to estimate the effect of observed exposure on the success of treatment in a randomized controlled blood pressure reduction trial with self-selected non-compliance.

168 citations


Journal ArticleDOI
TL;DR: This article proposes a semiparametric estimator of conditional crop-yield densities that, because of its theoretical properties and its simulation results, enables one to proceed empirically with a higher degree of confidence; this matters because conclusions from economic analyses that require estimated conditional yield densities tend not to be invariant to the modeling assumption.
Abstract: Given the increasing interest in agricultural risk, many have sought improved methods to characterize conditional crop-yield densities. While most have postulated the Beta as a flexible alternative to the Normal, others have chosen nonparametric methods. Unfortunately, yield data tends not to be sufficiently abundant to invalidate many reasonable parametric models. This is problematic because conclusions from economic analyses, which require estimated conditional yield densities, tend not to be invariant to the modeling assumption. We propose a semiparametric estimator that, because of its theoretical properties and our simulation results, enables one to empirically proceed with a higher degree of confidence. Copyright 2003, Oxford University Press.

156 citations


Journal ArticleDOI
TL;DR: A goodness-of-fit test for checking parametric models against nonparametric models is provided, based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test, treating the inverse of the smoothing parameter as an extra variance component.
Abstract: We consider testing whether the nonparametric function in a semiparametric additive mixed model is a simple fixed degree polynomial, for example, a simple linear function. This test provides a goodness-of-fit test for checking parametric models against nonparametric models. It is based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test by treating the inverse of the smoothing parameter as an extra variance component. We also consider testing the equivalence of two nonparametric functions in semiparametric additive mixed models for two groups, such as treatment and placebo groups. The proposed tests are applied to data from an epidemiological study and a clinical trial and their performance is evaluated through simulations.

140 citations


Journal ArticleDOI
TL;DR: The paper develops a simple estimator of the parametric component of the conditional quantile; the semiparametric efficiency bound for the parametric component is derived, and two types of efficient estimators are considered.
Abstract: This paper is concerned with estimating a conditional quantile function that is assumed to be partially linear. The paper develops a simple estimator of the parametric component of the conditional quantile. The semiparametric efficiency bound for the parametric component is derived, and two types of efficient estimators are considered. Asymptotic properties of the proposed estimators are established under regularity conditions. Some Monte Carlo experiments indicate that the proposed estimators perform well in small samples. This paper is a part of my Ph.D. dissertation submitted to the University of Iowa. I am grateful to my adviser, Joel Horowitz, for his insightful comments, suggestions, guidance, and support. I also thank John Geweke, Gene Savin, two anonymous referees, the co-editor Oliver Linton, and participants at the 2001 Midwest Econometrics Group Annual Meeting in Kansas City for many helpful comments and suggestions. Of course, the responsibility for any errors is mine.

109 citations


Journal ArticleDOI
TL;DR: SPTA controls for population stratification through a set of genomic markers by first deriving a genetic background variable for each sampled individual through his/her genotypes at a series of independent markers, and then modeling the relationship between trait values, genotypic scores at the candidate marker, and genetic background variables through a semiparametric model.
Abstract: Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification such as family-based association designs may be less powerful. Recently, various statistical methods robust to population stratification were proposed for association studies, using unrelated individuals to identify associations between candidate markers and traits of interest (both qualitative and quantitative). Here, we propose a semiparametric test for association (SPTA). SPTA controls for population stratification through a set of genomic markers by first deriving a genetic background variable for each sampled individual through his/her genotypes at a series of independent markers, and then modeling the relationship between trait values, genotypic scores at the candidate marker, and genetic background variables through a semiparametric model. We assume that the exact form of relationship between the trait value and the genetic background variable is unknown and estimated through smoothing techniques. We evaluate the performance of SPTA through simulations both with discrete subpopulation models and with continuous admixture population models. The simulation results suggest that our procedure has a correct type I error rate in the presence of population stratification and is more powerful than statistical association tests for family-based association designs in all the cases considered. Moreover, SPTA is more powerful than the Quantitative Similarity-Based Association Test (QSAT) developed by us under continuous admixture populations, and the number of independent markers needed by SPTA to control for population stratification is substantially fewer than that required by QSAT.

Journal ArticleDOI
TL;DR: In this paper, a semiparametric statistical model is proposed for estimating an integral by Monte Carlo methods using simulated observations as data, which is applicable to Markov chain and more general Monte Carlo sampling schemes with multiple samplers.
Abstract: Summary. The task of estimating an integral by Monte Carlo methods is formulated as a statistical model using simulated observations as data. The difficulty in this exercise is that we ordinarily have at our disposal all of the information required to compute integrals exactly by calculus or numerical integration, but we choose to ignore some of the information for simplicity or computational feasibility. Our proposal is to use a semiparametric statistical model that makes explicit what information is ignored and what information is retained. The parameter space in this model is a set of measures on the sample space, which is ordinarily an infinite dimensional object. None-the-less, from simulated data the base-line measure can be estimated by maximum likelihood, and the required integrals computed by a simple formula previously derived by Vardi and by Lindsay in a closely related model for biased sampling. The same formula was also suggested by Geyer and by Meng and Wong using entirely different arguments. By contrast with Geyer's retrospective likelihood, a correct estimate of simulation error is available directly from the Fisher information. The principal advantage of the semiparametric model is that variance reduction techniques are associated with submodels in which the maximum likelihood estimator in the submodel may have substantially smaller variance than the traditional estimator. The method is applicable to Markov chain and more general Monte Carlo sampling schemes with multiple samplers.
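The baseline that the semiparametric MLE improves upon can be conveyed with plain importance sampling, where the sampler's density is known and used in full. This toy example (the normal target and N(0, 4) sampler are chosen purely for convenience) estimates ∫ exp(−x²/2) dx = √(2π):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = 2.0 * rng.standard_normal(n)                 # draws from the sampler N(0, 4)
p = np.exp(-x**2 / 8.0) / np.sqrt(8.0 * np.pi)   # sampler density, used in full
q = np.exp(-x**2 / 2.0)                          # unnormalized integrand
I_hat = np.mean(q / p)                           # Monte Carlo estimate of sqrt(2*pi)
```

The estimator is unbiased, and its simulation error can be read off from the sample variance of q/p.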

Journal ArticleDOI
TL;DR: In this paper, the authors derived asymptotic information bounds and the form of the efficient score and influence functions for the semiparametric regression models studied by Lawless, Kalbfleisch and Wild (1999) under two-phase sampling designs.
Abstract: Outcome-dependent, two-phase sampling designs can dramatically reduce the costs of observational studies by judicious selection of the most informative subjects for purposes of detailed covariate measurement. Here we derive asymptotic information bounds and the form of the efficient score and influence functions for the semiparametric regression models studied by Lawless, Kalbfleisch and Wild (1999) under two-phase sampling designs. We show that the maximum likelihood estimators for both the parametric and nonparametric parts of the model are asymptotically normal and efficient. The efficient influence function for the parametric part agrees with the more general information bound calculations of Robins, Hsieh and Newey (1995). By verifying the conditions of Murphy and van der Vaart (2000) for a least favorable parametric submodel, we provide asymptotic justification for statistical inference based on profile likelihood.

Journal ArticleDOI
TL;DR: In this paper, the empirical likelihood for an α-mixing process is employed to formulate a test statistic that measures the goodness of fit of a parametric regression model against a series of nonparametric alternatives, based on residuals arising from a fitted model.
Abstract: Summary. Standard goodness-of-fit tests for a parametric regression model against a series of nonparametric alternatives are based on residuals arising from a fitted model. When a parametric regression model is compared with a nonparametric model, goodness-of-fit testing can be naturally approached by evaluating the likelihood of the parametric model within a nonparametric framework. We employ the empirical likelihood for an α-mixing process to formulate a test statistic that measures the goodness of fit of a parametric regression model. The technique is based on a comparison with kernel smoothing estimators. The empirical likelihood formulation of the test has two attractive features. One is its automatic consideration of the variation that is associated with the nonparametric fit due to empirical likelihood's ability to Studentize internally. The other is that the asymptotic distribution of the test statistic is free of unknown parameters, avoiding plug-in estimation. We apply the test to a discretized diffusion model which has recently been considered in financial market analysis.
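The internal Studentization the authors exploit is a general property of empirical likelihood. A minimal one-sample sketch (i.i.d. data and a mean, rather than the paper's α-mixing regression setting) shows the −2 log EL ratio behaving like a χ²(1) variable under the null and blowing up under misspecification; the sample size and hypothesized means are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 300)   # skewed data with true mean 1

def el_stat(mu):
    """-2 log empirical likelihood ratio for H0: E[X] = mu (mu inside the data range)."""
    z = x - mu
    lo = -1.0 / z.max() + 1e-8   # feasible bracket for the Lagrange multiplier
    hi = -1.0 / z.min() - 1e-8
    for _ in range(200):         # bisection: the score in lam is strictly decreasing
        lam = 0.5 * (lo + hi)
        if np.sum(z / (1.0 + lam * z)) > 0:
            lo = lam
        else:
            hi = lam
    return 2.0 * np.sum(np.log(1.0 + lam * z))

stat_true = el_stat(1.0)   # approximately chi-square(1) under H0
stat_bad = el_stat(1.3)    # mu well away from the truth: large statistic
```

No variance estimate is plugged in anywhere; the likelihood-ratio geometry supplies the Studentization.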

Journal ArticleDOI
TL;DR: In this paper, a class of semiparametric functional regression models is proposed to describe the influence of vector-valued covariates on a sample of response curves, where each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components.
Abstract: Summary. We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg laying curves for 1000 female Mediterranean fruit-flies (medflies).

01 Jan 2003
TL;DR: In semi-parametric models, the long-range dependence parameter is estimated without assuming that the short-range dependence structure of the covariance function is known; the authors review some estimation methods and present new ones that have not previously been tested on simulated data.
Abstract: In semi-parametric models, the long-range dependence parameter is estimated without assuming that the short-range dependence structure of the covariance function is known. We review some of the estimation methods and present new ones which have not been tested before on simulated data.
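A standard estimator in this semiparametric family is the GPH log-periodogram regression, which uses only the low-frequency behaviour of the spectrum and leaves the short-range structure unspecified. A sketch, with an ARFIMA(0, d, 0) series simulated by truncating the MA(∞) representation (the truncation length J and bandwidth m = √n are illustrative choices, not prescriptions):

```python
import numpy as np

rng = np.random.default_rng(11)
d_true, n, J = 0.3, 8192, 5000
# ARFIMA(0, d, 0) via a truncated MA(inf) expansion: psi_j = psi_{j-1} (j - 1 + d) / j
psi = np.empty(J)
psi[0] = 1.0
for j in range(1, J):
    psi[j] = psi[j - 1] * (j - 1 + d_true) / j
e = rng.standard_normal(n + J - 1)
x = np.convolve(e, psi, mode="valid")   # length-n series with long memory

# GPH: regress log periodogram on log{4 sin^2(w/2)} over the first m frequencies
m = int(n ** 0.5)
w = 2.0 * np.pi * np.arange(1, m + 1) / n
I = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2.0 * np.pi * n)
slope = np.polyfit(np.log(4.0 * np.sin(w / 2.0) ** 2), np.log(I), 1)[0]
d_hat = -slope                          # semiparametric estimate of d
```

Only the m lowest Fourier frequencies enter the regression, which is exactly what makes the estimator agnostic about short-range dynamics.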


Journal ArticleDOI
TL;DR: Several semiparametric techniques are introduced to estimate the volatilities of the market prices of a portfolio and several new techniques for forecasting multiple period VaR are introduced.
Abstract: Summary. Value at Risk (VaR) is a fundamental tool for managing market risks. It measures the worst loss to be expected of a portfolio over a given time horizon under normal market conditions at a given confidence level. Calculation of VaR frequently involves estimating the volatility of return processes and quantiles of standardized returns. In this paper, several semiparametric techniques are introduced to estimate the volatilities of the market prices of a portfolio. In addition, both parametric and nonparametric techniques are proposed to estimate the quantiles of standardized return processes. The newly proposed techniques also have the flexibility to adapt automatically to the changes in the dynamics of market prices over time. Their statistical efficiencies are studied both theoretically and empirically. The combination of newly proposed techniques for estimating volatility and standardized quantiles yields several new techniques for forecasting multiple period VaR. The performance of the newly proposed VaR estimators is evaluated and compared with some existing methods. Our simulation results and empirical studies endorse the newly proposed time-dependent semiparametric approach for estimating VaR.
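A stripped-down version of the two-step recipe (these are not the paper's estimators; the EWMA volatility filter, its decay constant, and the plug-in empirical quantile are common stand-ins chosen for illustration) estimates volatility first and then takes a nonparametric quantile of the standardized returns:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 2000
sigma = 0.01 * (1.0 + 0.5 * np.sin(np.linspace(0.0, 4.0 * np.pi, T)))
r = sigma * rng.standard_normal(T)   # returns with slowly varying volatility

# step 1: conditional volatility via an EWMA filter (an illustrative choice)
lam = 0.94
s2 = np.empty(T)
s2[0] = r[:50].var()
for t in range(1, T):
    s2[t] = lam * s2[t - 1] + (1.0 - lam) * r[t - 1] ** 2
vol = np.sqrt(s2)

# step 2: nonparametric quantile of the standardized returns
z = r / vol
q05 = np.quantile(z, 0.05)

var_95 = -q05 * vol[-1]   # next-period 95% VaR forecast (positive = loss)
```

The quantile step makes no distributional assumption on the standardized returns, which is where the semiparametric flavour comes from.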

Journal ArticleDOI
TL;DR: In this article, the authors explore a semiparametric approach by assuming a density ratio model for disease and disease-free densities, which has a natural connection with the logistic regression model.
Abstract: Summary. Estimation of a receiver operating characteristic (ROC) curve is usually based either on a fully parametric model, such as a normal model, or on a fully nonparametric model. In this paper, we explore a semiparametric approach by assuming a density ratio model for the disease and disease-free densities. This model has a natural connection with the logistic regression model. The proposed semiparametric approach is more robust than a fully parametric approach and is more efficient than a fully nonparametric approach. Two real examples demonstrate that the ROC curve estimated by our semiparametric method is much smoother than that estimated by the nonparametric method.
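The connection to logistic regression can be made concrete: if the diseased and disease-free marker densities satisfy log{g(x)/f(x)} = α + βx, a logistic regression of disease status on the marker recovers (α, β) up to a known offset (zero here, since the two groups have equal size). The sketch below uses simulated normal markers, for which the density ratio model holds exactly with β = 1 and α = −1/2, and a hand-rolled Newton-Raphson fit; it is a verification of the model link, not the paper's ROC estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
x0 = rng.normal(0.0, 1.0, n)        # disease-free marker, N(0, 1)
x1 = rng.normal(1.0, 1.0, n)        # diseased marker, N(1, 1)
x = np.concatenate([x0, x1])
d = np.r_[np.zeros(n), np.ones(n)]  # disease indicator

# density ratio model: log{g(x)/f(x)} = alpha + beta * x
X = np.column_stack([np.ones_like(x), x])
theta = np.zeros(2)
for _ in range(25):                 # Newton-Raphson for the logistic likelihood
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    grad = X.T @ (d - p)
    H = X.T @ (X * (p * (1.0 - p))[:, None])
    theta += np.linalg.solve(H, grad)
alpha_hat, beta_hat = theta
```

The fitted tilt exp(α̂ + β̂x) can then be used to smooth the empirical distributions entering the ROC curve.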

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of consistent model specification tests using series estimation methods and derive a test statistic for an additive partially linear model using a central limit theorem for Hilbert-valued random arrays.

01 Jan 2003
TL;DR: In this article, the semiparametric estimation of multivariate long-range dependent processes is analyzed, and the proposed estimator is shown to have a smaller limiting variance than the two-step Gaussian estimator studied by Lobato (1999).
Abstract: This paper analyzes the semiparametric estimation of multivariate long-range dependent processes. The class of spectral densities considered includes multivariate fractionally integrated processes, which are not covered by the existing literature. This paper also establishes the consistency of the multivariate Gaussian semiparametric estimator, which has not been shown in the other works. Asymptotic normality of the multivariate Gaussian semiparametric estimator is also established, and the proposed estimator is shown to have a smaller limiting variance than the two-step Gaussian semiparametric estimator studied by Lobato (1999). Gaussianity is not assumed in the asymptotic theory.

Journal ArticleDOI
TL;DR: A new O(d) estimation procedure is proposed for a large class of semiparametric models based on a generalization of the EM‐like construction of the surrogate objective function so it does not depend on the missing data representation of the model.
Abstract: In semiparametric models, the dimension d of the maximum likelihood problem is potentially unlimited. Conventional estimation methods generally behave like O(d³). A new O(d) estimation procedure is proposed for a large class of semiparametric models. Potentially unlimited dimension is handled in a numerically efficient way through a Nelson-Aalen-like estimator. Discussion of the new method is put in the context of recently developed minorization-maximization algorithms based on surrogate objective functions. The procedure for semiparametric models is used to demonstrate three methods to construct a surrogate objective function: using the difference of two concave functions, the EM way and the new quasi-EM (QEM) approach. The QEM approach is based on a generalization of the EM-like construction of the surrogate objective function so it does not depend on the missing data representation of the model. Like the EM algorithm, the QEM method has a dual interpretation, a result of merging the idea of surrogate maximization with the idea of imputation and self-consistency. The new approach is compared with other possible approaches by using simulations and analysis of real data. The proportional odds model is used as an example throughout the paper.

Journal ArticleDOI
TL;DR: In this paper, a semi-parametric median residual life regression model is proposed for small cell lung cancer patients with moderate censoring, which is based on Dirichlet process mixing.
Abstract: With survival data there is often interest not only in the survival time distribution but also in the residual survival time distribution. In fact, regression models to explain residual survival time might be desired. Building upon recent work of Kottas & Gelfand (J. Amer. Statist. Assoc. 96 (2001) 1458), we formulate a semiparametric median residual life regression model induced by a semiparametric accelerated failure time regression model. We utilize a Bayesian approach which allows full and exact inference. Classical work essentially ignores covariates and is either based upon parametric assumptions or is limited to asymptotic inference in non-parametric settings. No regression modelling of median residual life appears to exist. The Bayesian modelling is developed through Dirichlet process mixing. The models are fitted using Gibbs sampling. Residual life inference is implemented extending the approach of Gelfand & Kottas (J. Comput. Graph. Statist. 11 (2002) 289). Finally, we present a fairly detailed analysis of a set of survival times with moderate censoring for patients with small cell lung cancer.

Journal ArticleDOI
TL;DR: A new computational method for the cure model is presented that combines the computational methods for logistic regression and the Cox proportional hazards models and is easy to implement in many statistical packages and allows a number of useful extensions of the model to relax the restrictive assumptions.

Journal Article
TL;DR: In this article, a family of time-dependent diffusion processes is introduced to model the term structure dynamics, which can be used to test the goodness-of-fit of these famous time-homogeneous short rate models.
Abstract: In an effort to capture the time variation on the instantaneous return and volatility functions, a family of time-dependent diffusion processes is introduced to model the term structure dynamics. This allows one to examine how the instantaneous return and price volatility change over time and price level. Nonparametric techniques, based on kernel regression, are used to estimate the time-varying coefficient functions in the drift and diffusion. The newly proposed semiparametric model includes most of the well-known short-term interest rate models, such as those proposed by Cox, Ingersoll and Ross (1985) and Chan, Karolyi, Longstaff and Sanders (1992). It can be used to test the goodness-of-fit of these famous time-homogeneous short rate models. The newly proposed method complements the time-homogeneous nonparametric estimation techniques of Stanton (1997) and Fan and Yao (1998), and is shown through simulations to truly capture the heteroscedasticity and time-inhomogeneous structure in volatility. A family of new statistics is introduced to test whether the time-homogeneous models adequately fit interest rates for certain periods of the economy. We illustrate the new methods by using weekly three-month treasury bill data.
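The kernel-regression step can be sketched for the time-homogeneous special case (closer in spirit to Stanton (1997) than to the paper's time-dependent model): drift and squared diffusion are Nadaraya-Watson regressions of scaled increments on the current level. The Ornstein-Uhlenbeck parameters, step size, and bandwidth below are arbitrary choices for the illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
dt, n = 1.0 / 52.0, 20000
kappa, theta, sig = 2.0, 0.05, 0.02   # toy mean-reverting short-rate model
r = np.empty(n)
r[0] = theta
for t in range(n - 1):
    r[t + 1] = (r[t] + kappa * (theta - r[t]) * dt
                + sig * np.sqrt(dt) * rng.standard_normal())

def nw(x0, xs, ys, h):
    """Nadaraya-Watson estimate at a single point x0."""
    w = np.exp(-0.5 * ((xs - x0) / h) ** 2)
    return np.sum(w * ys) / np.sum(w)

dr = np.diff(r)
x0 = 0.05
drift_hat = nw(x0, r[:-1], dr / dt, h=0.01)       # estimates kappa*(theta - x0) = 0
diff2_hat = nw(x0, r[:-1], dr ** 2 / dt, h=0.01)  # estimates sig**2 = 4e-4
```

The same regressions evaluated on a grid of levels trace out the drift and diffusion functions without assuming a parametric short-rate model.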

Journal ArticleDOI
TL;DR: In this paper, a general expository description of the use of quadratic score test statistics as inference functions is given, which allows one to do efficient estimation and testing in a semiparametric model defined by a set of mean zero estimating functions.
Abstract: A general expository description is given of the use of quadratic score test statistics as inference functions. This methodology allows one to do efficient estimation and testing in a semiparametric model defined by a set of mean-zero estimating functions. The inference function is related to a quadratic minimum distance problem. The asymptotic chi-squared properties are shown to be the consequences of asymptotic projection properties. Shortcomings of the asymptotic theory are discussed and a bootstrap method is shown to correct for anticonservative testing behavior.
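A minimal instance of such a quadratic inference function: with the mean-zero estimating functions g(x, θ) = (x − θ, x² − θ² − 1) for an N(θ, 1) model, the quadratic form below is minimized over θ, and its minimized value serves as an overidentification statistic (two moments, one parameter, so one degree of freedom). The grid search and the continuously updated weight matrix are simplifications for the sketch:

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.normal(1.5, 1.0, 2000)   # N(theta, 1) sample with theta = 1.5

def Q(theta):
    """Quadratic form in the averaged mean-zero estimating functions."""
    g = np.column_stack([x - theta, x**2 - theta**2 - 1.0])
    m = g.mean(axis=0)
    W = np.linalg.inv(g.T @ g / len(x))   # inverse estimated moment covariance
    return len(x) * m @ W @ m

grid = np.linspace(0.5, 2.5, 2001)
qs = np.array([Q(t) for t in grid])
theta_hat = grid[qs.argmin()]   # minimum quadratic distance estimate
J_stat = qs.min()               # ~ chi-square(1) when the model is correct
```

Large values of J_stat would signal that no θ reconciles the two moment conditions, which is the testing use of the inference function.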

Journal ArticleDOI
TL;DR: The authors examined three pattern-mixture models for making inference about parameters of the distribution of an outcome of interest Y that is to be measured at the end of a longitudinal study when this outcome is missing in some subjects.
Abstract: Summary. We examine three pattern-mixture models for making inference about parameters of the distribution of an outcome of interest Y that is to be measured at the end of a longitudinal study when this outcome is missing in some subjects. We show that these pattern-mixture models also have an interpretation as selection models. Because these models make unverifiable assumptions, we recommend that inference about the distribution of Y be repeated under a range of plausible assumptions. We argue that, of the three models considered, only one admits a parameterization that facilitates the examination of departures from the assumption of sequential ignorability. The three models are nonparametric in the sense that they do not impose restrictions on the class of observed data distributions. Owing to the curse of dimensionality, the assumptions that are encoded in these models are sufficient for identification but not for inference. We describe additional flexible and easily interpretable assumptions under which it is possible to construct estimators that are well behaved with moderate sample sizes. These assumptions define semiparametric models for the distribution of the observed data. We describe a class of estimators which, up to asymptotic equivalence, comprise all the consistent and asymptotically normal estimators of the parameters of interest under the postulated semiparametric models. We illustrate our methods with the analysis of data from a randomized clinical trial of contracepting women.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric model of random uncertainties is proposed for dynamic substructuring, which does not require identifying uncertain parameters in the reduced matrix model of each substructure as is usually done for the parametric approach.
Abstract: This paper presents a new approach, called a nonparametric approach, for constructing a model of random uncertainties in dynamic substructuring in order to predict the matrix-valued frequency response functions of complex structures. Such an approach allows nonhomogeneous uncertainties to be modeled with the nonparametric approach. The Craig-Bampton dynamic substructuring method is used. For each substructure, a nonparametric model of random uncertainties is introduced. This nonparametric model does not require identifying uncertain parameters in the reduced matrix model of each substructure as is usually done for the parametric approach. This nonparametric model of random uncertainties is based on the use of a probability model for symmetric positive-definite real random matrices using the entropy optimization principle. The theory and a numerical example are presented in the context of the finite-element method. The numerical results obtained show the efficiency of the model proposed.
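The flavour of the construction can be conveyed with a normalized Wishart sketch (a simplification: the paper's maximum-entropy ensemble is more general), which produces symmetric positive-definite random matrices whose mean equals a prescribed nominal matrix; the 2x2 nominal matrix and dispersion parameter p are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(5)
K_nom = np.array([[2.0, 0.3],
                  [0.3, 1.0]])   # nominal (mean) matrix, e.g. a reduced stiffness
L = np.linalg.cholesky(K_nom)
p = 50                           # larger p means less dispersion about the mean

def sample_spd():
    """Random symmetric positive-definite matrix with mean K_nom (Wishart sketch)."""
    G = rng.standard_normal((p, 2))
    W = G.T @ G / p              # normalized Wishart factor, E[W] = I
    return L @ W @ L.T           # congruence transform: mean becomes K_nom

draws = np.array([sample_spd() for _ in range(2000)])
emp_mean = draws.mean(axis=0)
min_eig = min(np.linalg.eigvalsh(S).min() for S in draws)
```

Every draw is positive definite by construction, which is the essential property for random reduced mass and stiffness matrices.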

Journal ArticleDOI
TL;DR: In this paper, a three-step estimator for the parameters of interest in the outcome equation of a sample selection model is proposed and shown to be √n-consistent and asymptotically normal under standard regularity conditions.
Abstract: This paper considers estimation of a sample selection model subject to conditional heteroskedasticity in both the selection and outcome equations. The form of heteroskedasticity allowed for in each equation is multiplicative, and each of the two scale functions is left unspecified. A three-step estimator for the parameters of interest in the outcome equation is proposed. The first two stages involve nonparametric estimation of the "propensity score" and the conditional interquartile range of the outcome equation, respectively. The third stage reweights the data so that the conditional expectation of the reweighted dependent variable is of a partially linear form, and the parameters of interest are estimated by an approach analogous to that adopted in Ahn and Powell (1993, Journal of Econometrics 58, 3–29). Under standard regularity conditions the proposed estimator is shown to be √n-consistent and asymptotically normal, and the form of its limiting covariance matrix is derived. We are grateful to B. Honore, R. Klein, E. Kyriazidou, L.-F. Lee, J. Powell, two anonymous referees, and the co-editor D. Andrews and also to seminar participants at Princeton, Queens, UCLA, and the University of Toronto for helpful comments. Chen's research was supported by RGC grant HKUST 6070/01H from the Research Grants Council of Hong Kong.

Journal ArticleDOI
TL;DR: The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples, and are comparable in efficiency in the estimation of the parameters for all levels of censoring.
Abstract: We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol.