Showing papers on "Semiparametric model" published in 2006


Journal ArticleDOI
TL;DR: The estimation of a class of copula-based semiparametric stationary Markov models is studied; simple estimators of the marginal distribution and the copula parameter are provided, and their asymptotic properties are established under easily verifiable conditions.

443 citations
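
To make the two-step idea in the entry above concrete, here is a minimal sketch of one common approach: estimate the marginal distribution with a rescaled empirical CDF, then choose the copula parameter that maximizes the pseudo-likelihood of consecutive pairs. The Gaussian copula, the grid search, and the function names `rescaled_ecdf`, `gaussian_copula_loglik` and `fit_copula_markov` are all assumptions made for illustration; the listing does not say which copula family or optimizer the paper uses.

```python
import numpy as np
from statistics import NormalDist


def rescaled_ecdf(x):
    """Empirical CDF values rescaled by n + 1 so they stay strictly inside (0, 1)."""
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / (len(x) + 1)


def gaussian_copula_loglik(r, a, b):
    """Gaussian copula log-likelihood with correlation r at normal scores a, b."""
    s = 1.0 - r * r
    quad = r * r * (a * a + b * b) - 2.0 * r * a * b
    return np.sum(-0.5 * np.log(s) - quad / (2.0 * s))


def fit_copula_markov(x, grid=None):
    """Two-step fit: nonparametric marginal, then copula parameter by grid search."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 191)  # illustrative grid, step 0.01
    u = rescaled_ecdf(np.asarray(x, dtype=float))
    inv_cdf = NormalDist().inv_cdf
    z = np.array([inv_cdf(v) for v in u])  # normal scores of the pseudo-observations
    a, b = z[:-1], z[1:]  # consecutive pairs (X_{t-1}, X_t)
    loglik = [gaussian_copula_loglik(r, a, b) for r in grid]
    return grid[int(np.argmax(loglik))]


if __name__ == "__main__":
    # Simulated first-order Markov chain with a Gaussian copula and lognormal marginal.
    rng = np.random.default_rng(0)
    n, rho = 2000, 0.5
    z = np.empty(n)
    z[0] = rng.standard_normal()
    for t in range(1, n):
        z[t] = rho * z[t - 1] + np.sqrt(1.0 - rho**2) * rng.standard_normal()
    print(fit_copula_markov(np.exp(z)))
```

Because the first step uses only ranks, the fit is unchanged by any increasing transformation of the series, which is why the marginal can be left unspecified.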


Journal ArticleDOI
TL;DR: In this paper, the authors study the asymptotic behavior of the estimate of the CDR space with high-dimensional covariates, that is, when the dimension of the covariates goes to infinity as the sample size goes to infinity.
Abstract: Sliced inverse regression is a promising method for the estimation of the central dimension-reduction subspace (CDR space) in semiparametric regression models. It is particularly useful in tackling cases with high-dimensional covariates. In this article we study the asymptotic behavior of the estimate of the CDR space with high-dimensional covariates, that is, when the dimension of the covariates goes to infinity as the sample size goes to infinity. Strong and weak convergence are obtained. We also suggest an estimation procedure of the Bayes information criterion type to ascertain the dimension of the CDR space and derive the consistency. A simulation study is conducted.

267 citations
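
For readers unfamiliar with sliced inverse regression, the standard fixed-dimension algorithm can be sketched as follows: standardize the covariates, slice the sample by the sorted response, average the standardized covariates within each slice, and take the leading eigenvectors of the weighted covariance of those slice means. This is a textbook sketch, not the high-dimensional analysis of the entry above; the function name `sliced_inverse_regression` and the default of 10 slices are assumptions.

```python
import numpy as np


def sliced_inverse_regression(X, y, n_slices=10, n_directions=1):
    """Estimate dimension-reduction directions by sliced inverse regression."""
    n, p = X.shape
    # Standardize: Z = (X - mean) @ Sigma^{-1/2}
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt
    # Slice on the sorted response and accumulate the weighted
    # outer products of the within-slice means of Z.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, vecs = np.linalg.eigh(M)
    top = vecs[:, ::-1][:, :n_directions]
    # Map back to the original covariate scale and normalize each direction.
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 5))
    beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
    y = (X @ beta) ** 3 + 0.1 * rng.standard_normal(1000)
    print(sliced_inverse_regression(X, y)[:, 0])
```

The directions are identified only up to sign and scale, so comparisons with a true direction should use the absolute inner product of normalized vectors.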


Journal ArticleDOI
TL;DR: In this paper, efficient, constructible and practicable estimators of partially linear single-index models (PLSIMs) are designed, with applications to time series. The proposed technique answers two questions from Carroll et al. [Generalized partially linear single-index models, J. Amer. Statist. Assoc. 92 (1997) 477-489]: no root-n pilot estimator for the single-index part of the model is needed, and complexity parameters can be selected at the optimal smoothing rate.

225 citations


01 Jan 2006
TL;DR: Two methods for estimating parameters in the correlation structure (a quasi-likelihood approach and a minimum generalized variance method) are proposed, along with a profile weighted least squares procedure for estimating the model coefficients.

214 citations


Journal ArticleDOI
TL;DR: In this paper, a class of semiparametric transformation models is proposed to characterise the effects of possibly time-varying covariates on the intensity functions of counting processes, which includes the proportional intensity model and linear transformation models as special cases.
Abstract: A class of semiparametric transformation models is proposed to characterise the effects of possibly time-varying covariates on the intensity functions of counting processes. The class includes the proportional intensity model and linear transformation models as special cases. Nonparametric maximum likelihood estimators are developed for the regression parameters and cumulative intensity functions of these models based on censored data. The estimators are shown to be consistent and asymptotically normal. The limiting variances for the estimators of the regression parameters achieve the semiparametric efficiency bounds and can be consistently estimated. The limiting variances for the estimators of smooth functionals of the cumulative intensity function can also be consistently estimated. Simulation studies reveal that the proposed inference procedures perform well in practical settings. Two medical studies are provided.

207 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a sieve maximum likelihood estimation procedure for a broad class of semiparametric multivariate distributions, characterized by a parametric copula function evaluated at nonparametric marginal distributions.
Abstract: We propose a sieve maximum likelihood estimation procedure for a broad class of semiparametric multivariate distributions. A joint distribution in this class is characterized by a parametric copula function evaluated at nonparametric marginal distributions. This class of distributions has gained popularity in diverse fields due to its flexibility in separately modeling the dependence structure and the marginal behaviors of a multivariate random variable, and its circumvention of the “curse of dimensionality” associated with purely nonparametric multivariate distributions. We show that the plug-in sieve maximum likelihood estimators (MLEs) of all smooth functionals, including the finite-dimensional copula parameters and the unknown marginal distributions, are semiparametrically efficient, and that their asymptotic variances can be estimated consistently. Moreover, prior restrictions on the marginal distributions can be easily incorporated into the sieve maximum likelihood estimation procedure to achieve fu...

206 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe simple and reliable inference procedures based on the least-squares principle for the semiparametric accelerated failure time model with right-censored data; the proposed estimator is shown to be consistent and asymptotically normal.
Abstract: The semiparametric accelerated failure time model relates the logarithm of the failure time linearly to the covariates while leaving the error distribution unspecified. The present paper describes simple and reliable inference procedures based on the least-squares principle for this model with right-censored data. The proposed estimator of the vector-valued regression parameter is an iterative solution to the Buckley-James estimating equation with a preliminary consistent estimator as the starting value. The estimator is shown to be consistent and asymptotically normal. A novel resampling procedure is developed for the estimation of the limiting covariance matrix. Extensions to marginal models for multivariate failure time data are considered. The performance of the new inference procedures is assessed through simulation studies. Illustrations with medical studies are provided.

181 citations


Book
01 Jan 2006
TL;DR: This book covers parametric mixed-effects models, nonparametric regression smoothers (local polynomial, regression spline, smoothing spline and penalized spline methods), semiparametric models, time-varying coefficient models and discrete longitudinal data.
Abstract: Preface. Acronyms. 1. Introduction. 2. Parametric Mixed-Effects Models. 3. Nonparametric Regression Smoothers. 4. Local Polynomial Methods. 5. Regression Spline Methods. 6. Smoothing Splines Methods. 7. Penalized Spline Methods. 8. Semiparametric Models. 9. Time-Varying Coefficient Models. 10. Discrete Longitudinal Data. References. Index.

171 citations


Journal ArticleDOI
TL;DR: In this paper, the authors develop asymptotic optimality theory for statistical treatment rules in smooth parametric and semiparametric models, motivated by the argument that choosing treatments to maximize social welfare is distinct from the point estimation and hypothesis testing problems usually considered in the treatment effects literature.
Abstract: This paper develops asymptotic optimality theory for statistical treatment rules in smooth parametric and semiparametric models. Manski (2000, 2002, 2004) and Dehejia (2005) have argued that the problem of choosing treatments to maximize social welfare is distinct from the point estimation and hypothesis testing problems usually considered in the treatment effects literature, and advocate formal analysis of decision procedures that map empirical data into treatment choices. We develop large-sample approximations to statistical treatment assignment problems in both randomized experiments and observational data settings in which treatment effects are identified. We derive a local asymptotic minmax regret bound on social welfare, and a local asymptotic risk bound for a two-point loss function. We show that certain natural treatment assignment rules attain these bounds.

150 citations


Journal ArticleDOI
TL;DR: In this paper, a class of transformation models for survival data with a cure fraction is proposed, motivated by biological considerations and includes both the proportional hazards and the proportional odds cure models as two special cases.
Abstract: We propose a class of transformation models for survival data with a cure fraction. The class of transformation models is motivated by biological considerations and includes both the proportional hazards and the proportional odds cure models as two special cases. An efficient recursive algorithm is proposed to calculate the maximum likelihood estimators (MLEs). Furthermore, the MLEs for the regression coefficients are shown to be consistent and asymptotically normal, and their asymptotic variances attain the semiparametric efficiency bound. Simulation studies are conducted to examine the finite-sample properties of the proposed estimators. The method is illustrated on data from a clinical trial involving the treatment of melanoma.

132 citations


Journal ArticleDOI
TL;DR: This article proposed profile kernel and backfitting estimation methods for a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function.
Abstract: The paper considers a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function. Special cases in our approach include marginal models for longitudinal or clustered data, conditional logistic regression for matched case–control studies, multivariate measurement error models, generalized linear mixed models with a semiparametric component, and many others. We propose profile kernel and backfitting estimation methods for these problems, derive their asymptotic distributions and show that in likelihood problems the methods are semiparametric efficient. Although generally not true, it transpires that with our methods profiling and backfitting are asymptotically equivalent. We also consider pseudolikelihood methods where some nuisance parameters are estimated from a different algorithm. The methods proposed are evaluated by using simulation studies and applied to the Kenya haemoglobin data.

Journal ArticleDOI
TL;DR: In this article, a semiparametric factor model is proposed to approximate the implied volatility surface (IVS) in a finite dimensional function space, which is tailored to the degenerated design of IVS data.
Abstract: We propose a semiparametric factor model, which approximates the implied volatility surface (IVS) in a finite dimensional function space. Unlike standard principal component approaches typically used to reduce complexity, our approach is tailored to the degenerated design of IVS data. In particular, we only fit in the local neighborhood of the design points by exploiting the expiry effect present in option data. Using DAX index option data, we estimate the nonparametric components and a low-dimensional time series of latent factors. The modeling approach is completed by studying vector autoregressive models fitted to the latent factors.

Journal ArticleDOI
TL;DR: This paper proposed a global semiparametric quantile regression model that has the ability to estimate conditional quantiles without the usual distributional assumptions, and developed a new model assessment tool for longitudinal growth data.
Abstract: Growth charts are often more informative when they are customized per subject, taking into account prior measurements and possibly other covariates of the subject. We study a global semiparametric quantile regression model that has the ability to estimate conditional quantiles without the usual distributional assumptions. The model can be estimated from longitudinal reference data with irregular measurement times and with some level of robustness against outliers, and it is also flexible for including covariate information. We propose a rank score test for large sample inference on covariates, and develop a new model assessment tool for longitudinal growth data. Our research indicates that the global model has the potential to be a very useful tool in conditional growth chart analysis.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a natural bandwidth choice by joint maximization of the M-estimation criterion with respect to the parameter of interest and the bandwidth, and prove asymptotic normality for their estimator.

Journal ArticleDOI
TL;DR: In the paper the case series model for estimating the association between an age-dependent exposure and an outcome event is presented in greater generality than hitherto, including more general discussion of its derivation, underlying assumptions, applicability, limitations and efficiency.
Abstract: The case series model for estimating the association between an age-dependent exposure and an outcome event requires information only on cases and implicitly adjusts for all age-independent multiplicative confounders, while allowing for an age-dependent base-line incidence. In the paper the model is presented in greater generality than hitherto, including more general discussion of its derivation, underlying assumptions, applicability, limitations and efficiency. A semiparametric version of the model is developed, in which the age-specific relative incidence is left unspecified. Modelling covariate effects and testing assumptions are discussed. The small sample performance of this model is studied in simulations. The methods are illustrated with several examples from epidemiology.

Journal ArticleDOI
TL;DR: A semiparametric model is proposed for the marginal recurrent event rate, wherein the covariates are assumed to add to the unspecified baseline rate and estimators of the regression parameters and baseline rate are shown to be consistent and asymptotically Gaussian.
Abstract: Recurrent event data often arise in biomedical studies, with examples including hospitalizations, infections, and treatment failures. In observational studies, it is often of interest to estimate the effects of covariates on the marginal recurrent event rate. The majority of existing rate regression methods assume multiplicative covariate effects. We propose a semiparametric model for the marginal recurrent event rate, wherein the covariates are assumed to add to the unspecified baseline rate. Covariate effects are summarized by rate differences, meaning that the absolute effect on the rate function can be determined from the regression coefficient alone. We describe modifications of the proposed method to accommodate a terminating event (e.g., death). Proposed estimators of the regression parameters and baseline rate are shown to be consistent and asymptotically Gaussian. Simulation studies demonstrate that the asymptotic approximations are accurate in finite samples. The proposed methods are applied to a state-wide kidney transplant data set.

Journal ArticleDOI
TL;DR: Panel count data with informative observation times is studied with nonparametric and semiparametric proportional rate models for the underlying event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively.
Abstract: In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and observation times are considered as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximising a conditional likelihood function of observed event counts and solving estimation equations. Large-sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumour study is presented.

Journal ArticleDOI
TL;DR: In this paper, a hierarchical semiparametric model for dynamic binary longitudinal responses is presented, where the authors propose a new O(N) Markov chain Monte Carlo based algorithm for estimation of the nonparametric function when the errors are correlated.
Abstract: This article deals with the analysis of a hierarchical semiparametric model for dynamic binary longitudinal responses. The main complicating components of the model are an unknown covariate function and serial correlation in the errors. Existing estimation methods for models with these features are of O(N^3), where N is the total number of observations in the sample. Therefore, nonparametric estimation is largely infeasible when the sample size is large, as is typical in the longitudinal setting. Here we propose a new O(N) Markov chain Monte Carlo based algorithm for estimation of the nonparametric function when the errors are correlated, thus contributing to the growing literature on semiparametric and nonparametric mixed-effects models for binary data. In addition, we address the problem of model choice to enable the formal comparison of our semiparametric model with competing parametric and semiparametric specifications. The performance of the methods is illustrated with detailed studies involving simul...

Posted Content
TL;DR: In this article, the authors present a review of recent advances in nonparametric and semiparametric estimation, with an emphasis on applicability to empirical research and on resolving issues that arise in implementation.
Abstract: This chapter reviews recent advances in nonparametric and semiparametric estimation, with an emphasis on applicability to empirical research and on resolving issues that arise in implementation. It considers techniques for estimating densities, conditional mean functions, derivatives of functions and conditional quantiles in a flexible way that imposes minimal functional form assumptions. The chapter begins by illustrating how flexible modeling methods have been applied in empirical research, drawing on recent examples of applications from labor economics, consumer demand estimation and treatment effects models. Then, key concepts in semiparametric and nonparametric modeling are introduced that do not have counterparts in parametric modeling, such as the so-called curse of dimensionality, the notion of models with an infinite number of parameters, the criteria used to define optimal convergence rates, and "dimension-free" estimators. After defining these new concepts, a large literature on nonparametric estimation is reviewed and a unifying framework presented for thinking about how different approaches relate to one another. Local polynomial estimators are discussed in detail and their distribution theory is developed. The chapter then shows how nonparametric estimators form the building blocks for many semiparametric estimators, such as estimators for average derivatives, index models, partially linear models, and additively separable models. Semiparametric methods offer a middle ground between fully nonparametric and parametric approaches. Their main advantage is that they typically achieve faster rates of convergence than fully nonparametric approaches. In many cases, they converge at the parametric rate. The second part of the chapter considers in detail two issues that are central with regard to implementing flexible modeling methods: how to select the values of smoothing parameters in an optimal way and how to implement "trimming" procedures. It also reviews newly developed techniques for deriving the distribution theory of semiparametric estimators. The chapter concludes with an overview of approximation methods that speed up the computation of nonparametric estimates and make flexible estimation feasible even in very large size samples.
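
As an illustration of the local polynomial estimators mentioned in the abstract above, here is a minimal sketch of the degree-one case (local linear regression): at each evaluation point, fit a kernel-weighted least squares line and read off the intercept as the conditional mean estimate and the slope as the derivative estimate. The Gaussian kernel, the fixed bandwidth, and the function name `local_linear` are assumptions made for illustration; the chapter itself treats bandwidth selection as a separate problem.

```python
import numpy as np


def local_linear(x, y, x0, bandwidth):
    """Local linear estimates of E[y | x = x0] and its derivative at x0."""
    u = (x - x0) / bandwidth
    w = np.exp(-0.5 * u**2)  # Gaussian kernel weights (assumed choice)
    # Weighted least squares of y on (1, x - x0): because the regressor is
    # centered at x0, the intercept is the fit at x0 and the slope is the
    # estimated derivative there.
    D = np.column_stack([np.ones_like(x), x - x0])
    WD = D * w[:, None]
    coef = np.linalg.solve(D.T @ WD, WD.T @ y)
    return coef[0], coef[1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, 500)
    y = np.sin(2.0 * np.pi * x) + 0.2 * rng.standard_normal(500)
    fit, slope = local_linear(x, y, x0=0.25, bandwidth=0.05)
    print(fit, slope)
```

A smaller bandwidth reduces bias but increases variance, which is the trade-off behind the smoothing parameter selection discussed in the second part of the chapter.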

Journal ArticleDOI
TL;DR: In this article, an empirical log-likelihood ratio for the parametric components is proposed and a nonparametric version of Wilks' theorem is derived, so that confidence regions for the parametric components with asymptotically correct coverage probabilities can be constructed.

Posted Content
TL;DR: In this article, a likelihood-based estimator for a double index, semiparametric binary response equation is proposed, which is based on density estimation under local smoothing.
Abstract: This paper formulates a likelihood-based estimator for a double index, semiparametric binary response equation. A novel feature of this estimator is that it is based on density estimation under local smoothing. While the proofs differ from those based on alternative density estimators, the finite sample performance of the estimator is significantly improved. As binary responses often appear as endogenous regressors in continuous outcome equations, we also develop an optimal instrumental variables estimator in this context. For this purpose, we specialize the double index model for binary response to one with heteroscedasticity that depends on an index different from that underlying the "mean-response". We show that such (multiplicative) heteroscedasticity, whose form is not parametrically specified, effectively induces exclusion restrictions on the outcomes equation. The estimator developed below exploits such identifying information. We provide simulation evidence on the favorable performance of the estimators and illustrate their use through an empirical application on the determinants, and effect, of attendance at a government-financed school.

Posted Content
TL;DR: In this paper, a concrete formula for the asymptotic distribution of two-step, possibly non-smooth semiparametric M-estimators under general misspecification is developed.
Abstract: This paper develops a concrete formula for the asymptotic distribution of two-step, possibly non-smooth semiparametric M-estimators under general misspecification. Our regularity conditions are relatively straightforward to verify and also weaker than those available in the literature. The first-stage nonparametric estimation may depend on finite dimensional parameters. We characterize: (1) conditions under which the first-stage estimation of nonparametric components do not affect the asymptotic distribution, (2) conditions under which the asymptotic distribution is affected by the derivatives of the first-stage nonparametric estimator with respect to the finite-dimensional parameters, and (3) conditions under which one can allow non-smooth objective functions. Our framework is illustrated by applying it to three examples: (1) profiled estimation of a single index quantile regression model, (2) semiparametric least squares estimation under model misspecification, and (3) a smoothed matching estimator.

Journal ArticleDOI
TL;DR: In this article, a concrete formula for the asymptotic distribution of two-step, possibly non-smooth semiparametric M-estimators under general misspecification is developed.

Journal ArticleDOI
TL;DR: In this article, a semiparametric spatial regression approach is proposed to avoid the curse of dimensionality that has held back nonparametric methods for spatial models; estimation combines the marginal integration technique with local linear kernel estimation.
Abstract: Nonparametric methods have been very popular in the last couple of decades in time series and regression, but no such development has taken place for spatial models. A rather obvious reason for this is the curse of dimensionality. For spatial data on a grid, evaluating the conditional mean given its closest neighbors requires a four-dimensional nonparametric regression. In this paper a semiparametric spatial regression approach is proposed to avoid this problem. An estimation procedure based on combining the so-called marginal integration technique with local linear kernel estimation is developed in the semiparametric spatial regression setting. Asymptotic distributions are established under some mild conditions. The same convergence rates as in the one-dimensional regression case are established. An application of the methodology to the classical Mercer and Hall wheat data set is given and indicates that one directional component appears to be nonlinear, which has gone unnoticed in earlier analyses.

Journal ArticleDOI
TL;DR: A novel method of estimating functions in information geometry is used to estimate the shape parameter of a gamma distribution of interspike intervals without estimating the unknown firing-rate function, and an optimal estimating function is obtained analytically for the shape parameter independent of the functional form of the firing rate.
Abstract: We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. A gamma distribution is a natural extension of the Poisson process taking the effect of a refractory period into account. The model is specified by two parameters: a time-dependent firing rate and a shape parameter that characterizes spiking irregularities of individual neurons. Because the environment changes over time, observed data are generated from a model with a time-dependent firing rate, which is an unknown function. A statistical model with an unknown function is called a semiparametric model and is generally very difficult to solve. We used a novel method of estimating functions in information geometry to estimate the shape parameter without estimating the unknown function. We obtained an optimal estimating function analytically for the shape parameter independent of the functional form of the firing rate. This estimation is efficient without Fisher information loss and better than maximum likelihood estimation. We suggest a measure of spiking irregularity based on the estimating function, which may be useful for characterizing individual neurons in changing environments.

Journal ArticleDOI
TL;DR: In this paper, a general class of semiparametric transformation models is studied for analyzing survival data from the case-cohort design, which was introduced by Prentice (1986), and weighted estimating equations are proposed for simultaneous estimation of the regression parameters and the transformation function.
Abstract: A general class of semiparametric transformation models is studied for analysing survival data from the case-cohort design, which was introduced by Prentice (1986). Weighted estimating equations are proposed for simultaneous estimation of the regression parameters and the transformation function. It is shown that the resulting regression estimators are asymptotically normal, with variance-covariance matrix that has a closed form and can be consistently estimated by the usual plug-in method. Simulation studies show that the proposed approach is appropriate for practical use. An application to a case-cohort dataset from the Atherosclerosis Risk in Communities study is also given to illustrate the methodology.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a family of consistent estimators and investigated their asymptotic properties, showing that the optimal efficiency bound can be reached by a semiparametric kernel estimator in this family.
Abstract: We study the heteroscedastic partially linear model with an unspecified partial baseline component and a nonparametric variance function. An interesting finding is that the performance of a naive weighted version of the existing estimator could deteriorate when the smooth baseline component is badly estimated. To avoid this, we propose a family of consistent estimators and investigate their asymptotic properties. We show that the optimal semiparametric efficiency bound can be reached by a semiparametric kernel estimator in this family. Building upon our theoretical findings and heuristic arguments about the equivalence between kernel and spline smoothing, we conjecture that a weighted partial spline estimator could also be semiparametric efficient. Properties of the proposed estimators are presented through theoretical illustration and numerical simulations.

Journal Article
TL;DR: In this article, the authors focus on the estimation of the effect of risk factors on interval-censored data under the semiparametric additive hazards model and use a nonparametric step-function to characterize the baseline hazard function.
Abstract: Interval-censored event time data often arise in medical and public health studies. In such a setting, the exact time of the event of interest cannot be observed and is only known to fall between two monitoring times. Our interest focuses on the estimation of the effect of risk factors on interval-censored data under the semiparametric additive hazards model. A nonparametric step-function is used to characterize the baseline hazard function. The covariate coefficients are estimated by maximizing the observed likelihood function, and their variances are obtained using the profile likelihood approach. We show that the proposed estimates are consistent and have asymptotic normal distributions. We also show that the estimator obtained for the covariate coefficient is the most efficient estimator. Simulation studies are conducted to assess the performance of the estimate. The method is illustrated through application to a data set from an HIV study.

Journal ArticleDOI
TL;DR: A dynamic frailty model and Bayesian semiparametric approach to inference is proposed to allow subject-specific frailties to change dynamically with age while also accommodating nonproportional hazards.
Abstract: Many biomedical studies collect data on times of occurrence for a health event that can occur repeatedly, such as infection, hospitalization, recurrence of disease, or tumor onset. To analyze such data, it is necessary to account for within-subject dependency in the multiple event times. Motivated by data from studies of palpable tumors, this article proposes a dynamic frailty model and Bayesian semiparametric approach to inference. The widely used shared frailty proportional hazards model is generalized to allow subject-specific frailties to change dynamically with age while also accommodating nonproportional hazards. Parametric assumptions on the frailty distribution are avoided by using Dirichlet process priors for a shared frailty and for multiplicative innovations on this frailty. By centering the semiparametric model on a conditionally conjugate dynamic gamma model, we facilitate posterior computation and lack-of-fit assessments of the parametric model. Our proposed method is demonstrated using data from a cancer chemoprevention study.

Journal ArticleDOI
TL;DR: A finite sample criterion based on cross validation that can be used to select a nuisance parameter model from a list of candidate models is proposed, and it is shown that the expected value of this criterion is minimized by the nuisance parameter model that yields the estimator of the parameter of interest with the smallest mean-squared error relative to the expected value of an initial consistent reference estimator.