
Showing papers on "Semiparametric model published in 2011"


Journal ArticleDOI
Simon N. Wood
TL;DR: In this article, a Laplace approximation is used to obtain an approximate restricted maximum likelihood (REML) or marginal likelihood (ML) for smoothing parameter selection in semiparametric regression.
Abstract: Summary. Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selection in semiparametric regression. However, existing REML or marginal likelihood (ML) based methods for semiparametric generalized linear models (GLMs) use iterative REML or ML estimation of the smoothing parameters of working linear approximations to the GLM. Such indirect schemes need not converge and fail to do so in a non-negligible proportion of practical analyses. By contrast, very reliable prediction error criteria smoothing parameter selection methods are available, based on direct optimization of GCV, or related criteria, for the GLM itself. Since such methods directly optimize properly defined functions of the smoothing parameters, they have much more reliable convergence properties. The paper develops the first such method for REML or ML estimation of smoothing parameters. A Laplace approximation is used to obtain an approximate REML or ML for any GLM, which is suitable for efficient direct optimization. This REML or ML criterion requires that Newton–Raphson iteration, rather than Fisher scoring, be used for GLM fitting, and a computationally stable approach to this is proposed. The REML or ML criterion itself is optimized by a Newton method, with the derivatives required obtained by a mixture of implicit differentiation and direct methods. The method will cope with numerical rank deficiency in the fitted model and in fact provides a slight improvement in numerical robustness on the earlier method of Wood for prediction error criteria based smoothness selection. 
Simulation results suggest that the new REML and ML methods offer some improvement in mean-square error performance relative to GCV or Akaike's information criterion in most cases, without the small number of severe undersmoothing failures to which Akaike's information criterion and GCV are prone. This is achieved at the same computational cost as GCV or Akaike's information criterion. The new approach also eliminates the convergence failures of previous REML- or ML-based approaches for penalized GLMs and usually has lower computational cost than these alternatives. Example applications are presented in adaptive smoothing, scalar on function regression and generalized additive model selection.
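For the Gaussian special case, the marginal likelihood (ML) criterion that the paper optimizes directly can be written down in a few lines. The sketch below is illustrative only (a hand-rolled truncated-power basis and a grid search, not the paper's implementation): the smooth is treated as a random effect, the variance is profiled out, and the smoothing parameter is chosen by scanning the profiled criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

# Truncated-power cubic spline basis for the penalized part (illustrative).
knots = np.linspace(0.1, 0.9, 8)
X = np.column_stack([np.clip(x - k, 0.0, None) ** 3 for k in knots])
X -= X.mean(axis=0)  # crude centering

def neg2_marginal_loglik(lam):
    """-2 x profiled log marginal likelihood of y ~ N(0, s2 * (I + X X'/lam))."""
    V = np.eye(n) + X @ X.T / lam
    _, logdet = np.linalg.slogdet(V)
    q = y @ np.linalg.solve(V, y)
    return n * np.log(q / n) + logdet   # variance profiled out; constants dropped

lams = np.logspace(-4, 6, 60)
scores = np.array([neg2_marginal_loglik(l) for l in lams])
lam_hat = lams[int(np.argmin(scores))]
print(f"ML-selected smoothing parameter: {lam_hat:.3g}")
```

The paper's contribution is making this kind of direct optimization work for general GLMs via a Laplace approximation and Newton iteration with implicit differentiation; the Gaussian grid search above only conveys the shape of the criterion.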

4,846 citations


Journal ArticleDOI
TL;DR: This work proposes adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and proves that the methods possess the oracle property.
Abstract: The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.
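The composite quantile regression objective is concrete enough to sketch: one common slope, one intercept per quantile level, and a sum of check losses. Below is a minimal coordinate-descent sketch (not the authors' algorithm; the intercept step is closed form, the slope step a one-dimensional convex minimization):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
y = 2.0 * x + rng.standard_t(df=3, size=n)   # heavy-tailed errors

taus = np.arange(1, 10) / 10.0               # nine quantile levels

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}), summed."""
    return np.sum(u * (tau - (u < 0.0)))

def cqr_fit(x, y, taus, iters=30):
    """Coordinate descent: common slope b, one intercept per quantile level."""
    b = 0.0
    for _ in range(iters):
        a = np.quantile(y - b * x, taus)     # closed-form intercept update
        obj = lambda bb: sum(check_loss(y - a[k] - bb * x, t)
                             for k, t in enumerate(taus))
        b = minimize_scalar(obj, bounds=(-10.0, 10.0), method="bounded").x
    return b, a

b_hat, _ = cqr_fit(x, y, taus)
print(f"CQR slope estimate: {b_hat:.3f}")
```

With t(3) errors the composite loss is far more efficient than least squares, which is the efficiency comparison the abstract quantifies.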

265 citations


Book
22 Jun 2011
TL;DR: In this book, the authors present smoothing spline regression models based on reproducing kernel Hilbert spaces, covering model spaces for polynomial splines, smoothing parameter selection, smoothing spline ANOVA, and semiparametric extensions, with applications including the Wisconsin Epidemiological Study of Diabetic Retinopathy.
Abstract: Introduction Parametric and Nonparametric Regression Polynomial Splines Scope of This Book The assist Package Smoothing Spline Regression Reproducing Kernel Hilbert Space Model Space for Polynomial Splines General Smoothing Spline Regression Models Penalized Least Squares Estimation The ssr Function Another Construction for Polynomial Splines Periodic Splines Thin-Plate Splines Spherical Splines Partial Splines L-Splines Smoothing Parameter Selection and Inference Impact of the Smoothing Parameter Trade-Offs Unbiased Risk Cross-Validation and Generalized Cross-Validation Bayes and Linear Mixed-Effects Models Generalized Maximum Likelihood Comparison and Implementation Confidence Intervals Hypothesis Tests Smoothing Spline ANOVA Multiple Regression Tensor Product Reproducing Kernel Hilbert Spaces One-Way SS ANOVA Decomposition Two-Way SS ANOVA Decomposition General SS ANOVA Decomposition SS ANOVA Models and Estimation Selection of Smoothing Parameters Confidence Intervals Examples Spline Smoothing with Heteroscedastic and/or Correlated Errors Problems with Heteroscedasticity and Correlation Extended SS ANOVA Models Variance and Correlation Structures Examples Generalized Smoothing Spline ANOVA Generalized SS ANOVA Models Estimation and Inference Wisconsin Epidemiological Study of Diabetic Retinopathy Smoothing Spline Estimation of Variance Functions Smoothing Spline Spectral Analysis Smoothing Spline Nonlinear Regression Motivation Nonparametric Nonlinear Regression Models Estimation with a Single Function Estimation with Multiple Functions The nnr Function Examples Semiparametric Regression Motivation Semiparametric Linear Regression Models Semiparametric Nonlinear Regression Models Examples Semiparametric Mixed-Effects Models Linear Mixed-Effects Models Semiparametric Linear Mixed-Effects Models Semiparametric Nonlinear Mixed-Effects Models Examples Appendix A: Data Sets Appendix B: Codes for Fitting Strictly Increasing Functions Appendix C: Codes for Term Structure of Interest Rates References Author Index Subject Index

232 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a more robust modeling approach by considering the model for the nonresponding part as an exponential tilting of the model of the responding part, which can be justified under the assumption that the response probability can be expressed as a semiparametric logistic regression model.
Abstract: Parameter estimation with nonignorable missing data is a challenging problem in statistics. The fully parametric approach for joint modeling of the response model and the population model can produce results that are quite sensitive to the failure of the assumed model. We propose a more robust modeling approach by considering the model for the nonresponding part as an exponential tilting of the model for the responding part. The exponential tilting model can be justified under the assumption that the response probability can be expressed as a semiparametric logistic regression model. In this paper, based on the exponential tilting model, we propose a semiparametric estimation method of mean functionals with nonignorable missing data. A semiparametric logistic regression model is assumed for the response probability and a nonparametric regression approach for missing data discussed in Cheng (1994) is used in the estimator. By adopting nonparametric components for the model, the estimation method can be mad...
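A minimal sketch of the exponential tilting idea for estimating a mean with nonignorable missingness, assuming the tilting parameter gamma is known and using a Gaussian kernel for the nonparametric regression (both simplifications; the paper addresses estimation of these components):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                    # population mean of y is 0

# Nonignorable response: probability depends on the outcome itself.
gamma = -0.5                                  # tilting parameter, taken as known here
p_resp = 1.0 / (1.0 + np.exp(-(1.0 + gamma * y)))
delta = rng.uniform(size=n) < p_resp          # True = observed

# Under this logistic response model, f(y | x, delta=0) is an exponential
# tilting exp(-gamma*y) of f(y | x, delta=1), up to a factor depending on x.
xo, yo = x[delta], y[delta]
h = 0.3                                       # kernel bandwidth (ad hoc)

def m0(x0):
    """Kernel estimate of E[Y | X=x0, delta=0] from tilted respondent data."""
    w = np.exp(-0.5 * ((x0 - xo) / h) ** 2) * np.exp(-gamma * yo)
    return np.sum(w * yo) / np.sum(w)

m0x = np.array([m0(xi) for xi in x])
mu_hat = np.mean(np.where(delta, y, m0x))     # impute nonrespondents
naive = yo.mean()                             # biased respondent-only mean
print(f"tilting estimate: {mu_hat:.3f}, respondent mean: {naive:.3f}")
```

The respondent-only mean is biased downward because large outcomes are under-observed; the tilted kernel regression corrects the imputation for nonrespondents.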

161 citations


Journal ArticleDOI
TL;DR: In this article, fast large-sample tests for assessing goodness of fit are obtained by means of multiplier central limit theorems, and the resulting procedures are shown to be asymptotically valid when based on popular method-of-moment estimators.
Abstract: Goodness-of-fit tests are a fundamental element in the copula-based modeling of multivariate continuous distributions. Among the different procedures proposed in the literature, recent large scale simulations suggest that one of the most powerful tests is based on the empirical process comparing the empirical copula with a parametric estimate of the copula derived under the null hypothesis. As for most of the currently available goodness-of-fit procedures for copula models, the null distribution of the statistic for the latter test is obtained by means of a parametric bootstrap. The main inconvenience of this approach is its high computational cost, which, as the sample size increases, can be regarded as an obstacle to its application. In this work, fast large-sample tests for assessing goodness of fit are obtained by means of multiplier central limit theorems. The resulting procedures are shown to be asymptotically valid when based on popular method-of-moment estimators. Large scale Monte Carlo experiments, involving six frequently used parametric copula families and three different estimators of the copula parameter, confirm that the proposed procedures provide a valid, much faster alternative to the corresponding parametric bootstrap-based tests. An application of the derived tests to the modeling of a well-known insurance data set is presented. The use of the multiplier approach instead of the parametric bootstrap can reduce the computing time from about a day to minutes.
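The multiplier idea can be illustrated in a stripped-down setting with no parameters to estimate: testing uniformity with a Cramér-von Mises-type statistic. Instead of refitting on freshly simulated samples (the parametric bootstrap), each replicate simply reweights the centered terms with i.i.d. mean-zero, variance-one multipliers. This is a sketch of the generic device, not the copula-specific procedures of the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n, B = 200, 500
u = rng.uniform(size=n)                        # data; H0: U(0,1)

grid = (np.arange(1, 101) - 0.5) / 100.0       # integration grid on (0,1)
ind = (u[:, None] <= grid[None, :]).astype(float)   # n x grid indicators

# Observed Cramer-von-Mises-type statistic of the empirical process.
Gn = np.sqrt(n) * (ind.mean(axis=0) - grid)
S_obs = np.mean(Gn ** 2)

# Multiplier replicates: reweight the centered terms 1{U_i <= u} - u with
# i.i.d. N(0,1) multipliers -- no re-simulation, no refitting.
centered = ind - grid
xi = rng.normal(size=(B, n))
G_rep = xi @ centered / np.sqrt(n)             # B x grid multiplier processes
S_rep = np.mean(G_rep ** 2, axis=1)

pval = np.mean(S_rep >= S_obs)
print(f"statistic {S_obs:.3f}, multiplier p-value {pval:.3f}")
```

All B replicates come from one matrix multiplication, which is the source of the day-to-minutes speedup the abstract reports; the copula setting adds the parametric estimation step to the centered terms.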

109 citations


Journal ArticleDOI
TL;DR: Targeted maximum likelihood estimators (TMLEs) whose parametric submodel respects the global bounds on continuous outcomes are especially suitable for dealing with positivity violations because, in addition to being double robust and semiparametric efficient, they are substitution estimators.
Abstract: There is an active debate in the literature on censored data about the relative performance of model based maximum likelihood estimators, IPCW-estimators, and a variety of double robust semiparametric efficient estimators. Kang and Schafer (2007) demonstrate the fragility of double robust and IPCW-estimators in a simulation study with positivity violations. They focus on a simple missing data problem with covariates where one desires to estimate the mean of an outcome that is subject to missingness. Responses by Robins, et al. (2007), Tsiatis and Davidian (2007), Tan (2007) and Ridgeway and McCaffrey (2007) further explore the challenges faced by double robust estimators and offer suggestions for improving their stability. In this article, we join the debate by presenting targeted maximum likelihood estimators (TMLEs). We demonstrate that TMLEs that guarantee that the parametric submodel employed by the TMLE procedure respects the global bounds on the continuous outcomes, are especially suitable for dealing with positivity violations because in addition to being double robust and semiparametric efficient, they are substitution estimators. We demonstrate the practical performance of TMLEs relative to other estimators in the simulations designed by Kang and Schafer (2007) and in modified simulations with even greater estimation challenges.
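A minimal TMLE sketch for the missing-outcome mean with a bounded continuous outcome, using a logistic fluctuation so the targeted update respects the [0, 1] bounds. The initial estimators below are deliberately crude and the setup is not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
x = rng.normal(size=n)
y = np.clip(0.5 + 0.3 * x + 0.1 * rng.normal(size=n), 0.0, 1.0)  # bounded outcome
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + x)))     # P(observed | x)
delta = rng.uniform(size=n) < pi_true

def expit(t): return 1.0 / (1.0 + np.exp(-t))
def logit(p): return np.log(p / (1.0 - p))

Z = np.column_stack([np.ones(n), x])

# Step 1: crude initial outcome regression on observed cases, clipped into (0,1).
beta = np.linalg.lstsq(Z[delta], y[delta], rcond=None)[0]
Q0 = np.clip(Z @ beta, 0.005, 0.995)

# Step 2: propensity model by a few Newton steps of logistic regression.
g_coef = np.zeros(2)
for _ in range(20):
    p = expit(Z @ g_coef)
    grad = Z.T @ (delta - p)
    hess = -(Z * (p * (1 - p))[:, None]).T @ Z
    g_coef -= np.linalg.solve(hess, grad)
g_hat = np.clip(expit(Z @ g_coef), 0.01, None)

# Step 3: targeting -- one-parameter logistic fluctuation with clever
# covariate H(x) = 1/g(x), so the update stays inside (0, 1).
H = 1.0 / g_hat
eps = 0.0
for _ in range(20):
    p = expit(logit(Q0) + eps * H)
    score = np.sum(delta * H * (y - p))
    info = np.sum(delta * H ** 2 * p * (1 - p))
    eps += score / info

Q_star = expit(logit(Q0) + eps * H)
mu_tmle = Q_star.mean()                        # substitution estimator
print(f"TMLE estimate of E[Y]: {mu_tmle:.3f}")
```

Because the final step plugs the targeted regression into the parameter mapping, the estimate can never leave the outcome's range, which is the property the abstract credits for stability under positivity violations.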

101 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider dynamic time series binary choice models with correlated errors and show that no long-run variance estimator is needed for the validity of the smoothed maximum score procedure.
Abstract: This paper considers dynamic time series binary choice models. It proves near epoch dependence and strong mixing for the dynamic binary choice model with correlated errors. Using this result, it shows in a time series setting the validity of the dynamic probit likelihood procedure when lags of the dependent binary variable are used as regressors, and it establishes the asymptotic validity of Horowitz’s smoothed maximum score estimation of dynamic binary choice models with lags of the dependent variable as regressors. For the semiparametric model, the latent error is explicitly allowed to be correlated. It turns out that no long-run variance estimator is needed for the validity of the smoothed maximum score procedure in the dynamic time series framework.
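The smoothed maximum score criterion is easy to write down for a static binary choice model (the paper's dynamic case adds lags of the dependent variable as regressors; the bandwidth and normalization below are illustrative choices, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(5)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
theta_true = 1.0
e = rng.logistic(size=n)                       # median-zero errors
yb = (x1 + theta_true * x2 + e > 0).astype(float)

h = 0.5                                        # smoothing bandwidth (ad hoc)

def smoothed_score(theta):
    """Smoothed score with the coefficient on x1 normalized to 1."""
    idx = x1 + theta * x2
    return np.mean((2.0 * yb - 1.0) * norm.cdf(idx / h))

# Coarse grid then a local refinement, to avoid bad local optima.
grid = np.linspace(-5.0, 5.0, 201)
t0 = grid[int(np.argmax([smoothed_score(t) for t in grid]))]
theta_hat = minimize_scalar(lambda t: -smoothed_score(t),
                            bounds=(t0 - 0.2, t0 + 0.2), method="bounded").x
print(f"smoothed maximum score estimate: {theta_hat:.3f}")
```

Replacing the indicator of the original maximum score criterion with a smooth CDF makes gradient-based optimization and asymptotic normality possible; the paper's point is that this machinery remains valid with lagged dependent variables and correlated latent errors.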

99 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a method for identifying the causal parameter of interest under a nonparametric and semiparametric model, which is applicable not only to a continuous outcome but also to a binary outcome.
Abstract: In medical studies, there are many situations where the final outcomes are truncated by death, in which patients die before outcomes of interest are measured. In this article we consider identifiability and estimation of causal effects by principal stratification when some outcomes are truncated by death. Previous studies mostly focused on large sample bounds, Bayesian analysis, sensitivity analysis. In this article, we propose a new method for identifying the causal parameter of interest under a nonparametric and semiparametric model. We show that the causal parameter of interest is identifiable under some regularity assumptions and the assumption that there exists a pretreatment covariate whose conditional distributions among two principal strata are not the same, but our approach does not need the assumption of a mixture normal distribution for outcomes as required by Zhang, Rubin, and Mealli (2009). Hence, the proposed method is applicable not only to a continuous outcome but also to a binary outcome....

81 citations


Journal ArticleDOI
TL;DR: In this article, a general conditional Markov Chain Monte Carlo (MCMC) method for inference in the wide subclass of these models where the parameters of the marginal stick-breaking process are non-decreasing sequences is proposed.

72 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the weak fractional cointegration model and show that the NBLS estimator is asymptotically biased, and also that the bias can be consistently estimated.
Abstract: We consider estimation of the cointegrating relation in the weak fractional cointegration model, where the strength of the cointegrating relation (difference in memory parameters) is less than one-half. A special case is the stationary fractional cointegration model, which has found important application recently, especially in financial economics. Previous research on this model has considered a semiparametric narrow-band least squares (NBLS) estimator in the frequency domain, but in the stationary case its asymptotic distribution has been derived only under a condition of non-coherence between regressors and errors at the zero frequency. We show that in the absence of this condition, the NBLS estimator is asymptotically biased, and also that the bias can be consistently estimated. Consequently, we introduce a fully modified NBLS estimator which eliminates the bias, and indeed enjoys a faster rate of convergence than NBLS in general. We also show that local Whittle estimation of the integration order of the errors can be conducted consistently based on NBLS residuals, but the estimator has the same asymptotic distribution as if the errors were observed only under the condition of non-coherence. Furthermore, compared to much previous research, the development of the asymptotic distribution theory is based on a different spectral density representation, which is relevant for multivariate fractionally integrated processes, and the use of this representation is shown to result in lower asymptotic bias and variance of the narrow-band estimators. We present simulation evidence and a series of empirical illustrations to demonstrate the feasibility and empirical relevance of our methodology.
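The NBLS estimator itself is a one-liner in the frequency domain: regress the discrete Fourier transform of y on that of x over the first m frequencies only. A sketch on a strongly cointegrated toy example (an I(1) regressor rather than the paper's fractional processes, and no bias modification):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = np.cumsum(rng.normal(size=n))              # persistent (I(1)) regressor
u = rng.normal(size=n)                         # short-memory error
beta_true = 0.8
y = beta_true * x + u

def nbls(y, x, m):
    """Narrow-band least squares over the first m Fourier frequencies."""
    wx = np.fft.fft(x)[1 : m + 1]
    wy = np.fft.fft(y)[1 : m + 1]
    return np.real(np.conj(wx) @ wy) / np.sum(np.abs(wx) ** 2)

m = int(np.sqrt(n))                            # bandwidth: m/n -> 0
beta_hat = nbls(y, x, m)
print(f"NBLS estimate: {beta_hat:.3f}")
```

Restricting to low frequencies is what lets the estimator exploit the memory gap between regressor and error; the paper's fully modified version additionally removes the asymptotic bias that appears when regressors and errors are coherent at frequency zero.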

71 citations


ReportDOI
TL;DR: In this paper, it was shown that the semiparametric efficiency bound for a parameter identified by an unconditional moment restriction with data missing at random (MAR) coincides with that of a particular augmented moment condition problem.
Abstract: This paper shows that the semiparametric efficiency bound for a parameter identified by an unconditional moment restriction with data missing at random (MAR) coincides with that of a particular augmented moment condition problem. The augmented system consists of the inverse probability weighted (IPW) original moment restriction and an additional conditional moment restriction which exhausts all other implications of the MAR assumption. The paper also investigates the value of additional semiparametric restrictions on the conditional expectation function (CEF) of the original moment function given always observed covariates. In the program evaluation context, for example, such restrictions are implied by semiparametric models for the potential outcome CEFs given baseline covariates. The efficiency bound associated with this model is shown to also coincide with that of a particular moment condition problem. Some implications of these results for estimation are briefly discussed.
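The augmented system described here has a familiar sample counterpart for the mean functional: the inverse probability weighted (IPW) term plus a regression-based correction. A sketch with the propensity taken as known for clarity and a hypothetical linear working model for the outcome regression:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)               # E[Y] = 1
pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))          # P(observed | x), MAR given x
delta = rng.uniform(size=n) < pi

# Working model for E[Y | x], fit on observed cases (illustrative).
Z = np.column_stack([np.ones(n), x])
m_coef = np.linalg.lstsq(Z[delta], y[delta], rcond=None)[0]
m_hat = Z @ m_coef

# IPW moment augmented by the conditional-moment correction term.
mu_aipw = np.mean(np.where(delta, y, 0.0) / pi - (delta - pi) / pi * m_hat)
print(f"augmented IPW estimate: {mu_aipw:.3f}")
```

The second term has mean zero whenever the propensity is correct, so it costs nothing in bias while exhausting the remaining implications of MAR; this is the structure whose efficiency bound the paper characterizes.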

Journal ArticleDOI
TL;DR: In this paper, the assumption of the Gumbel distribution is substantially relaxed to include a large class of distributions that is stable with respect to the minimum operation, which leads to a semi-parametric choice model that links the linear combination of travel-related attributes to the choice probabilities via an unknown sensitivity function.
Abstract: The multinomial logit model in discrete choice analysis is widely used in transport research. It has long been known that the Gumbel distribution forms the basis of the multinomial logit model. Although the Gumbel distribution is a good approximation in some applications such as route choice problems, it is chosen mainly for mathematical convenience. This can be restrictive in many other scenarios in practice. In this paper we show that the assumption of the Gumbel distribution can be substantially relaxed to include a large class of distributions that is stable with respect to the minimum operation. The distributions in the class allow heteroscedastic variances. We then seek a transformation that stabilizes the heteroscedastic variances. We show that this leads to a semi-parametric choice model which links the linear combination of travel-related attributes to the choice probabilities via an unknown sensitivity function. This sensitivity function reflects the degree of travelers’ sensitivity to the changes in the combined travel cost. The estimation of the semi-parametric choice model is also investigated and empirical studies are used to illustrate the developed method.

Journal ArticleDOI
TL;DR: In this paper, a difference-based approach is proposed to estimate the linear component based on the differences of the observations and then estimate the nonparametric component by either a kernel or a wavelet thresholding method using the residuals of the linear fit.
Abstract: A commonly used semiparametric partial linear model is considered. We propose analyzing this model using a difference based approach. The procedure estimates the linear component based on the differences of the observations and then estimates the nonparametric component by either a kernel or a wavelet thresholding method using the residuals of the linear fit. It is shown that both the estimator of the linear component and the estimator of the nonparametric component asymptotically perform as well as if the other component were known. The estimator of the linear component is asymptotically efficient and the estimator of the nonparametric component is asymptotically rate optimal. A test for linear combinations of the regression coefficients of the linear component is also developed. Both the estimation and the testing procedures are easily implementable. Numerical performance of the procedure is studied using both simulated and real data. In particular, we demonstrate our method in an analysis of an attitude data set as well as a data set from the Framingham Heart Study.
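The two-stage procedure is short enough to sketch end to end: sort by the nonparametric covariate, first-difference to cancel the smooth component, run OLS on the differences, then smooth the partial residuals (bandwidths here are ad hoc, not the paper's choices):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
x = np.sort(rng.uniform(size=n))               # sorted nonparametric covariate
z = x + rng.normal(size=n)                     # linear covariate, correlated with x
beta_true = 1.5
y = beta_true * z + np.sin(2 * np.pi * x) + 0.5 * rng.normal(size=n)

# Differencing adjacent (in x) observations removes the smooth f(x)
# up to terms of order 1/n.
dy, dz = np.diff(y), np.diff(z)
beta_hat = np.sum(dz * dy) / np.sum(dz * dz)   # OLS through the origin

# Recover the nonparametric part by kernel smoothing of partial residuals.
r = y - beta_hat * z
hbw = 0.05
f_hat = np.array([np.average(r, weights=np.exp(-0.5 * ((xi - x) / hbw) ** 2))
                  for xi in x])
print(f"difference-based slope estimate: {beta_hat:.3f}")
```

Note the slope is estimated without ever fitting the smooth component, which is exactly why each estimator behaves as if the other component were known.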

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a method that combines the efficient semiparametric estimator with nonparametric covariance estimation, and is robust against misspecification of covariance models.
Abstract: SUMMARY For longitudinal data, when the within-subject covariance is misspecified, the semiparametric regression estimator may be inefficient. We propose a method that combines the efficient semiparametric estimator with nonparametric covariance estimation, and is robust against misspecification of covariance models. We show that kernel covariance estimation provides uniformly consistent estimators for the within-subject covariance matrices, and the semiparametric profile estimator with substituted nonparametric covariance is still semiparametrically efficient. The finite sample performance of the proposed estimator is illustrated by simulation. In an application to CD4 count data from an AIDS clinical trial, we extend the proposed method to a functional analysis of the covariance model.

Journal ArticleDOI
TL;DR: A general sieve M-theorem for bundled parameters is proposed and the theorem is applied to deriving the asymptotic theory for the sieve maximum likelihood estimation in the linear regression model for censored survival data.
Abstract: In many semiparametric models that are parameterized by two types of parameters – a Euclidean parameter of interest and an infinite-dimensional nuisance parameter, the two parameters are bundled together, i.e., the nuisance parameter is an unknown function that contains the parameter of interest as part of its argument. For example, in a linear regression model for censored survival data, the unspecified error distribution function involves the regression coefficients. Motivated by developing an efficient estimating method for the regression parameters, we propose a general sieve M-theorem for bundled parameters and apply the theorem to deriving the asymptotic theory for the sieve maximum likelihood estimation in the linear regression model for censored survival data. The numerical implementation of the proposed estimating method can be achieved through the conventional gradient-based search algorithms such as the Newton-Raphson algorithm. We show that the proposed estimator is consistent and asymptotically normal and achieves the semiparametric efficiency bound. Simulation studies demonstrate that the proposed method performs well in practical settings and yields more efficient estimates than existing estimating equation based methods. Illustration with a real data example is also provided.

Journal ArticleDOI
TL;DR: In this paper, the authors provide methods for inference on a finite dimensional parameter of interest, θ ∈ ℝ^{d_θ}, in a semiparametric probability model when an infinite dimensional nuisance parameter, g, is present.
Abstract: We provide methods for inference on a finite dimensional parameter of interest, θ ∈ ℝ^{d_θ}, in a semiparametric probability model when an infinite dimensional nuisance parameter, g, is present. We depart from the semiparametric literature in that we do not require that the pair (θ, g) is point identified and so we construct confidence regions for θ that are robust to non-point identification. This allows practitioners to examine the sensitivity of their estimates of θ to specification of g in a likelihood setup. To construct these confidence regions for θ, we invert a profiled sieve likelihood ratio (LR) statistic. We derive the asymptotic null distribution of this profiled sieve LR, which is nonstandard when θ is not point identified (but is χ² distributed under point identification). We show that a simple weighted bootstrap procedure consistently estimates this complicated distribution's quantiles. Monte Carlo studies of a semiparametric dynamic binary response panel data model indicate that our weighted bootstrap procedure performs adequately in finite samples. We provide three empirical illustrations where we compare our results to the ones obtained using standard (less robust) methods.

Journal ArticleDOI
TL;DR: In this article, the authors view a game abstractly as a semiparametric mixture distribution and study the efficiency bound of this model, showing that if the number of equilibria is sufficiently large compared to the total number of outcomes, root-n consistent estimation of the model will not be possible.
Abstract: We view a game abstractly as a semiparametric mixture distribution and study the semiparametric efficiency bound of this model. Our results suggest that a key issue for inference is the number of equilibria compared to the number of outcomes. If the number of equilibria is sufficiently large compared to the number of outcomes, root-n consistent estimation of the model will not be possible. We also provide a simple estimator in the case when the efficiency bound is strictly above zero.

01 Jan 2011
TL;DR: This article gives a selective review of the recent developments in nonparametric and semiparametric panel data models, presenting the basic models and ideas of estimation and commenting on the asymptotic properties of different estimators and specification tests.
Abstract: This paper gives a selective review on the recent developments of nonparametric and semiparametric panel data models. We focus on the conventional panel data models with one-way error component structure, partially linear panel data models, varying coefficient panel data models, nonparametric panel data models with multi-factor error structure, and nonseparable nonparametric panel data models. For each area, we discuss the basic models and ideas of estimation, and comment on the asymptotic properties of different estimators and specification tests. Much theoretical and empirical research is needed in this emerging area. KEY WORDS: Cross section dependence; fixed effects; hypothesis testing; nonadditive model; nonparametric GMM; nonseparable model; partially linear panel data model; random effects; varying coefficient model. The first author gratefully acknowledges the financial support from the NSFC under the grant numbers 70501001 and 70601001. The second author acknowledges the financial support from the academic senate, UCR.

Proceedings ArticleDOI
06 Nov 2011
TL;DR: This work derives precise conditions under which material reflectance properties may be estimated from a single image of a homogeneous curved surface, lit by a directional source, and proposes a semiparametric BRDF abstraction that lies between purely parametric and purely data-driven models.
Abstract: We derive precise conditions under which material reflectance properties may be estimated from a single image of a homogeneous curved surface (canonically a sphere), lit by a directional source. Based on the observation that light is reflected along certain (a priori unknown) preferred directions such as the half-angle, we propose a semiparametric BRDF abstraction that lies between purely parametric and purely data-driven models. Formulating BRDF estimation as a particular type of semiparametric regression, both the preferred directions and the form of BRDF variation along them can be estimated from data. Our approach has significant theoretical, algorithmic and empirical benefits, lends insights into material behavior and enables novel applications. While it is well-known that fitting multi-lobe BRDFs may be ill-posed under certain conditions, prior to this work, precise results for the well-posedness of BRDF estimation had remained elusive. Since our BRDF representation is derived from physical intuition, but relies on data, we avoid pitfalls of both parametric (low generalizability) and non-parametric regression (low interpretability, curse of dimensionality). Finally, we discuss several applications such as single-image relighting, light source estimation and physically meaningful BRDF editing.

Journal Article
TL;DR: This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage, containing functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior.
Abstract: Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage. Currently DPpackage includes models for marginal and conditional density estimation, ROC curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior. To maximize computational efficiency, the actual sampling for each model is carried out using compiled FORTRAN.

Journal ArticleDOI
TL;DR: In this article, it was shown that a nonparametric rank condition and differentiability of the moment conditions with respect to a certain norm imply local identification in a non-parametric model.
Abstract: In parametric models a sufficient condition for local identification is that the vector of moment conditions is differentiable at the true parameter with full rank derivative matrix. We show that there are corresponding sufficient conditions for nonparametric models. A nonparametric rank condition and differentiability of the moment conditions with respect to a certain norm imply local identification. It turns out these conditions are slightly stronger than needed and are hard to check, so we provide weaker and more primitive conditions. We extend the results to semiparametric models. We illustrate the sufficient conditions with endogenous quantile and single index examples. We also consider a semiparametric habit-based, consumption capital asset pricing model. There we find the rank condition is implied by an integral equation of the second kind having a one-dimensional null space.
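The parametric benchmark behind the abstract's first sentence can be stated compactly (illustrative notation, not the paper's):

```latex
\[
g(\theta) = \mathbb{E}[\rho(W,\theta)], \qquad g(\theta_0) = 0 .
\]
If $g$ is differentiable at $\theta_0$ with derivative matrix
$G = \partial g(\theta_0)/\partial\theta'$ of full column rank, a mean-value
expansion gives, for $\theta$ near $\theta_0$,
\[
g(\theta) = G(\theta - \theta_0) + o(\|\theta - \theta_0\|),
\]
so $\|g(\theta)\| \ge c\,\|\theta - \theta_0\|$ for some $c > 0$ in a
neighborhood of $\theta_0$: no other nearby $\theta$ satisfies the moment
conditions, i.e.\ $\theta_0$ is locally identified.
\]
```

The paper's contribution is the nonparametric analogue: the derivative becomes a linear operator between function spaces, "full rank" becomes an operator rank condition, and differentiability must hold with respect to an appropriate norm.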

Journal ArticleDOI
TL;DR: This paper proposes a generalization of the Cox proportional hazards (PH) model in terms of the cumulative hazard function, taking a form similar to the Cox PH model with the extension that the baseline cumulative hazard function is raised to a power function.

Journal ArticleDOI
TL;DR: This work proposes a new methodology to capture the spatial pattern by assuming a prior based on a mixture of spatially dependent Polya trees for the baseline survival in the proportional hazards model and provides better goodness of fit over the traditional alternatives as measured by log pseudo marginal likelihood (LPML), the deviance information criterion (DIC), and full sample score (FSS) statistics.
Abstract: With the proliferation of spatially oriented time-to-event data, spatial modeling in the survival context has received increased recent attention. A traditional way to capture a spatial pattern is to introduce frailty terms in the linear predictor of a semiparametric model, such as proportional hazards or accelerated failure time. We propose a new methodology to capture the spatial pattern by assuming a prior based on a mixture of spatially dependent Polya trees for the baseline survival in the proportional hazards model. Thanks to modern Markov chain Monte Carlo (MCMC) methods, this approach remains computationally feasible in a fully hierarchical Bayesian framework. We compare the spatially dependent mixture of Polya trees (MPT) approach to the traditional spatial frailty approach, and illustrate the usefulness of this method with an analysis of Iowan breast cancer survival data from the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute. Our method provides better goodness of fit over the traditional alternatives as measured by log pseudo marginal likelihood (LPML), the deviance information criterion (DIC), and full sample score (FSS) statistics.

Journal ArticleDOI
TL;DR: In this paper, the authors show that estimation of the parametric component of a semi-parametric model can be substantially improved when more structure is put into the nonparametric part of the model.
Abstract: It is widely admitted that structured nonparametric modeling that circumvents the curse of dimensionality is important in nonparametric estimation. In this paper we show that the same holds for semi-parametric estimation. We argue that estimation of the parametric component of a semi-parametric model can be substantially improved when more structure is put into the nonparametric part of the model. We illustrate this for the partially linear model, and investigate efficiency gains when the nonparametric part of the model has an additive structure. We present the semi-parametric Fisher information bound for estimating the parametric part of the partially linear additive model and provide semi-parametric efficient estimators for which we use a smooth backfitting technique to deal with the additive nonparametric part. We also present the finite sample performances of the proposed estimators and analyze Boston housing data as an illustration.
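For the partially linear model Y = Xβ + m(Z) + ε, the classical double-residual (Robinson-type) idea gives a simple baseline estimator of β: smooth X and Y on Z, then regress residual on residual. The sketch below illustrates that baseline idea only, not the paper's efficient smooth-backfitting estimator; the kernel smoother, bandwidth, and data-generating process are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
z = rng.uniform(0.0, 1.0, n)
x = z + rng.normal(0.0, 1.0, n)          # X correlated with Z
beta_true = 2.0
y = beta_true * x + np.sin(2 * np.pi * z) + rng.normal(0.0, 0.5, n)

def nw_smooth(z, v, h=0.1):
    """Nadaraya-Watson estimate of E[v | z] at each sample point."""
    w = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    return w @ v / w.sum(axis=1)

# Partial out the nonparametric component, then regress residual on residual.
x_res = x - nw_smooth(z, x)
y_res = y - nw_smooth(z, y)
beta_hat = (x_res @ y_res) / (x_res @ x_res)
print(round(beta_hat, 2))  # close to beta_true = 2.0
```

When m has additive structure in several covariates, the paper's point is that exploiting that structure (via smooth backfitting) lowers the variance bound for β below what this generic approach attains.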

Journal ArticleDOI
TL;DR: In this article, a new semiparametric threshold model for censored longitudinal data analysis was proposed to investigate the relationship between blood pressure change and progression of microalbuminuria (MA) among individuals with type I diabetes.
Abstract: Motivated by an investigation of the relationship between blood pressure change and progression of microalbuminuria (MA) among individuals with type I diabetes, we propose a new semiparametric threshold model for censored longitudinal data analysis. We also study a new semiparametric Bayes information criterion-type criterion for identifying the parametric component of the proposed model. Cluster effects in the model are implemented as unknown fixed effects. Asymptotic properties are established for the proposed estimators. A quadratic approximation used to implement the estimation procedure makes the method very easy to implement by avoiding the computation of multiple integrals and the need for iterative algorithms. Simulation studies show that the proposed methods work well in practice. An illustration using the Wisconsin Diabetes Registry dataset suggests some interesting findings.
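A BIC-type criterion of the general kind invoked here trades off fit against complexity, BIC = n log(RSS/n) + df log(n). The sketch below applies a plain least-squares version to choose between a linear fit and one with an added threshold term at zero; the data-generating process and threshold location are made up, and the paper's semiparametric criterion for censored longitudinal data differs in detail:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
# True model contains a threshold effect at zero.
y = 1.0 + 2.0 * x + 1.5 * np.maximum(x, 0.0) + rng.normal(0.0, 1.0, n)

def bic(y, fitted, df):
    """Gaussian least-squares BIC: n*log(RSS/n) + df*log(n)."""
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + df * np.log(n)

X1 = np.c_[np.ones(n), x]                      # linear candidate
X2 = np.c_[X1, np.maximum(x, 0.0)]             # adds the threshold term
f1 = X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
f2 = X2 @ np.linalg.lstsq(X2, y, rcond=None)[0]
print(bic(y, f2, 3) < bic(y, f1, 2))  # prints True: threshold model wins
```

The log(n) penalty is what gives BIC-type criteria their model-selection consistency, the property the paper establishes for its semiparametric version.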

Journal ArticleDOI
TL;DR: A bias-corrected technique for constructing the empirical likelihood ratio is used to study a semiparametric regression model with missing response data; it directly calibrates the empirical log-likelihood ratio so that the resulting ratio is asymptotically chi-squared.

Journal ArticleDOI
TL;DR: This work models the joint distribution of the successive survival times by using copula functions, and provides semiparametric estimation procedures in which copula parameters are estimated without parametric assumptions on the marginal distributions; this yields more robust estimates and checks on the fit of parametric models.
Abstract: Sequentially observed survival times are of interest in many studies but there are difficulties in analyzing such data using nonparametric or semiparametric methods. First, when the duration of followup is limited and the times for a given individual are not independent, induced dependent censoring arises for the second and subsequent survival times. Non-identifiability of the marginal survival distributions for second and later times is another issue, since they are observable only if preceding survival times for an individual are uncensored. In addition, in some studies a significant proportion of individuals may never have the first event. Fully parametric models can deal with these features, but robustness is a concern. We introduce a new approach to address these issues. We model the joint distribution of the successive survival times by using copula functions, and provide semiparametric estimation procedures in which copula parameters are estimated without parametric assumptions on the marginal distributions. This provides more robust estimates and checks on the fit of parametric models. The methodology is applied to a motivating example involving relapse and survival following colon cancer treatment.
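To make the copula construction concrete: with a Clayton copula, the joint survivor function of two successive gap times is S(t1, t2) = (S1(t1)^(-a) + S2(t2)^(-a) - 1)^(-1/a), with Kendall's tau = a/(a+2). The sketch below uses exponential marginals as placeholders; in the semiparametric procedure of the paper the marginals would be left unspecified:

```python
import numpy as np

def clayton_joint_survival(s1, s2, a):
    """Clayton survival copula evaluated at marginal survivals s1, s2."""
    return (s1 ** (-a) + s2 ** (-a) - 1.0) ** (-1.0 / a)

S1 = lambda t: np.exp(-0.5 * t)   # placeholder marginal survivor, first gap time
S2 = lambda t: np.exp(-1.0 * t)   # placeholder marginal survivor, second gap time
a = 2.0                           # Kendall's tau = a / (a + 2) = 0.5

js = clayton_joint_survival(S1(1.0), S2(1.0), a)
print(round(js, 4))  # 0.3314, above the independence value S1*S2 = 0.2231
```

Positive dependence (a > 0) pushes the joint survival above the independence product, which is exactly the mechanism that induces dependent censoring of the second gap time when follow-up is limited.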

Journal ArticleDOI
TL;DR: In this paper, the B-spline approach is applied to simultaneously estimate the linear regression vector, the nondecreasing transformation function, and a set of nonparametric regression functions.
Abstract: We consider the efficient estimation of the semiparametric additive transformation model with current status data. A wide range of survival models and econometric models can be incorporated into this general transformation framework. We apply the B-spline approach to simultaneously estimate the linear regression vector, the nondecreasing transformation function, and a set of nonparametric regression functions. We show that the parametric estimate is semiparametric efficient in the presence of multiple nonparametric nuisance functions. An explicit consistent B-spline estimate of the asymptotic variance is also provided. All nonparametric estimates are smooth, and shown to be uniformly consistent and have faster than cubic rate of convergence. Interestingly, we observe the convergence rate interference phenomenon, i.e., the convergence rates of the B-spline estimators are all slowed down to equal the slowest one. Constrained optimization is not required in our implementation. Numerical results are used to illustrate the finite sample performance of the proposed estimators.
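The B-spline building block can be illustrated with a least-squares fit of a single nonparametric function. The knot vector, degree, and test function below are illustrative choices, not the paper's (which jointly estimates several functions under monotonicity of the transformation):

```python
import numpy as np
from scipy.interpolate import BSpline

# Least-squares fit of an unknown smooth function with a cubic B-spline basis.
rng = np.random.default_rng(2)
n, k = 200, 3
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, n)

interior = np.linspace(0.0, 1.0, 8)
t = np.r_[[0.0] * k, interior, [1.0] * k]      # clamped knot vector
B = BSpline.design_matrix(x, t, k).toarray()   # n x (len(t) - k - 1) basis
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ coef

mse = float(np.mean((fitted - y) ** 2))
print(round(mse, 3))  # close to the noise variance 0.01
```

In the paper's setting, several such bases enter one likelihood simultaneously, which is what produces the convergence rate interference: the least smooth component dictates the common rate.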

Journal ArticleDOI
TL;DR: In this paper, the authors proposed method of moments estimators for the unknown parameters and simulation-based estimators to overcome the possible computational difficulty of minimizing an objective function which involves multiple integrals.

Journal ArticleDOI
TL;DR: In this article, nonparametric and semiparametric modelling methods are noted to be commonly applied in many fields, such as agriculture, but such methods have not been widely adopted in forestry, other than the most similar neighbour and nearest neighbour methods.
Abstract: Nonparametric and semiparametric modelling methods are commonly applied in many fields. However, such methods have not been widely adopted in forestry, other than the most similar neighbour and nea...