
Showing papers on "Semiparametric model published in 2010"


Journal ArticleDOI
TL;DR: This work proposes modeling the conditional quantile by a single-index function $g_0(x^{\top}\beta_0)$, where a univariate link function $g_0(\cdot)$ is applied to a linear combination of covariates $x^{\top}\beta_0$, often called the single index.
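
In display form, this is the generic single-index quantile model; the quantile level $\tau$ is notation introduced here for illustration, not spelled out in the TL;DR:

$$
Q_{Y\mid X}(\tau \mid x) = g_0(x^{\top}\beta_0), \qquad \Pr\{Y \le Q_{Y\mid X}(\tau \mid x) \mid X = x\} = \tau,
$$

so a single unknown link $g_0$ and a single index direction $\beta_0$ jointly determine the conditional quantile surface.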

168 citations


Journal ArticleDOI
TL;DR: It is shown that, for all commonly used parametric and semiparametric models, there is no asymptotic efficiency gain by analyzing original data if the parameter of main interest has a common value across studies, the nuisance parameters have distinct values among studies, and the summary statistics are based on maximum likelihood.
Abstract: Meta-analysis is widely used to synthesize the results of multiple studies. Although meta-analysis is traditionally carried out by combining the summary statistics of relevant studies, advances in technologies and communications have made it increasingly feasible to access the original data on individual participants. In the present paper, we investigate the relative efficiency of analyzing original data versus combining summary statistics. We show that, for all commonly used parametric and semiparametric models, there is no asymptotic efficiency gain by analyzing original data if the parameter of main interest has a common value across studies, the nuisance parameters have distinct values among studies, and the summary statistics are based on maximum likelihood. We also assess the relative efficiency of the two methods when the parameter of main interest has different values among studies or when there are common nuisance parameters across studies. We conduct simulation studies to confirm the theoretical results and provide empirical comparisons from a genetic association study.
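
For intuition, the summary-statistics approach referred to above is typically an inverse-variance-weighted combination of study-specific maximum likelihood estimates; the display below is the generic fixed-effect version, shown only as a sketch of what "combining summary statistics" means, not the paper's exact estimator:

$$
\hat\theta_{\text{meta}} = \frac{\sum_{k=1}^{K} w_k \hat\theta_k}{\sum_{k=1}^{K} w_k}, \qquad w_k = \frac{1}{\widehat{\operatorname{Var}}(\hat\theta_k)},
$$

where $\hat\theta_k$ is the estimate of the common parameter of interest from study $k$.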

145 citations


Journal ArticleDOI
TL;DR: In this paper, a spline-based semiparametric maximum likelihood approach was proposed to analyze the Cox model with interval-censored data, where the baseline cumulative hazard function is approximated by a monotone B-spline function.
Abstract: We propose a spline-based semiparametric maximum likelihood approach to analysing the Cox model with interval-censored data. With this approach, the baseline cumulative hazard function is approximated by a monotone B-spline function. We extend the generalized Rosen algorithm to compute the maximum likelihood estimate. We show that the estimator of the regression parameter is asymptotically normal and semiparametrically efficient, although the estimator of the baseline cumulative hazard function converges at a rate slower than root-n. We also develop an easy-to-implement method for consistently estimating the standard error of the estimated regression parameter, which facilitates the proposed inference procedure for the Cox model with interval-censored data. The proposed method is evaluated by simulation studies regarding its finite sample performance and is illustrated using data from a breast cosmesis study.
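
As a hedged sketch of the objective being maximized: each subject contributes an interval $(L_i, R_i]$ known to contain the event time (with $R_i = \infty$ for right-censored subjects), and the spline parameterization shown is one common way to enforce monotonicity, which may differ in detail from the paper's construction:

$$
L_n(\beta, \Lambda_0) = \prod_{i=1}^{n} \left\{ \exp\!\big(-\Lambda_0(L_i)\, e^{x_i^{\top}\beta}\big) - \exp\!\big(-\Lambda_0(R_i)\, e^{x_i^{\top}\beta}\big) \right\}, \qquad \Lambda_0(t) \approx \sum_{j} \gamma_j B_j(t),
$$

where constraining the B-spline coefficients $\gamma_j$ to be nondecreasing keeps the fitted cumulative hazard monotone.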

136 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a new method of testing stochastic dominance that improves on existing tests based on the standard bootstrap or subsampling, which admits infinite as well as finite dimensional unknown parameters, so that the variables are allowed to be residuals from nonparametric and semiparametric models.

123 citations


Journal ArticleDOI
TL;DR: This article proposes a multivariate generalization of the multiplicative volatility model of Engle and Rangel (2008), which has a nonparametric long run component and a unit multivariate GARCH short run dynamic component.

96 citations


Journal ArticleDOI
TL;DR: The doubly robust estimation of the parameters in a semiparametric conditional odds ratio model is considered; the estimators are consistent and asymptotically normal in a union model that assumes that either of two variation-independent baseline functions is correctly modelled, but not necessarily both.
Abstract: We consider the doubly robust estimation of the parameters in a semiparametric conditional odds ratio model. Our estimators are consistent and asymptotically normal in a union model that assumes either of two variation-independent baseline functions is correctly modelled but not necessarily both. Furthermore, when either outcome has finite support, our estimators are semiparametric efficient in the union model at the intersection submodel where both nuisance function models are correct. For general outcomes, we obtain doubly robust estimators that are nearly efficient at the intersection submodel. Our methods are easy to implement as they do not require the use of the alternating conditional expectations algorithm of Chen (2007).
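
For readers unfamiliar with the model class, a conditional odds ratio function is commonly parameterized relative to reference values $(x_0, y_0)$ as below; this is the generic definition in the spirit of Chen (2007), not necessarily the paper's exact parameterization:

$$
\eta(y, x) = \frac{f(y \mid x)\, f(y_0 \mid x_0)}{f(y \mid x_0)\, f(y_0 \mid x)},
$$

so the association between $Y$ and $X$ is captured by $\eta$ together with two variation-independent baseline conditionals, $f(y \mid x_0)$ and $f(x \mid y_0)$; double robustness refers to needing only one of these baselines to be correctly specified.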

92 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that, under general conditions, the bootstrap distribution asymptotically imitates the distribution of the M-estimate of the Euclidean parameter, and that the bootstrap confidence set has asymptotically correct coverage probability.
Abstract: Consider M-estimation in a semiparametric model that is characterized by a Euclidean parameter of interest and an infinite-dimensional nuisance parameter. As a general purpose approach to statistical inferences, the bootstrap has found wide applications in semiparametric M-estimation and, because of its simplicity, provides an attractive alternative to the inference approach based on the asymptotic distribution theory. The purpose of this paper is to provide theoretical justifications for the use of bootstrap as a semiparametric inferential tool. We show that, under general conditions, the bootstrap is asymptotically consistent in estimating the distribution of the M-estimate of Euclidean parameter; that is, the bootstrap distribution asymptotically imitates the distribution of the M-estimate. We also show that the bootstrap confidence set has the asymptotically correct coverage probability. These general conclusions hold, in particular, when the nuisance parameter is not estimable at root-n rate, and apply to a broad class of bootstrap methods with exchangeable bootstrap weights. This paper provides a first general theoretical study of the bootstrap in semiparametric models.
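
As a hedged sketch of the setup, with notation chosen here for illustration: let $\hat\theta_n$ maximize a semiparametric criterion and let $\hat\theta_n^{*}$ be its weighted-bootstrap analogue with exchangeable weights,

$$
\hat\theta_n = \arg\max_{\theta} \sup_{\eta} \sum_{i=1}^{n} m(\theta, \eta; X_i), \qquad \hat\theta_n^{*} = \arg\max_{\theta} \sup_{\eta} \sum_{i=1}^{n} W_{ni}\, m(\theta, \eta; X_i),
$$

where the $W_{ni} \ge 0$ are exchangeable bootstrap weights with $\sum_i W_{ni} = n$ (multinomial weights recover Efron's nonparametric bootstrap), and bootstrap consistency means the conditional law of $\sqrt{n}(\hat\theta_n^{*} - \hat\theta_n)$ approximates that of $\sqrt{n}(\hat\theta_n - \theta_0)$.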

89 citations


Journal ArticleDOI
TL;DR: In this article, a semiparametric profile likelihood approach based on the first-stage local linear fitting is developed to estimate both the parameter vector and the time trend function, which allows for the cross-sectional dependence in both the regressors and the residuals.
Abstract: A semiparametric fixed effects model is introduced to describe the nonlinear trending phenomenon in panel data analysis; it allows for cross-sectional dependence in both the regressors and the residuals. A semiparametric profile likelihood approach based on first-stage local linear fitting is developed to estimate both the parameter vector and the time trend function. As both the time series length T and the cross-sectional size N tend to infinity simultaneously, the resulting semiparametric estimator of the parameter vector is asymptotically normal with an optimal rate of convergence. An asymptotic distribution, also with an optimal rate of convergence, is established for the estimate of the nonlinear time trend function. Two simulated examples are provided to illustrate the finite sample behavior of the proposed estimation method. In addition, the proposed model and estimation method are applied to the analysis of two sets of real data.
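
A plausible form of the model being described, written here only as an illustrative sketch (the exact specification in the paper may differ):

$$
Y_{it} = X_{it}^{\top}\beta + f(t/T) + \alpha_i + e_{it}, \qquad i = 1, \dots, N, \quad t = 1, \dots, T,
$$

where $\beta$ is the parameter vector, $f(\cdot)$ is the unknown time trend function estimated by first-stage local linear fitting, the $\alpha_i$ are fixed effects, and cross-sectional dependence is permitted in $X_{it}$ and $e_{it}$.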

88 citations


Journal ArticleDOI
TL;DR: A new semiparametric dynamic copula model is proposed where the marginals are specified as parametric GARCH-type processes, and the dependence parameter of the copula is allowed to change over time in a nonparametric way.

80 citations


Journal ArticleDOI
Yi-Hau Chen
TL;DR: In this article, a non-parametric maximum likelihood analysis is proposed for dependent competing risks; it can be readily used as a sensitivity analysis for assessing effects of potential dependent censoring and can incorporate external information on the association of competing risks.
Abstract: Competing risks problems arise in many fields of science, where two or more types of event may occur on a subject, but only the event occurring first is observed together with its occurrence time, and other events are censored. The marginal and joint distributions of event times for competing risks cannot be identified from the observed data without assuming the relationship between events. The commonly adopted independent censoring assumption may be easily violated. An alternative is to assume that the joint distribution of event times follows a known copula, which is an explicit function of the marginal distributions. On the basis of the latter assumption, we consider marginal regression analysis for dependent competing risks, with the marginal regressions performed via semiparametric transformation models, including the proportional hazards and proportional odds models. We propose a non-parametric maximum likelihood analysis, which provides explicit expressions for the score functions and information matrix, and facilitates convenient computations. Large and finite sample properties are studied. We report an illustration with data from an acquired immune deficiency syndrome clinical trial where the censoring may be dependent. The proposal can be readily used as a sensitivity analysis for assessing effects of potential dependent censoring and can incorporate external information on the association of competing risks.
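
To make the modelling assumptions concrete, a sketch of the structure described, with notation chosen here for illustration: for two competing event times $T_1, T_2$ and covariates $x$,

$$
\Pr(T_1 > t_1, T_2 > t_2 \mid x) = C\{S_1(t_1 \mid x), S_2(t_2 \mid x)\}, \qquad g\{S_j(t \mid x)\} = h_j(t) + x^{\top}\beta_j,
$$

where $C$ is an assumed-known copula and the marginal survival functions follow semiparametric transformation models, with $g(s) = \log(-\log s)$ yielding the proportional hazards model and $g(s) = \log\{(1-s)/s\}$ yielding the proportional odds model.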

77 citations


Journal ArticleDOI
TL;DR: This paper focuses on variable selection for semiparametric varying-coefficient partially linear models when the covariates in the parametric and nonparametric components are all measured with error; a bias-corrected variable selection procedure is proposed by combining basis function approximations with shrinkage estimation.

Journal ArticleDOI
TL;DR: In this paper, a semiparametric fitting procedure is proposed to predict characteristic extreme traffic load effects; it is based on extensive weigh-in-motion measurements from two European sites and shows the sensitivity of the characteristic traffic load effects to the fitting process.
Abstract: To predict characteristic extreme traffic load effects, simulations are sometimes performed of bridge loading events. To generalize the truck weight data, statistical distributions are fitted to histograms of weight measurements. This paper is based on extensive weigh-in-motion measurements from two European sites and shows the sensitivity of the characteristic traffic load effects to the fitting process. A semiparametric fitting procedure is proposed: direct use of the measured histogram where there are sufficient data for this to be reliable, and parametric fitting to a statistical distribution in the tail region where there are fewer data. Calculated characteristic load effects are shown to be highly sensitive to the fit in the tail region of the histogram.
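
A minimal Python sketch of the semiparametric idea (empirical distribution in the body of the histogram, parametric fit in the tail). The 95% threshold, the generalized Pareto tail, and the synthetic truck-weight data are illustrative assumptions, not the paper's procedure:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
weights = rng.gamma(shape=9.0, scale=3.0, size=20000)  # synthetic truck weights (tonnes)

u = np.quantile(weights, 0.95)          # threshold splitting the body from the tail
body = weights[weights <= u]
tail = weights[weights > u]

# Body: empirical (nonparametric) CDF; tail: parametric generalized Pareto fit above u.
c, loc, scale = stats.genpareto.fit(tail - u, floc=0.0)

def semiparametric_cdf(x):
    x = np.atleast_1d(np.asarray(x, dtype=float))
    p_u = np.mean(weights <= u)          # probability mass assigned to the body
    below = x <= u
    cdf = np.empty_like(x)
    cdf[below] = np.searchsorted(np.sort(body), x[below], side="right") / len(weights)
    cdf[~below] = p_u + (1.0 - p_u) * stats.genpareto.cdf(x[~below] - u, c, loc=0.0, scale=scale)
    return cdf

print(semiparametric_cdf([20.0, u, 60.0]))

The design choice this illustrates is that characteristic (extreme) load effects depend almost entirely on the tail model above the threshold u, which is why the abstract reports high sensitivity to the tail fit.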

Journal ArticleDOI
TL;DR: In this article, the joint time to caries distribution of permanent first molars was modeled as a function of covariates using a dependent Bayesian semiparametric model, where survival curves can be estimated without imposing assumptions such as proportional hazards, additive hazards, proportional odds or accelerated failure time.
Abstract: Based on a data set obtained in a dental longitudinal study conducted in Flanders (Belgium), the joint time to caries distribution of permanent first molars was modeled as a function of covariates. This involves an analysis of multivariate continuous doubly-interval-censored data since: (i) the emergence time of a tooth and the time it experiences caries were recorded yearly, and (ii) events on teeth of the same child are dependent. To model the joint distribution of the emergence times and the times to caries, we propose a dependent Bayesian semiparametric model. A major feature of the proposed approach is that survival curves can be estimated without imposing assumptions such as proportional hazards, additive hazards, proportional odds, or accelerated failure time.

Journal ArticleDOI
TL;DR: A double-penalized likelihood approach is proposed for simultaneous model selection and estimation in semiparametric mixed models for longitudinal data; it provides valid inference for data that are missing at random and will be more efficient if the specified model is correct.
Abstract: We propose a double-penalized likelihood approach for simultaneous model selection and estimation in semiparametric mixed models for longitudinal data. Two types of penalties are jointly imposed on the ordinary log-likelihood: the roughness penalty on the nonparametric baseline function and a nonconcave shrinkage penalty on linear coefficients to achieve model sparsity. Compared to existing estimation equation based approaches, our procedure provides valid inference for data with missing at random, and will be more efficient if the specified model is correct. Another advantage of the new procedure is its easy computation for both regression components and variance parameters. We show that the double-penalized problem can be conveniently reformulated into a linear mixed model framework, so that existing software can be directly used to implement our method. For the purpose of model inference, we derive both frequentist and Bayesian variance estimation for estimated parametric and nonparametric components. Simulation is used to evaluate and compare the performance of our method to the existing ones. We then apply the new method to a real data set from a lactation study.
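
The structure of the criterion, sketched with generic notation (the precise penalty forms in the paper may differ):

$$
\ell_p(\beta, f) = \ell(\beta, f) - \lambda_1 \int \{f''(t)\}^2\, dt - n \sum_{j=1}^{p} p_{\lambda_2}(|\beta_j|),
$$

where $\ell$ is the ordinary log-likelihood of the semiparametric mixed model, the first penalty controls the roughness of the nonparametric baseline $f$, and the second is a nonconcave shrinkage penalty (e.g., SCAD) that sets some coefficients $\beta_j$ exactly to zero.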

Journal ArticleDOI
TL;DR: This article proposes a novel semiparametric inference procedure that depends on neither the frailty nor the censoring time distribution, and incorporates both time-dependent and time-independent covariates in the formulation.
Abstract: Recurrent event data analyses are usually conducted under the assumption that the censoring time is independent of the recurrent event process. In many applications the censoring time can be informative about the underlying recurrent event process, especially in situations where a correlated failure event could potentially terminate the observation of recurrent events. In this article, we consider a semiparametric model of recurrent event data that allows correlations between censoring times and recurrent event process via frailty. This flexible framework incorporates both time-dependent and time-independent covariates in the formulation, while leaving the distributions of frailty and censoring times unspecified. We propose a novel semiparametric inference procedure that depends on neither the frailty nor the censoring time distribution. Large sample properties of the regression parameter estimates and the estimated baseline cumulative intensity functions are studied. Numerical studies demonstrate that the proposed methodology performs well for realistic sample sizes. An analysis of hospitalization data for patients in an AIDS cohort study is presented to illustrate the proposed method.

Journal ArticleDOI
TL;DR: A semiparametric additive rate model for modelling recurrent events in the presence of a terminal event and an estimating equation for parameter estimation is constructed and the asymptotic distributions of the proposed estimators are derived.
Abstract: We propose a semiparametric additive rate model for modelling recurrent events in the presence of a terminal event. The dependence between recurrent events and terminal event is nonparametric. A general transformation model is used to model the terminal event. We construct an estimating equation for parameter estimation and derive the asymptotic distributions of the proposed estimators. Simulation studies demonstrate that the proposed inference procedure performs well in realistic settings. Application to a medical study is presented.
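
A common way to write an additive rate specification for the recurrent event process $N^{*}(t)$ given covariates $Z(t)$, shown here only as an illustrative sketch:

$$
E\{dN^{*}(t) \mid Z(t)\} = d\mu_0(t) + \beta^{\top} Z(t)\, dt,
$$

with $\mu_0$ an unspecified baseline mean function; in the paper this is coupled with a general transformation model for the terminal event, and the dependence between the recurrent and terminal event processes is left nonparametric.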

Journal ArticleDOI
TL;DR: In this article, a contribution to the Bayesian theory of nonparametric and semiparametric estimation is made, where two kinds of Bernstein-von Mises Theorems are obtained for the parameter itself and for functionals of the parameter.
Abstract: This paper brings a contribution to the Bayesian theory of nonparametric and semiparametric estimation. We are interested in the asymptotic normality of the posterior distribution in Gaussian linear regression models when the number of regressors increases with the sample size. Two kinds of Bernstein-von Mises Theorems are obtained in this framework: nonparametric theorems for the parameter itself, and semiparametric theorems for functionals of the parameter. We apply them to the Gaussian sequence model and to the regression of functions in Sobolev and $C^{\alpha}$ classes, in which we get the minimax convergence rates. Adaptivity is reached for the Bayesian estimators of functionals in our applications.

Journal ArticleDOI
TL;DR: The present paper deals with nonparametric and semiparametric estimation of attributable fraction functions for cohort studies with potentially censored event time data and shows the proposed estimators to be consistent, asymptotically normal, and asymptotically efficient.
Abstract: Attributable fractions are commonly used to measure the impact of risk factors on disease incidence in the population. These static measures can be extended to functions of time when the time to disease occurrence or event time is of interest. The present paper deals with nonparametric and semiparametric estimation of attributable fraction functions for cohort studies with potentially censored event time data. The semiparametric models include the familiar proportional hazards model and a broad class of transformation models. The proposed estimators are shown to be consistent, asymptotically normal and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. An application to a cardiovascular health study is provided. Connections to causal inference are discussed.
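
For orientation, an attributable fraction function is usually defined along the following lines (a generic definition, not necessarily the paper's exact estimand):

$$
\mathrm{AF}(t) = \frac{F(t) - F_0(t)}{F(t)} = 1 - \frac{F_0(t)}{F(t)},
$$

where $F(t)$ is the probability of disease occurrence by time $t$ under the observed exposure distribution and $F_0(t)$ is the corresponding probability if the risk factor were eliminated from the population.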

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a prediction method based on an ordered semiparametric probit model for credit risk forecast. But the prediction model is constructed by replacing the linear regression function in the usual ordered probit models with a semi-parametric function, thus it allows for more flexible choice of regression function.

Journal ArticleDOI
TL;DR: In this article, a semiparametric multivariate fractionally cointegrated system is considered, with the integration orders possibly unknown and the I(0) unobservable inputs having a nonparametric spectral density.

Journal ArticleDOI
TL;DR: It is argued that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function.
Abstract: We consider Bayesian inference in semiparametric mixed models (SPMMs) for longitudinal data. SPMMs are a class of models that use a nonparametric function to model a time effect, a parametric function to model other covariate effects, and parametric or nonparametric random effects to account for the within-subject correlation. We model the nonparametric function using a Bayesian formulation of a cubic smoothing spline, and the random effect distribution using a normal distribution and alternatively a nonparametric Dirichlet process (DP) prior. When the random effect distribution is assumed to be normal, we propose a uniform shrinkage prior (USP) for the variance components and the smoothing parameter. When the random effect distribution is modeled nonparametrically, we use a DP prior with a normal base measure and propose a USP for the hyperparameters of the DP base measure. We argue that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified. This implies weak identifiability for the fixed effects, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function. We propose an adjustment using a postprocessing technique. We show that under mild conditions the posterior is proper under the proposed USP, a flat prior for the fixed effect parameters, and an improper prior for the residual variance. We illustrate the proposed approach using a longitudinal hormone dataset, and carry out extensive simulation studies to compare its finite sample performance with existing methods.
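
A sketch of the model class being discussed, with notation chosen here for illustration:

$$
y_{ij} = x_{ij}^{\top}\beta + f(t_{ij}) + b_i + \varepsilon_{ij}, \qquad b_i \mid G \sim G, \quad G \sim \mathrm{DP}\{\alpha, N(0, \tau^2)\}, \quad \varepsilon_{ij} \sim N(0, \sigma^2),
$$

where $f$ is the nonparametric time effect modelled by a cubic smoothing spline and the Dirichlet process (DP) prior on the random-effect distribution $G$ has a normal base measure. The point made in the abstract is that even with a mean-zero base measure, a realization of $G$ generally has a nonzero mean, which is what creates the weak identifiability between the random-effect mean and the fixed intercept.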

Journal ArticleDOI
TL;DR: In this article, the authors present a variety of semiparametric models that produce bounds on the average causal effect of a binary treatment on a binary outcome, exploiting variation in observable covariates to narrow the bounds.

Journal ArticleDOI
TL;DR: The consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to the Kullback-Leibler information criterion (KLIC) are obtained, and a simple consistent estimator of its asymptotic variance is provided, allowing for a first-step nonparametric estimation of the marginal survivals.

Journal ArticleDOI
TL;DR: In this article, a generalized linear-index regression model with endogenous regressors and no parametric assumptions on the error disturbances is considered, and a kernel-weighted version of the rank correlation statistic (tau) is proposed to test the significance of the effect of an endogenous regressor.
Abstract: A unifying framework to test for causal effects in nonlinear models is proposed. We consider a generalized linear-index regression model with endogenous regressors and no parametric assumptions on the error disturbances. To test the significance of the effect of an endogenous regressor, we propose a statistic that is a kernel-weighted version of the rank correlation statistic (tau) of Kendall (1938). The semiparametric model encompasses previous cases considered in the literature (continuous endogenous regressors (Blundell and Powell (2003)) and a single binary endogenous regressor (Vytlacil and Yildiz (2007))), but the testing approach is the first to allow for (i) multiple discrete endogenous regressors, (ii) endogenous regressors that are neither discrete nor continuous (e.g., a censored variable), and (iii) an arbitrary “mix” of endogenous regressors (e.g., one binary regressor and one continuous regressor).

Journal ArticleDOI
TL;DR: A penalized partial likelihood procedure is proposed to simultaneously estimate the parameters and select variables for both the parametric and the nonparametric parts of Cox models with semiparametric relative risk; the resulting estimator of the parametric part is shown to possess the oracle property, and the estimator of the nonparametric part achieves the optimal rate of convergence.
Abstract: We study the Cox models with semiparametric relative risk, which can be partially linear with one nonparametric component, or multiple additive or nonadditive nonparametric components. A penalized partial likelihood procedure is proposed to simultaneously estimate the parameters and select variables for both the parametric and the nonparametric parts. Two penalties are applied sequentially. The first penalty, governing the smoothness of the multivariate nonlinear covariate effect function, provides a smoothing spline ANOVA framework that is exploited to derive an empirical model selection tool for the nonparametric part. The second penalty, either the smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO penalty, achieves variable selection in the parametric part. We show that the resulting estimator of the parametric part possesses the oracle property, and that the estimator of the nonparametric part achieves the optimal rate of convergence. The proposed procedures are shown to work well in simulation experiments, and then applied to a real data example on sexually transmitted diseases.
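
For reference, the SCAD penalty mentioned above is defined through its derivative (this is the standard Fan-Li form, with the conventional choice $a = 3.7$):

$$
p_{\lambda}'(t) = \lambda\left\{ I(t \le \lambda) + \frac{(a\lambda - t)_{+}}{(a - 1)\lambda}\, I(t > \lambda) \right\}, \qquad t \ge 0,\ a > 2,
$$

which applies LASSO-like shrinkage to small coefficients while leaving large coefficients essentially unpenalized, the property behind the oracle result.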

Journal ArticleDOI
TL;DR: A semi-parametric technique called LSEbA is introduced that combines the aforementioned methods while retaining the advantages of both approaches, and takes advantage of all of the information in the dataset even if there is a large amount of missing values.
Abstract: The importance of Software Cost Estimation at the early stages of the development life cycle is clearly portrayed by the utilization of several models and methods that have appeared so far in the literature. Researchers' interest has been focused on two well-known techniques, namely parametric Regression Analysis and non-parametric Estimation by Analogy. Despite several comparison studies, there seems to be no consensus on which of the two is the better prediction technique. In this paper, we introduce a semi-parametric technique, called LSEbA, that combines the aforementioned methods while retaining the advantages of both approaches. Furthermore, the proposed method is consistent with the mixed nature of Software Cost Estimation data and takes advantage of all of the information in the dataset even if there is a large amount of missing values. The paper analytically illustrates the process of building such a model and presents experimentation on three representative datasets verifying the benefits of the proposed model in terms of accuracy, bias and spread. Comparisons of LSEbA with linear regression, estimation by analogy, and a combination of them based on the average of their outcomes are made through accuracy metrics, statistical tests and a graphical tool, the Regression Error Characteristic curves.
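
As a purely illustrative Python sketch of the simplest way to blend the two ingredients named above, the code below averages an ordinary least squares (regression) prediction with a nearest-neighbour (analogy) prediction, i.e., roughly the outcome-averaging baseline the abstract compares against. The synthetic features, the 3-nearest-neighbour choice, and the equal weighting are assumptions for illustration; this is not the LSEbA algorithm itself:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.uniform(1, 10, size=(200, 3))                       # synthetic project features (hypothetical size/cost drivers)
effort = 5 * X[:, 0] + np.exp(0.2 * X[:, 1]) + rng.normal(0, 2, 200)  # synthetic effort values

X_train, X_new = X[:150], X[150:]
y_train = effort[:150]

ols = LinearRegression().fit(X_train, y_train)                        # parametric (regression) component
analogy = KNeighborsRegressor(n_neighbors=3).fit(X_train, y_train)    # non-parametric (analogy) component

blended = 0.5 * ols.predict(X_new) + 0.5 * analogy.predict(X_new)     # simple 50/50 average of the two predictions
print(blended[:5])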

Journal ArticleDOI
TL;DR: In this paper, a more flexible semiparametric model was proposed to capture the changes in price elasticities with different levels of income, including income, multiple vehicle holding, presence of multiple wage earners or rural or urban residential locations.

Journal ArticleDOI
TL;DR: In this article, a semiparametric multivariate location-scatter model is considered, where the standardized random vector of the model is fixed simultaneously using two location vectors and two scatter matrices.

Journal ArticleDOI
TL;DR: This paper proposes to approximate the unknown nonparametric nondecreasing function in the probit model with a linear combination of monotone splines, leading to only a finite number of parameters to estimate.
Abstract: Interval-censored data occur naturally in many fields and the main feature is that the failure time of interest is not observed exactly, but is known to fall within some interval. In this paper, we propose a semiparametric probit model for analyzing case 2 interval-censored data as an alternative to the existing semiparametric models in the literature. Specifically, we propose to approximate the unknown nonparametric nondecreasing function in the probit model with a linear combination of monotone splines, leading to only a finite number of parameters to estimate. Both the maximum likelihood and the Bayesian estimation methods are proposed. For each method, regression parameters and the baseline survival function are estimated jointly. The proposed methods make no assumptions about the observation process and can be applicable to any interval-censored data with easy implementation. The methods are evaluated by simulation studies and are illustrated by two real-life interval-censored data applications.
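
A sketch of the model structure described, with notation chosen here for illustration:

$$
\Phi^{-1}\{F(t \mid x)\} = \alpha(t) + x^{\top}\beta, \qquad \alpha(t) \approx \sum_{j} \gamma_j I_j(t), \quad \gamma_j \ge 0,
$$

where $F(t \mid x)$ is the conditional distribution function of the failure time, $\Phi$ is the standard normal distribution function, $\alpha(\cdot)$ is the unknown nondecreasing function, and the $I_j$ are monotone (integrated) spline basis functions whose nonnegative coefficients keep the approximation nondecreasing, leaving only finitely many parameters to estimate.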

Journal ArticleDOI
TL;DR: A semi-parametric Poisson-gamma model is used to estimate the relationships between crash counts and various roadway characteristics, including curvature, traffic levels, speed limit and surface width; key factors for explaining crash rate variability across roadways are found to be the amount and density of traffic, the presence and degree of a horizontal curve, and road classification.
Abstract: This paper uses a semi-parametric Poisson-gamma model to estimate the relationships between crash counts and various roadway characteristics, including curvature, traffic levels, speed limit and surface width. A Bayesian nonparametric estimation procedure is employed for the model's link function, substantially reducing the risk of a mis-specified model. It is shown via simulation that little is lost in terms of estimation quality if the nonparametric estimation procedure is used when standard parametric assumptions (e.g., linear functional forms) are satisfied, but there is significant gain if the parametric assumptions are violated. It is also shown that imposing appropriate monotonicity constraints on the relationships provides better function estimates. Results suggest that key factors for explaining crash rate variability across roadways are the amount and density of traffic, the presence and degree of a horizontal curve, and road classification. Issues related to count forecasting on individual roadway segments and out-of-sample validation measures are also discussed.
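
For context, the fully parametric baseline that such a semiparametric Poisson-gamma (negative binomial) model generalizes can be fit in a few lines of Python; the covariates, coefficients, and dispersion value below are synthetic and illustrative, and the Bayesian nonparametric link estimation described in the abstract is not reproduced here:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
aadt = rng.uniform(500, 20000, n)           # synthetic traffic volume on each segment
curve = rng.binomial(1, 0.3, n)             # synthetic horizontal-curve indicator
mu = np.exp(-6.0 + 0.8 * np.log(aadt) + 0.4 * curve)
crashes = rng.poisson(rng.gamma(2.0, mu / 2.0))   # Poisson counts with gamma-distributed means (negative binomial)

X = sm.add_constant(np.column_stack([np.log(aadt), curve]))
model = sm.GLM(crashes, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(model.summary())

The semiparametric version in the paper replaces the fixed log-linear link used here with a nonparametrically estimated (and optionally monotone-constrained) link function.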