Showing papers on "Semiparametric model" published in 1999


Journal ArticleDOI
TL;DR: In this paper, the memory of non-stationary processes is estimated using a Gaussian semiparametric estimate of long-range dependence, which is consistent for d ∈ (−½, 1) and asymptotically normal for d ∈ (−½, ¾) under a similar set of assumptions to those in Robinson's paper.
Abstract: Generalizing the definition of the memory parameter d in terms of the differentiated series, we showed in Velasco (Non-stationary log-periodogram regression, Forthcoming J. Economet., 1997) that it is possible to estimate consistently the memory of non-stationary processes using methods designed for stationary long-range-dependent time series. In this paper we consider the Gaussian semiparametric estimate analysed by Robinson (Gaussian semiparametric estimation of long range dependence. Ann. Stat. 23 (1995), 1630–61) for stationary processes. Without a priori knowledge about the possible non-stationarity of the observed process, we obtain that this estimate is consistent for d ∈ (−½, 1) and asymptotically normal for d ∈ (−½, ¾) under a similar set of assumptions to those in Robinson's paper. Tapering the observations, we can estimate any degree of non-stationarity, even in the presence of deterministic polynomial trends of time. The semiparametric efficiency of this estimate for stationary sequences also extends to the non-stationary framework.

378 citations
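
The Gaussian semiparametric estimate in the entry above is commonly called the local Whittle estimate. The following is a minimal, untapered sketch of it, assuming the usual objective R(d) = log(mean(λ_j^{2d} I(λ_j))) − 2d · mean(log λ_j) over the first m Fourier frequencies. The function name `local_whittle`, the grid search, the search range and the bandwidth rule m = n^0.65 are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np


def local_whittle(x, m, d_grid=None):
    """Estimate the memory parameter d from the first m Fourier frequencies.

    Minimises R(d) = log(mean(lam**(2d) * I)) - 2d * mean(log(lam)) over a
    grid of candidate values (a crude stand-in for a numerical optimiser).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if d_grid is None:
        d_grid = np.linspace(-0.49, 1.5, 400)  # assumed search range
    # Fourier frequencies lambda_j = 2*pi*j/n for j = 1, ..., m
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    # Periodogram I(lambda_j) = |sum_t x_t exp(-i t lambda_j)|^2 / (2*pi*n)
    periodogram = np.abs(np.fft.fft(x)[1 : m + 1]) ** 2 / (2.0 * np.pi * n)
    mean_log_lam = np.mean(np.log(lam))
    objective = [
        np.log(np.mean(lam ** (2.0 * d) * periodogram)) - 2.0 * d * mean_log_lam
        for d in d_grid
    ]
    return float(d_grid[int(np.argmin(objective))])


rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
m = int(noise.size ** 0.65)                   # assumed bandwidth, illustration only
d_noise = local_whittle(noise, m)             # white noise has d = 0
d_walk = local_whittle(np.cumsum(noise), m)   # a random walk has d = 1
print(d_noise, d_walk)
```

The sketch applies no taper, so it is only meant to illustrate the range d ∈ (−½, 1) discussed in the abstract; higher degrees of non-stationarity would need the tapered version.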


Journal ArticleDOI
TL;DR: In this paper, the authors consider the partially linear model relating a response $Y$ to predictors ($X, T$) with mean function $X^{\top}\beta + g(T)$ when the $X$s are measured with additive error.
Abstract: We consider the partially linear model relating a response $Y$ to predictors ($X, T$) with mean function $X^{\top}\beta + g(T)$ when the $X$’s are measured with additive error. The semiparametric likelihood estimate of Severini and Staniswalis leads to biased estimates of both the parameter $\beta$ and the function $g(\cdot)$ when measurement error is ignored. We derive a simple modification of their estimator which is a semiparametric version of the usual parametric correction for attenuation. The resulting estimator of $\beta$ is shown to be consistent and its asymptotic distribution theory is derived. Consistent standard error estimates using sandwich-type ideas are also developed.

309 citations
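
The idea of a correction for attenuation in the partially linear model above can be sketched as follows. This is a simplified illustration, not the authors' estimator: it assumes a scalar covariate, a known measurement-error variance `sigma_uu`, and a Nadaraya-Watson smoother with a Gaussian kernel. All function and variable names are hypothetical.

```python
import numpy as np


def smoother_matrix(t, h):
    """Nadaraya-Watson smoother matrix (Gaussian kernel, bandwidth h)."""
    z = (t[:, None] - t[None, :]) / h
    k = np.exp(-0.5 * z ** 2)
    return k / k.sum(axis=1, keepdims=True)


def plm_slope(w, y, t, h, sigma_uu=0.0):
    """Slope in y = beta * x + g(t) + e when w = x + u is observed instead of x.

    sigma_uu is the (assumed known) measurement-error variance; 0 gives the
    naive estimate that ignores measurement error.
    """
    S = smoother_matrix(t, h)
    w_res = w - S @ w   # remove the smooth-in-t part of w
    y_res = y - S @ y   # remove the smooth-in-t part of y
    n = y.size
    # Subtracting n * sigma_uu undoes the inflation of sum(w_res**2) caused by u
    return float(w_res @ y_res / (w_res @ w_res - n * sigma_uu))


rng = np.random.default_rng(1)
n, beta, sigma_uu = 1500, 2.0, 0.5
t = rng.uniform(0.0, 1.0, n)
x = t + rng.standard_normal(n)                       # true covariate
w = x + np.sqrt(sigma_uu) * rng.standard_normal(n)   # error-prone measurement
y = beta * x + np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(n)

naive = plm_slope(w, y, t, h=0.1)                          # attenuated towards 0
corrected = plm_slope(w, y, t, h=0.1, sigma_uu=sigma_uu)   # attenuation removed
print(naive, corrected)
```

In this simulated example the naive slope is pulled towards zero, while the corrected slope should land near the true value of 2.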


Journal ArticleDOI
TL;DR: In this paper, a two-step estimator of the long memory parameters of a vector process is presented, where the objective function is a semiparametric version of the multivariate Gaussian likelihood function in the frequency domain.

162 citations


Journal ArticleDOI
TL;DR: The partly linear additive Cox model is an extension of the (linear) Cox model that allows flexible semiparametric modeling of covariate effects.
Abstract: The partly linear additive Cox model is an extension of the (linear) Cox model and allows flexible modeling of covariate effects semiparametrically. We study asymptotic properties of the maximum partial likelihood estimator of this model with right-censored data using polynomial splines. We show that, with a range of choices of the smoothing parameter (the number of spline basis functions) required for estimation of the nonparametric components, the estimator of the finite-dimensional regression parameter is root-$n$ consistent, asymptotically normal and achieves the semiparametric information bound. Rates of convergence for the estimators of the nonparametric components are obtained. They are comparable to the rates in nonparametric regression. Implementation of the estimation approach can be done easily and is illustrated by using a simulated example.

146 citations


Posted Content
TL;DR: In this paper, a semiparametric regression model is proposed for the relationship between sales and price discounts, because a fully nonparametric regression model suffers from the curse of dimensionality; the semiparametric model captures the nonlinearities and interactions in the deal effect curves and predicts better on average than two parametric benchmark models.
Abstract: The marketing literature suggests several phenomena that may contribute to the shape of the relationship between sales and price discounts. These phenomena can produce severe nonlinearities and interactions in the curves, and we argue that those are best captured with a flexible approach. Since a fully nonparametric regression model suffers from the curse of dimensionality, we propose a semiparametric regression model. Store-level sales over time are modeled as a nonparametric function of own- and cross-item price discounts, and a parametric function of other predictors (all indicator variables). We compare the predictive validity of the semiparametric model with that of two parametric benchmark models and obtain better performance on average. The results for three product categories indicate, among other things, threshold and saturation effects for both own- and cross-item temporary price cuts. We also show how the own-item curve depends on other items’ price discounts (flexible interaction effects). In a separate analysis, we show how the shape of the deal effect curve depends on own-item promotion signals. Our results indicate that prevailing methods for the estimation of deal effects on sales are inadequate.

145 citations


Journal ArticleDOI
TL;DR: In this article, a simple maximum partial likelihood method for deriving the semiparametric maximum likelihood estimator is proposed, and a discussion of assumptions under which the selection bias model is identifiable and uniquely estimable is presented.
Abstract: The following problem is treated: given s possibly selection biased samples from an unknown distribution function, and assuming that the sampling rule weight functions for each of the samples are mathematically specified up to a common unknown finite-dimensional parameter, how can we use the data to estimate the unknown parameters? We propose a simple maximum partial likelihood method for deriving the semiparametric maximum likelihood estimator. A discussion of assumptions under which the selection bias model is identifiable and uniquely estimable is presented. We motivate the need for the methodology by discussing the generalised logistic regression model (Gilbert, Self & Ashby, 1998), a semiparametric selection bias model which is useful for assessing from vaccine trial data how the efficacy of an HIV vaccine varies with characteristics of the exposing virus. We show through simulations and an example that the maximum likelihood estimator in the generalised logistic regression model has satisfactory finite-sample properties.

132 citations


Journal ArticleDOI
Qi Li
TL;DR: In this paper, general hypothesis testing problems for nonparametric and semiparametric time-series econometric models are considered and Monte Carlo simulations are conducted to examine the finite sample performances of the non-parametric omitted variable test and the test for a partially linear specification.

125 citations


Journal ArticleDOI
TL;DR: In this paper, the authors explore additive models that combine both parametric and nonparametric terms and propose a √n-consistent backfitting estimator for the parametric component of the model.
Abstract: We explore additive models that combine both parametric and nonparametric terms and propose a √n-consistent backfitting estimator for the parametric component of the model. The theoretical properties of the estimator are developed for the case with a single nonparametric term and extended to an arbitrary number of nonparametric additive terms. An estimator for the optimal bandwidth making minimal use of asymptotic expressions for bias and variance is proposed, and a fast implementation algorithm for model fitting and bandwidth selection is developed. The practical behavior of the estimator and bandwidth selection is illustrated by simulation experiments.

102 citations


Journal ArticleDOI
TL;DR: In this article, a weighted empirical odds function is proposed for fitting the proportional odds regression model with right-censored survival times, and several classes of new regression estimators, such as the pseudo-maximum likelihood estimator, martingale residual-based estimators and minimum distance estimators are derived.
Abstract: For fitting the proportional odds regression model with right-censored survival times, we introduce some weighted empirical odds functions. These functions are solutions of some self-consistency equations and have a nice martingale representation. From these functions, several classes of new regression estimators, such as the pseudo–maximum likelihood estimator, martingale residual-based estimators, and minimum distance estimators, are derived. These estimators have desirable properties such as easy computation, asymptotic normality via a martingale analysis, and reliable asymptotic covariance estimation in closed form. Extensive numerical studies show that the minimum $L_2$ distance estimators have very good finite-sample behaviors compared to existing methods. Results of some simulation studies and applications to a real dataset are given. The weighted odds function–based approach also provides inference on the baseline odds function and some measures for lack-of-fit analysis.

94 citations


Journal ArticleDOI
TL;DR: A semiparametric mixed effects regression model is proposed for the analysis of clustered or longitudinal data with continuous, ordinal, or binary outcome and Monte Carlo results are presented to show that the method can improve the mean squared error of the fixed effects estimators when the random effects distribution is not Gaussian.
Abstract: A semiparametric mixed effects regression model is proposed for the analysis of clustered or longitudinal data with continuous, ordinal, or binary outcome. The common assumption of Gaussian random effects is relaxed by using a predictive recursion method (Newton and Zhang, 1999) to provide a nonparametric smooth density estimate. A new strategy is introduced to accelerate the algorithm. Parameter estimates are obtained by maximizing the marginal profile likelihood by Powell's conjugate direction search method. Monte Carlo results are presented to show that the method can improve the mean squared error of the fixed effects estimators when the random effects distribution is not Gaussian. The usefulness of visualizing the random effects density itself is illustrated in the analysis of data from the Wisconsin Sleep Survey. The proposed estimation procedure is computationally feasible for quite large data sets.

65 citations


Journal ArticleDOI
TL;DR: In this article, a semiparametric time-varying regression model for longitudinal data is proposed, in which the influence of some of the covariates varies nonparametrically with time while the effects of the remaining covariates are constant.
Abstract: In previous work we have studied a nonparametric additive time-varying regression model for longitudinal data recorded at irregular intervals. The model allows the influence of each covariate to vary separately with time. For small datasets, however, only a limited number of covariates may be handled in this way. In this paper, we introduce a semiparametric regression model for longitudinal data. The influence of some of the covariates varies nonparametrically with time while the effects of the remaining covariates are constant. No smoothing is necessary in the estimation of the parametric terms of the model. Asymptotics are derived using martingale techniques for the cumulative regression functions, which are much easier to estimate and study than the regression functions themselves. The approach is applied to longitudinal data from the Copenhagen Study Group for Liver Diseases (Schlichting et al., 1983).

Journal ArticleDOI
TL;DR: In this article, the authors present the Bayesian analysis of a semiparametric regression model that consists of parametric and nonparametric components, where the nonparametric component is represented with a Fourier series whose coefficients are assumed a priori to have zero means and to decay to 0 in probability at either algebraic or geometric rates.
Abstract: This paper presents the Bayesian analysis of a semiparametric regression model that consists of parametric and nonparametric components. The nonparametric component is represented with a Fourier series where the Fourier coefficients are assumed a priori to have zero means and to decay to 0 in probability at either algebraic or geometric rates. The rate of decay controls the smoothness of the response function. The posterior analysis automatically selects the amount of smoothing that is coherent with the model and data. Posterior probabilities of the parametric and semiparametric models provide a method for testing the parametric model against a non-specific alternative. The Bayes estimator's mean integrated squared error compares favourably with the theoretically optimal estimator for kernel regression.

Journal ArticleDOI
TL;DR: A marginal mixed baseline hazards model is introduced and the proposed estimators are shown to be consistent and asymptotically Gaussian with a robust covariance matrix that can be consistently estimated.
Abstract: In multivariate failure time data analysis, a marginal regression modeling approach is often preferred to avoid assumptions on the dependence structure among correlated failure times. In this paper, a marginal mixed baseline hazards model is introduced. Estimating equations are proposed for the estimation of the marginal hazard ratio parameters. The proposed estimators are shown to be consistent and asymptotically Gaussian with a robust covariance matrix that can be consistently estimated. Simulation studies indicate the adequacy of the proposed methodology for practical sample sizes. The methodology is illustrated with a data set from the Framingham Heart Study.

Journal ArticleDOI
TL;DR: The effects of certain chemical additives on maintaining a high level of activity in protein constructs during storage are investigated using a semiparametric regression technique extended to handle categorical explanatory variables; the protein activity response appears to be extremely erratic.
Abstract: The effects of certain chemical additives on maintaining a high level of activity in protein constructs during storage are investigated. We use a semiparametric regression technique to model the effects of the additives on protein activity. The model is extended to handle categorical explanatory variables. On the basis of the available data, the important factors are estimated to be buffer, detergent, protein concentration, and storage temperature. The relationships among protein activity and these factors appear to be moderately nonlinear with strong interaction effects. These features are revealed in a data-adaptive way by the semiparametric model, without explicit modeling of the nonlinearities or interactions. We use cross-validation to assess the fit of our model. The protein activity response appears to be extremely erratic. We recommend several sets of storage conditions and that further design points be chosen in regions around these estimated optima.

Journal ArticleDOI
TL;DR: In this paper, a semiparametric regression model under long-range dependent errors is considered, and consistent estimators for both parametric and nonparametric components are investigated.

Journal ArticleDOI
TL;DR: In this paper, the authors studied large sample theory of estimators of the error distribution for the semiparametric model $Y = X^{\top}\beta + g(T) + \varepsilon$. Under appropriate conditions, they proved that the estimators converge in probability, converge almost surely and converge uniformly almost surely.
Abstract: The paper studies large sample theory of estimators of the error distribution for the semiparametric model $Y = X^{\top}\beta + g(T) + \varepsilon$. Under appropriate conditions, we prove that the estimators converge in probability, converge almost surely and converge uniformly almost surely. Asymptotic normality and the rates of convergence of the estimators are also investigated. Finally we establish the law of the iterated logarithm for the estimators.

Journal ArticleDOI
TL;DR: In this article, the authors examined score and adjusted score statistics in the context of parametric and semiparametric regression models, where the adjustments correct for the bias induced by substituting the maximum likelihood estimates of the parameters into the test statistic.
Abstract: Tests of homogeneity are being increasingly used for the analysis of event time data, but relatively little attention has been paid to their distributional properties in settings with small to moderate sample sizes. Here we consider tests of homogeneity for recurrent event data in which the null model is a Poisson process and the alternative is a mixed Poisson process. We examine score and adjusted score statistics in the context of parametric and semiparametric regression models, where the adjustments correct for the bias induced by substituting the maximum likelihood estimates of the parameters into the test statistic. We report the results of simulation studies suggesting that the adjusted score statistics are superior in terms of size and power. We also find that the adjusted score test in the semiparametric regression model does not perform particularly well in small samples, but the adjusted score test based on a piecewise exponential model has satisfactory performance. The latter thus prov...

Journal ArticleDOI
TL;DR: The main advantage of this method rests in the fact that neither the interpretation of the parameters nor the validity of the analysis depend on the appropriateness of the PH or any of the other semiparametric models.
Abstract: We propose a method for fitting semiparametric models such as the proportional hazards (PH), additive risks (AR), and proportional odds (PO) models. Each of these semiparametric models implies that some transformation of the conditional cumulative hazard function (at each t) depends linearly on the covariates. The proposed method is based on nonparametric estimation of the conditional cumulative hazard function, forming a weighted average over a range of t-values, and subsequent use of least squares to estimate the parameters suggested by each model. An approximation to the optimal weight function is given. This allows semiparametric models to be fitted even in incomplete data cases where the partial likelihood fails (e.g., left censoring, right truncation). However, the main advantage of this method rests in the fact that neither the interpretation of the parameters nor the validity of the analysis depend on the appropriateness of the PH or any of the other semiparametric models. In fact, we propose an integrated method for data analysis where the role of the various semiparametric models is to suggest the best fitting transformation. A single continuous covariate and several categorical covariates (factors) are allowed. Simulation studies indicate that the test statistics and confidence intervals have good small-sample performance. A real data set is analyzed.


Journal ArticleDOI
TL;DR: This work obtains an explicit expression for the estimate of the regression coefficients given by the back-fitting algorithm and presents an alternative, approximate method for calculating the standard errors that is less demanding; the method is exact with smoothing splines and gives good estimates of standard errors with loess.
Abstract: We consider semiparametric models with p regressor terms and q smooth terms. We obtain an explicit expression for the estimate of the regression coefficients given by the back-fitting algorithm. The calculation of the standard errors of these estimates based on this expression is a considerable computational exercise. We present an alternative, approximate method of calculation that is less demanding. With smoothing splines, the method is exact, while with loess, it gives good estimates of standard errors. We assess the adequacy of our approximation and of another approximation with the help of two examples.

Journal ArticleDOI
TL;DR: In this article, the estimation of a location parameter in the binary choice model with some weak distributional assumptions imposed on the error term in the latent regression model is considered, and two estimators are proposed, both of which are two-step estimators.
Abstract: This paper considers the estimation of a location parameter in the binary choice model with some weak distributional assumptions imposed on the error term in the latent regression model. Two estimators are proposed here, both of which are two-step estimators; in the first step, the slope parameters are consistently estimated by existing methods; in the second step, the location parameter is consistently estimated based on a moment condition. The estimators are shown to be consistent and asymptotically normal. A small Monte Carlo study illustrates the usefulness of the estimators. We also point out that the location and slope parameters can be estimated simultaneously.

Journal ArticleDOI
TL;DR: In this paper, a semiparametric extension of the projected score method for the elimination of nuisance parameters is proposed, where only the mean and the variance of the response variable are specified and where the mean function involves both parameters of interest and nuisance parameters.
Abstract: This paper proposes a semiparametric extension of the projected score method of Waterman & Lindsay (1996) for the elimination of nuisance parameters. The procedure addresses cases where only the mean and the variance of the response variable are specified and where the mean function involves both parameters of interest and nuisance parameters. Important applications of the semiparametric model include quasilikelihood models for matched designs and for measurement error models (Carroll & Stefanski, 1990). As a result of the optimality and information-unbiasedness of the quasi-score function, a second-order quasi-score basis of estimating functions for the nuisance parameter is derived. Second-order locally ancillary estimating functions (Small & McLeish, 1994, pp. 81-4) are then obtained by solving a simple linear system that corresponds to a true projection for canonical exponential family distributions. Asymptotic arguments and simulation work show that the impact of nuisance parameters is considerably reduced when adopting the proposed approach.

01 Jan 1999
TL;DR: In this article, the authors examined the asymptotic behavior of the bandwidth choice based on a general bandwidth selector which covers such well known data-driven methods as GCV and CV.
Abstract: Speckman (1988) proposed a kernel smoothing method to estimate the parametric component β in the semiparametric regression model $y = x^{\top}\beta + g(t) + e$, and showed that this kernel smoothing estimator is √n-consistent for a certain deterministic bandwidth choice. However, the important issue of automatic bandwidth choice in this semiparametric setting has not been examined. This paper studies the asymptotic behavior of the bandwidth choice based on a general bandwidth selector which covers such well known data-driven methods as GCV and CV. This automatic bandwidth choice is proved to be asymptotically optimal and its asymptotic normality is established. The resulting data-driven kernel smoothing estimator of β is then shown to be still √n-consistent. A simulation study is performed to compare small sample behaviors of various commonly used bandwidth selectors in this semiparametric setting, and a real data example is given.
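
A data-driven bandwidth choice of this kind can be sketched with generalized cross-validation (GCV) applied to a partial-residual kernel estimator. The sketch below is an illustration under stated assumptions rather than the paper's procedure: it uses a Nadaraya-Watson smoother with a Gaussian kernel, a small fixed grid of candidate bandwidths, and the hypothetical function names `speckman_fit` and `gcv_select`.

```python
import numpy as np


def smoother_matrix(t, h):
    """Nadaraya-Watson smoother matrix (Gaussian kernel, bandwidth h)."""
    z = (t[:, None] - t[None, :]) / h
    k = np.exp(-0.5 * z ** 2)
    return k / k.sum(axis=1, keepdims=True)


def speckman_fit(X, y, t, h):
    """Partial-residual fit of y = X @ beta + g(t) + e.

    Returns beta_hat and the hat matrix A with fitted values A @ y.
    """
    n = y.size
    S = smoother_matrix(t, h)
    R = np.eye(n) - S                 # residual-making operator for the smoother
    X_res, y_res = R @ X, R @ y       # partial residuals of X and y given t
    gram_inv_Xt = np.linalg.solve(X_res.T @ X_res, X_res.T)
    beta = gram_inv_Xt @ y_res        # least squares on the partial residuals
    # Fitted values are S y + (I - S) X beta_hat, which is linear in y
    hat = S + X_res @ gram_inv_Xt @ R
    return beta, hat


def gcv_select(X, y, t, bandwidths):
    """Pick the bandwidth minimising GCV(h) = mean(resid**2) / (1 - tr(A)/n)**2."""
    n = y.size
    best = None
    for h in bandwidths:
        beta, hat = speckman_fit(X, y, t, h)
        rss = np.mean((y - hat @ y) ** 2)
        score = rss / (1.0 - np.trace(hat) / n) ** 2
        if best is None or score < best[0]:
            best = (score, h, beta)
    return best[1], best[2]


rng = np.random.default_rng(2)
n = 400
t = rng.uniform(0.0, 1.0, n)
X = (t + rng.standard_normal(n))[:, None]   # one covariate, correlated with t
y = 1.5 * X[:, 0] + np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(n)
h_gcv, beta_gcv = gcv_select(X, y, t, bandwidths=[0.02, 0.05, 0.1, 0.2, 0.5])
print(h_gcv, beta_gcv)
```

A grid search keeps the example short; a continuous optimiser over h would be the natural replacement in practice.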

Journal ArticleDOI
Hemant Ishwaran
TL;DR: In this paper, it was shown that the score function for a finite-dimensional parameter can be made arbitrarily small depending upon the direction taken in the parameter space, and that the rate continues to be unattainable even when the mixing distribution is constrained to be countably discrete.
Abstract: In a class of semiparametric mixture models, the score function (and consequently the effective information) for a finite-dimensional parameter can be made arbitrarily small depending upon the direction taken in the parameter space. This result holds for a broad range of semiparametric mixtures over exponential families and includes examples such as the gamma semiparametric mixture, the normal mean mixture, the Weibull semiparametric mixture and the negative binomial mixture. The near-zero information rules out the usual parametric $\sqrt{n}$ rate for the finite-dimensional parameter, but even more surprising is that the rate continues to be unattainable even when the mixing distribution is constrained to be countably discrete. Two key conditions which lead to a loss of information are the smoothness of the underlying density and whether a sufficient statistic is invertible.

Posted Content
TL;DR: In this paper, the efficiency of two-step estimators with a nonparametric first step is investigated and it is shown that the efficient moment condition often leads to an estimator that attains the semiparametric efficiency bound.
Abstract: Two step estimators with a nonparametric first step are important, particularly for sample selection models where the first step is estimation of the propensity score. In this paper we consider the efficiency of such estimators. We characterize the efficient moment condition for a given first step nonparametric estimator. We also show how it is possible to approximately attain efficiency by combining many moment conditions. In addition we find that the efficient moment condition often leads to an estimator that attains the semiparametric efficiency bound. As illustrations we consider models with expectations and semiparametric minimum distance estimation.

Journal ArticleDOI
TL;DR: In this article, a method for using parametric information to modify a nonparametric estimator at the level of relatively high-order derivatives has been proposed, which represents an alternative to methods that first fit a parametric model and then adjust it.
Abstract: We suggest a method for using parametric information to modify a nonparametric estimator at the level of relatively high-order derivatives. The technique represents an alternative to methods that first fit a parametric model and then adjust it. In particular, relative to a ‘nonparametric estimator with a parametric start’, our estimator is not biased by the differences between parametric and nonparametric fits to low-order derivatives, since we effectively remove all the parametric information about low-order derivatives and replace it by nonparametric information. Thus, we employ parametric information only when the nonparametric information is unreliable, and do not use it elsewhere. The method has application to both nonparametric density estimation and nonparametric regression.

Journal ArticleDOI
TL;DR: In this article, a semiparametric mixture model for human fertility studies is described, where the probability of conception is a product of two components; the mixing distribution, the component that introduces the heterogeneity among the menstrual cycles that come from different couples, is characterized nonparametrically by a finite number of moments.


Book ChapterDOI
TL;DR: In this article, valid theoretical and empirical Edgeworth expansions are established for density-weighted averaged derivative estimates of semiparametric index models.
Abstract: We establish valid theoretical and empirical Edgeworth expansions for density-weighted averaged derivative estimates of semiparametric index models.

Journal ArticleDOI
TL;DR: In this paper, the authors compare two semi-parametric estimators designed to strike a trade-off between efficiency and robustness: a weighted average of the PEB and NEB and a kernel smoother of the NPMLE.