Showing papers on "Semiparametric model" published in 2001


Report SeriesDOI
01 Nov 2001
TL;DR: In this article, the authors consider the nonparametric and semiparametric methods for estimating regression models with continuous endogenous regressors and identify the "average structural function" as a parameter of central interest.
Abstract: This paper considers the nonparametric and semiparametric methods for estimating regression models with continuous endogenous regressors. We list a number of different generalizations of the linear structural equation model, and discuss how two common estimation approaches for linear equations, the "instrumental variables" and "control function" approaches, may be extended to nonparametric generalizations of the linear model and to their semiparametric variants. We consider the identification and estimation of the "Average Structural Function" and argue that this is a parameter of central interest in the analysis of semiparametric and nonparametric models with endogenous regressors. We consider a particular semiparametric model, the binary response model with linear index function and nonparametric error distribution, and describe in detail how estimation of the parameters of interest can be constructed using the "control function" approach. This estimator is applied to estimating the relation of labor force participation to nonlabor income, viewed as an endogenous regressor.

578 citations
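
The control function idea named in the abstract above can be illustrated in its simplest linear form. The sketch below is not the paper's semiparametric binary response estimator; it assumes a single endogenous regressor, a single instrument, no intercept, and simulated data, and all names (`control_function_estimate`, `z`, `v_hat`) are hypothetical.

```python
import random

def control_function_estimate(y, x, z):
    """Two-step control function estimate for y = beta*x + u (no intercept)."""
    # First stage: regress x on the instrument z and keep the residual v_hat.
    pi_hat = sum(zi * xi for zi, xi in zip(z, x)) / sum(zi * zi for zi in z)
    v_hat = [xi - pi_hat * zi for xi, zi in zip(x, z)]
    # Second stage: regress y on (x, v_hat) via the 2x2 normal equations.
    sxx = sum(xi * xi for xi in x)
    sxv = sum(xi * vi for xi, vi in zip(x, v_hat))
    svv = sum(vi * vi for vi in v_hat)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    svy = sum(vi * yi for vi, yi in zip(v_hat, y))
    det = sxx * svv - sxv * sxv
    beta_hat = (svv * sxy - sxv * svy) / det   # coefficient on x
    rho_hat = (sxx * svy - sxv * sxy) / det    # coefficient on the control
    return beta_hat, rho_hat

# Simulated data (an assumption for illustration): x is endogenous because
# its first-stage error v also enters the structural error u.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]
e = [rng.gauss(0.0, 0.5) for _ in range(n)]
x = [zi + vi for zi, vi in zip(z, v)]
y = [2.0 * xi + 0.8 * vi + ei for xi, vi, ei in zip(x, v, e)]

beta_hat, rho_hat = control_function_estimate(y, x, z)
ols_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
print(beta_hat, rho_hat, ols_hat)
```

Including the first-stage residual as an extra regressor absorbs the part of the error that is correlated with `x`, so the coefficient on `x` should land near the true value of 2 while the plain least squares slope is biased upward in this simulation.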


Journal ArticleDOI
TL;DR: In this article, the authors consider estimation in a semiparametric generalized linear model for clustered data using estimating equations and show that the conventional profile-kernel method often fails to yield a √n-consistent estimator of β along with appropriate inference unless working independence is assumed or θ(t) is artificially undersmoothed, in which case asymptotic inference is possible.
Abstract: We consider estimation in a semiparametric generalized linear model for clustered data using estimating equations. Our results apply to the case where the number of observations per cluster is finite, whereas the number of clusters is large. The mean of the outcome variable μ is of the form g(μ) = X^T β + θ(T), where g(·) is a link function, X and T are covariates, β is an unknown parameter vector, and θ(t) is an unknown smooth function. Kernel estimating equations proposed previously in the literature are used to estimate the infinite-dimensional nonparametric function θ(t), and a profile-based estimating equation is used to estimate the finite-dimensional parameter vector β. We show that for clustered data, this conventional profile-kernel method often fails to yield a √n-consistent estimator of β along with appropriate inference unless working independence is assumed or θ(t) is artificially undersmoothed, in which case asymptotic inference is possible. To gain insight into these results, we derive the se...

277 citations
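
A rough illustration of the profile-kernel idea in the abstract above, for the easiest special case only: identity link, a scalar covariate X, and observations treated as independent (the working-independence case the abstract mentions). It does not handle within-cluster correlation, which is the paper's actual subject. The function names, the Gaussian kernel, the bandwidth, and the simulated data are all assumptions made for this sketch.

```python
import math
import random

def kernel_smooth(t, values, h):
    """Nadaraya-Watson smoother (Gaussian kernel) evaluated at each t[i]."""
    fitted = []
    for t0 in t:
        w = [math.exp(-0.5 * ((t0 - ti) / h) ** 2) for ti in t]
        fitted.append(sum(wi * vi for wi, vi in zip(w, values)) / sum(w))
    return fitted

def profile_beta(y, x, t, h):
    """Profile out theta(t): smooth y and x on t, then regress residuals."""
    ry = [yi - mi for yi, mi in zip(y, kernel_smooth(t, y, h))]
    rx = [xi - mi for xi, mi in zip(x, kernel_smooth(t, x, h))]
    return sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)

# Simulated data (illustrative assumption): y = 2*x + sin(2*pi*t) + noise.
rng = random.Random(0)
n = 200
t = [(i + 0.5) / n for i in range(n)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * xi + math.sin(2.0 * math.pi * ti) + rng.gauss(0.0, 0.1)
     for xi, ti in zip(x, t)]

beta_hat = profile_beta(y, x, t, h=0.05)
# Plug beta_hat back in and smooth the partial residuals to recover theta(t).
theta_hat = kernel_smooth(t, [yi - beta_hat * xi for yi, xi in zip(y, x)], 0.05)
print(beta_hat)
```

The parametric part is estimated after the smooth component has been removed from both the outcome and the covariate; the nonparametric part is then recovered by smoothing what is left once the fitted linear term is subtracted.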


Journal ArticleDOI
TL;DR: The authors develop two fully Bayesian modeling approaches, employing mixture models, for the errors in a median regression model; the associated families of error distributions allow for increased variability, skewness, and flexible tail behavior.
Abstract: Median regression models become an attractive alternative to mean regression models when employing flexible families of distributions for the errors. Classical approaches are typically algorithmic with desirable properties emerging asymptotically. However, nonparametric error models may be most attractive in the case of smaller sample sizes where parametric specifications are difficult to justify. Hence, a Bayesian approach, enabling exact inference given the observed data, may be appealing. In this context there is little Bayesian work. We develop two fully Bayesian modeling approaches, employing mixture models, for the errors in a median regression model. The associated families of error distributions allow for increased variability, skewness, and flexible tail behavior. The first family is semiparametric with extra variability captured nonparametrically through mixing and skewness handled parametrically. The second family, a fully nonparametric one, includes all unimodal densities on the real line with...

208 citations


Journal ArticleDOI
TL;DR: In this article, a narrow-band frequency domain least squares estimate of the cointegrating vector, and related semiparametric methods of inference for testing the memory of observables and the presence of fractional cointegration are presented.

160 citations


Journal ArticleDOI
TL;DR: In this paper, a semiparametric model that makes no assumptions about the response distributions is used to compare five methods for evaluating treatment effects in a randomized pretest-posttest trial with two treatment groups (two-sample t test, paired t test, ANCOVA I, ANCOVA II, and GEE); the asymptotic properties and relative efficiencies of the resulting estimators are discussed.
Abstract: Several possible methods used to evaluate treatment effects in a randomized pretest–posttest trial with two treatment groups are the two-sample t test, the paired t test, analysis of covariance I (ANCOVA I), the analysis of covariance II (ANCOVA II), and generalized estimating equations (GEE). The ANCOVA I includes treatment and baseline response as covariates in a linear model and ANCOVA II additionally includes an interaction term between the baseline response and treatment indicator as a covariate. The parameters in the ANCOVA I and ANCOVA II models are generally estimated using ordinary least squares. In this article, a semiparametric model, which makes no assumptions about the response distributions, is used. The asymptotic properties of the estimators derived from these five methods and their relative efficiencies are discussed under this semiparametric model. We show that all these methods yield consistent estimators for the treatment effect which have asymptotically normal distributions under the se...

152 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a semiparametric regression model of the relationship between sales and price discounts. Because a fully nonparametric regression model suffers from the curse of dimensionality, store-level sales are modeled as a nonparametric function of own- and cross-item price discounts and a parametric function of other predictors.
Abstract: The marketing literature suggests several phenomena that may contribute to the nature of the relationship between sales and price discounts. These phenomena can produce complex nonlinearities and interactions in the deal effect curve that are best captured with a flexible approach. Because a fully nonparametric regression model suffers from the curse of dimensionality, the authors propose a semiparametric regression model. Store-level sales over time are modeled as a nonparametric function of own- and cross-item price discounts and a parametric function of other predictors. The authors compare the predictive validity of the semiparametric model with that of two parametric benchmark models and obtain superior performance. The results for three product categories indicate threshold and saturation effects for both own- and cross-item temporary price cuts. The authors also find that the nature of the own-item curve depends on other items’ price discounts. Comparisons with restricted model specification...

150 citations


Journal ArticleDOI
TL;DR: In this paper, a unified semiparametric Bayesian approach based on Markov random field priors for analyzing the dependence of multicategorical response variables on time, space and further covariates is presented.
Abstract: We present a unified semiparametric Bayesian approach based on Markov random field priors for analyzing the dependence of multicategorical response variables on time, space and further covariates. The general model extends dynamic, or state space, models for categorical time series and longitudinal data by including spatial effects as well as nonlinear effects of metrical covariates in flexible semiparametric form. Trend and seasonal components, different types of covariates and spatial effects are all treated within the same general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is fully Bayesian and uses MCMC techniques for posterior analysis. The approach in this paper is based on latent semiparametric utility models and is particularly useful for probit models. The methods are illustrated by applications to unemployment data and a forest damage survey.

125 citations


Journal ArticleDOI
TL;DR: A class of semiparametric nonlinear mixed effects models (SNMMs) is proposed that extends nonlinear mixed effects models (NLMMs) and self-modeling nonlinear regression (SEMOR) models; its nonparametric functions allow the data to decide some unknown or uncertain components, such as the shape of the mean response over time.
Abstract: Nonlinear mixed effects models (NLMMs) and self-modeling nonlinear regression (SEMOR) models are often used to fit repeated measures data. They use a common function shared by all subjects to model variation within each subject and some fixed and/or random parameters to model variation between subjects. The parametric NLMM may be too restrictive, and the semiparametric SEMOR model ignores correlations within each subject. In this article we propose a class of semiparametric nonlinear mixed effects models (SNMMs) that extend NLMMs, SEMOR models, and many other existing models in a natural way. A SNMM assumes that the mean function depends on some parameters and nonparametric functions. The parameters provide an interpretable data summary. The nonparametric functions provide flexibility to allow the data to decide some unknown or uncertain components, such as the shape of the mean response over time. A second-stage model with fixed and random effects is used to model the parameters. Smoothing splines are us...

119 citations


Journal ArticleDOI
TL;DR: A semiparametric ridgelet NLARX model which includes various lags of historical inflation and the GDP gap is best in terms of both forecast mean squared error and forecast mean absolute deviation error.
Abstract: We examine semiparametric nonlinear autoregressive models with exogenous variables (NLARX) via three classes of artificial neural networks: the first one uses smooth sigmoid activation functions; the second one uses radial basis activation functions; and the third one uses ridgelet activation functions. We provide root mean squared error convergence rates for these ANN estimators of the conditional mean and median functions with stationary β-mixing data. As an empirical application, we compare the forecasting performance of linear and semiparametric NLARX models of US inflation. We find that all of our semiparametric models outperform a benchmark linear model based on various forecast performance measures. In addition, a semiparametric ridgelet NLARX model which includes various lags of historical inflation and the GDP gap is best in terms of both forecast mean squared error and forecast mean absolute deviation error.

105 citations


Journal ArticleDOI
TL;DR: A semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution is proposed and it is shown that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates.
Abstract: We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates. Several novel properties of the proposed model are derived. In addition, we propose a class of improper noninformative priors based on this model and examine the properties of the implied posterior. Also, a class of informative priors based on historical data is proposed and its theoretical properties are investigated. A case study involving a melanoma clinical trial is discussed in detail to demonstrate the proposed methodology.

102 citations



Journal ArticleDOI
TL;DR: In this paper, both parametric and semiparametric methods were applied to the estimation of wage and participation equations for married women in Portugal, and significant differences between the two approaches indicate the inappropriateness of the standard parametric methods to estimation of the model and for the purpose of policy simulations.
Abstract: This paper applies both parametric and semiparametric methods to the estimation of wage and participation equations for married women in Portugal. The semiparametric estimators considered are the two-stage estimators proposed by Newey (1991) and Andrews and Schafgans (1998). The selection equation results are compared using the specification tests proposed by Horowitz (1993), Horowitz and Hardle (1994), and the wage equation results are compared using a Hausman test. Significant differences between the two approaches indicate the inappropriateness of the standard parametric methods to the estimation of the model and for the purpose of policy simulations. The greater departure seems to occur in the range of the low values of the index corresponding to a specific group of women. Copyright © 2001 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: The approach generalizes the classical normal-based one-way analysis of variance in the sense that it obviates the need for a completely specified parametric model and is applied to rain-rate data from meteorological instruments.
Abstract: We consider m distributions in which the first m − 1 are obtained by multiplicative exponential distortions of the mth distribution, which is a reference. The combined data from m samples, one from each distribution, are used in the semiparametric large-sample problem of estimating each distortion and the reference distribution and testing the hypothesis that the distributions are identical. The approach generalizes the classical normal-based one-way analysis of variance in the sense that it obviates the need for a completely specified parametric model. An advantage is that the probability density of the reference distribution is estimated from the combined data and not only from the mth sample. A power comparison with the t and F tests and with two nonparametric tests, obtained by means of a simulation, points to the merit of the present approach. The method is applied to rain-rate data from meteorological instruments.
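
The abstract does not spell out the "multiplicative exponential distortion". One common way to write such a model is the exponential tilt below; the symbols $g_j$, $\alpha_j$, $\beta_j$ and the distortion function $h$ are notation assumed here for illustration, not taken from the abstract.

```latex
% Assumed form: each of the first m-1 densities is an exponential tilt
% of the reference density g_m.
g_j(x) = \exp\{\alpha_j + \beta_j\, h(x)\}\, g_m(x), \qquad j = 1, \dots, m-1 .

% Classical one-way ANOVA as a special case: if g_j is N(\mu_j, \sigma^2)
% for every j, then the log density ratio is linear in x, so h(x) = x and
\beta_j = \frac{\mu_j - \mu_m}{\sigma^2}, \qquad
\alpha_j = \frac{\mu_m^2 - \mu_j^2}{2\sigma^2} .

% Equality of all m distributions corresponds to
H_0 : \beta_1 = \dots = \beta_{m-1} = 0 ,
% since each g_j must integrate to one, which then forces \alpha_j = 0.
```

Under this reading the reference density $g_m$ is left unspecified, which is what makes the model semiparametric, and the normal case shows in what sense it generalizes the one-way layout.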

Journal ArticleDOI
TL;DR: In this article, a semiparametric model is presented for large-dimension vector time series whose elements correspond to economic agents; dependence between the agents' variables is characterized using a spatial model.

Journal ArticleDOI
TL;DR: In this article, a semi-parametric partially generalised linear model for clustered data using estimating equations is considered, where the mean of the outcome variable depends on some covariates parametrically and a cluster-level covariate nonparametrically.
Abstract: We consider estimation in a semiparametric partially generalised linear model for clustered data using estimating equations. A marginal model is assumed where the mean of the outcome variable depends on some covariates parametrically and a cluster-level covariate nonparametrically. A profile-kernel method allowing for working correlation matrices is developed. We show that the nonparametric part of the model can be estimated using standard nonparametric methods, including smoothing-parameter estimation, and the parametric part of the model can be estimated in a profile fashion. The asymptotic distributions of the parameter estimators are derived, and the optimal estimators of both the nonparametric and parametric parts are shown to be obtained when the working correlation matrix equals the actual correlation matrix. The asymptotic covariance matrix of the parameter estimator is consistently estimated by the sandwich estimator. We show that the semiparametric efficient score takes on a simple form and our profile-kernel method is semiparametric efficient. The results for the case where the nonparametric part of the model is an observation-level covariate are noted to be dramatically different.

Journal ArticleDOI
TL;DR: A general class of semiparametric hazards regression models for survival data is proposed and studied in this paper, which includes some popular classes of models as subclasses, such as Cox's proportional hazards model, the accelerated failure time model and a recently proposed class of models called the accelerated hazards model.
Abstract: A general class of semiparametric hazards regression models for survival data is proposed and studied. This general class includes some popular classes of models as subclasses, such as Cox's proportional hazards model, the accelerated failure time model and a recently proposed class of models called the accelerated hazards model. In the general class of models, a covariate's effect is identified as having two separate components, namely a time-scale change on hazard progression and a relative hazards ratio. The new model is flexible in modelling survival data and may yield more accurate prediction of an individual's survival process. By way of the nested structure that includes the proportional hazards model, the accelerated failure time model and the accelerated hazards model, the general class of models may provide a numerical tool for determining which of them is more appropriate for a given dataset.
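
The two-component structure described in the abstract (a time-scale change on hazard progression and a relative hazards ratio) can be sketched numerically. The functional form used below, λ(t | z) = λ0(t·exp(β1·z))·exp(β2·z), is an assumption made for this sketch, since the abstract gives no formula; the baseline hazard and all names are hypothetical.

```python
import math

def general_hazard(t, z, beta1, beta2, baseline_hazard):
    """Assumed form: lambda(t | z) = lambda0(t * exp(beta1*z)) * exp(beta2*z).

    beta1 rescales time (hazard progression); beta2 scales the hazard level.
    """
    return baseline_hazard(t * math.exp(beta1 * z)) * math.exp(beta2 * z)

def baseline(t):
    """Illustrative decreasing baseline hazard, chosen only for the example."""
    return 1.0 / (1.0 + t)

t, z, b = 1.5, 1.0, 0.5
ph = general_hazard(t, z, 0.0, b, baseline)   # beta1 = 0: proportional hazards
aft = general_hazard(t, z, b, b, baseline)    # beta1 = beta2: accelerated failure time
ah = general_hazard(t, z, b, 0.0, baseline)   # beta2 = 0: accelerated hazards
print(ph, aft, ah)
```

Setting one coefficient to zero, or the two equal, recovers each named submodel, which is the nesting the abstract relies on when it suggests using the general class to choose among them.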

Journal ArticleDOI
TL;DR: In this article, it is argued that the proper combination of parametric and nonparametric regression procedures can improve upon the shortcomings of each when used individually.
Abstract: The proper combination of parametric and nonparametric regression procedures can improve upon the shortcomings of each when used individually. Considered is the situation where the researcher has a...

Journal ArticleDOI
TL;DR: Simulations and an application to a data set on East–West German migration illustrate similarities and dissimilarities of the estimators and test statistics of the generalized partial linear model.
Abstract: A particular semiparametric model of interest is the generalized partial linear model (GPLM) which extends the generalized linear model (GLM) by a nonparametric component. The paper reviews different estimation procedures based on kernel methods as well as test procedures on the correct specification of this model (vs. a parametric generalized linear model). Simulations and an application to a data set on East–West German migration illustrate similarities and dissimilarities of the estimators and test statistics.

Journal ArticleDOI
TL;DR: In this paper, a novel semiparametric model that can incorporate environmental and fishery data is developed to analyze stock-recruitment relationships, unlike traditional stock-recruitment models that assume a l...
Abstract: A novel semiparametric model that can incorporate environmental and fishery data is developed to analyze stock–recruitment relationships. Unlike traditional stock–recruitment models that assume a l...

Journal ArticleDOI
TL;DR: In this article, the Dirichlet process prior, centered on a parametric form, is used as a prior distribution on the weight function, without restricting it to be of some particular functional form.
Abstract: Selection models are appropriate when a datum x enters the sample only with probability or weight w(x). It is typically assumed that the weight function w is monotone, but the precise functional form of the weight function is often unknown. In this article, the Dirichlet process prior, centered on a parametric form, is used as a prior distribution on the weight function. This allows for incorporation of knowledge about the weight function, without restricting it to be of some particular functional form. By introducing latent variables related to the selection mechanism, computation via Gibbs sampling can be implemented in the case where the total number of selected and unselected observations, N, is known. When N is unknown, a reversible jump Markov chain sampler is needed to carry out the computations. An important difficulty that can be thought of as “practical nonidentifiability” is revealed, even for selection models in which the weight functions are theoretically identifiable. The proposed solution t...

Journal ArticleDOI
TL;DR: The results show the interesting behavior of the proposed tests for situations where a short-term effect is expected, and an example investigating the impact of progesterone receptors status on local tumor relapse for patients with early breast cancer illustrates the use of the suggested tests.
Abstract: Summary. In the two-sample comparison of survival times with long-term survivors, the overall difference between the two distributions reflects differences occurring in early follow-up for susceptible subjects and in long-term follow-up for nonsusceptible subjects. In this setting, we propose statistics for testing (i) no overall, (ii) no short-term, and (iii) no long-term difference between the two distributions to be compared. The statistics are derived as follows. A semiparametric model is defined that characterizes a short-term effect and a long-term effect. By approximating this model about no difference in early survival, a time-dependent proportional hazards model is obtained. The statistics are obtained from this working model. The asymptotic distributions of the statistics for testing no overall or no short-term effects are ascertained, while that of the statistic for testing no long-term effect is valid only when the short-term effect is small. Simulation studies investigate the power properties of the proposed tests for different configurations. The results show the interesting behavior of the proposed tests for situations where a short-term effect is expected. An example investigating the impact of progesterone receptors status on local tumor relapse for patients with early breast cancer illustrates the use of the proposed tests.

Journal ArticleDOI
TL;DR: In this paper, a weighted semiparametric likelihood method is proposed to fit a proportional odds regression model to data from the case-cohort design proposed by Prentice.
Abstract: The problem of fitting a proportional odds regression model to data from the case-cohort design proposed by Prentice is considered. A weighted semiparametric likelihood method is proposed. Under the proportional odds model, the maximum weighted-semiparametric likelihood estimators of both the regression parameter and the transformation function are shown to be consistent and normally distributed. The applicability of the weighted semiparametric likelihood method to the semiparametric transformation regression models is also discussed. In particular, when the proportional hazards regression model is fitted, estimators proposed by Chen and Lo can be generated by the weighted semiparametric likelihood method under different weighting schemes. A simulation study suggests that the case-cohort design is also useful under the proportional odds regression model and the proposed method performs well with practical finite sample sizes.

Journal ArticleDOI
TL;DR: In this paper, a semi-parametric post-blackening (PB) approach was proposed for periodic streamflows. But the model was only applied to the Beaver and Weber rivers in the US.

Posted Content
TL;DR: In this article, a narrow-band frequency domain least squares estimate of the cointegrating vector, and related semiparametric methods of inference for testing the memory of observables and the presence of fractional cointegration are presented.
Abstract: Fractional cointegration is viewed from a semiparametric viewpoint as a narrow-band phenomenon at frequency zero. We study a narrow-band frequency domain least squares estimate of the cointegrating vector, and related semiparametric methods of inference for testing the memory of observables and the presence of fractional cointegration. These procedures are employed in analysing empirical macroeconomic series; their usefulness and feasibility in finite samples is supported by results of a Monte Carlo experiment.
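
A minimal sketch of a narrow-band frequency domain least squares estimate, under assumptions not stated in the abstract: a single regressor, Fourier frequencies j = 1, ..., m (frequency zero excluded), and a simulated random walk standing in for the observables. The helper names are hypothetical.

```python
import cmath
import math
import random

def dft_at(series, j):
    """Discrete Fourier transform of the series at frequency 2*pi*j/n."""
    n = len(series)
    lam = 2.0 * math.pi * j / n
    return sum(v * cmath.exp(1j * lam * s) for s, v in enumerate(series))

def narrow_band_ls(y, x, m):
    """Regress y on x using only the m lowest nonzero Fourier frequencies."""
    num = 0.0
    den = 0.0
    for j in range(1, m + 1):
        wx = dft_at(x, j)
        wy = dft_at(y, j)
        num += (wx * wy.conjugate()).real  # real part of the cross-periodogram
        den += abs(wx) ** 2                # periodogram of x
    return num / den

# Illustrative data: x is a random walk; y is cointegrated with x, slope 3.
rng = random.Random(1)
n = 256
x = []
level = 0.0
for _ in range(n):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y_exact = [3.0 * v for v in x]
y_noisy = [3.0 * v + rng.gauss(0.0, 1.0) for v in x]

beta_exact = narrow_band_ls(y_exact, x, m=8)
beta_noisy = narrow_band_ls(y_noisy, x, m=8)
print(beta_exact, beta_noisy)
```

Restricting the sums to a small band near frequency zero is what makes the estimate "narrow-band": the long-run comovement dominates there, while short-run noise contributes comparatively little.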

Journal ArticleDOI
TL;DR: In this paper, goodness of fit tests are proposed for testing generalized linear models and semiparametric regression models, with both continuous and factorial covariates, against smooth alternatives; the test is shown to have √n power.
Abstract: We propose goodness of fit tests for testing generalized linear models and semiparametric regression models against smooth alternatives. The focus is on models having both continuous and factorial covariates. As a smooth extension of a parametric or semiparametric model we use generalized varying coefficient models as proposed by Hastie & Tibshirani (JRSS B, 1993). A likelihood ratio statistic is used for testing, and asymptotic normality of the test statistic is proven. Due to a slow asymptotic convergence rate a bootstrap approach is pursued. Asymptotic expansions allow the estimates to be written as linear smoothers, which in turn guarantees simple and fast bootstrapping. The test is shown to have √n power, but in contrast to parametric tests it is powerful against smooth alternatives in general.

Journal ArticleDOI
TL;DR: This article approaches the analysis using a Bayesian semiparametric model that combines parametric dose-response relationships with a flexible nonparametric specification of the distribution of the response, obtained via a product of Dirichlet process mixtures approach (PDPM).
Abstract: Modeling of developmental toxicity studies often requires simple parametric analyses of the dose-response relationship between exposure and probability of a birth defect but poses challenges because of nonstandard distributions of birth defects for a fixed level of exposure. This article is motivated by two such experiments in which the distribution of the outcome variable is challenging to both the standard logistic model with binomial response and its parametric multistage elaborations. We approach our analysis using a Bayesian semiparametric model that we tailored specifically to developmental toxicology studies. It combines parametric dose-response relationships with a flexible nonparametric specification of the distribution of the response, obtained via a product of Dirichlet process mixtures approach (PDPM). Our formulation achieves three goals: (1) the distribution of the response is modeled in a general way, (2) the degree to which the distribution of the response adapts nonparametrically to the observations is driven by the data, and (3) the marginal posterior distribution of the parameters of interest is available in closed form. The logistic regression model, as well as many of its extensions such as the beta-binomial model and finite mixture models, are special cases. In the context of the two motivating examples and a simulated example, we provide model comparisons, illustrate overdispersion diagnostics that can assist model specification, show how to derive posterior distributions of the effective dose parameters and predictive distributions of response, and discuss the sensitivity of the results to the choice of the prior distribution.

Journal ArticleDOI
TL;DR: In this paper, a special semiparametric model for a univariate density is introduced that allows analyzing a number of problems via appropriate transformations, such as testing for the presence of a mixture and detecting a wear-out trend in a failure rate.
Abstract: A special semiparametric model for a univariate density is introduced that allows analyzing a number of problems via appropriate transformations. Two problems treated in some detail are testing for the presence of a mixture and detecting a wear-out trend in a failure rate. The analysis of the semiparametric model leads to an approach that advances the maximum likelihood theory of the Grenander estimator to a multiscale analysis. The construction of the corresponding test statistic rests on an extension of a result on a two-sided Brownian motion with quadratic drift to the simultaneous control of “excursions under parabolas” at various scales of a Brownian bridge. The resulting test is shown to be asymptotically optimal in the minimax sense regarding both rate and constant, and adaptive with respect to the unknown parameter in the semiparametric model. The performance of the method is illustrated with a simulation study for the failure rate problem and with data from a flow cytometry experiment for the mixture analysis.

Posted Content
TL;DR: This article proposed two ways of dealing with the problem: (1) Estimate Lorenz curves using parametric models for income distributions, and (2) combine empirical estimation with a parametric (robust) estimation of the upper tail of the distribution using the Pareto model.
Abstract: Lorenz curves and second-order dominance criteria are known to be sensitive to data contamination in the right tail of the distribution. We propose two ways of dealing with the problem: (1) Estimate Lorenz curves using parametric models for income distributions, and (2) Combine empirical estimation with a parametric (robust) estimation of the upper tail of the distribution using the Pareto model. Approach (2) is preferred because of its flexibility. Using simulations we show the dramatic effect of a few contaminated data on the Lorenz ranking and the performance of the robust approach (2). Statistical inference tools are also provided.
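
The ingredients of approach (2) can be sketched as follows: an empirical Lorenz curve, a Pareto index fitted to the upper tail, and the Lorenz curve implied by a Pareto distribution. This is only a set of building blocks under stated assumptions; in particular, the plain maximum likelihood index below stands in for the robust estimator the abstract refers to, and the function names and sample incomes are hypothetical.

```python
import math

def lorenz_curve(incomes):
    """Empirical Lorenz ordinates (p, L(p)) at p = 1/n, 2/n, ..., 1."""
    xs = sorted(incomes)
    total = sum(xs)
    n = len(xs)
    points = []
    running = 0.0
    for i, value in enumerate(xs, start=1):
        running += value
        points.append((i / n, running / total))
    return points

def pareto_alpha_mle(tail, x0):
    """Maximum likelihood Pareto index for observations above threshold x0.

    Not robust to contamination; used here only as a simple stand-in.
    """
    return len(tail) / sum(math.log(value / x0) for value in tail)

def pareto_lorenz(p, alpha):
    """Lorenz curve of a Pareto distribution with index alpha > 1."""
    return 1.0 - (1.0 - p) ** (1.0 - 1.0 / alpha)

incomes = [1, 2, 3, 4, 10]
curve = lorenz_curve(incomes)
alpha_hat = pareto_alpha_mle([math.e, math.e], x0=1.0)
print(curve, alpha_hat, pareto_lorenz(0.5, 2.0))
```

A single very large income shifts every empirical ordinate downward because it inflates the total, which is the sensitivity the abstract describes; modeling the top of the distribution parametrically limits how much a few extreme observations can move the curve.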

Journal ArticleDOI
TL;DR: In this paper, the authors proposed an alternative semiparametric estimator of the parameters of the covariate part, which takes into account censored observations, allows for heterogeneity of unknown form and is quite easy to implement since the estimator does not require numerical searches.
Abstract: Within the framework of the proportional hazard model proposed in Cox (1972), Han and Hausman (1990) consider the logarithm of the integrated baseline hazard function as constant in each time period. We, however, propose an alternative semiparametric estimator of the parameters of the covariate part. The estimator is considered as semiparametric since no prespecified functional form for the error terms (or certain convolution) is needed. This estimator, proposed in Lewbel (2000) in another context, shows at least four advantages. The distribution of the latent variable error is unknown and may be related to the regressors. It takes into account censored observations, it allows for heterogeneity of unknown form and it is quite easy to implement since the estimator does not require numerical searches. Using the Spanish Labour Force Survey, we compare empirically the results of estimating several alternative models, based mainly on the estimator proposed in Han and Hausman (1990) and our semiparametric estimator.

Journal ArticleDOI
TL;DR: In this article, the authors generalize linear transformation models to accommodate covariate measurement error and derive inference procedures for the regression coefficients of examining the covariate effects on survival times under a generalized estimating equation framework.
Abstract: In medical studies, patients' biological parameters are often imprecisely measured due to the measuring mechanism or the biological variability. In the presence of covariate measurement error, survival analysis using the Cox model with the observed covariate may yield a biased estimate for the regression parameter. Existing research on this topic has focused on adapting the Cox model to covariates with measurement errors. In this article we generalize linear transformation models to accommodate covariate measurement error. We derive inference procedures for the regression coefficients of examining the covariate effects on survival times under a generalized estimating equation framework. Our method relaxes the normality assumption on the unobserved true covariates and the measurement errors and can be easily adopted to conduct sensitivity analyses when the magnitude of the measurement error variance is unknown. The extra variation owing to the measurement error corrections is accounted for through an asymp...