
Showing papers on "Semiparametric model published in 2013"


Journal ArticleDOI
TL;DR: An efficient estimation procedure is developed for identifying and estimating the central subspace; a new parameterization yields an estimator that reaches the optimal semiparametric efficiency bound.
Abstract: We develop an efficient estimation procedure for identifying and estimating the central subspace. Using a new way of parameterization, we convert the problem of identifying the central subspace to the problem of estimating a finite dimensional parameter in a semiparametric model. This conversion allows us to derive an efficient estimator which reaches the optimal semiparametric efficiency bound. The resulting efficient estimator can exhaustively estimate the central subspace without imposing any distributional assumptions. Our proposed efficient estimation also provides a possibility for making inference of parameters that uniquely identify the central subspace. We conduct simulation studies and a real data analysis to demonstrate the finite sample performance in comparison with several existing methods.
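The paper's semiparametrically efficient estimator is involved, but the object it targets can be illustrated with a classical, much simpler method for the same problem: sliced inverse regression (SIR). The sketch below is my own illustration (the function name `sir_directions` and all settings are made up here), not the authors' procedure:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: slice on y, average whitened X within
    slices, and take top eigenvectors of the slice-mean covariance."""
    n, p = X.shape
    mu = X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(X.T))
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T   # whitening matrix
    Z = (X - mu) @ W
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    vals, vecs = np.linalg.eigh(M)                 # ascending eigenvalues
    B = W @ vecs[:, ::-1][:, :n_dirs]              # map back to X scale
    return B / np.linalg.norm(B, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
beta = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2.0)
y = (X @ beta) ** 3 + 0.1 * rng.normal(size=500)
b = sir_directions(X, y)[:, 0]   # should align with beta up to sign
```

Unlike the paper's method, SIR can miss directions under symmetric links and needs the linearity condition on the covariates, which is one motivation for efficient semiparametric alternatives.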

79 citations


Journal ArticleDOI
TL;DR: In this article, a variable selection procedure based on modal regression is proposed, where the nonparametric functions are approximated by a B-spline basis.

Abstract: Semiparametric partially linear varying coefficient models (SPLVCM) are frequently used in statistical modeling. With high-dimensional covariates in both the parametric and nonparametric parts of an SPLVCM, sparse modeling is often considered in practice. In this paper, we propose a new estimation and variable selection procedure based on modal regression, where the nonparametric functions are approximated by a B-spline basis. The outstanding merit of the proposed variable selection procedure is that it can achieve both robustness and efficiency by introducing an additional tuning parameter (the bandwidth h). Its oracle property is established for both the parametric and nonparametric parts. Moreover, we give a data-driven bandwidth selection method and propose an EM-type algorithm for the proposed method. A Monte Carlo simulation study and a real data example are conducted to examine the finite sample performance of the proposed method. Both the simulation results and the real data analysis confirm that the newly proposed method works very well.
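As a hedged sketch of the modal-regression idea (without the spline approximation or the variable-selection penalty), the EM-type iteration for a linear modal regression alternates between kernel weights and weighted least squares. Names and settings below are illustrative, not the authors' code:

```python
import numpy as np

def modal_regression(X, y, h=0.5, n_iter=100):
    """MEM-style algorithm: maximize the sum of Gaussian kernels of the
    residuals. E-step computes kernel weights; M-step is weighted LS."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.exp(-0.5 * (r / h) ** 2)           # E-step: kernel weights
        XtW = X.T * w                             # equals X' diag(w)
        beta = np.linalg.solve(XtW @ X, XtW @ y)  # M-step: weighted LS
    return beta

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 300)
X = np.column_stack([np.ones(300), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.2, 300)
beta = modal_regression(X, y)   # approximately [1, 2]
```

The bandwidth h is the extra tuning parameter the abstract refers to: small h emphasizes the mode (robustness), large h recovers ordinary least squares (efficiency under normality).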

68 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive the contribution of the first-step estimator to the influence function; in this derivation it is important to account for the dual role the first-step estimator plays in the second-step nonparametric regression, namely that of conditioning variable and that of argument.
Abstract: We study the asymptotic distribution of three-step estimators of a finite-dimensional parameter vector where the second step consists of one or more nonparametric regressions on a regressor that is estimated in the first step. The first-step estimator is either parametric or nonparametric. Using Newey's (1994) path-derivative method, we derive the contribution of the first-step estimator to the influence function. In this derivation, it is important to account for the dual role that the first-step estimator plays in the second-step nonparametric regression, that is, that of conditioning variable and that of argument.

62 citations


Journal ArticleDOI
TL;DR: An inference algorithm for a mixed parametric/nonparametric model of a Wiener system is derived, based on an efficient particle Markov chain Monte Carlo method referred to as particle Gibbs with ancestor sampling.

60 citations


11 Dec 2013
TL;DR: Nonparametric curve estimation.

Abstract: Nonparametric curve estimation.

57 citations


Journal ArticleDOI
TL;DR: This work proposes a semiparametric approach to RTs, specifically a Cox proportional hazards model with a latent speed covariate, embedded within the hierarchical framework proposed by van der Linden to model the RTs and response accuracy simultaneously.
Abstract: The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees’ responses to each i…

54 citations


ComponentDOI
TL;DR: In this article, the problem of identifying the central subspace was converted to estimating a finite dimensional parameter in a semiparametric model, and an efficient estimator was derived to reach the optimal efficiency bound.
Abstract: We develop an efficient estimation procedure for identifying and estimating the central subspace. Using a new way of parameterization, we convert the problem of identifying the central subspace to the problem of estimating a finite dimensional parameter in a semiparametric model. This conversion allows us to derive an efficient estimator which reaches the optimal semiparametric efficiency bound. The resulting efficient estimator can exhaustively estimate the central subspace without imposing any distributional assumptions. Our proposed efficient estimation also provides a possibility for making inference of parameters that uniquely identify the central subspace. We conduct simulation studies and a real data analysis to demonstrate the finite sample performance in comparison with several existing methods.

53 citations


Journal ArticleDOI
TL;DR: New expectation-maximization algorithms are proposed to analyze current status data under two popular semiparametric regression models: the proportional hazards (PH) model and the proportional odds (PO) model.
Abstract: We propose new expectation-maximization algorithms to analyze current status data under two popular semiparametric regression models: the proportional hazards (PH) model and the proportional odds (PO) model. Monotone splines are used to model the baseline cumulative hazard function in the PH model and the baseline odds function in the PO model. The proposed algorithms are derived by exploiting a data augmentation based on Poisson latent variables. Unlike previous regression work with current status data, our PH and PO model fitting methods are fast, flexible, easy to implement, and provide variance estimates in closed form. These techniques are evaluated using simulation and are illustrated using uterine fibroid data from a prospective cohort study on early pregnancy.

47 citations


Journal ArticleDOI
TL;DR: A generalized semiparametric SEM is developed that can handle mixed data types and simultaneously model different functional relationships among latent variables; the relative benefits of semiparametric over parametric modeling are assessed with a Bayesian model-comparison statistic, the complete deviance information criterion (DIC).
Abstract: In behavioral, biomedical, and psychological studies, structural equation models (SEMs) have been widely used for assessing relationships between latent variables. Regression-type structural models based on parametric functions are often used for such purposes. In many applications, however, parametric SEMs are not adequate to capture subtle patterns in the functions over the entire range of the predictor variable. A different but equally important limitation of traditional parametric SEMs is that they are not designed to handle mixed data types—continuous, count, ordered, and unordered categorical. This paper develops a generalized semiparametric SEM that is able to handle mixed data types and to simultaneously model different functional relationships among latent variables. A structural equation of the proposed SEM is formulated using a series of unspecified smooth functions. The Bayesian P-splines approach and Markov chain Monte Carlo methods are developed to estimate the smooth functions and the unknown parameters. Moreover, we examine the relative benefits of semiparametric modeling over parametric modeling using a Bayesian model-comparison statistic, called the complete deviance information criterion (DIC). The performance of the developed methodology is evaluated using a simulation study. To illustrate the method, we used a data set derived from the National Longitudinal Survey of Youth.
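A minimal building block of a Bayesian P-splines approach is the B-spline design matrix together with a difference penalty on adjacent coefficients. The sketch below is my own illustration of the standard Cox-de Boor construction (priors and MCMC, which the paper develops, are omitted):

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """B-spline design matrix via the Cox-de Boor recursion.
    `knots` is non-decreasing and repeats each boundary degree+1 times."""
    x = np.asarray(x, dtype=float)
    B = np.zeros((len(x), len(knots) - 1))
    for j in range(len(knots) - 1):              # degree 0: span indicators
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    for d in range(1, degree + 1):
        Bn = np.zeros((len(x), len(knots) - d - 1))
        for j in range(len(knots) - d - 1):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                Bn[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                Bn[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = Bn
    return B

knots = np.concatenate([[0, 0, 0], np.linspace(0, 1, 6), [1, 1, 1]])
x = np.linspace(0, 0.999, 200)
B = bspline_basis(x, knots)          # 200 x 8 design matrix
D = np.diff(np.eye(B.shape[1]), 2, axis=0)
P = D.T @ D                          # second-order difference penalty
```

In a P-splines smoother, the unknown function is B @ coef and the penalty coef' P coef controls wiggliness; in the Bayesian version this penalty corresponds to a random-walk prior on the coefficients.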

45 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed asymptotic theory for weighted likelihood estimators under two-phase stratified sampling without replacement and sampling without replace- ment at the second phase.
Abstract: We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools are developed, including a Glivenko-Cantelli theorem, a theorem for rates of convergence of Z-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable either at regular or non-regular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase. AMS 2000 subject classifications: Primary 62E20; secondary 62G20, 62D99, 62N01.
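The weighted-likelihood idea rests on inverse-probability weighting of the phase-two sample. A toy, Hajek-style weighted mean (my own simplification; the paper's WLE, calibration variants, and variance theory are far richer) looks like:

```python
def ipw_mean(values, sampled, pi):
    """Hajek-style inverse-probability-weighted mean: only phase-two
    sampled units contribute, each weighted by 1/pi (its known
    second-phase inclusion probability)."""
    num = sum(v / p for v, s, p in zip(values, sampled, pi) if s)
    den = sum(1.0 / p for s, p in zip(sampled, pi) if s)
    return num / den

# Stratified toy: stratum A subsampled at 0.5, stratum B kept fully
values  = [10.0, 12.0, 20.0, 22.0]
sampled = [1, 0, 1, 1]
pi      = [0.5, 0.5, 1.0, 1.0]
est = ipw_mean(values, sampled, pi)
```

Here est = (10/0.5 + 20 + 22) / (1/0.5 + 1 + 1) = 15.5, close to the full-sample mean 16 despite half of stratum A being unobserved.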

44 citations


Journal ArticleDOI
TL;DR: In this paper, a system of multivariate semiparametric nonlinear time series models is studied with possible dependence structures and nonstationarities in the parametric and nonparametric components.

Journal ArticleDOI
TL;DR: In this paper, the additive isotonic least-squares regression model has been fit using a sequential pooled adjacent violators algorithm, estimating each isotonic component in turn, and looping until convergence.
Abstract: The additive isotonic least-squares regression model has been fit using a sequential pooled adjacent violators algorithm, estimating each isotonic component in turn, and looping until convergence. However, the individual components are not, in general, estimable. The sum of the components, i.e. the expected value of the response, has a unique estimate, which can be found using a single cone projection. Estimators for the individual components are then easily obtained, which are unique if the conditions for estimability hold. Parametrically modelled covariates are easily included in the cone projection specification. The cone structure also provides information about the degrees of freedom of the fit, which can be used in inference methods, variable selection, and estimation of the model variance. Simulations show that these methods can compare favourably to standard parametric methods, even when the parametric assumptions are correct. The estimation and inference methods can be extended to other constrain...
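The sequential fitting described above relies on the pool adjacent violators algorithm (PAVA) for each isotonic component. A minimal, self-contained PAVA for a non-decreasing least-squares fit (my sketch, not the authors' cone-projection code) is:

```python
def pava(y, w=None):
    """Pool Adjacent Violators: isotonic (non-decreasing) weighted
    least-squares fit. Adjacent blocks violating monotonicity are merged
    into their weighted mean until the sequence is non-decreasing."""
    y = [float(v) for v in y]
    w = [1.0] * len(y) if w is None else [float(v) for v in w]
    blocks = []                      # each block: [value, weight, count]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            wn = w1 + w2
            blocks.append([(w1 * v1 + w2 * v2) / wn, wn, c1 + c2])
    fit = []
    for v, _, c in blocks:
        fit.extend([v] * c)
    return fit

fit = pava([1, 3, 2, 4])   # -> [1.0, 2.5, 2.5, 4.0]
```

The single-cone-projection view in the abstract fits the sum of components at once; the backfitting loop the abstract criticizes would instead call a routine like this once per component per sweep.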

Journal ArticleDOI
TL;DR: In this article, a semiparametric smooth-coefficient (SPSC) stochastic production frontier model where regression coefficients are unknown smooth functions of environmental factors (Z) is proposed.

Journal ArticleDOI
TL;DR: In this paper, the authors apply a density-ratio estimator to obtain the weight function in semi-supervised learning and prove that their method outperforms supervised learning that uses only labeled data.
Abstract: In this paper we study statistical properties of semi-supervised learning, which is considered to be an important problem in the field of machine learning. In standard supervised learning only labeled data is observed, and classification and regression problems are formalized as supervised learning. On the other hand, in semi-supervised learning, unlabeled data is also obtained in addition to labeled data. Hence, the ability to exploit unlabeled data is important to improve prediction accuracy in semi-supervised learning. This problem is regarded as a semiparametric estimation problem with missing data. Under discriminative probabilistic models, it was considered that unlabeled data is useless to improve the estimation accuracy. Recently, the weighted estimator using unlabeled data achieves a better prediction accuracy compared to the learning method using only labeled data, especially when the discriminative probabilistic model is misspecified. That is, improvement under the semiparametric model with missing data is possible when the semiparametric model is misspecified. In this paper, we apply the density-ratio estimator to obtain the weight function in semi-supervised learning. Our approach is advantageous because the proposed estimator does not require well-specified probabilistic models for the probability of the unlabeled data. Based on statistical asymptotic theory, we prove that the estimation accuracy of our method outperforms supervised learning using only labeled data. Some numerical experiments present the usefulness of our methods.
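One standard way to estimate a density ratio is the probabilistic-classifier trick: fit a classifier separating the two samples and convert its odds into a ratio estimate. The sketch below illustrates that generic idea with plain gradient-descent logistic regression and ad hoc quadratic features; it is my own illustration, not the authors' estimator:

```python
import numpy as np

def density_ratio(x_de, x_nu, lr=0.1, n_iter=2000):
    """Estimate w(x) = p_nu(x) / p_de(x) via the classifier trick:
    logistic regression separating the two 1-D samples, odds rescaled
    by the sample-size ratio."""
    X = np.concatenate([x_de, x_nu])[:, None]
    X = np.hstack([np.ones_like(X), X, X ** 2])   # quadratic features
    t = np.concatenate([np.zeros(len(x_de)), np.ones(len(x_nu))])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):                       # gradient descent
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta -= lr * X.T @ (p - t) / len(t)
    def w(x):
        f = np.hstack([np.ones((len(x), 1)), x[:, None], x[:, None] ** 2])
        return np.exp(f @ beta) * len(x_de) / len(x_nu)
    return w

rng = np.random.default_rng(4)
x_de = rng.normal(0.0, 1.0, 500)   # "labeled" sample
x_nu = rng.normal(0.5, 1.0, 500)   # "unlabeled" sample
w = density_ratio(x_de, x_nu)
```

A sanity property used below: the expectation of the estimated ratio under the denominator distribution should be close to 1.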

Book ChapterDOI
01 Jan 2013
TL;DR: In this paper, the authors propose a class of smooth constrained nonparametric and semiparametric frontier estimators that may be particularly appealing to practitioners who require smooth (i.e., continuously differentiable) estimates that, in addition, are consistent with theoretical axioms of production.
Abstract: Production frontiers (i.e., “production functions”) specify the maximum output of firms, industries, or economies as a function of their inputs. A variety of innovative methods have been proposed for estimating both “deterministic” and “stochastic” frontiers. However, existing approaches are either parametric in nature, rely on nonsmooth nonparametric methods, or rely on nonparametric or semiparametric methods that ignore theoretical axioms of production theory, each of which can be problematic. In this chapter we propose a class of smooth constrained nonparametric and semiparametric frontier estimators that may be particularly appealing to practitioners who require smooth (i.e., continuously differentiable) estimates that, in addition, are consistent with theoretical axioms of production.

Journal ArticleDOI
TL;DR: An overview of the additive hazard regression model is given, and semiparametric and nonparametric additive models are applied to a data set from a study of the natural history of human papillomavirus (HPV) in HIV-positive and HIV-negative women.
Abstract: There are several statistical methods for time-to-event analysis, among which the Cox proportional hazards model is most commonly used. However, when the absolute change in risk, instead of the risk ratio, is of primary interest, or when the proportional hazards assumption for the Cox model is violated, an additive hazard regression model may be more appropriate. In this paper, we give an overview of this approach and then apply a semiparametric as well as a nonparametric additive model to a data set from a study of the natural history of human papillomavirus (HPV) in HIV-positive and HIV-negative women. The results from the semiparametric model indicated on average an additional 14 oncogenic HPV infections per 100 woman-years related to CD4 count < 200 relative to HIV-negative women, and those from the nonparametric additive model showed an additional 40 oncogenic HPV infections per 100 women over 5 years of follow-up, while the estimated hazard ratio in the Cox model was 3.82. Although the Cox model can provide a better understanding of the exposure-disease association, the additive model is often more useful for public health planning and intervention.

Journal ArticleDOI
TL;DR: The effect of fragmentation on market quality is nonlinear and non-monotonic, and the implied quality of the market under perfect competition is superior to that under monopoly provision, but the transition between the two is complicated.

Journal ArticleDOI
TL;DR: In this paper, the authors present recent developments in model selection and model averaging for parametric and nonparametric models, where the estimated model is the weighted sum of all the submodels.
Abstract: This paper presents recent developments in model selection and model averaging for parametric and nonparametric models. While there is extensive literature on model selection under parametric settings, we present recently developed results in the context of nonparametric models. In applications, estimation and inference are often conducted under the selected model without considering the uncertainty from the selection process. This often leads to inefficiency in results and misleading confidence intervals. Thus an alternative to model selection is model averaging where the estimated model is the weighted sum of all the submodels. This reduces model uncertainty. In recent years, there has been significant interest in model averaging and some important developments have taken place in this area. We present results for both the parametric and nonparametric cases. Some possible topics for future research are also indicated.
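A minimal illustration of model averaging is smoothed-AIC weighting of linear submodels: each submodel gets weight proportional to exp(-AIC/2), and predictions are the weighted sum. The code below is my sketch of this generic scheme, not a method from the paper:

```python
import numpy as np

def aic_model_average(X, y, submodels):
    """Smoothed-AIC model averaging: fit each OLS submodel, weight by
    exp(-AIC/2) (normalized), and average the fitted values."""
    n = len(y)
    preds, aics = [], []
    for cols in submodels:
        Xs = X[:, cols]
        b = np.linalg.lstsq(Xs, y, rcond=None)[0]
        yhat = Xs @ b
        rss = ((y - yhat) ** 2).sum()
        aics.append(n * np.log(rss / n) + 2 * len(cols))  # Gaussian AIC, up to a constant
        preds.append(yhat)
    aics = np.array(aics)
    w = np.exp(-(aics - aics.min()) / 2)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, preds)), w

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(0, 0.5, 100)
yhat, w = aic_model_average(X, y, submodels=[[0], [0, 1], [0, 1, 2]])
```

Because the weights never put all mass on one submodel, the averaged fit carries the selection uncertainty that post-selection inference ignores.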

Journal ArticleDOI
TL;DR: A spline-based sieve estimation method is proposed that overcomes numerical difficulties encountered in existing semiparametric maximum likelihood estimation of the unknown nonparametric component in the models and generally produces more efficient estimators than the existing method.

Book ChapterDOI
TL;DR: In this article, a selective review on some recent developments of nonparametric methods in both continuous and discrete time finance is given, particularly in the areas of non-parametric estimation and testing of diffusion processes.
Abstract: This paper gives a selective review on some recent developments of nonparametric methods in both continuous and discrete time finance, particularly in the areas of nonparametric estimation and testing of diffusion processes, nonparametric testing of parametric diffusion models, nonparametric pricing of derivatives, nonparametric estimation and hypothesis testing for nonlinear pricing kernel, and nonparametric predictability of asset returns. For each financial context, the paper discusses the suitable statistical concepts, models, and modeling procedures, as well as some of their applications to financial data. Their relative strengths and weaknesses are discussed. Much theoretical and empirical research is needed in this area, and more importantly, the paper points to several aspects that deserve further investigation.

Journal ArticleDOI
TL;DR: This paper studies the identification of a particular case of the 3PL model, namely when the discrimination parameters are all constant and equal to 1, and shows that, after introducing two identification restrictions, the distribution G and the item parameters are identified provided an infinite quantity of items is available.
Abstract: In this paper, we study the identification of a particular case of the 3PL model, namely when the discrimination parameters are all constant and equal to 1. We term this model, 1PL-G model. The identification analysis is performed under three different specifications. The first specification considers the abilities as unknown parameters. It is proved that the item parameters and the abilities are identified if a difficulty parameter and a guessing parameter are fixed at zero. The second specification assumes that the abilities are mutually independent and identically distributed according to a distribution known up to the scale parameter. It is shown that the item parameters and the scale parameter are identified if a guessing parameter is fixed at zero. The third specification corresponds to a semi-parametric 1PL-G model, where the distribution G generating the abilities is a parameter of interest. It is not only shown that, after fixing a difficulty parameter and a guessing parameter at zero, the item parameters are identified, but also that under those restrictions the distribution G is not identified. It is finally shown that, after introducing two identification restrictions, either on the distribution G or on the item parameters, the distribution G and the item parameters are identified provided an infinite quantity of items is available.
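The 1PL-G response function itself is simple: a Rasch model with discrimination fixed at 1 plus a guessing floor. A sketch (function name mine, not from the paper):

```python
import math

def p_correct(theta, b, g):
    """1PL-G item response function: guessing parameter g, difficulty b,
    discrimination fixed at 1. As theta -> -inf the probability tends to
    g rather than 0, which is what complicates identification."""
    return g + (1.0 - g) / (1.0 + math.exp(-(theta - b)))
```

The identification restrictions in the abstract amount to pinning down the free location and floor, for example fixing one difficulty and one guessing parameter at zero.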

Posted Content
TL;DR: In this article, a semiparametric two-step estimator is proposed, which has the same structure as parametric doubly robust estimators in their second step, but retains a fully nonparametric specification in the first step.
Abstract: We study semiparametric two-step estimators which have the same structure as parametric doubly robust estimators in their second step, but retain a fully nonparametric specification in the first step. Such estimators exist in many economic applications, including a wide range of missing data and treatment effect models. We show that these estimators are √n-consistent and asymptotically normal under weaker than usual conditions on the accuracy of the first stage estimates, have smaller first order bias and second order variance, and that their finite-sample distribution can be approximated more accurately by classical first order asymptotics. We argue that because of these refinements our estimators are useful in many settings where semiparametric estimation and inference are traditionally believed to be unreliable. We also illustrate the practical relevance of our approach through simulations and an empirical application.
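The "doubly robust second step" structure is easiest to see in the best-known member of this class, the augmented IPW (AIPW) estimator of an average treatment effect; in the setting of the paper the propensity score and outcome regressions would come from a nonparametric first step. A toy sketch, not the paper's estimator:

```python
def aipw_ate(y, d, ps, m1, m0):
    """Augmented IPW (doubly robust) average-treatment-effect estimate.
    y: outcomes, d: treatment indicators, ps: propensity scores,
    m1/m0: outcome-regression predictions under treatment/control.
    Consistent if either the propensity or the outcome model is right."""
    n = len(y)
    total = 0.0
    for i in range(n):
        total += (m1[i] - m0[i]
                  + d[i] * (y[i] - m1[i]) / ps[i]
                  - (1 - d[i]) * (y[i] - m0[i]) / (1 - ps[i]))
    return total / n

# When the outcome model is exactly right, the augmentation terms vanish
ate = aipw_ate(y=[1.0, 0.0], d=[1, 0], ps=[0.5, 0.5],
               m1=[1.0, 1.0], m0=[0.0, 0.0])
```

The paper's refinements concern exactly how accurate the first-step estimates of ps, m1, m0 must be for √n-inference on the second step to remain valid.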

Journal ArticleDOI
TL;DR: This paper explores the statistical analysis of interval-censored failure time data with applications; both parametric and nonparametric methods of analysis are carried out.
Abstract: The analysis of survival data is a major focus of statistics. Interval censored data reflect uncertainty as to the exact times the units failed within an interval. This type of data frequently comes from tests or situations where the objects of interest are not constantly monitored. Thus events are known only to have occurred between the two observation periods. Interval censoring has become increasingly common in the areas that produce failure time data. This paper explores the statistical analysis of interval-censored failure time data with applications. Three different data sets, namely Breast Cancer, Hemophilia, and AIDS data were used to illustrate the methods during this study. Both parametric and nonparametric methods of analysis are carried out in this study. Theory and methodology of fitted models for the interval-censored data are described. Fitting of parametric and non-parametric models to three real data sets are considered. Results derived from different methods are presented and also compared.
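For the parametric side, the defining feature of interval-censored likelihoods is that each observation contributes P(L < T <= R) = F(R) - F(L) rather than a density value. A hedged sketch with an exponential model and a crude grid-search MLE (toy data and names of my own, not from the paper):

```python
import math

def exp_interval_loglik(rate, intervals):
    """Log-likelihood of an exponential(rate) model for interval-censored
    failure times: each (L, R] interval contributes log(F(R) - F(L)),
    with R = inf encoding right censoring."""
    ll = 0.0
    for L, R in intervals:
        FL = 1.0 - math.exp(-rate * L)
        FR = 1.0 if math.isinf(R) else 1.0 - math.exp(-rate * R)
        ll += math.log(FR - FL)
    return ll

# crude grid-search MLE on toy intervals
data = [(0.5, 1.5), (1.0, 2.0), (0.0, 1.0), (2.0, float("inf"))]
rates = [0.1 * k for k in range(1, 50)]
mle = max(rates, key=lambda r: exp_interval_loglik(r, data))
```

Nonparametric alternatives (e.g. the Turnbull estimator) replace F with an unspecified distribution maximized over the same interval probabilities.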

Journal ArticleDOI
TL;DR: Theoretically, the proposed semiparametric extension of the Gaussian bigraphical model outperforms the parametric Gaussian model for non-Gaussian data and is competitive with its parametric counterpart for Gaussian data.
Abstract: In multivariate analysis, a Gaussian bigraphical model is commonly used for modelling matrix-valued data. In this paper, we propose a semiparametric extension of the Gaussian bigraphical model, called the nonparanormal bigraphical model. A projected nonparametric rank-based regularization approach is employed to estimate sparse precision matrices and produce graphs under a penalized likelihood framework. Theoretically, our semiparametric procedure achieves the parametric rates of convergence for both matrix estimation and graph recovery. Empirically, our approach outperforms the parametric Gaussian model for non-Gaussian data and is competitive with its parametric counterpart for Gaussian data. Extensions to the categorical bigraphical model and the missing data problem are discussed. Copyright 2013, Oxford University Press.

Journal ArticleDOI
TL;DR: This paper proposes an estimation method that involves minimizing a weighted negative partial loglikelihood function plus an adaptive lasso penalty, with the initial values obtained from nonparametric maximum likelihood estimation.
Abstract: We study variable selection in general transformation models for right-censored data. The models studied can incorporate external time-varying covariates, and they include the proportional hazards model and the proportional odds model as special cases. We propose an estimation method that involves minimizing a weighted negative partial loglikelihood function plus an adaptive lasso penalty, with the initial values obtained from nonparametric maximum likelihood estimation. The objective function is parametric and convex, so the minimization is easy to implement. We show that our selection has oracle properties and that the estimator is semiparametrically efficient. We demonstrate the small-sample performance of the proposed method via simulations, and we use the method to analyse data from the Atherosclerosis Risk in Communities Study. Copyright 2013, Oxford University Press.
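The adaptive lasso penalty weights each coefficient by the inverse of an initial estimate, so large initial coefficients are penalized lightly and small ones heavily, which is what yields the oracle property. A self-contained coordinate-descent sketch for the plain linear-model case (my own illustration, not the paper's partial-likelihood objective):

```python
import numpy as np

def adaptive_lasso(X, y, lam, n_iter=200):
    """Adaptive lasso via coordinate descent:
    minimize 0.5*||y - Xb||^2 + lam * sum_j |b_j| / |b_ols_j|."""
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    w = 1.0 / np.maximum(np.abs(b_ols), 1e-10)   # adaptive weights
    p = X.shape[1]
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]  # partial residual
            z = X[:, j] @ r
            # soft-thresholding update for coordinate j
            beta[j] = np.sign(z) * max(abs(z) - lam * w[j], 0.0) / col_ss[j]
    return beta

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, 0.0, 0.0, 1.0, 0.0]) + rng.normal(0, 0.1, 200)
beta_hat = adaptive_lasso(X, y, lam=10.0)
```

In the paper the quadratic loss is replaced by a weighted negative partial loglikelihood, but the penalized objective stays parametric and convex, so the same coordinate-wise logic applies.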

Journal ArticleDOI
TL;DR: In this article, the authors proposed time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation.
Abstract: This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulation results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural breaks in correlations. Only when correlations are constant does the parametric DCC model deliver the best outcome. The methodologies are illustrated by evaluating two interesting portfolios. The first portfolio consists of the equity sector SPDRs and the S&P 500, while the second one contains major currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis.
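A simple nonparametric time-varying correlation estimator replaces sample moments with kernel-weighted moments over the time index. The sketch below (my own, with a Gaussian kernel in rescaled time) illustrates the idea for one pair of series:

```python
import math

def kernel_correlation(x, y, t, h):
    """Nonparametric time-varying correlation at time index t:
    Gaussian kernel weights over the (rescaled) time index,
    then a weighted correlation coefficient."""
    n = len(x)
    w = [math.exp(-0.5 * ((i - t) / (h * n)) ** 2) for i in range(n)]
    s = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / s
    my = sum(wi * yi for wi, yi in zip(w, y)) / s
    cxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / s
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / s
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / s
    return cxy / math.sqrt(vx * vy)

# perfectly (positively) comoving toy series
rho = kernel_correlation([1.0, 2.0, 3.0, 4.0, 5.0],
                         [2.0, 4.0, 6.0, 8.0, 10.0], t=2, h=0.3)
```

Sweeping t across the sample traces out a correlation path, the univariate analogue of the conditional cross-correlation matrix the article estimates.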

Journal ArticleDOI
TL;DR: In this article, a new family of Bayesian semiparametric models for the conditional distribution of daily stock index returns is introduced, which capture key stylized facts of such returns, namely, heavy tails, asymmetry, volatility clustering, and the leverage effect.
Abstract: This article introduces a new family of Bayesian semiparametric models for the conditional distribution of daily stock index returns. The proposed models capture key stylized facts of such returns, namely, heavy tails, asymmetry, volatility clustering, and the “leverage effect.” A Bayesian nonparametric prior is used to generate random density functions that are unimodal and asymmetric. Volatility is modeled parametrically. The new model is applied to the daily returns of the S&P 500, FTSE 100, and EUROSTOXX 50 indices and is compared with GARCH, stochastic volatility, and other Bayesian semiparametric models.
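Since volatility is modeled parametrically, the familiar GARCH(1,1) recursion is the natural reference point. A minimal sketch of its conditional-variance path (my own code, not the article's Bayesian model):

```python
def garch11_variances(returns, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1):
        s2[t] = omega + alpha * returns[t-1]**2 + beta * s2[t-1],
    initialized at the unconditional variance omega / (1 - alpha - beta)
    (requires alpha + beta < 1)."""
    s2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        s2.append(omega + alpha * r * r + beta * s2[-1])
    return s2
```

The article keeps a recursion of this parametric kind for volatility while letting the return innovation density be an asymmetric, heavy-tailed Bayesian nonparametric object.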

Journal ArticleDOI
TL;DR: In this article, nonlinear least squares (NLLS) estimators are proposed for semiparametric binary response models under conditional median restrictions, which can be easily implementable using standard software packages such as Stata.
Abstract: In this paper, nonlinear least squares (NLLS) estimators are proposed for semiparametric binary response models under conditional median restrictions. The estimators can be identical to NLLS procedures for parametric binary response models (e.g. Probit), and consequently have the advantage of being easily implementable using standard software packages such as Stata. This is in contrast to existing estimators for the model, such as the maximum score estimator (Manski, 1975, 1985) and the smoothed maximum score (SMS) estimator (Horowitz, 1992). Two simple bias correction methods—a proposed jackknife method and an alternative nonlinear regression function—result in the same rate of convergence as SMS. The results from a Monte Carlo study show that the new estimators perform well in finite samples.

Journal ArticleDOI
TL;DR: A Bayesian semiparametric space-time model is proposed in which the spatially-temporally varying coefficient is decomposed into fixed, spatially varying, and temporally varying coefficients; it improves on Bayesian spatial-temporal models that assume normality of the spatial random effects and on the Bayesian model with a Dirichlet process prior on the random intercept.
Abstract: In spatiotemporal analysis, the effect of a covariate on the outcome usually varies across areas and time. The spatial configuration of the areas may potentially depend on not only the structured random intercept but also spatially varying coefficients of covariates. In addition, the normality assumption of the distribution of spatially varying coefficients could lead to potential biases of estimations. In this article, we proposed a Bayesian semiparametric space-time model where the spatially-temporally varying coefficient is decomposed as fixed, spatially varying, and temporally varying coefficients. We nonparametrically modeled the spatially varying coefficients of space-time covariates by using the area-specific Dirichlet process prior with weights transformed via a generalized transformation. We modeled the temporally varying coefficients of covariates through the dynamic model. We also took into account the uncertainty of inclusion of the spatially-temporally varying coefficients by variable selection procedure through determining the probabilities of different effects for each covariate. The proposed semiparametric approach shows its improvement compared with the Bayesian spatial-temporal models with normality assumption on spatial random effects and the Bayesian model with the Dirichlet process prior on the random intercept. We presented a simulation example to evaluate the performance of the proposed approach with the competing models. We used an application to low birth weight data in South Carolina as an illustration.

Journal ArticleDOI
TL;DR: In this article, a new mixture of regressions model was proposed, which is a generalisation of the semiparametric two-component mixture model studied in Bordes, Delmas, and Vandekerkhove.
Abstract: We introduce in this paper a new mixture of regressions model which is a generalisation of the semiparametric two-component mixture model studied in Bordes, Delmas, and Vandekerkhove [(2006b), ‘Semiparametric Estimation of a Two-component Mixture Model When a Component is Known’, Scandinavian Journal of Statistics, 33, 733–752]. Namely, we consider a two-component mixture of regressions model in which one component is entirely known while the proportion, the slope, the intercept, and the error distribution of the other component are unknown. Our model is said to be semiparametric in the sense that the probability density function (pdf) of the error involved in the unknown regression model cannot be modelled adequately by using a parametric density family. When the pdfs of the errors involved in each regression model are supposed to be zero-symmetric, we propose an estimator of the various (Euclidean and functional) parameters of the model, and establish under mild conditions their almost sure rates of con...