scispace - formally typeset


Semiparametric model

About: Semiparametric model is a(n) research topic. Over the lifetime, 3297 publication(s) have been published within this topic receiving 116743 citation(s).

More filters
01 Jan 1978
Abstract: Applied nonparametric statistics , Applied nonparametric statistics , مرکز فناوری اطلاعات و اطلاع رسانی کشاورزی

4,239 citations

Journal ArticleDOI
Simon N. Wood1
Abstract: Summary. Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selection in semiparametric regression. However, existing REML or marginal likelihood (ML) based methods for semiparametric generalized linear models (GLMs) use iterative REML or ML estimation of the smoothing parameters of working linear approximations to the GLM. Such indirect schemes need not converge and fail to do so in a non-negligible proportion of practical analyses. By contrast, very reliable prediction error criteria smoothing parameter selection methods are available, based on direct optimization of GCV, or related criteria, for the GLM itself. Since such methods directly optimize properly defined functions of the smoothing parameters, they have much more reliable convergence properties. The paper develops the first such method for REML or ML estimation of smoothing parameters. A Laplace approximation is used to obtain an approximate REML or ML for any GLM, which is suitable for efficient direct optimization. This REML or ML criterion requires that Newton–Raphson iteration, rather than Fisher scoring, be used for GLM fitting, and a computationally stable approach to this is proposed. The REML or ML criterion itself is optimized by a Newton method, with the derivatives required obtained by a mixture of implicit differentiation and direct methods. The method will cope with numerical rank deficiency in the fitted model and in fact provides a slight improvement in numerical robustness on the earlier method of Wood for prediction error criteria based smoothness selection. Simulation results suggest that the new REML and ML methods offer some improvement in mean-square error performance relative to GCV or Akaike's information criterion in most cases, without the small number of severe undersmoothing failures to which Akaike's information criterion and GCV are prone. This is achieved at the same computational cost as GCV or Akaike's information criterion. The new approach also eliminates the convergence failures of previous REML- or ML-based approaches for penalized GLMs and usually has lower computational cost than these alternatives. Example applications are presented in adaptive smoothing, scalar on function regression and generalized additive model selection.

3,680 citations

Reference EntryDOI
31 Aug 2012
TL;DR: A statistical generative model called independent component analysis is discussed, which shows how sparse coding can be interpreted as providing a Bayesian prior, and answers some questions which were not properly answered in the sparse coding framework.
Abstract: Independent component models have gained increasing interest in various fields of applications in recent years. The basic independent component model is a semiparametric model assuming that a p-variate observed random vector is a linear transformation of an unobserved vector of p independent latent variables. This linear transformation is given by an unknown mixing matrix, and one of the main objectives of independent component analysis (ICA) is to estimate an unmixing matrix by means of which the latent variables can be recovered. In this article, we discuss the basic independent component model in detail, define the concepts and analysis tools carefully, and consider two families of ICA estimates. The statistical properties (consistency, asymptotic normality, efficiency, robustness) of the estimates can be analyzed and compared via the so called gain matrices. Some extensions of the basic independent component model, such as models with additive noise or models with dependent observations, are briefly discussed. The article ends with a short example. Keywords: blind source separation; fastICA; independent component model; independent subspace analysis; mixing matrix; overcomplete ICA; undercomplete ICA; unmixing matrix

2,976 citations

Journal ArticleDOI
Abstract: Many papers have regressed non-parametric estimates of productive efficiency on environmental variables in two-stage procedures to account for exogenous factors that might affect firms’ performance. None of these have described a coherent data-generating process (DGP). Moreover, conventional approaches to inference employed in these papers are invalid due to complicated, unknown serial correlation among the estimated efficiencies. We first describe a sensible DGP for such models. We propose single and double bootstrap procedures; both permit valid inference, and the double bootstrap procedure improves statistical efficiency in the second-stage regression. We examine the statistical performance of our estimators using Monte Carlo experiments.

2,586 citations

Journal ArticleDOI
Abstract: In applied problems it is common to specify a model for the conditional mean of a response given a set of regressors. A subset of the regressors may be missing for some study subjects either by design or happenstance. In this article we propose a new class of semiparametric estimators, based on inverse probability weighted estimating equations, that are consistent for parameter vector α0 of the conditional mean model when the data are missing at random in the sense of Rubin and the missingness probabilities are either known or can be parametrically modeled. We show that the asymptotic variance of the optimal estimator in our class attains the semiparametric variance bound for the model by first showing that our estimation problem is a special case of the general problem of parameter estimation in an arbitrary semiparametric model in which the data are missing at random and the probability of observing complete data is bounded away from 0, and then deriving a representation for the efficient score...

2,215 citations

Network Information
Related Topics (5)
Nonparametric statistics

19.9K papers, 844.1K citations

91% related

97.3K papers, 2.6M citations

87% related
Linear model

19K papers, 1M citations

85% related
Multivariate statistics

18.4K papers, 1M citations

85% related
Statistical hypothesis testing

19.5K papers, 1M citations

81% related
No. of papers in the topic in previous years